Update Readme.md
This commit is contained in:
parent 95ea8a8021
commit 28dd4fa032
@@ -41,10 +41,10 @@ This project is only a framework project, which does not have the function of sp
The singing voice conversion model uses the SoftVC content encoder to extract source audio speech features; these vectors are fed directly into VITS instead of being converted to a text-based intermediate, so pitch and intonation are preserved. Additionally, the vocoder is changed to [NSF HiFiGAN](https://github.com/openvpi/DiffSinger/tree/refactor/modules/nsf_hifigan) to solve the problem of sound interruption.
-### 🆕 4.0-Vec768-Layer12 Version Update Content
+### 🆕 4.1-Stable Version Update Content
-- Feature input is changed to the 12th-layer Transformer output of [Content Vec](https://github.com/auspicious3000/contentvec); this branch is not compatible with the 4.0 model
-- Updated shallow diffusion: the shallow diffusion model can be used to improve sound quality
+- Feature input is changed to the 12th-layer Transformer output of [Content Vec](https://github.com/auspicious3000/contentvec), and it is compatible with 4.0 branches.
+- Updated shallow diffusion: the shallow diffusion model can be used to improve sound quality.
### 🆕 Questions about compatibility with the 4.0 model
@@ -53,7 +53,7 @@ The singing voice conversion model uses SoftVC content encoder to extract source
```
"model": {
.........
-"ssl_dim": 768,
+"ssl_dim": 256,
"n_speakers": 200,
"speech_encoder":"vec256l9"
}
```
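To apply this change programmatically, a 4.0 model's `config.json` can be patched with a short script like the one below. This is only a sketch, not part of the repository: the function name and file path are illustrative, and it sets exactly the two fields shown in the diff above while leaving everything else untouched.

```python
import json

def patch_config_for_40_model(path):
    """Patch a 4.0 model's config.json so the 4.1-Stable code can load it.

    Hypothetical helper: sets the "ssl_dim" and "speech_encoder" fields
    shown in the README diff; all other fields are left as-is.
    """
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    model = config.setdefault("model", {})
    model["ssl_dim"] = 256                # 4.0 models use 256-dim ContentVec features
    model["speech_encoder"] = "vec256l9"  # encoder name expected by the 4.1 branch
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2, ensure_ascii=False)
```

Keep a backup of the original `config.json` before patching, since the edit is done in place.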
@@ -39,9 +39,9 @@
The singing voice conversion model extracts source audio speech features with the SoftVC content encoder and feeds them, together with F0, into VITS in place of the original text input to achieve singing voice conversion. The vocoder is also replaced with [NSF HiFiGAN](https://github.com/openvpi/DiffSinger/tree/refactor/modules/nsf_hifigan) to solve the problem of sound interruption.
-### 🆕 4.0-Vec768-Layer12 Version Update Content
+### 🆕 4.1-Stable Version Update Content
-+ Feature input is changed to the 12th-layer Transformer output of [Content Vec](https://github.com/auspicious3000/contentvec)
++ Feature input is changed to the 12th-layer Transformer output of [Content Vec](https://github.com/auspicious3000/contentvec), and it is compatible with the 4.0 branch
++ Updated shallow diffusion: the shallow diffusion model can be used to improve sound quality
### 🆕 Questions about compatibility with the 4.0 model
@@ -51,7 +51,7 @@
```
"model": {
.........
-"ssl_dim": 768,
+"ssl_dim": 256,
"n_speakers": 200,
"speech_encoder":"vec256l9"
}
```
@@ -77,7 +77,7 @@
"\n",
"#@markdown\n",
"\n",
-"!git clone https://github.com/svc-develop-team/so-vits-svc -b 4.0-Vec768-Layer12\n",
+"!git clone https://github.com/svc-develop-team/so-vits-svc -b 4.1-Stable\n",
"%pip uninstall -y torchdata torchtext\n",
"%pip install --upgrade pip setuptools numpy numba\n",
"%pip install pyworld praat-parselmouth fairseq tensorboardX torchcrepe librosa==0.9.1 pyyaml pynvml pyloudnorm\n",