Merge pull request #209 from svc-develop-team/4.1-Stable

stable to latest
YuriHead 2023-05-30 11:13:41 +08:00 committed by GitHub
commit a2f85c71a0
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 18 additions and 2 deletions

@@ -236,7 +236,13 @@ If the speech_encoder argument is omitted, the default value is vec768l12
**Use loudness embedding**
If loudness embedding is used, the 'vol_aug' and 'vol_embedding' in config.json will be set to true. A model trained with it will match the loudness of the input source; otherwise, it will match the loudness of the training set.
Add `--vol_aug` if you want to enable loudness embedding:
```shell
python preprocess_flist_config.py --speech_encoder vec768l12 --vol_aug
```
After enabling loudness embedding, the trained model will match the loudness of the input source; otherwise, it will match the loudness of the training set.
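
As a rough illustration of what the flag does, the sketch below mirrors the `--vol_aug` handling added to `preprocess_flist_config.py` in this commit. The `build_config` helper and the two-key template are hypothetical and exist only for this example; the real script loads a much larger config template and writes it to `configs/config.json`.

```python
import argparse
import json


def build_config(argv):
    """Parse CLI-style arguments and return a minimal config dict.

    Hypothetical helper for illustration; not part of the repository.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--speech_encoder", type=str, default="vec768l12")
    # store_true: the flag is False unless --vol_aug is passed
    parser.add_argument("--vol_aug", action="store_true")
    args = parser.parse_args(argv)

    # Assumed stand-in for the real template, reduced to the two keys
    # that this commit touches.
    config = {"train": {"vol_aug": False}, "model": {"vol_embedding": False}}

    if args.vol_aug:
        # Both keys are switched on together, as in the commit.
        config["train"]["vol_aug"] = config["model"]["vol_embedding"] = True
    return config


print(json.dumps(build_config(["--speech_encoder", "vec768l12", "--vol_aug"]), indent=2))
```

Calling `build_config([])` leaves both keys false, which corresponds to running the preprocessing script without `--vol_aug`.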
### 3. Generate hubert and f0

@@ -238,7 +238,13 @@ whisper-ppg
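**Use loudness embedding**
To use loudness embedding, set `vol_aug` and `vol_embedding` to true in config.json. A model trained this way will match the loudness of the input source; otherwise, it will match the loudness of the training set.
To use loudness embedding, add the `--vol_aug` argument, for example: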
```shell
python preprocess_flist_config.py --speech_encoder vec768l12 --vol_aug
```
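A model trained this way will match the loudness of the input source; otherwise, it will match the loudness of the training set.
### 3. Generate hubert and f0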

@@ -29,6 +29,7 @@ if __name__ == "__main__":
    parser.add_argument("--val_list", type=str, default="./filelists/val.txt", help="path to val list")
    parser.add_argument("--source_dir", type=str, default="./dataset/44k", help="path to source dir")
    parser.add_argument("--speech_encoder", type=str, default="vec768l12", help="choice a speech encoder|'vec768l12','vec256l9','hubertsoft','whisper-ppg'")
    parser.add_argument("--vol_aug", action="store_true", help="Whether to use volume embedding and volume augmentation")
    args = parser.parse_args()
    train = []
@@ -89,6 +90,9 @@ if __name__ == "__main__":
    elif args.speech_encoder == "whisper-ppg" :
        config_template["model"]["ssl_dim"] = config_template["model"]["filter_channels"] = config_template["model"]["gin_channels"] = 1024
        d_config_template["data"]["encoder_out_channels"] = 1024
    if args.vol_aug:
        config_template["train"]["vol_aug"] = config_template["model"]["vol_embedding"] = True
    print("Writing configs/config.json")
    with open("configs/config.json", "w") as f: