Merge pull request #209 from svc-develop-team/4.1-Stable

stable to latest
2023-05-30 11:13:41 +08:00 · a2f85c71a0
parent 10c7c06acb 3320039d10
commit a2f85c71a0
3 changed files with 18 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -236,7 +236,13 @@ If the speech_encoder argument is omitted, the default value is vec768l12

 **Use loudness embedding**

-If loudness embedding is used, the 'vol_aug' and 'vol_embedding' in config.json will be set to true. After use, the trained model will match the loudness of the input source; otherwise, it will be the loudness of the training set.
+Add `--vol_aug` if you want to enable loudness embedding:
+
+```shell
+python preprocess_flist_config.py --speech_encoder vec768l12 --vol_aug
+```
+
+After enabling loudness embedding, the trained model will match the loudness of the input source; otherwise, it will be the loudness of the training set.

 ### 3. Generate hubert and f0

--- a/README_zh_CN.md
+++ b/README_zh_CN.md
@@ -238,7 +238,13 @@ whisper-ppg

 **使用响度嵌入**

-若使用响度嵌入，需要将config.json中的`vol_aug`,`vol_embedding`设置为true.使用后训练出的模型将匹配到输入源响度，否则为训练集响度。
+若使用响度嵌入，需要增加`--vol_aug`参数，比如：
+
+```shell
+python preprocess_flist_config.py --speech_encoder vec768l12 --vol_aug
+```
+
+使用后训练出的模型将匹配到输入源响度，否则为训练集响度。

 ### 3. 生成hubert与f0

--- a/preprocess_flist_config.py
+++ b/preprocess_flist_config.py
@@ -29,6 +29,7 @@ if __name__ == "__main__":
    parser.add_argument("--val_list", type=str, default="./filelists/val.txt", help="path to val list")
    parser.add_argument("--source_dir", type=str, default="./dataset/44k", help="path to source dir")
    parser.add_argument("--speech_encoder", type=str, default="vec768l12", help="choice a speech encoder|'vec768l12','vec256l9','hubertsoft','whisper-ppg'")
+    parser.add_argument("--vol_aug", action="store_true", help="Whether to use volume embedding and volume augmentation")
    args = parser.parse_args()
    
    train = []
@@ -89,6 +90,9 @@ if __name__ == "__main__":
    elif args.speech_encoder == "whisper-ppg" :
        config_template["model"]["ssl_dim"] = config_template["model"]["filter_channels"] = config_template["model"]["gin_channels"] = 1024
        d_config_template["data"]["encoder_out_channels"] = 1024
+    
+    if args.vol_aug:
+        config_template["train"]["vol_aug"] = config_template["model"]["vol_embedding"] = True

    print("Writing configs/config.json")
    with open("configs/config.json", "w") as f:
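
The effect of the new `--vol_aug` flag can be illustrated in isolation. The following is a minimal sketch, not the script itself: `build_config` is a hypothetical helper, and the two-key `config_template` with `False` defaults is an assumed stand-in for the JSON template the real script loads. Only the `--vol_aug` argument definition and the chained assignment are taken from the diff above.

```python
import argparse


def build_config(argv):
    """Parse CLI flags and return a config dict with the loudness settings applied.

    Hypothetical helper for illustration; the real script builds its config inline.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--speech_encoder", type=str, default="vec768l12")
    # store_true: the attribute is False unless the flag is present on the command line.
    parser.add_argument("--vol_aug", action="store_true",
                        help="Whether to use volume embedding and volume augmentation")
    args = parser.parse_args(argv)

    # Assumed stand-in for the template; the False defaults are an assumption.
    config_template = {
        "train": {"vol_aug": False},
        "model": {"vol_embedding": False},
    }

    # Same chained assignment as in the diff: both keys are set together.
    if args.vol_aug:
        config_template["train"]["vol_aug"] = config_template["model"]["vol_embedding"] = True

    return config_template


if __name__ == "__main__":
    print(build_config(["--vol_aug"]))  # both loudness keys enabled
    print(build_config([]))             # template values left unchanged
```

Because the flag uses `action="store_true"`, omitting it leaves whatever values the template already holds, so in this sketch the two keys can only be switched on from the command line, never off.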