From 9cf6826e0d101a8459587696bfa1e75ed0cbd070 Mon Sep 17 00:00:00 2001
From: Miuzarte <982809597@qq.com>
Date: Wed, 5 Apr 2023 19:01:30 +0800
Subject: [PATCH] Update README.md

---
 README.md       | 22 +++++++++++-----------
 README_zh_CN.md | 22 +++++++++++-----------
 2 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/README.md b/README.md
index 8af666d..26c605d 100644
--- a/README.md
+++ b/README.md
@@ -131,19 +131,19 @@ python inference_main.py -m "logs/44k/G_30400.pth" -c "configs/config.json" -n "
 ```
 
 Required parameters:
-- -m, --model_path: path to the model.
-- -c, --config_path: path to the configuration file.
-- -n, --clean_names: a list of wav file names located in the raw folder.
-- -t, --trans: pitch adjustment, supports positive and negative (semitone) values.
-- -s, --spk_list: target speaker name for synthesis.
-- -cl, --clip: voice forced slicing, set to 0 to turn off(default), duration in seconds.
+- `-m` | `--model_path`: path to the model.
+- `-c` | `--config_path`: path to the configuration file.
+- `-n` | `--clean_names`: a list of wav file names located in the raw folder.
+- `-t` | `--trans`: pitch adjustment, supports positive and negative (semitone) values.
+- `-s` | `--spk_list`: target speaker name for synthesis.
+- `-cl` | `--clip`: voice forced slicing, set to 0 to turn off(default), duration in seconds.
 
 Optional parameters: see the next section
-- -lg, --linear_gradient: The cross fade length of two audio slices in seconds. If there is a discontinuous voice after forced slicing, you can adjust this value. Otherwise, it is recommended to use the default value of 0.
-- -fmp, --f0_mean_pooling: Apply mean filter (pooling) to f0,which may improve some hoarse sounds. Enabling this option will reduce inference speed.
-- -a, --auto_predict_f0: automatic pitch prediction for voice conversion, do not enable this when converting songs as it can cause serious pitch issues.
-- -cm, --cluster_model_path: path to the clustering model, fill in any value if clustering is not trained.
-- -cr, --cluster_infer_ratio: proportion of the clustering solution, range 0-1, fill in 0 if the clustering model is not trained.
+- `-lg` | `--linear_gradient`: The cross fade length of two audio slices in seconds. If there is a discontinuous voice after forced slicing, you can adjust this value. Otherwise, it is recommended to use the default value of 0.
+- `-fmp` | `--f0_mean_pooling`: Apply mean filter (pooling) to f0,which may improve some hoarse sounds. Enabling this option will reduce inference speed.
+- `-a` | `--auto_predict_f0`: automatic pitch prediction for voice conversion, do not enable this when converting songs as it can cause serious pitch issues.
+- `-cm` | `--cluster_model_path`: path to the clustering model, fill in any value if clustering is not trained.
+- `-cr` | `--cluster_infer_ratio`: proportion of the clustering solution, range 0-1, fill in 0 if the clustering model is not trained.
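 
 ## 🤔 Optional Settings
 
diff --git a/README_zh_CN.md b/README_zh_CN.md
index 9cbd020..0d6a8f2 100644
--- a/README_zh_CN.md
+++ b/README_zh_CN.md
@@ -131,19 +131,19 @@ python inference_main.py -m "logs/44k/G_30400.pth" -c "configs/config.json" -n "
 ```
 
 Required parameters
-+ -m, --model_path: path to the model
-+ -c, --config_path: path to the configuration file
-+ -n, --clean_names: list of wav file names, placed in the raw folder
-+ -t, --trans: pitch adjustment, supports positive and negative values (semitones)
-+ -s, --spk_list: name of the target speaker for synthesis
-+ -cl, --clip: forced audio slicing; the default of 0 means automatic slicing; unit: seconds
++ `-m` | `--model_path`: path to the model
++ `-c` | `--config_path`: path to the configuration file
++ `-n` | `--clean_names`: list of wav file names, placed in the raw folder
++ `-t` | `--trans`: pitch adjustment, supports positive and negative values (semitones)
++ `-s` | `--spk_list`: name of the target speaker for synthesis
++ `-cl` | `--clip`: forced audio slicing; the default of 0 means automatic slicing; unit: seconds
 
 Optional parameters: some are detailed in the next section
-+ -lg, --linear_gradient: crossfade length between two audio slices; adjust this value if the voice is discontinuous after forced slicing; if it is continuous, the default of 0 is recommended; unit: seconds
-+ -fmp, --f0_mean_pooling: whether to apply a mean filter (pooling) to F0, which may improve some hoarse sounds. Note that enabling this option slows down inference; off by default
-+ -a, --auto_predict_f0: automatic pitch prediction for voice conversion; do not enable this when converting singing voice, or the result will be badly out of tune
-+ -cm, --cluster_model_path: path to the clustering model; fill in anything if no clustering model has been trained
-+ -cr, --cluster_infer_ratio: proportion of the clustering scheme, range 0-1; leave it at the default of 0 if no clustering model has been trained
++ `-lg` | `--linear_gradient`: crossfade length between two audio slices; adjust this value if the voice is discontinuous after forced slicing; if it is continuous, the default of 0 is recommended; unit: seconds
++ `-fmp` | `--f0_mean_pooling`: whether to apply a mean filter (pooling) to F0, which may improve some hoarse sounds. Note that enabling this option slows down inference; off by default
++ `-a` | `--auto_predict_f0`: automatic pitch prediction for voice conversion; do not enable this when converting singing voice, or the result will be badly out of tune
++ `-cm` | `--cluster_model_path`: path to the clustering model; fill in anything if no clustering model has been trained
++ `-cr` | `--cluster_infer_ratio`: proportion of the clustering scheme, range 0-1; leave it at the default of 0 if no clustering model has been trained
 
 ## 🤔 Optional Settings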
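
Reviewer note: the short/long flag pairs listed in the README hunks above could be declared with `argparse` roughly as follows. This is only an illustrative sketch, not the actual `inference_main.py`. The function name `build_parser`, the argument types, the `required=` settings, and the defaults other than the documented 0 values are all assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed in the README (not the real script)."""
    p = argparse.ArgumentParser(description="Sketch of the documented inference flags")

    # "Required parameters" from the README; marking them required=True is an assumption.
    p.add_argument("-m", "--model_path", type=str, required=True, help="path to the model")
    p.add_argument("-c", "--config_path", type=str, required=True, help="path to the configuration file")
    p.add_argument("-n", "--clean_names", type=str, nargs="+", required=True,
                   help="wav file names located in the raw folder")
    # Assumption: a single integer semitone offset; negative values are allowed.
    p.add_argument("-t", "--trans", type=int, default=0, help="pitch adjustment in semitones")
    p.add_argument("-s", "--spk_list", type=str, required=True, help="target speaker name")
    p.add_argument("-cl", "--clip", type=float, default=0, help="forced slicing in seconds, 0 turns it off")

    # "Optional parameters" from the README.
    p.add_argument("-lg", "--linear_gradient", type=float, default=0,
                   help="cross fade length of two audio slices in seconds")
    # Assumption: modelled as on/off switches.
    p.add_argument("-fmp", "--f0_mean_pooling", action="store_true", help="apply mean pooling to f0")
    p.add_argument("-a", "--auto_predict_f0", action="store_true", help="automatic pitch prediction")
    p.add_argument("-cm", "--cluster_model_path", type=str, default="", help="path to the clustering model")
    p.add_argument("-cr", "--cluster_infer_ratio", type=float, default=0,
                   help="proportion of the clustering solution, range 0-1")
    return p


if __name__ == "__main__":
    # The wav names and speaker name below are placeholders, not values from the README.
    args = build_parser().parse_args([
        "-m", "logs/44k/G_30400.pth", "-c", "configs/config.json",
        "-n", "a.wav", "b.wav", "-t", "-3", "-s", "speaker0",
    ])
    print(args)
```

Either spelling of a flag fills the same attribute, so `-cr 0.5` and `--cluster_infer_ratio 0.5` are interchangeable in this sketch.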