Update README
This commit is contained in: parent a2f85c71a0, commit 2b281bc446

README.md: 18 lines changed
If the speech_encoder argument is omitted, the default value is vec768l12
**Use loudness embedding**

Add `--vol_aug` if you want to enable loudness embedding:
```shell
python preprocess_flist_config.py --speech_encoder vec768l12 --vol_aug
```

After enabling loudness embedding, the trained model will match the loudness of the input source; otherwise, it will match the loudness of the training set.
#### You can modify some parameters in the generated config.json and diffusion.yaml

* `keep_ckpts`: Keep only the last `keep_ckpts` checkpoints during training. Setting it to `0` keeps them all. The default is `3`.
* `all_in_mem`, `cache_all_data`: Load the entire dataset into RAM. This can be enabled when the disk I/O of your platform is too slow and the system memory is **much larger** than your dataset.
* `batch_size`: The amount of data loaded to the GPU in a single training step; adjust it to a size that fits within your GPU memory (VRAM).
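These parameters are plain JSON values, so they can also be edited from the command line instead of by hand. A minimal sketch, assuming the parameters sit under a `train` key as in typical generated configs (the key layout and file path here are assumptions; check your own config.json first). The sketch writes a stand-in config so it is self-contained:

```shell
# Create a stand-in config.json (your real file will have many more keys).
cat > config.json <<'EOF'
{"train": {"keep_ckpts": 3, "batch_size": 8, "all_in_mem": false}}
EOF

# Rewrite selected training parameters in place.
python -c "
import json
cfg = json.load(open('config.json'))
cfg['train']['keep_ckpts'] = 10   # 0 would keep every checkpoint
cfg['train']['batch_size'] = 4    # lower this if you hit out-of-memory errors
json.dump(cfg, open('config.json', 'w'), indent=2)
"
```

For a real run, skip the `cat` step and point the Python one-liner at your generated config.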
### 3. Generate hubert and f0