Update README.md

Merge pull request #83 from svc-develop-team/optimize-some-code
Removed some meaningless code
2023-03-24 16:59:47 +08:00 · 2023-03-24 14:46:45 +09:00 · 2023-03-24 13:42:38 +08:00 · 2023-03-24 01:00:14 -04:00 · 2023-03-24 12:58:22 +08:00 · 2023-03-24 12:47:31 +08:00
10 changed files with 100 additions and 71 deletions
--- a/.github/ISSUE_TEMPLATE/ask_for_help.yaml
+++ b/.github/ISSUE_TEMPLATE/ask_for_help.yaml
@@ -7,9 +7,9 @@ body:
  - type: markdown
    attributes:
      value: |
-        #### 提问前请先自己去尝试解决，可以借助chatgpt或一些搜索引擎（谷歌/必应/New Bing/StackOverflow等等）。如果实在无法自己解决再发issue，在提issue之前，请先了解《[提问的智慧](https://github.com/ryanhanwu/How-To-Ask-Questions-The-Smart-Way/blob/main/README-zh_CN.md)》
+        #### 提问前请先自己去尝试解决，可以借助chatgpt或一些搜索引擎（谷歌/必应/New Bing/StackOverflow等等）。如果实在无法自己解决再发issue，在提issue之前，请先了解《[提问的智慧](https://github.com/ryanhanwu/How-To-Ask-Questions-The-Smart-Way/blob/main/README-zh_CN.md)》。
        ---
-        ### 什么样的issues会被close
+        ### 什么样的issue会被直接close
        1. 伸手党
        2. 一键包/环境包相关
        3. 提供的信息不全
@@ -22,11 +22,11 @@ body:
    attributes:
      label: 请勾选下方的确认框。
      options:
-        - label: "我已仔细阅读README.md"
+        - label: "我已仔细阅读README.md。"
          required: true
-        - label: "我已通过各种搜索引擎排查问题，我要提出的问题并不常见"
+        - label: "我已通过各种搜索引擎排查问题，我要提出的问题并不常见。"
          required: true
-        - label: "我未在使用由第三方用户提供的一键包/环境包"
+        - label: "我未在使用由第三方用户提供的一键包/环境包。"
          required: true

  - type: markdown
@@ -98,7 +98,7 @@ body:
    id: Log
    attributes:
      label: 日志
-      description: 从执行命令到执行完毕输出的所有信息
+      description: 从执行命令到执行完毕输出的所有信息（包括你所执行的命令）
      render: python
    validations:
      required: true
@@ -106,7 +106,7 @@ body:
  - type: textarea
    id: ValidOneClick
    attributes:
-      label: 截图`so-vits-svc`文件夹并粘贴到此处
+      label: 截图`so-vits-svc`、`logs/44k`文件夹并粘贴到此处
    validations:
      required: true

--- a/.github/ISSUE_TEMPLATE/ask_for_help_en_US.yaml
+++ b/.github/ISSUE_TEMPLATE/ask_for_help_en_US.yaml
@@ -7,9 +7,9 @@ body:
  - type: markdown
    attributes:
      value: |
-        #### Please try to solve the problem yourself before asking for help，You can use chatgpt or some search engines like google, bing, new bing and StackOverflow until you really find that you can't solve it by yourself. And before you raise an issue, please understand *[How To Ask Questions The Smart Way](http://www.catb.org/~esr/faqs/smart-questions.html)* in advance
+        #### Please try to solve the problem yourself before asking for help. You can use chatgpt or some search engines like google, bing, new bing and StackOverflow until you really find that you can't solve it by yourself. And before you raise an issue, please understand *[How To Ask Questions The Smart Way](http://www.catb.org/~esr/faqs/smart-questions.html)* in advance.
        ---
-        ### What kind of issue will be close immediately
+        ### What kind of issue will be closed immediately
        1. Beggars or Free Riders
        2. One click package / Environment package (Not using `pip install -r requirement.txt`)
        3. Incomplete information
@@ -22,11 +22,11 @@ body:
    attributes:
      label: Please check the checkboxes below.
      options:
-        - label: "I have read README.md carefully"
+        - label: "I have read README.md carefully."
          required: true
-        - label: "I have been troubleshooting issues through various search engines. The questions I want to ask are not common"
+        - label: "I have been troubleshooting issues through various search engines. The questions I want to ask are not common."
          required: true
-        - label: "I am NOT using one click package / environment package"
+        - label: "I am NOT using one click package / environment package."
          required: true

  - type: markdown
@@ -74,7 +74,7 @@ body:
    id: DatasetSource
    attributes:
      label: Dataset source (Used to judge the dataset quality)
-      description: Like UVR-processed streaming audio / Recorded in recording studio
+      description: Such as UVR-processed streaming audio / Recorded in recording studio
    validations:
      required: true

@@ -82,7 +82,7 @@ body:
    id: WhereOccurs
    attributes:
      label: Where thr problem occurs or what command you executed
-      description: Like Preprocessing / Training / `python preprocess_hubert_f0.py`
+      description: Such as Preprocessing / Training / `python preprocess_hubert_f0.py`
    validations:
      required: true

@@ -98,7 +98,7 @@ body:
    id: Log
    attributes:
      label: Log
-      description: All information output from the command you executed to the end of execution
+      description: All information output from the command you executed to the end of execution (include the command)
      render: python
    validations:
      required: true
@@ -106,7 +106,7 @@ body:
  - type: textarea
    id: ValidOneClick
    attributes:
-      label: Screenshot `so-vits-svc` folder and paste here
+      label: Screenshot `so-vits-svc` and `logs/44k` folders and paste here
    validations:
      required: true

--- a/.github/ISSUE_TEMPLATE/default.md
+++ b/.github/ISSUE_TEMPLATE/default.md
@@ -0,0 +1,7 @@
+---
+name: Default issue
+about: 如果模板中没有你想发起的issue类型，可以选择此项，但这个issue也许会获得一个较低的处理优先级 / If there is no issue type you want to raise, you can start with this one. But this issue maybe will get a lower priority to deal with.
+title: ''
+labels: 'not urgent'
+assignees: ''
+---
--- a/.github/ISSUE_TEMPLATE/none.md
+++ b/.github/ISSUE_TEMPLATE/none.md
@@ -1,7 +0,0 @@
----
-name: Default issue
-about: 如果模板中没有你想发起的issue类型，可以选择此项，但这个issue会获得一个较低的处理优先级 / If there is no issue type you want to raise, you can start with this one. But this issue will get a lower priority to deal with.
-title: ''
-labels: 'lower priority'
-assignees: ''
----
--- a/README.md
+++ b/README.md
@@ -61,7 +61,7 @@ Although the pretrained model generally does not cause any copyright problems, p

 Simply place the dataset in the `dataset_raw` directory with the following file structure.

-```shell
+```
 dataset_raw
 ├───speaker0
 │   ├───xxx1-xxx1.wav
@@ -73,15 +73,25 @@ dataset_raw
    └───xxx7-xxx007.wav
 ```

+You can customize the speaker name.
+
+```
+dataset_raw
+└───suijiSUI
+    ├───1.wav
+    ├───...
+    └───25788785-20221210-200143-856_01_(Vocals)_0_0.wav
+```
+
 ## 🛠️ Preprocessing

-1. Resample to 44100hz
+1. Resample to 44100Hz and mono

 ```shell
 python resample.py
 ```

-2. Automatically split the dataset into training, validation, and test sets, and generate configuration files
+2. Automatically split the dataset into training and validation sets, and generate configuration files

 ```shell
 python preprocess_flist_config.py
@@ -170,7 +180,7 @@ Use [onnx_export.py](https://github.com/svc-develop-team/so-vits-svc/blob/4.0/on

 Note: For Hubert Onnx models, please use the models provided by MoeSS. Currently, they cannot be exported on their own (Hubert in fairseq has many unsupported operators and things involving constants that can cause errors or result in problems with the input/output shape and results when exported.)  [Hubert4.0](https://huggingface.co/NaruseMioShirakana/MoeSS-SUBModel)

-## Previous contributors
+## ☀️ Previous contributors

 For some reason the author deleted the original repository. Because of the negligence of the organization members, the contributor list was cleared because all files were directly reuploaded to this repository at the beginning of the reconstruction of this repository. Now add a previous contributor list to README.md.

--- a/README_zh_CN.md
+++ b/README_zh_CN.md
@@ -61,7 +61,7 @@ http://obs.cstcloud.cn/share/obs/sankagenkeshi/checkpoint_best_legacy_500.pt

 仅需要以以下文件结构将数据集放入dataset_raw目录即可

-```shell
+```
 dataset_raw
 ├───speaker0
 │   ├───xxx1-xxx1.wav
@@ -73,15 +73,25 @@ dataset_raw
    └───xxx7-xxx007.wav
 ```

+可以自定义说话人名称
+
+```
+dataset_raw
+└───suijiSUI
+    ├───1.wav
+    ├───...
+    └───25788785-20221210-200143-856_01_(Vocals)_0_0.wav
+```
+
 ## 🛠️ 数据预处理

-1. 重采样至 44100hz
+1. 重采样至44100Hz单声道

 ```shell
 python resample.py
 ```

-2. 自动划分训练集 验证集 测试集 以及自动生成配置文件
+2. 自动划分训练集、验证集，以及自动生成配置文件

 ```shell
 python preprocess_flist_config.py
@@ -170,7 +180,7 @@ python inference_main.py -m "logs/44k/G_30400.pth" -c "configs/config.json" -n "
 + 注意：Hubert Onnx模型请使用MoeSS提供的模型，目前无法自行导出（fairseq中Hubert有不少onnx不支持的算子和涉及到常量的东西，在导出时会报错或者导出的模型输入输出shape和结果都有问题）
 [Hubert4.0](https://huggingface.co/NaruseMioShirakana/MoeSS-SUBModel)

-## 旧贡献者
+## ☀️ 旧贡献者

 因为某些原因原作者进行了删库处理，本仓库重建之初由于组织成员疏忽直接重新上传了所有文件导致以前的contributors全部木大，现在在README里重新添加一个旧贡献者列表

--- a/data_utils.py
+++ b/data_utils.py
@@ -47,6 +47,8 @@ class TextAudioSpeakerLoader(torch.utils.data.Dataset):
        audio_norm = audio / self.max_wav_value
        audio_norm = audio_norm.unsqueeze(0)
        spec_filename = filename.replace(".wav", ".spec.pt")
+
+        # Ideally, all data generated after Mar 25 should have .spec.pt
        if os.path.exists(spec_filename):
            spec = torch.load(spec_filename)
        else:
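
Review note: the hunk above checks for a `.spec.pt` file next to each `.wav` and loads it when present, while the `preprocess_hubert_f0.py` changes in this PR write that file when it is missing. Together they behave like a load-or-compute cache. The following is only a minimal sketch of that pattern, assuming `pickle` as a stand-in for `torch.load`/`torch.save` so that it runs without PyTorch; `load_or_compute` and `compute` are hypothetical names and do not exist in the repository.

```python
import os
import pickle


def load_or_compute(cache_path, compute):
    """Return the cached value at cache_path, computing and saving it if absent.

    Hypothetical helper: `compute` is any zero-argument callable that
    produces the value (a stand-in for the spectrogram computation).
    """
    if os.path.exists(cache_path):
        # Cache hit: reuse the stored result instead of recomputing it.
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    # Cache miss: compute once, then persist for later runs.
    value = compute()
    with open(cache_path, "wb") as f:
        pickle.dump(value, f)
    return value
```

With this shape, a second call for the same path never invokes `compute` again, which is the property that lets preprocessing generate the spectrograms up front.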
--- a/preprocess_flist_config.py
+++ b/preprocess_flist_config.py
@@ -25,13 +25,11 @@ if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--train_list", type=str, default="./filelists/train.txt", help="path to train list")
    parser.add_argument("--val_list", type=str, default="./filelists/val.txt", help="path to val list")
-    parser.add_argument("--test_list", type=str, default="./filelists/test.txt", help="path to test list")
    parser.add_argument("--source_dir", type=str, default="./dataset/44k", help="path to source dir")
    args = parser.parse_args()
    
    train = []
    val = []
-    test = []
    idx = 0
    spk_dict = {}
    spk_id = 0
@@ -51,13 +49,11 @@ if __name__ == "__main__":
            new_wavs.append(file)
        wavs = new_wavs
        shuffle(wavs)
-        train += wavs[2:-2]
+        train += wavs[2:]
        val += wavs[:2]
-        test += wavs[-2:]

    shuffle(train)
    shuffle(val)
-    shuffle(test)
            
    print("Writing", args.train_list)
    with open(args.train_list, "w") as f:
@@ -70,12 +66,6 @@ if __name__ == "__main__":
        for fname in tqdm(val):
            wavpath = fname
            f.write(wavpath + "\n")
-            
-    print("Writing", args.test_list)
-    with open(args.test_list, "w") as f:
-        for fname in tqdm(test):
-            wavpath = fname
-            f.write(wavpath + "\n")

    config_template["spk"] = spk_dict
    config_template["model"]["n_speakers"] = spk_id
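
Review note: after this change each speaker's shuffled file list is split in two, with the first two files going to validation (`wavs[:2]`) and everything else to training (`wavs[2:]`). The sketch below restates that split as a standalone function so the behavior is easy to check. `split_speaker_files`, `n_val`, and `seed` are hypothetical names introduced for illustration; the script itself uses the module-level `shuffle` and a fixed count of 2.

```python
import random


def split_speaker_files(wavs, n_val=2, seed=None):
    """Shuffle one speaker's files and split them into (train, val).

    Mirrors the diff: the first n_val shuffled files become the
    validation set, the remainder becomes the training set.
    """
    rng = random.Random(seed)  # local RNG so the input order is not mutated
    shuffled = list(wavs)
    rng.shuffle(shuffled)
    return shuffled[n_val:], shuffled[:n_val]


if __name__ == "__main__":
    files = [f"{i}.wav" for i in range(10)]
    train, val = split_speaker_files(files, seed=0)
    print(len(train), len(val))
```

One consequence worth noting: a speaker with two or fewer files contributes nothing to the training list under this split.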
--- a/preprocess_hubert_f0.py
+++ b/preprocess_hubert_f0.py
@@ -7,10 +7,12 @@ from random import shuffle
 import torch
 from glob import glob
 from tqdm import tqdm
+from modules.mel_processing import spectrogram_torch

 import utils
 import logging
-logging.getLogger('numba').setLevel(logging.WARNING)
+
+logging.getLogger("numba").setLevel(logging.WARNING)
 import librosa
 import numpy as np

@@ -29,11 +31,42 @@ def process_one(filename, hmodel):
        wav16k = torch.from_numpy(wav16k).to(device)
        c = utils.get_hubert_content(hmodel, wav_16k_tensor=wav16k)
        torch.save(c.cpu(), soft_path)
+
    f0_path = filename + ".f0.npy"
    if not os.path.exists(f0_path):
-        f0 = utils.compute_f0_dio(wav, sampling_rate=sampling_rate, hop_length=hop_length)
+        f0 = utils.compute_f0_dio(
+            wav, sampling_rate=sampling_rate, hop_length=hop_length
+        )
        np.save(f0_path, f0)

+    spec_path = filename.replace(".wav", ".spec.pt")
+    if not os.path.exists(spec_path):
+        # Process spectrogram
+        # The following code can't be replaced by torch.FloatTensor(wav)
+        # because load_wav_to_torch return a tensor that need to be normalized
+
+        audio, sr = utils.load_wav_to_torch(filename)
+        if sr != hps.data.sampling_rate:
+            raise ValueError(
+                "{} SR doesn't match target {} SR".format(
+                    sr, hps.data.sampling_rate
+                )
+            )
+
+        audio_norm = audio / hps.data.max_wav_value
+        audio_norm = audio_norm.unsqueeze(0)
+
+        spec = spectrogram_torch(
+            audio_norm,
+            hps.data.filter_length,
+            hps.data.sampling_rate,
+            hps.data.hop_length,
+            hps.data.win_length,
+            center=False,
+        )
+        spec = torch.squeeze(spec, 0)
+        torch.save(spec, spec_path)
+

 def process_batch(filenames):
    print("Loading hubert for content...")
@@ -46,17 +79,23 @@ def process_batch(filenames):

 if __name__ == "__main__":
    parser = argparse.ArgumentParser()
-    parser.add_argument("--in_dir", type=str, default="dataset/44k", help="path to input dir")
+    parser.add_argument(
+        "--in_dir", type=str, default="dataset/44k", help="path to input dir"
+    )

    args = parser.parse_args()
-    filenames = glob(f'{args.in_dir}/*/*.wav', recursive=True)  # [:10]
+    filenames = glob(f"{args.in_dir}/*/*.wav", recursive=True)  # [:10]
    shuffle(filenames)
-    multiprocessing.set_start_method('spawn',force=True)
+    multiprocessing.set_start_method("spawn", force=True)

    num_processes = 1
    chunk_size = int(math.ceil(len(filenames) / num_processes))
-    chunks = [filenames[i:i + chunk_size] for i in range(0, len(filenames), chunk_size)]
+    chunks = [
+        filenames[i : i + chunk_size] for i in range(0, len(filenames), chunk_size)
+    ]
    print([len(c) for c in chunks])
-    processes = [multiprocessing.Process(target=process_batch, args=(chunk,)) for chunk in chunks]
+    processes = [
+        multiprocessing.Process(target=process_batch, args=(chunk,)) for chunk in chunks
+    ]
    for p in processes:
        p.start()
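
Review note: the reformatted block above divides `filenames` into chunks of size `ceil(len(filenames) / num_processes)` and starts one process per chunk. A small sketch of just the chunking step follows, assuming a hypothetical helper name `chunk_filenames`. The empty-list guard is an addition of this sketch, not something in the diff: with no files, `chunk_size` would be 0 and `range(0, 0, 0)` raises `ValueError`.

```python
import math


def chunk_filenames(filenames, num_processes):
    """Split filenames into at most num_processes contiguous chunks."""
    if not filenames:
        # Guard added in this sketch: a zero step would make range() raise.
        return []
    chunk_size = int(math.ceil(len(filenames) / num_processes))
    return [
        filenames[i : i + chunk_size]
        for i in range(0, len(filenames), chunk_size)
    ]


if __name__ == "__main__":
    print([len(c) for c in chunk_filenames(list("abcdefg"), 3)])
```

Because the chunk size is rounded up, the number of chunks can be smaller than `num_processes` (for example, 4 files over 3 processes gives 2 chunks of 2).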
--- a/spec_gen.py
+++ b/spec_gen.py
@@ -1,22 +0,0 @@
-from data_utils import TextAudioSpeakerLoader
-import json
-from tqdm import tqdm
-
-from utils import HParams
-
-config_path = 'configs/config.json'
-with open(config_path, "r") as f:
-    data = f.read()
-config = json.loads(data)
-hps = HParams(**config)
-
-train_dataset = TextAudioSpeakerLoader("filelists/train.txt", hps)
-test_dataset = TextAudioSpeakerLoader("filelists/test.txt", hps)
-eval_dataset = TextAudioSpeakerLoader("filelists/val.txt", hps)
-
-for _ in tqdm(train_dataset):
-    pass
-for _ in tqdm(eval_dataset):
-    pass
-for _ in tqdm(test_dataset):
-    pass
Author	SHA1	Message	Date
Miuzarte	58322242ac	Update README.md	2023-03-24 16:59:47 +08:00
红血球AE3803	27ef997952	Merge pull request #83 from svc-develop-team/optimize-some-code Removed some meaningless code	2023-03-24 14:46:45 +09:00
Miuzarte	a0f7a031cb	Update README.md	2023-03-24 13:42:38 +08:00
Lengyue	32cfec751e	remove redundant spec_gen and fix related bug	2023-03-24 01:00:14 -04:00
Miuzarte	75522a6ede	Update issues template	2023-03-24 12:58:22 +08:00
Miuzarte	6a953317b9	Update issues template	2023-03-24 12:47:31 +08:00
Lengyue	2854013a8a	rm test dataset that is never used	2023-03-24 00:43:29 -04:00
Miuzarte	f0ada33687	Update issues template	2023-03-24 12:41:56 +08:00
Miuzarte	eb8ef9a305	Update issues template	2023-03-24 12:36:19 +08:00
Miuzarte	4ce3a869f6	Update issues template	2023-03-24 12:33:38 +08:00
Miuzarte	1c7b153285	Update issues template	2023-03-24 12:27:43 +08:00