Merge pull request #216 from svc-develop-team/4.1-Stable

To Latest
YuriHead 2023-06-02 14:30:35 +08:00 committed by GitHub
commit 3c517af54c
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 10 additions and 52 deletions

@@ -192,32 +192,10 @@ python resample.py
 #### Attention
-Although this project has the script resample.py for resampling, to mono and loudness matching, the default loudness matching is to match to 0db. This may cause damage to the sound quality. While python's loudness matching package pyloudnorm is unable to limit the level, this results in a burst. Therefore, it is suggested to consider using professional sound processing software such as `adobe audition` for resampling, to mono and loudness matching processing. If you use other software for resampling, to mono and loudness matching, do not run the preceding command.
+Although this project has the script resample.py for resampling, to mono and loudness matching, the default loudness matching is to match to 0db. This may cause damage to the sound quality. While python's loudness matching package pyloudnorm is unable to limit the level, this results in a burst. Therefore, it is suggested to consider using professional sound processing software such as `adobe audition` for loudness matching processing. If you have already used other software for loudness matching, run the command with the argument `--skip_loudnorm`:
-To manually process the audio, you need to put the dataset into the Dataset directory with the following file structure. If the directory does not exist, you can create it yourself.
-```
-dataset
-└───44k
-    ├───speaker0
-    │   ├───xxx1-xxx1.wav
-    │   ├───...
-    │   └───Lxx-0xx8.wav
-    └───speaker1
-        ├───xx2-0xxx2.wav
-        ├───...
-        └───xxx7-xxx007.wav
-```
-You can customize the speaker name.
-```
-dataset
-└───44k
-    └───suijiSUI
-        ├───1.wav
-        ├───...
-        └───25788785-20221210-200143-856_01_(Vocals)_0_0.wav
+```shell
+python resample.py --skip_loudnorm
 ```
 ### 2. Automatically split the dataset into training and validation sets, and generate configuration files.

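@@ -194,32 +194,10 @@ python resample.py
 #### Attention
-Although this project provides the script resample.py for resampling, conversion to mono, and loudness matching, the default loudness matching targets 0 dB, which may degrade sound quality. In addition, Python's loudness matching package pyloudnorm cannot apply limiting to the level, which leads to clipping. It is therefore recommended to consider professional audio processing software such as `adobe audition` for resampling, conversion to mono, and loudness matching. If you use other software for resampling, conversion to mono, and loudness matching, you do not need to run the command above.
+Although this project provides the script resample.py for resampling, conversion to mono, and loudness matching, the default loudness matching targets 0 dB, which may degrade sound quality. In addition, Python's loudness matching package pyloudnorm cannot apply limiting to the level, which leads to clipping. It is therefore recommended to consider professional audio processing software such as `adobe audition` for loudness matching. If you have already used other software for loudness matching, you can add `--skip_loudnorm` when running the command above to skip the loudness matching step. For example:
-To process the audio manually, place the dataset in the dataset directory using the following file structure. If the directory does not exist, you can create it yourself.
-```
-dataset
-└───44k
-    ├───speaker0
-    │   ├───xxx1-xxx1.wav
-    │   ├───...
-    │   └───Lxx-0xx8.wav
-    └───speaker1
-        ├───xx2-0xxx2.wav
-        ├───...
-        └───xxx7-xxx007.wav
-```
-The speaker name can be customized.
-```
-dataset
-└───44k
-    └───suijiSUI
-        ├───1.wav
-        ├───...
-        └───25788785-20221210-200143-856_01_(Vocals)_0_0.wav
+```shell
+python resample.py --skip_loudnorm
 ```
 ### 2. Automatically split the dataset into training and validation sets, and generate configuration files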
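
To make the README change concrete, here is a minimal sketch of what "matching to 0 dB" means at the peak level and what skipping it leaves untouched. The helper names `peak_normalize` and `peak_dbfs` and the test tone are assumptions made for illustration; they are not part of resample.py, which applies the equivalent scaling inline.

```python
import numpy as np


def peak_normalize(wav: np.ndarray, skip_loudnorm: bool = False) -> np.ndarray:
    """Scale wav so its largest absolute sample is 1.0 (0 dBFS), unless skipped."""
    if skip_loudnorm:
        return wav
    peak = np.abs(wav).max()
    if peak == 0:
        # Silent input: nothing to scale, and dividing would produce NaNs.
        return wav
    return wav / peak


def peak_dbfs(wav: np.ndarray) -> float:
    """Peak level in dB relative to full scale (1.0)."""
    return float(20 * np.log10(np.abs(wav).max()))


# Hypothetical quiet test tone: a 440 Hz sine at one quarter of full scale.
t = np.linspace(0, 1, 44100, endpoint=False)
quiet = 0.25 * np.sin(2 * np.pi * 440 * t)

normalized = peak_normalize(quiet)                      # peak raised to full scale
untouched = peak_normalize(quiet, skip_loudnorm=True)   # level left as prepared
```

The quiet tone sits at roughly -12 dBFS before scaling and at 0 dBFS afterwards, which is why audio that was already levelled in an external tool is better passed through with the step skipped.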

@@ -15,12 +15,13 @@ def process(item):
     if os.path.exists(wav_path) and '.wav' in wav_path:
         os.makedirs(os.path.join(args.out_dir2, speaker), exist_ok=True)
         wav, sr = librosa.load(wav_path, sr=None)
-        wav, _ = librosa.effects.trim(wav, top_db=20)
+        wav, _ = librosa.effects.trim(wav, top_db=40)
         peak = np.abs(wav).max()
         if peak > 1.0:
             wav = 0.98 * wav / peak
         wav2 = librosa.resample(wav, orig_sr=sr, target_sr=args.sr2)
-        wav2 /= max(wav2.max(), -wav2.min())
+        if not args.skip_loudnorm:
+            wav2 /= max(wav2.max(), -wav2.min())
         save_name = wav_name
         save_path2 = os.path.join(args.out_dir2, speaker, save_name)
         wavfile.write(
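
The hunk above raises the trim threshold from `top_db=20` to `top_db=40`, so quieter leading and trailing audio survives trimming. The sketch below illustrates the effect with a simplified, NumPy-only stand-in; `simple_trim`, its non-overlapping framing, and the synthetic signal are assumptions for illustration and do not reproduce `librosa.effects.trim` exactly.

```python
import numpy as np


def simple_trim(wav: np.ndarray, top_db: float, frame: int = 512) -> np.ndarray:
    """Drop leading/trailing frames whose RMS is more than top_db below the loudest frame."""
    n_frames = len(wav) // frame
    frames = wav[: n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    if n_frames == 0 or rms.max() == 0:
        return wav[:0]  # empty or fully silent input: nothing to keep
    # Level of each frame in dB relative to the loudest frame (floor avoids log(0)).
    db = 20 * np.log10(np.maximum(rms, 1e-10) / rms.max())
    keep = np.nonzero(db > -top_db)[0]
    start, end = keep[0] * frame, (keep[-1] + 1) * frame
    return wav[start:end]


# Hypothetical signal: 1 s silence, 1 s loud tone, 1 s tail 30 dB quieter, 1 s silence.
sr = 8192
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 256 * t)
tail = tone * 10 ** (-30 / 20)
signal = np.concatenate([np.zeros(sr), tone, tail, np.zeros(sr)])

strict = simple_trim(signal, top_db=20)   # the -30 dB tail falls below the threshold
lenient = simple_trim(signal, top_db=40)  # the tail is kept
```

With the stricter threshold only the loud second remains; with the higher one the quiet tail is preserved as well, which matters for soft note endings and breaths in singing data.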
@@ -35,6 +36,7 @@ if __name__ == "__main__":
     parser.add_argument("--sr2", type=int, default=44100, help="sampling rate")
     parser.add_argument("--in_dir", type=str, default="./dataset_raw", help="path to source dir")
     parser.add_argument("--out_dir2", type=str, default="./dataset/44k", help="path to target dir")
+    parser.add_argument("--skip_loudnorm", action="store_true", help="Skip loudness matching if you have done it")
     args = parser.parse_args()
     processs = 30 if cpu_count() > 60 else (cpu_count()-2 if cpu_count() > 4 else 1)
     pool = Pool(processes=processs)
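
As a quick check of the new flag and the worker-count expression, the following sketch isolates both. The functions `build_parser` and `pick_worker_count` are hypothetical wrappers introduced here so the logic can be exercised without audio files; the script itself keeps this logic inline under `if __name__ == "__main__":`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Subset of the script's arguments, enough to show the new flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--sr2", type=int, default=44100, help="sampling rate")
    parser.add_argument("--skip_loudnorm", action="store_true",
                        help="Skip loudness matching if you have done it")
    return parser


def pick_worker_count(n_cpus: int) -> int:
    """Same rule as the diff: 30 workers above 60 CPUs, else leave 2 free, else 1."""
    if n_cpus > 60:
        return 30
    return n_cpus - 2 if n_cpus > 4 else 1


default_args = build_parser().parse_args([])
skip_args = build_parser().parse_args(["--skip_loudnorm"])
```

Because the flag uses `action="store_true"`, it defaults to `False`, so existing invocations of `python resample.py` keep normalizing as before. One quirk of the worker rule worth knowing: a 60-CPU machine gets 58 workers, while a 61-CPU machine gets 30.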