Update stft complex
parent 2def595e02
commit 23a48ff5d6
@@ -265,7 +265,7 @@ After enabling loudness embedding, the trained model will match the loudness of

 * `timesteps`: The total number of steps in the diffusion model, which defaults to 1000.

-* `k_step_max`: Training can only train 'k_step_max' step diffusion to save training time, note that the value must be less than 'timesteps', 0 is to train the entire diffusion model, **Note: if you do not train the entire diffusion model will not be able to use only_diffusion!**
+* `k_step_max`: Training can only train `k_step_max` step diffusion to save training time, note that the value must be less than `timesteps`, 0 is to train the entire diffusion model, **Note: if you do not train the entire diffusion model will not be able to use only_diffusion!**

 ##### **List of Vocoders**

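The constraint described for `k_step_max` can be sketched as a small validation step. This is only an illustration of the rule stated in the README text (the value must be less than `timesteps`, and 0 means the full model is trained); the helper names `resolve_k_step` and `supports_only_diffusion` are hypothetical and do not come from the repository.

```python
def resolve_k_step(timesteps: int = 1000, k_step_max: int = 0) -> int:
    """Return how many diffusion steps would be trained (hypothetical helper)."""
    if timesteps <= 0:
        raise ValueError("timesteps must be positive")
    # The README requires k_step_max to be strictly less than timesteps.
    if k_step_max < 0 or k_step_max >= timesteps:
        raise ValueError("k_step_max must satisfy 0 <= k_step_max < timesteps")
    # 0 is the sentinel for "train the entire diffusion model".
    return timesteps if k_step_max == 0 else k_step_max


def supports_only_diffusion(k_step_max: int) -> bool:
    """Diffusion-only inference needs the entire model to have been trained."""
    return k_step_max == 0


full_steps = resolve_k_step(timesteps=1000, k_step_max=0)
shallow_steps = resolve_k_step(timesteps=1000, k_step_max=100)
```

With these assumed defaults, a shallow setting such as `k_step_max=100` trains fewer steps but rules out diffusion-only inference.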
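@@ -265,7 +265,7 @@ python preprocess_flist_config.py --speech_encoder vec768l12 --vol_aug

 * `timesteps`: The total number of steps in the diffusion model; defaults to 1000.

-* `k_step_max`: During training, only `k_step_max` diffusion steps can be trained to save training time. Note that this value must be less than `timesteps`; 0 trains all of the entire diffusion model. **Note: if the entire diffusion model is not trained, diffusion-only inference cannot be used!**
+* `k_step_max`: During training, only `k_step_max` diffusion steps can be trained to save training time. Note that this value must be less than `timesteps`; 0 trains the entire diffusion model. **Note: if the entire diffusion model is not trained, diffusion-model-only inference cannot be used!**

 ##### **List of Vocoders**
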
@@ -64,8 +64,8 @@ def spectrogram_torch(y, n_fft, sampling_rate, hop_size, win_size, center=False)
     y = y.squeeze(1)

     spec = torch.stft(y, n_fft, hop_length=hop_size, win_length=win_size, window=hann_window[wnsize_dtype_device],
-                      center=center, pad_mode='reflect', normalized=False, onesided=True, return_complex=False)
-
+                      center=center, pad_mode='reflect', normalized=False, onesided=True, return_complex=True)
+    spec = torch.view_as_real(spec)
     spec = torch.sqrt(spec.pow(2).sum(-1) + 1e-6)
     return spec

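The change above requests a complex STFT and then converts it back to a real tensor with a trailing (real, imaginary) axis, so the magnitude line that follows can stay unchanged. A minimal standalone sketch of that pattern is shown below. The function name `magnitude_spectrogram`, the default sizes, and the locally built Hann window are assumptions for illustration; the repository's `spectrogram_torch` uses a cached window and its own padding, which are not reproduced here.

```python
import torch


def magnitude_spectrogram(y, n_fft=1024, hop_size=256, win_size=1024, center=False):
    """Magnitude spectrogram from a complex STFT (illustrative sketch only)."""
    # Assumption: build the window locally instead of using a cache.
    window = torch.hann_window(win_size, dtype=y.dtype, device=y.device)
    spec = torch.stft(y, n_fft, hop_length=hop_size, win_length=win_size, window=window,
                      center=center, pad_mode='reflect', normalized=False, onesided=True,
                      return_complex=True)
    # Complex (..., freq, frames) -> real (..., freq, frames, 2)
    spec = torch.view_as_real(spec)
    # sqrt(re^2 + im^2 + eps); eps keeps the gradient finite at zero magnitude.
    return torch.sqrt(spec.pow(2).sum(-1) + 1e-6)


y = torch.randn(2, 8192)  # (batch, samples) of synthetic audio
mag = magnitude_spectrogram(y)
```

The trailing axis of size 2 produced by `torch.view_as_real` matches the layout that `return_complex=False` used to return, which is why `spec.pow(2).sum(-1)` still yields the squared magnitude.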