Hello, I have successfully trained my Tacotron2 model on more than 2000 audio files. However, when I run inference through WaveGlow, the output audio is not clear and contains too much noise, whereas the model generated perfectly clear audio when I tested it a couple of months ago.
While training the model, I stopped at an error rate of 0.18 because the model was overfitting. I used the following hyperparameters:
hparams.batch_size = 20
hparams.epochs = 500
hparams.p_attention_dropout = 0.4
hparams.p_decoder_dropout = 0.1
hparams.decay_start = 15000        # wait till decay_start to start decaying learning rate
hparams.A_ = 5e-4                  # Start/Max Learning Rate
hparams.B_ = 8000                  # Decay Rate
hparams.C_ = 0                     # Shift learning rate equation by this value
hparams.min_learning_rate = 1e-6   # Min Learning Rate
generate_mels = True               # Don't change
hparams.show_alignments = True
alignment_graph_height = 600
alignment_graph_width = 1000
hparams.load_mel_from_disk = True
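For context, here is a minimal sketch of how I understand the learning-rate settings above to combine. It assumes the schedule holds the rate at `A_` until `decay_start`, then decays it exponentially as `A_ * exp(-(iteration - decay_start) / B_) + C_`, with a floor of `min_learning_rate`. That formula is my reading of the comments, not something I have verified against the training code, and the function name `learning_rate_at` is made up for illustration.

```python
import math

def learning_rate_at(iteration, A_=5e-4, B_=8000, C_=0,
                     decay_start=15000, min_learning_rate=1e-6):
    """Assumed schedule: constant until decay_start, then exponential decay."""
    if iteration < decay_start:
        # Before decay_start the rate stays at the start/max value.
        lr = A_
    else:
        # After decay_start, decay with time constant B_ and shift by C_.
        lr = A_ * math.exp(-(iteration - decay_start) / B_) + C_
    # Never drop below the configured minimum.
    return max(min_learning_rate, lr)

if __name__ == "__main__":
    for it in (0, 15000, 23000, 50000, 100000):
        print(it, learning_rate_at(it))
```

Under this assumption the rate falls by a factor of e every 8000 iterations after iteration 15000 and eventually settles at the 1e-6 floor.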
Attention Maps of the generated audio