The End-to-End Speech Synthesis System for the VLSP Campaign 2019
EasyChair Preprint 1742 • 3 pages • Date: October 22, 2019

Abstract
Traditional speech synthesis systems are typically built from multiple components, including a text analysis front-end, an acoustic model, and an audio synthesis module. Building these components often requires extensive domain expertise and may involve brittle design choices. In this paper, we describe how we built a Vietnamese text-to-speech (TTS) system based on deep learning techniques. We built two speech synthesis systems for the text-to-speech shared tasks of VLSP 2019: one on BigCorpus (Mean Opinion Score of 3.47) and one on SmallCorpus (Mean Opinion Score of 4.13). In addition, we apply transfer learning and fine-tuning to address noisy training data in BigCorpus and the shortage of data in SmallCorpus.

Keyphrases: Tacotron2, Vietnamese speech synthesis, deep learning, speech synthesis, speech synthesis system, text-to-speech