
dc.contributor.author: García Romillo, Víctor
dc.contributor.author: Hernáez Rioja, Inmaculada
dc.contributor.author: Navas Cordón, Eva
dc.date.accessioned: 2022-02-11T17:30:55Z
dc.date.available: 2022-02-11T17:30:55Z
dc.date.issued: 2022-02-07
dc.identifier.citation: Applied Sciences 12 (3): 1686 (2022) [es_ES]
dc.identifier.issn: 2076-3417
dc.identifier.uri: http://hdl.handle.net/10810/55452
dc.description.abstract: In this paper, we describe the implementation and evaluation of text-to-speech synthesizers based on neural networks for Spanish and Basque. Several voices were built, all of them using a limited amount of data. The system applies Tacotron 2 to compute mel-spectrograms from the input sequence, followed by WaveGlow as a neural vocoder to obtain the audio signals from the spectrograms. The limited amount of training data leads to synthesis errors in some sentences. To automatically detect those errors, we developed a new method that finds the sentences that have lost the alignment during the inference process. To mitigate the problem, we implemented guided attention, providing the system with the explicit duration of the phonemes. The resulting system was evaluated to assess its robustness, quality and naturalness with both objective and subjective measures. The results show that the system can produce good-quality, natural-sounding audio. [es_ES]
dc.description.sponsorship: This work was funded by the Basque Government (Project refs. PIBA 2018-035, IT-1355-19). This work is part of the project Grant PID 2019-108040RB-C21 funded by MCIN/AEI/10.13039/501100011033. [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: MDPI [es_ES]
dc.relation: info:eu-repo/grantAgreement/MCIN/PID 2019-108040RB-C21 [es_ES]
dc.rights: info:eu-repo/semantics/openAccess [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/es/
dc.subject: speech synthesis [es_ES]
dc.subject: robustness [es_ES]
dc.subject: text to speech [es_ES]
dc.subject: Spanish [es_ES]
dc.subject: Basque [es_ES]
dc.title: Evaluation of Tacotron Based Synthesizers for Spanish and Basque [es_ES]
dc.type: info:eu-repo/semantics/article [es_ES]
dc.date.updated: 2022-02-11T14:46:20Z
dc.rights.holder: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). [es_ES]
dc.relation.publisherversion: https://www.mdpi.com/2076-3417/12/3/1686 [es_ES]
dc.identifier.doi: 10.3390/app12031686
dc.departamentoes: Ingeniería de comunicaciones
dc.departamentoeu: Ingeniaritza kimikoa


Except where otherwise noted, this item's license is described as © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).