Mejora en la implementación de voz de un bertsolari sintetico

Del Blanco Sierra, Eder

Bachelor's thesis (Trabajo Fin de Grado, 3.704 MB)

Date

2016-11-28

URI

http://hdl.handle.net/10810/19630

Abstract

[ES] This project was proposed by the Aholab Signal Processing Laboratory research group, part of the Department of Communications Engineering of the Escuela de Ingeniería de Bilbao (UPV/EHU). The work is part of a project that began in 2011 [ASTIGARRAGA. A] with the aim of creating a robot able to improvise and sing verses (Bertsobot). The Asociación de Amigos del Bertsolarismo and three UPV/EHU laboratories collaborate on it: the Robotics and Autonomous Systems group, Ixa and Aholab. The singing voice synthesizer created in this project is built on AhoTTS, Aholab's text-to-speech converter. AhoTTS generates spoken voice from written text. For the voice to sing, the software has to be adapted. On the one hand, spoken and sung voice differ greatly in their characteristics, so the signal generation process must be revised. On the other hand, the input to a singing voice generation system is a musical score, which carries not only the text to be sung (as with spoken voice) but also the melody (fundamental frequencies and durations of the sounds). This project starts from earlier work done to implement these adaptations. Its objectives follow from the shortcomings of that initial work and consist precisely in addressing them: making the pronunciation of the synthetic voice more intelligible, resolving problems in the transitions between notes, and making the configuration of the melody more dynamic. The tool proposed to achieve these objectives is Pure Data, a visual programming language designed for creating electronic music.
The project has produced software that obtains the sung voice corresponding to a score fully automatically. The system was submitted to the Singing Synthesis Challenge of the international conference Interspeech 2016, where it will be presented in September 2016. The system is described in the paper published in the conference proceedings [DEL BLANCO, E.].

[EU] This project was proposed by Aholab. Aholab is a signal processing laboratory of the Department of Electronics and Telecommunications of the UPV/EHU Higher Technical School of Engineering. This work builds on a project started in 2011 [ASTIGARRAGA. A], which set out to build a robot able to improvise and sing bertsos (Bertsobot). Bertsozale Elkartea and three UPV/EHU laboratories took part: the Robotics and Autonomous Systems research group, Ixa and Aholab. In this project, Aholab's task is to give the bertsos a voice. The singing voice synthesizer created in this project was built using AhoTTS, the speech synthesizer developed by Aholab. AhoTTS generates spoken voice from written text. For the voice to be able to sing, some adaptations to the software are required. On the one hand, spoken voice and sung voice have very different characteristics, so the signal generation process must be adapted. On the other hand, the input to a singing synthesizer is a score, which contains information about both the text (as in the case of spoken voice) and the melody (fundamental frequencies and duration of the sounds). This project takes as its starting point some earlier work on implementing these adaptations. The goal of this work is to fix the initial shortcomings: improving the intelligibility of the synthetic voice, smoothing the transitions between notes, and making the configuration of the melody more dynamic. To reach these goals, the use of Pure Data is proposed, a visual programming language made for creating electronic music. This project has developed software that can automatically generate a voice singing the melody of a score. The system was submitted to the Singing Synthesis Challenge of the international conference Interspeech 2016, and will be presented there in September 2016. The system is described in the paper published in the conference proceedings [DEL BLANCO, E.].

[EN] This project was proposed by the Aholab Signal Processing Laboratory, which belongs to the Department of Communications Engineering in the Faculty of Engineering of the University of the Basque Country (UPV/EHU). The work is part of a project that began in 2011 [ASTIGARRAGA. A] with the objective of creating a robot able to improvise and sing verses (Bertsobot). The Association of the Friends of Bertsolaritza and three laboratories of the UPV/EHU collaborate on it: the Robotics and Autonomous Systems group, Ixa and Aholab. The singing voice synthesizer created in this project is based on AhoTTS, Aholab's text-to-speech converter. AhoTTS generates spoken voice from written text. For the voice to sing, the software must be adapted. On the one hand, the features of the spoken voice and the singing voice differ widely, so the signal generation process has to be revised. On the other hand, the input of a singing voice synthesizer is a music score, which contains text information (as in a speech synthesizer) as well as melody information (fundamental frequencies and the duration of each sound). This project takes as its starting point some earlier work done to implement these adaptations. The goal of this work is to resolve the initial deficiencies: improving the pronunciation of the synthetic voice to make it more intelligible, solving problems in the transitions between notes, and easing the configuration of the melody. The tool proposed to achieve these goals is Pure Data, a visual programming language designed for creating electronic music. The product of this project is software that automatically obtains the singing voice corresponding to a score. The developed system was submitted to the Singing Synthesis Challenge of Interspeech 2016 and will be presented there in September 2016. The system is described in the paper published in the conference proceedings [DEL BLANCO, E.].

Collections

Trabajos Académicos-Escuela de Ingeniería de Bilbao (Restringido)