Show simple item record

dc.contributor.advisor: Navas Cordón, Eva
dc.contributor.advisor: Hernáez Rioja, Inmaculada
dc.contributor.author: Madina González, Margot
dc.date.accessioned: 2023-06-30T14:51:04Z
dc.date.available: 2023-06-30T14:51:04Z
dc.date.issued: 2023-06-30
dc.identifier.uri: http://hdl.handle.net/10810/61821
dc.description.abstract: [EN] Automatic Speech Recognition (ASR) systems have become an everyday tool worldwide. Their use has spread in recent years, and they have also been implemented in Environmental Control Systems (ECS) and Speech Generating Devices (SGD), among others. These systems might be especially beneficial for people with physical disabilities, as they would be able to control different devices with voice commands, thereby reducing the physical effort they have to make. However, people with functional diversity often have difficulties with speech articulation too. One of the most common speech articulation problems is dysarthria, a disorder of the nervous system that causes weakness in the muscles used for speech. Existing commercial ASR systems are not able to correctly understand dysarthric speech, so people with this condition cannot exploit this technology. Some research tackling this issue has been conducted, but an optimal solution has not yet been reached. Moreover, nearly all existing research on the matter is in English; no previous study has approached the problem in other languages. Apart from this, ASR systems require large speech databases, of which there are currently very few, most of them in English and not designed for this purpose. Some commercial ASR systems offer a customization interface where users can train a base model with their own speech data and thus improve recognition accuracy. In this thesis, we evaluated the performance of the commercial ASR system Microsoft Azure Speech to Text. First, we reviewed the current state of the art. Then, we created a pilot database in Spanish and recorded it with 3 heterogeneous people with dysarthria and 1 typical speaker to be used as a reference. Lastly, we trained the system and conducted different experiments to measure its accuracy. Results show that, overall, the customized models outperform the base models of the system. However, the results were not homogeneous and varied depending on the speaker. Even though the recognition accuracy improved considerably, the results were far from being as good as those obtained for typical speech. [es_ES]
dc.language.iso: eng [es_ES]
dc.rights: info:eu-repo/semantics/openAccess
dc.subject: automatic speech recognition [es_ES]
dc.subject: dysarthria
dc.subject: intelligibility
dc.subject: Spanish
dc.title: Evaluation of STT technologies performance and database design for Spanish dysarthric speech [es_ES]
dc.type: info:eu-repo/semantics/masterThesis
dc.date.updated: 2021-06-14T09:00:34Z
dc.language.rfc3066: es
dc.rights.holder: © 2021, the author
dc.contributor.degree: Máster Universitario en Análisis y Procesamiento del Lenguaje
dc.contributor.degree: Hizkuntzaren Azterketa eta Prozesamendua Unibertsitate Masterra
dc.identifier.gaurregister: 114375-729038-09 [es_ES]
dc.identifier.gaurassign: 123042-729038 [es_ES]



