Frame-Based Phone Classification Using EMG Signals

Salomons, Inge; Del Blanco Sierra, Eder; Navas Cordón, Eva; Hernáez Rioja, Inmaculada; De Zuazo Oteiza, Xabier

dc.contributor.author	Salomons, Inge
dc.contributor.author	Del Blanco Sierra, Eder
dc.contributor.author	Navas Cordón, Eva
dc.contributor.author	Hernáez Rioja, Inmaculada
dc.contributor.author	De Zuazo Oteiza, Xabier
dc.date.accessioned	2023-08-08T12:15:09Z
dc.date.available	2023-08-08T12:15:09Z
dc.date.issued	2023-06-30
dc.identifier.citation	Applied Sciences 13(13) : (2023) // Article ID 7746	es_ES
dc.identifier.issn	2076-3417
dc.identifier.uri	http://hdl.handle.net/10810/62135
dc.description.abstract	This paper evaluates the impact of inter-speaker and inter-session variability on the development of a silent speech interface (SSI) based on electromyographic (EMG) signals from the facial muscles. The final goal of the SSI is to provide a communication tool for Spanish-speaking laryngectomees by generating audible speech from voiceless articulation. However, before moving on to such a complex task, a simpler phone classification task in different modalities regarding speaker and session dependency is performed for this study. These experiments consist of processing the recorded utterances into phone-labeled segments and predicting the phonetic labels using only features obtained from the EMG signals. We evaluate and compare the performance of each model considering the classification accuracy. Results show that the models are able to predict the phonetic label best when they are trained and tested using data from the same session. The accuracy drops drastically when the model is tested with data from a different session, although it improves when more data are added to the training data. Similarly, when the same model is tested on a session from a different speaker, the accuracy decreases. This suggests that using larger amounts of data could help to reduce the impact of inter-session variability, but more research is required to understand if this approach would suffice to account for inter-speaker variability as well.	es_ES
dc.description.sponsorship	This research was funded by Agencia Estatal de Investigación grant number ref.PID2019-108040RB-C21/AEI/10.13039/501100011033	es_ES
dc.language.iso	eng	es_ES
dc.publisher	MDPI	es_ES
dc.relation	info:eu-repo/grantAgreement/MICINN/PID2019-108040RB-C21	es_ES
dc.rights	info:eu-repo/semantics/openAccess	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/
dc.subject	EMG signals	es_ES
dc.subject	phone classification	es_ES
dc.subject	silent speech interfaces	es_ES
dc.subject	human–computer interaction	es_ES
dc.subject	speech processing	es_ES
dc.title	Frame-Based Phone Classification Using EMG Signals	es_ES
dc.type	info:eu-repo/semantics/article	es_ES
dc.date.updated	2023-07-13T14:07:17Z
dc.rights.holder	© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).	es_ES
dc.relation.publisherversion	https://www.mdpi.com/2076-3417/13/13/7746	es_ES
dc.identifier.doi	10.3390/app13137746
dc.departamentoes	Ingeniería de comunicaciones
dc.departamentoeu	Komunikazioen ingeniaritza

Files in this item

Name: applsci-13-07746.pdf
Size: 22.64 MB
Format: PDF

This item appears in the following Collection(s)

Artículos

Except where otherwise noted, this item's license is described as © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).