Frame-Based Phone Classification Using EMG Signals
dc.contributor.author | Salomons, Inge | |
dc.contributor.author | Del Blanco Sierra, Eder | |
dc.contributor.author | Navas Cordón, Eva | |
dc.contributor.author | Hernáez Rioja, Inmaculada | |
dc.contributor.author | De Zuazo Oteiza, Xabier | |
dc.date.accessioned | 2023-08-08T12:15:09Z | |
dc.date.available | 2023-08-08T12:15:09Z | |
dc.date.issued | 2023-06-30 | |
dc.identifier.citation | Applied Sciences 13(13) : (2023) // Article ID 7746 | es_ES |
dc.identifier.issn | 2076-3417 | |
dc.identifier.uri | http://hdl.handle.net/10810/62135 | |
dc.description.abstract | This paper evaluates the impact of inter-speaker and inter-session variability on the development of a silent speech interface (SSI) based on electromyographic (EMG) signals from the facial muscles. The final goal of the SSI is to provide a communication tool for Spanish-speaking laryngectomees by generating audible speech from voiceless articulation. Before moving on to such a complex task, however, this study performs a simpler phone classification task under different speaker- and session-dependency conditions. These experiments consist of processing the recorded utterances into phone-labeled segments and predicting the phonetic labels using only features obtained from the EMG signals. We evaluate and compare the models in terms of classification accuracy. Results show that the models predict the phonetic labels best when they are trained and tested using data from the same session. The accuracy drops drastically when the model is tested with data from a different session, although it improves when more data are added to the training data. Similarly, when the same model is tested on a session from a different speaker, the accuracy decreases. This suggests that using larger amounts of data could help to reduce the impact of inter-session variability, but more research is required to understand if this approach would suffice to account for inter-speaker variability as well. | es_ES |
dc.description.sponsorship | This research was funded by Agencia Estatal de Investigación grant number ref.PID2019-108040RB-C21/AEI/10.13039/501100011033 | es_ES |
dc.language.iso | eng | es_ES |
dc.publisher | MDPI | es_ES |
dc.relation | info:eu-repo/grantAgreement/MICINN/PID2019-108040RB-C21 | es_ES |
dc.rights | info:eu-repo/semantics/openAccess | es_ES |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | |
dc.subject | EMG signals | es_ES |
dc.subject | phone classification | es_ES |
dc.subject | silent speech interfaces | es_ES |
dc.subject | human–computer interaction | es_ES |
dc.subject | speech processing | es_ES |
dc.title | Frame-Based Phone Classification Using EMG Signals | es_ES |
dc.type | info:eu-repo/semantics/article | es_ES |
dc.date.updated | 2023-07-13T14:07:17Z | |
dc.rights.holder | © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). | es_ES |
dc.relation.publisherversion | https://www.mdpi.com/2076-3417/13/13/7746 | es_ES |
dc.identifier.doi | 10.3390/app13137746 | |
dc.departamentoes | Ingeniería de comunicaciones | |
dc.departamentoeu | Komunikazioen ingeniaritza |
Except where otherwise noted, this item's license is described as © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).