Investigating supra-intelligibility aspects of speech

Simantiraki, Olympia

dc.contributor.advisor	Cooke, Martin
dc.contributor.advisor	García Lecumberri, María Luisa
dc.contributor.author	Simantiraki, Olympia
dc.date.accessioned	2022-08-09T10:42:30Z
dc.date.available	2022-08-09T10:42:30Z
dc.date.issued	2022-06-23
dc.date.submitted	2022-06-23
dc.identifier.uri	http://hdl.handle.net/10810/57266
dc.description	158 p.	es_ES
dc.description.abstract	Synthetic and recorded speech form a great part of our everyday listening experience, and much of our exposure to these forms of speech occurs in potentially noisy settings such as on public transport, in the classroom or workplace, while driving, and in our homes. Optimising speech output to ensure that salient information is both correctly and effortlessly received is a main concern for the designers of applications that make use of the speech modality. Most of the focus in adapting speech output to challenging listening conditions has been on intelligibility, and specifically on enhancing intelligibility by modifying speech prior to presentation. However, the quality of the generated speech is not always satisfying for the recipient, which might lead to fatigue, or reluctance in using this communication modality. Consequently, a sole focus on intelligibility enhancement provides an incomplete picture of a listener's experience since the effect of modified or synthetic speech on other characteristics risks being ignored. These concerns motivate the study of 'supra-intelligibility' factors such as the additional cognitive demand that modified speech may well impose upon listeners, as well as quality, naturalness, distortion and pleasantness. This thesis reports on an investigation into two supra-intelligibility factors: listening effort and listener preferences. Differences in listening effort across four speech types (plain natural, Lombard, algorithmically-enhanced, and synthetic speech) were measured using existing methods, including pupillometry, subjective judgements, and intelligibility scores. To explore the effects of speech features on listener preferences, a new tool, SpeechAdjuster, was developed. SpeechAdjuster allows the manipulation of virtually any aspect of speech and supports the joint elicitation of listener preferences and intelligibility measures.
The tool reverses the roles of listener and experimenter by allowing listeners direct control of speech characteristics in real-time. Several experiments to explore the effects of speech properties on listening preferences and intelligibility using SpeechAdjuster were conducted. Participants were permitted to change a speech feature during an open-ended adjustment phase, followed by a test phase in which they identified speech presented with the feature value selected at the end of the adjustment phase. Experiments with native normal-hearing listeners measured the consequences of allowing listeners to change speech rate, fundamental frequency, and other features which led to spectral energy redistribution. Speech stimuli were presented in both quiet and masked conditions. Results revealed that listeners prefer feature modifications similar to those observed in naturally modified speech in noise (Lombard speech). Further, Lombard speech required the least listening effort compared to either plain natural, algorithmically-enhanced, or synthetic speech. For stationary noise, as noise level increased listeners chose slower speech rates and flatter tilts compared to the original speech. Only the choice of fundamental frequency was not consistent with that observed in Lombard speech. It is possible that features such as fundamental frequency that talkers naturally modify are by-products of the speech type (e.g. hyperarticulated speech) and might not be advantageous for the listener. Findings suggest that listener preferences provide information about the processing of speech over and above that measured by intelligibility. One of the listeners' concerns was to maximise intelligibility.
In noise, listeners preferred the feature values for which more information survived masking, choosing speech rates that led to a contrast with the modulation rate of the masker, or modifications that led to a shift of spectral energy concentration to higher frequencies compared to those of the masker. For all features being modified by listeners, preferences were evident even when intelligibility was at or close to ceiling levels. Such preferences might result from a desire to reduce the cognitive effort of understanding speech, or from a desire to reproduce the sound of typical speech features experienced in real-world noisy conditions, or to optimise the quality of the modified signal. Investigation of supra-intelligibility aspects of speech promises to improve the quality of speech enhancement algorithms, bringing with it the potential of reducing the effort of understanding artificially-modified or generated forms of speech.	es_ES
dc.language.iso	eng	es_ES
dc.rights	info:eu-repo/semantics/openAccess	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc/3.0/es/	*
dc.subject	phonetics	es_ES
dc.subject	psycholinguistics	es_ES
dc.title	Investigating supra-intelligibility aspects of speech	es_ES
dc.type	info:eu-repo/semantics/doctoralThesis	es_ES
dc.rights.holder	Attribution-NonCommercial 3.0 Spain	*
dc.rights.holder	(cc) 2022 Olympia Simantiraki (cc by-nc 4.0)
dc.identifier.studentID	891471	es_ES
dc.identifier.projectID	19851	es_ES
dc.departamentoes	Filología Inglesa y Alemana y Traducción e Interpretación	es_ES
dc.departamentoeu	Ingeles eta Aleman Filologia eta Itzulpengintza eta Interpretazioa	es_ES

Files in this item

Name: license_rdf
Size: 920 bytes
Format: application/rdf+xml

Name: TESIS_OLYMPIA_SIMANTIRAKI.pdf
Size: 42.96Mb
Format: PDF
Description: Doctoral thesis

This item appears in the following Collection(s)

TD-Arte y Humanidades

Except where otherwise noted, this item's license is described as Attribution-NonCommercial 3.0 Spain