One-to-One or One Too Many? Linking Sound-to-Letter Mappings to Speech Sound Perception and Production in Early Readers


Abstract
Purpose: Effects related to literacy acquisition have been observed at different levels of speech processing. This study investigated the link between orthographic knowledge and children's perception and production of specific speech sounds. Method: Sixty Spanish-speaking second graders, differing in their phonological decoding skills, completed a speech perception and a production task. In the perception task, a behavioral adaptation of the oddball paradigm was used. Children had to detect orthographically consistent /t/, which has a unique orthographic representation (⟨t⟩), and inconsistent /k/, which maps onto three different graphemes (⟨c⟩, ⟨qu⟩, and ⟨k⟩), both appearing infrequently within a repetitive auditory sequence. In the production task, children produced these same sounds in meaningless syllables. Results: Perception results show that all children were faster at detecting consistent than inconsistent sounds regardless of their decoding skills. In the production task, however, the same facilitation for consistent sounds was linked to better decoding skills. Conclusions: These findings demonstrate differences in speech sound processing related to literacy acquisition. Literacy acquisition may therefore affect already-formed speech sound representations. Crucially, the strength of this link in production is modulated by individual decoding skills.
This study explores the relationship between literacy acquisition and both perception and production of specific speech sounds. Previous research has revealed changes in auditory speech processing related to literacy acquisition (Burnham, 2003; Castles et al., 2003; Morais et al., 1979; Seidenberg & Tanenhaus, 1979). At the metalinguistic level, for instance, orthographic knowledge has been linked with the enhanced perception of speech sounds and, consequently, improved phonological awareness. As a result, a person's ability to perceive and manipulate discrete sounds within spoken words improves (Morais et al., 1979). Specifically, performance on phonological awareness tasks (e.g., removing the initial sound from the word /kæt/) increases with literacy acquisition and is better in adults and children with better reading and phonological decoding skills (Pratt & Brady, 1988; Swank & Catts, 1994). Furthermore, both adult and child readers are faster and more accurate in mentally removing speech sounds with transparent grapheme correspondences (e.g., /m/ in ⟨mist⟩) than those without transparent grapheme correspondences (e.g., /r/ in ⟨wrist⟩; Castles et al., 2003). These findings thus illustrate the positive link between orthographic knowledge and performance on tasks that require phonological awareness.
Interestingly, around the same time that formal reading instruction starts, at the beginning of primary school, two important changes in speech perception and production have been observed: Children become better at speech sound discrimination, and their speech production becomes more compact, as explained in detail below. First, children become better at discriminating two sounds from the same phonetic feature continuum, suggesting that their speech sound perception improves (Hoonhorst et al., 2009, 2011; Kolinsky et al., 2021; see also Hazan and Barrett, 2000, for evidence of an ongoing refinement in the categorical perception skill until the age of 12 years). For instance, perception of the boundaries between two speech sounds such as /d/ and /t/, which differ in terms of voice onset time, becomes more precise as indicated by increased steepness in the identification function slope. This pronounced improvement in boundary precision (BP) has been reported around the age of 6 years (Burnham, 2003; Hoonhorst et al., 2011), at the onset of formal reading instruction (see the reading hypothesis proposed by Burnham, 2003). To disentangle the effects of age and literacy on increased BP, and thus directly test the idea that literacy acquisition drives these BP improvements, Kolinsky et al. (2021) compared categorical perception of /d/ versus /t/ in beginning children readers, beginning adult readers, and skilled adult readers. Significant differences were observed in BP (as measured by the slope of the identification function) between the two groups of adults (beginning vs. skilled readers), but not between children and beginning adult readers. These results indicate that improvements in BP are indeed linked to reading acquisition rather than age.
Although these data provided indirect evidence that speech sound perception sharpens with literacy acquisition, there is still no direct evidence that learning phoneme-to-grapheme conversion (PGC) rules, in particular, is related to this change. Such evidence would shed light on whether literacy acquisition affects speech processing 1 by modifying already existing representations of speech sounds. It could also suggest that, once literacy has been acquired, speech sound processing automatically activates corresponding orthographic representations, hence leading to differences in perceiving and producing sounds that map onto only one grapheme versus those that map onto multiple graphemes. Therefore, in this study, we focus specifically on the link between learning PGC rules and speech sound perception and production.
Second, around the same time children enter school, their speech production also changes: Vowel production becomes increasingly compact as measured by decreases in formant frequency dispersion (Ménard et al., 2007). Ménard et al. compared vowel production in two groups of French-speaking children; the 8-year-olds produced more compact French vowels than the 4-year-olds. Explanations for this reduction in formant frequency dispersion have so far emphasized only physical changes, such as vocal tract growth during maturation, as the main cause driving these developmental changes in speech production (Ménard et al., 2004). However, a closer look at the reported results shows that increased compactness is most strongly pronounced for sounds that map onto only one grapheme in French (i.e., consistent sounds). For example, based on visual inspection of Figure 3 in Ménard et al. (2004), the reduction of formant dispersion seems to be larger for /u/, which maps onto only one digraph ⟨ou⟩ in French, than for /o/, which has at least three possible orthographic representations in French: ⟨o⟩, ⟨au⟩, and ⟨eau⟩. This points to a potential link between orthographic knowledge and the production of speech sounds, a possibility directly tested in this study.
The hypothesis that speech sound production may, apart from obvious physical changes, also be linked to literacy acquisition is supported by recent evidence showing a relationship between articulatory maturation and literacy skills. Popescu and Noiray (2021) used the ultrasound tongue imaging technique to measure the degree of coarticulation between vowels and consonants (i.e., the intersegmental co-articulation degree). As several previous studies reported a lesser degree of co-articulation in older as compared to younger speakers (Noiray et al., 2013; Noiray, Wieling, et al., 2019; Zharkova et al., 2011), this measure is commonly used to assess articulatory maturation (Noiray, Wieling, et al., 2019). Popescu and Noiray observed less intersyllabic co-articulation, hence better articulatory skills, during production of disyllabic German pseudowords in a group of more proficient German beginning readers. Similarly, Saletta (2019) showed a link between greater speech movement stability and better reading skills in 7-year-old English-speaking children. Altogether, these findings show a link between good literacy skills and articulatory gestures related to speech production, demonstrating that differences in speech processing related to literacy acquisition are not only present in speech perception but can be observed in production as well. There is still no direct evidence, however, that PGC rule learning, in particular, is linked to these differences. This possibility is thus tested in this study.
Finally, further evidence that orthographic knowledge and, more specifically, the acquisition of PGC rules can influence spoken word processing comes from the literature on the orthographic consistency effect (OCE; see the seminal study by Seidenberg & Tanenhaus, 1979). By manipulating spelling-to-sound consistency, studies reporting the OCE have shown that spoken words with consistently spelled rhymes (e.g., /oʊb/ in globe, which can only be written as ⟨obe⟩) are recognized faster and more accurately than words with inconsistently spelled rhymes (e.g., /eɪm/ in name, which can be written as either ⟨ame⟩ or ⟨aim⟩). These effects have been shown in different paradigms performed entirely in the auditory modality (i.e., without any explicit reference to orthography), such as auditory lexical decision (Ziegler & Ferrand, 1998), phoneme monitoring (Dijkstra et al., 1995), phoneme counting (Treiman & Cassar, 1997), and shadowing (Pattamadilok et al., 2009; Ventura et al., 2004) tasks. With less robust results (Alario et al., 2007), the OCE has also been observed in speech production tasks such as picture naming (Rastle et al., 2011). Importantly, the OCE has been studied across languages with different degrees of orthographic complexity. English has a particularly complex orthography with both spelling-to-sound and sound-to-spelling inconsistencies (Ziegler et al., 1997), yet the OCE has been replicated not only in French (Ziegler & Ferrand, 1998), which is inconsistent only in one direction (i.e., from sound to spelling; Ziegler et al., 1996), but also in more transparent languages such as Portuguese (Ventura et al., 2004). In addition, the OCE has been studied in populations with different levels of reading skills and was found to be nonsignificant in prereading and dyslexic children (Miller & Swick, 2003; Ziegler & Muneaux, 2007).

1 The term speech processing refers to both perception and production.
The fact that the aforementioned orthographic effects were observed in tasks not making explicit reference to orthography makes the OCE suitable for investigating the relationship between orthographic knowledge and spoken language processing. Since the OCE has been observed in a range of populations with different levels of reading skill, such as child and adult readers, it is also a convenient tool for tracking changes in spoken language processing related to the acquisition of literacy. However, although previous research only investigated the OCE at the lexical level, orthographic effects on speech processing may arise independently of lexical access, that is, at the level of individual speech sounds. Although there is strong evidence that whole-word processing is affected by PGC rules, there is still no direct evidence that these effects lie at the level of individual speech sounds and are related to literacy acquisition. This study will thus test whether the OCE is sensitive to a lower level of spoken language processing, that is, the level of speech sound processing.
In summary, this study set out to expand on previous OCE findings by testing the effects related to the acquisition of literacy at the more fine-grained level of individual speech sound processing. Consequently, we aimed to examine the relationship between literacy acquisition and the processing of individual speech sounds. We hypothesized that if speech sound representations are malleable, they may be affected by the orthographic codes that become associated with them during literacy acquisition: Sounds that map onto only one grapheme could have more salient and fine-grained representations than sounds that map onto more than one grapheme. If so, these reading-related changes should be observed as improved processing of single-grapheme speech sounds in individuals with better orthographic knowledge (i.e., individuals who developed stronger sound-to-letter links). Moreover, we reasoned that if these effects of orthographic consistency arise even at the lowest units of speech processing, they should be most easily observed at early stages of reading acquisition when phonological decoding skills play a central role in successful reading. Beginning readers rely heavily on phonological decoding skills (Share, 1995, 2004) and therefore predominantly use the sublexical route when reading isolated words (Coltheart, 2005; Coltheart et al., 1993, 2001); hence, they most likely prioritize single phonemes and phonological decoding during the earliest stages of reading acquisition.

This Study
This study aimed to investigate the relationship between orthographic knowledge and both perception and production of individual speech sounds in early readers of Spanish. We took advantage of the OCE, shown to be robust at the lexical level, to investigate perception and production of speech sounds that vary in terms of the number of graphemes they map onto. In contrast to previously studied languages (i.e., English, French, and Portuguese), Spanish is orthographically highly transparent: Most Spanish sounds map onto only one grapheme. Nevertheless, Spanish contains several inconsistencies present at the level of individual speech sounds (e.g., the sound /k/ can be written as ⟨c⟩, ⟨qu⟩, or ⟨k⟩), therefore making it a good candidate for investigating orthographic effects at the speech sound level.
We tested how orthographic effects interact with phonological decoding skills in 60 second graders with different levels of decoding skills. 2 All children completed a two-session experiment comprising tasks designed to measure their phonological decoding skills and their production and perception of voiceless plosives /p/, /t/, and /k/. These three sounds were particularly suitable for our purposes for several reasons. First, given that these sounds are acquired around the same time in Spanish (Macken & Barton, 1980; McLeod & Crowe, 2018) and that children acquiring Spanish can already produce these sounds at an early age, the second graders tested in this study were not expected to exhibit much variability in producing these sounds due to insufficient speech motor control. Second, the production of these sounds would not be compromised if a child had recently lost a tooth, particularly a front tooth, as is common for this age group. Third, these sounds are phonetically similar to each other, differing in only one phonetic feature (i.e., place of articulation; see Table 1). Finally, and crucially for this study and our main research questions, these sounds vary in terms of the consistency with which they map onto graphemes in Spanish (see Table 1). Whereas /p/ and /t/ are both consistent sounds, with unique grapheme representations (⟨p⟩ and ⟨t⟩, respectively), /k/ is an inconsistent sound that maps onto three different graphemes (⟨c⟩, ⟨qu⟩, and ⟨k⟩). We hypothesized that any differences related to orthography would be observed between consistent and inconsistent sounds.
To test perception of consistent versus inconsistent speech sounds, we employed a behavioral version of the classic oddball paradigm (Näätänen et al., 1978). In our version of the task, children had to detect and respond to one of two deviant sounds that infrequently appeared within a repetitive auditory stream of sounds. Importantly, one of the deviant sounds was consistent (/t/, which can only be written as ⟨t⟩), whereas the other was inconsistent (/k/, which maps onto three different graphemes: ⟨c⟩, ⟨qu⟩, or ⟨k⟩). This particular task was used in order to test the perception of the sounds in isolation rather than in the context of comparison with another sound, as is done in categorical perception tasks. We reasoned that if there is a relationship between speech sound perception and the knowledge of PGC rules, children would be slower to respond to the inconsistent than to the consistent sound. Given that consistent sounds are associated with only one grapheme, their representations may benefit from this one-to-one link and thus become more salient as compared to those of inconsistent sounds (i.e., sounds that have multiple grapheme representations). Furthermore, we expected that differences in processing orthographic consistencies should be larger in children with better decoding skills as their phoneme-to-grapheme links should be stronger.
In the speech production task, speech onset time (SOT; see Table 1) was used as a behavioral indicator of the time needed to prepare and initiate the oral response. Building on recent findings from Popescu and Noiray (2021) and in line with the literature showing that children with poor reading skills (e.g., children with dyslexia) are slower to articulate and produce syllables compared to their peers with no reading problems (Duranovic & Sehic, 2013; Fawcett & Nicolson, 2002), we set out to investigate whether these reading-related effects, in addition to being present at the articulatory level, may also be seen at higher levels of speech production (i.e., the level of speech planning). We hypothesized that children with better decoding skills would also be overall faster at producing isolated speech sounds. Finally, if there is a link between phoneme-to-grapheme consistency and speech sound production, it should be stronger for consistent sounds. Both the perception and production tasks were designed in such a way that their completion did not rely on and, importantly, did not make explicit reference to orthography (i.e., orthographic representations of tested sounds). This was done to minimize the possibility of activating orthographic representations and hence investigate the nature of speech sound representations.

Method

Participants
Sixty second graders took part in the study (M age = 7;5 [years;months], SD age = 3.35 months, 34 girls). All children came from the same socioeconomic background, and all attended the same school in Vitoria-Gasteiz, Spain, where they were tested. The study was approved by the Basque Center on Cognition, Brain and Language (BCBL) Ethics Review Board (Reference No. 141119SM_B), and all parents provided written consent for their child's participation in the study. None of the children had any reported learning, speech, or reading disabilities. All children had Spanish as their first and dominant language, which was also the only language spoken at home.

Procedure
The experiment was organized into two sessions, which for most children took place on two consecutive days. Eight children who completed the first session on a Friday had to complete the second session on the following Monday. The first session always started with the speech production task: Children produced eight speech sounds in separate randomized blocks. This task was always completed at the beginning of the experiment to avoid any orthographic interference from the two decoding tasks administered during the same session. As a part of a different project, children then completed the first half of a categorical perception task. This task was broken down into two parts so it would not be too long for the children to complete. Finally, at the end of the first session, children completed two phonological decoding tasks, that is, a word-pseudoword reading task and the Alondra test. As only the latter is of relevance for this study, it is described in detail (see The Alondra Test section). The order of the phonological decoding tasks was counterbalanced across participants. The second session always started with the second half of the categorical perception task, followed by the two oddball sound detection tasks (two different sets of sounds were tested, /θ/-/f/ and /k/-/t/; the former was part of a different project and is not reported here). The order of these tasks was counterbalanced across participants. At the end of the second session, all children completed a spelling task and the nonverbal IQ assessment with the Matrices subtest of the Kaufman Brief Intelligence Test-Second Edition (Kaufman & Kaufman, 2004). Nonverbal IQ was assessed in order to ensure that all children were within 2.5 SDs of the sample mean. The categorical perception, word-pseudoword reading, and spelling tasks were administered for a different project and are not presented here.

Note. Although the sound /k/ has three possible grapheme representations, the frequency of these different realizations differs: ⟨c⟩ and ⟨qu⟩ have similar frequency of appearance, but ⟨k⟩ is less frequent in Spanish words (Duchon et al., 2013).
Each child was tested individually in a silent room in the Junior Laboratory of the BCBL located at the children's school. While performing the experimental tasks, children were seated in front of the computer; the distance between the child and the computer screen was around 60 cm. The experimenter remained in the room with the child during the entire experimental procedure. All tasks were controlled using OpenSesame software (Version 3.0.2; Mathôt et al., 2012), and the audio stimuli were played through Sennheiser GSP 350 headphones with an integrated microphone that recorded verbal responses. Before each task, children indicated if the volume was at a comfortable level and, if necessary, the volume was adjusted accordingly.

Speech Perception Task
The aim of the speech perception task was to compare children's perception of consistent and inconsistent speech sounds. Each child completed two different versions of the task. Both versions had the same structure but tested different sound pairs. Here, we consider only the results from the /t/ versus /k/ pair.
The task followed the structure of the classic auditory oddball paradigm (Näätänen et al., 1978), in which a standard or base sound is presented on most of the trials whereas a deviant sound is presented at irregular intervals. In this version of the task, children were presented with an auditory sequence of base sounds (the consistent speech sound /f/ in the syllable /fə/) and were instructed to press a highlighted key ("M") on the keyboard as quickly as possible every time they heard a sound that differed from this base sound (i.e., a deviant). Two different deviant sounds were embedded in the auditory stream: the inconsistent deviant /k/ and the consistent deviant /t/, both presented as schwa syllables. The base sound /f/ was presented on 80% of trials, and on the remaining 20% of trials, children heard one of the two deviant sounds (10% /k/, 10% /t/). Note that all three stimuli consisted of a consonant and a schwa vowel (i.e., /kə/, /tə/, and /fə/); however, for the sake of simplicity, in this article, the three sounds are always depicted as pure consonants without the schwa vowel.
Ten pseudorandomized lists were created and presented in a random order to control the number of base sounds that appeared before either the consistent or inconsistent deviant and to ensure that deviant sounds were never presented on consecutive trials. These lists comprised sequences of three, four, or five consecutive base sounds, followed by one of the two deviants (see Figure 1). Each list included a total of 60 sounds, comprising 48 base and 12 deviant sounds. In total, the 10 lists together included 600 trials (480 base sounds, which served as fillers and were not relevant for the analysis, and 120 deviant sounds, which were analyzed). To reduce predictability, the number of /k/ and /t/ within each list varied from four to eight (e.g., in List 1, there were four instances of /k/ and eight of /t/; in List 2, there were five instances of /k/ and seven instances of /t/; etc.). The interstimulus interval (ISI) between each two sounds was randomized at either 900, 950, or 1000 ms. Throughout the task, a yellow star was present at the center of the screen. Children had a break after each list, and the next list was initiated only when the child was ready to move on with the task. There was one practice list containing 20 trials (four deviant sounds).

Figure 1. Graphical representation of the speech perception task. The interstimulus interval (ISI) was jittered and ranged from 900 to 1000 ms in steps of 50 ms. /k/ was the inconsistent deviant, and /t/ was the consistent deviant sound. Three, four, or five base sounds were presented consecutively before one of the two deviants appeared.
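The list constraints described above can be illustrated with a minimal Python sketch. It is not the authors' list-generation procedure, which the text does not specify: the function name `make_oddball_list`, the way run lengths are balanced to reach 48 base sounds, and the independent random draw of each ISI are all assumptions made for illustration.

```python
import random

BASE, CONSISTENT, INCONSISTENT = "f", "t", "k"
ISI_MS = (900, 950, 1000)  # jittered interstimulus intervals reported in the text


def make_oddball_list(n_inconsistent, rng=None):
    """Build one 60-trial list: 48 base sounds and 12 deviants.

    Hypothetical helper: each deviant follows a run of 3, 4, or 5 base
    sounds, so two deviants can never occur on consecutive trials.
    """
    if not 4 <= n_inconsistent <= 8:
        raise ValueError("the text reports 4 to 8 instances of each deviant per list")
    rng = rng or random.Random()
    n_deviants = 12

    # Start from runs of 4 (12 * 4 = 48 base sounds), then turn an equal
    # number of runs into 5s and 3s so the total stays at 48.
    runs = [4] * n_deviants
    n_pairs = rng.randint(0, n_deviants // 2)
    picked = rng.sample(range(n_deviants), 2 * n_pairs)
    for i in picked[:n_pairs]:
        runs[i] = 5
    for i in picked[n_pairs:]:
        runs[i] = 3

    deviants = [INCONSISTENT] * n_inconsistent + [CONSISTENT] * (n_deviants - n_inconsistent)
    rng.shuffle(deviants)

    trials = []
    for run_length, deviant in zip(runs, deviants):
        trials.extend([BASE] * run_length)
        trials.append(deviant)
    # Pair every sound with an ISI drawn at random (an assumption).
    return [(sound, rng.choice(ISI_MS)) for sound in trials]


example = make_oddball_list(n_inconsistent=4, rng=random.Random(1))
sounds = [sound for sound, _ in example]
print(len(sounds), sounds.count("f"), sounds.count("k"), sounds.count("t"))
# prints: 60 48 4 8
```

Building each list from "run of base sounds plus one deviant" units makes the no-consecutive-deviants constraint hold by construction, so no rejection sampling is needed.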
Speech sounds compared in the statistical analysis were of the same length (i.e., both syllables /kə/ and /tə/ were 250 ms long, whereas the fricative syllable /fə/ was 260 ms long). Six different tokens of each sound (i.e., six different recordings of the /k/, /t/, and /f/ produced by the same speaker) were used to add variability to the task and ensure that discrimination was based not solely on acoustic similarity but on phonological similarity as well. Sounds were recorded by a male Spanish (first language) speaker from the same region as the children tested in the study. Recordings were made in a sound-attenuated booth using a Sennheiser microphone, and all sounds were later normalized to the same intensity using Praat software (Boersma & Weenink, 2021). The entire task took approximately 20 min to complete.

Speech Production Task
In the speech production task, children were instructed to produce a specific sound every time they saw a star on the screen. A total of eight sounds were tested: three plosives (/p/, /t/, and /k/) and five vowels (/a/, /e/, /i/, /o/, and /u/). Production of Spanish vowels was tested as part of a different project, so the results are not reported here. Each sound was produced in a unique block. Children were told to produce the sounds as rapidly as possible upon seeing the star on the screen (not before).
The order of the sound production blocks was counterbalanced across participants. Each block consisted of 20 experimental trials with the same structure. At the beginning of each block, children heard the target sound 3 times in a row. These sounds were presented exclusively aurally to avoid creating any explicit relationship between the sounds and their grapheme representations. To make sure children heard the sound that had to be produced, they were asked to repeat it right after the three presentations. After ensuring that the child was ready, the experimenter initiated the start of the respective production block. The production part started with the star appearing on the screen, which also initiated the microphone. The star remained on the screen for 750 ms. To avoid automatization of production, which would lead children to anticipate the articulation, the ISI was jittered in steps of 50 ms between 500 and 700 ms. Each block started with five practice trials, which were systematically excluded from any further analysis. After each block, children were given a short break before moving on to the next one. The entire task lasted approximately 10 min.

The Alondra Test
To measure children's decoding skills, the Alondra test (Lallier et al., 2021) was employed. The Alondra test is a Spanish adaptation of the original French Alouette test (Lefavrais, 1967), which consists of reading a meaningless text composed of real words and pseudowords embedded in grammatically and syntactically correct sentences within a predefined time limit, in this case 3 min. The original Alouette test is used as a screening tool to diagnose developmental dyslexia in French-speaking children (Lefavrais, 1967, 2005) and adults (Cavalli et al., 2018). It taps into skills that are usually impaired in dyslexic readers, such as reading fluency, phonological decoding, and irregular word reading (Sprenger-Charolles et al., 2005). The Alondra test is structured like a real text, which invites more natural reading and thus provides an ecologically valid test of phonological decoding skills.
In the Alondra test, children were instructed to read aloud the text, presented on paper, as quickly and accurately as possible. The text contains 280 Spanish words and pseudowords organized into eight lines. Before the task, children were told that the text they were about to read would not make sense so they should not focus on understanding it but rather read it as quickly and correctly as possible. Each child decided when they were ready to start reading, and the experimenter then started the chronometer and the recording device. After 3 min, regardless of whether the child had finished reading the text or not, the experimenter indicated that the task had ended. The score for each child was calculated as the total number of words and pseudowords read correctly within 3 min. This score was manually coded by two independent listeners. If they did not agree as to whether a child had read a word or pseudoword correctly, that child's data were rechecked by one of the listeners, who then calculated the final score.

Results

The Alondra Test
The density plot in Figure 2 shows the distribution of scores from the Alondra test. The decoding scores ranged from 34 to 263, with a median value of 121.5 and a mean of 135.9 (SD = 53.8). Only two children finished reading the entire text within 3 min.

Speech Perception Task
The speech perception task tested how quickly children perceived and detected consistent /t/ and inconsistent /k/ within a repetitive auditory sequence. Reaction times (RTs) for key press responses were analyzed using linear mixed-effects models (Baayen et al., 2008). Accuracy, calculated as the number of trials on which children correctly responded to the deviant sound, was analyzed using generalized linear mixed-effects models with a binomial link. Both analyses were performed in the R statistical environment (Version 4.0.2; R Core Team, 2020) using the lme4 package (Version 1.1-23; Bates et al., 2015). p values for RT analysis were obtained through the lmerTest package (Version 3.1-2; Kuznetsova et al., 2017).
Across the two analyses, the fixed factor speech sound (/t/ vs. /k/) was sum-coded (/t/ as −0.5 and /k/ as 0.5), whereas the continuous factor phonological decoding skills (i.e., the score from the Alondra test) and the covariate sequence length (i.e., the number of base sounds between two deviant ones) were both centered and scaled. In both analyses, we aimed to include the maximal random structure justified by the design (Barr et al., 2013). To avoid convergence or singularity issues, however, we followed the parsimonious approach to build down the random effects structure. All the reported models, therefore, represent the highest converging nonsingular models, which included by-participant random intercepts and by-participant random slopes for the fixed factor speech sound.
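As an illustration of the predictor coding described above, the following minimal Python sketch applies the reported sum codes and centers and scales a continuous predictor. The data values and the helper name `center_and_scale` are made up for the example, and using the sample standard deviation (as R's `scale()` does by default) is an assumption about how scaling was carried out.

```python
from statistics import mean, stdev

SUM_CODES = {"t": -0.5, "k": 0.5}  # coding reported in the text


def center_and_scale(values):
    """z-score values using the sample SD (n - 1), as R's scale() does."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Made-up illustration data, not the study's data.
decoding_scores = [34, 90, 121, 150, 263]
sounds = ["t", "k", "k", "t", "k"]

decoding_z = center_and_scale(decoding_scores)
sound_coded = [SUM_CODES[s] for s in sounds]
print(sound_coded)
# prints: [-0.5, 0.5, 0.5, -0.5, 0.5]
```

In lme4 syntax, the model structure described in the text would correspond to something like `rt ~ sound * decoding + seq_len + (1 + sound | participant)`, where the variable names are hypothetical.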

RTs
We analyzed RTs on deviant trials to which children responded (84.5% of all deviant trials). We then identified and removed extreme values (i.e., RTs below 200 ms and above 1000 ms) through visual inspection of the RT distribution (1.97% of deviant trials with responses). All models were run on raw RT data as the lambda value was close to 1 after the Box-Cox transformation (Box & Cox, 1964).
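The trimming step can be sketched as follows, using the cutoffs stated above. The function name `trim_rts` and the example RTs are hypothetical; the sketch only shows the filtering logic and does not reproduce the authors' R code.

```python
def trim_rts(rts_ms, lower=200, upper=1000):
    """Keep RTs within [lower, upper] ms; return kept values and % removed."""
    kept = [rt for rt in rts_ms if lower <= rt <= upper]
    removed_pct = 100 * (len(rts_ms) - len(kept)) / len(rts_ms)
    return kept, removed_pct


# Made-up RTs in milliseconds, for illustration only.
rts = [150, 320, 455, 610, 980, 1200, 505, 430, 388, 720]
kept, pct = trim_rts(rts)
print(len(kept), pct)
# prints: 8 20.0
```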
The model with the best fit (see Table 2) for response times in the speech perception task showed a significant effect of speech sound (β = 8.00, SE = 3.55, t = 2.25, p = .024): RTs were shorter for the consistent than for the inconsistent sound. Neither the effect of phonological decoding skills (β = −10.3, SE = 9.29, t = −1.11, p = .269) nor the interaction between speech sound and phonological decoding skills (β = −.992, SE = 3.41, t = −0.291, p = .770) was significant. However, the covariate sequence length was significant (β = −10.41, SE = 1.73, t = −6.03, p < .001), indicating that the RTs decreased with the increase in the number of base sounds preceding one of the two deviants.
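Because speech sound was sum-coded (/t/ = −0.5, /k/ = 0.5) and decoding skills were centered, the speech sound coefficient can be read directly as the estimated difference between the two deviants. The short derivation below is a sketch under two assumptions: that the fixed effects enter the model additively with a single sound-by-decoding interaction, and that RTs are expressed in milliseconds. The symbols $s$ (speech sound), $d$ (scaled decoding score), and $\ell$ (scaled sequence length) are introduced here for illustration.

```latex
\begin{align*}
\widehat{RT}(s, d, \ell) &= \beta_0 + \beta_s\, s + \beta_d\, d + \beta_{sd}\, s\, d + \beta_\ell\, \ell,
\qquad s \in \{-0.5,\ 0.5\} \\
\widehat{RT}(0.5, d, \ell) - \widehat{RT}(-0.5, d, \ell) &= \beta_s + \beta_{sd}\, d \\
\text{at } d = 0 \text{ (mean decoding score):}\quad &= \beta_s = 8.00 \text{ ms}
\end{align*}
```

Under these assumptions, a child with an average decoding score is estimated to respond about 8 ms more slowly to /k/ than to /t/, and the nonsignificant interaction term ($\beta_{sd} = -0.992$) would change this difference by roughly 1 ms per standard deviation of decoding score.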
This analysis thus shows an overall consistency effect, but no detectable differences related to decoding skills. Response times for the two speech sounds as a function of decoding skills are presented in Figure 3.

Accuracy
The model with the best fit showed a significant effect of phonological decoding skills (β = 0.201, SE = 0.102, z = 1.98, p = .048), indicating that the probability of failing to respond to an infrequent sound (either consistent or inconsistent) decreased with an increase in the decoding skills. There was no effect of speech sound (β = −.051, SE = 0.079, z = −0.643, p = .519) and no interaction between speech sound and phonological decoding skills (β = −.068, SE = 0.072, z = −0.951, p = .342). However, as in the model looking into RTs, the covariate sequence length was significant, given that the probability of failing to respond to an infrequent sound was lower after longer sequences of repetitive base sounds (i.e., accuracy was higher after longer sequences of base sounds; β = 0.071, SE = 0.035, z = 2.02, p = .043). To sum up the perception data, although speech sound consistency affected response times regardless of the decoding skills, the latter were linked to higher accuracy (see Table 3 and Figure 4).

Figure 2. Distribution of decoding scores. Density plot showing the distribution of phonological decoding skills scores from the Alondra test. The y-axis shows the decoding score calculated as the number of words and pseudowords read correctly within 3 min. The dashed vertical line represents the mean score of 135.9 (SD = 53.8).
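As a reading aid, the logistic coefficients reported above can be converted into odds ratios. The sketch below assumes that the model predicts a correct response (coded as 1) and that both predictors were centered and scaled, so each odds ratio refers to a change of one standard deviation. The function name is hypothetical.

```python
import math

def odds_ratio(beta):
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(beta)

# Coefficients as reported for the accuracy model.
or_decoding = odds_ratio(0.201)   # phonological decoding skills
or_sequence = odds_ratio(0.071)   # sequence length

print(round(or_decoding, 3), round(or_sequence, 3))
```

Under these assumptions, the odds of responding to a deviant sound would be roughly 1.22 times higher per standard deviation of decoding score and roughly 1.07 times higher per standard deviation of sequence length.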

Speech Production Task
The speech production task measured the time needed to articulate and produce consistent and inconsistent speech sounds. Hence, the main dependent variable was SOT in milliseconds for the three voiceless plosives /p/, /t/, and /k/. SOTs were measured from the onset of the production cue (the star) until the moment of the plosive release. SOTs for all three sounds were calculated using Chronset software (Roux et al., 2017) and then manually checked using Praat software (Version 6.1.40; Boersma & Weenink, 2021). Due to technical problems during recording, data from one child had to be discarded from the analysis.
SOTs were analyzed using linear mixed-effects models, following the same procedure as in the speech perception task. Before the analyses, extreme values (SOTs below 150 ms or above 1200 ms), determined based on a visual inspection of the distribution, were removed (3.4% of all data). SOTs were log-transformed as indicated by the Box-Cox transformation (Box & Cox, 1964).
The full fixed-effects structure included two fixed factors, that is, phonological decoding skills and speech sound. The continuous factor phonological decoding skills was centered and scaled, whereas Helmert contrast coding was used for the three-level categorical factor speech sound to create two contrasts of interest. The first contrast (hereafter, SOT_pt) compared the difference between the mean SOT of the first two levels (i.e., /p/ vs. /t/). The second contrast (hereafter, SOT_ptk) compared the difference between the mean of the first two levels (/p/ and /t/, both consistent speech sounds) to the third level (/k/, an inconsistent speech sound). This contrast coding scheme was chosen as it provided the maximal power to test for a difference between consistent and inconsistent speech sounds (see Schad et al., 2020). The random effect structure included by-participant and by-item random intercepts as well as by-participant random slopes for the two contrasts (i.e., SOT_pt and SOT_ptk; see Table 4). The model with the best fit looking at SOTs in the speech production task showed no significant effects for either the SOT_pt (β = 0.021, SE = 0.02, t = −0.05, p = .963) or the SOT_ptk contrast (β = 0.021, SE = 0.01, t = 0.24, p = .821). Moreover, the effect of phonological decoding skills was not significant (β = −.042, SE = 0.024, t = −1.72, p = .086). Importantly, however, the model yielded a significant interaction between phonological decoding skills and the SOT_ptk contrast (β = 0.021, SE = 0.008, t = 2.47, p = .014), showing that the difference in SOTs between the two consistent sounds and the inconsistent sound varied as a function of phonological decoding skills (see Figure 5). No significant interaction between phonological decoding skills and the SOT_pt contrast was detected (β = 0.01, SE = 0.002, t = 0.48, p = .63), indicating that the difference between the two consistent sounds did not differ as a function of phonological decoding skills.

Figure 3. Reaction times (RTs; in ms) for the two speech sounds as a function of decoding skills. RTs (y-axis) to two deviant speech sounds (/t/ and /k/) as a function of the phonological decoding skills (x-axis). The shaded parts surrounding the lines represent the confidence intervals, and the dots represent the mean RTs of individual participants.

Note. Significant p values are displayed in bold. Corr. = correlations between the varying intercepts and slopes for participants.

Figure 4. Proportion of accuracy for the two speech sounds as a function of decoding skills score. Proportion of accuracy (y-axis) to two deviant speech sounds (/t/ and /k/) as a function of phonological decoding skills (x-axis). The lines represent logistic curves, and the dots represent mean accuracy of individual participants.
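The two Helmert-style comparisons for the three-level factor speech sound can be sketched as weights applied to the condition means. The exact scaling of the weights used in the analysis is not reported here, so the values below are an assumption, as are the names and the example means.

```python
# Hypothesis weights over condition means (levels: /p/, /t/, /k/).
# The scaling is an assumption chosen so that each estimate is a difference
# of means; the article does not report the exact weights.
CONTRASTS = {
    "SOT_pt": {"p": -1.0, "t": 1.0, "k": 0.0},     # /t/ minus /p/
    "SOT_ptk": {"p": -0.5, "t": -0.5, "k": 1.0},   # /k/ minus mean of /p/, /t/
}

def contrast_estimate(means, weights):
    """Weighted sum of condition means for one contrast."""
    return sum(weights[level] * means[level] for level in means)

def dot(w1, w2):
    """Inner product of two weight sets (0 means orthogonal contrasts)."""
    return sum(w1[level] * w2[level] for level in w1)

# Hypothetical mean log-SOTs, for illustration only (not study data).
means = {"p": 6.10, "t": 6.12, "k": 6.20}
est_pt = contrast_estimate(means, CONTRASTS["SOT_pt"])
est_ptk = contrast_estimate(means, CONTRASTS["SOT_ptk"])
```

Because the two weight sets are orthogonal and each sums to zero, the consistent versus inconsistent comparison (SOT_ptk) is estimated independently of the difference between the two consistent sounds (SOT_pt).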

Discussion
This study investigated the relationship between orthographic knowledge, in particular phonological decoding skills, and speech sound processing during the early stages of reading acquisition. A total of 60 Spanish-speaking second graders, with different levels of decoding skills, were tested on both perception and production of speech sounds that vary in terms of the number of graphemes they map onto: Consistent sounds /p/ and /t/ have a unique grapheme representation in Spanish (hpi and hti, respectively), whereas the inconsistent sound /k/ has three possible grapheme representations in Spanish (hci, hqui, and hki). In the perception task, children had to detect both consistent (/t/) and inconsistent (/k/) deviant sounds, which appeared infrequently within a repetitive auditory stream. In the production task, they produced consistent (/p/ and /t/) or inconsistent (/k/) sounds as soon as they saw a star appear on the computer screen. Overall, our results show differences in individual speech sound processing related to both orthographic consistency and decoding skills. In both tasks, faster processing (i.e., a facilitation effect) was observed for consistent as compared to inconsistent sounds. In the perception task, orthographic effects seemed to be present for all children (i.e., along the entire spectrum of decoding skills), whereas in the production task, they were modulated by children's decoding skills: The facilitation for consistent sounds in production was positively associated with children's decoding skills, that is, the strength of their phoneme-to-grapheme links. These differences in speech sound processing related to orthographic (in)consistency were observed in tasks that make no explicit reference to orthographic representations. They therefore raise the possibility that orthographic codes associated with speech sounds during literacy acquisition affect already formed speech sound representations.
Previous research has shown that spelling-to-sound consistency affects auditory word recognition by facilitating the processing of words with consistent as compared to words with inconsistent rhymes (Pattamadilok et al., 2009; Seidenberg & Tanenhaus, 1979; Ventura et al., 2004; Ziegler & Ferrand, 1998). These OCEs were thought to stem from sublexical units (e.g., rhymes) but have been observed only in tasks involving lexical access and processing. Here, we present evidence that OCEs are also present at lower sublexical levels of speech processing. Namely, our results reveal OCEs in the perception of individual speech sounds and in tasks that do not require lexical access or processing. All children, irrespective of their decoding skills, were faster at detecting the deviant sound /t/, which has only one grapheme representation in Spanish. The absence of an interaction between orthographic consistency and decoding skill was somewhat surprising, as we expected that children with better decoding skills, who have also built stronger links between sounds and letters, would be more affected by orthographic consistency. We offer two possible explanations for this unexpected null result. First, the absence of an interaction between orthographic consistency and decoding skill may be due to a lack of power in our design. To increase the chances of children completing the repetitive oddball task successfully, we only tested two sounds and kept the number of trials to a minimum. Second, it could be that despite the differences in decoding skills in the group of children we tested, these differences were not large enough to capture small perception effects. Therefore, more research with different sound pairs and with children at even earlier stages of reading acquisition, as well as children at risk of developing reading disorders, is needed to test the robustness and the generalizability of the reported differences. Interestingly, the accuracy performance in our behavioral version of the oddball task differed as a function of children's decoding skills, as poorer decoding skills were associated with lower accuracy. That is, the probability of detecting and responding to an infrequent sound decreased as decoding skills decreased. This pattern of results is in line with recent findings showing that both adults and children with reading-related issues (i.e., developmental dyslexia; Pagliarini et al., 2020) have difficulty anticipating timed events in the auditory domain.

Figure 5. Speech onset time (SOT; in ms) for the three speech sounds as a function of phonological decoding skills. SOTs (y-axis) for three speech sounds (/p/ in light gray, /t/ in gray, and /k/ in black) as a function of the (raw) phonological decoding skills score (x-axis). The shaded areas surrounding the lines indicate confidence intervals, and the dots represent mean SOTs per participant.
Another important contribution of this study is the link we observed between speech sound production and phonological decoding, a critical basis for reading skills. This is in line with recent findings by Popescu and Noiray (2021), who report less intersegmental co-articulation in pseudoword production in a group of more proficient German readers. Here, we expand on this finding by showing a relationship between better phonological decoding skills, suggesting also better reading skills, and faster preparation and initiation of speech sound production. This finding thus implies a higher-level (not just articulatory) locus of the orthographic effect. Importantly, better phonological decoding skills were linked to faster production of consistent sounds (i.e., /p/ and /t/) but not of the inconsistent sound (i.e., /k/). This suggests that learning PGC rules is indeed related to the observed differences in speech sound production. Moreover, the fact that phonological decoding skills were linked to faster production of the consistent but not the inconsistent sound eliminates the possibility that the observed pattern of results was due to physical properties of the speech production system (e.g., because labials such as /p/ can be produced faster). The previous finding that Spanish-speaking children acquire the three sounds /p/, /t/, and /k/ around the same time (McLeod & Crowe, 2018) further supports the conclusion that the differences we observed between inconsistent /k/ and consistent /p/ and /t/ in production are indeed not driven by physical changes and differences in the production system.
Our findings thus suggest that early indicators of future reading ability, such as phonological decoding skills, are related to differences in speech production at the early stages of reading acquisition. Indeed, there is evidence that better phonological awareness skills are associated with better articulatory skills (Noiray, Popescu, et al., 2019). Nevertheless, more research is needed to better understand the relationship between reading and reading-related skills such as phonological awareness and the parallel development of speech production skills. Importantly, further research should focus on determining whether these two skills develop in interaction or whether one largely relies on the development of the other.
Moreover, this study and the two studies that also observed an association between reading-related and articulatory skills (Noiray, Popescu, et al., 2019;Popescu & Noiray, 2021) were conducted in orthographically transparent languages. Future research should examine whether the same effects are present in orthographically opaque languages. It may be the case that the relationship between reading and speech processing is strongest in languages with more transparent, one-to-one mappings between sounds and graphemes. In languages with opaque orthographic systems such as English and French, children may have to rely on larger grain sizes, such as rhymes or even whole words, when learning to read (Ziegler & Goswami, 2005). If so, they may be less affected by individual phoneme-to-grapheme consistencies and might not even show any effects at the level of speech sound processing observed in this study. Alternatively, related differences in speech processing could follow different developmental trajectories in languages with different orthographic systems. Indeed, cross-linguistic developmental differences have already been reported for the OCE at the word level (see Ventura et al., 2008). Future studies could hence investigate the developmental trajectory of the effects reported here in children learning to read and write in an opaque orthography such as French or English.
A better understanding of how learning phoneme-to-grapheme correspondences affects speech sound perception and production at different stages of reading development would require additional research with children at later stages of reading acquisition. As reading becomes more automatized with experience, the processing of sublexical units, especially single phonemes, decreases. Thus, it is possible that the speech sound processing differences reported in this study are temporary and would not be found in older children with more reading experience or in skilled adult readers.
Finally, findings from this study may have implications for early detection of problems with phonological decoding and consequently future reading deficits. As discussed above, in both the perception and production task used in this study, there were differences in speech sound processing related to phonological decoding skills. Better phonological decoding skills were related to higher accuracy in the detection of infrequent sounds. Furthermore, they were also related to shorter production latencies, but only for sounds with consistent grapheme correspondences. These data thus indicate that children's phonological decoding skills are related to how orthographic consistencies are processed, at least during the early stages of reading acquisition. Crucially, the present findings highlight the importance of speech perception and production interventions before the start of, but also during, the process of reading acquisition. Based on the observed link between speech sound processing and decoding skills, different interventions and training sessions could be planned with the aim of boosting children's speech sound perception and production skills before they even start learning orthographic codes. If reading skills develop in interaction with speech sound processing, the improvement in one skill may entail a similar improvement in the other one. Moreover, it would be informative to administer these same tasks to children at risk of dyslexia or with reading difficulties. Doing so would provide insight into how these populations process individual speech sounds, how they build links between sounds and graphemes, and hence how these links influence speech processing. A better understanding of these linking processes could shed light on the underlying mechanisms responsible for PGCs. This, in turn, could lead to a better understanding of the issues children at risk of dyslexia or with reading difficulties encounter during reading acquisition, as well as the influence such issues have on speech processing.

Conclusions
Previous research has shown differences related to literacy acquisition at various levels of speech processing. Orthographic knowledge has been linked not only with better phonological awareness skills but also with improved speech perception, in particular, the improved perception of speech sound boundaries. Furthermore, better literacy skills are related to better articulatory skills, at least during the early stages of reading acquisition. Here, we report a link between processing orthographic consistencies and both perception and production of individual speech sounds in beginning readers of a transparent language. We additionally demonstrate that differences in speech sound production related to the acquisition of literacy are modulated by an individual's phonological decoding skills. This finding raises the possibility that already formed representations of speech sounds may be malleable and can thus be influenced by the orthographic codes they are associated with during reading acquisition. Overall, the present findings have important implications for both language development and reading acquisition research.

Data Availability Statement
The data sets generated and/or analyzed during this study are available in the OSF repository, https://osf.io/k3nvx/?view_only=b767cdbeae16408ebeeda0324e9d727c.