

JSLHR

Tutorial

Assessment of Childhood Apraxia of Speech: A Review/Tutorial of Objective Measurement Techniques
Hayo Terband (a), Aravind Namasivayam (b), Edwin Maas (c), Frits van Brenk (d), Marja-Liisa Mailend (e), Sanne Diepeveen (f, g), Pascal van Lieshout (b), and Ben Maassen (h)

Background: With respect to the clinical criteria for diagnosing childhood apraxia of speech (commonly defined as a disorder of speech motor planning and/or programming), research has made important progress in recent years. Three segmental and suprasegmental speech characteristics (error inconsistency, lengthened and disrupted coarticulation, and inappropriate prosody) have gained wide acceptance in the literature for purposes of participant selection. However, little research has sought to empirically test the diagnostic validity of these features. One major obstacle to such empirical study is the fact that none of these features is stated in operationalized terms.

Purpose: This tutorial provides a structured overview of perceptual, acoustic, and articulatory measurement procedures that have been used or could be used to operationalize and assess these 3 core characteristics. Methodological details are reviewed for each procedure, along with a short overview of research results reported in the literature.

Conclusion: The 3 types of measurement procedures should be seen as complementary. Some characteristics are better suited to be described at the perceptual level (especially phonemic errors and prosody), others at the acoustic level (especially phonetic distortions, coarticulation, and prosody), and still others at the kinematic level (especially coarticulation, stability, and gestural coordination). The type of data collected determines, to a large extent, the interpretation that can be given regarding the underlying deficit. Comprehensive studies are needed that include more than 1 diagnostic feature and more than 1 type of measurement procedure.

(a) Utrecht Institute of Linguistics-OTS, Utrecht University, the Netherlands
(b) Oral Dynamics Laboratory, Department of Speech-Language Pathology, University of Toronto, Ontario, Canada
(c) Department of Communication Sciences and Disorders, Temple University, Philadelphia, PA
(d) Department of Communicative Disorders and Sciences, University at Buffalo, NY
(e) Moss Rehabilitation Research Institute, Moss Rehabilitation Hospital, Elkins Park, PA
(f) HAN University of Applied Sciences, Nijmegen, the Netherlands
(g) Department of Rehabilitation, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, the Netherlands
(h) Center for Language and Cognition, Research School of Behavioral and Cognitive Neurosciences, University of Groningen, The Netherlands

Correspondence to Hayo Terband: [email protected]
Editor-in-Chief: Julie Liss
Received May 11, 2019
Accepted May 18, 2019
https://doi.org/10.1044/2019_JSLHR-S-CSMC7-19-0214
Publisher Note: This article is part of the Special Issue: Select Papers From the 7th International Conference on Speech Motor Control.
Disclosure: The authors have declared that no competing interests existed at the time of publication.

From a historical perspective, childhood apraxia of speech (CAS) is a controversial clinical entity, with respect to both clinical signs and underlying deficit. In 1981, Guyette and Diedrich had concluded that “…No pathognomonic symptoms or necessary and sufficient conditions were found for the diagnosis…” (p. 44) and critically termed CAS “a label in search of a population” (p. 39). Despite clinical studies to further characterize CAS (e.g., Aram & Horwitz, 1983; Ekelman & Aram, 1984; Marion, Sussman, & Marquardt, 1993; Pollock & Hall, 1991; B. Smith, Marquardt, Cannito, & Davis, 1994; Walton & Pollock, 1993), this situation had not changed much by 1994, when Shriberg (1994) concluded that development in this field was moving endlessly sideways.

Since then, a large body of research has been dedicated to characterizing the speech impairment and underlying functional and neuromotor deficit of CAS, and this endeavor has been successful in some respects. There is an agreement that, from a functional point of view, CAS is a disorder of motor planning and/or motor programming (American Speech-Language-Hearing Association [ASHA], 2007) or,

Journal of Speech, Language, and Hearing Research • Vol. 62 • 2999–3032 • August 2019 • Copyright © 2019 American Speech-Language-Hearing Association 2999
Downloaded from: https://pubs.asha.org Andrea Martucci on 09/27/2019, Terms of Use: https://pubs.asha.org/pubs/rights_and_permissions
in other words, an inability to transform an abstract phonological code into motor speech commands (cf. Maassen, Nijland, & Terband, 2010). More specifically, ASHA defined CAS as “a neurological childhood (pediatric) speech sound disorder in which the precision and consistency of movements underlying speech are impaired in the absence of neuromuscular deficits (e.g., abnormal reflexes, abnormal tone)…. The core impairment in planning and/or programming spatiotemporal parameters of movement sequences results in errors in speech sound production and prosody.” (ASHA, 2007, pp. 3–4). Since then, this definition has been adopted widely in the CAS research literature (e.g., Grigos & Kolenda, 2010; Iuzzini-Seigel, Hogan, Guarino, & Green, 2015; Maas & Farinella, 2012; Murray, McCabe, Heard, & Ballard, 2015; Namasivayam et al., 2015; Preston et al., 2014; Terband, Maassen, Guenther, & Brumberg, 2009, 2014).

With respect to the clinical criteria for diagnosing CAS, research has also made important progress in recent years. Although ASHA (2007, p. 4) noted that “there is no validated list of diagnostic features of CAS that differentiates this symptom complex from other types of childhood speech sound disorders,” the CAS Technical Report proposed three segmental and suprasegmental speech characteristics that were considered to be consistent with a deficit in speech motor planning and programming and thus as being specific to CAS:

1. inconsistent errors on consonants and vowels in repeated productions of syllables or words;
2. lengthened and disrupted coarticulatory transitions between sounds and syllables; and
3. inappropriate prosody, especially in the realization of lexical or phrasal stress.

These features have gained wide acceptance in the subsequent literature for purposes of participant selection, but little research has sought to empirically test the diagnostic validity of these features. One major obstacle to such empirical study is the fact that none of these proposed features was stated in operationalized terms. This lack of operationalization also hinders comparability of participants across studies, because researchers often either do not provide operationalized criteria for the CAS diagnoses of their participants or use different criteria. The purpose of this tutorial is to provide a structured overview of measurement procedures that have been used or could be used to operationalize and assess these three core characteristics. The hope is that this will facilitate a more replicable evidence base and, eventually, a consensus on how best to capture these features for future research and clinical application.

To be clear, we do not address whether a “feature checklist” is ultimately the optimal approach to diagnosis (e.g., see Shriberg et al., 2017, for a discussion of problems with this approach), nor do we suggest that these specific features are the most important or discriminative ones (see Murray, Iuzzini-Seigel, Maas, Terband, & Ballard, 2018, for a systematic review of the differential diagnostic value of these features). Alternative approaches such as developing psycholinguistic profiles derived from process-oriented diagnostics have been proposed elsewhere (e.g., Terband, Maassen, & Maas, 2016, 2019). The goal of the current article is to provide a structured overview of measurement procedures that have been used or may be used to assess the three core characteristics of CAS as formulated in the ASHA Technical Report (ASHA, 2007), without going into the issue of differential diagnosis itself.

This review is organized by each feature characterizing CAS and within each feature by level of analysis (perceptual/transcription, acoustic, articulatory analysis). We review methodological details for each procedure and provide a short overview of research results that have been reported in the literature. In terms of methodological details, for each approach, we identify four critical parameters that must be specified for operationalization and determining cutoff scores for diagnosis: (a) the response target to be produced by the child (sounds, words, nonwords, etc.), (b) the task used to elicit these responses (e.g., imitation, picture naming), (c) the conditions under which the responses are elicited (e.g., quiet, with time pressure), and (d) the measures obtained from these responses (e.g., error consistency scores, formant ratios). For each method, we further summarize the scientific basis, specifically, (e) whether administration is standardized, (f) whether validity and reliability data are available, and (g) whether norm or reference data for children are available (we make a distinction between norm data, i.e., norm-referenced cutoff scores, and reference data, i.e., numbers reported by other studies that may serve as reference values). Finally, we discuss issues that need to be taken into consideration when choosing a suitable technique and identify research needs in terms of the development of (more objective) measures as well as their validation and standardization.

Inconsistent Errors on Consonants and Vowels in Repeated Productions of Syllables or Words

Background

Inconsistency of Speech
Disordered or atypical “inconsistency” is variability in speech production in the absence of contextual variations (e.g., phonetic context, pragmatic influences, maturation or cognitive–linguistic influences), such as during repeated productions of the same exemplar across multiple trials (Dodd, Hua, Crosbie, Holm, & Ozanne, 2009; Marquardt, Jacks, & Davis, 2004). The measurement of inconsistent speech production includes not just quantity of different productions and control of context but also the quality of those alterations. Qualitative differences, such as the number and type of (multiple) substitutes for phonemes within and across all positions, assist in the differentiation of atypical/disordered “inconsistency” from “normal” variability as found in typically developing (TD) children (Iuzzini-Seigel, 2012; Iuzzini-Seigel & Forrest, 2010). In the



next sections, we will discuss measures that allow us to distinguish variability that is a part of normal learning and development from atypical inconsistency seen in children with speech disorder (e.g., CAS).

Speech Variability During Typical Development
In TD children, some degree of variability in word production is expected, but highly inconsistent speech production is considered a sign of pathology or disorder (Holm, Crosbie, & Dodd, 2007). In repeated productions of the same word in a picture-naming task (with 25 items), Holm et al. (2007) found approximately 10%–13% variability at the whole-word level in TD children ages 3;0–6;11 (years;months). Studies of typical speech development have documented decreasing variability during the repeated productions of words or speech sounds with increasing age (Iuzzini-Seigel, 2012; Preston & Koenig, 2011). For example, Burt, Holm, and Dodd (1999) and Holm et al. (2007) found a negative correlation between age and word variability in children with typical speech development between 3;10 and 4;10 and between 3;0 and 6;11, respectively. Within this general trend of decreasing word variability in TD children, variability peaks have been observed during certain phases, such as during language and vocabulary expansion (Iuzzini-Seigel, Hogan, Rong, & Green, 2015; Sosa & Stoel-Gammon, 2006). Specifically, Sosa and Stoel-Gammon (2006) observed an increase in whole-word variability in children between 1 and 2 years of age when two-word combinations were emerging and when vocabulary size was approximately 150–200 words. Vocabulary expansion between 15 and 21 months has also been associated with a temporary regression in speech motor performance (Iuzzini-Seigel, Hogan, Rong, et al., 2015). These nonmonotonic changes in error variability during typical development have been attributed to resource allocation issues and dynamic interactions between language and speech systems (Green, Nip, & Maassen, 2010; Iuzzini-Seigel, Hogan, Rong, et al., 2015; Macrae, Tyler, & Lewis, 2014). Overall, children’s speech production is more variable, less flexible, and less accurate than adult speech until the early teens (A. Smith & Zelaznik, 2004).

Error Inconsistency in CAS
In general, studies provide evidence for increased variability in speech production of children with CAS relative to TD children or those with other speech impairments (e.g., Dodd, Hua, Crosbie, Holm, & Ozanne, 2002; Iuzzini-Seigel, Hogan, & Green, 2017; Schumacher, McNeil, Vetter, & Yoder, 1986). For example, Schumacher et al. (1986) found that whole-word phonetic variability elicited from repetitions of words distinguished children (5–9 years of age) with CAS from TD children or those with functional articulation disorders. However, results from word-level inconsistency measures (e.g., Token-to-Token Inconsistency; Dodd et al., 2002) should be interpreted cautiously. Children with inconsistent phonological disorder and children with severe speech sound disorder (SSD), in general, may demonstrate high scores on word-level inconsistency assessments, possibly implying that word-level inconsistency may relate to the severity of the problem and not just disorder classification (Bradford & Dodd, 1996; Iuzzini-Seigel, 2012; Tyler, Williams, & Lewis, 2006). In fact, a recent study demonstrated that inconsistency scores alone (from the Diagnostic Evaluation of Articulation and Phonology [DEAP] Inconsistency subtest; Dodd et al., 2002) were only able to discriminate CAS from other SSDs with a modest accuracy of 30% (Murray et al., 2015) and thus may not be sufficient for differential diagnosis (Bradford & Dodd, 1996).

Segmental-level inconsistency measures (e.g., type–token ratio [TTR]; Forrest & Iuzzini-Seigel, 2008; Iuzzini-Seigel & Forrest, 2010) have proven to be more sensitive than word-level procedures for differential diagnosis of CAS from other SSD populations. In particular, segmental-level TTR measures, the consonant substitute inconsistency percentage (CSIP; Forrest & Iuzzini-Seigel, 2008; Iuzzini-Seigel & Forrest, 2010) and its variant, the inconsistency severity percentage (ISP; Iuzzini-Seigel & Forrest, 2010), demonstrate high scores for children with CAS but not TD children or children with articulation or phonological delays (Forrest & Iuzzini-Seigel, 2008; Iuzzini-Seigel, 2012; Yao-Tresguerres, Iuzzini-Seigel, & Forrest, 2009). For example, CSIP scores below 21% were found for children with phonological or articulatory disorders, while children with CAS had CSIP scores of greater than 24% (Forrest & Iuzzini-Seigel, 2008). Similarly, ISP scores differentiated TD children from speakers with speech disorder, with > 18% ISP scores indicating possible CAS diagnosis (TD group had ISP scores of < 7.5%). Overall, Iuzzini-Seigel (2012) suggests that between segmental (e.g., ISP) and lexical (Word Inconsistency Measure; DEAP subtest) inconsistency measures, the segmental-level analysis may be relatively more sensitive for differential diagnosis between TD, phonological disorder (PD), and CAS and to track intervention-related changes over time.

At the level of acoustic inconsistency, measures such as the acoustic spatiotemporal variability indices (e.g., envelope-based spatiotemporal index [E-STI]; Howell, Anderson, Bartrip, & Bailey, 2009) or voice onset time (VOT) variability (Iuzzini-Seigel, 2012) have clinical potential for differential diagnosis and treatment progress monitoring in CAS, but they have rarely been applied in this population. Generally, children’s VOTs are more variable than adults’ VOTs, and variability decreases with age and stabilizes around the age of 11 years (Auzou et al., 2000; Whiteside, Dobbin, & Henry, 2003). Iuzzini-Seigel (2012) investigated inconsistency of speech in 3- to 5-year-old children with CAS, PD, and TD using acoustic (VOT variability), segmental, and lexical measures. Children with CAS evidenced less stability at both the acoustic level (significantly higher coefficients of variation [COVs] of VOTs for bilabial voiceless stops) and at the segmental and lexical levels relative to speakers with PD and TD speakers. Furthermore, Iuzzini-Seigel also analyzed VOT measures (e.g., COV and skewness) as a function of group, differentiated by segmental (e.g., CSIP, ISP) or lexical inconsistency (e.g., Word Inconsistency Assessment; Dodd et al., 2009) measures. Only in

Terband et al.: Methodology in the Assessment of CAS 3001


groups classified by the segmental-level inconsistency measures (and not groups differentiated by lexical-level inconsistency measures) did speakers with CAS demonstrate a more positive skewness, that is, a higher COV for VOTs relative to speakers with PD. In a more recent study, Iuzzini-Seigel, Hogan, Guarino, et al. (2015) demonstrated that, under conditions of attenuated auditory feedback (auditory masking), children with CAS produced a lower percentage of optimal exemplars of voiceless bilabial stops and reduced vowel space area relative to TD children or children with speech delays. They interpreted these findings as indicative of poor feedforward motor programs and compensatory reliance on auditory feedback in CAS (Terband & Maassen, 2010).

At the level of kinematic inconsistency (e.g., kinematic STI; Kleinow & Smith, 2000), studies have indicated that speech articulation is more variable in preschool- and school-age children with CAS, relative to children with other SSDs or TD peers (Grigos, Moss, & Lu, 2015; Moss & Grigos, 2012; Terband, Maassen, van Lieshout, & Nijland, 2011). For example, Grigos et al. (2015) demonstrated greater jaw variability (higher STI) as a function of word length (mono-, bi-, and trisyllabic: “pop,” “puppet,” and “puppypop,” respectively), while Terband et al. (2011) demonstrated greater variability of tongue tip movements in 6- to 9-year-old children with CAS (relative to TD peers). Furthermore, jaw deviances or instabilities (lateral movement range and variability) were found in the coronal plane, but not in the midsagittal plane for children with SSD or CAS relative to TD peers (Terband, van Zaalen, & Maassen, 2012). The findings of kinematic instability are in line with clinical observations (e.g., lateral jaw slide) in children with SSD and CAS (Namasivayam et al., 2013; Terband et al., 2012) and may be of diagnostic and therapeutic importance. In the following sections, we review perceptual, acoustic, and articulatory measures used to evaluate speech inconsistency in children with CAS.

Perceptual Measures

Background
To capture various types of error consistencies at the word and segmental level, several different formulas are reported in the literature (for details, please refer to Betz & Stoel-Gammon, 2005; Marquardt et al., 2004). For example, (in)consistency measured as a percentage of the total productions of a target word has been used by Dodd and colleagues (Dodd, 1995; Dodd et al., 2002) and Shriberg and colleagues (Shriberg, Aram, & Kwiatkowski, 1997a). This provides an index of “production consistency,” whereas the use of total error productions as the denominator is said to reflect “error consistency” (Betz & Stoel-Gammon, 2005; Iuzzini-Seigel, 2012). The numerator in such error consistency measures may also differ to capture (a) the proportion of errors, (b) consistency of error types, and (c) consistency of the most frequently used error type (Betz & Stoel-Gammon, 2005; Iuzzini-Seigel & Forrest, 2010; Shriberg et al., 1997a). The overall proportion of error productions only provides a general impression of a child’s production accuracy and is not recommended as the only measure of consistency (Betz & Stoel-Gammon, 2005). In addition, the number of errors (e.g., number and variety of substitutions) and the most frequently used error type indicate the degree of variability in errors produced (in line with clinical impression of “inconsistent errors”; Betz & Stoel-Gammon, 2005).

Total Token Variability and Error Token Variability
Several procedures have been reported for assessing word-level inconsistency/variability, albeit with differing formulas and descriptions (Dodd, 1995; Ingram, 2002; Schumacher et al., 1986; Shriberg et al., 1997a; see Table 1). In a longitudinal study, Marquardt et al. (2004) assessed the accuracy, stability, total token variability (TTV), and error token variability (ETV) of whole-word productions in children with CAS (4;6–7;7) undergoing phonological treatment (for formula, see Table 1). Their study revealed that measures of stability and accuracy increased over time while variability (TTV) decreased. However, individual data showed clear session-to-session variability in patterns at the three time points for these children with CAS, with ETV emerging as the least consistent of the variables tested. The variability results obtained for children with CAS across time paralleled the results of single-word articulation testing and relational analysis of consonants and vowels in connected speech. For example, the child with higher levels of TTV and ETV and lower levels of accuracy and stability also had the lowest scores on relational analysis and articulation testing, possibly implying a relationship between severity of speech disorder and underlying speech motor variability (also see the ECI section).

With respect to validity, transcription-based word-level token-to-token consistency measures (e.g., TTV) were found to be moderately correlated with segmental-level (in)consistency assessments (e.g., Error Consistency Index [ECI]) but demonstrated low correlations with acoustic measures of phonetic variability (vowel formants, VOT, and coefficient of variation of word duration; Preston & Koenig, 2011). A comparison of interrater reliability suggests that broad phonetic transcriptions from spontaneous speech are more reliable than those of responses obtained from rapid picture-naming tasks (Marquardt et al., 2004; Preston & Koenig, 2011; see Table 1).

Token-to-Token Inconsistency Assessment: DEAP Inconsistency Subtest
Dodd and colleagues (Dodd et al., 2002; McIntosh & Dodd, 2008), as part of the DEAP Test, developed and standardized a 25-word picture-naming subtest to elicit word-level token-to-token inconsistency (see Table 2). In Token-to-Token Inconsistency assessment, a speaker is instructed to repeat the same utterance multiple times (three times) across a similar context, while their consistency of productions is scored as “same” (nonvariable) or “different” (variable). A production is considered variable if any of the productions differ in the three trials (Dodd et al., 2002). Dodd’s word-level



Table 1. Methodological details: total token variability and error token variability (Marquardt et al., 2004; Preston & Koenig, 2011).

Materials and methods
(1) Stimuli or targets being analyzed:
    Six multisyllabic words (elephant, umbrella, strawberries, helicopter, thermometer, and spaghetti; Preston & Koenig, 2011)
(2) Tasks used to elicit those targets:
    Picture naming (Preston & Koenig, 2011)
    Spontaneously elicited connected speech samples using age-appropriate materials (Marquardt et al., 2004)
(3) Conditions in which responses are elicited:
    Quiet, with time pressure (rapid picture naming; Preston & Koenig, 2011)
    Quiet, no time pressure (Marquardt et al., 2004)
(4) The measures obtained from those responses:
    Total token variability: (number of variants − 1) / (number of tokens − 1) (Marquardt et al., 2004)
    Error token variability: (number of incorrect variants − 1) / (number of incorrect tokens − 1) (Marquardt et al., 2004)

Scientific basis
(5) Standardized measurement protocol?
    No
(6) Validity and reliability of outcome measures?
    Validity: No
    Reliability: Broad transcription reliability from spontaneous speech (10% of samples) = 86.22% (range: 75%–96.26%; Marquardt et al., 2004)
    Interrater reliability of total token variability scores based on phonetic transcription of rapid naming task with r = .55 (Preston & Koenig, 2011)
(7) Norm or reference data available?
    No
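
The two formulas in row (4) of Table 1 can be illustrated with a minimal Python sketch. It assumes that each production is available as a transcription string and that two productions count as the same variant only when their transcriptions match exactly; the function names, the example transcriptions, and the choice to return None when fewer than two tokens are available are illustrative assumptions, not part of the published procedure. How per-word scores are aggregated across words is not specified here and is left out.

```python
def token_variability(tokens):
    """(number of variants - 1) / (number of tokens - 1).

    Returns None when fewer than two tokens are available, because the
    ratio is undefined in that case (an assumption of this sketch).
    """
    if len(tokens) < 2:
        return None
    return (len(set(tokens)) - 1) / (len(tokens) - 1)


def total_token_variability(productions):
    """Total token variability (TTV) over all productions of one target."""
    return token_variability(productions)


def error_token_variability(productions, target):
    """Error token variability (ETV): same ratio, incorrect productions only."""
    incorrect = [p for p in productions if p != target]
    return token_variability(incorrect)


# Hypothetical transcriptions of four attempts at one target word.
target = "elefant"
productions = ["elefant", "efalint", "efalint", "elefin"]

ttv = total_token_variability(productions)          # 3 variants, 4 tokens
etv = error_token_variability(productions, target)  # 2 variants, 3 incorrect tokens
print(f"TTV = {ttv:.2f}, ETV = {etv:.2f}")
```

In this example, three distinct forms across four tokens give TTV = 2/3, and two distinct error forms across three incorrect tokens give ETV = 1/2.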

Token-to-Token Inconsistency assessment is a nominal measurement, and children with phonological disorders are classified as inconsistent or consistent, depending on whether or not they produced the same words consistently across three repetitions (> 40% = inconsistent). If inconsistency scores are greater than 40% (but see Iuzzini-Seigel, 2012, for higher cutoff > 50%), along with the presence of other features, such as poor oromotor performance, poorer productions during imitation than spontaneous speech, consonant and vowel distortions, and atypical prosody, then a CAS diagnosis may be suspected (Dodd et al., 2002; see Table 2).

ECI
With respect to inconsistency measures at the segmental level, the ECI has been applied in a number of studies (Preston & Koenig, 2011; Tyler & Lewis, 2005; Tyler, Lewis, & Welch, 2003; see Table 3). The ECI is a raw score calculated as the sum of the total number of different error forms across all consonants and all word positions. A higher ECI score indicates a greater number of different error forms across a larger number of consonants, and a lower ECI score indicates fewer different error forms across a smaller number of consonants (Tyler & Lewis, 2005). The ECI measure is moderately to strongly correlated with token-to-token variability of repeated productions at word level and with measures of speech severity, such as percent consonants correct (PCC; Preston & Koenig, 2011). Generally, correlations between PCC and ECI scores have been reported in the range of r = −.58 to −.88 in children with speech and language disorders (Tyler & Lewis, 2005; Tyler et al., 2003). Importantly, and as mentioned earlier (see the Error Inconsistency in CAS section), there are several studies that provide support for the notion that variability/consistency measurements using such methods (e.g., ECI) may represent severity of the problem rather than disorder category (Betz & Stoel-Gammon, 2005; Forrest, Dinnsen, & Elbert, 1997; Forrest, Elbert, & Dinnsen, 2000; Tyler et al., 2006). With regard to reliability and validity, ECI score calculation has a high degree of reliability (99%; Tyler et al., 2003) and possibly addresses the same construct as other measures of speech severity (e.g., PCC; Tyler & Lewis, 2005; see Table 3).

TTR of Consonant Substitutions
TTR analysis measures the ratio of the number of types of productions to the total number of tokens produced (see Table 4). It indicates the number of different ways (i.e., inconsistency) a target form is produced by the child. Two variations of TTR analysis have been applied in both diagnostic and therapeutic contexts in the SSD and CAS populations. The segmental-level TTR measure, called CSIP, calculates a percentage based on the number of different error substitutes across all targets divided by the total number of erred productions across the whole inventory (Forrest & Iuzzini-Seigel, 2008; Iuzzini-Seigel, 2012). The ISP (Iuzzini-Seigel & Forrest, 2010) is derived from CSIP by changing the denominator from the total number of erred productions to the number of target opportunities. Validity of the CSIP/ISP measure has been demonstrated in few studies. The segmental-level ISP measure is correlated with the broader lexical-level word inconsistency scores (r > .70; Iuzzini-Seigel, 2012), which demonstrates construct validity. Interrater percent agreement scores for narrow transcriptions, as used in TTR analysis, are reported to be > 90% (Heisler, Goffman, & Younger, 2010; Iuzzini-Seigel, 2012; see Table 4).
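
The difference between the CSIP and ISP denominators described above can be sketched in a few lines of Python. This is a literal reading of the prose definition only: it assumes the data are (target consonant, produced consonant) pairs, that "different error substitutes across all targets" means the number of distinct substitutes counted per target and then summed, and that every pair is one target opportunity. The function name and the example data are hypothetical, and published scoring rules (e.g., for omissions or distortions) may differ.

```python
from collections import defaultdict


def substitution_inconsistency(pairs):
    """Return (CSIP, ISP) as percentages for (target, produced) consonant pairs.

    CSIP: distinct error substitutes / erred productions * 100
    ISP:  distinct error substitutes / target opportunities * 100
    """
    substitutes = defaultdict(set)  # target -> set of distinct error substitutes
    erred = 0
    for target, produced in pairs:
        if produced != target:
            substitutes[target].add(produced)
            erred += 1
    different = sum(len(forms) for forms in substitutes.values())
    opportunities = len(pairs)
    csip = 100.0 * different / erred if erred else 0.0
    isp = 100.0 * different / opportunities if opportunities else 0.0
    return csip, isp


# Hypothetical sample: 10 target opportunities, 7 erred productions,
# 5 distinct substitutes (s -> t, d; k -> t, g; r -> w).
sample = [
    ("s", "t"), ("s", "t"), ("s", "d"), ("s", "s"),
    ("k", "t"), ("k", "k"), ("k", "g"),
    ("r", "w"), ("r", "w"), ("r", "r"),
]
csip, isp = substitution_inconsistency(sample)
print(f"CSIP = {csip:.1f}%, ISP = {isp:.1f}%")
```

Because correct productions enter only the ISP denominator, a child with many correct productions and a few varied errors receives a lower ISP than CSIP, as in this example (5/10 versus 5/7).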

Table 2. Methodological details: Word Inconsistency Assessment (Dodd et al., 2009).

Materials and methods
(1) Stimuli or targets being analyzed:
    25 words (ranging from one to four syllables)
(2) Tasks used to elicit those targets:
    Picture naming
(3) Conditions in which responses are elicited:
    Quiet, no time pressure, production of each target word in three separate trials, each trial separated by an intervening task (subsection of oral motor screen) or a short break (5 min) with conversation
(4) The measures obtained from those responses:
    Percentage of target words produced differently (word inconsistency score)

Scientific basis
(5) Standardized measurement protocol?
    Yes
(6) Validity and reliability of outcome measures?
    Validity: Not specified in the DEAP test manual
    Reliability: Percent interrater agreement for Word Inconsistency Assessment based on whole-word narrow transcriptions from video/audio recordings was 91.64% (SD = 5.76%; Iuzzini-Seigel, 2012)
(7) Norm or reference data available?
    Reference data: n > 40% = inconsistent phonological disorder (Dodd, 2005; Tyler & Lewis, 2005)

Note. DEAP = Diagnostic Evaluation of Articulation and Phonology.
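
The word inconsistency score in row (4) of Table 2 can be sketched as follows. The sketch assumes three transcribed trials per target word and treats a word as variable when any of its trials differ, as described for the Token-to-Token Inconsistency assessment; the function name, the exact-match comparison of transcriptions, and the five-word example (the subtest itself uses 25 words) are illustrative assumptions rather than the DEAP scoring manual.

```python
def word_inconsistency_score(trials_by_word):
    """Percentage of target words whose trials were not all produced identically."""
    if not trials_by_word:
        raise ValueError("at least one target word is required")
    variable = sum(1 for trials in trials_by_word.values() if len(set(trials)) > 1)
    return 100.0 * variable / len(trials_by_word)


# Hypothetical transcriptions: three trials for each of five target words.
trials = {
    "elephant": ["elefant", "efalint", "elefant"],  # variable
    "umbrella": ["umbela", "umbela", "umbela"],     # same across trials
    "helicopter": ["helikota", "hekopta", "heta"],  # variable
    "spaghetti": ["geti", "geti", "geti"],          # same across trials
    "thermometer": ["momita", "fomita", "momita"],  # variable
}

score = word_inconsistency_score(trials)
CUTOFF = 40.0  # > 40% = inconsistent, the reference value listed in Table 2
print(f"word inconsistency = {score:.0f}% -> inconsistent: {score > CUTOFF}")
```

Note that a word produced incorrectly but identically on all three trials (e.g., "umbela") counts as consistent, so the score reflects variability rather than accuracy.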

Acoustic Measures

Acoustic Spatiotemporal Variability Indices

Assessment of speech variability via audio signals is clinically feasible even in difficult-to-test populations and has been recently proposed by several researchers (Anderson, Lowit, & Howell, 2008; Cummins, Lowit, & van Brenk, 2014; Howell et al., 2009; see Table 5). The acoustic STI is calculated in a similar manner to its kinematic variant but from the amplitude envelope derived from rectified and low-pass filtered speech audio recordings (Howell et al., 2009). As the source signal for variability calculation is the amplitude envelope, Howell et al. (2009) refer to this as E-STI. The E-STI measure captures the joint spatial and temporal variation in the patterning of speech amplitude envelopes over repeated utterances. For the E-STI, the sum of 50 SDs at 2% intervals is calculated over time- and amplitude-normalized repeated acoustic amplitude envelopes. While the kinematic STI derived from single articulatory movement trajectories (or, in some cases, from interarticulatory distance measures) represents the stability of underlying movement templates (Kleinow & Smith, 2000), the E-STI represents the summed output of the respiratory, laryngeal, and articulatory subsystems. Lower E-STI values suggest less variability and thus more robust and efficient coordination of the speech subsystems (Anderson et al., 2008; Cummins et al., 2014; Howell et al., 2009).

There are preliminary data to suggest that E-STI and kinematic STI are positively correlated and that E-STI is useful to discriminate speakers based on age and speakers who stutter from those who do not (Howell et al., 2009). A further methodological advancement over the STI/E-STI has been the nonlinear functional data analysis (FDA) procedure (Lucero, 2005; Lucero, Munhall, Gracco, & Ramsay, 1997; Ramsay & Silverman, 1997). The FDA procedure permits the estimation of spatial (or amplitude) and temporal variability separately (Lucero, 2005). The FDA nonlinearly manipulates the time axis of acoustic (pitch, intensity, and formant tracks) or kinematic signals from successive utterances, such that their features are in alignment with each other. The amount of adjustment necessary to bring the signals into alignment provides an estimate of temporal variability, while the differences on the amplitude axis provide an estimate of spatial variability (Anderson et al., 2008; Howell, Anderson, & Lucero, 2010). Following time and amplitude alignment, temporal variability and spatial variability can be independently derived by averaging the standard deviation of the spatial and temporal errors across the signal (Anderson et al., 2008). Another recent development in the assessment of speech variability using acoustic recordings is the utterance-to-utterance variability (UUV) index (Cummins et al., 2014). For the UUV index, mel-frequency–scaled spectral coefficients are extracted from utterances, and a dynamic time-warping algorithm is used to map one utterance onto the other. The UUV index is a quantitative measure that represents the amount of warping (compression and stretching) required for the optimal mapping between the two utterances.

With regard to validity, E-STI, FDA, and UUV procedures have shown good comparability to other validated measures (e.g., kinematic STI) when investigating task demands on the speech motor system (e.g., changes in speech rate) and distinguishing type/severity of speech disorders (e.g., in dysarthria; Anderson et al., 2008; Mefferd, 2015; van Brenk & Lowit, 2012). These indices are also correlated with speech intelligibility ratings and standardized maximum performance tasks (e.g., diadochokinesis; Anderson et al., 2008; Cummins et al., 2014; Howell et al., 2010). Although these procedures have great potential for clinical use, they are yet to be applied to the CAS population. In terms of reliability, none of the studies examining

3004 Journal of Speech, Language, and Hearing Research • Vol. 62 • 2999–3032 • August 2019

Downloaded from: https://pubs.asha.org Andrea Martucci on 09/27/2019, Terms of Use: https://pubs.asha.org/pubs/rights_and_permissions


Table 3. Methodological details: Error Consistency Index (ECI; Preston & Koenig, 2011; Tyler & Lewis, 2005; Tyler et al., 2003).

Materials and methods

(1) Stimuli or targets being analyzed 64 words (included every English consonant at least twice—except /h/; Preston &
Koenig, 2011)
(2) Tasks used to elicit those targets Picture naming (Preston & Koenig, 2011)
(3) Conditions in which responses are elicited Quiet, no time pressure (Preston & Koenig, 2011)
(4) The measures obtained from those responses ECI: Sum of all different error forms for all consonant phonemes combined
(Preston & Koenig, 2011; Tyler & Lewis, 2005; Tyler et al., 2003)
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? Validity: Point-by-point consonant agreement = 87.3% (range: 81.5%–92.3%)
Interrater reliability of ECI scores, r = .98 (Preston & Koenig, 2011)
Reliability: Intra- and interreliability of error consistency scores derived from
transcriptions = 99% (Tyler et al., 2003)
(7) Norm or reference data available? Reference data: ECI range in preschool-age children with speech and language
disorders: 12–70
ECI cutoff scores for children with speech and language disorders: variable
group, upper quartile > 44.75; consistent group, lower quartile < 22.25 (Tyler &
Lewis, 2005)
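
The ECI definition in Table 3 (the sum of all different error forms for all consonant phonemes combined) can be illustrated with a minimal sketch. The input format (pairs of target consonant and produced form), the function name, and the treatment of an omission as one error form are assumptions made for illustration only; they are not taken from the cited studies.

```python
from collections import defaultdict

def error_consistency_index(productions):
    """Sum, over target consonants, of the number of distinct error forms.

    `productions` is a list of (target, produced) pairs. This input format
    is a hypothetical convenience, not the published scoring protocol.
    """
    error_forms = defaultdict(set)
    for target, produced in productions:
        if produced != target:  # correct productions are not error forms
            error_forms[target].add(produced)
    return sum(len(forms) for forms in error_forms.values())

# Hypothetical transcriptions; "" marks an omission (assumed to count as a form).
sample = [("s", "t"), ("s", "d"), ("s", "s"), ("s", "t"),
          ("k", "t"), ("k", ""), ("r", "w"), ("r", "w")]
eci = error_consistency_index(sample)  # /s/: {t, d}; /k/: {t, omission}; /r/: {w}
```

Repeating the same error (e.g., /s/ produced as [t] twice) does not raise the score; only a new error form for a given target does.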

these procedures reports any reliability scores related to segmentation of acoustic recordings or peak-picking algorithms (see Table 5).

VOT Variability

VOT is considered a robust and reliable acoustic temporal cue for distinguishing between voiced and voiceless plosive cognates (Auzou et al., 2000; Lisker & Abramson, 1964; see Table 6). It is defined as the time (in milliseconds) between the release of oral closure for plosive production and the onset of voicing (Lisker & Abramson, 1964) and reflects coarticulatory timing control between laryngeal and supralaryngeal mechanisms in speech production (Auzou et al., 2000; Whiteside et al., 2003). VOT and VOT variability have been investigated in children with SSDs arising from articulation and phonological impairments (Lundeborg, Nordin, Zeipel-Stjerna, & McAllister, 2015), speech motor issues (Yu et al., 2014), and apraxia of speech (AOS; Iuzzini-Seigel, Hogan, Guarino, et al., 2015).

Variability of VOT productions is usually calculated as the coefficient of variation of repeated productions. A few studies have used measures of VOT and VOT variability in the assessment of children with CAS. Compared to children with speech delay, children with CAS have been shown to produce shorter VOTs for voiceless stops, indicating a delay in acquisition of the voicing contrast (Iuzzini-Seigel, 2012; Iuzzini-Seigel, Hogan, Guarino, et al., 2015). As of yet, outcome measures related to VOT, such as absolute VOT length, VOT variability, or strength of voiced–voiceless contrasts, have not been correlated reliably to other outcome measures, such as intelligibility obtained with children with CAS.

With respect to reliability, one has to consider that VOT is a measurement of overlapping physiological events represented by strict, sometimes arbitrarily defined boundaries. As such, discrepancies in measurements within and across studies might be expected to some degree (Abramson & Whalen, 2017). However, most studies report outcome measures obtained with high reliability (Iuzzini-Seigel, Hogan, Rong, et al., 2015; Lundeborg et al., 2015; see Table 6).

Articulatory Measures

Background on Kinematic Variability

The source or nature of articulatory variability depends on one’s theoretical perspective. The motor control literature suggests that fluctuations of a value over repeated measurements (variability; Chau, Young, & Redekop, 2005) are an indicator of imprecise movements often associated with pathophysiology or an immature neuromotor system (e.g., A. Smith & Zelaznik, 2004). In theories such as the dynamical systems theory, variability also serves as an indicator of adaptability and flexibility in the system (Thelen & Smith, 1994; van Lieshout & Namasivayam, 2010). However, the view of variability as a positive aspect of production has not yet gained traction in the field of SSD and CAS.

Objectively, movement variability has been described in the CAS literature in terms of discrete temporal or spatial parameters as related to single articulatory movements (e.g., standard deviations or covariance measures related to peak velocities, amplitudes, and duration of movements) and as measures of articulatory coordination (e.g., Grigos, 2009; Grigos & Patel, 2007; Nijland, Maassen, Hulstijn, & Peters, 2004; Terband et al., 2011, 2012). More recently, speech motor performance measures based on complete movement trajectories (from single articulators), called the kinematic STI (Kleinow & Smith, 2000), have been utilized. Researchers have also started to examine speech motor system (in)stability at the level of movement coordination within and between functional synergies. The specifics of these outcome measures are described in the subsections below.

Typically, optical (i.e., camera based using visible or infrared light) or electromagnetic articulography (EMA) systems have been used in children for tracking orofacial movements related to speech (Moss & Grigos, 2012; Terband

Terband et al.: Methodology in the Assessment of CAS 3005


Table 4. Methodological details: type–token ratio: consonant substitute inconsistency percentage (CSIP)/inconsistency severity percentage
(ISP; Iuzzini-Seigel, 2012; Iuzzini-Seigel & Forrest, 2010).

Materials and methods

(1) Stimuli or targets being analyzed 200–240 word probe list that provides 340–440 opportunities to produce all of the
American English consonants in all naturally occurring word positions (Iuzzini-
Seigel, 2012; Iuzzini-Seigel & Forrest, 2010)
Stimuli also derived from the Goldman-Fristoe Test of Articulation 2 (GFTA-2) and
the first trial of Word Inconsistency Assessment (Dodd et al., 2009)
(2) Tasks used to elicit those targets Picture-naming task (if child is unable, then semantic cue or delayed imitation is
carried out)
(3) Conditions in which responses are elicited Quiet, no time pressure
(4) The measures obtained from those responses CSIP: percentage based on the number of different error substitutes across all targets
divided by the total number of erred productions across the whole inventory
(Iuzzini-Seigel, 2012; Iuzzini-Seigel & Forrest, 2010)
ISP: percentage based on the number of different error substitutes across all targets
divided by total number of productions (Iuzzini-Seigel, 2012; Iuzzini-Seigel &
Forrest, 2010)
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? Validity: Construct validity: high correlation between ISP (r > .70) and lexical-level
word inconsistency scores (Iuzzini-Seigel, 2012)
Reliability: Interrater percent agreement for narrow transcription > 90% (Heisler et al.,
2010; Iuzzini-Seigel, 2012)
(7) Norm or reference data available? Reference data: ISP score cutoff for CAS > 17% (Iuzzini-Seigel, 2012)

Note. CAS = childhood apraxia of speech.
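
The two ratios defined in Table 4 differ only in their denominator, which a short sketch can make concrete. Counting "different error substitutes across all targets" as distinct (target, substitute) pairs is an assumption, as are the function name and the input format; the cited studies should be consulted for the actual scoring rules.

```python
def csip_and_isp(productions):
    """Return (CSIP, ISP) in percent from (target, produced) pairs.

    Assumption: a "different error substitute" is a distinct
    (target, substitute) pair; correct productions are excluded.
    """
    errors = [(t, p) for t, p in productions if p != t]
    n_distinct = len(set(errors))   # shared numerator of both ratios
    n_errors = len(errors)          # CSIP denominator: erred productions
    n_total = len(productions)      # ISP denominator: all productions
    csip = 100.0 * n_distinct / n_errors if n_errors else 0.0
    isp = 100.0 * n_distinct / n_total if n_total else 0.0
    return csip, isp

# Hypothetical data: 10 productions, 6 errors, 4 distinct substitutes.
sample = [("s", "t"), ("s", "d"), ("s", "s"), ("s", "t"), ("k", "t"),
          ("k", "k"), ("r", "w"), ("r", "w"), ("r", "r"), ("m", "m")]
csip, isp = csip_and_isp(sample)
```

Because the ISP divides by all productions rather than by errors only, it also reflects how often the child errs, whereas the CSIP describes inconsistency among the errors themselves.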

et al., 2011). Optical motion capture systems utilize small reflective markers (approximately 3 mm) that are placed on the child’s upper and lower lips, right/left/mid jaw, and lip corners to track speech-related movements. Other markers are placed on the forehead and nasion, which are used as reference to correct for head rotation/movements. An alternative to optical motion capture systems is EMA. In EMA, the position and motion of sensor coils attached to speech articulators are tracked within a magnetic field. The sensor coils, typically around 4 × 4 × 3 mm in size, are usually glued on the bridge of the nose, the maxillary gum ridge, the upper and lower lips, the mandibular gum ridge, and two or three points on the tongue. As the sensor coils are wired and directly glued on the articulators, this methodology is relatively invasive and might not be tolerated well by young children or infants. In comparison, the passive reflective markers used with optical motion tracking systems are unobtrusive, light, and well tolerated by young children and offer a more relaxed and naturalistic setting for data collection. The limitation of optical motion capture systems is that they require a direct line of sight between the camera and the reflective marker and hence are only suited for the measurement of externally visible structures such as the jaw and lips. The operational principles of the optical motion capture and EMA systems have been elaborated elsewhere and are beyond the scope of this review (e.g., see Feng & Max, 2014; Yunusova, Green, & Mefferd, 2009).

Kinematic Spatiotemporal Variability Indices

For the STI, a sum of 50 SDs at 2% intervals is calculated over amplitude- and time-normalized repeated movement trajectories (e.g., of the jaw or the lower lip) or individual movement cycles (cyclic STI; van Lieshout & Moussa, 2000; see Table 7). A lower STI value represents less variability, suggesting a robust and well-learned underlying movement template (Kleinow & Smith, 2000). With regard to stimuli and elicitation procedures, camera-based motion tracking of speech articulators in children has been limited to visible structures such as the jaw and lips and to words that consist of bilabial consonants (e.g., pop, puppet, and puppypop: Moss & Grigos, 2012; buy bobby a puppy: A. Smith & Goffman, 1998). Stimuli with bilabial productions are also chosen with EMA systems for easier segmentation of position data (Terband et al., 2011). To acquire adequate data for measurement of articulatory variability (e.g., STI/cyclic STI), about 10–15 productions of the target stimuli are elicited. Most speech kinematic studies in children have elicited productions using picture naming, a cloze sentence procedure (within a story retell game), or direct/immediate word/sentence imitation tasks with auditory models (Grigos et al., 2015; Moss & Grigos, 2012; Sadagopan & Smith, 2008; Terband et al., 2011; see Table 7).

Covariance Measures

Moss and Grigos (2012) examined spatial coupling (calculated as absolute peak correlation coefficient [PC] between articulator pairs; i.e., between jaw and lower lip [J–LL], jaw and upper lip [J–UL], and upper and lower lip [UL–LL]) and temporal coupling (time required for peak spatial coupling; i.e., lag) as a function of word length (e.g., “pop,” “puppet,” and “puppypop”; see Table 8). A pair of articulators with a high degree of spatial and temporal coordination would yield high correlation coefficients



Table 5. Methodological details: acoustic spatiotemporal variability indices (Anderson et al., 2008; Cummins et al., 2014; Howell et al., 2009;
van Brenk & Lowit, 2012).

Materials and methods

(1) Stimuli or targets being analyzed 20–25 repetitions of a phrase of which typically 10 are used for analysis: “Buy Bobby a
puppy” (E-STI; Howell et al., 2009); “Well we’ll will them” (FDA; Anderson et al., 2008);
“Tony knew you were lying in bed” (FDA/UUV; Cummins et al., 2014)
(2) Tasks used to elicit those targets Phrase repetition
(3) Conditions in which responses are elicited Quiet, self-selected comfortable/habitual speaking rate, twice as fast or half as fast as
habitual speaking rate
(4) The measures obtained from those responses Independent or combined temporal and spatial variability (E-STI/FDA/UUV) from audio
recordings
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? Validity: Results comparable to kinematic STI and negatively correlated with speech
intelligibility ratings (Cummins et al., 2014; van Brenk & Lowit, 2012)
Reliability: No
(7) Norm or reference data available? No

Note. E-STI = envelope-based spatiotemporal index; FDA = functional data analysis; UUV = utterance-to-utterance variability.
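
To illustrate the idea behind warping-based indices such as the UUV, the following minimal sketch aligns two utterances by dynamic time warping and summarizes how far the optimal alignment departs from a strictly linear mapping. This is not the published UUV formula: the Euclidean frame distance, the unconstrained step pattern, the deviation-from-diagonal summary, and all function names are assumptions chosen for illustration. Each utterance is assumed to be a sequence of equal-length feature vectors (e.g., spectral coefficients per frame).

```python
import math

def dtw_path(a, b):
    """Optimal dynamic time-warping path between two frame sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, preferring the diagonal step on ties.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps, key=lambda s: s[0])
    return path[::-1]

def warping_amount(a, b):
    """Mean deviation of the DTW path from the linear (diagonal) mapping."""
    path = dtw_path(a, b)
    n, m = max(len(a) - 1, 1), max(len(b) - 1, 1)
    return sum(abs(i / n - j / m) for i, j in path) / len(path)

# Hypothetical one-dimensional "feature" frames for two utterances.
utt = [(0.0,), (1.0,), (2.0,), (1.0,), (0.0,)]
stretched = [(0.0,), (0.0,), (0.0,), (0.0,), (1.0,), (2.0,), (1.0,), (0.0,)]
same = warping_amount(utt, utt)
warped = warping_amount(utt, stretched)
```

An utterance compared with itself requires no warping, whereas a repetition with a lengthened onset requires local stretching and yields a larger value.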

and low lag values. Moss and Grigos analyzed these measures in 3- to 6-year-old TD children and those with CAS and speech delay (n = 6 per group). There was no effect of group or Group × Word interactions for PC and lag. Green, Moore, Higashikawa, and Steeve (2000) analyzed PC and lag in 1-, 2-, and 6-year-old TD children and adults. In general, 1- and 2-year-old children demonstrated greater spatial coupling between the upper and lower lips (UL–LL) than between the lip and jaw pairs. The PC values indexing lip and jaw coupling (J–UL, J–LL) for 1-year-old children were very low, indicating weak coupling (values centered near zero). Spatial coupling values increased with age. With regard to lag-to-peak coefficient values, all articulatory movements (across pairs of articulators) were tightly coupled, with mean lag values no greater than 29 ms for any age group (see Table 8).

Coefficient of Variation of Spatial and Temporal Coupling

Coefficient of variation of the PC (PCcov) and lag values (Lcov) from the Covariance Measures section were analyzed by Moss and Grigos (2012) for the following articulatory pairs: J–LL, J–UL, and UL–LL in 3- to 6-year-old TD children, those with speech delay, and children diagnosed with CAS (n = 6 per group; see Table 9). Significant main effects for group were found for PCcov and Lcov. The CAS group had significantly higher average PCcov and Lcov across utterances for J–LL coupling than the speech delay group (see Table 9).

Lengthened and Disrupted Coarticulatory Transitions Between Sounds and Syllables

Background

Coarticulation

Coarticulation refers to the phenomenon that the specific properties of articulatory movements are context dependent as articulatory movements overlap in time and interact with one another. Acoustically, this manifests itself as the realizations of consecutive speech segments affecting each other mutually. The effect is bidirectional. Influences of a segment on a following segment are called perseveratory or carryover coarticulation, and influences of an upcoming segment on a preceding segment are known as anticipatory coarticulation. Furthermore, coarticulation is not limited to adjacent segments and can occur across syllables.

Coarticulation is the consequence of the inertia of the articulatory organs caused by their biomechanical characteristics and an economy of effort in articulatory planning influenced by biomechanical constraints (e.g., Recasens, 2004; Recasens, Pallarès, & Fontdevila, 1997), prosodic conditions (Cho, 2004; De Jong, 1995; Edwards, Beckman, & Fletcher, 1991), and syllable structure (e.g., Modarresi, Sussman, Lindblom, & Burlingame, 2004; Nittrouer, Munhall, Kelso, Tuller, & Harris, 1988; Sussman, Bessell, Dalston, & Majors, 1997). Furthermore, the amount of coarticulation depends on lexical frequency and, relatedly, the specific demands of the communication task (e.g., Farnetani & Recasens, 1997; Kühnert & Nolan, 1999). Perseveratory coarticulation has been found to reflect predominantly biomechanical constraints, whereas anticipatory coarticulation mainly reflects higher level phonetic processing (e.g., Daniloff & Hammarberg, 1973; Hertrich & Ackermann, 1995, 1999; Kent & Minifie, 1977; Whalen, 1990). Comparisons between carryover and anticipatory coarticulation effects are highly complicated, as both effects co-occur at multiple levels at approximately the same time. Moreover, the specific biomechanical constraints and syllabic position of the speech sounds involved play a role that is not straightforward and appears to be language specific; that is, some studies report stronger perseveratory as compared to anticipatory coarticulation, whereas other studies report opposite effects (Beddor, Harnsberger, & Lindemann, 2002; Graetzer, Fletcher, & Hajek, 2015;

Table 6. Methodological details: voice onset time (VOT) variability (Iuzzini-Seigel, Hogan, Rong, et al., 2015; Whiteside et al., 2003; Yu et al., 2014).

Materials and methods

(1) Stimuli or targets being analyzed Five repetitions of CVC pseudowords (pVb), which sampled corner vowels (e.g., /pib/,
/pub/; Iuzzini-Seigel, Hogan, Guarino, et al., 2015)
115 Repetitions of monosyllabic /pa/ (Yu et al., 2014)
Five repetitions of 12 CVC target words with plosive consonants in syllable initial
position (e.g., pea, bee, tea; Whiteside et al., 2003)
Three repetitions of six minimal pairs (e.g., pil–bil, tennis–dennis; Lundeborg et al., 2015)
(2) Tasks used to elicit those targets Imitation of recorded speech sample (Iuzzini-Seigel, Hogan, Guarino, et al., 2015)
Cued (white circle on monitor) repetition task (Yu et al., 2014)
Picture naming (Iuzzini-Seigel, Hogan, Guarino, et al., 2015)
In carrier phrase “say ___ now” (Iuzzini-Seigel, Hogan, Guarino, et al., 2015) or “say ___
again” (Whiteside et al., 2003)
(3) Conditions in which responses are elicited Quiet room, no time pressure
(4) The measures obtained from those responses        VOT duration in milliseconds, described in terms of mean, SD, median,
median difference scores for voiced–voiceless cognates, COV, and skewness
(Iuzzini-Seigel, Hogan, Guarino, et al., 2015)
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? Validity: No
Reliability: Intrarater reliability: ICC = .98–.99 (absolute error = 2.0–4.3 ms; Iuzzini-Seigel,
Hogan, Guarino, et al., 2015); Cronbach’s alpha = .97 (Lundeborg et al., 2015).
Interrater reliability: Pearson r = .97 (Whiteside et al., 2003); mean difference between
raters = 17.19 ms (SD = 6.89 ms), Pearson r = .93 (Yu et al., 2014)
(7) Norm or reference data available? Reference data: Mean COV values (in %) for voiced plosives approximately 20%–30%
for typically developing children between 5;8 and 13;2 (years;months). Mean COV
values (in %) for voiceless plosives approximately 15%–25% for typically developing
children between 5;8 and 13;2 (Whiteside et al., 2003)
Typically developing 5-year-olds: Mean COVs of 74% for /b/ and 51% for /d/. Mean
COVs of 42% for /p/ and 34% for /t/
3- to 5-year-old children with CAS: Mean (SD) of COV = 56% (29) for /p/ and 52% (28) for /t/
3- to 5-year-old children with phonological delay: Mean (SD) of COV = 38% (19) for /p/
and 42% (25) for /t/ (Iuzzini-Seigel, 2012)

Note. COV = coefficients of variation; CAS = childhood apraxia of speech; ICC = intraclass correlation coefficient.
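
The COV values reported in Table 6 express the standard deviation of repeated VOT measurements as a percentage of their mean. A minimal sketch follows; the use of the sample standard deviation (n − 1 denominator), the function name, and the example values are assumptions, since the cited studies may differ in these details.

```python
import statistics

def vot_cov_percent(vot_ms):
    """Coefficient of variation (in %) of repeated VOT measurements in ms.

    Assumption: sample standard deviation (n - 1) is used.
    """
    mean = statistics.mean(vot_ms)
    if mean == 0:
        raise ValueError("COV is undefined when the mean VOT is zero")
    return 100.0 * statistics.stdev(vot_ms) / mean

# Hypothetical VOTs (ms) for three repetitions of a voiceless plosive.
cov = vot_cov_percent([60.0, 70.0, 80.0])  # SD = 10 ms, mean = 70 ms
```

Because the mean appears in the denominator, the COV becomes unstable when mean VOT is close to zero, as can happen for voiced plosives with short-lag or prevoiced productions; this may be worth bearing in mind when comparing COVs across voicing categories.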

Modarresi et al., 2004; Recasens & Pallarès, 2001; Sharf & Ohde, 1981).

Typical Development of Coarticulation

In typical development, coarticulatory patterns change as children become more adultlike in their speech production and improve spatiotemporal control. However, precisely how coarticulation changes during development has proved to be rather complex. Studies agree on the fact that coarticulation is more variable in the speech of children as compared to adults, but some studies report stronger coarticulation in children while other studies report that children exhibit less coarticulation than adults. At first glance, these results appear to be conflicting, but studies differ in experimental methodologies, procedures, language, stimuli, and age of participants. When examined closely, the results show a pattern in which “coarticulation that reflects poor temporal control or poor differentiation of structures decreases, whereas coarticulation that reflects language-specific efficiency increases” (ASHA, 2007, p. 8). More specifically, coarticulation decreases in general, as coordinative structures/functional motor synergies develop (e.g., Barbier et al., 2013; Noiray, Abakarova, Rubertus, Krüger, & Tiede, 2018; Noiray, Ménard, & Iskarous, 2013; Sussman, Minifie, Buder, Stoel-Gammon, & Smith, 1996; Zharkova, Hewlett, & Hardcastle, 2011, 2012) and children move from more global to more segmental planning (Katz & Bharadwaj, 2001; Nijland et al., 2002; Nittrouer, Studdert-Kennedy, & McGowan, 1989; Noiray et al., 2018; Siren & Wilcox, 1995). However, coarticulation increases (relatively) in certain contexts that are language specific, that is, depending on, for example, the phonological and articulatory specification of the segments involved (e.g., underspecified vowels exhibit more coarticulation; Nijland et al., 2002), prosodic patterns (e.g., stressed vowels exhibit less coarticulation; Nijland et al., 2002), and morphological structure or lexical frequency (e.g., higher frequency utterances show more coarticulation in adults but not in children; Song, Demuth, Evans, & Shattuck-Hufnagel, 2013). Furthermore, differences between anticipatory and perseveratory coarticulation in their developmental trajectories seem likely due to their differences in etiology, but the development of anticipatory coarticulation and that of perseveratory coarticulation have not yet been compared directly in a single experimental design. In fact, little is known about the development of perseveratory coarticulation in general, with the vast majority of studies focusing on anticipatory coarticulation (but see Song et al., 2013).



Table 7. Methodological details: spatiotemporal index (STI)/cyclic STI (cSTI; Grigos, 2009; A. Smith, Goffman, Zelaznik, Ying, & McGillem,
1995; Van Lieshout & Moussa, 2000).

Materials and methods

(1) Stimuli or targets being analyzed Eight to 15 productions of /papa/ and /baba/ produced with equal stress (Grigos, 2009)
10–15 productions of “pop,” “puppet,” and “puppypop” (Grigos et al., 2015; Moss &
Grigos, 2012)
Dutch words /paːs/ and /spaː/ repeated for 5–12 s (three to six movement cycles per trial;
Terband et al., 2011)
(2) Tasks used to elicit those targets Object naming (Grigos, 2009)
Closed-sentence procedure or respond to a “who”-question cued by a picture probe
(Grigos et al., 2015; Moss & Grigos, 2012)
Reiterated speech task–auditory model provided as needed (Terband et al., 2011)
(3) Conditions in which responses are elicited No time pressure, play scenario (Grigos, 2009)
Naturalistic productions embedded in a story retell game (Grigos et al., 2015; Moss
& Grigos, 2012)
Syllable repeated at self-chosen normal, comfortable pace (Terband et al., 2011)
(4) The measures obtained from those responses Jaw, lower lip, and upper lip displacement trajectories (Grigos, 2009; Grigos et al., 2015)
Lip aperture STI and lower lip–jaw STI (Moss & Grigos, 2012)
cSTI for tongue tip, lower lip, and jaw (Terband et al., 2011)
Scientific basis
(5) Standardized measurement protocol? No
Segmentation based on zero crossing of jaw velocity trace (Grigos, 2009)
Movement cycles (peaks/valleys in the position and velocity signals) were identified by
automated algorithm using relative amplitude (10% of maximum amplitude) and time
(a minimum interval of 0.5 s between successive events) criteria. Errors in automated
peak/valley assignment were corrected manually (Terband et al., 2011)
(6) Validity and reliability of outcome measures? No
(7) Norm or reference data available? Reference data: lower lip STI data on typically developing children and young adults for
“buy bobby a puppy” phrase: M (SD) = 24.1 (4) for 4-year-old children, 18.5 (5.7)
for 7-year-old children, 13.6 (2.5) for 20- to 27-year-old young adults (A. Smith &
Goffman, 1998)
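
The STI computation summarized in the text (a sum of 50 SDs at 2% intervals over amplitude- and time-normalized repeated trajectories) can be sketched as follows. This is a simplified illustration under stated assumptions: amplitude normalization is done by z-scoring each record, time normalization by linear interpolation directly onto 50 points, and the SD across trials is the sample SD. The published procedures may use different interpolation and resampling choices, and all function names here are hypothetical.

```python
import math
import statistics

def zscore(signal):
    """Amplitude-normalize one trajectory (zero mean, unit SD)."""
    mean = statistics.fmean(signal)
    sd = statistics.pstdev(signal)
    if sd == 0:
        raise ValueError("a flat trajectory cannot be amplitude-normalized")
    return [(x - mean) / sd for x in signal]

def resample(signal, n_points=50):
    """Time-normalize by linear interpolation onto n_points time points."""
    last = len(signal) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out

def spatiotemporal_index(trials, n_points=50):
    """Sum of across-trial SDs at each of the n_points normalized time points."""
    normalized = [resample(zscore(t), n_points) for t in trials]
    return sum(statistics.stdev(column) for column in zip(*normalized))

# Hypothetical displacement records (arbitrary units, 100 samples each).
base = [math.sin(2 * math.pi * t / 99) for t in range(100)]
scaled = [2 * x + 3 for x in base]  # same pattern, larger and offset
shifted = [math.sin(2 * math.pi * t / 99 + 0.5) for t in range(100)]
sti_stable = spatiotemporal_index([base, scaled, base])
sti_variable = spatiotemporal_index([base, shifted, base])
```

Trials that differ only in overall amplitude or offset yield an index near zero after normalization, whereas trials that differ in movement patterning raise it. The same computation applied to acoustic amplitude envelopes rather than displacement records corresponds to the E-STI idea described for acoustic signals.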

In summary, the literature indicates that development does not involve a global increase or decrease in coarticulation. Speech motor development rather moves toward “flexible patterns of coarticulation” (Noiray et al., 2018, p. 1363; see also Noiray, Wieling, Abakarova, Rubertus, & Tiede, in press), which can differ depending on the phonetic and linguistic context. The point we want to make here, therefore, is that one should consider what the possible different outcomes would signify when assessing coarticulation, that is, would more or less coarticulation in a specific case indicate impaired, delayed, or more adultlike speech motor planning and programming?

Coarticulation in Children With CAS

As formulated in the CAS Technical Report, the speech of children with CAS is characterized by “lengthened and disrupted coarticulatory transitions between sounds and syllables” (ASHA, 2007, p. 4). First and foremost, children with CAS show coarticulation patterns that are not consistent, not typically immature, and highly idiosyncratic. Coarticulation effects usually change the characteristics of a speech sound in the direction of the neighboring speech sound. For 5- to 7-year-old children with CAS, however, coarticulation has been found to be both stronger and more extended, as well as the opposite, more segmental (or hyperarticulation), as compared to their TD peers (Maas & Mailend, 2017; Maassen, Nijland, & Van der Meulen, 2001; Nijland et al., 2002; Nijland, Maassen, Van der Meulen, Gabreëls, et al., 2003; Sussman, Marquardt, & Doyle, 2000).

One factor that could be held responsible for this paradox is reduced phonological distinctiveness. The less distinctly speech sounds are produced, the weaker their possible coarticulatory influence on surrounding speech sounds. Children with CAS demonstrated weaker coarticulation in studies where they also showed a decreased differentiation of speech sounds as compared to their TD peers (stop consonants [Sussman et al., 2000] and vowels [Nijland et al., 2002; Nijland, Maassen, & Van der Meulen, 2003]). It is unclear why these studies found a decreased differentiation of speech sounds, as not all studies do. Possibly, the decreased distinctiveness actually reflects coarticulatory effects in the opposite direction. In studies that feature similar phonological distinctiveness in the speech of children with CAS in comparison with TD children, coarticulation was found to be stronger and more extended (Nijland, Maassen, Van der Meulen, Gabreëls, et al., 2003). In a recent study, Terband (2017) investigated anticipatory coarticulation in [ə] as context-dependent F2 ratio relative to the size of the produced phonetic contrast in the data set that was collected previously as part of the studies by Nijland and colleagues (Nijland et al., 2002; Nijland, Maassen, & Van der Meulen, 2003), thus taking the potential coarticulatory influence of the following speech sounds into account. The results showed increased coarticulation in the group of children with CAS (n = 16)

Table 8. Methodological details: covariance measures (Green et al., 2000; Grigos et al., 2015; Moss & Grigos, 2012).

Materials and methods

(1) Stimuli or targets being analyzed One-, two-, and three-syllable words (“pop,” “puppet,” and “puppypop”) repeated
10–15 times in random order (Moss & Grigos, 2012)
“Baba,” “papa,” and “mama” in 15 repetitions in pseudorandom order (Green et al., 2000)
(2) Tasks used to elicit those targets Closed-sentence procedure or respond to a “who”-question cued by a picture probe
(Moss & Grigos, 2012)
Reading for older children and imitation for younger children (Green et al., 2000)
(3) Conditions in which responses are elicited No time pressure, naturalistic productions embedded in a story retell game (Grigos et al.,
2015; Moss & Grigos, 2012)
(4) The measures obtained from those responses Peak correlation coefficient (PC) between articulator pairs and lag (time required for peak
spatial coupling; Green et al., 2000; Moss & Grigos, 2012)
Scientific basis
(5) Standardized measurement protocol? No
Cross-correlation functions computed on the displacement traces
(6) Validity and reliability of outcome measures? Validity: No
Reliability: 10% of data set was reanalyzed by the same experimenter for three
coordinative indices (i.e., contribution to oral closure, coefficient, and lag). The mean
absolute difference between first and second measurements of coefficient and lag
was 0.012 and 3 ms, respectively. Pearson correlations between the first and second
measurements ranged from 0.96 to 0.99. These findings suggest that the difference
between the two measurements was negligible (i.e., good reliability; Green et al.,
2000)
(7) Norm or reference data available? Reference data: Mean (SD) of PC values and lag data from 3- to 6-year-old typically
developing children for “puppypop” phrase: J–LL: PC: 0.62 (0.13), lag: 18.87 (2.77);
J–UL: PC: 0.46 (0.08), lag: 27.86 (3.04); UL–LL: PC: 0.53 (0.06), lag: 26.78 (1.38;
Moss & Grigos, 2012)
Typically developing children (only data for 2- and 6-year-old typically developing children
provided below due to space limitations; exact raw data unavailable; ~ = approximate
values): J–LL: PC: ~0.3 to ~0.7, lag: ~−0.02 to ~−0.01; J–UL: PC: ~0.2 to ~0.4, lag:
~−0.02; UL–LL: PC: ~0.6, lag: ~−0.02 to ~−0.01 (Green et al., 2000)
Note: PC values close to one indicate a high degree of spatial coupling, while lag values
close to zero indicate high levels of temporal coupling

Note. J = jaw; LL = lower lip; UL = upper lip.
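
The peak correlation coefficient and lag listed under (4) can be illustrated with a minimal sketch. It assumes two equally sampled displacement traces and is not the procedure of Green et al. (2000) or Moss and Grigos (2012); the function names, the choice to take the largest (rather than largest absolute) coefficient, and the synthetic traces are hypothetical and serve only as illustration.

```python
import math


def peak_cross_correlation(x, y, max_lag):
    """Return (peak coefficient, lag in samples) for two displacement traces.

    x[i] is paired with y[i + lag], so a positive lag means that y trails x.
    """
    if len(x) != len(y):
        raise ValueError("traces must have the same number of samples")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    # Normalizing by the zero-lag energy keeps every coefficient within [-1, 1].
    norm = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    if norm == 0:
        raise ValueError("traces must not be constant")
    best_r, best_lag = -math.inf, 0
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += dx[i] * dy[j]
        r = total / norm
        if r > best_r:
            best_r, best_lag = r, lag
    return best_r, best_lag


def lag_in_ms(lag_samples, sampling_rate_hz):
    """Convert a lag in samples to milliseconds."""
    return 1000.0 * lag_samples / sampling_rate_hz


# Synthetic (hypothetical) traces: the second reaches its peak 3 samples later.
bump = [math.exp(-((k - 50) / 8.0) ** 2) for k in range(103)]
first_trace = bump[3:103]
second_trace = bump[0:100]
pc, lag = peak_cross_correlation(first_trace, second_trace, max_lag=10)
```

In this sketch, a coefficient close to one corresponds to tight spatial coupling and a lag close to zero to tight temporal coupling, in line with the note to the table.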

compared to TD children (n = 8), but this effect was limited to certain articulatory contexts. While TD children showed a differentiation in coarticulation between consonant contexts, the children with CAS did not. The results did not show any evidence of decreased coarticulation in CAS.
A second factor that is often put forward to explain the paradoxical findings is syllabic structure. The manipulation of syllable boundary or syllable shape revealed differences in the adjustment of the durational structure as a function of syllabic organization in children with CAS as compared to normally developing children (Maassen et al., 2001; Nijland, Maassen, Van der Meulen, Gabreëls, et al., 2003; see also Marquardt, Sussman, Snow, & Jacks, 2002). More specifically, the children with CAS did not show systematic durational adjustments to syllabic structure, and consistent intra- and intersyllabic temporal structures were missing (Maassen et al., 2001; Nijland, Maassen, Van der Meulen, Gabreëls, et al., 2003; see also Marquardt et al., 2002). However, the differential effects of syllable structure on coarticulation are less clear. Children with CAS did not show a significant coarticulation effect across syllable boundaries, while TD children showed stronger intersyllabic coarticulation as compared to adults. However, this lack of a group-level effect could very well be due to the large variability in the children with CAS, both within groups and within subjects (Nijland et al., 2002). In direct comparison, no differences were found between inter- and intrasyllabic coarticulation, either in the children with CAS or in their TD peers (Maassen et al., 2001; Nijland, Maassen, Van der Meulen, Gabreëls, et al., 2003). Although these studies did not contain an adult control group, such an effect has been reported for adults in the literature (e.g., Modarresi et al., 2004; Nittrouer et al., 1988; Sussman et al., 1997). However, the location of the syllable boundary did have an effect, and intersyllabic coarticulation was found to be stronger in V/CC (e.g., /zə sxit/; “ze schiet”) than in VC/C (e.g., /zəs xit/; “zus giet”) sequences for both groups of children (Nijland, Maassen, Van der Meulen, Gabreëls, et al., 2003). In summary, whereas syllabic structure has been found to have a different effect on temporal organization (the durations of the speech sounds) in 5- to 7-year-old children with CAS compared to their TD peers, it does not have a differential effect in terms of coarticulation.

Perceptual Measures
Identification of Gated Stimuli
Due to the transient nature of the acoustic signal, speech characteristics involving fine-grained phonetic detail

3010 Journal of Speech, Language, and Hearing Research • Vol. 62 • 2999–3032 • August 2019

Downloaded from: https://pubs.asha.org Andrea Martucci on 09/27/2019, Terms of Use: https://pubs.asha.org/pubs/rights_and_permissions


Table 9. Methodological details: coefficient of variation of spatial and temporal coupling (Moss & Grigos, 2012).

Materials and methods

(1) Stimuli or targets being analyzed One-, two-, and three-syllable words (“pop,” “puppet,” and “puppypop”) repeated
10–15 times in random order
(2) Tasks used to elicit those targets Closed-sentence procedure or respond to a “who”-question cued by a picture probe
(Moss & Grigos, 2012)
(3) Conditions in which responses are elicited No time pressure, naturalistic productions embedded in a story retell game (Grigos
et al., 2015; Moss & Grigos, 2012)
(4) The measures obtained from those responses Coefficient of variation of peak correlation coefficient (PCcov) between articulator pairs
and coefficient of variation for lag (time required for peak spatial coupling; Lcov; Moss &
Grigos, 2012)
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? No
(7) Norm or reference data available? Reference data: Mean (SD) of jaw–lower lip PCcov and Lcov data of 3- to 6-year-old
typically developing (TD), CAS and children with speech delay for the phrase
“puppypop”: TD: PCcov: 0.36 (0.15), Lcov: 0.65 (0.27); speech delay: PCcov:
0.25 (0.10), Lcov: 0.35 (0.14); CAS: PCcov: 0.54 (0.22), Lcov: 0.73 (0.30; Moss &
Grigos, 2012)
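
As a rough illustration of the measures under (4), the coefficient of variation can be computed as the standard deviation of a coupling index across repetitions divided by its mean. The sketch below assumes the sample standard deviation and takes the absolute value of the mean (because lag values can be negative); both choices are assumptions and are not taken from Moss and Grigos (2012). The function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    if len(values) < 2:
        raise ValueError("at least two repetitions are needed")
    m = mean(values)
    if m == 0:
        raise ValueError("the mean is zero, so the ratio is undefined")
    return stdev(values) / abs(m)


# Hypothetical peak coefficients from three repetitions of one target.
pc_cov = coefficient_of_variation([0.5, 0.6, 0.7])
```

Higher values indicate that the coupling index varies more from one repetition to the next.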

such as coarticulation are very difficult to assess perceptually (see Table 10). Ziegler and von Cramon (1985) used a vowel identification task in which a panel of nine trained listeners were presented with gated speech segments containing parts of increasing length of three test words with the form /gətVːtə/ with target vowels (/i, y, u/) and were asked which test word the segment was the beginning of (see Table 10). The percentage of correct identification is indicative of the amount of coarticulatory information that is contained in the stimulus and can be analyzed as a function of stimulus length and compared between speakers with and without speech disorder. Examining the productions of a patient with AOS compared to three control speakers, Ziegler and von Cramon found that the onset of the vowel gesture was delayed in /i/ and /y/, whereas for /u/ the differences with the control speakers were not as pronounced. These results indicate a reduced anticipation of the upcoming articulatory movement (lip spread in case of /i/ and lip rounding in case of /y/) in the patient with AOS. Using a similar gating technique, Southwood, Dagenais, Sutphin, and Garcia (1997) replicated this finding of reduced anticipatory coarticulation in another apraxic patient.
This measure has not been used in children and only sparsely in populations with speech disorders in general. Its potential for use in clinical settings is limited as the procedure yields 90 stimuli per speaker and requires an elaborate perception experiment with a panel of trained listeners.

Acoustic Measures
Background
There is a large body of studies involving acoustic measurements of coarticulation, typically comparing specific spectral characteristics of the acoustic signal across different contexts. Measurements can focus on the acoustic spatial domain (how much the acoustics are influenced) or the temporal domain (how far the influence reaches). Acoustic outcome measures to assess coarticulation are stimulus specific, and which measure is appropriate depends on the speech sounds that are involved. In vowels, coarticulation can be calculated with mean formant frequencies measured over a short time window (10–30 ms) at different parts of the speech sound, typically comprising onset, midpoint, and offset. While formant frequencies at midpoint are primarily indicative of realized vowel quality and articulatory positioning, other parts of the vowel can be used to investigate the range of the coarticulatory influence. Exact definitions of onset and offset vary between studies but are usually at about 20%–30% and 70%–80% of the vowel, respectively. Few studies have focused on sonorants and liquids, but coarticulation in these speech sounds can be measured similarly to vowels. The same principle applies to fricatives, provided that the calculations are not based on formant analysis but on the spectral moment of the frication noise. When little spectral information is available, such as in the case of plosives, place of articulation should be derived from the formant trajectories in the consonant-to-vowel or vowel-to-consonant transition.
Acoustic measurements of coarticulation typically involve the first three formants, with F2 as the most prominent measure of interest. Under the assumption of an idealized vocal tract model, changes in vocal tract shapes during coarticulation might be obtained from tracing the formant contours over time. The most prominent relationships in the context of coarticulation are the following. First formant frequencies are inversely related to tongue height, that is, high vowels have low F1 values and low vowels have high F1 values. Second formant frequencies are related to tongue advancement, that is, front vowels have high F2 values and back vowels have low F2 values. Third formant frequencies have been found to be related to lip rounding in front vowels, with low F3 values present in rounded vowels and high F3 values present in unrounded vowels (Harrington, 2010). With respect to

Terband et al.: Methodology in the Assessment of CAS 3011


Table 10. Methodological details: identification of gated speech stimuli (Ziegler & von Cramon, 1985).

Materials and methods

(1) Stimuli or targets being analyzed Six repetitions of three words /gətVːtə/ with target vowels (/i, y, u/), from each of which
five gating segments of increasing length were extracted
(2) Tasks used to elicit those targets Imitation (model produced by experimenter)
(3) Conditions in which responses are elicited Quiet, no time pressure; items in carrier phrase (“Ich habe /…/ gehört,” “I have
heard /…/”)
(4) The measures obtained from those responses Percentage /i, y, u/ responses per gating segment in an identification task by a
panel of trained listeners
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? No
(7) Norm or reference data available? No
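
The measure under (4) amounts to tallying, per gating segment, how often the listeners' response matches the target vowel. A minimal sketch under that assumption follows; the data layout (gate index, target vowel, response vowel), the function name, and the example responses are hypothetical and not taken from Ziegler and von Cramon (1985).

```python
from collections import defaultdict


def percent_correct_by_gate(trials):
    """Return {gate: percentage of responses that match the target vowel}.

    Each trial is a tuple (gate_index, target_vowel, response_vowel).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for gate, target, response in trials:
        total[gate] += 1
        if response == target:
            correct[gate] += 1
    return {gate: 100.0 * correct[gate] / total[gate] for gate in sorted(total)}


# Hypothetical listener responses for two gating segments.
example_trials = [
    (1, "i", "u"),
    (1, "i", "i"),
    (2, "i", "i"),
    (2, "y", "y"),
]
scores = percent_correct_by_gate(example_trials)
```

Plotting these percentages against gate length for each speaker would show how early in the segment the upcoming vowel becomes identifiable.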

voiced consonants, transitions of F2 have been found to be a relatively reliable indicator of place of articulation, ranging from increasing F2 trajectories for labial consonants to decreasing F2 trajectories for dorsal consonants (e.g., Kewley-Port, 1982; Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967). As such, F2 has been found in general to be more sensitive to coarticulation than F1 and F3 (Öhman, 1966).
With regard to stimuli and elicitation procedures, many studies have used schwa–CV(C) sequences. When interested in consonant production, the unspecified, neutral vowel limits systematic carryover coarticulation and schwa proves to be very sensitive to anticipatory coarticulation, making it a very suitable object of study itself (Nijland et al., 2002; Nittrouer, 1993). Corner vowels are often included in the assessment materials, as they are most distinctive within the F1–F2 space. When studying vowel-to-vowel coarticulation, consonant context is important to consider as recent results have suggested that deviant coarticulation in children with CAS compared to TD children might be limited to certain articulatory contexts (Terband, 2017).
A further consideration is that measuring formants in children can be difficult due to their relatively high fundamental frequencies, which generate widely spaced harmonics, leading to an undersampling of the vocal tract transfer function, and may cause first and second formants to blend (Lee, Potamianos, & Narayanan, 1999; Nijland et al., 2002; Story & Bunton, 2016). This has been found to be particularly problematic in earlier studies using speech processing programs with limited linear predictive coding and visualization capabilities (Bennett, 1981; Bickley, 1986; Nittrouer et al., 1989). Solutions to this measurement problem, while becoming less urgent with modern speech processing software, are still researched, for example, by extracting the spectral envelope through improved spectral filtering techniques (Story & Bunton, 2016).
Children with CAS might display reduced articulatory rate and reduced size or amplitude of articulatory movements, which may complicate interpretations of coarticulatory effects: Both reduced articulation rate and reduced speech movements may contribute to the appearance of reduced coarticulation. These factors require appropriate attention when designing and analyzing speech tasks employed to assess coarticulation in CAS (Hardcastle & Tjaden, 2008).
The three most prominent acoustic techniques to evaluate coarticulation are F2 ratios, first moment coefficients, and F2 locus equations. Since F2 ratios and first moment coefficients are usually reported side by side, these outcome measures will be discussed jointly, followed by a separate subsection on F2 locus equations.

F2 Ratios and First Moment Ratios
Coarticulation in children’s speech has mainly been quantified by using the center of gravity (also named spectral centroid or first moment of the spectral distribution) and fricative F2 frequencies as outcome measures (Nittrouer et al., 1989; see Table 11). Typically, stimuli with varying fricative spectral distributions and vowels with lip-spreading and lip-rounding features are used, for example, /sisi/, /ʃiʃi/, /susu/, and /ʃuʃu/. Coarticulation is usually quantified by calculating F2 ratios: dividing mean F2 values in /i/ utterances by mean F2 values in /u/ utterances averaged across a series of repetitions (see Table 11). The F2 ratios provide a measure to distinguish the utterances. High F2 ratios in the vowels indicate large distinctions between vowels, and the F2 ratios in the measurement points preceding the vowel reflect the coarticulation effect of the upcoming vowel (Nittrouer et al., 1989). It has been found, however, that centroids tend to be a relatively poor measure of fricative–vowel coarticulation but are rather a measure of anticipatory lip rounding (Nittrouer et al., 1989; Soli, 1981).
Despite the fact that lengthened and disrupted coarticulatory transitions have been identified as one of the main criteria in CAS, the literature on coarticulation is, as of yet, relatively modest in size, compared to the literature investigating coarticulation in neurotypical children and adults (Hardcastle & Tjaden, 2008). A number of studies have used acoustic measures of coarticulation in the assessment of children with CAS. As of yet, no coherent picture can be drawn with respect to coarticulatory behavior in CAS. Compared to their TD peers, children with CAS have been found to display earlier and stronger anticipatory



Table 11. Methodological details: first moment ratio/F2 ratio (Maas & Mailend, 2017; Nijland et al., 2002; Nittrouer et al., 1989).

Materials and methods

(1) Stimuli or targets being analyzed Eight repetitions of four reduplicated syllables (/CVCV/) consisting of a fricative (/s/ or /ʃ/)
followed by a vowel context (/i/ or /u/; Nittrouer et al., 1989)
Six repetitions of 12 /dəˈCV/ syllables consisting of an initial consonant (/b/, /d/, /s/, and /x/) followed
by three final vowel contexts (/i, a, u/; Nijland et al., 2002)
Six repetitions of 12 /CVb/ syllables consisting of an initial fricative (/s, z, ʃ/) followed by three
final vowel contexts (/i, ɑ, u/; Maas & Mailend, 2017)
(2) Tasks used to elicit those targets Imitation (model produced by experimenter)
Accompanied by a picture (Nittrouer et al., 1989)
(3) Conditions in which responses are elicited Quiet, no time pressure
Items in isolation (Nittrouer et al., 1989)
Items in carrier phrase (“Hé /dəˈCV/ weer” [he…wIːr], “hey…again”; Nijland et al., 2002)
Items in carrier phrase (“It’s the /CVb/ again”; Maas & Mailend, 2017)
(4) The measures obtained from those responses Ratio of F2 frequencies in different vowel contexts (Nittrouer et al., 1989)
Ratio of F2 frequencies in different vowel contexts at /ə/ midpoint, /ə/ end, C onset, CV
transition onset, CV transition end, and V midpoint (Nijland et al., 2002)
Ratio of first spectral moment (Maas & Mailend, 2017)
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? Validity: No
Reliability: Interinvestigator differences in segmentation: 12.2 ms; correlation between
segmentation markers: r > .78 (Nijland et al., 2002)
Interinvestigator differences in segmentation: 1.2 ms (onset) and 1.4 ms (offset); correlation
between segmentation markers: r > .99 (Maas & Mailend, 2017)
Validity and reliability of F2 values by a postprocessing procedure of outlier removal
(Nijland et al., 2002)
(7) Norm or reference data available? Reference data: F2 frequencies, fricative ratios, and vowel context ratios for /si/, /ʃi/, /su/, and
/ʃu/ are reported for eight participants per age group for adults (four males, four females)
and 3-, 4-, 5-, and 7-year-old TD children (Nittrouer et al., 1989)
Mean midpoints and width of ranges of F1 and F2 and variability of F2 of schwa and vowels
for children with CAS, TD children, and adult females are reported (Nijland et al., 2002)
F ratios and V ratios for /si/, /ʃi/, /su/, and /ʃu/ are reported for adults, TD children, and children
with SSD (Maas & Mailend, 2017)

Note. TD = typically developing; CAS = childhood apraxia of speech; SSD = speech sound disorder.
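
The ratio measures in this table can be sketched in a few lines. The example below assumes that F2 values (in Hz) have already been measured at one fixed measurement point for each repetition, and that a power spectrum of the frication noise is available as paired frequency and power values. The function names and all numbers are hypothetical; they do not reproduce the procedures or data of the cited studies.

```python
from statistics import mean


def f2_ratio(f2_i_context_hz, f2_u_context_hz):
    """Mean F2 in the /i/ context divided by mean F2 in the /u/ context."""
    return mean(f2_i_context_hz) / mean(f2_u_context_hz)


def spectral_centroid(frequencies_hz, power):
    """First spectral moment: power-weighted mean frequency of a spectrum."""
    total_power = sum(power)
    if total_power == 0:
        raise ValueError("the spectrum contains no energy")
    return sum(f * p for f, p in zip(frequencies_hz, power)) / total_power


# Hypothetical F2 values at one measurement point across repetitions.
ratio = f2_ratio([2000.0, 2200.0], [1000.0, 1100.0])

# Hypothetical three-bin power spectrum of a fricative.
centroid = spectral_centroid([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0])
```

A ratio of 1 would mean that the two vowel contexts cannot be told apart at that measurement point; the further the ratio lies above 1, the larger the influence of the vowel context. A first moment ratio can be formed in the same way by dividing the centroids obtained in the two contexts.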

coarticulatory vowel effects during a preceding consonant (Maassen et al., 2001), display higher variability in the amount of coarticulation, and display reduced distinctions between different vowels (Nijland et al., 2002). Findings of reduced contrasts have been reproduced when studying fricative productions in children with SSD, independent of SSD subtype (Maas & Mailend, 2017). Abnormal (greater and reduced) coarticulation was observed only in children diagnosed with CAS (Maas & Mailend, 2017).

Locus Equation Metric
The locus equation metric was originally conceived by Lindblom (1963), as cited in Sussman, McCaffrey, and Matthews (1991), in the search for an invariant cue of place of articulation in stop consonants, independent of vowel context (Sussman et al., 1991; see Table 12). While initially based on voiced stops, it has been found to be an effective descriptor of place of articulation for consonants with other manners of articulation as well (Fowler, 1994; Sussman, 1994; Sussman & Shore, 1996; but see also Brancazio & Fowler, 1998) and has been shown to be stable across languages (Krull, 1988; Sussman, Hoemeke, & Ahmed, 1993). Furthermore, the measure has been shown to work in adults and in children as young as 1.5 years old (Chang, Ohde, & Conture, 2002; Gibson & Ohde, 2007; Sussman, Hoemeke, & McCaffrey, 1992; Sussman et al., 1996).
Locus equations are based on the correlation between the values of F2 at vowel onset and vowel midpoint in CV sequences for a given consonant across vowel contexts. Lindblom (1963) found that the relationship between F2 at onset and F2 midvowel can be described by a linear regression equation: F2_onset = k × F2_midpoint + c, where k is the slope of the regression line and c is the y intercept (the value where the regression line crosses the y-axis at x = 0; Lindblom, 1963, as cited in Sussman et al., 1991). Regression slope and y intercept can then be used to quantify anticipatory coarticulation in CV utterances where a steeper slope (i.e., a larger value of k) and a lower y intercept (a smaller value of c) indicate more coarticulation (Krull, 1989). In general, regression slope and y-intercept values show a strong correlation. Alveolar and dental productions, for example, typically feature shallower slopes and higher y intercepts, while bilabials typically feature steeper slopes and lower y intercepts. Approximants, however, form an exception and typically feature slopes near zero with varying F2 onset loci exclusively described by varying y intercepts (Sussman, 1994; Sussman & Shore, 1996).

Although locus equations have only been used sparsely in children with CAS and children with speech disorders in general, they show great potential. Using locus equations, Sussman and colleagues demonstrated decreased differentiation of stop place of articulation as well as a pattern of decreased and less stable coarticulation across stop consonants in five children with CAS compared to children with typical development (Sussman et al., 2000), while Chang et al. demonstrated that children who stutter do not differ from their TD peers in terms of degree of coarticulation (Chang et al., 2002).
Reliable locus equations can be obtained using several tasks and stimuli. The elicitation method used to obtain responses appears to have little effect on locus equations (Chang et al., 2002; Gibson & Ohde, 2007; Sussman et al., 1992). The original study of Sussman et al. (1992) used an imitation task with the stimuli embedded in the carrier phrase “It’s a /CVt/ again,” while Chang et al. (2002) successfully used a picture-naming task and Gibson and Ohde (2007) used spontaneous elicitation and imitation during free-play and child-centered activities with toys and pictures in their study with toddlers from 1.5 years old. While the elicitation method is somewhat flexible, a crucial requirement is that the stimuli should contain enough variation in vowel context. It is not clear what constitutes the exact minimum number of vowels needed to reliably calculate locus equations. However, Nijland et al. (2002) reported that using only the three corner vowels /i, a, u/ did not result in reliable slope calculation. It is therefore advised to obtain minimally three repetitions of six dissimilar vowels, as described by Sussman et al. (see Table 12).

Articulatory Measures
Background
With respect to coarticulation, articulatory analyses would have value for understanding CAS as a motor speech disorder but, to date, have not been applied in this population. A wide variety of techniques are available that have been used for tracking speech movements and articulatory positioning in children. Similar to articulatory measures of inconsistency, techniques include EMA and optical motion capture systems. Since the technical background, general procedures, and methodological considerations regarding these techniques have been described in the Background on Kinematic Variability and Kinematic Spatiotemporal Variability Indices sections, we will only highlight additional aspects that are specific to studying coarticulation. In addition, electropalatography (EPG; Timmins, Hardcastle, McCann, Wood, & Wishart, 2008) and ultrasound imaging systems (e.g., Noiray et al., 2018; Song et al., 2013; Zharkova et al., 2011, 2012) have been used to assess coarticulation in children. EPG utilizes an individually tailor-made artificial palate, placed inside the mouth against the speaker’s hard palate, containing electrodes that record timing, location, and (in modern systems) pressure of lingual contact. As such, EPG can be used to measure spatiotemporal aspects of tongue–palate constrictions but does not track articulatory movements. Conversely, ultrasound can be used to track tongue movements but is less suitable to visualize and quantify lingual constrictions. With ultrasound, a head-mounted transducer is placed tightly under the chin. The transducer emits high-frequency sound waves and records their echo as the sound waves are reflected by bodily fluids and soft tissue, such as the lingual musculature. Ultrasound is gaining popularity quickly due to its relatively low cost and limited invasiveness. Although ultrasound records full-tongue contours, its time resolution is limited in comparison with EPG and EMA systems. Other imaging techniques include X-ray microbeam and magnetic resonance imaging, but these are generally not considered suitable for children. The operational principles of the EPG and ultrasound systems have been elaborated elsewhere and are beyond the scope of this review (e.g., Cleland, McCron, & Scobbie, 2013; Gibbon & Lee, 2007; Zharkova, 2013).
Articulatory measures of coarticulation basically comprise two approaches and focus either on articulatory timing or on articulatory positioning. Articulatory timing measures assess the temporal coordination between speech movements and operationalize coarticulation as the overlap in time between the realization of consecutive

Table 12. Methodological details: locus equation metric (Sussman et al., 1992).

Materials and methods

(1) Stimuli or targets being analyzed Three to six repetitions of 18 /CVt/ syllables consisting of an initial stop (/b/, /d/, and /g/) in
six vowel contexts (/i/, /I/, /æ/, /a/, /ʌ/, and /u/)
(2) Tasks used to elicit those targets Imitation (model produced by experimenter)
(3) Conditions in which responses are elicited Quiet, no time pressure; items in carrier phrase (“It’s a /CVt/ again”; Sussman et al., 1992)
(4) The measures obtained from those responses Regression slope and y intercept of the linear relationship between the frequencies F2 at
onset and F2 at vowel midpoint
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? Validity: No
Reliability: Interinvestigator differences in F2 frequencies: 97.2 Hz; correlation between
F2 measurements: r > .95
(7) Norm or reference data available? No
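
The regression slope and y intercept under (4) can be obtained with an ordinary least squares fit of F2 at onset on F2 at vowel midpoint, pooled across vowel contexts for one consonant. The sketch below assumes that the F2 values (in Hz) have already been measured; the function name and the example values are hypothetical and chosen to lie exactly on a line, so they are not data from Sussman et al. (1992).

```python
def locus_equation(f2_midpoint_hz, f2_onset_hz):
    """Least squares fit of F2 onset = k * F2 midpoint + c; returns (k, c)."""
    n = len(f2_midpoint_hz)
    if n != len(f2_onset_hz):
        raise ValueError("both lists must have the same length")
    if n < 2:
        raise ValueError("at least two tokens are needed")
    mean_mid = sum(f2_midpoint_hz) / n
    mean_onset = sum(f2_onset_hz) / n
    sxx = sum((m - mean_mid) ** 2 for m in f2_midpoint_hz)
    if sxx == 0:
        raise ValueError("F2 midpoint values must vary across vowel contexts")
    sxy = sum(
        (m - mean_mid) * (o - mean_onset)
        for m, o in zip(f2_midpoint_hz, f2_onset_hz)
    )
    slope = sxy / sxx
    intercept = mean_onset - slope * mean_mid
    return slope, intercept


# Hypothetical tokens of one stop consonant in four vowel contexts.
k, c = locus_equation(
    [1000.0, 1500.0, 2000.0, 2500.0],
    [1300.0, 1550.0, 1800.0, 2050.0],
)
```

The check on the spread of the midpoint values reflects the requirement that the stimuli contain enough variation in vowel context: without it, the slope cannot be estimated.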



articulatory movements. The amount of overlap can be calculated based on either offset and onset or midpoints of either movements or target configurations, depending on the sequence of speech sounds involved and the specific research question. In principle, these overlap measures require simultaneous tracking of movements or constrictions made by different articulators. Lingual coarticulation can therefore only be measured with EMA, ultrasound, or EPG. In addition to EMA, however, anticipatory or carryover effects of lip rounding can be assessed using an optical motion capture system when combined with onset or offset timing of the coarticulatory context segment based on acoustics.
Measures of articulatory positioning, on the other hand, assess differences in realized place of articulation across different contexts, similar to the acoustic measures of formant ratios and locus equations described above. A strong advantage of articulatory over acoustic measures of coarticulation is that they are based on direct information of the positioning of the articulators. Since the articulatory positioning does not need to be derived from the acoustic signal, coarticulatory influences on movement targets and trajectories can be investigated even if they do not manifest themselves acoustically. Where early kinematic studies relied on visual inspection of movement trajectories across stimuli (e.g., Katz & Bharadwaj, 2001; Katz, Machetanz, Orth, & Schönle, 1990), quantitative measures have been developed in recent years, and a variety of different techniques have been used to this end. Coarticulatory context segments do not necessarily need to be assessed kinematically, and the quality of the realized segments could be verified perceptually or acoustically (e.g., Weismer, Yunusova, & Westbury, 2003).
Over the years, a large set of movement-based measures of coarticulation has been reported in the literature, often conceptually similar but adapted or adjusted based on technological progress in data collection and processing. In this tutorial, we will focus on the more recent versions of measures that can be used with the modern systems.

Temporal Gestural Overlap
The first studies to investigate coarticulation through measures of articulatory timing used EPG (e.g., Butcher, 1989; Butcher & Weiher, 1976; Hardcastle, 1985), later followed by EMA (Katz et al., 1990; see Table 13). These early studies relied on a comparison of the timing of articulatory movements in different contexts but did not quantify temporal overlap as such. Measures of overlap in time between speech movements were developed a decade later, driven by the need to normalize for durational differences (Gibbon, Hardcastle, & Nicolaidis, 1993) and as a part of the different, more general endeavor to investigate coordination of speech movements in the theoretical framework of articulatory phonology (inter- and intragestural coordination; e.g., Chitoran, Goldstein, & Byrd, 2002; Kühnert, Hoole, & Mooshammer, 2006).
Most studies investigating temporal overlap have focused on consonant–consonant sequences. Using EPG, Gibbon et al. (1993) formulated the overlap index as the overlap between the articulatory constrictions of consecutive consonants relative to the duration of the first consonant. As a formula: [(approach closure C1 − approach closure C2) / (approach closure C1 − release closure C1)] × 100. Herein, approach and release are defined as the starting points of the first palatal contact and the release of full closure for the consonant constriction, respectively. A value of < 100 indicates that the articulation of the two segments overlaps (the lower the value, the stronger the coarticulation), while a value of > 100 indicates a gap between the two segments. Using EMA, Kühnert et al. (2006) calculated the measure of overlap between articulatory constrictions with a different formula as [(Offset C1 − Onset C2) / (Offset C2 − Onset C1)] × 100. Positive values indicate overlap between the two segments, while negative values indicate a lag. Onset and offset are defined as the start- and endpoints of the movement plateau, meaning the start and end of the full constrictions, and thus correspond to “approach” and “release” in the EPG formula. The important difference between these two ways of calculating the overlap/lag between two consecutive segments is that, in Gibbon and colleagues, lag is relative to the duration of C1, whereas lag is relative to the duration of C1 + C2 in the method by Kühnert and colleagues. We believe the latter is to be preferred since it is less sensitive to durational adjustments of individual segments. In principle, the same measure could be used for studying anticipatory coarticulation in CV or VC sequences by substituting C1 or C2 constriction offset and onset with the offset and onset of the movement plateau of the vowel. However, precisely establishing the points of movement plateau or constriction offset and onset is much more difficult in vowels than in consonants. The technical details go beyond the scope of this tutorial, but it should be noted that validity and reliability of the measurement procedures have yet to be established in large sample studies.
Regarding stimuli, many studies have focused on initial /kl/ sequences in real words. In principle, however, any heterorganic sequence of articulatory movements is possible (if EPG is used, with the obvious limitation that the speech sounds must involve lingual–palatal contact). Regarding task and elicitation procedures, studies with children have used real words preceded by an indefinite article repeated from a wordlist read by the experimenter at a habitual rate (Timmins et al., 2008). The target words were mixed with filler items, and the whole list was repeated 10 times (see Table 13). The recorded words were subject to a qualitative phonological analysis, and incorrect productions in which not all segments were realized were removed from the analysis.

Articulatory Positioning: Anticipatory Lip Rounding
Labial anticipatory coarticulation has mainly been investigated in adults in the context of theories to account for cross-linguistic differences in anticipatory rounding behavior but has also been successfully assessed in 3.5- to 8-year-old children (Noiray, Cathiard, Abry, & Ménard, 2010; Noiray,

Table 13. Methodological details: temporal gestural overlap/lag (Kühnert et al., 2006; Timmins et al., 2008).

Materials and methods

(1) Stimuli or targets being analyzed: /CC…/ sequences comprising real words (e.g., “clock”)
(2) Tasks used to elicit those targets: Read at self-chosen, habitual rate; 10 iterations of each target word
(3) Conditions in which responses are elicited: Quiet, no time pressure; items in carrier phrase (“I see…”; Kühnert et al., 2006) or preceded by an article (“a…”; Timmins et al., 2008)
(4) The measures obtained from those responses: Gestural overlap/lag = [(t_offsetC1 − t_onsetC2) / (t_offsetC2 − t_onsetC1)] × 100 (Kühnert et al., 2006)
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? No
(7) Norm or reference data available? No
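
As a minimal illustration of measure (4) in Table 13, the sketch below computes the overlap percentage from four time stamps. It assumes the reading of the formula in which the difference between C1 offset and C2 onset is divided by the span from C1 onset to C2 offset. The function name, argument names, and example time stamps are hypothetical and are not taken from Kühnert et al. (2006).

```python
def gestural_overlap_percent(c1_onset, c1_offset, c2_onset, c2_offset):
    """Overlap (positive) or lag (negative) between two consonant gestures, in percent.

    All four time stamps must be expressed in the same unit.
    """
    total_span = c2_offset - c1_onset
    if total_span <= 0:
        raise ValueError("C2 offset must follow C1 onset")
    return (c1_offset - c2_onset) / total_span * 100.0


# Hypothetical time stamps in seconds: C2 starts before C1 ends (overlap).
overlap = gestural_overlap_percent(0.10, 0.22, 0.18, 0.30)
# Hypothetical time stamps in seconds: C2 starts after C1 ends (lag, negative value).
lag = gestural_overlap_percent(0.10, 0.20, 0.24, 0.30)
```

Because the value is expressed relative to the total duration of the consonant sequence, it can be compared across tokens that differ in overall duration.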

Ménard, Cathiard, Abry, & Savariaux, 2004; see Table 14). Two parameters indicative of lip rounding have been investigated in this respect, lip protrusion and lip constriction, of which the latter has been consistently shown to be more reliable (Noiray et al., 2010; Noiray, Cathiard, Ménard, & Abry, 2011; Ménard, Cathiard, Dupont, & Tiede, 2013). Where earlier studies used a combination of three-dimensional optical (infrared light) and video recordings, later studies rely on video-based registration only (e.g., Ménard et al., 2013).

In this technique, lip constriction is measured as between-lips area based on the labial contours. Speakers’ lips are marked with a blue lipstick to maximize visual contrast, and purpose-designed video analysis software automatically tracks and processes labial shapes. The time resolution depends on the camera, but the software doubles the frame rate of the camera (which means that, with modern ordinary equipment, rates of ≥ 60 Hz are easily attainable). The operationalization of anticipatory coarticulation is strongly intertwined with the stimuli, consisting of V1CnV2 sequences (/i/Cn/y/ or /i/Cn/u/), in which Cn varied from zero to three consonants. In these sequences, anticipatory vowel behavior is assessed through the relation between the total duration of the rounding gesture in the final vowel and the duration of the obstruence interval or, in other words, is measured by how early in the utterance lip rounding starts. The duration of the constriction gesture is based on the video data, with the onset marked by the point at which lip area shows a 10% decrease following the maximum area and the offset by the point of a 10% increase following the minimum lip area. The duration of the obstruence interval is based on the acoustic signal with V1 offset and V2 offset determined from the spectrogram.

The stimuli used by Noiray et al. (2004, 2010) contain intervocalic consonant sequences of increasing length, which was specific to their study testing theoretical hypotheses about the temporal expansion of lip rounding as a function of intervocalic obstruence interval duration (see Table 14). For children with CAS, some of these complex intervocalic consonant clusters might be (too) difficult to produce. In principle, however, any intervocalic consonant sequence could be used, as long as the consonants are phonologically neutral with respect to rounding and the clusters are phonologically legal in the testing language.

Articulatory Positioning: Mean Distance Across Set/Context
A first type of measure of coarticulatory influences on articulatory positioning is the absolute distance between the position of an articulator during the production of a speech sound in different contexts and has been mainly used to investigate lingual coarticulation (see Table 15). Distance measures can be based on tongue contour as a whole or on

Table 14. Methodological details: anticipatory lip rounding (Noiray et al., 2004, 2010).

Materials and methods

(1) Stimuli or targets being analyzed: 10–12 repetitions of /iCny/ sequences in which Cn varied from zero to three intervocalic consonants (in French forming the names [iy], [isy], [iky], [iksy], [ikry], [itkry], [iskry], and [iksty])
(2) Tasks used to elicit those targets: Imitation, prompted by the experimenter
(3) Conditions in which responses are elicited: Quiet, no time pressure; items embedded in carrier sentences
(4) The measures obtained from those responses: Anticipatory lip rounding = total duration of the rounding gesture in the final vowel / duration of the obstruence interval
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? No
(7) Norm or reference data available? No
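
The 10% criteria for the constriction gesture and the ratio in measure (4) of Table 14 can be sketched as follows. This is an illustration under stated assumptions, not the purpose-designed software used by Noiray et al.: the 10% criterion is taken here relative to the range between the maximum lip area and the following minimum (the text does not specify the reference value), and all function names and the example lip area track are hypothetical.

```python
import numpy as np


def constriction_gesture_bounds(times, lip_area, fraction=0.10):
    """Return (onset, offset) times of a lip constriction gesture.

    Assumption: the 10% criterion is relative to the range between the
    maximum lip area and the minimum that follows it.
    """
    times = np.asarray(times, dtype=float)
    lip_area = np.asarray(lip_area, dtype=float)
    i_max = int(np.argmax(lip_area))
    i_min = i_max + int(np.argmin(lip_area[i_max:]))
    area_range = lip_area[i_max] - lip_area[i_min]
    # Onset: first frame after the maximum where the area has dropped by 10% of the range.
    falling = np.nonzero(
        lip_area[i_max:i_min + 1] <= lip_area[i_max] - fraction * area_range
    )[0]
    # Offset: first frame after the minimum where the area has risen by 10% of the range.
    rising = np.nonzero(lip_area[i_min:] >= lip_area[i_min] + fraction * area_range)[0]
    if area_range <= 0 or falling.size == 0 or rising.size == 0:
        raise ValueError("no complete constriction gesture found")
    return times[i_max + falling[0]], times[i_min + rising[0]]


def anticipatory_rounding_ratio(gesture_onset, gesture_offset, obstruence_duration):
    """Duration of the rounding gesture divided by the obstruence interval duration."""
    return (gesture_offset - gesture_onset) / obstruence_duration


# Hypothetical lip area track sampled every 20 ms (arbitrary area units).
times_ms = np.arange(11) * 20.0
area = [10, 10, 9, 7, 4, 2, 0, 0.5, 2, 5, 8]
onset, offset = constriction_gesture_bounds(times_ms, area)
# Hypothetical obstruence interval of 100 ms measured from the acoustic signal.
ratio = anticipatory_rounding_ratio(onset, offset, obstruence_duration=100.0)
```

A ratio above 1 in this sketch means that the constriction gesture lasts longer than the obstruence interval.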

3016 Journal of Speech, Language, and Hearing Research • Vol. 62 • 2999–3032 • August 2019

Table 15. Methodological details: mean distance across set/context (Kim et al., 2018; Zharkova et al., 2011, 2012).

Materials and methods

(1) Stimuli or targets being analyzed: 10 repetitions of CV syllables consisting of a fricative (/s/ or /ʃ/) followed by a vowel context (/i/, /a/, and /u/; Zharkova et al., 2011, 2012); five repetitions of /ə/ preceded or followed by four CVC words consisting of two voiced alveolar plosives /d/ with the vowels (/i/, /u/, /ɑ/, and /æ/; Kim et al., 2018)
(2) Tasks used to elicit those targets: Reading/picture naming (text + image on screen; Kim et al., 2018; Zharkova et al., 2011, 2012)
(3) Conditions in which responses are elicited: Quiet, no time pressure; items in carrier phrase (“It’s a…Pam”: Zharkova et al., 2011, 2012; “Get…a puppy” and “Put a…here”: Kim et al., 2018)
(4) The measures obtained from those responses: Mean across set/context distance (in millimeters) of tongue contours (Zharkova et al., 2011, 2012) or tongue body position (Kim et al., 2018)
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? No
(7) Norm or reference data available? Reference data: mean across set/context distance of tongue contours for /si/, /su/, /sa/ and /ʃi/, /ʃu/, /ʃa/ are reported for 10 participants per age group for adults (gender not reported) and 6- to 10-year-old TD children (Zharkova et al., 2011, 2012)

Note. CV = consonant–vowel; TD = typically developing.
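
The two distance measures summarized in Table 15 can be sketched as follows. The first function computes a mean nearest neighbor distance between two tongue contours in one direction only (from the first curve to the second); whether the published analyses also average over the reverse direction is not specified here, so this is an assumption. The second function returns the distance of each repetition from the speaker-specific mean position, in the spirit of the per-utterance measure used by Kim et al. (2018). Function names and coordinates are hypothetical.

```python
import numpy as np


def mean_nearest_neighbor_distance(curve_a, curve_b):
    """Mean Euclidean distance from each point of curve_a to its nearest point on curve_b.

    Each curve is an array of shape (n_points, 2) with x/y coordinates in millimeters.
    """
    a = np.asarray(curve_a, dtype=float)
    b = np.asarray(curve_b, dtype=float)
    # Pairwise distances between all points, shape (len(a), len(b)).
    pairwise = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return float(pairwise.min(axis=1).mean())


def distance_from_speaker_mean(positions):
    """Distance of each repetition from the speaker-specific mean position.

    positions has shape (n_repetitions, n_dimensions); one value is returned
    per repetition.
    """
    p = np.asarray(positions, dtype=float)
    return np.linalg.norm(p - p.mean(axis=0), axis=1)


# Hypothetical tongue contours for one fricative in two vowel contexts (mm).
contour_i = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
contour_a = [(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)]
context_distance = mean_nearest_neighbor_distance(contour_i, contour_a)

# Hypothetical tongue body positions in three repetitions of one utterance type (mm).
per_utterance = distance_from_speaker_mean([(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)])
```

The same functions accept single points (for example, one EMA coil position per context) when each input has shape (1, 2).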

specific parts of the tongue (e.g., flesh-point markers on tongue tip, body, and dorsum [EMA], highest point in tongue body [ultrasound], center point of contact [EPG]). Using ultrasound tongue imaging, Zharkova et al. (2011, 2012) quantified coarticulation as the mean nearest neighbor distance between tongue curves at midpoint of the production of the initial fricatives /s/ and /ʃ/ in two vowel contexts, calculated as the Euclidean distance from each point in one curve to the nearest point in the second, comparison curve. Coarticulatory distance between single points instead of contours, such as EMA coil position or EPG center point of contact, could be calculated in the same way.

A similar but slightly different approach was used by Kim, Coalson, and Berry (2018) in investigating articulatory measures of anticipatory and carryover lingual coarticulation in (/ə/)CVC(/ə/) sequences with EMA (see Table 15). Instead of comparing tongue position in two contexts, they compared each /ə/ production with the speaker-specific average over all repetitions at the temporal midpoint. The advantage of this approach is that it generates a data point for each utterance individually instead of each context pair and thus provides a context-independent measure of coarticulation. Coarticulation was measured at two positions in /ə/, at /ə/ midpoint and at /ə/ boundary, defined as onset (anticipatory) or offset (carryover) of /ə/, which were acoustically identified as the first or last glottal pulse. The two yielded the same pattern of results, although a direct comparison of the two versions of the measure in terms of sensitivity was not possible due to the small sample size (N = 7 female adult speakers; Kim et al., 2018).

Articulatory Positioning: Tongue Shape Ratio
Instead of a distance measure based on tongue contours or flesh points, Zharkova, Gibbon, and Hardcastle (2015) quantified coarticulation as the vowel context ratios of five different measures of tongue shape (curvature degree, curvature position, Dorsum Excursion Index, Tongue Constraint Position Index, and LOCa-i, a tongue bunch location index, which is further explained below; see also Ménard, Aubin, Thibeault, & Richard, 2012; Zharkova, 2013; see Table 16). The main purpose of their study was to compare ultrasound data collection with and without head stabilization (i.e., the ultrasound scanner mounted on a headset or handheld). The results indicate that the tongue shape measure LOCa-i is the most robust, as it was the only measure that was not affected by the absence of stabilization. LOCa-i captures the extent of tongue front and tongue back excursion and is calculated as the ratio of tongue height at 1/3 and 2/3 of the length of the tongue curve (measured from the tip). Higher values correspond to a more /i/-like tongue shape, and lower values correspond to a more /a/-like tongue shape (Zharkova et al., 2015).

The LOCa-i tongue shape ratio measure can be seen as the articulatory equivalent of acoustic F2/second moment ratios (see the F2 ratios and First Moment Ratios section) and is suitable for consonant–vowel (CV) or vowel-to-vowel (əCV) anticipatory coarticulation, albeit specifically designed for /i/ and /a/ vowel contexts. Task and elicitation procedures are similar to the mean distance across set/context measure (Zharkova et al., 2011, 2012; see Table 16).

With respect to the comparison between head-mounted and handheld ultrasound recording, the results from Zharkova et al. (2015) indicated that it was possible to collect reliable data without head mount in adolescents (N = 10; 13-year-olds). As the authors note, however, this might not hold for younger children. Until handheld recording has been conclusively shown to be reliable, it is advised to collect data with head stabilization when investigating coarticulation in younger children.

Articulatory Positioning: Coarticulation Degree
Another measure of coarticulation that has been used in recent ultrasound studies with children is coarticulation

Table 16. Methodological details: tongue shape ratio (LOCa-i; Zharkova et al., 2015).

Materials and methods

(1) Stimuli or targets being analyzed: Six repetitions of CV syllables consisting of a consonant (/p/, /t/, /s/, and /ʃ/) followed by a vowel context (/i/ and /a/)
(2) Tasks used to elicit those targets: Reading/picture naming (text + image on screen)
(3) Conditions in which responses are elicited: Quiet, no time pressure; items in carrier phrase (“It’s a…Pam”)
(4) The measures obtained from those responses: i/a ratio on the LOCa-i measure of tongue shape
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? No
(7) Norm or reference data available? No

Note. CV = consonant–vowel.
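
A rough sketch of the i/a ratio in measure (4) of Table 16 is given below. It rests on several assumptions that are not specified in the text: the contour is ordered from tongue tip to tongue root, "height" is the y coordinate above a baseline chosen by the analyst, the 1/3 and 2/3 positions are taken along the arc length of the contour, and the height at 1/3 is the numerator (chosen so that higher values correspond to a more /i/-like shape). The function names and contours are hypothetical; Zharkova et al. (2015) should be consulted for the actual definition.

```python
import numpy as np


def height_at_fraction(curve, fraction):
    """Interpolated height (y) at a given fraction of the arc length from the tip.

    curve: array of shape (n_points, 2), ordered from tongue tip to tongue root,
    with y measured upward from an analyst-chosen baseline (assumption).
    """
    pts = np.asarray(curve, dtype=float)
    segment_lengths = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    arc = np.concatenate(([0.0], np.cumsum(segment_lengths)))
    return float(np.interp(fraction * arc[-1], arc, pts[:, 1]))


def loc_ai(curve):
    """Height at 1/3 of the contour divided by height at 2/3 (assumed orientation)."""
    return height_at_fraction(curve, 1 / 3) / height_at_fraction(curve, 2 / 3)


def vowel_context_ratio(curve_in_i_context, curve_in_a_context):
    """i/a ratio of the tongue shape measure for one consonant."""
    return loc_ai(curve_in_i_context) / loc_ai(curve_in_a_context)


# Hypothetical consonant contours (mm): front-raised in /i/ context, back-raised in /a/ context.
consonant_before_i = [(0.0, 6.0), (3.0, 6.0), (6.0, 2.0)]
consonant_before_a = [(0.0, 2.0), (3.0, 6.0), (6.0, 6.0)]
ia_ratio = vowel_context_ratio(consonant_before_i, consonant_before_a)
```

Under these assumptions, an i/a ratio further from 1 indicates that the consonant's tongue shape differs more between the two vowel contexts.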

degree (Noiray et al., 2018; Rubertus & Noiray, 2018), which can be seen as the articulatory variant of the locus equations metric (see the Locus Equation Metric section; see Table 17). Similarly, coarticulation degree captures whether the positioning of an articulator during the production of a speech sound varies systematically depending on its position in the vowel context by means of a regression analysis. Unlike the acoustics-based equivalent, however, the articulatory measure was used not only for consonant–vowel (CV) anticipatory (Noiray et al., 2018) but also for vowel-to-vowel (VCə) carryover coarticulation (Rubertus & Noiray, 2018). Articulatory positioning was based on the highest point of the tongue body (horizontally) at the (acoustically determined) temporal midpoint of the segments of interest. Specifically, they measured whether tongue body height in the consonant and /ə/ varied systematically depending on the vowel by regressing the horizontal position of the highest point of the tongue body at C and V midpoint and /ə/ and V midpoint, respectively (see Table 17). Differences in coarticulation degree were expressed in regression coefficients, where a larger value (i.e., a steeper slope) indicates more coarticulation.

Inappropriate Prosody, Especially in the Realization of Lexical or Phrasal Stress
Background
Prosody; Lexical and Phrasal Stress
Prosody is difficult to define and may encompass different aspects of speech for different researchers and clinicians. For present purposes, we will not discuss the many different views of prosody but instead attempt to delineate the aspects of prosody that have received attention in the literature on AOS. To help delineate this domain, we will follow Shriberg and Kent (2013) in using the term prosody to refer to suprasegmental aspects of the speech signal that affect the linguistic or communicative structure of an utterance, such as stress, intonation, and pauses (see also Gerken & McGregor, 1998). Excluded from this definition and discussion are paralinguistic, suprasegmental aspects of speech that primarily provide information about the speaker or the speaking context, such as voice quality and overall loudness.

It is important to recognize that there is significant cross-linguistic variation in prosodic structure and the

Table 17. Methodological details: coarticulation degree (Noiray et al., 2018; Rubertus & Noiray, 2018).

Materials and methods

(1) Stimuli or targets being analyzed: Six repetitions of C1VC2/ə/ pseudowords, V consisting of the tense long vowels /i:/, /y:/, /e:/, /u:/, and /o:/ and C consisting of /b/, /d/, /g/, and /z/ with C1V a fully crossed set and C2 different from C1
(2) Tasks used to elicit those targets: Imitation of prerecorded model
(3) Conditions in which responses are elicited: Quiet, no time pressure; items preceded by an article (“eine…”)
(4) The measures obtained from those responses: Coarticulation degree: mean within stimulus distance in tongue body position in V and C1 midpoint (Noiray et al., 2018) and V and /ə/ midpoint (Rubertus & Noiray, 2018)
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? No
(7) Norm or reference data available? Reference data: Graphic displays of regression slope estimates are available for vowel-to-/ə/ carryover coarticulation in consonant contexts /b/, /d/, and /g/ for 3-year-olds (n = 19), 4-year-olds (n = 14), 5-year-olds (n = 14), 7-year-olds (n = 15), and adults (n = 13; mean age = 23 years; seven females and six males) with typical development (Rubertus & Noiray, 2018)
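
The regression underlying coarticulation degree can be sketched as an ordinary least-squares fit across vowel contexts. The sketch assumes that the position in the consonant (or /ə/) is the dependent variable and the position in the vowel is the predictor, by analogy with locus equations; the function name and the position values are hypothetical, and the published analyses may use a different statistical model.

```python
import numpy as np


def coarticulation_degree(vowel_positions, segment_positions):
    """Slope of the regression of segment position on vowel position.

    Both inputs hold the horizontal position of the highest tongue body point
    (same unit), one value per token; a steeper slope indicates more coarticulation.
    """
    slope, _intercept = np.polyfit(vowel_positions, segment_positions, deg=1)
    return float(slope)


# Hypothetical horizontal positions (mm) at vowel midpoint for five vowel contexts.
vowel_x = [10.0, 20.0, 30.0, 40.0, 50.0]
# Hypothetical positions at consonant midpoint that follow the vowel half-way.
consonant_x = [10.0, 15.0, 20.0, 25.0, 30.0]
degree = coarticulation_degree(vowel_x, consonant_x)
```

A slope near 0 would indicate that the consonant position is unaffected by the vowel, and a slope near 1 that it varies fully with the vowel.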



developmental trajectory to acquire adultlike control of prosodic aspects of speech (e.g., Gerken & McGregor, 1998; Kehoe, 2001; Kehoe, Stoel-Gammon, & Buder, 1995). We focus here primarily on English, as most published research on CAS has involved English-speaking children, but the measures reviewed here are expected to be applicable, with appropriate modifications, to other languages as well. In addition, as the focus of the literature with respect to prosody in CAS has been primarily on lexical and phrasal stress, we restrict our discussion to these aspects here as well.

Stress in the linguistic sense refers to perceptual prominence of a syllable in a sequence of syllables and, as such, is a relative phenomenon (it makes little sense to talk of “stress” for monosyllabic utterances). Perceptual prominence of a syllable may involve manipulation of three main perceptual parameters, namely, length (physical attribute: duration), loudness (physical attribute: intensity), and pitch (physical attribute: fundamental frequency or F0). Syllables that are longer, louder, and higher in pitch (or involve a greater pitch change) are perceived as stressed compared to syllables that are shorter, less loud, and lower in pitch (or involve a smaller pitch change). Although these parameters can each signal stress more or less independently (e.g., Patel, 2003; Patel & Campellone, 2009), speakers typically manipulate these aspects in some coordinated fashion. Furthermore, articulatory aspects, such as vowel quality, can also affect the perception of stress (Velleman & Shriberg, 1999). In physical terms, production of prosody involves manipulation of the respiratory system, the phonatory system, and the supralaryngeal (articulatory) system and requires complex coordination across these systems (e.g., Goffman & Malin, 1999). As such, given the changes in vocal tract anatomy during childhood (e.g., Vorperian et al., 2009), it is to be expected that these complex coordination demands can pose difficulties for children with speech motor planning and/or programming impairments.

Lexical stress refers to the stress patterns of individual lexical items, regardless of their sentential context (e.g., potato has stress on the second syllable, pyramid has stress on the first syllable). Phrasal stress refers to stress patterns in larger, multiword utterances and may serve to highlight important information such as content words and new information (relative to function words or given [old] information) or to indicate a contrast with previous statements or information (e.g., KRAmer ate the soup [not Elaine]).

Syllables are grouped into higher level groupings called metrical feet, which constitute the domain of stress representation and which themselves are organized into superordinate structures, such as prosodic words (e.g., Kehoe, 2001). Metrical feet contain one or two syllables, and the two most common basic foot types consisting of two syllables are the trochee and the iamb. Trochaic feet have stress on the first syllable (e.g., mother, baby, wobble), and iambic feet have stress on the second syllable (e.g., balloon, hotel, forget). Stressed syllables are sometimes indicated with “S” (strong), and unstressed syllables are indicated with “w” (e.g., wobble = Sw and balloon = wS). The basic foot pattern in English and other Germanic languages such as Dutch and German is trochaic (Sw). Most two-syllable words are trochees. Sequences of more than two syllables typically consist of single-syllable feet and trochaic feet. Unfooted unstressed syllables (syllables that do not form part of a trochaic foot) are more vulnerable to omission than footed syllables, both in the course of development (e.g., banana [wSw] is more likely to be reduced to nana [Sw] than to bana [wS]) and in colloquial adult speech (e.g., opossum [wSw] is often reduced to possum [Sw], not oposs [wS]). A number of more specific explanations have been put forward to account for the patterns of syllable omission in children’s speech, but discussion of these proposals is beyond the scope of this tutorial (see, e.g., Gerken, 1996; Kehoe, 2001, for further discussion).

Prosody and Stress in Typical Development
Few normative data are available for prosodic development, but there appears to be general agreement that syllable omissions should be rare or infrequent by the age of 3 or 4 years (Gerken & McGregor, 1998; Kehoe, 2001), and stress errors are considered rare in typical development (Kehoe, 2001). It is important to keep in mind, however, that, in linguistically oriented accounts of prosody and its development, much research has relied on perceptual (transcription-based) methods (e.g., Gerken, 1996; Kehoe, 2000). Given well-known limitations of perceptually based measures (e.g., Goffman, Heisler, & Chakraborty, 2006; Maas & Mailend, 2012), normative suggestions based on such measures must be viewed with some caution. More sensitive measures such as acoustic or kinematic measures may reveal greater insight into prosodic abilities of children. For instance, using acoustic temporal measures, Carter and Gerken (2004) showed that 2-year-old children who omit syllables (perceptually) do mark the underlying presence of the omitted syllable by lengthening the preceding syllable. Conversely, Goffman (1999, 2004), using kinematic measures, showed that even 4- to 6-year-old children differ from adults in their differentiation between different rhythmic patterns.

Prosody and Stress in Children With CAS
Since the initial descriptions of AOS by Darley and his colleagues (e.g., Darley, Aronson, & Brown, 1975), abnormal prosody has remained a prominent and common feature associated with AOS in children and adults, in scientific investigations and in clinical practice (Ballard, Robin, McCabe, & McDonald, 2010; Caruso & Strand, 1999; Duffy, 2005; Forrest, 2003; Hall, 2000; McCabe, Rosenthal, & McLeod, 1998; Odell, McNeil, Rosenbek, & Hunter, 1991; Odell & Shriberg, 2001; Rosenbek & Wertz, 1972; Shriberg, Aram, & Kwiatkowski, 1997b, 1997c; Strand, McCauley, Weigand, Stoeckel, & Baas, 2013; Velleman & Shriberg, 1999; Wambaugh, Duffy, McNeil, Robin, & Rogers, 2006; Yoss & Darley, 1974). The CAS Technical Report lists the “realization of lexical or phrasal stress” (ASHA, 2007, p. 4) as the core observation of atypical prosody in CAS. More specifically, children with CAS produce less differentiation between

stressed and unstressed syllables (e.g., Munson, Bjorum, & Windsor, 2003; Shriberg et al., 1997b, 1997c, 2003), providing the listener with the impression of equalized stress across syllables or misplaced stress. Changing the prosodic structure of a word by omitting syllables has also been reported (e.g., Velleman & Shriberg, 1999). These differences in realization of lexical or phrasal stress are more readily observed in utterances with iambic feet (Munson et al., 2003; Nijland, Maassen, Van der Meulen, Gabreëls, et al., 2003), the less common grouping of stressed and unstressed syllables in Germanic languages.

Less accurate stress production in children with CAS compared to children without communication impairments and children with other types of communication impairments (e.g., speech delay and phonological impairments) is a consistent finding in the literature, although the observed differences have not always reached statistical significance (e.g., Munson et al., 2003) nor do all children with CAS show abnormal stress production (e.g., Shriberg et al., 1997c). Furthermore, in studies where many different speech and language variables were collected and analyzed as potential indicators for CAS diagnosis, perceived errors in stress marking were the most accurate predictors of expert judgments of CAS diagnosis (prediction accuracy of up to 80%; Murray et al., 2015) and the most successful basis for discriminating children with suspected CAS from children with speech delay (Shriberg et al., 1997b). Nevertheless, the fact that not all children with CAS show abnormal stress production and group differences do not always reach statistical significance suggests that there may be subtypes of CAS (cf. Shriberg et al., 1997c) and/or that perceptual measures of stress production may not be sufficiently robust or sensitive. It is also relevant to note that some studies report relatively poor correspondence between perceptual and acoustic measures of stress production (e.g., Munson et al., 2003).

Perceptual Measures
Percentage Correct Stress
Several studies have examined the adequacy of lexical stress as perceived by listeners to establish a diagnostic basis for identifying children with CAS (see Table 18). Since stress (i.e., a stressed syllable) is a relative construct only identified in the context of nonstressed syllables, the stimuli to elicit a spoken response must include minimally two syllables. These syllables, in turn, may be embedded into longer utterances, such as multisyllabic words, carrier phrases, or sentences. It is important to include both trochaic and iambic feet in the stimuli, because atypical stress production may be more evident in iambic (non)words (e.g., Munson et al., 2003), and including only trochees may therefore not provide an opportunity to observe differences. At the same time, if children also differ in the production of the trochaic stress pattern, it may be an indication of a more severe impairment.

The choice of stimuli partially dictates the task that can be used to elicit the responses, with more numerous options for word stimuli. While both words and pseudowords can be elicited via repetition (Munson et al., 2003; Skinder, Strand, & Mignerey, 1999) or reading (for older children; e.g., Ballard et al., 2010; van Rees, Ballard, McCabe, Macdonald-D’Silva, & Arciuli, 2012), pictures (Murray et al., 2015) and toys can be used to prompt the production of words in a naming task or in a play context or conversation (Odell & Shriberg, 2001¹; Skinder, Connaghan, Strand, & Betz, 2000). Hence, words can be used to collect information about how children store and access linguistic information about stress, whereas for pseudowords, the stress pattern has to be provided in the stimuli (whether spoken or written). Pseudowords do, however, allow for a more straightforward control over different psycholinguistic variables and the phonetic makeup of the words. As stress production may interact with articulatory difficulties (Munson et al., 2003), these aspects may be important to control in order to isolate the effect of stress from the effect of other variables.

¹ The Prosody-Voice Screening Profile developed by Shriberg, Kwiatkowski, and Rasmussen (1990) is based on multidimensional judgments of the appropriateness of stress based on 24 eligible utterances derived from conversational speech. This system is not further discussed here given the considerable data reduction and perceptual training required (Odell & Shriberg, 2001): “Coders learn to discriminate each prosody-voice (PV) code by training practice that includes learning the perceptual criteria for each code and listening to several hundred audio-taped exemplars obtained from samples of child and adult speakers representing a wide spectrum of speech disorders, including speakers with motor speech disorders” (p. 287).

Typically, a listener or a group of listeners provides a binary judgment (correct or incorrect marking of stress) against the glossary/recording of the target words. For example, Murray et al. (2015) elicited polysyllabic words from children with a suspected CAS diagnosis in a picture-naming task (the Single-Word Test of Polysyllables; Gozzard, Baker, & McCabe, 2006; see Table 18). To measure the adequacy of stress production, listeners assessed whether the observed stress pattern matched the expected one for the particular word, which was then converted into a percentage of stress matches. Other studies have included additional codes to provide more detail about the nature of the stress production errors. For example, Skinder et al. (2000) asked listeners to judge children’s productions of bisyllabic words, including both iambic and trochaic utterances. The listeners identified the productions as (a) misplaced, (b) correct, or (c) equal in terms of stress.

Acoustic Measures
Background
Stress is expressed mainly by three perceptual parameters (length, pitch, and loudness). The physical correlates of these parameters (duration, fundamental frequency or F0, and amplitude/intensity) can be measured acoustically. Several studies have taken advantage of the acoustic approach to study and quantify production of prosody, particularly lexical stress in children with CAS. Acoustic



Table 18. Methodological details: Single-Word Test of Polysyllables (Gozzard et al., 2006).

Materials and methods

(1) Stimuli or targets being analyzed: 50 multisyllabic real words: three (n = 37), four (n = 12), five (n = 1) syllables
(2) Tasks used to elicit those targets: Picture naming
(3) Conditions in which responses are elicited: Quiet, no time pressure
(4) The measures obtained from those responses: Percent stress matches (based on binary judgments of match/mismatch)
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? Validity: Percent stress matches among the two most discriminative measures in distinguishing children with CAS from children without CAS (77%–80% accuracy; Murray et al., 2015). Reliability: Interrater reliability = 91.2% (Murray et al., 2015)
(7) Norm or reference data available? No reference data from children with typical speech. Reference data of children without CAS but with other speech disorders (N = 15), M = 67.3%, SD = 22.4%, and children with CAS (N = 28), M = 9.8%, SD = 9.1% (Murray et al., 2015)

Note. CAS = childhood apraxia of speech.
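
Measure (4) in Table 18 reduces to a simple proportion of binary match/mismatch judgments. The sketch below illustrates this with stress patterns written in the S/w notation; the function name and the example words' patterns are hypothetical and do not reproduce items or scoring materials from the Single-Word Test of Polysyllables.

```python
def percent_stress_matches(target_patterns, perceived_patterns):
    """Percentage of words whose perceived stress pattern matches the target pattern.

    Patterns are strings in S/w notation (e.g., "wSw"), one per elicited word.
    """
    if not target_patterns or len(target_patterns) != len(perceived_patterns):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(t == p for t, p in zip(target_patterns, perceived_patterns))
    return 100.0 * matches / len(target_patterns)


# Hypothetical targets and listener judgments for four words.
targets = ["wSw", "Sww", "wS", "Sw"]
perceived = ["wSw", "SSw", "wS", "Sw"]  # second word judged as equal stress on two syllables
score = percent_stress_matches(targets, perceived)
```

Recording the perceived pattern rather than only match/mismatch also allows errors to be classified afterward (e.g., misplaced versus equal stress).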

studies use similar tasks and stimuli as perceptual studies, but some considerations are particularly pertinent in the context of the acoustic approach. We will first consider the effect of phonetic context in stressed and unstressed syllables. Listeners in a perceptual study have the advantage of using all the different cues of stress simultaneously. This allows listeners to weigh different cues differently, depending on the phonetic context. For example, in English, vowels are produced with a longer duration in a syllable with a voiced coda compared to a voiceless one. The acoustic measures of duration, amplitude, and F0, by contrast, are obtained in isolation. Duration differences that result from phonetic context and from stress interact with one another and cannot be separated. In order to interpret the duration change as an effect of stress production, it is important that the phonetic context be controlled for a reliable comparison of stressed and unstressed syllables. In addition, the acoustic measures typically depend on a reliable identification of the nucleus of a syllable (the vowel or a syllabic consonant). The nuclei that are surrounded by stop consonants can be more reliably identified compared to those surrounded by liquids or glides.

In addition to phonetic context, acoustic analysis is particularly sensitive to variables related to phrase- or sentence-level prosodic factors such as phrase-final lengthening and citation intonation. Like phonetic context, these variables affect the same acoustic variables as stress and therefore interact with stress. For example, the last stressed syllable in the final foot of a phrase is subject to phrase-final lengthening. This lengthening can mask the effect of stress on duration in case of trochees and inflate the effect in case of iambs. Using a carrier phrase (e.g., “It’s a [stimulus] again.”) may help to circumvent this issue. Carrier phrases will also help to avoid citation intonation that people often use in picture-naming or single-word reading tasks (Ballard, Djaja, Arciuli, James, & van Doorn, 2012), where people raise their F0 at the end of the utterance as if requesting feedback. Shriberg et al. (2003) also reported that children were playfully varying the duration of the last syllable in iambs and spondees, which led the authors to exclude these items from analysis.

The acoustic measures are typically obtained from the nucleus of a syllable, and they include the duration of the segment, peak intensity, and peak F0. Sometimes, measures that relate to the timing of the peak F0 and/or amplitude are also included. The magnitude of stress is reflected in the comparisons of these measures between stressed and unstressed syllables: a greater difference reflects more pronounced production of stress. Several different techniques have been developed to compare the acoustic measures of stressed and unstressed syllables.

Some studies have compared the raw values of duration, amplitude, and F0 between stressed and unstressed syllables (Nijland, Maassen, Van der Meulen, Gabreëls, et al., 2003; Skinder et al., 2000). For example, Nijland, Maassen, Van der Meulen, Gabreëls, et al. (2003) found a difference in the duration of unstressed syllables in iambic feet when children with CAS were compared to TD children. More specifically, while the duration of stressed syllables was comparable between the two groups, the authors found that children with CAS did not have shorter durations for unstressed syllables in these utterances. The shortcoming of comparing raw values of acoustic measures, such as syllable duration, is that this approach does not take into account individual variation in these measures between different groups. Children with CAS, for example, may have a decreased speaking rate compared to TD children. This systematic difference may interact with the duration differences that are related to stress.

Lexical Stress Ratio
Shriberg et al. (2003) proposed the lexical stress ratio (LSR), an approach to quantify stress production that takes advantage of acoustic correlates of stress (see Table 19). The LSR combines duration, intensity, and F0 into one composite score of stress. More specifically, the LSR is the sum of the ratios of three acoustic measures (frequency area under pitch contour trace, amplitude area under

Terband et al.: Methodology in the Assessment of CAS 3021


Downloaded from: https://pubs.asha.org Andrea Martucci on 09/27/2019, Terms of Use: https://pubs.asha.org/pubs/rights_and_permissions
Table 19. Methodological details: lexical stress ratio (LSR; Shriberg et al., 2003).

Materials and methods

(1) Stimuli or targets being analyzed Eight bisyllabic real-word trochees (eight iambs and eight spondees excluded
from analysis; see Shriberg et al., 2003)
(2) Tasks used to elicit those targets Imitation (from recorded audio model)
(3) Conditions in which responses are elicited Quiet, no time pressure; items in isolation
(4) The measures obtained from those responses Weighted average of ratios of frequency area, amplitude area, and vowel duration
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? Validity: No clear pattern of correspondence between LSR values and clinical
perceptual judgments of abnormal stress
Reliability: Interjudge differences: amplitude = 0.9–1.3 dB; F0 = 10.9–14.0 Hz;
duration = 16–18 ms
(7) Norm or reference data available? No reference data of children with typical speech. Range of LSR values for
children with (non-CAS) speech delay: 0.65–1.14 (Shriberg et al., 2003)

Note. CAS = childhood apraxia of speech.
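
As a rough illustration of how a composite score of this kind can be computed, the following Python sketch forms the ratio of the first to the second syllable for each of the three measures listed in Table 19 and sums the weighted ratios. The function name, the dictionary keys, the direction of the ratio (first syllable over second), and above all the weights are assumptions made for this example; the weights are placeholders and not the constants published by Shriberg et al. (2003).

```python
# Illustrative sketch only: the weights below are placeholders, NOT the
# published LSR constants; key names and ratio direction are assumptions.
PLACEHOLDER_WEIGHTS = {"f0_area": 0.5, "amp_area": 0.25, "duration": 0.25}


def lexical_stress_ratio(first_syllable, second_syllable, weights=PLACEHOLDER_WEIGHTS):
    """Weighted sum of first/second-syllable ratios, one ratio per acoustic measure."""
    score = 0.0
    for measure, weight in weights.items():
        denominator = second_syllable[measure]
        if denominator <= 0:
            raise ValueError(f"{measure} of the second syllable must be positive")
        score += weight * (first_syllable[measure] / denominator)
    return score


# Hypothetical measurements for one trochee (areas in arbitrary units, duration in ms).
stressed = {"f0_area": 60.0, "amp_area": 30.0, "duration": 240.0}
unstressed = {"f0_area": 40.0, "amp_area": 24.0, "duration": 160.0}

lsr = lexical_stress_ratio(stressed, unstressed)
print(f"LSR = {lsr:.4f}")
```

With weights that sum to 1, identical values in both syllables give a score of 1, and values above 1 indicate that the first syllable is more prominent on the weighted measures. Because the weights are fixed, such a score cannot adapt to the phonetic or intonational context in the way a listener can.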

rectified waveform contour trace, and duration) weighted by a constant. Although Hosom, Shriberg, and Green (2004) automated the calculation of LSR using automatic speech recognition, this indicator has not been widely used. While the LSR assigns different weights to the various acoustic domains and combines them into one indicator of stress, much like a human listener, it lacks a listener's flexibility. Human listeners may weigh different perceptual cues of stress differently, depending on the phonetic and intonational context, while the weights are constant in the LSR.

Pairwise Variability Index
Finally, another measure that has been used to quantify stress production in children with and without CAS (Ballard et al., 2012, 2010; Shriberg, Jakielski, & El-Shanti, 2008) is the pairwise variability index (PVI; Low, Grabe, & Nolan, 2000; see Table 20). This index is calculated for each acoustic measure related to stress assignment (duration, intensity, and F0) separately, and it normalizes for the individual variability of speakers for these measures. PVI is calculated by the following formula (from Ballard et al., 2010):

$$\mathrm{PVI}(\mathrm{dur}) = \frac{d_k - d_{k+1}}{(d_k + d_{k+1})/2} \times 100,$$

where $d_k$ is the duration of the $k$th syllable and $d_{k+1}$ is the duration of the syllable that follows it (see Table 20). This formula illustrates the calculation of PVI for duration; the same formula can be used to calculate the PVI for other acoustic measures by replacing the duration with the measure of interest (e.g., intensity or F0).

Findings to date using the PVI indicate that this measure can reveal differences between speakers with and without AOS (in children and adults). For example, Shriberg et al. (2008) used the PVI to investigate timing and stress characteristics in the speech of three siblings with CAS and found a significantly poorer score in one of the three affected speakers, compared to their age-matched controls. With respect to adults with AOS, Vergis et al. (2014) analyzed lexical stress contrastiveness in polysyllabic words produced in isolation and in a carrier sentence by individuals with AOS + aphasia (AOS; n = 9), aphasia only (n = 8), and unaffected speakers (n = 8). The PVI was used to measure normalized relative vowel duration and peak intensity over the first two syllables of the polysyllabic words. The results showed that speakers with AOS had lower PVI vowel duration values for words with weak–strong stress produced in the sentence condition, compared to controls and individuals with aphasia; this difference was primarily attributed to disproportionately long vowels in the word-initial weak syllable for AOS participants. Similar findings were reported by Courson et al. (2012). Together, these findings suggest that the PVI might be a promising acoustic diagnostic tool in assessing dysprosody in AOS. Ballard et al. (2010) have further demonstrated that the PVI is strongly correlated with perceptual ratings of prosody.

The PVI has several advantages over other approaches of quantifying stress production in addition to normalizing for individual differences of the measures of interest. First, a study by Ballard et al. (2012) provides reference data for the PVI of duration, amplitude, and F0 in a cohort of 73 TD 3- to 7-year-old children. The authors used a picture-naming task to elicit polysyllabic words with Sw and wS stress patterns. According to the results, stress production was adultlike for words with a Sw stress pattern (e.g., “butterfly”) already by the age of 3 years. In contrast, even the older children in this cohort differed from adults in their stress production of words with a wS stress pattern (e.g., “potato”), at least with respect to duration and amplitude. These results are crucial for interpreting the stress production differences in children with CAS, as they suggest that not all differences in stress production of wS words are a reason for concern in a 5-year-old, while differences in words with a Sw pattern may be reflective of a delay or disorder.

Second, Ballard et al. (2010) argue that analyzing different correlates of stress separately from one another is a strength of the PVI approach because it may provide a clinician with indications as to which aspect of stress production is most impaired in children with CAS and/or which aspect of stress is the best target in therapy. For example, Ballard et al. examined the effects of a therapy that emphasized only durational contrasts in stress production. While all

3022 Journal of Speech, Language, and Hearing Research • Vol. 62 • 2999–3032 • August 2019



Table 20. Methodological details: pairwise variability index (PVI; e.g., Ballard et al., 2010, 2012).

Materials and methods

(1) Stimuli or targets being analyzed Multisyllabic nonwords (Ballard et al., 2010) or words (Ballard et al., 2012)
(2) Tasks used to elicit those targets Reading (Ballard et al., 2010) or picture naming (Ballard et al., 2012)
(3) Conditions in which responses are elicited Quiet, no time pressure, in isolation
(4) The measures obtained from those responses PVI for duration, amplitude, and/or F0
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? Validity: Strong correlations with perceptual ratings of prosody for three children with
CAS (Ballard et al., 2010)
Reliability: Intraclass correlation coefficients: .905 (duration), .996 (F0; Ballard et al.,
2012); interjudge Pearson r = .98 (duration); average difference, 1.22 ms (Ballard
et al., 2010)
(7) Norm or reference data available? Reference data available for ages 3–7 years and adults (Ballard et al., 2012). No
cutoff values specified, but means and SD available

Note. CAS = childhood apraxia of speech.
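
The normalized pairwise comparison described in this section can be sketched in a few lines of Python. The sketch assumes the usual normalized form of the index, in which the difference between two adjacent syllables is divided by their mean and multiplied by 100. The function name and the example values are hypothetical and do not come from any of the cited studies.

```python
def pairwise_variability_index(first, second):
    """Signed, normalized PVI for one pair of adjacent syllables.

    first, second: the same acoustic measure (e.g., vowel duration in ms,
    intensity, or F0) for syllable k and syllable k + 1.
    """
    mean = (first + second) / 2
    if mean <= 0:
        raise ValueError("the mean of the two values must be positive")
    return 100 * (first - second) / mean


# Hypothetical vowel durations (ms) for a strong-weak word at two speaking rates.
normal_rate = pairwise_variability_index(200.0, 100.0)
slow_rate = pairwise_variability_index(400.0, 200.0)  # both vowels twice as long

# The same durations in weak-strong order: only the sign changes.
weak_strong = pairwise_variability_index(100.0, 200.0)

print(normal_rate, slow_rate, weak_strong)
```

Because the difference is divided by the mean of the pair, doubling both durations leaves the value unchanged, which is the sense in which the index normalizes for speaker differences such as overall speaking rate. A value of 0 indicates no contrast between the two syllables on the chosen measure.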

participants with CAS (n = 3) showed improvement on the duration contrast, the contrast between other variables, such as intensity and F0, also improved.

With respect to the validity and reliability of the PVI, the following should be noted. At present, validation of the PVI as a measure of dysprosody in CAS is limited to the strong correlations between PVI and perceptual ratings of prosody in three children with CAS, as reported by Ballard et al. (2010). Clearly, further validation using (much) larger samples is needed, in particular to validate the PVI as a potential diagnostic marker for CAS (e.g., validation against other measures such as standardized maximum performance tasks). Also, in order to obtain PVI values, it is necessary to divide the stimuli into vocalic and intervocalic intervals based on acoustic information available from the waveform and spectrogram. As noted by White, Liss, and Dellwo (2011), researchers vary in their approach to determining vocalic and consonantal information, although it is not clear whether or to what extent these measurement differences have an effect on the final result (Liss et al., 2009). In terms of reliability, Ballard and colleagues (Ballard et al., 2012, 2010) reported high Pearson and intraclass correlation coefficients and small interrater differences for their (small) sample, suggesting that PVI measures can be reliably obtained (see Table 20).

Articulatory Measures
Kinematic Pairwise Variability Index
Articulatory measures of prosody may include kinematic measures of movement amplitude and movement duration (e.g., Goffman, 1999, 2004; Grigos & Patel, 2007, 2010; Kopera & Grigos, 2019; see Table 21). Similar to acoustic measures, ratios of duration or amplitude and PVI can be computed to express the degree of differentiation between stressed and unstressed syllables. All the caveats and considerations discussed previously with respect to kinematic measures apply here as well. To date,

Table 21. Methodological details: kinematic pairwise variability index (PVI; Kopera & Grigos, 2019).

Materials and methods

(1) Stimuli or targets being analyzed Two-syllable sequence puppy extracted from multisyllabic nonword puppypop
(2) Tasks used to elicit those targets Cloze sentence or response to question
(3) Conditions in which responses are elicited In story context with prop, no time pressure, in isolation
(4) The measures obtained from those responses PVI for jaw movement duration and amplitude
Scientific basis
(5) Standardized measurement protocol? No
(6) Validity and reliability of outcome measures? Validity: Not reported, except to the extent that the PVI movement duration distinguished
children with CAS from TD children, and children were designated as having CAS
based on perceived presence of prosodic abnormalities.
Reliability: Not reported
(7) Norm or reference data available? Only group means and standard deviations: PVI movement duration: TD = 33.1 (29.3),
SSD = 21.9 (18.3), CAS = 18.0 (30.4)
PVI movement amplitude: TD = 86.9 (62.0), SSD = 66.3 (45.5), CAS = 73.4 (65.8)

Note. CAS = childhood apraxia of speech; TD = typically developing; SSD = speech sound disorder.

only one study has applied kinematic measures to study prosody in CAS (Kopera & Grigos, 2019). Kopera and Grigos examined PVI based on movement duration and PVI based on movement amplitude (in addition to acoustic measures) in seven children with CAS, eight children with other SSDs, and nine TD children. Children produced the target word puppypop in a cloze sentence or in response to a prompt (see Table 21). Kopera and Grigos observed that PVI based on movement duration from perceptually accurate puppy(pop) utterances distinguished children with CAS from TD children, and that other children with SSDs did not differ from either children with CAS or TD children. PVI based on movement amplitude did not differ between any groups. Interestingly, PVIs based on acoustic measures (duration, F0 peak, F0 average) did not differ between groups, suggesting that kinematic measures of PVI may be more sensitive to group differences.

Discussion and Conclusions
This tutorial gives an overview of measurement techniques for the assessment of childhood speech motor disorders, in particular CAS, organized according to three levels of directness of measurement (perceptual, acoustic, and kinematic) as well as three symptom domains that are generally considered core deficits of CAS: inconsistency, lengthened and disrupted coarticulation, and inappropriate prosody. Below, the merits of these measures for diagnosis and assessment of underlying deficits are discussed from a broader perspective. In addition, future directions for research and clinical practice are discussed.

The three levels of directness refer to the extent to which the measure directly reflects speech movements. This should not be interpreted to mean that more movement-oriented measures also more directly reflect the underlying deficit. First, this highly depends on the domain of assessment, since some domains lend themselves better to kinematic measurement (e.g., inconsistency) than others (e.g., inappropriate prosody). Second, for a full description of the clinical aspects of a speech disorder, the three levels are complementary. Especially Kent (2004) but also others (e.g., Brumbach & Goffman, 2014; Goffman, 2010; Kleinow & Smith, 2006; Kloth, Janssen, Kraaimaat, & Brutten, 1995; A. Smith, Goffman, Sasisekaran, & Weber-Fox, 2012) have stressed the role of the interface between language and speech processes in determining typical and deficient speech, arguing that speech movements or speech gestures are perceptually controlled, goal-oriented, and driven by cognition (mental representations). That is, by moving the articulators, the speaker intends to produce an acoustic signal that can be understood by a listener; in addition to the reaction of the listener, the speaker’s own monitoring of the acoustic result thereby forms a crucial criterion to determine success. From many studies, it has become clear that there is no one-to-one relationship between speech movements and perceptual result. According to Guenther, Hampson, and Johnson (1998), the invariant targets of speech are not vocal tract shapes, but regions in auditory perceptual space, which implies that the acoustic–auditory results serve as the primary reference frame for the control and monitoring of speech movements; as a consequence, there can be large variability in movement given a particular auditory goal. Thus, to fully understand the control of speech movements and to reveal the underlying deficits in speech disorders, the three levels of directness are complementary.

From the perspective of clinical evaluation and intervention, some characteristics are better suited to be described at the perceptual level, especially phonemic errors and prosody; others at the acoustic level, especially phonetic distortions, coarticulation, and prosody; and still others at the kinematic level, especially coarticulation, stability, and gestural coordination. The first reason why this is the case resides in the limitations of our knowledge of the relation between auditory, acoustic, and articulatory phonetics. If for a particular sound we do not precisely know all the acoustic and kinematic correlates of a correct production, the perceptual judgment of the clinician outperforms acoustic and kinematic measures in determining correctness. However, the second, clinically more important reason to adopt a multilevel approach is the notion of reference frames presented in the previous paragraph. Both for the clinician and for the speaker, an auditory approach to correct deviant articulatory movements can be the primary diagnostic and therapeutic channel. For instance, the optimal instruction for correcting vowel productions in a person with dysarthria who has difficulty with tongue elevation might very well be a perceptual one: “produce the vowel more /i/-like.” To what extent a sound is /i/-like may be easier to assess and easier to control at the perceptual than at the kinematic level: “elevate your tongue body closer to the palatal–alveolar region.” The acoustic level can be of help to focus on particular aspects of the auditory percept and to link the auditory percept to kinematic aspects. To illustrate these relations, we give an example. Technical developments between 1970 and 1990 allowed for constructing equipment that visualizes acoustic characteristics of speech. Thus, the so-called visual speech apparatus was designed for the deaf and hard of hearing, providing visual feedback of the produced vowel in acoustic F1–F2 space (Povel & Arends, 1991). Here, the instruction could be “try to move the vowel-dot into the upper-right corner of this vowel triangle.” Similar acoustics-based feedback systems have been developed since, to give enhanced feedback for speech disorders (e.g., McAllister Byun & Hitchcock, 2012; McAllister Byun, Swartz, Halpin, Szeredi, & Maas, 2016; Shuster, Ruscello, & Toth, 1995) or second-language acquisition (e.g., Bliss, Abel, & Gick, 2018; Dowd, Smith, & Wolfe, 1998; Hirata, 2004; Lambacher, 1999).

In addition to the different levels of reference frames, a reason why the levels of directness need to be analyzed in a complementary fashion is that speech is produced in a larger context (words, utterances, dialogues) in which higher levels of control interact with lower levels. A process-oriented approach requires analyses of not only the end product, that is, speech movements, but also the processes of conceptualization, formulation, encoding, planning, and



control involved in producing those movements. These processes run off in a cascade-like fashion, such that all processes are simultaneously active, working on different parts of the utterance: the higher the level, the more advanced the preparation. The simultaneous organization may require a trade-off in attention allocation. Fluent speakers may focus their attention on the message they want to get across and to a lesser extent on formulating eloquent sentences but leave all articulation processes to the automatic pilot. For people with language and speech disorders, speaking may be more like a dual or even triple or quadruple task, in which case much more attention must be allocated to the formulation and articulation processes as well. Thus, in this view, processing levels interact, and speech symptoms must therefore be studied in context.

The same two principles (context dependency and interaction between levels) apply to each of the three CAS characteristics: inconsistency, lengthened and disrupted coarticulation, and inappropriate prosody. These can be approached with each of the three levels of measurement, and none of the levels is better or more direct than any of the others across contexts. However, the approach chosen determines to a large extent the data that are collected and thus the interpretation that can be given regarding the underlying deficit. Comprehensive studies are needed that include more than one diagnostic feature and more than one level of measurement.

For whom was this tutorial written? First of all, for researchers who can spend time to consider alternative methods and have the resources to implement those that are judged to be optimal. There is no holy grail; different research questions and different study populations require different methods. The past decades were a time of technical development that allowed for powerful acoustic analyses and saw the emergence of fine-grained kinematic measurement procedures. The line of research and technical development has been focused on finding acoustic and kinematic correlates of perceptual phenomena, as well as the other way around: finding perceptual effects of kinematic and acoustic phenomena. Now, to move beyond objectifying perceptual phenomena with acoustic and kinematic measures, we need to adopt an integrated approach in which all levels are included and all levels are interpreted in what they have to offer for diagnosis and treatment.

Second, for clinicians, who generally need to work with what they have. Being better informed about diverse methods of assessment fosters an analytic view on underlying processes and alternative methods of listening, watching, and measuring. There are quite a few assessment protocols around; most of these are not validated and are used primarily based on pragmatic considerations with respect to time and equipment available. Most of the cited studies do not have the scope and the volume, in terms of numbers of participants or scope of measurements, to yield norm data or even reference data that could be applied outside the context of the particular study and thus could lead to generalizations. This implies that no measure so far has proven to have clear diagnostic value on its own. A first step to improve this situation is to adopt a more analytic, process-oriented way of theorizing about measurements and their relation with underlying deficits in speech disorders. The second step is to conduct clinical research to come up with validated consensus measurement protocols to operationalize, quantify, and eventually standardize assessments, so that we can better compare children across studies, and interpret observations from individual clients in a clinical setting, using replicable methods.

References
Abramson, A. S., & Whalen, D. H. (2017). Voice onset time (VOT) at 50: Theoretical and practical issues in measuring voicing distinctions. Journal of Phonetics, 63, 75–86.
American Speech-Language-Hearing Association. (2007). Childhood apraxia of speech (Report No. TR2007-00278). Rockville, MD: Author. Retrieved from http://www.asha.org/policy
Anderson, A., Lowit, A., & Howell, P. (2008). Temporal and spatial variability in speakers with Parkinson’s disease and Friedreich’s ataxia. Journal of Medical Speech-Language Pathology, 16(4), 173.
Aram, D. M., & Horwitz, S. J. (1983). Sequential and non-speech praxic abilities in developmental verbal apraxia. Developmental Medicine & Child Neurology, 25, 197–206.
Auzou, P., Ozsancak, C., Morris, R. J., Jan, M., Eustache, F., & Hannequin, D. (2000). Voice onset time in aphasia, apraxia of speech and dysarthria: A review. Clinical Linguistics & Phonetics, 14(2), 131–150.
Ballard, K. J., Djaja, D., Arciuli, J., James, D. G., & van Doorn, J. (2012). Developmental trajectory for production of prosody: Lexical stress contrastivity in children ages 3 to 7 years and in adults. Journal of Speech, Language, and Hearing Research, 55(6), 1822–1835.
Ballard, K. J., Robin, D. A., McCabe, P., & McDonald, J. (2010). A treatment for dysprosody in childhood apraxia of speech. Journal of Speech, Language, and Hearing Research, 53(5), 1227–1245.
Barbier, G., Perrier, P., Ménard, L., Payan, Y., Tiede, M., & Perkell, J. (2013, September). Speech planning as an index of speech motor control maturity. Paper presented at the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), Lyon, France.
Beddor, P. S., Harnsberger, J. D., & Lindemann, S. (2002). Language-specific patterns of vowel-to-vowel coarticulation: Acoustic structures and their perceptual correlates. Journal of Phonetics, 30(4), 591–627.
Bennett, S. (1981). Vowel formant frequency characteristics of preadolescent males and females. The Journal of the Acoustical Society of America, 69(1), 231–238.
Betz, S. K., & Stoel-Gammon, C. (2005). Measuring articulatory error consistency in children with developmental apraxia of speech. Clinical Linguistics & Phonetics, 19(1), 53–66.
Bickley, C. (1986). Formant estimation of high fundamental frequency speech. The Journal of the Acoustical Society of America, 79(S1), S38.
Bliss, H., Abel, J., & Gick, B. (2018). Computer-assisted visual articulation feedback in L2 pronunciation instruction. Journal of Second Language Pronunciation, 4(1), 129–153.
Bradford, A., & Dodd, B. (1996). Do all speech-disordered children have motor deficits? Clinical Linguistics & Phonetics, 10(2), 77–101.

Brancazio, L., & Fowler, C. A. (1998). On the relevance of locus equations for production and perception of stop consonants. Perception and Psychophysics, 60(1), 24–50.
Brumbach, A. C. D., & Goffman, L. (2014). Interaction of language processing and motor skill in children with specific language impairment. Journal of Speech, Language, and Hearing Research, 57(1), 158–171.
Burt, L., Holm, A., & Dodd, B. (1999). Phonological awareness skills of 4-year-old British children: An assessment and developmental data. International Journal of Language & Communication Disorders, 34(3), 311–335.
Butcher, A. (1989). Measuring coarticulation and variability in tongue contact patterns. Clinical Linguistics & Phonetics, 3(1), 39–47.
Butcher, A., & Weiher, E. (1976). An electropalatographic investigation of coarticulation in VCV sequences. Journal of Phonetics, 4(1), 59–74.
Carter, A., & Gerken, L. (2004). Do children’s omissions leave traces? Journal of Child Language, 31(3), 561–586.
Caruso, A. J., & Strand, E. A. (1999). Clinical management of motor speech disorders in children. New York, NY: Thieme.
Chang, S.-E., Ohde, R. N., & Conture, E. G. (2002). Coarticulation and formant transition rate in young children who stutter. Journal of Speech, Language, and Hearing Research, 45(4), 676–688.
Chau, T., Young, S., & Redekop, S. (2005). Managing variability in the summary and comparison of gait data. Journal of NeuroEngineering and Rehabilitation, 2(1), 22.
Chitoran, I., Goldstein, L., & Byrd, D. (2002). Gestural overlap and recoverability: Articulatory evidence from Georgian. In C. Gussenhoven & N. Warner (Eds.), Laboratory phonology (Vol. 7, pp. 419–447). Berlin, Germany: Mouton de Gruyter.
Cho, T. (2004). Prosodically conditioned strengthening and vowel-to-vowel coarticulation in English. Journal of Phonetics, 32(2), 141–176.
Cleland, J., McCron, C., & Scobbie, J. M. (2013). Tongue reading: Comparing the interpretation of visual information from inside the mouth, from electropalatographic and ultrasound displays of speech sounds. Clinical Linguistics & Phonetics, 27(4), 299–311.
Courson, M.-E. A., Ballard, K. J., Canault, M., Layfield, C. A., Scholl, D. I., & Gentil, C. (2012). Lexical stress production in healthy and apraxic speakers of Australian English or French. Journal of Medical Speech-Language Pathology, 20(4), 47–53.
Cummins, F., Lowit, A., & van Brenk, F. (2014). Quantitative assessment of interutterance stability: Application to dysarthria. Journal of Speech, Language, and Hearing Research, 57(1), 81–89.
Daniloff, R., & Hammarberg, R. (1973). On defining coarticulation. Journal of Phonetics, 1(3), 239–248.
Darley, F. L., Aronson, A. E., & Brown, J. R. (1975). Motor speech disorders. Philadelphia, PA: Saunders.
De Jong, K. J. (1995). The supraglottal articulation of prominence in English: Linguistic stress as localized hyperarticulation. The Journal of the Acoustical Society of America, 97(1), 491–504.
Dodd, B. (1995). Procedures for classification of subgroups of speech disorder. In B. Dodd (Ed.), Differential diagnosis and treatment of children with speech disorder (pp. 49–64). San Diego, CA: Singular.
Dodd, B. (2005). Differential diagnosis and treatment of children with speech disorder (2nd ed.). London, United Kingdom: Whurr.
Dodd, B., Hua, Z., Crosbie, S., Holm, A., & Ozanne, A. (2002). Diagnostic Evaluation of Articulation and Phonology (DEAP). London, United Kingdom: The Psychological Corporation.
Dodd, B., Hua, Z., Crosbie, S., Holm, A., & Ozanne, A. (2009). Diagnostic Evaluation of Articulation and Phonology–US Edition (DEAP). San Antonio, TX: Pearson.
Dowd, A., Smith, J., & Wolfe, J. (1998). Learning to pronounce vowel sounds in a foreign language using acoustic measurements of the vocal tract as feedback in real time. Language and Speech, 41(1), 1–20.
Duffy, J. R. (2005). Motor speech disorders: Substrates, differential diagnosis, and management. St. Louis, MO: Mosby.
Edwards, J., Beckman, M. E., & Fletcher, J. (1991). The articulatory kinematics of final lengthening. The Journal of the Acoustical Society of America, 89(1), 369–382.
Ekelman, B. L., & Aram, D. M. (1984). Spoken syntax in children with developmental verbal apraxia. In D. M. Aram (Ed.), Seminars in speech and language: Assessment and treatment of developmental apraxia (pp. 97–110). New York, NY: Thieme-Stratton.
Farnetani, E., & Recasens, D. (1997). Coarticulation and connected speech processes. In W. J. Hardcastle, J. Laver, & F. E. Gibbon (Eds.), The handbook of phonetic sciences (pp. 371–404).
Feng, Y., & Max, L. (2014). Accuracy and precision of a custom camera-based system for 2-D and 3-D motion tracking during speech and nonspeech motor tasks. Journal of Speech, Language, and Hearing Research, 57(2), 426–438.
Forrest, K. (2003). Diagnostic criteria of developmental apraxia of speech used by clinical speech-language pathologists. American Journal of Speech-Language Pathology, 12(3), 376–380.
Forrest, K., Dinnsen, D. A., & Elbert, M. (1997). Impact of substitution patterns on phonological learning by misarticulating children. Clinical Linguistics & Phonetics, 11(1), 63–76.
Forrest, K., Elbert, M., & Dinnsen, D. A. (2000). The effect of substitution patterns on phonological treatment outcomes. Clinical Linguistics & Phonetics, 14(7), 519–531.
Forrest, K., & Iuzzini-Seigel, J. (2008). A comparison of oral motor and production training for children with speech sound disorders. Seminars in Speech and Language, 29(4), 304–311.
Fowler, C. A. (1994). Invariants, specifiers, cues: An investigation of locus equations as information for place of articulation. Perception & Psychophysics, 55(6), 597–610.
Gerken, L. (1996). Prosodic structure in young children’s language production. Language, 72, 683–712.
Gerken, L., & McGregor, K. (1998). An overview of prosody and its role in normal and disordered child language. American Journal of Speech-Language Pathology, 7(2), 38–48.
Gibbon, F., Hardcastle, W., & Nicolaidis, K. (1993). Temporal and spatial aspects of lingual coarticulation in /kl/ sequences: A cross-linguistic investigation. Language and Speech, 36(2–3), 261–277.
Gibbon, F., & Lee, A. (2007). Electropalatography as a research and clinical tool. Perspectives on Speech Science and Orofacial Disorders, 17(1), 7–13.
Gibson, T., & Ohde, R. N. (2007). F2 locus equations: Phonetic descriptors of coarticulation in 17- to 22-month-old children. Journal of Speech, Language, and Hearing Research, 50(1), 97–108.
Goffman, L. (1999). Prosodic influences on speech production in children with specific language impairment and speech deficits: Kinematic, acoustic, and transcription evidence. Journal of Speech, Language, and Hearing Research, 42(6), 1499–1517.
Goffman, L. (2004). Kinematic differentiation of prosodic categories in normal and disordered language development. Journal of Speech, Language, and Hearing Research, 47(5), 1088–1102.



Goffman, L. (2010). Dynamic interaction of motor and language Heisler, L., Goffman, L., & Younger, B. (2010). Lexical and articula-
factors in normal and disordered development. In B. Maassen tory interactions in children’s language production. Develop-
& P. van Lieshout (Eds.), Speech motor control: New develop- mental Science, 13(5), 722–730.
ments in basic and applied research (pp. 137–152). Oxford, Hertrich, I., & Ackermann, H. (1995). Coarticulation in slow speech:
United Kingdom: Oxford University Press. Durational and spectral analysis. Language and Speech, 38(2),
Goffman, L., Heisler, L., & Chakraborty, R. (2006). Mapping of 159–187.
prosodic structure onto words and phrases in children’s and Hertrich, I., & Ackermann, H. (1999). Temporal and spectral aspects
adults’ speech production. Language and Cognitive Processes, of coarticulation in ataxic dysarthria: An acoustic analysis.
21(1–3), 25–47. Journal of Speech, Language, and Hearing Research, 42(2),
Goffman, L., & Malin, C. (1999). Metrical effects on speech move- 367–381.
ments in children and adults. Journal of Speech, Language, and Hirata, Y. (2004). Computer assisted pronunciation training for
Hearing Research, 42(4), 1003–1015. native English speakers learning Japanese pitch and durational
Gozzard, H., Baker, E., & McCabe, P. (2006). Children’s productions contrasts. Computer Assisted Language Learning, 17(3–4), 357–376.
of polysyllables. ACQuiring Knowledge in Speech, Language Holm, A., Crosbie, S., & Dodd, B. (2007). Differentiating normal
and Hearing, 8(3), 113–116. variability from inconsistency in children’s speech: Normative
Graetzer, S., Fletcher, J., & Hajek, J. (2015). Locus equations and data. International Journal of Language & Communication
coarticulation in three Australian languages. The Journal of the Disorders, 42(4), 467–486.
Acoustical Society of America, 137(2), 806–821. Hosom, J.-P., Shriberg, L., & Green, J. R. (2004). Diagnostic assess-
Green, J. R., Moore, C. A., Higashikawa, M., & Steeve, R. W. ment of childhood apraxia of speech using automatic speech
(2000). The physiologic development of speech motor control: recognition (ASR) methods. Journal of Medical Speech-Language
Lip and jaw coordination. Journal of Speech, Language, and Pathology, 12(4), 167.
Hearing Research, 43(1), 239–255. Howell, P., Anderson, A. J., Bartrip, J., & Bailey, E. (2009). Com-
Green, J. R., Nip, I. S. B., & Maassen, B. (2010). Some organiza- parison of acoustic and kinematic approaches to measuring
tion principles in early speech development. In B. Maassen & utterance-level speech variability. Journal of Speech, Language,
P. van Lieshout (Eds.), Speech motor control: New developments and Hearing Research, 52(4), 1088–1096.
in basic and applied research (pp. 171–188). Oxford, United Howell, P., Anderson, A. J., & Lucero, J. (2010). Motor timing
Kingdom: Oxford University Press. and fluency. In B. Maassen & P. van Lieshout (Eds.), Speech
Grigos, M. I. (2009). Changes in articulator movement variabil- motor control: New developments in basic and applied research
ity during phonemic development: A longitudinal study. (pp. 215–225). Oxford, United Kingdom: Oxford University Press.
Journal of Speech, Language, and Hearing Research, 52(1), Ingram, D. (2002). The measurement of whole-word productions.
164–177. Journal of Child Language, 29(4), 713–733.
Grigos, M. I., & Kolenda, N. (2010). The relationship between Iuzzini-Seigel, J. (2012). Inconsistency of speech in children with
articulatory control and improved phonemic accuracy in child- childhood apraxia of speech, phonological disorders, and typical
hood apraxia of speech: A longitudinal case study. Clinical speech development (Unpublished doctoral dissertation). Indiana
Linguistics & Phonetics, 24, 17–40. University, Bloomington, IN.
Grigos, M. I., Moss, A., & Lu, Y. (2015). Oral articulatory control Iuzzini-Seigel, J., & Forrest, K. (2010). Evaluation of a combined
in childhood apraxia of speech. Journal of Speech, Language, treatment approach for childhood apraxia of speech. Clinical
and Hearing Research, 58(4), 1103–1118. Linguistics & Phonetics, 24(4–5), 335–345.
Grigos, M. I., & Patel, R. (2007). Articulator movement asso- Iuzzini-Seigel, J., Hogan, T. P., & Green, J. R. (2017). Speech incon-
ciated with the development of prosodic control in chil- sistency in children with childhood apraxia of speech, language
dren. Journal of Speech, Language, and Hearing Research, impairment, and speech delay: Depends on the stimuli. Journal
50(1), 119–130. of Speech, Language, and Hearing Research, 60(5), 1194–1210.
Grigos, M. I., & Patel, R. (2010). Acquisition of articulatory con- Iuzzini-Seigel, J., Hogan, T. P., Guarino, A. J., & Green, J. R.
trol for sentential focus in children. Journal of Phonetics, 38(4), (2015). Reliance on auditory feedback in children with childhood
706–715. apraxia of speech. Journal of Communication Disorders, 54, 32–42.
Guenther, F. H., Hampson, M., & Johnson, D. (1998). A theoretical Iuzzini-Seigel, J., Hogan, T. P., Rong, P., & Green, J. R. (2015).
investigation of reference frames for the planning of speech Longitudinal development of speech motor control: Motor and
movements. Psychological Review, 105(4), 611–633. linguistic factors. Journal of Motor Learning and Development,
Guyette, T. W., & Diedrich, W. M. (1981). A critical review of 3(1), 53–68.
developmental apraxia of speech. In N. J. Lass (Ed.), Speech Katz, W. F., & Bharadwaj, S. (2001). Coarticulation in fricative-
and language. Advances in basic research and practice (pp. 1–49). vowel syllables produced by children and adults: A preliminary
New York, NY: Academic Press. report. Clinical Linguistics & Phonetics, 15(1–2), 139–143.
Hall, P. K. (2000). A letter to the parent(s) of a child with develop- Katz, W. F., Machetanz, J., Orth, U., & Schönle, P. (1990). A
mental apraxia of speech. Part I: Speech characteristics of the kinematic analysis of anticipatory coarticulation in the speech
disorder. Language, Speech, and Hearing Services in Schools, 31, of anterior aphasic subjects using electromagnetic articulography.
169–172. Brain and Language, 38(4), 555–575.
Hardcastle, W. J. (1985). Some phonetic and syntactic constraints Kehoe, M. M. (2000). Truncation without shape constraints: The
on lingual coarticulation during /kl/ sequences. Speech Com- latter stages of prosodic acquisition. Language Acquisition, 8(1),
munication, 4(1–3), 247–263. 23–67.
Hardcastle, W. J., & Tjaden, K. (2008). 32 Coarticulation and Kehoe, M. M. (2001). Prosodic patterns in children’s multisyllabic
speech impairment. The handbook of clinical linguistics, 506–524. word productions. Language, Speech, and Hearing Services in
Harrington, J. (2010). Acoustic phonetics. In W. J. Hardcas- Schools, 32(4), 284–294.
tle, J. Laver, & F. E. Gibbon (Eds.), A handbook of phonetics Kehoe, M. M., Stoel-Gammon, C., & Buder, E. H. (1995). Acoustic
(Vol. 116, pp. 81–129). Oxford, United Kingdom: Wiley- correlates of stress in young children’s speech. Journal of Speech
Blackwell. and Hearing Research, 38(2), 338–350.

Kent, R. D. (2004). Models of speech motor control: Implications Low, E. L., Grabe, E., & Nolan, F. (2000). Quantitative character-
from recent developments in neurophysiological and neuro- izations of speech rhythm: Syllable-timing in Singapore English.
behavioral science. In B. Maassen, R. Kent, H. F. M. Peters, Language and Speech, 43(4), 377–401.
P. van Lieshout, & W. Hulstijn (Eds.), Speech motor control Lucero, J. C. (2005). Comparison of measures of variability of
in normal and disordered speech (pp. 1–28). Oxford, United speech movement trajectories using synthetic records. Journal
Kingdom: Oxford University Press. of Speech, Language, and Hearing Research, 48(2), 336–344.
Kent, R. D., & Minifie, F. D. (1977). Coarticulation in recent Lucero, J. C., Munhall, K. G., Gracco, V. L., & Ramsay, J. O.
speech production models. Journal of Phonetics, 5(2), 115–133. (1997). On the registration of time and the patterning of speech
Kewley-Port, D. (1982). Measurement of formant transitions in movements. Journal of Speech, Language, and Hearing Research,
naturally produced stop consonant–vowel syllables. The Journal 40(5), 1111–1117.
of the Acoustical Society of America, 72(2), 379–389. Lundeborg, I., Nordin, E., Zeipel-Stjerna, M., & McAllister, A.
Kim, Y., Coalson, G., & Berry, J. (2018). A kinematic analysis of (2015). Voice onset time in Swedish children with phonological
coarticulation effects on schwa. Folia Phoniatrica et Logopaedica, impairment. Logopedics Phoniatrics Vocology, 40(4), 149–155.
70(3–4), 203–212. Maas, E., & Farinella, K. A. (2012). Random versus blocked prac-
Kleinow, J., & Smith, A. (2000). Influences of length and syntactic tice in treatment for childhood apraxia of speech. Journal of
complexity on the speech motor stability of the fluent speech Speech, Language, and Hearing Research, 55(2), 561–578.
of adults who stutter. Journal of Speech, Language, and Hearing Maas, E., & Mailend, M.-L. (2012). Speech planning happens
Research, 43(2), 548–559. before speech execution: Online reaction time methods in the
Kleinow, J., & Smith, A. (2006). Potential interactions among lin- study of apraxia of speech. Journal of Speech, Language, and
guistic, autonomic, and motor factors in speech. Developmental Hearing Research, 55(5), 1523–1534.
Psychobiology, 48(4), 275–287. Maas, E., & Mailend, M.-L. (2017). Fricative contrast and coar-
Kloth, S., Janssen, P., Kraaimaat, F., & Brutten, G. (1995). Speech- ticulation in children with and without speech sound disor-
motor and linguistic skills of young stutterers prior to onset. ders. American Journal of Speech-Language Pathology, 26(2S),
Journal of Fluency Disorders, 20(2), 157–170. 649–663.
Kopera, H. C., & Grigos, M. I. (2019). Lexical stress in childhood apraxia of speech: Acoustic and kinematic findings. International Journal of Speech-Language Pathology. https://doi.org/10.1080/17549507.2019.1568571
Krull, D. (1988). Acoustic properties as predictors of perceptual responses: A study of Swedish voiced stops. Sweden: Department of Linguistics, University of Stockholm.
Krull, D. (1989). Second formant locus patterns and consonant–vowel coarticulation in spontaneous speech. Phonetic Experimental Research at the Institute of Linguistics, University of Stockholm, 10, 87–108.
Maassen, B., Nijland, L., & Terband, H. (2010). Developmental models of childhood apraxia of speech. In B. Maassen & P. van Lieshout (Eds.), Speech motor control: New developments in basic and applied research (pp. 243–258). Oxford, United Kingdom: Oxford University Press.
Maassen, B., Nijland, L., & Van der Meulen, S. (2001). Coarticulation within and between syllables by children with developmental apraxia of speech. Clinical Linguistics & Phonetics, 15(1–2), 145–150.
Macrae, T., Tyler, A. A., & Lewis, K. E. (2014). Lexical and pho-
Kühnert, B., Hoole, P., & Mooshammer, C. (2006). Gestural overlap nological variability in preschool children with speech sound
and C-center in selected French consonant clusters. Paper pre- disorder. American Journal of Speech-Language Pathology,
sented at the 7th International Seminar on Speech Production 23(1), 27–35.
(ISSP), Ubatuba, Brazil. Marion, M. J., Sussman, H. M., & Marquardt, T. P. (1993). The
Kühnert, B., & Nolan, F. (1999). The origin of coarticulation. In perception and production of rhyme in normal and develop-
W. J. Hardcastle & N. Hewlett (Eds.), Coarticulation: Theory, mentally apraxic children. Journal of Communication Disorders,
data and techniques (pp. 1–30). Cambridge, United Kingdom: 26(3), 129–160.
Cambridge University Press. Marquardt, T. P., Jacks, A., & Davis, B. (2004). Token-to-token
Lambacher, S. (1999). A CALL tool for improving second variability in developmental apraxia of speech: Three longitudinal
language acquisition of English consonants by Japanese case studies. Clinical Linguistics & Phonetics, 18(2), 127–144.
learners. Computer Assisted Language Learning, 12(2), Marquardt, T. P., Sussman, H. M., Snow, T., & Jacks, A. (2002).
137–156. The integrity of the syllable in developmental apraxia of speech.
Lee, S., Potamianos, A., & Narayanan, S. (1999). Acoustics of Journal of Communication Disorders, 35, 31–49.
children’s speech: Developmental changes of temporal and McAllister Byun, T., & Hitchcock, E. R. (2012). Investigating the
spectral parameters. The Journal of the Acoustical Society of use of traditional and spectral biofeedback approaches to inter-
America, 105, 1455. vention for /r/ misarticulation. American Journal of Speech-
Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Language Pathology, 21, 207–221.
Studdert-Kennedy, M. (1967). Perception of the speech code. McAllister Byun, T., Swartz, M. T., Halpin, P. F., Szeredi, D., &
Psychological Review, 74(6), 431. Maas, E. (2016). Direction of attentional focus in biofeedback
Lindblom, B. (1963). On vowel reduction (Report No. 29). Stock- treatment for /r/ misarticulation. International Journal of
holm, Sweden: The Royal Institute of Technology, Speech Language & Communication Disorders, 51, 384–401.
Transmission Laboratory. McCabe, P., Rosenthal, J. B., & McLeod, S. (1998). Features of
Lisker, L., & Abramson, A. S. (1964). A cross-language study of developmental dyspraxia in the general speech-impaired popula-
voicing in initial stops: Acoustical measurements. Word, 20(3), tion? Clinical Linguistics & Phonetics, 12(2), 105–126.
384–422. McIntosh, B., & Dodd, B. J. (2008). Two-year-olds’ phonological
Liss, J. M., White, L., Mattys, S. L., Lansford, K., Lotto, acquisition: Normative data. International Journal of Speech-
A. J., Spitzer, S. M., & Caviness, J. N. (2009). Quantifying Language Pathology, 10(6), 460–469.
speech rhythm abnormalities in the dysarthrias. Journal Mefferd, A. (2015). Articulatory-to-acoustic relations in talkers
of Speech, Language, and Hearing Research, 52(5), with dysarthria: A first analysis. Journal of Speech, Language,
1334–1352. and Hearing Research, 58(3), 576–589.



Ménard, L., Aubin, J., Thibeault, M., & Richard, G. (2012). Measur- children and adults. Journal of Speech and Hearing Research,
ing tongue shapes and positions with ultrasound imaging: A 32(1), 120–132.
validation experiment using an articulatory model. Folia Phonia- Noiray, A., Abakarova, D., Rubertus, E., Krüger, S., & Tiede, M.
trica et Logopaedica, 64(2), 64–72. (2018). How do children organize their speech in the first years
Ménard, L., Cathiard, M.-A., Dupont, S., & Tiede, M. (2013). of life? Insight from ultrasound imaging. Journal of Speech,
Anticipatory lip gestures: A validation of the movement ex- Language, and Hearing Research, 61(6), 1355–1368.
pansion model in congenitally blind speakers. The Journal of Noiray, A., Cathiard, M.-A., Abry, C., & Ménard, L. (2010). Lip
the Acoustical Society of America, 133(4), EL249–EL255. rounding anticipatory control: Crosslinguistically lawful and
Modarresi, G., Sussman, H., Lindblom, B., & Burlingame, E. ontogenetically attuned. In B. Maassen & P. van Lieshout
(2004). An acoustic analysis of the bidirectionality of coar- (Eds.), Speech motor control: New developments in basic and
ticulation in VCV utterances. Journal of Phonetics, 32(3), applied research (pp. 153–171). Oxford, United Kingdom:
291–312. Oxford University Press.
Moss, A., & Grigos, M. I. (2012). Interarticulatory coordination Noiray, A., Cathiard, M.-A., Ménard, L., & Abry, C. (2011).
of the lips and jaw in childhood apraxia of speech. Journal of Test of the movement expansion model: Anticipatory
Medical Speech-Language Pathology, 20(4), 127. vowel lip protrusion and constriction in French and English
Munson, B., Bjorum, E. M., & Windsor, J. (2003). Acoustic and speakers. The Journal of the Acoustical Society of America,
perceptual correlates of stress in nonwords produced by children 129(1), 340–349.
with suspected developmental apraxia of speech and children Noiray, A., Ménard, L., Cathiard, M.-A., Abry, C., & Savariaux, C.
with phonological disorder. Journal of Speech, Language, and (2004, October). The development of anticipatory labial coarticu-
Hearing Research, 46(1), 189–202. lation in French: A pionering study. Paper presented at the Eighth
Murray, E., Iuzzini, J., Maas, E., Terband, H., & Ballard, K. International Conference on Spoken Language Processing, Jeju
(2018). Diagnosis of childhood apraxia of speech compared to Island, South Korea.
other speech sound disorders: A systematic review. Paper pre- Noiray, A., Ménard, L., & Iskarous, K. (2013). The development
sented at the ASHA Convention 2018, Boston, MA, November of motor synergies in children: Ultrasound and acoustic mea-
15–17, 2018. surements. The Journal of the Acoustical Society of America,
Murray, E., McCabe, P., Heard, R., & Ballard, K. J. (2015). 133(1), 444–452.
Differential diagnosis of children with suspected childhood Noiray, A., Wieling, M., Abakarova, D., Rubertus, E., & Tiede, M.
apraxia of speech. Journal of Speech, Language, and Hearing (in press). Back from the future: Non-linear anticipation in
Research, 58(1), 43–60. adults and children’s speech. Journal of Speech, Language, and
Namasivayam, A. K., Pukonen, M., Goshulak, D., Hard, J., Hearing Research (this special issue).
Rudzicz, F., Rietveld, T., . . . van Lieshout, P. (2015). Treatment Odell, K. H., McNeil, M. R., Rosenbek, J. C., & Hunter, L. (1991).
intensity and childhood apraxia of speech. International Journal Perceptual characteristics of vowel and prosody production in
of Language & Communication Disorders, 50(4), 529–546. apraxic, aphasic, and dysarthric speakers. Journal of Speech and
Namasivayam, A. K., Pukonen, M., Goshulak, D., Yu, V. Y., Kadis, Hearing Research, 34(1), 67–80.
D. S., Kroll, R., . . . De Nil, L. F. (2013). Relationship between Odell, K. H., & Shriberg, L. D. (2001). Prosody-voice characteristics
speech motor control and speech intelligibility in children with of children and adults with apraxia of speech. Clinical Linguistics
speech sound disorders. Journal of Communication Disorders, & Phonetics, 15(4), 275–307.
46(3), 264–280. Öhman, S. E. G. (1966). Coarticulation in VCV utterances: Spectro-
Nijland, L., Maassen, B., Hulstijn, W., & Peters, H. (2004). Speech graphic measurements. The Journal of the Acoustical Society of
motor coordination in Dutch-speaking children with DAS America, 39(1), 151–168.
studied with EMMA. Journal of Multilingual Communication Patel, R. (2003). Acoustic characteristics of the question-statement
Disorders, 2(1), 50–60. contrast in severe dysarthria due to cerebral palsy. Journal of
Nijland, L., Maassen, B., Van der Meulen, S., Gabreëls, F., Kraaimaat, Speech, Language, and Hearing Research, 46(6), 1401–1415.
F. W., & Schreuder, R. (2002). Coarticulation patterns in children Patel, R., & Campellone, P. (2009). Acoustic and perceptual cues
with developmental apraxia of speech. Clinical Linguistics & to contrastive stress in dysarthria. Journal of Speech, Language,
Phonetics, 16(6), 461–483. and Hearing Research, 52(1), 206–222.
Nijland, L., Maassen, B., & Van der Meulen, S. (2003). Evidence Pollock, K. E., & Hall, P. K. (1991). An analysis of the vowel mis-
of motor programming deficits in children diagnosed with articulations of five children with developmental apraxia of
DAS. Journal of Speech, Language, and Hearing Research, speech. Clinical Linguistics & Phonetics, 5, 207–224.
46(2), 437–450. Povel, D.-J., & Arends, N. (1991). The visual speech apparatus:
Nijland, L., Maassen, B., Van der Meulen, S., Gabreëls, F., Kraaimaat, Theoretical and practical aspects. Speech Communication,
F. W., & Schreuder, R. (2003). Planning of syllables in children 10(1), 59–80.
with developmental apraxia of speech. Clinical Linguistics & Preston, J. L., & Koenig, L. L. (2011). Phonetic variability in re-
Phonetics, 17(1), 1–24. sidual speech sound disorders: Exploration of subtypes. Topics
Nittrouer, S. (1993). The emergence of mature gestural patterns is in Language Disorders, 31(2), 168.
not uniform: Evidence from an acoustic study. Journal of Speech Preston, J. L., Molfese, P. J., Gumkowski, N., Sorcinelli, A.,
and Hearing Research, 36(5), 959–972. Harwood, V., Irwin, J. R., & Landi, N. (2014). Neurophysiology
Nittrouer, S., Munhall, K., Kelso, J. S., Tuller, B., & Harris, K. S. of speech differences in childhood apraxia of speech. Develop-
(1988). Patterns of interarticulator phasing and their relation mental Neuropsychology, 39(5), 385–403.
to linguistic structure. The Journal of the Acoustical Society of Ramsay, J. O., & Silverman, B. W. (1997). Functional data analysis.
America, 84(5), 1653–1661. New York, NY: Springer.
Nittrouer, S., Studdert-Kennedy, M., & McGowan, R. S. (1989). Recasens, D. (2004). The effect of syllable position on consonant
The emergence of phonetic segments: Evidence from the reduction (evidence from Catalan consonant clusters). Journal
spectral structure of fricative-vowel syllables spoken by of Phonetics, 32(3), 435–453.

Recasens, D., & Pallarès, M. D. (2001). Coarticulation, assimilation Siren, K. A., & Wilcox, K. A. (1995). Effects of lexical meaning and
and blending in Catalan consonant clusters. Journal of Phonetics, practiced productions on coarticulation in children’s and adults’
29(3), 273–301. speech. Journal of Speech and Hearing Research, 38(2), 351–359.
Recasens, D., Pallarès, M. D., & Fontdevila, J. (1997). A model Skinder, A., Connaghan, K., Strand, E., & Betz, S. (2000). Acoustic
of lingual coarticulation based on articulatory constraints. correlates of perceived lexical stress errors in children with
The Journal of the Acoustical Society of America, 102(1), developmental apraxia of speech. Journal of Medical Speech-
544–561. Language Pathology, 8(4), 279–284.
Rosenbek, J. C., & Wertz, R. T. (1972). A review of fifty cases of Skinder, A., Strand, E. A., & Mignerey, M. (1999). Perceptual and
developmental apraxia of speech. Language, Speech, and Hearing acoustic analysis of lexical and sentential stress in children with
Services in Schools, 3(1), 23–33. developmental apraxia of speech. Journal of Medical Speech-
Rubertus, E., & Noiray, A. (2018). On the development of gestural Language Pathology, 7, 133–144.
organization: A cross-sectional study of vowel-to-vowel anticipa- Smith, A., & Goffman, L. (1998). Stability and patterning of speech
tory coarticulation. PLOS ONE, 13(9), e0203562. movement sequences in children and adults. Journal of Speech,
Sadagopan, N., & Smith, A. (2008). Developmental changes in the Language, and Hearing Research, 41(1), 18–30.
effects of utterance length and complexity on speech movement Smith, A., Goffman, L., Sasisekaran, J., & Weber-Fox, C. (2012).
variability. Journal of Speech, Language, and Hearing Research, Language and motor abilities of preschool children who stutter:
51(5), 1138. Evidence from behavioral and kinematic indices of nonword
Schumacher, J. G., McNeil, M. R., Vetter, D. K., & Yoder, D. E. repetition performance. Journal of Fluency Disorders, 37(4),
(1986, November). Articulatory consistency and variability in 344–358.
apraxic and non-apraxic children. Paper presented at the Annual Smith, A., Goffman, L., Zelaznik, H. N., Ying, G., & McGillem, C.
Convention of the American Speech-Language-Hearing Associa- (1995). Spatiotemporal stability and patterning of speech move-
tion, Detroit, MI. ment sequences. Experimental Brain Research, 104(3), 493–501.
Sharf, D. J., & Ohde, R. N. (1981). Physiologic, acoustic, and Smith, A., & Zelaznik, H. N. (2004). Development of functional
perceptual aspects of coarticulation: Implications for the reme- synergies for speech motor coordination in childhood and
diation of articulatory disorders. In N. J. Lass (Ed.), Speech adolescence. Developmental Psychobiology, 45(1), 22–33.
and language: Advances in basic research and practice (Vol. 5, Smith, B., Marquardt, T., Cannito, M., & Davis, B. (1994). Vowel
pp. 153–247). New York, NY: Academic Press. variability in developmental apraxia of speech. In J. A. Till,
Shriberg, L. D. (1994). Developmental phonological disorders: K. M. Yorkston, & D. R. Beukelman (Eds.), Motor speech
Moving toward the 21st century—Forwards, backwards, or disorders: Advances in assessment and treatment (pp. 81–89).
endlessly sideways? American Journal of Speech-Language Baltimore, MD: Brookes.
Pathology, 3, 26–28. Soli, S. D. (1981). Second formants in fricatives: Acoustic conse-
Shriberg, L. D., Aram, D. M., & Kwiatkowski, J. (1997a). Develop- quences of fricative-vowel coarticulation. The Journal of the
mental apraxia of speech: I. Descriptive and theoretical per- Acoustical Society of America, 70(4), 976–984.
spectives. Journal of Speech, Language, and Hearing Research, Song, J. Y., Demuth, K., Evans, K., & Shattuck-Hufnagel, S. (2013).
40(2), 273–285. Durational cues to fricative codas in 2-year-olds’ American
Shriberg, L. D., Aram, D. M., & Kwiatkowski, J. (1997b). Develop- English: Voicing and morphemic factors. The Journal of the
mental apraxia of speech: II. Toward a diagnostic marker. Journal Acoustical Society of America, 133(5), 2931–2946.
of Speech, Language, and Hearing Research, 40(2), 286–312. Sosa, A. V., & Stoel-Gammon, C. (2006). Patterns of intra-word
Shriberg, L. D., Aram, D. M., & Kwiatkowski, J. (1997c). Develop- phonological variability during the second year of life. Journal
mental apraxia of speech: III. A subtype marked by inappropriate of Child Language, 33(1), 31–50.
stress. Journal of Speech, Language, and Hearing Research, 40(2), Southwood, M., Dagenais, P., Sutphin, S., & Garcia, J. M. (1997).
313–337. Coarticulation in apraxia of speech: A perceptual, acoustic,
Shriberg, L. D., Campbell, T. F., Karlsson, H. B., Brown, R. L., and electropalatographic study. Clinical Linguistics & Phonetics,
McSweeny, J. L., & Nadler, C. J. (2003). A diagnostic marker 11(3), 179–203.
for childhood apraxia of speech: The lexical stress ratio. Clinical Story, B. H., & Bunton, K. (2016). Formant measurement in chil-
Linguistics & Phonetics, 17(7), 549–574. dren’s speech based on spectral filtering. Speech Communication,
Shriberg, L. D., Jakielski, K. J., & El-Shanti, H. (2008). Breakpoint 76, 93–111.
localization using array-CGH in three siblings with an un- Strand, E. A., McCauley, R. J., Weigand, S. D., Stoeckel,
balanced 4q;16q translocation and childhood apraxia of speech R. E., & Baas, B. S. (2013). A motor speech assessment for
(CAS). American Journal of Medical Genetics Part A, 146A(17), children with severe speech disorders: Reliability and validity
2227–2233. evidence. Journal of Speech, Language, and Hearing Research,
Shriberg, L. D., & Kent, R. D. (2013). Clinical phonetics (4th ed.). 56(2), 505–520.
Boston, MA: Pearson. Sussman, H. M. (1994). The phonological reality of locus equations
Shriberg, L. D., Kwiatkowski, J., & Rasmussen, C. (1990). The across manner class distinctions: Preliminary observations.
prosody-voice screening profile. Tucson, AZ: Communication Phonetica, 51(1–3), 119–131.
Skill Builders. Sussman, H. M., Bessell, N., Dalston, E., & Majors, T. (1997). An
Shriberg, L. D., Strand, E. A., Fourakis, M., Jakielski, K. J., Hall, investigation of stop place of articulation as a function of sylla-
S. D., Karlsson, H. B., . . . Wilson, D. L. (2015). A diagnostic ble position: A locus equation perspective. The Journal of the
marker to discriminate childhood apraxia of speech from Acoustical Society of America, 101(5), 2826–2838.
speech delay: I. Development and description of the pause Sussman, H. M., Hoemeke, K. A., & Ahmed, F. S. (1993). A cross-
marker. Journal of Speech, Language, and Hearing Research, linguistic investigation of locus equations as a phonetic descriptor
60(4), S1096–S1117. for place of articulation. The Journal of the Acoustical Society
Shuster, L. I., Ruscello, D. M., & Toth, A. R. (1995). The use of of America, 94(3), 1256–1268.
visual feedback to elicit correct /r/. American Journal of Speech- Sussman, H. M., Hoemeke, K. A., & McCaffrey, H. A. (1992).
Language Pathology, 4, 37–44. Locus equations as an index of coarticulation for place of



articulation distinctions in children. Journal of Speech and Hearing Research, 35(4), 769–781.
Sussman, H. M., Marquardt, T., & Doyle, J. (2000). An acoustic analysis of phonemic integrity and contrastiveness in developmental apraxia of speech. Journal of Medical Speech-Language Pathology, 8(4), 301–313.
Sussman, H. M., McCaffrey, H. A., & Matthews, S. A. (1991). An investigation of locus equations as a source of relational invariance for stop place categorization. The Journal of the Acoustical Society of America, 90(3), 1309–1325.
Sussman, H. M., Minifie, F. D., Buder, E. H., Stoel-Gammon, C., & Smith, J. (1996). Consonant–vowel interdependencies in babbling and early words: Preliminary examination of a locus equation approach. Journal of Speech and Hearing Research, 39(2), 424–433.
Sussman, H. M., & Shore, J. (1996). Locus equations as phonetic descriptors of consonantal place of articulation. Perception & Psychophysics, 58(6), 936–946.
Terband, H. (2017, July). Deviant coarticulation in childhood apraxia of speech (CAS) does not include hyperarticulation. Paper presented at the 7th International Conference on Speech Motor Control, Groningen, the Netherlands.
Terband, H., & Maassen, B. (2010). Speech motor development in childhood apraxia of speech (CAS): Generating testable hypotheses by neurocomputational modeling. Folia Phoniatrica et Logopaedica, 62, 134–142.
Terband, H., Maassen, B., Guenther, F. H., & Brumberg, J. (2009). Computational neural modeling of speech motor control in childhood apraxia of speech (CAS). Journal of Speech, Language, and Hearing Research, 52(6), 1595–1609.
Terband, H., Maassen, B., Guenther, F. H., & Brumberg, J. (2014). Auditory–motor interactions in pediatric motor speech disorders: Neurocomputational modeling of disordered development. Journal of Communication Disorders, 47, 17–33.
Terband, H., Maassen, B., & Maas, E. (2016). Toward a model of pediatric speech sound disorders (SSD) for differential diagnosis and therapy planning. In P. van Lieshout, B. Maassen, & H. Terband (Eds.), Speech motor control in normal and disordered speech: Future developments in theory and methodology (pp. 81–110). Rockville, MD: ASHA.
Terband, H., Maassen, B., & Maas, E. (2019). A psycholinguistic framework for diagnosis and treatment planning of developmental speech disorders. Folia Phoniatrica et Logopaedica. https://doi.org/10.1159/000499426
Terband, H., Maassen, B., van Lieshout, P., & Nijland, L. (2011). Stability and composition of functional synergies for speech movements in children with developmental speech disorders. Journal of Communication Disorders, 44(1), 59–74.
Terband, H., van Zaalen, Y., & Maassen, B. (2012). Lateral jaw stability in children with developmental speech disorders. Journal of Medical Speech-Language Pathology, 20(4), 112–118.
Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the development of cognition and action. Cambridge, MA: MIT Press.
Timmins, C., Hardcastle, W., McCann, J., Wood, S., & Wishart, J. (2008). Coarticulation in children with Down’s syndrome: An electropalatographic analysis. Proceedings of ISSP 2008—8th International Seminar on Speech Production (INRIA) Strasbourg, France, 273–276.
Tyler, A. A., & Lewis, K. E. (2005). Relationships among consistency/variability and other phonological measures over time. Topics in Language Disorders, 25(3), 243–253.
Tyler, A. A., Lewis, K. E., & Welch, C. M. (2003). Predictors of phonological change following intervention. American Journal of Speech-Language Pathology, 12(3), 289–298.
Tyler, A. A., Williams, M. J., & Lewis, K. E. (2006). Error consistency and the evaluation of treatment outcomes. Clinical Linguistics & Phonetics, 20(6), 411–422.
van Brenk, F., & Lowit, A. (2012). The relationship between acoustic indices of speech motor control variability and other measures of speech performance in dysarthria. Journal of Medical Speech-Language Pathology, 20(4), 24–29.
van Lieshout, P., & Moussa, W. (2000). The assessment of speech motor behavior using electromagnetic articulography. The Phonetician, 81, 9–22.
van Lieshout, P., & Namasivayam, A. K. (2010). Speech motor variability in people who stutter. In B. Maassen & P. van Lieshout (Eds.), Speech motor control: New developments in basic and applied research (pp. 191–214). Oxford, United Kingdom: Oxford University Press.
van Rees, L. J., Ballard, K. J., McCabe, P., Macdonald-D’Silva, A. G., & Arciuli, J. (2012). Training production of lexical stress in typically developing children using orthographically biased stimuli and principles of motor learning. American Journal of Speech-Language Pathology, 21(3), 197–206.
Velleman, S. L., & Shriberg, L. D. (1999). Metrical analysis of the speech of children with suspected developmental apraxia of speech. Journal of Speech, Language, and Hearing Research, 42(6), 1444–1460.
Vergis, M. K., Ballard, K. J., Duffy, J. R., McNeil, M. R., Scholl, D., & Layfield, C. (2014). An acoustic measure of lexical stress differentiates aphasia and aphasia plus apraxia of speech after stroke. Aphasiology, 28(5), 554–575.
Vorperian, H. K., Wang, S., Chung, M. K., Schimek, E. M., Durtschi, R. B., Kent, R. D., . . . Gentry, L. R. (2009). Anatomic development of the oral and pharyngeal portions of the vocal tract: An imaging study. The Journal of the Acoustical Society of America, 125(3), 1666–1678.
Walton, J. H., & Pollock, K. E. (1993). Acoustic validation of vowel error patterns in developmental apraxia of speech. Clinical Linguistics & Phonetics, 7, 95–111.
Wambaugh, J. L., Duffy, J. R., McNeil, M., Robin, D. A., & Rogers, M. A. (2006). Treatment guidelines for acquired apraxia of speech: A synthesis and evaluation of the evidence. Journal of Medical Speech-Language Pathology, 14(2), xv–xxxiii.
Weismer, G., Yunusova, Y., & Westbury, J. R. (2003). Interarticulator coordination in dysarthria. Journal of Speech, Language, and Hearing Research, 46(5), 1247–1261.
Whalen, D. H. (1990). Coarticulation is largely planned. Journal of Phonetics, 18, 3–35.
White, L., Liss, J. M., & Dellwo, V. (2011). Assessment of rhythm. In A. Lowit & R. D. Kent (Eds.), Assessment of motor speech disorders (pp. 231–251). San Diego, CA: Plural.
Whiteside, S. P., Dobbin, R., & Henry, L. (2003). Patterns of variability in voice onset time: A developmental study of motor speech skills in humans. Neuroscience Letters, 347(1), 29–32.
Yao-Tresguerres, S., Iuzzini-Seigel, J., & Forrest, K. (2009). The effect of therapy on children’s use of inconsistent substitutes. Paper presented at the Annual Convention of the American Speech-Language-Hearing Association, New Orleans, LA.
Yoss, K. A., & Darley, F. L. (1974). Developmental apraxia of speech in children with defective articulation. Journal of Speech and Hearing Research, 17(3), 399–416.
Yu, V. Y., Kadis, D. S., Oh, A., Goshulak, D., Namasivayam, A., Pukonen, M., . . . Pang, E. W. (2014). Changes in voice onset time and motor speech skills in children following motor speech therapy: Evidence from /pa/ productions. Clinical Linguistics & Phonetics, 28(6), 396–412.

Yunusova, Y., Green, J. R., & Mefferd, A. (2009). Accuracy assessment for AG500, electromagnetic articulograph. Journal of Speech, Language, and Hearing Research, 52(2), 547–555.
Zharkova, N. (2013). Using ultrasound to quantify tongue shape and movement characteristics. The Cleft Palate-Craniofacial Journal, 50(1), 76–81.
Zharkova, N., Gibbon, F. E., & Hardcastle, W. J. (2015). Quantifying lingual coarticulation using ultrasound imaging data collected with and without head stabilisation. Clinical Linguistics & Phonetics, 29(4), 249–265.
Zharkova, N., Hewlett, N., & Hardcastle, W. J. (2011). Coarticulation as an indicator of speech motor control development in children: An ultrasound study. Motor Control, 15(1), 118–140.
Zharkova, N., Hewlett, N., & Hardcastle, W. J. (2012). An ultrasound study of lingual coarticulation in /sV/ syllables produced by adults and typically developing children. Journal of the International Phonetic Association, 42(2), 193–208.
Ziegler, W., & von Cramon, D. (1985). Anticipatory coarticulation in a patient with apraxia of speech. Brain and Language, 26(1), 117–130.

3032 Journal of Speech, Language, and Hearing Research • Vol. 62 • 2999–3032 • August 2019

