Werker1984 PDF
Cross-Language Speech Perception:
Evidence for Perceptual Reorganization
During the First Year of Life*
JANET F. WERKER AND RICHARD C. TEES
University of British Columbia
Previous work in which we compared English infants, English adults, and Hindi
adults on their ability to discriminate two pairs of Hindi (non-English) speech
contrasts has indicated that infants discriminate speech sounds according to phonetic
category without prior specific language experience (Werker, Gilbert, Humphrey,
& Tees, 1981), whereas adults and children as young as age 4 (Werker & Tees, in
press) may lose this ability as a function of age and/or linguistic experience. The
present work was designed to (a) determine the generalizability of such a decline
by comparing adult English, adult Salish, and English infant subjects on their
perception of a new non-English (Salish) speech contrast, and (b) delineate the time
course of the developmental decline in this ability. The results of these experiments
replicate our original findings by showing that infants can discriminate
nonnative speech contrasts without relevant experience, and that there is a decline
in this ability during ontogeny. Furthermore, data from both cross-sectional
and longitudinal studies show that this decline occurs within the first year of life,
and that it is a function of specific language experience.
While a large (but finite) number of sound segments occur in the languages of
the world, only a subset is used phonemically (to differentiate meaning) in any
particular language. Several researchers have predicted that human infants are
born with the ability to discriminate the universal set of phonetic contrasts
regardless of language experience, and that this ability declines as a function of
specific linguistic experience (Eimas, 1978; Morse, 1978; Werker et al., 1981).
Alternatively, it has been proposed that experience listening to a language may
be necessary to facilitate the perception of the phonetic distinctions used in
that language (Eilers, Gavin, & Wilson, 1979). Most relevant data support the
first of these predictions, suggesting that rather than having to learn to differ-
entiate phonetic features, young infants seem to respond to speech sounds ac-
cording to the categories that could serve as the basis for adult phonemic
* This work was jointly supported by grants to Richard C. Tees from the Social Sciences
and Humanities Research Council (410-81-0796), the National Research Council (PA0179) of
Canada, and the National Institute of Mental Health (1R03MH35829), and by NICHD Grant
HD12420 to Haskins Laboratories. We thank the infants and mothers who made this study
possible. We also thank Kathy Searcy, Sue Tees, and Carole Bawden for their assistance. Special thanks
to Al Liberman for making us welcome at Haskins Laboratories. Requests for reprints should be
sent to Janet F. Werker, Department of Psychology, Dalhousie University, Halifax, Nova Scotia,
B3H 4J1, or to Richard C. Tees, Department of Psychology, University of British Columbia,
Vancouver, BC, V6T 1Y7, Canada.
50 WERKER AND TEES
The first study attempted to determine whether the results obtained in this
earlier work were representative of developmental changes in cross-language
speech perception. To test this, an experiment similar to that reported by
Werker and colleagues (1981) was designed using a different non-English
CROSS-LANGUAGE SPEECH PERCEPTION 51
Method
Subjects. Twelve full-term infants (8 girls, 4 boys) ranging in age from 6
months, 4 days, to 7 months, 29 days, with an average age of 6 months, 29
days, were recruited by advertising in local newspapers. Infants were requested
to participate on days when they had no evidence of colds or ear infections.
Care was taken to ensure that each infant was comfortable in the experimental
room before testing began.
Ten English-speaking adults (6 males, 4 females), aged 22-35, were re-
cruited from the University of British Columbia campus. As it is difficult to
find adults with no second language training, notes were made on formal and
informal training. No English adults had exposure to a second language con-
taining the contrast being studied.
Five native Thompson-speaking adults (3 females, 2 males) ranging in
age from 30 to 65 were tested on their discrimination of the Thompson tokens.
Stimuli. Multiple natural exemplars of each sound were used in the dis-
crimination task, so that subjects would have to ignore within category acous-
tic variability and differentiate the sounds according to phonetic category,
much as is done in natural language processing. Care was taken to ensure that
the exemplars from the two categories were equated for intensity, fundamental
frequency, duration, and intonation contour. The English contrast used was
the place of articulation distinction, /ba/-/da/, in which bilabial and alveolar
voiced stop consonants are differentiated. Four exemplars of /ba/ and four
exemplars of /da/ were used. The Thompson (non-English) contrast /k'i/-/q'i/
involved two glottalized voiceless stop consonants where the uvular versus
velar place of articulation distinction is the critical difference. These
sounds are produced by obstructing the air flow by raising the back of the
tongue either against the velum (velar) or behind the velum (uvular). Back con-
sonants are characteristic of North American Indian languages. English listeners
typically label both velar and uvular stops as velar, since uvular consonants are
not typically used in English.
In recording native Indians who are not accustomed to reading their lan-
guage, it was necessary to record whole words, and then ask the speaker to
repeat the first consonant-vowel (CV) sound. It was then possible to perform
acoustic analyses of words and CV repetitions to ensure that the CV syllables
contained the same consonant sounds as the words. The vowels in Interior
Salish languages vary (somewhat in free variation and in a somewhat systema-
tic fashion) between speakers and between consonants (Thompson & Kinkade,
in press). In over 100 recordings of k and q words and sounds from three dif-
ferent speakers, we were unable to find exemplars wherein a similar enough
vowel followed multiple CV only repetitions of k and q.
In the CV repetitions from the words k'ixm (to fry an egg) and q'ixm (to
make one see), however, there was one exemplar (or token) of /k'i/ and one
token of /q'i/ in one speaker’s recording in which the vowels sounded nearly
identical to one another and appeared similar in a waveform analysis. Since
there is a discontinuity in the waveform of glottalized stops (a zero-amplitude
segment in the waveform), it was easy to use the /i/ periodic segment from a
single /k'i/ and the /i/ periodic portion from a single /q'i/ to splice on to
additional exemplars of the ejective portion taken from other k and q
repetitions. This was done to yield three tokens of /k'i/ with a single /i/ segment
and three different ejective portions, and three tokens of /q'i/ with a single
/i/ segment and three different ejective portions.
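The splicing step described above can be sketched in a few lines. This is an illustration only, under stated assumptions: the waveforms are short stand-in lists of samples rather than real recordings, each ejective portion is assumed to end in the zero-amplitude gap so that the join is a plain concatenation, and the name `splice_tokens` is hypothetical rather than taken from the original software.

```python
from typing import List, Sequence


def splice_tokens(ejective_portions: Sequence[Sequence[float]],
                  vowel_segment: Sequence[float]) -> List[List[float]]:
    """Join each ejective portion to the same vowel segment.

    Assumption: every ejective portion ends in the zero-amplitude gap,
    so concatenating at that point introduces no audible discontinuity.
    """
    return [list(ejective) + list(vowel_segment) for ejective in ejective_portions]


# Hypothetical stand-in samples (not real speech data).
vowel_i = [0.2, -0.2, 0.2, -0.2]   # one shared /i/ periodic segment
ejectives = [
    [0.9, 0.1, 0.0, 0.0],          # burst followed by the silent gap
    [0.7, 0.3, 0.0, 0.0],
    [0.8, 0.2, 0.0, 0.0],
]

# Three tokens: three different ejective portions, one common vowel.
tokens = splice_tokens(ejectives, vowel_i)
```

Using a single vowel segment for every token of a category keeps vowel quality from serving as a cue, so that any discrimination must rest on the consonantal portion.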
Classical spectrographic analysis has been shown to provide little information
as to the acoustic differences between velar and uvular sounds (Mayes,
1979). In our spectrographic analysis, the only apparent differences between
typical spectrograms from /k'i/ and /q'i/ were in the third formant
transition, and possibly in the amplitude and duration of the burst. F3 is flat
for /q'i/ at around 2300 Hz, whereas it rises for /k'i/ from 2400 Hz to
approximately 2900 Hz. The amplitude and duration of the /q/ burst are
greater than in the /k/ burst. Representative spectrograms are shown in Fig. 1.
The average duration for each token was 400 ms with a 1500-ms silent interval
between tokens. Final tapes were prepared and set up with the use of the
PDP-224 computer at Haskins Laboratories in New Haven, CT. All tapes
were played on a Revox A-77 tape recorder at approximately 65 dB SPL in a
Tracoustics sound-attenuated test chamber. The entire operation was controlled
by a logic system (Werker et al., 1981).
Procedure. Infants were tested in a “head turn” (HT) paradigm (sometimes
referred to as the “visually reinforced infant speech discrimination” paradigm)
in which the infants were conditioned to turn their heads away from an
experimental assistant and toward a loudspeaker within a specified time interval
(4 1/2 s) when there was a change in the speech sound category. Correct head-turns
were reinforced with the presentation, and illumination, of an electrically
activated toy animal inside a smoked plexiglass box, while incorrect head-turns
(i.e., false positives) were not reinforced. Three exemplars of /k'i/ were set up in
random order on Track 1 of a two-track tape, and three exemplars of /q'i/ were set
up and aligned on Track 2. When changes from Track 1 to Track 2 occurred
during the testing, the subject’s task was to indicate when there was a change
in the phonetic category from /k'i/ to /q'i/. In this sense, it could be argued
that the HT procedure functioned as a categorizing discrimination task since
multiple exemplars were used. However, exemplars from a single category
were much more similar than those typically used in categorizing tasks (cf.
Kuhl, 1979).
Figure 1. Spectrograms of typical exemplars of the Thompson glottalized velar/
uvular contrast (/k'i/-/q'i/). Panels: glottalized velar /k'i/ and glottalized
uvular /q'i/; horizontal axis: time (ms).
In the experimental setup the infant sat on its parent’s lap facing an
experimental assistant (E2) across the table in a sound-attenuated chamber. The
speaker and the visual reinforcer were located at a 110° angle, 90 cm to the left
of the parent/infant. Both the parent and E2 wore headphones through which
music was played so they would not be able to influence the infant’s behavior.
The E2 kept the infant looking in his/her direction by manipulating small toys.
Another Experimenter (El) sat outside the chamber observing the infant
through a one-way observation window and monitored the logic system con-
sole (for details, see Werker et al., 1981).
In the conditioning phase of this procedure, the experimenter activated
the toy animal immediately following a sound change. Once the infant formed
the association between the sound change and activation of the toy animal
(usually within 2 to 10 trials), the infant, upon hearing the sound change,
turned its head to see the toy animal perform, and activation of the reinforcer
became contingent on an appropriate head turn.
When conditioning was successful (i.e., three correct anticipatory head
turns in a row), presentation of stimuli and activation of the visual reinforcer
became controlled by a logic system. Every time the infant turned its head, E2
pressed a button on the floor. All button presses were recorded on a
Grason-Stadler event recorder. If the button press occurred within 4 1/2 s of the
stimuli changing from one phoneme (e.g., /ba/) to another (e.g., /da/), the visual
reinforcer was activated by the logic system, and each such activation was recorded.
The operation of the logic system also yielded a record of each time an infant
did not turn his/her head during a change trial (i.e., misses) and each incorrect
head turn (i.e., false positives).
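The scoring rule just described (a button press within 4 1/2 s of a sound change counts as a correct head turn, no press counts as a miss, and any other press counts as a false positive) can be sketched as follows. This is a minimal illustration, not the original logic system: the function name `score_session`, the treatment of the window boundaries as inclusive, and the event times are all assumptions.

```python
from typing import Dict, List

WINDOW_S = 4.5  # response window after a sound change, as given in the text


def score_session(change_times: List[float],
                  press_times: List[float],
                  window: float = WINDOW_S) -> Dict[str, int]:
    """Count hits, misses, and false positives for one session.

    A change trial is a hit if any button press falls within `window`
    seconds after the change (boundaries assumed inclusive); otherwise
    it is a miss. A press falling in no change window is a false positive.
    """
    hits = 0
    presses_in_windows = set()
    for change in change_times:
        in_window = [i for i, t in enumerate(press_times)
                     if change <= t <= change + window]
        if in_window:
            hits += 1
            presses_in_windows.update(in_window)
    misses = len(change_times) - hits
    false_positives = len(press_times) - len(presses_in_windows)
    return {"hits": hits, "misses": misses, "false_positives": false_positives}


# Hypothetical event times in seconds (illustrative only).
result = score_session(change_times=[10.0, 30.0, 55.0],
                       press_times=[11.2, 20.0, 57.5])
```

In this made-up session the presses at 11.2 s and 57.5 s fall inside change windows, the change at 30.0 s draws no response, and the press at 20.0 s falls outside every window.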
A variant of this paradigm was used with adult English subjects where a
button press rather than a head-turn was the required behavioral response (see
Werker & Tees, 1983).
The criterion for successful discrimination was 8 out of 10 correct responses
to change trials with no more than two errors (i.e., two misses or two
false positives).¹ The criterion for deciding an infant could not discriminate a
contrast had two phases. First, the infant had to successfully discriminate /ba/-
/da/ directly before and after failing to reach criterion on a nonnative contrast.
This was done to ensure that the failure of the infant was due to an inability to
readily perceive the sound difference, and was not due to nonspecific factors
such as boredom, dirty diapers, etc. Two infants (1 male, 1 female) were eliminated
from further analysis because they failed this phase. Second, the infant
was given 25 change trials on the nonnative contrast in its unsuccessful attempt
to reach criterion. Adults were also given 25 change trials in which to
reach criterion.
¹ Typically, in the HT procedure, head-turns are only counted during demarcated observation
intervals (e.g., Aslin et al., 1981; Kuhl, 1979; Werker et al., 1981). In this series of experiments,
we modified the procedure and controlled for bias by random manipulation of the timing
between experimental trials. To do this, we had to bring each infant under tight experimental
control during conditioning. For example, if an infant was inclined to make frequent false positive
head-turns, we extinguished that response proclivity by lengthening the interval between sound
changes. Following conditioning, experimental trials occurred according to a random schedule
(every 4 to 15 trials) when the infant was continuously oriented toward the experimental assistant.
Since the timing of experimental trials varied, control trials were not used. Every head-turn during
this period was counted, yielding an overall probability of < .01 for achieving an 8 out of 10
correct response to change trials.
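One way to make the discrimination criterion concrete is the sketch below. It rests on an interpretation that the text does not spell out: the 8-out-of-10 rule is applied to any run of 10 consecutive change trials, and misses and false positives within that run are pooled into a single error count of at most two. The function name `trials_to_criterion` and the trial records are hypothetical.

```python
from typing import List, Optional, Tuple

# (hit on this change trial, false positives since the previous change trial)
Trial = Tuple[bool, int]


def trials_to_criterion(trials: List[Trial], window: int = 10,
                        min_hits: int = 8, max_errors: int = 2,
                        max_trials: int = 25) -> Optional[int]:
    """Return the number of change trials needed to reach criterion, else None.

    Assumption: criterion is met on the first run of `window` consecutive
    change trials with at least `min_hits` hits and no more than
    `max_errors` errors, where errors pool misses and false positives.
    """
    trials = trials[:max_trials]
    for end in range(window, len(trials) + 1):
        block = trials[end - window:end]
        hits = sum(1 for hit, _ in block if hit)
        errors = (window - hits) + sum(fp for _, fp in block)
        if hits >= min_hits and errors <= max_errors:
            return end
    return None


# Hypothetical subject: three early misses, then ten clean hits.
example = [(False, 0)] * 3 + [(True, 0)] * 10
n_trials = trials_to_criterion(example)
```

Under these assumptions the example subject reaches criterion once the run of 10 most recent change trials contains only two of the early misses.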
Results and Discussion
The proportion of subjects that either reached or did not reach the 8 out of 10
criterion on the Thompson contrast is illustrated in Fig. 2. All 5 of the adult
Thompson speakers reached criterion, whereas only 3 out of 10 adult English
speakers did so. Of the English infants tested, 8 out of 10 reached criterion.
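The group differences in these counts can be checked with a small exact test. The statistical test used in the original analysis is not given in this passage, so the sketch below substitutes a two-sided Fisher exact test as an assumption; only the counts (5 of 5, 3 of 10, and 8 of 10) come from the text, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows are groups; columns are (reached criterion, did not reach criterion).
p_english_adults_vs_infants = fisher_exact_two_sided(3, 7, 8, 2)
p_thompson_vs_english_adults = fisher_exact_two_sided(5, 0, 3, 7)
```

With groups this small, an exact test avoids the large-sample approximation that a chi-square test would rely on.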
Figure 2. Proportion of Thompson-speaking adults, English-speaking adults, and
infants from English-speaking homes reaching criterion on the Thompson glottalized
velar/uvular contrast (/k'i/-/q'i/). Groups shown: Thompson adults, English adults,
English infants; vertical axis scale: 0 to 100.
EXPERIMENT 2
The second experiment was designed to establish the developmental time
period in which the decline in speech discriminative ability occurred. In this
endeavor, subjects were tested on both the Thompson /k'i/-/q'i/ contrast and
one of the Hindi contrasts (/ṭa/-/ta/) employed in our earlier research
(Werker et al., 1981). Two contrasts were used to increase our confidence
in the generality of any results we might obtain.
Since we had already ascertained that by age 4, children appear to dis-
criminate nonnative contrasts as poorly as adults (Werker & Tees, 1983), we
decided to examine perception in children between the ages of 8 months and 4
years. After testing about 15 children of various ages, it became apparent that
important changes were occurring during the first year of life. At that time, we
narrowed our investigation to study cross-language perception in infants be-
tween 6 and 12 months of age. In addition to testing English infants, there was
an attempt to test infants being raised in homes in which either Hindi or
Thompson was primarily spoken. This was done to determine whether the
observed decline was a result of specific language experience, or whether it
could be explained by a general developmental decline in the ability to make
difficult perceptual distinctions.
Method
Subjects. In this study, data were collected from infants aged 8-10
months and 10-12 months, and were compared to the earlier data we had col-
lected on infants aged 6- to 8-months-old under identical testing conditions,
either in a previous study (Werker et al., 1981) in the case of the Hindi contrast
or from Experiment 1 of the present study in the case of the Thompson contrast.
One group of 8- to 10-month-old children (7 females and 5 males, ranging
in age from 8 months, 3 days, to 9 months, 10 days, with an average age of
TABLE 1
Infant Discrimination Performance on Two Non-English Speech Contrasts
Reached Criterion    6-8 months    8-10 months    10-12 months
that the 10- to 12-month-olds performed significantly less well than the two
groups of younger infants on both contrasts.
The number of trials to criterion was compared across contrasts and
across ages for those infants who could discriminate the sounds. It was
assumed that if the decline in the ability to discriminate nonnative contrasts ac-
cording to phonetic category occurred gradually between 6 and 12 months of
age, this gradual change would be evident in an increase in the number of trials
required to reach criterion. However, a 3 x 3 repeated measures analysis of
variance showed there to be no significant differences between age groups,
F = 1.57, p > .05, or between sound contrasts, F = 2.78, p > .05, making it
difficult to argue that there was a gradual increase across age in the number of
trials required to reach criterion.
To make sure that the decline around 10-12 months of age was not simply
a function of a general performance decline for difficult perceptual tasks at
this age, a few same-aged babies being raised in homes in which Hindi or
Thompson is primarily spoken were also tested. To date, we have only been
able to find 5 infants (3 Hindi and 2 Thompson) between 11 and 12 months
who meet this criterion, and only 3 of these infants who would condition in the
HT procedure (i.e., reach criterion on the /ba/-/da/ contrast). This drop-out rate
is similar to that found in the same-aged English infants. All three of these infants
reached discrimination criterion on their native contrast within 10 change
trials.
These findings show the decline in the ability to discriminate nonnative
phonetic contrasts occurs within the first year of life. That is, most of the
English infants tested could discriminate both non-English contrasts at 6 to 8
months of age. By 8 to 10 months a smaller percentage could discriminate the
contrasts, and by 10 to 12 months the infants were performing as poorly as the
young children and adults in Experiments 1 and 2. However, infants being
raised to speak Hindi or Thompson could still discriminate the relevant
contrasts at 11 to 12 months of age. The results provide strong support for the
supposition that specific linguistic experience is necessary to maintain phonetic
discrimination ability. Without such experience, there is a loss in this ability by
10 to 12 months of age.
EXPERIMENT 3
Figure 4. Proportion of infant subjects from three ages and various backgrounds
reaching criterion on Hindi and Thompson (Salish) contrasts. Panels: Cross-Sectional
Data and Longitudinal Data; vertical axis scale: 0 to 100; legend includes Salish
/k'i/ vs /q'i/.
GENERAL DISCUSSION
In summary, these experiments provide strong support for the claim that
young infants can discriminate many of the phonetic distinctions used across
natural languages without relevant experience, and that there is a decline in
this ability as a function of specific language experience. Furthermore, these
experiments provide data showing that this decline may be evident by the end
of the first year of life.
Morse, P. A. (1978). Infant speech perception: Origins, processes, and Alpha Centauri. In F. D.
Minifie & L. L. Lloyd (Eds.), Communicative and cognitive abilities: Early behavioral
assessment. Baltimore, MD: University Park Press.
Singh, S., & Black, J. W. (1966). Study of twenty-six intervocal consonants as spoken and recog-
nized by four language groups. Journal of the Acoustical Society of America, 39, 371-387.
Streeter, L. A. (1976). Language perception of two-month-old infants shows effects of both in-
nate mechanisms and experience. Nature, 259, 39-41.
Tees, R. C., & Werker, J. F. (1982, June). Perceptual flexibility: Recording of the ability to
discriminate nonnative speech sound. Paper presented at the meeting of the Canadian
Psychological Association, Montreal, Canada.
Thompson, L. C., & Kinkade, M. D. (in press). Linguistic relations and distributions. In W.
Sturtevant (Ed.), Handbook of North American Indians. Vol. 8, The northwest coast.
Trehub, S. (1976). The discrimination of foreign speech contrasts by infants and adults. Child
Development, 47, 466-472.
Werker, J. F., Gilbert, J. H. V., Humphrey, K., & Tees, R. C. (1981). Developmental aspects
of cross-language speech perception. Child Development, 52, 349-355.
Werker, J. F., & Tees, R. C. (1983). Developmental changes across childhood in the perception of
nonnative speech sounds. Canadian Journal of Psychology, 37, 278-286.