L2 Influence On L1 Speech in The Production of VOT: Tetsuo Harada

15th ICPhS Barcelona
L2 Influence on L1 Speech in the Production of VOT

Tetsuo Harada
University of Oregon, Eugene, OR 97403-1248, USA
E-mail: [email protected]
ISBN 1-876346-48-5 © 2003 UAB, pp. 1085-1088

ABSTRACT

This research investigates the effect of the acquisition of English voice onset time (VOT) by early Japanese English bilinguals on their production of the Japanese counterparts in L1. The data were collected from 6 monolingual English speakers, 6 monolingual Japanese speakers, and 6 early Japanese English bilinguals. Results show that the early bilinguals, who speak Japanese at home and English elsewhere, make a distinction between Japanese and English VOT regardless of the place of articulation. This means that the bilinguals have successfully established two different phonetic categories for Japanese and English; however, their Japanese VOT values have ended up being longer than those of the Japanese monolinguals, while their English categories are not statistically different from those of the English monolingual group. This may imply that, in order to maintain phonetic contrast in a common phonological space, it is not L2 sounds but L1 sounds that may deviate from L1 phonetic categories.

1. INTRODUCTION

One of the challenging issues in second language phonetics and phonology is how bilinguals produce an L2 sound. Recent studies show that even proficient bilinguals may not produce an L2 sound exactly as monolinguals do [e.g., 1]. This observation is consistent with Grosjean’s view that the “mixing” of the two languages occurs inevitably because a bilingual’s two language systems are both engaged [2]. This view implies that interference is bi-directional [1, 3, 4].

The effect of an L1 sound on the production of an L2 sound has been well documented [e.g., 5, 6, 7, 8, 9]. However, it is still uncertain how experience with L2 affects the production of sounds in L1. This research investigates the effect of learning English voice onset time (VOT) by early Japanese English bilinguals on their production of the Japanese counterparts in L1.

Both Japanese and English have voiced and voiceless stops. Homma reports that the mean Japanese VOT values of initial /p, t, k/ were 24 ms, 32 ms, and 45 ms, respectively, while those of medial /p, t, k/ were 7 ms, 16 ms, and 24 ms, respectively [10]. More recently, Han reports word-medial VOT values, based on 12 tokens for each of ten subjects: /p/ = 7.7 ms, /t/ = 12.0 ms, and /k/ = 17.9 ms [11]. As for English, Lisker and Abramson report means of VOT values for initial voiceless stops both in isolation (/p/ = 58 ms, /t/ = 70 ms, /k/ = 80 ms) and in sentences (/p/ = 28 ms, /t/ = 39 ms, /k/ = 43 ms) [12]. Judging from these available English data, it is quite predictable that the VOT values of Japanese /p, t, k/ produced by native English speakers (NE) are longer than those of native Japanese speakers (NJ). Although her data are limited to medial single stops, Han claims that NE’s VOT values for the Japanese voiceless stops were found to be longer than those of NJ (NE’s Japanese VOT /p/ = 15.6 ms, /t/ = 19.6 ms, /k/ = 30.8 ms; NJ’s Japanese VOT /p/ = 7.7 ms, /t/ = 12.0 ms, /k/ = 17.9 ms) [11]. But there are no available reports of Japanese English early bilinguals’ VOT values for Japanese voiceless stops that are relevant to this research study.

The authenticity of VOT is related to the age of learning [e.g., 13, 14]. Flege has developed the speech learning model (SLM) to explain the age-related limit on the ability to produce authentic L2 sounds [1, 5]. This model assumes that a perceptual phonetic category for an L2 sound initiates accurate production of the L2 sound and that “many L2 production errors have a perceptual motivation” [1]. In early stages of L2 acquisition, learners may substitute an L2 sound with the closest sound in L1, but as they get more exposure to L2, they may discern phonetic differences between the L2 sound and the L1 sound. This perceptual awareness may lead them to establish a new phonetic category and to correctly produce the target sound. The hypothesis in this model is that the likelihood that L2 learners will discern the phonetic differences depends on both the phonetic dissimilarity between the L1 and the target L2 sounds and the age of learning a second language. In other words, the greater the phonetic difference between L1 and L2 sounds is and the earlier L2 learners are exposed to a second language, the more likely they are to notice the phonetic difference, which will lead to authentic production of L2 sounds.

However, this model suggests that “even when categories are established for an L2 sound, the L2 sound might not be produced exactly as it is produced by native speakers.” Traditionally, the term “interference” has referred only to the influence of the L1 on the production of an L2, but Flege claims that “cross-language phonetic interference is bi-directional in nature” [1]. He found that bilinguals produced stops in their L1 with VOT values resembling those of stops in L2. This example contradicts the traditional view that the dominant language may remain resistant to influence from the non-dominant language, at least for certain phonetic dimensions [e.g., 15].

The primary goal of the present study is to test the hypothesis in Flege’s SLM that cross-language phonetic interference is bi-directional in nature, specifically among very proficient Japanese English bilinguals, who speak Japanese at home and English elsewhere. The study addresses the following research questions: a) is there any bi-directional interference in the production of Japanese and English VOT by early bilinguals; and b) is there any variation of bi-directional interference in the bilinguals’ production of Japanese and English VOT across places of articulation.

2. METHOD

2.1 Subjects

The data for the study were collected from 6 Japanese monolinguals, 6 early Japanese English bilinguals, and 6 English monolinguals. The Japanese monolingual speakers participating in this study had never lived in an English-speaking country and had never used English in their daily life. They were not purely monolingual in the strict sense, because almost every student in Japan is required to study English as a foreign language in junior and senior high schools, but the subjects in this study did not have a functional command of English. The English monolingual speakers were undergraduate students at a university on the West Coast of the United States. They had never lived in a foreign country for more than six months and had never spoken any language other than English. The Japanese English bilinguals were from Japanese-speaking families in the United States and spoke Japanese at home and English elsewhere. Most of these subjects had been exposed to Japanese at a heritage language school or at a school in Japan for a certain period of time.

2.2 Procedures

First the English data for the early bilinguals were collected, and then the Japanese data from the same bilinguals were elicited to compare their English and Japanese VOT. Each session consisted of a 20-minute face-to-face pronunciation elicitation test administered in a quiet room. During each session the informant was shown pictures of objects which had been designed to elicit words beginning with the target voiceless stop consonants. The data from the monolingual speakers were collected using the same method.

2.3 Materials

The words were selected taking into consideration the following criteria: (1) the following vowel quality ([a] for Japanese words or [æ] for English words), (2) disyllabic words, and (3) the same accent or stress pattern (HL for Japanese VOT data, and stress on the first syllable for English VOT data).

Following a picture cue, the subjects were asked to say a word, inserting it in the Japanese carrier phrase sore wa _____ desu (= That is _____) or in the English carrier phrase I see a _____ in the picture. The subjects were asked to repeat each word in the VOT corpus three times. The corpus was as follows:

Japanese VOT corpus
/p/             /t/              /k/
papa (papa)     tako (octopus)   kame (turtle)
pari (Paris)    tane (seed)      kata (shoulder)
                tate (length)    kasa (umbrella)

English VOT corpus
/p/             /t/              /k/
panda           tablet           carrot
parrot          tadpole          camel
package         taxi             candy

2.4 Data Measurement

Each of the experimental words occurred three times on each list except for papa in the Japanese VOT corpus, which occurred six times. This means that the total number of tokens for the Japanese VOT corpus was 324 (7 words x 3 repetitions x 12 subjects + 1 word x 6 repetitions x 12 subjects). For the English corpus there were also 324 tokens (9 words x 3 repetitions x 12 subjects). Average VOT values for each place of articulation for each subject were based on 9 observations.

Macquirer, a speech analysis program, was used for data measurement. The VOT of initial stops was measured to the nearest millisecond from the beginning of the release burst to the onset of voicing energy in F2. The waveform was used as secondary information: VOT was identified in the waveforms from the beginning of the release burst to the first positive peak in the periodic portion of the following vowel.

3. RESULTS

Figure 1 shows the mean VOT values for Japanese voiceless stops produced by the Japanese monolinguals (JM) and the early Japanese English bilinguals (JB), and the mean VOT values for the English counterparts produced by the same bilinguals (EB) and the English monolinguals (EM).

[Figure 1: plot of Mean VOT Values (ms), 0-90, by Group (JM, JB, EB, EM)]

Figure 1 The mean VOT values for Japanese and English voiceless stops. The error bars enclose +/- one standard error.
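The token and observation counts given in Section 2.4, which underlie the means plotted in Figure 1, can be checked with a short calculation. The sketch below is only illustrative: the variable names are our own, the helper mean_and_standard_error assumes that a standard error is computed from per-subject mean VOT values within a group (the paper does not state this), and the example values are hypothetical, not data from the study.

```python
from math import sqrt
from statistics import mean, stdev

# Each corpus was produced by 12 subjects (6 monolinguals + 6 bilinguals).
SUBJECTS_PER_CORPUS = 12
PLACES_OF_ARTICULATION = 3  # /p/, /t/, /k/

# Japanese corpus: 7 words x 3 repetitions, plus "papa" x 6 repetitions.
japanese_tokens_per_subject = 7 * 3 + 1 * 6
japanese_tokens = japanese_tokens_per_subject * SUBJECTS_PER_CORPUS

# English corpus: 9 words x 3 repetitions.
english_tokens_per_subject = 9 * 3
english_tokens = english_tokens_per_subject * SUBJECTS_PER_CORPUS

# Observations per place of articulation for one subject.
observations_per_place = japanese_tokens_per_subject // PLACES_OF_ARTICULATION


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for a list of values.

    Assumption: the standard error is the sample standard deviation
    divided by the square root of the number of values.
    """
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical per-subject mean VOT values in ms (NOT data from the study).
example_group = [28.0, 31.0, 25.0, 34.0, 30.0, 32.0]
group_mean, group_se = mean_and_standard_error(example_group)

print("Japanese tokens:", japanese_tokens)
print("English tokens:", english_tokens)
print("Observations per place per subject:", observations_per_place)
print(f"Example group mean = {group_mean:.1f} ms, SE = {group_se:.2f} ms")
```

Both corpora come out at 324 tokens, and each subject contributes 9 observations per place of articulation, in agreement with the figures reported in Section 2.4.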

As expected, the monolingual Japanese speakers’ voiceless stops had substantially shorter VOT values than those of the early bilinguals’ Japanese (30 ms vs 47 ms). Also, all the bilinguals produced Japanese voiceless stops with shorter VOT values than their English voiceless stops (47 ms vs 69 ms), while their English voiceless stops had shorter VOT values than those of the English monolinguals (69 ms vs 79 ms).

Figure 2 shows the differences in VOT values across places of articulation for all of the subjects.

[Figure 2: plot of Mean VOT Values (ms), 10-100, by Place of Articulation (p, t, k) for the groups JM, JB, EB, and EM]

Figure 2 The differences in VOT values across places of articulation for all of the subjects. The error bars enclose +/- one standard error.

The mean VOT values obtained for each of the subjects were submitted to a (4) Group and (3) Place of Articulation ANOVA, which yielded significant main effects of Group and Place [Group, F (3, 60) = 40.702, p < .0001; Place, F (2, 60) = 9.171, p = .0003]. The duration pattern of VOT relative to the place of articulation is similar across the four groups: VOT is longest for /k/ (p < .05) and there is no significant difference between /p/ and /t/ (p = .2555). In addition, since there was no interaction between Group and Place [Group * Place, F (6, 60) = .252, p = .9566], the mean Japanese VOT values produced by the bilingual subjects were greater than those of the monolingual Japanese subjects regardless of the place of articulation. However, though it is not statistically significant, there is a trend for the bilinguals’ mean VOT values for /t/ in both Japanese and English to deviate more from those of the monolinguals. Since the group differences did not vary significantly across places of articulation, the mean VOT values obtained for each of the subjects were submitted to a (4) Group one-way ANOVA. The analysis yielded a significant group main effect [F (3, 68) = 34.660, p < .0001]. Scheffe post hoc tests revealed that the bilinguals produced Japanese voiceless stops with significantly longer VOT values than the monolingual Japanese speakers, but they produced them with significantly shorter VOT values than their English VOT (p < .0018). This suggests that the bilinguals are making a phonetic distinction in VOT values between Japanese and English. However, the mean VOT values for Japanese voiceless stops produced by the bilinguals are significantly different from those of the Japanese monolingual group, while those of English voiceless stops produced by them are not significantly different from those of the English monolingual group (p = .2830).

4. DISCUSSION

This study suggests that the early bilinguals, who speak Japanese at home and English elsewhere, make a distinction between the Japanese and English VOT values regardless of the place of articulation. This means that the bilinguals have successfully established two different phonetic categories for Japanese and English VOT; however, their Japanese VOT categories are not the same as those of the Japanese monolingual speakers. But their English categories are not statistically different from those of the English monolingual group.

The success in establishing a new category for English VOT shows that the bilinguals were able to notice the slight phonetic difference in VOT between English and Japanese. This finding supports Flege’s hypothesis that bilinguals can establish a new phonetic category for an L2 sound only when they discern the phonetic difference between the L2 sound and the closest L1 sound [1]. However, it is worth pointing out that the bilinguals’ Japanese VOT values are affected by those of English, their second language, and have ended up being longer than those of the Japanese monolinguals. Therefore, we can argue that there is L2 interference in the production of Japanese VOT. But the finding that the bilinguals’ English categories do not differ from those of the English monolinguals implies that there may be no L1 interference in the production of English VOT.

The finding that the Japanese categories of the early bilinguals deviate from those of L1 Japanese speakers may be accounted for by the maintenance of phonetic contrast in a common phonological space [1, Susan Guion, personal communication]. VOT is represented as either negative VOT values standing for “voicing lead (onset of glottal vibration prior to articulatory release)” or positive VOT values meaning “voicing lag (onset of glottal vibration following release)” [14]. For example, in utterance-initial voiced stops in Japanese, the voicing usually begins before the release of the stop closure, with VOT values of -35 to 0 ms on average [16], while in utterance-initial voiced stops in English, it more frequently begins shortly after the release of the closure. Utterance-initial voiceless stops in Japanese and English are identified in terms of VOT: in Japanese voiceless stops, the voicing starts shortly after the release of the closure, whereas in English it starts 60-80 ms after the release of the closure. In other words, the English voiced stops fall roughly in the same acoustic space as the Japanese voiceless stops. This can be illustrated in the following diagram:

English stops
                          /b/ /d/ /g/       /p/ /t/ /k/
    ___________________ 0 ______________________________
Japanese stops
    /b/ /d/ /g/           /p/ /t/ /k/
    ___________________ 0 ______________________________
    (lead voicing)        (short lag)       (long lag)

Figure 3 Voice onset time in English and Japanese

The conflict of English voiced stops with Japanese voiceless stops in the acoustic space may have caused the bilinguals’ Japanese voiceless stops to have longer VOT values than those of the monolinguals so that they could maintain the phonetic contrast between the two stops. This may imply that, in order to maintain phonetic contrast in a common phonological space, it is not L2 sounds but L1 sounds that may deviate from L1 phonetic categories. Although this will require additional studies because VOT data for voiced stops were not collected in this study, the conflict of an L2 phonetic category with an L1 phonetic category may mean that a bilingual’s L1 sound is not produced in the same way as it is produced by monolingual speakers.

ACKNOWLEDGMENTS

This research was supported by the Freeman Foundation Faculty Research Fellowship, and the Junior Professorship Development Award, both at the University of Oregon.

REFERENCES

[1] J. E. Flege, “Second-language speech learning: Theory, findings, and problems,” in Speech Perception and Linguistic Experience: Issues in Cross-Language Research, W. Strange, Ed., pp. 233-277. Timonium MD: York Press, 1995.
[2] F. Grosjean, “Neurolinguists, beware! The bilingual is not two monolinguals in one person,” Brain and Language, vol. 36, pp. 3-15, 1989.
[3] O. Bohn and J. E. Flege, “Perceptual switching in Spanish/English bilinguals,” Journal of Phonetics, vol. 21, pp. 267-290, 1993.
[4] J. E. Flege, “The production of ‘new’ and ‘similar’ phones in a foreign language: Evidence for the effect of equivalence classification,” Journal of Phonetics, vol. 15, pp. 47-65, 1987.
[5] J. E. Flege, “Speech learning in a second language,” in Phonological Development: Models, Research, Implications, C. A. Ferguson, L. Menn and C. Stoel-Gammon, Eds., pp. 565-604. Timonium MD: York Press, 1992.
[6] J. E. Flege, M. J. Munro, and I. R. A. MacKay, “Effects of age of second-language learning on the production of English consonants,” Speech Communication, vol. 16, pp. 1-26, 1995.
[7] S. Oyama, “A sensitive period for the acquisition of a nonnative phonological system,” Journal of Psycholinguistic Research, vol. 5, pp. 261-285, 1976.
[8] M. S. Patkowski, “The critical age hypothesis and interlanguage phonology,” in First and Second Language Phonology, M. S. Yavas, Ed., pp. 205-221. San Diego CA: Singular Publishing Group, 1994.
[9] T. Scovel, A Time to Speak: A Psycholinguistic Inquiry into the Critical Period for Human Speech, Cambridge MA: Newbury House, 1988.
[10] Y. Homma, “Durational relationship between Japanese stops and vowels,” Journal of Phonetics, vol. 9, pp. 273-281, 1981.
[11] M. S. Han, “The timing control of geminate and single stop consonants in Japanese: A challenge for nonnative speakers,” Phonetica, vol. 49, pp. 102-127, 1992.
[12] L. Lisker and A. Abramson, “A cross-language study of voicing in initial stops: Acoustical measurements,” Word, vol. 20, pp. 384-422, 1964.
[13] J. E. Flege, “Age of learning affects the authenticity of voice onset time (VOT) in stop consonants produced in a second language,” Journal of the Acoustical Society of America, vol. 89, pp. 395-411, 1991.
[14] L. Williams, “Phonetic variation as a function of second language learning,” in Child Phonology 2: Perception, G. H. Yeni-Komshian, J. F. Kavanagh, and C. A. Ferguson, Eds., pp. 185-215. New York: Academic Press, 1980.
[15] M. Mack, “Consonant and vowel perception and production: Early English-French bilinguals and English monolinguals,” Perception and Psychophysics, vol. 46, pp. 187-200, 1989.
[16] Y. Homma, “Voice onset time in Japanese stops,” Onsei Gakkai Kaiho, vol. 163, pp. 7-9, 1980.