Changing Contexts and Shifting Paradigms in Pronunciation Teaching

Iowa State University

The history of pronunciation in English language teaching is a study
in extremes. Some approaches to teaching, such as the reformed
method and audiolingualism, elevated pronunciation to a pinnacle of
importance, while other approaches, such as the cognitive movement and
early communicative language teaching, mostly ignored pronunciation
(Celce-Murcia, Brinton, & Goodwin, 1996). Currently, it seems clear that
pronunciation deserves neither fate, either to be unfairly elevated to the
central skill in language learning or banished to irrelevance.
To a large extent, pronunciation's importance has always been determined by ideology and intuition rather than research. Teachers have
intuitively decided which features have the greatest effect on clarity and
which are learnable in a classroom setting. Derwing and Munro (this
issue), recognizing this tendency toward teacher intuition in determining classroom priorities, make an appeal for a carefully formulated
research agenda to define how particular features actually affect speaker
intelligibility. That such an appeal is needed suggests, in Derwing and
Munro's words, that pronunciation instructional materials and practices
are still heavily influenced by commonsense intuitive notions and that
such intuitions cannot resolve many of the critical questions that face
classroom instructors (p. 380).
During the past 25 years, pronunciation teachers have emphasized
suprasegmentals rather than segmentals in promoting intelligibility
(Avery & Ehrlich, 1992; Morley, 1991), despite a paucity of research
evidence for this belief (Hahn, 2004). Recent carefully designed studies
have shown some support for the superiority of suprasegmental instruction in ESL contexts (e.g., Derwing & Rossiter, 2003). Also, wider
availability of software that makes suprasegmentals' discourse functions
more accessible to teachers and learners will encourage work with suprasegmentals (Chun, this issue; Pickering, this issue). However, the importance of suprasegmentals for communication in English as an international language (EIL) is uncertain (Jenkins, 2000; Levis, 1999). It is
also by no means clear that all suprasegmentals are equally learnable.
Pennington and Ellis (2000), for example, found that although some
elements of intonation, such as nuclear stress, appear to be learnable,
other elements, such as pitch movement marking boundaries and the
intonation of sentence tags, are not. Even for those who advocate the
centrality of suprasegmentals, a more nuanced approach is clearly needed.

TESOL QUARTERLY Vol. 39, No. 3, September 2005

More fundamentally, pronunciation research and pedagogy have long
been influenced by two contradictory principles, the nativeness principle
and the intelligibility principle. The nativeness principle holds that it is
both possible and desirable to achieve native-like pronunciation in a
foreign language. The nativeness principle was the dominant paradigm
in pronunciation teaching before the 1960s, but its influence was rapidly
diminished by research showing that nativeness in pronunciation appeared to be biologically conditioned to occur before adulthood
(Lenneberg, 1967; Scovel, 1995), leading to the logical conclusion that
aiming for nativeness was an unrealistic burden for both teacher and
learner. Despite extensive ongoing research into a critical period for
acquiring pronunciation, in practice very few adult learners actually
achieve native-like pronunciation in a foreign language. Factors such as
motivation, amount of first language (L1) use, and pronunciation
training are positively correlated with more native-like pronunciation,
but none of these other factors seems to overcome the effects of age
(Flege & Frieda, 1995; Moyer, 1999).
Although an overwhelming amount of evidence argues against the
nativeness principle, it still affects pronunciation teaching practices.
Popularly, the principle drives the accent reduction industry, which
implicitly promises learners that the right combination of motivation
and special techniques can eliminate a foreign accent. In language
classrooms, it is common for learners to want to get rid of their accents
(as one of my recent students expressed it). Many teachers, especially
those unfamiliar with pronunciation research, may see the rare learner
who achieves a native-like accent as an achievable ideal, not an exception.
The second principle is the intelligibility principle. It holds that
learners simply need to be understandable. The intelligibility principle
recognizes that communication can be remarkably successful when
foreign accents are noticeable or even strong, that there is no clear
correlation between accent and understanding (Munro & Derwing,
1999), and that certain types of pronunciation errors may have a
disproportionate role in impairing comprehensibility.
The intelligibility principle implies that different features have different effects on understanding. Instruction should focus on those features
that are most helpful for understanding and should deemphasize those
that are relatively unhelpful. This assumption of differential importance
is evident in most intelligibility-based arguments for pronunciation
instruction. For example, the longstanding belief that instruction should
focus on suprasegmentals (e.g., Avery & Ehrlich, 1992) assumes that a
focus on these features leads to better and quicker speaker intelligibility
than a focus on segmentals.
Jenkins's (2000) lingua franca core (LFC), a proposal for intelligibility-based pronunciation instruction, shares this assumption about intelligibility, albeit with an important difference in communicative context.
Jenkins argues that her approach supports EIL (also called ELF, or
English as a lingua franca) communication, but her recommendations
have caused pronunciation teachers in all contexts to revisit their beliefs
about intelligibility and the primacy of suprasegmentals. Dauer (this
issue) provides an ESL response to the LFC, both praising its renewed
emphasis on segmentals and arguing that its de-emphasis on suprasegmentals will not serve learners well, given that the boundaries between
ESL and EIL communication are more fluid than the LFC suggests.
The LFC also raises issues for EFL contexts, where its recommendations would seem to be most at home. However, because students in EFL
classrooms share the same L1, they converge toward second language
(L2) pronunciation that is heavily influenced by the L1. Thus, the
documented tendency of different L1 speakers to converge toward more
internationally intelligible pronunciation (Jenkins, 2000) does not seem
to operate in EFL contexts. Walker (this issue) describes a technique
used successfully to help learners who share the same L1 converge
toward pronunciation that will be more intelligible in EIL communication.
Despite the current dominance of intelligibility as the goal of pronunciation teaching, both the nativeness and intelligibility principles continue to influence pronunciation in the language curriculum, both in
how they relate to communicative context and in the relationship of
pronunciation to identity.


Most currently published pronunciation materials are consistent with
the nativeness principle. These materials hold that prestige native
speaker versions of English are the proper models for pronunciation
learning. Although most native speakers of English speak neither General American nor Received Pronunciation (RP), published materials
rely on these accents for examples, giving a skewed view of pronunciation that may not serve learners' communicative needs. Deterding (this
issue) describes how Singapore English speakers who are used to RP
found Estuary English speech, which they are more likely to encounter
in England, to be often unintelligible. Deterding argues that pedagogical
reliance on prestige models is counterproductive for learners' ability to
understand normal speech.
The intelligibility principle carries a sensitivity to context. Intelligibility assumes both a listener and a speaker, and both are essential elements
for communication. Levis (in press) describes the context sensitivity of
intelligibility in terms of a native speaking–nonnative speaking (NS–NNS)
listener-speaker matrix for assessment (Figure 1). The four quadrants reflect different aspects of intelligibility and suggest different
priorities for language teaching.
Quadrant A has NS speakers and listeners and is usually assumed to be
the standard for successful communication. This assumption implies that
the speakers' varieties are mutually intelligible, although it is not clear
just how mutually intelligible native varieties actually are. Research has
shown that understanding in NS communication is often more complex
than one would expect (e.g., Cutler, Dahan, & van Donselaar, 1997).
Quadrant B, with NS speakers and NNS listeners, is a normal configuration for language teaching in an ESL context. It is also the norm for most
language teaching beyond ESL contexts, in which print and audio
materials are based on NS models. However, the ways in which NNS
listeners actually decode and interpret NS speech are not completely clear.
Quadrant C reflects most current research on intelligibility, where NNS
speakers communicate with NS listeners. This model assumes that NSs
already have the ability to communicate and makes NNSs responsible for
communicative success. Quadrant D, where both speakers and listeners
are NNSs, reflects EIL communication, in which NNSs use English as a
lingua franca to communicate with each other.

Figure 1. Speaker-Listener Intelligibility Matrix (Levis, in press)

                        LISTENER
SPEAKER                 Native Speaker    Nonnative Speaker
Native Speaker          A (NS–NS)         B (NS–NNS)
Nonnative Speaker       C (NNS–NS)        D (NNS–NNS)

Field (this issue) reports on research in which NNS listeners interpret
misstressed words, some with changes in vowel quality. This study shows
that NNS listeners behave somewhat differently from NSs, especially with
regard to changes in vowel quality, leading Field to suggest that unstressed syllables may often be unimportant for intelligibility, a conclusion not so different from Jenkins's (2000).
In another study in this issue, Riney, Takagi, and Inutsuka show how
Japanese and American listeners judge degree of accent differently.
American listeners used primarily segmental clues (/l/ and /ɹ/) to
determine strength of accent, but Japanese listeners appeared to use
suprasegmentals to determine strength of accent. This finding suggests
that emphasizing suprasegmentals in teaching NNSs does little to
decrease NS listeners' perceptions of NNSs' accent, and that pronunciation teachers need to think more about how learners perceive speech
rather than relying solely on NS perceptions.
In reality, the two-by-two matrix in Figure 1 is simplistic, reflecting a
view of English that divides the world into native and nonnative speakers.
Kachru's three circles of Englishes (Kachru, 1986) add a third type of
English user to the matrix, the speaker of a nativized variety. Thus, the
question of intelligibility should be addressed using a three-by-three
matrix (Figure 2).

Figure 2. World Englishes Speaker-Listener Intelligibility Matrix
[Three-by-three matrix; speakers and listeners are each drawn from the Inner Circle, Outer Circle, and Expanding Circle]

The four italicized corners of the matrix reflect the same communicative possibilities shown in Figure 1, but the bolded sections of the matrix
are relatively unexplored. Both Quadrant 1 and Quadrant 2 include
inner-circle and outer-circle interlocutors, and in both cases, the standardized nature of inner-circle Englishes may shift the perceived responsibility for being intelligible to outer-circle interlocutors (Bamgbose,
1998). At this juncture, the communicative context becomes crucial. In
U.S. university settings, for example, graduate teaching assistants from
outer-circle countries such as India are routinely tested for spoken
English proficiency, even when their English proficiency is otherwise
indistinguishable from inner-circle graduate students. It seems evident
that such testing is conducted because outer-circle speakers have unfamiliar accents, not a lower proficiency in English. In an outer-circle
setting, however, an inner-circle interlocutor is more likely to recognize
the validity of the outer-circle accent.
Quadrant 3, in which outer-circle speakers are interlocutors, likely has
the same kind of variation in intelligibility as NS–NS communication.
Outer-circle speakers will likely have the same difficulties with unfamiliar
accents and registers that inner-circle speakers have with unfamiliar
Quadrants 4 and 5 include outer-circle and expanding-circle interlocutors. These interactions often occur in contexts without inner-circle
speakers. As a result, pronunciation issues may cause breakdowns in
communication similar to those described by Jenkins (2000), who found
that pronunciation caused a loss of intelligibility in NNS–NNS communication. It would be surprising, however, if the two quadrants had the
same bottom-up processing difficulties discussed by Jenkins. In general,
the proficiency of outer-circle speakers is more like that of inner-circle
speakers than that of expanding-circle speakers, for whom English is a
foreign language. Thus, an outer-circle listener and an expanding-circle
speaker, as in Quadrant 5, are more likely to negotiate intelligibility
using context or top-down knowledge of English than are an expanding-circle listener and an outer-circle speaker, as in Quadrant 4, where
bottom-up processing constraints are likely to be more severe.

Both Figure 1 and Figure 2 have a weakness: In judgments of intelligibility, they ignore, on the positive side, the role of language identity,
and on the negative side, language attitudes. Accent is influenced not
only by biological timetables but also by sociolinguistic realities. In other
words, speakers speak the way they do because of the social groups they
belong to or desire to belong to. The role of identity in accent is perhaps
as strong as the biological constraints. Accent, along with other markers
of dialect, is an essential marker of social belonging.
The pull of identity is also strong for NNSs of a language. Jenkins
(2000) describes how same-L1 NNS pairs pronounce English with a
greater number of deviations than do pairs of speakers from different
L1s. This tendency toward convergence, even when it means speaking
English with more deviant pronunciation, indicates the importance of
identity. The addition of biological constraints to L2 pronunciation
makes the acquisition of a prestige variety of English especially difficult.
Gatbonton, Trofimovich, and Magid (this issue) show how ethnic group
affiliation is a critical factor in pronunciation accuracy. They argue that
inaccuracy may reflect neither lack of ability nor interest but rather
social pressure from home communities or other students who speak
their L1. In fact, speakers who are too accurate risk being seen as disloyal
to their primary ethnic group.
The tension between accent and identity is perhaps strongest for
teachers from outside inner-circle countries. As teachers, their accents
may be a matter of pride (Sifakis & Sougari, this issue) or uneasiness
because NS pronunciation is seen as the yardstick for intelligibility
(Golombek & Jordan, this issue, p. 520), but it is never a neutral issue.
Jenkins (this issue) describes NNS teachers' ambivalence when discussing accent. Teachers exploring ELF pronunciation goals approve of
them for others, but they often want to match their own pronunciation
to NS norms. Jenkins says that despite verbal assent to ELF goals, most
[teachers] nevertheless continued referring to NNS differences from RP
or GA as incorrect forms rather than ELF variants, as if they could
accept ELF in theory but not in practice (p. 540). Sifakis and Sougari
(this issue) find some willingness among Greek teachers to consider ELF
goals, although the teachers in their study strongly adhere to inner-circle
pronunciation norms. Progress in adopting ELF goals, suggest the
authors, can only be achieved by explicit in-service and preservice
education on how English functions in the teachers' immediate geopolitical environment.
Accent is also intertwined with race in determining professional
identity. Golombek and Jordan (this issue) report on two Taiwanese
teachers of English studying in a U.S.-based TESL master's program.
Both teachers claim that NS teachers in Taiwan are judged as much on
appearance as on language. In fact, white teachers are often preferred,
so that native speakers of Spanish and French are also considered to be
speakers of American English because they look the part. Golombek and
Jordan call for teacher education programs to help NNS students
imagine alternative identities (p. 513) for themselves, identities that go
beyond restrictive notions of pronunciation intelligibility and employ a
variety of factors to establish professional legitimacy.


These examples suggest how identity is complicated not only by the
desire to belong, but by the attitudes and prejudices of others. If the
positive aspect of identity is the desire to belong, the negative is the
desire to exclude. Mugglestone (1995) traces the rise of the prestige
accent in British English, in which RP became the mark of those who
went to the right schools and therefore the mark of socioeconomic
power and status, which also made it a gate-keeping tool that could be
used to exclude. Lippi-Green (1997) similarly discusses how accent is
used in American English to discriminate against speakers of nonprestige
varieties. Using language in general and accent in particular to discriminate has been called the last publicly acceptable form of discrimination.
Language thus comes to be the acceptable substitute for discrimination
based on other qualities such as racial, ethnic, and class differences
(Milroy & Milroy, 1985; Wolfram & Schilling-Estes, 1998).

Currently, pronunciation theory, research, and practice are in transition. Widely accepted assumptions such as the primacy of suprasegmentals,
the superiority of inner-circle models, and the need for native instructors
have been rightly challenged. ESOL professionals are recognizing that
judgments of intelligibility involve nonlinguistic as well as linguistic
factors, and that even completely intelligible pronunciation may be
evaluated negatively. Decisions about adjusting accent are not value free
because accents are intimately tied to speaker identity and group
membership. Increasing evidence also shows that the context of instruction directly affects how pronunciation should be addressed. Users of
English who interact professionally in inner-circle contexts may need to
adjust to an inner-circle model, but English users in the outer or
expanding circle may find that inner-circle models are inappropriate or
unnecessary (Jenkins, 2000). These findings indicate that teaching
pronunciation is only partially a pedagogical decision, and that old
assumptions are ill-suited to a new reality.
John M. Levis is an associate professor of TESOL and applied linguistics at Iowa State
University, Ames, Iowa, USA. His research interests include the intelligibility of
spoken language, intonation, English pronunciation, and varieties of English. He has
published articles in numerous prestigious journals.



