N. J. Smelser, P. B. Baltes - International Encyclopedia of Social & Behavioral Sciences-Pergamon (2001)


C

Campaigning: Political

At the microlevel, political campaigning involves an individual candidate attempting to win political office. At the macrolevel, political parties seek to gain seats in government for their partisans and ultimately to gain power or at least to share power in the relevant governmental arena. At each level decisions must be made about candidacies, organizations must be built, messages must be refined, strategies must be set, and the public must be convinced to support those waging a particular campaign. Only certain aspects of political campaigns are viewed by the public. Frequently, the actions taken away from public view are more determinative of success and failure.

1. The Choice of Candidates

Decisions about who will run for office involve decisions by potential candidates as to whether or not they will run and decisions by political parties, acting under a variety of rules, concerning who their candidates will be.

1.1 The Decision to Run

Individuals who run for political office must exhibit political ambition beyond the desire to simply serve their community. Campaigning for and serving in political office involve personal costs and public exposure that deter many from seeking office. However, even those ambitious for elective office must make decisions about what offices to seek and when to seek those offices (Jacobson and Kernell 1981). These decisions are based on a complex, often-implicit cost-benefit analysis. Potential candidates weigh the costs and benefits of both winning and losing. Their decision calculus can be summarized:

$$p_r = p_w (B_w - C_w) + p_l (B_l - C_l)$$

where $p_r$ = probability of running; $p_w$ = probability of winning; $p_l$ = probability of losing; $B_w$ = benefit of winning; $C_w$ = cost of winning; $B_l$ = benefit of losing; $C_l$ = cost of losing.

The difficulty that potential candidates face in making these decisions relates to the uncertainty regarding virtually all of the relevant variables (Maisel 1986, Kazee 1994). Generally the rather obvious benefits far outweigh the costs of winning for those with the requisite political ambition; but costs must be weighed: loss of privacy, loss of time with family, frequently loss of income. Similarly, the evident costs of losing are almost always greater than the benefits; but some benefits do accrue to losers: making a name for oneself, building up good will with party officials. Beyond these broad generalizations, little is known about how specific candidates make individual decisions given the dazzling array of difficult-to-estimate variables (Maisel and Stone 1997).

1.2 Parties’ Choices of Candidates

The ways in which political parties choose their candidates for office vary significantly from political system to political system and from party to party. The key variable in this aspect of the process relates to the extent to which party organization controls places on the ballot contrasted with voters acting independently. In the former, the process is often described as candidate recruitment; in the latter, candidate emergence.

At one extreme are those systems in which political parties slate their candidates for office, ranking candidates or placing them in safe constituencies. Most parties operating in electoral systems with proportional representation follow this procedure. It is also followed in systems with strong party systems, in which citizens vote more for parties than they do for individual candidates. Many European systems fit this description (Hix and Lord 1997).

At the other extreme are systems, of which the United States is prototypical, in which voters choose party candidates in primary elections. The shape of the electorate in primary elections varies considerably from polity to polity. In theory, primary elections are open only to members of one political party or another, but in actual practice how membership is defined varies according to local rules and can be quite restrictive or totally open. In either case, voters in the primary elections determine the party nominee, even if that nominee would not have been the choice of party officials (Maisel 1999).

Many systems are hybrids, with party officials and the voters sharing power. The US presidential nominating system stands as perhaps the most complex means of choosing party candidates. Delegates elected to national party conventions make the actual nominations. Some delegates are selected because of their positions within the party or as elected officials with party affiliations; others are chosen at party meetings (or caucuses) held in local communities; still others are chosen by the voters in primary elections, based on lists of delegate candidates approved by the presidential campaigns they seek to support (Polsby and Wildavsky 2000, Wayne 2000).

2. The Campaign Organization

In polities with strong party systems, campaign organizations tend to be party-centered. In these cases, the voters cast their ballots for candidates as representatives of a party. The party defines the campaign message and communicates it to the voters. The party raises the money to fund the campaign, does the polling necessary to understand what the electorate is thinking, and structures the campaign strategy and tactics. Typically full-time, year-round party workers perform all these functions or they are contracted out to professionals.

Where strong party systems do not exist, individual candidates must build their own organizations. In these cases, candidates are most concerned with their own elections, not those of others who share their party label. Candidates choose campaign managers whose sole goal is victory for that one candidate. They build personal organizations to handle the important functions of campaigning: doing research to develop issue positions and to counter opponents’ proposals; polling the public to ascertain perceptions of the candidate and reactions to strategies; writing speeches; communicating with the press; developing and placing advertisements; sending out direct mail appeals, telephoning and leafleting the district; scheduling and preparing for candidate appearances; and perhaps most important, raising money. For larger and more expensive campaigns, all of these functions are performed by paid workers or volunteers in a candidate’s own organization, or bought with campaign contributions.

Recent campaign organizations have been less able to handle all of these functions, and two means of coping have emerged. For campaigns in smaller constituencies, the candidate often performs all of these functions himself or herself. They are done more or less well depending on the skill of the candidate, the time available, the level of competition, and the sophistication that is expected in campaigning for office (Maisel 1999).

Even those with relatively high-budget campaigns often find that they cannot hire staff to perform all of the necessary campaign activities. These candidates frequently turn to paid political consultants, experts from outside the campaign who specialize in certain aspects of the campaign process and sell their services to many campaigns during any election cycle. Firms specialize in media advertising, polling, direct mail, telephone banks, fundraising, and virtually every other aspect of campaigning; other firms are structured to take over all or many aspects of a campaign. Some work in particular geographic areas. Others are ready to sell their services anywhere throughout an entire nation or even internationally.

3. Defining Messages and Setting Strategy

A political campaign requires getting a message to voters. To do that, a candidate must define what the message is and must devise a strategy to reach the voters who will be swayed by that message. Again, how this is to be done varies with whether the campaign in question is party-centered or candidate-centered.

For party-centered campaigns, the message is the party platform, and the voters to be reached are those in the party’s core constituency and others to whom the message might appeal in a particular year. Neither the message nor the strategy varies much from year to year, though marginal changes are made as the context changes, and these marginal changes might well spell the difference between victory and defeat.

For candidate-centered campaigns these decisions are among the most crucial. Candidates must decide whether to stick with the party message or to devise one of their own; they must decide what particular issues will play best to their voters; and they must figure out who precisely those voters are. Campaign strategists divide the electorate in a variety of ways: by demographic characteristics, by economic interests, by geographic locations. The goal is to use research and polling data to determine what appeals will work with which audiences and to have the candidate address those audiences appropriately. For incumbents seeking re-election, the strategic consideration often involves relying upon strengths demonstrated earlier. For challengers, the strategy must be to find an opponent’s weakness and to emphasize one’s own strengths.

Effective campaign strategies remain more art than science. Professional political campaigners know what has worked in the past. But many strategists learn those lessons. The successful strategists are those who can see how to change the approach as the context changes: as is often said in the military, the aim is ‘not to refight the last war but to plan for the next one.’

4. Communicating with the Voters

To be effective, a political campaign must be able to send the message it has devised to the audience it has targeted. Campaign messages are communicated in two ways, through free media and paid media.


4.1 Free Media Coverage

Political campaigns seek free media coverage in whatever form they can, whenever they can. Candidates want to be reported on in newspapers and newsmagazines; they want to grant interviews on radio; they want their events covered on television. Free media coverage is advantageous not only because there is no cost but also because the viewer sees news coverage of a candidate as giving that candidate a sense of legitimacy; the candidate is seen as part of the day’s news, not as part of a paid advertisement. However, free media is uncontrolled exposure. Campaign strategists can try to structure what topics will be covered (and campaigns are exploring increasingly sophisticated means to do this), but too frequently the message delivered is very different from the message the candidate seeks to convey.

4.2 Paid Media Advertising

The opposite applies to paid advertising. Paid media, whether in the form of television, radio, or print advertisements, of Internet sites, or of direct mail, is costly and is often viewed by the prospective voter as the slanted message it in fact is. However, these paid media also have the distinct advantage of allowing a candidate to emphasize exactly the message desired. These messages can be narrowcast, that is, designed for and directed to specific target audiences. These ads can create images, discuss positions, or, increasingly in recent years, attack an opponent, all in precisely the way strategists feel will be most effective for a specified audience.

The critical concerns about political campaigns today relate to paid media. Many fear that effective political messages cannot be conveyed in 30-second advertisements, the medium preferred by advertising executives. Others complain that the emphasis on negative, attack advertising has poisoned the atmosphere that surrounds political life, keeping some of the best candidates from running (Ansolabehere and Iyengar 1995). Virtually everyone bemoans the exorbitant costs of campaigns, costs spurred by reliance on paid media. However, political campaigns will continue to rely on paid media so long as they are effective in communicating a candidate’s message to the target audience; for that is the means to electoral success, the ultimate message of a campaign’s effectiveness.

See also: Advertising: Effects; Electoral Systems; Electronic Democracy; Mass Media, Political Economy of; Media and Social Movements; Media Effects; Media, Uses of; Party Identification; Political Machines; Political Money and Party Finance; Political Parties; Polling; Primary Elections; Voting, Sociology of; Women’s Suffrage

Bibliography

Ansolabehere S A, Iyengar S 1995 Going Negative: How Attack Ads Shrink and Polarize the Electorate. Free Press, New York
Hix S, Lord C 1997 Political Parties in the European Union. St. Martin’s Press, New York
Jacobson G C, Kernell S 1981 Strategy and Choice in Congressional Elections. Yale University Press, New Haven, CT
Kazee T A (ed.) 1994 Who Runs for Congress? Ambition, Context, and Candidate Emergence. Congressional Quarterly, Washington, DC
Maisel L S 1986 From Obscurity to Oblivion: Running in the Congressional Primary, rev. edn. University of Tennessee, Knoxville, TN
Maisel L S 1999 Parties and Elections in America: The Electoral Process, 3rd edn. Rowman & Littlefield, Lanham, MD
Maisel L S, Stone W J 1997 Determinants of candidate emergence in U. S. House elections: an exploratory study. Legislative Studies Quarterly 22: 79–96. Current work from the Candidate Emergence Project can be found at http://socsci.colorado.edu/CES/home.html
Polsby N W, Wildavsky A 2000 Presidential Elections: Strategies and Structures of American Politics, 10th edn. Chatham House Publishers, New York
Wayne S J 2000 The Road to the White House 2000. St. Martin’s Press, New York

L. S. Maisel

Campbell, Donald Thomas (1916–96)

Donald Thomas Campbell, born November 20, 1916 in Grass Lake, Michigan, died May 6, 1996 in Bethlehem, Pennsylvania, thus ending a career marked by an array of superlatives. Prolific scholar and author, original thinker, ebullient teacher with a notable twinkle in his eye, and generous colleague, Campbell was widely regarded as the most important social science methodologist of the twentieth century. Several of Campbell’s articles vie with one another as among the most widely cited pieces of social science scholarship. Repeatedly he coined phrases any one of which most scholars would be proud to have conceived, for example, ‘quasi-experiments,’ ‘unobtrusive measures,’ ‘internal and external validity,’ and ‘plausible rival hypotheses.’ His concepts are incorporated as fundamental in several fields (psychology, sociology, anthropology, organization and management sciences, public policy, evaluation, education, and philosophy) and are in common use by scholars unaware of their inventor. Moreover, Campbell’s welcoming and self-critical personal style, modeled on his scholarship, led him to embrace and examine all ‘plausible rival hypotheses’ and endeared him to students and colleagues alike.

Campbell received his B.A. (1939) and Ph.D. (1947), both in Psychology, from the University of California (Berkeley), and taught at several institutions, including
Ohio State University (1947–50), the University of Chicago (1950–53), Syracuse University (as N.Y. State Board of Regents Albert Schweitzer Professor, 1979–82) and Lehigh University (as Distinguished University Professor of Sociology-Anthropology, Psychology and Education, 1983–96), but the majority of his scholarly years were spent as professor of psychology at Northwestern University (1953–79) with which Campbell’s name and reputation are ‘permanently and inextricably co-identified’ (Campbell 1981).

Campbell’s honors include a Fulbright Visiting Professorship in Social Psychology at Oxford (1968), the Distinguished Scientific Contribution Award from the American Psychological Association (1969), election to the National Academy of Sciences and the American Academy of Arts & Sciences (1973), the Kurt Lewin Memorial Award from the Society for the Psychological Study of Social Issues (1974), presidency of the American Psychological Association (1975), the William James Lectureship at Harvard (1977), the Myrdal Prize in Science (1977), and the Distinguished Contribution Award from the American Educational Research Association (1981). In addition, two annual awards are given in his name: The Donald Campbell Award for Significant Research in Social Psychology (1982–) and the Donald Campbell Award from the Policy Studies Organization, to an ‘outstanding methodological innovator in public policy studies’ (1983–). Campbell received honorary degrees from numerous universities, including Oslo, Michigan, Chicago, and Southern California.

But Campbell felt most honored by the vast array of books dedicated to him, a list that grew steadily in several disciplines before and after his death.

1. Major Contributions

When he died, Campbell’s résumé listed more than 230 published books, monographs, and articles. Any brief discussion of his contributions must necessarily focus on a few. Best known as a methodologist, with considerable scholarly work in the areas of experimental design, measurement, and social experimentation, Campbell is perhaps remembered especially for his explorations of the concept of validity. He was also a well-regarded epistemologist with a keen interest in the sociology of science. To each domain, Campbell brought his unique mind to bear on the problem of knowledge production. In a remarkable array of works, Campbell explored from several perspectives the inevitable fallibilities inherent in both observers and methods in accurately portraying the world.

In his earliest publication, ‘The Indirect Assessment of Social Attitudes’ (1950), Campbell evidenced an aspect of the interests that would become the overall focus of his scholarly life: the imperfections introduced by human observers and methods, including the scientific method, in the search for veridical description of the physical world and mechanisms for compensating for these errors. (Campbell later declared profound ‘ambivalence’ toward the use of disguised measures; see Kidder and Campbell 1970.)

Campbell explored the sources and loci of bias and developed and refined methods for illustrating and for minimizing these biases. For example, in Jacobs and Campbell (1961) he demonstrated how consensus in the ‘reality’ of an arbitrarily invented but shared group norm could persist over generations of experimental subjects. In a series of crosscultural explorations Campbell and co-workers (e.g., Segall et al. 1966) explored sources of validity and invalidity in perceptions of ‘in-groups’ and ‘out-groups,’ and the differential susceptibility to perceptual illusions between European and non-European cultures.

Campbell and Fiske (1959) published ‘Convergent and Discriminant Validation by the Multitrait, Multimethod Matrix,’ a complex and frequently cited analysis of the need for multiple measures of underlying constructs (traits) and of the need for the demonstrated capacity of methods to distinguish among traits if one is to minimize irrelevant measurement artifacts. Since every measure is partially invalid, multiple and distinctive methods are called for to yield a ‘heterogeneity of irrelevancies.’ Campbell and Fiske explored the value of simultaneously deploying maximally different measures of an underlying trait (in pursuit of convergent validation) and of different traits measured the same way (in pursuit of discriminant validation). Only by demonstrating that an underlying trait can be measured in various distinct ways, and that the chosen measurement strategies distinguish among measured traits, can one minimize the impact of misleading artifacts.

By the early 1960s Campbell had established an enviable reputation for the vigor and reach of his campaign for advancing the methods and theory associated with issues of validity. His reputation was further advanced with the publication (Campbell and Stanley 1963) of ‘Experimental and Quasi-experimental Designs for Research,’ a widely circulated work (elaborated as Cook and Campbell 1979) in which they popularized the terms ‘internal’ and ‘external’ validity:

Internal validity is the basic minimum without which any experiment is uninterpretable: did in fact the experimental treatment make a difference in this specific experimental instance? External validity asks the question of generalizability: To what populations, setting, treatment variables, and measurement variables can this effect be generalized? (Campbell and Stanley 1963).

Another widely cited treatise, Unobtrusive Measures: Non-reactive Research in the Social Sciences (Webb et al. 1966), evolved during affable lunchtime conversations with colleagues from different departments. Over several years these colleagues sought to identify relatively unbiased, nonreactive ways of measuring behavior. A frequently cited example describes ways a Chicago museum tried to identify the popularity of its exhibits. ‘Obtrusive measures’ included questioning visitors leaving the museum; an ‘unobtrusive measure’ was the frequency with which museum staff had to replace worn tiles in front of exhibits. By that index, the exhibit of baby chicks hatching live and wet from quivering eggs was by far the most popular.

Campbell’s participation in an interdisciplinary conference spawned his brilliant foray into the sociology of science entitled ‘Ethnocentrism of Disciplines: A Fishscale Model of Omniscience’ (1969a). On reading a paper ‘in an area of high relevance’ to his own work by esteemed fellow social psychologist William McGuire, Campbell realized and then acknowledged that he had not ‘read or read-at even half of McGuire’s citations, and was not at all aware of the existence of another sizeable proportion’ (Campbell 1969a). While many scholars might react with chagrin, Campbell instead reflected on how the ‘highly arbitrary’ organization of universities and academic fields skewed knowledge production:

Thus anthropology is a hodgepodge of all novelties venturing into exotic lands—a hodgepodge of skin color, physical stature, agricultural practices, weapons, religious beliefs, kinship systems, history, archeology, and paleontology … Thus psychology is a hodgepodge of sensitive subjective biography, or brain operations, or school achievement testing, of factor analysis, of Markov process mathematics, of schizophrenic families, of laboratory experiments on group structure in which persons are anonymous … Thus economics is a hodgepodge of mathematics without data, of history of economic institutions without mathematics or theory, of an ideal model of psychological man … (Campbell 1969a)

Campbell took a breathtaking leap when he conceived the ‘myth of unidisciplinary competence,’ demonstrating how social, administrative, structural, and political supports which accrue over time around arbitrary disciplines ultimately create incentives that reinforce these arrangements and punish others of equal epistemological worth. Campbell detailed the inevitable consequence, biased knowledge production, and offered a model of ‘interdisciplinary narrowness’ for mitigating this ‘ethnocentrism of disciplines.’

By the late 1960s, following Vietnam War protests, campus unrest, and a national focus on urban inequities, Campbell began trying conscientiously to influence the design, implementation, and evaluation of social policies. In ‘Reforms as Experiments’ (1969b) Campbell introduced a range of real-world quasi-experiments (attempts to approach the standards of experimentation even where random assignment of subjects to conditions is unachievable), each tested against standards of ‘internal and external validity,’ to the practical world of social reform. That article frequently is cited as the most important single work in the field now known as ‘program evaluation.’

In a brilliant tongue-in-cheek section of this article Campbell advised ‘trapped administrators whose political predicament will not allow the risk of failure’ how to capitalize on ‘threats to validity’ to assure positive results (Campbell 1969b). For example, were they to accept the occasional ‘grateful testimonial’ as if it were a representative outcome, to reserve interventions for carefully selected subpopulations most likely to succeed, and to eliminate from their analyses all those who prematurely quit the program, ‘successful’ program outcomes could be virtually assured.

Campbell continued in various fora to insist that social scientists have both obligations and opportunities to test the relevance of their theories and methods in the service of public good. In this, Campbell’s unswerving concern was with the biases affecting the generation of knowledge. For example, in his too-infrequently cited ‘Assessing the Impact of Planned Social Change’ (1975) Campbell again displayed remarkable methodological and sociological insight in generalizing the problem of data validity in the policy arena in the following way:

The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures, and the more apt it will be to distort and corrupt the social processes it is intended to monitor (Campbell 1975).

Corrupted indicators remain a persistent and only vaguely explored source of real-world invalidity (e.g., Cochran et al. 1980).

Another highly influential, widely circulated piece went unpublished for nearly 20 years: in ‘Methods for the Experimenting Society’ Campbell attempted specifically to design a strategy for a society willing to test innovations and their intended effects. He argued for a society that would

vigorously try out possible solutions to recurrent problems and would make hard-headed, multidimensional evaluations of outcomes, and when the evaluation of one reform showed it to have been ineffective or harmful, would move on to try other alternatives (Campbell, in Overman 1988).

Surveying the multitude of societal reforms underway at the time, many of which he endorsed (e.g., Alexander Dubcek in Czechoslovakia, Salvador Allende in Chile), Campbell argued:

There is no such (experimenting society, committed to reality testing, self-criticism, to avoiding self-deception) anywhere today. While all nations are engaged in trying out innovative reforms, none of them are yet organized to adequately evaluate the outcomes of these reforms (Overman 1988).

After specifying requirements of such an ‘experimenting society’ Campbell attempted in various ways to determine conditions under which these results might be achievable. For example, after Allende was assassinated in Chile, Campbell invited Chilean social scientist Ricardo Zúñiga to Northwestern University
specifically to reflect on the conditions that might allow vigorous, large-scale reforms to be tested.

Campbell made other major contributions in his application of evolutionary theory to the realm of ideas. Campbell, the eternal epistemologist, focused in particular on the joint operation of the ‘blind-variation-and-selective-retention process fundamental to all inductive achievements, to all genuine increases in knowledge.’ In his ‘evolutionary epistemology,’ which perhaps received its fullest flowering in his essay honoring Karl Popper (in Overman 1988), Campbell endorsed the natural selection paradigm for explaining the production of knowledge. Through the selective winnowing of ideas, Campbell argued, what survive over time are those ideas that are most meritorious, particularly as each is subjected to ‘critical realism.’

In his William James lectures, Campbell (1977) summarized his perspective in this way:

I am a fallibilist and antifoundationalist. No part of our system of knowledge is immune to correction. There are no firm building blocks, either as indubitable and therefore valid axioms, or in any set of posits that are unequivocal once made. Nor are there any unequivocal experiences or explicit operations on which to found a certainty of communication in lieu of a certainty of knowing.

I am some kind of a realist, some kind of a critical, hypothetical, corrigible, scientific realist. But I am against direct realism, naive realism, and epistemological complacency (Campbell, in Overman 1988).

2. Legacy

Campbell’s legacy is vast and pronounced. It affects a wide range of social scientists as well as public policy scholars and practitioners, and is reflected in the work of hundreds of his students and more distant heirs. He introduced fundamentally new ways of thinking that are taken for granted today about how we perceive, discover, and assess accumulated knowledge. Many of his concepts are now fundamental throughout the social and policy sciences. Campbell is the most important intellectual figure in the history of evaluation research. The strategies he employed in ‘Reforms as Experiments’ for unearthing and illustrating threats to interpretations of policy implementation are an often-cited model for exploring complex data (see Tufte 1983).

In remarkably vivid, fetching but precise language, Campbell repeatedly captured the central importance of fallibilities inherent in human observers and their methods as they accumulate knowledge about the world, and thus of the importance of multiple perspectives, the triangulation of methods, and an open process whereby ideas are tested against one another to promote the winnowing effects of competition. Thus, in the James lectures, Campbell stated that

We are cousins to the amoeba, and have received no direct revelation not shared with it. How then, indeed, could we know for certain? … Our only hope as competent knowers is that we be the heirs of a substantial set of well-winnowed propositions (Campbell, in Overman 1988).

In the heady intellectual mix he created and sustained, Campbell was known for changing his mind when persuaded by carefully tested evidence, and for challenging others to do the same. For example, over the years Campbell was persuaded by sociologist Howard S. Becker and others to appreciate the falsifiability of carefully wrought case studies, in spite of their failure to live up to his long-cherished methodological criteria (Campbell, in Overman 1988). In return, he prompted qualitatively-oriented colleagues to persist in reconsidering the validity claims inherent in their own work (see, for example, Becker 1986). For this open-mindedness, Campbell is revered by quantitative and qualitative methodologists alike (see Patton 1999).

Campbell also influenced legions of colleagues to reflect honestly on their own work. He talked and wrote openly about his own insecurities and frustrations, including how late in his career he began to write, droughts in his own productivity, his discomfort with the production pressure and relentless expectation of genius when he was a young assistant professor at the University of Chicago, his frequent bouts of depression, and the times when he believed his students and colleagues ‘carried him.’

He explained his messy desk cheerfully by saying that he ‘filed archeologically,’ and as long as no one moved things around he could find whatever he wanted. Amazed colleagues would watch him reach into a pile and pull out the precise papers relevant to their discussions.

Although its publishers have sold thousands of copies of Experimental and Quasi-Experimental Designs for Research and students return to it year after year, Campbell never received any royalties; he asked only for unlimited copies to give freely to his colleagues.

Consideration of Campbell’s legacy would be incomplete without a discussion of his personal style, for it, too, promoted the production of knowledge while providing his colleagues with a vital and successful paradigm. It would understate Campbell’s colossal influence if there were no mention of his towering warmth, kindness, and generosity, and his deeply honest and self-critical perspective on his own work (Campbell 1981). Campbell was remarkably able to reflect on his own fallibilities and to create conditions that would model his epistemology, and improve both his work and the work of others. Examples are diverse and legendary: he would often send a requested reprint accompanied by a colleague’s critical rejoinder. He was always in pursuit of contrary evidence and counter-intuitive examples, particularly if they confronted his own work. He encouraged students to write minority sections to project reports if they
disagreed with him or with an emerging majority. Campbell was resolutely
encouraging of others and in fact was thought by some to ‘suffer fools
gladly.’ He gave three grades: A, B, and incomplete. If the data collected
in dissertations failed to support anticipated hypotheses, or failed to
replicate published findings in the literature, Campbell urged students to
write up the apparent reasons for the obtained results.
Over time, Campbell connected his early deep philosophical explorations to
a program of practical work that will continue to guide generations of
scholars and practitioners. Campbell lives on through his voluminous work
in many fields, through his students, and through their adoption of his
style.

See also: Evolutionary Epistemology; Experimental Design: Large-scale
Social Experimentation; Experimental Design: Randomization and Social
Experiments; Experimentation in Psychology, History of; External Validity;
Foucault, Michel (1926–84); Internal Validity; Panel Surveys: Uses and
Applications; Pearson, Karl (1857–1936); Quasi-Experimental Designs;
Reform: Political; Unobtrusive Measures

Bibliography
Brewer M, Collins B 1981 Scientific Inquiry and the Social Sciences. Jossey-Bass, San Francisco
Campbell D T 1950 The indirect assessment of social attitudes. Psychological Bulletin 47(1): 15–38
Campbell D T 1969a Ethnocentrism of disciplines and the fish-scale model of omniscience. In: Sherif M, Sherif C (eds.) Interdisciplinary Relationships in the Social Sciences. Aldine, Chicago
Campbell D T 1969b Reforms as experiments. American Psychologist 24: 409–29
Campbell D T 1975 Assessing the impact of planned social change. In: Lyons G (ed.) Social Research and Public Policies. Public Affairs Center, Dartmouth College, Hanover, NH
Campbell D T 1977 Descriptive Epistemology: Psychological, Sociological, and Evolutionary. William James Lectures. Harvard University, Cambridge, MA
Campbell D T 1981 Another Perspective on a Scholarly Career. In: Brewer M B, Collins B E (eds.) Scientific Inquiry and the Social Sciences. Jossey-Bass, San Francisco
Campbell D T, Fiske D W 1959 Convergent and discriminant validation by the multitrait–multimethod matrix. Psychological Bulletin 56: 81–105
Campbell D T, Stanley J C 1963 Experimental and quasi-experimental designs for research on teaching. In: Gage N (ed.) Handbook of Research on Teaching. Rand McNally, Chicago
Cochran N, Gordon A C, Krause M 1980 Proactive records: Reflections on the village watchman. Knowledge 2(1): 5–18
Cook T D, Campbell D T 1979 Quasi-experimentation: Design and Analysis Issues for Field Settings. Rand McNally, Chicago
Jacobs R C, Campbell D T 1961 The perpetuation of an arbitrary tradition through several generations of a laboratory microculture. Journal of Abnormal & Social Psychology 62: 649–58
Kidder L, Campbell D T 1970 The indirect testing of social attitudes. In: Summers G (ed.) Attitude Measurement. Rand McNally, Chicago
Overman E S (ed.) 1988 Methodology and Epistemology for Social Science: Selected Papers of Donald T. Campbell. Chicago
Patton M Q 1997 Utilization-Focused Evaluation, 3rd edn. Sage, Beverly Hills, CA
Segall M H, Campbell D T, Herskovits M J 1966 The Influence of Culture on Visual Perception. Bobbs-Merrill, Indianapolis, IN
Tufte E R 1983 The Visual Display of Quantitative Information. Graphics Press, Cheshire, CT
Webb E J, Campbell D T, Schwartz R D, Sechrest L 1966 Unobtrusive Measures: Nonreactive Research in the Social Sciences. Rand McNally, Chicago

A. C. Gordon

Cancer-prone Personality, Type C

Type C has emerged as a behavioral pattern, coping style, or personality
type that predisposes people to, or is a risk factor in, the onset and
progression of cancer. Type C has been described as being over-cooperative,
stoical or self-sacrificing, appeasing, unassertive, patient, avoiding
conflict, compliant with external authorities, unexpressive, suppressing or
denying negative emotions, and predisposed to experiencing hopelessness and
depression (Bleiker 1995, Eysenck 1994, Temoshok 1990).
Since the mid-twentieth century, the contribution of psychosocial factors
to cancer has been an important research topic. A search of the literature
shows exponential growth: while only nine publications appeared from 1951
to 1955, between 1995 and 1999 more than 300 documents dealing with this
topic were published in scientific journals (Psychlit and Medline
databases). This article deals with the ‘state of the art’ of Type C and
its role as a risk factor in the onset and progression of cancer.

1. Historical Overview

The belief that personality contributes to the etiology of disease is as
old as the history of human thought. Around 400 BC, Hippocrates stated that
melancholic humor (caused by a surplus of black bile) caused cancer, and
Galen (second century AD) observed that neoplasm was more likely to occur
in melancholic and depressed than in sanguine women.
It is in the twentieth century that experimental evidence about
relationships between cancer and personality is found. During the first
half of the twentieth century, investigations were made from the
perspective of psychoanalytical theory, and within the field of
psychosomatic medicine. The most important
characteristics in cancer patients found through empirical research were:
loss of important relationships, inability to express negative emotions,
tension concerning parental relationships, and sexual problems.
Finally, it is from the second half of the twentieth century that the role
of behavioral and psychosocial conditions in the onset and progression of
cancer has been investigated systematically in research programs, groups,
and laboratories, following a track parallel to research on Type A,
coronary-prone behavior. A good deal of clinical observation and research
data has been produced about a systematic link between some psychosocial
factors and cancer. At the same time, there have been comprehensive reviews
of psycho-oncology research, and strong criticism of its methodological
flaws. Thus, before presenting the main traits constituting Type C, the
most important difficulties involved in its research will be discussed.

2. Difficulties in the Study of Cancer-prone Personality, Type C

Temoshok and Heller (1984) described Type C research in a very illustrative
fashion, as ‘comparing apples, oranges and fruit salad,’ expressing its
complexity and methodological difficulties. In other words, it involves
different constructs, assessed through different instruments, in different
samples of subjects with different types of cancer and other diseases, and
using different designs.
The heterogeneity of factors contributing to the noncomparability of
psycho-oncology literature can be considered in the following ways: (a) the
nature and measurement of psychological phenomena, (b) cancer
characteristics (type, site, and stage in the process), (c) sample
characteristics, and (d) designs used.

2.1 The Nature and Measurement of Psychological Phenomena

Personality type, coping style, stress or emotional response, or behavioral
pattern are constructs with a variety of conceptualizations in psychology.
This problematic issue has been overcome by Eysenck (1994); in short, the
subject’s response to a given stressful situation cannot be understood
without taking into consideration their stressor perception and the
behavioral pattern used in coping with stress; both are stable personality
conditions.
To this conceptual problem, a methodological one should be added. As
constructs, all types of variables must be operationalized through
psychological measures. From semistructured interviews and projective
techniques to well-developed questionnaires, hundreds of measures have been
used in psycho-oncology research. These measures have different
psychometric properties and different levels of generalizability, and their
scores can, therefore, be generalized to different universes, presenting
difficulties for the comparability of results. Over recent decades much
more attention has been paid to Type C measurement, and these
methodological problems have been partially overcome.

2.2 Cancer Characteristics

Cancer is not a homogeneous entity; both cancer types (melanoma, sarcoma,
etc.) and cancer sites (lung, breast, skin, etc.) vary with regard to
etiology, course, mortality, heritability, and risk factors. Also, cancer
always implies a process, and psychological factors may act at different
stages of this process; this fact introduces important research problems.
For example, Kreitler et al. (1993) stated that emotional repression may be
a response to the threat posed by the cancer diagnosis. Several studies
have dealt with the timing of psychological assessment and its relationship
to cancer, but the relationships between psychosocial conditions and
neoplastic processes require sophisticated and complex research designs,
and these have yet to be fully developed.

2.3 Subjects/Sample Characteristics

Several authors have pointed out that sociodemographic variables may be
associated with medical and environmental risk factors. Among
sociodemographic characteristics, age seems to be the most important, since
cancer incidence and prevalence increase with age, and several personality
characteristics are also associated with age (e.g., introversion). When
cancer groups are compared with other participants, age should be
controlled (e.g., Fernández-Ballesteros et al. 1998).
In order to test the psychological risk factors of a given disease or
illness through psychological covariation, comparison between healthy and
ill subjects is not enough. It is essential to distinguish also between the
target illness and other illnesses. Cancer patients have been compared not
only with healthy subjects but also with patients suffering from
cardiovascular and digestive problems, chronic diseases and benign
pathologies, and with accident victims, among others.

2.4 Designs

Science requires strategies or designs for testing hypotheses; that is,
decisions about when and under what conditions units (subjects) will be
observed and measured. Authors agree that the best design for studying a
given risk factor is a prospective one. Subjects (recruited at random in
the community or as a member of a given
group) are assessed in the relevant target behaviors and psychological
constructs, and are monitored until some of them develop cancer and others
do not. Also, in longitudinal studies subjects already diagnosed with
cancer are monitored during different stages of the illness process, from
diagnosis to progression, treatment, and survival or death.
In quasiprospective designs psychological evaluation is carried out before
cancer diagnosis, but this period is usually very short. This type of
design is commonly used within cancer prevention programs in which all
subjects are assessed in psychosocial characteristics and, after diagnosis,
subjects with cancer are monitored. This type of design has two main flaws:
psychological characteristics are measured when subjects have already
developed cancer, and subjects attending a prevention campaign cannot be
considered to be in a neutral context. Furthermore, this situation is quite
heterogeneous, since while several subjects may already be suspicious about
lumps or lesions, others are merely participating in a preventive (or
‘normal’) situation.
In retrospective designs, subjects already diagnosed or being treated for
cancer are compared with control groups in a set of psychological
characteristics. Such retrospective designs are highly criticized because
they do not tell us whether psychological characteristics are the cause or
the result of cancer (e.g., Kreitler et al. 1993).
All of these designs can be considered as ‘descriptive’ or ‘correlational’
research strategies. However, there are also experimental designs; in
these, the ‘independent’ variable (a psychological or personality
condition) is manipulated in order to measure its causal or functional
relationship with the ‘dependent’ variable (cancer).
In sum, several of these conceptual and methodological problems have been
overcome, while others continue to be the object of criticism.

3. Personality Characteristics in Cancer Patients

Evidence from prospective, quasiprospective, retrospective, and
experimental studies converges on a set of psychological characteristics
that seem to act in the onset and progression of all types of cancer. Let
us examine these characteristics.

3.1 Emotional Expression

This appears to be at the core of Type C. That is, suppression, repression,
inhibition and/or denial of negative emotions seem to be the central
characteristics of the cancer-prone personality (see Emotions and Health).
Cancer patients (in comparison to controls) are described as anti-emotional
and alexithymic, with a tendency to control their negative feelings (mainly
anger, aggressiveness, hostility); they report not feeling anger or
irritation, and in their relationships they report avoiding arguments with
others by using reason and logic, often contrary to their feelings. For
example, in a quasiprospective study within a breast cancer prevention
program, Fernández-Ballesteros et al. (1997) compared the rationality (as
well as other somatic characteristics) of healthy women (N = 96), women
with benign breast disease (N = 90) and women with breast cancer (N = 122).
Healthy women were matched with women with benign breast pathology or with
breast cancer. All were assessed before diagnosis. Results show that women
with breast cancer differed significantly from healthy women and those with
benign breast pathology. Other reviews on this topic yield the same result:
emotional expression is one of the psychological hallmarks of cancer
(Bleiker 1995, Eysenck 1994, Spiegel and Kato 1996, Temoshok 1990).
Unfortunately, however, experimental data do not describe the nature of
these psychological characteristics, nor whether cancer subjects fail to
perceive or feel their emotional internal physiological reactions or
whether they simply fail to express them (verbally or otherwise).

3.2 Depression

Consideration of medical anamneses also suggests that depression is a
frequent precursor of cancer; feelings of helplessness and hopelessness are
also found in several studies. For example, the Western Electric Health
longitudinal study monitored 2,020 employees with the purpose of
investigating coronary disease risk factors. As a measure of personality
and personal and social adjustment the Minnesota Multiphasic Personality
Inventory (MMPI) was administered. At the 17-year follow-up, 4 percent had
died of cancer. As reported by Shekelle et al. (1981), there was a twofold
increase in the odds of death from cancer in men with psychological
depression. This result was consistent across sites and types of cancer.
Also, this association persisted after adjustments had been made for age,
cigarette smoking, use of alcohol, family history of cancer, and
occupational status (see Depression, Hopelessness, Optimism, and Health).
In spite of these impressive results, it is unclear whether these states
are antecedents or consequences of cancer, and research data is
inconsistent in this area.

3.3 Interpersonal Style

In the medical literature, cancer patients have been described by their
doctors as ‘extremely pleasant’ in their interactions with them. There is
evidence from retrospective, quasiprospective, and prospective studies, as
well as from experimental studies, that cancer patients (in comparison to
control subjects) yield significantly higher scores in tests assessing need
for harmony, self-sacrifice, and unassertiveness. For example, in a
quasiprospective study Fernández-Ballesteros et al. (1998) compared, in
terms of need for harmony (as well as rationality), three groups of women
with breast cancer (before diagnosis [74], during treatment [105], and
during the follow-up [132]) with healthy women. Both personality variables
correctly classified 86 percent of the participants (87 percent of the
breast cancer patients and 82 percent of the healthy participants) and,
moreover, no significant differences in rationality and need for harmony
were found to be related to the stage of the cancer process.
This behavioral pattern can be understood as a coping style for stressful
situations, as a personality characteristic, or as a defense mechanism
against anxiety.

3.4 Stressors, Stress, Strain, and Coping Mechanisms

Cancer patients’ anamneses usually show that an antecedent of cancer
disease is a subject’s life event. However, as already mentioned, it is
important to differentiate between the external stressor, the subject’s
appraisal and coping style, and the stress reaction (see Stress and Health
Research). As underlined by several authors, stress is the response of a
given subject to their perception of a stressor, that is, a stimulus or
situation perceived as threatening. Also, however, subjects have different
mechanisms for coping with stress. For example, Cooper and Faragher (1993)
pointed out that subjects who experienced a stressful situation they
perceived as threatening had a significantly higher cancer risk. In a
retrospective study, Watson et al. (1984) compared breast cancer patients
and healthy controls. Breast cancer patients were more likely than controls
to report a tendency to control emotional reactions, particularly anger.
However, evidence about psychosocial influences on cancer incidence and
progression comes not only from descriptive studies; there are also
important experimental results. For example, Grossarth-Maticek carried out
an experimental study within the Heidelberg prospective study. In 1971,
1,026 healthy persons from a random sample of the population were assessed
in a wide set of psychosocial and medical variables. Also, 1,537 subjects
were selected on the basis of their increased psychosocial and clinical
risk of cancer or cardiovascular or circulatory disease. Criteria used
included chronic depression and hopelessness, chronic excitement and anger,
heavy smoking, hypertension, high blood cholesterol, and high blood sugar.
From the 1,537 high-risk subjects, 100 were selected who were thought to
have an especially high risk of cancer (and also of CV disease). Half of
these subjects were selected randomly and given psychological therapy
intended to decrease the respective risk. This therapy, which has been
called ‘creative novation behaviour therapy,’ was developed by
Grossarth-Maticek and Eysenck (1990) on the basis of cancer-prone
characteristics (emotional expression, assertiveness, coping, etc.), which
the therapy attempted to modify through behavior therapy techniques. In
1983, mortality and causes of death for the 2,563 subjects of the study
were ascertained. Of the high-risk subjects, 38 percent had died, as
compared to 10 percent of the random sample.
In the cancer therapy group deaths were significantly fewer than in the
control group, and there were no deaths from cancer. In marked contrast, 60
percent of the deaths in the control group were from cancer. Although
Grossarth-Maticek’s studies have been criticized for certain methodological
shortcomings (questionnaires used were not in a standard form, and there
was no report on psychometric properties), their impressive results are in
accordance with the well-established effects of psychosocial treatment,
which significantly increases survival in cancer patients (e.g., Spiegel
and Kato 1996).
From descriptive and experimental studies it can be summarized that
cancer-prone or Type C personality is a psychological configuration with
two main conditions: (a) suppression of negative emotions (such as anger
and anxiety), which includes not only lack of verbal expression of emotions
but also corresponding interpersonal behaviors, and (b) inappropriate
stress coping mechanisms, leading to feelings of hopelessness,
helplessness, and depression.
Finally, what mechanisms can be assumed to link psychological
characteristics with cancer?

3.5 Mechanisms that Link Psychological Factors with Cancer

From a mentalistic and/or dualistic perspective, psychological
characteristics observed in cancer patients have been considered as though
the ‘soul’ (or mental entities) were affecting the body. This position has
been brought up to date through empirical evidence from both psychological
and biological literature that supports the relationship between
personality and emotions and the autonomic, endocrinological and immune
systems.
O’Leary (1990), in her review of stress response and the immune system,
referred to the fact that chronic stress is associated with a suppression
of the immune function, and that personality and coping styles may enhance
or degrade the immune response. As pointed out by Spiegel and Kato (1996),
chronic and acute stress are associated with reductions in various measures
of the immune function, and, moreover, psychological intervention
positively influences hallmarks of the immune system.
In sum, as Spiegel and Kato (1996) conclude, ‘there is a nonrandom
relationship among various psychosocial factors and cancer incidence and
progression that can only partially (emphasis added) be explained by
behavioral, structural, or biological factors.’


4. Conclusions

Throughout the twentieth century, the empirical study of the relationship
between personality and cancer has been an important research topic. During
the 1980s and 1990s, several conceptual and methodological problems have
been overcome.
Various designs, assessment devices, and samples have shown that a set of
psychological characteristics appears to play a role in cancer onset and
progression. These characteristics can be reduced to the suppression of
negative emotions and to inappropriate stress coping mechanisms. Much more
attention should be paid to the psychological nature of these
characteristics.
Psychological treatments (manipulating the above psychological
characteristics) appear to act on the cancer process, preventing cancer
onset and increasing patient survival possibilities.
Scientific literature supports the hypothesis of a link between personality
characteristics, stress reaction coping styles and biological systems. That
is, there is evidence of a connection between personality, stress and
cancer, as well as between personality, stress, and the autonomic,
endocrinological, and immune systems.
These psychological characteristics can be considered as cancer risk
factors. Nevertheless, Type C or cancer-prone personality should be
understood in terms of its synergic interaction with other genetic,
biological, and environmental conditions.

See also: Cancer: Psychosocial Aspects; Cancer Screening; Depression,
Hopelessness, Optimism, and Health; Emotions and Health; Gender and Cancer;
Personality and Health; Stress and Health Research

Bibliography
Bleiker E 1995 Personality Factors and Breast Cancer. ICG, Dordrecht, The Netherlands
Cooper C L, Faragher E B 1993 Psychosocial stress and breast cancer: the interrelationships between stress events, coping strategies and personality. Psychological Medicine 23: 653–62
Eysenck H J 1994 Personality, stress, and cancer: Prediction and prophylaxis. Advances in Behavior Research and Therapy 16: 167–215
Fernández-Ballesteros R, Ruiz M A, Garde S 1998 Emotional expression in healthy women and those with breast cancer. British Journal of Health Psychology 3: 41–50
Fernández-Ballesteros R, Zamarrón M D, Ruiz M A, Sebastián J, Spielberger C D 1997 Assessing emotional expression. Personality and Individual Differences 22: 719–29
Greer S, Watson M 1985 Towards a psychobiological model of cancer: Psychological considerations. Social Science and Medicine 20: 773–7
Grossarth-Maticek R, Eysenck H J 1990 Prophylactic effects of psychoanalysis on cancer-prone and coronary heart disease-prone probands, as compared with control groups and behaviour therapy groups. Journal of Behavioral Therapy and Experimental Psychiatry 21: 91–9
Kreitler S, Chaitchik S, Kreitler H 1993 Repressiveness: cause or result of cancer? Psycho-oncology 2: 43–54
O’Leary A 1990 Stress, emotion, and immune function. Psychological Bulletin 108: 363–82
Shekelle R B, Raynor W J, Ostfeld A M, Garron D C, Bieliauskas L A, Liu S C, Maliza C, Paul O 1981 Psychological depression and seventeen-year risk of death from cancer. Psychosomatic Medicine 43: 117–25
Spiegel M D, Kato P M 1996 Psychosocial influences on cancer incidence and progression. Harvard Review of Psychiatry 4: 10–26
Temoshok L 1990 On attempting to articulate the biopsychosocial model: Psychological-psychophysiological homeostasis. In: Friedman H (ed.) Personality and Disease. Wiley, New York
Temoshok L, Heller B W 1984 On comparing apples, oranges and fruit salad: A methodological overview of medical outcome studies in psychosocial oncology. In: Cooper C L (ed.) Psychosocial Stress and Cancer. Wiley, New York

R. Fernández-Ballesteros

Cancer: Psychosocial Aspects

1. Introduction

Psychosocial research in oncology has been conducted since the 1950s. The
initial interest in cancer onset has changed, perhaps due to the many
methodological problems, to research on the psychosocial consequences of
cancer, and the impact of psychosocial factors in cancer progression.
Although there are some studies, in several fields of research, which show
negative results, most well-controlled studies have shown positive
correlations between specific psychosocial factors and quality of life
during cancer treatment, cancer progression, and survival.

2. Quality of Life in Cancer Patients

The concept of quality of life (QOL) was first referenced in Index Medicus
in 1977. Since that time, it has emerged as an important concept in cancer
research in the description of conditions in cancer diagnosis and treatment
that make the quality of life of patients worse, and in the use of quality
of life measurements for the evaluation of new methods of treatment. QOL is
defined as a concept which refers to the individual’s own perceptions about
the degree of satisfaction with treatment, and ability to perform in their
daily life. Although impairments in several dimensions of quality of life
are primarily caused by biological aspects of the disease, psychosocial
research has focused on psychosocial factors that additionally modulate the
intensity of subjectively experienced impairments in quality of life.
Extensive psychosocial research has been done with cancer patients in the
initial phase of diagnosis and treatment, whereas little information is
available about the problems and concerns that persist for long-term
survivors. It can be assumed that, for all cancer survivors, what begins as
a crisis involving diagnosis and treatment gradually becomes a chronic
illness with life-long follow-up medical care, long-lasting psychological
effects, and changes in social and employment relationships.

2.1 Physical Symptoms

2.1.1 Treatment-related nausea and vomiting. It is estimated that 75
percent of the patients receiving chemotherapy will suffer from nausea or
vomiting in spite of antiemetic therapy (Morrow 1992). Nearly 25 percent of
patients suffer from so-called anticipatory nausea. In these cases nausea
or vomiting occurs prior to the infusion of chemotherapy and is triggered
by particular odors or situations similar to stimuli experienced in
previous treatment. Whereas post-chemotherapy nausea and vomiting are
caused by neuropharmacological agents, anticipatory nausea and vomiting are
assumed to be caused by classical processes of conditioning. Neutral
stimuli (e.g., the taste of a drink like cola) that were associated with
the infusion of chemotherapy (cola was mixed into the antiemetic drug that
was given prior to chemotherapy) will be associated with post-chemotherapy
nausea. As a consequence, the taste will trigger nausea even if the
neuropharmacological effects of chemotherapy have passed. Taste becomes a
learned stimulus for anticipatory nausea or vomiting (see Fig. 1).

Figure 1
Learning model of classical conditioning of anticipatory nausea and
vomiting. See text for further details.
  Unconditioned stimulus (chemotherapy) -> unconditioned reaction (nausea, vomiting)
  Neutral stimulus (coke) -> no reaction
  After some chemotherapeutic cycles, with an association between chemotherapy and cola:
  Conditioned stimulus (coke) -> conditioned reaction (nausea, vomiting)

Several predictors were identified as high risk factors for the development
of nausea and vomiting during cancer chemotherapy: (a) antiemetic drugs;
(b) physiological factors, such as age, constitutional predisposition; and
(c) psychological factors, such as treatment-related state anxiety or
general psychological distress (Andrykowski and Gregg 1992). There are
various hypothesized pathways by which a state of anxiety or distress may
contribute to the experience of treatment-related nausea. First, due to the
proximity of structures in the higher brain stem responsible for the
release of vomiting, and noradrenergic structures that are influenced by
anxiety and distress, it can be assumed that distress triggers release and
intensity of nausea and vomiting through a higher noradrenergic activity.
Second, it is well known from learning psychology that anxiety and distress
may facilitate the process of classical conditioning. Therefore it can be
assumed that distress may enhance the conditioned anticipatory nausea and
vomiting.

2.1.2 Pain. Pain is one of the most prevalent physical complaints of cancer
patients. Bonica published the most comprehensive overview in 1985 (see
Fig. 2). Fifty to 70 percent of cancer patients suffer from pain syndromes
during the late phases of their disease; 20 to 50 percent suffer from
cancer- or treatment-related pain even in early phases of their disease. In
60 to 80 percent of the cases pain syndromes are attributable to the
malignancy of the tumor (tumor compression, infiltration of nerves); in 10
to 25 percent they are long-term results following operation, chemo- or
radiotherapy; in 3 to 10 percent of cases the pain syndromes can be assumed
to be unrelated to the cancer or treatment (e.g., migraine, arthritis). In
many cases, however, it is difficult to discover the main cause of pain. It
can be assumed that several psychological factors, known modulators of
nonmalignant pain, also influence the intensity and aversiveness of
cancer-related pain. Among others, factors of depression, chronic daily
stress in private life or at work, maladaptive pain-related patterns of


Figure 2
Prevalence of pain in different cancer diagnoses (see Bonica 1985).

thought like catastrophizing or suppressive cognitions; maladaptive pain coping strategies belong to the most important psychological factors that contribute to the chronicity of pain, e.g., chronic low back pain (Hasenbring 1998).

2.1.3 Fatigue. Only in the last years of the twentieth century did cancer research focus on fatigue as another cancer- or treatment-related physical complaint; a national survey in the USA showed a prevalence of problems with fatigue in 76 percent of 419 chemotherapy and/or radiation-treated patients, with impairment in walking, exercising, house cleaning, or climbing stairs (Curt 2000). Cancer-related fatigue also had a significant emotional component, with 90 percent of patients reporting that fatigue contributed to a loss of control, hopelessness, isolation, lack of motivation, sadness, and frustration. A significant number of patients also reported fatigue-related social impairment, which included difficulties in shopping, expressing intimacy with a loved one, playing with children, or spending time with friends. Curt showed that physicians' recommendations tended to be nonspecific: 40 percent of physicians recommended doing nothing, 37 percent prescribed rest. Further research should focus on the biological and psychological mechanisms by which fatigue is produced and maintained, in order to develop effective interdisciplinary treatment modalities.

2.2 Body Image
Impairments of body image are mostly a direct consequence of cancer treatment. Significant impairment was seen in patients with a tumor of the head, neck, or larynx; in breast cancer patients with a resection of the mamma; and in patients with colorectal cancer and an artificial anal/sphincter praeter. Between 10 and 56 percent of the breast cancer patients develop problems in sexuality and in their partnerships. More than 50 percent of the patients with cancer of the head or neck experience a decrease in self-esteem, combined with anxiety and depression, which leads to withdrawal and social isolation (Holland and Rowland 1998).

2.3 Physical Functioning
A number of cancer patients suffer from impairment in sexual functions, in the short- and/or long-term, after cancer treatment. For instance, 75 percent of women with cancer of the cervix showed intense, continuous sexual difficulties; 33 percent gave up their sexual contacts. Patients with cancer of the prostate experience an impaired erection, which is a generally reversible consequence of injury to the autonomous nerves. Anxiety, dysfunctional patterns of thought, and feelings of shame are psychological factors eliciting avoidance behavior, which leads to persistent sexual difficulties. A study of 407 long-term survivors of bone-marrow transplantation (BMT) found sexual
problems in 29 percent of the men and in 80 percent of the women (Syrjala et al. 1998). The authors recommend behavioral as well as hormonal or mechanical therapy.

2.4 Emotional Distress
Confrontation with a diagnosis of cancer is experienced as a crisis, which leads to distress in the form of, for instance, expectations of chronic suffering, acute or long-lasting fear of death, and feelings of anxiety, depression, or anger. Patients with dysfunctional coping strategies are at risk of developing chronic anxiety and other feelings of distress; patients with intense or long-lasting emotional distress suffer more from the physical complaints described above. Extensive clinical studies have shown that 20 to 40 percent of cancer patients suffer from intense or long-lasting emotional distress (Holland and Rowland 1998). Some studies have seen intense feelings of anxiety in 80 percent of the patients immediately subsequent to cancer treatment (e.g., Hasenbring et al. 1993). Patients suffered from fear of pain caused by unfamiliar methods of treatment, by claustrophobia during computed tomography, or by anxiety because they did not know what the next steps of their therapy would entail.

2.5 Cognitive Functions
Clinical impressions suggest that a significant number of cancer patients suffer from impairment to various cognitive functions, although only a few studies have assessed cognitive functioning in cancer patients objectively. Cull et al. (1996) found 49 percent of cancer patients, with various diagnoses, suffering from impairment of concentration and memory. Cancer itself (e.g., tumor of the brain, brain metastases in breast cancer patients) and its treatment (neurotoxicity of cytostatic therapy) are known causes of cognitive impairment. Anxiety, depression, fatigue, and low psychosocial support are further possible causes of impairment of cognitive functions.

2.6 Social Relations
Fear of death, and impairment of emotional and physical functioning of cancer patients, lead to differing degrees of disturbance in their social relationships. Social roles and responsibilities in a family have to be redefined where patients suffer from significant complaints. Manne et al. (2000) reported, in a cross-sectional study with 219 cancer patients and their spouses, a significant relationship between increasing patient functional impairment and spousal negative behaviors related to greater restrictions on the activity of the spouse and by depression.

3. Coping With Cancer-related Distress
Coping research in cancer patients has focused on cognitive and behavioral strategies that patients employ to reduce emotional distress or to change a distressing situation. A major question in cancer research has been whether specific coping responses would decrease or increase survival time. Results of different prospective cohort studies (i.e., with two or more assessment points) indicate that several aspects of avoidance behavior, for instance, avoidance of distressing social contacts (social withdrawal), avoidance of expressing feelings of fear, anger, or regret, as well as the minimization or denial of the potential threat, are predictors of reduced survival time. For instance, Greer et al. (1979) conducted a retrospective study in women with breast cancer including a ten-year follow-up. They found that women with an active, confronting coping behavior ('fighting spirit') showed an increased survival time when compared to those who showed denial or stoic acceptance behavior. Women with signs of help-/hopelessness showed the lowest survival time. The tendency to minimization and anger were independent predictors of survival time in a further prospective study in 125 patients with metastatic melanoma, published by Butow et al. (1999).

4. Social Support
There has been extensive research into the role played by positive social relationships in the adaptation to stress. Among the variables (e.g., the number of available persons, frequency of contact with relatives, friends, etc., perceived social support, and degree of satisfaction with perceived support), perceived social support seems to be the best predictor of adaptation to cancer. Other qualitative dimensions are emotional, instrumental, self-esteem, and appraisal support. Social support might influence patients' ability to adapt to the illness and its treatment ('stress-buffer hypothesis') or might have a direct influence on the progression of the disease ('main-effect hypothesis'). Retrospective studies in cancer patients have revealed some evidence of a positive relationship between social support and progression of the disease (Spiegel and Kato 1996). Nevertheless, there is also some evidence from recent studies for differential aspects. In a group intervention study, Helgeson et al. (2000) found that peer discussion groups in 230 women with breast cancer were helpful for women who lacked support from their partners or physicians, but harmful for
women who had high levels of support. More research is needed to clarify the relationship between special aspects of social support, social networks, and the adaptation to the disease.

5. The Impact of Psychosocial Factors on Cancer Progression
Whereas psychosocial predictors of the onset of cancer have not yet been reliably identified (at the end of the twentieth century), perhaps due to methodological problems, the literature supports the idea that psychosocial variables predict cancer progression. Besides the coping styles, and the quality and amount of social support, as described above, cancer-independent distress in daily life has also been investigated in relation to cancer progression. In a study of 86 breast cancer patients, Forsén (1991) interviewed the women after surgery. Patients with more stressful life events during the year before the operation revealed an increased risk of recurrence (relative risk: 3.48) and of death (relative risk: 4.37). Chronic daily stress at work or in private life was found to be a high risk factor for tumor progression and for survival, after adjusting for age, diagnosis, and tumor stage, in a further study of 51 cancer patients during their first chemotherapy (Hasenbring et al. 1993). However, some studies could not find a correlation between life events and tumor progression or survival.

6. Possible Links between Psychosocial Variables and Tumor Progression
There are a number of possible links between psychosocial factors and the progression of cancer. Chronic daily stress, the lack of social support, and maladaptive strategies of coping with stress, especially when accompanied by depression, (a) may change individual health behavior, which is known to be related to cancer onset or progression, or (b) may directly influence endocrinological and/or immunological factors that are assumed to be precursors of cancer progression and death.

6.1 Health Behavior
Patients with increased depression as a persistent reaction to cancer, who experience intense or continuous stress (related to or independent of cancer treatment), or who lack sufficient social support may reveal more maladaptive health behaviors; for example, they tend to be less motivated to practice healthy eating habits, they are more often reliant on alcohol and smoking, and their sleep will be more frequently disturbed (Spiegel and Kato 1996). A number of epidemiological studies have shown that several of these health behaviors are significantly related to cancer progression.

6.2 Psychobiological Hypotheses
There has been extensive research that corroborates a connection between psychological distress, neuroendocrine effects, and progression of cancer (Garssen and Goodkin 1999). Chronic stress, depression, and the lack of social support are seen to be related to hyperactivity of the hypothalamus-pituitary-adrenal axis. The possible links between stress, neuroendocrine hyperactivity, and the progression of cancer are many. It is possible that increased cortisol can increase the onset of glucosis, which leads to a selective decrease in the growth of normal cells and enables the increase in growth of tumor cells. Another effect could be that cortisol, prolactin, or another stress-sensitive hormone could stimulate tumor growth directly in hormone-sensitive tumors such as breast cancer. A third effect, extensively researched, assumes that stress-induced hyperactivity of neuroendocrine functions will lead to a suppression of immune functions (Andersen et al. 1994).

7. Psychosocial Intervention
Psychosocial intervention has been studied in the acute in-patient setting, primarily in order to reduce treatment-related side effects, such as nausea, vomiting, and pain, as well as in late phases of the disease in order to enhance the ability to cope with cancer-related emotional distress and to mobilize social support.

7.1 Intervention for the Reduction of Treatment-related Side Effects
There are a number of studies which have investigated the efficacy of different behavioral methods for reducing chemo- or radiotherapy-induced nausea and vomiting, and for enhancing quality of life during the acute in-patient setting, e.g., relaxation, systematic desensitization, hypnosis, and biofeedback (Carey and Burish 1988). Whilst all methods show positive effects in the reduction of nausea, vomiting, and depression, relaxation, especially in combination with guided imagery, proved to be the most effective treatment.

7.2 Interventions for Enhancing the Coping Repertoire and Quality of Life
Several single therapeutic methods were investigated which focused on emotional and social support during cancer treatment. Spiegel et al. (1989) published a first controlled, randomized intervention study of group therapy offered to women with metastatic breast cancer. For a year, sessions of seven to ten women met weekly under professional supervision. These focused on discussions about death, problems in families, on the communication with their doctors, and on methods of enhancing the quality of life during the last phases
of their disease. Follow-up studies after a year revealed lower anxiety and depression, decreased pain and fatigue, and more adaptive coping strategies, when compared to a control group with nonspecific psychological support. In a ten-year follow-up significantly different survival times were observed: 36.2 months after the end of therapy in the experimental group, and 18.9 months in the control group.
Another startling group therapy study was published by Fawzy et al. (1993). They treated patients with a malignant melanoma for six weeks with a one-and-a-half-hour group session once a week. These included educational elements regarding coping with stress, seeking social support, and relaxation/imagination. Besides a significant reduction of anxiety, depression, and fatigue, for the first time modifications in immunological variables were demonstrated. The authors found significant increases in lymphocytes, natural-killer-cell activity, and alpha-interferon induced natural-killer cytotoxicity compared to a randomized control group. In the six-year follow-up, the experimental group showed a significantly higher survival rate than did the controls, with higher survival rates in patients who showed an increase in active coping behavior.

7.3 How Can We Find Out Who Needs Professional Psychological Support?
In order to incorporate psychosocial therapy into the routine treatment of cancer patients, the question of which patients might need professional psychosocial support, and which patients would respond to the offer, has to be answered. For instance, Wellish and Wolcott (1994) tried to define criteria from their clinical experience of bone marrow transplantation that would enable the assessment of the need for psychosocial therapy based on individual coping abilities, stress in daily life, and psychiatric history. An empirical approach would be the analysis of responders and nonresponders and their determinants within randomized controlled studies (see Helgeson et al. 2000). The question 'Who needs psychological therapy?' should possibly be modified to 'Who needs which sort of psychological therapy?' because it has been shown that more than 95 percent of patients in an acute phase of their cancer treatment will accept this kind of support (Hasenbring et al. 1999).

8. Future Implications
Further research is needed to replicate the important findings of Spiegel et al. (1989) and Fawzy et al. (1993), and to answer the question of how these interventions could be implemented in the routine treatment of cancer patients. A further field of research is the investigation of the tendency towards the chronicity of physical complaints such as pain and fatigue in long-term survivors, and to identify predictors of these. It can be assumed that psychosocial factors play a significant role in association with biological predictors, which would demonstrate the necessity for adjunct psychological therapy.

See also: Cancer-prone Personality, Type C; Childhood Cancer: Psychological Aspects; Chronic Illness, Psychosocial Coping with; Chronic Illness: Quality of Life; Chronic Pain: Models and Treatment Approaches; Gender and Cancer

Bibliography

Andersen B L, Kiecolt-Glaser J K, Glaser R 1994 A biobehavioral model of cancer stress and disease course. American Psychologist 49: 389–404
Andrykowski M A, Gregg M E 1992 The role of psychological variables in post-chemotherapy nausea: Anxiety and expectation. Psychosomatic Medicine 54: 48–58
Bonica J J 1985 Treatment of cancer pain: Current status and future needs. In: Fields H L, Dubner R, Cervero F (eds.) Proceedings of the 4th World Congress on Pain: Seattle. Raven, New York, pp. 589–616
Butow P N, Coates A S, Dunn S M 1999 Psychosocial predictors of survival in metastatic melanoma. Journal of Clinical Oncology 17: 2256–63
Carey M P, Burish T G 1988 Etiology and treatment of the psychological side effects associated with cancer chemotherapy: A critical review and discussion. Psychological Bulletin 104: 307–25
Cull A, Hay C, Love S B, Mackie M, Smets E, Stewart M 1996 What do cancer patients mean when they complain of concentration and memory problems? British Journal of Cancer 74: 1674–9
Curt G G A 2000 The impact of fatigue on patients with cancer: Overview of fatigue 1 and 2. The Oncologist 5(suppl 2): 9–12
Fawzy F I, Fawzy N W, Hyun C S, Elashoff R, Guthrie D et al. 1993 Malignant melanoma: Effects of an early structured psychiatric intervention, coping and affective state on recurrence and survival 6 years later. Archives of General Psychiatry 50: 681–9
Forsén A 1991 Psychological stress as a risk factor for breast cancer. Psychotherapy and Psychosomatics 55: 176–85
Garssen B, Goodkin K 1999 On the role of immunological factors as mediators between psychosocial factors and cancer progression. Psychiatry Research 85: 51–61
Greer S 1991 Psychological response to cancer and survival. Psychological Medicine 21: 43–9
Greer S, Morris T, Pettingale K W 1979 Psychological response to breast cancer: Effect on outcome. Lancet 2111: 785–7
Hasenbring M 1998 Predictors of efficacy in treatment of chronic low back pain. Current Opinion in Anaesthesiology 11: 553–8
Hasenbring M, Gassmann W, Arp K, Kollenbaum V, Schlegelberger T 1993 Belastungen, Ressourcen und Krankheitsverarbeitung im Verlauf einer Polychemotherapie: Ergebnisse einer prospektiven Längsschnittstudie. [Distress, resources and coping behavior during a polychemotherapy: results of a prospective study]. In: Muthny F A, Haag G (eds.) Onkologie im psychosozialen Kontext. Roland Asanger, Heidelberg, Germany
Hasenbring M, Schulz-Kindermann F, Hennings U, Florian M, Ramm G, Zander A R 1999 The efficacy of relaxation/imagery, music therapy and psychosocial support for pain
1448
Cancer Screening

relief and quality of life: First results from a randomized controlled clinical trial. Bone Marrow Transplantation 23(Suppl. 1): 549
Helgeson V S, Cohen S, Schulz R, Yasko J 2000 Effects of psychosocial treatment in prolonging cancer: Who benefits from what? Annals of the New York Academy of Science 840: 674–83
Holland J C, Rowland J H (eds.) 1998 Handbook of Psychooncology (2nd edn.). Oxford University Press, New York
Manne S L, Alfieri T, Taylor K L, Dougherty J 2000 Spousal negative responses to cancer patients: The role of social restriction, spouse mood, and relationship satisfaction. Journal of Consulting and Clinical Psychology 67: 352–61
Morrow G R 1992 Behavioural factors influencing the development and expression of chemotherapy induced side effects. British Journal of Cancer-Supplement 19: S54–60 66: 54–61
Spiegel D, Bloom J R, Kraemer H C, Gottheil E 1989 Effect of psychosocial treatment on survival of patients with metastatic breast cancer. Lancet 2: 888–91
Spiegel D, Kato P M 1996 Psychosocial influences on cancer incidence and progression. Harvard Review of Psychiatry 4: 10–26
Syrjala K L, Roth-Roemer S L, Abrams J R, Scanlan J M, Chapko M K, Visser S, Sanders J E 1998 Prevalence and predictors of sexual dysfunction in long-term survivors of marrow transplantation. Journal of Clinical Oncology 16: 3148–57
Wellish D K, Wolcott D L 1994 Psychological issues in bone marrow transplantation. In: Forman S J, Blume K G, Thomas E D (eds.) Bone Marrow Transplantation. Blackwell-Scientific, Boston, pp. 556–570

M. I. Hasenbring

Cancer Screening

Cancer is the uncontrolled growth and spread of abnormal cells (see Cancer: Psychosocial Aspects; Cancer-prone Personality, Type C; Childhood Cancer: Psychological Aspects; Gender and Cancer; Sun Exposure and Skin Cancer Prevention). Typically cancer develops slowly and has a long preclinical phase. Early detection of cancer means better prognosis because of early and lighter treatment. This can lead to improved quality of life and savings for the society.

1. Difference between Screening and Testing
Screening for cancer is any kind of test performed for systematic detection or exclusion of cancer, risk factors, or susceptibility to cancer assuming that the initiative comes from outside the individual: from the healthcare or screening organizer. Screening sorts out apparently healthy people who probably have the screened condition from those who probably do not. A screening test need not take the form of a large-scale population intervention. It is not intended to be diagnostic. Persons with positive or suspicious findings are checked further by diagnostic work-out. Screening is different from testing where the initiative comes from the individual, or from a health professional, in the course of a diagnostic process (Wilson and Jungner 1968, Health Council, Netherlands 1994). People screened are less prepared for the diagnosis and less aware of the condition they are screened for than those tested. Screening is aimed at unaware, healthy, asymptomatic people, and thus has heavier public health, social, psychological, and ethical implications than testing.
Prerequisites for successful screening (Wilson and Jungner 1968 for WHO) have been modified to be in accordance with developments in genetics (Nuffield Council of Bioethics), and specific European guidelines for mammography (x-ray of the breasts) screening have been developed (Kirkpatrick et al. 1993). In very few countries is the launch of new screening programs regulated by law; in some it is guided by recommendations. The criteria of the National Screening Committee (1998), which are based on the refined WHO criteria of 1968, are expressed as follows:
(a) The condition screened should be an important health problem and its natural history should be understood; it should be recognizable either at a latent or early symptomatic stage.
(b) The test should be simple, safe and reliable, inexpensive, and acceptable to those screened. The distribution of test values should be known, and the cut-off levels agreed upon. There should be an agreed policy for diagnostic evaluation of those with positive screening findings. The chance of physical or psychological harm to those screened should be less than the chance of benefit.
(c) The treatment or intervention should be effective with evidence that early treatment leads to a better outcome.
(d) The screening program should be clinically, socially, and ethically acceptable respecting the equity of access principle; it should also be cost-effective and managed and monitored according to quality assurance principles.
The public health rationale behind cancer screening is to reduce mortality and morbidity. The idea is to find cancer in a preclinical asymptomatic phase. Screening for risk factors or genetic susceptibility to cancer is based on the assumption that knowledge of risk affects lifestyle choices and interventions.

2. Screening Programs for Cancer
An example of cancers amenable to screening is breast cancer (the most common female cancer) screened by mammography, and also cervical cancer (by Pap smear screening). More than 1 in 10 women in industrialized countries get breast cancer in their lifetime. The natural history of breast cancer is known: there is an asymptomatic stage; treatment is more effective after early diagnosis, and based on
randomized controlled studies the disease-specific mortality among women over 50 has been reduced by a third in the screened group (Kerlikowske 1995, de Koning 2000). The test is acceptable and cost-effective. When screening is carried out as a public health program, the equity of access principle is fulfilled, too. Among over 100 known types, other potential cancers for screening include prostate cancer (in many countries already more common than lung cancer among males) by serum prostate specific antigen (PSA); colorectal cancer by fecal occult blood test and sigmoidoscopy, and to a lesser extent lung, stomach, and ovarian cancer by different technologies.
Different countries implement different screening policies, some more enthusiastically than others. This shows value judgments in the interpretation of scientific evidence, as well as differences in health culture and the availability of resources. The cost of large-scale high-technology screening exceeds the resources of most societies. Low-technology options, like clinical breast examination for breast cancer and visual inspection for cervical cancer, in use in some developing countries, are often the next best, although less effective options.

3. Social and Psychological Aspects of Cancer Screening

3.1 Determinants of Screening Provision and Uptake
In society, screening can be studied both as a health service offered and an act of an individual. Different cultural and resource factors on both society and service level determine the screening provided (Table 1). These, plus individual factors, determine the uptake of screening (Marteau 1993). A tradition of centrally organized public healthcare encourages compliance, but also provides equity in access, whereas services based on self-initiation attract those people who usually take care of their health. Examples of individual level determinants are social background, health behavior in general, beliefs, perceived risk, and knowledge (Aro et al. 1999). Previous screening experience influences adherence to later screening rounds.
Theoretical models used to study determinants of cancer screening uptake have mostly been cognitive expectancy value models predicting self-initiated participation. They are probably not so good at predicting behavior because, where social and service contexts are essential, the models are too individualistic, cognitive (e.g., concept of intention), and general for cancer screening. The stages of change theory can be applied to adherence to repeated screening rounds, especially if the context factors are included in the model.

3.2 Implications of Screening Provision and Uptake
Person-years saved, preferably also quality adjusted life years, are the main implications of screening uptake at the society level (Table 1). Uptake as well as provision of screening raise costs and awareness, and have an impact on values in the society. Medicalization
Table 1
Screening in a society: examples of determinants and implications of screening provision and uptake on different
levels of society
Screening provision Screening uptake

Determinants Implications Determinants Implications


Society level
values awareness culture lives saved
health policy costs healthcare costs
healthcare medicalization context demand
resources awareness
Service level
healthcare organization staff workload
insurance education invitation treatment
tradition access follow-up
values cost
resources context
Individual level
NAa awareness beliefs inconvenience
worry health relief
(nonattenders) behavior worry
experience false reassurance
a not applicable
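
Section 3.2 cites a false positive rate of less than 1 percent per round for a good quality organized mammography program, and a cumulative risk close to 50 percent over 10 rounds in US screening (Elmore et al. 1998). How a per-round rate compounds over repeated rounds can be sketched as follows. This is a minimal illustration only: it assumes that rounds are statistically independent and share one constant per-round rate, which is a simplification not made in the source and not the estimation method of Elmore et al.; the function names are hypothetical.

```python
def cumulative_false_positive_risk(per_round_rate: float, rounds: int) -> float:
    """Chance of at least one false positive in `rounds` screening rounds.

    Assumption (not from the source): rounds are independent and share the
    same per-round false positive rate.
    """
    if not 0.0 <= per_round_rate <= 1.0:
        raise ValueError("per_round_rate must lie between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    # Complement of "no false positive in any round".
    return 1.0 - (1.0 - per_round_rate) ** rounds


def implied_per_round_rate(cumulative_risk: float, rounds: int) -> float:
    """Per-round rate that would produce `cumulative_risk` over `rounds` rounds,
    under the same independence assumption."""
    if not 0.0 <= cumulative_risk < 1.0:
        raise ValueError("cumulative_risk must lie in [0, 1)")
    if rounds <= 0:
        raise ValueError("rounds must be positive")
    return 1.0 - (1.0 - cumulative_risk) ** (1.0 / rounds)


if __name__ == "__main__":
    # 1 percent per round over 10 rounds (upper bound quoted for organized programs).
    print(f"{cumulative_false_positive_risk(0.01, 10):.3f}")
    # Per-round rate implied by a 50 percent cumulative risk over 10 rounds.
    print(f"{implied_per_round_rate(0.50, 10):.3f}")
```

Under these assumptions, a 1 percent per-round rate compounds to a cumulative risk of just under 10 percent over 10 rounds, whereas a cumulative risk of 50 percent over 10 rounds corresponds to a per-round rate of roughly 7 percent. The two figures cited in the text therefore describe quite different per-round rates.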

is intensified if the screened condition is seen as something bad and unwanted. Diagnostic work-out and treatment increase the amount of work needed. Since screening also detects cancers which otherwise would never become clinical, or which would appear later (overdiagnosis), the workload of the service provider is further increased.
People invited to, or otherwise made aware of screening, might get worried about the screened condition, and therefore refrain from participating. An invitation to screening may raise worry and a screening test may be inconvenient or bring other hazards. A normal screening finding brings relief, but a finding of any other sort can cause long-lasting anxiety. For those with a false positive finding, worry, especially cancer-specific worry, and its behavioral indicators, preoccupation with symptoms, and increased use of health services, might last long past the reassuring diagnosis (Rimer and Bluman 1997, Aro et al. 2000). Intrusive diagnostic procedures, like surgery, can be very stressful. A true positive diagnosis of cancer due to screening may both prolong life and improve its quality. However, this impact cannot be evaluated individually but only at a group level. Finding cancer, which would never have become clinical in a lifetime, is clearly a major adverse effect of screening. A false negative finding misleadingly reassures and delays diagnosis. A true negative finding is reassuring, assuming that screening has no adverse side effects. A good quality organized mammography screening program (for ages 50+ every two or three years) finds breast cancer in five out of 1,000 screened women, has a 1:1 malignant:benign ratio of breast biopsies (Kirkpatrick et al. 1993), and a very low (even less than 1 percent) false positive rate per screening round. For a woman attending several screening rounds in a lifetime, the cumulative risk of a false positive finding rises considerably, to close to 50 percent in 10 screening rounds in the US screening (Elmore et al. 1998).
While the screening uptake literature is partly based on theoretical models in predicting self-initiated attendance, literature on implications of the screening process and findings has a nearly nonexistent theoretical framework. The cognitive-behavioral models of illness anxiety rising from the tradition of psychosomatic studies have mostly been used to guide research questions and the development of methods.

4. Challenges in Studying Cancer Screening
The study of cancer screening is enriched by the examination of the determinants and implications of both provision and uptake, including the levels of society, service, and the individual. A prospective design, most easily built in organized screening programs with a planned schedule, enables the measurement of predictive determinants. Context-specific measures, in addition to standard ones, reveal valuable factors, such as improved cancer awareness and enhanced health behavior.
Screening, however, is not the solution for cancer control. It is costly, cost-effective only among certain age or risk groups, and not globally feasible. Prevention would be better than screening for an already existing disease, but causal factors of cancer are still poorly understood. Of the predicted 20 million new cancer cases every year by 2020, up to 14 million will occur in developing countries. At present one in three to four people in industrialized countries fall ill with cancer during their lifetime. The overall five-year survival rate for cancers is over 50 percent, but for small (<1 cm) breast cancers it exceeds 90 percent.
Screening for risk factors or susceptibility to cancer is closer to prevention than screening for disease, but has different implications. Screening for disease detects few cases, most of which would become clinical later without screening, whereas screening for risk factors finds a lot of people who will never develop the disease. Eventual screening for genetic susceptibility to cancer has to meet the criteria of good quality screening as well. Informed consent is essential and, in addition to risk notification, social and psychosocial consequences need to be understood and intervention options (lifestyle changes, choosing a partner, medication, preventive surgery, other kinds of therapy, and combinations or interactions of these) clarified. Potential screening programs need to be evaluated against accepted criteria (Hakama 1991) by healthcare technology assessment methods. Public debate and the empowerment of lay people to take part in discussions on values, ethics, screening policies, and priorities need to be encouraged. Modeling software provides one way of forecasting effects and impacts of screening where experimental studies are not acceptable or feasible. People need knowledge and skills to control the cancer problem both in the roles of decision-makers and service providers, and as individuals.

See also: Cancer-prone Personality, Type C; Cancer: Psychosocial Aspects; Childhood Cancer: Psychological Aspects; Gender and Cancer; Genetic Screening for Disease-related Characteristics; Health Behavior: Psychosocial Theories; Health Care Technology; National Health Care and Insurance Systems; Risk Screening, Testing, and Diagnosis: Ethical Aspects; Screening and Selection; Sun Exposure and Skin Cancer Prevention; Vulnerability and Perceived Susceptibility, Psychology of

Bibliography
Aro A R, Absetz P S, van Elderen T M, van der Ploeg E, van der Kamp L J T 2000 False-positive findings in mammography screening induce short-term distress - breast cancer-specific concern prevails longer. European Journal of Cancer 36: 1089–97

Aro A R, de Koning H J, Absetz P, Schreck M 1999 Psychosocial predictors of first attendance
for organised mammography screening. Journal of Medical Screening 6: 82–8
de Koning H J 2000 Assessment of nationwide cancer-screening programmes. Lancet 355: 80–1
Elmore J G, Barton M B, Moceri V M, Polk S, Arena P J, Fletcher S W 1998 Ten-year risk of false
positive screening mammograms and clinical breast examinations. New England Journal of
Medicine 338: 1089–96
Hakama M 1991 Screening. In: Holland W W, Detels R, Knox G, Fitzsimons B, Gardner L (eds.)
Oxford Textbook of Public Health. Oxford University Press, Oxford, UK, Vol. 3
Health Council of the Netherlands: Committee Genetic Screening 1994 Genetic Screening. Health
Council, publication No. 1994/22E, The Hague
Kerlikowske K, Grady D, Rubin S M, Sandrock C, Ernster V L 1995 Efficacy of screening
mammography. A meta-analysis. Review. Journal of the American Medical Association 273: 149–54
Kirkpatrick A, Törnberg S, Thijssen M A O (eds.) 1993 European Guidelines for Quality Assurance
in Mammography Screening. Office for the Official Publications of the European Communities.
Report EUR 14821. Luxembourg
Marteau T M 1993 Health-related screening: Psychological predictors of uptake and impact. In:
Maes S, Leventhal H, Johnston M (eds.) International Review of Health Psychology. Wiley,
Chichester, UK, Vol. 2
Morrison A S 1992 Screening in Chronic Disease, 2nd edn. Oxford University Press, New York
National Screening Committee 1998 First Report of the UK National Screening Committee.
Department of Health, London
Rimer B K, Bluman L G 1997 The psychosocial consequences of mammography. Journal of the
National Cancer Institute Monograph 22: 131–8
Wilson J M C, Jungner G 1968 The Principles and Practice of Screening for Disease. WHO, Geneva

A. R. Aro

Cannibalism

Cannibalism is the eating of one’s own kind. Of the 75 mammalian species known to eat
conspecifics, three are higher primates: chimpanzees and gorillas, usually under conditions of
social or environmental stress—and humans. Among these species the rarity of cannibalism,
relative to the myriad opportunities and temptations to engage in it, suggests a
biologically-based aversion that may have evolutionary utility. Many cultures have formulated
this aversion as a taboo, or as an attitude which passively regards the behavior as unthinkable
or faintly ridiculous. In most cultures where cannibalism was institutionalized (and as such is
now thought to be extinct in the world), the element of taboo appeared in the guise of
restrictions on the practice of cannibalism: the categories of persons eligible to eat and be
eaten, the body parts consumed, and various procedural rules marking the eating of one’s own
kind as a special, often ritually significant, form of behavior. In societies in which
cannibalism was a component of initiation into a religious cult or secret society, it was
frequently the very act of violating the taboo and overcoming the fear and loathing associated
with cannibalism that provided the element of ordeal common in such induction rites. In
symbolic contexts, the cultural treatment of cannibalism is akin to that of incest, and in
mythology the two images are often closely associated; they also raise similar explanatory
problems, in that each rests upon the interaction of biological and cultural factors.

If a phylogenetic aversion toward cannibalism does exist, it is one that can be modified—though
perhaps not fully abolished—by culture. Thus, from the actor’s point of view the objective
definition of cannibalism as applied to humans may be complicated by locally defined
differences between own kind and other kind. If members of alien groups are taken to be
nonhuman, eating them may not be perceived or psychologically registered as cannibalistic.
Ontological distancing of this sort is not uncommon among cultures practicing cannibalism and
although objectively such maneuvers may be seen as rationalizations, it is important to grant
that cultural understandings are capable of creating a situation in which eating the other is
palatable, even commendable, whereas eating one’s own is repulsive, horrific, or insane.

As the quintessential uncivilized Other, and a projected inversion of Self, the Cannibal has
been an object of fascination in many societies and for much of recorded history. Writing in
the fifth century BC, Herodotus reports that man-eating is a custom among the Anthropophagi,
dwelling in the wild lands north of the Black Sea, and among other groups living beyond the
light of Greek civilization. In the sixteenth century, Montaigne’s ironic essay Of Cannibals
used the man-eating attributed to tribes in the New World as a foil to criticize certain
European barbarities of his time. The implied ‘otherness’ of cannibalism appears in later
literary works, such as Daniel Defoe’s Robinson Crusoe (1719), Herman Melville’s Typee (1846),
and Joseph Conrad’s Heart of Darkness (1902), to epitomize the exoticism of the faraway
Americas, South Seas, and Africa, respectively. Today, novelists and filmmakers sometimes
employ cannibal themes to convey the grotesque, nightmarish, or psychotic nature of certain
situations and characters. The same fascination excites the public’s appetite for accounts of
present-day incidents of cannibalism under conditions of starvation or psychopathology.

As a staple figure in the world’s folklore traditions, the Cannibal’s otherness is usually
certified by imagining the creature as not quite human: a witch, ghost, demon, giant, god,
ogre, vampire, were-animal, or other monster. By negative example, the Cannibal discloses the
nature of morality, sociality, and other virtues. Freudian theorists (e.g., Klein 1975)
interpret


such images as projections of unconscious ideas originating in early childhood. Thus, the
infantile wish to devour the parent is converted into the fear of being devoured by the
parent—a fear made acceptable to the ego by virtue of the parent-figure’s fantastical disguise.

Types of literal cannibalism vary according to motive and circumstance; so great is the
diversity, in fact, that it tends to overwhelm the common feature of ingestion and confound
efforts to understand cannibalism as a unitary phenomenon. Auto-cannibalism conceivably applies
to the act of eating cast-off parts of oneself—hair, nail clippings, mucous, excrement,
placenta—but is perhaps better reserved for instances in which, under torture or other duress,
individuals partake of their living flesh, raw or cooked. Greek mythology imagines this horror
in the fate of Erisichthon, who unwisely violates a grove sacred to Ceres. As punishment, the
goddess calls for Famine to visit the offender and deliver upon him a hunger so insatiable
that, eventually, he devours his own body, limb by limb, unto death. Gustatory cannibalism
refers to the unceremonious eating of human flesh, invariably that of an enemy, simply as food.
Although reported for parts of Africa and Melanesia, some authorities doubt the existence of
such culturally unadorned cannibalism; for, as Sahlins (1983, p. 88) remarks, ‘cannibalism is
“symbolic,” even when it is “real.”’ Phrased slightly differently: noncannibals are fascinated
by cannibalism because they don’t practice it; cannibals are fascinated by cannibalism because
they do. Accordingly, while human flesh is undoubtedly a potential source of high-quality
protein (Harner 1977), cannibalism as a cultural practice cannot be satisfactorily explained
purely as a response to particular material conditions. Epicurean cannibalism regards human
flesh as a delicacy, an aesthetic normally requiring the victim to be perceived as something
less than—at any rate, other than—human. Medicinal cannibalism is the ingesting of human
tissue, usually that of an executed criminal, as a supposed medicine or tonic; this practice
existed throughout Europe during the sixteenth, seventeenth, and eighteenth centuries
(Gordon-Grube 1988). Innocent cannibalism is so named because the perpetrator is unaware that
he or she is eating human flesh; thus, in Greek legend, Atreus punishes his brother Thyestes
for seducing his wife, by tricking him into eating his own children. Survival cannibalism
occurs under starvation conditions, such as shipwreck, military siege, and famine, in which
persons normally averse to the idea are driven to the act by the will to live. That some
persons under such conditions have been known to accept death instead of resorting to
cannibalism attests to the psychological strength of the taboo and the corresponding revulsion
many individuals feel toward the cannibal act. Survival cannibalism, especially when combined
with murder, raises legal and moral issues having to do with the ‘defense of necessity,’ the
first judicial analysis of which occurred in British courts when, in 1884, survivors of the
wrecked yacht Mignonette were successfully prosecuted for murder (Simpson 1984).

In works of ethnology and ethnohistory, aggressive cannibalism, usually in the context of
warfare, is the most frequently reported form. For many peoples of Africa, the Americas, and
the South Pacific, cannibalism is the supreme expression of hostility: killing and eating
someone is the ultimate act of annihilation; reducing the hated enemy to feces is the ultimate
insult. Such motives appear, also, in the large-scale cannibalism reportedly perpetrated upon
‘class enemies’ in China’s Guangxi Province during the Cultural Revolution of 1966–1976 (Zheng
1996). In some societies the act aggressively appropriates the victim’s spiritual strength, by
consuming either the entire body or particular parts thereof, such as the heart, liver,
genitals, or head. Early Spanish explorers reported that the Tupinamba of coastal Brazil
adopted captives into their community, where they received excellent treatment and sometimes
even married, before being slaughtered and eaten. Similarly, the Aztecs are said to have
treated war captives with great honor and solicitude, as a prelude to sacrificing them to the
gods and giving their bodies over to be eaten by the nobility. In these cases, then,
cannibalism occurs in conjunction with sacrifice, but only after the victim is ennobled and
made precious as an offering through incorporation into the group. Such sacrificial cannibalism
is widely noted (e.g., Hogg 1966) and is shown graphically, if symbolically, in the holiest of
Christian rituals—the eucharistic ‘eating’ of the body and blood of Christ.

As implied, most types of volitional cannibalism involve eating an enemy or, at the least,
someone from outside the group (‘exocannibalism’). Eating someone from within the group
(‘endocannibalism’) usually occurs as a form of mortuary (or funerary) cannibalism, in which
all or part of the body is consumed as an act of piety or affection, as a source of spiritual
strength or continuity, as a kind of recycling of the spirit for the purpose of group renewal
or reproduction, or as the means for assisting the spirit of the deceased to attain a desirable
state in the afterlife. A premortem variant may be found in the Chinese idea (and practice) of
ultimate piety, that is, a child giving a piece of his own flesh to nourish an ailing parent.
The unconscious desire to incorporate the loved object—a common psychoanalytic
observation—supports the notion that mortuary practices in many societies include symbolic
elements that are sublimations of cannibalistic impulses or, in a more speculative vein,
vestiges of actual cannibalism in the past (Sagan 1974).

As a topic of scholarly and popular interest, cannibalism was jolted by the publication of a
book doubting that institutionalized cannibalism ever existed, anywhere. In The Man-eating
Myth, author Arens (1979) argues that the centuries of reports about cannibalism in remote
times or places rest on not a shred of reliable, eyewitness evidence. Instead of


describing literal practices, the widespread attribution of ‘cannibalism,’ Arens suggests, is a
derogatory racial or ethnic stereotype—a device for debasing or demonizing ‘the other,’ thus
proving the need to govern or convert them. Scholars, he concludes, have been too quick and
uncritical in accepting such attributions at face value.

The debate generated by Arens’s book comes down to whether or not one is prepared to accept
eyewitness accounts from missionaries, administrators, and adventurers, or the circumstantial
evidences of ethnography or ethnohistory. Although few anthropologists agree with Arens’s
blanket dismissal of such sources, the controversy has usefully heightened both scholarly
awareness of the ideological potential of ‘cannibalism’ and empirical rigor in studies of
cannibalism as a culturally embedded, institutionalized practice (e.g., Brown and Tuzin 1983,
Sanday 1986, Goldman 1999); it has also helped to inspire a body of literary and historical
studies (e.g., Lestringant 1997) which examine ‘cannibalism’ as a metaphor in colonial
discourse or in explicitly imaginary treatments of the cultural ‘other.’

Of particular interest, insofar as it offers the only remaining possibility of direct evidence
of willful cannibalism, is a resurgence of archaeological attention to prehistoric cannibalism
and a refinement of techniques to test for it. Thus, modern taphonomic analyses of skeletal
remains recovered at Anasazi sites (AD 900–1300) in the US Southwest (Turner and Turner 1999)
and at Fontbrégoua Cave, a Neolithic site in southeastern France (Villa et al. 1986), strongly
indicate that cannibalism occurred in those places in association with violent homicide,
probably in the context of war or political terrorism. It is hoped that these new techniques
may be used to test long-standing suspicions of cannibalism on the part of more ancient
hominids—Australopithecus africanus, Homo erectus, Homo neanderthalensis, and archaic Homo
sapiens.

See also: Ritual; Taboo

Bibliography

Arens W 1979 The Man-eating Myth: Anthropology & Anthropophagy. Oxford University Press,
Oxford, UK
Brown P, Tuzin D (eds.) 1983 The Ethnography of Cannibalism. Society for Psychological
Anthropology, Washington, DC
Goldman L R (ed.) 1999 The Anthropology of Cannibalism. Bergin and Garvey, Westport, CT
Gordon-Grube K 1988 Anthropophagy in post-renaissance Europe: The tradition of medicinal
cannibalism. American Anthropologist 90(2): 405–9
Harner M 1977 The ecological basis for Aztec sacrifice. American Ethnologist 4(1): 117–35
Hogg G 1966 Cannibalism and Human Sacrifice, 1st American edn. Citadel Press, New York
Klein M 1975/1932 The Psycho-analysis of Children (trans. Strachey A). Dell, New York
Lestringant F 1997 Cannibals: The Discovery and Representation of the Cannibal from Columbus to
Jules Verne. University of California Press, Berkeley, CA
Sagan E 1974 Cannibalism: Human Aggression and Cultural Form. Harper & Row, New York
Sahlins M 1983 Raw women, cooked men, and other ‘great things’ of the Fiji Islands. In: Brown P,
Tuzin D (eds.) The Ethnography of Cannibalism. Society for Psychological Anthropology,
Washington, DC, pp. 72–91
Sanday P R 1986 Divine Hunger: Cannibalism as a Cultural System. Cambridge University Press,
Cambridge, UK
Simpson A W B 1984 Cannibalism and the Common Law: The Story of the Tragic Last Voyage of the
Mignonette and the Strange Legal Proceedings to Which it Gave Rise. University of Chicago Press,
Chicago
Turner C G II, Turner J A 1999 Man Corn: Cannibalism and Violence in the Prehistoric American
Southwest. University of Utah Press, Salt Lake City, UT
Villa P, Bouville C, Courtin J, Helmer D, Mahieu E, Shipman P, Felluomini G, Branca M 1986
Cannibalism in the Neolithic. Science 233: 431–6
Zheng Y 1996 Scarlet Memorial: Tales of Cannibalism in Modern China (trans. Sym T P). Westview
Press, Boulder, CO

D. Tuzin

Capitalism

Capitalism is an economic system based on the free market, private enterprise, and private
ownership. The term and concept of capitalism as an exploitative socioeconomic system were
initially introduced by Marxist social sciences. The spread of the term, however, was
strengthened and its concept changed by the Western critiques of Marxism and socialism.
Although they rarely used it, sociology and economics accepted the term, and historiography
broadly discusses its half-a-millennium development and various stages from commercial to
globalized capitalism.

1. Concepts and Theories

The term, derived from capital (root in Latin caput, ‘head’) and related to capitalist,
appeared in the mid-late nineteenth century in the English, German, French, and other European
languages. The word capital had been used from the seventeenth century onward in the economic
sense, meaning ‘fund of money,’ and later ‘principal’ (as opposed to interest). The term
capitalist appeared in mid–late eighteenth century French, English, and American texts. The
first recorded occurrence of the form capitalism in English is ‘The sense of capitalism sobered
and dignified Paul de Florac’ in Thackeray’s The Newcomes II (1854).

The spread of the term capitalism in common language use paralleled its use in Marxist socio-


economic theory, the appearance of socialist/communist economies in various countries, and the
expansion of Cold War rhetoric based on the opposition of the two socioeconomic systems. The
term and concept of capitalism as a socioeconomic system were basically introduced by the first
theorists of Marxist social sciences. The spread of the concept, however, was strengthened by
Western critiques of socialism and Marxism. Although they rarely used it, sociology and
economics accepted the term, and historiography broadly discussed the transition from feudalism
to capitalism, and the various stages of capitalist development from its early commercial
phase via industrial and financial capitalism, to its late, twentieth century characteristics.
The capitalist world system is often discussed according to the pattern of advanced core and
dependent, backward periphery. Development studies and economics analyze late twentieth
century globalization and the increased role of transnational companies in a more
inter-related, but at the same time even more polarized capitalist world economic system.

Karl Marx, who often used the terms capital, capitalist, and capitalistic, did not use the term
capitalism as a noun in such basic writings as the Communist Manifesto and Das Kapital. The
term appeared only later in his correspondence on the development of Russian capitalism during
the late 1870s. The concept finally gained scholarly interpretation and became widely used by
Marxist theory and the socialist movement in the last third of the nineteenth century. The
concept was broadly used in historiography and sociology, but much less frequently in
non-Marxist economics, to describe a phase of history or a set of ideas and mentality. During
the twentieth century, capitalism became a popular term, especially to contrast the
private-market system with socialism.

Marxist theorists interpreted the term capitalism as a socioeconomic system, a ‘mode of
production,’ in which ‘capital is not a thing but a social production relation belonging to a
definite historical formation of society’ (Marx 1952, Chap. 48, English trans.; first published
in 1867). Capitalism is a class society basically consisting of two antagonistic classes:
bourgeoisie and proletariat. The system historically came about through ‘primitive
accumulation,’ the separation of peasantry from the means of production, and the creation of a
free proletariat deprived of property, whose only saleable commodity was its labor power. The
ruling class, the bourgeoisie, owns the means of production, buys the labor power, and controls
all economic transactions and production processes. In the capitalistic reproduction process,
capitalists who own the capital buy various types of commodities (raw material, machinery,
labor power, etc.) for producing new commodities for sale. In this reproduction process money
is transformed into commodities, then again into money (M–C–M). The capitalist, however, is
motivated by making profit and wants to get more money back at the end of the reproduction
process (M–C–M’). The source of profit is one of the commodities—the labor power, which creates
a ‘surplus value.’ The value of the goods produced by the workers is higher than the value of
the labor power bought for a wage, which covers the expenses of the reproduction of labor power
(including the reproduction of the class itself). Capitalists expropriate this ‘surplus,’ thus
exploiting the proletariat. In capitalism, therefore, social production is in sharp contrast
with individual expropriation. As a consequence, wealth is accumulated in the hands of
capitalists, while the proletariat is kept in poverty. Capitalism, however, creates its own
‘grave diggers’—the ever-increasing proletariat who, uniting together in the advanced West,
rises up to destroy the capitalist system through a proletarian revolution. Bourgeois
expropriators become expropriated and a just communist society without classes and private
ownership is established.

Post-Marxian Marxism revised this theory. Eduard Bernstein, the leading theorist of the
powerful German Social Democratic Party, rejected the concept of proletarian revolution in his
landmark article series, Probleme des Socialismus (1896). He argued that capitalism has
developed in a direction different from the one indicated by Marx. Poverty has not grown but
diminished, and welfare reforms have been accepted peacefully. Bernstein and the Western Social
Democratic movement maintained that the proletariat could realize its interests and goals by
organized mass movement and parliamentary reforms. Another leading German socialist, Rudolf
Hilferding, in his Das Finanzkapital (1909), added that modern economy created ‘consciously
regulated social relations.’ Later he concluded in the theory of ‘organized capitalism,’ thus,
a state-controlled welfare system.

Revolutionary socialism emerging during the first decades of the twentieth century redefined
contemporary capitalism as the higher or the ‘last’ stage of capitalism, otherwise called
imperialism. In her The Accumulation of Capital, Rosa Luxemburg (1951, English trans.; first
published in 1913) maintained that capitalism would not be able to expand its markets, recruit
new labor force, and mobilize new resources without a ‘third person,’ i.e., noncapitalist
(independent farmer) sectors of the advanced countries, and, most of all, the agricultural
economy of the colonies and peripheries. The Russian Vladimir I. Lenin defined imperialism as a
new, expansionist stage of capitalism, which was also characterized by the domination of
‘monopolies,’ where the competition of small individual capitalists is replaced by the deadly
competition of monopolies, not only locally but also in the international arena. According to
this view, capital exports became a new instrument for exploiting backward peripheral
countries. Imperialism has led to the fight for the redistribution of the colonies and spheres
of interest, and this unavoidably generates wars, and leads to the


proletarian revolution of the backward world. This version of Marxism was the leading ideology
of the Bolshevik revolution in Russia, a revolutionary wave in Central and Eastern Europe after
World War I, and the communist takeover in a series of Central and Eastern European, Asian, and
African countries after World War II.

Non-Marxist scholarship did not build on the Marxist concept and interpretation of capitalism,
but the term itself became commonly used in scholarship. Many twentieth century sociologists
and economists used the term capitalism to mean a society and economy based on the free market
and entrepreneurial interest, and accepted these characteristics as the essence of capitalism,
in contrast to socialism. Critiques of Marxism, therefore, contributed just as much to the
spread of the term and concept of capitalism. Because of his work on the role of the Protestant
ethic in the development of capitalism, Max Weber played an instrumental role in making
capitalism a commonly used term in scholarship. Ludwig von Mises and his disciple Friedrich von
Hayek frequently used the term capitalism when launching the most radical attacks against
Marxist theory and socialist economy and defending an undisturbed free market. Joseph
Schumpeter’s book Capitalism, Socialism, and Democracy (1943) emphasized the changing
evolutionary character of capitalism because of its entrepreneurial interest that ‘incessantly
revolutionizes the economic structure from within’ and, through ‘creative destruction,’
destroys the old and outdated and creates the new.

The broadly accepted non-Marxist meaning of capitalism is rather different from the Marxist
interpretation. According to the most widespread definition, capitalism is a complex
socioeconomic system where the bulk of the means of production is in private hands guaranteed
by modern property rights and laws. An ‘invisible hand’ (Adam Smith), the operation of the
market, regulates the entire economy, production, distribution, and labor. The market operates
according to the play of supply and demand, which influence price fluctuations and
automatically regulate investments and output. As Say’s law phrased it: each supply creates its
demand. The market mechanism, therefore, is a self-regulating system. Capitalist entrepreneurs
act individually and freely in a rational organization and control themselves with double-entry
bookkeeping, motivated by gain and increase of profit. Capitalism, in this sense, is in sharp
contrast with both its rigidly regulated feudal predecessor and contemporary rival socialism.
Hereditary, almost unchangeable social status, basically noncommercialized production for local
consumption, strict communal and guild regulations, and the lack of free private property,
labor, and market make economic relations in feudalism rigid, as opposed to capitalism; and a
nonmarket economy based on state ownership and central planning distinguishes socialism from
capitalism.

While capitalist society in its early stage has been, and in contemporary backward countries
is, characterized by extreme social polarization and income inequity, capitalism granted equal
rights and, in principle, freedom of choice later in the advanced countries. From the 1870s on,
but mostly from the 1950s, the industrial working class began shrinking, and the ‘white-collar’
layers, including the middle class, gradually expanded and became the most dominant layer of
society. During the 1960s–80s, a two-thirds to three-quarters white-collar and middle class
majority dominated advanced societies.

2. Early Stages of Development

Capitalism, broadly accepted as a phase of history, has dominated the last half-a-millennium.
It has also become conventional that capitalism itself had various phases and stages. Its
antecedents go back to ancient history. Barter and exchange of one thing for another, as Adam
Smith, the main advocate and ideologue of market capitalism, argued, are a part of human
nature. The pockets of private-market economy and its institutions, especially in urban
settlements, flourished in medieval Europe. Moneylenders and urban merchants were its main
representatives.

Its continuous development, however, began mostly from the sixteenth century. Historiography
has produced a vast literature on the transition from feudalism to capitalism. These writings
maintain that declining productivity, a demographic crisis, and the scarcity of peasant labor
generated the lowering of rents, a decrease of labor services, and an increase in peasant
mobility and freedom in Western Europe. In most of these countries, consequently, serfdom was
gradually loosened and disappeared by the sixteenth century. Commercialization of agriculture
played a central role in the development of capitalism. The flourishing textile industry in
Britain with an excess production over consumption led to the enlargement of productive
capacities on an unheard-of scale and made England the ‘workshop of the world.’

This stage between the sixteenth and eighteenth centuries is often called ‘commercial
capitalism’ in history. The system was based on private ownership of warehouses and ships and
buying and selling goods all over the world. The merchants also lent out money at interest and
established connections with production. They bought and distributed raw materials for peasants
who worked at home in the traditional way (putting-out system), and gradually subordinated the
rural cottage industry. This led to proto-industrialization, the development of a decentralized
putting-out system coordinated by merchants, and later the establishment of (nonmechanized)
factories, which introduced division of labor in the production procedure.


The emergence of capitalism, as various historians faire capitalism. Smith argued that from the selfish
have discussed it, also gained incentive from the activity of the entrepreneurs, driven by profit mo-
discovery of the ‘New World.’ The unlimited inflow of tivation, the entire society profited, and that the
gold and silver from Latin America led to the individual actions, because of the self-regulating im-
accumulation of wealth. Although Spain was the main pact of the market, created an economic harmony and
beneficiary at the beginning, most of the wealth got well being for the nation. Smith established the
into the hands of Dutch and English merchants. ‘It is economic thought and popular thinking on capitalist
not the importation of gold and silver—stated Adam market economy for two centuries. Political thinkers
Smith—that the discovery of America has enriched and economists of latecomer countries, however, at-
Europe … [It] made a most essential [change] … by tacked British laissez-faire ideology. The American
opening a new inexhaustible market to all the com- Hamilton and Germans Johann-Gottlieb Fichte and
modities of Europe, it gave occasion to a new division Friedrich List argued against laissez-faire in favor of
of labor.’ The new markets and emerging Atlantic protective state interventionism.
trade enriched the commercial powers of northwest The end of the eighteenth century is considered to be
Europe and made them the core of the emerging a turning point in the development of capitalism. As a
modern world system from the sixteenth century on. consequence of the British Industrial Revolution, a
This period saw the rise of modern world trade, new stage of more mature, as social scientists and
characterized by the trade of mass consumption goods historians called it, ‘industrial capitalism’ emerged
of food and textile, instead of the medieval trade of and dominated the nineteenth century. Starting in
luxury goods from the near East and India. While the Britain, the industrialization process transformed the
core countries of Western Europe sold processed core countries of the capitalist world system into
industrial products, they bought unprocessed raw industrialized economies with a high, sustained econ-
materials and agricultural produces from the per- omic growth. Industrial capitalism, in the first period
ipheral countries of the world system: Latin America of its existence, institutionalized long working days,
and Eastern Europe. and exploited female and child labor ruthlessly in the
The rise of capitalism was strongly influenced by factory system. Living conditions of the workers in
cultural-ideological and institutional factors. The crowded and polluted industrial cities, as described by
Protestant reformation of the sixteenth century Charles Dickens and Friedrich Engels, were extremely
created a new work ethic and lifestyle, which was a poor. Capitalism, however, reached its dynamic period
prime mover of capitalist development. Hard work of development. A banking revolution in Belgium and
and thrift became virtues, which was in sharp contrast France, then, most of all, in Germany, introduced the
to parasitic noble values and lifestyle, the so-called Cre! dit Mobiler type, then the German type of in-
Hidalgo (Spanish) and Szlachta (Polish) attitude. vestment banks with huge industrial portfolios. These
Meanwhile, the new moral code legitimized capitalist banks played an instrumental role in industrializa-
inequity: character was idealized, richness considered tion. The relatively small, family-managed factories
to be the triumph of will and work, poverty became a enlarged, and most of them became joint stock com-
moral failing. panies with professional management. From the mid-
In the absolute state between the sixteenth and nineteenth century on, a European railroad system
eighteenth centuries, centralizing policy creating uni- created a united European transportation network
form monetary systems, legal codes, unified and and an integrated marketplace. In this period virtually
defended home markets without internal tariffs, also the entire globe became part of the capitalist world
played an important role. The isolation of the domestic system. The core countries offered an unlimited
market by limiting imports was accompanied by a market for food and raw materials. Britain and a
tremendous effort to build up a large colonial empire handful of Western European countries absorbed
and a system of various privileges and stimuli for nearly 70 percent of the world’s food and raw material
exports. The state itself became a promoter of the exports. New institutions strengthened international
development of market, industry, trade, and trans- capitalism, such as a free trade system and the gold
portation by the creation of road systems, canals, standard—a fiscal underpinning of the free trade
navies, and even state-owned industries. All these system making the national currencies convertible,
served to strengthen the state, but also historically which was gradually introduced from the 1860s to the
paved the way for dramatic capital accumulation and 1870s.
modern capitalism. The expansion of capitalism around the turn of the
nineteenth\twentieth centuries led to considerable
changes in the system. Werner Sombart introduced the
2.1 Industrial Capitalism and Beyond
term of ‘late capitalism,’ and Rudolf Hilferding named
In the last quarter of the eighteenth century, Adam this new period ‘finance capitalism.’ The concept of
Smith published his powerful theory and ideology of ‘Imperialism’ (Hobson) also appeared and became a
the free market system, the Wealth of Nations, which leading concept of revolutionary Marxism. Some of
became instrumental in the development of laissez- the socialist theorists, however, such as Karl Kautsky,


considered imperialism as a ‘policy’ of capitalism, similar to long workdays and child labor in its first phase. He maintained that capitalism could exist without colonies by introducing new policies.
Twentieth-century capitalism was characterized by the challenge of major wars, worldwide economic depression, and a significant slowing down of economic growth. More importantly, capitalism had to face the challenge of socialism, first as a movement, then as a rival socioeconomic system. A great part of the twentieth century was characterized by the struggle of the two socioeconomic systems. Capitalism reacted with a highly flexible adjustment to the new challenges. During the wars and the depression, the development of state capitalism, with strong and effective state interventionism, a return to protectionism, inflationary financing, and even state investments and planning, helped the system to respond to the new challenge. Large state-owned sectors emerged in Italy, Spain, and Poland. National planning was introduced in Germany and Hungary to cope with economic decline and unemployment. Self-regulating, laissez-faire market capitalism was replaced everywhere by a regulated market system. Adam Smith’s theory was replaced by John Maynard Keynes’ concept of state interventionism to create additional demand.

3. The Welfare State, Mixed Economy and Back to Globalized Laissez-Faire

The socialist challenge to capitalism required an effective response. The first chapter of this rivalry opened in the late nineteenth century. Chancellor Otto von Bismarck, in his struggle with the strong German socialist movement, turned towards the introduction of welfare institutions. A pension system, health care, and other welfare institutions were introduced for workers to take the wind out of the sails of the emerging socialist movement. For these developments, the Great Depression of the early 1930s brought about a turning point. The Swedish Social Democratic government, elected in 1932, started building a welfare state. President Franklin D. Roosevelt also initiated major social legislation and the introduction of the social security system. The comprehensive system of the welfare state, however, emerged only after World War II in Western Europe. Capitalism, in contrast with its nineteenth-century features, initiated widespread social legislation. Paid vacation, a short workweek, nationwide pension systems, free health care, education, maternity leave, cheap housing, and many other institutions were established. Human rights were re-evaluated and the right to work, i.e., a full-employment policy, became dominant. Inequity, at least in the middle layers of society, markedly diminished in the advanced capitalist countries.
The period after the Second World War also saw the spread of mixed economies. Almost all of the advanced Western European countries, led by France and Britain, nationalized entire sectors of their economies and established new state-owned firms in various industries.
Capitalism underwent a spectacular transformation. In his article ‘The Instability of Capitalism’ (1928), the leading Austrian-American economist Joseph Schumpeter prophesied the transformation of the system and the fading of the original meaning of capitalism. He maintained that capitalism is inherently stable and able to recover from great crises, but generates socio-intellectual effects discordant with its spirit and institutions. As a consequence, ‘Capitalism is … in so obvious process of transformation into something else … [that] it will be merely matter of taste and terminology to call it Socialism or not.’
Towards the end of the twentieth century, capitalism adjusted to the new historical situation and was markedly transformed, but it preserved entrepreneurial interest, market flexibility, efficiency, and competitiveness, and reached its highest prosperity and fastest growth rate ever. The Western core countries of the world system successfully adjusted to a new technological-communication revolution and became the leaders of new technologies and the rise of postindustrial society. During the long period of postwar prosperity, a part of the former Asian and South European peripheries of the capitalist world system achieved higher than average growth and became an integrated and equal part of the core. After a long period of rivalry, capitalism emerged victorious while the parallel world system of centrally planned state socialism, which conquered one-third of the world and influenced an even greater part by its policy after World War II, collapsed in 1989–91. The former Soviet Bloc countries introduced radical market reforms, thus re-establishing capitalism. During the last third of the twentieth century, laissez-faire capitalism became the leading model again. Neo-liberal economics triumphed and challenged the mixed economy and the welfare state.
Capitalism in the late twentieth century reached a new stage in its history. The main trend of this new age is globalization. National boundaries and national economies rapidly lost their importance. Multi- or transnational companies penetrated previously independent economies. Foreign direct investment became instrumental all over the world, including the advanced core countries. Towards the end of the 1990s nearly one-third of American exports and two-thirds of imports were intrafirm deliveries. About one-third of French, Dutch, and British industrial output was produced and roughly 25–40 percent of their research and development expenditure was financed by affiliates of transnational companies. In Ireland and in former state-socialist Hungary, foreign affiliates produced two-thirds of industrial output and financed roughly 70 percent of research and development expenditures.


While globalization, in some cases, contributed to a successful catching-up process with the core, it also preserved and even strengthened the core–periphery inequity. The gap between the advanced core and the peripheries grew considerably: the intercountry income spread was 10:1 in 1913 and 26:1 in 1950, and it increased to 40:1 at the end of the twentieth century. A newly globalized but even more polarized capitalist world system, dominated by an expanding Western core, was emerging at the turn of the millennium.

See also: Capitalism: Global; Economic History; Industrialization, Typologies and History of; Marxian Economic Thought; Marxist Social Thought, History of; Polanyi, Karl (1886–1964); Social History; Weberian Social Thought, History of

Bibliography
Aston T H, Philpin C H E 1985 The Brenner Debate. Cambridge University Press, Cambridge, UK
Baldwin P 1990 The Politics of Solidarity. Class Bases of the European Welfare State 1875–1975. Cambridge University Press, Cambridge, UK
Hacker L M 1940 The Triumph of American Capitalism. Simon & Schuster, New York
Heimann E 1964 History of Economic Doctrines. Oxford University Press, New York
Hobson J A 1902 Imperialism. A Study. Nisbet & Co, London
Kautsky K 1914 Der Imperialismus. Neue Zeit, Berlin
List F 1904 The National System of Political Economy. Green and Co., London
Luxemburg R 1951 The Accumulation of Capital. Routledge, London
Marx K 1952 Capital. Encyclopedia Britannica, Chicago
Schumpeter J 1943 Capitalism, Socialism, and Democracy. Allen & Unwin, London
Smith A 1976 An Inquiry into the Nature and Causes of the Wealth of Nations. University of Chicago Press, Chicago
Sombart W 1921–8 Der moderne Kapitalismus. Duncker and Humblot, Munich, Germany, Vols. I–III
Sweezy P M 1942 The Theory of Capitalist Development. Principles of Marxian Political Economy. Oxford University Press, New York
Wallerstein I 1974 The Modern World System: Capitalist Agriculture and the Origins of the European World-Economy in the Sixteenth Century. Academic Press, New York
Weber M 1992 The Protestant Ethic and the Spirit of Capitalism. Routledge, London

I. T. Berend

Capitalism: Global

While the idea of international capitalism has been part of political economy and related disciplines for centuries, the concept of global capitalism is of more recent origin. Although many scholars have used the terms interchangeably, the widespread use of the concept of globalization in the 1990s has strengthened the argument for a firm distinction between these concepts. International capitalism has generally been conceptualized in state-centrist terms, focused mainly on how national capitalists based on competing national economies and working through national companies operated across borders. The distinctive concept of global capitalism takes its departure from the idea of a global economy dominated by globalizing corporations and those who own and control them, and those in influential positions who serve their interests. The article traces the development of this conception of global capitalism since the 1950s, from the transitional idea of capitalist world-economy through several attempts to establish a genuinely global conception of capitalism not grounded in national economies and societies.

1. Introduction

Researchers on global capitalism have focused on three inter-related phenomena, increasingly significant since the 1960s. These are, first, the ways in which transnational corporations (TNCs) have facilitated the globalization of capital and the production of goods and services; second, the rise of new global forms of organization of the capitalist class; and third, transformations in the global scope of TNCs that own and control the mass media, notably television channels and the transnational advertising agencies, and their role in promoting global brand consumer goods and the emergence of a global culture and ideology of consumerism. Theory and research on each of these three phenomena roughly coincide with attention to the economic, political and culture–ideology spheres of global capitalism.

2. Global Capitalism as an Economic System

It is no coincidence that interest in a global capitalist system in contrast to competing national capitalisms increased perceptibly from the 1950s. The context of theoretical and empirical interest in competing national capitalisms was (and for many still is) the history of colonialism and imperialism. This is overlaid with several versions of the theory that capitalist states could more or less successfully plan their own economic futures (see, for example, Keynesianism, regulation theory, and the developmental state). The concept of international capitalism, therefore, refers to a system of interacting and competing national capitalist economies, in which national elites of various types use ‘their’ big business (and businesses) to further national interests around the world. As direct imperialism and colonialism came to an end and as increasing numbers of very large TNCs began to


emerge in the 1960s, attention began to shift decisively from national to global capitalism.
The dependency approach to development and underdevelopment of Gunder Frank and the related world-systems approach of Wallerstein both highlighted the systemic nature of capitalism as a worldwide phenomenon over several centuries. While these theoretical innovations can be said to have prepared the ground for it, neither entirely succeeded in establishing a coherent concept of global capitalism for what came to be termed the ‘age of development’ from the 1950s onwards. This was due to ambivalence over their units of analysis and insufficient focus on the role of the major corporations in development in general. For example, the analysis of core, semiperiphery and periphery in the world-systems approach is based on national economies, as is the theoretically more ambitious concept of commodity chains. The simple assertion that these take place within a world-economy or global economy does not suffice for an analysis of global capitalism.
Theories of global capitalism, in the sense used here, take off from the proposition that capitalism entered a new, global phase in the second half of the twentieth century. By the end of the century, the largest TNCs had assets and annual sales far in excess of the gross national products (GNPs) of most of the countries in the world. In 2000, only about 70 countries out of a total of around 200 for which there were data had GNPs of more than 10 billion US dollars. By contrast, the Fortune Global 500 list of the biggest corporations by turnover reported that around 450 of them had annual sales greater than US$10 billion. This comparison, however, underestimates the economic scale of major corporations compared with sovereign states, as most TNC revenues are counted as part of the gross domestic product (GDP) of some countries. A more appropriate measure is to compare TNC revenues with government revenues. Gray (1999) has calculated that for 1997–99, the seven largest economic entities in the world were the budgets of the governments of the USA, Germany, Japan, China, Italy, UK and France, but nine of the top 20 were corporations, and of the richest 100 economic entities by revenues, 66 were TNCs. Thus, in this important sense, such well-known names as General Motors, Shell, Toyota, Unilever, Volkswagen, Nestle, Sony, Pepsico, Coca Cola, Kodak, and Xerox, the huge Japanese trading houses (and many other corporations most people have never heard of?) have more economic power at their disposal than the majority of the countries in the world. These figures indicate the gigantism of TNCs relative to the state budgets of most countries.
Not only have TNCs grown enormously in size since the 1950s, but their global reach has expanded dramatically. Many TNCs regularly earn more than half of their revenues outside the countries in which they are legally domiciled. This is not only the case for TNCs from countries with relatively small domestic markets (for example, Switzerland, Sweden, Canada, Australia), but also for those legally domiciled in the USA and Japan. While most of the biggest corporations are still headquartered in the First World, several dozen companies originating in what is conventionally called the Third World, or that part of it known as the newly industrializing countries, have been numbered among the 500 biggest companies by revenues in the world. This group has included the state-owned oil companies of Brazil, India, Mexico, Taiwan and Venezuela (owned by the state but increasingly run like private corporations), banks in Brazil and China, and the Korean manufacturing and trading conglomerates (chaebol), some of which have attained global brand-name status (for example, Hyundai and Samsung).
Writers who are skeptical that capitalism is a global system argue that because most major TNCs are legally domiciled in the USA, Japan and Europe, and because they trade and invest mainly between themselves, capitalism is still best analyzed in terms of national corporations. For such writers, the global economy is a myth and, consequently, there is no global capitalist system as such. Against this conclusion, proponents of the salience of a global capitalist system argue that an increasing number of corporations operating outside their countries of origin see themselves as developing global strategies of various types, as is obvious from the contents of their annual reports and other corporate publications. While not all parts of all economies are equally globalizing, an increasing volume of empirical research indicates that the production and marketing processes of most major industries are being de-territorialized from their countries of origin and that these processes are being driven by the TNCs. The central issue for economic globalization is whether TNCs domiciled in the USA, Japan, Europe and other countries are more fruitfully conceptualized as expressing the national interest of their countries of origin (the globo-skeptic argument) or as expressing the private interests of those who own and control them, aggregated as the interests of global capitalism. Even if historical patterns of TNC development have differed from country to country and region to region, it does not logically follow that TNCs and those who own and control them express any type of ‘national’ interest or national character.
The formal ownership of capital and the corporations has been transformed since the 1960s. The ownership of share capital has increased throughout the world by means of greater participation of the general population in stock markets (though participants are still a tiny minority in most communities) and the indirect investments that hundreds of millions of people have through their pension funds and other forms of savings. This has led some to argue that economic globalization has created a popular capitalism, though others argue the more elitist thesis that the real drivers


of the global capitalist system are the managers of unit would tend to be people from many countries, more
trusts and pension funds. However, formal ownership and more of whom begin to consider themselves as
does not necessarily mean effective control over capital citizens of the world as well as of their places of birth
and the resources of TNCs. and residence (these might differ). And third, they
The globalization of the international financial and would tend to share similar lifestyles, particularly
trading system can be fruitfully analyzed in terms of patterns of higher education (in cosmopolitan business
the progressive weakening of the nation-state and the schools) and consumption of luxury goods and ser-
growing recognition that the major institutions of vices. Theory and research to support this argument
global capitalism, notably TNCs and globalizing are as yet in a very embryonic phase. Nevertheless, and
financial and trading organizations, are setting the despite real geographical and sectoral conflicts, the
agenda for these weakened nation-states. Theory and whole of the transnational capitalist class shares a
research on this issue has, not surprisingly, led to an fundamental interest in the continued accumulation of
increased interest in the politics of global capitalism. private profit on a global scale.
Many other Marxist, anti-Marxist and Marx-inspir-
ed scholars see capitalism as a global system, but tend
3. Global Capitalism as a Political System to conceptualize globalization in much wider terms
and, thus, minimize the importance of global capi-
The politics of global capitalism is debated intensely talism as an explanatory variable. The most important
inside and outside the social sciences. Since the of these are the geographer, Harvey (1989), whose
disintegration of the Soviet empire from the late 1980s, notion of time–space compression has been very
the struggle between capitalism and communism has influential, and the sociologist, Giddens (1990), whose
been largely replaced by the struggle between the conception of global capitalism is but one element in
advocates of capitalist triumphalism and the oppon- his theory of globalization as a product of late (and
ents of capitalist globalization. Many theorists have reflexive) modernity. Both of these contributions are
discussed these issues within the triadic framework of significant for their attempts to build a bridge between
states, TNCs and international economic institutions. the debates around economic, political and cultural
From this perspective, the global capitalist system is globalization.
dominated by the relations between the major states
and state-systems (USA, the European Union, and
Japan), the major corporations, and the World Bank, 4. Global Capitalism as a Culture–Ideology
the International Monetary Fund, World Trade Or- System Dominated by Large Globalizing Media
ganisation (supplemented in some versions by other and Adertising Corporations
international bodies, major regional institutions, and
so on). The third distinctive aspect of capitalism as a global
The idea of a transnational ruling class has been system is the worldwide diffusion and increasingly
suggested by several authors, notably in Cox’s thesis concentrated ownership and control of the electronic
(1987) on the emergence of a global class structure and mass media, particularly television. For example, the
in the work of Gill (1990) on the Trilateral Com- number of TV sets per capita has grown so rapidly in
mission, where he identifies a ‘developing trans- developing countries (from fewer than 10 per 1,000
national capitalist class fraction.’ Sklair (2001) pro- population in 1970 to 145 per 1,000 in 1995, according
poses a more explicit concept of a transnational to UNESCO) that many researchers argue that a
capitalist class, and it plays a central role in his theory globalizing effect due to the mass media is taking place
of the capitalist global system. Here, the transnational all over the world.
capitalist class is the characteristic institutional form Ownership and control of television, including
of political transnational practices in the global capi- satellite and cable systems, and associated media (like
talist system (paralleling the role of transnational newspaper, magazine and book publishing, films,
corporations in the economic sphere and consumerism video, records\tapes\compact disks, and a wide var-
in the culture–ideology sphere). In this formulation iety of other marketing media, notably the internet),
the transnational capitalist class is analytically divided are concentrated in relatively few very large TNCs.
into four main fractions: (a) TNC executives and their The predominance of US-based corporations is being
local affiliates; (b) globalizing bureaucrats and poli- challenged by Japan, Europe and Australia-based
ticians; (c) globalizing professionals; (d) consumerist groups globally, and even by Third World corpora-
elites (merchants and media). tions like the Brazil-based media empire of TV Globo.
A transnational capitalist class might be trans- The culture–ideology of consumerism prioritizes
national in at least three senses. First, its members the exceptional place of consumption and consumer-
would tend to have outward-oriented global rather ism in contemporary capitalism, increasing consump-
than inward-oriented national perspectives on a var- tion expectations and aspirations without necessarily
iety of issues, for example, support for free trade and ensuring the income to buy. The extent to which
neoliberal economic and social policies. Second, they economic and environmental constraints on the pri-

1461
Capitalism: Global

vate accumulation of capital challenge the global lenges of environmental harm and declining stocks of
capitalist project in general and its culture–ideology of resources essential for the maintenance of global
consumerism in particular, is a central issue for theory capitalist consumerism. However, the persistence of
and research on the capitalist global system. Never- problems of pollution, health risks, environmental
theless, it should be pointed out that most scholars degradation and waste management intrinsic to the
studying these issues of global culture do so not in system, suggests that it will be difficult to avoid
terms of capitalism but in terms of the potential ecological crisis. In this context, the attempt by TNCs
impact of global culture on national and local cultures and their supporters in international bureaucracies,
and identities. the professions, government (globalizing bureaucrats
and politicians), and the mass media to capture the
idea of sustainable development and to reconcile it
5. Crises of Global Capitalism with capitalist globalization is worth further study.

While Marxist and Marx-inspired theories of the


inevitability of a fatal economic crisis of capitalism 6. Resistance to Global Capitalism
appear to have lost most of their adherents, at least
two related but logically distinct crises of global Global capitalism is often seen in terms of impersonal
capitalism have been identified. The first is the sim- forces (notably market forces, free trade) wreaking
ultaneous creation of increasing poverty and increas- havoc on the lives of ordinary and defenseless people
ing wealth within and between societies (the class and communities. It is not coincidental that interest in
polarization crisis), not to be confused with Marx’s economic globalization has been accompanied by an
emiseration thesis which failed to predict enormous upsurge in what has come to be known as New Social
increases in wealth for rapidly expanding minorities all Movements (NSM) research. NSM theorists, despite
over the world. The second is the unsustainability of their substantial differences, argue that the traditional
the system (the ecological crisis). These crises are often response of the labor movement to global capitalism,
interpreted through the prism of consumerism in- based on class politics, has failed. In its place, a new
herent in a global capitalist system based on globaliz- analysis based on identity politics (notably of gender,
ing corporations (Sklair 2001). Globalizing corpora- sexuality, ethnicity, age, community, belief systems)
tions (in some cases rather more clearly than national has been developed, directed towards resistance to
governments) recognize the class crisis, but largely in sexism, racism, environmental damage, war-monger-
marketing terms. In most communities around the ing, capitalist exploitation and other forms of human
world the absolute numbers of people who are rights abuses.
becoming global consumers have been increasing The globalization of identity politics involves the
rapidly over recent decades. However, it is also true establishment of global networks of people with
that in some communities the absolute numbers of the similar identities and interests outside the control of
destitute and near-destitute are also increasing, some- international, state and local authorities. There is a
times alongside the new rich consumers. The best substantial volume of research and documentation on
available empirical evidence (see United Nations such developments in the women’s, peace, indigenous
Development Programme Human Development Re- peoples’ and environmental movements, some of it in
port, published annually since 1990) suggests that the direct response to perceived TNC malpractices. This
gaps between rich and poor have widened since the provides a series of research-rich connections for
1980s in many parts of the world. The very poor scholars influenced by postmodernist and global capi-
cannot usually buy the goods and services that global talist theories.
capitalists sell. While there is a long way to go before Serious challenges to global capitalism in the econ-
consumer demand inside the rich First World is omic sphere have also come from those who ‘think
satisfied, the gap between the rich and the poor all over global and act local.’ This normally involves disrupt-
the world is not welcome news for TNCs. In addition ing the capacity of TNCs and global financial institu-
to the profits lost when poor people who want to buy tions to accumulate private profits at the expense of
goods and services do not have the money or even the their workforces, their consumers and the communi-
credit to do so, the increasing visibility of the new rich ties that are affected by their activities. An important
and the new poor in an age of constant global media aspect of global capitalism is the increasing dispersal
exposure directly challenges capitalist claims that of the manufacturing process into many discrete
everyone eventually benefits from economic globaliza- phases carried out in many different places, populariz-
tion. ed by Dicken (1998) as global shift. Being no longer so
The ecological crisis is also directly connected with dependent on the production of one factory and one
consumerism, encapsulated in the political and ideo- workforce gives capital a distinct advantage, par-
logical struggles over the concept of sustainable ticularly against the strike weapon that once gave
development. Since the 1980s most major TNCs have tremendous negative power to the working class.
developed detailed policies in response to the chal- Global production chains can be disrupted by strategi-

Cardiovascular Conditioning: Neural Substrates

cally planned stoppages, but this is generally more of Gill S 1990 American Hegemony and the Trilateral Commission.
an inconvenience than a real weapon of labor against Cambridge University Press, Cambridge, UK
capital. The global division of labor builds flexibility Gray C 1999 Corporate Cash. WEP, Eugene, OR
Harvey D 1989 The Condition of Postmodernity. Blackwell,
into the system so that not only can capital migrate
Cambridge, MA
anywhere in the world to find the cheapest reliable Herman E S, McChesney R 1997 The Global Media: The New
productive sources of labor but also few groups of Missionaries of Corporate Capitalism. Cassell, London
workers can any longer decisively ‘hold capital to Lechner F J, Boli J (eds.) 1999 The Globalization Reader.
ransom’ by withdrawing their labor. At the level of the Blackwell, Malden, MA
production process, globalizing capital has a sub- Ross R J S, Trachte K C 1990 Global Capitalism: The New
stantial advantage over labor. In this respect, the Leviathan. State University of New York Press, Albany, NY
global organization of the TNCs and allied institutions Sklair L 1995 Sociology of the Global System, 2nd edn. Johns
like the World Bank and the WTO have generally Hopkins University Press, Baltimore, MD
Sklair L 2001 The Transnational Capitalist Class. Blackwell,
proved to be too powerful for local labor and
Malden, MA
community organizations. Strange S 1996 The Retreat of the State: The Diffusion of Power
Nevertheless, global capitalists, if we are to believe in the World Economy. Cambridge University Press, Cam-
their own propaganda, are continuously beset by bridge, UK
opposition, boycott, legal challenge, and moral out- United Nations Development Programme 1990–onwards
rage from the consumers of their products, concerned Human Development Report. Oxford University Press, New
citizens, and by disruptions from their workers. There York
are also many ways to be ambivalent or hostile about van der Pijl K 1998 Transnational Classes and International
cultures and ideologies of consumerism, some of which Relations. Routledge, London
Wallerstein I 1979 The Capitalist World Economy. Cambridge
the Green movement has successfully exploited.
University Press, Cambridge, UK
The issue of democracy is central to the prospects
for global capitalism and the struggle against it. The L. Sklair
rule of law, freedom of association and expression,
freely contested elections, as minimum conditions and
however imperfectly sustained, are as necessary in the
long run for mass market based global consumerist Cardiovascular Conditioning: Neural
capitalism as they are for alternative social systems. Substrates
While most theory and research on capitalism
continues to be state-centrist, focusing largely on how
it works within specific countries, the growing influ- Our existence relies on our ability to alter our future
ence of globalization in the social sciences appears to behavior as a function of our past experiences, a
be encouraging more scholars to consider capitalism phenomenon known as learning. This requires that a
in the global as well as the local and/or national permanent record of our experiences, our memories,
context. be stored in the brain to be recalled to guide our future
actions. The last three decades of the twentieth century
See also: Capitalism; Economic Growth: Theory; witnessed intense investigations of the brain areas
Global History: Universal and World; Globalization where memories are formed and stored, and the
and World Culture; Globalization: Geographical structural changes that represent them. Many neuro-
Aspects; Globalization: Political Aspects; Globaliz- scientists have adopted Pavlovian learning paradigms
ation, Subsuming Pluralism, Transnational Organ- in animals to aid in these investigations. These
izations, Diaspora, and Postmodernity; International relatively simple paradigms present an opportunity to
identify the brain circuitry and mechanisms that
Business; International Marketing; International
contribute to learning as reflected in the acquisition of
Trade: Economic Integration; International Trade: specific learned responses. Among these responses are
Geographic Aspects; Marx, Karl (1818–89); Mult- learned cardiovascular responses, and particularly
inational Corporations; World Systems Theory learned heart rate responses, which are the focus for
this article.
Bibliography
1. Pavlovian Heart Rate Conditioning: A Model
Cox R W 1987 Production, Power, and World Order: Social
Forces in the Making of History. Columbia University Press,
to Assess the Brain Circuits that Contribute to
New York Memory
Dicken P 1998 Global Shift: Transforming the World Economy,
3rd edn. Paul Chapman, London 1.1 Pavlovian Conditioning
Gereffi G, Korzeniewicz M (eds.) 1994 Commodity Chains and
Global Capitalism. Praeger, Westport, CT During Pavlovian conditioning the relationship
Giddens A 1990 The Consequences of Modernity. Polity Press, among events is learned. For example, an animal
Cambridge, UK learns that one event, such as the presentation of an


auditory or visual stimulus, that repeatedly precedes a That area was the avian homologue of the mammalian
second event, such as the presentation of an electric amygdala. Its destruction blocked the development of
shock or a pleasant morsel of food, provides in- the CR. This finding guided future research focused on
formation about the occurrence of the second event. the mammalian amygdala as a component of a brain
The auditory or visual stimulus is called the con- circuit that contributes to learning and memory, using
ditional stimulus (CS), and the electric shock or a heart rate CR in the rabbit as a model response.
morsel of food is called the unconditional stimulus
(US). This learned relationship or association is stored 2. Heart Rate Conditioning in the Rabbit
in memory such that subsequent presentations of the
CS will elicit expectations regarding the occurrence of
2.1 Circuit Components
the US. These expectations elicit responses, called
conditioned responses (CRs), which are a specific The heart rate decelerative, or bradycardic, CR in the
consequence of the formation and storage of the rabbit has been widely used to identify the components
association between the CS and the US. Implicit in of the circuit that contribute to memory formation
the search for the structural changes that form the and storage (see Kapp et al. 1998 and Powell 1994 for
substrates for Pavlovian associative memories is the reviews). This CR develops when an acoustic CS
assumption that if an association between the CS and immediately precedes either an aversive or appetitive
US is made, neuronal information concerning both US. The neurons comprising the final motor pathway
must converge at a common brain structure(s). Thus, for the expression of this CR are located in the
research directed at identifying structures involved in medulla, and their axons travel via the vagus nerve to
memory has been guided by an identification of the heart. The identification of this CR pathway
structures where CS and US information converge. created the opportunity to identify the structures that
may activate it. As noted in Sect. 1.3, Cohen’s work
pointed to the amygdala, and research in the rabbit
1.2 Heart Rate Conditioning
has demonstrated an important contribution of a
Researchers have used heart rate CRs as model CRs to group of neurons in this structure to CR expression.
identify the areas where memories are formed and These neurons are located in the central nucleus (ACe)
stored. These CRs are advantageous because the and project directly and indirectly to the final motor
locations of the motor neurons that produce them are path neurons. Electrical stimulation of the ACe
known. Thus, structures that send projections to produces bradycardia and lesions of the ACe markedly
activate these neurons leading to the expression of the attenuate the development and expression of the CR
CR can be identified using anatomical methodology. (Kapp et al. 1998). Further, increases in the activity of
The subsequent identification of areas that send ACe neurons developed to the CS during conditioning
information to the areas that in turn activate the (Applegate et al. 1982, McEchron et al. 1995). These
motor neurons, including the pathways by which CS results suggest that these neurons excite the motor
and US information access these areas, exposes an neurons leading to CR expression and are an im-
entire circuit that contributes to memory. Identi- portant component of the circuit for the development
fication of the circuit permits the identification of sites and expression of the CR (Fig. 1). Recall that an
of CS and US convergence and sets the stage for assumption in the search for the neural substrates of
analyses of the structural changes that form the Pavlovian associative memories is that CS and US
substrates for memory of the CS–US association. information must converge at a common site(s). Thus,
identification of the pathways conveying CS and US
information to central structures such as the ACe is an
1.3 An Early Model: Heart Rate Conditioning in the
important component of the analysis. The most
Pigeon
peripheral components of the acoustic CS pathway
David Cohen (1980) was the first to use a heart rate have yet to be identified. However, research using an
CR to identify the components of a circuit that acoustic CS in the rat has demonstrated that de-
contributes to memory. He measured the accelerative struction of neurons in the inferior colliculus, a
heart rate CR in the pigeon that developed in response structure that receives information from the most
to a visual CS which preceded an electric current (US) peripheral components of the auditory system (Fig. 1),
applied to the foot. He believed that a systematic blocks the development of several Pavlovian CRs
analysis of the flow of neuronal information, from CS (LeDoux 1995). The inferior colliculus in turn projects
and US input to CR output, should eventually lead to to several other auditory structures including the
the identification of sites of CS and US convergence. magnocellular component of the medial geniculate
While the pigeon model is no longer used, Cohen’s nucleus (MGm). MGm destruction blocks CR deve-
strategy proved successful and was adopted by others lopment in the rabbit (McCabe et al. 1993). The
using other models. Importantly, he identified an area MGm projects to the lateral amygdaloid nucleus (AL),
where the structural changes responsible for the and AL neurons respond to acoustic stimuli (Quirk et
memory of the association potentially may occur. al. 1995). Neurons in the AL project both directly and


(Fig. 1). Does CS and US convergence occur in the


amygdala, making it a potential candidate for the site
where the memory is formed? Or, does convergence
occur in other areas as well? With respect to the ACe,
neurons within this structure are responsive to both
the CS and US (McEchron et al. 1995), and the
responses that develop to the CS in this nucleus during
conditioning may well reflect a structural change(s)
that represents the memory for the association. How-
ever, the mere development of these responses is not
conclusive evidence that it is a site where this change
occurs. Such responses may reflect their relay from
other brain regions. For example, neurons in the
rabbit and rat MGm, and in the rat AL, respond to
both the CS and US (Bordi and LeDoux 1994,
Romanski et al. 1993, McEchron et al. 1996a). Fur-
thermore, responses in the MGm develop to the CS in
many species during Pavlovian conditioning (Edeline
et al. 1988, McEchron et al. 1995), and such changes
also develop in the rat AL (Quirk et al. 1995). Thus,
the responses that develop in the ACe may represent
responses relayed from the MGm to the AL and from
the AL to the ACe. Importantly, increases in synaptic
conductivity that indicate a structural change have
been demonstrated in both the MGm and AL during
Pavlovian conditioning (McEchron et al. 1996a, Rogan
et al. 1997). The as-yet-unidentified structural sub-
strates for these increases may well represent the
neural basis for the memory of the association formed
between the CS and US, and the resultant heart rate
Figure 1 CR that develops. The extent to which increases in
A simplified diagram of the putative structures and synaptic conductivity occur in the ACe has yet to be
pathways comprising the brain circuit that contributes demonstrated. To the extent that it does occur in
to the acquisition of the bradycardic CR in the rabbit the ACe, then the overall picture is one in which the
structural change(s) that form the substrate for the
indirectly via intra-amygdaloid connections to the associative memory during Pavlovian conditioning
ACe (Fig. 1). may occur at several sites within the essential circuit.
Less is known about the pathway by which US As reviewed in Sect. 2.1, considerable research
information gains access to central structures to devoted to an identification of the essential circuit for
converge with CS information. This pathway may be conditioning bradycardia has focused on the
comprised of projections from spinal trigeminal com- amygdala and MGm. It is important to realize,
plex neurons, which are activated by US presentations however, that other brain structures also appear to
(McEchron et al. 1996b) and which project to the contribute importantly to the development of this CR.
ventral posterior medial thalamic nucleus. The latter’s Two in particular, the prefrontal cortex and the
destruction in the rabbit prevents the development of cerebellar vermis, make important contributions
the bradycardic CR (McEchron et al. 1995). The (Powell 1994, Supple and Kapp 1993). Lesions of
pathway by which the US information is transmitted both areas severely retard CR development, and
to converge with CS information in more central neuronal activity develops to the CS in both areas. The
structures is at present unknown. However, recordings functional interactions of these structures with
of neurons in the CS pathways to determine if they the MGm and amygdala in the development of the
respond to CS and US presentations indicate sites bradycardic CR will be an important focus for
where convergence occurs, as described in Sect. 2.2. research in the early twenty-first century. An equally
important focus will be (a) a determination of the
exact contribution that each of these components
2.2 Potential Sites of Memory Formation: The
makes to CR development, and (b) the exact site(s)
Convergence of CS and US Information
and nature of the cellular substrates for Pavlovian
The above research suggests that CS information associative memories. Obviously, much needs to be
accesses the amygdala, and that the ACe produces CR accomplished if we are to completely understand their
expression via an influence on vagal motor neurons neural basis.


See also: Autonomic Classical and Operant Condi- Quirk G J, Repa J C, LeDoux J E 1995 Fear conditioning
tioning; Classical Conditioning and Clinical Psycho- enhances short-latency auditory responses of lateral amygdala
neurons: Parallel recordings in the freely behaving rat.
logy; Classical Conditioning, Neural Basis of; Coro-
Neuron 15: 1029–39
nary Heart Disease (CHD): Psychosocial Aspects; Rogan M T, Stäubli U V, LeDoux J E 1997 Fear conditioning
Eyelid Classical Conditioning; Pavlov, Ivan Petrovich induces associative long-term potentiation in the amygdala.
(1849–1936) Nature 390: 604–7
Romanski L M, Clugnet M C, Bordi F, LeDoux J E 1993
Somatosensory and auditory convergence in the lateral
nucleus of the amygdala. Behavioral Neuroscience 107: 444–50
Supple W F Jr, Kapp B S 1993 The anterior cerebellar vermis:
Bibliography Essential involvement in classically conditioned bradycardia
in the rabbit. Journal of Neuroscience 13: 3705–11
Applegate C D, Frysinger R C, Kapp B S, Gallagher M 1982
Multiple unit activity recorded from amygdala central nucleus B. Kapp
during Pavlovian heart rate conditioning in rabbit. Brain
Research 238: 457–62
Bordi F, LeDoux J E 1994 Response properties of single units in
areas of rat auditory thalamus that project to the amygdala.
II. Cells receiving convergent auditory and somatosensory
inputs and cells antidromically activated by amygdala stimu-
lation. Experimental Brain Research 98: 275–86
Care and Gender
Cohen D H 1980 The functional neuroanatomy of a conditioned
response. In: Thompson R F, Hicks L H, Shvyrkov V B (eds.) Care consists of the physical, emotional, and in-
Neural Mechanisms of Goal Directed Behavior and Learning. tellectual processes that enable humans to maintain
Academic Press, New York their lives and activities; such activities usually are
Edeline J M, Dutrieux G, Neuenschwander-El Massioui N 1988 distinguished from economic production. Care is a
Multiunit changes in hippocampus and medial geniculate central aspect of human life, but social scientists have
body in freely behaving rats during acquisition and retention only recently begun to pay careful attention to the
of a conditioned response to a tone. Behavioral and Neural connection of care and gender. Since most activities of
Biology 50: 61–79 care historically have belonged properly in the same
Kapp B S, Silvestri A J, Guarraci F A 1998 Vertebrate models of
learning and memory. In: Martinez J L, Kesner R P (eds.)
sphere as the family and been the activity of women,
Neurobiology of Learning and Memory. Academic Press, New slaves, servants, and working-class people, care has
York been beneath the concern of most social theorists and
LeDoux J E 1995 Emotion: Clues from the brain. Annual Review scientists. Changes in the roles of family and public
of Psychology 46: 209–35 institutions for care, as well as feminist scholars’
McCabe P M, McEchron M D, Green E J, Schneiderman N impetus to study the concerns of women’s lives, have
1993 Electrolytic and ibotenic acid lesions of the medial led to the emergence of a robust field of study that
geniculate prevent the acquisition of classically conditioned explores the physical, social, political, economic, and
heart rate to a single acoustic stimulus in rabbits. Brain philosophical implications of care.
Research 619: 291–8
McCabe P M, McEchron M D, Green E J, Schneiderman N
1995 Destruction of neurons in the VPM thalamus prevents
rabbit heart rate conditioning. Physiology and Behavior 57: 1. The Growing Importance of Care for Public
159–63 Concern
McEchron M D, Green E J, Winters R, Nolen T G,
Schneiderman N, McCabe P M 1996a Changes of synaptic The long-standing devaluation of care grows out of
efficacy in the medial geniculate nucleus as a result of auditory the devaluation of its dual central aspects. On the one
classical conditioning. Journal of Neuroscience 16: 1273–83 hand, care is about the actual physical work required
McEchron M D, McCabe P M, Green E J, Hitchcock J M, to maintain and repair objects and people. On the
Schneiderman N 1996b Immunohistochemical expression of other hand, care also denotes mental states of en-
the c-Fos protein in the spinal trigeminal nucleus following gagement that make central the concerns of the person,
presentation of a corneal airpuff stimulus. Brain Research 710: object, idea, etc. toward which the care is directed.
112–20
Thus, in the English language care is associated with
McEchron M D, McCabe P M, Green E J, Llabre M M,
Schneiderman N 1995 Simultaneous single unit recording in
both burdens and woes. Even in current discussions of
the medial nucleus of the medial geniculate nucleus and care some scholars emphasize the physical concerns of
amygdaloid central nucleus throughout habituation, acqui- care, others see it as a psychological or philosophical
sition, and extinction of the rabbit’s classically conditioned category, while others try to combine these two
heart rate. Brain Research 682: 157–66 elements.
Powell D A 1994 Rapid associative learning: Conditioned The historical exclusion of care from public life and
bradycardia and its central nervous system substrates. Inte- from serious scientific consideration also reflects the
grative Physiological and Behavioral Science 29: 109–33 duality of care’s nature (Tronto 1996). Aristotle began


his Politics by distinguishing between the worlds of 2.1 The Sociological Approach
politics and household and relegating all the banal and
Among the first issues that feminist scholars and
daily concerns of the household as outside of the
activists noted at the beginning of the second wave of
concerns of citizens. To such thinkers, care was
feminism was that, even in industrial societies, a deep
beneath the dignity of citizens. Others have viewed
division of labor persisted that left to women most of
care as too lofty for any serious association with public
the tasks of caring. Marxist scholars frequently dis-
life. Realist thinkers from Augustine to Max Weber
tinguished this realm of reproduction from produc-
have been suspicious of attitudes of care because it was
tion. Although conservatives argued that childcare and
associated with ideals of Christian charity and agape,
tending to the household were naturally the realm of
and have relegated such concerns to realms beyond
women’s work, feminists began to challenge both the
public concern, either to spiritual communities or to
gendered division of labor and the devaluation of
individuals’ consciences.
women’s work. Political platforms throughout the
A major cause for this greater attention from social
West provided analyses of these discrepancies and
scientists is that care itself has changed during the
called for policy solutions such as available day care
industrialization since the 1850s (see Family and
and wages for housework. Scholars soon began to
Gender; Household Production). Households used to
explore such ‘labors of love’ (Finch and Groves 1983).
be almost entirely self-contained. They produced
A large sociological literature has now emerged that
foodstuffs, domestic products such as soap, energy,
explores the boundaries and nature of personal care
etc. for themselves. Birth, death, and illness were
within and outside of traditional family contexts and
treated in the household with minimal assistance from
institutionalized care in society.
others. Households consisted of people who were in
various stages of their life. With the rise of indus-
trialization, households have grown smaller, most
household goods have become commodified, and 2.2 The Psychological Approach
professionals have begun to assist in processes of Gilligan’s path breaking work, In a Different Voice
birth, death, and illness. Thus, activities that used to (1982), is often taken as the starting point for the
be conceived of as essentially private and personal discussion of care. Gilligan’s work challenged the
became social, public, and political concerns. The basic assumption of Lawrence Kohlberg’s influential
development of public institutions since the eighteenth account of cognitive moral development by arguing
century to aid in caring for the ill and infirm (Foucault that not all moral development could be measured by
1965) has accelerated with the development of welfare a single path. Gilligan posited that, in addition to
state bureaucracies (Bussemaker and van Kersbergen Kohlberg’s account of the development of justice
1994). As care has become a more central aspect of reasoning, psychologists needed to observe as well the
social and political life, these questions have become development of an orientation to care. To Gilligan,
more important for scholars: who cares for whom, three qualities describe this ethic and distinguish it
how is care organized and paid for in different from the justice approach: (a) it revolves around the
societies, and what are the practical and normative moral concepts of responsibility and relationships
questions entailed in practices of care? rather than rights and rules; (b) it is tied to concrete
circumstances rather than to abstract rules of morality;
(c) it is an activity rather than a set of principles.
2. Conceptualizing Care Gilligan insisted that these two orientations need not
be associated with gender, but her admonition was
Scholars use different formulations of care depending widely disregarded. Despite the absence of empirical
upon the different conceptual purposes to which evidence to support the claim that these two orient-
they employ the concept. Among these uses are: care ations distinguish between men and women, a broader
as a sociological category; i.e., an account of how public audience seized the idea that there was a gender
societies organize care work; care as a psychological difference in morality that distinguished men and
orientation, i.e., a framework for understanding women; men were more interested in impartial and
psychological differences between men and women; principled conceptions of justice, and women were
care as an ethical category, i.e., a basis for making more interested in maintaining particular relationships
ethical judgments in care settings; care as a political of care. Arguments for care thus became associated
and philosophical perspective, i.e., a paradigm in with a political strategy to valorize the lives and
which to understand an alternative account of human experiences of women.
nature and values. The empirical work that ties care to
gender is strongest in the psychological and socio-
logical approaches, but within all of these accounts of
2.3 The Ethical Approach
care, the relationship of care to gendered activities and
to the relative status of men and women in society Noddings (1984) transformed the assumption of
remains central. women’s more caring nature into a moral position.


Arguing that care was always a dyadic relationship, logists have studied the organization of institutions to
Noddings posited that there was a limit to the kinds of care for children, the elderly, the disabled, the infirm.
concerns that caring could address; they were limited Scholars in communication have begun to use a care
to the kinds of relationships and life activities that perspective to explain how care can alternatively frame
occurred in intimate settings. listening and communicating with others. Legal
Ruddick’s (1990) important work on ‘maternal scholars have used an ethic of care to explain alterna-
thinking’ reinforced the notion that caring had its own tive ways to understand the legal process. Economists
set of moral norms and practices that might be have begun to explore how economists’ theories and
distorted if applied in a broader context (see Mother- policy proposals have been distorted by the failure to
hood: Social and Cultural Aspects). It is perhaps not recognize the fact that women bear most of the costs of
surprising that proponents of this account of care rearing children, while all share in the public benefits
were often exploring care relationships within social of well-reared workers and citizens (Folbre 1994).
institutions and professions where women predomi- More generally, political and social theorists have
nated, such as mothering (Ruddick 1990), nursing noted that to take the care perspective seriously
(Benner and Wrubel 1989), and in education (Nodd- challenges some of the basic assumptions that seem to
ings 1984). permeate Western value systems about human life.
The care perspective stresses human vulnerability and
dependence. It argues that all individuals are de-
2.4 The Political and Philosophical Perspective pendent upon others and that an understanding of
Critics of the psychological and narrow ethical posi- individuals as interdependent rather than as auton-
tions soon emerged, arguing that care was necessarily omous, is actually a more realistic portrayal of their
neither gendered nor limited to a narrow sphere of life existence. Since most contemporary political philo-
(Sevenhuijsen 1998, Tronto 1993). These critics began sophies rest upon an assumption of the individual as a
to define care much more broadly and to think about rational, autonomous individual, the care perspective
it as an essential human activity, whose particular thus raises a fundamental challenge to them.
contours would be shaped by broader political and Another set of challenges concerns how we organize
social decisions made in a society. Writing about the social and political life by dividing the public sphere
development of institutions for caring in the and the private sphere. Feminist scholars have long
Scandinavian welfare states, Waerness (1984) argued argued that the division of life into public and private
that a distinctive ‘rationality of care’ could be dis- spheres is one of the most important demarcations in
cerned in agencies and practices that were involved in reducing women, associated with the private, to
issues of care. This broader conception of care second-class status (Pateman 1988). In contemporary
combines the previous frameworks: it understands industrialized societies, care is not divided easily into
care to be both work and a framework of values about public and private: many previously private care
work, to be both concrete physical activity and concerns have become part of the purview of welfare
normative orientations and concerns about that state institutions, and many public activities are
activity. It combines the critical dimension of the carried out to accomplish care.
sociological approaches that recognize the devalued
nature of care with the psychological and ethical
approaches that seek to reconstruct the values of the 4. Applying Care Perspectives: The Work of
caring work and activities that are done. From this Care in Society
perspective, the persistence of the gendered nature of
care is a sign of a broader problem in devaluing certain Since scholars actually are thinking about a variety of
necessary human traits and activities, not only a concerns when they discuss care, they might use the
devaluation of that which is seen as feminine or term in a variety of ways. No standard definition of
residing in women’s sphere. Furthermore, scholars care has yet emerged, in part because the following
have begun to connect the devaluation of care with the problems seem to be essentially contested within the
devaluation of women’s lives and activities. The meaning of the term: (a) does care apply to care-work
relative powerlessness of care-givers is tied to the that is directed at another, or can it also be directed
relatively low status of the work that they perform toward the self ? To include the self adds dimensions of
(Tronto 1993). spirituality to the conception of care. It also makes it
more difficult to associate care with altruism. Yet
advocates of incorporating the care of the self with
3. Care as a Challenge to How We Organize other forms of care posit that introspections about the
Social and Political Life meaning of care are often important sources for
thinking about care, and that there is another place
Scholars have begun to apply these various care within the framework of talking about care to dis-
frameworks to explore a number of arenas of social tinguish between care for the self and care for others.
life. Concerns about care have informed how socio- (b) Does care primarily concern mental and emotional

Care and Gender

attitudes or does it primarily concern physical activities of care giving? Scholars who tend to emphasize the mental and emotional aspects of care often do so by ignoring the range of sites of care in society and emphasizing only the most familiar aspects of care, such as mothering. Scholars who focus on the activities of care, especially as carried out in social institutions often emphasize care as work to the exclusion of seeing other dimensions of care. (c) What are the best ways to describe different aspects of care? Fisher and Tronto (1990) provided a framework for dividing care into four phases: caring about, caring for, care giving, and care receiving. By thinking of care as an ideal of a complete process in which all of the phases are present, it is possible to provide a standard by which one can evaluate how effectively care is rendered in any particular setting. (d) How can one account for the power differentials in different forms of care? Other scholars have emphasized that it makes sense to classify care according to the relative power positions of those involved in the caring relationship. Thus, Waerness (1984) distinguishes between ‘personal service’ and ‘necessary care.’

Personal service is work that one could have performed by oneself but calls upon others to do. Necessary care is the care needed by people who cannot provide such care to themselves. Necessary care may require particular expertise (e.g., providing medical or therapeutic care) or it might be care that is not specialized (e.g., helping the frail elderly with transportation needs). (e) How are needs for care determined? While on some basic level the needs for care are obvious, in any given society there can be a variety of ways to meet caring needs and to conceptualize and prioritize caring needs. For example, though elderly people may need assistance in moving about, different societies will determine that elderly parents should reside with their grown children, that public services should provide assistants to help the elderly in their homes, or that elderly people should live in specially designed communities or institutional settings designed for them. Indeed, any particular society may decide to allow people to choose among many different options in deciding how best to meet the needs of care. A society may decide to leave such decisions to individuals, to individual families, to the market, or to governmental agencies.

5. Applying Care: Philosophical Concerns About Care in Society

A number of philosophical issues has arisen as scholars have begun to think about care and gender.

5.1 The Care–Justice Debate

One central way to frame the philosophical dispute about the value of care has been to see it as a question of care versus justice. In this debate, the framework delineated by Gilligan’s (1982) psychological theory is taken as definitive of a difference in orientations toward the moral world. Thus, the question becomes, are care and justice perspectives compatible?

Throughout the 1980s, scholars emphasized the incompatibility of the approaches, often to the detriment of taking the care approach seriously. Kohlberg himself described care as a secondary kind of moral orientation that came into play to resolve local moral issues once the larger issues of justice had been settled. Some feminists tried to argue that the model of care was preferable to one of justice (Held 1993) while others insisted that the qualities of localness inevitably made care a flawed conception for understanding ethical issues (Jaggar 1995). Increasingly, feminist theorists resolve the care–justice debate by asserting that the two approaches were compatible and could complement each other, for example, by insisting upon the rights of care workers (Bubeck 1995, Kittay 1999) or by discussing caring in terms of rights (Tronto 1993).

5.2 Problems With Care Theories

Nevertheless, advocates of care theory recognize several serious problems in using care as a moral approach. One problem is the problem of parochialism. If people care most for those who are closest to them, how can they make judgments about the (perhaps more serious) needs for care of those who are more distant from them? Many care theorists acknowledge that such problems of parochialism, of being too partial to the needs of those to whom we are closest, require that care be supplemented with more abstract principles of justice. Care theorists might demand, for example, that justice requires that everyone be provided with minimal levels of care. At its most profound level, the problem of parochialism raises the question of what are the appropriate limits and boundaries of care. Does the ethical imperative to care stop at the family, local community, group, or nation, or is it global (Robinson 1999)?

A second serious problem is the problem of paternalism. How can approaches of care avoid the problem that care-givers often have a degree of power over the people for whom they care? The problems of abuse in care relationships are serious concerns. Even when abuse is not present, however, there is often conflict among the parties in a situation of care about how to best provide the care. Since the care-giver, especially in cases of necessary care, is usually in a position of providing the care receiver with something that the care receiver needs, givers have an upper hand in defining the situation and in determining the nature and extent of care. A third serious problem is in trying to determine a standard for good care. If care must be understood as relative to different circumstances, then on what basis can one ever judge what constitutes


adequate or good care? Again, since care receivers are often in positions of receiving care, they are often in a weak position to argue for their own views of what caring should be provided and how. Even if care-givers are relatively devalued in society, they may still be in positions of power vis-à-vis their charges and possibilities for abuse or mistreatment are serious. A fourth serious problem concerns how we define equality in the light of the need for care for dependent individuals in democratic societies. If some individuals are highly dependent upon care, then in what sense can they count as the equals of other democratic citizens? Indeed, Kittay (1999) has noted that the care-givers of highly dependent people also are excluded from some aspects of social and political life by their need to care for their charges. For example, single mothers of disabled children provide exhausting hours of care which renders them unable to earn money for the household. Care workers are especially vulnerable to exploitation (Bubeck 1995).

5.3 Care and Citizenship

At another level, scholars have begun to question whether the very definition of what constitutes a citizen has to be conceived according to changes in caring practices. A model of citizenship that makes many benefits contingent on participation in paid employment disadvantages women (and those few men) who (must) leave the paid work force to fulfill their continuing care obligations to children, elders, or disabled relatives (Knijn and Kremer 1997).

6. Areas for Future Research

Scholars throughout the industrialized world have begun to question how care fits into the framework of modern industrialized countries. As the model of the family as headed by a breadwinner and homemaker continues to decline, as more caring functions leave the household and become marketed commodities or services (Ungerson 1997) or provided by social institutions, care practices will continue to evolve. But the meaning and significance of care will also continue to evolve as individuals perceive that the relationships between care-givers and care receivers continue to be intensely personal and intimate.

One important part of care is the interpersonal relationship that it represents. To what extent can advanced societies continue to provide personal care? The professionalization of care in the last century (the growth of specialties such as doctoring, nursing, teaching, etc.) may in fact be matched by a process of deskilling of care in the next century as tasks are further differentiated and separated from one another. Yet as important as the interpersonal dimensions of care remain, we must keep in mind how easily such interpersonal relationships of care can mask the vast inequalities that exist when we ask, who cares for whom? People of relative privilege in the industrialized world are able to command vast amounts of necessary care and personal service. On the other hand, those who are the poorest members of the global society not only receive less essential care, they are often hired at subsistence wages to work and provide care for others. The globalizing economy has also commodified and globalized the traffic in care workers.

Even as theorists of care have warned against the parochialisms that often creep into care analyses, they have only begun to confront the question of whether there is a moral imperative to organize care on a global level. To what extent does one nation owe care to the citizens who live within another nation? To what extent can such concerns be extended before they are stretched too far to bear any resemblance to the kinds of concerns that we usually associate with ‘caring’? How much provision of care is enough? Tragically, there will always be more needs for care than there is care available in human societies. How then are difficult decisions about the allocation of care best made? In order for care to take place, there must first be a perception of the need for care. Yet as Fraser (1989) observed, needs are determined through a highly political process of needs interpretation. How can such a process occur fairly?

What constitutes good care will be variable in any society. Different people have different notions of what constitutes the best care for them. The challenge of care will be to offer options for care so that everyone can be well cared for in a manner that they find acceptable. Finally, the feminist impulse to revalue care grew out of a desire to end gender-based disparities in the provision of care. The greatest challenge to theorists of care is to see that, not only do gender-based forms of caring change to make the burdens and blessings of care equally distributed among men and women, but to do so in a way that does not reinscribe care as unequal burdens for the relatively disadvantaged.

See also: Caregiver Burden; Family and Gender; Fatherhood; Household Production; Masculinities and Femininities; Motherhood: Economic Aspects; Motherhood: Social and Cultural Aspects; Social Welfare Policies and Gender; Time-use and Gender

Bibliography

Benner P, Wrubel J 1989 The Primacy of Caring: Stress and Coping in Health and Illness. Addison-Wesley, Menlo Park, CA
Bubeck D 1995 Care, Justice and Gender. Oxford University Press, Oxford, UK
Bussemaker J, van Kersbergen K 1994 Gender and welfare


states: Some theoretical reflections. In: Sainsbury D (ed.) Gendering Welfare States. Sage, London
Finch J, Groves D 1983 A Labour of Love: Women, Work and Caring. Routledge and Kegan Paul, London
Fisher B, Tronto J C 1990 Towards a feminist theory of care. In: Abel E, Nelson M (eds.) Circles of Care. State University of New York Press, Albany, NY
Folbre N 1994 Who Pays for the Kids? Gender and the Structures of Constraint. Routledge, New York
Foucault M 1965 Madness and Civilization. Pantheon Books, New York
Fraser N 1989 Unruly Practices: Power, Discourse, and Gender in Contemporary Social Theory. University of Minnesota Press, Minneapolis, MN
Gilligan C 1982 In a Different Voice. Harvard University Press, Cambridge, MA
Held V 1993 Feminist Morality: Transforming Culture, Society and Politics. University of Chicago Press, Chicago
Jaggar A 1995 Caring as a feminist practice of moral reason. In: Held V (ed.) Justice and Care: Essential Readings in Feminist Ethics. Westview Press, Boulder, CO
Kittay E 1999 Love’s Labor. Routledge, New York
Knijn T, Kremer M 1997 Gender and the caring dimension of welfare states toward inclusive citizenship. Social Politics 4: 328–61
Noddings N 1984 Caring: A Feminine Approach to Ethics and Moral Education. University of California Press, Berkeley, CA
Pateman C 1988 The Sexual Contract. Stanford University Press, Stanford, CA
Robinson F 1999 Globalizing Care. Westview, Boulder, CO
Ruddick S 1990 Maternal Thinking: Towards a Politics of Peace. Beacon Press, Boston
Sevenhuijsen S L 1998 Citizenship and the Ethics of Care. Routledge, London
Tronto J C 1993 Moral Boundaries. Routledge, New York
Tronto J C 1996 The Political concept of care. In: Hirschmann N, Di Stefano C (eds.) Revisioning the Political: Feminist Reconstructions of Traditional Concepts in Western Political Theory. Westview Press, Boulder, CO
Ungerson C 1997 Social politics and the commodification of care. Social Politics 4: 362–81
Waerness K 1984 The rationality of caring. Economic and Industrial Democracy 5: 185–211

J. Tronto

Career Development, Psychology of

1. The Traditional Career Perspective

A career can be defined as a pattern of work experiences comprising the entire life span of a person and which is generally seen with regard to a number of phases or stages reflecting the transition from one stage of life to the next (= sequences of work roles). Therefore ‘career’ may be conceptualized with regard to the development of skills and expertise, the ability to learn, the developmental identity, and the self-concept as ‘career anchor.’

The models of career development and the study of careers are always linked to the existing environmental conditions (cf. Whyte 1986 vs. Hall 1986), to a ‘relationship concept’ (= psychological contract) and to prevailing points of view (e.g., intra- or interorganizational) of the different sociological disciplines and the different fields of psychology (e.g., occupational, organizational, personnel, and managerial psychology). The orthodox perspective consists of the concentration on the big organization, associated with the ‘bounded career’ which develops within a single organization.

However, it needs to be pointed out that the situation in the field of career development research and in developing career theories must be assessed as extremely unsatisfactory, not least because it is a recent, interdisciplinary field within the field of social sciences. Several academic disciplines and their ‘subfields’ are working in it, a circumstance which has led to a ‘fractionation’ of the field which has therefore not been able to produce integrative theories of career development with the disciplines ignoring the results of the others (psychology, sociology, management science, anthropology, etc.).

The biggest part by far of empirical career research within the field of applied psychology (as also of the interdisciplinary studies) is dominated by the perspective of the ‘intraorganizational career.’ Here two issues play a central role: ‘careers in organizations’ and the ‘matching’ between the needs of the employee on the one hand and those of the organization on the other (cf. Holland 1997). This ‘intrafirm perspective’ has supported the so-called ‘bounded career assumptions’ for decades (as opposed to the ‘boundary-less assumptions’) having stressed for a major part the ‘single organizational settings’ (= research into career development within a single, stable organization) until today. Unfortunately, it can be observed in career research that this orthodox approach has been kept for a major part, in spite of significant changes in the organizations.

This traditional perspective of work and career, as it is also reflected by most of the classical ‘models of career development’ (cf. Brown and Brooks 1990, Osipow 1983, Super 1957), obviously comprises principles such as stability in the working environment (e.g., continued existence of the organization and the work role), movement by hierarchical promotion and interorganizational mobility, and a constant availability of positions and work roles in accordance with the interests, talents, and the lifestyle preferred by the person. It regards the organizations as a ‘benevolent’ unity playing an active role in designing the career of the individual employee or the executive, respectively. This traditional career perspective was accompanied by two institutions defining the background for the ‘locus of career responsibility.’ (a) The ‘psychological contract,’ the mutual relationships and expectations between the employee and the organization (=


unwritten contract of relationships): what an employee thinks he/she owes to the organization and what he/she thinks he/she can expect the organization to owe him for this. A psychological contract thus produces a long-term security (for both sides) and a high level of commitment and loyalty to the organization on the part of the employee. To the employee this had the advantage of having a valued job and a position which enabled him to climb up a hierarchical career ladder in scheduled steps. To the organization this model guaranteed a constant supply of talented employees with a high level of commitment, striving to take up new and enlarging roles, possibly even involving putting personal interests last. ‘Downsizing,’ ‘head-count reductions,’ and ‘large-scale layoffs’ have rendered this contract invalid. Employees can no longer count on a ‘link’ to their company going beyond the contractually agreed salary.

The second institution consists of the (b) assumption of big organizations which held a great fascination especially for psychological career research of the 1970s and 1980s and which was regarded as the basis for lifelong employment and as a model for a career context encompassing the entire working life (Ouchi 1981, Pascale and Athos 1981). Thus the ‘locus of career responsibility’ was determined as well. The organization was required to create stable social working conditions, and also to create an environment which offered ideal conditions to both the employees and the organization itself and which was also taking care of the personal development and the welfare of its employees. The focal point in career research was placed, even until the beginning of the 1990s, mainly on intraorganizational issues and was limited to executive, management careers, or professional and hierarchical careers, respectively.

2. The No-boundary Organization

In the meantime, this traditional perspective of career development in organizations has dramatically changed, most of all with regard to the implicit work of the psychological ‘relationships contract’ involving a high level of commitment in the long term. A significant number of factors relating to changes in the design and the structure of organizations influence the nature of careers within organizations. Among other things, these include the changed work relationships and work contracts, the trend towards leaner and flatter organizations, and the increasing distribution of work to teams. The ‘relational contract’ became a ‘transactional contract’ designed for shorter terms, including a performance-based payment, a lower level of mutual commitment which can relatively easily be terminated by either side (MacNeill 1985). To the extent that organizations grow flatter, also the number of people required for leadership functions in the middle and upper management level will be reduced.

However, most of all this significantly limits the traditional possibility for career development by ‘climbing up the ladder.’ New environmental conditions or conditions of the surrounding field with their high degree of uncertainty are forcing organizations to abandon the practice of guiding their employees and executives over long-standing career paths and through prescribed development sequences. Within the ‘boundary-less’ organization this responsibility is shifted towards the person, who is thus responsible for his or her own career development, via different employment relationships in different organizations (= boundary-less career). It is the opposite of an interorganizational career (= bounded career), with several meanings to this term: (a) this career surpasses the ‘bounds’ of different organizations; (b) there are no (classically) hierarchical principles in career development; (c) the marketability and network relations are predominant. Hall (1976) has coined the term ‘protean career’ for this new scenario. By this we understand a proactive process which is controlled by the person and not by the organization. It consists of all the different experiences of a person in the fields of training and the work experience in different organizations, the change between different groups of training occupations or job roles, etc. to different employment relationships in different organizations.

Besides, it can also be observed that the growing trend towards teams leads towards assigning management functions to these teams (e.g., in the middle management level) as their members are increasingly acquiring the ability of self-management. So-called ‘high involvement work teams’ coordinate, schedule, and distribute the work without depending on firmly established ‘supervisor positions’ (see Teamwork and Team Training). A stronger orientation towards teams in organizations should result in a less clear definition of individual tasks or work roles which are not inseparably linked to certain persons. The team members are expected to fulfill different tasks at different moments and, if required, to share their expertise with others. The ideal member of a team should have the highest possible number of different abilities and should be able to work without direct leadership. Therefore in such a team there is not much room for people who only want to give instructions or for those who are only able to carry out one single task or who want to specialize on certain tasks. This circumstance, too, will result in a more difficult career planning in the sense of a linear career path. Finally, the changes in work relations and in the ‘psychological contract’ have reduced the stability of the traditional career, as well as the attitudes, the loyalty, and the expectations of the employee or the executive. The employee can no longer count on a long-term commitment on the part of the employing organization, and neither can he or she necessarily count on the fact that his or her abilities will be assessed and valued on the job market.


Changes in career development require us to look at the changes in the organizational, professional, and industrial or economic context at the same time. The evolution of the forms of organizations has always been the impulse for the content and the way of professional careers. That is why career theories always reflect the current assumptions on organizations (Arthur et al. 1989). Organizational structures dictate the necessary core competencies and different organizational practices require a different ‘mix’ of competencies. Therefore changed ‘boundary-less’ organizations will employ employees with changed values (e.g., a limited commitment) and with a ‘boundary-less’ career oriented more towards the self. ‘Boundary-less’ organizations change existing career patterns. Here some ‘critical dimensions’ can be identified, such as the ‘size’ and the ‘degree of centralization’ determining the evolution of organizations and having various effects on the developing career concept and career systems.

3. The Reorganization of Organizations

Large-scale companies and organizations have rapidly lost to an increasing extent (at least since the end of the 1980s) their dominance, the importance of their expansion, and their fascination. Instead of vertical coordination of dominant companies, other models have been found which are more efficient and competitive at being able to cope with the fast, complex, global developments of the information and service society more effectively. Big organizations of the future will consist of multiple ‘divisions’ and ‘joint ventures,’ ‘regional alliances,’ private and public partnerships. In addition to this, entrepreneurs, franchisers, and smaller businessmen will supply supporting services, technologies and materials, and distribute goods. These ‘boundary-less organizations’ continuously change their dynamic forms and structures by expanding and reorganizing, acquiring new parts or selling old ones and even entering into partnerships with other organizations or companies for certain periods of time. During this process the old ideas of vertical coordination will increasingly vanish, just because horizontal coordination models as an alternative are better able to adapt to the specific and constantly changing interests of every company.

This revolution in the reorganization of organizations which was triggered by the global competition has given a new definition to both the work relationships and the forms of career development and newly challenges the classical career theory to paths not yet taken. In order to cope with a highly turbulent external environment which can be predicted only to a limited degree and in the short run, organizations have tried to adapt to the changed circumstances by ‘reorganizing,’ ‘downsizing,’ ‘delayering,’ ‘flattening the pyramid,’ ‘teaming,’ and ‘outsourcing.’

However, the efforts of the organizations concerning the work relationships between employees and executives on the one hand and the companies on the other have also caused a high degree of destabilization. Additionally, a higher degree of diversification among both employees and executives can be observed (the percentage of women is constantly increasing and, most of all among executives, an increasing internationalization can be recognized). But also the attitudes and the behavior of the individual employee and executive towards the work situation have significantly changed, most of all in the fields of commitment, loyalty, and work and family.

4. New Career Competencies

Instead of ‘bemoaning’ these changes and wishing back work relationships and career developments from the past, it is necessary to point out chances for future development and to bring about clarity. This can best be done by newly arranging work roles, setting competencies required for the future, clarifying different career concepts and their conceptual meaning (e.g., whether it is firm-centered or person-centered), and by newly defining career success as well as career responsibility.

Usual work roles are often described as artifacts of the industrial age, as ‘vehicles’ which were used in order to split up work in ‘packages.’ By continuous reorganization, downsizing, teaming, etc., many of the old work roles or ‘jobs’ have been significantly changed, or totally deleted, by ‘job enrichment’ or ‘job enlargement’ measures. The changes in the working world also require new career competencies and strategies: a ‘portfolio of skills’ which can be translated from one work situation to others, with the focus on ‘interorganizational employability’; lifelong learning in order to maintain the skills ‘relevantly’; experience in ‘team and project work’ (so-called ‘collaborative abilities’); building reputation over one’s career; and, not least, the career should be consistent with the self-identity. In order to be successful in a turbulent working world, in the future more than mere ‘job skills’ will be required. New ‘career competencies’ will be necessary, so-called ‘metaskills.’ They are required in order to acquire new skills. Here are some examples: adaptability, tolerance of ambiguity, and uncertainty (Handy 1990).

Changes in the required core competencies can best be shown in the field of leadership and management, at the point of transition from the ‘traditional organization’ to the current ‘network organization.’ Careers in the traditional organization often used to comprise only a single technical expertise (e.g., production or sales), but only in a few cases the understanding of several business functions. The more complicated the organizations became, the more competencies, in different compositions, were required (e.g., technical,


commercial, and administrative competencies). In direction, the working life will lead. The ‘external’
today’s network organizations, in which independent career relates to the formal phases and roles as they are
companies are linked to each other, the objective is to defined by the organization’s practices and the
provide critical expertise required for a specific project concepts of society regarding what an individual can
or product. Network organizations gain their par- expect within the professional organization. Closely
ticular influence by relying on their internal and related to the term of the ‘internal’ career is the
external partners who, on their part, contribute to the concept of the ‘career anchors’ which can be equated
value added chain. By doing this products and projects with the self-concept of a person. This develops during
(from R&D to sale) can be tackled which could not the course of gaining increasing experience of life and
be handled by a single organization on its own but work. It consists of talents and abilities which a person
which can only be handled with the help of network recognizes for himself or herself, of a structure of
partners. values, and of job-relevant motives and needs. The
In order to do this, executives need ‘collaborative’ self-concept constructed in this way has a stabilizing
competencies. These consist of three components: (a) influence on the actions and decisions of a person and
the ability to analyze a problem and develop solutions becomes part of the personality—with these career
with the help of partners and networks; (b) ‘partner anchors not being cemented forever but able to change
abilities,’ i.e., to develop concepts, to negotiate, and to (among other things by a change of values in the
carry them out to mutual benefit; (c) relationship society). We differentiate between several categories of
management, i.e., to identify the needs of clients and such career anchors which, according to the occu-
partners and to meet them precisely. These career pational group, can be part of the individual self-
competencies, depending on the organizational struc- concept in order to express the respectively existing
tures, can finally be continued to the currently central motives, values, and needs (cf. also the works
developing ‘cellular organization’ which no longer by Hogan et al. 2000). The essential categories of
includes any kind of management hierarchy. The ‘cells’ career anchors are: (a) autonomy/independence; (b)
of these organizational forms consist of, among other security/stability; (c) technical–functional compet-
elements, ‘self-managing teams’ with their own man- ency; (d) general management competency; (e) entre-
agement responsibilities which would be able to preneurial creativity; (f) lifestyle; (g) service in the
survive autonomously. However, they share their sense of a service to a higher task; (h) challenge. Unlike
knowledge and their information with other ‘cells’ in the construction of ‘career resilience,’ the concept of
order to be stronger and more competent. This the ‘career anchor’ implies that the definitions by
organizational form does not work as ‘employer’ but it different persons in organizations of their career
is a helpful mechanism in order to promote the regarding developing talents, motives, and values vary
knowledge-intensive abilities of their members in their significantly.
application and enlargement. Careers are currently changing in several ways.
Today we cannot, or much less than in the past,
assume that people will spend all their life in one job or
5. New Career Concepts in one special field. That is why we have to find
answers to, among other things, the following
So this leads us back to career responsibility as in the questions: how can we find a transition from technical
cellular organization it is completely placed with its to management roles? In order to achieve an effective
members (l the locus of responsibility). The old performance lifelong learning is required and the
theories for career management regard career de- acquisition of ever new abilities. Neither can we
velopment in the sense of narrow, sequenced, and assume that people will always stay employed by one
methodical activities. Today’s working world, on the company. These changes create an environment in
other hand, is characterized by continuous changes which employees and executives regularly have to
(for both sides), by uncertainty, turbulence, limited evaluate their potentials and career plans again and
obligations, and a lack of borders. Therefore the again in order to keep themselves ‘marketable’—in
traditional ‘organizational career,’ in which the connection with the opportunity to appear profession-
employees were expected to put their personal interests ally ‘attractive’ to a company. Careers will develop
last, was replaced by a so-called ‘self-centered career.’ laterally and diagonally. However, loyalty and com-
So the career model now changes into an individual mitment from the part of the organizations will
process of self-responsibility. The individual explores decrease which will shift the center of responsibility
his or her career opportunities, sets career goals, and control of the career development (as mentioned
develops strategies, and searches for a relevant before) towards the employee. The working individual
definition from time to time (Greenhaus et al. 1995). is to develop a higher degree of ‘resilience’ during this
In order to do this, Schein (1990) proposes a transitional stage of shifting responsibility and control
differentiation of the term ‘career’ into an ‘internal’ in order to become ‘career self-resilient’ in the working
and an ‘external’ career—with the internal career world. This construction of ‘career resilience’ meas-
relating to the subjective sense for where, i.e., in which ures the ability of an individual to change his or her

1474
Career Deelopment, Psychology of

career, to cope with uncertainty and with disappointments in organizational processes. This concept can be applied to all individuals and all careers.

6. Challenges for the Field of Applied Psychology

In these enormous processes of change in the field of career development the above-mentioned fields of applied psychology are given a number of essential tasks. However, here it is indispensable that the individual ‘subfields’ do not neglect the results of the others (nor those of other social sciences), as has been done up to now. These tasks include, for one thing, the development of assessment methods and development tools in order to identify individual abilities, potentials, and talents, to determine the state of development and to provide creative approaches for ‘skill-building.’ This task gets all the more difficult if at the moment it is not yet known what the new requirements and roles will look like in 5–7 years. The same applies to development plans, job descriptions, and incentive systems. However, they are also included in the field of ‘multi-rater feedback’ tools such as the 360° feedback. A special importance is attached to ‘cross-training’ in order to create opportunities to acquire new abilities and skills or to enlarge existing ones. The activities in the field of ‘human resources planning’ must be significantly enlarged. One focal point here should be put on career planning in order to develop career paths and ‘mentoring programs’ (also over different organizational levels).
The development of training courses for ‘boundary-less career management’ or the problem of ‘career adaptability’ must not be forgotten. The latter must be intensively studied by psychologists. For example, is ‘career adaptability’ a function of the personality? Also ‘coping strategies’ must be developed. Mentors must be enabled to help employees and executives to always steer their career safely in an ever-changing working world. Psychologists are able to help working people in their career development by replacing their (conservative) expectations of a continuous mobility ‘upwards’ by a rather cyclical and lateral career development. In ‘boundary-less organizations’ and ‘boundary careers,’ major role conflicts caused by complicated working conditions must be expected, and a substitute must be found for identifying with the organization.
With all of these psychological career activities the reliable determination of a specific need for development (especially the future need in growth areas), advisory activities, and the conduct of evaluation programs is of major importance. Here a ‘dual focus,’ i.e., individual and organization, is an essential requirement for successful work on the part of the psychologist. Career development programs offer their services to the employee for his or her career planning and give incentives for learning. The organization benefits from this program by developing in-house talents, by supporting the planning for succession, and by countering discontent and decreasing commitment caused by uncertainty and instability of new work relations.

7. Summary and Implications for Future Research

Global competition has triggered a revolution in the reorganization of the organization. In contrast to the traditional form of the ‘bounded’ organization with its centralized decision making processes, vertical models of coordination, hierarchies, and career ladders, the present organization emphasizes a boundary-less concept, a vertically coordinated approach, and boundary-less career principles. New forms of organizations have dynamic shapes and structures. They rearrange themselves to adapt to changes in the environment. It was outlined how the evolution of the forms of organizations has been the impulse for the content and way of professional careers. In this scenario several trends were described such as the shift from bounded to boundary-less concepts; stability to instability of employment; long-term to short-term career goals; generic to firm-specific competencies; firm-centered to person-centered career approaches; relational to transactional employer–employee relationships; and the shift of career responsibility from the employer to the employee. In order to describe the evolving career management focus, a number of different concepts and terms created by researchers over the past several years have been presented: career anchors; psychological contract; protean career; internal vs. external career; self-centered career; destabilization of relationships between people and organizations; interorganizational concepts; networking; career self-resilience; interorganizational employability; high involvement work teams; collaborative competencies.
All of this raises some important questions for future research. For instance, to what extent will boundary-less organizations alter career theories and actual career mobility patterns? To what extent is career adaptability a function of personality (disposition toward proactive behavior) or age vs. a skill that can be developed? Can people learn to be adaptable? There is need for more research on what personal and developmental growth training best prepares people to engage in new career cycles and disengage from old ones. How can organizations help employees to regularly assess their skills, interests, and values so that they can figure out what kind of work experience to seek? As information technology makes new organizational forms possible, and as social values shift priorities, what should a given job consist of and how should one hire and train people for the ambiguous and changing roles? It is necessary to consider whether a career goal is helpful or an unnecessary
restriction. Research is needed on new standards of career success, on new forms of work identity, and on role overload (the work vs. family conflict), and new substitutes for organizational identification have to be found. It is also worth studying what support systems are needed for employees under the transactional contract and how the development of a ‘spot market’ mentality can be avoided. What is the impact of choice in the different employment contracts (relational vs. transactional)? A major implication of these ideas for future research is that the study of careers must be better connected to a turbulent, complex, rapidly changing environment, and it has to become multidisciplinary.

Bibliography

Arthur M B, Hall D T, Lawrence B S (eds.) 1989 Handbook of Career Theory. Cambridge University Press, New York
Brown D, Brooks L (eds.) 1990 Career Choice and Development. Jossey-Bass, San Francisco
Greenhaus J H, Callanan G A, Kaplan E 1995 The role of goal setting in career management. The International Journal of Career Management 7(5): 3–12
Hall D T 1976 Careers in Organizations. Scott, Foresman, Glenview, IL
Hall D T 1996 The Career is Dead: Long Live the Career, 1st edn. Jossey-Bass, San Francisco
Hall D T (ed.) 1986 Career Developments in Organizations. Jossey-Bass, San Francisco
Handy C 1990 The Age of Unreasoning. Harvard University Press, Boston
Hogan J, Hogan R, Weinert A B 2000 The Values and Interests Inventory. University of the Federal Armed Forces, Hamburg, Germany
Holland J L 1997 Making Vocational Choices, 3rd edn. Psychological Assessment Resources, Odessa, FL
MacNeill I R 1985 Relational contracts: what we do and do not know. Wisconsin Law Review 3: 483–525
Osipow S H 1983 Theories of Career Development, 3rd edn. Prentice-Hall, Englewood Cliffs, NJ
Ouchi W G 1981 Theory Z. Addison-Wesley, Reading, MA
Pascale R T, Athos A G 1981 The Art of Japanese Management. Simon and Schuster, New York
Schein E H 1990 Career Anchors. University Associates, San Diego, CA
Super D E 1957 The Psychology of Careers, 1st edn. Harper and Row, New York
Whyte W F 1986 The Organization Man. Simon and Schuster, New York

A. B. Weinert

Caregiver Burden

1. Definition of Family Caregiving

The provision of assistance and support by one family member or friend to another is a pervasive aspect of everyday human interactions. Providing help to a family member with chronic illness or disability is not very different from the tasks and activities that characterize interactions among families and close friends without the presence of illness or disability. Thus, when a wife provides care to her husband with Alzheimer’s disease (AD) by preparing his meals, it may be an activity she would normally do for an unimpaired husband. However, if a wife also assists her cognitively impaired husband with bathing and dressing, few would question whether or not caregiving is taking place. The difference is that providing assistance with bathing and dressing or assisting with complex medical routines clearly represents ‘extraordinary’ care and exceeds the bounds of what is ‘normative’ or ‘usual.’ Similarly, parents caring for a child with a chronic illness may need to assist with daily medical routines (e.g., insulin injections or chest physical therapy) that are time consuming and difficult and are in addition to normal parenting responsibilities. Caregiving involves significant expenditure of time and energy often for months or years, requiring the performance of tasks that may be physically demanding and unpleasant, and frequently disrupting other family and social roles of the caregiver (see Gender Role Stress and Health).
Although caregivers may perform tasks similar to those carried out by paid health professionals, they perform these services for no compensation and do so either voluntarily or because they feel there are no other alternatives. Because the physical and mental health consequences of taking on this role are sometimes severe, and because caregivers represent an invaluable resource to the well-being of our population, research on caregiving has become a high priority among scholars in many disciplines as well as among policy makers.

2. Prevalence of Caregiving

Although the definition and boundaries of what is meant by the term caregiving often vary depending on the purpose for which such definitions are used, there is strong consensus that regardless of how caregiving is defined, its prevalence is high. A broadly inclusive approach might argue that a caregiver is needed for every person with health-related mobility and self-care limitations which make it difficult to take care of personal needs, such as dressing, bathing, and moving around the home. Current estimates indicate that 4 percent of the non-institutionalized US population under the age of 55 meet these criteria. Beyond the age of 55, the proportion of persons with mobility and/or self-care limitations increases dramatically; fully half of the population falls into this category after age 85 (US Bureau of the Census 1990). If we assume that these individuals minimally require one caregiver, these estimates yield over 15 million caregivers in the US. Indeed, these estimates are somewhat lower than results reported in a recent national survey of caregivers which reported that there were 22.4 million
households that met broad criteria for the presence of a caregiver in the past 12 months (National Alliance for Caregiving 1997).
Caregiving is not just a late life phenomenon involving the care of disabled older persons. It is estimated that 10–14 percent of children and adolescents (7.5 million) in the US have some type of chronic illness or disability; of these individuals, approximately 20–25 percent (1.5 million) have serious health conditions that impair daily functioning and thus require a caregiver. Additionally, 4.1 million individuals between the ages of 21 and 64 require personal assistance in ADLs/IADLs.

3. Who Provides Care and What Type of Care is Provided?

Caregivers to elderly individuals generally are differentiated by age and relationship to the care recipient. One large group consists of adult children, usually daughters or daughters-in-law, in their 50s and 60s; the second group of caregivers comprises spouses of care-recipients and is generally older and has a higher proportion of male caregivers than adult children caregivers (see Gender Role Stress and Health).
The roles and functions of family caregivers vary by type and stage of illness and include both direct and indirect activities. Direct activities can include provision of personal care assistance, such as helping with bathing, grooming, dressing, or toileting; healthcare assistance such as catheter care, giving injections, or monitoring medications; and checking and monitoring tasks, such as continuous or periodic supervision, and telephone monitoring. Indirect tasks include care management such as locating services, coordinating service use, monitoring services, or advocacy, and household tasks, such as cooking, cleaning, shopping, money management, and transportation of the family member to medical appointments or day care programs (Biegel and Schulz 1999). The intensity at which some or all of these caregiving activities are performed varies widely, with some caregivers having only limited types of involvement for a few hours per week while other caregivers might provide more than 40 hours a week of care and be on call 24 hours per day (see Social Support and Stress).

4. Conceptual Approaches to the Study of Caregiving

The dominant conceptual model for caregiving assumes that the onset and progression of chronic illness and physical disability is stressful for both patient and caregiver and, as such, can be studied within the framework of traditional stress-coping models. Indeed, some researchers have likened caregiving to being exposed to a severe, long-term, chronic stressor (Pearlin et al. 1995). Within this framework, objective stressors include measures of patient disability, cognitive impairment, and problem behaviors, as well as the type and intensity of caregiving provided. Key outcome variables for the caregiver include psychological distress and burden, often referred to as caregiver burden, psychological and physical morbidity, and patient outcomes such as institutionalization and death.
Although the literature consistently shows a moderate relationship between level of patient disability and psychological distress of the caregiver, there is considerable variability in caregiver outcomes which is thought to be mediated and/or moderated by a variety of factors including economic and social support resources available to the caregiver (Haley et al. 1996), a host of individual difference factors, such as gender, personality attributes (optimism, self-esteem, self-mastery), coping strategies used, and the quality of the relationship between caregiver and care recipient (see Quittner et al. 1990, Schulz et al. 1990, 1995). Researchers have further extended basic stress-coping models to include examination of secondary stressors, such as role conflict engendered by caregiving demands (Pearlin et al. 1997), and have applied many additional theoretical perspectives borrowed from social and clinical psychology, sociology, and the health and biological sciences to help understand specific aspects of the caregiving situation. Finally, researchers interested in the health consequences of caregiving have focused on a variety of physiological mechanisms including the pituitary-adrenal axis, the sympathetic nervous system, and the immune system in their effort to identify biological modulators (Haidt and Rodin 1995, Kiecolt-Glaser et al. 1991, Vitaliano et al. 1997) of the stress-health relationship. Researchers have also shown that the stresses associated with caregiving are a risk factor for the caregiver’s death (Schulz and Beach 1999). In sum, caregiving clearly provides a rich platform for the application of much of the theoretical and methodological expertise of researchers in many disciplines.
A wide range of caregiving effects have been described in the literature including disruption of family routines, psychological distress, and psychological and physical morbidity including mortality, financial hardship, and work-related problems (see Stress and Health Research). Feeling burdened or distressed by the demands of caregiving is the most frequently reported outcome associated with caregiving, although this is not a universal phenomenon, particularly among spousal caregivers (Zarit et al. 1986). Psychiatric morbidity such as depression and anxiety is also common. Physical health effects such as increased susceptibility to illness have been more difficult to demonstrate, although they are likely to occur in high demand situations among vulnerable (e.g., frail) caregivers. Possible mediators of illness effects are increased depression associated with caregiving and changes in health related behaviors such as
sleeping and eating patterns and medical compliance. Positive effects of caregiving such as increased self-esteem, the satisfaction of knowing that one’s relative is being properly cared for, as well as improved mental health have also been reported (Beach et al. 2000).
The demands and negative impacts of dementia caregiving are generally higher than those of nondementia caregiving (Schulz 2000). Indeed, a recent report by Ory et al. (1999) documents the ways in which dementia care is different from other types of family caregiving. Not only do dementia caregivers spend significantly more hours per week providing care than nondementia caregivers, they also report greater employment complications, caregiver strain, mental and physical health problems, reduced time for leisure and other family members, and family conflict. Factors that are likely to account for this greater level of strain include having to contend with behavioral problems of the care-recipient (e.g., wandering, aggressiveness), and the unpredictable nature and course of dementing illnesses. A number of health psychologists have focused on this type of caregiving as a platform for studying mind-body phenomena linking chronic stress to physical morbidity.

5. The Future of Caregiving

A number of important interrelated demographic, health, and social trends will shape the caregiving agenda in the future. First, there will be a worldwide increase in the number of older individuals, and possibly increased numbers of disabled individuals with longer life expectancies due to medical interventions. A key question will be the extent to which increases in life expectancy are associated with increasing years of disability. Some early evidence suggests that future cohorts of elderly individuals will be healthier and more functional than current cohorts, thus adding years to life may not necessarily increase the need for caregiving assistance. Alternatively, one can speculate that some types of medical interventions (e.g., drug therapies that enable AD patients to spend more years at home) will extend the family caregiving career. Second, future cohorts of elderly will have smaller families and fewer children available to provide care. The supply of caregivers may be further depleted by the increased and sustained labor-force participation of adult daughters, making them less available to provide care. Third, the trend toward shifting care from formal to informal care providers is likely to continue and accelerate. Caregivers increasingly will be asked to provide complex postacute care in addition to the chronic care they have provided traditionally. As these examples illustrate, caregiving issues are linked inextricably to broader demographic trends and the health and disability of our older population. At the macrolevel, caregiving and caregiver burden will have to be addressed through government policy and private sector programs.

See also: Aging and Health in Old Age; Gender Role Stress and Health; Social Support and Stress; Stress and Coping Theories; Stress and Health Research

Bibliography

Beach S R, Schulz R, Yee J L, Jackson S 2000 Negative and positive health effects of caring for a disabled spouse: Longitudinal findings from the Caregiver Health Effects Study. Psychology and Aging 15: 259–71
Biegel D E, Schulz R 1999 Caregiving and caregiver interventions in aging and mental illness. Family Relations 48: 345–54
Haidt J, Rodin J 1995 Control and Efficacy: An Integrative Review. A report to the John D. and Catherine T. MacArthur Foundation Program on Mental Health and Human Development
Haley W E, Roth D L, Coleton M I, Ford G R, West C A C, Collins R P, Isobe T L 1996 Appraisal, coping, and social support as mediators of wellbeing in black and white family caregivers of patients with Alzheimer’s disease. Journal of Consulting and Clinical Psychology 64: 121–9
Kiecolt-Glaser J K, Dura J R, Speicher C E, Trask O J, Glaser R 1991 Spousal caregivers of dementia victims: Longitudinal changes in immunity and health. Psychosomatic Medicine 53: 345–62
National Alliance for Caregiving and the American Association of Retired Persons 1997 Family Caregiving in the US: Findings from a National Survey. Final Report. National Alliance for Caregiving, Bethesda, MD
Ory M G, Hoffman III R R, Yee J L, Tennstedt S, Schulz R 1999 Prevalence and impact of caregiving: A detailed comparison between dementia and non-dementia caregivers. Dementia and non-dementia caregiving. The Gerontologist 39: 177–85
Pearlin L I, Aneshensel C S, Le Blanc A J 1997 The forms and mechanisms of stress proliferation: The case of AIDS caregivers. Journal of Health and Social Behavior 38: 223–36
Pearlin L I, Aneshensel C S, Mullan J T, Whitlatch C J 1995 Caregiving and its social support. In: Binstock R H, George L K (eds.) Handbook of Aging and the Social Sciences, 4th edn. Academic Press, New York, pp. 283–302
Quittner A L, Glueckauf R L, Jackson D N 1990 Chronic parenting stress: Moderating vs. mediating effects of social support. Journal of Personality and Social Psychology 59: 1266–78
Schulz R (ed.) 2000 Handbook of Dementia Caregiving. Springer, New York
Schulz R, Beach S 1999 Caregiving as a risk factor for mortality. The caregiver health effects study. Journal of the American Medical Association 282: 2215–9
Schulz R, O’Brien A T, Bookwala J, Fleissner K 1995 Psychiatric and physical morbidity effects of Alzheimer’s Disease caregiving: Prevalence, correlates, and causes. The Gerontologist 35: 771–91
Schulz R, Visintainer P, Williamson G M 1990 Psychiatric and physical morbidity effect of caregiving. Journals of Gerontology: Psychological Sciences 45: P181–P191
US Bureau of the Census 1990 The Need for Personal Assistance with Everyday Activities: Recipients and Caregivers. Current Population Reports (Series P-70. Household Economic Studies). Department of Commerce, Washington, DC
Vitaliano P P, Schulz R, Kiecolt-Glaser J, Grant I 1997 Research on physiological and physical concomitants of caregiving:
Where do we go from here? Annals of Behavioral Medicine 19: 117–23
Zarit S H, Todd P A, Zarit J M 1986 Subjective burden of husbands and wives of caregivers: A longitudinal study. The Gerontologist 26: 260–6

R. Schulz

Caregiving in Old Age

1. Caregiving, the Area

Caregiving in old age is the provision of assistance to an elder when his or her health deteriorates, whether it is physical or mental health, or a combination of the two. Caregiving typically refers to unpaid care from members of the informal network, that is, from family or friends. It is an aspect of the more general concept of social support. However, there is no precise scientific definition of the term caregiving, and research reveals a diversity of views. For example, spouses who are providing assistance to their loved one with instrumental activities of daily living are less likely than others, such as children or siblings, to consider themselves as caregivers (more frequently referred to as carers in the UK). Think of the instance where a daughter takes her mother grocery shopping. Either one or both may define it as a chance to socialize, or as assistance, or as simply something they do together.
The literature adds to the confusion with the use of a variety of terms, including caregiving, caring, assistance, interaction, and support, sometimes used synonymously and sometimes not. Despite these difficulties, the area of caregiving in old age has received much research attention from 1970 onwards.

2. In the Beginning

Interest in the area arose during the 1960s and 1970s as a practical concern. During this time, gerontologists documented the extensiveness of social ties, including caregiving, during later life. This was important within the context of the times when it was commonly believed that seniors in Western industrialized societies were largely isolated from their family, living alone, and often housed in long-term care institutions. Gerontological research recorded the falsity of these images by studying the strong interactional ties within the lives of most seniors. Researchers reported the preferred normality of ‘intimacy at a distance,’ in which neither seniors nor their children wish to live together but have a desire for contact. Usually seniors lived geographically proximate to at least one of their children. This period also examined the prevalence of informal, unpaid caregivers as the dominant source of assistance for elders. The informal network was the first resort for care when health declined. This was important because it established the strength of the family as a major source of interpersonal support and care during old age (Chappell 1990).

3. Reaching Maturity

By the late 1970s and especially in the 1980s, research on caregiving was burgeoning. It examined earlier assumptions of the research. For example, early research assumed contact meant assistance or contact included positive support. Research started to investigate whether and under what circumstances social interaction was indeed supportive, when assistance was positive. Research distinguished who provides support and studied the critical role of spouses among married couples, and of children when a spouse is not available, frequently because of death. While caregiving frequently focuses on tasks of activities of daily living, either instrumental activities of daily living (IADL) such as shopping and banking, or basic activities of daily living (ADL) such as going to the toilet, eating, personal mobility, and other activities necessary for survival, the emotional aspects received little attention. Yet it was recognized that it is this emotional element that distinguishes informal care from formal or paid caregiving.
Caregivers were labeled the ‘hidden victims,’ ‘sandwich generation,’ ‘generation-in-the-middle’ (Brody 1981), raising the public recognition of caregiving. It was during this period that Cantor’s (1979) hierarchical compensatory model, also known as the substitution model, received much attention. She argued for an orderly hierarchical selection of caregivers, determined by the primacy of the relationship between the giver and the recipient. According to this view, the most preferred caregiver was the spouse, followed by daughters, sons, other relatives, friends, and neighbors, in that order. Litwak’s (1985) competing hypothesis of task specificity also gained popularity. In this instance, it was argued that persons differentially placed within society provide different types of assistance (spouses can provide emotional and other long-term needs on a continuous basis; neighbors provide short-term, sporadic, and instrumental assistance).
There was less of a focus on support from siblings, friendships, and grandchildren; most studies examined care from the spouse and children, who are the most prevalent care providers. Significant gender differences in caregiving were revealed: women tend to do the emotional and hands-on work while men tend to provide advice and financial assistance (Horowitz 1981). Other studies reported that men are more likely to rely exclusively on their spouse for emotional support whereas women are more likely to rely on friends (Hess and Waring 1980); that caregivers experience burden as a result of their involvement in this role (Zarit et al. 1980); that there is generally one main care provider who does most of the work (Stone
et al. 1987); and that working daughters do not provide fewer hours of caregiving than those who are not in paid labor (Brody et al. 1984).
Still other research indicated that some ethnic groups have different and more extensive caregiving than others, but disentangling how much of the variation is due to ‘culture’ is difficult because ethnic minority status and social class are correlated empirically. It was also during this time that the full extent of caregiving was exposed. In a review of scientific studies throughout industrialized worlds, Kane and Kane (1985) estimated that between 75 and 85 percent of all personal care to seniors comes from the informal network. Chappell (1985) reported that almost all community living elders receiving any type of assistance do so from the informal network. The lack of recognition of the care provided by caregivers and the lack of support for caregivers within formal health care systems was also documented (George 1988).
The sheer volume of research on caregiving established it as a major area and much was learned. Informal caregiving emerged as the indisputable dominant system of care in post-modern society, despite all that has been said about our individualism and lack of concern for one another. Caregiving, it became obvious, is significantly a woman’s issue. Women predominate as the caregivers and the care-receivers. Caregivers are burdened and women who work do not shirk from caregiving.

ized countries. This was related to a perceived crisis in health care system funding and paved the way for a new political rhetoric, recognizing, for the first time in the twentieth century, caregivers and community care. Indeed, family care emerged in the 1990s as a cornerstone of the new vision for health reform throughout the industrialized world.
This new-found political awareness of caregivers brought an urgency to research in this area, highlighting the need for an adequate understanding of caregiving if policies and programs were going to make assumptions about their capacities and their needs. It could not be assumed that the family is necessarily the most appropriate place for caregiving. The area of elder abuse provided an example. Programs that insist seniors stay with families could prolong abusive situations.
The current political interest in caregiving represents a rediscovery of caregiving. Although government had not previously acknowledged the role played by caregivers, they had never replaced private arrangements. In the past, governments intervened only when families and individuals were not coping and came to their attention. At the present time, governments’ embrace of caregivers can be viewed skeptically as a means to cost-shift from the public purse to caregivers, largely women. A concern with increased burden for caregivers is prominent. New questions are being addressed. What are the assumptions about family
caregivers in current policies? What is the impact of
4. Growth in the 1990s different service interventions on caregivers? How can
the formal system support caregivers and how can
By the late 1980s and early 1990s caregiving had caregivers be integrated with formal health care
become a popular area of gerontological research, delivery? The answers to these questions promise
producing studies on a multitude of facets of this exciting new knowledge to assist seniors and their
topic. Studies on burden and stress continued, reveal- families.
ing either no differences in levels between male and In addition, increasingly sophisticated analyses are
female caregivers or, when differences were reported, being conducted and testing of earlier explanations
that female caregivers experience more stress. Re- continues. For example, Penning (1990) demonstrates
search on male caregivers showed their more in- a lack of empirical support for the concepts of the
strumental approach to the role than women have. sandwich generation and generation-in-the-middle.
Positive aspects of caregiving such as feeling useful Her research supports serial caregiving as a more apt
and bringing comfort to a loved one were being descriptor of this phenomenon, since individuals are
studied. Caregivers, furthermore, are reluctant users usually involved in raising their children, then caring
of formal services. However, the last decade of the for their parents, then caring for their husband rather
twentieth century did not produce simply more of the than being engaged in all of these roles at one time. In
same. It also saw another substantive shift in this area addition to the characteristics of caregivers and the
of inquiry. Caregiving became politicized. burdens of caring, researchers are beginning to study
Before this time, it was recognized within academic the meaning of caring. Wenger et al. (1996) suggest
and practice circles as important, but it was not conceptualizating caregiving in terms of purposes and
recognized at the political level. This changed for two relationships rather than tasks, taking the everyday
major reasons. One was heightened awareness from a experiences of caregivers and care recipients into
feminist perspective that aging and caregiving, as a account. For example, one can examine a preparatory
woman’s issue, had been more or less invisible as part stage of anticipating care; being involved in having to
of the private domain and not part of a public debate. provide preventive care, such as ensuring the person
Public policy had been operating by assuming that eats well and exercises; and being involved in the
women’s caregiver roles would simply continue supervision of care.
(Hooyman 1990). A second major factor was the New theoretical insights into the socioemotional
prolonged economic recession throughout industrial- context of relationships has direct relevance for care-


giving. Lang et al. (1998), for example, find increasingly discriminating choices in our social interactions as we age. Baltes (1996) studies etiologies (causes) of behavioral dependency such as learned helplessness, learned dependency and selective optimization with compensation and the prominent role the social environment plays in this regard. Her research directs attention to overdependency among care recipients, an under-researched area to date.

5. Conclusions

Interest in caregiving in old age arose from, and has been sustained by, a pragmatic applied concern. It began with assumptions about supportive relationships and about the role of the family, and over time has focused on the complexity of both the definition of caregiving and its contextual fields. Only recently has there been conceptual development in this area with current attempts to examine the meaning of the term ‘caregiving’ and how that varies from group to group. At the present time, complex conceptual issues are starting to be addressed: When is interaction caregiving? Whose definition counts? How are caregivers taken into account within the health care system? How can they be taken into account? Whose definitions of caregiving are appropriate for service delivery? How can the autonomy of caregivers be maintained? The conceptual issues relate to methodological issues. If the mother does not consider the daughter’s efforts to be caregiving but the daughter does, whose definition does the researcher accept, if either? If the wife is cooking the meals anyway and always has, is this part of caregiving when her husband’s health declines? When computing the economic value of caregiving, do we include those times and tasks governments would not provide?

Future directions for caregiving research in the short term seem more or less clear. Renewed interest in the area has been fueled by the recognition of caregivers at the political level in the new vision of health reform.

See also: Aging, Theories of; Care and Gender; Caregiver Burden; Ecology of Aging; Health Care Delivery Services; Health Care Markets: Theory and Practice; Health Care, Rationing of; Health Care Systems: Comparative; Lifespan Development: Evolutionary Perspectives; Lifespan Theories of Cognitive Development; Population Aging: Economic and Social Consequences; Psychoneuroimmunology

Bibliography

Baltes M M 1996 Many Faces of Dependency in Old Age. Cambridge University Press, Cambridge, UK
Brody E M 1981 Women in the middle and family help to older people. The Gerontologist 21: 471–80
Brody E M, Johnsen P T, Fulcomer M C 1984 What should adult children do for elderly parents? Opinions and preferences of three generations of women. Journal of Gerontology 39: 736–46
Cantor M H 1979 Neighbors and friends: an overlooked resource in the informal support system. Research on Aging 1: 434–63
Chappell N L 1985 Social support and the receipt of home care services. The Gerontologist 25: 47–54
Chappell N L 1990 Aging and social care. In: Binstock R H, George L K (eds.) Handbook of Aging and the Social Sciences, 3rd edn. Academic Press, San Diego, CA, pp. 438–54
George L K 1988 Why won’t caregivers use community services? Unexpected findings from a respite care demonstration/evaluation. In: George L K, Fillenbaum G G, Burchett B M (eds.) Respite Care: A Strategy for Easing Caregiver Burden: Final Report. Duke University, Center for the Study of Aging Family Support Program. Durham, North Carolina
Hess B B, Waring J M 1980 Changing patterns of aging and family bonds in later life. In: Skolnick A, Skolnick J H (eds.) Family in Transition. Little, Brown, and Co., Boston, pp. 521–37
Hooyman N 1990 Women as caregivers of the elderly: social implications for social welfare policy and practice. In: Biegel D E, Blum A (eds.) Aging and Caregiving: Theory, Research and Policy. Sage, Newbury Park, CA, pp. 221–41
Horowitz A 1981 Sons and daughters as caregivers to older parents: Differences in role performance and consequences. Paper presented at the annual meeting of the Gerontological Society of America, Toronto
Kane R A, Kane R L 1985 The feasibility of universal long-term care benefits. New England Journal of Medicine 312: 1357–64
Lang F R, Staudinger U M, Carstensen L L 1998 Perspectives on socioemotional selectivity in late life: How personality and social context do (and do not) make a difference. Journal of Gerontology 53B: 21–30
Litwak E 1985 Helping the Elderly: The Complementary Roles of Informal Networks and Formal Systems. Guilford Press, New York
Penning M J 1990 Receipt of assistance by elderly people: hierarchical selection and task specificity. The Gerontologist 30: 220–7
Stone R, Cafferata G L, Sangl J 1987 Caregivers of the frail elderly: A national profile. The Gerontologist 27: 616–26
Wenger C, Grant G, Nolan M 1996 Older people as carers as well as recipients of care. In: Minichiello V, Chappell N, Kendig H, Walker A (eds.) Sociology of Aging. International Sociological Association, Research Committee on Aging, Australia, pp. 189–206
Zarit S H, Reever K E, Bach-Peterson J 1980 Relatives of impaired elderly: Correlates of feelings of burden. The Gerontologist 20: 649–55

N. L. Chappell

Cargo Cults

1. Preliminary Definition

Cargo cults or movements are socio-magico-religious activities which have been occurring mainly in Melanesia (that part of Oceania comprising the island


archipelagoes from Irian Jaya in the west, to the east and southeast through Papua New Guinea, the Bismarcks, Solomons, Vanuatu, Fiji, and New Caledonia) since the 1850s (Steinbauer 1971, p. 181) until the present day. They include genuine sociopolitical and economic aspirations, but are in essence millenarian in nature. That is, the activities, usually initiated by a prophet-like figure, are directed towards obtaining or greeting an expected state of bliss or contentment, the latter being envisaged as a free access to cargo (the goods and foodstuffs offloaded from the ships and, more recently, the aircraft of industrialized countries).

2. Alternative Nomenclature

Despite the name ‘cargo’ the cults may be considered a subset in a local idiom of all those movements of socio-religious reform, revival, and renewal which, endemic to the Christian ambience, have been variously called millenarian, messianic, or enthusiastic movements. Although similar activities have occurred where Christian influences have been marginal, including one cargo cult (Berndt 1952), the vast bulk of recorded instances have taken place where Christianity has been more or less established.

Still, to avoid Christian associations and the local cargo idiom, and to bring cargo movements into line with similar kinds of activities found elsewhere in Oceania and among the indigenous peoples of the Americas, sub-Saharan Africa, Australasia, and parts of Asia (areas which Europeans have colonized or settled), a number of alternative terms to cargo have been coined according to apparent main emphases. Thus Accommodation, Acculturation, Adaptation, and Adjustment, also as verbs, reveal the activities as attempts to reconcile tradition with the ways of a more powerful and intrusive culture; Crisis and Disaster stress the effects of prior traumatic natural or cultural events; Nativistic, Militant, and Denunciatory emphasize the movements as protests against foreign rule together with the attempt to resolve problems by reaching back into tradition; Dynamistic, Vitalization, and Revitalization accent a positive cultural renewal in the face of what is perceived as decay or decadence; Prophetic, Charismatic, and Messianic stress the importance given to the leader of a movement; and Christian influences are more directly evoked by Holy Spirit and Salvation movements.

While some cargo cults have occurred elsewhere in Oceania, they are usefully regarded as a Melanesian phenomenon. More than 400 instances have been recorded since the 1850s, and one must assume that others, in secluded areas, have gone unrecorded. The decade after the end of World War II, during which allied forces had deployed vast quantities of war materiel and general supplies throughout Melanesia, saw acceleration in the number of instances.

3. Background

From early in the nineteenth century the many Melanesian coastal communities have been subject to the increasing pressures of Euro-American imperialism. Nonliterate and differentiated by many languages and dialects but using Pidgin (Tok pisin) as a lingua franca, these hunting, fishing, and horticultural communities with stone age technologies came into contact with Euro-American industrialized civilizations. Colonial forms of order were imposed. New infectious diseases ravaged populations. Money, taxes, plantations, and forms of indentured and contract labor were introduced. From the earliest years, Christian missionaries of different denominations communicated their varying messages and doctrines. Fortune hunters, labor recruiters, traders, and prospectors tended to be ‘rough’ as well as ‘ready.’ In turn, Germans, British, Australians, Japanese, and Americans have warred and brought their own goods and modes of government. Through all this Melanesians tried to live within their own community organizations based upon simple subsistence economies (defined as without money, a common and factorial medium of exchange) plus, in many areas, plantation or other service labor for cash, which alone could buy cargo. Some Melanesians, vulnerable to unstable world markets, engaged in cash cropping.

4. Generalized Course of a Movement

One of the more engaging qualities of Melanesians is their phlegmatic acceptance of the wonders of modern technology. Still, what seems to have impressed them most about white people has been not their particular human vices and virtues or their military prowess, but their apparent free access to cargo: all those useful artifacts, sacks of flour and rice, canned foods, and frozen carcasses. A persistent theme of day-to-day life was, and in places still is, the ‘secret’ of that access. Often, one who alludes to the subject is regarded as a bore and ignored. Still, the problem itself remains: speculation and supposition are joined to traditional myths and, indeed, Bible stories, particularly Noah’s curse on the sons of Ham (Genesis 9:25), to explain why whites seemed to have free access to cargo while Melanesians did not. Such stories and speculations make up what Burridge (1963, p. 147) has called a myth-dream: a compost of hopes, desires, and possibilities which a prophet or visionary may bring into a focus and coherence through action instructions.

Given the context of the myth-dream, the prophet usually reports a peculiar experience: a dream or vision, an encounter with a traditional spirit entity, an ancestor perhaps, or a Christian representation such as the archangel Gabriel, Virgin Mary, Holy Spirit, or other entity. Such an experience may become part of the myth-dream even if, as often occurs, it is shrugged


off. Sometimes, however, prophet, mystical encounter, and action instructions come together into cogent and persuasive synthesis. What it is that transforms a possibly overwrought imagination into an active cult or movement, which includes those who are otherwise sober, practical, and businesslike, is, so far, elusive. Sometimes visionary, prophet, and leader are contained in the same person. At other times, depending on capacities for organizing others, they separate. The explicit objective of a movement, access to cargo, as well as the means thereto have led to the activities being called ‘bizarre.’ These have included: destruction of crops and/or traditional sacra; marching about with wooden ‘rifles’; taking scribbled pieces of paper (‘cheques’) to a store; signaling for cargo with ‘radios’ of palm thatch; forms of ‘baptism’ (taken from Christian rites); orgiastic dances and sexual promiscuity (re-enacting traditional creation myths in which chaos and disorder give way to the moralities and rules of sexual access and marriage). Often there is a rider: those who do not participate will be doomed to destruction.

Failure of the cargo to arrive, or administrative action, usually brings activities to an end. Still, they may smolder and develop, as the Jon Frum movement on Tanna in Vanuatu (Guiart 1962) has done over the years into stable, if syncretic, sects. However, discussions continue and the myth-dream is kept alive. The question of the ‘secret’ remains: how do white people get cargo?

5. Interpretations

The kinds of interpretation of cargo cults are implicit in the nomenclature: reactions with differing emphases to the cultural stresses of Melanesian history that have been summarized in the previous sections. Socioeconomic analyses that ultimately point to kinds of deprivation, especially problems attending the advent of cash, have dominated. Such interpretations tend to be historically particularist. They do not necessarily suffice for very similar kinds of movement in, for example, California where, instead of cargo, the idiom may be access to spacecraft or redemption by aliens from space. Although one is forced to consider the conditions of life of participants, these conditions are not exhausted by the purely socioeconomic or even political.

For example, in Tok pisin the word wok (= work) refers to magical rites and spells, cult activities and also to work in gardens, forests, or at sea, which last translates into food, giving feasts, making exchanges, gaining and maintaining relative status, and so political influence. Further, the symbolisms in the activities bespeak an ending of the old or present ways of life and a fresh start. This may be a movement from an economy based on exchanges of foodstuffs and valuables such as shell necklaces or chaplets of dogs’ teeth, to one based upon cash which, in turn, would provide access to cargo and, perhaps, an equivalence in criteria of status in relation to whites.

Cargo movements thus envisage the creation of a ‘New Man,’ with access to cargo and capable of competing with whites on their own terms. If a new heaven seems cloudy, it is mainly embodied in a typical Melanesian way in a new earth where access to cargo will reward the faithful, with more for the most astute.

The incidence of cargo cults, in particular the high concentration in former New Guinea where Tok pisin is/was most used, hints at other factors at work. The few occurrences in Polynesia and Micronesia, where Tok pisin is absent, and in the Papua New Guinea Highlands where, despite a more ordered colonial penetration, much the same socioeconomic conditions in relation to whites existed, present problems. One answer is that in the context of the myth-dream the Tok pisin word kago (= cargo) accretes to itself a transcendent sense of redemption from perceived inequities which the English ‘cargo’ does not.

Furthermore, it is noticeable that cargo cults tend to occur most frequently in those areas, such as in former New Guinea excluding the Highlands, where (due in part to population depletion from introduced diseases) community organizations involved shifting political and social leaderships. The consequence was infirm political authority and ambiguities of obligation and identity, just that ‘anomy and incertaintie’ noted in the seventeenth century of English enthusiastic movements (Burridge 1960, p. 13). Elsewhere in Oceania as well as in the Highlands of Papua New Guinea not only were (and are) communities much larger and more stable, but the loci of obligation and political authority have persisted in reasonable certainty.

6. Conclusion

Over the space of 70 years for the Highlands of Papua New Guinea, and an added near century for most of the peoples in the coastal and intermediate areas, Melanesians have moved from a stone age technology rich in symbolism (where every quality of human character and nuance of change in climate, vegetation, or animal behavior was charged with meaning) into an ambience where bureaucracy, science, and reason are paramount. The former qualities are discounted in favor of quantity, and symbolism is derided as superstition. In such a world, except where access to money and the goods it can buy is difficult, for most Melanesians cargo cults are becoming an anachronism.

As a note of caution, long ago cult leaders and influential Melanesians were taken to Australia where they were conducted around factories to see how cargo was made. Yet this only took the central question one step deeper: whence the ability to make cargo? Why were the necessary resources not available in New


Guinea? Perhaps, following American example and in contemporary idiom, future cult activities may focus on the Book of Revelation from the Bible or be directed toward access to space vehicles. Still, they will pose much the same basic questions for social scientists as have cargo movements. The question of just why a cult does or does not occur is still a mystery. Melanesian scholars will no doubt reveal many new nuances, but whether they will do any better in relation to their movements than Europeans have done for theirs remains to be seen.

See also: Belief, Anthropology of; Cognitive Anthropology; Conflict: Anthropological Aspects; Economic Anthropology; Exchange in Anthropology; Horticultural Societies; Melanesia: Sociocultural Aspects; Millennialism; Political Anthropology; Political Economy in Anthropology; Symbolism in Anthropology

Bibliography

Belshaw C S 1954 Changing Melanesia. Oxford University Press, Melbourne, Australia
Berndt R M 1952 A cargo movement in the central Highlands of New Guinea. Oceania XXIII: 40–65, 137–58, 202–34
Burridge K O L 1960 Mambu. Methuen, London. [1970 Harper Torchbooks, New York; 1995 Princeton University Press, Princeton, NJ]
Burridge K 1969 New Heaven, New Earth. Basil Blackwell, Oxford, UK
Guiart J 1962 Les Religions de l’Océanie. Presses Universitaires de France, Paris
Knox R A 1950 Enthusiasm. Clarendon Press, Oxford, UK
Lawrence P 1964/1971 Road Belong Cargo. Manchester University Press, London
Lindstrom L 1990 Knowledge and Power in a South Pacific Society. Smithsonian Institution Press, Washington, DC
Steinbauer F 1971 Melanesian Cargo Cults [trans. Wohlwill M]. George Prior Publishers, London
Trompf G W (ed.) 1990 Cargo Cults and Millenarian Movements. Mouton de Gruyter, Berlin
Wallace A F C 1956 Revitalization movements. American Anthropologist LVIII: 264–81
Worsley P 1957 The Trumpet Shall Sound. MacGibbon & Kee, London (Rev. Edn. 1968)

K. Burridge

Caribbean: Sociocultural Aspects

1. A String of Isles

The Caribbean archipelago, sometimes called the Antilles or the West Indies, comprises nearly a thousand identifiable islands and islets that spread over 2,500 miles around the Caribbean Sea. The archipelago spans southeasterly from southern Florida to the Gulf of Paria, then turns westward along the coast of Venezuela.

The islands are divided into four groups. The most northern group is the Bahamas, a string of several hundred islets, of which 29 are inhabited, that stands apart from the Antilles. The next two groups, the Antilles proper, are the demographic, economic, and cultural heart of the region. The Greater Antilles consist of the islands of Cuba, Hispaniola (comprising Haiti and the Dominican Republic), Jamaica, Puerto Rico, and their respective dependencies. They encompass the bulk of the Antillean lands and house three-quarters of the region’s inhabitants. The Lesser Antilles, which include the Leeward and Windward Islands, span like a crescent at the eastern end of the archipelago, and count about 40 inhabited isles. A fourth and looser grouping includes the islands off Venezuela, from Tobago and Trinidad to Aruba. Beyond the archipelago, most social scientists now include in the Caribbean the Colombian islands off the coast of Nicaragua and the mainland territories of the Guianas (Guyana, Suriname, Belize, and Cayenne, also known as French Guiana).

2. An Obvious Heterogeneity

Scholarly treatments of the Caribbean as a single object of study, and the related conceptualization of the region as a distinguishable sociocultural area, are both recent and controversial. Poets and novelists, such as Cuban writers Alejo Carpentier and Nicolás Guillén, were the first to herald Caribbean sociocultural unity early in the twentieth century. This call was renewed by recent literary figures such as Antonio Benítez-Rojo and Édouard Glissant. Most social scientists, however, took the old colonial boundaries for granted, dividing the region into linguistic spheres that duplicated the European dominions and across which they saw few similarities. By the early 1960s, international symposia on the region (e.g., Rubin 1960) and the comparative sketches that appeared in the Nieuwe West Indische Gids, Caribbean Studies, and Social and Economic Studies implicitly acknowledged the overall unity of the Caribbean. Soon after, explicit treatments of the region as a single object of study, and parallel efforts to conceptualize its structural similarities, emerged in European geography (e.g., Lowenthal 1972) and North American anthropology (e.g., Mintz 1966, 1984).

The earlier reluctance of social scientists to view the Caribbean as a whole is understandable, and not just because their research evolved within European colonial studies and often reproduced the insularism of the local elites. With 40 million people spread over some 90 territories, most of which are tiny and surrounded by sea, the Caribbean displays a wide range of similarities and differences. No single sociological or cultural feature stands out as the defining


essence of the region. Thus, complexity in regard to size and population and social heterogeneity within and across political boundaries are two defining themes of Caribbean Studies, even among the social scientists who question the singleness of the region (Trouillot 1992).

Social scientists today see the division of labor, past and present, as the root cause of Caribbean diversity both within and across territories (Cross and Heuman 1988). In the last five centuries, the Caribbean has experienced a wide range of labor regimes: slavery, indentured labor, sharecropping, peasant agriculture, simple commodity production in family-based enterprises, agricultural and industrial wage labor. Except for the barely known practices of its earliest inhabitants, the Arawaks and Island Caribs who probably specialized in swidden agriculture and small-scale fishing respectively, these labor regimes all reflect the incorporation of the Caribbean in the capitalist world economy. Today, while most Caribbean men and women still derive the bulk of their income from agriculture, some of the islands have specialized in offshore banking. Others have become favored hubs in assembly industries or electronic communications. Others rely heavily on tourism, especially from the United States, which now exercises unmatched economic and political power over most independent states within the region.

The political scene is as varied as the economic landscape (Stone 1985). The Caribbean bears the mark of a variety of political systems: colonies and neocolonies with different degrees of limited autonomy; three monarchical experiments in nineteenth-century Haiti; Westminster-style parliamentarism in the former British colonies; civil and military dictatorships notably in Cuba, the Dominican Republic, Haiti, and Suriname; and various unusual and self-labeled versions of socialist rule. Elections are now the norm in most countries, but the formal adherence to democratic procedures barely hides a wide variety of practices from blatant fraud and unabashed clientelism to single-party rule and electoral divisions along ethnic lines.

This mosaic of economic and political formulas parallels an impressive variety of languages and religions. Caribbean languages include Spanish, French, English, Dutch, and Creole languages of all kinds, echoes of the many European nations that influenced the area (Taylor 1977, Christie 1996). Likewise, almost all the variations of Christianity can be found in the area. Jewish influence is not to be dismissed; nor is that of Islam. Hinduism and Buddhism thrive in Trinidad and Guyana. Native Caribbean religions complete the denominational mosaic: Haitian Vodoun, Trinidadian Shango, Cuban Santería, Jamaican Pocomania and Rastafarianism, belief systems and practices where the Old World meets the New. The influence of the Old World reaches its peak with the African contribution. Caribbean populations descend directly from regions of Africa that correspond to at least 18 contemporary states.

3. Family Resemblance: The Shapes of History

Beyond this profusion of traits and descent lines, the unity of the Caribbean is one of family resemblance. Different characteristics account for the likeness between any two territories within and across linguistic boundaries. Yet the web of parallelisms, relationships, and genealogies that spans the entire archipelago and spills over into the Guianas makes the area, as a whole, quite distinct from any other world region.

Caribbean family resemblance is governed by the shared experience of power. Caribbean societies evolved in the shadow of power wielded almost always vertically from within and almost always resting on the ultimate domination of Europe and, later, of Europe and the United States. Six overlapping features typify this exercise of power: (a) the decimation of the native population, which created an effective terra nullius, a land literally up for grabs; (b) a subaltern integration in the international order characterized by the duration and extent of external rule (thus the depth of intrusion in local life); (c) the extreme regimentation of populations that paralleled this intrusion; (d) plantation slavery as the epitome of the last two features; (e) the continuity of institutionalized, often state-enforced, exclusions and hierarchies; and (f) the no less continuous regimentation of cultural practices. Scholarship on the region now tends to address these themes singly or in combination.

Modern Caribbean history starts in the early sixteenth century with the swift decimation of the original inhabitants of the Antilles from disease, warfare, mistreatment, and forced labor. From then on, Europeans moved through the islands as if they were empty lands to be fashioned exclusively for goals that originated elsewhere. Indeed, the Caribbean stands out in the world as an exceptional product of European colonialism. Caribbean territories have experienced Western European influence longer and more profoundly than any other area outside of Europe itself. Nowhere else have European states held onto dependencies for so long and shaped them so deeply without having to take into account the strength or relevance of native institutions. Almost everything that we now associate with the Caribbean, from sugar cane, coffee, mangoes, bananas, donkeys, and coconuts to the people themselves, whether African or Asian in origin, was brought there as part of the European conquest.

External rule dictated the coercion of labor and the regimentation of local populations, both of which reached their peak during the some 370 years that African slavery lasted in the Caribbean. The slave trade itself lasted from about 1518 to 1873. During

Caribbean: Sociocultural Aspects

that time, the Caribbean imported at least four million African slaves, perhaps one third of all the Africans who came to the Americas. In comparison, the United States imported about half a million.

Caribbean slavery was plantation slavery, sponsored by European capitalists and geared toward the production of tropical crops for export. Sugar cane and coffee dominated the system, but tobacco, cotton, and indigo were also important. The impact of slavery on the cultural and social life of Caribbean populations may be the most evident feature behind Caribbean family resemblance. The role of particular territories varied during the centuries of slavery. Yet the entire region was deeply molded by that experience.

In 1791, the slaves of Saint-Domingue/Haiti, then France’s most lucrative colony and perhaps the most profitable dependency of any European power, started an uprising that augured a new phase in Caribbean history. After 13 years of war, they defeated the formidable army sent by Napoleon to restore slavery and proclaimed the independence of Haiti in 1804. The Haitian Revolution signaled the end of Atlantic slavery. Throughout the century, European powers successively abolished either the slave trade or slavery itself in a process completed in the Antilles with the abolition of slavery in Puerto Rico (1873) and Cuba (1880).

Abolition did not bring an end to the plantation system or the coercion of the labor force. Hundreds of thousands of indentured laborers, mainly from China and South Asia, were brought to replace the blacks on sugar plantations. Caribbean people of South Asian descent now constitute important ethnic subgroups from Guadeloupe to Suriname and, most notably, in Trinidad and Guyana. In the latter two places, the Afro-Caribbean descendants of the former slaves found new niches in the agro-social system, becoming independent peasants or, later, part of a growing middle class tied to the state. The peasantry was even stronger in the mountainous islands of the Windwards, and in Haiti and Jamaica. At the other end of the spectrum the plantation system outlasted slavery, even without massive input of Asian labor, in places like Barbados, Cuba, Antigua, or Puerto Rico, though in Puerto Rico it coexisted with a peasantry whose roots anteceded freedom itself.

Whether paid laborers or independent peasants, Caribbean rural dwellers tend to be marginalized politically, socially, and culturally by local elites that replaced European colonizers. Indeed, the whole region is deeply marked by social exclusion based on markers such as skin color, ethnic affiliation, class origins, occupational status, religion, and language. That exclusion, often institutionalized, intensifies a regimentation of cultural practices which dates back to slavery. Explicit or tacit codes limiting the reach and value of Creole speech, the denigration of Afro-Caribbean religions throughout the region, the persecution of Rastafarians in Dominica or Jamaica, of Maroons in Suriname and of homosexuals in Cuba, are all evidence that cultural coercion continues in the Caribbean.

4. The Caribbean in the Social Sciences

Caribbean social science started long before the institutionalization of current disciplines, with the field observations and commentaries of colonists such as Bartolomé de las Casas, Jean-Baptiste Labat, Bryan Edwards, John Gabriel Stedman, and Moreau de Saint-Méry. Since then, local elites, colonial and postcolonial, have generated a huge literature that prefigures and continues to fuel the themes favored today by professional academics (Lewis 1983). Haiti and Cuba held the lead in the social sciences until the second half of the twentieth century. Since the 1960s, however, the anglophone Caribbean has generated a spectacular amount of social scholarship. The impact of the various local schools and the widespread acknowledgement outside of the region that its colonial history profoundly shaped its present give Caribbean social science a decisive historical bent.

History generates the greatest amount of scholarship about and from the region. The record begins with the colonists, includes the Haitian pioneers of postcolonial history, Thomas Madiou and Beaubrun Ardouin, and mushrooms in the twentieth century with the works of Elsa Goveia, C. L. R. James, Eric Williams, Manuel Moreno Fraginals, Jean Fouchard, Walter Rodney, Barry Higman, and a number of historians from the University of the West Indies. Williams set the connection between capitalism and slavery as a central theme of Caribbean and Afro-American studies. James made the Haitian Revolution a legitimate object of study outside of Haiti. Fouchard pioneered twentieth-century studies of maroon slaves.

Today, the anglophone world (England and especially the United States) produces the greatest number of titles on Caribbean history, although contributions from Dutch researchers remain significant. The state of Caribbean historical research is summarized in UNESCO’s six-volume General History of the Caribbean. Slave studies remain a highlight of current research, with increasingly detailed accounts of plantation life and coordinates. In recent years, the Haitian Revolution, slave resistance, the rise of peasantries, women’s history, postslavery life in general, and US interventions in the region have attracted increased attention.

History’s lead is not confined to the number of titles it generates. The disciplines next in line (anthropology in particular, but also geography, sociology, economics, and even political science) tend to frame their objects against a strong historical background. Because Caribbean social science cannot ignore the


role of North Atlantic power in shaping the region, it constantly harks back to the consequences of colonialism, of external domination, and of the region’s integration in the capitalist world economy. For instance, economics focuses as much on the plantation system, past and present, as on the global mechanisms that sustain the Caribbean’s incorporation in the Atlantic world, often echoing dependency theory as expounded in continental Latin America (Beckford 1999).

Even economists not aligned with theories of dependency take power and history into account. Thus, while Sir Arthur Lewis’s focus on ‘the dual economy’ and the ‘traditional sector’ won him a Nobel Memorial Prize, fellow Caribbeanists are acutely aware that the work of the St. Lucian economist was deeply involved in the history that produced the dualism he described (Lewis 1978). Within Caribbean Studies, the social sciences in general move in tandem with historiography, always taking into account the state of knowledge about where these societies come from even when the explicit purpose is to explain where they are now or where they might be going in the future.

Not surprisingly, a leading theme in Caribbean social studies has been the continuous assessment of the specific heritage of various Old World regions on particular territories when not on the entire area. Here again, social scientists have to navigate between obvious signs of blending and no less obvious signs of heterogeneity to account for their observations. Thus, local observers have produced a number of theoretical schemes (such as Cuban Fernando Ortiz’s notion of ‘transculturation,’ later recycled by North Atlantic anthropology) to bypass the difficulties inherent in the enterprise. On the ground, however, Ortiz himself joined a number of researchers who focused on Africa’s influence on social, political, and economic institutions (from female-dominated retail trade to matrifocal families), cultural beliefs and practices (from religion to language), and the arts (most notably music). The attention to gender roles and family structure is central to the work of a number of scholars, not all of whom accept the centrality of the African legacy (e.g., Smith 1988). On the other hand, the emphasis on African continuities is central to the work of Haitian Jean Price-Mars, a founder of the Négritude movement, who renewed the call for the reevaluation of Africa first made by early nineteenth-century Haitian writers.

The more specific search for sub-Saharan ‘survivals’ became institutionalized in North Atlantic universities with the work of US anthropologist Melville Herskovits, who saw the Caribbean as a subset of the Afro-Americas, much like his French counterpart, Roger Bastide. Price-Mars and Herskovits also share an emphasis on the ‘cultural ambivalence’ of Caribbean elites, torn in their eyes between their African and European heritage, an ongoing subtheme of Caribbean critical inquiries, most recently sustained by the cultural critique of Rex Nettleford and linguist Mervyn Alleyne in Jamaica. Since the late twentieth century, however, a number of writers, notably anthropologists Mintz and Price (1992), have critically reassessed the Herskovits agenda. Many scholars now emphasize the principles and processes behind Caribbean sociocultural development rather than trace the Old World origins of singular traits, thus overlapping studies in creolization.

Creolization (the process by which newly arrived populations facing severe physical and institutional constraints generated a distinct mode of life and developed cultural beliefs and practices, including linguistic practices, that have become distinctively native to the region) has a long pedigree in Caribbean studies. Here again, the puzzle remains how to reconcile heterogeneity and family resemblance. A century after the beginning of the slave trade, local and foreign observers began to describe the features and development of Creole languages and wondered about the similarities across islands. Creole linguistics has now become a subfield with its own insular debates. Yet other aspects of the creolization process and the nature of Creole society itself remain central and cross-disciplinary themes in Caribbean social and cultural studies (Arnold 1998).

The emphasis on social and cultural blending, favored by various students of creolization, has long run counter to the emphasis on heterogeneity among observers who see Caribbean societies as unwieldy patchworks of groups solidly marked by skin color and artificially joined by power. Here again, the trend starts in colonial times. It runs through the nineteenth century and is revived in the early twentieth by writers such as Lorimer Denis and François Duvalier in Haiti. Yet it took Jamaican anthropologist M. G. Smith’s application of the plural society model (1965) to give that line of research much of its firmness and exposure in Caribbean studies. Smith’s work continues to fuel virulent controversies, but the fundamental question it raises remains unavoidable. Given what is known of Caribbean heterogeneity, how does a social scientist relate current institutions and other elements of the social system to the cultural traditions of the peoples involved?

Yet the question leads to an impasse only if one looks for a unified content, a single essence that epitomizes Caribbeanness. The now common vision of the Caribbean as a sociocultural area where heterogeneity and family resemblance are inherently indissociable products of a history of uneven power provides the frame within which most scholars approach the region today (Mintz and Price 1985, Watts 1987).

5. The Exploding Islands

Just as the vision of the Caribbean as a single object of study becomes accepted in social science circles (e.g.,


Mintz and Price 1985, Cross and Heuman 1988), it faces new challenges in a world increasingly dominated by its powerful US neighbor (Serbin 1998). It must now contend with the growth in rural–urban migration, the crossing of borders within the region and the size and visibility of Caribbean diasporas in the North Atlantic.

Caribbean people have been moving within and across political borders since slavery days, and migration has been a major theme in Caribbean Studies since at least the 1960s. Today, however, migratory flows within single islands, within the region, and away from it have reached unprecedented proportions. As peasants and rural proletarians rush to the cities, the expansion of urban slums now redefines the sociocultural landscape in all territories. The growing presence of intra-regional migrants (such as Haitian cane cutters in the Dominican Republic, Dominicans in Guadeloupe and Martinique) also questions boundaries once thought impermeable. Most important, outside contact has become part of daily local life. The number and visibility of Caribbean migrants in North Atlantic countries, and the fact that breakthroughs in transport and communications enhance their economic and cultural impact on their countries of origin, make diasporas an inherent part of Caribbean Studies (Brana-Shute 1983). Urban glut and the frequency of outside contact have also increased the crime rate, notably infractions associated with drug use or trade. The concept of a single sociocultural area will have to accommodate this geographical and social remapping of the region and redraw its own heuristic boundaries accordingly.

See also: Colonialism, Anthropology of; Colonialism: Political Aspects; Colonization and Colonialism, History of; Creolization: Sociocultural Aspects; Historiography and Historical Thought: Indigenous Cultures in the Americas; Slavery as Social Institution; Slavery: Comparative Aspects; Slaves/Slavery, History of

Bibliography

Arnold A J (ed.) 1998 Who/what is ‘creole’? Plantation Society in the Americas 5(1)
Beckford G 1972/1999 Persistent Poverty: Underdevelopment in Plantation Economies of the Third World. The University of the West Indies, Kingston, Jamaica
Brana-Shute R 1983 A Bibliography of Caribbean Migration and Caribbean Immigrant Communities. University of Florida, Gainesville, FL
Christie P 1996 Caribbean Language Issues, Old & New: Papers in Honour of Professor Mervyn Alleyne. University of the West Indies, Kingston, Jamaica
Cross M, Heuman G (eds.) 1988 Labour in the Caribbean. Macmillan Caribbean, London
Lewis G K 1983 Main Currents in Caribbean Thought. The Historical Evolution of Caribbean Society in its Ideological Aspects, 1492–1900. The Johns Hopkins University Press, Baltimore, MD
Lewis W A 1978 The Evolution of the International Economic Order. Princeton University Press, Princeton, NJ
Lowenthal D 1972 West Indian Societies. Oxford University Press, Oxford, UK
Mintz S W 1966 The Caribbean as a socio-cultural area. Cahiers d’Histoire Mondiale IX: 916–41
Mintz S W 1974/1984 Caribbean Transformations. The Johns Hopkins University Press, Baltimore, MD
Mintz S W, Price R 1976/1992 The Birth of an African-American Culture: An Anthropological Perspective. Beacon Press, Boston
Mintz S W, Price S 1985 Caribbean Contours. The Johns Hopkins University Press, Baltimore, MD
Rubin V D (ed.) 1960 Caribbean Studies: A Symposium. University of Washington, Seattle, WA
Serbin A 1998 Sunset over The Islands. The Caribbean in an Age of Global and Regional Challenges. St. Martin’s Press, New York
Smith M G 1965 The Plural Society in the British West Indies. University of California Press, Berkeley, CA
Smith R T 1988 Kinship and Class in the West Indies. Cambridge University Press, Cambridge, UK
Stone C 1985 The Caribbean as a political region. In: Mintz S W, Price S (eds.) Caribbean Contours. The Johns Hopkins University Press, Baltimore, MD
Taylor D M 1977 Languages of the West Indies. The Johns Hopkins University Press, Baltimore, MD
Trouillot M-R 1992 The Caribbean region: An open frontier in anthropological theory. Annual Review of Anthropology 21: 19–42
UNESCO 1997–2001 General History of the Caribbean. UNESCO and Macmillan Caribbean, London, 6 Vols.
Watts D 1987 The West Indies: Patterns of Development, Culture and Environmental Change since 1492. Cambridge University Press, Cambridge, UK

M. R. Trouillot


Cartographic Visualization

1. Setting

Assume one is interested in an overview of the population distribution of a particular area. An option would be to give a written description such as: ‘in the northeast only a few small towns are found, while in the southeast ….’ Before one gets a clear picture, several pages could have been filled, especially if a description of the relations of the towns with other geographic phenomena such as the rivers or railroad network is also incorporated. A better option is to


tell this story using a map. Maps are the most efficient and effective means to transfer geospatial information. The map user can locate geographic objects, while the shape and color of signs and symbols representing the objects inform about their characteristics, such as the town’s location and population size. Maps reveal geospatial relations and patterns, and offer the user insight into, and an overview of, the distribution of particular phenomena, such as an area’s population distribution.

Figure 1
Population distribution

The map easily allows one to answer questions of the nature ‘Where …?’, ‘What …?’, and ‘When …?’ These questions deal with the basic components of geospatial data: location, characteristics, and time, or their combination. The map discussed above and found in Fig. 1 will quickly result in answers to questions such as ‘Where are the large towns located?’ or ‘What town has the highest number of inhabitants?’ Questions regarding ‘When …?’ depend on the map contents. Often the map is just a snapshot in time; for instance, the map in Fig. 1 depicts the status as of 1998. In answering questions, another quality of maps will be revealed: the ability to offer an abstraction of reality. A map simplifies by selection, but at the same time it puts, when well designed, the remaining information in a clear perspective. The map of the area only needs the boundaries of municipalities, and a symbol for the number of people living in each town. In this particular case there is no need for roads, mountains, or other physical features. Maps like these are often static and come as they are, especially when they are printed on paper. On-screen maps sometimes allow for some interaction, such as getting data from the database behind the map.

For more complex questions one needs more than just a static map. For instance, to study the impact of industrial development on the growth of towns one would need to be able to generate maps depending on the nature of the question. This could mean switching layers with information on or off. It would also include the possibility to view the data in alternative visualizations. Questions are no longer the simple Where?, What?, and When?, especially if, for instance, specific demographic models are included, which can predict alternative future population trends based on census and socio-economic data. These trends could be visualized in an animation. The mapping environment requires interaction and dynamics. This is what today’s visualization is all about, as will become clear in the remainder of this chapter. However, according to the dictionary, visualization means ‘to make visible.’ From this perspective map making has always been visualization. Here the term visualization is used in the context that allows the map user to interact with the (digital on-screen) map. For a detailed discussion see Hearnshaw and Unwin (1994) and MacEachren and Taylor (1994).

2. Visualization process

Any map, static or dynamic, on screen or on paper, complex or simple, is created during what is called the cartographic visualization process. This process is considered to be the translation or conversion of geospatial data from a database into graphics. Predominantly these are map-like products, as is schematically explained in Fig. 2. This process should be seen in the context of geospatial data handling. Geospatial data handling stands for the acquisition, storage, manipulation, and visualization of geospatial data in the context of particular applications.

Figure 2
The cartographic visualization process

During the visualization process, cartographic methods and techniques are applied. These can be considered as a kind of grammar that allows for the optimal design, production and use of maps, depending on the application. The process is guided by the phrase ‘How do I say what to whom, and is it effective?’ The phrase holds four key words: ‘How’ which refers to cartographic methods and techniques (in the case of Fig. 1 a proportional point symbol map has been


chosen); ‘What’ which refers to the geospatial data (Fig. 1 deals with quantitative population data); ‘Whom’ which refers to the map audience and the purpose of the map (the map in Fig. 1 is rather basic and could function in a newspaper or school atlas); ‘Effective’ reflects the usefulness of the map (do the map readers understand the message the map intends to convey, namely an overview of the area’s population distribution?).

The producer of maps could be a professional cartographer, but could also be an expert who is mapping, for instance, vegetation stands using remote sensing images or health statistics in the slums of a city. With today’s availability of web mapping tools the mapmaker could be anyone, with or without any notion of cartographic design.

The visualization process can vary greatly depending on where the visualization takes place and the purpose for which it is needed. Visualizations can be, and are, created during any phase of the geospatial data handling process. They can be simple or complex, while the production time can be short or long. Some examples are the creation of a full, traditional topographic map sheet, a newspaper map, a sketch map showing a route, a map from an electronic atlas, an animation showing the growth of a city, a three-dimensional view of a building or a mountain, or even a real-time display of traffic conditions via the World Wide Web. Other examples include ‘quick and dirty’ views of part of the database, or the map used during the updating process or during a geospatial analysis. The environment in which the visualization process is executed can vary considerably. It can be done on paper, on a stand-alone personal computer, or on a computer linked to the World Wide Web.

Many tools are available to visualize the data. These tools consist of functions, rules, and habits. Algorithms to classify the data or smooth a coastline are examples of functions. Rules tell us, for instance, to use proportional symbols to display quantities or to position an artificial light source in the northwest to create shaded relief maps. Habits, or traditions as some would call them, tell us to color the sea in blue, lowlands in green, and mountains in brown (Robinson et al. 1995, Kraak and Ormeling 1996).

3. Visualization and cartography

In the past, even when dealing with incomplete and uncertain data, the visualization process nearly always resulted in an authoritative map. The maps created by a cartographer were good enough for the user. This shows that cartography, for a long time, has been very much driven by supply rather than demand. To some extent, this is still the case. However, nowadays, it is also accepted that just making a map is not the only purpose of cartography. Especially since 1980, many people have become involved in making maps. The widespread use of Geographical Information Systems (GIS) has significantly increased the number of maps being created (Longley et al. 1999). Even the spreadsheets used by most office workers today have mapping capabilities, although most people are probably not aware of this. The opportunities offered by the World Wide Web will again lead to an incredible increase in maps produced. Some websites, such as MapQuest, produce over a million maps a day! Many of these maps are not produced as final products, but rather as intermediate products to support the user in his or her work dealing with geospatial data. The map, as such, has started to play a completely new role: it is not just a communication tool but also a tool to aid the user’s (visual) thinking process.

This process is being accelerated by the opportunities offered by hardware and software developments. These have changed the scientific and societal needs for geo-referenced data and, as such, for maps. New media such as CD-ROMs and the WWW not only allow for dynamic presentation but also for user interaction. Users expect immediate, real-time access to the data, and geospatial data have become abundant. This abundance of data, welcomed in some sectors, is a major problem in other sectors. One lacks the tools for user-friendly queries and retrieval when studying the massive amount of data produced by sensors and now available via the WWW.

These developments have given the word visualization enhanced meaning, since progress in other disciplines has linked the word to more specific ways in which modern computer technology can facilitate the process of ‘making visible’ in real time. Specific software toolboxes have been developed, whose functionality is based on two key words: interaction and dynamics. A separate discipline called scientific visualization has developed around it (McCormick et al. 1987), which is having a major impact on cartography as well. If applied in cartography it offers the user the possibility of instantaneously changing the appearance of the map. Interacting with the map will stimulate the user’s thinking and will add a new function to the map. As well as communication, it will prompt thinking and decision-making.

Developments in scientific visualization have stimulated a model for map-based scientific visualization (DiBiase 1990). As such, it is also known as geographical visualization (MacEachren 1995). It covers both the communication and thinking functions of the map. Communication is described as ‘public visual communication’ since it concerns maps aimed at a wide audience. Thinking is defined as ‘private visual thinking’ because it is often an individual playing with the geospatial data to determine its significance (see Fig. 3). On a lower level, different visualization stages can be recognized: each requires a different strategy from the perspective of map use, based on audience, data relations, and the need for interaction. These


stages are exploration, analysis, synthesis, and presentation.

Figure 3
The visualization process: visual thinking and visual communication

From Fig. 3 it is obvious that presentation fits into the traditional realm of cartography, where the cartographer works on known geospatial data and creates communicative maps. These maps are often created for multiple uses. However, exploration often involves a discipline expert creating maps while dealing with unknown data. These maps are generally for a single purpose and are related to the expert’s attempt to solve a problem. While dealing with the data, the expert should be able to rely on cartographic expertise provided by the software or some other means.

This process describes the ‘democratization of cartography’ (Morrison 1997). Morrison explains it as ‘using electronic technology, no longer does the map user depend on what the cartographer decides to put on a map. Today the user is the cartographer.’ And ‘users are now able to produce analyses and visualizations at will to any accuracy standard that satisfies them.’

4. Visualization and exploration

What is exploratory cartography? The environment has just been described: a person is trying to solve a particular geo-problem and is exploring various geospatial databases. Exploration also means working with unknown data. However, what is unknown for one is not necessarily unknown to others. For instance, browsing in Microsoft’s Encarta World Atlas is an exploration for most of us because of its wealth of information. With products like these, such exploration takes place within boundaries set by the producers. Cartographic knowledge is incorporated in program wizards resulting in pre-designed maps. Some users feel this to be a constraint, but those same users will no longer feel constrained as soon as they follow the web links attached to this electronic atlas. This example shows that the environment, the data, and the type of users influence one’s view of what exploration entails.

Returning to the sentence driving the visualization process, ‘How do I say what to whom and is it effective’, reveals some similarities, as well as differences, between the two modes of visualization: presentation and exploration. ‘How’ still represents the cartographic methods and techniques. However, new technology is emerging and this offers challenges and opportunities, such as animation, the application of the third dimension and virtual reality, multimedia, etc. ‘I’ is no longer just the cartographer, but an expert geoscientist. In the very near future, it can probably be just anyone having access to the WWW. ‘What’ no longer represents a relatively well-defined and known data set; at least, certainly not from the user perspective. ‘To whom’ seems to be simpler than before; it is not a relatively well-defined user group, but the same person represented by ‘I,’ the expert geoscientist in the role of cartographer. ‘Effective’ raises some interesting questions. When a map is used, the information to be transferred is known and, for all problems involved, can somehow be measured. But how can the visual thinking process be measured? If it is considered positively, is it because of the efficient graphics or because of the geoscientist’s clever thinking? These questions become more complex if we realize that we do not even know the initial aim of the visualization in these circumstances.

The most prominent change is the shift from supply-driven cartography to a demand-driven approach. Although having many people making maps without any cartographic knowledge might seem wrong, those non-cartographers will also introduce fresh views, as described, for instance, by Keller and Keller (1993). They distinguish three steps in the visualization process: the first identifies the visualization goal; the second removes mental roadblocks; and the third designs the display in detail. In the second step, Keller and Keller suggest taking some distance from the discipline in order to reduce the effects of traditional constraints. Why not choose an alternative mapping method? For instance, show a video of the landscape next to a topographic map. New, fresh, creative graphics could be the result; they might also offer different insights and would probably have more impact than traditional mapping methods. During the third step, which is especially applicable in an exploratory environment, one has to decide between mapping data and visualizing phenomena.

An exploratory visualization environment offers the tools to act in the ways previously suggested. Such an environment should allow the user to look at geospatial and other geo-referenced data in any combination, at any scale, with the aim of seeing or finding geospatial patterns (which may be hidden). Geospatial patterns can be defined as variations in location, attributes or time, or a combination of any of the three geospatial components within an area of interest. One


5.2 Navigation and Orientation


This involves the keys to the map. At any time, the user
should be able to know where the view is located and
what the symbols mean. To illustrate this function Fig.
5(b) shows a map with its marginal information as well
as the coordinates at the cursor location in the map.

5.3 Query Data


During any phase of the visualization process, the user
should have access to the geospatial database to query
the data. The questions should not necessarily be
limited to simple What? Where? or When? As Fig. 5(c)
shows clicking a geographic object in the map reveals
the information available in the database, as well as
the hyperlinks that are attached to the object.

Figure 4
Brushing 5.4 Multi-Scale
Combining different data sets is a common operation
in an exploratory environment. The chance that these
of the first concepts of visual geospatial data ex- sets will have the same data density or the same level of
ploration was introduced by Monmonier (1989) when abstraction is unlikely. Generalization operators to
he described the term brushing (see Fig. 4). This is solve these multi-scale problems remain necessary, if
when the selection of an object in a map automatically not just to make sure zoom-in and zoom-out operators
highlights the corresponding elements in the other result in sensible maps. In Fig. 5(d) zooming-in results
graphics. Depending on the view in which one selects in a more detailed map.
the objects, there is geographical brushing (clicking in
the map), attribute brushing (clicking in the diagram
or table), and temporal brushing (clicking on the time
line). As such, the user gets an overview of the relation 5.5 Re-expression
among geographic objects based on location, charac-
teristics, and time. To stimulate visual thinking, an unorthodox approach
to visualization was recommended. This requires
options to manipulate data behind the map or offer
different mapping methods for displaying the data. An
5. Visualization Functions example of data manipulation is the application of
several classification systems, while the use of different
What are the basic requirements of an exploratory advanced map types to view the same data represents
visualization environment? The necessary functions display differences. Fig. 5(e) illustrates how the popu-
described below all need high interactivity options. lation data shown in the figures main map can be
displayed alternatively. The upper inset shows a prism
map (a 3D representation of the data in which the
height of the area corresponds to the attribute value;
5.1 Basic Display the big towns have high columns) and a cartogram (a
map in which the area of a geographic unit is equal to
Map displays need tools to allow the user to pan, the attribute value and not, as usually, to its geo-
zoom, scale, transform, and rotate the image contents. graphic area; the highly populated towns take a large
These geometric tools should be available and in- maps space).
dependent of the dimensionality of the displayed
geospatial data. Fig. 5(a) illustrates panning and
zooming facilities. Another option could be rotating
the map. This last function is of particular need when
5.6 Multiple Dynamically-linked Views
displaying 3D-maps, since it also allows objects that
otherwise might be hidden by other objects to be These tools represent a combination of multimedia
seen. and the brushing technique already mentioned. The


Figure 5
Exploratory cartographic functions: (a) Basic display; (b) Navigation and orientation; (c) Query; (d) Multi-scale; (e) Re-expression

user will be able to view and interact with the data in different windows, all representing related aspects of the data. These views do not necessarily contain maps; video, sound, text, etc. can all be included. Clicking an object in a particular view will show its geospatial relations to other objects or representations in all the other views (see Fig. 6).

5.7 Animation

Maps often represent complex processes that can be well expressed by animation. Animation can be used for temporal as well as non-temporal changes in geospatial data. Aspects to be solved are related to the interface (user interaction, navigation) and the legend. Since the exploratory user will also be the creator of the animation, he or she should be able to influence the flow of the animation.

6. Visualization Environments

Most cartographic components of geographical information systems can only handle one or two of the functions described in the previous section. Scientific visualization software might have the most functions available, but it lacks the typical geo-referencing options needed to deal with geospatial data. However, operational experimental exploratory visualization environments do exist. Examples are the web-based Descartes (Andrienko and Andrienko 1999) and Cartographic Data Visualizer (CDV) (Dykes 1997, Dykes 1998). As an example CDV will be discussed, since it is available free as a result of academic research (http://www.geog.le.ac.uk/jad7/cdv/). Both packages primarily aim at the visualization of socio-economic census data. Fig. 6 shows a typical view of a CDV session. The program requires a base map (administrative boundaries) and statistical data. From the menu on the left the user can select active variables and visualizations. Options are choropleth maps (or polygon maps), proportional circle maps, cartograms, and non-cartographic visualizations such as dotplots and scatter diagrams. When these different views on the data are created, they are all linked together. Clicking an area in a map or a symbol in a diagram will highlight the corresponding area/symbol in the other views. The user has several options available to choose the layout of each of the views. One can also execute basic calculations to create new variables or classify the data according to different methods. CDV is by the nature


Figure 6
A typical view of a CDV session

of its functionality, capable of showing geospatial patterns that would not be apparent when single maps or diagrams are viewed.

7. Conclusions

Packages with functionality similar to CDV offer an advanced cartographic visualization environment. However, remembering ‘How do I say what to whom, and is it effective’ leaves open the question of whether it indeed works. From a technical point of view it certainly does, but can the user handle all those linked views? It is argued that if users can change their perspective on the data through selection and transformation, as well as alternative visualizations, meaningful relations among data variables are more likely to be revealed. For some products it is easy to determine usability; for others it is quite a job. To understand how maps work is not easy (Wood 1992), especially as we change into a demand-driven mapping environment. In the supply-driven environment, it was known what the map should tell and who the customers were. Today cartographers deal partly with providing facilitating tools to visualize geospatial data. How the tools are used remains, for the moment at least, unknown. These trends will only increase as more and more maps are produced via the World Wide Web. One has to find out why someone wants to make a map, in order to be able to judge the future tools that need to be developed.

See also: Cartography; Cartography, History of; Cognitive Maps; Dynamic Mapping in Geography; Ethnic Conflict, Geography of; Geographic Information Systems; Planetary Cartography; Tactile Maps in Geography; Thematic Maps in Geography


Bibliography

Andrienko G L, Andrienko N V 1999 Interactive maps for visual data exploration. International Journal for Geographic Information Sciences 13: 355–74
DiBiase D 1990 Visualization in earth sciences. Earth & Mineral Sciences, Bulletin of the College of Earth and Mineral Sciences 59: 13–18
Dykes J 1997 Exploring spatial data representation with dynamic graphics. Computers and Geosciences 23: 345–70
Dykes J A 1998 Cartographic visualization: exploratory spatial data analysis with local indicators of spatial association using Tcl/Tk and cdv. The Statistician 17: 485–97
Hearnshaw H M, Unwin D J (eds.) 1994 Visualization in Geographical Information System. Wiley, Chichester, UK
Keller P R, Keller M M 1993 Visual Cues, Practical Data Visualization. IEEE Press, Piscataway, NJ
Kraak M-J, Ormeling F J 1996 Cartography, the Visualization of Spatial Data. Addison Wesley Longman, London
Longley P, Goodchild M, Maguire D M, Rhind D (eds.) 1999 Geographical Information Systems: Principles, Techniques, Applications and Management. Wiley, New York
MacEachren A M 1995 How Maps Work: Representation, Visualization and Design. Guilford Press, New York
MacEachren A M, Taylor D R F (eds.) 1994 Visualization in Modern Cartography. Pergamon Press, London
McCormick B H, DeFanti T A, Brown M D 1987 Visualization in Scientific Computing. IEEE Computer Graphics 7: 69
Monmonier M 1989 Geographic brushing: enhancing exploratory analysis of the scatterplot matrix. Geographical Analysis 21: 81–4
Morrison J L 1997 Topographic mapping for the twenty first century. In: Rhind D (ed.) Framework of the World. Geoinformation International, Cambridge, UK, pp. 14–27
Robinson A H, Morrison J L, Muehrcke P C, Kimerling A J, Guptill S C 1995 Elements of Cartography. Wiley, New York
Wood D, Fels J 1992 The Power of Maps. Routledge, London

M. J. Kraak

Cartography

The International Cartographic Association defines cartography as ‘the discipline dealing with the conception, production, dissemination and study of maps.’ It goes on to say, ‘A map is a symbolized image of geographic reality, representing selected features or characteristics, resulting from the creative efforts of cartographers, and is designed for use when spatial relationships are of special relevance’ (International Cartographic Association 1995). Cartography, along with disciplines such as geodesy, surveying, aerial photogrammetry, and satellite remote sensing, is a component of the mapping sciences. It is closely allied with geographic information systems and geographic information science.

Cartography is most often found in geography departments in colleges and universities in the USA and Canada. It is commonly an independent department in Europe and other parts of the world. Commercial enterprises, nonprofit entities, and government agencies that produce maps generally have large cartography departments that are central to their operations.

The term ‘research’ within the field of cartography has at least two distinct meanings. One is the systematic gathering of information from a variety of sources for compilation into a coherent map. The other is the quest for discovery of knowledge about maps and the processes associated with them. The former definition is closely associated with commercial and nonprofit map production and the latter with academic cartography. The subject-matter of academic cartographic research ranges from historical studies of the cultural, technical, and political context of maps to the changing processes of production to the ways in which maps are used in wayfinding and in the production of knowledge. The relationship between people and maps, including symbol perception, the development of map-reading abilities, and cognitive processes in map use, has been of special interest in cartography in the late twentieth century, as have the processes by which maps are produced by computer.

Although maps are most closely associated with geography, practitioners in many disciplines and professions use the map as a device for recording and preserving information, as a tool in research, and as a pedagogical illustration. As computers are used more for the process of mapping, it is no longer as strictly the domain of professional cartographers as it once was. Virtually anyone with a computer and appropriate software can produce a map. Whether this democratizing of mapping is a positive or negative development is not altogether clear. The products of nonspecialists range from misleading to highly insightful and creative. Maps are now far more widespread and are available in a more timely way to fit immediate needs than in the past, and they are becoming less constrained by convention.

Cartography and its products have a profound effect on human thinking and behavior. The map is a metaphor for the real world and is often the model that shapes it. Existing hills and valleys, rivers, and political boundaries are recorded on maps, but decisions about surface excavation, river redirection, or political redefinitions are marked on maps and become reality. The terms ‘map’ and ‘mapping’ have become common metaphors in everyday language; ‘on the map’ signifies importance, and ‘mapping a strategy’ implies attentiveness and purpose in the planning of actions.

Cartography has responded to changing social, intellectual, and technological conditions throughout history, and the permeation of computers into all aspects of life in the late twentieth century has resulted in profound changes. Cartography has transformed from a data-poor to a data-rich field, as information


can now be gathered using aerial photography, global


positioning systems, and satellite imaging. Inexpensive
computer storage and the Internet have made it
possible to share and receive massive amounts of data.
Professional cartographers are now heavily involved
in information management and in creating mech-
anisms for shared use of data. Scientific visualization
in cartography, which implies the use of maps for
discovery of knowledge, has gained considerable
attention in recent years as a result of the ubiquity of
computer usage in spatial representation.

1. Major Cartographic Concepts


Mapping Earth’s surface involves the concepts of
scale, projection, spatial relationships, generalization,
symbolization and data modeling, and categories of
maps. Some level of understanding of each of these
concepts is inherent in every instance of the em-
ployment of a map.

Figure 2
Five map projections showing a variety of shapes and sizes of land masses when different projection systems are applied: (a) orthographic, the projection that looks like a view of a globe, (b) gnomonic, a very distorted projection but the only one on which all great circles are straight, (c) cylindrical equal area, which preserves correct sizes of all areas but has extreme changes in linear scale, (d) Albers, which is equal area and is commonly used for the USA, and (e) plane chart, which represents all degrees of latitude and longitude at the same length regardless of their length on Earth’s surface

1.1 Scale

The map is a scaled model of Earth’s surface (or of the surface of some other planetary body), and the reduced size is one of its most useful features. Earth itself is too large for ready comprehension.

There are three types of scale expression commonly used on conventional maps: verbal, representative fraction (RF), and graphic (Fig. 1) (Robinson et al. 1995). Although each can refer either to distance (linear scale) or area (areal scale), linear scale is by far the more commonly used. Areal scale is generally indicated only if area is consistently scaled over the entire map. Since distance cannot be consistently scaled, no such constraint is associated with the appearance of a linear scale on a map.

Explicit scale expression is not the only indication of scale on maps (Eastman 1981, Brewer 1990). Map content and design give scale clues as well. Double-line streets, for example, immediately convey that the map shows a small area; a shaded relief background and sparse linework indicate that the map covers a large area. Scale clues are becoming increasingly important, as the user of an animated map cannot always use an explicit expression of scale while observing the map.
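The relation between a representative fraction and its verbal and areal counterparts comes down to simple arithmetic. The following Python sketch is only an illustration: the function names are hypothetical, metric units are assumed, and the scale is assumed to be uniform over the distance measured, which holds approximately on maps of small areas.

```python
# Hypothetical helpers for illustration; the names are not taken from the article.

def ground_distance_m(map_distance_cm: float, rf_denominator: int) -> float:
    """Ground distance in metres for a map distance in centimetres at scale 1:rf_denominator."""
    return map_distance_cm * rf_denominator / 100.0  # 100 cm per metre


def areal_scale_denominator(rf_denominator: int) -> int:
    """Where scale is uniform, a linear scale of 1:n corresponds to an areal scale of 1:n**2."""
    return rf_denominator ** 2


def verbal_scale(rf_denominator: int) -> str:
    """Verbal statement of the same scale in metric units."""
    metres = rf_denominator / 100.0
    return f"one centimetre represents {metres:g} metres"


if __name__ == "__main__":
    print(ground_distance_m(4.0, 50_000))      # 4 cm measured on a 1:50,000 map
    print(areal_scale_denominator(50_000))
    print(verbal_scale(50_000))
```

At 1:50,000 a measured 4 cm corresponds to 2 km on the ground, while the areal scale denominator is the square of the linear one.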
Figure 1
The three types of scale expression, illustrated in both linear and areal form

1.2 Projection

The term projection refers to the system by which points on a spherical or spheroidal surface are assigned to points on a flat surface. Every point on the sphere or spheroid has a corresponding point on the flat map. Perhaps the most fundamental observation about map projections is that linear scale cannot be the same everywhere; in other words, map projections are geometrically distorted representations, in contrast to globes, which maintain a shape very close to that of the planet. There are many map projections in use and the five represented in Fig. 2 illustrate a variety of shapes and relative sizes of landmasses that can result when different systems of projection are applied.
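The observation that linear scale cannot be the same everywhere can be checked numerically for two of the projections named in the caption of Fig. 2. The sketch below is a minimal illustration under stated assumptions: a spherical Earth of unit radius, the Equator as the line of true scale, and scale factors estimated by finite differences. The function names are hypothetical.

```python
import math


def plane_chart(lat_deg, lon_deg, radius=1.0):
    """Plane chart: degrees of latitude and longitude are drawn at the same length."""
    return radius * math.radians(lon_deg), radius * math.radians(lat_deg)


def cylindrical_equal_area(lat_deg, lon_deg, radius=1.0):
    """Cylindrical equal-area projection with the Equator as the line of true scale."""
    return radius * math.radians(lon_deg), radius * math.sin(math.radians(lat_deg))


def scale_factors(project, lat_deg, lon_deg=0.0, step_deg=1e-4):
    """Estimate linear scale along the meridian (h) and along the parallel (k).

    Each factor is a small map distance divided by the matching distance on a unit sphere.
    """
    x0, y0 = project(lat_deg, lon_deg)
    x1, y1 = project(lat_deg + step_deg, lon_deg)   # small step north
    x2, y2 = project(lat_deg, lon_deg + step_deg)   # small step east
    step = math.radians(step_deg)
    h = math.hypot(x1 - x0, y1 - y0) / step
    k = math.hypot(x2 - x0, y2 - y0) / (step * math.cos(math.radians(lat_deg)))
    return h, k


if __name__ == "__main__":
    for name, proj in [("plane chart", plane_chart),
                       ("cylindrical equal area", cylindrical_equal_area)]:
        for lat in (0, 30, 60):
            h, k = scale_factors(proj, lat)
            print(f"{name:>22} lat={lat:2d}  h={h:.3f}  k={k:.3f}  area factor={h * k:.3f}")
```

On the plane chart the east-west scale grows with latitude while the north-south scale stays fixed, so areas are exaggerated. On the cylindrical equal-area projection the two factors change in opposite senses and their product stays close to 1, which is what preserves area at the cost of large changes in linear scale.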


Although it is necessary to tolerate distortion when projecting Earth onto a flat surface, the distortion in map projections is systematic and in some cases highly useful. Every rectangular projection of the world, for example, has one or two straight lines on which there is no distortion, and distortion increases at right angles away from the line(s) of no distortion. Although large-area maps have major visible distortion, maps of small areas, say a city or small region, have so little distortion that it has traditionally been of interest only for the most exacting measurements.

The development of geographic information systems has resulted in renewed interest in map projections because many maps have been transformed into digital files and data have to be ‘unprojected’ (transformed from flat map cartesian coordinates to longitude and latitude) to be compatible with other data. Even the limited amounts by which map projections affect location within small areas will affect the compatibility of multiple data sets.

Despite the impossibility of maintaining consistent linear scale on a map projection, there are two other properties that may be maintained. When, at every point on the map, maximum linear scale distortion is compensated by opposite distortion in the opposite direction, area scale is maintained. If the linear scale change is the same in every direction at each point, angles at all points will be maintained. The terms for these two categories of projection are equal-area (or equivalent), and conformal (or orthomorphic), respectively. Many projections have neither compensating nor equal changes of linear scale in opposite directions; these projections are neither equal-area nor conformal.

Some projections have special properties that make them useful for specific purposes. The Mercator projection (see Thematic Maps in Geography), when centered on the Equator, shows all loxodromes (or rhumb lines, i.e., lines of constant geographic direction) as straight lines. This property has been important to navigators. It was a profound development because most loxodromes on Earth (and on globes) are spirals, and the Mercator projection simplified route planning. The projection suffers from extreme deformation of area, and, like many other projections, has been used inappropriately. The Mercator maps hanging on classroom walls have led many students to think of Africa and South America as relatively small land bodies relative to the greatly exaggerated high-latitude landmasses of Greenland and Russia.

There are over 100 named projections (see Snyder 1993), and cartographers try to choose one that is appropriate for the mapping purpose at hand. The discrete naming of projections, while providing a convenient reference, obscures the fact that there are an infinite number of ways to represent Earth on flat paper. Computer construction of projections may result in revised nomenclature and increasingly tailored projections (Laskowski 1997).

1.3 Spatial Relationships

Maps show spatial relationships in a readily comprehended form. Metric relationships include distance, direction (angle), and area; topological relationships include such properties as connected to, inside, and outside. The concept of spatial relationship also includes spatial (or geometric) form; an entity may be considered to have a footprint of a point, line, or area, and if quantities are associated, the entity may be thought of as a pole, ribbon, or volume. Spatial form also includes shape (e.g., Long Island vs. Martha’s Vineyard) and distribution (‘more of something here than there’). The relevant spatial relationships vary with the type of map and with how they are used. Street map users want to know the street on which a feature is located and the turns and distances to reach it. Users of a world population map want to know where population is concentrated and where there are relatively empty areas.

Like the general concept of scale, spatial relationship is often taken for granted by the practiced map user. The reason for using a map is not the spatial relationship itself, but what it means. A ‘hot-spot’ on a map showing cancer rates suggests the need for further study and possible action; the spatial relation on the map is simply the trigger. Whether a street, population, or cancer map, it is the spatial relationships that make it a useful device. Without them, there would be no need for a map.

1.4 Generalization

As a scaled representation of Earth or a portion thereof, every map is generalized. Lines have fewer details in them than are present in the feature on Earth’s surface, and only selected features are shown. When quantities are depicted, they are usually shown by category rather than by exact value. Generalization is highly useful in focusing the map on important information instead of cluttering it with overabundant detail. It also means that some important features may not be there, not because they do not exist but because they have been generalized out of the map.

Despite the obvious necessity for generalization in making representations that are literally thousands and millions of times smaller than the original territory, it is not always understood. The small street maps of major cities that appear on road maps may be assumed by users to include every street, and a surface feature that falls between the contours of an elevation map may surprise and confuse the hiker. The failure of a map due to generalization is one of the reasons that there are very many different maps of the same area at different scales and with different selections of features. The larger the scale (i.e., the larger the representation of a given area) the less generalization is necessary and the more useful it is likely to be for


finding details. The smaller the scale (the smaller the


representation of a given area), the more likely it is
that broad patterns will be visible because of the lack
of obfuscating detail.
The relationship between scale and generalization is
not an exact one. Some maps have much more detail
than others of the same physical scale. The term
generally applied to the degree of detail relative to size
of features on Earth’s surface is resolution (Tobler
1988). A map of the USA showing data by state (50
units total) has considerably coarser resolution than a
map showing data by county (3,000+ units total)
whether the latter map is larger, smaller, or the same
size as the former.
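
One kind of generalization described above, reducing the detail in a line, can be illustrated with the Douglas-Peucker line-simplification algorithm. The article does not name any particular method, so this is only one widely used option, shown here as a sketch under stated assumptions: planar coordinates, a hypothetical function name simplify, and a tolerance in the same units as the coordinates.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b in planar coordinates."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the closest point on the segment, clamped to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: keep the end points and recurse on the vertex farthest from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        return [first, last]          # every intermediate vertex is within tolerance
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right          # drop the duplicated pivot vertex


if __name__ == "__main__":
    line = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
    print(simplify(line, 0.1))   # a small wiggle is removed, the peak is kept
    print(simplify(line, 3.0))   # a coarse tolerance leaves only the end points
```

A larger tolerance plays the role of a smaller map scale: more vertices are dropped, and a feature such as the peak in the example can disappear from the line altogether.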

1.5 Symbolization and Data Modeling


The very act of representing a feature with a symbol is
a form of generalization. The symbol is not the
original, and every map is a set of symbols. Map-
makers have been highly creative in employing sym-
bols.
Geometrically, mapmakers can employ only points,
lines, and areas as symbols on flat maps, though they
may be used in such a way as to simulate the illusion of
a third dimension. The use of three-dimensional
media, such as molded plastic, allows the explicit use
of volume as a symbol. The development of virtual
reality allows the illusion not only of a third dimension
but of the intersection of surfaces. Regardless of
medium, the geometry of symbols generally reflects
the way in which humans think of the object relative to
the scale of the map. For example, a city is a large area
on a street map but a point on a world map. A river on
a local, large-scale engineering map is an area, but the
same river appears as a line on a smaller-scale regional
map. An important exception is when numerosity of
symbols represents quantity. A dot map of population, for example, uses the dot to represent a quantity of persons, not an individual feature. In effect, then, dots on the map are being used to represent conceptual volume.

Figure 3
Four ways of representing a set of elevation data; lines are used in the contour map (a) and three-dimensional diagram (d), whereas area symbols are used in the layer shaded map (b) and in the shaded relief map (c)
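
The dot map idea, in which the number of symbols rather than any single symbol carries the quantity, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function names are hypothetical, each areal unit is reduced to a bounding rectangle, and dots are scattered uniformly at random, whereas a production dot map would place dots inside the real unit boundaries and often weight them by ancillary data.

```python
import random


def dot_count(quantity, persons_per_dot):
    """Number of dots needed when each dot stands for persons_per_dot persons."""
    return round(quantity / persons_per_dot)


def scatter_dots(bbox, n, rng):
    """Place n dots uniformly at random inside a rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = bbox
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)                  # fixed seed so the layout is repeatable
    units = {                               # hypothetical areal units: (bounding box, population)
        "A": ((0, 0, 10, 5), 12_400),
        "B": ((10, 0, 20, 5), 2_600),
    }
    for name, (bbox, population) in units.items():
        dots = scatter_dots(bbox, dot_count(population, 1_000), rng)
        print(name, len(dots))
```

The individual dot positions carry no meaning in this sketch; only their density within each unit does, which is the sense in which dots represent a quantity rather than individual features.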
There are a limited number of characteristics of symbols, called ‘visual variables,’ which can be used to represent the characteristics of the feature represented (Bertin 1983). Darkness–lightness of the symbol and size are the most commonly used to represent quantity. Shape, color hue, and orientation most often represent differences in kind. Patterning in area symbols may also be used to distinguish characteristics of features.

The modeling of geographic space in digital files introduces a different form of representation from that on paper or similar physical medium (Peuquet 1991). Geometric concepts of point, line, area, and volume are still valid, but there is no scaled physical representation or visual symbols associated with features. Rather, numbers represent location, and codes or descriptive terms represent what exists at a location. These digital files can be thought of as data models as opposed to a physical model. They can almost always be converted into a map, and they are usually much more flexible. A digital elevation model, for example, i.e., a file of elevation values at points, can be converted into any of the maps in Fig. 3.

1.6 Categories of Maps

Maps can be divided into two very broad categories: general reference and thematic. A general reference


map (Fig. 4) shows a variety of individual features and is generally used to find specific places, such as cities, rivers, or regions. The general reference map is selective of features in any category; only ‘important’ cities, rivers, or regions are identified, for example. Although lettering on the map often stands out visually, there is limited visual hierarchy among the features of the map, as all the features must be distinguishable from one another. A thematic map (Fig. 5) shows a distribution of one or more phenomena and is used to visualize spatial form. The thematic map is generally based on all the known instances of the phenomenon even though none may be shown individually. The thematic map generally has clear visual hierarchy, with the symbols representing the distribution standing out the most.

Figure 4
An example of a general reference map, extracted and adapted from The World Factbook, CIA, 1999 <http://www.cia.gov/cia/publications/factbook/> (original is in color)

Figure 5
Zebra Mussels in the USA, an example of a thematic map, extracted and adapted from National Atlas of the United States, US Geological Survey, <http://www.nationalatlas.gov/> (original is in color)

A simple twofold classification is limited, and sometimes a third category is added: navigation maps. Others consider the general reference and thematic categories to be opposite ends of a scale, with individual maps fitting somewhere between. Navigation maps, such as street guides, are near the general reference end because many different specific items are represented. They have certain thematic qualities, however, because of the focus on the features relevant to finding one’s way. Inventory maps, such as a map showing the residence of every patient with a certain disease, are by contrast close to the thematic end of the scale. They share the general reference characteristic that individual items are shown.

There is good argument that the general reference–thematic classification is more fitting of use than of maps themselves. One can look at a thematic map to see how large one political unit is relative to another, which is a function generally associated with the term ‘general reference.’ Likewise, one can look at a general reference map and observe overall pattern of drainage or evidence of population, a use generally associated with the term ‘thematic’ (Robinson and Petchenik 1976).

2. The History of Maps and Cartography

The history of cartography begins earlier than recorded time and in all likelihood had its origins in gestures and ephemeral marks in the soil (Godlewska 1997). The development of civilizations and of cartographic practice were probably highly intertwined, and at least some cartographic artifacts remain from ancient Mesopotamia and Egypt, including a clay tablet from roughly 3800 BC with mountains and a river. Extant land ownership maps date to 2000 BC. The development of grid systems, associated with less subjective and more scientific approaches to mapping, was reflected in written materials in China in the third century AD, and the earlier development of paper in China provided a flexible and highly portable physical medium for drawing maps (Thrower 1996).

The concept of the world as a sphere was basic to the development of logical representation of large areas of the world. Measurements make sense only when the curvature of the planet is taken into account. The idea seems to have had its origins with the Pythagoreans and was supported in work by Plato and Aristotle. Eratosthenes (276–196 BC), who served as the librarian at Alexandria, used the concept of the spherical Earth and the relationship between noon sun angles at different locations to measure the size of the planet. Because the exact length of a stade, the unit of measure in use at the time, is not known, it is difficult to know exactly how accurate his estimate was, but it seems to have been remarkably close and a better estimate than several that followed as scholars attempted to correct the figure. A successor in his


position in the second century AD, Claudius Ptolemy, compiled the seminal tome entitled Geographia that contained a guide to making maps, including instructions for several map projections and the latitude-longitude location of about 8,000 places.

Mapmaking in China and South Asia is characterized by remarkable developments in roughly the same time period as the accomplishments of the ancient Greeks. There is reference to a map on silk in the third century BC, and maps were used by the military and administrators in the Han Dynasty. In later centuries, the first printed map appeared in China well before the development of printing in the West (approximately 1153 vs. 1472).

European cartography during the Middle Ages was characterized by maps that are highly schematic by modern standards but are important cultural documents of their time, having been associated with religious institutions and beliefs. One schema was a circular map with east at the top showing Asia as the upper section of the circle, separated from Europe in the lower left and Africa in the lower right by the Don River and the Nile. Europe and Africa were separated by the Mediterranean Sea. These maps are known as T-O maps, the rivers and Mediterranean forming a rough T shape and the oceans surrounding the landmasses forming the O. The second schema was a circle with north at the top and, within the circle, temperature zones that included the torrid one along the equator, bordered by the temperate zones, which in turn were bordered by frigid zones at the poles. Route maps of the time period carried far more detail than these schematic world maps, but the most remarkable detail and the most scientific approach was found in the portolan charts. These maps were used in navigation, primarily in the Mediterranean Sea, and showed intricate coastlines with hundreds of identified coastal locations. At several locations within the water are compass roses with direction lines extending from them.

Cartography in the Islamic world during the European Middle Ages, while varied, included continuation of the Greek scientific tradition. The scientific instrument called the astrolabe, a device for measuring angles to stars, was important in finding the way to Mecca. In the early 1400s, Ptolemy’s Geographia, which had been preserved in Byzantium, reached Italy via refugees from the invading Turks. Translated into Latin, the work had an extraordinary influence on subsequent development of cartography. Systematic projection of the spherical earth to the flat map followed Ptolemaic instructions for well over a century (Thrower 1996).

Accurate mapping depends on the ability to measure both latitude and longitude accurately. Latitude, which could be measured by angles to the sun and other heavenly bodies at specific times of the day, was far easier to measure than longitude, which depends on differences in time from one location to another. In a competition sponsored by the British government, John Harrison in the latter half of the eighteenth century produced a chronometer that could retain accurate time at sea (Sobel 1995). It was a turning point in the accuracy of maps covering large regions of the world.

The rise of detailed scientific mapping of Earth’s surface in the form of topographic mapping owes much to the Cassini family, whose influence on cartography began with the appointment of astronomer Giovanni Domenico Cassini to the Académie Royale in Paris, which along with the Royal Society of London was concerned with mapping and other scientific problems. Under his direction, maps recorded only locations that were determined by astronomical methods, and detailed measurement of a degree of latitude was undertaken at various locations to test the theory of the earth as a prolate spheroid, which affects the placement of locations on flat maps. The detailed mapping of France was continued under three more generations of the Cassinis (Jacques, César-François, and Jean) and was completed in 1793.

In the eighteenth and early nineteenth centuries, one of the most important developments in cartography was thematic mapping (Robinson 1982). With the practice of accurate base maps established, it was feasible to focus attention on the spatial arrangement of various phenomena of interest such as elevation, population, and winds. Symbol systems such as isolines, dots, and the choropleth method came into being and are commonplace in modern mapping.

Changes in cartography in the twentieth century included the use of aerial photographs in topographic mapping and changes in production technologies. The latter half of the century saw the development of satellite sensing for gathering data for maps, the deployment of global positioning systems (constellations of satellites sending signals from which latitude and longitude are determined), and the infusion of computers into mapping including the development of geographic information systems and Internet mapping. The variety of thematic mapping methods has recently undergone rapid change as attention has focused on use of the computer to overlay multiple phenomena, produce animated maps, and develop sophisticated visualizations of spatial phenomena (Thrower 1996).

The richness of cartographic history throughout the world and over many centuries is reflected in The History of Cartography, a monumental multi-volume series underway as this Encyclopedia is being written (Harley and Woodward 1987). The project itself is an important intellectual event in cartographic history.

See also: Cartographic Visualization; Cartography, History of; Cognitive Maps; Dynamic Mapping in Geography; Ethnic Conflict, Geography of; Geographic Information Systems; Planetary Cartography;


Tactile Maps in Geography; Thematic Maps in Geography

Bibliography

Bertin J 1983 The Semiology of Graphics [English translation by Berg W J]. University of Wisconsin Press, Madison, WI
Brewer C A 1990 The effect of color on the perception of map scale. In: Everett W (ed.) Student Honors Competition Winning Papers, Association of American Geographers, Cartography Specialty Group, Toronto, Canada
Eastman J R 1981 The perception of scale change in small-scale map series. The American Cartographer 8(1): 5–21
Godlewska A 1997 The idea of the map. In: Hanson S (ed.) Ten Geographic Ideas that Changed the World. Rutgers University Press, New Brunswick, NJ
Harley J B, Woodward D (eds.) 1987 The History of Cartography. University of Chicago Press, Chicago, IL
International Cartographic Association 1995 The Definition of Cartography, a statement adopted by the 10th General Assembly, Barcelona, Spain, September 3, http://www.icaci.org
Laskowski P 1997 The Distortion Spectrum, monograph 50. Cartographica 34(3): 67–95
Peuquet D 1991 Methods for structuring digital cartographic data in a personal computer environment. In: Taylor D R F (ed.) Geographic Information Systems: The Microcomputer and Modern Cartography. Pergamon Press, Oxford, UK
Robinson A H 1982 Early Thematic Mapping in the History of Cartography. University of Chicago Press, Chicago, IL
Robinson A H, Petchenik B B 1976 The Nature of Maps: Essays toward Understanding Maps and Mapping. University of Chicago Press, Chicago
Robinson A H, Morrison J L, Muehrcke P C, Kimerling A J, Guptill S C 1995 Elements of Cartography, 6th edn. Wiley, New York
Snyder J P 1993 Flattening the Earth: Two Thousand Years of Map Projections. University of Chicago Press, Chicago, IL
Sobel D 1995 Longitude: The True Story of a Lone Genius Who Solved the Greatest Scientific Problem of His Time. Walker, New York
Thrower N J W 1996 Maps and Civilization: Cartography in Culture and Society. University of Chicago Press, Chicago, IL
Tobler W 1988 Resolution, resampling, and all that. In: Mounsey H, Tomlinson R (eds.) Building Databases for Global Science. Taylor & Francis, London, pp. 129–37

J. M. Olson

Cartography, History of

Recent expansion in both the usages and definitions of the noun ‘map’ and the verb ‘to map’ has had consequences for the history of cartography as a field. Its focus is no longer biased almost exclusively towards flat, formally-constructed, professionally-produced, artifactual maps of the earth’s terrestrial and marine surfaces, made in the European tradition. The increasingly adopted operational definition of ‘maps’ as ‘graphic representations that facilitate a spatial understanding of things, concepts, conditions, processes, or events in the human world’ (Harley and Woodward 1987) has increased the range of artifacts to include celestial maps and cosmographical maps of supposed worlds. It also embraces performance cartographies: spatial representation incorporated in gesture, procession, ritual, theatre, and dance. The linked constraints of paper and flatness have been abandoned to include many and mixed media and three-dimensional expressions of these, such as buildings, whole settlements, and massive earth sculptures.

1. The Field

The history of cartography as conceived in the 1940s in the first modern review of the field has been transformed (Bagrow 1951). Approximately 90 percent of that text focused on maps made in the Classical and European worlds. Although written primarily for map collectors, and not for scientists and scholars, the chronological sequencing of chapters and narrow cultural range gave the unintended impression of an evolutionary development. The field is no longer Eurocentric but encompasses maps and mapmaking in all cultures. An important consequence of this is that the historical development of mapmaking is no longer seen as unilinear. Establishing ultimate origins in prehistory may never be possible but they were certainly multiple, probably numerous, and almost certainly geographically widespread. Another consequence of the shift to a multicultural perspective is that mapmaking is no longer seen to be exclusively a specialist or even semispecialist activity. Though not as ubiquitous as speech, it has long been a widespread vernacular skill. Transformation of the field has also reduced the formerly excessive emphasis given to printed maps based either on surveys or systematic compilation and plotted on mathematically-generated graticules. Categories of what were once considered ‘less accurate’ derived maps are now accorded appropriate attention. These include early manuscript maps, many thematic maps, and most media maps including the cartographoons of political cartoonists.

The longer timescale, adoption of multicultural perspective, and recognition given to additional categories of maps have together increased actual and potential links between the history of cartography and other fields. Those with the histories of art, science, and technology are fairly obvious. Even so, these have not been developed as much as might have been expected. Others are beginning to be forged. Cultural anthropology is likely to be significant in relation to vernacular cartography (see, for example, Nabokov in Lewis 1998). Likewise, religious studies should contribute to the understanding of cosmographical maps and archaeoastronomy to resolving claims that some rock art incorporates celestial maps. Archaeologists of later periods are beginning to make use of


maps made in or derived from the cultures being investigated (see, for example, Waselkov in Lewis 1998). The historical role of maps in relation to other modes for communicating information spatially is at last beginning to be investigated (see, for example, Fletcher 1995 and Pearce in Lewis 1998). Links with the cognitive sciences (including cognitive archaeology) will be particularly important in furthering the understanding of map use and in helping to reconstruct the early prehistory of cartography. The ultimate and most elusive question is likely to be which emerged first as the language of space among Homo sapiens: those currently universal features of speech referred to as spatial deixis, schematic graphics (including artifactual maps), or nongraphics (including performance cartographies) (Twyman 1982). In trying to answer this question, links with comparative and evolutionary linguistics will be vital. Whether actual or potential, all these links will be mutually beneficial.

The substance, scope, methods, and intentions of the transformed field are being revealed by an ongoing publishing venture: the multivolume The History of Cartography (Harley and Woodward 1987, 1992, 1994, Woodward and Lewis 1998; with further volumes scheduled for the European Renaissance, European Enlightenment, Nineteenth Century, and Twentieth Century). Whereas, for example, Bagrow (1951) covered ‘Maps of primitive peoples’ in four pages, The History of Cartography devotes a whole volume of more than 650 pages to maps in ‘traditional societies’ (Woodward and Lewis 1998). Likewise, Asian, including Islamic, cartography, to which Bagrow devoted 21 pages, is the subject of two volumes, together containing more than 1500 pages (Harley and Woodward 1992, 1994). Whereas the index to Bagrow is approximately 95 percent personal names, and virtually devoid of topics, the volume indexes of The History of Cartography reveal a surprising diversity of topical content including, for example, ‘accuracy,’ ‘alphabets,’ ‘ancestors,’ ‘animal language,’ ‘animism,’ ‘anthropology,’ and ‘architecture,’ as well as the much less surprising ‘altitude,’ ‘arcs,’ ‘astrolabes,’ ‘axes,’ ‘atlases,’ and ‘azimuths.’

The defining criteria of the history of cartography as announced by the editors of The History of Cartography in the mid-1980s (Harley and Woodward 1987) are now agreed to by a majority of those active in the field:
(a) ‘acceptance of a catholic definition of “map”’;
(b) ‘commitment to a discussion of the manifold processes that have contributed to the form and content of individual maps’;
(c) ‘recognition that the primary function of cartography is ultimately related to the historically unique mental ability of map-using peoples to store, articulate, and communicate concepts and facts that have a spatial dimension’;
(d) ‘belief that, since cartography is nothing if not a perspective on the world, a general history of cartography ought to lay the foundations, at the very least, for a world view of its own growth.’
To these, many working in the field would now add:
(e) ‘a commitment to explore the roles of maps within societies, involving reconstructing their meaning.’

These defining criteria differentiate the ‘history of cartography’ from ‘historical cartography,’ the practice of compiling maps in the present from historical data. Latterly, however, ‘map history’ has emerged as an alternative term for ‘history of cartography.’ This is regrettable. The neologism ‘cartographe’ was introduced in 1839 for the study of early maps. The word was soon applied to the map field in general and was to appear in many European languages in the second half of the nineteenth century. It is now associated worldwide with institutions, organizations, and activities involved with the making, use, and preservation of plans, charts, and globes as well as maps in the strict sense of representations of the earth’s land surface at medium to small scales.

2. Historiography

Notwithstanding the long history and near universality of maps, mapmaking, and map use, the history of cartography is a youthful field. Although there was a long but sporadic tradition of antiquaries, collectors, and some mapmakers chronicling the subject, the field only began to emerge after 1850 with the institutionalization of geography as an academic field, especially in Germany. Even then it lacked identity, being treated as part of the wider history of geographical discovery and exploration. Its emergence as a modern subject dates from 1935 with the founding of Imago Mundi, still the only international journal devoted exclusively to the field, as distinct from map collecting. Since then, the ‘emergence of cartography as an independent and practical discipline’ has provided ‘new theoretical frameworks as well as a reinforced raison d’être for the study of cartographic history’ (Harley and Woodward 1987).

Many of the material resources on which the emergence of the new history of cartography was based were already in place, including archives, map libraries, and private collections, but they were to become more accessible (Skelton 1972). Epistemological and methodological developments were consequences of the wider ferment of ideas and institutional changes that followed World War II. Increasing prosperity and better and more specialist librarians and bibliographers were in part responsible for the appearance of major printed catalogues of collections and regional cartobibliographies, albeit heavily biased towards Europe and North America. New printing technologies made possible the reprinting of earlier obscure but significant publications. Among these


Acta Cartographica, Vols. 1–27 (Theatrum Orbis Terrarum, Amsterdam, 1967–81) was particularly important. Most important among these publishing initiatives was the quality reproduction, often in color, of thousands of maps hitherto only accessible to a few.

The field became somewhat less dominated by geographers and historians and began to attract some of the new generation of cartographers, some of whom had trained in the military, and were interested in modern maps and in the contexts and consequences of their use. In 1972 The International Cartographic Association created a Standing Commission on the History of Cartography from an existing working group that was already preparing a handbook of cartographic terms in use before 1900 (Wallis and Robinson 1987). An international conference on the History of Cartography now meets biennially, usually preceded by a meeting of the International Society for the Curators of Early Maps. Map collectors are an important interest group, providing the field with financial and material support. The International Map Collectors’ Society meets annually.

By 1994 there were more than 500 self-designated historians of cartography in the world, unequally distributed geographically with approximately 40 percent resident in continental Europe, 25 percent in North America, and 10 percent in the United Kingdom. Research interests were equally skewed with, for example, 66 respondents registering an interest in cartobibliography and 13 in projections, but only one in mining maps, two in cartouches, three in map accuracy, four in Hebrew maps, and five in tithe maps. Likewise, 19 registered an interest in Gerardus Mercator and 13 in Claudius Ptolemy but eminent cartographers from other cultures and periods were far less prominent. For example, al-Idrīsī, Timothy Pont, and Thomas Hutchins had only one registrant each. Nevertheless, the earlier collector-led, dilettante, aesthetically motivated, Eurocentric, and somewhat insular preoccupation with Renaissance and Enlightenment printed maps and mapmakers has undergone a relative decline and interest in such topics as mathematics in cartography, techniques of map production, map publishing, and surveying is increasing. Listed under the umbrella of ‘Theory and Methodology’ are such interests as ‘cartography and cultural theory,’ ‘construction of knowledge space,’ ‘politics of projections and maps,’ and ‘semiotics of prehistorical maps’ (Lowenthal 1998).

In anticipating future directions of the field, the most serious limiting factors are the paucity of specialist undergraduate courses and limited opportunities for formal graduate training. For the most part, those active in the field continue to enter by chance and often in mid-career. They contribute to a rich mix of epistemologies, interests, and skills, but it could be argued that this is delaying the emergence of an overarching paradigm. Essentially, researchers do not operate within an agreed philosophy, concur on the theoretical focus of their work, or use accepted methodological procedures to solve problems identified within a theoretical framework. Conversely, most set out to investigate topics de novo and only rarely on the basis of what is already known. They add to knowledge but usually without extending theory. Given the paucity of graduate courses in the history of cartography, much depends on the few that do exist if the imbalance between empiricism and theory is to change. Meanwhile, it is important that the field accepts and integrates concepts, issues, and directions being developed at or beyond what many insiders now see as its periphery, often by persons who do not identify themselves with it. Important among these are the idea of maps as power, the distinction between intended and actual roles, an increased commitment to iconology, and re-examining the idea of value-free cartography.

3. The History of Mapmaking and Maps

The following outline reflects something of the enormous empirical additions made during the last quarter of the twentieth century but it is neither structured nor underpinned by a body of theory. The overriding concern has been to present a multicultural review from the earliest times to the beginning of the electronic age of geographical information systems and, in so doing, to reduce the Eurocentrism that has distorted most syntheses hitherto. Although primarily organizational, the structure does imply a working hypothesis for a developmental theory.

Cartography was originally a near-universal skill. It probably had prehistoric origins in every part of the then settled world, when map content was one or a mix of terrestrial, celestial, and cosmographical, and its roles were geographical, calendrical, and making sense of the unknown universe and linking it to the world of experience and report. The ability to read random patterns as maps was probably used in divinational practices. In so-called traditional societies, these types of cartographic activities survived until recent times. Indeed, they are still practiced in some. In Western societies some of the skills characteristic of this level of ability are still used, as when experiential sketch maps are made in communicating to others or when the scaleless topologically-structured maps provided by public transport systems are referred to.

Cartography underwent a qualitative shift in most of the world’s early urban societies, where mapmaking slowly became an activity of elites but rarely a specialism. Although increasingly sophisticated and often revealing superb craftsmanship in producing artifacts of artistic quality, the basic roles of maps were still threefold, though with an increasing emphasis on terrestrial, made and used in the contexts of communicating, instructing, planning, building, administering, and record keeping. Very slowly, these


societies would seem to have developed a cartographic consciousness, although the concept of ‘map’ as a discrete category probably remained weak.

From the later urban societies of the Middle East and the eastern Mediterranean, two further and more or less independent cartographic traditions emerged: very quickly, towards the end of the first millennium AD, Islamic terrestrial and celestial mapping; and slowly, via more complicated stages, but ultimately with greater global impact, the Classical–Christian–European tradition. Both traditions used advanced instrumentation and mathematics to achieve greater precision than in other urban societies, arguably achieved higher artistic and aesthetic standards than elsewhere, and used maps very effectively in the interests of religious, scientific, and ruling elites. Ultimately, the European tradition achieved most, surpassing that of Islam and, without knowing it until much later, the earlier peak of cartographic achievement reached in China. The printing of maps disseminated spatially-organized information more effectively than ever before and, together with education, increasing literacy, and the mass media, raised cartographic consciousness to an unprecedented level. Maps and atlases became trade items. Formal ground survey and systematically conducted censuses and inventories of many kinds replaced judicious compilation of randomly assembled intelligence as the basic sources of information. Nation-states began to map their own territories and, later on, their colonial possessions. Cosmographical mapping virtually ceased but new types of maps emerged, in particular thematic maps and large-scale national topographic map series.

After c. 1975, a fourth cartographic tradition began to emerge rapidly from the European tradition: geographical information systems. An important part of the electronic age, it seems to be supplementing rather than replacing the tradition from which it emerged. If so, this will be in keeping with the historically earlier transitions between cartographic traditions.

3.1 Vernacular Terrestrial Maps

Many, perhaps most, adult members of historically traditional societies appear to have possessed the linked abilities to make certain kinds of maps and to understand those made by others in their own and similar societies. These abilities emerged in prehistory (Harley and Woodward 1987, Woodward and Lewis 1998). Artifactual, gestural, and performance maps were observed by Europeans and Euro-Americans in their early contacts with preliterate tribal societies (articles by Bassett, Davenport, Lewis, and Turnbull in Selin 1997). Less frequently, but almost as widespread, tribal peoples practicing divination were observed to interpret cartographically patterns induced on a variety of organic materials. Never constructed according to even approximate linear scale, content was determined by the usually immediate purpose a map was made to serve. For example, physically conspicuous and culturally important ground features might be omitted, whereas minor features and unique events central to a message often dominated, usually to the exclusion of all but an essential minimum of base data. Artifactual maps were made on or with a wide range of media. Most content was terrestrial, some littoral, but very little marine. The maps were almost always short-lived and often ephemeral. In the absence of writing, many artifactual examples were made to communicate to persons not present messages about events, conditions, and proposed activities elsewhere. They were functionally undifferentiated examples of much wider systems of pictographic communication. Sometimes they were made to instruct persons present or in the course of interactive planning. Very rarely did they serve as the equivalents of the reference maps of Western societies, though some were made and from time to time remade to preserve lore. Rather, their equivalents in the latter are the experiential sketch maps of small areas used by individuals to convey local information to others. It seems likely that they manifested the cognitive and behavioral substrata from which formal cartographies later emerged but direct evidence from prehistory is, and seems likely to remain, rare and contestable. Gestural and performance maps could leave no physical evidence and most artifactual examples would have had short lives.

In the course of geographical discovery and exploration, Europeans routinely solicited vernacular terrestrial maps from the indigenous peoples of Africa, the Americas, the Arctic, and Australasia, often incorporating them into their own maps. They did not, however, recognize important differences between their own and the indigenes’ epistemologies of terrestrial space. What appeared to be crude equivalents of their own maps were fundamentally different. Appraised retrospectively according to Western concepts of accuracy, the consequences of this led to major errors in European cartography of newly discovered areas (Lewis 1986, 1993).

3.2 Celestial and Cosmographical Maps of Shamans and Tribal Leaders

Within historical nonliterate societies, not all maps were terrestrial in content or of vernacular origin. Religious and tribal leaders often had responsibility for making, replicating, and sometimes performing celestial and cosmographical maps, as well as for preserving the lore and conducting the ceremonies with which these were associated. This was particularly so in societies preoccupied with the heavens in the contexts of calendrics and astrology and with a belief in a structured cosmos extending far beyond the world of direct experience. Celestial patterns were, for


example, stitched on shamans’ coats, painted on them as surfaces on which to make plans of property,
animal hides, incorporated in sand paintings, and land, houses, and temples, as well as small-scale maps
mirrored in the organization of rooms and in the of larger areas and the whole world as they supposed
positioning of buildings within settlements and vis-a' - it to be (Millard in Harley and Woodward 1987).
vis each other. The rapidly developing field of arch- In the fourth millennium BC, rudimentary topogra-
aeoastronomy is producing increasingly convincing phic diagrams began to appear on Egyptian decorated
evidence for the incorporation of celestial maps in pottery. Detailed plans were being made in ink on
prehistoric rock art. plaster-covered boards by the middle of the second
The frequency, variety, and physical diversity of millennium BC and cosmographical and celestial maps
cosmographical maps were probably each greater than were being engraved in stone by the end of the first
for celestial maps. In these, planar symmetry about millennium BC (Shore in Harley and Woodward
one or more axes was a common geometrical charac- 1987).
teristic, often incorporating an axis mundi as a third In China, maps of many kinds, great complexity,
dimension. Media were very varied, including cer- and often exquisite craftsmanship were being made
amics, skin drumheads, wall murals, sculpted figures, during and after the Warring States period (403 BC –
and scarification of human bodies. 221 BC). Many have survived; perhaps more than for
There is little doubt that celestial and cosmographi- any of the other early urban societies. Among the
cal maps had been made in the antecedents of these earliest is a plan for the construction of a royal tomb,
societies well before the earliest contact with Euro- engraved on a bronze plate and inlaid with gold and
peans. In other societies, maps had already become silver. Others include topographic, administrative, and
much more sophisticated; essentially those that had thematic maps on wooden boards and a map on silk
acquired scripts, evolved more advanced forms of showing a pattern of roads, rivers, and mountains,
religion, and developed large urban communities, together with the locations of settlements, scenic sites,
especially Babylonia, Egypt, northern China, Persia, and places of historical interest. One of the earliest
Islam, Mesoamerica, and the central Andes. known maps to have a grid (though probably not
based on it) is engraved on a stone dating from AD
1136 and may have had antecedents. There is some
direct evidence for mapping according to linear scale
3.3 Cartography in Early Urban Societies
and some textual support for this (Wanru et al. 1990
Urbanization was part cause and part consequence of and Yee in Harley and Woodward 1994).
increasingly complex social systems, religious institu- Surprisingly, and unexplained, there are almost no
tions, civil and military administrations, and extensive extant terrestrial maps from the Indian subcontinent
trading systems. It also involved increasing speciali- for the two millennia before the advent of the
zation and the concentration of wealth. In these Portuguese. The exceptions are a few incised potsherds
conditions new types and styles of maps emerged. from the end of the first millennium BC that have
Within each of the major urbanized regions a new rough plans of monasteries and a few ancient sculp-
cartography would appear to have emerged more or tures depicting sacred rivers. Nevertheless, the first
less independently. So far as is known, maps were not urban culture, the Harappan dating from the middle
conceived, compiled, or made by specialist carto- of the third millennium BC, had many temples and
graphers but as part only of the work of artists, settlements that were so standard in form that they
scribes, engravers, sculpters, scholars, astronomers, must have been based on precise plans. Furthermore,
architects, engineers, etc. The centers of map pro- artifacts from the period have been identified as
duction were at first the larger urban centers, separated surveying instruments. Cosmographical mapping may
by hinterlands in which the vernacular, shamans’, and have had its roots in temple design but the maps
tribal leaders’ maps were almost certainly still the emerged in their complex and artistically rich forms
norm. Even in the urban centers map consciousness much later in the Hindu and Jain religious traditions.
was probably still weak. Symptomatic of this was the Likewise, terrestrial mapping, though ultimately soph-
absence of words for ‘map’ as evidenced in those isticated, emerged late; from the middle of the sev-
languages of which there are reasonably reliable enteenth century AD onwards. In the subcontinent,
lexicons: ancient Greek and Latin, Persian, Arabic, therefore, early urbanization and events associated
Sanskrit, Hindi, and Chinese before 300 BC. Even so, with it was not the catalyst for cartography—but two
the maps made in these societies were much more much later events were; the emergence of the great
sophisticated and, by later Western standards at least, religions and contacts with Europeans (Schwartzberg
much more maplike in appearance than the vernacular in Harley and Woodward 1992).
shamans’ and tribal leaders’ maps being made else- Within months of first entering Mexico in 1519,
where, as, perhaps, by the nonelites and reactionary Spaniards were presented with paintings on cloth that
groups in the towns. were itinerary maps complete with topographic details
Between c. 2,300 BC and 500 BC, Babylonian necessary for route finding. What they either did not
scribes, in addition to writing on clay tablets, used see or recognize were three other categories of
Mesoamerican cartography: terrestrial maps that incorporated accounts of history (cartographic histories); cosmographical maps showing either a horizontal cosmos divided into five quadrants or a vertical cosmos divided into layers along an axis mundi; and celestial maps of stars and constellations. Of these, cosmographical models were probably the oldest tradition, dating back to the Olmecs, who founded urban centers as early as 1200 BC. By the first millennium AD carved stone tablets showing the cosmic layout were widespread. Celestial maps were less common, but cartouches in a Mayan wall mural are believed to represent the arrangements of planets and constellations on the night of an event in AD 792. In central Mexico extant cartographic histories are numerous; elsewhere they are known by report. One dates from c. 1542. Made on paper, cloth, animal hides, or parchment, they were complex, iconographically sophisticated, and showed what the ruling elites wanted communities to know about their pasts. Artists, scribes, and sculptors also formed an elite, but at the conquest product specialization was probably restricted to the larger urban centers. In smaller communities artists were expected to paint everything from pots to maps (Munday in Woodward and Lewis 1998).

In the years after 1532, when the Spaniards first reached the central Andean empire of the Inkas, the invaders were not given and did not report seeing maps. It is possible that flat maps did exist but that native conceptions of space and their symbolizing of it were so different from the Spaniards’ experience that they were not recognized. The archaeological record certainly reveals spatial and landscape representations from the previous 2,000 years. The Nazca lines (large ground drawings) may have been made to attract and spatially direct Andean gods. Ceramics of the first millennium AD represent landscapes and buildings three-dimensionally. In the years immediately before the conquest, landscapes were carved three-dimensionally on stones and even on extensive rock outcrops. Knotted string devices (khipus) may, in addition to other functions, have been used as maps. Soon after the conquest maps began to be included in native chronicles. There are long traditions of ceremonies that incorporate mapping procedures. Typically, these involve the arrangement of amulets in relation to a ground drawing of a map or plan (Gartner in Woodward and Lewis 1998).

Within 200 years of Mohammed’s death in AD 632, the Islamic-Arab world extended from the Atlantic coast of North Africa in the west to the Chinese sphere in the east and was soon to extend south to the Sahara. Together with religious practices, the fostering of scholarship and science, extensive geographical exploration, and urbanization, this rapid territorial expansion stimulated cartography, in part by borrowing languishing concepts and techniques from the earlier Hellenic, Byzantine, and Persian cultures and by interacting with contemporary cultures as distant and different as medieval Europe and Hindu India. Islamic cartographic achievements were numerous and diverse: artifacts of the highest craftsmanship and artistic quality, such as astrolabes in brass incorporating pierced planisphere star maps, celestial globes, and exquisitely painted bird’s-eye views of cities; mathematically sophisticated circular world maps centered on Mecca, constructed in such a way that the city’s distance and direction could be read directly from any locality in the Islamic world; and delicately colored cosmographical maps incorporating three divine worlds, the earth surrounded by seven heavenly spheres and a legendary encircling mountain, together with a seven-layered hell.

With the possible exception of that in China, Islamic cartography was more diverse and sophisticated than that in any of the other early urban societies. Its traditions persisted for more than a millennium. With the exception of some world maps, marine charts, and celestial globes, most maps were made for or copied from manuscript texts, to which they were generally subservient. Many of the texts were translations into Arabic from Greek, Syriac, Hebrew, Persian, Sanskrit, and Turkish. These included Ptolemy’s Almagest and Geography. Islamic scientists and craftsmen were creative in adopting and adapting the techniques described in these translations. Many cartographic artifacts were used in determining religious practices but few, if any, were religious icons. Cartography was encouraged and supported by imperial patrons and many maps were linked with political power. As in all other early urban societies there were no equivalents of the modern professional cartographer. Terrestrial maps, at least, were produced by the elite for the elite for the purposes of edification, illustration, and propagation of imperial glory. With the exception of marine charts and aids to religious practice, very few were practical (Harley and Woodward 1992; Savage-Smith, Karamustafa, and King in Selin 1997).

3.4 Cartography in European and Other Later Urban Societies

The extent to which Dark Age and medieval cartography in the western Mediterranean and Europe was influenced by that in the classical world is inadequately understood. The importance and quality of maps certainly declined in the late Roman Empire but the role of the Byzantine Empire in transmitting cartographic traditions has still to be assessed. One extant monastery map from the early ninth century AD is reminiscent of the best large-scale Roman plans. Medieval mappae mundi were certainly in a tradition extending from the third century AD. Some of the simple world outlines known as T-O maps made from the seventh century AD onwards were influenced by Greco-Roman philosophical tradition. Portolan
charts have been claimed by a minority of authorities to have Roman, Greek, Phoenician, Egyptian, even Neolithic origins (Harvey, Woodward, and Campbell in Harley and Woodward 1987). Possible, probable, or certain though these influences from earlier cultures may have been, they were insignificant compared with that of Ptolemy (Claudius Ptolemaeus).

Ptolemy was a second-century AD Greek scholar who worked within the framework of the early Roman Empire. His writings were unknown in Europe until 1406, when his Geography was translated from Greek into Latin. Hand-copied versions, many with maps, soon began to circulate and printed editions proliferated after 1475. Written as a manual for mapmakers, the original may not have contained maps but its accounts of three projections and tables of latitudes and longitudes were sufficient for the construction of world and regional maps. The Geography has been called the touchstone of the Renaissance in European cartography. Before 1482 Ptolemaic maps were exclusively reconstructions but the Ulm edition of that year contained woodcut maps that incorporated recent discoveries by Europeans (Dilke 1985).

Although Ptolemaic maps continued to be published for several centuries, the influence of the ideas in the Geography was in the long run the more important. It insisted on carefully measured data, stressed the need for validation, and promoted both the idea of plotting data with reference to a graticule and the equally important notion that the graticule could be transformed by the use of different projections, thereby preserving one property at the expense of others. Because they were in such demand, Ptolemaic maps were a factor in stimulating map-printing technology: at first from woodblocks and, after the mid-sixteenth century, from engraved copper plates. The latter were more readily correctable and revisable than woodblocks (Woodward and Harvey in Harley and Woodward 1987).

For almost 1,000 years before the Ptolemaic impact, cartography in Europe had served the Christian church. Indeed, many, perhaps most, maps had been made by monks and clerics. The primary purpose of mappae mundi had been to communicate the significant events in Christian history rather than to record their precise locations, with symbolism and allegory playing major roles. The T in T-O maps separated the three known continents but also symbolized the Christian Cross, in some cases with the body of Christ superimposed on it. Many local maps and plans were of ecclesiastical buildings and church property (Woodward and Harvey in Harley and Woodward 1987). Of the main categories of maps, only the Portolan charts had a primary utilitarian function (Campbell in Harley and Woodward 1987).

The Ptolemaic impact preceded a post-Renaissance change in the role of maps from mainly religious to almost exclusively secular. It occurred during the most momentous and diverse series of developments in the history of cartography before the technical innovations of the second half of the twentieth century. From the early sixteenth century until the mid-nineteenth century Europe dominated the field. Thereafter, the USA and Canada assumed increasing importance. Voyages of discovery and overseas settlement and exploration vastly increased geographical knowledge. During and after the eighteenth-century Enlightenment, science focused attention on hitherto ignored or little-known phenomena such as ocean currents and magnetic compass declination, both of which began to be mapped. New and improved instruments increased the accuracy of observations. For example, in the mid-eighteenth century the sealed spring-driven chronometer made it possible to establish longitude accurately anywhere and under virtually any condition (Sobel 1995). Navigation, travel, trade, warfare and international rivalry, settlement of newly discovered and reclaimed lands, Christian missionary activities, searching for and exploiting new and better natural resources, civil administration of increasingly complex societies with emerging senses of social responsibility, planning and civil engineering, science, tourism, popular journalism, the new academic geography that first emerged in Germany after the mid-nineteenth century, and the pedantic geography that paralleled it in the new state education systems were among the factors creating demands for and access to more, better, and different kinds of maps.

At first, those who were known to acquire maps improved their status. Examples include the original commissioners of mid-sixteenth century custom-made Italian world atlases; slightly later, those who bought the standardized world atlases of the late sixteenth century Dutch cartographers Abraham Ortelius and Gerardus Mercator; and, soon after that, the purchasers of single-country atlases such as Christopher Saxton’s of England and Wales. Those who knew how to use maps, rather than just placing them in their libraries or having their names engraved on them as patrons, increased their political influence, material assets, and intellectual standing. For example, in attempting to consolidate their powers, monarchs in early modern Europe used maps to better define and expand their territories (Buisseret 1992). Known as cadastral maps, cartographic records of property ownership were important tools in the extension of land-based power (Kain and Baigent 1992). From the early eighteenth century onwards, statesmen began establishing permanent departments to map national and colonial territories systematically, accurately, and usually at much larger scales than those, if any, already available; for example, in France (Konvitz 1987) and British India (Edney 1997). Thereafter, up-to-date topographic map series were to become hallmarks of nation-states. Governors of quasistates such as the Hudson’s Bay Company attempted something similar, though less systematically (Ruggles 1991). By the eighteenth century cartographic consciousness had
extended to the rural gentry and new urban middle class subscribers to monthly magazines, most of which included maps on a fairly regular basis. To conform with magazine formats and keep engraving costs down, these were small. Many illustrated current events, and were often made by eminent cartographic engravers such as Emanuel Bowen and Thomas Jefferys (Jolly 1990).

After about 1825, lithography became the popular map-printing technique in Europe. Cheaper than copper-plate engraved maps, the new maps became larger and, from the 1860s onwards, the technique was adapted to print in color. A new generation of maps and atlases began to be published, often cheaply, in large quantities, for use in schools and in the home. In the USA, however, between 1880 and 1940 wax engraving was the dominant technique in commercial cartography. One advantage was that printer’s type was used for lettering on maps (Woodward 1977). Wax engraving was important in mass producing categories of widely available, cheap, and sometimes freely distributed maps that extended map consciousness among almost all sectors of the American public: railroad and domestic tourist maps after 1880 and highway maps between the two world wars. They accelerated a trend that had been stimulated by popular journalism during the Civil War. Before 1861 American newspapers rarely contained maps (Bosse 1993). Thereafter, they were to become common (Monmonier 1989).

Between 1500 and 1850 the centers of cartographic innovation and map production were in the financial cities of the densely populated, economically active regions of southern and northwestern Europe. The center of gravity of these activities was in Italy (Venice, Rome, and Genoa) in the early and mid-sixteenth century, the Low Countries (Antwerp and Amsterdam) in the late sixteenth and seventeenth centuries, and France (Paris) and England (London) in the eighteenth century. Theory and techniques improved, specialisms emerged, and cartography became increasingly professionalized. By the late nineteenth century the center of gravity had crossed the North Atlantic and dispersed to cities such as New York, Washington DC, and Chicago.

One important category of maps did not conform to these general trends: maps of one or a closely related group of physical, or social, or economic phenomena. Now referred to as thematic maps, early examples were made in the late seventeenth century, rather more during the eighteenth, followed by a burgeoning in northwest Europe during the first half of the nineteenth. A consequence of growth of interest in natural history, the environment, society, and economic change, it was facilitated by scholarly periodicals and only possible once base maps existed. Much of it stemmed from the curiosity and research interests of individuals in topics as diverse as cholera, language, cereals, and volcanoes, often in attempts to demonstrate spatial correlations between two or a few phenomena (Robinson 1982). The emergence of thematic cartography heralded the great late nineteenth and twentieth century national surveys of geology, soil, land use, etc. and the best of the national atlases published after the middle of the twentieth century.

Aerial photogrammetry, the science of obtaining accurate measurements from air photographs, was used increasingly after World War I. By the 1950s most large-scale mapping and map revision made use of it and in the 1960s and 1970s it was used to map the moon. In 1950, an even greater technical advance was heralded by the publication of the first computer-generated map. Computers are now used in every stage of mapmaking, from surveying to the compilation, design, and production of the end product. Increasingly, maps are custom made. In this respect they are similar to the vernacular maps made in traditional societies. They are not, however, grounded in individual experience or shared traditions but use software programs (geographical information systems) and digitized databases. How historians of cartography in the future will incorporate these developments into the universal history of maps and mapmaking is as yet unclear.

See also: Geography; Space and Social Theory in Geography; Spatial Analysis in Geography; Spatial Data

Bibliography

Bagrow L 1951 Die Geschichte der Kartographie. Safari-Verlag, Berlin. First English edition, used in preparing this article, rev. and enl. Skelton R A [trans. Paisey D L] 1964 History of Cartography. Harvard University Press, Cambridge, MA
Bosse D 1993 Civil War Newspaper Maps. Johns Hopkins University Press, Baltimore, MD
Buisseret D (ed.) 1992 Monarchs, Ministers, and Maps: The Emergence of Cartography as a Tool of Government in Early Modern Europe. University of Chicago Press, Chicago
Dilke O A W 1985 Greek and Roman Maps. Cornell University Press, Ithaca, NY
Edney M H 1997 Mapping an Empire: The Geographic Construction of British India 1765–1843. University of Chicago Press, Chicago
Fletcher D 1995 The Emergence of Estate Maps: Christ Church, Oxford 1600–1840. Oxford University Press, Oxford, UK
Harley J B, Woodward D (eds.) The History of Cartography: Vol. 1, 1987, Cartography in Prehistoric, Ancient, and Medieval Europe and the Mediterranean; Vol. 2, book 1, 1992, Cartography in the Traditional Islamic and South Asian Societies; and Vol. 2, book 2, 1994, Cartography in the Traditional East and Southeast Asian Societies. University of Chicago Press, Chicago
Jolly D C 1990 Maps in British Periodicals. Part 1. Major Monthlies Before 1800. D C Jolly, Brookline, MA
Kain R J P, Baigent E 1992 The Cadastral Map in the Service of the State: A History of Property Mapping. University of Chicago Press, Chicago
Konvitz J W 1987 Cartography in France 1660–1848: Science, Engineering, and Statecraft. University of Chicago Press, Chicago
Lewis G M 1986 Indicators of unacknowledged assimilations from Amerindian maps on Euro-American maps of North America. Imago Mundi 38: 9–34
Lewis G M 1993 Metrics, geometries, signs, and language: sources of cartographic miscommunication between native and Euro-American cultures in North America. Cartographica 30(1): 98–106
Lewis G M (ed.) 1998 Cartographic Encounters: Perspectives on Native American Mapmaking and Map Use. University of Chicago Press, Chicago
Lowenthal M A (ed.) 1998 Who’s Who in the History of Cartography: The International Guide to the Subject (D9). Map Collector Publications Ltd. for Imago Mundi Ltd., Tring, UK
Monmonier M 1989 Maps With The News: The Development of American Journalistic Cartography. University of Chicago Press, Chicago
Robinson A H 1982 Early Thematic Mapping in the History of Cartography. University of Chicago Press, Chicago
Ruggles R I 1991 A Country So Interesting: The Hudson’s Bay Company and Two Centuries of Mapping, 1670–1870. McGill-Queen’s University Press, Montreal
Selin H (ed.) 1997 Encyclopaedia of the History of Science, Technology, and Medicine in Non-Western Cultures. Kluwer Academic, Dordrecht, The Netherlands
Skelton R A 1972 Maps: A Historical Survey of Their Study and Collecting. University of Chicago Press, Chicago
Sobel D 1995 Longitude. Walker and Co., New York
Twyman M 1982 The graphic presentation of language. Information Design Journal 3(1): 1–22
Wallis H M, Robinson A H (eds.) 1987 Cartographic Innovations: An International Handbook of Mapping Terms to 1900. Map Collector Publications, Tring, UK in association with the International Cartographic Association
Wanru C, Xihuang Z, Shengzhang H, Zhongxun N, Jincheng R, Deyuan J (eds.) 1990 An Atlas of Ancient Maps in China: From the Warring States Period to the Yuan Dynasty (476 BC–AD 1368). Cultural Relics Publishing House, Beijing
Woodward D 1977 The All-American Map: Wax Engraving and its Influence on Cartography. University of Chicago Press, Chicago
Woodward D, Lewis G M (eds.) 1998 The History of Cartography Vol. 2, book 3, Cartography in the Traditional African, American, Arctic, Australian, and Pacific Societies. University of Chicago Press, Chicago

G. M. Lewis

Case Study: Logic

1. The Case Study Defined

The case study is a research strategy with special implications for theoretical analysis and data collection. A single case of a particular phenomenon is examined intensively for the light that it can shed on a specific problem or question. Often several methods of data collection will be used to assemble a wide range of information about the case. One of the main benefits of a wide range of data is that it permits a triangulation of research methods, thus providing substantial verification of the particular phenomenon in question.

Researchers vary in why they choose to study a single case, but most often they do so in order to examine a proposed theory or to provide grounded and detailed information for a new theory (Glaser and Strauss 1967). Although many quantitative researchers consider the case study a limited form of analysis, especially by comparison with methods that collect information on a large sample of instances, such a conclusion fundamentally misconstrues the main purpose of the case study. The analyst who selects and studies a single case, or even a handful of cases, is not primarily interested in statistical inference to a larger population of units, but rather in theoretical analysis and inference. Nevertheless, the findings from the study of a single case, like those from the study of a large sample of instances, are always to be regarded as tentative, awaiting further confirmation from additional studies of parallel theoretical cases.

2. Why Study the Single Case?

The intensive analysis of the case study is premised on the special advantages that it furnishes. First, the study of single cases enables the researcher to probe a particular question, or phenomenon, in great detail. Such in-depth examinations ultimately permit the researcher to acquire a degree of knowledge about the case that is typically impossible through the examination of a large number of cases. Moreover, such in-depth work also enables the researcher to pursue the examination of alternative theoretical ideas, thus ultimately arriving not merely at a thorough understanding of the empirical facts, but ideally at a careful and correct appreciation of the most germane and effective theoretical argument to fit these facts (Campbell 1975).

Second, by studying a single case the researcher is able to take full account of the social, or historical, context of the phenomenon in question. Students of case studies regard context as essential to understanding the nature of the phenomenon. Take, for example, the study of children who are unruly in the classroom. The investigator may believe that such unruliness is the product of how the child relates both to his peers and teacher in the context of classroom activity. In order to appreciate and to fully understand the nature of the child’s reactions, the investigator is compelled to study the child in the classroom situation (Stake 1995).

Third, the study of the single case permits the researcher to probe comprehensively into the empirical data at hand. In the study of the workings of a single community, for example, the case study researcher can
explore a variety of dimensions of the community and can thereby create a multidimensional, or holistic, sense of the community rather than, let us say, a unidimensional one based upon its size or territorial breadth. In so doing the researcher can also emerge with a fuller understanding of the case in question by fashioning an integrated portrait of the case into which the various pieces, or dimensions, fit. Fourth, the case study provides boundaries to the nature of the phenomenon under investigation. The case is chosen because it represents a self-contained unit that will permit the researcher to investigate the phenomenon in isolation from other forces. Thus, educational researchers will often investigate a single classroom or school because such a case represents some unusual qualities in which they are interested (Stake 1995).

Finally, the single case is sometimes chosen because it represents a special illustration of the phenomenon under investigation. Sometimes it is portrayed as the exception to the rule, or deviant case, thereby permitting the observer to understand some more general phenomenon in greater depth.

3. How to Choose Cases

Great care must be exercised in the choice of the single case. Sometimes social scientists believe that any case may be used to explore a particular phenomenon. Indeed, there appears to have been a great deal of faulty case study research because there are few guidelines by which to select the case. In fact, the choice must always involve certain critical decisions in advance of the data collection and analysis.

First and foremost, the researcher must decide the grounds for selecting a particular unit for study. Since such selection is never based upon issues of statistical inference, the theoretical reasons for selecting the case must always be made clear, or at least as clear as possible, in advance of the research. For example, the sociologist R. Stephen Warner believed that broad societal changes might have a deep influence on the life of churches in America. Thus, he chose to focus on a single church, and to examine how the internal organization of the church changed over several decades during which there was marked change in American society. Second, the researcher should choose those cases that will furnish the clearest tests of the theory, or argument, in question. If a researcher is interested in the causes of revolution, then a society should be chosen where there has been a substantial and marked revolution, to examine the possible causes.

3.1 What Qualifies as a Case?

Unlike certain forms of research method, such as the sample survey, case study research can employ very different units of analysis on which to focus its theoretical attention. The earliest examples of case studies were often in-depth studies of particular social roles, designed to explore the nature of such roles. Other famous case studies have been done of entire cities, including the range of different people and organizations in such cities. More recently, social researchers have used the case study format to study specific organizations; specific locales, in order to understand the nature of social groups in those locales; and even specific individuals whose traits exemplify special qualities for investigation (Stake 1995). In addition, the case is sometimes used very effectively to explore in rich detail the nature of social processes at work. One of the best illustrations of using the case study in this manner was done by the political scientist Matthew Crenson (1971). He furnished a number of important insights into the nature and exercise of power, including the dynamics of agenda-setting, simply from the study of how political decisions were made in a single city.

4. The Case Study as a Strategy of Research

By comparison with other research methods, the case study has certain limitations. It does not permit the easy and refined manipulation of variables in the same way as an experiment does. Nor does it permit a researcher to investigate various configurations of variables that might be associated with particular outcomes, or dependent variables, as the sample survey does. Most often, the case study is done using qualitative methods, thus resulting in some confusion between case studies and qualitative research (Merriam 1998). Yet there are important examples in which quantitative data also have been collected to provide key information about the single case. For example, some researchers have used time series data on a single country to examine it in depth.

4.1 The Case Study as an Inductive Tool

The case study has proven most useful for the generation of new theory. Because a single case can be examined in great depth, and because it can be studied with a variety of research tools, researchers can mine it for a great deal of information. At the same time, the careful examination of such information can furnish insights not easily acquired by other research methods. Therefore the case study is perhaps at its strongest when it is used as an inductive research instrument, allowing the researcher to construct an explanation for the phenomenon under investigation (Glaser and Strauss 1967).

4.2 The Case Study as a Deductive Tool

Yet, the study of the single case can also furnish an important tool for those observers who wish to test, or apply, deductive theories. One of the most famous
illustrations of this occurs in the classic study of the examination of a group of impoverished men, and
International Typographical Union (Lipset et al. 1956). how they adapted to life during the Great Depression.
The authors had concluded, both on empirical and In fact, as Whyte so brilliantly showed, it really was a
theoretical grounds, that trade unions were charac- case study of a small group of men, and the way in
terized by very rigid and tightly closed political which the nature of stratification emerged within the
administrations, with one clique maintaining its rule in group, and how it influenced the behavior of indivi-
office over a long period of time. They then chose duals. It was done with great insight and care by
deliberately to examine the ITU in detail, because it Whyte, making it still one of the best and most
appeared to be the exception to the general situation. influential case studies of all time.
What was it about the ITU, they asked, that produced
empirically disparate results from all other trade
unions? This was a powerful, well-refined question. In 6. Issues of Theoretical Specification: The Use of
effect, the ITU provided a comparison to all other Singular Cases to Illuminate General Principles
cases: if its outcome was different, so too must have Sometimes the study of the single case is done
been the set of elements that brought about such an specifically to reveal the importance of a special
outcome. The answer, furnished through a variety of configuration, or confluence, of elements that together
evidence, suggested that, because of their level of help to explain the unique outcome of a case, and its
education, skill, and commitment to the ITU, union difference from the general pattern (Campbell 1975).
members of this union, as compared to others, simply The logic underlying the analyst’s concern is that the
were more likely to take an active role in union affairs. set of variables together constitutes the sufficient,
The one consideration that is overlooked in using perhaps even singular, explanation for a particular
the case study as a deductive tool is that researchers outcome, or effect. The research on the International
may fail to specify the nature of the null hypothesis, or Typographical Union by Lipset et al. (1956), for
the likely outcome if the theory is incorrect. While instance, pointed to the special configuration of
most case studies have failed to consider this important conditions in the ITU that helped to promote internal
feature of theory construction and testing, Yin (1994) democracy. It suggested that the special confluence of
argues that, if done carefully, such a strategy can be factors promoted democracy, leading to the obvious
incorporated easily into the analysis of single, and conclusion that only under certain special social and
even multiple, cases. historical circumstances could such variables again
come together in such a specific manner.
5. The Case Study: Seminal Illustrations
Over the years there have been a number of very 7. Issues of Theoretical Generalization and
significant case studies. Two, in particular, illustrate Refinement: The Use of Parallel and Different
the wide variety of uses to which case study analysis Cases
can be put in the study of social phenomena.
One of the first, and surely most famous, case Because the analysis of a single case can only take the
studies was done of the city of Muncie, Indiana (Lynd analyst so far, the best rule to follow, if possible, is to
and Lynd 1929). Conducted during the course of the examine at least three or four pertinent cases in depth.
1920s, the investigators and their research team Multiple cases permit the researcher to refine and
wanted to uncover the full nature of a typical Ameri- develop theoretical arguments, and they do so pre-
can city over the course of its lifetime. Part of their cisely because case studies permit the researcher to
work involved tracing the history of the city. A large examine cases deeply and, ultimately, to make careful
part, however, involved the collection of a wide range comparisons among those cases.
of information on the city’s current residents, organi- One example will illustrate the advantages of using
zations, and its internal workings. several cases and examining them in depth. Orum
There were a variety of important discoveries, (1995) was interested in examining various theories
including discoveries about the nature of work life and about the origins of urban growth. Popular theories
religious life in the community, as well as about the had suggested that such growth came about because of
dominance of a leading industrial family over the special alliances among the leading figures in different
entire range of civic and political life of the community. major institutional spheres, particularly local govern-
It might be noted that sample surveys of city residents, ment, business, and the media. Orum suggested that
or demographic data about them, would never have such theories were limited only to contemporary
uncovered the full and complex portrait the Lynds circumstances and that they did not help much to
were able to construct through their pioneering case explain the sources of growth in the past. Using several
study research. different cases and studying them in great depth
A second famous case study was done of men who permitted him to come up with several important
lived in the North End of Boston (Whyte 1943). This discoveries bearing upon issues of urban growth and
study, conducted in the 1930s, was ostensibly an expansion.

Case Study: Logic

The first case examined was that of Milwaukee, Wisconsin. Tracing its origins back to the early
nineteenth century, Orum discovered that the city seemed to go through certain stages in its growth,
and that one could not simply rely on explanations of alliances among leaders of different
institutional spheres. Thus, based upon intensive study, he argued that, in the earliest years, the
growth of the city relied heavily on local entrepreneurs, people who promoted the city and who, among
other things, sold land for it. But over time this motive force for growth changed, and the main
forces turned to new economic institutions, ones that brought in growth by attracting new residents.
A third stage happened when the impetus for growth shifted from the local economic enterprises, which
had grown somewhat stagnant, to local government. Government sought to expand the community by adding
new territory. Among other things, it also became clear in this stage that a single conception of
urban growth was inappropriate; growth could take the form of new population, but also new territory.
Such an insight would have been impossible had Orum relied only on the study of this single case in
depth. The fourth and final phase happened as the city began to decline. In large part, the decline
took place when many of its original industries left for greener pastures in other cities. Under
these circumstances, a heavy burden was placed upon local government to devise ways to promote
growth, either by expanding its borders or by attracting new industry through various incentives.

In effect, then, Orum’s explanation for Milwaukee’s growth suggested that such expansion happened by
stages, and that at each stage there was a different configuration of local business and political
forces responsible for such growth, and that the principal form of growth itself could vary between
stages. Next, he asked whether this pattern held only for Milwaukee, or whether it could also
describe the trajectory of growth and decline in other parallel industrial cities. Thus, he made
comparisons in the pattern and stages of growth in Milwaukee with those in Cleveland, Ohio, a city
which, on the face of it, seemed similar in many respects. In fact, the stages and forces of growth
at each stage were virtually identical in Cleveland and Milwaukee. He thus argued that among American
industrial cities, the nature of growth, and the forces responsible for it, showed important
parallels, underscoring his argument on behalf of a stage theory of urban growth.

Finally, Orum pushed the argument one step further, wondering whether the pattern found was only
characteristic of industrial growth, and cities, in America. Could the pattern of stages also apply
to cities that grew up in the postindustrial era and were themselves postindustrial cities? Here he
made a comparison between the patterns in Milwaukee and those in Austin, Texas, a postindustrial city
he also had come to know intimately. This comparison showed that the first three stages were very
similar in terms of the rapidity of growth, and in the role of entrepreneurs, local business, and
local government in promoting such growth in Milwaukee and Austin alike. In the end, then, he refined
his overall argument to suggest that the stage view of urban growth in America fit not only
industrial-era cities, but also cities in the postindustrial era. This suggested that there appeared
to be an underlying structural pattern to the growth of American cities, regardless of era. This last
comparison also enabled him to draw out differences between Milwaukee and Austin, allowing for
further understanding of the nature of urban growth, and difference, in the United States.

In brief, then, as the example of the various cities is intended to illustrate, the careful and
systematic study of several cases can permit the researcher both to substantiate important
theoretical generalizations and to refine and extend them. Again, this can be done because of the
deep acquaintance the researcher gains from the study of cases, an acquaintance unlike that to be
obtained with methods like surveys or experiments.

8. Conclusion

The use of a single case to develop, or to test, theoretical insights remains a very important
strategy in the social sciences at the start of the twenty-first century. Indeed, it would appear to
be growing in significance. But it must be executed with great care, and with a special sensitivity
to theoretical issues. The major limitation in the study of the single case, or even a handful of
cases, lies not in the limited number of empirical units, but in the ability of the researcher to be
sensitive to issues of theory development and the gathering of relevant evidence to develop and
refine that theory.

See also: Case Study: Methods and Analysis; Case-oriented Research; Human–Environment Relationship:
Comparative Case Studies; Psychotherapy: Case Study; Single-subject Designs: Methodology; Time
Series: General

Bibliography

Bradshaw Y, Wallace M 1991 Informing generality and explaining uniqueness: The place of case studies in comparative research. International Journal of Comparative Sociology 32(1–2): 154–71
Campbell D T 1975 ‘Degrees of Freedom’ and the case study. Comparative Political Studies 8(2): 178–93
Crenson M 1971 The Unpolitics of Air Pollution. Johns Hopkins Press, Baltimore, MD
Feagin J R, Orum A M, Sjoberg G (eds.) 1991 A Case for the Case Study. University of North Carolina Press, Chapel Hill, NC


Glaser B G, Strauss A L 1967 The Discovery of Grounded Theory: Strategies for Qualitative Research. Aldine, Chicago
Lipset S M, Trow M, Coleman J S 1956 Union Democracy: the Internal Politics of the International Typographical Union. Free Press, Glencoe, IL
Lynd R, Lynd H M 1929 Middletown: A Study in American Culture. Harcourt Brace, New York
Merriam S B 1998 Qualitative Research and Case Study Applications in Education. Jossey-Bass, San Francisco
Orum A M 1995 City-Building in America. Westview Press, Boulder, CO
Ragin C C, Becker H S (eds.) 1992 What Is A Case? Exploring the Foundations of Social Inquiry. Cambridge University Press, Cambridge, UK
Ragin C, Zaret D 1983 Theory and method in comparative research: Two strategies. Social Forces 61(3): 731–54
Stake R E 1995 The Art of Case Study Research. Sage, Thousand Oaks, CA
Warner R S 1991 Oenology: The making of New Wine. In: Feagin J R, Orum A M, Sjoberg G (eds.) A Case for the Case Study. University of North Carolina Press, Chapel Hill, NC, pp. 174–99
Whyte W F 1943 Street Corner Society: The Social Structure of an Italian Slum. University of Chicago Press, Chicago
Yin R K 1994 Case Study Research: Design and Methods, 2nd edn. Sage, Thousand Oaks, CA

A. M. Orum

Case Study: Methods and Analysis

Case study methods have been around as long as recorded history, and they presently account for a
large proportion of the books and articles in anthropology, biology, economics, history, political
science, psychology, sociology, and even the medical sciences. The logic of case study methods, much
like that of any historian’s or detective’s efforts to make inferences from patterns within cases and
comparisons between them, is more intuitive than the logic of statistical inference. Until relatively
recently, however, the lack of formalization of the logic of case study methods inhibited them from
achieving their full potential for contributing to the progressive and cumulative development of
theories. It is only in the last three decades that scholars have formalized case study methods and
linked them to underlying arguments in the philosophy of science. Ironically, statistical methods,
though less intuitive, were standardized earlier, so that attempts to formalize case study methods
often misappropriated terms and concepts from statistics (McKeown 1999). More recently, case study
methods have evolved from being defined ‘negatively,’ via contrasts to statistical methods, to being
defined ‘positively,’ by their distinctive logic, techniques, and comparative advantages. This
continuing evolution remains a contested process, but there is growing consensus on the proper
procedures for carrying out case studies and the strengths and limits of such studies. It is becoming
increasingly clear through this process that the comparative advantages of case study and statistical
methods are largely complementary and that the two methods can thus achieve far more scientific
progress together than either could alone.

1. The Definition of ‘Case’ and ‘Case Study’

Early efforts to define ‘case studies’ relied on distinctions between the study of a small vs. a
large number of instances of a phenomenon. Case studies thus became characterized as ‘small n’
studies, in contrast to ‘large N’ statistical studies. Related to this, one early definition stated
that a ‘case’ is a ‘phenomenon for which we report and interpret only a single measure on any
pertinent variable’ (Eckstein 1975). Case study researchers have increasingly rejected this
definition, however, because it wrongly implies, in the language of statistics, that there is an
inherent ‘degrees of freedom’ problem in case studies. In other words, this definition of case
studies suggests that with a greater number of variables than observations on the dependent variable,
case studies provide no basis for causal inference. In the view of case study researchers, however,
each case includes a potentially large number of observations on intervening variables and
qualitative measures of different aspects of the dependent variable, so there is not just a ‘single
measure’ of the variables or an inherent degrees of freedom problem. This point is increasingly
recognized by researchers from the statistical tradition (King et al. 1994).

In addition, the ‘small n/large N’ distinction implies that large N methods are always preferable
whenever sufficient data is available. As argued below, however, case studies can serve useful theory
building purposes, such as the inductive generation of new hypotheses, even when instances of a
phenomenon are sufficiently numerous to allow the application of statistical methods.

For present purposes, then, a case is defined as an instance of a class of events (George 1979a,
1979b). The term ‘class of events’ refers here to a phenomenon of scientific interest, such as
revolutions, types of governmental regime, kinds of economic system, or personality types. A case
study is thus a well-defined aspect of a historical happening that the investigator selects for
analysis, rather than a historical happening itself. The Cuban Missile Crisis, for example, is a
historical instance of many different classes of events: cases of deterrence, coercive diplomacy,
crisis management, and so on. In deciding which class of events to study and which theories to use,
the researcher decides what data from the Cuban Missile Crisis is relevant to their case study of it.
Of course, even if one accepts the present definition of a case, there is still room in the context
of particular studies to debate such questions as: ‘what is this event a case of’ and ‘given this
phenomenon, is this event a case of it?’ (Ragin and Becker 1992).


Table 1
Types of case studies

Lijphart                       Eckstein
atheoretical                   configurative-ideographic
interpretative                 disciplined-configurative
hypothesis generating          heuristic
deviant                        ?
theory-confirming/infirming    crucial, most-likely, least-likely

There is potential for confusion among the terms
‘comparative methods,’ ‘case study methods,’ and
‘qualitative methods.’ In one view the comparative
method, or the use of comparisons among a small
number of cases, is distinct from the case study
method, which in this view involves the internal
examination of single cases (see Comparative Studies:
Method and Design). For the present purposes, however,
case study methods are defined to include both
within-case analysis of single cases and comparisons
between or among a small number of cases. This is not
an effort to claim wider meaning for the term ‘case
studies,’ but an outgrowth of the growing consensus for example, we find that teenagers are ‘difficult’ in
that the strongest means of drawing inferences from both tribal societies and industrialized societies, we
case studies is the use of a combination of within-case might be tempted to infer that it is the nature of
analysis and cross-case comparisons within a single teenagers rather than the nature of society that
study or research program, although single case accounts for the difficulty of teenagers.
studies can also play a role in theory development. As Arend Lijphart (1971) and Harry Eckstein (1975)
for the term ‘qualitative methods,’ this is sometimes contributed further to the formalization of case study
used to encompass both case studies carried out with a methods by clarifying the differences among various
positivist view of the philosophy of science and those types of case study research designs and theory-
implemented with a postmodern or interpretive view. building goals. These authors identified similar types,
This present article hews to the traditional terminology although their terminology differs and Lijphart adds
in focusing on ‘case studies’ as that subset of quali- an important type, the ‘deviant case,’ for which
tative methods that has adopted a largely positivist Eckstein does not make explicit provision. Their types
framework. of case studies correspond as shown in Table 1.
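The two comparison logics described in this section, the ‘most similar’ design (Mill’s method of difference) and the ‘least similar’ design (Mill’s method of agreement), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the variable names, and both pairs of cases are hypothetical, and each case is assumed to be coded as a dictionary of dichotomous variables. The variable returned is a candidate explanation only; as the article cautions, the inference may be spurious.

```python
def independent_variables(case, outcome):
    """Names of all coded variables except the outcome variable."""
    return [name for name in case if name != outcome]


def most_similar_candidate(case_a, case_b, outcome):
    """Method of difference: outcomes differ and exactly one independent
    variable differs; that variable is returned as the candidate cause."""
    if case_a[outcome] == case_b[outcome]:
        return None
    differing = [v for v in independent_variables(case_a, outcome)
                 if case_a[v] != case_b[v]]
    return differing[0] if len(differing) == 1 else None


def least_similar_candidate(case_a, case_b, outcome):
    """Method of agreement: outcomes match and exactly one independent
    variable is shared; that variable is returned as the candidate cause."""
    if case_a[outcome] != case_b[outcome]:
        return None
    shared = [v for v in independent_variables(case_a, outcome)
              if case_a[v] == case_b[v]]
    return shared[0] if len(shared) == 1 else None


# Hypothetical coding of the teenager example from the text.
tribal = {"industrialized": False, "formal_schooling": False,
          "adolescent": True, "difficult": True}
industrial = {"industrialized": True, "formal_schooling": True,
              "adolescent": True, "difficult": True}

# Hypothetical most-similar pair: identical except for one variable.
state_a = {"economic_crisis": True, "divided_elite": True,
           "lost_war": True, "revolution": True}
state_b = {"economic_crisis": True, "divided_elite": True,
           "lost_war": False, "revolution": False}

agreement = least_similar_candidate(tribal, industrial, "difficult")
difference = most_similar_candidate(state_a, state_b, "revolution")
print(agreement, difference)
```

Both functions return None whenever the pair of cases does not meet the design’s requirements, which real nonexperimental cases rarely do exactly.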
The atheoretical or configurative-ideographic case
study takes the form of a detailed narrative or ‘story’
presented in the form of a chronicle that purports to
2. The Historical Development of Case Study illuminate how an event came about. Such a narrative
Methods is highly specific and makes no explicit use of theory or
theory-related variables. Most case studies, however,
Case study methods have developed through several do have an explanatory purpose. These studies gene-
phases over the last three decades. Prior to the 1970s, rally fall into the category of ‘disciplined-configu-
‘case studies’ consisted primarily of historical studies rative’ or ‘interpretive’ case studies, in which general
of particular events, countries, or phenomena, with propositions are used, often implicitly, to explain
little effort to cumulate results or progressively develop specific historical cases. Another variant of such case
theories (Verba 1967). Throughout the 1970s, how- studies is the use of cases as examples that illustrate a
ever, scholars who were dissatisfied with the state of theory.
case study methods, and encouraged by the example of Heuristic case studies seek to generate new hy-
the formalization of statistical methods, began to potheses inductively from the study of particular cases.
formalize case study methods. Notably, statistical methods lack this capacity for
First, Adam Przeworski and Henry Teune (1970) inductively generating hypotheses, and they typically
clarified the logic of ‘most similar’ and ‘least similar’ rely instead on hypotheses derived deductively or
case comparisons. In the former comparison, which borrowed from case study research. An especially
draws on the logic of John Stuart Mill’s method of important type of case study for developing new
difference and mimics the experimental method, the hypotheses is the ‘deviant’ case study. This is the study
researcher compares two cases that are similar in all of a case whose outcome is not predicted or explained
but one independent variable and that differ in the adequately by existing theories. Unless the outcome of
outcome variable. Such a comparison may be con- a deviant case turns out to be a consequence of
sistent with the inference that the difference in the measurement error, the case is likely to be useful for
single independent variable that varies between the identifying variables that have been left out of existing
cases accounts for the difference in the dependent theories. Finally, researchers can use case studies to
variable (although for a variety of reasons discussed test whether the outcomes and processes that theories
below, this inference may be spurious). In a com- predict in particular cases are in fact evident.
parison of least similar cases, which draws on Mill’s Eckstein’s and Lijphart’s contributions demon-
method of agreement, the researcher compares two strated that there was not just a single type of case
cases that differ in all but one independent variable but study, but many kinds of case study research designs
that have the same value on the dependent variable. If, and many different theory-building purposes that they


could serve. Their treatments differed, however, in might be no single necessary or sufficient variable for a
that Lijphart relied greatly on statistical concepts and phenomenon: it might be that either ABC or DEF
language. He was thus skeptical of the value of single causes Y, and that none of the variables A–F is itself
case studies for building social science theories, and, sufficient to cause Y (see Human–Environment
consistent with the widespread preference at the time Relationship: Comparative Case Studies). In such
for ‘large N’ over ‘small n’ methods, he urged circumstances, pair-wise comparisons of cases might wrongly
researchers to consider several means of either reject variables that contribute to the outcome of
decreasing the number of variables in their models or interest in conjunction with some contexts but not
increasing the number of cases to be studied in order to with others, and might also accept as causal variables
make use of statistical rather than case study methods. that are in fact spurious.
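The equifinality problem described above, in which either ABC or DEF causes Y and no single variable is sufficient, can be made concrete with a small sketch. Everything here is hypothetical: the causal rule is simply assumed to be known so that the failure of a pair-wise comparison can be seen directly, which is of course never the situation of a real researcher.

```python
CONDITIONS = "ABCDEF"


def outcome(case):
    """Assumed 'true' causal rule: Y occurs if A, B, C are all present
    or if D, E, F are all present (two distinct causal pathways)."""
    return ((case["A"] and case["B"] and case["C"])
            or (case["D"] and case["E"] and case["F"]))


def make_case(present):
    """Code a case from the string of conditions that are present in it."""
    return {name: name in present for name in CONDITIONS}


def flips_outcome(case, condition):
    """Does changing this one condition change the outcome in this case?"""
    toggled = dict(case)
    toggled[condition] = not case[condition]
    return outcome(toggled) != outcome(case)


case_1 = make_case("ABC")   # Y arises through the first pathway
case_2 = make_case("DEF")   # Y arises through the second pathway

# A method-of-agreement comparison looks for conditions that the two
# positive cases share; here there are none, so every variable is rejected.
shared = {name for name in CONDITIONS if case_1[name] and case_2[name]}

print(outcome(case_1), outcome(case_2), shared)
print(flips_outcome(case_1, "A"), flips_outcome(case_2, "A"))
print(outcome(make_case("A")))
```

Condition A changes the outcome in the first case but not in the second, and on its own it produces nothing, which is the sense in which a variable can contribute to an outcome in some contexts but not in others.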
This advice, however, raised the risk of ‘conceptual To compensate for these limits of controlled
stretching’ (Sartori 1970), or of lumping together comparison, George developed the ‘within case’ methods
dissimilar cases under the same definitions. Possibly of ‘congruence testing’ and ‘process tracing’ as means
for this reason, Lijphart later placed greater emphasis of checking on whether inferences arrived at through
instead on the controlled comparison of most similar case comparisons were spurious (see Pattern
cases as a basis for causal inference (Lijphart 1975, Matching: Methodology). In congruence testing, the
Collier 1993). researcher checks whether the prediction a theory makes
Eckstein, in contrast, focused on the use of case in a case, in view of the values of the case’s independent
studies for theory testing and argued that even single variables, is congruent with the actual outcome in the
case studies could provide tests that might strongly case. In process tracing, the researcher examines
support or impugn theories. In so doing, Eckstein whether the causal process a theory hypothesizes in a
developed the idea of a ‘crucial case,’ or a case that case is in fact evident in the sequence and values of the
‘must closely fit a theory if one is to have confidence in intervening variables in that case. Thus, process
the theory’s validity, or, conversely, must not fit equally tracing might be used to test whether the residual
well any rule contrary to that proposed’ (Eckstein differences between two similar cases were causal or
1975, his emphasis). Eckstein argued that true crucial spurious in producing a difference in these cases’
cases are rare, so he pointed to the alternative of ‘most outcomes. Process tracing can perform a heuristic
likely’ and ‘least likely’ cases. A most likely case is one function as well, generating new variables or hypo-
that is almost certain to fit a theory if the theory is true theses on the basis of sequences of events observed
for any cases at all. The failure of a theory to explain inductively in cases.
a most likely case greatly undermines our confidence George (1979 a, 1979 b) also systematized case study
in the theory. A least likely case, conversely, is a tough procedures by developing what he called the method
test for a theory because it is a case in which the theory of ‘structured focused comparison.’ In this method,
makes only a weak prediction. A theory’s ability to the researcher systematically: (a) specifies the research
explain a least likely case is strong evidence in favor of problem and the class of events to be studied; (b)
the theory. In this way, Eckstein argued, even single defines the independent, dependent, and intervening
case studies could greatly increase or decrease our variables of the relevant theories; (c) selects the cases
confidence in a theory or require that we alter its scope to be studied and compared; (d) decides how best to
conditions. characterize variance in the independent and
Alexander George (1979a, 1979b) further dependent variables; and (e) formulates a detailed set of
developed case study methods by refining ‘within-case’ standard questions to be applied to each case. In
analysis and cross-case comparisons in ways that help addition, consistent with his emphasis on equifinality,
each method compensate for the limits of the other. George argued that case studies could be especially
George argued, as Mill himself had, that the ‘method useful in developing what he called ‘typological
of difference’ and the corresponding practice of theories,’ or contingent generalizations on ‘the variety
comparison of most similar cases could lead to spurious of different causal patterns that can occur for the
inferences. One reason for this is that no two phenomena in question … [and] the conditions under
nonexperimental cases achieve the ideal of being similar in which each distinctive type of causal patterns occurs’
all respects but one independent variable and the (George 1979a, his emphasis). He thus advocated a
outcome. Thus, there is always the danger that left-out kind of ‘building block’ approach to the development
variables or residual differences in the values of the of theories in which each case, while rendered in terms
independent variables account for the difference in of theoretical variables, might prove to be a distinctive
the outcomes of similar cases (see Human– causal pathway to the outcome of interest.
Environment Relationship: Comparative Case Studies). In the 1980s and 1990s, thousands of books and
In addition, as Mill recognized, phenomena might be articles made use of these improvements in case study
characterized by what general systems theorists have methods in a wide variety of social science research
termed ‘equifinality,’ or the condition in which the programs. Meanwhile, scholars continued to elaborate
same outcome can arise through different causal case study methods and articulate the ways in which
pathways or combinations of variables. Thus, there they differed from statistical methods. David Collier,


reviewing the development of case study and com- more easily addressed by case studies, and causal
parative methods, argued that these methods have effects, which are best assessed through statistical
advantages in defining and measuring qualitative means, are essential to the development of causal
variables in conceptually valid ways and fore- theories and causal explanations (George and Bennett
stalling the problem of conceptual stretching (Collier 2001).
1993). Charles Ragin argued that qualitative methods Another relevant development in the philosophy of
were also better than statistical methods at accounting science has been the resurgence of interest in Bayesian
for equifinality and complex interaction effects. Al- logic, or the logic of using new data to update prior
though statistical methods can model several kinds of confidence levels assigned to hypotheses. Bayesian
interaction effects, Ragin noted, they can do so only at logic differs from that of most statistics, which eschew
the cost of requiring a larger sample size, and models reliance on prior probabilities. Eckstein’s crucial, most
of nonlinear interactions rapidly become complex and likely, and least likely case study designs implicitly use
difficult to interpret. Ragin also introduced the method a Bayesian logic, assigning prior probabilities to the
of Qualitative Comparative Analysis, which uses likelihood of particular outcomes (McKeown 1999).
Boolean algebra to reduce a series of comparisons of One new development here is the refinement of
cases to the minimum number of logical statements or Eckstein’s approach, taking into consideration the
hypotheses that entail the results of all the cases likelihood of an outcome not just in view of one
compared (Ragin 1987). This method, he argues, theory, but in the presence of alternative hypotheses.
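The idea of using Boolean algebra to reduce a set of case comparisons can be illustrated with a deliberately simplified sketch. The code below is not Ragin’s full procedure and does not reproduce any QCA software package; it shows only the basic pairwise reduction step, applied repeatedly: two configurations with the same outcome that differ in a single condition are merged, and that condition is dropped. The three conditions A, B, and C and the coded configurations are hypothetical.

```python
from itertools import combinations


def merge(a, b):
    """Merge two configurations that differ in exactly one coded condition.
    None marks a condition already dropped as irrelevant."""
    differing = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(differing) != 1:
        return None
    i = differing[0]
    if a[i] is None or b[i] is None:
        return None
    return a[:i] + (None,) + a[i + 1:]


def reduce_configurations(configs):
    """Repeat pairwise merging until no further reduction is possible."""
    current, irreducible = set(configs), set()
    while current:
        merged, used = set(), set()
        for a, b in combinations(current, 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        irreducible |= current - used   # keep what could not be merged
        current = merged
    return irreducible


def render(config, names):
    """Upper case = condition present, lower case = condition absent."""
    terms = [n.upper() if v else n.lower()
             for n, v in zip(names, config) if v is not None]
    return "*".join(terms)


# Hypothetical configurations (A, B, C) observed with a positive outcome.
positive = {(1, 1, 1), (1, 1, 0), (0, 0, 1)}
reduced = reduce_configurations(positive)
statements = sorted(render(c, "ABC") for c in reduced)
print(statements)
```

Here the three positive configurations reduce to two statements, A*B and a*b*C, so two distinct pathways survive the reduction. A full analysis would also have to handle contradictory configurations, unobserved combinations, and the choice of a minimal set of statements, none of which this sketch attempts.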
makes comparisons among cases in ways that treat If a case is ‘most likely’ for a theory, and if the
them inherently as configurations of variables, and alternative hypotheses make the same prediction, then
that thus allow for the possibility of equifinality and the theory will be strongly impugned if the prediction
complex interactions (see Configurational Analysis). does not prove true. The failure of the theory cannot
Both Collier and Ragin also noted the limitations of be blamed on the influence of the variables highlighted
case study methods, including the potential for inde- by the alternative hypotheses. Conversely, if a theory
terminacy when attempting to sort out rival explan- makes only a weak prediction in a ‘least likely’ case,
ations in a small number of cases, the difficulty of the alternative hypotheses make a different prediction,
attaining a detailed understanding of more than a few but if the first theory’s prediction proves true, this is
cases, and the inability to make broad generalizations the strongest possible evidence in favor of the theory
on the basis of small numbers of cases. (Van Evera 1997). This helps address the central
problem of a Bayesian approach—that of assigning and
justifying prior probabilities—even if it does not fully
3. New Developments in Case Study Methods resolve it.
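The Bayesian reading of most likely and least likely cases discussed in this article can be made concrete with Bayes’ rule. The sketch below is illustrative only: the prior of 0.5 and all of the likelihoods are hypothetical numbers chosen to mimic the two designs, not values drawn from any study, and the alternative hypotheses are collapsed into a single ‘theory is false’ condition for simplicity.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule for a binary hypothesis: P(theory | evidence)."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1.0 - prior) * p_evidence_if_false)


prior = 0.5  # hypothetical starting confidence in the theory

# Most likely case: if the theory is true it should almost surely fit
# (assumed 0.95), so a failure has probability 0.05 under the theory;
# a failure is assumed to have probability 0.50 if the theory is false.
after_failed_most_likely = posterior(prior, 0.05, 0.50)

# Least likely case: the theory predicts the outcome only weakly (0.30),
# while the alternatives make it very improbable (0.03).
after_passed_least_likely = posterior(prior, 0.30, 0.03)

print(after_failed_most_likely, after_passed_least_likely)
```

With these assumed numbers, failing the most likely case lowers confidence from 0.5 to roughly 0.09, and passing the least likely case raises it to roughly 0.91. The result depends entirely on the assigned probabilities, which is the problem of justifying priors that the text notes is only partly resolved.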
The continuing development of the logic of hypo-
The thousands of applications of case study methods thesis testing has also been relevant to case study
in the last two decades have provided fertile ground methods (see Hypothesis Testing: Methodology and
for further methodological refinements. Three key Limitations). On this topic, Imre Lakatos argued that
recent developments include the strengthening of a theory can be considered progressive only if it
linkages between case study methods and the phil- predicts and later corroborates ‘new facts,’ or novel
osophy of science, the elaboration of the concept of empirical content not anticipated by other theories
typological theories, and the emergence of elements of (Lakatos 1976). This criterion helps provide a stan-
consensus on the comparative advantages and limita- dard for judging whether process tracing, the desig-
tions of case study methods. nation of new subtypes, and the proposal of new
theories from heuristic case studies are being done in a
progressive or regressive way. It also provides a
3.1 Case Studies and the Philosophy of Science
philosophical basis for arguing that a hypothesis can
With regard to the philosophy of science, the ‘scientific be derived from one set of observations within a case
realist’ school of thought has emphasized that causal and then to some extent tested against the ‘new facts’
mechanisms, or independent stable factors that under or previously unexamined or unexpected data that it
certain conditions link causes to effects, are important predicts within that same case, although independent
to causal explanation (Little 1998). This has resonated corroboration in other cases is usually advisable as
with case study researchers’ use of process tracing to well (Collier 1993).
uncover evidence of causal mechanisms at work. It has
also provided a philosophical counterpoint to at-
3.2 Typological Theories and ‘Fuzzy Logic’
tempts by researchers from the statistical tradition to
place ‘causal effects,’ or the expected difference in A second recent development in case study methods
outcomes brought about by the change in a single has been the elaboration of the concept of typological
independent variable, at the center of causal expla- theory. Typological theories occupy a middle ground
nation (King et al. 1994). Case study researchers between covering laws, or highly general abstract
have argued that both causal mechanisms, which are propositions, and causal mechanisms. Typological


theories identify recurring conjunctions of mechanisms and provide hypotheses on the pathways through which they produce effects. Thus, like QCA, typological theories treat cases as configurations. Unlike QCA, they do not attempt to reduce the number of theoretical statements about the variables, but retain a diverse and admittedly complex set of contingent generalizations, with potentially one generalization per type. Consequently, typological theories are well suited to modeling equifinality.
To construct typological theories, researchers first specify the variables and use them to define the typological space, or the set of all mathematically possible combinations of the variables (this is sometimes termed a truth table in the philosophy of science). At first this may seem to produce an unmanageably large number of combinations: a model with five dichotomous variables, for example, would have 2^5 = 32 possible types. However, once the researcher begins to categorize extant cases in a preliminary way into particular types, it often becomes possible to narrow the range of cases of interest for study. Many types may remain empty, with no extant cases. Some types may be overdetermined for the outcome of interest, and hence not worthy of study unless they have an unexpected outcome. From among the cases and types that remain, the researcher can use the preliminary categorization of cases within the typological space to help identify most likely, least likely, most similar, least similar, and crucial cases for study. Cases in the typological space with unexpected outcomes, or deviant cases, can help identify new causal pathways that can be added to the existing theory in a kind of ‘building block’ approach (George and Bennett 2001). A related development concerns the concept of ‘fuzzy logic’ (Ragin 2000). Fuzzy logic treats cases as configurations, but rather than using dichotomous or trichotomous variables and categorizations of cases, it allows the use of scaling to give a score on the extent to which a case fits into a certain type. In other respects, the use of fuzzy logic proceeds in ways much like those of typological theories.

3.3 The Emerging Consensus on the Strengths and Limits of Case Study Methods
A third development is that while several debates on case study methods continue, others have moved toward synthesis or even closure, and the overall picture is of an emerging consensus on the advantages and limitations of case study methods. As noted above, researchers from a variety of methodological traditions have recognized that because case studies can include many observations, they do not suffer from an inherent degrees of freedom problem. At the same time, it is also widely agreed that particular case studies may suffer from indeterminacy, or an inability to exclude all but one explanation on the basis of available process tracing evidence (Njolstad 1990). When this occurs, it may still be possible to narrow the number of plausible explanations, and it is also important to indicate as clearly as possible the extent to which the remaining hypotheses appear to be complementary, competing, or incommensurate in explaining the case.
Second, most case study researchers have readily acknowledged the limits of Mill’s methods. Ragin’s alternative of qualitative comparative analysis makes less restrictive assumptions, but its results are highly sensitive to changes in the measurement or coding of a single case (Goldthorpe 1997). There has thus been a movement toward typological theories and fuzzy logic, which make still less restrictive assumptions than QCA and are not so sensitive to the results of a single case. In addition, there is growing consensus that the use of within-case methods of analysis helps provide a check on the potential spuriousness of cross-case comparisons (Collier 1993, Mahoney 1999, George and Bennett 2001). Case study researchers consequently seldom if ever rely on case comparisons alone.
Third, there is growing recognition that the case selection criteria necessary for statistical studies are in some respects inappropriate for case studies. Random selection in a case study research design, for example, can result in worse biases than intentional selection (King et al. 1994). There is also increasing understanding that, consistent with the reliance of some case study designs on a Bayesian logic, case studies are sometimes intentionally selected not to be representative of some wide population but to provide the strongest possible inferences on particular theories (McKeown 1999). There is still disagreement between those who warn against any selection on the dependent variable (King et al. 1994) and those who argue that selection on the dependent variable is appropriate for some research objectives (Collier and Mahoney 1996, Ragin 2000, George and Bennett 2001). Related to this is a continuing disagreement over whether single case studies can make only limited contributions to theory building (King et al. 1994), or whether single case studies have indeed reshaped entire research programs (Rogowski 1995). There is wider agreement, however, that selection bias is potentially more severe in case studies than statistical studies because biased selection of case studies can overstate as well as understate the relationship between the independent and dependent variables (Collier and Mahoney 1996).
On the whole, discussions of these issues have moved toward an emerging consensus on the comparative advantages and limitations of case study methods. These methods’ advantages include the conceptualization, operationalization, and measurement of qualitative variables (conceptual validity), the avoidance of conceptual stretching, the heuristic identification of new variables and hypotheses (often through study of deviant cases), the assessment of
whether statistical generalizations offer plausible or spurious explanations of individual cases, the incorporation of equifinality and complex interaction effects, and the inferences made possible by combining within-case and cross-case analyses (Collier 1993, Munck 1998, George and Bennett 2001). It is possible that new statistical methods may be able to improve upon the statistical treatment of equifinality and interaction effects, and at least narrow the gap in the treatment of this issue, but the other comparative advantages of case study methods appear to be inherent in their differences from statistical methods.
The limits of case study methods include their inappropriateness for judging the relative frequency or representativeness of cases, their weakness at performing partial correlations and establishing causal effects or causal weight, the necessarily narrow and contingent nature of their generalizations, and the danger that selection bias can be more catastrophic than in statistical studies (Collier 1993, King et al. 1994, Munck 1998, George and Bennett 2001). Fortunately, these are precisely the strengths of statistical studies.

4. Future Directions in Case Study Methods
Just as the formalization of case study methods in the 1970s inspired a generation of more sophisticated research, the recent further refinements in these methods are likely to lead to still greater sophistication in their use in the social, biological, and even physical sciences. The increasingly evident complementarity of case study and statistical methods is likely to lead toward more collaborative work by scholars using various methods. The recent interest among rational choice theorists in using case studies to test their theories, for example, is an important step in this direction (Bates et al. 1998). Because case studies, statistical methods, and formal modeling are all increasingly sophisticated, however, it is becoming less likely that a single researcher can be adept at more than one set of methods while also attaining a cutting-edge theoretical and empirical knowledge of their field. Collaboration might therefore take the form of several researchers working together using different methods, or of researchers more self-consciously building on the findings generated by those using different methods. In either form, effective collaboration requires that even as they become expert in one methodological approach, scholars must also become conversant with alternative approaches, aware of their strengths and limits, and capable of an informed reading of their substantive results.

See also: Biographical Methodology: Psychological Perspectives; Case Study: Logic; Case-oriented Research; Configurational Analysis; Human–Environment Relationship: Comparative Case Studies; Psychotherapy: Case Study; Single-case Experimental Designs in Clinical Settings; Single-subject Designs: Methodology

Bibliography
Bates R H, Greif A, Rosenthal J, Weingast B, Levi M 1998 Analytic Narratives. Princeton University Press, Princeton, NJ
Collier D 1993 The comparative method. In: Finifter A W (ed.) Political Science: the State of the Discipline II. American Political Science Association, Washington, DC
Collier D, Mahoney J 1996 Insights and pitfalls: selection bias in qualitative research. World Politics 1: 56–91
Eckstein H 1975 Case studies and theory in political science. In: Greenstein F I, Polsby N W (eds.) Handbook of Political Science. Addison-Wesley, Reading, MA, Vol. 7
George A L 1979a Case studies and theory development. In: Lauren P G (ed.) Diplomacy: New Approaches in Theory, History, and Policy. Free Press, New York
George A L 1979b The causal nexus between cognitive beliefs and decision-making behavior: the ‘Operational Code.’ In: Falkowski L S (ed.) Psychological Models in International Politics. Westview, Boulder, CO
George A L, Bennett A O 2001 Case Studies and Theory Development. MIT Press, Cambridge, MA
Goldstone J 1997 Methodological issues in comparative macrosociology. Comparative Social Research 16: 107–20
Goldthorpe J 1997 Current issues in comparative macrosociology. Comparative Social Research 16: 1–26
King G, Keohane R, Verba S 1994 Designing Social Inquiry. Princeton University Press, Princeton, NJ
Lakatos I 1976 Falsification and the growth of scientific research programs. In: Lakatos I, Musgrave A (eds.) Criticism and the Growth of Knowledge. Cambridge University Press, Cambridge, UK
Lijphart A 1971 Comparative politics and the comparative method. American Political Science Review 65: 682–93
Lijphart A 1975 The comparable cases strategy in comparative research. Comparative Political Studies 8: 158–77
Little D 1998 Microfoundations, Method, and Causation. Transaction, New Brunswick, NJ
Mahoney J 1999 Nominal, ordinal, and narrative appraisal in macro-causal analysis. American Journal of Sociology 104: 1154–96
McKeown T 1999 Case studies and the statistical world view. International Organization 1: 161–90
Munck G L 1998 Canons of research design in qualitative analysis. Studies in Comparative International Development 33: 18–45
Njolstad O 1990 Learning from history? Case studies and the limits to theory-building. In: Gleditsch N P, Njolstad O (eds.) Arms Races: Technological and Political Dynamics. Sage, Newbury Park, CA
Przeworski A, Teune H 1970 The Logic of Comparative Social Inquiry. Wiley-Interscience, New York
Ragin C C 1987 The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. University of California Press, Berkeley, CA
Ragin C C 2000 Fuzzy-Set Social Science. University of Chicago Press, Chicago, IL
Ragin C C, Becker H S 1992 Introduction. In: Ragin C C, Becker H S (eds.) What is a Case? Exploring the Foundations of Social Inquiry. Cambridge University Press, Cambridge, UK
Rogowski R 1995 The role of theory and anomaly in social-scientific inference. American Political Science Review 89: 467–70
Sartori G 1970 Concept misformation in comparative politics. American Political Science Review 64: 1033–53
Van Evera S 1997 Guide to Methods for Students of Political Science. Cornell University Press, Ithaca, NY
Verba S 1967 Some dilemmas in comparative research. World Politics 20: 111–27

A. Bennett

Case-oriented Research

1. Introduction
Case-oriented research focuses on interconnections among parts and aspects within single cases. In this approach, the researcher attempts to make sense of each case as a singular, interpretable entity. In-depth knowledge of the cases included in a study is considered a prerequisite for the examination of patterns that might be observed across cases. Case-oriented researchers often study one case at a time, but they may also study multiple instances of a given phenomenon (e.g., comparable instances of ethnic conflict). The distinctiveness of case-oriented research is apparent when this approach is contrasted with the variable-oriented approach, where researchers focus more exclusively on cross-case patterns, without first gaining an understanding of each case.

2. Goals of Case-oriented Research
Today social scientists tend to identify case-oriented research with specific techniques of data collection linked to the observation and analysis of singular cases (e.g., direct observation of individuals at the micro level and archival research on nation-states at the macro level). While generally useful, the identification of case-oriented research with specific techniques of data collection is unfortunate, for it obscures basic differences between case-oriented research and conventional variable-oriented research. More fundamental than differences in methods of data collection is the contrast between goals (Ragin 1987). Case-oriented strategies are distinctive in that they are centrally concerned with making sense of a relatively small number of cases, selected because they are substantively or theoretically significant in some way (Eckstein 1975). Conventional variable-oriented strategies, by contrast, are centrally concerned with the problem of assessing the relationship between aspects of cases across a large number of generic ‘observations,’ usually with the goal of inferring general patterns that hold for a population.
For example, a researcher might use a case-oriented approach in order to study a small number of firms in an in-depth manner. Suppose these firms were all thought to be unusually successful in retaining their best employees while at the same time investing in them and thus enhancing their potential value to competing firms. To find out how they do it, a researcher would have to conduct an in-depth study of the firms in question. By contrast, a variable-oriented researcher might study the predictors of variation in rates of ‘employee retention’ across a large sample of firms. Is it more a matter of firm or industry characteristics? Do these two sets of factors interact? Useful answers to these questions would be based on careful analysis of relationships between variables, using data drawn from a survey of a large number of firms: the more (and the more varied), the better.
As these two examples show, what matters most is the researcher’s starting point: does the researcher seek to understand specific cases or to document general patterns characterizing a population? This contrast follows a longstanding division in all of science, not just social science. Georg Henrik von Wright argues in Explanation and Understanding (1971) that there are two main traditions in the history of ideas regarding the conditions an explanation must satisfy in order to be considered scientifically respectable. One tradition, which he calls ‘finalistic,’ is anchored in the problem of making facts understandable. The other is called ‘causal-mechanistic’ and is anchored in the problem of prediction. The contrast between case-oriented and variable-oriented research closely parallels this fundamental division. In the two examples just described, the first researcher uses the case-oriented approach in order to make certain facts understandable, for example, the spectacular success of a handful of firms in retaining their most valuable employees; the second researcher uses the variable-oriented approach in order to derive an equation predicting levels of retention, based on a large sample of firms, and to draw inferences from this equation to an entire population.
Once the distinction between case-oriented and variable-oriented research is established and their contrasting goals acknowledged, it is clear that the importance of techniques of data collection as bearers of the ‘case-oriented vs. variable-oriented’ distinction begins to fade. For example, it is clear that a researcher using case-oriented methods to study a handful of firms might benefit from conducting surveys of their employees and performing a conventional variable-oriented analysis of these data. The results of the survey would contribute to this researcher’s depth of knowledge about the firms in question, just as interviewing their top executives or studying their archives would contribute useful information. Likewise, it is clear that the researcher using variable-oriented methods to predict rates of retention could benefit from interviews of top executives or personnel
officers to help interpret results found in the analysis of the predictors of retention. Still, the first researcher would remain focused on the problem of understanding the handful of firms in question, while the second would remain focused on the problem of explaining variation in retention rates across a large number of firms and making inferences from this sample to a population. The important point here is that data collection techniques per se are relatively neutral; what matters most is the researcher’s goal and the contrasting research strategies that follow from different goals.

3. Logic of Case-oriented Research
The logic of case-oriented research is fundamentally configurational. Different parts of each case are understood in relation to one another and in terms of the total picture or package that they form together. The organizing idea in such research is that the parts of a case may constitute a coherent whole; that they have an integrity and coherence considered together. For example, researchers who study families often find that patterns of interpersonal accommodation are so enmeshed that a ‘dysfunction’ cannot be remedied without addressing many different aspects of a family all at once. Likewise, researchers who study cultures often observe that cultural traits seem to come in packages that defy easy disassembly. The ‘parts’ that case-oriented researchers examine can be quite varied. In macro-level research, for example, the ‘parts’ of a case might include institutions, path dependencies, social structures, historical patterns and trends, routine practices, cultural beliefs, singular events, event sequences, connections to other cases, the case’s larger environment, and so on. What matters most is that the investigator makes sense of multiple aspects of the case in an encompassing manner, using theory as a guide.
Donald Campbell (1975) offers a rough formalization of this approach in his examination of the logic of case-oriented research. Campbell argues that at first glance the in-depth study of a single case appears to be lacking in scientific merit because there is only one case to explain and many possible explanations to choose from. He notes, however, that case-study researchers routinely ‘reject’ theories because they do not explain the facts of their case. Further, despite having only one case, they must often struggle to find theories that work. Why is it so difficult? The key to this puzzle is the simple fact that every theory has many implications, relevant both to features of the case in question and to causal processes and sequences operating within the case. Thus, researchers evaluate many theoretical implications relevant to their cases to see if their cases conform to expectations. Not all features of a case may be compatible with the initial theory, and the researcher must either find an alternate theory that works better, revise an existing theory, or propose an entirely new one (Walton 1992).
Campbell (1975) suggests that each separate theoretical implication can be seen as a different ‘observation’ for ‘testing’ a theory. Thus, a single case becomes many observations, some contradicting and some supporting competing theories. Collectively, competing theories and their different implications define all theoretically relevant aspects of the case in question. The researcher’s task is to see which theory does the best job of explaining aspects of the case relevant not only to its own implications, but also to the implications of competing theories. Thus, after defining and selecting relevant aspects of the case, the researcher assesses the explanatory power of each theory. The theory (or combination of compatible theories) that best covers both its own implications and those of competing theories prevails and provides the basis for the investigator’s representation of the case.
In the end, the researcher crafts an explanation that satisfies as many theoretical implications as possible in a coherent manner. The success of a case study hinges on (a) the number of relevant aspects of the case the researcher can encompass with his or her explanation, (b) the success of the researcher in showing that his or her portrait of the case actually makes sense of all theoretically relevant aspects, and (c) the agreement of other scholars that all relevant aspects of the case in question have in fact been addressed by the researcher in a convincing manner. Of course, the most successful case studies accomplish much more than theoretical and substantive coherence. They also may advance theory or establish important lessons for policy makers (White 1992). Still, theoretical and substantive coherence should be considered preconditions for these more ambitious goals (Yin 1994).
Campbell’s (1975) argument that case-oriented researchers test multiple theoretical implications for each case underscores the configurational nature of this type of research. Because theoretical implications direct the researcher’s attention to ‘observations’ (i.e., different aspects of a single case) that cannot be independent from one another, the researcher must make sense of them all at once, as a package. Furthermore, because different theories typically have implications about different aspects of the case, the researcher’s attention is directed to a broad range of aspects, all of which may be interconnected in some way. Thus, case-oriented researchers must examine overlapping configurations of aspects as they weigh the explanatory power of competing theories.

4. Case-oriented Research on Multiple Instances
Case-oriented researchers sometimes conduct a series of case studies, selecting cases strategically so that the knowledge the cases generate accumulates. For example, a specific type of evidence that is not available
in one case may prompt the investigator to select for his or her next study a case that offers this evidence. In this manner, case-oriented researchers may develop theoretical arguments in much the same way that psychoanalysts build theory from the analysis of a series of patients, admitting to their practice patients that promise to enhance their thinking and their theories in some way.
This case-by-case, grounded approach can be contrasted with a research strategy that takes multiple instances of a phenomenon as its starting point. For example, a researcher might be interested in the formation of green parties in Western Europe and focus on a handful of parallel cases. The key contrast between the study of multiple instances and studying a series of cases is the timing of the selection of cases. In serial case studies, findings from one case determine the selection of the next case; in the study of multiple instances, by contrast, the researcher identifies multiple instances of a phenomenon at the outset of an investigation. Of course, the researcher’s definition of the population of cases may shift in the course of the research as the investigator learns more about the phenomenon (see Ragin 1997). The key point is that the study of multiple instances begins with a preliminary specification of relevant cases.
Designs appropriate for the case-oriented study of multiple instances are varied, with the number of cases ranging from two to as many as the researcher can study in a detailed manner. Still, there are two general approaches to the study of multiple instances: ‘bottom-up’ and ‘top-down.’ Neither strategy is practiced in pure form, and there are also many mixed and hybrid strategies in between these two hypothetical extremes, just as there are many ways to blend and mix the study of multiple instances, chosen all at once, and the grounded selection of cases one at a time.
The bottom-up approach to multiple instances attempts to develop a more or less succinct explanation of each case, in relative isolation from other cases. Of course, it is impossible for researchers to wear blinders or to forget what they have already learned, and knowledge of one case will inevitably impinge on the examination of subsequent cases. Still, the goal of the bottom-up approach is to give each case, in effect, a separate voice and thereby allow for maximum diversity. After completing each separate case study, the researcher turns to the task of comparing cases, and each case is used as a lens for viewing other cases. The goal of this cross-fertilization of case studies is to help the researcher identify all possibly relevant factors and to revise understandings of each case using insights gained from other cases. On the basis of these comparisons, the researcher builds an analytic frame for the outcome in question, which embraces all causally relevant conditions that have some degree of cross-case relevance. In the end, the researcher builds a more complete account of each case and also develops a strong basis for identifying both cross-case patterns and different types of cases (e.g., different paths to some outcome).
The top-down approach is more theory centered and more dependent upon prior knowledge and research. Based on existing theoretical and substantive knowledge, the researcher develops an analytic frame for the outcome under investigation, much like a survey researcher develops a questionnaire. This ‘instrument’ is then applied to each case in a relatively uniform and structured manner. That is, each case is ‘interrogated’ in much the same way. A central goal of the top-down approach, in addition to the goals of reliability and replicability, is to ensure that the cases selected for in-depth study have equal voice and the results are not skewed toward the most prominent cases, the cases with the most or best data, or the cases that the researcher simply happens to know best. While the analytic frame for the top-down approach is established at the beginning of the research, it is, of course, open to revision as researchers learn more about their cases.
The top-down approach also may be used when researchers seek to study a specific range of cases or specific categories of cases. In this approach, a preexisting analytic frame is used to guide the selection of instances for in-depth study. The goal is to select cases that maximize variation on causally relevant conditions while economizing on the number of cases. For example, a researcher might construct a two-by-two table cross-tabulating levels of two main causal conditions (e.g., high versus low levels of development and high versus low levels of democracy) and then seek to ‘fill’ each cell of this table with at least one case. The investigator would then conduct in-depth studies of the selected cases, knowing that important causal conditions are well represented in the set of instances chosen for examination.

5. Incongruities Between Case-oriented and Variable-oriented Research
Case-oriented research is distinct from variable-oriented research in many ways. The distinctiveness of the case-oriented approach is often overlooked, and researchers using different strategies talk past each other when discussing their methods. While some scholars emphasize philosophical differences between case-oriented and variable-oriented strategies, there are striking and profound differences between the two approaches at the ‘practical level.’ The practical level refers to the relatively mundane procedures researchers use when working with evidence to produce social scientific representations of what they have learned, that is, summaries of findings and patterns.
The importance of practical differences as a basis for miscommunication between case-oriented and variable-oriented researchers can be seen clearly in the contrasts between two very simple procedures: the
search for commonalities across multiple instances of a phenomenon in case-oriented research and the examination of the correlation between two variables across a sample of observations in variable-oriented research. The search for commonalities is usually part of an effort to derive limited generalizations from the in-depth study of a modest number of instances. For example, a researcher might attempt to identify common antecedent conditions for an outcome these cases share. This researcher might argue further that the common antecedents are necessary conditions for the outcome (that is, if theoretical and substantive knowledge support this interpretation). For example, a researcher conducting an in-depth study of a select group of firms (those that are able to both invest in and retain their best employees) might identify causally relevant features shared by these firms. Shared features, in turn, might point to a possible formula for success and offer lessons for other firms. The search for commonalities across a select group of cases is the most basic case-oriented strategy applied to multiple instances. By contrast, the examination of a correlation between two variables across a sample of observations in variable-oriented research is usually conducted with the goal of establishing that two conditions or aspects covary. For example, high rates of retention might be correlated with high levels of employee compensation across a sample of firms. This correlation, in turn, might be interpreted causally: the researcher would argue that one reason firms differ in their employee retention rates is that they differ in how well they compensate employees.
While both procedures seem deceptively simple and straightforward, these two ways of producing results from evidence involve sharply contrasting practical orientations toward cases, outcomes, and causal conditions:

5.1 Cases
When a researcher using variable-oriented methods computes a correlation between two variables, the relevant cases become more or less invisible and variables take center stage. Furthermore, the set of cases included in the computation must be fixed before the researcher can compute the correlation. Once this set is fixed, usually at the outset of an investigation, it is rarely altered. What matters most is that the cases (which are understood as ‘observations’) belong to the same general ‘population’ and that they be drawn from this population with an eye toward randomness or representativeness or some combination of these criteria.
In a case-oriented study of commonalities, by contrast, cases have clear identities and are usually chosen specifically because of their substantive significance or theoretical relevance. Furthermore, the set of cases do not ‘fit’ with the others. For example, a researcher studying ‘firms that invest in and retain their top employees’ might decide that several firms originally thought to belong to this group really don’t belong, and that maybe one or more that were thought to be outside this group actually belong in it. This flexibility is maintained throughout the investigation because the core concepts (e.g., ‘investing in employees’) may be revised as the researcher learns more about relevant instances.

5.2 Outcomes
In correlational studies researchers usually identify a ‘dependent variable,’ an outcome that varies across cases. Typically, such outcomes are aspects of cases that vary by level, for example, level of satisfaction, level of bureaucratization, and so on. Sometimes the outcome variable is categorical, indicating whether or not some event has occurred (e.g., filing a complaint), and sometimes it is a frequency or a rate (e.g., rate of employee retention). The important consideration, in this procedure, is that the outcome vary across ‘observations.’ The goal of the research typically is to explain, if possible, why cases have different values on the dependent variable. Such research is centrally concerned with the question of ‘why.’ For example, a researcher might seek to explain why some employees are more satisfied than average and others less so, why some firms have higher retention rates than average and others less than average, and so on.
In a case-oriented study of commonalities, by contrast, the outcome is often something that does not vary substantially across the chosen cases. In a study of firms, for example, cases might be chosen precisely because they all display the same outcome: a specific pattern of successful retention. Recall that the goal of a case-oriented study of commonalities is to identify common causal conditions linked to a specific outcome across a relatively small number of purposefully selected cases. Thus, the focus is on cases with a specific outcome, not cases that vary widely in how much they display this outcome. While the outcomes in a study of this type will not be exactly identical across cases, the researcher must demonstrate that the outcomes in the cases selected are in fact enough alike to be treated as instances of the same thing. Finally, unlike correlational studies, which are centrally concerned with the question of ‘why’ (as in: why some more than others?), case-oriented studies are centrally concerned with the question of ‘how’ (as in: how does it happen?). How do firms retain their most valuable employees?

5.3 Causal Conditions
In a correlational study, causation may be inferred
relevant cases may shift during the investigation from a pattern of covariation. If a variable thought to
because the researcher may decide that one or more represent a cause or to be an indicator of a key causal

1522
Case-oriented Research

condition is strongly correlated with the outcome variable, then the researcher may make a causal inference. Usually, the researcher will assess the relative strength of several causal variables at the same time. The typical goal is either to find out which one explains the most variation in the outcome variable or simply to assess the relative importance of the different independent variables. In effect, variables compete with each other to explain variation. In most investigations, each causal variable is considered sufficient, by itself, for the outcome or an increment in the outcome. That is, each one is considered an ‘independent’ variable capable of affecting the outcome variable, regardless of the values of other causal variables.

In a case-oriented study of commonalities, by contrast, causation is typically understood conjuncturally. The goal of this type of analysis is to identify the main causal conditions shared by relevant cases. Causal conditions do not compete with each other, as they do in correlational research; they combine. How they combine or ‘fit together’ is something that researchers try to discern using their in-depth knowledge of cases. Because all the cases have more or less the same outcome, the usual reasoning is that the causally relevant conditions shared by cases provide important clues regarding which factors must be present to produce the outcome in question. When constructing this argument, the researcher is especially sensitive to the possibility that a given causal requirement (i.e., a necessary condition) may be met in a variety of different ways.

These and other practical differences in how researchers using case-oriented versus variable-oriented methods work with evidence to produce results provide many opportunities for disjunctures in findings. For example, from the perspective of variable-oriented work, the study of commonalities across a small number of instances is fraught with analytic sins and errors: (a) The number of cases is too small and too nonrandom to warrant any kind of inference (King et al. 1994). (b) The procedure ‘selects on the dependent variable’ (i.e., focuses on cases all with more or less the same value on the outcome variable). This practice may deflate otherwise robust correlations (Collier 1995, Collier and Mahoney 1996, King et al. 1994). (c) Researchers may drop cases that ‘don’t fit’ at various stages of the analysis, which seriously undermines any effort to generalize beyond the cases that remain (King et al. 1994). (d) The most important causal factors (i.e., causally relevant commonalities) do not vary and thus are impossible to assess, and so on (Lieberson 1991, 1994; Goldthorpe 1991, 1997).

Likewise, from the perspective of case-oriented work, the examination of the correlation between a causal and outcome variable across many cases is fundamentally flawed: (a) Typically, there are so many cases that there is no way for the researcher to know if they are all really comparable and thus belong together in the same analysis. (b) Fixing a population or sample boundary also fixes the assumption of homogeneity, which usually is not warranted. Investigators should be able to redefine the set of relevant cases as they learn more about them (Ragin 1997). (c) It is difficult to determine how something ‘comes about’ by comparing cases with different levels of the outcome. The partial instances (i.e., those with lower scores on the outcome variable) are likely to provide many false leads (Lijphart 1971, 1975, Ragin 1997). (d) It is pointless to try to isolate the ‘independent’ effect of any causal condition when several factors usually must combine for a particular outcome to occur (George 1979), and so on. In short, at a practical level the two approaches seem almost antithetical. It is no wonder that findings diverge and researchers often talk past each other (Rueschemeyer 1991; Ragin 1997).

6. Bridging Case-oriented and Variable-oriented Research

While the gulf between case-oriented and variable-oriented research seems great, it is possible to span it. Consider the following scenario: a researcher using case-oriented methods studies several instances of an outcome (e.g., a small number of firms that are very successful in investing in and retaining top employees), documents causally relevant commonalities, and then constructs a general, composite argument about how these firms do it. This argument leads to four specific recommendations (which might be labeled X1 to X4) based on the observed commonalities. A second researcher reads the report of this study and decides to evaluate it with a large sample using variable-oriented methods. This researcher collects information on a random sample of firms and finds that, as independent variables, X1 to X4 do not distinguish more successful from less successful firms, using various measures of employee retention. In short, the second researcher shows that there is no statistically significant difference in the retention rates for firms with and without these four aspects, considering these aspects one at a time or in an additive, multivariate equation.

What went wrong? Usually, the researcher using variable-oriented methods will claim that the first researcher’s ‘sample’ was ‘too small’ and ‘unrepresentative.’ Thus, the identification of X1 to X4 took advantage of specific aspects of the selected cases. The first researcher might counterattack by arguing that causally relevant commonalities identified through in-depth study are very difficult to represent as ‘variables,’ and that the second researcher’s crude attempt to operationalize them fell far short. Indeed, the first researcher might argue that it would take in-depth knowledge of each firm included in the variable-oriented study to capture these conditions appropriately and contextually.

These criticisms and countercriticisms are quite common. However, the incongruity between the two
hypothetical studies can be resolved without resorting to derogation. The first researcher in this example (the one using case-oriented methods) selected on instances of the outcome and identified four causally relevant conditions shared by the firms in question. In essence, this researcher worked backwards from the outcome to causes and thus identified potential necessary conditions for the outcome (Ragin 2000). Are these conditions truly necessary? In part, this is an empirical question. To gain confidence, the researcher should examine as many instances of the outcome as possible, to see if they agree in displaying these four causally relevant conditions (or their causal equivalents). But it is also a question about existing knowledge. Is the argument that these four conditions are necessary consistent with theoretical and substantive knowledge? Do they make sense as necessary conditions? If the researcher’s finding is consistent with existing substantive and theoretical knowledge, then the argument that these four conditions are necessary is strengthened.

How should the second researcher (the one using variable-oriented methods) respond to the argument that these four factors are necessary conditions? At a more abstract level, the specification of necessary conditions is relevant primarily to the identification of cases that are candidates for an outcome. Cases cannot be candidates for an outcome if they do not meet the necessary conditions. But many cases may meet the necessary conditions for an outcome and still not exhibit the outcome, because they lack additional conditions which, when combined with the necessary conditions, establish sufficiency for the outcome. In fact, the cases displaying the outcome may be only a small minority of those that meet the necessary conditions. Thus, while there are clear gains from specifying necessary conditions, as in the hypothetical case-oriented study, the identification of causally relevant commonalities shared by instances of an outcome does not establish the conditions that are sufficient for an outcome. Thus, the variable-oriented researcher’s finding that these four conditions do not distinguish low-retention firms from high-retention firms across a large sample of firms does not directly challenge the argument that these conditions are necessary.

The variable-oriented analysis of these four conditions across a large sample of firms is much more directly relevant to their sufficiency. To show that these conditions are jointly sufficient for the outcome, it would be important to demonstrate that when these four conditions are combined the outcome follows. In other words, the variable-oriented researcher could evaluate sufficiency by examining the correspondence between the combination of the four causes, on the one hand, and the outcome, on the other, across a large sample of firms. Still, this analysis would be an evaluation of sufficiency, not necessity, and the results of this analysis would not bear in a direct manner on the claim of necessity made by the case-oriented researcher.

This sketch identifies only one of several ways to span case-oriented and variable-oriented inquiry. The general and most important point is that it is possible to join these two approaches if researchers are careful to distinguish between necessity and sufficiency and to separate the analysis of these two aspects. More generally, this discussion underscores the distinctiveness of case-oriented research, especially the case-oriented study of multiple instances of an outcome.

See also: Case Study: Logic; Case Study: Methods and Analysis; Classification: Conceptions in the Social Sciences; Configurational Analysis; Explanation: Conceptions in the Social Sciences; Person-centered Research; Psychotherapy: Case Study; Single-case Experimental Designs in Clinical Settings; Single-subject Designs: Methodology; Time Series: General; Triangulation: Methodology

Bibliography

Campbell D T 1975 Degrees of freedom and the case study. Comparative Political Studies 8: 178–93
Collier D 1995 Translating quantitative methods for qualitative researchers: The case of selection bias. American Political Science Review 89: 461–6
Collier D, Mahoney J 1996 Insights and pitfalls: Selection bias in qualitative research. World Politics 49: 56–91
Eckstein H 1975 Case study and theory in political science. In: Greenstein F I, Polsby N W (eds.) Handbook of Political Science, Volume 7, Strategies of Inquiry. Addison-Wesley, Reading, MA
George A 1979 Case studies and theory development: The method of structured, focussed comparison. In: Lauren P G (ed.) Diplomacy: New Approaches in History, Theory and Policy. Free Press, New York
Goldthorpe J 1991 The uses of history in sociology: Reflections on some recent tendencies. British Journal of Sociology 42: 211–30
Goldthorpe J 1997 Current issues in comparative macrosociology. Comparative Social Research 16: 1–26
King G, Keohane R O, Verba S 1994 Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton University Press, Princeton, NJ
Lieberson S 1991 Small N’s and big conclusions: An examination of the reasoning in comparative studies based on a small number of cases. Social Forces 70: 307–20
Lieberson S 1994 More on the uneasy case for using Mill-type methods in small-N comparative studies. Social Forces 72: 1225–37
Lijphart A 1971 Comparative politics and comparative method. American Political Science Review 65: 682–93
Lijphart A 1975 The comparable cases strategy in comparative research. Comparative Political Studies 8: 158–75
Ragin C C 1987 The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. University of California Press, Berkeley, CA
Ragin C C 1997 Turning the tables: How case-oriented research challenges variable-oriented research. Comparative Social Research 16: 27–42
Ragin C C 2000 Fuzzy-Set Social Science. University of Chicago Press, Chicago
Rueschemeyer D 1991 Different methods–contradictory results? Research on development and democracy. In: Ragin C C (ed.) Issues and Alternatives in Comparative Social Research. E. J. Brill, Leiden, The Netherlands
Walton J 1992 Making the theoretical case. In: Ragin C C, Becker H S (eds.) What Is a Case? Exploring the Foundations of Social Inquiry. Cambridge University Press, New York
White H 1992 Cases are for identity, for explanation, or for control. In: Ragin C C, Becker H S (eds.) What Is a Case? Exploring the Foundations of Social Inquiry. Cambridge University Press, New York
Wright G H von 1971 Explanation and Understanding. Cornell University Press, Ithaca, NY
Yin R K 1994 Case Study Research: Design and Methods, 2nd edn. Sage, Thousand Oaks, CA

C. C. Ragin

Cassandra/Cornucopian Debate

The Cassandra/Cornucopian debate is one of many names given to an argument between two extreme positions on the prospects for human society and the environment in the face of population and economic growth. So-called Cassandras (after the Greek princess given the power of prophecy but denied the power of persuasion) contend that unchecked growth in numbers of people and material consumption rates will inevitably lead to environmental and social catastrophe. This group is alternately called doomsayers, pessimists, catastrophists, or neo-Malthusians and stereotypically is associated with ecologists and environmentalists. Cornucopians (after the Greek legend of the ‘horn of plenty’ that overflows with whatever its owner wishes for) believe that human ingenuity and free markets will allow the human species to adapt to any conceivable pressures caused by growth of the human enterprise. They are referred to alternately as optimists, panglossians, or exemptionists, and stereotypically are associated with free-market economists. The debate originated at least as far back as Malthus in the eighteenth century, flared up with special intensity in the late 1960s and 1970s over the issue of exhaustible natural resources, and continues in more subdued form today as reflected in the issue of global environmental change.

1. Historical Roots

The roots of the Cassandran position extend as far back as the Reverend Thomas Malthus, who in 1798 published the first and most pessimistic version of his tract Essay on the Principle of Population (Malthus 1798). In it, he argued that population tends to grow at a geometric rate (i.e., a constant percent increase, but ever-larger absolute increments) while necessities (particularly food supply) tend to grow at an arithmetic rate (constant absolute increments). Therefore, sustained population growth would inevitably outstrip its means of support, unless subjected to certain ‘checks.’ In the first edition of the essay, these ‘positive checks’ were war, pestilence, disease, and famine. The second edition introduced the possibility of ‘preventive checks,’ which included moral restraint, celibacy, late marriage, and abstinence. If the preventive checks did not slow population growth before it overran the limits of the land to support it, then the positive checks would come into play.

Malthus principally was interested in whether aid to the poor could be expected to lift them out of poverty (he concluded that it could not). Concern over the consequences of population and economic growth for human welfare and the environment has waxed and waned several times since the publication of Malthus’s essay. Particularly relevant to contemporary scholarship is the growth of neo-Malthusianism since World War II. Luten (1980) traces this recent history through concern over population growth and its impact on resource use and economic growth in the 1940s and 1950s (see, e.g., Osborn 1953, Coale and Hoover 1958), to pollution and pesticides in the 1960s (notably Carson 1962), and back to population and resources in the late 1960s and 1970s (Paddock and Paddock 1967, Ehrlich 1968, Meadows et al. 1972).

The Cornucopian position has roots that extend back to the mercantilist school of economic thought in the sixteenth century, which emphasized the positive effect of population growth on economic production. It was the mercantilist economist Jean Bodin who claimed ‘There is no wealth but men.’ More recently, Boserup (1965, 1981) argued that increased population size and its associated demand for food itself induced new agricultural production techniques and increased density facilitated the flow and application of new knowledge. In this way a growing population aided rather than hindered development.

These points of view and others have been reflected and expanded on in Cornucopian responses to the post-war rise of neo-Malthusian concern. For example, Barnett and Morse (1963) argued that technological progress can more than offset resource scarcity, while Simon (1981, 1996) and Kahn et al. (1976) have been the most visible Cornucopian standard-bearers in recent decades.

2. Cassandras

The most recent crest of neo-Malthusian concern occurred in the 1960s and 1970s, driven by a number of writers and researchers concerned with what they
saw as the inevitability of rising levels of pollution and increasingly scarce food supply and nonrenewable natural resources. Most of these arguments pointed to material consumption and especially population growth as the key drivers of these problems, and advocated immediate measures to reduce consumption and control population. For example, Hardin (1968) argued that population growth inevitably leads to the degradation of common property resources, the root of many environmental problems, and that the only solution to the ‘population problem’ was ‘mutual coercion, mutually agreed upon,’ that is, societies must agree to enforce limits upon their own reproductive behavior.

The best-known advocate of the Cassandra position was Ehrlich, who in 1968 published The Population Bomb, a short popular book declaring that the struggle to feed humanity was over and had been lost; hundreds of millions were destined to starve in the coming decades. It foresaw a future so dire that it advocated consideration of a system of ‘triage’ in foreign aid policy first recommended by Paddock and Paddock (1967). The system would cut off aid to countries such as India deemed beyond help due to rapid population growth and dim prospects for improvements in agricultural production and would focus instead on countries such as Pakistan that might possibly avoid massive starvation with the right kind of assistance.

In 1972 the sense of impending doom was heightened further by the publication of The Limits to Growth (Meadows et al. 1972), a book by a group of researchers at the Massachusetts Institute of Technology that argued that the human population was approaching the limit of the planet’s ability to sustain it, and without fundamental societal changes we would soon exceed that limit, triggering a dramatic increase in death rates, catastrophic environmental degradation, and social chaos. Their conclusion received an enormous amount of attention, in part because it was based on one of the first attempts to create a computer model of the world as a single system. They subjected their ‘World3’ model to an extensive sensitivity analysis, but in all cases it produced an ‘overshoot and collapse’ pattern: a surge in population followed sooner or later by a crash. The robustness of the result seemed to lend great credibility to their conclusions.

3. A Storm of Criticism

Both The Population Bomb and The Limits to Growth sold millions of copies, and this wave of academic and popular attention to a presumed approach to the planet’s carrying capacity generated a storm of criticism. The Limits to Growth itself received special attention by academic critics (Maddox 1972, Cole et al. 1973, Nordhaus 1973, Sanderson 1994), perhaps because its basis in computer modeling gave it an aura of academic respectability that more popular treatments lacked. Criticism covered a wide range of issues, but there were several common themes centered on the perceived lack of economic principles.

3.1 Technical Progress

Neoclassical economics views scarcity not as absolute but as relative and as a function of technology. Technical progress lowers the cost of making a resource more accessible. For example, improvements in irrigation systems make it cost effective to expand the area of irrigated land, essentially increasing the total land available for cultivation. Thus, expressing limits in fixed physical units, as was done in World3 and in other carrying-capacity arguments, left out the crucial role of technology in expanding the pool of resources available to society.

3.2 Adjustment Mechanisms

Many critics pointed out the lack of accounting for well-known adjustment mechanisms in response to scarcity, especially price systems. In an ideal market economy, as a resource becomes relatively more scarce, its price rises. This rising price induces a host of adjustments: recycling; conservation; the development of better methods of discovery, extraction, and production; and the substitution of other goods, all of which have the effect of essentially increasing supply. An often-quoted example is that although global oil consumption rates increased between the 1970s and 1990s, oil reserves grew rather than declined over this time period, a result of price-induced improvements in exploration, extraction, and efficiency.

The World3 model did not include a price system at all. It also aggregated the world into a single region, eliminating trade as a possibility, and aggregated over many goods so that substitution was not possible. Thus, many economists put little stock in its results, or in the conclusions of arguments that did not account for the ability of a market economy to respond to scarcity.

3.3 Methodological Weaknesses

Several critics focused on particular aspects of the methodology of The Limits to Growth study. Relationships between variables in the model may have been plausible, but were based on little more than the authors’ intuition, ignoring theory and evidence in economics, demography, and other fields. Also, several studies showed that the ‘overshoot and collapse’ behavior of their model was by no means inevitable. If changes in variables were made slowly rather than
suddenly, the model produced steady growth (Cole et al. 1973). If prices were included or the possibility of substitution was allowed for, or alternative assumptions about population growth or technical change were made, model outcomes changed drastically, in some cases producing continuous growth (Nordhaus 1973).

4. Cornucopians and their Critics

In many ways the Cornucopian credo is neoclassical economic reasoning taken to an extreme. Simon was the best-known Cornucopian. In The Ultimate Resource (1981; a 1996 follow-up did not alter the basic message) he argued in essence that on balance people create more than they destroy, so that the more people, the better the world. While he agreed that larger populations can create short-run scarcity and environmental degradation, he believed that these problems presented challenges to which society would always adapt, improving itself in the process. In the long run, society is better off (prices are lower, the environment is cleaner, and consumption is greater) than if the problems had never arisen.

Critics of the Cornucopian position seized on several perceived weaknesses. In general, it is often argued that the theory is not testable: any scarcity or environmental degradation that is encountered can be explained away as either a market imperfection or as a short-run problem that will in fact lead to a long-run benefit. By presuming to explain everything, it explains nothing. More specific criticisms have focused on several main points.

4.1 Discontinuities

Many critics argue that Cornucopians have a blind faith that the future will repeat the past. Because resource prices have generally fallen, and by some measures and in some places the environment has improved, it is presumed that these trends will always continue. Cornucopians argue that there is no reason to expect the future to be different from the past in terms of society’s ability to adapt to and benefit from scarcity-induced challenges. Cassandrans counter that there is every reason to expect the future to be different, since gradual increases in environmental degradation driven by the increasing scale of human impact are likely to eventually push ecosystems past thresholds beyond which decline will be rapid and perhaps uncontrollable.

4.2 Insufficient Accounting for Natural Science

The Cornucopian position is often accused of focusing on improvement in material consumption and human welfare at the expense of associated degradation of the environment. Many ecologists and other natural scientists considered the treatment of environmental impacts by Simon and others to be perfunctory and unconvincing at best, and deliberately misleading at worst. For example, while optimists often focused on improvements in the levels of pollution of, for example, DDT or PCBs (improvements that, Cassandras often point out, were driven in large part by the concern of environmentalists), they usually said little about larger issues such as climate change or biodiversity loss. Critics accuse them of being quick to dismiss scientific consensus when convenient to their argument, and of suggesting that such consensus was a conspiracy to obtain more research money.

4.3 Unfounded Assumptions

Simon’s argument was often attacked for its reliance on the assumption of a rather mechanical link between human population numbers and the assumed stock of knowledge and other forms of social capital. Sanderson (1980) points out that Simon assumes social capital grows automatically with the size of the population, and that it has an unusually large effect on production. To some extent, population growth’s positive long-run impact on economic production is built into Simon’s model by assumption.

5. Winners and Losers

Although the debate is ongoing, it is possible to assess the accuracy of some of the claims made so far by the two sides. For example, Malthus has turned out to be wrong so far, because he failed to anticipate the opening up of new agricultural lands, the scope for improvements in agricultural technology, storage, and transport, and the decline in Western fertility. Similarly, the predictions of mass starvation made in the 1960s and 1970s never materialized; the Green Revolution in agriculture raised yields much more than anticipated, and developing country population growth slowed as a transition to lower fertility was begun. Nor did protracted shortages in natural resources materialize. This outcome was highlighted by a much-publicized bet made between Simon and Ehrlich in 1980. The two wagered over whether the prices of five metals (chosen by Ehrlich) would rise over the decade (indicating scarcity); Simon won on all counts as the prices of all five fell. Furthermore, as of yet there are no signs of Limits to Growth-type scenarios coming to pass. Supporters of the project emphasize that independent of the value of their scenarios, there was value in the attempt to model the world as a system, however crudely, as a first step toward an important new direction of research. However, Sanderson (1994) points out that World3
was an ‘evolutionary dead-end’—even the authors’ Cassandra\Cornucopian debate may be contributing


own follow-up study in 1992 presented no substantial to a more productive discussion.
advance over the original model.
On the Cornucopian side, Simon’s argument that
population growth is unambiguously positive for long- See also: Environmental and Resource Management;
term economic growth was certainly influential in Environmental Policy; Environmental Vulnerability;
countering a focus on the negative consequences of Environmentalism: Philosophical Aspects; Enviro-
growth, but it was not adopted wholesale by econo- nmentalism, Politics of; Environmentalism: Preser-
mists. Mainstream economics has come to no strong vation and Conservation; Human–Environment
conclusion on whether population growth influences Relationships; Malthus, Thomas Robert (1766–
economic growth. More recently, studies in Asia 1834); Population Dynamics: Mathematic Models of
suggest that age structure effects of fertility reduction Population, Development, and Natural Resources;
in some settings can stimulate growth if combined Sustainable Development
with other economic and political measures.
Some other Cornucopian articles of faith have not
fared well. Nuclear energy did not solve the energy
problem, as several technological optimists predicted,
and while the oil price increases of the 1970s have Bibliography
since abated, the fossil-fuel-driven problem of global
climate change has not gone away. Barnett H J, Morse C 1963 Scarcity and Growth. Johns Hopkins
Press for Resources for the Future, Baltimore, MD
Boserup E 1965 The Conditions of Agricultural Growth. Aldine,
Chicago
Boserup E 1981 Population and Technological Change: A Study
6. Resolution? of Long-term Trends. University of Chicago Press, Chicago
Carson R 1962 Silent Spring. Houghton Mifflin, Boston
Attempts have been made to resolve the Coale A J, Hoover E M 1958 Population Growth and Economic
Cassandra\Cornucopian debate in a number of Deelopment in Low-income Countries. Princeton University
ways. Some conclude that the truth lies somewhere in Press, Princeton, NJ
between the two extremes: the outlook is neither as Cole H S D, Freeman C, Jahoda M, Pavitt K L R (eds.) 1973
rosy nor as dismal as these two camps believe. Others Models of Doom: A Critique of The Limits to Growth. Universe
conclude that each may have particular aspects of the Books, New York
story right. For example, Cornucopians may be right Ehrlich P R 1968 The Population Bomb. Ballantine Books, New
that we can feed an expanding population without York
massive famine, but Cassandras may be correct in Hardin G 1968 The tragedy of the commons. Science 162:
emphasizing that this will come at great environmental 1243–8
Kahn H, Brown W, Martel L 1976 The Next 200 Years: A
cost if current trends continue.
Scenario for America and the World. Morrow, New York
In the absence of resolution, lessons drawn from the Luten D B 1980 Ecological optimism in the social sciences: The
debate could facilitate progress on contemporary question of the limits to growth. American Behavioral Scientist
issues related to global environmental change. For 24(1): 125–51
example, the climate change issue has given rise to a Maddox J R 1972 The Doomsday Syndrome. McGraw-Hill, New
new group of models, called integrated assessment York
models, which share some features with the early Malthus T R 1798/1967 Essay on the Principle of Population,
world-systems models: they are often global in scope, 7th edn. Dent, London
they seek to incorporate feed-backs between society Meadows D H, Meadows D L, Randers J 1992 Beyond the
and the environment, and they cover long time-scales. Limits. Chelsea Green Publishing, Post Mills, VT
History offers lessons on the dangers of aggregation, Meadows D H, Meadows D L, Randers J, Behrens W W III
the importance of grounding relationships in theory 1972 The Limits to Growth. Universe Books, New York
and data, and the importance of transparent as- Nordhaus W D 1973 World dynamics: Measurement without
sumptions and treatments of uncertainty. In addition, data. The Economic Journal 83(332): 1156–83
Osborn F 1953 The Limits of the Earth. Little, Brown, Boston
the debate over the Earth’s capacity to sustain society
Paddock W, Paddock P 1967 Famine—1975! Little, Brown,
has not disappeared; current manifestations include Boston
the issue of human-induced loss of ‘ecosystem services’ Sanderson W C 1980 Economic-demographic simulation
such as soil aeration, pollination, air and water models: A review of their usefulness for policy analysis.
purification, and element cycling. Key economic Research Report. International Institute for Applied Systems
concepts left out of earlier arguments are being Analysis, Austria, RR-80-14
confronted: ecologists argue that there are no practical Sanderson W C 1994 Simulation models of demographic,
substitutes for ecosystem services, their supply is fixed, economic, and environmental interactions. In: Lutz W (ed.)
and their constraint on production seems to be near Population–Development–Environment: Understanding their
enough to matter. In this case, the history of the Interactions in Mauritius. Springer-Verlag, Berlin, pp. 33–71

Caste

Simon J L 1981 The Ultimate Resource. Princeton University nant place to Brahmanical texts as representatives of
Press, Princeton, NJ Indian society. An important intervention made by
Simon J L 1996 The Ultimate Resource 2. Princeton University McKim Marriott (1990) needs mention here. Marriott
Press, Princeton, NJ
provided a significant alternative to Dumont’s for-
mulation, arguing that different transactional strate-
B. O’Neill
gies defined the position of different castes—it was not
a simple case of hierarchy versus equality but rather of
a universe governed by a complex set of rules and
Caste strategies regarding matching, mixing, and marking
through which different regional and local configura-
tions of castes were generated. What was at stake for
The career of caste as an anthropological concept both Dumont and Marriott, despite their differences,
provides a fascinating terrain on which to examine was the representation of India as the ‘other’ of the
how anthropological objects come to be invented and modern West. They were much less interested in either
stabilized, and the relation they bear to patterns of the concrete historical processes through which insti-
sociality, needs of government, and production of tutions were formed or the contemporary changes in
knowledge. Many textbooks in sociology and social the caste system. It is instructive to compare this with
anthropology represent caste as a fundamental institu- the way that caste was rendered in the work of the
tion of Indian society. In this view castes are ranked in Indian anthropologist, M. N. Srinivas. Srinivas’s
a hierarchical order and are governed by rules of stake in the local and his deep concern with the way
endogamy, commensality, and purity–pollution that caste was shaping Indian democratic politics dis-
put strict limits on exchange of women and food tinguish him from these authors.
between castes. The hierarchical order, however,
permits the flow of women from lower to higher castes
through marriage and the flow of food from higher to 2. The Stake in the Local
lower castes. The caste system is seen to correspond to The decade of the 1950s saw the emergence of ‘village
a rough division of labor, though it is recognized that studies’ as an important genre of anthropological
certain occupations such as agriculture cut across writing in India. For Srinivas the village was an
different castes. This textbook picture of the caste indispensable site for fieldwork because of the im-
system, though influential, has come under serious portance he attached to what he called the ‘field-view’
scrutiny in recent years as being not only ahistorical as distinct from the overly textual view of the caste
but also ignoring the power/knowledge axis in the system. The two important concepts that Srinivas
production of social science concepts. (1971) formulated with far reaching consequences for
the study of caste were ‘sanskritization’ and ‘dominant
1. Caste as the Ideology of Indian society caste.’
Srinivas contended that a preoccupation with tex-
Louis Dumont’s (1980) Homo Hierarchicus has long tual sources had led many to assume that the caste
been regarded as an outstanding contribution to the system was a rigid hierarchical system. Attention to
understanding of caste. Dumont argued that principles historical and ethnographic processes on the ground,
of hierarchy and holism were central for explaining the on the other hand, showed that while individual
caste system. The principle of hierarchy in India, he mobility was severely restricted in the caste system,
proposed, was based upon the religious opposition group mobility had always been possible for castes in
between pure and impure—pollution incurred in the the middle rungs through the process of sanskritiza-
biological processes of life and death was removed in tion. This process referred to the ability of a caste
India not through processes of reciprocity (I bury your group, which had achieved economic or political
dead—you bury mine) but through principles of mobility, to stake a claim to a higher status by
hierarchy. The task of removal of pollution was recasting its customs and rituals to correspond more
assigned to the lower castes who became permanently closely to Brahmanical rituals and practices. The
imbued with it. Thus the separation between castes as process of change in ritual practices and social customs
well as their hierarchical ordering could be derived was not open to everyone. However, when a caste was
from the opposition between pure and impure. The successful in acquiring a measure of economic or
scheme had the simplicity and elegance to make the political power through physical mobility, opening up
bewildering diversity of Indian civilization immedi- of markets, or changed political alignments, it could
ately knowable, especially to Western readers. translate this into claims to a higher caste status. With
Dumont’s characterization of Indian society has the decline in traditional kingship brought on by the
been challenged on the ground that what he saw as a colonial rule, such functions were shifted to colonial
timeless ideology was itself a result of certain practices authorities. It is one of the exquisite ironies of the
of classification and enumeration instituted in the colonial rule that when caste began to be recorded
context of colonial administration that gave a domi- systematically in the census, it became a major source


for claiming of higher status by various caste groups production were geared towards standardizing
who adduced their sanskritic practices or their myths methods of revenue collection and it was by no means
of origin as evidence of their higher status. obvious that caste rather than village would be the
Unlike the concept of sanskritization, which as- obvious unit for the organizing of data. According to
sumed an interaction between local and regional levels, Smith it was only around 1850 that the census in the
the concept of the dominant caste emerged out of an case of Punjab was transformed from an instrument of
understanding of local and especially agricultural tax to an instrument of knowledge. Earlier census
society. Srinivas (1971) argued that the ritual hierarchy reports were more pragmatic and localistic in orien-
of caste notwithstanding, the dominant role in village tation. There was considerable tension between the
life was played by landowning peasant-proprietor concerns of centrist census officials collating data in an
castes who were rarely Brahmins. The relation between encyclopedic manner and the local officials who were
these rich peasant castes and the other castes in the concerned with recording the nature of social groups
village was characterized as a form of patron–client and categories for more practical purposes such as
relations. Thus Srinivas argued that the horizontal collection for revenue or settlement of disputes (see
solidarity of caste expressed by endogamy and com- Appadurai 1996). In order to understand the concerns
mensality was counterbalanced by the vertical soli- of the centrist census officials who recorded infor-
darity observable at the village level through patron– mation on caste for which there could have been no
client relations. Many institutions such as fictive practical use, it is necessary to turn to the scientistic
kinship within the village, ritual processes for creation notions of race in this period.
of boundaries, and the division of labor and economic In shifting to castes as the most natural groups
exchange, created the village as an economic, political, around which information was to be organized on
and ritual unit that established solidarities cutting Indian society, British officials relied on their notions
across caste. As the general understanding of locality of race and physical types. Herbert Risley ([1908],
in the field of anthropological theory deepened, 1969) was the most vocal proponent of using anthro-
interest shifted to the mapping of specific histories pometrical measures in conducting the ethnographic
through which localities were created. The search for survey of India because, according to him, anthro-
overarching structural principles seemed less fruitful pometry yielded particularly good results in India by
than understanding how the momentous changes that reason of the caste system, which allowed marriage
were taking place in Indian society impacted upon only within a limited circle. Relying on such symbolism
caste, at the local, regional, and national levels. as that of dark versus light skin color and shape of
nose and jaw, the category of caste was thus collapsed
with that of race. Ghurye (1932) was the first Indian
3. Colonial Constructions sociologist to criticize this view of caste and to
challenge its political implications.
Once the search for enduring structural principles gave It is not anyone’s case that the process of recording
way to historically grounded work, the colonial caste created this institution ex nihilo. What it did was
archive became an important source for understanding to freeze the ongoing processes by which caste came to
the nature of colonial rule and how it transformed the be solidified in the official imagination, and to generate
institution of caste. The processes of enumeration and the conception of community as enumerative, which
classification played a particularly important role in had a strong influence on processes of political
this process. Although the predecessor states of the representation (Cohn 1984). It is important to under-
British in India did have apparatuses for counting, score the fact that the census, gazetteers, reports, and
classifying, and controlling populations, these were other such forms of knowledge came to represent the
tied to specific needs of the state such as revenue power of official discourse to name and fix the status of
collection, or raising of temporary armies. The British caste groups in local imaginaries. Many caste groups
colonial state instituted a new way of collecting began to see the census as the source for claiming
information in the form of maps, settlement reports, higher status. Thus census commissioners were be-
revenue records, statistical information, censuses, sieged with petitions challenging a particular status
folklore, narratives, to name a few. ascribed to a caste.
This new form of governance, or ‘rule-by-records It is also interesting to see that the theories of the
and rule-by-reports’ in the felicitous phrasing of racial origin of castes had a serious impact on social
Richard S. Smith (1986) had a decisive influence on the and political mobilization, as in the anti-Brahmin
shaping of caste identities in the twentieth century. In movements in the South of India and in the mobiliza-
the processes of classifying and enumerating the tion of the untouchable castes under the leadership of
population, the British did not start with caste as if it Ambedkar in Western India—both of which used the
were a natural category found on the ground. In the idea of original settlers versus Aryan invaders. The
early phases of colonial rule the emphasis was on discursive formations around caste quotas in edu-
cadastral control. Statistics on land ownership, ten- cational institutions, government jobs, and reserved
ancy, crop production, and instruments of agricultural constituencies, as well as the politics of mobilization


around these issues, show the interaction and cir- occupations. Yet he preferred to retain the label
culation of categories between popular imagination ‘untouchable’ in his politics (Charsley 1996). In the
and official discourse (Dirks 1992). embroilment of nationalist politics with caste mobili-
zation, Gandhi’s strategy was to retain untouchables
within the fold of Hinduism while Ambedkar’s at-
tempt was to forge a new identity for them, but within
4. The Contemporary Context the Indian civilizational context—hence the choice of
neo-Buddhism as the new religion for the erstwhile
An important paradox about caste in public life in untouchables (Ambedkar 1946).
India is that while the Constitution of India recognizes The term ‘untouchable,’ like other such generic
only individuals as bearers of rights and duties, and terms masks considerable local heterogeneity. Its
bans any discrimination based on caste or religion, the importance lay in the binary division between un-
processes of democratic politics with the imperatives touchable and nonuntouchable castes created through
of grass root mobilization have created new arenas for the bureaucratic imperative of creating a single category
the evolution of caste-based politics (Béteille 1996, for purposes of reservations. Under the Government
Kothari 1970). Interestingly the legal prohibitions on of India Act of 1935, the term ‘scheduled castes’
caste and the political mobilization around caste replaced the earlier category of depressed classes.
reflect the contrary pulls of formal legal justice system Though untouchability was to be the criterion for
and the imperatives of representational politics in inclusion of castes into the list of scheduled castes, the
contemporary India. This can be illustrated with the names of the castes that were finally included in the
example of ‘untouchability’ and its contradictory state lists formed a kind of unity only through a
careers in law, politics, and everyday life (Galanter ‘common relationship their members have with
1972). It is not that a similar exercise cannot be done government’ (Dushkin 1972, p. 166). The fact that the
on other castes, but the example of untouchability has category of untouchability was basically a twentieth
a particular resonance because of its unique associ- century creation does not detract from either the
ation with the caste system. experiences of oppression of castes grouped under this
While it could conceivably be argued that some category, or the importance of the grounding of
form of caste-based discrimination is to be found in all struggles for greater equality under identity politics. In
the legal sanskritic texts, recent examination of the fact, the creativity of the new social movements can be
genealogy of the terms through which untouchability seen in the emergence of the term dalit (literally
makes an appearance in discourse shows the im- ‘downtrodden’) among the neo-Buddhists and sched-
portance of political and social processes in the uled castes in Maharashtra, which has become now
negotiation of group identities in democratic societies the most commonly accepted term of self-reference by
(Charsley 1996). The term ‘untouchability’ is ascribed such groups. The most interesting feature of this phase
to Sir Herbert Risley ([1908], 1969) and was part of his of the movement of dalits is the emphasis not only on
effort to classify and rank castes in the subcontinent as political action but also on representation of the
a whole. While the category shudra occurs in the experiences of untouchable castes through literature,
Sanskrit texts and is sometimes taken to be equivalent life histories, and collection of oral literature as a
to ‘untouchables,’ its referents are varied—ranging critique of caste society (Murugkar 1996, Omvedt
from kings, powerful landowning castes, to castes with 1995).
extreme disabilities. Prior to Risley, compilers of An interesting question arises as to whether politics
district gazetteers and state census reports had experi- is the only domain within which the legacy of
mented with other terms, such as depressed classes, oppression and humiliation is articulated in contem-
depressed castes, panchamas, pariahs, etc. porary India. How are caste identities performed in
The use of the term ‘untouchable,’ in public life, everyday life? Khare (1984) had suggested that instead
owed much to the reformist politics of early twentieth of playing the impure foil to the Brahmin, the
century, especially to Gandhi’s politics of reform and untouchables conduct themselves even at the com-
the agenda for the abolition of untouchability in the munity level as civilizational critics, showing a com-
nationalist movement (Gandhi 1954). In 1931 Gandhi plex relationship with caste ideology. Other important
adopted the term Harijan (people of god) and in- ethnographies of caste show how caste identity is
creasingly substituted it for other terms in his writings. performed in relation to everyday activities of work
While the prototypical Harijan for Gandhi was a and worship; how elaborate strategies for management
member of the bhangi caste who cleaned lavatories and of caste identities are evolved, and how these perfor-
was thus rendered ‘unclean,’ his major political op- mances come to be experienced as forms of em-
ponent, Ambedkar, who himself came from the Mahar bodiment.
caste of Bombay Presidency, was much more inter- Debates on caste have shown the intricate relations
ested in forging political alliances between the major between forms of esthetic, political, and social repre-
agricultural dependent castes, whose low status came sentations of caste. Simultaneously much of the
from their dependency rather than their polluting scholarly agenda of research in India in recent years


has been influenced by the public controversies on Smith R S 1986 Rule-by-records and rule-by-reports: Comp-
such issues as caste-based reservations or reintroduc- lementary aspects of the British imperial rule of law. Contribu-
tion of caste-based enumeration for the next decennial tions to Indian Sociology 19(1): 153–76
Srinivas M N 1971 Social Change in Modern India. University of
census. New patterns of sociality are evident in ways in
California Press, Berkeley, CA
which caste identity is performed in the context of
family, community, economy, and polity which recon- V. Das
ceptualize caste not as a sign of the exotic but of the
contemporary repositioning of the subjects and objects
of anthropology in new ways.

See also: Area and International Studies: Develop- Categorization and Similarity Models
ment in South Asia; Class: Social; Labor, Division
of; Social Stratification; South Asian Studies: Culture; Imagine meeting a friend’s pet, and having to decide
Subaltern History; Subaltern Studies: Cultural whether it is best classified as a cat or a dog. According
Concerns to prototype models of categorization, the pet is
compared with mental prototypes for cat and dog, and
if it is more similar to the cat prototype than to the dog
prototype, then it is classified as a cat. According to
exemplar models of categorization, the pet is compared
Bibliography with all remembered instances of cats and dogs, and if
it is more similar to known cats than to known dogs,
Ambedkar B R 1946 What Congress and Gandhi Have Done to the then it is classified as a cat. Notice that both types of
Untouchables. Thacker, Bombay, India models rely on the notion of similarity. For these
Appadurai A 1996 Number in the colonial imagination. In:
models to be usefully tested and compared with each
Appadurai A Modernity at Large: Cultural Dimensions of
Globalization. University of Minnesota Press, Minneapolis,
other, or with other models of categorization, similarity
MN, pp. 114–39 must be formally defined and empirically assayed,
Béteille A 1996 Caste in contemporary India. In: Fuller C J (ed.) along with formal and empirical treatments of
Caste Today. Oxford University Press, Delhi, India categorization. This article briefly considers empirical
Charsley S 1996 ‘Untouchable’: What is in the name? Journal of results and formal models of similarity and categoriz-
Royal Anthropological Institute 2: 1–23 ation, concluding that the complex but systematic
Cohn B S 1984 The census, social structure and objectification in behavior should eventually yield to accurate models.
South Asia. Folk 26: 25–49
Deliege R 1992 Replication and consensus: Untouchability,
caste and ideology in India. Man 27: 155–73
1. Similarity
Dirks N B 1992 Castes of mind. Representations 37: 56–78
Dumont L 1980 Homo Hierarchicus: The Caste System and its
Implications. University of Chicago Press, Chicago 1.1 Empirical Assays of Similarity
Dushkin L 1972 Scheduled caste politics. In: Mahar J M (ed.)
The Untouchables in Contemporary India. University of The similarity of items can be assessed in many
Arizona Press, Tucson, AZ different ways. The most direct method is simply to
Galanter M 1972 The abolition of disabilities: Untouchability show a person two items and ask him or her to rate
and the law. In: Mahar J M (ed.) The Untouchables in their similarity or dissimilarity. Alternatively, three or
Contemporary India. University of Arizona Press, Tucson, AZ more items can be presented, and the viewer is asked to
Gandhi M K 1954 The Removal of Untouchability. Navajivan put them into subgroups of apparently similar items.
Press, Ahmadabad, India A different method is for the viewer to see two items
Ghurye G S 1932 Caste and Race in India. Kegan Paul, London and then judge whether they are the same or different.
Khare R S 1984 The Untouchable as Himself: Ideology, Identity A fourth method shows the viewer a single item that
and Pragmatism Among the Lucknow Chamars. Cambridge must be identified; presumably the degree of confusion
University Press, Cambridge, UK between items, as measured either by response time or
Kothari R 1970 Caste in Indian Politics. Orient Longmans,
by error rate, reflects the similarity of the items.
Delhi, India
Marriott M (ed.) 1990 India Through Hindu Categories. Sage,
Although in many cases these different assays of
Delhi, India
similarity are concordant, there are a variety of
Murugkar L 1991 Dalit Panther Movement in Maharashtra: A situations in which they do not agree (Medin et al.
Sociological Appraisal. Popular Prakashan, Bombay, India 1990). For example, consider the two pairs of countries,
Omvedt G 1995 Dalit Visions: The Anti-Caste Movement and the West Germany–East Germany and Ceylon–Nepal.
Construction of an Indian Identity. Orient Longman, New When asked to select the pair of countries that
Delhi, India are most similar to each other, people tended to choose
Risley H R [1908] 1969 The People of India. Oriental Books, West Germany–East Germany. When asked to select
New Delhi, India the pair of countries that are most dissimilar from each

Categorization and Similarity Models

other, people again chose West Germany–East Germany there is a prespecified range of dimensions or features
(Tversky 1977). It has also been established that that are considered. For example, in judging the
even for a single method of measurement, the relative similarity of a lion and a dog, there are an infinite
similarities of items can change depending on context. number of shared features such as ‘smaller than a
For example, in the context of hair, gray is more battleship,’ but these features are presumably not
similar to white than it is to black, but in the context of included in the computation of similarity. In other
clouds, gray is more similar to black than it is to white words, the specification of what features or dimensions
(Medin and Shoben 1988). Nevertheless, similarity is are relevant to the similarity computation is critical for
not an utterly unconstrained, useless theoretical cons- theories involving similarity (Goodman 1972, Murphy
truct. Rather, there are strong regularities in how and Medin 1985). These formalisms can accommodate
similarity is affected by context and by measurement differences in relevances of the features or dimensions
method (Goldstone 1994b; Medin et al. 1993). (by way of the salience function, f, in the contrast
model, and by way of the attentional factors, α_r, in the
MDS model), but the processes that determine these
relevances are not specified in these models.
1.2 Models of Similarity
Among the many models of similarity two are most
prominent. Multidimensional scaling (MDS) repre-
sents items by values in a multidimensional psycho- 2. Categorization
logical space (Shepard 1987). For example, on the
dimensions of size and ferocity, the item lion would Categorization can be measured in a variety of ways.
have values large and high. The similarity of two For a given item, a person can be asked to make a
items, A and B, is inversely related to the distance discrete choice among several candidate categories.
between the items in the psychological space, denoted Alternatively, the person can give a rating of how
‘distance(A,B).’ The similarity of items A and B is typical the item is of a certain category. Response
formally specified as times can also be measured.
In many situations, similarity is a good predictor of
categorization. For example, people (in the USA) rate
a robin as being highly similar to many other birds, but
people do not rate a penguin as being very similar to
many other birds. This similarity difference predicts
that people should categorize a robin more efficiently

$$S(A, B) = \exp\bigl(-\mathrm{distance}(A, B)\bigr) = \exp\Bigl(-\Bigl(\sum_r \alpha_r \,\lvert A_r - B_r \rvert^{p}\Bigr)^{1/p}\Bigr) \qquad (1)$$

where the sum is taken over all the relevant dimensions than a penguin, which in fact occurs; people are faster
of the space, α_r is the attention allocated to dimension to verify that ‘a robin is a bird’ than ‘a penguin is a
r, A_r is the value of item A on dimension r, and p is a bird.’ Similarity can also explain why people are slow
power typically set at a value of 1 or 2 depending on to falsify the statement, ‘a bat is a bird,’ because a bat
whether the dimensions can be selectively attended to is similar to a bird, yet is not a bird (Smith et al. 1974).
or not. For a review of MDS models, extensions, and There are also situations in which similarity ap-
some of their applications, see Ashby (1992) and parently does not predict categorization. Consider, for
Nosofsky (1992b). example, the category of ‘things to remove from a
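
Equation (1) can be sketched numerically as follows. This is only a minimal illustration: the function name `mds_similarity`, the two dimensions (size, ferocity), the coordinates, and the equal attention weights are all assumptions made up for the example, not values from the literature.

```python
import math

def mds_similarity(a, b, attention, p=2):
    """Similarity as in Eq. (1): exp(-distance), where distance is an
    attention-weighted Minkowski distance of power p between a and b."""
    if not (len(a) == len(b) == len(attention)):
        raise ValueError("a, b, and attention must have the same length")
    total = sum(w * abs(x - y) ** p for w, x, y in zip(attention, a, b))
    return math.exp(-(total ** (1.0 / p)))

# Hypothetical coordinates on (size, ferocity); illustrative values only.
lion = (0.9, 0.9)
dog = (0.4, 0.3)
cat = (0.2, 0.2)
weights = (0.5, 0.5)

print(mds_similarity(lion, dog, weights))  # lower: the items are far apart
print(mds_similarity(dog, cat, weights))   # higher: the items are close
```

Because similarity decays exponentially with distance, identical items have similarity 1 and the value approaches 0 as items move apart in the space.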
A different formalization was proposed by Tversky burning house.’ Children and heirloom jewelry are
(1977) in his ‘contrast model.’ In this model, items are prominent members of this category, yet they are not
assumed to be represented by sets of present or absent very similar (Barsalou 1983). As another example,
features. If A denotes the set of all relevant features in consider a hypothetical three-inch diameter disk-
item A, then the similarity of items A and B is specified shaped object. People will rate it as more similar to a
by a linear combination (i.e., a contrast) of their quarter (i.e., a 25 cent coin) than to a pizza, but they
shared features (denoted A ∩ B), the features in A that categorize it as more likely to be a pizza than a quarter
are not in B (denoted A − B), and the features in B that (Rips 1989). Some of these apparent dissociations
are not in A (denoted B − A), between similarity and categorization can be reconciled
if selective emphasis on particular features is
taken into account. For example, children and heirloom
jewelry are very similar on the dimensions of

$$S(A, B) = \theta f(A \cap B) - \alpha f(A - B) - \beta f(B - A) \qquad (2)$$

where f is a function that specifies the saliences of the irreplaceability and portability, which are critical
various features (and sets of features), and where θ, α dimensions emphasized by a burning house. When a
and β are weighting factors for the influence of the hypothetical three-inch diameter object also includes
shared and distinctive features. other features characteristic of quarters, so that there
These models have been quite successful in ad- is less selective emphasis on size, then category
dressing a wide variety of phenomena in similarity judgments cohere with similarity judgments (Smith
data. However, all these models take for granted that and Sloman 1994).
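
The contrast model of Eq. (2) can be sketched with Python sets. This is a hedged illustration, assuming the salience function f simply counts features (the default `salience=len`); the name `contrast_similarity`, the feature lists, and the weights are hypothetical and chosen only for the example.

```python
def contrast_similarity(a, b, theta=1.0, alpha=0.5, beta=0.5, salience=len):
    """Eq. (2): theta*f(A & B) - alpha*f(A - B) - beta*f(B - A).

    a and b are sets of features; salience plays the role of f and, as an
    assumption of this sketch, defaults to counting the features in a set.
    """
    shared = a & b   # features common to both items
    only_a = a - b   # features of A that B lacks
    only_b = b - a   # features of B that A lacks
    return theta * salience(shared) - alpha * salience(only_a) - beta * salience(only_b)

# Hypothetical feature sets; illustrative only.
lion = {"four legs", "fur", "tail", "roars", "large"}
dog = {"four legs", "fur", "tail", "barks"}

# With alpha != beta the measure is asymmetric: S(lion, dog) != S(dog, lion).
print(contrast_similarity(lion, dog, alpha=0.8, beta=0.2))
print(contrast_similarity(dog, lion, alpha=0.8, beta=0.2))
```

Unequal weights on the two sets of distinctive features make the measure direction-dependent, which a distance-based measure such as Eq. (1) cannot express.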


A prominent similarity-based model of categorization in which the relevance of dimensions or features is
is the ‘generalized context model’ (GCM, learned (Kruschke 1999, Kruschke and Johansen
Nosofsky 1986). In the GCM, an item to be cate- 1999), numerous connectionist or neural-network
gorized is compared with all exemplars in memory, models that create novel internal representations (for
and each exemplar ‘votes’ for its category with a an overview see Ellis and Humphreys 1999), and
strength proportional to the similarity of the exemplar models of similarity that apply to structured repre-
to the item. An important quality of the GCM is that sentations (Goldstone 1994a).
the attention paid to each stimulus dimension can be
adjusted to suit the categorization at hand. An See also: Categorization and Similarity Models:
extension of the GCM includes a mechanism by which Neuroscience Applications; Concept Learning and
the dimensional attention strengths are learned Representation: Models; Knowledge Spaces
(Kruschke 1992). Whether or not empirical evidence
yet demands inclusion of prototypes, in addition to Bibliography
exemplars, is a matter of current debate; Nosofsky
(1992a) provides a summary of tests of prototype Allen S W, Brooks L R 1991 Specializing the operation of an
models versus exemplar models. explicit rule. Journal of Experimental Psychology: General 120:
Perhaps more problematic for similarity-based 3–19
Ashby F G (ed.) 1992 Multidimensional Models of Perception
theories are cases in which categorization appears to
and Cognition. Erlbaum, Hillsdale, NJ
be based, at least in part, on ‘rules’ which specify strict, Ashby F G, Allfonso-Reese L A, Turken A U, Waldron E M
necessary, and sufficient conditions. One characteristic 1998 A neuropsychological theory of multiple systems in
of ‘rules’ is that they can be extrapolated far beyond category learning. Psychological Reiew 105: 442–81
the training exemplars. Erickson and Kruschke (1998) Barsalou L W 1983 Ad hoc categories. Memory & Cognition 11:
trained people to categorize simple geometric forms, 211–27
most of which could be classified by a simple rule Ellis R, Humphreys G W 1999 Connectionist Psychology. Psy-
regarding height, and a few of which were exceptions chology Press, Hove, East Sussex, UK
to the rule. When tested with novel forms beyond the Erickson M A, Kruschke J K 1998 Rules and exemplars in
category learning. Journal of Experimental Psychology: Gen-
range of the training instances, people responded
eral 127: 107–40
according to the rule, despite the fact that the most Goldstone R L 1994a Similarity, interactive activation, and
similar training exemplar violated the rule. Another mapping. Journal of Experimental Psychology: Learning,
clear case in which an abstracted rule violates exemplar Memory and Cognition 20: 3–28
similarity is presented by Shanks and Darby (1998). Goldstone R L 1994b The role of similarity in categorization:
However, even for rules, similarity to exemplars can providing a groundwork. Cognition 52: 125–157
have an influence. Erickson and Kruschke (1998) Goodman N 1972 Seven strictures on similarity. In: Goodman N
found that rule-obeying exemplars were classified (ed.) Problems and Projects. Bobbs-Merrill, Indianapolis, IN,
more accurately when they were similar to a high- pp. 437–47
Kruschke J K 1992 ALCOVE: An exemplar-based connectionist
frequency training instance than when they were
model of category learning. Psychological Reiew 99: 22–44
similar to a lower-frequency training instance. Several Kruschke J K 1999 Toward a unified model of attention in
other demonstrations of exemplar influence on rule associative learning. (Revision to appear in The Journal
use have been reported (e.g., Allen and Brooks 1991, of Mathematical Psychology. Available online http:\\
Palmeri and Nosofsky 1995). Categorization models www.indiana.edu\"kruschke\tumaal.html)
that combine rules with exemplars, or graded similarity, Kruschke J K, Johansen M K 1999 A model of probabilistic
or other modifications have recently been developed category learning. Journal of Experimental Psychology: Learn-
(Ashby et al. 1998, Erickson and Kruschke 1998, ing, Memory & Cognition. 25: 1083–119
Vandierendonck 1995). Medin D L, Goldstone R L, Gentner D 1990 Similarity in-
volving attributes and relations: Judgments of similarity and
difference are not inverses. Psychological Science 1: 64–9
3. Future Progress Medin D L, Goldstone R L, Gentner D 1993 Respects for
similarity. Psychological Reiew 100: 254–78
Despite the apparent complexity of phenomena in Medin D L, Shoben E J 1988 Context and structure in con-
similarity and categorization, there are systematicities ceptual combination. Cognitie Psychology 20: 158–90
that should yield to a characterization of categoriz- Murphy G L, Medin D L 1985 The role of theories in conceptual
ation in terms of similarity. Future models of similarity coherence. Psychological Reiew 92: 289–316
and categorization will have to include (a) mechanisms Nosofsky R M 1986 Attention, similarity and the identification-
than learn feature relevance that is context-sensitive; categorization relationship. Journal of Experimental Psy-
chology: General 115: 39–57
(b) mechanisms that abstract new features; and (c)
Nosofsky R M 1992a Exemplars, prototypes, and similarity
mechanisms that compare representations that are rules. In: Healy A F, Kosslyn S M, Shiffrin R M (eds.) Essays
more complex than feature sets or points in psycho- in Honor of William K. Estes. Vol. 2: From Learning Processes
logical space. An excellent discussion of these issues is to Cognitie Processes. Erlbaum, Hillsdale, NJ, pp. 149–67
provided by Goldstone (1994b). Recent progress in Nosofsky R M 1992b Similarity scaling and cognitive process
these directions includes models of category learning models. Annual Reiew of Psychology 43: 25–53


Palmeri T J, Nosofsky R M 1995 Recognition memory for exceptions to the category rule. Journal of Experimental Psychology: Learning, Memory, & Cognition 21: 548–68
Rips L J 1989 Similarity, typicality, and categorization. In: Vosniadou S, Ortony A (eds.) Similarity and Analogical Reasoning. Cambridge University Press, Cambridge, UK, pp. 21–59
Shanks D R, Darby R J 1998 Feature- and rule-based generalization in human associative learning. Journal of Experimental Psychology: Animal Behavior Processes B 24: 405–15
Shepard R N 1987 Toward a universal law of generalization for psychological science. Science 237: 1317–23
Smith E E, Shoben E J, Rips L J 1974 Structure and process in semantic memory: A featural model for semantic decision. Psychological Review 81: 214–41
Smith E E, Sloman S A 1994 Similarity- versus rule-based categorization. Memory and Cognition 22: 377–86
Tversky A 1977 Features of similarity. Psychological Review 84: 327–52
Vandierendonck A 1995 A parallel rule activation and rule synthesis model for generalization in category learning. Psychonomic Bulletin and Review 2: 442–59

J. K. Kruschke

Categorization and Similarity Models: Neuroscience Applications

Categorization is the act of assigning objects or events to classes (i.e., categories). It is performed countless times every day, and is among the most important and basic of all decisions. Many different categorization models have been proposed. In several cases, models that make very different assumptions about how people learn new categories have been equally successful at accounting for a given set of categorization data. Recent advances in neuroscience have provided a new means with which to test among such competing models. The correct model of categorization will be consistent with the known neuroscience, as well as with observable categorization behavior. In addition, a knowledge of the underlying neuroscience makes it possible to develop categorization models for many groups of people that most current theories ignore (e.g., children, the elderly, people with various neurological disorders). This article reviews what is known about the neuroscience of categorization, and considers the implications of this knowledge for categorization models.

1. Structures

Categorization is not a single mental ability, but instead depends on several different abilities that use different brain structures and processes. As evidence of this, a wide variety of normal and pathological conditions have been shown to interfere with normal category learning. These include normal aging, major depression, Parkinson’s disease, Alzheimer’s disease, Huntington’s disease, strokes, and schizophrenia (for a review, see, e.g., Ashby et al. 1998). This and other recent evidence sheds light on which neural structures do or do not participate in category learning.

1.1 Sensory Cortex

Sensory cortex refers to all cortical areas associated with sensory function. In the case of vision, this includes virtually all of the occipital cortex and much of the temporal and parietal cortex. An object must be perceived before it can be categorized, so an intact sensory cortex is necessary for normal categorization. It is not so clear, however, whether the characteristics of a category and the rules for distinguishing it from similar but different categories are learned and stored in the sensory cortex. Interest in this hypothesis has been sparked in recent years by reports of a variety of category-specific agnosias that result from damage to certain high-level visual cortical areas. Category-specific agnosia is defined as the ability to perceive or categorize most visual stimuli relatively normally, but a reduced ability to recognize exemplars from some specific category, such as inanimate objects (e.g., tools), or fruits. The most widely known of such deficits occur with human faces (i.e., prosopagnosia). Although such category-specific agnosias are consistent with the hypothesis that category structure is represented in visual cortex, they are also generally consistent with the hypothesis that visually similar objects are represented in nearby areas of the visual cortex. For example, it is well known that neighboring cells in the visual cortex tend to fire to similar stimuli. Thus, damage to some contiguous region of the visual cortex is likely to lead to perception deficits within a class of similar stimuli. In fact, specific tests have failed to rule out this similarity hypothesis.

Evidence against the hypothesis that category learning occurs in the visual cortex has been obtained in single cell recording experiments with monkeys. Several studies have found that the firing properties of cells in high-level visual areas (e.g., inferotemporal cortex) do not change when there is a switch in the category assignment of the visual stimulus to which the cell is most responsive (e.g., Rolls et al. 1977). If the categories were represented in the visual cortex, then the firing properties of visual cortical cells should change when the category memberships are switched. For example, similar studies have found changes in the responses of cells in other brain areas (e.g., amygdala) in such experiments.

1.2 Frontal Cortex

The sensory cortex projects directly to the frontal cortex, which consists roughly of the forward-most


third of the cerebral cortex. With respect to category learning, two especially important structures in this region are the prefrontal cortex and the anterior cingulate, which are thought to be critical for working (i.e., short-term) memory and executive (i.e., volitional) attention, operations that are important in some types of category learning. So, it is not surprising that there is abundant evidence that these structures are critically important for learning at least some types of category structures. The best-known such evidence comes from applications of the Wisconsin Card Sorting Test (WCST). This is a widely used neuropsychological test of frontal dysfunction in which participants learn a series of categories, each of which differs on only a single critical attribute (i.e., symbol color, shape, or number). Patients with frontal cortical lesions are well known to have deficits on the WCST. Activation in frontal areas, including the prefrontal cortex and the anterior cingulate, has also been found in the few extant neuroimaging studies of category learning.

The frontal cortex is the locus of human reasoning. As such, the evidence that the frontal cortex participates in category learning supports classical models of category learning (which date back to Aristotle) that assume people learn categories by reasoning about them (e.g., through an explicit process of generating and testing hypotheses about category structure).

1.3 Basal Ganglia

All of the cortical structures that have so far been discussed (sensory cortex, prefrontal cortex, anterior cingulate) project directly to the basal ganglia, a collection of subcortical structures that includes the caudate nucleus, the putamen, and the globus pallidus (other structures are also included). Patients with diseases of the basal ganglia (e.g., Parkinson’s disease, Huntington’s disease) are impaired in category learning (Knowlton et al. 1996a, 1996b), which suggests that these subcortical structures may contribute to the learning of new categories. Perhaps the best evidence for a basal ganglia contribution to category learning, however, comes from a long series of lesion studies in rats and monkeys. In primates, virtually all of the visual cortex projects directly to the tail of the caudate nucleus, and the cells in this area then project, via the globus pallidus and thalamus, to the prefrontal cortex and more posterior motor areas (i.e., the premotor cortex). These projections place the caudate in an ideal position to link percepts and actions, and many researchers have hypothesized that this is its primary role. Many studies have shown that lesions of the tail of the caudate nucleus impair the ability of animals to learn visual discriminations that require one response to one stimulus and a different response to some other stimulus (e.g., Packard and McGaugh 1992). Because the thalamus and visual cortex are intact in these animals, it is unlikely that their difficulty is in perceiving the stimuli. Rather, it appears that their difficulty is in learning to associate an appropriate response with each stimulus alternative.

The basal ganglia are also thought to be critical structures for procedural learning and memory, which is a phylogenetically ancient system in which simple associations between stimuli and responses are learned. For this reason, the evidence that the basal ganglia are important in category learning supports models that assume category learning depends on procedural learning and memory.

1.4 Medial Temporal Lobe

Learning about a new category requires the use of some form of memory. Oftentimes, when a person thinks of memory, he or she means the memory of facts and events (called semantic and episodic memory, respectively). It is now known that medial temporal lobe structures, including the hippocampus, the parahippocampal regions, and the entorhinal and pararhinal cortices, are critical for the consolidation of such memories. It might be expected, therefore, that such memory systems contribute significantly to category learning, especially because a number of current theories assume that people categorize a new stimulus by comparing it to memory representations of previously seen exemplars. For this reason, a number of studies have examined how amnesiacs with medial temporal lobe lesions perform in a variety of category-learning tasks. With only a few exceptions, these studies have found normal category learning, even in patients with dense amnesia that resulted from extensive lesions (e.g., Squire and Knowlton 1995). When presented with a stimulus, amnesiacs with medial temporal lobe damage are significantly impaired in judging whether they have seen that stimulus before, but their ability to assign it to the correct category is essentially normal.

These results support the hypothesis that medial temporal lobe structures are not critical for most forms of category learning, and although the issue is not yet resolved, they present a challenge to models that assume people access instance-based memories (i.e., detailed representations of previously seen category exemplars) during category learning.

2. Multiple Category Learning Systems

Another lesson learned from studying the neuroscience of categorization is that human category learning may be mediated by several different systems. This is suggested by the widely distributed neural structures that participate in category learning and by the overwhelming evidence that there are multiple memory systems. Because any category-learning system requires memory, the existence of multiple memory systems raises the possibility that each of several


memory systems might have their own dedicated category-learning module. The notion that there may be multiple category-learning systems has a long history, but recently has been the subject of intense scrutiny. Among those researchers postulating multiple systems, the consensus is that one system learns explicitly and at least one learns implicitly. The explicit system is accessible to consciousness and engages in an explicit reasoning process that may involve hypothesis testing or theory construction and testing. This system is almost certainly mediated by frontal cortical structures. The implicit system is not accessible to conscious awareness. Currently, there is debate as to whether this system uses a procedural- or instance-based memory system. Much evidence supports this multiple systems hypothesis. For example, a number of studies have found qualitative differences in the way people learn categories that are best separated by an explicit rule as opposed to tasks in which no salient explicit rule will succeed.

3. COVIS (COmpetition between Verbal and Implicit Systems)

Currently, there are only a few neuropsychological theories of category learning (Ashby et al. 1998, Gluck et al. 1996, Pickering 1997). A schematic illustrating the most important neural structures and pathways of one of these is shown in Fig. 1 (Ashby et al. 1998). This model, called COVIS, postulates separate, competing explicit and implicit category-learning systems that are simultaneously active at all times. Depending on the relationship between the categories to be learned, however, one system may dominate the other. There are three hierarchical levels: cortex, thalamus, and the basal ganglia. The two systems are mediated by parallel loops following the path: cortex–caudate–globus pallidus–thalamus. The posterior loop, from visual cortical areas to the tail of the caudate nucleus, mediates the implicit system. In humans, all visual areas project directly to the tail of the caudate (except Area V1), with about 10,000 visual cortical cells converging on each caudate cell. Cells in the tail of the caudate then project to the prefrontal or premotor cortex (via the globus pallidus and the thalamus). COVIS assumes that, through a procedural learning process, each caudate unit learns to associate a category label, or perhaps an abstract motor response, with a large group of visual cortical cells (Ashby and Waldron 1999).

Figure 1
A schematic illustrating the COVIS model of category learning. See text for more details (IT = inferotemporal cortex)

Figure 2
A schematic illustrating how the explicit system in the COVIS model of category learning operates during the Wisconsin Card Sorting Test

The explicit system is mediated by an anterior loop, from the anterior cingulate and prefrontal cortex to the head of the caudate nucleus, and then back to the prefrontal cortex. Figure 2 illustrates how this system might operate for a task like the WCST. In Fig. 2, the active rule is to sort the cards by the color of the symbols pictured. This rule is maintained in working memory via the bold reverberating loop shown in the figure. If feedback indicates that this rule is incorrect, then the system must implement a different rule, perhaps one that says to sort the cards by the shape of the symbols pictured. There is evidence that implementing a new rule requires two separate operations (e.g., Owens et al. 1993). First, a new rule (e.g., sort by shape) must be selected from among the alternative


salient explicit rules, and second, attention must be switched from the previously active rule to the new rule (e.g., from a rule to sort by color to a rule to sort by shape). COVIS assumes that the selection process is mediated by the anterior cingulate and prefrontal cortex, and that the switching process is mediated within the basal ganglia (Ashby et al. 1999).

See also: Categorization and Similarity Models; Concept Learning and Representation: Models; Connectionist Models of Concept Learning; Memory Models: Quantitative; Neural Networks: Biological Models and Applications; Neural Representations of Objects

Bibliography

Ashby F G, Alfonso-Reese L A, Turken A U, Waldron E M 1998 A neuropsychological theory of multiple systems in category learning. Psychological Review 105: 442–81
Ashby F G, Isen A M, Turken A U 1999 A neuropsychological theory of positive affect and its influence on cognition. Psychological Review 106: 529–50
Ashby F G, Waldron E M 1999 On the nature of implicit categorization. Psychonomic Bulletin & Review 6: 363–78
Gluck M A, Oliver L M, Myers C E 1996 Late-training amnesic deficits in probabilistic category learning: a neurocomputational analysis. Learning & Memory 3: 326–40
Knowlton B J, Mangels J A, Squire L R 1996a A neostriatal habit learning system in humans. Science 273: 1399–402
Knowlton B J, Squire L R, Paulsen J S, Swerdlow N R, Swenson M, Butters N 1996b Dissociations within nondeclarative memory in Huntington’s disease. Neuropsychology 10: 538–48
Owens A M, Roberts A C, Hodges J R, Summers B A, Polkey C E, Robbins T W 1993 Contrasting mechanisms of impaired attentional set-shifting in patients with frontal lobe damage or Parkinson’s disease. Brain 116: 1159–75
Packard M G, McGaugh J L 1992 Double dissociation of fornix and caudate-nucleus lesions on acquisition of two water maze tasks: further evidence for multiple memory systems. Behavioral Neuroscience 106: 439–46
Pickering A D 1997 New approaches to the study of amnesic patients: what can a neurofunctional philosophy and neural network methods offer? In: Mayes A R, Downes J J (eds.) Theories of Organic Amnesia. Psychology Press, Hove, UK, pp. 255–300
Rolls E T, Judge S J, Sanghera M K 1977 Activity of neurones in the inferotemporal cortex of the alert monkey. Brain Research 130: 229–38
Squire L R, Knowlton B J 1995 Learning about categories in the absence of memory. Proceedings of the National Academy of Sciences, USA 92: 12470–4

F. G. Ashby

Catholicism

Catholicism signifies the central form that Christianity has taken in history from, at latest, the second century, but it is a form that has continued steadily to evolve. In the second millennium Catholicism has come to designate most particularly the character Christianity has assumed within the Roman Communion under the authority of the papacy.

‘Catholic’ is not a term to be found in the New Testament but it is already being used to characterize and designate the church in its early postapostolic period. Its primary meaning is ‘universal,’ signifying both Christianity’s claim to be a faith unbounded by language, ethnic, or class boundaries and the insistence that local church communities must be in communion with one another: the church in Corinth or Ephesus is ‘Catholic’ because part of an association of communities sharing the same faith and rituals present up and down the Roman Empire and, potentially, everywhere. As such the term quickly became the church’s most regularly used name, incorporated into the creeds. All Christians using the creeds, whether Roman Catholic, Orthodox, or Protestant, claim to believe in, and belong to, the Catholic Church.

The character assumed by Christianity in the second century and generally termed ‘Early Catholicism’ is one in which each local church is ruled by a bishop, assisted by priests (presbyters) and deacons, claiming ‘succession’ from the apostles. Entry is via the sacrament of baptism while regular worship is centered upon the weekly celebration of the Eucharist, renewing Jesus’s last supper with his disciples. The Hebrew scriptures were retained as inspired, generally in their Greek Septuagintal form, to which was added a collection of the earliest Christian writings, the New Testament, to which apostolic authority was attributed. The shape of Catholicism had, however, been established well before the full canon of the New Testament was agreed upon, certain books remaining for long in dispute. While there was at first no formal machinery for linking local churches together in the one church, local councils of bishops from neighboring towns soon began to be called, while the churches in major cities, particularly those claiming an apostolic foundation, assumed a leading position. Among these the Church of Rome, where both Peter and Paul were martyred, came to be accorded an unquestionable pre-eminence from an early date.

While all the books of the New Testament are centered unequivocally upon the figure of Jesus Christ, they already suggest, particularly in the account of the first Christian martyr, Stephen, but also in concentration upon certain of the apostles, the importance for early Christianity of the personal ‘witness’ to Christ (the strict meaning of martyr). A cult of the martyrs, focused upon their tombs, began early and developed into a wider cult of the saints that remained a permanent characteristic of Catholicism. Furthermore virginity/celibacy was stressed from early times as an alternative road to exceptional holiness, a way later institutionalized in monasticism.

This ‘Early Catholicism,’ constituted by hierarchical ministry, sacraments, the canon of scripture, and the


cult of the saints, was common to east and west, the Greek and Latin churches, as also to Syrian, Coptic, and Armenian churches. Despite diversities of language, liturgy, and theological tradition, Catholic Christianity was held together by the fact and theology of ‘communion,’ a sacramental bond of fellowship from which individuals or whole churches could be excluded on account of heretical teaching or moral deviation.

After Christianity became a privileged religion with the conversion of the emperor Constantine in the early fourth century, it grew enormously in numbers. What had hitherto been a community of a voluntary sort whose members knew that they might well have to face persecution on account of their beliefs became instead quite quickly a church of the masses, many of whose members had received very limited Christian instruction. The myths and customs of the rural population were carried on, often covered over with a thin Christian veneer. Henceforth Catholicism would be characterized by the co-existence of a clerical and monastic religious upper class with a popular religion which it would preach to generation after generation but never fully convert.

Post-Constantinian Catholicism grew too in the development of an institutional framework designed to resolve conflicts and maintain unity, although the result was often to precipitate still worse division. The most striking development of this was what came to be recognized as the ‘ecumenical council’ beginning with that of Nicaea (325). At the same time the authority of the major churches (Rome, Alexandria, Antioch, and then Constantinople and Jerusalem) was further accentuated. By the fifth century this had been stabilized in the theory of the ‘Five Patriarchates’ but, among them, Rome and Constantinople held a quite privileged position. That of Rome, with its unquestionable apostolic origins, had been there from the start, but was reinforced greatly from the late fourth century. Constantinople, the new imperial capital, claimed to be for that reason also the church’s second see in dignity, though Rome denied that its own primacy was due to its civic, rather than its apostolic, status. In practice, the two co-existed for centuries, at times somewhat tensely, but also co-operatively. Rome enjoyed an increasingly effective primacy of jurisdiction throughout the Latin west but of little more than honor in the east. At the same time the advance of monasticism profoundly affected the ethos of Catholicism in both east and west.

It was only after the eleventh century that the ‘Catholicism’ of the western church could be distinguished clearly from the ‘Orthodoxy’ of the eastern. The Gregorian reform (named after Gregory VII, pope from 1073 to 1085) profoundly affected the western church but through the accentuation of tendencies already present. Three are crucial: the first was a large extension in the claims but, still more, the effective jurisdictional power of the papacy. Its ability to impose a defined model of ecclesiastical life was turning ‘Catholicism’ into ‘Roman Catholicism’: a hitherto largely decentralized church was being made to conform ever more closely to rules laid down in Rome, rules which included insistence on higher moral and educational standards in the clergy and greater separation between clergy and laity. The second was a vast extension of canon law and its implementation through a hierarchy of church courts with appeal always possible to the highest courts in Rome. Canon law became the effective tool for implementing the new Gregorian ideals. The third was the law of celibacy for all the clergy. Hitherto most ordinary priests were, or had been, married, just as they had been and continued to be in the east. Henceforth in the west the married priest would be an offender, though in practice he would long continue to exist. The law of celibacy was a necessary tool for ensuring the segregation of clergy from laity, making it possible for the former to be controlled effectively by higher church authority. The ‘church’ became in practice more and more a clerical reality in which the laity were expected ideally to remain passive and obedient. The identification of church and clergy was accentuated further by language. The early church had allowed great linguistic flexibility but in the medieval west Latin came to have a wholly privileged position as the language of the church and its public services as well as of theology. As a range of vernaculars developed vigorously their own literatures, the ecclesiastical use of Latin became a further characteristic of Catholicism. It expressed a certain clerical universalism at odds with the national vernaculars used by the laity.

Two further developments, linked particularly with the thirteenth century, were also lastingly influential for the shaping of Catholicism and its differentiation from eastern Orthodoxy. One was the founding, beginning with the Franciscan and Dominican friars, of religious orders which were not purely monastic but, instead, active, mobile, and involved in the world around them, especially the urban world. Further societies of this sort, most notably the Jesuits in the sixteenth century, would multiply, providing a powerful, well-organized, and educated religious army at the service of the church, available not only internally but also for missionary service abroad, as the Franciscan mission to China in the fourteenth century already demonstrated. The second was scholasticism, formal systems of theology, heavily influenced by the rediscovery of Aristotle’s works, which grew out of the teaching system of the new universities. Here formal logic was learned and applied in a systematization of theology very different from that of the patristic period or the continued traditions of the Greek east.

Thus, by the later Middle Ages the type of Christianity which we have come to call Catholicism was almost fully evolved: a mix of subapostolic, post-Constantinian, and medieval developments. The Reformation brought an attack on many of these developments, almost anything in fact which could


not be defended with firm scriptural authority. Against this assault the papacy yielded almost nothing. It admitted moral and educational, but not doctrinal or institutional, failings. The Council of Trent (1545–63) formulated the lines of the Counter-Reformation position, emphatically reaffirming almost everything that Luther, Zwingli, and Calvin had impugned. The one major new element that it added to the character of Catholicism as a social reality was the seminary. Hitherto there had been no institutions formally established for the training of diocesan priests, who had learned what they needed to know either through the general education of a university or through a measure of apprenticeship, a son often enough learning the job from his father. The Tridentine seminary would improve greatly the effectiveness and education of the clergy but also further distance them socially from the laity, particularly on account of the ‘minor seminary,’ effectively a secondary school in which candidates for the priesthood would be segregated from their early teens. This could lead easily on to a further gap between the clergy and the university world, and therefore also between the clergy and modern ways of thought. While with time medieval scholasticism was ousted from the university, it would in a much degraded form continue to dominate the characteristic mindset of the clergy until the twentieth century.

Tridentine Catholicism was, nevertheless, far more creative than the term ‘Counter-Reformation’ might suggest. While classical Protestantism with its commitment to ‘sola scriptura’ was averse to religious involvement in secular culture, the Catholicism of the sixteenth and seventeenth centuries produced its last great cultural era, the Baroque, as rich artistically as the earlier Catholic art cultures, the Romanesque and the Gothic. In art and music, as in theology and spirituality, Catholicism is by nature open to the secular and developmental, even luxuriantly. This was always true, though only in the nineteenth century did John Henry Newman justify the development of doctrine as a human and Catholic necessity, to be contrasted with Protestant insistence upon not going beyond the explicit content of scripture.

Sixteenth- and seventeenth-century Catholicism was also particularly fruitful in the literature of prayer, from the mystical writings of the Spanish Carmelites, Teresa of Avila and John of the Cross, to the Jesuit Spiritual Exercises, Francis de Sales’ Introduction to the Devout Life, and countless other minor classics, many of them popular also among Protestants.

Catholicism then has traditionally combined institutional rigidity with spiritual and artistic creativity. In the nineteenth century with the triumph of Ultramontanism (a name used to describe a form of Catholicism in Europe north of the Alps anxious to enhance papal authority and uniformity with Rome in contrast with the more old-fashioned Gallicanism which accepted papal supremacy but limited to a minimum papal intervention in non-Italian ecclesiastical affairs) institutional rigidity grew while creativity declined. Faced with the often anticlerical, and even antireligious, ethos of the French Revolution and later liberal movements, the Catholic Church drew in on itself as a fortress against the modern world, defining the authority of the pope at the first Vatican Council of 1870 under Pius IX (1846–78) while a little later Pius X (1903–14) condemned every form of ‘modernism.’ In the first decades of the twentieth century Catholicism could then appear as a consistently reactionary force, intellectually, culturally, and socially. Yet it was far from being a community in decline. Modern technology made Roman centralization work in a way never previously possible. The railway brought countless pilgrims to Rome and a cult of the reigning pope among ordinary Catholics quickly developed. Pius IX, antimodern as he was, became in consequence the first ‘modern’ pope, almost idolized by millions. At the same time a series of apparitions of the Blessed Virgin Mary mostly to children, of which the most famous were at Lourdes in 1858 and at Fatima in 1917, much reinforced a principal focus for Catholic popular devotion, both backed by and supporting papal definitions of Mary’s Immaculate Conception (1854) and Assumption into Heaven (1950). The missionary orders, mostly nineteenth-century foundations, were extending Catholicism throughout the southern hemisphere; there was a move to encourage forms of lay apostolate across Europe and North America, while for the first time there had developed a vast army of nuns at work almost everywhere. This last may have been the most significant shift. In the Middle Ages nuns were few in comparison with monks and included no active orders. Only in the seventeenth century, led by heroic women like Mary Ward and Jane Frances de Chantal, did unenclosed orders of women, committed to apostolic work as well as prayer, begin to exist despite huge clerical opposition. By the close of the nineteenth century there were more nuns in the church than priests. Silent in many ways they were forced to remain, Roman Catholicism being traditionally an intensely male-dominated and clerical form of Christianity, but a changing balance between the sexes was evident even here.

Catholicism continued to present publicly an almost unchanged face until the end of Pius XII’s pontificate in 1958: highly clerical, intellectually repressed, well-organized, expansive, more tightly controlled from Rome than ever, insistent upon the use of Latin, retaining apparently untouched its medieval legacy, institutional and liturgical. Yet processes of modernization had already gone far, intellectually especially in western Europe, institutionally elsewhere with the appointment of Asian and African bishops, and the creation of numerous cardinals to represent every continent. For the first time since the fourteenth century, the majority of cardinals were not Italian. It
needed only the gentle touch of John XXIII and the calling of the Second Vatican Council (1962–5) for a truly revolutionary process to set in, only part of which was formally approved by the Council. The intellectual debate unleashed by four years of conciliar discussions set a movement in train which went well beyond the letter, though often not far beyond the spirit, of the Council’s decrees. Within a few years Latin was replaced by the local vernacular in the liturgy and theological education; scholastic theology almost disappeared from the seminaries; hitherto frigid relationships with the Orthodox and Protestant churches gave way to a near-frenzied spate of official ecumenism.

In strictly theological terms the changes of Vatican II may seem less than revolutionary, but in the wider terms of the sociology of Catholicism, its sense of identity and shared ethos, they undoubtedly were. Symbolic ramparts, such as the denial of the cup in communion to the laity, which had been fiercely maintained against Protestantism for centuries, were dismantled quickly. It was not only that the barriers which made Catholics feel they were different, from Latin to Friday abstinence from meat, had almost disappeared, and with them the shared certainty that the pope was always right. It was also that Catholics now found themselves divided in a way that had never previously been the case in modern times. If, for Pope Paul VI (1963–1978) and the official church, Vatican II had gone just far enough, this was challenged from both right and left. For the former it had gone much too far, betraying authentic Catholic values in the pursuit of ecumenism, while for the latter it had failed through evading the most awkward issues, notably that of Roman centralization. The conflict hardened particularly over two matters that refused to go away: contraception and the law of celibacy for priests. Pope Paul had prevented the Council from discussing either. Contraception was passed to a papal commission that actually recommended a change in teaching. Paul rejected the recommendation and in 1968, in the encyclical Humanae Vitae, reaffirmed that the use of contraception is always wrong. He was quite unprepared for the degree of open opposition this aroused and for the consequent decline in confidence in papal teaching, particularly among the more thoughtful laity, who widely disregarded the papal ban. Again, despite numerous appeals, at least to allow married men to be ordained, both he and John Paul II (1978–) repeatedly renewed insistence on the law of celibacy in the face of the near-catastrophic decline in the number of priests in the western world in the 30 years following the council, due both to the withdrawal of priests from the ministry, usually to marry, and a linked fall in recruitment. A further factor was that minor seminaries were now seen as obsolete and closed almost everywhere in Europe and North America in the early postconciliar years. The credibility of the law had been internally undermined by several things: Vatican II’s praise for the married clergy of Catholic Uniate churches in the east, its permission for married men to be ordained to the diaconate, and, finally, the acceptance of married priests, especially in Britain, if they were converts who had been ministers in another church previously.

Opposition to contraception set Vatican policy against almost all international attempts to curb the rapid rise in global population. This, together with papal rejection of the ordination of women and refusal to open the door to a married priesthood, did much to restore the nineteenth-century image of official Catholicism as incurably reactionary and antimodern. At no time since the Middle Ages has the papacy appeared to matter so much within public policy as in the age of John Paul II, but its image has reverted to that of Pius XII. In papal history John XXIII could seem an aberration.

Shortage of priests resulted in the closing or merging into one of many parishes, especially in France, and the transfer of duties from priests to deacons or to the laity in general. Women, who in the past had never been allowed even to serve at mass, were now enrolled in thousands as ‘eucharistic ministers.’ Moreover, the religious orders, both male and female, whose numbers had grown enormously between 1850 and 1960, went into rapid decline in Europe and North America. In practice a rapid declericalization of the Church’s ministry was taking place. While this could be seen as advantageous, its value was partially undermined by the haphazard way in which it happened, lack of a positive pastoral strategy, and continued insistence by Rome that the traditional clerical pattern remained normative.

Behind the conflicts over practice and the loss of confidence in authority there were now huge differences in theological perception. If Vatican II seemed almost unbelievably radical to Catholics content with the church of Pius XII, the policies of John Paul II seemed almost unbearably reactionary to Catholics who had found the Council a harbinger of religious liberation. Many ‘progressive’ Catholics, while deeply loyal to the unity of a worldwide communion and to the idea of a central ‘see of unity,’ not only rejected much current papal teaching, particularly that of John Paul, whose personal theology often appeared fundamentally preconciliar, but had come to share so full a sense of fellowship with Christians of other churches that intercommunion had come to be seen as obviously right and was practiced frequently with the encouragement of many priests, despite condemnation by authority. ‘Conservatives’ accused ‘progressives’ of practicing an ‘à la carte’ Catholicism. ‘Progressives’ charged ‘Conservatives’ with ‘integralism,’ confusing the substance of Catholicism with clericalism and Ultramontanism. While ‘Progressives’ tended to define Catholicism in terms inclusive of the principal insights of the Reformation as constituting a necessary critique of the one-sidedness of the late medieval church,
‘Conservatives’ continued to define it as inherently contradictory to Protestantism. Vatican II had, in fact, led to a profound questioning of the model of Catholicism as it developed across a thousand years from the Gregorian reform through the Counter-Reformation to the decrees of Vatican I and Ultramontanism. How far that model will in consequence alter in the long run remains unclear.

It may well be claimed that the areas where this contestation is most in evidence are also areas of church decline. As against that decline, most obvious in western Europe, is to be set a huge growth of Catholicism in Africa, Latin America, and several parts of Asia. Brazil is now the country with by far the largest number of Catholics. Here too, however, there are great differences. Latin America has suffered an acute shortage of clergy for far longer than Europe and most of the issues dividing the church in Europe are evident here too, as also in the Philippines. The advance of Pentecostal Protestantism is also altering the hitherto almost exclusively Catholic character of Latin American Christianity. Elsewhere in Asia and in most of Africa the number of priests and religious has escalated. Religious orders whose recruitment plummeted after Vatican II in their traditional areas have grown remarkably in Africa, India, and Indonesia. For the Jesuits in particular India became the most flourishing of provinces, but the growth of houses of contemplative nuns in Latin America also is noteworthy. Nevertheless throughout the southern hemisphere there are profound tensions between Roman control and local processes of inculturation and political orientation. The future of Catholicism as the principal constituent within Christianity may depend upon how far a lasting modus vivendi is established between pope and Curia on the one hand and the young churches of the south on the other. John Paul’s understandable preoccupation with Poland and eastern Europe is unlikely to be continued by his successors. Brazil, Argentina, India, the Philippines, Nigeria, and the Congo may matter more for the shaping of Catholicism in the twenty-first century than traditional pillars such as France, Spain, Ireland, and Poland, though the mediating position of the vast church of the United States may still prove to matter most of all. While the decline in church practice or missionary enthusiasm in France or Ireland is undeniable, there is as yet little evidence to suggest that worldwide the strength of Catholicism is in decline and, if the reforms of Vatican II have led to a shaking of the foundations unprecedented for many centuries, they have also contributed to a massive renewal evident above all within the third world.

See also: Catholicism and Gender; Christianity Origins: Primitive and ‘Western’ History; Historiography and Historical Thought: Christian Tradition; Historiography and Historical Thought: Classical Period (Especially Greece and Rome); Latin American Studies: Religion; Middle Ages, The; Orthodoxy; Reformation and Confessionalization; Religion: Evolution and Development; Religion, History of; Religion, Sociology of; Renaissance; Revolutions of 1989–90 in Eastern Central Europe; Western European Studies: Religion

Bibliography

Alberigo G (ed.) 1995 History of Vatican II. Orbis, Maryknoll, NY
Aubert R 1978 The Church in a Secularised Society. Paulist Press, New York
Cleary E L, Stewart-Gambino H (eds.) 1992 Conflict and Competition: The Latin-American Church in a Changing Environment. Lynne Rienner Publishers, Boulder, CO
Duffy E 1997 Saints and Sinners: A History of the Popes. Yale University Press, New Haven, CT
Ellis J T 1969 American Catholicism. University of Chicago Press, Chicago
Flannery A 1992 Vatican Council II: The Conciliar and Post-conciliar Documents. Dominican Publications, Dublin, Eire
Gannon T M (ed.) 1988 World Catholicism in Transition. Macmillan, New York
Gifford P 1998 African Christianity: Its Public Role. Indiana University Press, Bloomington, IN
Greeley A M 1977 The American Catholic: A Social Portrait. Basic Books, New York
Hastings A 1989 African Catholicism. SCM Press, London
Hastings A (ed.) 1991 Modern Catholicism. SPCK, London
Hastings A (ed.) 1999 A World History of Christianity. Eerdmans, Grand Rapids, MI
Hebblethwaite P 1984 John XXIII: Pope of the Council. Chapman, London
Hebblethwaite P 1993 Paul VI: The First Modern Pope. Paulist Press, New York
Hennesey J J 1981 American Catholics: A History of the Roman Catholic Community in the United States. Oxford University Press, New York
Hornsby-Smith M P 1987 Roman Catholics in England: Studies in Social Structure Since the Second World War. Cambridge University Press, Cambridge, UK
Keough D 1990 Church and Politics in Latin America. St Martin’s Press, New York
McBrien R P 1994 Catholicism. Harper, San Francisco
McBrien R P (ed.) 1995 The HarperCollins Encyclopedia of Catholicism. HarperCollins, New York
O’Farrell P 1977 The Catholic Church and Community in Australia: A History. Nelson, West Melbourne, Australia

A. Hastings

Catholicism and Gender

Christianity in its beginnings appeared to offer a gender-inclusive promise of redemption. Both women
and men entered the baptismal waters to be freed from the corrupting power of the old Adam and to rise to newness of life in Christ. This vision of an inclusive community of redemption was rooted in the ministry of Jesus and the early Jesus community. Here it was suggested that women and other despised people are not just included in the existing social and cultural system, but that system itself is to be radically dismantled and turned upside down.

This vision was expressed in the baptismal formula cited in St. Paul’s Letter to the Galatians (3:28):

For as many of you as are baptized into Christ have put on Christ. There is neither Jew nor Greek, there is neither slave nor free. There is neither male nor female, for you are all one in Christ Jesus.

Some early Christians understood this as meaning that baptism into Christ dissolved the social subordination of women to men and allowed women to preach and travel as itinerant evangelists. But this view clashed with deeply held assumptions that the female was inferior, unclean, under subjugation by nature, unable to image God.

Paul himself sought to modify this radical vision by suggesting that although this transformation of relations will happen very soon, for ‘the time is later now than when we first believed,’ it has not yet happened. Here and now all are to remain in their present social conditions, whether married or unmarried, slave or free.

Paul’s successors sought to systematize his efforts by spiritualizing equality and splitting it from social relations, insisting that the hierarchical authority of the paterfamilias is still intact. Wives, youths, and slaves must not only continue to obey their husbands, fathers, and masters, but do so willingly.

But this conflict between egalitarian and patriarchic views of the church continued in the second century. The deutero-Pauline letter of I Timothy tries to counteract women who are claiming a public ministry by a patriarchic reading of Genesis 1–3. Women were created second and sinned first; they are under subjugation through their secondary place in God’s original order of creation. Moreover, this subjugation is redoubled through their primacy in sin. They may not preach, but are to keep silence. They are saved only by accepting this subjugation in marriage, childbearing, and modest, submissive behavior.

This struggle between conflicting views of Christianity continued into the third century. By the fourth century, these more radical versions of redemptive equality had been mostly defeated. The women who flocked into celibate Christian monastic life in the fourth and fifth centuries accepted their segregation from public ministry and preaching in the church. Yet the idea continued to linger in female monasticism that the female ascetic, by virtue of renouncing marriage, was no longer under male domination.

Augustine tried to fuse these contradictions between creational patriarchy and redemptive equality. He realized that in order for woman to be redeemed in Christ she must be made in the image of God, otherwise she would have no redeemable soul. But he was also bound by Paul’s dictum in I Corinthians that only the male is the image and likeness of God, and woman has only a secondary reflection of humanness under the male.

Augustine’s solution was to split the image of God as a gender-neutral spiritual soul from woman’s nature as female. The sex-neutral soul in woman is in the image of God and possesses a redeemable soul, but, as female, she is not in the image of God but images the body, the subjugated lower self. As female, woman was made subject to the male from the beginning, even in the original creation. This subjugation has to do with her role as wife and sexual partner for the production of children. Furthermore, due to her primary role in leading the male into disobedience to God, woman’s original subordination has now been worsened into coercive servitude. This subjugation is not changed by baptism or even by celibacy. Rather, a Christian woman should accept her subordination willingly as both her original place in creation and as punishment for her primacy in sin. This subordination will disappear only in heaven when the created order of sex, procreation, mortality, and sin is swallowed up in redemptive immortality.

This Augustinian view would be transmitted to the Middle Ages as orthodoxy to be repeated in substantially the same form by Thomas Aquinas, with the addition of the Aristotelian anthropology that defined women as biologically inferior, non-normative human beings. For Thomas Aquinas, this inferior nature of woman meant that Christ had to be a male in order to represent the fullness of human nature, only possessed by males. Only males, in turn, could represent Christ in the priesthood. Moreover, women could not exercise public authority in their own right, since they were by nature under subjugation.

Yet alternative traditions continued to lurk around the edges of the medieval Catholic tradition, ideas of an original spiritual equality that was accessible here and now through redemptive conversion. There was also the Pentecost tradition, in which the gift of the Spirit empowers women as well as men to speak as God’s prophets. In the Wisdom tradition, God coming forth to create the world is personified as female. The creating, redeeming God is imaged as both male and female.

Medieval female mystics, such as Hildegard of Bingen and Julian of Norwich, laid hold of these alternative traditions. Although accepting female subordination in marriage, they assumed that they had been freed from this subordination through celibacy and withdrawal into a female community independent of male rule, where women religiously governed their own lives. As virginal women they existed no longer as
femina, but as homo, as the spiritual self made in the image and likeness of God. Through the gift of the Spirit, they had been freed from silence and called to speak as God’s prophetic voice to the church and society at large, calling them to account for their sins and revealing the mysteries of God’s coming transformation of the world.

Hildegard of Bingen, Julian of Norwich, and other female mystics also claimed the wisdom tradition to define a feminine personification of God through which God created, sustains, and reconciles creation to God-self. The wisdom tradition enabled these female mystics to overcome the assumption that God is essentially male and so woman as woman cannot image God. As divine wisdom woman images God and God is imaged as woman. This female wisdom tradition culminates in the thought of Julian of Norwich, where femaleness is brought into the Trinity itself. As Julian puts it, ‘As truly as God is our Father, so truly is God our Mother.’

The Reformation era of the sixteenth to seventeenth centuries was in many ways a setback for women. Laws that allowed women membership in guilds and control of property were narrowing. The Protestant Reformation abolished celibacy while championing the patriarchal family, where women’s sole vocation was that of dutiful wife and mother. Women lost many vocations that had been available through celibate female communities.

In Catholic areas, of course, monastic life for women continued, but the narrowing of women’s legal and economic opportunities also happened there. The Catholic Counter-Reformation stressed the strict cloistering of women and resisted the efforts of many creative women of the period, such as Mary Ward in England, who sought to found new non-cloistered activist orders for women. Women were either to be wives obedient to their husbands or cloistered nuns obedient to the male authorities of the church.

A few Catholic humanist thinkers, such as Erasmus, championed women’s expanded education. The maverick Catholic humanist Agrippa von Nettesheim even suggested that women were originally not only spiritually equal to men, but superior in their representation of the divine Wisdom of God. Male domination was recent, and manifested not God’s will or female inferiority, but simply male tyranny. But this view was deeply contrary to the trends of the times and would not be heard again until expounded by late eighteenth-century feminists such as Mary Wollstonecraft, who championed women’s rights in her treatise A Vindication of the Rights of Woman (1792). Such views would not appear in Catholicism until after the Second Vatican Council in the 1960s.

There was virtually no organized Catholic participation in the first wave of American feminism that ran from the first Women’s Rights Convention in Seneca Falls, New York in 1848 until the winning of women’s suffrage in 1920. The reasons for this lie in several factors: first, the Catholic bishops were vehemently opposed to feminism, associating it with a secular modern apostasy to Christian and ‘family’ values, represented by child labor laws, the Equal Rights Amendment, birth control, divorce, liberalism, and socialism. Many bishops explicitly rejected women’s suffrage itself, claiming that it would take women out of their proper sphere in the family.

The view of gender that came to prevail in nineteenth and early twentieth century Catholicism was no longer one that stressed women’s inferiority, but rather their complementarity to men. Women were different and even better than men, but their specific nature lay in their passive auxiliary being. Women were ‘Marylike,’ rather than ‘Christlike.’ They are to represent the church in its submission to Christ, and to masculine leadership, not taking any such leadership themselves. In these views, Catholicism echoed and put its particular stamp on the romantic views of gender complementarity common to the period.

The conflict between official Catholicism and more egalitarian views of women has moved into the Catholic community in the post-Vatican II era. Vatican II itself largely ignored women. Women, even women religious, were absent from the delegates to the Council. Protest about women’s absence by Cardinal Leon-Joseph Suenens in 1962 brought a slight change. A few women religious and laywomen were allowed as ‘auditors’ without vote in the later sessions of the Council. Three even served on commissions that drafted the final Council statements, specifically those on ‘the Church in the Modern World’ and on the ‘Apostolate of the Laity.’

The final documents of the Council have a few concessions to women’s new role in work and legal equality in modern society. It is said that women are claiming a new equity for themselves before the law and in fact, and the participation of women in cultural life should be acknowledged and supported by the church. These statements are echoed in Pope John XXIII’s encyclical Pacem in Terris (1963). Here it is said that ‘women are becoming ever more conscious of their human dignity, they will not tolerate being used as mere material instruments, but demand rights befitting the human person in both domestic and in public life’ (Sect. 41).

These concessions to women’s role in public society did not include changes in the life of the church itself, in terms of women’s ordination, or in teachings on sexuality and reproduction. Both of these were being challenged in the 1960s. Mainline Protestant churches in Western Europe and North America had begun to ordain women in increasing numbers in the 1960s. In 1976 the American Episcopal Church endorsed women’s ordination, a change that brought women into the priesthood in a tradition more closely allied to Catholicism.

The Roman Magisterium responded with a declaration in that same year insisting that women could
not be ordained because as women they lacked the porters) is by no means limited to North America.
capacity to ‘image Christ.’ Maleness and represen- Feminist theological networks grew in Western
tation of Christ as a male were thus identified in a way Europe in the 1980s and 1990s, and in the Catholic
that many Catholics had come to be seen as ques- church in Asia, Africa, and Latin America. With an
tionable. Far from ending the discussion, support increasing crisis in the celibate male priesthood, more
groups for women’s ordination among Catholics have and more of the ministry in Catholic parishes and
grown since 1975 in Europe, and the United States and chaplaincies is being done by theologically trained
Canada. laywomen.
Another area of contention among Catholics has Thus the conflict between the Catholic hierarchy
been the traditional teaching that sex (allowable within and women, exacerbated by the reactionary policies
marriage alone) is primarily for procreation, and of the pontificate of John Paul II, showed no signs of
cannot be separated from its procreational potential. abating at the end of the twentieth century. The issues
Thus birth control (other than the rhythm method) is go far beyond the inclusion of a few token women in
forbidden. As moral theologians and ordinary Catho- clerical office. They point to a fundamental rethinking
lics began increasingly to question these teachings, of ancient traditions that are rooted in the male
Pope Paul VI convened a commission to study the hierarchy and a view of femaleness and sexuality as
question in 1964. The commission met from 1964–7 expressions of inferiority and sin to be distanced from
and included among its members three married the ‘purity’ of sacramental office.
couples who testified to the anxiety created by lack of A thoroughgoing rethinking of these patterns any
reliable family planning. time soon in official teaching seems unlikely. But since
The commission’s report voted to change the women who claim a more egalitarian view of them-
teaching to one that accepted all medically approved selves and their right to decide about their own bodies
methods of birth control within the context of com- show no signs of departing in large numbers from the
mitted, child-raising families. But this official report Catholic church, it is likely that this conflict will
was rejected by Pope Paul VI, who was advised by a continue in the foreseeable future as a deep schism of
few dissenters that such a change would threaten the viewpoints concerning gender identity and relations
church’s claim to authoritative teaching. The re- among Catholics.
affirmation of the anti-birth control teaching in the
papal enclyclical Humanae Vitae (1968) created a See also: Buddhism and Gender; Catholicism; Family
storm of protest. Most lay Catholics have simply and Gender; Feminist Theology; Goddess Worship
chosen to ignore the Church’s teaching on this subject. (Old and New): Cultural Concerns; Islam and Gender;
Yet the Vatican, under John Paul II, has continued to
Judaism and Gender; Protestantism and Gender;
insist on this teaching and even to claim that it is
unchangeable, deepening the conflict between Magist- Religion and Gender; Religion: Family and Kinship;
erium and laity. Religion: Mobilization and Power; Religion: Morality
Recognizing the deeply conflicted state of relations and Social Control; Women’s Religiosity
of the official church to women, the American bishops
voted in the early 1980s to engage in an official
dialog with the US Women’s Ordination Confer-
ence. At the conclusion of this dialog the bishops Bibliography
decided to undertake a pastoral letter on women in the Børresen K E 1981 Subordination and Equivalence: The Nature
church. They conducted a series of ‘listening sessions’ and Role of Women in Augustine and Thomas Aquinas. Univer-
with Catholic women around the United States to hear sity Press of America, Washington, DC
their issues. They then began to draft an epistle that Kaiser R B 1986 The Politics of Sex and Religion. Leaven Press,
declared that men and women should be partners in Kansas City, MO
home, family, and church. The epistle even condemned Raming I 1976 The Exclusion of Women from the Priesthood:
‘sexism’ as a sin. Divine Law or Sex Discrimination? Trans. N Adams, Scare-
But the Vatican intervened in the process of drafting crow Press, Metuchen, NJ
this letter, demanding that the condemnation of birth Ruether R R 1995 ‘Catholic women’. In: In Our Own Voices:
control and women’s ordination must be strengthened, Four centuries of American women’s religious writing 1st edn.,
and that the model of male–female relations must be Harper, San Francisco, pp. 17–60
Ruether R R 1998 Women and Redemption: A Theological
one of complementarity, not partnership. The effect History. Fortress Press, Minneapolis, MN
of these interventions was a decision by the American Swidler L, Swidler A 1977 Women Priests: A Catholic Com-
bishops to table the epistle, rather than issue a mentary on the Vatican Declaration. Paulist Press, New York
document that was likely to worsen rather than Weaver M J 1985 New Catholic Women: A Contemporary
improve relations with women. Challenge to Traditional Religious Authority, 1st edn. Harper
This conflict between official teachings of the Vati- & Row, San Francisco
can and a growing consciousness of their rights to
equality among Catholic women (and male sup- R. R. Ruether

Copyright © 2001 Elsevier Science Ltd.


All rights reserved.

International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7


Cattell, Raymond Bernard (1905–98)

In describing the psychologist Raymond Bernard Cattell’s comprehensive research program, Child, in a single sentence, gave an amazingly complete and accurate description: ‘His major concern was to map out an integrated theory of human intellectual, temperamental and motivational characteristics within the context of hereditary and environmental influences using multivariate methods of analysis’ (Child 1998, p. 356). In trying to convey the sweep of Cattell’s theoretical and empirical work, Goldberg (1968) referred to him as ‘psychology’s master strategist’ (p. 617). Indeed, the sheer audacity of venturing into so many major research areas during one career is unlikely to be seen again for some time.

Raymond B. Cattell was born on March 20, 1905 in Staffordshire, England. He was raised in Devon, near the town of Torquay. Cattell began his academic career in chemistry at University College, London and received his Bachelor of Science degree with first class honors in 1924. Drawn to the study of psychology, Cattell began work on a doctorate at King’s College, also in 1924. After completing his Ph.D. in 1929, Cattell did some teaching at University College, Exeter before becoming Director of the School Psychological Services and Clinic at Leicester in 1932.

In 1937, E. L. Thorndike invited Cattell to come for a year to Columbia University as a research associate. Subsequently, he also spent brief periods at Clark, Harvard, and Duke universities. During his time at Harvard, Cattell began seriously to develop the methodology for intensively studying the single case (P-technique factor analysis) in which he remained interested throughout his career. Cattell acknowledges that with his ‘ … office … having been next door to Gordon Allport’s, the latter may also have played an initiating part’ but goes on to lament that although Allport, too, was ‘ … in pursuit of uniqueness of personality … ’ he had ‘ … some unreadiness to come to grips with statistical models’ (Cattell 1984, p. 141). Exhibiting a certain disdain for those who refused to use ‘proper’ methods, no matter how celebrated their status, was one of Cattell’s stable characteristics.

Cattell worked for a time during the latter years of World War II in the Adjutant General’s Office. In 1945 he became a research professor at the University of Illinois in Urbana-Champaign and remained there until 1973 when, in part due to age restrictions then in force, he retired. He moved on to the University of Hawaii where he maintained a special appointment for a few years. When that appointment ended, Cattell stayed in Honolulu and continued to work informally with several younger colleagues and to publish an occasional paper. One of his last publications (McArdle and Cattell 1994) brought to a close his 50-year affair (Cattell 1944) with the beguiling concepts of ‘parallel proportional profiles’ and ‘confactor rotation,’ more of which will be said later.

By far the longest and most productive portion of Cattell’s scientific career occurred while he was Director of the Laboratory of Personality Assessment and Group Behavior (later the Laboratory for Personality Analysis) at the University of Illinois. It was from this setting that his influence on the development of the science of psychology reached its maximum. The sheer number of big ideas and level of productivity in following them up during those years are remarkable. His success in claiming his ‘fair’ share of CPU time on the Illiac, the university’s automatic computer, was legendary. Although its capacity was meager by today’s standards, at the time the Illiac allowed Cattell to do large factor analyses in weeks and months that would otherwise have taken years.

1. Contributions to the Methods and the Substance of Psychology

Raymond B. Cattell is an exemplar of a class of psychological researchers who were heavily invested in the application of quantitative models to the study of behavior. Also included were the likes of Cyril Burt, Hans J. Eysenck, J. P. Guilford, Charles Spearman, Godfrey Thomson, and L. L. Thurstone. Let it be said openly that this is not a list of like-minded individuals who necessarily had more in common than not. Rather, it is a list of important contributors to the development of psychology as a science who relied substantially on the models and methods of factor analysis in their empirical research pursuits. In both strategy and tactics, they differed from each other in the use of factor analytic methods.

Because of Cattell’s heavy reliance on factor analysis, it is not possible to separate the substance from the methods. In fact, in an earlier review of Cattell’s research program in Theories of Personality (Hall and Lindzey 1978), the chapter heading is ‘Cattell’s Factor Theory.’ Therefore, a discussion of his work can be usefully begun with a methodological introduction.

At the heart of Cattell’s research program lay the basic equation of the common factor model. Cattell referred to it as the factor specification equation and wrote it as:

$$a_{ji} = b_{j1} f_{1i} + b_{j2} f_{2i} + \cdots + b_{jk} f_{ki} + u_{ji}$$

where $a_{ji}$ is a score for person $i$ on behavior (or act) $a_j$, and $f_{qi}$ ($q = 1, 2, \ldots, k$) represent the $i$th person’s endowments or scores on the $k$ common factors, the $b_{jq}$ ($q = 1, 2, \ldots, k$) are weights (factor loadings) that specify the amount to which any individual’s score on the $q$th factor contributes to (in Cattell’s theoretical framework, determines) his score on variable $a_j$, and $u_{ji}$ is a contribution to $a_{ji}$ that is unique to both the
behavior and the individual. The $u_{ji}$ terms include both errors of measurement and portions of the observed score that are reliably measured, but fall outside the purview of the common factors defining the interrelations of a given set of variables.

The $b_{jq}$ values, which Cattell referred to as situational indices (later, behavioral indices), brought the situation and the individual’s traits into the same equation so that both individual and situational information could be used to predict behavior. Suppose, for example, a given individual had a very high endowment on the factor Cattell labeled Exvia, which was similar in nature to the individual differences dimension called Extraversion in many personality research and theory contexts. If, on the one hand, the $b_j$ for Exvia with respect to performance of a given $a_j$ was small, then that high level of Exvia had little or no bearing on that person’s $a_j$ score. On the other hand, however, if $b_j$ for Exvia on performance of a given $a_j$ was large, then that high level of Exvia had major influence in determining that person’s $a_j$. By contrast, for the person with a near zero level of Exvia, little would be contributed to their performance on $a_j$ by the Exvia factor even if the $b_j$ for Exvia was large, because the product $b_{j,\mathrm{Exvia}} \cdot f_{\mathrm{Exvia},i}$ would be close to zero.

This factor specification equation is thus a ‘working model’ of the individual differences orientation in that the differing amounts of the latent attributes possessed by different people (variation in the $f_q$) are ‘converted’ into differences in their scores on the observed variables (variation in the $a_j$). By knowing the amounts of a person’s trait endowments and the magnitude of the contribution the traits made to manifest variables, one could predict an individual’s scores on the manifest variables, that is, behavior.

Within that framework, the scientific agenda is rather clear: define the latent factors as clearly as possible, construct tests to measure them accurately, and develop empirically based estimates of the $b$ values for the manifest variables, the $a_j$, of interest. Then one has made the prediction objective of science reachable and can also elaborate an explanatory system by detailing more and more regarding the nature of the factors (e.g., see how the factors behave across different media of observation and how stable they are across time, examine their genetic and environmental variance contributions, plot their age changes, etc.). Attacking these several objectives was a major part of Cattell’s research program.

Thus, despite knowing that adaptations in the methods of the physical sciences had to be made in studying behavior, Cattell’s methodological approach showed a deep reliance on fundamental principles of physical science. He believed that factors were causal influences on observable variables and that, in analytical work, when the factors had been properly rotated to their ‘causal influence’ identities, the factor loadings (the $b_j$ values in the specification equation) would be proportional for a given set of behaviors (the $a_j$ values in the specification equation) across different samples. This represented an explicit commitment to a fundamental objective of scientific research: the identification of relationships that remain invariant under different transformations. In his context, Cattell formalized this notion in the terms ‘parallel proportional profiles’ (the results) and ‘confactor rotation’ (the method) mentioned earlier. Along the way, he worked on the development of methods for assessing the similarity of factor loading patterns so that one could render some judgment about factorial invariance even without reaching the ideal parallel proportional profiles.

Cattell was also firmly committed to the principles of rigorous, quantitative measurement and scaling and once presented, with characteristic lack of modesty, a proposed more basic set of concepts (Cattell 1964) to replace the prevailing notions of reliability and validity. In so doing, he again challenged the status quo regarding the role of item homogeneity in evaluating the quality of measuring instruments.

Cattell placed a strict reliance on mathematical representations of phenomena of interest although, in his case, this amounted generally to using some variant of the common factor model. His insistence on the introduction of rigor into the definition of psychological concepts left its mark in a number of ways. He argued that ambiguity and confusion about concepts could be reduced by discarding as many of those in vogue as possible, discovering and defining new ones, and giving them new names to mark their unique status. In the process, he delivered such labels as ‘cortertia’ (cortical alertness) and ‘threctia’ (threat reactivity) to identify major individual differences in dimensions of personality.

A monograph on real-base, true zero scaling (Cattell 1972) illustrates the concern he held for fundamental issues of scaling, the commitment and energy with which he would attack a problem, provide a solution, and advance it for discussion and debate. Debate, however, was not always the reaction evoked by his proposals.

In addition to this strong analysis and modeling orientation, Cattell’s research program was buttressed by several other key attributes. Where others might carve up the study of behavior to identify more clearly a piece for themselves in the broader scheme, Cattell did so as a way of more systematically approaching the whole. Much of his work was organized around the tripartite division of abilities, temperament, and motivation. But he enriched this classical triumvirate by distinguishing between surface and source traits, estimating heritability coefficients, and incorporating concepts of psychological states, learning theory, and situational influences. He introduced the multiple abstract variance analysis (MAVA) design to incorporate different family configurations into the estimation of heritability coefficients, and in the design of correlational studies he saw not one, but six important
ways to interrelate variables, persons, and occasions of measurement. He systematized the latter in a heuristic often referred to as the ‘data box’ or covariation chart (Cattell 1952), which is still frequently encountered in the literature.

As an example of his capacity for orderly abstraction and the willingness and ease with which he would think ‘big,’ Cattell was not satisfied to name and report the results of his many factor analyses. Instead, he proposed the development of a Universal Index (U. I.) of factors, the first 15 of which (U. I. 1 through U. I. 15) were to be reserved for what he considered to be the well-established human ability factors. He then proceeded to assign the next 17 numbers (U. I. 16 through U. I. 32) for factors based on objective tests on which he had published. He invited further contributions to this indexing system but, as one might expect from even a passing acquaintance with human nature, one person’s proposal for a Universal Index is hardly likely to be adopted readily by others.

In addition to a frequently observed trait of successful research scientists (a desire to be first and to receive credit for it), Cattell was gifted with a high level of intelligence and capacity for abstract thought. He made up many of the items for the Culture Fair Intelligence tests (Cattell 1954) himself; took delight in the rapidity with which he could solve items from ‘would be’ intelligence test item writers and, in return, relished ‘stumping’ them with a recently written item or two of his own.

The levels of abstraction with which Cattell felt quite comfortable were in many cases a far remove from data. For example, second-order factor analyses (factor analyzing the intercorrelations of factors) were routine in his laboratory. Third-order factor analyses, while not routine, were completed on several occasions to which published papers attest. Along the way, at least one fourth-order factor analysis was conducted.

Cattell was not inclined to be tolerant of others’ lack of knowledge of his system. Those who worked in his laboratory were apt to be rebuffed when they would occasionally suggest ways by which he might strengthen the bonds of communication with his readers. ‘Let them go and read my 1957 book’ (Cattell 1957), was the reply heard on more than one occasion to the suggestion that a little more detail might be helpful to the reader. Nor, as was already mentioned, was he kindly disposed toward those he felt were not using the multivariate tools of the trade wisely.

Cattell was an avid reader (and writer) of poetry and also enjoyed biographies from which he enriched his own writing. His enormous legacy of books, chapters, and articles, numbering well over 500 publications, exhibits many examples of his caring about the craft of writing as well as the ideas being written about. He would search for just the right word or turn of phrase, and did so amazingly quickly! When he was well into his eighties the author once asked him in what areas of his intellectual functioning he could detect losses. He responded that he didn’t act on visual input as rapidly as he used to when, for instance, driving his automobile. (This was true!) But, he very quickly added, he could still retrieve adjectives, etc., quite handily when he was writing and he was very pleased about it. Some of his ‘classic’ remarks, many of which are buried in footnotes, will likely go unheeded by future generations, and that will be a stylistic as well as a substantive loss. For example, his characterization of the hyperplane as ‘the footprint of a causal influence’ or his accusation that by developing and making widely available the by then nearly ubiquitous orthogonal analytic rotation program, Varimax, Henry Kaiser had ‘put opium on the market,’ are illustrative.

Cattell’s deep reliance on factor analysis was symptomatic of his belief that the workings of variables should be examined with as little interference as possible from the observer. The methods of factor analysis could then be used to tease out the key patterns of relationships. Like any craftsman, Cattell felt strongly about having access to proper tools and worked to evolve factor analytic methods during the length of his career. He did this both by formalizing design with the presentation of the covariation chart or ‘data box’ and by restlessly pushing technological developments. The ever-present need for faster, more efficient methods for conducting factor analysis, if his research goals were to be realized, directly resulted in software applications such as Maxplane, Procrustes, and Rotoplot to facilitate rotation of factors to a final solution. The Rotoplot program, for instance, displayed the pairwise factor (actually, reference vector) plots on an oscilloscope and filmed them with a 35 mm camera mounted on the oscilloscope. As soon as the film strip could be processed, Cattell would project the plots on his office wall and write down the tangents of the angles of rotation for another run, urgently trying to achieve ‘maximum simple structure’ so that factors could be interpreted and the article written and submitted. At the time that Rotoplot was invented, others who were doing graphical rotation of factors were having to have each pairwise plot rendered by triangle and T-square.

2. Bridging Psychology’s Past and its Future

Cattell and his research program represent an important bridge between psychology’s past and its future. While he was alive, Cattell was a personal, as well as scientific, link to psychology’s past. On occasion, he would tell of conversations (sometimes, encounters) with the likes of Burt, Fisher, Pearson, and Spearman. Although he was not a student of Thurstone, he seized on the tool of multiple factor analysis which Thurstone was so instrumental in bringing to psychologists and further refined it to his
own ends. At the same time, his zeal in promoting, and passing along to future generations of researchers, not only his substantive findings but also the methods that he thought were appropriate and necessary for psychological research forged important links to the future. He was a prime mover in establishing the Society of Multivariate Experimental Psychology (SMEP) and the founding of the society’s journal, Multivariate Behavioral Research (MBR). He helped to launch MBR in 1966 with a provocative lead article entitled ‘Multivariate Behavioral Research and the Integrative Challenge,’ in which he summarized his own version of the two general traditions of psychological research (the experimental and the individual differences traditions) and pointed out the directions for future research that would elevate the study of behavior to new levels of competence. In that same year the first edition of the Handbook of Multivariate Experimental Psychology, for which Cattell was both the editor and the major contributor, was published. A second edition followed 22 years later (Nesselroade and Cattell 1988).

McArdle (1984) credited Cattell with having a profound effect on the development of what is now referred to generally as ‘structural modeling,’ the fitting of quantitative models to covariance matrices (and occasionally the associated arithmetic means, as well), which has grown to enormous proportion in current-day psychological research. This connection reinforces the amazingly productive service Cattell extracted from the factor analytic model. Cattell’s empirical and theoretical work, as no one else’s, has helped to clarify the distinction between latent and manifest variables and the key properties of both.

The contributions that Cattell made both substantively and methodologically were many. Here, we have only touched on some of the major ones. He developed and presented one of the most systematic and encompassing theories of personality that psychology has ever seen. Along the way, he introduced a large number of methodological innovations, some of which have made their way into today’s behavioral science argot with an ease that today gives no hint of Cattell’s pioneering contribution. The ‘scree test,’ ‘Procrustes’ rotation, and ‘P-technique’ come readily to mind.

The theory of fluid and crystallized intelligence, on which Cattell collaborated closely with Horn (Horn and Cattell 1966), remains very much in currency today as far as so-called psychometrically oriented models of cognition are concerned. The apparently different age gradients of these two kinds of abilities have helped to stimulate thinking not only about the complexity of the organism and the nature of its development across the lifespan (e.g., Baltes 1987), but also about the need to consider what level of aggregation of variables is optimal for a given purpose.

Despite Cattell’s strong investment in the study of human abilities (Cattell 1971), he did not neglect the temperament and motivation components of his research program. For example, the fruits of his efforts to measure temperament by questionnaire, especially, are evident today in the continuing usage of the High School Personality Questionnaire (HSPQ) and the 16 Personality Factor questionnaire (16PF). Some of the work of which Cattell was most proud and for which he had the highest hopes sprang from his contributions to the study of motivation. His concept of the dynamic calculus, which was an attempt to write a systematic prediction equation for behavior that included trait, state, and dynamic aspects, was refined into what he came to call structured learning theory. In the author’s view, this was a key contribution in Cattell’s own evaluation of his scientific legacy.

Many of the individuals who received their scientific training in Cattell’s laboratory are still active and productive today. They, in turn, are passing on to their students the concepts of the data box, fluid and crystallized intelligence, covariation designs, factorial invariance, states versus traits, etc.

Despite the comprehensive perspective, the numerous insights and innovations, and the sheer bulk of his scholarly output, which spanned parts of eight different decades, Raymond B. Cattell’s place in the history of psychology will not be identified without controversy for the foreseeable future. A lifetime achievement award for his contributions to psychology to be given by the American Psychological Foundation was announced in early 1997. Although Cattell had never taken kindly to the science ‘establishment,’ he had reached a point in his journey where such a high level of recognition for his scientific work meant something to him. But it was not to be. Some of his more philosophical, provocative writings having to do with the roles in our society of genetics, evolution, and public policy had generated bitter and sufficient resentment that, after some ‘negotiating,’ the award was withdrawn prior to the 1997 annual meeting of the American Psychological Association in Chicago, where it was to have been given. Cattell died the following February 2, at the age of 92.

See also: Adult Cognitive Development: Post-Piagetian Perspectives; Behavioral Assessment; Darwin, Charles Robert (1809–82); Factor Analysis and Latent Structure: Overview; Galton, Sir Francis (1822–1911); Lifespan Theories of Cognitive Development; Multivariate Analysis: Overview; Parsons, Talcott (1902–79); Personality Structure; Personality Theories; Quetelet, Adolphe (1796–1874); Thorndike, Edward Lee (1874–1949)

Bibliography
Baltes P B 1987 Theoretical propositions of life-span developmental psychology: On the dynamics between growth and decline. Developmental Psychology 23(5): 611–26
Cattell R B 1944 ‘Parallel proportional profiles’ and other principles for determining the choice of factors by rotation. Psychometrika 9: 267–83
Cattell R B 1952 Factor Analysis. Harper, New York
Cattell R B 1954 Culture Fair Intelligence Tests, Scales 1, 2, and 3, Forms A and B, rev. edn. IPAT, Champaign, IL
Cattell R B 1957 Personality and Motivation Structure and Measurement. World Book Co., New York
Cattell R B 1964 Validity and reliability: A proposed more basic set of concepts. Journal of Educational Psychology 55: 1–22
Cattell R B 1971 Abilities: Their Structure, Growth, and Action. Houghton Mifflin, Boston
Cattell R B 1972 Real base, true zero factor analysis. Multivariate Behavioral Research Monographs 72(1): 1–162
Cattell R B 1984 The voyage of a laboratory, 1928–1984. Multivariate Behavioral Research 19: 121–74
Child D 1998 Raymond Bernard Cattell (1905–1998). British Journal of Mathematical and Statistical Psychology 51: 353–7
Goldberg L R 1968 Objective Personality and Motivation Tests: theoretical introduction and practical compendium, Cattell R B and Warburton F W. Contemporary Psychology 13: 617–9
Hall C S, Lindzey G 1978 Theories of Personality, 3rd edn. Wiley, New York
Horn J L, Cattell R B 1966 Refinement and test of the theory of fluid and crystallized general intelligences. Journal of Educational Psychology 57: 253–70
McArdle J J 1984 On the madness in his method: R. B. Cattell’s contributions to structural equation modeling. Multivariate Behavioral Research 19: 246–67
McArdle J J, Cattell R B 1994 Structural equation models of factorial invariance in parallel proportional profiles and oblique confactor problems. Multivariate Behavioral Research 29(1): 63–113
Nesselroade J R, Cattell R B (eds.) 1988 Handbook of Multivariate Experimental Psychology, 2nd edn. Plenum, New York

J. R. Nesselroade

Causal Counterfactuals in Social Science Research

The term ‘counterfactual conditional’ is used in logical analyses to refer to any expression of the general form: ‘If A were the case then B would be the case.’ In this usage, A is usually false or untrue in the world so that A is ‘contrary to fact’ or counterfactual. Examples abound. ‘If kangaroos had no tails, they would topple over’ (Lewis 1973b). ‘If an hour ago I had taken two aspirins instead of just a glass of water, my headache would now be gone’ (Rubin 1978). Perhaps the most obnoxious counterfactuals in any language are those of the form: ‘If I were you, I would … .’

Lewis (1973a) observed the connection between counterfactual conditionals and references to causation. He finds these logical constructions in the language used by Hume in his famous discussion of causation. Hume defined causation twice over. He wrote ‘we may define a cause to be an object followed by another, and where all the objects, similar to the first, are followed by objects similar to the second. Or, in other words, where, if the first object had not been, the second never had existed’ (Lewis 1973a, italics are Lewis’s).

Lewis draws attention to the comparison between the factual first definition where one object is followed by another and the counterfactual second definition where, counterfactually, it is supposed that if the first object ‘had not been’ the second object would not have been either.

It is the connection between counterfactuals and causation that makes them relevant to social science research. From the point of view of some authors, it is difficult, if not impossible, to make any sense of causal statements without using counterfactual language (Lewis 1973a, Holland 1986, Rubin 1978, Robins 1985, 1986). Other authors are concerned that using such language gives an emphasis to unobservable entities that is inappropriate in the analysis of empirical data (Dawid 1997, Shafer 1996). The discussion here accepts counterfactuals in discussions of causation and will explain their role in the estimation of causal effects based on the work of Neyman (1923, 1935), Rubin (1974, 1978), and others.

We begin with a simple observation. Suppose that we find that a student’s test performance changes from a score of $X$ to a score of $Y$ after some educational intervention. We might then be tempted to attribute the pretest-posttest change, $Y - X$, to the intervening educational experience, that is, to use the gain score as a measure of the improvement due to the intervention. However, this is social science and not the tightly controlled ‘before-after’ measurements made in a physics laboratory. There are many other possible explanations of the gain, $Y - X$. Some of the more obvious are: simple maturation, other educational experiences occurring during the relevant time period, and differences in either the tests or the testing conditions at pre- and posttests. Cook and Campbell (1979) provide a classic list of ‘threats to internal validity’ that address many of the types of alternative explanations for apparent causal effects of interventions. For this reason, it is important to think about the real meaning of the attribution of cause. In this regard, Lewis’s discussion of Hume serves us well. From it we see that what is important is what the value of $Y$ would have been had the student not had the educational experiences that the intervention entailed. Call this score value $Y^*$. Thus enter counterfactuals. $Y^*$ is not directly observed for the student, that is, they did have the educational intervention of interest, so asking for what their posttest score would have been had they not had it is asking for information collected under conditions that are contrary to fact. Hence, it is not the difference $Y - X$ that is of interest, but the difference $Y - Y^*$, and the gain score has causal significance relative to the effect of the educational experience only if $X$ can serve as a substitute for the
counterfactual $Y^*$. In physical-science laboratory (1978) mainly in its emphasis on population quantities.


experiments such a substitution is often easy to make, One of the main benefits of this model is that it
but it is rarely believable in many social science identifies certain ‘counterfactual conditional expecta-
applications of any consequence. tions’ as the location of key assumptions about the
A formal model or language for discussing the inferential structure of any causal study and its
problem of estimating causal effects (of the form resulting data.
YkY * rather than YkX ) was developed by Neyman The prospective causal study begins with the ‘units,’
(for randomized experiments) and Rubin (for a wide ‘subjects,’ or ‘cases’ of the study, and the ith unit is
variety of causal studies) and will be called the denoted by the subscript ‘i.’ It may help the reader to
Neyman\Rubin model for causal effects here. imagine that this is a discussion about a very large
sample of units. Denote the population of units under
study by P. For the most part P will lie quietly in the
1. Prospectie Causal Studies background without being noticed. The baseline in-
formation that is collected or recorded for unit i will be
The Neyman\Rubin model is most easily understood
denoted as a ector of numerical information, zi.
in the context of a prospectie causal study that has the
There is a ‘causal’ variable denoting a set of possible
general structure specified by this sequence of events.
‘treatments’ or ‘exposure’ conditions to which each
(a) Subjects or experimental units of study are
unit in the study could be exposed. For simplicity we
identified.
assume that these are only two treatment conditions
(b) Baseline or pretest information about these units
denoted by x l 1 (treatment) and x l 0 (control). A
is recorded.
more complicated version of this would let x be a
(c) The units are either assigned to (in a controlled
number representing the ‘strength’ of the treatment
study) or select themselves to (in studies without the
level, but we will just use the dichotomous case here.
control of assignment) exposure to one of the treat-
An important aspect of causal variables designating
ment conditions or interventions of the study.
such treatments levels or intervention conditions is the
(d) These units are then subsequently exposed to
assumption that the level of exposure for any unit
their assigned or self-selected treatment condition (and
could hae been different from what it actually was.
each unit is affected by this exposure in a manner that
This condition excludes ‘attributes’ of units (such as
is unrelated to the exposure conditions of the other
race, gender, age, or pretest score) as causal variables
units).
in the sense that such attributes cannot have ‘unit-level
(e) At an appropriate later time an outcome,
causal effects’ in the sense that we will define below.
endpoint, or posttest measure is recorded for each unit
This idea is discussed more extensively in Holland
in the study.
(1986, 1988), and is mentioned again in another
The type of study that this five-part schema is
context at the end of the present discussion. Each unit
intended to cover includes most randomized com-
is exposed to one treatment level and the value of x to
parative experiments as well as many types of pretest\
which i is exposed is denoted by xi.
posttest quasiexperiments or observational studies. It
Finally we come to the outcomes or dependent
should be emphasized that, properly interpreted, the
variables in the study, and here is where a special
Neyman\Rubin model has application to other types
notation is needed. We let Yi (x) denote the (nu-
of causal studies (see Holland 1988, Holland and
merical) response that would be recorded for unit i if
Rubin 1988, Robins 1997), but, for our purposes here,
unit i were exposed to treatment level, x. For each i,
prospective causal studies are already sufficiently
Yi (x) is a function of x. It should be emphasized that
complicated and inclusive.
Yi (x) is not directly observed unless xi l x. This is an
The condition mentioned parenthetically in (d), that
important point because it is crucial to realize that the
the exposure conditions of the other units do not affect
oYi (x)q do not denote obsered data like zi and xi do,
the outcomes associated with a given unit, is very
but rather the oYi (x)q are ‘potential outcomes’ that lie
important, and is clearly an assumption that would
behind the observed values of the outcome variable.
not be true in general. For example, in the study of
For this reason, we denote the potential observations
infectious diseases your vaccination will affect my
by capital letters to distinguish them from quantities
likelihood of contracting polio. Rubin explicitly iden-
that are directly observable, which we denote by lower
tifies this assumption calling it the Stable Unit-
case letters.
Treatment Value Assumption, or SUTVA. SUTVA
The connection between the potential outcomes and
will be assumed throughout this discussion.
the actually observed outcomes is then given by the
equation
2. The Neyman\Rubin Model yi l Yi(xi) (1)
The version of the Neyman\Rubin model used here is where yi is the observed outcome or value of the
adapted from that of Holland (1986), and differs from dependent variable for unit i. The idea behind Equa-
the original versions by Neyman (1935) and Rubin tion (1) is that to get from the potential outcomes

$\{Y_i(x)\}$ to an observed outcome we must select the value of x in $Y_i(x)$ to be the value to which i is actually exposed, that is, $x = x_i$, and then we obtain $y_i$ from the potential observations, $\{Y_i(x)\}$, via Equation (1).

The observed data for each unit i is the vector $(z_i, x_i, y_i)$. The potential outcomes, $\{Y_i(x)\}$, are never observed for all values of x for a fixed unit, i, but only for the specific x-value to which i is actually exposed, $x_i$. It is sometimes said that the $\{Y_i(x)\}$ are ‘counterfactual’ because they are not actual observations. Here they are called potential observations because they could have been observed had $x_i$ been different than it was.

It is helpful to use a notation such as $E(y)$ to mean the average value of $y_i$ across the (large number of) units in P. Furthermore, an expression such as

$$E(y \mid x = a) \tag{2}$$

will mean the average value of $y_i$ across all of the (large number of) units in P for which $x_i = a$. The use of the expectation notation is to make certain quantities clearer in their meaning, and may be justified from either a frequentist or Bayesian point of view. It should be noted that within the expectation notation, the subscript i, denoting the unit, is suppressed because, within the scope of the $E(\,\cdot\,)$ operator, i is averaged over.

3. Using the Neyman/Rubin Model

An important fact about the Neyman/Rubin model is the Fundamental Problem of Causal Inference (Holland 1986), which is: It is impossible in principle to observe $Y_i(x)$ for more than one value of x for any one unit, i. Any procedure that claims to have avoided the Fundamental Problem of Causal Inference can always be shown to be based on untestable assumptions. Sometimes such assumptions are plausible, and sometimes they are not.

A basic definition that we are now in a position to make is that of a ‘unit-level causal effect.’ Because we are restricting attention to the simple case of two treatment levels, $x = 0/1$, we may restrict attention to these differences

$$Y_i(1) - Y_i(0) = \text{the causal effect of } x = 1 \text{ relative to } x = 0 \text{ for unit } i \tag{3}$$

In the Neyman/Rubin notation, the unit-level causal effects are the basic quantities of interest in causal inference. However, the Fundamental Problem of Causal Inference is now immediately seen to be fundamental because it implies that unit-level causal effects are, themselves, never directly observable. Thus, we are always reduced to making assumptions that allow us to make some sort of conclusion or inference about these causal effects. This is the place where the Neyman/Rubin model might appear to be impractical for applied research, that is, because its most basic parameters, the unit-level causal effects, are not directly observable. Furthermore, this is exactly the place where the potential exposability of all of the levels of x to any unit is seen to be crucial to the foundations of the theory. The definition of causal effect requires this assumption so that the difference, $Y_i(1) - Y_i(0)$, is meaningful. It is the Fundamental Problem of Causal Inference and this definition of causal effect that make causal inference both more interesting and more difficult than the simple computation of correlational and associational measures. The unit-level causal effects mimic the comparison between a ‘factual’ and a ‘counterfactual’ identified in the quote mentioned earlier from Hume by Lewis.

It is now time to introduce the notion of an Average Causal Effect (ACE). In general, an ACE is any average of unit-level causal effects. The most general ACE has the form

$$\mathrm{ACE} = E[Y(1) - Y(0) \mid A] \tag{4}$$

where A denotes some collection of units defined in terms of either $z_i$ or $x_i$ or both. In Equation (4) we again suppress the subscript i because it is being averaged over. As an example of an A in Equation (4) we might use A = ‘all the units in the study,’ in which case the ACE is the average causal effect over all of P. But other cases might be of interest, for example, A = ‘all units where i is male and for whom $x_i = 1$.’ In this case the ACE is for the males in treatment group 1. Here we restrict our attention to the ACE that is called the ‘effect of the treatment on the treated’ in which A denotes all of the units for which $x_i = 1$, that is

$$\mathrm{ACE} = E[Y(1) - Y(0) \mid x = 1] = E[Y(1) \mid x = 1] - E[Y(0) \mid x = 1] \tag{5}$$

Up to now, we have simply defined the basic structure of the data collection design as well as the causal connection between the potential outcomes and the causal variable x, that is, Equation (3). We have not yet identified the connection between any quantities that could be estimated with data and the causal parameters given by either the unit-level causal effects in Equation (3) or the average causal effects in Equation (5). This leads us to the ‘prima facie Average Causal Effects’ (FACEs). The FACEs are what can be estimated from the data. To parallel the ACE in Equation (5) we examine the FACE which is simply the difference between the mean of the outcome variable observed in each treatment group, that is

$$\mathrm{FACE} = E[y \mid x = 1] - E[y \mid x = 0] \tag{6}$$
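The quantities defined so far can be illustrated with a small simulation. The sketch below is not part of the Neyman/Rubin model itself. It invents a population in which both potential outcomes are known for every unit, which is never the case with real data, and every specific in it is an assumption made for illustration: the function name `simulate`, the baseline score z, the constant unit-level effect of 2, and the self-selection rule. It computes the ACE of Equation (5) and the FACE of Equation (6), once under self-selection into treatment and once under random assignment.

```python
import random
from statistics import fmean


def simulate(n, randomized, seed=0):
    """Simulate n units and return (ace_treated, face, y0_gap).

    Hypothetical setup: Y_i(0) = z_i and Y_i(1) = z_i + 2, so every
    unit-level causal effect Y_i(1) - Y_i(0) is 2 (up to rounding).
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)            # baseline information z_i
        y0, y1 = z, z + 2.0                # potential outcomes Y_i(0), Y_i(1)
        if randomized:
            x = 1 if rng.random() < 0.5 else 0
        else:
            # Assumed self-selection: high-baseline units favor treatment.
            x = 1 if z + rng.gauss(0.0, 1.0) > 0 else 0
        y = y1 if x == 1 else y0           # Equation (1): y_i = Y_i(x_i)
        (treated if x == 1 else control).append((y0, y1, y))

    # Equation (5): average of Y(1) - Y(0) over the units with x_i = 1.
    ace_treated = fmean(y1 - y0 for y0, y1, _ in treated)
    # Equation (6): difference between the observed group means.
    face = fmean(y for _, _, y in treated) - fmean(y for _, _, y in control)
    # Difference in mean Y(0) between the groups. It uses the Y(0) of
    # treated units, so it can be computed only inside a simulation.
    y0_gap = (fmean(y0 for y0, _, _ in treated)
              - fmean(y0 for y0, _, _ in control))
    return ace_treated, face, y0_gap


for label, randomized in (("self-selected", False), ("randomized", True)):
    ace, face, gap = simulate(20_000, randomized)
    print(f"{label}: ACE={ace:.3f} FACE={face:.3f} "
          f"FACE-ACE={face - ace:.3f} Y(0) gap={gap:.3f}")
```

Under the self-selection rule assumed here, units with higher baseline scores are more likely to be treated, so the FACE overstates the ACE. Under random assignment the two agree up to sampling error. In both cases the difference between FACE and ACE equals the difference in mean Y(0) between the treated and control groups.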

If we substitute the definition of y in terms of the potential observations, $\{Y_i(x)\}$, that is given in Equation (1) into Equation (6) we obtain

$$\mathrm{FACE} = E[Y(1) \mid x = 1] - E[Y(0) \mid x = 0] \tag{7}$$

Finally if we combine Equation (5) and Equation (7) we obtain the following basic formula that relates the ACE to the FACE

$$\mathrm{FACE} = \mathrm{ACE} + \mathrm{BIAS} \tag{8}$$

where

$$\mathrm{BIAS} = E[Y(0) \mid x = 1] - E[Y(0) \mid x = 0] \tag{9}$$

The BIAS term contains two parts, one factual, that is, $E[Y(0) \mid x = 0] = E[y \mid x = 0]$, and the other counterfactual, that is, $E[Y(0) \mid x = 1]$. The factual part is just the mean of $y_i$ for those units with $x_i = 0$. The counterfactual part is the mean of $Y_i(0)$ for those units for whom $x_i = 1$. $E[Y(0) \mid x = 1]$ is a quantity for which there can never be any data because the conditioning event makes the quantity being averaged over a counterfactual. Thus, it is a counterfactual conditional expectation. The value of such counterfactual parameters is that they pinpoint exactly where assumptions must be made that allow causal inference to take place using empirical data. When BIAS = 0, we have FACE = ACE and the empirical FACE equals the causal ACE.

An important condition that ensures that BIAS = 0 is the construction of $x_i$ by random assignment, which forces $x_i$ and $Y_i(0)$ to be statistically independent of each other as functions of i over P (Holland 1986). When this independence holds, $E[Y(0) \mid x = 1] = E[Y(0) \mid x = 0]$ and BIAS = 0.

4. Empty Counterfactuals

There is an unsatisfactory and rather misleading use of counterfactuals that sometimes arises in social science research. It occurs when the counterfactual condition, that is, the ‘if A were the case’ part, could never occur in any real sense. Such empty counterfactuals arise when a nonmanipulable factor in a causal study is described as having an ‘effect’ on some outcome. Examples easily come about in casual causal talk. The effect of gender on salary suggests considering ‘what her salary would have been had she been a man.’ The effect of test performance on future employment suggests ‘the job he would have had had he scored higher on the test.’ The effect of English language proficiency on a math test in English suggests ‘the mathematics score a non-English speaker would have received had he or she been an English speaker.’ These empty counterfactuals arise when the value of a variable for a factor in a study could not have been other than the value that it was. The interesting and useful counterfactuals arise in those cases when the variable could have had a different value than it did for the individuals in a study, at least in principle. Judgment as to when a counterfactual is empty or not is not always easy and may require careful thought in many cases. Consider the examples ‘if I were you, I would …,’ Lewis’s kangaroo, and Rubin’s aspirin in the opening paragraph. These represent very different kinds of counterfactuals from this perspective. The first is as empty as they get; the emptiness of the second depends on how the kangaroo might not have a tail (our imagination vs. an axe); while Rubin’s aspirin could be taken or not.

See also: Causation (Theories and Models): Conceptions in the Social Sciences; Counterfactual Reasoning: Public Policy Aspects; Counterfactual Reasoning, Qualitative: Philosophical Aspects; Counterfactual Reasoning, Quantitative: Philosophical Aspects; Internal Validity; Quasi-Experimental Designs

Bibliography

Cook T D, Campbell D T 1979 Quasi-experimentation: Design and Analysis Issues for Field Settings. Houghton Mifflin, Boston
Dawid A P 1997 Causal Inference without Counterfactuals. Research Report No. 188, Department of Statistical Science, University College, London
Holland P W 1986 Statistics and causal inference. Journal of the American Statistical Association 81: 945–70
Holland P W 1988 Causal inference, path analysis and recursive structural equations models. In: Clogg C (ed.) Sociological Methodology. American Sociological Association, Washington, DC, pp. 449–84
Holland P W, Rubin D B 1988 Causal inference in retrospective studies. Evaluation Review 12: 203–31
Lewis D K 1973a Causation. Journal of Philosophy 70: 556–67
Lewis D K 1973b Counterfactuals. Harvard University Press, Cambridge, MA
Neyman J 1923 Sur les applications de la theorie des probabilites aux experiences agricoles: Essai des principes. Roczniki Nauk Rolniczki 10: 1–51 (in Polish: English trans. Dabrowska D, Speed T 1991 Statistical Science 5: 463–80)
Neyman J 1935 Statistical problems in agricultural experimentation. Supplement of the Journal of the Royal Statistical Society 2: 107–80
Robins J M 1985 A new theory of causality in observational survival studies: Application of the healthy worker effect. Biometrics 41: 311
Robins J M 1986 A new approach to causal inference in mortality studies with a sustained exposure period: application to control of the healthy worker survivor effect. Mathematical Modelling 7: 1393–1512
Robins J M 1997 Causal inference from complex longitudinal data. In: Berkane M (ed.) Latent Variable Modeling with Applications to Causality. Springer-Verlag, New York, pp. 69–117
Rubin D B 1974 Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66: 688–701
Rubin D B 1978 Bayesian inference for causal effects: The role of randomization. Annals of Statistics 6: 34–58
Shafer G 1996 The Art of Causal Conjecture. MIT Press, Cambridge, MA

P. W. Holland

Causal Inference and Statistical Fallacies

1. Generalities

The pairing of causality and fallacies may seem idiosyncratic. In fact it nicely captures the point that many statistical fallacies, i.e., plausible-seeming arguments that give the wrong conclusion, hinge on the overinterpretation or misinterpretation of statistical associations as implying more than they properly do.

The article begins by discussing three main views of causality, briefly indicating the scope for fallacious arguments, and then at the end returns to discuss some fallacies in slightly more detail. See Graphical Models: Overview.

The very long history in the philosophical literature of discussions of causality is largely irrelevant for these purposes. It typically regards a cause as necessary and sufficient for an effect: all smokers get lung cancer, all lung cancer patients smoke. Here the concern is with situations with multiple causes, even if one is predominant, and where explicit or implicit statistical or probabilistic considerations are needed.

2. Notions of Causality

2.1 Causality as Stable Association

Suppose that a study, or preferably several different but related studies, shows that two features, C and R, of the individuals (people, firms, communities, households, etc.) under investigation are associated. That is, if we take, to be explicit, positive monotone association, individuals with high values of C tend to have high values of R and vice versa. For example C and R might be test scores at a given age in arithmetic and language, or level of crime and unemployment rate in a community.

Under what circumstances might one reasonably conclude that C is a cause of a response R, or at least make some step in the direction of that conclusion? And what would such a statement of causality mean?

2.1.1 Symmetric and directed relations. Association is a symmetric relation between two, or possibly more, features. Causality is not symmetric. That is, if C is associated with R then R is associated with C, but if C is a cause of R then R is not a cause of C. Thus the first task, given any two features C and R, is to distinguish the cases where:
(a) C and R are to be regarded as in some sense on an equal footing and treated in a conceptually symmetric way in any interpretation.
(b) One of the variables, say C, is to be regarded as explanatory to the other variable, R, regarded as a response. That is, if there is a relation, it is regarded asymmetrically.

Often significance tests for the existence of association and of dependency are identical. The distinction being studied here is a substantive one of interpretation. Failure to observe this distinction leads to the fallacy of the overinterpreted association.

2.1.2 Graphical representation. A useful graphical representation shows two variables $X_1$ and $X_2$, regarded on an equal footing, if associated, as connected by an undirected edge, whereas two variables such that C is explanatory to R, if connected, are joined by a directed edge. See Fig. 1a and Fig. 1b.

There are two possible bases for the distinction between explanatory and response variables. One is that features referring to an earlier time point are explanatory to features referring to a later time point. The second is a subject-matter working hypothesis based for example on theory or on empirical data from other kinds of investigation. Thus the weight of a child at one year is a response to maternal smoking behavior during pregnancy. In such situations the relevant time is not the time when the observation is made but the time to which the features refer, although of course observations recorded retrospectively are especially subject to recall biases.

As an example of the second type of explanatory-response relation, suppose that data are collected on diabetic patients assessing their knowledge of the disease and their success in managing their disease, as measured by glucose control. These data may well refer to the same time point and it is not inconceivable that, for example, patients with poor glucose control are thereby encouraged to learn more about their disease. Nevertheless, as a working hypothesis, one might interpret the data assuming that knowledge, C, is explanatory to glucose control, R, considered as a response. This is represented in simple graphical form in Fig. 1b by the directed edge from C to R.
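The remark in Sect. 2.1.1 that significance tests for association and for dependency are often identical can be checked numerically in one familiar case. The sketch below assumes simple linear regression and the usual t statistics. The helper names `slope_t` and `correlation_t` and the simulated data are illustrative inventions, not taken from the article. It shows that the t statistic for the slope is the same whichever variable is treated as the response, and that both equal the t statistic for a zero correlation.

```python
import math
import random


def _sums(u, v):
    """Centered sums of squares and cross-products for two samples."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    return n, suu, svv, suv


def slope_t(explanatory, response):
    """t statistic for a zero slope, regressing response on explanatory."""
    n, suu, svv, suv = _sums(explanatory, response)
    slope = suv / suu
    residual_ss = svv - slope * suv
    return slope / math.sqrt(residual_ss / ((n - 2) * suu))


def correlation_t(u, v):
    """t statistic for zero correlation; symmetric in u and v."""
    n, suu, svv, suv = _sums(u, v)
    r = suv / math.sqrt(suu * svv)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Simulated features C and R (hypothetical data with a positive association).
rng = random.Random(1)
c = [rng.gauss(0.0, 1.0) for _ in range(200)]
r = [0.5 * ci + rng.gauss(0.0, 1.0) for ci in c]

print(slope_t(c, r), slope_t(r, c), correlation_t(c, r))
```

The three printed values coincide up to rounding, so in this setting the data alone cannot say which variable is explanatory. That choice rests on the substantive considerations described above.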

Figure 1
(a) Undirected edge between two variables $X_1$, $X_2$ on an equal footing. (b) Directed edge between explanatory variable C and response variable R. (c) General dependence of response R on B, C. (d) Special situation with $R \perp\!\!\!\perp C \mid B$. (e) Special situation with $B \perp\!\!\!\perp C$ corresponding in particular to randomization of C

In summary, the first step towards causality is to require good reasons for regarding C as explanatory to R as a response, so that any causal connection between C and R (and there may be none) is that C is a cause of R, not the other way round.

We may talk about the ‘fallacy of the incorrect direction’ when the explanatory-response relation is identified in the wrong direction.

2.1.3 Common explanatory variables. Next, consider the possibility of one or more common explanatory variables. For this, suppose that B is potentially explanatory to C and hence also to R. There are a number of possibilities of which the most general is shown in Fig. 1c with directed edges from B to C, from C to R, and also directly from B to R. On the other hand, if the relation were that represented schematically in Fig. 1d, the only dependence between C and R is that induced by their both depending on B. Then C and R are said to be conditionally independent given B, sometimes conveniently written $R \perp\!\!\!\perp C \mid B$. There is no direct path from C to R that does not pass via B. Such relations are typically assessed empirically by some form of regression analysis.

In such a situation, one would not regard C as a cause of R, even though in an analysis without the background variable B there is a statistical dependence between the two.

This discussion leads to one definition used in the literature of C being a cause of R, namely that there is a dependence between C and R and that the sign of that dependence is unaltered whatever variables $B_1$, $B_2$, etc., themselves explanatory to C, are considered simultaneously with C as possible sources of dependence. This definition has a long history but is best articulated by I. J. Good and P. Suppes. A corresponding notion for time series is due to N. Wiener and C. W. J. Granger. This definition underlies much empirical statistical analysis in so far as it aims towards causal explanation.

The definition entertains all possible alternative explanatory variables. In implementation via an observational study one can at best check that the measured background variables B do not account for the dependence between C and R. The possibility that the dependence could be explained by variables explanatory to C that have not been measured, i.e., by so-called unobserved confounders, is less likely the larger the apparent effect, and can be discounted only by general plausibility arguments about the field in question. Sensitivity analysis may be helpful in this: that is, it may be worth calculating what the properties of an unobserved confounder would have to be to explain away the dependence in question. For further details see Rosenbaum (1995) and the entry Observational Studies: Overview.

Mistaken conclusions reached via neglect of confounders, observed or unobserved, may be called ‘fallacies of the neglected confounder.’

2.1.4 Role of randomization. The situation is substantially clarified if the potential explanatory variable C is a randomized treatment allocated by the investigator. Then in the scheme sketched in Fig. 1e there can be no edge between the B’s and C since such dependence would be contrary to randomization, i.e., to each individual under study being equally likely to receive each treatment possibility. In this situation an apparent dependence between C and R cannot be explained by a background variable as in Fig. 1d. It is in this sense that it is sometimes stated, especially in the statistical literature, that causality can be inferred from randomized experiments and not from observational studies. It is argued here that while, other things being equal, randomized experiments are greatly to be preferred to
observational studies, difficulties of interpretation, sometimes serious, remain.

2.1.5 Intermediate variables. In Sect. 2.1.4 the variables B have been supposed explanatory to C and hence to R. For judging a possible causal effect of C it would be wrong to consider in the same way variables intermediate between C and R, i.e., variables I that are responses to C and explanatory to R. They are valuable in clarifying the nature of any indirect path between C and R, but the use of I in a regression analysis of R and C would not be correct in assessing whether such a path exists. If R is independent of C given an intermediate variable I, but dependent on I, then C may still have caused I and I may be a cause of R.

For instance suppose that C represents assignment to a new medical regimen as compared with a control regimen and that the former, but not the latter, eventually induces lower blood pressure, I, which in turn induces a reduced cardiac event rate, R; see Fig. 2a. Does the new regimen cause a reduced cardiac event rate? If R is conditionally independent of C given I, it would be reasonable to say that the regimen does cause a reduction in R and that this reduction appears to be explained via improved blood-pressure control.

Figure 2
(a) Intermediate variable I accounting for overall effect of C after ignoring I; $R \perp\!\!\!\perp C \mid I$. (b) Correlated variables C, C* on an equal footing and both explanatory to response R

2.1.6 Explanatory variables on an equal footing. In some ways an even more delicate situation arises if we consider the role of variables C* on an equal footing with the variable C whose causal status is under consideration; see Fig. 2b. If the role of C is essentially the same whether or not C* is conditioned, i.e., whether or not C* is included in the regression equation, there is no problem, at least at a qualitative level. On the other hand, consider the relatively common situation where there is clear dependence on (C, C*) as a pair but where either variable on its own is sufficient to explain the dependence. There are then broadly three routes to interpretation:
(a) To regard (C, C*) collectively as the possibly causal variables.
(b) To present at least two possibilities for interpretation, one based on C and one on C*.
(c) To obtain further information clarifying the relation between C and C*, establishing for instance that C* is explanatory to C and that the appropriate interpretation is to fix C* when analysing variations of C.

For example, suppose that C and C* are respectively measures of educational performance in arithmetic and language of a child, both measured at the same age, and that the response is some adult feature. Then the third possibility is inapplicable; the first possibility is to regard the variables as a two-dimensional measure of educational performance and to abandon, at least temporarily, any notion of separating the role of arithmetic and language.

In summary, this first broad notion of causality is that of a statistical dependency that cannot be explained away via an eligible alternative explanation.

2.2 Causality as the Effect of Intervention

2.2.1 Counterfactuals. While the notion of causality discussed in Sect. 2.1.6 is certainly important and is strongly connected with the approach adopted in many empirical statistical studies, it does not, however, directly capture a stronger interpretation of the word causal. This is connected with the idea of hypothetical intervention or modification. Suppose for simplicity of exposition that C takes just two possible forms, to be called presence and absence. Thus presence might be the implementation of some program of intervention, and absence a suitable control state. For monotone relations, one may say that the presence of C causes an increase in the response R if an individual with C present tends to have a higher R than that same individual would have had if C had been absent, other things being equal.

Slightly more explicitly, let B denote all variables possibly explanatory to C and suppose that there are no variables C* to be considered on an equal footing to C. Consider for each individual two possible values of R, $R_{\mathrm{pres}}$ and $R_{\mathrm{abs}}$, that would arise as C takes on its two possible values, present and absent, and B is fixed. Then presence of C causes, say, an increase in R if $R_{\mathrm{pres}}$ is in some sense systematically greater than $R_{\mathrm{abs}}$.

We now discuss this notion, which has its origins at least in part in J. Neyman’s and R. A. Fisher’s work on design of experiments and in the studies of H. A. Simon and has been systematically studied and fruitfully applied by D. B. Rubin.
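The phrase ‘in some sense systematically greater’ can be read in more than one way, and a toy table makes the readings concrete. The sketch below is purely illustrative. The six individuals, their background stratum B, and their pairs of response values are invented, and both values are written down for every individual only so that the comparison can be displayed. The function name `summarize` is likewise hypothetical.

```python
from statistics import fmean

# Hypothetical individuals: (background stratum B, R_pres, R_abs).
individuals = [
    ("low", 12.0, 10.0),
    ("low", 11.0, 10.5),
    ("low", 9.0, 9.5),
    ("high", 15.0, 13.0),
    ("high", 16.0, 13.5),
    ("high", 14.0, 12.0),
]


def summarize(rows):
    """Mean of R_pres - R_abs and the share of individuals where it is positive."""
    differences = [r_pres - r_abs for _, r_pres, r_abs in rows]
    return {
        "mean_difference": fmean(differences),
        "share_positive": sum(d > 0 for d in differences) / len(differences),
    }


overall = summarize(individuals)
by_stratum = {
    b: summarize([row for row in individuals if row[0] == b])
    for b in ("low", "high")
}

print("overall:", overall)
for b, summary in by_stratum.items():
    print(f"B = {b}:", summary)
```

In this invented table the average difference is positive overall and within each stratum of B, yet one individual in the low stratum has a smaller response with C present. An average reading of ‘systematically greater’ is therefore satisfied while an individual-by-individual reading is not.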

For a given individual, only one of $R_{\mathrm{pres}}$ and $R_{\mathrm{abs}}$ can be observed, corresponding to the value of C actually holding for that individual. The other value of R is a so-called counterfactual whose introduction seems, however, essential to capture the notion hinted at above of a deeper meaning of causality.

2.2.2 Differences in counterfactuals. The simplest and least demanding relation between the two values of R is that over some populations of individuals under study the average of $R_{\mathrm{pres}}$ exceeds that of $R_{\mathrm{abs}}$. This is a notion of an average effect and is testable empirically in favorable circumstances. A much stronger requirement is that the required inequality holds for every individual in the population of concern. Stronger still is the requirement that the difference between the two values of R is the same for all individuals, i.e., that for all individuals

$$R_{\mathrm{pres}} - R_{\mathrm{abs}} = \Delta$$

This is called, in the language of the theory of the design of experiments, the assumption of unit-treatment additivity.

Now these last two assumptions are clearly not directly testable and can be objected to on that account. The assumptions are indirectly testable, to a limited extent at least. If the individuals are divided into groups, for example on the basis of one or more of the background variables B, the assumptions imply for each individual observed that the difference between the two levels of R has the same sign in the first case and that it is the same, except for sampling errors, in the second. In Sect. 2.2.3 the consequences of the possibly causal variable C having a very different effect on different individuals are discussed.

2.2.3 Ignorable treatment allocation. In Sect. 2.2.2 it has been tacitly assumed throughout that the two possible values of R for each individual depend on the value of C for that individual and would be unaffected by reallocation of C to other individuals. That is, the effects of C act independently as between different individuals. These considerations have a strong bearing on the appropriate definition of a unit of study. For example, in a comparison of different methods of class teaching of children, the unit would primarily be a whole class of students, i.e., the whole group who are taught and work together.

2.2.4 Intrinsic variables. There is an important restriction implicit in this discussion. It has to be meaningful in the context in question to suppose that C for an individual might have been different from how it in fact is. This is relevant only to variables that appear solely as explanatory variables. For example they may be variables measured at baseline, i.e., at entry into a study. Any intermediate variable by its nature of being at some point a response is potentially manipulable. Purely explanatory variables can be divided into intrinsic variables, essentially defining characteristics of the individual, and potential explanatory variables, which might therefore play the role of C in the present discussion. Intrinsic variables should not be regarded as even potentially causal in the present sense. For example the gender of an individual would in most contexts be regarded as an intrinsic characteristic. The question ‘what would R have been for this woman had she been a man, other things being held fixed?’ is in many, although not quite all, contexts meaningless.

2.2.5 Variables to be held fixed. Finally, care is essential in defining what is to be held fixed under hypothetical changes of C. Certainly responses I to C are not fixed. Variables, B, explanatory to C are held fixed. There is an essential ambiguity for variables C* on an equal footing with C. This is strongly connected with the issue of which explanatory variables to include in empirical regression analyses.

2.3 Causality as Explanation of a Process

There is a third notion of causality that is in some ways more in line with normal scientific usage. This is that there is some understanding, albeit provisional, of the process that leads from C to R. This understanding typically comes from theory, or often from knowledge at a hierarchical level lower than the data under immediate analysis. Sometimes, it may be possible to represent such a process by a graph without directed cycles and to visualize the causal effect by the tracing of paths from C to R via variables I intermediate between C and R. Thus the effect of diet in an epidemiological study may be related to the physiological processes underlying the disease under study, the effect of pharmaceutical products related to the pharmaco-dynamics of their action, the effect of interventions at a community level related to ideas of individual psychology, and so on.

This last notion of causality as concerned with generating processes is to be contrasted with the second view of causality as concerned with the effects of intervention and with the first view of causality as stable statistical dependence. These views are complementary, not conflicting. Goldthorpe (1998) has argued for this third view of causality as the appropriate one for sociology, with explanation via

Causal Inference and Statistical Fallacies

Rational Choice Theory as an important route for interpretation.

To be satisfactory there needs to be evidence, typically arising from studies of different kinds, that such generating processes are not merely hypothesized.

3. Special Issues

3.1 Interaction Involving a Potentially Causal Variable

We now turn to the issue of interactions with a potentially causal variable. The graphical representations used in Sect. 2.3 to show the structure of various kinds of dependency and independency holding between a set of variables have the limitation, at least in the form used here, that they do not represent interaction, in particular that an effect of C may be systematically different for different groups of individuals.

This affects the role in analysis and interpretation especially of variables B that are themselves possibly explanatory to the variable C whose causal status is under consideration. So far we have been largely concerned with whether such variables could explain the effect on response of C. In more detailed discussion we should consider the possibility that the effect of C is substantially different at different levels of B. For example, if B is an intrinsic feature such as gender, we consider whether the effect of C is different for men and for women. In particular, if the effects of C are in opposite directions for different levels of B we say there is a qualitative interaction, a possibility of special importance for interpretation.

Note especially that even when C represents a randomized treatment which is automatically decoupled from preceding variables B, the possibility of serious interactions with B cannot in general be ignored; see Sect. 5.1.

Viewed slightly differently, absence of interaction is important not only in simplifying interpretation but also in enhancing generalizability and specificity. That is, an effect that has been shown to have no serious interaction with a range of potential variables is more likely to be reproduced in some new situation and more likely to have a stable subject-matter interpretation.

3.2 Unwanted Unobserved Intermediate Variable

Consider further the role of variables I referring to time points after the implementation of C. A subject-matter distinction can be drawn between on the one hand intermediate variables that are responses to C and that are explanatory to R and are part of some natural process, and on the other hand interventions into the system that may depend on C and which may be explanatory to R, but which in some sense are unwanted or inappropriate for interpretation. In the context of clinical trials, an example is the failure of patients to comply with a treatment assigned to them. Ignoring such noncompliance can lead to inappropriate intention-to-treat analysis.

Another example is in evaluations of study programs, whenever students in only one of the programs receive intensive encouragement during the evaluation period.

3.3 Aggregation

So far, little has been said about the choice of observational units for study. At a fundamental research level it may be wise to choose individuals showing the effects of interest in their simplest and most striking form. More generally, however, the choice has to be considered at two levels. There is the level at which ultimate interpretation and action is required and the level at which careful observation of largely decoupled individuals is available. For example, a criminologist comparing different sentencing or policing policies is interested in individual offenders but may be able to observe only different communities or policing areas. A nutritional epidemiologist comparing different diets is interested in individual people but may have to rely, in part at least, on consumption and mortality data from whole countries. The assumption that a dependence established on an aggregate scale, for example at a country level, has a similar interpretation at a small-scale level, for example for individual persons, involves the assumption that there are no confounders B at a person level that would account for the apparent dependency. This will typically be very hard or even impossible to check at all carefully from country-level data.

Errors arising as a result of over-aggregated units of analysis are called ‘ecological fallacies’ or in econometrics ‘aggregation biases.’ See Ecological Inference.

4. Bradford Hill’s Conditions

The above discussion implicitly emphasizes that, while causal understanding is the aim of perhaps nearly all research work, a cautious approach is essential, especially, but by no means only, in observational studies. The most widely quoted conditions tending to make a causal interpretation more likely are those of Bradford Hill (1965), put forward in connection with the interpretation of epidemiological studies. Bradford Hill emphasized their tentative character.


For a critical discussion of these conditions, see Rothman and Greenland (1999) and for a slightly revised version of them Cox and Wermuth (1996, Sect. 8.7).

Koch gave conditions for inferring causality when the potential cause can be applied, withdrawn, and reapplied in a relatively controlled way and the pattern of response observed.

5. Some Fallacies in More Detail

The previous discussion has mentioned various points at which fallacious arguments are not only possible, but relatively common. This in no way covers the wealth of fallacious arguments possible in a statistical context, perhaps the most pervasive being the comparison of rates using inappropriate or ill-defined denominators. There is not space to discuss all these possibilities. This article is therefore concluded with three specific examples related to the main discussion of causality.

5.1 Inappropriate Reference Set for Probabilities

The first fallacy to be discussed, unlike the others, does not center around misuse of notions connected with causality, but rather with mathematically correct but inappropriate calculations of probability. Cornfield (1969) discussed a criminal case in California, People versus Collins, in which a couple had been found guilty partly on the basis of the following argument. Eye-witnesses described the crime as committed by a man with a moustache, by an Afro-American man with a beard, by an inter-racial couple in a car, by a couple in a partly yellow car, by a girl with blond hair, and by a girl with a ponytail.

Probabilities were assigned to these six features and, assuming independence, multiplied to give a probability $P = 0.8 \times 10^{-7}$, the smallness of this being argued as evidence of guilt. The California Court of Appeal rejected this argument as inappropriate, primarily because they regarded the issue as being whether there could be one or more further matching couples in the Los Angeles area. For this they argued in effect that if N is the population of the greater Los Angeles area then the number of couples with the assigned properties has a Poisson distribution of mean NP and, since one such couple is known to exist, the number has a zero-truncated Poisson distribution with parameter NP. From this the Court of Appeal calculated a probability of about 0.4 of there being one or more further couples with matching features, too high to justify a safe conviction of the original couple.

A more formal argument would use Bayes’s theorem to calculate the posterior probability of guilt, assuming that the only evidence is that stated above and that the numerical assignments are reasonably accurate.

5.2 Interactive Effect Involving an Unobserved Explanatory Variable

It has been reported (Zivin and Choi 1991) that early controlled clinical trials with stroke patients were discouraging because they appeared to give differing answers to the question: do thrombolytic agents effectively improve the status of stroke patients? These are substances to dissolve blood clots, i.e., thrombi. Nowadays it is known that a stroke can be caused by a thrombus or by a burst vein. Thus, a thrombolytic agent may improve the patient’s status or it may worsen it considerably, depending on the reason for the stroke. With patients of both types in any study ‘treatment-unit’ additivity will not hold, nor can there be ‘strongly ignorable treatment allocation,’ even if the study is a controlled clinical trial with random allocation of patients to one group treated with a thrombolytic agent and the other with a placebo. Instead, observations like those in Table 1 are to be anticipated. The main response is success of treatment

Table 1
Counts, percentages and odds-ratios; the two explanatory variables are independent, C ⊥⊥ B, due to randomized allocation of treatments, C, to patients; depending on the patient’s unobserved status, B, the thrombolytic agent has a very different chance of treatment success

                          B = 1, Burst vein               B = 2, Thrombus
                          C, Thrombolytic agent           C, Thrombolytic agent
R, Success of treatment   yes            no               yes            no
yes                       6              60               1425           300
                          (2 percent)    (20 percent)     (95 percent)   (20 percent)
no                        294            240              75             1200
sum                       300            300              1500           1500
odds-ratio                0.08                            76
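
The odds-ratios in Table 1, and the way a trial report would change with the mix of patient types, can be checked with a short calculation. The sketch below is illustrative only: the counts are those printed in Table 1, while the names `TABLE_1`, `odds_ratio`, `success_rate`, and `pooled_rates`, and the alternative patient mixes of 1/10 and 1/2, are assumptions introduced here and are not taken from the article.

```python
from fractions import Fraction

# Counts copied from Table 1, stored as (successes, failures) per treatment arm.
TABLE_1 = {
    "burst vein": {"agent": (6, 294), "placebo": (60, 240)},
    "thrombus": {"agent": (1425, 75), "placebo": (300, 1200)},
}


def odds_ratio(treated, control):
    """Odds of success with the agent divided by odds of success without it."""
    (a, b), (c, d) = treated, control
    return Fraction(a * d, b * c)


def success_rate(counts):
    """Share of successes in one (successes, failures) pair."""
    successes, failures = counts
    return Fraction(successes, successes + failures)


def pooled_rates(share_thrombus):
    """Success rates seen when B is unobserved and a given share has a thrombus."""
    rates = {}
    for arm in ("agent", "placebo"):
        burst = success_rate(TABLE_1["burst vein"][arm])
        thrombus = success_rate(TABLE_1["thrombus"][arm])
        rates[arm] = (1 - share_thrombus) * burst + share_thrombus * thrombus
    return rates


# Odds-ratio within each level of the unobserved status B.
for status, arms in TABLE_1.items():
    print(status, float(odds_ratio(arms["agent"], arms["placebo"])))

# Hypothetical patient mixes; only the share 5/6 corresponds to Table 1 (ratio 1:5).
for share in (Fraction(1, 10), Fraction(1, 2), Fraction(5, 6)):
    rates = pooled_rates(share)
    print(float(share), float(rates["agent"]), float(rates["placebo"]))
```

Under these assumptions the success rates within each status are held fixed and only the share of thrombus patients varies, so any difference between the printed lines is due to the patient mix alone.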


Table 2
Counts, percentages and the same association between R and C as measured in terms of relative chances given each level combination of A and B; strong three-factor interaction between A, B, C; response depends on each of the explanatory variables A, B, C

        A = 1                                                   A = 2
        B = 1                       B = 2                       B = 1                       B = 2
        C = 1         C = 2         C = 1         C = 2         C = 1         C = 2         C = 1         C = 2
R = 1   604           40            300           2000          301           2015          600           40
        (30 percent)  (20 percent)  (75 percent)  (50 percent)  (75 percent)  (50 percent)  (30 percent)  (20 percent)
R = 2   1396          160           100           2000          99            1985          1400          160
Sum     2000          200           400           4000          400           4000          2000          200
Relative chances for R = 1 comparing C = 1 to C = 2 given A, B
        1.5                         1.5                         1.5                         1.5

Table 3
Counts, percentages and relative chances obtained from Table 2 by summing over the levels of A show seemingly replicated associations for R and C given B; the actual associations between R and C, as shown in the previous table, appear reversed and (R, C) ⊥⊥ B

        B = 1                       B = 2
        C = 1         C = 2         C = 1         C = 2
R = 1   905           2055          900           2040
        (38 percent)  (49 percent)  (38 percent)  (49 percent)
R = 2   1495          2145          1500          2160
Sum     2400          4200          2400          4200
Relative chances for R = 1 comparing C = 1 to C = 2 given B
        0.78                        0.78
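
The step from Table 2 to Table 3 is a summation over the levels of A, and it can be reproduced directly from the counts. The following is a minimal sketch rather than the authors' own computation: the counts are those printed in Table 2, and the names `TABLE_2`, `relative_chance`, and `collapse_over_a` are hypothetical helpers introduced for illustration.

```python
# Counts copied from Table 2, keyed by (A, B, C) and stored as (R = 1, R = 2).
TABLE_2 = {
    (1, 1, 1): (604, 1396), (1, 1, 2): (40, 160),
    (1, 2, 1): (300, 100), (1, 2, 2): (2000, 2000),
    (2, 1, 1): (301, 99), (2, 1, 2): (2015, 1985),
    (2, 2, 1): (600, 1400), (2, 2, 2): (40, 160),
}


def relative_chance(cell_c1, cell_c2):
    """Chance of R = 1 under C = 1 divided by the chance of R = 1 under C = 2."""
    p1 = cell_c1[0] / sum(cell_c1)
    p2 = cell_c2[0] / sum(cell_c2)
    return p1 / p2


def collapse_over_a(table):
    """Sum the counts over the levels of A, giving the layout of Table 3."""
    collapsed = {}
    for (_a, b, c), (r1, r2) in table.items():
        old_r1, old_r2 = collapsed.get((b, c), (0, 0))
        collapsed[(b, c)] = (old_r1 + r1, old_r2 + r2)
    return collapsed


# Conditional on A and B: one relative chance per condition.
for a in (1, 2):
    for b in (1, 2):
        ratio = relative_chance(TABLE_2[(a, b, 1)], TABLE_2[(a, b, 2)])
        print(f"A={a} B={b}: {ratio:.2f}")

# With A unobserved: the same counts, summed over A, one relative chance per B.
TABLE_3 = collapse_over_a(TABLE_2)
for b in (1, 2):
    ratio = relative_chance(TABLE_3[(b, 1)], TABLE_3[(b, 2)])
    print(f"B={b}: {ratio:.2f}")
```

The ratios here are computed from the raw counts, whereas the figures printed in the tables appear to be based on rounded percentages, so the second decimal place may differ slightly. The direction of the comparison, above 1 in every (A, B) condition and below 1 once A is summed out, does not depend on that rounding.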

and the main explanatory variable is the type of treatment. In this example there is a ratio of 1:5 for the number of stroke patients with a burst vein to those with a thrombus. With successful randomization the patient’s status will be independent of treatment, but the strong interactive effect of status and treatment on outcome cannot be avoided. Whenever it is not feasible to observe the patient’s status, then the reported results of any clinical trial will strongly depend on the actual percentage of patients in the study having a thrombus as cause of the stroke. In any case, nowadays it would be unethical to include stroke patients known to have a burst vein in a clinical trial designed to study a thrombolytic agent.

5.3 Dependence Reversal

The artificial data in Table 2 illustrate that a strong three-way interaction among explanatory variables A, B, C can lead to replicated dependence reversals whenever the response R depends on each of the explanatory variables.

Here, treatment C = 1 is consistently better under four different conditions, since the chances for successful treatment R = 1 are higher for C = 1 if compared with C = 2; relative chances are even identical in the four conditions. But, this treatment appears to be consistently worse under two replications B = 1 and B = 2, i.e., when A is unobserved. In addition, in an analysis of only the three variables R, B, C, it appears as if randomization had been used successfully since C ⊥⊥ B.

These effects are an extreme example of what is often called the Yule-Simpson paradox. Although in a sense the word paradox is inappropriate, the possibility of this kind of dependence reversal reinforces the need for either carefully studying relations among explanatory variables or for using effective randomization procedures.

If a statistical association is to be judged as evidence for a causal hypothesis, then one should be certain that the observed associations do not mislead us about the actual associations. This is impossible without assumptions about unobserved variables even in trials


with randomization (see also Stone 1993). Therefore it appears that substantial progress in establishing causes can be expected only via understanding and description of the processes which generate observed effects.

See Linear Hypothesis: Fallacies and Interpretive Problems (Simpson’s Paradox).

6. Suggested Further Reading

The statistical aspects of causality are best approached via the discussion paper of Holland (1986), where in particular, key references to the earlier literature will be found; see also Cox and Wermuth (1996, Sect. 8.7). For general issues about observational studies, see Cochran (1965) and Rosenbaum (1995). For a philosophical perspective, see Simon (1972) and Cartwright (1989). For an interventionist view, see Rubin (1974) and for a more formal analysis still from a social science viewpoint Sobel (1995). For a development based on directed acyclic graphs, see Pearl (2000) and for the general connections with graph theory Lauritzen (2000). For an approach based on a complete specification of all independencies between a set of variables, followed by a computer-generated listing of all directed acyclic graphs consistent with those independencies, see Spirtes et al. (1993). The use of counterfactuals is criticized by Dawid (2000). The reader should be aware that many rather different interpretations of causality are involved in these discussions.

An elementary account of fallacies is given, with many interesting examples, by Huff (1954). Good (1978) gives a detailed classification of types of fallacy, again with excellent examples. See also Agresti (1983).

See also: Causation (Theories and Models): Conceptions in the Social Sciences; Causation: Physical, Mental, and Social; Explanation: Conceptions in the Social Sciences; Scientific Reasoning and Discovery, Cognitive Psychology of

Bibliography

Agresti A 1983 Fallacies, statistical. In: Kotz S, Johnson N L (eds.) Encyclopedia of Statistical Sciences. Wiley, New York, Vol. 3, pp. 24–8
Bradford Hill A 1965 The environment and disease: association or causation. Proceedings of the Royal Society of Medicine 58: 295–300
Cartwright N 1989 Nature’s Capacities and their Measurement. Clarendon Press, Oxford, UK
Cochran W G 1965 The planning of observational studies of human populations (with discussion). Journal of the Royal Statistical Society A 128: 234–65
Cornfield J 1969 The Bayesian outlook and its application (with discussion). Biometrics 25: 617–57
Cox D R, Wermuth N 1996 Multivariate Dependencies: Models, Analysis and Interpretation. Chapman and Hall, London
Dawid A P 2000 Causality without counterfactuals (with discussion). Journal of the American Statistical Association 95: 407–48
Goldthorpe J H 1998 Causation, Statistics and Sociology. Economic and Social Research Institute, Dublin, Republic of Ireland
Good I J 1978 Fallacies, statistical. In: Kruskal W H, Tanur J M (eds.) Encyclopedia of Statistics. Free Press, New York, Vol. 1, pp. 337–49
Holland P 1986 Statistics and causal inference (with discussion). Journal of the American Statistical Association 81: 945–70
Huff D 1954 How to Lie with Statistics. Norton, New York
Lauritzen S L 2000 Causal inference from graphical models. In: Klüppelberg C, Barndorff-Nielsen O E, Cox D R (eds.) Complex Stochastic Systems. Chapman and Hall/CRC, London
Pearl J 2000 Causality: Models, Reasoning and Inference. Cambridge University Press, Cambridge, UK
Rosenbaum P R 1995 Observational Studies. Springer, New York
Rothman K J, Greenland S (eds.) 1998 Modern Epidemiology, 2nd edn. Lippincott-Raven, Philadelphia, PA
Rubin D B 1974 Estimating causal effects of treatment in randomized and nonrandomized studies. Journal of Educational Psychology 66: 688–701
Simon H A 1972 Causation. In: Kruskal W H, Tanur J M (eds.) Encyclopedia of Statistics. Free Press, New York, Vol. 1, pp. 35–41
Sobel M E 1995 Causal inference in the social and behavioral sciences. In: Arminger G, Clogg C C, Sobel M E (eds.) Handbook of Statistical Modeling for the Social and Behavioral Sciences. Plenum, New York
Stone R 1993 The assumptions on which causal inferences rest. Journal of the Royal Statistical Society B 55: 455–66
Spirtes P, Glymour C, Scheines R 1993 Causation, Prediction and Search. Springer, New York
Zivin J A, Choi D W 1991 Neue Ansätze zur Schlaganfall-Therapie. Spektrum der Wissenschaft Sept: 58–66

D. R. Cox and N. Wermuth

Causation (Theories and Models): Conceptions in the Social Sciences

Many, perhaps most problems and hypotheses in social and behavioral research concern causal relations. What caused the fall of communism? What caused Peter’s depression? What causes aggression in general? Atkinson and Birch (1978, p. 30) consider as one of the fundamental questions in the study of motivation: ‘What causes the strength of tendencies to change?’ However, the idea or concept of causation is also involved in many cases in which people do not use the word ‘cause.’ Many concepts include the idea of causation as part of their meaning, for example, a


makes b happen, a produces b, a leads to b, a has an influence on b, or b depends on a. The central importance of causation is demonstrated by the fact that social scientists continuously emphasize the difference between causal relations and mere correlations. Many research methods have been developed for drawing valid causal inferences from observations. Causal statements are part of explanations. b can be explained by demonstrating that a occurred and that there is a causal law saying that whenever an event of type A takes place an event of type B necessarily follows. Although not all explanations are causal, causal explanations play a central role in science and in everyday life. Rational action is based on causal assumptions. This holds for everyday action as well as for scientific technology. If we press a switch, talk quietly to a baby, take medicine, or give up smoking, we have causal assumptions implying that what we are doing leads to a certain goal or prevents an event we don’t want to happen.

1. The Concept of Cause

The concept of causation expresses a relation between events, meaning that one event is the cause and another the effect. ‘a causes b’ means that a makes b happen. We also speak of causation when one event prevents another from happening. a and b may be qualitative events, for example, Paul losing his job or Maria passing her exam. Causation also applies to quantitative variables, for example, the rise in price caused the drop in demand. In such a case the cause and the effect are changes of quantities. It is also possible to say that X taking a certain value is the cause of Y taking a certain value.

Causal statements can be singular statements or general ones. ‘This event a caused this event b’ is singular. ‘A always causes B’ is general. An example of a general hypothesis is ‘Frustration always causes aggression’ (which is not true). A general causal hypothesis which has been confirmed by empirical research is called a causal law. Causal statements may refer to observable as well as to unobservable events. Some causal statements connect theoretical with empirical concepts, thus functioning as bridge principles in testing or applying theories. For example, if an experimenter presumes that a certain treatment arouses the power motive of subjects, she accepts a causal assumption. Or if reaction time is used as an indicator of certain cognitive processes, it is presupposed that these processes have a causal influence on reaction time.

Since David Hume’s famous treatise on causality, any discussion of this subject is much influenced by what he taught. Hume maintained that causal relations cannot be observed; they are inferred by the observer. We see that one billiard ball touches the other immediately before the second one begins to move. We perceive the two events and the temporal relation between them, but we do not perceive something like a causal power or a causal link between them (Hume 1739, p. 636). Therefore, Hume defined a cause to be an event a, followed by a contiguous event b, where A is always followed by B (1748, p. 76). This is Hume’s ‘regularity theory,’ which implies that there is no causal nexus in addition to regular succession.

The view that causation is reducible to regular succession has been criticized for centuries (cf. Armstrong 1983). It has been argued, for example, that, according to the regularity theory, day causes night. Hume’s view was nevertheless taken over by modern empiricists. Bertrand Russell claimed that the concept of cause was a ‘relic of a bygone age’ and could be replaced by the concept of functional relationship. Empiricism in turn had a considerable influence on the social sciences. Finally, however, logical empiricism (or ‘logical positivism’) was severely criticized (Popper 1959, Quine 1951). Its central claims had to be given up, especially the principle that a statement is meaningful if and only if it can be verified by observation. As a consequence of this criticism, the empiricist view of causation was put into question, too, and many philosophers came to the conclusion that causation is not reducible to regular succession (Bunge 1959). The idea of regular succession does not include the idea that a makes b happen. ‘Causation’ has perhaps to be taken as a fundamental concept not capable of being defined by other concepts like succession. There is, however, still no general agreement on this problem.

In contemporary science and philosophy, spatial contiguity (e.g., one billiard ball touches the other) is not considered as necessary for causation. In fact, spatial contiguity as a necessary condition was already outdated in Hume’s days, since Newton had developed his theory of gravitation which involves distant action. However, temporal contiguity is widely accepted. In the social sciences, causal statements usually presuppose that a cause precedes its effect. In principle, the idea of simultaneous causation also seems to apply. Backward causation, however, as discussed in some fields of physics, does not yet play a role in social or behavioral theories.

The idea of causation is closely connected to counterfactual conditional statements. Hume has given a second definition of causation which connects causes with conditionals. According to this definition, the statement that a caused b means that ‘if the first object had not been, the second never had existed’ (Hume 1748, p. 76). Today this is called counterfactual dependence: if a had not happened, b would not have happened either; if a had happened, b would have happened, too. Lewis (1986, essays 17 and 21) defines causation in terms of counterfactual dependence. It is, however, controversial whether the concept of causation or the concept of counterfactual dependence is more fundamental. Some philosophers maintain that


conditionals should be explained in terms of causation (Sanford 1989, Chaps. 11–14). In any case, the concepts of causation and conditionals are related to each other.

How closely are causes connected to their effects? Bunge defines causation in a way that assumes a very close connection: if A happens, then (and only then) B is always produced by it (Bunge 1959, p. 46). In this case, A is not only a sufficient condition of B, but also a necessary condition. Hume, as well as Newton, and later on, Russell, were convinced that an effect always has one and only one (type of) cause. Most contemporary social scientists instead hold that different causes can have the very same effect (a view already maintained by Machiavelli). For example, John may have been fired (b) because he stole the money (a). But he could have been fired anyway, so that stealing the money was not necessary for b. Being fired may have many different causes.

Interestingly, causes are often not sufficient either. John was only fired because his stealing was observed (c) and reported to the works management (d). a alone was not sufficient for b but a together with c and d were. This set of sufficient conditions was in turn not necessary for b since there could have been quite another set of sufficient conditions. Mackie (1974, p. 62) calls an event a of this kind an inus condition: a is an insufficient but nonredundant part of an unnecessary but sufficient condition of b.

In the social sciences, causal statements are often interpreted as ceteris paribus statements: other things being equal, A produces B. ‘Ceteris paribus’ does not refer to every other object and state all over the world (they never all remain unchanged, not even for a millisecond), but to those (partly unknown) states that are causally relevant for B. Consider, for example, this psychological law, known as ‘the Zeigarnik effect’: unfinished tasks are more easily recalled than completed ones. Obviously, this can only be true if understood as a ceteris paribus statement. The fact that a task has not been finished has a causal influence on recall. But there are, of course, further factors that determine whether a task will be recalled or not. Therefore, the Zeigarnik effect can only be demonstrated under certain experimental conditions: subjects work on a number of puzzles. Some of them are interrupted, some are not, and everything else is kept unchanged (as far as possible).

Since it is not always clear which additional factors have to remain unchanged, causal hypotheses are incomplete in a certain sense (Gadenne 1984). They can be made more complete by specifying further conditions under which the connection between A and B is supposed to hold. However, it is usually not possible to formulate a very precise hypothesis saying that A produces B provided that certain well-defined boundary conditions C hold. These ‘further conditions’ seem incapable of being described exactly. For example, it is usually presupposed that, during the causal process, the system (e.g., a person, a group, an institution) exposed to A is in a ‘normal state,’ that is, it is not destroyed, damaged, or disturbed too much. It is assumed, for instance, that there won’t be a natural catastrophe or a war, that the persons involved won’t become ill or crazy, and so on. If such extraneous events do happen and prevent B from happening, this is not considered as evidence against the hypothesis. The causal statement refers to the ‘normal,’ ‘undisturbed’ case.

If a is considered as a cause (maybe an inus condition) of b, this does not necessarily mean that a is the most immediate cause. A war may cause some people to die of hunger. This may be true, though there is a more immediate cause, namely that these people had no food just before they died, and even more immediate, physiological causes could be pointed out.

The previous remarks describe the general use of the concept of causation in the social and behavioral sciences. In addition, there are special concepts that differ from the general view. It has been claimed that in everyday language the concept of cause implies manipulation (Collingwood 1940, von Wright 1971). ‘a is the cause of b’ means that we can produce (or prevent) b by making a happen (preventing a). Now it seems to be true that many causes are actions and that people learn the concept of causation in close connection with their own behavior and its effects. However, it is implausible that this concept, once acquired, implies manipulation as part of its meaning. It is rather vice versa. The idea of manipulation by action seems to involve the idea of causation. Furthermore, people are able to use and to understand propositions about events which cannot be manipulated at all, like the explosion of a star.

In empirical social research, scientists regularly test statistical hypotheses, say, about means or correlation coefficients. In which way are these statistical hypotheses related to those causal hypotheses scientists are interested in? Usually, it seems to be presupposed that a general causal hypothesis can be tested by testing a statistical hypothesis about a parameter like a mean (cf. Erdfelder and Bredenkamp 1994). For example, if for every individual an increase in X causes an increase in Y, then for every group of individuals (and, of course, for a large population), the mean of X must be greater than the mean of Y.

But why deal with group means or correlations when the subject of research is causal relations? The reason for this is that observation and measurement are inaccurate to some degree, so that many observations have to be made in order to determine whether there is an increase or decrease in Y. In addition, to control effects of repeated measurement, it is often necessary in social research to use several research groups of individuals differing only in X (best achieved by random assignment of individuals, see below). Now, even if it is true that for every individual an increase in X causes an increase in Y, it does not follow


from this that every individual of a group higher in X has a higher measured score in Y than every individual of a group lower in X. Therefore, one needs statistical analysis. Usually, if the statistical null hypothesis is rejected and the alternative hypothesis is accepted, this is considered as a confirmation of the causal hypothesis being tested. However, there is no total agreement about this procedure and there are some problems, concerning, for example, the logical relation between causal and statistical statements, or the connection between deciding on statistical hypotheses and the confirmation or falsification of the causal hypotheses.

It has been asked whether causal statements themselves can be interpreted as probabilistic statements. Suppes (1970) has presented a probabilistic theory of causality. Let the probability that B happens be higher if A occurs before: p(B|A) > p(B). In this case, A seems to be something like a cause of B, it is a prima facie cause. A prima facie cause is not always a genuine cause. Suppose that there is an event A′ that precedes A, and let p(B|A and A′) = p(B|A′). Thus, A is actually quite irrelevant to B. It only appeared to be a cause, it is a spurious cause. If a prima facie cause is not spurious, it is a genuine cause. For a criticism of this theory see Salmon (1984, p. 193). For an advancement of the statistical approach to causation, especially with respect to the social and behavioral sciences, see Steyer (1985), Cheng (1997).

2. Causal Hypotheses and Empirical Research

Most social scientists hold (with Hume) that we cannot perceive something like a causal link between A and B. We only can perceive the events A and B themselves and their temporal relationship. The causal relation must be concluded from what we can observe. Under what conditions is it justified to conclude that certain results of observations confirm a causal statement? This is one of the major questions in the methodology of the social and behavioral sciences.

If A has always been followed by B, this can be explained by the causal hypothesis H: ‘A always produces B.’ It could be, however, that in every case in which A occurred, there was another event A′, unknown to the researcher, which actually caused B. Therefore, it is required that extraneous factors like A′ have to be ruled out by means of an appropriate empirical design. This is optimally done by comparing cases that differ only with respect to A (J. S. Mill’s Method of Difference). For instance, individuals are randomly assigned to an experimental and a control condition. The two conditions are as similar as possible (time, rooms), except that the individuals of the experimental group are exposed to A, whereas the individuals of the control group are not. The research question is whether A causes B, or, as social scientists often say, whether there is a causal relation between the independent variable X (in this simple case, X has the values A and non-A) and the dependent variable Y (B, non-B). If B is obtained under the experimental condition and fails to be obtained under the control condition, it is justified to consider the causal hypothesis H as confirmed (H being interpreted as a ceteris paribus hypothesis). This is justified because other possible causes of B were ruled out by different kinds of control. Some of them were eliminated (e.g., noise), others were kept constant (e.g., time, rooms, experimenter), so that they cannot be the cause of a difference between the experimental and the control condition.

In experimental research, X is manipulated by the researcher, whereas in passive observational methods (ex post facto design) X varies naturally. If manipulation is possible and justifiable, it has an advantage over passive variation, since it is much easier in this case to prepare two or more very similar research conditions, differing only in X. The most important method of control in social experimentation (as opposed to physical experimentation) is randomization. By randomly assigning individuals to experimental and control conditions, the properties of these individuals are held constant between the conditions. To be more precise, the probability distributions of these variables are held constant. In order to do this it is not necessary to know these variables, that is, one does not need to identify and measure each variable that could have an influence on Y.

In passive observational methods, one can also try to discover whether a correlation between X and Y is based on a causal relation with (the values of) X being the cause of (the values of) Y. Suppose, for instance, that people who pray regularly are healthier. It may be that praying (X) has a positive causal influence on health (Y). It could also be, however, that X and Y are both causally influenced by a third variable Z, say, a variable connected with lifestyle, or something else. In this case, the correlation between X and Y should disappear if Z is held constant. Statistically, the influence of the variable Z can be determined by calculating the partial correlation between X and Y adjusted for the linear regression of each on Z. If this partial correlation is close to zero, one may conclude that X is not a cause of Y. However, if one has not found a variable Z that accounts for the correlation between X and Y, this is no guarantee that such a variable does not exist.

Based on this kind of reasoning, methods have been developed for testing causal assumptions using data from passive observation (Blalock 1964). They are known under such names as causal models, path analysis, and structural equation models. They are especially important in sociology, economics, and political science, where it is more difficult to perform experiments than in psychology.

In addition to experimental and passive observational methods there are quasi-experimental methods.
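The partial-correlation check described above for passive observational data can be sketched in Python. This is only an illustration under stated assumptions: the data are simulated so that a third variable Z influences both X and Y while X has no effect on Y, and the function names `pearson` and `partial_corr`, the sample size, and the noise levels are hypothetical choices, not taken from the article.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def partial_corr(xs, ys, zs):
    """First-order partial correlation of X and Y, adjusted for the
    linear regression of each on Z."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / ((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5


# Simulated data (assumption): Z drives both X and Y; X does not cause Y.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(5000)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson(x, y)                  # clearly positive, although X does not cause Y
r_xy_given_z = partial_corr(x, y, z)  # close to zero once Z is held constant
print(f"r(X, Y) = {r_xy:.2f}, partial r(X, Y | Z) = {r_xy_given_z:.2f}")
```

In this simulated setting the raw correlation is substantial while the partial correlation nearly vanishes, which is the pattern the text associates with a common cause. As the text also stresses, a partial correlation that stays large does not show that no unmeasured third variable exists.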


These are experimental designs that allow for some control of unknown extraneous variables, though they do not use random assignment (Campbell and Stanley 1963, Cook and Campbell 1979, see also Cook and Shadish 1994).

In passive observational methods only those variables can be controlled which have been deliberately selected and measured. This is, of course, better than no control at all. Nevertheless, it cannot be ruled out that a covariation between X and Y is really due to a further variable which has not been under control. On the other hand, it should be noted that in randomized experiments, too, one can never be completely sure that every extraneous variable is under control. This follows from the trivial fact that, taken exactly, no empirical variable can be held perfectly constant, since two individuals (or groups) always exist at different space–time points. More important is that by manipulating X the experimenter may manipulate something else that, unbeknownst, has a causal influence on Y. This kind of extraneous variable cannot be controlled by randomization, but only by theoretical reflection: could the effect expected (or already found) in this experiment be caused by something else that varies with X? Is there any alternative hypothesis which also explains the difference in Y? Are there any nuisance variables connected with X? Reflections of this kind are of the utmost importance in planning experiments as well as in interpreting their results. They are not rendered superfluous by using standard designs.

In the social sciences the methodology of empirical research as presented above is usually ascribed to J. S. Mill. Mill (1843) conceived four Inductive Methods for discovering and proving causal relations, the most important of them being the Method of Difference (see above). However, he only put certain procedures already known in the Middle Ages into the form of precise rules. Robert Grosseteste (c. 1168–1253) suggested that a good way to determine whether a particular herb has a purgative effect would be to examine numerous cases in which this herb is administered under conditions where no other purgative agents are present.

Mill overrated the importance of his inductive rules. He believed that every causal law yet known in science was discovered by one of these rules. Mill’s contemporary William Whewell (1794–1866), one of the intellectual fathers of Karl Popper (1959), held the opposite view: hypotheses are discovered by creative insight, a process not reducible to specific inductive rules. Today most philosophers and historians of science maintain that there are many ways of inventing hypotheses and theories and that, say, Newton and Einstein did not invent their theories with the help of Mill’s rules.

However, Mill was at least partially right, since he also held that his rules play a decisive role in proving causal statements. The hypothetico-deductive methodology as conceived by Popper has quite similar implications for planning and conducting empirical research as Mill’s inductivism has. In order to test hypothesis H a test prediction P is derived deductively from H and some auxiliary assumptions. If non-P is observed, H is falsified, unless the auxiliary assumptions are put into question. If P is observed, this does not automatically count as a confirmation of H. The result P confirms H only if H has been tested seriously. A serious test requires that P cannot be explained by other hypotheses yet known in this field of research. A test is especially critical if, in addition to being serious, P contradicts other hypotheses that have been brought forward (which is known as an experimentum crucis). In this case, it is rather difficult for H to pass the test, and if it does, it deserves an increase in confirmation. Shortly put, this hypothetico-deductive methodology requires control of alternative explanations of B, and this in turn requires methods of control of other possible causes than A, including nuisance variables.

Since Campbell and Stanley (1963) it has become common practice to discuss the problem of causal interpretation in terms of internal and external validity. Internal validity is concerned with the singular causal statement ‘a causes b.’ The conclusion that a caused b is internally valid if the research design made it possible to control those extraneous variables that could have also produced b. However, the result that a caused b is compatible with there being further, nonredundant conditions which, in connection with a, produced b, e.g., properties of persons or the situation. Thus, the question arises whether this causal relationship also holds under different conditions. External validity is defined as ‘the approximate validity with which conclusions are drawn about the generalizability of a causal relationship to and across populations of persons, settings, and times’ (Cook and Campbell 1979, p. 39). External validity includes the question whether a causal relation found in laboratory research also holds under more natural conditions. Cook and Campbell (1979) expanded this view by adding construct validity and statistical conclusion validity. For a further development of these methodological criteria see Campbell (1986), Cook and Shadish (1994).

It is obvious that the problem of external validity is closely related to the problem of induction. Actually, the theory of internal and external validity is based on principles from Mill’s inductivistic view as well as on elements from Popper’s (1959) hypothetico-deductive methodology. Some supporters of a pure deductivistic view have instead proposed to dispense with any question of inductive generalizability and to formulate the problem in this way: which general hypotheses concerning the causal relation between A and B were tested in this research and which is the result? (cf. Gadenne 1984, Kruglanski and Kroy 1975). The result could be, for example, that the general hypothesis ‘A and C together cause B’ is confirmed, whereas the hypothesis ‘A alone is sufficient for B’ is falsified. The
result could also be that a hypothesis which was confirmed in earlier laboratory research is now falsified in a natural setting.

Like all hypotheses and theories, causal hypotheses can be empirically confirmed though not be proven as certain. They cannot be definitely falsified either. If the result of some repeated experiments is A and non-B, the hypothesis that A ceteris paribus produces B is invalidated to some degree and may be given up. But even in this case it is not completely sure that the hypothesis is false. It could have happened that in all these experiments there was an uncontrolled factor F sufficient to prevent B. Furthermore, a test prediction does not usually follow from H alone; it is derived from H together with further assumptions, including singular statements describing the experimental procedure, bridge principles connecting theoretical and empirical concepts, auxiliary assumptions from other theories or even other disciplines (for example, laws concerning measuring instruments), and the assumption that extraneous variables were eliminated or held constant. Therefore, unexpected empirical results never tell us directly which assumption is false (Duhem 1954). All scientists can do is to guess which assumptions are the mistaken ones, replace them by others, and make new predictions and tests. Testing hypotheses and theories is holistic in this sense. Hypotheses and theories, at least nontrivial ones, are not separately testable, which renders the interpretation of results more difficult. Nevertheless, most scientists and, probably, most philosophers of science believe that although there is no verification and no conclusive falsification there is progress in empirical research. After some systematic research, it turns out that certain causal hypotheses are much better confirmed than others, and the same holds, in the long run, for whole theories. It should be noted that contemporary Popperians (often misunderstood as supporting definite falsifications) agree with this view. They do not believe that theories can be falsified easily, and do not recommend that a theory has to be given up immediately if it is contradicted by an empirical result (cf. Albert 1999).

People have a strong tendency to ‘see’ and to infer causal relations in their environment. Cook and Campbell (1979) speculate that this is a product of the biological evolution: causal knowledge is helpful to adapt more effectively to external circumstances, especially knowledge about manipulable causes. Though causal hypotheses in everyday life (as well as in science) are usually simplifications, they often correspond sufficiently to real processes to have survival value.

There are limits to causal thinking. First, causal statements in the social sciences, even highly confirmed ones, are incomplete in a certain sense (see above): whether A really produces B depends on conditions which are not yet known completely and can probably never be described in a very precise way. The causal processes in the world are so complex that even the best causal laws are presumably simplifications which do not hold without exception.

Causal thinking is compatible with the fact that there are interactions and feedback processes. Wherever people interact, that is, in the whole psychological, social, and economic area, it should be taken into account that a variable Y which is causally dependent on a variable X may have an effect on X, too. Sometimes, this effect from Y on X is so small that it can be neglected. In this case, the simplification of taking only (the values of) X as the cause and (the values of) Y as the effect is justified. But in many other cases it may lead to serious mistakes if interactions are overlooked. Even in mechanics, which is usually considered as a shining example of causal thinking, many laws are actually interaction laws, for example, Newton’s third law of motion and his law of gravitation (cf. Bunge 1959).

See also: Causal Counterfactuals in Social Science Research; Causal Inference and Statistical Fallacies; Causation: Physical, Mental, and Social; Control Variable in Research; Explanation: Conceptions in the Social Sciences; External Validity; Generalization: Conceptions in the Social Sciences; Hypothesis Testing in Statistics; Hypothesis Testing: Methodology and Limitations; Internal Validity; Laboratory Experiment: Methodology; Theory: Conceptions in the Social Sciences

Bibliography

Albert H 1999 Between Social Science, Religion, and Politics. Essays in Critical Rationalism. Rodopi, Amsterdam
Armstrong D M 1983 What is a Law of Nature? Cambridge University Press, Cambridge, UK
Atkinson J W, Birch D 1978 An Introduction to Motivation. Van Nostrand, New York
Blalock H M Jr (ed.) 1964 Causal Inferences in Nonexperimental Research. The University of North Carolina Press, Chapel Hill, NC
Bunge M 1959 Causality: The Place of the Causal Principle in Modern Science. Harvard University Press, Cambridge, MA
Campbell D T 1986 Relabeling internal and external validity for applied social scientists. In: Trochim W M K (ed.) 1986 Advances in Quasi-Experimental Design Analysis: New Directions for Program Evaluation. Jossey-Bass, San Francisco, CA, Vol. 31, pp. 67–77
Campbell D T, Stanley J C 1963 Experimental and quasi-experimental designs for research on teaching. In: Gage N L (ed.) Handbook of Research on Teaching. Rand McNally, Chicago
Collingwood R G 1940 An Essay on Metaphysics. Oxford University Press, Oxford, UK
Cook T D, Shadish W R 1994 Social experiments: Some developments over the past fifteen years. Annual Reviews of Psychology 45: 545–80
Cook T D, Campbell D T 1979 Quasi-Experimentation: Design & Analysis Issues for Field Settings. Rand McNally, Chicago
Duhem P 1954 The Aim and Structure of Physical Theory. Princeton University Press, Princeton, NJ (Originally published 1908)
Erdfelder E, Bredenkamp J 1994 Hypothesenprüfung. In: Herrmann T, Tack W (eds.) Enzyklopädie der Psychologie. Methodologische Grundlagen der Psychologie. Hogrefe, Göttingen, pp. 604–48
Gadenne V 1984 Theorie und Erfahrung in der psychologischen Forschung. Mohr, Tübingen
Hume D [1739] 1978 A Treatise of Human Nature [Book I Selby-Bigge L A (ed.), 2nd rev. edn. Nidditch P H (ed.)]. Clarendon Press, Oxford, UK (Originally published 1739)
Hume D [1748] 1975 An Enquiry Concerning Human Understanding. [Selby-Bigge L A (ed.), 3rd rev. edn. Nidditch P H (ed.)]. Clarendon Press, Oxford, UK (Originally published 1748)
Kruglanski A W, Kroy M 1975 Outcome validity in experimental research: A re-conceptualization. Journal of Representative Research in Social Psychology 7: 168–78
Mackie J L 1974 The Cement of the Universe. Clarendon Press, Oxford, UK
Mill J S [1843] 1959 A System of Logic, Ratiocinative and Inductive, 8th edn. Longmans, London (Originally published 1843)
Popper K R 1959 The Logic of Scientific Discovery. Hutchinson, London
Quine W V O 1951 Two dogmas of empiricism. The Philosophical Review 60: 20–43
Steyer R 1985 Causal regressive dependencies: An introduction. In: Nesselroade J R, von Eye A (eds.) Individual Development and Social Change: Explanatory Analysis. Academic Press, Orlando, FL, pp. 95–124
Suppes P 1970 A Probabilistic Theory of Causality. North-Holland, Amsterdam
von Wright G H 1971 Explanation and Understanding. Cornell University Press, Ithaca, NY

V. Gadenne

Causation: Physical, Mental, and Social

We invoke the notion of causation to describe many different aspects of our experience and of the world. For example, in describing the physical domain, we might say ‘the earthquake caused the building to collapse’; in describing the psychological or mental domain, we might say ‘paranoia was the cause of Oswald shooting Kennedy’; and in describing the social domain, we might say ‘excessive military spending caused instability in the Soviet Union.’ In addition, we often use causal notions to link these various domains together. For example, we might say ‘her arm rose because she wanted to signal,’ linking the mental domain (wanting to signal) and physical domain (her arm rising); or ‘social policy causes the degradation of the environment,’ linking the social domain (social policy) and the physical domain (the environment). In short, causation is a very general notion: it applies in and across the physical, mental, and social domains. According to many philosophers, however, there is an interesting and difficult class of problems that emerges when we focus on how causal statements in one domain (for example, the physical domain) interact with causal statements in others (for example, the social or mental domains). Problems in this class, which we will call problems of causation in multiple domains, are what this article is about (for causation in general, see Causes and Laws: Philosophical Aspects).

1. Cartesian Dualism and Gassendi’s Objection

The seventeenth century philosopher and mathematician René Descartes famously held that, while the nature and behavior of nonhuman animals could be explained completely by the physical science of his day, various features of human beings, including in particular the fact that human thought and action are not under direct environmental control, seemed to place them beyond the reach of physical science. Descartes therefore endorsed dualism, the idea that every human being is a complex of a physical body located in space and subject to physical laws and an immaterial mind not located in space and not subject to physical laws. Descartes’ contemporary Gassendi famously objected that dualism made a mystery of how bodily actions and movements could be caused by immaterial psychological events: ‘How can there be effort directed against anything, or motion set up in it, unless there is mutual contact between what moves and is moved? And how can there be contact without a body?’ (Descartes and Cottingham et al. 1984).

Gassendi’s objection is an example of a problem of causation in multiple domains. It turns on the fact that there is causal contact between the mental and physical domains, something which is difficult to understand if Cartesian dualism is true. However, problems of this sort are by no means unique to the relationship between the mental and the physical, nor to Descartes’ specific account of that relationship, and nor do they arise only from older ideas in science and philosophy. On the contrary, they arise from a general picture of the world which is very common in contemporary intellectual culture: the causal hierarchy picture.

2. The Causal Hierarchy Picture

What sort of picture of the world does contemporary science present us with? According to one common view, science presents us with a twofold picture. One part of the picture is that the world is organized into a series of levels (cf. Oppenheim and Putnam 1958). The most basic level is the physical level, which is described
by the most basic science, physics. Arranged in a hierarchy on top of this level are other levels such as the chemical level, the biological level, the psychological level, and the social level, and each of these levels is described by nonbasic sciences (sometimes called special sciences), such as chemistry, biology, psychology, and sociology. To put things the other way around, the picture views social entities (institutions, groups, or corporations) as being composed of psychological entities (persons or individuals), which are themselves composed of biological entities (cells), which are themselves composed of chemical entities (molecules), which are themselves composed of fundamental physical entities (atoms and subatomic particles).

Another part of the picture adds causation to the hierarchy, in two distinct ways. First, it is part of the picture that, since causation is a very general notion, causal statements are true at every level. Thus, causal claims about the behavior of corporations or persons are, from the point of view of the picture, perfectly genuine and potentially true. Second, it is also part of the picture that every time a causal statement is true at some nonbasic level, there is a corresponding causal statement that is true at the basic level.

The intuitive idea behind this second point is that causal statements at higher levels require mechanisms at lower levels, and, ultimately, at the physical level. Thus, consider the statement, ‘she raised her arm because she wanted to signal.’ According to the picture, this statement (if true) is made true by, and requires for its truth, a further statement drawn from a lower level, presumably a statement about neural activity and how that activity causes movement of the limbs. Or again, consider the statement, ‘urbanization in Australia in the twentieth century resulted in a decline of religious practice.’ This statement (if true) is made true by, and requires for its truth, a further statement drawn from a lower level, presumably a statement about the thoughts and goals of particular people that made up the relevant populations. In both cases, causal statements at upper levels require mechanisms at lower levels.

Many aspects of this causal hierarchy picture have been discussed and examined in recent philosophy of (natural and social) science (see Reduction, Varieties of; Social Properties (Facts and Entities): Philosophical Aspects). But for our purposes the important point is that it raises a problem of causation in multiple domains, and in fact a whole class of such problems. Here is a simple way of seeing the issue. Consider some causal statement S which is putatively true at some nonbasic level N. Given the causal hierarchy picture, there will be another causal statement S* which is true at the basic level B. However, if S* is true, and we know it is true given our picture of science, why should we suppose in addition that S is true? Intuitively, after all, all the causal work (all the pushing and shoving) is done at the basic level. But then it is mysterious why S, or any nonbasic causal claim, should be true.

While the issues are sometimes assimilated, the problem generated by the causal hierarchy picture is different from the problem that Gassendi raised for Descartes, in two respects. First, Gassendi’s objection flows from the fact that, for Descartes, the physical and the mental domains are so different in nature: one is in space, the other not. But the problem raised by the causal hierarchy picture does not have its source in the fact that the mental level is so different in nature from the physical level. Its source is rather the simple fact that upper-level causal claims are distinct from (i.e., not identical with) lower-level claims, together with the idea that every time a causal claim is true at some nonbasic level, a corresponding causal claim will be true at the basic level. Second, the problem generated by the causal hierarchy picture is not limited to the mental and physical domains, as was Gassendi’s objection. As we have seen, it arises for any nonbasic causal claim, whether in biology, psychology, or social theory. So long as one supposes that there are causal claims at these levels, a problem of causation in multiple domains can be raised.

We can now see the significance of the problem of causation in multiple domains for contemporary thought. On the one hand, we have a picture of the world, the causal hierarchy picture, which is presented to us by science and therefore has a reasonable claim to be our best and most sophisticated picture of the world. On the other hand, the problem of causation in multiple domains apparently tells us that picture cannot be right, because the picture itself makes it hard to see how any nonbasic causal statements can be true.

3. The Exclusion Argument

We have so far been concerned with presenting the problem generated by the causal hierarchy picture in an informal way. However, it is desirable to be a bit more explicit about the reasoning which leads to the problem. The crucial fact about this reasoning is that it exploits a principle about causation often called the exclusion principle (Kim 1993, Yablo 1992). As we shall shortly see, the precise formulation of this principle is a matter of controversy, but a reasonable initial formulation is (E1):

(E1) If e₁ causes e₂, then there is no event e₃ such that: (a) e₃ is not identical to e₁; and (b) e₃ causes e₂.

Intuitively, the idea here is that the discovery of a cause for some phenomenon usually means that other causes have been ruled out or excluded. For example, if a doctor tells you that the pain in your foot is caused by uric acid crystals in the joint of your big toe, you do
not regard it as an open question whether the pain has some other cause (calcium crystals, for example).

With the exclusion principle in place, we can now be more explicit about the problem generated by the causal hierarchy picture. It is easiest to state the argument if we focus on a particular kind of causal statement, one in which a certain nonbasic event n causes a physical event p. (Later we will consider whether this assumption is misleading.) Thus, suppose that (1) is true:

(1) n causes p.

Given the causal hierarchy picture, it would seem that there must be a physical event p* which also causes p; hence (2) is true:

(2) p* causes p.

Now, given that n and p* are from different domains, i.e., are drawn from different levels, we may further assume that:

(3) n is distinct from (i.e. not identical to) p*.

However, the principle of exclusion, in the form of (E1), says that, in general, no two distinct events can cause a single event. Hence:

(4) If n is distinct from p*, then it is not the case that both n and p* cause p.

But (1)–(4) are jointly contradictory: while any three of them might be true, all four cannot be true together. It follows (barring some subtle ambiguity) that at least one of (1)–(4) must be false. In other words, what the reasoning tells us is that the causal hierarchy picture together with the exclusion principle presents a contradiction. To solve the problem one must give up or modify either the exclusion principle or the causal hierarchy picture or both.

4. Possible Responses

The most common response to the exclusion argument is to modify the principle of exclusion. Before turning to that response, however, it is important to briefly consider some alternative possibilities.

4.1 Rejecting the Hierarchy Part of the Picture

One obvious response is to suggest that the causal hierarchy picture is a distorted picture of the world because it takes the metaphor of levels too literally. This objection might go as follows. While it might be true that there are apparently lots of different sciences, this is only a methodological or pragmatic fact, which represents our limited epistemological access to the world. In fact, there is only a single science, physics, which genuinely explains the world. The other sciences are simply convenient descriptions which, apart from their pragmatic usefulness, will or could be dispensed with in the future.

The main problem with this position is a feature of the causal hierarchy picture that so far has not been discussed: multiple realizability. Earlier we saw that according to the causal hierarchy picture, entities from upper levels are composed of entities from lower levels. But it seems an obvious empirical fact that the same entities from an upper level might be composed of many different entities at lower levels. Thus, the same corporation over time might be composed of different individuals. The same mental processes in different people might be made up of different neural processes; indeed, given science fiction possibilities, they might not even be made of neural processes at all. The lesson usually drawn from multiple realizability (as these sorts of facts are called) is that entities from upper levels are genuinely distinct from lower-level entities and, as a corollary, that the sciences dealing with upper-level entities are genuinely distinct from lower-level sciences (Fodor 1974); see Kim (1993) for an opposite view. However, if this is the right lesson, one should take seriously the hierarchy part of the causal hierarchy picture.

It is important to note that drawing this lesson from multiple realizability does not represent a return to Cartesian dualism. One can see this if one introduces another feature of the causal hierarchy picture: supervenience. Philosophers often say that, in the causal hierarchy picture, the upper levels supervene on the basic level, where this means that, while the upper levels in the hierarchy are distinct from the basic level, they are nevertheless wholly determined by, or are entailed by, the basic level (cf. Kim 1993). A simple way to think of supervenience is by switching metaphors from levels to patterns. To say that the psychological or the social supervene on the physical is to say that these are patterns in the physical, and hence are wholly physical; it is simply that they are not patterns that physicists are interested in. To put the point a little more formally, supervenience tells us that if you imagine a possible world which completely matches the actual world in all physical respects, then you have ipso facto imagined a possible world that also matches the actual world in all psychological and social respects. But this idea is something that Descartes would have denied. According to Descartes, the psychological (and presumably the social also) is only contingently related to the physical, and thus a possible world that completely matches the actual world in physical respects is not necessarily a world that completely matches it in these other respects.


4.2 Rejecting the Causal Part of the Picture

If one cannot reject the hierarchy part of the causal hierarchy picture, can one reject the causal part? One way to develop this objection starts with the observation that it is not at all obvious that particular sciences make explicit use of causal notions. Bertrand Russell (1917) famously claimed that physics had no use for causation. More recently, many have argued convincingly that causation has no place, or at least no central place, in the explanations offered by particular special sciences such as psychology and linguistics (Chomsky 1959, Cummins 1983) and anthropology (Sperber 1996). So perhaps the proper account of the picture of the world that science presents is that it is hierarchical but not causal.
However, while it might very well be true that particular sciences do not provide explicitly causal explanations, this does not mean that various causal claims will not be true at the various levels. We can see this if we examine briefly the dispute between Chomsky and Skinner, surely one of the central moments of twentieth-century science (Chomsky 1959). Skinner had proposed, among other things, that the basic aim of scientific psychology and linguistics should be to provide causal explanations of human behavior—functional analysis, Skinner called it. Part of Chomsky’s response to Skinner was that it is premature to suppose that scientific psychology could provide such explanations. He suggested instead that a more reasonable aim for scientific psychology would be the decomposition of certain psychological capacities and abilities, and in particular the ability on the part of a speaker to speak a language. I think it is fair to say that, in this regard, Chomsky’s suggestion has been enormously influential and that most cognitive psychologists follow him in not aiming at causal explanations of behavior. However, even if this is true, it also remains true both that there are truths about how behavior is caused, and that these truths are in all probability multiply realized. But then we have the problem of causation of multiple domains back again.
One can imagine a more radical development of this point, where the claim at issue is not that causation plays no role in the explanations offered by special sciences, but rather that no nonbasic causal claims are true. In fact, this idea has been historically popular in philosophy of mind and psychology. In part because of the influence of Wittgenstein, it was at one time very common to hear that causal talk in the psychological domain either should not be taken seriously or should be interpreted very differently from similar talk deployed in the physical domain (Ryle 1949; for criticism see Davidson 1963). But suggestions of this sort consistently underestimate the sense in which causation is an extremely abstract and general notion. W.V. Quine famously argued that the difference between the statements ‘cows exist’ and ‘numbers exist’ is not a difference in existence, but rather a difference between cows and numbers. Similarly, one might say that the difference between causal claims in the physical domain and e.g. the psychological domain is not a difference in causation; it is simply a difference between physics and psychology.

4.3 Rejecting the Metaphysical Presuppositions of the Picture

Finally, a number of philosophers have attempted to avoid the exclusion argument by suggesting that it only arises from mistaken metaphysical views lying behind the causal hierarchy picture. For example, in stating the exclusion argument above, we assumed that an upper-level event might cause a lower-level event—(1) summarizes just that possibility. According to some philosophers, however, this sort of ‘downward causation’ is deeply mysterious, and causation takes place only within levels, not across levels (cf. Kim 1993). However, if one rejects downward causation, one might reject the exclusion argument right from the start.
However, the problem with this suggestion is twofold. First, it is not clear that downward causation is so mysterious. As we have noted a number of times, causation is an extremely general notion. There seems no reason in principle why we should not suppose that upper-level events might cause lower-level ones. Second, it seems possible to formulate the exclusion argument even if one does not start by assuming that downward causation is possible. For consider: given the causal hierarchy picture, every nonbasic event will supervene on some basic event. But then one could causally explain the presence of any nonbasic event by causally explaining the basic event on which it supervenes. This brings the problem back again: given the causal hierarchy picture, there seems no rationale for supposing that any nonbasic causal statement is true.

5. Problems with Exclusion

If we cannot answer the exclusion argument by rejecting the causal hierarchy picture or some metaphysical presupposition of the argument, it would seem that the only option left is to give up or modify the exclusion principle. Indeed, objections to this principle are easy enough to find. Consider the spy shot by the firing squad: the first soldier on the firing squad caused his death, but so did the second. Hence the spy’s death is overdetermined. But a little reflection shows that the exclusion principle as stated so far rules out cases of overdetermination a priori.
This objection to the exclusion principle is suggestive but limited. It is true that there are possible cases of overdetermination. But it still seems unlikely that


every case of nonbasic causation should be analogous to the case of the firing squad. To accommodate the possibility of overdetermination we can reformulate the exclusion principle to read:

(E2) If $e_1$ causes $e_2$, then (probably, in general) there is no event $e_3$ such that: (a) $e_3$ is not identical to $e_1$; and (b) $e_3$ causes $e_2$.

(E2) allows, where (E1) does not, the possibility of firing squad cases, but it also says that we can expect such cases to be the exception rather than the rule. On the other hand, it seems clear that we could formulate a probabilistic version of the exclusion argument based on (E2) which would be almost as bad as the nonprobabilistic version based on (E1).
A second objection to the exclusion principle cuts deeper, however. Consider the possibility of composition (or multiple realizability) all the way down: whenever we arrive at a potential basic level, it turns out that this level is itself composed at a lower level. Combining this idea with the exclusion principle yields some absurd results. For if there is composition all the way down, it is easy to see that the exclusion argument will tell us that any candidate causal statement is false. But this suggests that there really is something wrong with the exclusion principle. Surely a principle of causation should not have the consequence that no candidate causal claim is true!
In response to this objection, a proponent of the exclusion principle might take one of two options. First, one might deny the possibility that there is composition all the way down. However, this is a fairly desperate maneuver. It is surely an empirical question, not a question which can be decided a priori, whether the levels of the world simply go on forever. Second, and much more plausibly, one might revise the exclusion principle to accommodate the objection. The obvious way to do this is to replace ‘identical to’ in (E2) with ‘supervenient on’, invoking the relation we discussed in the context of multiple realizability. That would result in the following principle:

(E3) If $e_1$ causes $e_2$, then (probably, in general) there is no event $e_3$ such that: (a) $e_3$ is not supervenient on $e_1$; and (b) $e_3$ causes $e_2$.

This version of the exclusion principle does not have the result that, if the world is composed all the way down, none of our paradigm causal claims are true. For it is consistent with (E3) that causal claims can be true at levels which supervene on the basic level. Thus, even if we assume that there is composition at every level, it is still consistent with (E3) that some causal claims at nonbasic levels are true.
On the other hand, if it is (E3), and not (E1) or (E2), which provides the proper articulation of the exclusion principle, then we have a response to the exclusion argument. The exclusion argument mistakes (E1) for (E3). Combining the causal hierarchy picture and (E1) yields a contradiction, but there is no problem with combining the causal hierarchy picture and (E3), for (E3) does not exclude the possibility of nonbasic causal claims.

6. Further Questions

Problems of causation of multiple domains threaten to undermine our picture of the world. But, as we have seen, these problems arise only from a mistaken conception of the exclusion principle, i.e. only if we formulate that principle as (E1) or (E2) and not (E3). However, it would be a mistake to conclude that this is the end of the matter. For (E3) generates some further puzzling questions of its own. I will close by briefly mentioning two of these.
The first question arises when we consider the difference between (5) and (6):

(5) For some events $e_1$, $e_2$, and $e_3$, if $e_1$ causes $e_2$, and if $e_3$ supervenes on $e_1$, then $e_3$ causes $e_2$.
(6) For all events $e_1$, $e_2$, and $e_3$, if $e_1$ causes $e_2$, and if $e_3$ supervenes on $e_1$, then $e_3$ causes $e_2$.

(E3) allows the possibility that (5) is true. But it had better not allow the possibility that (6) is. For (6) is subject to counterexamples such as the following (Jackson and Pettit 1990). Suppose living in a particular neighborhood on the north side of town ($e_1$) makes you happy ($e_2$); living in a particular neighborhood on the north side of town entails living on the north side of town—as we might put it, living on the north side of town ($e_3$) supervenes on living in a particular neighborhood on the north side ($e_1$). Nevertheless, it might not be true that living on the north side ($e_3$) makes you happy ($e_2$)—after all, with the exception of your own particular neighborhood, you may dislike the north side, thinking of yourself as a ‘south side person.’ But now we are faced with a problem: if (6) is not true and (5) is, how are we to draw the line?
In order to answer this problem, Jackson and Pettit appeal to a principle of causation they call invariance of effect under variance of realization. They go on to develop this principle, embedding it in a framework for thinking about the relation between causation, causal explanation, and the causal hierarchy picture (especially as that applies to the social and psychological realms) which they call program explanation (cf. Jackson and Pettit 1990, 1992, Pettit 1993). Unfortunately, reviewing these ideas is beyond the scope of the present discussion. But the important point is that the idea that (E3) articulates the exclusion


principle incurs an obligation: to explain what distinguishes the event triples $\{e_1, e_2, e_3\}$ which satisfy (5) from the event triples which don’t.
The second question returns us to Gassendi’s objection to Descartes. In response to Gassendi’s objection, many Cartesian dualists are tempted to endorse epiphenomenalism. Epiphenomenalists accept that there is a correlation between mental events and behavioral events, but insist that the correlation in question is not causal. Rather, the correlation is the product of two other relations: a nomological (or lawful) relation between brain events and psychological events, and a causal relation between brain events and behavioral events. Since these latter relations can obtain without a causal relation obtaining between mental events and behavioral events, epiphenomenalism is a possible position, no matter how counterintuitive it seems to suppose that mental events don’t cause behavioral events.
The problem that epiphenomenalism presents for (E3) and our discussion of it is the following. If we restrict attention to the relation between the psychological level and the neurological level, we too have articulated a correlation between mental and behavioral events, and for us too the correlation is a product of two further relations: first, there is a supervenience relation between the mental event and the brain event; second, there is a causal relation between the brain event and the behavioral event. However, why should we suppose that this correlation is causal? As the case of epiphenomenalism makes clear, the mere fact that a correlation is a product of a causal relation and some other relation does not make the correlation causal. Perhaps then, the answer we have given to the exclusion argument is no better than the epiphenomenalist answer to Gassendi?
Part of the answer to this question is that there are correlations and there are correlations. True, the fact that a correlation is the product of a causal relation and some other relation does not make it causal. But it had better not make it noncausal either—after all, every causal relation is a product of a causal relation and identity. So the structural similarity between the account we have offered and epiphenomenalism does not entail that that account is no better than epiphenomenalism.
But there is also a somewhat deeper issue here. The world presents us with a myriad of correlations. Our concept of causation is in part a tool to divide these correlations into the causal and the noncausal. Obviously, there are some correlations which are clearly causal and some which are not. But it needs to be admitted that there are hard cases, cases in which it is difficult to say whether we have a causal correlation here or not. The causal hierarchy picture that we have been discussing presents us with such hard cases. Adopting a concept of causation that employs (E1), these cases will turn out not to be causal; adopting a concept of causation that employs (E3), they will. As we have seen, the latter concept does seem a reasonable concept of causation. But defending that choice in detail requires more investigation of the concept of causation than we can enter into here.

See also: Causal Inference and Statistical Fallacies; Causation (Theories and Models): Conceptions in the Social Sciences; Causes and Laws: Philosophical Aspects; Cognitive Science: Overview; Individualism versus Collectivism: Philosophical Aspects; Intentionality and Rationality: A Continental-European Perspective; Intentionality and Rationality: An Analytic Perspective

Bibliography

Chomsky N 1959 Review of Verbal Behavior by B F Skinner. Language 35(1): 26–58
Cummins R 1983 The Nature of Psychological Explanation. MIT Press, Cambridge, MA
Davidson D 1963/1981 Actions, reasons and causes. In: Davidson D (ed.) Essays on Actions and Events. Oxford University Press, Oxford, UK
Descartes R 1984 The Philosophical Writings of Descartes (trans. Cottingham J et al.), Vol. 2. Cambridge University Press, Cambridge, UK
Fodor J A 1974/1981 Special sciences: or, the disunity of science as a working hypothesis. In: Fodor J A (ed.) Representations, 1st edn. MIT Press, Cambridge, MA
Jackson F, Pettit P 1990 Causation and the philosophy of mind. Philosophy and Phenomenological Research 50: 195–214
Jackson F, Pettit P 1992 In defense of explanatory ecumenism. Economics and Philosophy 8: 1–21
Kim J 1993 Mind and Supervenience. Cambridge University Press, Cambridge, UK
Oppenheim P, Putnam H 1958 The unity of science as a working hypothesis. Minnesota Studies in Philosophy of Science 2: 3–36
Pettit P 1993 The Common Mind. Oxford University Press, New York
Russell B 1917/1963 On the notion of cause. In: Russell B (ed.) Mysticism and Logic. Penguin, New York
Ryle G 1949 The Concept of Mind. Hutchinson’s University Library, London
Sperber D 1996 Explaining Culture. Blackwell, Oxford, UK
Yablo S 1992 Mental causation. Philosophical Review 101: 245–80

D. Stoljar

Causes and Laws: Philosophical Aspects

The questions ‘what makes it the case that one event causes another?’ and ‘what makes it the case that something is a law of nature?’ are highly controversial ones for which, amongst contemporary philosophers,


there is no answer that can claim orthodoxy. This article sketches some of the most influential theories of causation and lawhood.

1. Preliminaries

‘Cause’ and ‘law’ are perhaps two of the most important and fundamental concepts that human beings deploy in their attempts to understand and intervene in their environment, both in everyday life and in their scientific endeavors. When we want to explain why an event occurred, we seek out its causes. When we act, we do so because we believe the action will have certain effects. When we want to know why our environment and our fellow human beings behave in regular, predictable ways, we look for the laws that govern that behavior.
It is part of the scientist’s job to discover what causes what, and to investigate what the laws of nature are. The philosopher’s job, however, is a more abstract one: the philosopher wants to know what makes it the case that something—anything, be it the decay of a subatomic particle or the assassination of Archduke Ferdinand—causes something else, or that something is a law of nature—whether it is a physical, social, psychological, or any other kind of law. We know that when one strikes a match and it lights, the first event—the striking—caused the second. But this causal fact depends for its truth on more than the mere fact that the two events occurred one after the other: there must be some extra feature of the world which binds these two events together as cause to effect. Similarly, it seems that laws of nature must be more than mere regularities: it is doubtless true that all lumps of gold are smaller than a mile in diameter, but it is not a law that this is so. So genuine laws must have some extra feature, apart from regularity, which marks them off from mere ‘accidental’ regularities. The philosopher’s job, or at least part of it, is to say what, if anything, this ‘extra feature’ might be.

1.1 The Basic Locutions

There are three kinds of locution which need to be distinguished from one another. First, we have statements of law, which can be said to have the form ‘it is a law that all Fs are Gs’. F and G are properties or kinds of event. For instance, if it is a law that all heated metals expand, the relevant properties are being a heated metal (a property that all and only metals that are being heated instantiate) and expanding (a property that all and only expanding things instantiate). Particular heated (and therefore expanding) bits of metal are in turn said to instantiate or are instances of the law that all heated metals expand. Laws, then, are general facts about how objects or events of a particular kind will, as a matter of law, behave in a given kind of situation. What about causal facts? Two different kinds of causal claim, one general and one particular, need to be distinguished. On the one hand, there are general (sometimes called ‘generic,’ ‘population-level,’ or ‘type-level’) causal claims, which have the form ‘F causes G’ or ‘F is a cause of G’ (‘smoking causes cancer’; ‘poverty is a cause of crime’). Again, F and G are properties or kinds of event or object. On the other hand there are particular causal claims of the form ‘c caused e’ (or ‘c is a cause of e’): ‘the striking of the match caused the fire,’ ‘the assassination of Archduke Ferdinand was a cause of the First World War,’ and so on. c and e here are particular events (or facts): a particular assassination and war, for example, rather than assassinations and wars in general.

1.2 Determinism vs. Indeterminism

A question that is relevant to any discussion of the nature of causes and laws is that of whether the universe is fundamentally deterministic or indeterministic. One way of putting the thesis of determinism is this: the complete state of the universe at any given time together with the laws of nature determines precisely what the complete state of the universe will be at all future times. Indeterminism is simply the denial of determinism.
The view that the universe is fundamentally indeterministic has been widely (though by no means universally) held within science since the advent of quantum mechanics; however, philosophers have only relatively recently begun to take the possibility of indeterminism seriously. One consequence of this reluctance to take indeterminism on board is that most current theories of causes and laws (with the exception of theories of general causation) originally were formulated under the assumption of determinism, and were then modified to take into account the possibility that at least some laws and causal processes are fundamentally indeterministic. The theories of laws and particular causation discussed below are presented in their simpler, deterministic form. While the indeterministic versions of the theories present additional complications and problems—for example, because they employ the notion of probability—the fundamental philosophical issues addressed by the indeterministic and deterministic versions are mostly the same (see Probability and Chance: Philosophical Aspects).

2. Laws of Nature

What is the difference between laws of nature and merely ‘accidental’ regularities? (Suppose that it is a law that all metals expand when heated and that it is merely accidentally true that all (past, present, and


future) Queens of England are less than two meters tall.) A crucial difference is that laws ‘support counterfactuals,’ whereas accidental regularities do not. It is true of any unheated piece of metal that if it were to be heated it would expand. But it need not be true of any non-Queen of England that if she were to be Queen she would be less than 2 meters tall. If someone is currently, say, 2.05 meters tall, becoming Queen of England would not result in her losing more than 5 cm in height.
Another difference is that violations of law are said to be physically impossible, whereas violations of accidental regularities are physically possible. It is physically impossible for a heated metal to fail to expand when heated; but it is perfectly possible for a Queen of England to be taller than 2 meters.
A third difference is that we tend to suppose that laws (unlike accidental regularities) ‘govern’ what goes on in the universe. We tend to suppose that laws of nature are rather like pieces of divine legislation: decrees that the universe must obey certain rules rather than mere general descriptions of what in fact happens.
Can an analysis of lawhood be given that does justice to these intuitions about the nature of laws? A popular view (see Armstrong 1983) is that laws are relations of necessitation between universals. To say that it is a law that all Fs are Gs is to say that there is a necessary relation, N, that holds between the universals (properties) F and G. (Following Armstrong, we can write this ‘N(F, G).’) When an object instantiates F, the instantiation of F guarantees, via N, that G will also be instantiated. Thus, N(F, G)—its being a law that all Fs are Gs—guarantees that all Fs will in fact be Gs.
The basic point of this ‘realist’ theory of laws is that it provides an ‘ontological ground’ for the difference between laws and accidents: accidental regularities just happen, whereas lawful regularities are grounded in and explained by a necessary relation holding between the co-occurring properties. And, on this view, sense can be made of the idea that laws govern what happens: if N(F, G) holds, then G must be instantiated whenever F is instantiated.
The main rival to this conception of laws is the ‘Humean’ or ‘regularity’ conception, according to which laws are really no more than regularities. The regularity view is motivated primarily by an ‘empiricist’ epistemology according to which we ought not to believe in entities that we cannot perceive. Since our evidence that there are any laws of nature comes solely from the observation of regularities—when we see a law (as opposed to an accidental regularity) being instantiated we do not see any extra ontological feature—we ought not to believe that there is any extra ontological feature which laws have but accidental regularities lack.
A ‘naïve’ regularity theory, however—according to which statements of law are merely true universal generalizations—is untenable, for it simply collapses the distinction between laws and accidental regularities. If statements of law are merely true universal generalizations, then accidental regularities are themselves laws: it turns out to be a law of nature that no Queen of England is taller than 2 meters after all.
A more sophisticated version of the regularity theory is provided by the ‘Ramsey–Lewis view’ (after F. P. Ramsey and David Lewis; see Lewis 1973b, pp. 73–5. J. S. Mill also held a similar view). The basic idea of the Ramsey–Lewis view is as follows. When we seek, as scientists do, to find out how the universe behaves, we are not satisfied with isolated facts about what happened in the laboratory on Tuesday afternoon, what happened when a particular patient took a particular drug, and so on. Rather, what we seek are wide-ranging generalizations that abstract as far as possible from the particularities of the laboratory or the patient. Generalizations about the heights of Queens of England or the diameters of lumps of gold apply to an extremely restricted number and range of phenomena. On the other hand, generalizations about subatomic particles, chemical reactions, the relationship between force, mass and acceleration, and so on, apply to an enormous number of diverse phenomena, and are hence much more useful and explanatorily powerful. According to the Ramsey–Lewis view, the difference between laws of nature and merely ‘accidental’ regularities amounts to roughly this difference between generalizations that are wide-ranging and powerful and those that are not. It is not that there is any ontological distinction between laws and accidents—laws do not have an extra feature (N, say) that accidental regularities lack; rather, it just so happens that some generalizations (the laws) are interesting and useful ways of codifying what happens in the universe, and others (the accidents) are not.
The Ramsey–Lewis view (and the Humean view of lawhood in general) diverges from the everyday, intuitive conception of laws of nature in that it does not accord laws the governing role that we ordinarily suppose them to have. At bottom, nothing makes metals expand when heated any more than anything makes any Queen of England be shorter than 2 meters. Some philosophers claim that this aspect of Humeanism takes the sting out of the alleged conflict between determinism and free will (see Swartz 1985, chaps. 10 and 11, Berofsky 1987) (see also Free Will and Action).

3. General Causation

3.1 Probabilistic Theories

Frequently it is claimed on cigarette packets that smoking causes heart disease; but what makes (or would make) such a claim true? The most popular


kind of theory of general causation takes such claims under investigation will be a member of one and only
to be made true by probabilities: what makes it true one background context. Now we can assess the causal
that smoking causes heart disease is that smoking status of C with respect to E in each background
raises the probability of heart disease. context. According to Eells’s analysis, C causes (or is
What does it mean to say that smoking (C) raises the a positive causal factor for) E in a particular popu-
probability of heart disease (E)? One can think of the lation if and only if C raises the probability of E in
matter in roughly the following way. Fix on a every background context of that population, that is,
particular population: Australian citizens, say, or the if and only if $\Pr(E \mid C \,\&\, B_i) > \Pr(E \mid \neg C \,\&\, B_i)$ for
residents of Greater London. Now divide that population each background context $B_i$. Similarly C is a negative
into the smokers (those members of the population causal factor for E if C lowers the probability of E in
who instantiate C) and the nonsmokers (those every background context, and C is causally neutral
who don’t instantiate C). Find out what the relative for E if C makes no difference to the probability of E
frequency of heart disease is in each group, that is, the in each background context. According to Skyrms’s
proportion of smokers who get heart disease (call this slightly weaker (1980) analysis, C causes E if and only
proportion x) and the proportion of nonsmokers who if C raises the probability of E in at least one
get heart disease (y). If $x > y$, then smoking raises the background context and C does not lower the probability
probability of heart disease in the population under of E in any background context.
investigation: $\Pr(E \mid C) > \Pr(E \mid \neg C)$ (the probability Dupré (1984, 1990) has argued that analyses like
of E given C is greater than the probability of E given Eells’s and Skyrms’s are too strong: they make it too
the lack of C). hard for general causal claims to be true. Suppose that
However, while the fact that C raises the probability some tiny minority of the US population has some
of E gives us prima facie evidence for thinking that C peculiar physiological condition P that makes smoking
causes E, it does not entail that C causes E. Falling (C) a prophylactic against heart disease (E)—even
barometer readings (C) raise the probability of rain though for everyone else in the population, smoking
(E), that is, $\Pr(E \mid C) > \Pr(E \mid \neg C)$, even though C increases the risk of heart disease. It follows from
does not cause E. Rather, C and E are both effects of Eells’s analysis—and also from Skyrms’s—that smok-
a common cause: low atmospheric pressure (F). The ing does not cause heart disease in the US population,
correlation between C and E is spurious. We, therefore, since $\Pr(E \mid C \,\&\, P) < \Pr(E \mid \neg C \,\&\, P)$. This is a result
need to refine the basic intuition that causes raise that Dupré regards as highly counterintuitive. Dupré’s
the probabilities of their effects, in order to rule out rival analysis takes as its starting point a method that
cases of spurious correlation. is actually used to test causal hypotheses in the social
One way of doing this is to ‘hold fixed’ other and medical sciences: that of the controlled expe-
relevant factors (in this case, F) when assessing C’s riment. If one wants to know whether C causes E in a
probabilistic impact on E: rather than looking just at given population, one way of trying to find out is to
the correlation between C and E, we consider the take a random sample of the population, induce C in
correlation between C and E in the presence of F and, a random subset of that sample, and compare the
separately, in the absence of F. What we find in the results. The point of having a random sample is to try
barometer case is that P(E\C & F ) l (P(E\" C & F ) to insure that other factors that are relevant to E occur
and P(E\C & " F ) l P(E\" C & " F ): the probab- with the same relative frequency as they do in the
ilistic correlation between C and E disappears when population as a whole, and thus that the probabilistic
we hold fixed the relevant factor, and this reflects the correlation between C and E is not spurious. Dupre!
fact that C does not cause E: falling barometer calls a sample that achieves this match of frequencies
readings do not cause rain. a fair sample. His claim is that C causes E if and only
According to one kind of analysis (see, for instance, if C raises the probability of E in a fair sample of the
Suppes 1970, Skyrms 1980, Eells 1991), general causal population. This analysis yields the desired result that
facts are to be analyzed in terms of this kind of smoking causes heart disease in the above example,
conditional probability. Of course, there is usually since in any fair sample those with the physiological
more than one factor or property (apart from the condition P will be vastly outnumbered by those who
factor C whose causal influence we are trying to lack P; so C will still raise the probability of E in a fair
establish) that is relevant to an effect E, so we need to sample.
keep all such factors fixed if the probabilistic cor- An alleged benefit of Dupre! ’s analysis is that it
relations are to reflect causal correlations. Call a makes the metaphysics of general causation—what
subset of the population whose members are all the makes general causal claims true—match up in an
same with respect to which relevant factors (apart obvious way with actual scientific methodology, and
from C) they possess a ‘background context’. For therefore explains why the methodology of the con-
example, if there are two factors apart from C that are trolled experiment can, and often does, provide a way
relevant to E (call them X & Y), there will be four of uncovering causal facts.
background contexts: X & Y, X & " Y, " X & A problem with probabilistic theories of general
Y, and " X & " Y. Each member of the population causation, however, is that they appear to fail given

Causes and Laws: Philosophical Aspects

the assumption that at least some features of the world are determined by prior circumstances (see Carroll 1992). For example, suppose that C causes E, but that C itself is determined by the combination of factors XYZ. Then it is not true that Pr(E/C & XYZ) > Pr(E/¬C & XYZ), since Pr(E/¬C & XYZ) is undefined (nobody who has XYZ lacks C, so there is no group in which to measure the frequency of E); hence, according to Eells’s and Skyrms’s analyses—and contrary to hypothesis—it is not true that C causes E.

Dupré’s analysis also falls prey to this objection. Suppose that 10 percent of the population have XYZ. Then in a fair sample of Cs, 10 percent must have XYZ; similarly for a fair sample of ¬Cs. But there can be no such fair sample of ¬Cs, since by hypothesis there will be nobody at all who has XYZ but lacks C. So according to Dupré’s analysis it is not true that C causes E.

3.2 A Deflationary View

Given the apparent failure of standard probabilistic theories of general causation, perhaps a radical change of approach is needed. The theories discussed so far seek to analyze general causation solely in terms of conditional probabilities: no mention is made of particular causation. But this may seem rather odd, for it is natural to think that general and particular causal facts are related in some way: it is natural to think that the fact that smoking causes heart disease has something to do with the fact that lots of individual smokers are caused by their smoking to get heart disease. This line of thought leads naturally to the view that general causal claims are not claims about a distinctive kind of causation—general causation—at all, but are rather merely generalizations about particular causal claims.

The generalizations in question cannot be universal generalizations: when we say that smoking causes heart disease we do not mean to imply that all smokers get heart disease as a result of smoking. But perhaps they are rather weaker generalizations, akin to ‘dogs bark’ or ‘American men like baseball’ (see Carroll 1991). According to this proposal, general causal claims do not have precisely defined truth conditions: the question of precisely what proportion of smokers need to get heart disease because they smoke in order to make it true that smoking causes heart disease no more admits of an answer than does the question of precisely what proportion of dogs need to bark in order to make it true that dogs bark.

A central issue, then, is whether there is a special, general, kind of causation that serves to make general causal claims true—a kind of causation that is distinct and relatively autonomous from particular causation—or whether general causal claims are made true solely by particular causal claims.

4. Particular Causation

4.1 Hume’s View

The agenda for much of the contemporary discussion about the nature of particular causation was set by Hume (1777, Sects. IV–VII). Hume was an empiricist, believing that there could be no ‘ideas’ in the mind that do not somehow come from our senses. How, then, do we come to have an idea of causation: what features of the world could furnish us with that idea? Famously, Hume maintained that we do not perceive any intrinsic connection or relation between two events we judge to be causally related: we see the match being struck, and we see it light, but we do not see any causation between those two events. Causation, then, cannot be a mysterious intrinsic relation between events—for if it were, since we cannot perceive any such relation, we could have no idea of causation. (However, see Strawson 1989 for the claim that this standard interpretation of Hume is mistaken.) Hume’s positive claim was that two events are related causally in virtue of contiguity (causes and their immediate effects are right next to each other in space and time), temporal priority (causes precede their effects), and constant conjunction (events similar to the cause are always followed by events similar to the effect). So, according to Hume, what makes it true that the striking caused the lighting is that the two events are contiguous (or, if there is some spatio-temporal gap between them, the events are mediated by further events so that there is a chain of contiguous events starting with the striking and ending with the lighting); the striking occurs before the lighting; and similar strikings are always followed by similar lightings. The constant conjunction requirement ensures that for Hume (and for most modern-day Humeans) causation is an extrinsic relation: a causal relation obtains between two events not purely in virtue of how those events themselves are, but in virtue of features of other events: the striking only gets to count as a cause of the lighting because other strikings are also followed by lightings.

The other notable feature of Hume’s view is that it is a reductionist view: causal claims, like ‘the striking of the match caused the lighting,’ are made true, at bottom, not by any primitive causal feature of the world, but by noncausal features (since the fact that two events are contiguous, the fact that one occurs before the other, and the fact that they are similar to other events are not themselves causal facts).

Nowadays Hume’s own analysis is regarded as untenable, for a variety of reasons. For example, the contiguity requirement renders action at a distance (one event’s causing another, spatially or temporally distant, event without there being a chain of events ‘hooking up’ the first to the last) impossible, while many philosophers think that action at a distance is at least conceptually possible. The constant conjunction requirement is obviously flawed: there can be constant conjunction without causation—for example, joint effects of a common cause are constantly conjoined but neither causes the other—and, if determinism is false (which most philosophers believe to be evident from quantum mechanics), it seems there can be causation without constant conjunction. Bombarding an atom might cause it to decay two seconds later even though exactly similar bombardments will not always be followed two seconds later by decay.

Nevertheless, the general issues that Hume raised are still among the main foci of dispute in contemporary philosophy of causation. In particular, the question of whether causal facts reduce to noncausal facts is the driving force behind many of the theories of causation currently on offer.

4.2 Counterfactual Analyses

Arguably the most influential analysis of causation in recent years has been Lewis’s counterfactual analysis (Lewis 1973a). According to Lewis’s analysis, an event e causally depends on event c if e counterfactually depends on c—which is to say, if, had c not occurred, e would not have occurred either (see Counterfactual Reasoning, Quantitative: Philosophical Aspects; Counterfactual Reasoning, Qualitative: Philosophical Aspects). A ‘chain of causal dependence’ is a series of events {a, b, c, …, n} such that b causally depends on a, c causally depends on b, and so on. Finally, c causes e if there is a chain of causal dependence (perhaps involving only c and e themselves, but perhaps involving many hundreds of intermediate events) from c to e.

The need for chains of dependence arises when there is no counterfactual dependence between cause and effect because of a backup mechanism or ‘preempted alternative.’ Suppose that the bus stop is next to the taxi rank. Susan takes the bus to work (c); but if the bus had been late she would have taken a taxi. She attends a 9.30 meeting (e). e does not counterfactually depend on c, since if Susan had missed the bus she would have taken a taxi and attended the meeting just the same. But there is some intermediate event d—Susan’s getting off the bus at the stop outside her office, say—which counterfactually depends on c (if she had not got on the bus in the first place, she would not have got off it either) and upon which e counterfactually depends (if she had not got off at that stop—given that she was already on the bus, and it was already nearly 9.30—she would have had to walk back to her office from the next stop, and she would have missed the meeting).

Lewis’s theory has been found to be subject to a number of counter-examples; and much recent work on causation has been devoted to devising alternative analyses which, while remaining true to Lewis’s basic claim that causation is to be analyzed in terms of counterfactuals, add further complexities in an attempt to circumvent the problem cases (see, e.g., Menzies 1989). Lewis’s theory and its successors fall squarely within the Humean tradition, since the counterfactual analysis is a reductionist project: it seeks to show how causal facts are made true by noncausal features of the world.

Not everyone, however, is a Humean. Many philosophers deny Hume’s central motivating claim that causation cannot be perceived (see Anscombe 1971, Armstrong 1997, Chap. 14, Menzies 1998); and many philosophers believe for other reasons that the reductionist project is doomed. Some argue that it is subject to fatal counter-examples (see, e.g., Carroll 1994). Some argue that the features of the world to which reductionist analyses seek to reduce causal facts are themselves really causal features; hence reductionist analyses are circular and therefore unsuccessful (see Carroll 1994). Some hold that our concept of causation is so heterogeneous that no unified analysis is possible; others that if a broadly Humean analysis of causation were true it would render the pervasive regularities which the universe exhibits utterly inexplicable: it would make the apparent orderliness of nature a sort of cosmic fluke (see Strawson 1989, Chap. 5).

5. Causes and Laws

The above discussion of particular causation makes no explicit mention of any relationship between causation and laws; yet intuitively there is some relationship between, for instance, the law that all metals expand when heated and the fact that a particular piece of metal’s expansion is caused by its being heated. In fact, however, laws of nature typically do have a role to play in theories of causation. Hume’s constant conjunction requirement, for example, can be seen as a requirement that causes and effects are lawfully correlated (or ‘fall under’ a covering law). Laws of nature enter into Lewis’s counterfactual analysis of causation indirectly, via his analysis of counterfactuals (see Lewis 1973b): for Lewis, the truth of ‘if A had been the case, B would have been the case’ requires that B is true at the ‘closest possible world(s)’ in which A is true; and closeness of possible worlds depends in part on the extent to which they share the same laws of nature.

In any case, the relationship between causes and laws is certainly more complex than the simple heated metal example may suggest. For one thing, lawful correlation is not sufficient for causation: for example, arsenic poisoning may be correlated lawfully with death, but it does not follow that anyone who is poisoned with arsenic and then dies is caused to die by the poisoning (since they may get hit by a bus just after
they take the arsenic). Moreover, it has been argued that lawful correlation is not necessary for causation either; Anscombe (1971), for example, holds that c may cause e even though c and e do not fall under any regularity. However, a unified account of causes and laws has recently been proposed by Heathcote and Armstrong (1991): according to their view, the causal relation is (in fact, but not as a matter of necessity) identical with N, the relation in virtue of which laws of nature obtain.

See also: Causal Inference and Statistical Fallacies; Causation (Theories and Models): Conceptions in the Social Sciences; Causation: Physical, Mental, and Social; Counterfactual Reasoning, Qualitative: Philosophical Aspects; Counterfactual Reasoning, Quantitative: Philosophical Aspects; Determinism: Social and Economic; Empiricism, History of; Logical Positivism and Logical Empiricism; Natural Law; Probability and Chance: Philosophical Aspects

Bibliography

Anscombe G E M 1971 Causality and Determination. Cambridge University Press, London
Armstrong D M 1983 What Is A Law of Nature? Cambridge University Press, Cambridge, UK
Armstrong D M 1997 A World of States of Affairs. Cambridge University Press, Cambridge, UK
Berofsky B 1987 Freedom from Necessity. Routledge & Kegan Paul, London
Carroll J 1991 Property-level causation? Philosophical Studies 63: 245–70
Carroll J 1992 The unanimity theory and probabilistic sufficiency. Philosophy of Science 59: 471–79
Carroll J W 1994 Laws of Nature. Cambridge University Press, Cambridge, UK
Dupré J 1984 Probabilistic causality emancipated. Midwest Studies in Philosophy 9: 169–75
Dupré J 1990 Probabilistic causality: A rejoinder to Ellery Eells. Philosophy of Science 57: 690–8
Eells E 1991 Probabilistic Causality. Cambridge University Press, Cambridge, UK
Heathcote A, Armstrong D M 1991 Causes and laws. Noûs 25: 63–73
Hume D 1777/1975 Enquiries Concerning Human Understanding and Concerning the Principles of Morals, 3rd edn. Clarendon Press, Oxford, UK
Lewis D K 1973a Causation. Journal of Philosophy 70: 556–67
Lewis D K 1973b Counterfactuals. Blackwell, Oxford, UK
Mellor D H 1995 The Facts of Causation. Routledge, London
Menzies P 1989 Probabilistic causation and causal processes: A critique of Lewis. Philosophy of Science 56: 642–63
Menzies P 1998 How justified are Humean doubts about intrinsic causal links? Communication and Cognition 31: 339–64
Skyrms B 1980 Causal Necessity. Yale University Press, New Haven, CT
Strawson G 1989 The Secret Connexion. Clarendon Press, Oxford, UK
Suppes P 1970 A Probabilistic Theory of Causality. North-Holland, Amsterdam
Swartz N 1985 The Concept of Physical Law. Cambridge University Press, New York

H. Beebee

Celebrity

1. Celebrity and Critiques of Mass Culture

The term ‘celebrity’ in its modern meanings began to be used in the nineteenth century, but the study of the phenomenon began in earnest with the rise of mass-produced culture, and in particular with the elaboration of an industrialized Hollywood film ‘star system’ in the early decades of the twentieth century. (For an excellent history of fame and fame discourse from ancient times to the near-present, see Braudy 1986.) It first emerged as a sustained focus of inquiry through mid-twentieth century criticism of mass culture, from both the left and the right. Although these works were rarely based on empirical research, and were filled with unsupported assertions about those creating and receiving celebrity images, they called attention to the mass production and management of celebrities, and to the question of the social impact of industrialized celebrity culture.

1.1 The Celebrity and Capitalist Ideology

Early perspectives on celebrity were largely theoretical, emerging as part of Marxist-influenced Frankfurt School cultural criticism of ‘mass culture’, ‘mass society’, and the ‘culture industry’. Celebrities were seen as mass-produced, standardized commodities posing as unique human individuals, and celebrity discourse as a major ideological support beam for consumer capitalism. In the 1940s, Theodor Adorno and Max Horkheimer (1977), for instance, saw celebrities as the products of ‘the culture industry’, the cultural apparatus of mass society; Hollywood stars serve to distract from the dissatisfactions created by industrial capitalism, and to manipulate ‘the masses’ into accepting capitalism’s false promises of both choice (standardized, mass-produced celebrities appear to be different individuals) and universal success (celebrities appear to demonstrate the rewards available to all). Leo Lowenthal (1968), also writing in the 1940s, researched changes in ‘mass idols’ in popular magazines, charting the move from ‘idols of production’ (business and politics) to ‘idols of consumption’ (entertainment and sports); he too suggested that these popular culture heroes perpetuated the myth of an open social system, such that the existing social system is celebrated along with the star. C. Wright Mills
(1956, p. 71) wrote in the 1950s of the professional celebrity as a summary of American capitalist society’s promotion of competition and winning; as ‘the crowning result of the star system in a society that makes a fetish of competition’, the celebrity shows that rewards go to those who win, regardless of the content of the competition. The definition of celebrities as mass-produced distractions, and their ideological role in promoting consumption, competition, individualism, and the myth of open opportunity, have continued in much contemporary cultural criticism.

1.2 Celebrity Versus Heroism

For more conservative cultural critics from the 1950s onward, the role of the celebrity system as an ideological support for capitalism was less important than its reflection of a large-scale disconnection between notoriety and merit. Such a view crystallized in the 1960s with the publication of Daniel Boorstin’s The Image (1961), which distinguished celebrity from heroism. In an argument that presaged more recent postmodernist theory on ‘simulation and simulacra’ and the implosion of artifice and reality (see Baudrillard 1988), Boorstin argued that, with the growth of mass media, public relations, and electronic communication, it was possible to produce fame without any necessary relationship to outstanding action or achievement. Thus, the hero, whose fame is the result of distinctive action or exceptional, meritorious character, has been superseded by the celebrity, whose notoriety is manufactured by mass media without regard for character or achievement; the signs of greatness are mistaken for its presence. In Boorstin’s definition, the celebrity is a ‘human pseudo-event’. The phenomenon of celebrity is a symptom of a media-driven culture in which artifice has displaced reality, and in which merit and attention have become uncoupled. Although, as Leo Braudy (1986) has shown, the oppositions between pure, ‘real’ fame and inauthentic ‘artificial’ celebrity do not fully hold up—historically, fame and merit have never been firmly and exclusively coupled—the conservative critical approach to celebrities as false, vulgarized heroes has pointed towards historically new features. Modern media, through the increasingly sophisticated creation, management, and reproduction of images, have an unprecedented capacity to place a person on the cultural radar screen, quickly and with no necessary reliance on the person’s publicly celebrated actions or character.

2. Celebrities as Symbolic Entities

The interest shown by cultural critics in what the workings of contemporary celebrity tell us about the culture that makes it so central has been taken up in many humanities-based approaches to stars and stardom, which tend to consider the symbolic activity that takes place in and through celebrity discourse. Who gets attention, the logic goes, tells us much about the core values, or ideological contradictions, of the society giving the attention. Two themes have been particularly pervasive: the tension between egalitarian and aristocratic cultural strands; and the pursuit of the authentic self.

2.1 Celebrities as a Democratic Aristocracy

One striking feature of contemporary Western celebrity discourse is the way celebrities are accorded a cultural status that is simultaneously ‘above’ the rest of the populace and ‘of’ that populace. Celebrities are culturally constructed as a sort of elected aristocracy, both elevated and brought down by the watching crowds; the celebrity has become one symbolic means through which the population of the unfamous declares its own power to shape the public sphere (Marshall 1997). Moreover, while celebrity culture certifies some people as more deserving of attention and rewards because of their difference from the rest of the population, it also continually demonstrates that such people are ordinary, just like everyone else (Braudy 1986). Thus, popular celebrity discourse embodies an ambivalence about hierarchy in Western democracies: celebrities are celebrated for being better than, but no better than, those who watch them.

2.2 The Search for the ‘Real’ Self

A second outstanding feature of contemporary celebrity discourse is the thematic emphasis on getting ‘behind’ celebrity images to the ‘true’ or ‘real’ self. Celebrity discourse, as Richard Dyer (1991, p. 135) has demonstrated, involves a ‘rhetoric of authenticity’: the question of what a celebrity is ‘really like’, what kind of self actually resides behind the celebrity image, is a constant, whether in the form of tabloid exposés, behind-the-scenes reporting, celebrity profiles, or fan activities such as autograph-seeking. In part, this is because celebrities have the unique characteristic of appearing to audiences only in media texts, while also living in the world as actual human beings—they are images, but are ‘carried in the person of people who do go on living away from their appearances in the media’ (Dyer 1991, p. 135). In part, the theme of realness is the result of the increasing visibility over time of celebrity production mechanisms, raising the question of whether the celebrity image has been manufactured to attract an audience, or whether it reflects a true, deserving self (Gamson 1994). Celebrity discourse, with its heavy rhetorical emphasis on authenticity, thus manifests a larger cultural anxiety about the relationship between media images and lived realities.

3. The Political Economy of Celebrity

While many of the attempts to grapple with the unique symbolic or ideological features of contemporary celebrity have been either entirely speculative or based exclusively on textual analysis, much of the empirical research on the topic has focused on celebrity as an economic and social system. Influenced by the strategies of political economists and organizational sociologists, this research has investigated not so much the cultural meaning of celebrity as the internal organization and economic logic of the celebrity system. In contrast to approaches which assume that film stars are popularly selected for attention, for instance, such analysts tend to see celebrity as the result of ‘the exigencies of controlling the production and marketing of films’ (King 1986, p. 155). Although celebrities increasingly emerge in other social domains (politics, academia, etc.), most attention has been given to the major celebrity production center, the entertainment industry.

3.1 The Hollywood Star System

The pursuit of celebrity, especially in the entertainment business, became highly routinized, rationalized, and industrialized over the course of the twentieth century, with the development of industries, such as public relations, specifically devoted to the generation and management of public visibility. Celebrities are, in this context, marketing tools. In the notoriously risky entertainment business, which requires high capital investment for most of its products, a star is an insurance policy against audience disinterest, used primarily to minimize the risk of financial loss. Thus, star images are typically managed in accordance with the needs of the financiers of the vehicle with which a celebrity is associated—with film stars, for instance, a movie studio. The key nexus is not so much the celebrity and his or her audience, but the celebrity’s backers, who pursue publicity, and journalists, editors, and producers, who provide it (Gamson 1994).

The structure of the star system has changed significantly over its brief lifetime. The early studio star system involved tight control of the production, exhibition, and distribution of films and their associated film-star images by several major studios; stars were under studio contracts, and studio publicity operations were responsible for producing and disseminating celebrity stories and images. When the studio oligopoly was broken up by a US Supreme Court decision in the 1950s, many more parties with a financial interest in celebrities’ careers became involved in the management of celebrity images—personal publicists, managers, agents, in addition to the celebrity himself or herself, joined studio publicists in battling for control of the process. With the growth of television since the 1950s, and the explosion of celebrity-driven media outlets since the 1970s, moreover, it has become both easier to build celebrity and more difficult to retain it—hence Andy Warhol’s famous declaration that eventually everyone will be world famous for 15 minutes. While Hollywood movie studios still generate a large proportion of American and international celebrities, celebrity has become less centralized, and the logic of celebrity has taken hold within a wider range of social spheres, including the worlds of literature, art, business, music, sports, and scholarship.

4. Fandom and the Reception of Celebrity

Analyses of the social and economic organization of celebrity tend to bracket questions of its cultural meanings, and textual analyses of celebrity tend to operate with untested assumptions or assertions about the meaning and impact of celebrity for audiences. Methodologically, both have tended to exclude empirical research into the meaning of celebrities and celebrity in the everyday lives of the fans or audiences encountering them. As the study of culture in general began, in the 1980s, to take more of a methodological turn towards audience research, audience-related aspects of celebrity have also come more sharply into focus. This territory remains, however, underinvestigated, in part because the study of audiences is both methodologically cumbersome and costly.

Considerable thought, if sparse research, has been devoted to the question of the fans’ or audiences’ relationship to celebrities. One early theory, for instance, proposed that mass media such as television facilitate a ‘para-social relationship’ between performers and audience members, in which the spectator comes to relate to the celebrity as if they were in a face-to-face relationship, with the ‘illusion of intimacy’ (Horton and Wohl 1956). Since then, largely through psychoanalytic film theory, discussions have focused on the processes by which audience members identify with celebrities, especially film stars (Tudor 1974; Stacey 1991). Various typologies of identification have been set forth, emphasizing quite a range of activities, uses, and types of attachment developed by audiences in their encounters with celebrities (Marshall 1997). This has been especially important in challenging the assumption that ‘the audience’ for celebrity is a homogenous mass, and that celebrities mean the same thing for all of its members. Para-social identification and celebrity hero-worship, for instance, both appear to be common stances; but ironic, playful, and irreverent interpretations of celebrity images also appear to be prevalent, especially as the pursuit of celebrity itself has become a more common focus of public discussion (Gamson 1994). Particular attention has been paid to the ways groups marginalized on the basis of race, class, gender, sexuality, ethnicity, or age make use of celebrity images for their own purposes:
building group solidarity or expressing rebelliousness and alienation, for example, through the celebration of particular types of stars (strong women, for example, or stars from their own ethnic group) (Stacey 1991).

5. The Future of Celebrity

While there is considerable consensus on certain distinctive features of contemporary celebrity, a tremendous amount of room remains for empirical investigations of the contours of celebrity as a socially organized system of meaning, status, identification, and pleasure; theory on the phenomenon remains much sharper and more sophisticated than research. Three areas in particular are likely to prove fruitful for future research. First, audience research can continue not only to document the range and types of audience positions vis-à-vis celebrities, but also to examine various possible explanations for the variance—audience characteristics, for instance, or qualities of different celebrity domains and types. Second, the process by which the logic of celebrity can and does spread to spheres other than entertainment, and how it may operate differently in those realms, is not well understood. Finally, nearly all of the literature on celebrity has emerged from, and focused on, the USA and the UK. Cross-cultural comparative research is a very promising, and almost entirely untapped, source of insight into the cultural, economic, political, and social logic of celebrity.

See also: Advertising: General; Charisma: Social Aspects of; Elites: Sociological Aspects; Media Effects; Media, Uses of; Reputation; Sport, Sociology of; Status and Role: Structural Aspects

Bibliography

Adorno T, Horkheimer M 1977 The culture industry: enlightenment as mass deception. In: Curran J, Gurevitch M, Woollacott J (eds.) Mass Communication and Society. Sage Publications, Beverly Hills, CA, pp. 349–83
Alberoni F 1972 The powerless ‘elite’: theory and sociological research on the phenomenon of the stars. In: McQuail D (ed.) Sociology of Mass Communications. Penguin Books, Harmondsworth, UK, pp. 75–98
Baudrillard J 1988 Selected Writings. Stanford University Press, Stanford, CA
Boorstin D J 1961 The Image: A Guide to Pseudo-events in America. Harper & Row, New York
Braudy L 1986 The Frenzy of Renown: Fame & Its History. Oxford University Press, New York
Dyer R 1991 A star is born and the construction of authenticity. In: Gledhill C (ed.) Stardom: Industry of Desire. Routledge, London, pp. 132–40
Gamson J 1994 Claims to Fame: Celebrity in Contemporary America. University of California Press, Berkeley, CA
Horton D, Wohl R R 1956 Mass communication and para-social interaction: observations on intimacy at a distance. Psychiatry 19: 215–29
King B 1986 Stardom as an occupation. In: Kerr P (ed.) The Hollywood Film Industry. Routledge and Kegan Paul, London
Lowenthal L 1968 The triumph of mass idols. In: Literature, Popular Culture, and Society. Pacific Books, Palo Alto, CA, pp. 109–36
Marshall P D 1997 Celebrity and Power. University of Minnesota Press, Minneapolis, MN
Mills C W 1956 The Power Elite. Oxford University Press, London
Stacey J 1991 Feminine fascinations: forms of identification in star-audience relations. In: Gledhill C (ed.) Stardom: Industry of Desire. Routledge, London, pp. 141–66
Tudor A 1974 Image and Influence: Studies in the Sociology of Film. St. Martin’s Press, New York

J. Gamson

Censorship and Secrecy: Legal Perspectives

1. Secrecy

Secrecy involves norms about the control of information, whether limiting access to it, destroying it, or prohibiting or shaping its creation. Secrecy is a general and fundamental social process known to all societies. It can characterize interaction at any level—from information that an individual withholds, to secret rites of passage of preindustrial societies, to the secrets of contemporary fraternal or business organizations, to state-held information on national security. Secrecy norms are embedded in role relationships and involve obligations and rights to withhold information, whether reciprocal or singular. In preventing or restricting communication, the legally supported form of censorship discussed here involves secrecy. Yet most secrecy (e.g., concealing information about a surprise party or aspects of one’s past) does not involve formal law, and the law involves secrecy in many other ways.

In a democratic society secrecy and openness reflect conflicting values and social needs and exist in an ever-changing dynamic tension. Efforts to control information occur in a rich variety of contexts. Norms about the concealment of information and restrictions on communication ideally should be considered alongside their opposites—norms mandating the revelation of information and protecting the freedom to know and communicate. Such norms may involve formal legal rules such as Britain’s Official Secrets Act or the United States’ Freedom of Information Act, nonlegally binding formal policies such as a bank’s refusal to reveal client information in the absence of a warrant or the consumer information voluntarily provided on some product labels, or they may involve informal expectations (close friends are expected not to reveal shared secrets to outsiders but are expected to reveal certain personal details to each other, such as
true feelings about shared interests). The correlates and consequences of such variation offer rich material for analysis of the sociology of secrecy. This article reviews some selected social forms, processes, and consequences of secrecy, and the law as applied to censorship.

There is no widely agreed upon general theoretical or conceptual framework for considering secrecy issues. Given their social importance, there is a surprising lack of empirical or explanatory research seeking to understand the contours of secrecy and openness and why, and with what consequences, some forms have the support of law. Nor has there been much research contrasting different forms of legal secrecy.

Philosophers have considered ethics (Bok 1989), students of politics the implications for democracy (Shils 1956, Laquer 1985, Donner 1980, Moynihan 1998), and other social scientists have studied the patterning, processes, and correlates of information control rules across institutions and societies (Simmel 1964, Goffman 1969, Tefft 1980, Wilsnak 1980, Scheppele 1988). The largest body of work is by legal scholars; it emphasizes jurisprudence in often related areas such as the First Amendment, obscenity and pornography, national security and executive privilege, freedom of information, trade secrets, privacy and confidentiality, informant identities, fraud and implied warranties, but generally fails to explain broader social processes.

2. Censorship

2.1 Definitions and Differences

Censorship of communication in the modern sense is associated with large, complex, urban societies with a degree of centralized control and technical means of effectively reaching a mass audience. It involves a determination of what can and cannot (or, in the case of nongovernmental efforts, should and should not) be expressed to a broader audience in light of given political, religious, cultural, and artistic standards. Censorship may involve withholding or editing existing information, as well as preventing information from being created. In the interest of keeping material from a broader audience, content deemed to be offensive or harmful to public welfare is suppressed or regulated.

At the most general level any rule, whether codified or customary, proscribing self-expression (e.g., nudity, hairstyles, body adornment, language use) or the surveillance and suppression of personal communication (phone, mail) can be seen as a form of censorship. But our focus is primarily on state-supported efforts to control mass communication justified by claims of protecting the public interest, a form with profound implications for a democratic society.

Censorship assumes that certain ideas and forms of expression are threatening to individual, organizational, and societal well-being, as defined by those in power, or by those involved in a moral crusade, and hence must be prohibited. It presupposes absolute standards, which must not be violated.

Much censorship assumes that all individuals, not just children, are vulnerable and need protection from offending material—whether pornography or radical criticism of existing political and religious authority. Individuals cannot be trusted to decide what they wish to see and read or to freely form their own opinions.

Some censorship is largely symbolic, offering a way to enhance social solidarity by avoiding insults to shared values (e.g., a prohibition on flag burning). It may be a form of moral education as with prohibitions on racist and sexist speech. Or masquerading under high principles of protecting public welfare and morals, it may simply involve a desire to protect the interests of the politically, economically, and religiously powerful by restricting alternative views, and criticism or delegitimation of those in power.

Among the most common historical rationales are political (sedition, treason, national security), religious (blasphemy, heresy), moral (obscenity, impiety), and social (incivility, irreverence, disorder). These of course may be interconnected. What they share is a claim that the public interest will be negatively affected by the communication.

Censorship may be located publicly relative to other legal forms of secrecy. Censorship is justified by the protection of public welfare. Rationales for other legally supported forms include: the protection of private property for trade secrets; economic efficiency and fairness justifications in common law disputes over secret information; the encouragement of honest communication and/or protection from retaliation underlying forms such as lawyer–client and doctor–patient confidentiality, the secret ballot, and a judge’s in camera ruling that the identity of an informant need not be revealed; the protection of intimate relations in the case of spousal privilege; the protection against improperly elicited confessions underlying the Fifth Amendment; the strategic advantage justification of sealed warrants and indictments; and the respect for the dignity and privacy of the person justification for limits on the collection and use of personal information, whether involving census, tax, library, or arrest (as against conviction) records. There has been little empirical research on whether, how well, with what consequences, and under what conditions these justifications are met.

Censorship is involuntary, unlike a nondisclosure agreement that parties to a court settlement voluntarily agree to. Censorship is unitary and nondiscretionary—those subject to it don’t have the option of communicating. In contrast, the dyad of a
confidential professional relationship is discretionary for one party, such as the client; with the client’s permission, a doctor or lawyer may reveal confidential information. Censorship seeks to withhold information from a mass audience, rather than a given individual, as with controversial laws preventing revelation of the identity of birth parents to adoptees.

Where information exists but censors prevent its release, it is intended to remain secret. In contrast are legal secrets involving a natural cycle of revelation such as sealed indictments and search and arrest warrants, which become known when executed, or an industry confidentiality agreement, which may expire after a few years. Censorship as a form of secrecy stands alone. It is not reciprocally and functionally linked with its opposite—the legal mandate to reveal. For example some civil grand juries compel testimony, but then promise to keep it confidential.

Censorship is distinct from government regulation of fraudulent or deceptive commercial communication, which, unlike opinion and artistic expression, offers a clearer basis for empirically determining truth, as with the Federal Trade Commission’s truth in advertising requirements. Censorship is separate from restrictions on communication based on copyright infringements, where the issue is not secrecy, but wrongful use. It is also distinct from editorial gatekeeping based on other criteria such as quality, cost, demand, and relevance, and in the case of regulating public demonstrations and entertainment, public safety and order. These can of course mask a desire to censor that would not otherwise be legally supported.

Government-legitimated censorship is distinct from censorious outcomes that may result from the actions of private groups. With the separation of church and state, only censorship by government has the support of law. Nongovernmental organizations such as a religious group or social movement may prohibit, or attempt to dissuade, members and others from producing, disseminating, or reading, listening to, or viewing material deemed objectionable. They may request editorial changes, advocate boycotts, and lobby school boards, libraries, bookstores, and theaters to exclude such material.

When we look at social processes of information control such as withholding information and selective presentation, a form of censorship may sometimes be seen in propaganda, public relations, and advertising, as public and private sector actors pursue their interest in creating favorable public impressions. Consider for example cigarette companies withholding information on the health risks of smoking or tire manufacturers not revealing the knowledge that their tires are unsafe.

2.2 Historical Development

Interest in the topic is strongly related to developments in communications technology and current events. In the modern period, continuing a trend that began with the printing press, new technologies, involving newspapers, mass-produced books and magazines, radio, telegraph, telephone, television, film, audiocassettes, video, fax, and the Internet, with their unprecedented ability to reach large numbers of people relatively easily, inexpensively, and efficiently, have created demands from conflicting groups for greater openness and freedom of communication and greater control over it. The conflict and debate continues—note conflicts over cable TV and efforts to regulate content and access to the Internet.

Following revelations in the USA about Watergate and government spying and disruption of the civil rights and antiwar movements during the 1970s, the groundbreaking Freedom of Information Act was passed and the Supreme Court strongly reaffirmed the principle of no prior restraint on the press in the Pentagon Papers case (New York Times Co. v. United States 1971).

As the examples of Socrates who chose to die rather than to have his ideas censored (or Plato who argued for censorship of the arts), the Romans who censored plays and banished offending poets, Pope Gelasius in the fifth century who issued the first papal list of prohibited books, and the Inquisition beginning in 1231 indicate, technology is hardly needed to spur censorship.

However, demands for censorship of religious and political ideas gained significant momentum in the fifteenth century with the appearance of that most subversive of technologies (after the invention of writing)—the printing press and the subsequent spread of literacy. This broke the historic monopoly, however limited, of religious and government institutions on communication with the masses. Authorities tried and are still trying (often in vain) to control the new techniques of mass communication. Later with the separation of church and state and the increased power of the nation state, the reach of religious censorship declined (e.g., prosecution for blasphemy) while political censorship gained in importance, as did ideas of free expression, which both countered and provoked censorship.

In the West, cultural values from the Enlightenment elaborated on by Immanuel Kant and later J. S. Mill and others stressed the importance of freedom of expression and openness as central to finding the truth, and for the stability and effectiveness of democratic government. Individuals were optimistically assumed to be responsible and rational beings who would reach the best conclusions, whether involving normative or empirical truth, with full information and discussion. Scientific ideals involving the ability to question and the freedom to communicate fit here as well. For both government and science, visibility or transparency is believed to be a central factor in accountability. Later arguments emphasized that the psychological well-being and dignity of the person
were best served by the freedom to express one’s self and form one’s own opinions. The argument based on personality has been stronger in Europe than in the USA.

In the second half of the twentieth century, with the allies’ victory over fascist governments in World War II, the fall of colonialism, and the ending of the cold war, the cultural force of democratic ideals involving freedom of inquiry and expression has grown stronger. The principle of freedom of expression is contained in the First Amendment to the United States Constitution, various United Nations documents, European Constitutions, and documents such as the European Convention on Human Rights and Fundamental Freedoms. In the USA, the Supreme Court’s extension of the protection of the First Amendment to the states meant that numerous nineteenth- and early twentieth-century state and local laws sanctioning censorship were in principle unconstitutional, although in practice there was often strong local support for such laws. This can be seen in struggles over education (e.g., the Scopes ‘monkey’ trial involving the teaching of Darwinism in Tennessee in 1925), the routine denial of First Amendment rights to labor protestors up to the 1930s and civil rights protestors through the 1960s, and various local struggles over efforts to ban books from libraries.

For Western style pluralist democracies, formal government censorship is the exception rather than the rule, at least relative to absolutist authoritarian regimes, which believe they have the only truth (whether political, religious, or moral) and do not permit opposing views. In 2000, an annual survey of press freedom found that 80 percent of the world’s population lives in nations with less than a free press; about one-third of the countries were considered to have free press and broadcast systems; and one-third had systems with strong government control (Freedom House 2000). An extreme example is from the government of Iran where Salman Rushdie’s book Satanic Verses was not only banned for being blasphemous, but a reward was offered for Rushdie’s death.

Direct organizational means of government censorship must be considered separately from the availability of resources to create and distribute information and from informal means of censorship, whether by government or private interests, including self-censorship.

While freedom of expression is a central component of the modern democratic state, among democracies, there is considerable variation in censorship by content, media of communication, place, and time period. Constitutional and legislative guarantees of the individual’s right to freedom of expression are not absolute. In considering the social consequences of exercising a right, courts and legislatures balance it against other rights and community needs and standards, such as the presence of the clear and present danger that Justice Holmes wrote of in Schenck v. United States (1919). In the 1968 case of United States v. O’Brien the Supreme Court held that local laws could regulate time, place, and manner of expression if done in a content neutral fashion, narrowly tailored to serve substantial government interests and if alternative channels of expression were left open.

There is often disagreement about the social consequences of expression and how material should be defined. Does exposure to sexually or violently explicit words and images result in incitation and mimicry as some research claims (Itzin 1993, National Academy of Sciences 2000), or is it a safety valve and thus an alternative to action as others claim (Segal and McIntosh 1993, Hein 1993)? Does the prevalence of violent and sexual content reflect or create public demand? Can a reasonable consensus be reached on the distinction between pornography and erotic art? Can heterogeneous, rapidly changing societies with multiply porous borders meaningfully talk of community standards?

There has been little research on variation in censorship. Political and religious expression has generally received greater legal support than other forms such as sexual expression. Printed matter has greater protection than other media. Film, live audience presentations, the Internet, and cable TV have greater protection than conventional television and radio where there is a scarcity of spectrum. Artistic expression likely to raise the concerns of censors is generally ignored until it seeks to reach a mass audience via the media or museums. Material appropriate for adults may not be suitable for children. Freedom of expression and access to information generally have greater protection in the USA than in Europe (e.g., greater tolerance of hate and other offensive speech, and stronger freedom of information laws and protections against libel suits).

2.3 Methods of Censorship

Three major means of direct censorship are preventive in nature. Their goal is to stop materials deemed unacceptable from appearing, or if that is not possible, from being seen or heard by prohibiting their circulation:

Formal prepublication review. This requires would-be communicators to submit their materials for certification before publicly offering them. Soon after the invention of the printing press, the church required review and approval before anything could be printed. However impractical and difficult to enforce in the contemporary period, to varying degrees such ‘prior restraint’ is found in authoritarian societies, whether based on secular political (as in Cuba) or religious doctrines (as in Iran and Afghanistan) at the turn of the century. It may be seen in democracies during emergency periods such as a war. There may be formal
review boards, or censors may be assigned directly to work at newspapers and broadcasting stations.

Government or interest-group monopolization of publication. Here the censors in effect are the producers and are the only ones allowed to offer mass communication. For much of its history the church was intertwined with government and was in effect the only publisher. In the former Soviet Union the press and media were government controlled and private means were prohibited.

Licensing and registration. The means of production and transmission of information may be limited to trusted groups who agree to self-censorship in light of prior restrictions. In England in the sixteenth century, printing was restricted to one official company and all books had to be cleared by religious authorities prior to publication. Four centuries later, China required that all Internet content providers be registered with the government and abide by vague content restrictions. Permission may be required to own a printing press, and in some countries, even ownership of a typewriter has been regulated.

Government subsidized programs for the arts and journalism may come with political and cultural strings attached. In the Soviet Union, sponsorship of artists’ and writers’ associations stressed ‘socialist realism,’ a doctrine that held that art should serve the purposes of the state. Those rejecting this doctrine were neither subsidized nor offered access to the public, and they risked prosecution, as with Nobel Prize winning author Alexander Solzhenitsyn.

In the USA in 1990, under prodding from Congress, The National Endowment for the Arts required that grant recipients sign a nonobscenity oath and that artistic merit be determined by taking into account general standards of decency. A federal court held that the decency clause was too broad and that public funding of art was entitled to First Amendment protection (Karen Finley et al. v. National Endowment for the Arts and John Frohnmayer 1992).

A related aspect involves an informal de-licensing in which individuals deemed to be untrustworthy relative to the official standards are prohibited from communicating. For example, during the 1950s, Hollywood film writers suspected of communist sympathies were prohibited from working in the industry via a black list.

A more subtle form of exclusion involves denying access, as when government officials provide information only to favored journalists believed to put an acceptable slant on their reports. Even where the means of communication are freely available in a legal sense, inequality in resources often means that many potential voices go unheard. Journalist A. J. Liebling has observed, ‘freedom of the press is assured to those who own one.’ A related issue involves the trend toward consolidation of newspapers, magazines, television, and motion pictures under fewer and fewer owners. Such monopolies are unlikely to express as wide a spectrum of viewpoints as would be found with more decentralized ownership.

A related area of licensing can be seen in local requirements that those wishing to hold a public demonstration obtain a permit. Given constitutional protections, such permits are usually granted in the USA, although there may be restrictions justified by the need to maintain public order.

In Western democracies, broadcast media (e.g., radio and television), unlike print media, are subject to licensing. The scarcity of bandwidth requires government regulation and, depending on the criteria used, can be an invitation to censorship. The US Federal Communications Commission, for example, has ambiguous rules regarding the control of broadcast content. The use of certain four letter words deemed to be indecent is prohibited. Although rarely exercised, there is the possibility of license revocation or nonrenewal for violations.

After comedian George Carlin used the word ‘fuck’ in a late night broadcast in 1973, the offending radio station received a warning letter from the FCC. The station then sued, claiming that FCC regulations on indecent speech violated the First Amendment. The Supreme Court (FCC v. Pacifica Foundation 1978) upheld the FCC action and added an additional controversial, very broad, censorship rationale known as the ‘pervasiveness doctrine.’ Under this doctrine, regulation is required because, ‘the broadcast media have established a uniquely pervasive presence in the lives of all Americans’ and offensive and indecent material delivered over the airwaves confronts the citizen, ‘not only in public, but also in the privacy of the home.’ Given the ease with which indecent communications may enter the home, children must be protected from unwillingly or willingly encountering them.

The elastic quality of a standard such as ‘pervasiveness’ could be used to justify censorship of any form, even of books and newspapers that also pervade society. Indeed, when Congress passed the Communications Decency Act in 1996 to regulate Internet content, a medium not characterized by spectrum scarcity, it was argued that the Internet pervades the home just as the radio and television do and hence must be regulated. However the Supreme Court found this Act unconstitutional in ACLU v. Reno in 1997. As the Internet evolves, and to the extent that it becomes a platform for delivering voice and video communications that parallel traditional broadcasts, conflicts over the appropriateness of its regulation will likely intensify.

In contrast are means applied after the fact, which seek literally to block or destroy communications or to punish and deter. A classic example, likely seared on the memory of anyone who has seen it on film, is the Nazis’ burning of books in 1933. Materials originating from suspect sources that do not cooperate with censors and/or are from outside a country may be
categorically blocked. This may happen through technical means as when the former Soviet Union electronically jammed communications of Radio Free Europe or through the seizure of material at borders. China has created an electronic wall around its Internet to block access to material from nonapproved sources (e.g., among sites blocked are the New York Times and CNN). Until a US federal appeals court (US v. One Book Entitled Ulysses 1934) found that James Joyce’s Ulysses was a work of art, even though it contained ‘dirty words,’ US customs authorities routinely seized literature deemed inappropriate. The US Postal Service prohibits the importation and domestic transmission of certain forms of obscene communication. With the 1934 decision we see the seeds of later Supreme Court rulings such as Miller v. California 1973, which held that work could be prohibited only if, taken as a whole, it had no redeeming value as art or science, was patently offensive, and was not in keeping with local community standards.

The conflict between the principle of no prior restraint and any modern society’s legitimate need to control some communication has resulted in a variety of after-the-fact sanctions (e.g., criminal offenses involving espionage, revealing national secrets, obscenity, pornography, and incitement; injunctions to cease publication and administrative sanctions; and civil remedies such as invasion of privacy, libel, and defamation).

In the USA in 2000 (in contrast to Britain, which has an Official Secrets Act), the disclosure of properly classified information (with certain exceptions such as information on the design of nuclear weapons and the names of intelligence agents) is not a crime. However disclosing such information can result in losses of security clearance, dismissal, and fines.

Communication has a special quality. Unlike most other legally regulated subjects, most communications cannot be legally prohibited before they are offered, but once offered, legal penalties may apply. Thus, there is a paradox and some uncertainty and risk for communicators, particularly those pushing boundaries. The goal here, beyond punishment for the infraction in question, is to warn and deter others in the hope of encouraging self-censorship.

On a statistical basis, the major form of censorship in western societies is self-censorship. Publishers, editors, and producers of mass communications are aware of boundaries not to be crossed, even though there are many gray areas. Communicators generally stay within the borders, whether to avoid prosecution or lawsuits, to avoid offending various social groups, to keep the channels of government information open, to please stockholders and advertisers, or out of their sense of patriotism and morality.

Press and broadcast organizations (e.g., The National Association of Broadcasters) and the major newspapers and television networks have codes of ethics and voluntary standards. Between the time a journalist writes a story and its appearance there are several levels of review. The large broadcasting companies have internal units that review everything from advertisements to program content before they appear. When Elvis Presley appeared on national television for the first time, only his upper body was shown, given censors’ concern with what was then considered to be his prurient hip shaking.

Another prevalent nongovernmental means, often undertaken to avoid the threat of more stringent government controls, involves voluntary rating systems. Here the goal is not to ban the material but to give consumers fair warning so they can make up their own mind. In the 1920s, the Motion Picture Association of America created a seal of approval for films meeting its standards. In 1968, it created its rating system (expanded in 2000) for films based on nudity and violent content. Some comic books, TV programs, music videos, video games, music, and web sites are also rated. This may be welcomed as consumer information or seen as censorship that can chill expression and create an undesirable climate, opening the door to greater control.

Technical means of information control such as the v-chip, which permits programming a television set so it will not receive material deemed objectionable, and various software filters, which screen web sites for sexual and violent content, facilitate private control. Such means are seen as more efficient than a heavy-handed government censor and more consistent with an open, highly heterogeneous society. Those who want such material have access while those who might be offended can avoid exposure.

There are degrees of censorship. More common than outright prohibition, particularly in the case of erotic material (which has received increased legal protection in recent decades in the face of local legal prohibitions that appeared in the nineteenth century), are time, place, manner, and person restrictions—whether required by government or undertaken voluntarily. Potentially offensive material is segmented and walled off from those for whom it is deemed inappropriate. For example, pornographic material may be restricted to red-light districts, to adults, and to late night programming when minors are presumed not to be watching; children may be prohibited from certain concerts, and stores may refuse to sell them violent videos.

2.4 Limitations of Censorship

While government censorship makes a symbolic statement, it is often rather impractical beyond the short run, given the ubiquitous nature and continual improvements in mass communication technologies and the leaky nature of most social systems. Of course computer-based technologies may make it easier to
track whom an individual communicates with and what material they access. But on balance, technology appears more likely to be on the side of freedom of expression than the side of the censors. The ease of modern communications, in particular remote forms (such as radio, television, fax, and the Internet) whose transmission can transcend national borders and means of reproduction (such as photocopiers, scanners, audio- and videotaping, and printing through a computer) that are inexpensive and relatively easy to use and conceal, limits the ability of censors. The Internet, if available, has the potential to make everyone a publisher. Its ‘many to many’ communication through labyrinthine networks (chat rooms, bulletin boards, and e-mail) is far more difficult to censor than the traditional ‘one to many’ communication of the newspaper or television station and it is less expensive.

Given the expanding scale of published material and the diffusion of communications technology that began with the printing press, government and religious bodies are forever trying to catch up. This is the case even in modern highly authoritarian settings. For example, in China during the Tiananmen Square protest, fax technology kept China and the world informed of the events, and in Iran the fall of the Shah was aided by smuggled audiotapes urging his overthrow. Strong encryption, which protects messages, also makes the censor’s task more difficult.

Beyond technical factors, censorship is often accompanied by demand for the censored material. Censoring material may call attention to it and make it more attractive (the forbidden fruit/banned in Boston effect). Black market demand for such material makes it likely that some individuals will take the risk of creating and distributing it, whether out of conviction or for profit. Potential communicators often find ways to avoid or deceive censors, whether by using satire, parable, or code language, by changing the name of prohibited publications, or by simply defying the law, as with the many underground ‘samizdat’ presses that challenged communist rule in Eastern Europe.

It is also the case that sometimes, ‘the truth will out.’ In a democracy illegitimate political censorship in the name of national security or executive privilege is vulnerable to discovery (note Watergate and the Iran-Contra affair). In spite of its dependence on government sources, the mass media may play an important counterbalancing role here, watching those who seek to watch. Beyond investigative reporters often using the Freedom of Information Act, such ‘dirty data’ may be revealed by the legal procedure of ‘discovery’ in court cases, by experiments and tests, by leaks, whistleblowers, and participants with a Dostoevskian compulsion to confess, and by uncontrollable contingencies such as accidents (e.g., the crash of an airplane carrying Watergate ‘hush’ money). The more complex and important a cover-up or illegal conspiracy, the more vulnerable it is to revelation. Even most legitimately classified government secrets have a shelf life and must be revealed after 75 years.

In the long run it is also difficult for censors to deny pragmatic outcomes and those that are empirically obvious. This raises the intriguing sociology of knowledge question of the relationship between culture with its significant, but not unlimited, elasticity and a level of reality or truth that, when in conflict with culture, may erode it over time. No matter what the power of the church to prosecute Galileo for heresy and ban his work, or the amplitude of its megaphone to assert the earth was flat, it could not suppress the truth for long.

See also: Censorship and Transgressive Art; Censorship in Research and Scholarship; Communication and Democracy; Confidentiality and Statistical Disclosure Limitation; Mass Communication: Normative Frameworks; Privacy of Individuals in Social Research: Confidentiality; Secrecy, Anthropology of

Bibliography

Aumente J 1999 The role and effects of journalism and Samizdat leading up to 1989. In: Aumente J, Gross P, Hiebert R, Johnson O, Mills D (eds.) Eastern European Journalism Before, During and After Communism. Hampton Press, Cresskill, NJ
Bok S 1989 Secrets: On the Ethics of Concealment and Revelation. Vintage Books, New York
Coetzee J 1996 Giving Offense: Essays on Censorship. University of Chicago Press, Chicago
Donner F 1980 The Age of Surveillance: the Aims and Methods of America’s Political Intelligence System. Knopf, New York
Freedom House 2000 Press Freedom Survey 2000. Freedom House, Washington DC
Green J (ed.) 1990 The Encyclopedia of Censorship. Facts on File, New York
Goffman E 1969 Strategic Interaction. University of Pennsylvania, Philadelphia, PA
Hein M 1993 Sex, Sin, and Blasphemy. Free Press, New York
Herman E, Chomsky N 1988 Manufacturing Consent: The Political Economy of the Mass Media. Pantheon, New York
Itzin C 1993 Pornography: Women, Violence and Civil Liberties. Oxford University Press, New York
Laqueur W 1985 A World of Secrets. Basic Books, New York
Marx G T 1984 Notes on the collection and assessment of hidden and dirty data. In: Schneider J, Kitsuse J (eds.) Studies in the Sociology of the Social Problem. Ablex, New York
Moynihan D 1998 Secrecy: The American Experience. Yale University Press, New Haven, CT
National Academy of Sciences 2000 Promoting Health: Intervention Strategies from Social and Behavioral Science Research. National Academy Press, Washington, DC
Pool I D S 1983 Technologies of Freedom. Belknap Press of Harvard University Press, Cambridge, MA
Popper K 1962 The Open Society and its Enemies. Routledge & Kegan Paul, London
Segal L, McIntosh M (eds.) 1993 Sex Exposed: Sexuality and the Pornography Debate. Rutgers University Press, New Brunswick, NJ
Scheppele K 1988 Legal Secrets: Equality and Efficiency in the Common Law. University of Chicago Press, Chicago
Shils E 1956 The Torment of Secrecy. Free Press, Glencoe, IL
Simmel G 1964 The Sociology of Georg Simmel (ed. Wolff K). Free Press, Glencoe, IL
Strum P 1999 When the Nazis Came to Skokie: Freedom for Speech We Hate (Landmark Law Cases and American Society). University of Kansas Press, St. Lawrence, KS
Sunstein C 1993 Democracy and the Problem of Free Speech. Free Press, New York
Tefft S 1980 Secrecy: A Cross Cultural Perspective. Human Sciences Press, New York
Wilsnak R 1980 Information control: A conceptual framework for sociological analysis. Urban Life (Jan)

G. T. Marx

Censorship and Transgressive Art

‘Transgressive art’ violates aesthetic or broader cultural norms. It provokes, challenges, and shocks audiences, causing anger, insult, or anxiety. Restrictions such as censorship may result, where parties inside or outside the production process impose modifications on work, force its suppression, or induce creators to self-impose constraints on what they make in order to avert potential problems. Artistic actions and the social reactions they call out are thus organically linked. Transgression in the world of art can also be a vital source of innovation.

1. The Term Transgressive Art: Three Meanings

Transgressive art has become a buzzword in the art world of the mid- to late 1990s. In the broadest sense, it connotes any artwork that offends audiences. That includes creations that could be deemed blasphemous for violating religious doctrines, be they traditional religions or dominant civic creeds; work that defies social hierarchies and would be regarded as subversive by those in power; and art that deals with ‘unsavory’ or ‘impolite’ topics. Transgressive art thus becomes a catch-all term, embracing a wide variety of work within its scope. In this regard, art is transgressive if it is prickly. In another age, the same material might have been condemned or dismissed as obscene, heretical, or merely ‘bad art.’

But ‘transgressive art’ also has two more specific, and inter-related, meanings. First, transgressive art corresponds to philosopher Immanuel Kant’s ‘sublime,’ as opposed to the ‘beautiful.’ Whereas the beautiful is soothing, representing order and regularity, the sublime is surprising; it embraces the obscure, the novel, and the terrible. The sublime challenges the senses and produces restlessness, as opposed to satisfaction. The sublime excites fear; it is unpredictable and unanticipated. Although this distinction came to prominence in the mid-eighteenth century, it has obvious relevance to the contemporary avant garde, for whom unsettling the status quo is a primary principle.

Second, artists have expanded the meaning of aesthetic since the mid-1980s to beyond the object, and beyond the art world. This type of art supersedes the aesthetic formalism of Clement Greenberg by actively engaging the world, critically dissecting it, and proposing an alternative future (Kester 1998). ‘Art’ frequently encompasses activist interventions in the art world or the larger community, uncovering disguised social processes such as power, racism, and sexism (e.g., the feminist collective Guerrilla Girls, or Hans Haacke, whose 1971 exhibition unmasking the complex operations of a Manhattan slumlord was canceled by the Guggenheim Museum because of its ‘extra-esthetic’ nature). It can therefore encompass process rather than object. This means expanding the concept of ‘public art’ beyond heroic depictions of ‘great men’ and their deeds, and beyond large-scale, transcendent ‘masterpieces’ by confirmed artistic geniuses, to face-to-face collaborations with real communities (Jacob 1995), or enlarging the definition of art to include the voices of formerly marginalized groups, presented in untraditional venues, and employing unconventional media. This can lead to the violation of norms and taboos broader than those of the art world itself. The first page of the catalogue accompanying the Whitney Museum of American Art’s exhibition ‘Abject Art: Repulsion and Desire in American Art’ (1993) boldly represents this position: ‘[O]ur goal is to talk dirty in the institution and degrade its atmosphere of purity and prudery by foregrounding issues of gender and sexuality in the art exhibited’ (Ben-Levi et al. 1993, p. 7). The materials in the object list included dirt, hair, excrement, dead animals, and rotting food.

1.1 The Contemporary Coalescence

Generally speaking, since the last quarter of the nineteenth century (with the rise of Impressionism and the breakdown of the Academy system throughout Europe), artists have veered between two major aesthetic strategies. One approach has been ‘art for art’s sake,’ where artists produce work primarily for a small group of like-minded individuals, using what could be compared to an exclusive language of expression. The other style is producing art aimed at the wider public, incorporating social and political themes. The first strategy is aesthetically daring but hermetic; the second has often been aesthetically conservative but liberally accessible. Dada and Abstract Expressionism are examples of ‘art for art’s sake’; the social realism that dominated art production
in the United States during the Depression of the 1930s, or the work of muralists such as Diego Rivera who were active throughout the same era in Mexico, represent socially and politically oriented art.

But contemporary artists are distinctive: they frequently have mobilized both approaches, combining inventive techniques with urgent social issues. What has been the impetus for this dramatic innovation? Artists have been pulled from the cloister of their studios by larger social events since the mid-1980s. The AIDS crisis hit American artists in disproportionate numbers. They responded to inadequate or biased media coverage, and the relative dearth of official political responses to the epidemic, by treating this subject with an activist stance (the now-ubiquitous ‘Silence = Death’ logo was originally a neon installation in the window of New York City’s New Museum of Contemporary Art). Furthermore, the empowerment movements that gained strength during the 1960s, particularly the civil rights movement, feminism, and gay liberation, affected the art world by expanding opportunities for artists and art administrators from these groups, as well as making artists more self-reflexive about the structure of the art world and communities beyond. Finally, conservative religious and political leaders in the United States who reacted to the success of the disenfranchised in expressing themselves led a series of attacks to cut off public funding of the arts, targeting unconventional expression. Contemporary artists have used both traditional and transgressive strategies to confront these pressing issues and resist these assaults.

1.2 The Fundamental Nature of Transgression

‘Transgression’ signifies that rules have been violated. The most useful conceptual tool to understand the significance of this is the work of anthropologist Mary Douglas on moral pollution. In order to structure interaction, every society consolidates experience into what Douglas calls ‘natural categories,’ binary oppositions that consign phenomena to one classification or another. Significant natural categories include sacred and profane, public and private, masculine and feminine, internal and external. Societies differ in where they assign particular behaviors, but the process is basic across human groups.

Any infraction of these natural categories resounds throughout society, threatening the sanctity of other distinctions as well. Controversial artists have run afoul of convention, and have been prime movers in what one observer has termed a ‘category crisis’ (Garber 1992), a pervasive blurring of previously precise symbolic boundaries. All have committed these figurative affronts in some manner or another.

Such violations are not entirely new. For example, Jean-Baptiste Carpeaux produced a sculptural group for the facade of the new Paris Opéra, La Danse (1865–9), featuring a cluster of frenzied female dancers surrounding a génie de la danse. The central figure is sexually ambiguous, bearing the slim body of a male youth capped by a female face. S/he troubled many viewers and became the lightning rod for a heated public controversy, as did Princess X, a sculpture by Constantin Brancusi (1916). In one respect this work resembles the classic pose of a Madonna. But it also unmistakably resembles a set of male genitalia, thus challenging the categories of male and female, sacred and profane. Furthermore, a production of The God of Vengeance (penned by playwright Sholem Asch, 1910) ran afoul of public opinion and the law in New York City in 1922. This is the tale of Yankel, who operates a brothel in the basement while he maintains a ‘wholesome’ atmosphere for his wife and chaste daughter, Rivkele, upstairs. Sacred and profane, public and private are segregated, until Rivkele falls in love with one of the prostitutes. Such causes célèbres were novel in their time; they are much more commonplace today.

The photographer Andres Serrano’s notoriety was established by Piss Christ (1987), where he placed a plastic crucifix into a container of his own urine. Whereas many people were repulsed by the idea of mixing sacred and profane when this work came to wide public attention in 1989, one canny reviewer shifted the figure/ground perspective: instead of seeing the crucifix as being despoiled, perhaps more importantly the urine was being sanctified. After he used a variety of vital fluids in his work (including milk, blood, and semen), his subsequent projects included The Morgue (photos that provided a peek into the generally forbidden realm of corpses, in 1992) and A History of Sex (1997), which featured images of intergenerational sexual relations (an 87-year-old grandmother with her young lover) and of a nude woman stroking the erect penis of a horse (underscoring the cultural regulation of ‘proper’ sexual expression and the symbolic boundaries of taboo).

At about the same time that Serrano was vilified, so was the photographer Robert Mapplethorpe. A significant portion of Mapplethorpe’s work features sexually explicit or homoerotic images. A few nude or seminude portraits of children were also judged in sexual terms by some social critics. Mapplethorpe’s extensive series of self portraits represents a range of transgressions: in 1978 he became the Devil incarnate with his leathers and a bullwhip stuck into his rectum; in 1980 he was the ethereal waif with lipstick pout, mascaraed eyes, and naked, vulnerable torso. From demon to sexually ambiguous figure, Mapplethorpe tweaked public sensibilities in the 1980s and 1990s.

Finally, the widespread outcries against Martin Scorsese’s film The Last Temptation of Christ (1988) and Salman Rushdie’s novel The Satanic Verses (1988) have similar origins in violations of symbolic boundaries. In both instances, major religious figures were demystified by focusing on their humanity, alongside
their divinity. But to the fundamentalist mind, this was an unbearable assault on received wisdom and sanctified territory (Dubin 1992).

1.3 Bad Girls and Gender Deviation

As demonstrated, protection of natural categories, and intolerance of their combination or violation, are at the base of a significant portion of modern-day controversies about art. Feminist performance artists Karen Finley and Annie Sprinkle transgress social expectations of femininity, the self, and the ‘proper’ display of the body in public: they use ‘earthy’ language, vault from persona to persona, and reveal intimate parts of their bodies and their functions (Sprinkle invites audience members to examine her genitals with a flashlight; Finley smears the outside of her body with raw eggs, squirts breast milk, and rubs yams over her buttocks, upsetting the presumed order of nature). In one of her best-known acts, We Keep Our Victims Ready, Finley slathers her nearly naked body with chocolate she scoops out in handfuls from a heart-shaped box. Next she sprinkles herself with tiny red cinnamon candies, followed by a layer of alfalfa sprouts, and topped off with Xmas tinsel. Her body has become her canvas. Id trumps everything else in this type of expression (Dubin 1992).

Finley and Sprinkle continue the tradition of feminist ‘bad girls’ who have pioneered performance and ‘body art’ since the 1960s, such as Carolee Schneemann: in Meat Joy (1964) she enlisted others to join her in an uninhibited romp with raw fish, chickens, and other items; in Interior Scroll (1975), she literally gave birth to her art as she removed a rolled text from her vagina. In more recent times, French artist Orlan has embarked on a series of seven plastic surgeries which will transform her face into an amalgam of idealized female features: Mona Lisa, the goddess Diana, and Botticelli’s Venus among them. Photographer Marina Vainshtein has altered her body in a different respect: her torso is covered with tattoos evoking the Holocaust, including skeletons, swastikas, barbed wire, crematoria, and the notorious inscription on the gate to Auschwitz, Arbeit macht frei (‘Work leads to freedom’). Her ‘project’ transgresses traditional Jewish prohibitions against altering the body, at the same time that it symbolically re-enacts and then repossesses or ‘controls’ the violation committed by the Nazis when they dehumanized their victims by tattooing numbers on their skin.

These defiant women chafe at the way the female body has been objectified in the past, and challenge the constraints imposed upon them by men. The 1994 exhibition Bad Girls (The New Museum of Contemporary Art, New York) began with a linguistically loaded title: ‘bad’ becomes ‘good,’ a badge of honor from this perspective; and ‘girl,’ freighted with indignity when uttered by a condescending male, becomes an affectionate code word for sisterhood. Works in the exhibition highlighted women who self-consciously work beyond the symbolic boundaries of prim femininity, skewering the cultural assumptions of how they should act.

Of course, male artists have used their bodies as an artistic medium as well: Vito Acconci’s Seedbed (1971) featured the artist masturbating beneath a ramp in a New York City gallery, while visitors strolled above. Chris Burden’s Shooting Piece (1971) involved having a friend shoot him in the arm; in Trans-Fixed (1974), he was crucified over the back of a Volkswagen. The late Bob Flanagan famously nailed his penis to a board. Ron Athey currently performs ritualized piercings and mutilations. These performances transgress convention, to be sure. But in vital ways they harmonize with male norms of sexuality and bravado, unlike their female artistic counterparts who more directly challenge gendered norms of behavior.

2. Transgressive Art and Social Response

During every epoch and in every conceivable place where modern cultures have developed, we can expect to observe this peculiar dialectic: a particular type of social control evolves to address specifically whatever the powers-that-be find threatening. The ‘problem’ inevitably calls out the ‘solution.’ This is what philosopher Michel Foucault described as ‘perpetual spirals of power and pleasure’ (Foucault 1978), which symbiotically bind opponents to one another in pitched battle. In other words, enforcing the rules and violating them both provide gratification. Prelate and infidel, prime minister and traitor, each characteristically complements the other in the struggle for ascendence.

2.1 Censorship: Different Forms

For all its flaws (it is an extremely vague term, all too promiscuously applied to depict any obstacles your opponents erect to limit, direct, and/or interfere with your absolute freedom of expression), censorship is the most recognizable and purposeful term we have to cover such situations. (Unlike Eskimos, with their purported multiplicity of descriptors of snow, speakers of English encounter a serious deficit of phrases characterizing constraints on expression.) ‘Censorship’ is a contested term; its meanings are reckoned along a variety of axes.

Censorship’s visibility is one pivotal characteristic, be it overt or covert, de jure or de facto, regulative or constitutive (Jansen 1991). Much censorship is occult, brought to light only when taken-for-granted procedures fail to stem the tide of forbidden thoughts, or
when they are unable to anticipate when and where new types of danger may erupt, and in what form. As the understanding of hegemony alerts us, the most efficient form of social control is through ideas, not by force. Whenever a society effectively socializes everyone into supporting the same core belief system, controls will seem natural, matter-of-fact, unremarked. This promotes the sort of stability where prescription and proscription dictate and regulate nearly all thought and behavior. An efflorescence of censorship therefore commonly signals some breakdown, an unraveling of established rules.

Censorship’s essential nature and primary targets are additional attributes. Emile Durkheim’s justly famous dictum asserts that deviance is to be expected even in a society of saints (Durkheim 1938). In a religious community, then, like the seventeenth-century French convent swept by a witch craze in Aldous Huxley’s The Devils of Loudun (1952), blasphemy and heresy will be the shape given to doctrinal aberrations, and punishments will be meted out accordingly. In other societies where a puritanical moral system is to be defended, censorship will be directed at alleged instances of perversity. Whatever fashions the fundamental character of the community also shapes how the threats to it are perceived: those things which must be banished from view and from earshot.

The particular site or source of censorship is also critical. In its classic, perhaps purest form, censorship is a prerogative of government, be it civil or ecclesiastical. Censorship is something officials are entitled to impose, hence state censorship. Market censorship exists as well: certain groups are excluded from having their say because they lack the resources or access which would allow them to disseminate their ideas easily. Systematic exclusion from the marketplace of ideas typically parallels the exclusion from other opportunities and rewards a society may have to offer some of its citizens, but not to others, characteristically because of their race, religion, gender, class, or sexual orientation. Finally, we must consider self-censorship, where individuals hold themselves in check, consciously or unconsciously, either out of fear of bringing official action down upon their heads, or from the fatigue and frustration which result from continually trying to get past cultural gatekeepers and getting rebuffed.

These are, of course, ideal types. In any specific case, one or more of these forms of regulation might be operative, either simultaneously or sequentially. Censorship may occur at any point along the process of production or distribution of cultural goods, and in cases where hegemony maintains a stranglehold upon independent thinking, even at the point of reception and consumption (Dubin 1997).

The main examples of transgressive art cited earlier have all endured negative reactions and restrictions of one type or another. Carpeaux’s sculptural gender-bending provoked extensive debate in periodicals of the day, suffered defacement with a bottle of ink, and even drew an order of banishment issued by Napoleon III. The salon president had Brancusi’s Princess X removed from the Salon des Indépendants in 1920, pre-empting the possibility of adverse public response or even police action. All the cast members and the producer of The God of Vengeance were arrested and charged with obscenity, immorality, and indecency (they were found guilty, but their sentences were suspended; the play was forcibly closed). Theaters screening The Last Temptation of Christ endured bomb threats, picketing, and vandalism, and it was even banned in certain locales. Iran’s Ayatollah Khomeini declared Salman Rushdie’s The Satanic Verses blasphemous in 1989 and issued a fatwa or death decree against the Indian-born writer; 10 years later, Rushdie (who is now a British citizen) was tentatively beginning to make public appearances again.

Furthermore, a long and acrimonious national debate erupted in the United States over the work of Andres Serrano, Robert Mapplethorpe, and Karen Finley; it reached from the halls of Congress to backwater radio talk shows. In each case, a small amount of federal money had been appropriated by the National Endowment for the Arts (NEA) to underwrite exhibitions of their work. Senator Alfonse D’Amato ripped up the catalogue from an NEA-sponsored fellowship show that included the photo Piss Christ, on the Senate floor in 1989. Moreover, an enraged teenager damaged Piss Christ with a hammer while it was on display in an Australian museum in 1997; administrators subsequently closed the exhibition of Serrano’s work ‘for security reasons.’ His History of Sex provoked bomb threats in the Netherlands.

The Contemporary Arts Center in Cincinnati, and its director Dennis Barrie, were indicted for pandering pornography when it hosted a Robert Mapplethorpe retrospective in 1990 (they were acquitted). Karen Finley was denied an NEA fellowship in 1990 after she had gained notoriety as ‘the chocolate-smeared woman’ and became the poster girl for what conservative opponents of the NEA felt was an illegitimate use of public funds (she and three other performance artists, the ‘NEA Four,’ later successfully sued to have their fellowships restored). Significant threats and impediments to the unfettered circulation of these works materialized in every instance. Yet also in every instance, generally at great personal cost, the creators and their works prevailed.

3. Summing Up

One of the difficulties with the term censorship is that it commonly connotes finality. More accurately, censorious actions are part of an ongoing process of
action and reaction that demonstrates a variable tools to understand these phenomena. The extent to
equilibrium. Cultural products typically display a which artists utilize the potential of new technologies
resilience that enables them to persist, despite meeting will have a significant impact upon how viable their art
substantial obstacles to their successful completion remains as a generative force in the twenty-first
and circulation during particular times or in specific century, and the degree to which social scientists will
venues (Dubin 1999). Even in extreme situations where continue to study them.
the creator is permanently silenced or the product
destroyed, history often records the deed. Culture See also: Art: Anthropological Aspects; Art, Sociology
typically outlasts those who would crush it. of; Cultural Studies: Cultural Concerns; Film: Genres
Transgressive art is commonly credited with de- and Genre Theory; Postmodernism in Sociology;
stabilizing taboos (Ben-Levi et al. 1993) and providing Postmodernism: Philosophical Aspects; Television:
a source of innovation in the art world (Graves 1998). Genres
Both views undoubtedly have merit. But it is equally
true that at the beginning of the new millennium,
transgressive art has become a norm in itself. In that
respect it has become somewhat banal, incorporating Bibliography
shock for the sake of shock. What was on the extreme Apel D 1997 The tattooed Jew. New Art Examiner 10: 13–17
margins of comprehension and acceptability not so Ben-Levi J, Hauser C, Jones L C, Taylor S 1993 Abject Art:
long ago has quickly become absorbed into the Repulsion and Desire in American Art. Whitney Museum of
mainstream. Artists who seek to be transgressors at American Art, New York
present are increasingly hard pressed to uncover virgin Douglas M 1973 [1970] Natural Symbols: Explorations in
territories to explore. Cosmology. Vintage, New York
This provides both an opportunity and a challenge Douglas M 1984 [1966] Purity and Danger: An Analysis of the
to researchers. In the past, social scientists studied a Concepts of Pollution and Taboo. Routledge & K. Paul,
London
wide variety of transgressions employing a Durk- Dubin S 1992 Arresting Images: Impolitic Art and Uncivil
heimian, normative approach. Mary Douglas’ work Actions. Routledge, New York
subsequently moved investigators toward a more Dubin S 1999 Displays of Power: Memory and Amnesia in the
structural form of analysis, continuing the predilection American Museum. New York University Press, New York
to analyze such phenomena under the rubric of Dubin S 1997 Pressed to the limit: printers and the problematics
‘deviance.’ But in a postmodern world, these earlier of censorship. Journal of Visual Anthropology 9: 229–41
frames of reference are not as adequate as they once Durkheim E 1964 [1938] The Rules of Sociological Method. Free
were. Press, New York
As a broad range of social breaches has become Foucault M 1978 The History of Sexuality: Vol. 1: An Introduc-
tion. Pantheon, New York
routine in contemporary society, the role, the import, Garber M 1992 Vested Interests: Cross-Dressing and Cultural
or even the continued existence of an artistic aant Anxiety. Routledge, New York
garde is in question. Marginal art might no longer be Graves L 1998 Transgressive traditions and art definitions.
‘where the action is.’ Researchers may turn instead to Journal of Aesthetics and Art Criticism 1: 39–48
other realms to examine creativity, the defiance of Jacob M J 1995 Culture in Action: A Public Art Program of
symbolic boundaries, and the eventual social responses. Sculpture Chicago. Bay Press, Seattle, WA
The frontiers of innovation are frequently found in the Jansen S C 1991 Censorship: The Knot That Binds Power and
domains of science and technology. New forms of Knowledge. Oxford University Press, New York
communication, insight, and philosophical explora- Kester G H 1998 Art, Actiism, and Oppositionality: Essays from
Afterimage. Duke University Press, Durham, NC
tion are as likely to spring from the Internet as from Rose B 1993 Is it art? Orlan and the transgressive act. Art in
artistic styles such as Impressionism, a prime example America 2: 82–7
of how art that once posed a dynamic, transgressive Tucker M, Tanner M, Goode Bryant L, Dunye C 1994 Bad
challenge to perception has become static, conven- Girls. MIT Press, Cambridge, MA
tionalized.
Future researchers will discover fresh veins of S. C. Dubin
material in the social worlds people create with and
around computers. Computers alter discourse (who
you contact, where, for what purposes, and with what
frequency); identity and perception (‘presentation of
self’ becomes more malleable, ‘reality’ more manifold);
the distribution of ideas and images (the availability of Censorship in Research and Scholarship
sexually-oriented material); and the process of cre-
ation itself (an array of raw material can be ‘sampled’ Censorship is the suppression or alteration of speech
and reconstructed in myriad forms). This fluidity of or writing prior to publication in the interests of an
categories denotes and defines our postmodern world, alleged higher social good. The narrowest definition
and researchers will need to develop new theoretical would limit the use of the term to authoritative action,

usually, but not always, applied by governments. However, historically, conflicts with prevailing religious doctrine have been a major reason for censorship imposed by clerical authorities. While that remains a major motive in some areas of the modern world, censorship has come to be associated primarily with the actions of governments. The rise of totalitarian governments in the twentieth century produced perhaps the most egregious and systematic programs of censorship in history, both as a consequence of the range of behavior over which such governments claimed sovereignty and of the ability of the modern state to impose and enforce its rules on its citizens. But democratic governments too engage in acts of censorship when their central values appear to be threatened or, more accurately perhaps, when the values of the dominant political or economic forces within them are threatened.

Censorship, as defined above, is frequently associated with patronage, especially in the academic world. Sponsors of research (governments, corporations, foundations, and individuals) are often in a position to close areas of research by not funding them, or to stop or alter the publication of the results of research seen to be inimical to their interests. Patronage, or more precisely, the threat of losing it, may also lead to acts of self-censorship by scholars. As research in the natural and social sciences has become more expensive, these forms of censorship have become increasingly common.

It is important to distinguish censorship, which is always accompanied by the threat of legitimately imposed sanctions, from disagreements that may be accompanied by efforts to persuade others not to read or listen to the offending writing or speech. The two are frequently confused, though the former prevents free speech, while the latter is, itself, an exercise of free speech.

1. Censorship and Research

Scholarship and research, by their very nature, have always been vulnerable to censorship. The questions asked, the methods used, the answers found, and their implications may challenge a prevailing orthodoxy and, in the extreme and most dramatic cases, may threaten to undermine the fundamentals of ruling regimes, secular and clerical alike. In democratic societies social science research is unlikely to be seen as threatening to the interests of the state. It is much more likely, however, to be seen as a threat to some group interest in the society and to elicit governmental action through the political supporters of that group. Indeed, since social science research, particularly large scale projects, has come to rely on governmental funding, and since the subjects of social science research often touch on and challenge deeply-held social and religious values, resisting efforts to stifle it may, on occasion, require a degree of legislative self-restraint that democratically elected legislatures cannot be expected to muster. Two especially sensitive areas of research, sex and race, illustrate the point. For an excellent discussion of restraints on research in these areas in the United States, see Hunt, The New Know-Nothings.

2. Censorship and Research on Sexual Behavior and Attitudes

In the United States, religion, politics, and the freedom to do research on the subject of sexual practices and attitudes clash more vividly than perhaps anywhere else. The result can be seen in a history of attempts, frequently successful, by the Congress to forbid studies of sexual behavior financed by public funds. In one instance, the research of Alfred Kinsey, the Congress reached into the private sector in an effort to cut off funding for the work. When Kinsey’s two studies, Sexual Behavior in the Human Male (1948) and Sexual Behavior in the Human Female (1953), were published and captured wide public attention, religious and political conservatives became alarmed at what they saw as publicizing depraved sexual practices, thereby contributing to the moral degradation of the society. Kinsey’s research, however, was not publicly financed. Rather it was funded by grants channeled through the National Research Council by the Rockefeller Foundation. Congressional conservatives used the instrumentality of the Committee to Investigate Tax Exempt Foundations. The committee suggested that foundations that supported research like Kinsey’s might well lose their tax exempt status. Whether there was any Constitutional basis for such a threat hardly mattered in the climate of the times. Shortly after the Committee Report was published, the Rockefeller Foundation ended its funding of the research. While the research continued, it did so with far less money, largely earned from the royalties of the books, and, as Hunt reports, ‘… much of its data went unpublished for many years; the findings of a survey completed in 1970, for instance, were not published until 1989.’

The Kinsey case was extraordinary in the attempt of the Congress to reach the private funding of research. In other respects, however, it was a precursor to future attempts to censor unpopular social science and art. Publicly funded research on sex has been constantly in trouble, and in the 1970s and 1980s, several large-scale surveys of sexual behavior that had been approved by the National Institutes of Health were stopped when the Congress cut off funds for them by legislation or pressured political leaders in the Executive Branch to reverse NIH’s decisions.

The most publicized case of the kind involved an exhibition at the Corcoran Gallery of Art in Washington DC of the work of the photographer Robert Mapplethorpe. Included in the exhibition was a
series of explicitly homoerotic pictures. Since the Corcoran Gallery is privately owned and largely privately financed, there was no way for outraged religious groups and their Congressional supporters to reach the Gallery directly. However, the exhibit was funded in part by money from the National Endowment for the Arts which, as a government agency, was directly reachable both through legislation and appropriations. Both avenues were used. Appropriations were drastically reduced, and legislation imposing ‘decency standards’ on Endowment grants was passed. The latter was subsequently declared unconstitutional by the courts, but the Endowment, itself, has yet to recover from the episode and from the continuing implacable hostility of the religious and political right wings of American life.

3. Censorship and Research on Race

Conservative religious groups can claim credit for restraining research into matters of sexuality. But American liberal and civil rights groups can make a similar claim with respect to research that touches on race, a topic that is guaranteed to elicit emotional controversy. Two sets of events, both taking place at the intersection of race and genetics, illustrate the point. The two are similar in some respects, but also different in ways that show important shadings in evaluations of prior restraint of research.

The alleged correlation between race and intelligence has generated more emotional controversy than perhaps any other topic in the social sciences. Educational psychologist Arthur Jensen set off the debate in 1969 with an article in the Harvard Educational Review, ‘How Much Can We Boost IQ and Scholastic Achievement?’ Jensen, relying heavily on the literature of studies of identical twins, concluded that there is, in fact, a large component of heritability in intelligence, measured by IQ tests, and that African Americans on average were at a genetic disadvantage. Jensen’s avowed purpose, as the title of his article suggested, was to force educators to confront that finding and to devise educational strategies that took it into account.

Jensen’s conclusion from the data he reviewed, quite apart from the policy prescriptions that followed from it, was highly debatable in genetic terms, and it was hotly debated. There was no question of censoring his work, though some might have wished to do so if it had been possible. Jensen’s work was wholly funded from private sources. He was, however, prevented from speaking on some college campuses by disruptions or the threat of them. If the term ‘political correctness’ is understood to describe a form of debate in which the object is to anathematize one’s adversary in order to eliminate him from the debate on grounds of moral inferiority, then the Jensen episode can be seen as a warning to social scientists who would venture into the area of race, intelligence, and genetics that they do so at the risk not only of having their work attacked, which is fair enough, but of being personally vilified as pseudoscholars whose purported scholarship is merely a mask for a racist social agenda.

Twenty-five years later, the Jensen episode was replayed, with even greater intensity, following the publication of The Bell Curve (Herrnstein and Murray 1994). The book was met by a storm of criticism, much of it scholarly and serious (the vast majority of psychologists and geneticists found the authors’ work to be wholly without merit), but some of it consisting of ad hominem vilification of the authors.

Neither the Jensen nor the Herrnstein–Murray case meets the strict definition of censorship given at the outset. Rather, they are examples of what might be called ‘virtual censorship,’ in which the threat of personal vilification may serve to prevent some scholars from addressing topics for which they can be certain that even their hypotheses, much less their results, will call down on them painful personal confrontations. There is no way to know whether the Jensen and Herrnstein–Murray examples had that effect on others. However, the case of the aborted conference on Genetics and Crime demonstrates that the governmental equivalent of personal vilification, the threat of political retribution, can have that effect on even the best government agencies.

4. Genetics, Race, and Violence

On May 1, 1991, the National Institutes of Health awarded a grant for a conference on ‘Genetic Factors in Crime: Findings, Uses and Implications.’ The award was based on the recommendation of a peer review group, which had concluded the organizers had done a ‘superb job of assessing the underlying scientific, legal, ethical, and public policy issues and organizing them in a thoughtful fashion.’ Ten weeks later funds for the conference were frozen when the NIH Director, Dr. Bernadine P. Healy, concluded that the review group’s recommendation had not been strong enough for the meeting to go forward in the face of public criticism.

The public criticism that concerned Dr. Healy came from a coalition of psychiatrists and civil rights groups that argued that government sponsorship of the meeting was yet another step toward legalization of the racist idea that violence is genetically based. The Center for the Study of Psychiatry asserted that there are no known genetic factors in crime, but that merely raising the issue arouses racial prejudices and distracts us from the true causes of the high crime rate within the inner city, including poverty, unemployment, racism, dangerous and inadequate schools, drug abuse, family dysfunction, and a variety of other social and economic factors. And as one opponent put the more extreme argument, ‘It is clear racism. It is an effort to
use public money for a genocidal effort against African Americans.’

The cause was taken up by the Congressional Black Caucus, and the problem facing the management of NIH became not the merits of the project or the protection of the peer review process, which the NIH pioneered and developed, but the protection of the agency from the fallout of a political storm. The decision was not surprising. From the point of view of agency managers, political embarrassment leads to political problems, putting at risk agency interests, programs, and careers that far outweigh the value of a single project, no matter how meritorious.

5. Industry and Censorship

Because social and behavioral research is not heavily funded by industry, it is unlikely that it will be heavily affected by the problems that have accompanied industry’s support of scientific research, especially research in the biomedical sciences. The restraints imposed by industrial sponsors are typically motivated by concern for the protection of intellectual property, and most commonly they take the form of required delays in publication for a specified period of time to allow for review by the sponsor for intellectual property implications. There have been occasions, however, in which corporations have gone further and prohibited publication because the results of the research would be publicly embarrassing to the company, a situation more analogous to that sometimes faced by social scientists.

A potential risk does face social scientists, even though they are not heavy recipients of corporate support. It is an indirect but real risk, especially for scholars in universities. Universities are often willing to protect their scholars from restraints on publication threatened by the government. However, if they are simultaneously accepting limitations on publication as a part of industrial contracts, their ability to hold a principled position against the government is seriously compromised. Indeed, during the 1980s and early 1990s, when it was the policy of the American government to use the Export Control Act to restrict the international dissemination of unclassified but allegedly sensitive research findings, it was common for government officials to argue that universities were willing to do for money what they would not do to protect the national interest. It is an argument that resonates powerfully with legislatures and with the public, and once it is accepted it applies to any research that can be invested with a claim of national interest. Since that is an open-ended category, the willingness of universities to accept prior restraints on publication as the price they must pay for industrial money could greatly compromise their ability, or even willingness, to defend social scientists whose work is challenged.

It is in the nature of governments and other important social interests to protect themselves from challenge or embarrassment. It is in the nature of social science to provide both from time to time. Tension between the two is, therefore, unavoidable. The main safeguard of social scientists’ right to publish freely lies in their willingness to join together to assert it when threatened and the willingness of their employing institutions, individually and collectively, to join with them.

See also: Censorship and Secrecy: Legal Perspectives; Censorship and Transgressive Art; Civil Liberties and Human Rights; Civil Rights; Ethics and Values; Freedom/Liberty: Impact on the Social Sciences; Freedom of the Press; Fundamental Rights and Constitutional Guarantees

Bibliography

Herrnstein R J, Murray C 1994 The Bell Curve: Intelligence and Class Structure in American Life. The Free Press, New York
Hunt M The New Know-Nothings: The Political Foes of the Scientific Study of Human Nature. Transaction Publishers, New Brunswick, NJ
Jensen A 1969 Harvard Educational Review 39: 1–121

R. M. Rosenzweig

Censuses: Comparative International Aspects

Over 200 countries have taken censuses within the last ten years. Because there is a desire to compare data from one country to another, the United Nations has issued guidelines for the taking of censuses. Each country may have its own way of dealing with special populations, but if the methods are documented, comparisons can be made. Countries differ in whether they rely primarily on mail or on census-takers who visit each unit and record the information. Many countries use sampling as part of the census operation, as part of the processing to get an early look at results, as part of an evaluation, or in an effort to reduce the burden on households by not asking every household to complete every question. Because of more recent concerns about privacy and confidentiality, some countries have challenged the use of a census and are looking at new methods of obtaining population data.

1. Distinguishing Features of a Census

National censuses, drawing on the sponsorship and authority of national governments, typically provide information for national needs such as population counts used to reapportion legislatures. National
censuses also serve local needs, such as providing population data for state and local agencies for planning for health, education, and many other services.

Because local populations may grow dramatically between censuses, many cities and states or provinces take local censuses to benefit from more up-to-date population figures. The emphasis in this article is on national censuses and their comparability. Before a census takes place, there must be agreement about what geographic areas are to be included. If new territory has been annexed since the last census, it must be clearly defined. Geographical subdivisions as well as national boundaries must be clearly delineated, and changes from the last census must be clearly stated.

While censuses typically strive for universality, in every country some groups are excluded. Such groups must be decided upon before the census. For example, the population residing in an embassy of a foreign country may be exempt from the census of the country in which the embassy is located. People who are living in the country as temporary workers may be exempt, though this often depends on the amount of time they spend in the country. Military personnel and their families living outside of the country may or may not be counted. (In the USA, they have been counted in some censuses and not in others.) This is different in concept from early censuses, when whole groups of citizens or residents were left out. (See Censuses: History and Methods for examples of censuses that counted only adult men.)

Though not essential, it has become a custom in most countries to tie the census of population to a census of housing, expanding analytical capability. However, it is also important to make provision for those persons who do not live in regular housing, such as the homeless or nomads. Provision must be made for squatters’ colonies, refugee camps, defense areas, and the institutional population.

Modern censuses rely on enumeration of the individual, although often through households or institutions. This implies that data about each individual are collected and recorded, enabling cross-classification by individual characteristics. In some early censuses, or in some enumerations of hard-to-enumerate groups, aggregates from the group have been recorded, thereby prohibiting cross-classifications. In the USA, the first census of 1790 was of this type, collecting data on the number of males and the number of females, but no individual data (Anderson 1988).

Another feature of a census is that it takes place at the same time, or with reference to a specific census day, at all places in the country. To get an accurate snapshot of the country, censuses must take into account all births, deaths, and immigrants as of a certain day, usually denoted as Census Day. (Weather sometimes dictates the physical enumeration of some remote areas of the country at a separate time.)

Periodicity of censuses ensures that trends can be followed. Also, comparisons among countries are more meaningful if they refer to the same time period. The United Nations has been instrumental in trying to set a schedule for censuses. The final step in producing a census is the compilation and publication of the data as soon as possible. Without dissemination, a census cannot be used effectively.

2. Coverage of Population

One of the biggest problems in comparing censuses across countries is the difference between a de jure and a de facto enumeration. A de jure census counts people at their usual place of residence, whether or not they were there on Census Day. The USA, for example, defines an individual’s ‘usual’ residence as the place where one spends the majority of one’s time. A de facto census counts people where they were present at the time of the census. In practice, most countries combine one method with some elements of the other. De jure residence may not coincide with legal residence. The United Nations (1992) recommends a combination of the two methods. For example, the 1980 census in Brazil was described as both de jure and de facto (Goyer and Domschke 1983).

Using the de jure concept means deciding, in advance of enumeration, where to count people who have more than one residence, persons who work at one location and return to their usual residence for the weekend, students at boarding school or college, members of the military who live on base but maintain a residence off base, and other such cases. Comparisons between countries that use de jure and de facto counting rules are difficult, as are those between countries using different de jure counting rules. Documentation of how and where people are counted is the best hope for making correct comparisons.

Another factor to be considered when preparing for a census is how to count certain special populations. A group that is difficult to count in many developed countries is the homeless. Though efforts are made to count them where they receive services such as food or shelter, there is always the chance that they are not counted or are counted more than once. In some countries, the counting of nomads presents difficulties. Many approaches have been tried, such as a tribal approach in which a chief gives the information, or a water point approach in which all nomads are enumerated at water points. The treatment of civilian aliens who work in a country as seasonal workers, who cross a frontier daily to work in a country, or who are residing in a country for some work-related reason is now an important consideration in many European countries as well as the USA. Differences in how countries treat these special populations have an important effect on the estimated size of the world’s population (UN Statistical Office 1992).
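
The difference between de jure and de facto counting rules discussed in this section can be illustrated with a small sketch. The Python example below is illustrative only: the Person record, the place names, and the two counting functions are hypothetical and are not taken from any statistical agency's procedures. It shows that the two rules can agree on the national total while assigning people to different localities.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Hypothetical record for one enumerated individual."""
    name: str
    usual_residence: str         # place where the person spends most of their time
    location_on_census_day: str  # place where the person was physically present


def de_jure_counts(people):
    """Count each person at their usual place of residence."""
    return Counter(p.usual_residence for p in people)


def de_facto_counts(people):
    """Count each person where they were present on Census Day."""
    return Counter(p.location_on_census_day for p in people)


# Invented cases of the kind listed in the text: a student away at college,
# a worker who spends the week elsewhere, and a resident who stayed home.
people = [
    Person("student", usual_residence="College Town",
           location_on_census_day="Home City"),
    Person("weekly commuter", usual_residence="Home City",
           location_on_census_day="Capital"),
    Person("resident", usual_residence="Home City",
           location_on_census_day="Home City"),
]

print("de jure: ", dict(de_jure_counts(people)))
print("de facto:", dict(de_facto_counts(people)))
```

In this invented case both rules count three people in total, but only the de jure rule records anyone in College Town and only the de facto rule records anyone in the Capital, which is why documentation of the counting rule matters when local figures are compared.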

Influencing the coverage of the census is the time of the year when it is conducted. Most countries opt for a time of the year when travel is not impeded by bad weather and when people are at their usual place of residence. This may affect the principle of census taking that calls for simultaneous data collection. In the USA, Alaska is enumerated at a different time from the continental states. If some areas are impossible to reach, coverage will be biased downward. Another time-related factor is whether the enumeration takes place early in the month or mid-month. Canada has found that mid-month is helpful in the enumeration of those people who rely on a monthly check, move out of short-term housing until the next check appears, live on the streets at the end of one month and the beginning of the next, but are usually in housing at mid-month. Tests in the USA confirmed that a mid-month start would be useful there as well, and changing from the traditional April 1 Census Day to mid-March would also provide additional time in which to conduct the census (Cohen et al. 1999).

In conducting the census, several reference periods are used. For example, questions about income usually refer to income in the previous year. Questions about labor force usually refer to the week previous to Census Day, whereas questions about who was living in a household usually refer to the night before Census Day. Having multiple reference periods can often be confusing to respondents and may affect the quality of data. The length of the enumeration period also affects coverage. The shorter the period, the more likely the census will avoid over- or undercounting. In large countries, the time span may be longer, but the extended period often leads to overcounting. Suharto (1995) points out that no country that used a de jure method in censuses between 1985 and 1994 was able to complete the census in a day, while nine countries that used a de facto method completed the census in one day. Though a one-day census avoids the complexity of people moving, coverage suffers. From a survey of countries that took censuses between 1985 and 1994, Suharto (1995) reported that 31 countries completed enumeration in a week. However, countries such as Colombia and the USA took well over two months. As the time period of the enumeration lengthens, coverage errors increase.

Privacy and confidentiality concerns also affect coverage (Suharto 1995). Within the last 20 years, in several countries, concerns about confidentiality have led to decreased census participation. Because census data are used for so many purposes, and public use files are now widely disseminated, the risk that individual census records may be inadvertently disclosed has raised public concern in many countries. A strict provision of census confidentiality is necessary to reassure people concerned about issues of privacy. This feeds into the need for an effective public relations program in which the legal protection of individual census information is stressed.

3. Basic Content

To be able to compare countries by census characteristics, countries need to collect data on a basic list of characteristics with common definitions. The United Nations has been instrumental in developing a list of basic content items with their definitions. Though there is not total agreement on every item or definition, the ability to make comparisons on basic characteristics has been enhanced.

An item of great interest to many countries is the percentage of people who live in urban or rural areas. These data are regarded as an aspect of the developmental stage of a country. To be able to make cross-country comparisons, the definitions must be the same. However, there is no universal definition of urban. In some countries, it depends on population size, sometimes on the presence of a city, on whether an area has urban characteristics, or on whether agriculture is the predominant economic activity. All areas that are not urban are defined as rural.

Another item of interest is that of households and families. In some countries, the terms household and family are used interchangeably. In others, however, household means one or more persons living in a dwelling, whether or not related by blood or marriage, while family is two or more persons related by blood or marriage. Thus, there can be more than one family in a household. Of interest in many international comparisons are numbers of households and families, types of households, number of one-person households, and family sizes (Goyer and Domschke 1983).

Other common characteristics for which cross-country comparisons are frequently made are: sex, age, race, ethnicity, marital status, educational attainment, literacy, religion, languages spoken, citizenship, employment status, occupation, income, and fertility. Some countries will not ask certain questions. For example, the USA will not ask a question about religion. Some questions cause uneasiness and may lead to coverage problems. Questions on citizenship in a country that has many undocumented workers may not be desirable (see Censuses: History and Methods).

4. Organizational Structure

In the more developed countries, there is a permanent census office. In other countries, a census office may be temporary for the period of the census. A permanent office offers many advantages, because what was learned in one census can be factored into the plans for the next. There are more likely to be staff in place who have census experience and who are aware of census concepts, definitions, and planning. In many countries, censuses are viewed as a statistical activity, carried out by an agency that handles statistical activities. In some countries, the census is carried out
by a non-statistical agency, perhaps even the military. may still be of special value in countries with limited
A country is more likely to participate in UN statistical computer power. Sampling is also used in the quality
activities and adopt UN recommendations if it has a control of the data processing. Similarly, in evaluation
statistical organization running the census. studies samples are often selected for coverage or
content studies. Sampling has not yet been used for the
follow-up of people who did not return the census
questionnaire, though it was proposed in the USA for
2000. (See Cohen et al. 1999 for a description of the
sampling approach the US Bureau of the Census
originally recommended for 2000.)
Many people have suggested using administrative
records as the basis for a census, and, indeed,
administrative records in the form of a population
register, augmented by housing registers, are a
replacement for a census in some European countries.
However, there are other ways that administrative
records can be used to improve the quality of data
collected, add to the data, or evaluate the data. The
US Census Bureau has long used a limited set of
administrative records to evaluate its census data
(Edmonston and Schultze 1995). One of the experiments
carried out as part of the 2000 census in the US
was a test of whether the merging of several
administrative record files would produce high-quality
short-form information. The results of this test will be
available in 2002.

5. Type of Enumeration

There are two main types of census enumeration. In
the canvasser approach a census enumerator interviews
the population in an enumeration area and
records the responses on a census questionnaire. In the
self-enumeration approach, the questionnaires are
distributed (in advance) and a member of the household
records the information about the household
and its members on the questionnaire.
There are advantages and disadvantages to each
method. In a country where the literacy rate is low, the
canvasser approach is necessary. In the USA, the
self-enumeration method is used, supplemented by the
canvassing of nonrespondent households and special
populations. Tests have shown that accuracy is improved
in a self-enumeration approach because the
household respondents are more likely to report what
they know about themselves rather than being in-
fluenced by the responses from other households. In
tests for the 1960 census, the US Census Bureau found
that interviewers were influenced by what they thought 7. Evaluation
the responses should be instead of what the facts
showed. Suharto (1995) reported that in the 1985–1994 In comparing censuses across countries, it is important
period, about three-quarters of the countries that took to know about their quality. Many countries carry out
censuses used the canvassing approach. Post-Enumeration Surveys to estimate the coverage of
the census, or checks with administrative records or
additional interviews to estimate the accuracy of the
data for specific items. These evaluations can be
helpful in changing questionnaire items for future
censuses. For example, when people are asked how old
they are, the responses tend to ‘heap’ at the terminal
digits of 0 and 5. Demographers refer to this phenomenon
as age heaping. When the question is about date
of birth, less age heaping results. The results of
evaluation studies also facilitate comparisons of
fertility, disability, and language spoken in the home.
Estimates of coverage by major demographic groups
within countries can help in the interpretation of
differences in rates of different kinds such as mortality
rates, prevalence rates, home ownership rates, crime
victimization rates, and many others.

6. Use of Sampling and Administrative Records

Sampling and administrative records are each efforts
to supplement and improve upon enumeration. Sampling
has been used for many years, and in a variety of
contexts. One common way is to ask most of the
population a minimum set of items and then ask a
much larger set of questions of a sample of the
population. In countries where this is done, the
questionnaires are usually known as the long form and
the short form. The long form has been under
discussion in the USA as a possible deterrent to
response. Though that view is not universally held,
efforts are underway to replace the long form by more
continuous survey activity, which would also have the
benefit of providing more up-to-date data to local 8. Census Processing and Dissemination
areas (Edmonston and Schultze 1995). Another use of
sampling may occur during processing. It may be The entire census data-processing system needs to have
important to get an early indication of some census been decided upon and tested before enumeration.
results, so a sample of questionnaires may be selected, Overexpenditure of census funds on data collection
coded for occupation, industry, and other items, and has sometimes affected the availability of funds to
then only this sample is processed. This use of sampling process the data. Occasionally a sample of the

Censuses: Demographic Issues

questionnaires is selected for coding and processing to Goyer D S, Domschke E 1983 The Handbook of National
facilitate early dissemination. The form and extent of Population Censuses, Latin America and the Caribbean, North
the tabulations is also important and affects inter- America, and Oceania. Greenwood Press, Westport, CT
national comparability. It is important to have a plan Suharto S 1995 Emerging Issues Related to World Population and
Housing Census Program, United Nations Statistics Division,
for the release of census results to the government, to
Technical Notes
other data users, and to the general public. National UN Statistical Office 1992 Handbook of Population and Housing
results provide data for government policy and action. Censuses. Part 1. Planning, Organization and Administration
They also permit international comparisons and of Population and Housing Censuses. Department of Economic
analysis of trends. Small area data are the planning and Social Development. Statistical Office, New York
tools used by local governments and businesses.
Academic researchers analyze census data at each level B. A. Bailar
and often find different results from government
analysts, as well as puzzling results requiring deeper
investigation. To facilitate broad analyses by diverse
analysts, many countries have developed the tradition
of releasing public-use samples of census records. This
use of sampling (from the full census file) also helps
address concerns regarding confidentiality once identi- Censuses: Demographic Issues
fiers are removed. Currently, data are disseminated in
many different forms. Most countries have tradition- Censuses are designed to provide population counts
ally published printed volumes of census data, but for countries or areas within a country together with a
many have also prepared computer files of use to range of demographic, social, and economic data on
researchers, and a number of countries have begun to the population. Basic demographic data provide in-
release tabulations and sample data on CD-ROM and formation on the size, distribution, structure (par-
in the form of files accessible via the Internet. ticularly age and sex), and change of populations;
more broadly, demographic data systems also collect
information on racial/ethnic characteristics, socioeconomic
characteristics, and health (see Demography:
9. Alternatives to a Census Twentieth-century History). As a complete count of
the population, a census differs from other demo-
Though over 200 countries have taken censuses since graphic data systems, but serves to complement them.
1990, several countries have recently abandoned cen- This article describes the essential features of a census
sus taking because of lack of public cooperation or in contrast with other demographic data systems and
vocal opposition, relying instead on other methods discusses the quality of demographic data collected in
such as population registers, administrative records, censuses, with a particular focus on age data.
or a combination of administrative records and
household surveys to supply data needs. Also, the
escalating cost of censuses has led to a consideration in
many countries of other ways of satisfying data needs. 1. Characteristics of a Census
The accuracy of such alternative forms of national
data collection needs careful assessment, and is of A census of population is a complete counting of the
special concern for international comparisons. number of persons in a country (or region) at a given
point in time. Because of the amount of detail available
See also: Censuses: Demographic Issues; Policy in a census, both demographic and geographic, a well-
Knowledge: Census; Population Cycles, Formal conducted census forms the core of a modern govern-
mental statistical or demographic data system. In
Theory of; Population Forecasts; Statistical Systems:
particular, it provides population totals, geographic
Censuses of Population distributions, and characteristic data that can be used
as benchmarks or reference points for other demo-
graphic and socioeconomic data.
The essential features of a census include reference
Bibliography to a well-defined territory and a specific point in time,
Anderson M 1988 The American Census. Yale University Press,
or reference date. The census is designed to enumerate
New Haven, CT every person within the designated territory at the
Cohen M L, White A A, Rust K F (eds.) 1999 Measuring a specific reference date and to collect information on
Changing Nation: Modern Methods for the 2000 Census. their characteristics at that time. As ideal types,
National Academy Press, Washington, DC censuses may be either de facto or de jure, counting all
Edmonston B, Schultze C (eds.) 1995 Modernizing the US the people present in the territory on the census date or
Census, National Academy Press, Washington, DC all the people who ‘usually’ or legally live there


(Shryock and Siegel 1980, p. 92). Since these ideal types or for smaller subgroups of the population, such as
are difficult to implement, most censuses, in practice, ethnic or immigrant populations. Such detailed in-
are a mix of the two. At a national level, the most formation is generally unobtainable in other general-
problematic groups to define and count tend to be: purpose demographic data collection systems.
military, naval, and diplomatic personnel (both for- The major expense, both in time and money, in
eign personnel in the country and native personnel conducting a census is the set of activities involved in
abroad); indigenous populations and nomadic groups; compiling a list of addresses or dwelling units and
civilian nationals temporarily abroad; and civilian actually contacting each household. Thus, the mar-
aliens temporarily in the country (including refugees ginal cost of collecting additional demographic, social,
and displaced persons). The United States census, for and economic data is generally not great (e.g., Edmon-
example, counts most people at their usual place of ston and Schultze 1995, p. 46). Further, with modern
residence, a de jure concept, includes some persons sampling techniques, sufficient precision can often be
temporarily in the country, such as foreign students, obtained by embedding a sample within the census so
but excludes others such as foreign tourists (see that it is unnecessary to burden the entire population
Censuses: Comparative International Aspects). with a lengthy set of questions. The choice of whether
For countries that either receive large numbers of to collect information from the entire population or a
temporary and permanent migrants or send large sample is determined by several factors, including
numbers, there may be a substantial difference be- legal requirements for the data, the desired degree of
tween the de facto and de jure concepts. The alternative precision, and the cost of data collection.
population concepts can be even more problematic for Data collected in a census provide detailed measures
geographic subdivisions within a country than for the of the demographic make-up of the population.
entire country. In addition to the groups previously Typical measures include: marital status, birthplace,
listed, persons with multiple residences (e.g., vacation citizenship, children ever born (or some measure of
or seasonal homes, migrant workers, or persons with fertility), and measures of nationality, ethnicity, or
multiple job sites) may be difficult to assign properly to race. Most modern censuses collect a range of other
a specific subnational area. Defining the nature of a information on socioeconomic status, including: edu-
‘temporary’ stay away from home can create difficul- cation (educational attainment, school enrollment,
ties in assigning residence for a variety of populations; literacy), economic activity (occupation, industry, type
for example, boarding students or residential college of activity, income, commuting activity), and other
students may never return to their parental home but social or political characteristics (language, religious
may be counted at either site (and, in fact, may often affiliation or ethnicity, living arrangements).
be erroneously counted at both sites). The specific set
of residence rules used in a census can have a significant
impact on the measured population size for areas with
large numbers of these problematic groups, areas such 3. Alternative, Complementary Demographic
as university towns or resort areas. Data Collection Systems
Although the principal purpose of censuses is to
count the population, their main utility is generally the A census is the backbone of a national statistical
data collected on the characteristics of the population. system, but it is only one part of a complete system.
The following sections of this article discuss the types Surveys and registration systems fill the data gaps left
of data collected in censuses and alternative data by a census; each system can provide data to check the
collection methods. Then the article addresses the completeness and accuracy of the others. The dis-
quality of census data, the role of demographic tinction between a census and a survey is often a
methods in assessing data quality, and, more specifi- matter of degree. One key difference is that censuses
cally, data on age and sex. are usually designed to determine the size of the
population of an area; surveys rarely do so. Sample
surveys provide the same types of demographic, social,
and economic data as collected in censuses. However,
2. Census Content because of the generally smaller scale of a survey and
because it is usually administered by trained and
Censuses collect a range of information on individuals experienced interviewers, the survey can delve more
and households (or families). In a census, the most deeply into various subjects than a census. The main
basic geographic and demographic data are collected tradeoffs involve the greater detail (categorical and
for every individual in the population, generally: geographic) and the greater precision provided by a
location (or usual residence), age, sex, and relationship census.
to head of household or family. Because of its broad Registration systems, the third part of a demo-
population coverage, a census may be the only source graphic data system, are generally designed to count
for detailed demographic and socioeconomic infor- vital events: births, deaths, marriages, and entries and
mation about subnational areas, especially small ones, exits at international boundaries. They also include


population registers of the entire population or seg- both for census accuracy and for census evaluations.
ments of the population; examples of the latter include Persons with no usual place of residence (such as
persons of draft age, workers covered by social homeless populations or migratory workers), persons
insurance or government health systems, and voters. Popu- with multiple residences, and persons away from their
lation registers, when active records are tallied, can families or homes of record (such as college students,
share many features of a census: population coverage, members of the armed forces) are particularly prone to
geographic coverage, and content (Bogue et al. 1993, being omitted, double counted, or counted in the
Sect. 3, p. 2). Registers differ in that the data are wrong place, which results in a double error—an
generally not collected simultaneously or with refer- undercount at the proper location and an overcount at
ence to a specific point in time. When a register is the wrong location. The numerical difference between
continuously updated (e.g., to take into account the gross undercount and gross overcount is termed
departures through death or emigration, additions the net undercount of a census.
through birth, immigration, or aging, and changes in Statistical techniques for measuring census com-
status or residence), this discrepancy in timing is pleteness and accuracy generally involve matching
removed. While universal population registers are individual census records with a sample of individual
rare, partial registers are very common. In fact, records from another data system. Other systems used
comparisons of census counts with partial registers for evaluation include administrative records from
can provide critical checks on the coverage and registration systems, previous censuses, periodic
accuracy of a census. sample surveys, and specially designed coverage eval-
Vital registration systems count events, not people, uation (or postenumeration) surveys. Statistical eval-
yet they are essential for providing information on the uations offer the potential for measuring coverage
demographic dynamics of a population. Birth and according to a wide range of characteristics and for
death rates are generally computed with vital regis- measuring both gross undercounts and gross over-
tration data as the numerators and population data counts.
from a census as the denominators. Cumulative counts Demographic techniques of census evaluation in-
from birth, death, and immigration registration sys- volve comparison of aggregate statistics or consistency
tems can be used to reconstruct the demographic checks, either internal or with external data sources.
history of a country and check the accuracy of census Because demographic evaluations do not check in-
data or the consistency of censuses across time. Since dividual records, they measure net undercounts (or
the data in vital rates and census evaluations come overcounts) rather than the gross components of
from different collection systems (different registra- census error. Most demographic evaluations rely on a
tion systems, censuses, and surveys), such comparisons version of the demographic balancing equation
must be constructed carefully to maintain consistency (Shryock and Siegel 1980, pp. 105–11). This simple
with regard to definition of population groups, time equation relates the population at a point in time to
references, and territorial coverage. the population at a previous time and the demographic
components of change during the intervening interval:
Population at the final point ($P_1$) equals
Population at the initial time ($P_0$) plus
Births during the interval ($B$) minus
Deaths during the interval ($D$) plus
Immigrants during the interval ($I$) minus
Emigrants during the interval ($E$),

or

$$P_1 = P_0 + B - D + I - E$$

4. Accuracy of Censuses

Censuses are widely used to allocate political power, as
a tool for allocating governmental resources, and a
plethora of private uses. Although improvements in
methods and technology have led to concomitant
improvements in the quality of census data, the range
of uses has sparked concerns over both the completeness
of census counts and the accuracy of the
information collected (Anderson and Fienberg 1999).
Techniques for evaluation of completeness and ac-
curacy can be broadly categorized as statistical in This equation can be applied to the total population of
nature or demographic. The completeness of a census, an entire country, an area within the country, age–sex
or its coverage, is distinct from the accuracy of the groups (where it can be called cohort analysis),
data items collected in the census and involves both racial/ethnic populations, and other subgroups of the
undercounts and overcounts. Undercounts or census population. Since no demographic data are completely
omissions arise from failures to include persons or accurate, correction of errors in the measures of the
households in the census counts. Overcounts can arise components or assessment of the potential error in the
from duplications or multiple counts of the same resulting estimate is an essential part of demographic
person, from fabrications, and from erroneous enum- analysis of census errors.
erations or counts of persons who are not properly The United States Census Bureau has applied
part of the census universe. Problems with proper demographic evaluation techniques to measuring cov-
reporting of residence can present special problems erage of recent censuses with a program called Demo-


Table 1
Demographic estimates of percent net census undercount, by race, for the
United States: 1940–90

Race                          1990   1980   1970   1960   1950   1940
Total                          1.8    1.2    2.7    3.1    4.1    5.4
Black                          5.7    4.5    6.5    6.6    7.5    8.4
Not Black                      1.3    0.8    2.2    2.7    3.8    5.0
Black–Not Black difference     4.4    3.7    4.3    3.9    3.6    3.4

Source: Robinson et al. 1993, Table 2.
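
As an illustration of the arithmetic behind such estimates, the following minimal Python sketch applies the demographic balancing equation and then expresses the gap between the expected population and the census count as a percent net undercount. The function names, the choice of the expected population as the percentage base, and the input figures are assumptions made for this example only; they are not the Census Bureau's procedure or data. The last part recomputes the Black–Not Black difference from the rounded entries of Table 1.

```python
def balancing_equation(p0, births, deaths, immigrants, emigrants):
    """Expected end-of-interval population: P1 = P0 + B - D + I - E."""
    return p0 + births - deaths + immigrants - emigrants


def net_undercount_percent(expected, counted):
    """Percent net undercount, using the expected population as the base
    (an assumption for this sketch; a negative value is a net overcount)."""
    return 100.0 * (expected - counted) / expected


# Hypothetical figures (in thousands), for illustration only.
expected = balancing_equation(p0=10_000, births=1_500, deaths=900,
                              immigrants=400, emigrants=200)
counted = 10_584
print(expected)                                              # 10800
print(round(net_undercount_percent(expected, counted), 1))   # 2.0

# Rounded percent net undercounts as printed in Table 1.
black = {1990: 5.7, 1980: 4.5, 1970: 6.5, 1960: 6.6, 1950: 7.5, 1940: 8.4}
not_black = {1990: 1.3, 1980: 0.8, 1970: 2.2, 1960: 2.7, 1950: 3.8, 1940: 5.0}

# Difference in undercoverage, recomputed from the rounded table entries.
differences = {year: round(black[year] - not_black[year], 1) for year in black}
print(differences[1990])   # 4.4
```

Recomputing from the rounded entries gives 3.7 for 1950, whereas the table prints 3.6; the published row was presumably derived from unrounded estimates, so differences of 0.1 of this kind are to be expected.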

graphic Analysis (DA). The DA estimates draw on a survey design may limit comparisons across censuses,
variety of demographic data to construct estimates of particularly those separated by several decades. For
the population by age, sex, and race—historical the demographic estimates across censuses, however,
statistics on births beginning in 1925 (corrected for the degree of variability and potential biases tend to be
under-registration as measured in a series of regis- consistent because the estimates are developed with
tration tests), historical statistics on deaths and legal consistent data, methods, and assumptions. The sur-
immigration since 1935, administrative data from vey estimates are, however, more flexible in that they
Medicare (the national health insurance system for the can provide many measures that are either not
elderly, also corrected for underregistration), and available from demographic methods or are subject to
demographic estimates of emigration and unautho- potentially large estimation errors; for example, cover-
rized immigration. Comparing the DA estimates with age evaluation surveys can measure gross overcounts
census counts provides a consistent set of coverage and undercounts, coverage error for racial and ethnic
estimates by age, sex, and race for the censuses of 1940 groups for which historical data may not be available,
through 1990 (Robinson et al. 1993); the resulting coverage of subnational geographic areas for which
estimates are shown in Table 1. The DA estimates the required demographic measures of internal mi-
show a steady pattern of improvement in census gration are not sufficiently precise, and other popu-
coverage from 1940 (5.4 percent net undercount) lation groups for which the requisite demographic
through 1980 (1.2 percent) but a worsening of coverage data may be completely missing or highly inadequate
in 1990 (1.8 percent). The same pattern of change is (such as owners and renters, family types, native and
apparent for both the minority Black population and foreign-born populations).
the balance of the US population (largely the majority The accuracy of demographic methods for evalua-
White population). However, the difference in under- ting census coverage ultimately depends on the avail-
coverage between the Black and non-Black population ability of data and the degree of error and uncertainty
shows no such trend; in fact the 4.4 percentage point in the vital statistics and other data from noncensus
difference in coverage in the 1990 Census is the highest sources. A variety of techniques for assessing the
shown. variability in such demographic estimates have been
The demographic estimates of coverage for the proposed (Robinson et al. 1993, Anderson et al. 2000),
series of US censuses have proved to be roughly but none is widely accepted or applied. Further, some
consistent with measures derived from coverage evalu- methods for explicitly combining demographic tech-
ation surveys, for example, a 1990 survey estimated niques with survey results have been proposed to
the net undercount at 1.6 percent, only 0.2 percentage improve the accuracy of coverage measurements (e.g.,
points different from the demographic estimate Bell 1993).
(Hogan and Robinson 1993). The survey estimates, A full evaluation of coverage with demographic
like the DA estimates, showed higher undercount rates techniques may not be achievable in many countries
for the Black population; further, the survey showed because of a lack of the requisite long historical series
high undercount rates for other minority populations of demographic data plus supplementary data sources,
(Hispanics, Asians, and American Indians) that could such as those used for the DA estimates in the United
not be separately measured with demographic tech- States. However, demographic techniques can provide
niques. a variety of consistency checks that yield measures of
The experience of the 1990 US Census illustrates the relative coverage of the census to an external
both the strengths and potential weaknesses of demo- standard or the relative coverage of different subpopu-
graphic estimation techniques. Because of the reliance lations within the same census rather than absolute
on historical data series, the demographic estimates measures of census net undercount. For example,
permit a degree of precision in comparisons of change analyzing the change in population between two
in coverage across censuses that is hard to achieve with censuses either in absolute terms with the balancing
coverage evaluation surveys, for example, sampling equation or as a rate of change provides an indication
variability, other measurement errors, and changes in of the relative coverage of the two censuses. Likewise,


sex ratio analysis (described below) provides a measure can easily lead to enumeration difficulties and age-
of relative coverage of men and women in different age selective omissions from the census count.
groups. Checks of census totals or subtotals against Age misstatement takes a variety of forms. Advant-
independent aggregates such as birth statistics, partial ages accruing to individuals for attaining certain
registers (e.g., draft registrations, national health threshold ages can lead to reporting of these ages on
system enrollments), or universal registers require the part of persons younger than the critical age
correcting the aggregate data for underenrollment or because they perceive reporting older ages to be
definitional differences from the census. If full and advantageous. While these ages may vary from coun-
accurate corrections can be made, such comparisons try to country, the age of majority (often 18 or 21) and
do provide measures of net undercoverage of the age of retirement (often 65) often show striking
census. patterns of overstatement. At the extreme upper end
of the age distribution, especially at ages 100 and over,
there is a marked tendency for individuals to report
exaggerated ages; this tendency is more prevalent in
5. Age, Sex, and Race/Ethnicity Data in countries with poorer birth registration systems and in
Censuses societies and groups with lower levels of literacy.
A particularly striking type of age reporting error is
Perhaps the most crucial demographic data collected the tendency of individuals to prefer reporting ages
in a census are age and sex; consequently, they are ending in certain digits; this tendency is generally
almost universally collected in censuses and surveys. exhibited as a preference for ages ending in ‘0’ or ‘5.’
Much of the science of demography is devoted to Other types of preferences can also be found including
analyzing, accounting for, or controlling for the effects a tendency to report even-numbered ages and a
of age structure on other social and economic pheno- tendency to prefer reporting years of birth ending in ‘0’
mena, so that age is a factor, either explicit or implicit, or ‘5’ which may result in different patterns of age
in almost every demographic analysis. Differences in reporting, depending on the date of the census.
age composition between populations can account for Because digit preference can be severe and can have
many other observed differences, either in whole or in serious ramifications for the use of census data, a
part, including differences in fertility, mortality, sex variety of techniques have been developed to measure
ratios, dependency, household structure, and even the phenomenon and to correct the data. The principal
employment rates. correction techniques involve distributing the popu-
The primacy of age data requires special attention lation across age groups according to historical data
to quality, measurement of errors, and correction. on births or, more generally, ‘smoothing’ the age
While errors of coverage and reporting occur across distribution while maintaining totals for certain age
the entire age spectrum, studies of age reporting from groups with graduation techniques developed mainly
many countries (Ewbank 1981, Spiegelman 1968, p. 59) by actuaries (Shryock and Siegel 1980, pp. 201–29).
have documented several specific types of errors in In contrast to age, classification of individuals
censuses from many countries: according to sex presents no general problems in
(a) underenumeration of specific age groups, es- censuses. Analytic techniques are generally straight-
pecially young children and young adults; forward, relying principally on the sex ratio, conven-
(b) age misstatement around ‘threshold’ ages (age of tionally defined as the number of males per 100
majority, age of retirement); females. Nonetheless, analysis of sex ratios, especially
(c) age overstatement at older ages; by age group, can be a very powerful evaluative and
(d) preference for ages ending in certain digits; analytic tool. Many factors can affect the sex ratio of
(e) failure to report age. a population, but the principal ones are age structure
Whereas underenumeration is a general problem and migration patterns. The sex ratio of births falls in
with census data, age-specific underenumeration af- a fairly narrow range from 101 to 107 and is biologi-
fects the overall quality of these essential data used in cally determined. Differential mortality of the sexes
other analyses. Problems in some age groups can be causes this ratio to change as a birth cohort ages. In
traced to specific types of enumeration issues. Under- developed countries, male mortality generally exceeds
counts of very young children seem to be widespread female mortality at every age so that the sex ratio
and appear to have a variety of causes, including decreases smoothly as a birth cohort ages.
oversight on the part of adult respondents, devaluation In the absence of migration, the sex ratio reaches
of female children in some societies, and overstatement 100 by young adulthood. Above age 50, the sex ratio
of age. The other age group that seems to suffer decreases more sharply with age, reflecting again the
generally from undercounts is young adults (roughly higher mortality of males. As a result of these mortality
ages 15–30). These are the ages in most countries when patterns, older populations tend to have lower sex
individuals enter adulthood, marry, form new house- ratios than younger populations.
holds, and leave their parental homes. The related Departures from this general pattern can be in-
geographic mobility and other changes of residence dicative of underlying demographic, social, or econ-


omic phenomena or anomalies. In some countries, ancestry and changes the salience of such ancestry to
very high sex ratios among young children (i.e., high individual self-identification, more than half of the
proportions of boys) are thought to indicate sex- growth of the American-Indian population can be
selective abortions, infanticide, neglect, or conceal- attributed to inconsistent measurement of this racial
ment of children. Higher sex ratios among young group across censuses (Passel 1996).
adults may indicate high levels of maternal mortality. High incidences of intermarriage, changes in data
In the United States, lower than expected sex ratios collection methods, and changing societal norms all
among adults aged 25–54 are indications of poorer contribute to difficulties in consistently measuring
census coverage among males than females. Low sex racial/ethnic populations over time and across differ-
ratios can also result from excess male mortality in ent data systems. The 2000 Census of the United
wartime; this effect will persist for the lifetime of the States encompasses virtually all of the potential pitfalls
affected cohorts. Migration patterns can alter sex in measuring race. For the first time, respondents to
ratios if migration is sex-selective, as it often is. Areas the US census are allowed (even encouraged) to
with employment concentrations in extractive indus- respond with more than one racial identification.
tries (agriculture, mining), heavy manufacturing, or While this procedure offers a detailed picture of the
military often have high sex ratios as men, more than population in 2000, it does not reflect the way such
women, migrate into these areas in search of jobs. On data were collected in previous censuses; individuals
the other hand, areas with a high concentration of jobs choosing more than one race are not classified in the
in ‘white collar’ occupations and ‘office work’ often same category as in previous censuses. As a result,
have low sex ratios as more women than men are valid comparisons with past (and future) censuses will
employed in these jobs. require modification of the race groupings with so-
The age and sex structure of a population encapsu- called ‘bridging’ techniques; unfortunately for demo-
lates its demographic history. Cohorts are born, age, graphic analyses, the data necessary to produce
and die with predictable and discernable patterns; consistent group definitions across time may, at best,
migrants into and out of an area alter the composition. be approximate or may simply not be available at all.
A census—a snapshot of the population at a given
time—provides invaluable information on this history.
Analysis of the age–sex data, either alone or in
combination with data from other demographic data 6. Conclusion
systems such as surveys, registration systems, and
previous censuses, can tell a great deal about the Censuses provide the principal measure of population
quality of the census data and the nature of the size for countries and geographic subdivisions within
population. countries while also providing detailed information on
Data on race or other similar identifiers such as the demographic, social, and economic characteristics
ethnicity, nationality, or religion are collected in many, of the population. An accurate census is a crucial
but not all censuses. While race data can be extra- feature of a national statistical system, providing
ordinarily useful in demographic analyses and are baseline data and benchmarks for other collection
often critical for applications of census data, the systems. Demographic techniques, particularly for
quality of these data can be quite variable. Groupings assessing age and sex structure and for analyzing
based on race (or similar identifiers) are often per- changes over time, provide essential means for utiliz-
ceived in the popular mind as equivalent to demo- ing the data and for assessing their quality. For further
graphically distinct populations; that is, individuals information on censuses and census data, see Ander-
enter or leave a race group only through the con- son (2000) and Censuses: History and Methods;
ventional demographic processes of birth, death, and Censuses: Comparatie International Aspects.
migration. Indeed, DA as applied to race groups in the
United States treats the Black and not Black popula-
tions in just this manner; and for the 1940–90 results
(described above), this treatment appears quite ap-
Bibliography
propriate. However, in many modern societies, par- Anderson M J (ed.) 2000 Encyclopedia of the U.S. Census.
ticularly multiethnic ones, racial identification and Congressional Quarterly Press, Washington, DC
ethnicity are increasingly becoming matters of per- Anderson M J, Daponte B O, Fienberg S E, Kadane J B,
sonal choice and self-identification rather than of Spencer B D, Steffey D L 2000 Sampling-based adjustment of
societal ascription (Waters 1990). Consequently, in- the 2000 Census: A balanced perspective. Jurimetrics 40:
341–56
dividual membership in or identification with a racial\ Anderson M J, Fienberg S E 1999 Who Counts? The Politics of
ethnic group may change over time. This phenomenon Census-Taking in Contemporary America. Russell Sage Foun-
introduces another ‘component of change’ into demo- dation, New York
graphic measures of the size of a population group. Bell W R 1993 Using information from demographic analysis in
For example, because of the large numbers of indivi- post-enumeration survey estimation. Journal of the American
duals in the United States with some American-Indian Statistical Association 88(423, September): 1106–18

Bogue D J, Arriaga E E, Anderton D L (eds.) 1993 Readings in Population Research Methodology: Volume 1. Basic Tools. United Nations Population Fund, Social Development Center, Chicago
Edmonston B, Schultze C L 1995 Modernizing the U.S. Census. National Academy Press, Washington, DC
Ewbank D C 1981 Age Misreporting and Age-Selective Underenumeration: Sources, Patterns, and Consequences for Demographic Analysis. Report No. 4, Committee on Population and Demography, National Academy Press, Washington, DC
Hogan H, Robinson J G 1993 What the Census Bureau’s coverage evaluation programs tell us about the differential undercount. In: Proceedings of the 1993 Research Conference on Undercounted Ethnic Populations. US Department of Commerce, Washington, DC
Passel J S 1996 The growing American Indian population, 1960–1990: Beyond demography. In: Sandefur G D, Rindfuss R D, Cohen B (eds.) Changing Numbers, Changing Needs: American Indian Demography and Public Health. National Academy Press, Washington, DC
Robinson J G, Ahmed B, Das Gupta P, Woodrow K A 1993 Estimation of population coverage in the 1990 United States census based on demographic analysis. Journal of the American Statistical Association 88(423): 1061–71
Shryock H S, Siegel J S 1980 Methods and Materials of Demography, 4th printing, rev. US Government Printing Office, Washington, DC
Spiegelman M 1968 Introduction to Demography, rev. edn. Harvard University Press, Cambridge, MA
Waters M C 1990 Ethnic Options. University of California Press, Berkeley, CA

J. S. Passel

Censuses: History and Methods

1. Definition/Overview

A census is a count of the population of a country as of a fixed date. National governments conduct censuses to determine how many people live in different areas of the country, and whether the population is growing, stable, or declining in the country as a whole or in particular parts of the country. They also determine what the characteristics of the population are in terms of age, sex, ethnic background, marital status, or income. Generally governments collect the information by sending a questionnaire in the mail or by sending an interviewer to every household or residential address in the country. The questionnaire asks the head of the household, or a responsible adult in the household (the respondent), to list all the people who live at the address as of a particular date, and answer a series of questions about each of them. The respondent or the interviewer is then responsible for sending the answers back to the government agency, which in turn adds up the results, or tabulates or aggregates the answers for the country overall and for political subdivisions such as states or provinces, cities, counties, or other civil divisions. The agency usually reports the results to the public a few months or years after the census; the results are considered ‘news’ and are reported in the media. Since censuses aim to count the entire population of a country, they are very expensive and elaborate administrative operations, and thus are conducted relatively infrequently, generally at five- to ten-year intervals.

Between censuses, governments estimate the size and characteristics of the population, either by extrapolating the trends in the census into the future, estimating the population from other data systems such as tax or driver's license records, or by conducting periodic sample surveys to collect information about the population. Representative probability samples collect information from a small portion of the population, and thus can be conducted frequently, even monthly. In the USA, the Current Population Survey of around 50,000 households is conducted monthly. National governments also conduct other types of censuses, particularly of economic activity, such as an agriculture, manufacturing, or business census. Such censuses collect information on the number and characteristics of farms, businesses, or manufacturing firms. In the nineteenth century, such censuses were conducted at the same time as the population census. Today, the economic censuses are generally conducted on a different schedule from the population census (see Statistical Systems: Censuses of Population).

2. History

Censuses have been taken since ancient times by emperors and kings trying to assess the strength of their realms. These early censuses were conducted sporadically, and generally served to measure the tax or military capacity of a particular area. They tended to count adult men, men liable for military service, or people liable to tithe (pay taxes). For census taking of an entire population to become feasible, a uniform unit of analysis was required. Hence census taking in the West had to await the emergence of the concept of commensurate households, generally seen as a development of the Medieval European West (Herlihy 1985). The household or family served as the unit of analysis or the locus for counting the members within it.

Generally speaking the modern periodic census of all persons is an invention of the early modern period in the European West and, particularly from a New World perspective, was associated with efforts by the home country to determine the success of the overseas colonies. Thus the British Crown and the British Board of Trade ordered repeated counts of the colonial American population in the seventeenth and
eighteenth centuries, starting in the 1620s in Virginia (Wells 1975, Cassedy 1969). In Canada, French efforts to count the population began in 1665–1666 when Jean Talon came to the New World on behalf of Louis XIV and took a census of what became Quebec. Censuses were continued at irregular intervals after Canada became a British colony in 1763 (Worton 1997). By the early nineteenth century, census taking began to be a regular feature of government in Western Europe and North America (Alterman 1969, Nissel 1987, Glass 1973, Desrosieres 1993, Patriarca 1996). The International Statistical Congress (established 1853) and the International Statistical Institute (established 1885) proposed and promoted uniform, regular censuses for all national states. In the twentieth century, census taking spread throughout the world. The United Nations Statistical Office compiles reports on population worldwide (Ventresca 1996).

3. Functions and Techniques

Censuses serve a variety of purposes in different countries. At a minimum a census provides a measure of the size of the population of a country, which can be compared with the population in the past and the population of other countries, and used to make estimates of the likely population in the future. Governments use census information in almost all aspects of public policy, from determining how many children an educational system must serve, to determining where to put new roads. Censuses are also used to provide the denominators of other measures, e.g., measures of per capita income for a state or local area, or for such measures as crime rates or birth or death rates. Private businesses use census data for marketing analyses to determine where to locate new businesses, or to decide where to advertise particular products.

Other government agencies and private researchers use the census to provide the ‘sampling frame’ for other survey research. This is the address list that researchers use to determine where to conduct a sample or poll. In the USA, the Bureau of the Census (or Census Bureau) conducts the census during the tenth year of each decade. Each decade, the population count provides the data for reapportioning seats among the states in the House of Representatives and Electoral College, and for redrawing district boundaries for seats in the House, in state legislatures, and in local legislative districts. In Canada, a full census is taken during the first year of every decade and an abridged census is taken during the sixth year of the decade. The population data are also used to apportion seats among the provinces in the House of Commons and to draw electoral districts (see Policy Knowledge: Census).

Most nations create a permanent national statistical agency to take the census, such as the US Bureau of the Census or Statistics Canada. The agency in charge generally undertakes a public review process to determine the questions to be asked. The questions vary from nation to nation depending on the particular political and social history and conditions of the country. Most censuses include basic demographic information such as the age, sex, educational background, occupation, and marital status of the individual. Race, ethnic or national origin, and religious affiliation are important questions in many nations. Further questions can include the person’s place of birth and relationship to the household head; the individual’s or the family’s income; the type of house the household occupies; and whether the person is a citizen, has moved in the past five years, or speaks a particular language. Questions that are quite routine in one nation may be seen as quite controversial in another, depending on the history of the country. Americans do not ask questions on religious affiliation on the census since it is considered a violation of the First Amendment right to freedom of religion. Other nations, such as India, do collect such information. Questions on the number of children born to a woman were quite controversial in China in recent years because of the emphasis on the one-child policy of population limitation. In the USA, asking a question on income was considered controversial in 1940 when it was first asked. It is no longer as problematic. Questions change in response to public debate about the state of society. Americans wanted to know which households had radios in 1930, and introduced questions on housing quality in 1940. Canadians have recently begun to ask census questions on the unpaid work done in the home.

Taking a census can be divided into several phases. The statistical agency first divides the country into geographical subdivisions to be counted, makes maps and address lists, and prepares instructions for the local census takers. In most countries, this phase requires hiring large numbers of temporary workers to do the counting, or calling upon other government employees at a local level, such as schoolteachers, to conduct the count. The central statistical agency prepares and prints the questionnaires, and distributes them to households either through the mail, or by delivery by enumerators. During the second phase of the count a responsible adult or household head in every household, family, or equivalent institution is asked to fill out the form or respond to the enumerator and supply the required information about each member of the household. Questions for all people usually include a brief set, on a ‘short’ form, for example, name, age, sex, racial or ethnic status, marital status, and relationship to the household head. Generally a smaller sample of households receives a more complicated or ‘long’ form which can have many detailed questions on the individual’s work status, income, housing, educational background, citizenship, and recent moves. The person who fills out the form or the enumerator is responsible for returning it
to the statistical agency. The respondent then mails the form to the agency; the enumerator collects the form in person or the information by phone. During the third phase, the statistical agency enters the data onto a computer and adds up, or tabulates, the responses for the nation, states, or provinces, and cities, towns, and other local jurisdictions. The agency also cross-tabulates the answers, reporting not merely on the number of people in a local area, but on the number of people, for example, in five-year age cohorts, for each sex, for local areas. The agency only publishes the tabulated results of the count, and keeps the individual responses confidential. In the USA, individual census responses are stored at the National Archives. After 72 years, the original forms are opened to the public for use. These original responses are frequently used by people researching the history of their families, or by those constructing genealogies.

Until the 1980s, statistical agencies published census results in large volumes of numeric tables, sometimes numbering in the hundreds of volumes. Since then, census results have become available electronically, on disc, magnetic tape, CD-ROM, or the Internet. The choice of census technique for a particular country is the result of its social and political traditions and technological capacities. The US census is highly automated, and is primarily conducted by mail. Canada sends enumerators to deliver the census form to each household; the household head fills it out and sends it in. Other nations use even more labor-intensive techniques of collecting and tabulating the data. Turkey, for example, currently requires people to stay home to await the census taker and counts the entire population on a specific Sunday census day (see Censuses: Comparative International Aspects).

4. National Traditions

4.1 The United States

The US census was mandated in the 1787 Constitution. At the time of the American Revolution the framers faced many problems of uniting the 13 separate colonies into a national government, including how to allocate political representation among the states, and how to levy taxes.

The initial government structure under the Articles of Confederation gave each state one vote in Congress, and required the states to collect taxes for the national government. This proved unsatisfactory since the states were of widely different sizes, tax capacity, and cultures. The 1787 Constitution derived the sovereignty of the state from the ‘people of the US’ and created a bicameral legislature with representation in the Senate based on the states; and representation in the House of Representatives based upon the population of each state. The Electoral College was composed of electors determined by summing the House and Senate members for each state. The decennial census was designed to provide the population figures for apportioning the seats in the House according to population. ‘Direct taxes’ levied on the states were also to be apportioned on the basis of population (Anderson 1988).

At the time of the Constitution, a racially based slave labor system also existed, and almost 20 percent of the American population were enslaved African-Americans. The framers debated whether slaves were ‘persons’ or ‘property’ and thus whether states should receive representation for their slave populations. The southern states where slavery dominated did not consider the slave population for purposes of apportioning their state legislatures, but they did consider slaves property for tax purposes. The framers could not find an easy solution to this dilemma, and developed what came to be called the Three-Fifths Compromise, which ‘discounted’ the slave population to 60 percent of its actual size when determining the apportionment of the House. The Three-Fifths Compromise thus required a separate count of the slave and the free, mainly white, populations. The Constitution also specified that ‘Indians not taxed’, that is, those Indians who were not considered part of the civil society, were not to be counted in the census. The Three-Fifths Compromise was abolished with the abolition of slavery after the Civil War; the tradition of accounting for the various racial groups in the population continued after abolition.

The first census was taken in 1790. Assistant US marshals were instructed to travel the country and ask six questions at each household. These included: the name of the family head; and for each household, the number of free white males 16 and over; the number of free white males under 16; the number of white females; the number of other free people (the free colored); and the number of slaves. They totalled the figures for their local jurisdiction and sent them to the US marshal for the state, who in turn totalled the figures for the state and sent them to the President. The first American census counted 3.9 million people.

In later years, the census became more elaborate, with more questions asked, and more data published (Cohen 1982). In 1850 Congress mandated a census schedule (form) with a line of questions for each person, including their name. A temporary Census Office, as it was then called, was set up in Washington to add up the tables of responses and publish the results in large volumes. By 1880 when the American population topped 50,000,000, the census was still being compiled by hand, using a primitive tally system. In 1890, the Census Office introduced machine tabulation of the responses, and each person’s answers were converted to codes punched in Hollerith cards, a precursor to the IBM punch card. The cards were then
run through counting machines. This innovation was the beginning of modern data processing, and led to further innovations in tabulating large amounts of data. In the 1940s, the Census Bureau commissioned the construction of the first non-defense computer, UNIVAC, to tabulate the 1950 census. By the late 1950s, the Census Bureau had dispensed with the punch cards and developed an electronic scanning system, called FOSDIC, or Film Optical Scanning Device for Input to Computers, to read the answers on the census form and input the data to computer (Eckler 1972).

In 1940, the US began to collect some census information from a sample of the population, and slowly shifted the detailed questions on the census to the long form asked of only about 15–25 percent of households. In 1970 the American census became primarily a mail enumeration, as the Census Bureau developed automated address files for the country. As of the year 2000, over 90 percent of the roughly 100,000,000 residential addresses in the US receive the census form in the mail. If the Census Bureau does not receive a response, it sends an enumerator to determine if the address is correct and get the information from the household at the address.

4.2 Canada

Regular decennial censuses began in 1851 in Canada. The British North America Act of 1867 required that the census provide population counts for the apportionment of representation in the House of Commons among the four provinces of Ontario, Quebec, Nova Scotia, and New Brunswick, and for the periodic readjustment of the boundaries of electoral districts. The first census of the Dominion was taken in 1871, and counted 3.7 million people.

The first Canadian census after Confederation was also a very elaborate affair, collecting information on agriculture, livestock, animal products, industrial establishments, forest products, shipping and fisheries, mining, and public institutions as well as on population. Canada was primarily an agricultural nation at the time, and the Department of Agriculture conducted the census. In 1905 the census bureau was made a permanent government agency in the Agriculture Department. In 1918 an independent Dominion Bureau of Statistics was created with oversight for taking the census and collecting other statistical information, headed by a Chief Statistician. In 1971, the bureau was renamed Statistics Canada. Canada has a centralized statistical system, with all major statistical work conducted in the same agency (Worton 1997).

Canada began to conduct sample surveys along with the full census in 1941, introduced a quinquennial census (a census in the sixth year of the decade) in 1956, and self-enumeration in 1971. In the 1990s, the Canadian census introduced adjustment of census results to correct for errors on the basis of a post-enumeration check survey. The population of Canada in 1997 was 30.3 million.

4.3 Great Britain

The first Parliamentary proposals for regular censuses of population in Britain date to the mid-eighteenth century, but they did not lead to the institutionalization of census taking until the end of the century. In late eighteenth century Britain, the publication of Thomas Malthus’ Essay on the Principle of Population and concerns that the population was declining led Parliament to pass a bill ‘for taking an Account of the Population of Great Britain and of the Increase or Diminution thereof.’ The first count was taken in March 1801 by the Overseers of the Poor under the direction of former clerk of the House of Commons, John Rickman, and counted 9 million people. Census taking continued every decade afterwards except for the cancellation of the 1941 Census during the Second World War.

In 1837 the General Register Office was organized. The decennial census was made a responsibility of the GRO and used the administrative structure of the registration districts to organize the field portion of the count.

The 1841 census introduced individual-level enumeration. The mid-nineteenth century census was particularly influenced by the ongoing medical and public health work of the GRO leaders such as William Farr. It therefore developed detailed data on occupation, age, and locale for the denominator data required for the publication of mortality and other statistics (Glass 1973, Nissel 1987).

Machine tabulation of the census was introduced in 1911. The 1920 Census Act established a permanent legislative mandate for the census. Sampling was introduced in 1951 by way of the analysis and publication of results from a 10 percent sample of the complete 1951 forms. In 1961, the census used a short form for the entire population and an additional long form for a 10 percent sample. In 1966, a one in ten sample count was taken. The results of the 1961 census were computer tabulated. By 1966 the Census Office had its own computer facilities. The population of the UK in 2000 was 59 million.

4.4 France

France established a population registration system in the late eighteenth century, and began a tradition of quinquennial census taking under Napoleon in 1801. The first census counted a population of 33 million. For much of the nineteenth century, the administration of the count was local with a small national office
(Statistique Generale de la France (SGF)) publishing the results tabulated locally. Individual forms were used in Paris starting in 1817; by 1876 the entire country used three standardized forms: one for the individual, one for the family or household, and a third for the building enumerated. Tabulation was done locally until 1896 when Lucien March introduced machine tabulation. The French pioneered in the analysis of occupational statistics derived from the census as a means of understanding the evolution of the economy from an agricultural and artisan basis to one characterized by professional qualifications and employees with credentials. In 1946, after World War II, the French National Statistical Agency was titled Institut National de la Statistique et des Etudes Economiques (INSEE) (Desrosieres 1991). The population of France at the 1999 census was 60 million.

5. Issues

Censuses provide important information about the population of a country, and can become embroiled in political or social controversy simply by reporting information relevant to ongoing issues in the society. Complaints about the census generally involve concerns about the accuracy of the count, the propriety of particular questions, and the uses to which the data are put (see Censuses: Demographic Issues).

Censuses require public understanding, support, and cooperation to be successful. Concerns about government interference with private life can prevent people from cooperating with what is an essentially voluntary counting process. People may be suspicious of giving information to a government agency, or may object that particular census questions are invasions of privacy.

When such public trust is lacking, people may fail to participate. In recent times, individuals doubled up in illegal housing units, undocumented immigrants who do not reside in the country legally, or individuals who do not wish to reveal their economic or social situation to a government agency, can be reluctant to respond to a census. In the most serious challenges, people claim the results will not be held in confidence and that the census should not be conducted at all. During World War II, census records from countries occupied by the Nazis, for example in the Netherlands, were used to identify Jews for detention, removal, and extermination. The effect of such use was to undermine the legitimacy of the census after World War II. In the Netherlands, which took its last regular census in 1971 and collects population information through other mechanisms, the legacy of the Nazi era was one of the major justifications for ending census taking (Seltzer 1998).

At the other end of the spectrum, political challenges can be made to the census if the census does not count the population well enough. All censuses contain errors of various kinds. Some people and addresses are missed. People may misunderstand a question, or fail to answer all the questions. Census officials have developed elaborate procedures to catch and correct errors as the data are collected, but some errors tend to remain. Since census results are often used to allocate seats in legislative bodies and government funds, such errors undermine the credibility of the census as an allocation mechanism.

In recent years, developments in statistical analysis have made it possible to measure the accuracy of censuses (Choldin 1994, Anderson and Fienberg 1999). Census results may be compared with population information from other sources, such as the records of births, deaths, and marriages in vital statistics. Census officials can also determine the level of accuracy of the count by conducting a second, sample count (a post-enumeration survey, or PES) shortly after the complete census, and then matching the records of the sample and the census. Census officials estimate who is missed, and who is counted twice or in the wrong geographic location, and use these estimates to evaluate the overall accuracy of the count. Canada and Australia adjust the census results for omissions and other errors. Great Britain is planning to adjust the 2001 census (Chambers and Cruddas 2000, Paice and Steel 2000).

In the USA, people who live in cities, the poor, and minorities tend to be undercounted relative to the rest of the country. Since 1970 officials representing such undercounted jurisdictions have claimed that these jurisdictions have suffered loss of political representation and government funding since the apportionment and funding formulas are based on incorrect data. Mayors and leaders of civil rights organizations have pressed for adjustment of the census results, and have filed lawsuits to compel adjustment of the enumeration. The courts have not ordered adjustment of the US census results, and the question of adjustment has emerged as a political controversy in Congress. Since the late 1980s, Republicans have generally opposed adjusting for the undercount, while Democrats have supported it. The nation will debate these issues anew after the 2000 Census since the Census Bureau plans to publish results based upon both adjusted and unadjusted population counts and to provide their professional judgment indicating which data set they consider the more accurate.

Congress, states, and local governments could use adjusted counts for determining legislative districts, funding allocations, and for program evaluation, though such actions are likely to draw further court challenges.

See also: Population Cycles, Formal Theory of; Population Forecasts; Statistical Systems: Censuses of Population

Bibliography

Alterman H 1969 Counting People: The Census in History. Harcourt, Brace & World, New York
Anderson M 1988 The American Census: A Social History. Yale University Press, New Haven, CT
Anderson M, Fienberg S E 1999 Who Counts? The Politics of Census Taking in Contemporary America. Russell Sage Foundation, New York
Cassedy J 1969 Demography in Early America: The Beginnings of the Statistical Mind. Harvard University Press, Cambridge, MA
Chambers R, Cruddas M 2000 A one number census for the United Kingdom. Chance Journal 13(3): 38–41
Choldin H 1994 Looking for the Last Percent: The Controversy over Census Undercounts. Rutgers University Press, New Brunswick, NJ
Cohen P C 1982 A Calculating People: The Spread of Numeracy in Early America. University of Chicago Press, Chicago
Desrosières A 1991 Official statistics and medicine in nineteenth-century France: The SGF as a case study. Social History of Medicine 4: 515–37
Desrosières A 1993 La Politique des Grands Nombres: Histoire de la Raison Statistique. Éditions La Découverte, Paris [1998 The Politics of Large Numbers: A History of Statistical Reasoning. Harvard University Press, Cambridge, MA]
Eckler A R 1972 The Bureau of the Census. Praeger, New York
Glass D V 1973 Numbering the People. Saxon House, Farnborough, UK
Herlihy D 1985 Medieval Households. Harvard University Press, Cambridge, MA
Paice J, Steel D 2000 Census adjustment in Australia. Chance 13(3): 41–2
Patriarca S 1996 Numbers and Nationhood: Writing Statistics in Nineteenth-century Italy. Cambridge University Press, New York
Nissel M 1987 People Count: A History of the General Register Office. Her Majesty’s Stationery Office, London
Seltzer W 1998 Population statistics, the Holocaust, and the Nuremberg trials. Population and Development Review 24: 511–52
Ventresca M 1996 When States Count: Institutional and Political Dynamics in Modern Census Establishment, 1800–1993. Ph.D. dissertation, Stanford University, Stanford, CA
Wells R 1975 The Population of the British Colonies in America before 1776: A Survey of Census Data. Princeton University Press, Princeton, NJ
Worton D A 1997 The Dominion Bureau of Statistics: A History of Canada’s Central Statistics Office and its Antecedents: 1841–1972. McGill-Queens University Press, Kingston, ON

M. Anderson

Center–Periphery Relationships

Concepts of center and periphery became increasingly prominent in the social sciences during the second half of the twentieth century. They were of diverse origins and entailed varied emphases, but these differences have not always been obvious in continued use. There has actually been a cluster of related conceptual pairs: center and periphery, but also core and periphery, metropolis and satellite, and metropolis and province. But they have not been mere synonyms. Moreover, in obscure and perhaps treacherous ways, ideas of center and periphery may occasionally have overlapped in social thought and discourse with other classical contrasts, such as modernity–tradition, or urban–folk.

The sociologist Edward Shils had a major part in introducing the center–periphery pair of concepts into the vocabulary of academic social science (cf. Greenfeld and Martin 1988). His paper ‘Center and Periphery’ originally appeared in 1961, but there was a close affinity between it and several of his other publications in the same period. The overall conception of society and culture in Shils’s perspective was consensualist; but not, he asserted in a later comment, as merely a facile expression of the mood of the era. In his view, the importance of consensus to social life had been underestimated (Shils 1975, p. xi). Thus he concentrated on notions such as ‘tradition,’ ‘ritual,’ ‘deference,’ and ‘charisma.’

Notably, in his ‘Center and Periphery’ paper, these two terms were used in a metaphorical sense: centrality had ‘nothing to do with geometry and little with geography.’ The center was identified with what was ultimate, irreducible, and sacred in the realm of symbols, values, and beliefs; it was also identified in the realm of action with those roles and institutions which embodied such cultural understandings and were most actively engaged in propagating them. Yet in ‘Metropolis and Province in Intellectual Life,’ published in the same year, the facts of uneven spatial organization which are now usually linked to concepts of center and periphery become clear, and are set in what we would now describe as a transnational context. Drawing on his study of the situation of intellectuals in India, Shils argued that in their minds, people have varyingly extensive maps of the world significant to them, and that a major feature of such maps would be their portrayal of one’s qualitative proximity to or distance from the metropolis (Shils 1972, p. 356).

In the former of these two papers, then, the center is a locus of excellence, of grace; the periphery defers to it. In the latter, the metropolis is a center of vitality, a seat of creativity. The province, on the other hand, Shils notes, is frequently taken to be in itself ‘rude, unimaginative, awkward, unpolished, rough, petty, and narrow’ (Shils 1972, p. 357). Cultural salvation lies in an involvement with the metropolis. The choice seems to be one between impoverished autonomy and enriching dependence.

If Shils was a theorist of consensus and cultural authority, the metropolis/satellite and core/periphery pairs, as inserted into the debates of the 1960s and 1970s, pointed in an entirely different direction, resonating with another political mood. ‘Metropolis’ and ‘satellite’ were the terms used by the economist
Andre Gunder Frank (1967), who had spent much of the 1960s in Latin America, studying what he described as the development of underdevelopment, and engaging with the growth of ‘dependency theory.’ Then in the 1970s, introducing ‘world-system theory,’ the sociologist Immanuel Wallerstein (e.g., 1974) contrasted ‘core’ with ‘periphery.’ Both Frank and Wallerstein were primarily concerned with political economy, and with the expanding control and exploitation of the material resources of the ‘periphery’ or ‘satellite’ on the part of the ‘core’ or ‘metropolis.’ In both instances, the emphasis was on conflict and domination within a global order.

The points of view represented by Shils, on the one hand, and by Frank and Wallerstein on the other, seem not often to be brought into direct confrontation, although over the years, they may have come to interact and blend with one another. By now, when center–periphery concepts are more passingly referred to, it is not always entirely clear which of the varieties is involved, if not some combination of them. Yet there is a field of tension here between studies focusing on distributions of material assets and the exercise of power on the one hand and studies of culture on the other; between views suggesting consensus and views emphasizing conflict; and between views treating the social organization of meaning and meaningful forms somewhat in isolation and other views which insist on setting it in the context of political and economic structures.

Probably all would accept, however, that centers and peripheries imply each other. We are dealing with relational phenomena: there is no center without periphery, no periphery unless there is a center. The particular way the center is a center is also reflected in the way the periphery is a periphery.

1. Levels and Varieties

Our conception of center–periphery relationships here is that they are relationships of inequality existing in geographical space. Such relationships, however, have been defined at very different levels of specificity.

During the latter half of the twentieth century, especially in the Cold War era, the world was understood to be divided into three main segments: one Western and capitalist, one Socialist, one ‘developing’ and postcolonial. Here the First and the Second Worlds had their internal center–periphery structures. The Third World, on the other hand, may have had these, but also tended to be seen rather more as a periphery in its entirety, to one First or Second World center or other. At present the extreme macroview may be that of a center North and a peripheral South.

Much of the cultural and political debate over center–periphery relationships remains mostly at such levels of identification. Notions such as Orientalism, Occidentalism, and Eurocentrism refer to understandings and representations of self and other in the context of these large-scale center–periphery structures.

Alternatively, identifications may indeed be made at the national level: the United States or France are seen rather more as centers than Sweden, Romania, or Burkina Faso. In other instances, centers are yet more specifically placed. Nations and regions may have their own centers in particular localities. In the study of historical civilizations, there is frequently an emphasis on centers as sites combining political power with complex ritual life and elaborated knowledge systems in the hands of specialists. A well-known paper by the anthropologists Bernard Cohn and McKim Marriott (first published in 1958), for example, delineates the complementary roles of centers and networks in the integration of Indian civilization (Cohn 1987, 78ff). In related mid-twentieth century work, as Redfield and Singer (1954), also anthropologists, discussed the cultural role of cities, they described the dynamic relationship between the ‘great tradition’ of early urban centers and the ‘little traditions’ of the surrounding peasant societies. The emphasis on the symbolic authority of the center here is clearly reminiscent of Shils’s view.

Yet Redfield and Singer also noted that later urban centers, rather than refining particular local traditions, are often at the crossroads of the world, bringing together diverse traditions and serving as communication nodes for wider areas. In this the authors adumbrated recent interest in the role of particular cities in contemporary transnational center–periphery relationships. ‘World cities,’ such as New York, London, or Paris, may be seen as generalized, multipurpose centers, combining various kinds of power and drawing the attention of the periphery for many different reasons (Hannerz 1993, Knox and Taylor 1995). The contemporary global structure of center–periphery relationships, however, can be understood as more internally varied. Cities may be centers in particular ways to particular people, dispersed in a transnational periphery: Rome to the Catholic world; San Francisco to gay people not only from elsewhere in the United States, but also from other continents; Memphis, Tennessee, to friends of country music. As we see, attachments of very diverse kinds are involved. This means that the same people in the periphery may well be involved with several, noncompeting centers for different purposes. Yet centers of the same type can also compete in the periphery.

How do centers become centers, even drawing attention across national boundaries? Some are centers for mainly historical or mythical reasons, because something occurred there once in the past, or occurred there first, or because the memory of a charismatic figure is preserved there. They belong on the map of the significant past, or of eternal truths. Often they are intensified as centers at particular points in a ritual calendar. Many ‘centers of pilgrimage’ have such
characteristics. Other centers are very much of the present. They draw attention and stimulate imagination from afar because they are ‘where the action is’ now, with regard to one or more lines of human preoccupation. The contemporary proliferation of center–periphery relationships of transnational reach is undoubtedly related to a greater ease of transportation and communication, making centers easy to get to, and easy to stay in touch with. As the participants, practitioners, adherents, activists, and employees of more subcultures, lifestyles, ideologies, occupations, or corporations use new means to extend their circles out of local or national habitats, new faraway centers are discovered, or even made. As more diasporas are generated by more migrants, old centers may also be reaffirmed, from further away.

2. Center–Periphery Diffusion

In part, it may be in the passage of culture from center to periphery that the two parties to the relationship manifest themselves. In the clearest case, the center would be always the donor, the periphery always the recipient: the center is active, the periphery is passive. Yet even if the imbalance is not quite so great, some measure of net cultural export would appear to be one conception of center–periphery influence. What is prominently involved here, consequently, is diffusion.

In a wider arena of debate about the effects of global interconnectedness, since the mid-twentieth century, the preoccupation with cultural diffusion from center to periphery has often been expressed in one or the other of two ways. Post-World War II modernization theory offered a forecast of where the world was heading: where the West actually already was, more or less, and where the rest would follow, with some assistance from the center. When not drily ‘value-free,’ modernization theory tended to take a positive view of the changes in question. The other major way of describing cultural diffusion on a global scale was less favorably inclined. The term ‘cultural imperialism,’ as Tomlinson (1991, 3ff) has pointed out in a review, is a combination of two complicated concepts into one, with a considerable ideological charge. It has tended to refer to a growing dominance of western, and especially American, media and consumer goods in other parts of the world. Coca-Cola, McDonald’s, and Barbie dolls have become the predictable pieces of evidence in a genre of cultural critique. Both the early form of modernization theory and the critique of cultural imperialism thus tend to offer global homogenization scenarios; implying, forecasting, or warning of the end of cultural diversity.

Such scenarios, however, have been increasingly often contested. There has been a growing attention to the multicentricity of culture, to crisscrossing cultural flows, and not least to instances of cultural counterflow, from periphery to center. While some net asymmetry of flow may yet be undeniable, many commentators have also become less inclined to accept the assumption of passive reception at the periphery. They are much more likely to see an active periphery engaged in managing diffusion from the center: the periphery accepts this, rejects that, modifies one thing, and synthesizes something else with items from its own local cultural inventory. Metaphors of creolization and hybridity summarize this combination of cultural diffusion with creativity at the periphery.

3. Mixed Feelings

It is in the nature of their asymmetrical relationship that center and periphery take different views of it, and may even be differently aware of it. A periphery has to attend to its center. The center for its part may at times be preoccupied with its internal affairs, and may not have a clear grasp of the more widely dispersed consequences of its actions. The periphery tends to have a more developed idea of the center than vice versa.

Emotions may be attached to such linkages. Center–periphery relationships often do not leave people cold, passionless, neutral. In the consensualist view, centers generate warm, deferential attitudes at the peripheries. In a conflict perspective, in contrast, center and periphery are primarily defined by political and economic structures, and we may then see their cultural concomitants at the periphery in the responses to control, coercion, and constraint. Along such lines, the many forms of cultural resistance have been a major theme in the portrayal of center–periphery relationships. The periphery may defend itself symbolically against intrusions through a celebration of the self and its local and traditional roots, but there may also be a denigration of the other, an uncomplimentary representation of the center.

Yet the responses of the periphery to the center are not always clear-cut. For one thing, claims to center–periphery consensus now increasingly meet with skepticism. With Gramsci as a source of theoretical inspiration, observers may interpret deference in the more complicated terms of hegemony. It is likewise possible, however, that people at the periphery may be genuinely of two minds about a center. If ambivalence is a prevalent quality of social life, as Smelser (1998) has argued, perhaps this is even as common a response to a center as any more one-sidedly favorable or unfavorable stance. It may be a place one loves to hate and hates to love. One may feel that good things come from there, and at the same time resent its influence, or what may seem to be its narcissism. Especially those multipurpose centers understood to be ‘where the action is,’ moreover, may inspire sentiments which are positive, but instrumental or playful rather than deferential. They are among the liminal spaces of contemporary life.

All views of the center, however, and all messy feelings toward it, are not necessarily to be found in
any single inhabitant of the periphery. They may rather be complicatedly distributed among its population, and such variations can be the foci of intense debates and conflicts structuring local life at the periphery.

4. The End of Center–Periphery Relationships?

With some frequency, one now hears voices criticizing center–periphery concepts, or proposing the decline of center–periphery structures. It is said that the world economy is increasingly decentered (see, e.g., Lash and Urry 1994, p. 4). Human life, it is also argued, is increasingly deterritorialized. People move quickly between places, being rooted in none of them, and have relationships which belong in no place in particular. ‘Critical masses’ for cultural elaboration may be built up in cyberspace, and do not require local face-to-face contacts. As would-be centers and would-be peripheries are closely in touch, by way of electronic media or jumbo jets, the ‘cultural lag’ which divided them before is no longer there.

Even if they are often closer to futuristics than to the really existing world, there may be something to such arguments. At times, however, both the concepts criticized and the alternatives to them would need to be elaborated more precisely. Indeed we must be sensitive to the ambiguity of our terms, and to the actual range of variations out there. Asymmetries may indeed only be relative, culture flows are not entirely one-way, the world is crisscrossed by more symmetrical relationships as well, and there are many centers, and many kinds of centers, rather than only one. Centers also rise and decline. Center–periphery conceptualizations now deserve to be scrutinized and developed, rather than merely rejected. It is notable, too, that the end of centers and peripheries appears often to be proclaimed by commentators whose own vantage point is at the center, in this case not a privileged position.

See also: Dependency Theory; Globalization and World Culture; Globalization, Anthropology of; Globalization: Political Aspects; Hegemony: Anthropological Aspects; Hegemony: Cultural; Pilgrimage; World Systems Theory

Bibliography

Cohn B S 1987 An Anthropologist among the Historians and Other Essays. Oxford University Press, Delhi
Frank A G 1967 Capitalism and Underdevelopment in Latin America. Monthly Review Press, New York
Greenfeld L, Martin M (eds.) 1988 Center: Ideas and Institutions. University of Chicago Press, Chicago
Hannerz U 1993 The cultural role of world cities. In: Cohen A P, Fukui K (eds.) Humanising the City? Edinburgh University Press, Edinburgh, UK
Knox P L, Taylor P J (eds.) 1995 World Cities in a World-system. Cambridge University Press, Cambridge, UK
Lash S, Urry J 1994 Economies of Signs and Space. Sage, London
Redfield R, Singer M 1954 The cultural role of cities. Economic Development and Cultural Change 3: 53–73
Shils E 1972 The Intellectuals and the Powers. University of Chicago Press, Chicago
Shils E 1975 Center and Periphery. University of Chicago Press, Chicago
Smelser N J 1998 The rational and the ambivalent in the social sciences. American Sociological Review 63: 1–15
Tomlinson J 1991 Cultural Imperialism. Pinter, London
Wallerstein I 1974 The Modern World-System. Academic Press, New York

U. Hannerz

Centers for Advanced Study: International/Interdisciplinary

In the current period, scores of institutions of scholarly learning around the world include the words ‘advanced study,’ or some approximate translation thereof, in their formal titles. Others that might appropriately use such a title do not, often for simple historical reasons. Moreover, some institutions using the term have little in common, in structure or function, with the leading examples of such organizations. Therefore it is useful to review the main goals that have come to be associated with ‘advanced study,’ before considering the variety of institutional designs used to address them.

1. The Essence of ‘Advanced Study’

The phrase ‘advanced study’ is hardly a technical term. Its superficial meaning is obvious, and for this reason its original coinage is not easy to trace for the English-speaking world. But its popularity and more restrictive meaning among scholars is quite surely a phenomenon of the twentieth century, springing directly from the creation and rapid success of the Institute for Advanced Study at Princeton in the United States, officially incorporated in 1930. The original design of this Institute was in turn rather clearly the innovation of Abraham Flexner. By the 1920s, Flexner had become a noted student and critic of the American system of higher education through a series of publications including the Flexner Report, a sharp critique of the nation’s medical schools. Flexner was later approached by the Bamberger family, who wished to endow a new medical college in New Jersey, to serve as an agent in this enterprise. Flexner soon came to recommend the endowment not of a medical college but of a new kind of university. His vision was of an institution without undergraduate students, in
which advanced graduate students and top-flight professors would form a partnership of scholars who would ‘be left to pursue their own ends in their own ways … in tranquillity …’ As this vision crystallized, Flexner decided it should be called ‘an Institute of Higher Learning or Advanced Studies.’

The phrase was launched into a realm of high visibility. It helped that one of the first recruits was Albert Einstein, soon surrounded by world-class mathematicians, physicists, and cosmologists who could engage in fruitful dialogues with him.

Flexner’s conception was deliberately designed as a counterweight to the conventional university. Universities, at least in America, are the long-acknowledged home for basic research and the pursuit of increasingly complex topics in the advancement of knowledge. Flexner felt that a center for advanced study, dedicated to roughly the same ends, should be different. Some said that Flexner’s ‘innovation’ was a retreat to the simplicity of Plato’s Academy. But he was attempting to address issues surrounding the interplay between the advancement of knowledge and the standard forms of modern university life.

While ‘advanced study’ might seem to involve the examination of extremely complex or intricate topics, a more essential definition is research at the most remote frontiers of knowledge, as they stand at any point in time. The core of well-established knowledge is for the most part constantly expanding. Institutions of education, including higher education, are mainly dedicated to the transmission of such bodies of ‘received wisdom.’ This transmission presumes at least a two-tier system of actors, being the masters who know and the young who are learning.

The frontier itself is a fuzzy region, which begins at the far edge of consolidated knowledge where doubt first rises as to the shape and meaning of what lies beyond. As one proceeds outward, fragments of knowledge become increasingly sparse, and general uncertainty increases rapidly. As with most poorly explored zones, there is no dearth of conjecture as to what lies there, but controversy is high. There is also confidence that further exploration can in due time produce enlightenment, and hence true additions to consolidated knowledge. To explore effectively it is usually helpful to have mastered established knowledge in the terrain closest to that edge of the frontier. But once past that prerequisite (and the more advanced of graduate students have achieved that state) everyone can be an explorer at the same level. Hence Flexner’s collegium of scholars was realistic, along with his welcome proviso that such a group be as liberated as possible from the routine demands of daily life, including most of the bureaucratic and teaching activities required of the typical university professor.

Conventional universities have long been organized into specific disciplines such as physics, biology, or economics. These disciplines have their own regional frontiers, and are expert at sending out fingers of inquiry which in disciplinary terms illuminate previously unknown terrain. But deeper and richer probes that consolidate knowledge of larger areas, such as linking up several neighboring peninsulas to one another, are often best carried out by teams from multiple disciplines. This has, of course, become an increasing necessity, as the bodies of already-established knowledge within given disciplines have become so large that mastery of even the local ‘consolidated knowledge’ of a single narrow discipline consumes much of the education process. Therefore most centers of advanced study are more or less deliberately interdisciplinary, and even take steps to ensure that resident scholars do not segregate themselves by discipline, but are exposed to enriched stimulation from broader cross-disciplinary discussions. Scholars who have spent long periods of study and teaching in conventional universities before encountering this novel environment testify that they have sorely missed such broad-spectrum intellectual interaction ever since they first devoted themselves to a single discipline by entering graduate schools. Moreover, scientific knowledge about the natural world, as well as the symbolic world of mathematics, is notably cosmopolitan. Therefore diversity in national origins is also sought by most centers for advanced studies, given the possibility of relevant genius in any part of the world, and the impress of cultural perspectives on knowledge.

2. Illustrative Institutional Designs

The Institute for Advanced Study at Princeton is not only the original model for advanced study centers, but is probably larger in endowment and size, as well as structural complexity, than any other. By now, it has a permanent faculty of nearly two dozen, spread over four distinct schools. These faculties review applications from further scholars to be designated as Members, who reside at the Institute for periods ranging from a single term to several years in occasional cases. Such Members normally propose research projects that fit the specialized interests of scholars with permanent faculty appointments, although some Members are encouraged to work in areas which look profitable to the current faculty, but which are under-represented in the relevant school. Other scholarly visitors, in addition to those selected as Members, are also hosted for shorter periods where mutually fruitful. In all, some 200 scholars spend significant periods of time at the Institute in any given year.

The Princeton Institute began with a special focus on mathematics and classical studies, but has broadened considerably over time. The School of Mathematics continues to be the largest collegium of scholars. The School of Natural Sciences is substantial in size as well, although it has a heavy emphasis on
astrophysics and particle physics. The School of Historical Studies continues to reflect its original dedication to ancient history and classical studies, but now enjoys significant representation in medieval and modern history. The School of Social Science was not added until the early 1970s, and remains the smallest of these units. In addition to the major social-science disciplines which have core representation, the School hosts Members and visitors in literature, philosophy, and even art history.

In 1954 the Center for Advanced Study in Behavioral Sciences, with the help of the Ford Foundation, opened its doors on land of Stanford University in Palo Alto, California, although it was administratively independent of that institution. It was inspired by the generic example of the Princeton Institute, but was also motivated as a response to the scant attention being given to the social and behavioral sciences in the Princeton Institute’s program at the time. Although its broad goals are parallel to those at Princeton, its mode of operation differs in a number of particulars. Most notably, it has no permanent faculty appointments. Instead, a small secretariat invites nearly four dozen scholars as residential Fellows each year. Scholars are given office space and various forms of computer and research support, and are encouraged to pursue whatever reflections, research, and writing they wish. Various methods are employed to encourage informal intellectual communication between the disciplines present.

The disciplinary scope of these fellowships for the Center at Stanford is extremely broad. To be sure, the five primary social sciences (anthropology, economics, political science, psychology, and sociology) make up a majority of each cohort, especially as augmented by, typically, an ample representation of historians. But leaders in research in other diverse social science specialties, such as education, linguistics, statistics, geography, law, and philosophy appear with fair regularity from cohort to cohort. In addition, there has been on one hand routine representation from the boundaries in the form of literary studies or, occasionally, well-known authors of fiction, and on the other hand representatives of the natural sciences, including ornithologists, biomedical researchers, Nobel Laureates in biology, and even a chemist working on the social communication of insects. While the normal fellowship gives free rein to individual researchers, the Center at Stanford dedicates some fraction of slots within each cohort to ‘special projects’ prearranged for work on scientific or policy problems that require sustained multidisciplinary attention. On average there are about two such projects per year, […]

[…] Humanities and Social Sciences was founded through the efforts of an eminent Dutch linguist, E. M. Uhlenbeck, following his experience as a Fellow of the Stanford Center. Other institutions have been established in Europe with attention to the same model, such as the Institute for Advanced Study (Wissenschaftskolleg) in Berlin (1980), and the Swedish Collegium for Advanced Study in the Social Sciences (1985) at Uppsala. The National Humanities Center in North Carolina was also promoted largely by former Fellows of the Center at Stanford, and again modeled closely after it.

There are local adaptations in form across these institutions. For example, the Dutch Institute requires that half of its Fellows be from the Netherlands. The Berlin Institute hosts a few permanent Fellows as well as annual residential ones. The prominence of group projects or annual ‘themes’ varies from institution to institution. None of these centers, however, is designed to train graduate students or offer advanced degrees; instead, they are dedicated to promoting research of great distinction. They also enjoy a high degree of autonomy from specific universities as well as state or federal governments, although they may receive funding from, and engage in relationships with, such agencies.

In addition to the growing number of such large-scale centers, numerous smaller intrauniversity units with similar goals have emerged. While advanced study centers will always be vastly outnumbered by universities and other training institutions, the existence and popularity of these centers reflect an increasing awareness that the Flexner ideal for a collegium of research scholars is, under modern conditions, an indispensable part of the infrastructure desirable for the healthy advancement of knowledge.

See also: International Research: Programs and Databases; Specialization and Recombination of Specialties in the Social Sciences; Think Tanks

Bibliography

Flexner A 1960 An Autobiography. Simon and Schuster, New York

P. E. Converse

Central Africa: Sociocultural Aspects
participated in by a fifth to a tenth of the Fellows in
residence. Most authors define Central Africa as the vast area
The general model for the Center at Stanford has comprising speakers of the different western branches
been adopted by newer advanced study institutions of of Bantu. The great diversity of western Bantu
interest to social and behavioral scientists. In 1970 the populations and traditions have gradually extended,
Netherlands Institute for Advanced Study in the beginning 5,000 years ago, from southern Cameroon,


into Equatorial Guinea, Gabon, People’s Republic of Congo, the Democratic Republic of Congo, and Angola. Side by side with Ubangi and Nilotic speakers, they have moreover spread across the Central African Republic, southern Sudan, Western Uganda, Rwanda, and Burundi, to Eastern Congo, Tanzania, and Northwestern Zambia. Whereas the eastern Bantu have specialized in grain growing, the western Bantu adopted cassava and yams as staple foods. The region has been the scene of complex interactions between the local and outer worlds.

1. AD 1000 to 1880

Probably due to its sparse population, Central Africa did not develop the type of domestic markets and specialized local industries found in West Africa. Until the late fifteenth century, it stayed out of the lines of long-distance communication such as those that had developed from coastal East Africa or along the caravan routes of West Africa. The history of precolonial Central Africa from AD 1000 can be divided broadly into two main regional developments. With regard to the northern half of Central Africa, comprising the great equatorial forest with its northern woodland fringe and the Congo basin, linguistic and ethnographic data, the common ‘words and things’ (Vansina 1990), bear witness to the interaction between the farming and pastoral peoples in the forest and savanna areas north and south of the equator. The most important hunter-gatherer groups of Northern Central Africa were the matrilineal Pygmies, interacting with their agricultural neighbors and long-distance trading groups (like the salt-making Vili of Loango, and the neighboring copper-trading Tio). The Zaire river and its tributaries have been an important corridor for the transmission of cultural influences and systems of government, as well as for the trading economies of the Atlantic zone. The farming in the thinly scattered patrilineal societies of the northern and eastern corner of the rain forest remained for long untouched by the food plants originating from South America and already imported into other parts of Central Africa.

In the central part of the equatorial forest, the Mongo patrilineal societies were strongly influenced by the neighboring agriculturalists to the northwest along the Ubangi and higher up in the Sudan. Mongo Big Men, often combining agriculture and trade, tended to privilege their patrilocal household and patriclan rather than the village community as an economic stronghold and political unit, also by way of matrimonial transactions and cultic and initiatory specializations. By the sixteenth century, long-distance exchange economies of textiles and firearms for ivory and local captives may have connected the equatorial forest even with Egypt.

The savanna regions in the southern and western half of Central Africa saw the emergence of important political leadership roles, including kingship. Depending on the rainfall, the wooded savanna was moderately suitable for agriculture and/or pastoralism. Priests were the guardians of all reproductive resources. The extended family was the focus of physical and spiritual well-being of both living and deceased members of the patrilineage. Lineage chiefs gained control over the rain-making cults and the territorial custody of hunting traditions and the collection of tribute. From the fourteenth century, the chiefs gained regional control over iron and smithery, and particularly from the seventeenth century, over firearms and market-oriented commerce. This, as well as their indirect involvement in the Atlantic slave trade, made their power paramount over large areas in central Angola and western Congo.

2. Challenges from the Atlantic and Colonizing Europe

Unlike West and East Africa, Central Africa had not entered the maritime trade prior to the arrival of the Portuguese caravels seeking trading bases, from the 1480s onwards. Reports by Portuguese missionaries and merchants, as well as occasional correspondence by western-educated Kongo, show how the foreign influence brought radical changes to the Kongo kingdom and culture. The kingdom then comprised prosperous farming communities linked to a regional trade of fish and craft goods. From this period, the nzimbu shells in the hands of Europeans took on the functions of coinage for the previously unfamiliar purposes of wage-payment and marketing.

Portuguese traders carried Mediterranean manufactures to Kongo in exchange for raffia cloth, ivory, dye wood, and copper. The redistribution of these manufactures (such as North African textiles, iron knives, glass mirrors, glass schnapps bottles, glazed bowls, Venetian beads, and glazed china) was controlled carefully by the royal court. Moreover, Portuguese teachers, artisans, lawyers, and priests enhanced the authority of the king and his closest supporters. A number of Kongo were taken to Europe for further education. From the sixteenth century the history of the southern savanna became tragically bound up with the growth of the Atlantic slave trade. Manpower, rather than landed property as in early modern Europe, was the key to value in the Central African communities. Slave-trading of servile subjects and captives gained via long-distance marketing at the Malebo Pool (where Kinshasa would later develop) became the grim solution to local needs for foreign goods. Like any slave-trading society, the Kongo kingdoms and other neighboring equatorial kingdoms became oppressive and fractious, and doomed to collapse.


The Atlantic trade gradually came under the control of the semi-colonial Luso–African community of the Loanda hinterland, Africa’s first White colony. The long-distance savanna trade-routes probably became avenues for the dissemination of European goods and influence, as well as of the South American crops: maize, cassava, tobacco, tomatoes. Soon, frontier states such as Matamba and Kasanje as well as a new class of trading entrepreneurs took control over trade relations with the European powers. Following the industrial revolution in Europe, new goods were being shipped to Southern Cameroon, Gabon, and the Lower Congo. Towards the end of the eighteenth century, with the involvement of French, English and Dutch merchants, the Atlantic slave trade reached its peak.

Swahili-Arab penetration in Northeastern Zambia and Southeastern Congo in the early nineteenth century started as trade of guns and powder for ivory and slaves. It would soon overthrow the local chiefs and gain independent political power for itself.

Following Livingstone’s and Stanley’s ‘explorations,’ the 1885 Berlin Conference legitimized Europe’s colonial scramble for Central Africa. The conquest of Equatorial Africa would last until the 1920s, by which time military power, forced labor, harsh rubber exploitation, diseases, and hunger had decimated half of the population and broken local resistance. Colonial rule gradually deprived local societies of the institutional ability to confront the exploitative forces and exogenous ‘civilizing programs’ imposed upon them. In the view of the colonizing masters, the bringing about of philanthropic medical provision and widespread school education in the colonies was an essential feature in the international justification of political and economic colonialism. The colony’s extractive economy directed at the ‘so far unutilized resources’ and relying on mandatory labor in lieu of tax payment, as well as its transport and communication technology, were to open up the remote areas to the larger world scene. They were to ‘uplift’ and integrate the local populations in the new era of ‘universal civilization and progress.’

Bantu traditional rule was deeply alien to the colony’s display of order. Colonial state power was imposed through regulations, enticements, and direct interference, as both a text and texture, absorbing and domesticating people and events by the very acts of writing and administrative ordering. The textual economy disassembled and inserted local realities into panoptical recording and regulation. The management of people’s civil identity and geographic confinement was achieved through the identity cards and mandatory ‘passbooks.’ Records of marriage, descent, settlement, and land use were meant to bind the populations geographically, and were authoritative for the succession to chiefly titles and for settling matrimonial and family disputes. On the other hand, such regulatory documents were intended to modernize society by freeing the so-called évolués from customary collectivism. Population censuses, together with geographic and linguistic-ethnographic mapping and recording of data, allowed for a compartmentalization of languages and ethnic groups.

Unlike the nineteenth-century bureaucratic nation-state in the West, the Bantu political traditions of Central Africa do not draw their inspiration from orders of visual representation and architectonic spatial models. They are moulded by organic, hydraulic, and/or animal-totemic metaphors informing political networks and strategies as an order of events, forces, sources, and relations. Membership of and alliances between particular social groups are not primarily tied to a geographic partitioning but to blood ties and to the mythical or primal space-time order in line with the constant cosmogenetic re-enactment of the reproductive and hierarchical weave between the founding ancestors, their foundational and migratory exploits, and their multiple descendants.

The functions of traditional political title holders were, and are, thought of as prior to, and the source of, all life forms as well as the guarantee of their order. A chief represents and surpasses his subordinates by his twofold function. First, like his totemic animals (leopard, eagle, crocodile), he is a conqueror. Through enthronement, the ruler embodies the founding ancestors and represents the primal space-time order initiated by the immigration of his ancestral people and their conquest of the land. Through his body, in particular his clairvoyance and the nightly forces which he shares with his totemic animals, the ruler impersonates the founding ancestor. He imposes the qualities of a perennial hierarchical social organization, territorial unity, and moral order on his society. Second, the chief acts as the androgynous life-giver or mediator of the (re)generative processes between the land, society, man, and the ‘primal womb’ in the earth. His rule is thus one of cosmic and physical regeneration, guaranteeing human, agricultural and social reproduction, and instituting commensality and sharing in his territory. In this diurnal ordering role, the chief protects his people from the nocturnal anti-rule of envy, theft, sexual abuse and sorcery. The elders, through councils, ceremonial exchange, and authoritative speech, extend the regenerative capacity into the daily weave of events and relations uniting kin groups under common rule.

In the Christian mission stations or in the plantation, mining, and industrial enterprises, ever larger numbers of individuals and families joined the workers’ camps and ‘indigenous townships,’ set up outside customary rule, on the fringes of White settlements. They thus created new ideological and physical spaces of identity and collective imagination. Indeed, from the last decades of the nineteenth century, Christian congregations set out to work side by side with the


concessionary companies and the colonial administration. Their aim was to ‘bring civilization and universal salvation to the Dark Continent,’ as well as to ‘save the individual’s soul’ through the removal of ‘pagan African customs.’ While the first decades of the missionary endeavor were directed at the adaptation of Christianity to local society and culture, from the 1930s on, missions became increasingly involved in the social engineering of the townships, echoing the dominant mood of the plantation or mining trusts and the colonial administration. From then on, church authorities designed paternalist programs to educate converts towards assimilating modern skills and dispositions. By the 1950s, the term modernization had become a new banner for the assimilationist option.

In the Belgian Congo, the colonial government had secured the agreement of the missions to manage the schools in exchange for land concessions. Nearly all mission stations developed boarding schools with enough fields, cattle, poultry, and workshops to be self-sufficient. The colonial school presented Western progress as the source of a more dignified identity on the world scene. Vinck (1995) convincingly shows the extent to which, from the 1920s in the Belgian Congo, the school books conveyed the primary symbols and values of the West. By encouraging the pupils to develop a new identity and self-image, these manuals influenced the entire political élite-to-be in post-independence Congo. The concept of State, a key notion in the school books, was the comprehensive and abstract, depersonalized expression of the new power and authority which came to dominate the various spheres of life. According to the manuals, all Whites shared in the power of the State and thus were to be considered as authorities. Colonial school books depicted traditional society and religion as dominated by sorcery and the machinations of the devil.

3. Endogenization

From the 1950s, among the second generation of those who had assimilated the European civilizational capital, frustration grew at the glaring contradiction between the promises which the colonial master had put before them and their second-rank position in the colonial institutions. A few Christian priests or pastors and élite became imbued with the Third World emancipation movement enhanced by the rapid decolonization in Asia, the 1955 Afro–Asian Conference in Bandung, the independence struggle in Algeria, the Négritude call for the rehabilitation of African cultures, and the Pan-Africanism radiating from Ghana and Guinea.

The militant élite used a triple banner in the battle for decolonization. First, in the terms of the Christian, humanist, and/or socialist discourses of the colonizers, they claimed the right to dignity, social emancipation, and equality. Second, their political aspirations were largely drawing on European ideological discourse associating modern economic development with gradual political emancipation. They thereby became complicit with the modernizing and authoritarian endeavor of the White welfare- and nation-state, however alien to people at grassroots level. Third, in the terms of western Bantu traditions, they favored a palaver model of negotiation, ceaselessly co-opting ‘brothers’ and ‘doing things with words.’

In the 1970s, authenticity movements in various states prompted a radical shift away from assimilation of the White civilizational models, now urging people to take possession in a militant way of their own history and a dignified self-representation on the world scene. Throughout the 1970s and 1980s, artists and customers in the discos celebrated a liberation from paternalist and moralizing colonization. Today, the people in both the rural and urban milieux have become aware of how much their lifestyles, modes of production, and environments are both in rupture with their parental models, and simultaneously excluded from the current economic and informational globalization of western consumerist lifestyles. The masses in the poverty-stricken suburbs have been haunted, through television and downtown scenes, by the imageries of ease and extravagance that the transcontinental mass media as well as the few very rich nationals exhibit in fine clothing, expensive cars, and luxury goods. Many are increasingly bitter about their exclusion. However, they do not fail to develop their own proud visions of society and the way things are. In the early 1990s, through the waves of demonstrations and Dead-Town protest manifestations or even uprisings, suburban people counteracted the governmental views on modernization and the neocolonial politics. Irony and parody allowed the populace to deconstruct imperialist twentieth-century modernity.

Suburbanites straddle worlds through hybridity or the imaginary transgression of codes, in particular in the utopian fields of humor, daydream, and glottophagia. These are gradually furthering a cultural critique of the postcolonial situation: rather than fighting one another as ill-fortuned or discontented consumers of modern cash goods (increasingly brought in second-hand from the North), people in suburban neighborhoods and rural communities are re-exploring their genuine sense of communality, and their collective memory stored in body techniques and sensuous culture. Hybridity in many cultural expressions blurs the tradition/modernity, Bantu/western, precapitalist/capitalist oppositions that have been created by the Europeanizing tropes. Songs on the radio today in vernacular languages and along old rhythms, recalling both the collective frenzy and euphoria in the bars of the 1970s and 1980s and village festivities or rituals, are now unsettling the Reformist voices of the (post)colony which had connected city life with French or English speech and étiquette, with


school education, and the petty bourgeois life style, and with the well-equipped biomedical services. The pidginization or creolization of French or English colloquial language, and of narrative styles in songs and newspapers, vitiate modernist master-tales about the literate African city dweller and the retrograde and illiterate villager. Thousands of independent prophetic healing churches of the holy spirit, through exorcizing materialist greed and westernization of public mores, or even State politics identified with the work of sataani (the local term for the Christian-imported notion of Satan, referring to those engaged in illegal practices and sorcery), develop a forceful critique of the catastrophic collusion between economic modernization and people’s sociocultural dispossession. At the same time, through collective trances and ‘Christian’ forms of telepathy and clairvoyance in the name of the holy spirit, the prophetic churches explore ways towards domesticating modernizing forces. During the ceremonial offering of money to the prophet and assistant assembly leaders, adepts may circle more than a full hour around the congregation, dancing and singing as if to re-enchant one of modernity’s most striking secularizations, anonymous cash trade. The offering aims to transform the subaltern’s poverty and needs into what is defined as divine grace or healing. The prophetical churches thereby speak back to global capitalism and celebrate their participation in a new economy. They subject the monetary economy to the utopian ideals of equality and solidarity proper to brotherhood and sisterhood.

See also: African Studies: Culture; African Studies: Politics; Colonization and Colonialism, History of; Fourth World; Historiography and Historical Thought: Sub-Saharan Africa; Near Middle East/North African Studies: Economics; Near Middle East/North African Studies: Society and History

Bibliography

Andersson E 1958 Messianic popular movements in the Lower Congo. Almqvist and Wiksells, Uppsala, Sweden
Birmingham D 1981 Central Africa to 1870: Zambezia, Zaïre and the South Atlantic. Cambridge University Press, Cambridge, UK
Birmingham D, Martin P M (eds.) 1983 History of Central Africa, 2 Vols. Longman, London
Boahen A (ed.) 1990 General History of Africa: VII Africa under Colonial Domination 1880–1935. Abridged edn. Unesco, Paris and Currey, London
Cairns H 1965 Prelude to Imperialism: British Reactions to Central African Society 1840–1890. Routledge, London
De Boeck F 1996 Postcolonialism, power and identity: Local and global perspectives in Zaire. In: Werbner R, Ranger T (eds.) Postcolonial Identities in Africa. Zed, London, pp. 75–106
Devisch R 1995 Frenzy, violence, and ethical renewal in Kinshasa. Public Culture 7: 593–629
Devisch R 1996 Pillaging Jesus: Healing churches and the villagisation of Kinshasa. Africa 66: 555–86
Harms R 1981 River of Wealth and Sorrow: The Central Zaire Basin in the Era of Slave and Ivory Trade, 1500–1891. Yale University Press, New Haven, CT
Hunt N R 1999 A Colonial Lexicon of Birth Ritual, Medicalization and Mobility in the Congo. Duke University Press, Durham, NC
Mazrui A A, Wondji C (eds.) 1993 General History of Africa: VIII Africa since 1935. Heinemann, London
Mbembe A 1992 Provisional notes on the postcolony. Africa 62: 3–37
Mudimbe V Y 1988 The Invention of Africa: Gnosis, Philosophy, and the Order of Knowledge. Indiana University Press, Bloomington, IN
Vansina J 1966 Kingdoms of the Savanna: A History of the Central African States until European Occupation. University of Wisconsin Press, Madison, WI
Vansina J 1990 Paths in the Rainforests. Currey, London
Vaughan M 1991 Curing their Ills: Colonial Power and African Illness. Polity Press, Cambridge, UK
Vinck H 1995 The influence of colonial ideology on schoolbooks in the Belgian Congo. Paedagogica Historica 31: 355–405

R. Devisch

Central America: Sociocultural Aspects

Most scholars know that a great deal of violence took place in Central America in the late twentieth century, but relatively few know the reasons for it, what its consequences were, or how it affected research on the region. The immediate causes of the violence were revolutionary attempts from the 1960s through the 1980s which led to brutal military reprisals, mainly against civilians, in Guatemala, Nicaragua, and El Salvador. The violence had an enormous economic impact on all five countries of the region (which also includes Honduras and Costa Rica), leaving already poor people much poorer. Civilians suffered the greatest casualties and many of the survivors left the region permanently, while those who remained live in heavily militarized states where ‘low-intensity’ conflict continues. Yet to everyone’s surprise social mobilization and protest by a wide variety of disadvantaged groups also continue. The vast social changes engendered by these phenomena have encouraged a new kind of social science research, such that most studies now deal with regional patterns and transnational flows, history, large institutions such as the state and military, and the nature of ongoing as well as past social movements. After a brief summary of what happened in the five countries of the region, this article will discuss recent comparative and historical research projects and will then treat the way in which ethnography changed and developed to deal with the questions raised by this period of history.


1. The Central American Revolutions

Revolutionary movements began in Guatemala, El Salvador, and Nicaragua following the 1959 Cuban revolution, which inspired them. The Central American insurgencies resembled those that took place in much of Latin America, although unlike the others, in the 1970s they became significantly more violent and threatening to their national regimes, which were in turn much weaker. Small guerrilla fronts led by relatively well-educated men (most of them members of the middle or elite classes) attempted to organize peasants or rural workers in most countries of Latin America in order to seize state power, whether or not locally inspired grassroots movements were already in place. The early guerrillas, whose basic strategy was focismo (Wickham-Crowley 1992), worked largely in regions where they assumed popular support could be generated to overturn the old social order for a new one, but did little to describe their actual sociopolitical projects to the masses. The focistas operating in the 1960s suffered ignominious defeats, but became much more successful when they renewed their efforts under changed leadership in the 1970s. Several different guerrilla groups developed in each country, differing on strategy (insurrectionary versus prolonged popular war), revolutionary subject (workers, peasants, or middle groups), and international line (Russian, Chinese, or Third World). The actual number of guerrillas in Central America was never very large (from 2,000 to 10,000), but they were supported by most of the rural population in the areas where they worked. The military forces in Central America were of variable size and sophistication (greatest in Guatemala, least in Nicaragua) but always outnumbered the guerrillas by at least ten to one. By 1980 they had developed into formidable counterinsurgency powers (everywhere but in Costa Rica), being provided with arms, advisors, and tactical support by the USA (everywhere but in Nicaragua).

Nicaragua’s Sandinistas had the greatest popular support and took state power in 1979, thus becoming the only successful revolution in Latin America besides Cuba. The Sandinistas were defeated in elections ten years later; but most observers agree that the human and material cost of the contra war (Vilas 1995) instigated by the USA explains the defeat. The Guatemalan and Salvadoran revolutionaries came close to taking power shortly after the Nicaraguan victory, and were in a strong enough position to negotiate peace accords with the military governments of their countries in the 1990s. During the civil wars, the violence visited upon the rural civilian population by the military was extremely high and indiscriminate everywhere, but especially so in Guatemala, where Maya Indians were most heavily affected. In fact, both scholars and the UN Commission on Human Rights depicted Guatemala’s military actions as genocidal. Violence has continued in much of the region, but is now mostly carried out by ex-soldiers and other uprooted people, unemployed by the previous cycle of violence. Nonetheless, very significant social movements, especially of indigenous people, are now taking place throughout the region.

Carlos Vilas (1995) provides an excellent general treatment of all of the phenomena described here in fewer than 200 pages. Vilas, an Argentinian sociologist, treats the cases in greater depth than other social scientists because he lived in the region for more than ten years, was in intellectual contact with revolutionary and state sectors in all five countries, and participated in the Nicaraguan attempt to establish a revolutionary social and political regime. Vilas’ earlier (1987) study of the Sandinista revolution in Nicaragua is the best general study of revolution in Central America.

2. Comparative Research

What took place in Central America calls for comparative research, especially on the question of why three of its countries had major revolutionary movements, and two did not. Many such studies have been done, but few go beyond the obvious economic differences among the five countries. Most point to the pattern of export-led economic growth since the 1960s, which impoverished peasants everywhere but in Costa Rica. (The case of Honduras is usually explained by the fact that fewer peasants were displaced there.) The most useful and original comparisons have been done by Robert Williams, an economist who uses sociological, historical, and ethnographic methods in his research. Williams’s first book (1986) observes that cotton and cattle production for export expanded hugely in all five countries (with considerable economic aid from the USA for its own economic reasons) and led to significant dispossession of peasants everywhere, including in Costa Rica. But the five Central American states handled peasant protest quite differently, with both Costa Rica and Honduras carrying out land reform and expanding services while the three other states responded with repression and militarization, which led to war. In his second book, Williams (1994) examines the social and economic factors that led to two different kinds of states in Central America: the three revolutionary countries being controlled by rigid oligarchies, the other two being led by more open political groups. He finds an explanation in the social and political relations created by the coffee-export economy (among the owners, workers, merchants), the first major postcolonial export in the region, which played a critical role in state formation. (Paige’s 1997 study, which is less complete, reaches similar conclusions.)

Timothy Wickham-Crowley (1992) compares where, and to what effect, guerrilla movements arose,


and who participated as both leaders and cadre, as well as the military and regime responses provoked by guerrilla warfare. By treating all of the significant guerrilla-led movements in Latin America since (and including) the Cuban revolution, he provides a useful comparative context for the Central American guerrilla movements of both the 1960s and 1980s. Virtually no in-depth or ethnographic studies of one or more Central American guerrilla groups have been done. Such work may be forthcoming now that the wars are definitively over and the surviving guerrillas have returned to engage in peaceful politics in their countries. The Guatemalan and Salvadoran cases both cry out for research while the revolutionary protagonists are still alive, especially since Wickham-Crowley’s controversial study is not actor based and relies on rather weak secondary materials.

An important six-volume comparative history has been produced (Rivas 1993), most of its contributors Central Americans. The coverage of the civil wars and their aftermath is quite limited and little new theoretical ground is broken in these volumes. What is innovative about this work is the consistent comparative emphasis, unusual for historians, which expands information on all of the countries and identifies the major gaps in knowledge. The economic and political events of the last quarter of the twentieth century clearly led to the comparative focus.

Two other comparative historical and ethnographic works are under way in which the Central and North American scholars are more equal in number. The more traditional project is being done only in Guatemala under the guidance of anthropologist Richard Adams, who has worked in Central America for nearly fifty years. The emphasis of the ethnographic project is on how ethnic relations were affected by the violence that took place in different communities. (In this regard it is little different from the early book edited by Robert Carmack in 1988.) The second project, taking place under the leadership of Jeffrey Gould, Charles Hale, and Dario Euraque, is more innovative. The historians and anthropologists conducting the research often worked in the same sites (some in all Central American countries except Costa Rica) and met frequently to discuss methods and ideas. The project focuses on the nature and meaning of the process of mestizaje (the creation of a single people through cultural and biological mixing, encouraged if not forced by nation-state building) during the past two centuries and into the present. Throughout Latin America mestizaje is assumed to have been a ‘natural’ and noncoercive process that took place mainly in the colonial period and thus had little to do with state building; North and Latin American historians have mostly repeated this myth as fact, despite strong evidence to the contrary. This Central American project takes the myth on with detailed historical and ethnographic case studies, which illuminate a great deal about the general pattern of ethnic, gender, and race relations in the region. An early glimpse into the conclusions of this project can be found in Gould’s (1998) book on mestizaje in Nicaragua. Because of its many novel components (participation of both anthropologists and historians, North and Central American scholars, and group work in interpretation), this project promises to be a landmark study.

3. Ethnographic Research

The social scientists most challenged by the revolutionary situation in Central America have been anthropologists. Anthropologists have mainly worked in particular communities of Mayan Indians in Guatemala. Such work continues, but is impaired by the limited insight into state and subject formation, social movements, and transnational flows that such bounded research can produce. Those who have stayed within community boundaries have dealt mainly with the impact of fear and community divisions on local culture (e.g., Zur 1998); those who initially dealt with the larger issues (e.g., Smith 1990) lost most of the strengths of an ethnographic approach. But in recent years ethnographers have stretched themselves to do quite innovative, multisited ethnographic research.

Jennifer Schirmer’s (1998) research broke new ground by providing an ethnography of military political perception and strategizing in Guatemala, where counterinsurgency tactics were best developed, as well as more violent and politically significant than elsewhere. Schirmer interviewed hundreds of Guatemalan army officers multiple times between 1986 and 1996 about everything from the nature of the guerrilla threat to their views about the proper political role of the military in the state apparatus. Criticized when her research began on the assumption that she could only parrot military views, Schirmer’s published work on Guatemala’s military project is now widely recognized as providing invaluable information on the formation of a military ideology, one that was nativistic (even though strongly supported by the USA) and extremely successful. Especially revelatory is how deliberately the military increased their control over Guatemalan politics, culture, and civil society. The first stage of pacification used strictly military means, including murders of suspects and massacres of civilians; the second stage involved economic and cultural restructuring of the communities most affected by the violence (the indigenous ones) together with reorganization of national party politics and elections; the final stage was reached with the 1994 peace accords, in which the military helped to reconfigure both state and civil society. As one military officer enthused, ‘We [planned] the State in all of its ramifications!’ (Schirmer 1998, p. 235).

The most important of the postwar social movements that have taken place in Guatemala has been


that of the indigenous Maya, who have always been the largest and most exploited group in Guatemala. They had not been politically active on their own specific causes or identity issues, however, until the late 1980s, although they certainly did participate in Guatemala’s revolutionary movement. As it is presently constituted, the Maya movement is a cultural movement led by indigenous intellectuals, who take language, dress, and religious tradition to be the main (and probably safest) issues. But the political ramifications are much broader and a less well-known Maya movement is building on a more popular base, making economic and political issues central (Bastos and Camus 1993). Before the movement, the Maya majority rarely held important political offices, even where they outnumbered Ladinos. Today they play a major electoral role, are in charge of several major public institutions, and have to be taken into the political equation on all national issues. Among other things, the movement challenges North American ethnographers to take up issues of importance to the Maya (Warren 1998). An edited book on racism (Arenas et al. 1999), a major issue to Maya activists but a topic rarely addressed by anthropologists before the 1990s, fits this still small genre. Sieder (1998) edits a discussion of the Maya and other kinds of social movements (women’s, human rights, and so forth).

Only Diane Nelson (1999) has attempted to characterize the Maya movement in relation to the state. This inventive work covers many different public signs of change in the cultural politics of ethnic relations in Guatemala (movies, signs, advertisements, jokes about Rigoberta Menchú) as well as real cultural politics over the past 15 years. Use of such a variety of materials allows Nelson to describe a complex social movement which has been extremely significant for the cultural reconstitution of Guatemala. In truth, Nelson treats state politics in only one chapter and on one issue, thus not producing a true ethnography of the state, but rather an entertaining postmodern pastiche of the multiple issues that surround ethnography in the Central American context, from new social movements, to new public cultures and discourses, to new kinds of reflexive positioning for activist analysts.

Closer to a true ethnography of the state is the as yet unpublished work of Finn Stepputat, a Danish cultural geographer. Over a long period he worked with several distinct indigenous groups, some of them returned refugees, as well as both Indian and Ladino state and military actors (all in one large municipality) on the implications of the violence, new social actors, and political movements of various kinds on indigenous political subjectivity. To summarize his thesis very briefly, he suggests that intensified interaction between indigenous groups and various representatives of the state (e.g., the military, Ladino representatives, and actors in foreign nongovernmental organizations) as well as social and religious leaders of various kinds, has evoked changes in indigenous subjectivity to the point that many now operate in the state rather than apart from the state. His ethnography promises to show how specific state-like operations and powers interact with those in civil society to produce modern citizens/states, a kind of ethnography of the state that will be of general interest to social science.

A very different kind of indigenous movement took place in Nicaragua during its revolution, one spearheaded by the Miskitu of the Atlantic coast, a large, remote, and politically neglected area. This much smaller group of Indians resisted the revolutionary Sandinistas, who wanted to assimilate them into the state. The dialog between the Miskitu and the Sandinistas over several years produced an agreement of autonomy for those living in the Miskitu region, but one whose economic and political features were ambiguous. Charles R. Hale’s (1994) analysis of Miskitu history and identity, together with the more contemporary negotiations between Miskitu of various positions, Sandinistas of various positions, and other ethnic groups in the region, is a rare ethnography of a multifaceted political movement, one that treats culture, social relations, and political consciousness as dynamic historical phenomena. Roger Lancaster (1992) writes a more typical ethnography of urban Mestizos in postrevolutionary Nicaragua, notable for its treatment of race, gender, and sexuality in the context of a ruined economy.

The only ethnography on El Salvador for any time period is that by Binford (1996) on El Mozote, the community that was eradicated by an elite Salvadoran brigade directly under the tutelage of US advisors. Binford tries to bring the dead to life in his ethnography, by piecing together the nature of their lived experiences before their community was destroyed. He also situates the destruction of this community in the complex environment of Salvadoran guerrillas, military, and US support. There are literally no ethnographies written of Honduras and very little social science done on the country, even though Honduras represents an especially interesting Central American case in many respects. It has been divided between a modern, lowland banana enclave owned by the USA and a highland region backward in both economic and political development (in part because of the banana enclave); it was heavily militarized in the 1980s, yet its military governments engaged several times in significant land reform; and the peasants joined no revolutionary movements, even though they struggled for land (often successfully) for decades. The case asks for innovative ethnography on all of these features. Yet only one case of ethnographic research is now being written up (by Sarah England), which looks at another feature of life characteristic of all Central America, transnational migration. Though at least 10 percent of the Central American population are now transnational migrants, the phenomenon has been little


treated by social scientists working in any Central American country. England’s research on the Honduran Garifuna describes the multiple cultural identities taken on by a black, indigenous group, who move between Honduras (where they hold an indigenous identity) and North America (where they are considered black).

The final ethnography to be treated here is Marc Edelman’s (1999) study of ‘peasants against globalization’ in Costa Rica. Costa Rica is the model Central American country, with a small, very homogeneous population, an early democratic government whose social services rival the best in Latin America, and no significant military presence in recent history. While little other innovative social science has been done there recently, Edelman’s study of Costa Rican social movements during the 1980s is a gem, possible only for someone who has worked in a country for more than a decade. Combining a sophisticated political economic analysis of global neo-liberalism with a strong political ethnography of a multitude of peasant organizations which dissolve and regroup repeatedly, Edelman’s book is bound to be controversial because of its opinionated stance on a multitude of issues, including the literature on social movements and globalization. It remains, however, an example of the innovative ethnography being produced in Central America because of the global nature of its problems, the complexity of its social movements, and the peculiarities of its state-level institutions, which operate on a scale small enough to invite national-level ethnography.

See also: Dependency Theory; Revolutions, Sociology of; Revolutions, Theories of; South America: Sociocultural Aspects

Bibliography
Arenas C, Hale C R, Palma G (eds.) 1999 Racismo en Guatemala? Abriendo debate sobre un tema tabu. Facultad Latinoamericana de Ciencias Sociales, Guatemala City, Guatemala
Bastos S, Camus M 1993 Quebrando el silencio: Organizaciones del pueblo maya y sus demandas (1986–1992). Facultad Latinoamericana de Ciencias Sociales, Guatemala City, Guatemala
Binford L 1996 The El Mozote Massacre: Anthropology and Human Rights. University of Arizona Press, Tucson, AZ
Carmack R 1988 Harvest of Violence: The Mayan Indians and the Guatemalan Crisis. University of Oklahoma Press, Norman, OK
Edelman M 1999 Peasants Against Globalization: Rural Social Movements in Costa Rica. Stanford University Press, Stanford, CA
Gould J L 1998 To Die in this Way: Nicaraguan Indians and the Myth of Mestizaje, 1880–1965. Duke University Press, Durham, NC
Hale C R 1994 Resistance and Contradiction: Miskitu Indians and the Nicaraguan State, 1894–1987. Stanford University Press, Stanford, CA
Lancaster R N 1992 Life is Hard: Machismo, Danger, and the Intimacy of Power in Nicaragua. University of California Press, Berkeley, CA
Nelson D 1999 A Finger in the Wound: Body Politics in Quincentennial Guatemala. University of California Press, Berkeley, CA
Paige J M 1997 Coffee and Power: Revolution and the Rise of Democracy in Central America. Harvard University Press, Cambridge, MA
Rivas E Torres (ed.) 1993 Historia General de Centroamerica, Tomos I–VI. Ediciones Siruela, Madrid, Spain
Schirmer J 1998 The Guatemalan Military Project: A Violence Called Democracy. University of Pennsylvania Press, Philadelphia, PA
Sieder R (ed.) 1998 Guatemala After the Peace Accords. Institute of Latin American Studies, London
Smith C A (ed.) 1990 Guatemalans and the State: 1540–1988. University of Texas Press, Austin, TX
Vilas C 1987 Perfiles de la revolucion Sandinista: Liberacion nacional y transformaciones sociales en Centroamerica. Editorial Legasa, Madrid, Spain
Vilas C 1995 Between Earthquakes and Volcanoes: Market, State, and the Revolutions in Central America. Monthly Review Press, New York
Warren K B 1998 Indigenous Movements and their Critics: Pan-Maya Activism in Guatemala. Princeton University Press, Princeton, NJ
Wickham-Crowley T P 1992 Guerrillas and Revolution in Latin America: A Comparative Study of Insurgents and Regimes Since 1956. Princeton University Press, Princeton, NJ
Williams R G 1986 Export Agriculture and the Crisis in Central America. University of North Carolina Press, Chapel Hill, NC
Williams R G 1994 States and Social Evolution: Coffee and the Rise of National Governments in Central America. University of North Carolina Press, Chapel Hill, NC
Zur J N 1998 Violent Memories: Mayan War Widows in Guatemala. Westview Press, Boulder, CO

C. A. Smith

Ceramics in Archaeology

Ceramics, or pottery, are among the oldest and most significant technological innovations in the history of humankind, the first truly synthetic material. Being highly plastic and thus virtually infinite in the range of shapes and forms possible, offering ready surfaces for decoration, and being ubiquitous in many archaeological contexts, ceramic containers and artifacts have provided archaeologists with one of their main categories of empirical data. Studies of variation in ceramic production, style, and use have assisted archaeologists both in the construction of chronologies and in the interpretation of ancient societies.

Ceramics denotes objects made from clay and fired to achieve hardness, which range from low-fired terracottas and earthenwares (firing range 900–1200 °C), through stonewares (1200–1350 °C), to porcelains


(1300–1450 °C). Most archaeological analyses of ceramics are focused on earthenwares, which dominated pottery industries throughout most of prehistory. Aside from vessels and containers of all kinds, ceramic artifacts also include such important items as bricks and tiles, figurines and models, and tableware.

1. History of Ceramics

The earliest human experimentation with clay, including the discovery that moist clay is plastic, can be shaped, and when heated or fired will retain its shaped form, is not well documented archaeologically. Fired clay figurines are known from the Dolní Věstonice site in the Czech Republic, dated to 30,000 BC. In the Old World, the first ceramic vessels appear in the early Jomon culture of Japan at about 10,000 BC, and in the Near East (Anatolia) at around 8500–8000 BC. Ceramics do not appear until considerably later in the New World, in several localities between 3000 and 2500 BC. In all cases, the origins of ceramic vessels appear to be closely associated with sedentary modes of life, and with storage either of agricultural products or of quantities of gathered plant foods (as in the Jomon case). There is a high degree of correlation between sedentism and pottery making, as evident in a world ethnographic sample where only two out of 46 pottery-making societies are nonsedentary (Arnold 1985).

The oldest known pottery in the world is the ‘cord-marked’ Jomon pottery industry of Japan, dating to 10,000 BC and associated with sedentary or semi-sedentary hunter-gatherers. Jomon pottery is highly distinctive, both in its cord- or string-impressed surface decoration and in its elaborate modeled rims; vessel forms are typically high jars or beakers. Elsewhere in the Far East, pottery appears in parts of China by around 7000 BC, if not earlier. The Yang-shao culture, centered in the Yellow River valley of northern China and dating between 4800 and 4200 BC, is noted for its beautifully formed earthenware jars and dishes, decorated with red-and-black painted geometric designs. In later time periods, the Chinese perfected many technical aspects of ceramic production, such as the horizontal and vertical updraft kilns, the potter’s wheel, and high-fired stonewares and glazes. Porcelain ceramics were innovated in China sometime around the beginning of the first millennium AD, and were widely traded throughout the Old World, and later to Europe and the Americas.

In the Near East, architectural uses of clay (e.g., for walls, floors, and roofs) are widespread by around 7500 BC, in association with the domestication of plants and animals, and the origins of sedentary village life. Pottery containers appear by 8500–8000 BC in Anatolia, and slightly later in other parts of the Near East. A variety of other clay implements and artifacts are also characteristic of early Near Eastern sites, such as clay spindle whorls and loom weights, clay sickles (with obsidian or flint blades), clay toys and models, and clay stamps and cylinder seals. Ceramic technology was not independently developed in Europe, but rather was borrowed or diffused from the Near East, in association with the spread of agriculture and a sedentary mode of existence. In the classical world, of course, Greek black- and red-figured pottery (600–400 BC) represents a high level of technical and artistic achievement, as does slightly later Roman Arretine ware (100 BC–AD 400).

Ceramic technology in the New World arose considerably later in time, evidently as an independent development; moreover, the potter’s wheel and the use of glazes never developed in the New World. Between ca. 2500 and 2000 BC, pottery appears in several localities, including the coast of Colombia, the Pacific coast of Mexico, and the southeastern United States. Among the outstanding pottery traditions of the New World, one must count the coastal Peruvian Nazca and Moche traditions (200 BC to AD 700) with their elaborately modeled anthropomorphic and zoomorphic vessels; the Late Classic Maya pottery which included exquisite polychrome funerary vessels; and the Pueblo pottery traditions of the southwestern United States.

2. Archaeological Approaches to Ceramics

Ceramics have played a central role in archaeological method and theory, for several reasons: (a) pottery has a long history, and is virtually ubiquitous in most sedentary societies; (b) pottery is nonperishable, and is often recovered in very large quantities from archaeological excavations; (c) pottery served both as utilitarian cooking, storage, and serving vessels for all strata of a society and in special-purpose roles for elite ceremonial or funerary use; and (d) as a highly plastic material, pottery displays seemingly infinite variation in composition, manufacturing method, shape, and decoration. These variations, moreover, were culturally conditioned, resulting in particular ceramic styles or traditions, which can be traced through time and space.

Early in the development of modern archaeology, scholars such as Flinders Petrie in Egypt, and somewhat later James Ford, Irving Rouse, and James Griffin in North America, recognized the value of ceramic studies in developing relative chronologies based on changes in ceramic style over time. Prior to the development of such absolute chronometric dating methods as radiocarbon or dendrochronology (see Chronology, Stratigraphy, and Dating Methods in Archaeology), the construction of ceramic chronologies was essential to developing local and regional time frames for prehistoric cultures.
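
The idea of building a relative chronology from changes in ceramic style can be illustrated with a small frequency seriation. The sketch below is only an illustration under stated assumptions: the site names and sherd counts are hypothetical, and the brute-force search over orderings is a convenience for tiny data sets, not the procedure used by Petrie, Ford, Rouse, or Griffin. It converts counts to percentages and keeps every ordering of sites in which each type's percentage rises to a single peak and then falls.

```python
from itertools import permutations

# Hypothetical sherd counts per site for three pottery types (illustrative only).
SITES = {
    "A": [45, 5, 0],
    "B": [120, 70, 10],
    "C": [20, 50, 30],
    "D": [15, 75, 210],
}


def to_percentages(counts):
    """Convert raw sherd counts for one assemblage into percentages."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]


def is_unimodal(series):
    """Return True if the series never rises again once it has started to fall."""
    falling = False
    for previous, current in zip(series, series[1:]):
        if current < previous:
            falling = True
        elif current > previous and falling:
            return False
    return True


def seriate(sites):
    """Return every ordering of sites in which all type frequencies are unimodal."""
    names = sorted(sites)
    percentages = {name: to_percentages(sites[name]) for name in names}
    n_types = len(percentages[names[0]])
    valid = []
    for order in permutations(names):
        # Check each pottery type's percentage series along this ordering.
        if all(
            is_unimodal([percentages[site][t] for site in order])
            for t in range(n_types)
        ):
            valid.append(order)
    return valid


if __name__ == "__main__":
    for order in seriate(SITES):
        print(" -> ".join(order))
```

With these made-up counts only two orderings survive, and they are mirror images of each other. A seriation of this kind yields a sequence but not its direction, so which end is earlier has to be settled by independent evidence such as stratigraphy.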


With the rise of absolute dating methods in the second half of the twentieth century, archaeological studies of ceramics have shifted somewhat, with less emphasis on classification, seriation, and chronology construction, in favor of new approaches. These include detailed physical and chemical studies of ceramic composition and of production techniques and sequences, which inform archaeologists about the technology of ceramic production and use, and about the distribution and movement of ceramic vessels during their life spans. Such information in turn is useful to archaeologists who are attempting to understand prehistoric economic and sociopolitical organization.

2.1 Classification of Ceramics

As perhaps the most plastic of all material culture media known in antiquity, ceramics exhibit enormous variation (in fabric, methods of production, shape, decoration, and function), offering tremendous advantages as well as challenges to the archaeologist. In order to bring some order to this variability, archaeologists must classify pottery into sets of like objects (see Classification and Typology (Archaeological Systematics)). The aims and methods of ceramic classification, however, depend greatly upon the particular archaeological approach. For example, a classification which is aimed at the discovery of historical types, those which show meaningful variation over time, will most likely emphasize different aspects of variation from a classification designed to exhibit key differences in manufacturing process. The literature on archaeological classification of ceramics is vast, but useful overviews may be found in Shepard (1965), Rice (1987), and Sinopoli (1991).

Ceramic classifications may be based on a number of different kinds of variables, the most common being technological dimensions, vessel shape, and decoration. Classifications based on technological dimensions are especially useful when the aim is to understand production processes, although such dimensions may also be useful in the definition of historical types. Major technological dimensions include raw materials (especially clay and nonplastic inclusions such as mineral temper), methods of vessel forming (such as coiling or slab building, and the use of the potter’s wheel), secondary treatments (such as the application of slips or glazes), and methods of firing (open air firing, use of kilns). Vessel shape may be classified formally according to various systems, such as that developed by Shepard (1965), which distinguishes between restricted and unrestricted vessels, and further between composite and inflected shapes. Vessel shape, naturally, is closely linked with vessel function. The third major category of variation used in ceramic classification is that of decoration and surface treatment. Surfaces may be slipped, glazed, burnished, polished, paddle-impressed, or smoothed. Decorations may be applied by painting, stamping, incising, carving, or other methods, such as three-dimensional appliqué or relief. Designs used in decorations typically consist of individual design elements, which are systematically organized into motifs and larger decorative panels, and generally follow culturally determined rules. The classification and analysis of such design systems may be based on individual motif catalogs and sets of design rules (e.g., Mead et al. 1965), or on analysis of underlying principles of symmetry (e.g., Washburn 1977).

Methods of ceramic classification also vary. Pottery producers have their own indigenous or folk classification systems, which have been the subject of considerable ethnoarchaeological study (see Sect. 2.4 below). Formal archaeological procedures for classifying pottery include the well-developed ‘type-variety’ system, first applied in North America and later extended to Mesoamerica. This system uses a ‘binomial’ nomenclature, in which a geographic name is combined with some specific technological descriptor (e.g., ‘Barton incised’). Classification schemes range from simple paradigmatic classifications to complex, hierarchical taxonomies. An alternative approach, much favored in the 1960s and 1970s, is quantitative or phenetic classification, in which ‘types’ are generated by a computer, following certain mathematical algorithms operating upon a number of qualitative and/or quantitative parameters. The prime example of such a numerical taxonomy of ceramics is that of Clarke (1970) for the Bell Beaker pottery complex of Great Britain.

2.2 Ceramics and Chronology

The emphasis accorded ceramic studies by archaeologists reflects the importance of using ceramic change as a means for constructing cultural chronologies. In the late nineteenth century, pioneering Egyptologist Flinders Petrie recognized that pottery vessels which had been placed in tomb groups as funerary objects showed subtle but continuous stylistic changes over time. Petrie arranged representative vessels in an inferred chronological sequence, resulting in the first seriation of pottery.

The methods of seriation were greatly refined by American archaeologists, such as Irving Rouse and James A. Ford (Ford 1962), working with ceramic assemblages from various New World localities, including the Mississippi Valley region and the Virú Valley of Peru. Their work generated much debate concerning the methods of ceramic classification and the reality of ceramic ‘types.’ The fundamental principle of seriation was to define a set of historical types


which displayed gradual temporal changes, with a particular type arising at some point in time, gradually increasing in popularity (hence reflected by increased frequency in archaeological assemblages), and later decreasing until the type disappeared from the archaeological record. When the frequencies of such historical types were plotted as percentage diagrams, they displayed a characteristic frequency distribution resembling the plan of a battleship, and hence were known as ‘battleship curves.’

The great advantage of seriation was that it permitted the construction of cultural chronologies independent of any ‘absolute’ method of direct dating. Surface collections of potsherds from sites of unknown age could be tabulated according to the frequency of key ceramic types, and then chronologically ordered by arranging the frequency distributions to form ‘battleship curves.’ Moreover, local ceramic chronologies could be linked together using trade pottery which occurred in more than one region, or by tracing the diffusion of particular stylistic traits. With the invention of radiocarbon dating and other methods of ‘absolute’ dating in the latter half of the twentieth century, the importance of seriation has declined, although it is still important as a cross-check on radiocarbon-based chronologies.

Pottery may also be directly dated, either by radiocarbon dating of organic inclusions within the ceramic fabric (e.g., dung or chaff included as temper), or by thermoluminescence (TL) dating. TL dating is based on the fact that when clay and other geological inclusions in pottery are fired at temperatures of 500 °C or higher, electrons which had been ‘trapped’ in the crystal lattice structure are freed, emitting light or thermoluminescence. Following the firing process, new trapped electrons gradually accumulate in the crystalline imperfections of the pottery, as a natural consequence of radioactive decay. If ancient pottery is then reheated in the laboratory up to 500 °C, and the emitted light is measured and plotted as a ‘glow curve’ of intensity vs. temperature, the age of the specimen can be calculated, since the intensity of light emitted will be proportional to age. There are, of course, many possible complications deriving from the geological composition of the fabric, and from post-depositional conditions affecting the pottery. As a result, TL dating is less widely used than the radiocarbon method.

2.3 Compositional Studies

Within many ancient societies, pottery was produced by specialists who then traded, exchanged, or sold their wares to other nonpottery producing sectors of society, and/or to other villages or geographic localities. Moreover, pottery was frequently traded or exchanged over considerable distances. Tracing the production, distribution, and specialized use of pottery in ancient societies requires that the archaeologist be able to characterize the unique composition of a particular ceramic product, typically a mix of clay and nonplastic inclusions (including purposefully added ‘temper’). A key phase in archaeological pottery analysis is thus ceramic characterization (Bishop et al. 1982). Characterization may also include efforts at sourcing, in which the materials that make up a particular ceramic ware are traced to their geographic points of origin, such as local clay quarries or sources of sand used as temper.

A wide range of mineralogical and geochemical techniques has been applied to ceramic compositional analysis, whether for characterization or sourcing. Petrographic analysis of the nonplastic inclusions within a ceramic fabric, in which the specific mineral grains are identified by examining a thin section of the pottery under a polarizing microscope, is a widely used technique. X-ray diffraction, which identifies minerals by their crystalline structures, is another frequently applied method for the characterization of pottery on the basis of the temper or nonplastic inclusions. Other techniques which have been applied more recently include optical emission spectroscopy, X-ray fluorescence spectroscopy, atomic absorption spectroscopy, neutron activation analysis, proton-induced X-ray emission, Mössbauer spectroscopy, electron microprobe analysis, and inductively-coupled plasma analysis (see Rice 1987 for a review of these and other methods).

2.4 Ethnoarchaeology and Ceramics

Given the importance of ceramics in archaeology, it is not surprising that archaeologists have turned to traditional pottery-making societies to learn more about potential variability in ceramic production, distribution, use, and discard. The study of contemporary peoples, using ethnographic methods of participant-observation, in order to gain knowledge of material culture variability which is potentially applicable to the interpretation of archaeological assemblages, is called ethnoarchaeology (see Ethnoarchaeology). Ceramic ethnoarchaeology (Kramer 1985) is perhaps one of the most important subfields within this topic.

Because archaeologists had long been concerned with the classification of ceramics, a number of ethnoarchaeological studies have focused on the ways in which traditional potters classified or categorized their own products. Indigenous potters typically pay little attention to the technological attributes often accorded emphasis by archaeologists (such as details of temper, paste, surface treatment, or decoration), but rather emphasize general function in their folk classifications. Thus, among the Kalinga of the Philippines, pottery vessels are distinguished by whether they are intended for cooking rice or for cooking

1626
Cerebellum: Associatie Learning

vegetables and meat, and by their sizes (Longacre
1981). The Fulani of Cameroon (David and Hennig
1972) lexically discriminate among five size classes of
jars, with secondary classification based on their
intended contents.
Other ethnoarchaeological studies of pottery have
focused on aspects of ceramic production, including
the social role or status of potters in their societies, on
ceramic distribution, on the use of vessels, their life
spans, and on their breakage and discard rates (see
Kramer 1985 for a general review). Such studies have
aided archaeologists in their interpretations of ancient
ceramics by revealing the complex linkages between
behavior and material culture, by demonstrating that
strictly utilitarian explanations for archaeological
phenomena are not always preferable, and by showing
that multiple lines of evidence may help to discriminate
between alternative explanations.

See also: Art: Anthropological Aspects; Art History;
Chronology, Stratigraphy, and Dating Methods in
Archaeology; Classification and Typology (Archaeological
Systematics); Ethnoarchaeology; Intensification
and Specialization, Archaeology of; Trade and
Exchange, Archaeology of

Bibliography
Arnold D E 1985 Ceramic Theory and Culture Process. Cambridge
University Press, Cambridge, UK
Bishop R L, Rands R L, Holley G R 1982 Ceramic compositional
analysis in archaeological perspective. In: Schiffer M
(ed.) Advances in Archaeological Method and Theory, Vol. 5.
Academic Press, New York
Clarke D L 1970 Beaker Pottery of Great Britain and Ireland.
Cambridge University Press, Cambridge, UK
David N, Hennig H 1972 The Ethnography of Pottery: A Fulani
Case Seen in Archaeological Perspective. Addison-Wesley,
Reading, MA
Deetz J 1965 The Dynamics of Stylistic Change in Arikara
Ceramics. University of Illinois Press, Chicago
Ford J A 1962 A Quantitative Method of Deriving Cultural
Chronology. Technical Manual No. 1. Pan American Union,
Washington, DC
Kramer C 1985 Ceramic ethnoarchaeology. Annual Review of
Anthropology 14: 77–120
Longacre W A 1981 Kalinga pottery, an ethnoarchaeological
study. In: Hodder I, Isaac G, Hammond N (eds.) Pattern of the
Past. Cambridge University Press, Cambridge, UK
Mead S M, Birks L, Birks H, Shaw E 1975 The Lapita Pottery
Style of Fiji and its Associations. Polynesian Society Memoir
No. 38. Wellington (New Zealand)
Rice P M 1987 Pottery Analysis: A Sourcebook. University of
Chicago Press, Chicago
Rye O S 1981 Pottery Technology: Principles and Reconstruction.
Taraxacum, Inc., Washington, DC
Shepard A O 1965 Ceramics for the Archaeologist. Carnegie
Institution of Washington Publication 609, Washington, DC
Sinopoli C M 1991 Approaches to Archaeological Ceramics.
Plenum Press, New York
van der Leeuw S E, Pritchard A C (eds.) The Many Dimensions
of Pottery: Ceramics in Archaeology and Anthropology. University
of Amsterdam, Amsterdam
Washburn D K 1977 A Symmetry Analysis of Upper Gila Area
Ceramic Design. Papers of the Peabody Museum of Anthropology
and Archaeology, No. 68. Peabody Museum, Cambridge, MA

P. V. Kirch

Cerebellum: Associative Learning

Associative learning is behavioral change that accompanies
the presentation of two or more stimuli at the
same point in time or space. For many years, behavioral
and neural scientists have studied associative
learning in invertebrate and vertebrate species using
standard classical and instrumental conditioning procedures
in hopes of delineating neural circuits, brain
structures, and brain systems that are involved in
encoding learning and memory. In this article, the
critical involvement of the cerebellum in associative
learning is examined.

1. The Cerebellum and Classical Eyeblink Conditioning
Arguably the best-understood associative learning
paradigm, from both behavioral and neurobiological
perspectives, is classical conditioning of the eyeblink
response. Briefly, a neutral stimulus such as a tone or
light (the conditioned stimulus or CS) is presented just
before an aversive stimulus such as a peri-orbital
shock or corneal air puff (the unconditioned stimulus
or US). Initially, the CS produces no overt movement
while the US causes a reflexive eyeblink (the unconditioned
response or UR). After 50–100 pairings of
the CS and US, the CS begins to elicit a learned
eyeblink (the conditioned response or CR). While
most eyeblink conditioning experiments have involved
rabbits as subjects, it appears that all mammals,
including humans, learn this simple associative task at
similar rates using similar brain circuitry.
For a variety of reasons that include (a) the relative
simplicity of the response being monitored, (b) the
great deal of control that the experimenter has over
stimulus delivery, and (c) the precise timing of the
learned response, this behavioral task has proven
useful for delineating the neural circuitry involved in
simple associative learning. Many experiments conducted
since the early 1980s have demonstrated
conclusively that the cerebellum contains a population
of neurons that change their patterns of firing to
encode the acquisition and performance of the classically
conditioned eyeblink response; that is, the
cerebellum’s circuitry constitutes the essential learning
and memory architecture for this basic associative
learning procedure (see Woodruff-Pak and Steinmetz
2000, for review).

1.1 Lesion Experiments
The initial demonstrations of the involvement of the
cerebellum in classical eyeblink conditioning were
lesion experiments (e.g., McCormick and Thompson
1984). Lesions placed in the interpositus nucleus of the
cerebellum prevented acquisition of eyeblink CRs and
abolished previously learned CRs. Lavond et al. (1985)
demonstrated the same lesion effect with infusions of
kainic acid, which spared fibers of passage that course
through or near the interpositus nucleus. Reversible
lesions placed by cooling brain tissue (Clark and
Lavond 1993) or injecting muscimol (Krupa et al.
1993) were also effective in abolishing CRs. Interestingly,
when additional paired training was delivered
without cooling or muscimol inactivation, the animals
showed no savings in the rate of CR acquisition: they
behaved as if they had received no previous paired
training. These studies provide strong evidence that
critical neuronal plasticity that underlies classical
eyeblink conditioning occurs in the cerebellum.
Lesions of the cerebellar cortex have not produced
results as consistent as those of interpositus nucleus lesions
(e.g., Lavond and Steinmetz 1989). Cerebellar cortical
lesions have reportedly caused retarded rates of CR
acquisition, reduced CR amplitudes, or the appearance
of mistimed CRs. These data indicate that the
cerebellar cortex is involved in the conditioning
process, but its precise role in conditioning and its
interactions with the interpositus nucleus during conditioning
are not well understood.

1.2 Recording Experiments
Electrophysiological recordings taken from cerebellar
cortex and the interpositus nucleus have provided
additional evidence for the involvement of the cerebellum
in classical eyeblink conditioning (e.g.,
Berthier and Moore 1986, 1990). Recordings in regions
of the cerebellum known to receive converging CS and
US input have revealed neurons that discharge with
patterns that seem to be encoding the conditioning
process. Specifically, Purkinje cells were identified that
discharged when the CS or US was presented. Other
Purkinje cells either increased or decreased their rate
of discharge in a pattern that seemed to be time-locked
to execution of the behavioral CR. Purkinje cells that
decreased their firing rates are particularly interesting
because Purkinje cells are known to inhibit neurons in
the deep cerebellar nuclei. Thus, a decrease in firing
rate of a Purkinje cell could result in an increase in
excitability of interpositus nucleus neurons, a result
that is compatible with formation of a behavioral CR.
As in cerebellar cortex, neurons that developed
discharge patterns highly correlated with CR performance
were observed in the interpositus nucleus.
Neurons that discharged to presentations of the CS
and US were seen and, after learning, neurons that
discharged in patterns that were time-locked with the
behavioral CR were abundant. Interestingly, the onset
of interpositus unit activity preceded the behavioral
response by 30–60 milliseconds. These important
observations provide strong evidence that cellular
activity in the interpositus nucleus is the neural substrate
of the behavioral CR that is observed. It is thought that
CR-related activity generated in the interpositus nucleus
activates neurons in the red nucleus which, in
turn, activate neurons in the brainstem motor nuclei that
are responsible for generating eyeblinks.

1.3 CS and US Pathways
Using stimulation, recording, and lesion methods (e.g.,
Steinmetz et al. 1989), the putative pathways for
projecting CS and US information from the periphery
to the cerebellum have been delineated. It appears that
a tone CS is projected from the ear to the cochlear
nuclei which, in turn, relay tone information to the
basilar pontine nuclei. The pontine nuclei then project
information about the CS to the cerebellum along
mossy fibers. On the US side, air puffs are known to
activate corneal receptors that send projections to the
trigeminal nucleus. The trigeminal nucleus, in turn,
projects information about the US to the rostromedial
portion of the dorsal accessory inferior olive. Climbing
fibers that originate from the inferior olive then relay
information about the occurrence of the US to the
cerebellum.

1.4 The Cerebellum as an Associator
There is ample evidence from anatomy, electrophysiology,
lesion, and microstimulation experiments
that information concerning the occurrences of the CS
and US converges on populations of neurons in the
cerebellum. The leading models of the involvement of
the cerebellum in classical eyeblink conditioning postulate
that CS–US inputs converge in two locations:
the interpositus nucleus and cerebellar cortex (e.g.,
Steinmetz 2000). Changes in the excitability of
cortical and nuclear neurons, produced by convergent
CS and US inputs, are thought to form the cellular
bases for the learning and performance of the classically
conditioned eyeblink response. In essence, the
cerebellar circuitry serves as an ‘associator’ for discrete
stimuli that are presented in the environment. This
idea is certainly not new. Computational neurobiologists
such as Marr (1969) have long considered
the architecture of the cerebellum to be ideal for
associating environmental information with teaching
or reinforcing inputs. Further, models such as those of
Marr have hypothesized that mossy fibers and climbing
fibers serve as the environmental and teaching
inputs, respectively. This architecture maps very nicely
onto the known neural circuitry involved in eyeblink
conditioning where the CS appears to be carried along
mossy fibers and the US appears to be carried along
climbing fibers.
These earlier models do not, however, predict a role
for the deep cerebellar nuclei in the associative learning
process; rather they postulate that the deep nuclei are
passive recipients of outflow from cerebellar cortex.
The data collected using eyeblink conditioning suggest
differently. The nuclei appear to receive convergent
CS–US input, the nuclei show neuronal responses that
are related to conditioning, and the reversible lesion
experiments of Clark and Lavond (1993) and Krupa et
al. (1993) suggest that critical cellular plasticity related
to conditioning occurs in the nuclei. At this time, the
most parsimonious explanation of the available data
is that critical plasticity that underlies classical
eyeblink conditioning occurs both in cerebellar cortex
and in the deep cerebellar nuclei. It has been suggested
that these two areas may encode different features of
the conditioning, with excitability changes in nuclear
cells important for generating activity that drives
brainstem motor neurons responsible for eyeblink
CRs, and excitability changes in cortical cells important for
providing gain on the response and for regulating the
timing of the response (e.g., Gould and Steinmetz
1996, Steinmetz 2000).

2. The Cerebellum and Other Associative Learning Paradigms
There are surprisingly few other demonstrations of the
involvement of the cerebellum in associative learning.
This is likely not due to a general lack of involvement
of the structure in this type of learning (although see
below for some limitations on cerebellar involvement
in associative learning) but rather to the fact that few experiments
have been conducted to explore the involvement of the
cerebellum in associative learning tasks. Two exceptions
are briefly described here: adaptation of the
vestibulo–ocular reflex (VOR) and instrumental signaled
bar-press conditioning.

2.1 Adaptation of the VOR
The VOR is a brain system used to stabilize a visual
image on the retina during movement. In this reflex,
rotations of the head are detected by semicircular
canals located in the inner ear, and the eyes are moved
in their sockets in the direction opposite to the
movement of the head. This reflex stabilizes the line of
sight. The VOR is highly plastic as the gain of the
reflex can be changed easily to accommodate changes
in the strength or efficiency of the extraocular muscles
to deal with changing levels of vestibular activation. In
many respects, this can be considered to be an
associative learning procedure as it is known that gain-setting
of the VOR is dependent on two events:
vestibular input from the semicircular canals and
visual information (the relative slippage of the visual
image on the retina during head movements) to
determine if a change in VOR gain is needed. Over the
years, a variety of studies have implicated the cerebellum
and associated brainstem structures in adaptation
of the VOR (Ito 1984, Lisberger 1988). In a
similar way to classical eyeblink conditioning, it
appears that neuronal plasticity that forms the basis of
VOR adaptation occurs in discrete regions of the
cerebellar cortex and in brainstem nuclei (the vestibular
nuclei) that receive convergent input from the
vestibular and visual systems.
Critical involvement of the cerebellum has also been
demonstrated for an instrumental conditioning task
that has some similarities to classical eyeblink conditioning
(Steinmetz et al. 1993). In this task, rats are
first shaped to press a bar to terminate a mild,
pulsating foot-shock. After learning the bar-press
response, the rats are placed in a signaled-training
situation where they learn to avoid the foot-shocks by
pressing the response bar during tone presentations.
Rats reach 50–60 percent avoidance rates with about
10–12 days of training in this procedure. This associative
task is somewhat similar to classical eyeblink
conditioning in that a neutral stimulus (a tone) is used
to signal an impending noxious stimulus (a mild foot-shock).
Bilateral lesions of the dentate and interpositus
nuclei in rats prevented learning of this avoidance
response. Escape responding was initially high in
lesioned rats, but this responding decreased over
sessions. Interestingly, deep nuclear lesions seem
to be effective in preventing avoidance learning only
when the interval between tone onset and foot-shock
onset is five seconds or less.

3. When is the Cerebellum Critical for Simple Associative Learning?
Another way to frame this question is to ask: when is
the cerebellum not involved in associative learning? A
number of experiments have addressed this issue.
First, the cerebellum appears to be necessary for
associative learning when the interval of time between
the stimuli being associated is relatively short. Classical
eyeblink conditioning can only be obtained when
the CS–US interval is 3–4 seconds or less. Longer
CS–US intervals do not produce eyeblink CRs, although
a variety of other conditioned responses can be
elicited. Adaptation of the VOR requires near-simultaneous
occurrences of head rotation and retinal
slip. As detailed above, signaled bar-press conditioning
seems to be critically dependent on cerebellar function
only when relatively short CS–US intervals are used.
Avoidance conditioning is relatively easy to obtain
with longer tone–foot-shock intervals, but cerebellar
lesions appear to have no effect on these learned
responses, suggesting that other brain areas are critical
for active avoidance learning when intervals
between the stimuli are relatively long. These observations
are highly consistent with what is known about
the role of the cerebellum in movement and posture:
the cerebellum is intricately involved in making fine
adjustments to ongoing movements during relatively
brief periods of time (often referred to as movement
error-correction).
Second, and related to the first point, the cerebellum
seems to be critical for associative learning that involves
relatively simple, discrete skeletal muscle responses.
Conditioned eyeblinks, VOR adaptation, and conditioned
bar-press responses require the rapid recruitment
of relatively few muscles, especially when
the time period allowed for responding is relatively
brief. The idea that cerebellar involvement in learning
may be limited by response requirements has been
tested (Steinmetz et al. 1991). Rabbits were trained in
two tasks: classical eyeblink conditioning, and a
discriminative avoidance procedure. In the discriminative
avoidance procedure, rabbits were presented
with a tone and required to locomote in an activity
wheel to avoid a foot-shock that was presented after
the tone onset. Bilateral lesions of the interpositus
nucleus of the cerebellum prevented acquisition of the
conditioned eyeblink response and abolished already
learned eyeblink CRs, but had no effect on the
acquisition or performance of the discriminative
avoidance response. While the discriminative wheel-turn
avoidance task differs from eyeblink conditioning
on several dimensions (e.g., it involves discrimination
learning and uses a longer interstimulus interval), one
of the major differences between paradigms lies in the
response requirements (discrete eyeblink vs. a
relatively complex, bipedal, locomotive response).
Lavond and colleagues (1985) have also shown that
the conditioned changes in heart-rate that are normally
observed during classical eyeblink conditioning
are not affected by cerebellar lesions. These data
suggest that the encoding of autonomic responses
during associative learning involves areas outside of
the cerebellum, a finding that is compatible with a
number of other studies. Together, these data suggest
that the cerebellum may be involved in associative
learning when relatively discrete skeletal muscle
responses are conditioned.
Third, there is evidence that the cerebellum is not
involved in encoding associative reward learning (i.e.,
learning that involves reinforcement). The rats described
above that showed severe deficits in learning to
avoid the signaled foot-shock after cerebellar lesions
(Steinmetz et al. 1993) were trained in an appetitive
version of the task. In the appetitive version of the
task, a tone was presented for a 3–5 second period, and
a bar press resulted in the delivery of a food-pellet
reward. The cerebellar-lesioned rats easily learned the
appetitive task even though they showed a complete
inability to learn the aversively motivated task. In a
more direct comparison of appetitive and aversive
classical conditioning, Gibbs (1992) trained rabbits on
two classical conditioning tasks before delivering
lesions to the interpositus nucleus of the cerebellum.
One task was standard classical eyeblink conditioning
(an aversive task), while the second task was classical
jaw-movement conditioning (an appetitive task). Jaw-movement
conditioning involves pairing a tone CS
with the delivery of water or juice (the US) into the
mouth. The water or juice causes a movement of the
jaw as the rabbit consumes the liquid. Several pairings
of the CS and US produce an anticipatory jaw
movement to the tone (the CR). Gibbs showed that in
this within-subject experiment, lesions of the cerebellar
deep nuclei abolished eyeblink conditioning but had
no effect on jaw-movement conditioning. These data
suggest that the cerebellum is involved in encoding
aversively motivated associative learning but not
appetitively motivated associative learning. This suggestion
is not surprising given the large body of
research that has detailed the involvement of forebrain
structures and circuits in reward learning.

4. The Cerebellum and Associative Learning
For several years, theorists who have speculated about
the function of the cerebellum have noted that the
basic anatomy and architecture of the cerebellum
seem to be designed for associative learning. The
cerebellum receives inputs through two separate and
unique systems of fibers: mossy fibers and climbing
fibers, and a growing body of evidence suggests that
plasticity in cerebellar neurons may be due to associative
interactions that occur between these inputs.
Further, classical eyeblink conditioning seems to be an
ideal associative learning paradigm for studying the
involvement of the cerebellum in associative learning.
This procedure involves the conditioning of discrete
responses, involves the presentation of an aversive or
noxious US, and involves a relatively brief period of
time between the presentation of the stimuli being
associated. In essence, the classical eyeblink conditioning
procedure could be considered the prototypical
learning paradigm for engaging the cerebellum during
associative learning. Further studies into cellular and
systems-level cerebellar processes engaged during classical
eyeblink conditioning should provide valuable
data concerning the general role of the cerebellum in
associating two or more external stimuli or internal
events in time.

See also: Cerebellum: Cognitive Functions; Classical
Conditioning, Neural Basis of; Electroencephalography:
Basic Principles and Applications; Electroencephalography:
Clinical Applications; Eyelid
Classical Conditioning; Long-term Depression (Cerebellum);
Topographic Maps in the Brain; Vestibulo-ocular
Reflex, Adaptation of the

Bibliography
Berthier N E, Moore J W 1986 Cerebellar Purkinje cell activity
related to the classically conditioned nictitating membrane
response. Experimental Brain Research 63: 341–50
Berthier N E, Moore J W 1990 Activity of deep cerebellar
nuclear cells during classical conditioning of nictitating
membrane extension in rabbits. Experimental Brain Research
83: 44–54
Clark R E, Lavond D G 1993 Reversible lesions of the red
nucleus during acquisition and retention of a classically
conditioned behavior in rabbits. Behavioral Neuroscience 107:
264–70
Gibbs C M 1992 Divergent effects of deep cerebellar lesions on
two different conditioned somatomotor responses in rabbits.
Brain Research 585: 395–9
Gould T J, Steinmetz J E 1996 Changes in rabbit cerebellar
cortical and interpositus nucleus activity during acquisition,
extinction and backward classical conditioning. Neurobiology
of Learning and Memory 65: 17–34
Ito M 1984 The Cerebellum and Neural Control. Raven Press,
New York
Krupa D J, Thompson J K, Thompson R F 1993 Localization
of a memory trace in the mammalian brain. Science 260:
989–91
Lavond D G, Hembree T L, Thompson R F 1985 Effect of
kainic acid lesions of the cerebellar interpositus nucleus on
eyelid conditioning in the rabbit. Brain Research 326: 179–82
Lavond D G, Lincoln J S, McCormick D A, Thompson R F
1984 Effect of bilateral lesions of the dentate and interpositus
cerebellar nuclei on conditioning of heart-rate and nictitating
membrane/eyelid responses in the rabbit. Brain Research 305:
323–30
Lavond D G, Steinmetz J E 1989 Acquisition of classical
conditioning without cerebellar cortex. Behavioral Brain
Research 33: 113–64
Lisberger S G 1988 The neural basis of learning of simple motor
skills. Science 242: 728–35
Marr D 1969 A theory of cerebellar cortex. Journal of Physiology
202: 437–70
McCormick D A, Thompson R F 1984 Cerebellum: essential
involvement in the classically conditioned eyelid response.
Science 223: 296–9
Steinmetz J E 2000 Brain substrates of classical eyeblink
conditioning: A highly localized but also distributed system.
Behavioral Brain Research 110: 13–24
Steinmetz J E, Lavond D G, Thompson R F 1989 Classical
conditioning in rabbits using pontine nucleus stimulation as a
conditioned stimulus and inferior olive stimulation as an
unconditioned stimulus. Synapse 3: 225–33
Steinmetz J E, Logue S F, Miller D P 1993 Using signaled bar-pressing
tasks to study the neural substrates of appetitive and
aversive learning in rats: Behavioral manipulations and
cerebellar lesions. Behavioral Neuroscience 107: 941–54
Steinmetz J E, Sears L L, Gabriel M, Kubota Y, Poremba A
1991 Cerebellar interpositus nucleus lesions disrupt classical
nictitating membrane conditioning but not discriminative
avoidance learning in rabbits. Behavioral Brain Research 45:
71–80
Woodruff-Pak D S, Steinmetz J E (eds.) 2000 Eyeblink Classical
Conditioning: Animal Models. Kluwer, Boston

J. E. Steinmetz

Cerebellum: Cognitive Functions

In the nineteenth century, researchers reached a
consensus on the basis of animal ablation experiments
that damage to the cerebellum leads to motor disorders
but does not affect sensory or cognitive functions. In
the early twentieth century Stewart and Holmes (1904)
demonstrated that cerebellar lesions resulting from
tumors or gunshots elicit comparable motor deficits in
humans such as reduction of muscle tone, impairment
of movement coordination, and deficits in the regulation
of gait and posture. Voluntary movement
control is severely affected, the main symptom being a
disturbance of movement coordination (ataxia) which
may affect the control of limb muscles and ocular
muscles as well as speech control. In the view of
clinical neurology, the cerebellum was thus thought to
be exclusively engaged in the control of motor activity.
In the 1950s, however, a broader concept of cerebellar
function was suggested, with a possible cerebellar
involvement in the control of autonomic and
limbic activity. Clinical observations made an important
contribution to this broader concept (for a
summary see Daum and Ackermann 1995). Congenital
cerebellar malformations, for example, were
found to be associated not only with ataxia, balance
problems, and other motor deficits, but also with
mental retardation. In addition, disorders such as
schizophrenia and autism were related to neuropathological
abnormalities of the cerebellum. The
hypothesis of a cerebellar involvement in the control
of emotions was further supported by findings of a
modulation of fear and aggression by cerebellar
stimulation for the control of epileptic seizures.
More recent concepts of cerebellar function were
influenced by reports of two-way connections between
the cerebellum and the cerebral cortex, findings of
cognitive deficits in patients with cerebellar dysfunction,
and neuroimaging reports of cerebellar activation
during a range of cognitive tasks (Schmahmann 1997).
1. Anatomy of the Cerebellum
Like the cerebrum, the cerebellum consists of two
hemispheres. Three functional regions can be distinguished:
the centrally located vermis (Latin ‘worm’)
and the lateral and intermediate zones in each hemisphere.
Mossy fibers are the major afferents to the
cerebellum, receiving their input from brain stem
nuclei and spinal cord neurons. The climbing fibers,
which are the second excitatory input to the cerebellum,
originate from a single site in the medulla,
the inferior olivary nucleus. Efferent projections are
mediated by the Purkinje cells and the deep nuclei.
There are several reciprocal pathways between the
cerebellum and the cerebral cortex. The cerebellum
receives input via pontine nuclei from the parietal
cortex, the prefrontal cortex, and the superior temporal
sulcus. The cerebellum projects back to the same
regions via the thalamus (Schmahmann 1997). These
afferent and efferent projections imply a possible role
of the cerebellum in the modification of information
which is projected from the cortex to the cerebellum
and sent back to the cortex.

2. Motor Learning and Motor Imagery
It is well known that the cerebellum plays a critical role
in motor control, with the lateral regions of the
cerebellum mediating movement planning and programming,
while the medial regions are involved in the
execution of movement (Dichgans and Diener 1984).
Imaging studies using positron emission tomography
have also demonstrated a cerebellar role during
motor learning of sequential finger movements as well
as in trajectorial learning. In addition, the cerebellum
was found to contribute to the monitoring and
optimizing of movements by using sensory (proprioceptive)
feedback (Jueptner and Weiler 1998).
Motor skill learning describes the qualitative improvement
of performance through practice which
ensures that movements can be performed fast, accurately,
and with little attentional control. Electrophysiological
and lesion studies in nonhuman primates
with cerebellar hemispherectomies have demonstrated
the critical contribution of the cerebellum at a stage of
motor learning when performance becomes fast and
accurate. Similar results were observed in patients
with cerebellar dysfunction, who had problems in
learning the skillful execution of serial movements
(Doyon 1997).

essential neuronal circuitry involves the convergence
of CS and US information in the cerebellum and an
efferent projection from the cerebellum to motor nuclei
in the brain stem, which control eyeblink responses
(see Thompson 1991). Patients with cerebellar dysfunction
are severely impaired at acquiring eyeblink
CRs, although the reflex blink to the US is unaffected.
Conditioning of simultaneously recorded nonmotor
autonomic and electrocortical responses is also intact
(Daum et al. 1993a). The critical involvement of the
cerebellum in the conditioning of motor responses has
now been documented by a large number of clinical
investigations, functional neuroimaging studies, and
studies of physiological manipulations of cerebellar
functions in normal subjects (for a summary see
Schugens et al. 2000).
In motor imagery, a motor program that is stored
elsewhere in the CNS is activated without any overt
movement. There is some evidence that the cerebellum
becomes active during motor imagery, in tasks such as
silent counting and imagination of tennis training
movements (Decety et al. 1990).

3. Timing
The notion that the cerebellum computes timing
requirements for motor performance is supported by
investigations in animals, as well as by clinical data
(Keele and Ivry 1991). Keele and Ivry have argued
that the lateral regions of the cerebellum are critically
involved in the internal timing of motor and nonmotor
behavior which requires temporal computation. They
attributed the deficits in classical conditioning of
cerebellar patients to problems in timing the initiation
of the CR. This hypothesis is supported by the finding
of an inappropriately timed CR reported by Daum et
al. (1993a) and by Topka et al. (1993). Key symptoms
of cerebellar dysfunction, such as dysmetria (problems
with precise movements) or dysdiadochokinesia
(problems with fluent alternating movements), can
also be interpreted within the context of deficient
cerebellar timing functions. Further support for this
idea stems from impairments in rhythmic tapping in
cerebellar patients (Ivry et al. 1988) as well as from
deficits in speech production, which reflect a decline
in temporal coordination of neuromuscular interaction
needed for articulation (Ackermann and Ziegler
1992). Impairments of speech perception are also
consistent with the idea of a timing deficit in the nonmotor
domain (Ackermann et al. 1999).
The most frequently used motor learning paradigm
is classical conditioning of the eyeblink response. An
acoustic stimulus, the conditioned stimulus (CS), is 4. Cognitie Functions
paired with a corneal airpuff, the unconditioned
stimulus (US), which evokes an eyeblink. After re- The results of developmental dysfunction, such as
peated pairing of the CS and the US, the eyeblink cerebellar agenesis (absence of the cerebellum) or
occurs to the CS but before onset of the airpuff and cerebellar hypoplasia (prenatal developmental deficits
thereby forms a conditioned response (CR). The which result in loss or incomplete cerebellar devel-

1632
Cerebellum: Cognitie Functions

opment), varies from congenital apraxia to normal which would be associated with changing attentional
motor abilities. Similarly, cognitive development can behavior, is deficient due to cerebellar dysfunction.
range from profound mental retardation to normal Two-way cerebellar-parietal projections led to the
status. Intellectual deficits after delayed motor de- investigation of visuospatial abilities that are mediated
velopment may be due to a close coupling of motor by the parietal cortex. The studies carried out so far
and intellectual functions in early life. yielded no clear evidence of a general impairment of
Performance of patients with cerebellar dysfunc- visuospatial functions in patients with cerebellar dys-
tions on standard intelligence tests is generally in the function. Findings of difficulties of such patients with
normal range. Short- and long-term declarative mem- the mental manipulation of three-dimensional objects
ory as well as priming effects are also largely unaffected in space offers some evidence of a visuospatial proces-
in patients with cerebellar lesions (for a summary see sing deficit consistent with possible dysfunction of
Daum and Ackermann 1997). cerebellar–parietal circuits (Wallesch and Horn 1984).
With respect to skill learning, it has been argued
that the neocortex may be primarily concerned with
the generation\processing of specific operations, while 5. Conclusion
the cerebellum serves to modulate and optimize the
functions in question (Ito 1993). In support of this In summary, the cerebellum makes an important
idea, cerebellar damage was associated with deficits in contribution to the control of voluntary movement
the automatization of visuomotor sequences and and movement coordination as well as to the control
visuomotor skill learning (Doyon et al. 1998). As far as of balance, gait, and posture. Motor learning abilities
nonmotor skill learning is concerned, performance of are also largely dependent upon the functional in-
patients with cerebellar dysfunction on standard per- tegrity of the cerebellum. There is also strong evidence
ceptual and cognitive skill acquisition was largely for a cerebellar role as an ‘internal clock’ which comes
unimpaired (Daum et al. 1993a, Helmuth et al. 1997). into play during the control of movement as well as
The acquisition and performance of language skills during perceptual processing.
may be more difficult for such patients (Fiez et al. The exact nature of the cerebellar involvement in
1992); and cerebellar activations during such tasks may cognitive processes is so far less well understood.
be related to verbal response search (Desmond et al. Possible contributions to prefrontal or executive func-
1998). tions and visuospatial processing remain to be speci-
The anatomical as well as functional relationship fied by studies using patients with selective cerebellar
between the cerebellum and the prefrontal cortex lesions, adequately clinical and nonclinical matched
(Kim et al. 1994) led to the investigation of cognitive control groups, and the use of a wide range of tests
functions that are thought to be associated with assessing different aspects of the abilities in question.
prefrontal or executive function. Performance on Functional neuroimaging techniques also provide a
anticipatory planning tasks was found to be impaired good tool to study the cerebellar contribution to
in patients with cerebellar atrophy in some studies different cognitive abilities. A problem of imaging
(Grafman et al. 1992, Hallett and Grafman 1997). techniques is, however, that it is difficult to determine
Verbal fluency abilities are also associated with execu- which brain area is critically involved in which aspects
tive processing. In such tasks, subjects are asked to of cognitive processing, since essential and correlated
name as many items as possible of a certain semantic activity cannot be easily distinguished. A combination
category or starting with a certain letter within a of imaging techniques and transcranial magnetic
specific time limit. Patients with cerebellar damage stimulation, which elicits a transient lesion, may be a
may occasionally show problems with word gener- promising approach in this regard.
ation tasks of this kind (Fiez et al. 1992). Such
problems may, however, be influenced by slowing of See also: Habituation; Motor Control Models: Learn-
speech (‘dysarthria’) which is a frequent symptom of ing and Performance; Motor Cortex; Prefrontal
cerebellar damage. This motor speech slowing may Cortex
interfere with the execution of verbal fluency tasks,
and lead to poorer performance in some cases.
While cerebellar activation is observed in functional Bibliography
neuroimaging during performance of the Wisconsin
Card Sorting Test, a standard test of concept forma- Ackermann H, Graber S, Hertrich I, Daum I 1999 Cerebellar
tion, perseverative tendencies do not usually occur in contributions to the perception of temporal cues within the
speech and nonspeech. Brain Lang. 67(3): 228–41
cerebellar patients (Daum and Ackermann 1997,
Ackermann H, Ziegler W 1992 Die cerebella$ re Dysarthrie: Eine
Hallett and Grafman 1997). By contrast, cerebellar Literaturu$ bersicht. Fortschr. Neurol. Psychiat. 60: 28–40
lesion patients had problems in attentional shifting Akshoomhoff N, Courchesne E, Press G, Iragui V 1992
between modalities (Akshoomhoff et al. 1992). This Contribution of the cerebellum to neuropsychological func-
pattern might be explained by impaired cerebellar– tioning: Evidence from a case of cerebellar degenerative
prefrontal interaction where ‘prefrontal’ activation, disorder. Neuropsychologia 30: 315–28

Daum I, Ackermann H 1995 Cerebellar contributions to cognition. Behavioural Brain Research 67: 201–10
Daum I, Ackermann H 1997 Neuropsychological abnormalities in cerebellar syndromes - fact or fiction? In: Schmahmann J D (ed.) The Cerebellum and Cognition. Academic Press, San Diego, CA
Daum I, Ackermann H, Schugens M M, Reimold C, Dichgans J, Birbaumer N 1993a The cerebellum and cognitive functions in humans. Behavioral Neuroscience 107: 411–9
Daum I, Schugens M M, Ackermann H, Lutzenberger W, Dichgans J, Birbaumer N 1993b Classical conditioning after cerebellar lesions in humans. Behavioral Neuroscience 107: 748–56
Decety J, Sjöholm H, Ryding E, Sternberg G, Ingvar D H 1990 The cerebellum participates in mental activity: Tomographic measurements of regional cerebral blood flow. Brain Research 535: 313–7
Desmond J E, Gabrieli J D E, Glover G H 1998 Dissociation of frontal and cerebellar activity in a cognitive task: Evidence for a distinction between selection and search. Neuroimage 7: 368–76
Dichgans J, Diener H C 1984 Clinical evidence for functional compartmentalisation of the cerebellum. In: Bloedel J, Dichgans J, Precht W (eds.) Cerebellar Functions. Springer, Berlin, pp. 126–47
Doyon J 1997 Skill learning. In: Schmahmann J D (ed.) The Cerebellum and Cognition. International Review of Neurobiology 41: 273–96
Doyon J, Laforce R Jr, Bouchard G, Gaudreau D, Roy J, Poirier M, Bedard P J, Bedard F, Bouchard J P 1998 Role of the striatum, cerebellum, and frontal lobes in the automatization of a repeated visuomotor sequence of movements. Neuropsychologia 36: 625–41
Fiez J A, Petersen S E, Cheney M K, Raichle M E 1992 Impaired nonmotor learning and error detection associated with cerebellar damage. Brain 115: 155–73
Grafman J, Litvan I, Massaquoi S, Stewart M, Sirigu A, Hallett M 1992 Cognitive planning deficits in patients with cerebellar atrophy. Neurology 42: 1493–96
Hallett M, Grafman J 1997 The cerebellum and cognition: Executive function and motor skill learning. International Review of Neurobiology 41: 297–323
Helmuth L L, Ivry R B, Shimizu N 1997 Preserved performance by cerebellar patients on tests of word generation, discrimination learning, and attention. Learning and Memory 3(6): 456–74
Ito M 1993 Movement and thought: Identical control mechanisms by the cerebellum. Trends in Neurosciences 16: 448–50
Ivry R B, Keele S W, Diener H C 1988 Dissociation of the lateral and medial cerebellum in movement timing and movement execution. Experimental Brain Research 73: 167–80
Jueptner M, Weiler C 1998 A review of differences between basal ganglia and cerebellar control of movements as revealed by functional imaging studies. Brain 121(8): 1437–49
Keele S W, Ivry R 1991 Does the cerebellum provide a common computation for diverse tasks? A timing hypothesis. Annals of the New York Academy of Sciences 608: 179–211
Kim S G, Ugurbil K, Strick P L 1994 Activation of a cerebellar output nucleus during cognitive processing. Science 265: 949–51
Schmahmann J D (ed.) 1997 The Cerebellum and Cognition. International Review of Neurobiology Vol. 41
Schugens M M, Topka H R, Daum I 2000 Eyeblink conditioning in neurological patients with motor impairments. In: Woodruff-Pak D S, Steinmetz J E (eds.) Eyeblink Classical Conditioning: Vol. I Applications in Humans. Kluwer Academic Publishers, Boston
Stewart T G, Holmes G 1904 Symptomatology of cerebellar tumors: A study of forty cases. Brain 27: 522–91
Thompson R F 1991 Are memory traces localized or distributed? Neuropsychologia 29: 571–82
Topka H, Valls-Sole J, Massaquoi S G, Hallett M 1993 Deficit in classical conditioning in patients with cerebellar degeneration. Brain 116: 961–9
Wallesch C W, Horn A 1984 Long-term effects of cerebellar pathology on cognitive functions. Brain and Cognition 14: 19–25

B. Suchan and I. Daum


Cerebral Cortex: Organization and Function

If it is possible to distinguish a hierarchy of complexity in behavior, ranging from the simplest reflex to the so-called cognitive functions, and if behavior is related to the structural organization of the brain, there is no doubt that the cerebral cortex corresponds to the highest levels. The main reasons for this generally accepted notion are the following: (a) the cerebral cortex is not only the largest piece of cerebral gray matter in humans and in other mammals, but it is also the one with the most impressive system of internal connections, suggesting an essentially global operation; (b) there are good reasons to believe that most of the connections between cortical neurons are of the ‘plastic’ kind, i.e., they are modified through learning and thus incorporate knowledge about the world; and (c) lesions of the cerebral cortex often impair behavior in a realm that clearly belongs to the psychological level, such as language, orientation, and perception. This pre-eminence of the cerebral cortex in the control of complex behavior can be related to some of its anatomical and physiological peculiarities.

1. The Structural Type of the Cortex

The term cortex defines a class of brain structures characterized by an essentially two-dimensional layout. A ‘vertical’ organization along the ‘thickness’ of the cortex is repeated almost identically throughout the ‘plane’ of the cortex. The thickness of the cerebral cortex in humans varies between about 2 and 4 mm (in the mouse 1 mm) while in the plane the cortex of one hemisphere covers an area of about 1000 cm² (in the mouse about 1 cm²). Besides the cerebral cortex of mammals, the cerebellar cortex, the optic tectum, and many other structures of vertebrate and invertebrate brains are built according to a similar two-dimensional
scheme. What they all have in common is a diffuse projection of inputs over the whole plane, which is transformed by a unitary operation, varying from cortex to cortex, into the output. In many places the plane of the ‘cortex’ represents point to point some external input space such as the visual field or the surface of the body. Also, at every point of the cortical plane, output fibers leave the cortex to reach a variety of destinations, including other parts of the cortex itself. In the cerebral cortex of mammals both input and output fiber systems are on the same side of the cortical plane, forming the so-called ‘hemispheric white matter’ (with some minor exceptions in a marginal region, where the olfactory input enters the cortex in the uppermost layer, see olfactory input O in Fig. 1). In addition to the cortico-cortical fibers, sensory input fibers and motor output fibers shown in Fig. 1, the white matter contains a diffuse system of cortical fibers reaching the basal ganglia, as well as a system of two-way connections between cortex and thalamus.

Figure 1
The basic connectivity of the cortex. Its most numerous neuronal contingent is pyramidal cells, connected into an excitatory network both locally by a rich system of axon collaterals (indicated in B) and over longer distances via the white matter beneath the cortex, terminating mainly in the upper cortical layers (indicated in C). M: primary motor cortex with axons going to the spinal cord, S: primary sensory area, receiving afferent fibers from a sensory system (via the thalamus). O: olfactory region at the edge of the cortex. From Braitenberg (1977) On the Texture of Brains. Springer, Berlin

2. Layers

As in other cortices, the input from distant places reaches the cerebral cortex predominantly at a special level of the cortical thickness. Similarly, the output to other parts of the brain takes its origin from another level. This distinction of input and output levels, together with the levels where most of the internal traffic of the cortex takes place, is at the origin of the well-known laminar structure of the cerebral cortex. The most common distinction is one of six layers, numbered (usually by Roman numerals) from the top down (from the free surface to the white matter). The most characteristic features of the various layers with respect to input and output are the following: the upper layers, layers I to III, are devoted to communication between distant parts of the cortex within or between hemispheres. Layer IV is the level at which sensory input fibers terminate (relayed from the thalamus), or fibers mostly from other parts of the cortex directly relaying such inputs. Layer V sends fibers to the basal ganglia and to distant parts of the brain or to the spinal cord. Layer VI communicates with the uppermost layers, as well as with the thalamus.

All of these statements are only statistically valid: a certain number of thalamic fibers can also reach layers I, III, and VI, and layers IV to VI also participate to a certain degree in cortico-cortical communication. In spite of this, the distinction of the layers also gains support in the appearance of histological sections through the cortex. When the neural cell bodies are stained, the layers differ both in the number and the size of the neurons they contain. They also differ in the
density of fibers in myelin preparations (Fig. 2). These differences are certainly related to the different roles the layers play in cortical information handling.

Figure 2
Examples of myeloarchitectonic differences between areas in the human cortex. Both stripes of Baillarger can be seen in the upper right (middle frontal gyrus) and lower left (Broca region), only the outer stripe can be recognized in the upper left (upper frontal gyrus) and both of them disappear in the strong myelination of the primary motor cortex (lower right). From: Braitenberg and Schüz (1989) Cortex: hohe Ordnung oder größtmögliches Durcheinander? Spektrum der Wissenschaft 5: 74–86

The complexity of interactions both within and between layers can be appreciated by staining individual neurons in their entirety, for example with the time-honored Golgi method or with the modern techniques of intracellular injection of dyes. These methods show cortical neurons as three-dimensional devices which collect signals in the region of their dendritic trees and distribute them in the region (or regions) of their axonal terminations. The diameters of axonal and dendritic trees greatly exceed (by a factor of 10 to 100 or more) the distance between the corresponding cell bodies in the tissue. The result is an extremely dense felt of cell processes in which each neuron is interwoven with about 100,000 other neurons. Taking this into account, it follows that the borders between the layers cannot be as sharp as they sometimes appear in textbook diagrams, since they are always crossed by dendrites and axons of many neurons in adjoining, and even more distant layers (Fig. 1).

3. Types of Neurons in the Cortex

The pattern of axonal and dendritic ramifications varies a great deal between individual neurons, depending on their localization in different layers and in different parts of the cortex. This led to the classification of a great number of neuronal types, where one overriding distinction is now accepted by most authors, that of the spiny neurons (often subsumed under the term pyramidal cells or Type I neurons), and of the spineless neurons (often subsumed under the
term stellate or Type II neurons). This distinction is supported by differences in the fine structure of membrane specializations as they appear in the electron microscope both on the dendritic and axonal tree, and by electrophysiological findings. Briefly, spiny neurons receive most of their synaptic input on ‘spines,’ i.e., small processes emanating from the dendritic tree, and make ‘excitatory’ synapses onto other neurons via their axonal tree. Spineless neurons receive their synapses directly on their dendrites and ‘inhibit’ other neurons via their axonal tree. This distinction coincides with another anatomical feature: most spiny neurons have an axon which descends to the white matter and makes both ‘short-range connections’ via local axon collaterals and ‘long-range connections’ somewhere else in the brain or, in most cases, somewhere else in the cortex (Fig. 1). In contrast, the axon of a spineless stellate cell does not enter the white matter and only contacts other neurons in its vicinity.

Spiny neurons are the great majority (about 85 percent) of all neurons in the cerebral cortex. This category largely coincides with that of pyramidal cells of older classifications, characterized by a bipartite dendritic tree, with several ‘basal dendrites’ distributed around the cell body, and one ‘apical dendrite,’ ascending vertically through the cortex and ramifying in upper layers (Fig. 1). There are, however, also spiny neurons without an apical dendrite, sometimes termed ‘spiny stellate cells,’ especially as recipients of primary sensory input in layer IV of primary sensory areas. Disregarding a subpopulation of these which does not project to the white matter, the basic connectivity of cortical neurons can be described as follows (Braitenberg and Schüz 1998).

4. The Basic Connectivity of the Cerebral Cortex

The ‘skeleton cortex’ (Fig. 1) consists of a large number of pyramidal cells (of the order of 10¹⁰ in humans), distributed throughout all layers of the cortex, and producing excitatory synapses which for the most part again contact other pyramidal cells. There is considerable divergence and convergence built into this system, since every pyramidal cell communicates with many thousands of other pyramidal cells both in its input and in its output. Thus, ‘diffuse excitatory feedback’ is the pre-eminent feature of the cerebral cortex. Only a small percentage of pyramidal cells project to other parts of the brain and only a few percent of the synapses in the cortex come from neurons from other parts of the brain. By far the greatest part of synaptic traffic in the cortex is internal and only involves cortical neurons.

The inhibitory stellate cells are diffusely distributed in the network of pyramidal cells. In some cases their inhibitory synapses have a strong grip on a relatively small number of cortical neurons in their vicinity (pyramidal and stellate). In other cases their inhibitory synapses are distributed more profusely, making the distinction of several classes of stellate cells possible according to the pattern of their ramification (basket cells, chandelier cells, double bouquet cells, etc.; see Peters and Jones 1984). They, too, receive their input mainly from cortical pyramidal cells. In the regions where primary sensory input (e.g., visual) enters the cortex, they may also be directly contacted by the incoming fibers.

Although the inhibitory interneurons in some cases are certainly involved in the computation which takes place in the cortical network, their role may be considered as ancillary with respect to that of the pyramidal cells. They may perhaps put a brake on neuronal activity when it threatens to explode in a runaway reaction, as may be expected in a network of elements, such as the pyramidal cells, all exciting each other. Indeed, this braking action occasionally fails, as evidenced by epileptic fits, one of the commonest forms of functional derailment in the cortex.

The comparison with other parts of the brain shows that a network of mainly excitatory connections is indeed a striking peculiarity of the cerebral cortex. In the other major parts of the brain the majority of neurons are either connected into an inhibitory network (basal ganglia) or they do not form a network among themselves but relay some input to some output in a feedforward manner (cerebellar cortex, thalamus).

5. The Basic Function

The question of what could possibly be the advantage of an immense network of interconnected excitatory neurons, whose increase in size has accompanied mammalian evolution up to the crowning event of human culture, has received various tentative answers. The most convincing interpretation of cortical structure is based on the observation that the synapses between pyramidal cells, the majority of all synapses in the cortex, are of the special kind residing on dendritic spines, and on the supposition that spine-synapses are ‘plastic,’ i.e., modifiable by ‘learning.’ There is no definitive proof of this supposition, but enough indirect evidence to make it plausible (see review by Horner 1993), and, moreover, alternative explanations for the role of spines are less convincing. Be this as it may, there is little doubt that, of all parts of the brain, the cerebral cortex is the one most concerned with the acquisition of knowledge. Suffice it to say that lesions of the cerebral cortex impair complex acquired capacities, such as language.

In terms of engineering, the network of cortical pyramidal cells could be likened to a giant ‘associative memory,’ a device which connects together more strongly (through modifiable synapses) the neurons
that are often active at the same time. Thus, events of the outside world which tend to present themselves together will be represented in the brain by neurons tied together by strong synapses. The idea that the cortex incorporates knowledge by repeating, in its synaptic connectivity, the structure of reality was spelled out in two well-known theories. In 1949, D. O. Hebb proposed ‘cell assemblies’ as the units of cognitive operations in the brain. These are ensembles of cortical neurons connected to each other by excitatory synapses which have been strengthened in the course of a learning process. Such cell assemblies may be thought of as representing ‘objects’ of the world. Due to activity reverberating among its neurons, such an assembly stays active for some time once it has been activated, a property which perhaps is at the basis of ‘short-term memory.’ Also, a cell assembly may become active in its entirety even if only a subset of its neurons is activated initially, in a way reminiscent of the phenomenon of ‘pattern completion’ well known to perceptual psychologists.

‘Synfire chains’ (Abeles 1991) are the other theoretical proposal based on associative memory. Quite compatible with the anatomy of the cortical network and with the physiological properties of single neurons, it is possible to imagine sets of neurons, each set, when activated in synchrony, activating another such set, and this in turn another one, etc., forming long chains of activity propagating through the cortex with great selectivity and temporal precision. This scheme explains how events displayed in time (e.g., the words of a language, musical themes, complex movements, etc.) are incorporated in memory. The temporal precision postulated by this theory has been impressively verified by correlation studies on spike activity of different cortical neurons (Abeles and Prut 1996).

Between Hebbian cell assemblies representing ‘objects’ of the world, and Abelesian synfire chains representing ‘events,’ there may be transitional forms of cortical activity, all based on the idea of modifiable synapses embodying the notion of synchronous events, or of events occurring in succession, acquired through a process often called ‘Hebbian learning.’

6. Cortical Areas

From the ‘macroscopic layout’ of the cortical network it is also possible to gain insights into its organization. It has been known for some time that restricted lesions of the cortex produce different symptoms according to their localization. This led to the definition of ‘cortical areas,’ distinct regions of the cortex a few centimeters across (in humans) whose reality has since been confirmed by many detailed electrophysiological studies. There are ‘sensory areas’ (visual, acoustic, somatosensory, olfactory), ‘motor areas,’ and ‘association areas.’ The latter in part have been recognized as secondary and tertiary stations in the elaboration of primary sensory input, as in the case of special areas for the extraction of motion (area V5) or color (area V4) in the visual scene (Zeki 1993), or in the case of a special area (Wernicke’s area) fed by acoustic input in the context of language, distinct from other areas concerned with acoustic perception. Evidently, besides the various sensory inputs reaching the cortex in separate regions, and the long efferent (e.g., cortico-spinal) axons emanating from other regions, it is the context in which the cortex operates locally that defines the areas.

Apart from the different effects of lesions, cortical areas were also delimited on the basis of subtle, and sometimes also quite evident, differences in the appearance of their layers and in the degree of myelination (‘cortical architectonics’). For example, a small-celled layer IV is particularly well developed in primary sensory areas while it cannot be discerned at all in the primary motor area (Area 4). The primary motor area stands out because of a population of very large cells in layer V, the cortico-spinal neurons. Large pyramidal cells in layer III are found in some association areas, including the speech centers. Two heavy bands of horizontal myelinated fibers (the ‘stripes of Baillarger’) are evident in some areas and only one such band in others (Fig. 2). Overall myelination tends to decrease towards higher association areas. Such structural differences were at the basis of ‘cortical maps,’ the best known of which, the map by K. Brodmann from 1909, distinguishes about 50 areas. Although this large number was met with skepticism at first, it is remarkable that many of the structural distinctions were later shown to correspond to different properties of neurons when cortical mapping was undertaken anew by microelectrode analysis. More recent maps tend to assume even higher numbers of areas.

Evidently, the neuronal network of the cortex, although built according to common principles throughout its extent, is subject to variations that adapt it locally to different kinds of input and to different kinds of computation to which the input is subjected. The variations which appear in the Nissl (cell body) picture and in the myelin preparations (cyto- and myeloarchitectonics) are just two aspects of the underlying variations of the basic network, and indeed the two were shown by Hellwig (1993) to be related to each other by a set of simple rules.

7. Columns

Beyond the confirmation of the different nature of individual cortical areas, microelectrode neurophysiology in many cases revealed an even finer mapping within areas, the so-called ‘columns.’ In the primary visual area V-1 (Brodmann’s area 17) certain properties to which neurons are tuned, such as responding to stimuli from the right or left eye, or to vertical,
1638
Cerebral Cortex: Organization and Function

horizontal, or oblique stripes, periodically recur over the surface of the cortex at distances of about 0.5 or 1 mm (Hubel 1988). The term ‘column’ was chosen for these small compartments within the areas since the locally specific properties of neurons in one column tend to be similar for neurons in all, or in several layers. Columns (or slabs) of neurons with similar properties, but different in a systematic way from column to column, have also been described in secondary visual areas, in the somatosensory cortex, and in the primary acoustic cortex. In many cases, these functionally defined columns can be attributed to the way the input is organized: inputs from different sources tend to enter the cortex in alternating bundles (e.g., right eye vs. left eye in the visual cortex, fibers from the same or the opposite hemisphere in the auditory and prefrontal cortex), thus imposing periodical inputs onto a largely homogeneous looking intracortical network. However, periodicity can also be found in the pattern of axonal arborizations of pyramidal cells within cortical areas where they preferably project to columns with similar functional properties (Malach et al. 1997, Yoshioka et al. 1996). Occasionally, columns may also be visible in the arrangement of cell bodies and dendritic trees (whisker representation in rodents), or as the ‘cytochrome oxidase blobs’ of the visual cortex. However, neither areas nor columns can be regarded as separate entities; the fiber felt is continuous across the borders between them.

8. Connections Between Areas
The reality of cortical areas is also evident in the pattern of connections between them. This can be studied (in animals) by injecting certain dyes locally into the cortex, which are then transported in axons either in the direction away from the cell body (‘anterograde transport’) or towards it (‘retrograde transport’) depending on the particular dye chosen. The overall picture is the following (Young et al. 1995): individual areas are connected to several other areas, but not to all. Areas can differ considerably with respect to the number of areas they are connected to. Some primary sensory areas are connected to only a few other areas; higher association areas can be connected to more than one third of all cortical areas. The majority of the connections between areas are reciprocal. In the monkey cortex, if an area A projects to another area B, there is also a projection from B to A in 82 percent of the cases. Areas that are located close to each other in the cortex, often related to the same sensory modality, are more likely to be connected, but there are also connections between areas quite far apart. Most areas are connected to their symmetrical partners in the other hemisphere. The exceptions are in parts of the visual and somatosensory areas, which do not contribute fibers to the corpus callosum, the main interhemispheric fiber bundle.
The strong internal connectedness of the cortex in each hemisphere is reflected also in the large volume of the hemispheric white substance (the cerebellar hemispheres by comparison have a very scanty white substance, containing only afferent and efferent fibers). In the human brain and in the brains of other large mammals, within the white matter underlying the cortex it is possible to discern about six large bundles connecting distant parts of the cortex, such as the occipito-frontal fascicle or the arcuate fascicle between the motor and the sensory speech areas. The rest of the white matter is composed of shorter fibers between neighboring areas, and to a lesser extent, of commissural fibers, connecting both hemispheres via the corpus callosum, and of the fibers connecting the cortex, both ways, with subcortical centers. The fact that the volume occupied by the long cortico-cortical bundles is a small fraction of the total volume of the white substance suggests that there is a hierarchical organization, with a great amount of preprocessing between neighboring areas, often within one modality, preceding the global integration.

See also: Brain Asymmetry; Brain, Evolution of; Brain: Response to Enrichment; Cingulate Cortex; Electrical Stimulation of the Brain; Functional Brain Imaging; Long-term Potentiation and Depression (Cortex); Motor Cortex; Neural Plasticity; Neural Plasticity in Auditory Cortex; Neural Plasticity in Visual Cortex; Prefrontal Cortex; Pre-motor Cortex; Split Brain; Topographic Maps in the Brain; Visual System in the Brain

Bibliography
Abeles M 1991 Corticonics: Neural Circuits of the Cerebral Cortex. Cambridge University Press, Cambridge, UK
Abeles M, Prut Y 1996 Spatio-temporal firing patterns in the frontal cortex of behaving monkeys. Journal of Physiology 90: 249–50
Braitenberg V, Schüz A 1998 Cortex: Statistics and Geometry of Neuronal Connectivity, 2nd edn. Anatomy of the Cortex: Statistics and Geometry 1991 Springer, Berlin, Heidelberg, New York
Hebb D O 1949/1961 Organization of Behavior. A Neuropsychological Theory, 2nd edn. Wiley & Sons, New York
Hellwig B 1993 How the myelin picture of the human cerebral cortex can be computed from cytoarchitectural data. A bridge between von Economo and Vogt. Journal of Brain Research 34(3): 387–402
Horner C H 1993 Plasticity of the dendritic spine. Progress in Neurobiology 41: 281–321
Hubel D H 1988 Eye, Brain and Vision. The Scientific American Library, New York
Malach R, Schirman T D, Harel M, Tootell R B H, Malonek D 1997 Organization of intrinsic connections in owl monkey area MT. Cerebral Cortex 7: 386–93
Peters A, Jones E G (eds.) 1984 Cerebral Cortex, Cellular Components of the Cerebral Cortex. Plenum Press, New York, London, Vol. 1


Yoshioka T, Blasdel G G, Levitt J B, Lund J S 1996 Relation between patterns of intrinsic lateral connectivity, ocular dominance, and cytochrome oxidase-reactive regions in macaque monkey striate cortex. Cerebral Cortex 6: 297–310
Young M P, Scannell J W, Burns G 1995 The Analysis of Cortical Connectivity. Springer, Berlin, Heidelberg, New York
Zeki S 1993 A Vision of the Brain. Blackwell Scientific Publications, London, Edinburgh

A. Schüz and V. Braitenberg

Copyright © 2001 Elsevier Science Ltd. All rights reserved.

Change: Methods of Studying


The themes of stability and change run through thought from the Greek philosophers to the present. Following Heraclitus, change seems to be a ubiquitous phenomenon: ‘all is flowing (Panta rei)’ or ‘you cannot get into the same river twice.’ The opposite position was held by Parmenides, who said, ‘something is real only if we can say it is, if it was or will be, it is not real. Change is an illusion.’ Stability and change are basic concepts within psychology, too. The aim of this article is to give an overview of methods of studying change. As a first step, important concepts of change and stability are introduced. For each of the different concepts of stability, change is defined as the absence of stability.

1. Concepts of Change and a System of Categories to Differentiate Them

1.1 Basic Concepts of Change
Traditionally, there are at least two important basic concepts of stability and change: absolute and normative stability (Baltes et al. 1977). Whereas normative stability is only defined at the group level, absolute stability can be defined at the individual and the group level. Absolute stability for an individual means that the value of a variable for this individual does not change between occasions of measurement. Absolute stability at the group level means that the group mean of a variable does not change between observations. Figure 1(c) shows absolute stability at the individual level for individual 3 but not for individual 1. At the same time there is stability at the group level, because the mean of the whole sample does not change. Figure 1(a) describes change in absolute stability for each individual and at the group level.
Normative stability is given if the rank order with respect to a variable does not change between two occasions of measurement. Imagine, for instance, that the variable control belief is measured on two occasions. Each individual starts from a different level of control belief and then shows an increase by the same amount (cf., Fig. 1(a)). The rank correlation is 1 and, therefore, normative stability is given. Figure 1(c) illustrates change with respect to normative or rank-order stability: the rank order is reversed from time 1 to time 2. In this sample, the stability coefficient measured as rank correlation between occasions is $r = -1$.
Another concept of stability/change is stability of variability. Stability of variability requires that the variance of a variable does not change between two occasions of measurement. Figure 1(b) shows an increase in the variability across situations. The example in Fig. 1(b) demonstrates that change in the variability across occasions can occur despite absolute and normative stability. To conclude, these examples demonstrate that there exist different concepts of stability/change and that stability can be given for one concept, but not necessarily for the others.

Figure 1
(a) Concepts for univariate stability/change: change of absolute value (individual and group level) but normative stability; (b) change of variability but normative stability; (c) change of rank order but absolute stability (group level)

1.2 A Classification System for Stability/Change Concepts
To gain an even fuller picture of change concepts, a classification system of change concepts will be outlined, which takes into account the following:


International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7



(a) individual change or change within groups;
(b) number of variables: univariate or multivariate change;
(c) number of and time distance between occasions: two or multiple measurements;
(d) scaling of the variables: categorical, ordinal, interval scale.
For reasons of economy, other important dimensions are not (explicitly) included within this classification:
(e) change can be located at the surface or the construct level;
(f) (measured variable or latent construct; phenotype or genotype);
(g) trend, growth, and rhythm;
(h) change of the (dimensional) structure;
(i) change in measurement error;
(j) synchronicity, asynchronicity;
(k) change and causality.

2. Examples of Important Methods for Studying Change
When outlining a system of categories for change concepts, it becomes evident that by combining the different dimensions of the system a large number of change concepts could be derived. Within this short description, however, only a subset can be explained in more detail.

2.1 Analyzing Change by Using Differences Between Two Occasions
This way of studying change is frequently used in research. The way of measuring change shown in Fig. 1(a) is based on comparing measures of one variable for a sample of individuals for two occasions. The question is whether there is a change in the mean of the variable. To analyze this question, one computes differences between time-2 and time-1 measures. Critiques of this approach have pointed to various problems:
(a) differences are dependent on the level of the first measurement;
(b) differences are not reliable (not as much as the original variable);
(c) regression to the mean leads to systematic biases.
(a) The law of initial values states that high initial values correlate with small differences between second and first measurement and vice versa, which means that there is a negative correlation between initial status and change.
(b) From the assumptions of classical test theory, Lord (1963), for example, derived an equation for the reliability of differences. He argued that if there are two fallible measures, the resulting measure must be worse. This means that if there is a high correlation between the first and second measurements and the reliabilities of the single measurements are medium or high, then the reliability of the difference is low (numerical example: if $r_{11} = r_{22} = 0.84$ and $r_{12} = 0.83$, then it follows that $r_{dd} = 0.06$).
(c) Regression to the mean says that extreme values for the first occasion will probably be followed by more average values in the second measurement. The mathematical formulation for regression to the mean is $s(y_{\mathrm{pred}}) < s(x)$ ($y_{\mathrm{pred}}$ are the predicted values). It can be shown easily that regression to the mean is a simple mathematical tautology depending on the assumption of equal variances for pre- and post-test.
For each of these points of critique, Rogosa (1995) gives examples showing that they do not hold in general and that they can be subsumed under the myths of longitudinal research. Therefore, difference scores can be used if there are only two occasions of measurement.

2.2 Cross-lagged-panel Analysis
After studying change using difference scores, we turn to the study of causal relations using change measurements. Consider, for instance, the theoretical question of whether there exists a causal relationship between control belief and academic performance and vice versa. The empirical basis for computing estimates of cross-lagged-panel relationships is that a sample of individuals is measured (at least) twice with respect to (at least) two variables.
The cross-lagged-panel model contains two important cross-lagged paths (a) between control belief (time 1) and performance (time 2) and (b) between performance (time 1) and control belief (time 2). These models can be estimated and tested for significance, yielding possible causal (in the sense of time-lagged relationships; see Schmitz 1990) relationships: no causality, unidirectional causality, or bidirectional causality (feedback).

2.3 Analysis of Time Series
The data basis for analysis of time series is that at least one individual is studied for at least one variable, using many occasions of measurement. Whereas the methods in Sect. 2.1 require only a small database (two occasions of measurement), for time series analysis one often needs more than 20 or even 50 data points. One might think that it would be difficult to collect such data, but in psychophysiology, behavioral observations, and diary approaches, it is feasible to collect this kind of data. One of the great advantages of the time-series approach is that one can test hypotheses also for


Figure 2
Cross-lagged relationships for two variables
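
As a rough numerical companion to Sections 2.1 and 2.2 and to Fig. 2, the following Python sketch (standard library only) first recomputes the reliability-of-differences example and then estimates the two cross-lagged paths on simulated data. It is an illustration under stated assumptions, not the procedure of the cited authors: the equal-variance form of the difference-score reliability formula, all function names, the simulated effect sizes, and the use of standardized two-predictor regression weights as path estimates are choices made here.

```python
import math
import random


def reliability_of_difference(r11, r22, r12):
    """Classical-test-theory reliability of a difference score,
    assuming equal variances at both occasions (assumption of this sketch)."""
    return ((r11 + r22) / 2 - r12) / (1 - r12)


def corr(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def cross_lagged_path(cause_t1, outcome_t1, outcome_t2):
    """Standardized weight of cause_t1 when outcome_t2 is regressed on
    cause_t1 and outcome_t1 (controlling for the outcome's own past)."""
    r_cy = corr(cause_t1, outcome_t2)
    r_oy = corr(outcome_t1, outcome_t2)
    r_co = corr(cause_t1, outcome_t1)
    return (r_cy - r_oy * r_co) / (1 - r_co ** 2)


def simulate_panel(n=2000, seed=1):
    """Hypothetical data: control belief at time 1 influences performance
    at time 2, but performance at time 1 does not influence later belief."""
    rng = random.Random(seed)
    belief1 = [rng.gauss(0, 1) for _ in range(n)]
    perf1 = [0.3 * b + rng.gauss(0, 1) for b in belief1]
    belief2 = [0.6 * b + rng.gauss(0, 1) for b in belief1]
    perf2 = [0.5 * p + 0.4 * b + rng.gauss(0, 1)
             for p, b in zip(perf1, belief1)]
    return belief1, perf1, belief2, perf2


belief1, perf1, belief2, perf2 = simulate_panel()
path_belief_to_perf = cross_lagged_path(belief1, perf1, perf2)
path_perf_to_belief = cross_lagged_path(perf1, belief1, belief2)

# Reliability of the difference for r11 = r22 = 0.84, r12 = 0.83
print(round(reliability_of_difference(0.84, 0.84, 0.83), 2))
# Estimated cross-lagged paths (a) belief -> performance, (b) performance -> belief
print(path_belief_to_perf, path_perf_to_belief)
```

In this simulated panel only path (a) was built into the data, so a clearly nonzero estimate for (a) together with a near-zero estimate for (b) corresponds to the "unidirectional" pattern; a significance test would still be needed before drawing that conclusion from real data.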

a single individual. A typical research question may be to study the effect of an intervention (e.g., does a training have an effect on control belief?) and the identification of trends and rhythms. A variable shows a linear trend if the increase from one occasion to the next is constant. An example of a variable which shows a simple rhythm is workload measured daily: usually, the workload is high during workdays whereas at weekends people will not work. In the bi- or multivariate case one can test synchronous or asynchronous relationships (see Schmitz and Skinner 1993). Variables show synchronous relationships if they show similar patterns over time (e.g., days); they show asynchronous relationships if, e.g., one variable lags behind another variable (shows a similar pattern just one day later). One can analyze, for instance, intraindividual causal relationships between variables. If there are more individuals, one can perform intraindividual time-series analyses for each individual and then combine the individual parameters at the aggregate level, as is done in hierarchical linear modeling (HLM) (Bryk and Raudenbush 1987) or in the approach proposed by Schmitz et al. (1996). Both methods can be regarded as solutions to the idiographic–nomothetic controversy. (See Fig. 2.)

3. Other Methods and Outlook
A special case of the general time-series models are chaos models, which are nonlinear. One characteristic of chaos is the severe dependence on the initial conditions, which is often referred to as the butterfly effect. There are simple chaotic systems which can be described by only one so-called order parameter. These systems converge to homeostasis if the order parameter stays within certain limits. However, if the parameter changes slightly the system shows chaotic behavior (Alligood et al. 1997).
Sometimes the assumption of interval data is not fulfilled, and in these cases synchronicity and asynchronicity can be studied using Markov models (see Gottman and Roy 1990). If information is given on how long it takes until an event occurs, methods of event-history analysis can be applied (Willett and Singer 1995).
In sum, methods of studying change based on poor information (as in two-occasion measurements) can only lead to poor results (one cannot derive conclusions that are valid for individuals), whereas methods which use the information contained in rich (e.g., time-series) data can lead to rich conclusions (e.g., which are also valid for individuals). If someone is really interested in stability and change (although it requires a great deal of effort to conduct a longitudinal study), one should not always apply only simple methods (such as differences) and small databases (such as two measurement points) for economical reasons, as there are alternative methods which may provide more differentiated results. To conclude, it is time to avoid shortcomings of studying change or, as Rogosa (1995 p. 8) put it, ‘Myth 1: two observations a longitudinal study make.’

See also: Chaos Theory; Event-history Analysis in Continuous Time; Event Sequence Analysis: Methods; Longitudinal Data; Longitudinal Data: Event–History Analysis in Discrete Time; Longitudinal Research: Panel Retention; Markov Models and Social Analysis; Markov Processes for Knowledge Spaces; Time Series: General

Bibliography
Alligood K T, Sauer T D, Yorke J A 1997 Chaos. An Introduction to Dynamical Systems. Springer, New York
Baltes P B, Reese H W, Nesselroade J R 1977 Life-span Developmental Psychology: Introduction to Research Methods. Brooks/Cole, Monterey, CA
Bryk A S, Raudenbush S W 1987 Application of hierarchical linear models to assessing change. Psychological Bulletin 101: 147–58
Gottman J M, Roy A K 1990 Sequential Analysis. A Guide for Behavioral Researchers. Cambridge University Press, Cambridge, UK
Lord F M 1963 Elementary models for measuring change. In: Harris C W (ed.) Problems in Measuring Change. University of Wisconsin Press, Madison, WI, pp. 21–38


Rogosa D R 1995 Myths and methods: ‘myths about longitudinal research’ plus supplemental questions. In: Gottman J M (ed.) The Analysis of Change. Erlbaum, Mahwah, NJ, pp. 3–66
Schmitz B 1990 Univariate and multivariate time-series models: the analysis of intraindividual variability and intraindividual relationships. In: von Eye A (ed.) Statistical Methods in Longitudinal Research, Volume II: Time Series and Categorical Longitudinal Data. Academic Press, Boston, pp. 351–86
Schmitz B, Skinner E 1993 Perceived control, effort, and academic performance: interindividual, intraindividual, and multivariate time-series analyses. Journal of Personality and Social Psychology 64: 1010–28
Schmitz B, Stanat P, Sang F, Tasche K G 1996 Reactive effects of a survey on the television viewing behavior of a telemetric television audience panel: a combined time-series and control group analysis. Evaluation Review 20: 204–29
Willett J B, Singer J D 1995 Investigating onset, cessation, relapse, and recovery: using discrete-time survival analysis to examine the occurrence and timing of critical events. In: Gottman J M (ed.) The Analysis of Change. Erlbaum, Mahwah, NJ, pp. 203–59

B. Schmitz

Chaos Theory

Chaos theory is the study of deterministic difference (differential) equations that display sensitive dependence upon initial conditions (SDIC) in such a way as to generate time paths that look random.
This article (a) explains what ‘chaos’ is in mathematics, (b) explains why economists became interested in this concept, (c) sketches economic forces that can ‘smooth’ out dynamical irregularities that can lead to chaos, (d) very briefly discusses theoretical models that generate chaotic dynamical systems as equilibria, and (e) discusses empirical testing for chaos. We shall spend most of the time on (e) because even though the vast bulk of studies has not found convincing evidence for chaotic dynamics that are short term predictable (out-of-sample) using nonlinear methods with high ability to detect chaos, the quest for chaos has generated useful statistical methods.

1. Chaos
The mathematical apparatus for rigorous treatment of chaos can be quite intimidating to the nonmathematician. Therefore we shall attempt to explain the concepts verbally as much as possible. This expositional strategy will allow the reader to decide whether it is worth their while to investigate this area further. References to wide-ranging surveys are given.
We follow Brock (1986) for a very brief treatment of some mathematics of chaos below. A deterministic discrete time dynamical system on n-dimensional space is a system,

$$x(t+1) = F(x(t)), \quad t = 1, 2, \ldots; \quad x(0) = x_0 \tag{1}$$

of n difference equations in n variables. Here $F$ is a map from n-dimensional real space to itself, $x(t)$ and $x(t+1)$ are n-dimensional vectors at date $t$ and $t+1$ respectively, $x_0$ denotes the initial condition vector, and $t$ is time.
A useful pedagogical example is the scalar ‘tent map’ $T$ from the closed interval [0, 1] to itself defined by

$$x(t+1) = T(x(t))$$

where

$$T(x) = 2x, \quad x \in [0, 0.5], \qquad \text{and} \qquad T(x) = 2 - 2x, \quad x \in [0.5, 1]. \tag{2}$$

The hallmark of chaos is SDIC, i.e., for two nearby initial conditions, $x(0)$, $y(0)$, the map $F$ magnifies the distance between them:

$$\| F(x(0)) - F(y(0)) \| > \| x(0) - y(0) \| \quad \text{for } \| x(0) - y(0) \| \text{ small enough} \tag{3}$$

for most pairs of nearby initial conditions. Here $\| \cdot \|$ denotes a distance measure. The tent map (2) illustrates this idea nicely because $\| x(0) - y(0) \|$ is magnified by a factor of two for small enough $\| x(0) - y(0) \|$ except for the rare case where $x(0)$, $y(0)$ straddle the maximizer, $x = 0.5$.
A popular way to precisely capture the idea of SDIC and to measure it is the largest Lyapunov exponent, $L$, which is defined by the following limit as $h$ tends to infinity,

$$L(x(0), v) = \lim_{h \to \infty} \frac{1}{h} \ln \left\{ \| DF^{h}(x(0)) \cdot v \| \right\} \tag{4}$$

Here $DG$ denotes the n by n derivative matrix of map $G$, $F^{h}(x(0))$ denotes the application of map $F$ to the n-vector $x(0)$ $h$ times, $v$ denotes a nonzero n-dimensional direction vector, ‘$\cdot$’ denotes dot product, $\| \cdot \|$ denotes the Euclidean norm on n-dimensional space, and ln denotes natural logarithm.
Sufficient conditions are available for $L(x(0), v)$ to exist and to be independent of $x(0)$, $v$ for most initial conditions $x(0)$ and most vectors $v$. A popular definition of chaos is:

Definition (Chaos) The map $F$ is said to be chaotic if $L > 0$.


This is not the only definition of chaotic map in the literature, but it is a popular one and we shall use it here.
The value of $L$ for the tent map $T$ is $L = \ln(2) > 0$ so $T$ is chaotic by this definition of chaos. As we saw before, $T$ displays SDIC. The quantity $L$ measures how fast (on average per iteration) a tiny measurement error (captured by $y$) in the initial condition $x(0)$ is magnified by the map. If the iteration $h$ in (4) is thought of as a forecast horizon then $\| DF^{h}(x(0)) \cdot v \|$ represents the error in an h-horizon forecast caused by measurement error $v$ at date zero. If $L > 0$, this error grows exponentially as $h$ increases, by a factor $\exp(L)$ per iteration. This kind of behavior is associated with deterministic maps generating random-looking time series output. Clearly the tent map example generates random-looking time series output. Another example of a type of chaotic map is a pseudo-random number generator for computers.
If the solution $\{x(t), t = 1, 2, \ldots\}$ to (1) is bounded for each initial condition, $x_0$, under regularity conditions the limiting behavior of chaotic dynamical systems is contained in a set $A$ that is invariant under application of $F$ which is called a ‘strange attractor.’ It is called ‘strange’ because it is not a rest point or a p-cycle. Here a p-cycle is a collection of $p$ vectors, $\{x(1), x(2), \ldots, x(p)\}$, such that

$$x(2) = F(x(1)), \; x(3) = F(x(2)), \; \ldots, \; x(p) = F(x(p-1)), \; x(1) = F(x(p)) \tag{5}$$

A rest point is a p-cycle where $p = 1$.
It is useful to briefly explain the notion of bifurcations and ‘routes to chaos.’ Consider a dynamical system with ‘fast’ and ‘slow’ variables, $x(t)$ and $a(t)$, where the fast variables $x(t)$ have a rate of change that is much faster than the slow variables. Write

$$x(t+1) - x(t) = f(x(t), a(t)), \qquad a(t+1) - a(t) = \varepsilon\, g(x(t), a(t)), \quad 0 < \varepsilon \ll 1, \tag{6}$$

where ‘$\ll 1$’ means ‘much less than one’ to capture the idea that the second difference equation in (6) moves much more slowly than the first. Hence, it is useful sometimes to assume the fast variables have already converged to an attractor conditional on the value of the slow variables. A bifurcation value is a value of the slow variables such that passing through it leads to an abrupt change of the attractor of the fast variables. For example suppose the attractor is a rest point which abruptly changes to a p-cycle, $p > 1$; or a p-cycle which abruptly changes to a more complex attractor. There is a classification theory for bifurcations (Kuznetsov 1995).
A well-studied route to chaos in one dimensional discrete dynamical systems is the Feigenbaum cascade (and closely related Sharkovsky ordering). Here a rest point (i.e., a one-cycle) bifurcates into a two-cycle, followed by bifurcation into a four-cycle, followed by bifurcation into an eight-cycle, …, followed by bifurcation into a $2^{m}$-cycle, … to fully developed chaos as a slow ‘tuning’ parameter increases. This particular route to chaos is used a lot in economics (Benhabib 1992). Some economists argue that fully developed chaos is not as useful to economic science as the analysis and classification of bifurcations themselves, because the evidence from data is stronger for the existence of bifurcations.

2. Chaotic Economic Dynamics?
Even though prediction of chaotic dynamics is futile in the long term, nonlinear prediction methods can do a good job on short term prediction when chaos is present. Consider a general stylized dynamic economic model represented by the following stochastic dynamical system

$$x(t+1) = f(R(x(t)), x(t)) + s \cdot n(t) \tag{7}$$

where $x(t)$ is a vector of economic quantities that represent the state vector of the economy at date $t$, $R(x(t))$ represents the response of economic agents to the state of the economy at date $t$, and the function $f$ produces the new state vector $x(t+1)$ at date $t+1$. Here $n(t)$ denotes a stochastic process, called ‘forcing noise,’ (often chosen to be an independent and identically distributed sequence of mean zero, finite variance random variables) which represents outside shocks to the economic system and $s$ denotes standard deviation of each random variable $n(t)$.
Many economic models (e.g., many of the models treated in Benhabib 1992) can be put into the mathematical form of Eqn. (7). If we define

$$F(x) = f(R(x), x) \tag{8}$$

then Eqn. (7) is a difference equation of the form (1) when $s = 0$. Since economic theory generates nonlinear dynamics it is theoretically easy to produce economic models of the form (8) that generate chaotic dynamics when external forcing noise is set equal to zero (e.g., Benhabib 1992, Boldrin’s chapter in Anderson et al. 1988, Dechert 1996).
An important issue is whether, in such models, the parameter values needed to obtain chaos (especially a chaos where prediction in the near term can be improved by exploiting it) are consistent with empirical measurements in economics. For example, when real interest rates are low (as they typically are), intertemporal smoothing operations such as intertemporal arbitrage (e.g., the trading of assets across different points of time in order to profit after adjustment for interest costs and for risk bearing) tend to squash cycles and chaos in economic systems with a rich enough variety of market instruments.
Brock’s chapter in Anderson et al. (1988) goes through a list of empirical plausibility checks for the magnitude of ‘frictions’ needed to obtain short term forecastable deterministic cycles and chaos at high to


usuably (usuably from a policy-relevant predictive inspired the development of new statistical methods.
point of view) low frequencies in macroeconomic and These methods have turned out to be useful in areas
financial data. Brock’s general conclusion is not having nothing to do with deterministic chaos. We
encouraging for the presence of persistent determinis- briefly explain two of them here. The first method is a
tic cycles and chaos for economic dynamics of macro specification test called the ‘BDS test’ by many writers
variables and financial asset returns in countries with (Bollerslev et al.’s chapter in Engle and McFadden
well-developed asset markets and financial markets 1994, De Grauwe et al. 1993). The method emerged
but maybe better for countries with poorly-developed out of the work that culminated in Brock et al. (1996).
markets. It is argued that well-developed market Here is a very brief explanation.
economies just have too many instruments through Suppose one formulates a model that relates a set of
which self-interested intertemporal smoothing beha- variables to be predicted to another set of variables
vior can operate to be consistent with persistent called predictor variables, and fits this model to data
deterministic chaos or cycles at high to medium and saves the residuals. If one has done a proper job of
frequencies. theorizing, model formulation, and model fitting, then
After all, the usual arguments behind the efficient the residuals should be unforecastible using histories
markets hypothesis suggests that there is money lying based upon observables. The BDS test is used to
on the table if there are potentially predictable patterns formulate and carry out tests for unpredictability of
such as deterministic chaos or deterministic cycles in residuals of fitted models. (See Brock et al. 1991 for
asset returns. That is, arbitrage opportunities are extensive discussion and Monte Carlo work.) The
available unless interest rates are high, market instru- BDS test has been used to test the adequancy of fitted
ments for arbitrage are few, risk adjustments are high, models to data (Dechert 1996, De Grauwe et al. 1993,
or constraints on borrowing and lending are high. Of Pesaran and Potter 1992). The BDS test plays a similar
course, none of these arguments are germane to the role for nonlinear and general models as the Box and
presence of chaos and cycles at very low frequencies. Jenkins Q-test does for auto-regressive integrated
For example, Day’s chapter in Pesaran and Potter moving average (ARIMA) models (Box and Jenkins
(1992) is more encouraging for the presence of persistent ‘complex’ economic dynamics such as chaos, especially at long-term historical frequencies. However, while theory is suggestive, there is no substitute for rigorous statistical testing.

More interesting to empirical economists is whether there is evidence in economic and financial time series data for the presence of chaos. While the evidence is weak for financial data and for macroeconomic data for the presence of chaos which can be short term predicted, there does seem to be useful evidence in favor of nonlinear structure which can be predicted in the short term, conditional on appropriate information sets. The evidence for extra unconditional out-of-sample predictability (i.e., prediction of data that was not in the sample used to fit the model) using nonlinear methods appears to be weak for asset returns data. The evidence for extra conditional out-of-sample predictability of asset returns is better (LeBaron 1994). See the review of Brock et al. (1991) with especial attention to the references to Diebold, Nason, and LeBaron. However evidence for bifurcations, complex dynamics, abrupt changes, and other nonlinear phenomena seems quite strong (Dechert and Hommes 2000, see especially Chavas for animal dynamics; Carpenter et al. 1999, for ecological dynamics; LeBaron 1994, for nonlinear patterns in financial data; Dechert 1996, for an overall review of evidence).

3. New Statistical Methods

The quest for evidence of deterministic chaos, deterministic cycles, and other complex deterministic persistent patterns in economic and financial data, has

1976). Tests like the Q-test and the BDS test are useful for testing the adequacy of fitted models and evaluating whether the evidence warrants a more costly exploration of alternatives to the null hypothesis model. See Dechert (1996) for a collection of studies where the BDS test is used in this manner.

The bootstrap-based specification test is a more sophisticated specification testing method. This method is especially relevant for detecting subtle nonlinearities in settings like finance where the economic logic dictates that any patterns are likely to be hard to detect (e.g. Maddala and Li’s chapter in Maddala and Rao 1996). The idea here is to use a version of bootstrap to compute the null-model distribution of statistics gleaned from various trading strategies, and to use the data values of these statistics to suggest refinements of the null model in financially relevant directions. For example, in financial applications, the null model is sometimes taken to be a random walk. This procedure emphasizes development of statistical quantities to evaluate the null model which are motivated by the behavior that one is modeling. This method was made possible by recent technical increases in computing speed, and reduction in cost, as well as advances in computationally based inference methods such as bootstrap.

4. Conclusion and Future Directions

The study of chaos and general complex dynamics as well as general complexity theory in economics, finance, and social studies has already been fruitful in generating new methods and new evidence. This kind of work is likely to grow in prominence as computa-


tional costs fall and advances continue in computational-based methods of theory and statistical inference.

See also: Computational Psycholinguistics; Linear and Nonlinear Programming; Neural Systems and Behavior: Dynamical Systems Approaches; Self-organizing Dynamical Systems; Stochastic Dynamic Models (Choice, Response, and Time)

Bibliography

Anderson P, Arrow K, Pines D (eds.) 1988 The Economy as an Evolving Complex System. Addison-Wesley, Redwood City, CA
Benhabib J (ed.) 1992 Cycles and Chaos in Economic Equilibrium. Princeton University Press, Princeton
Box G, Jenkins G 1976 Time Series Analysis: Forecasting and Control (revised edn). Holden-Day, San Francisco, CA
Brock W 1986 Distinguishing random and deterministic systems (abridged version). In: Grandmont J (ed.) Nonlinear Economic Dynamics. Academic Press, New York
Brock W, Dechert W, Scheinkman J, LeBaron B 1996 A test for independence based upon the correlation dimension. Econometric Reviews 15(3): 197–235
Brock W, Hsieh D, LeBaron B 1991 Nonlinear Dynamics, Chaos, and Instability: Statistical Theory and Economic Evidence. MIT Press, Cambridge, MA
Carpenter S, Ludwig D, Brock W 1999 Management of eutrophication for lakes subject to potentially irreversible change. Ecological Applications 9(3): 751–71
Dechert W (ed.) 1996 Chaos Theory in Economics: Methods, Models and Evidence. Edward Elgar, Cheltenham, UK
Dechert W, Hommes C (eds.) 2000 Complex nonlinear dynamics and computational methods. Journal of Economic Dynamics and Control 24: 651–62
De Grauwe P, Dewachter H, Embrechts M 1993 Exchange Rate Theory, Chaotic Models of Foreign Exchange Markets. Blackwell, Oxford, UK
Engle R, McFadden D 1994 Handbook of Econometrics. North-Holland, Amsterdam, Vol. 4
Kuznetsov Y 1995 Elements of Applied Bifurcation Theory. Springer, New York
LeBaron B 1994 Chaos and nonlinear forecastability in economics and finance. Philosophical Transactions of the Royal Society 348: 397–404
Maddala G, Rao C (eds.) 1996 Handbook of Statistics 14: Statistical Methods in Finance. North-Holland, Amsterdam
Pesaran H, Potter S 1992 Nonlinear dynamics and econometrics. Journal of Applied Econometrics 7: S1–S195

W. A. Brock

Characterization Theorems in Random Utility Theory

In random utility theory the evaluation of a stimulus by a subject is modeled by a random variable from which a sample is taken at each presentation of the stimulus. This idea, which goes back to Gustav Fechner and was further developed in psychology by Thurstone (1927), is used to explain the often observed inconsistency in choice experiments, where a subject on repeated presentations of one particular subset of alternatives does not always select the same alternative. The term ‘utility’ refers to a setting where the subject is requested to choose on the basis of preference (a choice context typical for economics), but it has to be stressed that the applicability of random utility models is much wider. In psychophysical experiments the criterion of choice might be, for instance, the intensity of lights, pitch of tones, perceived weight of objects, and so on. While in most instances of random utility models the distributions of the random variables are restricted to some parametric family (following Thurstone, who considered Gaussian distributed variables), the problem addressed in this entry is to characterize random utility theory in its most general form in terms of the testable restrictions it imposes on the choice data that it is supposed to model.

1. Random Utility Representation

The data of a typical choice experiment consist of the relative frequencies, or, ideally, choice probabilities $p(i, K)$ with which an alternative $i \in K$ is chosen when the subset $K$ of alternatives is offered. The master set of $n$ alternatives is conveniently identified with $\mathbf{n} = \{1, \dots, n\}$. In principle any system of subsets may be offered, but two special cases are of both theoretical and practical interest. In a complete system of choice probabilities choices are obtained for every subset of $\mathbf{n}$ with at least two elements; if just all two-element subsets are offered, the data can be collected in a binary choice probability (BCP) matrix.

According to random utility theory the internal value or ‘strength’ of each alternative $i \in \mathbf{n}$ in the choice context of the experiment is modeled by a random variable $U_i$, with all such random variables defined on the same probability space with measure $\Pr$. The collection $\{U_i\}_{i \in \mathbf{n}}$ is a random utility (RU) representation of the choice probabilities $p(i, K)$ whenever

$$p(i, K) = \Pr\bigl(U_i = \max_{j \in K} \{U_j\}\bigr), \quad i \in K \subseteq \mathbf{n}. \tag{RU}$$

The fact $p(i, \{i, j\}) + p(j, \{i, j\}) = 1$ for two-element subsets entails that the collection $\{U_i\}_{i \in \mathbf{n}}$ is noncoincident, meaning that $\Pr(U_i = U_j) = 0$, and thus $\Pr(U_i \geq U_j) = \Pr(U_i > U_j)$, for $i \neq j$. This is no serious restriction; for instance, all continuous random variables have this property.

2. Representation by Rankings

The problem of interest now is to characterize RU in terms of a set of necessary and sufficient conditions on the observed choice probabilities, allow-


ing an arbitrary joint distribution for the collection $\{U_i\}_{i \in \mathbf{n}}$. Obviously, in RU the choice probabilities are completely determined by the $n!$ values $\Pr\{U_{i_1} > U_{i_2} > \cdots > U_{i_n}\}$ of this distribution for all rankings of the $n$ alternatives. Any such ranking can be identified with the linear order $\pi$ on $\mathbf{n}$ for which $i \,\pi\, j$ whenever $i$ is ranked higher than $j$. Denoting the collection of linear orders on $\mathbf{n}$ by $\Pi_n$, choice probabilities $p(i, K)$ are said to be induced by rankings (IR) if there exists a probability distribution $\mathbb{P}$ on $\Pi_n$ such that

$$p(i, K) = \mathbb{P}\{\pi \in \Pi_n : i \,\pi\, j \text{ for all } j \in K \setminus \{i\}\}, \quad i \in K \subseteq \mathbf{n}. \tag{IR}$$

As already noted by Block and Marschak (1960), RU and IR are equivalent.

In economics, where the interest is not in repeated choices by one individual, but in an aggregation of choices over many individuals, IR results immediately from the assumption that within any single person the choices derive from some fixed ranking of the alternatives (economists tend to call such choices ‘rational’), with $\mathbb{P}$ collecting the relative frequencies of the rankings in the population.

The study of the representation problem for random utility models concentrates on IR rather than RU: verifying the existence of a probability distribution on a finite set of rankings turns out to be more tractable than verifying the existence of a collection of undetermined random variables.

2.1 Characterization for Complete Systems

Block and Marschak (1960) established that for a complete system of choice probabilities the linear inequalities

$$\sum_{J :\, K \subseteq J \subseteq \mathbf{n}} (-1)^{|J \setminus K|}\, p(i, J) \geq 0, \quad i \in K \subseteq \mathbf{n} \tag{BM}$$

are necessary conditions for IR, thus RU. Via an intricate combinatorial argument Falmagne (1978) then constructively showed how, given BM, a probability distribution $\mathbb{P}$ on $\Pi_n$ satisfying IR can be constructed, thus proving that BM actually characterizes IR and RU in terms of testable restrictions on complete systems of observed choice probabilities.

3. The Geometry of Binary Choice Probabilities

No such definite result is available for BCP matrices, for which RU and IR specialize to

$$p_{ij} = \Pr(U_i > U_j) = \sum_{\{\pi \in \Pi_n :\, i \,\pi\, j\}} \mathbb{P}(\pi), \quad ij \in N$$

with $p_{ij}$ as a shorthand for $p(i, \{i, j\})$ and with $N = \{ij : i, j \in \mathbf{n},\ i \neq j\}$ denoting the collection of ordered pairs from $\mathbf{n}$.

Geometrically, any BCP matrix $p$ is a point in $[0, 1]^N$, the unit hypercube in $n(n-1)$-dimensional real vector space indexed by $N$, and the collection of BCP matrices on $n$ elements that are induced by rankings is a particular subset of this hypercube, denoted by $P_n$. Identifying a linear order on $\mathbf{n}$ with its indicator function (i.e., writing $\pi_{ij} = 1$ instead of $i \,\pi\, j$), IR can for BCP matrices be restated as

$$p \in P_n \iff p = \sum_{\pi \in \Pi_n} \mathbb{P}(\pi)\, \pi \tag{CH}$$

for some probability distribution $\mathbb{P}$, that is, for some nonnegative real numbers $\mathbb{P}(\pi)$ summing to unity. Thus, by definition, $P_n$ constitutes the convex hull of the set $\Pi_n$ consisting of $n!$ of the $2^{n(n-1)}$ vertices of the hypercube. By linear duality theory (see, e.g., Kuhn 1956) this statement is equivalent to the general criterion

$$p \in P_n \iff \langle c, p \rangle \leq \max_{\pi \in \Pi_n} \{\langle c, \pi \rangle\} \quad \text{for all integer } c \in \mathbb{R}^N,\ c \geq 0$$

where $\langle \cdot, \cdot \rangle$ is the standard inner product in $\mathbb{R}^N$. Thus, in an abstract sense, this system of linear inequalities characterizes BCP matrices $p$ satisfying RU and IR. This solution, however, involves an infinite system of inequalities for any fixed, finite number $n$ of alternatives, and, consequently, does not yield an effective procedure for checking whether for a given BCP matrix a random utility representation exists. (This contrasts with the solution BM for complete systems of choice probabilities.) The following standard theory makes it clear that, for any fixed $n$, the set $P_n$ is in fact defined by a finite subsystem of the above infinite system. The characterization problem that remains is to find, for any $n$, such a finite collection of linear inequalities.

4. The Linear Description of $P_n$

By CH, $P_n$ is a polytope, i.e., the convex hull of finitely many points, in $\mathbb{R}^N$. The set of vertices of $P_n$ is $\Pi_n$, whence it is called the linear ordering polytope. Any polytope is alternatively described as a bounded polyhedron, i.e., a subset obtained as the intersection of finitely many halfspaces, or, the solution set of finitely many linear inequalities. In fact, any $p \in P_n$ satisfies the system of $\binom{n}{2}$ linear equalities (EQ)

$$p_{ij} + p_{ji} = 1, \quad ij \in N \tag{EQ}$$

each of which constrains $P_n$ to be contained in the


hyperplane which is the intersection of the halfspaces defined by the two corresponding inequalities. As a consequence, $P_n$ is not of full dimensionality in $\mathbb{R}^N$, but has (affine) dimension $n(n-1) - \binom{n}{2} = \binom{n}{2}$.

If the inequality $\langle a, p \rangle \leq a_0$ is valid for all $p \in P_n$ and the corresponding equality $\langle a, p \rangle = a_0$ is satisfied by some, but not all, $p \in P_n$, then the set $\{p \in P_n : \langle a, p \rangle = a_0\}$ is called a face of $P_n$. Geometrically, $P_n$ is fully contained in the halfspace of $\mathbb{R}^N$ defined by this valid inequality and the face of $P_n$ consists of the points at the boundary of $P_n$ that lie in the hyperplane delimiting this halfspace. A face of maximum dimension, i.e., one less than the dimension of $P_n$ itself, is called a facet of $P_n$, and the corresponding valid inequality is called facet defining. A complete, nonredundant system of inequalities for $P_n$ consists of facet-defining inequalities, one for each facet of $P_n$. Such a system is non-unique, since any two inequalities $\langle a, p \rangle \leq a_0$ and $\langle b, p \rangle \leq b_0$ that are the same modulo EQ (i.e., such that EQ entails $\langle a, p \rangle - \langle b, p \rangle = a_0 - b_0$) are equivalent and define the same face of $P_n$ (if any). Thus, the representation problem is redefined as one of finding facet-defining inequalities for all facets of the linear ordering polytope $P_n$.

4.1 The Case $n = 3$ and the Triangle Inequality

The smallest nontrivial BCP matrix $p = (p_{12}, p_{13}, p_{21}, p_{23}, p_{31}, p_{32})$ for $n = 3$ alternatives is, by EQ, contained in a three-dimensional subspace of the unit hypercube of $\mathbb{R}^6$, specified, for instance, by the $p_{12}$, $p_{23}$ and $p_{31}$ coordinates. This is illustrated in Fig. 1. The polytope $P_3$ is the octahedron obtained as the convex hull of the six linear orders on $\{1, 2, 3\}$. These vertices and the boldface edges in Fig. 1 constitute the faces of dimension 0 and 1, respectively. The facets of $P_3$ are the triangular faces of the octahedron. Six of them coincide with the facets of the cube itself, correspond to the inequalities $p_{ij} \geq 0$, $p_{ij} \leq 1$, and thus represent no restriction on BCP matrices. The remaining two facets, which do exclude parts of the cube from $P_3$, are defined by inequalities of the form

$$p_{ij} + p_{jk} + p_{ki} \leq 2, \quad i, j, k \in \mathbf{n}$$

Since this inequality can be equivalently rendered as

$$p_{ik} \leq p_{ij} + p_{jk}, \quad i, j, k \in \mathbf{n}$$

it is known as the triangle inequality.

Figure 1
Unit cube with the linear ordering polytope $P_3$ (in bold) and vertices indicated with the corresponding rankings of $\{1, 2, 3\}$

Thus, for $n = 3$, the situation is simple, with the triangle inequality as necessary and sufficient condition. Since an inequality defining a facet of some $P_n$ remains facet-defining for all higher-dimensional linear ordering polytopes, the triangle inequality is a necessary condition for all $n \geq 3$. Although the geometries of the six-dimensional $P_4$ and the 10-dimensional $P_5$ are considerably more complex than that of $P_3$, the triangle inequality turns out to be also sufficient for $n = 4$ and $n = 5$. It can be interpreted as ‘probabilistic transitivity,’ since, applied to the vertices of $P_n$, the inequality amounts to the transitivity property of linear orders as binary relations on $\mathbf{n}$. In this sense, EQ in one direction ($\leq$) represents a(nti)symmetry and in the reverse direction ($\geq$) completeness. (The property of (ir)reflexivity plays no role here.)

5. The General Case

In this way the picture seems nice and complete and, indeed, at some point it was conjectured that the triangle inequality would be sufficient for all $n$. However, for $n = 6$ a counterexample was soon found: some $p \notin P_6$ that nevertheless satisfies this inequality. Thus, starting with $n = 6$, there are other facet-defining inequalities. Once this was realized, a hunt for facets started, yielding more and more inequalities, in fact whole classes of inequalities, for increasing $n$. (Fishburn (1992) presents history and state of the art of these developments at that point in time.) Two of the most general classes found up to now are presented next, to illustrate the complexity of the problem.

5.1 Facets Defined by Möbius Ladders

In the field of operations research, where the polytope $P_n$ turns up in a discrete optimization context, Grötschel et al. (1985) developed techniques for proving inequalities facet-defining and defined various schemes of facet-defining inequalities in the language of graph theory. In this setup, directed graphs are considered with node set $\mathbf{n}$. Any arc set $A \subseteq N$ defines through its indicator function $1_A$ a linear functional denoted as

$$A(p) = \langle 1_A, p \rangle = \sum_{ij \in A} p_{ij}.$$
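
The conditions met so far (the equalities EQ, the triangle inequality, and the linear functional $A(p)$ attached to an arc set) can be checked mechanically for a given BCP matrix. The following Python sketch is only an illustration under stated assumptions: a BCP matrix is stored as a dictionary keyed by ordered pairs `(i, j)`, and the function names `bcp_from_rankings`, `satisfies_eq`, `satisfies_triangle`, and `arc_functional` are hypothetical helpers introduced here, not taken from the literature cited in this entry. By the results quoted above, passing the triangle check guarantees a random utility representation only for $n \leq 5$; for $n \geq 6$ it is merely necessary.

```python
from itertools import permutations


def bcp_from_rankings(weights):
    """BCP matrix induced by a distribution over rankings (cf. CH).

    `weights` maps each ranking (a tuple, best alternative first) to its
    probability. Returns a dict p with p[(i, j)] equal to the total
    probability of the rankings that place i above j.
    """
    items = sorted(next(iter(weights)))
    p = {(i, j): 0.0 for i in items for j in items if i != j}
    for ranking, w in weights.items():
        position = {x: r for r, x in enumerate(ranking)}
        for (i, j) in p:
            if position[i] < position[j]:
                p[(i, j)] += w
    return p


def satisfies_eq(p, tol=1e-9):
    """Check the equalities EQ: p_ij + p_ji = 1 for every ordered pair."""
    return all(abs(p[(i, j)] + p[(j, i)] - 1.0) <= tol for (i, j) in p)


def satisfies_triangle(p, tol=1e-9):
    """Check p_ik <= p_ij + p_jk for all triples of distinct alternatives."""
    items = sorted({i for (i, _) in p})
    return all(
        p[(i, k)] <= p[(i, j)] + p[(j, k)] + tol
        for i, j, k in permutations(items, 3)
    )


def arc_functional(arcs, p):
    """Evaluate A(p): the sum of p_ij over the arcs ij in the arc set A."""
    return sum(p[arc] for arc in arcs)


# Uniform mixture of the three 'cyclic' rankings of {1, 2, 3}.
cyclic = {(1, 2, 3): 1 / 3, (2, 3, 1): 1 / 3, (3, 1, 2): 1 / 3}
p = bcp_from_rankings(cyclic)

# A matrix that satisfies EQ but is too 'cyclic' to come from rankings.
q = {(1, 2): 0.9, (2, 1): 0.1, (2, 3): 0.9, (3, 2): 0.1,
     (3, 1): 0.9, (1, 3): 0.1}
```

In this example `p` passes both checks and `arc_functional([(1, 2), (2, 3), (3, 1)], p)` equals 2 up to floating-point rounding, so `p` lies on the facet $p_{12} + p_{23} + p_{31} \leq 2$; `q` satisfies EQ but fails `satisfies_triangle`, since its cycle sum is 2.7.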


Figure 2
(a) Möbius ladder composed of one four-cycle and four three-cycles; arcs shared by two cycles in bold. (b) Möbius ladder of 21 cycles (adapted from Grötschel et al. (1985))

Figure 3
(a) A stability-critical graph with stability number 2. (b) An infinite family of stability-critical graphs (with stability number m)

By suitable choices for the arc set $A$ and constant $a_0$ facet-defining inequalities of the form $A(p) \leq a_0$ may be obtained. One example follows.

Grötschel et al. (1985) call an arc set $A$ a Möbius ladder basically if $A$ is the union of an odd number $k$ of (directed) cycles $C_1, C_2, \dots, C_k$, such that each $C_i$ is either a three-cycle or a four-cycle and the given sequence constitutes an (undirected) cycle on a higher level in the sense that any adjacent pair $C_i, C_{i+1}$ (including $C_k, C_1$) has precisely one arc in common, while all nonadjacent pairs $C_i, C_j$ are disjoint. (There are a few additional side-conditions.) For any such Möbius ladder $A$ the inequality

$$A(p) \leq |A| - \frac{k+1}{2}$$

is facet defining for all linear ordering polytopes $P_n$ for which it is defined (that is, with $n$ equal to or exceeding the number of nodes appearing in $A$). Fig. 2(a) shows a small Möbius ladder, which, with $|A| = 11$ and $k = 5$, represents a facet-defining inequality for all $P_n$, $n \geq 6$. Fig. 2(b) illustrates the generality of the class of Möbius ladders by a more elaborate example, which also more explicitly shows the idea of the Möbius strip.

5.2 Facets Defined by Stability-critical Graphs

Using the techniques developed by Grötschel et al. (1985), Koppen (1995) derived another general class of facet-defining inequalities, this time in terms of undirected graphs. Let $G = (\mathbf{k}, E)$ be a connected graph with node set $\mathbf{k} = \{1, \dots, k\}$, $k \geq 3$, and some edge set $E$ of unordered pairs of $\mathbf{k}$. Let $S(\mathbf{k}, E)$ denote the stability number of $G$, i.e., the maximum cardinality of a stable set in $G$, where a subset of nodes of $G$ is called stable if it does not contain an edge. $G$ is called stability-critical if the removal of any one edge from $G$ would increment the stability number (by one). Koppen (1995) proved that for any such $G = (\mathbf{k}, E)$ and any $2k$ distinct elements $u_1, \dots, u_k, v_1, \dots, v_k$ in $\mathbf{n}$ the inequality

$$\sum_{i \in \mathbf{k}} p_{u_i v_i} - \sum_{\{i, j\} \in E} \bigl(p_{u_i v_j} + p_{u_j v_i}\bigr) \leq S(\mathbf{k}, E)$$

defines a facet of the linear ordering polytope $P_n$, $n \geq 2k$, if and only if $G$ is stability-critical. The graph in Fig. 3(a) represents such an inequality with $k = 7$ and $S(\mathbf{k}, E) = 2$ (any three-set contains an edge, but any edge removed introduces a stable three-set), which is facet defining for $P_n$, $n \geq 14$. In Fig. 3(b) an infinite family of stability-critical graphs is de-


picted, each corresponding to a new facet-defining inequality. The symmetry present here is ‘accidental’; being stability-critical imposes no further structural restrictions on a graph whatsoever.

5.3 Conclusion for the Binary Case

The Möbius ladders and stability-critical graphs are just two examples of the various classes of facet-defining inequalities for BCP matrices known to date. In particular, while all cases considered here have 0–1 coefficients, facet-defining inequalities with general, unbounded integer coefficients have been found (Leung and Lee 1994, McLennan 1990, Suck 1992). The conclusion is that, starting with the triangle inequality for $n \leq 5$, there is, with increasing $n$, a combinatorial explosion of facet-defining inequalities for $P_n$ with no apparent structural regularities; witness the special case of stability-critical graphs alone. Thus, the problem of a complete characterization for general $n$ seems intractable at the moment. What can be achieved on a more practical level, by computer enumeration, is obtaining a complete linear description of $P_n$ for specific small $n$, $n = 6, 7, 8, \dots$.

6. Extension to General Probabilistic Measurement

Taking an abstract, measurement-theoretic point of view, Heyer and Niederée (1992) have shown that random utility theory, as discussed above, can be generalized to other instances of probabilistic measurement (see Measurement Theory: Probabilistic). The basic idea is as follows (considerable extensions to more complicated relational structures and data formats are to be found in Niederée and Heyer (1997)).

For some finite domain of $n$ objects, identified with $\mathbf{n}$, let a mapping $p: \mathbf{n}^k \to [0, 1]$ specify ‘probability data’ on $k$-tuples of objects. Let $Q$ be a ‘quantitative’ $k$-ary relation on $\mathbb{R}$ and $\mathcal{R}$ the set of $k$-ary relations on $\mathbf{n}$ isomorphically embeddable in $(\mathbb{R}, Q)$. Then, in this general setting, it can be shown that $p$ has a random scale representation

$$p(i_1, \dots, i_k) = \Pr\bigl((U_{i_1}, \dots, U_{i_k}) \in Q\bigr)$$

for some jointly distributed collection of random variables $\{U_i\}_{i \in \mathbf{n}}$ if and only if $p$ is induced by a probability distribution $\mathbb{P}$ on $\mathcal{R}$,

$$p(i_1, \dots, i_k) = \mathbb{P}\{R \in \mathcal{R} : (i_1, \dots, i_k) \in R\}$$

if and only if $p$ is in the convex hull of $\mathcal{R}$, interpreted as vertices of the hypercube $[0, 1]^{\mathbf{n}^k}$. Classical random utility theory in terms of RU, IR, and CH is simply the special case with $k = 2$, $Q = {>}$ and $\mathcal{R}$ the set of linear orders on $\mathbf{n}$.

The consequence is that for the general case of probability data $p$ the characterization problem amounts to finding, for any $n$, a system of linear inequalities defining the facets of the polytope with vertex set $\mathcal{R}$. Some results have been obtained for other relations (Regenwetter 1996, Suck 1997). Typically, in such a case the first nontrivial facets found are the ones corresponding to the defining properties of the class of relations $\mathcal{R}$ (see the end of Sect. 4.1). The parallel with the linear order case goes further in that, quite generally, more and more facets turn up with increasing $n$. To date, these investigations have not resulted in characterization theorems for general $n$ and, except for very small $n$, linear descriptions of the corresponding polytopes are only partly known.

See also: Decision and Choice: Random Utility Models of Choice and Response Time; Elicitation of Probabilities and Probability Distributions; Measurement Theory: Probabilistic; Preference Models with Latent Variables; Utility and Subjective Probability: Contemporary Theories; Utility and Subjective Probability: Empirical Studies

Bibliography

Block H D, Marschak J 1960 Random orderings and stochastic theories of response. In: Olkin I, Ghurye S, Hoeffding W, Madow W, Mann H (eds.) Contributions to Probability and Statistics. Stanford University Press, Stanford, CA
Falmagne J-C 1978 A representation theorem for finite random scale systems. Journal of Mathematical Psychology 18: 52–72
Fishburn P C 1992 Induced binary probabilities and the linear ordering polytope: A status report. Mathematical Social Sciences 23: 67–80
Grötschel M, Jünger M, Reinelt G 1985 Facets of the linear ordering polytope. Mathematical Programming 33: 43–60
Heyer D, Niederée R 1992 Generalizing the concept of binary choice systems induced by rankings: One way of probabilizing deterministic measurement structures. Mathematical Social Sciences 23: 31–44
Koppen M 1995 Random utility representation of binary choice probabilities: Critical graphs yielding critical necessary conditions. Journal of Mathematical Psychology 39: 21–39
Kuhn H W 1956 Solvability and consistency for linear equations and inequalities. American Mathematical Monthly 63: 217–32
Leung J, Lee J 1994 More facets from fences for linear ordering and acyclic subgraph polytopes. Discrete Applied Mathematics 50: 185–200
McLennan A 1990 Binary stochastic choice. In: Chipman J S, McFadden D, Richter M K (eds.) Preferences, Uncertainty, and Optimality. Westview, Boulder, CO
Niederée R, Heyer D 1997 Generalized random utility models and the representational theory of measurement: A conceptual link. In: Marley A A J (ed.) Choice, Decision, and Measurement. L. Erlbaum, Mahwah, NJ
Regenwetter M 1996 Random utility representations of finite m-ary relations. Journal of Mathematical Psychology 40: 219–34


Suck R 1992 Geometric and combinatorial properties of the polytope of binary choice probabilities. Mathematical Social Sciences 23: 81–102
Suck R 1997 Probabilistic biclassification and random variable representation. Journal of Mathematical Psychology 41: 57–64
Thurstone L L 1927 A law of comparative judgment. Psychological Review 34: 273–86

M. Koppen

Charisma and Charismatic

‘Charisma’ or ‘gift of grace’ is a theological notion that has been widely used in the social and religious sciences to describe the hierarchical organization of religious roles, to explain the growth and development of social movements based on religious inspiration, or to account for the basis of authority and leadership in society generally. In its strictly religious context, it means a divinely conferred power, being derived from the Greek kharisma (kharis, favor or grace). Charismatic power is associated with the idea of the sacred as a force in human affairs. A person in possession of charisma is thought to have a talent, for example, in terms of healing or prophecy. In anthropological research, there has been considerable interest in ‘shamanism’ as a form of charismatic authority that depends on a capacity to have visions and to perform healing (see M. Eliade 1964). In sociology, it is conceptually part of an analytical framework that is concerned with understanding large-scale changes in religious institutions and the foundations of authority.

1. The Sociology of Charisma

In the sociology of religion, the study of charisma has been closely associated with Max Weber (1864–1920) who adopted the idea from the historical and theological research of Rudolf Sohm and Karl Holl who in turn had developed the concept in their analysis of Canon Law. Weber wrote that ‘the concept of “charisma” (gift of grace) is taken from the vocabulary of early Christianity. For the Christian hierocracy Rudolf Sohm, in his Kirchenrecht, was the first to clarify the substance of the concept, even though he did not use the same terminology. Others (for instance, Holl in Enthusiasmus und Bussgewalt) have clarified certain important consequences of it. It is thus nothing new’ (Weber 1978, pp. 1, 216). Despite Weber’s modesty about the adoption of the idea, it became a fundamental dimension of his analysis of power and had far-reaching consequences for the development of the sociology of religious institutions. Weber generalized the idea and grasped its radical implications for the study of political change in human societies.

In the rise of religion certain individuals are recognized as having a capacity to experience ecstatic states that are regarded as the precondition for healing, telepathy, and divination. It is primarily through ‘these extraordinary powers that have been designated by such special terms as ‘mana,’ ‘orenda,’ and the Iranian ‘Maga’ (the term from which our word ‘magic’ is derived). We shall henceforth employ the term ‘charisma’ for such extraordinary powers’ (Weber 1963, p. 2). Such charismatic power is either inherited as a natural endowment or it is acquired by extraordinary means. The religious talent in both cases is a substance that may remain dormant in a person until it is aroused by asceticism or trance.

Weber did not dwell specifically on these features of charisma in folk religion because he wanted to employ the term to understand the secular dynamic of authority and leadership in social institutions. His main intention was to compare and contrast three types of authority: charismatic, traditional, and legal-rational. As we have seen, charismatic authority rests on the ability of a leader to inspire disciples in the belief of the authenticity of a calling. In practical terms, an authentic claim is validated by a talent such as healing, but this alone cannot be the basis of authority. In the case of genuine charisma, a follower has a duty to accept the authority of a leader. In Economy and Society (Weber 1978, pp. 1, 241), the term charisma is applied to a certain quality of an individual personality by virtue of which he is considered extraordinary and treated as endowed with supernatural, superhuman, or at least specifically exceptional powers or qualities. Traditional authority involves the acceptance of a rule that expresses a custom, namely an established pattern of belief or practice. Finally, legal-rational authority is typical of bureaucracies in which formal rules of conduct are underpinned by procedural norms. These forms of authority are in turn forms of compliance. Tradition depends on compliance through empathy; legal-rational authority rests on rational argument; and charismatic authority and leadership require inspiration. Weber provided many diverse illustrations of charismatic leaders including Jesus Christ, Mohammed, Napoleon, Stefan George, and the Chinese Emperor. Although the cases were heterogeneous, he argued that charismatic authority is confronted by a common problem of succession with the death of the leader. Charismatic authority is thus unstable. With the demise of the charismatic leader, the disciples typically disband, but occasionally a solution for continuity will be developed. In the case of the Christian Church, Weber argued that the charismatic authority of Christ was invested in the Church itself (as the body of Christ) and thus the bishops who control the ‘keys of grace’ enjoy a vicarious authority. This ‘institutionalization of charisma’ becomes over time increasingly formal, bureaucratic, and impersonal. As the charismatic power of Christ becomes transformed into a set of formal procedures and bureaucratic rules, Weber spoke of the ‘routinization of charisma.’


2. Forms of Religious Association aspect of modern society was in turn a function of


Weber’s pessimistic understanding of social change in
Weber’s account of charisma was also an important terms of secular rationalism and the erosion of
component of his sociology of religion within which he religious meaning. Against Weber’s assessment, the
attempted to identify different religious roles and late nineteenth and twentieth centuries have been
patterns of organization. For example, he dis- profoundly influenced by charismatic movements.
tinguished between the prophet who, as a charismatic Charismatic renewal has been a common theme of
figure, has a personal call, and the priest has authority diverse religious movements in ‘primal societies’ and
by virtue of his service in a sacred tradition. The in the industrial societies of Europe and North
prophets, who often emerge from the ranks of the America. The collapse of aboriginal or tribal societies
priesthood, are unremunerated, and depend on gifts under colonial settlement saw the spread of char-
from followers. Weber also distinguished two forms of ismatic movements against the supremacy of White-
prophecy as represented on the one hand by Buddha settler societies such as the Ghost Dance among the
and on the other by Zoroaster and Muhammad. The Cheyenne and Sioux tribes of the American Plains. A
latter are involved in ‘ethical prophecy’ and are Paiute prophet called Wovoka had received a vision in
conceived as instruments of God. These prophets which through ritual dance the dead would return to
receive a commission from God to preach a revelation restore the pristine culture of native societies. This
and demand obedience from their disciples as an anti-White charismatic movement subsided after the
ethical duty. By contrast exemplary prophets dem- murder of Sitting Bull and the subsequent massacre of
onstrate to their followers a salvational path through his followers at Wounded Knee in 1890. Charismatic
the example provided by their own lives. Exemplary leadership has also played a significant role in those
prophecy was, according to Weber, characteristic of new religious movements that have been a response to
Asia; ethical prophecy, of the Abrahamic religions of the social and economic disruptions associated with
the Middle East. Weber’s analysis of charisma with the decolonization of the Third World (see P. Worsley
respect to ethical prophecy in the Old Testament has The Trumpet Shall Sound 1970).
been subject to considerable criticism (see Zeitlin In contexts of rapid social change, charismatic
1984), but his conceptual framework continues to leadership is an important component of so-called
influence both sociology (see Lindholm 1990) and ‘revitalization movements’ that function to create an
anthropology (see Werbner and Basu 1998). effective transition from tribal to urban society.
This discussion of charisma with respect to different However, charisma is also associated with religious
social roles should be seen as part of a larger forms that are a response to personal alienation,
sociological debate about the forms of association that isolation, and meaninglessness in the developed, indus-
characterize the social organization of religious belief trial world. One illustration is the Family who were
and practice. Weber wanted to argue that any group disciples of Charles Manson and operated in southern
that is subject to charismatic authority forms a California in the late 1960s. Manson, who was a social
charismatic community (Gemeinde) and that such a dropout with a criminal record, provided a message of
community is inherently unstable. With the death of personal liberation based on mind-expanding drugs
the leader, the group either dissolves or charisma and found a receptive set of disciples in the cultic
undergoes a process of routinization. The disciples milieux of the American counterculture. The Manson
have no career, no formal hierarchy, no offices, and no Family offered disoriented disciples the experience of
qualifications. The Church that provides the organ- an absolute community that depended on a powerful
izational context of the priesthood is very different. ideology and indoctrination (see Schreck 1988).
Ecclesiastical organizations require a hierarchical ad- Another American illustration would be Jim Jones
ministration of the ‘charisma of office’ in which there and the People’s Temple (see Weightman 1983). These
are definite stages in clerical and administrative ca- manifestations of charisma in modern society are
reers. It is clearly the case that Weber’s sociology of closely associated with the dislocations of youth
charisma should be understood as an application of culture, the anomie of the modern city, and the spread
Ernst Troeltsch’s ‘church-sect typology’ in which of personal alienation.
there is an historical oscillation between the evan-
gelical sects and the bureaucratic churches (see
Troeltsch 1931). 4. Conclusion: Charisma in the Information Age
Weber’s analysis of charisma has produced a rich
3. Charismatic Movements legacy of sociological research, but there are serious
conceptual problems associated with its application in
Because Weber believed that in modern societies legal contemporary society. The concept of charismatic
rational authority would become dominant, tradition movement is used to describe a variety of revivalist,
and charisma were regarded as ‘prerationalistic’ and nativist, messianic, and healing movements. There is
thus as characteristic of premodern societies. The considerable debate in sociology and religious studies
notion that charismatic authority was not a resilient about whether charisma in modern society is degraded
and inauthentic (see Wilson 1975). The term is frequently employed to
describe the fleeting popularity of political leaders whose social appeal
is constructed by campaigns in the electronic media. Whereas charisma in
traditional societies arises spontaneously in the collective enthusiasm
for divinely inspired leaders, in modern politics it is inevitably the
product of deliberate orchestration of the media. The notoriety of the
Manson Family and the People’s Temple was a conscious product of media
attention. However, this argument indicates a persistent issue in
charismatic movements, namely how to distinguish between genuine and
false claims to religious authority. Biblical warnings against ‘false
prophets’ recognized the fact that disciples could be misguided. While
the term has been implausibly stretched to describe a diverse and
heterogeneous range of leaders in the twentieth century from Adolf Hitler
to J. F. Kennedy, it remains a basic concept in sociology and religious
studies.

See also: Charisma: Social Aspects of; Folk Religion; Healing;
Millennialism; New Religious Movements; Prophetism; Religion, Sociology
of; Weber, Max (1864–1920)

Bibliography
Eliade M 1964 Shamanism: Archaic Techniques of Ecstasy. Bollingen Foundation, New York
Lindholm C 1990 Charisma. Blackwell, Oxford, UK
Schreck N (ed.) 1988 The Manson File. Amok Press, New York
Troeltsch E 1931 The Social Teaching of the Christian Churches. Allen and Unwin, London, 2 Vols.
Werbner P, Basu H (eds.) 1998 Embodying Charisma: Modernity, Locality and the Performance of Emotion in Sufi Cults. Routledge, London and New York
Weber M 1963 The Sociology of Religion. Methuen, London
Weber M 1978 Economy and Society: An Outline of Interpretive Sociology. University of California Press, Berkeley, CA, 2 Vols.
Weightman J M 1983 Making Sense of the Jonestown Suicides: A Sociological History of the People’s Temple. Mellen Press, New York
Wilson B R 1975 The Noble Savages: The Primitive Origins of Charisma and its Contemporary Survival. University of California Press, Berkeley, CA
Zeitlin I M 1984 Ancient Judaism: Biblical Criticism from Max Weber to the Present. Polity Press, Cambridge, UK

B. S. Turner

Charisma: Social Aspects of

The concept of charisma indicates the presence of a quality which is
considered to be extraordinary and exceptional. The subject holds this
quality personally, freely, and independently of his or her own will.
Once this quality is socially recognized, it lends those who possess it a
specific power inside the community they are part of. The explanations
given for the possession of charisma, very often imputed to the arbitrary
will of divinity, are crucial for the distinction between the different
forms of power deriving from its possession, ranging from the
authoritativeness of the prophet to the ruling of the leader, and from
the magic powers of the shaman to the pure might of the fierce warrior.
From the point of view of its social aspects, the concept of charisma
refers to the manifestation of a specific modality of power (see Power in
Society) inside the society, as well as to the specific form of community
(see Community Sociology) that forms itself around the carrier of
personal charisma.

1. The Origin of Charisma as a Problematic Concept

Originally, the concept of charisma appears in the New Testament
literature, and almost exclusively in the epistles of the apostle Paul,
where it is used with different meanings. It can in fact be: (a) a free
gift from God, (b) the specific gift of a vocation, (c) a series of godly
gifts following the investiture of a priest, or (d) the extraordinary
manifestations of the Holy Spirit that underline the evangelizing
potentiality of the first Christians. Paul distinguishes between
extraordinary gifts (those that do not correspond to any determined
function) and exceptional spirituality, which fills the performance of
ordinary functions with grace. In the spiritual life of the first
communities, the charisma that resulted in the production of miraculous
effects fulfilled a function of legitimizing the acts of preaching.
Moreover, it also facilitated and increased the efficacy of preaching. In
these cases, the relation between charisma and conduct of life becomes
apparent (Ducros 1937). A difference emerges, therefore, between the gift
given to spread the Word, and that given to allow the subject to begin
moving toward perfection. In this tradition of studies, the possession of
charisma is the foundation and legitimization of the relations of the
subject with the divinity that grants it, but also with the institution
that guarantees the transmission of the various faith truths and of the
appropriate forms of worship. Moreover, having the charisma also
legitimizes its possessors in their relations with those for whom they
fulfill a function of spiritual leader.
Being the leader of a community of believers for the salvation of its
members is the decisive and fundamental function of the carrier of
personal charisma. Recognizing the charismatic power of the leading
function exercised by the ‘holy people’ means grasping the ideal-typical
status in which every leader is a guide leading toward definitive and
certain
salvation, against every possible opposition. Therefore, the carrier of
personal charisma ends up coinciding, in the New Testament literature,
with the man of God, who intervenes in moments of crisis to show the way
of perfection: the only one that truly counts for individual or
collective salvation.
The considerations regarding the use of charisma in the communities of
believers can be transposed also to civil communities. A long historic
tradition from Plutarch to Hegel through Machiavelli, Kant, Carlyle, and
the whole German Romanticism movement prepared the basic elements of the
image of the charismatic leader (Cavalli 1981) that Weber would later
translate into an ideal-typical figure.

2. The Concept of Charisma in Contemporary Sociology

Max Weber (1864–1920) is the author who derives the best results from
this dual tradition on the social aspects of charisma. While accepting
Rudolf Sohm’s work on the origin of the Christian Church, Max Weber
underlined different aspects of it: the idea of the supernatural gift
offered for the work of a specific mission; the concepts of call, of
predestination, and of election. From other works on the history of
Christianity, Weber derives further developments of his theory of
charisma. This is the case with Karl Holl and his studies on monasticism.
For Holl, as for Rudolf Sohm, the possession of charisma is the
foundation of authority inside the religious institution. Whatever its
concrete manifestations, from the proof guaranteed by exceptional
interventions to the exemplary manifestation of a permanently sanctified
life, the objective of charisma is to lead the community of believers
after having been recognized by it.

2.1 Meanings of Charisma in Weber
In his work, Weber defines charisma many times. First of all, it is part
of the sociological categories and, more precisely, it is one of the
ideal types of power. In the definition of ‘charismatic power’ the
‘charisma’ is

a certain quality of an individual personality by virtue of which he is
set apart from ordinary men and treated as endowed with supernatural,
superhuman, or at least specifically exceptional power or qualities.
These, as such, are not accessible to the ordinary person, but are
regarded as of divine origin or as exemplary, and on the basis of them
the individual concerned is treated as a leader. In primitive
circumstances this peculiar kind of deference is paid to prophets, to
people with a reputation for therapeutic or legal wisdom, to leaders in
the hunt, and heroes in war (Weber 1956/1968, p. 48)

The analysis of the term ‘charisma’ occurs also when Weber is discussing
types of religious communities. Weber defines it here (not without
connections to the analyses of the Durkheimian school and, in particular,
to Marcel Mauss’s studies) as the collection of extraordinary powers that
populate the field of magically or religiously motivated action. In this
field charisma, understood as an ideal-typical form, is ‘a gift that
inheres in an object or person simply by virtue of natural endowment,’
and it ‘may be produced artificially in an object or person through some
extraordinary means.’ This latter definition is apparently secondary,
since ‘it is assumed that charismatic powers can be developed only in
people or objects in which the germ already existed but would have
remained dormant unless evoked by some ascetic or other regimen’ (Weber
1956/1965, p. 2).
The two parts, of course, complement each other. In the analysis of
charisma as a sociological category of social action, the accent is on
the recognition of the exceptional quality, whatever this might be. We
must also stress, however, that in the section concerning religious
communities, charisma is underlined in its substantiality: It has, in
other words, a specific content and expresses itself in a doctrine.

2.2 Analysis of Charisma as to Types of Religious Communities
The quality of charisma becomes more precise in its primary form as the
capacity of the subject that carries it to awaken and govern the hidden
powers that inhabit the natural world. This capacity can be
consubstantial with the actors or activated as a consequence of an
exceptional state of being, namely ecstasy. What for lay people cannot be
other than an occasional experience is for magicians a dimension
characterizing the existence to which they have permanent access: ‘The
magician is the person who is permanently endowed with charisma.’
The success of charisma, understood in this sense, is directly connected
to the capacity to force the invisible powers to yield to one’s own will
through the use of adequate stratagems. On the other hand, it disappears
when supernatural powers assert themselves, powers that are indifferent
to any kind of constraint but sensitive to veneration and acts of
worship. The magician is replaced by the priest or minister, who is a
functionary taking care of a ‘regularly organized and permanent
enterprise concerned with influencing the gods, in contrast with the
individual and occasional efforts of magicians’ (Weber 1956/1965, p. 28).
The regularity of the functions and rites renders priests functionaries
who often inherit their position and whose power seems to be separated
from personal charisma.

Yet another distinguishing quality of the priest, it is asserted, is his
professional equipment of special knowledge, fixed doctrine and
vocational qualifications, which brings him into contrast with sorcerers,
prophets, and other types of religious functionaries who exert their
influence by virtue of personal gifts (charisma) made manifest in miracle
and revelation (Weber 1956/1965, p. 29).
The prophet is, in this analysis, ‘a purely individual bearer of
charisma, who by virtue of his mission proclaims a religious doctrine or
divine commandment’ (Weber 1956/1965, p. 46). It is, therefore, not so
important if a new community emerges around prophets or if their
followers refer more to their person than to their doctrine. What is
decisive is the ‘purely personal charisma’ distinguishing prophets from
priests or ministers, while the content of their preaching and the
specificity of their actions, consisting in doctrines and ethical
imperatives instead of magic, distinguish them from magicians.

2.3 Sociology of Power

Max Weber defines once more the concept of charisma when passing from the
analysis of religious communities to that of political communities and,
through these, to the sociology of power. It is during this third phase
of his work that the social aspects of charisma are underlined. What is
important to highlight is his further clarification of the emotional
aspect of charisma, which is also, to a certain extent, meta-everyday,
alien to the paths of rationalization that are needed for its social
reproduction, but still necessary for the foundation of a new epoch.
Weber arrives at the point of saying that every foundation moment is, in
fact, charismatic. This connects the use of charisma not only to its
social recognition but also to the processes of innovation and social
transformation. He contrasts the permanent character of the traditional
and the bureaucratic structures, to the foundational character of the
charismatic action.

Die Deckung allen über die Anforderungen des ökonomischen Alltags hinaus
gehenden Bedarfs dagegen ist, je mehr wir historisch zurücksehen, desto
mehr, prinzipiell gänzlich heterogen und zwar: charismatisch, fundiert
gewesen. Das bedeutet: die ‘natürlichen’ Leiter in psychischer,
physischer, ökonomischer, ethischer, religiöser, politischer Not waren
weder angestellte Amtspersonen, noch Inhaber eines als Fachwissen
erlernten und gegen Entgelt geübten ‘Berufs’ im heutigen Sinn dieses
Wortes, sondern Träger spezifischer, als übernatürlich (im Sinne von:
nicht jedermann zugänglich) gedachter Gaben des Körpers und Geistes
(Weber 1956, p. 662).

Summarized in one sentence, this important quotation says that the answer
to human needs, going beyond mere economic needs, is not given by experts
or people holding public office but by bearers of qualities of the body
and mind perceived as supernatural.
In Weber’s methodology such distinctions have of course an ideal-typical
character. It is therefore impossible to trace the ‘pure forms’ of each
of them as well as to draw specific boundaries between one category and
the other. This means, for example, that it is possible to find some
elements of prophetic charisma in the behavior of a political leader, of
a priest, or of a military leader. Likewise, magical elements can be
found in religious action: This happens, for instance, when prophets use
magic acts to demonstrate their own charisma.
At least three aspects of the Weberian analysis of charisma need to be
recalled here: the nature of social recognition, the implicitly
protesting character of charismatic power, the transformation of the
exercise of charisma into everyday practice, or in other words, its
‘routinization,’ and, inside the everyday practice, the distinction
between personal and official charisma.

2.4 Social Recognitions
The possession of charismatic qualities is connected to the social
context in two ways. On the one hand it is in the social context that the
recognition takes place and those possessing charisma need to be socially
recognized as such by their followers. On the other hand, social
recognition does not constitute the foundation of charisma. It is, on the
contrary, a duty for those that are expected to recognize it. ‘This
recognition, psychologically born of enthusiasm or adversity and hope,
involves complete personal devotion’ (Weber 1956/1994, p. 33). The
recognition refers, therefore, to a previous confident expectation. The
nature of this expectation is that of a positive event such as a
reconquest of political freedom or of supernatural salvation or, again,
of a redemption capable of restoring to the entire people its place among
the nations. A specific relation emerges between recognition of charisma,
the diffused and extended expectation of the carrier of charismatic
qualities, and the reunification around that person of a completely
renewed community of followers, which aims at originating a process of
renewal for the whole society. Since the entire Weberian discourse is
based on ideal-typical configurations, it also has numerous variants: The
social recognition of prophetic charisma has a completely different
relation with ‘proof’ compared to magical charisma or to that of the
political leader. Prophets proclaiming a ‘religious doctrine or a divine
imperative’ need only occasionally to ‘prove’ the validity of their
message through magical acts revealing their possession of prophetic
charisma. Things look different for the magician and the leader. For them
proof is the concrete result of their enterprise. Therefore, while
magicians are constantly tied to the success of their own art, leaders
are continually challenged to prove their charisma through (military or
political) victory. As a consequence, while prophets can expect to be
believed by virtue of their own testimony and are not under an obligation
of certifying their own charisma through magical acts, magicians and
charismatic leaders connect their credibility to the ongoing success of
their extraordinary qualities.
Naturally, in cases like this, where it is not possible to find such
ideal-types in their pure status in historic reality, there are many
steps between one type and the
other. Prophets, seen as an ideal-typical representation of charismatic
power, are not just those who proclaim the ‘new commandment.’ They are
also those who, ‘as a prophetic blow and a blustering flame,’ evoke ‘the
indefinable that pervades and reinforces big communities’ (Weber 1985:
p. 612, my translation). This reconnects the prophet to the charismatic
leader, while the position toward the divinity (of which the prophet
explicitly wants to be interpreter), as well as the social function
exercised, keep the two types distinguished from each other. The
prophet’s function is in fact limited to the announcement of the divine
message in the first case, while leaders are forced to maintain their
position until victory is achieved.

2.5 Protesting Character of Charismatic Powers

The above helps us to understand the connection that Weber draws between
charismatic power and situations of social crisis: The recognition of
charisma, following an expectation, emerges only when a situation of
social crisis looks for a solution in exceptional individuals, of whom it
is possible to be devotees. From this point of view, the possession of
charisma, being a direct and personal gift from God, ends up representing
a kind of power implicitly in conflict with and often alternative to
traditional or bureaucratic-legal power. The crisis in which the carrier
of personal charisma operates is, in fact, also and mainly a crisis of
leadership. In the articulation of the ideal-typical image of the
‘prophet’ defined as an ideal-typical case of personal charismatic power,
Weber observes that the prophet almost never comes from the clergy: ‘…
the personal call is the decisive element distinguishing the prophet from
the priest. The latter lays claim to authority by virtue of his service
in a sacred tradition, while the prophet’s claim is based on personal
revelation and charisma’ (Weber 1956/1965, p. 46). This is true also in
the political sector. Charismatic leaders often operate in the passage
between property power and legal power and what characterizes them is not
so much the content of their message as the style they use to spread it.
They act driven by a duty, by virtue of which they do not present a
value, but state a truth. Their message is therefore a mission they are
imposing on themselves. The charisma is therefore no longer just a
quality or a gift, but a ‘highly asymmetric power-relationship between an
inspired guide and a cohort of followers who see in him and his message
the promise and anticipated achievements of a new order, to which all
adhere with greater or lesser conviction’ (Boudon and Bourricaud 1989,
pp. 70–1). Therefore, charismatic power is based not only on the
exceptional endowments of its carriers but also on the exceptional
character of the social situation inside which they establish themselves,
and on the promise they proclaim. Thus the analysis of the connection
between charismatic power and utopia is introduced, where utopia is
understood as the realization of a new social order. Often, though not
always, Weber suggests, the mission of the carrier of charisma is of
revolutionary character, since it reverses every hierarchy of values and
clashes against usage, law, and tradition.
In the case of the prophet, as well as in that of the charismatic leader,
charismatic power is tied to moments of crisis and transition and it is a
temporary power. As such it is based on an uncertain and risky economic
ground (a real ‘charismatic economy’), often constituted by free
offerings, gifts, or war booty. Carriers of personal charisma do not
create any kind of systematic and permanent production of resources
through a stable and lasting economy, nor do their followers. Charismatic
power is therefore characteristic of the moments of transition in statu
nascenti and is bound to disappear when social life goes back to normal.

2.6 Transmission of Charisma

From Weber’s point of view, getting back to everyday life is, also in
this case, the result of the social action of concrete social actors. In
fact, the ‘adherents’ of the emotional community that formed itself
around the carrier of charisma, as well as the ‘administrative machinery’
(whether constituted by followers or trustworthy people), tend to bring
it about that the relation ‘ends up resting, ideally and materially on a
lasting fundament of unitary character.’ This becomes especially evident
when the problem of succession poses itself, after the demise of the
carrier of charisma. The problem can be solved in different ways.
Charismatic leaders can themselves, while still alive, look for a new
carrier of charisma or designate a successor; the administrative
machinery ‘charismatically qualified’ can designate a successor, or the
charisma can be identified as a quality of the blood and therefore
parental inheritance can affirm itself as a mechanism of succession.
Clearly, personal charisma is not always abandoned as such but it
certainly tends to be subordinated to an administrative, charismatically
qualified apparatus, capable of recognizing the ‘signs,’ or of deciding
upon the norms on the basis of which the charisma can be recognized in
the most rigorous possible way. Sometimes the personal charisma
disappears completely, as is the case with primogeniture, which affirmed
itself in the West and in Japan, where lords or monarchs are such
independently of the recognition of their subjects; from this point of
view, they can be completely devoid of any personal charisma.
Among these forms of transmission of charisma, of particular importance
is the possibility of transmission through ritual acts, which takes a
concrete form in the
charisma of function. This is the case of priestly charisma and of royal
charisma, both transmitted through purification, laying on of hands, and
anointing. The charismatic capacities that have been acquired this way
are independent of the personal qualities of those who possess them.
The transformation of charisma into everyday practice has many
consequences. For example, the establishment of the principle of the
charisma’s inheritance extended to the administrative apparatus implies
the hereditary transmission of the powers of lordship and administration
of goods, therefore founding the type of the ‘aristocratic state.’
Moreover, the transformation of charisma into everyday practice implies
the loss of the extraeconomic character, distinctive of voluntary
militant adherence, in favor of a continual acquisition and
redistribution of resources. The vassals are replaced by the ‘taxable
subjects,’ the trustworthy supporters by the party’s executives, the
penitential charisma of the martyrs and ascetics by the official charisma
of the bishops and priests.
Weber also establishes a relation to the economy: ‘But the more developed
the economic interdependencies of the monetary economy, the greater the
pressure of the charismatic subject’s everyday needs becomes’ (Weber
1956/1994, p. 44). To conclude: ‘Charisma typically appears early on in
the development of a religious (prophetic) or political (conquering)
authority. It will give way before long, however, to routine powers as
soon as its authority has been assured and, above all, as soon as it has
gained sway over the masses’ (Weber 1956/1994, p. 44). In this way the
passage from the charismatic to the ordinary administration takes place.

3. Charisma as an Operational Concept in Contemporary Research

Contemporary sociology has shown interest in the Weberian concept of
charisma in various ways, in the frame of a general rediscovery of this
author. Important studies have been carried out in the field of political
sociology where a reconstruction of the ‘charismatic leader’ has been
accomplished. From this reconstruction a new interpretation of
totalitarian dictatorships, and of the crisis of contemporary
democracies, has emerged. Besides this, a second stream of research can
be seen in the field of the study of the routinization processes of
charisma and, in particular, of the coexistence of personal and official
charisma. In fact, if we admit that among the different forms of
transmission of charisma there is one that rests on ritual acts carried
out by a ‘charismatically qualified’ authority, a substantially different
situation is reached. In this new situation, the faith of the believers
is not directed toward a person but toward a way of transmitting an
extraordinary quality, reproduced through a ritual, which is in turn
guaranteed by an institution. The functional charisma does not refer to
the subject anymore but to the ‘belief in a state of grace, specific to
the institution’ (Séguy 1988). The result is a situation opposite to the
initial one: ‘from being a personal, unstable, and ephemeral
characteristic, and by definition nontransmissible, the charisma becomes
stable and enduring’ (Séguy 1988, p. 18, my translation). In this way, it
becomes possible to imagine a ‘charismatic grace,’ characteristic of the
carrier of personal charisma, as opposed to an ‘institutional grace.’
Weber does, of course, admit the possibility that a personal charisma may
hide inside a subject (for instance a priest) who carries a functional
charisma. Jean Séguy does not, however, see an exceptional opposition
between the two, but more a kind of kinship or elective affinity.
Functional charisma is not given in an indiscriminate fashion from the
charismatically qualified institution to everyone who may ask for it. It
is not just the result of a charismatic education, but is founded on the
pre-existence of the charisma itself inside the subject. Briefly, the
institution is only a caretaker whose task is to protect the personal
vocation, and therefore the gift, that the individual has received in a
completely free and personal way, and to let it grow, therefore
guaranteeing its realization.

4. Methodological Issues and Possible Developments

The concept of charisma has gained remarkable success in ordinary
language. There, it has been assumed in its weak meaning, becoming
synonymous with qualities that are functionally connected to the mere
communication process. The charismatic person is first and foremost a
communicator and the ‘charisma’ consists essentially in his or her
capacity to communicate and convince. The current extension and
complexity of mass media communication renders such qualities crucial for
media success. The decisive character of this kind of success in the
contemporary communication society renders the success of such a weak
understanding of the concept of charisma treacherous and misleading. In
fact, even if the communicative qualities that explain media success are
personal and free, they do not establish real leaderships.
When passing from the revelation of pure media success to the contents of
such success, it is evident that it is not possible to put on the same
level the success of a television journalist and that of the media
preachers who were flourishing in the USA during the 1980s and 1990s. Nor
is it possible to compare the latter with the media success of a
political leader. Compared to the original Weberian conceptualization,
the mere art of communicating and convincing certainly does not contain
the social aspects of
charisma, except in the most insignificant way. Media ganzen Struktur, ihrem “Geist” nach, grundverschieden
success does not make this carrier a ‘people’s leader,’ von der rationalen Leitung eines regulären großka-
nor does it make the person the carrier of a mission, pitalistischen “Betriebs”...’ However, they are part of
which in some way transcends his or her ordinary the ‘twofold nature’ of what can be called ‘capitalistic
professional task. Nothing impedes the carrier of spirit’: ‘...und ebenso das Verständnis des spezifischen
personal charisma, spiritual as well as political, ‘from Eigenart des modernen, “berufsmäßig” bürokratisierten
the plebiscitary ruler, to the big demagog, or to the Alltagskapitalismus ist geradezu davon abhängig, daß
leader of a political party’ (Weber, 1958, p. 50, my man diese beiden, sich überall verschlingenden, im letzten
translation) from making constant and methodical use Wesen aber verschiedenen Strukturelemente begrifflich
of the potentialities of the mass media. This does not scheiden lernt’ (Weber 1956, p. 667). In a word:
prevent the charisma—understood here as charac- charismatic entrepreneurship not only is not foreign
terized by the assumption of a leadership responsibility to ordinary routine bureaucratized capitalism but
following an absolutely personal call and in response belongs to it, even though rationality and charisma
to a crisis situation—from expressing itself completely should be conceptually distinguished.
independently from the mediagenic qualities of the Charisma remains therefore clearly present inside
subject and, therefore, of their eventual enhancement rational society, even if it is intertwined with bureau-
through the mass media. cratic-legal power and the professionally qualified
The concept of charisma, correctly interpreted in its bureaucracy. Rather than presenting exemplars of the
ideal-typical meaning, is far away from being just a pure types indicated by Weber, the concept of char-
historically born out and conceptually concluded isma appears to be an important conceptual tool to
sociological category. Even if charisma is often con- define and explain phenomena such as personal power
nected by Weber to the infancy of societies and is and consensus in the bureaucratic-rational institu-
supposed to come before the widespread and all- tional organizations. It can work as a precious revealer
embracing process of rationalization, the ‘charismatic’ of situations of latent crises as well as of conflicts—
moment continues to cross contemporary social estab- latent or open—which are growing in the society as a
lishments periodically, every time they find themselves whole and the various institutions inside it. If it is true
in a context of crisis. Elements of ‘charismatic power’ that we are living in an epoch ‘without God or
are sometimes present, not only behind official char- prophets’ it is also true that elements of personal
isma but also to distinguish between the functionary of charisma, in its various forms, do not cease to appear,
a party with a strong ideological identity and who, mostly, even if not exclusively, in the political and
chosen among the holders of leadership offices, will be religious arenas.
its actual leader. While occupying the leadership
position this person will trace the boundaries of the See also: Authority, Social Theories of; Elites: Soci-
ethic of conviction, redefining the objectives. Of ological Aspects; Ideal Type: Conceptions in the Social
course, this is not enough: In their redefinition of the Sciences; Institutionalization; Judaism; Leadership in
objective, leaders need to lean on the main principles Organizations, Psychology of; Leadership, Psycho-
with which they feel themselves to be personally logy of; Legitimacy; Legitimacy, Sociology of;
invested. Elements of ‘personal charisma’ and of Organizational Climate; Organizational Culture;
‘exemplary prophecy’ can therefore be found inside a
Organizations: Authority and Power; Organizations,
party secretary as well as inside every founder of a
religious movement. It was exactly among these Sociology of; Power in Society; Power: Political;
religious movements that the term ‘charisma,’ among Religion, Sociology of; Weber, Max (1864–1920)
others, was reintroduced into Paul’s original interpret-
ation. The charismatic movements, which came into
being in the beginning of the twentieth century within
Protestant circles, have been spreading within Euro- Bibliography
pean Catholicism since the 1970s. Based on prayer, Boudon R, Bourricaud F 1989 Charisma. In: Boudon R,
and on biblical, theological, and spiritual growth, they Bourricaud F (eds.) A Critical Dictionary of Sociology.
consider the charisma as a totality of gifts given by the University of Chicago Press, Chicago, pp. 69–73
Holy Spirit, which is able to give life to the present Cavalli L 1981 Il capo carismatico. Il Mulino, Bologna, Italy
Christian communities in the same way as it did to the Ducros X 1937 Charisma. In: Viller M (ed.) Dictionnaire de
original ones. SpiritualiteT . G. Beauchesne, Paris, Vol. 2, pp. 503–7
Se! guy J 1988 Charisme de fonction et charisme personnel: Le cas
In many parts of his work Weber admits the possi-
de Jean Paul II. In: Se! guy J et al. (eds.) Voyage de Jean Paul II
bility of a charismatic presence inside rational society, en France. Cerf, Paris, pp. 11–34
founded on instrumental rationality. This is the case Shils E 1958 The concentration and dispersion of charisma:
with some finance managers who, thanks only to their Their bearing on economic policy in underdeveloped coun-
uncommonly strong personal trustworthiness, are able tries. World Politics 11: 1–19
to obtain conventions and agreements that others can Shils E 1965 Charisma, order and status. American Sociological
absolutely not obtain. Such phenomena are ‘in ihrer Reiew 30(2): 199–213

Scholem G 1974 The Messianic Ideology in Judaism and Other Essays in Jewish Spirituality. Schocken Books, New York
Social Compass 1982 About the theory of charisma. Special issue, XXIX
Weber M 1947 The Theory of Social and Economic Organization, 1st American edn. Oxford University Press, New York
Weber M 1956 Wirtschaft und Gesellschaft. J. C. B. Mohr, Tübingen, Germany
Weber M 1956/1965 Wirtschaft und Gesellschaft. J. C. B. Mohr, Tübingen, Germany, Vol. I, Chap. V, Sect. 1 [trans. The Sociology of Religion. Methuen, London]
Weber M 1956/1968 Wirtschaft und Gesellschaft. J. C. B. Mohr, Tübingen, Germany, Vol. I, Chap. III, Sect. 10 [trans. Eisenstadt S N (ed.) Max Weber on Charisma and Institution Building. University of Chicago Press, Chicago]
Weber M 1956/1994 Wirtschaft und Gesellschaft. J. C. B. Mohr, Tübingen, Germany, Vol. I, Chap. III, Sect. 11–12a [trans. Heydebrand W (ed.) Sociological Writings. Continuum, New York]
Weber M 1958 Politik als Beruf. Duncker and Humblot, Berlin
Weber M 1985 Wissenschaft als Beruf. In: Weber M (ed.) Gesammelte Aufsätze zur Wissenschaftslehre. J. C. B. Mohr, Tübingen, Germany, pp. 582–613
Zingerle A (ed.) 1993 Carisma. Dinamiche dell’origine e della quotidianizzazione / Charisma. Dynamiken des Ursprungs und der Veralltäglichung. Special issue of Annali di Sociologia / Soziologisches Jahrbuch 9

S. Abbruzzese


Chemical Sciences: History and Sociology

The chemical sciences are concerned with specific kinds of matter, and their transformations. The
boundaries of chemistry, notably with physics and biology, are, however, social constructions that vary
with time and place. Chemistry is very ancient, going back into remote prehistory with cookery, the
preparation of drugs and dyes, the baking of clay into ceramics, and metal-working. Its evolution into a
science, where theory guides practice, and into a profession, with formal courses and qualifications,
happened in the eighteenth and nineteenth centuries (Brock 1992).

1. Ancient Technologies

From very remote times, people have been using techniques and processes which we would call chemical,
involving careful control, as part of a craft or art passed from father to son, mother to daughter, or
master to apprentice. Indeed the word ‘chemistry’ is supposed to come from an ancient Egyptian word
‘chem’ meaning earthy: the Arabic definite article ‘Al’ was added to yield our ‘alchemy,’ and dropped to give
‘chymistry’ and then, by 1700, ‘chemistry.’ Early technologies, culminating in triumphs such as the
making of porcelain and Japanese swords, include features we would regard as magical; but since the
course of chemical reactions depends upon the purity of components, those involving natural products are
hard to predict and to repeat. Recourse to prayers, incantations, and curious additives should not amaze us.

2. Metallurgy, Alchemy, and Pharmacy

Alchemy, with its objective of converting base metals into gold, which chemist-historians portrayed as
absurd or dishonest, was not unreasonable. Nature was believed to be perfecting metals within her womb,
and the alchemist was simply speeding up the process. If everything was composed, as Aristotle believed, of
the four elements Earth, Water, Air, and Fire in different proportions, then changing these ratios
would transform one substance into another, and lead might become gold. If, alternatively, Democritus and
Epicurus were correct in believing that in appearance there were colors, smells, and tastes, but in reality
atoms and void, then because lead, gold, and everything else is made up of different arrangements of these
ultimately similar atoms (or ‘corpuscles’), again conversions are possible.

Alchemy began in Egypt and Babylonia, and also in China, with emphasis both upon making gold (or
maybe something resembling it) using an elixir to expedite the process, and upon prolonging and enhancing
life by giving humans the noble and permanent qualities of gold. Pharmacy grew out of trial and error,
but in the West the maverick Swiss doctor calling himself Paracelsus (1493–1541) brought alchemy into
it. He introduced metallic compounds into previously herbal medicine, notably for the treatment of the new
disease, syphilis, which was ravaging Europe. He publicly burnt the books of the great Greek physician,
Galen, and saw chemical study as essential for medicine. His career outraged the medical establishment,
but the powerful and dangerous new remedies proved irresistible to doctors and patients, and medical
schools became centers for chemistry.

3. The First Chemical Theories

Until the mid-twentieth century, it was believed that alchemy was abandoned by the rational thinkers of the
Scientific Revolution, especially Robert Boyle (1627–91) and Isaac Newton (1642–1727). Close examination
of their manuscripts (Principe 1998) shows that both of them were in fact adepts, copying out and trying
alchemical recipes, and believing that they were well on the way to a transmutation. But they were also
adherents of the atomic view of matter, seeing hard and indestructible corpuscles or particles as
fundamental. These formed very stable primary mixts, such as iron, gold, or sulfur, which in turn combined
with each other. Unlike gravity, which was universal, chemical affinity was elective: some substances reacted
together, others did not. J. W. Goethe wrote a novel,

Elective Affinities (1809), exploring chemical and human bonding; chemists in eighteenth-century
Germany and Sweden (the center of chemical activity) drew up tables of affinity in attempting to predict the
outcome of reactions.

From Germany also came the first chemical paradigm. G. E. Stahl (1660–1734) proposed that everything
which would burn contained ‘phlogiston’ (Greek, flammable): this idea brought order into chemical
understanding, whereas atomic ideas were vague and untestable. Moreover, in Germany Lorenz Crell in
1778 began the first chemical journal, Chemische Annalen, bringing into being a chemical community
there (Hufbauer 1982). His example was followed in France and England by Antoine Lavoisier and
William Nicholson.

4. Lavoisier’s Revolution

Lavoisier (1743–94) was a wealthy man, prominent in the privatized tax system of France; his spare time he
devoted to chemistry, in a splendidly equipped laboratory. Becoming a member of the Royal Academy
of Sciences, the small salaried body charged with scientific research, he resolved to reform the language
and theory of chemistry. As in Carl Linnaeus’ botany, names should be international, clear, and free from
changeable theory; phlogiston should be replaced as incoherent. Stahl saw phlogiston emitted in burning;
Lavoisier by contrast (in a classic paradigm shift) saw something absorbed from the air, leading to an
increase in weight. He drew upon the work of Joseph Priestley (1733–1804), who had isolated ‘vital’ or
‘eminently respirable’ air in a British tradition of work on gases. Lavoisier christened this substance ‘oxygen’
(Greek, sour) because he believed that it was also responsible for acidity (generalizing from analyses of
nitric and sulfuric acids). Water was a compound of oxygen with another gas, hydrogen: such elements
were the basis of chemistry, rather than the hypothetical corpuscles which might concern physicists, or
the Earth, Water, Air, and Fire with which Priestley’s friend Thomas Jefferson (1743–1826) structured his
book on Virginia. In 1794 Lavoisier was executed as a tax profiteer during Robespierre’s Reign of Terror,
while the left-wing views of Priestley (who continued to disagree with him over phlogiston) led to his exile in
Pennsylvania. But their new and exciting chemistry survived and prospered (Bensaude-Vincent and Abbri
1995, Knight and Kragh 1998).

5. Electricity and Chemistry

In 1799 Alessandro Volta (1745–1827) showed that electricity was generated when two metals were dipped
into water; there was no need for any animal tissue, as Luigi Galvani (1737–98) had supposed. His paper was
an alarm bell, as Humphry Davy (1778–1829) put it, and chemists everywhere repeated and extended the
experiments. But results were confusing until in 1806 Davy did the careful experiments confirming his
intuition that pure water is decomposed electrically into oxygen and hydrogen only. Just as Newton had
found that gravity was the force behind planetary motions, so Davy inferred that electricity and chemical
affinity were manifestations of one power. In 1807 he used this insight in isolating the light and reactive
metals potassium and sodium, and (putting Britain back on the chemical map) went on to demonstrate,
with chlorine, that Lavoisier had been wrong about acidity.

Davy had been appointed to the newly founded Royal Institution in London’s fashionable West End,
where he proved himself a lecturer of enormous attractiveness, making professing a performance art
(Golinski 1992, Knight 1998). The fees which men and women paid to join, and hear him, supported a
research laboratory in the basement. Davy became one of the first people in Britain to make a living out of
chemical research, which had previously perforce been a hobby for an aristocrat like Boyle, a minister of
religion like Priestley, or a doctor like Galvani. At the Royal Institution, Davy trained (in a kind of informal
apprenticeship) his successor, Michael Faraday (1791–1867), and the pursuit of Davy’s insight that
chemical affinity was electrical continued there.

With Lavoisier, chemistry had acquired an exact language, closer to algebra than to the evocative terms
of the alchemists; and it had testable theories, for example of acidity. It was the science of the secondary
qualities, of colors, smells, and tastes; it promised to be useful (chlorine for disinfecting and bleaching, for
example, and oxygen for chest complaints); and it proved popular everywhere. With its connection to
electricity, it became the dynamic fundamental science, concerned not just with matter but also with force;
there was as yet no unified science of physics. Mechanical explanations seemed shallow, while chemistry’s
connections with heat, light, and electricity went deep.

6. A Mature Science

J. J. Berzelius (1779–1848) in Sweden used the unsystematic Davy’s insight to create a structure for
chemistry, ‘dualism,’ based on the idea that every compound had a positive and a negative part. He also
picked up John Dalton’s idea that each element was composed of atoms, identical to each other and
different from those of other elements: Berzelius arranged these in an electrochemical series from
oxygen, the most negative, to potassium. The number of elements known steadily grew through the century
with improvements in chemical analysis.

Berzelius trained a number of chemists by having them stay in his house, where Anna the housekeeper
washed up dishes and flasks. But in the 1820s Justus Liebig (1803–73) at the University of Giessen launched

the first graduate school for turning out a stream of chemists with Ph.D. degrees (Brock 1997, Morrell
1997). Liebig’s success depended upon his having perfected apparatus for analyzing organic compounds;
his students usually did their research on some natural product, and published it in the journal
which came to be called Liebigs Annalen after its editor. They found jobs, particularly in the dye industry
(Fox and Nieto-Galan 1999) and in pharmacy, which were both becoming based in science rather than craft
skills; many went to England, a rich country with a poor educational system. With the collapse of Napoleon’s
empire in 1815, the University of Berlin had emerged dedicated to research and teaching (Wissenschaft und
Bildung), and the various German states began to compete in their opera houses and universities. They
followed Giessen, building better laboratories and bidding for star chemists. Schools began teaching
chemistry, textbooks were needed (Lundgren and Bensaude-Vincent 2000), and academic careers opened
up in a field now largely separated from medicine. Universities in Britain and the USA followed the
German model, usually demanding German research experience from professorial candidates.

In the 1850s chemists could agree about what things were made of, but not about formulae. Dalton had
supposed that water must have the simplest possible formula, HO; Davy and others, notably Amedeo
Avogadro (1776–1856), went for our H2O formula because two volumes of hydrogen combine with one of
oxygen. An atom of oxygen thus weighed either 8 or 16 times as much as one of hydrogen, and such
uncertainties ran through the whole list of elements. In 1860 August Kekulé (1829–96), a pioneer in working
out chemical structures such as that of benzene, called for an end to this confusion through an international
conference, which met in Karlsruhe. It was poorly organized, but afterwards chemists came to accept the
reformulation of Avogadro’s arguments by Stanislao Cannizzaro (1826–1910). With agreed atomic weights,
tabular arrangement of the elements became possible; and the most successful was the Periodic Table of
Dmitri Mendeleev (1834–1907).

His predictions of the properties of some hitherto undiscovered elements were startlingly accurate; and
with the table (as he hoped) the student had to remember fewer brute facts. From its position an
element’s properties would be known. It is striking that so many bright ideas, from Dalton via Cannizzaro
to Mendeleev, came from people on the periphery rather than in the great scientific centers.

7. The Fragmentation of Chemistry

In death, we rot: for we (and animals and plants) are then subject to chemical reactions which go differently
while we are alive. Most people believed in a vital force which maintained life. It is claimed that Friedrich
Woehler (1800–82), pupil of Berzelius and friend of Liebig, destroyed this vitalism when in 1828 he
synthesized urea. In fact the chief interest in this reaction was that ammonium cyanate and urea turned
out to have the same atomic constitution: their different properties were the result of different
arrangements (Brooke 1995). So the story has more to do with understanding molecular structure; but the
synthesis, and the work of Liebig and his students in analysis, showed that no gulf separated organic and
inorganic worlds. Nevertheless, by 1848 when Berzelius died, it was clear that dualism did not fit
organic compounds well, and as the chemical community grew it was convenient to separate organic
chemistry, based upon carbon, from the inorganic branch. The expansion of universities led to new
professorships and laboratories devoted to the specialism of organic chemistry, from which in the twentieth
century emerged biochemistry.

Chemists had relied upon balances, test-tubes, condensers, blowpipes, and other apparatus difficult
to manipulate. The chemist had to think with his (or occasionally her) fingers, and was proud of skills in
glassblowing. Chemistry was essentially experimental, exciting and often dangerous, attractive. Then in 1860
came collaboration between Robert Bunsen, inventor of the controllable gas burner, and the physicist G. R.
Kirchhoff, who found that elements heated to a high temperature have characteristic spectra. Analysis
could be done by physical methods, and this optical spectroscopy was the first of what is now an armory of
such techniques which has transformed the appearance of chemical laboratories (Morris and Travis, in
Krige and Pestre 1997, pp. 715–40).

About the same time the new science of thermodynamics, based on energy and its transformations,
brought together into classical physics sciences which had been separate, or had been part of the empire of
chemistry. Davy and Faraday had been pioneers in what became a new specialism, physical chemistry,
investigating energy changes in reactions, and the mechanisms, rates, and reversibility of processes. The
leaders here were Wilhelm Ostwald (1853–1932) and J. H. van’t Hoff (1852–1911), who launched a journal,
and promoted academic positions and laboratories. The new profession of chemical engineering was
closely linked to the rise of physical chemistry. Whereas early in the nineteenth century chemists had
been called in only as consultants or troubleshooters when something went wrong, by the end of it they were
employed full-time (Bud and Roberts 1984). In industry, intellectual property belongs generally to the
company and not the individual, and is secured by patents (Travis et al. 1998).

8. The Reduction of Chemistry

The nineteenth century was the heyday of chemistry, the golden age in which it came to maturity and
seemed fundamental. The chemist and spectroscopist

William Crookes (1832–1919) followed Faraday in studying cathode rays, but J. J. Thomson in 1897
identified them as composed of subatomic corpuscles, soon named ‘electrons.’ The subsequent nuclear atomic
model of Ernest Rutherford (1871–1937), for whom all science was physics or stamp-collecting,
and Niels Bohr (1885–1962) accounted not only for spectra, but also for the Periodic Table. Chemistry
became a branch of physics (Nye 1996); the properties of gold could in principle be calculated from data
about protons, neutrons, and electrons, though in practice the chemistry laboratory is essential. This
meant that chemistry lost its glamor; the chemist was as ubiquitous as ever, an essential member of the
teams or groups so characteristic of twentieth-century science, but playing a service role (Knight 1995).

The number of chemists has continued to grow, as has the number of new substances unknown in nature
which they have synthesized. Davy wrote of the chemist being a godlike creator, and this creativity is
nowadays celebrated by chemists such as Roald Hoffmann. The engineer or architect must remember
the law of gravity, but to mourn that architecture has been reduced to physics would be absurd: like the poet
or the painter, the chemist has to work within constraints, but that is a feature of life, indeed one that makes
creativity possible (Hoffmann and Torrance 1993).

Hoffmann, born in Poland, surviving World War II, escaping to the USA, learning chemistry there, and
doing research which brought him a Nobel Prize, exemplifies another trend. Chemistry reached the West
via Islam. By the eighteenth and nineteenth centuries, it was a European science, with Germany the most
important center by 1900. Papers in German journals, and research experience in Germany, counted high in
any pecking order; but already the USA was becoming a major power in science. There in 1916 G. N. Lewis
proposed the electronic theory of chemical combination, much developed by his pupil Linus Pauling.
Since 1945 the USA has been the center of things, making the English language and publication in US
journals the key to prestige in research. Two world wars, and Hitler’s coming to power between them, are
part of the reason for this; equally important has been American prosperity, itself dependent on science.
Chemistry has steadily gone West.

9. The Status of Chemistry

Davy and Liebig (Rossiter 1975) wrote famous books on agricultural chemistry, and in the nineteenth
century chemical fertilizers and pesticides were unequivocally welcomed in a Europe of food shortages.
Lavoisier improved French gunpowder, and later chemists produced high explosives making possible
engineering achievements and also formidable weapons. All these things were seen as benefits. National
chemical societies, their academic and professional aspects sometimes in tension, were formed and
enjoyed prestige (Russell et al. 1977). Although pollution from new chemical industries (as from older
ones like tanning) was palpable and led to legislation, the expectation was that the chemists would be able to
cure it. It did not happen: Rachel Carson’s book Silent Spring (1962) alerted the world to the dangers.
So in the late twentieth century, despite successes such as plastics, and the array of new drugs available for
medicine, chemistry is seen as boring and its applications as threatening. Chemists feel misunderstood
and underappreciated.

Twentieth-century chemistry is dominated not only by universities in the Giessen tradition, but also by
big-spending international companies with research laboratories, now turning towards biotechnology
(Galambos and Sturchio, and Kevles, in Krige and Pestre 1997, pp. 227–52, 301–18) and by the military.
Research is carried on no longer by a Woehler or a Crookes, on their own or with an assistant, but by
teams of people possessing various skills. Chemistry has been taught in an impersonal way, with less
hands-on experiment in a world more conscious of health and safety.

This is a strange eventful history, which was until the mid-twentieth century mainly written by
participants who looked for progress. They had the advantage of being familiar with chemicals and
apparatus; but professional historians of science have come to look more closely at contexts and careers. The
history which emerges deserves to be known beyond the world of chemists.

See also: Archaeometry; Behavioral Neuroscience; Biomedical Sciences and Technology: History and
Sociology; Ceramics in Archaeology; Cognitive Neuroscience; History of Science; Human Sciences:
History and Sociology; Physical Sciences: History and Sociology; Research and Development in
Organizations; Scientific Disciplines, History of; Technological Innovation

Bibliography
Bensaude-Vincent B, Abbri F 1995 Lavoisier in European Context. Science History, Canton, MA
Brock W H 1992 The Fontana History of Chemistry. Fontana, London
Brock W H 1997 Justus von Liebig: The Chemical Gatekeeper. Cambridge University Press, Cambridge, UK
Brooke J H 1995 Thinking about Matter. Ashgate Variorum, Aldershot, UK
Bud R, Roberts G K 1984 Science versus Practice. Manchester University Press, Manchester, UK
Fox R, Nieto-Galan A 1999 Natural Dyestuffs. Science History, Canton, MA

Golinski J 1992 Science as Public Culture: Chemistry and Enlightenment in Britain, 1760–1820. Cambridge University Press, Cambridge, UK
Hoffmann R, Torrance V 1993 Chemistry Imagined. Smithsonian, Washington, DC
Hufbauer K 1982 The Formation of the German Chemical Community 1720–1795. University of California Press, Berkeley, CA
Knight D M 1995 Ideas in Chemistry. Athlone, London
Knight D M 1998 Humphry Davy: Science and Power. Cambridge University Press, Cambridge, UK
Knight D M, Kragh H 1998 The Making of the Chemist. Cambridge University Press, Cambridge, UK
Krige J, Pestre D 1997 Science in the Twentieth Century. Harwood, Amsterdam
Lundgren A, Bensaude-Vincent B 2000 Communicating Chemistry: Textbooks and their Audiences, 1789–1939. Science History, Canton, MA
Morrell J 1997 Science, Culture and Politics in Britain, 1750–1870. Ashgate Variorum, Aldershot, UK
Nye M J 1996 Before Big Science. Twayne, New York
Principe L 1998 The Aspiring Adept. Princeton University Press, Princeton, NJ
Rossiter M 1975 The Emergence of Agricultural Science: Justus Liebig and the Americans, 1840–1880. Yale University Press, New Haven, CT
Russell C A, Coley N G, Roberts G K 1977 Chemists by Profession. Open University Press, Milton Keynes, UK
Travis A S, Schröter H G, Homburg E, Morris P J T 1998 Determinants in the Evolution of the European Chemical Industry, 1900–1939. Kluwer, Dordrecht, The Netherlands

D. Knight


Chess Expertise, Cognitive Psychology of

Expertise may be defined as the ability of some individuals to perform at levels vastly superior to the
majority. For historical and scientific reasons, research on chess expertise has played a major role in the study
of expertise in general. The first reason is that chess itself has a very long history (the modern form of
Western chess goes back to the sixteenth century). This has made possible an extensive study of the game,
leading to the development of several ‘theories’ about the proper way to play by leading players such as
Steinitz, Nimzowitsch, and Euwe. Next, the rules of chess offer a well-specified and constrained environment
that is easily formalizable. Chess is also a game flexible enough to allow multiple experimental
manipulations. In addition, the presence of a rating system (the Elo system) allows one to estimate players’
skill quantitatively and precisely. Compared to most other domains of expertise, this ability to measure skill
is a definite advantage. Contrast this situation with, for example, the study of experts in physics or
medicine, where researchers have to use very rough classifications such as novice, intermediate, and expert.
Finally, there has been rich cross-fertilization between psychological research on chess expertise and
research in formal fields like computer science and mathematics.

1. Chess in the Sciences

1.1 Chess in the Formal Sciences

Unsurprisingly, chess has been a favorite subject of study in the formal sciences. On several occasions,
chess has been used to explore aspects of game theory; in a celebrated paper published in 1912, Zermelo
formalized the concept of game tree and introduced the method of backwards induction with reference to
chess. The game has also been of interest to mathematicians, for example in the field of combinatorics.
However, most of the research has been done in artificial intelligence and computer science. If one
ignores chess automata, most of which turned out to be fraudulent, computer chess started in earnest in
1949 with Shannon’s paper describing a computer program able to play an entire game, either by full
search to a specified depth or by selective search. Since that seminal work, researchers have extensively
explored various techniques for improving the efficiency of search algorithms or for making search more
selective (see Newell and Simon 1972, Levy and Newborn 1991). The crowning achievement of the
quest for efficient search algorithms (the so-called ‘brute-force’ approach) was the development of Deep
Blue, the first computer to beat a world champion in an official match. Deep Blue’s special-purpose
hardware allowed it to consider up to 200 million positions per second. By contrast, a nice example of the
selective-search approach is a program written by Pitrat (1977), which uses heuristics to cut the search tree
down to the same size as humans’ (about 100 positions). Recently, computer chess has seen a strong interest
in database theory and in the development and testing of machine-learning algorithms.

1.2 Chess in the Social and Behavioral Sciences

Expert behavior in chess has attracted the attention of various social and behavioral sciences, including
psychoanalysis, psychiatry, and sociology. Questions such as ‘Does extreme practice of a skill lead to
madness?’, ‘Why are women weaker than men at chess?’, ‘Can oedipal compulsions lead to creativity?’,
and ‘Why is there a high proportion of Jews among top players?’ have been asked in these fields, although
the answers offered are often controversial (Dextreit and Engel 1981, Holding 1985). In addition, chess has
sometimes been used not as an object of study, but as

1663
Chess Expertise, Cognitie Psychology of

a model. Two examples may suffice. In philosophy, Lasker developed a philosophical system (machology) based on the element of fight extant in a chess game. In linguistics, De Saussure made regular use of chess to illustrate the rule-like character of language. For instance, he drew an analogy between a chess game and the synchronic analysis of language: if somebody walks into a room where a chess game is being played, they can study and understand the position without knowing the moves leading to it.

2. Psychological Studies of Chess Expertise

However, in none of the social and behavioral sciences mentioned above has chess had such an impact as in psychology, where it has been used as a standard task environment for exploring expertise (Simon and Chase 1973). Several concepts, such as progressive deepening and selective search, and several experimental techniques, such as the use of verbal protocols and the use of recall tasks to study expertise, have their main source in chess research.

2.1 Brief History of Psychological Research

The first psychological investigation of chess expertise was carried out at the end of the nineteenth century by Binet, who was interested in masters’ ability to play a game, or even several games, without seeing the board. This work, using questionnaires and focusing on the search for chess players’ hypothetical, and probably nonexistent, concrete visual memory, no longer has much impact. Nor does the work carried out in 1927 by Djakow, Petrowski, and Rudik, who were the first psychologists to bring chess players into the laboratory. These Russian scientists, who gave a battery of psychometric tests to their subjects, were interested in the question of chess talent. Their tests measured various ‘faculties of mind,’ such as memory, attention, or combination power. It turned out that chess masters did not differ from lay people on most of these tests, the exception being tasks of visual memory where the stimuli bear a strong resemblance to chess-boards.

The next wave of research on chess psychology had a huge impact on expertise research and on cognitive psychology in general. It was the work of a single man, the Dutch psychologist and chess master Adriaan De Groot. In his doctoral thesis in 1946, De Groot (1978) introduced two key methods (recall of briefly presented positions, and analysis of verbal protocols collected during problem solving), which allowed him to uncover several key determinants of expertise. First, players at all levels are highly selective in their search. Even top-level players do not search much more than about one hundred positions during a 15-minute deliberation. Second, there exist almost no differences between the search behavior of world-class grandmasters and that of weaker players, as far as measures such as depth of search, breadth of search, or branching factors are concerned. Third, players of all levels visit the same variation several times when choosing a move, a phenomenon De Groot called ‘progressive deepening.’ This behavior allows cognitive systems of limited capacity, such as those of humans, both to propagate information from a given branch of the search tree to other branches, and to overcome the limits of short-term memory (De Groot and Gobet 1996). Fourth, masters are able to zoom into the key features of the problem at hand very rapidly. Fifth, chess masters normally judge a position based on a single feature, which can be either static (e.g., material balance) or dynamic (e.g., potential actions). This is in sharp contrast to computer programs, which typically use a polynomial function to combine a large number of features (Levy and Newborn 1991). Sixth, chess masters have a remarkable memory for meaningful material taken from their domain, even when this material is presented for just a few seconds.

De Groot’s work, and through it the ideas of the German psychologist Otto Selz, was important in shaping the revolution in cognitive psychology (Newell and Simon 1972). As the cornerstone of most current research on expert behavior, it has spawned a large number of empirical studies, the most important of which are reviewed in the next section.

2.2 Key Empirical Results

It is common to organize chess research along the following lines: perception, memory, knowledge, look-ahead search, and general intelligence. Research on perception has confirmed De Groot’s earlier results, and has shown that players can identify and memorize critical patterns in a position even with presentation times below one second. Automatization also affects lower levels of perception: As shown by Saariluoma (1995), the speed with which chess pieces are recognized is a function of the level of expertise. Finally, eye-tracking studies have shown that chess masters have faster eye movements, cover more of the important squares on the board, and tend to look at the intersection of squares more often than weaker players (De Groot and Gobet 1996).

Empirical studies on memory have shown that chess players’ performance is mediated by variables such as depth of processing, presentation time, typicality, level of randomization, and age (for reviews, see Holding 1985, Saariluoma 1995, Gobet 1998). Interestingly, and contrary to widely held opinion, masters’ superiority is maintained with briefly presented positions, even after their semantics have been destroyed by randomizing the location of pieces (Gobet
and Simon 2000). While the effect is small, it is also reliable, and has been found in other domains of expertise such as programming and music.

As witnessed by the formidable time chess masters spend studying books and analyzing games, knowledge plays an important role in chess expertise. Several researchers have attempted to study knowledge directly, using techniques such as sorting experiments, questionnaires, and verbal reports. There is evidence that expertise correlates both with qualitative organization of knowledge and with quantitative amount of information stored (Freyhoff et al. 1992), reflecting findings of other domains of expertise such as physics.

While De Groot’s statistics about problem solving have in most cases withstood the test of time, subsequent research has identified a few skill differences. In particular, it has been found that depth of look-ahead search varies as a function of skill (e.g., Saariluoma 1995), although the effect is rather small.

Several studies have addressed the question of whether intelligence correlates with chess skill. The general conclusion is that there is a correlation with tests measuring general intelligence, but not with tests measuring visuospatial intelligence (Doll and Mayer 1987). From these studies, it is unclear whether chess practice develops aspects of intelligence measured in IQ tests (a good candidate for explanation would be ability to think under time pressure), whether attainment of a high skill requires superior intelligence, or whether both intelligence and chess skill are causally related to a third variable, such as an ability to concentrate for long periods.

Finally, chess has been a useful domain for studying the cognitive processes that support outstanding skill in problem solving across the life span. In particular, chess has been useful in identifying compensatory mechanisms used by older adults to allow high-level performance in spite of age-related declines in perceptual, memory, and cognitive abilities (Charness 1981).

3. Theories of Problem Solving

Given the rich set of empirical data generated by chess research, it is not surprising that chess has produced several theories of expertise. It is convenient to present these theories as mainly addressing either problem solving or memory, although they often aim at a general characterization.

The first theory devoted primarily to problem-solving behavior in chess is De Groot’s (1978) elaboration of Otto Selz’s framework of productive thinking. Selz proposed that thinking is a continuous activity that can be described as a linear chain of operations. De Groot showed that this framework was, with a few extensions and modifications, quite successful in accounting for chess thinking. As anticipated by Selz, players often use a hierarchy of subsidiary methods, which relates to Newell and Simon’s (1972) means-end analysis. For De Groot (1978), a necessary condition for becoming a chess master is the construction through experience of two things: a highly developed and specific mode of perception, and a system of reproductory methods stored in memory.

Chess players’ memory can be separated into knowledge (knowing that...) and intuitive experience (knowing how...). Selz’s description of thought as a sequence of operations is also apparent in the formal models developed by the Carnegie Mellon research group centered around Herbert Simon (this research is summarized in Newell and Simon 1972). Two computer programs (written by Newell, Shaw, and Simon in 1963, and by Baylor and Simon in 1966) implemented the idea of selective search made possible by the use of heuristics. Another model, developed by Newell and Simon in 1965, used the evaluation obtained at the end of a branch to formulate six principles that dictate the generation of moves and sequences of moves. Several of these ideas have been recently combined with the chunking theory (see below) in a probabilistic model of chess thinking (Gobet 1997).

Two informal theories have also been influential in research on expertise. Holding’s (1985) theory emphasizes the role of search and knowledge and suggests that human experts search in ways similar to computers. The theory discussed by Saariluoma (1995) proposes that players, while thinking about a position, access goal positions by apperception, that is, conceptual perception. They then try to close the path between the problem position and the goal position. When this is not possible, the problem space is restructured. Thus, chess thinking may be described as a sequence of apperception–restructuration cycles, which make it possible to find solutions with only limited search.

While these theories differ in their emphasis and in their specificity, going from formal computer programs to informal verbal theories, they also share a few important assumptions: They all emphasize the role of knowledge in problem solving and the high selectivity of human search.

4. Theories of Memory

One can identify four major theories of chess perception and memory in chess expertise (Gobet 1998). The most influential of them is Simon and Chase’s (1973) chunking theory, which proposes that experts acquire a vast database of chunks (perceptual patterns that can be used as units) giving access to semantic
memory and to procedural memory. Subsets of the chunking theory were implemented in 1973 in a computer program by Gilmartin and Simon. Based on extrapolations from the simulations, it was estimated that between 10,000 and 100,000 chunks are necessary to reach a high level of expertise. However, some weaknesses of chunking theory were uncovered by later research, mainly the fact that it underestimates storage into long-term memory (LTM), and that it underestimates the role of high-level structures such as schemata (Charness 1976, Holding 1985).

Several attempts have been made to repair these theoretical weaknesses, while still accounting for the data that chunking theory successfully explained. Holding (1985) is representative of a group of researchers emphasizing the role of high-level knowledge structures, such as schemata or prototypes. In opposition to the simple and specific structures proposed by Chase and Simon, Holding emphasizes that chess masters’ memories are richly organized and general. Emphasis is also given to metaknowledge, which consists of principles for efficient search and evaluation of positions.

Ericsson and Kintsch’s (1995) long-term working memory theory emphasizes that experts in various fields, including chess, can encode information into LTM more rapidly than had been postulated by traditional models of human memory. In the spirit of Selz and Newell and Simon’s (1972) approaches, this theory views cognitive processes as a sequence of stable states representing end products of processing. Through acquired memory skills, these end products can be stored in LTM and can be accessed from short-term memory by means of retrieval cues. Two intertwined mechanisms allow rapid storage into LTM. The first mechanism allows encoding through a hierarchical retrieval structure; in the case of chess, the retrieval structure corresponds to the 64 squares of the board. The second mechanism allows encoding through knowledge-based associations that elaborate patterns and schemata stored in LTM. Ericsson and Kintsch (1995) suggest that these two mechanisms account for chess masters’ excellent memory for chess material, as well as for their ability to plan and to evaluate alternative sequences of moves.

Gobet and Simon’s (2000) template theory proposes that patterns recurring often in players’ practice and study lead to the creation of more complex data structures, called templates. As with classical schemata, templates have both a core (containing the same information as chunks), and slots, where variable information can be stored. The template theory is implemented as a computer program, which acquires chunks in an unsupervised way by scanning a large database of master games. The program can simulate various data from perception, such as players’ eye movements during the first five seconds of presentation of a position, and from memory, such as the role of presentation time on memory recall. A version of the program also plays (poorly) by pure pattern recognition.

One of the important challenges facing researchers in the field is to provide a coherent and integrated picture of chess thinking, combining simple perceptual structures such as Chase and Simon’s chunks with more complex structures such as schemata. Long-term working memory and template theory can be seen as attempts to address this challenge.

5. Future Lines of Research

Chess expertise has been extensively studied in the past (more research has been done in psychology about chess than about all other games put together), and it is likely that this will continue in the future. In most cases, results from chess research can be generalized to other domains of expertise. While scientific understanding of expertise has grown substantially through a large number of experiments and through a wealth of theoretical developments, there is still no single theory able both to account for most of the empirical data and to simulate human behavior by playing chess at a high level of expertise. With the rapid progress in artificial intelligence and computer science, which have already produced grandmaster-level programs by brute force, it is, however, realistic to expect such a theory within a decade or two. In addition to this effort in computer modeling, a main research domain is likely to be neuropsychological investigation using brain imaging techniques to study the biological substrate of chess expertise.

See also: Expert Memory, Psychology of; Expertise, Acquisition of; Medical Expertise, Cognitive Psychology of; Protocol Analysis in Psychology; Short-term Memory, Cognitive Psychology of; Sports as Expertise, Psychology of; Working Memory, Psychology of

Bibliography

Charness N 1976 Memory for chess positions: Resistance to interference. Journal of Experimental Psychology: Human Learning and Memory 2: 641–53
Charness N 1981 Aging and skilled problem solving. Journal of Experimental Psychology: General 110: 21–38
De Groot A D 1978 Thought and Choice in Chess, 2nd edn. Mouton Publishers, The Hague, The Netherlands
De Groot A D, Gobet F 1996 Perception and Memory in Chess. Van Gorcum, Assen, The Netherlands
Dextreit J, Engel N 1981 Jeu d’échecs et sciences humaines [Chess and Human Sciences]. Payot, Paris
Doll J, Mayer U 1987 Intelligenz und Schachleistung – eine Untersuchung an Schachexperten [Intelligence and success in chess playing – an examination of chess experts]. Psychologische Beiträge 29: 270–89
Ericsson K A, Kintsch W 1995 Long-term working memory. Psychological Review 102: 211–45
Freyhoff H, Gruber H, Ziegler A 1992 Expertise and hierarchical knowledge representation in chess. Psychological Research 54: 32–7
Gobet F 1997 A pattern-recognition theory of search in expert problem solving. Thinking and Reasoning 3: 291–313
Gobet F 1998 Expert memory: A comparison of four theories. Cognition 66: 115–52
Gobet F, Simon H A 2000 Five seconds or sixty? Presentation time in expert memory. Cognitive Science
Holding D H 1985 The Psychology of Chess Skill. Erlbaum, Hillsdale, NJ
Levy D, Newborn M 1991 How Computers Play Chess. Computer Science Press, New York
Newell A, Simon H A 1972 Human Problem Solving. Prentice-Hall, Englewood Cliffs, NJ
Pitrat J 1977 A chess combination program which uses plans. Artificial Intelligence 8: 275–321
Saariluoma P 1995 Chess Players’ Thinking. Routledge, London
Simon H A, Chase W G 1973 Skill in chess. American Scientist 61: 393–403

F. Gobet

Chiefdoms, Archaeology of

The use of the term ‘chiefdom’ to refer to pre-state complex societies is a relatively recent phenomenon, beginning with Kalervo Oberg’s historical classification of South and Central American societies and Marshall Sahlins’s ethnographic work on Polynesian societies in the 1950s. Elman Service included ‘chiefdoms’ a decade later in his neoevolutionary categories of band, tribe, chiefdom, and state, and the concept became closely associated with the cultural evolutionary paradigm. Almost immediately, archaeologists began to apply the chiefdom concept to investigations of early societies such as the Mississippian towns of the southeastern USA and the Neolithic European builders of Stonehenge, which manifested some complexity but generally lacked the long-held archaeological indicators of ‘civilization’ (e.g., large urban centers, massive kingly tombs, written dynastic histories and legal codes, markets, and monetary systems).

1. Chiefdoms and Cultural Evolutionary Theory

Chiefdoms have been defined by some archaeologists in primarily political terms, as smaller-scale complex societies with centralized decision-making hierarchies of one or two levels above the individual village (compared to three plus levels in states). Chiefly leaders obtain their political authority through both ascription and performance (compared to largely hereditary kings), they have very generalized administrative roles (judicial, economic, ritual, military), they have few bureaucratic specialists (unlike the specialized bureaucracies of states), and they generally rely on kin-based alliances to structure political relations with subordinates (rather than the nonkin-based political institutions and formal legal codes of states). Most archaeologists would expand the definition to include social ranking that has at least some hereditary component (unlike the achieved ranking of tribal societies and the almost wholly hereditary social classes of early states) and some degree of economic centralization (control over staple production, building and managing irrigation systems, tribute mobilization, control over the production and exchange of prestige goods, and/or foreign trade monopolies). Chiefs generally maintain their political power not only through varying strategies for accumulating and disbursing material resources, but also through ideological means (manipulating cosmologies, ritual, and myth, and creating power-imbuing sacred landscapes) and military coercion. Chiefdoms rarely exist or evolve in isolation, but instead are found as clusters of interacting peer polities that often share aspects of elite culture, have similar structures, and compete for regional supremacy. Many chiefdoms are also part of larger world systems that link them to more developed states and empires.

Chiefdom-level societies have been identified by cultural anthropologists, archaeologists, and historians in many parts of the world and in many time periods (Fig. 1), often as precursors to state development but sometimes as long-term, stable structures that do not transform into more complex states (see States and Civilizations, Archaeology of). The Polynesian chiefdoms, the pre-Roman (Late Neolithic, Bronze Age, and Iron Age) societies of Europe, and eastern North American complex societies (particularly Mississippian Period) have received the most intense archaeological study. Archaeologists have recently expanded their use of the chiefdom model to societies in the Near East, Africa, East Asia, South Asia, Mesoamerica, South America, and western North America (Fig. 1), although the ‘evolutionary status’ of many as chiefdoms is controversial (e.g., the Woodland Period societies of the eastern USA, the Late Neolithic societies of Europe). Early in the development of the chiefdom concept, archaeologists identified a number of common material correlates of these types of societies (e.g., two- to three-level settlement hierarchies, monumental architecture, burials with hereditary status indicators, specialist-manufactured prestige goods, regional and interhousehold differences in wealth; see Fig. 1). However, as more of these complex societies are intensively studied by archaeologists, it has become clear that material patterns, and presumably the ideological, political,
social, and economic structures underlying them, vary substantially (see below).

Figure 1
The chronological and geographic distribution of some chiefdoms known through archaeological and ethnographic research, with varying archaeological indicators

2. Critiques and Redefinition of the Chiefdom Concept

A number of recent critiques of the chiefdom concept, and cultural evolutionary models in general, have emphasized the organizational diversity and historical uniqueness that is ignored when anthropologists attempt to fit societies in broadly defined classifications and assume uniform trajectories of development.

Many scholars have pointed out the great deal of variability in societies designated as ‘chiefdoms.’ One response to this recognized diversity has been to subdivide pre-state complex societies into developmentally distinct types, such as Robert Carneiro’s ‘minimal,’ ‘typical,’ and ‘maximal’ chiefdoms (based on the complexity of political hierarchies), Timothy Earle’s ‘simple’ and ‘complex’ chiefdoms (based on scales of chiefly political, economic, and ritual control), and Colin Renfrew’s ‘group-oriented’ and ‘individualizing’ chiefdoms (based on differing strategies of lineage versus individual aggrandizement and social display).

Other scholars have emphasized that transformations in social, political, economic, and ideological structure that have previously defined chiefdoms do not always occur together. Some archaeologists have promoted the concept of heterarchy, in which both hierarchical and nonhierarchical relations operate simultaneously and separately along multiple dimensions (social, economic, political, ritual), as a better way of modeling structure and process in these societies. Other archaeologists have favored a move away from the systemic evolutionary processes implied in the chiefdom concept to ‘actor-based’ approaches, Marxist theories of political control, and nonmaterialist or postprocessual analyses focused on the ideological and symbolic bases for structure. Thus, while many archaeologists favor continual refinement of the
chiefdom concept and its continued utility in archaeological research, others have advocated its abandonment.

3. Political and Social Features

Chiefdoms are characterized by partially inherited and partially achieved political leadership, reinforced through ideological manipulation, control of economic resources, and militarism. The political power of chiefs is often reflected materially in symbols of office (e.g., the jade axes of Maori chiefs and the stone maceheads of Gerzean Period Egyptian chiefs) and in monumental architecture which demonstrates a chief’s ability to control and mobilize large labor forces (e.g., the henge monuments of Late Neolithic Europe, the monumental statuary of the Olmec and Easter Islanders). Chiefly centers, as the focal point of political, economic, and religious activities of chiefs, are generally archaeologically distinguishable from subordinate villages by their greater size, centrality, and presence of monumental architecture, wealth objects, and specialized production locales. Multiple levels of political authority (paramount chiefs and local chiefs), typical of larger-scale chiefdoms, are often archaeologically visible in regional settlement hierarchies of up to three levels.

A number of archaeologists have offered new conceptual frameworks for analyzing differing political power strategies in chiefdoms. One recent model, offered by Richard Blanton and Gary Feinman, contrasts network versus corporate strategies of political dominance, elaborating Colin Renfrew’s earlier distinction between individualizing and group-oriented chiefdoms. These differing strategies are viewed as part of political dynamics in all complex societies, but political actors in a particular society may emphasize one mode of control more than the other. In a network strategy, political actors try to create personal networks of dominance through the strategic distribution of portable wealth and symbolic capital (e.g., ritual potency and religious knowledge). The emphasis on individual aggrandizement is archaeologically manifested in lavish individual burials, elaborate household wealth display, and competitive presentational events such as ceremonial feasting (e.g., Iron Age chiefdoms of Europe, early West African chiefdoms). The highly conflictive nature of leadership and long-term instability of political configurations are evidenced in rapidly changing settlement hierarchies. In contrast, a society emphasizing a corporate political strategy disperses power across different groups and sectors of society through bureaucratic institutions promoting consensus, solidarity, and collective action, often reinforced through archaeologically visible public architecture, collective tombs, and unifying emblems (e.g., Late Neolithic chiefdoms of Europe, Woodland Period chiefdoms of North America).

In terms of social structure, chiefdoms invariably have some form of ascribed social ranking. While social status differences are characteristic of all human societies, social ranks in chiefdoms are at least partially inherited intergenerationally. They are given structural rigidity through behavioral taboos and symbolic expression, and in most cases, there is distinct economic advantage (differential access to resources) conferred on an ‘elite’ stratum. Social stratification in chiefdoms is often manifested in archaeologically visible rank insignia, differential wealth in households, and varying complexity of residential architecture at settlements.

Ascribed rather than strictly achieved social ranking is most archaeologically evident in burial practices. Inherited status is generally expressed through variation in body positioning, body treatment, grave forms, and burial accompaniments which cross-cut age and sex categories, making status archaeologically distinguishable from mortuary variability attributable to age differences, gender roles, and achieved status. Mortuary analyses based on this premise have been carried out most successfully at Mississippian, Chinese Lungshan, Egyptian Gerzean, and Bronze Age and Iron Age European cemeteries, where relatively large samples of well-preserved burials have allowed the identification of distinct social hierarchies. In cases where larger burial populations are lacking, archaeologists have frequently made an argument for the presence of elites based on finds of elaborate, often monumental ‘chiefly’ burials (e.g., jade-yielding Formative Period Mesoamerican tombs, the Nok mounded tombs of West Africa, the Yayoi stone-lined crypts of Japan). Status-related dietary and health differences between elites and nonelites have also been studied through zooarchaeological and paleobotanical analyses of food remains at settlements, and through osteological assessments of nutritional stress and disease in burial populations.

4. Political Economy

Just as the chiefdoms studied by archaeologists vary in political structure, they differ in the ways in which chiefs attempt to assert economic control in support of their governing institutions. Timothy Earle and Terrence D’Altroy have made a useful distinction between strategies of staple finance and wealth finance in chiefdoms. Staple finance involves the systematic levying of tribute payments (in the form of staple goods and/or labor) from subjugated commoners, with the revenue used to finance directly the power-enhancing activities of the chief, such as monument building, warfare, trade, and ceremonial feasting. Since control of land and its production is essential to staple finance, chiefdoms emphasizing this economic strategy generally have strongly developed institutions
for land tenure (archaeologically manifested in bound- Particularly in chiefdoms relying heavily on wealth
ary features such as walls and corporate monumental structures) and significant chiefly investment in agricultural intensification (archaeologically manifested in large-scale irrigation works, water control, and artificial terracing). Archaeological investigations suggest that the Hawaiian chiefdoms, pre-Inca South American chiefdoms, and many pre-state Mesopotamian complex societies had political economies dominated by staple finance.

Wealth finance involves the use of prestige goods or valuables as political currencies by chiefs and other elites to cement politically strategic alliances with other elites and to reward subordinates for loyalty and service (particularly associated with a network political strategy). Prestige goods vary according to socially defined standards of value and can be any rare or easily controlled object (e.g., jade in Lungshan China, gold in Panamanian chiefdoms, bronze in European chiefdoms, shell and copper ornaments in eastern North American chiefdoms). Politically charged wealth can be locally manufactured or obtained through foreign trade. Emerging elites control access to this political currency through support of attached specialists (highly skilled artisans who produce these goods wholly for elite consumption), by monopolizing valuable raw materials and technologies, by dominating trade routes, and by defining limited social contexts for their circulation (e.g., elite-controlled ceremonial feasting events). While specific chiefdoms tend to emphasize one means of economic control over another, chiefs tend to use elements of both staple finance and wealth finance in varying combinations, resulting in unique forms of political economy.

In addition to the emergence of attached specialists, the rise of chiefdoms is sometimes accompanied by the growth of independent specialists, full-time specialists producing more mundane household goods for an unrestricted set of consumers who concentrate at chiefly centers due to economies of scale. While the archaeological recovery of large-scale craft workshops containing mass production equipment (e.g., large kilns or smelting furnaces, molds, potter’s wheels) is the most direct line of evidence for specialist production, such finds are rare. More often archaeologists have assessed craft production modes through analysis of the products themselves and their regional distribution. Specialized production by a limited number of concentrated, recurrently interacting, full-time craftspersons (either attached specialists or independent specialists) generally results in a more standardized product, as measured through object form, raw material, and/or decoration. In addition, specialist-produced prestige goods are largely restricted to elite centers, while household-produced domestic goods are widely dispersed throughout a region, and the socially unrestricted products of independent specialists are concentrated at, but not restricted to, regional centers.

finance, political relationships and hierarchies of authority are reinforced through the continual circulation of prestige goods, most often in the context of bride wealth exchanges (e.g., gifts of porcelain in Southeast Asia and gold in Panama), political investitures (e.g., the presentation of feather capes in Polynesia), ritualized feasting, and other politically-charged events. Of these exchange contexts, ritual feasting is particularly evident in the archaeological record. Specialized feasting paraphernalia (e.g., bronze drinking vessels in the European Iron Age, large ceramic cooking vessels in the Mississippian and Formative Mesoamerican chiefdoms), unusual food remains (e.g., large pig concentrations in Lungshan China), and their association with ritual architecture (e.g., ball courts in Mesoamerica) have been used to identify feasts which were likely aimed at political integration and competitive status display.

5. The Ideology of Rulership

Ideologies are collective representations of the social and political order in particular societies, often encoded in myths, ceremonies, and various public performances, but also frequently materialized in archaeologically visible monumental architecture, iconography, and portable objects. In both state-level societies and chiefdoms, the creation of a dominant ideology and its imposition on the populace are an important basis for power. In many chiefdoms, elites developed their own language, dialect, lexicon, and/or writing systems to control the flow of esoteric knowledge and to restrict the performance of religious rites. The archaeological contexts in which written texts are found and their decipherable content (e.g., stone stela found with monumental earthen works in Formative Period Mesoamerican centers, wooden tablet inscriptions associated with Easter Island ceremonial complexes) suggest that literacy was an elite prerogative which served to institutionalize cosmological notions and to legitimate the political and social domination of elites. In the absence of written texts, archaeologists often infer shared cosmic orders and ruling ideologies from iconography on portable objects (e.g., the were-jaguar figurines of Formative Period Mesoamerica, ‘sun’ motifs and other symbols on Mississippian pottery, ‘patron-deity’ symbols on Egyptian Gerzean stone palettes and maces), from monumental constructions (e.g., the platform temple complexes of Polynesia, the henge monuments of Late Neolithic Europe), from material symbols in burials (e.g., painted scenes in Japanese Yayoi mounded tombs), and from the spatial organization of chiefly centers (e.g., the layout of Cahokia and other Mississippian centers). An influx of exotic symbols representing foreign cosmologies or religious beliefs (e.g., Buddhist and Hindu art and architecture in Southeast Asia, widespread Olmec

Chiefdoms, Archaeology of

styles in Formative Period Mesoamerica, Egyptian themes in the Nubian chiefly tombs of Kerma) has often been interpreted by archaeologists as a particularly effective strategy for chiefs to add to their monopolistic store of power-enhancing knowledge.

While past archaeological research has largely focused on ideologies of domination in chiefdoms, many researchers now recognize that the many individuals and social factions comprising these complex societies have distinct, and sometimes conflicting, worldviews based on personal experiences and interests. Increasingly, archaeologists are attempting to tease out diverse values and perspectives related to gender, class, ethnicity, occupation, and individuality. For example, a number of recent archaeological studies have focused on gendered views of cultural norms and social orders (e.g., Joyce Marcus’s and Kent Flannery’s analyses of feminine depictions in human figurines and gender-segregated work spaces in Formative Period Mesoamerica).

6. Warfare and Militarism

Anthropological studies have suggested that warfare in chiefdom-level societies differs from that in tribal societies by its expansionistic focus on acquisition of territory, resources, and captives, and by its concentration of military power in the hands of chiefly leaders who use it to expand their political sway. Robert Carneiro has suggested that both the rise and evolution of chiefdoms are related to warfare. Under conditions of land shortages and population stress, militarily powerful leaders systematically assault settlements along polity boundaries. Conquered enemies are incorporated into the expanding chiefdom, resulting in the elaboration of political hierarchies and the coalescence of a polity of greater complexity as well as scale. However, warfare is an almost continuous process in many chiefdoms, one of a number of forms of ‘peer polity’ interaction that are politically transformative. Militarism can often result in more subtle changes in power relationships among competing chiefdoms, over both the short and long term, without territorial conquest. Political loyalties may be shifted to military victors who control the ideology of warrior prestige and who increase their political currency through captured labor and valuables, resulting in larger alliance networks.

Archaeologists have focused on a variety of material evidence to document changes in the scale, intensity, and behavioral and ideological aspects of warfare in chiefdom-level societies. These include the development of fortifications or other defensive works, changes in regional settlement patterns (e.g., concentration of population in large centers, depopulation along polity boundaries, relocation of settlements to defensible positions), and systematic destruction of power-symboling architecture (e.g., dismantling of monumental architecture, toppling of statuary, burning of chiefly centers). Examples are the alternating rebuilding and burning of fortifications surrounding Iron Age European towns such as Heuneberg, the creation of ‘no man’s lands’ between Mississippian centers, and the defacement of sacred monumental statuary on Easter Island. Osteological evidence from burials allows archaeologists to evaluate rates of violence in populations (e.g., scalping, skeletal trauma, embedded projectiles), the ritualized use of body parts in prewar and postwar ceremonialism (e.g., decapitation for trophy head taking), and debilitating physical conditions and malnutrition related to prolonged exposure to military siege. Archaeological studies of burials also allow archaeologists to identify ‘warrior insignia’ associated with the development of specialized warrior classes and strong ideologies of warrior prestige, exemplified in the elaborate bronze horse-fittings of Bronze Age European warriors and the gold-pegged teeth of island Southeast Asian warriors. The development of new warfare technologies associated with escalating military competition can be traced archaeologically in intensified metals or stone mining, labor-saving equipment (e.g., molds) or large-scale facilities for mass production of weapons, and the adoption of horses for mounted cavalry.

7. The Evolution of Chiefdoms

Many archaeologists view the transition to inherited, institutionalized power and status, associated with chiefdoms but further developed in states, as the key transformation in human societies. Since the organizational dynamics of chiefdoms are viewed as similar to those of states, theories of state development are commonly applied at the chiefdom level. Early theories emphasized the managerial benefits of chieftainship as societies expanded in scale. Chiefs arose to administer irrigation works (Karl Wittfogel), coordinate localized production and exchange (Elman Service), mediate land conflicts and warfare (Robert Carneiro), ameliorate economic risk through ritual intervention (Robert Drennan), and to meet other organizational challenges.

More recently, archaeologists have criticized the functionalist nature of these theories and their simplistic emphasis on single factors, instead focusing on the varied acquisitive and power-seeking strategies of political actors as they vie for symbolic capital and control over the labor and resources of others. In this view, the origins of permanent forms of social inequality and political authority are to be found in the competitive and aggrandizing behavior of ‘big men’ in tribal societies. Archaeologists now examine this transformation at varying scales of analysis, ranging from the transfiguring actions and motivations of individual political actors, to multiple polities develop-
ing together through forms of peer polity interaction, to change-stimulating contacts with more complex states and empires (world systems theory). In addition, many archaeologists now recognize that not all chiefdom-level societies have unilinear evolutionary trajectories leading inevitably toward greater complexity. Many chiefdoms fail to develop state-level institutions over the long term, instead perpetually cycling between complex and simple forms, ‘devolving’ into tribal societies (e.g., New Zealand’s South Island Maori, possibly Amazonian societies), mysteriously collapsing (e.g., the Mississippian chiefdoms, Easter Island), or otherwise maintaining a chiefdom-level organization until eventual absorption by colonizing states and empires (e.g., many chiefdoms of the Central American Isthmus, North America, the Caribbean, island Southeast Asia, Europe and Polynesia).

See also: Big Man, Anthropology of; Conflict and War, Archaeology of; Intensification and Specialization, Archaeology of; Political Anthropology; Power in Society; States and Civilizations, Archaeology of

Bibliography

Carneiro R 1981 The chiefdom as precursor of the state. In: Jones G, Kautz R (eds.) The Transition to Statehood in the New World. Cambridge University Press, Cambridge, UK, pp. 37–79
Drennan R, Uribe C (eds.) 1991 Chiefdoms in the Americas. University Press of America, Lanham, MD
Earle T 1987 Chiefdoms in archaeological and ethnohistorical perspective. Annual Review of Anthropology 16: 279–308
Earle T (ed.) 1991 Chiefdoms: Power, Economy and Ideology. Cambridge University Press, Cambridge, UK
Earle T 1997 How Chiefs Come to Power. Stanford University Press, Stanford, CA
Kirch P V 1984 The Evolution of the Polynesian Chiefdoms. Cambridge University Press, Cambridge, UK
Renfrew C, Shennan S (eds.) 1982 Ranking, Resource and Exchange. Cambridge University Press, Cambridge, UK
Yoffee N 1993 Too many chiefs. In: Yoffee N, Sherratt A (eds.) Archaeological Theory: Who Sets the Agenda? Cambridge University Press, Cambridge, UK, pp. 60–78

L. L. Junker


Child Abuse

Article 19 of the Convention on the Rights of the Child, ratified in 1989, prescribes that parties shall take all appropriate legislative, administrative, social, and educational measures to protect the child from all forms of physical or mental violence, injury or abuse, neglect or negligent treatment, maltreatment or exploitation, including sexual abuse. Until now this article has been disregarded throughout the world. In laying blame we should not only consider third-world countries (with child labor, prostitution, and soldiers) but also the rich industrial countries, where violence against children must be seen as a ‘social epidemic.’

1. Forms of Child Abuse

There are four forms of child abuse:

Physical abuse. Blows or other violent actions, resulting in injuries to children, such as being beaten, hit, whipped, pushed down stairs, hurled against a wall, burned with hot water/cigarettes, jammed in doors/car windows, tortured with needles, put into cold water or pushed under water, made to eat their own faeces or drink their own urine, strangling, and poisoning.

Sexual abuse. Any action that is inflicted upon or must be tolerated by a child against their own will or any action about which the child cannot make a decision due to their physical, emotional, mental, or verbal inferiority. The offenders use their position of power and authority to satisfy their own needs at the expense of these children who thus suffer discrimination as sexual objects. Sexual abuse between children themselves is manifest when one child is much older and/or uses force.

Neglect. Considerable impact on or damage to a child’s development due to a lack of care, clothing, feeding, medical care, supervision, or protection from danger.

Emotional abuse. Outright rejection, intimidation, terrorization, or isolation of a child. Actions such as verbal abuse/discrimination on a daily basis, locking a child in a dark room, tying them to a bed and many other major threats, including to their lives.

2. Prevalence of Forms of Abuse

This article draws on survey results from German-speaking countries and the United States as they are deemed to be representative of comparable statistics from modern industrial countries.

2.1 Physical Abuse

A representative survey (Pfeiffer and Wetzels 1997) of 16 to 59 year olds in Germany about their childhood experiences of physical parental violence indicates a
distinction between physical punishment and physical abuse. Physical punishment is defined as intentionally inflicting pain to control a child’s behavior but without intending to cause severe injuries or damage and without violating (existing!) laws. Physical abuse definitely violates laws, as injuries to the child are either intentional or tolerated as a consequence of these violent actions. The incidences of violence observed are listed in Table 1.

Of the persons interviewed, 74.9 percent confirmed having experienced physical parental violence in their childhood, and 10.6 percent definitely suffered physical parental abuse.

Table 1
Incidence of parental violence
Possible answers: never, rarely, occasionally, often, very often. Multiple answers were possible, N = 3241

Question                               Rarely    More often than rarely
1. Throwing objects                     7.0%      3.7%
2. Grabbing/pushing around             17.9%     12.1%
3. Slapping                            36.0%     36.5%
4. Hitting/striking with an object      7.0%      4.6%
5. Punching, kicking                    3.3%      2.6%
6. Striking, beating up                 4.5%      3.5%
7. Strangling                           1.4%      0.7%
8. Intentional burns                    0.5%      0.4%
9. Threats with a weapon                0.6%      0.4%
10. Use of weapon                       0.6%      0.3%
Total physical violence
(Questions 1–10)                       36.1%     38.8%
Total physical violence
(Questions 5–10)                        5.9%      4.7%

2.2 Sexual Abuse

Recent retrospective studies in Germany, Austria, and Switzerland (Deegener 2000) suggest that about 15 to 25 percent of women and 5 to 10 percent of men questioned had been sexually abused in their childhood or adolescence. We believe that the approximate incidence and severity of sexual abuse in both sexes is as shown in Table 2. Almost two thirds of victims were abused on only one occasion. One third of cases involved repeated sexual abuse. However, in the latter group, about 10 percent of cases continued over periods exceeding 12 months and extending to several years. Such long-term sexual abuse is perpetrated predominantly by a victim’s relatives. Between 80 and 90 percent of offenders are male.

Table 2
Severity of sexual abuse

Serious sexual abuse                                           15%
  Attempted or actual vaginal, anal, or oral violation;
  oral satisfaction by victim or anal penetration of offender
Severe sexual abuse                                            35%
  Victim has to masturbate in front of offender;
  offender masturbates in front of victim;
  offender fondles victim’s genitals;
  victim has to touch offender’s genitals;
  victim has to expose genitals to offender
Less severe sexual abuse                                       35%
  Offender attempts to touch victim’s genitals;
  offender fondles victim’s breasts; sexual kisses; French kisses
Sexual abuse without physical contact                          15%
  Exhibitionism; victim has to watch pornographic videos;
  offender observes victim while bathing

2.3 Neglect and Emotional Abuse

If defined in conservative terms, the extent of these forms of abuse in Germany far exceeds the incidence of the other two forms. If defined in broader terms, it is not an exaggeration to say that the bullying of children is widespread and commonplace.

2.4 Statistics from the USA

The National Committee to Prevent Child Abuse (1995) indicates that over three million (suspected)
victims were recorded in 1994 in the USA. Almost 1300 of these children died as a result of abuse and neglect. Elders (1999) reports 3,195,000 registered cases of child abuse in 1997, with one million cases being confirmed. However, only 50 to 75 percent of all registered cases were adequately investigated. Over the years the following averages have been determined: 45 to 50 percent are cases of neglect, 25 percent of physical abuse, 10 percent of sexual abuse, 3 percent of emotional abuse, and 15 percent of other causes.

2.5 Combination of Different Forms of Abuse

Multiple episodes of maltreatment during childhood are more the rule than the exception. Pfeiffer and Wetzels (1997) found that even in the youngest age group (i.e., 16 to 29 year olds), approximately one sixth of adolescents and young adults in Germany had been victims of frequent or severe physical parental violence or sexual abuse (involving physical contact and abuse outside the family). Employing a rather conservative estimate, we believe that about one fifth of this young generation has been affected by physical parental violence, sexual victimization, or frequent adult partner violence.

2.6 Assessing the Incidence of Child Abuse

The extent of child abuse invites comparison with a global epidemic. However, in contrast to the degree of professional, political, and public attention received by other diseases, child abuse is neglected. Sadler et al. (1999), for example, estimate that the incidence of abuse is ten times higher than that of all forms of cancer. The ‘costs’ of child abuse are tremendously high, with the ‘human costs’ of the sociopsychological, cognitive, and physical consequences of abuse being inestimable. Courtney (1999) quotes a 1993 estimate for the USA, indicating that approximately 11.4 billion US$ was spent on examining abused children, providing medical care to severely injured children, treating victims and their families, and caring for children in foster families. In addition, Courtney cites Westman, who calculates that expenditure due to ‘incompetent parenting’ (in broad terms) in the USA amounts to about 38.6 billion US$ each year.

3. Causes of Child Abuse

More specific causal models are becoming accepted in research and clinical practice (Cicchetti 1989), resulting in a marked improvement in early recognition, therapy and prevention. Four groups of influences can be identified which interact within a complex reference framework:

(a) At the individual level there is the perpetrator’s life history and personality, e.g., their own experiences of childhood abuse, early separation of parents, periods in (foster) homes, emotional disturbances, or alcohol and drug abuse.
(b) At the family level there are maladjusted parent–child relationships, partner conflicts and problems, etc.
(c) At the social level there are factors involving poverty, unemployment, meager and limited housing, social ghettos, and insufficient social support, etc.
(d) At the community level there are, among other things, a high tolerance of violence and aggression and a high degree of violence in their upbringing.

There are numerous interactions between these levels that may lead to (acute or protracted) familial destabilization, resulting in neglect or physical abuse. In addition to the risk factors, which may increase the probability of child abuse, one must consider the preventative factors, which reduce the risk and consequences of abuse.

4. Consequences of Child Abuse

Not only can child abuse influence every aspect of behavior and experience, but it can also lead to psychosomatic diseases and physical injuries (Egle et al. 2000). In addition, child abuse influences the attachment of children to their parents and their relationships with their peers. However, as the short- and long-term consequences are not specific, there can be various other causes.

Whereas some research suggests that the consequences of child abuse can last throughout a person’s life, other research shows that abused children suffer few (if any) adverse consequences. During the 1990s, a growing number of investigators have identified many mediating and moderating factors that can either ameliorate or exacerbate the consequences of child abuse (Masten et al. 1990, Kendall-Tackett et al. 1993, Kaplan et al. 1999):

characteristics of the child’s experiences (e.g., nature, frequency, severity, type, and prior history of child abuse);
resources of the child (e.g., good social and academic competencies, endearing temperament);
vulnerability of the child (e.g., psychiatric disorder, low intelligence, insecure attachment style, early onset of the abuse);
social support of the child (e.g., good relationship with the nonabusing parent, support of significant others from outside the family, psychotherapy).

Therefore, the extent of the negative consequences of child abuse depends on complex transactional processes between vulnerability or risk factors on the one hand, and resiliency or compensatory factors on the other hand. The term ‘resiliency’ should not be
understood to minimize the suffering of abused children or justify criticism of those children who are not resilient. However, some children from violent homes are less influenced by their abuse than others and develop effective coping skills and strategies. In this connection, the relationship between the causes and consequences of child abuse must also be considered: low intelligence may, for example, stimulate abusive behavior by parents or caretakers, but low intelligence can also be a negative consequence of severe abusive experiences in early childhood.

Generalizing, the severity of the consequences of child abuse depends especially on the following factors: the age and developmental stage of the child, the length and severity of the abuse, and the relationship of the abuser to the victim. Therefore, long-term severe abuse of a young child perpetrated by a parent tends to produce more detrimental effects on the emotional, physical, social, and cognitive development than shorter-term abuse of an older child by a stranger. In the case of long-term physical abuse it is not uncommon to find a range of injuries inflicted at different times. The consequences of neglect remain underestimated (i.e., ‘neglected neglect’).

5. Prevention

The extent to which abusive educational measures (including physical punishment) are tolerated varies widely from country to country as a function of the sociocultural context. Except for legal prosecution in confirmed cases of child abuse, the rights of parents (or teachers, etc.) to employ a wide range of seemingly appropriate educational measures (including physical punishment) remain unchallenged.

In 1979 Sweden introduced a law banning physical punishment, addressing this grey zone between ‘permitted’ and ‘illegal’ educational measures. Countries such as Finland (1984), Denmark (1986), Norway (1987), and Austria (1989) followed suit. In Germany the following law was passed in 2000: ‘Children have the right to a non-violent upbringing and education. Physical punishment, emotional abuse and other discriminatory measures are unacceptable.’ This does not imply an increase in prosecutions. As Sweden has demonstrated successfully, the objective is rather to alter the value system and influence the ‘legal hygiene,’ while simultaneously strengthening the support for a nonviolent upbringing and education. An education conducted in accordance with the above-mentioned UN conventions and with the child-protection provisions found in many other constitutions would not tolerate physical punishment or other discriminatory educational measures.

Adequate prevention requires worldwide birth control, a reduction in poverty, an improvement in both socio-ecological and living conditions, the improvement of general education standards and lifelong learning (including parent training), and a reduction in the wide power and relationship gap existing between men and women, resulting from raising boys to be ‘masters’ and girls to be ‘servants.’ About one third of all cases of sexual abuse/violation are committed by male adolescents.

Furthermore, there should be independent, state-nominated representatives for minors, whose role is to replace parents at the community level and to fight for children’s needs and rights in society. The young should also have more opportunities to co-determine what occurs by means of an institution such as a children’s or young people’s parliament. Special broad-scale support programs for families/parents at risk (such as home visits to teenage mothers or high-risk women during pregnancy) should be offered, and more assistance should be available to families in serious difficulty.

The predominant objectives of prevention need to be: (a) to interrupt the generational cycle of family violence, (b) to teach future generations humanitarian attitudes and civil behavior, and (c) to promote ‘social parenting’ and ‘structural security’ instead of ‘structural violence’ for all children.

There is widespread condemnation when child abuse or violent acts committed by (abused) adolescents become public, but this still tends to be little more than lip service. We tend to avoid reflecting upon our own violent relationships and actions, obscuring these by employing projections of monstrous child abusers. Appropriate responsibility for children and their futures is not being implemented for one simple reason: ‘We are all against such things as child pornography, but only a few are willing to actually support programs that could save children’s lives, because these cost money and comfort and require another form of living and life’ (Sigusch 1996).

See also: Child Care and Child Development; Childhood Depression; Childhood Sexual Abuse and Risk for Adult Psychopathology; Children and the Law; Children, Rights of: Cultural Concerns; Family Health; Socialization in Infancy and Childhood; Violence and Effects on Children; Violence as a Problem of Health

Bibliography

Cicchetti D 1989 How research on maltreatment has informed the study of child development: perspectives from developmental psychopathology. In: Cicchetti D, Carlson V (eds.) Child Maltreatment. Theory and Research on the Causes and Consequences of Child Abuse and Neglect. Cambridge University Press, New York
Courtney M E 1999 The economics. Child Abuse and Neglect 23: 975–86
Deegener G 2000 Die Würde des Kindes. Plädoyer für eine Erziehung ohne Gewalt [The dignity of the child. A plea for an upbringing without violence]. Beltz, Weinheim, Germany
Egle U T, Hoffmann S O, Joraschky P 2000 Sexueller Mißbrauch, Mißhandlung, Vernachlässigung. Erkennung und Therapie psychischer und psychosomatischer Folgen früher Traumatisierungen [Sexual abuse, maltreatment and neglect. Recognition and therapy of psychological and psychosomatic consequences of traumatizations at an early age]. Schattauer, Stuttgart, Germany
Elders M J 1999 The call to action. Child Abuse and Neglect 23: 1003–9
Kaplan S J, Pelcovitz D, Labruna V 1999 Child and adolescent abuse and neglect research: A review of the past 10 years. Part I: Physical and emotional abuse and neglect. Journal of the American Academy of Child and Adolescent Psychiatry 38: 1214–22
Kendall-Tackett K A, Meyer Williams L, Finkelhor D 1993 Impact of sexual abuse on children: A review and synthesis of recent empirical studies. Psychological Bulletin 113: 164–80
Masten A S, Best K M, Garmezy N 1990 Resilience and development: Contributions from the study of children who overcome adversity. Development and Psychopathology 2: 425–44
National Committee to Prevent Child Abuse (NCPCA) 1995 Current Trends in Child Abuse Reporting and Fatalities. The Results of the 1994 Annual Fifty State Survey. NCPCA, Chicago
Pfeiffer B, Wetzels P 1997 Kinder als Täter und Opfer. Eine Analyse auf der Basis der PKS und einer repräsentativen Opferbefragung [Children as offenders and victims. An analysis of criminal statistics and victim interviews]. Forschungsbericht 68. Kriminologisches Forschungsinstitut, Hanover, Germany
Sadler B L, Chadwick D L, Hensler D J 1999 The summary chapter - The national call to action: Moving ahead. Child Abuse and Neglect 23: 1011–18
Sigusch V 1996 Kultureller Wandel der Sexualität [Sexuality in cultural transition]. In: Sigusch V (ed.) Sexuelle Störungen und ihre Behandlung [Sexual disorders and their treatment]. Thieme, Stuttgart

G. Deegener


Child and Adolescent Psychiatry, Principles of

Child and adolescent psychiatry and psychotherapy comprises the diagnosis, treatment, prevention, and rehabilitation of neuropsychiatric and developmental disorders as well as behavior disturbances during childhood and adolescence. The need for a separate psychiatric discipline for children and adolescents results from age-dependent characteristics of mental disorders, strongly influenced by rapidly alternating stages of neurobiological and social development in this period of life. The discipline of child and adolescent psychiatry is now acknowledged as a medical speciality or subspecialty in many countries; however, it will still need much effort to offer a specialized child mental health service worldwide.

This article provides an overview of the historical development of child psychiatry in different cultural regions, focusing on the development in Europe and the USA. A necessary distinction is made regarding the diagnostic classification system for mental disorders in childhood and adolescence in comparison to the classification system in general psychiatry. Principles of child-specific assessment and treatment are described. Further, some future perspectives in the development of a more biologically influenced child psychiatry are discussed as well as the needs for a modern child psychiatry in developing countries.

1. History

The clinical discipline of psychiatry was formed in the nineteenth century, in a situation of an increasing interest in and knowledge of psychological phenomena. In 1899, the term ‘child psychiatry’ was first used by the French psychiatrist M. Manheimer, who called his book Les troubles mentaux de l’enfance, subtitled Précis de psychiatrie infantile.

Four main traditions have made substantial contributions to the current body of knowledge in the field of child and adolescent psychiatry, and influenced the structure and the actual treatment concepts of child psychiatric institutions. The formerly unified disciplines of psychiatry and neurology have given rise to the tradition of neuropsychiatry, especially in some European countries. Several scientific associations still include reference to neurology. Increasing research activity in the areas of neuropsychobiology and neuropsychology confirms the need for a close linkage between these two ‘brain disciplines’ for a better understanding of psychiatric disorders.

A movement based on a remedial clinical tradition, promoted by Hans Asperger in Austria and Paul Moor in Switzerland, still plays a role in pediatric departments with activities in the field of child psychosomatics. In general, remedial education is an important part of the multidisciplinary children’s mental healthcare service.

Developed by the pioneers of psychoanalytical work with children, like Anna Freud, Alfred Adler and Melanie Klein, the psychodynamic-psychoanalytic tradition influences etiological concepts of behavioral and personality disorders and has implications for psychotherapeutic treatment strategies; however, behavioral therapy, often in combination with family therapy interventions, is predominant in the clinical use of psychotherapy nowadays.

The empirical, epidemiological, and statistical tradition has been established in a number of European countries, mainly the UK, Scandinavian countries, and Germany. It has been strongly influenced by the work of Michael Rutter and by research influences

from the USA. This tradition created the basis of the currently used classification systems in psychiatry.

Although the first considerable activities in the field of child psychiatry started in Europe in the early twentieth century, it was not until 1954 that the first European symposium of child psychiatry took place in Magglingen (Switzerland), a delay caused by World War II and the ensuing political situation. At this meeting, first attempts were made to establish a unifying scientific association, which was founded in 1960 as the Union of European Pedopsychiatrists at the first European congress in Paris. Later, at a congress in Madrid in 1979, the name of the society was changed to the European Society for Child and Adolescent Psychiatry (ESCAP) (Remschmidt and van Engeland 1999).

The crystallization of child psychiatry in the USA was influenced by the growing knowledge in the field of psychiatry in Europe. A sociocultural reform, beginning in the year 1900, was the main factor contributing to the establishment of child psychiatry in the USA. The desire to protect children from social hardship resulted in the need for mental healthcare institutions. William Healy founded the first Institute of Juvenile Research in 1909 to handle child problems occurring in association with delinquency and traumatic injuries. The Orthopsychiatric Association, founded in 1924, integrated the different medical and sociopsychological professionals working in the psychiatric field. The psychiatrist Adolf Meyer installed the first department of child psychiatry at the Johns Hopkins University in the early 1930s. Leo Kanner, an immigrant from Austria, was the first chairman and published his textbook of child psychiatry in 1935. Child psychiatry was accepted as a clinical discipline in its own right in 1950. Since then, the American Academy of Child and Adolescent Psychiatry (AACAP) has made a marked contribution to the global development of a modern child and adolescent psychiatry (Schwab and Schwab-Stone 1999).

The different professionals working in the field of child psychiatry have been organized in the International Association of Child and Adolescent Psychiatry and Allied Professions (IACAPAP), founded as an umbrella organization in the 1930s. Countries from East Asia and the Pacific region show a remarkable level of medical care and research in the field of child and adolescent psychiatry. In recent years South America, African countries, and Australia have also contributed with increasing activities to the specialization of child mental health services.

2. Epidemiology and Classification

The prevalence rates of all mental disturbances in children are estimated at about 15 percent. However, mental disorders in need of treatment occur in about half of these cases. In children with primary neurological brain dysfunctions, the prevalence of psychiatric disorders increases with the degree of the brain damage or the metabolic alteration. Thus, children with epilepsy suffer from mental problems five times more often than unaffected children.

Psychiatric classification systems are compilations of diagnostic criteria, based on clinical experience, to define and differentiate mental disorders and establish agreement and common language among healthcare professionals and researchers. The classification of mental disorders is essential for the development of treatment concepts and a necessary precondition for epidemiological, clinical, and biological research in this field.

The 10th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD-10), published by the World Health Organization (WHO) in 1992, is a comprehensive classification system of medical conditions and mental disorders, used in official medical and psychiatric nosology throughout most of the world. However, some countries (e.g., France and the USA) use compatible or modified classifications. The fourth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV), published in 1994 by the American Psychiatric Association (APA), is the official psychiatric coding system used in the USA and is in part compatible with the ninth revision of the International Classification of Diseases (ICD-9).

The mental development of children and adolescents, in contrast to that of adults, is strongly influenced by brain maturation processes and social environment. Considering this fact, the classification system in child and adolescent psychiatry is based on a multiaxial diagnostic view. The approach of the current classification systems is atheoretical with respect to etiological concepts. Due to the limited knowledge about etiology and pathomechanism of the mental disorders, it is mainly the symptoms and course of a disorder that are used to define and classify the mental disorders. An approach to predict the clinical course of the disorder is associated with the description of a clinical feature. Additional contributing factors, like the intelligence level, the parent–child relationship, and the social environment, are recorded on different diagnostic axes. The impact of the disorder on psychosocial functions is assessed on Axis VI of the multiaxial diagnostic system (Remschmidt and Schmidt 1994).

ICD-10, regardless of the use of a multiaxial system, does not consider age of disease onset consistently. Apart from that, the predicted clinical course is used as a basis for many clinical decisions including treatment concepts (Rutter 1989). Taking into account these important facts, an appropriate version of the classification system for child and adolescent psychiatry has been developed, in which age of onset and typical clinical course have been integrated. The disorders have been divided into early-onset disorders with a persistent course (e.g., autism), early-onset


disorders with transient manifestation (e.g., enuresis), age-related interactional disorders (e.g., separation anxiety), young age-related disorders with sometimes recurrent episodes or chronic course (e.g., eating disorders), and early-onset adult-type disorders (e.g., schizophrenia).

Even though specified diagnostic criteria are provided for each mental disorder, increasing the reliability of clinicians’ diagnostic process, professionals have to pay attention to existing mixed forms or comorbid disorders (Caron and Rutter 1991), like attention deficit/hyperactivity disorder (AD/HD) with conduct disorder or eating disorder with depressive disorder (Biederman et al. 1991).

3. Developmental Psychopathology and Pathogenesis

Mental disorders in children and adolescents are manifested in behavioral problems, affective and cognitive disturbances, and somatic concerns. The quality and severity of these problems are influenced by the developmental state. The rapid development, simultaneously on the biological and on the social level, has a strong impact on the diverse developmental tasks of a minor (Munir and Beardslee 1999).

Biological maturation, particularly in early childhood, plays a key role in the etiology of mental problems. With increasing age social adaptation and social learning become more prominent. The maturation of the brain and gender-specific role taking are two main factors in becoming an adult. Acceleration or retardation of the biological and social maturation results in an alteration of the developmental process with behavioral problems and social disintegration in the case of existing permanent deficits.

Developmental psychopathology describes age-related aspects of mental disorders, especially the formation and alternating pattern of symptoms over time (Cicchetti and Cohen 1995). In early childhood the capacity for autonomous behavior is acquired. For that, an inner representation and judgment of the child’s cognitive and behavioral activities are an important precondition. Older children compare their self-image with an imagined ideal as a kind of self-control, which modifies and limits the outcome of the social learning process. Although there is an interaction between these different components, mental disorders in childhood and adolescence can be interpreted as deficits in maturation or learning processes or in their autonomous efforts. Parallel to the differentiation of cognitive, emotional, and social functions there are increasing demands on the child. Sexual development, gender-related role taking, and academic achievement all have to be coped with adequately.

During certain periods of development, the child and adolescent have specific responsibilities. Functions according to these specific developmental demands can be easily disrupted, because they are not well established due to the rapid developmental process. Infants suffer from regulation disorders, which are often induced by the behavior of caregivers. Preschool-age children suffer from somatic, physiological dysfunction in the area of sleep, eating, speech, and elimination control, whereas in school-age children communication problems in peer groups and in the school situation appear (Esser et al. 1990). Separation anxiety, insufficient affect control, and learning disabilities are characteristic problems with increasing age. Adolescents then have to develop self-confidence, to deal with authorities and to accept their sexual identity.

Risk and protective factors, in addition to the developmental process, are effective in the pathogenesis of mental disorders. Genetically determined factors are the somatic constitution, particularly the brain function, and the pattern of personality (the temperamental factors) (Rothenberger 1990).

Pervasive developmental disorders like autism have a high genetic risk (Rutter et al. 1999). Prenatal, perinatal, or postnatal traumatic brain injuries, or chronic metabolic dysfunction, increase the risk of developing a mental problem. A lack of sociocultural stimulation, socioemotional deprivation, or high pressure as a result of people’s inadequate expectations could be summarized as sociogenic risk factors (Rutter 1999). Language acquisition in particular strongly depends on a sociogenic balance. Intrapsychic conflicts are a result of chronic problems, like sexual or physical abuse, mental disease of a parent, delinquency in the family, or distorted family communication. The interactive process between parents and child can amplify or reduce behavior in both directions, toward normalization or toward increased disturbance. Life events, as acute stress factors, in general play a minor role in the pathogenesis of mental disorders of children and adolescents. In the case of strong traumatic impairment, like being a victim of a rape or having a bad accident, a posttraumatic stress disorder can appear (Costello et al. 1998).

Protective factors, like social attractiveness, verbal skill, appropriate problem-solving strategies, self-consciousness, creative intelligence and interests, a pleasant life and family situation, and stable peer-group relationships can act successfully against other biological or psychosocial stress factors to reduce the risk of developing a mental disorder or, in the case of manifestation, to reduce symptom severity and improve clinical outcome (Laucht et al. 1997).

4. Assessment

To provide a diagnosis, child and adolescent psychiatrists examine the mental status, the psychopathological profile, and behavior of the patient;


ascertain the social and biological development, and evaluate the prior medical and psychiatric history. Further useful information can be obtained by parental interviews and school reports. Different sources of information reflect different points of view, experiences, and insights (Jensen et al. 1999). Depending on the type of disorder and the age of the child, the reliability of this information is quite different as well. The summary of information from different sources, including the child, the parents, teachers, peer group members, doctor, pediatrician, and sometimes the youth welfare department, provides the most reliable and complete picture of symptoms, functional level of the child, and influencing environmental factors.

Recognizing the level of social functioning and the performance of developmental tasks related to the child’s age is a key assessment for the estimation of severity and prognosis of the disorder. Structured psychiatric interviews, rating scales, and self-report forms can be used to observe, quantify, and categorize the evaluated symptoms, for example, the Diagnostic Interview Schedule for Children-Revised (DISC-R), the Child Behavior Checklist, scales for the assessment of psychotic symptoms (Positive and Negative Symptom Rating Scale, PANS) or mood disorders (Beck Depression Inventory).

Cognitive and intellectual abilities, basic factors of a healthy mind, are age-related functions. To ascertain alterations of these functions, it is necessary to determine the social level and the state of the biological maturation. Particularly in children with mental retardation, learning disabilities, or pervasive developmental disorders a comprehensive assessment of the intellectual function (examples are Kaufman-ABC and WISC-III) is necessary to create an optimized educational and treatment program (Ollendick and Hersen 1993).

In a subgroup of patients, a defined somatic dysfunction, mainly neurological diseases, is the primary cause of mental diseases. Brain diseases (tumor, epilepsy), metabolic dysfunctions (phenylketonuria, hypothyroidism) and diseases with defined chromosomal aberrations (like Down, Angelman, and Prader-Willi syndrome) can be associated with a varied picture of mental disturbances. Therefore physical examinations and technical investigations (computerized brain scan, radiology, electroencephalography, and blood and genetic screening) are standard procedures to rule out somatic disorders and to determine the developmental state of a child. The analysis of genetic mutations, polymorphisms, and associated molecular-biological dysfunctions will provide more insight into the etiology of a psychiatric disorder, which is always manifested on a neurobiological level. The early assessment of a genetically caused metabolic dysfunction (like phenylketonuria) can help to prevent the emergence of a disorder or will help, in some cases of known genetic defects (like Prader-Willi syndrome) or relevant polymorphisms associated with a disorder (as in AD/HD), to develop better treatment strategies in the near future.

Assessment of the neuropsychological functions should be done routinely in children and adolescents with AD/HD (Barkley 1998). The core symptoms (inattention, hyperactivity, and impulsivity) need to be quantified by neuropsychological tests, in part with support of computerized methods (e.g., Continuous Performance Task, CPT). Learning disabilities, often associated with developmental or attention deficit disorders, need further examination by questionnaires and psychological tests. To rule out sensory dysfunction, visual and acoustic perception has to be assessed in cooperation with departments of ophthalmology, phonetics, and pediatric audiology.

If children are younger than four years, the psychopathological assessment is based on findings regarding the child’s development and interactional behavior. This information can be obtained mainly from parental reports or, in the case of some early-onset diseases, from the medical records. Semistandardized play situations and the observation of peer group and parent–child interactions provide further information about the child’s intellectual function, capacity for emotional reactions, attachment, and behavior.

5. Treatment

Psychiatric therapy in general is based on three major treatment pillars: psychotherapy, psychopharmacology, and psychosocial and family interventions. Each kind of mental disorder or disease needs a typical combination of these three therapy forms to get an optimal therapy effect. In children and adolescents the family and school environment particularly has to be taken into consideration.

For a sophisticated mental healthcare network, the cooperation of the various professional groups, including psychiatrists, psychologists, social workers, teachers, and members of administrative institutions of the government responsible for children and adolescents, is a necessary precondition (Simeon 1990).

The treatment of mental disorders can be divided into three main groups: treatment mainly by psychotherapy, treatment mainly by a psychopharmacological intervention, and combined psychotherapeutic and pharmacological approaches. Psychosocial interventions, especially in childhood and adolescence, are in general a part of the treatment program.

A follow-up assessment over a longer time period provides information about the course of the disease. It is necessary to react early to changes in the symptomatology, to observe the social functioning of the child and adolescent, to monitor possible side


effects of the treatment, and not to overlook the best moment to stop the treatment.

One group comprises early-onset disorders with a persistent course (like mental retardation and pervasive developmental disorders) and transient, developmentally dependent disorders (like learning disorders, communication disorders, selective mutism, attachment disorders, and the elimination disorders encopresis and enuresis). Treatment plans are based on behavioral therapy concepts. Contingency management, self-monitoring by protocols, training of social skills and communication, and principles of classical and operant conditioning are supplemented by family therapy, educational programs and cooperation with teachers, social interventions, psychomotor and coordinative training, and in many cases additional parental guidance (Noshpitz 1981).

At the present time, there is no disorder-specific medication on the market; however, there are some substances which are successful in the reduction of individual symptoms belonging to these disorders (Houts et al. 1994), like antipsychotics in autistic children (McDougle et al. 1997). An effective medication exists only for individual early-onset disorders. In the case of AD/HD, the core symptoms can be reduced by about 70–80 percent by medication, mainly by the use of psychostimulants. However, children with AD/HD need an intensive behavioral and educational treatment program as well, particularly if they suffer from an associated oppositional defiant or conduct disorder. Often their parents need additional support from parental guidance (Barkley 1998, Kazdin 1997).

The main emphasis in the treatment of mood and schizophrenic disorders, summarized as early-onset adult-type disorders, which manifest mainly in adolescence and early adulthood, is psychopharmacological therapy. The exception is that younger children with mood disorders often gain a clinical improvement just from psychotherapy. Antidepressive and antipsychotic medication acts quite specifically on the disturbed brain metabolism, which is, as a primary mechanism or a biological reaction to permanent psychosocial stress, the main causal factor for the disorder. Neurotransmitters like serotonin, noradrenalin, and dopamine are regulated by these substances, and neuroplastic processes, which are important for the structural function of the brain, are regenerated and stabilized over a longer time. Therefore the intake of most of these medications should be continued over a longer time, if there is no contraindication by serious side effects, which might occur in some cases. Although medication is necessary and the first choice of treatment, these patients need an intensive psychosocial support and training program to prevent them from social disintegration and to secure their academic achievement.

Another large group of mental diseases can be described as behavioral, emotional, age-specific interaction disorders and personality disorders. Examples are eating, anxiety and obsessive-compulsive disorders, stress reactions, and emotionally unstable personalities, called borderline personality disorder. Various behavioral therapy methods in combination with family therapy and psychosocial interventions, continued in an outpatient therapy setting, are established as the most successful kind of treatment. In a few cases, the additional use of medication can improve or accelerate the treatment success (Piccinelli et al. 1995).

Controlled clinical trials are the basis for the development of effective treatment approaches (Vitiello and Jensen 1997). Performing clinical trials in children and adolescents, either to investigate the efficacy and safety of psychotherapeutic treatment strategies or of newer psychopharmacological substances, is difficult from the legal and ethical point of view. Written consent has to be obtained from the proband or patient before starting any investigation. The proband has to be informed about targets, efficacy, and expected side effects of the planned therapy. In the case of children, especially those suffering from mental problems, an adapted, comprehensible form of information needs to be presented, and the parental authorities have to give their written consent as well. Although double-blind placebo-controlled clinical trials are the highest standard to study the efficacy and safety of newer treatment concepts, it is quite difficult from an ethical point of view to carry out such a design in children.

Further aggravating circumstances hinder a fast performance of clinical trials in children and adolescents. First, the brain of a minor is not fully mature. Brain dysfunction and resulting mental disturbances may be transient, like separation anxiety and sleep problems in early childhood. The child psychiatrist has to be aware of this fact, especially regarding the duration of medical and psychotherapeutic treatment interventions. Further, risks of pharmacological side effects and unintentional reactions to psychotherapeutic interventions are higher in children and adolescents than in adults as a consequence of the unstable somatic and mental system. Secondly, some mental disorders in children and adolescents may be triggered by an altered family environment with communication problems, possibly with violence or sexual abuse. In these cases the treatment of the family system and not exclusively the individual therapy is the primary aim. Unintended reactions, like separation of the parents, should be mentioned as a kind of side effect.

6. Future Directions

In developing countries the scarcity of trained personnel has prevented a higher specialization of child mental health services. For professionals in developing


countries the term ‘child mental health’ therefore covers a broad range of problems, including neurological and developmental disorders, mental retardation, educational difficulties, and psychiatric disorders. Worldwide prevalence rates (an estimate of the persons with a disorder in relation to the whole population) of about 15 percent for children with mental problems emphasize that developing countries need support in the training and further specialization of their professionals who are responsible for child mental healthcare. Cooperative research projects, under the patronage of the WHO and of the local psychiatric associations, may be one strategy for analyzing the quantity and the quality of child mental health problems in developing countries. It is necessary to recognize characteristic patterns of mental health problems, to describe risk factors (like malnutrition, war and displacement, political oppression, poverty, child labor, urbanization, and social changes) and to develop treatment concepts in consideration of the specific sociocultural background of the community (Rahman et al. 2000).

Although the establishment of specialized mental health services in developing countries is a primary goal, improvement of child and adolescent psychiatry in developed countries remains a permanent task. Growing insight in molecular biology increases knowledge about brain function and improves the understanding of the mind–body relation (Deutsch 1990). Genetic factors responsible for the manifestation of mental diseases will be described. Biotechnology will offer new therapeutic strategies, which have to be examined in clinical trials by child and adolescent psychiatrists.

Going one step back from the future perspectives to present problems, child and adolescent psychiatrists need to perform more controlled clinical trials assessing the efficacy and safety of psychotherapeutic and psychopharmacological interventions, alone and in combination. It is necessary to know more about long-term courses and outcomes of mental diseases from childhood to young adulthood and to monitor the efficacy of pharmacological and psychotherapeutic treatment over a longer time range. Comprehensive neuroscience research in particular will increase the knowledge about disorders with an early onset and a persistent pattern of symptoms and about specific development-related disorders.

Some mental disorders are quite rare in children and adolescents, like manic affective disorder, so multicenter studies in many hospitals have to be designed to investigate the efficacy of therapies in a sufficient number of child and adolescent patients. Other early-onset problems in children, like the different kinds of learning disabilities, need to be investigated more exactly for the development of treatment concepts with higher efficacy. Especially in developing countries with a weak school network, knowledge about learning disabilities, about efficient treatment approaches and perhaps about their prevention, might be very helpful (Schmidt and Remschmidt 1989).

Ultimately, the vision of child psychiatry is the prevention of diseases (Greenfield and Shore 1995). Universal prevention, comparable to vaccination strategies, is hardly achievable in psychiatry. Even selective prevention by early assessment, provisional diagnoses and early therapeutic and psychosocial interventions in identified risk groups is difficult and carries the problem of false positive assignments with the risk of stigmatizing people. Before acute clinical symptoms occur, mainly unspecific problems are registered, giving some evidence of the later manifestation of a specific mental disorder. Defined criteria of premorbid symptomatology associated with prevention concepts have not yet been established. Progress in biotechnology, by detecting disease-specific genetic patterns, will probably improve the chance of selective prevention. A more realistic concept is that of indicated prevention in the earliest stage of a manifested disorder with minor clinical symptoms (McGuire and Earls 1991).

See also: Adolescence, Psychiatry of; Adolescence, Sociology of; Adolescent Development, Theories of; Adolescent Health and Health Behaviors; Anxiety Disorder in Children; Behavior Therapy with Children; Childhood and Adolescence: Developmental Assets; Childhood Depression; Childhood Health; Infant and Child Development, Theories of; Mental Health Programs: Children and Adolescents; Substance Abuse in Adolescents, Prevention of

Bibliography

American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders (DSM-IV), 4th edn. American Psychiatric Association, Washington, DC
Barkley R A 1998 Attention Deficit Hyperactivity Disorder: A Handbook for Diagnosis and Treatment, 2nd edn. Guilford Press, New York
Biederman J, Newcorn J, Sprich S 1991 Comorbidity of attention deficit hyperactivity disorder with conduct, depressive, anxiety, and other disorders. American Journal of Psychiatry 148: 564–77
Brownell K D, Fairburn C G 1995 Eating Disorders and Obesity. A Comprehensive Handbook. Guilford Press, New York
Caron C, Rutter M 1991 Comorbidity in child psychopathology: Concepts, issues and research strategies. Journal of Child Psychology and Psychiatry 32: 1063–80
Cicchetti D, Cohen D J (eds.) 1995 Developmental Psychopathology. Wiley, New York, Vol. 2
Costello E J, Angold A, March J, Fairbank J 1998 Life events and post traumatic stress: The development of a new measure for children and adolescents. Psychological Medicine 28: 1275–88
Deutsch S I, Weizman A, Weizman R (eds.) 1990 Application of Basic Neuroscience to Child Psychiatry. Plenum Medical, New York


Esser G, Schmidt M H, Woerner W 1990 Epidemiology and Rutter M 1989 Pathways from childhood to adult life. Journal of
course of psychiatric disorders in school-age children—results Child Psychology and Psychiatry 30: 23–51
of a longitudinal study. Journal of Child Psychology and Rutter M 1999 Psychosocial adversity and child psychopath-
Psychiatry 31: 243–63 ology. British Journal of Psychiatry 174: 480–93
Greenfield S F, Shore M F 1995 Prevention of psychiatric Rutter M, Silberg J, O’Connor T, Simonoff E 1999 Genetics and
disorders. Harvard Review of Psychiatry 3: 115–29 child psychiatry: II Empirical research findings. Journal of
Houts A C, Berman J S, Abramson H 1994 Effectiveness of Child Psychology and Psychiatry 40: 19–55
psychosocial and pharmacological treatments for nocturnal Rutter M, Smith D 1995 Psychosocial Disorders in Young People:
enuresis. Journal of Consulting and Clinical Psychology 62: Time Trends and Their Causes. Wiley, Chichester, UK
737–45 Rutter M, Taylor E, Hersov L 1994 Child and Adolescent
Jensen P S, Rubio-Stipec M, Canino G, Bird H R, Dulcan M K, Psychiatry. Blackwell, Oxford, UK
Schwab-Stone M E, Lahey B B 1999 Parent and child contri- Schmidt M H, Remschmidt H (eds.) 1989 Needs and Prospects of
butions to diagnosis of mental disorders: Are both informa- Child and Adolescent Psychiatry. Hogrefe & Huber, Toronto,
tions always necessary? Journal of the American Academy of ON
Child and Adolescent Psychiatry 38: 1569–79 Schwab J J, Schwab-Stone M E 1999 History of child psychiatry
Kaplan H I, Sadock B J 1998 Kaplan and Sadock’s Synopsis of in the USA. From social reform and psychoanalysis to
Psychiatry: Behavioral Sciences/Clinical Psychiatry. Williams psychiatry of the family. Zeitschrift für Kinder und Jugend-
& Wilkins, Baltimore, MD psychiatrie und Psychotherapie 27: 277–81
Kazdin A E 1997 Practitioner review: Psychosocial treatments Shapiro A K, Shapiro E S, Young J G, Feinberg T E 1988 Gilles
for conduct disorder in children. Journal of Child Psychology de la Tourette Syndrome. Raven, New York
and Psychiatry 38: 161–78 Simeon J G, Ferguson H B (eds.) 1990 Treatment Strategies in
Laucht M, Esser G, Schmidt M H 1997 Developmental outcome Child and Adolescent Psychiatry. Plenum Press, New York
of infants born with biological and psychosocial risks. Journal Simon B 1996 The history of psychiatry: An opportunity for self-
of Child Psychology and Psychiatry 38: 843–53 reflection and interdisciplinary dialogue. Essay review. Psy-
Lewis M 1996 Child and Adolescent Psychiatry. Williams & chiatry 59: 336–56
Wilkins, Baltimore, MD Steinhausen H C 1995 Eating Disorders in Adolescence. Walter
McDougle C J, Holmes J P, Bronson M R, Anderson G M, de Gruyter, Berlin
Volkmar F R, Price L H, Cohen D J 1997 Risperidone Vitiello B, Jensen S 1997 Medication development and testing in
treatment of children and adolescents with pervasive devel- children and adolescents—current problems, future directions.
opmental disorders: A prospective, open label study. Journal Archies of General Psychiatry 54: 871–6
of the American Academy of Child and Adolescent Psychiatry Wiener J M 1997 Textbook of Child & Adolescent Psychiatry.
36: 685–93 American Psychiatric Press, Washington, DC
McGuire J, Earls F 1991 Prevention of psychiatric disorders in World Health Organization ( WHO) 1992 The ICD-10 Classifi-
early childhood. Journal of Child Psychology and Psychiatry cation of Mental and Behaioral Disorders-Clinical Descrip-
32: 129–54 tions and Diagnostic Guidelines. WHO, Geneva
Munir K M, Beardslee W R 1999 Developmental psychiatry: Is
there any other kind? Harard Reiew of Psychiatry 6: 250–62 M. H. Schmidt and A. Maras
Nathan P E, Langenbucher J W 1999 Psychopathology: De-
scription and classification. Annual Reiew of Psychology 50:
79–107
Nissen G, Fritze J, Trott G E 1998 Psychopharmaka im Kindes-
und Jugendalter. Gustav Fischer Verlag, Ulm, Germany
Noshpitz J D 1981 Psychotherapy with children: Basic princi-
ples. Current Psychiatric Therapies 20: 47–59 Child Care and Child Development
Ollendick T H, Hersen M 1993 Handbook of Child and Ado-
lescent Assessment. Allyn & Bacon, Boston Childcare is regular care provided by someone other
Piccinelli M, Pini S, Bellantuono C, Wilkinson G 1995 Efficacy than a child’s parents. Throughout human history,
of drug treatment in obsessive-compulsive disorder. A meta-
analytic review. British Journal of Psychiatry 166: 424–43
grandparents, siblings, and relatives have cared for
Rahman A, Mubbashar M, Harrington R, Gater R 2000 young children, but since about the 1950s, nonparental
Annotation: Developing child mental health services in childcare has become an increasingly visible and
developing countries. Journal of Child Psychology and Psy- prevalent part of the lives of young children in most
chiatry 41: 539–46 industrialized societies, largely because of increasing
Remschmidt H, Schmidt M H 1985 Kinder- und Jugend- rates of maternal employment and of single-mother
psychiatrie in Klinik und Praxis. Thieme, Stuttgart, Germany families. In the US, by 1997, the majority of infants
Remschmidt H, Schmidt M H 1994 Multiaxiales Klassifi- and preschool children spent some time in childcare;
kationsschema fuW r psychische StoW rungen des Kindes- und many of them were in full-time care (35j hours per
Jugendalters nach ICD-10 der WHO. Huber, Bern, Switzer- week) (Capizzano and Adams 2000).
land
Remschmidt H, van Engeland H (eds.) 1999 Child and Adolescent
This rapid and dramatic change in the ecology of
Psychiatry in Europe. Steinkopff, Darmstadt, Germany children’s experience has raised a host of questions
Reynolds W M, Johnston H F 1994 Handbook of Depression in about the benefits and risks associated with non-
Children and Adolescents. Plenum Press, New York parental childcare. Strong value commitments and
Rothenberger A (ed.) 1990 Brain and Behaior in Child Psy- assumptions have often influenced the questions
chiatry. Springer, Berlin asked. In the first wave of research in the 1970s, for

example, researchers asked whether ‘day care’ was harmful to children; they rarely considered potential
benefits. At the same time, others were investigating the benefits of ‘early childhood intervention’ without
considering possible harmful effects. Both groups were examining systems of nonparental care for young
children, but the ways they framed questions and the labels they used led to different conclusions.

1. Is Extensive Childcare Harmful?

Why might we expect nonparental childcare to be harmful for children? According to the ‘maternal
deprivation’ view, having one primary caregiver (usually the mother) with whom to develop an early
attachment relationship is critical to the socioemotional development of young children. This view
implies that even very high quality nonmaternal care could impair the development of a secure attachment
to the mother if it seriously reduced the amount of time that a child spends with its mother. The most
commonly used measure of attachment is the Strange Situation, a laboratory procedure in which children
are observed during and after brief separations from their mothers. Secure attachment is indicated when the
child goes to the mother and is comforted by her. Insecure attachment may be manifested by avoiding
or ignoring the mother or by clinging to her without being comforted.

A large number of early studies concluded that childcare in the preschool years did not have harmful
effects on attachment security or later socioemotional development, but the evidence is more mixed with
respect to extensive nonmaternal care in the first year of life. Although the majority of children who receive
full-time childcare in infancy have secure attachments to their mothers, some studies find elevated rates of
insecure attachment when comparing these children to those in exclusive maternal care or part-time childcare
(Clarke-Stewart 1989). A major study of 1,153 infants across the US indicated that children who spent
extensive time in childcare during the first 3 years of life had mothers who were slightly less sensitive in
interactions with them, and that extensive care was associated with elevated rates of insecure attachment
only for children whose mothers were insensitive (NICHD Early Child Care Research Network 1997,
1999a, 1999b). As one reviewer concluded, ‘Adverse effects on infant-mother attachment appear to occur
only when infant day-care co-occurs with other risky conditions, …’ (Lamb 1997).

It does not appear that time in childcare, in and of itself, has long-term effects on most other aspects of
socioemotional development, but children with extensive care from infancy on tend to be more aggressive
and assertive and to be less compliant to adults in some settings than are those with less child care
experience. Many of these differences are not apparent when children have experienced high quality care
(Lamb 1997).

2. Quality of Care

A second theory predicts that the effects of childcare depend on the physical and social environment
provided. Harmful effects on both social and intellectual development might be expected if children receive less
attention, affection, interaction, and stimulation from nonparental adults than they would from their
parents. This view implies that the effects of care depend on its quality and on the quality of the home
environment. Quality can be defined by the processes that occur in the setting: (a) sensitive, responsive,
positive interactions of adults with children; (b) intellectually stimulating adult actions and activities,
including appropriate language, reading, and play; and (c) toys, materials, and curricula that provide
age-appropriate opportunities for learning. Quality can also be defined by structural characteristics, some of
which can be mandated by regulatory agencies: (a) small ratios of children to adults; (b) small group sizes;
(c) caregiver education and training in child development; (d) sufficient space per child; (e) a safe and
clean physical environment; and (f) continuity and wages of the staff. Structural and process measures are
correlated moderately with one another.

Whether process or structural indicators of quality are used, children in high-quality care have higher
levels of language and intellectual skills, and perform better on academic tasks than do children in
low-quality care. Although this finding is consistent across studies, there is some disagreement about whether
quality produces a large enough difference to be socially significant (Lamb 1997, Scarr 1998). In one
analysis, the quality of the child care environment and characteristics of the home environment during the
first 3 years of life were compared as predictors of school readiness and language skill at age 3. The size of
the childcare quality effect was about half that of the family environment, suggesting that childcare makes a
substantial contribution to language and academic development (NICHD 1999a, 1999b). Critics have
also argued that childcare effects do not last into later childhood, but evidence for long-term effects is
beginning to accumulate. In a large study following children from age 4 through grade 3, children who
attended higher quality childcare centers performed better on measures of cognitive skills (e.g., math and
language abilities) while they were in childcare and, in many cases, through the end of second grade
(Peisner-Feinberg et al. 1999).

High quality childcare also predicts social skills with peers and social competence with adults, but the
relations between quality and social behavior are
weaker than those between quality and cognitive/academic performance. Some studies suggest that
children’s relationships with their teachers in child care forecast social skills and relations with teachers in
elementary school (Howes et al. 1998, Peisner-Feinberg et al. 1999).

One problem in evaluating childcare effects is that families with more resources, better education, and
more sensitive parenting styles (to name only a few attributes) place their children in higher quality
childcare settings than do less advantaged families. Most recent studies have measured both family and
childcare characteristics so that the independent contributions of each could be evaluated, but there is still the
possibility that other unmeasured family attributes could affect childcare choices and children’s cognitive
and intellectual development.

Experimental studies, in which children are assigned randomly to enriched early care experiences or to
control groups, demonstrate clearly that high quality early care has lasting effects on cognitive and academic
skills, at least for children from economically disadvantaged families. Experiments avoid possible
confounds between family and childcare attributes. The Abecedarian project enrolled children from
low-income families in educational childcare from infancy until they reached school age. Children in the
treatment group performed better than children in a control group (who received social and nutritional services) on
tests of intelligence and school achievement throughout childhood and adolescence (Ramey et al. 2000).
Less intensive early intervention programs for children ages 3–5 also produced lasting effects on children’s
school progress; participating children were less likely to be retained in a grade or to need special education
than were children in control groups (Lazar and Darlington 1982).

The importance of childcare does not end when children enter school. Participation in formal
after-school care programs that provide cognitive stimulation and positive adult interactions is associated
with high academic achievement and low levels of behavior problems, particularly among low-income
children (Posner and Vandell 1999). Children without adult supervision in the out-of-school hours are at risk
for behavior problems and poor adjustment, particularly if they live in low-income families or unsafe
neighborhoods (Pettit et al. 1999).

3. Type of Care

A great deal of childcare occurs in the child’s home or someone else’s home, especially for infants and
toddlers. Caregivers can be grandparents, other relatives, or nonrelatives. In the US, a small percentage of
infants receive center-based care, but the percentage increases as children reach ages 3–5. Many western
European countries have extensive center-based care for children age 3 and over, but care settings for
younger children vary considerably (Kamerman and Kahn 1991). The available data suggest both
advantages and disadvantages of center-based care relative to home-based care for very young children. Infants
and young children in center care show better language and cognitive development by ages 2 and 3 years than
those in home-based settings of comparable quality, but they also have more communicable illnesses (e.g.
colds and other respiratory illnesses) (NICHD 2000, in press).

Children attending a center or a childcare home with several other children have more experience with
peers than do children in other types of home-based care. Children with extensive group experience
develop better skills interacting with peers, but their caregivers also rate them higher on negative behavior
(e.g. aggression, disobedience) than children in other forms of care (Lamb 1997). For young children, social
skills and sociability with peers are often accompanied by a certain amount of aggression; perhaps early
contact with peers leads to increases in a range of social behaviors.

4. Do Effects Differ for Children from Different Family Backgrounds?

In the US, where most childcare is funded privately, children from affluent families receive higher quality
care than children from families with low and moderate incomes do. In some cases, children from very
poor families receive slightly higher quality care than children from families with modest incomes, largely
because of publicly funded programs for children in poverty (Phillips et al. 1994). In countries with publicly
funded childcare for children from all income levels, average quality appears to be considerably better than
it is in the US, and there are not wide discrepancies in quality associated with family income (Lamb 1997).

Some researchers have proposed a compensatory hypothesis: that children from disadvantaged homes
might profit from childcare of reasonable or high quality because it provides more opportunities for
learning and development than they receive at home. The complementary lost resources hypothesis suggests
that children from highly advantaged homes may be harmed by childcare because it provides fewer
opportunities than their home environments do. The evidence for these hypotheses is mixed. Almost all of the
experiments exposing children to high quality care have included only economically disadvantaged
children, and these children clearly profit from such care. Some investigations have found support for the lost
resources notion, but most studies support the idea that high quality care provides benefits for children
from a wide range of backgrounds (Lamb 1997).

5. Public Policy Issues

5.1 Public vs. Private Funding

Childcare is expensive. In some industrialized nations, public funds pay the great majority of the costs.
Sweden, France, and some other countries provide large-scale publicly funded, full-day, preschool
programs. In the US, at the other end of the spectrum, limited amounts of public funds are available to
low-income families, and tax credits cover a small portion of childcare costs for people earning enough to owe
taxes. Overall, parents pay the vast majority of the costs of care. At the same time, childcare workers and
staff receive very low wages. In 1997, teachers in childcare centers, many of whom had college degrees,
earned from approximately $13,000 to $19,000 per year; salaries of teaching assistants ranged from
$10,500 to $12,250 per year (Whitebook et al. 1998). As a result, there is high turnover among childcare
workers.

5.2 Family Leave vs. Infant Care

Care for infants is more expensive than care for preschoolers because they require more concentrated
adult attention and they fare best in small groups. Many industrialized nations provide parents with paid
family leave as an alternative to supplying center-based infant care (Kamerman and Kahn 1991). In the
US, some workers receive paid family leave, but federal law requires only that parents receive up to 12
weeks of unpaid leave with guaranteed job security and medical benefits. Even this policy applies only to
full-time, full-year workers in large organizations. As a result, many infants enter nonmaternal care between
ages 2 and 3 months. There is some evidence that mothers who return to work very early and whose babies have
extensive nonmaternal care are less sensitive during interactions with their infants than are mothers who
have more time at home (Clark et al. 1997). It may be more difficult to learn your infant’s signals and
respond appropriately when you spend a great deal of time at work starting very early in the child’s life.

6. Conclusion

Most children in modern industrialized societies will spend a significant portion of their early lives in
childcare, and many will require such care during the early school years as well. Initial fears that child care
per se would be harmful have not been supported, but there is some evidence that extensive care, beginning in
early infancy, can make it more difficult for mothers to relate sensitively to their infants. Whether childcare
has positive or negative effects on children’s development depends primarily on its quality. Quality is
defined by sensitive, responsive, and stimulating caregivers and age-appropriate activities and curriculum.
High quality care promotes cognitive, academic, and social development, and its effects last beyond the
preschool years. High quality care for children from all economic levels is provided more successfully in
countries that make large public investments in early care than in countries that require parents and child
care staff to pay the costs of care.

7. Future Directions

Nonmaternal care is here to stay. Policy debates about what is ‘good enough’ care abound, and research is
needed to identify thresholds of quality that make large differences, with particular attention to
individual child and family characteristics. We understand how early childcare can contribute to cognitive and
academic performance better than we understand how it can influence positive social behavior with both
peers and adults. New designs, including experiments with random assignment, could help to answer these
important questions.

See also: Divorce and Children’s Social Development; Nontraditional Families and Child Development;
Parenting: Attitudes and Beliefs; Parents and Teachers as Partners in Education

Bibliography

Capizzano J, Adams G 2000 The Hours that Children under Five Spend in Child Care: Variations across States. A Report of Assessing the New Federalism. The Urban Institute, Washington, DC
Clark R, Hyde J S, Essex M J, Klein M H 1997 Length of maternity leave and quality of mother-infant interactions. Child Development 68: 364–83
Clarke-Stewart K A 1989 Infant day care: Maligned or malignant? American Psychologist 44: 266–73
Howes C, Hamilton C E, Philipsen L C 1998 Stability and continuity of child-caregiver and child-peer relationships. Child Development 69: 418–26
Kamerman S B, Kahn A J 1991 Child Care, Parental Leave, and the Under-3s: Policy Innovation in Europe. Auburn House, New York
Lamb M E 1997 Nonparental child care: Context, quality, correlates, and consequences. In: Damon W, Sigel I, Renninger K A (eds.) Handbook of Child Psychology: Vol. 4. Child Psychology in Practice, 5th edn. Wiley, New York, pp. 73–134
Lazar I, Darlington R 1982 Lasting effects of early education: A report from the Consortium for Longitudinal Studies. Monographs of the Society for Research in Child Development 47(2–3): 1–151
NICHD Early Child Care Research Network 1997 The effects of infant child care on infant-mother attachment security: Results of the NICHD study of early child care. Child Development 68: 860–79
NICHD Early Child Care Research Network 1999a Child care and mother-child interaction in the first 3 years of life. Developmental Psychology 35: 1399–1413
NICHD Early Child Care Research Network 1999b Child outcomes when child-care classrooms meet recommended standards for quality. American Journal of Public Health 89: 1072–7
NICHD Early Child Care Research Network 2000 The relation of child care to cognitive and language development. Child Development 71: 960–80
NICHD Early Child Care Research Network in press. Child care and common communicable illnesses: results from the NICHD Study of Early Child Care. Archives of Pediatrics and Adolescent Medicine
Peisner-Feinberg E S, Burchinal M R, Clifford R M, Culkin M L, Howes C, Kagan S L, Yazejian N, Byler P, Rustici J, Zelazo J 1999 The Children of the Cost, Quality, and Outcomes Study Go to School. University of North Carolina, Chapel Hill, NC
Pettit G S, Bates J E, Dodge K A, Meece D W 1999 The impact of after-school peer contact on early adolescent externalizing problems is moderated by parental monitoring, perceived neighborhood safety, and prior adjustment. Child Development 70: 768–78
Phillips D A, Voran M, Kisker E, Howes C, Whitebook M 1994 Child care for children in poverty: Opportunity or inequity? Child Development 65: 472–92
Posner J K, Vandell D L 1999 After-school activities and the development of low-income urban children: A longitudinal study. Developmental Psychology 35: 868–79
Ramey C T, Campbell F A, Burchinal M R, Skinner M L, Gardner D M, Ramey S L 2000 Persistent effects of early childhood education on high-risk children and their mothers. Applied Developmental Science 4: 2–14
Scarr S 1998 American child care today. American Psychologist 53: 95–108
Whitebook M, Howes C, Phillips D 1998 Worthy Work, Unlivable Wages. The National Child Care Staffing Study, 1988–1997. Center for Child Care Workforce, Washington, DC

A. C. Huston

Childbearing, Externalities of

In any family, lower fertility would raise income per family member at least in the short run, since the
number of family members to share income would be smaller and women of reproductive age might do more
market work. Would this higher per capita income justify a government policy to reduce the rate of
population growth by lowering fertility? Not necessarily. By choosing to have a child, people express a
preference for the child over the additional consumption that would otherwise be possible for family
members. However, there may be costs and benefits to the additional child that are not directly borne by the
decision-making parents, but are rather passed on to other families and to society as a whole. Such
consequences, if not mediated by the market, are known as pure or technical externalities to
childbearing. When they are present, then individually optimal childbearing decisions need not add up to
socially optimal fertility. In this case, there is a role for government intervention to influence fertility
decisions. In fact, externalities are pervasive in all societies. This article will discuss their sources, their size, and
their policy implications.

1. Why Childbearing Externalities Matter

When consequences of childbearing are mediated by the market, they are called pecuniary externalities, as
when an additional child reduces the wages of future workers by increasing their number. It has been shown
that in this case, socially optimal fertility will not diverge from the individually optimal level
(conditional on additional assumptions and a restrictive concept of social optimality; see Nerlove et al. 1987).
Technical externalities do not pass through the market, as would be the case if an additional child meant
higher taxes for others or a bigger hole in the ozone layer.

Economic theory asserts that in the absence of technical externalities, collective welfare will be
maximized (in the sense of Pareto optimality) by individuals pursuing their own self-interest in the context of a
competitive market (subject to some further conditions). In the presence of technical externalities, this
is no longer so. Disregarding technical externalities, the sum of individually optimal fertility decisions
should lead to the same outcome as a collective societal decision about fertility levels. With technical
externalities, the outcomes would differ. Thus Garrett Hardin, in a famous article, The Tragedy of the Commons,
which first gave this issue prominence, called for ‘mutual coercion, mutually agreed upon’ (Hardin
1968, p. 1247). Subsequent articles by Demeny (1972), Blandy (1974), Ng (1986), Nerlove et al. (1987), and
Willis (1987) developed the theory from a more rigorous economic standpoint.

Externalities to childbearing are salient in several modern policy contexts, as follows. First, in
industrialized nations, fertility is on average about 1.5 children per woman, well below replacement level, leading to
rapidly aging populations. Why is it so low? When old age support is provided to elders by their own adult
children, this enters a couple’s cost/benefit calculus and provides an incentive for higher fertility. In
industrialized nations, although old age support is provided by the younger generations, this is done
generally through public sector tax and transfer pension programs, rather than by the elders’ own
children. Thus the creation of public pension programs has created a large positive externality to childbearing:
higher fertility is socially beneficial through the pension programs, but this social benefit does not impinge
on individual decisions since it is the general level of fertility that matters.

Second, in developing nations, the public sector provides education and healthcare for children, so
additional children impose tax costs on others, a negative externality. Governments typically spend
little on the elderly in these countries, and also there are few elderly, so the public costs of children
dominate. This gap between the private and social costs of children is a negative externality that is often
taken to justify government intervention.

Third, in all countries additional children will place additional demands on the environment, now and in
the future, and most environmental amenities (clean air, fresh water, the ozone layer, climate and CO₂
emissions, biodiversity, forest cover) are outside the market. These negative environmental externalities to
childbearing are therefore potentially very important, particularly in the industrialized nations where
consumption per capita is greater.

Focusing on just a few issues like these can be misleading. A proper understanding of childbearing
externalities requires a more comprehensive approach.

2. Sources of Externalities

Sources of technical externalities may be grouped into (a) common resources or collective wealth, (b) public
sector transfers from one age group to another, and (c) provision of public goods or social infrastructure. In
addition, there is a wide range of pecuniary externalities, of which the most important is a potential adverse
effect on future wages or per capita incomes. Each of these three will be briefly discussed, before we turn to
a quantitative assessment.

The existence and magnitude of externalities depend on institutional context, in particular on the existence
of property rights in resources and on the size of the public sector. With full property rights and no public
sector, most childbearing externalities would vanish.

2.1 Common Resources or Collective Wealth

The most basic externality to childbearing occurs when an asset is commonly owned and all members of
the population have free access to it. Suppose a group shares a common pasture for its cows. The larger the
group, the fewer cows each will be able to graze without degrading the pasture and the cows. Each
birth increases the size of the group, but if there are many families, then this diluted effect will not count in
the parents’ self-centered cost-benefit analysis. Thus there is a negative externality. Fertility will be higher,
the group larger, and all worse off than with a collective fertility decision (Hardin 1968). A similar
argument could be made about environmental amenities (water, air, climate, ozone layer, etc.). Nationally
owned land, parks, and mineral or fossil fuel deposits can likewise be important sources of negative
externalities.

The argument also works in the opposite direction, when collective costs are shared, as with national debt.
The US national debt is roughly $20,000 per capita. Additional members of the population due to births or
immigration take on a share of this obligation as taxpayers, thereby reducing what must be paid by the
balance of the population. This is a positive externality to childbearing.

2.2 Public Sector Inter-age Transfers

When parents rear their children and thereby transfer income to them, and when adults transfer income to
their elderly parents, no externalities to childbearing arise: the need for transfers enters into the fertility
decision. When transfers take place through the public sector, however, they do lead to externalities. The
most important public transfers in this context are for health, education, and pensions.

2.3 Public Goods and Social Infrastructure

The cost of providing a network of roads rises with the size of the population served within a given
area, but less than proportionately to the population increase; the same applies for communications
networks. These are called quasi-public goods. In addition, there are pure public goods, which by definition
cost no more to provide to many people than to few. The leading example is a nation’s military force, which
can protect a larger population just as well as a smaller one; other examples are broadcasting, weather
forecasting, and scientific research. Provision of a given level of public goods is cheaper per capita for a larger
population, since the tax bill per head will be lower. Public and quasi-public goods give rise to positive
externalities of childbearing.

3. Evaluation

The first attempt to evaluate childbearing externalities appears to have been made by Lee (1991). The
following discussion will draw on the more comprehensive work by Lee and Miller (1990), which
included estimates for six developing nations and the USA for the early to mid-1980s. Begin with collective
wealth or debt. Great variations in natural wealth, particularly for oil in Saudi Arabia and land in Brazil,
lead to negative externalities that dominate other sources for these countries. For the USA, national
debt is an important source of positive externalities, as it would be for many OECD nations. An effort was
made to evaluate other items, but none appears to be very important. Environmental externalities were not
addressed in Lee and Miller, for lack of evidence. Since then, a group of ecologists and
economists have estimated the value of all the services provided by natural resources worldwide to
fall within the range of 16 trillion to 54 trillion dollars per year (Roush 1997). The midpoint
estimate implies a negative externality of −$175,000 per birth worldwide, very approximately. For
most nations, this number would dominate all externalities. Despite the great uncertainty
surrounding the calculation, it warns us that environmental externalities can be very large and
should not be ignored.

Turn now to public sector transfers, from taxpayers to beneficiaries of various programs. When the
direction of transfers is downward from older to younger ages, such as for public education, then
incremental births are costly and there is a negative externality. When the direction of transfers
is upward from younger to older ages, such as for public pensions and healthcare for the elderly,
then incremental births reduce the old age dependency ratio, and there is a positive externality.
When these three major transfer programs and other smaller ones are summed, the net direction of
transfers can be found. In the USA, the net direction is strongly upward, from younger to older.
These results would be even stronger for other OECD countries, since their populations are
typically older and their pensions more generous than in the USA. In the developing nations
evaluated other than Brazil, the net direction of transfers is downward. Like a number of other
Latin American countries, but unlike most countries in Asia and Africa, Brazil has a strong
pension program. It is also likely that the net direction of public transfers throughout the
developing nations is strongly downward, except for those countries in Latin America that have
generous unfunded public pension programs.

Now consider public goods. Expenditures on the military are the main item in this category for the
USA and Saudi Arabia, but for other countries social infrastructure dominates. Public goods and
social infrastructure advantages generate positive externalities to childbearing for all
countries, falling in the range of two to ten times the level of gross national product (GNP) per
capita.

These three broad kinds of externalities can be summed to find the net externality.

If we ignore environmental externalities, there is a substantial positive net externality to
childbearing in the USA, three times as great as per capita GNP. This would probably be
approximately true for other OECD countries as well. None of the developing nations evaluated has
a positive externality, while Brazil and Saudi Arabia have very large negative externalities. It
is striking that Kenya and Bangladesh, two countries that are generally regarded as having serious
population problems, both have externalities near zero relative to GNP per capita. If global
environmental externalities were taken into account at the level discussed earlier, then all net
externalities would be negative.

The estimates just discussed were based on a simple model and the analysis was done by comparative
steady states. In more recent work, Lee and Miller (1997) examined fiscal impact externalities in
a more detailed and nuanced manner. They projected tax payments over a very long horizon according
to whether or not there was an additional birth in the base year, including the impacts of all
descendants of the incremental birth as well. They found a positive net externality of US$170,000,
which is six times per capita GNP for the reference year (which compares to 3.6 in Lee and Miller
1990, adjusted for non-fiscal effects). Given the increase in the cost of healthcare benefits and
differences in some key assumptions, the numbers are in reasonable agreement.


4. Broader Views

The evaluations just reported are based on the narrow concept of technical externalities,
following the main development of the theoretical literature. Some analysts argue for a broader
view of externalities, even though the welfare implications are then no longer clear. Some stress
pecuniary externalities: an additional birth will mean less land and less capital for future
workers, and therefore lower productivity and wages. Nerlove et al. (1987) analyze this case, and
show that under certain assumptions there is no technical externality and collective fertility
decisions would not improve welfare. Nerlove et al. view parents as deriving utility from their
own consumption, the number of children they have, and the future utility or consumption for their
children. But it is possible that the goals of society differ from those of the individual
parents. In particular, society may give the welfare of future generations greater weight than do
the individual parents, or value wilderness or other aspects of the environment more highly.
Society may care more about the distribution of income, or give a different weight to the welfare
of women or pre-existing children than does the household decision-maker. In these cases, there is
no reason to expect that the sum of individual decisions will be socially desirable, and there
will be many other reasons besides technical externalities to expect a gap between the
individually and socially optimal level of fertility.

There may also be a different kind of externality, in which one person’s fertility may influence
decisions by others. For example, if one couple uses contraceptives to control fertility, that may
convey information to other couples, enabling them to make a better-informed decision. Or it may
alter the social norms that influence the fertility decisions of others by weakening the influence
of traditional institutions opposing contraceptive use. Or some couples may


imitate others. Dasgupta (1993) considers some of these possibilities.


5. Implications for Policy

The difficulty in assessing and quantifying environmental externalities, which are potentially
very large for couples in industrialized nations, renders all the estimates even more uncertain.
The numbers presented above are too shaky to guide policy decisions. Nevertheless it is useful to
consider what they would entail if taken at face value, setting aside the question of
environmental externalities.

Once the direction and magnitude of childbearing externalities have been identified, there is a
clear policy implication: governments can, in principle, improve the welfare of their populations
by inducing couples to have more children where externalities are positive, and fewer children
where they are negative. One means to achieve this would be by internalizing the externality, such
that through taxes or subsidies the couple is brought to face the full social cost or benefit of
the incremental child when making its fertility decision. For example, in industrialized
countries, the size of the retirement pension could be linked to each couple’s fertility level; or
a couple could receive a bonus of US$170,000 (say) on the birth of each child. In developing
nations, each couple might be compelled to make a lump sum payment at the time of the birth of
each child. In reality such policies appear to be neither practical nor desirable. They would
adversely affect the well-being of children and probably fall disproportionately on the poorer
segment of society in developing nations.

Nor would such policies necessarily be desirable from a theoretical perspective, because the
measured externality depends sensitively on the institutional context and on public sector
resource allocation. If a country has a positive externality to childbearing due to a large
military establishment, the externality could be reduced by reducing the share of military
expenditures in the national budget. If a country has a positive externality to childbearing due
to a generous unfunded public pension program, the externality could be eliminated by switching to
a funded system. If there is a large negative childbearing externality in a developing nation due
to public education, note should be taken that fertility is falling rapidly in most parts of the
world where it is not already low, and that transfers may soon be flowing upward across the age
distribution. The existence of negative childbearing externalities reflects the ability of
individual couples to pass on some of the childrearing costs to society at large. While not
unimportant, this is probably not a major reason why fertility is high. More likely high fertility
results in part from obstacles faced by couples in obtaining contraceptives, and in part from the
absence of institutions which provide superior substitutes for many of the services that children
provide, from insurance against sickness, unemployment, or death of spouse, to financial
institutions and reliable pension programs. It would most likely be a mistake to attempt to
fine-tune policy to these measured externalities rather than to make more efficient the
institutional context within which childbearing decisions are made.

See also: Family Size Preferences; Family Theory and the Realities of Childbearing Behavior;
Family Theory: Economics of Childbearing; Family Theory: Feminist–Economist Critique; Family
Theory: Role of Changing Values; Fertility Transition: Cultural Explanations; Fertility
Transition: Economic Explanations; Gender and Reproductive Health; Motherhood: Economic Aspects;
Motherhood: Social and Cultural Aspects; Personality Disorders


Bibliography

Blandy R 1974 The welfare analysis of fertility reduction. The Economic Journal 84(333): 109–29
Dasgupta P 1993 An Inquiry into Well-Being and Destitution. Clarendon Press, Oxford, UK
Demeny P 1972 The economics of population control. In: National Academy of Sciences, Rapid
Population Growth: Consequences and Policy Implications. Johns Hopkins University Press,
Baltimore, MD
Hardin G 1968 The tragedy of the commons. Science 162: 1243–8
Lee R D 1991 Evaluating externalities to childbearing in developing countries: The case of India.
In: Consequences of Rapid Population Growth in Developing Countries. Taylor and Francis, London,
pp. 297–342
Lee R D, Miller T 1990 Population growth, externalities to childbearing, and fertility policy in
the Third World. Proceedings of the World Bank Annual Conference on Development Economics, 1990.
Supplement to The World Bank Economic Review and to The World Bank Research Observer pp. 275–304
Lee R D, Miller T 1997 The life time fiscal impacts of immigrants and their descendants. Project
on the economic demography of interage income reallocation, demography, UC Berkeley. Draft of
Chapter 7 for The New Americans, a report of the National Academy of Sciences Panel on Economic
and Demographic Consequences of Immigration. National Academy Press pp. 297–362
Nerlove M, Razin A, Sadka E 1987 Household and Economy: Welfare Economics of Endogenous
Fertility. Academic Press, London
Ng Y 1986 The welfare economics of population control. Population and Development Review 12(2):
247–66
Roush W 1997 Putting a price tag on nature’s bounty. Science 276: 1029
Willis R 1987 Externalities and population. In: Johnson D G, Lee R D (eds.) Population Growth and
Economic Development: Issues and Evidence. Wisconsin University Press, Milwaukee, WI

R. D. Lee

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7


Childhood and Adolescence: Developmental Assets

Developmental assets represent a theoretical construct first articulated in 1990 (Benson 1990).
Based on a synthesis of scientific studies in pertinent fields, developmental assets identify a
series of social and psychological strengths which function to enhance health outcomes for
children and adolescents. The purposes of ongoing research pertaining to developmental assets are
to develop new lines of scientific inquiry on the sources and consequences of strength-building
approaches in child and adolescent development and to attempt to provide a conceptual roadmap to
guide the design and implementation of community-wide initiatives that are aimed at promoting
healthy development among children and adolescents. The conceptual, research, and application
dimensions of developmental assets are described here.


1. Developmental Assets in Social and Conceptual Context

The framework of developmental assets weaves together, into an a priori conceptual model, a set of
developmental experiences, resources, and opportunities, each of which contributes to important
health outcomes, conceived as both the reduction of health-compromising behaviors and the increase
of positive or thriving outcomes such as school success. Though the framework is supported by
scientific study, it was purposefully designed to fuel and guide community-based approaches to
strengthen the natural and inherent socialization capacity of communities. Therefore, assets
include the kinds of relationships, social experiences, social environments, and patterns of
interaction known to promote health and over which a community has considerable control.

Developmental assets, then, represent a framework, grounded in scientific study, with the applied
aim of reweaving the developmental infrastructure of a community by activating multiple sources of
asset building. These include informal, nonprogrammatic relationships between adults and youth;
traditional socializing systems such as families, neighborhoods, schools, congregations, and youth
organizations; and the governmental, economic, and policy infrastructures which inform those
socializing systems. The intent is to encourage the mobilization of asset-building efforts within
many settings of a child’s life and to increase those efforts for all children and adolescents
within a community. The developmental assets framework, the theoretical underpinnings of the
framework, and its partner concept of asset-building communities are discussed in depth in a
series of recent publications (e.g., Benson 1997, Benson 1998, Benson et al. 1998).


1.1 Connection to Other Areas of Scientific Inquiry

The developmental asset framework is related to several other streams of scientific study, which
also seek to identify positive developmental experiences and competencies known to enhance health
and well-being among adolescents. Among these is the emerging exploration of protective factors in
the fields of alcohol, tobacco, and pregnancy prevention. For example, Resnick and his colleagues
(1997) demonstrated the importance of family and school connections in reducing multiple forms of
health risk behaviors. Similarly, the study of resiliency has identified characteristics that
enable some children to navigate through and around what often are debilitating environmental
risks and experiences. This has contributed considerably to our understanding of the scope and
nature of social and psychological strengths (Masten et al. 1990, Rutter 1985, Werner and Smith
1992). In addition, the more applied field of youth development champions the inclusion of this
emerging body of knowledge about developmental strengths into policy, programs, and practice
(Pittman and Cahill 1991). The developmental asset model builds on these areas of inquiry and
includes a number of elements from them in the 40 core developmental assets.

The developmental asset framework also provides a complementary approach to the paradigm of
deficit reduction. A deficit reduction model is focused on reducing threats, obstacles, and risks
that interfere with healthy development. Among these are abuse; neighborhood violence; access to
alcohol, other drugs, and firearms; poverty; and family dysfunction. While research has shown that
these factors are related to a number of negative outcomes, efforts that focus mainly on
controlling or reducing them represent incomplete approaches to health promotion. As Benson and
colleagues (Benson 1997, Benson et al. 1998) have discussed, approaches which depend exclusively
on deficit reduction may unintentionally expand the role of professionals, programs, and policy in
child and adolescent health. This may be to the detriment of more informal and natural capacities
that may be rooted in the community.

The asset model provides, then, an alternative and complementary set of ‘benchmarks’ or targets
that can be added to the necessary and influential paradigm of risk reduction. Rather than
focusing solely on problems or threats to be reduced, it accents relationships, experiences,
resources, and opportunities to be promoted.


1.2 Community as a Context for Human Development

The developmental asset framework is also connected, both intellectually and strategically, to the
fields of community change and community building. These areas have historically focused on the
economic, service, and environmental infrastructures of a city, both defining the inherent
capacity of local communities for promoting civic health (Kretzman and McKnight 1993, McKnight
1995) and identifying those core processes of engaged community which can inform the health of
residents. Among these emerging constructs are social trust, personal efficacy, and social capital
(Sampson et al. 1997). In addition, the conceptualization and definition of the developmental
assets were informed by other community research efforts focusing on the concepts of social norms,
civic engagement, indigenous leadership, and community capacity building (Benson 1997, Benson et
al. 1998).


2. The Developmental Asset Framework

The original configuration of 30 developmental assets (e.g., Benson 1990) was expanded to 40
developmental assets in 1996, based on analysis of data gathered on 254,000 6th–12th-grade
students, additional synthesis of child and adolescent research, and consultations with
researchers and practitioners (Benson 1997). The framework’s conceptual foundations are based on
empirical studies of child and adolescent development, as well as applied studies in prevention,
health promotion, and resiliency. The development of this conceptual foundation involved a
research synthesis which focused on integrating developmental experiences that are widely known to
inform three types of health outcomes among adolescents: (a) the prevention of high-risk behaviors
(e.g., substance use, violence, sexual intercourse, school dropout); (b) the enhancement of
thriving outcomes (e.g., school success, affirmation of diversity, prosocial behavior); and (c)
resiliency or the capacity to overcome adversity. The assets were framed initially around
adolescent development and they are assessed through a self-report survey (Leffert et al. 1998).
The assets were later extended downward conceptually to include young children (birth to age 10),
so that the framework covers more of the lifespan (Roehlkepartain and Leffert 2000).

The conceptualization of the asset framework identified developmental factors that were
particularly robust in predicting health outcomes and for which there was evidence that they could
be generalized across gender, race/ethnicity, and family income. In addition, the assets were
conceived to reflect core developmental processes which include the relationships, social
experiences, social environments, patterns of interaction, norms, and competencies over which a
community of people has considerable control. That is, the assets are more about the primary
processes of socialization than the equally important arenas of economy, services, and physical
infrastructure of a city (Benson et al. 1998).

Because the developmental asset framework was designed not only to inform theory and research but
also to have practical significance for the mobilization of communities, the 40 assets are placed
in categories that have conceptual integrity and can be described easily to the residents of a
community. As seen in Table 1, they are grouped into 20 external assets (i.e., environmental,
contextual, and relational features of socializing systems) and 20 internal assets (i.e., skills,
competencies, and commitments). The external assets include four categories: (a) support, (b)
empowerment, (c) boundaries and expectations, and (d) constructive use of time. The internal
assets are also placed in four categories: (a) commitment to learning, (b) positive values, (c)
social competencies, and (d) positive identity. The scientific foundations for the eight
categories and each of the 40 assets are described in more detail in Scales and Leffert (1999).


3. Measurement, Descriptive Data, and Prediction

Since 1996, numerous studies of 6th–12th-grade young people in public and private schools in the
United States have been conducted using the Search Institute Profiles of Student Life: Attitudes
and Behaviors, a self-report survey. This 156-item self-report survey measures the 40
developmental assets, developmental deficits (e.g., whether youth watch too much television or are
victims of violence), thriving indicators (e.g., school success, physical health behaviors), and
high-risk behaviors (e.g., alcohol and other substance use, antisocial behavior, school problems)
(Leffert et al. 1998). The most recent aggregate sample is made up of 99,462 6th–12th-grade youth
from public and alternative schools in 213 cities and towns in the United States who took the
survey during the 1996–7 academic year. This sample has served as a focal point for several
studies of the relation of assets to risk behaviors and thriving outcomes (Benson et al. 1998,
Leffert et al. 1998, Scales et al. 2000).


3.1 Examples of Descriptive Data and the Additive Nature of Developmental Assets

The self-report survey is primarily used as a means of communicating aggregate data on a
community’s youth. A report, developed for each city or school district that uses the survey,
often becomes a widely shared document that frames a community-wide discussion and serves as a
focal point for mobilizing around raising healthy youth (Benson et al. 1998). A


Table 1
Forty developmental assets

External

Support
1. Family support: Family life provides high levels of love and support
2. Positive family communication: Young person and her or his parent(s) communicate positively, and young person is willing to seek advice and counsel from parents
3. Other adult relationships: Young person receives support from three or more nonparent adults
4. Caring neighborhood: Young person experiences caring neighbors
5. Caring school climate: School provides a caring, encouraging environment
6. Parent involvement in schooling: Parent(s) are actively involved in helping young person succeed in school

Empowerment
7. Community values youth: Young person perceives that community adults value youth
8. Youth as resources: Young people are given useful roles in the community
9. Service to others: Young person serves in the community one hour or more per week
10. Safety: Young person feels safe at home, at school, and in the neighborhood

Boundaries and expectations
11. Family boundaries: Family has clear rules and consequences, and monitors the young person’s whereabouts
12. School boundaries: School provides clear rules and consequences
13. Neighborhood boundaries: Neighbors take responsibility for monitoring young people’s behavior
14. Adult role models: Parent(s) and other adults model positive, responsible behavior
15. Positive peer influence: Young person’s best friends model positive, responsible behavior
16. High expectations: Both parents and teachers encourage the young person to do well

Constructive use of time
17. Creative activities: Young person spends three or more hours per week in lessons or practice in music, theater, or other arts
18. Youth programs: Young person spends three hours or more per week in sports, clubs, or organizations at school and/or in community organizations
19. Religious community: Young person spends one or more hours per week in activities in a religious institution
20. Time at home: Young person is out with friends ‘with nothing special to do’ two or fewer nights per week

Internal

Commitment to learning
21. Achievement motivation: Young person is motivated to do well in school
22. School engagement: Young person is actively engaged in learning
23. Homework: Young person reports one or more hours of homework every school day
24. Bonding to school: Young person cares about his or her school
25. Reading for pleasure: Young person reads for pleasure three or more hours per week

Positive values
26. Caring: Young person places high value on helping other people
27. Equality and social justice: Young person places high value on promoting equality and reducing hunger and poverty
28. Integrity: Young person acts on convictions and stands up for her or his beliefs
29. Honesty: Young person tells the truth even when it is not easy
30. Responsibility: Young person accepts and takes personal responsibility
31. Restraint: Young person believes it is important not to be sexually active or to use alcohol or other drugs

Social competencies
32. Planning and decision making: Young person knows how to plan ahead and make choices
33. Interpersonal competence: Young person has empathy, sensitivity, and friendship skills
34. Cultural competence: Young person has knowledge of and comfort with people of different cultural/racial/ethnic backgrounds
35. Resistance skills: Young person can resist negative peer pressure and dangerous situations
36. Peaceful conflict resolution: Young person seeks to resolve conflict non-violently

Positive identity
37. Personal power: Young person feels he or she has control over ‘things that happen to me’
38. Self-esteem: Young person reports having high self-esteem
39. Sense of purpose: Young person reports that ‘my life has a purpose’
40. Positive view of personal future: Young person is optimistic about her or his personal future

Source: 1997. Search Institute. Reproduced with permission.
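
Several of the assets in Table 1 are defined by explicit thresholds (for example, "three or more hours per week"), and the article reports asset totals in four bands (0–10, 11–20, 21–30, 31–40). The following is a minimal sketch of how such has/does-not-have scoring and banding might be coded. The response field names and values are hypothetical and are not items from the actual survey; only the thresholds and the band edges come from the article.

```python
from typing import Dict

# Hypothetical weekly self-report answers for one young person.
# Field names and values are illustrative, not items from the actual survey.
responses: Dict[str, float] = {
    "service_hours": 2.0,   # hours per week serving in the community
    "creative_hours": 1.0,  # hours per week in music, theater, or other arts
    "reading_hours": 4.0,   # hours per week reading for pleasure
    "nights_out": 3.0,      # nights per week out "with nothing special to do"
}


def score_assets(r: Dict[str, float]) -> Dict[str, bool]:
    """Apply Table 1 thresholds to get a has / does-not-have flag per asset."""
    return {
        "9. Service to others": r["service_hours"] >= 1,      # one hour or more
        "17. Creative activities": r["creative_hours"] >= 3,  # three or more hours
        "20. Time at home": r["nights_out"] <= 2,             # two or fewer nights
        "25. Reading for pleasure": r["reading_hours"] >= 3,  # three or more hours
    }


def asset_band(count: int) -> str:
    """Map a total asset count (0 to 40) to one of the four reporting bands."""
    if not 0 <= count <= 40:
        raise ValueError("asset count must be between 0 and 40")
    if count <= 10:
        return "0-10"
    if count <= 20:
        return "11-20"
    if count <= 30:
        return "21-30"
    return "31-40"


flags = score_assets(responses)
partial_count = sum(flags.values())  # counts only the four assets scored here
```

A full profile would score all 40 assets before calling `asset_band`; the four shown here were chosen only because Table 1 states their thresholds numerically.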


Table 2
The relation of assets to patterns of high-risk behavior
Percent of youth with each high-risk behavior pattern, by number of assets reported

Category | Definition | 0–10 assets | 11–20 assets | 21–30 assets | 31–40 assets
Alcohol | Has used alcohol three or more times in the past month or got drunk once or more in the past two weeks | 53 | 30 | 11 | 3
Tobacco | Smokes one or more cigarettes every day or uses chewing tobacco frequently | 45 | 21 | 6 | 1
Illicit drugs | Used illicit drugs three or more times in the past year | 42 | 19 | 6 | 1
Sexual intercourse | Has had sexual intercourse three or more times in lifetime | 33 | 21 | 10 | 3
Depression/suicide | Is frequently depressed and/or has attempted suicide | 40 | 25 | 13 | 4
Antisocial behavior | Has been involved in three or more incidents of shoplifting, trouble with police, or vandalism in the past year | 52 | 23 | 7 | 1
Violence | Has engaged in three or more acts of fighting, hitting, injuring a person, carrying or using a weapon, or threatening physical harm in the past year | 61 | 35 | 16 | 6
School problems | Has skipped school two or more days in the past month and/or has below a C average | 43 | 19 | 7 | 2
Driving and alcohol | Has driven after drinking or ridden with a drinking driver three or more times in the past year | 42 | 24 | 10 | 4
Gambling | Has gambled three or more times in the past year | 34 | 23 | 13 | 6

Source: 1998. Applied Developmental Science. © Lawrence Erlbaum Associates, Inc., Hillsdale, NJ. Reproduced with permission.
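
As a rough illustration of how the cross-tabulation in Table 2 can be read, the sketch below combines three rows of the table with the share of the aggregate sample in each asset band that the article reports (20, 42, 30, and 8 percent). It assumes those shares apply to the same youth as Table 2, which the article implies but does not state outright. The function names are hypothetical, and the weighted averages are implied figures, not results reported by the source.

```python
# Share of the aggregate sample in each asset band, as reported in the text.
BAND_SHARE = {"0-10": 0.20, "11-20": 0.42, "21-30": 0.30, "31-40": 0.08}

# Percent with each high-risk pattern by band, copied from three rows of Table 2.
RISK_BY_BAND = {
    "Alcohol": (53, 30, 11, 3),
    "Tobacco": (45, 21, 6, 1),
    "Violence": (61, 35, 16, 6),
}


def implied_overall_percent(by_band):
    """Weight each band's percentage by that band's share of the sample."""
    return sum(share * pct for share, pct in zip(BAND_SHARE.values(), by_band))


def declines_with_assets(by_band):
    """True if the percentage falls at every step from one band to the next."""
    return all(a > b for a, b in zip(by_band, by_band[1:]))


for name, by_band in RISK_BY_BAND.items():
    print(name, round(implied_overall_percent(by_band), 1), declines_with_assets(by_band))
```

Under these assumptions the alcohol row works out to 0.20 × 53 + 0.42 × 30 + 0.30 × 11 + 0.08 × 3 = 26.74, so roughly 27 percent of the sample overall would show that pattern, against 53 percent in the lowest band and 3 percent in the highest.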

dichotomous form of reporting the assets, whereby each asset is simplified into a single
percentage of youth who have, or do not have, the asset, is utilized as an effective method for
communicating the asset profile to diverse subgroups within a community. This also allows for a
simple summation of the average number of youth assets in any given community. Based on this
aggregate sample, youth report having, on average, 18 of the 40 developmental assets. As part of a
standardized report, communities also receive a breakdown of the percentage of young people who
report different levels of developmental assets. In the aggregate sample, 20 percent have 0–10
assets, 42 percent have 11–20, 30 percent have 21–30, and 8 percent have 31–40 (Benson et al.
1998). When these categories are combined in cross-tabulation with risk behaviors or thriving
indicators, they become a significant source of information for community members. For example,
Table 2 shows the percentage of 6th–12th grade youth in the aggregate sample who report that they
engage in each of 10 risk behavior patterns by their level of developmental assets. The table also
includes a definition of each risk behavior pattern. It is important to note that for each of the
10 risk behavior patterns, the percentage of students reporting the risk behavior declines as the
level of assets rises. This same relationship between assets and risk behaviors has been observed
across grade, gender, racial or ethnic group, and across all communities studied.

The opposite pattern between level of assets and positive or thriving behaviors has also been
repeatedly demonstrated. For these assessments youth who report fewer assets are also less likely
to report each of the thriving indicators including school success and the affirmation of
diversity (Benson et al. 1998).


3.2 Grade and Gender Differences

Some variation has been observed across communities and in different subgroups of adolescents.
Overall,


Table 3
Assets by grade within gender among an aggregate sample of youth surveyed in the 1996–7 school year
Percent of youth reporting each asset. Columns: All; Male, grades 6–8; Male, grades 9–12; Female, grades 6–8; Female, grades 9–12

1. Family support 64 71 59 71 60
2. Positive family communication 26 30 20 35 23
3. Other adult relationships 41 37 40 44 43
4. Caring neighborhood 40 42 35 47 37
5. Caring school climate 25 26 19 34 23
6. Parent involvement in schooling 29 38 22 38 24
7. Community values youth 20 24 15 28 16
8. Youth as resources 25 29 21 31 21
9. Service to others 50 50 41 61 51
10. Safety 55 54 70 42 50
11. Family boundaries 43 43 38 48 44
12. School boundaries 46 55 35 61 41
13. Neighborhood boundaries 46 53 40 54 41
14. Adult role models 27 26 21 35 28
15. Positive peer influence 60 68 46 76 57
16. High expectations 41 50 35 50 35
17. Creative activities 19 15 13 27 21
18. Youth programs 59 59 58 59 58
19. Religious community 64 67 56 73 64
20. Time at home 50 53 45 56 48
21. Achievement motivation 63 59 54 71 71
22. School engagement 64 55 57 70 73
23. Homework 45 38 36 51 55
24. Bonding to school 51 48 45 59 54
25. Reading for pleasure 24 21 17 35 27
26. Caring 43 39 28 57 52
27. Equality and social justice 45 41 28 61 54
28. Integrity 64 53 58 65 75
29. Honesty 63 59 55 71 68
30. Responsibility 60 54 56 64 66
31. Restraint 42 53 25 67 36
32. Planning and decision making 29 24 25 32 33
33. Interpersonal competence 43 28 25 59 61
34. Cultural competence 35 31 24 46 42
35. Resistance skills 37 38 29 48 38
36. Peaceful conflict resolution 44 33 29 59 54
37. Personal power 45 40 48 41 49
38. Self-esteem 47 53 54 42 39
39. Sense of purpose 55 58 61 51 49
40. Positive view of personal future 70 70 70 70 71
N = 99,462.
Source: 1998. Applied Developmental Science. © Lawrence Erlbaum Associates, Inc., Hillsdale, NJ. Reproduced with permission.

females report experiencing more of the assets than males. Comparing males and females across the
asset levels, 45 percent of females report that they experience more than half of the 40 assets,
compared with 30 percent of males.

When comparing the effect of grade and gender differences in each of the individual assets, at
least small effects (about 0.20; Cohen 1988) are observed in about one-half of the assets,
suggesting somewhat pervasive, but small, differences in the contextual experiences of boys and
girls over the adolescent years (Leffert et al. 1998). In addition, a fairly consistent decline in
the reports of the assets for both males and females across this age period is found. That is,
young adolescents (i.e., 6th–8th graders) tend to report experiencing more of the assets than
older adolescents (i.e., 9th–12th graders) (see Table 3).
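
The article does not say which effect-size statistic lies behind the "about 0.20" benchmark. As one hedged illustration, Cohen (1988) defines an index h for the difference between two proportions, with 0.20 conventionally labeled small. The sketch below applies it to one row of Table 3; treating h as the relevant measure is an assumption made here, not something the source confirms.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between two arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


# Table 3, asset 26 (Caring), grades 6-8: 57 percent of females, 39 percent of males.
h_caring = cohens_h(0.57, 0.39)

# Table 3, asset 18 (Youth programs), grades 6-8: 59 percent for both groups.
h_programs = cohens_h(0.59, 0.59)
```

For the Caring row, h comes to roughly 0.36, above the 0.20 benchmark, while the Youth programs row gives exactly zero because the two percentages are equal.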


3.3 Prediction of Risk Behaviors and Thriving Indicators

In reports to participating communities, analyses similar to that presented in Table 2 are included to demonstrate the relation of developmental assets to both the risk behaviors and thriving indicators. In other studies using aggregate samples, regression analyses are used to assess the extent to which the developmental assets are useful in predicting either a reduction in risk behaviors (Leffert et al. 1998) or a promotion of thriving indicators (Scales et al. 2000). Those analyses have shown that demographic variables accounted for a range of 5–14 percent of the total variance of each of the models constructed to examine risk behaviors. In each analysis, a set of the developmental assets contributed a significant amount over and above the influence of the demographic variables, accounting for a total of 21–41 percent of the variance explained in the reduction of each of the individual risk behavior patterns and for 66 percent of the variance in a composite index of risk behaviors. Similarly, Scales and colleagues (Scales et al. 2000) examined the extent to which developmental assets predicted thriving behaviors and how it varied across different ethnic groups. The demographic variables accounted for a range of 1–8 percent of the total variance explained by each of the models and each model accounted for a total of 10–43 percent of the variance explained in the individual thriving indicators and from 47 to 54 percent of the variance explained in a thriving index.

4. Application

Developmental assets, then, have particular utility for predicting reduction in multiple forms of risk-taking and thriving behavior (e.g., Leffert et al. 1998, Scales et al. 2000). The descriptive portraits of assets from hundreds of American communities suggest that these developmental asset targets are normatively fragile. A set of comprehensive and interlocking strategies have been proposed to mobilize the inherent and natural asset-building capacity of community residents (Benson 1997, Benson et al. 1998) and community socializing systems, including families (Roehlkepartain and Leffert 2000), schools (Starkman et al. 1999), congregations (Roehlkepartain 1998), and youth-serving organizations (Nelson 1998). New conceptual and measurement efforts are underway to extend the asset framework to a series of developmental phases in the 0–20 age range.

Though the focus of this last section is on the influence of communities, it is noted here that the developmental assets construct can be applied in a wide variety of ways. For example, national youth-serving systems and their local affiliates utilize the framework for strategic planning and as a training tool for staff, boards, and volunteers. A growing number of foundations utilize the work for both frame-funding initiatives and to evaluate proposals. Professionals (e.g., social workers, counselors) utilize the framework to design interventions for individual children and adolescents.

4.1 Defining Asset-building Communities

Asset-building communities are geographies of place which maximize attentiveness to promoting developmental strengths for all children and adolescents (Benson 1997). The dynamics and processes by which communities mobilize their asset-building capacity are a relatively unexplored line of inquiry, both theoretically and empirically. An initial framework for understanding the asset-building capacity of communities provides a set of core principles (Benson 1997, Benson et al. 1998). Among these are the principles of developmental redundancy (the exposure to asset-building people and environments within multiple contexts), developmental depth (a focus on nurturing most or all assets in children and adolescents), and developmental breadth (extending, by purpose and design, the reach of asset-building energy to all children and adolescents).

In activating these core principles, five sources of asset-building potential are hypothesized to exist within all communities, each of which can be marshaled via a multiplicity of community mobilization strategies. These sources of potential asset-building influence include: (a) sustained relationships with adults, both within and beyond family; (b) peer group influence (when peers choose to activate their asset-building capacity); (c) socializing systems; (d) community-level social norms, ceremony, ritual, policy, and resource allocation; and (e) programs, including school- and community-based efforts to nurture and build skills and competencies.

In brief, asset-building communities are distinguished as relational and intergenerational places, with a critical mass of socializing institutions (e.g., families, schools, neighborhoods, youth organizations, religious communities) choosing to attend to the developmental needs of all children and adolescents. Developmental assets become a language of the common good, uniting sectors, citizens, and policy in the pursuit of shared targets for all children and adolescents. The commitment of a community and its people, institutions, and organizations is both long-term and inclusive.

Ultimately, rebuilding and strengthening the developmental infrastructure in a community are conceived less as a program implemented and managed by professionals and more as a mobilization of public will and capacity. A major target for this level of community engagement is the creation of a normative culture in which all residents are expected by virtue of


their membership in the community to promote the positive development of children and adolescents. Within the context of American society, this vision requires considerable transformation in prevailing resident and socialization systems, norms, and operating principles. As argued in numerous publications defining this conceptual model of asset-building community, American cities are typically marked by age segregation, civic disengagement, social mistrust, a loss of personal efficacy, and the lack of collaboration across systems.

4.2 Asset-building Communities: The National Movement

The marshaling of community capacity to consistently and deeply attend to the development of children and adolescents is conceived less as the implementation of a program and more the awakening of latent human and institutional potential to build developmental strengths. A series of practical tools targeted at community residents and civic leaders provides conceptual and strategic counsel for mobilizing asset-building capacity. The first asset-building initiative began in St. Louis Park, Minnesota in 1995. In the following five years, more than 500 other American communities began to craft community-wide initiatives. Organized more as a social movement than as the replication of a program, communities are encouraged to tailor their initiatives to local realities and capacities and in response to the data from local asset profiles. Because these initiatives are complex, multi-sector ‘experiments’ in changing local culture, and because they occur in a variety of rural, suburban, and urban settings, there is increasing investment in learning from these communities about innovations and effective practices in mobilizing residents and systems, with ‘feedback loops’ emerging to inform both the theory of community change and the development of practical resources. Several longitudinal studies will add additional insight to this evolving knowledge about the influence of community on human development.

See also: Adulthood: Developmental Tasks and Critical Life Events; Biopsychology and Health; Childhood Health; Childhood Sexual Abuse and Risk for Adult Psychopathology; Community Organization and the Life Course; Developmental Sciences, History of; Divorce and Children’s Social Development; Early Childhood: Socioemotional Risks; Environments for Education; Environments for Learning; Human Development and Health; Human Development, Bioecological Theory of; Human–Environment Relationships; Infancy and Childhood: Emotional Development; Lifespan Development, Theory of; Lifespan Theories of Cognitive Development; Mental Health Programs: Children and Adolescents; Poverty and Child Development; Psychobiology of Stress and Early Child Development; Social Competence: Childhood and Adolescence

Bibliography

Benson P L 1990 The Troubled Journey: A Portrait of 6th–12th Grade Youth. Search Institute, Minneapolis, MN
Benson P L 1997 All Kids Are Our Kids: What Communities Must Do To Raise Caring and Responsible Children and Adolescents. Jossey-Bass, San Francisco, CA
Benson P L 1998 Mobilizing communities to promote developmental assets: A promising strategy for the prevention of high-risk behaviors. Family Science Review 11: 220–38
Benson P L, Leffert N, Scales P C, Blyth D A 1998 Beyond the ‘village’ rhetoric: Creating healthy communities for children and adolescents. Applied Developmental Science 2: 138–59
Cohen J 1988 Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates, Inc., Hillsdale, NJ
Kretzman J P, McKnight J L 1993 Building Communities From the Inside Out: A Path Toward Finding and Mobilizing a Community’s Assets. Center for Urban Affairs and Policy Research, Evanston, IL
Leffert N, Benson P L, Scales P C, Sharma A R, Drake D R, Blyth D A 1998 Developmental assets: Measurement and prediction of risk behaviors among adolescents. Applied Developmental Science 2: 209–30
Masten A S, Best K M, Garmezy N 1990 Resilience and development: Contributions from the study of children who overcome adversity. Development and Psychopathology 2: 425–44
McKnight J 1995 The Careless Society: Community and its Counterfeits. Basic Books, New York
Nelson L I 1998 Helping Youth Thrive: How Youth Organizations Can—and Do—Build Developmental Assets. Search Institute, Minneapolis, MN
Pittman K J, Cahill M 1991 A New Vision: Promoting Youth Development. Center for Youth Development and Policy Research, Academy for Educational Development, Washington, DC
Resnick M D, Bearman P S, Blum R W, Bauman K E, Harris K M, Jones J, Beurhring T, Sieving R E, Shew M, Ireland M, Bearinger L H, Udry J R, Tabor J 1997 Protecting adolescents from harm: Findings from the National Longitudinal Study on Adolescent Health. Journal of the American Medical Association 278(10): 823–32
Roehlkepartain E 1998 Building Assets in Congregations: A Practical Guide for Helping Youth Grow Up Healthy. Search Institute, Minneapolis, MN
Roehlkepartain J L, Leffert N 2000 What Young Children Need to Succeed: Working Together to Build Assets From Birth to Age 11. Free Spirit Press, Minneapolis, MN
Rutter M 1985 Resilience in the face of adversity: Protective factors and resistance to psychiatric disorder. British Journal of Psychiatry 147: 598–611
Sampson R J, Raudenbush S W, Earls F C 1997 Neighborhoods and violent crime: A multilevel study of collective efficacy. Science 277: 918–24
Scales P C, Benson P L, Leffert N, Blyth D A 2000 Contribution of developmental assets to the prediction of thriving among adolescents. Applied Developmental Science 4: 27–46


Scales P C, Leffert N 1999 Developmental Assets: A Synthesis of the Scientific Research on Adolescent Development. Search Institute, Minneapolis, MN
Starkman N, Scales P C, Roberts C 1999 Great Places to Learn: How Asset-Building Schools Help Students Succeed. Search Institute, Minneapolis, MN
Werner E E, Smith R S 1992 Overcoming the Odds: High Risk Children from Birth to Adulthood. Cornell University Press, Ithaca, NY

P. L. Benson and N. Leffert

Childhood: Anthropological Aspects

The anthropological study of childhood first documents and accounts for the variety of childhoods found around the world; second, uses the comparative ethnographic record to test hypotheses about human development; and, third, studies the mechanisms in child, family, and community life for the acquisition, internal transformations, sharing, and intergenerational transmission of culture.

A thought experiment will illustrate the anthropological point of view about childhood. Imagine a newborn, healthy infant. What is the most important thing that you could do to influence the life of that infant? Most respond by mentioning dyadic interaction with the baby: hold and touch the infant a lot; provide good nutrition and health care; provide stimulation to achieve school success; love the baby; give it wealth and social capital, and so forth. Anthropologists believe that the most important influence in human development is the cultural setting within which the infant will grow up. It is how, why, by whom children are held, loved, fed, stimulated, punished, provided resources, and so forth, and how that varies so widely across human communities, that is the focus of inquiry. Shaping a whole person engaged in family and cultural community life is the ‘purpose’ of childhood development from an anthropological perspective. Childhood is a cultural project with goals, meanings, constant adaptation, and struggle, and anthropology provides the evidence for the startling and remarkable varieties of childhoods lived around the world. Biological, psychological, and cultural anthropologists collaborate in the study of childhood, since biology, mind, and culture are all required to understand childhood.

The study of childhood and the process of children acquiring culture ‘was almost entirely neglected by anthropologists until after 1925’ (Whiting 1968). Although much progress has been made, anthropology does not yet provide a single unified theory of why and how childhoods vary around the world, or of childhood acquisition of culture. Rather the field offers rich, multivariate hypotheses and data on childhood (Super and Harkness 1997).

1. The Stages of Childhood

Five stages of human growth and development are common to Homo sapiens: infancy, childhood, juvenility, adolescence, and adulthood (Bogin 1999). Margaret Mead described lap children (infants, aged 0–1), knee children (toddlers, 2–3), yard children (preschool, 4–5), and community children (juveniles in middle childhood, 6–12). Anthropologists analyze the cultural meaning of the very idea of ‘stages,’ since stages are used to account for children’s behavior (‘he’s crying but it is OK, because he’s still a toddler’), as well as to assure and define normal and appropriate development (‘she is eight, and so old enough to start helping run our household’). Human cultures weave wonderful variations, meanings, and stories around pan-human maturational stages of childhood. The Beng of Ivory Coast, for example, believe that young children are still partly in yet another ‘stage,’ a cultural world called wrugbe, where ancestors share life with prebirth children who are ambivalent about leaving that world. This helps explain for Beng why infants cry or are sickly: they want to return to wrugbe.

2. Conceptions of Childhood in Anthropology

There are a variety of perspectives on childhood in anthropology. In one view, children are socialized into a set of norms and customs that they learn and then perpetuate. In this view, children are small adults in the making, ready receptors of traditions, shaped by parents and community adults to insure continuity in cultural and moral education, competence for survival in the ecology of the community, respect for tradition, and appropriate behavior and respect for elders in demeanor and gender roles.

Second, children’s personalities and minds are understood as reflections of the cultural themes as well as the anxieties children grow up with (such as in the work in Bali of Bateson and Mead 1942). The focus is on the semiotics and communication of cultural meanings to children, on how these cultural patterns are absorbed and internalized, in turn reproducing the meanings as well as neurotic obsessions of their parents’ cultures.

Third, the psychocultural, or personality integration model (Whiting and Whiting 1975), begins with the climate, history, and ecology of a community, which shapes child-care practices, which in turn produce psychological effects on children, effects produced by direct social learning as well as by psychodynamic processes shaping personality and defenses in children. These children become adults who then project into myths, rituals, art, and other forms (including in turn their own practices as parents) the learned patterns as well as intrapsychic conflicts produced in childhood and shared by others in their community. Children and adults alike have universal needs of the self—


hunger for recognition, reward, and material and bodily satisfaction—to which cultures respond through the cultural careers made available to children in a community (Goldschmidt 1990). Culture inevitably thwarts these hungers, leading to intrapsychic and cultural conflicts. Melford Spiro used psychodynamic and sociocultural approaches to understand the ideological, political, and ecological reasons for and consequences of the care of children by designated community caretakers, or metapelets, in socialist-inspired agricultural collective groups in Israel (Spiro 1975).

Fourth, anthropologists study the ‘developmental niche’ of childhood: everyday physical/social settings, cultural customs of care, and the psychology of the caretakers as shaped by their cultural models of parenthood that direct behavior (the goals, meanings, and rationales for parenting and being a child) (Harkness and Super 1996). Parenting of children also is shaped by the organic hardware given by our common mammalian heritage, and by socioeconomic conditions in the community. Children experience culture as it is practiced within their family’s daily routine of cultural life. Cultural routines consist of activities children engage in (mealtimes, bedtimes, family visits, chores, going to church, school, play, etc.). Activities are the primary mechanisms bringing culture to and into the mind of the child. Activities consist of goals and values; tasks of an activity; the scripts for how to engage in that activity; the people present and participating in the activity; and the motives and feelings of those involved (Weisner 1996).

Finally, some anthropologists view childhood itself as a cultural construction shaped by forces within as well as outside a single cultural community. The very idea of what a child or parent is, in this view, is more the outcome of processes of power in an increasingly global political economy, in which children as well as parents, are ‘constructed’ or ‘positioned’ by these agents of power (Stephens 1995).

3. Some Cultural Influences on Children’s Development

3.1 Cultural Scale and Complexity

Children in more complex societies (with occupational specialization, an extensive market economy, a nucleated settlement pattern, centralized and hierarchical political and legal system, and a centralized religious priesthood) are more likely to seek help and assistance from others, to try and dominate or control others, and to be more egoistic. Children in less complex societies are more likely to show nurturance toward other children (to offer assistance and respond to their requests), be more responsible, make more responsible suggestions to others, and do more tasks required for family and community survival. Mothers who have heavy subsistence workloads are more likely to expect responsible work from children and use stricter discipline. Children living in extended, joint, or expanded households and family systems are more often involved in directive, aggressive interactions, while children living in smaller nuclear families are more often engaged in sociable and intimate interactions with parents and others, and fathers are more involved with children. Of course children and adults everywhere are nurturant, seek help, or are aggressive. These patterns only reflect the modal tendencies of communities, not a rigid uniformity within them (Whiting and Edwards 1988).

3.2 Gender Differences

Gender differences in children’s development are recognized and shaped by all cultures (Ember 1981). Of five kinds of interpersonal behavior in children aged from three to 11—nurturance, dependency, prosocial dominance, egoistic dominance, and sociability—girls on average were more likely than boys to be nurturing toward others, while boys were more likely to be egoistically dominant and aggressive than girls. Play styles and types vary by gender (girls are more likely to do work-play and to do so nearer their homes, for instance). Women and girls do most ‘mothering’ of children well into the juvenile period in most cultures, so girls experience care by their own sex, while boys do not, leading to differences in early gender identification, and psychosocial and self-development. Peer groups have a tendency to segregate by gender, and children prefer same-sex children to interact with. Cultures with more mixed-age and mixed-gender groups around children are likely to have less sex-segregated roles for children. Individual differences among boys and girls, even within communities where there are strong overall gender differences, are usually substantial.

Father roles are recognized in all societies. Fathers seldom are involved in direct care of infants and young children but fathers do have complementary nurturing and affiliative roles, and are more involved in economic, protective, and didactic child training. In a study of 80 preindustrial societies, fathers were more proximate and involved with young children in monogamous, nuclear-family, and nonpatrilocal situations, and wherever mothers make relatively large contributions to family subsistence. Father involvement is related to sociocultural evolution: foraging societies report more father participation in childcare, while horticultural, agrarian/peasant, and pastoral tend to have less. There is an upswing in contemporary societies in encouraging paternal care. The cultural beliefs about gender (how women’s as well as men’s roles are defined by parenting), as well as the ability of fathers to provide consistently for their children,


influence father involvements. Poverty and uncertain economic life, or migration and dislocation, can drastically change father as well as mother involvement in patterns of childcare.

3.3 Emotional Development

Emotional development in childhood is influenced by cultural expectations at each developmental stage about the kinds of demeanor expected. A child should show that he/she is a certain kind of cultural person with an appropriate self and identity. Cultural management of emotion relies on what Robert Levy (1973) called ‘redundant cultural control.’ Tahitians (Levy 1973), for example, as well as many Pacific Island, Asian and other cultures, expect children to be calm, gentle, and quiet in demeanor (except for an extended period of adolescence and youth called taure‘are’a in Tahiti, in which adventures, autonomy, rebellion, and aggressiveness are culturally expected and common). Redundant community management of ‘gentleness’ includes many beliefs and practices: children are somewhat distanced from their mothers and fathers after infancy and live with peers; socialization networks are diffuse, meaning that affect towards others is diffused; severe anger is ‘strongly discouraged’ while mild transient episodes are tolerated; threats are common while actual aggression towards children is not; accidents are reinterpreted as punishment by spirits for aggression and this is widely believed to be true; there can be magical retaliation for serious anger; and it is generally shameful to show lack of control. A culture complex of many interrelated beliefs and practices of this kind is a strong sign that some emotional pattern or competence in children is of adaptive and moral importance to a society.

3.4 Basic Trust and Attachment

Basic trust and attachment are fundamental in childhood in all cultures. Anthropological studies show that a wide range of family and parenting practices can produce close affiliation and trusting attachments in children. Successful attachment does not depend on only one kind of maternal care in nuclear families, nor a specific kind of infant and toddler behavioral style. Although the individual child is named and recognized everywhere, individualism and egoistic autonomy as goals are not at all universal; rather, sociocentric and interdependent self-development are common ideals (Shweder and Bourne 1991). Chisholm (1999) proposes an evolutionary developmental hypothesis regarding trust and attachment. Environments varied during the long course of evolution. Less threatening, more favorable material and social conditions led to greater investment in fewer children, and so encouraged closer attachments to one or a few caregivers. Unfavorable conditions encouraged what are called ‘insecure’ or avoidant/ambivalent infant and child attachments, since in threatening, insecure conditions that was the more adaptive, successful parental and child response likely to increase the chances of children reaching reproductive age. Most cultures provide multiple caretakers to children, not a single person, and care is ‘socially distributed.’ Indeed, living an entire childhood exclusively in one’s natal home may well be the exception around the world. Older siblings and cousins are widely used as caretakers. Extended families in village and agrarian-based societies have high levels of multiple care of children. In India and elsewhere in Southeast Asia, for instance, there is an intense, ‘relational,’ childhood experience with several ‘maternal’ figures, a pattern found widely across cultures (Weisner and Gallimore 1977).

3.5 Developmental Goals

Anthropology does not assume that competencies valued in Western communities (verbal skills, cognitive abilities, or signs of egocentric autonomy of the self, for instance), are necessarily meaningful child developmental outcomes elsewhere, although all cultures are concerned over some version of good communication, mental ability, and self- and personhood. LeVine et al. (1994) contrast pedagogical goals (cognitive and social stimulation to prepare children for literacy and schools, as well as for an individualistic and autonomous self- and personhood away from their natal home) and pediatric goals (concern for survival, health, and physical growth of infants, and subsequent responsible engagement in family subsistence and continuity), comparing families in Boston, USA and the Gusii of western Kenya. Many parents and cultures have mixed goals, and are ambivalent about the constantly changing requirements for childhood. Anthropology has a unique point of view regarding the goals for a good childhood: the production of cultural well-being in children. Well-being is more than physical health or the attainment of skills and competence, or of successful subsequent reproduction, important as these are. Well-being is the ability of a child to engage and participate in the activities deemed desirable by a cultural community, and the psychological experiences that go along with that participation.

4. The Acquisition of Culture

The roles and settings in which children acquire culture matter for when and how children learn. Children are apprentices to more experienced community members in doing important tasks, and this apprenticeship situation is a powerful learning experience for children. Play and work blend in childhood learning. Imaginative, fantasy, toy, physical, and motoric play (including organized sports with rules) varies according to whether adults encourage it, whether it is


considered ‘beneficial’ for children by adults because it enhances desired competencies or societies’ developmental goals (such as cognitive and school-like activities in many contemporary cultures), and whether children’s sheer inventiveness, creativity, and exuberance take over. The Kpelle of Liberia have children playing on ‘the mother-ground,’ or open public spaces where children can observe, lurk nearby, and imitate adults going about their activities in an agrarian village community. Formal schooling and the ‘outside’ world of employment and the nation-state contrast sharply with this mother-ground of childhood. Anthropological studies of schooling find striking differences in the culture of classrooms around the world, including different teaching practices and student expectations. Cultures vary in the ‘moral and cultural curriculum’ accompanying literacy and other training, classroom and peer norms, gender, and class circumstances mirrored in school practices, and daily routines of school (Tobin et al. 1989). Participation in ceremonies and rituals at times of baptism, birthdays, naming ceremonies, puberty, and marriage also are powerful influences on children’s acquisition of cultural knowledge. Such ceremonies crystallize cultural beliefs and practices; they intensify emotionally, politically, and socially salient key concerns that parents and communities have about childhood, and elevate the goals the community shares for children and parents (Turner 1967).

Multiple cultural and mental processes are involved in culture acquisition. However, the relative importance of different mental and cultural mechanisms for emotional, social, or cognitive learning is currently not well understood in anthropology. Evolved tendencies of the mind prepare children to understand the world in certain ways. For example, children in widely disparate cultural communities seem to share understandings about what living things are like and how they behave and think (Hirschfeld and Gelman 1994). Psychodynamic processes transform emotionally salient cultural information. Stories and narratives embed cultural knowledge, shape recall, and organize cultural knowledge into sequences with shared local meaning. Sociolinguistic studies of child language acquisition show wide variations in how and when parents talk to their children, and view language learning as embedded in interactional routines shaped by cultural practices, with children as active learners (Schieffelin and Ochs 1986). Cultural knowledge is available to and used by children in the form of cultural models and schemas for how to comprehend and act in the world. Neither children nor parents generally ‘know that they know’ most cultural knowledge, and it is not usually in conscious awareness, even as they act in accord with their culture’s beliefs. Cultural beliefs and practices have powerful ‘directive force’ in guiding child behavior and child socialization in part because of this shared, implicit, everyday understanding put into action (D’Andrade 1995).

5. Anthropological Methods and the Study of Children

Anthropological methods for the study of children include ethnography and participant observation (Weisner 1996). These methods fit with the anthropological concept of childhoods lived in cultural pathways in naturalistic settings. Systematic observational procedures, field guides for comparative studies, and special procedures for sampling children’s activities and time use enhance ethnography (Munroe and Munroe 1994). Anthropologists also use assessments standard in child development for comparing physical growth, and the cognitive and socioemotional life of children, often revising these to insure that culturally appropriate procedures and meaningful outcomes are being measured. Film and video records of childhood are invaluable for comparative studies of cultural activities, emotional expression, holding patterns, or gaze and attention.

6. Anthropology and the Study of Childhood in the Twenty-first Century

Anthropology has always been concerned with the experience and the cultural worlds of minorities, the poor and non-literate, and of those, including children, who are so often unable to give voice to and represent their own world. Life histories and autobiographical accounts have provided rich data, as in the classic Nisa: The Life and Words of a !Kung Woman (Shostak 1981). Scheper-Hughes (1992) describes infants and young children in deeply impoverished political–economic circumstances in urban northeastern Brazil, circumstances leading to high infant and child mortality, and anger and despair among parents (and anthropologists). Anthropological studies of African-American families and economically downwardly mobile families in the USA demonstrate how some (but not all) can rely on extended kin in their struggles with poverty. Anthropological studies of childhood disability and deviance find greater acceptance and social integration of children with physical and cognitive disabilities in many communities, as long as children are able to live as sufficiently cultural persons in their communities and are not violent or dangerous to others (Ingstad and Whyte 1995). Anthropologists are concerned with children at risk around the globe, including, for example, children under stress from academic examinations in Japan and Korea, immigrant children in Europe and elsewhere, street children, or children facing change in East Africa. Child sexual and physical abuse around the world is now a recognized concern for anthropologists. Cultural beliefs and practices regarding appropriate discipline and treatment of children clearly do vary widely, and Western notions of abuse are not universal. However,


repeated and unchecked physical aggression, or intrafamilial sexual relations between close kin and children are nowhere defined as normative and acceptable (Korbin 1981). Anthropologists are concerned with children’s rights, recognizing their vulnerable status and the lack of provision of basic protections for children (Cultural Survival Quarterly). World youth cultures are growing in importance due to the influence of the Internet and mass communications around the world. These are all topics for the anthropology of childhood in the twenty-first century. However, the comparative study of powerful local and regional cultural differences in parenting, childhood, and family life across populations around the world will continue to provide enduring scientific questions for anthropology.

See also: Adolescent Development, Theories of; Childhood Health; Children and the Law; Gender-related Development; Infancy and Childhood: Emotional Development; Life Course in History; Life Course: Sociological Aspects; Trust, Sociology of; Youth Culture, Anthropology of; Youth Culture, Sociology of

Bibliography

Bateson G, Mead M 1942 Balinese Character. Special Publications of the New York Academy of Sciences 2. New York Academy of Sciences, New York
Bogin B 1999 Patterns of Human Growth, 2nd edn. Cambridge University Press, New York
Chisholm J 1999 Death, Hope, and Sex. Steps to an Evolutionary Ecology of Mind and Morality. Cambridge University Press, Cambridge, UK
Cultural Survival Quarterly. World Report on the Rights of Indigenous Peoples and Ethnic Minorities. Cambridge, MA (http://www.cs.org)
D’Andrade R 1995 The Development of Cognitive Anthropology. Cambridge University Press, New York
Ember C M 1981 A cross-cultural perspective on sex differences. In: Munroe R H, Munroe R L, Whiting B B (eds.) Handbook of Cross-cultural Human Development. Garland, New York
Goldschmidt W 1990 The Human Career. Blackwell, Cambridge, MA, pp. 531–80
Harkness S, Super C M (eds.) 1996 Parents’ Cultural Belief Systems: Their Origins, Expressions, and Consequences. Guilford Press, New York
Hirschfeld L A, Gelman S A 1994 Mapping the Mind: Domain Specificity in Cognition and Culture. Cambridge University Press, Cambridge, UK
Ingstad B, Whyte S R 1995 Disability and Culture. University of California Press, Berkeley, CA
Korbin J E (ed.) 1981 Child Abuse and Neglect: Cross-cultural Perspectives. University of California Press, Berkeley, CA
LeVine R A 1988 Human parental care: Universal goals, cultural strategies, individual behavior. In: LeVine R A, Miller P M, West M M (eds.) Parental Behavior in Diverse Societies. New Directions for Child Development 40. Jossey-Bass, San Francisco, pp. 3–12
LeVine R A, Dixon S, LeVine S, Richman A, Leiderman P H, Keefer C H, Brazelton T B 1994 Child Care and Culture. Lessons from Africa. Cambridge University Press, Cambridge, UK
Levy R 1973 Tahitians. Mind and Experience in the Society Islands. University of Chicago Press, Chicago
Munroe R L, Munroe R H 1994 Behavior across cultures: Results from observational studies. In: Lonner W J, Malpass R (eds.) Psychology and Culture. Allyn and Bacon, Boston, MA, pp. 107–11
Scheper-Hughes N 1992 Death Without Weeping. The Violence of Everyday Life in Brazil. University of California Press, Berkeley, CA
Schieffelin B B, Ochs E (eds.) 1986 Language Socialization Across Cultures. Cambridge University Press, New York
Schlegel A, Barry H 1991 Adolescence. An Anthropological Inquiry. Free Press, New York
Shostak M 1981 Nisa: The Life and Words of a !Kung Woman. Harvard University Press, Cambridge, MA
Shweder R A, Bourne E J 1991 Does the concept of the person vary cross-culturally? In: Thinking Through Cultures: Expeditions in Cultural Psychology. Harvard University Press, Cambridge, MA, pp. 113–55
Spiro M E 1975 Children of the Kibbutz, 2nd edn. Harvard University Press, Cambridge, MA
Stephens S (ed.) 1995 Children and the Politics of Culture. Princeton University Press, Princeton, NJ
Super C M, Harkness S 1997 The cultural structuring of child development. In: Berry J, Dasen P R, Saraswathi T S (eds.) Handbook of Cross-cultural Psychology, 2nd edn. Allyn and Bacon, Boston, Vol. 2, pp. 3–39
Tobin J, Wu D, Davidson D 1989 Preschool in Three Cultures: Japan, China, and the United States. Yale University Press, New Haven, CT
Turner V W 1967 The Forest of Symbols. Cornell University Press, Ithaca, NY
Weisner T S 1996 Why ethnography should be the most important method in the study of human development. In: Jessor R, Colby A, Shweder R (eds.) Ethnography and Human Development. Context and Meaning in Social Inquiry. University of Chicago Press, Chicago, pp. 305–24
Weisner T S, Gallimore R 1977 My brother’s keeper: Child and sibling caretaking. Current Anthropology 18: 169–90
Whiting J 1968 Socialization: Anthropological aspects. International Encyclopedia of the Social Sciences 14: 545–51
Whiting B B, Edwards C P 1988 Children of Different Worlds: The Formation of Social Behavior. Harvard University Press, Cambridge, MA
Whiting J W M, Whiting B B 1975 Children of Six Cultures: A Psychocultural Analysis. Harvard University Press, Cambridge, MA

T. S. Weisner

Childhood Cancer: Psychological Aspects

Substantial progress in medical anticancer treatment (chemotherapy, radiotherapy, surgery, bone-marrow transplantation) has improved dramatically the long-term survival rates of children and adolescents with the diagnosis of cancer (malignant tumors, lymphomas, and leukemias). Some decades ago, the essential threat for the family was to face the death and loss of


the child. Since more than two-thirds of the young patients survive, cancer in children and adolescents is no longer regarded as inevitably fatal but as a chronic disease which can result in complete cure, death, or survival with lasting neurological, orthopedic, or neuropsychological impairments.

At the end of the twentieth century, there is a consensus about the necessity of an honest explanation of the life-threatening nature of the disease to the child. The statistical survival rates, however, do not indicate the prognosis in an individual case, turning the former threat of certain death into the threat of uncertainty about one's personal fate. ‘Damocles syndrome’ is the term Koocher and O’Malley (1981) coined to characterize the psychological situation of the family that has completed oncological therapy but cannot be definitely sure of having defeated the disease.

1. Psychological Adaptation to Childhood Cancer as a Balance of Stress and Resources

The occurrence of childhood cancer is not associated with any known predisposing social or psychological feature in the child or in the family. Its presence can be regarded as a randomly occurring, highly stressful life event for ‘normal’ families. As a consequence, psychological research and intervention is not concerned with the detection of suspected psychological causes of cancer but with an understanding of the stress impact imposed by cancer diagnosis on the child’s and family’s psychological functioning and well-being. The chances of the child or the family mastering or failing the turmoil of the cancer experience may be compared to a balance with one scale pan containing all the stressors resulting from the disease and the other scale pan containing all the resources, competencies, and sources of social support to handle the stress. In some vulnerable families, the strong stressors and demands clearly outweigh only weak resources, thus increasing the probability of emotional suffering and behavioral disturbance in the child and the risk of marital discord or parental depression. Other families are characterized by resilience, i.e., the available resources are strong enough to outweigh the threatening disease impact, thus giving rise to a good adjustment and a personal sense of mastery. In fact, the large majority of patients succeed in mastering the plethora of disease- and treatment-related stressors and challenges (Noll et al. 1999).

2. Enhancement of Effective Coping with Childhood Cancer by Family Counseling

Understanding adaptation to childhood cancer as a psychological balance between burden and resources may serve as a basis for effective psychological counseling strategies. Therefore, the therapeutic task on the way to enhancing adaptation is twofold: first, to decrease stress impact, and second, to increase resource availability. In order to develop concrete counseling strategies for intervention, a differentiated knowledge is required of the most common and threatening sources of stress, on the one hand, and of the most powerful resources that can be mobilized, on the other. Selected typical stressors for the family of the child with cancer are presented in Table 1. They cover practical, emotional, social, and existential domains. As a counterpart, Table 2 shows important resources that can buffer and diminish cancer stress impact, improve adaptation, stabilize quality of life, and protect individual family members and the whole family structure from decompensation. Research has highlighted the central role of family communication behavior. For instance, the intensity of interparental exchange about disease issues is closely connected to improved parent–child contact and a high level of disease-related information in the child and the siblings. A fair and respectful sharing of the new burdens and tasks is associated with high family coherence and stability (Noeker et al. 1990).

Table 1
Typical stressors, demands, and challenges frequently occurring during the course of childhood cancer
Organization of treatment (meetings with physicians, insurers, etc.)
Financial expenditures
Gathering information on cancer and its treatment
Repeated hospitalizations, including the choice between staying at home with the siblings and being in the hospital with the ill child
Treatment-related distress due to painful and frightening medical procedures (e.g., bone marrow aspirations) and treatment side-effects (e.g., nausea and vomiting due to chemotherapy)
Reduced time for recreation and holiday
Role conflict (e.g., father between ongoing job demands and family need for support)
Re-examination of future perspectives (job perspectives of the mother; schooling of the child)
Sharing responsibility for treatment decisions (e.g., amputation in the case of bone sarcoma)
Subconscious violation, disappointment, frustration of the hope for a normal and healthy child
Religious doubts
Uncertainty of final outcome; fear of relapse, fear of losing the child
Source: Chesler and Barbarin (1987), Noeker and Petermann (1990), Noeker et al. (1990)

Table 2
Protective family resources that enhance an effective adaptation to childhood cancer
Communication competencies
Attributing positive meanings to the situation
Commitment to the family as a unit
Engaging in active, flexible coping efforts
Balancing illness demands with other family needs
Maintaining social integration
Developing collaborative relationships with professionals
Ability to ask for and accept support
Clear religious or philosophical belief systems
Parenting style demonstrating consistent rules and expectations
Source: Koocher and O’Malley 1981, Kupst et al. 1995, Noeker and Petermann 1990, Patterson 1991

3. Psychological Counseling Strategies and Interventions Across Different Phases of Childhood Cancer Treatment

Based on the knowledge of typical stress and resource factors in families with childhood cancer, psychological counseling programs have been developed (Chesler and Barbarin 1987, Noeker et al. 1990). The counseling interventions are related closely to the specific stressors that characterize certain phases of the treatment course. Counseling sessions may take place at the clinic or at home. Basically, all family members should join them; however, sessions with family subsystems may also be indicated, e.g., when mother and father want to reach an agreement on consistent parenting behavior. In the following, selected topics and strategies of a psychological counseling program are delineated according to the major phases of the anticancer treatment process.

3.1 Supporting the Initial Coping with Cancer Diagnosis

After being told the diagnosis of childhood cancer, most parents show strong affective reactions such as shock, sadness, anger, and despair. The primary task of the counselor here is to show nonjudgmental acceptance of these emotions and to verbalize and clarify them. Idiosyncratic parental lay theories on why exactly their own child has developed cancer may interfere with the medical treatment concept. When exploring these subjective disease and treatment concepts, special attention has to be paid to frequent parental conceptions of feeling personally responsible and guilty for the manifestation of cancer (e.g., via genetics, bad food, psychological stress). Psychological clarification and integration of the strong emotional reactions, as well as a stepwise restructuring of inappropriate disease concepts, improve the parents’ cognitive capacity for understanding and accepting the oncological treatment concept.

The next major counseling task is to initiate an active, mutually respectful, and supportive coping attitude and behavior within the family unit, including:
(a) clarification and expression of what kind of support every person in the family needs most from other family members for stabilizing outer and inner equilibrium;
(b) encouragement of open communication and expression of personal feelings, concerns, sorrows, and anxieties among the family members; and
(c) creation of a shared commitment to regard the cancer disease as a common challenge for the entire family.

3.2 Phase of Intensive Oncological Treatment

After the family has overcome the initial shock of cancer diagnosis, psychological counseling should assess primary vulnerability and resource factors in the child or in the family. Factors of vulnerability that were already present before diagnosis (e.g., behavioral or developmental disorders in the child, intense sibling rivalry, marital discord, inconsistent parenting styles, socioeconomic disadvantages) may interfere with a competent adaptation to demands of disease and treatment. Their assessment improves the chances for early and preventive intervention. At least as important as the assessment of risks and deficits, however, is the exploration of strengths and competencies. If a counselor turns the family’s attention to positive experiences of crises in the past that were resolved successfully, problem-solving strategies that have proven specifically effective in this individual family may be reactivated. In addition, psychological counseling in this phase includes:
(a) Encouraging active participation in the management of the illness;
(b) Initiating clear and practical responsibilities for the day-to-day management of additional chores and tasks; making sure that agreements are perceived as a fair sharing of the burden by every family member;
(c) Parenting issues (e.g., ways to show affection without spoiling and pity);
(d) Mobilization and acceptance of concrete social support from neighborhood or classmates; and
(e) Facilitating communication between family and healthcare team.

3.3 Positive Course: Long-term Remission of Disease and Process of Reintegration

After termination of intensive oncological therapy, anxieties stay alive in patient and family concerning a return of the disease which would require a completely


new start of therapy with, however, decreased chances for cure. Families suffer from intrusive thoughts and emotions concerning cancer relapse. Cognitive-behavioral intervention strategies can support the patient and family members to observe these recurrent thoughts consciously, to accept them as a natural reaction to their abnormal situation, and at the same time, not to become too involved every time anxieties arise. In addition, counselor and family may cautiously anticipate and go through possible responses and options for the case of an actually occurring relapse.

Termination of therapy also requires encouragement of the child to give up their sick role and corresponding privileges and to promote the social reintegration in previous roles, groups, and obligations.

3.4 Negative Disease Course: Recurrent Relapses, End-stage of Disease, Death, and Dying

The diagnosis of cancer recurrence requires a similar accepting and verbalizing counseling behavior concerning the intense emotional reactions as in the former diagnosis shock. This process may include a clarification of ambivalent tendencies towards continuation or discontinuation of oncological treatment since the parents wish to maintain any chance for cure, on the one hand, and do not want to make the child suffer from a stressful but senseless therapy, on the other. With the worsening of the treatment course, the threat of loss of the child is gradually intensified, leading to the following counseling topics:
(a) Preparing parents for communication with the child on death and dying. Finding individually appropriate images integrating truth and hope;
(b) Searching for fantasies on a good way to spend the remaining time together; development of other important goals besides survival such as freedom from pain; and
(c) Enhancing imagery on a suitable way of parting in dignity.

If the child actually dies, it is important to say that the therapy failed and not the child or the family. The family’s engagement shown for the child should be acknowledged. Individually suitable mourning rituals may be conceived that combine keeping related to the deceased child and, at the same time, letting them go. Family members may be encouraged to take their time for mourning and grief according to their very personal needs, not according to expectations of their social network (see Bereavement).

In summary, many counseling interventions focus on the enhancement of open and clear family communication. The mobilization of this key resource for effective coping with the burden of childhood cancer may refer to the intrafamilial exchange of personal needs and concerns, the practical organization of day-to-day routines, or the subtle communication between parents and child about their chances of survival. Effective extrafamilial communication may refer to establishing cooperation and agreements with the staff or to mobilizing and accepting support from the extended social network. If such competencies are not only applied but even refined during the treatment process, the ‘cancer experience’ may even stimulate processes of individual maturity and family coherence.

4. From a ‘Protective’ to an Open Approach

There follow some guidelines on how to inform children with cancer about their diagnosis. Some decades ago, a so-called ‘protective’ approach was preferred which consisted of not telling the child the life-threatening nature of the disease. In the meantime, not only changes in ethical considerations but also the progress in medical treatment options have led to a broad consensus that there is no responsible alternative to telling the truth even to preschool children. Today, medical progress allows the combination of the bad news of cancer diagnosis with the good news of reasonable hope for cure. Therefore, telling children the truth includes two realistic key messages: first, that death has to be expected if no treatment is performed, and second, that even treatment does not necessarily guarantee survival. Information given to the child should be appropriate to their cognitive and emotional age and maturity. In small children, showing concrete treatment procedures and instruments and offering plausible explanations for sensory experiences like vomiting during chemotherapy is more effective than abstract explanations about malignant cells. An approach that avoids telling the truth implies serious medical and psychological ‘side-effects’ to be considered:
(a) A child will not accept the enormously distressing treatment procedures without the awareness of the serious consequences in case of therapy refusal. Thus, true information is necessary for compliance with treatment, and compliance with treatment is necessary for medical treatment success.
(b) Children who lack plausible information on their condition tend to develop other explanations for the occurrence of their disease (Eiser 1993). A frequent one is the conception that the illness represents a punishment for their own misbehavior.
(c) Trust in the parents is one of the most important psychological resources for the child to protect them from feelings of loneliness and depression. A child uncovering the parents’ well-meant lies will not only have to struggle with the adversities of disease and treatment but, in addition, with social alienation.

See also: Biopsychology and Health; Cancer: Psychosocial Aspects; Cancer Screening; Childhood Health; Chronic Illness, Psychosocial Coping with; Illness: Dyadic and Collective Coping; Pain, Health Psychology of; Well-being and Health: Proactive Coping


Bibliography

Chesler M A, Barbarin O 1987 Childhood Cancer and the Family. Meeting the Challenge of Stress and Support. Brunner/Mazel, New York
Eiser C 1993 Growing up with a Chronic Disease: The Impact on Children and their Families. Kingsley, London
Koocher G P, O’Malley J E (eds.) 1981 The Damocles Syndrome: Psychosocial Consequences of Surviving Childhood Cancer. McGraw-Hill, New York
Kupst M J, Natta M B, Richardson C C, Schulman J L, Lavigne J V, Das L 1995 Family coping with pediatric leukemia: 10 years after treatment. Journal of Pediatric Psychology 20: 601–17
Noeker M, Petermann F 1990 Treatment-related anxieties in children and adolescents with cancer. Anxiety Research 3: 101–11
Noeker M, Petermann F, Bode U 1990 Family counselling in childhood cancer: Conceptualization and empirical results. In: Schmidt L R, Schwenkmezger P, Weinman J, Maes S (eds.) Theoretical and Applied Aspects of Health Psychology, Harwood Academic Publishers, Chur, pp. 241–53
Noll R B, Gartstein M A, Vanatta K, Correl J, Bukowski W M, Davies W H 1999 Social, emotional, and behavioral functioning of children with cancer. Pediatrics 103: 71–8
Patterson J M 1991 Family resilience to the challenge of a child’s disability. Pediatric Annals 20: 491–99

F. Petermann and M. Noeker

Childhood Depression

Once thought to be rare or virtually nonexistent, depression in young people has been reliably documented. In this article, the meaning of depression, its clinical manifestation, and the recent history of studies in depression are discussed. Key issues in the study of depression include prevalence and course of depressive disorders in young people, gender and age differences, and comorbidity with other disorders. Topics that are the focus of current research include those familial, cognitive, and biological factors that may contribute to or maintain depression. Effective treatments for childhood or adolescent depression have begun to be identified, but many questions about optimal single or combined treatments for the disorder remain to be answered.

1. The Meaning of the Term ‘Depression’ in Young People

The term ‘depression’ in children or adolescents can refer to three different, increasingly restrictive definitions. At the least restrictive level, ‘depression’ refers to a negative or low mood, such as sadness. In medical terms, if the low mood persists beyond an expectable duration of time, this is a single symptom. It may occur in isolation or in conjunction with other symptoms. At a more restrictive level, ‘depression’ refers to a particular set of symptoms that frequently occur together. Such a set of symptoms is designated a syndrome. For instance, children with low mood often simultaneously experience boredom, low energy, and social withdrawal, among other possible symptoms. Such a syndrome can be assessed at one point in time by the use of symptom review checklists. At the most restrictive level, ‘depression’ refers to a psychiatric disorder, characterized by a significant and persistent change in the child’s functioning. Depressive disorders have, in addition to a number of defining symptoms, a certain duration and course, a level of severity that causes considerable distress, and an impact on the daily functioning of the young person. Assessment of depressive disorders requires a comprehensive clinical interview, possibly accompanied by symptom checklists or rating scales.

Depression as a single symptom will not be a focus of this article. Instead the focus will be primarily on depression as a disorder, and secondarily on depression as a syndrome, since syndromal depression has been the subject of several important longitudinal studies among youths.

Depression is one of the psychiatric disorders in which disturbed mood is the core defining characteristic. Diagnoses of depression (unipolar depression) in the American Psychiatric Association (1994) nomenclature include Major Depressive Disorder and Dysthymic Disorder, both of which refer to conditions in which mood is dysphoric or low. In depression, children may report feeling sad, unhappy, bored or uninterested in usual activities, angry, or irritable, or may appear sad or tearful. By contrast, bipolar mood disorders, or manic-depressive disorders, are conditions in which mood fluctuates episodically between such dysphoric states and unusually elevated, irritable, or expansive mood states.

2. Recognition and Recent History

Recognition that depression occurs with considerable frequency in children and adolescents has been a relatively recent development. In the 1950s and 1960s, when psychoanalytic theory provided the primary conceptual model for psychiatric diagnosis and psychological assessment, clinicians rarely diagnosed depression in young people, partly because the development of superego and ego ideal functions, considered necessary to generate and maintain depression, was incomplete in youngsters. Child clinicians were also faced with the reality that their most common referral problems involved behavioral disorders or school performance problems, rather than mood problems. By the 1970s, there was recognition that depression might be ‘underlying’ such problems, but it was not until the emphasis on phenomenological or symptom-focused diagnosis that depression began to be assessed systematically in young people. Lewinsohn


et al. (1993) have proposed that the study of affective disorders in children and adolescents really began in the 1970s when several sets of investigators, including Carlson, Cytryn, Kovacs, Poznanski, Puig-Antich, Rutter, and their colleagues demonstrated that such disorders do occur and can be reliably assessed in young people. This development corresponded to the shift in psychiatric assessment toward systematic review of presenting symptoms, and away from nondirective, inferential assessment based on play observations or patient narrative reporting.

Since the advent of symptom-driven diagnosis in US psychiatry in 1980, the same diagnostic criteria used to diagnose depressive disorders in adults have been applied to children and adolescents. At present, the diagnosis of Major Depressive Disorder (MDD) requires at least one episode in which the child has had five or more of the following symptoms, including one of the first two, for a minimum of two weeks: (a) depressed or irritable mood; (b) markedly diminished interest or pleasure in activities; (c) weight or appetite loss or gain; (d) insomnia or hypersomnia; (e) psychomotor agitation or retardation; (f) fatigue or loss of energy; (g) feelings of worthlessness or excessive guilt; (h) decreased ability to think, concentrate, or make decisions; (i) recurrent thoughts of death or suicide or a suicide attempt or plan. The diagnosis of Dysthymic Disorder (DD) is given if depressed or irritable mood is present most days for a year or more, and if mood disturbance is accompanied by two or more of six key symptoms: poor appetite or excessive eating, insomnia or hypersomnia, low energy or fatigue, low self-esteem, poor concentration or difficulty making decisions, and feelings of hopelessness. MDD and DD are not mutually exclusive, and some children or adolescents present with a long-standing DD, upon which an episode of MDD has been superimposed: a condition referred to as ‘double depression.’

Parallel to the application of adult criteria to childhood depression, there have been numerous studies in the psychopathology of depression, and in its treatment, which have applied concepts and research models from the adult field and extended them to children. To give just two examples, both cognitive factors known to characterize adult depression (e.g., distortions in thinking) and cognitive therapy, which is known to be effective for adult depression, have been studied in young people. Although the use of adult diagnostic criteria for child and adolescent depression is now conventional in research and clinical practice, there continues to be controversy regarding the developmental sensitivity and adequacy of this approach. In particular, the school of thought known as developmental psychopathology (Cichetti et al. 1994) is characterized by an emphasis on understanding the multiple contributions of developmental sciences toward understanding both normal development and disorders such as depression. Merely applying the same criteria and research concepts from adults to children does not adequately account for the interactive contributions of cognitive development, family processes, school environment, and biological development to the development of mood disorders. Developmental psychopathologists seek to go beyond merely establishing that depressed children differ from nondepressed children on a set of symptoms and associated features, and to determine the conditions and processes that contribute to these differences.

3. Prevalence and Gender Differences

Prevalence of depression in children and adolescents varies across studies, in part depending on whether a syndrome (elevated scores on a continuous scale) or a diagnosed disorder (based on parent and/or child interviews) serves as the measure of depression. Fleming and Offord’s (1990) review of studies showed that MDD in preadolescent children occurred in as low as 0.4 percent and as high as 2.5 percent of the samples. Upper estimates for adolescents were higher, with a range from 0.4 percent to 6.4 percent. Prevalence of DD in children ranged from 0.6 percent to 1.7 percent, and in adolescents from 1.6 percent to 8.0 percent. The subsequent Oregon Adolescent Depression Project (Lewinsohn et al. 1993) found a point prevalence of 2.57 percent for MDD and 0.53 percent for DD, and lifetime prevalence of 18.48 percent for MDD and 3.22 percent for DD in US high school students (mean age = 16.6 years).

Depression is associated with age and with gender (Birmaher et al. 1996a). Rates are higher in adolescents than in children. Phenomenology also differs to some extent by age, with adolescents with MDD more likely than children to have anhedonia, hypersomnia, weight change, or lethal suicide attempts. Among depressed children there is equal gender representation, but in adolescents the ratio is about two females to one male, similar to the pattern among adults. The reasons for gender differences in adolescent depression are the focus of considerable current research. Sociocultural pressures on girls, biological changes associated with puberty, and sex differences in cognitive coping mechanisms have been proposed as possible explanations.

4. Course of Depressive Disorders in Childhood

Kovacs and her colleagues were the first to study the course of depressive disorders in prepubertal children. Their sample consisted of children who had been referred for clinical services, and was not an epidemiological sample. The average (mean) duration of the index episode of MDD in their sample was 32 weeks and half had recovered by nine months (median duration). For DD, however, the median duration of
the episode was four years. Both MDD and DD children demonstrated a high likelihood of having a second depressive episode within a nine-year follow-up period, with the risk higher among DD than among MDD children (Kovacs 1996). By contrast, children diagnosed with an Adjustment Disorder with depressed mood (one or a few depressive symptoms in reaction to a stressor) were not at risk of developing MDD during follow up.

Data from the Oregon Adolescent Depression Project (Lewinsohn et al. 1993), with an epidemiological sample, indicated that there is an extraordinarily variable duration of MDD in teenagers. Mean episode duration was 26 weeks, but median was eight weeks, with a range of 2 to 520 weeks. Seventy-five percent of the adolescents recovered by 24 weeks. Overall, about 33 percent of the adolescents who recovered from their initial episode of MDD had a second episode within four years. Adolescents with onset of MDD before age 15.5 years had longer episodes and shorter times between recovery and relapse than did adolescents with later onset.

Using syndromal measures of depression assessed on rating scales or clinician-completed symptom checklists, studies both in the USA and in the UK have demonstrated that depression during the developmental period confers an increased risk of depression during adulthood. Harrington and his colleagues (Harrington et al. 1990) have shown that depressed young people followed up an average of 18 years into adulthood have nearly a fourfold increased risk of adult depression when compared to a control group matched for nondepressive childhood symptoms.

5. Comorbidity and the Risk of Developing Bipolar Disorder

Studies from New Zealand, Puerto Rico, and the continental USA reviewed by Angold and Costello (1993) have shown that depression (either MDD or DD) in young people is very often accompanied by other disorders. The most common comorbid conditions in these studies were oppositional or conduct disorders and the various anxiety disorders. Rates of conduct or oppositional disorders in depressed children or adolescents ranged widely across studies, from 20 percent to 80 percent. Most studies showed rates of anxiety disorders between 30 percent and 50 percent, but some showed rates exceeding 70 percent. In the Oregon project, Lewinsohn and colleagues (1993) found that the most frequent comorbid diagnoses for depressed adolescents were anxiety disorder (21 percent) and substance use disorder (20 percent), with 12.4 percent of depressed teenagers having a disruptive behavior disorder.

Birmaher and colleagues (Birmaher et al. 1996a), in their review of the comorbidity literature, found that, except for substance use disorders, most of the other comorbid conditions developed before the MDD. However, conduct disorder was sometimes found to develop after MDD and to persist after resolution of the depression. Comorbid MDD and DD, in particular, have been found to predict longer depressive episodes, more suicidality, and worse social adjustment. The presence of comorbid anxiety disorders appears to raise the risk of suicidality and of substance abuse, and to be associated with poorer response to psychotherapy. There has been particular interest in comorbid depression and disruptive behavior disorders, perhaps because these represent a combination of two distinct types of disorders, internalizing and externalizing, that would not ordinarily be expected to co-occur. Birmaher and colleagues’ (1996a) review indicates that these young people are at increased risk for suicide attempts and adult criminality, and have poorer response to acute treatment, but also have more positive responses to placebo treatment and have fewer depressive recurrences.

Bipolar disorder includes distinct periods of depression and of mania. Because bipolar disorder is well recognized as distinct from unipolar depression, with a different course and different treatment requirements, it is important to assess the risk of developing bipolar disorder in depressed young people. Follow-up studies of clinically referred youths have varied in duration of follow-up period and in definition of bipolar disorder. They suggest that about 20 percent of adolescents with MDD go on to develop bipolar disorder, but the range of estimates to date varies widely. Factors that predict bipolar outcome include psychotic symptoms during the depressive episode, family history of bipolar disorder, and hypomanic reactions to antidepressant medication. There is conflicting evidence regarding the predictive utility of acute vs. prolonged onset of depression in predicting bipolar outcome. Both acute onset of severe depression in hospitalized adolescents and early onset of DD in (nonhospitalized) children have been associated with later bipolar outcome, suggesting there may be more than one path to a bipolar outcome.

6. Current Research Emphases: Psychosocial Correlates, Biology, and Treatment

Psychosocial correlates, biological correlates, and treatment efficacy and effectiveness are three areas of current research emphasis in childhood and adolescent depression. Correlates include an array of factors that are associated with and may make a causal contribution to depression or maintaining a depressive episode. Some psychosocial correlates, such as family factors and cognitive factors, can become targets of treatment in psychosocial intervention.

Compared to controls, the families of depressed young people are characterized by higher levels of parent–child and/or marital conflict, poor parent–
child communication, and more distant, less affectionate parent–child relationships (Birmaher et al. 1996a). The challenge for researchers is to determine whether these factors are specific to depression, and if so, whether they are causal.

A number of cognitive factors have been associated with depression in youths. These include cognitive distortions that emphasize negative interpretations of events, a tendency to attribute negative events to enduring and internal causes but positive events to transitory and external causes, low self-esteem, and low estimates of personal control and competence. Although such factors are present while youngsters are depressed, it is not clear whether they represent pre-existing risk factors or state-dependent correlates of the depressive episode. An important research question is to delineate the processes through which children acquire such cognitive characteristics.

Biological correlates of depression include family-genetic factors and markers of a depressed state. Children of depressed parents are more likely to develop depression than are children of nondepressed parents. Studies in adults show a significant genetic component to the transmission of mood disorders, and family interaction studies suggest that depressed parents have deficits in parenting behaviors that raise the risk of poor adaptation in their children. Both genetics and experience, therefore, are likely involved in cross-generational transmission of depression.

A number of potential biological markers of MDD have been investigated, in the desire to clarify the biological basis of the disorder. Among these are secretion of growth hormone after pharmacological challenges, abnormalities in functioning of the hypothalamic–pituitary–adrenal axis, and abnormal sleep EEG patterns. Results to date do not suggest any single marker sufficiently sensitive to or specific to childhood or adolescent MDD that can be used for diagnostic purposes.

Effective treatment is a major current focus of research. Cognitive behavior therapy (CBT) involves working with the young person to understand and to modify thoughts and behaviors that are likely contributing to depression (see Cognitive Therapy). Interpersonal psychotherapy (IPT) involves working with the young person to understand the impact of relationship or role conflicts on depression and to modify interpersonal patterns (see Interpersonal Psychotherapy). There is considerable evidence to support the efficacy of CBT both for childhood and for adolescent depression, when compared to control conditions such as a waiting list (Birmaher et al. 1996b, Reinecke et al. 1998). Most of the child study subjects have been children with a depressive syndrome, and some of the interventions have been school-based. More of the adolescent studies have involved teenagers with a depressive disorder and have been clinic-based. There have been very few studies to date comparing CBT to other active treatments. IPT has also proven more effective than clinical monitoring for depressed adolescents, both in terms of symptom reduction and in terms of improved social functioning (Mufson et al. 1999).

Studies of older tricyclic antidepressant medications did not demonstrate efficacy in children or adolescents. A recent study by Emslie and his colleagues (Emslie et al. 1997) using fluoxetine, a selective serotonin reuptake inhibitor, did show a significantly better outcome for those on medication than for those receiving a placebo. As is true in the psychosocial treatment arena, research on medication treatment has been limited to acute or short-term treatment.

Numerous critical questions remain to be investigated. These include the relative efficacy of various psychosocial, medication, and combined treatments, the long-term efficacy of such treatments, optimal duration of treatment, and effectiveness of university- or laboratory-based interventions in the broader clinical world. It is not yet clear whether individuals who fail to respond to a first treatment will do better with an alternative. Finally, a major challenge to treatment research is to identify strategies and methods to address the comorbid conditions that so often accompany depressive disorders in young people.

See also: Anxiety Disorder in Children; Behavior Therapy with Children; Child and Adolescent Psychiatry, Principles of; Childhood Health; Depression; Depression, Clinical Psychology of; Depression, Hopelessness, Optimism, and Health; Early Childhood: Socioemotional Risks; Infancy and Childhood: Emotional Development

Bibliography

American Psychiatric Association (APA) 1994 Diagnostic and Statistical Manual of Mental Disorders, 4th edn. APA, Washington, DC
Angold A, Costello E 1993 Depressive comorbidity in children and adolescents. American Journal of Psychiatry 150: 1779–91
Birmaher B, Ryan N, Williamson D, Brent D, Kaufman J 1996b Childhood and adolescent depression (II). Journal of the American Academy of Child and Adolescent Psychiatry 35: 1575–83
Birmaher B, Ryan N, Williamson D, Brent D, Kaufman J, Dahl R, Perel J, Nelson B 1996a Childhood and adolescent depression (I). Journal of the American Academy of Child and Adolescent Psychiatry 35: 1427–39
Cicchetti D, Rogosch F, Toth S 1994 A developmental psychopathology perspective on depression in children and adolescents. In: Reynolds W, Johnston H (eds.) Handbook of Depression in Children and Adolescents. Plenum, New York pp. 123–41
Emslie G, Rush J, Weinberg W, Kowatch R, Hughes R, Carroll W, Carmody T, Rintelmann J 1997 A double-blind, randomized, placebo-controlled trial of fluoxetine in children and adolescents with depression. Archives of General Psychiatry 54: 1031–7
Fleming J E, Offord D R 1990 Epidemiology of childhood depressive disorders—a critical review. Journal of the American Academy of Child and Adolescent Psychiatry 29: 571–80
Harrington R, Fudge H, Rutter M, Pickles A, Hill J 1990 Adult outcomes of childhood and adolescent depression. Archives of General Psychiatry 47: 465–73
Kovacs M 1996 The course of childhood-onset depressive disorders. Psychiatric Annals 26: 326–30
Lewinsohn P, Hops H, Roberts R, Seeley J, Andrews J 1993 Adolescent psychopathology. Journal of Abnormal Psychology 102: 133–44
Mufson L, Weissman M, Moreau D, Garfinkel R 1999 Efficacy of interpersonal psychotherapy for depressed adolescents. Archives of General Psychiatry 57: 573–9
Reinecke M A, Ryan N, DuBois D 1998 Cognitive–behavioral therapy of depression and depressive symptoms during adolescence. Journal of the American Academy of Child and Adolescent Psychiatry 37: 26–34
Reynolds W, Johnston H (eds.) 1994 Handbook of Depression in Children and Adolescents. Plenum, New York

J. F. Curry

Childhood Health

Child health psychology is an interdisciplinary field concerned with the physical, cognitive, social, and emotional functioning and development as they relate to health and illness issues in children, adolescents, and their families. It has emerged as a separate discipline, in recognition of the unique ways in which children are affected by illness compared with adults.

There are several reasons for establishing child health psychology as a separate discipline from that of adults. First, children’s understanding of health, and the causes and implications of illness, differ from those of adults. Second, children’s health is important both for understanding the current health of the child and in terms of implications for adult health. Children who adopt unhealthy lifestyles tend to become adults with unhealthy lifestyles. Third, diseases that affect children may have different implications for daily life compared with conditions that affect adults. Thus, children with diabetes need insulin on a daily basis, whereas many adults may be treated by diet or medication alone. Untreated, cancer in children is rapidly fatal, but may be more chronic in adults. Fourth, the impact of childhood illness is not restricted to the individual but affects the whole family. Parents, and doctors, make decisions on behalf of children, often with little clear understanding of the child’s preferences.

Child health psychology is concerned with the relationship between psychological and physical well-being. The scope of child health psychology is generally considered to include the following:
(a) psychosocial contributions to outcomes in pediatric conditions;
(b) assessment and treatment of behavioral and emotional consequences of disease;
(c) role of psychology in health care settings;
(d) promotion of health and health-related behaviors;
(e) prevention of illness and injury among children and adolescents; and
(f) training.

In reviewing the achievements of child health psychology, this initial conceptualization has been followed, though it is immediately apparent that some areas have been the focus of much greater research activity than others.

1. Psychosocial Contributions to Outcomes in Pediatric Conditions

It has been argued that chronic illness challenges the child’s normal development by limiting opportunities, restricting play, activities, and the attainment of autonomy, and potentially compromising family and peer relationships. These effects may differ specifically as a function of the child’s age. For infants, chronic illness is most likely to affect parent–child relationships, restrict mobility, or limit opportunities to socialize with peers. Separation from parents is a key issue. For older children, the impact will be more direct, in terms of reduced schooling, compromised peer relationships, more time with adults, concerns about body image, and awareness of vulnerability and possible death. Cancer in adolescence may extend a period of dependency on parents and reduce opportunities to establish close interpersonal relationships with the opposite sex.

Clinically, the study of how development proceeds despite such adversity is justified in terms of increasing our understanding of normal development, offering insights where problems occur, and improving clinical services. The identification of protective factors, such as parenting, is also important. In some cases, family variables may be better indicators of outcome than illness predictors. Thus, Carlson-Green et al. (1995) studied 63 children who had been treated for a brain tumor. The best predictors of children’s behavior problems were family and demographic variables, while the best predictors of achievement were illness and demographic variables. The implications are that outcome following traumatic illness and treatment may be very much influenced by positive family variables. Identification of factors that moderate outcomes is of central interest in child health psychology.

2. Assessment and Treatment of Behavioral and Emotional Consequences of Disease

Measurement of psychosocial adjustment has been central to research in this area. In that a central task of childhood is to develop toward an autonomous, healthy, and well-functioning adult, adjustment has
been defined in developmental-normative terms. ‘Good adjustment, then, is reflected as behavior that is age-appropriate, normative and healthy, and that follows a trajectory toward positive adult functioning. Maladjustment is mainly evidenced in behavior that is inappropriate for the particular age, especially when this behavior is qualitatively pathological or clinical in nature’ (Wallander and Thompson 1995 pp. 125–6). While the basic question concerning how illness and treatment affect the child’s normal development is a very real one, the issue of measurement is complex. Standardized measures that capture the kind of experiences described so graphically by parents in interview studies are not available.

Measurement of adjustment has often been based on parents’ reports and most commonly is assessed using the Child Behavior Checklist (Achenbach 1991). This measure has come under considerable criticism, not least because it lacks sensitivity for work with sick children (Perrin et al. 1991). Many of the items assess somatic symptoms. As a consequence, children with cancer (or any physical illness) inevitably have higher scores than healthy children (see Childhood Cancer: Psychological Aspects).

Improvements in the management of a number of life-threatening conditions, for example, cancer and cystic fibrosis, have led to increased survival. However, this has been achieved at the cost of often daily, frequently painful, and certainly intrusive treatment regimens, which may continue for many years. Survival alone is therefore no longer perceived to be an adequate outcome measure. A central concern must be the extent to which treatment compromises quality of life (QoL). This change in emphasis has necessitated the development of alternative means of assessing treatment outcomes. Clinicians and researchers have therefore turned to measures of QoL to provide a more comprehensive account of individual patient experience as well as information on which to base decisions about service provision. Acceptance of this broader perspective on outcomes has led to a recognition of the need to identify methods of medical management that impose fewer restrictions on quality of life.

Quality of life is a difficult concept to define and an even more difficult one to measure. Critical to most definitions is the notion that an individual’s perception of their QoL is unique, and it therefore follows that efforts must be made to elicit information directly from the individual. Proxy ratings, as provided by caregivers, are useful, but not a substitute for an individual’s own information.

Only some 3 percent of trials in pediatric oncology include a measure of QoL. There are few examples of their use in intervention studies (Kazak et al. 1996). To a large extent, this can be attributed to reticence on the part of clinicians to use the measures. Objections are generally based on perceptions of poor psychometric properties. In addition, many measures lack face validity; they do not appear to tap issues of relevance to the children under study. Many measures are criticized as too long, too repetitive, or simply inappropriate.

It would be appropriate to conclude this section with some discussion of the increasing use of qualitative measures. These are preferred by some, while others acknowledge the need for qualitative measures in combination with the more conventional quantitative measures. Qualitative methods can be used in the development of quantitative measures, ensuring that the latter include issues of importance to the target population. Qualitative methods may also be the method of choice in situations involving sensitive issues or for work with very young or handicapped children (Eiser and Twamley 1999).

3. Intervention Programs

Interventions have been reported to facilitate return to school following long illness. Varni et al. (1993), for example, reported a social skills program to teach children returning to school following diagnosis with cancer how to deal with other children’s questions. A number of methods were used (e.g., role-play) to prepare children for teasing and allow them to develop the skills to know what to say and save face.

Other programs are aimed at improving self-care. Holzheimer et al. (1998), for example, used a teaching video, instruction booklet, and opportunities to rehearse the desired skills, and reported improved use of an inhaler in preschool children with asthma.

Efforts to treat procedure-related pain have been reported most frequently in pediatric oncology (see Childhood Cancer: Psychological Aspects). Children with cancer undergo regular painful procedures (e.g., lumbar punctures and bone marrow aspirations). The earliest reports by Jay et al. (1986) described interventions based on cognitive behavior therapy, including breathing exercises and use of imagery. Filmed modeling has also been used. Although the early reports were often based on a small number of case studies, more recent work has involved randomized clinical trials. Kazak et al. (1996) reported that a combined physiological and psychological intervention was more successful than a physiological intervention alone.

Interventions for disease-related pain for children remain poorly developed, despite the very high levels of pain experienced by children with conditions such as cancer, arthritis, hemophilia, and sickle cell disease. Interventions need to target both disease-related pain and the child’s perceptions of pain. Walco et al. (1999) conclude that more work is needed which documents the emergence of pain experiences and coping over time. Interventions need to be based on a variety of approaches, and target the pain experience within a social and family context.
School-based programs for children with headaches may be particularly successful. Larsson and Carlsson (1996) randomized 26 children aged 10–15 years with chronic tension-type headache to a nurse-administered relaxation training intervention or a no-treatment control. The five-week program involved twice-weekly sessions, each lasting approximately 20 minutes. Headache activity was reduced in children in the intervention group at post-treatment and six-month follow-up compared with the no-treatment controls.

4. The Promotion of Health and Health-related Behaviors

Drug and alcohol use among teenagers and adolescents usually occurs in group situations, and the use of such substances has been found to be strongly motivated by self-presentational desires such as the need for social approval and peer acceptance. Family factors, including parental modeling (often parents themselves are heavy users of the substance), and the family–child relationship, are key risk factors for subsequent substance use. In addition, peer relationships appear to be strongly related to the extent of alcohol and illicit drug use. Both types of substance abuse seem to be more common among those who report academic difficulty or poor adjustment at school. Other reasons for the illicit use of alcohol and drugs include the wish to convey autonomy or rebelliousness, and the reduction of anxiety in intergroup settings. Some may use alcohol or drugs if they feel unable to cope with the pressures associated with growing up, as a sort of temporary escape from school problems, social anxieties, or home difficulties.

5. Issues Relating to Training

The nature of much of the work with children involves collaboration with other professionals: clinicians, teachers, and social services. Issues of training particularly need to address the potential obstacles to successful collaboration. There is inevitably some friction between psychologists and pediatricians in the way in which they approach research and clinical issues concerned with chronically sick children. While both share some ideals about what should be done to help children with chronic conditions and their families, their training and experience do not necessarily mean that they share the same research agenda.

6. Future Directions

Increasing recognition of the closeness between physical and psychological health has contributed to an increasing visibility of psychological factors in pediatric medicine. So too have changes in treatment. In reality, caring for the child with chronic illness is not always about achieving a cure, but should be about enhancing the QoL. As such, there is pressure to include measures of QoL as an integral part of the evaluation of clinical trials. The development of disease-specific QoL scales means that health outcomes can now be considered from a more holistic point of view. QoL measures also have a role in evaluating innovative treatments. Children undergoing bone marrow transplants necessarily experience isolation, pain, and lengthy hospitalizations, and assessments of their QoL during treatment and after are essential.

Given the increasing costs of health care, it is critical that treatments are evaluated in terms of both physical and psychological well-being. Thus, evaluations of the success or otherwise of growth hormone therapy must not be based on height alone, but consideration also needs to be given to the child’s psychological well-being.

There remains scope for considerable improvements in measures available. Current work rarely draws on expertise in developmental psychology to direct the format or content of measures. There is an emerging literature documenting children’s language, memory, and emotional development, much of which could be used to guide new measures. It is also important that measures are based on a theoretical framework. In the past, theory has been sparse or embedded in a model of maladjustment. It is likely that theories of normal child development will prove more useful in the future.

Despite the criticisms that can be made, there is no doubt that the establishment of a child health psychology in itself has done much to increase the profile of work with children. The difficulties involved may be considerable, but these also contribute to its attractiveness. That children have a unique perspective on life, despite the hardships of a serious illness, is without debate. From such a perspective, and for many people, work with children is therefore not only humbling but also intrinsically more satisfying than work with other groups.

See also: Adolescent Health and Health Behaviors; Child Care and Child Development; Childhood Depression

Bibliography

Achenbach T M 1991 Manual for the Child Behavior Checklist/4-18 and 1991 Profile. Department of Psychiatry, University of Vermont, Burlington, VT
Bradlyn A S, Ritchey A K, Harris C V, Moore A K, O’Brien R T, Parsons S K, Pattersan K, Pollock B H 1995 Quality of life research in pediatric oncology: Research methods and barriers. Cancer 78: 1333–9
Carlson-Green B, Morris R D, Krawiecki N 1995 Family and illness predictors of outcome in pediatric brain tumors. Journal of Pediatric Psychology 20: 769–84
Eiser C, Cotter I, Oades P, Seamark D, Smith R 1999 Health-related quality of life measures for children. International Journal of Cancer 512: 87–90
Eiser C, Twamley S 1999 Talking to children about health and illness. In: Murray M, Chamberlain K (eds.) Qualitative Health Psychology: Theories and Methods. Sage, London
Holzheimer L, Mohay H, Masters I B 1998 Educating young children about asthma: Comparing the effectiveness of a developmentally appropriate asthma education videotape and picture book. Child: Care, Health and Development 24: 85–99
Jay S M, Elliott C, Varni J W 1986 Acute and chronic pain in adults and children with cancer. Journal of Consulting and Clinical Psychology 54: 601–7
Juniper E F, Guyatt G H, Feeny D H, Ferrie P J, Griffith L E, Townsend M 1996 Measuring quality of life in children with asthma. Quality of Life Research 5: 35–46
Kazak A E, Penati B, Boyer B A, Himelstein B, Brophy P, Waibel K, Blackall G F, Daller R, Johnson K 1996 A randomized controlled prospective outcome study of a psychological and intervention pharmaceutical protocol for procedural distress in pediatric leukemia. Journal of Pediatric Psychology 21: 615–32
Landgraf J M, Abetz L, Ware J E 1996 Child Health Questionnaire (CHQ): A Users Manual. The Health Institute, New England Medical Centre
Larsson B, Carlsson J 1996 A school-based, nurse administered relaxation training for children with chronic tension-type headache. Journal of Pediatric Psychology 21: 603–14
Perrin E C, Stein R E K, Drotar D 1991 Cautions in using the Child Behavior Checklist: Observations based on research about children with a chronic illness. Journal of Pediatric Psychology 16: 411–21
Peterson L, Oliver K K, Brazeal T J, Bull C A 1996 A developmental exploration of expectations for and beliefs about preventing bicycle collision injuries. Journal of Pediatric Psychology 20: 13–22
Varni J W, Katz E R, Colegrove R, Dolgin M 1993 The impact of social skills training on the adjustment of children with newly diagnosed cancer. Journal of Pediatric Psychology 18: 751–67
Walco G A, Sterling C M, Conte P M, Engel R G 1999 Empirically supported treatments in pediatric psychology: Disease related pain. Journal of Pediatric Psychology 24: 155–67
Wallander J L, Thompson R J 1995 Psychosocial adjustment of children with chronic physical conditions. In: Roberts M C (ed.) Handbook of Pediatric Psychology. Guilford Press, New

across studies. Nonetheless, enough research has accumulated from both clinical and general population studies to support the claim that CSA is significantly correlated with increased risk for various forms of adult psychopathology. There has been disagreement, however, about whether CSA causes adult psychopathology. Since the 1990s, a few studies have been conducted that allow for causal interpretations about the effects of CSA on adult psychopathology.

The purpose of this article is to describe the historical and scientific development of CSA research, from early clinical studies to current studies of the general population. Definitional and methodological issues in CSA research will be discussed, and a review of recent studies that have used powerful methodologies for understanding cause-and-effect relationships will be presented. Finally, some general conclusions about the effects of CSA on adult psychopathology will be provided, and future directions for this research will be suggested.

2. The Historical Development of Research on Childhood Sexual Abuse

The evolution of CSA research, as discussed below, has followed a three-phase course that roughly parallels the progression of research found in all psychiatric epidemiological research: (a) pioneering publications and numerous clinical studies; (b) studies using nonclinical (general population) samples to estimate prevalence and correlates of CSA; and (c) studies using scientifically rigorous methods that allow for cause-and-effect conclusions about CSA and adult psychopathology.

2.1 Phase 1 Research: Pioneering Publications and Clinical Research

Modern awareness of the prevalence, characteristics, and possible consequences of CSA in the USA can be
York, Vol. 2 traced to the late 1970s and early 1980s. During this
first phase of modern CSA research, a few pioneering
C. Eiser publications, most notably Finkelhor’s Sexually
Victimized Children in 1979 and Russell’s Sexual
Exploitation: Rape, Child Sexual Abuse, Sexual Har-
assment in 1984 provided data that indicated CSA
occurred frequently to children and adolescents in the
Childhood Sexual Abuse and Risk for USA. Following these works, public attention to CSA
was further stimulated by several highly publicized
Adult Psychopathology media accounts of CSA, and clinical research
increased dramatically. Results from early clinical
1. Oeriew of the Issue studies were mostly of women, and indicated a high
percentage of women reporting CSA, with many
Numerous studies on the prevalence, characteristics, women indicating that the abuse involved family
and consequences of childhood sexual abuse (CSA) members and\or serious abuse (i.e., intercourse CSA).
have been conducted since the 1970s, but deriving These studies also found that a history of CSA was
conclusions about this literature has been hampered associated with elevated rates of several psychological
by the wide variability in definitions and methods used problems, including substance abuse, anxiety dis-

1712
Childhood Sexual Abuse and Risk for Adult Psychopathology

orders, depression, self-injurious behaviors, and inter- sample; (c) methods of data collection (telephone,
personal functioning deficits. As a result of this mailed, or face-to-face interviews); (d) response rates;
increased attention to CSA, dramatic changes in the and (e) the number of questions used to ask about
reporting and verification of CSA occurred in the CSA. Definitions of CSA have varied across all studies
USA: from 1976 to 1993, there was a 25-fold increase to date, primarily in four dimensions: (a) the ages used
in verified cases of CSA. Although much of the original to delimit childhood (most use 18 or 16); (b) whether
research on CSA in the 1970s and 1980s was centered or not the respondent is asked to self-define the
on the USA, other countries, including Canada, the experience as ‘abusive’ (e.g., asking if the experience
UK, and several European nations began or increased was unwanted, the result of force or trickery, and\or
research on CSA. was considered abusive); (c) whether or not differences
in age between the respondent and the perpetrator are
used to define CSA; and (d) the types of sexual
activities asked about (e.g., contact only acts vs. both
2.2 Phase 2 Research: General Population Studies
contact and noncontact acts).
of Prealence and Correlates of CSA
Different definitions of CSA and the number of
The second phase of CSA research in the early 1980s screening questions used most likely contribute to
addressed a methodological limitation of earlier most of the variability across studies. Definitions that
studies i.e., that most early studies were based on use older age cutoffs, and include both contact and
samples of children who were identified as being noncontact acts produce higher prevalence rates. In
abused (forensic samples) or samples of adults who addition, surveys that use two or more screening
were in treatment (clinical samples). These studies questions for CSA typically result in higher prevalence
were limited because results from forensic and clinical rates compared to surveys using only a single question.
samples may differ significantly from results found For a thorough review of how different definitions and
among persons who did not report the abuse or who methods influence prevalence rates of CSA, see
were not in treatment. Numerous nonclinical studies Finkelhor (1994). In this report, Finkelhor reviews the
used college samples, but again, women and men in bulk of studies to date, and concludes that about 20
college who report a history of CSA may not generalize percent of North American women and about 5–10
to the adult general population for many reasons, percent of North American men report a history of
including that college students may over-represent CSA experiences that include both contact and non-
socioeconomically advantaged persons and under- contact acts. The 20 percent estimate for women is
represent psychologically impaired persons. Some consistent with a recent nationally representative study
researchers have also argued that first-year college of women in the USA that used multiple screening
students may have very recently experienced CSA, and items for CSA and face-to-face interviews (Vogeltanz
long-term consequences have not yet occurred. et al. 1999). Prevalence estimates from other countries,
Forensic, clinical, and college samples studies are including Australia, the UK, The Netherlands, New
quite important for developing a base of research to Zealand, Spain, Sweden, and Switzerland, appear to
build upon, but the most scientifically rigorous sam- be fairly consistent with the North American esti-
pling design is one that uses a random sample from the mates, suggesting that CSA occurs at fairly consistent
general population. Random sampling indicates that rates in European and English-speaking countries.
every person in an identified community or even an Some of these earlier general population studies
entire country has an equal chance of being selected measured possible consequences of CSA, but most
for the study (although random samples may exclude varied significantly in the types of psychological
certain age groups or persons living in institutions). problems assessed and the instruments used. For
In CSA research, the basic methodology of a general example, many studies used unstandardized questions
population study is to ask adult participants if they about the occurrence of numerous psychological
experienced any sexually abusive acts during child- symptoms, others used standardized questionnaires
hood (the adult retrospective recall method). Re- and psychological symptom checklists, and a very few
searchers also often ask participants to report the used structured clinical interviews that allowed the
characteristics of the abuse (e.g., who was the abuser, researcher to make a clinical diagnosis for several
the ages of the abuser and participant), any known psychological disorders. Although results from these
effects of the abuse, and\or any current psychological studies varied, the majority, like earlier clinical studies,
problems they may be experiencing. reported significant relationships between adult
Over 20 adult retrospective studies with general women’s reports of experiencing CSA and their greater
population samples were conducted in North America likelihood of experiencing current symptoms of sub-
in the last two decades of the twentieth century, and stance abuse, depression, anxiety, sexual functioning
prevalence rates varied dramatically across studies, problems, eating problems, and self-injurious behav-
with about 2–62 percent in women and 3–16 percent in iors. A large limitation of these studies, as with earlier
men. Sources of this variability include differences in clinical studies, was the relative lack of inclusion of
(a) definitions of CSA; (b) geographical region of men. Researchers preliminarily concluded that men


with a history of CSA were also reporting elevated rates of various psychological problems.

In addition to reporting on the simple (bivariate) relationships between CSA and adult psychological problems in the general population, a few researchers looked at whether some characteristics of CSA might be more predictive of adult psychological problems. This preliminary data suggested that frequent and/or severe abuse, and abuse by a biological parent have been modestly, but consistently linked to a greater likelihood of problems in adulthood.

By the 1990s, several comprehensive reviews of CSA research had been published, with most authors claiming that CSA was correlated with numerous psychological problems, symptoms, or disorders. However, the majority of studies on CSA did not use methodologies that could determine if CSA caused adult problems, and several researchers challenged this causal assumption. The basic argument was that many persons who report a history of CSA also report that their childhood family environments were pathologic, including such problems as conflicts with parent(s), physical or emotional abuse, lack of parental warmth or caring, living with only one biological parent, and parental psychopathology. Therefore, it may be these family environment factors are the causes of adult psychopathology as well as a risk factor for the occurrence of CSA. For an example of a prominent debate on this issue, see Nash et al. (1998). As a result of these challenges, it became incumbent upon CSA researchers to address this issue in future research.

To address cause-and-effect issues in epidemiological research, researchers must use either prospective designs or rigorous multivariate studies (see a description below on these methods). This final phase of research is complex and very costly, and only a few studies to date have been rigorous enough to allow for cause-and-effect interpretations. A summary of this work is described below.

2.3 Phase 3 Research: Studies Examining the Causal Status of CSA on Adult Psychopathology

To date, only a few studies have been able to test adequately for a causal relationship between CSA and adult psychopathology. Although these studies vary in their definitions of CSA and measures of psychopathology, they are similar in three important ways. First, all the studies had large general population random samples of adults, or in one case, a nationally representative random sample of 10- to 16-year-olds. Second, all of the studies used sophisticated methodologies that allowed for some level of causal interpretation. These methods were of two types: a prospective design, in which researchers collect multiple measures from individuals over time, therefore capturing a more accurate account of what happened before and after the occurrence of CSA; or cross-sectional multivariate designs, in which data about early family environment, the occurrence of CSA, and adult psychopathology are gathered at one point in time. Two twin studies using a cross-sectional multivariate design are also included here because the twin study provides unique information about possible genetic contributions to risk for CSA and highly shared environments. Although cross-sectional multivariate designs cannot definitively test for causal relationships, rigorous multivariate methodologies provide a high degree of confidence about causality. Finally, all of the studies used structured diagnostic interviews allowing for clinical diagnosis of psychological disorders. Although standardized questionnaires may also provide an important way of measuring levels of psychological symptoms or distress in the general population, a diagnosis provides a clear statement that the individuals being interviewed have experienced significant impairment in their lives. Diagnoses also provide information about lifetime occurrence of disorders, whereas symptom checklists typically assess only for current levels of functioning.

The studies that meet the above criteria come from the USA, Australia, and New Zealand. In the first of two general population prospective studies, Fergusson and colleagues (1996) made yearly evaluations of a New Zealand sample of over 1,000 children from birth to age 18. At age 18, CSA before age 16 and diagnosable psychological disorders were measured. Results were that CSA significantly predicted major depression, anxiety disorder, conduct disorder, substance use disorder, and suicidal behaviors after controlling for prospectively measured family environment factors. The highest risk for having a psychological disorder was found among individuals reporting CSA intercourse experiences.

Boney-McCoy and Finkelhor (1996) used a prospective design in which they interviewed a nationally representative sample of children, aged 10–16, at two times, approximately 15 months apart. Results were that sexual assaults occurring to the children after the Time 1 interview predicted Time 2 diagnoses of major depression after controlling for previous lifetime occurrence of depression, the quality of the parent–child relationship, parental education, race, and whether the child lived with both parents. However, the strength of the sexual assault–major depression association was weaker after controlling for family environment factors.

Using a large random New Zealand community sample of women, Mullen and his colleagues (1993) reported that CSA was independently related to several diagnosed psychological disorders, after controlling for inadequate parenting, parental divorce, and early physical abuse. This study also found that the severity of abuse was the strongest predictor of later problems.

Finally, two twin studies also supported the independent effects of CSA on adult psychological


disorders, controlling for different aspects of family environment. Dinwiddie et al. (2000) conducted diagnostic interviews and measured CSA in 2,700 Australian twin pairs (male and female monozygotic and dizygotic pairs). A single question assessed for ‘forced’ CSA, and diagnoses were obtained for substance abuse, major depression, anxiety disorders, and conduct disorder. The only family environment factor measured was parental psychopathology. CSA significantly predicted all diagnoses in women and all but social phobia in men, after controlling for parental alcoholism and depression. The authors reported that when one twin reported CSA and the other twin did not (discordant for CSA), rates of disorders were not significantly different. However, further analyses revealed much higher rates of disorders among twins who were both abused (concordant), indicating that CSA may have specific effects on the development of disorders.

In the most recent and most sophisticated multivariate twin study to date, 1,411 adult female twins from the USA were given structured clinical interviews to determine lifetime diagnoses of major depression, generalized anxiety disorder, panic disorder, bulimia nervosa, alcohol dependence, drug dependence, and the presence of two or more of the disorders (comorbidity). CSA was measured using multiple screening questions to determine both contact and noncontact abuse. Several family environment factors were measured, but, uniquely, the researchers interviewed the respondents’ parents, and made clinical diagnoses of disorders and obtained parent ratings of various family environment factors. Results were that self-reported CSA of any kind was significantly associated with all disorders, except bulimia nervosa, after controlling for parental (or twin) reported family environment factors and history of parental psychopathology. In all disorders (including bulimia nervosa), intercourse CSA was more strongly related to psychological disorders than any other CSA form (genital, nongenital, or any CSA). As in the previous twin study, the cotwin control analyses indicated that there are specific effects of CSA on adult psychopathology, but that shared environmental factors clearly contribute to both risk of CSA and development of future disorders.

3. Conclusions

The studies reviewed used the most scientifically rigorous methods to date for testing the causal relation between CSA and adult psychopathology. The studies also had the benefit of all using the same type of measurement for adult psychopathology, a clinical interview that used psychiatric diagnostic criteria for determining the presence of a disorder. The studies support four main conclusions. (a) The statistical relationship between CSA and adult psychopathology remains significant after controlling for certain aspects of negative family environment, although the strength of the relationship attenuates from slightly to considerably, depending on the psychological disorder being predicted. (b) The reported strength or size of the independent relationships between CSA and risk for adult psychopathology were modest, indicating that (i) family environment serves as both a risk factor for CSA and a mediator for the effects of CSA on adult psychopathology; and (ii) many individuals who report a history of CSA do not develop adult problems. (c) More severe forms of CSA, i.e., intercourse CSA, lead to a greater risk of developing psychological disorders. (d) Men were under-represented in these studies, but when studied, also had similar risks to women for developing psychological disorders as a consequence of CSA history.

4. Future Directions

Research on the prevalence, characteristics, and correlates of CSA has progressed enormously in the last two decades, and findings from each phase of research have informed the next. Despite this progress, there are still some core problems that should be corrected, including the need for standard definitions of CSA, more information about what measurement strategies result in the most accurate information about CSA, and the inclusion of more men in studies. Findings from current research indicate the need for future studies of CSA that assess for a wide range of family environment factors, because it is now certain that the effects of CSA on adult psychological problems are influenced by the familial context in which children live. There is no consensus, however, on which aspects of family environment most influence responses to traumatic events such as CSA, and this research is needed. Although there is some data suggesting that more severe forms of CSA lead to greater risk for adult problems, there is still much to be learned about how characteristics of the abuse may influence later problems, and how abuse characteristics may interact with family environment factors. For example, there is no information about how family environment factors may interact with CSA differently, depending on whether or not the abuser lives in the home with the child. Finally, more information is needed about what happens after CSA, including measurement of subsequent psychological problems, coping methods, and determining what factors seem to protect individuals from developing long-term psychological problems.

In order to answer these questions, CSA researchers must continue to use prospective and multivariate cross-sectional studies, but there continues to be a great need for clinical and forensic studies that can inform the direction of more costly epidemiological studies.

See also: Anxiety Disorder in Children; Behavior Therapy with Children; Child Abuse; Child and


Adolescent Psychiatry, Principles of; Child Care and Child Development; Childhood Depression; Children and the Law; Children, Rights of: Cultural Concerns; Sex Offenders, Clinical Psychology of; Violence and Effects on Children

Bibliography

Boney-McCoy S, Finkelhor D 1996 Is youth victimization related to trauma and depression after controlling for prior symptoms and family relationships? A longitudinal study. Journal of Consulting and Clinical Psychology 64: 1406–16
Conte J R 1994 Child sexual abuse: Awareness and backlash. Future of Children 4: 224–32
Dinwiddie S, Heath A C, Dunne M P, Bucholz K K, Madden P A F, Slutske W S, Bierut L J, Statham D B, Martin N G 2000 Early sexual abuse and lifetime psychopathology: A co-twin control study. Psychological Medicine 30: 41–52
Finkelhor D 1979 Sexually Victimized Children. Free Press, New York
Fergusson D M, Horwood L J, Lynskey M T 1996 Childhood sexual abuse and psychiatric disorder in young adulthood, II: Psychiatric outcomes of childhood sexual abuse. Journal of the American Academy of Child and Adolescent Psychiatry 35: 1365–74
Finkelhor D 1994 Current information on the scope and nature of child sexual abuse. Future of Children 4: 31–53
Kendler K S, Bulik C M, Silberg J, Hettema J M, Myers J, Prescott C A 2000 Childhood sexual abuse and psychiatric and substance use disorders in women. Archives of General Psychiatry 57: 953–9
Mullen P E, Martin J L, Anderson J C, Romans S E, Herbison G P 1993 Childhood sexual abuse and mental health in adult life. British Journal of Psychiatry 163: 721–32
Nash M R, Neimeyer R A, Hulsey T L, Lambert W 1998 Psychopathology associated with sexual abuse: The importance of complementary designs and common ground. Journal of Consulting and Clinical Psychology 66: 568–71
Rind B, Tromovitch P, Bauserman R 1998 A meta-analytic examination of assumed properties of child sexual abuse using college samples. Psychological Bulletin 124: 22–53
Russell D E H 1984 Sexual Exploitation: Rape, Child Sexual Abuse, Sexual Harassment. Sage, Beverly Hills, CA
Vogeltanz N D, Wilsnack S C, Harris T R, Wilsnack R W, Wonderlich S A, Kristjanson A F 1999 Prevalence and risk factors for childhood sexual abuse in women: National survey findings. Child Abuse & Neglect 23: 579–92

N. D. Vogeltanz-Holm

Children and the Law

The ways in which the law has positioned children have varied from society to society and over time. In part this is due to differences and changes in family life, in the material world that children inhabit, and in the way we think about children. The legal standing of children today in the West would be unrecognizable to the eighteenth century observer, who would have seen children as the property of the father. The law relating to children reflects changes in women’s legal standing as well as global diversity and changing social and cultural attitudes and practices. With the rapid development of international law in the second half of the twentieth century, global variations have become more visible and contested. These variations express some of the key social and political tensions raised by the changing significance of childhood at the beginning of the twenty-first century. So despite the emergence of international laws on children, the ways in which children’s welfare and rights are secured differ.

1. A Brief History of Children and the Law

Under the Roman civil law doctrine of patria potestas the father had unlimited control over his children. This doctrine shaped Western legal systems’ vesting of parental rights in the father to the exclusion of the mother until the eighteenth and nineteenth centuries. Within Anglo-American jurisprudence, the common law treated the father as the legitimate child’s natural guardian. Legislative reform throughout the nineteenth century encroached upon the rights of the father, giving mothers limited rights to custody of their children in specific circumstances. By the early twentieth century the welfare of the child had become the guiding principle in disputes over children on divorce. However, it was only much later in the twentieth century that the rights of mothers and fathers within marriage in relation to their children were equalized.

With the rise in divorce and parental separation in the last decades of the twentieth century and the questioning of earlier norms by the growing feminist movements (Bridgeman and Monk 2000), legislatures and policy makers considered how the welfare of those children affected could be properly safeguarded and promoted. Developments have included the emergence of alternative dispute resolution, the delegalization of family disputes, and the adoption of ‘no-fault’ divorce (Eekelaar and Katz 1984). (See Family Law.)

1.1 Legitimacy

While legitimate children were seen as the property of the father, illegitimate children were seen as the property of no one. Such a child was nullius filius and had no legal relationship with his or her parents. The early bastardy laws were aimed at preventing illegitimate children from becoming a charge on the community, and attempted to do so by punishing the unmarried mother and the reputed father, and charging either the mother or both for the relief of the child. In the twentieth century, the status of the illegitimate child has changed substantially, as has the language concerning legitimacy. Article 25(2) of the Universal Declaration of Human Rights 1948 provides that ‘all children, whether born in or out of wedlock, shall enjoy the same social protection’ and Article 2(2) of the United Nations Convention on the Rights of the Child 1989 (UNCRC) requires states to take all ‘appropriate measures to ensure that the child is protected against all forms of discrimination or punishment on the basis of the status … of the child’s parents.’ Some countries have abolished the concept of illegitimacy altogether; others have permitted legitimation by subsequent marriage and have legislated to mitigate the adverse legal consequences by extending to illegitimate children the same rights accorded to legitimate children. The parental status of the father has been increasingly recognized in some countries; others, such as those in South Asia, continue to recognize only the nonmarital child’s legal relationship with the mother (Goonesekere 1998).

1.2 Child Welfare and Protection

Much of the history of children and the law has been dominated by campaigns to protect children and to advance their welfare. The traditional account of the history of childhood is a story of rescue. The nineteenth century witnessed the development of agencies that were dedicated to rescuing children from abuse and neglect within the family. Societies for the prevention of cruelty to children were established in many Western cities, and by the early twentieth century there were over 34 such societies in the USA and fifteen elsewhere. By the late twentieth century, child rescue had become an international as well as national cause (Kent 1995, Van Bueren 1998).

Then, as today, there was debate surrounding the proper role of the state regarding intervention into family life. The ideology of family privacy and the idea that children were the property of their parents combined to render legislatures reluctant to encroach too far into family life. However, children of poor and working class families were more visible and more intensively policed than those from wealthier and more powerful sections of the community. The ‘discovery’ of ‘the battered child’ in the middle of the twentieth century, and of child sexual abuse in the final decades, gave rise to significant changes in legal practice, such as revisions in the rules regarding the admissibility of children’s testimony.

In contrast to the legislative reluctance to intervene in any substantial way with the parent–child relationship, there was considerable legislative activity in the nineteenth and twentieth centuries to protect children from exploitation and abuse outside the home and to better advance their welfare. In the public sphere in industrializing countries, laws were introduced to restrict and finally to prohibit the employment of children in various occupations (e.g., in factories and coal mines). Initially such legislation restricted the hours worked; with the advent of compulsory education laws children were further withdrawn from the labor force. Concerns about child health resulted in measures to reduce infant mortality rates and to improve diet through welfare provision such as school meals. Such measures benefited all children by improving public hygiene, but many focused on the urban poor.

1.3 Institutional Developments

The development of mass education for children in the nineteenth and twentieth centuries has resulted in the school becoming central to the experience of childhood. The law relating to children and education was controversial given that disputes over education raised issues around parental rights, religious freedom, and social justice.

Another institutional expression of the legal and social changes surrounding childhood was the growth of the juvenile court movement in the late nineteenth and early twentieth centuries (see Juvenile Justice: International Law Perspectives). Between 1899 and 1908, juvenile courts had been set up in states in the USA, Australia, and England (Mack 1909). Separate penal institutions for juvenile offenders had already been established, with reformatories for juvenile offenders being built in the USA and England by the mid-nineteenth century. The rationale for these developments was that children were harmed by exposure to adult depravity, especially within the penal system, that children were more victims than villains, and that with timely and appropriate intervention the child could be rescued from a life of crime and immorality. In a number of jurisdictions there was no distinction made between the child who offended and the neglected child; both were seen as the victim of the socioeconomic environment. The dominance of welfare considerations in the juvenile courts, however, resulted in the neglect of procedural rights for children. Procedural justice for children in the criminal and civil justice systems became an issue in the 1960s and 1970s in a number of jurisdictions. For example, in the USA the Supreme Court decision in Re Gault (1967) established that children were entitled to the protection of the Constitution.

Later the idea of a family court emerged in response to the demand that decisions taken in relation to children needed to be informed by specialist knowledge about their needs and welfare. This in turn has led to the emergence of specialist judges in some jurisdictions.

Despite significant historical variations in the position of children under the law, by the end of the twentieth century in civil proceedings the child’s welfare was treated as the primary if not paramount


consideration. The language used in many legal systems is no longer that of parental rights alone but rather of parental responsibility; rights that parents do enjoy must be exercised for the benefit of the child. The child is no longer seen as the property of parents.

2. The Welfare of Children

Today most legal systems consider the child’s welfare to be a primary consideration in the resolution of disputes involving children, a position reinforced by the near universal signing and ratification of the UNCRC.

The law relating to children is increasingly expressed in terms of advancing their welfare, though there are exceptions (see Sect. 2.2). The centrality of the welfare of the child is expressed in domestic and international law. Principle 1 of the UN Declaration of the Rights of the Child 1959 provides that in the enactment of laws for the special protection of children ‘the best interests of the child shall be the paramount consideration.’ This set the international benchmark for assessing legislation affecting the welfare of children until the UNCRC of 1989. The UNCRC is based on the ‘3 Ps’: rights to protection, provision, and participation. Article 3 of the UNCRC provides that: ‘In all actions concerning children, whether undertaken by public or private social welfare institutions, courts of law, administrative authorities or legislative bodies, the best interests of the child shall be a primary consideration.’ Article 3 is wider than Principle 1 of the UN Declaration insofar as it states that the child’s welfare shall be the primary consideration of ‘public or private social welfare institutions.’

While the welfare principle is the primary concern in judicial proceedings affecting children, the ability of legislation directly to safeguard and promote the welfare of children is limited by the material and ideological circumstances of their upbringing. The condition of children in the world today varies enormously; this can be seen in different mortality rates, child labor, access to education, malnutrition, the prevalence of child prostitution, and the presence of child soldiers. These issues give rise to intense debate nationally and internationally. For example, while Article 32 of the UNCRC requires states parties to the UNCRC to recognize ‘the right of the child to be protected from economic exploitation and from performing any work that is likely to be hazardous or to interfere with the child’s education,’ child labor, including bonded labor, is still widespread throughout South Asia and South America. The International Labour Organization Convention No. 138 links the child labor issue with education and provides that the minimum age for admission to employment must not be lower than the compulsory school age and in any event should not be lower than 15.

The debates surrounding the welfare rights of children are intense because they raise questions regarding state power, whose definition of welfare prevails, and respect for the child’s identity and community. These issues are contested at both the national and global level (Alston 1994).

2.1 Children and Families

The conventional account of the different ways in which children are treated by the law has two divergent strands: one that emphasizes the child’s place within a particular family (belonging to a specific religious, ethnic, and cultural community) and one that offers a more individualized approach, recognizing the distinctiveness of interests even within the family. Domestic and international law sees the child’s family as being the natural and proper environment for the child’s upbringing, even though the concept of the family is undergoing change, ranging from increased cohabitation to gay marriage and adoption (Cornell 1998). There is considerable social value in ensuring that, within limits, parents should be free to bring up their children in a way that is consistent with their own values and beliefs. However, there is variation in how legal systems regulate family life, and in particular how they allocate powers and duties to parents and determine the circumstances in which family privacy can be overridden.

2.1.1 Child abuse. While child abuse is not confined to the family, much of the debate about the legal framework focuses on this setting. The abuse of children in ‘public care’ (while regularly plagued by scandal) tends to generate discussion about the accountability of welfare bureaucracies and the quality of care provided by the corporate parent. Abuse within the home gives rise to special problems for legal systems. Not only do childrearing practices vary, but some groups (e.g., the urban poor) are subject to more coercive forms of intervention than others. In some jurisdictions, for example, the grounds for intervening into family life in order to protect children are drawn in broad terms (e.g., statutes identify ‘immorality’ as constituting neglect and thus permit the removal of the child), while in others there is a minimum harm threshold that has to be satisfied if any intervention is to be justified. Jurisdictions vary in the degree to which they adjudicate between the competing values of parents’ right to raise their children according to their own values, on the one hand, and the child’s right to be protected, on the other (Schwartz-Kenney et al. 2001).

2.1.2 Relationship breakdown. The rise in relationship breakdown in the West has led to a number of significant developments.


There has been an increase in disputes over the custody of children when relationships end. Many jurisdictions have moved towards no-fault divorce, and where private ordering is unsuccessful disputes over children are increasingly resolved via mediation. Where such disputes are not resolved through mediation, court adjudication tends to focus on the welfare of the child. The enforcement of orders made in relation to custody and access, or contact, is problematic; the courts in some jurisdictions (such as the USA) at times threaten to imprison the custodial parent if they continue to refuse to allow the noncustodial parent access.

Custody disputes occasionally involve child abductions. Most states have provisions dealing with child abduction nationally and have entered into bilateral and regional treaties to deter the practice (e.g., the Inter-American Convention on the International Return of Children 1989). At the international level, the Hague Convention on the Civil Aspects of International Child Abduction (1980), which has been signed by over 50 countries including most in North and South America and Europe, focuses on the enforcement of rights of custody between contracting states. The Convention applies where a child has been wrongfully removed or retained in breach of rights of custody. The principle behind the Convention is that children who have been abducted should be returned immediately to the country from which they have been abducted and that any dispute as to custody should be resolved within that jurisdiction. Article 13 of the Hague Convention provides defenses to an order for the immediate return of the child, including where the child objects and has attained an age and maturity where it is appropriate to take account of his or her views. This recognition of the autonomy rights of the child reflects how far the law has moved away from the idea that children are the property of their parents.

2.2 Juvenile Crime

While legal doctrine, procedure, and philosophy regarding the treatment of juvenile offenders vary across nations, international law at the end of the twentieth century has begun to lay down minimum rules for the administration of juvenile justice. In 1985 the UN General Assembly adopted the UN Standard Minimum Rules for the Administration of Juvenile Justice (the ‘Beijing Rules’). The Beijing Rules constitute a benchmark against which national juvenile justice systems can be measured, covering the investigation, prosecution, and punishment of juvenile crime.

Today the trend is towards a balancing of welfare and punishment considerations. The discrediting of aspects of the rehabilitative ideal in some countries in the 1960s and 1970s has led to a pragmatism and managerialism in juvenile justice. Increasingly juvenile offenders are segregated from adult offenders and diversion and reparation schemes have developed to reduce the risk of contamination from the formal justice system and to underscore the offender’s responsibility to the victim and the wider community. However, in some countries (e.g., some states in the USA) there are moves to treat juvenile offenders as if they were adults, including exposing them to the possibility of capital punishment for offences committed whilst a minor (Strater 1995).

3. The Modern Children’s Rights Movement

It is now accepted that children have rights (Freeman 1997). The meaning and realization of such rights is contested locally and globally. Ideas about children’s rights are linked with the child’s own community and culture.

The UNCRC signals that children have civil, political, and social rights. These ‘human rights’ are not just concerned with the welfare of children, though many of the UNCRC’s provisions are concerned with their protection (e.g., from abuse) and the adequate provisioning of childhood. The distinctive feature of the modern children’s rights movement is that alongside the traditional concern with the welfare and protection rights of children it aims to promote their ‘liberty rights.’ The rights contained in Articles 12 to 16 of the UNCRC are indispensable to developing respect for children in society and to fostering their participation in the community (De Winter 1997). The UNCRC has also acted as a catalyst for the further development of children’s ombudsmen (Flekkoy 1991).

However, the meaning of the right to participate is contested (see Ncube 1998), and legal systems are cautious about seeing the child as a legal subject. While in a number of jurisdictions children are able to instruct lawyers to represent them in private as well as public law proceedings, invariably the court’s view of that child’s welfare will prevail over the child’s wishes.

In short, changing views of children are effecting legal change. Children are now acknowledged as having a voice in law in an increasing number of jurisdictions. Familial relations are undergoing change pursuant to a trend towards the democratization of family life (Beck 1997). Changing views about children and human rights are bringing about a movement to end the corporal punishment of children. The UNCRC is giving rise to increased tension between the local and global positioning of the child. Debates on protecting children from traditional childrearing practices and the labeling of these as ‘abusive’ have given rise to critiques of universalist conceptions of rights. Yet states themselves are subject to scrutiny as to how they are meeting their obligations under the UNCRC.

While the way in which the law treats children continues to be contested, the emergent language of rights concerning social and intergenerational relations threatens to subvert many established ways of treating children and to open up spaces in which children can acquire their own voice.

See also: Adoption and Foster Care: United States; Child Abuse; Children, Rights of: Cultural Concerns; Dissolution of Family in Western Nations: Cultural Concerns; Divorce and Children’s Social Development; Divorce, Sociology of; Family as Institution; Family Law; Family Processes; Family Theory: Role of Changing Values; Nontraditional Families and Child Development; Poverty and Child Development; Regulation: Family and Gender; Street Children: Cultural Concerns

Bibliography
Alston P (ed.) 1994 The Best Interests of the Child: Reconciling Culture and Human Rights. Clarendon Press, Oxford, UK
Beck U 1997 Democratization of the family. Childhood: A Global Journal of Child Research 4(2): 151–68
Bridgeman J, Monk D (eds.) 2000 Feminist Perspectives on Child Law. Cavendish, London
Cornell D 1998 At the Heart of Freedom: Feminism, Sex and Equality. Princeton University Press, Princeton, NJ
De Winter M 1997 Children as Fellow Citizens. Radcliffe Medical Press, Oxford, UK
Eekelaar J, Katz S N (eds.) 1984 The Resolution of Family Conflict: Comparative Legal Perspectives. Butterworths, Toronto, Canada
Flekkoy M 1991 A Voice for Children. Jessica Kingsley, London
Freeman M D A 1997 The Moral Status of the Child. Martinus Nijhoff, The Hague, The Netherlands
Goonesekere S 1998 Children, Law and Justice. Sage, New Delhi, India
Kent G 1995 Children in the International Political Economy. St Martin’s Press, New York
Mack J 1909–10 The juvenile court. Harvard Law Review 23: 104–22
Ncube W (ed.) 1998 Law, Culture, Tradition and Children’s Rights in Eastern and Southern Africa. Dartmouth, Aldershot, UK
Schwartz-Kenney B, McCauley M, Epstein M (eds.) 2001 Child Abuse: A Global View. Greenwood Press, Westport, CT
Strater S 1995 The juvenile death penalty: In the best interests of the child? Loyola University Chicago Law Journal 26: 147–82
Van Bueren G (ed.) 1998 Childhood Abused: Protecting Children against Torture, Cruel, Inhuman and Degrading Treatment and Punishment (Programme on International Rights of the Child). Ashgate, Aldershot, UK

J. Roche

Children, Rights of: Cultural Concerns

The last 100 years have witnessed enormous progress in the recognition of individual human rights throughout the world. In addition, various identity groups, including women, racial, ethnic and religious minorities, the disabled, and gay males and lesbians all have claimed, and received to varying degrees, rights to equal treatment under law, nondiscrimination in employment in both the public and private sectors, and fuller integration into the economic and political life of their countries.

It is perhaps not surprising, therefore, that a worldwide movement to recognize children as a distinct subgroup with claims of rights also occurred. In 1924, the League of Nations adopted the first Declaration of the Rights of Children. In 1989, the United Nations General Assembly adopted the UN Convention on the Rights of the Child (UN Convention 1989), which has been called the ‘most comprehensive and detailed of all … international human rights instruments’ (Alston 1994, p. 1). The Convention has been ratified by 192 of the 194 UN members. In the United States conservative political groups, concerned that the CRC undermines parental authority and gives government too much power over families, have contributed to blocking ratification. Somalia lacks an internationally recognized government, precluding ratification.

Despite the seemingly widespread acceptance of the concept of children’s rights, the idea that children, individually or as a class, should have rights (claims enforceable against others) is actually revolutionary and problematic both in philosophical and practical terms. Determining what rights to provide children requires that societies confront fundamental value issues quite different from those entailed in recognizing other groups’ claims.

First, children generally are not autonomous; they are dependent. Ideally, they are part of families; their life chances are greatly affected by the care they receive from their families. Parents need authority to fulfill these responsibilities; some highly respected child development specialists argue that giving children rights may be incompatible with sound child development (Goldstein et al. 1973). In addition, giving children rights may conflict with parental rights and privacy. However, parents do not always act in their children’s best interests and the community as a whole has an interest in the development of future generations. Children have their own views and desires. Therefore, it must be decided how responsibility for the well-being and upbringing of children should be allocated between parents, families, the child, and the community as a whole.

Second, intellectual and emotional capacities develop gradually, for both biological and social reasons. These limits in capacity may conflict with claims that children should have the same liberties and privileges as adults (Purdy 1992). A theory of children’s rights must address the question, what entitles persons to be full participants in a country’s civil and political life?

This article reviews the historical development of children’s rights. It then describes the different categories of claims made under the rubric of children’s rights and discusses value issues that need to be resolved as governments, courts, and citizens seek to define children’s status in each society. It concludes by examining the challenges countries face implementing any such rights.

1. The Historical Context

For much of history children were viewed largely as nonpersons, the property and responsibility of their parents, who had a right to control their upbringing, even their very existence (Eekelaar 1986). Children (usually defined as persons under 21 or 18 years of age) also could not participate in the political or civic life of their countries.

In the mid-1800s, some countries began assuming state responsibility for promoting and protecting children’s well-being. Initially, this entailed providing free public education and passage of laws protecting children from severe physical abuse by their parents. Over the next 100 years, Western industrialized countries adopted more laws affecting the status of children, including laws limiting child labor in settings outside the family, passage of compulsory school attendance laws, and greater state efforts to protect children from parental maltreatment. The idea that childhood was a period for gradual maturation and lesser responsibility also was reflected in the creation of the juvenile court, which was directed to treat, not punish, children who violated the law (Hawes 1991).

While these policies were advocated in the name of children’s rights, and certainly contributed to their well-being, these were not the types of civil rights that other groups were demanding and receiving. The focus was on their protection, not their autonomy or fuller integration into the economic and political life of their countries. Moreover, except where parental behavior was seen as inimical to the social order, parents continued to have virtually total authority over their children’s upbringing. In the United States, where the movement towards child protection was most evident, the US Supreme Court ruled early on that parents had a constitutional right to control their children’s upbringing (Meyer v. Nebraska 1923).

In fact, these laws often were supported primarily as a means of protecting adult interests. For example, child labor laws and compulsory schooling were championed by newly emerging labor unions, which viewed the availability of child labor as a barrier to higher wages for adults (Woodhouse 1992). Proponents of the juvenile court promoted it as a needed intervention to protect society from children who were not being adequately supervised by their parents, often poor, immigrant parents (Schlossman 1977).

Since the 1960s, countries have continued expanding the protection of children’s well-being. The establishment of a right to education for virtually all children throughout the world, increased state efforts to guarantee children a minimal level of material well-being, and greatly expanded state intervention into family privacy through child abuse laws have all benefited children. Moreover, during this period, in many countries children were recognized as rights-holders with respect to freedom of speech and other civil liberties and were given a greater degree of legal autonomy over the decisions affecting their lives.

Yet the scope of applicable rights remains highly contested. At the extreme, some commentators have argued for ‘children’s liberation,’ with the complete elimination of distinctions between children and adults with respect to all political, social, and familial rights (children, reflecting their status, usually have not been the demanders of rights, in contrast to other groups). For example, they argued that children should have the right to vote, to decide whether to go to school, and to a share of the family income to spend as they choose (Farson 1974).

Few proponents of children’s rights go this far. Most advocates propose limiting, but not eliminating, parental autonomy in controlling their children’s lives and giving children increasing autonomy as they mature, in recognition that capacities develop over time. Yet even the more limited types of claims of rights raise profoundly difficult issues.

2. What is Meant by Children’s Rights

The term children’s rights is used to encompass a variety of different claims (Campbell 1992). Distinguishing these categories reveals the tensions in the general concept and helps identify the issues related to the implementation of various types of claims.

2.1 Basic Human Rights

The first claim is that children are entitled to basic human rights, including the right to life itself, the right not to be ‘owned’ by another, the right to freedom from torture or other cruel punishment, and the right to be free from discrimination based on race, color, sex, religion, or national, ethnic, or social origin.

Such rights are widely accepted for adults, despite continuing problems with implementation, especially with respect to nondiscrimination based on gender, race, and ethnicity. Since these rights are not tied to competence or capacity, there is no reason for denying them to children on this basis. The critical step is accepting that children are persons, not chattels or incomplete human beings. The adoption of this premise by virtually all countries reflects a fundamental shift in the conception of children.

Still, application of these rights to children is not entirely unproblematic, at least with respect to the relationship of parents and children. Few countries


still consider children to be the property of their parents. However, many people agree with Professor Charles Fried that ‘the right to form one’s child’s values, one’s child’s life plan … are extensions of the basic right not to be interfered with in doing these things for oneself’ (Fried 1978). Yet, such a right constitutes a form of ‘ownership’ and may conflict with the child’s right to an ‘open future’ and with society’s interest in how future adults are socialized (Feinberg 1980).

There is no easy resolution to this conflict. Unless the state were to intervene in family life in ways that would be deemed unacceptable in most countries, parents will continue to make critical decisions for children, regardless of the child’s wishes. As noted, respected child development experts assert that parents need autonomy to successfully raise their children.

Moreover, deference to parental or family authority may be demanded by a society’s political, cultural, or religious traditions. In some countries, supporting parental autonomy is seen as critical to supporting political and cultural diversity. In others, the proper role of parents may be seen as teaching children to accept national cultural values and traditions, not as helping children develop into ‘autonomous’ adults (see generally Alston 1994). Many countries have reserved the right to apply the CRC in light of their own cultural or religious laws, although there is evidence that the CRC is influencing these values (International Journal of Children’s Rights 1995). Whether children’s rights are universal or culture specific likely will remain one of the major debates in implementing the CRC.

In fact, given that children’s values and perspectives are malleable and constructed, someone will shape their values and personhood. Parents, families, neighborhoods, peers, government institutions such as schools and general cultural norms all play a part. Thus, the issue is what are the appropriate roles of the family and the state in making critical decisions, such as those regarding schooling, religion, where the child shall live, not whether children should have the right to control their own upbringing (Coons et al. 1991). This question goes to the heart of how a society defines the relationship between the state and the individual, adults and children alike.

Children also are denied some basic civil rights available to adults because societies have an interest in how they are socialized. At a minimum, children are compelled to go to school, thus restricting their liberty rights. Finally, there is still substantial debate over the status of the fetus with respect to abortion and maternal behavior during pregnancy. Because no consensus could be reached on this issue, the CRC leaves the question to each country.

Despite these problems, the recognition that children are persons, with rights, along with the acceptance of the basic premise of the CRC that in all decisions affecting children their ‘best interests’ shall be a primary consideration, constitutes a great advance that clearly benefits the vast majority of children.

2.2 Welfare Rights

A second category of claimed rights is that government should guarantee children a basic level of well-being; these are often called welfare rights. The CRC provides that children should have a right to ‘a standard of living adequate for the child’s physical, mental, spiritual, moral and social development’ (ART 27), ‘the enjoyment of the highest attainable standard of health …’ (ART 24), and to education (ART 28). The CRC goes beyond these basic goods, and seeks to afford children a broader set of welfare rights, including the right ‘to rest and leisure, to engage in play and recreational activities appropriate to the age of the child and to participate freely in cultural life and the arts’ (ART 31).

Since the 1950s, governments have assumed increasing responsibility for guaranteeing that children have adequate food, nutrition, shelter, and health care. In many countries, children benefit from general social welfare policies, which provide all citizens with universal health insurance and various forms of income support. Most industrialized countries also have partially socialized the cost of caring for children through children’s allowances, paid parental leave after the birth of a child, and public provision of day care. Many less economically developed countries are moving in the same direction, despite limited financial resources.

The laggard in this regard, given its capacity, is the United States. In the late 1900s child poverty actually grew in the United States. Resistance in the US reflects, in part, the belief of many political figures that childhood poverty is caused by ‘irresponsible’ adult behavior, such as having children out of wedlock, and that it is impossible to provide goods to children without also giving them to adults, thereby ‘rewarding’ the adults’ action. In addition, many politicians support the premise of libertarian theorists like Robert Nozick that there is no moral justification for the redistribution of wealth (Nozick 1974).

Even in countries that accept a welfare state, the level of support that a child should receive remains largely a political issue, not a legal right enforceable by courts. No matter how wealthy a country, tradeoffs between levels of socially provided health care, housing, education, and other welfare goods are now seen as inevitable. Courts, appropriately, are reluctant to get too deeply into these decisions. Since the scope of the claimed right is so vague, courts lack the legitimacy and authority to order elected governments to provide people with specific levels of well-being. In addition, some theorists contend that, in the political process,


framing political debates in terms of moral obligations that the citizens of a country pledge to provide to each other may prove more productive than trying to establish rights (O’Neill 1988, cf. Freeman 1992).

Another critical question is whether welfare rights should encompass the right to ‘equal opportunity.’ Opportunity focuses on each child’s options in adulthood. Social scientists have struggled to identify the goods that must be provided to children during childhood in order to equalize their opportunities as adults. Much of the focus is on education. But opportunity is also highly influenced by the quality of parenting a child receives and the social and cultural environment in which the child is raised. A right to equal opportunity is far more difficult to guarantee than a right to basic necessities and may require restricting parental rights (Fishkin 1983).

2.3 Protection Rights

A third category of claimed rights for children is protection from inadequate care from their caretakers, especially their parents. Since giving children more protection changes their relationship with their parents and the state’s relationship with parents and families, this right raises fundamental issues quite different from welfare rights.

The degree of justifiable state intrusion against a parent’s will remains highly debated. Should the state treat parents as ‘stewards’ or ‘trustees’ of their children, and seek to ensure that parents provide children a high level of care, or should the state be the protector ‘of last resort,’ intervening only if parental care is clearly harmful?

Most countries have decided that coercive intervention should be limited to situations where the actual or threatened harm to the child’s physical or emotional well-being is substantial. This approach reflects the judgement that unwanted intervention is a drastic step from the child’s, as well as the parent’s, perspective. It also is based on evidence that the care provided to children removed from their families sometimes is as bad or worse than the parental care (Wald 1982).

However, a few countries have adopted laws that seek to provide greater protection for children. Five countries have banned any corporal punishment of children; while these laws do not carry any significant sanctions for violation, they reflect a different conception of the rights of children and the role of the state. In recent years, evidence that early child-rearing can substantially affect a child’s school readiness and performance has led to increased calls for more extensive monitoring of how each child is being reared.

In the future, governments will continue to struggle with the appropriate grounds for protective intervention. The critical task, however, will be to develop and connect children to social programs that enhance all children’s well-being and future opportunities without resorting to coercive interventions. By the time protection rights must be invoked, the child has suffered substantial harm and the capacity of the state to rectify the situation is often limited (US Advisory Board 1991).

2.4 Liberty and Participation Rights

The fourth category of rights does focus on autonomy. In virtually all countries, persons under 18 lack the right to vote and to marry; they are restricted in making decisions about whether to work and in their freedoms of expression and other civil rights. In addition, in many countries children accused of criminal behavior generally are denied some due process rights available to adults in that country, for example, the right to a jury trial; these limitations have been justified as necessary to promote the rehabilitative ideals of the juvenile justice system.

In 1969, the US Supreme Court determined that children are entitled to at least some of the rights guaranteed by the free speech provision of the US Constitution (Tinker v. Des Moines Independent Community School District 1969). Since then the liberty rights of minors have been gradually increased in the US and many other countries. The CRC provides that: ‘(the) child shall have the right to freedom of expression; this right shall include the freedom to seek, receive and impart information and ideas of all kinds, regardless of frontiers, either orally, in writing or in print, in the form of art, or through any other media of the child’s choice’ (ART 13); ‘(s)tates parties shall respect the right of the child to freedom of thought, conscience, and religion’ (ART 14); ‘(s)tates parties recognize the rights of the child to freedom of association and to freedom of peaceful assembly’ (ART 15). Notably, the CRC does not provide adult status in other areas, such as the right to vote, marry, work, or to be free from constraints imposed only on children, such as the obligation to attend school, or to be at home by a certain hour (curfew laws).

Courts, legislatures, and the drafters of the CRC have not arrived at any consistent theory justifying granting children some rights and not others. Nor have they resolved the tension between claims for special protection and for full participation. If adolescents are seen as having the maturity and capacity to make decisions about their health care, to choose what they read or see, or to vote, does it follow that they should be held fully responsible for their antisocial behavior?

The critical issues relate both to capacities and to questions about developmental needs. For some rights, such as voting or marrying, it is generally accepted that some minimum level of competence is

Children, Rights of: Cultural Concerns

necessary to the proper exercise of the right. Children 3. The Future


often are denied other rights, such as unlimited access
to books or movies, on the assumption that early The situation of children throughout much of the
exposure to certain materials or ideas may be harmful world has been improved substantially during the
to them. twentieth century. Concepts of rights likely con-
An overriding question is whether it makes sense to tributed to these advances. The recognition that
assume that the same age line should govern access to children are independent human beings with claims to
all political and civil rights. Franklin Zimring has special attention and resources is a major advance
proposed treating children as possessing a ‘learner’s with respect to universal recognition of human rights
permit,’ whereby they are given increasing autonomy (Freeman 1992).
and responsibility over time (Zimring 1982). Different However, further development of children’s rights
rights might be available at different ages, based on will require resolving the value issues often masked by
best guesses about average capacity. Age lines are rights talk. Despite the unprecedented rapidity of the
inherently arbitrary; not all people attain capacities at ratification of the CRC, the fact that the contours of
the same age. But individualized decisions about many of the proposed rights are not very clearly
whether someone is old enough to vote, marry, or defined will make enforcement and monitoring by the
leave school are likely, in practice, to be even more international bodies problematic. In fact, the very
arbitrary, as is having a single age trigger all rights. vagueness of the obligations may account for the rapid
ratification and almost total support of the CRC,
despite the fact that many of the proposed rights
seemingly conflict with deeply held cultural and
religious values in many countries. Countries expect to
interpret the provisions in light of their customs and
2.5 Autonomy Within the Family
values.
The issues regarding autonomy rights are even more With respect to welfare rights, those concerned with
difficult with respect to the right of children to act helping children must recognize that realization of
independently of their parents’ wishes. The extreme most of the proposed rights requires policies that help
view, that children of any age should be free to make families, not just children. Children’s well-being gen-
their own decisions, has not received acceptance. In erally is inseparable from that of the parents. Greater
many countries giving children any autonomy within provision of welfare rights should lessen the occasions
the family would be incomprehensible. In other that protection rights will need to be invoked. How-
countries, there are debates about whether children ever, in establishing welfare rights countries need to
should have a right to make certain major decisions, ensure that when adults seek to act on behalf of
such as whether to have an abortion or receive children, their own needs do not take precedence.
treatment for substance abuse, without their parents’ With respect to all general social policies, countries
permission or knowledge. Advocates of giving chil- must think through the implications of the fact that
dren these rights contend that many children will be children, who may constitute a third to half of the
reluctant to seek out needed services if their parents population, cannot vote. The poorest children are
are informed. Children have also been given some further disenfranchised by the general powerlessness
rights to determine important aspects of their life, such of their parents. Some countries are making significant
as which parent to live with in the case of divorce. The efforts to include the voices of children in discussions
case for giving children rights is strongest with respect of issues related to them and to more general public
to decisions where parents may not be counted upon policy issues. The recognition of children’s right to
to generally act in their children’s best interests. speak for themselves could significantly alter the place
There is no easy resolution to the tension between of children in the state (Melton and Limber 1992).
parental rights and children’s rights. It is unrealistic to In the poorest countries, it will be very difficult to
assume that children can have total liberty rights, promote any form of children’s rights unless general
enforceable against their parents’ opposition. Asking poverty is alleviated. Many countries continue to deal
courts to enforce most such rights would be a waste of with problems of street children, child prostitution,
court resources and the orders would be unenforceable child labor, and infanticide. The advancement of
as a practical matter. But giving children some rights is children’s rights in these countries will be intimately
feasible, for example, access to drug treatment or linked to women’s rights, since it is unlikely that
abortion without parental permission. The question is children will have economic security and greater
whether the benefits of giving children autonomy with liberty if their mothers lack these rights. It will also
respect to the decision outweigh the harms of limiting require wealthier countries to demonstrate concern for
parental oversight of their children. These rights will children beyond national borders.
have to reflect the legal and social structure of each Children’s rights advocates sought to make the
country and should not be thought of as universal 1900s the century of the child. They were successful in
children’s rights. many respects. Future advances will require finding

Children, Value of

adequate balances for the tensions that have been United States Advisory Board on Child Abuse and Neglect 1991
identified, as well as eliminating the negative impacts Creating Caring Communities. US Department of Health and
on children of discrimination based on race, class, and Human Services, Washington, DC
gender. For most children, these latter factors may be Wald M 1982 State intervention on behalf of endangered
children. International Journal of Child Abuse and Neglect 6:
greater barriers to their rights and well-being than age. 3–45
Woodhouse B 1992 ‘Who Owns the Child?’: Meyer and Pierce
and the child as property. William and Mary Law Review 33:
995–1122
Zimring F 1982 The Changing Legal World of Adolescence. Free
Bibliography Press, New York/London
Alston P 1994 The best interests principle: towards a rec-
onciliation of culture and human rights. In: Alston P (ed.) The M. S. Wald
Best Interests of the Child. Clarendon Press, Oxford, UK
Armstrong M, Chuulu M S, Himonga C, Letuka P, Mokobi K,
Ncube W, Nhlapo T, Rwezania B, Vilakazi P 1995 Towards a
cultural understanding of the interplay between children’s and
women’s rights: an Eastern and Southern African perspective.
International Journal of Children’s Rights 3: 333–68
Campbell T 1992 The rights of the minor: as person, as child, as Children, Value of
juvenile, as future adult. In: Alston P, Parker S, Seymour J
(eds.) Children, Rights and the Law. Clarendon Press, Oxford, In demography, the term ‘value of children’ most
UK often refers to the benefits parents receive from having
Convention on the Rights of the Child 1989 G.A. Res 44/25
and rearing children. Benefits may accrue from the
U.N., GAOR, 44th sess., Annex, Supp. No 49 at 167 U.N.
Doc A/44/49
children themselves, from the experience of rearing
Coons J, Mnookin R, Sugarman S 1991 Puzzling over children’s them, or from the responses of kin, community, and
rights. Brigham Young University Law Review 307–50 society at large. Children also entail costs for parents
Eekelaar J 1986 The emergence of children’s rights. Oxford and the ‘value of children’ sometimes refers to their net
Journal of Legal Studies 6: 161–82 value (benefits less costs). Benefits and costs of children
Farson R 1974 Birthrights. Macmillan, New York are shaped by the economic conditions of life, by
Feinberg J 1980 The children’s right to an open future. In: Aiken forms of social organization, and by cultural beliefs
W, LaFollette H (eds.) Whose Child? Children’s Rights, and practices. The net value of children underlies
Parental Authority, and State Power. Rowman and Littlefield, parents’ desires for children; childbearing desires, in
Totowa, NJ combination with the ability to achieve them, determine
Fishkin J S 1983 Justice, Equal Opportunity, and The Family. whether or not individuals or couples have
Yale University Press, New Haven, CT
children and how many children they have.
Freeman M 1992 Taking children’s rights more seriously. In:
Alston P, Parker S, Seymour J (eds.) Children, Rights and the
Law. Clarendon Press, Oxford, UK
Fried C 1978 Right and Wrong. Harvard University Press,
Cambridge, MA 1. Theoretical Development
Goldstein J, Freud A, Solnit A J 1973 Beyond the Best Interests
The economic value of children is a key component of
of the Child. Free Press, New York
Hawes J 1991 The Children’s Rights Movement: A History of
fertility variation and change. In agricultural economies
Advocacy and Protection. Twayne Publishers, Boston and during early periods of industrialization,
International Journal of Children’s Rights 1995 Special Issue: parents and kin make the heavy investment of time
Multiculturalism and The Rights of the Child 3: 1–144 and money in young children in order to reap the
Melton G, Limber S 1992 What children’s rights mean to rewards of children’s labor from adolescence onward
children: children’s own views. In: Freeman M, Vreeman R (Schultz 1973). The old-age security value of children
(eds.) The Ideologies of Children’s Rights. Kluwer Academic is particularly important in contexts where no public
Publishers, Dordrecht, The Netherlands provisions exist for elder care (Cain 1985). For large
Meyer v. Nebraska 1923 United States Supreme Court Reports numbers of people throughout the world, the econ-
262: 390–403 omic value of children continues to be a primary
Nozick R 1974 Anarchy, State and Utopia. Basil Blackwell, benefit of parenthood.
Oxford, UK
For most parents in industrial societies, however,
O’Neill O 1988 Children’s rights and children’s lives. Ethics 98:
445–63
children provide very little or no economic value.
Purdy L 1992 In Their Best Interest? Cornell University Press, Children’s labor does not produce subsistence for the
Ithaca, NY family nor do children often support their parents in
Schlossman S L 1977 Love and the American Delinquent. old age. Children are extremely costly to raise and
University of Chicago Press, Chicago often require economic support well into the young
Tinker v. Des Moines Independent Community School District adult years as they complete their formal education.
1969 United States Supreme Court Reports 393: 503–26 The location and organization of paid employment


produces high opportunity costs of childrearing for outweigh the social and psychological costs of parent-
parental employment. Fertility declines in the twen- hood. On the other hand, the kin and community ties
tieth century have been attributed in large part to produced by children may be indirect sources of
declines in the economic value and increases in the economic benefits. Several of the benefits identified as
economic costs of children to their parents (Fawcett psychological also have social components: primary
1983). group ties include the parents’ relationship, itself a
Of course, scholars have long recognized that the form of social capital; expansion of self includes ties to
value of children extended beyond their economic larger social groups, including kin, community, and
benefits. Hoffman and Hoffman (1973) identified the society.
following psychological needs fulfilled by parenthood
and the experience of childrearing: (a) adult status and
social identity; (b) expansion of the self (tie to a larger 2. Measuring the Value of Children
entity, ‘immortality’); (c) morality (including religion,
altruism, good of the group); (d) primary group ties, The economic value and costs of children could be
affiliation; (e) stimulation, novelty, fun; (f ) creativity, measured indirectly by estimating the value of chil-
accomplishment, competence; (g) power, influence, dren’s labor and transfers to parents over the life
effectance; and (h) social comparison, competition. course, expenditures related to childrearing, and the
The influence of such benefits on fertility would value of parental time that might otherwise be spent
depend on the availability of alternative mechanisms on income-producing activities (e.g., Rosenzweig
for fulfilling the identified needs. They also recognized 1978). Similar estimates of social or psychological
that childrearing carried with it potential psycho- values have not been attempted, and it could be argued
logical costs such as stress and worry. that a shared metric for economic, social, and psycho-
Recent theoretical arguments reject lists of psycho- logical values does not exist. Even if it were possible to
logical benefits as sufficient to explain why people measure the true net value of children, parents’
continue to have children when they become economic perceptions of those values are what enter into fertility
liabilities. One alternative theory claims that the decisions.
reduction of uncertainty is the primary goal of The term ‘value of children’ is most closely asso-
parenthood (Friedman et al. 1994). Those with limited ciated with a set of surveys conducted in the mid-1970s
access to other means for uncertainty reduction, such in the United States, the Philippines, South Korea,
as stable careers and marriages, will want to become Taiwan, Indonesia, and Thailand (Fawcett 1983).
parents. Few of the hypotheses derived from this Although based on the psychological values of chil-
theory stand up to empirical scrutiny and those that do dren discussed above, the surveys also contained
can be explained by the economic and social op- information on the perceived economic and social
portunity costs of children (Lehrer et al. 1996). benefits and costs of children. Respondents were asked
Building on the work of cultural anthropologists open-ended questions about reasons for having or not
and sociologists, Schoen et al. (1997) proposed the having children and were asked to rate the importance
theory that children produce social capital for parents. of several lists of child benefits and costs as reasons to
Social capital consists of social relationships and the have or not have children or their next child. Analyses
social resources they provide. At birth, children of these data remain the primary source of current
strengthen parents’ ties to kin. Through schooling and knowledge about variation in the value of children.
other activities, they link parents with community Measures of the benefits and costs of children are
resources. As adolescents and young adults, children also the basis for subjective-expected-utility or expec-
bring new information, ideas, and social relationships tancy-value models of fertility decisions (e.g.,
to parental households. And most parents eventually Davidson and Jaccard 1979). Respondents are asked
obtain in-laws and grandchildren as a consequence of to rate the value of possible outcomes of having or not
parenthood. Parents may also, of course, incur loss of having a child and also the subjective probability that
social capital because they have less time to maintain having or not having a child will produce the outcome.
friendships, relationships with co-workers, or connec- The product of those two responses comprises the
tions to social and political organizations. importance of a given outcome as a reason to have or
While the distinctions among economic, psycho- not have a child. Almost all of the research based on
logical, and social benefits and costs of children make these models has focused on the total expected utility
some sense (and also fit disciplinary boundaries), it is of having a child, rather than on the relative import-
important to recognize their interrelationships. Econ- ance of specific child benefits and costs.
omic subsistence comes first in a list of parental needs, More recently, several European Fertility and Fam-
so the economic value of children may be sufficient to ily Surveys and the US National Survey of Families
stimulate parenthood, whether or not children have and Households used structured importance ratings to
any social or psychological value. Social and psycho- measure the value of children to parents. In these
logical benefits may be viewed as extras that make the surveys, the rating scales are more extensive than the
task of childrearing less burdensome, or at least three-point scale used for the earlier VOC surveys, but


the questions do not cover all of the theoretical benefits family systems and where men have greater access
and costs discussed above. Analyses of these data to than women to economic opportunities. Daughters
date have combined ratings of social and psychological are valued more than sons for household and childcare
benefits of children into a single scale (e.g., Schoen et help and companionship. In patrilineal societies, the
al. 1997). values associated with sons are produced for the most
part in adulthood, those associated with daughters
during childhood, consistent with the practice of
daughters’ leaving the parental home upon marriage.
3. Variations in the Value of Children Many of the benefits of children, particularly social
and psychological benefits, do not appear to differ for
sons and daughters.
3.1 The Value of First and Later-born Children
First, second and higher-order births are associated
with distinct benefits and costs (Bulatao 1981, Fawcett
1983). The first child confers the status of parenthood, 3.3 Socioeconomic Variation in the Value of
so that benefits associated with parenthood per se can Children
be acquired by having only one child. Adult status,
relationship stability, parent-child interaction, and kin As noted above, the economic value of children is
connections are all cited as primary reasons for associated with agricultural and household economic
becoming a parent. The first child is also associated production and the absence of social insurance for
with the greatest increase in opportunity costs, that is, elderly parents. Although children also entail high
constraints on parental time and energy. economic costs in such contexts, they are necessary for
The most important value of a second child is to survival. Industrialization and urbanization reduce
provide a sibling for the first (Bulatao 1981). Second the economic value of child labor and increase the
children may further strengthen partnership and kin costs of rearing children to be economically inde-
ties and provide additional opportunities for reward- pendent. Economic development eventually leads to
ing parent-child interactions. The value of higher- the development of social insurance that reduces
order births is predominantly economic—each child further the economic value of children for support in
contributes additional labor or economic security for old age. As a result, children’s net economic value is
parents. Restrictions on parents’ time are also asso- perceived to be lower in industrialized wealthy coun-
ciated with higher birth orders, but at a diminishing tries than in poorer countries with a greater depen-
rate. Financial costs of children become more salient dence on agricultural and household production
to parents at the birth of fourth and higher-order (Fawcett 1983).
siblings. Psychological benefits and costs of children are
reported to be more important in industrial and
postindustrial settings than in agricultural settings
(Fawcett 1983). This difference may arise because of
the priority of economic survival over psychological
3.2 Gender and the Value of Children
wellbeing, that is, children may provide psychological
Overall, men and women perceive the values and costs benefits (and costs) for parents in all settings, but these
associated with children in much the same way. The components of child value become salient only when
few differences that are observed are consistent with children become irrelevant to economic survival. On
traditional gender roles (Fawcett 1983). Men are on the other hand, the increasing complexity and im-
average more concerned than women about the personal character of daily life in industrialized urban
financial costs of childrearing and about having sons societies may produce a greater need for the love and
to continue the family name. The latter difference is companionship and stimulation of children. At the
particularly pronounced in patrilineal societies. same time, psychological costs of childrearing may
Women place greater importance than do men on the increase because kin and community take less re-
work and strain of raising children, the opportunity sponsibility for the supervision and care of children.
costs of children for other activities, and the benefits of Because the social capital value of children has only
children for the marital relationship; the last difference recently been introduced into theoretical discussions
is also larger in patrilineal than in bilineal kinship of the value of children, it is difficult to know how such
systems. values might vary under different economic condi-
Differences in the perceived benefits of daughters tions. In agricultural settings there may be a stronger
and sons are also related to differences in the roles and association between economic and social ties so that
behaviors of men and women (Fawcett 1983). Sons are the latter are not distinguishable from the former.
valued more than daughters for kinship ties, that is, to Only in industrial and urban societies may social
continue the family name, and for financial assistance. capital be sufficiently separable from financial capital
These values are especially pronounced in patrilineal to identify it as a separate source of child value.


Socioeconomic variation in the value of children is logical benefits and restrictions on parental activities
also evidenced across families within societies are associated with a small family size (Fawcett 1983).
(Fawcett 1983). A consistent finding from VOC The relationship between perceived child values and
surveys was that urban respondents place a lower fertility is not, however, as strong as some scholars had
economic value and a higher emotional value on hoped, and the financial and time/effort costs of
children than do respondents living in rural areas. children are not associated with family size.
Similarly, education is inversely associated with chil- Relatively moderate associations between values of
dren’s economic value and directly associated with children and completed fertility should not be surpris-
their emotional value as well as with perceived ing. Specific values are associated with particular
restrictions or opportunity costs of parenthood. Direct numbers of children, not consistently with large or
financial costs and childcare stresses do not vary small numbers. Child values can influence only desired
substantially across countries or across individuals in fertility, so that the relationship between values and
different economic circumstances. Increasing econ- outcomes depends on the degree of fertility control. In
omic status is associated with desire for increased child addition, measures of the perceived value of children
‘quality’ which means greater financial and time/energy have often been relatively crude (e.g., 3-point response
investments in each child. Thus, the perceived scale). When a particular parity progression is specified,
cost of childrearing, other than opportunity costs, when contraception is pervasive, and when
remains essentially the same across socioeconomic precise measures of the expected value (net of cost) of
levels. children are generated, very high correlations are
observed with birth intentions and eventual births
(e.g., Davidson and Jaccard 1979).
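The expectancy-value logic behind such measures can be illustrated with a minimal sketch. It assumes, following the description of subjective-expected-utility models earlier in this article, that the importance of each outcome is the product of a value rating and a subjective probability. It further assumes that the total expected utility of having a child is the sum of those products, which is a common convention rather than a detail stated here. The rating scale, the outcome labels, and the names OutcomeRating, outcome_importance, and total_expected_utility are hypothetical and are not drawn from any survey instrument.

```python
from dataclasses import dataclass


@dataclass
class OutcomeRating:
    """One respondent's ratings for a single possible outcome (hypothetical structure)."""
    outcome: str
    value: float        # how good or bad the outcome is, on an assumed -3..+3 scale
    probability: float  # subjective probability (0..1) that having a child produces it


def outcome_importance(rating: OutcomeRating) -> float:
    # Importance of one outcome as a reason to have (or not have) a child:
    # the product of the value rating and the subjective probability.
    return rating.value * rating.probability


def total_expected_utility(ratings: list[OutcomeRating]) -> float:
    # Assumed aggregation: sum the value-by-probability products over all outcomes.
    return sum(outcome_importance(r) for r in ratings)


# Illustrative, made-up ratings for one respondent.
ratings = [
    OutcomeRating("companionship", 3.0, 0.5),
    OutcomeRating("financial strain", -2.0, 0.75),
    OutcomeRating("less free time", -1.0, 1.0),
]

for r in ratings:
    print(r.outcome, outcome_importance(r))
print("total expected utility:", total_expected_utility(ratings))
```

Keeping the per-outcome products separate from the total mirrors the distinction drawn above between the relative importance of specific child benefits and costs and the total expected utility of having a child.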

3.4 Culture and the Value of Children


See also: Demographic Transition, Second; Family
Broad cultural values may also serve as sources of Size Preferences; Family Theory: Complementarity of
specific or general values of children. Religious institu- Economic and Social Explanations; Family Theory:
tions and beliefs may support the value of children for Economics of Intergenerational Relations; Family
social and psychological benefits. For example, Cath- Theory: Role of Changing Values; Fertility Tran-
olicism is viewed as a support for large-family values sition: Cultural Explanations; Fertility Transition:
in the Philippines, Confucianism for the high value of Economic Explanations; Reproductive Rights in
sons to carry on the family name in some Asian Affluent Nations
countries. The relative values of daughters and sons
are associated with broad cultural values on gender
equality (Fawcett 1983).
Lesthaeghe (1983) argued that ideational change is
an independent force underlying current low fertility Bibliography
in Western countries. He identifies the two most salient Bulatao R A 1981 Values and disvalues of children in successive
features of this change as secularization and individu- childbearing decisions. Demography 18: 1–25
ation. Secularization allows more latitude to indi- Cain M 1985 Fertility as an adjustment to risk. In: Rossi A S
vidual morality, individuation stresses the importance (ed.) Gender and the Life Course. Aldine, New York
of personal self-actualization. Using national surveys Davidson A R, Jaccard J J 1975 Population psychology: A new
of social and family values, Lesthaeghe and co- look at an old problem. Journal of Personality and Social
workers (e.g., Lesthaeghe and Meekers 1986) Psychology 31: 1073–82
distinguished two dimensions of family values: a Fawcett J T 1983 Perceptions of the value of children: Satisfactions
nonconformity dimension linked to partner relationships and costs. In: Bulatao R A, Lee R D (eds.) Determinants
and nonmarital childbearing; and the ‘meaning of of Fertility in Developing Countries, Vol. 2: Supply and Demand
for Children. Academic Press, New York
parenthood,’ including beliefs that children are neces-
Friedman D, Hechter M, Kanazawa S 1994 A theory of the
sary for ‘fulfillment’ and for marital success. Measures value of children. Demography 31: 375–401
of secularism and individuation were strongly associated Hoffman L W, Hoffman M L 1973 The value of children to
with nonconforming family values, but only parents. In: Fawcett J T (ed.) Psychological Perspectives on
weakly associated with the meaning of parenthood. Population. Basic Books, New York
Lehrer E L, Grossbard-Shechtman S, Leasure J W 1996 ‘A
theory of the value of children’—Comment. Demography 33:
133–9
4. The Value of Children and Fertility Lesthaeghe R 1983 A century of demographic and cultural
change in Western Europe: An exploration of underlying
Studies using ratings such as those in the Value of dimensions. Population and Development Review 9: 411–35
Children Surveys have generally found that high Lesthaeghe R, Meekers D 1986 Value changes and the dimen-
perceived economic benefits of children are associated sions of familism in the European Community. European
with a large family size. The importance of psycho- Journal of Population 2: 225–68

Children’s Play: Educational Function

Rosenzweig M R 1978 The value of children’s time, family size dren’s play; the animal stage (children’s climbing and
and non-household child activities in a developing country: swinging); the savage stage (hunting, tag, hide-and-go-
Evidence from household data. In: Simon J L (ed.) Research in seek); the nomad stage (keeping pets); the agricul-
Population Economics: An Annual Compilation of Research,
tural–patriarchal stage (playing with dolls, digging in
Vol. 1.
Schoen R, Kim Y J, Nathanson C A, Fields J, Astone N M 1997 sand); and the tribal stage (team games). They also
Why do Americans want children? Population and Development believed that play served as an outlet for the catharsis
Review 23: 333–58 or release of unnecessary, primitive racial instincts,
Schultz T 1973 The value of children: An economic perspective. thereby preparing individuals for the intellectually
Journal of Political Economy 81: S2–S13 advanced activities of the modern era. For example,
Patrick suggested that contemporary occupations
E. Thomson required abstract reasoning, concentrated attention,
and coordinated eye-hand activities, all of which were
presumed to be relatively recent evolutionary acquisitions.
Because this work tapped recently acquired
Children’s Play: Educational Function skills, it was more taxing than physical labor. As such,
relief from fatigue could be gained through play, or the
practice of ‘racially old’ activities (e.g., hunting,
Until recently, many writers have considered chil-
fishing).
dren’s play to be a trivial and inconsequential activity;
According to pre-exercise or practice theorists such
they have also disagreed on its definition. Today,
as Groos, the period of childhood existed so that the
however, social scientists and educators appear vo-
organism could play. Humankind’s relatively long
racious in their study of children’s play. To many
period of immaturity was considered necessary to
researchers, play is viewed as a generative force in
allow for children to practice the instinctively-based
children’s social, emotional, and cognitive develop-
complex skills that would be essential for survival in
ment (see Rubin et al. 1983). The developmental and
adulthood. Thus, the adaptive function of play was to
educational significance of play in childhood is dis-
prepare children to perfect skills that they would
cussed herein.
require in adulthood.
Of further interest, Groos noted that children’s play
1. Classical Theories of Play comprised ‘don’t have to’ activities—while playing,
children are more interested in the processes rather
Contemporary research on the topic of children’s play than the products of the behavior. In this regard,
draws heavily from early theoretical accounts about recent speculations concerned with effectance mo-
the functional significance of the phenomenon. The tivation can be traced back to Groos’ writings.
‘surplus energy’ theory characterized play as ‘blowing Importantly, Groos noted that children’s play
off steam’. Schiller, an eighteenth-century philosopher changed with development. First, there was exper-
defined play as the aimless expenditure of exuberant imental play, which included sensory and motor
energy. He wrote that work satisfied the primary needs practice activities. Such play evolved into constructive
of the human species. Once these needs were met, the play and the practice of higher mental powers. The
superfluous energy that remained resulted in play. purpose of such activity was to aid in the development
Because children were not responsible for their own of self-control. Second, there was socionomic play,
survival, they were thought to have a total energy which included fighting and chasing, as well as
‘surplus,’ which was depleted through play. Schiller imitative, social and family games (dramatic play).
raised a number of contemporary issues in his writings. This form of play was thought to aid in the de-
First, he considered play a symbolic activity through velopment of personal relationships.
which the participant could transform and transcend Despite obvious limitations, not the least of which
reality and thereby gain new symbolic representations are limited supportive, empirical bases, these early
of the world. Second, he distinguished between forms theories continue to have an impact on how many
of play—material superfluidity resulted in physical people think about play today. For example, parents
play, esthetic superfluidity culminated in dramatic– and teachers are often heard to express reservations
symbolic play. Collectively, these notions reappear in about poor weather conditions that might keep chil-
the later writings of Piaget, Vygotsky, and Bu$ hler. dren indoors all day. The traditionally held belief is
According to the relaxation theorists such as that such restriction constrains the expenditure of
Lazarus, the purpose of play was to restore energy surplus energy.
expended in work. Thus, labor was viewed as energy
consuming, resulting in an energy deficit. This deficit 2. Modern Theories of Play
could be replenished through rest or sleep, or by
engaging in play. Several common themes run across modern twentieth-
Subsequently recapitulation theorists posited that century interpretations of children’s play. These in-
cultural epochs were repeated sequentially in chil- clude the belief that: (a) children need to play in order


to express themselves or to relieve themselves of anxieties and fears; and (b) play both causes and reflects developments in social, cognitive, and linguistic prowess.

2.1 Psychoanalytic Theory

Freud (1961) believed that play provided children with important avenues for the expression of wish fulfillment and the mastery of traumatic events. He argued that play allowed the child to transcend the rigid sanctions of reality, thereby serving as a safe context within which the child could vent socially unacceptable impulses. Freud addressed the mastery aspect of play through the repetition compulsion, a psychic mechanism that allows individuals to cope with traumatic events through a compulsive repetition of components of the disturbing events. Children were thought to use play to become active masters of situations in which they were once passive victims. For example, if a young child is scolded by her mother for not tidying up her room, the child may re-enact the scene numerous times with a doll, casting herself in the role of angry mother.

More recently, psychoanalytic theorists have expanded on Freud’s conceptualizations of play. For example, with regard to wish fulfillment, Peller noted that children’s choices of roles are often based on feelings of love, admiration, fear, or anger for a particular person. Such roles allow children to fulfill the wish to be like certain others. With regard to mastery, Erikson believed play served to allow the child to integrate biological and social spheres of functioning. Through play, children create model situations in which aspects of the past are re-lived, the present represented and renewed, and the future anticipated.

2.2 Piaget

Piaget (1962) suggested that play represented the purest form of assimilation. In assimilation, children incorporate events, objects, or situations into existing ways of thinking. Thus, as ‘pure assimilation,’ play was not considered an avenue to cognitive growth, but rather as a reflection of the child’s present level of cognitive development.

Borrowing largely from Karl Bühler and Spencer, Piaget described three stages in the development of play. Practice play first appeared in infancy, consisting of sensorimotor actions (e.g., clapping hands). Through this ‘functional exercise,’ Piaget believed that children acquired and honed the basic motor skills inherent in their everyday activities. Symbolic play, appearing around the second year, required an implied representation of absent objects (e.g., pretending to bake a cake while in the sand box). In contrast with practice play, where actions were exercised and elaborated for their functional value, symbolic play allowed the exercise of actions for their representational value. Games-with-rules was the last structural category to develop; this type of play activity necessarily incorporated social coordination and a basic understanding of social relationships.

2.3 Vygotsky

Like Piaget, Vygotsky (1967) framed play within a larger psychological theory of children’s cognition. Vygotsky argued that children used symbolic play as an essential link in the association of abstract meanings and their associated concrete objects. Symbolic play was useful in allowing children to conceive of meanings independently of the objects that they may represent. Thus, unlike Piaget, Vygotsky argued that play was not a reflection of egocentrism, but rather a symbolic process that brings into being the mediating role of signs. For example, when children first begin to use words, the word is perceived as a property of the object rather than as a sign denoting the object. The child grasps the external structure of the word–object relation earlier than the internal structure of the sign-referent relationship. The fusion of word with object follows the more general fusion of action and object. In infancy, the child relates to things as ‘objects of action.’ For higher mental processes to develop, things must become ‘objects of thought’ and practical actions must become mental representations (e.g., volitional choice). Play precipitates this emancipation of meaning from object to action. The central event responsible for the emancipation is the use of one object (e.g., a stick) to substitute for another (a real space ship), or the use of one action (a jump) to denote another (a space launch). To Vygotsky, movement in the field of meaning is the predominant feature of play.

2.4 Other Modern Theories

Drive modulation theorists proposed that excessively high and excessively low levels of stimulation are aversive; play is used as a means of modulating the arousal associated with this aversion. For example, when confronted with a novel object, specific exploration allowed the child to explore its features and relieve arousal through increasing familiarity with the object. Following specific exploration, an optimal level of arousal is sought through diverse exploration, or stimulus-seeking activity. This latter form of exploration, or play, increases stimulation when we are ‘bored’ and continues until arousal reaches an optimal level. Thus, play is viewed as a stimulus-producing activity that is generated by low levels of arousal.

Recent theoretical accounts have emphasized the role of play, particularly pretense, in the development of children’s theory of mind (e.g., Lillard 1998), including the provision of opportunities to build and expand mental representations. Further, educational


thinkers have posited that creativity and flexibility are promoted by children’s play (e.g., Bruner 1972). Accordingly, play allows the exploration of new combinations of behaviors and ideas within a psychologically safe milieu. Through play, children develop behavioral ‘prototypes’ that may be used subsequently in more ‘serious’ contexts. For example, a young child may button and unbutton her doll’s dress many times, and thereafter incorporate her accomplishments from this play session when dressing herself. As such, the means become more important than the ends, and since accomplishing goals is not important in play, children are free to experiment with new and unusual combinations of behavior.

Finally, linguists have proposed ways in which play may help children perfect newly acquired language skills and increase conscious awareness of linguistic rules. Play provides a superior context within which children may gain valuable language practice as they experiment with the meaning, the structure, and the function of language (Davidson 1996). Play conversations also work to improve communication skills. These skills, in turn, are important components of many developmental acquisitions attained during childhood, particularly narrative representation, social cognition, intersubjectivity, and fantasy play. Goncu (1993), for example, has suggested that the improvisational processes typical of social pretend play are critical to the development of intersubjectivity (i.e., the development of a mutual understanding between play participants). These processes prepare children for an ever increasingly complex social life within which a variety of interactional contexts exist that range from the more ritualized and structured to the more improvisational. Within each of these interactional contexts, however, are elements of both improvisation and social structure (Sawyer 1997). Thus, among the unique properties of children’s peer play are its framed and improvisational nature, each of which the child must master.

3. Issues in Defining Play

It is one thing to think about why play exists in the human repertoire; it is something else altogether to define it. Following from Rubin et al. (1983), the following characteristics, when taken together, define play.
(a) Play is not governed by appetitive drives, compliance with social demands, or by inducements external to the behavior itself; instead play is intrinsically motivated.
(b) Play is spontaneous, free from external sanctions, and its goals are self-imposed.
(c) Play asks ‘What can I do with this object or person?’ This question differentiates play from exploration, which asks ‘What is this object (or person) and what I do with it (him or her)?’
(d) Play is not a serious rendition of an activity or a behavior it resembles; instead it consists of activities that can be labeled as pretense.
(e) Play is free from externally imposed rules; this characteristic distinguishes play from games-with-rules.
(f) Play involves active engagement; this distinguishes play from daydreaming, lounging, and aimless loafing.

4. Developmental Progressions in Children’s Play

Between the periods of infancy and middle childhood, children’s play undergoes an evolution in both form and content. These progressions are reviewed briefly below.

4.1 Infant and Toddler Play

By the end of the first year, infants begin to demonstrate rapid growth in representational thinking. Decontextualized behavior, which is first demonstrated in the first year, involves the ‘out of context’ production of familiar behavior. For example, the infant may close her eyes, put her head on a pillow, and lie in a curled position at a time of day (e.g., mid-morning), and in a context (e.g., playground) that is detached from the situational context when and where sleeping or napping occurs. By the middle of the second year, the toddler coordinates the use of several objects in his or her demonstrations of decontextualized behavior (e.g., a ‘Teddy Bear’ is fed from an empty cup).

This latter use of objects in pretense captures the essence of the second developmental component of play, self-other relationships. When pretense appears at about 12 months, it is centered around the child’s own body (e.g., the child feeds herself). Roughly between 15 and 21 months, play becomes other-referenced; however, the ‘other’ is typically an inanimate object, as in the ‘Teddy Bear’ example noted above. Moreover, during this period, when others are involved in pretense activities, they are passive recipients of the child’s behavior. Beyond 20 months, and increasingly so up to about 30 months, the child gains the ability to ‘step out’ of the play situation and to manipulate the ‘other’ as if it were an active agent (e.g., the Teddy Bear ‘feeds’ a doll with a plastic spoon). The developmental significance of these accomplishments should not be underestimated. Advances in maturity of play reflect the young child’s increasing ability to symbolically represent things, actions, roles, and relationships.

A third component of play is the use of substitute objects. The ability to identify one object with another (e.g., a stick is used as a laser gun) is paradigmatic of symbolic representation. The fourth component of


play is the coordination and sequencing of pretense. Between the ages of 12 and 20 months, toddlers’ pretend acts become increasingly coordinated into meaningful sequences. At first, the child produces a single pretend gesture (drinking from a plastic cup); later, the child relates, in succession, the same act to the self and then to others (drinks from the cup, feeds the Teddy Bear from the cup). Subsequently, in a multi-scheme combination, the young child is able to coordinate different sequential acts (pours tea, feeds the Teddy Bear, puts bear to sleep). By the end of the second year, children indicate verbally that these coordinated sequences are planned prior to execution (child self-verbalizes sequence of pretense behavior prior to acting).

4.2 The Play of Preschoolers and Elementary School-age Children

The above-noted constituents of play are mastered prior to or near the child’s second birthday. These elements of play become increasingly shared with others as children mature. But why is shared or social pretense important? As noted above, there are several functions of sociodramatic play. Such play creates a context for mastering the communication of meaning. It provides opportunities for children to learn to control and compromise; these opportunities arise during discussions and negotiations concerning pretend roles and scripts and the rules guiding the pretend episodes. Also, social pretense allows for a ‘safe’ context within which children can explore and discuss issues of intimacy and trust.

By 36 months, children are generally able to communicate pretend scripts to adults and peers; by five years, they can discuss, assign, and enact play themes while continuing to add novel components. By the middle years of childhood, social pretend becomes a venue for self-disclosure and the sharing of confidences, especially among close friends.

5. Correlates and Outcomes of Play

5.1 Play and Cognitive Development

Children (three- to five-year-olds) who engage frequently in sociodramatic and constructive play tend to perform better on tests of intelligence than their age-mates who are more inclined to play in a sensorimotor fashion. Interestingly, children who frequently play in a constructive fashion (e.g., building things; constructing puzzles) are likely to be proficient at solving convergent problems (problems with a single solution). Those who frequently play in a dramatic fashion are likely to be proficient at solving divergent problems (problems with multiple solutions).

5.2 Play and Social Development

Because successful participation in social pretense requires many of the skills theorized to be associated with the achievement of competent peer relationships, this type of play is viewed as a marker of social competence from toddlerhood to the middle and late childhood years. Preschoolers who frequently engage in sociodramatic play are more socially skilled than their age-mates who infrequently engage in such activity. Moreover, results from various training studies indicate that instruction in sociodramatic play is associated with increases in cooperation, social participation, and role-taking skills (see Rubin et al. 1983 for a review).

5.3 Play and Language Development

The mechanisms by which play may aid in the development of linguistic competencies are straightforward. Children frequently play with the different forms and rules of language. This play may take the form of repeating strings of nonsense syllables (phonology), substituting words of the same grammatical category (syntax), or intentionally distorting meaning through nonsense and jokes (semantics). As a result, language play may help children perfect newly acquired language skills and increase conscious awareness of linguistic rules, as well as provide a superior context in which the child may gain valuable language practice.

Generally speaking, particular phases in the development of symbolic play and language tend to co-occur. For example, sociodramatic play appears to be an important factor in oral language development and vocabulary, story production, story comprehension, communication of meaning, and the early development of literacy (Davidson 1996, Shore 1995). Indeed, it has been reported that training children to engage in pretense with others improves their language skills, literacy development, and mathematical thinking (see Fromberg and Bergen 1998 for relevant reviews).

6. Summary and Conclusions

It is clear that play is a developmental phenomenon of significant proportion. Not only does it seem to provide a window into the child’s cognitive and socio-emotional being, but it also appears to be a propelling force for the development of cognitive, language, and socio-emotional skills. Thus, play should be considered an informal, enjoyable, and relatively stress-free means of providing children with intellectual and social stimulation.


See also: Cognitive Development in Childhood and Adolescence; Infant and Child Development, Theories of; Personality Development in Childhood; Play and Development in Children; Play, Anthropology of; Social Cognition in Childhood; Socialization and Education: Theoretical Perspectives

Bibliography
Bruner J S 1972 The nature and uses of immaturity. American Psychologist 27: 687–708
Davidson J I F 1996 Emergent Literacy and Dramatic Play in Early Education. Delmar, Albany, NY
Freud S 1961 Beyond the Pleasure Principle. Norton, New York
Fromberg D P, Bergen D 1998 Play from Birth to Twelve and Beyond: Contexts, Perspectives and Meanings. Garland, New York
Goncu A 1993 Development of intersubjectivity in social pretend play. Human Development 36: 185–98
Howes C 1992 The Collaborative Construction of Pretend. State University of New York Press, New York
Lillard A S 1998 Playing with a theory of mind. In: Saracho O N, Spodek B (eds.) Multiple Perspectives on Play in Early Childhood Education. State University of New York Press, Albany, NY, pp. 11–33
Piaget J 1962 Play, Dreams, and Imitation in Childhood. Norton, New York
Rubin K H, Fein G G, Vandenberg B 1983 Play. In: Mussen P H (ed.) Handbook of Child Psychology: Socialization, Personality and Social Development. Wiley, New York, Vol. 4, pp. 693–774
Sawyer R K 1997 Pretend Play as Improvisation. Conversations in the Preschool Classroom. Lawrence Erlbaum Associates, Mahwah, NJ
Shore C 1995 Individual Differences in Language Development. Sage, Thousand Oaks, CA
Vygotsky L S 1967 Play and its role in the mental development of the child. Soviet Psychology 12: 62–76

K. H. Rubin

China: Sociocultural Aspects

China as a political entity must be distinguished from China conceived in cultural terms. Within the political boundaries of China are many groups that are not generally considered to be ethnically or culturally Chinese (Han), although they are Chinese citizens. By the same token there are significant populations generally considered to be ethnically Chinese living outside China’s political boundaries, the ‘overseas Chinese’ (huaqiao). This article discusses China’s non-Han peoples briefly, but focuses primarily on ethnic Chinese or Han peoples who constitute the vast majority of China’s population. Topics addressed are ethnicity and identity, scale and complexity, local communities, the local impact of imperial and state institutions, family and kinship, religion and ritual, and gender. Throughout, attention is drawn to the conceptual challenges China poses for the discipline of anthropology.

1. Ethnicity and Identity

China’s government officially recognizes 56 minzu or ‘nationalities,’ based on an evolutionary scale of material progress derived from Lewis Henry Morgan and Friedrich Engels. Some groups, notably Tibetans and Uighurs of Western China, and Mongols in the north, have historical claims to large geographical areas, and aspirations for political autonomy. Other groups (for example the ‘Miao,’ ‘Zhuang,’ and ‘Yi’) have been dispersed among Han and other minzu over many provinces in South, Central, and Southwestern China. Official recognition as a nationality confers certain legal privileges (most famously, exemption from the ‘one-child’ policy of population control), but also implies backwardness and legitimates paternalistic treatment from the government. In this respect, China’s current government continues a self-consciously ‘civilizing’ mission with respect to its non-Han populations that dates to imperial times (Harrell 1995).

China’s ‘Han’ peoples are by no means culturally homogeneous, but scholars are divided as to the degree to which Han Chinese can be said to share a common culture. For example, only about half of Han peoples are native speakers of ‘Mandarin’ or ‘the common language’ (putonghua). The rest speak one of about a dozen major ‘dialects’ (related languages, but often mutually unintelligible). Broadly speaking, the provinces of Southern and Southeastern China manifest the greatest linguistic diversity among Han Chinese. This diversity reflects the fact that these areas were incorporated into the Chinese empire somewhat later than North and Central China. In addition, the preservation of regional cultures in the South and Southeast results from the fact that these areas have been relatively less frequently beset by dramatic population displacements caused by dynastic crises, droughts, floods, and other disasters than have areas in the North and West.

Mitigating regional and linguistic diversity is the fact that Han Chinese of all provinces share a common written language, elite traditions, and a long history of political unification under a succession of dynasties and into the era of nationalist and, now, communist governments. Moreover, although there is no clear consensus as to the precise defining characteristics of Chinese culture, many Chinese communities are marked by similar family institutions, popular religious customs, class relations, forms of corporate association, and broadly Confucian values emphasizing, most importantly, filial piety. Yet the issue of


China’s cultural unity, or lack thereof, has long vexed anthropologists, and is increasingly becoming one of political moment in the Sinocentric world. Among China’s national minorities, for example, there are movements among Tibetans and Uighurs for independence. By the same token, regional and dialect-based communalism is strong among Han Chinese in the Southeast.

At the time of writing, the potentially most volatile locus of separatist sentiment is Taiwan. Separated politically from the mainland during 50 years of Japanese colonial rule (1895–1945) and subsequently governed by an exile nationalist government, Taiwan’s majority Han population, including speakers of the Southern Min dialect (minnanhua) shared with southern Fujian province and of Hakka, another dialect widely dispersed in Southern China, increasingly asserts its political and, in some cases, cultural separation from China. These assertions are anathema to China’s government, which considers Taiwan a renegade province. Given such circumstances, the issue of cultural unity and diversity has a political significance that extends well beyond the concerns of anthropologists.

2. Scale and Complexity

China’s vast scale and complexity pose particular problems for anthropology because the discipline’s crucial organizing concepts, culture and society, have developed historically in the study of small-scale societies. Anthropology’s traditional claim to illuminate social-cum-cultural systems holistically (most explicitly in the functionalist traditions of Bronislaw Malinowski and A. R. Radcliffe-Brown) is more easily accomplished in ethnographic descriptions in which overarching institutions and consciousness of common identity do not reach much above the village or tribal level. Given such limits, the interpenetration of kinship, political, religious, and economic institutions is described and analyzed more easily than in China, where economic, cultural, religious, and political ties link metropolitan centers to distant rural locales. An additional complicating consideration is the historical depth of supra-local social organization: China’s history of economic, political, and cultural integration at local, regional, and empire-/nationwide levels has exerted and continues to exert a strong influence on local life and institutions. China’s significance for the discipline of anthropology lies most of all in the challenge this complexity poses for adapting and modifying conceptual and theoretical tools developed in analyses of societies of much more limited scale.

To this end, the path-breaking work of G. William Skinner, growing out of ‘central-place theory,’ is particularly significant (Skinner 1964–5, Skinner 1977). Skinner has developed a spatial-cum-temporal model of market-, town-, and city-centered regions in China in late imperial times that form an eight-tiered nested hierarchy culminating in nine ‘macro-regions.’ Ascendance in the central-place hierarchy (for example, from villages to market towns, to higher-level central places) is characterized by increasing specialization of economic function, social complexity, and cultural sophistication. By the same token, macro-regions are divided into more densely populated and economically concentrated cores and more sparsely populated, and less productive, peripheries. These distinctions are reflected in class characteristics, kinship and religious organization, and other social differences.

Interpenetrating the hierarchy of economic local and regional systems is a hierarchy of political administration. Although there exists a rough articulation of administrative and economic regional organization (for example, many district capitals are also important economic centers), these correspondences are far from perfect (for example, consumers often have access to more than one higher-level economic center, whereas every administrative district is discrete, with a single capital at the next higher level). The ramifications of regional analysis for more conventional styles of ethnographic work are fundamental. Put simply, Skinner’s work obviates imagining China as being crudely dividable into distinctions such as ‘rural’ and ‘urban,’ or ‘elite’ and ‘folk’: From the vantage point of village-based ethnographic studies, it is necessary to situate a locale with reference to its place in both administrative and economic systems if one is to comprehend its place with reference to China as a whole. This caveat applies across the board of social relations and culture: kinship, religious and ritual organization, social class, gender relations, and demographic processes. The potential implications of regional analysis for Chinese anthropology are well recognized, although yet to be thoroughly assimilated.

The existence of China’s essentially ‘global’ scale of civilization antedating the emergence of the Western-centered ‘capitalist world system’ also poses a fundamental conceptual challenge to the recent growth of academic interest in processes of ‘globalization.’ At the very least, China should provide a comparative reference point to discussions that all too often imagine ‘globalization’ to be unique to Western history and/or capitalism.

3. Local Communities

Historically, China’s population has been overwhelmingly rural (until recently about 85 percent could broadly be characterized as rural), although some of China’s cities were probably the world’s largest during the European Middle Ages. But, as noted above, ‘rural’ is a designation too crude to capture the cultural


and social distinctions between farming villages located in close proximity to large urban centers (with consequent access to urban markets and culture), and those located in distant peripheries. In brief, more centrally located rural locales tend to conform to more orthodox forms of social organization (for example, with leadership in the hands of Confucian literati), and more peripheral ones tend to be more variable and less orthodox (leadership often exercised by local ‘strongmen’). Villages in many parts of China are ‘nucleated’ (dwellings grouped close together), although some areas (the Sichuan basin, for example) lack nucleated villages, with the population being dispersed. Typically, groups of villages are linked to a market town economically and culturally, and this may also serve as a focus for important local ritual activities, higher-level kin-based groups (e.g., lineage corporations), and other voluntary associations. Most classic ethnographic studies have focused on villages with some references to villagers’ participation in market-town-focused activities.

The general picture that emerges from village-based ethnographies is one of considerable variation, but some significant commonalities are also discernible. Single-surname villages and strong corporate patrilineages seem to be relatively more common in the more productive and prosperous regional cores and in the South. Such villages are more likely to find ritual solidarity in lineage-based activities. In contrast, temples to local gods are the more frequent ritual focus of multi-surname villages. The contrast should not be overdrawn, however. Ancestor worship at domestic altars, although discouraged during the high tides of communist reform, seems to have been nearly ubiquitous among Han Chinese, and may be making a comeback in areas where it was repressed during the 1960s and 1970s. By the same token, some deities (Guanyin, for example) are worshipped nearly universally, and most locales produced some local gods closely associated with local lore and history.

Although some village-based ethnographies have a penchant for treating the village as a self-contained community, Skinner argues forcefully that the standard-marketing community was until recently the most important unit of rural social life (Skinner 1964–5). In frequent, sometimes daily, trips to markets, farmers had access to the cultural amenities of the town, interacted with traders, landlords, and other local dignitaries, and found occasion to worship and

functions of ‘standard-marketing communities’ were assumed by communes (often territorially isomorphic with them), but in the reform era, local markets are re-emerging rapidly (Skinner 1985).

4. The Local Impact of Imperial and State Institutions

In imperial times, the impact of the state on local life was mediated largely by local elites; imperial administration extended to the county level, but the maintenance of social stability depended on local leaders and institutions (Ch’u 1962). The imperial state managed to maintain its hegemony largely because local elites were thoroughly committed to the Confucian ideology upon which the legitimacy of both the state and literati leadership rested. This commitment, in turn, was sustained in part by the imperial examination system. Aspirants in the highly competitive examinations were required to master Confucian classics. Success came in the form of academic degrees and appointment to administrative office. The symbolic and material rewards for successful candidates redounded to the benefit of their kin and communities. Consequently, local wealth and institutional effort were invested in the production of successful examination candidates (for example, lineage corporations often established schools for their sons), on the one hand, and successful candidates were able to use their official positions to return wealth and prestige to their communities, on the other. Although the prospects for advancement through the theoretically meritocratic examination system were distant or unrealistic for all but the relatively wealthy (Ho 1962), its values exercised a strong hold on popular imagination in all of China’s social classes.

Beyond the pervasive impact of the imperial examination system on China’s class system and consciousness, state power affected localities primarily through taxation and social control. In these functions, too, county magistrates often relied on the mediating services of local elites, with whom they shared a similar class background and Confucian education.

The mediating role of local elites and institutions in integrating localities and the state began to unravel in late imperial times, and reached a crisis during the Republican era, when government policy tended to undermine local authority (Duara 1988). The mo-
attend festivals at temples. Higher-level lineage mentum during the post-1949 era was to establish
corporations often built lineage halls in market towns, greater central control by binding local cadres and
and voluntary associations with market-system-wide officials more firmly to the policies of the center, often
memberships also concentrated their activities there. undermining local leadership and institutions. The
Improved transport and economic development have period of economic reforms beginning in the late 1970s
to some degree eclipsed the traditional ‘standard has seen a general loosening of central control over
market towns,’ thus expanding the horizons of China’s individuals (as, for example, in the initiation of the
rural population, and (at least in some areas) resulted famous ‘household responsibility system’) and, to
in the emergence of what Skinner terms ‘modern some degree, over local governments. As a result,
trading centers.’ During the Maoist era, many of the some of the forms of traditional communalism (for

1735
China: Sociocultural Aspects

example, territorial cults worshipping local gods) are re-emerging in some parts of China (Dean 1993, Jing 1996).

5. Family and Kinship

Patrilineal descent, virilocal residence, and equal inheritance among sons characterize the standard model of Han Chinese kinship. It is a standard model less in a statistical sense than in the fact that it serves as an ideal type; even families whose own organization does not conform are likely to acknowledge this ideal type as how things ought to be. In principle, a bride should leave her natal household and take up residence in the natal household of her groom, including his parents, unmarried siblings, married brothers, and their wives and children. Prior to 1949, ‘five generations living in a single household’ was widely held to be both an admirable and enviable achievement. But it was an achievement also recognized to be difficult to attain. Domestic strife, economic misfortune, mortality, or infertility could intervene to prevent the achievement of such a ‘grand family.’
The characteristic ethos of Chinese family life is one of the dimensions of Chinese culture that has been examined most closely by ethnographers. The picture that emerges of the Chinese domestic cycle includes tensions between daughters-in-law and their mothers-in-law, stemming in part from their divergent interests with respect to maintaining a large, unified family structure. Daughters-in-law often desire to escape the authority of their mothers-in-law, and agitate for a family division; mothers-in-law exercise their often strong influence over their sons in the interest of keeping the extended family united (Fei 1939, Wolf 1968, 1972, Cohen 1976). By the same token, sons owe filial allegiance to their fathers, but they also owe their children as good a start in life as they can provide. These circumstances can set brothers’ interests against each other (Freedman 1966). In the end, competing interests and domestic tensions can result in agreement to divide an estate. For many rural families, the most tangible and immediate manifestation of division was the setting up of separate cooking stoves. Members of the formerly united family continued to reside in the same building, but the separate stoves indicated separate budgets and, symbolically, a parting of ways.
As one might expect, by no means all families were of ideal-typical form. We have already noted some of the factors that might prevent achievement of the ideal of a ‘five-generation’ household. In addition, we now know that both uxorilocal marriages and ‘little daughter-in-law’ marriages were quite common in some parts of China (Wolf and Huang 1980). Uxorilocal marriages seem to have occurred most commonly as a strategy in families lacking a male heir. In such circumstances, a son-in-law might agree to marry into his wife’s family. Because married-in sons-in-law were looked down upon (having at least partially abandoned their filial commitments to their own patrilines), they typically were men of relatively lower socioeconomic status and background. Because such marriages self-consciously deviated from the ideal type, arrangements for inheritance, descent of children, and so forth would be stipulated in a formal marriage contract. (For example, children might be divided between the patrilines of both the husband and the wife; the groom might agree to change his own surname and become his father-in-law’s adopted son; the groom might agree only to support the father-in-law in his dotage.)
‘Little daughter-in-law’ marriage involves adopting an infant or girl with the intention that she marry her foster brother when she comes of age. No longer widely practiced (they are now illegal), such marriages were common in some locales through the first half of the twentieth century. They were said to save the expense and trouble of a wedding, but Wolf and Huang contend that their real advantage lay most of all in the fact that mothers-in-law were able thereby to develop more amiable relations with their daughters-in-law, having raised them essentially as daughters. Wolf and Huang also argue that improvement of mother/daughter-in-law relations was won at the expense of conjugal ones; divorce rates and infidelity were higher among ‘minor marriages’ in northern Taiwan, and fertility was lower (Wolf and Huang 1980).
Contracts typically drawn at family division might stipulate that some portion of the estate remain undivided in the form of a corporation in memory of an honored ancestor (often the father of the sons dividing the estate). The income of the corporation typically might be shared among the descendants, used to build an ancestral hall, or used to fund annual banquets in memory of shared ancestors. Corporations formed in this fashion are often referred to as lineages (Freedman 1958, 1966). However, lineage corporations could also be formed when a group of patrilineally related men decided to invest in shares and to purchase some common property. By the same token, unrelated parties might form similar corporations to help put the maintenance of a temple or any other enterprise on a stable financial footing. In short, although many lineages in China were organized as shareholding corporations with both ritual and (in some cases) entrepreneurial goals, similar corporations were also founded on bases other than kinship (Sangren 1984).
Lineage corporations, as noted above, tended to be more numerous in prosperous, core areas, especially in the Southeast. However, even in the absence of lineage corporations, patrilineal kinship ties were considered to be important. Consequently, some ambiguity attends Chinese terms for ‘lineage’ (zongzu, zu) because they might refer to a formal corporation or merely to those related to one patrilineally.
The nature and quality of affinal ties seem to have varied considerably by region and social class. Women sought to maintain ties with their natal families, and in some areas such ties were important in the development of the social connectedness (guanxi) so important to Chinese social relations. However, in terms of customary law, women retained few rights in their natal lines. Except for gifts bestowed upon them at marriage, women did not inherit. Moreover, after her death a woman could be worshipped as an ancestor only in the patriline of her sons.

6. Religion and Ritual

Considered as philosophical or liturgical traditions, Confucianism, Daoism, and Buddhism have exercised an important historical influence on Chinese religion. However, these influences are not clearly distinguishable at the level of popular belief and practice. For example, because of its emphasis on ‘filial piety,’ ancestor worship is often considered to be in some sense ‘Confucian.’ However, ideas having to do with the afterlife of the soul, the nature of supernatural spirits, geomancy (the operation of unseen forces in the landscape), and communication with supernatural powers all play an important role in ancestor worship, just as they do in the worship of the local deities (often termed ‘Daoist’) central to territorial cults, and in the popular worship of Buddhist deities (Guanyin, for example). In other words, Buddhism, Daoism, and Confucianism are relatively distinguishable only in the contexts of monastic Buddhist institutions, the texts and practices of the ordained Daoist priesthood, and (arguably) the self-consciously Confucian writings of the official elite.
‘Popular,’ ‘folk,’ or ‘local’ religion is, as C. K. Yang argued influentially in his definitive treatise (Yang 1961), ‘diffused’ throughout the institutions of social life. Ancestor worship, for example, can be considered the religious dimension of domestic life. Similarly, territorial cults give form to local communities at levels ranging from neighborhood ‘Place God’ (Tudi Gong) shrines, village-level temples, and market-based temples, upward to the City God temples found in administrative capitals (Sangren 1987). City Gods also played a role in the official rites of the imperial state. District magistrates were not only governors; they also officiated as priests in the state cult which culminated in the sacrifices to heaven performed by the emperor himself (Feuchtwang 1977).
Fundamental to popular religion is a belief in the power of supernatural spirits (ling); from the point of view of the majority of worshippers, it matters less whether a god or goddess is Buddhist or Daoist, heterodox or orthodox (from the point of view of the Daoist priesthood or Buddhist clergy), than whether it has a reputation for answering prayers and performing miracles. Individuals pray to gods for blessings for themselves and their families; community leaders organize ritual celebrations to the same gods in the hope of ensuring prosperity on behalf of the community. Indeed, such celebrations constitute one of the main entertainments of local life, punctuating the annual calendar.
The majority of Chinese gods are viewed as deified historical personages; their accumulated legends of posthumous divine intervention on behalf of individuals and communities play a crucial role in the dissemination of their cults. The cults of some gods are of strictly local provenance, while the cults of others (such as those of Guanyin, the ‘Goddess of Mercy,’ and Guandi, the ‘God of War’) are popular throughout China. In imperial times, successful local cults might grow to the point where they received imperial recognition, with the emperor claiming the authority to promote and demote such deities. Pilgrimages from local temples to distant centers tied local communities into wider ritual spheres and provided welcome opportunities to travel and see the world outside an individual’s own locale.
Chinese polytheism is widely conceived in terms of a celestial bureaucracy that roughly mirrored the imperial state; many gods are viewed as supernatural governors of their assigned districts. Alongside such celestial officials, however, Chinese people worshipped a variety of mother goddesses, tricksterish imps, and even demonic figures.
At death the soul is imagined to journey through the underworld where it is judged and (if found guilty) punished for its deeds. One of the obligations of descendants is to perform rituals on behalf of the deceased to bribe these underworld officials and their demonic henchmen, thereby winning the soul’s release. Both Daoist priests and Buddhist clergy may be employed to perform such funerary rites. Lonely ghosts (souls of those who die in tragic circumstances or who have no descendants to worship them as ancestors) are pitied and, in some cases, feared for the mischief they may bring down upon the living. Communities commonly propitiate such spirits during the seventh lunar month.
Communication with unseen powers is established by a variety of techniques. Individual worshippers can often cast ‘moon blocks’ or draw lots at temple altars in an attempt to discern a deity’s response to their queries. ‘Spirit writing’ via an apparatus believed to be possessed by a god or spirit is another common technique (Jordan and Overmyer 1986). Spirit-mediums possessed by gods speak directly in the gods’ voices (Elliott 1955), and revelation through such mediums has played an important role in the production of hagiographies (Seaman 1987, Kleeman 1994).
There has been a significant growth of Western academic interest in Chinese popular religion in recent years. One of the reasons for this growth is the close
association between local social organization and collective ritual activity. Moreover, as Paul Katz argues, China’s closest analog to Western ‘civic society’ or a ‘public sphere’ is most likely to be found in local temples and their rituals (Katz 1995).

7. Gender

The feminist movement in academia has inspired a large body of research and writing about women in China. In anthropology, much of this work has focused on domestic life. Among the most important conclusions is that, despite the ideological emphasis on patriarchy, women exercise a good deal of practical influence (Wolf 1972). Much of this influence is linked to mothers’ close emotional relations with their children, especially their sons. Whereas Confucian ideology emphasizes the fundamental importance of father–son ties, ‘filial piety’ manifested in popular myth, entertainment, and ritual more often emphasizes children’s affection for their mothers. Some analysts (e.g., Martin 1988) believe that women’s views on life differ so radically from those of men that a distinctive female ideology exists, one that often contradicts the male, or official, ideology. Alternatively, differences in men’s and women’s views can be conceived as generated within the family system considered holistically as a productive process.
One of the problems confronting sinological anthropologists at the time of writing is to assess, on the one hand, the degree to which changes in gender ideology (the government now officially advocates gender equality) have improved women’s lives (very little, according to some; Wolf 1985) and, on the other hand, the degree to which changes in family organization (for example, those consequent on the ‘one-child family’ policy) have altered gender ideology.

See also: Area and International Studies: Development in Southeast Asia; East Asia, Religions of; East Asian Studies: Culture; East Asian Studies: Gender; East Asian Studies: Politics; East Asian Studies: Society; Historiography and Historical Thought: East Asia; International Migration by Ethnic Chinese; Kinship in Anthropology; Nationalism, Historical Aspects of: East Asia

Bibliography

Ch’u T Y 1962 Local Government in China Under the Ch’ing. Harvard University Press, Cambridge, MA
Cohen M L 1976 House United, House Divided: The Chinese Family in Taiwan. Columbia University Press, New York, NY
Dean K 1993 Taoist Ritual and Popular Cults of Southeast China. Princeton University Press, Princeton, NJ
Duara P 1988 Culture, Power, and the State: Rural North China, 1900–1942. Stanford University Press, Stanford, CA
Elliott A J A 1955 Chinese Spirit-medium Cults in Singapore. Department of Anthropology, London School of Economics and Political Science, London
Fei H T 1939 Peasant Life in China: A Field Study of Country Life in the Yangtze Valley. Kegan Paul, Trench, Trubner, London
Feuchtwang S 1977 School-temple and city god. In: Skinner G W (ed.) The City in Late Imperial China. Stanford University Press, Stanford, CA
Freedman M 1958 Lineage Organization in Southeastern China. Athlone, London
Freedman M 1966 Chinese Lineage and Society: Fukien and Kwangtung. Athlone, London
Harrell S 1995 Introduction. In: Harrell S (ed.) Cultural Encounters on China’s Ethnic Frontiers. University of Washington Press, Seattle, WA
Ho P T 1962 The Ladder of Success in Imperial China: Aspects of Social Mobility, 1368–1911. Columbia University Press, New York
Jing J 1996 The Temple of Memories: History, Power, and Morality in a Chinese Village. Stanford University Press, Stanford, CA
Jordan D K, Overmyer D L 1986 The Flying Phoenix: Aspects of Chinese Sectarianism in Taiwan. Princeton University Press, Princeton, NJ
Katz P R 1995 Demon Hordes and Burning Boats: The Cult of Marshal Wen in Late Imperial China. State University of New York Press, Albany, NY
Kleeman T F 1994 A God’s Own Tale: The Book of Transformations of Wenchang, the Divine Lord of Zitong. State University of New York Press, Albany, NY
Martin E 1988 Gender and ideological differences in representations of life and death. In: Watson J L, Rawski E S (eds.) Death Ritual in Late Imperial and Modern China. University of California Press, Berkeley, CA
Sangren P S 1984 Traditional Chinese corporations: Beyond kinship. Journal of Asian Studies 43: 391–415
Sangren P S 1987 History and Magical Power in a Chinese Community. Stanford University Press, Stanford, CA
Seaman G 1987 Journey to the North: An Ethnohistorical Analysis and Annotated Translation of the Chinese Folk Novel Pei-yu Chi. University of California Press, Berkeley, CA
Skinner G W 1964–5 Marketing and social structure in rural China, Parts I, II, and III. Journal of Asian Studies 24: 3–43, 195–228, 363–99
Skinner G W 1977 Cities and the hierarchy of local systems. In: Skinner G W (ed.) The City in Late Imperial China. Stanford University Press, Stanford, CA
Skinner G W 1985 Rural marketing in China: repression and revival. The China Quarterly (September): 393–413
Wolf M 1968 The House of Lim: A Study of a Chinese Farm Family. Appleton-Century-Crofts, New York
Wolf M 1972 Women and Family in Rural Taiwan. Stanford University Press, Stanford, CA
Wolf M 1985 Revolution Postponed: Women in Contemporary China. Stanford University Press, Stanford, CA
Wolf A P, Huang C S 1980 Marriage and Adoption in China, 1845–1945. Stanford University Press, Stanford, CA
Yang C K 1961 Religion and Ritual in Chinese Society: A Study of Contemporary Social Functions of Religion and Some of Their Historical Factors. University of California Press, Berkeley, CA

P. S. Sangren

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7


Chinese Law

China’s modern legal institutions reflect powerful political, economic, and social forces that have struggled to shape them since the latter part of the nineteenth century. For thousands of years, before the last dynasty fell, cultural values in the world’s oldest continuous empire were inhospitable to ideals of ‘rule of law’ that evolved slowly in the West. In the Republic (1911–49) formal legal institutions were only superficially borrowed from abroad. From 1949 to 1979 the Maoist party-state reduced law to the merest tool of totalitarian politics. Mao’s successors launched economic reforms in 1979 that have led to the creation of the most significant legal institutions in Chinese history, but at the end of the twentieth century traditional values endured, the ideology and the apparatus of the party-state remained in place, governmental institutions were disorderly, and the economy was in flux. The new institutions can grow in strength and legitimacy only if they receive sustained and powerful political support from China’s leadership and greater recognition in the legal culture of China’s officials and its populace.

1. The Chinese Legal Tradition

Contemporary Chinese institutions should be viewed in light of profound differences between Chinese and Western legal history. Imperial China blended law and morality in contrast to Western Europe, where secular and sacred authority were separated early. The dominant cult and philosophy of Confucianism emphasized governance by men who acquired moral authority by emulating ancient sages in setting virtuous examples of benevolence and social rightness for their subjects to follow. Law was regarded as a set of inferior norms that supplemented more basic principles, especially rules of propriety (li) that differentiated individuals according to their status as determined by age and rank in family and society. Confucianism was briefly rivaled by the early philosophical school of Legalism, which stressed the need for harsh penalties using positive law (fa) to deter wrongdoing, but both schools shared a vision of society in which proper behavior derived from an individual’s status in the hierarchies in which he or she lived.
Law was first codified in the Qin Dynasty (third century BC) and recodified and augmented by a complex body of regulations in subsequent dynasties, notably in the Tang and Ming. Principally penal, it unambiguously reinforced ideas of hierarchy and subordination and was addressed to officials, not to the populace. It was enforced by local county magistrates without legal training or expertise, as part of their general duties to govern on behalf of the emperor. Specialized institutions for adjudication like the centralized judicial systems that developed over centuries in the West were absent, although there were unofficial legal specialists at provincial and central levels. The outcomes of cases had to be substantively correct according to both law and Confucian morality. The concerns for procedural justice and uniformity of results that have come to mark Anglo-American law were absent.
The Chinese state exercised its rule mostly in an indirect manner, through local elites (landowners, family heads, and village elders) who enforced local customs. The concept of personal rights did not develop because the basic units of society were not individuals but rather the collectivities of family, clan, village, and guild. Economic transactions arose and were enforced largely in the context of custom-governed relationships. The official philosophy exerted strong social pressure in favor of mediation and compromise. Litigation before the magistrates was time-consuming, degrading, and costly. Civil disputes were common, nonetheless, but most were settled extra-judicially. Unlike in the West, where lawyers emerged, any tendency for legal specialists to act as intermediaries between individuals and the state was actively discouraged, although in the late Qing men who facilitated litigation, albeit tarred as ‘litigation tricksters,’ did flourish.
From the mid-nineteenth century until 1949, when the People’s Republic of China (PRC) was established, sporadic and inconsistent attempts were made to transplant foreign legal institutions. These failed to take root since they were often too complex as well as being irrelevant to Chinese conditions. China established its first professional bar during the Republican period, but lawyers’ training and qualifications were uneven and their standards of professional behavior low. Judges were both few and poorly educated, and judicial professionalism and independence were weakened by corruption and favoritism. The authoritarian Nationalist Party undercut the spirit of the new legal reforms.

2. The People’s Republic of China

2.1 Maoism, 1949–79

Under Mao Zedong the Chinese Communist Party (CCP) mounted extensive programs of economic reconstruction and social change, relying both on previous experience in ruling large areas before 1949 and on Soviet models, which entwined state institutions with the Party. Law was used as a political tool, along with mass organizations and propaganda media, to mobilize the populace to carry out policies. The criminal process was declared an instrument to exercise ‘dictatorship’ over members of the former
exploiting classes, and sanctions frequently varied according to political ‘campaigns’ and policies of the moment. Systematized codes of criminal law or criminal procedure were lacking, and the courts merely formalized findings of guilt by the police and the procuracy, which was a prosecutorial agency. The police could impose sentences of as long as four years without any judicial involvement.
The collectivization of almost all private property left little scope for noncriminal law. In the planned economy, disputes between state-owned enterprises were resolved informally through flexible, highly pragmatic attempts to adjust problems without fixing legal blame. Disputes among individuals were usually dealt with through mediation. Mediation committees that formed part of a totalitarian control apparatus which penetrated deeply into Chinese society were charged with attaining politically correct results that would benefit socialist construction and strengthen the party-state’s control over ‘bad elements.’ Politicization was not total, of course; traditional attitudes among mediators and the populace persisted.
Mao and other leaders, determined to speed continued revolution and social change, refused to regularize law and administration. The Cultural Revolution (1966–76) further reduced the relevance of legal institutions that had already been politicized. After the overthrow of the ‘Gang of Four’ in 1976, the lawlessness of the Cultural Revolution moved their successors, led by Deng Xiaoping, to advocate adoption of orderly legal institutions. An era of reform began, and since then law has risen to greater prominence than ever before in Chinese history.

2.2 Reform Since 1979

Legal reform has been driven by the economic reforms that began in 1979 and have unfolded irregularly but irreversibly. A growing and increasingly differentiated non-state sector has been created and modernization has been aimed at constructing a ‘socialist market economy,’ including a legal system. China’s departure from previous Maoist disregard for formal legislation has led to one of the greatest outpourings of legislation in history.
The structure of the Chinese state has been defined in a Constitution and in ‘organic’ laws dealing with key state institutions such as the courts and central and local legislative bodies; the General Principles of Civil Law, a partial civil code intended to mature into a comprehensive one; ‘basic’ laws such as codes of criminal law and criminal procedure; and enactments by the central government, subnational units, and by central ministries and their local branches. These define newly recognized economic relationships and participants in expanding markets and address regulatory problems generated by economic reform. Much legislation, however, has been incomplete and ad hoc as new economic policies appeared, reflecting the difficulties that the leadership has had in defining the direction of economic reform. There is a continuing struggle between concepts of law as a framework for economic activity by autonomous actors and as an administrative instrument.
Extensive legislation also signals China’s participation in a global economic community. Direct foreign investment has been addressed by legislation on various investment vehicles and their operation, as well as on such matters as intellectual property, labor, customs, foreign exchange, bank lending and guaranties, and export and import licenses. A taxation system has been established. China has acceded to an extensive range of international agreements, such as the UN Convention on the International Sale of Goods, whose rules have become part of Chinese law.

2.2.1 Building a judicial system. The courts, formerly scorned as ‘rightist’ institutions at the end of the 1950s and as ‘bourgeois’ during the Cultural Revolution, have been rebuilt in a four-level hierarchy. The number of civil and economic disputes brought to the courts rose from 2.4 million cases in 1990 to 5.7 million in 1999, while the number of disputes brought to mediation committees declined from 7.4 million in 1990 to 5.1 million in 1999. Growing reliance on contracts and the increase in litigation suggest increasing acceptance of concepts of law-based rights.
The Chinese judicial system presents many problems at the beginning of the twenty-first century. Judges are poorly trained, and most still lack a complete legal education despite efforts to raise their educational qualifications. Over half of the cases brought to the courts are resolved through judicial mediation rather than adjudication of competing claims and rights. Judges often prefer to resolve cases by mediation to avoid reversal by a higher court; lower courts, fearful of being reversed, sometimes request instructions from a higher court before they issue a judgment, thereby rendering meaningless the right of unsuccessful parties to appeal. The finality of judgments is impaired by legislation permitting non-criminal decisions to be reopened within two years after they become effective. Sometimes, too, higher courts reviewing the quality of the work of lower courts reopen decisions even though they have already taken legal effect. The role of judges has been defined only ambiguously, and adjudication has not been significantly differentiated from decision making by administrative agencies in the course of implementing policies.
The independence, powers, and effectiveness of the courts have been constrained by the requirement that they follow CCP policies. In 2000, over 70 percent of
judges were members of the CCP and the principal affairs of the courts, including personnel matters, were directed by Party organizations. Judges are appointed and their salaries paid by the local governments in the jurisdictions in which they serve, leading to ‘local protectionism’ that frequently influences the outcomes of litigation. In addition, guanxi (relationships), corruption, and bribery are often employed to influence outcomes that on many occasions pervert justice.

2.2.2 The legal profession. The bar was formally re-established in 1980 following a twenty-year hiatus after a brief experiment with a Soviet-style bar was ended, and legal education was revived. There were over 120,000 lawyers in 2000, but the educational level of many older lawyers is low and legal education remains highly formalistic. By 2000 China had over 8,000 law firms, most of which were state-run, but the number of ‘cooperative’ firms was growing. The sudden expansion of the legal profession created enormous temptations for lawyers, judges, and officials to engage in bribery and other corrupt practices. The state continued to regulate and scrutinize lawyers’ activities, and a major unresolved contradiction existed between the concept of a professional bar and CCP opposition to autonomous organizations and professions.

2.2.3 The criminal process. The criminal process continues to be a tool for the politicized administration of law, as when political leaders focus it on activities deemed to ‘endanger the security of the state,’ or on other particular types of criminal activity. The formal rationality of the criminal process has been slowly increased in a criminal code and code of criminal procedure, but the extensive power of the police and the CCP over the criminal process has been only ineffectually restrained, and the police-administered system of sanctions begun under Mao remained in place in 2000.

2.2.4 Administrative law. The Chinese leadership began in the 1990s to address the need to create legal institutions that might curb bureaucratic arbitrariness. A series of laws gave affected persons or organizations the right to sue agencies that have acted unlawfully, defined the wide assortment of punishments that may be imposed by administrative agencies, and recognized situations in which governmental agencies may be liable for injurious consequences of their acts. However, the jurisdiction of the courts and their power to restrain arbitrariness remained limited. Chinese laws and administrative rules have generally given agencies very broad discretion, while judicial control of administrative action has been limited by a reluctance to allow courts to review the validity of general rules issued by administrative agencies or to decide that they had improperly used their discretion. The courts were at best only at the same level of authority as the other institutions of the state apparatus; their limited reach reflected the subordination of law to the bureaucracy.

2.3 The Future

The development, shape and meaningfulness of Chinese legal institutions will depend upon critical factors that lie outside the law.

2.3.1 Policy. As long as the CCP rules China, its policy will have to overcome its ambivalence toward the role (and rule) of law. The Chinese Constitution, amended in 1999 to declare that ‘The People’s Republic of China shall be governed according to law and shall be built into a socialist country based on the rule of law,’ also affirmed ‘the leadership of the Chinese Communist Party, Marxism-Leninism and Mao Zedong Thought, and “Deng Xiaoping Theory”’ as ‘guiding principles.’ The leadership has tended to equate law with discipline and to treat it as an instrument to maintain the dominant role of the CCP in Chinese society. At the same time, it has also recognized that law can further rationalize decision making and implementation of policy, while increasing legitimacy at home and abroad.

2.3.2 Structural problems. At the beginning of the twenty-first century institutions for law making heavily reflect the imprint of pre-reform doctrine and practice. Three principal law-making agencies share the central government’s legislative power under the Constitution adopted in 1982 (the National People’s Congress, its Standing Committee, and the State Council), but their respective powers were only very generally defined and subject in practice to informal negotiations. The State Council, at the head of the executive branch of the central government, together with the ministries, commissions, and bureaus that are subordinate to it, possesses broad power to generate rules superior to all local enactments.

Although subnational units and more than 20 functional bureaucracies of the central government issue regulations, distinctions among the rules that they issue, and between rule making and implementation, are blurred. No effective mechanism has existed to measure legal norms for consistency with higher-level norms, and the Constitution is not justiciable. Chinese administrative agencies have exercised the power both to issue and interpret their own rules, and to require the courts to enforce them. The lower courts are formally denied power to interpret laws, although in practice the Supreme People’s Court has asserted a


strong role in the interpretation and clarification of laws. These problems are aggravated by the frequently provisional nature and tentative style of legislation.

Although the dramatic Chinese economic reform at the end of the twentieth century was greatly promoted by increasing the power of local governments, it also facilitated extensive interpenetration of government and business. It was carried out without carefully defining property rights, and local governments took advantage of the uncertainty to form alliances with private enterprises. Local government involvement in enterprises varied from disguised ownership to acceptance of payoffs and bribes. Overall, in the year 2000, marketization was often not synonymous with privatization, and some Western scholars perceived the emergence of corporatist relationships that closely linked non-governmental actors to local governments. Also, the growth of local power increased local deviations from central state policies, and undermined uniformity in the application of legal rules and the reach of the central government generally.

Legal development remains tied to economic reform; its vigor will turn not only on the strength of the economy and the consolidation of reforms already accomplished, but also on whether solutions are fashioned to deal with basic problems that have been difficult to surmount. The state sector of the economy has been governed by rules and practices to which legal rules are essentially irrelevant. Although reform has long been a goal of the leadership, it has been unable to chart a clear course between privatization and continued dedication to state ownership, or to create mechanisms to deal with the large-scale unemployment and associated social distress that further industrial reform would generate. The establishment of enforceable rules on the governance of formerly state-owned firms that have been converted remains problematic. Financial system reform began in the last years of the twentieth century but was incomplete and faced serious obstacles; and both creation and regulation of capital markets were still in early stages.

2.3.3 The crisis of values. Reform dramatically relaxed state control over the lives of many Chinese in noticeable ways, but it also created severe social dislocations including income disparity and an impoverished ‘floating population’ of as many as 100 million peasants who have flocked to China’s cities seeking employment. Discontent has risen, among peasants angry at their exploitation by local cadres and among unemployed workers at state-owned enterprises that have closed down. Crime, violent and otherwise, has risen, provoking widespread concern about social order.

The profound political and economic changes of the last two decades of the twentieth century unsettled both traditional and Communist values. The Party’s ideology became hollow and its legitimacy increasingly questioned, while the opening of China to the rest of the world exposed the Chinese people to new values and ideas, including Western concepts of legality. The weakening of the totalitarian grip on individual lives and the continuing flux of economic reform have fostered the re-emergence of an emphasis on personal relationships and clientelism. Corruption has grown despite continued efforts by the leadership to check and punish its many manifestations, and has aroused alienation and cynicism among many Chinese.

2.3.4 Chinese legal culture: continuity and change. Chinese legal culture continues to reflect competing currents. Traditional values remain strong. Many Chinese remain unwilling to take their disputes to courts, choosing to rely on personal relationships or to defer to authority. In the courts, concern for procedural justice is weak. Bureaucrats continue to want to enjoy broad discretion. At the same time, the extensive social and economic changes sparked by reform have promoted consciousness of legal rights and willingness to use legal processes to assert such rights. Lawsuits against government agencies are increasing, although they remain relatively small in number, and peasant and worker protests often invoke published laws and policies to resist official behavior that they consider to be unjust. Some Chinese legal scholars, officials and intellectuals have called for a legal system with a national and autonomous judiciary that applies standards of procedural fairness. Some economic actors in the non-state sector desire stronger protection of their transactions by rules enforced meaningfully and consistently by the power of the Chinese state. Despite the resistance of the CCP to the growth of civil society, the continuing development of non-state economic activity, the strength of communal traditions and the tenacity of some nongovernmental organizations in Chinese society could combine to advance the development of legal consciousness.

3. Conclusion: Perspectives

It should not be assumed that legal development will lead to Western-type institutions or to liberal democracy. The domain considered ‘legal’ and the boundaries between it and other areas of Chinese state, society, and economy will not necessarily converge with Western concepts, and rights may remain ‘soft.’ Nor should Western observers overstate the supposed virtues for China of Western concepts and institutions, themselves imperfect and under question.

Twenty years of reform efforts began a journey toward greater legality, and further efforts are underway at the beginning of the twenty-first century to advance judicial reform, add coherence to Chinese law making, and develop administrative law further. The


accession of China to the World Trade Organization would impose on China international obligations to increase the transparency of government and the reach of legal institutions. The General Agreement on Tariffs and Trade and other agreements that are implemented by the WTO require all member nations to adopt and implement their laws relating to trade in a manner consistent with the rule of law as it is understood in the West, and China will have to adjust its legal institutions to comply with WTO standards. The deepening of legal reform most depends, however, on political reform. Legal institutions will be hobbled by political constraints as long as any Chinese leadership, Communist or post-Communist, maintains an instrumental view of law, remains ambivalent about the rule of law, and inhibits the growth of an active civil society. For law to grow more meaningful, leadership policy and official ideology must enlarge the domain of the law, end or greatly dilute one-party domination, and remedy institutional weaknesses in the structure of the Chinese state. Even the strongest political commitment will require considerable time to implement further reform, overcome the serious limits on state capacity that make the governance of China difficult under any circumstances, and inspire popular confidence in institutions. The processes of institutional change that have begun can only work slowly at best.

See also: China: Sociocultural Aspects; East Asian Studies: Economics; East Asian Studies: Politics; East Asian Studies: Society; Globalization: Legal Aspects; Law and Society: Sociolegal Studies; Law as an Instrument of Social Change; Legal Culture and Legal Consciousness; Mediation, Arbitration, and Alternative Dispute Resolution (ADR)

Bibliography

Bodde D, Morris C 1967 Law in Imperial China. Harvard University Press, Cambridge, MA
Chen J 1999 Chinese Law: Toward an Understanding of Chinese Law, Its Nature and Development. Kluwer Law International, The Hague, The Netherlands
Corne P H 1996 Foreign Investment in China: The Administrative Legal System. Hong Kong University Press, Hong Kong
He W 1995 Tongguo sifa shixian shehui zhengyi: dui zhongguo faguan xianzhuang de yige toushi [The realization of social justice through judicature: a look at the current situation of Chinese judges]. In: Xia Y (ed.) Zouxiang quanli de shidai: Zhongguo gongmin quanli fazhan yanjiu [Toward a Time of Rights: A Perspective of the Civil Rights Development in China]. China University of Politics and Law Press, Beijing, People’s Republic of China, pp. 209–84
Huang P C 1996 Civil Justice in China: Representation and Practice in the Qing. Stanford University Press, Stanford, CA
Keller P 1994 Sources of order in Chinese law. American Journal of Comparative Law 42: 711–59
Lawyers Committee for Human Rights 1996 Opening to Reform?: An Analysis of China’s Revised Criminal Procedure Law. Lawyers Committee for Human Rights, New York
Lawyers Committee for Human Rights 1998a Wrongs and Rights: A Human Rights Analysis of China’s Revised Criminal Law. Lawyers Committee for Human Rights, New York
Lawyers Committee for Human Rights 1998b Lawyers in China: Obstacles to Independence and the Defense of Rights. Lawyers Committee for Human Rights, New York
Lubman S (ed.) 1997 China’s Legal Reforms. Oxford University Press, Oxford, UK
Lubman S 1999 Bird in a Cage: Legal Reform in China After Mao. Stanford University Press, Stanford, CA
Turner K G, Feinerman J V, Guy R K (eds.) 2000 The Limits of the Rule of Law in China. University of Washington Press, Seattle, WA
Van der Sprenkel S 1962 Legal Institutions in Manchu China: A Sociological Analysis. Athlone, London
Xia Y (ed.) 1995 Zouxiang quanli de shidai: Zhongguo gongmin quanli fazhan yanjiu [Toward a Time of Rights: A Perspective of the Civil Rights Development in China]. China University of Politics and Law Press, Beijing, People’s Republic of China

S. B. Lubman

Chinese Revolutions: Twentieth Century

In the 1800s, a leading Sinologist claimed the Chinese were the ‘most rebellious’ but ‘least revolutionary’ of peoples and Karl Marx, along with other Western social theorists, argued that peasants would never be the driving force in radical movements for change. The twentieth century would prove both the Sinologist and the theorists wrong, many times over. Nearly every decade of it saw a revolutionary event of one kind or another break out in China. In many of these, peasants played central roles.

1. Key Events

The first quarter of the twentieth century witnessed a series of inter-related insurrections that toppled the Qing Dynasty (1644–1911). Known collectively as the 1911 Revolution, these paved the way for the establishment of the Republic of China (ROC) on January 1, 1912, which was also when a revolutionary activist, Sun Yat-sen (1866–1925), was inaugurated as this new country’s first President. A pathbreaking student-led mass struggle also took place in the century’s first quarter. This was the May 4th Movement of 1919, which ended with the dismissal from office of three high-ranking officials. In addition, two revolutionary organizations were formed in this period: the Nationalist Party or Guomindang (GMD), which was led initially by Sun and then after his death by Chiang Kai-shek (1887–1975); and the Chinese Communist Party (CCP) of Mao Zedong (1893–1976) and Deng Xiaoping (1904–1997).

The second quarter of the century began with the GMD and CCP working together in a United Front


(1924–1927) orchestrated by Sun and his Soviet advisor Borodin. The alliance’s goal was to defeat warlordism and imperialism and thus get the Geming (a Chinese term for revolution whose shades of meaning are returned to below) back on track. In 1927, however, Chiang turned on his erstwhile allies, who had just helped him reunify the country through the military campaigns and mass movements known as the Northern Expedition. He launched a White Terror against members of the CCP and left-wing rivals within the GMD. Seven years later, after persistent GMD extermination campaigns, the CCP set out on its epic Long March (1934–1935) to safety in isolated northern base areas. The two organizations, each tightly disciplined and Leninist in structure, allied again from 1937 to 1945 in a Second United Front against Japan. The 1925–1950 period concluded, however, with a Civil War (1945–1949). This ended with the GMD’s retreat to Taiwan and Mao’s proclamation, on October 1, 1949, of the establishment of a new People’s Republic of China (PRC), something that this son of a relatively well-to-do farmer was only able to do because his party had won so much peasant support. The CCP then launched a series of radical initiatives, including a partially successful effort to gain adherence to a new Marriage Law (1950) that gave women equal rights in family matters and strove to minimize the power of patriarchal kin-groups.

The century’s third quarter began with Mao taking the lead in further campaigns, some of which caused enormous misery. The most notorious included the ‘Anti-Rightist Campaign’ (a Red Terror directed largely against intellectuals) and the ‘Great Leap Forward’ (a bizarre experiment in irrational utopianism that contributed to a famine of horrendous proportions). Later in this quarter-century, Mao, angered by his post-Great Leap marginalization within the CCP hierarchy, rallied young loyalists (Red Guards) to challenge the Party bureaucracy. Thus began the Cultural Revolution (1966–1976). Mao’s charge, soon echoed by the Gang of Four (a clique that included Jiang Qing, his wife), was that high officials had lost their revolutionary zeal and become corrupt.

The Cultural Revolution spiraled into a chaotic war of all against all that continued into the first year of the century’s final quarter. This period also witnessed the abortive revolution of 1989, during which giant demonstrations took place in Beijing and other cities. In the ROC (as Taiwan was re-baptized in 1949), a major breakthrough came with the lifting of a decades-old policy of martial law. This laid the groundwork for an East Asian variation of the ‘Velvet Revolutions’ of Central Europe: the nonviolent transfer of power from the GMD to an opposition party early in the year 2000.

By the time the 1900s ended, so many aspects of Chinese politics and society had been transformed or challenged by insurgents that it now seems strange that the first edition of this Encyclopedia included an entry on ‘The Chinese Problem’ but not ‘The Chinese Revolution.’ The rise of the CCP and the Cultural Revolution still lay in the future, but there were clear signs that the twentieth century would be a revolutionary one for China. One indication of just how revolutionary is that, by the 1990s, many Chinese were living lives that would have been unrecognizable to their great-grandparents or even their parents. Moreover, the words ‘China’ and ‘revolution’ were by then inextricably linked in the minds of many people.

This linkage was particularly strong in the PRC where the term Geming continues to have a sacred patriotic meaning. This is of strategic significance to the CCP, which is still struggling, as this entry is being written, to ride out an ongoing legitimacy crisis. The regime continues to find it useful to play the revolutionary legacy card periodically. It did so, for example, in 1999 when American bombs mistakenly killed three Chinese journalists in Belgrade. The victims were quickly dubbed ‘revolutionary martyrs’ and the official media linked these new deaths to those of patriotic participants in hallowed historic events of the 1910s–1950s.

In addition, even though Westerners often view the words ‘reform’ and ‘revolution’ as contrastive, in China gaige (reform) is often presented as a method for carrying forward the Geming. The policy of opening the PRC to international trade and moving away from a command economy that Deng and his successors have followed since 1979 is typically presented in English-language works as a pragmatic ‘Reform Program’ and a step away from state socialism. Its PRC proponents, however, describe it as an effort to ‘build socialism with Chinese characteristics’ and an effort to keep the Revolution from ossifying.

It is not just in the PRC that the words ‘China’ and ‘revolution’ go together naturally. Most of the few Chinese names and faces recognized by foreigners are those of insurgents, from Mao to the anonymous ‘man-who-stopped-the-tanks’ in 1989. And many ROC residents grew up learning history from schoolbooks that treated the story of the 1900s as an unending revolutionary struggle, which encountered a terrible setback in 1949 but did not die.

2. Categories and Definitions

This brings us to a question too rarely addressed: did China experience one Revolution or several? Many official PRC histories (as well as some Marxist ones in other languages and standard GMD accounts) refer to a single ongoing Revolution with many stages. Western social theorists (such as Theda Skocpol) and comparative historians (such as Crane Brinton) sometimes prefer to think of a single great event that began in 1911 and ended in 1949. They then fit this almost 40-year-long event into a Great Social Revolution model


that is also used to interpret more compact phenomena such as the French upheavals of 1789–1799. Many non-Marxist Sinologists, meanwhile, speak of three relatively discrete revolutions: a Republican one in 1911, a National one in the 1920s, and a Communist one.

2.1 The Term ‘Geming’

Complicating these questions is the term Geming, the meaning of which at times overlaps with, at others diverges from its Western equivalents. Geming literally means ‘stripping the mandate,’ that is, the successful carrying out of an upheaval that demonstrates, through its ability to overturn a ruling house and establish a new dynasty, that heaven now sides with a new regime. It was only from the 1890s on, in part because of new senses the characters had acquired in Japan, that Geming began to be linked to the creation of a new political system, not just completion of a dynastic cycle. Geming’s original linkage to dynastic cycles makes it curiously like its English and French counterparts, in which visions of return and of forward movement alike can be found. In 1789, the Janus-faced aspect of revolution was signaled by the use of classical goddesses to portray new political virtues. In the 1910s, it was signaled by anti-Qing leaders alternating between presenting themselves as avenging the legacy of the Ming (1368–1644), the ethnically Chinese ruling group the Manchus had displaced, and as struggling to create a new system.

If there are overlaps between the terms ‘revolution’ and Geming, however, there is also a difference: it is never clear whether the sense of the Chinese word is singular or plural, lower or upper case, whether a text is referring to The Revolution or one of many. This is more than merely a linguistic issue: to compare events, we need to know when they began and ended. With France, for example, there is a consensus that the revolution began in 1789 and lasted about a decade, though other supplemental revolutions occurred later (1830, 1848, and 1871). With China, there are more options, including the three alluded to above.

2.2 Problems with Standard Categorization Schemes

There are, however, limitations to each. The continuous Revolution vision focuses too tightly on the achievements of a particular organization. It distorts the shape of twentieth-century history by downplaying struggles not guided by the particular Leninist party (the CCP or GMD) being celebrated.

The 1911–1949 vision is problematic for use in comparison, even though comparativists champion it. Skocpol, for example, does not explain why, if the ‘Great Social Revolutions’ are essentially alike in structural terms, the Chinese one took so much longer to play out. Brinton’s ‘anatomy of revolution’ model becomes problematic because ‘terrors’ occurred at different points in the Chinese Geming. In other revolutions, he claims, this ‘fever’ hit the patient just once, then broke.

A tripartite schema, finally, leaves too little room for radical developments that do not move in lockstep with the rise of governments. 1911, the late 1920s, and 1949 were key turning points in the political story but not necessarily the social and cultural ones.

3. Overlapping Revolutions

A more satisfying approach divides China’s recent past, for heuristic purposes, into four overlapping revolutions, each following its own timetable: one political, one cultural/intellectual, one diplomatic, and one socioeconomic. Let us consider each briefly.

3.1 Politics

The political revolution has the clearest starting date: October 10, 1911. GMD official historians present the events that followed the Wuchang Uprising of that day (really a military mutiny) as adhering to a master plan crafted by Sun Yat-sen and carried out by members of his Revolutionary Alliance, the organization that evolved into the Nationalist Party. But many participants had no contact with the Alliance and knew little of Sun or his ideas. Only some were committed to republican ideals. Others were motivated by distrust of the Qing ‘barbarians,’ anger at particular officials, or a desire to be on the winning side of a Mandate shift.

Still, the 1911 Revolution extinguished not just a dynasty but a system, hence it deserves to be thought of as politically revolutionary. The National Assembly and other republican institutions quickly proved ineffectual. But the complete failure of attempts by Yuan and later another warlord to found new dynasties demonstrated that the rules of political life had changed.

3.2 Cultural and Intellectual Change

Attempts by Chinese revolutionaries to transform traditional belief structures and patterns of behavior began after the Opium War (1839–1842). The military defeat suffered then by China led some within the dominant scholar-official class to begin questioning longstanding assumptions about the inferiority of foreign cultures. The first generations of intellectuals to rethink these issues, while interested in specific things the West had to offer (yong or techniques),


were seldom revolutionary in their approach to questions of fundamentals (ti or essences). Their belief in the superiority of traditional moral codes remained unshaken; they merely sought efficacious ways to combine Confucian ti with Western gunboats and other sorts of foreign yong.

The ti/yong distinction remained at the heart of the debate for decades, until the iconoclastic New Culture Movement (1915–1923) led by intellectuals who had gone abroad, usually to Japan, or been exposed to foreign ideas at one of China’s recently formed Western-style schools. They disagreed about many things but shared three convictions. China’s weakness was due to the enduring power of entrenched ‘Confucian’ and ‘feudal’ beliefs (such as the veneration of age) and practices (such as arranged marriages). Chinese should welcome the best the West had to offer: ideas and methods of inquiry included. And intellectuals needed to educate and mobilize others.

Much New Culture Movement energy was directed toward the publication of new periodicals, which were filled with articles on Social Darwinism, Pragmatism, Anarchism, Marxism, and other imported ideologies. These periodicals also attempted to serve another purpose: their articles, written in a ‘plain speech’ as opposed to a ‘classical’ style, were supposed to inform and inspire the masses. This literary move to break down the barriers between intellectuals and non-intellectuals was reinforced by public speaking campaigns on issues ranging from medicine to international affairs.

An important new turn came with the May 4th Movement, when student activists with ties to the New Culture Movement actively appealed to members of other classes to join them in protesting terms of the Treaty of Versailles that transferred control of former German territories in China to Japan. After sympathy strikes by workers and merchants, the students achieved their main domestic goals: the dismissal of three corrupt officials and the release of all those arrested for protesting. But their inability to stop the Treaty of Versailles from taking effect illustrated the need for a diplomatic revolution.

3.3 Diplomatic Change

No diplomatic revolution could take place until the rise of a strong central regime committed to overturning the unequal treaties forced upon the Qing by foreign powers. The Northern Expedition brought this about. After reunifying the nation, however, Chiang claimed that the spread of communism was more dangerous than the continuation of imperialism. Unlike some of his warlord predecessors, Chiang was an outspoken critic of the unequal treaties, but he insisted that China could only regain its place in the world once its house was in order.

World War II precipitated a sea change, as China’s position as one of the Allies gave its leader new leverage in pushing for an end to unequal relations with the West. And, in 1943, the Allies renounced all claims to special privileges within Shanghai and other ‘treaty-port’ cities. This decision and Japan’s withdrawal in 1945 made China freer of foreign influence than it had been for a century.

CCP historians argue, however, that the diplomatic revolution was still not complete, and they have a point. The GMD regime remained economically and diplomatically dependent on the USA. If credit for starting the diplomatic revolution rightly goes to Chiang, it remains true that China did not fully regain its status as an independent country until after 1949.

3.4 Socioeconomic Change

The socioeconomic revolution also began long before the Communists came to power. Most notably, the late Qing and Republican (1912–1949) eras saw dramatic shifts in the types of people in control of the government and witnessed the rise within the social hierarchy of merchants. But the development of new alignments near the top, though important, did not have much effect upon the lives of the vast majority of Chinese who continued to reside in villages and work small plots of land. For them, a socioeconomic revolution did not take place until the CCP’s land reform campaigns. This socioeconomic revolution (like the others) did not hit all parts of the country at the same time but rather played itself out according to regionally distinctive timetables. By the early 1950s, however, it had affected most of China’s villages.

3.5 Unfinished Revolutions

If China’s twentieth century is envisioned as a time of multiple revolutions, we need to keep in mind that most were incomplete, and that struggles for change in other spheres were pursued sporadically but never fully carried through. One important unfinished Geming, linked to but not quite the same as any of those just described, was the revolution for women. Aspects of sexual politics changed throughout the 1900s, but the dream of complete equality promoted by revolutionaries in various decades never became a reality. Much the same can be said for the democratic revolution: calls for democratization were heard often but oligarchy and one-party authoritarianism tended to prevail.

Ironically, though China’s twentieth century was filled with transformations, its final two decades were strikingly similar to its first three. In Taiwan, moves to institute a system of free elections recall experiments made around 1911. On the mainland, there were May

4th era parallels for the calls in 1989 for an end to corruption. And in the automatic weapon fire that accompanied the massacres, there were echoes of killings of the 1920s that were justified as necessary efforts to maintain order and/or protect the Geming.

4. Lessons From the Chinese Case

China’s twentieth-century experience suggests the need for social scientists to pay more attention in the future to a few factors. One, already mentioned, is the revolutionary potential of villagers. Another is scale. Revolutions have often been treated as purely ‘national’ events. Recent work on China shows that local dynamics are often crucial—even in upheavals driven by nationalism. So, too, are transnational flows of people (key participants in 1911 were overseas Chinese) and ideas. This suggests that, in future, models used for comparing revolutions will need to make more room not just for revolutions of widely varying lengths but also revolutions that were influenced by local, national, and transnational impulses and actors.

However, China’s recent history reinforces some important insights in the comparative and theoretical literature on revolutions. It shows that Alexis de Tocqueville was right to insist that nothing does more to inspire revolutionary activism than an inefficient authoritarian regime’s introduction of a reform program that offers too little and comes too late. The 1911 Revolution (which was preceded by unsuccessful Qing reforms) and even, in a way, the mainland’s abortive revolution of 1989 and the recent fall of the GMD in Taiwan illustrate the power of Tocqueville’s insight. These cases also illustrate the value of stressing, as Marx and Skocpol among others have, the revolutionary potential of groups at the fringe of the governing elite who have rising expectations and are frustrated by being kept away from real power. In addition, Chinese events of the third quarter of the twentieth century in particular underscore the validity of the claim made by both Tocqueville and Skocpol, again among others, that revolutions, while attempting to do other things, often end up strengthening the power of central governments. Finally, the Chinese case shows just how much truth there is in the tragic maxim that revolutions all too frequently and all too regrettably end up devouring their own children.

See also: China: Sociocultural Aspects; Communism; Communist Parties; East Asian Studies: Economics; East Asian Studies: Politics; East Asian Studies: Society; Maoism; Revolutions, Sociology of; Revolutions, Theories of

Bibliography

Bergère M C 1998 [English translation by Lloyd J] Sun Yat-sen. Stanford University Press, Stanford, CA
Bianco L 1971 [English translation by Bell M] Origins of the Chinese Revolution, 1915–1949. Stanford University Press, Stanford, CA
Brinton C 1965 Anatomy of Revolution. Vintage, New York
Dirlik A 1989 The Origins of Chinese Communism. Oxford University Press, New York
Duara P 1988 Culture, Power and the State: Rural North China, 1900–1942. Stanford University Press, Stanford, CA
Fairbank J K, Goldman M 1998 China: A New History, enlarged edn. Belknap Press of Harvard University Press, Cambridge, MA
Fitzgerald J 1996 Awakening China: Politics, Culture, and Class in the Nationalist Revolution. Stanford University Press, Stanford, CA
Friedman E, Pickowicz P G, Selden M 1991 Chinese Village, Socialist State. Yale University Press, New Haven, CT
Goldstone J A (ed.) 1994 Revolutions: Theoretical, Comparative, and Historical Studies, 2nd edn. Harcourt, Fort Worth, TX
Hartford K, Goldstein S (eds.) 1987 Single Sparks: China’s Rural Revolutions. Sharpe, Armonk, New York
Ono K 1989 [English translation by Fogel J (ed.)] Chinese Women in a Century of Revolution, 1850–1950. Stanford University Press, Stanford, CA
Perry E J, Selden M (eds.) 2000 Chinese Society: Change, Conflict and Resistance. Routledge, London
Skocpol T 1979 States and Social Revolutions. Cambridge University Press, Cambridge, UK
Spence J D 1982 The Gate of Heavenly Peace. Penguin Books, New York
Tang T 2000 Interpreting the Revolution in China: Macrohistory and micromechanisms. Modern China 26(2): 205–38
Wakeman F Jr. 1975 The Fall of Imperial China. Free Press, New York
Wang Z 1999 Women in the Chinese Enlightenment: Oral and Textual Histories. University of California Press, Berkeley, CA
Wong R B 1997 China Transformed: Historical Change and the Limits of the European Experience. Cornell University Press, Ithaca, NY
Wright M C (ed.) 1968 China in Revolution: The First Phase, 1900–1913. Yale University Press, New Haven, CT
Yue D, Wakeman C 1985 To The Storm: The Odyssey of a Chinese Revolutionary Woman. University of California Press, Berkeley, CA

J. N. Wasserstrom

Christian Liturgy

The Greek word leitourgos means, literally, ‘work of the people’ and was associated in ancient Greece both with the payment of civic dues and the performance of ritual duties. In the Christian context, by contrast, it had at first a more specific connotation, concerning the performance of the Eucharistic action—although this apparent narrowing of reference in fact indicated that this action was itself the supreme collective obligation and source of collective unity. Only much later did the term come to denote the entirety of Christian ritual practice. At first, in the seventeenth century, it was used as a neutral term, covering both the Catholic ‘Mass’ and Protestant ‘Communion’;
later, Catholic writers such as Dom Odo Casel worried about the degeneration in meaning if cultus tended to substitute ‘liturgy’ as a word describing the whole outward and inward opus of Christian piety.

To comprehend the history of the latter, therefore, one must attend also to the evolution in meaning of other terms: ‘mystery,’ ‘cult,’ ‘rite,’ and ‘sacrament.’ In the antique world there existed, broadly speaking, a contrast between public ‘liturgies,’ connected to the regular order of the city and its upholding, on the one hand, and private ‘mysteries,’ ‘cults,’ or ‘rites,’ sometimes connected with a certain dissension from civic order, on the other. Public liturgy included animal sacrifice, and was concerned primarily with a symbolic apportioning of different parts of the beast to gods and different classes of men. It was at once a mimesis and a reinstitution of civic order. Private cults, by contrast, also involved deviant sacrifice. Sometimes this was a deviation upwards, as with the Pythagoreans, who modified or refused animal sacrifice, and associated less bloody offerings not with a feeding of the gods but with the transit of their own souls to a higher realm. Alternatively, there were deviations downwards, as with the Eleusinian mysteries, which were essentially older, alien, more agrarian rites involving chthonic gods, reinterpreted in an urban context. Here initiates identified themselves with the perpetual dying and rising to life again of a god, who had been originally a god of fertility. By participating in the god’s own self-salvation, the mystes hoped to regenerate their lives.

Christianity fused together public liturgy and private rite, with momentous consequences. From early times, it interpreted worship, latreia, as meaning, after the incarnation, the entire offering of the whole person to God in charity, which included charity toward one’s neighbor. In this way, the whole span of human life was reconceived as ‘liturgical,’ since the new ‘city’ was an eternal city which also embraced true human life in time. On the other hand, the language of ‘mystery’ and ‘initiation’ was also embraced. In the case of St. Paul’s use of the term mysterion, it is true, the background is very unlikely to be that of the pagan mystery cult, as was once thought. Instead, the background is Jewish apocalyptic: thus Paul speaks of a primordial ‘mystery’ now disclosed to us in Christ, a mystery anticipated in the Jewish Passover, which involves a passage through destruction to renewal. Nevertheless, this meant that, at its heart, Christianity involved the mystery of the death and rising again of God, a mystery that was made present again in the rituals of baptism and the Eucharist. Later, Patristic authors expanded this notion in terms that owed something to pagan notions of initiation into secret knowledge: the Syrian fathers spoke of Raza, an originally Persian term (Raz) denoting secrets of state within the imperial court; one can note here that in this more oriental context, the ‘secret’ and the ‘public’ were already identified before Christianity, although neither was yet democratically available to all. This ‘orientalism’ was also present more in Rome than in Greece, for, in the case of Rome, the plebs were admitted or ‘initiated’ only gradually into the rites of connubium (sacred marriage). The Greek fathers spoke of mysterion in ways that fused Jewish apocalyptic expectation and exaltation with a Greek sense of participating in a hidden drama that yields understanding. Nevertheless, the association with mystery religions was viewed typologically, and distance as well as proximity was emphasized. One can note in particular the extent to which the pagan mysteries’ involvement of the reslaying of a god was exaggerated in the interests equally of resemblance with and contrast to the (voluntary) death of Christ. Unlike the pagan mysteries, the mystery of Christ involved a once and for all death and an unshakeable resurrection which saved not a god but mortals.

It remains the case, however, that now, in a more ‘oriental’ (and also more Roman) mode, the most public emerged from the most secret, a rite into which one first had to be initiated as a catechumen. Unlike the ancient Orient, furthermore, (although pagan Rome had already evolved in this direction), all could potentially be initiated into the Raza, or inner court secret. This was to break with the ancient Greek and Roman association of the aristocratic with the eternal and transcendent reserved for a few, on the one hand, and the democratic, associated with the immanent, open, ‘positive,’ unmysterious, and available to the many, on the other. Plato, in the Laws, had already begun to de-eroticize the transcendent, or, alternatively, to eroticize the democratic, since one must express this both ways around. Now, however, not just in a theoretical text, but in actual practice, the secret was publicized, and, equally, the most public—the Eucharist which engenders the ecclesia, the corporate identity—was rendered secret (and permanently mysterious, even for the catechized). Once, either some were to ascend, or all were to remain on an immanent plane; now all were to ascend, continuously.

In accordance with this new conjoining of liturgy and cult, the sacrifice of the Eucharist was public, yet involved no unequal apportionings. All now ingested all that was offered, and yet in eating the totality, the elements and those whom they fed were offered in their entirety back to God. In addition, all ‘sacrificial economy’ was broken with; nothing was any longer expected from God in return for one’s giving. The new sacrifice was one of pure gratitude to the God who, in any case, gives. As with the Pythagoreans, sacrifice involves mainly human elevation. This elevation, however, also now includes humans’ own free giving in charity to others. In this fashion, elevation no longer abandons the city. Indeed, for Augustine in the City of God, it is the whole city which is offered, which is elevated—since the city exists through elevation.

Because of this new, nonsacrificial economy, Augustine refused most of the pagan terms for ritual
or worship, espousing only the Greek term latreia (CD VI 3). Nevertheless, he spoke of a true Cultus as involving a participation in the one true offering made by Christ (CD VII 30). This was an ‘inner’ cultus, not in the sense that outer signs are inefficacious, but in the sense that they must be truly intended. Indeed, participation in the mystery of Christ through the Eucharist and the annual liturgical calendar is now so intense that Christians’ inner life must be understood also in cultic terms: their heart is not mainly ‘theirs,’ but rather an altar upon which they sacrifice to God (CD X 3; Ep. 140 Ad Honoratum, 18, 45).

Thomas Aquinas reiterated and expounded Augustine’s understanding of latreia and cultus. He stresses that cult does not have God as its object but as its end, since the aim is not to please God, but to be united with Him, and this is not brought about through the work of worship; rather, God brings it about Himself by meeting humans in liturgical acts of worship (S.T. III, Q. 81 a 5; Q. 24 a 5).

Up until the twelfth and thirteenth centuries, this understanding of cult was for the most part preserved. It worked against any notion that the essence of liturgy is a matter of ‘correct procedure,’ or that the liturgical sphere is a special domain standing over the practical and theoretical aspects of life. Even in Gratian’s Decretals, the aim of the canonist is to allow proper scope to the force of local custom, so long as this is not inconsistent with the custom and understanding of the church in general, following Augustine of Canterbury’s precept that ‘place does not approve a custom, custom approves a place’ (Decretals, 12. C 10).

Nonetheless, the later Middle Ages witnessed an increased notion of juridical thinking in the liturgical sphere, resulting in some separation of private piety from public ritual: in this way, the Christian logic of ‘the public secret’ started to come undone. The Reformation then protested against a mechanistic formalism which tended to suggest that the following of certain procedures secured salvation. In response, the Council of Trent sought diligently to reconnect the outer and the inner, and to reassert the notion of a sacramentum as an outward sign of an inner reality whose exteriority could not be dispensed with, since the fullness of this reality was only to be eschatologically disclosed, and then would transcend the inward/outward contrast. However, the implementation of the Tridentine decrees involved more and more a dry insistence upon outward observance, as sacramental practice came to be viewed as a series of instituted motions with formally consistent entailments ensuing automatically. Such an outlook at once harmonized with Enlightenment rationalism, and was itself part of what the Enlightenment rejected.

By contrast, the spirit of the most ancient tradition was renewed and rethought by Cardinal Bérulle, who insisted that the whole of Christian life, outer and inward, was a participation in, and, in a sense, a re-enactment of the incarnation: only the divine man who utterly gave Himself showed, by giving beyond humanity, true humanity. Ultimately, the French School which he founded helped to sustain some unbroken tradition of Christian liturgy as encompassing all of Christian life, since it was a sharing in the mystery of Christ. The centrality of ‘mystery,’ and so of the notion of a public secret (or a secret publicness) was emphatically renewed in the nineteenth century by Dom Odo Casel who (though he overstressed the influence of Greek mystery religion) was a decisive influence on Catholic liturgical renewal in the twentieth century.

In many ways, Vatican II restored the theological centrality of liturgical mystery, although it is questionable whether its practical liturgical recommendations were entirely in keeping with this understanding. It misinterpreted a particular ancient local practice at Rome as indicating that originally the priest stood behind the altar to celebrate the liturgy, whereas the design of even the most ancient basilicas suggests this could not have been the case. Most scholars now reject this conclusion, but the adopted new practice of the priest facing the congregation tends to reduce both the sense of approaching eschatological mystery, and of an equal approach by the entire public, both priest and people. Likewise, Vatican II mistook documents of broad direction for liturgical enactments, such as that of Hippolytus and Justin Martyr, as indications of original ‘simple’ liturgies, organized more formally and without supposedly ‘messy’ repetitions. In this way, another dimension of public mystery was lost in practice: the endlessly ‘stammering’ recommencement of an approach to an altar where one can truly worship—an approach that cannot be completed in time. Thereby both eschatological and apophatic aspects were diminished in practice, even though the reformers had stressed these in theory. Equally, the new Latin liturgy and its vernacular translations tended to lose metaphoric richness, typological resonance, and a sense of language as an epiphanic vehicle.

For these reasons, efforts in the tradition of Bérulle and Casel to restore the centrality of Christological mystery and public secrecy to Christian practice remain only partially accomplished.

See also: Pilgrimage; Ritual; Sacrifice; Symbolism in Anthropology; Symbolism (Religious) and Icon

Bibliography

Aquinas T 1964–81 Summa Theologiae. Eyre and Spottiswoode, London
Augustine of Hippo 1984 Civitas Dei [trans. Bettenson H]. Penguin, Harmondsworth, UK
Beard M, North J, Price S 1998 Religions of Rome. Cambridge University Press, Cambridge, UK, 2 Vols.
Bouyer L 1968 Life and Liturgy. Sheed and Ward, London
Bouyer L 1986 Mysterion: Du Mystère à la Mystique. Aubier, Paris
Bremmer J 1994 Greek Religion, Greece and Rome new surveys. Oxford University Press, Oxford, UK
Bremond H 1932 Histoire Littéraire du Sentiment Religieux en France t. 9: La Vie Chrétienne sous L’Ancien Régime. Aubier, Paris
Bugnini A 1990 The Reform of the Liturgy 1948–1975 (trans. M J O’Connell). Liturgical Press, Collegeville, MN
Burkert W 1985 Greek Religion. Archaic and Classical [trans. Raffan J]. Blackwell, Oxford, UK
Burkert W 1987 Ancient Mystery Cults. Harvard University Press, Cambridge, MA
Casel O 1942 Das Christliche Kultmysterium. Munich, Ratisbonne
de Certeau M 1975 L’Écriture de l’histoire. P.U.F, Paris
Congar Y 1947 Le Christ, image de Dieu invisible. La Maison-Dieu: Revue de Pastorale Liturgique 59: 132–61
Congar Y 1967 L’Ecclesia ou communicante Chrétienne, sujet integral de l’action liturgique. In: Jossua J-P, Congar Y (eds.) La Liturgie après Vatican II. Aubier, Paris, pp. 246–82
Dalmais I H 1990 Raza et sacrament. In: de Clerck P, Palazzo E (eds.) Rituels: Mélanges offerts au Père Gy. Editions du Cerf, Paris, pp. 173–82
Duval A 1985 De Sacraments au Concile de Trente. Paris
Elich T 1991 Using liturgical texts in the Middle Ages. In: Austin G (ed.) Fountain of Life. Pastoral Press, Washington, DC, pp. 69–83
Gratian 1995 The Treatise on Laws [trans. Thompson A]. Catholic University Press of America, Washington, DC
Harrison T 2001 Greek Religion: Belief and Experience. Duckworth, London
Lyonnet S 1967 La Nature du culte dans le N.T. La Liturgie après Vatican II. Aubier, Paris
Milbank J 1998 The politics of time: Community, gift and liturgy. Telos 113 (Fall): 41–69
Mohrmann C 1965 Sacramentum dans les plus anciens textes Chrétiens. In: (eds.) Études sur le Latin des Chrétiens. , Rome
Pickstock C 1998 After Writing: On the Liturgical Consummation of Philosophy. Blackwell, Oxford, UK
Price S 1999 Religions of the Ancient Greeks. Cambridge University Press, Cambridge, UK
Riedwig C 1987 Mysterienterminologie bei Platon, Philon et Klemens von Alexandrien. Berlin
de Roten P 1992 Le vocabulaire mystagogique de saint Jean Chrysostome. In: Triacca A M, Pistoia A (eds.) Mystagogie: Pensée Liturgique d’aujourdhui et liturgie ancienne. Edizioni Liturgiche, Rome
Rotureau G 1944 Le Cardinal de Bérulle, Opuscules de Piété. Aubier, Paris
Zaidman L B, Pantel P S 1992 Religion in the Ancient Greek City [trans. Cartledge P]. Cambridge University Press, Cambridge, UK

C. Pickstock

Christian Parties: European

Christian (or Christian democratic) parties have been among the most successful political movements in Europe. Together with their Social democratic counterparts, they have dominated European politics. They have been leading members of governmental coalitions in such countries as Italy, Germany, Austria, Belgium, and The Netherlands; they currently form (together with their conservative allies) the largest group in the European Parliament.

Christian democratic parties are not just parties that use this label (some do not). As a matter of fact, Christian democratic parties are hardly Christian. They are secular parties operating in highly secularized societies. Today they seem almost indistinguishable from conservative or liberal parties. However, they have a distinct history that both accounts for their particular nature and helps explain their major contribution to politics. This contribution can be articulated around two paradoxical outcomes: although they were formed initially to challenge the emerging European liberal democratic order, these parties eventually became pillars of secularism, liberalism, and democracy in Europe. Christian democratic politicians also pioneered the process of European integration, and although they fought hard against socialism, they ended up building vast welfare states. Their evolution is a stunning illustration of how democratization can be the contingent outcome of political strategies rather than result directly from the dissemination of democratic ideas and principles.

1. Origins

Contemporary Christian democratic parties evolved from the confessional parties that were created in the second part of the nineteenth century and were an expression of political Catholicism. These Catholic parties grew out of the largely antiliberal and ultramontane mass Catholic movement that challenged the ascendancy of liberalism in Europe from a ‘fundamentalist’ and theocratic perspective (as codified in the 1864 papal encyclical Syllabus of Errors). Indeed, Christian democracy was a concept coined in opposition to liberal democracy. Though spearheaded by the Catholic Church which feared for the loss of its privileges, especially in the field of education, Catholic movements won their independence from the church through their transformation into Catholic parties. The Catholic Church resisted this process, which robbed it of its monopolistic control over its flock, as much as it could; but it failed to thwart this development because democracy provided Catholic activists with a powerful source of power and legitimation. Although initially strongly opposed to democracy on ideological grounds, these activists quickly realized that their interests lay in its consolidation and further expansion (Kalyvas 1996).

The process through which confessional parties were formed had two important, though contradictory, consequences: first, it turned religion into the foundational element of confessional parties, the core of their identity; yet religion proved more of a hindrance than an advantage; second, the religious
appeal produced a highly heterogeneous sort of party, composed of interest groups which had been united only by their adherence to the message of religious defense; yet this social heterogeneity increased the salience of class within these parties.

2. Religion

Confessional parties, albeit friendly to religion, wanted to disassociate themselves from too close a relationship with the church. Religion restricted their appeal as well as their autonomy. Likewise, the church could only protect its universalistic identity by moving away from these parties. However, the confessional character of these parties could not be shed because religion had become the cement that kept their heterogeneous social basis together. This quandary was solved in an ingenious yet momentous way. Confessional parties redefined religion into a nebulous humanitarian and moral concept that allowed them to be simultaneously Christian and secular. Vague formulations such as ‘religious inspiration,’ or ‘values of Christian civilization’ remain the only references to religion one finds in the official discourse of these parties. This led to a situation whereby it is perfectly possible to be simultaneously a Christian democrat and an atheist. In fact, this is not even perceived as a contradiction.

Hence, Christian democratic parties contributed in a fundamental way to the ‘desacralization’ and secularization of their countries’ politics. Paradoxically then, the politicization of religion contributed to the secularization of politics. In a perverse fashion, Christianity was drained of its religious content even while being legitimated as a political identity—and this feat was accomplished by its proponents rather than its opponents. The secularization of confessional parties was, thus, endogenous to these parties and took place well before World War II, rather than being a delayed adaptation to external societal developments as often thought. Besides consolidating democracy, this development further enhanced the position of Christian democratic parties by laying the path, after World War II, for interdenominationalism, thus turning Christian democracy into a dominant party in confessionally mixed societies such as Germany.

3. Class

The social basis of confessional parties was made up of numerous and often-conflicting interests: social heterogeneity was these parties’ hallmark from the outset. In this sense, Christian democratic parties were catch-all parties avant la lettre (van Kersbergen 1994). This heterogeneity was the direct result of their ideological profile that emphasized religion at the expense of class. However, external ‘nonclassism’ produced internal ‘classism.’ Powerful Catholic Workers’ and Peasants’ associations had to be incorporated into the new parties which eventually adopted a peculiar confederate structure based on organizations defined in terms of class (standen or lager). The ensuing conflicts gave rise to intensely accommodationist and compromising practices that were necessary for ensuring the parties’ unity and cohesion.

Hence, mediation between these assertive and divergent interest groups became imperative. As a result, Christian democratic parties became particularly skilled in the exercise of the politics of mediation (van Kersbergen 1994), something their opponents have derided as opportunism and a belief ‘that the ends justify the means.’ The principle of subsidiarity (higher authorities, such as the state, should intervene only where individuals or smaller communities are not competent), a central concept in the process of European integration, can also be traced back to these developments.

Herein (rather than in papal encyclicals, such as the 1891 encyclical Rerum Novarum) lies the source of Christian democracy’s strong social component that distinguishes it from mainstream conservative or liberal parties. Indeed, a number of studies have found that Christian democratic strength is positively associated with high levels of welfare expenditure and high levels of unionization. Van Kersbergen (1995) has identified a distinctively Christian democratic core of social policies, which he calls social capitalism. This policy core differs significantly and systematically from both the liberal and social democratic conceptions of social citizenship. The Christian democratic welfare state is as large, in terms of expenditures and size, as its Social democratic counterpart; but it is quite different. It privileges families rather than individuals, cash benefits rather than social services, and seeks to preserve rather than subvert labor market outcomes. Following the end of World War II, social capitalism (along with political anticommunism), rather than religion, provided the foundation on which Christian democratic parties stood—and from which they ruled. Social capitalism was, thus, not only the outcome, but also the means of Christian democratic mobilization.

In the course of the 1990s, Christian democratic parties entered into a protracted and deep crisis. They have experienced steep electoral decline (Austria, Belgium, The Netherlands) or even total collapse (Italy); they have been implicated in major financial scandals (Belgium, Italy, Germany, Austria); and they have failed to expand, as expected, in Eastern Europe. Although the end of the Cold War can be partially credited for some of these developments, this crisis goes far deeper. The imperatives of global economic competition have undermined the social capitalist arrangements that had guaranteed both the parties’ cohesion and their electoral appeal. In other words, this is a crisis of Christian democracy rather than a
crisis of Christian democratic parties. Christian democratic parties will have to reinvent themselves if they want to remain a relevant force in European politics. Yet if they look back, they will find that their often forgotten history holds a vast repertoire of frequently unintended but nonetheless ingenious practices of adaptation and reinvention.

See also: Church and State: Political Science Aspects; Consumption, History of; Party Systems; Religion and Economic Life; Religion: Mobilization and Power; Religion, Sociology of; Religious Stratification; Socialization: Political; Western European Studies: Religion; Western European Studies: Society

Bibliography

Buchanan T, Conway M 1996 Political Catholicism in Europe, 1918–1965. Clarendon Press, Oxford, UK
Irving R E M 1979 The Christian Democratic Parties of Western Europe. Allen and Unwin, London
Kalyvas S N 1996 The Rise of Christian Democracy in Europe. Cornell University Press, Ithaca and London
Kalyvas S N 1998 From pulpit to party: Party formation and the Christian Democratic phenomenon. Comparative Politics 30: 293–312
van Kersbergen K 1994 The distinctiveness of Christian Democracy. In: Hanley D (ed.) Christian Democracy in Europe: A Comparative Perspective. Pinter, London and New York
van Kersbergen K 1995 Social Capitalism: A Study of Christian Democracy and the Welfare State. Routledge, London and New York

S. N. Kalyvas

Christianity: Evangelical, Revivalist, and Pentecostal

While ‘globalization’ as a concept arose as a fiscal planning tool in the 1950s, its extension as an explanatory concept for the sort of territory previously covered by such frameworks as ‘world systems theory’ dates from the 1970s. At the time of writing, Roland Robertson and Peter Beyer are the key writers in the area as it relates to religion. Indeed, Robertson’s pioneering framework arises out of the conundrum which religion presents for standard modernization theory. The rise of Islamic and other forms of religious fundamentalism denies the linear rationalist assumptions of modernization theory and such offspring as ‘globalization,’ which allows for simultaneous global homogenization of culture and local reassertions of difference. Robertson’s theory that cultural forces move faster than political and economic forces has received considerable support from historians and analysts of Christian missions, who have noted that the indigenization of Christianity has historically been faster and deeper in Asia and Africa than the movement of formal missionary and colonial structures. In South America, by contrast, there is a body of opinion which states that the combination of state apparatus and church extension actually retarded the extension of Christianity. It is clear, then, that globalization theory is vital for the understanding of the expansion of Christianity generally, just as the study of the expansion of Christianity is vital for understanding the processes of globalization.

While a slow starter to get into missions, evangelical Christianity began to articulate an increasingly sophisticated global vision from the late 1600s, with the rise of the Pietist movement. As Lewis has noted, Pietism restored a sense of personal call to Protestant orthodoxy which, through agencies such as the University of Halle and Moravian missions, expressed itself in a growing internationalism. The correspondence of Zinzendorf, the migration of Moravians, and missionary outreach to Russia, Greenland, and India, created networks and examples of successful missions which informed the expanding networks of new settler societies and expanding capitalism. It was on a trip to the New World that John Wesley, for instance, first ran into Moravians, who, through the influence of Peter Böhler, provided the synthesizing principle to Wesleyan eclectic theology and practice which in turn made it a vibrant missionary force. It was likewise among Scots-Irish and Dutch settlers in the New World that religious revival produced the combination of Protestant theology and religious experience which has become known as ‘Evangelical Christianity.’ Definitions differ, usually with personal affiliation, but a standard definition of Evangelicalism offered by David Bebbington locates the movement in a quadrilateral of theology and praxis involving activism, cross-centeredness, Bible-centeredness, and conversionism. The fact that nothing like the French Revolution occurred in England helped legitimize the movement as a source of social and cultural stability. The inner energies of the movement, combining a universalizing crucicentrism and biblicism, with conversion-oriented activism, meant that the tens of thousands of converts to the Evangelical Awakening on both sides of the Atlantic provided a vast pool of willing support to the evangelical program to change globalizing British society. The leaders of the move-
secularization theory, which suggest that regional ment—in particular William Wilberforce, Lord
difference and religious conviction should be passing Shaftesbury, the Clapham sect, and Selina, Countess
away in the face of rational technical systems and the of Huntingdon—were able to draw upon these net-
homogenization of culture. Robertson has since then works of support in a dynamic program of social
further articulated the theory through the concept of activism over the century 1750–1850. This activism

1752
Christianity: Eangelical, Reialist, and Pentecostal

arguably provided important impulse force to the processes of globalization, just as globalizing Western economies provided important resources and opportunities for evangelical expansion and change.

Protestants from Luther’s time were faced with an expanding globe in tension with an increasingly outmoded ecclesiology. While, during Luther’s own lifetime, Diaz rounded the Cape of Good Hope (1488), Columbus was driven west in pursuit of a millennial vision to land on the shores of America (1492), and other parts of the world were opened to Western influence by da Gama, Cabot, Cortes, Magellan, and Pizarro, Protestants were combining with increasingly nationalistic ruling classes (the nobility in Germany, bourgeoisie in Switzerland and The Netherlands) to reinstitute the ‘Christendom’ ideal within the national churches of rising nation states. While Catholic empires ruled the seas, Protestant missions were restricted to Europe. While the theology became increasingly arid and formalized, the impact of these churches was to reinforce the nexus between nationalism and capitalism. Capitalism expanded during the period through royal chartered companies, combining national expansion and capitalist enterprise. Halle Pietism specifically used this link to expand into Russia, just as Puritans used the chartered company to establish themselves in Massachusetts Bay. The CMS first fought with and later used the East India Company in Asia, while the Claphamites moved into West Africa on the tails of colonization companies.

While Weber’s ‘spirit of Protestantism’ has been revised and attacked in some quarters, it is nonetheless a useful heuristic when viewing the interaction of shipping, chartered, and colonization companies on the one hand, and missionary agencies on the other. Protestantism provided an ‘inner fascination with the world,’ and evangelicalism adopted the same fascination and extended it. There has been considerable work done, for instance, on the link between enlightenment and Evangelicalism, indicating that Evangelicalism, through its commercial and missionary emphases, was a major carrier of rationalizing Western concepts of time and space into the rest of the world during the late nineteenth and early twentieth centuries. Not only did Wesley proceed on the assumption of an ‘inner enlightenment,’ using both the vocabulary and technologies of enlightenment to further the evangelical cause (he was, for instance, an assiduous publisher), but Puritan and evangelical input into the sciences from the sixteenth to the nineteenth centuries was immense. Clergy were not only involved in agencies such as the Royal Society, but were key suppliers of information to enlightenment projects such as the social sciences. (Malcolm Prentis, for example, has demonstrated the link between clerical collection and the construction of an international culture of research in anthropology.)

The logical place to look for the globalizing influence of evangelicalism is through the agency of missions. The marginality of missions at the beginning of the nineteenth century demonstrates the close association of evangelical expansion with Western expansion. By 1800, North America was subject to European settlement, but only on the eastern half and the western rim of the continent, and though there was no real challenge to Hispanic hegemony in South America, Christianization had not really pierced deeply below the surface of the indigenous and emerging plantation cultures. The impact of missions in the Indic, wider Asian, and Muslim worlds was negligible. Obviously, then, developments in the nineteenth century were critical for the development of global Christianity in our own period. Europe at the time was weak and divided religiously. The industrial revolution encouraged both concentration of the sort of wealth which could capture foreign markets and provided the scientific curiosity and the speed of travel to take advantage of new opportunities. While most European expansion in the eighteenth century had been mercantilist in nature (with small colonies of isolated Europeans trading with Asian nations in sealed cantons such as Shanghai, Macau, and Hong Kong), the nineteenth century saw a shift toward extensive settlement and active domination of local politics. At least in part, this was because it was now technically possible, with advances in communication and military hardware, to dominate other less advanced cultures. So Britain absorbed India, Burma, and Ceylon into a new Empire, France annexed Indo-China, and The Netherlands finalized its annexation of Indonesia. Other European powers founded colonies, such as the German colonies in PNG, and annexed parts of Africa, such as Togoland, Southwestern Africa, and the like. At least for the British, Stanley argues that there was a consistent imperial impulse to British society, but no plan of conquest, most acquisitions having been gained in order to protect trading rights. Formal control was never exercised when the much cheaper form of informal control could be used, and it was the increasing need for formal control, driven by the intensified competition of European powers for global resources, which drove Britain to acquire and then defend India as the keystone in its defense of the East.

In terms of the other global religious cultures, evangelicalism also faced weakened opposition. One advantage of the time was that Islam was in retreat: in 1821 Greece was retaken, and from the 1830s European powers began dividing up Islamic parts of Northern Africa such as Morocco, Algeria, and Tunisia. Roman Catholicism at the beginning of the nineteenth century was under siege from the forces of modernization. With the eclipse of Portugal and Spain as imperial powers, increasingly the push for missions in the Catholic world came from France, Belgium, and (significantly for Australia) Ireland. While the Orthodox churches did not pursue much in the way of missions, despite the spread of the Russian Empire
during this period, the Protestants, energized by the Evangelical revival, certainly did, both within and outside the English speaking world. Stanley suggests that, while there were agencies such as the Society for the Propagation of the Gospel with a missionary element to them, before the Evangelical awakening there was no consistent acceptance of the burden of foreign missions by either Anglican or Dissenting Protestants. Europe was the same. In 1810, for instance, a revival known as the Reveil spread out of Geneva and affected the French Protestant world deeply, leading to considerable missionary endeavor overseas. Norwegian revivals, most notably commenced by Hans Nielsen Hauge, a lay reader preaching in a way which was not legalized until 20 years after his death, energized Scandinavian missions to North America and later to South America. German missionaries, in the tradition of Halle and the Moravians, traveled all over the world. The interaction of these revival movements and missions can be seen in their common adherence to interdenominational agencies such as the Evangelical Alliance, which was founded in 1847, and held conferences in France, Belgium, Switzerland, Germany, Canada, Sweden, and the United States.

The model provided by the Alliance of a voluntary society was to become the classic form for Protestant missions. Many were formed at the prompting of chaplains and clergy travelling with army, trading company or political representatives. The first of them was the Baptist Missionary Society (1792), inspired by William Carey, who in turn had been inspired to mission by the accounts of the travels of such explorers as Captain James Cook in the Pacific. It was followed by the London Missionary Society (1795), the British and Foreign Bible Society (1804), the Wesleyan Methodist Missionary Society (1813), and many others. Because the Church of Scotland refused the requests of its Evangelical party to form their own missionary agency, most Scots siphoned off into other agencies, such as the LMS. In 1810, preceded by many small bodies dedicated to evangelizing First Nations’ peoples, the first American foreign agency was founded in the American Board of Commissioners for Foreign Missions.

The extension of evangelical missions had a number of major interactions with the growth of global society. The first was that, with increased need for personnel, the missions started their own Bible training institutions. These in turn legitimized the long-standing ‘old dissenting’ academies, supported religious emancipation and social reform, and acted to break down the exclusivism of the established church-linked universities. European states, at slower or faster rates, responded to this broadening of the religious franchise (among other secularizing tendencies) by stepping back from the official sanctioning of one form of religious truth or another. By the end of the century, most European states had become generally ‘Christian’ countries without confessional attachments, and many, in their colonies (particularly in Canada and Australia) were actively encouraging liberal democratic states. Not only did missions change the education of the increasingly vital professional class at home, but in exporting the Western educational model, also exported Western professional values into the nation states which began to arise in Asia, Africa, and the Pacific from the late nineteenth century. In exporting values such as professional disinterestedness, the importance of technical expertise, and basic human rights, a common culture was established which facilitated not only the spread of global society, but also (through preservation and re-skilling of Fourth World peoples) the means for local opposition to global homogenization. The specter of evangelical African bishops nay-saying advanced liberal proposals at a Lambeth conference, or Aboriginal evangelicals controlling the balance of power in the national assemblies of the United churches of Canada and Australia, have demonstrated the interactivity between evangelical missions and mediation of global culture.

The establishment of comity arrangements between evangelical missions in foreign countries highlights the second major interaction. The movement of religious traditions outside their national boundaries produced a relativization of their content and form, creating movements toward ecumenism based on ‘common gospel’ assumptions, diminishing the importance of liturgical and cultural tradition, and homogenizing the culture of international Christianity. This was imported back into the West as de-denominalization, and into Third World countries as ‘alliance Christianity,’ enabling Christianity to be described as a unitary object in contact with other world religions. This sense of relativization of form sparked concerns about relativization of truth, concerns expressed most markedly in the fundamentalism/modernism debates of the 1920s onwards.

Fundamentalism self-consciously identifies itself as a countermodernist movement, particularly in terms of theological modernism. Evangelicalism maintains something of this thrust, at the end of the twentieth century increasingly identifying itself as a counter-postmodernist movement. On another level, however, these debates can be seen as localized resistance to homogenizing forces at the global level, resistance seen most markedly in the general opposition of national evangelicals to agencies such as the World Council of Churches. In some cases, this meant the localization of evangelicalism into extreme nationalist forms. In most cases, however, especially after the rise of the National Association of Evangelicals in the USA and the Billy Graham Organization, the centrality of missions and a self-identity partly defined by contention for the faith caused evangelicals after World War II to take the same globalizing path that the WCC had already taken. This is reflected in the formation of agencies
such as the Lausanne Committee, the World Evangelical Fellowship, and worldwide denominational fellowships such as the World Alliance of Reformed Churches. Particularly after World War II, these agencies provided the basis for expansion of English and, with increasing hegemony, American evangelicalism onto the world stage. By the 1990s, it was clear that denominations had become almost irrelevant to the self-definition of most evangelicals, and that localized megachurches were more important for agenda setting.

Likewise, evangelical missions had shifted from being denominationally and agency based to increasing integration with global movements in migration, occupation, and tourism. Short-term mission, and mission originating from countries in the ‘South,’ contribute nearly as much to the total Christian missionary workforce as do traditional, First World, agency based missions, and is likely soon to outstrip it.

Within these tendencies in evangelicalism in the nineteenth century arose the issue of ‘revivalism.’ Revivalism is a technical accommodation between the experience of revival, as formalized during and after the First Great Awakening by Calvinists such as Jonathan Edwards, and the utilitarianism implicit in early nineteenth century postenlightenment thought. The emphasis of Edwards and Wesley in defending the use of ‘means’ for the furtherance of religious revival was codified and extended during frontier revivals in New York State and the expanding American frontier. Not least among the codifiers was Charles Grandison Finney, whose Lectures on Revival became the great ‘how to’ book of the nineteenth century. Revivalism as it expanded from the USA to England, and then around the world (by way of church and migrant diasporas) developed an array of techniques to maintain the association between Christian mass evangelism and spiritual response. It became the tool of choice for many evangelicals as Western societies seemed to be slipping away from their commitments to national religious frameworks, a set of ideas crystallized in the premillennial theology of archetypal evangelistic movements such as the Brethren, Baptists, and Methodists. As a technical accommodation to utilitarianism, revivalism was a natural client and promoter of globalization, which was very largely fueled by the technical innovation of high modernist capitalism. So revivalism quickly spread by the development of Methodist itineracy into the big campaign evangelism of D. L. Moody, Alexander Somerville, Gypsy Smith, and the like. Such itineracy became global, as shipping, then automobiles, and then airplanes, brought other parts of the globe within reach. As communications technologies also improved, through telegraphed newspaper reports, international postage, radioed accounts, and finally television coverage (culminating in the global campaign by satellite of Billy Graham in 1996), evangelicalism spread with revivalism to all parts of the globe. It is important to note in this regard the high degree of evangelical participation in modern technological enterprises, from Back to the Bible Radio, to the massive media empire of the Universal Church of the Kingdom of God in Brazil. This has been seen as a paradox by some commentators; it is less obviously so when revivalism’s accommodation to technological utilitarianism is taken into account. Though revivalism is not limited to evangelicalism (Catholic revivalism, for instance, or the adoption of revivalistic techniques by Hindu and Buddhist promoters in South and Southeast Asia), and not all evangelicalism is revivalistic, it is clear that the two are blood relations and natural partners in many cultures around the world. At the same time, the fact that revivalism is a distillation of ‘revival tradition’ into a technological form which can thereby cross cultural boundaries makes it eminently transportable. For this reason, Edith Blumhofer has called the major global form of modern revivalism, pentecostalism, ‘world evangelicalism,’ because of its facility with communications technologies. The ability of pentecostalism to ‘cast’ itself as a communal form of thought through the abstracting processes of print media and CITs makes it a prime globalizing force. On the one hand, this makes for global expansion of evangelicalism through its revivalist and Pentecostal forms. On the other hand, the same process homogenizes its distinctives and causes it to adapt new local distinctives as it moves across cultures. The lack of a single center (unlike either Islam or Catholicism) is thus a great facilitator of growth, but is also the prime cause of internal dissension and fragmentation as various local evangelical cultures seek to adopt or invent normative centers which enable local communities and protect them from the effects of fragmentation. Neo-fundamentalism, Calvinist renewal movements in the Southern Baptist Convention, the Toronto and Brownsville revivals: each of these may be seen as the creation of new centers, which feed from the energies of globalization, and attempt to create islands of order amidst the chaos of global society.

In particular, evangelicalism as it stands at the beginning of the twenty-first century is clearly largely a two-thirds world phenomenon. As Piggin, Reed, and others have shown, revival is the inculturation of Christianity into local cultures. It is not too much to say, therefore, that the theology of revival, particularly as evidenced in Pentecostal revivalism since the 1950s, is a mainstream legitimating structure for the indigenization of Protestant evangelicalism. This is particularly evident in revivalistic outbreaks such as the East Africa Revival in the 1930s, the blossoming of Pentecostalism in Brazil and later in Argentina, and the Aboriginal revival commencing at Galiwin’ku in Australia’s Northern Territory from 1978. In each case, evangelicalism struck new roots and found new accommodations with existing religious cultures which have in turn been exported into the global religious network.
While the East Africa revival had comparatively little experience of charismatic manifestations, Brazil and the Aboriginal revival have been highly charismatic in their combination of local and general characteristics. As Freston notes, the massive expansion of evangelicalism in that showcase of evangelical inculturation, Brazil, is likewise largely Pentecostal in nature. As he noted in 1998, a ‘survey of evangelical institutions in Greater Rio de Janeiro discovered that, of the 52 largest denominations, 37 were of Brazilian origin, virtually all Pentecostal. While only 61 percent of all evangelical churches were Pentecostal, 91 percent of those founded in the previous three years were’ (Freston 1998, p. 74). Of the 450 million ‘evangelicals’ and 750 million ‘Great Commission Christians’ estimated by David Barrett to exist in 1990, some 350 million could be counted within the Pentecostal or charismatic camp. It is clear that at the end of the twentieth century evangelicalism’s expanding edge was Pentecostal/charismatic in nature, leaving the national traditions of the First World heartlands significant challenges of reinvention in their struggle to survive the disappearance of their defining ethnic and national boundaries.

See also: Globalization and World Culture; Globalization, Anthropology of; Historiography and Historical Thought: Christian Tradition; Religion: Evolution and Development; Religion, Sociology of

Bibliography
Bebbington D W 1992 Evangelicalism in Modern Britain: A History from the 1730s to the 1980s. Baker Book House, Grand Rapids, MI
Beyer P 1994 Religion and Globalization. Sage Publications, Thousand Oaks, CA
Blumhofer E L 1993 Restoring the Faith: the Assemblies of God, Pentecostalism, and American Culture. University of Illinois Press, Urbana, IL
Blumhofer E L, Spittler R P, Wacker G A (eds.) 1999 Pentecostal Currents in American Protestantism. University of Illinois Press, Urbana, IL
Carwardine R 1978 Transatlantic Revivalism: Popular Evangelicalism in Britain and America, 1790–1865. Greenwood Press, Westport, CT
Featherstone M, Lash S, Robertson R (eds.) 1995 Global Modernities. Sage Publications, Thousand Oaks, CA
Freston P 1998 Latin American perspectives. In: Hutchinson M, Kalu O (eds.) A Global Faith: Essays on Evangelicalism and Globalization. CSAC, Sydney
Marsden G M 1980 Fundamentalism and American Culture: the Shaping of Twentieth Century Evangelicalism, 1870–1925. Oxford University Press, New York
McGrath A 1995 Evangelicalism and the Future of Christianity. InterVarsity Press, Downers Grove, IL
Noll M A, Bebbington D W, Rawlyk G A (eds.) 1994 Evangelicalism: Comparative Studies of Popular Protestantism in North America, the British Isles, and Beyond, 1700–1990. Oxford University Press, New York
Poewe K (ed.) 1994 Charismatic Christianity as a Global Culture. University of South Carolina Press, Columbia, SC
Pollak-Eltz A, Salas Y (eds.) 1998 El pentecostalismo en América Latina entre tradición y globalización. Ediciones Abya-Yala, Quito, Ecuador
Robertson R 1992 Globalization: Social Theory and Global Culture. Sage, London
Robertson R, Garrett W R (eds.) 1991 Religion and Global Order. Paragon House, New York
Smith T L 1980 Revivalism and Social Reform: American Protestantism on the Eve of the Civil War. Johns Hopkins University Press, Baltimore, MD
Synan V 1997 The Holiness-Pentecostal Tradition: Charismatic Movements in the Twentieth Century, 2nd edn. W. B. Eerdmans, Grand Rapids, MI
Thomas G M 1989 Revivalism and Cultural Change: Christianity, Nation Building, and the Market in Nineteenth-Century United States. University of Chicago Press, Chicago

M. Hutchinson

Christianity in Anglo-America and Australasia

Historically, the major religious traditions for Anglo-America and Australasia are similar, even though the social contexts in which they are located vary considerably. That they all function, more or less, as cultural establishments within their countries provides a substantial basis for looking at them as a unified religious bloc. The countries making up this large bloc extending across continents (that is, Canada, the United States, Australia, and New Zealand) share origins as English colonies and, thus, are relatively young as nations and bound by Anglo legacies of language and culture. While the countries themselves are relatively young, in another sense the Protestant cultural establishments within them are increasingly viewed by many people as old and faltering, yet another reason for looking at them as of one piece. For all these reasons, comparative analysis of religious trends and dynamics is both possible and desirable, despite some obvious differences in size, polity, and demographics for the various countries.

1. Historical Considerations

From the outset, we should not underestimate the extent to which Anglo religious life continues to be fundamentally shaped by two historical realities, first, a Protestant background, and thus a theological heritage of resistance to religious authority and, second, a common experience and trajectory in the encounter of religion and modernity in the West. Obviously the two are closely related. The Protestant
Reformation unleashed religious energies previously point out, Protestantism has confronted the modern
contained in medieval Catholicism and loosened the situation longer and to a greater extent probably than
hold of ascriptive loyalties, thereby giving rise to any other religious tradition. That of course is rapidly
greater autonomy of the individual in matters of faith changing in the contemporary world where no re-
and morals. Greater religious choice was accompanied ligious tradition can insulate itself from the pluralizing
by an elaboration of religious forms and styles, or a and individualizing trends. Yet for this very reason,
further working out generally of what is often referred the mainline Protestant experience is important and
to as the ‘voluntary principle’ within Protestantism. indeed paradigmatic of what other religions are now
This principle is most pronounced in the United States, facing. Indeed, the fate of Protestantism in the modern
but elsewhere with Anglo influence, even in England West is what people of other faith traditions are quick
itself, religious establishments have functioned in the to notice and worry about as perhaps their own. What
modern period in an environment of considerable they see are deep internal strains within the tradition
individual freedom. Protestantism has both given and disputes between liberal progressives, on the one
shape to, and been shaped by, high levels of individual hand, and conservative traditionalists, on the other,
freedom. In fact, three quite distinct historical waves over moral and religious principles and how to forge
of Protestant dissent legitimizing greater individual responsible religious styles in the contemporary world.
autonomy may be identified: first, the Calvinist move- Having said all this, it is not necessarily the case that
ments; second, the Methodist revivals; and more ‘Protestant’ translates to ‘Anglo’ and certainly not
recently, the Pentecostals. As David Martin observes that Anglo faith traditions can claim any superiority in
(1978), the three religious waves correspond to shift- their responses to modern life. The spiritual malaise
ing, ever-widening spheres of social influence. Unlike that has currently fallen upon the historic, so-called
the Calvinists who were limited mainly to social elites, ‘mainline’ Protestant institutions throughout much of
the Methodists empowered working-class and middle- Anglo-America and Australasia augurs against any
income populations, and the Pentecostals reached still such claim of superiority. Nor is there any necessary
lower social strata, and did so–and still does–in places presumption that other faith traditions must inevitably
and contexts often beyond the reach of the other two. follow the Protestant trajectory in its confrontation
For all three, the emphasis on the individual, on with modernity; there are multiple passages and
education and hard work, and on taking charge of courses of development. Even to argue that the
one’s life and making the most of it resulted in upward Protestant experience, given its strong Anglo connec-
mobility, and with that, a re-focusing of theology and tions, is paradigmatic for other religious traditions is
moral principles giving greater weight to the role of an to risk the charge that this is still another, albeit subtle
individual’s choice and conscience. claim to hegemony—alongside a history of racism,
The encounter with modernity in the West led to a sexism, and capitalist exploitation. But we need not
wide array of consequences, but most notably, to a draw this latter inference, especially considering the
loosening of the binding character of tradition and visible, widespread examples of ‘Protestantization’
memory and an increased awareness of religious within other faith traditions, presumably the choice of
pluralism. Both of these mesh well with the mounting importance of individual autonomy and reliance upon personal choice as articulated religiously. With all this came foundational shifts, or what Peter Berger (1979) describes as a ‘loss of ontological certainty.’ Broadly speaking, the challenges to religious authority and increased pluralism ushered in an era of greater negotiation in matters of faith and commitment in the face of a widening array of choices. As the term ‘secularization’ is often used, it is much too encompassing and glosses over the nuances of religion’s encounter with modernity to adequately describe what happens in this situation, but suffice it to say that the latter, as a historical and increasingly global process, creates ruptures in shared religious views and forces upon ordinary people a posture of cognitive bargaining. Organized religion’s monopoly over religious and spiritual questions is easily undermined. Religion’s presence in the public arena undergoes a qualitative shift. For our purposes here, what is important is not just the parallels between Protestant influence and religion’s encounter with the Enlightenment but the fact that, as Berger and many other commentators

conscious believers from within those traditions. Rather, we should look to the various countries described in this article as case studies in which to describe Anglo religious trends, and to examine how these old Protestant establishments view themselves and the extent to which they still define and shape religious life within their environments. Writing some 30 years ago, sociologist Charles H. Anderson observed that ‘The decline of the white Protestant majority and of white Protestant hegemony in twentieth century America has encouraged the growth of self-conscious Protestant community’ (1970, p. 3). It is an observation about religious identity which applies, in varying degrees, to Anglo-Saxon populations in all the daughter societies of England, and thus a good reason why we should take a close look at these societies.

2. United States

We look first at the United States. It is the prime example of a country with a strong Protestant legacy where the religious norms of individualism are now

Christianity in Anglo-America and Australasia

widely diffused and where the historic, so-called mainline Protestant denominations are mired in a deep spiritual malaise. Some would argue that the United States is the society that has undergone the greatest impact with modernity, that in many respects its beliefs and values are highly secular, or more precisely, that religion in this country is secularized ‘from within,’ and thus a model of sorts of what to expect in the modern world. Yet, as is commonly observed, by almost any standard of comparison the United States remains distinctive among modern nations given its high levels of church-going and religious membership. Despite significant shifts in religious styles, levels of religious participation for the country as a whole have not greatly changed over the past four decades. Protestant values encouraging the responsible practice of faith on the part of ordinary believers, and not just by religious elites or by politicians seeking votes, remain deeply ingrained in its public life.

It is widely accepted that the religious distinctiveness of the United States is explained by ‘supply-side’ thinking, that is, by a history of separation of church and state (the heritage of ‘voluntarism’) and a culture encouraging innovative, competitive religious leaders to gather followers around them and to organize new churches. Innovation in recruitment methods, in developing new organizations, and in the framing of religious messages coincides with a culture that stresses rational choices on the part of individuals. The fact that so many Americans ‘switch’ religious affiliations and move in and out of religious organizations frequently underscores a high level of choice and social accommodation. Indeed, there is much to support the rational-choice perspective on individuals and their religious affiliations and styles. In religion, as with the mass marketing of goods and services of all kinds, Americans respond to skillful packaging and presentation of the product. ‘Selling God’ is taken for granted in an economically driven culture, as is the selling of anything else, despite the fact that preachers, priests, and rabbis often resist any notion that theirs is a message that in any way would or should be sold.

But there are other considerations involved in accounting for religious trends and dynamics in the United States. An important example is the downward spiral of the ‘oldline’ Protestant churches, some of them with histories dating to colonial times. It is much too massive a restructuring of American religious life to explain simply by the rise of televangelism or the use of any other innovative techniques for recruiting. Episcopalians, Presbyterians, the United Church of Christ (formerly, the Congregationalists), the Disciples of Christ, and the United Methodists all began to lose members in the mid-1960s and have continued to do so in the intervening years. The declines appear to be slowing down and may possibly be bottoming out, but the fact is that this switch in fate for these culturally established churches (made up largely of white, middle-class, Anglo types) came so suddenly, and involved religious organizations so strikingly different in theological heritage and polity, that it could only signal something deeply troubling in the culture at large. To begin with, the low and declining birth rates (well below the replacement level) finally caught up with these churches. The birth rates were already lower than for other traditions, but after the birth-control pill became available to the public the rates dropped even further, indeed so low that many of these churches in the late 1960s and 1970s had so few teenagers they could not provide very effective programming. A ‘generation gap’ emerged as large numbers of the post-World War II generation effectively dropped out of active involvement. In fact, the best predictor of declines in these churches was the diminished church school enrollments 10 years earlier: fewer children were brought up in the churches, beginning in the early 1960s, which later translated into significant declines in worship attendance and financial contributions. The decade of the 1960s was itself an important turning point in a broader cultural sense. John F. Kennedy was elected the first Roman Catholic president, an event significant symbolically to a Protestant sensibility and its fears of losing power and influence. The antiestablishment ethos of that period brought on by the struggles over civil rights, the Vietnam War, and gender role and family changes likewise made for a distinct change of mood and outlook, working against those religious institutions that had long been closely identified with mainstream values.

It is sometimes said that these old Anglo institutions have depleted their theological resources, that the problem is one of ‘tired blood.’ Certainly the institutions do suffer from a loss of energy and direction and for a long time have ridden on inherited cultural capital. The shift in demographics was itself not inconsequential: the size of the Protestant majority steadily decreased in the twentieth century, as did the ethnic consciousness and identity underlying many of the Protestant communities. Anglo populations and cultures generally feel the squeeze resulting from growing Hispanic and non-Christian faiths. Yet despite the erosions of consciousness and institutions, in a very real sense these churches triumphed to a degree within the culture: many of their goals arising out of the Social Gospel movement were achieved and, more to a theological point, the Protestant principle about believing for yourself and taking responsibility for your actions, long inculcated within these traditions, now emerged in, admittedly, a far more radical and liberated form. In the 1960s and 1970s, young people, many of them nominally Protestant, turned inward and explored religious alternatives; they sought after experiential faith in a more direct, intimate sense than they had usually found in the dry rituals of the established churches; they felt free to create their own personal collages of belief drawing from a variety of
sources and traditions, and worried less about theological consistency and more about the deeper meanings of whatever beliefs they held. Words like ‘soul’ and ‘spiritual’ which in the 1950s had all but disappeared in public discourse now returned with excitement. In this and so many other ways, signs pointed to a spiritual ferment which in its early phases was not so much opposed to religion as alienated from existing bourgeois religious forms, and increasingly so for many college-educated, middle-class youth.

With the declines of the older religious institutions came other developments that point to major restructuring of American religion. One is the deep cleavage between liberals and conservatives, the latter having been rejuvenated in its crusades to restore a Christian society along the lines defined by evangelicals and fundamentalists. Since the 1970s the latter have grown, often picking up dropouts and dissidents from the oldline Protestant churches, and not without subtle appeals, often in the most conservative sectors, to reclaim an older, Anglo-based moral and religious order. While the thesis of a ‘culture war’ is easily overstated, there is much tension and occasionally overt conflict over unresolved issues like abortion, homosexuality, and prayer in public schools. Such disputes arise out of seriously conflicting, indeed incommensurate, notions of moral and religious authority, whether out of humanistic conceptions of the self and metaphorical views about truth in search of personal truth and happiness or literal interpretations of Biblical authority and a monarchical view of a transcendent God who commands people to obedience.

A second and related development is the rise of a full-blown spiritual quest culture that now permeates much of the nation’s population: among mainline Protestants who find their own rituals and worship services in need of rejuvenation, among many ‘seeker-minded’ evangelicals who know very little about Christian tradition but are eager to adapt faith to their needs and concerns, and among secularists who turn to New Age religions (or more broadly, the ‘New Spirituality’) in search of inner truth and enlightenment (Roof 1999). This quest culture drives some Protestants to rediscover and reclaim their heritages; for the great majority, it seems to raise their levels of spiritual sensitivity more than their actual commitments to institutions. At present, there is considerable exploration not just of what was once called ‘alternative religions’ but of psychological teachings and inspirational literature, often of a rather generic quality; much influence of popular television programs and films that address spiritual themes directly; and many new entrepreneurs operating in the religious marketplace offering their versions of spiritual wisdom, holistic thinking, and mind-cure. In fact, many people who are interested in the New Spirituality have no difficulties acknowledging a Protestant past, which means that large numbers of Americans today combine old religious identities and new spiritual sensitivities with great ease, and in ways that appear to be personally rejuvenating but which to their grandparents would no doubt be incomprehensible.

3. Canada

Similar patterns of post-World War II Protestant decline are evident in Canada, despite the fact that the religious situation is very different in that country. Canada is historically less pluralistic, though that is now rapidly changing, and much more shaped historically by the presence of two large religious constituencies, Catholic and Protestant, each vying with one another for power and influence. Catholics have long been numerous in Lower Canada and the Protestants, especially Anglicans and Scotch Presbyterians, are sizable in Upper Canada. For the country as a whole, Protestants for a long time numbered more than Roman Catholics. But as of the national census in 1991, Roman Catholics, benefitting from a higher birth rate, edged above them with roughly 12,500,000 members, concentrated largely in Quebec. The United Church of Canada (a merger in 1925 of Congregationalists, Methodists, and some Presbyterians) is the next largest at approximately 2,000,000 members, concentrated in Ontario and the Prairies; the Anglicans (plus the Orthodox) have substantial numbers, around one million. Baptists, Presbyterians, Lutherans, Pentecostals, and assorted other conservative Protestants are the next largest groups, most of them below a million members each. Birth rates for Mormons, the Salvation Army, and for conservative Protestant groups generally are higher than for the more established Protestant denominations.

But the fate of Protestantism in Canada differs in some important respects compared with the United States. Mainline Protestants have suffered membership declines there, losing both to the conservative churches and to Roman Catholic parishes. Proportionately they have lost more to Catholicism than is probably the case in the United States where just the opposite pattern appears to be more prominent, that is, somewhat greater losses to the religious conservatives. Statistically this is what one would expect: the larger the competing population, the greater the likelihood of mainline Protestants switching in, as a result of marriage or by religious choice. The conservative religious presence in Canada is neither as well-institutionalized nor as publicly visible as in its neighbor to the South. Sociologist Reginald W. Bibby (1993) argues in fact that the alleged recent growth of conservative Protestantism in Canada is largely a misperception, more really a ‘circulation of the saints,’ or movement from one evangelical church to another, than a case really of active and successful recruitment
from other churches. Protestants on the whole, and conservative Protestants in particular, tend to be converted over and over again, as a result of social and geographic mobility, marriage, divorce, friendship patterns, preference for religious leaders, and congregational activities. Bibby’s research suggests some slight increase in the recruitment of outsiders into the conservative fold over the past two decades, but even with these increases the actual number of converts remains small.

Overall, levels of religious participation in Canada are moderate: higher than those of most European countries but lower than those found in the United States. Conservative Protestants have higher levels of weekly worship attendance than other Protestants and Roman Catholics, but because of their relatively small size proportionately and tendency to circulate among themselves their impact is not as widely felt as in the United States. The liberal–conservative cleavage within Protestantism is certainly evident, but the ‘culture-war’ infrastructure in Canada is weaker than in the United States, and therefore less of a structural feature within the society. The conservative-moralist flavor of evangelical Protestantism never had sway over Canadians in the way historically it did, and to some extent still does, in the United States. Of greater significance is the influence of the New Spirituality, which does less to form a separate enclave than to permeate religious communities of all kinds, and especially the more moderate-to-liberal communities. Again, to quote Bibby, Canadians are very much into ‘religion à la carte.’ Or, as he explains, ‘New Age religion seems to follow the pattern of most new entries into the Canadian religion market, offering consumers optional items that can be added to more conventional religious beliefs and practices’ (Bibby 1993, p. 52). It is not clear whether this is more true in Canada than in the United States; in both countries great numbers of people interested in New Age and metaphysical thought continue to identify, to a considerable extent, with an inherited faith tradition, as Protestant or Roman Catholic. This represents a major convergence for the two countries in the adaptation of religion to modernity. In both of these highly individualized settings, religion is not so much abandoned in some strict sense as it is privatized and modified to fit into people’s life-situations at any given time.

4. Australasia

The very term ‘Australasia’ signals a confrontation and mixing of cultures, of East and West, a term quite fitting to both of our remaining countries, Australia and New Zealand. Both countries have a heritage of European settlements and Anglo dominance over aboriginal cultures, with the practices and structures of that heritage reflected in their social institutions. Both countries know the tensions arising out of an expanding religious pluralism, brought on largely by successive waves of new migrants. Both are often described as increasingly secular societies, especially in political and civic life.

Yet there are significant differences. Australia has seen fewer new religions or even new versions of older religions than has New Zealand. Religious history for the former is more a telling of stories about a European past, unlike for the latter, where such stories are shaped more by local conditions and cultural and religious admixtures. Even Anglican church history differs. In Australia, the Anglican Church began as a convict church served by chaplains for whites, whereas in New Zealand, the Anglican Church was at first, and for a long time afterwards, a Maori aboriginal church served by missionaries. It was not until the mid-1800s that the Pakeha, or white Anglicans, began to outnumber the Maoris. The Maori Ratana and Ringatu faiths are two examples of New Zealand’s popular mixing of religious and cultural themes, combining aboriginal and Christian elements. This difference between the two countries need not be exaggerated, but it does register in religious and spiritual styles.

Australia’s religious profile today is as follows: Roman Catholics, 27 percent; Anglicans, 22 percent; Presbyterians, Methodists, and Uniting Church, 13 percent; other Christian, 11 percent; other religions, 3 percent; no religion or not stated, 23 percent. The Catholic ascendency over Anglicans emerged about 50 years ago. The old, well-established religious communities (the Anglicans, the Uniting Church, the Presbyterians, and Lutherans) have all suffered declines in absolute numbers during the past two decades. The Christian or Christian-derived groups growing include the Pentecostals, Oriental Christian, Jehovah’s Witnesses, and Mormons. Non-Christian groups rapidly increasing but still relatively small proportionately are Buddhists, Hindus, and Muslims, representing the latest of migrating populations into Australia. The greatest change in religious identification since World War II, and particularly among those who were born in this period, is the increase in people having ‘no religion’ or choosing not to state a religion in the national census. Since the 1970s this population has doubled. Likewise, religious attendance has declined in most of the churches while, somewhat paradoxically, stated interest in spiritual well-being appears to have increased.

In New Zealand, the Anglicans remain the largest single constituency considering that both the Pakeha and the Maoris are included. Roman Catholics are the next largest population, followed by Presbyterians with a continuing, strong Scottish heritage. About 50 percent of the population identify with one of the major Christian churches; by contrast, the evangelicals and Pentecostals appear not to have had as much success in recruiting in New Zealand as they have in Australia. Recent migrant religious communities are
growing but are infinitesimal in size compared with all others. There have been substantial declines in religious affiliation and congregational involvement since World War II, especially for major churches, and mostly as a result of trends among younger whites. A third of the population currently report having ‘no religion’ or object to stating what it is. That the white and non-white populations are on somewhat different religious trajectories is apparent, evident most in the Anglo trends toward reduced organizational involvement; for the Maoris with their folk traditions and close spiritual attachments to the environment, organizational participation was never held up as a religious norm in quite the same way it was for Anglos.

Both Australia and New Zealand have experienced enormous changes in the past half-century, as they evolved from being made up predominately of villages and tribes to becoming more urban and multicultural in character. Religion has lost much of its power to integrate life experiences and has become one among many life-worlds in which people may or may not participate. Anglo religious traditions have suffered in the process. There, as elsewhere in modern societies, we observe greater eclecticism and pragmatism. However, compared with Canada, and even more so with the United States, two things stand out making the religious responses to modernity in Australasia different: one is that fundamentalism seems to attract fewer people and offers less of a viable alternative, and the second is that there is less movement generally from one faith tradition to another, the more common movement being simply to leave organized religion in favor of ‘no religion.’

See also: American Studies: Religion; Catholicism; Christianity in Asia; Christianity in Central and South America; Christianity Origins: Primitive and ‘Western’ History; Church and State: Political Science Aspects; Nationalism: General; Religion and Politics: United States; Religion: Definition and Explanation; Religion: Nationalism and Identity; Secularization

Bibliography

Anderson C H 1970 White Protestant Americans: From National Origins to Religious Group. Prentice-Hall, Englewood Cliffs, NJ
Berger P L 1979 The Heretical Imperative: Contemporary Possibilities of Religious Affirmation. Anchor Press, Garden City, NY
Bibby R W 1993 Unknown Gods: The Ongoing Story of Religion in Canada. Stoddart, Toronto, Canada
Bouma G D 1992 Religion: Meaning, Transcendence and Community in Australia. Longman Cheshire, Melbourne, Australia
Martin D 1978 The Dilemmas of Contemporary Religion. St. Martin’s Press, New York
Roof W C 1999 Spiritual Marketplace: Baby Boomers and the Remaking of American Religion. Princeton University Press, Princeton, NJ
Wuthnow R 1988 The Restructuring of American Religion: Society and Faith Since World War II. Princeton University Press, Princeton, NJ

W. C. Roof

Christianity in Asia

Christianity in Asia has an ancient history though it remains a minority religion. There are approximately 85 million Christians in Asia but more than 40 percent of them live in just two countries, the Philippines and South Korea. The minority status of Christianity in Asia gives it a distinctive sociological character in relation to Christianity in other regions of the world. For many Asian theologians this distinctive sociological context has meant that the particular focus of their articulation of the Christ of Asia is that of dialogue with the dominant religious and cultural traditions of Asia. In the postcolonial era a strong element of Christian social militancy was observable in many Asian countries. This militancy was associated with the ecumenical Christian Conference of Asia, and especially its sponsorship of Urban Rural Mission in a number of countries. The Catholic Federation of Asian Bishops’ Conferences is also an organization which has frequently confronted social and political controversies in particular Asian countries. Postcolonial Asian Christian social protest has focused on various concerns including the continuing social power of caste in India, workers’ rights in South Korea and Southeast Asia, and democratic reform of authoritarian political regimes which characterise government in much of the region. A more recent form of militancy in Asian Christianity is associated with the rise of charismatic/Pentecostal and fundamentalist/evangelical groups which have arisen within and beyond the denominational boundaries and institutions established during the colonial era. Adherents of the new charismatic and evangelical forms of Asian church regard Christian worship, spirituality, and personal evangelism, rather than social transformation in the wider society, as the essential vocation of Asian Christians. This more transcendental and salvific emphasis represents a clear rejection of the social protest of earlier forms of postcolonial Asian Christianity. The new militancy is more focused on claims to spiritual power for the new forms of Asian Christian spiritual and congregational life, and the rights of all Asian Christians to freedom of worship, and to proselytise their neighbors. Just as the earlier ecumenical and socially radical militancy was the occasion for formal State resistance in South Korea in the 1970s, and in Singapore, Malaysia, and
Hong Kong in the 1980s, so Asian Christians now find that their claimed freedoms to worship and evangelize occasion increased social scrutiny and resistance amongst majority religious and cultural groups and their political leaders.

1. South Asia

There is evidence of Christian activity in South India and Sri Lanka from the sixth century. Kerala is the region where most activity was concentrated, and Christians are still strong there, comprising 20 percent of the population. Christians also comprise a significant proportion of the population of Tamil Nadu. Portuguese colonisers saw a connection between trade and religion and, with the aid of the Jesuit order, encouraged the majority of the population of their settlement in Goa to convert to Christianity by the end of the sixteenth century. Under British rule the East India Company resisted the deployment of missionaries, judging that their presence would exacerbate existing religious tensions between Hindus and Muslims. But in the nineteenth century missionaries were allowed into most of British India. Converts were few and most were from marginal tribal groups, or from untouchables who have much to gain by abandoning Hinduism for Christianity. Since the 1980s a vocal Dalit (untouchable) movement has emerged which is radically anticaste and draws on Latin American-style liberation theology for its inspiration. With the rise of religious politics in India since the Premiership of Mrs Gandhi, religious conflict has become endemic in many parts of India and this has had an increasing impact on Christians. In Sri Lanka Christians are most prominent in the West and make up 30 percent of Colombo’s population.

2. Christianity in China

Christianity in China goes back to the migration of Nestorian Christians from Iraq to China along the trade routes in the seventh century CE. The Jesuits arrived in China in the sixteenth century under the leadership of Matteo Ricci who experimented with a radical indigenisation of Christianity to Chinese culture. His efforts at cultural translation were controversial in Rome and the Pope insisted on a reversion to Roman rituals and traditions in 1744. After the Opium Wars and the Nanking treaty of 1842, Catholic and Protestant missionaries settled in many parts of China. The association between Christianity and imperialism was strong and conversions were few until the failure of the Boxer Rebellion at the end of the nineteenth century when many, especially young, Chinese turned to Christianity as a potential source of social and spiritual restoration. With the advent of Communist government in 1949 all foreign missionaries were removed from China and churches closed. Ironically the exclusion of foreign missionaries allowed Chinese Christianity to flourish in a way it had not before and the Christian presence in China today is much greater than it was at the time of the Communist revolution. Most of the officially recognized churches in China, whether Catholic or Protestant in origin, are known as ‘three-self’ churches, which means that they are not dependent on extra-Chinese ecclesiastical authority, nor do they rely on funds from overseas. There are also a large number of unofficial churches and congregations in China whose leaders try to circumvent controls on officially recognized churches, and in particular controls on proselytism. Evangelistic and worship meetings are held in secret, in apartments, or in open spaces away from city or town centres. The mass migration of many tens of millions of rural migrants into towns and cities in China since the 1980s, the largest contemporary movement of people on the planet, has occasioned conversions to new religions, including Christianity, as migrants seek to find a substitute for village-folk religion. Churches in urban areas, and especially in southern China, are currently growing very fast, fuelled by the spiritual and ritual vacuum occasioned by the Cultural Revolution and the enforced break-up of traditional Chinese religions, and by intellectual and popular dissatisfaction with secular Communist ideology.

3. Hong Kong, Taiwan, Japan, and Korea

One fifth of the population of Hong Kong are Christian and just under half of these are Catholics. The principal Protestant and Catholic traditions are represented and there are also a large number of independent and Pentecostal churches which have become established since the 1970s. The majority of Christians in Taiwan are Presbyterian, reflecting a missionary drive in the nineteenth century. Many new Christian groups emerged after the emigration of supporters of the government of Chiang Kai-shek to Taiwan from mainland China. The Jesuit missionary Francis Xavier introduced Christianity to Japan in 1549, after his initial mission to India, and converted the Japanese court to Christianity. More than 300,000 Japanese followed their rulers and converted by the end of that century. However, Christianity was later proscribed, though small pockets of Christian activity remained. In the twentieth century Japan has been characterized by remarkable religious innovation, and many New Religious Movements have emerged, particularly since the Second World War. Parallel developments have occurred in Japanese Christianity. New indigenous styles of Christianity have emerged, and Pentecostal groups have also proliferated.

The Catholic Church has roots in Korea as far back as the sixteenth century but Protestantism has taken
root in Korea in a manner it has nowhere else in Asia. Forty-one percent of the population are now Christian, and the majority belong to indigenous Korean and Protestant churches. The beginning of the twentieth century saw a particularly rapid influx of Koreans into the Protestant churches, and many Protestant churches have also experienced considerable growth since the Second World War. Seoul has some of the largest church buildings, and the largest congregations, anywhere in the world. Explanations for the strength of Protestantism in Korea are various. One theory is that there is a uniquely individualist strain to Korean Confucianism and that Protestant individualism, with its more progressive conception of the role of the individual in the fast changing world of the twentieth century, proved particularly attractive to Koreans who were dissatisfied with Confucianism.

4. Southeast Asia

As a consequence of more than three centuries of settlement by the Spanish, and after the Spanish-American War, by the United States, Christianity is the principal religion of the Philippines. Virtually the whole country was converted to Roman Catholicism under Spanish rule, though a Muslim minority remained, primarily in Mindanao. The Roman Catholic Church, however, was organized and run exclusively by foreign clergy and resentment at this resulted in an internal revolution whose outcome was the Philippine Independent Church which broke away from Catholicism after 1860. Protestant missionaries arrived with the Americans, including Episcopalians who set out to convert the mountain peoples in North Luzon as they had not been reached by the Catholics.

Roman Catholic missions were active in Vietnam, Cambodia, and Laos in the seventeenth and eighteenth centuries, and established churches in all three countries. Cambodia now has very few Christians indeed. After Year Zero Christians were totally purged from Cambodia. In Laos the Catholic Church has had most converts amongst tribal peoples on the border with Thailand but remains an insignificant presence elsewhere in the country. In Vietnam Christians, the great majority of whom are Catholic, constitute 7 percent of the population. The church was indigenized much more effectively in the North, whereas in the South local leadership was discouraged and there are far fewer adherents. Catholics have been close to

cities, and especially in Bangkok, are the main locus of Christian adherents. Myanmar Buddhists are also highly resistant to Christianity. Christians are most numerous amongst the Karen tribal group in the forests of the Northwest who are not Buddhist, and do not regard themselves as Burmese.

Christianity was introduced into the Indonesian archipelago by the Dutch in the sixteenth century. Christians are most numerous in Northern Sumatra where the non-Muslim Batak people converted to Christianity in large numbers. There are also churches in most areas of Java, and on some of the other islands. Indonesian Christianity has been subject to the State religious doctrine of Pancasila which requires that all religious groups express loyalty to the state and tolerance of other religions. Indonesian Christians were active in the struggle for independence from the Dutch after the Second World War and their partnership in the birth of the new independent nation may partly explain why, almost uniquely in the Muslim world, there have in the past been a number of religious conversions. However, a peaceful inter-religious climate characterized by neighborliness and dialogue between the religions recently has turned to conflict as the Suharto and Habibie governments have stirred up religious conflict as a way of maintaining a hold on the votes of the predominantly Muslim population.

Christians in Malaysia number around 7 percent of the population, and 15 percent in Singapore. Dating back to the Portuguese settlement of Malacca in the fifteenth century, Malaysian Christianity has taken up a range of influences including Portuguese Catholic, Dutch Presbyterian, American Methodist, Anglican, and Pentecostal. Christians are confined to Chinese and Indian minorities on the whole, though tribal peoples in East Malaysia have converted to Christianity in large numbers. The fastest growing forms of Christianity in both Singapore and Malaysia today are Pentecostal or Charismatic. This modern style of Christianity has spawned many independent churches and has also been very influential in mainstream Catholic and Protestant congregations. Both countries have experienced a surge in religious interest and affiliation in recent years, sparked in part by Islamic resurgence among the Malays. This is reflected in a range of religious innovations both within Islam, Buddhism, and Hinduism, as well as in Christianity.

See also: Catholicism; Christianity: Evangelical, Revi-
political leadership in both North and South Vietnam valist, and Pentecostal; East Asia, Religions of; India,
and the Catholic Church in Rome and in Vietnam Religions of; Religion: Evolution and Development;
was a strong supporter of moves for peace during the Religion, Sociology of
war, and of subsequent North–South reconciliation
efforts after the war.
Uniquely in Southeast Asia, Thailand has never
been colonized by a foreign power and Thai Buddhism Bibliography
is particularly resistant to foreign, including Christian, Ackerman S E, Lee R 1988 Heaen in Transition. University of
influence. Strong Chinese communities in the large Hawaii Press, Honolulu, HI


Barrett D B (ed.) 1982 World Christian Encyclopedia: A Comparative Study of Churches and
Religions in the Modern World AD 1900–2000. Oxford University Press, Nairobi, Kenya
Digan P 1984 Churches in Contestation: Asian Christian Social Protest. Orbis Books,
Maryknoll, NY
von der Mehden F R 1986 Religion and Modernization in Southeast Asia. Syracuse University
Press, Syracuse, NY
Neill S C 1985 History of Christianity in India, 1707–1858. Cambridge University Press,
Cambridge, UK
Palmer S J 1967 Korea and Christianity: The Problem of Identification with Tradition.
Hollym Corp, Seoul, South Korea

M. Northcott

Christianity in Central and South America

Throughout the pre-Hispanic, colonial, and contemporary periods, the history of religion in
Central and South America has been inextricably linked to political structures, rituals
that ordered everyday life, and movements of rebellion, with Catholicism having played a
central role for almost four centuries. After the break up of the monopoly exercised by the
Catholic church and the rapid growth of Evangelical churches there is growth, but also
fragmentation; creativity as well as uncertainty; individualism, but also the constitution
of new communities; freedom, along with the authoritarianism of the leaders of the base
communities or of the Pentecostal churches; political quietism side by side with right-wing
activism. There are even churches that, reversing the traditional direction, have
established branches outside Latin America.

1. Religion Before the Conquest

As is generally the case in the preindustrial world, in which ideological claims are
advanced in mythological terms and reinforced through ritual means, in the area now known
as Latin America the social order was religiously legitimized. A proper discussion of the
various ways in which this religious legitimization took place would require considering
the range of societies found in Meso and South America, from kinship-based groups, found
for instance in the Amazonian area, to chiefdoms, to states. Since such an examination
cannot be undertaken here, it may suffice to say that societies organized as chiefdoms tend
to be theocratic, their political life being heavily ritualized; this also applies to early
states, such as the Inca and the Aztec, in which the rulers were sacralized and were
considered as the pivot of their society. Thus, in the Andean world the Inca dynasty,
originally based in the area of Cuzco in southeastern Peru, claimed descent from Inti, the
Sun.
More generally, religious legitimation meant that social distinctions, obligation to work,
and access to resources were regarded as being regulated by an order that encompassed the
social and the physical world. In order to maintain this order it was necessary to
structure time and space, as well as the relations between age groups and the sexes. The
organization of time was accomplished at several levels: at the individual level, this
involved rites of passage; at the macrosocial levels, on the other hand, the structuring
required the use of liturgical calendars which were related to the seasons and, above all,
to agricultural work. Similarly, space was structured by establishing mythologically based
distinctions among neighborhoods and, at a more extended level, by dividing a territory by
means of shrines along imaginary lines, an example of which is the ceque system centered in
Cuzco, the Inca capital.

2. Conquest, Resistance, Accommodation

The fact that the political, religious, and economic spheres were less differentiated in
the Aztec and Inca states than in sixteenth-century Western Europe may have contributed to
the speed of the victory of the Spanish and Portuguese adventurers. The ‘modernity’ of the
conquerors should not be exaggerated, however, for the Iberian discovery and conquest of
America took place immediately after the defeat of the Muslim kingdom of Granada, and the
expulsion of the Jews who had refused to convert to Christianity. It can be said,
therefore, that the conquest itself took place as a kind of crusade at a time of
politico-religious triumphalism. Moreover, attention should be paid to the fact that
Columbus seems to have endowed his enterprise with a millenarian aura, hoping that a
portion of the wealth found in the Indies would be used in the conquest of Jerusalem. This
millenarian component can also be found at work in the self-understanding of some of the
members of the religious orders involved in the process of conversion of the vanquished.
In any event, while the crusading spirit validated the conquest, the transcendentalization
of religion and the concomitant separation between the roles of priests and those of
soldiers and administrators allowed some clerics to denounce the abuses of conquerors,
without, however, condemning the conquest itself. During the colonization of Latin America,
therefore, Christianity fulfilled the double role traditionally accomplished by religion:
that of validating a social order, beginning with the justification of the conquest itself,
while also providing the justification for judging that same order.
Religion’s double function can also be seen at work among the indigenous population, for
whom Christianity served as a vehicle of accommodation as well as of protest. The first
aspect involved what is


generally known as syncretism, that is, the amalgamation of Christianity and indigenous
religions. Thus, the Christian god incorporated elements of Andean divinities such as
Wirakocha and Pachacamac, while the Virgin Mary, as the Virgin of Guadalupe, subsumed the
Aztec goddess Tonantzin, ‘Our Mother’; Quetzalcóatl, on the other hand, was assimilated to
the apostle Thomas. The fact that the physical appearance of some of the inhabitants of the
Christian pantheon was said to resemble that of the Indian population, rather than that of
the conquerors, indicates on the one hand the extent to which the subjugated population had
assimilated the new religion, and on the other, the intimate connection between religion
and ethnic identification.
In some cases, conquered elites sought the protection of some of the most bellicose
inhabitants of the Christian pantheon, expecting to profit from the power these
supernatural beings had shown on the Spanish side; thus, the apostle Santiago, patron of
Spain, became syncretized with the Andean Illapa (Lightning), and already in the sixteenth
century the Virgin of Copacabana was enlisted in intradynastic struggles.
Christianity also played a role in the indigenous rebellions against Spanish domination.
This can be seen already in the sixteenth-century Taki Onqoy, and more clearly in the late
eighteenth-century Tupac Amaru rebellion, in which the rebels regarded themselves as better
Christians than the Spaniards.

3. Religion in the New Republics

The accommodation between indigenous religions and Christianity should not obscure the fact
that from the beginning of the conquest, in order to forestall challenges from within,
Spanish authorities sought to destroy the indigenous religions, undertaking ‘extirpation of
idolatries’ campaigns; similarly, in order to protect Catholic orthodoxy from external
threats, Spanish authorities tried to isolate the colonies from Protestantism, Judaism, and
liberalism: Christianity (the Baroque Christianity of the Counter Reformation) served as
one of the bastions of Spanish power during the three centuries of colonial rule. The role
played by religion in the wars of independence of the early nineteenth century was
ambiguous. Whereas the lower clergy tended to side with those who wanted to achieve
independence from Spain, the prelates, many of whom had been born in the Iberian peninsula,
were generally in favor of maintaining the colonial regime. Once independence was achieved
in the nineteenth century, the Creole elites kept in place the church’s prerogatives, while
at the same time seeking to control it, just as the kings of Spain had done. After the
governments of the new republics and the Vatican signed concordats regulating the status of
the church, Roman Catholicism generally enjoyed a privileged position, one that was
enshrined in the republics’ constitutions.
The situation, however, was not always as favorable as clerical groups would have wanted.
As in Spain itself, the very hegemony of the church contributed to an atmosphere of
anticlericalism which led to attempts to curtail that institution’s privileges; this
anticlerical attitude was intensified by the influence of liberal and positivist currents.
The most radical measures against Roman Catholicism were taken in Mexico, a country that
re-established relations with the Vatican only in 1992. In general, however, Latin American
governments have regarded the church as an ideological ally, one which played a crucial
role in the legitimization of the social order.

4. Religion and Everyday Life

Encompassing domestic devotions and public anticlericalism, individual rites of passage and
official liturgies, Catholicism has been an omnipresent reality in the life of Latin
Americans. But, as one would expect, Catholicism has been lived differently by men and by
women, by members of upper and lower classes, in urban and rural milieus. As in southern
Europe, in Latin America everyday religion in its private and public forms has been the
domain of women; thus, except for special occasions such as Lent and Christmas, women have
constituted the majority of the faithful in Sunday services. Similarly, it is the mothers
who traditionally have been in charge of introducing children to the rituals and the moral
injunctions of Catholicism. It is in the process of religious socialization that social
class plays an important role, as children of the middle and upper classes are generally
subject to religious indoctrination in schools directed by members of religious orders,
whereas those who attend state schools are subject to minimal religious education.
One of the consequences of this religious indoctrination is that Catholic revival movements
such as Catholic Action, or political parties linked to Catholicism, such as Christian
Democracy, involve mainly middle and upper class persons. In large cities, public religious
activity is usually restricted to Sunday masses and to occasional processions, in some of
which government dignitaries participate. The legitimizing role of Catholicism as the state
religion can be seen at work in the solemn liturgies with which church and state
commemorate independence day, and even more so in the consecration of entire countries to
Jesus or the Virgin Mary.

4.1 Popular Religion

The counterpart of official Catholicism is represented by what is known as ‘popular
religion’, that is, those symbolic practices that seem to want to escape, and even to
oppose, the control of religious elites. Ambivalence is at the core of popular religion,
since in


order to express opposition it is necessary to make use, against the grain, of the symbolic
resources of the institutions which one opposes. Popular religion is part and parcel of a
complex system of assimilation and rejection, through which subordinate groups sometimes
negotiate and at other times struggle over access to cultural and, in the last instance,
material goods. Latin American popular religion comprises the syncretic processes referred
to above, in which elements of the symbolic world of the Spanish conquerors were
incorporated into that of the Andean, Mesoamerican, and other conquered groups, as well as
contemporary ritual practices in urban and rural milieus, such as processions and
pilgrimages, which represent the counterparts of the liturgical functions choreographed by
the official Church.
Despite the current academic infatuation with symbolic resistance, however, it must be said
that the parodic, carnivalesque rituals of popular religion generally have not constituted
serious threats to the hegemony of Catholicism or of the state. In fact, the expenditures
required by the ritual calendar of indigenous communities, instead of leveling economic
disparities, may increase them.

4.2 Millenarian Movements

In general, in order for popular religious practices to turn into insurrections it has been
necessary that the state attempt to suppress in a violent manner groups that have sought to
isolate themselves from a changing society in order to live according to the teachings of
the Gospel under the guidance of a charismatic leader. This happened, among other places,
in northeastern Brazil from 1893 to 1897, in the millenarian movement of Canudos; it
happened again in southern Brazil from 1912 to 1916 in the Contestado rebellion. Both the
Canudos and Contestado movements constituted reactions against the secular, positivistic
values of the Brazilian republic, and in the case of the Contestado rebellion it is clear
that it was triggered by the dislocations brought about by capitalism. Like millenarian and
messianic movements in general, those that have taken place in Latin America have involved
marginal regions in transitional periods, and have been led by marginal, literate
individuals, such as Antonio Conselheiro, leader of Canudos, and Miguel Lucena Boaventura,
known as José Maria, leader of Contestado.

5. Liberation Theology

At the elite level, the privileged status of Catholicism in Latin America has not gone
unchallenged either. In the nineteenth century, anticlerical groups, influenced by
positivism, had already decried the church’s influence; similarly, in the twentieth century
politically progressive groups opposed the alliance between oligarchic governments and a
conservative church. It was, however, during the early 1960s, prompted first by the triumph
of the Cuban revolution and then by the Second Vatican Council, that the traditional
equation between church and conservative politics was broken. The Cuban revolution,
combined with the Council’s attempts to re-examine the relation between the church and the
world, forced some members of the Latin American clergy to question the role the church had
played in the maintenance of an unjust social order. That questioning led to the emergence
of a ‘prophetic’ understanding of Christianity known as the Theology of Liberation.
Influenced by the social sciences as much as by traditional methods of theological
exegesis, the liberation theologians removed concepts such as sin and salvation from the
mere personal and spiritual realms to one that took into account the structures of the
societies within which sins were committed and salvation was sought. This theological
approach was attacked by conservative Catholics, clerical and lay alike, who accused its
proponents of forsaking Christianity and falling prey to Marxism. It cannot be emphasized
enough, however, that despite its apparent threat to Catholicism, the theology of
liberation constituted an attempt to save Latin America’s dominant religion by enlisting it
in the task of reforming society. That ultimately, attacked by the local churches and
neutralized by Rome, the theology of liberation gave up its distinctive use of sociological
categories and retreated into traditional spirituality, was to be expected, given that the
liberation theologians were, after all, theologians.

6. Alternatives to Catholicism

But Rome’s victory was a hollow one, for the presupposition shared by both conservative and
revolutionary clerics (that Latin America is and will remain Roman Catholic) proved to be
illusory. As the conflicts between Rome and the radical theologians were taking place, a
far more important development was unfolding, namely the spread of Protestantism or, to be
more correct, of Evangelical forms of Christianity. Nevertheless, despite the fact that the
triumph of Protestantism may have demonstrated the ultimate irrelevance of the
confrontation between Rome and the theology of liberation, there is a strong affinity
between this theology, which could be considered as a Protestant Catholicism, and the
evangelical churches. In both cases we find a puritan faith centered upon the Word,
suspicious of the rituals of official Catholicism as well as of those that constitute
popular religion. It is as if both movements away from a sacramental, hierarchical,
agrarian-based vision of the world had developed as a response to the


weakening of the agrarian economy with its rituals and mythologies in order to come to
terms with a predominantly urban world ruled by the market.
But, instead of the ‘disenchantment’ of the world, to use Weber’s term, we find two
religiously ambiguous, half-enchanted formations, the Evangelical and the Catholic, whose
task it is to mediate between the two worlds. In both versions of Christianity there is
still the attempt to preserve, if only through the Word, the presence of the transcendent,
while at the same time deritualizing everyday activities. The causes and consequences of
this deritualization have been explained in terms of a rationalization of economic
activities, involving, for instance, the avoidance of the burden represented by wasteful
ritual expenditures, and in more general terms turning away from the pressure exercised by
community and tradition. The result is a world of ever widening economic disparities which
must be confronted in an instrumental manner by the individual as individual, rather than
as a member of a community. But, rather than repressing emotion, individualism and
instrumental reason seem to exacerbate it. Thus, as in the tense coexistence of Pietism and
Enlightenment found in eighteenth-century northern Europe, we find that the strongest forms
of Latin American non-Catholic Christianity are Pentecostal; this has led to the emergence
of charismatic forms of Catholicism, one more example of the proliferation of religious
offerings in the religious market that emerged after the break up of monopolistic
Catholicism.
In this context reference should be made to the growth of New Age religions among the
middle and upper classes, as well as of religions of African origin among groups which are
not of African descent, and in countries, such as Argentina, which do not have a
significant black population.
While the early spread of Protestantism was caused by the activities of North American
missionaries, the subsequent growth of evangelical churches is now largely fueled by local
ferment. In some countries, such as Puerto Rico, El Salvador, Guatemala, Chile, and Brazil,
evangelicals constitute a significant proportion of the population, whereas in others, such
as Venezuela and Colombia, that presence is much smaller. Considering, however, the
continuous growth of non-Catholic forms of Christianity, it can be said without reservation
that the equation between Latin American religion and Catholicism, presupposed, among
others, by the advocates of liberation theology, is a thing of the past. It is true that
the Catholic church can still exercise pressure upon governments and legislatures on
matters related to the control of female sexuality, but even in these cases, despite the
fact that abortion is still illegal in Latin America, birth control campaigns are not
uncommon, as the birth rate has decreased in the last several decades, a fact which shows
that the influence of the church on the everyday life of ordinary people has diminished
substantially.

6.1 Conservative Trends in Catholicism

A development that will likely contribute to the further loss of popularity of the Catholic
church among the lower classes, while perhaps increasing it among high-income groups, is
the growing importance of conservative Catholic groups such as the Opus Dei, a secretive
organization supported by the Vatican. But even in the case of the Opus Dei, one can see
its promotion as an attempt on the part of the church to accommodate itself to a changing
world. In effect, despite its traditionalism and authoritarianism, the values promoted by
this organization are consonant with the demands of a capitalist order, and in more general
terms with the demands of modernity.

See also: Catholicism; Christianity: Evangelical, Revivalist, and Pentecostal; Latin
American Studies: Religion; Latin American Studies: Society; Religion, Sociology of

Bibliography
Annis S 1987 God and Production in a Guatemalan Town. University of Texas Press, Austin, TX
Bastian J-P 1997 La mutación religiosa de América Latina. Para una sociología del cambio
social en la modernidad periférica. Fondo de Cultura Económica, Mexico City
Burga M 1988 Nacimiento de una utopía. Muerte y resurrección de los incas. Instituto de
Apoyo Agrario, Lima, Peru
Diacon T A 1991 Millenarian Vision, Capitalist Reality: Brazil’s Contestado Rebellion,
1912–1916. Duke University Press, Durham, NC
Duviols P 1971 La lutte contre les religions autochtones dans le Pérou colonial
(L’extirpation de l’idolatrie entre 1532 et 1660). Institut Français d’Études andines,
Lima [1977 La destrucción de las religiones andinas (durante la conquista y la colonia).
UNAM, Mexico City]
Flores Galindo A 1988 Buscando un Inca: identidad y utopía en los Andes. Editorial
Horizonte, Lima, Peru
Garrard-Burnett V, Stoll D (eds.) 1993 Rethinking Protestantism in Latin America. Temple
University Press, Philadelphia, PA
Krickeberg W, Trimborn H, Müller W, Zerries O 1961 Die Religionen des alten Amerika.
Kohlhammer, Stuttgart [1968 Pre-Columbian American Religions. Weidenfeld & Nicolson,
London/Holt, Rinehart and Winston, New York]
Lafaye J 1984 Mesías, cruzadas, utopías. El judeo-cristianismo en las sociedades
iberoamericanas. Fondo de Cultura Económica, Mexico City
Levine R 1992 Vale of Tears: Revisiting the Canudos Massacre in Northeastern Brazil,
1893–1897. University of California Press, Berkeley
Martin D 1990 Tongues of Fire: The Explosion of Protestantism in Latin America. Blackwell,
Oxford-Cambridge, UK
Mecham J L 1966 Church and State in Latin America. University of North Carolina Press,
Chapel Hill, NC
Prien H-J 1978 Die Geschichte des Christentums in Lateinamerika. Vandenhoeck & Ruprecht,
Göttingen
Stoll D 1990 Is Latin America turning Protestant? The Politics of Evangelical Growth.
University of California Press, Berkeley

G. Benavides

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences, ISBN: 0-08-043076-7


Christianity: Liberal

‘Liberal’ is the designation given to a broad trajectory in Christianity by sympathisers
and critics alike. In some of its manifestations it is also referred to as ‘modernism.’
Both terms draw attention to a defining characteristic: a dissatisfaction with earlier
forms of religion and a concern to replace them with less restrictive alternatives more
open to the spirit of the age. Whilst it has rarely led to the establishment of new
churches, liberalism has had a significant impact within all the mainline Christian
denominations in the West.
Understood in this broad sense, liberal Christianity can be seen to date back even before
the Renaissance to the fifteenth century, when the term ‘modern’ was first invoked in
relation to reforming movements in theology and spirituality like the ‘Via Moderna’ and the
‘Devotio Moderna.’ Since then Western Christianity has been challenged repeatedly by
individuals and movements who have championed the freedom of the individual Christian over
against the authority of institutional religion. Yet it is in the modern period that
liberal Christianity has become most prominent and has played its most important role in
shaping modernity itself.
By the end of the twentieth century something of a consensus had developed amongst
sociologists of religion that liberal Christianity was in inexorable decline. This
consensus coincided with a period when theological liberalism had fallen out of fashion. It
will be challenged in what follows on three grounds: (a) it overlooks the internal variety
of liberal Christianity, (b) it ignores evidence of its continuing vitality on the ground,
and (c) it fails to recognize that liberal Christianity remains well adapted to many of the
socio-economic formations and cultural trends of the modern world.

1. Varieties of Liberalism

1.1 Rationalist

One of the most important ways in which liberal Christianity is implicated with the rise of
modernity is through its central role in the Enlightenment of the mid-seventeenth to
mid-eighteenth centuries. In their different ways such figures as the Deists, Thomas
Jefferson, Tom Paine, René Descartes, John Locke, and Immanuel Kant were all concerned with
the liberal reform of religion. They helped shape a liberal Christianity which was
characterized by (a) hostility to ‘traditional’ religion, conceived as superstitious,
heteronomous, and divisive, (b) confidence in human reason and the primacy of the sovereign
individual, (c) activist optimism about the possibilities of human and social improvement,
(d) belief in the harmonious unity of all true (‘natural’) religion, and (e) high valuation
of freedom. Rationalist liberalism tended to be primarily an intellectual movement, though
it was often related closely to political radicalism. At the institutional level it gave
rise to the Unitarian and Universalist churches.

1.2 Romantic

Whilst many sociologists identify liberal Christianity with its rationalist forms, there
are other important varieties. By the late eighteenth century, for example, a Romantic
liberalism had begun to exercise an important cultural and religious influence. In America
this was best represented by Transcendentalism, and in Germany by the theology of Friedrich
Schleiermacher. Romantic liberalism inherited rationalist liberalism’s belief that the
individual rather than the institution was the locus of true religion (‘my mind is my
church,’ as Tom Paine had put it), but stressed the authority of feeling, imagination,
experience, and self-consciousness rather than reason. This stress on the importance of
individual experience of God has been carried through into the twentieth century in the
work of theologians like Rudolf Bultmann and Paul Tillich.

1.3 Ethical and Social

Kant had located religious authority in moral reason. This ‘ethicisation’ of the Christian
religion was taken further in Germany by theologians like Albrecht Ritschl, Wilhelm
Herrmann, and Adolf von Harnack, and in America by theologians like Horace Bushnell and
Walter Rauschenbusch. The latter, the leader of the Social Gospel movement, was responsible
for interpreting Christianity as a force for social reform in industrial society.

1.4 Liberation and Feminist

In the twentieth century the influence of liberal Christianity continues to be felt in new
reform movements within the churches and theology, most notably in some important varieties
of Liberation Theology (a product of Latin America) and Feminist Theology (a product of
North America and Western Europe). In both we find a characteristic emphasis on human
liberation and the authority of experience, combined with a thoroughgoing criticism of
existing structures of power and domination in both the churches and wider society, and a
bias towards the oppressed and marginal.


2. The Development of Modern Liberalism

Liberalism has had the greatest impact within mainline Protestant churches, and has been particularly influential in countries where Protestantism has been dominant. Most commentators consider its heyday to have been from the mid-nineteenth century through the 1920s. Liberalism flourished because it was able to meet the new challenges posed by the forces of modernisation. Its critical stance towards Christian tradition enabled it to assimilate the rise of historical method and its application to the Bible. Equally, its long-held belief in progress and the capacity of human reason to fathom the mysteries of God and the world helped it embrace the discoveries of modern science, including evolution. What is more, its libertarianism and individualism enabled it to support the interests of the new middle classes and to play a legitimating role in relation to democracy and the modern state. Far from being threatened by rapid modernisation in the nineteenth century, liberal Christianity tended to view itself as the religious and moral engine of social progress. Nowhere was this truer than in the USA. In Catholic countries the picture was very different. There, led by an increasingly defensive Rome, the church explicitly repudiated the errors of ‘modernism’ (which included political and religious liberalism), and allied itself with the forces of reaction. In 1907 Pope Pius X condemned those Catholic ‘modernists’ who had embraced aspects of theological liberalism, and introduced an anti-Modernist oath for the clergy which effectively brought to an end all attempts to develop a liberal Catholicism. It was not until the Second Vatican Council of 1962–5 that a moderate liberalism gained official sanction. Since then there has been an explosion of liberal thought in the Catholic church, despite Pope John Paul II’s attempts to curb its influence. The work of theologians like Hans Küng, as well as the rise of Liberation Theology and Feminist Theology, has been particularly notable.

The twentieth century has brought unprecedented challenges for Liberal Christianity. These include the rise of conservative evangelical and fundamentalist Christianity and, at the theological level, the rise of Neo-orthodoxy. At the same time many elements of the liberal creed (including belief in progress, reason, ‘humanity,’ and the ideal of religious unity) have been shaken both inside and outside the churches. This, together with shrinking attendance in the mainline Protestant churches most influenced by liberalism, has led many commentators to conclude that it is in terminal decline. What this conclusion overlooks, however, is that the mainline denominations still account for the vast majority of Christians (especially if one includes the Catholic church), and that many within these denominations (including the Catholic churches) continue to embrace some version of liberal Christianity (see below). What is more, liberalism remains extremely well adapted to societies that show no real sign of diminishing their commitment to freedom, equality, democracy, and a liberal individualism reinforced by the institutions of a free-market economy.

3. Liberalism on the Ground

Dean Kelley’s book Why Conservative Churches are Growing (1972) captured a new mood in the sociology of religion. Where secularisation theory had previously encouraged the conclusion that the more traditional and antimodern forms of religion would be those which would decline most rapidly, the growth of conservative and sectarian bodies alongside evidence of the decline of more liberal denominations led to a change of mind after the 1960s. As noted below, sociological theory was quickly mobilized to explain this change, and to reinforce the prediction of further liberal decline.

Much of the case for liberal decline focuses on the American example. It is clear from US survey data such as the General Social Survey (GSS) that the three denominations it classifies as ‘liberal’ (Presbyterian, Episcopal, and United Church of Christ) did indeed decline after 1970. So too did the three denominations it labels as ‘moderate’ (United Methodist, Lutheran, and Disciples of Christ), together with the Roman Catholic church. By contrast more conservative denominations like the Southern Baptist Convention and the Seventh-day Adventists grew in numbers. There is some evidence, however, that numbers in the liberal denominations may now be stabilising (Roof and McKinney 1987), and it may therefore be premature to pronounce on the relative success or failure of liberal and conservative Christianity in modern times.

It is also necessary to exercise caution when relying on data relating to levels of denominational attendance and affiliation alone in assessing the state of liberal Christianity in the churches. The main problem with this method is that it is blind to complex patterns of belief and commitment within churches and denominations. For example, most mainline churches today, both Catholic and Protestant, contain both liberals and conservatives within their congregations, and it is increasingly hard to label whole denominations ‘liberal’ or ‘conservative.’ What is more, there is some important evidence that even more conservative forms of Catholic and evangelical Christianity are increasingly permeable to the influence of liberalism. Thus James Davison Hunter’s extensive research among college-age Evangelicals in America uncovered evidence of an increasing liberalisation of their belief, values, and practice (Hunter 1987).

The best way to assess the relative strength of liberalism in the contemporary churches would seem to be through intensive and widespread congregational
study. The most important work of this kind is that undertaken by Ammerman (1997), who studied 23 representative congregations across the USA and surveyed almost 2,000 individuals. Her discovery was that respondents fell into three categories: liberal or ‘Golden Rule’ Christians (51 percent), evangelicals (29 percent), and social activists (19 percent). On the basis of her research Ammerman challenges the assumption that religious liberalism is a spent force, and that liberal religiosity is a paler reflection of conservative Christianity. The Golden Rule Christianity she describes is characterized by an emphasis on the primacy of good deeds motivated by love, care, and compassion, and by a belief in the importance of religious tolerance.

It may be that Ammerman has discovered a fifth variety of Christian liberalism, one which might be labelled ‘relational,’ and whose significance on the ground in the second part of the twentieth century has been largely overlooked.

4. Sociological Interpretations

At least three clusters of explanations have been offered by sociologists of religion to account for the rise of liberal Christianity in modern times. It has been explained as (a) an accommodation or even a capitulation to modernity (Peter Berger), (b) a natural outgrowth of Protestantism and, in particular, of the latter’s emphasis on personal subjective conviction (Ernst Troeltsch), and (c) a means by which the clergy, their social status undermined by modernity, have attempted to protest and regain a social role (Jeffrey Hadden). Woodhead and Heelas (2000) have also drawn attention to liberal Christianity’s compatibility with modern socio-economic formations and wider cultural trends such as the turn to the self.

Sociologists have also developed theories to account for liberal Christianity’s apparent decline. Kelley explained this by drawing a contrast with conservative religion. Where the latter was ‘strict’ and ‘challenging,’ liberal Christianity was the opposite. As such, he argued, it was unable to generate or sustain commitment, consensus, or strong community. Peter Berger offered a more rigorous version of this explanation by arguing that plausibility is a function of unanimity. The strong, unified communities that characterize conservative religion are better able to sustain plausibility than are the more diffuse and less disciplined communities of liberalism. Meanwhile sociologists like Talcott Parsons and David Martin have also argued that liberalism is a victim of its own success: because its beliefs and values are so close to those of the wider culture it is no longer able to sustain a distinctive identity nor to hold or attract adherents.

Much of this theoretical work depends on a contrast drawn between liberal and conservative Christianity. It is also important to note another boundary: that between liberal Christianity and radical or ‘alternative’ forms of spirituality. For whilst the fate of liberal Christianity is bound up with that of conservative Christianity, it is also bound up with that of new forms of religiosity like the New Age. In some ways the latter seems to represent an intensification of key liberal themes like individualism and freedom, but without liberal Christianity’s continuing commitment to some form of institutional church. It remains to be seen whether the apparent growth of such religiosity will in the end have the effect of strengthening or of weakening liberalism.

See also: American Studies: Religion; Christian Liturgy; Civil Religion; Feminist Theology; Liberalism; Liberalism: Historical Aspects; Protestantism and Gender; Rationalism; Rationality in Society; Reformation and Confessionalization; Religion and Politics: United States; Religion: Evolution and Development; Religion, Sociology of; Religiosity: Modern

Bibliography
Ammerman N T 1997 Congregation and Community. Rutgers University Press, New Brunswick, NJ
Hunter J D 1987 Evangelicalism. The Coming Generation. University of Chicago Press, Chicago
Hutchison W R (ed.) 1968 American Protestant Thought in the Liberal Era. University Press of America, Lanham, MD
Hutchison W R 1976 The Modernist Impulse in American Protestantism. Harvard University Press, Cambridge, MA
Kelley D M 1972 Why Conservative Churches are Growing. Harper & Row, New York
Michaelsen R S, Roof W C (eds.) 1986 Liberal Protestantism: Realities and Possibilities. Pilgrim Press, New York
Miller D E 1981 The Case for Liberal Christianity, 1st edn. Harper & Row, San Francisco
Reardon B M (ed.) 1968 Liberal Protestantism. Stanford University Press, Stanford, CA
Roof W C 1978 Community and Commitment. Religious Plausibility in a Liberal Protestant Church. Elsevier, New York
Roof W C, McKinney W 1987 American Mainline Religion. Its Changing Shape and Future. Rutgers University Press, New Brunswick, NJ
Woodhead L J P, Heelas P L 2000 Religion in Modern Times. Blackwell, Malden, MA

L. Woodhead

Christianity Origins: Primitive and ‘Western’ History

This article considers Christianity in the first 300 years of its existence, before it achieved a close alliance with the Roman state. It pays particular attention to the social forms of early Christianity and their relation to wider society.


1. The Jesus Movement

The movement that centered around Jesus of Nazareth in his own lifetime seems an unlikely candidate for the eventual transformation of Western society. It appears to have been one of many such movements in early first-century Palestine led by a charismatic Jewish teacher appealing to a poor and mainly rural audience (Theissen 1978). Jesus’s message, which is only comprehensible within a framework of contemporary Jewish beliefs and expectations, centered round the proclamation of the imminent reign of God. From this expectation arose the urgent and sovereign demands to repent and believe in the gospel. All other concerns were secondary, including on occasion those of Jewish law and custom. Yet this message was presented as ‘good news’: what Jesus offered his followers was the most intimate relationship with a God not of wrath and judgment but of love and mercy, a ‘father’ who cares with loving tenderness for each one of his children. This God places no barriers on relationship with him: all are welcomed into his kingdom irrespective of their moral, social, or religious status.

2. The Pauline Revolution

Whilst Jesus’s mission was primarily to the Jews, the unrestricted address of his message gave it a universalist momentum which would make possible the later spread of Christianity beyond Israel. As a universal religion, Christianity provided an alternative to the national or civic religions, both Roman and Jewish, of the time. It appears to have been the apostle Paul, whose letters are preserved in the New Testament, who played the decisive role in drawing out these universalist implications and giving theological justification to a ‘mission to the gentiles.’

A Jew as well as a Roman citizen, Paul was converted by a vision of the risen Jesus. The faith whose spokesman he subsequently became was centered on this risen, cosmic Christ rather than on the historical Jesus of Nazareth. There were important sociological implications in this shift. As Troeltsch (1931) noted, Jesus’s original message was ‘individualistic’ in the sense that it was focused on intimate relation between the individual and God. Whilst its universalist and egalitarian message fostered a broad sense of community between all those called to love God and neighbor, it neither fostered new communities nor made any attempt to influence wider society. It remained a reforming faction within Judaism (Sanders 1985, Elliott 1995).

Paul’s reinterpretation of Christianity altered these dynamics of the early Jesus movement very significantly. As Schweitzer (1931) argued, Paul developed a ‘Christ-mysticism’ in which the believer is incorporated through faith into the ‘body of Christ.’ Though this mysticism also has an individualist emphasis, incorporation into the body of Christ is corporate: those who have faith are united not only with Christ but with one another. This corporate Christ-mysticism undergirded the development from the mid-first century of a church (ekklesia), which took the form of local communities linked together in a ‘catholic’ (universal) alliance by their common possession of the Spirit of Christ.

3. The Emergence of Catholic Christianity

Until at least the second century these early Christian communities were charismatic communities (communities of the Spirit), in which social authority was not institutionalized, but conferred by the Spirit. There seems to have been no clear hierarchy of authority, with different functions (such as apostle, teacher, prophet, and miracle-worker) being regarded as mutually constitutive of the body of Christ. In a development that Troeltsch (1931) categorizes as the emergence of ‘catholic’ and ‘sacramental’ Christianity, however, the spirit gradually became institutionalized in the sacraments, particularly those of baptism and the eucharist. Here the presence of the Spirit is, as it were, guaranteed. The sacraments are material signs of the freely given grace of God and the efficacious tokens of salvation.

The development of a sacramental Christianity allowed for the development of stable and enduring communities not based on the unpredictable outpourings of the spirit. It went hand in hand with the emergence of a clergy whose authority was bound up with their exclusive authorisation to handle and distribute the sacraments. Their status was not based on personal charisma, superior religious achievement, or inheritance. Rather, they were the authorised representatives of the wider Christian community. Early documents defending a sacramental priesthood reveal that this development was not uncontroversial. On the one hand it made possible catholicity and order. On the other it led to exclusions, most notably the exclusion of women from positions of authority in the church.

The development of catholic Christianity also involved the definition and maintenance of uniformity in belief and liturgical practice. This achievement was also a difficult and remarkable one given that ‘early Christianity’ was never as unified as that title implies. Despite the idealized backward glance of a later era (such as that of the fourth-century church historian, Eusebius), Christianity came into being as a diverse set of largely autonomous communities spread around the Mediterranean basin and in Syria and Asia Minor. Many were centered around a particular apostle and a particular gospel (whether in oral or written form), and developed distinctive forms of belief and practice.
If we compare the four gospels contained in the New Testament (probably the products of such communities), we get some idea of the range of beliefs they held and of their very different understandings of Jesus.

In the face of this diversity, the establishment of an authorised scriptural, creedal, and rhetorical tradition was as important as that of a universal sacramental priesthood (Cameron 1991). By the second century we find early representatives of catholic Christianity listing the documents which should be treated by Christians as authoritative, and which would eventually come to form the New Testament. These were then bound up with the first authoritative Christian scripture, the Jewish Bible or ‘Old Testament.’ The formation of this scriptural ‘canon’ went hand in hand with the development of a ‘canon of faith.’ Both were later debated and defined by the councils and creeds which would become such a distinctive feature of Christianity. Together authorized scripture and doctrine came to define the boundaries of ‘orthodoxy.’ Again, this process involved exclusions, including that of a large number of gospels, lives, and acts of Jesus, the apostles, and saints which are now classified as ‘apocryphal,’ together with a large body of philosophical–theological literature influenced by Christian, Jewish, Platonic, and Persian sources, which is often classified together as ‘gnostic.’

Against the spiritualizing tendencies of the gnostics (a tendency which took further the Pauline spiritualization of Christ), the emerging catholic church developed an emphasis which may be characterised as ‘materialist.’ The authority of the clergy, for example, was said to rest on an ‘apostolic succession’ which consisted of a historical and physical continuity established through the ‘laying on of hands’ by Christ and the apostles down to the present generation. Likewise, the church was the visible community of men and women gathered together to receive these sacraments rather than an invisible body of the elect, and the authorized means of salvation were the visible and tangible sacraments. In many cases too Christian hope continued to be focused on a physical resurrection, rather than on the release of an immaterial soul from the body. A more hostile attitude to the body and material life would, however, become a feature of some of the asceticism and monasticism that developed within Christian circles from the end of the third century onwards.

Despite this materialist emphasis, however, early Christianity was not involved in any direct attempt to reform the society within which it found itself. Jesus had directed his followers’ energies to ‘the one thing needful’ (love of God and neighbor) rather than to social reform, and this emphasis continued in early catholic Christianity. To the extent that Jesus commanded his followers to love all, including the Roman soldier and the tax collector, it could even be argued that the Jesus movement had a broader social reach than the Pauline and post-Pauline communities whose energies were focused on love of ‘the brethren.’ Their duty was to ‘build up the body of Christ’ rather than to change ‘the world,’ the latter being a category which derived from this mentality.

The result, as Troeltsch (1931) argued, was that the early Christian communities did not develop a ‘social teaching.’ The Christian response to social problems such as poverty was to advocate individual acts of charity rather than social reform. Ownership of property was neither abolished nor condemned, but possessions were to be used to help the Christian community. Similarly, in relation to class and social position, the early church initiated a revolution within its own walls (slave and free, male and female were equal before Christ and in relation to salvation) which left wider patterns of social inequality (including slavery and the position of women) virtually untouched. The state, even when persecuting Christians, was regarded by most early Christians as the wielder of a proper and God-given authority that should call forth respect and obedience rather than attempts at reform.

4. Alliance of Church and State

Despite its failure to develop a social teaching, it is clear that early Christianity had a significant impact on its wider social context. It appears to have initiated an inner revolution within the Roman Empire whose effect was felt in a number of ways, not least through the new educational and welfare opportunities it offered, and through the model of an inclusive society which it provided. Whilst it is impossible to reconstruct the nature and extent of the growth of Christianity in the first three centuries of its existence, it is estimated that by the beginning of the fourth century it may have accounted for up to 10 percent of the population of the Empire. Its success appears to have been due to its ability to form ‘a compact, even massive, constellation of commitments’ (Brown 1997). Morality, philosophy, and ritual, which formerly had formed separate spheres of activity in the ‘pagan’ world, were brought together by the church, and fused into a universal religion.

It was these new characteristics and potencies which eventually enabled Christianity to serve as a unifying and legitimating force for an empire which had once persecuted it. Constantine formalized the process whereby church and state grew into alliance with one another after AD 312, and church leaders rapidly exploited the new opportunities that this opened. In this way a decisive alteration in Christianity’s relation to the social order took place, one which would have the most far-reaching consequences not only for the evolution of the church, but for the social and political ordering of Christianity’s territories in the East as well as in what, under Christian influence, would eventually become Western Europe.


See also: Classical Archaeology; Historiography and Historical Thought: Christian Tradition; Historiography and Historical Thought: Islamic Tradition; Judaism; Near Middle East/North African Studies: Religion

Bibliography
Brown P 1997 The Rise of Western Christendom. Triumph and Diversity AD 200–1000. Blackwell, Malden, MA and Oxford, UK
Cameron A 1991 Christianity and the Rhetoric of Empire. The Development of Christian Discourse. Berkeley, CA, Los Angeles, Oxford, UK
Elliott J H 1995 The Jewish Messianic movement: from faction to sect. In: Esler P F (ed.) Modelling Early Christianity: Social-Scientific Studies of the New Testament in its Context. Routledge, London and New York
Hazlett I (ed.) 1991 Early Christianity. Origins and Evolution to AD 600. SPCK, London
Meeks W A 1983 The First Urban Christians: The Social World of the Apostle Paul. Yale University Press, New Haven, CT
Sanders E P 1985 Jesus and Judaism. Fortress Press, Philadelphia
Schweitzer A 1931 The Mysticism of Paul the Apostle. A & C Black, London
Theissen G 1978 Sociology of Early Palestinian Christianity. 1st American edn. Fortress Press, Philadelphia
Troeltsch E 1931 The Social Teachings of the Christian Churches (trans. Wyon O). George Allen and Unwin, London, MacMillan, New York, Vol. 1

L. Woodhead

Chronic Illness, Psychosocial Coping with

1. Background

Improvements in health care technologies and treatments have resulted in increased life expectancies and improved disease management for individuals with chronic illnesses. To a great degree, quality of life for many individuals with these illnesses may be determined by the ways they deal with the illness. Thus, identifying effective and ineffective ways of coping with these diseases may lead to the development of more efficacious interventions for these individuals.

Since 1980 there has been a substantial amount of research devoted to understanding the relation between coping with chronic illnesses and psychological adaptation. Although there have been some consistent findings regarding coping and its impact on psychological outcomes, particularly in the area of coping with pain, the enthusiasm for the empirical study of coping in general has dampened significantly over the course of the past several years. Indeed, recent reviews of coping research have harshly criticized the literature, particularly assessment methodologies (see Coyne and Racioppo 2000). Thus, much of the initial promise for coping research to enhance clinical practice has not been realized.

2. Historical Perspective and Current Concepts of Coping

The psychological study of coping dates back to Sigmund Freud (1896/1966), who put forth the concept of defense mechanisms, defined as mental operations that kept painful thoughts and feelings out of awareness. The next major shift in the study of coping was brought about as a result of cognitive theories. The focus on intrapsychic processes that intervene between events and responses to events increased with the introduction of other cognitive theories such as that of Beck (1976). According to cognitive theories, cognitive coping mediated between stressful events and psychological and physical responses to stressful events. It was hypothesized that, by examining individual coping differences, a greater understanding of why people react differently to the same events would be achieved.

Research on stress and coping exploded with the work of Lazarus and Folkman (1984), who put forth the transactional stress and coping paradigm. According to Lazarus, coping refers to cognitive and behavioral efforts to manage disruptive events that tax the person’s ability to adjust (Lazarus 1981, p. 2). Chronic illness can pose a number of life stressors including loss of physical and social functioning, alterations in body image, managing difficult and complex medical regimens, and chronic pain. According to Lazarus and Folkman (1984) coping responses are a dynamic series of transactions between the individual and the environment, the purpose of which is to regulate internal states and/or alter person-environment relations. The theory postulates that stressful emotions and coping are due to cognitions associated with the way a person appraises or perceives his or her relationship with the environment. There are several components of the coping process. First, appraisals of the harm or loss posed by the stressor (Lazarus 1981) are thought to be important determinants of coping. Second, appraisal of the degree of controllability of the stressor is a determinant of coping strategies selected. A third component is the person’s evaluation of the outcome of their coping efforts and their expectations for future success in coping with the stressor. These evaluative judgements will lead to changes in the types of coping employed, as
well as play a role in determining psychological adaptation. Two main dimensions of coping are proposed, problem-focused and emotion-focused coping. Problem-focused coping consists of efforts aimed at altering the problematic situation. These coping efforts include information seeking and problem solving. Emotion-focused coping consists of efforts aimed at managing emotional responses to stressors. Such coping efforts include cognitive reappraisal of the stressor and minimizing the problem.

How the elements of coping unfold over time is a key theoretical issue involved in studies of coping processes. Despite the fact that the theory is dynamic in nature, most of the research utilizing the stress and coping paradigm put forth by Lazarus (1981) has relied on retrospective assessments of coping and has been cross-sectional. However, a team of researchers, including Affleck, Tennen, and Keefe (e.g., Keefe et al. 1997) has utilized a daily diary approach to assessing coping with pain, a methodology that can examine the proposed dynamic nature of coping.

Towards the end of the twentieth century there has also been an expansion in theoretical perspectives on cognitive coping. The literature on cognitive processing of traumatic life events has provided a new direction for coping research and broadened theoretical perspectives on cognitive methods of coping with chronic illness. According to cognitive processing theory, traumatic events can challenge people’s core assumptions about themselves and their world (Janoff-Bulman 1992). The unpredictable nature of many chronic illnesses, as well as the numerous social and occupational losses, causes many individuals to question beliefs they hold about themselves. For example, the diagnosis of chronic obstructive pulmonary disease (COPD) can challenge a person’s core beliefs about personal invulnerability. To the extent that a chronic illness challenges core beliefs, integrating the illness experience into a person’s pre-existing beliefs should promote psychological adjustment. ‘Cognitive processing’ is the term used for cognitive activities that help people view undesirable events in personally meaningful ways, find ways of understanding the negative aspects of the experience, and ultimately reach a state of acceptance. Attempts to find meaning or benefit in a negative experience are ways patients may be able to accept the losses they experience. Focusing on the positive implications of the illness and finding personal significance in a situation are two ways of finding meaning in the illness. When considering meaning-making coping, one must distinguish coping activities that help individuals to find redeeming features in an event from the successful outcome of these attempts. For example, people who have a serious illness may report that as a result they have found a new appreciation for life or that they place greater value on relationships. Patients may also develop an explanation for the illness that is more benign (e.g., attributing it to God’s will). While cognitive processing theory constructs have been applied to adjustment to losses such as bereavement (e.g., Davis et al. 1998), these constructs have received relatively little attention from researchers examining coping with chronic illness.

Another coping process that falls under the rubric of cognitive coping is social comparison. Social comparison (SC) is a common cognitive process whereby individuals compare themselves to others in order to obtain information about themselves (Gibbons and Gerrard 1991). According to SC theory, health problems increase uncertainty; uncertainty increases the desire for information and creates the need for comparison. Studies of coping with chronic illness have included social comparison as a focus. A certain type of SC, downward comparison, has been the focus of empirical study among patients with chronic illnesses such as rheumatoid arthritis (RA) (Tennen and Affleck 1997). Wills (1981) has suggested that people experiencing a loss can experience an improvement in mood if they learn about others who are worse off. Indeed, there is evidence to suggest that SC increases as a result of experiencing health problems (Kulik and Mahler 1997). One proposed mechanism for SC is that downward comparison impacts cognitive appraisal by reducing perceived threat. When another person’s situation appears significantly worse, then the appraisal of one’s own illness may be reduced (Aspinwall and Taylor 1993).

3. Assessment of Coping

3.1 General Coping Checklists

Folkman and Lazarus’ Ways of Coping Checklist (WOC, Folkman and Lazarus 1980) has been one of the most widely used instruments to assess coping efforts. This instrument contains two major subscales, problem-focused and emotion-focused coping, as well as a number of more specific subscales including wishful thinking, cognitive restructuring, information seeking, seeking support, self-blame, and minimization. Instructions typically ask the individual to rate how he or she manages the stressor (Manne and Zautra 1989). Another measure that has been used is the Coping Strategies Inventory (CSI, Tobin et al. 1989). The CSI distinguishes two dimensions of coping, engagement/disengagement strategies and focusing on the problem/focusing on emotions about the stressor. Problem-focused engagement is composed of problem-solving and cognitive restructuring; problem-focused disengagement is composed of problem avoidance and wishful thinking. Emotion-focused engagement is composed of social support and expressed emotion; emotion-focused disengagement is composed of social withdrawal and self-criticism.


Measuring meaning-making coping and other emotions; (f) seek spiritual comfort; and (g) seek
methods of cognitive processing has been done utiliz- emotional support. These coping categories have been
ing existing measures. Some aspects of meaning- reduced using factor analyses to two factors, labeled
making coping can be assessed using the cognitive emotion-focused and problem-focused coping (Affleck
reappraisal subscales of the COPE (Carver et al. 1989) et al. 1999).
and the Ways of Coping Checklist (Lazarus and
Folkman 1984). Other means of measuring the process
of meaning-making involve using measures of cog-
nitive processing. For example, the Impact of Events 4. Studies Using the Stress and Coping Paradigm
scale (Horowitz et al. 1979) measures attempts to
integrate a traumatic event with current schemas.
4.1 Cross-sectional Studies
Other studies have utilized questions tailored
specifically for their population. Early studies of coping using the stress and coping
paradigm were cross-sectional and utilized retrospec-
tive checklists such as the WOC. The earliest studies
divided coping into the general categories of problem-
and emotion-focused strategies, and focused mostly
3.2 Illness-specific Checklists
on psychological outcomes, rather than pain and
The majority of illness-specific coping instruments functional status outcomes.
have been designed to assess coping with pain associa- Later studies have investigated specific types of
ted with chronic illnesses such as rheumatoid arthritis coping. For example, Felton et al. (1984) examined
(RA) and osteoarthritis (OA). Two instruments, the two types of coping, wish-fulfilling fantasy, and
Vanderbilt Pain Management Inventory (VPMI) and information seeking, using a revision of the Ways of
the Coping Strategies Questionnaire (CSQ) have been Coping Checklist. Wish-fulfilling fantasy was a more
the most widely used instruments. Both measures consistent predictor of psychological adjustment than
assess the degree to which patients employ a variety of information seeking. While information seeking was
cognitive and behavioral mechanisms to reduce the associated with higher levels of positive affect, its
impact of painful episodes. Brown and Nicassio (1987) effects on negative affect were modest, accounting for
developed the VPMI to assess cognitive and be- only 4 percent of the variance. In a second study,
havioral pain-coping strategies. The 18-item VPMI Felton and Revenson (1984) examined coping of
has two subdimensions, active and passive pain patients with arthritis, cancer, diabetes, and hyper-
coping. The CSQ comprises seven subscales measuring tension. Wish-fulfilling fantasy, emotional expression,
distinct coping strategies. Factor analyses of the CSQ and self-blame were associated with poorer adjust-
in both RA and OA samples provide evidence for a ment, while threat minimization was associated with
two-factor solution, Coping Attempts and Pain Con- better adjustment. Scharloo et al. (1998) conducted a
trol and Rational Thinking (Keefe et al. 1987). cross-sectional study of individuals with COPD, RA,
The Coping with Rheumatic Stressors (CORS, or psoriasis. Unlike the majority of studies, illness-
Lankveld et al. 1994) was specifically designed to related variables such as time since diagnosis and the
measure stressor-specific coping in RA. This measure severity of the patient’s medical condition were entered
is unique in that it measures coping separately with first into the equation predicting role and social
three stressors, pain, limitations, and dependence. The functioning. Overall, coping was not strongly related
three coping with pain scales are comforting cogni- to social and role functioning. Among patients with
tions, decreasing activity, and diverting attention. The COPD, passive coping predicted poorer physical
three coping with limitation scales are optimism, functioning. Among patients with RA, higher levels of
pacing, and creative solution seeking. The two coping passive coping predicted poorer social functioning.
with dependence scales are making an effort to accept Very few studies have examined coping with other
dependence and showing consideration. chronic illnesses. Several studies have investigated the
association between coping and distress among indivi-
duals with MS. Pakenham et al. (1997) categorized
coping as either emotion– or problem-focused, and
found that emotion-focused coping was related to
3.3 Daily Diary Instruments
poorer adjustment, while problem-focused coping was
Only one instrument, the Daily Coping Inventory associated with better adjustment. In contrast, Wine-
(Stone and Neal 1984), has been developed to assess man and Durand (1994) found that emotion- and
daily coping. This inventory has been adapted for problem-focused coping were unrelated to distress.
chronic pain coping by Affleck et al. (1992). Patients Mohr et al. (1997) found that problem solving and
are asked whether or not they utilize each of seven cognitive reframing strategies are associated with
categories of coping: (a) pain reduction attempt; (b) lower levels of depression, whereas avoidant strategies
relaxation; (c) distraction; (d) redefinition; (e) vent are associated with higher levels of depression.
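The analytic strategy attributed above to Scharloo et al. (1998), in which illness-related variables are entered into the equation first and coping is entered afterwards, is commonly called hierarchical regression. The sketch below illustrates the general idea on simulated data. All variable names, effect sizes, and the data themselves are hypothetical and are not taken from any study cited here. It compares the variance in functioning explained by coping alone with the additional variance coping explains once illness severity is already in the model.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def incremental_r2(covariates, predictors, y):
    """Step 1: covariates only. Step 2: covariates plus predictors."""
    r2_step1 = r_squared(covariates, y)
    r2_step2 = r_squared(np.column_stack([covariates, predictors]), y)
    return r2_step1, r2_step2, r2_step2 - r2_step1


# Simulated data (hypothetical): severity drives both passive coping and
# functioning, so the raw coping-functioning association is inflated.
rng = np.random.default_rng(0)
n = 500
severity = rng.normal(size=n)
years_since_dx = rng.normal(size=n)
passive_coping = 0.6 * severity + rng.normal(size=n)
functioning = -0.8 * severity - 0.2 * passive_coping + rng.normal(size=n)

illness = np.column_stack([severity, years_since_dx])
step1, step2, delta = incremental_r2(illness, passive_coping, functioning)
coping_alone = r_squared(passive_coping, functioning)

print(f"R2, coping alone:             {coping_alone:.3f}")
print(f"R2, illness variables only:   {step1:.3f}")
print(f"R2, illness plus coping:      {step2:.3f}")
print(f"Increment due to coping:      {delta:.3f}")
```

In this simulation the increment attributable to coping is much smaller than the variance coping appears to explain on its own, which is one reason the order of entry matters when interpreting such results.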


As previously noted, most studies have used instructions that ask participants how they coped with the illness in general, rather than asking participants how they coped with specific stressors associated with their illness. Van Lankveld et al. (1994) assessed how RA patients cope with the most important stressors associated with RA (pain, functional limitation, and dependence). When coping with pain was considered, patients with similar degrees of pain who used comforting cognitions and diverted their attention from the pain reported higher well-being. Limiting one’s activity was associated with lower well-being. When coping with functional limitation was examined, patients who used pacing of their activity reported lower levels of well-being, and use of optimism was associated with higher well-being after functional capacity was controlled for in the equation. Finally, when coping with dependence was examined, only showing consideration was associated with higher well-being after functional capacity was controlled for in the equation.

4.2 Longitudinal Studies

Unfortunately, there have been relatively few studies that have employed longitudinal designs. Overall, passive coping strategies such as avoidance, wishful thinking, and withdrawal, as well as self-blame, have been shown to be associated with poorer psychological adjustment (e.g., Scharloo et al. 1999), and problem-focused coping efforts such as information seeking have been found to be associated with better adjustment (e.g., Pakenham 1999).

5. Studies of Coping with Chronic Pain

The majority of these studies have utilized longitudinal designs. For example, Brown and Nicassio (1987) studied pain-coping strategies among RA patients and found that patients who engaged in more passive coping when experiencing more pain became more depressed six months later than patients who engaged in these strategies less frequently. Keefe et al. (1989) conducted a six-month longitudinal study of the relationship between catastrophizing and depression in RA patients. Those patients who reported high levels of catastrophizing had greater pain, disability, and depression six months later. Other investigators (Parker et al. 1989) have reported similar findings. Overall, studies have suggested that self-blame, wishful thinking, praying, catastrophizing, and restricting activities are associated with more distress, while information seeking, cognitive restructuring, and active planning are associated with less distress.

As previously mentioned, several recent studies have employed prospective daily study designs in which participants complete a 30-day diary for reporting each day’s pain, mood, and pain-coping strategies using the Daily Coping Inventory (Stone and Neale 1984). These studies, which have been conducted with RA and OA patients, have shown that emotion-focused strategies, such as attempting to redefine pain to make it more bearable, and expressing distressing emotions about the pain, predict increases in negative mood the day after the diary report. The daily design is a promising new method of evaluating the link between coping strategies and mood. More importantly, these studies can elucidate coping processes over time. For example, Tennen et al. (2000) found that the two functions of coping, problem- and emotion-focused, evolve in response to the outcome of the coping efforts. An increase in pain from one day to the next increased the likelihood that emotion-focused coping would follow problem-focused coping. It appeared that, when efforts to directly influence pain were not successful, participants tried to alter their cognitions and adjust rather than influence the pain.

6. Challenges to the Study of Coping with Chronic Illness

Recently, the general literature on coping has received a great deal of criticism from researchers (e.g., Coyne and Racioppo 2000). The main concern voiced in reviews regards the gap between the elegant, process-oriented stress and coping theory and the cross-sectional, retrospective methodologies that have been used to evaluate the theory. Although the theory postulates causal relations among stress, coping, and adaptation, the correlational nature of most empirical work has been unsuitable to test causal relations. In addition, retrospective methods require people to recall how they coped with an experience, and thus are likely to be influenced by both systematic and non-systematic sources of recall error. Coping efforts as well as psychological outcomes such as distress are best measured close to when they occur. Recent studies have used an approach that addresses these concerns. These studies have employed a microanalytic, process-oriented approach using daily diary assessments (e.g., Affleck et al. 1999). These time-intensive study designs allow for the tracking of changes in coping and distress close to their real-time occurrence and moments of change, are less subject to recall error, and capture coping processes as they unfold over time. The daily assessment approach also can evaluate how coping changes as the individual learns more about what coping responses are effective in reducing distress and/or altering the stressor. These advances may help investigators to more fully examine whether the methods used to cope with stressors encountered in


the day-to-day experience of living with a chronic disease predict long-term adaptation. Unfortunately, this approach has only been utilized among individuals with arthritis and has not been applied to individuals dealing with other chronic illnesses.

Another key problem with coping checklists that has been noted in a number of reviews of the coping with chronic illness literature is the instructional format. The typical instructions used (e.g., ‘How do you cope with RA?’) are so general that it is not clear what aspect of the stressor the participant is referring to when answering questions. Thus, the source of the stress may differ across study participants. There are problems even when the participant is allowed to define the stressor prior to rating the coping strategies used. The self-defined stressor may differ across participants, and thus the analyses will be conducted with different stressors being rated.

A third assessment problem regards the definition of coping. While Lazarus and Folkman (1984) regard only effortful, conscious strategies as coping, other investigators have argued that ‘automatic’ coping methods also fall under the definition of coping (Wills 1997). Indeed, some coping responses may not be perceived by the individual as choices, but rather automatic responses to stressful events. For example, wishful thinking or other avoidant types of coping such as sleeping or alcohol use may be categorized by researchers as a coping strategy, but not categorized as such by the individual completing the questionnaire because the individual did not engage in this as an effortful coping strategy. A related and interesting issue regards the categorization of unconscious defense mechanisms. Cramer (2000), in a recent review of defense mechanisms, distinguishes between defenses that are not conscious and unintentional and coping processes that are conscious and intentional. However, there has been an interest in repressive coping, suggesting that some researchers regard defensive strategies such as denial and repression under the rubric of coping. More clarity and consistency between investigators in the definition of coping, particularly when unintentional strategies are being evaluated, would provide more clarity for research.

A fourth assessment issue regards the distinction between ‘problem-focused’ and ‘emotion-focused’ coping efforts. While researchers may categorize a particular coping strategy as problem-focused coping, the participant’s intention may not be to alter the situation, but rather to manage an emotional reaction. For example, people may seek information about an illness as a way of coping with anxiety and to alter their appraisal of a situation, rather than to engineer a change in the situation. The lack of an association between emotion-focused coping and psychological outcomes may, in part, be due to a categorization strategy that does not account for the intention of the coping. Studies utilizing these two categories to distinguish coping dimensions may wish to evaluate coping intention.

There are a number of additional methodological and conceptual challenges that are specifically relevant to studies of coping with chronic illness. First, relatively few studies control for disease severity in statistical analyses. Extreme pain or disability can result in both more coping attempts and more distress. Studies that do not take into account these variables may conclude mistakenly that more coping is associated with more distress. In addition, little attention has been paid to the effects of progressive impairment on the selection of coping strategies, and in the perceived effectiveness of those strategies. Chronic progressive illnesses may be expected to increase feelings of hopelessness. For example, Revenson and Felton (1989) studied changes in coping and adjustment over a six-month period and found that lower acceptance, more wishful thinking, and more negative affect accompanied increases in disability.

Another issue is the lack of longitudinal studies. Clearly, longitudinal studies would help the literature in a number of ways. First, this type of design might help clarify whether coping influences distress or whether coping is merely a symptom of distress, a criticism frequently raised in critiques of coping (e.g., Coyne and Racioppo 2000). Second, longitudinal studies may clarify the role of personality factors in coping. While some investigators suggest that personality factors play a limited role in predicting coping, other investigators argue that coping is a personality process that reflects dispositional differences during stressful events.

Although the lack of progress in the area of coping is frequently attributed to methods of assessment and design, the relatively narrow focus on distress outcomes may also account for some of the problem, particularly when coping with chronic illness is being evaluated. Chronic illness does not ultimately lead to psychological distress for the majority of patients. Indeed, many individuals report psychological growth in the face of chronic illness, and are able to find personal significance in terms of changes in views of themselves, their relationships with others, and a changed philosophy of life (Tennen et al. 1992). While positive affect is included as an adaptational outcome in some studies (e.g., Bendtsen and Hornquist 1991), the majority of studies do not include positive outcomes. Positive affect will be a particularly important outcome to evaluate when positive coping processes such as cognitive reappraisal and finding meaning in the experience are examined, as these types of coping may play a stronger role in generating and maintaining positive mood than in lowering negative mood.

Finally, relatively few studies have focused solely on coping and distress and have not taken into account potential moderators such as level of pain, appraisals of controllability, gender, and personality. A careful evaluation of potential moderators will provide both


researchers and clinicians with information about which circumstances particular coping strategies are most effective.

7. Conclusions

As Lazarus points out in his commentary in American Psychologist, ‘A premise that occurs again and again … is that for quite a few years research has disappointed many who had high hopes it would achieve both fundamental and practical knowledge about the coping process and its adaptational consequences. I am now heartened by positive signs that there is a growing number of sophisticated, resourceful, and vigorous researchers who are dedicated to the study of coping’ (Lazarus 2000). It is clear that, despite the multiple methodological problems that this area of research has faced in the past, a heightened awareness of these limitations has led to the application of sophisticated methods that might assist this field in fulfilling the high hopes held for it. If investigators in the field of coping with chronic illnesses can adapt daily-diary methods to their populations, focus on specific stressors related to the illness when instructing participants to answer coping questions, include coping appraisals and the perceived efficacy of coping efforts, and carefully delineate illness-related, contextual, and dispositional moderators, the findings may lead to the development of effective interventions for clinicians hoping to improve the quality of life for these individuals.

See also: Chronic Illness: Quality of Life; Chronic Pain: Models and Treatment Approaches; Coping across the Lifespan; Coping Assessment; Coronary Heart Disease (CHD), Coping with; Illness Behavior and Care Seeking; Illness: Dyadic and Collective Coping; Pain, Health Psychology of; Pain, Management of; Rheumatoid Arthritis: Psychosocial Aspects; Social Support and Health; Stress and Coping Theories; Well-being and Health: Proactive Coping

Bibliography

Affleck G, Tennen H, Keefe F J, Lefebvre J C, Kashikar-Zuck S, Wright K, Starr K, Caldwell D 1999 Everyday life with osteoarthritis or rheumatoid arthritis: independent effects of disease and gender on daily pain, mood, and coping. Pain 83: 601–9
Affleck G, Urrows S, Tennen H, Higgins P 1992 Daily coping with pain from rheumatoid arthritis: patterns and correlates. Pain 51: 221–9
Aspinwall L G, Taylor S E 1993 Effects of social comparison direction, threat and self-esteem on affect, self-evaluation and expected success. Journal of Personality and Social Psychology 64: 708–22
Beck A T 1976 Cognitive therapy and the emotional disorders. International Universities Press, New York
Bendtsen P, Hornquist J O 1994 Rheumatoid arthritis, coping and well-being: cross-comparisons and correlational analyses. Scandinavian Journal of Social Medicine 22: 97–106
Brown G, Nicassio P 1987 Development of a questionnaire for the assessment of active and passive coping strategies in chronic pain patients. Pain 3: 53–84
Carver C S, Scheier M F, Weintraub J K 1989 Assessing coping strategies: A theoretically-based approach. Journal of Personality and Social Psychology 56: 267–83
Coyne J C, Racioppo M W 2000 Never the twain shall meet? Closing the gap between coping research and clinical intervention research. American Psychologist 55: 655–64
Cramer P 2000 Defense mechanisms in psychology. Psychological Review 3: 357–70
Davis C G, Nolen-Hoeksema S, Larson J 1998 Making sense of loss and benefiting from the experience: Two constructs of meaning. Journal of Personality and Social Psychology 75: 561–74
Felton B, Revenson T 1984 Coping with chronic illness: A study of illness controllability and the influence of coping strategies on psychological adjustment. Journal of Consulting and Clinical Psychology 52: 343–53
Felton B J, Revenson T A, Hinrichsen G 1984 Stress and coping in the explanation of psychological adjustment among chronically ill adults. Social Science and Medicine 10: 889–98
Folkman S, Lazarus R S 1980 An analysis of coping in a middle-aged community sample. Journal of Health and Social Behavior 21: 219–39
Freud S 1966 Further remarks on the neuro-psychoses of defense. In: Strachey J (ed.) The Standard Edition of the Complete Psychological Works of Sigmund Freud. Hogarth Press, London (Original work published 1896), Vol. 3, pp. 141–58
Gibbons F X, Gerrard M 1991 Downward comparison and coping with threat. In: Suls J, Wills T A (eds.) Social Comparison: Contemporary Theory and Research. Erlbaum, Hillsdale, NJ, pp. 317–46
Horowitz M, Wilner N, Alvarez W 1979 Impact of event scale: A measure of subjective distress. Psychosomatic Medicine 41: 209–18
Janoff-Bulman R 1992 Shattered Assumptions: Towards a New Psychology of Trauma. Free Press, New York
Keefe F J, Affleck G, Lefebvre J, Starr K, Caldwell D, Tennen H 1997 Pain coping strategies and coping efficacy in rheumatoid arthritis: a daily process analysis. Pain 69: 35–42
Keefe F J, Brown G K, Wallston K A, Caldwell D S 1989 Coping with rheumatoid arthritis pain: Catastrophizing as a maladaptive strategy. Pain 37: 51–6
Keefe F J, Caldwell D S, Queen K T, Gil K, Martinez S, Crisson J 1987 Pain coping strategies in osteoarthritis patients. Journal of Consulting and Clinical Psychology 55: 208–12
Kulik J, Mahler H 1997 Social comparison, affiliation, and coping with acute medical threats. In: Buunk B P, Gibbons F X (eds.) Health, Coping and Well-Being: Perspectives from Social Comparison Theory. Erlbaum, Mahwah, NJ, pp. 227–61
Lazarus R S 1981 The stress and coping paradigm. In: Eisdorfer C, Cohen D, Kleinman A, Maxim P (eds.) Models for Clinical Psychopathology. Spectrum, New York, pp. 177–214
Lazarus R S 2000 Toward better research on stress and coping. American Psychologist 55: 665–73


Lazarus R S, Folkman S 1984 Stress, Appraisal, and Coping. Springer, New York
Manne S L, Zautra A J 1989 Spouse criticism and support: Their association with coping and psychological adjustment among women with rheumatoid arthritis. Journal of Personality and Social Psychology 56: 608–17
Mohr D C, Goodkin D E, Gatto N, Van Der Wende J 1997 Depression, coping, and level of neurological impairment in multiple sclerosis. Multiple Sclerosis 3: 254–8
Pakenham K I 1999 Adjustment to multiple sclerosis: Application of a stress and coping model. Health Psychology 18: 383–92
Pakenham K I, Stewart C A, Rogers A 1997 The role of coping in adjustment to multiple sclerosis-related adaptive demands. Psychology, Health and Medicine 2: 197–211
Parker J, Smarr K, Buescher K, Phillips L R, Frank R, Beck N C 1989 Pain control and rational thinking: Implications for rheumatoid arthritis. Arthritis and Rheumatism 32: 984–90
Revenson T, Felton B 1989 Disability and coping as predictors of psychological adjustment to rheumatoid arthritis. Journal of Consulting and Clinical Psychology 57: 344–8
Scharloo M, Kaptein A A, Weinman J A, Hazes J M W, Breedveld F, Anderson S K, Walker S E, Rooijmans H G M 1999 Predicting functional status in patients with rheumatoid arthritis. Journal of Rheumatology 26: 1686–93
Scharloo M, Kaptein A A, Weinman J A, Hazes J M W, Willems L L N A, Bergman W, Rooijmans H G M 1998 Illness perceptions, coping and functioning in patients with rheumatoid arthritis, chronic obstructive pulmonary disease and psoriasis. Journal of Psychosomatic Research 44: 573–85
Stone A A, Neale J M 1984 A new measure of daily coping: development and preliminary results. Journal of Personality and Social Psychology 46: 892–906
Tennen H, Affleck G 1997 Social comparisons and occupational stress: The identification-contrast model. In: Buunk B P, Gibbons F X (eds.) Health, Coping and Well-Being: Perspectives from Social Comparison Theory. Erlbaum, Mahwah, NJ, pp. 262–98
Tennen H, Affleck G, Armeli S, Carney M 2000 A daily process approach to coping: Linking theory, research and practice. American Psychologist 55: 626–36
Tennen H, Affleck G, Urrows S, Higgins P, Mendola R 1992 Perceiving control, construing benefits and daily processes in rheumatoid arthritis. Canadian Journal of Behavioral Science 24: 186–203
Tobin D L, Holroyd K A, Reynolds R V, Wigal J K 1989 The hierarchical factor structure of the Coping Strategies Inventory. Cognitive Therapy and Research 13: 343–61
van Lankveld W, van de Bosch P V, van Putte L, Naring G, van der Staak C 1994 Disease-specific stressors in rheumatoid arthritis: Coping and well-being. British Journal of Rheumatology 33: 1067–73
Wills T A 1981 Downward comparison principles in social psychology. Psychological Bulletin 90: 245–71
Wills T 1997 Modes and families of coping. In: Buunk B, Gibbons F X (eds.) Health, Coping and Well-being: Perspectives from Social Comparison Theory. Erlbaum, Mahwah, NJ, pp. 167–93
Wineman N M, Durand E J, McCulloch B J 1994 Examination of the factor structure of the ways of coping questionnaire with clinical populations. Nursing Research 43: 266–73

S. Manne

Chronic Illness: Quality of Life

1. Background

Increased longevity and the development of sophisticated health care technologies and treatments mean that people today live with chronic health conditions over extended periods of their lives. Quality of life (QOL) has become an important goal of treatment and marker of success in health care interventions in chronic illness generally. In many disorders, e.g., osteoarthritis, health interventions will have little impact on mortality statistics but great potential for reducing disability and increasing QOL.

QOL research first developed in cancer settings where the balance of quality and duration of life became a key concern in decisions to use novel treatments with very serious side effects and only partial efficacy. However, over the past 20 years there has been a burgeoning of research activity in every major chronic illness category. This is evident for different diseases at various structural levels. For instance, the US research funding agency, the National Heart, Lung, and Blood Institute, requires almost all clinical trials and many epidemiological studies it funds to have a QOL component. In rheumatology, there is an international, professionally endorsed cooperative called OMERACT (Outcome Measures in Rheumatoid Arthritis Clinical Trials). They seek to improve QOL outcome measurement through data-driven interactive consensus. In cancer, the European Organisation for Research on the Treatment of Cancer (EORTC) has been established by interested professionals. They have developed a core QOL measure and disease-specific modules for various types of cancer. To the extent that QOL assessment reflects consideration of aspects of these diseases beyond the biomedical, QOL assessment reflects the increasingly biopsychosocial perspective in modern health care.

2. Uses of Quality of Life Assessment in Chronic Illness

QOL research in chronic illness has policy, treatment evaluation, descriptive, and individual clinical uses. There is now a vast literature on, and compendia of, instruments for this very wide set of circumstances and health conditions (e.g., Bowling 1997). At a policy level, initiatives to develop QOL assessment are concerned with means to identify those components of the healthcare system which are most effective in achieving good health outcomes. The Medical Outcomes Study is the major example of this approach. As part of a two-year prospective study to identify the association of reimbursement aspects of a health funding system with health outcomes in the US, a health-related QOL (HRQOL) measure was devised. The most widely used version, the short-form 36-item questionnaire (SF-36), was developed as a generic HRQOL instrument which could be used with the general population as well as patient groups (Stewart et al. 1989). The SF-36 has eight dimensions: physical functioning; role limitations because of physical health problems; bodily pain; social functioning; general mental health; role limitations because of emotional problems; vitality, energy, or fatigue; and general health perception. Items on each dimension are weighted so that each scale adds to 100 points, with higher scores indicating better health. Thus health profiles across the eight dimensions can be easily compared in graphic form across different disease or treatment groups. For SF-36, individuals rate their own health status. A different approach has been tried at the general population level in the State of Oregon, USA, to assist the public in prioritizing medical treatments for inclusion in publicly funded health care. Here, people were asked to rate the hypothetical impact of various levels of symptoms and restrictions as a way to create a hierarchy of levels of challenge to QOL. However, it has not proven possible to date to develop a system whereby the public are willing to withhold certain treatments from patients on the basis of lesser clinical or QOL benefit.

Much of the work using QOL instruments in chronic illness has been to evaluate the QOL impact of differing types of treatment, e.g., comparing medications. In this formulation, QOL is seen as a dependent variable. Treatment comparisons are usually made on a combination of generic and disease-specific QOL instruments (see Quality of Life: Assessment in Health Settings) to cover the range of dimensions in which one intervention may differ from another in impact. In priority terms, QOL is seen on a continuum. It is positioned after efficacy and safety of the treatment and before or after cost-effectiveness, depending on the costs and condition under consideration. One merit of this approach has been to illustrate the sometimes very differing perspectives of professionals and patients. For instance, innovative limb-sparing treatment for cancer was found to result in poorer QOL for patients than the traditional amputation option (Sugarbaker et al. 1982). This finding was subsequently used to refine treatment such that initial difficulties with the limb-sparing approach could be removed, and it was then offered as the better QOL option.

QOL research has provided a large descriptive body of research outlining the experience of various health conditions. This work is of benefit to professionals, patients, and the public in understanding the typical HRQOL challenges in different health conditions; e.g., the SF-36 has been used to provide a HRQOL profile of nine common chronic medical conditions (Stewart et al. 1989). From this we can see that cardiac conditions such as myocardial infarction and congestive heart failure have a greater overall impact on HRQOL than do others such as diabetes.

A final use of QOL assessment is to facilitate decision making for the individual patient. While some instruments such as the Dartmouth Primary Care Cooperative Information Project (COOP) charts have been developed with clinical settings and ease of administration and interpretation as a priority, the evidence is that there is little formal QOL assessment in this format at present. This is so despite significant professional support for the broad concept of QOL assessment in clinical practice.

3. Differing Perspectives on Quality of Life: Needs Versus Wants

Quality of life has been generically defined as the capacity of an individual to achieve their life plans or the difference, at a particular point in time, between the hopes and expectations of an individual and their present situation. In Calman’s (1984) definition, a small difference or ‘gap’ is commensurate with a good QOL. He proposes that the gap is widened by experiences such as illness and that it can be reduced by improving the present situation or by reducing the person’s hopes and expectations to a more ‘realistic’ level. The World Health Organization (WHO) definition acknowledges possible cultural, as well as individual, differences in the understanding of the concept of QOL (see Quality of Life: Assessment in Health Settings). The WHOQOL instrument is based on a set of five domains: physical function; emotional function; social relationships; levels of independence; and environment. While core domains have been demonstrated to be relevant across cultures, the individual facets and items to assess these domains are developed within cultures. Thus while the domain of independence will feature in both British and Indian versions of the instrument, the particular items to best reflect that domain may differ. The WHOQOL instrument also illustrates the multifaceted nature of QOL; working capacity and vitality/fatigue might relate to work life from the physical function domain while working environment and transport would relate to work from the environment domain. QOL has also been defined more narrowly in a health-related context (HRQOL) (see Quality of Life: Assessment in Health Settings). HRQOL provides more focus on those specific aspects of life function likely to be influenced by illness although there is debate about how to differentiate health-related and non-health-related aspects of QOL.

Underlying most definitions of QOL, although usually not articulated, is one of two philosophical perspectives: the ‘needs’ or the ‘wants’ approach to QOL (Browne et al. 1997). The needs approach views QOL as the extent to which universal human needs are met. Many HRQOL instruments conform to the needs approach: the themes, questions, and question weightings used are standard and established independent of

Chronic Illness: Quality of Life

the person being assessed. The wants approach allows for a definition of QOL using themes, questions, or question weightings determined by the individual being evaluated. The methodologies used to achieve the wants approach are very varied and include content analysis and repertory grid (see Quality of Life: Assessment in Health Settings regarding individual QOL methods). These measures typically do not limit the focus of enquiry to health, and it is with these assessments (which usually focus more on perception than function) that there are findings of high levels of QOL in groups with major chronic illness, as discussed in the next section.

4. Challenges to QOL Assessment in Chronic Illness

One major challenge relates to the comparison of professional and patient perspectives on patient QOL. As reviewed by Sprangers and Aaronson (1992), professionals are poor judges of patient QOL. Moreover, professionals consistently rate patient QOL as poorer than do patients. Where family members are involved in ratings, their views are intermediate. There is little understanding of the processes that help patients to develop or maintain a reasonable QOL in the face of what others define more negatively.

A second complementary challenge arises from comparative QOL assessment where instrument use allows comparison with the general population. Many of these studies find little difference in QOL between ‘healthy’ individuals and those with serious health conditions, with both groups reporting high QOL (Allison et al. 1997). Such study comparisons differ depending on the wants versus needs approach taken to assessing QOL, as discussed in the previous section. Interesting work in the cancer area shows how patients with serious cancers, when compared with orthopedic patients and matched healthy controls, focus away from health issues in order to maintain a good level of life satisfaction (Kreitler et al. 1993).

One way to address both of the challenges outlined is to see QOL as having a trait or homeostatic dimension such that individuals ‘reset’ their evaluative framework in adversity to recreate an appropriate QOL level for themselves. Diener (2000) discusses psychological perspectives on adaptation and goal setting relevant to this issue. The practical challenge when developing this understanding is how to use the knowledge to promote psychological growth in the context of a chronic health condition.

5. Conclusions and Outlook

To date, social and behavioral scientists have made major contributions to QOL assessment in chronic illness on a scale of cooperation with the medical sciences that is probably unique. They have done so in ways that reflect the needs/wants dichotomy of approaches. Many have provided psychometric skills within a largely biomedical framework to address challenges such as increasing responsiveness to change in evaluative research instruments (e.g., McGee et al. 1999) or developing markers to identify clinical as distinct from statistical significance with HRQOL instruments (e.g., Juniper et al. 1994). The aim of the latter is to be able to use HRQOL scores in much the same manner as one would use blood pressure or body temperature readings to determine what course of action to take and when to do so. Others have been concerned with methods of documenting the salient and dynamic features of QOL for individuals; this has involved both qualitative and quantitative methods from a more phenomenological perspective. The challenge for the coming decades is to find some reconciliation between these two perspectives. In this way QOL assessment can be better used to direct services at individual and societal levels while expanding our understanding of how QOL is developed, maintained, and altered as necessary through the course of chronic illness and its treatment.

See also: Chronic Illness, Psychosocial Coping with; Chronic Pain: Models and Treatment Approaches; Coping across the Lifespan; Quality of Life: Assessment in Health Settings; Stress and Coping Theories; Welfare: Philosophical Aspects; Well-being (Subjective), Psychology of

Bibliography

Allison P J, Locker D, Feine J S 1997 Quality of life: a dynamic construct. Social Science and Medicine 45: 221–30
Bowling A 1997 Measuring Health: A Review of Quality of Life Scales, 2nd edn. Open University Press, Birmingham, UK
Browne J P, McGee H M, O’Boyle C A 1997 Conceptual approaches to the assessment of quality of life. Psychology and Health 12: 737–51
Calman K C 1984 Quality of life in cancer patients: an hypothesis. Journal of Medical Ethics 10: 124–7
Diener E 2000 Subjective well-being. The science of happiness and a proposal for a national index. American Psychologist 55: 34–43
Juniper E F, Guyatt G H, Willan A, Griffith L E 1994 Determining a minimal important change in a disease-specific quality of life questionnaire. Journal of Clinical Epidemiology 47: 81–7
Kreitler S, Chaitchuk S, Rapport Y, Kreitler H, Algor R 1993 Life satisfaction and health in cancer patients, orthopaedic patients and healthy individuals. Social Science and Medicine 36: 547–56
McGee H M, Hevey D, Horgan J H 1999 Psychosocial outcome assessments for use in cardiac rehabilitation service evaluation: a 10-year systematic review. Social Science and Medicine 48: 1373–93
Sprangers M A G, Aaronson N K 1992 The role of health care providers and significant others in evaluating the quality of life of patients with chronic disease: a review. Journal of Clinical Epidemiology 45: 743–60


Stewart A L, Greenfield S, Hays R D, Wells R D, Rogers W H, Berry S D, McGlynn E A, Ware J E 1989 Functional status and well-being of patients with chronic conditions: Results from the Medical Outcomes Study. JAMA: Journal of the American Medical Association 262: 907–13
Sugarbaker P H, Barofsky I, Rosenberg S A, Gianola P J 1982 Quality of life assessment of patients in extremity sarcoma clinical trials. Surgery 91: 17–23

H. M. McGee

Chronic Pain: Models and Treatment Approaches

1. Introduction

Pain is one of the most common reasons for seeking medical care (Hart et al. 1995, Gureje et al. 1998). For many, pain is an adaptive response warning of the potential for bodily harm. This signal helps to prevent further injury and tissue damage. However, research as well as common experience suggests that pain is much more than a mere barometer of the amount of tissue damage. Despite the universal experience of pain, it is a difficult phenomenon to define precisely. We have all seen instances where two people report radically different amounts of pain despite an equivalent injury. Clinicians have also noted that soldiers report little or no pain after a combat injury until after they are removed from the battlefield. The wide variations in how people communicate their pain further complicate matters. Some may express pain by crying or groaning, whereas others are more stoic and handle their pain by gritting their teeth and suffering silently.

It is important to distinguish pain from nociception. Nociception is a physiological process by which transduction of sensory information is activated in specialized nerve endings that convey information about tissue damage (e.g., location, descriptive features) to the central nervous system. Pain, on the other hand, is an integrated perceptual process. The International Association for the Study of Pain (IASP) recognizes the complexity of pain and treats it as a phenomenological experience unique to each person. According to the IASP definition, pain is ‘an unpleasant sensory and emotional experience normally associated with tissue damage or described in terms of such damage’ (Merskey 1986). In addition to this global definition, many commonly used pain terms and related phenomena have been described (Turk and Okifuji 2001). For example, pain may be acute and a response to an identifiable trauma or disease, or it may be episodic and occur intermittently, as in the case of migraine headache. Pain may also be relatively constant, extending for years. Unlike those with acute or recurrent pain, people with chronic pain may suffer 24 hours a day, 365 days a year. Table 1 includes a list of four classes of pain based upon timing and clinical implications.

Table 1
Four types of pain

Transient Pain: Pain elicited by activation of nociceptive processes when no significant injury is present; the pain stops as soon as the activation ends. This type of pain is ubiquitous in everyday life and is rarely a reason to seek health care. EXAMPLES: pain associated with injection for immunization

Acute Pain: Pain elicited by nociceptive activation at the site of injury or disease. This type of pain is often a reason to seek medical care. EXAMPLES: post-surgical pain, pain associated with muscle sprain

Recurrent Pain: Episodic or intermittent pain. Each pain episode is relatively short lasting but recurs across an extended period of time. EXAMPLES: migraine headaches, tic douloureux, sickle cell crisis

Chronic Pain: Usually elicited by an injury but lasting for a long period of time after the original damage has healed. At the chronic stage, no objective pathology may be present that can explain the presence of pain. EXAMPLES: chronic low back pain, fibromyalgia

Source: Turk & Okifuji 2001

In this article an overview is provided of the current understanding of pain and its management. First the conceptualizations that have dominated views of pain historically are reviewed. The current conceptualizations that integrate multifactorial aspects of pain are described. The article also focuses on psychosocial factors that have been shown to contribute to the subjective experience of pain as well as adaptation and response to treatment. Following the discussion of models of pain, a historical overview of the most common perspectives and treatments of pain is presented. The importance of a multidisciplinary approach to understanding and treating complex chronic pain syndromes is stressed. Finally, the psychosocial strategies that incorporate behavioral and cognitive–behavioral perspectives with the goals of improving pain and functioning of people suffering from pain are emphasized. It is not meant to be suggested, however, that pain is purely psychological or that physical factors are unimportant. Rather, emphasis is put on psychosocial factors, as they have previously been less integrated into conventional thinking about chronic pain. The psychological approaches described are almost always employed within the multidisciplinary treatment program (Flor et al. 1992).

2. Historical Conceptualizations of Pain

Over the centuries, many prominent scholars have attempted to uncover the mechanisms underlying pain. Perhaps among the most influential was the seventeenth century French philosopher, René Descartes. Descartes conceptualized pain as the result of the representation of a painful stimulus traveling along the nervous system, starting at the periphery (the flame depicted in Figure 1) and traveling along the nerve to the spinal cord, eventually reaching the brain, where it is acknowledged as ‘pain.’ Descartes’s approach formed the basis for the mind–body dualism that plays a prominent role in the understanding and treatment of pain.

Figure 1
L’homme de René Descartes (reproduced by permission of René Descartes Paris: Charles Angot, 1664)

2.1 Biomedical (pain = physical pathology) Model

The traditional biomedical view that continues to dominate is reductionistic. This perspective assumes that every report of pain must be associated with a specific physical cause (e.g., pain caused by the flame in Figure 1). As a consequence, the extent of pain reported should be directly proportional to the amount of tissue damage. The expectation is that once the physical cause has been identified, appropriate treatment will follow and positive outcomes will be attained. Treatment typically focuses on removing the putative cause of the pain, or on chemically or surgically disrupting the pain pathways (cutting or blocking the transmission of signals from the periphery to the brain). According to this model, once the cause of the pain is removed or the pain pathways are blocked, pain should be eliminated.

There are, however, several perplexing features of pain that do not fit neatly into the biomedical model with its suggestion of a one-to-one relationship between tissue pathology and symptoms. Most notably, not all pain accompanies observable pathology. For example, in over 85 percent of the cases of back pain, one of the most common pain conditions in Western countries, the cause of the pain is unknown (Deyo 1986). Conversely, physical pathology does not necessarily cause pain. Diagnostic imaging studies identify significant pathology in up to 35 percent of people who are pain-free (Jensen et al. 1994). Thus, a paradox exists. On one hand there are people without significant pathology who report severe pain, and on the other hand there are people with objectively determined physical pathology who do not report the presence of any pain.

2.2 Psychogenic (‘It’s All in Your Head’) Model

According to this model, pain, in the absence of physical pathology, is a result of the patient’s inherent personality traits or psychological problems. The psychogenic view is posed as the flip side of the coin from the physical or biomedical model. From this perspective, if the report of pain occurs in the absence of, or is disproportionate to, objective physical pathology, then, ipso facto, the pain reports must have a psychological etiology. Thus, a dichotomy is posed: pain is either somatogenic or psychogenic.

2.3 Motivational (Secondary Gain) Model

A conceptualization related to the psychogenic one described focuses specifically upon motivation. From this perspective, reports of pain in the absence of physical pathology are attributed to the desire of the person complaining to obtain some benefit such as


attention, time off from undesirable activities, or financial (disability) compensation (i.e., secondary gain). Unlike the psychogenic model, in the motivational model the assumption is that the person is consciously attempting to acquire a desirable outcome based on the complaint of pain. Thus, the complaint of pain in the absence of pathology is taken to be fraudulent. Simply put, the person who reports pain in the absence of objective pathology is lying in order to obtain a desired outcome.

2.4 Operant (Social Reinforcement) Model

The operant or social reinforcement model is based upon the behavioral learning paradigm in which the probability of exhibiting a specific behavior depends on the consequences that follow the behavior. Behaviors are most likely to recur when they are reinforced by desirable consequences (e.g., attention, sympathy), by removal of something aversive (e.g., participating in undesirable tasks, work activity), or by avoidance of some undesirable consequence. Fordyce (1976) suggested that certain behaviors associated with pain (e.g., limping, grimacing) could be reinforced to the point where the behaviors recur without the original nociceptive stimuli. Pain behaviors that are consistent with acute pain, and that may be reflexive responses to noxious stimulation, may become chronic pain behaviors even when there is no physiological basis for exhibiting those behaviors. Thus, in this model, chronic pain is considered a manifestation of learned behaviors that need to be extinguished, whereas well behaviors, such as activity, need to be increased. The laws of operant learning are viewed as playing a central role in the maintenance of chronic pain behaviors. Other factors, such as physiology, emotions, and the person’s attitudes and beliefs, play very little role in the operant model.

3. Multidimensional Conceptualization of Pain

The unidimensional models described fail to explain individual and situational variation in pain experiences. The article will now focus on more contemporary models of pain with multifactorial bases.

3.1 Gate Control Theory

Melzack and Wall (1965) developed the gate control theory that integrated physical and psychological factors regarding pain experience. Briefly, the gate control theory proposes that a mechanism in the dorsal horn of the spinal cord acts as a ‘gate’ that can inhibit or facilitate transmission of nerve impulses from the periphery to the brain. Melzack and Wall postulated that the gating mechanism is influenced not only by injury sites at the periphery but also by people’s psychological states. They suggested that the reticular formation in the brain functions as a central biasing mechanism inhibiting the transmission of pain signals. Psychological factors that affect the reticular formation may modulate the pain experience.

Melzack and Wall emphasized the modulation of inputs in the dorsal horn of the spinal cord and the dynamic role of the brain in pain processes and perception. As a result, the gate control theory integrates psychological variables such as past experience, attention, and other cognitive activities into current research on and therapy for pain. Prior to this formulation, psychological processes largely were dismissed as merely reactions to pain. This new model suggested that severing nerves, the putative pain pathways, was inadequate, as a host of other factors modulated the input. Perhaps the major contribution of the gate control theory was that it highlighted the central nervous system as playing an essential role in the perception and interpretation of nociceptive stimuli associated with the experience of pain.

The physiological details of the gate control model have been challenged and it has been suggested that the model is incomplete. As additional knowledge has been gathered since the original formulation of the gate control model, specific points have been disputed and physiological details of the model have been refined. The conceptual aspects of the gate control theory have, however, proved remarkably resilient and flexible in the face of accumulating scientific evidence. The gate control model still provides a powerful summary of the phenomena observed in the spinal cord and brain, and has the capacity to explain many of the most puzzling problems encountered in the clinic. The gate control theory has had enormous heuristic value in stimulating further research in the basic science of pain mechanisms and in spurring new approaches to treatment.

3.2 Cognitive–Behavioral Model

The focus of the cognitive–behavioral model is on the person’s own subjective perspectives (e.g., attitudes, beliefs, expectations) and feelings about their plight (Turk et al. 1983). The model assumes that although nociceptive stimuli precede pain, how the person perceives the nociceptive event forms a total pain experience by interacting with the sensory event. For example, negative and pessimistic views by people about their pain condition (e.g., ‘It’s hopeless and it will never get better.’) and their capabilities for managing pain and stress (e.g., ‘There is absolutely nothing I can do about my pain.’) are likely to exacerbate their emotional distress and sense of disability. Similarly, if one views pain as inevitable (e.g., ‘I was injured, that is why I hurt.’), attention to sensory events may become pronounced, and as a


consequence, relatively subtle sensory information may be interpreted as being painful.

The cognitive–behavioral model also acknowledges the effects of physiological factors and the environment on behaviors and of behaviors on thoughts and feelings. Reporting of symptoms to family and health care providers is influenced by how people view their pain problem and, of course, physical pathology.

The basic assumption of the cognitive–behavioral model is that people are not passive entities who simply react reflexively to nociception or social reinforcement, but actively engage in defining the experience. Based upon past learning and medical history, people develop subjective representations of illness and symptoms, or ‘schemas.’ These schemas become the filters through which people process new sensory stimuli (Cioffi 1991).

Beliefs about the meaning of pain and one’s ability to function despite pain are important aspects of pain schema. When confronted with pain, people draw causal, covariational, and consequential inferences about their symptoms based upon their own schematic references. For example, if the schema includes a strong belief that all physical activities must be ceased when experiencing pain and that pain is an acceptable justification for neglecting domestic and occupational responsibilities, poor adaptation and coping are likely to result (Williams and Thorn 1989).

4. Treatments

As has been discussed previously, pain is a complex phenomenon and diverse in its nature and degrees of associated impairment. It is therefore important to know the objectives of treatments for pain. Many pain states are expected to remit over time with little intervention required. Some types of acute pain may require surgical or pharmacological therapy to treat underlying pathology or to block noxious sensory information. A more comprehensive plan is required for treating people with chronic pain. Many people with chronic pain live with their pain for years, and the adverse effects of pain may be generalized across all aspects of their lives: familial, social, occupational, as well as physical. Given the multilayered problems associated with pain, treatment goals become more global. Rather than treating pain, it becomes necessary to treat people with chronic pain.

4.1 Historical Approaches to Treatment

The earliest approach to treating pain was aligned closely with unimodal views, particularly those based on the biomedical model. Pain was thought of as reflecting a physical problem. Early treatment approaches focused on the idea that something had to be removed or signals had to be interrupted to bring about relief from pain. The earliest reference to medication is found in an ancient Egyptian text dating back to 1550 BC, in which the goddess Isis recommended the use of opium to relieve Ra’s headache (Bonica 1953). Ancient pain treatments have included crocodile dung, oils derived from ants, earthworms and spiders, spermatic fluid from frogs, and moss scraped from the skull of a victim of a violent death. A quick surf of the Internet will reveal that many equally esoteric preparations are endorsed readily even today.

The most common contemporary treatment approaches continue to be based primarily on the biomedical model of pain, involving pharmacotherapy (with new drugs touted on an almost weekly basis) or surgery performed along nearly every site of the nervous system including the periphery (sympathectomy), the spinal cord (percutaneous cordotomy, destructive neural blockade), and the brain (thalamotomy, prefrontal lobotomy). Although current methods are more sophisticated than the ancient ones, the basic principles remain the same: alteration of an alleged physical cause should result in symptomatic improvement. Some modalities are quite helpful and a great many people suffering from acute and cancer pain have benefited from those advances in pain medicine and surgery. However, surgery and pharmacotherapy are not panaceas and do not result in adequate control of pain in all cases.

In contrast, some primitive treatment methods focused on the cause of pain being purely psychogenic. The assumption was that if no physical cause could be detected then the pain must be caused by emotional disturbance that required treatment, and once psychological problems were resolved the pain would resolve. Such a psychogenic bias can be seen in the current Diagnostic and Statistical Manual of Mental Disorders (American Psychiatric Association 1994). Two types of pain disorders are listed as mental disorders: Pain Disorder Associated with Psychological Factors and Pain Disorder Associated with Both Psychological Factors and a General Medical Condition. In both cases, pain is the primary complaint, with psychological factors being considered important in the onset, severity, exacerbation, or maintenance of the pain. A somewhat presumptuous assumption underlying these disorders is that the ‘appropriate’ physical etiology of chronic pain must be identifiable (e.g., free from psychological factors) or patients are suffering from a psychiatric disorder. The assumption ignores the current consensus that pain is not a pure sensory experience but is inherently a biopsychosocial phenomenon resulting in an intense, subjective experience of discomfort.

4.2 Comprehensive Approaches to the Treatment of the Person with Chronic Pain

The recognition of pain, particularly chronic pain, as a major healthcare issue has resulted in the development


of a growing number of specialists and specialized facilities offering treatments for pain over the last few decades. In the United States alone, over 3300 pain specialists and treatment facilities have been identified (Marketdata Enterprises 1995).

Comprehensive, multidisciplinary treatments involving teams of professionals (physicians, psychologists, physical therapists, occupational therapists, vocational therapists, and nurses) have been developed to treat patients with the most recalcitrant problems. Flor et al. (1992) summarized the major characteristics of these patients as seven years of pain history and at least one failed surgery. These patients are likely to be unemployed, to be receiving disability payments, and to present an array of psychosocial and motivational issues. In addition to the suffering associated with chronic pain, these patients present socioeconomic difficulties for society due to loss of productivity and costs associated with disability payments and medical care.

Focus will be on the pain management approaches that have been developed based upon the psychological models described earlier. Although these interventions can be used separately, with few exceptions (e.g., biofeedback for tension-type headache) unimodal approaches are less effective than comprehensive rehabilitation programs. Typically, the psychological approaches are part of a more comprehensive rehabilitation program in which psychologists work together with therapists from other disciplines (e.g., physicians and physical therapists).

The multidisciplinary rehabilitation programs are philosophically distinct from unimodal medical and surgical treatments. The first and most critical distinction is that the major goals of the multidisciplinary treatment programs extend beyond pain relief. Rather, they focus on physical and psychological functioning as well.

Overall, rehabilitation programs that incorporate a significant psychosocial component are most effective in improving patient functioning, returning patients to work, reducing use of analgesics, reducing health care utilization, and reducing disability costs (Okifuji et al. 1998). ‘Cure’ of pain (i.e., total relief from pain) is, unfortunately, rarely attained. Reduction of pain is usually only moderate but comparable to that observed with usual medical and surgical approaches that are more invasive and have increased likelihood of iatrogenic consequences.

4.2.1 Treatment based on the operant model. As noted previously, the operant model of pain focuses upon pain-related behaviors believed to be maintained by social reinforcement. Such behaviors are considered to be maladaptive and thus need to be eliminated. As discussed, reinforcers are the consequences of behaviors that increase the likelihood that behaviors will be repeated. Thus, treatment involves eliminating the rewarding consequences of pain behaviors and increasing the positive responses for activity and other well behaviors. In these programs, moaning, grimacing, and other pain behaviors are ignored. Usual activities of daily living and functional exercises are prescribed and positively reinforced. Medications are made time-contingent, rather than prescribed on an ‘as needed’ (prn) basis that is believed to reinforce positively the pain behaviors. Simultaneous with the attempts to extinguish pain behaviors, operant treatments are designed to help patients acquire a set of new, more adaptive behaviors. Quota-based exercise programs with gradually increasing functional activities form the core of operant treatment.

Although operant factors may play an important role in the maintenance of disability, the model has been criticized at two levels. First, it fails to integrate factors other than reinforcement, such as sensory, emotional, and cognitive factors, in the overall pain experience. Second, the assumption that pain behaviors are acquired, maintained, and extinguished solely through environmental reinforcement contingencies is questioned. For example, physical signs, patients’ self-efficacy beliefs, and depression are also reportedly related to pain behaviors (Buckelew et al. 1953).

Pain behaviors have been considered to be maladaptive manifestations of pain, guided by an incentive for attention or avoidance of physical activity. However, pain behaviors may be functional if indeed the behaviors protect patients from further injury or exacerbation of pain. Determination of whether an overt pain behavior in a given patient is functional or maladaptive needs to be based upon careful assessment of various factors associated with his or her pain conditions, not just the reinforcement contingencies. Despite the criticisms, operant-based treatments generally are successful in reducing overt pain behaviors, increasing well behaviors, and decreasing analgesic medication.

4.2.2 Treatments based on the cognitive–behavioral model. As noted earlier, the cognitive–behavioral model of pain acknowledges the importance of cognitive variables interacting with sensory, affective, and behavioral ones to establish, maintain, and exacerbate the pain experience. The nature of specific techniques may vary from program to program; however, the primary goals of the cognitive–behavioral approach are relatively uniform (i.e., enhancement of patients’ sense of control over their symptoms, increased use of adaptive skills to cope with pain and stress). Cognitive–behavioral therapy is designed to assist patients to identify, evaluate, and correct maladaptive conceptualizations and dysfunctional

1786
Chronic Pain: Models and Treatment Approaches

Table 2
Common examples of cognitive errors
Overgeneralization: Extrapolation from the occurrence of a specific event
or situation to a large range of possible situations. ‘This coping strategy didn’t work;
nothing will ever work for me’
Catastrophizing: Focusing exclusively on the worst possibility regardless of its
likelihood of occurrence. ‘My back pain means my body is degenerating and falling apart’
All-or-none Thinking: Considering only the extreme ‘best’ or ‘worst’ interpretation
of a situation without regard to the full range of alternatives. ‘If pain is not completely gone,
I cannot do anything right’
Jumping to Conclusions: Accepting an arbitrary interpretation without a rational
evaluation of its likelihood. ‘The doctor didn’t return my call today. He thinks I am a hopeless case’
Selective Attention: Selectively attending to negative aspects of a situation while
ignoring any positive factors. ‘Physical exercises only make my pain worse’
Negative Predictions: Assuming the worst. ‘I know I will not get better even
with all these therapies and everyone will dislike me for that’
Mind Reading: Making arbitrary assumptions about others without finding out what others
are thinking. ‘My husband does not talk to me about my pain because he doesn’t care about me’

beliefs about themselves and their predicament. Additionally, patients are taught to
recognize the connections linking cognition, affect, and behavior along with their joint
consequences. Patients are encouraged to become aware of and to monitor the impact those
maladaptive thoughts may have in the maintenance and exacerbation of maladaptive
behaviors (see Table 2 for common maladaptive thoughts).
The cognitive–behavioral approach consists of four interrelated phases. These include
(a) reconceptualization, (b) acquisition of coping skills and self-management strategies,
(c) skill consolidation, and (d) generalization and maintenance. The first phase,
reconceptualization, uses cognitive restructuring, a method that encourages people to
identify and change maladaptive thoughts and feelings that are associated with the
experience of pain. The crucial element in successful treatment is bringing about a shift
in the patient’s thought processes, away from well-established, habitual, and automatic
but maladaptive thoughts toward more hopeful and rational ones. Cognitive restructuring
helps foster the reconceptualization by helping patients to become aware of the role
thoughts and emotions play in potentiating and maintaining stress and pain. The aim of
this phase is to combat the sense of demoralization that many with chronic pain
experience. Generally, the process of cognitive restructuring begins with a presentation
of a situation or event that provoked a pain-related response. The situation is dissected
to identify key thoughts and feelings that precede, accompany, and follow an episode or
exacerbation of ongoing pain and pain-related problems. Then patients are encouraged to
challenge the legitimacy of the thoughts: Was it true? Was it reasonable? Was it the only
way to respond? What alternatives are available? and so on. Patients are instructed to
gather evidence for or against their own maladaptive automatic thoughts. Alternatives are
discussed, with the suggestion that different ways of thinking can affect mood, behavior
(e.g., reducing physical activity leading to greater disability) and even physiological
activity (e.g., increasing muscle tension and thereby exacerbating pain).
The second phase is the acquisition of self-management skills. A wide variety of
techniques have been shown to be effective for reducing suffering and disability. Some of
these strategies are self-regulatory skills (e.g., relaxation, controlled breathing, and
attention diversion) that allow pain sufferers to regulate their own physiological
responses that may be involved in the maintenance and exacerbation of pain. Other
self-management strategies include stress-reduction skills (e.g., problem-solving,
behavioral rehearsals) that allow people with chronic pain to effectively manage the
stress-inducing thoughts, behaviors, and emotions that trigger pain, emotional distress,
and other maladaptive responses. Instead of being a passive recipient of a medical
intervention (e.g., medication, anesthetic nerve block), patients now learn to use
self-management strategies to play an active role in managing the myriad of problems
created by the presence of persistent pain. Research has suggested that there is no one
specific coping skill that best manages pain and disability (Fernandez and Turk 1992). It
is recommended generally that chronic pain patients should be taught various types of
coping skills to help them acquire a range of options.


Phase 3 is skill consolidation. During the skill-consolidation phase, patients practice
and rehearse the skills that they have learned during the acquisition phase and apply
them outside the hospital or clinic. Practice may start with mental rehearsal, during
which patients imagine using the skills in different situations. Therapists can make use
of role-playing, in which patients rehearse learned skills in situations that mirror
their home environments. Therapists may start with relatively easy examples and then
introduce scenarios that are progressively more realistic. The importance of skill
consolidation through home practice cannot be overstated. When patients practice skills
at home, it is useful for them to record their experiences including any difficulties
that arise. Once problems associated with using the newly acquired skills are identified,
these become targets for further discussion.
The final stage is phase 4, the preparation for generalization and maintenance. To
maximize the likelihood of maintenance and generalization of treatment gains, therapists
focus upon the cognitive activity of patients as they confront problems throughout
treatment (e.g., failure to achieve goals, plateaus in progress, recurrent stress). These
circumstances are used as opportunities to assist patients to learn how to handle
setbacks and lapses because they are probably inevitable parts of life and will occur
after the termination of the treatment. In the final phase of treatment, discussion
focuses on ways of predicting and dealing with symptoms and related problems following
treatment termination. Patients are encouraged to anticipate future problems, stress, and
symptom-exacerbating events during treatment and to plan how to respond and cope with
these problems.
Since self-initiating pain management is a key factor in pain rehabilitation, some type
of cognitive–behavioral approach generally is included in a multidisciplinary pain
program.
Cognitive–behavioral approaches have been demonstrated to be effective with a wide range
of debilitating pain syndromes including low back pain (Lanes et al. 1995), arthritis
(Parker et al. 1995), and fibromyalgia (Turk et al. 1998). Moreover, these methods have
been shown to be effective with children (Walco et al. 1992) and geriatric samples
(Calfas et al. 1992).

5. Summary
This chapter reviewed the definition of pain and the historical context within which pain
has been conceptualized, made a distinction between nociception (a sensory process) and
pain (a perception), and described the most common conceptualizations based on the
unidimensional and multidimensional models of pain. It was noted that treatment
strategies have closely followed the conceptualizations of pain and have included
unidimensional modalities (i.e., biomedical, psychogenic) and progressed to
multidimensional modalities.
Although pain generally is considered a physical phenomenon, pain involves various
cognitive, affective, and behavioral features. These psychological factors are important
not only in determining the perception of pain, but also in defining disability and
patients’ general well being. It should be clear from the review that pain has three main
components (physical, psychosocial, and behavioral) that interact to define the unique
pain experience. Because the pain experience is subjective and idiosyncratic, it cannot
be understood without evaluating how patients perceive and appraise their conditions. A
complete clinical picture involves consideration of how patients view their plight. By
understanding the phenomenology of chronic pain and disability, effective treatment can
be planned to alleviate persistent and debilitating pain and improve physical and
psychological functioning, thereby reducing the disability that accompanies chronic pain.

See also: Chronic Illness, Psychosocial Coping with; Chronic Illness: Quality of Life;
Pain, Health Psychology of; Pain, Management of; Pain, Neural Basis of.

Bibliography
American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental
Disorders: DSM-IV. American Psychiatric Association, Washington, DC
Anonymous 1995 Chronic pain management programs: a market analysis. Marketdata
Enterprises, Tampa, FL
Bonica J J 1993 The Management of Pain. Lea & Febiger, Philadelphia
Buckelew S P, Murray S E, Hewett J E, Johnson J, Huyser B 1995 Self-efficacy, pain, and
physical activity among fibromyalgia subjects. Arthritis Care Research 8: 43–50
Calfas K, Kaplan R, Ingram R 1992 One-year evaluation of cognitive–behavioral
intervention in osteoarthritis. Arthritis Care Research 5: 202–9
Cioffi D 1991 Beyond attentional strategies: cognitive–perceptual model of somatic
interpretation. Psychological Bulletin 109: 25–41
Deyo R A 1986 Early diagnostic evaluation of low back pain. Journal of General Internal
Medicine 1: 328–38
Fernandez E, Turk D C 1992 Sensory and affective components of pain: separation and
synthesis. Psychological Bulletin 112: 205–17
Flor H, Fydrich T, Turk D C 1992 Efficacy of multidisciplinary pain treatment centers: a
meta-analytic review. Pain 49: 221–30
Fordyce W E 1976 Behavioral Methods in Chronic Pain and Illness. Mosby, St. Louis
Gureje O, Von Korff M, Simon G E, Gater R 1998 Persistent pain and well-being: a World
Health Organization study in primary care. JAMA 280: 147–51

1788
Chronology, Stratigraphy, and Dating Methods in Archaeology

Hart L G, Deyo R A, Cherkin D C 1995 Physician office visits Table 1


for low back pain. Frequency, clinical evaluation, and Datable materials and appropriate dating methods in common use
treatment patterns from a U.S. national survey. Spine 20: in early twenty-first century
11–19
Jensen M C, Brant-Zawadzki M N, Obuchowski N, Modic M,
Malkasian D, Ross J 1994 Magnetic resonance imaging of the
lumbar spine in people without back pain. New England
Journal of Medicine 331: 69–73
Lanes T C, Gauron E F, Spratt K F, Wernimont T, Found E,
Weinstein J 1995 Long-term follow-up of patients with chronic
back pain treated in a multidisciplinary rehabilitation pro-
gram. Spine 20: 801–6
Melzack R, Wall P D 1965 Pain mechanisms: a new theory.
Science 150: 971–9
Merskey H 1986 Classification of chronic pain. Descriptions of
chronic pain syndromes and definitions of pain terms. Pain
–Supplement 3: S1–226
Okifuji A, Turk D, Kalauokalani D 1998 Clinical outcome and
economic evaluation of multidisciplinary pain centers. In:
Block A R, Kremer E F, Fernandez E (eds.) Handbook of Pain
Syndromes: Biopsychosocial Perspectives. Erlbaum, Mahwah,
NJ
Parker J C, Smarr K L, Buckelew S P, Stucky-Ropp R, Hewett
J, Johnson J, Wright G, Irvin W, Walker S 1995 Effects of
stress management on clinical outcomes in rheumatoid arthritis.
Arthritis and Rheumatism 38: 1807–18
Turk D C, Meichenbaum D, Genest M 1983 Pain and Behavioral
Medicine: A Cognitive–Behavioral Perspective. Guilford, New
York
Turk D C, Okifuji A 2001 Pain terms and taxonomies of pain.
In: Loeser J D, Butler S H, Chapman C R, Turk D C (eds.) regions of the world, radiocarbon, dendrochronology,
Management of Pain. 3rd edn. Lippincott Williams & Wilkins, obsidian hydration, and archaeomagnetism are the
Philadelphia most common physical dating methods used, pre-
Turk D C, Okifuji A, Sinclair J D, Starz T 1998 Interdisciplinary sumably in that order, but often together as a group
treatment for fibromyalgia syndrome: clinical and statistical providing redundant checks on one of the most
significance, Arthritis Care and Research 11: 186–95 important goals of archaeology—providing a frame-
Walco G A, Varni J W, Ilowite N T 1992 Cognitive–behavioral work of time, a chronology, upon which a rational
pain management in children with juvenile rheumatoid arth- reconstruction of the past can be built. Regardless of
ritis. Pediatrics 89: 1075–79 an archaeologist’s theoretical or political perspectives,
Williams D A, Thorn B E 1989 An empirical assessment of pain
the first goal is to order regional prehistoric or historic
beliefs. Pain 36: 351–58
sites, a single site, or a portion of the site into
D. C. Turk meaningful cultural and chronological segments.
Chronology is a primary goal, but should not be
confused with the intent of reconstructing the past,
regardless of what an archaeologist’s definition of the
ultimate purpose may be.
During the formative cultural historic perspective in
archaeology in the first half of the twentieth century,
Chronology, Stratigraphy, and Dating constructing regional and site-oriented chronologies
Methods in Archaeology became the primary goal of archaeology, particularly
in North America and the UK (Trigger 1989). A
dissatisfaction with this rather unilineal explanation of
1. Introduction: Estimating Age in the the past stirred archaeologists to look beyond simply
Archaeological Record constructing chronologies and to begin to understand
process and social relationships in the past. Dating
Of the twenty or so dating methods employed in methods and the construction of cultural chronolo-
twenty-first-century archaeology, the US Congres- gies, however, have remained the basement upon
sional Office of Technology Assessment (OTA) report which archaeologists build their understanding of
on Technologies for Prehistoric and Historic Preservation social process through time.
(US Congress, OTA 1986) listed only seven While it would be optimal to discuss all the potential
and highlighted two radiocarbon and archaeomag- dating methods available to archaeologists, this is not
netic techniques; see Taylor 2000, pp. 75–60. In many possible in a short synopsis. The most commonly used


Figure 1
Commonly used dating methods in archaeology. Those marked with (*) are discussed in some detail in this entry
Source: adapted from Waters 1992

methods, and those showing promise, will be covered begun with John Frere in 1797 attempting to under-
here (see Table 1 and Fig. 1). Refer to the publications stand the relationship between stone axes in a sedi-
in the bibliography for more detailed treatments. mentary sequence in England: ‘The manner in
which…[the hand axes] lie would lead to the per-
suasion that it was a place of their manufacture and
2. Archaeological Stratigraphy not of their accidental deposit…It may be suggested
that the different strata were formed by inundations
Stratigraphic relations have always been the primary happening at distant periods’ (Frere in Rapp and Hill
method to infer the relative age of artifacts within a 1998, p. 5).
site. Stratigraphy is defined here as the study of the This idea that features or artifacts found within a
spatial and temporal relationships between the sedi- site in the same stratigraphic level are contempor-
ments and the soil (Waters 1992, pp. 60–1). Much of aneous forms the foundation upon which archaeo-
this section’s discussion can be more robustly under- logical stratigraphy is based. In early twenty-first
stood by referring to the entry Geoarchaeology; an century archaeology, the various relative and absolute
awareness of the geological processes of archaeo- dating methods such as radiocarbon or archaeo-
logical site formation is crucial to an understanding magnetic dating are used to verify this assumption
of dating archeological sediments. Indeed, the and produce a site chronology (see Fig. 2). Fig. 2
methods used by Quaternary geologists to date sedi- shows a simple stratigraphic profile of a single ex-
ments include most of the methods used by archaeo- cavation unit with the relative positions of natural and
logists and discussed here (see Fig. 1). Stratigraphic cultural features, and the position of radiocarbon
studies of archaeological sites are designed to define dates recovered from the various levels. Note that the
objectively and categorize the sediments and soils, the radiocarbon dates indicate that, in general, the stra-
contact units between them, and the amount of time tigraphy is intact in that the oldest dates are at the
they represent, as well as their relationship to the bottom and the latest dates at the top, except for one
surrounding sediment history. date (CAMS 43177) at 3690±50 BP which seems to be
Archaeological stratigraphy is based on the geo- out of sequence and suggests ‘small-scale’ disturbance
logical concepts of the law of superposition, which probably caused by the digging of the pit structure
states that older sediments are emplaced at a lower during the later ceramic period (Shackley et al. 2000).
level than more recent sediments. Therefore, sites, Time-sensitive artifacts, such as previously dated
features, and artifacts residing in lower levels are, by pottery or projectile point types, can be used as index
definition, older than those in upper levels. This fossils within the stratigraphic column to create a
conceptual framework in archaeology appears to have relative chronology within the site. The stratigraphic


Figure 2
Simple stratigraphic wall profile of a unit at McEuen Cave (AZ W:13:6 ASM) showing position of bedrock,
undisturbed strata, pit fill structure, and the position of recovered radiocarbon dates. Source: Shackley et al. 2000
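
The superposition check that Fig. 2 illustrates, in which dates should grow older with depth and an exception signals disturbance, can be sketched in a few lines. This is only an illustration: the level numbers, the three "HYP" samples, and the two-sigma tolerance are hypothetical assumptions, and only the CAMS 43177 age of 3690±50 BP comes from the text.

```python
import math
from dataclasses import dataclass
from itertools import combinations

@dataclass
class DatedSample:
    lab_id: str
    level: int    # excavation level; a larger number means deeper (assumed convention)
    age_bp: int   # uncalibrated radiocarbon age in years BP
    sigma: int    # one standard deviation in years

def superposition_conflicts(samples, n_sigma=2.0):
    """Return (upper, lower) lab-ID pairs where the shallower sample is
    older than the deeper one by more than n_sigma combined uncertainty."""
    conflicts = []
    ordered = sorted(samples, key=lambda s: s.level)
    for upper, lower in combinations(ordered, 2):
        if upper.level == lower.level:
            continue  # same level: superposition says nothing about order
        gap = upper.age_bp - lower.age_bp
        tolerance = n_sigma * math.hypot(upper.sigma, lower.sigma)
        if gap > tolerance:
            conflicts.append((upper.lab_id, lower.lab_id))
    return conflicts

# Hypothetical profile; only the CAMS 43177 age is taken from the text.
samples = [
    DatedSample("HYP-1", 2, 1200, 40),
    DatedSample("HYP-2", 5, 2300, 50),
    DatedSample("CAMS 43177", 6, 3690, 50),
    DatedSample("HYP-3", 8, 2800, 50),
]
conflicts = superposition_conflicts(samples)
```

A flagged pair shows only that two dates disagree with superposition. Deciding which sample was displaced still depends on the field evidence, such as the pit structure noted in the text.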

example discussed above also used index fossils, in this adopted, since many geoarchaeologists are convinced
case projectile point types, to add another chrono- that the existing geological code is adequate for
logical piece of information. In Level 8, where the archaeological research (see Stein 1987). More re-
early radiocarbon date as well as some more recent cently, Harris has proposed an ‘archaeological tool’ to
radiocarbon dates were recovered, Middle Archaic as understand archaeological stratigraphy called the
well as Late Archaic projectile point forms were ‘Harris Matrix,’ essentially a simple postexcavation
recovered; both confirming the time range suggested way in which to understand the relationships between
by the radiocarbon date ranges (Fig. 2, Shackley et al. stratigraphic units in a single diagram as reflected on
2000). In this instance the stratigraphy, radiocarbon paper, or more recently in digital form (see Harris et
dates, and index fossils all contributed to an under- al. 1993). Farrand (1984) and others, mainly geol-
standing of the site chronology and the integrity of the ogists, have questioned Harris’s assumption that
archaeological deposit. archaeological stratigraphy is primarily culturally
conditioned rather than geologically determined. In a
Harris Matrix analysis the relationship between strati-
graphic units is based mainly on index fossils, for
2.1 Constructing Stratigraphic Chronologies:
example the relationship between artifacts and
Competing Ideas
features, rather than geological features or context—
It is important to note, however, that there is con- essentially emphasizing content rather than structure.
siderable recent controversy over the proper method Despite criticism from many geoarchaeologists, the
to employ in the interpretation of archaeological Harris Matrix analysis has become increasingly popu-
stratigraphy (Farrand 1984, Harris et al. 1993, Stein lar with archaeologists working in varieties of geo-
1987, Waters 1992). Archaeological stratigraphic logical and cultural contexts worldwide, particularly
codes similar to geological stratigraphic codes and in Europe. The growing popularity of the Harris
methodologies have been proposed but not generally Matrix approach will probably continue, particularly


in small site settings where a larger geological view is precision. At the turn of the twenty-first century,
difficult to obtain or not necessarily relevant. image analysis technology was being used to develop
electronic workstations, including image databases in
laptop computers, to perform routine comparisons between
the database and archaeological wood samples
3. Sidereal Dating Methods
similar to techniques used in obsidian hydration
Sidereal dating methods include historical records, analysis.
glacial varve dating, and dendrochronology. While Today, in a number of regions of the world (upper
only dendrochronology will be discussed here, glacial altitudes in the North American Southwest, Western
varve dating is quite useful in the upper latitudes, and Europe, the Eastern Mediterranean) dendrochron-
in some cases may be quite precise. Historical records ology forms the backbone of chronometric dating.
are a basic dating tool in historic and ethnohistoric When available, dendrochronology provides an ab-
archaeology, in addition to the social information they solute chronology that anchors other dating tech-
may contain. niques.
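
Cross-dating, the matching of ring-variation patterns among trees on which dendrochronology rests, can be sketched as sliding an undated ring-width series along a dated master sequence and keeping the best-correlated position. This is a simplified illustration under stated assumptions: the ring widths and the starting year are hypothetical, and actual practice standardizes the series and applies statistical checks that are omitted here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def cross_date(master, master_start_year, sample):
    """Slide `sample` along `master`; return (first_year, last_year, r)
    for the offset with the highest correlation."""
    best_offset, best_r = 0, -2.0
    for offset in range(len(master) - len(sample) + 1):
        r = pearson(master[offset:offset + len(sample)], sample)
        if r > best_r:
            best_offset, best_r = offset, r
    first_year = master_start_year + best_offset
    return first_year, first_year + len(sample) - 1, best_r

# Hypothetical master chronology (ring widths in mm) starting in year 1000.
master = [1.2, 0.8, 1.5, 0.4, 1.1, 0.9, 1.7, 0.3, 1.0, 1.4, 0.6, 1.3]
# Hypothetical specimen: same pattern as master[4:9], from a slower-growing tree.
sample = [0.99, 0.81, 1.53, 0.27, 0.90]
first_year, last_year, r = cross_date(master, 1000, sample)
```

Correlation is used here because it ignores overall growth rate and compares only the pattern of wide and narrow rings, which is the attribute the entry describes as climatically controlled.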

3.1 Dendrochronology
The science that uses annual tree rings for dating past 4. Isotopic Dating Methods
events and reconstructing past environmental con- Perhaps the one area of dating that has seen the
ditions has undergone explosive growth in recent greatest advances recently are techniques based on
decades (Dean 1997). While dendrochronology as an radioactive decay: ¹⁴C, K-Ar, and ⁴⁰Ar/³⁹Ar. Ad-
archaeological dating tool enjoyed tremendous ex- vances in accelerator mass spectrometry and laser
pansion in the 1990s, particularly in western North fusion have propelled these techniques into the fore-
America, Europe, Siberia, and the eastern Mediter- front of the arsenal of methods used to deal with time
ranean, it is used wherever appropriate trees occur and in the archaeological record. Importantly, ¹⁴C is
a sequence has been established. Most importantly, generally restricted to dating the last 50,000 years and
directly dated tree rings have been instrumental in the ⁴⁰Ar/³⁹Ar, previously restricted to time periods near
calibration of the radiocarbon timescale discussed and over one million years, are now nearly overlapping
below. This process indicated that the radiocarbon with ⁴⁰Ar/³⁹Ar using the laser fusion method and
chronology underestimates the true ages of materials yielding younger and younger ages. These methods do
older than 2,000 years and that ¹⁴C dates must be have limitations, but the organic and mineral restric-
corrected. tions are being transcended almost daily.
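
The limit of roughly 50,000 years on ¹⁴C dating follows from exponential decay. A minimal sketch, assuming the Libby half-life of 5,568 years and the 5,730-year value that this entry quotes in its radiocarbon sections (the function and constant names are illustrative and do not come from any laboratory software):

```python
import math

LIBBY_HALF_LIFE = 5568.0    # years; the value this entry attributes to Libby
REVISED_HALF_LIFE = 5730.0  # years; the value this entry gives as the true half-life

def fraction_remaining(years, half_life=LIBBY_HALF_LIFE):
    """Fraction of the original 14C still present after `years`."""
    return 0.5 ** (years / half_life)

def age_from_activity(activity_ratio, half_life=LIBBY_HALF_LIFE):
    """Elapsed years implied by (sample activity / modern activity)."""
    if not 0.0 < activity_ratio <= 1.0:
        raise ValueError("activity_ratio must be in (0, 1]")
    return -half_life * math.log(activity_ratio) / math.log(2.0)

ten_half_lives = fraction_remaining(10 * LIBBY_HALF_LIFE)  # 1/1024 of the original
age_at_quarter = age_from_activity(0.25)                   # two half-lives
```

After ten half-lives only about one part in a thousand of the original ¹⁴C survives, which is why measurement becomes impractical at that range. An age computed this way is still a radiocarbon age: rescaling by `REVISED_HALF_LIFE / LIBBY_HALF_LIFE` (about 1.03) adjusts for the half-life, but a calendar date additionally requires calibration against tree-ring data.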
The technique is based on the concept that each year
(four seasons) a tree will accumulate one growth ring,
and that that ring’s attributes reflect the specific
climatic regime of that year. Since each year is 4.1 Radiocarbon Dating (¹⁴C)
essentially a unique climatic record, the attributes of Given that the greatest level of human activity oc-
the tree ring (thickness of inner and outer bands) are curred after the evolution of modern Homo sapiens, in
correspondingly unique. In trees, particularly fast- the last 45,000 years or so, radiocarbon dating has
growing conifers, the progression of rings from pith to become the most commonly relied upon dating
circumference presents an ‘unalterable temporal or- method in archaeology (Taylor in Taylor and Aitken
der, and the production of but one ring per year 1997). Radiocarbon dating, now in its fifth decade of
provides the incremental regularity necessary to es- general use, is a primary tool used by archaeologists
tablish a fixed [and absolute] time scale’ (Dean 1997, and Quaternary geologists to date the past.
p. 34). More than 180 tree and shrub species worldwide
possess the attributes required for successful dendro-
chronological studies: visible and unambiguous ring
definition, production of a set number of rings 4.1.1 The radiocarbon method. There are three prin-
(generally only one) per year, mainly climate-con- cipal isotopes of carbon which occur naturally: ¹²C,
trolled growth, and the presence of useable mor- ¹³C (both stable), and ¹⁴C (unstable or radioactive).
phological features that allow for ring comparison. The radiocarbon method is based on the rate of de-
Cross-dating, matching patterns of ring variation cay of the radioactive or unstable carbon isotope 14
among trees, is the necessary principle of dendro- (¹⁴C), which is formed in the upper atmosphere
chronology. A sequence within a region is derived through the effect of cosmic ray neutrons upon
from overlap between cut trees as well as archae- nitrogen 14. The reaction is: ¹⁴N + n → ¹⁴C + p (where
ological specimens, sometimes with great time depth n is a neutron and p is a proton).
into the thousands of years. While this process is The ¹⁴C formed is rapidly oxidized to ¹⁴CO₂ and
apparently simple, the application necessarily requires
enters the Earth’s plant and animal lifeways through


Table 2
Recent pre-3000 BP (RCY) maize and cucurbit AMS 14C dates from the American Southwest showing reporting conventions

Source: Shackley et al. 2000


1 The 2σ dendrocalibrated minimums (intercepts) and maximums are based on a bidecadal correction in calendar years BP, rounded to the
nearest decade, from Stuiver and Reimer (1993, version 3.0.3c, Method A). The maize dates are not corrected for C4 carbon uptake and can
actually be up to 240 years older than reported here assuming –10‰. 2 This date and context is somewhat suspect, but was associated with
a similar charcoal date at Los Pozos in the Tucson Basin (Freeman 1997, see also Gregory 1999). 3 This date on a Cucurbita sp. is probably
from a wild species. However, three specimens of maize were dated to over 2800 RCY BP (Hard and Roney 1998, p. 1661).
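
The 240-year figure in note 1 can be reproduced with simple arithmetic. The sketch below assumes two conventions that the entry does not state: normalization of radiocarbon ages to a δ¹³C of −25‰, and a shift of roughly 16 years per 1‰ of difference. Both constants and the function name are assumptions for illustration.

```python
# Assumed conventions (not stated in the entry):
STANDARD_DELTA13C = -25.0  # per mil; assumed normalization value
YEARS_PER_PERMIL = 16.0    # assumed approximate age shift per 1 per mil

def fractionation_offset(delta13c):
    """Years to add to an uncorrected age for a sample with the given delta-13C."""
    return YEARS_PER_PERMIL * (delta13c - STANDARD_DELTA13C)

maize_offset = fractionation_offset(-10.0)  # C4 plant value assumed in note 1
```

With the −10‰ value given in the note, the difference from the assumed standard is 15‰, which at 16 years per 1‰ gives the 240 years cited.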

photosynthesis and the food chain. Plants and animals small amount of radioactive carbon present in a
which utilize carbon in biological food chains take up sample. At about 50,000 to 60,000 years, then, the
¹⁴C during their lifetimes. They exist in equilibrium limit of the technique is reached (beyond this time,
with the ¹⁴C concentration of the atmosphere; that is, other radiometric techniques must be used for dating,
the number of ¹⁴C atoms and nonradioactive carbon such as ⁴⁰Ar/³⁹Ar, as discussed above). By measuring
atoms stays approximately the same over time. As the ¹⁴C concentration or residual radioactivity of a
soon as a plant or animal dies, they cease the metabolic sample whose age is not known, it is possible to obtain
function of carbon uptake; there is no replenishment the number of decay events per gram of carbon. By
of radioactive carbon, only decay. comparing this with modern levels of activity (1890
Libby, Anderson, and Arnold (Taylor and Aitken wood corrected for decay to AD 1950) and using the
1997) first discovered that this decay occurs at a measured half-life, it becomes possible to calculate a
constant rate. They found that after 5,568 years half date for the death of the sample.
the ¹⁴C in the original sample will have decayed, and It follows from this that any material which is
that after another 5,568 years half of that remaining composed of carbon may be dated. Herein lies the true
material will have decayed, and so on. The half-life advantage of the radiocarbon method—it is able to be
(t₁/₂) is the name given to this value, which Libby uniformly applied throughout the world. A list of the
measured at 5568±30 years. This became known as tremendous quantity of organic material that can be
the Libby half-life. After 10 half-lives, there is a very dated by radiocarbon is available at the University of


Waikato’s radiocarbon Web site, a standard for counting in mainly scintillation counters. This re-
understanding the technique: http://c14.sci.waikato.ac.nz/webinfo/int.html. quires a relatively large sample, depending on the
amount of carbon remaining in that sample. By the
late 1970s a number of researchers discovered that
when accelerating sample atoms in the form of ions
4.1.2 Calibration and radiocarbon dating. As men- to much higher energies in particle accelerators, a
tioned above, there are a number of effects that can much smaller sample was required to derive confi-
cause errors in the measurement of radiocarbon dent dates—in most cases only milligrams instead of
dates. For example, shell, in constant contact with tens of grams for scintillation counting. Both cyclo-
more recent atmospheric carbon, will generally yield tron and tandem accelerator mass spectrometers
young dates; conversely a shell artifact deposited in have been used to accomplish this, with tandem accel-
older limestone sediments will obtain a much older erators becoming the most popular. One additional
date than its actual death. These reservoir effects are advantage of acceleration is that the ‘stripping pro-
in part mitigated by the use of various calibration cess’ disassociates all molecular species with the result
algorithms, such as the CALIB (2001) and OxCal that carbon isotopes can be isolated, and contami-
(2001) programs, both available on-line. nation minimized. AMS ¹⁴C dating theoretically may
When a radiocarbon lab returns a date from a push the time frame back to 100,000, effectively
sample such as 5568 BP, it does not mean that it dates overlapping ⁴⁰Ar/³⁹Ar laser fusion dating (Taylor
to 3619 BC, because the true half-life of radiocarbon is and Aitken 1997).
5730 years, and, more importantly, the proportion of
radiocarbon in the atmosphere has varied through
time, as discussed above. So the calibration utilities are 5. Radiation Dating Methods
written to allow for differential in the absorption of As with most of the other types of dating methods,
¹⁴C by different materials (i.e., marine shell versus radiation dating methods have undergone tremendous
wood charcoal), and to allow for different atmospheric advances in the last decade, although many of these
effects. Using the CALIB 4.2 calibration, a radio- methods remain somewhat controversial in their
carbon assay of 5568 BP with a 1 standard deviation of applications. Electron spin resonance (ESR) and
55 years on wood charcoal yields a date of: thermoluminescence (both based on the accumulation
of trapped electrons in minerals), and fission track
68.3% (1σ): cal 4454–4416 BC, probability 0.409; dating (produced when alpha particles are created by
cal 4408–4354 BC, probability 0.591 spontaneous fission of ²³⁸U leaving a damage trail),
have all had their detractors, but have recently gained
recognition as techniques that are useful in dating
Note that there are two dates with ranges of a number materials not readily possible with the more accepted
of years. The range includes the one standard de- technology, for example glass, tooth enamel, and
viation, and the two dates are due to multiple pigments (see Table 1). Archaeomagnetic dating has
intercepts on the calibration curve. We can be 68.3 gained wide acceptance in many regions of the world
percent certain that the dates fall either from 4454 to and deserves some discussion.
4416 BC or from 4408 to 4354 BC. There are multiple
possible dates because the radiocarbon date of 5568
BP intercepts the calibration curve at more than one 5.1 Archaeomagnetic Dating
point. Due to the perturbations in the absorption of
radiocarbon over the millennia, the variance is some- Archaeomagnetic and paleomagnetic dating both rely
times full of ‘wiggles’ on the curve, so placement can on the phenomenon of the frequent and predictable
occur at two or more points. shifts in the Earth’s magnetic polarity in space and
Radiocarbon dates are reported in a standardized time. The premise for the method states that the
method, as shown in Table 2. The convention calls for Earth’s magnetic poles wander (show secular vari-
reporting the provenience of the sample, the lab- ation) or flip (reverse direction), and that these
oratory number, the radiocarbon age, and then the variations provide a temporal fingerprint that can be
calibrated (in this case dendrocalibrated) age at one or detected in rock and sediments. Iron dipoles in
two standard deviations (note that most of these dates minerals in soft sediments align themselves parallel
yielded multiple intercepts). The calibration method with the Earth’s magnetic field at the time of de-
used and any further contextual information should position until the sediment solidifies, for example as
also be supplied. baked clay in a hearth. This creates detrital remnant
magnetism whose direction and intensity precisely
reflects the Earth’s magnetic field at the time of
4.1.3 Accelerator mass spectrometry (AMS ) and deposition; and by matching their magnetic orien-
radiocarbon. From the inception of radiocarbon tation to a master record, it is possible to derive a
dating, "%C ages of samples were calculated by decay relatively accurate date. This master curve has been

1794
Chronology, Stratigraphy, and Dating Methods in Archaeology

derived for the last 10,000 years. Archaeomagnetism is useful for dating appropriate material when material suitable for other techniques is absent.

6. Chemical, Temperature, and Water Affected Dating Methods

Some of the more controversial dating methods are those that are based upon changes in chemicals, temperature, and/or water for calibration. Amino acid racemization held great promise for dating organic materials, but is not generally reliable, while obsidian hydration dating, which was also promising, has become equally problematic.

6.1 Obsidian Hydration Dating

Obsidian, a quenched rhyolite glass, is common along plate boundaries and volcanic arcs where crustal remelting has occurred. A glassy brittle rock with remarkable cutting properties, it has been used throughout human history in the production of stone tools (Shackley 1998). Obsidian is a noncrystalline glass, and therefore a disordered substance, and moves toward an ordered state by crystallization or, more precisely, perlitization (Friedman et al. 1997, Stevenson et al. 1998). This is accomplished by the absorption of water and a breaking of silicon and aluminum bonds, ultimately devitrifying the glass and forming perlite. As the process proceeds, an absorption front and hydration rim form that theoretically advance at a regular, linear rate through time. Measuring this rim in microns with a petrographic microscope theoretically yields an absolute date. On the surface, this would seem to be a remarkable method, resolving a number of dating issues in archaeological contexts in which organic datable material does not occur, but obsidian artifacts are abundant. More recently, the intervening negative effects of variable temperature and humidity through time have been found to influence the rate at which glass will hydrate. Due to this, a number of researchers have rejected the method in toto (see Morgenstein et al. 1999, Ridding 1996). Still, a number of researchers have continued to utilize the method, and it has a number of devoted followers, even in the face of discrepancies between radiocarbon dates and obsidian hydration dates in the same stratigraphic position. Stevenson and others have attempted to derive intrinsic hydration models based on the premise that each single obsidian nodule contains a different proportion of water than any other, and this will produce ages compatible with other dating methods (see Stevenson et al. 1998). Most do agree that obsidian hydration is a useful relative dating method which can be used to determine the extent of mixing in a stratigraphic column; given the model, one expects to see larger rim measurements at the bottom of a site than at the top. If this is not the case, then stratigraphic mixing is indicated. Obsidian hydration holds great promise, but most archaeologists are hesitant to include it in the arsenal of absolute dating methods.

See also: Geoarchaeology

Bibliography

CALIB Radiocarbon Calibration Software (2001) http://depts.washington.edu/qil/calib/
Carver M O H 1990 Digging for data: Archaeological approaches to data definition, acquisition and analysis. In: Francovich R, Manacorda D (eds.) Lo scavo archaeologica: della diagnosi all’edizione. Edizioni all’Insegna del Giglio, Firenze, Italy, pp. 45–120
Colcutt S N 1987 Archaeostratigraphy: A geoarchaeologist’s viewpoint. Stratigraphica Archaeologica 2: 11–18
Dean J S 1997 Dendrochronology. In: Taylor R E, Aitken M J (eds.) Chronometric Dating in Archaeology. Kluwer/Plenum Publishers, New York, pp. 31–64
Farrand W R 1984 Stratigraphic classification: Living within the law. Quarterly Review of Archaeology 5: 1–5
Freeman A K L 1997 Middle to Late Holocene stream dynamics of the Santa Cruz River, Tucson, Arizona: Implications for human settlement, the transition to agriculture and archaeological site preservation. Ph.D. thesis, University of Arizona, Tucson, AZ
Friedman I, Trembour F W, Hughes R E 1997 Obsidian hydration dating. In: Taylor R E, Aitken M J (eds.) Chronometric Dating in Archaeology. Kluwer/Plenum Publishers, New York, pp. 297–322
Göksu H Y, Oberhofer M, Regulla D 1991 Scientific Dating Methods. Kluwer Academic Publishers, Dordrecht, The Netherlands
Gregory D A, Adams J L (eds.) 1999 Excavations in the Santa Cruz River Floodplain: The Middle Archaic Component at Los Pozos. Center for Desert Archaeology, Tucson, AZ
Hard R J, Roney J R 1998 A massive terraced village complex in Chihuahua, Mexico, 3000 years before present. Science 279: 1661–4
Harris E C, Brown M R III, Brown G J (eds.) 1993 Practices of Archaeological Stratigraphy. Academic Press, London
Herz N, Garrison E G 1998 Geological Methods for Archaeology. Oxford University Press, Oxford, UK
Morgenstein M E, Wicket C L, Barkatt A 1999 Considerations of hydration-rind dating of glass artefacts: Alteration morphologies and experimental evidence of hydrogeochemical soil-zone pore water content. Journal of Archaeological Science 26: 1193–210
OxCal Radiocarbon Calibration Software (2001) http://units.ox.ac.uk/departments/rlaha/orau/06Ifrm.htm
Rapp G Jr., Hill C L 1998 Geoarchaeology: The Earth Science Approach to Archaeological Interpretation. Yale University Press, New Haven, CT
Ridding R 1996 Where in the world does obsidian hydration work? American Antiquity 61: 136–48
Shackley M S (ed.) 1998 Archaeological Obsidian Studies: Method and Theory. Kluwer Academic/Plenum Publishing, New York
Shackley M S, Huckell B B, Huckell L W 2000 Late Preceramic farmer/foragers at the foot of the Mogollon Rim: The McEuen Cave Archaeological Project Testing Report (AZ
W:13:6 ASM). Report prepared for the Bureau of Land Management, Safford Area, AZ, US Department of Interior
Stein J K 1987 Deposits for archaeologists. Advances in Archaeological Method and Theory 11: 337–95
Stevenson C M, Mazer J J, Scheetz B E 1998 Laboratory obsidian hydration rates: Theory, method, and application. In: Shackley M S (ed.) Archaeological Obsidian Studies: Method and Theory. Kluwer Academic/Plenum Press, New York, pp. 181–204
Stuiver M, Reimer P J 1993 Extended ¹⁴C data base and revised CALIB 3.0 ¹⁴C age calibration program. Radiocarbon 35: 215–30
Taylor R E 2000 Science-based dating methods in historic preservation. In: Williamson R A, Nickens P R (eds.) Science and Technology in Historic Preservation. Kluwer Academic/Plenum Press Publishers, New York, pp. 75–96
Taylor R E, Aitken M J (eds.) 1997 Chronometric Dating in Archaeology. Kluwer Academic/Plenum Publishers, New York
Trigger B G 1989 A History of Archaeological Thought. Cambridge University Press, Cambridge, UK
US Congress, Office of Technology Assessment 1986 Technologies for Prehistoric and Historic Preservation, OTA-E-319. US Government Printing Office, Washington, DC
Waters M R 1992 Principles of Geoarchaeology: A North American Perspective. University of Arizona Press, Tucson, AZ

S. Shackley

Church and State: Political Science Aspects

The umbrella term Church and State is conventionally used to refer to a range of topics within political science which concern the relationships between religious organizations, institutions, and authorities on the one hand and the polity on the other. Across the world religion has from the earliest times been centrally concerned with the shaping and authoritative allocation of values with which political science in its broadest signification is concerned. In the West churches in their proper sense represent only one type of religious collectivity, albeit historically the most important, whose relationship with the state has been the subject of intense and prolonged controversy; sects, denominations, and cults, each with their distinctive characteristics and drives, have also given rise in their orientation to political authorities, and vice versa, to difficult issues of recognition, regulation, and control. Within the other world religions the term church has no analogue, although the Buddhist sangha is often thought to share some features. In the other monotheistic religions, such as Islam and Judaism, and in Hinduism the location of religious authority and the degree of its independence from political authority suggest quite different patterns from those typical of church–state relations in the West. The establishment since 1948 through various international charters and conventions on human rights of foundational claims to religious freedom presents the challenge of institutionalizing ostensibly universal claims in this field within culturally diverse contexts.

1. Church and State in the Perspective of Political Development

In the developmentalist, or modernization, perspective traditional religiopolitical systems were typically understood to combine religious and political functions within a single structure of authority. In certain, particularly oriental, cultures this took the form of institutions of divine kingship or theocratic rule, while in Christian cultures, where the church was from early on conceived as a distinct institution, rulers usually claimed and often successfully asserted over against the church some species of sacral authority. From the time of Constantine, the emergent system of church establishment shifted from epoch to epoch between imperial authority over the church and Papal or Patriarchal authority over the emperors and other temporal authorities, a tension which was only partially resolved into contrasting patterns prevalent in different parts of the Christian world at the times of the East–West Schism in the eleventh century and the Reformation/Counter-Reformation in the sixteenth. In the high Middle Ages the Popes attempted to assert theocratic authority over the kings, princes, and emperors of Western Europe, while in the Orthodox East the Caesaropapist pattern of harmony between a dominant emperor and a subservient patriarch predominated. After the Reformation the authority of secular princes vis-à-vis the church was promoted both in the Protestant north and in the Roman Catholic south as the contest was prosecuted by and between the sponsors of the respective confessions.

Within Islam the institutional differentiation between religious and secular authority did not develop as Medina under Muhammad was considered to provide an unimprovable model for the life of the community of believers. The Prophet himself recognized the sovereignty of God alone; accordingly, it has not been open to his successors (the caliphs) or, more recently, the people and their representatives to claim sovereign authority in their own right. Within this restrictive view it has been the duty of all who exercise authority on behalf of the community faithfully to implement the rules encapsulated in the Qur’an and the traditions of the Prophet as interpreted by legal scholars, the jurisconsults. Similarly, within Judaism, the authority of divine law, as interpreted by rabbinical scholars, was held to be supreme within the diaspora. The recent reassertion of theocratic claims such as these has helped to undermine the plausibility of the developmentalist perspective with its implication of a unilinear secularization process.
2. Religion and the Rise of the Modern State in the West

The rise of the modern state with its characteristic claim to exclusive authority within a particular territory can be seen as part cause, part consequence of the Renaissance and Reformation crises in Western Europe which divided it between a Protestant north and a Roman Catholic south. It was the religious wars of the sixteenth and seventeenth centuries, brought to an end by the Peace of Westphalia of 1648, which institutionalized, over the protests of the Pope, the right of sovereign authorities to decide inter alia on the religious constitution of their territories. In northern Europe, typically, Protestant national monarchies reinforced their developing authority (and their resource base) by taking the church over and introducing more or less Erastian (i.e., unilateral secular) patterns of control; in the south the typical pattern which resulted was an alliance between royal absolutism and national Catholic hierarchies until the French Revolution led to a similar assault on the church’s property and independent authority. The subsequent decline of dynastic rule throughout Europe in the nineteenth century and the emergence of nationalist and liberal democratic mass politics saw a further weakening of church authority north and south, east and west.

Institutional secularization was furthermore accompanied by a decline in levels of orthodox religious belief and practice as urbanization and industrialization transformed social and economic structures, although revivalism of various sorts meant that the trend was not unilinear. In the USA the first amendment to the Constitution introduced for the first time a relatively thoroughgoing separation of church and state, which barred Congress from making any law ‘respecting an establishment of religion or prohibiting the free exercise thereof.’ A feature of the pattern which emerged in this context has been that, far from following the general secularising trend seen in Europe, a ‘churching of America’ occurred as the proportion of members and attenders in the population tended to increase. Few other political systems, even among the liberal democracies, have, however, followed the example of constitutionally mandated disestablishment and even where they have, as for example in the case of France in 1905, the principle of separation has not been upheld with the judicial rigor characteristic of the American case in recent decades.

3. Religious Cleavages and the Development of Party Systems

With the development of mass electoral politics in the nineteenth century the emergent party systems tended to reflect divergent systems of religious (in addition to social, economic, and other) cleavages; indeed, the relative prominence of religious or religion-related cleavages during the formative period up to the 1920s accounts for the survival, by inertia, of patterns of confessional politics which otherwise might well have disappeared. In northern Europe the existence of state–church systems had led over time to the emergence of movements of religious protest and dissent, which typically aligned themselves with other reforming elements in left or liberal movements. In the Counter-Reformation south on the other hand, where religious dissent had been more or less successfully repressed by the historic alliance of throne and altar, reforming and revolutionary movements tended to be militantly secularist, with the effect that the connection between religion and the political right was consolidated. A third pattern developed in the band of mixed-religion territories, which spanned Europe from Ireland to Transylvania, where the liberalization of political systems, when it occurred, ushered in patterns of confessional politics which, in nonmajoritarian settings, took the form of consociationalism.

Beyond Europe, developing party systems tended only marginally or incidentally to be affected by religious cleavages as such; religious connections were usually outgrowths of ethnic or community identities, whether on the part of immigrant populations as in the USA, South Africa, or Australia, or on the part of anticolonial movements committed to seizing independence on behalf of indigenous populations. In the latter case independence movements varied in terms of the religious structure of the populations affected, with the formerly united Indian subcontinent divided between Hindu, Islamic, and (in the case of Sri Lanka) Buddhist leaderships heading up parties which, on independence, became dominant in their respective systems. In more recent decades the often secular leaderships of the movements which successfully claimed independence have been challenged by movements of religious insurgency, driven in part by disappointment with the fruits of independence. Some authors have even identified in these developments a global revival of the influence of religion in politics.

4. The Political Resurgence of the Religious Factor

The late twentieth century has certainly seen a recrudescence of the religious factor in politics which would have surprised (and dismayed) the more naive modernization theorists. In 1979–80 the religious-led Iranian revolution, the prominence in the United States Presidential election of an emergent new Christian right, the overthrow of Nicaraguan dictator Somoza, and the emergence after the visit of Pope John Paul II to his home country Poland of the church-supported Solidarity, indicated that the phenomenon was not restricted to any one corner of the world. Each case could be seen as involving reactions to what were regarded by the ‘fundamentalist’ militants and activists involved as trends toward secularization or obstacles of entrenched corruption. Prominent among the issues involved were respect for religious authority, the maintenance of religion-related rules governing morality in all its guises (including in the West abortion, pornography, and contraception), and the re-introduction as normative for the political community of core religious values whether Christian, Islamic, Judaic, Buddhist, or Hindu. The liberal rule that the state should be neutral as between different religions and between the religious and nonreligious was most often regarded as either illusory or misguided: illusory, because the all-embracing nature of religious belief- and value-systems left no neutral space to be legitimately marked out, or misguided, because political authority in particular ought not to be exempted from the authoritative purview of religious values.

5. The Contemporary Challenge to State Religious Neutrality

In few, if any, cases has the ideal of state religious neutrality been realized, as the concept itself has progressively become problematized by significant voices claiming that the term has often been made to cover for the artificial and prejudicial exclusion from public debate of religious claims. Until the 1950s in the USA ‘mainline’ Protestant religion remained as a sort of informal establishment, recognized, albeit without the confessional label, in certain Supreme Court judgments, but since the early 1960s the separation rule has been applied to exclude religious symbols and activities from public institutions such as schools. In a number of cases of continuing church establishment, evenhandedness has been approached by the extension of the list of recognized religions and other ‘communities of belief’ (as in Belgium, where the state pays the salaries and pensions of religious officials of six recognized confessions) or the effective diminution of the privileges of establishment (as in the UK). Since the end of the Cold War official hostility to religion as such (amounting in the case of Albania to the attempted suppression of religion for over 20 years after 1967) has greatly diminished, although it has not disappeared; thus, China and North Korea, for example, continue to exercise heavy-handed control of religious bodies and activities. Within the world of Islam meantime there has been a significant growth in the number of states which are either organized as theocratic regimes dedicated to the implementation under clerical leadership of religious law, the shari’a (e.g. Iran and Sudan), or which harbor important movements that aim to introduce such a regime (e.g. Pakistan and Algeria).

In the West in particular, the increasing religious diversity of populations occasioned by the emergence of new religious movements, both independently and within existing traditions, and the growth of immigrant communities with distinct confessional profiles has given rise in recent decades to a range of new problems, or the re-emergence of old ones. Thus in connection with Scientology and Transcendental Meditation, questions of recognition have arisen in several countries: are they to be regarded as religions and so accorded attendant tax or other advantages, or not? More established religious traditions, in particular Islam, have called for their values to receive the protection of the law, so that what is judged to be blasphemy should be punishable through the civil courts. In the field of education, historically one of the most contentious in church–state relations, several issues continue to present themselves: should independent schools provided for the education of children of particular religious communities receive recognition, tax advantages, or direct state aid? What provision, if any, should public schools make for collective worship or religious instruction and how should any such arrangements take account of the new religious pluralism? Should public schools allow the display of distinctive religious symbols either on the walls, as in the case of crucifixes, traditional in Catholic communities, or on the pupils themselves, as in the case of Islamic girls’ headscarves? Many of these issues continue to agitate public debate episodically but none so much as the traditionally religion-related issues, such as abortion, where religious conservatives of the most varied religious persuasion continue to prosecute a bitter struggle against what is typically seen as an outgrowth of secular humanism.

See also: Civil Religion; Political Culture; Political Sociology; Reformation and Confessionalization; Religion and Politics: United States; Religion, History of; Religion: Mobilization and Power; Religion: Nationalism and Identity; Religion: Peace, War, and Violence; Religious Nationalism and the Secular State: Cultural Concerns

Bibliography

Boyle K, Sheen J (eds.) 1997 Freedom of Religion and Belief. A World Report. Routledge, London
Monsma S V, Soper J C 1997 The Challenge of Pluralism. Church and State in Five Democracies. Rowman & Littlefield, Lanham, MD
Robbers G (ed.) 1996 State and Church in the European Union. Nomos
Robbins T, Robertson R (eds.) 1987 Church–State Relations: Tensions and Transitions. Transaction Books, New Brunswick, NJ
Weber P 1990 Understanding the First Amendment Religion Clauses. Greenwood Press
Witte J 1999 Church and State: The American Constitutional Experiment. Westview Press, Greenwood, CT

J. T. S. Madeley

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7


Cingulate Cortex

The cingulate cortex is located in the medial wall of the cerebral hemispheres and has extensive reciprocal connections with various limbic structures as well as motor and premotor areas. In studies with rodents and studies of brain activation in humans, the cingulate cortex is implicated in the processes of selective attention and response selection. This article explores the essential areas of convergence of the two bodies of literature, and offers a common mnemonic/associative interpretation of cingulate cortical function.

1. Anatomy of the Cingulate Cortex

The cingulate cortex is a part of the limbic cortex, a term referring to the cortical areas that receive axonal fibers from neurons in the anterior group of thalamic nuclei. By modern convention, cingulate cortex constitutes Brodmann’s areas 24 and 29 in small animals such as rabbits and rats, and also an additional area, 23, in primates and humans. Brodmann’s areas 24 and 29 are often referred to, respectively, as anterior and posterior cingulate cortex. A cytoarchitectural map illustrating these areas is shown in Fig. 1.

Figure 1
Cytoarchitectural map of the limbic cortex in the rhesus monkey based on Brodmann’s divisions

Both the anterior and posterior cingulate cortices receive afferent fibers from the anterior medial (AM), midline, and intralaminar thalamic nuclei. However, the anterior cingulate cortex is also innervated by the medial dorsal (MD) and parafascicular thalamic nuclei, while the posterior cingulate cortex receives projections from the remaining members of the anterior group of thalamic nuclei: the anterior ventral (AV), anterior dorsal (AD), and lateral dorsal (LD) thalamic nuclei. In addition, neurons of the lateral dorsal and ventral anterior thalamic nuclei project to the posterior cingulate cortex.

Cingulate cortical neurons are richly innervated by pontine and midbrain fibers that fairly uniformly distribute norepinephrine and serotonin but restrictively distribute dopamine to the anterior cingulate cortex. Many additional afferent systems project to the cingulate cortex, including fibers from the visual cortex, hippocampus, subiculum, entorhinal cortex, and amygdala.

Cingulate cortical neurons send efferent fibers to most of the aforementioned thalamic areas, the subiculum, entorhinal cortex, pons, and many areas of the striatal motor system including the caudate nucleus, nucleus accumbens, and zona incerta. Cingulate cortical neurons have also been found to project to multiple areas of the motor and premotor cortex, suggesting that numerous parallel pathways exist whereby cingulate neurons can modulate motor output systems of the brain. In primates there exist direct reciprocal projections of cingulate neurons to lateral
prefrontal and parietal cortex, areas believed to be stimuli that predicted the occurrence of significant
critical for higher-order perceptual and mnemonic events.
functions (Goldman-Rakic 1988). Cingulate cortical neurons have also been shown to
exhibit salience compensation, a phenomenon that is
supportive of a cingulate cortical involvement in
mediating associative attention (Gabriel 1993, Gabriel
and Taylor 1998). When rabbits are trained with non-
2. The Role of the Cingulate Cortex in Attention salient cues such as a brief duration (200 ms) CSj and
CSk, greater brief-latency cingulate cortical dis-
criminative neuronal responses are observed than
2.1 Associatie Attention: Cingulate Cortical
when training is carried out with more enduring (e.g.,
Neuronal Actiity
500 ms) CSs. The enhanced neuronal encoding of the
Research in behavioral neuroscience with rabbits and brief stimuli, or salience compensation, has been
rats indicates that the cingulate cortex mediates interpreted as an attentional process that amplifies the
selective attention, or attention focused on particular neural representation of nonsalient yet associatively
stimuli. The stimuli that are selectively processed by significant stimuli in order to maximize the resources
cingulate cortical neurons are associatively significant available for processing those cues.
stimuli, i.e., stimuli that signal important events such The importance of the attentional processing in the
as reward or aversion, and call for action on the part cingulate cortex is demonstrated by studies showing
of the subject. Since the selective processing of the that bilateral, combined lesions of the anterior and
significant stimuli is a learned, associative process, an posterior cingulate cortices severely impair discrimina-
apt characterization of the cingulate cortex is that it tive avoidance and approach learning in rabbits
mediates associative attention to significant stimuli. (Gabriel 1993). In addition, adult rabbits exposed to
Support for the hypothesis that the cingulate cortex cocaine in utero, an exposure which induced mor-
mediates associative attention to significant stimuli is phological and biochemical abnormalities in the an-
shown by results of numerous studies on the activity of terior cingulate cortex, exhibited attenuated anterior
cingulate cortical neurons during Pavlovian and in- cingulate discriminative neuronal activity and learning
strumental conditioning in animals. For example, deficits when nonsalient CSs were used during dis-
extensive research has documented the responses of criminative avoidance learning and Pavlovian con-
cingulate cortical neurons during discriminative in- ditioning of eyeblink responses (Gabriel and Taylor
strumental learning in rabbits (Gabriel 1993, 2001). 1998, Romano and Harvey 1998). The results of these
In these studies, rabbits occupying a large rotating studies illustrate that deficits due to cingulate cortical
wheel apparatus learned to step in response to a tone damage emerge when a high demand is placed on
cue (CSj) to prevent a foot-shock delivered five attentional processing.
seconds later, and they learned to ignore a different
tone (CSk) not followed by shock. Neuronal activity
in multiple areas of the cingulate cortex exhibited the
development, during training, of massive discrimina-
2.2 Executie Attention: Anterior Cingulate Cortex
tive neuronal activity, defined as significantly greater
firing frequencies in response to the CSj than to the The application of neuroimaging and electrical record-
CSk (see Fig. 2). Discriminative activity also de- ing techniques, such as positron emission tomography
veloped during acquisition of a discriminative ap- (PET), functional magnetic resonance imaging
proach response, in which rabbits learned to make (fMRI), and high-density electroencephalography
oral contact with a drinking spout to obtain a water (EEG), has led to a large volume of data that implicate
reward following the CSj and to inhibit contact the cingulate cortex in cognitive processing. In con-
following the CSk, which did not predict a water vergence with results from animal studies, many
reward (Freeman et al. 1996). cognitive experiments in humans have supported the
Discriminative neuronal activity in the cingulate idea that the anterior cingulate cortex is involved in
cortex has also been reported during classical Pav- processes subserving attention. These studies show
lovian conditioning of heart rate and eyeblink respon- that the anterior cingulate cortex is engaged during
ses in rabbits (Powell et al. 1990). In addition, studies tasks in which routine or automatic processing is
in rats have demonstrated the occurrence of neuronal insufficient, as when novel or conflict-laden situations
responses in the anterior and posterior cingulate cortex are encountered. This type of attentional processing
that are specific to stimuli that predict reinforcement has been termed executive attention (Posner and
during appetitive conditioning (Segal and Olds 1972, DiGirolamo 1998). Situations likely to require exe-
Takenouchi et al. 1999). The activity of cingulate cutive attention are: a) planning and decision making;
cortical neurons in all these studies may be viewed as b) error detection; c) novel and early stages of learn-
a neuronal code for the associative significance of cues ing; d) difficult and threatening situations; and e)
since the neurons developed selective responses to overcoming habitual behavior.

Cingulate Cortex

Figure 2
Average anterior (Area 24) and posterior (Area 29, cellular layers IV & V) cingulate cortical integrated unit activity
elicited by CS+ and CS− in rabbits during pretraining, first exposure session, first significant behavioral
discrimination, and criterial behavioral discrimination in a discriminative avoidance task. The neuronal activity for
Area 24 is plotted in the form of standard scores normalized with respect to the pre-CS baseline for 40 consecutive
intervals following CS onset. Area 29 data are plotted starting 100 ms after tone onset
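The normalization mentioned in the caption (standard scores relative to the pre-CS baseline) can be illustrated with a minimal sketch. The function name `baseline_z_scores` and all numbers below are hypothetical and are not taken from the studies cited; the sketch assumes that each post-CS interval is expressed as its distance from the baseline mean in units of the baseline standard deviation, which is one common way to compute such scores.

```python
from statistics import mean, stdev


def baseline_z_scores(baseline, post_cs):
    """Express each post-CS interval as a standard score relative to the pre-CS baseline."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation of the baseline bins
    if sigma == 0:
        raise ValueError("baseline has no variability; standard scores are undefined")
    return [(x - mu) / sigma for x in post_cs]


# Hypothetical integrated unit activity (arbitrary units), for illustration only.
baseline = [10.0, 12.0, 11.0, 9.0, 13.0]
cs_plus = [11.0, 15.0, 19.0]   # intervals following the cue that predicts shock
cs_minus = [11.0, 12.0, 13.0]  # intervals following the cue that does not

z_plus = baseline_z_scores(baseline, cs_plus)
z_minus = baseline_z_scores(baseline, cs_minus)

# Discriminative activity in this sketch is simply a larger score for one cue than the other.
discriminative = [p - m for p, m in zip(z_plus, z_minus)]
```

Expressing activity this way makes recordings with different baseline firing levels comparable on a single scale.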

The role of the anterior cingulate cortex in executive to the stimulus word ‘car’). The subtracted control
attention is supported by numerous imaging studies condition for this task involves merely reading and
that have shown activation in the anterior cingulate pronouncing the words. It is argued that executive
cortex during tasks that engender conflict (Posner and attention (and thus anterior cingulate cortical ac-
DiGirolamo 1998). An example would be a task that tivation) is brought into play as a result of the conflict
requires the selection of a particular response from created by the multiple uses that are potentially
multiple competing responses. Most of these studies relevant to a given stimulus. The activation in the
employ a subtraction technique whereby the brain anterior cingulate cortex in this task declines as the
activation found in a neutral or control condition is subjects are repeatedly exposed to the same words and
subtracted from the activation produced by an ex- the generation of uses becomes more routine and less
perimental condition. For example, activation has dependent on executive control.
been found in the anterior cingulate cortex during the Additional evidence for the contribution of the
generate-uses task, which requires subjects to state anterior cingulate cortex to executive attention comes
uses commonly associated with visually or acoustically from the results of multiple experiments using the
presented words (e.g., generating the response ‘drive’ Stroop task, a task that requires subjects to name the


compared to the neutral condition. It has been
suggested that both the congruent and the incongruent
conditions involve conflict because subjects must
respond to the ink color while inhibiting a response to
the word’s meaning. Some studies have found more
activation in the incongruent condition than in the
congruent condition, in line with the expectation that
the incongruent condition creates more conflict and
thus recruits more executive attention in the anterior
cingulate cortex.
Studies employing high-density scalp recordings of
EEG have pointed to an involvement of the anterior
cingulate cortex in error detection, another aspect of
executive attention (Dehaene et al. 1994, Holroyd et
al. 1998, Gehring et al. 1993). These experiments have
demonstrated a marked electrical negativity at mid
frontal regions of the scalp. The negativity peaks
about 100 milliseconds after subjects make an in-
correct response, such as an erroneous key press in a
reaction time task. Brain electrical source analyses
(BESA) carried out independently by separate investi-
gators consistently localize the error-related negativity
(ERN) to the anterior cingulate cortex. These results
support the role of the anterior cingulate cortex in
executive attention invoked during error-related pro-
cessing (see Fig. 3).

3. Movement-related Processing: Response Selection by the Cingulate Cortex
Considering the ample connections of the cingulate
cortex with motor and premotor cortex as well as
areas of the striatal motor system, it is not surprising
Figure 3 that the cingulate cortex has been linked to processes
A 3-D and sagittal view of the human brain illustrating such as response selection. Several studies have
the source of error-related negativity (ERN) found documented the existence of a topographic organi-
after brain electric source analysis (BESA) of event- zation of the cingulate cortex with respect to parti-
related brain potentials. The bottom/right figure cular response modalities. For example, different
summarizes the results of several ERN studies that areas of the anterior cingulate cortex are active
have demonstrated that the source of the ERN is not depending on whether subjects perform in tasks
affected by response modality (subjects responding with involving oculomotor, manual, or spoken responses
their feet or hands) or error feedback modality (visual, (Paus et al. 1993).
auditory, somatosensory). Also shown is the ERN A case study of patient D.L., who sustained a
source for two reaction time experiments, one involving circumscribed right hemisphere lesion of the anterior
a decision of whether a number was ‘smaller cingulate cortex after surgery to remove a tumor, adds
than/larger than’ (RT Exp. 1) and another involving a further support for cingulate cortical involvement in
classification of words into semantic categories (RT movement-related processing (Turken and Swick
Exp. 2) 1999). Interestingly, D.L. exhibited entirely normal
performance in Stroop-like and divided attention tasks
when responses were orally reported; however, D.L.
ink color of visually presented words in a congruent showed a dramatic deficit in the same tasks when
condition (e.g., the word ‘red’ printed in red ink), an manual responses were required. These results were
incongruent condition (e.g., the word ‘green’ printed interpreted as favoring the idea that command signals
in red ink), or a neutral condition (e.g., the word ‘door’ are sent to motor output areas through the anterior
printed in red ink) (Posner and DiGirolamo 1998). cingulate cortex. The authors characterize the role of
The anterior cingulate cortex has been found active in the anterior cingulate cortex as confirming the ap-
both the incongruent and congruent conditions when propriateness of the selected response, thus to facilitate


correct responding while suppressing incorrect re- aspects of pain processing, such as evaluating the
sponding. immediate threat of the pain and its potential in-
Premotor neuronal activity in cingulate cortex has terference with daily activities. The anterior cingulate
been demonstrated in several studies of the single-unit cortex is thus positioned to integrate these two types of
correlates of learning and performance. In one such pain-related information in order to select appropriate
study, approximately half of all single neurons in coping responses such as escape or avoidance.
anterior and posterior cingulate cortex in rabbits
exhibited premotor firing ramps that consisted of
progressive increases in firing frequency preceding the
onset of the behavioral (locomotory) avoidance 5. Learning and Memory
responses (Kubota et al. 1996). Also, neuronal firing in
ventral portions of the anterior cingulate cortex was 5.1 Distinct Roles of the Anterior and Posterior
correlated with the onset of licking behavior during Cingulate Cortex
appetitive conditioning of rats (Takenouchi et al.
1999). Thus, cingulate cortical neurons become active Compelling evidence suggests an important role for
preceding the initiation of learned motor responses. the cingulate cortex in the mediation of learning and
The involvement of the anterior cingulate cortex in memory processes. Available data indicate that the
response selection does not negate the role of the contributions of the anterior and posterior cingulate
cingulate cortex in associative and executive attention. cortices are functionally distinct. A contribution of the
Appropriate response selection for a given situation anterior and posterior cingulate cortices to early and
can only occur if attention is devoted to the significant late stages of learning, respectively, has been docu-
associative stimuli and if conflict among competing mented in discriminative avoidance learning in rabbits
motor responses is resolved. as well as in conditioned visual discrimination in rats
(Gabriel 1993, Bussey et al. 1997). In rabbits, dis-
criminative neuronal activity (see Sect. 2.1) in the
anterior cingulate cortex develops after fewer training
4. Emotion and the Affective Dimension of Pain trials than in the posterior cingulate cortex. The
Traditionally, the cingulate cortex has been viewed as observations of neuronal activity coincide nicely with
part of a brain circuit that is involved in the experience restricted lesion studies showing that lesions confined
and expression of emotion (Papez 1937, MacLean to the anterior cingulate cortex result in a deficit of
1975). Although more recent evidence has suggested behavioral performance in the early stages of learning,
an important role for the cingulate cortex in processes whereas lesions confined to the posterior cingulate
such as attention and response selection, the role of the cortex result in a loss of performance at later stages of
cingulate cortex in emotion remains unchallenged. learning.
Activations of the anterior cingulate cortex, in
particular, have been found to accompany the ex-
perience of emotion in numerous neuroimaging
5.2 Context-specific Retrieval Patterns and Spatial
studies. For example, when cerebral blood flow (CBF)
Processing: Posterior Cingulate Cortex and the
was measured using PET while subjects viewed
Anterior Thalamus
emotional film clips and recalled emotional situations,
the anterior cingulate cortex was the only structure to Although the evidence clearly indicates an involve-
exhibit CBF changes correlated with subjects’ scores ment of the cingulate cortex in the learning-based
on the Levels of Emotional Awareness Scale (LEAS), coding of associatively significant stimuli, further
a test that measures the capacity to perceive and evidence indicates that this coding subserves the
differentiate complex emotions in oneself and others retrieval of learned, context-appropriate responses.
(Lane et al. 1998). The results suggested that individual Evidence in support of this idea is shown by the
differences in emotional awareness could be related to existence of unique topographical distributions of
the degree of activation in the anterior cingulate CS+-related neuronal activity in different cell layers
cortex. of the posterior cingulate cortex and in various
Evidence has also been presented in support of a anterior thalamic nuclei in rabbits, during discrimina-
role for the anterior cingulate cortex in mediating the tive instrumental learning. Some layers are activated
emotional response to pain in humans (Price 2000). maximally by the CS+ in the initial stages of training,
Pain is thought to involve two components, a sensory others in intermediate training stages, and others as
component and an affective component. The affective the rabbits attain asymptotic discriminative perform-
component reflects the unpleasantness associated with ance. The distribution of the activations changed not
pain and its long-term consequences. The anterior only across time (the stage in training) but also with
cingulate cortex receives direct input from spinal pain respect to the spatial context. For instance, the same
pathways and other input from areas (e.g., the set of cues elicited different patterns of activation
prefrontal cortex) that are involved in cognitive depending on whether the rabbits were engaged in a


moderately learned discriminative avoidance task or responses to predictive stimuli. Cingulate cortical
(in a separate training apparatus) a well-learned neurons in these animals code associatively significant
discriminative approach task. These context-specific stimuli and exhibit context-specific topographic pat-
patterns of CS+-elicited activity could be associated terns that could mediate cued retrieval of context-
with the learned responses that are appropriate to a appropriate learned behavior. These functions occur
given situation or context. Thus, when a given context as a result of intimate interactions of the hippocampal
specific pattern is elicited on cue presentation, the and cingulothalamic brain regions. Studies of cog-
learned response is retrieved. This mechanism could nitive neuroscientists concerning brain activation dur-
subserve pattern separation, i.e., the ability to defeat ing cognitive task performance in human subjects have
proactive and retroactive interference when multiple yielded results that are fundamentally in agreement
similar cues are associated with different memories or with the studies with rats and rabbits. For example,
responses, as when one tries to recall the names of there is clear agreement that the anterior cingulate
several recently met individuals. The retrieval hy- cortex subserves an attentional role as its neurons are
pothesis is consistent with the finding that the brief- recruited in situations of high cognitive conflict, e.g.,
latency cue-elicited context-specific patterns in the when stimuli acquire new meanings at the outset of
posterior cingulate cortex are followed by premotor learning, or when a decision among multiple-response
firing, i.e., firing which precedes the onset of the alternatives must be reached. The involvement in
learned behavioral response (see Sect. 2.3). behavioral learning and the associative and memory-
Additional evidence suggests that the context- bearing characteristics of cingulate cortical neuronal
specific patterns in the posterior cingulate cortex activity have also led behavioral neuroscientists to
depend on the integrity of hippocampal connections, speak of cingulate cortical attention as associative in
which may supply information concerning the op- character, i.e., a learned form of attention. Cognitive
erative spatial context to the cingulate cortex. Fornix neuroscientists have, on the other hand, discussed the
lesions which disconnect the hippocampal formation cingulate cortex as involved in attentional processes
from the anterior thalamus disrupt the training-stage- without reference to memory. Given the findings of
related patterns in the posterior cingulate cortex and behavioral neuroscience and the very close neuro-
these lesions impair concurrent performance in two anatomical association of the cingulate cortex with
different discriminative learning tasks that employ other structures (e.g., the hippocampus) that are
very similar cues (Smith et al. 2000). acknowledged components of the brain’s memory
Posterior cingulate cortical neurons also have func- system, it is very likely that early in the twenty-first
tional properties that are similar to those found in century there will be an even greater convergence of
neurons of the hippocampus and parietal cortex behavioral and cognitive neuroscience upon a com-
during spatial processing. For instance, rodent hippo- mon mnemonic interpretation of cingulate cortical
campal neurons are selectively active when subjects function.
occupy a particular location in space, while other
neurons code information about directional heading, See also: Attention, Neural Basis of; Emotion, Neural
independently of spatial location or ongoing behavior. Basis of; Learning and Memory, Neural Basis of;
Direction-coding neurons have also been documented Neural Representations of Direction (Head Direction
in the posterior cingulate cortex and related thalamic Cells); Pain, Neural Basis of
nuclei, and these neurons, together with those of the
hippocampal formation, are thought to contribute to Bibliography
the sense of direction and place in spatial learning
situations. Interestingly, in primates, hippocampal Bussey T J, Muir J L, Everitt B J, Robbins T W 1997 Triple
neurons are selectively active when the subject is dissociation of anterior cingulate, posterior cingulate, and
medial frontal cortices on visual discrimination tasks using a
viewing (rather than occupying) a particular space in
touchscreen testing procedure for the rat. Behavioral Neuro-
the environment. The posterior cingulate cortex of science 111(5): 920–36
primates contains neurons that discharge with eye Dehaene S, Posner M I, Tucker D M 1994 Localization of a
movement and eye position, a tuning property also neural system for error detection and compensation. Psycho-
found among neurons in the frontal and parietal logical Science 5: 303–5
cortical areas (Olson et al. 1996). The eye direction- Freeman J H, Cuppernell C, Flannery K, Gabriel M 1996
coding neurons in primate cingulate cortex are Limbic thalamic, cingulate cortical and hippocampal
hypothesized to participate in the spatial interpret- neuronal correlates of discriminative approach learning in
ation of retinal images. rabbits. Behavioural Brain Research 80: 123–36
Gabriel M 1993 Discriminative avoidance learning: A model
system. In: Vogt B A, Gabriel M (eds.) Neurobiology of
6. Concluding Comment Cingulate Cortex and Limbic Thalamus. Birkhauser, Toronto,
Canada
Studies carried out by behavioral neuroscientists using Gabriel M, Talk A 2001 A tale of two paradigms: Lessons
rats and rabbits as subjects have shown that the learned from parallel studies of discriminative instrumental
cingulate cortex is a critical substrate of learned learning and classical eyeblink conditioning. In: Steinmetz J,

Circadian Rhythms

Gluck M, Solomon P (eds.) Model Systems and the exposed to cocaine in utero: General and sex-specific effects.
Neurobiology of Associative Learning: A Festschrift in Honor of Behavioral Neuroscience 113: 62–77
Richard F. Thompson. Lawrence Erlbaum Associates, NJ Turken A U, Swick D 1999 Response selection in the human
Gabriel M, Taylor C 1998 Prenatal exposure to cocaine impairs anterior cingulate cortex. Nature Neuroscience 2(10): 920–4
neuronal encoding of attention and discriminative learning.
In: Harvey J A, Kosofsky B E (eds.) Cocaine: Effects on the L. Burhans, A. Talk, and M. Gabriel
Developing Brain. New York Academy of Sciences, New York
Gehring W J, Gross B, Coles M G H, Meyer D E, Donchin E
1993 A neural system for error detection and compensation.
Psychological Science 4: 385–90
Goldman-Rakic P S 1988 Topography of cognition: Parallel
distributed networks in primate association cortex. Annual
Review of Neuroscience 11: 137–56
Circadian Rhythms
Holroyd C B, Dien J, Coles M G 1998 Error-related scalp
potentials elicited by hand and foot movement: Evidence for This article reviews the nature of biological rhythms in
an output-independent error-processing system in humans. mammals, how they develop, and their eventual
Neuroscience Letters 242: 65–8 deterioration. It discusses the suprachiasmatic nuclei
Kubota Y, Wolske M, Poremba A, Gabriel M 1996 Stimulus- (SCN), the major pacemaker in the brain, and recent
related and movement-related single-unit activity in rabbit advances in understanding the molecular mechanisms
cingulate cortex and limbic thalamus during performance of
discriminative avoidance behavior. Brain Research 72: 22–38
of this unique timing system. It also discusses the
Lane R D, Reiman E M, Axelrod B, Yun L S, Holmes A, adaptive value of having an endogenous clock in the
Schwartz G E 1998 Neural correlates of levels of emotional brain and its implications for administering thera-
awareness: Evidence of an interaction between emotion and peutic agents.
attention in the anterior cingulate cortex. Journal of Cognitive
Neuroscience 10(4): 525–35
MacLean P D 1975 Sensory and perceptive factors in emotional
functions of the triune brain. In: Levi L (ed.) Emotions: Their 1. What Are Circadian Rhythms?
Parameters and Measurement. Raven Press, New York
Olson C R, Musil S Y, Goldberg M E 1996 Single neurons in
All our behavioral, physiological, and endocrino-
posterior cingulate cortex of behaving macaque: Eye move- logical functions are controlled by an endogenous
ment signals. Journal of Neurophysiology 76(5): 3285–300 clock that measures time in approximately 24-hour
Papez J W 1937 A proposed mechanism of emotion. Archives of intervals. The rhythms the clock generates are known
Neurology and Psychiatry 38: 725–43 as circadian rhythms (from the Latin, circa, about,
Paus T, Petrides M, Evans A C, Meyer E 1993 Role of the and dies, day). The ‘about’ is very important. Living
human anterior cingulate cortex in the control of oculomotor, systems do not merely respond to cyclic changes in
manual, and speech responses. Journal of Neurophysiology 70: their environment in some phase-locked fashion.
453–69 Leaves of plants that open during the day and close at
Posner M I, DiGirolamo G J 1998 Executive attention: Conflict,
night still open and close when the plants are kept in
target detection, and cognitive control. In: Parasuraman R
(ed.) The Attentive Brain. MIT Press, Cambridge, MA
constant darkness. Humans and other animals main-
Powell D A, Buchanan S L, Gibbs C M 1990 Role of the tain a period of about 24 hours in bodily functions
prefrontal-thalamic axis in classical conditioning. In: Uylings when they are in environments without time cues.
H B M, Van Eden C G, De Bruin J P C, Corner M A, Noise, temperature, and social stimuli can also syn-
Feenstra M G P (eds.) The Prefrontal Cortex: Its Structure, chronize, or entrain, the clock, but the strongest
Function and Pathology (vol. 85). Elsevier Science, Amsterdam entraining signal is the light/dark (LD) cycle.
Price D D 2000 Psychological and neural mechanisms of the
affective dimension of pain. Science 288(5472): 1769–72
Romano A G, Harvey J A 1998 Prenatal cocaine exposure:
Long-term deficits in learning and motor performance. In: 2. What Are Their Properties?
Harvey J A, Kosofsky B E (eds.) Cocaine: Effects on the
Developing Brain. New York Academy of Sciences, New York Figure 1 illustrates two important properties of cir-
Segal M, Olds J 1972 Behavior of units in hippocampal circuit of cadian rhythms in the wheel-running activity of a
the rat during learning. Journal of Neurophysiology 35(5): nocturnal mammal, the white-footed mouse (Pero-
680–90 myscus leucopus), that has been placed in a 24-hour
Smith D M, Patel J, Gabriel M 2000 Hippocampal-cingulo- LD photoperiod consisting of one hour of light and 23
thalamic interactions supporting concurrent discriminative hours of darkness. It takes the mouse about a week for
approach and avoidance learning in rabbits. Society for its activity to become entrained by the LD cycle, after
Neuroscience Abstracts 26: 198
Takenouchi K, Nishijo H, Uwano T, Tamura R, Takigawa M,
which it begins running immediately after the lights go
Ono T 1999 Emotional and behavioral correlates of the off. It runs for about 10 hours and then becomes
anterior cingulate cortex during associative learning in rats. inactive. After 60 days the animal is released into
Neuroscience 93(4): 1271–87 constant darkness. There its activity free-runs with a
Taylor C, Freeman J H, Holt W, Gabriel M 1999 Impairment of period of 23.6 hours, that is, it begins to run 24
cingulothalamic learning-related neuronal coding in rabbits minutes earlier each day.


Figure 1
Entrained and free-running rhythms of wheel-running
activity in Peromyscus leucopus. The animal uses a
running wheel during its nightly activity period which
lasts about 10 hours and is recorded as a heavy black
line. Successive days are plotted beneath each other.
On the 10th day of the experiment the animal had
locked on to (was entrained by) the daily light pulse
(one hour duration). On the 60th day, the light was
discontinued. The rhythm persisted with a circadian
period of 23.6 hours. On the 90th day, it was again
entrained by a 24-hour LD cycle in which the light
pulse lasted 18 hours. On the 140th day the light was
discontinued, and the rhythm again free ran, but with a
period of 23.0 hours (from Pittendrigh 1974)
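The free-running periods quoted in this caption translate directly into a daily drift of activity onset relative to the 24-hour day. The short sketch below works through that arithmetic; the function names are illustrative only, and it assumes a constant free-running period over the interval considered.

```python
def daily_onset_shift_minutes(free_running_period_h, day_length_h=24.0):
    """Minutes by which activity onset moves earlier each 24-hour day.

    A positive value means onset comes earlier each day (period shorter than the day);
    a negative value means it comes later.
    """
    return (day_length_h - free_running_period_h) * 60.0


def cumulative_advance_hours(free_running_period_h, days):
    """Total advance of activity onset, in hours, after the given number of days."""
    return daily_onset_shift_minutes(free_running_period_h) * days / 60.0


shift_first = daily_onset_shift_minutes(23.6)    # about 24 minutes earlier per day
shift_second = daily_onset_shift_minutes(23.0)   # about 60 minutes earlier per day
advance_30_days = cumulative_advance_hours(23.6, 30)  # about 12 hours in total
```

With a period of 23.6 hours, the onset of running therefore drifts by roughly half a day over 30 days in constant darkness, which is why successive activity bars slope leftward in plots of this kind.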

After two weeks in the dark, the mouse is put into a
24-hour LD cycle consisting of 18 hours of light and
six hours of darkness. It immediately becomes en-
trained to the new photoperiod and its activity is
tightly confined to the dark period. When it is released Figure 2
back into constant dark, two aspects of its activity are Entrained rhythms of colonic temperature, plasma
different from the previous stay in the same condition. levels of cortisol, urine volume, and plasma levels of
First, the amount of its activity is lessened. Second, the thyroid stimulating hormone, growth hormone,
period of its free-running rhythm is shorter—23 prolactin and parathyroid hormone, and activity in a
instead of 23.6 hours (Pittendrigh 1974). This indicates human monitored in a 16:8 LD cycle. W = waking; S
that the clock remembers the previous LD cycle and = sleeping (from Czeisler and Khalsa 2000)
this influences both the amplitude and period of the
free-running rhythm. A ‘memory’ of the preceding
daylength is even encoded in the electrophysiological is low during the day and shows a sharp peak at the
activity of an isolated slice of brain containing the beginning of the dark period, during sleep. Immune
SCN (Mrugala et al. 2000). system components also have circadian frequencies,
Rhythms have different phase relations to the LD and this can be very important in determining both
cycle. Figure 2 shows entrained rhythms in several responses to antigens and the timing of treatments
physiological parameters in a human subject in a 16:8 such as chemotherapy in cancer. Ideally, one would
LD cycle. Body temperature is high during the day and want to administer a therapeutic agent at a time when
falls at night, when the subject is asleep. Cortisol is it would produce the fewest unwanted side effects and
highest at the end of the dark period. Growth hormone yet be most effective in attacking the dividing cell.


3. Why Do We Need Circadian Rhythms? SCN is the main clock in the mammalian brain. Each
morning, light from the eye sends electrical signals to
There is a distinct advantage in timing various beha- the SCN and resets it. The SCN, in turn, synchronizes
viors and metabolic processes to the appropriate time the rest of the brain and sets the pace for all daily
of day. Organisms have to sleep, and they generally do activity patterns, just as a conductor synchronizes all
so either in the day or at night. One would want the instruments in an orchestra.
potential mates and food to be available when one is The analogy of the SCN to an orchestra conductor
active. But a simple hourglass mechanism, timing is inadequate, however, because the cells of the SCN
precise 24-hour periods, would be inadequate because do not act as a single multioscillator unit. Welsh et al.
of seasonal changes in day length. A circadian clock, (1995) demonstrated that SCN cells grown in culture
sensitive to external conditions, can be reset each day, oscillate at different rates: this means that each single
making the clock conform to the environment. Further- SCN neuron functions as an independent circadian
more, an internal clock allows organisms not only clock.
to respond to changes in the environment but also to
anticipate them. Lizards gain heat by basking in the
sun. When they are in their burrows, still with low
body temperatures, they crawl to the opening of the 5. Are Clocks Similar Throughout the Animal
burrow and stick their heads above the desert floor Kingdom?
before the sun comes up. This way they are in a
position to gain enough heat to become active as soon Clocks are being dissected rapidly through genetic
as possible. The early bird does catch the worm, and its analysis. Clock genes have been identified in cyano-
internal clock wakes it up before dawn. The body bacteria (blue–green algae), the fungus Neurospora,
temperature of humans falls in the evening, while we and the fruit fly Drosophila. In 1988, Ralph and
are still awake, and begins to rise in the early morning, Menaker (1988) identified a tau mutation in Syrian
while we are still asleep. Even in the same entrained hamsters. Hamsters homozygous for the mutation
environment, people have different phase relations had free-running activity rhythms of about 20 hours,
with the LD cycle; some are larks and some are compared to the wild type hamster’s rhythm of about
owls. 24 hours. In 1994, Takahashi’s laboratory discovered
the first mouse circadian mutant, Clock, which was
arrhythmic (Vitaterna et al. 1994). The Drosophila
clock is proving highly amenable to molecular analysis
4. What Controls Circadian Rhythms? and there are close connections to the mouse mutation.
Almost all known plants and animals exhibit circadian Many clock genes are highly conserved: some of the
rhythms. For instance, the microscopic single-celled same molecules are present in fruit flies and mice, and
aquatic plant, Gonyaulax polyedra, is phosphorescent, it is possible that common molecular clock mech-
lighting up at night, and dimming in the day. In 1958, anisms from bacteria to humans will be found
Hastings and Sweeney demonstrated that Gonyaulax in the not-too-distant future (see Young 2000, for a
showed peaks and troughs of luminescence even in review).
constant darkness, but the peak time of luminescence
shifted a little later each day. If the plant was exposed
to brief pulses of light, the peak of luminescence could 6. How Do Circadian Rhythms Develop and Age?
be shifted to almost any time of day, depending on
when the light pulse was given. Thus, light reset the Using 14C-labeled deoxyglucose to monitor metabolic
Gonyaulax clock. activity in the SCN, Reppert and Schwartz (1986)
In 1972, Moore and Lenn placed a radioactive label found that there was a distinct day–night oscillation of
into the eyes of rats and found that there was a direct metabolic activity in both rat and primate SCN. But
pathway from the retina to two tiny nuclei in the fetuses do not process light. They found in rats that the
hypothalamus. These nuclei lie behind the eyes right maternal circadian system coordinated the timing of
above the location in the brain where fibers from the the fetal rhythms. It takes about a month after birth
left and right retinas cross, the optic chiasm, and for strong circadian rhythms to develop. In most, but
therefore the nuclei are named the SCN. In that same not all individuals, the robustness of the rhythms
year, Stephan and Zucker showed that lesions of the declines with age. Thus, rats that normally sleep
SCN could permanently eliminate or weaken circadian mostly in the light, and eat and drink mainly in the
patterns of behavior (for a review, see Rusak and dark, distribute all these activities about equally in the
Zucker 1979). In a series of papers beginning in 1987, light and the dark when they get old. The mean
Lehman and his colleagues demonstrated that loco- amount of sleep, food eaten, and water drunk, etc.,
motor activity rhythmicity can be restored in hamsters stays the same: only the pattern changes. These
with SCN lesions by transplants of fetal SCN tissue. changes are correlated with aberrant SCN firing
There is now no doubt that, for most rhythms, the patterns in SCN slices taken from old rats. This


implies that aging could either disrupt coupling be-
tween SCN pacemaker cells or their output, or cause
deterioration of the pacemaking properties of the
individual cells (Satinoff et al. 1993). A better under-
standing of clock mechanisms should lead to more
efficacious clinical treatments for alleviating disorders
in the elderly, such as sleep disturbances, that are
associated with changes in the clock itself. But aside
from pathologies in aging, a more complete under-
standing of clocks and how they are entrained by light
will be useful in helping to ameliorate the discomfort
and disability that come from conditions as disparate
as jet lag, sleep disorders in shift workers and
blind people, and depression and manic-depressive
disorder.

See also: Childhood Health; Sleep and Health; Sleep
Disorders: Psychiatric Aspects; Sleep Disorders: Psy-
chological Aspects; Sleep: Neural Systems; Supra-
chiasmatic Nucleus

Bibliography

Czeisler C, Khalsa S 2000 The human circadian timing system
and sleep-wake regulation. In: Kryger M, Roth T, Dement W
(eds.) Principles and Practice of Sleep Medicine. W.B.
Saunders, Philadelphia, PA
Hastings J W, Sweeney B M 1958 A persistent diurnal rhythm of
luminescence in Gonyaulax polyedra. Biological Bulletin 115:
440–58
Lehman M, Silver R, Gladstone W, Kahn R, Gibson M,
Bittman E 1987 Circadian rhythmicity restored by neural
transplant. Immunocytochemical characterization of the graft
and its integration with the host brain. Journal of Neuroscience
7: 1626–38
Moore R Y, Lenn N J 1972 A retinohypothalamic projection in
the rat. Journal of Comparative Neurology 146: 1–14
Mrugala M, Zlomanczuk P, Jagota A, Schwartz W 2000
Rhythmic multiunit neural activity in slices of hamster
suprachiasmatic nucleus reflect prior photoperiod. American
Journal of Physiology-Regulatory Integrative and Comparative
Physiology 278: R987–94
Pittendrigh C 1974 Circadian oscillations in cells and the
circadian organization of multicellular systems. In: Schmitt F,
Worden F (eds.) The Neurosciences: Third Study Program.
MIT Press, Cambridge, MA
Ralph M, Menaker M 1988 A mutation of the circadian system
in golden hamsters. Science 241: 1225–7
Reppert S, Schwartz W 1986 Maternal suprachiasmatic nuclei
are necessary for maternal coordination of the developing
circadian system. Journal of Neuroscience 9: 2724–29
Rusak B, Zucker I 1979 Neural regulation of circadian rhythms.
Physiological Reviews 59: 449–526
Satinoff E, Li H, Liu C, McArthur A, Medanic M, Tcheng T,
Gillette M 1993 Do the suprachiasmatic nuclei oscillate in old
rats as they do in young ones? American Journal of Physiology-
Regulatory Integrative and Comparative Physiology 265:
R1216–22
Vitaterna M H, King D P, Chang A M, Kornhauser J M,
Lowrey P L, McDonald J D, Dove W F, Pinto L H, Turek
F W, Takahashi J S 1994 Science 264: 719–25
Welsh D, Logothetis D, Meister M, Reppert S 1995 Individual
neurons dissociated from rat suprachiasmatic nucleus express
independently phased circadian firing rhythms. Neuron 14:
697–706
Young M 2000 Circadian rhythms. Marking time for a kingdom.
Science 288: 451–3

E. Satinoff

Cities: Capital, Global, and World

1. The Global City: Introducing a Concept and
its History

Each phase in the long history of the world economy
raises specific questions about the particular con-
ditions that make it possible. One of the key properties
of the current phase is the ascendance of information
technologies and the associated increase in the mo-
bility and liquidity of capital. There have long been
cross-border economic processes: flows of capital,
labor, goods, raw materials. Over the last century,
these took place largely within the interstate system,
where the key articulators were national states and
colonial empires. The international economic system
was ensconced largely in this interstate system. This
has changed rather dramatically during the 1990s as a
result of privatization, deregulation, the opening up of
national economies to foreign firms, and the growing
participation of national economic actors in global
markets.
It is in this context that we see a rescaling of what are
the strategic territories that articulate the new system.
With the partial unbundling or at least weakening of
the national as a spatial unit due to privatization and
deregulation and the associated strengthening of
globalization, come conditions for the ascendance of
other spatial units or scales. Among these are the
subnational, notably cities and regions; cross-border
regions encompassing two or more subnational enti-
ties; and supranational entities, such as global elec-
tronic markets and free-trade blocs. The dynamics and
processes that get territorialized at these diverse scales
can in principle be regional, national, or global.
We can locate the emergence of global cities in this
context and against this range of instantiations of
strategic scales and spatial units. In the case of global
cities, the dynamics and processes that get terri-
torialized are global. This article examines first some
of the key theoretical and empirical elements of the
global city model, followed by a brief history of the
evolution of the literature on cities in the global
economy. Section three is a more in-depth discussion
of the organizing hypotheses of the global city model.


Sections four and five discuss two specific features: the
question of place in a global economy and the question
of city-to-city networks in domains other than the
economic.

2. Elements in a New Conceptual Architecture

The globalization of economic activity entails a new
type of organizational structure. To capture this
theoretically and empirically requires, correspond-
ingly, a new type of conceptual architecture. Con-
structs such as the global city and the global-city
region are, in my reading, important elements in this
new conceptual architecture. Arrighi’s (1994) analysis
is of interest here in that it posits the recurrence of
certain organizational patterns in different phases of
the capitalist world economy, but at gradually higher
orders of complexity and expanded scope, and timed
to follow or precede particular configurations of the
world economy.
There are today several closely linked terms used to
capture a particular intersection between global pro-
cesses and cities (see Stren 1996 and Savitch 1996 for
overviews). The most common is world cities, a term
attributed to Goethe and then relaunched in the work
of Peter Hall (1966) and more recently respecified by
John Friedmann (Friedmann and Goetz 1982). Other
related terms are ‘supervilles’ (Braudel 1984) and informa-
tional city (Castells 1989). Choosing how to name a
configuration has its own substantive rationality. The
decision to formulate the term ‘global city’ (Sassen
1984) stemmed out of a recognition of the specificity of
the current period. The term world city has precisely
the opposite attribute: it refers to a type of city which
we have seen over the centuries (Braudel 1984, Hall
1966, King 1990), and most probably also in much
earlier periods in Asia than in the West. In this regard
it could be said that most of today’s major global cities
are also world cities, but that there may well be some
global cities that are not world cities in the full, rich
sense of that term. This is partly an empirical question;
further, as the global economy expands and incor-
porates additional cities into various cross-border
networks, it is quite possible that the answer to that
particular question will vary over time. Thus the fact
that Miami has developed global city functions be-
ginning in the late 1980s does not make it a world city
in that older sense of the term (Nijman 1996). (See
generally Abu-Lughod 1999, Short and Kim 1999.)

3. The Elements of a New Theoretical
Framework

By the early 1980s a number of scholars had begun to
study cities in the context of globalization (Walton
1982, Ross and Trachte 1983, Sassen 1982, Rodriguez
and Feagin 1986). But it is one article in particular,
‘The World City Hypothesis’ by Friedmann and Goetz
(1982) on which attention centered. This article took a
variety of elements that were emerging in the research
literature on cities, on the global economy, on im-
migration, and a number of other subjects, and sought
to formalize these into several propositions about the
role of cities in the global economy. The key elements
in this framework were the emergence of several cities
as basing points for global capital, a hierarchy (albeit
a shifting one) of such cities, and the social and
political consequences for these cities of being such
basing points.
With the books by Castells (1989), King (1990), and
Sassen (1991/2001), what had been a hypothesis in the
early 1980s became a full-fledged theorization and
empirical specification. These three books add im-
portant and distinct propositions to the general frame-
work: Castells’ proposition that globalization as
constituted today has engendered a space of flows that
reconfigures economic and political power; King’s
enlargement of the frame of reference to show that the
highest levels of internationalization had taken place
in the cities of colonial empires rather than in the
center of the world economy; Sassen’s proposition
that it is not simply a matter of global coordination
but one of the production of global control capacities
and that an examination along these lines allows us to
understand the role of global cities as production
sites.
It is important to distinguish what is different about
this literature from a broader, earlier literature on
world cities prominently represented by the work of
Peter Hall already in the 1960s, and a new literature on
megacities especially focused on Latin America and
Asia. These literatures do not have the fact of
globalization and the centrality of cross-border net-
works connecting cities as crucial variables. The earlier
literature on world cities is closer to the notion of
capitals of empires: one city at the top of the power
hierarchy.
In the current literature on global cities the deter-
mining factor is a cross-border, global network of cities
that function as strategic sites for the management and
specialized servicing of global economic operations.
There is no such entity as a single global city, as there
is with the capital of an empire; by definition, the global
city is part of a network of cities. Similarly, an older
literature focused on past world cities, as in the work
of Braudel (1984), and earlier studies of major centers
of world commerce and banking, as well as more
recent work on urban hierarchies in the world system
(Chase-Dunn 1984), are to be differentiated from the
current literature if we historicize the world economy
and specify what is distinct today. Finally, we need to
distinguish between a narrowly specified literature on
global and world cities today and various literatures
that directly or indirectly contribute to our under-


standing of these cities, notably the research on
producer services.
By the mid-1990s the subject had clearly emerged as
a rather large field for research among scholars in
many different disciplines and countries. We can see
this in the variety of authors and themes in several
state-of-the-art collections, notably by Fainstein et al.
(1993), Knox and Taylor (1995), Noller et al. (1995),
Lo and Yeung (1996), and several others that
elaborate, critique, expand the empirical base, and
generally advance this theoretical and methodological
project. We can also see it in several new important
books that set the stage for highly focused research on
particular variables, notably Meyer (1991), Thrift and
Leyshon (1994), Keil (1993), and Eade (1996), among
others. We also see the creation of several book series
by various publishers in different countries: the series
on World Cities edited by Knox for Belhaven Press,
the series edited by Milton Santos and his colleagues in
Sao Paulo for Hucitec, the series edited by Martin
Wentz for Campus Verlag, are just some.
It is not only the growth of the research literature
but also the growth of a body of critical responses and
analyses that signals the strength and vigor of this field
of inquiry. There is only space here for the briefest
mention, a sort of guide to criticisms: Logan and
Swanstrom’s (1990) critique of the excessive weight
given to global structural processes in comparing
internal vs. external factors that shape a city’s econ-
omic development; Hammet’s (1996) critique of
Sassen’s proposition that globalization has contri-
buted to social and economic polarization in global
cities; Markusen and Gwiasda’s (1994) critique of the
notion that New York is at the top of the US urban
hierarchy and how a comparison with Washington
shows that the latter has a higher level of specialization
than New York in many advanced specialized services,
notably in legal services; critiques of the literature for
its neglect of grassroots transnationalism and the new
kinds of politics and identity formation that this
entails; Beauregard’s (1991) critique of the explan-
atory variables for changes in the built environment
and the real estate industry; Simon’s (1995) critique of
the neglect of the periphery, notably Africa; the debate
in Urban Affairs (March 1998) on the concept of the
global city and a similar one in Urban Studies (Summer
2001); the special issue on ‘Segregations Urbaines’ of
Societes Contemporaines in 1995; the special issue of
Urban Studies in 1996, and many others.
There are two types of scholarly literature that
intersect with this body of research on cities and the
global economy, and indeed often invoke or use it to
develop their arguments. They are on the one hand a
literature of anthropological and cultural studies on
transnationality, globalization, and identity formation
(Holston 1996, Low 1999). The other is the scholarship
by regional economic geographers on the global
economy, who have also focused on cities (e.g.,
Moulaert and Scott 1997, Gravestijn et al. 1999). In
the last few years there has been a new interest in
this subject by geographers (Veltz 1996, Scott 2001,
Storper 1997).
In terms of method, a number of strategies have
been developed. Even where there are data on inter-
city flows, it will take a lot of work to constitute the
requisite data sets. In this regard, an ambitious
initiative by the National Academy of Sciences of the
US examines how we can construct better data sets at
the scale of the city (NAS 2000).
Among the quantitative methodological and data
formation strategies are the efforts by Smith and
Timberlake (2002) and by Taylor et al. (2002). Smith
and Timberlake (2002) conceptualize urban areas as
central nodes in multiplex networks of economic,
social, demographic and information flows. They use
the methodological logic of network analysis, using
particularly two measures: one of these is structural or
relational equivalence between actors (i.e., cities) in a
network; the second measure is centrality. Both of
these measures relate to a number of propositions
developed in the literature on cities in the global
economy.
Taylor et al. (2002) have developed a new and
pioneering data set that makes it possible to map the
global networks of offices of the leading firms world-
wide in several specialized corporate services, such as
accounting, law, advertising, and finance. These net-
works of offices can be used to classify cities in terms of
their participation in cross-border networks. The data
can be analyzed using a variety of hypotheses and
statistical as well as other methods.
There are several other efforts, but space dictates
singling out just a few. David Meyer (1991) has
developed ways of analyzing international networks
through which a variety of exchanges of capital take
place. Castells (1989) and Sassen (1991/2001) have
developed several techniques of analysis which range
from methods to understand the place of cities in
global markets to expanding the representation of the
global. In The Informational City and The Global City
the authors sought to establish rather broadly what is
the array of data sets that can be brought into an
analysis of this subject, from international flows of
capital and information to very localized social effects.
This was an effort to resist the simplification in
mainstream accounts which emphasize the global
dispersal of activities and telecommunications and
exclude most social issues.
Techniques for data analysis traditionally used by
economic geographers can also be helpful. For in-
stance, Wheeler’s (1986) examination of the dispersion
of higher-order financial services throughout the US
urban hierarchy (which he found had proceeded at a
much slower rate than the dispersion of headquarters
of other large corporations) can also be used
for cross-border hierarchies. Wheeler found that
corporations tend to proceed up the urban hierarchy
for their advanced service and banking needs.
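The two network measures attributed above to Smith and Timberlake (2002), centrality and structural equivalence between cities, can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the five cities and their flow values are invented, and the two operationalizations (a city's share of total tie strength as its centrality, and Euclidean distance between tie profiles as a measure of structural equivalence) are common textbook choices in network analysis, not the procedures of the studies cited.

```python
from math import sqrt

# Hypothetical cities and undirected flow strengths between them (for example,
# counts of shared office links). Every name and number here is an invented
# illustration, not data from any study cited in this article.
CITIES = ["A", "B", "C", "D", "E"]
FLOWS = {
    ("A", "B"): 9, ("A", "C"): 7, ("A", "D"): 3, ("A", "E"): 3,
    ("B", "C"): 6, ("B", "D"): 2, ("B", "E"): 2,
    ("C", "D"): 1, ("C", "E"): 1,
}


def flow(a, b):
    """Return the undirected flow between two cities (0 if no tie is recorded)."""
    return FLOWS.get((a, b), FLOWS.get((b, a), 0))


def centrality(city):
    """Share of all tie strength in the network that involves this city."""
    own = sum(flow(city, other) for other in CITIES if other != city)
    return own / (2 * sum(FLOWS.values()))  # each tie has two endpoints


def structural_distance(a, b):
    """Euclidean distance between two cities' ties to every third city.

    A distance of 0 means the two cities are tied to all other cities in
    exactly the same way, i.e. they are structurally equivalent.
    """
    others = [c for c in CITIES if c not in (a, b)]
    return sqrt(sum((flow(a, c) - flow(b, c)) ** 2 for c in others))


if __name__ == "__main__":
    # List cities from most to least central, then compare two pairs.
    for city in sorted(CITIES, key=centrality, reverse=True):
        print(city, round(centrality(city), 3))
    print("distance D-E:", structural_distance("D", "E"))
    print("distance A-D:", round(structural_distance("A", "D"), 3))
```

In this invented network, city A holds the largest share of tie strength, while D and E relate to every other city identically and so occupy the same structural position even though they have no recorded tie to each other. A real analysis would start from an empirical inter-city matrix of the kind discussed above.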


Elliott (1999) developed a test for the socioeconomic
polarization hypothesis in global cities.

4. The Global City Model: Organizing
Hypotheses

There are seven core hypotheses that organize the data
and the theorization of the global city model. There
follows a brief discussion of each as a way of producing
a more precise representation of the model. (See Sassen
1991/2001.)
First, the geographic dispersal of economic activities
that marks globalization, along with the simultaneous
integration of such geographically dispersed activities,
is a key factor feeding the growth and importance of
central corporate functions. The more dispersed a
firm’s operations across different countries, the more
complex and strategic its central functions, that is,
the work of managing, coordinating, servicing, financ-
ing a firm’s network of operations.
Second, these central functions become so complex
that increasingly the headquarters of large global firms
outsource them: they buy a share of their central
functions from highly specialized service firms: ac-
counting, legal, public relations, programming, tele-
communications, and other such services. Thus while
even ten years ago the key site for the production of
these central headquarter functions was the head-
quarters of a firm, today there is a second key site: the
specialized service firms contracted by headquarters to
produce some of these central functions or com-
ponents of them. This is especially the case with firms
involved in global markets and non-routine opera-
tions. But increasingly the headquarters of all large
firms are buying more of such inputs rather than
producing them in-house.
Third, those specialized service firms engaged in the
most complex and globalized markets are subject to
agglomeration economies. The complexity of the
services they need to produce, the uncertainty of the
markets they are involved with either directly or
through the headquarters for which they are producing
the services, and the growing importance of speed in
all these transactions, is a mix of conditions that
constitutes a new agglomeration dynamic. The mix of
firms, talents, expertise from a broad range of spe-
cialized fields makes a certain type of urban en-
vironment function as an information center. Being in
a city becomes synonymous with being in an extremely
intense and dense information loop. This is a type of
information loop that as of now still cannot be
replicated fully in electronic space, and has as one of
its value-added features the fact of unforeseen and
unplanned mixes of information, expertise, and talent;
these can produce a higher order of information. This
does not hold for routinized activities which are not as
subject to uncertainty and non-standardized forms of
complexity. Global cities are, in this regard, pro-
duction sites for the leading information industries of
our time.
A fourth hypothesis, derived from the preceding
one, is that the more headquarters outsource their
most complex, unstandardized functions, particularly
those subject to uncertain and changing markets and
to speed, the freer they are to opt for any location
because the more the work actually done in the
headquarters is not subject to agglomeration econ-
omies. This further underlines that the key sector
specifying the distinctive production advantages of
global cities is the highly specialized and networked
services sector. In developing this hypothesis I was
responding to a very common notion that the number
of headquarters is what specifies a global city. Empiri-
cally it may still be the case in many countries that the
leading business center is also the leading concen-
tration of headquarters, but this may well be because
there is an absence of alternative locational options.
But in countries with a well developed infrastructure
outside the leading business center, there are likely to
be multiple locational options for such headquarters.
Fifth, these specialized service firms need to provide
a global service which has meant a global network of
affiliates or some other form of partnership, and as a
result we have seen a strengthening of cross-border
city-to-city transactions and networks. At the limit
this may well be the beginning of the formation of
transnational urban systems. The growth of global
markets for finance and specialized services, the need
for transnational servicing networks due to sharp
increases in international investment, the reduced role
of the government in the regulation of international
economic activity, and the corresponding ascendance
of other institutional arenas, notably global markets
and corporate headquarters: all these point to the
existence of a series of transnational networks of cities.
One implication of this, and a related hypothesis for
research, is that the economic fortunes of these cities
become increasingly disconnected from their broader
hinterlands or even their national economies. We can
see here the formation, at least incipient, of trans-
national urban systems. To a large extent it seems to
me that the major business centers in the world today
draw their importance from these transnational net-
works.
A sixth hypothesis is that the growing numbers of
high level professionals and high-profit making spe-
cialized service firms have the effect of raising the degree
of spatial and socioeconomic inequality evident in
these cities. The strategic role of these specialized
services as inputs raises the value of top level profes-
sionals and their numbers. Further, the fact that talent
can matter enormously for the quality of these
strategic outputs and, given the importance of speed,
proven talent is an added value, the structure of
rewards is likely to experience rapid increases. Types


of activities and of workers lacking these attributes,
notably in manufacturing and industrial services, are
likely to get caught in the opposite cycle.
A seventh hypothesis is that one result of the
dynamics described in hypothesis six is the growing
informalization of a range of economic activities which
find their effective demand in these cities yet have
profit rates that do not allow them to compete for
various resources with the high-profit making firms at
the top of the system. Informalizing part or all
production and distribution activities, including of
services, is one way of surviving under these condi-
tions.
The first four hypotheses qualify what has emerged
as a dominant discourse on globalization, technology,
and cities, which posits the end of cities as important
economic units or scales. There is a tendency in that
account to take the existence of a global economic
system as a given, a function of the power of
transnational corporations and global communica-
tions. According to the global city model, the capa-
bilities for global operation, coordination, and control
contained in the new information technologies and in
the power of transnational corporations need to be
produced. By focusing on the production of these
capabilities we add a neglected dimension to the
familiar issue of the power of large corporations and
the capacity of the new technologies to neutralize
distance and place. A focus on the production of these
capabilities shifts the emphasis to the practices that
constitute what we call economic globalization and
global control.
A focus on practices draws the categories of place
and work process into the analysis of economic
globalization. These are two categories easily over-
looked in accounts centered on the hypermobility of
capital and the power of transnationals. Developing
categories such as place and work process does not
negate the centrality of hypermobility and power.
Rather, it brings to the fore the fact that many of the
resources necessary for global economic activities are
not hypermobile and are, indeed, deeply embedded in
place, notably places such as global cities, global-city
regions, and export processing zones.
This entails a whole infrastructure of activities,
firms, and jobs necessary to run the advanced cor-
porate economy. These industries are typically con-
ceptualized in terms of the hypermobility of their
outputs and the high levels of expertise of their
professionals rather than in terms of the production or
work process involved and the requisite infrastructure
of facilities and non-expert jobs that are also part of
these industries. This in turn brings with it an emphasis
on economic and spatial polarization because of the
disproportionate concentration of very high and very
low income jobs in the city compared with what would
be the case at a larger scale such as the region or the
country. A focus on regions, in contrast, will lead to an
emphasis on broad urbanization patterns, a more
encompassing economic base, more middle sectors of
both households and firms. Emphasizing place, infra-
structure, and non-expert jobs matters precisely be-
cause so much of the focus has been on the neutrali-
zation of geography and place made possible by the
new technologies.
Dealing with place brings with it the problem of
boundaries. These are of at least two sorts: the
boundary of the territorial scale as such and the
boundary of the spread of globalization in the organi-
zational structure of industries, institutional orders,
places, and so on. In the case of the global city it is
possible to opt for an analytic strategy that emphasizes
core dynamics rather than the unit of the city as a
container, the latter being one that requires territorial
boundary specification. Emphasizing core dynamics
and their spatialization (in both actual and digital
space) does not completely solve the boundary prob-
lem, but it does allow for a fairly clear trade-off
between emphasizing the core or center of these
dynamics and their spread institutionally and spa-
tially.
Finally, the detailed examination of three particular
cities (Sassen 1991/2001) brought to the fore the
extent to which these cities collaborate through their
very specific advantages rather than simply competing
with each other. In focusing on global finance in the
1980s and 1990s it becomes clear that the growth of the
major centers was partly derived from the growing
network of financial centers. In looking at the broader
network it also becomes clear to what extent it was and
remains characterized by a pronounced hierarchy
among the growing number of centers that make up
the network.
The growth of networked cross-border dynamics
among global cities includes a broad range of do-
mains: political, cultural, social, criminal. There are
cross-border transactions among immigrant com-
munities and communities of origin and a greater
intensity in the use of these networks once they become
established, including for economic activities that had
been unlikely until now. We also see greater cross-
border networks for cultural purposes, as in the
growth of international markets for art and a trans-
national class of art curators; and for non-formal
political purposes, as in the growth of transnational
networks of activists around environmental causes,
human rights, and so on. These are largely city-to-city
cross-border networks, or, at least, it appears at this
time to be simpler to capture the existence and
modalities of these networks at the city level. The same
can be said for the new cross-border criminal networks.
In brief, recapturing the geography of places re-
presented by the network of global cities allows us to
recapture people, workers, communities, and more
specifically, the many different work cultures, besides
the corporate culture, involved in the work of globali-
zation. It also brings with it an enormous research
agenda, one that goes beyond the by now familiar


focus on cross-border flows of goods, capital and the central city in different parts of the world, notably
information. the United States and Western Europe (Veltz 1996,
Further, by emphasizing the fact that global pro- Kunzmann 1994).
cesses are at least partly embedded in national terri- In the United States, major cities such as New York
tories, such a focus introduces new variables in current and Chicago have large centers that have been rebuilt
conceptions about economic globalization and the many times, given the brutal neglect suffered by much
shrinking regulatory role of the state. That is to say, urban infrastructure and the imposed obsolescence so
the space economy for major new transnational characteristic of US cities. This neglect and accelerated
economic processes diverges in significant ways from obsolescence produce vast spaces for rebuilding the
the duality global/national presupposed in much center according to the requirements of whatever
analysis of the global economy. The duality national regime of urban accumulation or pattern of spatial
vs. global suggests two mutually exclusive spaces— organization of the urban economy prevails at a given
where one begins the other ends. One of the outcomes time. In Europe, urban centers are far more protected
of a global city analysis is that it makes evident that the and they rarely contain significant stretches of aban-
global materializes by necessity in specific places and doned space; the expansion of workplaces and the
institutional arrangements a good number of which, if need for intelligent buildings necessarily will have to
not most, are located in national territories. take place partly outside the old centers. One of the
The two final sections examine two particular most extreme cases is the complex of La Defense, the
aspects that illustrate some of these issues concerning massive, state-of-the-art office complex developed right
place in a global economy and city-to-city networks in outside Paris to avoid harming the built environment
realms other than the economic. inside the city. This is an explicit instance of govern-
ment policy and planning aimed at addressing the
growing demand for central office space of prime
quality. Yet another variant of this expansion of the
‘center’ onto hitherto peripheral land can be seen in
5. New Forms of Centrality London’s Docklands. Similar projects for recentraliz-
ing peripheral areas were launched in several major
Several of the organizing hypotheses in the global city cities in Europe, North America, and Japan during
model concern the conditions for the continuity of the 1980s. (See Marcuse and van Kempen 2000.)
centrality in advanced economic systems in the face of Second, the center can extend into a metropolitan
major new organizational forms and technologies that area in the form of a grid of nodes of intense business
maximize the possibility for geographic dispersal. activity. One might ask whether a spatial organization
Historically, centrality has largely been embedded in characterized by dense strategic nodes spread over a
the central city. Have the new technologies and organi- broader region does in fact constitute a new form of
zational forms altered the spatial correlates of cen- organizing the territory of the ‘center,’ rather than, as
trality? in the more conventional view, an instance of subur-
Today there is no longer a simple straightforward banization or geographic dispersal. Insofar as these
relation between centrality and such geographic enti- various nodes are articulated through digital net-
ties as the downtown, or the central business district. works, they represent a new geographic correlate of
In the past, and up to quite recently in fact, the center the most advanced type of ‘center.’ This is a partly
was synonymous with the downtown or the CBD. The deterritorialized space of centrality. Indeed much of
spatial correlate of the center can assume several the actual geographic territory within which these
geographic forms. It can be the CBD, as it still is nodes exist falls outside the new grid of digital
largely in New York City, or it can extend into a networks, and is in that sense partly peripheralized.
metropolitan area in the form of a grid of nodes of This regional grid of nodes represents a reconsti-
intense business activity, as we see for instance in tution of the concept of region. Far from neutralizing
Frankfurt and Zurich. The center has been profoundly geography the regional grid is likely to be embedded in
altered by telecommunications and the growth of a conventional forms of communication infrastructure,
global economy, both inextricably linked; they have notably rapid rail and highways connecting to air-
contributed to a new geography of centrality (and ports. Ironically perhaps, conventional infrastructure
marginality). Simplifying we can identify four forms is likely to maximize the economic benefits derived
assumed by centrality today. from telematics. This is an important issue that has
First, while centrality can assume multiple spatial been lost somewhat in discussions about the neutrali-
correlates, the CBD in major international business zation of geography through telecommunications.
centers remains a strategic site for the leading in- Third, we are seeing the formation of a transter-
dustries. But it is one profoundly reconfigured by ritorial ‘center’ constituted, partly in digital space, via
technological and economic change (Graham and intense economic transactions in the network of global
Marvin 1996). Further, there are often sharp differ- cities. These networks of major international business
ences in the patterns assumed by this reconfiguring of centers constitute new geographies of centrality. The


most powerful of these new geographies of centrality at the global level binds the major international financial and business centers: New York, London, Tokyo, Paris, Frankfurt, Zurich, Amsterdam, Los Angeles, Sydney, Hong Kong, among others. But this geography now also includes cities such as Bangkok, Seoul, Taipei, Sao Paulo, Mexico City. The intensity of transactions among these cities, particularly through the financial markets, trade in services, and investment, has increased sharply, and so have the orders of magnitude involved. At the same time, there has been a sharpening inequality in the concentration of strategic resources and activities between each of these cities and others in the same country, a condition that further underlines the extent to which this is a cross-border space of centrality.

The pronounced orientation to the world markets evident in such cities raises questions about the articulation with their nation-states, their regions, and the larger economic and social structure in such cities. Cities have typically been deeply embedded in the economies of their region, indeed often reflecting the characteristics of the latter; and they still do. But cities that are strategic sites in the global economy tend, in part, to disconnect from their region. This conflicts with a key proposition in traditional scholarship about urban systems, namely, that these systems promote the territorial integration of regional and national economies.

In the case of a complex landscape such as Europe’s we see several geographies of centrality, one global, others continental and regional. A central urban hierarchy connects major cities, many of which in turn play central roles in the wider global system of cities: Paris, London, Frankfurt, Amsterdam, Zurich. These cities are also part of a wider network of European financial/cultural/service capitals, some with only one, others with several of these functions, which articulate the European region and are somewhat less oriented to the global economy than Paris, Frankfurt, or London. And then there are several geographies of marginality: the East–West divide and the North–South divide across Europe as well as newer divisions. In Eastern Europe, certain cities and regions, notably Budapest, are rather attractive for purposes of investment, both European and non-European, while others will increasingly fall behind, notably in Rumania, Yugoslavia, and Albania. We see a similar differentiation in the south of Europe: Madrid, Barcelona, and Milan are gaining in the new European hierarchy; Naples, Rome, and Marseilles are not quite.

Fourth, new forms of centrality are being constituted in electronically generated spaces. For instance, strategic components of the financial industry operate in such spaces. The relation between digital and actual space is complex and varies among different types of economic sectors. But it is increasingly becoming evident that the highly complex configurations for economic activity located in digital space contain points of coordination and centralization.

6. The Global City as a Nexus for New Politico-cultural Alignments

The incorporation of cities into a new cross-border geography of centrality also signals the emergence of a parallel political geography. Major cities have emerged as strategic sites not only for global capital, but also for the transnationalization of labor and the formation of translocal communities and identities (Smith 1997). In this regard cities are sites for new types of political operations and for a whole range of new ‘cultural’ and subjective operations (Watson and Bridges 1999, Allen et al. 1999). The centrality of place in a context of global processes makes possible a transnational economic and political opening for the formation of new claims and hence for the constitution of entitlements, notably rights to place. At the limit, this could be an opening for new forms of ‘citizenship’ (Isin 2000, Holston 1996).

The emphasis on the transnational and hypermobile character of capital has contributed to a sense of powerlessness among local actors, a sense of the futility of resistance. But an analysis that emphasizes place suggests that the new global grid of strategic sites is a terrain for politics and engagement. The loss of power at the national level produces the possibility for new forms of power and politics at the subnational level. Further, insofar as the national as container of social process and power is cracked (Brenner 1998, Taylor 1995) it opens up possibilities for a geography of politics that links subnational spaces across borders. Cities are foremost in this new geography. One question this engenders is how and whether we are seeing the formation of a new type of transnational politics that localizes in these cities.

Immigration, for instance, is one major process through which a new transnational political economy and translocal household strategies are being constituted (Portes 1997, Skeldon 1997). It is one largely embedded in major cities insofar as most immigrants, certainly in the developed world, whether in the US, Japan, or Western Europe, are concentrated in major cities. It is, in many regards, one of the constitutive processes of globalization today, even though not recognized or represented as such in mainstream accounts of the global economy. This configuration contains unifying capacities across national boundaries and sharpening conflicts within cities. Global capital and the new immigrant workforce are two major instances of transnationalized actors that have unifying properties across borders, and thus internally to each, and find themselves in contestation with each other inside cities. Researching and theorizing these issues will require approaches that diverge from the more traditional studies of political elites, local party


politics, neighborhood associations, immigrant communities, and so on through which the political landscape of cities and metropolitan regions has been conceptualized in urban studies.

One way of thinking about the political implications of this strategic transnational space anchored in global cities is in terms of the formation of new claims on that space. The city has indeed emerged as a site for new claims: by global capital which uses the city as an ‘organizational commodity,’ but also by disadvantaged sectors of the urban population, frequently as internationalized a presence in large cities as capital. The ‘de-nationalizing’ of urban space and the formation of new claims by transnational actors raise the question ‘Whose city is it?’

This is a space that is both place-centered in that it is embedded in particular and strategic locations; and it is transterritorial because it connects sites that are not geographically proximate yet are intensely connected to each other. If we consider that large cities concentrate both the leading sectors of global capital and a growing share of disadvantaged populations (immigrants, many of the disadvantaged women, people of color generally, and, in the megacities of developing countries, masses of shanty dwellers), then we can see that cities have become a strategic terrain for a whole series of conflicts and contradictions (Allen et al. 1999, Tardanico and Lungo 1995). We can then think of cities also as one of the sites for the contradictions of the globalization of capital, even though the city cannot be reduced to this dynamic.

7. Conclusion

An examination of globalization through the concept of the global city introduces a strong emphasis on strategic components of the global economy rather than the broader and more diffuse homogenizing dynamics we associate with the globalization of consumer markets. As a consequence, this also brings an emphasis on questions of power and inequality. And it brings an emphasis on the actual work of managing, servicing, and financing a global economy.

Second, a focus on the city in studying globalization will tend to bring to the fore the growing inequalities between highly provisioned and profoundly disadvantaged sectors and spaces of the city, and hence such a focus introduces yet another formulation of questions of power and inequality.

Third, the concept of the global city brings a strong emphasis on the networked economy because of the nature of the industries that tend to be located there: finance and specialized services, the new multimedia sectors, and telecommunications services. These industries are characterized by cross-border networks and specialized divisions of functions among cities rather than international competition per se. In the case of global finance and the leading specialized services catering to global firms and markets (law, accounting, credit rating, telecommunications), it is clear that we are dealing with a cross-border system, one that is embedded in a series of cities, each possibly part of a different country. It is a de facto global system.

Fifth, a focus on networked cross-border dynamics among global cities also allows us to capture more readily the growing intensity of such transactions in other domains: political, cultural, social, criminal.

Global cities around the world are the terrain where a multiplicity of globalization processes assume concrete, localized forms. These localized forms are, in good part, what globalization is about. Recovering place means recovering the multiplicity of presences in this landscape. The large city of today has emerged as a strategic site for a whole range of new types of operations: political, economic, ‘cultural,’ subjective. It is one of the nexus where the formation of new claims, by both the powerful and the disadvantaged, materializes and assumes concrete forms.

See also: Cultural Studies: Cultural Concerns; Culture, Sociology of; Globalization and World Culture; Globalization, Anthropology of; Globalization: Geographical Aspects; Globalization, Subsuming Pluralism, Transnational Organizations, Diaspora, and Postmodernity; Information Society; Information Society, Geography of; Information Technology; International Communication: History; Internet: Psychological Perspectives; Science and Technology: Internationalization

Bibliography

Abu-Lughod J L 1999 New York, Los Angeles, Chicago: America’s Global Cities. University of Minnesota Press, MN
Allen J, Massey D, Pryke M (eds.) 1999 Unsettling Cities. Routledge, London
Arrighi G 1994 The Long Twentieth Century. Money, Power, and the Origins of Our Times. Verso, London
Beauregard R 1991 Capital restructuring and the new built environment of global cities: New York and Los Angeles. International Journal of Urban and Regional Research 15(1): 90–105
Braudel F 1984 The Perspective of The World. Collins, London, Vol. 3
Brenner N 1998 Global cities, global states: Global city formation and state territorial restructuring in contemporary Europe. Review of International Political Economy 5(1): 1–37
Castells M 1989 The Informational City. Blackwell, London
Chase-Dunn C 1984 Urbanization in the world system: New directions for research. In: Smith M P (ed.) Cities in Transformation. Sage, Beverly Hills, CA
Eade J (ed.) 1996 Living the Global City: Globalization as a Local Process. Routledge, London
Elliott J R 1999 Putting ‘global cities’ in their place: Urban hierarchy and low-income employment during the post-war era. Urban Geography 20(2): 95–115
Fainstein S, Gordon I, Harloe M 1993 Divided City: Economic


Restructuring and Social Change in London and New York. Blackwell, New York
Friedmann J, Goetz W 1982 World city formation: An agenda for research and action. International Journal of Urban and Regional Research 6: 309–44
Graham S, Marvin S 1996 Telecommunications and the City: Electronic Spaces, Urban Places. Routledge, London
Gravesteijn S G E, van Griensven S, de Smidt M C (eds.) 1998 Timing global cities. Nederlandse Geografische Studies 241
Hall P 1966 The World Cities. McGraw-Hill, New York
Hamnett C 1996 Why Sassen is wrong: A response to Burgers. Urban Studies 33(1): 107–10
Holston J (ed.) 1996 Cities and citizenship. Public Culture 8(2) (Winter)
Isin E (ed.) 2000 Democracy, Citizenship and the Global City. Routledge, London
Keil R 1993 Weltstadt - Stadt der Welt: Internationalisierung und lokale Politik in Los Angeles. Westfaelisches Dampfboot, Munster, Germany
King A D 1990 Global Cities: Post-Imperialism and the Internationalization of London. Routledge, London
Knox P L, Taylor P J (eds.) 1995 World Cities in a World-System. Cambridge University Press, Cambridge, UK
Kunzmann K R 1994 Berlin im Zentrum europäischer Städtenetze. In: Werner S (ed.) Hauptstadt Berlin. Band 1: Nationale Hauptstadt Europaeische Metropole. Berlin Verlag, Berlin, pp. 233–46
Lo F, Yeung Y (eds.) 1996 Emerging World Cities in Pacific Asia. United Nations University, Tokyo
Logan J R, Swanstrom T (eds.) 1990 Beyond the City Limits: Urban Policy and Economic Restructuring in Comparative Perspective. Temple University Press, Philadelphia
Low S M 1999 Theorizing the city. In: Low S M (ed.) Theorizing the City. Rutgers University Press, New Brunswick, NJ, pp. 1–33
Markusen A, Gwiasda V 1994 Multipolarity and the layering of functions in the world cities: New York city’s struggle to stay on top. International Journal of Urban and Regional Research 18: 167–93
Machimura T 1998 Symbolic use of globalization in urban politics in Tokyo. International Journal of Urban and Regional Research 22(2): 183–94
Marcuse P, van Kempen R 2000 Globalizing Cities. A New Spatial Order. Blackwell, Oxford, UK
Meyer D R 1991 Change in the world system of metropolises: The role of business intermediaries. Urban Geography 12(5): 393–416
Moulaert F, Scott A J 1997 Cities, Enterprises and Society on the Eve of the 21st Century. Pinter, New York
National Academy of Sciences (NAS) 2000 Panel on Urban Data Sets. Committee on Population, NAS, Washington, DC
Nijman J 1996 Breaking the rules: Miami in the urban hierarchy. Urban Geography 17(1): 5–22
Noller P, Prigge W, Ronneberger K (eds.) 1994 Stadt-Welt. Campus Verlag, Frankfurt, Germany
Portes A (ed.) 1995 The Economic Sociology of Immigration. The Russell Sage Foundation, New York
Rodriguez N P, Feagin J R 1986 Urban specialization in the world system. Urban Affairs Quarterly 22(2): 187–220
Ross R, Trachte K 1983 Global cities and global classes: The peripheralization of labor in New York City. Review 6(3): 393–431
Santos M, Souze M A A, Silveira M L (eds.) 1994 Territorio Globalizacao e Fragmentacao. Editorial Hucitec, Sao Paulo, Brazil
Sassen S 1982 Recomposition and peripheralization at the core. Contemporary Marxism 5: 88–100
Sassen S 1984 The new labor demand in global cities. In: Smith M P (ed.) Cities in Transformation. Sage, CA
Sassen S 1991/2000 The Global City: New York, London, Tokyo. Princeton University Press, NJ
Savitch H V 1996 Cities in a global era: A new paradigm for the next millennium. In: Cohen M A, Blair A, Ruble J, Tulchin S, Garland A M (eds.) Preparing for the Urban Future. Global Pressures and Local Forces. Woodrow Wilson Center Press, Washington, DC, pp. 39–65
Scott A J 2001 Global City-Regions. Oxford University Press, Oxford, UK
Short J R, Kim Y 1999 Globalization and the City. Longman, Essex
Simon D 1995 The world city hypothesis: Reflections from the periphery. In: Knox P L, Taylor P J (eds.) 1995 World Cities in a World-System. Cambridge University Press, Cambridge, UK, pp. 132–55
Skeldon R 1997 Hong Kong: Colonial city to global city to provincial city? Cities 14(5): 265–71
Smith R C 1997 Transnational migration, assimilation and political community. In: Crahan M, Vourvoulias-Bush A (eds.) The City and the World. Council of Foreign Relations, NY
Smith D, Timberlake M 2002 Cross-border air traffic patterned networks. In: Sassen S (ed.) Global Networks/Linked Cities. Routledge, London
Smith M P, Feagin J R (eds.) 1995 The Bubbling Cauldron: Race, Ethnicity, and The Urban Crisis. University of Minnesota Press, Minneapolis
Storper M 1997 The Regional World: Territorial Development in a Global Economy. Guilford Press, New York
Stren R 1996 The studies of cities: Popular perceptions, academic disciplines, and emerging agendas. In: Cohen M A, Blair A, Ruble J, Tulchin S, Garland A M (eds.) Preparing for the Urban Future. Global Pressures and Local Forces. Woodrow Wilson Center Press, Washington, DC, pp. 392–420
Tardanico R, Lungo M 1995 Local dimensions of global restructuring in urban Costa Rica. International Journal of Urban and Regional Research 19(2): 223–49
Taylor P J 1995 World cities and territorial states: The rise and fall of their mutuality. In: Knox P L, Taylor P J (eds.) 1995 World Cities in a World-System. Cambridge University Press, Cambridge, UK, pp. 48–62
Taylor P J, Beaverstock J V, Walker D R F 2002 Introducing GaWC: Researching world city network formation. In: Sassen S (ed.) Global Networks/Linked Cities. Routledge, London
Thrift N, Leyshon A 1994 A phantom state? The de-traditionalization of money, the international financial system and international financial centres. Political Geography 13(4): 299–327
Veltz P 1996 Mondialisation Villes et Territoires: L’Economie d’Archipel. Presses Universitaires de France, Paris
Walton J 1982 The international economy and peripheral urbanization. In: Norman I, Fainstein S (eds.) Urban Policy under Capitalism. Sage, CA, pp. 119–35
Watson S, Bridges G (eds.) 1999 Spaces of Culture. Sage, London
Wheeler J O 1986 Corporate spatial links with financial institutions: The role of the metropolitan hierarchy. Annals of the Association of American Geographers 76(2): 262–74

S. Sassen

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7


Cities: Cultural Types

Urban forms and structures vary from one region of the world to another. For better understanding of such variations the notion of culture realm is helpful.

1. The Intermediate Level of Looking at Cities

From the global point of view cities are local concentrations of population sharing certain features such as site and location factors, form (streets, houses), functions, and land uses. These features are common to all cities around the globe.

At the opposite end of the scale there is the individual city. From the idiographic point of view each city is unique as to its historical development and design.

At an intermediate level and mainly on cosmographic, religious, mercantile, or military grounds, the cities in a specific region of the world have developed certain peculiar traits common only to them and distinguishing them from the cities of other regions. These traits derived from ideas and thoughts of their inhabitants about their way of life and the desire to shape their settlements accordingly. Cities are, to a certain degree, a mirror of the intentions of their founders and of all successive generations over the centuries.

This aspect has been somewhat neglected, even to the present day. Prior to the 1950s, professional geographers were occupied mainly with location and growth factors, and urban functions. During the first quarter of the twentieth century a few German geographers made urban morphology their research topic, and it was in this period that a few cultural–genetic studies appeared. The Austrian Oberhummer participated in the Transcontinental Excursion of 1912 across the United States and published a comparative study on American and German cities in a memorial volume (Oberhummer 1915). Fleure (1920) wrote an article on various types of European cities for the 1920 Geographical Review, while in 1928 Passarge convened a symposium on urban issues in various countries and two years later published the results in his book Stadtlandschaften der Erde (Townscapes of the World) (Passarge 1930).

Interest in cross-cultural research arose again after 1950 when urban land uses and urban structure became major research topics. Simultaneously, the British geographer Smailes (1955) drew the attention of the English-speaking scientific community to urban morphology with his paper on townscapes. These attempts were, however, very soon superimposed by the so-called ‘quantitative revolution’ and the rise of urban social geography.

This is why the cultural–genetic approach was given rather little attention in standard textbooks. Of course, a few cross-cultural studies appeared, by Scargill (1979), Brunn and Williams (1983), and Agnew et al. (1984). The models of some cultural types of cities are discussed in Ehlers (1992). But separate chapters devoted to cultural types of cities are only found in the urban geography textbooks by Beaujeu-Garnier and Chabot (1963), Hofmeister (1999), and Rugg (1972). A vast amount of material on cultural types of cities will, however, be available in the near future when the series ‘Urbanization of the Earth’ started by Tietze in 1977 is completed.

It is postulated that urbanization processes are the same all over the world. However, they encounter different cultures and consequently generate different urban forms and structures in various parts of the world.

These regional differences may be seen against the background of culture realms. In the present author’s opinion one may distinguish between twelve culture realms in the world and twelve cultural types of cities, respectively. These are the European, Russian, Chinese, Japanese, Southeast Asian, Indian, Oriental (Middle Eastern), Central African, South African, Australian, Anglo-American, and Latin American types.

We should, however, be aware of the fact that the notion of culture realm implies a rather high degree of generalization. As we attempt to go further into details we shall soon realize, for example, that Anglo-American cities are far from alike. The mere existence of the two political entities of the United States and Canada has had some bearing upon urban developments, so that US cities will look somewhat different from Canadian cities (see Goldberg and Mercer 1986). In an attempt to do justice to such variations Holzner et al. (1967) derived 34 sub-types of cities.

Even within the confines of the Dominion of Canada, cities are different to a certain degree as we compare the cities in the predominantly French-settled Province of Quebec with the cities of the Province of Ontario so that we may at least distinguish between a French-Canadian type and an Anglo-Canadian type of city (see Hecht 1977). A further breakdown might lead to even more regional subtypes.

2. Cultural Types of Cities in their Culture Realms

The following paragraphs will restrict themselves to these twelve cultural types of cities corresponding to the twelve great culture realms of the world, and to a few selected cultural traits.

2.1 The European City

Europe’s cities originated from processes that started in the eighth century with the consolidation of political


power and economic development. The ruler’s power, be it clerical or secular, found its manifestation in cathedrals, monasteries, and castles. To the present day, churches and castles have remained focal points of European cities.

Usually the early European town was a dual settlement: besides the ruler’s court, merchants’ quarters developed as the second point of origin. The town was a closed entity, separated visually from its hinterland by a wall, its residents living under special jurisdiction and enjoying the right of trading. The market place and town hall became the focal point of the city. The heritage of these essentials of early European urbanism is still obvious in present-day townscapes.

There are, however, regional variations. For example, Southern Europe’s towns were usually built on hills corresponding to the acropolis of Athens, with the effect of many streets actually being stairways. The hilltop location gave them a certain amount of safety not only against enemies but also against the torrential floods of the seasonally inundating Mediterranean rivers. The numerous squares were integrated intensively into the daily lives of their residents.

In contrast to the Continent, British towns show a less compact building fabric due to much earlier dismantling of their fortifications and a strong desire of people for single family dwellings, these traits making them look more like Anglo-American than European cities.

2.2 The Russian City

Many old Russian towns developed on the western bluffs of rivers with the kremlin as their core while the settlement was less compact on the eastern bank. The second core was the posad or merchants’ quarters. Small suburban settlements called slobody developed as living quarters of other population groups, while still farther away often fortified monasteries were founded. The kremlin eventually lost its strategic function and mutated to become the city’s administrative and cultural center.

In the colonial period grid-pattern quarters were added to many historic towns. Especially in the Islamic towns of Central Asia, the modern Russian quarters made for a conspicuous contrast to the cul-de-sac layout of the historic town center.

During the Soviet era, socialist principles were applied to town planning. The concept of microrayon corresponded with the American neighborhood principle inasmuch as entities of 6,000 to 25,000 residents with basic service functions were conceived. Three to four microrayons were called a district, and there were two more levels in the intra-urban hierarchy.

The post-Soviet changes of the 1990s brought foreign investments, mainly into hotels and office buildings, the privatization of kolkhoz markets and some residential quarters, and a somewhat greater mixture of social status groups.

2.3 The Chinese City

The widespread regular grid pattern supposedly originated in military camps on the dangerous northern margin of the later Chinese Empire. Geomantics played a decisive part in the layout of towns. The rectangular system was related to yin, representing the Earth, in contrast to the circular yang, representing Heaven. The ruler’s palace always faced south. A fixed number of streets ran north–south, crossing the east–west streets at right angles. However, some streets were a little displaced in order to prevent evil spirits from getting through.

During the nineteenth century many towns along the east coast and the rivers expanded by so-called ‘concessions,’ i.e., foreign missions, industrial, port and trading facilities, and living quarters established by Europeans and Americans.

In the present People’s Republic there is a hierarchical structure of urban residents with a certain number of households forming a residents’ group, several groups a residents’ committee, and several committees a town. Residents are usually assigned dwelling units and social services by production complexes, these danweis being highly autonomous entities. The planning of great modern industrial parks since 1958 gave rise to a number of satellite cities that are supposed to relieve the metropolises of uncontrolled expansion and eventually to abolish the urban–rural dichotomy.

2.4 The Japanese City

The rectangular cho-pattern copied from China and the observing of geomantic rules made for similarities to the Chinese cities. Around 1950 one half of Japan’s cities originated from the castle town or joka-machi of the fifteenth and sixteenth centuries. The castle district is located in a strategic hillside position, protected by a system of walls and ditches and divided into the palace area of the daimyo in the center and the quarters of the higher and lower samurai (warriors). Beyond the wall there were the quarters of the craftsmen, merchants, and priests. During the reforms of 1868 the daimyo lost their privileges. The castle district mutated to become a civic center in those castle towns that were able to acquire manufacturing and administrative functions. The other half of Japan’s cities are port cities, market places, shrines or spas.

During the twentieth century many small shopping streets grew to the size of large shopping centers, although without any societal and cultural functions. Huge underground centers at the railroad stations


became keen competitors for the traditional shopping parts of the deity Vishnu’s body, specific quarters of
streets of the temple districts. the city being allocated to each of them. Priests used to
A number of walled industrial parks with dor- live together with the ruler and the members of his
mitories for single workers and multi-storey tenements court in the central temple and palace district or in the
for married workers’ families called danchi were northern sector of town. The officials privileged by
established. The third trend was land reclamation their position as supervisors of the ruler’s water and
beyond the former coastlines for large modern in- land resources lived in the eastern sector, the mer-
dustrial plants with their own port facilities and chants in the south, and the peasants in the west.
infrastructure of dwelling units, supermarkets, and Over the centuries more than 3,000 subcastes have
social institutions. developed due to secessions, political quarrels, and the
growing division of labor. These jatis, characterized by
hereditary professions and high rates of endogamy
have made for a very low degree of mobility and a high
2.5 The Southeast Asian City
degree of segregation.
Geomantic and anthropomorphic rules were ob- In North India the irregular Indian grid influenced
served in their layout. The port was considered the by Islamic culture is characterized partly by winding
‘heart’ of the city, while the seat of the clerical power streets of changing width and crooked byroads.
on the south side was its ‘soul.’ The north was identi- A completely new element was added to these towns
fied with the body’s head and was supposed to accom- by the Anglo-Indian stations of the British comprising
modate the representative functions. The east was the ‘cantonment’ or military quarters, the ‘civil lines’
compared with the right or working hand and assigned or quarters of the officials of the Indian Civil Service,
to the craftsmen, the west with brain work and mental private entrepreneurs, and the ‘railway colony.’ A
culture, and assigned to the officials and guardians of number of shops were lined up along the Mall while
the palace. the so-called sadr bazaar served those troops recruited
Since the Age of Discoveries, European urban from the indigenous people for whom the Hindu town
patterns and buildings have been introduced, Jakarta’s was off limits. Even a new house type was created: the
Dutch origin becoming obvious by its grachten and bungalow, with its rectangular shape, pyramid-like
Manila’s Spanish origin by its plaza mayor. As in roof thatched or covered with bricks, one storey high
many colonial cities there was an intermediate status with porches on each side, and put on stilts for better
group between the indigenous people and the colonial ventilation and protection against floods.
elite. In Southeast Asia this group is made up of the
Chinese, who dominate trade and commerce.
A conspicuous element of many towns is their
2.7 The Oriental or Islamic City
fishing villages where boat people, often referred to as
‘sea gypsies,’ spend most of their lifetime on boats or There has been a long-lasting controversy as to
sampans. whether the cities from Morocco to Pakistan should
Great percentages of Indonesia’s urban population be termed Oriental or Islamic. Adherents to the latter
are found in settlements referred to as ‘kampungs,’ opinion consider the Friday mosque and the bazaar
these settlements having a functional mixture of the two dominating elements, while their opponents
dwellings, crafts, and trade, and originating in villages argue that the suq, the caravansary, and the hammam
still lacking most urban amenities. Some authors or bathhouse can be traced back to the Roman
consider them squatter settlements. In recent times colonnades, basilica, and thermae of pre-Islamic
many kampungs have been integrated into the building times, respectively.
fabric of the growing metropolises, while only those One of the most conspicuous traits of the medina or
at the urban fringe have maintained their village- historic town center is its cul-de-sac pattern, while
like appearance. the few thoroughfares are exceptions. One expla-
nation for this phenomenon is the juxtaposition of
clans of different origins as to their ethnicity, religion
and language, and their desire to live segregated from
2.6 The Indian City
other residents. Second, in contrast to the Roman
Hindu urban culture is based on meditation images carriages, camels and mules were used for trans-
and symbols such as mandalas with a regular street portation. The third reason is the legal status of the
pattern arranged around temples devoted to one of the cul-de-sacs: they are not public spaces, but rather are
deities. The regular Indian grid is centered on an owned by their neighbors.
intersection of two major axes with a temple nearby, Much real estate of the medina is in waqfs, i.e.,
while the palace is integrated into the wall. Ritual and foundations donated by rulers, officials, or merchants
secular uses of streets have been highly interwoven. for religious and beneficial purposes or the use of their
The caste system played a decisive role in Hindu children, and administered by trustees. Neither these
culture. The four major castes originated in various nor the users are interested in investments, so that


these foundations prevent urban renewal, leading the post-apartheid town. As early as 1913 the Native
to the decay of buildings and the emigration of the Land Act prohibited Africans from acquiring land
wealthier people. outside their reservations, or later homelands. They
The bazaar is structured in a way that the most lived either as servants in their employers’ households
valuable goods such as jewelry, books, or perfumes are or were forced to stay in ‘locations’ and hostels.
traded in the most centrally located and covered lanes. The central city was reserved for white citizens and,
Spices, shoes, and carpets are found some distance as an exception, for some colored groups, by the
from the center, while pottery, leather goods, auto Group Areas Act of 1948. Even white people lived
parts, and other manufactured and bulky goods are more or less segregated as to their Afrikaner, British,
found in a still more peripheral location. In recent or Portuguese origin. The Natives Resettlement Act of
years this pattern has been weakened. Simple stalls 1954 initiated comprehensive relocation. Often rail-
have been replaced by shops with shop windows, and road tracks, highways, or canals served as barriers
the whole bazaar has faced competition from the between the quarters of racial groups, while each racial
development of a central business district (CBD) and quarter was assigned a considerable number of places
modern suburban shopping centers. of employment in order to keep commuting through
other racial quarters to a minimum.
However, complete segregation was never achieved.
Since 1986, non-white enterprises have been admitted
2.8 The Central African City
to so-called ‘free trade areas’ while ‘grey areas’ have
There has been a controversy as to whether the Yoruba been legalized, and even ‘free settlement areas’ open to
towns of southern Nigeria were urban settlements in all racial groups have been established. The repeal of
the strict sense of the word. The Hausa developed a the racial laws in 1991 brought about considerable
special kind of segregation, since no strangers were mobility and squatting.
allowed into their town, those people founding settle-
ments of their own outside the gates: sabon-gari which
2.10 The Latin American City
in Hausa means ‘new town.’ The British colonial
administration adopted the sabon-gari system for The Spaniards founded their overseas towns around
ethnic and hygienic reasons. the plaza mayor with the cabildo, cathedral, and
Although the numbers of European soldiers, admin- courthouse. They had a central–peripheral social
istrators, and business managers remained small, their gradient with the upper-class people living in the
settlements were spacious in contrast to the crowded cuadras (blocks) nearest to the plaza.
areas of the indigenous people. There were inter- This social structure has persisted to the present
mediate groups between them and the European elite: partly because centrally located residences are still
the Levantines in West Africa and the Indians in East considered a privilege. Part of the city center mutated
Africa. They were mainly in trade, transport, and into the CBD, with some storeys being added to old
lower ranks of the administration. The Indians es- patio houses and edificios or high-rise office buildings
pecially had their own temples, schools, clubs, and being constructed. Behind the fashionable paseos,
bazaars with dukas, i.e., open shops with verandahs overcrowded multi-storey tenements or conventillos
closed with wooden shutters at night. In recent times had developed, which received many rural migrants
modern shopping centers have developed beside the who eventually further migrated to the shantytowns at
duka-style bazaars. A slow transformation has oc- the urban fringe.
curred from a society composed of the European Since the 1930s a sectoral pattern has been super-
upper class, the Indian middle class, and the African imposed on the traditional zonal structure by the
lower class toward a society structured along socio- development of manufacturing plants and workers’
economic lines. quarters along railroad lines, and by the centrifugal
In many towns the indigenous population is divided expansion of upper-class quarters.
into a dominating tribal group and several minority At the urban fringe a more or less high percentage of
groups from different tribes, each of them stressing people live in shantytowns called illas miserias in
their own tribal traditions, a phenomenon referred to Argentina or faelas in Brazil. Most of their oc-
as ‘retribalization’ or ‘peasantization.’ cupations are in the so-called ‘informal sector’ of the
economy. As a measure of relief, poblaciones or public
housing estates in combination with site-and-service
projects have been built by the governments of various
2.9 The South African City
South American states.
The early towns of the Cape Province and the towns in
Orange Free State and Transvaal were founded by the
2.11 The Anglo-American City
Afrikaners, and all others by the British. Within a few
decades they went full circle from the colonial town While the grid pattern had already been introduced to
through the apartheid town, and a transition period to colonial North America the grid was applied


rigorously to most towns beyond the Appalachian them to become home owners in some suburban areas
Mountains after the release of the Land Ordinance in within a few years, and the high degree of gentri-
1785. fication.
The United States was the pacemaker in skyscraper Urban sprawl has been even greater than in the
construction. Competition from huge suburban shop- United States and made for approximately 70 percent
ping centers and the loss of purchasing power of the home ownership. Despite this and a high rate of car
clientele of downtown shops brought decay to the ownership there are amazingly few urban expressways
downtown areas of US cities. Since the 1950s, re- (with the exception of Perth). A widely accepted
vitalization programs have been carried out with malls planning concept is the development of a limited
and shopping gallerías often in combination with number of growth corridors along railroad lines and
skyways, downtown motels, and modern cultural major highways with designated district centers
centers with concert halls, convention centers, and around suburban railroad stations.
exhibition halls. Historical foundations have pur-
chased and resold many historic buildings by means of See also: Cities: Internal Structure; Cultural Diversity,
revolving funds. Human Development, and Education; Cultural Geo-
Since in large areas the founding of towns and the graphy; Cultural Landscape in Environmental Studies;
construction of railroads occurred simultaneously, Cultural Landscape in Geography; Cultural Resource
railroad tracks often cut right through the town center Management (CRM): Conservation of Cultural Heri-
with level crossings, while large manufacturing areas tage; Ecology, Cultural; Urban Activity Patterns;
developed right outside the CBD along the tracks. The Urban Growth Models; Urban History
inner residential areas decayed, because most of
the building stock was made of timber and prone to
swift degradation, and the improved-value system
of taxation was no incentive for investments, so Bibliography
that tax delinquency and vacancy rates increased
Agnew J A, Mercer J, Sopher D E (eds.) 1984 The City in
rapidly. Moreover, since the 1830s low-status groups Cultural Context. Allen & Unwin, New York
of immigrants concentrated in these inner quarters Beaujeu-Garnier J, Chabot G 1963 Traité de géographie urbaine.
where they were close to their fellow countrymen and A. Colin, Paris
to manufacturing jobs. In recent years many such Brunn S D, Williams J F (eds.) 1983 Cities of the World. World
areas have been upgraded by ‘urban homesteading’ Regional Urban Development. Harper & Row, New York
and gentrification. Ehlers E (ed.) 1992 Modelling the city: Cross-cultural per-
For many decades suburbia has been dominated by spectives. Colloquium Geographicum 22, Bonn
single-family homes and duplexes. Especially, young Fleure H J 1920 Some types of cities in temperate Europe.
families moved to the suburbs seeking better Geographical Review 10: 357–74
Garreau J 1991 Edge City. Life on the New Frontier. Doubleday,
schools for their children. Other reasons were invest- New York
ments in high-quality real estate and the chance to live Goldberg M A, Mercer J 1986 The Myth of the North American
close to neighbors sharing the same lifestyle (‘lifestyle City. University of British Columbia Press Vancouver, BC
suburbs’). The metropolises became completely frag- Hecht A 1977 Die anglo- und frankokanadische Stadt. Trierer
mented by dozens of huge shopping centers, these Geographische Studien Sonderheft 1. Trier, Germany, pp.
functioning as catalyzers for modern office and busi- 87–11
ness parks, and the growth of ‘urban villages’ termed Hofmeister B 1999 Stadtgeographie, 7th edn. G. Westermann
‘edge cities’ by Garreau (1991). Brunswick, Germany
Holzner D, Dommisse E J, Mueller J E 1967 Toward a theory of
cultural–genetic city classification. Annals of the Association
2.12 The Australian City of American Geographers: 367–81
Oberhummer E 1915 Amerikanische und europäische Städte.
In general, Australia’s cities very much resemble the Memorial Volume of the Transcontinental Excursion of 1912.
Anglo-American type. There are, however, certain New York
differences. The CBD has never experienced such a Passarge S (ed.) 1930 Stadtlandschaften der Erde. Breslau,
high concentration of skyscrapers with all their dis- Germany
advantages. The inner suburbs of the nineteenth Rugg D S 1972 Spatial Foundations of Urbanism. W C Brown
century have not experienced a decay similar to that of Co, Dubuque, IA
US cities due to more favorable conditions: the long Scargill D I 1979 The form of cities. St. Martin’s Press,
New York
rows of terrace houses are rather solid buildings made Smailes A E 1955 Some reflections on the geographical de-
of brick or natural stone (with the exception of the scription and analysis of townscapes. Transactions of the
Queenslander house made of timber and set on stilts), Institute of British Geographers: 99–115
the site-value system of taxation, the much smaller Tietze W (ed.) 1977 Urbanization of the Earth/Urbanisierung der
concentration of non-British minority groups, most of Erde. Berlin
these immigrants only arriving after 1950 in a country
with a booming economy and high wages enabling B. Hofmeister

Copyright © 2001 Elsevier Science Ltd.


All rights reserved.

International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7


Cities, Images of

Cities, Images of menological involvement; and designers offer top-


down speculations about the correct aesthetic for the
Community appearance matters to people. This article design of places.
reviews scientific findings on the visual features of Fechner (1876) introduced the bottom-up approach
cities that convey a strong and desirable image to to the study of aesthetics (meanings). He studied
people who experience them. The article defines the human evaluations of attributes of simple stimuli,
concepts, reviews the research, and discusses methodo- such as rectangles and polygons. Almost a hundred
logical questions and future research and use of the years later, Berlyne (1971) revived this scientific
findings. approach, called empirical aesthetics; and at a time
when psychologists sought real-world relevance,
Wohlwill (1976) extended the ideas and approach of
Berlyne to the study of real places. Although eval-
1. City Form: A Scientific Approach uative responses, such as preference, may vary across
individuals (cf. Little 1987), the theories of Berlyne
City form is shaped by and affects many people. (1971) and Wohlwill (1976) held that certain kinds of
Research shows that appearance is central in human physical features would likely have hedonic value.
responses to their surroundings (cf. Nasar 1994, 1997). More recently, some theories suggest an evolutionary
City form ‘should be guided by a ‘‘visual’’ plan: a set of basis for certain shared environmental preferences
recommendations and controls’ for its appearance (Kaplan and Kaplan 1989, Orians 1986, Ulrich 1993).
(Lynch 1960). US legislation and the courts grant Theories differ in their interpretation of the import-
governments the authority to control appearances ance and interdependency of the individual and the
(Mandelker 1996) and most American cities do so environment in response. Research, however, confirms
(Lightner 1993). To work, appearance controls must that beauty rests more in the features of the place
consider people’s image of places. This article centers evaluated than in the head of the evaluator (Stamps
on the two key aspects of the image: imageability and 1995).
likability. Imageability refers to the probability that Although Lynch recognized the importance of
an environment will evoke a strong image from meaning, he felt one could not easily manipulate it
observers (Lynch 1960); and likability refers to the through changes in urban form. He accepted the
probability that an environment will evoke a strong conventional wisdom that preference is highly variable
favorable response from observers (Nasar 1997). across individuals. He judged meaning as impractical
For most of its history, urban design—the practice to study, and concentrated on form—identity and
of shaping urban form—has followed a philosophical structure—separate from meaning.
approach. Theorists speculated on what ought to be,
but did not arrive at or test their speculations
scientifically. Lynch (1960) suggested and tested a
scientific approach. He assumed that people would
more likely know, and so use, an environment that was 2. Components of City Imageability
easy to read or legible. This made legibility a valid
purpose for research and design. Lynch interviewed residents in three cities to see what
Lynch described the environmental image as having they recalled about their cities; he found strong
three parts: identity, structure, and meaning. This consensus across respondents. They converged on five
means humans recognize or identify objects (identity), elements that enhance a city’s imageability: landmarks,
they see a recognizable pattern of relationships be- paths, districts, edges, and nodes. Landmarks are
tween objects (structure), and they draw emotional visible reference points, such as towers or mountains.
value (or have feelings) about the objects and structure Paths are channels for movement, such as streets or
(meaning). The meaning of a place may take a walkways. Districts are large sections of a city that
denotative or a connotative form. Denotative meanings have some recognizable, common perceived identity
are the same as identity; they refer to judgments of distinguishing them from other areas. Edges are
what the place is. Connotative meanings refer to barriers, such as shorelines, rivers, or railroad cuts.
inferences about the quality and character of the place Nodes are focal points of intensive human activity.
or its users. People often think of such connotative Research confirms the stability of these elements for
meanings as a question of aesthetics. This article many populations and cities around the world (cf.
avoids the term ‘aesthetics,’ because of its connection Nasar 1997). Although the images and prominence of
to art, where a statement may take priority over elements may vary across populations and places
pleasure, and because many people view aesthetics (Rapoport 1977), the correct arrangement of the
as something one cannot quantify. Aesthetics also elements can heighten the imageability of a city.
has its roots in philosophy and normative theory. Research that grew from Lynch’s seminal work has
Philosophers question the degree to which aesthetic yielded much information about mental maps, dis-
experience arises from psychical distance or pheno- tance perception, and wayfinding (cf. Evans 1980).


To shape urban form, imageability is not enough. places: naturalness, order, complexity, novelty (atypi-
One must consider people’s evaluation of the city, the cality), upkeep, openness, and historical significance
meanings they see, or their evaluative image. (cf. Nasar 1994, 1997). People recognize variation
from natural (vegetation) to human-made. Research
shows that humans prefer vegetation, that preference
increases with the addition of vegetation, decreases
3. Components of the Evaluative Image of the with the increase in human-made elements, and that
City people dislike obtrusive signs, utility poles, overhead
wires, and billboards, traffic, and intense land uses.
Communities might not need a scientific understand- Commuters drive out of their way to use a parkway
ing of the bases for evaluative responses if the rather than a less natural expressway; and research
designers, design review boards, and other experts suggests that exposure to vegetation may have res-
who shape places produced designs that pleased torative or healing effects.
people. Regrettably, research shows that they often do Research shows that people notice and prefer order.
not (cf. Nasar 1994, 1997). Preference for order has emerged for many kinds of
To find the evaluative image, one must consider urban settings and for various ordering variables,
both the evaluative responses important to people and including legibility, coherence, identifiability, clarity,
the features of the environment that people notice and compatibility, and congruity. People also prefer well-
evaluate. Research has found three important aspects kept to dilapidated areas. Dilapidation and disorder
of human evaluative response to places (Russell and such as vandalism, boarded up buildings, and litter,
Snodgrass 1989). Preference is a purely evaluative which researchers refer to as physical incivilities, also
dimension. Mixes of pleasure and arousal produce contribute to a perception of the breakdown of social
excitement and relaxation. Exciting places feel more controls, fear of crime, and crime.
pleasant and arousing than boring ones; and relaxing Complexity relates to the number of different
places feel more pleasant but less arousing than elements and the distinctiveness between those ele-
stressful ones. ments in a scene. Research shows that people notice
Evaluative response to places may arise from two variations in complexity, and that interest, excitement,
formal and symbolic variables (cf. Kaplan and Kaplan and viewing time increase with complexity, but that
1989, Nasar 1994). Formal variables have to do with preference tends to be highest for moderate levels of
the structure of form and include such things as shape, complexity. Though some research points to con-
proportion, scale, complexity, incongruity, novelty, tradictory findings, those findings suffer from method
and order. Symbolic or content variables have to do biases. Research shows that novelty and atypicality
with the connotative meanings associated with the also increase excitement and interest. People prefer
forms. Several kinds of theories discuss the relation- moderate to low levels of novelty or atypicality.
ship between these variables and response. One set of Though some studies show contradictory results, the
theories view preference as dependent on arousal discrepancies arise from flaws in measuring novelty
(Berlyne 1971, Mandler 1984, Wohlwill 1976). Of the and familiarity.
many variables these theories cite as affecting arousal, People readily notice changes in spaciousness. Pref-
complexity and novelty (atypicality) have garnered erence increases with openness, but people also like
the most research attention. In theory, complexity and some spatial definition. People also like mystery (in the
novelty increase arousal, interest, and excitement; but form of deflected vistas), but for uncertain conditions
preference has an inverted U-shaped relationship to such as urban areas deflected vistas and uncertainty
arousal. Preference would increase with increases in about information ahead heighten fear.
complexity or novelty up to a point, after which Places may have historical significance or just look
increases in complexity or novelty would produce a historical. In either case, they evoke favorable re-
downturn in preference. Another theory offers an sponse. People also prefer popular styles to the high-
evolutionary model in which human survival depen- style designs. The preference for historical significance
ded on preference for involvement and making sense, and certain popular styles over high styles may arise
and as a result, humans now prefer places that offer from connotative meanings associated with them or
involvement and either make sense or promise to from the mix of complexity and order in them.
make sense (Kaplan and Kaplan 1989). This theory Naturalness, upkeep, and historical significance
posits complexity and mystery (the promise of new appear to be symbolic variables, while the others
information ahead, as in a deflected vista) as creating appear to be formal variables, but each one may work
involvement; and it posits coherence and legibility as for formal or symbolic reasons. In addition, people
helping people make sense of things. People should may like some of these variables for their contribution
like a mix of complexity, mystery, coherence, and to order or for their associations with status. Natu-
legibility. ralness, upkeep, open views, order (compatibility),
Research shows seven environmental features as and historical significance enhance order, but these
prominent in human perception and evaluation of same features may look like features that wealthier


persons can afford. People notice status, make ac- Bibliography


curate judgments of social status from environmental
Berlyne D E 1971 Aesthetics and Psychobiology. Appleton-
cues, and prefer high-status to low-status areas. Century-Crofts, New York
For integrative reviews on perception and pref- Evans G 1980 Environmental cognition. Psychological Bulletin
erence of these features, see Kaplan and Kaplan (1989), 88: 259–87
Nasar (1988a, 1988b, 1989, 1994, 1997) and Wohlwill Fechner G 1876 Vorschule der Aesthetik. Breitkopf and Härtel,
(1976); for vegetation see Kaplan (1995), Ulrich (1983, Leipzig, Germany
1993), and Wohlwill (1983); for disorder incivilities, Kaplan R, Kaplan S 1989 The Experience of Nature: A
see Taylor (1989); for atypicality, see Mandler (1984), Psychological Perspective. Cambridge University Press, New
Purcell and Nasar (1992), and Whitfield (1983); for York
fear and mystery, see Nasar and Jones (1997). Kaplan S 1995 The restorative benefits of nature: Towards an
integrative framework. Journal of Environmental Psychology
15: 169–82
Lightner B 1993 A Survey of Design Review Practice in Local
Government. (Monograph) MEMO: Planning Advisory Ser-
4. Methodological Issues vice (PAS). American Planning Association, Chicago
Little B R 1987 Personality and the environment. In: Stokols D,
Visual quality research makes choices in selecting Altman I (eds.) Handbook of Environmental Psychology,
respondents, environmental stimuli, measures of en- Vol. 1. Wiley, New York, pp. 205–44
vironmental features, and measures of evaluative re- Lynch K 1960 The Image of the City. MIT Press, Cambridge,
sponse. The choices involve trade-offs between what is MA
Mandelker D R 1998 Land Use Law. Mische, Charlottesville,
practical, what allows experimental control, and what
VA
will generalize to the real situation. For a full review of Mandler J M 1984 Stories, Scripts, and Scenes: Aspects of
the methodological issues see Nasar (1997, 1999). One Schema Theory. Erlbaum, Hillsdale, NJ
thing that differentiates visual quality research from Marans R W, Stokols D 1993 Environmental Simulation: Re-
other social science inquiries is the need to get response search and Policy Issues. Plenum, New York
to the appearance of places (Marans and Stokols Nasar J L 1988a Environmental Aesthetics: Theory, Research,
1993). For this, many studies use color photographs and Applications. Cambridge University Press, New York
and slides. Research shows that responses to such Nasar J L 1988b Perception and evaluation of residential street-
stimuli, even though they lack movement and sound, scenes. In: Nasar J L (ed.) Environmental Aesthetics: Theory,
accurately reflect on-site response (Stamps 1990). Research, and Applications. Cambridge University Press, New
York, pp. 275–89
Nasar J L 1989 Perception, cognition and evaluation of urban
places. In: Altman I, Zube E (eds.) Human Behavior and
Environment: Public Spaces, Plenum, New York, pp. 31–56
5. Future Directions Nasar J L 1994 Urban design aesthetics: The evaluative quality
of building exteriors. Environment and Behavior 26: 377–401
Future research needs to better define the linkage Nasar J L 1997 The Evaluative Image of the City. Sage, Thousand
between the judged and actual physical attributes. It Oaks, CA
should examine movement through environments. It Nasar J L 1999 Design by Competition: Making Design Com-
should apply scientific methods to historical data to petition Work. Cambridge University Press, New York
examine longitudinal aspects of evaluative response. It Nasar J L, Jones K 1997 Landscapes of fear and stress.
Environment and Behavior 29: 291–323
should use meta analysis to integrate findings of Orians G H 1986 An ecological and evolutionary approach to
previous studies statistically. It should supplement landscape aesthetics. In: Penning-Rowsell E C, Lowenthal D
verbal responses with psychophysiological measures (eds.) Landscape Meanings and Values. Allen and Unwin,
and observation of behavior. London, pp. 3–25
To derive specific guidelines for special situations, Purcell A T, Nasar J L 1992 Experiencing other people’s houses:
one can also use visual quality programming. This A model of similarities and differences in environmental
involves applied research to develop information for a experience. Journal of Environmental Psychology 12: 199–211
visual quality plan or guidelines. Nasar (1997, 1999) Rapoport A 1977 Human Aspects of Urban Form. Pergamon
demonstrates examples of this for various appli- Press, Oxford, UK
cations. This can lead to improvements in the visual Russell J A, Snodgrass J 1989 Emotion and environment. In:
Stokols D, Altman I (eds.) Handbook of Environmental
quality of communities for residents and visitors. Psychology, Vol. 1. Wiley, New York, pp. 245–80
Stamps A E 1990 Use of photographs to simulate environments.
See also: Memory for Meaning and Surface Memory; A meta-analysis. Perceptual and Motor Skills 71: 907–13
Mental Imagery, Psychology of; Mental Represen- Stamps A E 1995 Stimulus and respondent factors in environ-
mental preference. Perceptual and Motor Skills 80: 668–70
tations, Psychology of; Multi-attribute Decision Taylor R B 1989 Toward an environmental psychology of
Making in Urban Studies; Space: Linguistic Ex- disorder: Delinquency, crime and fear of crime. In: Stokols D,
pression; Spatial Analysis in Geography; Spatial Altman I (eds.) Handbook of Environmental Psychology,
Cognition Vol. 2. Wiley, New York, pp. 951–86

Cities, Internal Organization of

Ulrich R S 1983 Aesthetic and affective response to natural 1820 and 1920, the number of US cities with popu-
environment. In: Altman I, Wohlwill J F (eds.) Behavior and lations of 5,000 persons or more exploded from 39 to
the Natural Environment: Human Behavior and Environment, 1,467. During this period, these cities were also under
Advances in Theory and Research, Vol. 6. Plenum, New York,
transformation. Originally evolved from a focal point
pp. 85–125
Ulrich R S 1993 Biophilia and the conservation ethic. In: Kellert of employment, these cities were relatively small and
S R, Wilson E O (eds.) The Biophilia Hypothesis. Island Press, compact because contemporary transportation sys-
Washington, DC, pp. 73–137 tems limited their expansion. Technological advances
Whitfield T W A 1983 Predicting preference for everyday ob- led to successive transportation systems: streetcar,
jects: An experimental confrontation between two theories of suburban rail, and subway. Each new system helped to
aesthetic behavior. Journal of Enironmental Psychology 3: push the urbanized area further.
221–37 Similarly, in the early twentieth century, the use of
Wohlwill J F 1976 Environmental aesthetics: The environment as trucks, automobiles, and telephones vastly expanded
a source of affect. In: Altman I, Wohlwill J F (eds.) Human
the urban horizon and dispersed wholesale and manu-
Behaior and the Enironment: Adances in Theory and
Research, Vol. 1. Plenum, New York, pp. 37–86 facturing uses to the suburbs. Meanwhile, new build-
Wohlwill J F 1983 The concept of nature: A psychologist’s view. ing technologies such as elevators and skyscrapers
In: Altman I, Wohlwill J F (eds.) Behaior and the Natural helped rebuild city centers into central business dis-
Enironment. Plenum, New York, pp. 5–37 tricts (CBDs)—containing blocks having concen-
trations of high-rise buildings of offices and retail.
J. L. Nasar While the urban share of the US population first
reached 50 percent in 1920, the most striking change
that would sweep America, suburbanization, was
delayed by the Great Depression and World War
Two. In the postwar period, three interrelated pro-
Cities, Internal Organization of cesses restructured the urban landscape. High rates of
family formation and the ‘baby boom’ (a surge of
Economic activities, land uses, and socioeconomic births between 1946 and 1964) created extreme hous-
status of population seldom distribute evenly or ing shortages that the existing urban fabric could not
randomly in an urban area. They typically differentiate accommodate. Concurrently, huge federal and state
into internally homogeneous clusters. It is difficult not expressway-building programs opened up new resi-
to conceive that this differentiation is governed by dential suburbs while urban renewal efforts accel-
some underlying principles of spatial organization. erated the outward movement of blacks from segre-
And urban form affects the economic efficiency, social gated ghettoes into nearby predominantly white neigh-
equity, environmental quality, and sense of place. borhoods. Throughout the 1950s and the 1960s, these
Therefore, understanding theories of urban spatial last two processes resulted in the so-called ‘white
organization helps advance knowledge and shape flight’ or ‘flight-from-blight’ phenomena
better futures for urban areas. (Mieszkowski and Mills 1993). These processes accel-
Contemporary urban development has long spread erated an already existing decentralizing trend and led
beyond the political boundaries of a city. A typical to further dispersal of economic activities. As sub-
urban landscape in an advanced economy is a con- urban centers captured jobs, housing, and stores, the
tiguous urbanized area that consists of multiple cities central cities lost their dominance in metropolitan
and their suburbs. Hence, urban spatial organization population and employment. The result was the
must be analyzed in the context of an urban region. growth of sprawling urban regions. These trends
Most studies in urban spatial organization have continued through the late twentieth century as
been conducted in North America where an advanced urbanized areas gradually become polycentric.
capitalist market economy prevails. Their generaliz- By 2000, the US demographic character was firmly
ations may not be completely applicable to cities that established. For example, 80 percent of its population
have evolved in different economic modes, such as was classified as urban, up from 56 percent fifty years
those where the public sector dictates urban devel- earlier. Furthermore, suburbanization gained momen-
opment or communal institutions control property tum. In 2000, the share of the urban population living
rights. Furthermore, North American urban places in the suburbs was 60 percent, a complete reversal
are different from the rapidly growing mega-cities from the 40 percent of 1950. Suburbanization is not
(having populations of more than 15 million people) in unique to the US; other nations of advanced econ-
Asia and Latin America. omies are undergoing similar changes (Ingram 1998).

2. Patterns of Spatial Organization


1. Urbanization and Suburbanization
Evaluating urban spatial organization encompasses
During the nineteenth century, North America rapidly several patterns, including land use differentiation,
industrialized and urbanized. For example, between population distribution, household characteristics,


income and racial segregation, employment distribution, changing building densities, and the rise of subcenters and polycentric urban forms.

Urban land uses are highly differentiated. Land uses can be compatible or incompatible with one another. Compatible uses generate mutual benefits or positive externalities. Incompatible uses create harmful consequences or negative externalities. Compatible uses usually coexist with one another while incompatible uses show rigid segregation. It is debatable whether public land use control or voluntary market forces contribute to this segregation.

Theoretically, land uses display a number of geometric forms: concentric rings, sectors or wedges, multiple nuclei or small clusters, or a linear formation from a narrow strip to an arc or a corridor. The Burgess model (1925) posits that land uses fall into concentric zones extending outward from the CBD. Two alternative formulations complementing this schema are the sectoral model, arguing that wedges of similar activities radiate from the CBD along transportation corridors, and the multi-nuclei model, asserting that secondary CBDs and suburban economic centers emerge to accommodate second-order activities. It appears that these three forms of land-use arrangements can coexist in a single urban area. For example, concentric rings and corridor-type uses are evident in a polycentric Los Angeles.

The way that population is distributed displays some regularities as well. Urban population densities tend to be higher around the CBD and lower at the edges. Alonso (1964) developed a monocentric model to reflect this pattern and calculated a density decay function, $D(\mu) = D \exp(-b\mu)$, where the average density, $D$, starts at the fringe of the CBD and declines at a rate, $b$, per unit distance, $\mu$, away from the center. This function flattens as time changes, i.e., densities at the fringe will marginally increase or the limit of a city continues to expand. Over time, population densities have decreased around the core, reflecting the depopulation of most of the central cities since the 1950s.

The monocentric model is limited because its approximation of densities is specific to a particular scale of observation. The gross population density, a commonly used measure, tends to underrepresent the net density in the periphery because it factors unpopulated and nonresidential areas into the analysis. Comparing neighborhood by neighborhood, the net residential density in the outer suburbs may not be too different from that in the inner suburbs.

The spatial distribution of household characteristics is not uniform. Households of small size, female-headed, or with no children are more concentrated in the core. Families and larger size households are more dispersed in the periphery where larger lots and detached housing are more abundant. The differences in preference also reflect the impact of family cycles in housing choice. Families tend to be more sensitive to house size and school quality.

Income segregation is the most noticeable feature of American urban spatial organization. Broadly speaking, higher income groups reside in the suburbs while the low-income groups stay in the core. This general pattern is skewed by the high concentration of poverty in the core and also interrupted by other irregularities. Small high-income precincts are commonly found in the urban core (e.g., Coral Gables in Miami, Georgetown in Washington DC, and Pacific Heights in San Francisco). Similarly, not all suburban households are high-income because many suburban development projects target the middle class. Unlike population density, changes of income level rarely move smoothly away from the CBD. Neighborhoods with large income differences are often separated by such physical barriers as rivers, highways, or even manmade barriers.

Employment distribution exhibits more dispersal than residential distribution. Manufacturing activities started to move out from the center in the 1900s as trucks replaced rail freight movement. By mid century, as air transportation became important, manufacturing and distribution centers gravitated toward airports. Expressway expansion has provided ubiquitous access within urban areas and diminished the locational advantages of their cores. Today, manufacturing and wholesale activities tend to cluster around freeway interchanges in the suburbs. Lower-order retailing activities have also moved to the suburbs as purchasing power has shifted in that direction. Suburban malls were first developed in the late 1950s and within decades modern and bigger malls were built in the newer suburbs, leaving CBD and old suburban retail centers to falter. Although the average CBD’s share of metropolitan employment has fallen to less than ten percent, CBDs have not completely lost their role because most of them still represent the biggest single concentration of specialized services and government activities.

Building and development densities also exhibit a distance decay relationship with the CBD but their gradients are much steeper and often disrupted by surges of density in secondary CBDs and suburban centers. In general, high-rise development and multi-family housing cluster in the primary and the secondary centers more than in other parts of the urban region.

The rise of subcenters and polycentric urban form is a direct result of continuous urban expansion and shifts in population and employment distribution. As population and employment disperse, they gravitate to suburban centers and transform the old monocentric urban structures into polycentric ones. These centers have a propensity to cluster or distribute loosely in a sectoral form. These centers evolve from older settlement points or they create their own set of activities and population concentration. Garreau (1991) characterized the rise of these centers as ‘Edge Cities,’ recently developed places with a sizeable office


and retail space attracting large numbers of commuters. Today, these centers are increasingly self-contained and are making suburbs independent of the cores of their metropolitan regions. Recent studies reveal that office and professional service activities are continuing to scatter. As a result, more than half of the commuting traffic is among these centers.

3. Searching for the Principles of Spatial Organization

Explaining the organizing principle(s) of spatial organization requires sophisticated understanding of the social, economic, technological, cultural, and other phenomena influencing urban form and structure. Spatial organization is derived from observable patterns specific to the scale of observation. Also, the paradigm adopted by a researcher generally shapes the findings. This section identifies and discusses common determinants of urban spatial organization.

3.1 Factors that Influence Urban Form and Structure

Second to topography, urban morphology (the built environment and the associated property rights) is the most important constraint on urban development. The ‘urban capital stock,’ i.e., lots, buildings, streets, road grids, rail tracks, and expressways, is relatively inelastic to changes. Since modifications of urban morphology involve very costly land assembly, older areas in the core are likely to be left derelict. Similarly, streets and freeways usually form impasses of a land use or boundaries to a neighborhood. Infrastructure, such as transportation facilities, transshipment centers, and their ancillary uses, resists change. Its obsolescence or closure causes dereliction and blight. Though building structure is malleable and conversion of uses is possible, these transitions are lengthy and sometimes costly, especially in the absence of market demand and public intervention. In contrast, development in the fringe is less risky, subject to less uncertainty, and cheaper. As real estate activities tend to be cyclical, construction occurs in waves. Investment converges on certain locations for a short duration and then wanes. Sometimes, these activities create an impressive mark on urban form: power centers, corridors of strip malls, and sprawling gated communities. Successive urban expansions have reflected the impetuous and speculative nature of the development process.

Technological changes greatly affect spatial organization. The prime factor behind persistent decentralization has been advances in transportation technology. For example, expressway systems have radically extended urbanized areas. Contemporary sprawl typically hops along newly developed transportation corridors. Similarly, the development of remote and difficult terrains is possible because of technological changes to such infrastructure systems as water supply, power grid, and sewerage and drainage. Finally, the advent of modern telecommunications incorporating personal computers, facsimiles, and cellular telephones is producing new locational requirements for firms (Graham and Marvin 1996). While some speculate that the revolution in information technology will cause the ‘death of distance’ and the reintegration of home and workplace, evidence indicates that it merely expands choices of human interaction and does not reduce commuting (Kotkin 2000). Its impact on urban structure is still not clear, as it has generated both centralization and decentralization of economic activities.

Income, wealth, and preferences have a great influence on population distribution. Income and wealth affect consumer preferences and the ability to live in specific locations. Research on housing choice has consistently shown the importance of such attractive attributes as neighborhood amenities, school quality, personal safety, and quality of life. As the spatial distribution of these attractive attributes is highly differentiated, higher income groups tend to outbid lower income groups in all localities in the price competition for these attributes. Therefore, high-income group clusters are not confined to the suburbs. Both ‘middle-class flight’ and ‘gentrification’ reflect how higher income groups are sensitive to the perceived changes of neighborhood quality. Furthermore, preferences in choosing neighbors along racial or ethnic lines are translated into such discriminatory practices as selective real estate information, fiscal zoning, and other restrictive land use regulations. These practices challenge the argument that income and racial segregation is voluntary. Evidence that minority inner-city residents have greater difficulties in dispersing to the suburbs in comparison to other groups of similar income level indicates that racial preference has prevented smooth and market-driven neighborhood transition.

Economic restructuring destabilizes the old order of industrial location. Traditionally, manufacturing firms were sensitive to transportation costs reflected in the distance toward market or raw material. However, in today’s postindustrial era, minimization of transportation costs has become less important as their share in total production costs has declined. Such factors as labor costs and quality, business climate, economies of scale, and quality of life are new considerations. In addition, the agglomeration effects where inter-industry and inter-firm linkages allow sharing of innovations and access to large labor pools and networks of suppliers and purchasers are important. The postindustrial era has also witnessed the employment dispersal that, together with residential


decentralization, has developed into a multi-point and multi-directional commuting pattern. Under the new economy, knowledge-based industries require the new set of locational considerations discussed above (Wheeler et al. 2000). Finally, under globalization, urban regions have developed more external links and expanded their hinterlands. At the same time, the economic functions of inner-city neighborhoods have changed. In some instances, the increased flow of capital, information, products, and people has given rise to specialized districts such as ethnic quarters for immigrants and clusters of import and export firms. In others, especially in economically isolated neighborhoods, high rates of unemployment have led to severe distress.

Although the private sector leads urban development, the public sector plays an important role as the major undertaker of infrastructure projects and provider of such services as public schools, law and order, waste collection and disposal, parks and recreation, and land use regulation. Infrastructure steers future development and the level of public services directly affects the quality of life. In addition, local tax levels and the efficiency and attitude of local government shape the business climate that most private businesses rely on in making investment decisions. American urban areas commonly contain a large number of small municipalities. This fragmentation has resulted in great variations in the quality of services and business climate. Labeled as the Tiebout effect, these variations induce businesses and residents to move among places, leading localities to compete with one another to attract investment and middle-class residents. This competition is often unequal because some municipalities have fewer resources and suffer from eroding tax bases and the desertion of the middle class. Such unequal competition reinforces segregated and fragmented spatial organization.

3.2 Interpretations of Spatial Organization

The study of the spatial organization of urban places can take many approaches. Each casts a different image and each adopts a particular set of methods and paradigms. Here are the major approaches.

The ecological approach developed at the University of Chicago was the earliest systematic study of urban form (Park 1925). It viewed the city as a composite of multiple ecological colonies segregated by income, class, and ethnicity. Its organizing principle was that the succession and invasion of one colony into another yielded a specific urban form of concentric zones. It suggested a distribution of colonies where the transients, low-income groups, and recent immigrants clustered around the CBD. The working class and more stable immigrant groups were further out. Single-family dwellings came next and commuters occupied the outermost urban zone. This approach stimulated generations of urban studies. It highlighted the competition and conflict over space among socioeconomic groups. It treated spatial form as a manifestation of a dynamic process of commanding space.

The utility maximization model has been the most dominant approach used to analyze spatial organization in the latter half of the twentieth century. Based on Alonso’s seminal monocentric model that assumed that firms and individuals made tradeoffs between transportation costs and locational rent, it stimulated numerous studies focusing on land use and transportation modeling (Ottensmann 1975). The subsequent work commonly adopted neoclassical economic principles as the underlying organizing theme and treated an urban area as one aggregate unit of optimization where each agent settled in the location that maximized its utility and profit. This work was generally mathematical and followed a set of deductive methods that required rigorous formulation of premises. Although the resulting findings followed strong reasoning, they were contingent on the assumptions stated in the model. This work has been employed for the simulation of outcomes of one variable when other variables were under careful control.

An emerging interdisciplinary field, urban morphology, examines the physical form of cities, stresses the historical effects of the built environment and property rights, and studies the interaction of elements of the physical structure at various scales (Vance 1990). A related approach, space syntax, correlates human activities and attributes with physical configurations and linkages of buildings and street blocks. Architects and urban designers have adopted these techniques to examine the evolution of building forms and styles, street layouts, and the massing of buildings throughout history (Kostof 1992). They have extended beyond principles of design to interpretations of historical and institutional constraints and behavioral interactions with space.

The systems approach attempts to understand spatial organization in its entirety (Bourne 1982). It views the city as a human body where its internal environment interplays with its external environment. It tries to understand spatial organization holistically by studying the form, interrelationships, behavior, and evolution of activities of an urbanized area. It considers urban structure an undefined reflection of the historical and organizational principles of society, which, in turn, are the products of current and previous operating rules of culture, technology, economy, and social behavior. Since this approach refuses to isolate these elements, it has had to devise an overarching analytical framework to incorporate various organizing principles. The difficulties in developing paradigmatic coherence have prevented this attempt from flourishing.
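
The monocentric reasoning discussed above rests on the density decay function given in Section 2, D(µ) = D exp(−bµ). As a minimal illustrative sketch in Python, the following evaluates that function for a steeper "earlier" gradient and a flatter "later" one. The function name `density` and all parameter values are hypothetical, chosen only to show the shape of the curve; they are not estimates from any study cited here.

```python
import math


def density(distance, d0, b):
    """Density at `distance` from the CBD fringe: D(mu) = D * exp(-b * mu).

    d0 is the density at the fringe of the CBD (D in the text),
    b is the rate of decline per unit distance.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return d0 * math.exp(-b * distance)


# Hypothetical parameters, for illustration only (not empirical values).
earlier = {"d0": 10000.0, "b": 0.30}  # dense core, steep gradient
later = {"d0": 7000.0, "b": 0.15}     # thinner core, flatter gradient

for km in (0, 5, 10, 20):
    print(km, round(density(km, **earlier)), round(density(km, **later)))
```

Under these assumed values the later curve is lower at the center and higher toward the edge, which is the flattening over time that the text describes.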


The most recent effort to understand spatial organization, the postmodern approach, has been adopted across such disciplines as geography, sociology, anthropology, cultural studies, and urban studies (Dear 2000, Soja 1989). Sometimes containing neo-Marxist doctrines, it examines urban development in the context of the rapid transformation of society under globalization and economic restructuring. It is sensitive to the worldviews held by groups differentiated by gender, class, ethnicity, and race. While it does not specifically examine spatial organization per se, it provides insights into how space is perceived, battled over, and controlled. It legitimizes the use of qualitative methods to study the complex urban process. It considers the city not as one economic unit or a production center but a place with multiple functions. It also stresses the interaction of countervailing internal, regional, and global forces in polarizing and tearing apart the traditional connections of urban spaces.

See also: Spatial Interaction Model; Urban Sprawl; Spatial Organization Models; Spatial Pattern, Analysis of; Urban History\Towns and Cities in History; Urban Geography; Urban Sociology; Urban Studies, General; Income Distribution; Wealth Distribution

Bibliography

Alonso W 1964 Location and Land Use. Harvard University Press, Cambridge, MA
Anas A, Arnott R, Small K A 1998 Urban spatial structure. Journal of Economic Literature 36: 1426–64
Berry B J L 1965 Internal structure of the city. Law and Contemporary Problems 30(1): 111–19
Bourne L S (ed.) 1982 Internal Structure of the City. Oxford University Press, New York
Burgess E W 1925 The growth of the city. In: Park R E, Burgess E, McKenzie R (eds.) The City. University of Chicago Press, Chicago
Dear M J 2000 The Postmodern Urban Condition. Blackwell, Malden, MA
Garreau J 1991 Edge City: Life on the New Frontier. Doubleday, New York
Graham S, Marvin S 1996 Telecommunications and the City: Electronic Spaces, Urban Places. Routledge, London
Harris R, Lewis R 1998 Constructing a fault(y) zone: Misrepresentations of American cities and suburbs, 1900–1950. Annals of the Association of American Geographers 88(4): 622–39
Ingram G K 1998 Patterns of metropolitan development: What have we learned? Urban Studies 35(7): 1019–35
Kostof S 1992 The City Assembled: The Elements of Urban Form Through History. Bulfinch, Boston
Kotkin J 2000 The New Geography: How the Digital Revolution Is Reshaping the American Landscape. Random House, New York
Mieszkowski P, Mills E S 1993 The causes of metropolitan suburbanization. Journal of Economic Perspectives 7(3): 135–47
Ottensmann J R 1975 The Changing Spatial Structure of American Cities. Lexington Books, Lexington, MA
Park R E, Burgess E, McKenzie R (eds.) 1925 The City. University of Chicago Press, Chicago
Queen S A, Thomas L F 1939 The City: A Study of Urbanism in the United States. McGraw-Hill, New York
Soja E 1989 Postmodern Geographies: The Reassertion of Space in Critical Social Theory. Verso, New York
Vance J E Jr. 1990 The Continuing City: Urban Morphology in Western Civilization. Johns Hopkins University Press, Baltimore
Webber M M, Dyckman J W, Foley D L (eds.) 1964 Explorations into Urban Structure. University of Pennsylvania Press, Philadelphia
Wheeler J O, Aoyama Y, Warf B (eds.) 2000 Cities in the Telecommunications Age: The Fracturing of Geographies. Routledge, New York
Yeates M 1998 The North American City, 5th edn. Harper and Row, New York

S. Wong


Cities: Internal Structure

Urban research has always focused on big cities. Big cities are concentrations of the cultural and economic potential of mankind. They are centers of innovation as well as hotbeds of social and ecological problems. Moreover, they have long outgrown the limits of perception. Not only do planners plead for a return to human dimensions, but all those engaged in urban research seek to delimit manageable segments, either by subject matter or by spatial criteria.

1. General Principles

1.1 Accessibility and Transport Technology

The city is a centered system based on the twofold premise that the city center is the engine of development and the place easiest to access. Accessibility of both the center and the entire city area is determined by its opening up for development and by transport technology (see Fig. 1).

A city of pedestrians and coaches tended to be circular. When streetcars came into use in Europe, they started mostly at the town gates. This made for star-shaped growth, with the interstitial areas lagging behind in development. Where public transport is predominant, the city center remains the easiest place to access. The métropole concentrée of Paris is an example. The absolute predominance of private transport reduces access to the center; under a liberal political system and with land abundantly available, a


city or metropolitan region designed for car traffic emerges in several steps. Los Angeles offers a prototype of this.

Figure 1
The accessibility of the city center

1.2 Distance Decay

There are two theoretical approaches for analyzing gradients from the center to the periphery:

1.2.1 Social gradients. The center-periphery organization of urban society may be described by means of centrifugal and centripetal social gradients. These gradients permit the following diagnosis regarding urban growth. Wherever the city center is also the social center and a centripetal social gradient is in evidence, the demand for space of the upper classes extends outward from the center. This causes residential and social upgrading of the adjacent middle-class districts. In the middle zone, former lower-class quarters are turned into middle-class districts. The Founders Period growth of Berlin, Budapest, Copenhagen, Paris, Prague, and Vienna was of this kind. This type of upgrading must not be confused with the recent phenomenon of gentrification in North American cities. Here, middle and upper income urbanites move centripetally into rundown central districts (see Fig. 2).

Figure 2
Centrifugal and centripetal social gradients (Stadtgeographie (Urban Geography) 1998, p. 110)

Wherever a centrifugal social gradient predominates, as in US cities, a filtering down takes place: with the deterioration of the building fabric in the center, lower-class people move outward into adjacent middle and upper-class districts. This social downgrading of high quality living quarters is hard to stop and presented a major problem for British town planners. Having accomplished the redevelopment of rundown early industrial terrace houses next to the old town centers, they were faced with the daunting task of redeveloping devastated mid- and late nineteenth century terrace house districts (while the necessary renovation of early twentieth century council housing was partly effected by privatization under Thatcher).

1.2.2 Land values gradient. According to the theory of urban land markets (Alonso 1964), urban land use mirrors transport cost as well as the rent of land. Regularly, in socio-economically intact city centers, there is a center-periphery gradient with several consecutive zones of use outward from the central business district (CBD). Replacement of historical city-models by the new model of suburbia together with the reduction in accessibility have made for an abandonment of central areas, resulting in ‘craters’ of land prices and visible decay of inner cities in the USA and parts of Western Europe. In the centers, the gradient now falls in the opposite direction. The Alonso model excludes restrictions on planning land use and vertical development. With zoning laws, building categories, and other legal regulations, the gradient of real estate prices is altered. Each category or zone is divided in two, with the inner zone obviously better suited for business purposes and office space than the outer zone that is used for dwelling as it is more profitable (see Fig. 3).

Figure 3
Land values gradient and zoning

1.3 Hierarchical Structures

Hierarchical order is one of the basic types of systemic organization. When organizing the physical space of urban areas, a hierarchical order has been attempted wherever planned development was feasible and land in abundant supply. Examples run from seventeenth-century Sicily (Grammichele) via Howard’s New Town idea (1902), with city center and garden suburbs in a cluster city, to major planning projects for urban peripheries in Europe at present. The hierarchy is constructed from the bottom upwards. The basic units are electoral wards, which in many European countries and in North America double up as statistical units: their dimensions are derived from the traditional ‘pedestrian ideology’ of 1 km (i.e., 15 minutes walking distance) as standard size. Comprehensive hierarchical urban structures have been realized only in the former socialist countries of Eastern Europe, above all the USSR. In the rest of Europe, hierarchical organization of urban space is the exception rather than the rule. However, it is frequently found in traditional Oriental cities. There, a spatial hierarchy of family, clan, and local quarters is based on family, religious, and ethnic affiliations. The hierarchical organization of living quarters and sub-centers is paralleled in the hierarchy of semi-private, semi-public, and public links as well as markets.

2. The Impact of Political Systems

2.1 Social-ecological City Models

In their paper, The Nature of Cities (1945), Harris and Ullman tried to model internal patterns within cities. The triad of models presented makes use of diverse development phenomena:
(a) invasion and succession of social groups in Burgess’ zonal model,
(b) transport-induced ribbon-developments along traffic lines in Hoyt’s sector model, and
(c) collective preferences as to the allocation of workplaces in the Harris-Ullman multiple-nuclei model.
However, three important underlying assumptions have not yet been made explicit:
(a) the political system of liberalism,
(b) the historical ‘one-dimensionality’ of urban development, and
(c) the concept of the city center as CBD.
The models mentioned represent ideal types of North American cities of the interwar period (see Fig. 4).

2.2 Historical-political Systems and the Concept of the City Center

Changes in the political system alter the conceptions of the city and urban society. The function of the city center is changed. European urban development may be defined as a succession of four types of political systems and the respective types of cities. Their concepts of the city center were markedly different: (a) in the burghers’ city of the medieval, feudal, territorial state, the marketplace was the social center; (b) this shifted to the ruler’s residence in the residence city of the absolutist state. Thus, a social gradient falling from the center toward the periphery is the general rule in pre-industrial cities. (c) In the age of liberalism, Great Britain created the prototype of the industrial city. A social gradient rising from the center toward the periphery predominated, a development later paralleled in North American cities. (d) Again, Great


Britain set the rules for the New Town idea, the attempt at structuring the amorphous masses of big industrialized cities on a human scale, with the city divided into parts with different functions. From the outset, spatial segregation of inhabitants was barred from the design of New Towns, a fact that is still influential in urban planning today. European city development is most complex; various superpositions and, to different degrees, the persistence of historical structures caused the diversification of socioeconomic patterns and different social gradients.

Figure 4
Social ecological models of US cities during the inter-war period

Vienna in the 1960s offers the model for the traditional continental European city:
(a) Social status is still highest in the core and declines toward the periphery. Depending on site preferences individual districts deviate from this rule.
(b) The CBD has maintained some residential functions and is not surrounded by slums, but by middle and upper class residential districts, encircled by lower class quarters.
(c) The wide belt of multistory blocks is followed on its outside by Founders Period industrial quarters, followed in their turn by loosely built-up districts.
(d) The fringe zone is not defined by an extensive speculation area of ‘vacant land’ as in North America, but by sectors of intensive agricultural land use (in keeping with the von Thünen model) such as truck farming and viticulture, as well as of allotment gardens and weekend homes (see Fig. 5).

Figure 5
The social ecological model of a continental European city: Vienna (Geoforum 1970, 4: 61)

2.3 The Impact of State Socialism

Under state socialism, municipal governments were the local planning authorities, responsible for the mass of multistory housing that was unprofitable because of ‘social’ low rents. They lost all chances of capital accumulation from real estate property, and became dependent on financial endowments from the central government and on central planning. The consequences were:
The big cities’ increasing demand for space was met by extensive incorporations.
Public construction (by state, municipal and other collective institutions) took absolute precedence over private construction.
Architectural design followed totalitarian principles, stressing elements of over wide avenues and huge squares, both imposing and strategically important.
Massive anti-segregation strategies were pursued.
City centers were usually put under monument protection.
City enlargement was effected via New Cities in the shape of hierarchically structured giant multistory blocks.


In compliance with the Charter of Athens, industrial and dwelling zones were strictly separated.
Commuting requirements were taken care of by subway construction.
Forced industrialization created extensive industrial zones with plants located close to railway lines and super-highways.
Extensive leisure zones were created on the fringes, with both collective recreation facilities and private second homes (see Fig. 6).

Figure 6
The model of a socialist city: Prague (Vienna. Bridge between Cultures, 1993, p. 102)

2.4 City Growth and Political Systems

Cities are growing systems, with increasing populations, and/or with increasing demand for space for diverse urban functions (housing, work, education, leisure, traffic, etc.) in a setting of economic growth and technical innovation.
The diverging political-economic effects of central planning and of a free enterprise economy cause significant differences in the physical growth of cities in the respective systems. Cities grow in two directions: laterally and vertically.

2.4.1 The third dimension. Vertical growth means high-rise construction. Up to the 1970s, when some cities changed their laws, European cities could not expand vertically because of strict zoning laws, but were forced to grow laterally. This necessitated the conversion of dwelling space into office space, e.g., in Paris and Vienna. High-rise construction in European cities started relatively late (see Fig. 7). Its location within the city, frequently subject to special permission, is different from that in North America. There, the vertical structure of urban skylines shows that land prices peak in the center while in European cities ‘monument protection’ bars high-rises from the centers. Thus, the new landmarks of banks, insurance firms, corporation headquarters, and hotels keep a polite distance from the old landmarks of churches, town halls, and castles.
For the sake of access to the various supply and disposal mains, high-rises are preferably located along urban ‘scars’: at the interfaces of traditional zones where former boundaries still show in open space or low physical objects. Frequently, new high-rises accent not only the edge of traditional inner cities but, centripetally, also major points of access to older outer cities and suburbs. High-rises also mark the front of growth of the CBD, busy commuter train stations, as well as ‘satellite’ districts. They are also instrumental in slum clearing.

Figure 7
Skylines of European, North-American, and Russian million-cities (Stadtgeographie (Urban Geography) 1998, p. 198)

2.4.2 Urban fringe development. Development of urban fringes differs materially according to whether it takes place under private capitalism, in welfare states or, in retrospect, under state capitalism.
(a) Under US private capitalism, cities grow as profits from land speculation and rising land prices are


invested in land development and in technical improvements. This starts an upward spiral of development and rising prices. North American cities show two wide zones of speculation. The inner one, around the CBD, is marked by decay and slum areas at present. Far more impressive, at least because of its extent, is the peripheral zone of vacant land around core cities and suburbs, which, even in the 1950s, amounted to 20 to 60 percent of the core areas (Bartholomew 1955). Anglo-American urban geography textbooks completely ignore those huge tracts of vacant land, implying that non-utilization of these spaces is taken for granted rather than considered a problem.
(b) In continental Europe, too, urban fringes attracted speculation. During World War I, they were occupied by so-called ‘emergency gardening plots,’ later turned into allotment gardens. Frequently, these became temporary settlements, partly forerunners of a second-home periphery. All over post-World-War I continental Europe, spontaneous settlements typically marked city fringes, a consequence of fundamental political changes. The absence of government checks on land use made for temporary settlements, among them the pavilions of ‘chaotic urbanization’ in France as well as the often illegal occupation of land around big central European cities (Belgrade, Budapest, Bukarest, Sofia, Vienna, and Warsaw). This shows that the succession states were less powerful than the Austro-Hungarian monarchy in repelling indigent illegal immigrants. Those postwar squatters were comparable to today’s squatter settlements on the fringes of Third World cities beyond the reach of state or municipal authorities.
(c) As to planning and regulating the growth of agglomerations, Great Britain set the standard in the early twentieth century, with the two important concepts of the New Town (referred to above) and the Green Belt. The creation of a Green Belt presupposes government control of land use, replacing the market mechanism of the liberal age. Originating in London, Green Belts have also become constituent features of zoning plans in other cities of the former British Empire. The urban development plan of Ottawa shows a Green Belt several miles wide. Persistent urban sprawl caused a characteristic overspill. In the USA, too, Green Belt concepts were introduced into various urban development plans, but because of the enormous extent of suburbanization they have not been realized along the edges of core cities. At present, there is little interest in creating public green spaces and leisure areas there.
(d) The opposite is true for the former socialist countries where extensive public recreation areas and large private second-home districts exist. (Neither of these important elements of the periphery of urban regions is to be found in the USA.) The public recreation area of Moscow goes back to a Green Belt idea already incorporated in the city development plan of 1935. It is 20 to 40 km wide and includes major sports and cultural facilities. In most of the big cities of the former USSR, the Green Belt was an integral part of development planning.

3. On the Nature of Cities: the Immanent Question

With current suburbanization and counter-urbanization tendencies, the model of the city as a centered system is becoming obsolete. Scenarios have been developed which anticipate the existence of cities as ‘non-places.’ But there were also new models developed for the American ‘urbanlike system.’ During the period 1950–2000, this rapidly growing new system of suburbia created a sort of extensive, though not ubiquitous, network, in part destroying former central place hierarchies, and in part developing the surroundings of metropolitan areas. It also frequently stopped the process of restructuring central cities by means of downtown redevelopment and gentrification. Core cities of mega-metropolises such as Chicago got another chance of redevelopment as new (air) traffic junctions, centers of business, finance, and the quaternary public sector, as well as cultural centers.
The dichotomy of the ideal types of ‘urbanlike’ structures and new mega structures seems to correspond to postindustrial America’s abandoning and simply ‘forgetting’ the areas of urban desertification of earlier industrial city development.

See also: Cities: Cultural Types; Cities, Internal Organization of; Development and Urbanization; Ecology, Political; Fertility Transition: Economic Explanations; Spatial Interaction; Sustainable Development; Sustainable Transportation; Transportation Geography; Urban Activity Patterns; Urban Geography; Urban Growth Models; Urban Life and Health; Urban Poverty in Neighborhoods; Urban System in Geography

Bibliography

Alonso W 1964 Location and Land Use. Towards a General Theory of Land Rent. Cambridge, MA
Bartholomew H 1955 Land Uses in American Cities. Cambridge, MA
Carter H 1997 The Study of Urban Geography, 4th edn. Arnold, London
Heinritz G, Lichtenberger E (eds.) 1986 The Take-off of Suburbia and the Crisis of the Central City. Erdkundliches Wissen 76, Steiner, Wiesbaden, Germany
Herbert D T, Thomas C J 1997 Cities in Space: City as Place, 3rd edn. Fulton, London
Knox P L 1995 Urban Social Geography: An Introduction, 3rd edn. Longman, Singapore


Lichtenberger E 1993 Wien-Prag. Metropolenforschung, Böhlau, Vienna
Lichtenberger E 1994 The future of the European city in the West and the East. European Review 3(2): 182–93
Lichtenberger E 1998 Stadtgeographie (Urban Geography), 3rd edn. Teubner, Stuttgart, Germany. Italian translation, Geografia dello spazio urbano, 1st edn., Unicopli, Milano, Italy
Lichtenberger E 1970 The Nature of European Urbanism. Geoforum 4: 45–62
Lichtenberger E 1976 The Changing Nature of European Urbanization. Urban Affairs Annual Reviews 11: 81–107
Lichtenberger E 1992 Political Systems and City Development in Western Societies. A Hermeneutic Approach. Colloquium Geographicum 22: 24–40
Lichtenberger E 1993 Vienna. Bridge between Cultures. Wiley
Short J R 1996 The Urban Order: An Introduction to Cities, Culture and Power. Oxford, MA

E. Lichtenberger

Cities: Post-socialist

1. Issues of Definition

Socialist governments were formed at different times in different parts of the world: in Europe (USSR, eastern Europe), Asia (China, Cambodia, Vietnam), Africa (16 claimed to be socialist, but only Angola, Ethiopia, Mozambique qualified as Afro-Marxist regimes), and the Americas (Cuba). The term socialist is confined to a short historical period: the Soviet Union apart, it refers to the 40-year period after 1948. For the countries in the ‘South’ the period is even shorter, in the case of Cambodia from 1975 until 1989. It is doubtful that we can talk of a ‘generic socialism,’ beyond the fact that each country was characterized by the abolition of private ownership and the concentration of economic resources in the hands of the state and by the monopolization of political power in the hands of a ‘vanguard party.’ (In 1994 still only 15 percent of the population of Ethiopia lived in towns.) The term post-socialist is used to describe urban areas in societies which, until the breaching of the Berlin Wall in 1989, had been known as ‘socialist.’ At present no government, with the possible exception of China and Cuba, describes itself as ‘socialist’; therefore socialist cities no longer exist.

1.1 Socialist Cities?

At a formal level, if it is accepted that these were ‘real existing socialist societies,’ then their cities were by definition socialist cities, the position adopted in the Soviet Union in 1931. The more difficult question is whether or not the socialist city is qualitatively different from the city in capitalist society. Despite dissenters, the consensus is that a distinction existed (French and Hamilton 1979, Pensley 1998, Smith 1996, Szelenyi 1996). Arguably an a fortiori case exists for believing that these characteristics might be found in Russia. As the first socialist country it had the longest period in which to experiment and put its ideological principles into practice. Additionally, it compelled its first satellites in eastern Europe to adopt its model of industrial and urban development. From this point of view it would be possible to gain insights into the essence of the socialist and post-socialist city by studying the experience of the former socialist countries in Europe.
Addis Ababa is a post-socialist city as are Berlin, Beijing, Budapest, Phnom Penh, and Moscow. The capital city in both post-socialist and socialist societies is the distillery of the society’s contradictions and conflicts; it contains in hypertrophied form what is found to a much lesser extent in all other cities. The transformation of societies from being socialist to post-socialist is not a unilinear process (Grabher and Stark 1997). Neither does the transition from socialist to post-socialist city follow a clear pattern (e.g., Harloe 1996, Phe and Nishimura 1992). However, generalizations can be made about these cities in terms of their external, representational, and material form, their economic functions, and the social relationships amongst their populations.

2. Urban Form

2.1 The Symbolic, Representational Environment

The aesthetic of the socialist city in the Soviet Union before 1939 represented a triumphant proletariat in a hostile but not yet belligerent world. After 1945 that aesthetic had to include the victory in war against fascism. Thus, to the mausoleum and ubiquitous monuments to the heroes of the Great October Socialist Revolution, in particular Lenin, Marx, and Engels, were added grave, granite memorials to those who died in the titanic struggle against national socialism.
The aesthetic and symbolic representation of the post-socialist city is much more difficult to identify. Just as Soviet leaders, especially, but not only, Stalin and Khrushchev, demolished churches and did much to remove all vestiges of the tsarist past, post-socialist leaders are rebuilding cathedrals (the most classic being the Cathedral of Christ the Saviour near the Kremlin in Moscow) and removing statues to the now fallen heroes of socialism. China, too, destroyed much of the civilizational grandeur of the Imperial City in Beijing, while Cuba allowed Havana to wither (Segre et al. 1997).
Berlin, Budapest and Moscow encapsulate the characteristics of the post-socialist city. During


the belle époque their architectures expressed their countries’ imperial status and cosmopolitan air. Subsequently, architecture came to symbolize inter-war authoritarianism, the defeat of fascism, a divided world, epitomized in the divided Berlin. In contrast to the socialist city, the post-socialist city has no state-decreed prototype. The socialist city, as conceived during the 1920s and 1930s based on planning and reason, would, using the most up-to-date construction materials and building technologies to erect multi-storey blocks of flats, be the acme of modernism. From 1949 this was the blueprint for all socialist societies.
Now the coarse granite blocks that characterised government buildings of the socialist city have been replaced by leafy atriums encased in glass, symbolically representing transparency, the current metaphor for democracy. The architectural preferences of the nouveau riche group tend to fall into two categories: either a sentimental veneration of the past and national styles, or an international orientation expressed in a pastiche and rococo post-modernism.
Changing architectural styles and usages of town squares are sound barometers of historical periods. The memorial and monument, which became popular at the turn of the twentieth century, embellished and blemished the streets, squares, and parks of towns and villages after the end of World War I, especially in the newly created European states. Ljubljana even before the war had begun to substitute a Slovene for its German façade (Jezernik 1998). This process reached its apogee in the socialist city where a baroque Marian column, for example, was ‘relocated’ in a less visible place, while obelisks were erected to the Red Army and marble memorials were built to partisan leaders. In many post-socialist cities, these heroes of 40 years have been, if not destroyed, decapitated or mutilated, then banished from public view, in some instances to museums so that they may be reminders of the images of dictators and criminals. These acts highlight the transience not just of heroes, but also of the myths about the inhabitants’ past and who they wish or think themselves to be. The erection and removal of monuments are potent symbolic acts, which create and sustain myths and then denigrate and depose them. In the post-socialist city new buildings are themselves monuments, but less to people or events than to the new-found ideology and to the power of money. The architecture of new government buildings represents visions of the meaning and exercise of political power.
The post-socialist city celebrates three myths. The first is a ritualistic pageantry of the past, so easily portrayed in a coat of arms and the insignia of local governance that are deemed to dignify the present. Second, in keeping with contemporary Western planning theory and practice, the squares are pedestrianized and garlanded with new lamps, benches, and cafes. They are again public spaces for everyday events, not just for formal celebrations of socialist power. This is the myth of the harmonious, integrated community where everyone interacts as a free person; where the exotic and bizarre in appearance are silently applauded for their non-conformity as though these ostensible forms are a defence for everyone against socialist conformity. The third myth is that this square, which is really a circus, a place for entertainment and the consumption of material goods and pleasure, is a symbol of Europeanization or Westernization. The civic leaders of the post-socialist city bask in the sunshine of their enthusiasm for what its citizens regard as an abandonment of grayness.
The post-socialist city is cosmopolitan in theory and localist in spirit, and latently nationalist in practice. It is an arena where people learn the laws governing contracts, and the rules protecting and regulating private property. At the same time it massages into life an enemy, which for inhabitants of the socialist city was defined in terms of ‘class’ and property relations, a mode of analysis no longer acceptable. In the post-socialist city ‘the enemy’ is the alien, the stranger, the immigrant.
Berlin is distinctive from other cities in many ways, but its singular claim to uniqueness is that its post-socialist incarnation, in combining its socialist and capitalist faces in one city, should be the most visible healing of the painful rupture experienced by European nations. Ironically, however, this particular post-socialist city reveals in concentrated form the contradictions that are endemic in many of the most populous capitalist cities and which the socialist city was to have overcome. A skyline of cranes tells of a booming economy, while unemployment is at a high level; a thriving cultural and political boundary-testing artistic scene co-exists with a growing intolerance of immigrants.

2.2 The Material Environment

2.2.1 The economy. The keystone of the new global consensus amongst governments is the privatization of public assets and of government functions and the establishment of new property rights. This policy, which is already influencing the structure and functioning of post-socialist cities, is hampered or aided by deeply ingrained regional, cultural traditions.
The economic dimension of cities most distinguishes the socialist from the post-socialist city. The socialist city was unique in that economic wealth, which the privileged enjoyed, could not be conspicuous. Second, the post-socialist city is not just shaped by the wealth of an indigenous bourgeoisie, but also by foreign capital invested in real estate. This critical difference between the socialist and post-socialist city is already visible in three processes; the gentrification of parts of


the central city, the creation of a central business district and more luxurious shopping centers, and an influx and visible presence of foreign migrants.
Cities are shaped by movements of capital. In the main, the protective walls which surrounded socialist cities have been pulled down. Previously, in theory but also to a certain extent in practice, cities had planned economic relations with one another as part of a regional, national, and international division of labor. Post-socialist cities are in competition with each other and increasingly subjected to global competition. One aspect of globalization is the attraction of tourists. Only a small proportion of post-socialist cities have a real potential to make this a source of local economic development. More importantly, globalization accelerates attempts to commodify a city’s history and culture, which is achievable largely through its sanitization and its conceptual Warholization and material McDonaldization. Globalization has also helped to accelerate the growth of the criminal economy, which is shaping the urban morphology, most visibly (especially in capital cities) in private financial institutions and shopping malls. A number of post-socialist cities, mainly, but not only, capitals (Moscow, Almaty, Berlin, Sofia, Shanghai, Phnom Penh) have become centers for the consumption and transhipment of heroin and other drugs, for trade in illegal immigrants and prostitutes, and for smuggling.
Organized crime and corruption are pervasive but particularly evident in the largest cities. In some countries there is virtually no sphere of revenue-generating activity (including charities) which has not been visited by extortionists (‘insurance brokers’). In the space of a decade gangs of extortionists have evolved into ‘enforcement partners.’ Before signing business contracts, companies acquire information about each other’s enforcement partners. Only if the enforcement partners have recognized each other and given mutual guarantees will the contract with all its formal juridical and business attributes be signed.
Whereas the socialist city was characterized by a manufacturing profile with a grossly underdeveloped service sector, the post-socialist city is dominated, at both extremes of the class spectrum, by trading and the provision of services. Both the indigenous and foreign élites have generated a demand for casinos, restaurants, night clubs, and other leisure facilities. This stimulates the local economy, creates new forms of labor market stratification and, with the commodification of land, speculation in real estate. In Cambodia the overnight transfer of state housing into private ownership in 1989 presented local government officials with a golden opportunity, of which they availed themselves, to sell off public land.

2.2.2 Land reform. Land nationalization was central to the twentieth century socialist (agrarian) revolutions. Guided by an idea of social justice it addressed issues of exploitative landlord–peasant relationships, rural poverty and land hunger and the contrast between the housing standards of rich and poor city dwellers. The state allocated land to publicly owned enterprises and institutions, free of charge and for use in perpetuity. No value was imputed to the location of the land. Since in most cases land was not in short supply, organizations applying to the state for plots asked for and received more than they required. The bundle of de jure use rights enjoyed by individuals and juridical entities varied from country to country, but in no case were they able to dispose of land, even of that which was surplus to their requirements. Yet, at the same time, land users behaved as though they were powerful, private owners, with the result that detailed building and zoning regulations were frequently neglected or ignored.
The land reforms that have accompanied change at the end of the century have been guided by no such ideal of justice, but are regarded as an obligatory component in the process of privatization and the establishment of a market economy. While one of the defining features of the socialist city was the state expropriation and nationalization of land, the defining feature of the post-socialist society is the returning of land and property to private ownership. This includes the return to the former owners of all forms of property, including agricultural land and urban real estate. This process of restitution constitutes one of the thorniest of problems faced by governments and societies. Countries have approached both the general policy of privatization and the specificity of restitution differently (Strong et al. 1996).
In the first case, Belarus, Kyrgyzstan, Uzbekistan, and China are among those firmly against land privatization, while in Russia a minority of the population favors the unrestricted buying and selling of land. And, even in countries where it is legal, this has not necessarily meant the formation of a functioning land market. These policies are coming under pressure from external sources, with the European Union in 1999 declaring its opposition to requests from prospective membership candidates (such as the Czech Republic and Hungary) to continue their restrictions on foreign ownership of land. Through the selling of restituted buildings to banks and private companies and the conversion of more substantial houses into luxury flats, shops, and restaurants, restitution plays a distinct role in the stratification and shaping of post-socialist cities.

2.2.3 Housing. The interplay of a variety of factors, primarily the ideological imperative to develop industry rapidly, bequeathed highly polluted and degraded environments to post-socialist cities. The use of less


polluting fuels and the introduction of further legislation enforcing air quality control in the 1980s and then the dramatic fall in industrial output in the 1990s combined to reduce stationary sources of air pollution. On the other hand, air pollution has increased as a result of the rapid rise in private car ownership and the poor quality of fuel available (Shahgedanova et al. 1999). Thus, there has been a compensatory rise in vehicular pollution, especially carbon monoxide discharge, which is most marked in the economically successful post-socialist cities. Thus, while the sources of pollution are changing, the general level remains the same.
Although the housing stock in socialist societies varied considerably in terms of ownership and physical structure, socialist housing policy rested on two principles: that no-one should draw an unearned income from renting space and that households should not enjoy the use of housing space above a certain, administratively defined norm. Larger properties belonging to more affluent citizens were taken into public ownership and then subdivided and leased at low rents. In most countries state agencies commissioned and financed new building, which they then allocated.
In 1988 the governments of the two largest socialist systems, China and the USSR, introduced legislation designed to transform housing policy. They signalled that rents in the public sector would have to rise and encouraged sitting tenants to buy their homes. In doing so they prepared the ground for the post-socialist city, one of whose defining features is the absolute right of the owner-occupier to dispose of property. The system of targeted housing allowances and market-level (or at least cost-covering) rents that is gradually being introduced is intended over time to make owner-occupation more appealing. This is unlikely to happen until there have been substantial increases in real incomes. The availability of construction materials on the market has meant an expansion in self-build. This not only augments the supply of housing but can also be used to generate income through (sub-)letting. Generally, however, social housing will, in one form or another, play an important role for the foreseeable future (Wang and Murie 1999).

2.2.4 Real estate markets. One of the most defining and visible features of the post-socialist city has been the property boom and the proliferation of real estate agents, unknown in the socialist city. Requiring a minimum of knowledge and finance to be established, real estate agencies have proliferated in most post-socialist cities (and provided another lucrative source of income for organized crime). The immediate force driving the urban economy and responsible for transforming the class structure has been the buying, selling, renting, and construction of real estate. Its development is hampered by the fact that, after a decade of scandals associated with pyramid and other ‘banking’ schemes, individuals are wary of investing savings in private banks so that a mortgaging system is either totally absent or only in its infancy.
While much of the apparatus of town planning and building, with its institutional rhythms, careers, and entrenched interests, remains intact and ensures that the movement away from high-rise construction, based on the use of prefabricated units and the mass production of standardized parts, will take time, corruptible politicians and impotent and impoverished officials oversee the making of visual urban chaos.
Currently, a challenge is being mounted (from within the European Union) against the decade-long hegemony of the USA in the debate over the role of markets in housing and land (as expressed in Stryuk 1996 (Urban Institute, Washington), Bertaud and Rehaud 1997 (World Bank)).

3. Social Relations in the Post-socialist City

3.1 Population

The war-ravaged societies, principally in the ‘South,’ have created demographically imbalanced cities. In Phnom Penh, 29 percent of households are headed by women. In Moscow, too, because of higher male mortality rates, women predominate. The average age is considerably higher in the cities of some countries, while in others the urban population is much younger. One visible sign of the gender imbalance is the greater presence on the streets of women engaging in petty trading or begging.
Such has been the importance attached by some governments to the commercial housing sector as a
Radical transformations of society are accompanied
powerful motor to drive the economy and urban by population movements and new residential con-
development that, in China for instance, many of the figurations. The Soviet Union used an internal pass-
‘private’ developers are either outright public organiz- port (propiska) system to regulate population flows
ations or government–private partnerships. In many into metropolitan and other large cities and to control
post-socialist societies, the decline in the role and their demographic and ethnic composition. After 1945
authority of the central state has been accomplished this method of control was exported as part of its
by a growth in local government bureaucracies. But ‘administration package’ to countries which adopted
the parvenu princelings in the cities and provinces, the Marxist model of development. Some post-social-
who now have greater tax-raising powers, nonetheless ist cities, notably Moscow, illegally retain it as an
lack the financial capacity to fulfil the functions instrument of social control. Urban residents’ asso-
devolved to them, which has meant a much-reduced ciations in Ethiopia continue to regulate movement
expenditure on the technical and social infrastructure. into their territories.

3.2 The Wealthy

Post-socialist societies have created a class of nouveau riche, who differ from élites under the previous system by an ostentatious display of their wealth status, especially through the universal symbols of home and private car, both of which make their mark on the shape of, and life in, the post-socialist city. Accommodation for élites under socialism was distinguished by the following features. First, élites lived in the city, not out of town, and had access to (but did not own) a country villa. Second, they did not live in socially exclusive enclaves; neither, third, did they live close to foreigners. This is in direct contrast to élite districts in post-socialist cities. Members of the new, privileged indigenous élite live in close proximity to foreigners, often on small estates consisting of low-rise, detached housing surrounded by walls or fences covered by closed circuit television.

Views differ on the extent to which social segregation could be found in socialist cities (Szelenyi 1996, Prawelska-Skrzypek 1988, Dangschat 1987, Hamilton 1993, Weclawowicz 1992). The market economy of the post-socialist city intensifies any existing tendencies to segregation through the processes of gentrification and suburbanization. In the first case, a location in or near the city center, in buildings erected for prosperous families prior to the country’s socialist revolution, is attractive to a variety of social groups. Alongside the modernization of old buildings is the construction of prestige housing and of retail and office space.

The arrival of international organizations has created an additional demand for higher quality housing to rent and purchase. The subsequent rise in the price of housing sometimes leads to the displacement of the local population and a reduction in density, as when pre-socialist residential buildings in Phnom Penh and Prague changed from multifamily to single-family occupation.

Suburbanization was described and denigrated as an anarchic, capitalist-driven sprawl associated with low-rise, single-family dwellings. Now, from Berlin to Moscow to Almaty to Beijing, the flight of the rich to suburban and ex-urban settlements, usually along the main arterial routes, has begun in earnest. This development has been greatly assisted by the growth in car ownership. Both processes put pressure on protective green belts and invade parkland within cities.

3.3 Marginal Groups

In socialist cities, apart from gypsies, against whom sanctions were not strictly applied or were ineffectual, begging was unknown. Work was available for the able-bodied; for the elderly there were pensions, which could be supplemented by other legal activities; and the state provided for other ‘vulnerable’ groups, such as orphans and handicapped people. In the 1960s existing anti-parasite legislation was strengthened to deal with begging and vagrancy, described as anti-social(ist) behavior. As soon as the socialist city transmogrified into the post-socialist city, begging, street children, and homelessness appeared as features of the urban landscape (Andrusz 1998, Lugalla and Mbwambo 1999). However, it is doubtful that in cities of less developed countries (such as Dar-es-Salaam and Phnom Penh), where, as in other post-socialist cities of the ‘South,’ public hygiene (especially the disposal of sewage and solid waste generally) and the provision of clean drinking and washing water remain the key issues, such groups are a wholly new phenomenon, although their numbers have increased. In the ‘South,’ the vast majority of street children come from the countryside, have inhabited the streets for a longer period, and have received no education at all. Their families of origin are much larger and very frequently polygamous, and the mothers illiterate.

Socialist cities had their quotas of materially deprived and vulnerable groups, alcoholics, vagrants, and homeless people. But, unlike in post-socialist cities, they were rarely seen or talked about and did not form pressure groups or demonstrate.

Since, ideologically, such groups could not exist, neither did the organizations to help them; when they did appear, the agents of social control dealt with them under the appropriate anti-parasite legislation. Today the visible presence of large numbers of dispossessed people has led to the establishment of a statutory and non-governmental framework to address the situation of these individuals. However, even where statutory regulations exist, they are frequently not implemented. The institutional and ideological legacy has ensured that the law enforcement agencies continue to pursue a policy of harassment, intimidation, and abuse of position towards members of these groups. With the help of intergovernmental aid agencies and Western philanthropy, indigenous voluntary organizations are advocating that a more humane and positive approach be taken. But, in societies where most people are, or perceive themselves to be, suffering a deterioration in their standard of living, there is little public sympathy for beggars and special-pleading indigents.

Another visible feature of the post-socialist city is that of the completed but uninhabited luxury housing development, where supply has exceeded demand both because of the high asking price and because either the purchaser does not have the right to acquire the freehold or the period of the lease is too short. At the other end of the housing spectrum are the equally new phenomena of squatting and homelessness. The transition from socialist to post-socialist was quintessentially represented for a brief moment during perestroika when part of the space between the Hotel Rossiya and Red Square in Moscow was squatted by homeless people. The unplanned use of land reaches its apogee with squatter settlements in and around Asian and African post-socialist cities. In 1994, up to 15
percent of the population in Phnom Penh was living in such settlements.

Uninhabited, privately constructed, luxury housing on the one hand, and homelessness and squatting on the other, are features of post-socialist cities that were unknown in socialist cities.

4. Conclusion

The post-socialist city is a transparent city. Prostitution and poverty existed in St. Petersburg, Budapest, Beijing, and Havana. The ideology of socialism drew down a curtain on the urban stage, concealing and reducing, but not eradicating, social phenomena which the system decreed should not exist. It also banished the prosperous and powerful behind grey urban facades or to sylvan retreats. One of the imperatives of capitalism is that the homeless should be seen, disadvantaged groups should be allowed to express and demonstratively broadcast the inequities of their status, and the rich should parade their wealth.

See also: Cities: Cultural Types; Democratic Transitions; Development and Urbanization; Economic Transformation: From Central Planning to Market Economy; Socialism; Socialist Societies: Anthropological Aspects; Transition, Economics of; Urban Activity Patterns; Urban Geography; Urban Growth Models; Urban History; Urban Places, Planning for the Use of: Design Guide; Urban Planning: Central City Revitalization; Urban Studies: Overview

Bibliography

Andrusz G 1998 Housing and class structuration in Russia. In: van Weesep J (ed.) Proceedings of the Conference Transformation Processes in Eastern Europe. Housing and Labour Markets. The Netherlands Organisation for Scientific Research, The Hague, Part 1 pp. 27–41
Andrusz G, Harloe M, Szelenyi I (eds.) 1996 Cities after Socialism. Urban and Regional Change and Conflict in Post-Socialist Societies. Blackwell, Oxford, UK
Bertaud A, Renaud B 1997 Socialist cities without land markets. Journal of Urban Economics 41: 137–51
Dangschat J 1987 Sociospatial disparities in a ‘socialist’ city: the case of Warsaw at the end of the 1970s. International Journal of Urban and Regional Research 11(1): 37–96
French R A, Hamilton R E I (eds.) 1979 The Socialist City. Spatial Structure and Urban Policy. John Wiley, New York
Grabher G, Stark D (eds.) 1997 Restructuring Networks in Post-socialism, Legacies, Linkages and Localities. Oxford University Press, Oxford, UK
Hamilton E 1993 Social areas under state socialism: the case of Moscow. In: Solomon S (ed.) Beyond Sovietology: Essays on Politics and History. M E Sharpe, New York
Harloe M 1996 Cities in the transition. In: Andrusz G, Harloe M, Szelenyi I (eds.) Cities after Socialism. Urban and Regional Change and Conflict in Post-Socialist Societies. Blackwell, Oxford, UK
Jezernik B 1998 Monuments in the winds of change. International Journal of Urban and Regional Research 22(4): 582–8
Lugalla L, Mbwambo J 1999 Street children and street life in urban Tanzania: the culture of surviving and its implications for children’s health. International Journal of Urban and Regional Research 23(2): 329–44
Pensley D S 1998 The socialist city? A critical analysis of Neubaugebiet Hellersdorf. Journal of Urban History 5: 563–602
Phe H H, Nishimura Y 1992 Housing in Hanoi. Habitat International 15(1/2): 101–26
Prawelska-Skrzypek G 1988 Social differentiation in old central city neighbourhoods in Poland. Area 20(3): 221–32
Segre R, Coyula M, Scarpaci L 1997 Havana. Two Faces of the Antillean Metropolis. John Wiley, Chichester, UK
Shahgedanova M, Burt T P, Davies T D 1999 Carbon monoxide and nitrogen oxides pollution in Moscow. Water, Air and Soil Pollution 112: 107–31
Smith D M 1996 The socialist city. In: Andrusz G, Harloe M, Szelenyi I (eds.) Cities after Socialism. Urban and Regional Change and Conflict in Post-Socialist Societies. Blackwell, Oxford, UK, pp. 70–99
Strong A, Reiner T, Szyrmer J 1996 Transitions in Land and Housing. Bulgaria, The Czech Republic and Poland. St. Martin’s Press, New York
Stryuk R (ed.) 1996 Economic Restructuring of the Former Soviet Block. The Case of Housing. Urban Institute Press, Washington, DC
Szelenyi I 1996 Cities under Socialism, and After. In: Andrusz G, Harloe M, Szelenyi I (eds.) Cities after Socialism. Urban and Regional Change and Conflict in Post-Socialist Societies. Blackwell, Oxford, UK, pp. 286–317
Wang Y P, Murie A 1999 Commercial housing development in urban China. Urban Studies 36(9): 1475–94
Weclawowicz G 1992 The socio-spatial structure of the socialist cities of East-Central Europe. In: Lando F (ed.) Urban and Rural Geography. Cafoscarina, Venice, pp. 129–40
Wu V 1998 The Pudong development zone and China’s economic reforms. Planning Perspectives 13: 133–65
Wu W P 1999 Shanghai. Cities 16(3): 207–16

G. Andrusz

Citizen Participation

‘Citizen participation’ refers generally to citizen involvement in public decision making. In planning and related fields, the term also has a specialized meaning, designating efforts to facilitate participation of citizens who would normally be unable or disinclined to take part.

1. The Emergence of an Ambiguous Definition

Alexis de Tocqueville, the chronicler of American habits in the mid-nineteenth century, remarked on widespread popular participation in the country’s civic life. Americans seemed ready to form groups to address any problem. When mass immigration swelled cities at century’s end, citizen activists spearheaded wide-ranging progressive reforms in sanitation, land use, and government organization. American city
planning was a direct outgrowth of this citizen movement. Planning commissions were one means of institutionalizing public influence over decisions.

Nevertheless, the 1960s heard an explosion of talk about ‘citizen participation’ in planning, Community Action and the War on Poverty, and Model Cities. Part of the explanation is that citizens turned away from public life in the quiescent 1950s, but the primary reason is that the Johnson administration policy agenda focused on citizens who had never participated much in public decision making: the poor and blacks. Community Action would involve them in unprecedented ways.

The traditional approach to citizen participation in planning was represented by the federal Housing Act of 1954, which called for citizen advisory committees in urban renewal and other local projects. A 1966 Department of Housing and Urban Development Program Guide suggested categories to be represented, including business groups, civic clubs, churches, schools, government agencies, social service organizations, the mass media, neighborhood organizations, and ‘ethnic or racial groups.’ The examples and custom made it likely that, with the possible exceptions of neighborhood and racial representatives, most advisory committee members would come from an educational and economic elite.

The Economic Opportunity Act of 1964, the hallmark of the War on Poverty, declared a new approach: ‘The term “community action program” means a program … which is developed, conducted, and administered with the maximum feasible participation of residents of the areas and members of the groups served.’ Two years later, Model Cities began with a consonant call for ‘widespread citizen participation.’ One sign of a new departure was the Office of Economic Opportunity’s investment in community organizing as part of Community Action and its technical assistance to community organizations in Model Cities. The War on Poverty aimed to activate the urban poor, particularly blacks.

‘Citizen’ participation and ‘community’ action or participation differ in an important way. A citizen is an individual, whereas a community is a collectivity. Citizens are members of the state, with certain rights, but they do not necessarily have intervening attachments. Communities are groups with membership norms (ranging from residence in a territory to conformity to rules of behavior), social relations, and loyalties. Participation of individual citizens is a quantitative matter: it can be counted, and more is arguably better than less.

In contrast, participation of communities is a qualitative matter. Communities do not take action en masse, but, rather, individuals act in the name of communities. This situation introduces questions of representation: What is a community, who are its members, and to what degree do participants represent members’ interests? Partly a matter of calculation, this is largely a matter of judgment, which community members can render and about which many have strong convictions. Thus federal and local officials who encouraged or coped with citizen participation faced community activists who challenged them and forced consideration of what would be representative participation of citizens who are community members.

In addition, community activists, stimulated by the language of the War on Poverty, raised issues of power: could participation of residents in developing, conducting, and administering programs mean anything other than community control? And, if so, what should residents have control over, and how much control was significant? Answers to these questions were complicated by ambiguities in definitions of community and representation: who would have to exercise power over what in order for a community to be represented in making decisions?

In the early 1970s, the Nixon Administration narrowed the meaning of citizen participation in two respects. ‘Citizens’ became individuals again, rather than community members. ‘Participation’ less often meant activism or power than opportunities to speak at formal hearings conducted by others, who would make decisions. Over the years, federal planning, housing, community development, and environmental legislation has come to include provisions for citizen participation, and local practice has followed suit. In general, the balance of these practices has taken the narrower view of citizen participation.

Planning in Europe and Australia has adopted language of citizen participation, with variations in practice similar to those in the United States. United Nations and World Bank policies speak of ‘community participation’ in aid projects. In this context ‘community’ refers to the locus or target of intervention more than an expectation that an entire community participate in planning and implementation. Practice encounters issues of representation and power similar to those in American and European planning.

2. What ‘Citizen Participation’ Could Include But Often Does Not; A Broader Definition

Because ‘citizen participation’ originally referred to involving people in government-initiated programs, common usage often does not include autonomous citizen activities, even though they are acts of involvement in the society and may aim to influence government. This broader ‘citizen participation’ includes three types of activity commonly associated with ‘community organization.’

‘Locality development,’ or ‘community development,’ involves the organization of residents of an area to create the capacity to improve their situation. Projects may address various needs, such as health, housing, education, and infrastructure. Activities may
include popular education to understand local conditions.

‘Social planning’ involves citizens in working with technical data to solve substantive problems. These activities aim to prepare plans for programs to address needs, which may be in many areas. Participants might focus on a single issue, or a community might develop a multifaceted plan.

‘Social action’ involves the organization of a population to press actors in the larger society, usually institutions and often government, to change policies or practices, redistribute economic or social resources, or yield power over these things. Disadvantaged populations may favor this approach, but many issue-oriented groups (such as the environmental movement and consumer groups) may also choose it. Activities include mass mobilization and advocacy.

It is reasonable to consider these activities part of a broad definition of ‘citizen participation’ as efforts by citizens (including, but not only, the poor and minorities) to influence the policies and practices of government, basic social institutions, or their own neighborhood or community. However, in this context the conventional emphasis on efforts to involve citizens in government-initiated activity produces theoretical ambiguity. In addition, those who hold the narrower view may find conflicts in practice when encountering citizens or communities who hold the broader view.

3. Arguments for Citizen Participation; Practical Benefits

Different views of citizen participation are associated with different arguments for it and claims regarding its benefits. A basic distinction can be drawn between a focus on the benefits to government and other service providers and a focus on the benefits to citizens and communities. The first view emphasizes the ways citizens contribute knowledge and, especially, authority, such that program operators can be more confident citizens will use and benefit from services. The second view emphasizes the ways citizens get more power and economic and social resources that promote their development. The first view is more likely to be associated with the narrower view of citizen participation, and the second with the broader view, but the two may overlap and the correspondence is imprecise. Four types of benefits can be identified.

The first emphasizes individual development. Citizen participation has been seen as therapy for the alienated and powerless; involvement in political activity may contribute to self-esteem, a sense of potency, and growth. Inherently, participation in public life makes someone a citizen. Collaboration helps individuals become part of networks that provide social supports. Instrumentally, participation can bring power, knowledge, the ability to solve problems, and improvements in living conditions, including services.

The latter benefits involve sharing in outcomes of collective action, in community development. When members of a community articulate and promote their interests with one another, they can resolve conflicts and discover shared community interests. They will develop relationships they can use in acting together on behalf of those interests. As a result, they may gain knowledge, power, resources, and relationships that enable them to influence institutions and other actors and make decisions that change conditions and solve problems. This may mean having more power over societal resources, becoming more self-sufficient, or both. In general, a community whose members deliberate and make decisions together is likely to have a sense of vitality and potency and attract members’ loyalty.

Organizational development is a third benefit of citizen participation. When citizens contribute knowledge about their needs, service providers are more likely to design and deliver services that meet those needs, solve problems, and are used. When citizens participate in planning programs, they are more likely to develop relationships with organizations, consider programs legitimate, and use or accept them. In general, voluntary organizations, including community associations, need participation because citizens provide three resources essential to organizational operation and development: work, money, and legitimacy.

Finally, citizen participation contributes to societal development. Active citizenship in a democratic framework can be considered inherently good for society. When activists openly discuss their interests, they can resolve conflicts among private interests and discover common, public interests. They can build relationships and learn together about shared conditions. As in communities, citizen participation offers society the possibility of articulating and solving problems knowledgeably and legitimately, with the result that collective actions, including programs, serve more people well.

Some proponents of citizen participation portray it as a means to various ends, whereas others, particularly those concerned with societal development, present it as an end in itself. It can be both, and the delineation of means and ends varies with focus on individuals, communities, organizations, or society.

4. Specific Purposes of Participation Activities; The Issue of Power

The discussion of benefits from citizen participation identifies many possible purposes for activities in which citizens participate in public life. These can be classified as follows:
(a) Communicating information (including perceptions, beliefs, opinions, hopes, expectations, and intentions);
(b) Developing relationships (creating new ones and strengthening existing ones);
(c) Developing the capacity to act and organizing action (including organizing coalitions, planning, strategizing, and creating and exercising power);
(d) Preserving or changing selves and/or others (including policies, practices, conditions, and relationships).

These categories may be seen hierarchically, in that communicating information builds relationships, which in turn develop a capacity to act, which may effect change in some or all parties. However, in practice and in the abstract, the categories overlap: for example, communication may contribute to developing relationships, the creation of which is change and which have the capacity to bring about other change, such as further communication.

In short, citizen participation can enable citizens and others who participate with them to learn and can empower them to act together in ways that would have been impossible otherwise. However, when citizens or those they interact with believe they have conflicting interests, they may participate with other purposes. They may seek to prevent the communication of information, the development of relationships, action by others, and any change in their own practices. Under these conditions, when government officials or funders, for example, organize citizen participation, they may involve citizens but limit their influence.

Sherry Arnstein (1969) analyzed this central question of citizens’ power in terms of ‘a ladder of citizen participation,’ ranging from low to high power:

Nonparticipation
(a) Manipulation: placing citizens on powerless advisory committees for the purpose of ‘educating’ them or gaining their support for others’ positions;
(b) Therapy: trying to change citizens’ beliefs, attitudes, or values as a diversion from analyzing and changing social conditions;
Degrees of tokenism
(c) Informing: giving information to citizens but not requesting or listening to their points of view;
(d) Consultation: asking citizens’ opinions but not making a commitment to act on them;
(e) Placation: placing a few citizens on committees or boards that have limited influence over policies and practices;
Degrees of citizen power
(f) Partnership: sharing decision-making power with citizens;
(g) Delegated power: delegating some decision-making power to citizens;
(h) Citizen control: citizens have considerable independent power over decisions.

This hierarchy focuses on power over overt decisions, such as policies or programs that address problems. This is the most familiar of three ‘levels’ of power. Beneath it are decisions about the issues or problems that go onto the agenda for action. It is possible for citizens to have considerable power in selecting from among alternatives on the table without having influence in determining the array of possibilities. Still one level lower are decisions, frequently tacit, about which social conditions are defined as issues or problems and even considered for the agenda. Citizens may have influence in choosing among recognized issues for a group or organization’s agenda but have little power to draw public attention to certain conditions that trouble them greatly. The purposes of citizen participation may pertain to each of these levels of power.

5. Means of Citizen Participation

The means of citizen participation are related to its purposes. Some means serve single purposes, but most may serve several purposes. Means can be grouped in five general categories.

Groups, Organizations, Committees, Boards, Councils, and Other Institutions are entities that offer arenas where members and others can deliberate, make decisions, and otherwise act. Examples are community organizations and planning councils, agencies that deliver services or administer programs, planning commissions and other boards that guide or regulate public action, citizen review boards, and task forces. All are concerned about and intervene in public life. They vary in the degree to which they involve citizens in activities characterized by purposes high on Arnstein’s ladder.

These entities conduct meetings where participants can communicate information, develop relationships, develop the capacity to act, and act in ways that preserve or change people or things. Meetings may be formal or informal, regularly scheduled or ad hoc, generally oriented toward organizational affairs or focused on specific issues or problems. Meetings may have single or multiple purposes, such as discussion, planning, or decision making. Some organizations restrict citizens to discussion, whereas others may involve them in planning or decision making as well.

These entities can sponsor inquiries, events, or services for one-directional communication of information. Information centers allow citizens to ask for information. Activities in which citizens provide information include public hearings, surveys, focus groups, various structured group activities, referenda, and elections. Organizations vary in their interest in giving information to citizens or receiving it from them.

These entities can sponsor such other action as organizing people and developing and implementing interventions. Examples include place- or issue-based organizing, housing development, school reform, and
health care provision. Line departments in state and holds own or rent and whether they have children.
local government, community organizations, and Representation could be defined primarily in terms of
community development corporations typically spon- positions on issues and secondarily in terms of housing
sor these activities. They vary in the extent to which tenure and family composition. Determination of
they require or can accommodate lay citizens, rather what interests, positions, or characteristics should be
than experts. Organizations vary in efforts to involve represented is consequential because it affects not only
citizens in these activities. which individuals participate, but also which positions
These entities or others can sponsor technical are given voice.
assistance to increase citizens’ ability to join the
entities, participate in meetings, make effective in-
6.2 Differences in Ability to Participate
quiries, and take part in action. Technical assistance
includes classes or workshops, as well as ad hoc or Nevertheless, even if there is agreement about how
ongoing consultation or mentoring. ‘Advocate many and which participants might represent an
planners’ work with or represent citizen groups in overall population, some groups can much more easily
planning. Topics for assistance include means of participate than others. Citizen participation depends
participating in organizations such as using parlia- on motive, opportunity, and means. The three are
mentary procedure, engaging in public discussion, related.
making decisions, and exercising leadership; technical A common motive, particularly with regard to
and analytic tasks such as budgeting, scheduling, and participation in activities sponsored by government or
grant-writing; managerial tasks such as running and other large institutions, is the possibility of influencing
organization and planning; and knowledge about conditions, for the benefit of a private or a public
substantive fields. interest. A strong motive, that often leads citizens to
create or activate an organization, is the desire to resist
an action seen as a threat, such as urban renewal,
6. Participation and Representation highway construction, or massive development.
When participation opportunities arise, individuals
may take part for the pleasure of interacting with
6.1 Meanings of Representation
others—either specific persons or a group in general.
When ‘citizen participation’ refers to individual citi- At the same time, they may assess whether these are
zens, each person can be said to represent him- or opportunities to exercise power or accomplish mean-
herself, and participation and representation are ingful ends. Government activities, for example, may
equivalent. Both can be measured quantitatively, in be linked to formal power but lie beyond local citizens’
terms of how many people or what percentage of influence; a community organization may offer indi-
members of some unit, such as a geographic neigh- viduals considerable opportunity for neighborhood
borhood, participate. influence but have little control over important outside
In contrast, when ‘citizen participation’ refers to a forces.
collectivity, such as a community or a racial or cultural Citizens vary in their ability to take advantage of
group, representation is more complicated and, while these opportunities. Time is a basic resource and
related to participation, separate from it. Represen- constraint. Those who work full time, parents, and, in
tation might be measured as a ratio of participants to particular, single parents have little time for meetings.
total members, but to do so would not take into Middle-class citizens, especially those who have a
account the characteristics that define the identity of flexible schedule, do not work full-time, or are sup-
the group, distinguish subgroups from one another, ported by a spouse, and particularly those without
and demarcate various interests. One common short- children at home, are most likely to have time to
cut is to identify certain demographic characteristics participate in public life.
of the whole—such as race, sex, income, and age—and Thus citizens’ motivations rest on a calculation of
seek participants who resemble the overall population the returns on an investment of time. Different citizens
in these respects. However, this approach assumes that are likely to figure the possibilities for influence
all blacks, women, poor people, or elderly are similar differently, based on their experience, relationships,
and that what matters about and to them is a simple skills, and confidence. The means at their disposal
product of their race, sex, income, and\or age. affect their ability to take advantage of opportunities.
There is no single correct way to define or measure Participation is aided by a cognitive style that allows
representation. Rather, it is useful to consider in a seeing how the particulars of a moment fit into a big
particular instance what aspects of the people involved picture—geographically, socially, and temporally.
matter and how they might be represented. For Planning depends on imagining a medium- to long-
example, in a neighborhood with bad housing, groups term future and having confidence in one’s ability to
may form around distinct positions on improving affect it. Substantive knowledge matters, both for
conditions; their views may be unrelated to race or making informed decisions and for having the con-
income, and associated mainly with whether house- fidence to take part. Procedural expertise is crucial,

including knowing how to organize people, run a productive meeting, resolve conflicts, and direct discussion toward agreement. Social relations are an important source of work, influence, and knowledge about issues and participation opportunities.
The conditions of middle-class life are particularly likely to give citizens these means and motivate them to see and take advantage of opportunities for participation. Extended formal education, professional work, intricate social networks, and general success in influencing the world prepare middle-class citizens to participate in ways that elude many who are poorer.
These conditions encouraged the creation of remedial ‘citizen participation’ activities. Extensive, effective participation of low-income citizens depends on overcoming these obstacles. Time is a difficult constraint to change, but linking civic events with activities in which people already participate is important, as is child care. Training and experience can give people skills and confidence to organize and take an active part in public affairs. Crucially, low-income citizens, as any citizens, are more likely to participate when they see real opportunities for power.

7. Evaluating Citizen Participation Activities

Does citizen participation work? Do some kinds of citizen participation work better than others? How does narrowly defined ‘citizen participation’ compare with the rest of broadly defined citizen participation? Is citizen participation worth the effort: for citizens, for the government, for anyone else?
These are good, crucial questions. Yet they are difficult to answer. First, terms must be defined. Second, little of the relevant empirical evidence is collected in writing. Hence available data offer only specimens of answers from a larger undetermined universe of possibilities. Third, in any case, the most precise answers are contingent: it depends.

7.1 Defining the Questions

Whether citizen participation ‘works’ depends on expectations. Most generally, citizen participation is a strategy for solving a problem. Commonly, problems are defined substantively: for example, the poor are unemployed, low-income city residents lack affordable food shopping, or public health programs do not reach low-income children. Alternatively, some problems may be defined procedurally, and many substantive problems, in fact, have procedural roots: for example, the city health department, local hospitals, the state Medicaid division, and low-income health advocates cannot agree on an approach for taking care of indigent children.
One might ask whether citizen participation in planning for such problems contributed to better solutions than would have been likely otherwise. However, because there cannot be a contrasting case where everything was the same except for citizen participation, it is impossible to answer that question. Moreover, the purpose, or at least the effect, of citizen participation may be to redefine the problem. In particular, in the case of procedural problems defined by the exclusion of citizens, citizen involvement in itself may provide a solution. Hence it is reasonable simply to study events to assess the influence of particular actions by specific people.
Still, citizens may participate in activities that solve problems without themselves having influence over the results. Tensions between the narrow and broad notions of ‘citizen participation,’ as well as conflicts over substantive and procedural issues, will lead others to try to limit the power of certain citizens. To evaluate the effects of citizen participation, it is necessary to examine the roles and influence of particular groups of citizens at each level of power.
Finally, citizens may participate powerfully in solving problems, but those who take part may not represent a larger group in whose name they speak. An elite may deliberately try to corner power for themselves. Alternatively, a small group of activists who have the skills, time, and connections to participate in wide-ranging activities may, with the most generous of intentions, take leadership, work closely together, succeed in advancing projects, develop proprietary feelings for them, and continue along the easy path of standing for a larger group. They may succeed in solving problems for members of their larger community, who can be said to be represented by the outcomes, but not in the process.
Thus the question of whether citizen participation works complexly combines issues of effectiveness in solving problems, power, and representation. Still, answers to the question do not affect the normative position that citizen participation in public affairs is a democratic right.

7.2 The Haphazardly Collected Empirical Data

Few citizens record their participatory experiences, probably because few have both the interest and time. Perhaps those who consider themselves successful are most likely to document their activities or draw the attention of observers who will do so. For these reasons there is no way of knowing what universe reported cases represent.
Moreover, it is unclear what should be considered a sufficient account of a case. While many case studies focus on specific ‘citizen participation’ activities, their context shapes their effects. Hence it would be informative to see the entirety of influences on citizen participation in a case, but it may be uncertain at what distance from the ‘main action’ to draw boundaries around the case. Within those boundaries there is the question of what the unit of analysis should be, that is, what should be taken as the unit of citizen participation that
is to be measured for its effects. It might be an incident, an individual or collective actor, a relationship, a strategy, a tactic, another action, or a combination of these things. Further, it is unclear what characteristics of these elements might matter. For example, when might citizens’ age, education, family composition, housing tenure, income, length of residence, occupation, place of residence, race, religion, or sex be pertinent to their participation in public affairs and the effects of their activities? And how would the context of the case affect the influence of the unit of citizen participation?
These questions are conceptual but also empirical, dependent on data collection and analysis. Little available material is robust enough to address all these questions, and, as a result, it is difficult to compare the effects of citizen participation in different contexts. These questions should direct the recording of cases. For now, the available data should be interpreted as complexly as possible. They offer specimens (examples of what citizen participation may accomplish) which could become the basis for generalizations.
One can find examples of success and failure. There are instances where organized citizens planned and developed programs. There are instances where organized citizens stopped projects they opposed. In these cases, citizens have benefited from such resources as organization, money, effort, skill, knowledge, planning, strategy, alliances, time, and commitment. These findings are consistent with political science research.

See also: Citizenship and Public Policy; Civil Society, Concept and History of; Community Power Structure; Community, Social Contexts of; Community Sociology; Neighborhood Revitalization and Community Development; Participation: Political; Political Representation; Poverty Policy; Poverty, Sociology of; Urban Poverty in Neighborhoods

Bibliography

Arnstein S R 1969 A ladder of citizen participation. Journal of American Institute of Planners 35: 216–24
Baum H S 1997 The Organization of Hope. SUNY Press, Albany, NY
Checkoway B 1995 Six strategies of community change. Community Development Journal 30: 2–20
Checkoway B, Pothukuchi K, Finn J 1995 Youth participation in community planning: What are the benefits? Journal of Planning Education and Research 14: 134–9
Fisher R 1984 Let the People Decide. Twayne, Boston
Forester J 1989 Planning in the Face of Power. University of California Press, Berkeley, CA
Frieden B J, Kaplan M 1975 The Politics of Neglect. MIT Press, Cambridge, MA
Glass J J 1979 Citizen participation in planning: the relationship between objectives and techniques. Journal of the American Planning Association 45: 180–9
Heskin A D 1991 The Struggle for Community. Westview Press, Boulder, CO
Hinsdale M A, Lewis H M, Waller S M 1995 It Comes from the People. Temple University Press, Philadelphia
Marris P, Rein M 1982 Dilemmas of Social Reform, 2nd edn. University of Chicago Press, Chicago
Medoff P, Sklar H 1994 Streets of Hope. South End Press, Boston
Midgley J, Hall A, Hardiman M, Narine D 1986 Community Participation, Social Development and the State. Methuen, London
Mogulof M B 1970 Citizen Participation: A Review and Commentary on Federal Policies and Practices. Urban Institute, Washington, DC
Moynihan D P 1970 Maximum Feasible Misunderstanding. Free Press, New York
Narayan D, Ebbe K 1997 Design of Social Funds; Participation, Demand Orientation, and Local Organizational Capacity. World Bank Discussion Paper No. 375. World Bank, Washington, DC
Oakley P, Marsden D 1984 Approaches to Participation in Rural Development. International Labour Office, Geneva, Switzerland
Rosener J B 1975 A cafeteria of techniques and critiques. Public Management 57: 16–19
Rothman J, Tropman J E 1987 Models of community organization and macro practice perspectives: Their mixing and phasing. In: Cox F M, Erlich J L, Rothman J, Tropman J E (eds.) Strategies of Community Organization, 4th edn. Peacock Publishers, Itasca, IL
Spiegel H B C 1968 Citizen Participation in Urban Development; Vol. I Concepts and Issues. NTL Institute for Applied Behavioral Science, Washington, DC
Stoecker R 1984 Defending Community. Temple University Press, Philadelphia, PA

H. S. Baum

Citizenship and Public Policy

The conceptualization of the relationships between citizenship and public policy depends critically on how these terms are defined, and in turn on the sociopolitical context in which they were developed and are located. In the social sciences, these concepts are the products of different histories and significance within Europe and North America. In the European context, public policy is directly connected with social citizenship in the historical context of the emergence of the welfare state and industrial capitalism. Social citizenship can only be understood in the broader historical context of social class relations, the modern capitalist economy and the nation state (Turner 1986). For radical sociologists, citizenship and public policy were state strategies to secure the political compliance of the urban working class (Mann 1987). In the USA, citizenship was originally defined in political terms by reference to the individual rights that followed from the War of Independence and the framing of the Constitution. American citizenship was essentially about nation-building, that is, about the creation of a ‘people’ in relation to the aspiration of political leaders
(Shklar 1991). In the formation of the American people, political leaders created diverse ‘civic ideals’ (Smith 1997) that blended liberal, democratic republican, and inegalitarian ascriptive elements. In the late twentieth century, following the civil rights movements and intensive waves of immigration, social citizenship is related to government policies towards naturalization, ethnic integration, and multiculturalism. It is difficult to think of American citizenship without thinking of the ‘American dilemma’ of race relations (Myrdal 1944). Because the historical development of the welfare states in Europe and America is fundamentally different, the terms of debate have different meanings, functions, and significance. However, with globalization there is some convergence between these different traditions as nation states confront cultural hybridity, a global labor market, and the partial erosion of national sovereignty.
Conventional forms of citizenship were associated with the modernization of society and with the development of the administrative framework of the modern states. In the nineteenth century, for example in the public policies of Bismarck, there emerged a close relationship between nationalism, social insurance, and state formation. The formation of national identity, political integration, and citizenship were aspects of the modernization of state administrations, and citizenship came to form a basis for the convergence of national and masculine identity (Nelson 1998). Nineteenth-century nationalism meant that under the orchestration of the nation state, public policy was also cultural policy. However, with globalization, the sovereignty of the state has been compromised, and the modern debate about policy and citizenship has to be set within the political constraints of a global economy.
The concept of citizenship in British social theory has passed through several stages, from the idealism of T. H. Green to the welfare theories of T. H. Marshall and Richard Titmuss. Marshall’s Citizenship and Social Class and Other Essays (1950) and Social Policy in the Twentieth Century (1967) have dominated recent debate about social citizenship. The principal question in this tradition has concerned how far a comprehensive welfare policy can be effectively implemented in a capitalist system without either destroying the profitability of capitalist enterprises through excessive taxation or compromising the principles of redistributive justice (Loney et al. 1983). Skeptics of the Marshallian approach have claimed that welfare benefits contributed more to the wellbeing of the middle class than the working class. The contemporary British welfare debate has concentrated on the question of European union, the effects of a centralized bureaucracy, and the possibility of implementing a social wage.
These policy debates can be contrasted with the American tradition. One might argue controversially that, while the European policy debate has been about the possibility of equality in a capitalist economy, the American legacy, drawing on the work of both Alexis de Tocqueville (1966) and John Dewey (1963), has been concerned with electoral democracy and political access. The Tocquevillian position has been to emphasize the contribution of churches, voluntary associations, and community initiative in the delivery of public policy. However, the distinctively American contribution to public policy has been from pragmatism which, from John Dewey to Richard Rorty (1998), has placed its policy aspirations in the role of education for access and participation (Diggins 1994). It is also important to recognise that philanthropy has played a much larger role in policy development and delivery in America than in Europe (Wuthnow 1996).
In general, citizenship establishes the broad legal and social parameters within which public policy is set, but public policy creates the administrative and legislative framework within which citizens can effectively enjoy their rights. Historically, institutionally, and analytically, citizenship and public policy are interconnected and interdependent. Citizenship is a collection of rights and obligations that gives individuals a formal juridical identity. Social citizenship involves social membership, a distribution of rewards, the formation of identities, and a set of virtues relating to obligation and responsibility. It is constituted by social institutions such as the jury system, parliament, and welfare states. Whereas the history of political institutions identifies the origins of citizenship with the Greek polis, social citizenship, as the outcome of the American and French Revolutions, is an essentially modern conception. It is analytically a product of modern liberal theory, specifically de Tocqueville’s theory of democratic association.
The social rights of social citizenship are very different in their consequences in contrast with legal and political rights. Social rights require the provision of social services and transfer payments, and as a result involve the state in public expenditure. They also require some administrative structure to deliver these services, and hence further involve the state in fiscal management. Recognition of this fiscal role of the state gave rise to the famous division of welfare by Titmuss (1958) into social welfare (education, health, and social services); fiscal welfare (allowances and relief from taxation); and occupational welfare (benefits received by employees through employment). This provision of social rights clearly illustrates the tensions and contradictions between the state and the market. Radical sociologists like Jürgen Habermas (1976) predicted a ‘legitimation crisis,’ because the growth of state expenditure in response to electoral pressure would necessarily undermine capitalist profitability. One can argue therefore that the neoliberal policies of the governments of Ronald Reagan and Margaret Thatcher involved ‘rolling back the state,’ the privatization of welfare, and a return to third-sector involvement through charities, philanthropic in-
stitutions, and voluntary associations. These strategies sought to restore profitability through deregulation, subsidiarity, and community initiatives, but also emphasized the importance of individual and family responsibility for welfare delivery. Conservative critics of bureaucratic welfare argued that welfare benefits undermined the family through payments to single parents, who as a result had no necessary incentive to work and clear incentives not to marry. In the late twentieth century, so-called ‘third-sector strategies,’ which encouraged local initiative, community development, and voluntary sector delivery, became fashionable, not only in North America and Europe but also in Australia and New Zealand (Brown et al. 2000).
The notion of ‘public policy’ also requires some preliminary clarification. It is a general term to describe the efforts of governments to coordinate the provision of a variety of governmental services and utilities. Public policy expresses the political intentions and choices of government, and creates the framework within which social planning takes place. ‘Public policy’ is often referred to as ‘social policy,’ typically when a broader philosophical dimension is involved. Social policy is often distinguished from public policy by its broad welfare dimension; it attempts to regulate the provision of five social services: housing, education, health, social security, and personal social services. The connection between policy and citizenship was overtly expressed by Marshall, who claimed that social policy includes the general policy of governments with regard to action having a direct and explicit impact on the welfare of citizens by providing them with services or income. Social Administration is the science of the provision of such services through social policy, and in Britain it was established as a discipline at the London School of Economics in 1912. In Europe in the 1840s, the growth of social medicine expressed the idea that the health of individuals should involve the state in the policing of society. Public policy and social citizenship have to be understood as aspects of the growth of state administration designed to bring about social order in a world of expanding capitalism.
For detailed treatment of these and related fields see Health Policy; Immigration: Public Policy; Industrial Policy; Policy History: Regimes; Poverty Policy; Public Health; Social Insurance: Legal Aspects.

1. Citizenship and the Rise of Public Policy

Both citizenship and public policy are modern institutions. Neither social citizenship nor public policy could exist without the nineteenth-century expansion of public administration and law. In pre-modern times, it is clear that the church responded to the satisfaction of need through principles of reciprocity and redistribution. Ecclesiastical responses to poverty and need in the Middle Ages were located within a theological debate about economic exchange in which low wages were interpreted through a moral discourse on justice. The church developed various institutional responses to poverty and created the modern framework of charity (Troeltsch 1912). Before the development of the capitalist market system, the poor could rely on custom and communal organizations for subsistence. In the medieval period, guilds and fraternal associations in the towns provided welfare benefits to members and their dependants. These guilds often evolved into mutual societies and the principles of mutuality often survived the growth of capitalism. In traditional societies, it was the church, the patriarchal family, or the landlord who determined the conditions of survival, rather than the labor contract.
We can see the origins of public policy as a social struggle over the conditions by which markets function. In his classic study of The Three Worlds of Welfare Capitalism, Esping-Andersen (1990, p. 35) claimed that ‘the mainsprings of modern social policy lie in the process by which both human needs and labor power became commodities and, hence, our wellbeing came to depend on our relation to the cash nexus.’ However, the impact of the market is modified or moderated by the growth of social citizenship which provides, through public policy initiatives, a variety of safety nets to protect the individual from the full vagaries of unemployment, sickness, and disability. With the decline of traditional patterns of social reciprocity, citizenship expressed through public policy reduces the contingencies of dependence on the market.
The historical controversy in the social sciences has been around the capacity of public policy and citizenship to bring about an effective redistribution of resources. There are broadly two views about the historical role of welfare states. Radical critics of capitalism (O’Connor 1973, Piven and Cloward 1972) have argued that public policy aimed at creating social citizenship is primarily concerned with legitimating governments and protecting them from political instability arising from social conflicts, specifically from a revolutionary working class. Liberal theories of welfare (Lipset 1960) argue that welfare policies reduce the material causes of class conflict, incorporate the working class and enhance democratic access to the state. For sociologists like Frank Parkin (1979), social welfare transforms class conflict into status competition. In broad terms, one can conclude that the production of inflationary pressure of welfare benefits from public policy initiatives has been regarded by the state and its elites as more tolerable than civil conflict and revolutionary protest. Whether or not public policies to institutionalize social citizenship have a real effect on social equality will depend on what type of welfare state (if any) is created by government actions,
and how these policies have emerged historically in relation to different forms of capitalism.
In recent scholarship, the radical or Marxist paradigms have been less influential, and debate about public policy and government agency has been influenced by the work of Michel Foucault. Social scientists have in particular recognized the importance of Foucault’s concept of ‘governmentality’ as a paradigm for understanding the microprocesses of administration and control within which self regulation and social regulation are united (Foucault 1991). The concept of ‘governmentality’ provides an integrating theme that is concerned with the sociopolitical practices or technologies by which the self is constructed. ‘Governance’ or ‘governmentality’ refers to the administrative structures of the state, the patterns of self-government of individuals and the regulatory principles of modern society. Foucault argued that governmentality has become the common foundation of all forms of modern political rationality, that is, the administrative systems of the state have been extended in order to maximize the state’s productive control over demographic processes. This extension of administrative rationality was first concerned with demographic processes of birth, morbidity, and death, and later with the psychological health of the population. ‘Governmentality’ can be seen as an administrative rationality that produces the modern self as a consequence of social services. This perspective has been valuable in understanding the growth, for example, of social gerontology as a science, a component of public policy, and as the basis for new professions to discipline the elderly (Katz 1996).
Foucault’s historical inquiries gave rise to a distinctive notion of power, in which he emphasized the importance of its local or micro manifestations, the role of professional knowledge and expertise in the legitimation of such power relationships, and the productive rather than negative characteristics of the effects of power. His approach can be contrasted usefully with the concept of power in traditional Marxist sociology, where power is visible in terms of the police and army, concentrated in the state, and ultimately explained by the ownership of the economic means of production. In the Marxist perspective, power is typically negative and signifies a system of institutions that contain, prohibit, and control. Foucault’s view of power is more subtle, with an emphasis on the importance of knowledge and information in modern means of surveillance.
‘Governmentality’ is the generic term for these power relations. It was defined as ‘the ensemble formed by the institutions, procedures, analyses and reflections, the calculations and tactics, that allow the exercise of this very specific albeit complex form of power, which has as its target populations’ (Foucault 1991, p. 102). The importance of this definition is that the power of the state in the modern period has been less concerned with sovereignty over things (land and wealth) and more concerned with maximizing the productive power of administration over population and reproduction. Furthermore, Foucault interpreted the exercise of administrative power in productive terms, that is, enhancing population potential through, for example, state support for the family. The state’s involvement in, and regulation of, reproductive technology is an important example of governmentality in which the desire of couples to reproduce is enhanced through the state’s support of new technologies. The existence of a demand for fertility is supported by a profamilial ideology that regards the normal household as a reproductive social space. These Foucauldian perspectives have been useful in providing an historical understanding of the relationships between family, state, and public policy. Feminist criticism of neoliberal policy to protect the family has noted that these policy initiatives implicitly or explicitly ignore the presence of married women in the labor force (Wilson 1977).

2. The Dimensions of Citizenship

In historical terms, social citizenship and public policy have been shaped by the relationship between state and market. Public policy attempts to promote a set of general social conditions through which effective entitlement to social resources can be sustained. Welfare has, in practice, never been an unconditional right; entitlements have, in reality, been tied to contribution. Entitlements to benefits in liberal welfare systems have typically been through work, war, and reproduction.
Work was fundamental to the conception of citizenship in the welfare state as described in W. H. Beveridge’s Social Insurance and Allied Services (1942) and Full Employment in a Free Society (1944). Individuals could achieve effective entitlements through the production of goods and services, namely through gainful employment, which was essential for the provision of adequate pensions and superannuation. These entitlements also typically included work care, insurance cover, retirement benefits, and healthcare. Citizenship for male workers characteristically evolved out of economic conflicts over conditions of employment, remuneration, and retirement. Service to the state through warfare generates a range of entitlements for the soldier–citizen. Wartime service typically leads to special pension rights, health provisions, housing, and education for returning servicemen and their families. War service has been important, as we have seen, in the evolution of social security entitlements (Titmuss 1962).
Finally, people achieve entitlements through the formation of households and families, which become the mechanisms for the reproduction of society through the birth, maintenance, and socialization of children. These services increasingly include care for

Citizenship and Public Policy

the aging and elderly as generational obligations
continue to be satisfied through the private sphere
(Finch 1989). These services to society through the
family provide entitlements to both men and women
as parents, that is as reproducers of the nation. These
familial entitlements become the basis of family
security systems, various forms of support for
mothers, and health and educational provision for
children. Although the sexual activity of adults in
wedlock is regarded in law as a private activity, the
state and church have clearly taken a profound interest
in the conditions for and consequences of lawful (and
more particularly unlawful) sexual activity. Following
Foucault, we can argue that heterosexual reproduction
has been a principal feature of the regulatory activity
of the modern state. It is evident that the values and
norms of a household constituted by a married
heterosexual couple provide the dominant ideal of
British social life, despite the fact that 49 percent of
live births in 1998 occurred outside marriage and that
among the majority of one-family households 30
percent had no children. In fact, the moral force of the
idea of marriage and domesticity is so compulsive in
contemporary society that in America a number of
states are considering legislation that would enable
gay couples to form ‘civil unions,’ entitling them to
about 300 rights and benefits currently available under
state law to married heterosexual couples.

3. Regulatory Regimes and Public Policy

This liberal pattern of public policy for social citizenship
has been eroded because the three foundations
of effective entitlement have been transformed by
economic, military, and social changes. It may sound
perverse to suggest that in contemporary Britain the
decline of economic participation has brought about
an erosion of citizenship, when participation in the
labor force has been rising continuously since the early
1990s. Increasing economic activity has been especially
important for women; between 1971 and 1999, the
proportion of the adult female population being
economically active increased from 56 to 72 percent.
However, high levels of economic participation mask
a real change in the nature of the economy and obscure
a transition from old to new welfare regimes. The new
economic regime is based on monetary stability, fiscal
control, and a reduction in government regulation of
the economy. In this new economic environment, one
version of the ‘Third Way strategy’ involves not
protecting individuals from the uncertainties of the
market that had dominated welfare strategies between
1930 and 1970, but helping people to participate
successfully in the market through education (lifelong
learning schemes), flexible employment (family-friendly
employment strategies), and tax incentives
(Myles and Quadagno 2000). However, while increasing
rates of economic activity have been a positive
aspect of economic liberalization, much of this increase
in economic participation has required the
casualization of the labor force. While the number of
men in part-time employment doubled between 1984
and 1999, radical changes in the labor market (job
sharing, casualization, flexibility, downsizing, and new
management strategies) have disrupted work as a
career. While for employers functional and numerical
flexibility has broken down rigidities in the workplace,
these strategies have compromised job security (Abercrombie
and Warde 2000, p. 81).
Sociological studies of social class suggest that,
while levels of unemployment have been falling in
association with the long American economic boom of
the 1990s, the contemporary class structure has new
components: an ‘underclass’ of the permanently unemployable
(typically single-parent welfare claimants),
a declining middle class associated with the
decline of middle management, and the ‘working poor’
whose skill levels do not permit upward mobility
(Sennett 1998). There is some academic consensus that
features of the class structure do not encourage active
citizenship through economic entitlement, but these
changes in the nature of employment are perhaps
insignificant when compared to the graying of the
population and the social problem of retirement. It is
clear that the stereotype of the elderly as a dependent
and passive population in disengagement theory is
false, but it is also true that the aging of the population
has important implications for the shape of the
working population and for employment as a basis for
entitlement. Intergenerational conflict in the struggle
over resources is likely to become an important
element in social divisions in this century.
Public policy and social citizenship were aspects of
modern state administration that developed in the
nineteenth century. We can interpret this administrative
complex as a response to social class conflicts in
the development of modern capitalism. Social rights
were granted as political mechanisms to secure an
acceptable level of social solidarity. Walter Bagehot
(1963) in his The English Constitution warned in 1867
of the dangers of working class combination, which
would result in the ‘supremacy of ignorance,’ and
encouraged the ‘higher classes’ to exercise wisdom and
foresight. An extension of social rights was a prudent
response to the dangers of combination among the
‘lower classes’ and social citizenship has remained a
valuable public policy option to avoid civil war.
However, social citizenship and capitalism have remained
in a state of permanent tension, as Marshall
recognized in his concept of ‘hyphenated society’ in
The Right to Welfare and Other Essays (Marshall
1981). There is a permanent tension between the
liberal rights of individual freedom in a capitalist
marketplace and principles of equity and justice that
require state interventions to protect social rights.
Public policy and welfare systems oscillate between
different regulatory environments. A regulatory
regime may be defined as ‘a historically specific


configuration of policies and institutions which
structures the relationship between social interests, the
state, and economic actors in multiple sectors of the
economy’ (Eisner 1993, p. 1). These regimes fluctuate
between interventionist welfare systems and deregulated
privatized systems. For example, in retrospect,
we can see that American public policy passed
through different regulatory regimes in the Progressive
Era, the New Deal, postwar reconstruction, and the
contemporary period. Following the depression,
President Roosevelt and his advisors introduced a
legislative programme aimed at social and economic
recovery. The National Industrial Recovery Act was
characteristic of the new policy regime, but the
Progressive Era still depended on market forces. The
New Deal and postwar reconstruction required an
alliance between the state and capital to conduct a
global war and then to achieve an economic recovery.
In wartime Britain, the social policies of John
Maynard Keynes, as expressed in The General Theory
of Employment, Interest and Money (1936), were
adopted to finance the war. Social Keynesianism also
formed the basis for postwar reconstruction where the
policy of supporting public works, such as direct
investment in infrastructure projects, promoted the
recovery of employment. These public policies were in
direct opposition to the ‘Treasury view,’ which advocated
fiscal constraint and limited government intervention.
In the late twentieth century, there was a
departure from interventionist policies; environmental
controls and workers’ safety were not primary objectives,
and were seen to conflict with corporate
profitability. In both the USA and UK, there has been
a similar history involving a rapid departure from the
principles of the New Deal, Social Keynesianism, and
the postwar consensus in favour of policies that do not
assume, for example, full employment or universal
criteria in the provision of pensions.

4. Conclusion: Public Policy and Globalization

This historical pattern of policy options changed
significantly in the last two decades of the twentieth
century with the emergence of a global economy.
Although it would be an exaggeration to claim that
globalization has resulted in the decline of national
sovereignty (Hirst and Thompson 1996), it is the case
that global economic processes constrain the capacity
of national governments to make independent decisions
about national public policy. For example,
volatile financial markets responding to global information
systems can undercut national public policy
through the collapse of local currencies. The Asian
financial crisis of the late 1990s was a dramatic
illustration of how global uncertainty can destabilize
currencies and prevent governments from adhering to
public policies that have inflationary consequences.
Volatile markets, the fragmentation of public policy
by constant restructuring of government agencies and
the economy, an emphasis on individual responsibility
and subsidiarity, and the recommodification of
services through neoliberalism have encouraged some
social scientists to argue that public policy has become
postmodern (Petersen et al. 1999). A new policy
environment may involve the globalization of delivery
through global enterprises that provide services for
governments under ‘outsourcing’ arrangements. The
growth of private prisons, managed by global enterprises,
to replace or supplement state prison services
would be one example. Social experiments in policy
delivery in a global context may not be postmodern,
but they will certainly be increasingly translocal. As a
consequence, the conventional mixture of state, market,
and voluntary sector as the framework of public
policy will change radically to reflect the changing
circumstances of social citizenship in a global economy.

See also: Civil Society, Concept and History of; Civil
Society/Public Sphere, History of the Concept;
Citizenship, Historical Development of; Citizenship:
Political; Citizenship: Sociological Aspects; State
and Society; Welfare State

Bibliography

Abercrombie N, Warde A 2000 Contemporary British Society. Polity Press, Cambridge, UK
Bagehot W 1963 The English Constitution. Collins, London
Barbalet J M 1988 Citizenship. Open University Press, Milton Keynes, UK
Beveridge W H 1942 Social Insurance and Allied Services. HMSO, London
Beveridge W H 1944 Full Employment in a Free Society. Allen & Unwin, London
Brown K, Kenny S, Turner B S 2000 Rhetorics of Welfare. Macmillan, Basingstoke, UK
de Tocqueville A 1966 Democracy in America. Doubleday, New York
Dewey J 1963 Freedom and Culture. Capricorn, New York
Diggins J P 1994 The Promise of Pragmatism. Modernism and the Crisis of Knowledge and Authority. University of Chicago Press, Chicago
Eisner M A 1993 Regulatory Politics in Transition. The Johns Hopkins University Press, Baltimore
Esping-Andersen G 1990 The Three Worlds of Welfare Capitalism. Polity Press, Cambridge, UK
Finch J 1989 Family Obligations and Social Change. Polity Press, Cambridge, UK
Foucault M 1991 ‘Governmentality’. In: Burchell G, Gordon C, Miller P (eds.) The Foucault Effect. Studies in Governmentality. Harvester, London, pp. 87–104
Habermas J 1976 Legitimation Crisis. Heinemann, London
Hirst P, Thompson G 1996 Globalization in Question. Polity Press, Cambridge, UK
Katz S 1996 Disciplining Old Age. The Formation of Gerontological Knowledge. University of Virginia Press, Charlottesville, VA
Keynes J M 1936 The General Theory of Employment, Interest and Money. Macmillan, London


Lipset S M 1960 Political Man. The Social Bases of Politics. Doubleday, Garden City, NY
Loney M, Boswell D, Clarke J (eds.) 1983 Social Policy and Social Welfare. Open University Press, Milton Keynes, UK
Mann M 1987 ‘Ruling class strategies and citizenship’. Sociology 21(3): 339–54
Marshall T H 1950 Citizenship and Social Class and Other Essays. Cambridge University Press, Cambridge, UK
Marshall T H 1967 Social Policy in the Twentieth Century. Hutchinson, London
Marshall T H 1981 The Right to Welfare and Other Essays. Heinemann, London
Myles J, Quadagno J 2000 ‘Envisioning a Third Way.’ Contemporary Sociology 29(1): 156–67
Myrdal G 1944 An American Dilemma. Harper, New York
Nelson D D 1998 National Manhood. Capitalist Citizenship and the Imagined Fraternity of White Men. Duke University Press, Durham, NC
O’Connor J 1973 The Fiscal Crisis of the State. St Martin’s Press, New York
Parkin F 1979 Marxism and Class Theory. A Bourgeois Critique. Tavistock, London
Petersen A, Barns I, Dudley A, Harris P 1999 Poststructuralism, Citizenship and Social Policy. Routledge, London
Piven F, Cloward R A 1972 Regulating the Poor. Tavistock, London
Rorty R 1997 Achieving Our Country. Leftist Thought in Twentieth-Century America. Harvard University Press, Cambridge, MA
Sennett R 1998 The Corrosion of Character. The Personal Consequences of Work in the New Capitalism. W W Norton, New York
Shklar J N 1991 American Citizenship. The Quest for Inclusion. Harvard University Press, Cambridge, MA
Smith R 1997 Civic Ideals. Conflicting Visions of Citizenship in US History. Yale University Press, New Haven, CT
Titmuss R 1958 Essays on the Welfare State. Allen & Unwin, London
Titmuss R 1962 Income Distribution and Social Change. A Case Study in Criticism. Allen & Unwin, London
Titmuss R 1968 Commitment to Welfare. Allen & Unwin, London
Troeltsch E 1912 The Social Teaching of the Christian Churches. University of Chicago Press, Chicago
Turner B S (ed.) 1993 Citizenship and Social Theory. Sage, London
Turner B S 1986 Citizenship and Capitalism. The Debate over Reformism. Allen & Unwin, London
Wilson E 1977 Women and the Welfare State. Tavistock, London
Wuthnow R 1996 Poor Richard’s Principle. Recovering the American Dream through the Moral Dimension of Work, Business and Money. Princeton University Press, Princeton, NJ

B. S. Turner

Citizenship, Historical Development of

Citizenship means membership in a political community.
As membership, citizenship confers the status
of equality among all citizens with respect to the rights
and duties that the status implies. Citizenship also
signifies a form of active behavior towards the community,
which constitutes the good and responsible
citizen. These two basic meanings of citizenship apply
to all of the historical phases that the formation of
citizenship as subject and concept has undergone. The
politico-legal status and the ideal of civic virtue
constitute the two aspects of a historical concept that
has taken on a variety of further meanings and
functions over time.
It has shifted between membership in ancient
communities and legal membership in the modern
state, between a concrete legal status and a wide-ranging
concept within political theory. It has been
used to describe substantive rights and obligations as
well as to sketch a normative ideal of politics. While
the theoretical claims of citizenship often tend to be
universal, their meaning varies according to historical
and national context. The concept of citizenship is
rooted in social ethics and institutionalized by law. As
such, it is an object of political theory. To analyze the
historical development of citizenship means to combine
the development of norms of conduct regarding
good civic behavior with the legal institutionalization
of citizen status and its theoretical conceptualization.
It also means tracing the trajectory from a status of the
city-state community to a key concept of modern
democracy and global political theory.

1. Citizenship as a Historical Subject

Every human group has developed institutions by
which to define its members and procedures for
making new members. The Greeks, however, were the
first society to combine the legal provisions of membership
with a political theory of membership virtues
and institutions in order to perpetuate their idea of
citizenship. The legal status of citizenship established
equality among Athenian citizens in terms of their
rights and obligations. The status of equal membership
marked privilege vis-à-vis nonmembers. The overall
legal status of Athenian citizenship was defined by
sharp boundaries, which distinguished Athenians
from foreigners, resident aliens, and slaves. The
minority of full citizens faced a majority of nonprivileged
noncitizens whose rights were severely
circumscribed. Apart from the legal framework, the
concept of Athenian citizenship included a set of
citizen values, behaviors, and communal attitudes.
‘Passive’ legal citizenship was not necessarily, but
ideally and practically, linked to democracy by the
civic virtues of an ‘active’ citizen, which permitted him
to ‘share in the polis,’ a politically active and autonomous
community (Manville 1990). Greek city-state
citizenship and its dual construction as legal framework
and civic ideal was conceptualized in the political
philosophy of Aristotle as a model of ‘ruling and being
ruled in turn.’ It became the model of citizenship as
such.
Roman citizenship was more complex, expansive,
and legalistic than its Greek counterpart. The history


of Roman citizenship over eight centuries, from the
end of the monarchy to the decline of the Roman
Empire, reveals the stages of its development from an
instrument for defining the city-state community to
one for the legal integration of an extensive world
empire. The Twelve Tables established the quality of
Roman citizenship as a legal benefit and attractive
political and social status. In the fourth century BC,
with the territorial expansion of Roman rule over
Latium, Roman citizenship was conferred upon annexed
enemies and (Italian) allies alike as an instrument
for promoting loyalty and political integration. By the
first century BC, citizenship of the Roman Republic
had reached its greatest ‘density’: the highest level of
participatory rights in the government of the republic
and equality before the law accompanied a golden age
of civic education and virtues. This construction lost
its balance in the Roman Empire. The largest territorial
expansion of Roman citizenship, its conferral
upon nearly all men within the confines of the Empire
except for slaves, meant the erosion of its quality as a
privilege and a proof of superior morality. With the
decline of the Empire, the diminution in standards of
citizenship corresponded to the degeneration of civic
educational standards. Although the concept of citizenship
did not necessarily depend on democracy, its
development was most favored by the political conditions
of a strongly participatory, republican order.
A centralized, differentiated legal order within the city
or state had, at least, proved to be the indispensable
precondition for the development of citizenship.
With the loss of a central, universal legal structure at
the end of the Roman Empire, citizenship became obsolete
as a political concept.
Christianity rejected the model of political order to
which ancient philosophy, especially Aristotle and
Cicero, had contributed. It developed a complete
alternative system of social and moral values, which
helped to establish a new institutional political order.
The rise of Christian corporatism and its unlimited
commitment to the Kingdom of God produced a
dichotomous organization of the body politic. A dual
system arose of metaphysically based allegiance to the
Church and personal allegiance to the monarch. A
system of multifaceted loyalty replaced the concentration
of the citizen’s loyalty upon the state. The
notion of citizenship was transferred to a spiritual
‘City of God’ (Augustine). Thus, the institutional
point of reference for the ancient model of citizenship
disappeared. In the Middle Ages, the term ‘citizenship’
(French: citoyenneté) did not describe an Englishman’s
(or Frenchman’s) relationship with his country. In
keeping with its immediate etymological origins
(‘city’), it referred solely to the rights and duties of a
free city- or town-dweller, i.e., to a local, urban
community.
The lack of a centralizing and nationalizing state
power in parts of medieval Europe left room for the
growth of a strong, urban citizenship, particularly in
the city-states of northern and central Italy. Along
with the weakness of the monarchy, a certain continuity
of Roman law and civic responsibility formed
the backdrop for the upsurge of a politically active
municipal life with a high participation of the citizenry
in decision-making. A vigorous philosophical revival
of the Aristotelian concept of citizenship paved
the way for a resurgence of the classical concept
of citizenship during the Renaissance. The founding of
universities, which promoted the development of
Roman law, a vital communal life, and a rich theoretical
literature on a rational and egalitarian order
(particularly the work of Niccolò Machiavelli) made
the city-state world of Upper Italy a birthplace of
modern citizenship.
The theory and political order of state sovereignty
in European absolutism with its concentration of
central state power and its unambiguous claim to
loyalty prepared the ground for a national rather than
local concept of modern citizenship. The personal and
territorial delimitation of sovereign states and the
ordering of their international relations increased the
problem of defining membership in and allegiance to
the state. The call for (individual) rights of religious
freedom and self-government, strengthened by religious
opponents of the state, and revolutionary
efforts at founding government upon popular sovereignty,
for example in England, were the predecessors
to active citizenship. The institutional
foundations for such participatory claims and rights
remained narrow, however. They were often confined
to the level of local government or socially privileged
groups. The contrast between a basically oligarchic
and a democratic political citizenship (Heater 1990)
became evident.
The eighteenth century brought the breakthrough
of citizenship on a central level of political theory and
institutions. The word ‘citizen’ (and its French
equivalent citoyen) became a key concept in the
legitimation of the political struggle against the feudal
ancien régime (Gosewinkel 2001), as well as in the
struggle for equality before the law, freedom from
religious discrimination and arbitrary arrest, and the extension
of political rights and popular democracy.
The decisive step from a political and educational program
to a legal guarantee was taken by American
constitutionalism, however. The revolution of the
American colonies against the ties and obligations
of subjecthood to the English Crown was accompanied
by a decisive change in legal terminology.
Within the decade before the enactment of the
Federal Constitution (1787), the term ‘citizen’ came to
replace previously related words such as ‘subject’ and
‘inhabitant’ in constitutional texts. The republican
concepts of ‘citizen’ and ‘citizenship’ marked a political
and terminological break with the feudal age. From
this point onward citizen and citizenship became core
terms of a state order based on democracy and
constitutionalism. The American constitution intro-


duced a revolutionary new type of legal instrument
that firmly rooted any state power in the law, while
simultaneously subjecting it to legal control. The
constitutionalization of state power represented a
fundamental development in early modern Western
history: the subjection of social and political relationships
to legal norms (Verrechtlichung). With the
constitutionalization of membership status in a republican
state order, the concept of citizenship gained
new importance. It unified the claims to equality and
political participation inherent in the classical concept
of city-state citizenship, transferred it to the state level
and gave it a new quality of legal efficacy. The
constitutionalization of citizenship with its classical
connotation of civil idealism and civic virtue represented
a strong and permanent challenge to inequality
and exclusion from rights in political practice. At the
same time, the shift in scale from the small local
community to the level of the abstract state risked
attenuation in the practices and understandings of
citizenship. The need for a new, expansive legitimation
of citizenship to the state arose.
The French Revolution made citizenship (citoyenneté)
one of its key concepts (Waldinger et al.
1993). This new citizenship combined four traits that
were to become essential for the development of
citizenship throughout the nineteenth and twentieth
centuries: its egalitarian, antifeudal impetus, its confirmation
as a key concept of the legal constitution, its
association with extensive individual rights, and finally
its nationalization. The modern concept of citizenship
arose together with the concept of the nation-state
(Bendix 1964) and became one of its central legal
institutions. The age of revolution and constitutionalism
in the Western world was also an age of
increasing delimitation of national citizenries. Extended
citizenship law came to define membership in
the nation-state as well as the rules of naturalisation
policy. Whether they were based on an inclusionary,
territorial model (USA, France) or an exclusionary,
descent-based model (Central and Eastern Europe),
citizenship and citizenship law became key instruments
for defining national identity and controlling
migration in a modern world characterized by increasing
transnational mobility (Brubaker 1992). The
central function of citizenship in defining membership
in the sense of nationality was supplemented by a
second main function: that of conferring upon the
citizen individual rights vis-à-vis the state. In the
nineteenth and early twentieth centuries, Western constitutional
states extended the range of civil, political,
and, increasingly, social rights which were reserved
mainly for their own citizens. The nationalization of
citizenship as a membership status corresponded to a
nationalization of citizens’ rights. Citizenship became
an institution for distributing ‘life opportunities’ in a
world of nation-states.
As the importance of citizenship as a legal entitlement
increased, its extension to new members became
more and more contested within the national citizenry.
The struggle for inclusion, for example in American
constitutional law, dominates the history of citizenship
to this day. The gradual extension of equal civil rights
to groups resident on American soil who had been
denied citizenship rights and subjected to discrimination
on the basis of ethnic or national origin,
religious beliefs, social status, or gender determined
the direction that citizenship was to take. The claims
of groups facing discrimination to equality became the motor
for full inclusion in the community defined by citizenship.
With the decline of liberal democracy, and the rise
of radical nationalism, racism, and totalitarian dictatorship
in the constitutional states of Europe and
Asia, the period between the two world wars represented
an interruption in the development of modern
citizenship. The dominance of ascriptive national,
ethnic, or racial criteria in admission to citizenship, the
splintering of the citizenry through hierarchical classes
of rights, and the withdrawal of civil rights and massive
expatriation of millions destroyed the core of equality
within the concept of citizenship. The rise of an army
of stateless people deprived of rights and protection
(Arendt 1951) revealed how dependent citizenship was
on the liberal legal structures of the nation-state.
Developments after the Second World War were
characterized by a dual tendency. On the one hand, the
restoration of liberal democracy saw a reconstruction
of citizenship. This was reinforced and extended to the
global arena after 1989 with the end of ideological
bloc confrontations and the adoption of democratic
and constitutional patterns by most of the formerly
communist states. On the other hand, there is also
evidence of a tendency towards a certain ‘devaluation’
of citizenship. The weakening of nation-state structures,
the trend towards transnational political unions, and
global standards and guarantees of civil rights as
human rights have all diminished the importance of
(national) citizenship for the conferral of individual
rights. While formal citizenship is still crucial on the
level of the right to full participation in the political
arena, this no longer applies to economic and social
rights. Citizenship as national membership status is of
decreasing importance for the exercise of these
increasingly relevant rights (Soysal 1994).
The historical development of citizenship from its
beginning in the Greek city-states has been characterized
by a multiple process of expansion. Citizenship
expanded from a membership status in a local community
to a central membership in the territorial
nation-state. Citizenship as entitlement to individual
rights was transferred from the level of the nation-state
to that of supranational communities. From its
inception in the Greek city-communities, the concept
and ideal of citizenship have spread all over the world to
states based on the principle of constitutional democracy.
The substantive program of citizenship as a set
of individual rights has expanded from political and


civil to encompass social and economic rights, and ultimately cultural and environmental rights as well.

2. The Historical Conceptualization of Citizenship

Two main features mark the conceptualization of citizenship. The first consists of the twofold structure of a legally defined status of membership and rights on the one hand and a philosophical and normative ideal of ‘good’ civic behavior on the other. The second feature depends on the basic structure of equality within citizenship. Its conceptual development may be interpreted as a claim for recognition as equal and a struggle for inclusion. In the classical age of citizenship, its inclusive function consisted of either defining a ruling oligarchy with precisely delineated privileges or expanding a low standard of rights to an extensive group for the purposes of political integration. The linkage between a substantively high standard of citizenship rights and its extension to a broad group of persons was the result of a change in politics and theory that began in the early modern period. Between the sixteenth and eighteenth centuries, the concept of citizenship lost its coherence and republican ethos (Riesenberg 1992). The classical idea of citizenship survived as a historical concept in education and political literature, however. It corresponded doubly with the era’s need for a break with old categories and concepts. First, the vision of an ‘active’ citizen fit well with the notion of a modern, visionary, and enterprising ‘polytechnic man’ as outlined in the political theories of Thomas Hobbes and John Locke. Second, the concept of citizenship was enlarged by bringing the terms ‘citizen’ and ‘subject of the monarch’ closer together, and indeed mingling them. The era’s most elaborate conceptualization of citizenship, undertaken by Jean Bodin, transferred citizenship from the local to the state level. According to Bodin, the citizen was a superior subject of the sovereign monarch. Citizenship denoted a direct, personal link between the individual and his sovereign. This conceptual transfer opened the way for a revolutionary, egalitarian view: from the perspective of the sovereign all citizens were equal qua subjection to the monarch’s sovereign authority. It was not until the eighteenth century that this kind of passive equality was transformed by the subject-citizen into a new, revolutionary call for active citizenship, i.e., for participation in sovereignty.

The concept that paved the way for this fundamental change of perspective was the revolutionary idea of ‘civil society.’ It emerged in the later seventeenth and eighteenth centuries as a result of a crisis both in the reality and idea of social order. In an age marked by an unprecedented commercialization of human relations, the growth of market economies and political revolution, the regulatory idea of the social order was no longer derived from a deity external to the temporal world, but from within it and its ethical values. The vision of an all-encompassing ‘society’ of human beings, equal, autonomous, and rational by nature, who defined their own political order, challenged the traditional norms of monarchical order, ecclesiastical claims, and social hierarchy.

Beginning with John Locke and the philosophers of the Scottish Enlightenment, and passing on to Kant, Hegel, and eventually Marx, the idea of ‘civil society’ (in German: ‘bürgerliche Gesellschaft’) laid the groundwork for a new idea of citizenship. Although not always developed systematically, the concept of the citizen and citizenship constituted the normative core of civil society (Haltern 1985). Out of Locke’s view of civil society as a system of individual rights and duties, Kant’s definition of the public arena in civil society as a sphere of juridical equality among citizens, and Hegel’s (Riedel 1972) and Marx’s vision of civil society as a field of conflicting particular interests, elements of a new understanding of citizenship emerged. It now referred to the central state, not the local community, and was based on individual autonomy, not obedience and subjection. It was legally constituted and based on the principle of legal equality, not oligarchic privilege. Finally, it was inclusive with regard to the principles of property and achievement. This new citizenship represented an ideal and became a key concept in the language of American and European revolutions at the end of the eighteenth century. It was not, however, the product of political revolution. Its achievement was the result not only of the conceptualization of ‘civil society,’ but also of its realization as a new formation of social and political order throughout the Western world. This process marked such a major caesura in the historical development of citizenship that it may be considered the rise of a ‘second citizenship’ (Riesenberg 1992).

In contrast to the universal and egalitarian concept of ‘civil society,’ the development of citizenship in the nineteenth century ran up against three fundamental obstacles to its realization in political practice. The first obstacle was nationality. Citizenship was conceptualized as membership in the nation-state. While the rights conferred upon citizens by the liberal constitutions of the nineteenth century grew in number and substance, they were increasingly confined to nationals. The universalist pathos of human rights, which was still inherent in the revolutionary declarations of the American and French Revolutions (e.g., The Declaration of the Rights of Man and Citizen), was increasingly reduced to national entitlements and interpreted in this light by legal scholars. The pre- and transnational origins and impetus of ‘civil society’ gave way to a nationally restricted concept of constitutional citizenship, which had lost its universalist ethos of civic values.

The second obstacle was gender inequality. Citizenship was conceptualized in a seemingly gender-neutral manner, although both membership status and the civic ideal of an ‘active citizen’ were based on rights and activities that were largely reserved for men. The construction of the state as a ‘male state’ was a challenge to the principle of equality by law which was proclaimed by the American and French revolutions. From its beginning, the revolutionary principle of citizenship did not imply equal rights for women. The traditional concept of citizenship of the nineteenth and first half of the twentieth century was deeply gendered. Thus it became a central issue of deconstruction in feminist critiques (Pateman 1988).

The third obstacle to democratic citizenship was criticized by Karl Marx in the 1840s. He demonstrated that the provision of equality in civil and political citizenship rights was merely a formal, legal one: it left untouched the practical inequalities in people’s abilities to exercise the rights or legal capacities that constituted citizen status. Moreover, even the exercise of the rights of membership could not influence the basic conditions of class inequality because of the purely formal nature of these rights. Marx interpreted bourgeois citizenship as an instrument of class rule, and thus as an institution of industrial society, which was particularly driven by class. Citizenship as a phenomenon of the modern industrializing world thus corresponded to the interest of the new social sciences. It was sociologists, beginning with Max Weber, who analyzed the origins and growing importance of citizenship in the modern social order. Because citizenship was embedded in the ‘Western’ model of modernization and industrialization, it came to be identified as a specifically Western concept of political and social order.

During the first half of the twentieth century, though, intellectual interest in the concept of citizenship was mainly historical. As a political ideal it suffered both from Marx’s cutting critique and the crisis of the liberal-democratic idea between the world wars. Thus, only the restoration of a democratic political order after the Second World War and the significant success of the welfare states established at that time paved the way for the revival of citizenship as a key concept in both social analysis and politics. It was against this background that the 1950 article on ‘Citizenship and social class’ by the English sociologist T. H. Marshall attained its path-breaking influence over the entire debate on citizenship in the second half of the twentieth century. Marshall took up the interpretation of citizenship as a set of rights and analyzed its historical development as a successive creation of civil, political, and social rights. Marshall particularly stressed two conclusions that were crucial to the revaluation of citizenship as both an analytical instrument and a political ideal: his emphasis on the gradual expansion of rights and the ultimate undermining of class conflict by the achievement of social rights weakened the Marxist critique. At the same time, Marshall asserted that by challenging existing inequities and calling for their abolition as an individual right, equality (as the conceptual core of citizenship) was a dynamic principle that continued to exercise an emancipatory and inclusive influence.

By doing so, Marshall’s concept of citizenship attained the force of a universal claim to equality that was applicable to a virtually unlimited range of subject matters. Despite continuing Marxist skepticism, social citizenship was interpreted as a call for the stabilization and expansion of social welfare rights (Turner 1986). The lack of civil and political rights strengthened the call for citizenship as a key concept of opposition to dictatorships in the Soviet bloc. The idea of civil society was rediscovered in Central and Eastern Europe as a basis for citizenship rights and translated into a political program for restructuring post-Soviet societies after 1989.

This expansive conceptualization has had a dual effect. After the downfall of the Soviet bloc and the global expansion of democratic constitutionalism, citizenship was adopted globally as a constitutive element of the democratic order (Thompson 1970). New states in Asia, Africa, and South America used citizenship to integrate into the society of world democracy. At the same time, social conflict and cultural change within the states with citizenship traditions were often conceptualized as calls for equality and recognition expressed in terms of citizenship. Massive transnational migration and attempts to take account of ‘multiculturalism’ (Kymlicka 1995), the enforcement of gender equality and sexual rights (Lister 1997), the debates within liberalism and communitarianism surrounding the foundation and values of a ‘good’ civic life: all of these issues have been interpreted as struggles over citizenship rights. Ultimately, citizenship has become increasingly detached from its association with the nation-state and conceptualized instead as ‘transnational citizenship’ (Bauböck 1994), culminating in the vision of ‘world citizenship’ (Heater 1996).

The global success and expansion of the term ‘citizenship’ has not, however, necessarily stabilized the underlying concept. As a political ideal, citizenship has been transplanted to states that lack the historical foundation of a ‘civil society.’ It may, therefore, prove inadequate for the social and cultural traditions of non-Western states. Historically, citizenship’s efficacy as a claim for rights was always based on political sovereignty, be it monarchical or popular. This was a prerequisite both for addressing individual claims and enforcing the corresponding civil duties. With the pluralist expansion of citizenship rights in a situation of declining nation-state sovereignty and a nascent world state order, citizenship as it currently exists may lack the requisite unifying force. As Roman history demonstrates, the global expansion of citizenship does not necessarily lead to its global achievement.

See also: Citizen Participation; Citizenship: Political; Citizenship: Sociological Aspects; Civic Culture; Civil Society, Concept and History of; Civil Society/Public Sphere, History of the Concept; Democracy; Democracy, History of; Democratic Theory; Freedom/Liberty: Impact on the Social Sciences; Freedom: Political; French Revolution, The; Human Rights, History of; Liberalism: Historical Aspects; Marshall, Thomas Humphrey (1893–1981); Public Sphere: Nineteenth- and Twentieth-century History; Rights; State, History of

Bibliography
Arendt H 1951 The Origins of Totalitarianism. Harcourt Brace Jovanovich, New York
Barbalet J M 0000 Citizenship. Rights, Struggle and Class Inequality. University of Minnesota Press, Minneapolis, MN
Bauböck R 1994 Transnational Citizenship. Membership and Rights in International Migration. Edward Elgar Publishing, Aldershot, UK
Beiner R 1995 Theorizing Citizenship. State University of New York Press, Albany, NY
Bendix R 1964 Nation-Building and Citizenship. University of California Press, Berkeley, Los Angeles, London
Brubaker R 1992 Citizenship and Nationhood in France and Germany. Harvard University Press, Cambridge, MA
Gosewinkel D 2001 Citizenship, subjecthood, nationality: concepts of belonging in the age of modern nation states. In: Eder K, Giesen B (eds.) European Citizenship Between National Legacies and Postnational Projects. Oxford University Press, Oxford, UK, pp. 17–35
Haltern U 1985 Bürgerliche Gesellschaft. Wissenschaftliche Buchgesellschaft, Darmstadt
Heater D 1990 Citizenship. The Civic Ideal in World History, Politics and Education. Longman, London
Heater D 1996 World Citizenship and Government. Cosmopolitan Idea in the History of Western Political Thought. Macmillan Press, Basingstoke, UK
Hufton O H 1992 Women and the Limits of Citizenship in the French Revolution. University of Toronto Press, Toronto
Kymlicka W 1995 Multicultural Citizenship. A Liberal Theory of Minority Rights. Clarendon Press, Oxford, UK
Kymlicka W, Norman W 1995 Return of the citizen: A survey of recent works on citizenship theory. In: Beiner R (ed.) Theorizing Citizenship. State University of New York Press, Albany, NY
Lister R 1997 Citizenship. Feminist Perspectives. Macmillan Press, Basingstoke, UK
Manville P B 1990 The Origins of Citizenship in Ancient Athens. Princeton University Press, Princeton, NJ
Marshall T H 1950 Citizenship and Social Class and Other Essays. Cambridge University Press, Cambridge, UK
Pateman C 1988 The Sexual Contract. Stanford University Press, Stanford, CA
Riedel M 1972 Artikel Bürger, Staatsbürger, Bürgertum. In: Brunner O, Conze W, Koselleck R (eds.) Geschichtliche Grundbegriffe. Ernst Klett Verlag, Stuttgart, Vol. 1, pp. 672–725
Riesenberg P 1992 Citizenship in the Western Tradition. The University of North Carolina Press, Chapel Hill, NC
Soysal Y 1994 Limits of Citizenship. University of Chicago Press, Chicago
Thompson D 1970 The Democratic Citizen. Cambridge University Press, New York
Turner B S 1986 Citizenship and Capitalism: The Debate over Reformism. Allen and Unwin, London
Waldinger R, Dawson P, Woloch I 1993 The French Revolution and the Meaning of Citizenship. Greenwood Press, Westport, CT
Walzer M 1970 The problem of citizenship. In: Walzer Obligations. Harvard University Press, Cambridge, MA

D. Gosewinkel

Citizenship: Political

It may seem superfluous to subtitle a discussion of citizenship as ‘political.’ The term is at core ineradicably political: its oldest, most basic, and most prevalent meaning is a certain sort of membership in a political community. There are nonetheless good reasons to underline just how deeply political citizenship is, as this essay strives to do. The chief reason is that, precisely because citizenship is so profoundly political a term, there are recurrent pressures to depoliticize both its meaning and its accompanying practice. These pressures, and the understandings of citizenship they propagate, are best seen as confirmations of the political character of citizenship, not as alternative, apolitical conceptions.

1. Four Meanings of Citizenship

Perhaps the most familiar meaning of citizenship is in fact the seminal one. In both ancient and modern republics and democracies, a citizen has been a person with political rights to participate in processes of popular self-governance. Yet we also commonly speak of citizenship as a more purely legal status. Citizens are people who are recognized legally as members of a particular political community and who, therefore, possess some basic rights to be protected by that community’s government, whether or not those rights include rights of political participation. During the last century, moreover, many have come to use citizen as a way of referring to those who belong to almost any human association, whether a political community or some other group. I can be said metaphorically to be a citizen of my neighborhood, my fitness club, and my university as well as my broader political community. Finally, we often use citizenship to signify not just membership but certain standards of proper conduct, implying that only ‘good’ citizens are truly citizens in the full meaning of the term.

The word citizen derives from the Latin civis, meaning a member of an ancient city-state, preeminently the Roman republic; but civis was a Latin rendering of the Greek term polites, a member of a Greek polis. Innumerable scholars have told how a renowned resident of the Athenian polis, Aristotle, defined a polites or citizen as someone who rules and is ruled in turn, making citizenship conceptually inseparable from political governance (Aristotle 1968, p. 1275a23). Aristotle doubtless pleased many Athenians by arguing that such a status and activity, properly performed, represented the highest form of life available to most men. Yet this life was not available to Aristotle himself; he was not an Athenian citizen but a metic, a resident alien. He also suggested that the philosophic life he could and did pursue was in the end the highest of all; and the fact that he chose to pursue this life in a city that denied him citizenship may call into question how valuable he really thought citizenship to be.

His fulsome praise of the citizenship his Athenian hosts had created thus suggests how greatly definitions of citizenship have always been shaped by the political structures of power within which they have been offered. It clearly would not have been prudent for Aristotle to denigrate Athenian citizenship. The fact that neither Aristotle nor most other residents of Athens, including aliens, women, and slaves, were eligible for citizenship also underlines how citizenship originated not only as a way of structuring membership, but also as a way of distributing power within a particular political regime. That distribution disempowered far more Athenians than it enfranchised. Even so, the ideal of citizenship as self-governance that Athens and Aristotle established has often served since as an inspiration and instrument for political efforts to achieve greater inclusion and engagement in political life.

As such, this ancient idea of citizenship has often seemed politically threatening to many rulers, who have abolished or redefined the category. It was for this sort of political reason, because the regimes that had created citizenship succumbed to conquest by Alexander the Great’s monarchical empire, that ancient Greek citizenship disappeared. And it was for a similar political reason, because the Roman republic gave way to imperial rule generated from within, that Roman citizenship came to have a different meaning than the one Aristotle articulated. In principle, Roman citizenship always carried with it the right to sit in the popular legislative assembly that had been the hallmark of Athenian citizenship. But as participation in that assembly became increasingly meaningless as well as impractical for most imperial inhabitants, Roman citizenship became essentially a legal status comparable to modern nationality (Pocock 1995). It provided rights to legal protection by Roman soldiers and judges in return for allegiance to Rome. That status was less ‘political’ in the sense that it no longer evoked recurrent engagement in practices of self-governance, but it was quite plainly a politically crafted status that represented a new distribution of power of enormous political significance.

Citizenship was then eclipsed in the West by the various feudal and religious statuses of the medieval Christian world, but it did not vanish entirely. ‘Burghers’ or the bourgeoisie were citizens of municipalities that often had some special if restricted rights of self-governance within feudal hierarchies. Such burghers remained, however, fundamentally subjects of some ruling prince or lord, with their citizenship chiefly providing legal rights of protection in the manner of Roman imperial citizenship. In contrast, during the Renaissance some Italian cities achieved both independence and a meaningful measure of popular self-governance. They invoked ancient ‘republican’ ideals of participatory citizenship to define and defend their regimes. Their experiences in turn fed into the antimonarchical revolutions that created the first modern republics, including the short-lived seventeenth-century English Commonwealth and late eighteenth-century French Republic, as well as the still-enduring United States (Pocock 1975).

In complex fashion, those revolutions inaugurated transformations ‘from subjectship to citizenship’ across much of the globe that are still ongoing today, when most of the world’s governments proclaim themselves to be ‘republics’ of some sort populated by citizens. It is in that context that we have come to use citizen ubiquitously for almost every kind of membership in every kind of organization, and to equate genuine citizenship with being a good, contributing member of those organizations. These pervasive popularizations of the term reflect, however, political developments that have in some respects diminished the significance of citizenship even as the term has spread.

2. The Politics of Apolitical Citizenship

Men created the early modern republics in an international realm that had been organized by the 1648 Treaty of Westphalia into a system of mutual recognition among overwhelmingly monarchical nation-states. In gaining acceptance within that system, the new republics defined their citizens as having the same international status as national monarchical subjects. For international purposes, these citizens, too, were simply persons who owed allegiance to and could claim protection from particular governments. Thus, Westphalian international law treated modern republican citizenship as akin to the legalistic, protection-oriented version of Roman citizenship.

Furthermore, Americans especially forged their republic amid racial and gender hierarchies that few leaders sought to challenge. Hence they felt compelled
to argue that, though free blacks and women might be citizens, citizenship did not in fact inherently entail rights of political participation. It guaranteed, once again, only more limited rights to certain judicial and executive protections (Smith 1997). For long stretches of time, then, both international and domestic politics worked to strengthen legalistic as opposed to the more participatory conceptions of citizenship despite the rise of modern republicanism.

Yet even though courts made the narrower, protection-centered view of citizenship legally authoritative, the notion that genuine citizenship involved rights of political participation remained a resonant rhetorical tool of legislative and constitutional reformers. Eventually both domestic protest movements and international pressures, including the need for broad support in wartime, converged to work for the expansion of the franchise to all adult citizens in the US and most of the western world. In America, blacks won both citizenship and voting rights after the Civil War, even though most came to be effectively disfranchised in the ‘Jim Crow’ segregation era; and women gained the franchise after World War I. In both cases, arguments appealing to their public service, especially in wartime, and to the idea that true citizenship must include the franchise, played key roles in their successes (Foner 1988, Flexner 1973).

In Britain and to some degree in other western European nations that had been politically configured essentially by feudal and industrial class systems, modern citizenship was wrought out via somewhat different struggles. As Marshall famously argued, first middle and then working class political pressures resulted in the expansion of civil rights of property and protection, then in near-universal rights of political participation, and finally and incompletely, in ‘social rights’ for all national citizens that included income, housing, medical, and educational guarantees (Marshall 1950). But even as the franchise has broadened in the US, Europe, and elsewhere, the notion of citizenship as active participation in meaningful self-governance has become more remote to many modern citizens. In part the logistics of large-scale modern societies make effective democratic participation very difficult. In part the economic and cultural developments that have led to a focus on ‘social citizenship’ make political activism seem less important. Engagement in one’s social and economic organizations can appear more pressing. Perhaps, then, the term citizenship has become common in such contexts because it is there that people find the memberships that mean the most and in which they can most actively participate. If so, then the inevitable corollary is that citizenship understood as political self-governance has become quite secondary to many modern citizens. Yet this, too, is a signal political development. It is probable that like their predecessors in other regimes, many who wield power in modern republics are content when those they govern think of citizenship chiefly in terms of subnational, often nongovernmental associations, and in terms of the ‘good citizen’s’ civic service rather than vigorous political participation. Certainly few policies within modern republics do much to enhance the feasibility and potency of such participation.

3. The Prospect of Postnational Citizenships

Though some scholars and democratic activists lament this current circumstance, others stress that the heightened transnational economic, transportation, and communication systems that we call globalization are in any case making traditional notions of national citizenship obsolete (Soysal 1994). Regional associations, international legal institutions, and transnational economic, cultural, and political organizations are all said to be more likely to shape humanity’s future than existing national regimes. Hence membership in such bodies will represent the most important forms of citizenship in the twenty-first century.

That such globalizing trends exist is undeniable, though often national governmental actors remain major players even in transnational or international organizations and institutions. Despite advances in communication and transportation, moreover, meaningful participation in the governance of such populous and geographically far-flung entities seems even more chimerical for most people, deepening the eclipse of citizenship’s oldest meaning. Thus there is a real prospect that the idea of citizenship increasingly will be severed not only from engagement in traditional forms of self-governance, but even from membership in some titularly sovereign political community. It may become a term for membership and participation in a wide variety of human groups, often simultaneously.

There are, however, reasons to doubt this scenario. History suggests that the leaders of political communities rarely give up power willingly. Therefore, it is not surprising that efforts to resist globalizing trends and reinvigorate loyalties to existing nations and regimes are also visible players in modern ‘citizenship politics,’ particularly in regard to immigration policies. A truly all-encompassing global government, moreover, still seems a fantasy, so that memberships in particular political communities are likely to remain important features of human life, even if those communities come to be constituted in new ways.

Under at least some conditions, moreover, many people may feel great concern over the decline in forms of citizenship through which they can exercise some genuine control over their collective lives. The fact that political and social reform movements have often gained wide support by insisting that citizenship means sharing in governance shows that such feelings can be politically powerful fuel driving quite important changes.

Thus, we cannot rule out the possibility that older notions of participatory citizenship may continue to play a role in the recrafting of political institutions and communities that the twenty-first century will inevitably see. But whatever forms of citizenship result, they will almost certainly be the products of political contests that result in distributions of powers and memberships to some people and not others, distributions that will convey to them certain rights and protections, and not others. Hence, citizenship will remain what it has always been, a fundamentally political status through which human beings partly order both their individual and their collective lives.

See also: Citizen Participation; Citizenship and Public Policy; Citizenship, Historical Development of; Citizenship: Sociological Aspects; Civic Culture; Participation: Political; Political Culture

Bibliography
Aristotle 1968 The Politics of Aristotle. Clarendon Press, Oxford
Flexner E 1973 Century of Struggle: The Woman’s Rights Movement in the United States. Atheneum, New York
Foner E 1988 Reconstruction: America’s Unfinished Revolution, 1863–1877. Harper & Row, New York
Marshall T H 1950 Citizenship and Social Class and Other Essays. Cambridge University Press, Cambridge, UK
Pocock J G A 1975 The Machiavellian Moment: Florentine Political Thought and the Atlantic Republican Tradition. Princeton University Press, Princeton, NJ
Pocock J G A 1995 The ideal of citizenship since classical times. In: Beiner R S (ed.) Theorizing Citizenship. State University of New York Press, Albany, NY
Smith R M 1997 Civic Ideals: Conflicting Visions of Citizenship in U.S. History. Yale University Press, New Haven, CT
Soysal Y N 1994 Limits of Citizenship: Migrants and Postnational Membership in Europe. University of Chicago Press, Chicago

R. M. Smith

Citizenship: Sociological Aspects

Classical sociological theory was especially interested in the social mechanisms ensuring social solidarity, and in the sources of conflicts that challenge the social structure. It underlined roles, functions or dysfunctions, the networks of sociability which bring the actors together, etc. In general, the founders of the discipline were not very concerned with politics and hardly dealt with the question of citizenship, which

citizenship had its origins within political theory, having abandoned its social dimensions for a long period.

From Aristotle to Locke and Rousseau, many philosophers have reflected on the nature of the social link, the commitment of individuals within the public space, the formation of the social contract and the general will. People are conceived here as rational beings: By their entrance into the political community, they have access to the status of citizen which alone gives meaning to its own history. In the contemporary era, from Sheldon Wolin (1993) to Carole Pateman (1970), this reflection of political philosophy on the foundations of a democracy of citizens is illustrated by an immense literature, of which Benjamin Barber (1984) and his ‘strong democracy’ or Seyla Benhabib (1996) are good representatives.

The revival of the political theory on citizenship has, however, found its origins: Hannah Arendt sought within the Greek polis the foundations of a vita activa which would give life to citizens who were indifferent to the social, determined only by their reason, to enter into public space. After Arendt, it was Jürgen Habermas who took it upon himself to seek the origin of modern public space as a place of deliberation and discussion. According to him, public opinion formed by all citizens was born in the seventeenth and eighteenth centuries, in France and in England. Viewing capitalism as a system that alienates actors and destroys their exchanges, he only found reason within contemporary ‘communicational behavior’; the use of modern information techniques would therefore give life to a democracy of citizens capable of communicating among themselves. Arendt and Habermas played an immense role in the revival of a political theory on citizenship capable of leading to a community which is undiluted by citizens who are little concerned by their particular historical identity, or their cultural identity, either social or even less biological.

A reflection upon the sociological foundations of citizenship should consequently find food for its debates elsewhere. Thus, it can be claimed that during the nineteenth century, de Tocqueville was one of the few thinkers to extend the eighteenth-century tradition by giving it a more sociological context. He clearly poses the question of the commitment of ordinary citizens. Tocqueville asks himself questions about how to avoid the isolation and apathy of individuals, which favors, in France, for example, an indifference propitious to all forms of authoritarianism; in his opinion, the forms of local self-government established by US democracy limit these dangers. For Tocqueville, public space is doomed to silence and to state domination if the associations and social groups that bring individuals together with specific identical interests do not intervene within the elaboration of public policies.
marks the integration of actors within their nation. Citizens find themselves, from the beginning, plunged
Consequently, until very recently, the question of into the social. Whereas Marx thrusts aside the coming


of citizenship for the birth of a society that has rejected capitalism and sees only alienation in present society, thus silencing the purely political role of citizens determined by their place in the relations of production, Tocqueville was already interested in the sociological aspects of citizenship.

Strangely enough, between Tocqueville and T. H. Marshall, the predominance of the social was so important that it was necessary to wait until the end of World War II for the concept of citizenship to return to the heart of the debates. This time, however, it was not the perspective of Rousseau that prevailed but rather that of Tocqueville. In his famous 1949 lecture, Marshall retraced the evolution which led from legal citizenship, created during the eighteenth century with the obtaining of civil rights, to political citizenship, obtained during the nineteenth century with the exercising of political rights, and then finally to social citizenship, which is granted to all in the twentieth century with the triumph of the Welfare State and social rights (minimum wage, health provision, etc.) (Marshall 1977).

For Marshall, the triumph of capitalism does not prevent the implementation of citizenship, this time full and entire. From then onwards, for the first time, citizenship openly bears an essential sociological dimension, as social redistribution considers the diversity of social situations, this time beyond the common role of the citizen. If the civil and political dimensions concerned all citizens in their actual state, independently from their specific social identity, the economic dimension aims at correcting the inequalities amongst citizens. Marshall himself does not underline this modification of the principle of citizenship that thrusts it into the social domain, and hardly interferes with its universalistic dimension. Nor does he consider (and he was later criticized for this) the persistence of so many social inequalities which remain, even in the age of the Welfare State, and occasionally even worsen, actually threatening the citizenship of the most deprived. In this way, Ralph Dahrendorf stresses ‘all those who are rejected by citizenship’: ‘the noncitizens’ who are the immigrants; ‘those who are no longer entirely citizens,’ i.e., the elderly; and finally ‘those who are not yet citizens,’ i.e., youth. The erosion of citizenship questions the image of the ‘good citizen’ equipped with all the necessary attributes for entering into public space. The explosion of the ‘underclass’ breaks the image of a citizenship which is full and entire, which all people would benefit from in an identical manner in the developed world (Heissler 1994).

From then onwards, a fairly important turnaround of perspective occurred. It no longer involved considering, in a traditional manner, that ‘in the Nation State each citizen stands in a direct relation to the sovereign authority of the country’ (Bendix 1977). It did not reflect on the conditions of admission to citizenship which separate the ‘insiders’ from the ‘outsiders’ (Gunsteren 1988). Neither did it extend this type of reflection, which uses citizenship as its foundation, beyond the territory of the nation state by dealing with the case of a postnational citizenship which would take place, for example, in the new European public space, where all citizens who have become ‘cosmopolitan’ would benefit from formally identical rights based upon an intangible constitutional principle, that of ‘constitutional patriotism’ dear to Habermas, hoping for the emergence of collective mobilizations destined to accentuate the democratic dimension (Cesarini and Fulbrook 1996, Delanty 1997). Instead, a systematic search for the elements of a ‘differentiated citizenship’ was undertaken, taking into account the multiple sociological dimensions peculiar to each citizen, whether economic, cultural, or a matter of gender.

For Kymlicka, ‘the members of certain groups are incorporated into the political community not only as individuals but also through the group. I have sometimes described these rights as forms of differentiated citizenship’ (Kymlicka 1995, p. 174). If many sociologists dealt with the consequences of socioeconomic inequalities for political participation and the exercise of the profession of the citizen, it is even more the dimensions that broadly fall under culture that increasingly attract attention. In a context of growing crisis within the nation state that greatly affects citizenship in Weber’s sense, or even in Marshall’s or Bendix’s sense, it was suggested that the theory of integration, for example that of Marshall, ‘does not necessarily work for culturally distinct immigrants or for various other groups which have been historically excluded from full participation in the national culture, such as blacks, women, religious minorities, gays and lesbians. Some members of these groups still feel excluded from the “common culture,” despite possessing the common rights of citizenship’ (Kymlicka 1995, p. 180).

Stemming from such a perspective, citizenship finds itself this time plunged into the social, with the risk of losing its original meaning. As a result, public space becomes diversified to an infinite extent as citizens preserve their identity there, and democracy itself changes its meaning. As Amy Gutmann asks, ‘What does it mean for citizens with different cultural identities, often based on ethnicity, race, gender or religion, to recognize ourselves as equals in the way we are treated in politics?’ (Gutmann 1992, p. 3). The democracy of citizens in fact finds itself greatly modified, in the same way as the political game, the strategy of parties and of pressure groups who use such identitarian groups as the basis for their actions. Affirmative action, a policy openly destined to smooth out socioidentitarian inequalities by privileging the members of deprived ‘ethnic groups,’ is the clear outcome, which implies a differential management of citizenship and raises an infinite number of problems of justice and of equality. From the moment


that we consider that individuals possess a ‘thick self,’ it becomes difficult to claim that these differences can be covered by a ‘veil of ignorance’ (Rawls [1971] 1987), differences which are deep-rooted within these community memberships considered from now on as being essential.

A considerable literature grew up in the area of multiculturalism, which constantly further reduces the importance of the classical theory of citizenship. Starting from the premise that it is unfair to ignore the identity of citizens, the tendency is to reinstate their culture, with the risk of slipping toward a deep relativism. In this way, the rediscovery of the identity shared by citizens relegitimizes the particular national cultures, the ‘ethnic’ feelings and also the nationalistic ideologies (Birnbaum 1996) which consider these neglected ethnic groups to be the foundations of their action in aid of the new citizens who are members of these particular homogeneous cultural groups. Plunged into a growing communitarianism, citizenship therefore leads to a nationalistic revival. On an internal level of society, it justifies the ‘tribalization’ of society into many specific homogeneous groups separate from each other: in the image of the working class of yesteryear, which constituted a countersociety of which its members claimed to be the active citizens, immigrants are supposed to conserve their culture, their language, and their own customs by benefiting from the right to vote and from local citizenship when they have not been naturalized. Women or homosexuals, like all social and cultural minorities, are also invited to join together in a particular manner.

The first among the feminist critics severely emphasized the great indifference of the classical theoreticians of citizenship toward the feminine gender: the French Revolution, which was geared towards the universalism of citizens, did, however, force women to return to private space alone (Landes 1988, Hunt 1992). In a more general way, the woman citizen demands consideration for her body and her own values in the exercise of this role within the public space (Young 1990). In this way, it is in fact the classical theories of deliberative democracy and also the models of the integration of the social system, for example, in their systematic Parsonian presentation, which find themselves questioned (Birnbaum 1996). To take into consideration the sociological variables of citizenship is, therefore, in one way or another, to give an advantage to the ‘thick self’ to the detriment of the ‘thin’ self upon which the classical theories of citizenship were formerly built (Walzer 1994, Kymlicka and Norman 1994).

See also: Citizenship and Public Policy; Citizenship, Historical Development of; Citizenship: Political; Democracy; Democracy, History of; Marshall, Thomas Humphrey (1893–1981); Nations and Nation-states in History; State Formation; State, History of; State, Sociology of the; Tocqueville, Alexis de (1805–59); Women’s Suffrage

Bibliography

Barber B 1989 Strong Democracy. University of California Press, Berkeley, CA
Bendix R 1977 Nation-building and Citizenship. University of California Press, Berkeley, CA
Benhabib S (ed.) 1996 Democracy and Difference. Princeton University Press, Princeton, NJ
Birnbaum P 1996a From multiculturalism to nationalism. Political Theory 24(1)
Birnbaum P 1996b Sur la citoyenneté. L’Année sociologique 46(1)
Cesarini D, Fulbrook M (eds.) 1996 Citizenship, Nationality and Migration in Europe. Routledge, London
Delanty G 1997 Models of citizenship: Defining European identity and citizenship. Citizenship Studies 1(3)
Gunsteren H 1988 Admission to citizenship. Ethics July
Gutmann A 1992 Introduction. In: Taylor C (ed.) Multiculturalism and the Politics of Recognition. Princeton University Press, Princeton, NJ
Heissler B 1994 A comparative perspective on the underclass: Questions of urban poverty, race and citizenship. In: Turner B, Hamilton P (eds.) Citizenship, Critical Concepts. Routledge, London
Hunt L 1992 The Family Romance of the French Revolution. University of California Press, Berkeley, CA
Kymlicka W 1995 Multicultural Citizenship. Clarendon Press, Oxford, UK
Kymlicka W, Norman W 1994 Return of the citizen: A survey of recent work on Citizenship Theory. Ethics January
Landes J 1988 Women and the Public Sphere in the Age of the French Revolution. Cornell University Press, Ithaca, NY
Marshall T H 1977 Citizenship and social class. In: Marshall T H (ed.) Class, Citizenship and Social Development. Chicago University Press, Chicago
Pateman C 1970 Participation and Democratic Theory. Cambridge University Press, Cambridge, UK
Rawls J [1971] 1987 A Theory of Justice. Harvard University Press, Cambridge, MA
Shklar J 1991 American Citizenship. Harvard University Press, Cambridge, MA
Young I M 1990 Justice and the Politics of Difference. Princeton University Press, Princeton, NJ
Walzer M 1994 Thick and Thin. Moral Argument at Home and ..... University of Notre Dame Press, Notre Dame, IN
Wolin S 1993 Democracy, difference and re-cognition. Political Theory 21(3)

P. Birnbaum

Civic Culture

Civic Culture theory asserts that democracy is stable or consolidated when specifically democratic attitudes and practices combine and function in equilibrium


with certain non-democratic ones. It was formulated and tested in empirical research in the late 1950s and early 1960s, years still reverberating with the memories of the rise of Communism and Fascism, the collapse of democracy and the catastrophes of World War II. It drew from a contemporary social science literature similarly influenced by the interwar history of the stalemated Third French Republic, the deeply flawed Weimar Germany, the Austrian and Spanish civil wars, and the breakdown of the Fourth French Republic.

It could point to the long tradition in political theory of ‘mixed government,’ from Plato and Aristotle through to Montesquieu, which supported this anti-populism and prudentialism. Among the post World War II social science influences were works of Joseph Schumpeter, Paul Lazarsfeld and Bernard Berelson, Edward Shils, Robert A. Dahl, and Harry Eckstein, among others.

Schumpeter rejected the ‘classic democratic’ assumption of the necessity of an informed, activist, rational public for a genuine democracy, and proposed in its place an ‘elites competing for votes’ theory (Schumpeter 1947, Chaps. 21 and 22). This minimalist theory could be reconciled with more realistic assumptions of a relatively ignorant and indifferent demos.

Paul Lazarsfeld and Bernard Berelson, theorizing from their ‘panel’ voting studies of the 1940s, similarly saw democracy as associated with a set of cultural and social conditions having the effect of limiting the intensity of social conflict. These included relative economic and social stability, a pluralistic social organization, a basic value consensus, and what we would now call a ‘civil society.’ They described a democratic equilibrium as involving mixes of involvement and indifference, stability and flexibility, consensus and cleavage (Berelson et al. 1954).

Edward Shils, on the democratic prospects of the new nations, emphasized the importance of a ‘widely dispersed civility.’ By this he meant a moderate sense of nationality, a degree of interest in public affairs, a consensus on values, institutions, and practices, and a recognition of individual rights and obligations. He wrote ‘ … These qualities should not be intense, and they need not be either equally or universally shared.’ (Shils 1960, pp. 387 ff.)

Robert Dahl’s theory of polyarchy, elaborated in 1956 in a full-length contrast with populistic and Madisonian democracy, belongs among these early social science influences on the civic culture. His early characterization of the American political system as providing ‘ … a high probability that any active and legitimate group will make itself heard effectively at some stage in the process of decision … ’ (Dahl 1956, p. 145; see also Dahl 1970; Dahl 1989) reflected the minimalist mood that Dahl shared with the generation that emerged out of the Great Depression and World War II. At the time that The Civic Culture was being written, Dahl’s concept of polyarchy already had introduced an empirically grounded and quantitative set of concepts into democratic theorizing. Democracy was not an essence. In its full sense it did not exist, and probably could not exist. Hence the concept of polyarchy was developed to refer to real political entities which attained measured performance levels on specified empirical dimensions.

Harry Eckstein was the first to emphasize the ‘mixed’ or paradoxical side of democracy, recognizing the necessity for a democracy not only to represent and formulate the will of the public, but to govern it authoritatively. In his Theory of Stable Democracy he describes some of the qualities which enable democracies to reconcile responsible authority and democratic responsiveness. Such reconciliation is facilitated by balances among contrasting qualities; participant behavior is balanced by deference to authority, dogmatism by pragmatism, and the like. Institutionally he attributes democratic stability to the degree to which social authority patterns coincide or are ‘congruent’ with political ones.

Civic culture theory had the advantage of being formulated in the context of a major empirical investigation informed by this historical experience, and benefiting from this prior research and scholarship. The study was reported in book form in 1963 and in paperback form in 1965, was reprinted in 1989 (Almond and Verba 1963, 1965, 1989), and remains in print as of this publication. It was widely reviewed in social science periodicals, and in 1980 a retrospective volume was published with critiques of the theory and the findings (Almond and Verba 1980).

The data were made available in the Interuniversity Consortium for Political and Social Research at the University of Michigan, and have been utilized in many secondary studies.

Four of the five countries that it investigated were chosen because they exemplified democratic stability and instability in the first half of the twentieth century: the United Kingdom and the United States exemplifying stability on the one hand, Germany and Italy exemplifying democratic instability and breakdown on the other. The fifth, Mexico, was a target of opportunity, selected with the thought that it might provide some insight into problems of democratization outside the North American-European area. The method used in the study, combining structured and open-ended questions administered to probability samples of national populations, provided a richer set of data specifically responsive to questions arising out of this historical experience and body of speculative theory.

The conception of stable democratic political culture as a ‘mixed’ political culture received a fuller elaboration than it had been given in the earlier work of Berelson and Lazarsfeld and Eckstein. The mix of democratic political culture was based on political role theory. People in stable democracies were both citizens


and subjects, and they needed to accommodate their non-political, private, and parochial roles. A thriving, stable democracy consists not only of voters, demonstrators, petition signers, and politician button-holers, but also of taxpayers, jurors, and military conscripts, as well as parents, mates, work-persons, voluntary association members, vacationers, and private, self-involved individuals.

It is this mix of roles (participant as well as subject, non-political as well as political) that democratic citizens of a stable democracy must balance and accommodate, and which their institutions must choreograph in a process of converting demands and supports into outputs and outcomes. To shift metaphors, civic culture theory was an equilibrium theory in which political buyers and sellers reach prices at which the political market is cleared. Civic culture theory specified what conditions had to be present in order to clear these markets.

These mixes and balances were located empirically in the British and American cases in the late 1950s and early 1960s: this combination of influentialism and deferentialism, involvement and indifference, conflictual and consensual attitudes, principled and instrumental ones. The relative absence of these balances in the German and Italian cases was noticeable. There was more deferentialism and less participationism in the British case than in the American. A ‘reserve of influence’ was also noted in the American and British cases, based on the finding that Americans and the British acknowledged the obligation to participate far more frequently than they reported actually participating. This discrepancy between performance and obligation could be viewed as a kind of ‘default’ mode, a reserve supply of participatory energy available for crises. The civic culture would run cool normally and at a moderate speed, but it had a reserve of influence to draw upon in the twists and turns of democratic politics, as the concerns and interests of different groups of voters were engaged.

By the time the retrospective volume, The Civic Culture Revisited, was published in 1980, it was evident that British and American civic culture were in trouble (Almond and Verba 1980, Chaps. 5 and 6). The balance of consensus and conflict had moved toward conflict; pride in nation and confidence in government were down. Participationism had declined.

In contrast, Germany showed dramatic gains in social trustfulness, confidence in government, and civic competence (Almond and Verba 1980, Chap. 7). In Italy political alienation and extreme partisan antagonism continued largely unchanged. In Mexico the political culture of ambivalent belief in the legitimacy of the democratic revolution, and the corruption of politicians and office-holders, also still survived (Almond and Verba 1980, Chap. 9).

That patterns of political culture would change in response to changes in economy, demography, politics and public policy, communications technology, and popular education should not have occasioned surprise. That the exemplars of the civic culture of the 1950s (Britain and the United States) should be showing signs of wear and tear in the 1970s and 1980s, and that the problem child of democracy, Germany, was showing strong signs of an emerging civic culture, were not causes for rejecting civic culture theory. The question was whether the changes observed in the two decades after the civic culture study were in a direction which sustained or disproved the theory. The evidence was moderately supportive of the theory. Thus, for example, the withdrawal of Johnson from the 1968 presidential race, despite the fact that political tradition would have legitimized another term of office, and the resignation of Nixon from the presidency in the 1970s were clear evidence of American instability; and the political disorders of the 1960s and 1970s were clear evidence of cultural disequilibria of one kind or another: conflict had undermined consensus, the legitimacy of government had declined, the modes of participation had radicalized. In contrast, Germany had had several decades of experience of effective political leadership and remarkable economic growth, appearing to produce a moderating koalitionsfähig partisanship, growing popular trust in government, civic obligation, and the like.

The place of civic culture theory in the contemporary theory of democracy is in some doubt. In the continuing theoretic polemic about the nature of democracy, anything settling for less than perfection partakes of sin. It was precisely the wish to avoid this commingling of the sacred and the secular that led Robert Dahl to invent the concept of polyarchy and to place the concept of democracy somewhat, but not completely, off limits. However, Dahl’s Polyarchy III (which is the closest to ultimate democracy that he gets) is achieved through increasing the depth and extent of attentive publics within the larger mass public, corresponding to the significant issues confronting the polity and policy elites. Modern information technology makes it possible that this gap between policy elites and the mass public might be significantly reduced (Dahl 1989, pp. 338 ff.).

The state of the polemic regarding the competence, rationality, and potential effectiveness of mass publics in contemporary polyarchies is well argued in a symposium held during the 1990s. The merits of the various options (elitism, inventive utilization of information technology, or reducing the scope of politics) are among the issues debated (Friedman 1999).

Civic Culture theory might have enriched this discussion somewhat by affirming the legitimacy of other than political claims upon humankind. If one views the full range of role demands on the time and resources of humans, how do we choose among them? How do these choices interact? What are the tradeoffs,


synergies, and opportunity costs? How do we weigh the claims of the civic world against the demands of profession, family, edification, and pleasure?

See also: Democracy; Democracy: Normative Theory; Democratic Theory; International Relations, History of; Lazarsfeld, Paul Felix (1901–76); Participation: Political; Schumpeter, Joseph A (1883–1950); Trust, Sociology of

Bibliography

Almond G A, Verba S 1963 The Civic Culture: Political Attitudes and Democracy in Five Nations. Princeton University Press, Princeton, NJ. Reprinted in 1965 by Little, Brown & Co., Boston, and in 1989 by Sage Publications, Newbury Park, CA
Almond G A, Verba S (eds.) 1980 The Civic Culture Revisited. Little, Brown & Co., Boston. Reprinted in 1989 by Sage Publications, Newbury Park, CA
Berelson B, Lazarsfeld P, McPhee W 1954 Voting: A Study of Opinion Formation in a Political Campaign. University of Chicago Press, Chicago, Chap. 14
Dahl R A 1956 A Preface to Democratic Theory. University of Chicago Press, Chicago, p. 145
Dahl R A 1970 Polyarchy: Participation and Opposition. New Haven, CT
Dahl R A 1989 Democracy and Its Critics. Yale University Press, New Haven, CT, pp. 338 ff.
Friedman J (ed.) 1999 Special Issue: Public ignorance and democratic theory. Critical Review: An Interdisciplinary Journal of Politics and Society 12(4)
Schumpeter J A 1947 Capitalism, Socialism, and Democracy. Harper and Brothers, New York, Chaps. 21 and 22
Shils E 1960 Political Development in The New States. Mouton, The Hague, The Netherlands, pp. 387 ff.

G. A. Almond

Civil Law

1. Structural Aspects

In the comparative study of law, legal systems are often seen as belonging to ‘legal families.’ These may be religious-based legal systems, they may be geographically defined, or they may be distinct because of their structure and methodology. The principal, or characteristic, difference between the families of the common law and of the civil law lies in the last of these. This is true despite the fact that both legal families have their roots in the Corpus Iuris Civilis of Justinian (AD 534) and that the common law has preserved some of these traditions to this day, while the civil law has moved closer to the common law in some areas.

With respect to both families, it is dangerous to generalize; there can indeed be marked differences between legal systems belonging to the same ‘family.’ Thus, the use of codes (for example, the Uniform Commercial Code), the existence of a large body of statutory law, and the adoption of the influential Restatements may make the legal system of the United States appear closer to the civil law than to its English roots (see Farnsworth 1996, p. 227). Such a conclusion, however, would misconceive the role and function of statutory law and of the methodology in its application in the civil law, on the one hand, and in common law legal systems, on the other hand.

Codes and statutes consist of rules in all legal systems. These may be very specific or they may be stated with some degree of generality. In some areas, common law codes and other legal texts display extraordinary detail, attempting to anticipate every eventuality. The common law judge, as a consequence, will begin with the factual circumstances of the case and, after examining, comparing, and weighing the factual elements of the case, will attempt to find a rule that fits it. This rule may be one of statutory law or derive from prior decisional law. The civilian judge sees the case as a problem to be solved within the legal structure of the legal system. First, the problem at hand must be fitted into a legal category. Next, subcategories and sub-subcategories must be identified until ‘legal rule’ or ‘concept’ and the problem at hand match. Civilians proceed by deductive reasoning, while the common law approach employs an inductive methodology.

Categorization, as a structural characteristic of the civil law, also results in the drawing of sharp distinctions between different areas of the law. Private law deals with legal problems that arise between natural persons or between natural and legal persons (such as corporations), while public law (for instance, constitutional law, administrative law, but also criminal law) addresses the relationship between citizen and state. In some civil law systems, this results in the establishment of special courts to deal with these different areas of the law. In Germany, for instance, there are ‘ordinary’ courts with competence for private law and criminal law, and separate court systems with competence for questions of administrative law, labor law, and social law; each system has its own ‘Supreme Court.’ Since many cases will obviously involve ‘mixed questions’ (e.g., of labor and private contract or tort law), there may be contradictory rules of law, with no unifying ‘Supreme Court.’ In France, the Conseil d’Etat stands beside the Cour de Cassation, with exclusive competence in certain public fields of law. The exclusivity of the Conseil’s competence avoids some of the German problems. Germany and Italy have Constitutional Courts; other civil law countries do not. While the divisions are more pronounced in Germany than elsewhere, categorization of fields of law and the establishment of specialized courts unavoidably lead to a high degree of specialization within the legal profession, both among lawyers as


well as judges. These lines are much more fluid in differences are also reflected in the legal systems of
common law. those countries that modeled their codes or statutes
The central role of the legal rule in civil law legal after earlier rules of others. Two codifications in
systems also explains why precedent (that is, the particular replaced the earlier ‘ius commune’ and
binding effect of a prior court decision on a different, proved very influential: the French code civil of 1804
but similar subsequent case) is quite different from and the German Bürgerliches Gesetzbuch of 1896 (see
what it is in the common law. The civilian judge also the Austrian Allgemeine Bürgerliche Gesetzbuch
applies the law but is not bound (obviously with some of 1811). The French Code influenced the development
exceptions) by earlier decisions of a higher court. In of legal systems not only in many European (especially
common law, in contrast, it is the decision of the the Latin and Eastern) countries; it also spread to the
highest court that ultimately is the law and therefore Near East, Central and South America, and even to
binds inferior courts. Subsequent discussion will re- parts of North America (e.g., Louisiana). The German
turn to these points. Code affected the law of Eastern and Southern Europe
(e.g., Hungary, the Czech Republic, and Slovakia,
Yugoslavia, the Baltic States, and Greece) as well as of
Japan and China (see Zweigert and Ko$ tz 1998, 159,
2. History 160). Roman Law still exists in Southern Africa as
Roman-Dutch law, an admixture of Roman law and
Roman law had no comprehensive codification before old Dutch customary law which has interacted with
Justinian’s Corpus Iuris Civilis. Law, such as it was, English common law (see Zweigert and Ko$ tz 1998).
was divided into ius ciile and ius gentium. The former Codification extended, in these and in other countries,
applied among Roman citizens, the latter applied to not only to private law and private-law relationships
legal relationships among others (Romans and but encompassed criminal law, commercial law, and
foreigners, foreigners among themselves, and slaves). both civil (private) and criminal procedure.
Judicial functions were exercised by praetores. The
praetor prereginus administered the ius gentium and
conceived and developed legal concepts unknown to
the strict Roman ius ciilis. In a different context, of 3. Sources of Law
course, the English chancellor—at the beginning of
the equity-jurisprudence—performed equally creative
3.1 Primary Sources
functions. Similar ideas—for equity and judicial
creativity—underlie the ‘general clauses’ of the civil Civil law systems draw a sharp distinction between
law (see ide infra). primary and secondary sources. Primary sources are
With population growth and increasing urbaniz- enacted law, custom, and ‘general principles of law.’
ation, a new profession—that of the jurisconsult— Of these, the main source is the enacted (statutory)
arose. They were legal advisors who prepared written law; it predominates in civil law systems.
opinions for cases. As the number of opinions grew, A code in a civil law system consists of general
principles could be derived from them which could be principles, arranged in order of importance. At the
taught to students and serve as a basis for advice to beginning there may be general rules regulating basic
judges. The development of principles fosters more problems that need to be addressed before the par-
abstract ways of thinking. Categories are conceived ticular problem can be analyzed. For example: if a
and problems are classified for assignment to various plaintiff seeks damages for breach of contract, prelim-
categories. Under Emperor Justinian, opinions, de- inary analysis must determine whether the contract
cisions, and other materials were gathered and the was validly concluded. Provisions dealing with inva-
Corpus Iuris Civilis was prepared. lidity and avoidance of contracts usually are found in
The Corpus Iuris stands at the beginning of what the general part of a civil code. Such a general part
Roman law means today. This Code evolved in may be followed by particular parts dealing with
subsequent times as a result of the work of the individual fields of law, such as torts, contracts,
Glossators and Commentators. All taken together property, or the law of succession. The main or basic
constituted the ‘Ius Commune,’ the common law of codes are supplemented in increasing number by
Europe before, in subsequent centuries, countries and special statutes or codes of limited coverage with
areas began to grow in different ways and today’s which the legal system reacts to new societal problems,
‘different families’ began to take shape. In Europe, for instance, in areas such as consumer protection,
England gave rise to the common law (although, telecommunication, and new media.
especially in the early period, with many Roman law Custom is also a primary source of law, but tends to
elements). The Scandinavian countries, France, and be less important in practice because it is often difficult
Germany had a leading role in developing civil law. to prove its pervasive observance in society. Customs
Once again, ‘civil law’ is not homogeneous. There are are nonwritten rules, developed and observed over
differences among civil law countries, and these years and now part of social and economic thinking.

‘General principles of the law’ are what the term expresses: basic principles of the legal system which are pervasive of it and derive from norms of positive law. Civil law judges resort to ‘general principles of the law’ as guidelines in the interpretation of statutory norms both for the purpose of defining their interrelation and for the purpose of their application. This is of particular importance when dealing with statutory norms that are rather abstract in their formulation. It is tempting to consider this process to be not very different from the case law methodology of the common law. There is an important difference, however. The common lawyer derives the appropriate interpretation by reliance on precedent. The civilian judge is not so restricted but derives the interpretation considered to be appropriate from the structure of the legal system and the general principles of law that pervade it; nor will the decision in the present case have a necessary effect on later cases. This is not to say that later cases may not reach the same conclusion: at the point of what French lawyers call jurisprudence constante and German lawyers ständige Rechtsprechung, such decisional law may itself be regarded as having risen to the position of ‘general principles of law.’

3.2 Secondary Sources

Secondary sources consist of case law and the legal literature. The legal literature consists of monographs and contributions to the legal periodical literature as well as commentaries. The last are particularly important in civil law countries that follow or are close to the Germanic legal tradition. Commentaries are detailed annotations of each provision of a particular code, consisting of an analysis that brings together all case law dealing with this provision, opinions of others as expressed in the periodical literature, and the commentator’s own evaluation and summary.

Case law by itself, as already mentioned in other contexts, does not have the same central importance in a civil law system as it does in a common law system. It is indeed a secondary source, for the judge is bound only by the enacted law, except in the few cases in which, as discussed, the decisional law has reached the level of jurisprudence constante or in systems (for instance, Germany) in which a ‘constitutional court’ has power to bind other courts. To repeat, however, such cases are rare.

Judges will read commentaries and the legal literature in general, just as lawyers do. The process of ‘law finding’ and its application, however, does not restrict the court to these sources.

4. Developing the Law

If ‘law’ is a norm in the form of a statutory codification or a ‘general principle’ derived from such statutory norms, how can a judge decide a case for which no norm or general principles exist and, in the absence of legislative action, how can law develop further? (For comparative discussion see Capalli 1998, p. 87, Adriaansen 1998, p. 107, Baudenbacher 1999, p. 333.)

Legislative action is of course the classic instrument for legal change. But the process, from drafting a bill until its ultimate passage and entry into force, can be long. A commission for the revision of the German law of obligations, for instance, has worked on this project for more than a decade. The final report of the commission was published in 1992. Nevertheless, it was not until 2001 that a bill was introduced in Parliament. Thus, problems may need to be addressed that existing legal norms do not cover. The Swiss Civil Code is unique in its candid grant of discretion to the judge to fill the gap. It provides in Art. 1, § 2: ‘if the Code does not furnish an applicable provision, the judge shall decide in accordance with customary law, and failing that, according to rule which he would establish as legislator.’ The assumption must be, of course, that even with such a grant of authority, the judge will try to fashion a result that conforms to the general structure and tradition of the legal system. The result will not contravene existing conceptions of what the law should be, but rather will fill the gap, and thereby contribute to the evolution of the law. One way of doing this is to work by analogy, that is, to compare the present problem with other problems that similarly require a weighing of the interests of the parties in the dispute. The analogy then extends the balance struck with respect to other problems to the present case. An example is the treatment of leasing in German law. The German Civil Code contains provisions on leases and on sales, but not on leasing. Since leasing is regarded as possessing elements from both fields of law, case law and literature now derive rules from both fields and apply them to the phenomenon of leasing.

The Swiss provision just quoted confers discretion. Yet it was suggested that no court will exercise unfettered discretion but will seek to follow the structure and to implement the values of the particular legal system. This has very much been the experience in other civil law legal systems, for instance, the German. The German Civil Code, as do others, contains provisions of enormous breadth and, consequently, lack of specificity. These are ‘general clauses,’ of which § 242 of the German Civil Code is perhaps the best example. It requires parties to a contract to perform their obligations in ‘good faith.’ Section 242 has become the source of an extensive body of case law and of legal concepts not addressed specifically in the civil code. When the economic upheavals of the 1930s threatened parties to contracts with economic ruin, the German Supreme Court invoked Section 242 to develop a doctrine of ‘frustration of contract’ (Wegfall der Geschäftsgrundlage). This doctrine, interestingly never accepted to that extent in the United States (despite the greater freedom of common law courts to
fashion law through decisions), permits courts in appropriate cases to adjust the obligations of the parties. The French Conseil d’Etat (but not the Cour de Cassation) developed a similar remedy, the doctrine of imprévision (see De Laubadère 1984, § 637). Good faith, so the German court states, requires the protection of both parties: cancellation of the contract (by analogizing the situation to one of impossibility) would be unfair to one party; enforcing it in full (against the background of totally different and unexpected circumstances) would be similarly unfair to the other party.

The obligations of parties to a contract begin with its conclusion. However, expectations will have been created earlier and their disappointment may cause damage. In certain circumstances, such a situation could be dealt with under the law of Delikt (tort), but sometimes the preconditions for a remedy in tort will be lacking. Again, the extension of an existing provision may provide relief. Section 276 of the German Civil Code provides for liability for damage caused intentionally or negligently. For liability to arise, there must be some relationship, some duty owed to the other party. Such a relationship, so states the case law, is created by contract negotiations. Parties thus have a precontractual duty of care to each other and may have a remedy for its breach. This is the doctrine of culpa in contrahendo.

The civilian judge, despite the different theoretical structure and focus of the legal system, thus can and does display the same creativity as do common law judges, with the principal (perhaps only) distinction that the decision in the individual case does not ‘make law.’ That new concepts can become part of the law, through repeated practice, is shown by the two examples just given. At that point, sources and methodology may differ, but the practical result no longer does.

The basic conceptual difference between common law and civil law finds reflection in still another area: the role of a judge and the conduct of a case. In common law jurisdictions, particularly the United States, litigation is ‘fact driven.’ The facts need to be established and the applicable precedent must then be found. With such an emphasis on the facts of a case, the role of the lawyers is a particularly active one. The judge functions as a neutral arbiter; judges of courts of appeal ultimately decide questions of law. In a civil law court, the judge starts with the rule of law, as outlined, and searches for the facts that he or she needs for the further categorization of the case. The judge’s role thus is much more active. Facts are elicited by the judge; lay judges may participate as members of the court (particularly in commercial matters), but there is never a lay jury in private law matters. The judge also ascertains the applicable law and, if the applicable rule of law happens to be foreign law because of the international aspect of the case, may even have to ascertain the content of the foreign rule of law ex officio. As a result, decisions of civil law appellate courts will read quite differently from those of appellate courts of common law countries. Attention to factual detail will be very slight (this is particularly true of high French courts) and the emphasis will be on legal norms, rather than the other way round.

5. Legal Education and the Legal Profession

In the United States of the eighteenth and nineteenth centuries, the young person could become a lawyer by ‘reading law in the chambers of a lawyer,’ in other words by serving as a clerk. In the civil law countries of Continental Europe, legal education has been the monopoly of the universities for centuries. With ‘doctrine’ (not case law) the heart of the legal system, law faculties not only served as the only way of entry to the profession, but their faculty members equaled in stature the position of the high judges in common law systems (see Dawson 1978). (For the practice of ‘Aktenversendung’ in Germany in the earlier centuries, see Rheinstein 1938. German appellate and supreme courts routinely sent case files to the prominent law faculties, virtually for final decision on doctrinal grounds. The influence of law professors on the development of legal doctrine thus has a long tradition and continues to this day.) The course of study reflected, at least until recent times, both the strict categorization of the legal system into specific fields of law and the high level of abstraction that characterizes the deductive analytic method of the civil law. Practical training of the young lawyer was left to a period of time required to be spent in an official training program (such as the Referendariat in Germany) or as a junior lawyer under the direction of a licensed lawyer, combined with additional training (such as in France). Full admission to the practice of law then occurs with the completion of these additional requirements. The mindset of the young lawyer had been formed, of course, by the time the university studies were completed, nor would it be changed by the acquisition of practical skills.

In more recent times, law curricula increasingly reflect the offering or requirement of election of interdisciplinary subjects. There is also increasing instruction in the use of new technologies, for instance, how to access and use legal databases. However, to the extent that these offerings represent ‘skills training,’ they may not be expected to change the structure of the legal system.

6. Outlook

Civil law systems differ among themselves, as noted initially. This also holds true for the course of study and practical training required to become a lawyer. As a result, a lawyer (as in the case of other professionals) may face impediments in attempting to es-
tablish a practice in another country or even to render occasional services there. In the European Union, the 1988 so-called ‘Diploma Directive’ was one of the first steps to implement the European Community’s freedom-of-establishment and free-supply-of-services provisions. The 1998 ‘Establishment Directive’ requires foreign lawyers who want to work permanently to pass an aptitude test or to have three years of practical experience in the state where they want to work. The Council of Bars and Law Societies of the European Community (CCBE) is an organization which includes the national associations of lawyers of the member states of the EC. It is the author of the ‘Code of Conduct for Lawyers of the European Community.’

In the European Union, civil law meets common law in a shared organizational structure. Efforts to facilitate the transborder practice of law have been accompanied by many legislative acts by the Community that go beyond regulatory economic action and also affect private law. Legal scholars are documenting the doctrinal commonalities in various fields of law, under such titles as ‘European-wide tort law’ (see Von Bar 1996) or ‘Property law in Europe’ (Von Bar is currently working on this project), and there has been discussion concerning the feasibility of a European Civil Code (see Kötz 1996, Lando 1999). If there is, then, some movement toward a new ius commune in Europe (which, however, will ultimately extend to a significant additional number of countries as a result of the enlargement of the European Union), many other civil law countries around the world will not participate directly in these developments. But the ‘new European law’ does not only seek to bridge differences among the civil law countries and between them and the common law; it will also effect reforms and innovations within the civil law. These may well serve as models for other civil law countries, for instance, for those of South America that grow together through similar processes of economic cooperation and integration.

See also: Common Law; Law: Overview; Legal Education; Legal Reasoning and Argumentation; Legal Scholarship; Lex Mercatoria; Supreme Courts

Bibliography

Adriaansen R 1998 Open forum: At the edges of law: Civil law v. common law. A response to Professor Richard B Capalli. Temple International and Comparative Law Journal 12: 107
Baudenbacher C 1999 Some remarks on the method of civil law. Texas International Law Journal 34: 333
Capalli R B 1998 Open forum: At the point of decision: The common law’s advantage over the civil law. Temple International and Comparative Law Journal 12: 87
Dawson J P 1978 (reprint 1986) Oracles of the Law. University of Michigan Law School, Ann Arbor
De Laubadère A 1984 Moderne & Delvolvé, 1 Traité des contrats administratifs, § 637, 2nd edn. L.G.D.J., Paris
Farnsworth E A 1996 A common lawyer’s view of his civilian colleagues. Louisiana Law Review 57: 227
Kötz H 1996 Europäisches Vertragsrecht. Mohr, Tübingen, Germany
Lando O (ed.) 1999 Principles of European Contract Law Parts I and II. Kluwer International, Dordrecht, The Netherlands
Rheinstein M 1938 Law faculties and law schools: A comparison of legal education in the United States and Germany. Wisconsin Law Review 5: 42
Von Bar C 1996 Gemeineuropäisches Deliktsrecht. Beck Publishing Company, Munich, Germany
Zweigert, Kötz 1987 An Introduction to Comparative Law 3rd edn. Clarendon Press, Oxford, pp. 109–15, 154, 231

P. Hay

Civil Liberties and Human Rights

‘Civil liberties’ and ‘human rights’ are closely related terms that embrace the basic freedoms and claims to which individuals are entitled, either as citizens of a particular state or by virtue of being human. The very ideas of civil liberty and human rights presuppose two intertwined convictions: that individuals or groups in civil society have moral status independent of the organized power of society (e.g., the state), and that this power must respect the rights that accrue to this status. Though they arose as a response to political and normative claims in the West, at the beginning of the twenty-first century civil liberties and human rights are at least officially endorsed by virtually all countries and the community of international law. The content and scope of these concepts are contested on philosophical and political grounds. Accordingly, the concepts provide a good method of examining the ways in which important philosophical and legal concepts interact with political and historical forces.

1. Basic Historical Background

The idea of a moral status independent of the state is tied in multifarious ways to notions of higher (or natural) or universal law, democracy, individual conscience, and limited government. These notions have echoes in history as far back as ancient Athens and Rome, and in medieval Christian thought. The modern turn toward individual ‘rights’ (as opposed to natural ‘law’) was the product of a complex historical process that included: Renaissance and humanist emphases on human achievement and creativity; the Protestant Reformation’s stress on individual religious conscience and religious pluralism; the Enlightenment’s belief in the power of reason and the individual,
the growth of markets, and, most importantly, the rise of democracy (see Law and Democracy). The specifically liberal tradition of limited government and natural rights arose in the political and intellectual history of several European countries, notably England, Scotland, France, and the Netherlands, and in the United States, in the seventeenth and eighteenth centuries.

Social contract theory in the seventeenth and eighteenth centuries envisioned ‘social contracts’ between the government and the citizenry based on the consent of the governed and the protection of natural rights. Building on the theory of Thomas Hobbes, John Locke (1988) maintained in his influential Second Treatise on Government (a work defending the Glorious Revolution in England in 1688, and the 1689 English Bill of Rights) that government’s primary purpose is the protection of the rights of individuals found in the state of nature, specifically rights to life, liberty, and property. The American Declaration of Independence (1776) and the French Declaration of the Rights of Man and of the Citizen (1789) carried these ideas further, declaring the primacy of civil and political liberties.

2. Basic Aspects of Civil Liberty and Human Rights

2.1 Definitions

Though the distinction between ‘positive’ and ‘negative’ rights (first articulated by philosopher Isaiah Berlin (1969)) is often blurred or overstated, civil ‘liberties’ are often considered negative rights, in that they serve as shields that protect the liberty and rights of individuals and members of civil society from state oppression. They represent claims against state action. Classic civil liberties include freedom of speech, the press, and assembly, freedom of religion, due process and fairness in legal proceedings (especially criminal process), privacy, and freedom from illegitimate discrimination (see Censorship and Secrecy: Legal Perspectives; Discrimination).

Civil liberties should be distinguished from civil rights. Civil ‘rights’ are often construed as more ‘positive’ rights, in that they entail the state bestowing a power to do something affirmative, or taking action to protect fundamental interests or claims against private (nongovernmental) actions. For example, the right to use privately owned public accommodations or facilities, or the right not to be discriminated against in private employment, can be construed as civil rights. More broadly defined positive rights may include such claims as the right to a job, to obtain adequate housing, and to share in more equal distribution of resources. More aggressive state action is needed to effectuate such rights (see Civil Rights; Fundamental Rights and Constitutional Guarantees).

Civil liberties and rights are generally claims tied to citizenship in particular legal orders. ‘Human rights’ are more universal in nature; they exist simply because one is a human being. These include the civil liberties discussed above, as well as freedom from torture, slavery, and degrading treatment; freedom of the family; and basic self-determination. Debate swirls around whether such rights include basic economic, social, and cultural rights and needs, or broader collective goods.

Though human rights claims can be derived from specific domestic or international legal sources, their claims are distinctively moral. As political theorist Jack Donnelly (1989) remarks, ‘Human rights claims are essentially extralegal; their principal aim is to challenge or change existing institutions, practices or norms, especially legal institutions.’ Accordingly, Donnelly emphasizes the ‘possession paradox’: Having a right is most important when ‘enjoyment of the object of the right is threatened or denied.’ Thus, human rights claims typically arise when a particular claim is not afforded legal protection by a particular country, such as homosexual conduct in some states in the United States, or religious conscience in China.

The distinct concept of ‘human’ rights arose in the aftermath of World War II, with the widespread condemnation of the atrocities the Nazis committed against Jews and other minorities. After having declined in the wake of skepticism and new political movements in the nineteenth and twentieth centuries (e.g., utilitarianism, emotivism, nationalism, and Marxism), notions associated with natural rights and natural law enjoyed a revival in the aftermath of the war, as democratic theorists regained respect for more objective moral principles that provide standards by which to evaluate the practices of states. Such theorists as Leo Strauss (1950) and Edward Purcell (1973) have written about a postwar ‘crisis’ in democratic theory along these lines. The term ‘human’ rights avoided the intellectual and political baggage associated with ‘natural’ law and rights, while at the same time pointing to universally held moral principles. In the unprecedented Nuremberg trials held after the war, Allied prosecutors convicted Nazi leaders of crimes against peace and humanity. Also, in the wake of World War II, the new United Nations made human rights an important part of its agenda, and Japan and West Germany, under the aegis of occupying forces, adopted constitutions that protect basic civil liberties and rights.

2.2 Relationship to the State

The concept of individual or natural rights is historically and pragmatically related, both positively and negatively, to the emergence of the modern nation states from feudalism between the thirteenth and seventeenth centuries. In 1648, the Peace of Westphalia, which ended the murderous Thirty Years’
War in Europe between Catholic and Protestant states, constituted the first formal international recognition of the nation state’s autonomy from religious authority. It also established the first official tolerance of religious pluralism, a crucial move in the rise of civil liberty and human rights. Yet the Westphalian model of international law left no room for the international enforcement of individual rights, as its main objective was the recognition of the principle of territorial sovereignty (domestic jurisdiction) of strong states.

Yet the rise of strong nation states made individual rights more important than they had been in the past, spawning new theories about the obligations of states to citizens. Indeed, another paradox (which also involves the endemic jurisprudential debate between legal positivism and forms of legal analysis based on natural law) concerns the relationship between rights claims and their enforcement or recognition (see Natural Law). Though many theorists persuasively contend that rights claims exist independently of legal protection, rights claims (positive and negative) have to be recognized and enforced by those in power in order to be effective. The Nuremberg trials present a classic example of this fact. (Some have called the verdicts ‘victors’ justice.’) As James Madison wrote, following the logic of Hobbes and Locke, liberal freedom can exist only when the state is strong enough to protect its citizens, but also limited enough so as not to oppress them. Writing in the aftermath of World War II, Hannah Arendt (1951) chillingly portrayed how vulnerable stateless people are to abuse of their humanity. Legal theorist Stephen Holmes (1995) puts the matter succinctly: ‘Weak-state pluralism is a recipe not for liberalism, but for a proliferation of rival and coercive mafias, clans, gangs, and cults … Liberal government … is meant to solve the problem of anarchy and the problem of tyranny within a single and coherent system of rules’ (pp. 270–1).

3. Some Basic Issues

Important questions have been raised about the content and intellectual foundation of human rights and civil liberties. What is the scope of rights? Are such rights derived from political or legal agreement, or are they postulates of theological or philosophical inquiry? Do we grasp them by intuition or reason? Which claims are fundamental, and which less fundamental, and how can we make this determination? Should the list of fundamental claims include only basic political and civil liberties, or should it also include social and economic rights? Are rights claims culturally determined or relative, as the American Anthropological Association officially declared in 1947, or are there general principles that make certain claims universal?

Some thinkers, such as Christian Bay (1982), maintain that human rights stem from human ‘needs,’ which include shelter, food, and livelihood; others emphasize human beings’ distinctive moral nature, stressing human dignity, self-respect, and citizenship. The debate concerns those who define human nature in largely materialistic or naturalistic terms, and those who define human nature in terms of such qualities as rationality, moral capacity, or spirituality.

Writers such as Henry Shue (1980) distinguish basic from less basic rights. A right is basic if its enjoyment is ‘essential to the enjoyment of all rights.’ These rights include physical security, economic security or subsistence, and liberty to participate in the economic and political life of the community. Still others, such as Donnelly, argue that this list is insufficient because a fully developed life requires more opportunities and attributes than these minimums. However, history shows that ‘negative’ civil liberties are necessary to protect us from the state, so these should always be on the short list of basic rights. If we decide to include more rights as basic, we must do so without sacrificing basic civil liberties. And we must understand that the longer the list of basic rights, the greater the potential for conflict among rights and social policies designed to promote them.

Theorists such as French jurist Karel Vasak (1982) posit ‘generations’ of rights based on historical development. The first generation consists of political and civil liberties, while the second generation embraces egalitarian social and economic rights. The so-called third generation rights involve humanity as a whole, including cultural self-determination, environmental health, solidarity, and peace.

The founding movements and documents in the rise of liberal democracy accentuated civil and political liberties. Yet the rise of socialism, Marxism, and the working class in the nineteenth century spawned the advocacy of social and economic rights in addition to (or instead of) civil and political rights. In the twenty-first century, such rights are found in the constitutions or fundamental laws of communist (and former communist) states and many developing or Third World States (see Socialist Law; Postcolonial Law). Though developed liberal states are mainly dedicated to political and civil rights, social and economic rights often comprise parts of their social and legislative policy. In the late 1990s, such internationalist groups as the Lawyers Committee for Human Rights and the Fair Labor Association contended that corporations’ use of factories in developing countries has made the protection of economic and social rights in those countries a primary concern.

Human Rights covenants in the United Nations reflect these debates (see International Law and Treaties). In 1946, the Economic and Social Council of the UN established the Commission on Human Rights, which led to the Universal Declaration of Human Rights in 1948, a foundational document that has achieved the status of customary international law. In 1966, the International Covenant on Civil and
Civil Liberties and Human Rights

Political Rights (ICCPR) and the International Covenant on Economic, Social, and Cultural Rights (ICESCR) were signed by most states, taking effect in 1976 (see Torture: Legal Perspectives). The ICCPR protects such basic civil liberties as freedom from arbitrary punishment, forced servitude, and unfair criminal process; freedom of thought, conscience, and religion; personal liberty and security; freedom of the family; freedom to participate in fair elections; and equal suffrage. An Optional Protocol of the ICCPR commits ratifying states to allow a special committee of experts to examine claims by individuals against them.

The ICESCR covers such rights as the right to work under good conditions; the right to an adequate standard of living; and the right to social security, food, clothing, shelter, and basic health. The ICESCR is less stringent in its wording than the ICCPR. Signatories established no rank ordering of these rights, and enforcement is more a matter of exposure and persuasion than force.

4. Civil Liberties in Practice

The protection of civil liberties varies (in legal provisions and applications of these provisions) in different countries due to cultural and political factors (see Fundamental Rights and Constitutional Guarantees; Constitutionalism, Comparative). We can cite just a few of many examples. For instance, free speech doctrine and practice in the United States protect the advocacy of illegal action, including racist rhetoric, that falls short of directly triggering a disturbance of the peace or inciting violence. Canada, Israel, Germany, France, and many other countries, on the other hand, prohibit speech that advocates racism (racist rhetoric), regardless of the likelihood of illegal action. In Germany it is illegal to belong to the Nazi party or to wear a Nazi uniform, while courts in the United States have expressly protected such actions.

Though virtually all countries have basic legal protections for criminal suspects, standards vary widely, especially when we look at practice rather than the letter of the law. In Russia, where legal institutions are poorly developed, preventive detention and criminal procedure rights of criminal suspects are poorly enforced despite legal protections to the contrary; France does not recognize the privilege against self-incrimination. In the area of religion, the United States maintains an exceptionally strict separation of church and state, while such democracies as Ireland, Italy, and Germany allow more accommodation between state and religion. In India, ‘personal laws’ linked to the major religions (Hindu, Muslim, Christian, and Parsi) are distinguished from normal civil law, thereby accommodating culturally based discrimination against women in such areas as marriage, divorce, and inheritance. And communist China has persecuted such religious groups as Christians and the Falun Gong because their views are allegedly contrary to state ideology.

Social scientists and legal scholars cite several factors that influence the extent to which countries will support civil liberties and rights on a sustainable basis. Commitment to civil liberty has historically been accompanied by social pluralism, legal institutions based on rule of law, and the differentiation of the state from civil society (see Rule of Law; Law and Development). More specific explanations include such factors as the existence of a bill of rights and judicial independence, judicial leadership and control of case dockets, and a culture of rights consciousness that encourages citizens to think in terms of rights (see Judicial Review in Law; Legal Culture and Legal Consciousness; Rights: Legal Aspects).

More recent explanations point to the presence of political and social movements that engender legal change (e.g., the Civil Rights Movement in the United States; the freedom movement of South Africa led by Nelson Mandela). Recently Charles Epp (1998) has pinpointed sustained pressure exerted by a ‘support structure for legal mobilization,’ which consists of rights-advocacy organizations and lawyers, and sufficient funding from private and (especially) public sources. Rights revolutions succeeded in the United States and Canada in recent decades because of the presence of these factors, while India’s rights movement was thwarted despite a favorable Supreme Court because such factors were absent (see Law as an Instrument of Social Change).

5. Human Rights in International Practice

Until recently, the international system remained committed to the Westphalian model’s strong presumption in favor of the ‘domestic jurisdiction’ of states. But this situation slowly began to change with the growth of consciousness of human rights, the democratic ethic, and globalization (see Globalization: Legal Aspects). The most important events before the end of World War II and the Nuremberg trials include: the abolition of slavery in the British empire in the 1830s and 1840s, culminating in the League of Nations’ Slavery Convention of 1926; the policy of ‘humanitarian intervention’ by Western states to protect Christian citizens abroad in the nineteenth century; and several conventions and treaties protecting the rights of soldiers in war promulgated in the nineteenth and twentieth centuries.

In the wake of World War II and the humiliating failures of the League of Nations, the world community established the United Nations, which promulgated the UN Charter. Whereas the Westphalian model is premised on the freedom of states over their
domestic jurisdictions, the ‘UN Charter’ or ‘new international law’ model embraces the Kantian model of international relations and law, which emphasizes universal peace and human dignity. In reality, the models coexist in the contemporary world, posing sometimes vexing questions about where to draw the line between state sovereignty and international human rights norms. Yet the conferees who established the Charter rejected a proposal to authorize intervention to protect rights, and a clause in the Charter expressly prohibits intervention in ‘matters which are essentially within the domestic jurisdiction of states.’

In recent decades, many international treaties and forums have been established under the aegis of the UN or other regional and international organizations to promote recognition of human rights (see International Law and Treaties). States have signed treaties and conventions against torture, genocide, and racial and gender discrimination, and treaties protecting refugees and children. Again, social and political movements have been indispensable to the promotion of human rights logic and practice. In addition to hundreds of regional and intergovernmental organizations (IOs, such as the Organization of American States, the Organization of African Unity, etc.), such non-governmental organizations (NGOs) as Amnesty International, Human Rights Watch, and the International Committee of the Red Cross have played major roles in raising awareness, linking international organizations, and even bringing cases for enforcement in relevant jurisdictions.

States have drafted regional agreements to protect rights on all continents except Asia. In Europe, the European Convention of Human Rights and Fundamental Freedoms (based on the ICCPR) formed the European Court of Human Rights, which takes cases after they have been heard by the relevant domestic courts (see European Union Law). Member states have agreed to accept all of the court’s rulings, leading, for example, to changes in Britain’s law of criminal procedure (most notably in the area of pretrial detention) and changes in several states’ laws concerning the rights of children born out of wedlock.

Not surprisingly, politics has affected the application of the ICCPR and ICESCR covenants. During the Cold War era, Western countries championed the ICCPR, while communist countries supported the ICESCR. Third World countries have advocated rights of self-determination, cultural rights, and collective rights concerning resources (debates over law of the seas, etc.). Cultural relativism remains an issue. In 1993, Asian countries challenged claims about the universality of political and civil rights during the UN World Conference on Human Rights, arguing that such rights can be counterproductive, even dangerous, in the context of economic underdevelopment, fragmented nationalism, and fragile state institutions. Theorists such as Donnelly counter by pointing out that these arguments ignore the actual plights of persecuted minorities in these countries, thereby serving the interests of entrenched elites or tyrants.

Enforcement of human rights at the international level has remained problematic because of the continued normative and prudential reluctance to intervene in the domestic jurisdiction of states. Hobbes and Locke would have predicted such a result in the absence of an international sovereign with sufficient power to enforce protections of rights. As a result, the main support for human rights has been in the form of moral persuasion (far from meaningless, if not always efficacious), exposure of violations through research and publication, and the deployment of such measures as economic sanctions. UN organizations have investigated several countries, including Chile, Rwanda, Somalia, Zaire, several Latin American countries, Iran, Iraq, and South Africa.

6. Victories and Defeats: The Continuing Dilemma of Rights and the State

The decline of Cold War politics in the UN Security Council has enabled the UN to be somewhat more aggressive, sponsoring peacekeeping and actual interventions to protect human rights in Somalia, Iraq, and Bosnia in the early 1990s. Yet the efforts in Somalia and Iraq have proved short-lived, and Bosnia’s fate was quite uncertain as of 1999. And in 1994, the world stood by while massive genocide took place in Rwanda. In 1999, Cuba, China, and Sudan championed the norm of the territorial sovereignty of states in order to shield their abuses of human rights from international intervention, even though these states are themselves members of the United Nations Human Rights Commission.

To be sure, Yugoslav President Slobodan Milosevic was defeated in his attempt to take over Kosovo in 1999, yet this victory was won by the military commitment and might of the North Atlantic Treaty Organization under the leadership of the United States and Britain; victory came only after Serbian forces had already carried out massive ethnic cleansing. During the war, the war-crimes tribunal in the Hague indicted Milosevic for war crimes; earlier (1995) it had indicted Serbian leader Radovan Karadzic and his top military officer, General Ratko Mladic. Yet no political body is in charge of bringing such indictments, and the United States and leading European countries have not managed to arrest these Serbian leaders, preferring instead to target lesser figures as of this writing.

After Kosovo, Czech President Vaclav Havel wrote in ‘Kosovo and the End of the Nation-state’ that the nation state would end in the next century, giving way to an international community governed ‘by universal or global respect for human rights, by universal civic equality and the rule of law, and by a global civil society’. Writer Leon Wieseltier (1999) replied that no oppressed soul had ever been saved by
the forces of ‘global civil society.’ Kosovo was delivered from Milosevic by the willful acts of allied nation states. Though the nation state is a source of evil, ‘it is also the nation-state from which we may demand rescue from such evils. The ethical content of a particular sovereignty is what finally matters.’

See also: Censorship and Secrecy: Legal Perspectives; Civil Rights; Civil Rights Movement, The; Constitutionalism, Comparative; Discrimination; European Union Law; Freedom/Liberty: Impact on the Social Sciences; Freedom: Political; Fundamental Rights and Constitutional Guarantees; Globalization: Legal Aspects; Human Rights, Anthropology of; Human Rights, History of; Human Rights in Intercultural Discourse: Cultural Concerns; Human Rights: Political Aspects; International Law and Treaties; Judicial Review in Law; Law and Democracy; Law and Development; Law as an Instrument of Social Change; Legal Culture and Legal Consciousness; Legal Positivism; Natural Law; Rights: Legal Aspects; Rule of Law; Socialist Law; Torture: Legal Perspectives

Bibliography

Arendt H 1951 The Origins of Totalitarianism. Harcourt, New York
Bay C 1982 Self respect as a human right: Thoughts on the dialectic of wants and needs in the struggle for human community. Human Rights Quarterly 4: 53–75
Berlin I 1969 Two concepts of liberty. In Four Essays on Liberty. Oxford University Press, London
Cassese A 1986 International Law in a Divided World. Clarendon Press, Oxford
Conot R E 1983 Justice at Nuremberg. Harper & Row, New York
Donnelly J 1989 Universal Human Rights in Theory and Practice. Cornell University Press, Ithaca, NY
Epp C R 1998 The Rights Revolution: Lawyers, Activists, and Supreme Courts in Comparative Perspective. University of Chicago Press, Chicago, IL
Henkin L 1990 The Age of Rights. Columbia University Press, New York
Holmes S 1995 Passions and Constraints: On the Theory of Liberal Democracy. University of Chicago Press, Chicago, IL
Lawson E (ed.) 1991 Encyclopedia of Human Rights. Taylor and Francis, New York
Locke J 1998 Two Treatises of Government [ed. Laslett P]. Cambridge University Press, Cambridge, UK
Purcell E A Jr 1973 The Crisis of Democratic Theory: Scientific Naturalism and the Problem of Value. University Press of Kentucky, Lexington, KY
Shue H 1980 Basic Rights: Subsistence, Affluence, and US Foreign Policy. Princeton University Press, Princeton, NJ
Steiner H J, Alston P (eds.) 1996 International Human Rights in Context: Law, Politics, Morals. Clarendon Press, Oxford, UK
Strauss L 1950 Natural Right and History. University of Chicago Press, Chicago, IL
Vasak K (ed.) 1982 The International Dimensions of Human Rights. Greenwood Press, Westport, CT
Wieseltier L 1999 Winning ugly. The New Republic June 28: 27–33

D. A. Downs

Civil Religion

In a seminal thesis published in the Winter 1967 issue of Daedalus, and in later revisions of his argument, Bellah (1967) claimed to have discerned two kinds of civil religion in the USA. One was fairly traditional: a composite of biblical themes that were compatible with the natural law tradition mediated by the church. This legacy saw in the history of the US a version of God’s election of ancient Israel. That is why Bellah referred to it as a ‘special civil religion’ that was compounded of prophetic warnings and commands to a chosen nation burdened with particular rights and responsibilities. The other civil religion was utilitarian rather than traditional, and it was based primarily on the thought, interests, and experiences of the American people themselves. Describing it as ‘the lowest common denominator of church religions,’ Bellah argued that it paid more attention to the interests than to the responsibilities of the people. Interpreted in terms of the social contract rather than of the covenant, it owed far more to John Locke than to the Bible (Bellah 1976a, p. 57).

Bellah alternated between arguing that the civil religion was vital and enduring and issuing warnings that the civil religion was in a precarious state. Bellah at first seemed to be sure that the civil religion was alive and well: far from dead, it would show its vitality during the American Bicentennial of 1976 (Bellah 1973). However, in reflecting on that celebration and also on the latest Presidential campaign addresses, Bellah was quite clear that Americans had largely forgotten about a past that they had never clearly understood.

Bellah himself alternated between thinking that his article coincided with the debut of the civil religion on the American scene and with its decline. On the one hand, he suggested that the civil religion had only come into being through his publication of the 1967 article and appeared confident in its continued existence (Bellah 1973). On the other hand, disappointed in the results of the Bicentennial, he regretted that it was only an ‘empty and broken shell’ (Bellah 1975).

Some religious leaders have condemned American civil religion as authoritarian, dangerous, and idolatrous. They are joined in this criticism by some leading politicians who see a civil religion as idolatrous. Some critics, therefore, argued that Bellah was engaged in an attempt to revitalize the nation itself by infusing its political institutions with religious meaning. Bellah merely wished, through exhortation and
admonition to recall the nation to its higher purpose (Crouter 1990). To others it was quite clear that Bellah was trying to resuscitate Protestant beliefs and public influence at a time when both seemed to be losing credibility and support. At the very least he was trying to invoke the legacy of the Protestant establishment of the late nineteenth century. ‘Bellah introduced theological principles that he presumed overarched the state and the religions it protected’ (Hammond et al. 1994, pp. 8–9). These criticisms persisted, despite Bellah’s disclaimer that the civil religion was merely another way of talking about a world view (Hammond et al. 1994, p. 2).

Bellah’s claim to have identified a civil religion that endures regardless of those who believe in it or who can verify its existence has elicited criticism or comments from those who see his work not as sociology but as political theology or ideology. Indeed, some noted that his interpretation of American religious and political ideals found its justification in theological propositions (Hammond et al. 1994, pp. 8–9). Perhaps in response to these critics, Bellah has argued that the civil religion needed no help from himself or from anyone else; it could endure on its own terms. On one occasion Bellah insisted that the civil religion had never been a majority viewpoint but continued to exist enshrined in certain texts, particularly Lincoln’s Gettysburg address. It, therefore, did not matter how many people believed in the civil religion or whether evidence could be found for it through the use of questionnaires. The civil religion was there, Bellah argued: a matter of ‘faith in certain abstract propositions which derive ultimately from God. If the “larger society” does not conform to them, so much the worse for it’ (Bellah 1976b, pp. 153–4).

Bellah himself has pointed out that his case for the American civil religion is quite in keeping with a Durkheimian approach to social life. Societies do express their identity and define themselves in religious terms; indeed any enduring form of social life may well become serious about its foundations, standards, boundaries, and destiny. The sacred is a pervasive aspect of social order. No wonder, then, that for Bellah (1989), while the concept of the civil religion may be dispensable, it nonetheless points to an enduring problem concerning the relation of the political to the religious aspects of any society (Marty 1974).

To sum up Bellah’s many and quite variant readings of the civil religion and of his own arguments, it is helpful to distinguish two sets of axioms. These are shared among sociologists who study religion and are not idiosyncratic to Bellah alone, but Bellah does provide an example of how they may operate in the thought of a single sociologist. On the one hand, Bellah identified himself as working within a set of assumptions that he would attribute to Durkheim. On the other hand, Bellah would also acknowledge the validity of a Weberian viewpoint that thinks of charisma as being somewhat ephemeral or evanescent and always distorted by any attempt to make it part of a routine or rational social order. Religion, on this view, is disruptive to institutions and is, therefore, most evident during times of crisis or chaos. Bellah would therefore not be surprised that some sociologists have found in the civil religion an episodic phenomenon that is most visible in times of crisis and is only gradually rooted in enduring institutions or ways of life (Marty 1974).

The debate on civil religion thus reflected a wide range of assumptions about the relation of religion to complex societies. Bellah, in keeping with his Durkheimian assumptions, saw the US as largely individualistic and utilitarian as a result of a break with the Biblical tradition. In his view the Constitution owed more to Locke and to notions of interest than to covenantal theology; for Bellah, that break represented a considerable decline in the moral vision of the nation.

On the other hand, also in keeping with a Durkheimian interest in a religion of humanity, Bellah and his associate, Philip Hammond, saw more universal possibilities for the civil religion. All societies express their political unity in civil religious terms. Their critics, however, accused them of expressing an American notion of manifest destiny under the guise of the civil religion (Weddle 1983).

Despite the protest by Hammond et al. (1994, p. 2) that Bellah did not have in mind an ‘idolatrous worship’ of the American nation–state, Bellah argued that American culture has within it the potential of becoming the basis for a global civil religion. It was a point in keeping with his Durkheimian interest in a religion of humanity, and others have found some support for his thesis. There is some evidence, for instance, that the US space program transformed astronauts from American celebrities into representatives of a global civil religion (Wilson 1984). Furthermore, the US is apparently not peculiar in its use of national heroes as exponents of a civil religion; similar processes appear to be at work in Yugoslavia (Flere 1994).

Subsequent studies have suggested that religion may still be engaged with the political system, but far less at the national than at the local level. Rather than influencing a broad range of social values, religion is now more likely to be engaged in interest-group and single-issue politics. Instead of a steady pull on the direction of social change, religion therefore increasingly exerts a temporary, however intense, influence during spurts of social mobilization on the part of particular communities and constituencies (Demerath and Williams 1992).

Some have argued that the civil religion is no longer a national conscience but a set of partial ideologies. It is, therefore, merely ‘a confusion of tongues speaking from different traditions and offering different visions of what America can and should be’ (Wuthnow 1998).
That is perhaps why Bellah’s more universal claim for the civil religion has aroused another set of criticisms to the effect that the civil religion, at least as Bellah conceived it, ignored the presence and claims of minorities and smacked of ‘cultural imperialism’ (Moseley 1994, p. 18).

Beyond the context of the US, however, scholars have found the notion of a civil religion to be a particularly suggestive concept. Despite, or because of, Bellah’s ‘broad and diffuse use of the term’ civil religion and the ‘theoretical instability’ of Bellah’s model, there has been a proliferation of studies of civil religion in a wide range of national contexts (Crouter 1990, p. 161).

Drawing on Dobbelaere’s (1986) discussion of the civil religion, it would be possible to distinguish four conditions under which religion would have different relationships to a national political system. Where religion is still an enduring and vital institution, and where it is still central to the nation–state, one would expect to find such traditional forms of civil religion as Islam and Eastern Orthodoxy. However, where traditional religion has been eroded or transformed, one might expect to find more secularized cultural systems like Soviet Marxism or Nazism. More ephemeral or episodic forms of religion might persist and remain central to a nation–state; consider the notion of a civil religion that is episodic and not widely known but still central to the history, traditions, and identity of the US.

Dynamic relationships between the center and the periphery, however, will change the meaning and location of civil religious beliefs and symbols. In Japan, since the nineteenth century, Shinto has moved from the center, where it was the civil religion of an aristocratic and military elite, to the villages and clans, where it has strengthened national resistance to Western influence. Whereas under the Meiji, Shinto had been central and enduring as a national civil religion, under occupation by the US Shinto was no less enduring, however marginal it had become to the public ideology of the nation (Takayama 1988, p. 328).

Although Japan’s national culture was transformed into one that was largely secular and democratic, Shinto remained vital to the ability of Japan to recover from the war and to rebuild its economy while preserving a sense of continuity with the past. However, as Shinto has been adopted by the corporations and the nation–state as a means of mobilizing the loyalty of workers and citizens, it has become more secular and arguably less significant to the mobilization and motivation of Japanese citizens and workers (Dobbelaere 1986).

The beliefs and symbols of the civil religion may therefore become the source of legitimacy for the nation–state or the target of cultural opposition. On the one hand, an ideal of an ethnically pure nation–state may legitimate the most brutal forms of ethnic cleansing. As in the case of Yugoslavia, it would be a mistake to underestimate ‘the power of a system of reified, prescriptive culture to disrupt the (contradictory) patterns of social life’ (Hayden 1996, p. 784). Similarly, in Chile a form of the civil religion reinforced by the church and articulated by the Pinochet regime sought to give religious legitimacy to a repressive military elite and regime (Christi and Dawson 1996). Such attempts, however, associate civil religion with a regime rather than with a nation–state as a whole and, therefore, place it in a more marginal location. The same observation applies to attempts by the dominant regime in Malaysia to use Islamic beliefs and practices to reinforce social discipline and to legitimate the political and economic goals of the state (Regan 1976, p. 103).

As nations become increasingly secularized, the civil religion, to the extent that it survives, will become marginal to the culture and politics of the nation–state and put in only episodic appearances during periods of social mobilization around specific issues. For instance, Celtic heroes and symbols were central to the resistance of France, Belgium, and the Netherlands against Germany in the nineteenth century. Recently, however, France and Spain have met with resistance from Celtic communal groups on their own peripheries. The symbols and identity of the Celtic periphery are thus being co-opted by the European Union in an attempt to assert a Pan-European communal culture (Dietler 1994).

As the cases of Northern Ireland and Wales demonstrate, there is no necessary connection between strong ‘nationalist doctrines’ on the periphery and an equally strong ‘nationalist politics’ (Beuilly 1985, p. 74). In the US, marginal groups that wished to define themselves in opposition to the national center that seemed to them to be insufficiently legitimate have established their own alternative versions of the civil religion: the Seventh Day Adventists being one example (Bull 1989, p. 181).

Some have, therefore, argued that the civil religion is an essential arena for the contest among opposing communal and interest groups and between the center and the periphery (Willaime 1993, p. 573). There is an intimate connection between civil religious protest on the periphery and oppositional politics toward the center. Indeed, it would appear that civil religion comes into being as a way of arbitrating the protest of local or communal groups against the state. Mexican-Americans, for instance, have joined elements of Catholic liturgy with traditional folk celebrations to mobilize and discipline farm-workers for an agricultural union movement in the United States. The movement used traditional religious symbols that had been central to Mexican culture for the purpose of opposing a dominant class and its institutions in the US (Bennett 1988). Indeed, highly sectarian forms of the civil religion have been developed in order to protest the secularization of the civil religion at the
political center; the Unification Church would be a prime example of this authoritarian tendency (Robbins et al. 1976). Similarly, representatives of Native American communities have attacked a politicized civil religion at the center as being inimical to the Indian traditions (Deloria 1992).

As Bryan Wilson and others have argued, there are strong secularizing tendencies in the Christian faith, and these have been deployed on various occasions against civil religion. In Sri Lanka the role of the King has been crucial in maintaining social and cosmic harmony; even the British took the role of the King in the most important ritual of Sri Lankan civil religion, until protest from missionaries forced them to cease their involvement. As a result, the ceremony has been degraded into a festival of Sri Lankan cultural arts, and the society as a whole has lacked the means of legitimating the center to the periphery (Seneviratne 1984).

There is some evidence that the decay or removal of the civil religion from public discourse in some countries has created space for a secular civil society to emerge. In Norway, civil religious symbols have been used by conservative and Christian elites to legitimate the monarchy, but there is a general tendency to keep religion on the margins of public life; even during times of crisis civil religious symbols may be notable by their absence from public discourse. As they remain on the periphery, however, the symbols of civil religion may be deeply, if not widely, held. In Norway, although public discourse is secular and political legitimacy is derived from the legal system, the Christian communities of the south-west still maintain a hope that Norway may become a Christian nation (Furseth 1994, pp. 46, 50–1). Although Canada lacks a civil religion, religious symbols and beliefs remain vital to regional and local communities (Reimer 1995).

On the other hand, some have argued that it is the very secularization of the center that raises fears that the society as a whole may disintegrate; these fears are the source of demands for a revitalized civil religion

cumstances that are both historical and geopolitical. Thus, it is not unusual that traditional forms should be co-opted by chauvinism or become a source of peripheral resistance to the center.

In the course of trying to clarify the civil religion thesis, various proponents have argued that there are a wide range of types based either on the content of the ideology or its social constituency. Martin, however, has pointed out the free-floating nature of the sacred and its contingent relation to religion and politics. Rather than developing complex typologies to encompass these relations, what is needed, therefore, is a series of statements of the ‘If … then …’ variety that stipulate the conditions under which civil religion may be more or less central, marginal, traditional, secularized, popularized, or politicized. It is appropriate to ask to what extent religion is ‘locked into the core processes of cohesion, power, and control’ and to investigate the extent to which religion and its dominant institutions maintain their traditional ‘relation to territory and history, to national belonging and death’ (Martin 1997, p. 104). One must also ask whether the religious community in question has a voluntaristic or ethnoreligious base and whether its symbolic options are more generic or particular. The answers to these questions will then help one interpret each community’s or nation’s construction of ‘the world’ as relatively open or closed, hostile or indifferent (Martin 1997, p. 56).

Further work, therefore, remains to be done on civil religion as an expression or outgrowth of conflict within and between civilizations. On this level, it remains to interpret civil religions as cultural developments of civilizations translated from their centers to new peripheries. Following Martin, it would be possible to view the American civil religion as a continuation and residue of the English civil war over a century earlier, just as conflicts between the North American center and its Central or Latin American periphery are continuations of the struggle between Northern and Southern Europe. On the basis of
(Willaime 1993, p. 571). As Martin (1997, pp. 28–35) Martin’s argument one could interpret the American
has pointed out, for a society to survive it must have civil religion as the response of one periphery to the
continuity, to achieve which it must maintain a certain English center: a response composed partly of a
identity. That identity, furthermore, always implies a tradition of establishment and partly of voluntaristic
difference between the society in question and all and ethnoreligious dissent (Martin 1997, p. 57). Simi-
others. By having a cultural as well as political center, larly, one could investigate Japanese Shinto and Sri
a society forecloses other possible bases for Lankan civil religion as the development of Buddhism
integration. The full range of possibilities, and the un- on Asian peripheries.
certainty of making choices among alternative ident- If arguments concerning the civil religion are to
ities, is not open to any society that wishes to maintain contribute to mainstream social scientific discussions
its identity, and therefore also its difference from other of nationalism however, further work remains to be
societies, over time. done on civil religions as the result of the ‘civilizing
Once this is understood, Martin (1997, pp. 29–35) process’ as religion is transplanted from the center to
argues, it is no longer problematical that religion may periphery. Relatively few scholars interested in civil
become one of the markers of identity or difference. religion have focussed on the work of Anderson
Whether religion becomes an exclusive marker or (1983), who has traced the interaction of Western
becomes associated with other forms of the sacred in a Christianity with a wide range of societies in both
particular society depends on a wide range of cir- hemispheres. In Anderson’s view, Roman Christian-

1877
Ciil Religion

ity, carried by the Church throughout Europe, had nationalism, and the manipulation of Celtic identity in modern
integrated a wide range of local cultures and religions Europe. American Anthropologist 96(3): 584–605
within a common civilization and by means of a lingua Dobbelaere K 1986 Civil religion and the integration of society:
A theoretical reflection and an application. Japanese Journal
franca (Latin). In developing local vernaculars into
of Religious Studies 13(2–3): 127–45
which the Bible and liturgy were then translated, Flere S 1994 Le De! veloppement de la sociologie de la religion en
however, the Church succeeded in encouraging in- Yougoslavie apre' s la deuxie' me guerre modiale (Jusqu’a son
digenous elites to develop a national culture resistant de! membrement). Social Compass 41(3): 367–77
to the imperial center. These smaller and more Furseth I 1994 Civil religion in a low key: The case of Norway.
cohesive entities thus represented a limited and secular Acta Sociologica 37: 39–54
reduction of the religious civilization that created Hammond P E, Porterfield A, Moseley J G, Sarna J D 1994
them. Thus civil religions, by this argument, are Forum: American civil religion revisited Religion and Amer-
secularized remnants of a trans-national religious ican Culture. A Journal of Interpretation 1: 1–7
Hayden R M 1996 Imagined communities and real victims: Self-
civilization.
determination and ethnic cleansing in Yugoslavia. American
See also: Citizenship: Sociological Aspects; Civic Ethnologist 23(4): 783–801
Martin D 1990 Tongues of Fire, The Explosion of Protestantism
Culture; Religion and Politics: United States; Reli-
in Latin America. Blackwell, Oxford, UK
gion: Mobilization and Power; Religion: Nationalism Martin D 1997 Does Christianity Cause War? Clarendon Press,
and Identity; Religion: Peace, War, and Violence Oxford, UK
Marty M 1974 Two kinds of civil religion. In: Richey R, Jones D
(eds.) American Ciil Religion 1st edn. Harper & Row, New
Bibliography York
Moseley J 1994 In: Hammond P E, Porterfield A, Moseley J G,
Anderson B 1983 Imagined Communities, Reflections on the
Sarna J D (eds.) Forum: American Ciil Religion Reisited.
Origin and Spread of Nationalism. Verso, London
Religion and American Culture, A Journal of Interpretation
Bellah R N 1967 Civil religion in America. Daedalus, Journal of
Winter 4(1): 13–18.
the American Academy of Arts and Sciences 96: 1000–21
Neuhaus R J 1986 From civil religion to public philosophy. In:
Bellah R N 1973 American civil religion in the 1970s. Anglican
Rouner L S (ed.) Ciil Religion and Political Theology.
Theological Reiew , Supplemental Series 1: 8–20
University of Notre Dame Press, Notre Dame, IN, pp. 98–110
Bellah R N 1975 The Broken Coenant. Seabury, New York,
Regan D 1976 Islam, intellectuals and civil religion in Malaysia.
pp. 142–158
Sociological Analysis 37(2): 95–110
Bellah R N 1976a The revolution and the civil religion. In: Bauer
Reimer S H 1995 A look at cultural effects on religiosity: A
J C (ed.) Religion and the American Reolution. Fortress Press,
comparison between the United States and Canada. Journal
Philadelphia
for the Scientific Study of Religion 34(4): 445–57
Bellah R N 1976b Response to the panel on civil religion.
Richey R, Jones D (eds.) 1974 American Ciil Religion. Harper &
Sociological Analysis 37(2): 153–9
Row, New York
Bellah R N 1978 Religion and legitimation in the American
Robbins T, Anthony D, Doucas M, Curtis T 1976 The last civil
republic. Society 15(4): 16–23
religion: Reverend Moon and the Unification Church. Soci-
Bellah R N 1980 The five religions of modern Italy. In: Bellah R,
ological Analysis 37(2): 111–26
Hammond P (eds.) Varieties of Ciil Religion. Harper & Row
Seneviratne H L 1984 Continuity of civil religion in Sri Lanka.
San Francisco
Religion 14: 1–14
Bellah R N 1989 Comment. Sociological Analysis 50(2): 129–46
Takayama K P 1988 Revitalization movement of modern
Bennett S 1988 Civil religion in a new context: The Mexi-
Japanese civil religion. Sociological Analysis 48(4): 328
can–American faith of Cesar Chavez. In: Benavides G, Daly
Weddle D L 1983 Review of varieties of civil religion. Journal of
M W (eds.) Religion and Political Power. State University of
the American Academy of Religion. LI(3): 198–9
New York Press, Albany, NY
Willaime J 1993 La religion civile a' la française et ses me! t-
Beuilly J 1985 Reflections on nationalism. Philosophy of the
amorphoses. Social Compass 40(4): 571–80
Social Sciences 15: 65–75
Wilson C R 1984 American heavens: Apollo and the civil
Bull M 1989 The Seventh-day Adventists: Heretics of American
religion. Journal of Church and State 26(2): 209–26
civil religion. Sociological Analysis 50(2): 177–87
Wuthnow R 1998 The Restructuring of American Religion:
Christi M, Dawson L 1996 Civil religion in comparative
Society and Faith since World War II. Princeton University
perspective: Chile under Pinochet (1973–1989). Social Comp-
Press, Princeton, NJ, p. 244
ass 43(3): 319–38
Crouter R 1990 Beyond Bellah: American civil religion and the
Australian experience. The Australian Journal of Politics and R. K. Fenn
History 36(2): 155–65
David D H 1998 Editorial: Civil religion as judicial doctrine.
Journal of Church and State 40(1): 7–24
Deloria V 1992 Secularism, civil religion, and the religious Civil Rights
freedom of American Indians. American Indian Culture and
Research Journal 16(2): 9–20
Demerath N J, Williams R H 1992 A Bridging of Faiths: Religion Civil rights legally protect individuals or groups from
and Politics in a New England City. Princeton University Press, certain forms of oppression. While civil rights are
Princeton, NJ commonly associated with the 1960s movement in the
Dietler M 1994 ‘Our ancestors the Gauls’: Archeology, ethnic USA to establish equality for people of African

descent and—more generally—with the US Bill of Rights, by the end of the twentieth century their reach and recognition were global. In modern political, academic, and public usage, civil rights embody, and provide legal support for, basic concepts of human dignity and respect for individuals in their diverse cultures and ways. Recognition and enforcement of civil rights, or some assortment of the most fundamental civil rights, is widely understood as a necessary element of freedom, democracy, and equality.

Civil rights are usually established in a written constitution, statute, or treaty. However, they have been based on unwritten constitutions and have occasionally arisen from the pronouncements of monarchs and conquerors, perhaps most famously Napoleon. These sources of civil rights protections are not self-executing: some countries have strongly worded civil rights provisions in written constitutions that are not enforced; the meaning and interpretation of civil rights provisions are usually controversial; and despite the prevalence of civil-rights aspirations and rhetoric, no country has as yet found a reliable method for systematic, consistent protection of civil rights.

1. Origins

The origins of some civil rights go back almost a thousand years to the earliest limitations on governments, which at least indirectly protected individuals. Civil rights as such emerged in the seventeenth and eighteenth centuries, as social development and political and philosophical thought emphasized the individual and, later, based national sovereignty and legitimacy on democratic forms of government. However, systematic legal enforcement, and even widespread aspirational recognition, of civil rights did not occur until after World War II. They quickly achieved global acceptance in the last half of the twentieth century; the laws of almost all countries, and several widely adopted international pacts, at least purport to protect some array of fundamental rights of individuals.

Approached historically and contextually, the various civil rights can be traced to practices by governments or by powerful individuals or institutions that came to be viewed as oppressive. Thus, the Fourth Amendment to the US Constitution limiting police searches was a response to the dreaded house-to-house general searches by British authorities in the colonial period. However, such origins are often obscured, because civil rights—like other basic legal precepts—are usually analyzed and clothed with the rhetoric of foundational principles and tend to take on, in law as well as in politics and fiction, an abstract, timeless quality that transcends history, experience, and human agency. This has occurred in the USA with freedom of speech, which has acquired a major place in the national origin story, although its most significant features developed in the mid-twentieth century by common forms of social change (Kairys 1982, Levy 1985).

2. Scope

The most common civil rights are: prohibition of discrimination based on race, ethnicity, religion, and gender; the right to personal security, including protections for persons accused or suspected of crimes; the right to vote and to participate in democratic political processes; and freedom of expression, association, and religion. Privacy, which in the US system encompasses personal security and autonomy, as well as control over personal information, is of growing concern throughout much of the world.

Personal security was the first, and remains the primary, civil right. It was a major focus of leading Anglo–American legal authorities, from William Blackstone (1769/1992) and St. George Tucker (1803) to Thomas Cooley (1874). The pathbreaking civil rights text by Thomas Emerson and David Haber (1952) considers the right to personal security first and describes its significance in the introduction: ‘In a society based upon human dignity and the development of the individual personality, clearly all members are entitled to security of the person—protection from bodily harm, involuntary servitude, and the fear of physical restraint.’ Although the rights of criminal defendants in the USA are commonly thought to originate in 1960s Supreme Court rulings, the US Bill of Rights has as its major subject and most numerous protections a series of limits on the government’s power to punish individuals criminally.

International bodies, treaties, and courts—and many countries—prefer the term ‘human rights,’ which usually encompasses a broader array of rights and places different priorities on some particular rights than does US civil rights law (see Law: Defense of Insanity). The United Nations Charter, the Universal Declaration of Human Rights (1948), and many other human-rights treaties and covenants protect social, economic, and cultural rights as well as political and civil rights. These rights place affirmative obligations on governments to provide all their people with minimal nutrition, healthcare, housing, and education. Eleanor Roosevelt, criticizing the US reluctance to include economic and social rights, put it best: ‘You can’t talk civil rights to people who are hungry.’

Internationally, the content of human rights also often exceeds specific US civil rights protections, including, for example, prohibition of the execution of children and mentally disturbed adults and protections against domestic abuse. Other nations may also place
different relative importance on various rights, such as rejecting absolutist free-speech rights and emphasizing equality rights. While the USA has adopted most of the human rights treaties and covenants (although often with specific reservations), US courts have not consistently enforced them. The USA is also unusual among the nations of the world in its renewed embrace of capital punishment and in the retrenchment of procedural and fair trial rights to expedite executions.

In most countries, civil rights protect individuals from their governments, but protections for groups and from other individuals have also been recognized. Prohibitions of slavery around the world limit individuals, as well as governments. Equal rights for women have included limits on individually inflicted violence against women and recognition of the rights of women as a group.

3. Universal, Absolute, and Conflicting Rights

Civil rights lose their meaning and importance if they are not extended to all people. However, the universality of civil rights, both within a particular country and across national borders, and the common notion that extension of civil rights to those previously denied them has no effect on others, are problematic. Typically, recognition of civil rights disturbs a status quo in which privileges and hierarchies were entrenched. If previously silenced speakers or voters gain the rights to speak and vote, the effectiveness of the voices and votes of the previously privileged is diminished. Abolishing racial discrimination in employment or college admissions diminishes the prospects of the previously privileged. Abolishing racial discrimination in public accommodations diminishes the freedom of association of those who wish to exclude racially.

Recognition of the rights of people who have been oppressed involves a determination that the interests, expectations, tastes, and sometimes civil rights of others constituted an illegitimate or disproportionate privilege which came only at the expense of that oppression, and to which they should not have been, and no longer are, entitled. The failure to acknowledge the interconnectedness of rights, interests, privileges, and expectations has probably made acceptance of civil rights advances more difficult than it has to be. This has been a substantial problem in the USA, where interconnectedness is only acknowledged for affirmative action and where civil rights tend to be articulated in absolutist terms.

Absolutist conceptions of civil rights usually rest on the same thinking, memorably critiqued by Justice Oliver Wendell Holmes’ famous example of shouting ‘fire’ in a crowded theater. Words have effects; otherwise we might not be so fond of them, and in some circumstances those effects outweigh the value of speech. Such occasions are rare, but they do exist—for example, where one person’s speech is interfering with or drowning out another’s or all others’. Nevertheless, the US Supreme Court has protected electoral campaign finances as speech in absolutist terms that make even minimal regulation difficult or impossible (Buckley vs. Valeo 1976). The Court characterized money as speech and refused to recognize as significant or legitimate countervailing concerns the importance of fair elections to society and other individuals or the speech rights of those whose voices cannot be heard in money-driven elections.

Absolutist conceptions of this sort, though often articulated by staunch advocates of individual rights, are reminiscent of the ‘Lochner era’ in the USA, in which the civil right to due process was used by the Supreme Court to invalidate a series of economic reforms aimed at protecting working and poor people. The combination of an absolutist formulation and a civil right based on unrestrained free enterprise yielded one of the most notorious eras in US constitutional law. These rights that people around the world seek and rely on for protection from powerful individuals, groups, and institutions, often at considerable personal sacrifice, can become another instrument for the powerful.

There is also a fundamental sense in which all civil rights protect individuals by limiting others—either all others collectively, because civil rights prohibit certain actions by majorities and governments, or individually, in the situations in which civil rights protect against private as well as governmental actions. Thus, civil rights both bestow and limit freedom. They limit the ability of even democratically constituted governments, for example, to discriminate against a racial minority or to ban an opposition political party, by placing the individual’s rights above collective power. Civil rights are, in this sense, both a limit on and a necessary feature of democracy.

These tensions permeate heated controversies about the universality of civil rights across national borders. The conventions on the rights of women referred to above have been the subject of an unusual array of national reservations, usually on the ground that they clash with cultures and religions, which also can claim civil rights protection. For example, female genital mutilation violates several of the most fundamental and generally accepted civil and human rights, but it is also central to some cultures and religions.

4. Formulation and Enforcement: the US Model

The US tradition of extensive judicial power, which includes judicial innovation beyond the specific language of a constitution, statute, or treaty, is gaining some acceptance internationally. This expanded acceptance is in part due to its reputed tendency to
protect civil rights, although the role of courts in civil rights matters (and more generally) is controversial in the USA. Many other countries have tended to rely on international human rights treaties or legislation. Courts lack the democratic legitimacy of legislatures, while their very distance from democratic processes could enhance the potential for protection of individual rights against even mobilized majorities.

Protection of individuals and groups from majoritarian oppression is the hallmark of meaningful civil rights and provides the best measure of a particular society’s civil rights record. A closer look at the history and record of a particular country—the USA, because of its historical role and wide recognition as a leader and model for civil rights—demonstrates both the potential and the difficulty of sustained, systematic protection of civil rights.

The role of courts in the civil rights victories of the 1960s and throughout US history is often exaggerated. The integration of public schools (Brown vs. Bd. of Educ. 1954) was a milestone, but most aspects of equality for African Americans in the twentieth century were established by the US Congress. In the 1960s these included banning racial discrimination in voting, housing, employment, and public accommodations. The major civil rights advances of the nineteenth century were achieved by constitutional amendment and legislation.

The history of US courts reveals them to be more often an obstacle to civil rights than a protector of them. For example, the US Supreme Court ruled in 1857 that African Americans have ‘no rights which the white man is bound to respect’ and are not fully citizens or human beings (Dred Scott vs. Sanford), which destroyed the Missouri Compromise on slavery and was a contributing cause of the Civil War; approved segregation even after the Civil War amendments to the Constitution (Plessy vs. Ferguson 1896); negated the ‘privileges and immunities’ clause of the Fourteenth Amendment (Slaughterhouse Cases 1873), which applied civil rights protections against the states; refused to protect even the most basic free speech rights until the 1930s (Davis vs. Massachusetts 1897; Kairys 1982); and approved the imprisonment of all persons of Japanese ancestry on the west coast during World War II without any charges or proof of individual guilt (Korematsu vs. US 1944). Generally, throughout US history, oppressed minorities and individuals facing repression by government or organized majorities—including in modern times the repressive McCarthyism of the 1950s and the approval of state criminalization of gay sexual activity continuing into the new millennium—have not been significantly protected by the courts.

There are only two periods in US history characterized by systematic or sustained judicial protection of civil rights, from about 1937 to 1944 and from about 1961 to 1973. Both these periods occurred in the context of mass popular support and successful legislative initiatives for civil rights. During these periods, the Supreme Court established: a system of limited government and protection of individual rights available to people of ordinary means probably unrivaled in world history, including strenuous protection of speech, association, and privacy; procedural and substantive rights of persons accused or suspected of crime; the rights of women; voting and participatory rights; and prohibitions of discrimination (Kairys 1993, 1998; W. Va. Bd. of Educ. vs. Barnette 1943).

The main judicial vehicle for protecting civil rights in these two periods was the ‘strict scrutiny’ standard of judicial review, by which any government action that infringed or restricted these rights was strictly scrutinized and presumptively invalid, surviving only if supported by a ‘compelling’ government interest that could not be furthered by ‘less restrictive’ means. While these decisions seldom articulated absolute rights (although absolutist rhetorical flourishes were common), in practice strict scrutiny meant near-certain invalidity.

Starting in the mid-1970s, as the national political mood and the composition of the justices changed, the Supreme Court repudiated or significantly diluted almost all of the major civil rights advances of the post-World-War-II period. This retrenchment was led in the political arena by President Ronald Reagan and a conservative movement that emphasized limiting government and espoused the sometimes extreme distrust of government evident in many periods of US history (Wills 1999).

The thrust was to empower the people by limiting the courts—in the popular phrase, ‘judicial restraint’—which, paradoxically, strengthens government and weakens individual rights. Thus, this conservative judicial restraint trend reviled the 1965 case establishing a civil right to privacy (Griswold vs. Conn.), a decision that protected the individual and the people generally from a state law that banned all use of birth control. A judicially based system of protection of civil (or any other) rights requires the courts to intervene when government is infringing on protected rights, and conservative justices have not hesitated to intervene to protect civil rights they value (e.g., Buckley vs. Valeo 1976; Shaw vs. Reno 1993). Historically, the pattern in the USA has been advocacy of judicial restraint by those whose values and approaches are being rejected by the courts. Perhaps the most vehement and successful advocate of judicial restraint was liberal President Franklin Roosevelt, whose reforms were struck down by a Supreme Court intent on imposing unrestricted laissez-faire economics (Kairys 1998).

In this retrenchment, the rights to exercise non-mainstream religion and against establishment of religion were substantially undercut, including the refusal to protect a Native American religious ceremony that pre-dates Christianity (Empl. Div. vs. Smith 1990). Speech rights available to people of
ordinary means were retrenched (for example, the public areas and circumstances available for protest were substantially narrowed), while speech rights available to corporations and wealthy individuals were enhanced, including recognition of corporate free speech rights. The equality rights of minorities were undercut by adoption of virtually insurmountable burdens of proof, resulting in denials of relief for some egregious practices left over from segregation. The equality rights of the white majority have been enhanced, resulting in successful challenges to electoral reapportionment and affirmative action, even where they are good faith remedial efforts to end discrimination against African Americans and other minorities. As a result, almost all the winning parties in racial equality cases decided by the Supreme Court in the last two decades of the twentieth century were white. The US attempt to end centuries of discrimination that included slavery and forced segregation was short and mostly limited to erasing de jure discrimination; the mass of African Americans live in the same segregated poverty they endured under de jure segregation (Kairys 1993, 1994, 1996; Memphis vs. Greene 1981, Richmond vs. Croson 1989, Shaw vs. Reno 1993).

This retreat was accomplished by a range of judicial means. In some areas, such as exercise of religion and some aspects of privacy, the Court withdrew strict scrutiny protection. In others, such as equality, establishment of religion, and some aspects of free speech, a range of new rules made it nearly impossible to prove a violation—such as the requirement for proving that government not only violated a civil right but did so purposely, with the specific motivation to violate the right.

None of these new rules is required by the language of the US Constitution or by rules of legal reasoning or analysis; nor is the strict scrutiny standard that was used to protect civil rights. This is characteristic of the judicial model, which in the USA yielded cyclic protection of civil rights in some (not all) periods in which those rights had strong, sustained, and politically mobilized support.

By the end of the twentieth century, the lead in civil rights passed from the USA to other nations and to international human rights tribunals and agreements. The Constitutional Court of South Africa led a worldwide trend to ban or limit the death penalty (State vs. Makwanyane 1995); the European Court of Human Rights, unlike the Supreme Court of the USA (Bowers vs. Hardwick 1986), protected the rights of gay men and lesbians (Dudgeon vs. United Kingdom 1981); and the rights of racial minorities and women were advanced principally by international human rights agreements. Progress in this direction has come mostly as a result of international human rights agreements and the growing sense they represent that oppression and repression, though they have hardly ceased, are no longer internationally acceptable.

5. Prospects for Civil Rights

Since civil rights can encompass a variety of possible limits on government to protect individuals and groups (and some limits on other individuals and affirmative obligations of governments), the details of their substance, context, and history are necessary for an understanding of their social meaning and importance. In this sense, it is hard to say that any of us is ‘for civil rights,’ and perhaps harder to say we are ‘against civil rights’ (which might mean government without limits), without specifying whose rights, which rights, and the context. For some, government is most oppressive when it limits gun ownership or attempts to overcome the effects of past discrimination; for others, it is most oppressive when it limits exercise of religion, abortion rights, or the rights of historically oppressed minorities. In the post-World-War-II period, the USA set an international standard for protection of minorities and the enforcement of a range of civil rights that protect the individual from oppression by government and powerful majorities; but US history also demonstrates the fragility of civil rights, and the use of civil rights to insulate the powerful from popular reforms aimed at protecting working and poor people.

Civil rights have played an important role in the advancement of human dignity, respect for the ways of diverse cultures and individuals, and democratic forms of government. In a relatively short time in the last half of the twentieth century, racism moved from acceptable or trivial to wrong, and a deep sense of civil rights and fair play became an international intellectual and moral standard. However, this has not stopped—by any reasonable measure—racial, ethnic, religious, and national oppression or violence, or the routine denial of the range of basic civil rights. There is a substantial and troubling divide between our words, aspirations, and moral precepts on the one hand, and our actions on the other. We may be a species whose gift of reason and empathy has outpaced its habits of survival, which still reside—as if helpless to listen to our own pleas—in deep-seated (perhaps significantly genetic) tribalism. If this is so, the struggle for civil rights and human dignity will be long, indeed, but all the more compelling.

No single source or enforcement mechanism for civil rights has proved consistently superior to others. Establishment and enforcement seem most effective when accomplished by the most popular and participatory means available, and popular understanding of the importance and history of civil rights is probably the most significant factor in their continued vitality.

See also: Bill of Rights; Civil Liberties and Human Rights; Civil Rights Movement, The; Discrimination; Discrimination: Racial; Fundamental Rights and Constitutional Guarantees; Human Rights, Anthropology of; Human Rights, History of; Human Rights in Intercultural Discourse: Cultural Concerns; Human
1882
Civil Rights Movement, The

Rights: Political Aspects; Race and the Law; Rights; Rights: Legal Aspects

Cases:
Bowers vs. Hardwick, 478 US 186 (1986).
Brown vs. Board of Education, 347 US 483 (1954).
Buckley vs. Valeo, 424 US 1 (1976).
Davis vs. Massachusetts, 167 US 43 (1897).
Dred Scott vs. Sandford, 60 US (19 How.) 393 (1857).
Dudgeon vs. United Kingdom, 45 European Ct. of Human Rights (ser.A), 4 E.H.R.R. 149 (1981).
Employment Division vs. Smith, 494 US 872 (1990).
Griswold vs. Connecticut, 381 US 479 (1965).
Korematsu vs. United States, 323 US 214 (1944).
Lochner vs. New York, 198 US 45 (1905).
Miller vs. Johnson, 115 S.Ct. 2475 (1995).
Memphis vs. Greene, 451 US 100 (1981).
Plessy vs. Ferguson, 163 US 537 (1896).
Richmond vs. Croson, 488 US 469 (1989).
Shaw vs. Reno, 509 US 630 (1993).
Slaughterhouse Cases, 83 US (16 Wall.) 36 (1873).
State vs. Makwanyane, 1995(3) South Africa Law Reports 391 (1995).
West Virginia State Board of Education vs. Barnette, 319 US 624 (1943).

Bibliography

Blackstone W 1992 Commentaries on the Laws of England. Hein, New York
Convention of Discrimination Against Women 1992 Recommendation No. 19, Violence Against Women
Cooley T 1874 Constitutional Limitations. Little, Brown, Boston
Copelon R 1998 The indivisible framework of international human rights: Bringing it home. In: Kairys D (ed.) The Politics of Law, 3rd edn. Basic Books, New York, Chap. 9, pp. 216–39
Cover R 1975 Justice Accused. Yale University Press, New Haven, CT
Emerson T, Haber D 1952 Political and Civil Rights in the United States. Dennis, Buffalo, NY
Henkin L 1990 The Age of Rights. Columbia University Press, New York
Henkin L, Neuman G, Orentlicher D, Leebron D 1999 Human Rights. Foundation Press, New York
Janis M, Kay R 1990 European Human Rights. University of Connecticut Law School Foundation Press, Hartford, CT
Kairys D (ed.) 1982 The Politics of Law. Pantheon Books, New York
Kairys D 1993 With Liberty and Justice for Some. New Press, New York
Kairys D 1994 Race trilogy. Temple Law Review 71: 1–12
Kairys D 1996 Unexplainable on grounds other than race. American University Law Review 45: 729–49
Kairys D (ed.) 1998 The Politics of Law, 3rd edn. Basic Books, New York
Levy L 1985 Emergence of a Free Press. Oxford University Press, New York
Locke J 1771 Two Treatises on Civil Government. Whiston, London
Paine T 1992 The Rights of Man. Hackett, Indianapolis, IN
Rousseau J–J 1901 On the Social Contract. Tudor, New York
Tribe L 1988 American Constitutional Law, 2nd edn. Foundation Press, New York
Tucker St. G 1969 Blackstone’s Commentaries, with Notes of Reference to the Constitution and Laws of the Federal Government of the United States and of the Commonwealth of Virginia. Rothman, South Hackensack, NJ
Universal Declaration of Human Rights. 1948 G.A. Res. 217A, U.N. Doc. A/810
Wills G 1999 A Necessary Evil, A History of American Distrust of Government. Simon and Schuster, New York
Winant H 2001 The World is a Ghetto. Basic Books, New York

D. Kairys

Civil Rights Movement, The

The modern American Civil Rights Movement was one of the pivotal freedom struggles of the twentieth century. This article will discuss the historic oppression of African Americans and how the Civil Rights Movement fought to liberate that population. It will also analyze the origins of that movement as well as its national and international achievements.

1. Black Oppression
Even as late as the middle of the twentieth century, millions of Black citizens in the USA were socially oppressed, economically exploited, and disenfranchised politically. These conditions of subjugation endured in a nation viewed as the world’s leading democracy. This view of the USA was projected nationally and internationally by White US leaders. The US image of democracy contrasted sharply with the actual treatment of African Americans. Unlike European immigrants, Black Africans were forcibly transported to America as slaves. As captives of the institution of slavery for over two centuries, Blacks supplied the free slave labor that assisted the USA in becoming an economic superpower. Slavery denied Blacks the basic rights that constitute the foundation of a democracy. Indeed, during slavery, African Americans were officially defined as chattel, not human beings.
The American Civil War (1861–5) was the force that overthrew the slave regime. With the triumph of Union forces in 1865, Black equality became a real possibility. For a brief period following the Civil War it appeared that the former slaves would be granted their democratic rights. During the Reconstruction period, Blacks gained expanded citizenship rights including freedom of movement, restricted male access to the franchise, and access to employment. This was a promising beginning but it would not endure for long.
By the turn of the twentieth century the Reconstruction period had come to an end. In the early


1900s, a formal system of racial segregation known as Jim Crow was firmly established. The new form of racial oppression required that Blacks and Whites be segregated on the basis of race. Thus, legally the two races were not allowed to attend the same movie theaters, drink from the same water fountain, sit on the same side of a courtroom, or be sworn in with the same Bible. Blacks and Whites were prohibited from occupying the same space on a public bus or train. In short, Blacks were denied equal access to public accommodations. Moreover, Blacks were excluded from the political process. Toward the close of the nineteenth century the emerging Jim Crow regime received backing from the highest court of the nation. Thus an 1896 Supreme Court ruling, Plessy vs. Ferguson, declared that racial segregation was constitutional. It ruled that it was constitutionally legal for the two races to use separate-but-equal facilities. With this ruling the Jim Crow regime became national in scope although it was more rigorously enforced in the South.
By the middle of the twentieth century the majority of Blacks were disenfranchised. In the South, Blacks held no significant political offices and they were constant victims of terror and violence including lynchings. In the labor market Blacks were restricted to low-paying undesirable jobs. As a result, economic exploitation of Blacks was widespread. Beyond material subjugation, African Americans experienced daily personal humiliation because racial segregation marginalized them as a people and labeled them as an inferior race. Human dignity was stripped from African Americans: simple titles of respect such as ‘Mr.’ or ‘Mrs.’ were withheld, and even White youngsters held symbolic authority over all Blacks, however elderly or eminent.

2. The Movement’s Origin
African Americans consistently attacked the Jim Crow regime. Black protests began during slavery and remained evident during the Jim Crow period. At times, resistance was collective and public, while at other times it remained covert and limited in scope. By the 1950s African Americans had produced a long, rich tradition of social protest. The modern Civil Rights Movement drew upon this tradition and embedded itself deeply within the historic struggle for Black liberation.
The Civil Rights Movement took deep root in the South in the mid-1950s. Thus, this movement emerged where Black oppression was most intense and where racial segregation was firmly entrenched. Given the scope of oppression and the power of the White opposition, the birth of a powerful resistance movement was unanticipated, especially by Whites who thought that segregation had become part of the natural order.
The movement took off during this period for several reasons. First, by the 1950s large numbers of African Americans had migrated to southern cities where they developed tightly knit urban communities and dense, effective communication networks. These resources made it possible for mass mobilization to occur. Second, by the 1950s the National Association for the Advancement of Colored People (NAACP) had won successful legal rulings against the Jim Crow system, especially in the 1954 Supreme Court decision, Brown vs. Board of Education, which reversed the 1896 Plessy vs. Ferguson decision. The Brown ruling stated that separate schooling based on race was unconstitutional. The implications of this ruling went far beyond this case for it delegitimized the entire system of racial segregation and encouraged struggles to implement the court orders.
Additionally, changing international relations were important to the birth of the movement. By the middle of the twentieth century African and Third World countries were gaining independence through anticolonial struggles. African Americans identified with those struggles, which intensified their own thirst for freedom. Moreover, the context of decolonization and Cold War rivalry rendered the US Federal Government susceptible to pressure from an oppressed Black population because the USA sought to persuade these new Third World nations to model themselves after US democracy, not the Soviet alternative. However, US racial oppression stood as a barrier to harmonious relations between the USA and new Third World nations. The proliferation of television and communication satellites made it possible for Black oppression to become visible worldwide. America’s participation in the two world wars and the Korean War also rendered it vulnerable to Black protest because in those wars the USA championed egalitarian values. In this context, Black soldiers were radicalized while fighting for democracy on distant shores. Thus, the federal government came under increased and sustained international and domestic pressure to support efforts to overthrow institutionalized racial segregation.

2.1 Strategies, Tactics, and Goals
These factors created the fertile soil from which the Civil Rights Movement emerged. By employing the strategy of mass, nonviolent, direct action, this movement mobilized widespread social protest. For such a strategy to succeed, White communities, businesses, and institutions had to be disrupted and prevented from doing business as usual. To this end, civil rights leaders and organizers mobilized thousands of African Americans to confront the Jim Crow regime through social protest.
The Civil Rights Movement succeeded in mobilizing massive nonviolent social protest. Innovative tactics


included economic boycotts (beginning with the year-long boycott of a bus company in Montgomery, Alabama, sparked by the arrest of Rosa Parks in December 1955 and led by Martin Luther King Jr.); sit-in demonstrations intensified in February 1960 by Black college students at a lunch counter in Greensboro, North Carolina; dramatic confrontations in the streets of Birmingham, Alabama, in 1963; and mass marches (including a massive mobilization of Whites and Blacks in the August 1963 March on Washington, which culminated in King’s ‘I have a dream’ speech, and protest marches led by King that met with police violence in Selma, Alabama, in January 1965).
The goal of these protests was to overthrow racial segregation and empower African Americans by seizing the franchise. Southern officials utilized their institutional power and the resistance of the larger White community in an intense, and often violent, effort to defeat the movement and to maintain legally enforced racial segregation. Movement participants, many of them women, children, and college students, were often beaten and brutalized by southern law enforcement officials, and thousands were arrested and jailed for their protest activities. Some leaders and participants (such as Medgar Evers, of the Mississippi NAACP, in 1963, and three civil rights workers in Mississippi in 1964) were murdered.
Nevertheless, the widespread and highly visible confrontations in the streets, which contrasted the brutality and the inhumanity of the White segregationists with the dignity and resolve of Black protestors, made the cause of Black equality the major issue in the USA for over a decade during the 1950s and 1960s. The nation and its leaders were forced to decide publicly whether to grant African Americans their citizenship rights or to side with White segregationists who advocated racial superiority and the undemocratic subjugation of African Americans.

3. National Achievements
The movement could not be dismissed. Eloquent leaders and their massive followings sustained the pressure on local elites and the federal government. Countless heroic figures inspired and organized a massive following. Among them were Rosa Parks, a dignified older woman and local NAACP activist, who sparked the Montgomery bus boycott when she defied an order to move to the back; Martin Luther King, Jr., who emerged from the Montgomery bus boycott and the Southern Christian Leadership Conference (SCLC) to assume a position of preeminent moral leadership and national influence; James Forman, executive secretary of the more militant Student Nonviolent Coordinating Committee (SNCC), who was to challenge SCLC’s and King’s strategy; SNCC’s leader, Stokely Carmichael (now Kwame Toure), who introduced the slogan ‘Black power’; James Farmer and Floyd McKissick, who led the Congress of Racial Equality (CORE); and Fannie Lou Hamer, who went to a voter registration meeting run by SNCC, was arrested at the courthouse in Indianola, Mississippi, when she tried to register to vote, and then was viciously beaten in prison. Hamer personified the thousands of women who played important roles in organizing and leading the movement.
These leading figures and thousands of movement participants articulated Black suffering and the democratic aspirations of African Americans of every generation and circumstance. In addition, thousands of Whites (students, ministers, lawyers, and other civil rights workers) were inspired to join the Movement. They participated in lunch-counter sit-ins, mass demonstrations, and campaigns such as the 1964 Mississippi Freedom Summer Project, a campaign that involved hundreds of volunteers in voter registration drives and the creation of ‘freedom schools.’ Some, like Andrew Goodman and Michael Schwerner, who were involved in the Freedom Summer campaign, and Viola Liuzzo, a Michigan homemaker shot by Klansmen after a rally in support of the march from Selma to Montgomery, lost their lives in the Movement and, in turn, helped inspire others.
Differences regarding ideology, leadership style, and goals emerged within the movement as the SCLC, NAACP, SNCC, CORE, and other organizations reached different judgments about the value of nonviolent action, racial integration and separatism, the role of Whites in the movement, and the influence of Black nationalism. Despite these controversies, the moral challenge and the widespread social disruption caused by the economic boycotts, the marches, the sit-ins, and other forms of nonviolent direct action, coupled with the international pressure, created an impasse in the nation that had to be resolved.
As a result, the Civil Rights Movement achieved important legislative victories in Congress. The landmark Civil Rights Act of 1964 outlawed discrimination in public accommodations on the basis of race, color, religion, or national origin and it identified the legal measures to be used to achieve racial integration. Moreover, it barred discrimination in employment practices on grounds of race, color, religion, national origin, or sex. The passage of the 1965 Voting Rights Act was another major achievement, for it suspended the use of literacy tests, authorized the attorney general to challenge the constitutionality of poll taxes, and introduced procedures that provided for the appointment of examiners to ensure that all restrictions on Black voter registration be ended. In short, the Voting Rights Act enfranchised the southern Black population, making it possible for a historic Black elected political class to emerge.

3.1 International Achievements
The significance of the Civil Rights Movement extends far beyond its historic overthrow of the Jim Crow


regime. This Movement has affected US politics in fundamental ways. It demonstrated to the oppressed Black community how such protest could be successful, and it made social protest respectable. The Civil Rights Movement also proved that social protest is capable of generating significant change. Hence, the movement broadened the scope of US politics and inspired diverse movements for citizenship rights and social justice in the USA and abroad. Before the Movement, many groups in the USA (women, Hispanics, Native Americans, farm workers, the physically disabled, gays and lesbians, etc.) were oppressed but unaware of how to resist or galvanize support. The Civil Rights Movement provided a model of successful social protest and produced a host of new tactics and social change organizations. Moreover, this Movement had an influence on freedom struggles around the world. Participants in freedom struggles in Africa, Eastern Europe, the Middle East, Latin America, and China have made it clear that they were deeply influenced by the US Civil Rights Movement.

4. Continuing Challenges
For all its success and influence, however, the Civil Rights Movement did not solve all of America’s racial problems. At the start of the twenty-first century, African Americans and many other non-White groups are still at the bottom of the social and economic order. These current conditions are exacerbated by a social climate in which the disadvantaged are often blamed for their own predicament. Thanks to the Civil Rights Movement, a relatively large Black middle class has emerged. Yet, over a third of the Black community remains trapped in poverty. Black Americans are disproportionately the victims of police brutality and housed in the rapidly growing prison industry. Thus, during the infancy of the new millennium, some Blacks are the recipients of middle-class comforts while millions of others are victims of poverty and degradation. Poverty and inequality are also widespread outside the Black community. It may be that protest remains the only viable means to achieve greater empowerment. If this is the case, the Civil Rights Movement has left a rich legacy to inspire and inform future struggles. The sparking of a renewed interest in the academic study of social protest is an unnoticed legacy of the Civil Rights Movement. Through such studies it is possible that new mechanisms of social change will come to light.

See also: Civil Liberties and Human Rights; Civil Rights; Conflict Sociology; Integration: Social; Minorities; Nonviolence: Protective Factors; Race and Gender Intersections; Race Identity; Racial Relations; Racism, History of; Racism, Sociology of; Slavery as Social Institution; Slaves/Slavery, History of; Social Movements: Resource Mobilization Theory; Social Movements, Sociology of

Bibliography

Branch T 1988 Parting the Waters: America in The King Years 1954–63. Simon and Schuster, New York
Carson C 1981 In Struggle: SNCC and the Black Awakening of the 1960’s. Cambridge University Press, Cambridge, UK
Garrow D 1986 Bearing the Cross: Martin Luther King, Jr., and the Southern Christian Leadership Conference. William Morrow, New York
Klinkner P A, Smith R M 1999 The Unsteady March: The Rise and Decline of Racial Equality in America. University of Chicago Press, Chicago
Layton A S 2000 International Politics and Civil Rights Policies in the United States 1941–1960. Cambridge University Press, Cambridge, UK
McAdam D 1988 Freedom Summer. Oxford University Press, New York
Morris A D 1984 Origins of the Civil Rights Movement: Black Communities Organizing for Change. Free Press, New York
Morris A D 1999 A retrospective on the Civil Rights Movement: Political and intellectual landmarks. Annual Review of Sociology
Robnett B 1997 How Long? How Long? Oxford University Press, New York

A. Morris

Civil Service

The civil service is the generic name given in English to the administrative apparatus of the state. The term was first introduced in the British administration in India and then in the UK (1854), and has become almost universally synonymous with civilian (i.e., nonmilitary and nonjudicial) administrators employed by central governments. The civil service is also referred to by different terms (e.g., ‘public bureaucracy’), and it differs from other bureaucracies by virtue of its public missions, however defined. It has a collective justification, namely, doing something otherwise unattainable for a certain public end, and therefore, civil servants have been entrusted with the task of being the guardians, or at least the interpreters, of the common good. The modern civil service is above all a ‘state institution’ and this article will focus on the changing relationship between the state and its specialized bureaucratic institution, the civil service.

1. History
There were bureaucratic administrators long before they were known as ‘civil servants.’ Historical antecedents can be found mainly in centralized States and empires where public bureaucracies were developed to serve the rulers, or the dynasty. Despite their subservience to the emperors and kings, however, these bureaucracies became ‘professional,’ that is, they manifested some conception of themselves as servants

1886
Civil Service

of the ‘state,’ the polity, or even the community (Eisenstadt 1965). Such embryonic civil service institutions existed in different forms first in the Egyptian and Chinese empires, and later, for instance, in the Byzantine, Sassanid, Abbasid, and Ottoman empires. Sculptures of the ancient Egyptian scribes date back to the Old Kingdom (ca. 2600 BC). They belonged to a body of officials who served the Kingdom, as did their Assyrian and Babylonian counterparts. Even more professional were the ancient Chinese functionaries employed in the rather developed court machinery. These officials, whose rank was conspicuously exhibited by the shape and color of the buttons worn on their caps, became much later known in the West as ‘mandarins.’ The term is often used pejoratively to indicate the aloofness of civil servants, the official jargon and the sheltered life of their long careers.
Broadly speaking, the history of public administration testifies that there has always been a need to link state (or ruler) goals with the official expertise of a centralized bureaucracy. These bureaucratic entities were engaged in three main types of activities: technical services such as land registration or water allocation; social and political regulation such as administrating justice or collecting taxes; and, above all, managing war-related affairs (Finer 1997, Vol. 1, pp. 59–72).
The European patterns of the civil service have emerged gradually from the Greek and especially the Roman law and administrative traditions of, for example, public construction, food supply regulation, and population census. They were later influenced by the medieval feudal system of financial and judicial administration. Yet, the most strenuous collective effort has always been related to mobilizing money, soldiers, and equipment for the military. Accordingly, throughout the history of most countries, organizational practices, public and private, have been influenced directly by the most readily available example: the military model. This model was further developed to encompass the colonial expansion of some European countries, which required new organizational expertise, both in content and in scope. The influence of the military is most readily seen through such administrative concepts as ‘chain of command,’ or ‘line and staff,’ so much so that recent attempts to debureaucratize the civil service are essentially an attempt to break away from the dominance of the military bureaucratic model.

2. The Modern Civil Service
The emergence of the modern civil service is connected directly to the crystallization of the European style state, first in France and Prussia and later in all the 200 or so states that populate the globe in the early twenty-first century. As an institution the civil service dates back to the mid-seventeenth century with the inauguration of competence entrance examinations in Prussia under Friedrich Wilhelm I, analogous to the old Chinese practice. Equally important, the German Cameralists were the first modern scholar-practitioners to study public administration and to offer a set of principles for the ‘management of the state’ (Gross 1964, Vol. I, p. 108). The next stages in the development of the civil service were diffused, but a few milestones can be presented, all of them related to the attempt to create an institutional instrument with acquired experience in running state affairs. The dissimilar features should also be noted as each civil service is also a product of its particular environment, and the human composition of each of them reflects recurring attempts of social groups to enter the ruling elites of their respective states.

2.1 The Napoleonic Reforms Early Nineteenth Century
The Napoleonic Reforms converted the previous French royal service into a State public service. France, and before that Prussia, created the continental model of a professional, tenured and in-house trained civil servant. More than in other countries, continental civil servants embody the state, and the continuity of state affairs, and operate within a specific legal framework of administrative law (Crozier 1964, Suleiman 1974).

2.2 The Northcote–Trevelyan Report 1854
This report introduced competitive examinations into the British civil service and uniform methods of recruitment and promotion across departments. It eventually succeeded in eliminating patronage. Making it a professional life career attracted the educated to enter the senior administrative class, which was given the task of shaping public policy and advising politically elected ministers. Below them were the executive and clerical classes, which were given managerial and routine responsibilities. The UK was the first to establish a civil service commission to oversee the entire operation of the civil service. Politically neutral and very secretive, the British senior home civil service acquired a reputation of integrity and impartiality, and of being guardians of crown affairs, as distinct from the governments of the day.

2.3 The Pendleton Act 1883
The Pendleton Act in the USA marked the beginning of federal civil service reform, aimed at abolishing the previous ‘spoils system’ in the central government.


Patronage in the federal government was closely ministrative staff. Strongly influenced by the Prussian
related to electoral turnovers in the Presidency and example, Weber drew a list of the characteristics of a
Congress. Unlike the UK, the American system has modern bureaucracy and the conditions that contri-
never achieved the same degree of uniformity, political buted to its emergence and growth, particularly in
neutrality, or continuity. In the USA, however, the capitalist market economies (Weber 1947). His ideas
civil service has been much more open, with recr- reflected, and to some extent strengthened, the heavy
uitment to the senior positions, less dependent on reliance—both in continental Europe and in the
social class and on elite education systems. Cons- USA—of civil service institutions upon administrative
equently, the civil service has never had the aura of law: legal codes, rules, regulations, and precedents. To
‘officialism’ as in Europe, and has not acquired the this very day the notable exception is the UK which
prestige and the consequent status of being a perm- (unlike Canada, Australia and its other former domin-
anent agent of the federal state. ions) does not have a civil service act of Parliament
and does not require legislation to change its public
administration system.
2.4 Nondemocratic Regimes The rational (‘Weberian’) model of the civil service
as a permanent institution also includes the following
In nondemocratic regimes the position of the civil common features, which are assumed to exist in the
service institution is related directly to the scope of public bureaucracies of the developed states: the
the state’s monopoly of the different spheres of life. The centralization of authority; formal modes of oper-
stronger the state, the more dominant is the public ation; a central commission or agency in charge of civil
bureaucracy—operating sometimes from the ruler’s service affairs; established procedures for recruitment,
palace, the military barracks, or the party’s head- tenure, promotion, compensation, job evaluation,
quarters. In such regimes, the civil service hardly exists discipline, and the right to associate and strike; and
as a differentiated professional institution. In the arrangements for securing political neutrality. In
Soviet Union, for example, the lines between state and addition, civil servants must adhere to a special code
party bureaucracies, and indeed between political and of conduct which specifies their obligations, and
administrative decisions, were practically nonexistent. constitutes the ethos (or perhaps the myth) of im-
In this respect, the most troubling phenomenon personality, neutrality, and guardianship of state
appeared with the total submission of the highly secrets.
developed German civil service (including the military The important point about the rational model of the
and the judicial) to Nazi leaders. This raises the civil service is that there is a connection between the
question: Can there be safeguards against turning development of the modern state and this new form of
professional competence—and loyalty—into a sharp bureaucratic organization. Indeed, in the twentieth
tool in the hands of evil political masters? century the civil service and the entire public sector
expanded rapidly in response to growing roles of the
state, particularly with the growth of the welfare state.
2.5 New States Governments and their civil service apparatus have
In the New States the civil service is a mixture of been asked to provide more services and to find
colonial heritage and unique local elements. In India, answers to new problems such as environmental
and in some other former British colonies, the civil degradation. But at the same time, the ‘administrative’
service is a strong institution, and it helps to keep the or ‘bureaucratic’ state has also come under attack, in
country going, despite recurring crises and accusations terms of the size, structure, and functions of the civil
that it impedes development. In many other new service. New post-Weberian questions have been
countries, particularly where the military took over, floated: to whom should the civil service be responsive
the civil service hardly functions. It cannot maintain and accountable: to the state, the law, the government
law and order, let alone carry the task of providing of the day, the multicultured conglomeration of
social services. The weaker the state grows, the more sectors and groups, or perhaps to the individual
helpless is the civil service institution, and the weaker customers of its services?
the civil service, the less the ability to sustain stability
and encourage positive changes.
4. What do Civil Servants Do?

3. The Rational Model Civil servants are in charge of a rich menu of


government activities ranging from artificially ‘seeding
Max Weber (1864–1920), one of the most prominent clouds’ in the sky to induce rainfall, to operating
students of bureaucracy, delineated three ‘ideal-types’ schools, or administrating programs for the increasing
of legitimate authority—the legal-rational, the tra- number of aged people. In between, civil servants, as
ditional, and the charismatic, and suggested that the in ancient times, continue to collect taxes and perform
first one requires the existence of a ‘rational’ ad- state functions that cannot be trusted to, or done by,

Civil Service

social (voluntary) or private market institutions. These various activities can be grouped under the following headings: shaping and implementing public policy decisions; providing services to individuals, groups and organizations; and administrating regulatory schemes in areas such as aviation, drugs, and election financing.

The main structural entities for carrying out these activities could be divided roughly into three: (a) the regular civil service departments and agencies, responsible for the generic governmental functions; (b) the special statutory agencies, authorities, commissions, etc., responsible for specific tasks removed from the regular civil service (e.g., the US Securities and Exchange Commission); and (c) government corporations entrusted with running utilities and other commercial enterprises regarded as natural monopolies or in some other way related to national interests (e.g., postal services, public broadcasting, electricity companies, and regional development projects). The combinations of the various types of activities and the different structures reflect the myriad and ever-changing scope of civil service responsibilities.

It is impossible to prescribe the right number of civil servants per capita, given the differences in their functions, and in the number of intra-state governmental levels in different states. In economically developed states, the tendency is to cut down the scope of civil service activities (and consequently its size too) through privatization, outsourcing, etc. In new states the pressure still exists on the civil service to do much more for their hard-pressed societies.

5. Studying the Civil Service

Woodrow Wilson (1887) was among the pioneers who attempted to create ‘a practical science of administration,’ aimed at improving not only the personnel, but also the organization and methods of government offices. He saw the field of administration as a field of business which should be outside the sphere of politics. Hence his famous dictum: separate policy-making from administrative execution. But there were other scholarly currents as well. From the ‘scientific management’ movement in the USA (ca. 1900s) came the notion of ‘efficiency,’ aimed at increasing workers’ productivity through detailed time and motion studies, standardization of tools, and careful attention to training. From the ‘human relations’ school (ca. 1930s) came the message that in all organizations, individual and group motivations are the most important factors. A little later, and more directly applied to public administration, attempts were made to develop prescriptive principles. For instance, establish one top executive; fit people to organizational structures; ensure unity of command; use specialized staff; maintain homogeneity in organizational subdivisions; delegate authority; match responsibility with authority; and limit span of control (Gulick and Urwick 1937).

A change came with Herbert Simon’s (1946) attack on these principles, which he regarded as being no more than ‘proverbs of administration.’ Simon’s concept of ‘bounded rationality,’ which governs the life of all organizations, was an important step away from prescriptions towards observations, and from practical advice on how to run organizations to laying the foundation of public administration theory and of a comparative perspective. First came the formal studies of the civil service as an institution within the executive branch, or in the context of administrative law. Later, mainly as a result of the impact of studies in political science, the context as well as the horizon were expanded, away from the previous formalism. For example, power, group theory, and communication were introduced into the vocabulary of public administration research. More recently, the related area of public policy has emerged with new emphasis on economic models and game theory, as well as on the redefinition of values, public goods, and the role of politics in decision-making (Majone 1989). New foci were introduced such as implementation, social regulation, and comparative policy analysis in areas such as welfare, health, and environment (Goodin and Klingemann 1996, pp. 551–641).

The most challenging and inadequately researched new area has remained the role of the civil service institution in democracies. This is a practical issue as well: how to reform the civil service within a new democratic framework in the era of state weakening (Silberman 1993).

6. Blurring the Boundaries of the Civil Service

The modern version of the state is changing rapidly, and if there is going to be a ‘skeleton state,’ there will be a skeleton civil service as well. This process has been gradually occurring both because of internal changes such as the weakening of public trust in political institutions, and under the impact of globalization. In the early 2000s there are states still struggling to build a British-type civil service (which does not exist in the UK anymore); other states in which few changes have taken place in their public bureaucracies; and economically rich states in which the role of the civil service is visibly contracting. There are also new supra-state bureaucracies, not only in international organizations such as the European Union, but also in inter-state agencies and in joint public management of regional projects. These developments have shattered many of the old features of the classical Weberian-type civil service.

Examples:
(a) Civil service monopoly on official information is no longer feasible and government secrecy has yielded
to freedom of information acts enacted in most democratic states (Galnoor 1989).
(b) With modern information technology and multichannel communication, civil servants’ anonymity (and perhaps discretion too) has been gradually disappearing.
(c) The previous civil service–interest groups network has been replaced by a much more intricate and uncontrollable system of organization in the mushrooming ‘third sector’ (nongovernmental, nonprofit) of public interest groups, NGOs, and philanthropic funds.
(d) The civil service is under great pressure to be more ‘representative’ in its membership, or at least to assist in achieving equal opportunity through ‘affirmative action’ and other measures.
(e) Direct accountability of the civil service to the public is demolishing the last walls of insulation. One example is the spread of the Swedish Ombudsman institution. Another is the authority given to special investigation commissions to probe the decision-making discretion of public officials.

These developments were followed by important changes in some of the traditional features of the civil service. For example, in many states the civil service is no longer a life career and new opportunities for entrance to (and exit from) senior positions have been introduced. The emphasis on management skills has downgraded the policy-advice roles of civil servants. The results-oriented business approach has diverted attention to the quality of service aspect, including the idea that civil servants should be rewarded according to their standing in customers’ satisfaction surveys.

In states where New Public Management (NPM) reforms were introduced, a new post-public bureaucracy model is slowly emerging. It requires questioning many assumptions about the civil service, both in scholarship and in practice (Peters and Wright 1996). The most important change in this respect is in the blurring of the old boundaries between the public and private sectors, and between politics and administration.

The first blurred boundary was pointed out long ago by Merriam (1944): the civil service has constituencies, and in many countries corporations exert a great deal of influence on public policy. However, with increased regulation, the private sector became even more involved, and hence the phenomenon of excessive clientelism (Nadel and Rourke 1975). Moreover, as part of the process of ‘reshaping the state,’ the size, structure, and functions of the civil service are under attack. Independent public and private entities and agencies are performing what used to be civil service tasks such as postal services. Other functions have been ‘privatized,’ ‘marketed,’ or ‘outsourced’ to the private sector: running prisons, computer services, or even recruitment to the civil service itself.

As for the politics–administration dichotomy, for many years it disappeared from public administration textbooks. Everybody knew that civil servants exercise state power both formally (law application) and informally (policy shaping). However, starting in the 1980s the distinction returned as part of NPM. The new textbooks dealing with the civil service are now called ‘introduction to public sector management,’ and the civil service is conceived to be market-oriented, customer-driven, deregulated, decentralized, etc. Thus the old Wilsonian dichotomy has returned through the back door: an institutional separation within the civil service between the ‘departments’ in charge of policy and the ‘performance-based organizations’ or ‘executive agencies’ responsible for delivering services to customers.

See also: Administration in Organizations; Administrative Law; Bounded Rationality; Bureaucracy and Bureaucratization; Bureaucratization and Bureaucracy, History of; Decision-making Systems: Personal and Collective; Delegation of Power: Agency Theory; Executive Branch, Government; Governments; Public Administration: Organizational Aspects; Public Administration, Politics of; Public Bureaucracies; Public Management, New; Weber, Max (1864–1920)

Bibliography

Aberbach J D, Putnam R D, Rockman B A 1981 Bureaucrats and Politicians in Western Democracies. Harvard University Press, Cambridge, MA
Barnard C 1938 The Functions of the Executive. Harvard University Press, Cambridge, MA
Crozier M 1964 The Bureaucratic Phenomenon. University of Chicago Press, Chicago
Dror Y 1971 Design for Policy Sciences. Elsevier, New York
Eisenstadt S N 1965 Essays in Comparative Institution. Wiley, New York
Finer S E 1997 The History of Government From the Earliest Times. Oxford University Press, Oxford, UK
Heady F 1979 Public Administration: A Comparative Perspective, 2nd edn. Marcel Dekker, New York
Hood C 1999 Regulation Inside Government. Oxford University Press, Oxford, UK
Galnoor I 1989 Government Secrecy. In: International Encyclopedia of Communications. Oxford University Press, Oxford, UK, pp. 34–7
Goodin R E, Klingemann H-D (eds.) 1996 Public policy and administration. In: A New Handbook of Political Science. Oxford University Press, Oxford, UK, pp. 551–641
Gross B M 1964 The Managing of Organizations. The Free Press, New York
Gulick L, Urwick L (eds.) 1937 Papers on the Science of Administration. Institute of Public Administration, New York
Majone G 1989 Evidence, Argument and Persuasion in the Policy Process. Yale University Press, New Haven, CT
Merriam C E 1944 Public and Private Government. Yale University Press, New Haven, CT
Nadel M V, Rourke F E 1975 Bureaucracies. In: Greenstein F I, Polsby N W (eds.) Handbook of Political Science. Addison-Wesley, Reading, MA, 5: 373–440
O’Donnell G 1973 Modernization and Bureaucratic Authoritarianism: Studies in South American Politics. University of California Press, Berkeley, CA
Peters B G, Wright V 1996 Public policy and administration: Old and new. In: Goodin R E, Klingemann H-D (eds.) A New Handbook of Political Science. Oxford University Press, Oxford, UK, pp. 628–41
Pressman J L, Wildavsky A 1984 Implementation, 3rd edn. University of California Press, Berkeley, CA
Silberman B 1993 Cages of Reason: The Rise of the Rational State in France, Japan, the United States and Great Britain. University of Chicago Press, Chicago
Simon H A 1946 The proverbs of administration. Public Administration Review 6: 53–67
Simon H A 1957 Administrative Behavior: A Study of Decision-Making Processes in Administrative Organizations, 2nd edn. Macmillan, New York
Suleiman E N 1974 Politics, Power and Bureaucracy in France. Princeton University Press, Princeton, NJ
Weber M 1947 The Theory of Social and Economic Organization. Oxford University Press, New York
Wildavsky A 1987 Speaking Truth to Power. Transaction, New Brunswick, NJ
Wilson J Q 1989 Bureaucracy. Basic Books, New York
Wilson W 1887 The study of administration. Political Science Quarterly 2: 197–222

I. Galnoor

Civil Society, Concept and History of

Historically, the idea of civil society takes two very different forms. In the first, civil society is ‘political community’ (societas civilis or koinonia politike) encompassing a state undifferentiated from society (Ellis 2000). Here, civil society is coterminous with the state: that is, power relations ordered through law and institutions with the objective of ensuring social harmony. In the second, civil society is a self-regulating, self-governing body outside and often in opposition to the state, represented both as the nexus of societal associations expected to generate civility, social cohesion and morality, and as the site of reciprocal economic relations among individuals engaged in market exchange activity.

1. Koinonia Politike and its Historical Modulations

In Greek and Roman political thought the notion of civil society as political community was not limited to the legal category of citizenship. Central to Aristotle’s (384–322 BC) conception of political community was the recognition that people lived in different social spheres and their status varied in terms of property, skills, and abilities. The art of politics was the use of laws and institutions to organize activities within these different spheres with the objective of attaining a harmonious or ‘just’ social environment. Aristotle regarded the citizen as one who shared in the administration of justice and held office to this end (Aristotle 1965). In this sense, citizenship in the Athenian city-state was as much a moral as a legal category (Ehrenberg 1999). The household was not excluded from Aristotle’s moral scheme. Household subsistence production was ‘natural,’ while production for commercial exchange and profit was ‘unnatural’ and subversive of the moral order. The exercise of justice presupposed constraints on commercial activity.

For Cicero (106–43 BC) and the Roman lawyers, civil society (societas civilis) was the equivalent of res publica (commonwealth), or ‘an assemblage of people in large numbers associated in an agreement with respect to justice and a partnership for the common good’ (Cicero 1988). Cicero viewed justice as rooted in man’s natural ‘social spirit’ which was informed by reason and induced individuals to forego a measure of self-interest in the interest of common good (Ehrenberg 1999). Cicero was writing when Rome had ceased to be a harmonious ‘commonwealth’ run by its citizens and their subsistence-producing households, but had become an imperial city in which landowners, financiers, and merchants coexisted uneasily with a vast and hungry population of peasants and slaves. Cicero saw the realization of justice as the ‘even balancing of rights, duties and functions’ within the state (Cicero 1988). Societas civilis thus represented groups and individuals united by laws and institutions, which organized their activities and sought to achieve a flexible equilibrium among them.

Until the end of the eighteenth century, the various societies of western Eurasia generally conformed to the Aristotelian model of a politically constituted moral community. In western Europe, however, the idea of koinonia politike virtually disappeared during the Middle Ages when extreme political fragmentation led to the Church assuming a central role in government and social life.

On the other hand, in the Byzantine Empire, the idea of a politically constituted community persisted and the Church was subordinated to the political and moral authority of the emperor. The Ottoman Empire succeeded to this mode of political discourse, but added ideas of bureaucratic and hierarchical organization derived from the political traditions of the agrarian empires farther east, notably Persia. This synthesis, which predated the Ottomans, was formalized in the ninth and tenth centuries by Islamic philosophers well-versed in the works of Plato and Aristotle, as well as by the bureaucratic elites of the Islamic states who were educated in the Persian tradition. It was resisted by some sections of the Islamic establishment on the grounds of the equality of all before God. The ensuing power struggles did not, however, result in an autonomous ‘Islamic Church,’ but rather the religious establishment was subsumed within the political hierarchy.

In the Ottoman Empire, koinonia politike was epitomized by the figure of the just ruler, whose ability
to establish ‘good order,’ or shariah in the Islamic sense, predicated upon an absence of social strife, constituted justice. Exercise of justice involved laws and institutions accommodating different interest groups in order to sustain the conditions for household subsistence production and to achieve equitable distribution of resources among power holders. Thus laws and institutions represented negotiated settlements between the ruler and different interest groups.

In western Europe, the idea of community created by God and comprising all mankind (i.e., all Christians) was a challenge to the notion of differentiated community constituted through politics to achieve social harmony and justice. St Augustine’s City of God (De Civitate Dei, 413–26), where men were equal before God and united in the universal church, stood in contrast to the Greek and Roman polis of differentiated spheres of life. The Church and the temporal rulers responsible for its government derived their right to rule from God and stood outside the community.

However, struggles between the Church and temporal rulers, coinciding with the rise of monarchies, resulted in the division of social activities into spiritual and temporal spheres governed, respectively, by the Church and the kings. The concept of koinonia politike found a new foothold in the institution of monarchy. Between the fourteenth and seventeenth centuries, European monarchs, endowed with divine right and with absolute powers, nevertheless engaged in constant negotiations with different interest groups. Political society rested on the exchange of entitlements to different groups and individuals in return for their obedience and/or services they rendered.

2. Koinonia Politike Transformed

In the sixteenth and the seventeenth centuries, following a period of religiously-driven civil wars, the idea of a Christian koinonia politike was replaced by the concept of the overriding sovereignty of the secular ruler who subordinated religious claims and restored peace. This was the Hobbesian moment in which the modern state rose to define, limit, and enforce moral alternatives to God’s order (Maier 1987). Competition among European monarchies, warfare, civil war, and peasant uprisings all contributed to centralization of political power. By the end of the seventeenth century, this was reflected in the institution of the sovereign monarch as a site of bureaucratic administration and regulation. Commercial expansion in the seventeenth and eighteenth centuries meant that economic activity became a primary target of state regulation. The administrative, regulatory state was represented in discourses on economic administration that combined cameralism, mercantilism, political arithmetic, and bullionism.

Thus political or civil society was no longer the politically constituted community characterized by a diffusion of political power. Instead, political society referred to the sphere of absolute sovereignty. Social harmony, which remained the legitimizing objective of rule, was the product of bureaucratic regulations and no longer the outcome of reciprocal exchanges or negotiated settlements between the ruler and the different sectors of society.

Attempts to formalize the sovereign state also had the effect of pointing to the existence of a sphere outside the political domain. In the latter part of the eighteenth century, a new conception of civil society as the site of self-regulation developed, referring to voluntary associations freed from the corporate grip of the Church and urban institutions and to the sphere of economic activity. This concept of a self-regulatory, self-governing society was often in opposition to the regulatory, political domain of the state.

The dichotomous conception of state and civil society was, however, preceded by the distinction in Natural Law theory between status civilis and status naturalis (Tribe 1988). The latter was the sphere of discord which the English philosopher Thomas Hobbes (1588–1679) described as bellum omnium contra omnes, a barter society in which individuals contracted with and against one another (Hobbes 1949), and to which his countryman John Locke (1632–1704) assigned the contentious process of the formation of private property. For Hobbes, the state of nature ended when it submitted to the ‘civil union’ ‘called state or civil society.’ For the Prussian Cameralists of the same period, organization of economic activity in political society, or its regulation, ensured the security and prosperity of the commonwealth, including providing the population with subsistence and productive employment. For the English mercantilist, James Steuart (1712–90), self-interest had to be restrained and directed by the ‘reason of state’ if it were not to damage public good, which he defined in terms of England’s increasing commercial success.

Concentration of political power in administrative monarchies had resulted in a decline in the importance of the corporate bodies and deliberative institutions that had earlier served to mediate power. Groups that were excluded from the political process, as well as new bureaucratic elites and commercial classes, sought inclusion through societal associations and political institutions that could provide them with a voice against absolutism. In France, Britain, and Germany respectively, Montesquieu (1689–1755), Ferguson (1723–1816), and Kant (1724–1804) proposed conceptions of civil society focused on delineation of associational spaces, of environments for societal negotiations, of civility, and of publicity (Jacob 1991).

Responding to the despotism of the eighteenth-century French ancien régime, Montesquieu saw in civil society (l’état civil) a context for the societal negotiation of the absolute power of the monarch that was not a domain separate from the monarchy. His concept of division of powers addressed a situation in
which the monarchy had severed itself from the network of power relations within the political society and had appropriated for itself a separate sphere of power (l’état politique). It sought to reintroduce the monarch into l’état civil by means of institutions that would check the absolute authority of the ruler and balance it against the authority of the landed aristocracy, their advocates in the judiciary, and commercial interests (Richter 1998).

Kant first employed the notion of civil society, or bürgerliche Gesellschaft, in the sense of political society inseparable from the Prussian absolutism of Frederick the Great (1712–1786), which he considered indispensable for social stability. Secondly, bürgerliche Gesellschaft referred to the public sphere or a domain of literate citizenry that was separate from the arena of political power and action. Kant regarded the political arena as the reserve of the state, or the ruler. Bürgerliche Gesellschaft referred to a sphere ‘beyond the political order,’ ‘beyond the particularistic concerns of political action’ where practical issues of governance could be debated on the basis of universal principles of reason. For Kant, the critical practice of exposing actual state policies to the light of universal reason could act to restrain the absolute power of the ruler as well as legitimizing his power (Ellis 2000). Kant’s bürgerliche Gesellschaft was composed of individuals from the Prussian bureaucratic and bourgeois elites, educated and trained in state schools and administrative offices, as well as members of social clubs and associations, who could by dint of reason rise beyond the trappings of class or official status.

The thinkers of the Scottish Enlightenment who addressed the issue of civil society did not limit themselves to the issue of restraints to centralized state authority. Adam Ferguson pointed to the corrosion of civic spirit in political society, where the successful commercial classes became servile to the administrative state, which provided them with a ‘rule of law’ but deprived them of their traditional rights (Keane 1988, Ferguson 1995). He conceived of civil society as networks of self-governing and self-regulating voluntary associations, such as self-help groups and ‘friendly’ or charitable societies, which had expanded rapidly in the eighteenth century and, in Britain, played an important part in poverty relief efforts. Ferguson pointed to the potential of voluntary associations for engendering civility beyond the special interests of state administration and the commercial classes. The central question facing Ferguson and other eighteenth century European thinkers was how society, increasingly differentiated administratively and economically, could remain integrated and harmonious. For Ferguson, as for his compatriot, David Hume (1711–1776), civility was the basis for social cohesion, and he saw it as rooted in sociability or moral and emotional communication among persons ‘that fostered social bonds and friendships and cultivated manners and moral tastes’ (Trentmann 2000). Ferguson did not place civil society in opposition to the state; rather, civil society constituted a protective shield from the uncertainties of social and political life.

3. Civil Society as a Separate Domain

For Adam Smith (1723–1790), also a thinker of the Scottish Enlightenment, civilized society consisted in self-regulating and interdependent networks of economic relations among individuals and groups, originating in the decisions of individuals competing in markets for goods, labor and capital. Economic activities of self-interested individuals were guided by the universal, natural laws of supply and demand. For Smith, the civilizing of economic society and the harmonizing of individual interests presupposed a mode of sociability and reciprocal sympathies among individuals (Trentmann 2000).

Smith proposed a separation between the civilized society of economic activity and the political sphere of the state and insisted on the liberation of labor, capital (including land), and goods from the network of relations of political society. The doctrine of laissez faire, laissez passer, first introduced by the eighteenth-century French Physiocrats and taken up by Smith, was a plea for the removal of regulations that privileged the landed classes through restrictions on the grain trade, merchants through grants of monopolies, and the ‘poor’ through grain subsidies provided by the Poor Laws. Smith was an advocate for the new industrial order and opposed to regulations that represented interventions by the state in the economic process, including the free exchange of labor, capital (including land), and of goods.

Smith argued that if individuals, unimpeded by privilege and state regulation, could act in accordance with their self-interest, markets could be trusted to allocate resources equitably in the form of wages, rents, and profits. This, for Smith, was the key to progress, economic growth, and national prosperity (Smith 1974). On the other hand, he was not entirely indifferent to the dislocations generated by markets and he assigned the civil state the duties of administering justice and providing security, including the protection of private property and correction of severe inequalities created by market forces.

Smith’s civilized society, like Ferguson’s civil society, Kant’s bürgerliche Gesellschaft, and Montesquieu’s l’état civil, represented positive images of civil society, which, in the course of encounters with hitherto unknown regions of Asia and the Americas during the seventeenth and eighteenth centuries, became part of Europe’s definition of itself as the domain of the ‘civilized’ (Kocka 2000). The discourse of civilized Europe created its opposite in images of an uncivilized non-Europe, notably the East, which was perceived as the domain of despotism pace Montesquieu, of chafing regulations over economic
activity pace Smith, and of the absence of private social stability in market societies where the logic of
property pace Marx. Following the last decade of the economic activity overrode moral and political con-
eighteenth century, with the onset of revolutionary cerns regarding equity. Civil, or liberal, society was
upheavals against absolutist regimes, the terms perceived as a domain shaped by or reformed through
bürgerliche Gesellschaft and civil society were in- the practices of a central state representing the public
creasingly equated with bourgeois society and acquir- interest. This constitutes a major departure from
ed a negative connotation as ‘the reign of dissoluteness, eighteenth century thinking. On the one hand, civil
misery and physical and ethical corruption’ (Hegel society, its actors, its activities, and the economic
1967, Kocka 2000). The European idea of itself as relations that characterized it, were viewed as in-
‘civilized’ was effectively distanced from understand- separable from their legal and administrative formu-
ings of civil society as bourgeois society. No matter lations. This resulted in the creation of legal entities
how ‘barbaric’ Europe became in Europe, it would including trade unions, corporations, and family, and
remain ‘civilized’ vis-à-vis non-Europe, its optimistic of voluntary and charitable associations, which, while
image of itself having been fixed in a vocabulary of autonomous from the state, remained within the
domination. bounds of its administrative–legal vision (Neocleous
Hegel (1770–1831) responded to this situation with 1996). On the other hand, the state, ‘objectified’ in its
the idea of the universal state that was to end the administrative and legal practices, was understood to
separation between the political and the economic. He stand apart from civil society and to mediate divergent
subscribed to the political economists’ notion of civil interests and needs.
society as a nexus of economic interests and relation- From the perspective of the English utilitarians,
ships that continuously reproduce themselves. He did most notably Jeremy Bentham (1748–1832), and of
not, however, share Smith’s utopian faith in the the German social economist Lorenz von Stein (1815–
mechanistic operations of the market. Revolutionary 1890), civil society, or the market economy of the
upheavals had revealed a discrepancy between models political economists, needed to be actively constituted.
of an orderly economy subject to universal laws of It was not enough simply to remove the obstacles of
nature and the chaotic reality of contemporary society. obsolete privilege and restrictive policies of mercan-
Conceptualizations of civil society as an autonomous tilism of the ancien reT gime. For Bentham, the new
domain capable of generating order and progress economic order required positive state intervention,
through its own dynamics now appeared dubious. and government was inseparable from an ‘art of
Hegel’s civil society was not the civilized society of directing the national industry to purposes to which it
Smith, but bourgeois society which manifested ex- may be directed with greatest advantage.’ Von Stein,
tremes of wealth and poverty that threatened to agreeing with Hegel’s perception of civil society as a
destroy the productivity of individuals pursuing their site of conflict and oppression, identified the ‘social
self-interest (Reidel 1984). The destructive potential of problem’ as the main obstacle to economic progress.
civil society, he felt, could be restrained through Progress required a market economy, or civil society;
redirecting individual self-interest by administrative however, the injustices and inequalities it generated
means, including justice, police, and moral measures. must be ameliorated through the state’s administrative
Marx (1818–1883) subscribed to Hegel’s critique of activity. The creation of social citizenship and the
political economy and his conception of civil society as active participation of the citizen in the state’s decision
an arena of self-interest and divisiveness with a making were central to this process (Pasquino 1981).
potential for self-destruction (Bottomore 1983). But Thus, civil society represented an outcome of collective
unlike Hegel, Marx was not concerned with the struggles and clashes among divergent interests that
containment of bourgeois society’s destructive po- were mediated through the state’s administrative
tential. Instead he focused on the revolutionary practices, so that for von Stein, as for Bentham, the
transformation of economic relations in civil society, state was not the Rechtsstaat (‘rule of law’ state) that
which he believed was possible through the mobiliz- stood outside civil society, but the Sozialstaat (social
ation of conflicts between different interest groups. state) representing a process whereby society was
For Marx, the state, distinct from civil society or continuously formed and reformed.
buW rgerliche Gesellschaft, is largely limited to forma- Not all observers of nineteenth century Europe saw
listic and negative activities since in his view, civil the centralized administrative state as the liberator of
society both preceded and determined the state. Civil civil society. Conservative Romantics rejected the
society required no regulation but was ruled through notion of the state or politics shaping society and
the contingencies of class struggle. assigned self-policing power to society and church.
Alexis de Tocqueville (1805–1859) in his De la demo-
cratie en Amerique (1835–1840) saw the real danger to
4. Ciil Society and the Liberal State modern society in the new despotism of the all-
pervasive state administrations rather than in class
Political and academic debates of the nineteenth conflicts. Anticipating his fellow Frenchman, Michel
century addressed the question of how to achieve Foucault (1926–84), de Tocqueville pointed to the

administrative suffocation of civil society as evidenced in the state’s monopoly of public education, health care, and social services to the poor and unemployed that subjected all aspects of citizens’ lives to state scrutiny (Keane 1988). De Tocqueville stressed the importance of voluntary associations in placing checks on administrative despotism by providing the services that people expected from government, thus pre-empting government intervention in civil society. Recalling the ideas of the Scottish Enlightenment, de Tocqueville saw voluntary associations as generating and expressing community civic values.
Civil society all but disappeared from political and academic debates in the late nineteenth century and the period following World War I, a time of slackening economic growth, rising working-class activism, imperial rivalries, and wars, which witnessed the displacement of political power from state administrations to major organized groups in society. The European welfare states, heirs to the liberal states, increasingly became sites for centralized and bureaucratic bargaining among political parties, labor unions, and business cartels vying for economic and political power and blurring the distinction between state and civil society, public and private (Maier 1987).
Antonio Gramsci (1891–1937), leader of the Italian Communist Party, represented a notable exception to the lack of interest in the notion of civil society in the post-World War I era. He reformulated the Marxist–Hegelian understanding of civil society in terms of the corporatist Zeitgeist of Italy in the 1920s. For Gramsci, civil society was not merely the sphere of individual needs but also of organizations where the hegemony of the ruling class and consent to that rule was negotiated. In this sense, civil society comprised not only all material (economic) but also political and cultural relations. While Marx insisted on separation between state and society, Gramsci held the two were interrelated. Hegemony, which was basic to Gramsci’s notion of civil society, presupposed ‘interpretations’ of the economic structure; that is, the political and cultural mediation of different interests (Bobbio 1988). In that sense, Gramsci had a kindred spirit in Hegel rather than Marx.

5. Late Twentieth Century Understandings of Civil Society

After World War II, the European political and economic order focused politics on the state with welfare states subsuming civil society to state regulation. As in the late nineteenth century and the post-World War I era, economic growth, distribution, and security represented the core themes of politics. However, now these themes were generalized to regions outside Western Europe, including Eastern Europe and post-colonial societies in the so-called Third World. The different regions varied greatly in terms of how growth was attained and sustained, and the efficiency of realizing their economic and social goals. They also varied in terms of their political institutions and modes of resolution of social and political conflict.
From the late 1940s to the mid-1970s sustained economic growth in the capitalist economies of Western Europe and North America made challenges to economic redistribution less urgent, and the internationalization of the security problem (e.g., the formation of NATO) led to a consensus regarding security. Capital transfers from a booming United States economy, including military and economic aid, helped to sustain the developing states of the Third World, while the statist socialist regimes relied on the ideological and political disciplining of their populations to generate surpluses centrally distributed by the state.
During the 1950s and 1960s, some social scientists in Western Europe and the United States predicted that under conditions of long-term growth, political problems could be largely transformed into noncontroversial administrative routines resolvable by experts. Socialist regimes perfected this functionalist perception of politics where government excluded all competing forms of political organization, including political parties, trade unions, and collective bargaining. However, this perspective precluded discussion of civil society. Socialist ideologues identified civil society with bourgeois society and held it was superseded by the advent of socialism.
In the mid-1970s, slowing economic growth in the advanced capitalist societies resulted in constraints on the ability of the state to tax business and distribute assistance to the working class in the form of social welfare benefits. The fiscal crisis was further aggravated by rising costs of welfare provisions, and called into question the validity of ideas that subsumed society to state regulation. Since the 1970s, social and political debates have largely focused on the critique of social statism and the differentiation of civil society from the state.
From a neo-conservative stance, over-politicization of the state and the fusion of political and non-political spheres of social life has eroded political authority, weakened governability, and destroyed the autonomy and authority of non-political spheres, including family, religion, and the market. Neo-conservatives proposed the boundaries between state and civil society be redrawn. Civil society, in opposition to the state, represented the depoliticized sphere of market activity and all else, including religion and family. It should be reintegrated through a cultural model that cohered the exchange activities of self-interested individuals. This is a model of the self-regulating civil society of non-profit voluntary associations reminiscent of the eighteenth-century
Scottish Enlightenment. During the 1980s and 1990s, social science paradigms based on individual self-interest and rational expectations presented models for social programming that would render a politicized civil society unnecessary. In the neoconservative view, the state is a lean structure characterized by effective authoritarian forms of action, approximated by its Thatcherite interpretation in 1980s Britain.
For proponents of the voluntary social movements paradigm, statism had disastrous environmental, political, and social consequences. They proposed an understanding of civil society as an autonomous politicized sphere independent of regulation and constraint by bureaucratic political institutions. Social movements such as the environmentalist, anti-nuclear, and women’s and gay movements of the 1970s were politically motivated by concerns about the environment and quality of life, and issues of equity, authenticity, and participation. Their agendas required ‘spontaneous’ organization and not ‘officially’ sanctioned institutions of political parties and trade unions. Here, civil society referred to an intermediate institutional space between private (personal) and public (the object of official political institutions and actors).
Jurgen Habermas’ (1929) idea of public sphere parallels this understanding of civil society in the context of ideas and communications; it includes the organization of civic or bourgeois opinion as represented by associations and media. Public sphere is a space where communication about collective values takes place. In advanced capitalist societies where political and economic domains merge, the state has become a major actor in the market economy and is thus no longer able to advance the common good. This, for Habermas, signals the legitimation crisis of advanced capitalist societies and raises the issue of the formulation of a public discourse outside of the spheres of market economy and the welfare state and, in doing so (in the spirit of Kant), subjects both spheres to the critical scrutiny of communicative rationality.
In the mid-1970s, the crises in socialist states, initially Poland, provoked Central and Eastern European intellectuals to address the issue of the politicization of civil society. After failed revolutions in Hungary (1956) and Czechoslovakia (1968), the new evolutionism of Adam Michnik and Jacek Kuron proposed the bottom-up construction of a civil society to repulse state intrusions into social life. For other dissident socialist intellectuals in the region, most notably Vaclav Havel, civil society represented the domain of anti-politics; it was a vision of society not simply independent of the state but opposed to it.
During the 1980s and 1990s in many parts of the world, including the former socialist states of Central and Eastern Europe, the expansion of the market system, including privatization programs, led to two divergent perceptions of civil society. The first represented civil society in opposition to global capitalism and to the state as the implementor of market reforms. The second emphasizes the centrality in civil society of religion, family, and voluntary associations for generating moral, economic, and cognitive norms.

See also: Civic Culture; Civil Society/Public Sphere, History of the Concept; Communication and Democracy; Democracy: Normative Theory; Democratic Theory; Hegel, Georg Wilhelm Friedrich (1770–1831); Hobbes, Thomas (1588–1679); Kant, Immanuel (1724–1804); Liberalism: Historical Aspects; Montesquieu, Charles, the Second Baron of (1689–1755); Participation: Political; Smith, Adam (1723–90)

Bibliography
Aristotle 1965 In: Barker E (ed.) The Politics. Oxford University Press, New York
Bobbio N 1988 Gramsci and the concept of civil society. In: Keane J (ed.) Civil Society and the State: New European Perspectives. Verso, London
Bottomore T (ed.) 1983 Civil Society. A Dictionary of Marxist Thought. Blackwell, Oxford, UK
Calhoun C (ed.) 1992 Habermas and the Public Sphere. MIT Press, Cambridge, MA
Cicero 1988 The Republic. Cambridge University Press, Cambridge, UK
Cohen J L, Arato A 1992 Civil Society and Political Theory. MIT Press, Cambridge, MA
Donzelot J 1984 L’Invention du social. Fayard, Paris
Ehrenberg John 1999 Civil Society: The Critical History of an Idea. New York University Press, New York
Ellis E 2000 Immanuel Kant’s two theories of civil society. In: Trentmann F (ed.) Paradoxes of Civil Society: New Perspectives on Modern German and British History. Bergham, New York
Gierke O 1958 Political Theories of the Middle Age. Beacon, Boston
Ferguson A 1995 An Essay on the History of Civil Society. Transaction, New Brunswick, NJ
Havel V 1988 Anti-political politics. In: Keane J (ed.) Civil Society and the State: New European Perspectives. Verso, London
Hegel G W 1967 (Knox T M, English translation) The Philosophy of Right. Oxford University Press, London
Hobbes T 1949 (Lamprecht S P, English translation) De Cive or The Citizen. Greenwood, New York
Jacob C M 1991 The enlightenment redefined: The formation of modern civil society. Social Research 58(2): 475–95
Kant I 1963 What is enlightenment? Beck L W (ed.) On History. The Bobbs-Merrill Company, Indianapolis, IN
Keane J (ed.) 1988 Civil Society and the State: New European Perspectives. Verso, London
Kocka J 2000 Zivilgesellschaft als historisches Problem und Versprechen. In: Hildermeier M, Kocka J, Conrad C (eds.) Europäische Zivilgesellschaft in Ost und West: Begriff, Geschichte, Chancen. Campus Verlag, Frankfurt/Main, Germany
Maier C (ed.) 1987 Changing Boundaries of the Political: Essays on the Evolving Balance between State and Society, Public and Private in Europe. Cambridge University Press, Cambridge, UK
Neocleous M 1996 Administering Civil Society: Towards a Theory of State Power. MacMillan, London
Offe C 1987 Challenging the boundaries of institutional politics: Social movements since the 1960s. In: Maier C (ed.) Changing Boundaries of the Political: Essays on the Evolving Balance Between State and Society, Public and Private in Europe. Cambridge University Press, Cambridge, UK
Pasquino P 1981 Introduction to Lorenz von Stein. Economy and Society 10(1): 1–6
Reidel M 1984 Between Tradition and Evolution: The Hegelian Transformation of Political Philosophy. Cambridge University Press, Cambridge, UK
Richter M 1998 Montesquieu and the concept of civil society. The European Legacy 3(66): 33–41
Rosanvallon P 1988 The decline of social visibility. In: Keane J (ed.) Civil Society and the State: New European Perspectives. Verso, London
Smith A 1974 The Wealth of Nations. Penguin, Harmondsworth, UK
Trentmann F (ed.) 2000 Paradoxes of Civil Society: New Perspectives on Modern German and British History. Bergham, New York
Tribe K 1988 Governing Economy: The Reformation of German Economic Discourse, 1750–1840. Cambridge University Press, Cambridge, UK

H. Islamoglu

Civil Society/Public Sphere: History of the Concept

The closely related concepts of civil society and public sphere developed in the early modern era to refer to capacities for social self-organization and influence over the state. Civil society usually refers to the institutions and relationships that organize social life at a level between the state and the family. Public sphere is one of several linked terms (including ‘public space,’ simply ‘public,’ and the German Öffentlichkeit, or publicness) that denote an institutional setting distinguished by openness of communication and a focus on the public good rather than simply compromises among private goods. Located in civil society, communication in the public sphere may address the state or may seek to influence civil society and even private life directly. Key questions concern the extent to which it will be guided by critical reason, and how boundaries between public and private are mediated.

1. Civil Society and Self-organization

The distinction of ‘civil society’ from the state took its modern form in the seventeenth and eighteenth centuries. Prior to this separation, political and social realms were seldom clearly distinguished. When they were, the social was exemplified by the family and often subordinated as the realm of necessity or mere reproduction to the broader public character and possibilities for active creation that lay in the state. The Greek conception of the polis, for example, usually referred to both, but when a distinction was made, it clearly favored the state. Roman law contributed the idea of civitas and a stronger sense of relations among persons that were neither narrowly familial nor specifically about constituting the political society through the state. Medieval political and legal theory developed this theme, especially in relation to the freedoms claimed by medieval cities but also in relation to the Church. Some strands juxtaposed the notion of legitimacy ascending from ‘the people’ to the eventually dominant idea of divine right of kings, with its notion of legitimacy descending from God. Also influential was the distinction of civil from criminal law (in which the former governs relations formed voluntarily among individuals and the latter the claims of the whole society against malefactors). Nonetheless, it was only in the course of early modern reflection on the sources of social order that civil society came to be seen as a distinct sphere.
A crucial step in this process was the ‘affirmation of ordinary life’ (Taylor 1989). Whereas the Greek philosophers had treated the private realm, including economic activity, as clearly inferior to the public realm associated with affairs of state, many moderns placed a new positive value on family and economic pursuits. They argued that both privacy and civil society needed to be defended against encroachments by the state. In this context, it was also possible to conceive of a public sphere that was not coterminous with the state but rather located in civil society and based on its voluntary relations. In this communicative space citizens could address each other openly, and in ways that both established common notions of the public good and influenced the state.
Social contract and natural law theories, especially as joined in the work of John Locke, contributed to this shift by suggesting ways in which the creation of society conceptually preceded the creation of government. From this it was only a short step to say that the legitimacy of government depended on its serving the needs of civil society (or of ‘the people’). Thomas Paine and other advocates of freedom from unjust rule advanced an image of the freedoms of Englishmen which was influential not only in England and America, but in France, notably in Montesquieu’s account of the ‘spirit’ of laws which combined an appreciation of English division of powers with an older tradition of republican (and aristocratic) virtue. From Rousseau through Tocqueville, Comte, and Durkheim, this French tradition developed an ever-stronger account of the autonomy of the social (resisting not only the claims of the state but the Cartesian postulate of the primacy of the individual subject).
A crucial innovation was to understand society as at least potentially self-organizing rather than organized only by rulers. If there was a single pivotal intellectual source for this, it lay with the Scottish moralists. In
Adam Smith’s (1776) notion of the invisible hand, the market exemplified this self-organizing capacity but did not exhaust it. In his Essay on the History of Civil Society (1767), Adam Ferguson presented human history as a series of social transformations leading to modern society. This prompted Hegel (1821) to treat civil society as a field in which the universal and particular contended; their reconciliation depended on the state. The idea of civil society also shaped classical political economy and ideas of social evolution, and informed Marx’s account of the stages of historical development as combinations of productive capacity and (conflict-ridden) social relations. Marx also challenged the notion that markets were neutrally self-organizing, emphasizing the role of historical accumulations of power.
Though the actual analyses differed, what had been established was the notion of society as a distinct object of analysis, not reducible to either state or individual. People formed society impersonally as actors in markets, more personally as parties to contracts. The idea of civil society hearkened back to the sort of social life that emerged among the free citizens of medieval cities because this was largely self-regulated, as distinct from direct rule by ecclesiastical or military authorities. It also suggested ‘civility’ in interpersonal relations. This meant not just good manners, but a normative order facilitating amicable or at least reliable and nonthreatening relationships among strangers and in general all those who were not bound together by deep private relations like kinship. Equally important, the idea of civil society included, in some versions, the notion that communication among members might be the basis for self-conscious decisions about how to pursue the common good. This notion is basic to the modern idea of public sphere.

2. The Idea of a Public Sphere

Rousseau (1762) famously sought to understand how social unity could result from free will rather than external constraint. This depended, he argued, on transcending the particular wills of many people with a general will that was universal. Kant admired Rousseau’s pursuit of unity in freedom as distinct from mere social instinct (as in Aristotle’s notion of a political animal) or imposition of divine authority. He relied implicitly on the idea of a collective conversation through which individual citizens reach common understandings. Likewise the development of representative institutions in eighteenth century England informed and anchored a public discourse directed at bringing the will and wisdom of citizens to bear on affairs of state. Finally, the idea of the people as acting subject came to the fore in the American and French revolutions. The idea of a public sphere anchored democratic and republican thought in the capacity of citizens in civil society to achieve unity and freedom through their discourse with each other.
Kant, like many eighteenth-century philosophers, lacked a strong notion of the social. This Hegel (1821) supplied, rejecting social contract theory because even in Rousseau’s notion of a general will it suggested that the union achieved in the state depended not on its own absolute universality but on a development out of individual wills. Nationalism also shaped ideas of society and political community in holistic ways well matched to unitary states (Calhoun 1999). Marx’s (1843, 1927) critique of politics based on bourgeois individual rights further challenged the adequacy of civil society as a realm of freedom and unity. Where Hegel thought that the state in itself might overcome the tension between necessity and freedom and the clash of particular wills, Marx held that only a transformation of material conditions including the abolition of private property could make this possible. As a result, theories stressing stronger ideas of the social were apt to offer weaker notions of public life. The Marxist tradition denigrated ‘mere democracy’ as an inadequate means of achieving either freedom or unity.
The ideas of public sphere and civil society developed primarily in liberal theory. These were not always seen in the manner of Hegel as merely ‘educative’ on the way to a more perfect latter unity. Nor was political unity necessarily left to the workings of an invisible hand or other unchosen system, but freedom was treated commonly as a matter of individual rather than collective action. This accompanied the rise of relatively asocial understandings of the market (Polanyi 1944). In addition, the emerging notion of the public sphere was not clearly distinct from other usages of ‘public.’ State activity, for example, was sometimes described as public without regard to its relationship to democracy or its openness to the gaze or participation of citizens. This usage survives in reference to state-owned firms as ‘public’ regardless of the kind of state or the specifics of their operation.
More important was the overlapping concept of ‘public opinion’ (see Public Opinion: Political Aspects). The dominant eighteenth-century usage emphasized open expression and debate, contrasting free public opinion to absolutist repression. At the same time, it generally treated public opinion as a consensus formed on the basis of reasoned judgment. ‘Opinion’ was something less than knowledge, but especially where it had been tested in public discourse, it was not simply sentiment and it gained truth-value from reflexive examination. Various euphemisms like ‘informed opinion’ and ‘responsible opinion,’ however, reflected both a bias in favor of the opinions of elites and an anxiety about the possibly disruptive opinions of the masses. During the nineteenth century, this anxiety came increasingly to the fore. Tocqueville (1840) and Mill (1859), thus, both contrasted public opinion to reasoned knowledge; Mill especially worried about ‘collective mediocrity’ in which the opinion of debased
masses would triumph over scientific reason. While advocates of the public sphere saw rational-critical discourse producing unity, critics saw mass opinion reflecting psychosocial pressures for conformity. Implicitly, they associated reason with individuals rather than any collective process. The distinction between ‘public’ and ‘crowd’ or ‘mass’ was lost in such views (Splichal 2000). Early positivist research into public opinion approached it as explicable on the basis of social psychology rather than as a species of reasoned argument. Toennies (1922) sought a way to discern when each approach ought to apply.
Conversely, in the late nineteenth and twentieth centuries, a new field of public opinion research developed that approached public opinion as an aggregation of individual opinions. The shift was based largely on the development of empirical polling methods. It brought a renewal of attention to differences within public opinion, and thus to the distinction between public and crowd (Blumer 1948, Key 1961). It also focused attention on patterns of communication among members of the public rather than the more generalized notions of imitation or emotional contagion. New media, first newspapers and then broadcast, figured prominently in efforts to understand public communication. While Lippman (1960) and a variety of social psychologists worried that the new media would produce the descent to a lowest common denominator of public opinion that liberals had long feared, Dewey (1927) and other pragmatists defended the capacity for reason in large-scale communication. In this, they hearkened back to the eighteenth-century hopes of Kant and Rousseau.
Even before the apotheosis of the opinion poll, Cooley (1909) had argued emphatically that public opinion ought to be conceived as ‘no mere aggregate of individual opinions, but a genuine social product, a result of communication and reciprocal influence.’ A key question was whether this communication and reciprocal influence amounted to the exercise of reason. Peirce (1878) had argued that among scientists the formation of consensus on the basis of openness and debate was the best guarantee of truth. Could this view be extended into less specialized domains of public discourse? This has been an enduring focus for Jurgen Habermas, the most influential theorist of the public sphere.

3. Habermas

In the context of some cynicism about democratic institutions, Habermas (1962) set out to show the unrealized potential of the public sphere as a category of bourgeois society. He challenged most directly the tendencies in Marxism and critical theory to belittle democratic institutions, and also the collapsing of public into state characteristic not only of Hegel but of actually existing socialism. Habermas celebrated the emancipatory potential of a collective discourse about the nature of the public good and the directions of state action. This could be free insofar as it was rational (based on the success of argument and critique rather than the force of either status or coercion) and could achieve unity by disregarding particular interests (like particular statuses) in favor of the general good. The best version of the public sphere was based on ‘a kind of social intercourse that, far from presupposing the equality of status, disregarded status altogether.’ It worked by a ‘mutual willingness to accept the given roles and simultaneously to suspend their reality’ (Habermas 1962, p. 131).
The basic question guiding Habermas’ exploration of the public sphere was: to what extent can the wills or opinions guiding political action be formed on the basis of rational-critical discourse? This is a salient issue primarily where economic and other differences give actors discordant identities and conflicting interests. For the most part, Habermas took it as given that the crucial differences among actors were those of class and largely political-economic status; in any case, he treated them as rooted in private life and brought from there to the public. He focused on how the nature, organization, and opportunities for discourse on politically significant topics might be structured so that class and status inequalities were not an insuperable barrier to political participation. The first issue, of course, was access to the discourse. This was not so simple as the mere willingness to listen to another’s speech, but also involved matters like the distribution of the sorts of education that empowered speakers to present recognizably ‘good’ arguments. Beyond this, there was the importance of an ideological commitment to setting aside status differences in the temporary egalitarianism of an intellectual argument.
The public sphere joined civil society to the state by focusing on a notion of public good as distinct from private interest. It was however clearly rooted in civil society and indeed in the distinctive kind of privacy it allowed and valued.
The bourgeois public sphere may be conceived above all as the sphere of private people coming together as a public; they soon claimed the public sphere regulated from above against the public authorities themselves, to engage them in a debate over the general rules governing relations in the basically privatized but publicly relevant sphere of commodity exchange and social labor. The medium of this political confrontation was peculiar and without historical precedent: people’s public use of their reason (Habermas 1962, p. 27).
This public use of reason depended on civil society. Businesses from newspapers to coffee shops, for example, provided settings for public debate. Social institutions (like private property) empowered individuals to participate independently in the public sphere;
forms of private life (notably that of the family) prepared individuals to act as autonomous, rational-critical subjects in the public sphere. But the eighteenth-century public sphere was also distinguished by its normative emphases on openness and rational political discourse. Habermas’ concern focused on the way later social change brought these two dimensions into conflict with each other.

The idea of publicness as openness underwrote a progressive expansion of access to the public sphere. Property and other qualifications were eliminated and more and more people participated. The result was a decline in the quality of rational-critical discourse. As Habermas later summed up:

Kant still counted on the transparency of a surveyable public sphere shaped by literary means and open to arguments and which is sustained by a public composed of a relatively small stratum of educated citizens. He could not foresee the structural transformation of this bourgeois public sphere into a semantically degenerated public sphere dominated by the electronic mass media and pervaded by images and virtual realities (Habermas 1998, p. 176).

While Habermas’ account of the continuing value of the category of public sphere evoked by the eighteenth-century ideal set him apart from Horkheimer and Adorno (1944) and their pessimistic turn in critical theory, he largely incorporated their critique of ‘mass society’ as ‘administered society’ into his survey of twentieth-century developments and with it many of the fears of nineteenth-century liberals. He held that the public sphere was transformed not only by simple increase of numbers but by the success of various new powers at re-establishing in new form the power to ‘manage’ public opinion or steer it from above. Public relations agents and public opinion polls replaced rational-critical debate; electronic media allowed openness but not the give and take conversation of the eighteenth-century coffee houses. At the same time, rising corporate power and state penetration of civil society undermined the distinction of public and private, producing a ‘refeudalization’ of society.

4. Arendt

Hannah Arendt also focused on the problem of collapsing distinctions between public and private. Arendt emphasized the capacity of action in public to create the world that citizens share in common. The term ‘public,’ she wrote, ‘signifies two closely interrelated but not altogether identical phenomena: It means, first, that everything that appears in public can be seen and heard by everybody and has the widest possible publicity. … Second, the term “public” signifies the world itself, in so far as it is common to all of us and distinguished from our privately owned place in it’ (Arendt 1958, pp. 50, 52). Public action, moreover, is the crucial terrain of the humanly created as distinct from the natural world, of appearance and memory, and of talk and recognition. Such action both requires and helps to constitute public spaces—spaces held in common among people within which they may present themselves in speech and recognize others. Public action is thus a realm of freedom from the necessity—notably of material reproduction—that dominates private life.

Arendt’s usual term, ‘public space,’ leaves the ‘shape’ of public life more open than the phrase public sphere. Public action can create institutions, as in the founding of the American Republic, but as action it is unpredictable. Its publicness comes from its performance in a space between people, a space of appearances, but it is in the nature of public action to be always forming and reforming that space and arguably the people themselves. This conceptualization offers clear advantages for thinking about the place of plurality in the public sphere. As Arendt wrote of America, ‘since the country is too big for all of us to come together and determine our fate, we need a number of public spaces within it’ (1972, p. 232).

Arendt saw this plurality threatened not just by mass conformity but by the reduction of public concerns to material matters. A focus on sex as much as on the economy threatens the public–private distinction. It not only intrudes on intimacy and private life but impoverishes public discourse. Arendt (1951) saw this problem as basic to totalitarianism, which could allow citizens neither privacy nor free public discourse. Totalitarianism is distinguished from mere tyranny by the fact that it works directly on private life as well as limiting public life. This is not just a matter of contrasting intentions, but of distinctively modern capacity. Modern sociological conditions offer rulers the possibility to reach deeply into the family in particular and personal life in general, to engineer human life in ways never before imagined.

This potential for collapsing the public and private realms is linked to Arendt’s unusually negative view of civil society. ‘Society,’ she writes, is ‘that curious and somewhat hybrid realm which the modern age interjected between the older and more genuine realms of the public or political on one side and the private on the other’ (1990, p. 122). Civil society is first and foremost a realm of freedom from politics. But public freedom is freedom in politics. This calls for action that creates new forms of life, rather than merely attempting to advance interests or accommodate to existing conditions. This distinguishes Arendt’s view, and republicanism generally, from much liberal thought: ‘Thus it has become almost axiomatic even in political theory to understand by political freedom not a political phenomenon, but on the contrary, the more or less free range of nonpolitical activities which a given body politic will permit and guarantee to those who constitute it’ (1990, p. 30).

The founding of the United States was a favorite example of such action for Arendt. The American
Founders imagined and created a new kind of society, a new set of institutions. This relied on citizens’ public commitments to each other rather than assumptions about human nature or mere external application of law. The Founders ‘knew that whatever men might be in their singularity, they could bind themselves into a community which, even though it was composed of “sinners,” need not necessarily reflect this “sinful” side of human nature’ (1990, p. 174). Arendt’s vision of public life as central to a moral community shares much with a republican tradition that deplores the modern decline of the public sphere—generally associated with the rise of particular interests at the expense of concern for the general good, the deterioration of rational public discourse about public affairs, or outright disengagement of citizens from politics (see Public Sphere: Nineteenth- and Twentieth-century History). Republican accounts of the public sphere place a strong emphasis on the moral obligations of the good citizen; recent scholarship has often questioned whether citizens lived up to significantly higher standards in earlier eras (Schudson 1998).

5. Differentiation in the Public Sphere and Civil Society

Habermas’ account of the public sphere has been enduringly influential (see Calhoun 1992). Its delayed translation into English in 1989 ironically contributed to an invigorating new reading shaped by both the fall of communism and widespread projects of privatization in the West. Critics within communist societies had revived the notion of civil society (as distinct from simply ‘society’) in order to speak of the realm outside state control and its relative absence in communist societies. Likewise, transitions away from right-wing dictatorships were often treated in terms of a ‘return of civil society’ (Pérez-Díaz 1993). In the US, the idea of civil society was linked not only to democracy but to reliance on voluntary organizations and philanthropy (Powell and Clemens 1998, Putnam 2000).

What civil society signifies in contemporary political analysis is the organization of social life on the basis of interpersonal relationships, group formation, and systems of exchange linking people beyond the range of intimate family relations and without reliance on direction by the government. As a number of scholars of Africa have noted, it incorporates an unfortunate understanding of family privacy that underestimates the positive and supraprivate social roles that African kin organizations can play (see essays in Harbeson et al. 1994). Even more basically, references to civil society often fail to distinguish adequately between systemic capitalist economic organization and much more voluntary creation of social organization through the formation of civic associations, interest groups, and the like—a distinction Habermas has sought to stress. This has sometimes been a source of confusion in use of the public sphere concept to analyze distinctive institutional developments in diverse political and cultural settings (Calhoun 1993).

Civil society has been important to defenders of free market economics because it suggests the virtues of an economy in which participants’ choices are regulated by their interests rather than their official statuses. In principle, such an economy is able to effectively produce and circulate goods on the basis of prices rather than government direction. Civil society has been equally important to advocates of democracy because it signifies the capacity of citizens to create amongst themselves the associations necessary to bring new issues to the public agenda, to defend both civil and human rights, and to provide for an effective collective voice in the political process. This involves both a free press and political mobilization on the basis of parties and interest groups (see Cohen and Arato 1992 for the most detailed review; also Chandhoke 1995, Seligman 1992, Alexander 1998, Keane 1999). Habermas (1992, p. 367) summarizes the recent usage: ‘civil society is composed of those more or less spontaneously emergent associations, organizations, and movements that, attuned to how societal problems resonate in the private life spheres, distill and transmit such reactions in amplified form to the public sphere. The core of civil society comprises a network of associations that institutionalizes problem-solving discourses on questions of general interest inside the framework of organized public spheres.’ Habermas’ work more generally, however, reveals this to be a minimally theorized as well as optimistic usage. It highlights one aspect of civil society but does not make clear the most basic issue.

While part of the heritage of the idea of civil society has been the effort to organize society through public discourse, an equally influential part has been the claim to privacy, the right to be left alone, the opportunity to enter into social relations free from governance by the state or even the public. The idea of business corporations as autonomous creatures of private contract and private property thus reflects the heritage of civil society arguments as much as the idea of a public sphere in which citizens joined in rational-critical argument to determine the nature of their lives together. Civil society refers to the domains in which social life is self-organizing, that is, in which it is not subject to direction by the state. But this self-organization can be a matter of system function or of conscious collective choice through the public sphere (Calhoun 2001).

Habermas’ account of the public sphere drew a variety of important critical responses. One of the first focused on the extent to which he focused on the bourgeois public sphere and correspondingly neglected nonbourgeois public life and failed to clarify some of the conditions built into the bourgeois ideal. Negt and Kluge (1972) responded with an account of the
proletarian public sphere. Clearly, workers have at many points built their own institutions, media, and networks of communication, and entered into contention with bourgeois elites and other groups over the collective good. But if this is a discursive competition—that is, if workers and bourgeois argue over what constitutes the collective good rather than only fighting about it—then this implies an encompassing public sphere, albeit an internally differentiated one.

Nancy Fraser (1992) has influentially emphasized the importance of ‘subaltern counterpublics’ such as those framed by race, class, or gender. Some publics—even very partial ones—may claim to represent the whole; others oppose dominant discursive patterns and still others are neutral. Not all publics that are distinguished from the putative whole are subaltern. As Michael Warner (2001) has suggested, the deployment of claims on an unmarked public as the public sphere is also a strategy, generally a strategy of the powerful. Yet, it is important to keep in mind both that the existence of counterpublics as such presupposes a mutual engagement in some larger public sphere and that the segmentation of a distinct public from the unmarked larger public may be a result of exclusion, not choice. Feminist scholars especially have drawn attention to both the gender biases within family life that disempower women and the historically strong gender division between public and private realms on which male political freedom has generally rested (Elshtain 1993, Young 2000).

6. Conclusion

Theories of civil society focus on the capacity for self-organization of social relations, outside the control of the state and usually beyond the realm of family. The basic question posed by theories of the public sphere is to what extent collective discourse can determine the conditions of this social life. Contemporary research on civil society and the public sphere turns on the breadth of political participation, the extent to which capitalist markets limit other dimensions of self-organization in civil society, the existence of multiple or overlapping public spheres, the impact of new communications media, and the quality of rational-critical discourse and its relationship to culture-forming activities. These issues also inform discussions about international civil society and its public sphere.

The concepts of civil society and public sphere took on their primary modern dimensions in the late eighteenth and early nineteenth centuries in Western Europe and to a lesser extent the United States. They have become important in a variety of other settings, including in conceptualizing social autonomy in relationship to communist and authoritarian states. They inform democratic projects as well as academic research in a variety of settings and are in turn themselves informed by cultural creativity and social action.

See also: Citizenship and Public Policy; Civil Society, Concept and History of; Democracy; Individual/Society: History of the Concept; Public Good, The: Cultural Concerns; Public Sphere: Nineteenth- and Twentieth-century History; State, History of

Bibliography

Alexander J C 1998 Real Civil Societies: Dilemmas of Institutionalization. Sage, Thousand Oaks, CA
Arendt H 1951 The Origins of Totalitarianism. Harcourt Brace, New York
Arendt H 1958 The Human Condition. University of Chicago Press, Chicago
Arendt H 1972 Crises of the Republic. Harcourt Brace Jovanovich, New York
Arendt H 1990 On Revolution. Penguin, New York
Blumer H 1948 Public opinion and public opinion polling. American Sociological Review 13: 542–54
Calhoun C (ed.) 1992 Habermas and the Public Sphere. MIT Press, Cambridge, MA
Calhoun C 1993 Civil society and public sphere. Public Culture 5: 267–80
Calhoun C 1999 Nationalism, political community, and the representation of society: Or, why feeling at home is not a substitute for public space. European Journal of Social Theory 2(2): 217–31
Calhoun C 2001 Constitutional patriotism and the public sphere: Interests, identity, and solidarity in the integration of Europe. In: De Greiff P, Cronin P (eds.) Transnational Politics. MIT Press, Cambridge, MA
Chandhoke N 1995 State and Civil Society: Explorations in Political Theory. Sage, New Delhi
Cohen J, Arato A 1992 The Political Theory of Civil Society. MIT Press, Cambridge, MA
Cooley C H 1909 Social Organization: A Study of the Larger Mind. Scribner, New York
Dewey J 1927 The Public and its Problems. Ohio State University Press, Columbus, OH
Elshtain J B 1993 Public Man, Private Woman. Princeton University Press, Princeton, NJ
Ferguson A 1767 Essay on the History of Civil Society. Transaction Publishers, New Brunswick, NJ
Fraser N 1992 Rethinking the public sphere: a contribution to the critique of actually existing democracy. In: Calhoun C (ed.) Habermas and the Public Sphere. MIT Press, Cambridge, MA, pp. 109–42
Habermas J 1962/1991 The Structural Transformation of the Bourgeois Public Sphere: An Inquiry into a Category of Bourgeois Society [trans. Burger T]. MIT Press, Cambridge, MA
Habermas J 1992 Between Facts and Norms. MIT Press, Cambridge, MA
Habermas J 1998 In: Cronin C, De Greiff P (eds.) The Inclusion of the Other. MIT Press, Cambridge, MA
Harbeson J W, Rothchild D, Chazan N (eds.) 1994 Civil Society and the State. Lynne Rienner, Boulder, CO
Horkheimer M, Adorno T W 1944/1972 Dialectic of Enlightenment. Herder and Herder, New York
Hegel G W F 1821 The Philosophy of Right [trans. Knox T M]
Keane J 1999 Civil Society. Stanford University Press, Stanford, CA
Key V O 1961 Public Opinion and American Democracy. Knopf, New York
Lippman W 1960 Public Opinion. Macmillan, New York
Marx K 1843/1975 On the Jewish question. Marx–Engels Collected Works. Lawrence and Wishart, London, Vol. 3, pp. 146–74
Marx K 1927/1975 Critique of Hegel’s philosophy of law. Marx–Engels Collected Works. Lawrence and Wishart, London, Vol. 3, pp. 1–129
Mill J S 1859 On Liberty. Penguin, London
Negt O, Kluge A 1972/1993 The Public Sphere and Experience. University of Minnesota Press, Minneapolis, MN
Peirce C S 1878/1992 The Essential Peirce: Selected Philosophical Writings, 1867–1893. Indiana University Press, Bloomington, IN
Pérez-Díaz V M 1993 The Return of Civil Society: The Emergence of Democratic Spain. Harvard University Press, Cambridge, MA
Polanyi K 1944 The Great Transformation: the Political and Economic Origins of Our Time. Beacon, Boston
Powell W, Clemens E (eds.) 1998 Private Action and the Public Good. Yale University Press, New Haven, CT
Putnam R 2000 Bowling Alone. Simon and Schuster, New York
Rasmussen T 2000 Social Theory and Communication Technology. Ashgate, London
Rousseau J-J 1762 The Social Contract. Everyman Paperback Classics, London
Schudson M 1998 The Good Citizen: A History of American Civic Life. Free Press, New York
Seligman A 1992 Civil Society. Free Press, New York
Sennett R 1977 The Fall of Public Man. Knopf, New York
Smith A 1776 On the Wealth of Nations. Penguin, Harmondsworth, UK
Splichal S 2000 Defining public opinion in history. In: Hardt H, Splichal S (eds.) Ferdinand Toennies on Public Opinion. Rowman and Littlefield, London, pp. 11–48
Taylor C 1989 Sources of the Self. Harvard University Press, Cambridge, MA
Tocqueville A de 1840/1844/1961 Democracy in America. Schocken, New York, Vol. 2
Toennies F 1922/2000 Kritik der öffentlichen Meinung; selections translated. In: Hardt H, Splichal S (eds.) Ferdinand Toennies on Public Opinion. Rowman and Littlefield, London, pp. 117–210
Warner M 2001 Publics and Counterpublics. Zone Books, Cambridge, MA
Young I M 2000 Inclusion and Democracy. Oxford University Press, Oxford, UK

C. Calhoun

Civilization, Concept and History of

The concept of civilization is inextricably connected with the conditions of its emergence, most notably with the rise of historical consciousness in Europe in the eighteenth and nineteenth centuries, as well as the globalisation of this form of historical understanding and correlative forms of intellectual practice. The concept is complex and imprecise in its definition, but ubiquitous in its uses, and inextricably imbricated with other categories by which historical materials are organized, such as culture, nation, and race. Apart from designating certain morphological features of human society, particularly with reference to urbanism and urbanity, civilization has been a schema for historical categorization and for the organization of historical materials. Here it has generally taken two forms, the universalist evolutionist, and the romantic particularist. The latter was tending to regain, in the ascendant context of identity politics, a certain hegemonic primacy worldwide at the close of the twentieth century. In all, the concept of civilization forms a crucial chapter in the conceptual, social, and political history of history; it, or its equivalents, are presupposed, implicitly or explicitly, in the construal and writing of almost all histories.

1. Pre-History

1.1 The Past Continuous

The mental and social conditions for speaking about civilization in a manner recognizable in the year 2000 were not available before the middle of the eighteenth century. Hitherto, in Europe as elsewhere, large-scale and long-term historical phenomena, which later came to be designated as civilizations, had been categorized in a static manner that precluded the consciousness of directional or vectorial historicity as distinct from the mere register of vicarious change.

Hitherto, the succession of large-scale historical phenomena, such as Romanity or Islam, had been regarded (a) typologically, most specifically in the salvation-historical perspective of monotheistic religious discourse, in which successive events are taken for prefigurations and accomplishments of each other; (b) in terms of the regnal succession of world-empires; (c) in the genre of regnal succession, which started with the Babylonian king-lists and the earliest stages of Chinese historical writing, and culminated in medieval Arabic historical writing. Not even the schema of state cycles evolved by the celebrated Ibn Khaldun (d. 1406), where civilization (‘umrān) was quasi-sociologically identified with various organizational forms of human habitation and sociality, could meaningfully escape from this finite repertoire of possible historical conceptions.

In the perspective of typology, the continuity of historical phenomena was expressed in the repetition of prophecies successively reaffirming divine intent and inaugurating a final form of order whose telos would be the end of time. Thus the Jewish prophets repeat each other and are all figures for Abraham; Jesus is at once the repetition and termination of this unique cycle of terrestrial time and is prefigured in Jewish prophecies; Muhammad is the final accomplishment and the consummation of earlier prophetic revelations, prefigured in Jewish and Christian scriptures; his era inaugurates the consummation of time with the Apocalypse. The structure of time in the Talmud, in the Christian writings of Eusebius (d. 339),
of Augustine’s Spanish pupil Orosius (fl. 418), of Bernard de Clairvaux (d. 1153), as in Muslim writings such as Muhammad’s biography by Ibn Ishāq (d. after 761) or the universal histories of Tabarī (d. 923) and Ibn Kathīr (d. 1373), is homologous.

All but Jewish typology is independent of ethnic origin or geographical location, and construes historically significant units as religious communities of ecumenical description. This yields the second mode of organizing historical phenomena, the regnal. Thus, the regnal categorization of long-term historical phenomena of broad extent was expressed in terms of the succession of four ecumenical world-empires, succeeding one another as the central actors in world history: the Assyrio-Babylonian, the Median-Persian, the Alexandrian-Macedonian, and the Roman. This perspective was shared by the Book of Daniel and by ecclesiastical works, especially Syrian and Byzantine apocalypses influenced by it, albeit with minor variations, as in Orosius’ substitution of the Carthaginian world-empire for the Median-Persian. Muslim caliphs considered their own ecumenical world empire to be the fifth and final order of world history, a conception shared by Muslim apocalypticism and, with many complications and nuances, by universal histories written in Arabic, all of which regarded dynastic succession as both prophetic inheritance and as the renewal of ecumenical imperial ambition, transferred from one line to another.

Analogously, medieval Christian polities, Byzantine as well as Frankish, subscribed to the same theory of translatio imperii by regarding themselves as being in a direct line of typological continuity with Rome, variously through Byzantium, the ‘New Rome,’ or through the Holy Roman Empire. In both, Romanity was the worldly cement of Christianity. This was a conception developed by Eusebius for his contemporary overlord Constantine, and was to remain effective until the dawn of modern times. In all cases, the past was understood to have been completed at its inception, with subsequent polities re-enacting the foundational event.

Finally, mention must be made of the disjunction between these meta-historical and transcendental realms of typological continuity, and the all-too-human chaos of particular histories. No movement or qualitative change is discernible in the context of these, only the predictable succession of wars, rapine, pestilence, and occasionally of praiseworthy acts, without connection with an order of reality that might transcend the events themselves and render sense unto them.

1.2 The Past Estranged

Two roughly contemporary events heralded a new conception of history that made the modern notion of civilization conceivable. Both were specifically European, but their conceptual consequences were globalized in the course of the nineteenth and twentieth centuries. The first was Humanism, particularly in Italy, which for the first time broke the spell of Roman continuity by construing the immediate past as an age of darkness, and by confining Roman grandeur to the republican and early imperial ages. Thus Petrarch’s (d. 1374) project of classicism counterposed to living tradition, Flavio Biondo’s (d. 1463) anticipation of division of history into the classical, the medieval, and the modern, and Lorenzo Valla’s (d. 1457) refutation of medieval documentary forgeries such as the Donatio Constantini, based on an argument from anachronism: Together, these laid the ground for a view of history as the domain of change rather than of repetition.

Needless to say, the notion of translatio imperii was no longer tenable in this context. It is at this point that Humanism converged with the other event foundational of the modern historical consciousness, namely, the Reformation. Criticism of the Church by Wycliff (d. 1384) and Luther, and the historiographic expression of this anti-Traditionalist fundamentalism in John Foxe’s Acts and Monuments (1563), were in crucial ways congruent in their conception of the past with Humanism. The identification of the Pope with the Antichrist, the designation of the greater part of the history of Christianity as a history of falsehood, and the devalorization of the immediate past as abiding Tradition and its construal as degeneration, also led to the rejection of the notion of translatio imperii, and the substitution of notions of Reformation and renovation for that of transfer in continuity.

Thus the ground was prepared for notions of rise, decline, and fall—most notably the decline and fall of Rome and the reclamation of Roman republicanist models in a spirit of revivalism—that were finally to mature in the eighteenth century, with Gibbon and Montesquieu among others, spurred along with the development and ultimately, in the late eighteenth and during the nineteenth centuries, the institutional transformation of philology and antiquarianism into history as a topic of research detached from rhetoric. This was far beyond the late flowering of medievalizing typology with Bossuet (d. 1704), and it made the past tangible in its having-been (Vergangenheit), most graphically represented in the establishment of museums during the eighteenth century in many European capital cities.

With Voltaire and other eighteenth-century figures like Volney and Chardin, another notion crucial for speaking of civilization was developed. The notion of qualitative societal and cultural difference (les moeurs)—quite apart from the dynastic and the religious—was now available in the eighteenth century, as Europe was accounting for her differences from the Ottomans, the Persians, the Chinese, and tribal peoples in the Americas.
Whereas previously the notion of historical senescence may have been used in a tragical and rhetorical sense, the new notion of decadence required the correlative notion of progress and amelioration. These notions are the very conditions of possibility for conceiving of civilization as the accomplishment of a continuous line of historical development in which origins and beginnings are transcended rather than repeated.

2. Word and Concept

2.1 Civilization and Culture

The terms civilization and culture are intimately related in their reference, and in many instances are used almost interchangeably, according to national and linguistic conventions. Both are terms of ancient vintage which underwent a gradual lexical expansion until, in the eighteenth century, they came to designate meanings that are recognizable in 2001.

In the course of the seventeenth and eighteenth centuries, the meaning of the term culture expanded, in English, French, and German (as Kultur), from a medieval sense indicating the cultivation of land and of the religious cult, figuratively to denote the maintenance and cultivation of arts and letters. This figurative sense was further extended in the course of the eighteenth century to encompass the non-material life of human societies in a very broad sense, encompassing the refinement of manners no less than intellectual and artistic accomplishments associated with the Enlightenment; the former sense still persists in terms such as haute couture and Kulturbeutel (vanity case).

The term was used both locally and in contrast to societies adjudged still living in a state of nature, though in German the accent had been on an aesthetic of the lofty and the sublime as distinct from the crassly material, in association with a correlative emphasis on cultivation, Bildung, both individual and collective. In this way, the term was opened up to impregnation by the emergent notion of progress, the progress of individual societies as of humanity in general, regarded both as a process natural to human society and as a principle of normative ranking among societies.

Not dissimilarly, and in imitation of ‘culture,’ in the eighteenth century the term civilization underwent (most especially in France and somewhat later in England) a figurative expansion in its lexical reference from the Latin civilis, life under reputable forms of government, to the broader designation of order, civil, and governmental. This order befitted developed societies that might regard themselves as civilized in contradistinction to other, barbarian or savage, yet to be civilized societies. The contrastive connotations of ‘civilization’ were far more accentuated than those of ‘culture,’ which was more commonly used in Germany, a land then with little or no experience of the world outside Europe. ‘Culture’ also came to be used in France and more saliently in England, as in Germany, decidedly to signal social distance and social distinctions within particular countries.

In all cases, these two terms increasingly came to be associated with a developmental perspective on history: not only the linear and cumulative course traversed by historical phenomena in time, but also of languages, geological layers, plant and animal species, and human societies generally (and later, races), towards greater differentiation, complexity and accomplishment. Correlatively, the meanings conveyed by these terms were implied by other terms or by none at all, as for example with Rousseau and Voltaire.

The nineteenth century witnessed a complicated relationship between ‘culture’ and ‘civilization,’ whose fields of connotation and denotation shade into each other in a manner that has not helped the distinctive clarity and definition of either. The crucial player here was Germany, where Kultur took a decidedly romantic-nationalist turn in the early nineteenth century, dwelling on national uniqueness and individuality, and buttressed by the emergence of Kulturwissenschaft as a discipline and by studies of folklore.

The further developments of this politico-cultural impulse towards the end of the nineteenth century led not only to the profuse discourses on decadence, Entartung (‘disnaturation’), but to the extension to France of this particularistic understanding of culture, and of civilization, under the influence of Joseph de Maistre and the Catholic Counter-Enlightenment. As a result of Franco-German conflicts and of the severe stresses within France, a battle was waged between the advocates of ‘culture,’ upholders of national particularism, and of ‘civilization,’ champions of Enlightenment universalism accused by their detractors of crass materialism, a battle that reached its apogee during the First World War in the polemics between Romain Rolland and Thomas Mann.

Yet ‘civilization’ itself had been increasingly more receptive to particularism and nationalism, most specifically in historical writing. Books on the civilization of France, Germany, Europe, Italy, and England emerged from the second quarter of the nineteenth century, and in certain ways the very currency of the term made it open to divergent uses. Towards the end of that century, the term came to be used under the influence of the German notion of Kultur, a term often rendered as ‘civilization’ in French translations of German works. One additional but decisive factor was English anthropology, with the appearance in 1871 of Edward Tylor’s Primitive Culture (1958), and the subsequent predominance of the term ‘culture’ to designate a condition, on a ladder leading from savagery to barbarism and finally to civilization, in Anglo-Saxon anthropology. With the discovery and mystique of classical Greece towards
the end of the eighteenth century, of the unity of the ‘West’ from Greece, through Rome, on to the Romano-Germanic peoples in the Middle Ages, culminating in modern European civilization, was to become the locus classicus of this notion of civilization.

In the latter part of the nineteenth and throughout the twentieth century, both the terms and conceptions outlined were subsequently taken over and became crucial instruments of historical categorization in India, China, the Arab countries, and elsewhere. In Arabic thaqafa, an equivalent to the German Bildung, came to stand for culture, normally taken to designate intellectual and artistic life in a manner strongly elitist in character, and hadara was used for civilization, taken as a more general concept indicating the entire life of society including material life.

2.2 Continuities: Relativism

Linear developmentalism had, in general, underlain the entire body of diverse discourses on civilization and of its sister concept of culture. This developmentalism bifurcated along lines that may be characterized broadly as German and French in original inspiration.

Of the two, the former had been the more intimately associated with romantic national and, later, with civilizational particularism, producing a natural history of human groups regarded in analogy to organic species. These human groups were thus conceived as self-subsistent, continuous over time, largely impermeable, essentially intransitive, and, according to many representatives of this view, almost congenitally given to conflict and war.

Originating in conservative reactions to the Enlightenment, with a strong anti-Gallic political impulse, this theory of historical and social development was associated with figures such as Johann Gottfried Herder in Germany and Edmund Burke in England, though it did have a strong representation in France among royalist, Catholic, and other anti-revolutionary (1789, 1848, 1871) currents represented by figures such as de Maistre, Gobineau, and Gustave Le Bon. The principal conceptual feature of this anti-mechanistic concept of history was insistence on individuality in the histories of different nations, races, and civilizations (terms often conflated in various combinations), in analogy with biological organisms, and perhaps best captured in the capacious semantic field of the German word Volk. In this sense, one may speak of continuity with pre-Enlightenment concepts of a social organism modeled upon the integrated somatic unity of the human body, as had been previously thought in medieval Arabic historico-political writings and in medieval European conceptions from the time of John of Salisbury (d. 1180). The fear of decline and decadence, conceived as a breakdown of a natural order, was the specific point at which continuity with medieval organismic concepts of the historico-political order made itself evident and conceptually formative. This was at a time when the idea of progress, and in contrast to it, had become a genuinely historical category, involving a consequential conception of change, of evolutionism, of the temporality specific to events (Verzeitlichung), and of a correct distanciation between human and natural histories.

The course of a particular history was seen to reside in a number of essential features, which Herder termed Kräfte, resulting in a history which, increasingly elevated and evolutive in the course of time as it may be, was still governed by principles which were, in essence, changeless, principles which imparted individuality upon these intransitive histories.

Whereas the Enlightenment provided, and the nineteenth century elaborated, a notion of genetic development along an axis of cumulative time to which civilizations and other historical masses are subject, the organismic, particularist notion conceived a civilization as bound to a self-enclosure inherent in its origins. Although this did not necessarily lead to an historical cyclism, it constituted its conceptual condition of possibility, and facilitated culturalist notions of nationalism that spoke in terms of ‘revival.’

It was a cyclical notion of the history of cultural and civilizational circles, Kulturkreise according to Ernst Troeltsch (1920), which was made conceivable by the manifold discourses on decadence, malfunction, and historical pathology, notions which implicitly involve a measure against a more consummate state of organic health and well-being implicit in the foundations of each. It was the systematic elaboration of the decadence/normalcy structure that led to great schemas of world-history, divided into intransitive civilizations, of Oswald Spengler’s Untergang des Abendlandes (Spengler 1922) and of the Spenglerian heresy represented by Arnold Toynbee’s A Study of History (Toynbee 1934–1959).

Described by Pierre Chaunu (1981) as ‘a samsara of historical forms,’ these vitalist theories of the rise and inevitable decline of civilizations constitute a naturalistic morphology of historical becoming. From this perspective, civilizations were seen as historical phenomena that are perpetually in conflict with one another, each endowed with a particular ethos or animating principle. Such were Spengler’s (1922) Magian culture (the Perso-Islamic) and Promethean culture (the European). Such were also Toynbee’s (1934–59) Syriac and other civilizations, though with this latter author the inner definition of civilizations was less clearly predetermined, and founded on a firmer and far more scrupulous empirical foundation than with Spengler. Nevertheless, Toynbee does characterize civilizations in terms of particularistic impulses, such as the aestheticism of Greek civilization, the religious spirit of the Indian, and the mechanistic ethos of the West. Each of these is an integrated pattern of daily life, an
attitude towards the holy, a style of jurisprudence, a manner of government, an artistic style, and much more.

With both authors, the historical phenomena respectively designated as ‘cultures’ and ‘societies’ obey an iron law of rise and decline, of glory and senescent atrophy, the terminal phases of which Spengler, in keeping with German usage of the day, derisively termed ‘civilization.’ It is noteworthy that this morphology of historical masses, be they called civilizations, cultures or societies, this monocausal description in terms of basic traits such as the Promethean or the aesthetic, was congruent with certain developments in anthropology, particularly in American anthropology in the first half of the twentieth century (Franz Boas, Alfred Kroeber, Ruth Benedict, and Edward Sapir), which identified separate societies according to self-consistent and intransitive personality profiles they ostensibly gave rise to. This had a decisive influence upon the introduction of organismic thinking into the human sciences in general. After a long period of disrepute, the ‘culturalized’ notions of human collectivities, of self-enclosure, of continuity, have come back to center stage at the close of the twentieth century, correlatively with the politics of identity worldwide.

The organismic and vitalist notion of civilization was, and still is, extremely effective in the writing of history and in the late twentieth century has had a certain political salience in terms of Samuel Huntington’s ‘war of civilizations’ and the mirror-image riposte in terms of the ‘dialogue of civilizations.’ It denies the possibility of a general human history, which it regards, in the words of Ernst Troeltsch (1920), as being ‘violently monistic,’ and construes the task of a history of civilizations as separate histories of Europe, China, India, Islam, Byzantium, Russia, Latin and Protestant Europe, and others, in various possible permutations and successions, as separate cultural and moral spheres which are merely contiguous in space. Correlatively, this constituted the conceptual armature of certain forms of violently particularistic history that was mirrored or endogenously paralleled, among others, in India for instance (Savarkar 1969), or in the writings of radical Muslim ideologues.

2.3 Novelties: Universalism

The deficit in historicity evident in romantic and vitalist historism, with its emphasis on organic continuity, was to a great extent made up in the more consistently evolutionist accounts of historicism. It was this conception of history, at once evaluative, evolutive, and vectorial, which bore the burden of the universalist notion of civilization. Whereas in historism the bearers of civilization, or rather of ‘culture,’ were particular peoples or individual historical itineraries, such as the West or Islam, the whole of humanity partook of the development of civilization which, in the historicist perspective, was universal and continuous.

Several versions of this were in evidence, of which two are of particular note due to their wide conceptual incidence and to the social and political influence they exercised through worldwide social movements, both revolutionary and gradualist, inspired by the Enlightenment. According to this conception of a universal civilization, human societies pass through a uniform series of progressive developments which result in intellectual and moral elevation. They also result in superior social, economic, and political levels of development, marked by a higher and wider order of rationality, social differentiation, and control over nature whose instrument is science.

The more consistently universalist of these versions, exemplified among others by Jean Marquis de Condorcet, Auguste Comte, and Herbert Spencer, and in the evolutionist anthropology of Tylor (1958), saw the whole of humanity as being predisposed to this upward movement. Nevertheless, according to this view some peoples may still subsist in a condition that others (Europe) had already surpassed, being still captive to superstition, to a weak organization of society and the economy, and to undeveloped political institutions (despotism, as in the case of Orientals, or otherwise various forms of acephalic organization, as is the case with primitive societies). The only caveat here is that many theorists of the evolution of human societies did not see human improvement as uniform in developed societies themselves, but held that developed societies were not only internally differentiated in the levels of accomplishment attained by different social groups, but were also liable to fall far below the moral ideals that development makes possible. Nevertheless, much writing on the history of universal civilization, especially in countries influenced by Marxism, such as the Soviet Union or the People’s Republic of China, saw the historical itineraries they led as crucial and exemplary steps in the development of humanity at large. In this way, the histories of Russia and China come to recuperate and tap world history at large by acting as the vanguard and exemplar of its future consummation.

Correlatively, the other conception of universalism, and this is one of profound conceptual and political importance, recognized the past contributions of various civilizations (the Mesopotamian, Egyptian, and Islamic, and in some instances the Indian and the Chinese as well) to the course of human civilization, which eventually lodged itself in Europe, defined as the abode of Romano-Germanic history, or even in small parts of Europe, such as France or Prussia. This writing of universal history, much in evidence in the plethoric literature of universal histories both popular and learned, was expressed in what is perhaps its most
accomplished general formulation by Georg Hegel (1956) in the nineteenth century, and by Karl Jaspers (1949) in the twentieth; the latter is little read today, but he is one who nevertheless captures this notion with special clarity.

Not infrequently, this second type of universalist history is allied to an important element derived from the romantic theory of the history of civilizations treated in Sect. 2.2, namely the presumption of very long-term individual continuity. Thus the point is habitually made, with varying shades of emphasis and nuance, that universal civilization made Europe its eventual home because of some abiding characteristics possessed ab initio by ‘the West,’ such as rationality, the spirit of freedom, vigor and dynamism. This changeless West is counterposed to an eternal East, such as Mesopotamia, Egypt, and Islam which, their relative erstwhile merits apart, constitute, in a categorical degradation, a mere prehistory to fully developed civilization. East and West (not to speak of Islam, the name of a religion transmuted into an atopic location) are metageographical notions which do not allow, except with anachronistic violence, for projection into the antique and late antique worlds. However, this does not disturb the ideological coherence of this notion of civilization.

3. Beyond Totality

Writing about civilization according to the manners outlined contained both large-scale abstraction and a great deal of precise empirical historiography. Under the influence of Marxism, acknowledged as such as well as implicit, historical scholarship, and most specifically social and economic history in the traditions of Max Weber and of the Annales school, produced specifications regarding material and other aspects of civilization that allayed, to a considerable degree, the rhetorical force of thinking civilizations in terms of purely moral and ideological continuities, betokening exclusive socio-historical groups. The notion of a Judæo-Christian civilization, as distinct from textual typology within the Bible, is an excellent illustration of this, having been born and expanded in specific circumstances following the Second World War.

In this way, civilizations for contemporary historical writing have come to comprise the total historical conditions that exist in a specific place and time: functional and organizational forms of the state, the longue durée of demographic, agricultural, economic, social, urban, ecological, climatic, and other forces underwritten by geographical structures and relations, and non-material culture, such as arts, letters, cognitive structures, and religions. The possibility of a total history of a civilization is, it must be stressed, an expansion of a previous and more limited form of the cultural history of a particular epoch, exemplified by Jakob Burckhardt’s studies of Constantine and Renaissance Italy (Burckhardt 1955, Vols. 1 and 3).

As a result, since about 1950 it has become possible historically to specify the material elements that constitute the history of a civilization, however its temporal and geographical boundaries may be defined. Correlatively, it has become possible to conceive the specific differences between histories, China and Europe for instance, as in the work of Jacques Gernet (1988), beyond a discourse on immobility and other immanent characteristics ascribed to this history or that, and to think of specificity in proper historical terms, such as the relative weight of various elements of the rural economy, the relation between state and the economy, the impact of metallurgy, and much more.

In this context, civilizational continuities came to be reconsidered in terms of historically determinant factors of a predominantly geographical nature, not so much in the spirit of geographical determinism as described by the German school of Friedrich Ratzel (1895), but with a greater degree of temporal specification, mediated by Lucien Febvre’s (1925) consideration of historical geography and culminating in Fernand Braudel’s study of the Mediterranean (1972–73). By the same token, it has become possible squarely to face the nominalist caution required in thinking about civilizations, and to think of their constitution, specification, and collapse in terms of the concrete historical investigation of demography, economy, and society without recourse to the metaphysical rhetoric of decline (Tainter 1988).

These specifications apart, it remains true that the construal of civilizational intransitivity, with Braudel as with others, still needs to resort to a new redaction of the rhetoric of permanence, most particularly with regard to non-material culture, now underscored and almost overdetermined by considerations of relief, soil, water supply, and means of transport, all of which are undeniable factors, albeit ones that modern technology and economy, most poignantly the postmodern economy, have rendered questionable.

Nevertheless, recent historical research has made it concretely possible to tap the genial formulations made by Marcel Mauss in 1930 (Febvre et al. 1930) concerning the categorization of historical masses: of civilizations as a ‘hyper-social system of social systems,’ as trans-societal and extra-national units of historical perception and categorization. These are conceived in opposition to specific social phenomena, and this conception of civilization valorises the distinction between civilization, society, and culture, freeing the first of the deterministic and totalizing rhetorical glosses of metahistory, and making possible a veritable history of civilizations. Civilization may thus be considered as at once a particular instance of historical becoming, and a specific ideological redaction of the past whose relation to historical reality can be questioned and rendered historical. In this context,
the historian may also be able to valorise the extremely expansive longue durée implied by such theories as Dumézil’s (1958) Indo-European tri-functionalism without recourse to organismic and totalizing figures of particularity and continuity (Le Goff 1965). An historian may be similarly able to valorise recent studies that stress the occurrence and communicability of recurrent phenomena of the imaginary order across vast spaces, cultures, histories, and times, as instanced by Carlo Ginzburg’s study of the European witch-hunts (Ginzburg 1990). Finally, given the accent on complexity, one might be able to take the precise and nuanced study of levels and modes of socio-economic, political, institutional, ideational, and other instances of complexity, rather than criteria of simple continuity, as crucial to the delimitation of historical phenomena that one designates as ‘civilizations.’

See also: Civilizational Analysis, History of; Civilizations; Cultural Landscape in Geography; Enlightenment; Global History: Universal and World; History: Overview; Modernization and Modernity in History; Societies, Types of; Society/People: History of the Concept; State and Society; State, History of; Time, Chronology, and Periodization in History

Bibliography

Auerbach E 1984 Figura. In: Auerbach E Scenes from the Drama of European Literature. University of Minnesota Press, MN
Bénéton P 1975 Histoire des Mots: Culture et Civilisation. Fondation Nationale des Sciences Politiques, Paris
Braudel F 1972–73 The Mediterranean and the Mediterranean World in the Age of Philip II. Reynolds S (trans.). 2 Vols. Collins, London
Braudel F 1993 Grammaire des Civilisations. Flammarion, Paris
Burckhardt J 1955 Renaissance Italy. Gesammelte Werke. Vols 1 and 3. Schwab, Basle, Switzerland
Chaunu P 1981 Histoire et Décadence. Librairie académique Perrin, Paris
Collingwood R G 1946 The Idea of History. Clarendon Press, Oxford, UK
Dumézil G 1958 L’idéologie Tripartie des Indo-Européens. Collection Latomus, Bruxelles, Belgium
Enciclopedia Einaudi 1977–84. G. Einaudi, Torino (s.v. ‘Cultura materiale’, ‘Civiltà’)
Encyclopædia Universalis 1968 Encyclopædia Universalis France, Paris (s.v. ‘Civilization’, ‘Culture et civilization’)
Encyclopedia of Religion and Ethics 1980 T and T Clark, Edinburgh, UK (s.v. ‘Civilisation’)
Febvre L 1925 A Geographical Introduction to History. Kegan Paul, Trench, Trubner & Co., London
Febvre L, Mauss M, Tonnelat E, Niceforo A 1930 Civilisation. Le Mot et l’Idée. Renaissance du Livre, Paris (Première Semaine internationale de Synthèse, Fascicule 2)
Fehl E (ed.) 1971 Chinese and World History. The Chinese University of Hong Kong, Hong Kong
Gernet J 1988 A History of Chinese Civilization. Forster J R (trans.). Cambridge University Press, Cambridge, UK
Ginzburg C 1990 Ecstasies: Deciphering the Witches’ Sabbath. Hutchinson, London
Hegel G W F 1956 Philosophy of History. Sibree G (trans.). Dover, New York
Herder J G von 1968 Reflections on the Philosophy of History of Mankind. Abridged by Manuel F, University of Chicago Press, Chicago and London
Herder J G von 1969 JG Herder on Social and Political Culture. F M Barnard (ed. and trans.). Cambridge University Press, Cambridge, UK
Ibn Khaldun 1958 The Muqaddimah: An Introduction to History. Rosenthal F (trans.). Pantheon Books, New York
Jaspers K 1949 Vom Ursprung und Ziel der Geschichte. Artemis Verlag, Zürich, Switzerland
Kemp A 1991 The Estrangement of the Past. A Study in the Origin of Modern Historical Consciousness. Oxford University Press, Oxford and New York
Koselleck R, Widmer P (eds.) 1980 Niedergang. Studien zu einem geschichtlichen Thema. Klett-Cotta, Stuttgart, Germany
Kroeber A L 1963 An Anthropologist Looks at History. University of California Press, Berkeley and Los Angeles
Kroeber A L, Kluckhohn C 1952 Culture: A Critical Review of Concepts and Definitions. The Peabody Museum, Cambridge, MA
Le Goff J 1965 La Civilisation de l’Occident médiéval. Arthaud, Paris
Lewis M W, Wigen K E 1997 The Myth of Continents. A Critique of Metageography. University of California Press, Berkeley and Los Angeles
Marrou H I 1938 Culture, civilization, décadence. Revue de Synthèse, December: 133
Ratzel F 1895 Anthropogeographische Beiträge. Duncker und Humblot, Leipzig, Germany
Rüsen J, Gottlob M, Mittag A (eds.) 1998 Die Vielfalt der Kulturen. Erinnerung, Geschichte, Identität, 4. Suhrkamp, Frankfurt am Main, Germany
Savarkar V D 1969 Hindutva: Who is a Hindu? 5th edn. Veer Savarkar Prakashan, Bombay, India
Schlanger J E 1971 Les Métaphores de l’Organisme. Vrin, Paris
Spengler O 1922 Der Untergang des Abendlandes. 2 Vols. W. Braumüller, Vienna and Leipzig
Tainter J A (ed.) 1988 The Collapse of Complex Societies. Cambridge University Press, Cambridge, UK
Toynbee A 1934–1959 A Study of History. 12 Vols. Oxford University Press, London
Troeltsch E 1920 Der Aufbau der europäischen Kulturgeschichte. Schmollers Jahrbuch für Gesetzgebung, Verwaltung, und Volkswirtschaft im Deutschen Reiche 44(3): 1–48
Tylor E B 1958 Primitive Culture. 2 Vols. Harper, New York
Zureiq K 1969 Nahnu wa’t-tarikh. Dar al-Ilm lil-Malayin, Beirut, Lebanon

A. Al-Azmeh

Civilizational Analysis, History of

The term ‘civilizational analysis’ is used here to describe a whole cluster of traditions, rather than a specific theoretical perspective. The shared theme is a plurality of fundamental and comprehensive socio-cultural patterns, seen as sufficiently different from
each other to justify the idea of civilizations in the plural, and in contradistinction to civilization in the singular. This is the conceptual framework preferred by some of the most seminal theorists in the field. This is predominant in contemporary debates, although authors who opted for a different terminology must be included in the present survey.

1. Origins and Directions of Civilizational Analysis

1.1 General Characteristics

In the context of a broader history of social thought, the civilizational approach emerges as a counterpoint and a potential corrective to mainstream modes of theorizing. It is only in the most recent phase that one can speak of progress towards it becoming a fully-fledged alternative. The most obvious contrast has to do with visions of history: civilizational lines of interpretation stress the plurality of historical trajectories and contest the claims of general evolutionistic theories. For the same reason, the emphasis on original and mutually irreducible cultural configurations runs counter to functionalist postulates of universal constraints or imperatives. This need not mean an outright rejection of any common frame of reference, but its explanatory scope is at best limited.

Civilizational analysis deals with units of larger dimensions and longer duration than the single societies that they encompass, and this focus of interest leads it to question the self-contained image of society (and the underlying fixation on the nation state) that has been central to the sociological tradition. The large-scale and long-term patterns in question can be conceptualized in various ways, and different approaches may be reflected in more or less developed typologies of civilizational formations. In some cases, the organic metaphors commonly linked to functionalist images of society reappear on the civilizational level, but they should not be mistaken for a defining characteristic of civilizational thought.

If civilizational analysis is defined in these broad terms, it does not emphasize any particular type of intercivilizational relations at the expense of others. The historical experience to be analyzed includes closures, encounters, and conflicts. In different situations, some of these patterns of interaction become more salient. Different theoretical approaches may entail correspondingly selective views of the intercivilizational field. For example, debates on this topic have been marked by a disproportionate emphasis on civilizational conflict.

Finally, the civilizational perspective may serve to highlight macrohistorical continuities as well as major ruptures. Overall, contemporary historians have been more sensitive to civilizations as phenomena of the longue durée (Braudel is the prime example), but comparative sociologists (such as S. N. Eisenstadt) have been no less interested in the civilizational dynamics set in motion by major cultural breakthroughs.

1.2 Early Developments and Nineteenth-century Reversals

The explicit idea of civilizations in the plural seems to have grown out of the same historical developments as that of civilization in the singular (Starobinski 1983), although it took longer for the plural to be codified for official usage. Both are eighteenth-century responses to internal transformations of the West as well as to encounters with non-Western societies and traditions. However, the pluralistic view was much less clearly articulated and became more marginal as the new phase of Western expansion from the early nineteenth century onwards seemed to herald a triumph of civilization in the singular.

In retrospect, elements or anticipations of civilizational analysis can be found in the writings of earlier authors. Giambattista Vico’s New Science (definitive Italian edition, 1744; for an English translation, see Vico 1968) is perhaps the most frequently invoked example. Those who regard Vico as a pioneer stress his interest in ‘the differences as well as the similarities in the patterns and paces of social and cultural processes and structures across histories’ (Nelson 1976, p. 875). But it can be objected that the latent civilizational perspective remains subordinate to an exclusive focus on European cycles of rise and decline, seen against the background of Greek, Roman, and Jewish sources. The most systematic recent interpretation of Vico’s work (Lilla 1993) suggests that the whole comparative inquiry is only a sideline to a traditionalist critique of modernity.

Another ambiguous precursor is Montesquieu, whose Spirit of the Laws (first published, 1748; for an English translation, see Montesquieu 1989) linked the analysis of legal and political regimes to ‘customs’ and drew on new knowledge of Asian civilizations (especially China). But the project as a whole is still centered on the traditional problematic of political philosophy and its implications for a reformist response to absolutism.

Eighteenth-century encounters with Asia were widely and variously reflected in European thought and literature. The attitude of the educated public was less prejudiced than it later became, and the ‘intuition about the equal value of cultures,’ which Charles Taylor (1990) has identified as a persistent but often subordinate theme in Western thought, was more in evidence. A shift to markedly more Eurocentric approaches took place around 1800 (Osterhammel 1998). The potential openings to civilizational analysis, which we can attribute to the Enlightenment in

retrospect, were thus blocked by contrary trends before they could translate into more lasting results.

If some progress was made towards the articulation of a more pluralistic approach, it was more directly linked to the concept of culture than to the ideas of civilizations in the plural. In Herder’s Ideas for a Philosophy of the History of Mankind (first published, 1784–91; for an English translation, see Herder 1968), there is no reference to cultures in the plural, and culture in the singular remains closely associated with progress and enlightenment. But the emphasis on the singularity of each people gives a pluralizing twist to the model. Cultural particularities are, however, thematized in a way which proved much more important to the history of nationalist thought than to the civilizational approach to human diversity.

The mainstream of nineteenth-century social thought was uncongenial to civilizational analysis. This was the case for the theories drawing on the legacy of German idealism as well as the positivistic conceptions of general social evolution, reinforced by (but not derived from) the Darwinian discovery of biological evolution. The Marxist tradition may be seen as a meeting ground for ideas from these two sources. On both sides, a unilinear vision of history and a universalistic model of development set strict limits to the recognition of cultural plurality.

Contrary to some recent suggestions, it does not seem justified to include Hegel among the pioneers of civilizational analysis. His plurality of collective spirits is only a derivative aspect of the progress of the world spirit throughout universal history, although he undeniably made a certain effort to grasp the individuality of the cultures (including China and India) that he consigned to lower levels of the unfolding design. Similarly, Marx’s scattered comments on the Asiatic world and its deviation from the Western pattern of development do not go beyond a general contrast between progress and stagnation (explained in terms of fundamental economic structures). But his well-known analysis of India under British rule reflects some interest in the characteristics of a markedly alien civilization.

2. The Classics of Civilizational Analysis: Sociologists and Metahistorians

Major contributions to civilizational analysis, on theoretical as well as substantive levels, were made during the most formative period of modern sociological thought (roughly between 1890 and 1920). But the new perspectives were not developed to the same degree as the ideas that became more central to sociological theory. In the absence of systematic sociological inquiry, the comparative study of civilizations was pursued along very different lines by writers who defied the conventional academic division of labor.

2.1 Durkheim, Mauss, Weber: The Sociological Discovery of Civilizations

A clearly defined sociological concept of civilizations in the plural appears for the first time in a short text by Durkheim and Mauss, little noticed at the time and long neglected by the most influential interpreters of the Durkheimian tradition, but more recently rediscovered by theorists who have tried to reactivate the civilizational perspective. The Note on the notion of civilization was first published in 1913 (for an English translation with a commentary by Benjamin Nelson, see Durkheim and Mauss 1971) and is obviously related to arguments developed in other writings during the last phase of Durkheim’s work, such as the Elementary Forms of the Religious Life. The common theme is a perceived need to go beyond the concept of society formulated in earlier works (now seen as too close to the idea of a given system of structures and functions) and to explore various ways of doing so. The particular insight provided by the notion of civilizations in the plural has to do with large-scale units and long-term processes which encompass multiple societies. Civilizations ‘reach beyond the national territory’ and ‘develop over periods of time that exceed the history of a single society’; they constitute ‘a moral milieu encompassing a certain number of nations’ or ‘a plurality of interrelated political bodies acting upon one another’ (Durkheim and Mauss 1971, pp. 810–11). The emphasis is on the cultural unity of civilizational complexes (‘moral milieu’ is obviously to be understood in a broad sense), and political plurality appears as a normal rather than a problematic condition. But there is no a priori classification of unifying and fragmenting factors: Durkheim and Mauss call for a comparative study of the civilizational potential inherent in various categories of social phenomena: ‘the unequal coefficient of expansion and internationalization’ (Durkheim and Mauss 1971, p. 811). Although the connection is never made explicit, this variety can also be seen as a matter of the forms and degrees of social creativity, and thus related to a theme which figures prominently in other texts.

Mauss returned to the problematic of civilizations in a later debate with Lucien Febvre and others. A specific civilization is, as he put it, a ‘family of societies’ or a ‘hyper-social system of social systems’ (Mauss 1930, p. 89; the term ‘system’ is evidently not used in a very rigid sense). He proposed a more advanced conceptual framework than in the earlier text: the elements of civilizations (the unequally unified ideas, practices, and products characteristic of a civilizational complex) are distinguished from their forms, i.e., the patterns that grow out of complex combinations of such elements. In addition, the comparative study of civilizations would deal with the characteristics of the areas or regions over which they expand, as well as the interconnections of the societies that belong to them. With regard to the last point,


Mauss adds a passing but potentially far-reaching comment: societies may ‘singularize’ themselves and enhance their individual features against a broader civilizational background. The varying outcomes and possible implications of such processes are also a matter for comparative analysis.

Mauss went on to draw an important theoretical conclusion. The plurality of civilizations is the most striking case of a characteristic common to all forms of social life: their arbitrariness, or, in other words, the formative role of collective choices embodied in more or less coherent patterns (Mauss 1930, p. 97). Once again, the civilizational perspective brings out the theme of social creativity.

Mauss seems to have thought of this interpretive model as equally applicable to ‘primitive’ and ‘civilized’ societies (if anything, the anthropologists had been quicker to recognize civilizational phenomena than the sociologists). However, it is clearly more attuned to the latter case (the reference to ‘families of societies’ with shared cultural horizons and historical traditions is easier to understand in that context). The idea of civilizations in the plural is thus reconnected to civilization in the singular, and the prime objects of comparative analysis would be the advanced civilizational complexes (Hochkulturen) whose dynamics have shaped the course of world history. Neither Durkheim nor Mauss made any significant moves in that direction, and later attempts linked to the Durkheim school (especially Marcel Granet’s work on China) did not do much to concretize the theoretical project outlined above.

The most important substantive contribution of classical sociology is to be found in Max Weber’s analyses of the contrasts between patterns of development in Western and non-Western civilizational settings. But the absence of explicit conceptual foundations and the lack of a clearly defined research program made it difficult to distinguish the civilizational perspective from more narrowly focused parts of Weber’s work. The idea of culture as a distinctive way of lending meaning and significance to the world, outlined in Weber’s earlier methodological writings, is never put to systematic or comparative use. Although the major civilizational studies variously refer to ‘cultural centers,’ ‘cultural areas,’ and ‘cultural worlds,’ cultural patterns and contexts are never theorized as such.

The resultant ambiguity of Weber’s argument is reflected in later controversies around his work. Forms of rationality and dynamics of rationalization were the main foci of his comparative analyses, and he repeatedly stressed that they were always embedded in specific settings. But when it comes to details, the constitutive frameworks (from overall civilizational configurations to specific sociocultural spheres that crystallize within them in varying ways) are overshadowed by a seemingly uniform, but unequally developed, rationality in progress. Those who set out to reconstruct Weber’s problematic have therefore been tempted to ground it in more or less differentiated universal models of rationality, rather than to start with a plurality of cultural patterns and their formative imprints on civilizational complexes.

The civilizational aspect of Weber’s project was neither given from the outset nor equally present in all parts of his work. His original and most abiding concern was the combination of multiple and successive developments which had set the Western trajectory apart from all others. The interest in other civilizations developed in this connection and did not lead to totalizing interpretations of non-Western traditions in their own terms or attempts to reorient the comparative strategy from their points of view. None of Weber’s comparative studies aspires to do what Louis Dumont later proposed to do through a new interpretation of India. On the other hand, closer consideration of non-Western cases raises questions which go beyond the initial issues, reveals new interconnections, and opens up new perspectives on the Western experience.

This shift towards a more comprehensive framework was still in progress in Weber’s last writings, but the results varied, both in kind and in degree, from case to case. Weber’s most extensive work on the ancient world focuses on socioeconomic structures. The analysis of ancient Judaism is primarily concerned with a religious breakthrough and its long-term rationalizing potential. The studies of China and India (Weber 1920–21, Vols. 1–2) deal most extensively with complex and long-term interrelations of cultural traditions and institutional contexts, and are therefore closest to the model of civilizational analysis as defined by Durkheim and Mauss.

2.2 The Other Tradition: from Spengler to Toynbee and Beyond

As shown above, the sociological classics left civilizational analysis in a very inconclusive and fragmentary state: there was no connection between the Durkheimian sketch of a theoretical framework and the Weberian exploration of historical testing grounds, and neither of the two overtures was followed by further work. For several decades after Weber’s death, the idea of a comparative study of civilizations was mainly associated with metahistorical projects of the kind exemplified by Oswald Spengler’s Decline of the West (1923) (for an English translation, see Spengler 1926–28) and Arnold Toynbee’s Study of History (Toynbee 1934–61). Among later attempts to deal with questions raised by Spengler and Toynbee, Franz Borkenau’s posthumously published fragments (Borkenau 1981) deserve special mention.

In general terms, this tradition should be given credit for preserving the insight that the dynamics of


world history involve units of greater size and longer duration than the single (or artificially singled-out) societies more familiar to mainstream scholarship, and that the patterns of formation, flowering and decline of such macro units call for closer examination. Moreover, analyses in this vein have sometimes thrown new light on intercivilizational relations, despite a tendency to stress the separate and self-contained history of each civilizational domain. Spengler’s notion of ‘pseudomorphosis’ is a case in point. It refers to the impact of dominant cultures in latent decline on those emerging within their orbit; the original example was the transformation of the Near East in the shadow of the Roman Empire, culminating in the rise of Islam, but plausible attempts have been made to generalize the concept.

On the other hand, the Spengler–Toynbee tradition has been beset by major problems. Apart from a general tendency to indulge in speculation far beyond the limits of historical evidence, more specific weaknesses are inherent in the overall pattern. Spengler, Toynbee, and those who followed their lead were, notwithstanding differences in emphasis, inclined to exaggerate the cultural or societal closure of civilizational units. Nevertheless, at the same time, they took a cross-civilizational identity of developmental patterns for granted: a uniform, more or less consistently cyclical model was applied across otherwise rigid boundaries.

When it came to concrete analysis and demarcation of civilizational domains, this line of thought led to a dilemma. If the self-contained units in question are defined based on unique cultural features, claims to cross-cultural understanding and theorizing are thereby undermined. To avoid this self-defeating turn, exemplified by Spengler’s work, Toynbee shifted the focus towards civilizations as ‘societies,’ distinct and durable frameworks of interaction, but found it very difficult to specify the characteristics of a civilizational society. At an advanced stage of his project, he sought to defuse the problem by relegating civilizations to the prehistory of universal religions.

3. The Renaissance of Civilizational Analysis

Since the 1970s, several developments have led to a revival of interest in civilizational analysis. The plurality of civilizations has resurfaced as a key theme in the work of prominent historians (Braudel 1994). Projects drawing on anthropological and sociological traditions have moved towards a civilizational perspective; here Louis Dumont’s work on India is of particular importance (Dumont 1967). Occasional attempts have been made to bring the problematic legacy of Spengler and Toynbee into the orbit of a historical sociology of civilizations. Various approaches to a shared problematic are represented in the International Society for the Comparative Study of Civilizations, which publishes Comparative Civilizations Review. However, when it comes to innovative theoretical conceptions, two projects seem to deserve a somewhat more detailed account.

3.1 Benjamin Nelson: Orientations and Encounters

Benjamin Nelson’s programmatic outline of a civilizational theory, presented in a series of essays (Nelson 1981) but never developed in a systematic fashion, is inseparable from a reinterpretation of Max Weber’s work. As Nelson argued (in the 1960s and 1970s, when more restrictive readings of Weber dominated the field), the agenda most succinctly summarized in the ‘Author’s Introduction’ to the Protestant Ethic, centered on the comparative study of civilizational patterns and their historical trajectories. At the same time, Nelson went beyond Weber in thematizing the cultural cores of civilizational complexes. The different clusters of cultural orientations, also equated with ‘structures of consciousness,’ gave rise to correspondingly different frameworks for the rationality of conduct, social coexistence, and reflexive thought. Nelson took a particular interest in the rationalizing efforts which led to the overcoming of traditional dualisms, such as those of the religious and the mundane life or the insider and the alien, and this enabled him to reinterpret the triangular comparison of China, India, and the West in more focused terms than Weber had done.

But the emphasis on breakthroughs to more inclusive and interactive forms of social life did not lead Nelson to neglect the other side of civilizational dynamics: some of his essays stress the productive potential of internal conflicts at the level of the most basic cultural premises (he refers to them as ‘civil wars within the structures of consciousness’). His favorite example was the highly articulate tension between faith and reason in the course of the eleventh- and twelfth-century transformation of Western Christendom.

This crucial phase of European history, much more important for Nelson’s genealogy of modernity than it had been for Weber, also exemplifies the decisive role of intercivilizational encounters: the interaction with the Byzantine and Islamic worlds had an epoch-making impact on Western ways of life and thought, not least through an unprecedented revival of interest in classical sources. Nelson discussed other encounters, such as the contacts between China and the West, and although his treatment of this problem was in some ways inconclusive, he did more than anybody else to integrate it into the domain of civilizational theory.

3.2 S. N. Eisenstadt: Breakthroughs and Dynamics

In contemporary social theory, S. N. Eisenstadt’s work stands out as the most sustained exploration of


civilizational themes. His interest in this field grew out of several research projects, but they converged in a critique of the conventional distinction between tradition and modernity. The diversity of modern societies could not be explained without reference to the ongoing formative role of traditions, and the most important factors of that kind have to do with enduring civilizational legacies. A comparative study of empires, designed to distinguish their complex social and political structures from stereotypes of traditional society, raised questions about civilizational backgrounds and their influence on imperial formations. Most importantly, a comparative analysis of modern revolutions (Eisenstadt 1978) took an explicitly civilizational turn: the ‘great revolutions’ that had come to be seen as paradigms of radical change were based on a more fundamental cultural shift which opened the constitutive visions of social order to dissent, protest, and innovation. The revolutionary cultural foundations of modernity mark it as a new civilization.

After discovering the civilizational dimension from these different angles, Eisenstadt’s next step was a closer examination of the historical cases that seemed to have brought it to the fore in the most revealing way. The epoch already described by some earlier authors as an ‘Axial Age,’ covering a few centuries around the middle of the last millennium BC, was, as Eisenstadt argued, the prime example of a civilizational breakthrough. In some major cultural centers (ancient Greece, ancient Israel, India, and China), radical changes to cultural ontologies gave rise to new images of social order: ideas of a fundamental contrast between transcendental and mundane realities, unknown to more archaic cultures, translated into order-building visions and strategies.

Nevertheless, the expanded scope for imagination, in the interpretive as well as the institutional domain, was also conducive to higher levels of interpretive conflict and ideological rivalry; the elites and coalitions that mobilized the new cultural resources had to confront more or less structured currents of heterodoxy and dissent. Eisenstadt’s analyses have highlighted the variety and dynamism of social formations that develop within this framework, although the Axial constellation, defined in the most general sense, seems to have a uniform structure (for a theoretical discussion, accompanied by case studies of major civilizations, see Eisenstadt 1986). Among the offshoots of the Axial transformation, European civilization, shaped by recurrent combinations of diverse sources, became the main center of another civilizational mutation: the transition to modernity. But other civilizational backgrounds left their traces on the specific patterns of modernity that emerged within their orbit.

Both the analysis of Axial civilizations and the general interpretive framework growing out of it are still in progress. The results so far achieved suggest that this is a particularly promising line of inquiry.

4. Alternative Views

Among the few constructive responses to the metahistorical tradition discussed in Sect. 2.2, Jaroslav Krejci’s work is noteworthy for its scope and ambition (Krejci 1982, 1993). The starting point is a critique of inconsistencies and loose ends in Toynbee’s theory. In particular, the incomplete analysis of creative elites as civilization builders and the unclear status of universal religions as supracivilizational units are singled out for reconsideration from a more sociological angle. As Krejci sees it, ‘protagonist groups and changing relationships between them (such as the shifting balance between Brahmins and kshatryas in India) play a key role in the construction of civilizations.’ But they operate through specific core institutions (empires, states, churches, or ideological communities of various kinds) and are inspired by distinctive, more or less overtly religious, visions of the human condition. Religious traditions are thus reintegrated into the civilizational frame of reference, and their formative role is analyzed in terms of interpretive patterns that lend meaning to life and death.

Krejci distinguishes various ‘paradigms of the human predicament,’ ranging from the theocentric invented in ancient Mesopotamia to the utilitarian version of the anthropocentric in the modern West. This key to civilizational theory provides an alternative to the more common typologies of social formations, based on the division of labor, and to the a-theoretical use of geographic or historical criteria to demarcate civilizations. However, once the defining anthropological premises have been identified, geographical and historical perspectives can be given their due; it becomes possible to distinguish civilizational areas and sequences.

Arguments developed by Johann P. Arnason (1988, 2001) draw more directly on classical sources. The ideas put forward by Durkheim and Mauss on one side and Weber on the other are seen as incomplete insights to be synthesized, but this would require a closer connection between the Weberian theme of world interpretation and the Durkheimian notion of collective representations. The idea of imaginary significations, introduced by Cornelius Castoriadis, appears as the most suitable basis for such a rapprochement.

Imaginary significations are horizons of meaning, irreducible to experiential foundations as well as to functional constraints or rational principles. Specific clusters of such significations are at the core of different social worlds and structure their relations to non-social domains of reality as well as their internal forms of differentiation and integration. On this view, civilizational patterns can be analyzed as the most comprehensive and distinctive constellations of imaginary significations. In that capacity, they give rise to specific ways of being in the world and corresponding types of relationships between the main spheres of


social life. These constitutive frameworks make it possible for civilizations to encompass groups of societies and maintain their identity throughout successive historical phases (for Durkheim and Mauss, integrative capacities manifested in space and time were the defining feature of civilizations). At the most visible level, civilizational formations take the shape of regional configurations, central to the agenda of comparative history.

To speak of civilizations in this sense is not to prejudge the levels of coherence, unity, and consistency. The approach just outlined can allow for significant variations in all these respects. In particular, it may be argued that some civilizations are more markedly characterized by conflicting cultural orientations than others (among the major non-Western traditions, interpretations of India have laid more emphasis on this theme than those of China). Comparative analyses of such differences, and other related ones, would be the most effective antidote against the identitarian and overintegrated models which continue to obstruct the progress of civilizational studies.

See also: Civilization, Concept and History of; Civilizations; Primitive Society; States and Civilizations, Archaeology of

Bibliography

Arnason J P 1988 Social theory and the concept of civilization. Thesis Eleven 20: 87–105
Arnason J P 2001 Civilization and Difference. Sage, London
Borkenau F 1981 End and Beginning: On the Generations of Cultures and the Origins of the West. Columbia University Press, New York
Braudel F 1994 A History of Civilizations. Penguin, Harmondsworth, UK
Dumont L 1967 Homo hierarchicus: essai sur le système des castes. Gallimard, Paris
Durkheim E, Mauss M 1971 Note on the notion of civilization. Social Research 38(4): 808–13
Eisenstadt S N 1978 Revolution and the Transformation of Societies: A Comparative Study of Civilizations. Free Press, New York
Eisenstadt S N (ed.) 1986 The Origins and Diversity of Axial Civilizations. SUNY Press, Albany, NY
Herder J G 1968 Reflections on the Philosophy of the History of Mankind. Chicago University Press, Chicago
Krejci J 1982 Civilization and religion. Religion 12: 29–47
Krejci J 1993 The Human Predicament: Its Changing Image: A Study in Comparative Religion and History. St. Martin’s Press, New York
Lilla M 1993 G.B. Vico: The Making of an Anti-modern. Harvard University Press, Cambridge, MA
Mauss M 1930 Les civilisations: Elements et formes. In: Febvre L et al. (eds.) Civilisation: Le mot et l’idee. La Renaissance du Livre, Paris
Montesquieu C 1989 The Spirit of the Laws. Cambridge University Press, Cambridge, UK
Nelson B 1976 Vico and comparative historical civilizational sociology. Social Research 43(4): 874–81
Nelson B 1981 On the Roads to Modernity: Conscience, Science and Civilizations. Rowman and Littlefield, Totowa
Osterhammel J 1998 Die Entzauberung Asiens: Europa und die asiatischen Reiche im 18. Jahrhundert. Beck Verlag, München, Germany
Spengler O 1926–28 The Decline of the West. Knopf, New York, Vols. 1–2
Starobinski J 1983 Le mot ‘civilisation.’ Le temps de la reflexion 4: 13–52
Taylor C 1990 Comparison, history, truth. In: Reynolds F, Tracey D (eds.) Myth and Philosophy. SUNY Press, Albany, NY
Toynbee A J 1934–61 A Study of History. Oxford University Press, Oxford, UK, Vols. 1–12
Vico G B 1968 The New Science of Giambattista Vico. Cornell University Press, Ithaca, NY
Weber M 1920–21 Gesammelte Aufsätze zur Religionssoziologie, Bd. 1–3. Mohr Verlag, Tübingen, Germany

J. P. Arnason

Civilizations

1. Introduction

The term ‘civilization’ has been used in modern social science and historical literature in several different ways. One such way, developed above all in Germany from about the end of the nineteenth century through the period up to World War II and perhaps best represented by scholars like Alfred Weber (and taken over to a certain extent in the English-speaking world by R. M. MacIver), designated ‘civilization,’ as distinct from ‘society’ and above all from ‘culture,’ as encompassing above all the technological, material factors and to some extent organizational aspects of social life as against the deeper, more ‘spiritual’ cultural and aesthetic ones.

Another designation of the term, made famous by Norbert Elias in his Über den Prozess der Zivilisation (1939), focused on the ‘socializing’ process through which the image of the civilized person, as constructed in the courtly and also early bourgeois society in Europe, was promulgated and institutionalized. This designation of civilization was related to an earlier one rooted in the French Enlightenment, in which civilization was seen as the opposite of barbarism. However, in later works by Elias’s followers, for instance Goudsblom, this view of civilization was extended to cover many other societies and historical periods, going back even to the impact of the presumably first domestication of fire.

The third and most extensive designation of civilization was promulgated by scholars such as Max Weber,


Emile Durkheim, Oswald Spengler, Pitirim Sorokin, ation into basic premises of the social order, these elite
Arnold Toynbee, A.L. Kroeber, Carroll Quigley, groups tend to exercise different modes of control over
Christopher Dawson, Fernand Braudel, William H. the allocation of basic resources.
McNeill, Adda Bozeman, or Immanuel Wallerstein, Such combination of ontological visions and of
and lately very forcefully by Samuel Huntington. structuration of institutional formations and collective
However great the differences in perspective, meth- identities constitutes an inherent component of the
odology, focus, and concepts that pervade the works formation of any society, and is always closely
of these scholars, they share the use of the term interwoven with the more organizational aspect of any
civilizations as distinct societal-cultural units which institutional formation—political, economic, or fam-
share some very important, above all cultural, charac- ily and kinship.
teristics. Here we shall use the term civilization in a The very implementation or institutionalization of
way very close to, but also distinct from, such a such premises and the concomitant formation of
designation. institutional patterns through processes of control,
symbolic and organizational alike, also generate ten-
dencies to conflict and change. The crystallization of
Civilization as combination of ontological or cosmological
these potentialities of change usually takes place
visions, of conceptions of trans-mundane and mundane
reality, with the definition, construction, and regulation of through the activities of secondary elite groups who
the major arenas of social life and interaction attempt to mobilize various groups and resources to
change aspects of the social order.
The full development of the distinct ideological and
The central analytical core of the term civilization as institutional dimensions, and of some awareness of
employed here—as distinct from such social forma- their distinctiveness, has emerged in some very specific
tions as political regimes, different forms of political historical settings—namely, the so-called Axial Civili-
economy or collectivities like ‘tribes,’ ethnic groups or zations—even if some very important kernels thereof
nations, and from religion or cultural traditions—is can be identified in some archaic civilizations such as
the combination of ontological or cosmological visions, those of ancient Egypt, Assyria, or Mesoamerica.
of visions of trans-mundane and mundane reality,
with the definition, construction, and regulation of the
major arenas of social life and interaction.
The central core of civilizations is the symbolic and
institutional inter-relation between the formulation, 2. Axial Age Civilizations: the Reconstruction of
promulgation, articulation, and continuous reinter- the World and the Crystallization of Distinct
pretation of the basic ontological visions prevalent in Civilizational Complexes
a society, its basic ideological premises and core
symbols on the one hand, and the definition and By Axial Age civilizations (to use Karl Jaspers’
regulation of major arenas of institutional life on nomenclature), we mean those civilizations that crys-
the other. Such definitions and regulations construct tallized during the thousand years from 500 BC to the
the broad contours, boundaries, and meanings of the first century of the Christian era, within which new
major institutional formations and their legitimiza- types of ontological visions, of conceptions of a basic
tion, and greatly influence their organization and tension between the transcendental and mundane
dynamics. orders, emerged and were institutionalized in many
The impact of such ontological visions and premises parts of the world. Examples of this include ancient
on institutional formation is effected through the Israel; later in Second-Commonwealth Judaism and
processes of interaction and control that develop in a Christianity; Ancient Greece; possibly Zoroastrianism
society. Such processes of control—and the opposition in Iran; early imperial China; Hinduism and
to them—are not limited to the exercise of power in the Buddhism; and, beyond the Axial Age proper, Islam.
‘narrow’ political sense. Rather, they are activated by The crystallization of these civilizations constitutes
major elites in a society. The most important such elite a series of some of the greatest revolutionary break-
groups are the political, the cultural, and the economic throughs in human history, which have shaped con-
ones and those which construct the solidarity and tours of human history in the last two to three
collective images of the major groups, all of which millennia. The central aspect of these breakthroughs
have different cultural visions and represent different was the emergence and institutionalization of new
interests. ontological metaphysical conceptions of a chasm
The structure of such elite groups is closely related, between the transcendental and mundane orders.
on the one hand, to the basic cultural orientations The development and institutionalization of these
prevalent in a society; that is, different types of elite ontological conceptions entailed the perception of the
groups bear different orientations or visions. On the given mundane order as incomplete, inferior—often-
other hand, and in connection with the types of times as evil and polluted. It gave rise in all these
cultural orientations and their respective transform- civilizations to attempts to reconstruct the mundane

world, from the human personality to the sociopolitical and economic order, according to the appropriate ‘higher’ transcendental vision.

The revolutionary conceptions, which first developed among small groups of autonomous, relatively unattached ‘intellectuals’ (a new social element at the time), were ultimately transformed into the basic ‘hegemonic’ premises of their respective civilizations, and were subsequently institutionalized. That is, they became the predominant orientations of both the ruling elites and of many secondary elites, fully embodied in the centers or subcenters of their respective societies.

One of the most important manifestations of such attempts in all these civilizations was the strong tendency to construct societal centers to serve as the major autonomous and symbolically distinct embodiments of respective ontological visions, as the major loci of the charismatic dimension of human existence. But at the same time the ‘givenness’ of the centers could not necessarily be taken for granted. The construction and characteristics of the center tended to become central issues under the gaze of the increasing reflexivity which focused above all on the relations between the transcendental and mundane orders. The political dimension of such reflexivity was rooted in the transformed conceptions of the political arena and of the accountability of rulers. The political order as one of the central loci of the mundane order had to be restructured according to the precepts of the transcendental visions. The rulers were usually held responsible for organizing the political order according to such precepts.

At the same time the nature of rulers became greatly transformed. The king-god, embodiment of the cosmic and earthly order alike, disappeared, and a secular even if often semisacral ruler appeared. Thus there emerged the conception of the accountability of rulers and community to a higher authority, God, Divine Law, or a metaphysical vision. Accordingly, the possibility of calling a ruler to judgement appeared. One such dramatic appearance of this conception occurred in ancient Israel, in the priestly and prophetic pronouncements. ‘Secular’ conceptions of such accountability to the community and its laws appeared both on the northern shores of the eastern Mediterranean, in ancient Greece, and in the Chinese conception of the Mandate of Heaven.

Concomitantly with the emergence of these conceptions of accountability of rulers, autonomous spheres of law began to develop as somewhat distinct from purely customary law. Such developments could also entail some beginnings of a conception of rights even if the scope of these spheres of law and rights varied greatly. Of special importance from the point of view of our analysis is the fact that one of the most important manifestations of the attempts to reconstruct the social order was the strong tendency to define certain collectivities and institutional arenas as most appropriate for the implementation of transcendental visions.

3. Autonomous Elites as Bearers of Civilizational Visions: Change, Protest, and Heterodoxies

The development of new ontological metaphysical conceptions in the Axial civilizations was closely connected with the emergence of a new type of elite, carriers of models of cultural and social order. These were often autonomous intellectuals, such as the ancient Israelite prophets and priests, and later on the Jewish sages, the Greek philosophers and sophists, the Chinese literati, the Hindu Brahmins, the Buddhist Sangha, the Islamic Ulema. Initial small nuclei of such groups of cultural elites developed the new ontologies, the new transcendental visions and conceptions, and were of crucial importance in the construction of the new ‘civilizational’ collectivities.

The new type of elites differed greatly from the ritual, magical, and sacral specialists in the pre-Axial Age civilizations. They were recruited and legitimized according to autonomous criteria, and were organized in settings distinct from those of the basic ascriptive political units of the society. They acquired a potentially countrywide and also trans-country status of their own. They also tended to become potentially independent of other categories of elites, social groups, and sectors.

At the same time a far-reaching transformation of other elites, such as political elites, took place. All these elites saw themselves not only as performing specific technical activities—be they those of scribes, ritual specialists, and the like—but also as potentially autonomous carriers of a distinct order related to the prevalent transcendental vision. They saw themselves as the autonomous articulators of the new order, and rival elites as both accountable to them and as essentially inferior. Moreover, each of these groups of elites was not homogeneous, and within each of them a multiplicity of secondary influentials developed.

These new groups became transformed into relatively autonomous partners in the major ruling coalitions. They also constituted the most active elements in the movements of protest and processes of change that developed in these societies and which evinced some very distinct characteristics at both symbolic and organizational levels.

First, there was a growing symbolic articulation and ideologization of the perennial themes of protest found in any human society, such as rebellion against the constraints of division of labor, authority, and hierarchy, and of the structuring of time dimension, the quest for solidarity and equality, and for overcoming human mortality.

Second, utopian orientations were incorporated into the rituals of rebellion and the double image of society. It was this incorporation that generated
alternative conceptions of social order and new ways of bridging the distance between the existing and the ‘true’ resolution of the transcendental tension.

Third, new types of protest movements appeared. The most important were intellectual heterodoxies, sects, or movements which upheld the different conceptions of the resolution of the tension between the transcendental and the mundane order, and of the proper way to institutionalize such concepts. Since then, continuous confrontation between orthodoxy on the one hand, and schism and heterodoxy on the other, has been a crucial component in the history of mankind.

Fourth, and closely related to the former, was the possibility of the development of autonomous political movements and ideologies usually oriented against an existing political but possibly also religious center.

All these developments ushered into the arena of human history the possibility of the conscious ordering of society, and also the continuous tension that this possibility generated. The new dynamics of civilization transformed group conflicts into potential class and ideological conflicts, cult conflicts into struggles between the orthodox and the heterodox. Conflicts between tribes and societies could become missionary crusades. The zeal for reorganization informed by each civilization’s transcendental vision made the entire world at least potentially subject to cultural-political reconstruction.

4. The Expansion of Axial Civilizations

Concomitantly with the institutionalization of Axial civilizations, a new type of intersocietal and intercivilizational world history emerged. To be sure, political and economic interconnections have existed between societies throughout human history. Some conceptions of a universal kingdom emerged in many post-Axial civilizations, like that of Genghis Khan, and many cultural interconnections developed between them, but only with the institutionalization of Axial civilizations did a more distinctive ideological and reflexive mode of expansion develop, with potentially strong semimissionary orientations.

It was indeed in close connection with the Axial civilizations’ tendency to expansion that there developed new ‘civilizational’ collectivities, distinct from political and from ‘primordial’ ones, yet impinging on them, continuously challenging them, and provoking continual reconstruction of their respective collective identities. Such processes were effected by the interaction between the new autonomous cultural elites and the various carriers of solidarity and political elites of the different continually reconstructed ‘local’ and political communities.

In the continuous encounter of Axial civilizations with non-Axial or pre-Axial civilizations it was usually the Axial that came out victorious, without however necessarily obliterating many of the symbolic and institutional features of the non-Axial. The most important case of an encounter of non-Axial with Axial civilization in which the former absorbed the latter has been Japan.

5. The Multiplicity of Axial Civilizations and World Histories

The general tendency to reconstruct the world and to expand was common to all the post-Axial age civilizations. But the concrete implementation varied greatly. There emerged a multiplicity of different, divergent, yet mutually impinging world civilizations, each attempting to reconstruct the world in its own mode, and either to absorb the others or to segregate itself from them.

Two sets of conditions were of special importance in shaping these different modes of institutional creativity and expansion. One such set comprises variations in the basic cultural orientations. The other is the concrete structure of the social arenas in which these institutional tendencies can be played out.

Among the different cultural orientations the most important have been differences in the very definition of the tension between the transcendental and mundane orders and the modes of resolving this tension. There is the distinction between the definition of this tension in relatively secular terms (as in Confucianism and classical Chinese belief systems and, in a somewhat different way, in the Greek and Roman worlds) and those cases in which the tension was conceived in terms of a religious hiatus (as in the great monotheistic religions and Hinduism and Buddhism).

A second distinction is that between the monotheistic religions in which there was a concept of God standing outside the Universe and potentially guiding it, and those systems, like Hinduism and Buddhism, in which the transcendental, cosmic system was conceived in impersonal, almost metaphysical terms, and in a state of continuous existential tension with the mundane system. The ‘secular’ conception of this tension was connected, as in China and to some degree in the ancient world, with an almost wholly this-worldly conception of salvation.

A third major distinction refers to the focus of the resolution of the transcendental tensions. Here the contrast is between purely this-worldly, purely other-worldly, and mixed this- and other-worldly conceptions of salvation. The metaphysical nondeistic conception of this tension, as in Hinduism and Buddhism, tends towards an other-worldly conception of salvation, while the great monotheistic religions emphasize different combinations of this- and other-worldly conceptions of the transcendental vision.

Another set of cultural orientations which influenced the expansion of the various Axial civilizations was the extent to which the access to their centers and
major attributes of the sacred within them was open to all members of the community or was mediated by specific institutions.

In addition, there are differences in the way in which relations between the attributes of cosmic and social order of civilizational collectivities and those of the major primordial ascriptive collectivities are conceived—the extent to which there is a disjunction between the two, to which these respective attributes are mutually relevant, each serving as a referent of the other.

But the concrete working out of all such tendencies depends on the second set of conditions—namely the arenas for their concretization. These conditions included, first, the respective concrete economic political-ecological settings, whether they were small or great societies, whether they were societies with continuous compact boundaries, or with cross-cutting and flexible ones. Second was the specific historical experience of these civilizations—especially in terms of mutual penetration, conquest, or colonization.

6. Internal Transformation of the Axial Civilization: Secondary Breakthroughs and the Crystallization of Modern Civilization

One of the most important aspects of the dynamics of Axial civilizations was the possibility of development within them of internal transformation, of what has been designated as secondary breakthroughs, the most important illustrations of which have been Second Temple Judaism and Christianity; later Islam, Buddhism, and to a lesser extent Neo-Confucianism, all of which developed out of heterodox potentialities inherent in the respective ‘original’ Axial civilizations.

But the most dramatic transformation from within one of the Axial civilizations has probably been that of modernity as it first emerged in Western Europe and as it expanded—encompassing most parts of the world, giving rise to development of multiple, continually changing modernities.

The cultural and political program of modernity constituted in many ways a sectarian heterodox breakthrough in the West and Central European Christian Axial civilization. Such transformation took place through the Reformation and in the Great Revolutions, in which there developed very strong emphasis on the bringing together of the City of God and the City of Man. It was in these revolutions that sectarian activities were taken out from marginal or segregated sectors of society and became interwoven not only with rebellions, popular uprisings, movements of protest but also with the political struggle at the center, and were transposed into the general political movements and the centers thereof.

It was above all in the French Revolution that the fully secular transformation of the sectarian antinomian orientation with strong Gnostic components took place. It was epitomized in the Jacobin orientations which became a central component of the modern political program—to reappear yet again forcefully, as Alain Besançon has shown, in the Russian Revolution, and later in the Chinese and Vietnamese revolutions.

The strong sectarian roots of modernity and of the tensions between totalistic Jacobin and pluralistic orientations which developed in Europe find very strong resonance in the utopian sectarian traditions of the Axial civilizations. It is also the religious roots of the modern political program that explain the specific modern characteristics of what may be seen as the most antimodern contemporary movements—namely the various fundamentalist movements which, contrary to the view which defines them as traditional, constitute a new type of Jacobin movement constructing tradition as a totalistic ideology.

7. The Cultural and Political Program of Modernity: Premises and Antinomies

The cultural and political program of modernity, as it crystallized first in Western Europe from around the seventeenth century, was rooted in the premises of the European civilization and European historical experience and bore these imprints—but at the same time it was presented and perceived as being of universal validity and bearing.

The radical innovation of this cultural program as it developed in Europe lay first in the ‘naturalization’ of man, society, and nature; second in the promulgation of the autonomy and potential supremacy of reason in the exploration and even shaping of the world; and third the emphasis on the autonomy of man, of his reason and/or will.

In connection with these orientations there took place far-reaching transformations of the symbolism and structure of modern political centers, as compared with their predecessors in Europe or with the centers of other civilizations. The crux of this transformation was first the charismatization of the political centers as the bearers of the transcendental vision of the cultural program of modernity; second the development of continual tendencies to permeation of the peripheries by the centers and of the impingement of the peripheries on the centers, of the concomitant blurring of the distinctions between center and periphery; and third was the incorporation of themes of protest as basic, legitimate components of the premises of these centers. These themes became central components of the project of emancipation—a project which sought to combine equality and freedom, justice and autonomy, solidarity and identity of modern political discourse and practice. The program also entailed a distinctive mode of the construction of the boundaries of collective identities. Such identities were not taken as preordained by some transcendental authority, but
continually constructed and continually problematized, becoming also foci of political struggle by national and ethnic movements.

The civilization of modernity as it developed in the West was from its very beginning beset by internal contradictions, giving rise to critical discourse which focused on the tensions and contradictions between its premises, and between these premises and institutional development. The most important such tensions in this program were first that between totalizing and more pluralistic conceptions of its major components—of the very conception of reason and its place in human life and society, and of the construction of nature, of human society and its history; second, between reflexivity and active construction of nature and society; third, those between different evaluations of major dimensions of human experience; and fourth between control and autonomy.

These basic tensions, contradictions, and antinomies inherent in the cultural program of modernity were continually played and worked out in major institutional arenas.

8. Continually Changing Multiple Modernities

It was out of the conjunction of these cultural orientations with the development of market, commercial, and industrial economies; with the crystallization of a new political and state order; and with military and imperialist expansion, that the civilization of modernity emerged. Its crystallization and expansion were not unlike those of the expansion of all historical civilizations. What was new was first that the great technological advances and the dynamics of modern economic and political forces made this expansion, the changes and developments attendant on them and their impact on the societies to which it expanded much more intensive. The expansion, through the use of military, political, and economic forces, of modern civilization which took place first in Europe and then beyond it continually combined economic, political, and ideological aspects and forces, and its impact on the societies to which it expanded was much more intense than in most historical cases. It spawned a tendency—rather new and practically unique in the history of mankind—to the development of universal, worldwide institutional, cultural, and ideological frameworks and systems. Yet all of these frameworks were multicentered and heterogeneous, each generating its own dynamics.

Of special importance in this context was the relative place of the non-Western societies in the various—economic, political, ideological—international systems which differed greatly from that of the Western ones. It was not only that it was Western societies which were the ‘originators’ of this new civilization. Beyond this and above all was the fact that the expansion of these systems, especially insofar as it took place through colonialization and imperialist expansion, gave to the Western institutions the hegemonic place in these systems. Yet it was in the nature of these international systems that they generated a dynamics which gave rise both to political and ideological challenges to existing hegemonies, as well as to continual shifts in the loci of hegemony within Europe, from Europe to the United States, then also to Japan and East Asia.

But it was not only the economic, military-political, and ideological expansion of modernity from the West throughout the world that was important in this process. Of no lesser significance was the fact that this expansion has given rise to continual confrontation between the cultural and institutional premises of Western modernity and those of other civilizations. Thus, while the spread or expansion of modernity has indeed taken place throughout most of the world, it did not give rise to just one civilization, one pattern of ideological and institutional response, but to at least several basic variants—and to continual refracting thereof.

Consequently, multiple modernities have emerged. These civilizations, which share many common components and which continually constitute mutual reference points, have been continually unfolding, giving rise to new problematiques and reinterpretations of the basic premises of modernity. All these attest to the growing diversification of the visions and understanding of modernity, of the basic cultural agendas of different sectors of modern societies—far beyond the hegemonic vision of modernity that was prevalent before. The fundamentalist—and the new communal-national—movements constitute one of such new developments, in the unfolding of the potentialities and antinomies of modernity.

Such developments may indeed also give rise to highly confrontational stances—especially to the West—but these stances are promulgated in changing modern idioms, and they may entail a continual transformation of the cultural programs of modernity.

At the same time the new diversity was closely connected—perhaps paradoxically—with the development of new multiple common reference points, and with a globalization of cultural networks and channels of communication far beyond what existed before.

See also: Civilization, Concept and History of; Civilizational Analysis, History of; Cultural History; Cultural Psychology; Elias, Norbert (1897–1990); Elites: Sociological Aspects; Hegemony: Cultural; National Character; Societies, Types of

Bibliography

Breuer B 1994 Kulturen der Achsenzeit. Leistung und Grenzen eines geschichtsphilosophischen Konzepts. Saeculum 45: 1–33
Durkheim E, Mauss M 1971 Note on the notion of civilization. Social Research 38: 808–13
Elias N 1939 Über den Prozess der Zivilisation. Haus zum Falken, Basel
Eisenstadt S N 1973 Tradition, Change and Modernity. John Wiley & Sons, New York
Eisenstadt S N 1982 The axial age: the emergence of transcendental visions and rise of clerics. European Sociology 23(2): 294–314
Eisenstadt S N (ed.) 1986 The Origins and Diversity of Axial Age Civilizations. State University of New York Press, Albany, New York
Eisenstadt S N 1982 Heterodoxies and dynamics of civilizations. Diogenes 120: 3–25
Eisenstadt S N, Achlama R 1992 Kulturen der Achsenzeit II, ihre institutionelle und kulturelle Dynamik. Suhrkamp, Frankfurt am Main, 3 Vols.
Eisenstadt S N 1996 Japanese Civilization: A Comparative View. University of Chicago Press, Chicago
Eisenstadt S N 1999 Fundamentalism, Sectarianism and Revolution. The Jacobin Dimension of Modernity. Cambridge University Press, Cambridge, UK
Huntington S P 1996 The Clash of Civilizations and the Remaking of World Order. Simon and Schuster, New York
Kroeber A L, Kluckhohn C 1952 Culture: A Critical Review of Concepts and Definitions. The Museum, Cambridge, MA
MacIver R M 1931 Society: Its Structure and Changes. R. Long & R. R. Smith, Inc. New York
Melko M 1969 The Nature of Civilizations. Porter Sargent, Boston
Nelson B 1981 On the Roads to Modernity. Rowman and Littlefield, Totowa, NJ
Ogburn W F 1922 Social Change: With Respect to Culture and Original Nature. Huebsch, New York
Schluchter W 1979 Die Entwicklung des okzidentalen Rationalismus—Eine Analyse von Max Webers Gesellschaftsgeschichte. Mohr, Tübingen
Schluchter W 1981 (1985) The Rise of Western Rationalism, Max Weber’s Developmental History. University of California Press, Berkeley
Schluchter W 1989 Rationalism, Religion and Domination. A Weberian Perspective. University of California Press, Berkeley
Weber A 1921 Prinzipielles zur Kultursoziologie. Archiv für Sozialwissenschaft und Sozialpolitik XL(VII): 1–49, J. C. B. Mohr, Tübingen
Weber A 1931 Kultursoziologie. In: Enke F (ed.) Handwörterbuch der Soziologie. Alfred Vierkandt, Stuttgart, pp. 284–94
Weber M 1922–3 Gesammelte Aufsätze zur Religionssoziologie. Mohr, Tübingen

S. N. Eisenstadt

Clan

The modern, now widely accepted, definition of the clan, as a group of persons who believe themselves to be related by unilineal descent but who are unable to trace genealogical connections linking all members of the group, has emerged out of a complex intellectual history stretching back to the Enlightenment. Many social theorists of the eighteenth and nineteenth centuries who were concerned to understand the origins of the state and democracy favored evolutionist explanations. Some viewed the family as the original form of society which, over the course of human history, had aggregated into progressively larger and more complex kinship-based groupings of clans and tribes, which were transformed eventually into the territorially-based political formation of the state. Drawing on the work of Barthold Niebuhr (1828), George Grote (1851) and other historians of Greece and Rome, social theorists such as Sir Henry Maine, Numa Fustel de Coulanges, and Lewis Henry Morgan used the ancient Greek and Roman kinship groups, the genos and the gens, as models upon which they based their understanding of early stages of social evolution.

Maine’s (1861) analysis of the evolution of legal systems, using the example of the Roman gens and the concept of agnation—that is, kinship traced through exclusively male links—posited a primitive stage of social development in which membership in patriarchal kinship groups defined a person’s social status. Maine emphasized that the gens was a corporate group which had a legal personality that endured beyond the lives of its individual members, an insight which has been very influential in subsequent work on clans and other forms of descent grouping.

In contrast to Maine’s theory of primitive patriarchy, John Ferguson McLennan (1865) and Lewis Henry Morgan (1877) argued that the early stage of social evolution was characterized by group marriage, which meant that paternity was uncertain, and kinship was traced in the maternal line. For some authors favoring the theory of primitive matriliny, the term ‘clan’ was applied exclusively to matrilineal descent groups, while gens referred to patrilineal groups, a terminological distinction that has fallen out of use in the modern literature. Morgan, whose own research among the Iroquois in New York State had been influenced by Grote’s analysis of the ancient Greek genos, saw descent group organization, which he termed the ‘gentile’ system, as characteristic of much of early human social history. McLennan, basing his reasoning on contemporary reports of female infanticide in India, claimed that primitive societies were obliged, in their struggles for survival, to engage in this population-limiting practice, which in turn led kin groups to procure wives by capture from outside their own group. McLennan termed this marriage outside the group ‘exogamy,’ which came to be seen, by many nineteenth and early twentieth century social theorists, as another characteristic attribute of clanship.

McLennan was also influential in developing the theory of totemism, which built on E. B. Tylor’s notion that primitive religion was based on animism, the worship of inanimate objects or fetishes which were believed to be the abodes of spirits. Along with his follower W. Robertson Smith and Smith’s student Sir
James Frazer, McLennan defined totemism as a system that originally combined animistic religious beliefs with matrilineal descent and exogamy. According to this view, through their ignorance of paternity caused by group marriage and their belief in animism, primitive societies posited links to original animal or plant ancestors. These natural species, which were treated as sacred emblems or ‘totems’ of different clans, were worshipped periodically by clan members in totemic rituals.

Émile Durkheim (1912), in his work on the social origins of religion which drew extensively on early ethnographic reports of aboriginal Australian societies, reproduced many of these classic arguments concerning totemism, particularly emphasizing the concept of a sacred social solidarity based on a belief in a shared substance between clan members and their ancestral totemic species. Claude Lévi-Strauss’ (1962) subsequent general critique of theories of totemism, which emphasized the lack of a necessary coincidence between totemic species as emblems, exogamous clans, and religious sacrifice, served to unhitch the clanship concept from the pseudo-historical hypotheses of nineteenth century evolutionism while, at the same time, helping to clarify the classificatory logic operative in cultural categories such as the clan.

The work of the British structural-functionalist A. R. Radcliffe-Brown (1950) and his followers E. E. Evans-Pritchard, Meyer Fortes (Fortes and Evans-Pritchard 1940), and others between the two world wars, did much to put the study of unilineal descent systems on a firmer ethnographic footing, while refining the conceptual framework for their study. The model of the segmentary lineage, consisting of a nested set of increasingly inclusive corporate unilineal descent groups linked by a comprehensive genealogical structure, is the most well-known outgrowth of their work, with the term ‘clan’ often being used by these authors to label a particular segmentary level within a lineage system. Although Radcliffe-Brown’s (1950, pp. 39–40) definition of the clan, given in the first sentence of this article, has come to be widely used, this application of the term to characterize one of the levels in a segmentary lineage system is not always appropriate. It reflects a continuing influence of earlier understandings of totemism, in which descent group exogamy and totemic taboos were taken to be defining attributes of clanship.

Although a definitional contrast between clan and lineage is thus still not always made clearly in the more modern literature, there is analytical value in maintaining a definite distinction, since the organizational implications of the two forms of descent grouping are different. Members of a lineage know, or claim to know, the genealogical connections interlinking all members of the group, and these links, viewed in terms of generation and relative birth order, often provide a basis for calculating relations of seniority or degrees of relatedness between individuals and segments within a lineage. Lineage genealogy can also be the basis for internal segmentation of groups, according to the segmentary lineage model. In contrast, lacking such a comprehensive internal genealogical armature, relations of clanship are categorical in character and typically nonhierarchical within the clan. Persons are members of a clan because of being the offspring of their fathers or mothers, with the terms ‘patriclan’ and ‘matriclan’ often being used to indicate a clan’s mode of recruitment of its members. Although ascending descent links beyond the grandparental or great-grandparental generation are normally of little or no organizational significance, clan members in many societies do nonetheless recognize a founding clan ancestor, who is often of mythical or nonhuman status. Thus, when clans do segment according to putative genealogical links, this is typically ‘from the top’, with reference to the founding ancestor and his or her children, producing a set of sub-clan categories that also lack comprehensive internal genealogical structures.

Terminological confusion is also possible in the case of the descent group known as the conical clan, or ranked lineage, which combines characteristic features of both lineages and clans. Such descent groups are differentiated internally into a high ranked, lineage-like, chiefly or noble descent line, and a lower ranked and internally undifferentiated clan-like category of commoners (Kirchhoff 1959, Friedman 1979). Chiefly rank in a conical clan typically is based on relative birth order in present and ancestral generations, so that senior sons or daughters of senior ranking ancestors keep careful track of their pedigrees to validate their noble status. Junior offspring of junior ancestors, on the other hand, have little motivation in remembering their genealogies, and their affiliation to the group is more categorical in character. Numerous ethnographers have drawn attention to the structural potential of conical clanship, particularly as a transitional formation between uncentralized and centralized political systems. Conical clan structures are relatively common in Polynesian societies, such as the Maori, and in south-east and central Asian societies, such as the Mongols.

The clans of the Scottish Highlands also displayed such internal ranking (Dodgshon 1998). The Scottish clan chief and his close relatives enjoyed high status due to their close patrilineal connection to the founding clan ancestor. The commoner members of the clan, on the other hand, could not necessarily demonstrate such genealogical links, but were bound to their clan chief by diverse social ties including land tenancy or marriage alliance, as well as real or fictive kinship expressed by a common surname.

Systems of clanship occur in many parts of the world, and their diversity of form can be understood in relation to several key variables. Of prime importance is their capacity for organized collective action as corporate groups, which frequently is limited or absent
due to the spatial dispersal of clan memberships. Particularly in territorially extensive societies with large populations, a clan’s members often reside in many different locations, and there are limited possibilities for the full membership of a clan to meet as a total group. In such circumstances, clans may act collectively on only a few occasions, such as during annual rituals like the aboriginal Australian corroboree ceremonies, which clans celebrated to ensure the continued fertility of their totemic species. In other cases of spatially dispersed clan membership, a clan may never meet as a totality, but simply act as a named category from which localized sets of clan members may be mobilized to pursue collective action if necessary or advantageous.

In addition to their variable capacities for organizing group action, numerous authors have also drawn attention to the potential of clanship systems to serve a social networking function. For example, in West Africa among speakers of the Mande family of languages (Jackson 1974), a limited set of clan names is found throughout this large zone, and a clan member making a long journey can feel confident of finding members of his or her clan who will provide assistance at the distant destination.

Even in cases where the set of clan names changes from one society to an adjacent one, conventional equivalences are often established between clan names to facilitate the extension of such networks of mutual aid. Similar clan networks have also been noted among North American Amerindian and aboriginal Australian societies.

Although ethnographic analysts of clanship frequently have conflated it with lineage structure, as mentioned above, the logic of clanship is often more a matter of ‘sub-ethnicity’ rather than ‘super-lineage.’ Many cultures’ understandings of clanship, which are often rooted in myth, posit a primordial clan identity which is seen as immutable, much like many cultures’ conceptualizations of ethnicity. It is this theme that Lévi-Strauss (1962) explored in his work on totemism, in which the classificatory logic of clans as social species is seen as homologous to cultural classifications of animal and plant species. In such cases, the total number of clans in a society may be quite small and be viewed as unchanging, with the various clanship categories standing in long-term interrelationships of marriage alliance, ritual cooperation or other modes of mutual exchange or solidarity. In societies where the total number of clan categories is only two, these groupings are conventionally referred to as ‘moieties.’

In contrast, some societies have systems of proliferating clans in which it is evident to the analyst, from the presence of large numbers of clans with relatively small memberships in the context of overall population growth, that clans are undergoing segmentation, although this is not acknowledged by clan members themselves. Such systems of proliferating clanship tend to be found in politically uncentralized societies with high rates of spatial mobility, where local groups frequently split as a mode of dispute settlement. In such cases it may be posited that, following an episode of intraclan dispute which has been resolved by group fission and spatial displacement, the members of the resultant new segments have an active interest in denying knowledge of former genealogical interlinkage. Such systems of proliferating clanship, characterized by many small clans that are widely dispersed spatially, are not conducive to the maintenance of the patterns of long-term interclan alliance mentioned above.

In some cases, the term ‘clan’ has also been applied to territorial groups which are recruited both on the basis of unilineal descent and long-term co-residence. Such a usage is common in New Guinea. Here, in-migrant strangers hailing from other clans may initially be welcomed in order to strengthen a localized clan grouping.

Over time, cultural theories of personal identity may posit, for example, that through continued consumption of foodstuffs cultivated on the clan’s land, the physical nature of such incomers becomes transformed into that of true clansmen, thus converting co-residence into shared unilineal descent. In effect, as argued by the American kinship specialist, George P. Murdock (1949), such a ‘compromise kin group’ reflects the difficulty of constituting a large-scale, residentially unified and collectively functional kinship grouping on the basis of unilineal descent alone.

Finally, it should be recognized that, in common parlance, the term ‘clan’ is often used in a metaphorical way to refer to any group of persons who act toward each other in a particularly close and mutually supportive way. Thus, criminal organizations such as the Mafia may be referred to as ‘clans,’ in recognition of the ideology of kinlike solidarity that binds their members together.

See also: Chiefdoms, Archaeology of; Kinship in Anthropology; Matrifocality; Tribe

Bibliography

Dodgshon R 1998 From Chiefs to Landlords. Edinburgh University Press, Edinburgh, UK
Durkheim E 1912 Les Formes Élémentaires de la Vie Religieuse. Alcan, Paris [1915 The Elementary Forms of the Religious Life. Allen & Unwin, London]
Fortes M, Evans-Pritchard E E (eds.) 1940 African Political Systems. Oxford University Press for the International African Institute, London
Friedman J 1979 System, Structure and Contradiction in the Evolution of ‘Asiatic’ Social Formations. National Museum of Denmark, Copenhagen, Denmark
Fustel de Coulanges N 1876 La Cité Antique. Hachette, Paris [1980 The Ancient City. Johns Hopkins University Press, Baltimore]
Grote G 1851 A History of Greece, 3rd edn. John Murray, London
Jackson M 1974 The structure and significance of Kuranko clanship. Africa 44(4): 397–415
Kirchhoff P 1959 The principles of clanship in human society. In: Fried M (ed.) Readings in Anthropology. Crowell, New York, Vol. II
Kuper A 1988 The Invention of Primitive Society. Routledge, London
Lévi-Strauss C 1962 Le Totémisme Aujourd’hui. Presses Universitaires de France, Paris [1969 Totemism. Penguin, Harmondsworth, UK]
Maine H 1861 Ancient Law. John Murray, London
McLennan J F 1865 Primitive Marriage. Adam & Charles Black, Edinburgh, UK
Morgan L H 1877 Ancient Society. Henry Holt, New York
Murdock G P 1949 Social Structure. Macmillan, New York
Niebuhr B 1828 The History of Rome. Taylor, Cambridge, UK
Radcliffe-Brown A R 1950 Introduction. In: Radcliffe-Brown A R, Forde D (eds.) African Systems of Kinship and Marriage. Oxford University Press for the International African Institute, London

P. Burnham

Class Actions: Legal

A class action is a procedure whereby one or more claimants, called the class representative, may bring suit (or, more rarely, be designated as a defendant class) to obtain a remedy responsive to the legal interest of all members of the class (Lindblom 1996). The remedy may be an injunction, for example prohibiting racial discrimination in public schools, or compensatory damages and, in some instances, punitive damages, for example in claims of mass consumer fraud. The suit may seek both an injunction and monetary redress for or against a class. The procedure facilitates assertion of similar claims on behalf of a large number of allegedly injured parties, including claims that could not, as a practical matter, otherwise be asserted on account of the cost of litigation. A suit against a defendant class typically would seek imposition of an identical remedy against all of them, for example, against stockholders or creditors of a corporation. By the same token, because the procedure aggregates claims, it can impose very heavy liability and hence can become a weapon of ‘legal blackmail.’ The procedure is governed by Rule 23 of the Federal Rules of Civil Procedure in federal courts and by similar rules in state courts. Class suits are very controversial but they are an important component in the American use of litigation to address public issues of compensatory and distributive justice.

1. History

English courts recognized litigation on behalf of groups from no later than the fifteenth century and perhaps earlier. The concept of a class suit evolved in the Court of Chancery in the sixteenth and eighteenth centuries, in such cases as rights to fish in a river, rights of creditors against an insolvent debtor, rights of shareholders to enforce duties of corporate directors and obligations of parishioners to pay church tithes (Yeazell 1987). By the end of the eighteenth century the basic formula for a class suit had been established, in these terms: ‘Plaintiff, suing on behalf of himself and all others similarly situated, alleges as follows: …’

In the United States the class suit procedure was elaborated by Justice Joseph Story in his treatise on Equity and was recognized in the federal courts, notably in the case of Smith v. Swormstedt (57 US [16 How] 288 [1853]). In the nineteenth century the class suit evolved in state court procedure, particularly in litigation by city taxpayers complaining about improper municipal expenditures and in proceedings to reorganize insurance companies that had overextended themselves. The ‘taxpayers’ suit’ has since evolved into a standard procedure for obtaining judicial review of action by municipal and state government. The insurance reorganization proceedings have since evolved, in one direction into bankruptcy procedure and in another offshoot into procedures for reorganization of insurance companies and banks (Hazard et al. 1998). The class suit procedure was used only infrequently until the litigation involving desegregation of the public schools in the 1960s and 1970s, in which the procedure was a standard technique.

The class suit procedure was given greater status and more precise definition in the Federal Rules of Civil Procedure, adopted in 1938. Federal Rule 23 permitted class suits for injunctions, damages and multiple claims against a limited fund. In 1966 Federal Rule 23 was elaborately amended to provide clearer criteria for when class suits may be maintained and greater protection for class members, particularly those who were not the designated representative parties.

The validity of class suit procedure has been recurrently questioned in terms of due process. The issue is essentially whether, and in what circumstances, a judgment in a class suit can validly determine the rights of members of the class who do not actively participate in the litigation. The courts have upheld the concept of the class suit but have given ambiguous pronouncements about its effect on absent class members. Leading Supreme Court cases addressing that issue include Hansberry v. Lee (311 US 32 [1940]), disapproving a class suit where the representative has a conflict of interest; Phillips Petroleum Co. v. Shutts (472 US 797 [1985]), approving a class suit in state court determining rights of residents of other states; and Amchem Products, Inc. v. Windsor (521 US 591 [1997]) and Ortiz v. Fibreboard Corp. (527 US 815 [1999]), disapproving class suit settlements covering potential claimants who could not be given notice of the suit.
2. Requirements for a Class Suit

The requirements for a class suit in federal court are set forth in Federal Rule 23. Essentially similar requirements prevail in state court procedures.

The class suit complaint must set forth, first, that plaintiff is a party injured in the manner detailed in the complaint’s subsequent allegations and, second, that the plaintiff maintains the action on behalf of all members of the described class, for example, all Black children seeking admission to the defendant public school or all consumers who borrowed money from the defendant lending institution. The first of these allegations establishes ‘standing’ to sue and the second asserts plaintiff’s assumption of a representative capacity in the litigation. The remedy must be sought for all members of the class. The complaint must describe the wrong in accordance with the usual pleading rules. It must also allege that the members of the group are too numerous to sue separately; that there are common questions presented in the claims for the class; and that the representative party will be an adequate representative of the entire class. It must also be shown that common relief for all class members is necessary or at least convenient and that the procedure will be efficient as a means of resolving the many claims involved.

The terminology in Federal Rule 23(b) differentiates three types of class suits: 23(b)(1), involving a course of conduct by defendant that affects all class members; 23(b)(2), in which an injunction is sought against conduct affecting all members of the class; 23(b)(3), in which similar injury has allegedly been inflicted on all class members, for which damages are sought. These distinctions are not mutually exclusive. For example, a proper class suit can seek both an injunction against future wrong and damages for past conduct. However, other provisions of Rule 23 impose different procedural requirements on the basis of this typology, particularly a requirement that all members of a ‘(b)(3)’ class be given individual notice of the pendency of the suit. This discrepancy as well as redundancy in other terminology in Rule 23 have resulted in much confusion in administering class suits. There is a large and complex procedural jurisprudence concerning these requirements (Wright et al. 1986).

However, there are two essential questions. The first is whether, all things considered, the litigation may be maintained on a group basis, or may not be so maintained. This is the ‘certification’ issue, so-called because the court must certify the suit as a proper class proceeding if it is to proceed as such. The second critical question is whether individual notice to all class members will be required.

3. Certification

A class suit may be maintained as such only with the approval of the court, that is, certification that the suit is proper. An initiative by the plaintiff to proceed with a class suit is ordinarily necessary, except in unusual situations where a defendant class is established. However, a plaintiff initiative is not sufficient without the court’s approval. Usually the question of certification is a major preliminary issue, strenuously contested with legal and factual argument. A great deal depends on resolution of this issue. If certification is denied, the lawsuit lapses into a claim by one or a few parties, rather than on behalf of a large group. Reduction of the size of the case to the claims of the individual representatives usually makes further pursuit of the litigation by the plaintiff unattractive or impractical. On the other hand, if certification is granted, the lawsuit may immediately assume major proportion for the defendant, often creating a compulsion to reach an expensive settlement.

Despite extensive procedural jurisprudence, and intensive professional and academic debate, the issue of certification in a specific case remains relatively open ended and hence very much a matter of judicial judgment and discretion (Newberg and Conte 1992). For many years the question of certification was held to be an interlocutory determination and hence not subject to appellate review until final resolution after determination of the merits of the class members’ claims. See Eisen v. Carlisle & Jacquelin (417 US 156 [1974]). This made the certification issue in the trial court all the more crucial. Denial of certification was termed the ‘death knell’ of a class suit, while grant of certification usually resulted in defendant feeling obliged to negotiate a settlement with the class. In 1999, Rule 23 was amended to permit immediate appeal of a grant or denial of certification. Immediate appeal of the trial courts’ certification decision has been permitted under class suit procedure in most states.

4. Initiative to Prosecute Class Suits

A class suit can be initiated by a pre-existing group, such as a labor union or trade association; by a political action organization wishing to press a legal contention, such as the NAACP’s litigation to end school segregation; or by a group recognizing itself to have a common interest in making claims, such as disaffected shareholders of a corporation. Initiation of such efforts requires assistance of a lawyer willing to prosecute the case. In modern practice such an initiative is often by lawyers specializing in class suit litigation. Because class suits typically involve large stakes so far as defendant is concerned, most class suits are strongly contested. That prospect in turn requires that the organizer, whether an action group or the lawyer, have financial resources and staying power to sustain costly and protracted litigation. Legal grievances that can be framed as class suits hence
ordinarily result in escalation of a dispute into litigation of major proportion.

The initiator of a class suit, whether a claimant action group or a lawyer, defines the grievant group by the description of the class in the complaint. The class is defined in terms of common characteristics and usually a specified time interval, for example, ‘female employees of defendant corporation in the period January, 1998 through December, 1999.’ The time interval refers to the period in which the alleged injuries occurred and is called the ‘class period.’ A larger class may be subdivided into subclasses having different specific characteristics, either by the plaintiff or by order of the court.

Some class suits are almost entirely the result of lawyer initiative. A lawyer envisions that a wrong has been committed against many people, locates someone fitting the description to serve as class representative, and then manages the lawsuit. The lawyer’s incentive in such a case includes the prospect of large fees upon obtaining a judgment or, much more likely, a settlement. Some analysts differentiate between ‘cause’ class suits, such as those asserting civil rights, and ‘money’ class suits, where large damages are sought. ‘Money’ class suits are typified by cases where inventive lawyers frame class damage claims out of transactions too small to be worth individual prosecution but affecting thousands of alleged victims. A classic illustration is a California case seeking restitution from a taxi company that allegedly fixed its meters to record fares at a rate higher than legally permitted. See Daar v. Yellow Cab Co. (67 Cal. 2d 695 [1967]). Claims on behalf of corporate stockholders, alleging misrepresentation in the corporation’s financial projections, are a common type of modern class suit. Many class suits involve both politically significant issues and large financial stakes. In any event the class suit permits isolated individual legal grievances to be amalgamated into large scale claims.

5. Uses and Abuses of Class Actions

The typical class suit defendant is a business corporation or a government bureau. The class suit procedure is an important device for commutative and distributive justice on behalf of individuals who have suffered legal wrong at the hands of powerful actors in modern mass society. At the same time, the class suit procedure is a menacing device by which self-constituted protagonists, chiefly lawyers, can exploit relatively minor legal mistakes to reap large recoveries from ‘target’ defendants. Officials of such a defendant typically consider that they cannot afford the risk to their organization’s continuity of a big judgment: the risk, in common parlance, of ‘betting the ranch.’

A class suit also presents opportunity for the plaintiff’s lawyer to exploit the class, often through connivance of the defendant. A class suit settlement typically calls for defendant to pay the plaintiff’s lawyer a substantial fee. A defendant is typically indifferent whether settlement money goes to the lawyer or to the class members. A settlement could pay $1 million to the lawyer and $9 million to the class, or $2 million to the lawyer and $6 million to the class. The terms of a settlement injunction can be similarly manipulated. The class members typically are dispersed and not organized, hence in a weak position to contest the terms of settlement (Coffee 2000).

6. Use and Abuse of Class Suits

Most class suits are resolved by settlement, typically after a period of intensive discovery and extensive motion practice. Hence, although class suit litigation typically is expensive, there are few cases where an adjudication has determined the merits. Most of the controversy over class suits is founded on dispute about the fairness of settlements.

Control against blackmail of a defendant or exploitation of absent class members is afforded through exercise of responsibility by class counsel and by court supervision. Most class counsel are faithful to their responsibilities, notwithstanding that class suit defendants typically assert that they are being oppressed. But some class counsel have been flagrantly exploitive. The quality of court supervision varies greatly. Some judges conduct searching inquiries into the basis and terms of a proposed settlement, particularly the attorney fees. Some judges make only a perfunctory review. Close supervision is in any event difficult because there is no reference point as to the merits of the claim and both sides are at high risk in a trial on the merits: plaintiff through possible loss of its investment in the litigation, defendant through possible loss on the merits (Hensler et al. 2000).

Both the opportunity to achieve justice for individuals suffering common wrongs and the danger to defendants of ‘betting the ranch’ could be improved if a clearer definition could be drawn of claims appropriate for class suits. Efforts in this direction have been without much success. Congress adopted special provisions concerning securities class suits in federal court but their effect has been avoided by bringing suits in state courts. Proposed legislation that would extend federal court jurisdiction to class suits involving interstate transactions has encountered intense criticism that it would stifle proper class suits and inappropriately burden the federal courts. Revisions in Rule 23, in addition to making certification orders immediately appealable, have been proposed but probably would have modest effects.

Much of the controversy over class suits implicates other aspects of American civil procedure, including broad discovery, the right of jury trial of issues of fact, and the American rule concerning court costs (Yeazell
1987). Class suits typically involve extensive discovery which, under broad American discovery practice, can extend to thousands of documents and dozens of witness depositions. The right of jury trial imposes inherent risks to a ‘deep pocket’ defendant and, perhaps more important, precludes decisions by the judge of issues of fact that might narrow the controversy or resolve it altogether. The American cost rule, whereby the winner of litigation generally is not entitled to recover its litigation costs, means that a class suit plaintiff has no ‘down side’ risk of becoming liable for a defendant’s costs. Moreover, many class suit claims are based on statutes that provide recovery of litigation costs to winning plaintiffs but not to defendants. A commonly voiced criticism of these statutes is that they contemplated individual suits, where litigation costs would make enforcement of rights practically impossible, whereas the class suit is another approach to enforcement of individual rights through a ‘private attorney general.’

See also: Legal Process and Social Science: United States; Liability: Legal; Litigation; Parties: Litigants and Claimants; Procedure: Legal Aspects; Rules in the Legal Process

Bibliography

Coffee J 2000 Class action accountability: Reconciling exit, voice, and loyalty in representative litigation. Columbia Law Review 100: 370
Hazard G, Gedid J, Sowle S 1998 An historical analysis of the binding effects of class suits. University of Pennsylvania Law Review 146: 1849
Hensler D R, Pace N M, Dombey-Moore B, Giddens B, Gross J, Moller E K 2000 Class Action Dilemmas: Pursuing Public Goals for Private Gain. RAND, Santa Monica, CA
Lindblom P H 1996 Group Actions and the Role of the Courts: A European Perspective. Kluwer, The Hague, Netherlands
Newberg H, Conte A 1992 Newberg on Class Actions, 3rd edn. McGraw-Hill, New York
Vos W de 1996 Reflection on the introduction of a class action in South Africa. Tydskrif vir die Suid-Afrikaanse Reg 4: 639–57
Wright C, Miller A, Kane M 1986 Federal Practice and Procedure. West, St Paul, MN, Vols. 7A–7C
Yeazell S 1987 From Medieval Group Litigation to the Modern Class Action. Yale University Press, New Haven, CT

G. C. Hazard, Jr.

Class and Law

1. Basic Marxist Assumptions

The basic assumptions in a Marxist explanation of the origin, the development, and the function of law include the following: Productive labor is the basis and thus the focus of social ascriptions, i.e., of a collectively shared pattern of social identity (not gender, ethnicity, and the like). On a certain level of productivity a surplus can be produced. The mode of production and appropriation of a surplus changes historically. Classes evolve out of unequal property relations to the means of production. Owners and non-owners of the means of production form the dominating vs. the dominated, exploited classes. (In non-Marxist approaches, e.g., in Max Weber, classes are defined not only according to their market position or possession but also according to their relation to the mode of production and appropriation; Weber 1964, p. 688.)

Classes are not only structural, anonymous entities. Classes operate as collective actors vested with class consciousness (Klasse für sich) and class interests. The interests of the economically dominating class are backed by the state as an instrument of suppression. (In contrast, Max Weber suggests that the legal order has an impact on the power order of a society (Weber 1964, p. 678), which is determined by, among other things, the economic Klassenlage, insofar as the legal order protects the free disposition of the owners of the means of production (Weber 1964, p. 682).)

At the same time, law disguises class domination behind a veil of ideological justification. The legal ideology of the formal equality of autonomous market subjects legitimates both the economic and the political order. Instead of mere force, a system of alleged universal legitimized domination, and thus the step from might to right, is institutionalized. Therefore, not only class interests but also general (class-independent) conditions of a society are secured by state and law. By its intellectual sublimation and doctrinal systematization, law gains a relative autonomy. Law is no longer the pure expression of the economic basis or state power. There exists a feedback of legal regulations on the economic basis including the class structure.

According to the basic assumptions of Marxist historical materialism, the systematic relationship between the class basis and the legal superstructure evolves in historical stages: In ancient ‘primitive,’ ‘classless’ societies, which do not produce a surplus, there is no need for state and law. In hierarchical societies that are economically based on slavery or a feudal system, state and law play a dominant role. In the tradition of Marxist legal theory, this characterization was denied by Eugen Pašukanis (1924). According to his approach, law exists only in capitalist (market) societies with antagonistic interests of the owners of commodities. Preceding social formations are marked by lawless domination. The stage of the dictatorship of the proletariat is, according to Marx/Engels and Lenin, signified by the use of law as an instrument of overt state domination, even of terror. After that, law is used as an instrument to build up and to secure a socialist society. The future of state and law in a communist society was an open question. The answers range from the notion of the withering away
1927
Class and Law

of law and state, to a ‘state of the whole people’ in which law still serves administrative purposes.

In the following systematic presentation the concept of law is used either in the sense of legislative acts, or the activities of the legal staff, or the legally relevant activities, opinions, and beliefs of the citizens in general.

2. Class Legislation

In Marx’s base–superstructure scheme (Marx 1859, pp. 8–9), law (as an element of the superstructure) is conceived of as an expression of the economic base. In the writings of Marx and Engels, the instrumental view comes in via the notion of Rückwirkung (feedback) of the superstructure onto the economic base. This instrumental view was dominant from the time that the communists gained power; for example, in the writings of Stalin. In the instrumentalist view, law is conceived of as an instrument used by the ruling class(es) in order to realize a socialist order. The theoretical background of Marxist legal theory can be characterized by this oscillation between law as an expression of the given class structure and law as an instrument of class domination.

But what are the concrete mechanisms by which a link between social classes and legislation, between the dominant class(es) and the legislative majority, can be achieved (as soon as there are distinct political bodies)? What is the logic of ‘class legislation’ (Mill 1861, Chap. 10)?

On a primitive level, members of the dominating class(es) are in persona members of the legislative body. With growing differentiation ‘class representation’ becomes important: parliamentary representatives are to a large extent nominated or elected by members of the dominant class; the legislative power of the dominant class can be secured by special arrangements (e.g., Prussian Dreiklassenwahlrecht (1849–1918) by which the representatives were elected according to the amount of taxes paid by groups of voters). This becomes a problem with the introduction of equal, universal suffrage. Class influence then can be maintained by a combination of economic and political positions, by regulations concerning access to Members of Parliament and government, by lobbying and corruption. Alternatively, input-orientation–output-orientation becomes more relevant; i.e., via threats and sanctions, capitalists can use their economic power (reduction, relocation, export of jobs, and investments, flight from taxation, etc.) to make the legislator issue certain regulations. The legislator has to take into account the consequences threatened by the employers. The subtlest form—often found in the critique of the bourgeois ideology of formal law—is the use of general, universal norms, the application of which confirms economically privileged positions. The construct of formal equality in civil law among allegedly free autonomous persons veils substantial inequality in market positions (e.g., in the labor market between employers and workers, in the housing market between landlords and tenants). But this normative pattern is relativized by special, protective regulations in these fields, when the weaker position of one group is acknowledged and becomes legally relevant.

3. Class Justice

It is a truism that courts cannot act neutrally in a class society and in a class state. If class justice is the justice of the class state that is organized according to the interests of the dominating class, how is it organized? What mechanisms ensure that the courts will act in the interest of the ruling class? To which class do judges belong? How are the interests of the dominating class transferred into the courts? (See Judges.)

The easiest example is, of course, where members of the economically and politically dominating class themselves have jurisdiction. The Landesherr, for example, is in charge of the administration of justice in his region. But with the professionalization of the judiciary, and a rudimentary separation of powers, the problem arises as to how the interests of the dominating class can be promoted by persons who are not themselves members of this class but rather act as class representatives. A number of factors help to promote the interests of the dominant class.

The first factor that promotes the interests of the dominant class is socialization. Judges tend to come from a similar background (Griffith 1981). Family background, and especially the occupation of the father, helps to explain the class bias in the administration of justice. And the high costs of education, in particular the costs of academic legal education, make the legal professions accessible only to members of families with a high income.

The second factor promoting the interests of the dominant class is the process of professionalization. That all judges share a common academic background and training is likely to have an impact on the attitude of law students, reinforcing a status quo orientation. Furthermore, judges must adhere to a set of professional standards; if they do not, they may suffer political consequences, or be subjected to professional discipline. Thus, in almost every state—be it a dictatorship, an apartheid regime, or a democracy—organizational means are successfully employed to secure the conformity of the legal profession to the political regime, and to the dominant political ideology.

Illuminating examples of the operation of these mechanisms can be found in Karl Liebknecht’s critique of the class justice during the German Reich and in Prussia before the First World War: ‘If we look
at where our judges come from it is obvious out of which milieu, from which point of view they regularly will decide. Naturally the judges will be recruited only from the possessing classes simply because of the high costs of education, because of the low income at the beginning of their career, by which the need for an adequate way of life, that still exists, cannot be satisfied.’ (Liebknecht 1910, pp. 26–7). During this period the criterion of political reliability was overtly and restrictively used by the court administration to regulate the access to the judiciary. There are also brief remarks on the English system by Max Weber (1964, p. 1049) on the recruitment of judges from the advocacy who served the capitalistic interests of their clients. (For England see Griffith (1981) and the recent analysis of Abel (1998): the access to the judiciary is politically controlled by the Lord Chancellor and there still exist strong financial barriers.) (See Judicial Selection.)

Although according to Marxist theory, state and law are always ‘class state’ and ‘class law,’ in socialist countries the term ‘class justice’ was not used as a self-description of their courts, but solely in a pejorative sense against ‘bourgeois justice’, etc.

Is there still—as it was evident for Karl Liebknecht in cases where workers were involved (as an accused or, in civil cases, with an employer on the other side)—a class bias in the administration of justice? What about civil cases in which both parties come from the same class, i.e., from the class of the owners of the means of production, or from the proletariat—and here possibly as blue-collar workers versus white-collar workers? What if conflicting parties avoid court litigation? And in penal cases, how do we compare legally similar cases in which the judgment differs solely because of the different class affiliation of the accused? The appropriate field of study of class justice seems to be labor courts with employers and employees as parties, i.e., where class conflict is transformed into legal conflict (cf. Rottleuthner 1984).

There is substantial evidence documenting discrimination in penal courts with a high selectivity against the poor, from police arrest, through state prosecution, to disparities in sentencing and imprisonment. Assessing the extent to which capital punishment (see Death Penalty) is discriminatory in the US is methodologically complicated because of the correlation of crime rates with social status, race and even gender (see Gender and the Law). Further, statistical correlations between race or gender and the law are hard to interpret because it is still an open question how these factors become operative within legal argumentation. The overrepresentation of lower class people in the penal system could be explained to a certain degree by legal standards, because there are correlations between legally relevant variables like observability of a crime, recidivism, confession, way of life, and sociologically relevant variables like (lower) class (see Crime and Class). The use of purely external variables that do not take into account legal argumentation raises several problems. The ‘background approach’ (i.e., the use of variables like father’s occupation, religion, ethnicity, party, and other group affiliations) in judicial research did not lead to clear results. Does there still exist a homogeneous working class culture with distinct socialization patterns? Do legal education and role expectations on the job neutralize a possible impact of the social background? What about lay judges who have almost no legal training? When a jury is selected, lawyers consider the composition of race and gender (see Juries). There is strong evidence that neither the background or education of judges, nor their more recently achieved attributes like group or party affiliations, have an impact on the outcome of judicial decisions (particularly for labor courts; cf. Rottleuthner 1984). The focus of research into the outcomes of litigation has therefore shifted from the personality of judges to features of the parties to a court conflict (cf. Galanter 1974). Success in a civil court litigation can be explained by the experience of repeat players—in general bigger firms which tend to come out ahead of private one-shotters.

The focus of sociolegal research has furthermore shifted away from the courts to what happens out of court: to different legal needs (of the rich and poor), to unequal access to courts and other legal or extralegal remedies (see Justice, Access to: Legal Representation of the Poor), in general to the selectivity of legal procedures (see Law, Mobilization of). The problem of access to courts has been addressed by Max Weber (1964, p. 719) in the case of the English system: monetary barriers for the poor lead to denial of the administration of justice.

4. Class Consciousness and Law

There are no empirical studies that contrast the attitudes of workers (or of the proletariat) to legal norms and legal institutions with those of employers (or of capitalists). Empirical research into knowledge and opinion about law uses simple indicators of social status (preferably income or occupation), but no class variables in a strict sense. Georg Lukács (1920) dealt theoretically with the attitude of the proletariat towards legality and illegality and towards the bourgeois law and state in general. In order to establish successfully a proletarian state, the proletariat has to acquire a sober, purely tactical attitude toward law and state. The state has to be seen solely as an element of power, as an empirical entity without any normative obligatory force. Legality or illegality are not matters of principle but of utility. This instrumental attitude towards law and state can also be found among the members of Communist parties. After their experiences with bourgeois legislation and class justice, they, having gained power, used law as an instrument of
suppression and of strengthening the socialist state. The doctrine of the withering away of law and state was discarded; and law was not understood as a limit on state power.

5. Abiding Marxian Concepts

The replacement of the Marxian concept of class with constructs like status and milieu, and with indicators such as income, education, and occupation can itself be explained by substantial social changes. Labor is no longer the focus of social ascriptions. The forms of property or ownership (of the means of production) have changed and have become objectively blurred by the development of big companies (see Property: Legal Aspects). Other forms of social inequality, like gender, race, ethnicity, nationality, age, etc., became important. Of course, hierarchical stratification still exists: the haves and have-nots (the distinction between Besitzenden and the Besitzlosen constituted the Klassenlage in Weber 1964, p. 679), rich and poor, the legally included and excluded, that share or do not share the benefits and compensations of the welfare state, the politically dominating and dominated, etc. Instead of property, the notion of access becomes more important. Scholars are interested in the access to the means of production, but also the access to money, to the labor market, to information, knowledge, education (cultural capital), access to medical care, to other social benefits and rewards, to legal services and remedies, to leisure time options, to political power, etc. It is important to look at correlations and congruence between these varieties of access. There are no longer collectively conscious actors like classes. Some authors deny the existence of collectivities altogether under the heading of individualization; others speak of social movements (Eder 1993). Finally, the role of law has changed. Law is no longer conceived of as a general structure of society, whether based on social contractual consensus or on class domination or an anonymous economic base. Law serves as a political instrument, among others, used in order to achieve certain goals. Law, in particular welfare and labor law, is applied in order to compensate for the shortcomings of formal equality. Law is constitutive of state and social activities (see Law as Constitutive); and law can work to limit political power.

Basic Marxian concepts are used today only metaphorically. Everything, including culture and law, becomes a matter of production. The notion of capital is converted into social capital and cultural capital (Bourdieu 1986). Class becomes an empty shell signifying every form of inequality. Lenski (1966), for example, speaks of various class systems according to different principles of stratification. There is not only a ‘power class,’ but also a ‘sexual’ class system (on class in general, see Milner 1999). Marxist legal theory could be understood as an economic analysis of law. But what today is labelled as ‘economic analysis of law,’ the individualistic, utilitarian approach of the Chicago school, would have become the target of Marx’s sarcasm.

Class has been used as an explanatory variable, namely as a background variable, in order to explain the behavior of legislators or judges, or as a variable attributed to the accused or to the parties involved in a legal conflict. But it has regularly been used in a non-Marxian sense; in empirical research, indicators other than the property relationship to the means of production are often used. Thus the concept of class has been dissolved into notions like status, stratum, milieu, etc., and indicators like income, occupation, and level of education have taken the place of the theoretical construct of class. In sociology, as in sociolegal studies, class has become a vague notion relevant to social stratification or to social movements.

The critical heritage implied in analyses of class legislation, class justice, and legal class consciousness lies in the disclosure of inequality that is veiled behind formally equal legal rules. Marxian and Marxist analyses, like Critical Legal Studies (see Critical Legal Studies), teach us about the ‘dialectic’ of formal and material equality. Formal equality in law means that there are legal rules framed in neutral language with no overt discrimination. But the application of these rules has a discriminatory impact on those who do not, in practice, have the options that are imputed to allegedly autonomous persons. The strategy of argumentation is fundamental for the discourse of inequality in general (see Discrimination) insofar as it holds not only for class discrimination but also for unequal treatment in the cases of gender, race, color, ethnicity, nationality, etc. Also the attempts to overcome the contradiction between formal equality and material inequality, legally and/or politically, are similar in the various domains. Should one adopt compensatory legal measures, such as protective regulations? Should one opt for positive discrimination? (See Affirmative Action: Comparative Policies and Controversies; Affirmative Action: Empirical Work on its Effectiveness.) Or should one dispose of law generally because of its class character, as Marx did in his critique of the Gotha program: law is by necessity a law of inequality (Recht der Ungleichheit) that cannot cope with the variety of individuals? (A similar argument can be found in radical feminism against the intrinsic male character of law. See Feminist Legal Theory.)

A negative heritage of the Marxist juxtaposition of class and law lies in the conception of law either as the expression of a basic anonymous class structure or as an instrument of the dominant class(es) as conscious actors. What has been systematically neglected is the ubiquitous constitutive role of law (already in the fundamental property relationships) and the functioning of law as a possible limit to political power; in short, the intrinsic value of legal normativity for a social order.
See also: Affirmative Action: Comparative Policies and Controversies; Class Consciousness; Class: Social; Crime and Class; Gender, Class, Race, and Ethnicity, Social Construction of; Justice, Access to: Legal Representation of the Poor; Networks and Linkages: Cultural Aspects; Social Class and Gender

Bibliography
Abel R L 1998 The Legal Profession in England and Wales. Blackwell, London
Bourdieu P 1986 Forms of capital. In: Richardson J G (ed.) Handbook of Theory and Research for the Sociology of Education. Greenwood, Westport, CT, pp. 241–58
Eder K 1993 The New Politics of Class. Social Movements and Cultural Dynamics in Advanced Societies. Sage, London
Galanter M 1974 Why the ‘haves’ come out ahead: Speculations on the limits of legal change. Law & Society Review 9: 95–160
Griffith J A G 1981 The Politics of the Judiciary, 2nd edn. Fontana, London
Institut gosudarstva i prava AN SSSR (ed.) 1970 Marksistsko-leninskaja obščaja teorija gosudarstva i prava (Marxist–Leninist general theory of state and law). Moscow
Lenski G 1966 Power and Privilege: A Theory of Social Stratification. McGraw-Hill, New York
Liebknecht K 1910 Gegen die preußische Klassenjustiz (Against the Prussian class justice). In: Gesammelte Reden und Schriften, Vol. III. 1960 Dietz, Berlin, pp. 3–55
Lukács G 1920 1971 Legality and illegality. In: History and Class Consciousness: Studies in Marxist Dialectics. MIT Press, Cambridge, MA
Marx K 1859 Zur Kritik der politischen Ökonomie. Vorwort (On the critique of political economy. Preface). In: Marx-Engels-Werke, Dietz, Berlin 1975, Vol. 13, pp. 7–11
Mill J S 1861 1962 Considerations on Representative Government. Reprint Regnery, Chicago
Milner A 1999 Class. Sage, London
Pašukanis E 1924 1980 General theory of law and Marxism. In: Beirne P, Sharlet R (eds.) Selected Writings on Marxism and Law. Academic Press, London
Rottleuthner H (ed.) 1984 Rechtssoziologische Studien zur Arbeitsgerichtsbarkeit (Socio-legal studies on labor courts). Nomos Verlagsgesellschaft, Baden-Baden
Weber M 1964 Wirtschaft und Gesellschaft (Economy and society). Kiepenheuer and Witsch, Berlin

H. Rottleuthner

Class Consciousness

1. A Key Concept of Marxist Theory

In Marx’s work the concept of class consciousness is based on his theory of social classes and the distinction he makes between existing social classes and politically active ones. It is not sufficient for a large number of individuals merely to be living under similar social conditions for them to constitute a class. In The Eighteenth Brumaire of Louis Bonaparte (1852), Marx describes the living conditions of peasant families. He sees them as having few connections with each other and even less with the rest of society. For this reason, these families do not form a social class. Only groups of individuals engaged in common activities, mainly relations of production and exchange, constitute a social class.

In the struggle that necessarily opposes the owners of the means of production and the workers, who possess only their labor power, all the conditions are fulfilled for the proletariat to constitute a true social class. The workers are all equally dependent upon their employers. They are forced to sell their labor and must also resist the stronghold of capital whose demands continually threaten their livelihood. This situation of dependence and resistance, this class struggle, which accompanies the beginnings of capitalist relations, becomes particularly intense due to the system’s growing contradictions. As observed by Marx from 1845 onward, this is the context in which the issue of class consciousness arises. In other words, it is the situation through which workers become conscious of their shared socioeconomic conditions, of their fundamentally antagonistic relationship with the capitalists, and hence of the need for political struggle. This awareness signals a new change within the working class itself, leading it from the condition of class ‘in itself’ to one of class ‘for itself.’

In The Poverty of Philosophy (1847), Marx analyzes the economic transformation of the working class and the ensuing changes in its subjective position. Since workers have been brought together, ‘agglomerated,’ in the same workplaces, they experience similar labor conditions. They face the same demands from their employers, and are therefore led to organize strikes together and to form ‘coalitions,’ which persist beyond the period of strike. Such struggles and communal activities bring the entire working class above the level of local conflicts and heighten its awareness of its political role. This rise in consciousness represents more than simply the awareness of a particular situation. For Marx, proletarian consciousness is simultaneously the discovery by the laborers of their extreme alienation and of their need to overcome such alienation through a form of action aimed at destroying the capitalist mode of production. Class consciousness is considered to be the sine qua non of social revolution.

Such a concept, which is both an element of Marxist philosophy of history and a theory of social change, comes from two major sources: one theoretical, the other empirical. Hegel, in The Phenomenology of Mind (1807), describes ‘self-consciousness’ as a moment in the mind’s evolution by means of which the subject reaches a heightened level of awareness and a new potential for action. Yet it is essentially the changes in working conditions in the UK and France that
inspired Marx to formulate such a concept. As E. P. Thompson (1963) shows in his study, The Making of the English Working Class, the years 1820–40 witnessed a decline in the traditional organization of the various trades, a reinforcement of the process of proletarianization of the English working class, and the beginnings of Chartism. In France, during the 1840s an increasing number of scientific and journalistic writings emphasized the unification of the working class, who thereby transcended divisions between the trades and traditional organizations. Newspapers appeared, written by workers, in which they asserted their need to express their own claims and their hopes for a new form of social organization. Some of these reiterated the appeal formulated by Saint-Simon as early as 1820, calling for class consciousness as the fundamental condition necessary for political activity by the agents of production.

2. A Political Debate

In 1865, Proudhon, one of the founders of anarchism, developed a theory similar to that of Marx. In his work, De la capacité politique des classes ouvrières (On the political potential of the working classes), he poses the question of the conditions necessary for a social class to unite in political action. He formulates three conditions: the achieving of self-consciousness, the formulation of a political program, and the capacity to carry out such a program. He believes that the working class has fulfilled the first two conditions since 1848, but that it has yet to reach the third condition. He emphasizes, even more than does Marx, the dynamics of the social movement and also the problems it faces as it attempts to achieve its goals.

This theory of class consciousness led to divergent and far-reaching interpretations. One could indeed conclude from it that the function of a labor party should be only to encourage class consciousness, without aiming to control or direct it. By contrast, Lenin, in his 1902 publication What Is To Be Done?, states that the class consciousness of the workers has led spontaneously within the context of capitalism to purely economic claims. He considers that it is not the working class, but rather intellectuals of bourgeois origin, who produced the theory of social revolution. He therefore concludes that the political leadership of the movement must not be handed over to representatives of the working class (who tend towards reformism and anarchism), but rather to professional revolutionaries, united by strong party discipline. These sociopolitical arguments provoked critical reactions on the part of Marxist theoreticians (Rosa Luxemburg, Georg Lukács, Antonio Gramsci), who feared the consequences of replacing class consciousness by an authoritarian party.

3. Four Sociological Criteria

After World War II, certain works suggested that such a theory should be made sociologically relevant. To this end Mann (1973) distinguished four principles which together would constitute class consciousness: (a) identity—the definition of oneself as a member of the working class; (b) opposition—the designation of an opposing class; (c) totality—the vision of society as a whole; (d) alternative—an alternative vision of society based on different principles of organization and aimed at replacing the established order. These four conditions give rise to many different questions which numerous studies have attempted to answer.

Many studies lean toward a positive answer to the first principle, or condition, concerning the awareness of an identity. Richard Hoggart’s work (1957) asks whether it is pertinent to emphasize the originality of working-class culture. Using the perspective of cultural anthropology, the study undertaken by Hoggart in England in the 1950s illustrates the uniqueness of the attitudes, the habits and the representations of the lower classes. Hoggart underlines the permanence of a strong identity among members of this milieu and their heightened awareness of the social distance existing between themselves and other social classes. The opposition between ‘us’ and ‘them’ lies at the heart of their entire social representation.

The world of the ‘others’ is a vague congregation of employers of the private sector and civil servants. However, such a feeling of membership does not necessarily lead to the consciousness that there exists an inevitable conflict. Moreover, class consciousness is limited essentially to those who work in industry. One might add that studies which adopt the perspective of social stratification bring nuances to this notion of identity. The works by W. L. Warner, published in the USA in 1949, suggest that the social structure differentiates between multiple levels of social status. Three of them—the lower-middle class, the upper-lower class, and the lower-lower class—together make up the working class. To this is added the distinction between ethnic groups, characterized by differing feelings of membership.

The principle of the opposition to, and the designation of, an adversarial class may be regarded both from the perspective of representation and from that of action. Many studies indicate that it is not among the most disadvantaged workers that one finds the most entrenched adversarial stances. Material difficulties, unemployment, insecurity, and lack of qualification generate a consciousness of the opposition between ‘us’ and ‘them,’ but not a polarization against an adversarial class. It is mainly in professions such as mining and among dock workers, where attachment to one’s job and loyalty between members of the working group are very strong, that one finds the most adamant expressions of general opposition to authority (Touraine 1966). In industry, among skilled
workers, attitudes are less radical. In their research, J. H. Goldthorpe and his collaborators (1969) formulated the general hypothesis that prosperous ‘affluent’ workers, who have a more functional relationship to their work, seek to obtain job stability and a higher salary, and are more interested in consumption than conflict. Patterns of strike participation confirm the main points of this thesis. In industry, strikes have wide participation, while in the service sector (banking, administration), where the organizing unions are concerned more with obtaining satisfaction for particular claims (raises, better working conditions, shorter working hours), they tend to result in negotiations.

The theory of class consciousness presupposes that the working class, due to its subordinate position, will eventually come to have a global image of society, as opposed to those who own wealth and whose horizons are limited by their personal interests (Lukács 1971). Studies of the ‘images’ which workers hold of the social system help to shed light on such assumptions (Bulmer 1975). These studies show that workers’ representations are not as homogeneous as Marxist theory might lead one to assume. Depending on whether the people questioned interpret the social system in terms of power, of prestige, or of property, they are more or less likely to perceive the social system in terms of conflict. The most adversarial perception of society is expressed by those whose perceptions are linked to power. Yet such a representation is not the most frequent one. W. L. Warner’s studies of US workers already suggest that differences exist depending on the type of workplace, the degree of urbanization, and the level of qualification. The attitude of the ‘affluent workers,’ because they tend toward representations related to possession, suggests the generalization of an image of society in which the majority is made up of a middle class, subdivided according to wealth, income, and consumption habits.

These studies cast doubt upon the claim that the working class is driven by a will for political change and by a capacity to support an alternative program. During the 1960s, these observations led to the proposal that such a political potential for change was no longer to be found among unqualified workers, but instead in a ‘new working class’ made up of technicians and more qualified workers from cutting-edge industries such as computers, communications, and petrochemical industries (Mallet 1963). This idea, which places emphasis on new skills, has not been confirmed. It assumes that these highly qualified technicians and workers will call upon their unions to adopt radical political stances. Yet the low level of union membership among workers in general, and in particular among such technicians, tends to indicate that they have different types of demands. They ask their unions to negotiate to safeguard their own interests, while expecting political parties to defend national interests. In this sense they reproduce the traditional divisions found in public opinion rather than following the model of political awareness supposedly characteristic of the working class.

See also: Bourgeoisie/Middle Classes, History of; Class: Social; Labor Movements, History of; Marx, Karl (1818–89); Mobility: Social; Poverty, Culture of; Social Class and Gender; Social Mobility, History of; Social Movements, Sociology of; Social Stratification; Working Classes, History of

Bibliography

Bulmer M (ed.) 1975 Working Class Images of Society. Routledge & Kegan Paul, London
Dahrendorf R 1959 Class and Class Conflict in Industrial Society. Routledge & Kegan Paul, London
Goldthorpe J H, Lockwood D, Bechhofer F, Platt J 1969 The Affluent Worker in the Class Structure. Cambridge University Press, Cambridge, UK
Hoggart R 1957 The Uses of Literacy. Aspects of Working-Class Life with Special Reference to Publications and Entertainments. Chatto & Windus, London
Lockwood D 1958 The Blackcoated Worker. A Study in Class Consciousness. George Allen and Unwin, London
Lukács G 1971 [trans. Livingstone R] History and Class Consciousness: Studies in Marxist Dialectics. Merlin Press, London
Mallet S 1963 La nouvelle classe ouvrière. Editions du Seuil, Paris
Mann M 1973 Consciousness and Action among the Western Working Class. Macmillan, London
Thompson E P 1963 The Making of the English Working Class. Gollancz, London
Touraine A 1966 La conscience ouvrière. Editions du Seuil, Paris
Warner W L, Meeker M, Eells K 1960 Social Class in America. A Manual of Procedure for the Measurement of Social Status. Harper & Row, New York

P. Ansart

Class: Social

Class is a key concept in sociological theory, but its precise meaning and definition are highly contested. It is also a core explanatory variable in much empirical research, yet there is enormous diversity in the ways in which it is operationalized and measured. Most sociologists agree that social class refers to how people make a living, and that there are relatively stable patterns of inequality between different classes, but there the consensus ends. Indeed, some sociologists now argue that the concept of social class has outlived its usefulness and should be abandoned altogether.

1. Marx’s Class Theory

The concept of ‘class’ was central to Marx’s theory of how societies are constituted and how they change,


although Marx himself never produced a definitive statement of how class was to be conceptualized. The treatment of class in his different works is not always consistent, but five basic themes emerge.

First, although different classes generally have different levels of income and different life-styles, it is not their income or life-style per se that distinguishes them. The crucial determinant of class is ownership (or nonownership) of productive property. Because people either own, or do not own, the basic means of production in a society, it follows that there can be only two principal classes in any mode of production. Under capitalism, these are the bourgeoisie and the proletariat.

Second, class is not a ‘position’ which we occupy in society, but a relationship which one group in society has to another. This relationship is one of exploitation, for those who own the means of production add to their wealth by using the labor power of those who own nothing. Because class relations are always exploitative, they are always antagonistic. Class struggle is an inherent feature of all class societies, and this perpetual battle between classes is the motor driving human history.

Third, class is more than just a concept developed by social scientists to help them make sense of the world. Classes exist, and they have real effects. Even where the members of a given class do not recognize their class identity (a situation which Engels labeled ‘false consciousness’), class is still an objective reality shaping their lives.

Fourth, class struggle is a feature of every society since the development of settled agriculture. Even where divisions seem to reflect factors other than property ownership (e.g., in caste systems where status at birth counts more than wealth), the ‘real’ force structuring social relations is still class. Class relations shape every aspect of social life. Politics, law, art, and philosophy are the ‘superstructural’ expressions of a more ‘basic’ relation between owners and nonowners of the means of production. Class relations impose their stamp on every aspect of life.

Fifth, Marx claimed that the class relation between bourgeoisie and proletariat was becoming increasingly sharp. Capitalist societies were polarizing between a small number of large capitalists and a large number of propertyless proletarians. He predicted that the condition of the proletarians would become increasingly miserable, and that sooner or later, there would be a revolution which would replace the capitalist system with a socialist one in which there would be no classes because all productive property would be owned in common.

2. The Marxist Tradition of Class Analysis

Later Marxists have wrestled with some major problems left unresolved by Marx’s analysis.

2.1 How Many Classes?

One of the most intractable problems arises out of Marx’s insistence that class is structured around the relationship between owners and nonowners of the means of production. Logically, this implies that there are only ever two classes, yet in Capital he identifies three classes in modern capitalism (landowners, wage laborers, and capitalists), and, analyzing the 1848 events in Paris, he finds as many as nine (proletarians, financiers, industrialists, the middle class, the petty bourgeoisie, the lumpenproletariat, intellectuals, the clergy, and the peasantry).

Later Marxists have sought to resolve this confusion in two ways. First, they have distinguished ‘classes’ and ‘class fractions.’ Owners of capital, for example, are a single class, but different ‘fractions’ of this class squabble when their interests diverge. Thus, landowners seek to extract high rents, but this conflicts with the interests of industrialists who want to keep their overhead costs down. Industrialists and landowners share a common class interest in the maintenance of the capitalist system, but their different interests in respect of profits and rents create different fractions within the bourgeoisie.

Second, Marxist theorists have distinguished between a pure ‘mode of production’ (where there are only ever two classes) and actual ‘social formations’ (which may contain elements of more than one mode of production, and thus more than two classes). For example, France and the UK are structured around a capitalist mode of production (with two classes, bourgeoisie and proletariat), but they also contain elements of an earlier, feudal, mode of production (the surviving elements of a peasant class in France, and an aristocratic class in the UK).

2.2 Theorizing the Middle Class

A common feature of modern capitalism in all parts of the world is the growth of managerial, administrative, and professional occupations. This ‘middle class’ cannot easily be explained as either a ‘fraction’ of one of the two ‘main’ classes, or a remnant of an earlier mode of production. Braverman (1974) suggested that large sections of it are ‘really’ part of the proletariat because they are increasingly subject to typically ‘proletarian’ conditions of labor such as job insecurity and deskilling, but this interpretation has been challenged, and higher up the hierarchy it is clear that managers and professionals are in quite a different situation from manual workers as regards remuneration, security of employment, career prospects, and workplace autonomy.

Recognition of these differences has resulted in various attempts to develop Marx’s theory to take account of the distinctiveness of the ‘new middle class.’ Carchedi (1975) theorized it as employees who simultaneously contribute to the functions of ‘collective worker’ (e.g., by coordinating the labor process) and of capital (e.g., by controlling their fellow workers). Poulantzas (1975) distinguished it on ‘political’ (supervisory functions) and ‘ideological’ (mental labor), as well as ‘economic’ (nonproductive activity) criteria. And Wright (1985), while emphasizing the division between owners and nonowners of capital assets as fundamental, differentiated nonowners according to their ability to ‘exploit’ skills and organizational assets (a class of ‘expert managers,’ for example, exploits both skills and organizational assets, while proletarians can exploit neither).

All of these formulations have been influential. Their weakness, however, is their complexity (Wright’s schema, for example, generates a total of 12 different classes). The more Marxist theorists have tried to develop a descriptively adequate account of the contemporary class structure, the more they have had to sacrifice the theoretical sharpness of Marx’s original approach.

3. Weber’s Concept of Class

Weber contradicts almost every element in Marx’s approach. Where Marx insists that classes arise out of the organization of production, Weber treats them as distributional categories. For him, ‘economic classes’ can be identified wherever individuals share a common market situation, either in property markets (where those who are ‘positively privileged’ can live off revenues from assets) or in labor markets (where those who are ‘positively privileged’ can command high salaries in return for their scarce skills and qualifications). ‘Social classes’ are simply clusters of economic classes between which inter- or intragenerational mobility is ‘easy and typical.’

In this approach, social classes are not defined in relation to each other, as they are for Marx, but are mapped positionally in a hierarchy according to their market capacity. Weber therefore has no problem identifying a ‘middle class’; indeed, logically, there are two middle classes (a petty bourgeoisie of small property owners with a relatively weak labor-market position, and an intelligentsia who own few assets but who command high returns in the labor market because of their education and training). They sit between an upper class (positively privileged in property and skills) and a working class (negatively privileged on both, and therefore relatively poorly remunerated).

‘Class’ is not ‘real’ for Weber as it is for Marx; it is simply an analytical construct, a label to refer to clusters of individuals. Sometimes people express a common class identity and act accordingly, but a failure to think and act in class terms does not constitute a ‘false consciousness’ as it does for Engels.

It is also clear that Weber does not see class as pervasive and enduring in the way that Marx does. For him, class (market power) is only one dimension of power in society. In the modern period, it is often the most important dimension, but in other periods, the ‘social power’ of status groups (such as the old European aristocracy or the high castes in India), or the ‘political power’ of parties, mobilizing to turn state authority to their own advantage, has outweighed that of classes. Even today, status and political power can cut across class lines. There is no assumption in Weber, as there is in Marx, that the dominant economic classes are also the dominant social and political force in any society.

4. The Weberian Tradition

4.1 Class Location and Social Action

Weber’s approach to class analysis was essentially concerned with classification: social classes are ideal types. This has made things easier for later theorists, for unlike Marx’s approach, Weber’s work can readily be adapted and developed to take account of changed conditions (such as the growth and increased complexity of the middle classes).

What is lacking in Weber’s approach, however, is a causal theory. For Marx, classes act: they drive history through struggle. For Weber, however, classes are simply categories into which people can be classified, and the role played by these categories in the explanation of social phenomena is left open. Class may help explain some of the things that people do, but it may be irrelevant to others; the usefulness of the concept is left to empirical research to determine.

This is both a strength and a weakness of the Weberian legacy. Its strength is that it has enabled the development of class typologies which appear both valid (they adequately capture some of the key sources of differentiation in contemporary societies) and reliable (they predict fairly accurately different patterns of social behavior and attitudes). Its weakness is that Weberian class analysis has often been limited to mapping differences between classes rather than explaining them. It generates useful depictions of the ‘class structure’ but seems to lack a theory of ‘class action’ (why and how a person’s location in the structure influences social action).

4.2 Work, Market, and Status Situations

One influential attempt to link people’s class position to their attitudes and behavior was David Lockwood’s (1958) study of clerical workers in Britain. He defined class position by people’s ‘work situation’ (i.e., the


degree of autonomy they enjoy and the authority invested in them or exercised over them) and ‘status situation’ (i.e., their occupational prestige) as well as their ‘market situation’ (i.e., their income, job security, and career prospects), and he showed that on all three criteria, clerical employees differed radically from manual workers. These differences were then reflected in their behaviors and attitudes, such as willingness to join a trade union.

Lockwood and his colleagues utilized much the same approach in a later study of the British working class (Goldthorpe et al. 1969). The research refuted the embourgeoisement thesis (the idea that affluence was blurring the boundary between the working class and the middle class) by identifying clear differences in the work, market, and status situations of well-paid manual workers as compared with white-collar workers. These differences were then held to explain the different patterns of consciousness found in each group. Lockwood (1966) also analyzed differences within the working class, arguing that different work and ‘community’ situations could account for the different ‘images of society’ characteristic of traditional proletarian workers (e.g., miners), traditional deferential workers (e.g., farm laborers), and privatized workers (such as those employed in modern, high-wage industries).

Since then, John Goldthorpe (1987) has used differences in the ‘work situation’ (authority relations) and ‘market situation’ (income, security, and prospects) of different occupational groups to develop a new, 11-category model of the class structure. Developed initially as a framework for the study of social mobility in Britain, this typology became the basis for a major international study of mobility rates (Erikson and Goldthorpe 1992), and it has also been used to investigate issues such as the class basis of voting behavior (Heath et al. 1985). In comparison with other schemas, it has been found to have a stronger predictive power and clearer internal consistency (Marshall et al. 1988, Breen and Rottman 1995), and it has won widespread acceptance in much of Western sociology.

4.3 Class Structuration and Social Closure

Lockwood’s concern with linking people’s experience, at work and in the locality, with their sense of class identity was later elaborated in Giddens’s theory of class structuration. He explicitly addressed the problem of how a shared economic location becomes important for collective social action.

Weber’s answer to this question emphasized patterns of social mobility and closure: social classes are clusters of individuals occupying market situations between which mobility is readily possible. Giddens (1973) refers to this as the ‘mediate’ structuration of social classes. He argues that market situations in modern capitalism typically cluster into three categories based around ownership of property, possession of qualifications, and possession of manual labor power, and that movement between these three is typically limited. These are the three social classes of the modern period, but Giddens then goes on to identify the factors (‘proximate structuration’) which keep them apart and help promote distinctive class identities. Here he echoes Lockwood’s work by emphasizing the importance of differences in the workplace (e.g., in authority relations) and in the locality (e.g., residential segregation). He argues that proximate structuration always leads to some degree of ‘class awareness,’ though not necessarily to ‘class consciousness’ in a Marxist sense.

The problem of linking the ‘class position’ that people occupy to their values, beliefs, and actions was tackled very differently by Frank Parkin (1979), who resurrected Weber’s neglected concept of ‘social closure.’ Closure refers to the way groups try to improve or maintain their privileges by restricting the access of others, and Parkin identified two main strategies. ‘Exclusion’ operates downwards and involves protection of privileges by dominant groups; ‘usurpation’ operates upwards and represents the attempt by subordinate groups to claim more privileges for themselves.

Exclusion typically seeks to defend privileges associated with property rights or qualifications (e.g., professional closure). Parkin defines the dominant class as those whose resources (revenues or salaries/fees) derive mainly from the exercise of exclusion. Usurpation, by contrast, is typified by forms of solidaristic action such as trade-union organization, and it is the defining feature of the subordinate class. Those in white-collar occupations, who frequently claim rewards on the basis of their individual qualifications (exclusion) and through membership of the organized labor movement (usurpation), constitute an ‘intermediate class’ between these two.

Defining classes by their action (mode of closure), rather than by their location in a structure of positions, Parkin claims to have sidestepped the recurring problem of how to link class positions to behavior. In his view, the ‘class structure’ derives from collective forms of action, rather than the other way around, and there is no structure of positions independent of the way people act.

5. Measurement of Class

Theoretical debates over whether and how to divide the population into discrete class categories are reflected in the different ways the class concept has been operationalized in empirical research.

5.1 Class as a System of Categories

Class has often been operationalized in quite crude and untheorized ways. Many studies have simply


distinguished ‘manual’ and ‘nonmanual’ occupational groups, and some have used the marketing industry’s sixfold system of classification based on spending power, but neither of these approaches corresponds to any sociological concept of class. In the UK, research has often used the government’s five (later six) class schema, but the criteria underpinning this system shifted over time from occupational prestige to skill levels, creating confusion over what the classes designate.

In recent years, there has been a growing consensus around the use of systems of classification based on the Goldthorpe schema, and even the UK government’s system of classification has now been revised to bring it more into line with the neo-Weberian emphasis on work and market situation (Rose and O’Reilly 1997). However, there is still intense disagreement among researchers over whether it makes sense to measure class in terms of categories at all, irrespective of how they are derived.

5.2 Class as a Continuous Scale

In the USA, there is a long tradition of measuring class differences on a continuous scale rather than a categorical system of classification. The best known example is the 96-point scale developed by Blau and Duncan (1967) for their study of social mobility. They justified using a continuous scale rather than discrete categories on the grounds that there are no clear cutoff points between occupations on any of the pertinent criteria that might differentiate them. In other words, classes shade off into one another.

Their approach attracted widespread criticism. Despite reporting high correlations ( 0.9) between occupational prestige and income, and education levels, they were accused of measuring status differences rather than differences in market power, and they were attacked for their ‘conservative’ assumption that there is a consensus over the prevailing system of occupational rewards (Horan 1978). Partly in order to counter such criticisms, Stewart et al. (1980) developed a scale which ranks occupations according to the ‘social distance’ that separates them, and which does not therefore depend on any assumption of value consensus over the worth of different positions. In practice, this scale looks very similar to those based on prestige rankings and correlates highly with them, which suggests that the problem of ‘ideological contamination’ of occupational prestige scales has probably been exaggerated.

The advantage of continuous scales over categorical schemas (such as Goldthorpe’s) is that they avoid the problem of drawing artificial class ‘boundaries’ (Kelley 1990). There is, however, no reason why we should not use both in empirical research.

6. The Future of Class Analysis

Controversy continues today, not only over how to measure class, but also over whether the concept remains useful.

6.1 Women and Class

Feminists have criticized the failure of class analysis adequately to encompass the position of women. Like students, retired people, and the unemployed, married women who do not have full-time jobs are usually classified according to the occupation of their husbands, and even women who do have jobs may be allocated to their male partner’s social class where that is ‘higher’ than theirs. Goldthorpe defends this on the grounds that the ‘life chances’ of a lower-class woman married to a higher-class man are shaped more by his market situation than by hers, but this issue still generates considerable disagreement. Very different pictures of the class structure emerge depending on whether women are allocated to their own, or to their husband’s, class location.

6.2 Nonclass Identities

Feminists have also criticized class analysis for the assumption that ‘class location’ is more important than gender in determining life chances and influencing values and behavior. Their argument is complemented by those (notably in the USA) who emphasize the primary importance of race and ethnicity rather than class. There has also been some debate over whether people’s ‘consumption location’ (e.g., as home owners or renters) outweighs their class location, and postmodernists insist that clear class divisions have now fragmented into a multitude of different interests and identities.

Class analysts have tended to respond to such criticisms by appealing to evidence that class identity is still primary (Marshall et al. 1988). They point out that class effects may be crosscut by gender, race, or other identities without class itself losing its significance.

6.3 The Relevance of Class

Some critics (Pahl 1989, Clark and Lipset 1991) argue that class is a concept that has outlived its usefulness. People no longer think of themselves in class terms, movement between class locations is common, boundaries between classes have blurred, and class analysis has failed to explain the causal link between class locations and social outcomes.

Responding to this, Breen and Rottman (1995) accept that class is a weak source of identity for many


people, but they argue that people’s ‘objective’ class location is still crucial in explaining many aspects of their lives. Recent work on class differences in health and morbidity (Wilkinson 1996) is one striking example. It is probably fair to conclude that class does still correlate significantly with many of the phenomena studied in social science, but we often lack a clear explanation of the mechanism which translates ‘class position’ into social outcomes.

See also: Class Consciousness; Consumption, Sociology of; Equality of Opportunity; Income Distribution; Inequality; Inequality: Comparative Aspects; Marx, Karl (1818–89); Mobility: Social; Social Class and Gender; Social Mobility, History of; Status and Role: Structural Aspects; Underclass; Wealth Distribution; Weber, Max (1864–1920)

Bibliography

Blau P M, Duncan O D 1967 The American Occupational Structure. Wiley, New York
Braverman H 1974 Labor and Monopoly Capital. Monthly Review Press, New York
Breen R, Rottman D B 1995 Class Stratification: Comparative Perspective. Harvester Wheatsheaf, New York
Carchedi G 1975 Economic identification of the new middle class. Economy and Society 4: 1–86
Clark T N, Lipset S M 1991 Are social classes dying? International Sociology 6: 397–410
Erikson R, Goldthorpe J H 1992 The Constant Flux: A Study of Class Mobility in Industrial Societies. Clarendon Press, Oxford, UK
Giddens A 1973 The Class Structure of the Advanced Societies. Hutchinson, London
Goldthorpe J H 1987 Social Mobility and Class Structure in Modern Britain, 2nd edn. Clarendon Press, Oxford, UK
Goldthorpe J, Lockwood D, Bechhofer F, Platt J 1969 The Affluent Worker in the Class Structure. Cambridge University Press, London
Heath A, Jowell R, Curtice J 1985 How Britain Votes, 1st edn. Pergamon Press, Oxford, UK
Horan P M 1978 Is status attainment research atheoretical? American Sociological Review 43: 534–41
Kelley J 1990 The failure of a paradigm: Log-linear models of social mobility. In: Clark J, Modgil C, Modgil S (eds.) John H Goldthorpe: Consensus and Controversy. Falmer Press, London
Lockwood D 1958 The Blackcoated Worker. Allen and Unwin, London
Lockwood D 1966 Sources of variation in working class images of society. Sociological Review 14: 249–67
Marshall G, Newby H, Rose D, Vogler C 1988 Social Class in Modern Britain. Unwin Hyman, London
Marx K, Engels F 1968 Collected Works in One Volume. Lawrence and Wishart, London
Pahl R E 1989 Is the emperor naked? International Journal of Urban and Regional Research 13: 709–20
Parkin F 1979 Marxism and Class Theory: A Bourgeois Critique. Columbia University Press, New York
Poulantzas N 1975 Classes in Contemporary Capitalism. New Left Books, London
Rose D, O’Reilly K 1997 Constructing Classes: Towards a New Social Classification for the UK. ESRC and ONS, Swindon, UK
Stewart A, Prandy K, Blackburn R M 1980 Social Stratification and Occupations. Macmillan, London
Weber M 1968 Economy and Society. Bedminster Press, New York
Wilkinson R 1996 Unhealthy Societies: The Afflictions of Inequality. Routledge, London
Wright E 1985 Classes. Verso, London

P. Saunders

Classical Archaeology

The notion of ‘classical archaeology’ is of relatively recent origin in the vocabulary of the humanities. Its common usage dates to the second half of the nineteenth century, when archaeology, a newly formed university discipline, sought its place within the broader intellectual framework of the sciences of antiquity, the Altertumswissenschaft of German scholarship. For knowledge of antiquity to be scientific, it had to encompass all the creations of human genius. Thus, just as philology had emancipated itself from theology, so was archaeology to free itself from philology in order to constitute, together with the latter and with ancient history, the third pillar of Altertumswissenschaft. This disciplinary triangle effectively constitutes the foundations of the modern conception of the classical past, that is, the past of the Greco-Roman world.

1. The Definition of Classical Archaeology

Such a definition of classicism is quite evidently arbitrary. It rests on a very specific and thoroughly European experience of antiquity, in which Greco-Roman literature and its concepts are overwhelmingly privileged. In specifying the chronological and geographical boundaries of the classical world, nineteenth-century historians and archaeologists have confirmed a spatio-temporal distinction already initiated during antiquity, and notably in the Alexandrian golden age of the third and second centuries BC. Following this definition, the classical world effectively begins with the dispersal of the Greeks in the Mediterranean during the eighth century BC, and ends with the fall of the occidental Roman empire in 476 AD. In terms of its geographical extension, the classical world reaches as far as the farthest expansion of the Roman empire, at the end of the first century AD.

As can be seen, this mode of historical representation clearly focuses on the Mediterranean as the zone of contact between Europe, Africa and Asia. But where does this Mediterranean space actually end? Is


it somewhere in the vast European plains? On the reinventing the contemporary world. Collections of
shores of the Black Sea? At the foot of the mountain Greek and Roman manuscripts aroused passions,
ranges bordering Anatolia or Africa? According to ancient coins were coveted, monuments began to be
their readings of both the classical sources and the excavated for the sculptures and architectural remains
surrounding landscapes, experts have drawn different they may have contained. The past that was sought
boundaries to the classical world. Given this diversity after and revived was that of Greece and Rome, if only
of opinions, German archaeologists have proposed because a reference to this world was seen as a means
the notion of Randkulturen to designate all those to do away with a present seen as barbarous, illiterate
civilizations known to have been in contact with the and indeed ‘gothic.’
classical world without fully merging with it: For Renaissance scholars, antiquity was quinte-
Scythians, Parthians, Germans, Phoenicians, Egypt- ssentially Greco-Roman: the literary works, sculptures
ians, and many others. and architecture of these periods were seen as unsur-
Chronology is not much better established. When passable productions, to be studied and emulated by
the Archaic epoch was first identified, archaeologists the best contemporary artists. This classical focus
knew nothing of the Minoan world and the palace dominated scholarly and artistic production until the
civilizations of Greek antiquity, civilizations which end of the eighteenth century, and it explains why
later on would end up being called preclassical. At the ‘barbarian’ antiquities received only marginal interest.
other end of the scale, the date commonly chosen to The Renaissance has thus contributed to conflating
signal the conclusion of Greco-Roman history—the the knowledge of the past with the investigation and
fall of the occidental Roman empire in 476 AD— understanding of the Greco-Roman heritage—the
proves to be equally problematic: it takes no account part of universal culture believed to be ‘classic’ by
of the persistence of the oriental Roman empire until virtue of being the veritable pedestal of humanist
the fall of Constantinople in 1453. scholarship. There were some scholars who manifested
considerable interest in the oriental world, Egypt or
even Mesopotamia, and others, particularly in Scandi-
navia, Britain and Germany, who paid attention to
2. The History of Classical Archaeology local antiquities, but the model for the organization of
knowledge derived directly from the prestigious
With its uncertain chronological and geographical Greco-Roman tradition.
boundaries (a problem admittedly shared with proto- Renaissance scholars and their Enlightenment suc-
history), classical archaeology stands somewhat apart cessors assembled together coins, inscriptions, sculp-
from other, more global, strands of archaeology. It is tures, and sometimes even ceramics or other objects of
both the oldest among these archaeologies, and the daily life. These collections were displayed in ‘cabinets
one most influenced by tradition: the ancient Greek of curiosity’ and eventually came to adorn the first
word archaiologia retains within it all its meaning as a public museums, such as the Venetian collections and
discourse on antiquity. It was, after all, in fifth century the Ashmolean Museum in Oxford (from 1683).
Greece that the historical genre first appeared, the However, since no careful observations were made
reasoned discourse on the past to which the Greeks regarding the place of discovery and the mode of
gave the name of historia (enquiry). And since Varro regarding the place of discovery and the mode of
in the first century BC, the title of antiquary has been it was only with the greatest difficulty that scholars
given to the man who seeks to interpret and classify were able to classify them in terms of their dates and
ancient objects or monuments. In Greece as well as provenance. In fact, it was only the antiquarians of the
Rome, these antiquaries contributed to elucidating the eighteenth century who were first able to lay the
past; they assembled series of objects and monuments, grounds of a critical analysis of archaeological monu-
collected inscriptions, and then put them in order and ments.
sought to understand them. These advances occurred at different paces, and
In this respect, the collapse of the Occidental Roman they were due both to discoveries in the field and to
Empire was not without consequence, for with it a new attitudes towards finds and their study. In this
whole social class of knowledge producers and users respect, the fortuitous discovery of the buried cities of
came to disappear. Secular men of letters were Pompeii and Herculaneum at the beginning of the
gradually replaced by clerics, whose function was to eighteenth century did much to awaken the interest of
educate the masses and the elite in the Christian faith. the elite in the Greco-Roman world. What was being
To be sure, there were among these clerics some unearthed there by the excavating teams of the King of
antiquarians who undertook to collect and investigate Naples were no longer usual ruins, but entire cities,
the monuments of antiquity. But one has to wait until covered by volcanic eruptions and thus miraculously
the end of the Middle Ages to find an antiquarian preserved from the ravages of destruction. Closely
movement comparable to that of Greco-Roman times. guarded by the Neapolitan government, the explo-
It was in Renaissance Italy that a new passion for ration of Pompeii gave rise to an unprecedented craze
antiquity emanated which was also a means for for the painting, architecture and implements of daily


life of the time. Given the context of their discoveries, rigorous analysis. While the antiquarian channeled his
these items could also for the first time be attributed a passion for the past to the collection of objects, the
secure chronological position. Naples thus played the archaeologist aimed to valorise these objects and
role of an antiquarian capital, and gradually came to monuments in the context of their discovery. This
compete with Rome in the esteem of travelers and approach found its most systematic expression in
connoisseurs. Germany with the research programme of E. Gerhard
Without interpretation, however, discovery is noth- (1795–1867), who effectively fought for an auton-
ing. In the second part of the eighteenth century, two omous archaeology, free from collectors, philologists
men revolutionized the understanding of antiquity. and artists. Archaeologists must take the place of
The first of these was Johann Joachim Winckelmann (1717– connoisseurs with their various intermediaries, and
68), a German scholar from a small Prussian town themselves seek for the evidence on the ground.
who achieved tremendous influence with his work. His They must also distance themselves from philolo-
History of Art in Antiquity, published in Dresden in gists, and challenge the primacy of written over
1764, offered to amateurs of Greco-Roman art the first nonwritten sources. Finally, they must take leave of
ever systematic and chronological treatise on the aesthetic considerations, and study ancient pro-
subject. Aided by a thorough familiarity with textual ductions in all their material and technical dimensions.
sources, Winckelmann depicted and commented on Building on advances in historical studies, this
antique objects scattered throughout European col- positivist program also drew on developments in
lections. Drawing on his personal study of the most geology and natural history to forge for itself its
important Roman collections, he could propose to his specific scientific instruments.
readers a remarkable history of plastic forms in Three complementary pillars came then to form
antiquity. The style of this work, the quality of its the backbone of archaeology. One, stratigraphy,
descriptions, and the philosophical spirit which ani- consists of observing the conditions of deposition of
mated it, all explain its unprecedented success: with it, objects and monuments in the ground. The second,
the occident could at last discover the sources of typology, seeks to identify in the objects themselves
Renaissance art in the classical world. Winckelmann, some evidence regarding the time and place of their
otherwise an enemy of the aristocracy and determined manufacture. The third, technology, deals with the
critic of the ancien régime, received his highest modes of production, the raw materials, the pro-
acclaim from the courts of kings and princes, among cedures of fabrication. It was in fact only by the end of
those who sought antiquity for the aesthetic pleasure it the nineteenth century, when these three strands were
procured them. mastered and marshaled together, that archaeology
One of the most notable aristocrats in the court of became a fully-fledged discipline able to undertake the
France, the Comte de Caylus (1692–1765), was in- reliable and methodical investigation of the past. The
strumental in drawing the attention of his con- body of doctrines and institutions which accompanied
temporaries to another aspect of the study of the past. this development contributed to giving archaeology,
For Caylus, what justified the quest for antiquity was not and specifically classical archaeology, its distinctive
the aesthetic, but the technical achievements of the identity. In this respect E. Gerhard created a novel
past. He therefore developed an approach to the study kind of institution which prefigures in many ways
of past techniques in which can be recognized our those of today: the Instituto di corrispondenza archeo-
current concerns with artifact morphology and logica, established in 1829 in Rome through the
archaeometry. Each in his way, Winckelmann for the activities of aristocrats and scholars from across
appreciation of art and Caylus for the understanding Europe, took it as its aim to collect, excavate and
of techniques, laid down the bases for a history of the publish as thoroughly as possible all the antiquities of
oeuvres of antiquity, a research programme developed the Mediterranean world.
and followed throughout the nineteenth century. By drawing archaeologists to the field, by elab-
orating a modern strategy of publication based on
detailed sections and descriptions, the Instituto vir-
tually launched the new discipline that was embodied
3. The Rules of Classical Archaeology in this dedicated research institution located in Rome,
one of the capitals of the classical world.
The exploration of antiquity thus gradually trans- Until the middle of the nineteenth century, the
formed itself into a new discipline of archaeology. exploration of the Greco-Roman world had been a
Under this framework, nineteenth century scientists matter of individual enterprise with occasional royal
elaborated and imposed new rules for the extraction, support. From then on, archaeology would become
analysis and publication of evidence from the past. what the discoverer of Greek Asia Minor T. Wiegand
Archaeology distanced itself from collections and (1864–1936) has called a Grosswissenschaft, a state-
from the social milieu of collecting, and advocated sponsored science. The leading European powers
instead the necessity of observing finds in the field, and established then various institutes or schools in Rome
of recognizing them as coherent wholes open to and Athens, and these encouraged archaeological


expeditions throughout the main urban sites of the of Eurocentrism. The Greek and Roman worlds are
Mediterranean. The enormous success encountered by now understood as melting-pots rather than hoch-
the private initiatives of the German businessman kulturen. Under the influence of economic history,
Heinrich Schliemann (1822–90) at Mycenae and Troy scholarly interest has shifted from a history of art in
fueled a veritable scramble for new excavating terri- the narrow sense towards a wider history of pro-
tories among the countries of Europe and even ductions. The better chronological grasp allowed by
America. This competition contributed notably to the typological studies has served to advance studies on
growth of collections in the main European and colonization, on long distance contacts and trade.
American museums. The rise of such museums as the Studies of the relations between centre and periphery,
British Museum, the Louvre, the Berlin museums and between Greeks, Romans and ‘barbarians’ have bene-
even the Metropolitan Museum in New York, all fited from developments in the field of protohistory. It
catering for an ever-increasing public, corresponded may indeed be said that the research program of
with the ascendancy of a classical archaeology closely classical archaeology has undergone a wide-reaching
entwined with the economic developments of capi- redefinition during the 1960s.
talism and colonialism. The conquest of the past was No longer considered as the elder of the sciences of
effectively an instrument of foreign policy in the hands antiquity, it is now one discipline among others,
of all main powers, beginning with the core area of the sharing resources and addressing issues similar to
Greco-Roman world and reaching towards the Orient, other strands of archaeology. In this context, such
Africa, and the globe as a whole. movements of populations and the exchanges of goods
Archaeological schools, museums and, of course, movements of populations and the exchanges of goods
universities were all part of this expansion, from which and ideas, all open up new perspectives which con-
they have benefited. Whereas in the mid nineteenth tribute to extricating classical archaeology from its
century only Germany was endowed with a network of somewhat marginal position.
university Chairs in classical archaeology, by 1914 all At the same time, advances in the iconography and
European nations as well as the United States had the sociology of representations make it possible to
established university curricula in classical archae- cast a new light and reach a more satisfying under-
ology. This three-pronged movement, involving the standing of otherwise well-known bodies of evidence.
development of museum collections, the systematic Thus, building on the strength of its centuries-long
excavation of sites, and the setting of an academic traditions as a science dedicated to Ancient art,
theoretical and educational framework, effectively Classical archaeology is developing now into a
gave to classical archaeology its modern appearance. human science in the full sense of the term.
See also: Christianity Origins: Primitive and ‘Western’
History; Chronology, Stratigraphy, and Dating Meth-
4. The Paths of Classical Archaeology ods in Archaeology; Historiography and Historical
Thought: Classical Period (Especially Greece and
Set out during the second half of the nineteenth Rome)
century, this disciplinary model witnessed during the
first half of the twentieth century a considerable
expansion and consolidation in both qualitative and Bibliography
quantitative terms. Alongside a marked increase in the
number of excavations carried out, the main archaeo- Bérard C, Vernant J-P 1989 A City of Images. Iconography and
logical institutions followed the German model by Society in Ancient Greece. Princeton University Press, Prince-
systematically publishing catalogues of finds and ton, NJ
Borbein A, Hölscher T, Zanker P (eds.) 2000 Klassische
collections according to specific descriptive protocols.
Archäologie, eine Einführung. Dietrich Reimer Verlag, Berlin,
This increased and refined considerably the existing Germany
knowledge on artistic and artisanal productions: Elsner J A 1995 Art and the Roman Viewer, the Transformation of
sculptures, architecture, ceramics were all researched, Art from the Pagan World to Christianity. Cambridge Uni-
and so were numismatics and glyptics. Thus classical versity Press, Cambridge, UK
archaeology forged itself a vast corpus of systematic Goldhill S, Osborne R (eds.) 1994 Art and Text in Ancient Greek
reference, bearing equally on chronology, typology Culture. Cambridge University Press, Cambridge, UK
and technology. Marchand S L 1996 Down from Olympus. Archaeology and
Some criticisms and contradictions emerged, how- Philhellenism in Germany. Princeton University Press, Prince-
ton, NJ
ever, during the second half of the twentieth century.
Morris I (ed.) 1994 Classical Greece, Ancient Histories and
The implicit supremacy of Greco-Roman culture came Modern Archaeologies. Cambridge University Press, Cam-
under challenge; the comparative study of cultures bridge, UK
raised questions over the Hellenocentrism charac- Parslow C C 1998 Rediscovering Antiquity, Karl Weber and the
teristic of most Classical archaeologists, while ad- Excavations of Herculaneum, Pompeii and Stabiae. Cambridge
vances in the study of mentalities promoted a critique University Press, Cambridge, UK


Schnapp A 1997 The Discovery of the Past. Harry Abrams, New science of which continued to advance after behavior
York therapy began (see below).
Thomson de Grummond N (ed.) 1996 An Encyclopedia of the
History of Classical Archaeology. Greenwood Press, Westport,
CT
2. Behavioral Consequences of Classical
A. Schnapp Conditioning
The events in Pavlov’s experiment are often described
using terms designed to make the experiment ap-
plicable to any situation. The food is the ‘uncon-
ditional stimulus,’ or US, because it unconditionally
elicits salivation before the experiment begins. The bell
Classical Conditioning and Clinical is known as the ‘conditional stimulus,’ or CS, because
Psychology it only elicits the salivary response conditional on the
bell–food pairings. The new response to the bell is
Classical conditioning occurs when neutral stimuli are correspondingly called the ‘conditional response’
associated with a psychologically significant event. (CR), while the natural response to the food itself is
The main result is that the stimuli come to evoke a set the ‘unconditional response’ (UR).
of responses or emotions that may contribute to many Culture has created the impression that condition-
clinical disorders, including (but not limited to) anxiety ing is a rigid affair in which a fixed event comes to elicit
disorders and drug dependence. Research on con- a fixed response. In fact, conditioning is more complex
ditioning has uncovered many surprising details about and dynamic than that. For example, signals for food
the underlying learning process, as well as methods for may evoke a large set of responses that prepare the
eliminating emotional and behavioral problems. organism to digest food: They can elicit secretion of
gastric acid, pancreatic enzymes, and insulin in ad-
dition to the famous salivary response. The CS can
also elicit approach behavior, an increase in body
1. Historical Background temperature, and a state of arousal and excitement.
When a signal for food is presented to a quiescent,
Classical conditioning was first studied systematically food-replete animal, the animal may get up and eat
at the turn of the twentieth century by the Russian more food. Signals for food evoke a whole ‘behavior
physiologist, Ivan Pavlov (Pavlov 1927). In the usual system’ that is functionally organized to deal with the
description of his best-known experiment, Pavlov rang meal (Timberlake 1994).
a bell and then gave a dog some food. After a few Classical conditioning is also involved in other
pairings of bell and food, the dog began to salivate to aspects of eating. Through conditioning, humans may
the bell, and thus anticipated the presentation of food. learn to like or dislike different foods. In infrahuman
The classical conditioning phenomenon quickly animals, flavors associated with nutrients (sugars,
attracted psychologists who applied it to clinical starches, calories, proteins, or fats) come to be
issues. Before most of Pavlov’s work was available in preferred. Flavors associated with sweet tastes are also
English, John Watson showed that human emotions preferred, while flavors associated with bitter tastes
are also influenced by classical conditioning (Watson are avoided. At least as important, flavors associated
and Rayner 1920). Watson showed an infant boy a with illness become disliked, as illustrated by the
stimulus (which happened to be a laboratory rat) and person who gets sick drinking tequila and conse-
then made a frightening noise. After a few pairings of quently learns to hate the flavor. The fact that flavor
the rat and the noise, the child became afraid whenever CSs can be associated with such a range of biological
the rat was presented. This fear generalized to a rabbit, consequences (USs) is important for omnivorous
a dog, and a fur coat. Watson saw conditioning as a animals that need to learn about new foods. And it has
means by which emotions could be elicited by an clinical implications. For example, chemotherapy can
expanding range of cues. make cancer patients sick, and can therefore cause the
The application of conditioning to clinical issues conditioning of an aversion to a food that was eaten
became central to the behavior therapy movement recently (or to the clinic itself). And conditioning can
that began in the 1950s and 1960s (e.g., Wolpe 1958, enable external cues to trigger food consumption and
see also Behavior Therapy: Psychological Perspec- craving, a potential influence on overeating and
ties). The idea was that psychiatric disorders could be obesity.
understood and treated using scientifically established Classical conditioning also occurs when we ingest
principles of learning. The view that anxiety disorders drugs. Whenever a drug is taken, it constitutes a US,
result from classical conditioning was later criticized and it may be associated with potential CSs that are
(e.g., Rachman 1977). However, many of the criticisms present at the time (rooms, odors, injection rituals,
were directed at an obsolete view of conditioning, the etc.). CSs that are associated with drug USs can have


an interesting property: They often elicit a conditioned may potentiate the conditioned eyeblink response
response that seems opposite to the unconditional elicited by another CS or a startle response to a sudden
effect of the drug (Siegel 1989). For example, although noise. Once again, CSs do not merely elicit a simple
morphine causes a rat to feel less pain, a CS associated reflex, but evoke a complex and interactive set of
with morphine elicits an opposite increase, not a responses.
decrease, in pain sensitivity. Similarly, although al- Classical fear conditioning can contribute to
cohol can cause a drop in body temperature, a phobias (where specific objects may be associated with
conditioned response to a CS associated with alcohol a traumatic US) as well as other anxiety disorders. For
is typically an increase in body temperature. In these example, in panic disorder, people who have un-
cases, the conditioned response is said to be ‘com- expected panic attacks can become anxious about
pensatory’ because it counteracts the drug effect. having another one (see Panic Disorder). In this case,
Compensatory responses are another example of how the panic attack (the US or UR) may condition anxiety
classical conditioning helps organisms prepare for a to the external situation in which it occurs (e.g., a
biologically significant US. crowded bus) and also internal (‘interoceptive’) CSs
Compensatory conditioned responses have impli- created by early symptoms of the attack (e.g., dizziness
cations for drug abuse. First, they can cause drug or a sudden pounding of the heart). These CSs may
tolerance, in which repeated administration of a drug then come to evoke anxiety or panic responses. Panic
reduces its effectiveness. As a drug and a CS are disorder may begin because external cues associated
repeatedly paired, the compensatory response to the with panic can arouse anxiety, which may then
CS becomes stronger and more effective at counter- exacerbate the next unconditional panic attack and\or
acting the effect of the drug. The drug therefore has panic response elicited by an interoceptive CS (Bouton
less impact. One implication is that tolerance will be et al. 2001). Interestingly, the emotional reactions
lost if the drug is taken without being signaled by the elicited by CSs may not require conscious awareness
usual CS. Consistent with this idea, administering a for their occurrence or development. Indeed, fear
drug in a new environment can cause a loss of drug conditioning may be independent of conscious aware-
tolerance and make drug overdose more likely (see ness (e.g., LeDoux 1996).
Siegel 1989). A second implication stems from the fact In addition to eliciting conditioned responses, CSs
that compensatory responses may be unpleasant. A also motivate ongoing behavior. For example, pre-
CS associated with an opiate may elicit several senting a CS that elicits anxiety can increase the vigor
compensatory responses—it may cause the drug user of instrumental or operant behaviors that have been
to be more sensitive to pain, undergo a change in body learned to avoid or escape the frightening US. Thus,
temperature, and perhaps become hyperactive (the an individual with panic disorder will be more likely to
opposite of another unconditional morphine effect). express avoidance in the presence of anxiety cues.
The unpleasantness of these responses may motivate Similar effects may occur with CSs that predict other
the user to take the drug again to get rid of them. USs (such as drugs or food)—as already mentioned, a
Compensatory responses may often resemble with- drug-associated CS may motivate the drug abuser to
drawal effects (Siegel 1989). The idea is that the urge to take more drugs. The potential influence of classical
take drugs may be strongest in the presence of CSs that conditioning on behavior is thus extensive and ubiqui-
have been associated with the drug. The hypothesis is tous.
consistent with self-reports of abusers who, after a
period of abstinence, are tempted to take the drug
again when they are re-exposed to drug-associated 3. The Learning Process in Classical
cues. Conditioning
Classical conditioning is also involved in anxiety
disorders, as Watson originally envisioned. We now Modern research has revealed some important details
know that CSs associated with frightening USs can about the learning process that underlies classical
elicit a whole system of conditioned fear responses, conditioning (Rescorla 1988). For example, con-
again broadly designed to help the organism cope. In ditioning is not an inevitable consequence of pairing a
animals, cues associated with frightening events elicit a CS with a US. Such pairings will not cause con-
set of natural defensive reactions that have evolved to ditioning if there is a second CS present that already
prevent attack by predators. They also elicit changes predicts the US (Kamin 1969). This sort of finding
in respiration, heart rate, and blood pressure, and even (‘blocking’) suggests that a CS must provide new
a (compensatory) decrease in sensitivity to pain. Brief information about the US if learning is to occur. Many
CSs that occur close to the US in time can also elicit theorists now suppose that conditioning is determined
adaptively timed protective reflexes. For example, the by the discrepancy between (a) the US predicted by all
rabbit blinks to a brief signal that predicts a mild CSs present on a trial and (b) the US that actually
electric shock near the eye. The same CS, when happens on the trial (Rescorla and Wagner 1972). One
lengthened in duration and paired with the same US, implication is that, depending on the size and direction
elicits mainly fear responses. And fear elicited by a CS of the discrepancy, the pairing of a CS and traumatic


US, for example, can cause an increase in fear why human phobias tend to be for certain objects
conditioning, no change in conditioning, or even a (snakes or spiders) and not others (knives or electric
decrease in conditioning. sockets) that may as often be paired with pain or
The latter implication is interesting. A new CS can trauma.
acquire a negative value if the US is smaller than that
which other CSs present predict. Casually speaking,
the new CS predicts ‘less US than expected.’ Such 4. ‘Unlearning’ in Classical Conditioning
negative signals are called ‘conditioned inhibitors’
because they inhibit performance elicited by other Once one accepts a role for conditioning in behavioral
CSs. They are clinically relevant because they may and emotional disorders, the question becomes how to
hold pathological CRs like anxiety at bay. A loss of the eliminate it. Pavlov studied ‘extinction’: conditioned
inhibition would allow the anxiety response to emerge. responding decreases if the CS is presented repeatedly
Classical conditioning is most robust if the CS and without the US after conditioning. Extinction is the
US are intense or salient. It is also best if the CS and basis of many therapies that reduce pathological
US are novel. For example, in ‘latent inhibition,’ conditioned responding through repeated exposure to
repeated exposure to the CS alone before conditioning the CS. Another elimination procedure is ‘counter-
can diminish its ability to elicit responding when it is conditioning,’ in which the CS is paired with a very
paired with the US. In the ‘US pre-exposure effect,’ different US/UR. Counterconditioning was the in-
repeated exposure to the US before conditioning can spiration for ‘systematic desensitization,’ a behavior
likewise decrease the conditioning that later occurs therapy technique in which frightening CSs are
when a CS and the US are paired. One idea is that the deliberately associated with relaxation during therapy
CS and the US must be ‘surprising’ at the time of their (Wolpe 1958).
pairing for learning to occur. Thus, the effects of Although extinction and counterconditioning re-
pairing a CS with trauma or drug USs may depend in duce unwanted conditioned responses, they do not
subtle ways on the individual’s prior experience with destroy the original learning, which remains in the
the CS and US. brain, ready to return to behavior under the right
There are important variants of classical condition- circumstances. For example, conditioned responses
ing. In ‘sensory preconditioning,’ two stimuli (A and that have been eliminated by extinction or counter-
B) are first paired, and then one of them (A) is later conditioning can recover if time passes before the CS
paired with the US. Stimulus A evokes conditioned is presented again (‘spontaneous recovery’). Con-
responding, of course, but so does stimulus B— ditioned responses can also recover if the patient
indirectly, through its association with A. One im- returns to the original context of conditioning (the
plication is that exposure to a potent US like a panic general situation, mood, or state in which conditioning
attack may influence our reactions to stimuli that have occurred), or if the current context is associated with
never been paired with the US directly; the sudden the US (Bouton 2000). All of these phenomena are
anxiety response to stimulus B might seem spon- potential mechanisms for relapse. Techniques that
taneous and mysterious. A related finding is ‘second- may minimize relapse include conducting therapy in
order conditioning.’ Here, A is paired with a US first the contexts where the disorder is a problem, con-
and then subsequently paired with stimulus B. Once ducting therapy in multiple contexts, or providing the
again, both A and B will evoke responding. Sensory client with retrieval cues or retrieval strategies that
preconditioning and second-order conditioning in- help recall therapy (Bouton 2000). In the long run,
crease the range of stimuli that can control conditioned contemporary research on extinction and counter-
responses. conditioning may suggest ways to optimize their
Emotional responses can also be conditioned clinical effectiveness.
through observation. For example, a monkey that
merely observes another monkey being frightened by a
snake can learn to be afraid of the snake itself (Mineka 5. Challenges
1992). The observer learns to associate the snake (CS)
with its own emotional reaction (US/UR) to the other The idea that classical conditioning is a basis of
monkey being afraid. Although monkeys readily learn behavior disorders, particularly anxiety disorders, has
to fear snakes, they are less likely to associate other not gone unchallenged. However, most challenges
salient cues (such as colorful flowers) with fear in the have been directed at early versions of conditioning
same way. This is an example of ‘preparedness’ in theory that did not recognize factors such as in-
classical conditioning—some stimuli are especially formation value, latent inhibition, preparedness, or
effective signals for some USs because evolution has context. For instance, Rachman (1977) noted that
made them that way. (Another example is the fact that London air raids during World War II did not cause
tastes are easily associated with illness but not shock, an increase in anxiety disorders despite their emotional
whereas auditory and visual cues are easily associated impact. We now know that air raids might not have
with shock but not illness.) Preparedness may explain caused conditioning if the potential CSs were familiar
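
The discrepancy rule attributed in the text to Rescorla and Wagner (1972) can be illustrated with a small simulation. What follows is only a minimal sketch under stated assumptions, not the authors' own formulation: the function name rescorla_wagner, the shared learning rate alpha, the asymptote lam, and the trial counts are all hypothetical choices made for illustration. On each trial, every CS present is adjusted in proportion to the difference between the US that occurs and the US predicted by all CSs present.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Return associative strengths after a sequence of trials.

    trials: list of (cues, us_present) pairs; cues is a tuple of CS names.
    alpha:  assumed learning rate shared by all CSs (hypothetical value).
    lam:    assumed maximum strength the US can support (hypothetical value).
    """
    strength = {}
    for cues, us_present in trials:
        target = lam if us_present else 0.0
        # US predicted by all CSs present on this trial
        predicted = sum(strength.get(cue, 0.0) for cue in cues)
        # Discrepancy between the US that happens and the US predicted
        discrepancy = target - predicted
        for cue in cues:
            strength[cue] = strength.get(cue, 0.0) + alpha * discrepancy
    return strength


# Blocking: A is trained alone first, then A and B together precede the US.
blocked = rescorla_wagner([(("A",), True)] * 50 + [(("A", "B"), True)] * 50)
# Control: A and B together precede the US, with no pretraining of A.
control = rescorla_wagner([(("A", "B"), True)] * 50)
# Conditioned inhibition: A+ trials alternate with AX- trials (no US).
inhibition = rescorla_wagner([(("A",), True), (("A", "X"), False)] * 200)

print("B after blocking:", round(blocked["B"], 3))
print("B in control:", round(control["B"], 3))
print("X after A+/AX- training:", round(inhibition["X"], 3))
```

Under these assumptions, B gains almost no strength when A already predicts the US, gains about half of the available strength in the control condition, and X ends with a negative value, which corresponds to the ‘conditioned inhibitor’ described in the text. The sketch treats the loss of responding as simple erasure, so it does not capture spontaneous recovery or the context effects discussed under ‘unlearning.’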


rather than novel, if the raids were experienced in the presence of safety cues that inhibited fear (e.g., relatives, bomb shelters), or if the raids were signaled by other cues (e.g., sirens) that could have ‘blocked’ conditioning of potential CSs (Kamin 1969). Another criticism was that fears in the general population are disproportionately directed toward things like snakes, a fact that is consistent with the preparedness principle (above). As a final example, critics of conditioning explanations of panic disorder have asked why a CS like a pounding heart does not elicit panic in the context (say) of athletic exercise, or why extinction exposure to the CS without panic during exercise does not eliminate its ability to cause panic in other situations. The answer may be that the loss of responding that occurs in extinction is especially specific to the context in which it is learned (Bouton et al. 2001). Thus, although fear may extinguish in the context of exercise, that extinction will not abolish the CS’s ability to elicit fear in other contexts, such as a crowded bus or a shopping mall. The fact that conditioned responses generalize more across contexts than extinction does may be a reason why many disorders are so persistent.

6. Future Directions

Basic research on classical conditioning will continue to investigate the circumstances that allow (and prevent) conditioning to occur, how conditioning is represented in memory and the brain, and how it ultimately influences cognitions, emotions, and behavior. Research will also address how extinction and other behavior-elimination procedures work and can be improved. That research may eventually explain why conditioning processes like extinction seem more context-specific than conditioning itself. And it will also eventually provide a more complete account of the factors that determine the nature and form of conditioned responses, and how the many responses that CSs can evoke can also interact and interrelate.

As our understanding of classical conditioning continues to deepen and expand, so will our insight into its possible role in various clinical disorders. Understanding the role of conditioning in causing a disorder, however, will probably also require prospective studies in which clinical investigators observe conditioning trials as they naturally occur in the world and then measure their effects as the disorder actually develops. It will also benefit from a better appreciation of how conditioning processes interact with biological factors (such as genetically linked vulnerabilities) that may also play a role. And it will benefit from a better understanding of how conditioning interacts with cognitive factors such as thoughts, beliefs, and awareness. Although biological and cognitive factors are sometimes viewed as alternatives to simple learning processes like classical and operant conditioning, a complete appreciation of any disorder will probably require a more integrative perspective on how these factors work in combination.

In the meantime, classical conditioning remains a surprisingly rich scientific phenomenon that can be expected to come into play whenever people experience significant emotional and biological events and can associate them with other events in their world.

See also: Autonomic Classical and Operant Conditioning; Behavior Therapy: Psychological Perspectives; Classical Conditioning, Neural Basis of; Clinical Psychology: Animal Models; Fear Conditioning; Operant Conditioning and Clinical Psychology; Panic Disorder; Pavlov, Ivan Petrovich (1849–1936); Watson, John Broadus (1878–1958)

Bibliography

Bouton M E 2000 A learning theory perspective on lapse, relapse, and the maintenance of behavior change. Health Psychology 19 (Suppl.): 57–63
Bouton M E, Mineka S, Barlow D H 2001 A modern learning theory perspective on the etiology of panic disorder. Psychological Review 108: 4–32
Kamin L J 1969 Predictability, surprise, attention, and conditioning. In: Campbell B A, Church R M (eds.) Punishment and Aversive Behavior. Appleton-Century-Crofts, New York, pp. 279–96
LeDoux J 1996 The Emotional Brain. Simon and Schuster, New York
Mineka S 1992 Evolutionary memories, emotional processing and the emotional disorders. The Psychology of Learning and Motivation 28: 161–206
Pavlov I P 1927 Conditioned Reflexes. Oxford University Press, Oxford, UK
Rachman S 1977 The conditioning theory of fear-acquisition: A critical examination. Behavior Research and Therapy 15: 375–87
Rescorla R A 1988 Pavlovian conditioning: It’s not what you think it is. American Psychologist 43: 151–60
Rescorla R A, Wagner A R 1972 A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In: Black A H, Prokasy W F (eds.) Classical Conditioning II. Appleton-Century-Crofts, New York, pp. 64–99
Siegel S 1989 Pharmacological conditioning and drug effects. In: Goudie A J, Emmett-Oglesby M W (eds.) Psychoactive Drugs. Humana Press, Clifton, NY, pp. 115–80
Timberlake W 1994 Behavior systems, associationism, and Pavlovian conditioning. Psychonomic Bulletin and Review 1: 405–20
Watson J B, Rayner R 1920 Conditioned emotional reactions. Journal of Experimental Psychology 3: 1–14
Wolpe J 1958 Psychotherapy by Reciprocal Inhibition. Stanford University Press, Stanford, CA

M. E. Bouton
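
The ‘blocking’ account of the air-raid example in Section 5 (Kamin 1969) can be illustrated with the learning rule of Rescorla and Wagner (1972), in which every cue present on a trial changes its associative strength in proportion to the shared prediction error. The Python sketch below is only an illustration under stated assumptions: the cue names, the salience values, the learning-rate parameter, and the trial counts are hypothetical and are not taken from the article, and a single learning rate is used for simplicity.

```python
def rescorla_wagner(trials, alphas, beta=0.3, lam=1.0):
    """Apply the Rescorla-Wagner rule over a sequence of trials.

    trials: list of (cues_present, us_present) tuples.
    alphas: dict mapping cue name -> salience (assumed values).
    beta:   learning-rate parameter for the US (assumed value).
    lam:    maximum associative strength the US supports.
    Returns a dict of final associative strengths V for each cue.
    """
    v = {cue: 0.0 for cue in alphas}
    for cues, us_present in trials:
        target = lam if us_present else 0.0
        # Prediction error is shared by all cues present on the trial.
        error = target - sum(v[c] for c in cues)
        for c in cues:
            v[c] += alphas[c] * beta * error
    return v


# Hypothetical cues: a familiar warning signal and a novel potential CS.
alphas = {"siren": 0.5, "novel_cue": 0.5}
pretraining = [(("siren",), True)] * 50             # siren alone signals the US
compound = [(("siren", "novel_cue"), True)] * 50    # siren + novel cue, then US

blocked = rescorla_wagner(pretraining + compound, alphas)
control = rescorla_wagner(compound, alphas)

print("novel cue after pretraining with siren:", round(blocked["novel_cue"], 4))
print("novel cue without pretraining:", round(control["novel_cue"], 4))
```

Under these assumed parameters the novel cue acquires almost no associative strength when the siren already predicts the US, whereas it acquires about half of the available strength in the control sequence.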

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7



Classical Conditioning, Neural Basis of

Classical conditioning, a phenomenon described by Pavlov around 1900, is an elementary form of associative learning that is considered to be an essential building block for complex learning. This article will present some essential characteristics of classical conditioning that permit this learning to be a model system par excellence for understanding the neurobiology of how the brain encodes, stores and retrieves memory.

1. Introduction and Historical Background

Classical or Pavlovian conditioning is the simplest form of associative learning by which animals, including humans, learn relations among events in the world so that their future behaviors are better adapted to their environments (Rescorla 1988). Generally, classical conditioning ensues when an initially neutral stimulus (conditional stimulus, CS) is paired in close temporal proximity with a biologically significant stimulus (unconditional stimulus, US) that elicits an unlearned, reflexive behavior (unconditional response, UR). Through CS–US association formation, the animal learns to exhibit a learned behavior (conditional response, CR) to the CS that generally (a) resembles the UR (but not always), (b) precedes the US in time, and (c) reaches a maximum at about the time of US onset. A typical classical conditioning arrangement is represented in Fig. 1. [It should be noted that the contingent (informational), rather than contiguous (temporal), relationship between CS and US is essential in classical conditioning. For detailed treatment of this see Kamin 1968, Rescorla 1968, Wagner et al. 1968.]

Classical conditioning was first characterized by Ivan P. Pavlov, a Russian physiologist. [The English translation of Pavlov’s book Conditioned Reflexes was first published in 1927. Classical conditioning was also independently discovered by an American psychologist, Edwin B. Twitmyer.] Having already conducted prominent work on the digestive system, for which he received a Nobel Prize in 1904, Pavlov employed a salivary-conditioning procedure in dogs to systematically characterize some of the fundamental principles of classical conditioning. In brief, Pavlov’s dogs were presented with a discrete CS (e.g., the beat of a metronome) just prior to the delivery of a US (e.g., meat powder). Initially, the subjects did not respond to the CS but salivated profusely (UR) to the US. With repeated CS–US pairings, Pavlov’s dogs exhibited salivation (CR) to the CS that both preceded US onset and occurred in the absence of the US.

Figure 1
A typical classical conditioning procedure requires that the CS and the US are paired in close temporal proximity. Presentation of CS and US may be scheduled according to two different temporal arrangements: (i) In delay conditioning, CS onset precedes US onset, such that the two stimuli overlap and terminate together; (ii) In trace conditioning, the CS and the US are separated by some ‘empty’ interval in which neither stimulus is present. The top two traces indicate presentation of the CS and the US in a delay conditioning arrangement. The bottom two traces depict behavioral responses before learning and after learning. (a), CS duration; (b), US duration; (c), UR; (d), CR+UR; ISI (Inter-stimulus Interval), the time between the CS onset and the US onset; ITI (Inter-trial Interval), the time between the termination of one CS–US pairing and the beginning of the next CS–US pairing

2. Behavioral Principles of Classical Conditioning

Since Pavlov, various types of classical conditioning procedures have been developed, ranging from a potently fast (one-trial) taste aversion conditioning (e.g., Garcia et al. 1974) to a relatively slow and incremental eyeblink conditioning (e.g., Gormezano et al. 1983), and a wide variety of organisms have been used, ranging from invertebrates (e.g., Aplysia) to primates (e.g., humans). These different types of classical conditioning, however, all share certain common factors that influence the formation of CS–US associations which can be classified into three general categories: contiguity (or temporal) constraints, sensory constraints, and contingency (or informational) constraints (Table 1). Clearly, these constraints must be considered when employing classical conditioning in learning and memory research.

In addition to simple CS–US pairings (also called first order conditioning), there are many other training protocols within classical conditioning that can serve as effective tools for investigating the theoretical and biological mechanisms of learning and memory. Some of these are presented in Fig. 2.

Figure 2
Various classical conditioning phenomena. CSO = no or small magnitude CR to CS; CSR = CR to CS; USR = UR to US; CS1/CS2 = simultaneous presentations of the two CSs; in Overshadowing, CS1 is more salient than CS2; 2nd order conditioning is often elusory and transient when it does appear
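
The delay and trace arrangements defined in the Fig. 1 caption, together with the ISI and ITI, can be written out as a small trial scheduler. The Python sketch below is illustrative only: the class and function names and the millisecond values are assumptions chosen for the example and are not parameters reported in this article.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    cs_on: int   # CS onset (ms)
    cs_off: int  # CS offset (ms)
    us_on: int   # US onset (ms)
    us_off: int  # US offset (ms)

    @property
    def isi(self) -> int:
        """Inter-stimulus interval: CS onset to US onset."""
        return self.us_on - self.cs_on


def delay_trial(start: int, isi: int, us_dur: int) -> Trial:
    """Delay conditioning: the CS overlaps the US and both end together."""
    us_on = start + isi
    us_off = us_on + us_dur
    return Trial(cs_on=start, cs_off=us_off, us_on=us_on, us_off=us_off)


def trace_trial(start: int, cs_dur: int, trace_gap: int, us_dur: int) -> Trial:
    """Trace conditioning: an empty interval separates CS offset and US onset."""
    cs_off = start + cs_dur
    us_on = cs_off + trace_gap
    return Trial(cs_on=start, cs_off=cs_off, us_on=us_on, us_off=us_on + us_dur)


def schedule(n_trials: int, iti: int, make_trial, **params) -> list:
    """Lay out n_trials pairings separated by a fixed inter-trial interval."""
    trials, start = [], 0
    for _ in range(n_trials):
        trial = make_trial(start, **params)
        trials.append(trial)
        # ITI runs from the end of one pairing to the start of the next.
        start = max(trial.cs_off, trial.us_off) + iti
    return trials


# Hypothetical values (ms), chosen only to make the arithmetic easy to follow.
delay_session = schedule(3, iti=30000, make_trial=delay_trial, isi=250, us_dur=100)
trace_session = schedule(3, iti=30000, make_trial=trace_trial,
                         cs_dur=250, trace_gap=500, us_dur=100)

for trial in delay_session + trace_session:
    print(trial, "ISI =", trial.isi)
```

Note that in the trace arrangement the ISI, measured onset to onset as in the caption, equals the CS duration plus the empty interval.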

3. Classical Conditioning as a Model System for Studying the Neurobiology of Learning and Memory

Current views recognize that, in mammals, there are multiple forms or aspects of learning and memory (e.g., habituation, sensitization, classical conditioning, priming, procedural learning, episodic, semantic) that are subserved by different structures in the central nervous system (e.g., reflex pathways, cerebellum, amygdala, striatum, diencephalon, medial temporal lobe, neocortex) (Squire 1987). Considerable progress has been made in the understanding of these different learning and memory systems through the use of various experimental techniques (e.g., lesions, reversible inactivation, drug administration, neural recordings, genetic manipulations, brain imaging). The success of this work is due in large part to the use of classical conditioning as a model system, since it provides an important advantage over other, more complex, forms of learning in that the stimuli involved (CS and US) are well defined and can be precisely controlled, and the behavioral output is discrete and may be accurately assessed.

Table 1
Behavioral principles of classical conditioning

3.1 Localization of Brain Substrates of Classical Conditioning: Rationale

The identification of the locus of learning and memory storage in the brain is a prerequisite to an understanding of the neurochemical, cellular and molecular mechanisms by which organisms acquire and retain information. For many years the task of localizing memory storage (the engram) has been the paramount challenge facing investigators of memory mechanisms in mammalian systems. Classical conditioning is especially attractive as a model system for use in these types of investigations since only two stimuli are involved, and thus the learning or association of CS and US must occur at the brain site(s) where the two pieces of information converge. For instance, in classical eyeblink conditioning, where animals (such as mice, rats, rabbits or humans) learn to exhibit eyeblink responses to a CS (e.g., tone) that has been paired with a US (e.g., air-puff to the eye), one can trace the pathways from the peripheral sensory receptors in the ear (for CS) and around the eye (for US) to the brain and examine those brain regions where the CS and US pathways converge. It is only in the past 20 years, however, that technology has permitted this type of analysis, and only then at the gross structural level.

The convergence of CS and US information, however, is not a sufficient condition for identifying the locus of learning. In order for a particular region of the brain to be considered a viable candidate as a learning and memory site, it must satisfy the following criteria:
(a) permanent lesions of the putative site prior to CS–US training should completely and permanently abolish the acquisition of the CR;
(b) permanent lesions made after training should completely abolish the expression of the CR, and the CR should not be reacquired with further CS–US pairings;
(c) reversible inactivation during CS–US training should block the development of the CR such that when the structure is activated again the CR should develop comparably to that of a naïve animal (i.e., no evidence of savings);
(d) reversible inactivation following training should temporarily impair the expression of the CR;
(e) learning-related neural activities should occur that correspond with and immediately precede the behavioral CR;
(f) electrical stimulation of the CS and the US input pathways to the putative site should effectively substitute for the peripheral stimuli and support conditioning; and
(g) electrical stimulation of the putative site should evoke the CR.

If there is a single locus of learning that supports classical conditioning, then these seven criteria must be demonstrable within that specific learning site; whereas if there are multiple structures that encode CS–US association, then these seven criteria must be satisfied collectively by these structures. Figure 3 illustrates how these different criteria can be applied to a putative locus of CS–US association.

Figure 3
A highly schematized diagram of a hypothetical learning site in classical conditioning. The shaded box represents the locus of conditioning. * denotes modifiable connections underlying learning

In Fig. 3, the afferent CS and US information is relayed via hypothetical structures 1 (and 5 for different CSs) and 2 (for US), respectively, to the learning site 3. The outputs (efferents) from structure 3 activate the motor center 4 that in turn controls the CR. Permanent lesions of structures 1, 2, 3 or 4 prior to conditioning will block the acquisition and/or expression of the CR. Thus, the permanent lesion technique (e.g., electrical, chemical, radio-frequency, aspiration, ischemia) is limited in that it does not allow for dissociation of the site of learning from the sites of input or from motor centers. In contrast, the reversible inactivation (pharmacological, cooling) technique, which temporarily inactivates the neurons within a structure, is not so constrained. For example, reversible inactivation of structures 1, 2 or 3 during conditioning will block acquisition of the CR to CS1. In subsequent CS–US training when the inactivation has been reversed, the animal should learn as though it is naïve. By contrast, reversible inactivation of structure 4 at the time of CS–US training will block the expression of the CR, but once the inactivation is removed, the animal will immediately exhibit CRs to the CS because the CS–US association center was not affected during conditioning.

Thus structure 4 must be efferent to the site of learning. Structures 1, 2 and 3 can be further dissociated by examining the effects of reversible inactivation to CSs of different sensory modalities. Whereas inactivation of structures 2 and 3 will block conditioning to all CS modalities (e.g., CS1 and CS2), inactivation of structure 1 will block conditioning to CS1 but not CS2. Thus structure 1 must be afferent to

the site of learning. Finally, structures 2 and 3 can be dissociated by examining the reversible inactivation effects following conditioning: once the animal has acquired the CR, inactivating structure 3 should abolish the expression of CR, whereas inactivation of structure 2 should not interfere with the expression of the CR because the CS inputs to the CS–US association center remain intact. Interestingly, with continued CS–US training, inactivation of structure 2 will lead to extinction of the CR because the CS–US association center will, as a result, receive only the CS information (just as in CS-alone extinction training). If structures 1 and 2 relay information about the CS and the US, then electrical stimulation of the CS and US pathways should also support conditioning. Lastly, recordings from structure 3 (the site of learning) should reveal learning-related changes in neural activity that model the behavioral CR, whereas recordings from CS and US input structures should show stimulus-evoked neural activities.

The neural circuitry of the brain is almost infinitely complex and thus no single experimental technique is in itself sufficient to identify the site of learning. It is also important to note that different types of classical conditioning (e.g., eyeblink versus fear conditioning) are subserved by different neural circuits. Through the utilization of various techniques, however, the structures and mechanisms involved in this type of learning can be reasonably delineated and systematically analyzed for their validity.

3.2 Putative Cellular Mechanisms of Classical Conditioning

Once the learning and memory storage site has been identified, a logical next step is to determine what neural changes take place that allow a CS that previously did not elicit a CR (before conditioning) to now evoke a CR (after conditioning). It is generally assumed that an initially weak connection between a CS relaying structure and a CS–US association structure becomes strengthened as a function of CS–US paired training, such that the CS becomes able to effectively activate the CS–US association structure, which in turn activates the CR pathway. It is further hypothesized that changes in synaptic efficacy (that is, the efficacy with which a neuron is able to communicate with another neuron) underlie this type of strengthening of the CS relay-to-CS–US association connection (e.g., Hebb’s postulate (Hebb 1949)). Two forms of experimentally-induced synaptic plasticity, long-term potentiation (LTP) and long-term depression (LTD), have received close scrutiny as the most promising cellular mnemonic mechanisms (Bliss and Collingridge 1993; Ito 1989). LTP and LTD refer to sustained increase and decrease of synaptic transmission, respectively, following different stimulation patterns of afferent fibers. In brief, LTP is characterized by its rapid inducibility and longevity (lasting on the order of hours in vitro to weeks in vivo), as well as its being strengthened by repetition and demonstrating specificity and associativity. LTD displays similar characteristics desirable of an information storage mechanism. Both LTP and LTD have been demonstrated in various brain structures, including those that are hypothesized to be critical for learning and memory.

One can easily imagine how LTP and LTD might be applicable to classical conditioning. For example, suppose that the CS pathway to the site of learning (pathway between steps 1–3 in Fig. 3) is initially weak and that, as a result, the CS alone cannot sufficiently activate the CS–US association center to produce a CR. Following CS–US pairings, if this pathway is strengthened via LTP (or LTP-like changes), then the CS alone will be able to activate the CS–US association center to elicit a CR. Similarly, LTD or LTD-like changes may support conditioning by weakening the inputs to a structure that normally inhibits the CS–US association center (as is postulated to occur in the eyeblink conditioning circuit). It is likely that combinations of LTP- and LTD-like forms of synaptic plasticity (and perhaps other unknown cellular changes), involving a network of synapses, are important in classical conditioning.

3.3 Neuronal Substrates of Eyeblink Conditioning

Classical eyeblink conditioning in rabbits has been used extensively to investigate brain mechanisms underlying learning and memory. Converging lines of evidence from lesion, recording, stimulation, reversible inactivation, and brain imaging studies indicate that the cerebellum mediates the formation of the CS–US association for eyeblink conditioning (Thompson 1990). In brief, selective lesions of the cerebellum (i.e., the interpositus nucleus) block the acquisition and retention of eyeblink CRs; the effect of the lesion is limited to the CR since the UR (reflexive eyeblink) to the US is not affected. (An important feature of eyeblink conditioning is that the CR and the UR can be dissociated and, thus, effects of various manipulations on memory versus performance can be carefully addressed.) Correspondingly, recording studies indicate that cells in specific regions of the cerebellum undergo plastic changes during eyeblink conditioning; for example, cells in the interpositus nucleus increase their activity (postulated to occur via LTP-like changes), while Purkinje cells in the cortex, which send inhibitory projections to the interpositus nucleus, decrease their activity (postulated to occur via LTD-like changes).

The involvement of the cerebellum in eyeblink conditioning is also evidenced by stimulation studies which show that direct stimulation of the two major afferents to the cerebellum, the mossy fibers from


the pontine nucleus and the climbing fibers from the inferior olive, can substitute for the peripheral CS and US, respectively. Since limited lesions of the pontine nucleus (i.e., the lateral region) abolish CRs to a tone CS but not to a light CS, and lesions of the inferior olive in animals that already acquired CRs result in behavioral extinction with continued CS–US-paired training, it is not likely that these afferent structures are the site of learning and memory storage. Reversible inactivation studies further support the conclusion that the cerebellum, and not its efferent structures, is the locus of CS–US association. Moreover, recent human brain-imaging studies reveal eyeblink conditioning-related activity changes in the cerebellum. Collectively, these findings strongly suggest that the cerebellum is essential for eyeblink conditioning. Although the cerebellum seems to be critical, the relative importance of the cerebellar cortex and the interpositus nucleus in supporting eyeblink conditioning is not clear and has been disputed (see Kim and Thompson 1997 for detailed treatment of this topic and a putative eyeblink conditioning circuit). The fact that the critical CS–US association in eyeblink conditioning occurs within the cerebellum, however, permits research at the molecular level of analysis.

3.4 Neuronal Substrates of Fear Conditioning

In fear conditioning, CSs such as tones, lights or experimental chambers are typically paired with an aversive US such as electric shock. Following CS–US pairings, the CS can elicit numerous fear CRs, such as an increase in blood pressure, reduction in pain sensitivity (analgesia), fear-potentiated startle, and/or defensive freezing. Several lines of evidence point to the amygdala, one of the principal structures of the limbic system that seems to be situated such that it has access to both sensory inputs and response outputs, as a critical neural substrate for this type of emotional learning (LeDoux 1996). In brief, the critical role of the amygdala in fear conditioning is supported by observations that:
(a) amygdalar lesions (permanent and reversible) abolish various fear CRs as well as innate fear responses;
(b) selective lesions of structures afferent to the amygdala affect conditioning to specific CSs (e.g., medial geniculate nucleus of the thalamus for tones, and the hippocampus for contexts);
(c) selective lesions of structures efferent to the amygdala abolish specific CRs (e.g., the lateral hypothalamus for blood pressure CR, and the ventral region of the periaqueductal gray matter for freezing CR);
(d) recording studies reveal that neurons in the amygdala respond to both CS and US and undergo plastic changes during fear conditioning (e.g., LTP-like changes);
(e) electrical stimulation of the amygdala elicits fear responses; and
(f) drugs that block LTP also prevent fear conditioning.

However, there is also evidence that the amygdala may not necessarily be involved in fear learning and memory, and that other brain structures (e.g., insular cortex) may mediate fear conditioning (McGaugh et al. 1996). Thus, additional studies are required to firmly establish the role of the amygdala in fear conditioning.

4. Conclusion

Although classical conditioning is well understood at a behavioral level and the neuroanatomical circuits that underlie it are beginning to be unveiled, there is much left to learn. However, this most basic form of associative learning offers a useful means to investigate the synaptic and molecular mechanisms underlying learning and memory and may help reveal common biological mechanisms shared by all learning and memory systems.

See also: Autonomic Classical and Operant Conditioning; Behavioral Assessment; Cardiovascular Conditioning: Neural Substrates; Conditioning and Habit Formation, Psychology of; Eyelid Classical Conditioning; Fear Conditioning; Learning and Memory, Neural Basis of; Long-term Depression (Hippocampus); Long-term Potentiation and Depression (Cortex); Long-term Potentiation (Hippocampus); Memory, Consolidation of; Pavlov, Ivan Petrovich (1849–1936)

Bibliography

Garcia J, Hankins W G, Rusiniak K W 1974 Behavioral regulation of the milieu interne in man and rat. Science 185: 824–31
Gormezano I, Kehoe E J, Marshall B S 1983 Twenty years of classical conditioning research with the rabbit. In: Sprague J M, Epstein A N (eds.) Progress in Psychobiology and Physiological Psychology. Academic Press, New York
Hebb D O 1949 The Organization of Behavior: A Neuropsychological Theory. John Wiley and Sons, New York
Ito M 1989 Long-term depression. Annual Review of Neuroscience 12: 85–102
Kamin L J 1968 Attention-like processes in classical conditioning. In: Jones M R (ed.) Miami Symposium on the Prediction of Behavior: Aversive Stimulation. University of Miami Press, Miami, FL, pp. 9–32
Kim J J, Thompson R F 1997 Cerebellar circuits and synaptic mechanisms involved in classical eyeblink conditioning. Trends in Neurosciences 20: 177–81
LeDoux J 1996 The Emotional Brain. Simon and Schuster, New York


McGaugh J L, Cahill L, Roozendaal B 1996 Involvement of the amygdala in memory storage: Interaction with other brain systems. Proceedings of the National Academy of Sciences of the United States of America 93: 13508–14
Pavlov I P 1927 Conditioned Reflexes. Oxford University Press, London
Rescorla R A 1968 Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology 66: 1–3
Rescorla R A 1988 Behavioral studies of Pavlovian conditioning. Annual Review of Neuroscience 11: 329–52
Squire L R 1987 Memory and Brain. Oxford University Press, New York
Thompson R F 1990 Neural mechanisms of classical conditioning in mammals. Philosophical Transactions of the Royal Society of London Series B-Biological Sciences 329: 161–70
Wagner A R, Logan F A, Haberlandt K, Price T 1968 Stimulus selection in animal discrimination learning. Journal of Experimental Psychology 76: 171–80

J. J. Kim

Classical Mechanics and Motor Control

In order to control the execution of limb movements, the central nervous system must solve complex problems of mechanics. A substantial body of evidence supports the view that, in solving these problems, the nervous system develops internal representations of the mechanics of the body coupled with its environment. Thus, through the process of motor learning (see Motor Control) the brain becomes implicitly an ‘expert’ in classical mechanics. After discussing the problems of kinematics and dynamics associated with the control of movement, this article introduces the formal definitions and the empirical evidence that constitute the underpinnings of the theory of internal representations in motor control.

1. Dynamics

According to the laws of Newtonian physics, if one wants to impress a motion upon an object with mass m, one must apply a force, F, that is directly proportional to the desired acceleration, a. This is Newton’s equation:

$F = ma$

A desired motion may be expressed as a sequence of positions, x, that one wishes the object to occupy at subsequent instants of time, t. Such a sequence is called a trajectory and is mathematically represented as a function, $x = x(t)$. To use Newton’s equation for deriving the needed time-sequence of forces, one must calculate the first temporal derivative of the trajectory, the velocity, and then the second temporal derivative, the acceleration. Finally, one obtains the desired force from this acceleration. The above calculation is an example of an inverse dynamics problem. The direct dynamics problem is that of computing the trajectory resulting from the application of a force, F(t). Direct problems are a common challenge for physicists who are concerned, for example, with predicting the motion of a comet from the known pattern of gravitational forces. Unlike physicists, the brain deals most often with inverse problems: we routinely recognize objects and people from their visual images (an ‘inverse optical problem’) and we find out effortlessly how to distribute the forces exerted by several muscles to move our limb in the desired way (an inverse dynamics problem).

One of the central questions in motor control (see Motor Control) is how the central nervous system may solve the inverse dynamics problem and generate the motor commands that guide our limbs (Hollerbach and Flash 1982). In the biological context, however, the inverse dynamics problem assumes a somewhat more complex form than the one described above. A system of second-order nonlinear differential equations is generally considered to be an adequate representation for the passive dynamics of a limb. A compact expression for such a system is:

$D(q, \dot{q}, \ddot{q}) = \tau$ (1)

where $q$, $\dot{q}$, and $\ddot{q}$ represent the limb configuration vector (for example, the vector of joint angles) and its first and second time derivatives. D is a nonlinear vector-valued mapping from the current state and its rate of change to the vector of joint torques, $\tau$. In practice, the expression for D may have a few terms for a two-joint planar arm, or it may take several pages for more realistic models of the arm’s multi-joint geometry. The inverse dynamics approach to the control of multi-joint limbs consists in solving explicitly for a torque trajectory, $\tau(t)$, given a desired trajectory of the limb, $q_D(t)$. This is done by substituting $q_D(t)$ for the variable $q$ on the left side of Eqn. (1):

$\tau(t) = D(q_D(t), \dot{q}_D(t), \ddot{q}_D(t))$ (2)

2. Kinematics, Statics, and Coordinate Systems

A significant computational challenge comes from the need to perform changes of representation (or, more technically, coordinate transformations) between the description of a task and the specification of the body motions. Tasks, such as ‘hitting a ball with a racket,’ are described most efficiently and parsimoniously with respect to some fixed reference points in the environment. In this example, the racket is the site at

1951
Classical Mechanics and Motor Control

which one interacts with the environment. Borrowing some terminology from robotics, such a site is called an ‘endpoint.’ The position of the racket is fully determined by six coordinates. These coordinates may be measured with respect to three orthogonal axes originating, for example, from the shoulder. Then, a position in endpoint coordinates may be specified as a point $p = (x, y, z, \theta_X, \theta_Y, \theta_Z)$. The coordinates x, y, and z determine a translation with respect to the orthogonal axes. The angular coordinates $\theta_X$, $\theta_Y$, and $\theta_Z$ determine an orientation with respect to the same axes. Consistent with this notation, a force in endpoint coordinates is a vector with three linear and three angular components, $F = (F_X, F_Y, F_Z, \tau_X, \tau_Y, \tau_Z)$.

A different way of describing the position of an arm is to provide the set of joint angles that define the orientation of each skeletal segment either with respect to fixed axes in space or with respect to the neighboring segments. Joint angles are a particular instance of generalized coordinates. According to the standard definitions of analytical mechanics, generalized coordinates are independent variables that are suitable for describing the dynamics of a system (see, for example, Goldstein 1980).

Once we have defined a set of generalized coordinates we may also define a set of corresponding generalized forces. For example, if we use joint angles as generalized coordinates, the corresponding generalized forces are the torques measured at each joint. The dynamics of any mechanical system with N generalized coordinates are described by N second-order differential equations relating the generalized coordinates to their first and second time derivatives and to the generalized forces. The dynamics equation, Eqn. (1), is an example of a formulation in generalized coordinates.

Movements are executed by the central nervous system activating a multitude of muscles. Muscle coordinates afford the most direct representation for the motor output of the central nervous system. A position in this coordinate system is described by a collection of muscle lengths, $l = (l_1, l_2, \ldots, l_M)$. Accordingly, a force is a collection of muscle tensions, $f = (f_1, f_2, \ldots, f_M)$.

Both the transformation from generalized coordinates to endpoint coordinates and the transformation from generalized coordinates to actuator coordinates are nonlinear mappings. In the case of the arm, the transformation from joint to endpoint coordinates is a nonlinear function:

$p = L(q)$    (3)

The transformation from joint to muscle coordinates is another nonlinear mapping:

$l = M(q)$    (4)

Some experimental studies (see Flash and Hogan 1985, Morasso 1981) have suggested that actions are planned by the brain in terms of endpoint coordinates. For example, one can mentally formulate (and execute) commands such as ‘move the hand 10 cm to the right’ without being concerned with the set of muscle commands that are involved with this action. However, once one has decided on a plan of action, one must somehow choose which muscles to activate and in what temporal order. In carrying out this task the brain must face the challenges associated with kinematic redundancy: the imbalance between the number of joints that may participate in a movement, the number of degrees of freedom of the hand, and the number of independently controlled muscles acting upon the joints. There are typically fewer hand coordinates than joint angles and fewer joint angles than muscles. Such imbalance renders both transformations (3) and (4) non-invertible.

3. Internal Models of Limb Dynamics
The ability to generate a variety of complex behaviors cannot be attained by just storing somewhere the control signals for each action and recalling these signals when subsequently needed. Simple considerations about the geometrical space of meaningful behaviors are sufficient to establish that this approach would be inadequate (see Bizzi and Mussa-Ivaldi 1998). To achieve its typical competence, the motor system must take advantage of experience to go beyond experience itself, by constructing internal representations of the controlled dynamics. These representations allow us to generate new behaviors and to handle situations that have not yet been encountered (see Motor Control Models: Learning and Performance). A vivid illustration of how explicit representations of dynamics, also called internal models, may facilitate motor learning is offered by the work of Atkeson and Schaal (1997, Schaal 1999), who studied the task of balancing an inverted pendulum on the hand of a robotic arm. They found that robots learn to carry out this task successfully when they can build an internal model of the dynamics associated with the balancing act. Such a model may be constructed using data derived from the observation of humans engaging competently in the same task.

The term ‘internal model’ refers to two distinct mathematical transformations: (a) the transformation from a motor command to the consequent behavior; and (b) the transformation from a desired behavior to the corresponding motor command (see Kawato and Wolpert 1998). A model of the first kind is called a ‘forward model.’ Forward models provide the control system with the means not only to predict the outcome of a command, but also to estimate the current state in the presence of feedback delay. A representation of the mapping from planned actions to motor commands is called an ‘inverse model.’ Strong experimental evidence for the biological and behavioral relevance of internal models has been offered by experiments that involved the adaptation of arm movements to a
perturbing force field generated by an instrumented manipulandum (Sabes et al. 1998, Shadmehr and Mussa-Ivaldi 1994). The major findings of these studies are as follows: (a) when exposed to a complex but deterministic field of velocity-dependent forces, arm movements are first distorted and, after repeated practice, the initial kinematics are recovered; (b) if, after adaptation, the field is suddenly removed, aftereffects are clearly visible as mirror images of the initial perturbations; (c) adaptation is achieved by the motor system through the formation of a local map that associates the states (positions and velocities) visited during the training period with the corresponding forces; and (d) after adaptation this map (that is, the internal model of the field) undergoes a process of consolidation (see Brashers-Krug et al. 1996).

4. The Neurobiological ‘Building Blocks’ of Internal Models
Once it has been established that the motor system creates internal representations of complex multi-joint dynamics, it remains to determine how these representations may come about. As pointed out by David Marr (1982), any mathematical transformation may be carried out in different ways depending upon which building blocks or ‘primitives’ are employed (see Computational Neuroscience). Electrophysiological studies involving the stimulation of muscles and of the spinal cord in frogs (Bizzi et al. 1991) indicated: (a) that the stimulation of a site in the lumbar spinal cord results in the activation of multiple muscles acting on the leg on the same side of the stimulation; (b) that concomitant or ‘synergistic’ muscle recruitment generates a field of viscoelastic forces over a broad region of the leg workspace; and (c) that the simultaneous activation of multiple spinal sites leads to the vectorial summation of the corresponding force fields. These and similar findings suggest that motor commands reaching the spinal cord from higher brain centers are not directed at controlling the forces of individual muscles or single joint torques. Instead, the descending motor commands modulate the viscoelastic force fields produced by specific sets of muscles. These force fields have influence over broad regions of the limb state space, as each active muscle within a synergy contributes a significant force over a large range of positions and velocities.

From a mathematical standpoint, the force fields generated by neural modules in the spinal cord are nonlinear functions of limb position and velocity and of time: $\phi_i(q, \dot{q}, t)$. Consistent with the finding of vector summation, the net force field induced by a pattern of K motor commands may be represented as a linear combination:

$\sum_{i=1}^{K} u_i \phi_i(q, \dot{q}, t)$    (5)

In this expression, each spinal field is a force that depends upon the state of motion of the limb, $(q, \dot{q})$, and upon time, t, in a fixed stereotyped way. The descending commands, $(u_1, u_2, \ldots, u_K)$, act as coefficients that modulate the degree to which each spinal field participates in the combination. These commands can just select the modules by determining how much each one contributes to the net control policy. The linear combination (5) generates the torque that drives the limb inertia. Substituting it for τ(t) in Eqn. (1) one obtains:

$D(q, \dot{q}, \ddot{q}) = \sum_{i=1}^{K} u_i \phi_i(q, \dot{q}, t)$    (6)

Least squares approximation can efficiently determine an optimal set of tuning coefficients given a desired trajectory, $q_D(t)$:

$u_i = \sum_{j=1}^{K} [\Phi]^{-1}_{i,j} \Lambda_j$    (7)

with

$\Phi_{l,m} = \int \phi_l(q_D(t), \dot{q}_D(t), t) \cdot \phi_m(q_D(t), \dot{q}_D(t), t)\, dt$
$\Lambda_j = \int \phi_j(q_D(t), \dot{q}_D(t), t) \cdot D(q_D(t), \dot{q}_D(t), \ddot{q}_D(t))\, dt$    (8)

The symbol · indicates the ordinary inner product.

While spinal force fields offer a practical way to generate movement, they also provide the central nervous system with a movement’s representation (see Neural Representations of Intended Movement in Motor Cortex). This representation is geometrically similar to the representation of space by a set of Cartesian coordinates. In the latter case, we may take three directions (represented by three independent vectors) and then project any point in space along these directions. As a result, an arbitrary point in space is represented by three numbers, the coordinates x, y, and z. The movements of a limb can be considered as ‘points’ in an abstract geometrical space. In this abstract geometrical space, the force fields produced by a set of modules play a role equivalent to that of the Cartesian axes, and the selection parameters that generate a particular movement may be regarded as generalized projections of this movement along the modules’ fields.

4.1 A Computational Approach to Motor Adaptation
If the dynamics change while the modules remain unchanged, then the representation of the movement must change accordingly. This is shown by the following argument. Suppose that a trajectory, q(t), is
represented by a selection vector $c = (c_1, c_2, \ldots, c_K)$ for a limb with the dynamics of Eqn. (6). Now, suppose that the limb dynamics are suddenly modified by an additional load, $E(q, \dot{q}, \ddot{q})$. Leaving the representation and the fields unchanged, we now have a new differential equation

$D(q, \dot{q}, \ddot{q}) + E(q, \dot{q}, \ddot{q}) = \sum_{m=1}^{K} c_m \phi_m(q, \dot{q}, t)$    (9)

whose solution is a trajectory $\tilde{q}(t)$, generally different from the original q(t). The original set of coefficients, $c_m$, now generates the trajectory $\tilde{q}(t)$ and can, accordingly, be considered as its representation within the modified environment, D + E. The old trajectory is recovered by changing the selection coefficients to a new set, $c' = c + e$, with

$E(q(t), \dot{q}(t), \ddot{q}(t)) = \sum_{m=1}^{K} e_m \phi_m(q(t), \dot{q}(t), t)$    (10)

With these new coefficients, the new dynamics become equivalent to the old dynamics along the original trajectory.

The modified coefficients $c'$ offer a new representation of the old movement q(t) in the altered dynamics. This procedure for forming a new representation and for recovering the original movement is consistent with the empirical observation of aftereffects in load adaptation (see Shadmehr and Mussa-Ivaldi 1994). If the load is removed after the new representation is formed, the dynamics become

$D(q, \dot{q}, \ddot{q}) = \sum_{m=1}^{K} (c_m + e_m) \phi_m(q, \dot{q}, t)$    (11)

that can be rewritten as

$D(q, \dot{q}, \ddot{q}) - \sum_{m=1}^{K} e_m \phi_m(q, \dot{q}, t) = \sum_{m=1}^{K} c_m \phi_m(q, \dot{q}, t)$    (12)

Therefore, removing the load with the new representation corresponds approximately to applying the opposite load with the old representation.

Is it necessary for the motor system to modify a movement’s representation each time the limb dynamics change? Or is there a way of restoring the previously existing representations? From a computational point of view, whenever a dynamical change becomes permanent (as when we undergo growth or damage) it would seem convenient for the central nervous system to have the ability to restore the previously learned motor skills (that is, the previously learned movement representations) without the need to relearn them all. It is possible for the adaptive system to restore, at least partially, the motor representations that preexist a change in dynamics by modifying the modules and their force fields. A specific modification is obtained when we may express the coefficients $e = (e_1, e_2, \ldots, e_K)$ as a linear transformation of the original coefficients $c = (c_1, c_2, \ldots, c_K)$:

$e = Wc$

This transformation is a coordinate transformation of the selection vector and may be implemented by a linear associative network (see Neural Networks: Biological Models and Applications). With a minimum of algebra one sees that

$D(q, \dot{q}, \ddot{q}) + E(q, \dot{q}, \ddot{q}) = \sum_{m=1}^{K} c'_m \phi_m(q, \dot{q}, t) = \sum_{m=1}^{K} (c_m + e_m) \phi_m(q, \dot{q}, t) = \sum_{m=1}^{K} c_m \tilde{\phi}_m(q, \dot{q}, t)$    (13)

where the old fields $\phi_m$ have been replaced by the new fields

$\tilde{\phi}_m = \sum_{l=1}^{K} (\delta_{l,m} + W_{l,m}) \phi_l$, with $\delta_{l,m} = 1$ if $l = m$ and $\delta_{l,m} = 0$ otherwise    (14)

This is, again, a coordinate transformation of the original fields that may be implemented by a neural network intervening between the descending commands and the original fields. By means of such coordinate transformations, one obtains the important result that the movement representation (that is, the selection vector c) can be maintained invariant after a change in limb dynamics.

In conclusion, the system of motor primitives induced by independent modules within the spinal cord (as well as within higher structures of the nervous system) provides us with an alphabet for representing the mechanics of the body and for modifying this representation as required by changes in limb and environmental dynamics.

See also: Motor Control; Motor Control Models: Learning and Performance; Motor Cortex; Motor Skills, Psychology of; Neural Representations of Intended Movement in Motor Cortex

Bibliography
Atkeson C, Schaal S 1997 Robot learning from demonstration. In: Fisher D (ed.) Machine Learning: Proceedings of the Fourteenth International Conference (ICML 97). Morgan Kaufmann, San Francisco
Bizzi E, Mussa-Ivaldi F A 1998 The acquisition of motor behavior. Daedalus 127: 217–32
Bizzi E, Mussa-Ivaldi F A, Giszter S 1991 Computations underlying the execution of movement: A biological perspective. Science 253: 287–91
Brashers-Krug T, Shadmehr R, Bizzi E 1996 Consolidation in human motor memory. Nature 382: 252–5
Flash T, Hogan N 1985 The coordination of arm movements: An experimentally confirmed mathematical model. Journal of Neuroscience 5: 1688–703
Goldstein H 1980 Classical Mechanics. Addison–Wesley, Reading, MA
Hollerbach J M, Flash T 1982 Dynamic interactions between limb segments during planar arm movement. Biological Cybernetics 44: 67–77
Kawato M, Wolpert D 1998 Internal models for motor control. Novartis Foundation Symposium 218: 291–307
Marr D 1982 Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. W H Freeman and Co, San Francisco, CA
Morasso P 1981 Spatial control of arm movements. Experimental Brain Research 42: 223–7
Sabes P N, Jordan M I, Wolpert D M 1998 The role of inertial sensitivity in motor planning. Journal of Neuroscience 18: 5948–57
Schaal S 1999 Is imitation learning the route to humanoid robots? Trends in Cognitive Sciences 3: 233–42
Shadmehr R, Mussa-Ivaldi F A 1994 Adaptive representation of dynamics during learning of a motor task. Journal of Neuroscience 14: 3208–24

S. Mussa-Ivaldi

Classical (Psychometric) Test Theory

1. Introduction
One of the most striking and challenging phenomena in the Social Sciences is the unreliability of its measurements: Measuring the same attribute twice often yields two different results. If the same measurement instrument is applied twice, such a difference may sometimes be due to a change in the measured attribute itself. Sometimes these changes in the measured attribute are due to the mere fact of measuring. For example, people learn when solving tasks and they change their attitude when they reflect on statements in an attitude questionnaire. In other cases the change of the measured attribute is due to developmental phenomena, or it might be due to learning between occasions of measurement. However, if change of the attribute can be excluded, two different results in measuring the same attribute can be explained only by ‘measurement error.’

Classical (Psychometric) Test Theory (CTT) aims at studying the reliability of a (real-valued) test score variable (measurement, test) that maps a crucial aspect of qualitative or quantitative observations into the set of real numbers. Aside from determining the reliability of a test score variable itself, CTT allows answering questions such as:
(a) How do two random variables correlate once the measurement error is filtered out (correction for attenuation)?
(b) How dependable is a measurement in characterizing an attribute of an individual unit, i.e., what is the confidence interval for the true score of that individual with respect to the measurement considered?
(c) How reliable is an aggregated measurement consisting of the average (or sum) of several measurements of the same unit or object (Spearman–Brown formula for test length)?
(d) How reliable is a difference, e.g., between a pre-test and post-test?

2. Basic Concepts of Classical Test Theory

2.1 Primitives
In the framework of CTT, each measurement (test score) is considered to be a value of a random variable Y consisting of two components: a ‘true score’ and an ‘error score.’ Two levels, or more precisely, two random experiments may be distinguished: (a) sampling an observational unit (e.g., a person) and (b) sampling a score within a given unit. Within a given unit, the true score is a parameter, i.e., a given but unknown number characterizing the attribute of the unit, whereas the error is a random variable with an unknown distribution. The true score of the unit is defined to be the expectation of this intraindividual distribution.

Taking the across-units perspective, i.e., joining the two random experiments, the true score is itself considered to be a value of a random variable (the ‘true score variable’). The ‘error variable’ is again a random variable, the distribution of which is a mixture of the individual units’ error distributions. Most theorems of CTT (e.g., Lord and Novick 1968) are formulated from this across-units perspective, which allows one to talk about the correlation of true scores with other variables, for instance.

More formally, CTT refers to a (joint) random experiment of (a) sampling an observational unit u (such as a person) from a set $\Omega_U$ of units (called the population), and (b) registering one or more observations out of a set $\Omega_O$ of possible observations. The set of possible outcomes of the random experiment is the set product $\Omega = \Omega_U \times \Omega_O$. The elements of $\Omega_O$, the observations, might be qualitative (such as ‘answering in category a of item 1 and in category b of item 2’), quantitative (such as reaction time and alcohol concentration in the blood), or consisting of both qualitative and quantitative components. In Psychology, the measurements are often defined by test scoring rules prescribing how the observations are transformed into test scores. (Hence, these measurements are also often called ‘tests’ or ‘test score variables.’) These scoring rules may just consist of summing initial scores of items (defining a psychological scale) or
might be more sophisticated representations of observable attributes of the units. CTT does not prescribe the definition of the test score variables. It just additively decomposes them into true score variables and error variables. Substantive theory and empirical validation studies are necessary in order to decide whether or not a given test score variable is meaningful. CTT only helps in disentangling the variances of its true score and error components.

Referring to the joint random experiment described above, the mapping $U\colon \Omega \to \Omega_U$, $U(\omega) = u$ (the unit or person projection), may be considered a qualitative random variable having a joint distribution with the test score variables $Y_i$. Most theorems of CTT deal with two or more test score variables (tests) $Y_i$ and the relationship between their true score and error components. (The index i refers to one of several tests considered.)

2.2 The core concepts: True score and error variables
Using the primitives introduced above, the true score variable $\tau_i := E(Y_i \mid U)$ is defined by the conditional expectation of the test $Y_i$ given the variable U. The values of the ‘true score variable’ $\tau_i$ are the conditional expected values $E(Y_i \mid U = u)$ of $Y_i$ given the unit u. They are also called the ‘true scores’ of the unit u with respect to $Y_i$. Hence, these true scores are the expected values of the intraindividual distributions of the $Y_i$. The ‘measurement error variables’ $\varepsilon_i$ are simply defined by the difference $\varepsilon_i := Y_i - \tau_i$. Table 1 summarizes the primitives and definitions of the basic concepts of CTT.

Table 1
Basic Concepts of Classical Test Theory
Primitives
  The set of possible events of the random experiment    $\Omega = \Omega_U \times \Omega_O$
  Test score variables                                    $Y_i\colon \Omega \to \mathbb{R}$
  Projection                                              $U\colon \Omega \to \Omega_U$
Definition of the Theoretical Variables
  True score variable                                     $\tau_i := E(Y_i \mid U)$
  Measurement error variable                              $\varepsilon_i := Y_i - \tau_i$

3. Properties of True Score and Error Variables
Once the true score variables and error variables are defined, a number of properties (see Table 2) can be derived, some of which are known as the ‘axioms of CTT.’ However, since the work done by Novick (1966) and Zimmerman (1975, 1976) it is well known that all these properties already follow from the definition of true score and error variables. They are not new and independent assumptions, as was originally proposed (e.g., Gulliksen 1950). None of the equations in Table 2 is an assumption. They are inherent properties of true scores and errors. Hence, trying to test or falsify these properties empirically would be meaningless in just the same way as it is meaningless to test whether or not a bachelor is really unmarried. The property of being unmarried is an inherent part or logical consequence of the concept of a bachelor.

Only one of the ‘axioms of CTT’ does not follow from the definition of true score and error variables: ‘uncorrelatedness of error variables’ among each other. Hence, uncorrelatedness of errors has a different epistemological status from the properties displayed in Table 2 (the other ‘axioms’). Uncorrelatedness of errors is certainly a desirable and useful property, but it might be wrong in specific empirical applications (e.g., Zimmerman and Williams 1977). In fact it is an assumption, and it plays a crucial role in defining models of CTT.

Equation 1 of Table 2 is a simple rearrangement of the definition of the error variable. Equation 2 shows that the variance of a test score variable, too, has two additive components: the ‘variance of the true score variable’ and the ‘variance of the error variable.’ This second property follows from Eqn. 3, according to which a true score variable is uncorrelated with a measurement error variable, even if they pertain to different test score variables $Y_i$ and $Y_j$. Equation 4 states that the expected value of an error variable is zero, whereas Eqn. 5 implies that the expected value of an error variable is zero within each individual observational unit u. Finally, according to Eqn. 6 the conditional expectation of an error variable is also zero for each mapping of U. This basically means that the expected value of an error variable is zero in each subpopulation of observational units.

3.1 Additional Concepts: Reliability, Unconditional and Conditional Error Variances
Although the true score and error variables defined above are the core concepts of CTT, in empirical applications the true scores can only be estimated. It is also possible to estimate the ‘variances’ of the true score and error variables in a random sample (consisting of repeating many times the random experiment described earlier). The variance $\mathrm{Var}(\varepsilon_i)$ of the measurement error may be considered a gross parameter representing the degree of unreliability. A normed parameter of unreliability is $\mathrm{Var}(\varepsilon_i)/\mathrm{Var}(Y_i)$, the proportion of the variance of $Y_i$ due to measurement error. Its counterpart is $1 - \mathrm{Var}(\varepsilon_i)/\mathrm{Var}(Y_i)$, i.e.,

$\mathrm{Rel}(Y_i) := \mathrm{Var}(\tau_i)/\mathrm{Var}(Y_i)$    (1)

the ‘reliability’ of $Y_i$. This coefficient varies between zero and one. In fact, most theorems and most empirical research deal with this ‘coefficient of reliability.’ The reliability coefficient conveys convenient information about the dependability of the measurement ‘in one single number.’
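The reliability coefficient of Eqn. (1) can be illustrated with a small calculation. The following sketch is not part of the original article: the function name `reliability` and the variance values are assumptions chosen only for illustration. It relies on the additive decomposition of the test score variance into true score variance and error variance, and shows that the variance ratio and one minus the normed unreliability give the same number.

```python
def reliability(var_true: float, var_error: float) -> float:
    """Rel(Y) = Var(tau) / Var(Y), using Var(Y) = Var(tau) + Var(eps)."""
    var_total = var_true + var_error
    # The definition only requires a positive, finite test score variance.
    if not (0.0 < var_total < float("inf")):
        raise ValueError("Var(Y) must be positive and finite")
    return var_true / var_total


# Hypothetical variance components, chosen only for illustration.
var_tau, var_eps = 9.0, 1.0
rel = reliability(var_tau, var_eps)
unrel = var_eps / (var_tau + var_eps)  # normed unreliability Var(eps) / Var(Y)
print(rel, 1.0 - unrel)
```

With these illustrative values, nine tenths of the test score variance is true score variance, so the two printed numbers agree.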


Table 2
Properties of True Score and Error Variables Implied by Their Definition
Decomposition of the variables              $Y_i = \tau_i + \varepsilon_i$    (1)
Decomposition of the variances              $\mathrm{Var}(Y_i) = \mathrm{Var}(\tau_i) + \mathrm{Var}(\varepsilon_i)$    (2)
Other properties of true score and error    $\mathrm{Cov}(\tau_i, \varepsilon_j) = 0$    (3)
variables implied by their definition       $E(\varepsilon_i) = 0$    (4)
                                            $E(\varepsilon_i \mid U) = 0$    (5)
                                            for each (measurable) mapping f of U: $E[\varepsilon_i \mid f(U)] = 0$    (6)
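The properties in Table 2 can be checked numerically on simulated data. The sketch below is an illustration under stated assumptions, not a procedure from the article: the distributions, the seed, and all variable names are hypothetical. Errors are generated with mean zero within every unit, which is exactly what defining the true score as the conditional expectation of the test score implies; the sample quantities corresponding to properties (2), (3), (4), and (6) then come out close to zero, whatever the error spread of the individual units.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 400_000  # repetitions of the joint random experiment (unit, then score)

# Assumed setup: true scores of two tests for each sampled unit.
tau1 = rng.normal(50.0, 10.0, n)
tau2 = 20.0 + 0.5 * tau1 + rng.normal(0.0, 3.0, n)

# Errors have mean zero within every unit; their spread may differ by unit.
eps1 = rng.uniform(2.0, 6.0, n) * rng.standard_normal(n)
eps2 = rng.normal(0.0, 4.0, n)
y1, y2 = tau1 + eps1, tau2 + eps2  # property (1): Y = tau + eps


def cov(a, b):
    """Sample covariance of two equally long arrays."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))


var_gap = float(y1.var() - (tau1.var() + eps1.var()))  # property (2)
cov_tau1_eps2 = cov(tau1, eps2)                        # property (3), across tests
mean_eps1 = float(eps1.mean())                         # property (4)
high = tau1 > 60.0                                     # a subpopulation defined via U
mean_eps1_high = float(eps1[high].mean())              # property (6)
print(var_gap, cov_tau1_eps2, mean_eps1, mean_eps1_high)
```

All four printed values should be near zero, up to sampling noise. This mirrors the point made in the text: the properties hold by construction and are therefore not empirically testable hypotheses.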

In early papers on CTT, the reliability of a test was defined by its correlation with itself (e.g., Thurstone 1931, p. 3). However, this definition is only metaphoric, because a variable always correlates perfectly with itself. What is meant is to define reliability by the correlation of ‘parallel tests’ (see below). The assumptions defining parallel tests in fact imply that the correlation between two test score variables is the reliability. Note that the definition of ‘reliability’ via Eqn. (1) does not rest on any assumption other than $0 < \mathrm{Var}(Y_i) < \infty$.

‘Reliability’ is useful for comparing different instruments to each other if they are applied in the same population. Used in this way, reliability in fact helps in evaluating the quality of measurement instruments. However, it may not be useful under all circumstances to infer the dependability of measures of an individual unit. For the latter purpose one might rather look at the ‘conditional error variance’ $\mathrm{Var}(\varepsilon_i \mid U = u)$ given a specific observational unit u or at the ‘conditional error variances’ $\mathrm{Var}(\varepsilon_i \mid \tau_i = t)$ given the subpopulation with true score $\tau_i = t$.

4. Models of Classical Test Theory
The definitions of true score and error variables have to be supplemented by assumptions defining a model if the theoretical parameters such as the reliability are to be computed from empirically estimable parameters such as the means, variances, covariances, or correlations of the test score variables. Table 3 displays the most important of these assumptions and the most important models defined by combining some of these assumptions.

Assumptions (a1) to (a3) specify in different ways the assumption that two tests $Y_i$ and $Y_j$ measure the same attribute. Such an assumption is crucial for inferring the degree of reliability from the discrepancy between two measurements of the same attribute of the same person. Perfect identity or ‘τ-equivalence’ of the two true score variables is assumed with (a1). With (a2) this assumption is relaxed: the two true score variables may differ by an additive constant. Two balances, for instance, will follow this assumption if one of them yields a weight that is always one pound larger than the weight indicated by the other balance, irrespective of the object to be weighed. According to Assumption (a3), the two tests measure the same attribute in the sense that their true score variables are linear functions of each other.

The other two assumptions deal with properties of the measurement errors. With (b) one assumes measurement errors pertaining to different test score variables to be uncorrelated. In (c), ‘equal error variances’ are assumed, i.e., these tests are assumed to measure equally well.

4.1 Parallel Tests

4.1.1 Definition. The simplest and most convenient set of assumptions is the model of ‘parallel tests.’ Two tests $Y_i$ and $Y_j$ are defined to be parallel if they are τ-equivalent, if their error variables are uncorrelated, and if they have identical error variances. Note that Assumption (a1) implies that there is a uniquely defined latent variable being identical to each of the true score variables. Hence, one may drop the
Table 3
Assumptions and Some Models of CTT
Assumptions used to define some models of CTT
  (a1) τ-equivalence              $\tau_i = \tau_j$
  (a2) essential τ-equivalence    $\tau_i = \tau_j + \lambda_{ij}$, $\lambda_{ij} \in \mathbb{R}$
  (a3) τ-congenerity              $\tau_i = \lambda_{ij0} + \lambda_{ij1} \tau_j$, $\lambda_{ij0}, \lambda_{ij1} \in \mathbb{R}$, $\lambda_{ij1} > 0$
  (b) uncorrelated errors         $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$, $i \neq j$
  (c) equal error variances       $\mathrm{Var}(\varepsilon_i) = \mathrm{Var}(\varepsilon_j)$
Models defined by combining these assumptions
  Parallel tests are defined by Assumptions (a1), (b), and (c).
  Essentially τ-equivalent tests are defined by Assumptions (a2) and (b).
  Congeneric tests are defined by Assumptions (a3) and (b).
Note: The equations refer to each pair of tests $Y_i$ and $Y_j$ of a set of tests $Y_1, \ldots, Y_m$, their true score variables, and their error variables, respectively.


Table 4
The Model of Parallel Tests
Definition        Assumptions (a1), (b), and (c) of Table 3
Identification    $E(\eta) = E(Y_i)$
                  $\mathrm{Var}(\eta) = \mathrm{Cov}(Y_i, Y_j)$, $i \neq j$
                  $\mathrm{Var}(\varepsilon_i) = \mathrm{Var}(Y_i) - \mathrm{Cov}(Y_i, Y_j)$, $i \neq j$
                  $\mathrm{Rel}(Y_i) = \mathrm{Corr}(Y_i, Y_j)$, $i \neq j$
Testability
  in the total population          $E(Y_i) = \mu$
                                   $\mathrm{Var}(Y_i) = \sigma^2_Y$
                                   $\mathrm{Cov}(Y_i, Y_j) = \sigma^2_\eta$
  within each subpopulation s      $E^{(s)}(Y_i) = \mu_s$
Note: The indices i and j refer to tests and the superscripts s to a subpopulation. The equations are true for each test $Y_i$ of a set of parallel tests $Y_1, \ldots, Y_m$ or for each pair of two such tests, their true score variables, and their error variables, respectively.
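The identification equations of Table 4 can be tried out on simulated parallel tests. The sketch below is a hedged illustration, not the article's procedure: the population values (latent variance 16, error variance 4), the seed, and the variable names are assumptions. Two test scores share the same latent variable and have uncorrelated errors with equal variances, so the covariance of the two tests should recover the latent variance, the difference between test variance and covariance should recover the error variance, and the correlation should recover the reliability.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n = 400_000

# Assumed population values (illustrative): Var(eta) = 16 and Var(eps) = 4,
# so each parallel test has reliability 16 / (16 + 4) = 0.8.
eta = rng.normal(100.0, 4.0, n)
y1 = eta + rng.normal(0.0, 2.0, n)  # same true score variable,
y2 = eta + rng.normal(0.0, 2.0, n)  # uncorrelated errors, equal error variances

cov12 = float(np.cov(y1, y2)[0, 1])
var_eta_hat = cov12                                # Var(eta) = Cov(Y1, Y2)
var_eps_hat = float(np.var(y1, ddof=1)) - cov12    # Var(eps1) = Var(Y1) - Cov(Y1, Y2)
rel_hat = float(np.corrcoef(y1, y2)[0, 1])         # Rel(Y1) = Corr(Y1, Y2)
print(var_eta_hat, var_eps_hat, rel_hat)
```

The estimates should lie close to the assumed values 16, 4, and 0.8. With real data the same three lines apply only if the parallel-test assumptions hold, which is why the testable consequences listed in the table matter.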

index i and denote this latent variable by η. The as- be computed by the ‘Spearman-Brown formula for
sumption of τ-equivalence may equivalently be writ- lengthened tests:’
ten $Y_i = \eta + \varepsilon_i$, where $\varepsilon_i := Y_i - E(Y_i \mid U)$.
$\mathrm{Rel}(S) = \mathrm{Rel}(S/m) = \dfrac{m \cdot \mathrm{Rel}(Y_i)}{1 + (m - 1) \cdot \mathrm{Rel}(Y_i)}$
4.1.2 Identification. For parallel tests the theoreti-
cal parameters may be computed from the para-
meters characterizing the distribution of at least two Using this formula the reliability of an aggregated
test score variables, i.e., the theoretical parameters measurement consisting of the sum (or average) of m
are identified in this model if m ≥ 2. According to parallel measurements of the same unit can be com-
Table 4 the expected value of η is equal to the ex- puted. For m = 2, each with Rel(Yi) = 0.80, for
pected value of each of the tests, whereas the vari- instance
ance of η can be computed from the covariance of
two different tests. The variance Var(εi) of the measure- $\mathrm{Rel}(S) = \dfrac{2 \cdot 0.80}{1 + (2 - 1) \cdot 0.80} \approx 0.89$
ment error variables may be computed by the dif- The Spearman–Brown formula may also be used to
ference Var(Yi) − Cov(Yi, Yj), i ≠ j. Finally, the re- answer the opposite question. Suppose there is a test
liability Rel(Yi) is equal to the correlation Corr(Yi, Yj) being the sum of m parallel tests and this test has
of two different test score variables. reliability Rel(S). What would be the reliability Rel(Yi)
of the m parallel tests? For example, if m l 2, what
4.1.3 Testability. The model of parallel tests im- would be the reliability of a test half?
plies several consequences that may be tested em-
pirically. First, all parallel tests Yi have equal
expectations E(Yi), equal variances Var (Yi), and 4.2 Essentially τ-equialent Tests
equal covariances Co(Yi, Yj) in the total population.
Second, parallel tests also have equal expectations 4.2.1 Definition. The model of essentially τ-equiva-
within each subpopulation (see Table 4). lent tests is less restrictive than the model of parallel
Note that these hypotheses may be tested separately tests. Two tests Yi and Yj are defined to be ‘essentially
and\or simultaneously as a single multidimensional τ-equivalent’ if their true score variables differ only
hypothesis in the framework of ‘simultaneous by an additive constant (Assumption a in Table 3)
equation models’ via AMOS (Arbuckle 1997), EQS #
and if their error variables are uncorrelated (Assum-
(Bentler 1995), LISREL 8 (Jo$ ereskog and So$ erbom ption b in Table 3). Assumption (a ) implies that
1998), MPLUS (Muthe! n and Muthe! n 1998), MX #
there is a latent variable η that is a translation of
(Neale 1997), RAMONA (Browne and Mels 1998), each of the true score variables, i.e.,
SEPATH (Steiger 1995), and others. Such a sim-
ultaneous test may even include the hypotheses about η l τijλi, λi ? , such that Yi l ηjλijεi
the parameters in several subpopulations (see Table where εi: l YikE(YiQU ) and λi ? 
4). What is not implied by the assumptions of parallel
tests is the equality of the variances and the co- Also note that the latent variable η is uniquely defined
variances of the test score variables in subpopu- up to a translation. Hence, it is necessary to fix the
lations. scale of the latent variable η. This can be done by fixing
For parallel tests Y j(jYm as defined in Table 4, one of the coefficients λi (e.g., λ l 0) or by fixing the
" score S : l Y j(jY may
the reliability of the sum expected value of η [e.g., E(η) l" 0].
" m

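The identification rules for parallel tests (Section 4.1.2 and Table 4) can be illustrated with a minimal Python sketch. It plugs sample moments into the population formulas, which is an assumption of this illustration, since CTT itself is formulated at the population level. The function names `cov` and `identify_parallel` and the score lists are hypothetical.

```python
from statistics import mean


def cov(x, y):
    """Covariance of two equally long score lists (divisor n, not n - 1)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def identify_parallel(y_i, y_j):
    """Apply the identification rules of Table 4 to two tests assumed parallel."""
    var_i, var_j, cov_ij = cov(y_i, y_i), cov(y_j, y_j), cov(y_i, y_j)
    return {
        "E_eta": mean(y_i),                         # E(eta) = E(Y_i)
        "Var_eta": cov_ij,                          # Var(eta) = Cov(Y_i, Y_j)
        "Var_eps_i": var_i - cov_ij,                # Var(eps_i) = Var(Y_i) - Cov(Y_i, Y_j)
        "Rel_Yi": cov_ij / (var_i * var_j) ** 0.5,  # Rel(Y_i) = Corr(Y_i, Y_j)
    }


# Made-up scores of five persons on two tests (illustration only).
scores_1 = [10, 12, 14, 16, 18]
scores_2 = [11, 11, 15, 15, 18]
result = identify_parallel(scores_1, scores_2)
print(result)
```

With real data, the testable consequences listed in Table 4 (equal means, variances, and covariances) would have to be checked before reading these quantities as true score parameters.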

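The Spearman-Brown formula for lengthened tests from Section 4.1, together with the ‘opposite question’ about the reliability of the parts, can be sketched in a few lines of Python. The inverse is obtained here by solving the formula for $\mathrm{Rel}(Y_i)$; it is not printed in the article, and both function names are hypothetical.

```python
def spearman_brown(rel_single, m):
    """Reliability of the sum (or mean) of m parallel tests."""
    return m * rel_single / (1 + (m - 1) * rel_single)


def single_test_reliability(rel_sum, m):
    """Inverse question: reliability of each of the m parallel parts.

    Obtained by solving Rel(S) = m*r / (1 + (m - 1)*r) for r.
    """
    return rel_sum / (m - (m - 1) * rel_sum)


rel_s = spearman_brown(0.80, 2)
print(round(rel_s, 2))                              # 0.89, as in the worked example
print(round(single_test_reliability(rel_s, 2), 2))  # 0.8
```

For a test with reliability 0.89 that consists of two parallel halves, each half therefore has a reliability of about 0.80.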
Table 5
The Model of Essentially τ-Equivalent Tests

Definition
    Assumptions $(a_2)$ and $(b)$ of Table 3
Fixing the scale of $\eta$
    $E(\eta) = 0$
Identification
    $\mathrm{Var}(\eta) = \mathrm{Cov}(Y_i, Y_j)$, $i \neq j$
    $\mathrm{Var}(\varepsilon_i) = \mathrm{Var}(Y_i) - \mathrm{Cov}(Y_i, Y_j)$, $i \neq j$
    $\mathrm{Rel}(Y_i) = \mathrm{Cov}(Y_i, Y_j) / \mathrm{Var}(Y_i)$, $i \neq j$

Table 5 summarizes the assumptions defining the model and the consequences for identification and testability. In this model, the reliability can no longer be identified by the correlation between two tests. Instead the reliability is identified by

$$\mathrm{Rel}(Y_i) = \mathrm{Cov}(Y_i, Y_j) / \mathrm{Var}(Y_i), \quad i \neq j.$$

Furthermore, the expected values of different tests are no longer identical within each subpopulation. Instead, the differences between the expected values $E^{(s)}(Y_i) - E^{(s)}(Y_j)$ of two essentially τ-equivalent tests $Y_i$ and $Y_j$ are the same in each and every subpopulation. All other properties are the same as in the model of parallel tests. Again, all these hypotheses may be tested via structural equation modeling.
For essentially τ-equivalent tests $Y_1, \ldots, Y_m$, the reliability of the sum score $S := Y_1 + \cdots + Y_m$ may be computed by ‘Cronbach’s coefficient α’:

$$\alpha = \frac{m}{m - 1} \left( 1 - \frac{\sum_{i=1}^{m} \mathrm{Var}(Y_i)}{\mathrm{Var}(S)} \right). \qquad (2)$$

This coefficient is a lower bound for the reliability of $S$ if only uncorrelated errors are assumed.

4.3 τ-Congeneric Tests

4.3.1 Definition. The model of τ-congeneric tests is defined by the Assumptions $(a_3)$ and $(b)$ in Table 3. Hence, two tests $Y_i$ and $Y_j$ are called τ-congeneric if their true score variables are positive linear functions of each other and if their error variables are uncorrelated. Assumption $(a_3)$ implies that there is a latent variable $\eta$ such that each true score variable is a positive linear function of it, i.e.,

$$\tau_i = \lambda_{i0} + \lambda_{i1} \eta, \quad \lambda_{i0}, \lambda_{i1} \in \mathbb{R}, \quad \lambda_{i1} > 0,$$

or equivalently:

$$Y_i = \lambda_{i0} + \lambda_{i1} \eta + \varepsilon_i,$$

where $\varepsilon_i := Y_i - E(Y_i \mid U)$.
The latent variable $\eta$ is uniquely defined up to positive linear functions. Hence, in this model, too, it is necessary to fix the scale of $\eta$. This can be done by fixing a pair of the coefficients (e.g., $\lambda_{i0} = 0$ and $\lambda_{i1} = 1$) or by fixing the expected value and the variance of $\eta$ [e.g., $E(\eta) = 0$ and $\mathrm{Var}(\eta) = 1$].
Table 6 summarizes the assumptions defining the model and the consequences for identification and testability assuming $E(\eta) = 0$ and $\mathrm{Var}(\eta) = 1$. Other ways of fixing the scale of $\eta$ would imply different formulas. As can be seen from the formulas in Table 6, the model of τ-congeneric variables and all its parameters are identified if there are at least three different tests for which Assumptions $(a_3)$ and $(b)$ hold. The covariance structure in the total population implied by the model may be tested empirically if there are at least four test score variables. Only in this case does the model have fewer theoretical parameters determining the covariance matrix of the test score variables than there are elements in this covariance matrix. The implications for the mean structure are testable already for three test score variables provided the means of the test score variables are available in at least four subpopulations.

Table 6
The Model of τ-Congeneric Tests

Definition
    Assumptions $(a_3)$ and $(b)$ of Table 3
Fixing the scale of $\eta$
    $E(\eta) = 0$ and $\mathrm{Var}(\eta) = 1$
Identification
    $\lambda_{i1} = \sqrt{\dfrac{\mathrm{Cov}(Y_i, Y_j)\,\mathrm{Cov}(Y_i, Y_k)}{\mathrm{Cov}(Y_j, Y_k)}}$, $i \neq j$, $i \neq k$, $j \neq k$
    $\mathrm{Var}(\varepsilon_i) = \mathrm{Var}(Y_i) - \lambda_{i1}^2$
    $\mathrm{Rel}(Y_i) = \lambda_{i1}^2 / \mathrm{Var}(Y_i)$
Testability
    In the total population:
        $\dfrac{\mathrm{Cov}(Y_i, Y_k)}{\mathrm{Cov}(Y_j, Y_k)} = \dfrac{\mathrm{Cov}(Y_i, Y_l)}{\mathrm{Cov}(Y_j, Y_l)}$, $i \neq k$, $i \neq l$, $j \neq k$, $j \neq l$
    Between subpopulations:
        $\dfrac{E^{(1)}(Y_i) - E^{(2)}(Y_i)}{E^{(1)}(Y_j) - E^{(2)}(Y_j)} = \dfrac{E^{(3)}(Y_i) - E^{(4)}(Y_i)}{E^{(3)}(Y_j) - E^{(4)}(Y_j)}$

Note: The indices $i$ and $j$ refer to tests and the superscripts to one of four subpopulations.

4.4 Other Models of CTT
The models treated previously are not the only ones that can be used to determine the theoretical parameters of CTT such as reliability, true score variance, and error variance. In fact, the models dealt with are limited to unidimensional models. However, true score variables may also be decomposed into several latent variables. ‘Confirmatory factor analysis’ provides a powerful methodology to construct, estimate, and test models with multidimensional decompositions of true score variables. Note, however, that not every factor model is based on CTT. For instance, there are one-factor models that are not models of τ-congeneric variables in terms of CTT. A model with one common factor and several specific but uncorrelated factors is a counterexample. The common factor is not necessarily a linear function of the true score variables and the specific factors are not necessarily the measurement error variables as defined in CTT.

4.5 Some Practical Issues
Once a measurement or test score has been obtained for a specific individual, one might want to know how dependable that individual measurement is. If the reliability of the measurement is known and if one assumes a normal distribution of the measurement errors which is homogeneous for all individuals, the 95 percent confidence interval for the true score of that individual with respect to the measurement $Y_i$ can be computed by:

$$Y_i \pm 1.96 \cdot \sqrt{\mathrm{Var}(Y_i) \cdot (1 - \mathrm{Rel}(Y_i))}. \qquad (3)$$

Another result deals with the correlation between two true score variables. If the reliabilities for two test score variables $Y_1$ and $Y_2$ are known, and assuming uncorrelated measurement errors, one may compute

$$\mathrm{Corr}(\tau_1, \tau_2) = \frac{\mathrm{Corr}(Y_1, Y_2)}{\sqrt{\mathrm{Rel}(Y_1)} \cdot \sqrt{\mathrm{Rel}(Y_2)}}. \qquad (4)$$

This equation is known as the ‘correction for attenuation.’
Another important issue deals with the ‘reliability of a difference variable,’ for example, a difference between a pretest $Y_1$ and a posttest $Y_2$. Assuming equal true score and error variances between pre- and posttest implies identical reliabilities, i.e., $\mathrm{Rel}(Y_1) = \mathrm{Rel}(Y_2) = \mathrm{Rel}(Y)$. If additionally uncorrelated measurement errors are assumed, the reliability $\mathrm{Rel}(Y_1 - Y_2) := \mathrm{Var}[E(Y_1 - Y_2 \mid U)] / \mathrm{Var}(Y_1 - Y_2)$ of the difference $Y_1 - Y_2$ may be computed by:

$$\mathrm{Rel}(Y_1 - Y_2) = \frac{\mathrm{Rel}(Y) - \mathrm{Corr}(Y_1, Y_2)}{1 - \mathrm{Corr}(Y_1, Y_2)} \qquad (5)$$

According to this formula, the reliability of a difference between pre- and posttest is always smaller than the reliability of the pre- and posttest, provided the assumptions mentioned above hold. In the extreme case in which there is no differential change, i.e., $\tau_2 = \tau_1 + \text{constant}$, the reliability coefficient $\mathrm{Rel}(Y_1 - Y_2)$ will be zero. Obviously, this does not mean that the change is not dependable. It only means that there is no variance in the change, since each individual changes by the same amount. This phenomenon has led to much confusion about the usefulness of measuring change (e.g., Cronbach and Furby 1970, Harris 1963, Rogosa 1995). Most of these problems are now solved by structural equation modeling, which allows the inclusion of latent change variables, such as in growth curve models (e.g., McArdle and Epstein 1987, Willet and Sayer 1996) or, more directly, in true change models (Steyer, Eid, and Schwenkmezger 1997, Steyer, Partchev, and Shanahan 2000). Models of this kind are no longer hampered by reliability problems and allow the explanation of interindividual differences in intraindividual change.

5. Discussion
It should be noted that CTT refers to the population level, i.e., to the random experiment of sampling a single observational unit and assessing some of its behavior (see Table 1). CTT does not refer to sampling models that consist of repeating this random experiment many times. Hence, no questions of statistical estimation and hypothesis testing are dealt with. Of course, the population models of CTT have to be supplemented by sampling models when it comes to applying statistical analyses, e.g., via structural equation modeling.
Aside from this more technical aspect, what are the limitations of CTT? First, CTT and its models are not really adequate for modeling answers to individual items in a questionnaire. This purpose is more adequately met by models of item response theory (IRT), which specify how the probability of answering in a specific category of an item depends on the attribute to be measured, i.e., on the value of a latent variable.
A second limitation of CTT is the exclusive focus on measurement errors. ‘Generalizability theory,’ presented by Cronbach et al. (1972) (see also Shavelson and Webb 1991), generalized CTT to include other factors determining test scores.
Inspired by Generalizability Theory, Tack (1980) and Steyer et al. (1989) presented a generalization of CTT, called ‘Latent State-Trait Theory,’ which explicitly takes into account the situation factor, introduced formal definitions of states and traits, and presented models that allow person effects to be disentangled from situation and/or interaction effects and from measurement error. More recent presentations are Steyer et al. (1992) as well as Steyer et al. (1999). Eid (1995, 1996) extended this approach to the normal ogive model for analyses on the item level.
The parameters of CTT are often said to be ‘population dependent,’ i.e., meaningful only with respect to a given population. This is true for the


variance of the true score variable and the reliability coefficient. The reliability (coefficient) of an intelligence test in the population of students differs from that in the general population. This is a simple consequence of the restriction of the (true score) variance of intelligence in the population of the students. However, such a restriction exists neither for the true score estimates of individual persons nor for the item parameters $\lambda_i$ of the model of ‘essentially τ-equivalent tests,’ for instance. Proponents of IRT models have often put forward the population dependence critique. They contrast it with the ‘population independence’ of the person and the item parameters of IRT models. However, ‘population independence’ also holds for the person and the item parameters of the model of essentially τ-equivalent tests, for instance.
In applications of CTT it is often assumed that the error variances are the same for each individual, irrespective of the true score of that individual. This assumption may indeed be wrong in many applications. In IRT models no such assumption is made. However, it is possible to assume different error variances for different (categories of) persons in CTT models as well. In this case, the unconditional error variance and the reliability coefficient are not the best available information for inferring the dependability of individual true score estimates. Instead, one should seek to obtain estimates of conditional measurement error variances for specific classes of persons. It is to be expected that persons with high true scores have a higher measurement error variance than those with medium true scores and that those with low true scores have a higher error variance again. (This would be due to ‘floor and ceiling effects.’) Other patterns of the error variance depending on the size of the true score may occur as well.
Such phenomena do not mean that ‘true scores’ and ‘error scores’ would be correlated; only the ‘error variances’ would depend on the true scores. None of the properties listed in Table 2 would be violated. As mentioned before, the properties listed in Table 2 cannot be wrong in empirical applications. What could be wrong, however, is the true score interpretation of the latent variable in a concrete structural equation model. Misinterpretations of this sort can be most effectively prevented by empirical tests of the hypotheses listed in the testability sections of Tables 4 to 6.
The most challenging critique of many applications of CTT is that they are based on rather arbitrarily defined test score variables. If these test score variables are not well chosen, any model based on them is also not well founded. Are there really good reasons to base models on sum scores across items in questionnaires? Why take the sum of the items as test score variables $Y_i$ and not another way of aggregation such as a weighted sum, or a product, or a sum of logarithms? And why aggregate and not look at the items themselves?
There is no doubt that IRT models are more informative than CTT models if samples are big enough to allow their application, if the items obey the laws defining the models, and if detailed information about the items (and even about the categories of ‘polytomous items,’ such as in ‘rating scales’) is sought. In most applications, the decision how to define the test score variables $Y_i$ on which models of CTT are built is arbitrary, to some degree. It should be noted, however, that arbitrariness in the choice of the test score variables cannot be avoided altogether. Even if models are based on the item level, such as in IRT models, one may ask ‘Why these items and not other ones?’ Whether or not a good choice has been made will become apparent only in model tests and in validation studies. This is true for models of CTT as well as for models of alternative theories of psychometric tests.

See also: Dimensionality of Tests: Methodology; Factor Analysis and Latent Structure: IRT and Rasch Models; Generalizability Theory; Psychometrics; Reliability: Measurement; Test Theory: Applied Probabilistic Measurement Structures

Bibliography
Arbuckle J L 1997 Amos User’s Guide: Version 3.6. SPSS, Chicago, IL
Bentler P M 1995 EQS. Structural Equation Program Manual. Multivariate Software, Encino
Browne M W, Mels G 1998 Path analysis (RAMONA). In: SYSTAT 8.0 - Statistics. SPSS, Inc., Chicago
Cronbach L J, Furby L 1970 How should we measure ‘change’ - or should we? Psychological Bulletin 74: 68–80
Cronbach L J, Gleser G C, Nanda H, Rajaratnam N 1972 The Dependability of Behavioral Measurements: Theory of Generalizability of Scores and Profiles. Wiley, New York
du Toit M, du Toit S 2001 Interactive LISREL: User’s Guide. Scientific Software International, Chicago
Eid M 1995 Modelle der Messung von Personen in Situationen [Models of measuring persons in situations]. Psychologie Verlags Union, Weinheim, Germany
Eid M 1996 Longitudinal confirmatory factor analysis for polytomous item responses: model definition and model selection on the basis of stochastic measurement theory [online]. http://www.ppm.ipn.uni-kiel.de/mpr/issue1/art4/eid.pdf: 1999-7-5
Fischer G H, Molenaar I W 1995 Rasch Models. Springer, New York
Gulliksen H 1950 Theory of Mental Tests. Wiley, New York
Jöreskog K G, Sörbom D 1998 LISREL 8. Users Reference Guide. Scientific Software, Chicago
Lord F M, Novick M R 1968 Statistical Theories of Mental Test Scores. Addison Wesley, Reading, MA
Muthén L, Muthén B 1998 Mplus User’s Guide. Muthén & Muthén, Los Angeles
Neale M C 1997 MX: Statistical Modeling, 4th edn. Department of Psychiatry, Richmond, VA
Novick M R 1966 The axioms and principal results of classical test theory. Journal of Mathematical Psychology 3: 1–18
Rogosa D 1995 Myths and methods: ‘Myths about longitudinal


research’ plus supplemental questions. In: Gottman J M (ed.) The Analysis of Change. Lawrence Erlbaum, Mahwah, NJ
Shavelson R J, Webb N M 1991 Generalizability Theory. A Primer. Sage, Newbury Park
Steiger J H 1995 Structural equation modeling. In: STATISTICA 5 - Statistics II. StatSoft Inc., Tulsa, OK
Steyer R, Ferring D, Schmitt M J 1992 States and traits in psychological assessment. European Journal of Psychological Assessment 8: 79–98
Steyer R, Majcen A-M, Schwenkmezger P, Buchner A 1989 A latent state-trait anxiety model and its application to determine consistency and specificity coefficients. Anxiety Research 1: 281–99
Steyer R, Partchev I, Shanahan M 2000 Modeling true intraindividual change in structural equation models: the case of poverty and children’s psychosocial adjustment. In: Little T D, Schnabel K U, Baumert J (eds.) Modeling Longitudinal and Multiple-Group Data: Practical Issues, Applied Approaches, and Specific Examples. Erlbaum, Hillsdale, NJ, pp. 109–26
Steyer R, Schmitt M, Eid M 1999 Latent state-trait theory and research in personality and individual differences. European Journal of Personality
Tack W H 1980 Zur Theorie psychometrischer Verfahren: Formalisierung der Erfassung von Situationsabhängigkeit und Veränderung [On the theory of psychometric procedures: formalizing the assessment of situational dependency and change]. Zeitschrift für Differentielle und Diagnostische Psychologie 1: 87–106
Thurstone L L 1931 The Reliability and Validity of Tests. Edwards Brothers, Ann Arbor
Zimmerman D W 1975 Probability spaces, Hilbert spaces, and the axioms of test theory. Psychometrika 40: 395–412
Zimmerman D W 1976 Test theory with minimal assumptions. Educational and Psychological Measurement 36: 85–96
Zimmerman D W, Williams R H 1977 The theory of test validity and correlated errors of measurement. Journal of Mathematical Psychology 16: 135–52

R. Steyer

Classification and Typology (Archaeological Systematics)

Classification is the initial means through which we impose a degree of order on the enormously diverse remains of the human past. As such, it is probably the single most basic analytical procedure employed by the archaeologist. Excavation yields an enormous diversity of materials that are not self-labeling; they must be endowed with identity and meaning by the excavator or the analyst. This is done in the first instance through classification.

1. Classification and Typology

Archaeologists often use the terms classification and typology interchangeably, but in this article a distinction will be made. A classification is any set of formal categories into which a particular field of data is partitioned, while a typology is a particular type of rigorous classification, in which a field of data is divided up into categories that are all defined according to the same set of criteria, and that are mutually exclusive. As will be shown later, most archaeological classifications of artifacts are typologies, while most classifications of cultures are not.

1.1 Archaeological Classification and Culture
The basic organizing concept for most prehistorians, as for most other anthropologists, is the concept of culture, but it is somewhat differently defined in the two cases. The cultural anthropologist conceives of the world as divided into a set of distinct peoples (tribes, nations, or ethnic groups), each of which has its own unique set of behavior patterns and beliefs, very often including its own language, which together constitute a culture. The prehistorian thinks of the ancient world as similarly partitioned, but the various long-vanished peoples can now be recognized only by the distinct kinds of artifact types they left behind. In place of forgotten languages and behavior patterns, every artifact type is treated as tantamount to a deliberate cultural expression: a culture trait. An archaeologically defined ‘culture’ is then a unique combination of artifact, house, and burial types, which are assumed, because of their cultural commonality, to be the remains left by a distinct, self-recognizing people. Those commonalities are recognized above all through processes of classification.

1.2 Kinds of Archaeological Classification
Obviously, any of the different kinds of material remains that archaeologists find can be classified, and there are in fact many different kinds of archaeological classifications and typologies. In the broadest sense, all of them fall into two categories, which may be called analytic and synthetic. Analytic classifications are classifications of one particular kind of object, in which all of the regularly recurring variants are recognized, defined, and named. The things most often classified are those that show a high degree of culturally patterned variability, including various kinds of stone tools and weapons; pottery; beads and other ornaments; house types; and grave types. Classifications of these things are usually typologies; that is, they partition the entire field of variability into a comprehensive set of mutually exclusive categories, because they are very commonly used for sorting and counting the objects found.
Artifact typologies can be made in a wide variety of ways, depending on what criteria of identity are considered important. This in turn will depend on the purpose for which the classification is made. Among the many kinds of artifact classifications it is possible to recognize purely morphological typologies, based on the overall form of objects; stylistic typologies,


which specially emphasize stylistic features; functional classifications, in which objects are classified according to their presumed use; ‘emic’ classifications, in which objects are classified according to criteria believed to have been important to the makers; and distributional typologies, in which objects are classified according to their distribution in time and space.
In addition to the analytic classifications of particular object types, there are also synthetic classifications, in which recurring combinations of different artifact, house, and grave types are taken together to define ‘cultures.’ These classifications are quite different from artifact classifications, in that they are not typologies. That is, they are not used to divide up material into discrete, mutually exclusive units. The boundaries between units are not always sharp, and the criteria of identity are not always uniform. Some ‘cultures’ have been identified primarily on the basis of pottery types, others by stone tool types, and still others by house types. Archaeological ‘cultures’ are above all historical constructs; they are the prehistorian’s basic way of mapping the prehistoric world, by dividing it into units of study which can be thought of as equivalent to peoples.
Culture classifications generally have a chronological as well as a spatial dimension. That is, the classification includes cultures that existed in different areas, but also cultures that existed in different periods of time in the same area. Very often a generalized regional culture, like Anasazi, is divided into a sequence of developmental stages, which in the case of Anasazi are designated as Pueblo I, II, III, IV, and V. Like biological classifications, then, culture classifications often have a genetic component, when later cultures are recognized as ‘descended from’ earlier ones.

2. Historical Background

Although the excavation of ancient sites, for antiquarian purposes, had its beginnings in the Renaissance, the scientific investigation of prehistory began only in the nineteenth century, above all in Scandinavia. It was Danish archaeologists who developed the first ‘culture classification’ (the division of all European prehistory into Stone, Bronze, and Iron Ages), and they also developed the specific artifact typologies on which the ‘Three-Age System’ was based. In the later nineteenth century, especially in France, prehistorians carried the same basic approach much further, dividing up the Stone Age into a whole succession of phases, or cultures, defined by distinctive tool types. In this work, artifact classification was conceived above all as an aid to dating; that is, to the placing of prehistoric remains in their proper chronological order.
In the Americas there was a general belief that prehistoric Indian remains were not more than two or three millennia old, and consequently there was not much interest in sorting out their chronology. American prehistorians were much more struck by the spatial than by the chronological variability of the indigenous cultures, as they began to recognize the very wide diversity of pottery types, tool types, and house types that had been used in different parts of the continent. Beginning in the early twentieth century, they set about defining and naming a whole panoply of localized cultures and subcultures on the basis of these variable traits. As their work intensified and their methods improved, however, they also became aware of temporal differences among the remains they studied, again based on typological features. In 1927 A. V. Kidder and his associates proposed the ‘Pecos Chronology,’ in which prehistoric and historic Southwestern remains were assigned to seven developmental phases, designated as Basket Maker II and III, and Pueblo I–V, each with its defining typological characteristics. Within a decade, similar chronological schemes had been devised in many other parts of North and Middle America.
Once they were drawn into the classificatory enterprise, prehistorians both in the Old and the New Worlds devoted much of their energies to the development of what have been called ‘time–space grids,’ which were to become the basic map of prehistory. The prehistoric world was divided into a set of cultures, and most cultures were further divided into developmental phases, strictly on typological grounds. To a very large extent, the schemes that were developed remain in use down to the present day.
By the middle of the twentieth century the ‘time–space’ grids were mostly in place, at least over North America and Europe, and archaeologists began making classifications for new purposes. As a result of the strong influence of functionalist anthropology at that time, it was argued that classifications should emphasize what the objects were used for, or what they meant to the makers and users, rather than simply what was useful to the archaeologist for purposes of identity and dating. The functionalist paradigm lasted for about a generation, and then was replaced by what may be called the nomothetic paradigm. It was argued, in the 1960s and 1970s, that scientific archaeology should devote itself not to historical issues of cultural development but to the testing of general, causal hypotheses about culture processes, and classifications should be developed that would aid in that process. In reality very few of them ever were; the nomothetic paradigm, as applied to classification, was much more a lofty ideal than a practical reality.
A subsequent revolution, or at least an anticipated revolution, came about with the introduction of computers. The practical problem in making typologies was always that of limiting the attributes to be considered to a finite number, without at the same time introducing the bias of human judgment. It was believed, however, that if all the possible attributes of a group of objects were fed into a computer, the


machine itself could determine, on a purely quantitative basis, which were and were not important. Thus was born the concept of ‘numerical taxonomy,’ in which computers would designate types, and differentiate them from other types, purely on the basis of the numbers of shared traits, regardless of what the traits were. The goal was ‘automatic classification,’ in which human judgment would be altogether eliminated.
After two decades of experimentation, however, this goal was found to be illusory. Unless there was some preselection of attributes to be coded, based on human judgmental decisions, the classifications produced by computers were far too cumbersome and particularized to have any practical utility. Every partitioning scheme that was tried produced hundreds of types. Moreover, the ‘types’ they produced could not be shown to have meaning with reference to any specific purpose. As a result, most of the typologies that are in use today are still those that were developed in the earlier part of the century, in the heyday of ‘time–space partitioning.’

3. Artifact Classifications and Types

Basic to all artifact classifications is the concept of ‘type.’ Whatever kind of material is being classified (pottery or stone tools, for example), it is partitioned into a set of mutually exclusive categories that are usually called types. The type concept is actually a good deal more complex than it at first appears, and it has been the subject of various controversies that will be considered later. A type consists in the first instance of a body of objects having common features that set them apart from other objects. However, the type concept also includes our ideas about the things and what they have in common, and the words and sometimes the pictures that we use to describe them. Every type, in short, has members (actual objects), a description, a definition, and a name.
It is important to notice that these things may be modified independently of one another; we may refine our notions about what defines a particular type, based on the finding of additional material, but we may also find better ways of defining and describing the type, even if no new material is found. We may find that some characteristics are important that we had formerly ignored. Useful type concepts are always and necessarily mutable: they evolve continually as more material is found, but also as we develop new ideas about what is and is not important.
Within any typological system, the types must have two characteristics: identity and meaning. A type [...] is no way of differentiating the pottery made at Pueblo Bonito from that made at other nearby sites. On the other hand, it is also possible to designate types that are readily recognizable (for example, all vessels having scratch marks on one side) but that have no evident significance for any purpose.

3.1 The Criterion of Identity
The basic criteria of identity for artifact types have been designated as variables and attributes. To make the distinction in the simplest terms, ‘color’ is a variable, while ‘red’ is one of the attributes of the color variable. Artifact types are never designated on the basis of all their visible attributes. To do so would result in a typology in which every single object was a separate type, since no two things are ever absolutely identical. Rather, certain variables are selected out as a basis for the differentiation of types, while others are ignored. To cite one example, color is usually treated as a significant variable in the case of pottery types, because it is something produced deliberately by the vessel makers, whereas it is nearly always ignored in classifications of stone tools, because it is an accidental property of the lithic material selected. Some qualities are ignored simply because they do not vary: they are common to all of the types in a typology.
Every variable has a specified set of attributes, and these also are selected in accordance with the needs of the typologist. How much distinction is made between attributes of the same variable (colors, for example) will depend partly on their identifiability, but also on how much hair-splitting is necessary for the typologist’s purposes. Pottery vessels made in the prehistoric American Southwest may exhibit a very wide variety of surface colors, but typologists have generally been content to assign them to five color categories: white wares, yellow wares, buff wares, orange wares, and red wares.

3.2 The Question of Purpose
Above all, it is the typologist’s purpose that determines which variables and which attributes are selected in making a classification. Artifact classifications can in practice be made for a very wide variety of purposes, and classifications that yield meaningful results for one purpose may not do so for another.
The various purposes that may be served by artifact classifications can for convenience be characterized as basic, ancillary, and instrumental. Basic purposes are served when we classify objects in such a way as to
which cannot be recognized by any objective measure learn or to express something important about the
has obviously no practical utility. It would be possible, objects themselves. Pottery vessels, for example, may
in theory, to conceive of a type including all of the be classified on the basis of their constituent clays and
pottery made at Pueblo Bonito between 1100 and 1125 tempers, which indicate where they were made, or they
CE, and such a type would have enormous interpretive may be classified on the basis of vessel shapes, which
utility if it could be recognized. In fact, it cannot: there may indicate what they were used for, or they may be

Classification and Typology (Archaeological Systematics)

classified on the basis of decorative designs, which will say something about the cultural preferences of the makers and users.
Objects, however, may also be classified for ancillary purposes: not because we want to learn or say something about the material itself, but because we want to use it as a guide to other understandings. Pottery types and certain stone tool types have long been treated as ‘index fossils’ or horizon markers, to identify a particular culture or a particular period in time. Their presence in a site may enable us to date that site within a century or even a generation, or to say that it was inhabited by a particular people and not by another, contemporary people. Some rather elaborate and highly particularized pottery classifications have been developed primarily as an aid to dating sites. Classifications made for this purpose will place special emphasis on whatever features show the most recognizable variability in time and space, whether or not any functional significance can be attached to them. Other kinds of ancillary classifications have been developed in order to yield information about manufacturing technologies, about resource acquisition, and for other purposes.
Some classifications are also made purely in the interest of practical convenience; for example, economy of description. Editors will usually not allow the archaeologist an unlimited number of pages in which to describe a mass of finds, such as beads or cutting tools. For economy of space they must be described in groups rather than individually. Some classifications are also made for the same reason that library books are classified: there must be a coherent way of dividing up the material into groups, for purposes of storage.

3.3 Frequency Seriation

When artifact types are used as a basis for the dating of sites, this is often done through the technique of frequency seriation. We recognize that cultures do not evolve in time through a series of instantaneous leaps, in which old artifact types are suddenly and totally replaced by new ones. Rather, there is a gradual process of transformation and replacement, in which some new types are becoming increasingly common at the same time that older ones are becoming less common. Consequently, sites may be assigned to a particular developmental phase, such as Pueblo II or Pueblo III, not on the basis of types absolutely present or absent, but on the percentages of particular types present or absent.
Obviously, frequency seriation requires quantification: the actual counting of the numbers of each artifact type present. It is for this reason that artifact typologies must be different from other kinds of classifications, such as culture classifications. A typology is a sorting and counting system, and as a result it must have a degree of rigor not necessarily found in other classifications. The complete typology must be a comprehensive set of categories (types), such that there is one and only one type for each object found. The types must all be defined on the basis of the same set of criteria, and they must be mutually exclusive.
By way of summation, it may be said that a typology is a conceptual system made by partitioning a specified field of entities into a comprehensive set of mutually exclusive categories (types), according to a uniform set of criteria dictated by the purposes of the typologist. Within any typology, each type is a category created by typologists, into which they can place discrete objects having specific identifying characteristics, to distinguish them from objects having other characteristics, in a way that is meaningful to the purposes of the typology.

4. Problems and Controversies: the ‘Typological Debate’

Everyone recognizes that archaeological types are not self-labeled; it is the classifiers who give them names and definition. There has nevertheless been a very long-running debate over whether our types are ‘natural’ or ‘artificial.’ Are we merely ‘finding the joints in nature,’ as one proponent has it, or are we imposing our own artificial order on nature? In reality, both things are true: nearly all artifact types are partly natural and partly artificial. They are natural in that the differences between one object and another have objective reality; they were not created by us. On the other hand it is we, the typologists, who decide which distinguishing characteristics we will focus on, and which we will ignore, in making a typology. There may often be varying degrees of ‘naturalness’ between types in the same typology. Some types will stand out very sharply in a great many respects, while we may decide to differentiate other types only because of minor stylistic differences that are nevertheless important for dating purposes.
A related question is whether types should be created by object clustering or by attribute clustering. Should we begin our typology by dividing up a collection of objects into groups that look intuitively similar to one another, or should we first decide which variables and attributes will be important, and then decide that all unique combinations of those attributes will automatically constitute a type? Again, the practical reality lies between the two positions. Virtually all useful typologies develop dialectically through a feedback between object clustering and attribute clustering. We begin, necessarily, with a collection of objects, and make some initial observations about what seem to be the most obvious differences, on the basis of which we divide them into types. As more material accumulates, however, our ideas about what is and is not important change, and we may add some new criteria of differentiation while eliminating others.
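The attribute-clustering extreme of this debate, in which every unique combination of selected attributes constitutes a type, can be illustrated with a minimal Python sketch. The sherd records, the variable names, and the function attribute_cluster are hypothetical and serve only to show how the choice of variables determines the number of provisional types; they are not drawn from any published typology.

```python
from collections import defaultdict


def attribute_cluster(objects, variables):
    """Group objects so that each unique combination of attribute values
    on the selected variables forms one provisional type."""
    types = defaultdict(list)
    for obj in objects:
        # The tuple of attribute values is the (provisional) type definition.
        key = tuple(obj[v] for v in variables)
        types[key].append(obj["id"])
    return dict(types)


# Hypothetical pottery sherds described on three variables.
sherds = [
    {"id": "s1", "color": "red", "temper": "sand", "rim": "flared"},
    {"id": "s2", "color": "red", "temper": "sand", "rim": "straight"},
    {"id": "s3", "color": "white", "temper": "sherd", "rim": "flared"},
    {"id": "s4", "color": "red", "temper": "sand", "rim": "flared"},
]

# Two variables give two provisional types; adding a third splits one of them.
by_color_temper = attribute_cluster(sherds, ["color", "temper"])
by_all = attribute_cluster(sherds, ["color", "temper", "rim"])
```

Adding the rim variable separates s2 from s1 and s4. Whether that finer split is useful or mere hair-splitting depends, as the text stresses, on the purpose of the typology.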


Often we will find that we have split hairs too finely in the differentiation of some types, and not finely enough in other cases.
Types, we say, must be defined by a combination of ‘internal cohesion and external isolation.’ They must have features that are common to all of their members, but they must also lack features that are possessed by the members of other types. Archaeologists, however, have differed in their emphasis on one or another of these characteristics. Some have argued that types must be defined by central tendencies, without a strict definition of their boundaries; others have insisted that if typologies are to be used as sorting systems, every type must have clear boundaries. There is no one correct solution to this problem; it will depend to a considerable extent on the purpose for which the typology is to be used. The sharper the type boundaries, the more useful is the typology for sorting purposes. It must be recognized, however, that in practice there are very few artifact types that do not exhibit some fuzziness at the boundaries. The sorter will often, like a baseball umpire, have to make purely arbitrary decisions in borderline cases.
Although the theoretical literature on archaeological classification is voluminous, much of it bears little relation to what really happens in practice. There are two reasons for this disjunction. First, most of the literature refers to closed classifications, intended to classify only material already in hand. Such classifications can be as rigidly formal and immutable as the classifier wishes. In practice, however, the vast majority of artifact classifications are open systems, intended for the processing of future finds as well as for material already in hand. Such systems must necessarily be mutable: capable of continual adjustment as more material comes to hand.
Second, too many authors have ignored the fact that types must have not only identity, but also meaning relevant to some specific purpose or purposes. As we have seen above, a great many legitimate purposes may be served by archaeological classifications, and the nature of the classifications will vary accordingly.

See also: Archaeology and Philosophy of Science; Ceramics in Archaeology; Classification: Conceptions in the Social Sciences; Culture as Explanation: Cultural Concerns

Bibliography

Adams W Y, Adams E W 1991 Archaeological Typology and Practical Reality. Cambridge University Press, Cambridge, UK
Cormack R M 1971 A review of classification. Journal of the Royal Statistical Society, Series A, 134: 321–53
Doran J E, Hodson F R 1975 Mathematics and Computers in Archaeology. Harvard University Press, Cambridge, MA
Dunnell R C 1971 Systematics in Prehistory. Free Press, New York
Dunnell R C 1986 Methodological issues in Americanist artifact classification. Advances in Archaeological Method and Theory 9: 149–207
Ford J A 1954 The type concept revisited. American Anthropologist 56: 42–54
Gardin J-C 1962 Archaeological Constructs. Cambridge University Press, Cambridge, UK
Kaplan A 1984 Philosophy of science in anthropology. Annual Review of Anthropology 13: 25–39
Klejn L 1982 Archaeological Typology, trans. Dole P. BAR International Series 153
Krieger A D 1944 The typological concept. American Antiquity 9: 271–88
Marquardt W H 1978 Advances in archaeological seriation. Advances in Archaeological Method and Theory 1: 257–314
McKern W C 1939 The Midwestern Taxonomic Method as an aid to archaeological culture study. American Antiquity 4: 301–13
Rouse I 1960 The classification of artifacts in archaeology. American Antiquity 25: 313–23
Rouse I 1967 Seriation in archaeology. In: Riley C L, Taylor W W (eds.) American Historical Anthropology. Southern Illinois University Press, Carbondale, IL, pp. 153–96
Sokal R R, Sneath P H A 1963 Principles of Numerical Taxonomy. W. H. Freeman & Co, San Francisco, CA
Spaulding A G 1960 Statistical description and comparison in artifact assemblages. In: Heizer R F, Cook S F (eds.) The Application of Quantitative Methods in Archaeology. Viking Fund Publications in Anthropology 28: 60–83
Whallon R, Brown J A (eds.) 1982 Essays on Archaeological Typology. Center for American Archaeology Press, Evanston, IL

W. Y. Adams

Classification: Conceptions in the Social Sciences

Classification is the assignment of objects to classes. For example, an educational researcher might want to establish a taxonomy of teaching styles that covers all possible approaches to teaching. A psychologist studying personality might be interested in whether children can be grouped into categories according to their patterns, or profiles, of personality traits. A sociologist might be interested in whether certain combinations of characteristics of urban areas (average socioeconomic status, crime rate, building types, etc.) occur much more often than other combinations. A biologist might want to study whether animals showing a particular phenotype have specific combinations, or patterns, of genetic codes. In all these cases, objects (teachers, children, urban areas, animals) are classified based on their patterns of some observable characteristics (teaching behaviors, personality traits, city characteristics, genes).
The task of classifying objects poses several problems. The objects to be classified, the properties based on which they are classified, and the way of assessing

Classification: Conceptions in the Social Sciences

similarities among objects have to be specified. The aim is to identify individual classes, to decide how many classes are warranted, and to establish procedures for identifying which class each object should be assigned to. Eventually, the success of the classification system must be evaluated. There are problems associated with each of these steps, and different solutions have been suggested for each problem.
In a sense, this paper recreates the process of establishing and testing a classificatory system. First, we define the topic more precisely, clarify the terminology, and provide some examples. Then, we present an outline of the first steps in a classification process and discuss one of the most debated issues: What is the concept of a class? After this, we deal briefly with procedures for assigning objects to classes and with the frequently neglected question of how to evaluate the resulting classification system.

1. Definition of the Topic

1.1 Terminology

Depending on the research tradition, the objects to be classified into a system are called elements, cases, units, exemplars, specimens or items. They are the sources or ‘carriers’ of properties, characteristics or variables. These properties may be dichotomous or polytomous, qualitative or quantitative. A property can only be useful in a classification if it varies within the set of objects, that is, if at least two different values (categories, states, labels) on the respective property occur in the sample. When more than one property is used to characterize an object, the object can be described as a vector of values, a profile, a set of symptoms, or a pattern of features.
Sometimes, the data do not consist of objects and their properties, but of measures of relations between objects, such as their similarity, likeness, or belonging together. For example, a researcher might ask participants to rate the similarity between different politicians. The similarity ratings can then be used to classify politicians into groups. Based on this classification, the researcher can then study on which features people’s perceptions of similarities between politicians are based.
The crucial assumption underlying classification is that objects are elements of a class, of a set, of a partition or, in biology, of a taxon. In other terminologies, the terms ‘category’ or ‘cluster’ are also used. Classification is the process of finding classes and of assigning entities to these classes. The end-product of this order-creating process, however, is often also referred to as ‘classification.’ To stress this distinction, the term ‘classification system’ can be used for the end-product, although in clinical psychology and biology the word ‘taxonomy’ is more common. Identification is the assignment of a specific case or object to (usually only) one of the classes.

1.2 Limits of Discussion

In studying classification in the behavioral and social sciences, we have to distinguish between two questions. (a) What theory and empirical evidence are available about how people classify objects? (b) What theories and methods are used to create classification systems? The first question will not be dealt with here (see Concept Learning and Representation: Models). The second question refers to the formal and empirical procedures used for defining classes and the rules that have evolved for assigning cases to classes. This second question is what concerns us here; it will be discussed from a conceptual rather than a statistical point of view. For more statistically oriented discussions of classification methods see, for example, Statistical Clustering; Mixture Models in Statistics; Configurational Analysis.
One commercially ‘booming’ application of classification that this paper does not refer to is in biometric authentication. The aim of these methods is to use combinations of characteristics of individuals to uniquely identify each individual. Thus, conceptually, the task is to assign one of very many cases to a class, when there might be as many classes as cases. In this process, an observed pattern of features, or even an observed set of such patterns, is matched against a stored list of patterns. For example, DNA is used in this way in forensic criminology, as are records of the past modus operandi of individual criminals in detective work. This is, however, a rather unusual kind of classification, because the number of classes is intended to be equal to the number of cases, which is rarely the case in the social and behavioral sciences.

1.3 Purposes of Classification

The fundamental purpose of classification is to find structure. Typically, a large number of objects is reduced to a much smaller set of classes without too much loss of information about the objects. The data thus summarized allow objects to be identified, at least in part, through the class to which they belong. Specifying the boundaries describing a class has several advantages. One is that limits to generalization can be established, and another is that it becomes possible to generate predictions about how different classes are composed and how class membership relates to other variables.

2. Some Examples of Classifications

The best-known examples of classifications are from the natural sciences, rather than the social sciences. A well-known, still used, and expanding classification is Mendeleev’s Table of Elements. It can be viewed as a prototype of all taxonomies in that it satisfies the following evaluative criteria: (a) Theoretical foundation: A theory determines the classes and


their order. (b) Objectivity: The elements can be observed and classified by anybody familiar with the table of elements. (c) Completeness: All elements find a unique place in the system, and the system implies a list of all possible elements. (d) Simplicity: Only a small amount of information is used to establish the system and identify an object. (e) Predictions: The values of variables not used for classification can be predicted (number of electrons and atomic weight), as well as the existence of relations and of objects hitherto unobserved. Thus, the validity of the classification system itself becomes testable.
Another successful classification system is biological taxonomy. Indeed, most attempts to formalize classification have some intellectual roots in this tradition (Sokal and Sneath 1963). The result of such classification is frequently depicted as a ‘phylogenetic tree,’ today often the result of comparative genomics. In biological taxonomy, however, theory is not so strong as to warrant completeness, as in the Table of Elements (e.g., how should one deal with archaebacteria?). Moreover, the identification of a specimen requires information from morphology and sometimes from behavioral observation. In addition, the system abounds with nested criteria. And, compared with physics, predictions of future developments or of ‘missing links’ in biological taxonomy are vague. However, the classes of the phylogenetic system are still useful because, at the very least, they indicate boundaries to generalization.
In the behavioral and social sciences, hundreds of classifications are published every year. Noteworthy examples are Bloom’s taxonomy of educational objectives (Krathwohl et al. 1964), as well as the DSM (Diagnostic and Statistical Manual of Mental Disorders) and ICD (International Classification of Diseases) classification systems used in psychology and psychiatry. None of these systems has been formally derived, however. Instead, they were generated based on ‘experience.’ The resulting classes are so heterogeneous that they acknowledge many exceptions. Also, a phenomenon called ‘comorbidity’ shows that these classification systems are not optimal yet. It refers to the simultaneous existence of two or more disturbances in the same patient. If comorbidity is the rule rather than the exception, then the classification system loses plausibility and practicability.

3. Preparing the Basis of a Classification

3.1 Selecting the Cases

In the beginning of the process of developing a classification, two main questions arise. (a) Which elements are to be differentiated in a classification? One searches for a (complete, if possible) list of cases to be classified. This list is called the ‘extension’ of a classification. (b) Which properties characterize the cases? The list of these properties is called the ‘intension’ of the classification. The answers to these questions already determine, in part, the results of the classification. To quote Hartigan (1982, p. 2): ‘Clearly, the selection of variables to measure will determine the final classification. Some informal classification is necessary before data collection: deciding what to call an object, deciding how to classify measurements from different objects as being of the same variable, deciding that a variable on different objects has the same value.’
Sometimes, no well-defined population of objects is available from which to sample, and a preliminary selection has to be made intuitively. In such cases, future applications of the classification system may result in more or different classes from those originally obtained.

3.2 Specifying the Properties

The question of feature selection (selection of the properties on which the classification will be based) arises at two points in the process of classifying. First, as mentioned above, it arises at the very beginning of the process. The second opportunity to select properties comes when an established procedure is tested for identification. This problem is very similar to one in regression analysis: Which variables should be retained because they discriminate best between the classes? (See Pankhurst 1991.) Even if computational problems do not play a role, use of too many properties can still be problematic if measuring these variables is expensive or dangerous. In both instances of selecting properties, reliability is a very important issue. With decreasing reliability of the measurement of the properties, the identification of classes becomes more difficult.
Another important question is whether the values of the properties should be transformed before searching for classes. The results of most classification procedures will be influenced by transformations. If differences in variability between the variables are of substantive importance, no transformations that equate variability across variables should be used. The use of transformations is also called ‘a priori weighting.’ ‘A posteriori weighting’ refers to cases in which different variables are given different emphasis in the identification process.
Especially in routine applications, a good strategy for selecting properties to be retained in the final classification might be to find a minimal set of variables sufficient to discriminate between all the classes. Relative to the set of all variables, the minimal set may not be unique. If this is the case, one will often prefer a set with few practical problems, and replace properties and take other aspects, such as minimization of costs, into account (see Pankhurst 1991 for an algorithm).
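The idea of a minimal discriminating set, and of its possible non-uniqueness, can be made concrete with a small brute-force sketch in Python. This is not Pankhurst's algorithm, which is not reproduced here; it is an exhaustive search that is feasible only for a handful of variables. The class profiles, the variable names v1 to v4, and the measurement costs are invented for illustration, and the sketch assumes that each class is described by a single, error-free profile.

```python
from itertools import combinations


def minimal_discriminating_sets(class_profiles):
    """Return all smallest subsets of variables whose values distinguish
    every class from every other class.

    class_profiles maps a class name to a dict {variable: value}.
    """
    variables = sorted(next(iter(class_profiles.values())))
    for size in range(1, len(variables) + 1):
        found = []
        for subset in combinations(variables, size):
            # A subset discriminates if no two classes share a signature.
            signatures = {
                tuple(profile[v] for v in subset)
                for profile in class_profiles.values()
            }
            if len(signatures) == len(class_profiles):
                found.append(subset)
        if found:
            return found  # all minimal sets of the smallest workable size
    return []  # some classes have identical profiles


# Hypothetical binary profiles for four classes.
classes = {
    "A": {"v1": 0, "v2": 0, "v3": 1, "v4": 0},
    "B": {"v1": 0, "v2": 1, "v3": 1, "v4": 1},
    "C": {"v1": 1, "v2": 0, "v3": 0, "v4": 1},
    "D": {"v1": 1, "v2": 1, "v3": 0, "v4": 0},
}
minimal_sets = minimal_discriminating_sets(classes)

# Hypothetical measurement costs, used to choose among equally small sets.
cost = {"v1": 5, "v2": 1, "v3": 2, "v4": 1}
cheapest = min(minimal_sets, key=lambda s: sum(cost[v] for v in s))
```

In this invented example several two-variable sets discriminate equally well, so a secondary criterion such as cost decides between them, which mirrors the trade-off described in the text.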


3.3 Determining the Similarity

After the objects to be classified and their relevant properties have been selected, the similarity between objects is determined. Similarity is a key concept in classification. As was mentioned earlier, there are two basic ways to obtain similarity measures: The researcher can either collect similarity judgments from participants, or derive similarity measures from the empirical co-occurrence of properties. Methods for obtaining similarity judgments in the context of the first approach (similarity ‘in the eyes of the participants’) are discussed in the context of data theory (Coombs 1964). ‘Proximity’ measures can also be derived from confusion or generalization data, association probabilities, substitutability ratings, sorting procedures, and so on. The second case (‘similarity in the mind of researchers’) amounts to comparing the feature patterns of objects and describing the similarity between objects using similarity coefficients. A large, and still growing, number of these coefficients exist, and monographs on classification devote a lot of space to them. Therefore, the choice of one particular coefficient should be explicitly justified. An analysis of the metric properties of coefficients is given by Gower and Legendre (1986). To choose a coefficient, one may refer to their axiomatic foundation (Baulieau 1999).
Another important distinction in the selection of similarity coefficients refers to ‘negative matching,’ that is, whether agreement between two objects on the absence, rather than the presence, of a property should be included in the similarity measure. Jaccard’s (1908) coefficient excludes negative matchings.
Although most classification methods make use of similarity information, clustering models exist that do not refer to similarity. Another aspect that might be taken into account is the concept of similarity used: Why only use pairwise co-occurrences, and not higher order contingencies (Daws 1996)?

4. Establishing a Classification

After a measure of similarity has been selected, the next step is the actual classification of the objects based on the similarities between them. Formally (e.g., Biggs 1999), forming classes can be thought of as equivalent to (a) partitioning a set into subsets, (b) classifying a set of objects, and (c) distributing a set of objects into a set of ‘boxes.’ These various perspectives differ markedly in their implications for classification. For example, in most mathematical conceptualizations, an element is classified into exactly one class. Some clustering procedures, however, allow for residual elements, which are not considered clusterable.
Depending on the approach to classification that a researcher has chosen, certain considerations and precautions are necessary. For example, similarity judgment data may not fulfill some necessary assumptions: Generally, related objects are to be located in the same class. The relation xRy, read ‘x is related to y,’ has the properties of reflexivity (xRx), symmetry (xRy ⇒ yRx), and transitivity (xRy and yRz ⇒ xRz). These are the properties of an equivalence relation. If the empirical data are similarity judgments, they do not necessarily fulfill this relation. Some properties of this relation can be tested statistically. Another important consideration applies if the objects are classified by using rules referring to features. In such cases, these rules need to be free of contradictions (for a test, see Feger 1994).
Only a few substantive theories in the behavioral and social sciences allow one to deduce the number and kind of classes needed to describe a given range of phenomena. Therefore in many cases inductive procedures have to be used to generate classes. For this, one needs a concept of what constitutes a class. Many researchers apply inductive classification methods without ever considering explicitly the class concept that their method implies. The following part of the paper gives a brief discussion of the class concepts implied in frequently used methods for finding classes. The list is not complete, and the ‘cluster analysis proper’ dominates all other approaches, because of the frequency of its use.
Before discussing class concepts in detail, one more general distinction needs to be made. If classes are defined by properties of objects, two levels of definition can be distinguished. A general definition specifies the relationship between the properties and the classes. Specific definitions provide detailed translations of the general definition into formal operations for assigning the objects to the classes. Obviously there can be many different specific definitions. General definitions can be ordered by the kind and amount of variability they allow among objects within the class. There are two general positions with respect to within-class variability. The ‘monothetic’ position (Sutcliffe 1993) assumes that a class is defined by one or a few necessary properties. The ‘polythetic’ counter-position (Gyllenberg and Koski 1996) assumes that some properties of a specified total set, not necessarily the same for every object, are sufficient. According to this position, a property is shared by most, but not necessarily all, objects of a given class. Proponents of the monothetic camp tend to stress that some properties are more important than others, and that these properties should be used to establish the classification. The opposite position assumes equal importance of all properties. As a third type of general definition, one may add definitions referring to a ‘prototype,’ that is, the most typical example of a class or a hypothetical mean object. In this last case, ‘closeness’ or similarity decides about class membership, and the prototype may be defined with or without allowing for variation in its properties.
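The three general class definitions can be contrasted in a short Python sketch. It represents each object as the set of properties it possesses. The threshold k in the polythetic rule and the use of Jaccard's coefficient (which, as noted in Section 3.3, ignores negative matches) for closeness to a prototype are illustrative assumptions, not prescriptions from the literature cited; all property labels and prototype names are hypothetical.

```python
def monothetic_member(obj, required):
    """Monothetic rule: every necessary property must be present."""
    return required <= obj


def polythetic_member(obj, pool, k):
    """Polythetic rule: at least k properties from the pool must be present;
    which ones are present may differ from object to object."""
    return len(obj & pool) >= k


def jaccard(a, b):
    """Jaccard similarity of two property sets; joint absences do not count."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def nearest_prototype(obj, prototypes):
    """Prototype rule: assign the object to the class with the closest prototype."""
    return max(prototypes, key=lambda name: jaccard(obj, prototypes[name]))


# Hypothetical objects and class definitions.
required = {"a", "b"}
pool = {"a", "b", "c", "d"}
prototypes = {"P1": {"a", "b", "c"}, "P2": {"d", "e", "f"}}

x = {"a", "b", "e"}
y = {"b", "c", "d"}

x_mono = monothetic_member(x, required)
x_poly = polythetic_member(x, pool, k=3)
y_mono = monothetic_member(y, required)
y_poly = polythetic_member(y, pool, k=3)
y_proto = nearest_prototype(y, prototypes)
```

Object x satisfies the monothetic rule but not the polythetic one, while y does the reverse, so the same data can yield different memberships depending on which class concept a method implies.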


Given properties as the base for a classification, the actual observations often are represented as a data matrix containing, for example, the objects as the columns and their properties as the rows. The cells of the matrix contain either the values ‘0’ or ‘1’ to indicate the absence or presence of properties, or they contain frequencies, durations, intensities, or symbols (in the case of qualitative polytomous items) that indicate the type or degree of the respective property in the respective object. As this enumeration shows, the procedure can accommodate data of all scale types. The goal now is to find a ‘feature by classes’ matrix, called, corresponding to its purpose, the reference or identification matrix, or simply ‘a classification.’

4.1 Concepts of Classes

Cluster analysis proper. When authors (e.g., Everitt 1993) illustrate the concept of a cluster, they often use two-dimensional graphs to show the clusters as clouds of points (representing the objects). The clouds can have various forms; generally there are ‘gaps’ between the clusters that contain no data points, so that the clusters are isolated from one another. While such explanations of the cluster concept seem intriguing as they invoke classical ‘gestalt’ concepts, it is important

classes. Each step provides a set of classes, from which the researcher has to make his choice. Once a fusion is made, it is irrevocable, so the early fusions should be very reliable. Additive clustering (Shepard and Arabie 1979) is a hierarchical partitioning allowing membership of objects in any number of classes. Here, the classes might be interpreted as properties (Lee 1999).
The cluster concept treated thus far is based on similarity as formally represented either in a space or by set theory. A close relative is prototype theory, popular in cognitive research. A prototype can be defined as a vector of values of selected properties; usually a list of cases as exemplars of this prototype is also available. One fundamental assumption of the prototype-oriented approach can be formulated as follows: If there is high similarity among a set of patterns, these patterns are also similar to an (observed or inferred) prototypical pattern. An inferred pattern could, for example, be the vector of mean values. This pattern has high or maximal similarity to every other pattern. The idea of inferring the prototypical pattern from the data forms a bridge to the similarity-based conception. But the researcher has to be more active in abstracting and defining a specific instance as the prototype.
Contingencies of higher order than similarities between the properties are exploited in some other
to remember that the properties of (good) figures are generalizations of the concept of similarity-based
defined by several ‘laws,’ not just one or two axioms or clustering, such as Configural Frequency Analysis
rules, as in cluster concepts. (Krauth and Lienert 1973) and Pattern-Analytic Clus-
Helpful as visualizations are, the more general tering (McQuitty 1987). For example, Configural
definition of a cluster does not refer to any particular Frequency Analysis identifies combinations of proper-
conception of space, be it dimensional, metric or Eucli- ties that occur more often than expected from some
dean. Set theory defines a cluster as the maximal subset specified base model.
of elements for which proximities within this subset A recent trend, increasing in strength, is to use
are larger than between any elements of the subset and mixture models for clustering. The original purpose of
elements not contained within it. As was discussed these methods was to base classification on a model
above, proximities are information about the extent to that allows for inference-statistical treatment. But they
which objects ‘belong together,’ and could be ex- have since found wider purposes. The basic idea of
pressed in many different ways, for example, as simi- mixture models can be illustrated using the following
larities, distances, ranks, or binary information about example: Assume that a sample of measurements of
set membership. More than one subset may exist; body height is drawn from a human population. While
subsets may be disjointed or overlapping; and they it is known that all the cases are male or female, gender
may or may not be hierarchically ordered. Given this is not recorded for individual respondents. It is,
very broad conceptualization, social scientists have however, possible, based on the distribution of heights
access to a large number of clustering procedures. The in the total sample, to estimate the coefficients of the
large number of options reveals that no ‘one and only’ separate height distributions for men and women.
definition of a cluster can be found. Presumably, the This is done by interpreting each measurement as a
availability of so many approaches is one reason for sum of weighted height measurements for women and
the paucity of comparative studies on methods of for men. These weights are the probabilities for each
clustering. measurement to be from a man and from a woman.
Clustering procedures can be classified as ‘leading ‘Thus the density function of height has been expressed
to a structure that is either hierarchical or non- as a superposition of two conditional density func-
hierarchical.’ The most frequently applied classifi- tions; it is known as a finite mixture density.’ (Everitt
cation procedures are hierarchical, disjointed, and 1993, p. 110).
provide exactly one class for each object. The best Mixture models are based on a ‘space’ concept
known hierarchical procedures are agglomerative, rather than a ‘similarity’ concept; clusters are regions
that is, in a series of partitions they successively and of relative point densities in this space. The assump-
with increasing dissimilarity, fuse the objects into tions for mixture models are comparable with those of

the general linear model: cardinal scale level and multivariate normal (or similar) distributions of the data. A comparatively common mixture model for categorical data is latent class analysis (De Soete 1993).

To conclude this classification of class concepts, one further concept needs to be mentioned. This conception, models for block structure, is close to the raw data matrix and the Aristotelian tradition. A block is a maximal rectangular submatrix combining some objects and some properties with the same (or similar) values in the cells of the data matrix. The scale level of the values is not fixed; and the similarity concept is not invoked in the analytical procedure. In a block, the set of partially similar objects corresponds to the extension of a concept or class. The set of partially similar properties corresponds to the intension. The symmetry in the definitions of intension and extension is fully exploited and preserved (see Feger and De Boeck 1993).

4.2 Evaluation of a Clustering Result

Although model evaluation is only a part of the overall evaluation of a classification (see Sect. 5), it is an important one. As Dunn and Everitt (1982, p. 94) state: ‘Since clustering techniques will generate a set of clusters even when applied to random, unclustered data, the question of validating and evaluating becomes of great importance.’ Jain and Dubes (1988) classify the criteria of validation as follows:

External criteria measure performance by matching a clustering structure to a priori information… Internal criteria assess the fit between the structure and the data, using only the data themselves… Relative criteria decide which of two structures is better in some sense, such as being more stable or appropriate for the data.

Considerable progress has been made in internal statistical cluster evaluation. Statistical procedures exist for testing the existence of ‘natural’ clusters, for testing the adequacy of computed classifications, and for the determination of a suitable number of clusters (see, e.g., Bock 1996). A very plausible way to evaluate any solution, independent of the clustering approach used, is to reproduce, or ‘derive,’ from the solution all information that the solution gives about raw data that would fit with the solution, and then to compare this information with the actual raw data.

4.3 Procedures to Assign Cases to Classes

Procedures to assign single cases to classes are needed for two purposes. One purpose is to assign newly observed cases to the classes of an already existing classification. The other purpose is to evaluate a classification by taking ‘old’ cases from the original sample on which the classification was based, and checking which class they would be assigned to. In both cases, the question is: Into which class should the case be placed? In practice, experts (e.g., physicians) are often consulted to answer this question. In other cases, numerical procedures (‘automatic classification’) are used. Here, the properties may be used, either sequentially, as in a diagnostic key, or simultaneously by some type of matching between the case and the existing classes (see Dunn and Everitt 1982, especially on diagnostic keys). Quite often, as identification with certainty is impossible ‘either because too many characters are variable within taxa or because all assessments of character states are subject to error, probabilistic identification methods are often used’ (Dunn and Everitt 1982, p. 112). Of the probabilistic procedures, the Bayes approach (see Decision Theory: Bayesian) and discriminatory analysis (see Multivariate Analysis: Classification and Discrimination) are especially well known.

Other placement rules can be used if they are transparent, unambiguous, and do not lead to contradictions. For example, the principle of ‘nearest neighbor’ computes the distances of a new pattern to all existing classes. It assigns the new case to the class to which the distance is shortest (for details and other rules, see Looney 1997). Rules may also include options such as rejecting a case as ‘not classifiable’ or postponing a decision until more information is available. Most rules currently applied are compensatory, but rules could also be disjunctive or conjunctive, requiring at least one value to reach a high amount, or all values to surpass a given minimum (Coombs 1964).

Different rules lead to different results, especially if the classes vary in their a priori probability, if the distributions and covariances of the variables are very different, and if the number of observations is small. The single most important criterion for evaluating an assignment procedure is the number of correct classifications. But this ‘apparent error rate’ is optimistically biased, because it does not take into account the probability of correct assignments by chance. If base rates of class membership are known, the predictions have to perform better than the base rate (Pires and Branco 1997).

5. Evaluating a Classification

While every step of the classification process can be evaluated (Milligan 1996), two stages have received special attention: the evaluation of class definitions and of the identification procedure. Both have been mentioned previously. There also exist procedures to evaluate the overall performance of a classification system. The main method is ‘cross validation,’ using a new sample of data comparable to the old one, or splitting the original sample randomly into two halves and using one half to evaluate the classification obtained in the other half.
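The split-half form of cross validation, combined with a nearest-class placement rule and a base-rate comparison, can be sketched in Python as follows. This is only an illustration under stated assumptions: each class is represented by its mean vector, distance is squared Euclidean, and the baseline is the hit rate of always predicting the most frequent class of the first half. The function names and the toy data are invented for the example.

```python
import random
from collections import Counter, defaultdict


def class_means(rows, labels):
    """Represent each class by the mean vector of its members (an assumed class summary)."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {label: [sum(col) / len(members) for col in zip(*members)]
            for label, members in grouped.items()}


def assign(row, means):
    """Nearest-class rule: place the case in the class whose mean is closest."""
    def squared_distance(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(means, key=lambda label: squared_distance(row, means[label]))


def split_half_check(rows, labels, seed=0):
    """Build the classification on one random half and evaluate it on the other half."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    half = len(order) // 2
    build, check = order[:half], order[half:]
    means = class_means([rows[i] for i in build], [labels[i] for i in build])
    hits = sum(assign(rows[i], means) == labels[i] for i in check)
    # Baseline: always predict the most frequent class seen in the first half.
    majority = Counter(labels[i] for i in build).most_common(1)[0][0]
    baseline_hits = sum(labels[i] == majority for i in check)
    return hits / len(check), baseline_hits / len(check)


# Two well separated groups of 20 cases each (invented data).
rows = [[i % 5, i // 5] for i in range(20)] + [[100 + i % 5, 100 + i // 5] for i in range(20)]
labels = ["a"] * 20 + ["b"] * 20
accuracy, baseline = split_half_check(rows, labels)
print(accuracy, baseline)
```

Reporting the hit rate next to the baseline reflects the point made in Sect. 4.3 that a proportion of correct assignments is informative only relative to what the base rates alone would achieve.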

Another way of evaluating the results of a classification process is by comparing the results of using different classification procedures. Usually the researcher has several choices in the classification process, and is not forced by theory to select just one option. Examples include multiple options about the selection of cases and variables, of similarity coefficients, of clustering models and of identification rules. With computers, it is easy to try several combinations of such choices. Confidence that the classification captures substantial information in the data grows with the amount of agreement in the results from different combinations of choice options (for the evaluation and comparison of solutions, see Everitt 1993). To aid the interpretation of the resulting class structure, Milligan (1996, p. 346) suggests deliberately adding ‘ideal types,’ that is, characteristic patterns constructed by the researcher, to the data, and to assess what clusters these patterns are assigned to.

Can a classification be wrong? In most cases, a classification is just a systematic description, and as such, may, or may not, be useful. New observations may require changes in the classification. But if there exists a theory about the definitions of the classes, and the theory is strong enough to allow for specific predictions, then these predictions can be falsified and/or lead to revisions of the classification system.

6. Conclusions

More in the past than in the present, opinions fundamentally critical of the possibility of classification in the social sciences have been expressed. For example, Galt and Smith (1976, p. 58) stated: ‘Because they usually lack measurable dimensions, social entities are difficult to classify, and any given system of classification will inevitably be arbitrary.’ Indeed, numerical classification definitely requires measurement, or more generally, the interpretation of observations as variables. Indeed, every variable is itself a classification, defined as a set of disjointed, exclusive and together sufficient classes, and the ‘categories’ of variables are referred to as their values. Without variables in this formal sense, one might consider a heuristic equivalent in abstraction from, and ordering of, observations: the ‘ideal type,’ as introduced by Max Weber.

Variables used to establish a classification may be discrete or continuous. For a classification to be justified, the frequency or density distributions of the properties should not be equal distributions, but show one or several peaks. Then, when the joint distributions of more than one variable are considered, some combinations of values may be more frequent than other combinations, perhaps even more frequent than would be expected based on the marginal distributions. This is one of the fundamental phenomena enabling the formal definition of classes.

While classification is a ‘process,’ the temporary result is a ‘structure.’ Dynamic aspects, such as the development of a property and other trends and changes, might be included in the definitions of variables. This does not, however, make a formal classification a process model. In this sense, classification is static. It temporarily fixes the (sometimes turbulent) streams of information. Changing a classification means the (re-)interpretation of some substantive area.

See also: Statistical Clustering; Multivariate Analysis: Classification and Discrimination; Mixture Models, in Statistics; Measurement Theory: Conjoint; Person-centered Research; Configurational Analysis

Bibliography

Arabie P, Hubert L J, De Soete G (eds.) 1996 Clustering and Classification. World Scientific, Singapore
Baulieau F B 1999 Two variant axiom systems for presence/absence based dissimilarity coefficients. Journal of Classification 14: 159–70
Biggs N L 1999 Discrete Mathematics. rev. edn. Clarendon Press, Oxford, UK
Bock H-H 1996 Probability models and hypothesis testing in partitioning cluster analysis. In: Arabie P, Hubert L J, De Soete G (eds.) Clustering and Classification. World Scientific, Singapore, pp. 377–453
Coombs C H 1964 A Theory of Data. Wiley, New York
Daws J T 1996 The analysis of free-sorting data: Beyond pairwise cooccurrences. Journal of Classification 13: 57–80
De Soete G 1993 Using latent class analysis in categorization research. In: van Mechelen I, Hampton J, Michalski R S, Theuns P (eds.) Categories and Concepts. Academic Press, London, pp. 309–30
Dunn G, Everitt B S 1982 An Introduction to Mathematical Taxonomy. Cambridge University Press, Cambridge, UK
Everitt B S 1993 Cluster Analysis. 3rd edn. Edward Arnold, London
Feger H 1994 Structure Analysis of Co-occurrence Data. Shaker, Aachen, Germany
Feger H, De Boeck P 1993 Categories and concepts: Introduction to data analysis. In: van Mechelen I, Hampton J, Michalski R S, Theuns P (eds.) Categories and Concepts. Academic Press, London, pp. 203–23
Galt A H, Smith L J 1976 Models and the Study of Social Change. Wiley, New York
Gower J C, Legendre P 1986 Metric and Euclidean properties of dissimilarity coefficients. Journal of Classification 3: 5–48
Gyllenberg M, Koski T 1996 Numerical taxonomy and the principle of maximum entropy. Journal of Classification 13: 213–29
Hartigan J A 1982 Classification. In: Kotz S, Johnson N L, Read C B (eds.) Encyclopedia of Statistical Sciences. Wiley, New York, Vol. 2, pp. 1–10
Jaccard P 1908 Nouvelles recherches sur la distribution florale. Bulletin de la Société Vaudoise de Science Naturelle 44: 223–70
Jain A K, Dubes R C 1988 Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs, NJ
Krathwohl D R, Bloom B S, Masia B B 1964 Taxonomy of Educational Objectives. Longman, London
Krauth J, Lienert G A 1973 KFA: Die Konfigurationsfrequenzanalyse. Alber, Freiburg, Germany
Lee M D 1999 An extraction and regularization approach to additive clustering. Journal of Classification 16: 255–81
Looney C G 1997 Pattern Recognition Using Neural Networks. Oxford University Press, New York
McQuitty L L 1987 Pattern-Analytic Clustering: Theory, Method, Research and Configural Findings. University Press of America, New York
Milligan G W 1996 Clustering validation: Results and implication for applied analyses. In: Arabie P, Hubert L J, De Soete G (eds.) Clustering and Classification. World Scientific, Singapore, pp. 341–75
Pankhurst R J 1991 Practical Taxonomic Computing. Cambridge University Press, Cambridge, UK
Pires A M, Branco J A 1997 Comparison of multinomial classification rules. Journal of Classification 14: 137–45
Shepard R N, Arabie P 1979 Additive clustering representations of similarities as combinations of discrete overlapping properties. Psychological Review 86: 87–123
Sokal R R, Sneath P H A 1963 Principles of Numerical Taxonomy. Freeman, San Francisco
Sutcliffe J P 1993 Concept, class, and category in the tradition of Aristotle. In: van Mechelen I, Hampton J, Michalski R S, Theuns P (eds.) Categories and Concepts. Academic Press, London, pp. 35–65

H. Feger

Classifiers, Linguistics of

Classifiers are overt morphemes that constitute morphosyntactic systems which are semantically motivated and subject to discourse-pragmatic conditions of use. Classifier systems are not found in Indo–European languages. They are in essence secondary linguistic systems characterized, on the one hand, by their clear lexical origin and persistent semantic motivation and, on the other, by their functioning as morphosyntactic systems. The better-known systems are the numeral classifier systems of Asian or Amerindian languages, illustrated in Table 1.

Table 1
Examples of numeral classifier systems

Japanese
enpitsu ni-hon          hon ni-satsu
pencil 2-CL(1D)         book 2-CL(bound volume)
‘two pencils’           ‘two books’

Tzotzil
j-p’ej alaxa            j-ch’ix kantela
1-CL(3D) orange         1-CL(1D) candle
‘one orange’            ‘one candle’

Classifier studies became of interest to general linguists in the 1970s, following proposals to capture the universal semantic properties of classifier systems. Adams and Conklin (1973), Denny (1976), Allan (1977) have provided the framework for many of the subsequent descriptions and discussions, and can be considered as classics of the field. Adams and Conklin were the first to wade through much comparative data and claim the existence of some universal semantic properties, based primarily on data from Asian numeral classifier systems. They established the primacy of three basic shapes, which are semantically combinations of one of the major dimensional outlines of objects (1D, 2D, 3D) with a secondary characteristic of consistency and/or size. This combination is directly inherited from the most common lexical sources of a basic set of classifiers, which are the primary elements of the physical world being handled for the survival of human communities (Table 2).

Table 2
Lexical sources of basic classifiers

Classifiers             Lexical origin
1D: long-rigid          tree/trunk
2D: flat-flexible       leaf
3D: round               fruit

Denny (1976) is the work of a psychologist handling data secondhand. He offers the appealing proposal that the semantic traits of classifiers may be organized into three kinds, those of ‘social, physical and functional interaction,’ assuming that what classifiers are ‘good for’ is to signal how humans interact with the world. Under social interaction he places interaction with animate entities of our world, principally fellow human beings, classified by sex, social rank, or other categorization schema, as well as other entities such as divinities and other powers specific to a culture. In the physical interaction realm, objects of the world are classified along certain parameters linked to their nature as manipulable and manipulated objects, principally the parameter of shape. Finally, in the functional interaction realm, entities of the world are classified by the use to which they are put, such as items of clothing, hunting or fishing, transportation, for instance.

Allan (1977) is a first typological study of so-called classifiers, based on a broad data base of fifty classifier languages. Although the reliability of the data is variable and different types of nominal classification systems are lumped together under the label of ‘classifiers,’ there is still remarkable overlap between his seven ‘categories of classification’ (material, shape, consistency, size, location, arrangement, and quanta) and Denny’s three. Two of Allan’s original statements are of particular interest for later discussions on the nature and purpose of classifier categorization; they are the fact of a total absence of color classifiers, and the constraint that the characteristics denoted by the categories of classification be perceivable by more than one of the senses alone, such as sight and touch,
where sight means primarily perception of shape. It is worth noting here that to arrive at the kind of statements of universals found in the above mentioned publications meant wading through vast amounts of data from fieldwork often containing large sets of classifiers, and interpreting their semantics through approximate translations, in order to identify those universal characteristics.

Besides varying as to the semantics of the individual classifying elements, systems of classifiers vary greatly as to the number and the specificity of the classes around which the systems seem to be organized. The classes headed by classifiers can vary from very simple to very complex; they can be small or large, homogeneous or extremely heterogeneous. Homogeneous classes are those with transparent semantic motivation, while heterogeneous classes are usually considered to be composed of a core set of prototype elements to which others have been added through various means of extension. Therefore, within the literature on classifier systems one finds different labels to indicate the nature of the classes themselves. One talks, for instance, of specific, general, unique and repeater classifiers. Specific classifiers are the most common type. The classes they head are built around prototypical exemplars, with incorporation of other elements by any number of types of extensions of the class. One of the most notorious examples in the literature of a specific classifier heading a very heterogeneous class is the case of the Japanese numeral classifier hon, used prototypically for long, thin objects.

General classifiers, as their label indicates, are largely desemanticized and head large heterogeneous classes with no distinct semantic motivation. Large Asian numeral classifier systems are known to have general classifiers. At the opposite end, unique classifiers head classes of just one element. One finds in the literature examples of unique classifiers for certain animals, for instance, such as the elephant or the tiger, or even the dog, generally interpreted a posteriori as highlighting some cultural item of particular significance. Finally, the term repeater refers to classifiers that are homophonous with a noun while being either unique or specific (Table 3). The existence of repeaters is what makes for the openendedness of some classifier systems.

Table 3
Specific, unique, and repeaters in Jakaltek-Popti’

Type of classifier      CL          Class members
specific:               no’         ALL animals, except dog, AND all products of animals (no’ hos ‘egg,’ no’ lech ‘milk,’ leather shoes, wool blankets etc …)
unique:                 metx’       ONLY metx’ tx’i ‘dog’
repeater+specific:      tx’otx’     tx’otx’ tx’otx’ ‘dirt, ground’ AND all objects made of clay (tx’otx’ xih ‘pottery jug’ etc …)
repeater+unique:        atz’am      ONLY atz’am atz’am ‘salt’

The semantic studies of the 1970s had a tendency to overlook the existence of different types of systems, often lumping together classifier systems with other systems. Towards the end of the twentieth century, attention has been given to the fact that classifier systems are one type of nominal classification system among several others, as argued in Grinevald (2000), in contrast to the position taken by Aikhenvald (1999). The non-lumping position argues that classifier systems are intermediate systems in a continuum of nominal classification systems that range from lexical to morphosyntactic systems. At the lexical end they may be distinguished from two types of systems, the measure terms and class terms systems, with which they are often either confused or consciously lumped. At the grammatical end, they are widely considered as distinct from the gender systems and the noun class systems, which are both essentially grammaticalized concordial systems.

All languages have lexical sets of measure terms, the expression ‘measure terms’ being used here as a cover term for what are strictly speaking measures and for types of arrangements. Examples of English measure terms include actual measure terms such as a glass of water, a pound of sugar, a slice of bread, a sheet of paper, and arrangements such as a pile of books, a group of children, a line of cars. Class terms are sets of lexical items used in lexicogenesis; they participate in compounding processes of word formation that are functionally equivalent to derivational processes. The English class terms ‘-berry’ (as in strawberry, blueberry, boysenberry, goodberry, loganberry), ‘-tree’ (as in apple tree, banana tree, cherry tree), or even ‘-man’ (as in mailman, policeman, garbage man) are the functional equivalent of French derivational suffixes such as ‘-ier’ (as in pommier ‘apple tree,’ bananier ‘banana tree,’ cerisier ‘cherry tree’) or ‘-ier/-eur’ (as in facteur ‘mailman,’ policier ‘policeman,’ éboueur ‘garbage man’).

At the grammatical end of the continuum of nominal classification systems are the gender systems found in Indo–European languages and the noun class systems of Bantu languages. Note in the examples in Table 4 the ubiquitous presence of noun class markers (here of class 5) on nouns, adjectives, and demonstratives, and in the verb as pronominal clitics. The different degrees of grammaticalization of gender/noun class systems and classifier systems are generally assessed according to the set of criteria listed in Table 5.

Table 4
Class markers in Tswana (Bantu)

a. le-kau  le  le-leele  le  le-ntsho  le  le  opelang  le-le
   5-boy   5   5-tall    5   5-black   5   5   sing     5-DEM
   ‘this tall black boy who is singing’
b. Le-kau  le lapile;   Ke  le  thusitse
   5-boy   5-is tired   I   5   have helped
   ‘the boy is tired, I have helped him’

Beyond distinguishing classifier systems from other systems of nominal classification it is important to acknowledge the existence of several subsystems of classifiers, which are usually identified and labeled primarily by their morphosyntactic locus. The best known and documented types are the ones found as elements of the noun phrase itself, such as the numeral classifiers (numeral+CL) used in quantifying expressions; the noun classifiers (CL noun) so-called for appearing with a bare noun, not linked to the expression of quantification or possession; and the genitival or possessive classifiers (poss+CL) that are part of possessive constructions. The verb forms are the locus of two possible systems of classification: the verbal classifiers (verb-CL) which belong to the systems of nominal classification and the lesser known verb classifiers, which actually classify types of verbs rather than nominal arguments.

Noun classifiers; Jakaltek-Popti’ (Craig 1986, p. 264)
   xil  naj  xuwan  no7  lab’a
   saw  CL   John   CL   snake
   ‘(man) John saw the (animal) snake’

Numeral classifiers; Ponapean (Rehg 1981, p. 130)
   pwihk  riemen             ‘two pigs’
   pig    2+CL:animate
   tuhke  rioapwoat          ‘two trees’
   tree   2+CL:long

Genitive classifiers; Ponapean (Rehg 1981, p. 184)
   kene-i    mwenge          ‘my (edible) food’
   CL-GEN.1  food
   were-i    pwoht           ‘my (transport) boat’
   CL-GEN.1  boat

Verbal classifiers; Cayuga (Mithun 1986, pp. 386–8)
a. ohon’atatke:      ak-hon’at-a:k
   it-potato-rotten  past.I-CL-eat
   ‘I (potato) ate a rotten potato’
b. so:wa:s  akh-nahskw-ae’
   dog      I-CL:domestic.animal-have
   ‘I have a (pet) dog’
c. skitu   ake’-treh-tae’
   skidoo  I-CL:vehicle-have
   ‘I have a car’
Gunwinggu (Oates 1964 in Mithun 1986, p. 389)
d. gugu   ga-bo:-mangan
   water  it-CL:liquid-fall
   ‘water is falling’

The major argument to prove the existence of different types of classifiers is the co-occurrence of several independent systems in the same language. These are systems with different inventories of classifier morphemes, different semantics and different morphosyntactic loci, such as the coexisting numeral and possessive classifier systems of Micronesian languages like Ponapean, for instance. Although much remains to be done in terms of the study of the semantics of classifiers, preliminary exploration of a correlation between the major morphosyntactic types of classifiers known and their semantic profiles point to the following alignment (Grinevald 2000).

Shape seems to be the dominant semantic parameter in numeral classifier systems, while function is the major semantic parameter of genitival classifier systems, and material the major one of noun classifier systems, in the following pattern:
(a) numeral classifiers = physical categories:
    two-ROUND oranges;
    three-LONG RIGID pencils;
    four-FLAT FLEXIBLE blankets
(b) genitive classifiers = functional categories
    my-EDIBLE food;
    his-DRINKABLE potion;
    their-TRANSPORT canoe
(c) noun classifiers = material/essence categories
    an ANIMAL deer;
    the ROCK cave;
    MAN musician

The claim that there exist different types of classifiers raises two questions about the function of classifiers: one is the unavoidable one about the common function of classifiers in general in the languages that avail themselves of such systems. The other arises from the identification of different types of classifier systems and concerns the distinct functions that those different

Table 5
Criteria for distinguishing noun classes and classifiers

   Noun classes                          Classifiers
a. classify all nouns                    don’t classify all nouns
b. in a small number of classes          in large(r) number
c. closed system                         open system
d. fused with other grammatical          not fused
   categories (number, case …)
e. can be marked on N                    not marked on N itself
f. in concord/agreement pattern          not part of concord systems
g. N assigned to one class               can be assigned to several classes
h. no speaker variation                  possible speaker variation
i. no register variation                 possible formal vs informal use
Source: Dixon 1982

types of classifier fulfill, in view of their different morphosyntactic loci and semantic profiles. It has to be noted that when the issue of the function of classifier systems has been addressed in the literature, it has generally been, admittedly or not, from the perspective of numeral classifiers only, without regard for the variety of classifier types. In this context, (numeral) classifiers have been seen as markers of individuation or unitizing that operate in languages in which the semantics of nouns is taken to be essentially equivalent to that of mass or concept nouns of Indo-European languages.

One proposal dealing with the distinct functions of the various types of classifier systems has been an analysis of noun phrases as layered structures parallel to verbal layered structures in which different types of operators are found. In this framework numeral classifiers were considered to be quantification operators with semantics that appealed to the handling of items to be counted, hence primarily shape and physical characteristics. It was further argued that possessive classifiers were localizing operators, and their semantics linked to the function of the items appropriated, while the noun classifiers were quality operators and as such appealed to the material or essence of the items that appeared as arguments of discourse (Grinevald 2000). Much remains to be done to document the variety of classifier systems well enough to be able to address this issue comprehensively.

One of the major challenges of classifier studies is that the essentially intermediate nature of classifier systems, as secondary linguistic systems halfway between lexicon and grammar, means great variability of the systems. Acknowledging the need to take into account the inherent dynamics of classifier systems makes the descriptive task both more onerous and more productive if comparative and typological work is to proceed properly. The need is felt to include a number of dynamic variables to handle the description of specific classifier systems, by attending to their place in the overall grammar of the language within which they develop and are used. These variables include:
(a) their degree of grammaticalization: within each subtype of classifier system, one can identify systems at different stages of grammaticalization. For instance, incipient systems of noun classifiers can be found on the Australian continent next to well-established ones, meanwhile the numeral classifiers of the Chibchan languages of Central America are much more grammaticalized than those of Asian languages.
(b) the age of the system: some systems are very old, e.g., the Chinese system of numeral classifiers, while others can be argued to be only several centuries old, like the Q’anjob’alan-Mayan noun classifiers.
(c) the productivity of a classifier system must be considered too, independently of its age. For instance the Thai numeral classifier system, which is very old, is also very productive: it is open and adapting to the language of modern life, while the noun classifier system of Jakaltek-Popti’, which is not very old, seems frozen and unable to cope with the classification of modern imports and products.
(d) the particular classifier system needs to be assessed in the context of the common phenomenon of areal spread of such systems. The spread can operate either through the actual borrowing of a system, morphology included, as was the case with the expansion of the original Chinese numeral classifier system into its surrounding regions, or through the borrowing of the idea and motivation for the development of such systems, as seems to have taken place between the Q’anjob’alan languages of Guatemala and their neighboring Mamean languages.

While early propositions of matching classifier systems with morphological types of languages were not enlightening, there is indeed a tendency for different types of nominal classification systems in general, and of different types of classifiers in particular, to distribute themselves in clusters around the world. For instance, gender systems are a widespread phenomenon in Indo-European languages, while noun class systems were originally mostly known from

Bantu languages. In addition, they have been described for languages of Australia and Papua New Guinea, and are perhaps more widespread than previously recognized in Amazonia.

Turning our attention to classifier subtypes, numeral classifiers are best known for their presence in South East Asian languages, but have also been identified in America, in particular Mesoamerica. Noun classifiers appear to be a rare type, mostly identified in Mesoamerica and Australia, while possessive classifiers are the hallmark of Micronesian languages, although they are also found in various parts of America. As for verbal classifiers, they have been documented for North American languages and for signed languages, though it is sometimes difficult to establish how segmentable and identifiable the actual classifier morphemes are in verbal predicates.

Classifier studies raise difficult methodological issues. There is the first-degree challenge of the fieldwork to be done to produce descriptions of these systems. Fieldwork faces the problem of the ethnocentrism of the semantic analysis. This will continue as long as much of this work is done through translation, and is still largely carried out by linguists who are native speakers of languages where the phenomenon does not exist. (Work done on South East Asian languages is to some extent an exception to this problem.) For instance, it is often difficult to say whether the semantics of a classifier is one of the strictly physical characteristic of shape or one of function, since certain shapes naturally lend themselves to certain functions. Are objects hollowed or made to be concave to be considered for their shape, as round hollow objects, or for their function, as recipients and containers? The basic issue is, of course, whether the right questions are being asked of these systems in the first place. There is the further challenge of the fundamental lexico-grammatical nature of such systems, with their common open-endedness and subtle discourse functioning that require extensive studies all too rare in the descriptive tradition. And there is the second-degree challenge facing linguists working on secondhand data of uncertain reliability and common incompleteness, particularly in terms of the pervasive dynamics and complex internal typology of such systems, of the kind introduced in this article.

A number of publications are shaping the field of classifier studies in the context of the wider discipline of nominal classification. Craig (1986) meant to begin to confront various approaches to the study of nominal classification systems and included articles on the acquisition, historical development, discourse function, and semantics of classifier systems. Senft (2000) is a more recent collection (based on a 1993 working conference) where the issues of typology, grammaticalization, and function of classifiers are further elaborated. Aikhenvald (1999) is a substantial monograph which attests to the importance of Amazonian data, for the richness of its classification systems and the challenges they pose to the proposed typology, and which underlines in general the extreme complexity of the inter-relation of systems in many languages. Sands (1995) is a useful survey of nominal classification systems in Australia which reveals two interesting phenomena: one is the parallel development, in different languages, of concordial noun class systems and noun classifier systems out of the same lexical material of generic nouns; and the other is the documentation of the various stages of the evolution of noun classifier systems from the discourse-sensitive use of generic nouns and through the increasingly frequent collocation of generics and nouns in classifier constructions. Bisang (1996, 1999) provides overviews of the grammaticalization dynamics through which the classifier systems of East and South East Asian languages arose, within the areal typological frame called for by the language contact situation of the region.

In terms of an agenda for the development of classifier studies in the twenty-first century, the work is proceeding on two fronts. On the linguistic fieldwork front, there is still an enormous need for more comprehensive descriptions. And, given the fact that many systems worth investigating are to be found in the languages of Amazonia, Australia, and Papua New Guinea, the challenging nature of this fieldwork has to be kept in mind. Work on nominal classification processes in signed languages is also under way; a collective volume on that topic in a cross-linguistic perspective is scheduled to appear following a working conference in 2000 (Emmorey in press).

A better understanding of the general phenomenon of classifiers should also emerge from the confrontation of what is now known of nominal classification systems with the much less known phenomenon of verb classification. This phenomenon has been described for some Australian languages, but has not come fully to the attention of linguists interested in nominal classification. However, the identification of a similar phenomenon in South American languages such as the Barbacoan languages of Ecuador and its ongoing description should facilitate further exploration of the parallels between the organization of nominal and verbal linguistic expressions and the similarity in the function of their operators, in particular that of their respective classifier systems.

On the more theoretical front, various debates are open and in need of further consideration. They include taking a position on the a priori lumping or not lumping together of most nominal classification systems. Lumping means subsuming, under the label of classifiers, cases of class terms, measure terms, noun classes, as well as classifiers. This position is defensible in terms of how the new data collected in the field, particularly data on languages until recently never described, appear startling if not overwhelming because of overlapping layers of elements of supposedly various types of systems. Alternatively one can opt to

tease apart different types of classification systems using as a reference clear cases of classifier systems with the characteristics given above in Table 5.

To handle the cases of data overlaps one can proceed with a number of tools from a functional-typological approach to the study of language. One is the notion of prototype, with its accompanying concept of fuzzy boundaries; this allows for some systems to be intermediate between two systems, such as noun class and classifier. Another is the notion of the grammaticalization dynamics, inter- and intra-types of nominal classification systems, allowing in particular for variations along this parameter within the same type, which can change an analysis of multiple classifier systems to one of incipient noun class system for instance. A third is the notion that several systems may indeed co-exist, the original system and another one evolved in part from it, resulting in homophonous morphemes belonging to various systems, such as class terms and classifiers.

Another major issue due for more debate in the twenty-first century is the issue about how much the linguistic phenomenon of classifiers is linked to categorization of referents in the world linguistic classification of nouns. This has been best articulated by Lucy (1992), while Foley (1997, Chap. 12) provides a good summary of recent experimental work on the cognitive impact of classifiers in categorization. It is clear that before constructing experimental studies that use classifiers in the search for the nature of the links that hold between language and cognition, an assessment of the degree of grammaticalization of the systems is needed. Ongoing discussions of so-called classifiers in signed languages underline also their fundamental discourse functions of referent identification and referent tracking, and the likelihood that the term ‘classifiers,’ now well established in the literature, both in its narrower and wider scope, may very well be a misnomer which delays a better grasp of their role in language.

See also: First Language Acquisition: Cross-linguistic; Foreign Language Teaching and Learning; Language Acquisition; Language Development, Neural Basis of

Bibliography
Adams K L, Conklin N F 1973 Towards a theory of natural classification. Papers from the Regional Meeting of the Chicago Linguistic Society 9: 1–10
Aikhenvald A 1999 Classifiers. A Typology of Noun Categorization Devices. Clarendon Press, Oxford, UK
Allan K 1977 Classifiers. Language 53: 285–310
Berlin B 1965 Tzeltal Numeral Classifiers. Mouton, The Hague
Bisang W 1996 Areal typology and grammaticalization: Processes of grammaticalization based on nouns and verbs in East and Mainland South East Asian languages. Studies in Language 20–3: 519–97
Bisang W 1999 Classifiers in East and Southeast Asian languages: Counting and beyond. In: Gvozdanovic J (ed.) Numeral Types and Changes Worldwide. Mouton de Gruyter, Berlin
Craig C G (ed.) 1986 Noun Classes and Categorization. John Benjamins, Amsterdam
de León L 1988 Noun and numeral classifiers in Mixtec and Tzotzil: A referential view. Ph.D. thesis, University of Sussex, UK
Denny J P 1976 What are noun classifiers good for? Papers from the Regional Meeting of the Chicago Linguistic Society 12: 122–32
Dixon R M W 1982 Noun classifiers and noun classes. In: Dixon R M W (ed.) Where Have All the Adjectives Gone? And Other Essays on Semantics and Syntax. Mouton, The Hague, pp. 211–33
Dixon R M W 1986 Noun classes and noun classification in typological perspective. In: Craig C G (ed.) Noun Classes and Categorization. John Benjamins, Amsterdam, pp. 105–12
Emmorey K (ed.) in press Perspectives on Classifier Constraints in Sign Languages. Erlbaum, Mahwah, NJ
Foley W 1997 Anthropological Linguistics: An Introduction. Blackwell, Oxford, UK
Greenberg J H 1978 How does a language acquire gender markers? In: Greenberg J H (ed.) Universals of Human Language, Vol. III. Stanford University Press, Stanford, CA, pp. 47–82
Grinevald C 2000 A morphosyntactic typology of classifiers. In: Senft G (ed.) Systems of Nominal Classification. Cambridge University Press, Cambridge, UK, pp. 50–92
Lucy J 1992 Grammatical Categories and Cognition. Cambridge University Press, Cambridge, UK
Matsumoto Y 1993 Japanese numeral classifiers: A study of semantic categories and lexical organization. Linguistics 31: 667–713
Mithun M 1986 The convergence of noun classification systems. In: Craig C G (ed.) Noun Classes and Categorization. John Benjamins, Amsterdam, pp. 379–97
Oates L 1964 A tentative description of the Gunwinggu language. Oceania Linguistic Monographs 10, University of Sydney, Australia
Rehg K 1981 Ponapean Reference Grammar. University Press of Hawaii, Honolulu, HI
Sands K 1995 Nominal classification in Australia. Anthropological Linguistics 37: 247–346
Senft G 2000 Systems of Nominal Classification. Cambridge University Press, Cambridge, UK

C. Grinevald

Classroom Assessment

Around the end of the 1980s, the traditional conception of teachers’ classroom assessment roles began to change. Previously, teachers’ classroom assessment responsibilities were focused narrowly on summative decisions related to grouping, grading, and selecting students. In the late 1980s, close examination of what teachers do and what decisions they are called on to make in their classrooms increased substantially

views of teachers’ classroom lives, responsibilities, and assessments (Jackson 1990, Wittrock 1986). A synthesis of this research identified three generalizations that provided a useful perspective on the realities of the teacher’s classroom.

First, classrooms are both academic and social environments that teachers must master and understand to successfully instruct and interact with their students. Many teacher decisions are dependent on the social and academic knowledge they acquire about their students. Second, classrooms are busy, interactive, and ad hoc settings that call on the teacher to make many and varied decisions. Third, although many nonteachers view classrooms as a unified whole, teachers know that such uniformity is illusory. In classrooms, teachers continually deal with a range of individual student concerns and issues. Understanding the implications of these three classroom realities produces a broad domain for assessments.

Given the richness and complexity of classrooms, a more realistic description of classroom assessments is the process of collecting, synthesizing, and interpreting information to aid teachers in making classroom decisions. While the overriding purpose is to make decisions, classroom assessment involves many different decisions and contexts. In particular, classroom assessments focus on learning about students at the start of school, planning and delivering instruction to students, and formally assessing student learning. Note that each of these three assessment focuses is dependent on the collection, synthesis, and interpretation of assessment. These three focuses are used to structure the discussion of classroom assessments below.

1. Classroom Assessment at the Start of School

The beginning days of school are important for both teachers and students. In the first few weeks of school the teacher and students must get to know and understand each other so that they can be organized into a classroom learning community. The activities in the first few days of school set the stage for how well students will behave, attend, and learn during the school year (Airasian 2001, Stiggins 1997). In these early days of school, teachers have their antennae up, observing, listening, mentally recording, and assessing their perceptions of the students. In order to know how to group, teach, motivate, manage, accommodate, and reward students, the teacher must learn their particular characteristics.

Many forms of planned and unplanned, formal and informal, sources of information contribute to teachers’ perceptions of their students, e.g., ‘on the fly’ observations, hearsay from the school grapevine, and prior teachers’ comments, as well as information such as school records, formal assessment results, and performance in the classrooms. Two increasingly important areas teachers want to know about are student disabilities and medication. Note that although cognitive information is important to all teachers, the classroom society requires that affective and psychomotor student characteristics also must be identified.

Integrating the pieces of formal and informal information gathered in the first two or three weeks, the teacher forms a description or perception of each student and the class as a whole. This provides the teacher with the kind of nitty-gritty information needed to make the classroom function effectively (Good and Brophy 1997). Because teachers cannot spend a great deal of time assessing their students at the start of school, the validity and reliability of the initial student assessments are important. Two major validity concerns during start-of-school assessments are labeling students based on stereotypes and treating students’ cultural or language differences as if they were student deficits (Oakes and Lipton 1999). The main concern of reliability is that teachers obtain sufficient and recurring information before labeling students. These initial teacher assessments are often overlooked as an important and influential form of classroom assessment.

2. Classroom Assessment in Planning and Delivering Instruction

Teacher classroom decisions about planning and delivering instruction encompass a variety of issues. There are, for example, many considerations that teachers must recognize and assess in order to successfully plan lessons for their students. Student characteristics vary with student readiness, attention span, prior subject knowledge, attitude toward school, disabilities, and other characteristics that must be considered when planning instruction. Similarly, school and classroom resources range from textbooks to copying machines to sophisticated laboratory apparatus, and can hamper or enhance lesson planning. Time is a critical factor in planning, as every teacher knows. Also, teacher characteristics such as subject matter knowledge, physical limits, and preferred teaching style influence planning. A significant amount of assessment is involved in decision making for valid and viable lesson plans (Wragg 1997).

Once relevant information about the student, the teacher, and the instructional resources is identified, the teacher’s task is to synthesize and decide how to construct a set of instructional plans containing educational objectives, instructional materials, teaching strategies, and assessment procedures (Airasian 2001). Different objectives call for different forms of instruction and assessment, and teachers must be able to teach students in more than one way. Objectives indicate the outcomes of student learning. Higher level objectives include cognitive processes such as application, analysis, synthesis, and evaluation. Lower

level objectives emphasize rote memorization. Because educational objectives are developed before instruction begins, teachers often must make a decision to adapt objectives, materials, and instructional strategies to suit student readiness and needs. Suitable strategies to accommodate students with disabilities also must be planned. The decisions teachers make to match instruction to objectives and make appropriate instruction for students with disabilities improve the validity of their instruction and assessment.

During teaching, the teacher is concerned with decisions and assessments to determine how well the instruction is progressing. Planning and instructing assessments are integrally related; the process constantly cycles from planning to delivering to revising to planning and so on. There is a logical, continuous, and natural link between the two processes.

Oral questioning is the most common form of instructional assessment because it best fits the flow of instruction (Airasian 2001). During instruction, teachers ask questions for many reasons: to reinforce important points, to maintain students’ attention, to assess student learning, and to promote deeper processing of important information. Teachers use both convergent questions that have a single correct answer and divergent questions that have more than one appropriate answer. Lower level questions tap recall and memorization while higher level questions tap processes more complex than recall. Classroom questioning strategies can be improved by asking questions related to important objectives, avoiding overly general questions, distributing questions among many students, allowing sufficient ‘wait time’ before calling on students, stating questions clearly to avoid confusion, probing student responses with follow-up questions such as ‘why’ or ‘explain your answer,’ and remembering that oral questioning is a social process in which student answers should be treated with respect, regardless of the quality of the answer.

3. Assessments of Formal Learning

3.1 General Aspects of Formal Assessments

Formal assessment is the culmination of planning and delivering instruction. It focuses on the extent to which students have learned from instruction. There is an important difference between good teaching and effective teaching. Good teaching refers to what teachers do during planning and delivering instruction. Effective teaching refers to whether students have learned from their instruction. Formal assessments are concerned with the effectiveness of learning from instruction. Formal assessments are also called summative assessments and commonly include tests, projects, term papers, lab reports, portfolios, performances, products, and final examinations. These are assessments that can have important consequences for students and therefore are taken seriously by students, parents, and teachers (Black 1998).

A fair and valid formal assessment includes information and skills similar to those presented in instruction. While the type of assessment strategy chosen to assess students depends on the nature of instruction, all types should represent the objectives and instruction presented. Factors such as the age of the students, the subject matter assessed, and the length of time for testing all impact the length of formal assessments.

Obtaining fair and valid formal assessments involves alignment among objectives, instruction, and assessments, providing students with good instruction, and selecting appropriate strategies to assess learning. Formal assessments gather valid and reliable samples of student performance and use them to make generalizations about general student learning. The most important preparation for formal assessment is a good teacher. Students should also be familiar with the assessment item formats and be given a review session prior to the assessment.

If these factors are not met, invalid assessment results can occur. Other practices that diminish assessment validity are: failure to develop assessments based on objectives and instruction, failure to assess all the important objectives taught, failure to select item types that allow students to show their full performance, including topics or objectives not taught, including too few items to obtain adequate assessment reliability, and using tests to punish students. Further, the success of formal assessment can be undone if the test questions are faulty or confusing. Poorly constructed or unclear assessment questions do not provide students a fair chance to show what they have learned from instruction, and consequently diminish assessment validity.

3.2 Types of Formal Classroom Assessments

There are many types of test items that are used in classroom assessments. Selection items include multiple-choice, true–false, and matching items to which students respond by selecting an answer from a set of presented items. Supply items include short-answer, completion, and essay items to which students are required to create or supply their own answers. Selection items can cover many items in a short time and can be scored quickly. Supply items can be constructed quickly and permit students to provide their own constructed answers. Selection items are difficult to construct and encourage guessing, while supply items are difficult to score and cover smaller samples of instruction (Gronlund 1998).

Common guidelines for writing and critiquing test items include: (a) assess important objectives; (b) state items clearly, describing the students’ task; (c) avoid

ambiguous and confusing wording and sentence structure (students should have a clear understanding of what is expected of them); (d) use vocabulary appropriate to the students being assessed; (e) write selection items that have one correct answer; (f) provide information about the nature and form of the desired response, particularly for essay questions; (g) avoid clues to correct answers; and (h) review items before assessing students.

In assembling and preparing items for a formal assessment the following suggestions should be applied: (a) group items of the same type; (b) place selection items first and supply items last; (c) provide directions for each type of test item; and (d) diminish assessment anxiety by giving advance notice of the assessment, providing a review session before assessment, and most of all, providing students with good instruction. Many students experience anxiety before and during testing. While it is difficult to eliminate test anxiety, these strategies can lower it. Plan accommodations for students with disabilities. Two types of accommodations should be addressed, one for test administration (e.g., having directions read to students, giving extra time) and one for the test itself (e.g., divide the test into small sections, provide a sample of each test item, arrange student items from concrete to abstract).

Unfortunately, cheating on classroom assessments is a fairly common occurrence. Forms of cheating range from looking at another’s paper and bringing crib sheets into class to other illicit strategies (Cizek 1999). No matter how or why it is done, cheating is dishonest and unacceptable. When cheaters state or imply that the work they have turned in is their own, they are lying, and should be penalized. Useful strategies to discourage cheating on classroom assessments include spreading out seating arrangements, careful proctoring, and movement around the classroom during testing.

Ultimately, all formal classroom assessments will be scored, usually by the classroom teacher. Scoring selection items is straightforward, efficient, and objective. Each student’s response is compared to a scoring key and an overall score is obtained. Selection items are typically scored objectively, that is, two or more independent scorers would agree on a student’s score. Supply and especially essay items tend to be more difficult and time-consuming to score. Because student responses to supply items are more lengthy and varied than those of selection items, the former are more likely to be subjectively scored. That is, two or more independent scorers may not agree on the same or a similar score for a student. Many factors influence essay subjectivity, including handwriting, spelling, neatness, and teacher fatigue. These factors are not central to the essay, but their presence influences the teacher’s perception of students’ essays and can influence the objectivity of essay scoring.

Two common essay scoring approaches are holistic and analytic. Holistic scoring provides a single score to describe the essay’s quality. Analytic scoring breaks the essay down into component parts such as organization, spelling, accuracy, and grammar and gives each component an individual score. Holistic scoring is most used in grading students, while analytic scoring is most used to correct and improve initial drafts of written responses. To ensure objectivity in essay scoring, the following steps should be followed. Define what constitutes a good essay answer before it is administered. Tell students whether handwriting, spelling, grammar, and punctuation will count in scoring the essay. If possible, score students’ essays anonymously. If multiple essays are in the assessment, score all students’ answers to the first essay question before moving on to score the second essay item, and so on. Reread some of the essays a second time to determine the reliability of scoring.

In addition to selection and short-answer items, there are other types of items that are important in classroom assessments. The most prominent of these item types is performance assessment, also referred to as authentic or alternative assessments (Mehrens et al. 1998). Performance assessments allow students to demonstrate what they know or can do in a real situation. Examples of performance assessments are essays, pronouncing a foreign word, setting up laboratory equipment, catching a ball, reciting a poem, identifying unknown chemicals, generalizing experimental data, working in cooperative groups, obeying school rules, and painting a picture. All of these performances require more than memorization and a one- or two-word response.

All performance assessments are developed in four steps: (a) identifying the purpose of the performance assessment; (b) stating the observable aspects of the performance, also called performance criteria; (c) selecting a suitable setting to carry out the performance assessment; and (d) scoring the quality of the performance. The key aspect of assessing performance assessments is the identification of the criteria that define a good performance. Performances are normally broken down into specific, observable criteria that can be individually assessed. Criteria should be specific and unambiguous. For example, stating ‘information is presented in a logical sequence’ is better than stating the more ambiguous ‘has organization,’ and ‘can be heard in all parts of the room’ is better than ‘speaks correctly.’ Statements of clear performance criteria are important for both holistic and analytic scoring approaches.

Multiple approaches to scoring students’ performance assessments are available, and all are based on performance criteria. Checklists, rating scales, and scoring rubrics are most commonly used to assess performance assessments. A checklist is a written list of performance criteria that the teacher uses to judge student performance on each of the criteria. Checklists allow only ‘yes’ or ‘no’ judgments of each criterion. A rating scale is a written list of performance criteria that


permits the teacher more than two choices (e.g., good, attain high scores, while in criterion-referenced grad-
fair, poor or excellent, good, fair, poor) to judge ing they can if they all reach the standard. Grading
student performance of each criterion. A scoring students based on a comparison of student perform-
rubric summarizes the overall performance on the ance to the teacher’s estimate of the student’s ability is
criteria into holistic descriptions representing different not recommended because estimating ability is difficult
levels of a student’s overall performance. Rubrics to do accurately. Similarly, grading based on student
describe performance in a summative way, while improvement over time is also not recommended. In
checklists and rating scales provide specific diagnostic general, regardless of the grading approach selected, it
information about each criterion in a formative way is strongly advised that grades be based mainly on
(Airasian 2001, Goodrich 1997). students’ academic performance.
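As a rough illustration of the difference between norm-referenced and criterion-referenced grading, the following Python sketch grades the same four scores both ways. The student names, the cutoff of 80, and the rule that only the top quarter of the class earns a high grade are illustrative assumptions, not values taken from this article.

```python
def criterion_referenced(scores, standard=80):
    """Compare each student with a pre-established standard (assumed cutoff)."""
    return {name: ("high" if score >= standard else "not high")
            for name, score in scores.items()}


def norm_referenced(scores, top_fraction=0.25):
    """Compare each student with classmates: only an assumed top fraction,
    ranked by score, can receive the high grade."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_high = max(1, int(len(ranked) * top_fraction))
    high = set(ranked[:n_high])
    return {name: ("high" if name in high else "not high") for name in scores}


# Hypothetical class in which every student meets the assumed standard.
scores = {"Ana": 92, "Ben": 88, "Cho": 85, "Dee": 81}
criterion = criterion_referenced(scores)
norm = norm_referenced(scores)
print(criterion)
print(norm)
```

With these scores every student meets the assumed standard, so all four receive a high criterion-referenced grade, while only one of them can receive a high norm-referenced grade.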
Another important addition to performance as-
sessment is the portfolio. A portfolio is a carefully
selected collection of a student’s performances that
show accomplishments and improvements over time.
5. Assessment of Ethical Responsibilities
Portfolios allow students and teachers to revisit and Thus far discussion has focused on the technical
reflect on prior work. Like any performance assessment, aspects of classroom assessment. However, it is im-
performance criteria are defined to identify and judge portant to recognize that teachers’ assessments have
each of the individual pieces and the overall portfolio. short-term and long-term consequences for students,
As in all classroom assessments, the criteria should be thus requiring that teachers have an ethical responsi-
aligned to the teacher’s objectives (Arter and Spandel bility to make decisions that are as valid and
1992). The purpose of performance assessment is the reliable as possible. A number of groups in the USA
same as all formal classroom assessments, to determine have set standards for teachers’ ethical performance
how well students have learned from the instruction (American Federation of Teachers et al. 1990 and
they were provided. To improve the validity and National Education Association 1992–3). Among
reliability of performance assessments, teachers teachers’ ethical responsibilities are: to provide stu-
should select performance criteria that are appropriate dents access to varying points of view; not to expose
for their students, observe and record student per- students to embarrassment or ridicule; not to exclude,
formance while it is being performed rather than at deny, or grant advantages on the basis of students’
some later date, judge student performance in terms of race, color, creed, gender, national origin, religion,
the performance criteria not the personal characteris- culture, sexual orientation, or disability; and not to
tics of the students and, if possible, observe a student’s label students with stereotypes.
performance more than once. This article has indicated that classrooms are com-
Managing and scoring portfolios is a time-con- plex environments that call upon teachers to make
suming activity, and teachers who attempt portfolio many and varied decisions. The bases for these
assessment are advised to start a portfolio with a single decisions derive from a wide range of formal and
topic with a limited number of entries in the portfolio. informal assessment information. Although it is not
expected that every teacher assessment decision will
always be correct, it is expected that teachers can provide
4. Grading defensible assessment evidence to support classroom
decisions. This should be expected in a context in
Grading is the formal process of judging the quality of which teachers’ actions have important consequences
a student’s performance. Grades are always based on for students.
teacher judgment. However, the helping relationship
that teachers have with their students can make it See also: Classroom Climate; Educational Assess-
difficult to judge them in a completely objective ment: Major Developments; Instructional Design;
manner. Further, since there is no uniformly accepted Instructional Psychology; Performance Evaluation
teacher grading strategy, teachers must find a grading in Work Settings; Program Evaluation; Teacher
approach that they feel is fair to themselves and to the Behavior and Student Outcomes; Teacher Expertise;
students (Brookhart 1998, Frisbie and Waltman 1992). Teaching and Learning in the Classroom; Test
All grading approaches are based on comparisons. Administration: Methodology
The most common grading comparisons are norm-
referenced and criterion-referenced grading. Norm-
referenced grades are determined by comparing how a
given student performed compared to the performance Bibliography
of other test takers. Norm-referenced grading is also Airasian P W 2001 Classroom Assessment: Concepts and Appli-
called grading on the bell curve. Criterion-referenced cations. McGraw-Hill, Boston
grades are determined by comparing how a student American Federation of Teachers, National Council on Mea-
performed in comparison to pre-established stan- surement in Education, National Education Association 1990
dards. In norm-referenced grading not all students can Standards for teacher competence in educational assessment


of students. Educational Measurement: Issues and Practice is the availability of a variety of economical, valid, and
9(4): 30–2 widely applicable questionnaires that have been de-
Arter J, Spandel V 1992 Using portfolios of student work in veloped and used for assessing students’ perceptions
instruction and assessment. Educational Measurement: Issues
of classroom climate. This article makes some of these
and Practice 11: 36–44
Black P J 1998 Testing: Friend or Foe? Theory and Practice of valuable instruments readily available by describing
Assessment and Testing. Falmer Press, London some major questionnaires and their past application
Brookhart S M 1998 Teaching about Grading and Communicating in various lines of research.
Results. School of Education, Duquesne University, Pitts- Although using students’ and teachers’ perceptions
burgh, PA to study classroom climate forms the focus of this
Cizek G J 1999 Cheating on Tests: How to Do it, Detect it, and article, this method can be contrasted with the external
Prevent it. Erlbaum Associates, Mahwah, NJ observer’s direct observation and systematic coding of
Frisbie D A, Waltman K K 1992 Developing a personal grading classroom communication and events and the tech-
plan. Educational Measurement: Issues and Practice 11(3):
niques of naturalistic inquiry, ethnography, case
35–42
Good T L, Brophy J E 1997 Looking in Classrooms, 7th ed. study, or interpretive research. In the method con-
Longman, New York sidered in detail in this article, defining the classroom
Goodrich H 1997 Understanding rubrics. Educational Lead- climate in terms of the shared perceptions of the
ership 54(4): 14–17 students and teachers has the dual advantage of
Gronlund N E 1998 Assessment of Student Achievement. Allyn & characterizing the setting through the eyes of the
Bacon, Boston participants themselves, and capturing data which the
Jackson P W 1990 Life in Classrooms. Teachers College Press, observer could miss or consider unimportant.
New York
Mehrens W A, Popham W J, Ryan J M 1998 How to prepare
students for performance assessments. Educational Measure-
ment: Issues and Practice 17(1): 18–22 1. Instruments for Assessing Classroom Climate
National Education Association 1992–3 Ethical Standards for
Teachers’ Relations with Pupils. In: NEA Handbook. National Historically, the development of classroom climate
Education Association, Washington, DC, pp. 366–7 instruments commenced three decades ago with the
Oakes J, Lipton M 1999 Teaching to Change the World, 1st ed. appearance of Learning Environment Inventory (LEI)
McGraw-Hill, Boston and Classroom Environment Scale (CES). The LEI was
Stiggins R J 1997 Student-centered Classroom Assessment, 2nd developed in conjunction with evaluation and research
edn. Merrill, Upper Saddle River, NJ related to Harvard Project Physics (Walberg and
Wittrock M C (ed.) 1986 Handbook of Research on Teaching.
Macmillan, London
Anderson 1968). The respondent expresses degree of
Wragg T 1997 Assessment and Learning. Routledge, London agreement with each of 105 statements (seven per
scale) using the four response alternatives of strongly
P. W. Airasian disagree, disagree, agree, and strongly agree. The
names of some of the scales are cohesiveness, speed,
difficulty, goal direction, and disorganization. The
CES (Moos and Trickett 1987) grew from a com-
prehensive program of research involving perceptual
measures of a variety of human environments in-
Classroom Climate cluding psychiatric hospitals, prisons, university resi-
dences and work milieus (Moos 1974). The final
In the 30 years since the pioneering use of classroom published version contains nine scales with ten items
climate assessments in an evaluation of Harvard of true–false response format in each scale. Scales
Project Physics (Walberg and Anderson 1968), the include involvement, teacher support, task orienta-
field has undergone remarkable growth, diversifica- tion, and innovation.
tion, and internationalization. Literature reviews Three more contemporary classroom climate instru-
(Fraser 1994, 1998) place these developments into ments are described below: Science Laboratory En-
historical perspective and show that classroom climate vironment Inventory (SLEI); Constructivist Learning
assessments have been used as a source of dependent Environment Survey (CLES); and What Is Happening
and independent variables in a variety of research In This Class (WIHIC) questionnaire.
applications spanning many countries. The assessment
of classroom climate and research applications has
involved a variety of quantitative and qualitative
1.1 Science Laboratory Environment Inventory
methods, and an important accomplishment within
the field has been the productive combination of Because of the importance of laboratory settings in
quantitative and qualitative research methods (Tobin science education, an instrument specifically suited to
and Fraser 1998). assessing the climate of science laboratory classes at
A historical look at the field of classroom climate the senior high school or higher education levels was
over the past few decades shows that a striking feature developed (McRobbie and Fraser 1993). The SLEI


has five scales (student cohesiveness, open-endedness, changed in the preferred form to ‘there would be a clear
integration, rule clarity, and material environment), set of rules for students to follow.’
each with seven items. The five response alternatives Tobin and Fraser (1998) point out that there is
are almost never, seldom, sometimes, often, and very potentially a problem with nearly all existing class-
often. Typical items are ‘I use the theory from my room climate instruments when they are used to
regular science class sessions during laboratory act- identify differences between subgroups within a class-
ivities’ (integration) and ‘we know the results that we room (e.g., males and females) or in the construction
are supposed to get before we commence a laboratory of case studies of individual students. The problem is
activity’ (open-endedness). that items elicit an individual student’s perceptions of
the class as a whole, as distinct from a student’s
perceptions of his/her own role within the classroom.
For example, items in the traditional class form might
1.2 Constructivist Learning Environment Survey
seek students’ opinions about whether ‘the work of the
The CLES (Taylor et al. 1997) was developed to assist class is difficult’ or whether ‘the teacher is friendly
researchers and teachers to assess the degree to which towards the class.’ In contrast, a personal form of the
a particular classroom’s climate is consistent with a same items would seek opinions about whether ‘I find
constructivist epistemology, and to assist teachers to the work of the class difficult’ or whether ‘the teacher
reflect on their epistemological assumptions and re- is friendly towards me.’ For these reasons, most of the
shape their teaching practice. The CLES has 36 items questionnaires discussed above have a personal form.
with five response alternatives ranging from almost Comprehensive statistics supporting the validity
never to almost always. The scales are personal and reliability of the above questionnaires are pro-
relevance, uncertainty, critical voice, shared control, vided in Fraser (1998).
and student negotiation. Typical items are ‘I help the
teacher to decide what activities I do’ (shared control)
and ‘other students ask me to explain my ideas’
(student negotiation). 2. Research Involving Classroom Climate
Instruments

1.3 What Is Happening In This Class (WIHIC) 2.1 Associations Between Student Outcomes and
Questionnaire Classroom Climate
The WIHIC questionnaire brings parsimony to the The strongest tradition in past classroom climate
field of classroom climate by combining modified research has involved investigation of associations
versions of the most salient scales from a wide range of between students’ cognitive and affective learning
existing questionnaires with additional scales that outcomes and their perceptions of psychosocial char-
accommodate contemporary educational concerns acteristics of their classrooms. Fraser’s (1994) tabu-
(e.g., equity and constructivism). Whereas an Austra- lation of 40 past studies shows that associations
lian sample of 1,081 students in 50 classes responded between outcome measures and classroom climate
to the original English version, a Taiwanese sample of perceptions have been replicated for a variety of
1,879 students in 50 classes responded to a Chinese cognitive and affective outcome measures, a variety of
version that had undergone careful procedures of classroom climate instruments, and a variety of
translation and back translation (Aldridge et al. 1999). samples (ranging across numerous countries and grade
This led to a final form of the WIHIC containing the levels). Using the SLEI, associations with students’
seven eight-item scales of student cohesiveness, teacher cognitive and affective outcomes were found for a
support, involvement, investigation, task orientation, sample of approximately 80 senior high school chemis-
cooperation, and equity. try classes in Australia (Fraser and McRobbie 1995,
McRobbie and Fraser 1993), 489 senior high school biology
students in Australia (Fisher et al. 1997) and 1592
grade 10 chemistry students in Singapore (Wong and
1.4 Different Forms of Questionnaires Fraser 1996).
The instruments discussed above have not only a form
to measure perceptions of ‘actual’ or experienced
classroom climate, but also another form to measure
2.2 Evaluation of Educational Innovations
perceptions of ‘preferred’ or ideal classroom climate.
The preferred forms are concerned with goals and Classroom climate instruments can be used as a source
value orientations and measure perceptions of the of process criteria in the evaluation of educational
classroom climate ideally liked or preferred. For innovations. An evaluation of the Australian Science
example, an item in the actual form such as ‘there is a Education Project (ASEP) revealed that, in compari-
clear set of rules for students to follow’ would be son with a control group, ASEP students perceived


their classrooms as being more satisfying and indivi- whether an attempt would be made to change the
dualized and having a better material environment climate in terms of some of the dimensions (reflection
(Fraser 1979). The significance of this evaluation is and discussion). The main criteria used for selection of
that classroom climate variables differentiated reveal- dimensions for change are, first, that there should be a
ingly between curricula, even when various outcome sizeable actual-preferred difference on that variable
measures showed negligible differences. Recently, the and, second, that the teacher should feel concerned
incorporation of a classroom climate instrument about this difference and want to make an effort to
within an evaluation of the use of a computerized reduce it. Fourth, the teacher introduces an inter-
database revealed that students perceived that their vention of about two months’ duration in an attempt
classes became more inquiry oriented during the use of to change the classroom climate (intervention). For
the innovation (Maor and Fraser 1996). In an evalu- example, strategies used to enhance the dimension of
ation of an urban systemic reform initiative in the teacher support could involve the teacher moving
USA, use of the CLES painted a disappointing picture around the class more to mix with students, providing
in terms of a lack of success in achieving constructivist assistance to students, and talking with them more
oriented reform of science education (Dryden and than previously. Fifth, the student actual form of the
Fraser 1996). scales is re-administered at the end of the intervention
to see whether students perceive their classroom
climate differently from before (reassessment).
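The feedback step of this procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the scale names, the 1 to 5 scoring, and the sample responses are hypothetical, and ranking scales by the size of the preferred-minus-actual gap is only one plausible way to summarize the profiles.

```python
from statistics import mean


def class_profile(responses):
    """Return the class mean for each scale from per-student scale scores."""
    scales = responses[0].keys()
    return {scale: mean(student[scale] for student in responses)
            for scale in scales}


def actual_preferred_gaps(actual, preferred):
    """Rank scales by how far the preferred class mean exceeds the actual one."""
    actual_means = class_profile(actual)
    preferred_means = class_profile(preferred)
    gaps = {scale: preferred_means[scale] - actual_means[scale]
            for scale in actual_means}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Hypothetical responses from three students on two scales (1 = low, 5 = high).
actual = [
    {"teacher_support": 2, "involvement": 4},
    {"teacher_support": 3, "involvement": 4},
    {"teacher_support": 4, "involvement": 4},
]
preferred = [
    {"teacher_support": 5, "involvement": 5},
    {"teacher_support": 5, "involvement": 4},
    {"teacher_support": 5, "involvement": 5},
]

ranked = actual_preferred_gaps(actual, preferred)
for scale, gap in ranked:
    print(f"{scale}: preferred exceeds actual by {gap:.2f}")
```

In this invented example the largest gap is on teacher support, so under the selection criteria described above that scale would be a natural candidate for the intervention, provided the teacher is also concerned about the difference.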
Yarrow et al. (1997) reported a study in which 117
2.3 Differences Between Student and Teacher
preservice education teachers were introduced to the
Perceptions of Actual and Preferred Climate
field of classroom climate through being involved in
An investigation of differences between students and action research aimed at improving their university
teachers in their perceptions of the same actual teacher education classes and their 117 primary school
classroom climate and of differences between the classes during teaching practice. Improvements in
actual climate and that preferred by students or classroom climate were observed, and the preservice
teachers was reported by Fisher and Fraser (1983) for teachers generally valued both the inclusion of the
a sample of 116 classes for the comparisons of student topic of classroom climate in their preservice pro-
actual with student preferred scores, and a subsample grams, and the opportunity to be involved in action
of 56 of the teachers of these classes for contrasting research aimed at improving classroom climate.
teachers’ and students’ scores. Students preferred a
more positive classroom climate than was actually
2.5 Combining Quantitative and Qualitative
present for all five climate dimensions. Also, teachers
Methods
perceived a more positive classroom climate than did
their students in the same classrooms on four of the Significant progress has been made towards the
dimensions. These results replicate patterns emerging desirable goal of combining quantitative and quali-
in other studies in other countries (Fraser 1998). tative methods within the same study in research on
classroom climates (Tobin and Fraser 1998). Fraser’s
(1999) multilevel study of classroom climate incor-
porated a teacher-researcher perspective as well as the
2.4 Teachers’ Attempts to Improve Classroom
perspective of six university-based researchers. The
Climates
research commenced with an interpretive study of a
Feedback information based on student or teacher Grade 10 teacher’s classroom at a school which
perceptions has been employed in a five-step procedure provided a challenging classroom learning climate in
as a basis for reflection upon, discussion of, and that many students were from working-class back-
systematic attempts to improve classroom climate grounds, some were experiencing problems at home,
(Yarrow et al. 1997). First, all students in the class and others had English as a second language. Qual-
respond to the preferred form of a classroom climate itative methods involved several of the researchers
instrument, while the actual form is administered in visiting this class each time it met over five weeks,
the same time slot about a week later (assessment). using student diaries, and interviewing the teacher-
Second, the teacher is provided with feedback in- researcher, students, school administrators, and par-
formation derived from student responses in the form ents. A video camera recorded activities for later
of profiles representing the class means of students’ analysis. Field notes were written during and soon
actual and preferred climate scores (feedback). These after each observation, and team meetings took place
profiles permit identification of the changes in class- three times weekly. The qualitative component of the
room climate needed to reduce major differences study was complemented by a quantitative component
between the nature of the actual climate and that involving the use of a questionnaire which linked three
preferred by students. Third, the teacher engages in levels: the class in which the interpretive study was
private reflection and informal discussion about the undertaken; selected classes from within the school;
profiles in order to provide a basis for a decision about and classes distributed throughout the same State.


This enabled a judgment to be made about whether moved from generally smaller primary schools to
this teacher was typical of other teachers at the same larger, departmentally organized lower secondary
school, and whether the school was typical of other schools, perhaps because of less positive student
schools within the State. Some of the features identi- relations with teachers and reduced student oppor-
fied as salient in this teacher’s classroom climate were tunities for decision making in the classroom.
peer pressure and an emphasis on laboratory activities. Ferguson and Fraser’s (1998) study of 1,040 students
from 47 feeder primary schools and 16 linked high
schools in Australia also indicated that students
perceived their high school classroom climates less
2.6 Cross-national Studies
favorably than their primary school classroom
Educational research which crosses national bound- climates, but the transition experience was different
aries offers much promise for generating new insights for boys and girls and for different school size
for at least two reasons (Aldridge et al. 1999). First, ‘pathways.’
there usually is greater variation in variables of interest
(e.g., teaching methods, student attitudes) in a sample
drawn from multiple countries than from a one-
country sample. Second, the taken-for-granted fam- 3. Conclusion
iliar educational practices, beliefs, and attitudes in one
country can be exposed, made ‘strange,’ and ques- The major purpose of this article has been to make this
tioned when research involves two countries. In a exciting research tradition involving classroom climate
recent cross-national study, six Australian and seven more accessible to wider audiences by portraying
Taiwanese researchers worked together on a study of several widely applicable instruments for assessing
classroom climate. The WIHIC was administered to perceptions of classroom climate and by describing
50 junior high school science classes in Taiwan (1,879 several major lines of previous research.
students) and 50 classes in Australia (1,081 students)
(Aldridge et al. 1999). An English version of the See also: Classroom Assessment; Educational As-
questionnaire was translated into Chinese, followed sessment: Major Developments; Environments for
by an independent back translation of the Chinese Learning; Group Processes in the Classroom; School
version into English, again by team members who as a Social System; Teacher Behavior and Student
were not involved in the original translation. Quali- Outcomes; Teaching and Learning in the Classroom
tative data, involving interviews with teachers and
students and classroom observations, were collected
to complement the quantitative information and to
clarify reasons for patterns and differences in the
means in each country. Bibliography
Data from the questionnaires guided the collection Aldridge J M, Fraser B J, Huang T-C I 1999 Investigating
of qualitative data. Student responses to individual classroom environments in Taiwan and Australia with mul-
items were used to form an interview schedule to tiple research methods. Journal of Educational Research 93:
clarify whether items had been interpreted consistently 48–62
by students and to help to explain differences in Ferguson P D, Fraser B J 1998 Changes in learning environment
questionnaire scale means between countries. Class- during the transition from primary to secondary school.
Learning Environments Research 1: 369–83
rooms were selected for observation on the basis of the Fisher D L, Fraser B J 1983 A comparison of actual and
questionnaire data, and specific scales formed the preferred classroom environment as perceived by science
focus for observations in these classrooms. The quali- teachers and students. Journal of Research in Science Teaching
tative data provided valuable insights into the percep- 20: 55–61
tions of students in each of the countries, helped to Fraser B J 1979 Evaluation of a science-based curriculum. In:
explain some of the differences in the means between Walberg H J (ed.) Educational Environments and Effects:
countries, and highlighted the need for caution when Evaluation, Policy, and Productivity. McCutchan, Berkeley,
interpreting differences between the questionnaire CA, pp. 218–34
results from two countries with cultural differences. Fraser B J 1994 Research on classroom and school climate. In:
Gabel D (ed.) Handbook of Research on Science Teaching and
Learning. Macmillan, New York, pp. 493–541
Fraser B J 1998 Science learning environments: assessment,
2.7 Transition from Primary to High School effects and determinants. In: Fraser B J, Tobin K G (eds.)
International Handbook of Science Education. Kluwer, Dord-
There is considerable interest in the effects on early recht, The Netherlands, pp. 527–64
adolescents of the transition from primary school to Fraser B 1999 ‘Grain sizes’ in learning environment research:
the larger, less personal climate of the junior high combining qualitative and quantitative methods. In: Waxman
school at this time of life. Midgley et al. (1991) reported H, Walberg H (eds.) New Directions for Research on Teaching.
a deterioration in the classroom climate when students McCutchan, Berkeley, CA, pp. 285–96


Maor D, Fraser B J 1996 Use of classroom environment 1. The Lipset–Rokkan Model


perceptions in evaluating inquiry-based computer assisted
learning. International Journal of Science Education 18: 401–21 The concept of ‘cleavage’ has been current in the social
Midgley C, Eccles J S, Feldlaufer H 1991 Classroom environ-
sciences for some time, although it was given full
ment and the transition to junior high school. In: Fraser B J,
Walberg H J (eds.) Educational Environments: Evaluation,
development only in the 1960s by Seymour Martin
Antecedents and Consequences. Pergamon, London, pp. 113– Lipset and Stein Rokkan. Both of them political
39 sociologists by training, Lipset and Rokkan (1967)
Moos R H 1974 The Social Climate Scales: An Overview. sought to redefine and specify the ‘social bases of
Consulting Psychologists Press, Palo Alto, CA politics.’ Writing when structural-functionalism was
Moos R H, Trickett E J 1987 Classroom Environment Scale at its height—and, therefore, influenced by the
manual, 2nd edn. Consulting Psychologists Press, Palo Alto, Parsonian theory which assigned to the political
CA parties the function of encapsulating social conflicts
Taylor P C, Fraser B J, Fisher D L 1997 Monitoring con- and stabilizing the social system—they set out to
structivist classroom learning environments. International
Journal of Educational Research 27: 293–302
explain the persistence of party systems in the Euro-
Tobin K, Fraser B J 1998 Qualitative and quantitative land- pean democracies. In the 1960s, in fact, those systems
scapes of classroom learning environments. In: Fraser B J, still displayed features similar to those that had been
Tobin K G (eds.) International Handbook of Science Edu- institutionalized at the beginning of the century. Not
cation. Kluwer, Dordrecht, The Netherlands, pp. 623–40 surprisingly, their explanation was called the theory of
Walberg H J, Anderson G J 1968 Classroom climate and the ‘freezing’ of the European party systems.
individual learning. Journal of Educational Psychology 59: Their method was primarily historical–sociological
414–19 in so far as it connected existing political divisions in
Yarrow A, Millwater J, Fraser B J 1997 Improving university the European countries with the principal cleavages
and primary school classroom environments through pre- that had opened up in the course of their development,
service teachers’ action research. International Journal of
Practical Experiences in Professional Education 1(1): 68–93
from the birth of the nation-state in the sixteenth
Dryden M, Fraser B J 1996 Evaluating Urban Systemic Reform century to its full democratic maturation in the
Using Classroom Learning Environment Instruments. Paper twentieth. The specific political cleavages that gave
presented at the annual meeting of the American Educational rise to the modern party systems accordingly were seen
Research Association, New York to be the result of two great historical processes: the
Fisher D, Henderson D, Fraser B 1997 Laboratory environments one that had bred national revolutions (and, therefore,
and student outcomes in senior high school biology. American the formation of the modern European nation-states),
Biology Teacher 59: 214–19 and the one that had engendered the industrial
Fraser B J, McRobbie C J 1995 Science laboratory classroom revolution (and, therefore, the formation of modern
environments at schools and universities: A cross-national
study. Educational Research and Evaluation 1: 289–317
European capitalist systems).
McRobbie C J, Fraser B J 1993 Associations between student National revolutions had created two structural
outcomes and psychosocial science environment. Journal of divisions: (a) between the center and the periphery, or
Educational Research 87: 78–85 between the groups and areas that sought to impose a
Wong W L F, Fraser B J 1996 Environment-attitude associ- single public authority on a given territory and the
ations in the chemistry laboratory classroom. Research in groups and areas which asserted their traditional
Science and Technological Education 14: 91–102 autonomy against such centralizing pressures; (b)
between the lay state and the church, or between
B. J. Fraser groups which sought to separate temporal from
religious authority and groups intent on preserving the
intimate connection between them. The industrial
revolution in its turn created two further structural
divisions: (a) between agriculture and industry, or
between groups and areas whose survival depended on
Cleavages: Political traditional activities and groups and areas which
endeavored to remove traditional constraints in order
‘Political cleavages’ are political divisions among to foster the growth of new activities and production
citizens rooted in the structure of a given social system. methods; (b) between capital and labor, or between
However, although cleavages are political divisions, the groups that dominated the new industrial structure
not all political divisions among citizens spring from and the workers, whose only possession was their
structural cleavages. For one to talk of ‘cleavages’ capacity to perform labor.
such divisions must be permanent and noncontingent. In Europe, only the parties that reflected these
They must orient people’s behavior and sense of cleavages were able to survive, that is, reproduce
belonging stably and constantly. Political cleavages themselves electorally and institutionally. The in-
are the partisan expression of an underlying division stitutionalized interaction among these parties gave
among the members of a given society (whether rise to the modern party systems which, in individual
national, subnational, or supranational). European countries, and in forms that differed from

one country to another, still conserved in the mid-twentieth century the cleavages that had arisen in previous ones.

2. Subsequent Debate

The Lipset–Rokkan model heavily influenced the debate conducted during the 1960s on the political parties. The discussion started from the premise that political parties were necessary to make democracy safe (i.e., stable), as Schattschneider (1948) had already argued. However, the model was not endorsed universally, at least in its entirety. In a study of a small Scandinavian democracy, Eckstein (1966) pointed out the existence of multiple political divisions, identifying ones due to specific disagreements on particular public policies, others due to cultural divergences on interpretations of political life, and yet others arising from segmental cleavages caused by objective social differences. Again in 1966, Daalder examined the small democracies of continental Europe and pointed out the existence of political divisions due to factors (for instance, the nature of the political regime or the concept of nationality) other than those envisaged by the Lipset–Rokkan model.

But it was Sartori (1969) who challenged the Lipset–Rokkan model most radically, by reversing its causal logic. For Sartori, it was not social divisions that encouraged the birth of parties; rather, it was the parties that gave visibility and identity to a particular structure of social divisions. In short, Sartori argued, political sociology (and political science) should take the place of sociology of politics if partisan politics in the European democracies were to be understood properly. Lipset (1970) himself acknowledged the ability of parties to exacerbate politically a cleavage that might socially be in decline. Nonetheless, he reiterated that a social basis was necessary for a party to exist. Thus, while for Lipset and Rokkan social cleavages were necessary, though not sufficient, for the formation of parties and of party systems, for Sartori they were neither necessary nor sufficient because politics can only be conducted independently of other social spheres. This autonomy of the parties from society had already been shown by Kirchheimer (1966) in his celebrated study in which he investigated the transition from the ‘party of social integration’ to the ‘catch-all party,’ that is, a party able to represent diverse classes and social groups electorally.

From the 1970s onwards, partly due to the development of more sophisticated techniques of social research, the debate moved in a more microempirical and less macrohistorical direction. The decade saw numerous studies of electoral behavior, although their results were equivocal. While early studies like Rose (1974) showed the relative decline of politics based on social cleavages (or ‘cleavage politics,’ as it came to be called), the magnitude and implications of this decline were given various interpretations by scholars. The 1992 study by Franklin suggested that the decline of cleavage politics was ineluctable; those of Inglehart (1977), Dalton et al. (1984) and subsequent studies until seemingly showed that cleavage politics were evolving in a new direction so that ‘cultural’ cleavages were now taking the place of fading social cleavages and reorienting electoral and political behavior.

For these authors, the new structure of divisions might indeed have a ‘social basis,’ but it was manifest in a clash of values: between industrial values (in favor of the quantitative growth of affluence) on the one hand, and postindustrial ones (which gave priority to the quality of life and the protection of the environment) on the other. Associated with each side were socioeconomic groups and geographical areas, but the clash involved distinct (and opposed) cultural conceptions and lifestyles. Of course, there was no lack of criticism of this approach, especially by Bartolini and Mair (1990), given that it emptied Lipset and Rokkan’s original concept of cleavage of much of its meaning. For this reason, Bartolini and Mair proposed the following redefinition of the notion: (a) empirically, a cleavage must be definable in terms of social structure; (b) normatively, a cleavage is a system of values which gives a sense of collective identity to a social group; (c) behaviorally, a cleavage is manifest in the interaction among political actors. Thus redefined, the concept of cleavage is broader in its compass and becomes a means to order social relations.

3. The Freezing of Cleavages

Sociologists and economists also joined the debate. Goldthorpe (1996), for example, found that traditional social divisions were still conditioning political allegiances and electoral choices at the end of the twentieth century. Other studies appeared which, although they extended the concept of social cleavage, continued to frame it in structural terms. Lijphart (1977), in his study of the small consociative democracies of continental Europe, and then in his analyses of the established democracies (Lijphart 1999), showed that ethnic divisions performed the same function in structuring identity and behavior as did the other social divisions of the Lipset–Rokkan model. These divisions, too, sprang from the long historical process that had led to the formation of the nation-state. Thereafter, they had continued to predominate despite the divisions created by the process of industrialization. In the nation-states, the divisions between agriculture and industry, and between capital and labor, were absorbed by more basic ethnic-linguistic cleavages. According to Lijphart, the diverse nature of these cleavages lay at the origin of the two principal models of democracy (what he called ‘consensual’ and ‘majoritarian’) that developed in the West after World War II.

The model of consensual democracy based on the inclusion in the executive of all the country’s main ethnic groups proved highly effective (in stabilizing democracy). It was accordingly used by authors (starting from Sartori and his studies of party systems in the 1970s) to investigate the workings of national societies connoted by identity divisions, albeit based on ideology rather than ethnicity or language. The reference here is to the postwar European democracies distinguished by the presence of powerful communist parties. Even these democracies were consensual in nature, although their operation was sustained, not by inclusive coalitions in the executive (access to which was barred to communist parties, owing to the geopolitical cleavages created by the Cold War), but by consensual practices in parliament. However, while these ideological cleavages proved unstable with the passage of time, this was not the case with ethnic ones. It seemed, indeed, that the model of consensual democracy had ended up by ‘freezing’ ethnic allegiances, though managing to cushion their impact.

The reasons why party systems were frozen in the postwar European democracies were expressly investigated by Bartolini and Mair (1990). These two authors examined three different hypotheses with regard to the freezing process. First, it may involve the freezing of social cleavages, that is, the stabilization of the social structure from which the parties draw legitimation for their political action. Second, the freezing may be due to the institutionalization of the political parties, albeit accompanied by the fading of the social divisions that had prompted their formation (here by ‘institutionalization’ is meant the parties’ ability to stand as the only practicable electoral choices). Third, the freezing may be due to the stabilization of the party system as such, or put otherwise, the institutionalization of the system of interactions among the main political actors. Bartolini and Mair seem to suggest that the third of these hypotheses is the most plausible, given that both the hypothesis of the freezing of social cleavages and that of the freezing of political parties must admit to so many exceptions that they are not falsifiable. In short, for both authors a distinction must be drawn between the freezing of party systems and the freezing of individual parties.

The freezing hypothesis has also been discussed in terms of voting behavior. Several surveys have sought, using different indicators, to collect reliable data on the stability and instability of voting choices. Many scholars, from Pedersen (1983) to Maguire (1983) and especially Bartolini (2000), have shown that rates of aggregate electoral volatility were relatively low until the 1980s, which corroborated Lipset and Rokkan’s original contention that continuity rather than change was the distinguishing feature of partisan politics in Europe. These studies came in for criticism, of course, mainly on the grounds that electoral stability does not necessarily coincide with stable interaction among the political parties, and that stability may conceal processes of dealignment and realignment sufficient to gainsay the logic of the Lipset–Rokkan model.

4. Between Europe and America

The theory of cleavages has been developed on the basis of the experiences of the Western European countries, with no reference to the other great model of democracy: that of the United States. And yet it was precisely in the United States that modern political parties and party systems were invented. Indeed, the myth of American exceptionalism has been fostered by this European neglect, a neglect motivated by the belief that American society, unlike those of European countries, is based on cross-cutting cleavages which, as Lipset maintained as early as 1963, are unable to produce stable divisions among citizens. This absence of cleavage politics, the argument ran, gave rise to the depolarization of partisan conflict in the United States that underpinned the stability of ‘American democracy.’ In short, the more cleavages multiply and interweave, the more numerous the divisions among citizens become, and the safer democracy grows.

And yet, as Bensel (1987) showed, the situation in the United States was not so clear-cut, for that country, too, displayed, and still does, a stable political cleavage: sectional rather than social and cultural, although it has latterly acquired these features as well. This is the political cleavage between states and regional areas expressed in two radically different conceptions of the balance of powers to be struck between the states and the center of the federation. And it should not be forgotten that this fracture provoked one of the most violent and bloody civil wars of the modern age. It is around this cleavage that the various party systems that have arisen since the foundation of the American republic have structured themselves.

In the light of the postnational experience of Europe at the end of the twentieth century, the case of the United States appears less exceptional than it did in the past. This is because the process of European integration has generated a sectional divide among geo-economic areas of the continent which cuts across the traditional (in Europe) party-political axis ranging from right to left. And in this case, too, the new contraposition has taken the form of a different interpretation of the balance of powers that should be established between the European and national institutions. Can European integration be regarded as a further historical cleavage, in addition to those singled out by the Lipset–Rokkan model, destined to produce another political structural cleavage? If so, the cleavage theory might be updated, this time bridging the European and American experiences.
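
The ‘aggregate electoral volatility’ referred to in the discussion of the freezing hypothesis is commonly measured with the index associated with Pedersen: half the sum of the absolute changes in party vote shares between two consecutive elections. The following is a minimal sketch in Python. The function name, the party labels, and the vote shares are hypothetical illustrations and are not taken from the studies cited above.

```python
def pedersen_index(previous, current):
    """Aggregate volatility between two elections, in percentage points.

    `previous` and `current` map party names to vote shares (percent).
    A party missing from one election is treated as having a share of 0.
    """
    parties = set(previous) | set(current)
    total_change = sum(
        abs(current.get(party, 0.0) - previous.get(party, 0.0))
        for party in parties
    )
    # Each voter who switches is counted twice (one party loses, one gains),
    # so the summed absolute changes are halved.
    return total_change / 2


# Hypothetical vote shares (percent) for two consecutive elections.
election_1 = {"A": 40.0, "B": 35.0, "C": 25.0}
election_2 = {"A": 38.0, "B": 30.0, "C": 22.0, "D": 10.0}

volatility = pedersen_index(election_1, election_2)
print(volatility)
```

A value near zero indicates that the parties' vote shares barely moved between the two elections. As the criticism reported above points out, a low aggregate figure of this kind can still conceal offsetting shifts among individual voters.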

See also: Conflict/Consensus; Conflict Sociology; Ethnic Conflict, Geography of; Ethnic Conflicts; Party Systems; Pluralism; Political Geography; Political Sociology; Race Relations in the United States, Politics of

Bibliography
Bartolini S 2000 The Class Cleavage: The Political Mobilization of the European Left, 1860–1980. Cambridge University Press, Cambridge, UK
Bartolini S, Mair P 1990 Identity, Competition and Electoral Availability. Cambridge University Press, Cambridge, UK
Bensel R F 1987 Sectionalism and American Political Development: 1880–1980. University of Wisconsin Press, Madison, WI
Daalder H 1966 Parties, elites and political development(s) in Western Europe. In: LaPalombara J, Weiner M (eds.) Political Parties and Political Development. Princeton University Press, Princeton, NJ
Dalton R J, Flanagan C S, Beck P A 1984 Electoral Change in Advanced Industrial Democracies: Realignment or Dealignment? Princeton University Press, Princeton, NJ
Eckstein H 1966 Division and Cohesion in Democracy: A Study of Norway. Princeton University Press, Princeton, NJ
Franklin M T, Mackie T, Valen H (eds.) 1992 Electoral Change: Responses to Social and Attitudinal Structures in Western Countries. Cambridge University Press, Cambridge, UK
Goldthorpe J H 1996 Class and politics in advanced industrial societies. In: Lee D J, Turner B S (eds.) Conflict about Class: Debating Inequality in Late Industrialism. Longman, London
Inglehart R 1977 The Silent Revolution: Changing Values and Political Styles among Western Publics. Princeton University Press, Princeton, NJ
Kirchheimer O 1966 The transformation of the Western European party systems. In: LaPalombara J, Weiner M (eds.) Political Parties and Political Development. Princeton University Press, Princeton, NJ
Knutsen O 1988 The impact of structural and ideological party cleavages in Western European democracies: A comparative empirical analysis. British Journal of Political Science 18: 323–52
Lijphart A 1977 Democracy in Plural Societies: A Comparative Exploration. Yale University Press, New Haven, CT
Lijphart A 1999 Patterns of Democracy. Yale University Press, New Haven, CT
Lipset S M 1963 Political Man: The Social Bases of Politics. Johns Hopkins University Press, Baltimore, MD
Lipset S M 1970 Revolution and Counterrevolution. Anchor Books, New York
Lipset S M, Rokkan S 1967 Cleavage structures, party systems and voter alignments: An introduction. In: Lipset S M, Rokkan S (eds.) Party Systems and Voter Alignments. Free Press, New York
Maguire M 1983 Is there still persistence? Electoral change in Western Europe, 1948–1979. In: Daalder H, Mair P (eds.) Western European Party Systems: Continuity and Change. Sage, Beverly Hills, CA
Pedersen M N 1983 Changing patterns of electoral volatility in European party systems, 1948–1977. In: Daalder H, Mair P (eds.) Western European Party Systems: Continuity and Change. Sage, Beverly Hills, CA
Rose R 1974 Electoral Behaviour: A Comparative Handbook. Free Press, New York
Sartori G 1969 From sociology of politics to political sociology. In: Lipset S M (ed.) Politics and the Social Sciences. Oxford University Press, New York
Sartori G 1976 Parties and Party Systems. Cambridge University Press, New York
Schattschneider E E 1948 The Struggle for Party Government. University of Maryland, College Park, MD

S. Fabbrini

Climate Change and Health

This article outlines the potential impacts on human health of climate change due to the accumulation of greenhouse gases in the earth’s atmosphere. It describes the range of potential mechanisms by which health could be affected and the difficulties of estimating the magnitude of such effects. It concludes with a brief discussion of how, notwithstanding the need to prevent climate change as far as possible, humankind will have to adapt to changing climate if the adverse effects are to be minimized.

1. Background

1.1 Climate Change

Human activities, particularly the burning of fossil fuels, but also changes in land use, are leading to the accumulation of greenhouse gases such as carbon dioxide and methane in the earth’s atmosphere. The resulting increase in ‘radiative forcing’ is leading to warming of the earth’s surface. The United Nations set up the Intergovernmental Panel on Climate Change (IPCC), a multidisciplinary body of scientific advisers, which in its third assessment report forecast an increase in the average global temperature of 1.4–5.8 °C between 1990 and 2100 (IPCC 2001). There are a number of sources of uncertainty in projections of future climate, including changes in greenhouse gas emissions and concentrations, the sensitivity of climate to greenhouse gases, and the impact of modulating processes, such as the short-term cooling effects of aerosols as a result of industrial emissions. However, it does appear likely that the rate of climate change over the twenty-first century will be far greater than any natural changes in world climate over the past 10,000 years. There has been substantial warming since 1856, when records began, with particularly rapid increases in temperatures since about 1980. The warmest year on record was 1998, partly as a result of the marked El Niño which occurred over 1997–8. In its second assessment report the IPCC concluded that the balance of evidence suggested that the impact of human
activities on global climate was now discernible (IPCC 1996). The subsequent IPCC report pointed to ‘new and stronger evidence that most of the warming over the last 30 years was attributable to human activities.’

1.2 Other Global Environmental Changes

Climate change is not occurring in isolation. There is a range of other global changes (stratospheric ozone depletion, loss of biodiversity, changes in land use patterns, and depletion of aquifers), all of which may also have effects on human health and society. There are linkages between climate change and some of these other phenomena; for example, the rise in the temperature of the lower atmosphere may increase stratospheric ozone depletion. Deforestation leads to loss of biodiversity, particularly when it involves tropical forests, and also results in a release of substantial amounts of carbon dioxide into the atmosphere. Population growth in developing countries and unsustainable patterns of consumption in industrialized nations are increasing the strain on earth’s life support systems and the demand for energy (McMichael and Powles 1999).

2. Potential Impacts on Health

2.1 Range of Effects

Climate change is likely to have substantial effects on human health through a range of pathways (McMichael and Haines 1997, McMichael et al. 1996) (Table 1). The potential effects of climate change on health are sometimes divided into direct and indirect, to separate those impacts where the chain of causation is short, such as increased deaths during heatwaves, from those that are mediated through a longer causal chain. The latter include changes in ecosystems which can, for example, affect the distribution of insect vectors of disease. Most of the anticipated effects are likely to be adverse, although some, such as possible reductions in excess winter death rates due to warmer winters in cool-temperate countries, could be beneficial. Quantification of potential impacts is complicated by the many uncertainties involved and any estimates should be taken as indicative.

Table 1
Mediating processes and direct and indirect potential effects on health of changes in temperature and weather

Mediating process | Health outcome
Direct effects
Exposure to thermal extremes | Changed rates of illness and death related to heat and cold
Changed frequency or intensity of other extreme weather events | Deaths, injuries, psychological disorders; damage to public health infrastructure
Indirect effects
Disturbances of ecological systems:
Effect on range and activity of vectors and infective parasites | Changes in geographical ranges and incidence of vector borne disease
Changed local ecology of water borne and food borne infective agents | Changed incidence of diarrheal and other infectious diseases
Changed food productivity (especially crops) through changes in climate and associated pests and diseases | Malnutrition and hunger, and consequent impairment of child growth and development
Sea level rise with population displacement and damage to infrastructure | Increased risk of infectious disease, psychological disorders
Biological impact of air pollution changes (including pollens and spores) | Asthma and allergies; other acute and chronic respiratory disorders and deaths
Social, economic, and demographic dislocation through effects on economy, infrastructure, and resource supply | Wide range of public health consequences: mental health and nutritional impairment, infectious diseases, civil strife

Source: McMichael and Haines 1997

2.2 Approaches to Assessing Potential Impacts

There are a number of approaches to assessing the potential impacts of climate change on health. These
include the study of historical analogues that simulate certain aspects of future climate change. One example is the study of the effects of the El Niño/Southern Oscillation (ENSO), a large irregularly occurring atmosphere–ocean system which results in relatively short-term climate changes over the Pacific region every 2–7 years. The warm event (El Niño) is followed frequently by a cold event (La Niña). The ENSO is also linked by distant connections (teleconnections) to climatic anomalies elsewhere in the world, particularly in countries bordering the Pacific and Indian oceans. Integrated mathematical modeling is also being used increasingly to estimate the future impact on health of climate change. In order to undertake such modeling, each component of the sequence of climate, environmental, and social change is represented mathematically.

2.3 Direct Effects of Heat and Cold

The link between heatwaves and increased death rates has been described in many parts of the world. Excess mortality is experienced especially by the elderly, in particular by those who live in disadvantaged areas without adequate air conditioning. Some of the increase in deaths is due to mortality displacement, i.e., a short-term shift in the time of death of those who would have died anyway in the near future. The threshold at which increased death rates occur depends on population acclimatization and is, therefore, higher in those cities where the populations are used to high temperatures. The impact of the first heatwave on mortality in a given summer is often greater than the impact of subsequent heatwaves, probably because a disproportionate number of susceptible people die during the first heatwave. Several studies have quantified the impact of climate change on heat-related mortality. For example, one study (Kalkstein and Greene 1997) estimated an annual excess mortality attributable to climate change (assuming acclimatization) of between 500 and 1,000 for New York and between 100 and 250 for Detroit by the year 2050.

There is controversy over the degree to which increases in summer mortality will be outweighed by decreases in winter mortality. Although death rates are higher in winter than in summer in the temperate countries, the relationship may not be directly due to low temperatures, and increased viral infections may be partly responsible. Some countries with very cold winters, for example, Russia, seem to have low excess winter mortality, probably because of the effective adaptation of the population to cold winters by the use of warm winter clothing and adequate indoor heating (Donaldson et al. 1998). The UK has a particularly high winter excess mortality and this may be due, at least in part, to fuel poverty. Within Europe larger increases in winter mortality may occur with decreasing temperature in warmer locations, e.g., Athens, than in colder locations (Eurowinter Group 1997), perhaps because populations in countries with generally mild winters fail to wear suitable clothing or their housing is not adapted to low temperatures.

2.4 Studies of the Effects of El Niño/Southern Oscillation

ENSO can affect rainfall, leading to either droughts or floods in parts of the world, as well as causing increases in temperature and changes in the frequency, intensity, and geographical distribution of extreme weather events such as storms.

The ENSO cycle has been associated with substantial changes in the incidence of malaria in countries such as Pakistan, Sri Lanka, Colombia, and Venezuela (reviewed by Kovats et al. 1999). The incidence of dengue fever (a viral disease carried by mosquitoes) is affected by the ENSO cycle in some Pacific Islands (Hales et al. 1996). There are large increases in the numbers of people affected by natural disasters at a global level in El Niño years and in the year following (Bouma et al. 1997). Other health impacts of El Niño include increases in respiratory disorders due to very high levels of air pollution as a result of forest fires that occurred, for example, both in Indonesia and Brazil in association with the 1997/98 event. The ENSO phenomenon is clearly not strictly an analogue for global climate change but does demonstrate that some diseases and health outcomes are sensitive to changes in climate. Recently, there have been suggestions that the frequency of El Niño may increase in the future as a result of climate change (Timmermann et al. 1999). Increasingly, forecasting is being used to give early warning of El Niño in order to improve preparedness and reduce the adverse effects.

2.5 Mathematical Modeling of Malaria

Mathematical modeling has been applied to the assessment of likely changes in the geographical range of vector-borne diseases such as malaria. One estimate, for example, suggests that approximately 45 percent of the world’s population live in zones of potential malaria transmission as defined by current climatic circumstances, and this would increase to around 60 percent towards the end of the next century assuming other relevant factors remain constant (Martens et al. 1995). Highly aggregated models such as the one used in this example are, of necessity, unable to take into account the complexity of future changes. Nevertheless, they give a broad indication of the potential magnitude and direction of change and are continually being refined. They suggest that changes in distribution of malaria are likely to occur particularly at the edges of the current distribution, including, for example, mountainous regions in the tropics and subtropics. Estimation of numbers of excess cases and deaths in
the twenty-first century as a result of climate change is hampered by our lack of knowledge, for example, about the potential advances in the development of an effective vaccine for malaria, the distribution of impregnated bed nets to reduce transmission, and the trends in the development of resistance of the parasites to drugs used in treatment.

A number of empirical studies in Zimbabwe, Rwanda, and Ethiopia have examined how climate variability influences the distribution of malaria. They have indicated that highland malaria can respond to climatic variability, but whether changes in the altitudinal range of malaria which have apparently been observed in a number of sites are due to global climate change is currently a matter of scientific debate. Only long-term monitoring of climate, vector populations, and the incidence of malaria, as well as potential confounding factors, such as changes in vector control programs and forest cover, can finally resolve the controversy.

2.6 Other Vector-borne Diseases

Other vector-borne diseases which may be affected include those carried by ticks, such as tick-borne encephalitis (inflammation of the brain) and Lyme disease. The former occurs widely in Central and Eastern Europe and in Scandinavia and the latter occurs in both Europe and the North Eastern US. Other factors which may affect tick-borne diseases include the pattern of forest cover, which can influence the distribution of animal hosts on which the ticks can feed, and changing patterns of leisure activities, which may influence exposure to bites by infected ticks. There are several clinical types of leishmaniasis, which are transmitted by sandflies in Asia, the Americas, Southern Europe, and Africa. Sandflies are sensitive to changes in temperature; for example, it was estimated that a 3 °C increase in temperature could increase both the geographic and seasonal distribution of one important species in Southwest Asia (Cross and Hyams 1996). The tsetse fly that transmits sleeping sickness (human African trypanosomiasis) is also climate sensitive. In Latin America, the distribution of Chagas’ disease, which is transmitted by the triatomine bug and causes long-term damage to the heart and to the muscle of the gastrointestinal tract, could be affected.

2.7 Extreme Events, Malnutrition, and Sea Level Rise

On average, around 120,000 people were killed by natural disasters every year between 1972 and 1996, with around 60 percent of the deaths occurring in Africa. Over the same period, on average nearly 140 million people were affected by such disasters annually; most of these were living in Asia (International Federation of Red Cross and Red Crescent Societies 1998). Drought, famine, and flood are the main categories of disaster responsible for the majority of people affected.

Floods may cause a range of impacts on health including deaths and injuries from trauma or drowning, increased incidence of diarrheal disease, and sometimes leptospirosis caused by exposure to infected rats’ urine in floodwaters. Malnutrition may increase following flooding in some countries where food security is a problem. The impacts on mental health may be substantial and in some cases long-lasting. An increase in suicides was reported from Poland following floods in 1997 and an increase in behavioral disorders amongst children has been reported. Some parts of the world may experience increased rainfall due to climate change that could lead to larger floods (IPCC 1998).

Climate change could exacerbate periodic and long-term shortages of water, especially in the arid and semi-arid parts of the world (IPCC 1998). Droughts tend to affect health particularly by causing a reduction in the availability of food. There may also be an increase in diarrheal diseases because water is scarce and there may not be enough for hygienic purposes. Severe drought does not invariably result in famine or major food shortages. For example, a severe drought in southern Africa in 1992 resulted in a crop failure rate approaching 80 percent in some of the most affected areas, but famine was averted because of regional cooperation and external assistance which provided grain shipments (Noji 1997). Unfortunately, international assistance to support humanitarian relief has fallen overall; for example, it declined 17 percent in real terms between 1992 and 1996, whereas emergency aid has tended to increase. After remaining steady at under half the United Nations target of 0.7 percent of gross national product (GNP) for more than 20 years, aid as a share of donors’ wealth fell to 0.25 percent in 1996, its lowest level ever (International Federation of Red Cross and Red Crescent Societies 1998). If this trend continues, populations in the twenty-first century may be not only more vulnerable to climatic disaster but less likely to receive effective assistance.

There have been many studies to assess potential changes in food production globally and regionally under conditions of climate change. In general, it appears likely that agricultural yields may increase in the twenty-first century in middle to high latitudes depending on crop type, growing season, and changes in temperature and seasonality of precipitation. However, there are concerns that yields may decrease in parts of the tropics and subtropics, particularly where dryland, nonirrigated agriculture predominates (IPCC 1998). This could lead to increased hunger, particularly in Africa.

Sea level rise caused by climate change may result in displacement of some populations, particularly those living on deltas and low lying islands, as well as leading
to salination of fresh water and increased vulnerability to extreme events.

2.8 Air Pollution
The weather has a substantial influence on the ambient concentrations of air pollutants; for example, high-pressure systems often create a temperature inversion which traps pollutants in the boundary layer near the earth’s surface. Because of increases in anticyclonic conditions in summer in some parts of the world, climate change may increase concentrations of some pollutants. Ozone formation and destruction occur by means of a complex series of photochemical processes, and concentrations in the troposphere may be higher under climate change depending on the emission of precursors. Any increase in forest fires could have substantial effects on human health because of the formation of ‘haze’ with high concentrations of fine particulates (see earlier discussion on El Niño). Concentrations of aeroallergens (pollen, etc.) could be affected by both temperature and precipitation but other factors are also involved, such as changes in land use and farming practices (Emberlin 1994).

3. Adaptation and Vulnerability
The reductions in greenhouse gas emissions resulting from the Kyoto protocol of the UN Framework Convention on Climate Change are likely to have little effect on the projected rises in temperature within the first half of the twenty-first century (Parry et al. 1998). Thus, reducing vulnerability to climate change is an important goal for public health in the twenty-first century. There are a number of factors which influence vulnerability, notably poverty with its associated lack of resources and infrastructure. Although historically the majority of greenhouse gas emissions have come from the industrialized nations, vulnerability to climate change is probably greater in developing countries.

Adaptation may be autonomous, indicating a natural or spontaneous response by individuals, or purposeful, typically by governments or other institutions in response to projected climate change. The latter might include strengthening existing disease surveillance systems for potentially climate sensitive diseases, improving vector control programs, and enhancing disaster preparedness plans.

4. Conclusions
Whilst adaptation to climate change is necessary, this does not negate the importance of strategies to mitigate climate change, particularly by reducing fossil fuel use (Haines and McMichael 1997). This could have near term benefits by reducing deaths and other adverse effects on health of air pollution (Working Group on Public Health and Fossil Fuel Combustion 1997). The provision of ‘clean energy’ is an important contribution to improving health, particularly in developing countries (Haines and Kammen 2000).

See also: Desertification; Globalization and Health; Globalization: Geographical Aspects; Health Policy

Bibliography
Bouma M J, Kovats R S, Goubet S A, Cox J S, Haines A 1997 Global assessment of El Niño’s disaster burden. Lancet 350: 1435–38
Emberlin J 1994 The effects of patterns in climate and pollen abundance on allergy: 1994. Allergy 49: 15–20
Cross E R, Hyams K C 1996 The potential effect of global warming on the geographic and seasonal distribution of Phlebotomus papatasi in Southwest Asia. Environmental Health Perspectives 104: 724–27
Donaldson G C, Tchernjavskii V E, Ermakov S P, Bucher K, Keatinge W R 1998 Winter mortality and cold stress in Yekaterinburg, Russia: Interview survey. British Medical Journal 316: 514–18
The Eurowinter Group 1997 Cold exposure and winter mortality from ischaemic heart disease, cerebrovascular disease, respiratory disease, and all causes in warm and cold regions of Europe. Lancet 349: 1341–46
Haines A, McMichael A J 1997 Climate change and health: implications for research, monitoring, and policy. British Medical Journal 315: 870–4
Haines A, Kammen D 2000 Sustainable energy and health. Global Change and Human Health 1: 78–87
Hales S, Weinstein P, Woodward A 1996 Dengue fever epidemics in the South Pacific driven by El Niño southern oscillation? Lancet 348: 1664–5
International Federation of Red Cross and Red Crescent Societies 1998 World Disaster Report 1998. Oxford University Press, New York
Intergovernmental Panel on Climate Change (WGI) Houghton J T, Meira Filho L G, Callander B A, Harris N, Kattenberg A, Maskell K (eds.) 1996 Climate change 1995: the science of climate change. Contribution of Working Group 1 to the second assessment report of the Intergovernmental Panel on Climate Change. Cambridge University Press, New York
Intergovernmental Panel on Climate Change, Watson R T, Zinyowera M C, Moss R H, Dokken D J (eds.) 1998 The regional impacts of climate change, an assessment of vulnerability. Cambridge University Press, New York
Intergovernmental Panel on Climate Change Working Group 1 2001 Third assessment report. Intergovernmental Panel on Climate Change, www.ipcc.ch
Kalkstein L S, Greene J S 1997 An evaluation of climate/mortality relationships in large US cities and the possible impacts of climate change. Environmental Health Perspectives 105: 84–93
Kovats R S, Bouma M J, Haines A 1999 El Niño and Health, WHO Task Force on Climate and Health, WHO/SDE/PHE/99.4, Geneva
Martens W J M, Jetten T H, Rotmans J, Niessen L W 1995 Climate-change and vector-borne diseases: A global modeling perspective. Global Environmental Change 5: 195–209
McMichael A J, Haines A 1997 Global climate change: The potential effects on health. British Medical Journal 315: 805–9
McMichael A J, Haines A, Sloof R, Kovats R S 1996 Climate Change and Human Health. WHO, EHG 96/7, Geneva
McMichael A J, Powles J W 1999 Human numbers, environment, sustainability and health. British Medical Journal 319: 977–80
Noji E K (ed.) 1997 The Public Health Consequences of Disaster. Oxford University Press, New York
Parry M, Arnell N, Hulme M, Nicholls R, Livermore M 1998 Adapting to the inevitable. Nature 395: 741
Timmermann A, Oberhuber J, Bacher A, Esch M, Latif M, Roeckner E 1999 Increased El Niño frequency in a climate model forced by future greenhouse warming. Nature 398: 694–7
Working Group on Public Health and Fossil-Fuel Combustion 1997 Short-term improvements in public health from global-climate policies on fossil-fuel combustion: An interim report. Lancet 350: 1341–8

A. Haines

Climate Change, Economics of

There are two sides to the economics of climate change. The first recognizes the economic costs and potential benefits that can be attributed to the physical and natural impacts of a changing climate: warmer temperatures, changes in precipitation patterns, rising sea level, and so on. These economic impacts include the cost of adapting to change in addition to the economic consequences that remain after such adaptation is effected. They also include the benefits that climate change might bring that would otherwise not have been forthcoming. The second side recognizes the economic costs that would be attributed to policies designed to mitigate climate change. These costs, too, include the cost of adapting to policy in addition to the residual consequences that persist in the wake of this adaptation.

Estimates of the economic impacts on both sides are highly uncertain, given our inability to understand fully the science of climate change and to look many decades into the future with any clarity. Estimates of both are also evolving continuously over time, so any estimate must be read with a sense of what was known at the time that it was published. The coverage offered here will provide some insight into this evolution even as it reports on the latest results that were available at the turn of the twenty-first century.

1. The Economic Cost of Climate Change Impacts
Costs associated with the impacts of climate change are generally judged in terms of economic damage that would be avoided if the change did not occur; benefits are similarly estimated in terms of fortuitous impacts that otherwise would not have happened. The Intergovernmental Panel on Climate Change (IPCC) reported preliminary estimates of the annual economic impact of a doubling of concentrations of greenhouse gases (approximately 550 ppmv) and an associated 2.5 °C increase in global mean temperature (Houghton et al. 1996). The estimates reported by the IPCC for the United States, for example, ranged from a low of $55.5 billion (1990$) offered by Nordhaus (1991) to a high of $139.2 billion (1990$) offered by Titus (1992). The low and high estimates were calculated to be 1.0 percent and 2.5 percent of anticipated gross domestic product in 2065, respectively. The year 2065 was chosen as a benchmark because that was when the specified doubling of concentrations was anticipated to occur.

All of the estimates reported in 1995 by the IPCC were dominated by declines in agricultural production. Agriculture is, of course, a sector whose current practices would likely be threatened by higher temperatures and less precipitation. Most of the estimates for agriculture and other sectors were, however, the result of vulnerability studies that paid little attention to the ability of humans and their institutions to reduce economic damage and expand economic opportunity by adapting, that is, by changing practices so that they became less vulnerable to the new climate or so that they could take greater advantage of its appearance. Moreover, most of these early studies relied on relatively primitive methods of tracking the different regional consequences of a 2.5 °C warming. Global climate modelers have long expected that some regions would see temperatures increase by more than 2.5 °C while others might actually get cooler. Some areas would get wetter while others get drier. Seas would rise in some places and actually fall more slowly elsewhere (where the coastline is actually rising at present). Finally, few of the early studies were able to consider the effects of changes in humidity, frequency of extreme temperature events, or any of the other more subtle physical ramifications of global warming.

More recent cost estimates have begun to overcome these shortcomings. Table 1 presents regional cost estimates of market impacts published by Mendelsohn et al. (2000) for a 2 °C warming and a 50 cm increase in sea level. Notice that the overall annual effect on world economic activity, a 0.3 percent reduction, is much smaller than in the earlier studies. Effects on agriculture still dominate, but the regional distribution of impacts is striking. Some regions, notably North America and Europe, are now seen to benefit from warming whereas others such as Africa are severely harmed. These results were based on a statistical approach that looked carefully at how various regions cope with their current climates to see how other regions might respond if their climates changed.

Table 2 gives estimates from Tol (1998) for a 1 °C warming in even greater geographical detail. Only
Latin America and southern and southeast Asia suffer losses in his work, and many regions (including Africa) benefit substantially. Modest warming might, it would seem, be a good thing. Indeed, Mendelsohn et al.’s work also supports this suggestion. Tol’s results do, however, offer a warning about leaping to that conclusion too quickly. Two columns in Table 2 report standard deviations for his estimates. The standard deviation is a measure of uncertainty that indicates roughly a 66 percent likelihood that the true impact of a modest 1 °C warming would lie within a range of plus or minus the standard deviation from the recorded figure. For example, then, Tol suggests roughly a 66 percent likelihood that the annual impact of a 1 °C warming on the Pacific members of the OECD would lie between a 0.1 percent loss (= 1.0 − 1.1) in gross domestic product (GDP) and a 2.1 percent gain in GDP. Given current understanding of climate and economic systems, therefore, there is roughly a 17 percent chance that GDP would climb by more than 2.1 percent and a 17 percent chance that GDP would fall by more than 0.1 percent of GDP. There is also a 17 percent chance that the economic damage suffered by southern and southeast Asia in the wake of a 1 °C warming would be larger than 2.8 percent of GDP. The size of the uncertainty within which impact estimates must be contemplated is still enormous, and the distributional ramifications of this uncertainty can be quite unsettling.

Table 1
Regional market impacts for a 2 °C warming (billions 1990$)
Region            Agriculture  Forest  Coast  Energy  Tourism  Total   %GDP
Africa                    -11      -6      0      -2       -3    -22   -0.8
Asia                      -14      -1     -5      -9       -8    -37   -0.1
Latin America              -7      -5      0      -4       -5    -22   -0.4
Europe                     34      16     -4       4       31     82    0.2
North America              21      14      0       0       13     47    0.2
Oceania                    -3      -1      0      -1       -1     -7   -0.6
OECD                       26      16     -7       2       45     82   0.15
Non-OECD                    6      -1     -3      14      -18    -40   -0.1
Source: Mendelsohn et al. (1999, Table 1)

Table 2
Annual impact for a 1 °C warming (billions 1990$)
Region                                  Estimate  Standard deviation  Percent of GDP  Standard deviation
American OECD                                175                 107             3.4                 2.1
European OECD                                203                 118             3.7                 2.2
Pacific OECD                                  32                  35             1.0                 1.1
Central Europe & former Soviet Union          57                 108             2.0                 3.8
Middle East                                    4                   8             1.1                 2.2
Latin America                                 -1                   5            -0.1                 0.6
Southern and southeast Asia                  -14                   9            -1.7                 1.1
Centrally Planned Asia                         9                  22             2.1                 5.0
Africa                                        17                   9             4.1                 2.2
Source: Tol (1998, Table 7)

The role of adaptation and learning how to incorporate physical impacts other than temperature is clearly demonstrated in Table 3. Regional estimates offered by four other scholars plus Tol are depicted there for a 2.5 °C temperature increase. Some include adaptation (switching crops, adjusting planting dates, adding or eliminating irrigation, adjusting fertilizer practices, and so on); others do not. Some include the fertilizing effect on plant productivity of higher carbon dioxide concentrations in the atmosphere, while others do not. Notice that carbon dioxide fertilization can turn damages into benefits and that adaptation can reduce damage and increase benefits. Indeed, all of Tol’s estimates with adaptation represent gains. Be warned, though, that he still reports enormous ranges of uncertainty surrounding them.

Finally, all of these results envision smooth if not predictable climate change. The real concern on the impacts side could, however, be the potentially exaggerated effects of sudden, surprising and perhaps irreversible consequences of warming. Economic systems never cope well with sudden changes in their environment, even if the changes are ultimately beneficial. As a result, the estimates quoted above could be dwarfed if the physical impacts of climate change are
not smooth. Yohe and Schlesinger (1998) computed the economic cost of sea level rise on the developed coastline of the United States with and without sufficient foresight for markets to respond to the threat of inundation. The difference between the two, one estimate under the best of circumstances of the extra cost of surprise, was as large as 100 percent even for sea-level rise trajectories in the middle of their own range of uncertainty.

Table 3
Economic impacts on agriculture of a 2.5 °C warming (percent of gross agricultural product)
Columns are grouped by study, left to right: Kane et al., Tsigas et al., Darwin et al., Reilly et al., Tol.
Fertilization?                            No      No     Yes      No      No     Yes     Yes     Yes     Yes
Adaptation?                               No      No      No     Yes      No      No     Yes      No     Yes
American OECD                           0.03   -0.31    0.05    0.10    0.03    0.00    0.00   -0.25    1.30
European OECD                          -0.52   -0.73    0.14   -0.41   -0.34   -0.06   -0.02    0.55    2.09
Pacific OECD                           -2.08   -1.38   -0.06    0.31   -0.31   -0.04   -0.01   -0.15    0.80
Central Europe/former Soviet Union     -0.02   -1.48   -0.07    0.14   -0.18   -0.25   -0.18    0.94    2.65
Middle East                            -0.01   -1.48   -0.07    0.14   -0.18   -0.25   -0.18   -0.44    0.58
Latin America                           0.05   -2.18   -0.47    0.10   -0.22   -0.15   -0.16   -0.76    0.55
Southern and southeast Asia            -0.08   -2.26   -0.32   -0.04   -0.91   -0.17   -0.13   -0.66    0.63
Centrally Planned Asia                  3.84   -3.97    0.28    0.11  -10.09    0.04    0.53    1.73    3.10
Africa                                 -0.01   -1.48   -0.07    0.14   -1.18   -0.25   -0.18   -0.23    0.47
Source: Tol (1998, Tables 1 and 2). He references Kane et al. (1992), Tsigas et al. (1996), Darwin et al. (1996), and Reilly et al. (1996)
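
The reading of the standard deviations in Table 2 described in this section can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes the estimates are normally distributed (an assumption not stated in the source), and the function names `one_sigma_range` and `tail_probability` are hypothetical. Under that assumption the band of plus or minus one standard deviation holds about 68 percent of the probability and each tail about 16 percent, close to the rounded 66 and 17 percent figures quoted in the text.

```python
from statistics import NormalDist


def one_sigma_range(estimate, std_dev):
    """Return (low, high) for the estimate plus or minus one standard deviation."""
    return round(estimate - std_dev, 2), round(estimate + std_dev, 2)


def tail_probability(n_sigma=1.0):
    """Probability of landing more than n_sigma above (or below) the estimate.

    Assumes a normal distribution, which is an illustrative assumption only.
    """
    return 1.0 - NormalDist().cdf(n_sigma)


# Percent-of-GDP estimates and standard deviations taken from Table 2.
pacific_oecd = one_sigma_range(1.0, 1.1)    # Pacific OECD
south_asia = one_sigma_range(-1.7, 1.1)     # Southern and southeast Asia

print("Pacific OECD range (% of GDP):", pacific_oecd)
print("Southern and southeast Asia range (% of GDP):", south_asia)
print("Probability in each tail:", round(tail_probability(), 3))
print("Probability within one standard deviation:",
      round(1.0 - 2.0 * tail_probability(), 3))
```

The lower bound computed for southern and southeast Asia, a loss of 2.8 percent of GDP, is the threshold quoted in the text for that region.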

2. The Economic Cost of Mitigating Climate Change
The scale and pace of climate change can be influenced by policy interventions that slow the emission of greenhouse gases. Many researchers have investigated the cost of this sort of climate change mitigation. They have, in particular, focused their attention on energy consumption and the resulting emission of carbon dioxide. Carbon dioxide is a product of burning fossil fuel, and its emission varies from fuel to fuel. Burning coal, for example, emits 25 percent more carbon per unit energy than burning oil, and burning oil emits 43 percent more than natural gas. Burning hydrogen emits no carbon. Hydroelectric, wind, solar, and nuclear power are similarly carbon-free sources of energy. Mitigation simply involves substituting carbon-free sources of energy for carbon-based fuels and low-carbon fossil fuels such as natural gas for high-carbon fuels such as coal. Our ability to effect and to sustain this sort of substitution over the very long run depends upon the availability of new technology and the supply of low-carbon and carbon-free sources of energy.

The most effective means of conveying the cost of mitigation is to track the economic impact of reducing cumulative global emissions through the year 2100 from ‘baseline’ levels that would have been anticipated in the absence of any policy intervention. Fig. 1 displays reductions in cumulative emissions from various baselines through the year 2100 against the estimated tax (marginal cost) that would have to be imposed per ton of carbon to achieve those reductions. The points portrayed there indicate selected estimates published by various researchers through the middle of 1998, and the curve summarizes these cost data as a function of percentage emissions reduction. Fig. 1 shows clearly that the marginal cost of emissions reduction increases at an increasing rate even though the taxes estimated for any particular reduction in emissions are dispersed. This dispersion is a reflection of the uncertainty with which these costs can be computed, an uncertainty caused by assumptions about technology, supplies, and the intensity with which future economic activity would employ energy along the baseline.

Figure 1
The marginal cost of reducing emissions of carbon dioxide

2.1 Interventions that Limit Atmospheric Concentrations
The Framework Convention on Climate Change (FCCC) committed the globe in 1992 to holding
concentrations of greenhouse gases below levels that would prevent ‘dangerous anthropogenic interference with the climate system.’ The precise concentration target that corresponds with this imperative has not yet been identified, so many researchers have investigated the economic cost of reducing emissions over the next 100 years so that concentrations do not exceed a range of thresholds. Manne and Richels (1997) estimate that the cost of achieving the most popular threshold, 550 ppmv, could be as high as $3.5 trillion (1990$) or as low as $650 billion (1990$). Estimates from other researchers were comparable; but they differed from one another for the same reasons as mentioned above. Costs for lower thresholds such as 450 ppmv are much higher. However, some possible low-emissions baselines achieve stable concentrations around 750 ppmv without any intervention of any kind.

Cost estimates for meeting concentration thresholds depend critically on the timing and location of each unit of emissions reduction. Wigley et al. (1996) have called these the ‘where’ and ‘when’ flexibility components of cost. Their ‘WRE’ results emphasize that costs would be minimized if each ton of emissions reduction were taken from the least costly source regardless of where it is located. Their results also required that emissions should be reduced at any point in time only if the present value of the associated cost is in line with all other reductions at all other times. Compared with emissions reductions suggested by the IPCC (see Houghton et al. 1992), the combined effect of exploiting both types of flexibility allowed the ‘WRE’ path to cut the cost of achieving the same 550 ppmv threshold from the same middle emissions baseline by more than 80 percent.

The ‘WRE’ results are controversial because ‘when’ flexibility implies that early reductions in emissions would be smaller than they would under the IPCC proposal. As a result, the near-term pace of climate change would be larger. Meanwhile, ‘where’ flexibility has served as an anchor for a wide range of proposals that would allow countries to trade permits to emit greenhouse gases. The idea here is that an emerging market for permits would work to ensure that least cost sources of reductions were always exploited.

It is difficult to contemplate near-term mitigation in the absence of any knowledge about the appropriate concentration target and without any real understanding about whether future baseline emissions will be relatively high or low. Yohe and Wallace (1996) looked at this problem as one of hedging. Policy directed at a low threshold along a high-emissions path would be far more vigorous in the near-term than a policy directed at a high threshold along a low-emissions path. Either would be in error, though, if either the presumed target or the presumed emissions path turned out to be incorrect. As a result, adopting either would impose extra cost on the global economy. Yohe and Wallace reported that least-cost hedging would support focusing on a middle concentration target such as 550 ppmv and assuming that emissions would otherwise track slightly higher than the commonly accepted ‘best-guess’ baseline. This sort of hedging would increase the costs of meeting a concentration target, but only modestly under the assumption of maximum geographical and intertemporal flexibility.

2.2 The Kyoto Emissions Reduction Protocol
The Third Conference of the Parties of the FCCC agreed in 1997 through the Kyoto Protocol to impose a set of greenhouse gas emissions targets for the world’s developed countries (the so-called Annex I or Annex A countries). The targets were different for different countries, but their combined effect would reduce total emissions from Annex I countries by nearly 6 percent relative to their 1990 levels and almost 20 percent from their 1999 levels by 2012. Non-Annex I countries were exempted by the Protocol from any emissions reduction.

Subsequent research has raised a large number of issues in regard to achieving any eventual FCCC concentration limit. Some are technical and deal with accounting procedures for counting emissions reductions. Others are more fundamental. First among these is the observation that full compliance by Annex I by 2012 with fixed total Annex I emissions thereafter will not stabilize concentrations at any level for most baselines. It follows that non-Annex I countries will eventually have to accept limits on their emissions, as well, if the FCCC objective of stable concentrations is to be achieved. Indeed, a 550 ppmv threshold would not be achieved along many baselines even if Annex I eliminated all dependence on fossil fuel by the middle of the twenty-first century.

Second, Annex I compliance with their Kyoto targets by 2012 does not conform with the cost-minimizing pattern of maximal intertemporal flexibility for most baselines and most concentration targets. Passing through the Kyoto benchmark increases the cost of meeting any threshold above 500 ppmv along all but the most energy intensive baselines.

Finally, negotiations about how to arrange for geographic flexibility within the implementation of the Kyoto Protocol are critical. The equity and cost implications of allowing flexibility within Annex I and/or the ability for Annex I countries to be credited for reductions that they underwrite in non-Annex I countries are enormous. McKibben and Wilcoxen (1999) have argued that changes in the so-called ‘terms of trade’ caused by massive transfers of wealth to non-Annex I countries in exchange for emissions reduction credits could actually lower their economic welfare. In addition, Manne (1999) observed that the partial global coverage of the Protocol could lead to significant ‘leakage’ so that global emissions might not fall as far as expected. Why? Because restricted emissions
in Annex I would cause the prices of fossil fuels to fall and thereby increase emissions across the developing world.

2.3 Optimal Emissions Reductions
Nordhaus (1991) was the first researcher to weigh the present value of the benefits of mitigation policy against the present value of its costs to compute an economically optimal policy trajectory over the long term. His results were based on the anticipation that impacts would be smooth and amount to roughly 1 percent of gross world product if the global mean temperature rose by 2.5 °C. Corroborated by his own subsequent work and by others, they support modest early intervention followed by smooth and gradual tightening of emissions restrictions. Indeed, most optimality exercises propose carbon taxes between $10 and $20 around 2000; and most see those taxes increasing over time at approximately 3 percent per year. None of the results achieve stable atmospheric concentrations along most baselines, and none come close to the restrictions imposed on 2012 emissions by the Kyoto Protocol.

3. Synthesis as the Future Unfolds
Synthesizing the economics of climate change and climate policy is an evolving process that will continue well into the twenty-first century. As researchers learn more about both, the costs associated with both could easily fall. However, they may not, particularly on the climate side of the calculus. The possibility of sudden and, as yet, unanticipated impacts could change the picture dramatically.

See also: Agricultural Change Theory; Agricultural Sciences and Technology; Agriculture, Economics of; Climate Change and Health; Climate Impacts; Climate Policy: International; Desertification; Ecological Economics; Economic Geography; Environmental Adaptation and Adjustments; Environmental and Resource Management; Environmental Challenges in Organizations; Environmental Planning; Environmental Policy; Environmental Vulnerability; Food Security; Globalization: Geographical Aspects

Bibliography
Darwin R F, Tsigas M, Lewandrowski J, Raneses A 1996 World Agriculture and Climate Change. US Department of Agriculture Report 703. US Department of Agriculture, Washington, DC
Houghton J T, Callander B A, Varney S K (eds.) 1992 Climate Change 1992. The Supplementary Report to the IPCC Scientific Assessment. Cambridge University Press, Cambridge, UK
Houghton J T, Meira Filho L G, Callander B A, Harris N, Kattenberg A, Maskell K (eds.) 1996 Climate Change 1995: The Science of Climate Change. Cambridge University Press, Cambridge, UK
Kane S, Reilly J M, Tobey J 1992 An empirical study of the economic effects of climate change on world agriculture. Climatic Change 21: 17–35
Manne A 1999 International carbon agreements, trade and leakage. In: Proceedings of the IEA/EMF/IIASA Energy Modeling Meeting. Energy Modeling Forum, Stanford, CA
Manne A, Richels R 1997 On stabilizing CO2 concentrations: cost-effective emission reduction strategies. In: Cameron O K, Fukuwator K, Morita T (eds.) Proceedings of the IPCC Asia–Pacific Workshop on Integrated Assessment Models. Center for Global Environmental Research, Ibaraki, Japan, pp. 439–59
McKibben W, Wilcoxen P 1999 The theoretical and empirical structure of the G-Cubed model. Economic Modelling 16: 123–48
Mendelsohn R, Morrison W, Schlesinger M E, Andronova N G 2000 Country-specific market impacts of climate change. Climatic Change 45: 553–69
Nordhaus W D 1991 To slow or not to slow: the economics of the greenhouse effect. Economic Journal 101: 920–37
Nordhaus W D 1994 Managing the Global Commons: The Economics of Climate Change. MIT Press, Cambridge, MA
Reilly J M, Baethgen W, Chege F E, van de Geijn S C, Lin E, Iglesias A, Kenny G, Patterson D, Rogasik J, Roetter R, Rosenzweig C, Sombroek W, Westbrook J 1996 Agriculture in a changing climate: impacts and adaptation. In: Watson R T, Zinyowera M C, Moss R H (eds.) Climate Change 1995: Impacts, Adaptations and Mitigation of Climate Change: Scientific–Technical Analysis. Contributions of Working Group II to the Second Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, UK, pp. 427–68
Titus J G 1992 The cost of climate change to the United States. In: Majumdar S K, Kalkstein L S, Yarnal B, Miller E W, Rosenfeld L M (eds.) Global Climate Change: Implications, Challenges and Mitigation Measures. Pennsylvania Academy of Science, Easton, PA, pp. 217–35
Tol R S J 1998 New Estimates of the Damage Costs of Climate Change. Part I: Benchmark Estimates. Institute for Environmental Studies, Vrije Universiteit, Amsterdam
Tsigas M E, Frisvold G B, Kuhn B 1996 Global climate change in agriculture. In: Hertel T W (ed.) Global Trade Analysis: Modeling and Applications. Cambridge University Press, Cambridge, UK, pp. 3–15
Wigley T, Richels R, Edmonds J A 1996 Economic and environmental choices in the stabilization of atmospheric CO2 concentrations. Nature 379: 240–43
Yohe G, Schlesinger M E 1998 Sea-level change: the expected economic cost of protection and abandonment in the United States. Climatic Change 38: 447–72
Yohe G, Wallace R 1996 Near-term mitigation policy for global change under uncertainty: minimizing the expected cost of meeting unknown concentration thresholds. Environmental Modeling and Assessment 2: 47–57

G. W. Yohe

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences, ISBN: 0-08-043076-7

Climate, History of

1. Introduction
Climatologists and climate historians have assembled robust evidence that the world’s climate has changed significantly over the past millennium. However, for most historians climate still is an unacknowledged constant. There remain serious intellectual and practical obstacles to understanding and using the new evidence that is now becoming available (Richards 2001). The first section of this article reviews the discussion on the issue. The second section examines the evidence and the approaches used for reconstructing past weather and climate. The third section reviews the main trends of climate variability over the last millennium. In the last section the assessment of climate impacts on premodern societies is considered.

2. The Discussion on ‘Climate and History’
Every society develops philosophical and mythical interpretations about the role of the natural environment in human affairs. Enlightenment thinkers concluded that cultures were determined or strongly shaped by climate. Economists and geographers (e.g. Stanley Jevons, Eduard Brückner) assumed that economic life was affected by climatic cycles. Environmental determinism was carried to an extreme by the geographer Ellsworth Huntington in the early twentieth century (Fleming 1998). Sociologist Emile Durkheim summarily rejected efforts to link human performance to climate changes. He postulated that

stimulating efforts towards reconstructing past climates. In this respect historical climatology cooperated with scientific disciplines. Remarkable progress was made from mostly isolated attempts at reconstructing local climate histories, to successful attempts at boiling down regional evidence into quasi-homogeneous highly correlated monthly time series of temperature and precipitation indices on the supraregional scale (Pfister et al. 1999). In the late 1990s historical climatology was moving to the center of the controversial detection debate about anthropogenic climate change, because documentary data is the only evidence for assessing the frequency and clustering of rare but socioeconomically significant disasters such as intense storms, severe floods, and droughts.

On the other hand the mainstream of historians turned away from structural history and the ‘longue durée’ in favor of discourse analysis. As a consequence the incentive to investigate the impacts of the climatic variability being detected by climate historians declined. In the context of the ongoing debate the issue of the social perception of (reconstructed) climate change may become attractive for historians of ideas.

3. The Reconstruction of Weather and Climate from Natural and Manmade Archives
The global climate of the last millennium is reconstructed using evidence from both natural and manmade archives. Data from natural archives (e.g., tree ring or ice core data) is essential for those periods of history and those regions of the world for which documentary evidence is sparse or nonexistent, such
social issues could be explained solely by social factors as precolonial North America. However, most recon-
(Glaeser 1994). The discussion on the social signifi- structions from natural archives cannot be broken
cance of climatic variations was resumed by historians down into sufficiently short units of time (e.g., months
of the French Annales School (e.g., Emmanuel Le Roy or seasons) and into specific parameters (temperature
Ladurie) after World War II. Ladurie suggested that and precipitation) (Bradley 1999) that would be
historical climate should first be reconstructed for its needed for conclusive investigations into the human
own sake without considering its potential significance dimension of climatic change.
for human history. This issue should only be addressed Data from manmade archives (i.e., documentary
in a second step based on reliable reconstructions of evidence) is well researched in Europe and East Asia.
past climate. However, he was skeptical in this respect, Investigations have hardly begun in Latin America.
postulating that ‘the narrowness of the range of secular The evidence for Africa is spotty. In the Islamic world
temperature variations and the autonomy of the the possibly abundant evidence remains to be ex-
human phenomena which coincide with them in time plored.
make it impossible for the present to claim that there Documentary evidence is classified into descriptive
is any casual link between them’ (Le Roy Ladurie 1971, and proxy data. Descriptive data includes chroniclers’
p. 293). The English meteorologist Hubert Lamb narratives of weather patterns characteristic of a
(1988), who took an active interest in history, became particular region. Chinese historians have drawn upon
Le Roy Ladurie’s most prominent opponent. Lamb observations found in local gazetteers maintained by
was convinced that weather and climate had affected local gentry in nearly every district. In order to portray
human affairs in the past and that humankind would more objectively the character of extreme events,
do well to examine some of the lessons provided by chroniclers referred to the duration of snow cover or
nature. the freezing of bodies of water, to the development of
From the early 1990s the framework of the dis- crops, and to high and low water levels. In Europe
cussion changed. On one hand the issue of the daily weather observations were promoted by the rise
increased greenhouse effect was put on the agenda of planetary astronomy from the late fifteenth century.


Regular instrumental measurements of weather began in the late seventeenth century. From 1860 national meteorological networks came into being.

Evidence providing an indirect measure of climate is mostly drawn from administrative records. These may yield long, continuous, and quasi-homogeneous series of climate related data that reflect the beginning of agricultural activities (e.g., the vine harvest), agricultural production (e.g., yield of vineyards), or the time of freezing and opening up of waterways (Pfister et al. 1999).

Records of rogations (i.e., standardized religious ceremonies to put an end to a meteorological stress situation) are a promising source for the Spanish-speaking world. In Spain rogations were recorded in the account books of both the municipalities and the church (Martin and Barriendos 1995).

Manmade archives are interpreted by historical climatology, which serves as an interface between climatology and history. It is directed towards three objectives (Pfister et al. 1999):
(a) reconstructing weather and climate as well as natural disasters prior to the creation of meteorological networks;
(b) investigating the vulnerability of past societies to climatic extremes and natural disasters;
(c) exploring past discourses on and social representations of climate.

Usually the evidence available for a given month or season is converted to ordinal indices for temperature and precipitation. The computing of transfer functions with instrumental series allows temperature and precipitation to be assessed for the pre-instrumental period. Series of indices were also included in statistical models to reconstruct monthly mean air pressure at sea level for the eastern North Atlantic-European region (25°W to 30°E, and 35°N to 70°N) back to 1659 (Luterbacher et al. 2000).

4. Climatic Trends and Anomalies Over the Last Millennium

4.1 Three Main Phases

Palaeoclimatologists and climate historians describe three main phases of climatic change over the past millennium: a ‘Medieval Warm Period’ to 1300 (Hughes and Diaz 1994); a subsequent cool phase lasting to the late nineteenth century that is labeled the ‘Little Ice Age’ because glaciers in most regions of the globe were expanding during that time (Bradley and Jones 1996); and the twentieth century, which is the warmest period of the millennium, partly as a consequence of the increased greenhouse effect. However, such generalizations on the global level mask a broad array of contrasting regional and local trends. Moreover, in order to investigate human vulnerability to climatic stress, the perspective of ‘ages’ needs to be broken down to regional monthly or seasonal temperature and precipitation patterns. So far, this level of detail is only available for Europe and China.

4.2 Central Europe

After a cold phase in the twelfth century, winters were prevailingly warm from 1180 to 1300. From 1300 to 1900 the winter half-year was colder than today. This is related to more frequent and sustained advection of cold, dry continental air-masses from the (north) east. Severe winters were frequent from 1306 to 1328, 1430 to 1490, 1565 to 1615, 1655 to 1710, 1755 to 1860, and 1880 to 1895. From 1365 to 1400, 1520 to 1560, and from 1610 to 1650 moderate winters prevailed. Springs were extremely cold in the 1690s and in the 1740s. Summers do not show distinct long-term characteristics. Those in the thirteenth century were prevailingly warm and dry. In the fourteenth century clusters of cold and wet summers occurred repeatedly (e.g., in the 1310s and 1340s). From 1380 to 1430 and again from 1530 to 1565 the summer half-year was as warm as today. Over the last third of the sixteenth century cold spells and long rains in midsummer expanded at the expense of warm anticyclonic weather. This tendency culminated in the 1590s (Pfister et al. 1999). Summers at the beginning and end of the seventeenth century were prevailingly cool while those from 1630 to 1687 were moderate. In the 1700s several warm decades (the 1720s, the 1730s, and the 1780s) stand out in England and in Central Europe, whereas the first half of the nineteenth century, particularly the 1810s, was markedly cooler (Bradley and Jones 1996).

4.3 Russia

Winters became more severe at the end of the sixteenth century, in particular from 1620 to 1680 and in the first half of the nineteenth century.

In the summer half-year droughts were frequent from 1201 to 1230, 1351 to 1380, and 1411 to 1440. A period of comparatively warm conditions in all seasons stands out during the first half of the sixteenth century. Subsequently, cold spells occurred more often from 1590 to 1620 and from 1690 to 1740 with a peak in the 1730s. Droughts occurred frequently from 1640 to 1659 and from 1680 to 1699. The six decades from 1770 to 1830 were warm, and droughts were frequent from 1801 to 1860. Summers from 1890 to 1920 were by far the coldest in the last 500 years. This included an unusually large number of extreme dry and wet seasons (Bradley and Jones 1996).

4.4 China

In South China the thirteenth century was the warmest of the last millennium. Three cold periods—1470 to


1520, 1620 to 1740, and 1840 to 1890—are identified, the 1650s being by far the coldest decade. Rainfall during the seventeenth century was extremely variable. Temperatures during the eighteenth century, unlike in Europe, rarely climbed to twentieth century levels, but precipitation conditions were more favorable. Climate variability increased markedly throughout the nineteenth century to a maximum in the early twentieth century. In North China two cold periods, 1500 to 1690 and 1800 to 1860, stand out over the last six centuries. Considering all seasons, the period from 1650 to 1670 was the coldest, but the summer half-year was almost equally cold from 1580 to 1600 (Wang 1991).

4.5 The Mediterranean

After a cold twelfth century the period 1200 to 1400 was very warm in the southwest. Annual precipitation in Morocco was generally lower from the sixteenth to nineteenth centuries (Bradley and Jones 1996). In Catalonia (northeastern Spain) dry spells in the winter half-year were frequent in the mid-sixteenth century, but almost absent from 1580 to 1620. Numerous autumnal floods were reported from 1580 to 1630, from 1770 to 1800, and again from 1840 to 1870 (Martin and Barriendos 1995).

4.6 Latin America

In both Spanish and Portuguese America there seems to have been a trend to greater aridity in the 1700s compared to the 1600s. Dendroclimatic evidence for the Santiago de Chile area indicates higher than average rainfall from 1450 to 1600 whereas droughts became frequent over the subsequent centuries (e.g., 1637 to 1640, 1770 to 1773, 1790s, 1810s). In the Buenos Aires region (Argentina) the 1700s were drier than the previous century. Prolonged droughts are recorded in the 1690s, the 1710s, the 1750s, and 1771 to 1774 (Claxton 1993).

4.7 ENSO

The El Niño Southern Oscillation (ENSO) is the result of a cyclic warming and cooling of the ocean surface in the central and eastern Pacific that strongly affects rainfall in the areas around the Pacific and the Indian Ocean. Archival data suggests that ENSO episodes from 1600 to 1900 had more intense and global effects than those of the twentieth century. For example, the worst droughts in the colonial history of India (mid 1590s, 1629 to 1633, 1685 to 1688, 1788 to 1793, 1877 to 1878) are related to ENSO connected failures of the monsoon. For the last two events the global dimension of these episodes is demonstrated (Grove and Chappell 2000).

5. The Historical Significance of Climatic Change

The issue of whether climatic change has had a significant impact on history is controversial. It should not be overlooked that both ‘climate’ and ‘history’ are blanket terms located on such a high level of abstraction that relationships between them cannot be investigated according to the rules of scientific methodology. In order to become more meaningful, the issue needs to be broken down to lower scales of analysis, e.g., by focusing on specific human activities and/or needs in relation to a given set of climatic variables. Regarding preindustrial societies this concerns primarily the availability of biomass (e.g., food, fodder) and energy (e.g., wind, water-power) but also processes of population dynamics (e.g., patterns of disease and epizootics, as well as fertility of men and livestock), and transport and communications as well as military and naval operations. Undoubtedly, beneficial climatic effects tend to enlarge the scope of human action, whereas climatic shocks restrict it or even lead to emergency situations. Which climatic constellations matter for energy availability and population dynamics depends on the environmental, cultural, and historical context.

Models of climatic effects on society are often framed as a chain of causation. Climatic patterns have a first order or biophysical impact on agricultural production or on the outbreak of diseases or epizootics. These may have second order effects on prices of food or raw materials, which may then ramify into the wider economy and society (third order impacts). The farther we move away from first order impacts, the greater the complexity of the factors masking the climatic effect. It is also plain that it is easier to investigate the effects of short-term impacts. In dealing with the effects of multidecadal climate variations we have to account for modifications in the economic, institutional, and environmental setting so great as to vitiate any attempt at strict comparison or measurement (Kates et al. 1985). Most climatic impacts were related to food scarcity or famines.

Crises were triggered by a slump in overall agricultural production. This could be a consequence of climatic shocks or warfare. In Central Europe severe climate-induced crises (e.g., 1569–74, 1627–29, 1692–94, 1769–72, 1816–17, 1853–55) were connected to a cumulation of unfavorable weather patterns, which made the traditional risk minimizing strategies ineffective (Richards 2001). Rainfall is the limiting factor in the subtropical and tropical zones; in higher latitudes it is summer warmth. Connections between climatic anomalies and diseases are complex. Some diseases (e.g., cholera) are climate related whereas others (e.g., bubonic plague) are not (Rotberg and Rabb 1983). The theory of pre-industrial trade cycles considers the harvest the critical determinant influencing urban income and rural employment levels. A sharp rise in food prices promoted widespread unemployment, begging, and vagrancy that further propagated infectious diseases and increased crisis mortality (Post 1985).

Crises represented a major challenge for political and social systems. Rather than investigating changes in average values, historical climatology should focus on changes in the frequency and severity of extremes. The evidence is growing that exogenous shocks (including natural disasters) have a tendency to cluster rather than being randomly distributed along the time axis, as is often believed. This allows us to distinguish between periods of high and low climatic stress.

An example is provided by sixteenth-century Europe: there was a sudden increase in the number of cold anomalies after 1565. Over the subsequent decades climate became more significant for food prices than population levels and increases in the money supply. The case is even more obvious for wine production, which as a consequence of an almost uninterrupted series of cold summers nearly collapsed from 1585 to 1600 across a large region ranging from Switzerland to Hungary. The slump of wine production had far-reaching consequences for major social groups depending on vine growing. Many peasant communities suffered such great collective damage from the effects of continuous crop failures that they pressed the authorities to permit witch hunts. Thousands of witches were burnt as scapegoats of climatic change (Behringer 1999).

Based on the new reconstructions that are becoming available, the significance of climatic variability needs to be reassessed in many contexts of economic, social, and environmental history without including deterministic overtones.

See also: Climate Change and Health; Climate Change, Economics of; Climate Impacts; Desertification; Environmental Determinism; Irrigation Societies; Water Resources

Bibliography

Behringer W 1999 Climatic change and witch hunting: The impact of the little ice age on mentalities. In: Pfister C, Brázdil R, Glaser R (eds.) Climatic Variability in Sixteenth-century Europe and its Social Dimension. Kluwer, Dordrecht, The Netherlands
Bradley R S 1999 Palaeoclimatology. Academic Press, San Diego, CA
Bradley R S, Jones P D (eds.) 1996 Climate Since AD 1500. Routledge, London
Claxton R H 1993 The record of drought and its impact in colonial Spanish America. In: Herr R (ed.) Themes in Rural History of the Western World. Iowa State University Press, Ames, IA
Dupâquier J 1989 Demographic crises and subsistence crises in France, 1650–1725. In: Walter J, Schofield R (eds.) Famine, Disease and the Social Order in Early Modern Society. Cambridge University Press, Cambridge, UK
Fleming J R 1998 Historical Perspectives on Climate Change. Oxford University Press, Oxford, UK
Glaeser B 1994 Soziologie der Umwelt. In: Ernste H (ed.) Pathways to Human Ecology. Lang, Bern, Switzerland, pp. 115–32
Grove R, Chappell J (eds.) 2000 El Niño, History and Crisis. White Horse Press, Knapwell, UK
Hughes M K, Diaz H F 1994 The Medieval Warm Period. Kluwer, Dordrecht, The Netherlands
Kates R W, Ausubel J H, Berberian M (eds.) 1985 Climate Impact Assessment, Studies of the Interaction of Climate and Society. Wiley, Chichester, UK
Lamb H H 1988 Weather, Climate & Human Affairs. Routledge, London
Le Roy Ladurie E 1971 Times of Feast, Times of Famine: A History of Climate Since the Year 1000. Doubleday, Garden City, NY
Luterbacher J, Rickli R et al. 2000 Monthly mean pressure reconstruction for the late Maunder minimum period (AD 1675–1715). Journal of Climatology 20: 1049–66
Martin V J, Barriendos V M 1995 The use of rogation ceremony records in climatic reconstruction. Climatic Change 30: 201–21
Pfister C, Brázdil R, Glaser R (eds.) 1999 Climatic Variability in Sixteenth Century Europe and its Social Dimension. Kluwer, Dordrecht, The Netherlands
Post J D 1985 Food Shortage, Climatic Variability, and Epidemic Disease in Preindustrial Europe. Cornell University Press, Ithaca, NY
Richards J F 2001 The Unending Frontier: Environmental History in the Early Modern World. University of California Press, Berkeley, CA
Rotberg R I, Rabb T K (eds.) 1983 Hunger and History. Cambridge University Press, Cambridge, UK
Wang S W 1991 Reconstruction of temperature series of North China from 1380s to 1980s. Science in China B 34: 751–9
Wigley T M L, Ingram M J, Farmer G (eds.) 1981 Climate and History. Cambridge University Press, Cambridge, UK

C. Pfister

Climate Impacts

1. Introduction

While there is scientific consensus that increased atmospheric concentrations of greenhouse gases will likely raise global temperatures, with associated increases in global precipitation and sea level, there is no consensus on how fast and how much the climate may change, on how regional climates may change, or on how climate variability may change. Climate change impact assessment for a country or region consists of a set of tasks beginning with problem definition and leading through sector analysis to analysis of adaptation methods and response policies.

A broad understanding of the potential future with climate change demands multifaceted analyses, involving study of both biophysical and socioeconomic processes. A wide range of methods for climate change impact analysis has been developed, from simple


regression models to complex integrated systems models. Techniques are becoming ever more complex as more interacting systems and the propagation of uncertainties are included in the analysis. The challenge is to simulate the biophysical and socioeconomic aspects of a system (such as agriculture, human health, urban areas) in a framework appropriate to regional, national, international, and global scales. Spatial analyses and first-order biophysical impacts are important, as well as assessment of vulnerability in the socioeconomic welfare of the different components of the system. Thus, biophysical scientists and social scientists must work together to provide realistic assessments of how climate change might affect a system in the future (see Fig. 1).

Figure 1
Integrated impacts, adaptation, and vulnerability framework
Source: Rosenzweig and Iglesias 2000

Methodological issues to be resolved include how to generalize from the enormous heterogeneity of exposure units and systems and how to address spatial scales and units of analysis from field to region to nation and beyond. Models must be continually tested, calibrated, validated, and improved for their use to be well-founded. The inclusion of the transient nature of climate change and its associated uncertainties in the modeling techniques is particularly important.

2. Approach

There are several approaches that serve as foundations to climate change impact studies. One approach is based on climate change scenarios, that is, projections of what future climate variables (and the characteristics of future impacts) may be like. Equilibrium climate change scenarios have been most often used in this approach, but recent more realistic studies and projections incorporate dynamic or ‘transient’ climate change scenarios. The climate change scenarios approach sometimes includes the study of responses of the system to past climatic variations, in order to allow comparison with future projections.

Another approach is threshold-based, and attempts to define the limits of sensitivity of a system as it is currently configured to changes in climatic variables. The first approach addresses the question, ‘What will the system be like in a given future changed climate?’ while the threshold approach asks, ‘What type, magnitude, and rate of climate change would seriously perturb the system as we know it?’ This approach most often applies transient scenarios of climate change. Both of the approaches construct a chain of causality from the biophysical responses at a small scale to


socioeconomic effects at the regional, national, and international levels.

Several different techniques from the field of economics have been used in climate change impact analysis. One technique is the utilization of economic data to estimate the value of climate to the exposure unit (e.g., farmers) implicitly through regression equations. Linear programming models of the national sector (e.g., agriculture) are also used, as well as linked national and regional models.

Analysis of adaptive responses to climate change is an important part of climate change impacts research. The biophysical approaches described above allow the explicit examination of exposure unit adaptations, while the economic approach deals with adaptation implicitly.

3. Climate Change Scenarios

Climate change scenarios are defined as plausible combinations of climatic conditions that may be used to test possible impacts and to evaluate responses to them. Scenarios may be used to determine how vulnerable a sector is to climate change, to identify thresholds at which impacts become negative or severe, and to compare the relative vulnerability among sectors in the same region or among similar sectors in different regions.

It is still difficult, if not impossible, to associate probabilities with any particular scenario of climate change, due to uncertainties in future emissions of radiatively active trace gases and in the response of the climate system to those emissions. Thus, impact studies based on climate change scenarios do not make actual predictions; rather, they are useful in defining for critical biophysical and socioeconomic systems directions of change, relative magnitudes of change, and potential critical thresholds of climate-sensitive processes. By conducting climate change impact analyses, researchers and resource managers are conducting ‘practice’ exercises, which help to engender flexibility in the systems’ responses to potentially changing conditions in the future.

3.1 Arbitrary Scenarios

The simplest scenario is the application of prescriptive changes, such as a 2 °C increase in temperature and/or a 10 percent decrease in precipitation, to observed climate. Tests with such simple changes can help to identify the sensitivities of systems to changes in different variables. One can isolate the effects of one climate variable, for example, temperature, while holding other variables constant. However, such tests do not offer a consistent set of climate variables, since evaporation, precipitation, wind, and other variables are all likely to change with change in temperature. Arbitrary scenarios do, however, provide a set of responses to which other types of scenarios may be compared.

3.2 Historical Analogs

Another type of climate change scenario is based on the historical record. Observations from cool or warm, wet or dry historical periods are used to construct scenarios for use in modeling studies of climate change impacts. Such periods are also useful for the insights provided by studying the responses of any given system to periods of climatic extremes. The Dust Bowl of the 1930s in the Southern Great Plains is a well-known example (see e.g., Warrick 1984), but past freeze events, aquifer depletion, and lake-level changes have also been used to study societal responses to regional climate change (Glantz 1988).

A difficulty with either of these scenario approaches as proxies for the global warming currently predicted for increasing CO2 and other trace gases is that the patterns of climate warming may be different depending on the nature of the atmospheric forcing mechanisms.

3.3 GCM-based Scenarios

Climate change scenarios are also derived from global climate model (GCM) experiments with specified forcing mechanisms (e.g., 1 percent annual increase in greenhouse gas concentrations in the atmosphere). Current GCM experiments are conducted to produce transient climate projections. The advantages of GCM scenarios are their internal consistency and global extent. GCMs estimate how regional and global climates may change in response to increased concentrations of trace gases. Thus, regional and global climate responses are internally consistent. The climate variables are also physically consistent, as heat, moisture, and energy processes are calculated from a consistent set of equations representing physical processes.

At present, GCMs represent current climate at global and zonal (latitudinal) scales, but do not do particularly well at simulating regional current climate. Differences in climate projections among GCMs increase as scale decreases from the global to the regional and gridbox levels. GCM simulation of current temperature regimes is better than simulation of current hydrological regimes. A range of GCM scenarios should be included in the design of impact studies in order to incorporate a range of climate sensitivities to greenhouse gas forcing, and it is very important to consider GCM regional climate change projections as examples of possible future climates, rather than actual predictions.

Because GCM simulation of current climates is often inaccurate, direct projections of GCM-generated


future climates are seldom used. Changes in climate variables in the perturbed simulations relative to the control run are often applied to historical observed weather data to create the climate change scenarios used in impact studies. Absolute model biases are omitted by using the relative changes. Thirty years of current climate data are often used to develop the baseline climate scenario to which the GCM changes are applied. A 30-year period is considered long enough to represent ‘normal’ climate variability. Recent periods, e.g., 1951–80 or 1961–90, are often selected, representing current climate and having accurate data most easily available. The latter period contains some of the warmest years on record, which may have been caused by the enhanced greenhouse effect.

The use of GCM transient (i.e., time dependent) scenarios in climate change impact studies is growing, since they provide a much more realistic picture of the projected warming from current conditions to some point in the future.

4. Integrated Global Change Scenarios

Climate is not the only factor that will be changing as the twenty-first century unfolds. Population growth and changing economic and technological conditions are likely to affect world society and the environment even more than changes in climate. It is important to take such changes into account in climate change impact analyses: first, because climate change will occur not in the present but in the future, and second, because such changes may affect the sensitivity of a system or sector to climate. However, predicting population growth rates and future economic conditions is as uncertain as, if not more uncertain than, predicting the future climate. Therefore, future scenarios need to be designed carefully to address a range of possible conditions. One approach is to contrast ‘optimistic’ and ‘pessimistic’ views of the future. In the optimistic scenario, population growth rates are low, economic growth rates and incomes rise, environmental pollution decreases, and land degradation abates. In more pessimistic scenarios, population growth rates are high, economic growth rates and incomes are low, environmental pollution increases, and land degradation accelerates. A scenario of no change (i.e., present conditions) should also be included. The differential effects of climate change on current conditions, and on these two alternative scenarios of the future, may then be evaluated.

In order to place possible changes in climate in the context of potential socioeconomic changes, estimates of population, economic growth, and technological change are needed. These estimates will also affect future rates of CO2 and other greenhouse gas emissions. Economic projections beyond the next 10 to 20 years generally are unreliable. Furthermore, socioeconomic factors are not unrelated, since changes in population are likely to affect national and per-capita income. Recent IPCC scenarios include estimates of population and economic growth rates for a set of possible futures (IPCC 2000).

Socioeconomic factors that are often considered in future scenarios include population, income, productivity, and technology levels. Environmental factors may include stratospheric and tropospheric ozone levels and changes in land use. Institutions and legal structures may change as well, but these evolutions are very hard to predict. The World Bank (1994) and the United Nations (1999) have published population estimates by country through 2100 for a range of scenarios. The World Bank (1993) has published estimates of changes in income. Various economic models are used to project such productivity factors as gross domestic product (GDP) into the future. Population and economic growth may bring increases in urbanization, expansion of agriculture and mining of natural resources, and accelerating rates of deforestation, habitat fragmentation, desertification, and water and air pollution (FAO 1993, Dregne and Chou 1992).

4.1 CO2 and Greenhouse Gas Emission Scenarios

CO2 and greenhouse gas emission scenarios are needed, especially for agriculture because of the need to estimate crop responses to the CO2 fertilization effect, as well as projections of sea-level rise. Climate modelers need estimates of future levels of atmospheric CO2 and other trace gases in order to prepare transient scenarios of future climate. Crop and forest modelers also need such estimates in order to take fertilization effects into account in their impact analyses. Global emissions of CO2, the most important greenhouse gas, depend primarily on fossil fuel use in three major sectors: electrical generation, industry, and transportation. A growing world economy consists of growth in industrial production, consumption of goods, and travel, and concomitant increases in energy use. Deforestation also contributes to CO2 emissions and is linked to economic growth as land is converted from natural ecosystems to agriculture and other uses.

The IPCC (2000) and others estimate growth rates of world carbon emissions from fossil fuels and deforestation in order to calculate atmospheric CO2 levels for climate projections. Such calculations are also important in international negotiations that consider limiting CO2 emissions. Since only a portion (about one-half) of the carbon added to the atmosphere remains, carbon cycle models are used to translate carbon emissions into atmospheric levels of CO2. Models that include the effects of CO2 fertilization, feedback from stratospheric ozone depletion, and the radiative effects of sulfate aerosols have been

2006
Climate Impacts

combined to project radiative forcing of climate, changes in global-mean temperature, and sea level (Wigley and Raper 1992). Recent projections have tended to reduce the projected rates of warming and sea-level rise, but they are still four to five times the rates observed over the twentieth century.

5. Modeling Techniques: A Case Study of the Agricultural Sector

Modeling techniques of several kinds are used to study potential impacts and responses of agriculture to changing climate and atmospheric composition. The agricultural sector is chosen to illustrate the range of modeling techniques, because agriculture is a key socioeconomic sector for development in many regions, agricultural land use is a primary driver of land-use change, and agriculture is a sector vulnerable to global environmental change. Choice of technique depends on the sphere of analysis considered and the research questions posed.

5.1 Biophysical Modeling

5.1.1 Crop suitability. Spatial analysis consists of identification of critical environmental limits (primarily climate, soil and water resources) of specific crops or agricultural systems, applications of climate change scenarios, and calculation of resulting spatial shifts in crop or agricultural regions. This agroclimatic method provides an approximation of possible changes in crop areas from a biological perspective, but does not address potential changes in either yield or production.

5.1.2 Potential production. Potential production may be estimated from climatic variables or indices such as length of growing season, precipitation, evapotranspiration, solar radiation, and temperature. A prime example of this technique is found in the Agro-Ecological Zone Project of the FAO (FAO 1978). The FAO Agro-Ecological Zone modeling technique simulates both crop zonation and potential production (Leemans and Solomon 1993, Cramer and Solomon 1993, Fischer et al. 2001).

5.1.3 Statistical regression models. Multiple regression models have been developed from the statistical relationships between historical crop yields and climatic variables in specific locations (e.g., Waggoner 1983). The use of regression models is limited by their lack of explanatory power, since the techniques rely on statistical coefficients rather than on descriptions of the underlying biophysical relationships.

5.1.4 Dynamic crop models. Dynamic crop growth models formulate the principal physiological, morphological, and physical processes involving the transfers of energy and mass within the crop and between the crop and its environment. Such models have been developed for most of the major crops, with the aim of predicting their responses to specified climatic, edaphic, and management factors governing production. Dynamic models capable of simulating the response of crops to climatic variables may be used in conjunction with GCM climate change scenarios to explore the consequences of increased atmospheric CO2 and climate change on yields and phenology or to determine thresholds of crop growth sensitivity to changing climate variables. They are also useful for testing possible adaptations to climate change, such as altered planting dates, irrigation scheduling, or crop variety (see Rosenzweig and Iglesias (1994) for applications of crop models to climate change impact evaluation).

5.2 Economic Techniques

Economic measures are an important component of the information that policymakers need to evaluate the climate change issue. Economic analyses are concerned with the reciprocal relations between physical and biological changes on the one hand and the economic responses of individuals and institutions on the other. Once crop yield impacts are estimated, it is useful to translate such biophysical responses into economic measures of human welfare. While biophysical analyses focus primarily on the production of agricultural crops, economic analyses consider both producers and consumers of agricultural goods. Economic measures of interest include the responses of input and output market prices to yield changes and the responses in terms of inputs and outputs that affected individuals make to minimize losses or maximize gains, based on the changes in production and consumption opportunities and in price. If climate change causes substantial changes in outputs, price and quality changes can result, which, in turn, can lead to further market-induced output changes. Even if prices remain constant, accurate indications of output changes are needed if production practices and types of outputs may change.

Previous work on the economics of environmental stresses on agriculture has resulted in a number of general findings (Rosenzweig and Hillel 1998). Important points are that both producers and consumers are included in the domain, that economic activities constitute a type of societal adaptation to environmental stresses, leading for the most part to mitigation of negative effects, and that environmental stresses have differential effects on the comparative advantage of regions and countries.

Economic models calculate estimates of the potential impacts of climate change on measurable


economic quantities, including production, consumption, income, gross domestic product (GDP), and employment. It is important to remember, however, that these may be only partial indicators of social welfare. Different social systems, households, and individuals may not be represented in models that are based on producer and consumer theory. Furthermore, many of the economic models do not account for climate-change-induced alterations in land availability and water for irrigation; these nonmarket aspects of a changing climate may be critical.

As a starting point, the gathering of available information about production, consumption, and policies provides a framework for determining the existence and possible magnitude of economic vulnerability in the agricultural sector (US Country Studies Program 1994). Microeconomic farm-level models are designed to simulate the decision-making process of a representative farmer in regard to methods of production and allocation of capital, labor, and land and infrastructure. Such models are based on the goal of maximizing economic returns to inputs. Some farm-level models include a range of farmer behavior in regard to risk, for example risk-averse or risk-neutral.

Macroeconomic equilibrium models of the agricultural sector include price-responsive behavior for both consumers and producers. Equations for these relationships are developed based on economic principles that consumers will maximize the utility of their food-buying and that producers (farmers) will minimize their costs of production. Such models usually are calibrated for a given reference year; for climate change purposes, the models solve for the reference year given perturbations in crop production and water supply and demand for irrigation derived from biophysical techniques (see e.g., Adams et al. 1990). Population growth and improvements in technology are set exogenously (i.e., not computed dynamically in the model). Model results include equilibrium prices and quantities.

General equilibrium economic models are useful because they measure the potential magnitude of climate change impacts on the economic welfare of both producers and consumers of agricultural goods. They do not, however, provide a detailed picture of how the economy will respond over time. These models may overestimate the adjustment of the agricultural economy to climate change. Results of changes in production and prices from agricultural sectoral models can then be used in general equilibrium models of the larger economy.

Regression models have been developed that test for statistical relationships between climate variables and economic indicators such as farm values. Some recent studies utilize these methods known as the ‘Ricardian’ approach (e.g., Mendelsohn et al. 1994, Polsky and Easterling 2001). The behavior of consumers is not included in this approach and world food prices, domestic farm output prices, and thus farm revenues that are dependent on changes in agricultural production inside and outside of the US are assumed to be held constant.

6. Integrating across Sectors

Integrated studies link the biophysical and economic realms, and ideally may extend to interactions both within and across sectors such as agriculture and its competing demands for water by irrigators or urban users, or shifting patterns of land use between agricultural and forest (or other natural) ecosystems. This is a more realistic, but more complicated approach, because individual biophysical and socioeconomic sectors will not be affected by climate change in isolation. For example, agricultural responses will be sensitive not only to changes in crop yields, but also to alterations in water supplies, demand for water from other sectors, and to the inundation and salinization of arable land by rising seas. The following are some examples of integration in agricultural impact studies:

(a) Parry et al. (1988) report on integrated agricultural sector studies in high-latitude regions in Canada, Iceland, Finland, USSR, and Japan, that involved teams of meteorologists, agronomists, and economists. The general conclusions of the studies were that warmer temperatures may aid crop production by lengthening the growing season at high latitudes, but that potential for higher evapotranspiration and drought conditions may counteract the positive effects and may even be detrimental to productivity.

(b) Adams et al. (1990) conducted an integrated study for the US, linking models from atmospheric science, plant science, and agricultural economics. While the outcomes for US agriculture in the study depended on the severity of climate change and the compensating effects of carbon dioxide on crop yields, the simulations suggest that irrigated acreage will expand and that regional patterns of US agriculture will shift with predicted global warming. With the more severe climate change scenario tested, the movement of US production into export markets was reduced substantially.

(c) The Missouri, Iowa, Nebraska, and Kansas (MINK) study integrated potential biophysical and economic effects of climate change on agriculture and other sectors (Rosenberg 1993). The study incorporated the physiological effects of CO2 and adaptation by farmers to the climatic conditions of the 1930s. Even with the relatively mild warming (1.1 °C) of the 1930s and with farmer adaptation and CO2 effects taken into account, regional production declined by 3.3 percent. Given the estimate of 2.5 °C warming for doubled CO2 conditions, the results of the MINK study imply agricultural losses of about 10 percent (Cline 1992).


(d) Strzepek et al. (1996) linked climate change impacts in Egypt on agriculture, water, and the coastal zone in an economic model. This integrated study demonstrates that the sectors directly affected by climate change need to be analyzed in concert with the other sectors of the economy in sufficient detail so that feedback can be part of the analysis. Egypt was found to be highly vulnerable to the warming as well as to the changes in precipitation and river runoff that are forecast to accompany greenhouse-gas-induced climate change.

In its fullest sense, integrated assessment attempts to close the loop by linking the greenhouse gas emissions caused by human activities, the climatic consequences of the emissions, the impacts of the climate changes on important systems, and the feedback of the impacts to the generation of greenhouse gas emissions. Modeling frameworks have been devised to integrate the causes, impacts, feedbacks, and policy implications of global climate change (Nordhaus 1992, Manne et al. 1993, Hulme and Raper 1993, Alcamo et al. 1993, Edmonds et al. 1993). An example of a feedback in such models is the pathway leading from energy consumption, to greenhouse gas emissions, to climate warming, to changes in demand for energy (e.g., decreases in demand for energy for heating and increases in demand for energy for air conditioning), and thus back to changes in energy consumption. The models may be used, for example, to explore the effects of policies limiting greenhouse gas emissions, the ensuing reduction in global warming, and the alteration of potential climate change impacts, for example, on agriculture.

7. Thresholds, Risk, and Surprises

The identification of thresholds in climate change impacts research involves the analysis of the effects of different levels of climate forcing on a system or activity and the identification of possible discontinuities in response. The determination of critical levels of climate change for any given system may be separated into biophysical and socioeconomic realms. In the biophysical realm, although the thermal regimes and responses of managed and unmanaged ecosystems and water resource availability are complex, critical temperatures (minimum, optimum, and maximum) have been defined for many individual processes. In the socioeconomic realm, defining critical levels of warming is more challenging, due, at least in part, to the interplay of supply, demand and prices, and to the adaptability of the system. Here, determining critical levels of warming involves defining relative impacts on actors from diverse geographic and social groups.

For example, global effects of climate change in agriculture measured with current economic valuation techniques generally are predicted to be small to moderate. This occurs because the economic system is, in general, effective in fostering adaptation to the projected biophysical changes. However, the global perspective masks differences in levels of effects, regionally and socially. Studies done to date concur that there will be significant change in global agricultural patterns. All regions are likely to be affected, but large differences occur among regions. While changes in global production with climate change may be small, potential remains for regional vulnerability to food deficits due to distributional problems of getting food to specific regions and groups of people. For subsistence farmers and people lacking entitlement to food, lower yields may result not only in measurable economic losses, but possibly malnutrition and starvation. Several studies have addressed vulnerability to food deficits explicitly and found potential increases (e.g., Rosenzweig and Parry 1994, Fischer et al. 2001).

Risk can be evaluated when the probability of occurrence of an event is known, but in impact evaluation, the probabilities associated with a particular scenario are generally not known. Therefore, the inclusion of uncertainty (i.e., when the event is known but the probabilities that it will occur are not known) in climate change impact methods is very important, and recent studies are now beginning to include explicit methods to deal with it. Earlier studies have often used ‘best estimate’ scenarios that represent the mid-point of predictions. The inclusion of a range of scenarios representing upper and lower bounds of the predicted effects is more realistic and allows for the propagation of uncertainty throughout a model system. Further, probability distributions of different events may be defined, with contrasts between low-probability catastrophic events (surprises) and higher-probability gradual changes in climate trends.

One ‘surprise’ (i.e., when reality departs qualitatively from expectations) may lead to another in a cascade, since subsystems are connected. Complex systems and chaos theory provide conceptual and analytical tools for anticipating and preparing for surprises. Identification of potential surprises and communication of them to the public and policymakers should allow improvements in environmental and societal resilience to surprise. Surprises related to global climate change may be either scientific or societal in nature. The anticipation of surprises in the science of global climate change may be encouraged by efforts to integrate across disciplines, to support a multiplicity of research approaches, and to focus on outlier outcomes and unconventional views. Beyond the anticipation of scientific surprise, it seems worthwhile to increase the resilience and adaptability of social structures, so that the sensitivity to impacts of unexpected or uncertain perturbations is decreased. Such societal preparedness might include the diversification of economic, productive, and technological systems; the establishment of disaster, coping, and entitlement systems; and the creation of adaptive management systems capable of learning from surprises.

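The idea in Section 7 of carrying a range of scenarios (lower bound, best estimate, upper bound) through a chain of models, rather than a single ‘best estimate’, can be illustrated with a minimal Python sketch. Everything numerical here is a hypothetical assumption made for illustration: the warming values, the linear yield response (`sensitivity_pct_per_c`), and the price response (`price_flexibility`) are not taken from any of the studies cited in this article, and the function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One climate scenario; the warming values used below are illustrative assumptions."""
    name: str
    warming_c: float  # assumed change in mean temperature, degrees C


def yield_change_pct(warming_c: float, sensitivity_pct_per_c: float = -3.0) -> float:
    """Hypothetical linear biophysical step: percent yield change per degree C of warming."""
    return sensitivity_pct_per_c * warming_c


def price_change_pct(yield_pct: float, price_flexibility: float = -0.5) -> float:
    """Hypothetical economic step: prices move in the opposite direction to output."""
    return price_flexibility * yield_pct


def propagate(scenarios):
    """Run every scenario through the biophysical step and then the economic step."""
    results = {}
    for s in scenarios:
        y = yield_change_pct(s.warming_c)
        results[s.name] = (y, price_change_pct(y))
    return results


# Lower bound, best estimate, and upper bound (all assumed values).
scenarios = [
    Scenario("lower_bound", 1.0),
    Scenario("best_estimate", 2.5),
    Scenario("upper_bound", 4.5),
]
results = propagate(scenarios)

for name, (y, p) in results.items():
    print(f"{name}: yield {y:+.1f}%, price {p:+.2f}%")

# Width of the yield-impact range that a single best-estimate run would hide.
yield_spread = results["lower_bound"][0] - results["upper_bound"][0]
```

Reporting only the `best_estimate` row would conceal the spread between the two bounds. A fuller treatment might replace the three fixed scenarios with samples drawn from probability distributions, including low-probability extreme cases, but that goes beyond this sketch.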

See also: Climate Change and Health; Climate Change, Economics of; Climate Policy: International; Environmental Adaptation and Adjustments; Environmental Challenges in Organizations; Environmental Change and State Response; Environmental Policy: Protection and Regulation; Environmental Risk and Hazards; Environmental Sciences; Environmental Surprise; Environmentalism: Preservation and Conservation; Global Environmental Change: Human Dimensions; Integrative Assessment in Environmental Studies; International and Transboundary Accords, Environmental; Land Use and Cover Change

Bibliography

Adams R M, Rosenzweig C, Pearl R M, Ritchie J T, McCarl B A, Glyer J D, Curry R B, Jones J W, Boote K J, Allen L H Jr 1990 Global climate change and US agriculture. Nature 345: 219–24
Alcamo J, Kreileman G J J, Krol M, Zuidema G 1993 Modelling the global society-biosphere-climate system: Part 1: Model description and testing. Water, Air and Soil Pollution 76: 1–35
Cline W R 1992 The Economics of Global Warming. Institute for International Economics, Washington, DC, p. 399
Cramer W P, Solomon A M 1993 Climatic classification and future global redistribution of agricultural land. Climate Research 3: 97–110
Dregne H E, Chou N-T 1992 Global desertification, dimension, and costs. In: Dregne H E (ed.) Degradation and Restoration of Arid Lands. Texas Tech University, Lubbock, TX
Edmonds J A, Pitcher H M, Rosenberg N J, Wigley T M L 1993 Design for the Global Change Assessment Model GCAM. Paper presented at the International Workshop on Integrated Assessment of Mitigation, Impacts and Adaptation to Climate Change, 13–15 October 1993. International Institute for Applied Systems Analysis, Laxenburg, Austria, p. 7
Fischer G, Shah M, van Velthuizen H, Nachtergaele F O 2001 Executive Summary Report: Global Agro-ecological Assessment for Agriculture in the 21st Century. IIASA, Laxenburg, Austria
Food and Agriculture Organization of the United Nations 1978 Report on the Agro-Ecological Zones Project. Vol. 1. Methodology and Results for Africa. FAO, Rome, p. 158
Food and Agriculture Organization of the United Nations 1993 Agriculture: Towards 2010. United Nations, Rome
Glantz M H (ed.) 1988 Societal Responses to Regional Climatic Change. Westview Press, Boulder, CO, p. 428
Hulme M, Raper S 1993 An integrated framework to address climate change (ESCAPE) and further developments of the global and regional climate modules (MAGICC). Paper presented at the International Workshop on Integrated Assessment of Mitigation, Impacts and Adaptation to Climate Change, 13–15 October 1993. International Institute for Applied Systems Analysis, Laxenburg, Austria, p. 14
IPCC 1995 IPCC Guidelines for Assessing Impacts and Adaptations under Changing Climate. Carter T R, Parry M L, Harasawa H, Nishioka S (eds.), University College London and Center for Global Environmental Research, London and Tsukuba, p. 59
IPCC 2001 Special Report on Emissions Scenarios. Nakicenovic N, Swart R (eds.) Cambridge University Press, Cambridge, UK
Leemans R, Solomon A M 1993 Modeling the potential change in yield and distribution of the earth’s crops under a warmed climate. Climate Research 3: 79–96
Manne A, Mendelsohn R, Richels R 1993 MERGE: A Model for Evaluating Regional and Global Effects of GHG Reduction Policies. Paper presented at the International Workshop on Integrated Assessment of Mitigation, Impacts and Adaptation to Climate Change, 13–15 October 1993. International Institute for Applied Systems Analysis, Laxenburg, Austria, p. 14
Mendelsohn R, Nordhaus W D, Shaw D 1994 The impact of global warming on agriculture: A Ricardian analysis. The American Economic Review 84(4): 753–71
Nordhaus W D 1992 The DICE model: Background and structure of a dynamic integrated climate economy model of the economics of global warming. Cowles Foundation Discussion Paper No. 1009. New Haven, CT
Parry M L, Carter T R, Konijn N T (eds.) 1988 The Impact of Climatic Variations on Agriculture. Volume 1: Assessments in Cool Temperate and Cold Regions. Kluwer, Dordrecht, The Netherlands, p. 876
Polsky C, Easterling W E III 2001 Adaptation to climate variability and change in the US Great Plains: A multi-scale analysis of Ricardian climate sensitivities. Agriculture, Ecosystems and Environment 85: 133–44
Rosenberg N J (ed.) 1993 Towards an Integrated Impact Assessment of Climate Change: The MINK Study. Kluwer, Dordrecht, The Netherlands
Rosenzweig C, Hillel D 1998 Climate Change and the Global Harvest: Potential Impacts of the Greenhouse Effect on Agriculture. Oxford University Press, New York, p. 324
Rosenzweig C, Iglesias A (eds.) 1994 Implications of Climate Change for International Agriculture: Crop Modeling Study. US Environmental Protection Agency, Washington, DC
Rosenzweig C, Parry M L 1994 Potential impact of climate change on world food supply. Nature 367: 133–8
Strzepek K M, Onyeji S L, Saleh M, Yates D N 1996 An assessment of integrated climate change impacts on Egypt. In: Strzepek K M, Smith J B (eds.) As Climate Changes: International Impacts and Implications. Cambridge University Press, Cambridge, UK, pp. 180–200
United Nations 1999 Population Division of the Department of Economic and Social Affairs of the United Nations Secretariat (1999). Long-range World Population Projections: Based on the 1998 Revision (ESA/P/WP.153)
US Country Studies Program 1994 Guidance for Vulnerability and Adaptation Assessments. Washington, DC
Waggoner P E 1983 Agriculture and a climate changed by more carbon dioxide. In: Changing Climate. National Academy of Sciences Press, Washington, DC, pp. 383–418
Warrick R A 1984 The possible impacts on wheat production of a recurrence of the 1930s drought in the U.S. Great Plains. Climatic Change 6: 5–26
Wigley T M L, Raper S C B 1992 Implications for climate and sea level of revised IPCC emissions scenarios. Nature 357: 293–300
World Bank 1993 Income Projections. Washington, DC
World Bank 1994 World Population Projections 1994–95: Estimates and Projections with Related Demographic Statistics. Eduard Bos, My T. Vu, Ernest Massiah, and Rodolfo A. Bulatao. 532 pages. Published for the World Bank by The Johns Hopkins University Press, Baltimore, MD

C. Rosenzweig

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7


Climate Policy: International

As early as the closing years of the nineteenth century natural scientists postulated that human activities, particularly burning fossil fuels, could lead to climate change, but it was only late in the twentieth century that the international community began to develop public policies to deal with climate change. The lag resulted from the necessity of developing scientific understanding and public awareness and the immense difficulty of crafting and implementing policies that would be effective. The climate is a classic common good. Policies to deal with it must overcome all of the obstacles to providing common goods.

1. History

In 1896 Svante Arrhenius, a Swedish chemist and Nobel laureate, first put forth the theory that rising concentrations of carbon dioxide would lead to global warming. According to his calculations, a doubling of carbon dioxide (CO2) in the atmosphere would result in an increase of 5° Celsius in global average surface temperatures (Arrhenius 1896). Arrhenius’s forecast was based on the concept of a greenhouse effect, which was first advanced by Jean-Baptiste Fourier in 1827. Fourier argued that the sun’s rays would enter the atmosphere, but not all of them would escape, thus warming the atmosphere. John Tyndall developed this idea by suggesting that particular atmospheric gases (water vapor and carbon dioxide) were responsible for the greenhouse effect. Arrhenius’s contribution was to suggest that the strength of the greenhouse effect would increase. His forecast, however, was largely ignored until the second half of the twentieth century. The dominant view was that the climate was essentially constant except for short-term fluctuations.

Starting in the 1950s, scientists, nongovernmental organizations (NGOs), and eventually governments began to pay greater attention to the possibility of climate change. Scientific advances were important. Roger Revelle and Hans Suess of the Scripps Institution of Oceanography developed doubts about the ability of the oceans to absorb the amount of carbon dioxide that was being emitted (Revelle and Suess 1957). The belief that the oceans could absorb all of the carbon dioxide was a principal factor shaping the view that the climate was constant. The increasing availability of data was also important. In 1957 a station to measure atmospheric carbon dioxide was established at the Mauna Loa observatory in Hawaii. By the 1960s this station was reporting a steady increase in carbon dioxide concentrations in the atmosphere. In 1963 the US-based Conservation Foundation sponsored a meeting on climate issues. The report forecast that a doubling of carbon dioxide in the atmosphere would produce a 3.8 °C temperature increase. In 1964–5, Roger Revelle chaired the Environmental Pollution Panel of the US President’s Science Advisory Committee. The panel concluded that projected increases in carbon dioxide concentrations in the atmosphere could produce changes in the climate (USA/PSAC 1965).

Deepening concern with climate change stimulated research. Fourier, Tyndall, and Arrhenius had provided the broad framework for this research. Models could be built using the framework. Testing the models would require collecting and assembling data. Lewis Fry Richardson, a British meteorologist, developed the first numerical weather prediction system in the early years of the twentieth century (Richardson 1922). Although at the time the calculations required were too complex for his system to have any practical use, when computers were developed his system could be applied. It provided the basis for the development of General Circulation Models (GCMs), which simulate the global circulation of the atmosphere. GCMs are basic instruments for weather and climate prediction. To study the climate, GCMs must be coupled with other models, particularly of ocean circulation.

The Mauna Loa observatory continued to provide data documenting an increase of CO2 concentrations in the atmosphere. The World Meteorological Organization (WMO) established the World Weather Watch (WWW) in 1968. WWW promotes standardized observations and the exchange of data. In 1967 WMO and the International Council of Scientific Unions (ICSU) launched the Global Atmospheric Research Program (GARP). GARP was an international research program that focused on the transient behavior of the atmosphere and the factors that determine statistical properties of the general circulation of the atmosphere. GARP culminated in a 12-month global experiment that began on 1 December 1978, in which the earth’s atmosphere and weather were observed and measured.

As scientific knowledge of the climate system increased, scientists issued firmer warnings about the possibility and dangers of climate change. In 1977 a US National Academy of Sciences panel that Roger Revelle chaired concluded that if the use of fossil fuels continued to increase at present rates, average global surface temperature could rise by about 6 °C over the coming 200 years, with potentially ominous consequences for agriculture and fishing (USA/NAS 1977). Public and governmental concern mounted during the 1970s. The possibility of climate change and the effects of this change were discussed at a series of UN-sponsored large-scale conferences on the human environment (1972), food (1974), water (1977), and desertification (1977).

WMO initiated a program on the climate in 1974 and, together with the United Nations Environment Program (UNEP) and the International Council of Scientific Unions, convened the First World Climate Conference in Geneva, Switzerland in February 1979.


The First World Climate Conference brought together experts who focused on the scientific aspects of climate change. Later in 1979, on the basis of the conference’s recommendation, WMO established the World Climate Programme (WCP). The WCP consists of the World Climate Research Program (which succeeded GARP), the World Climate Data and Monitoring Program, the World Climate Applications and Services Program, and the World Climate Impact Assessment and Response Strategies Program. The WCP fosters international research, coordinates data and monitoring activity, and facilitates access to information.

Stimulated by a sequence of abnormally hot years, concern about climate change grew stronger during the 1980s. Pressure grew for the adoption of international policies to mitigate climate change. WMO and UNEP sponsored a workshop of climate scientists on ‘Developing Policies for Responding to Climatic Change’ in Villach, Austria and Bellagio, Italy in the fall of 1987. In their report on the workshop the scientists concluded that as a consequence of the emission of greenhouse gases (GHG) average global surface temperatures would increase at a rate of 0.3 °C per decade. There would also be changes in precipitation and soil moisture and sea level rise. They called for a treaty to reduce GHG emissions (WMO 1987). In June 1988, the government of Canada organized an expert conference in Toronto on ‘The Changing Atmosphere: Implications for Global Security.’ The conference called for a reduction in deforestation and a 20 percent cut back in CO₂ emissions from 1988 levels by 2005 with the eventual aim of a 50 percent cut back (WMO et al. 1988). Representatives in many national legislative bodies and other governmental officials called for action.

The Toronto conference brought out sharp differences between the USA and many other countries. These differences have surfaced regularly. Because the USA had the largest greenhouse gas emissions, and because of its place in the global economy and the strength of its scientific establishment, these differences have had a profound impact on the negotiations about the climate change regime.

In 1978 the National Climate Program Act (PL 95-367) established the United States National Climate Program Office in the National Oceanic and Atmospheric Administration, which coordinated research efforts on climate change throughout the federal government and US involvement in international programs. US national research efforts were substantial and the USA played an important role in international research efforts. The USA, however, was concerned about the costs of cutting GHG emissions (Brenton 1994, pp. 167–9, Rowlands 1995, pp. 74–6) and sought to organize a mechanism that would have broad legitimacy for obtaining objective scientific advice. The USA was an important leader in the decision in 1988 by the World Meteorological Organization and the United Nations Environment Program to create the Intergovernmental Panel on Climate Change (IPCC).

IPCC is open to all members of WMO and UNEP. Its mandate is to assess scientific, technical, and socioeconomic information relevant to understanding the risk of human-induced climate change. It has three working groups and a task force. Working Group I assesses the scientific aspects of the climate system and climate change. Working Group II addresses the vulnerability of socioeconomic systems to climate change and options for adapting to it. Working Group III assesses options for mitigating climate change. The Task Force on National Greenhouse Gas Inventories oversees the National Greenhouse Gas Inventories Program. The IPCC does not carry out research. It bases its assessments mainly on published and peer-reviewed scientific literature. Governments appoint members of IPCC.

The Intergovernmental Panel on Climate Change issued its first scientific assessment in 1990 (Houghton et al. 1990). The Second World Climate Conference was held later that year. The IPCC report and the Second World Climate Conference laid the scientific foundation for negotiations toward a treaty on climate change. The IPCC stated in its report that it was confident of the existence of a natural greenhouse effect and that emissions resulting from human activities were substantially increasing the atmospheric concentrations of greenhouse gases. It specifically mentioned carbon dioxide, methane, chlorofluorocarbons, and nitrous oxide. Carbon dioxide, chlorofluorocarbons, and nitrous oxide are removed from the atmosphere only slowly. Their sources and sinks in the atmosphere, biosphere, and oceans determine their atmospheric lifetimes, which can last from decades to centuries. The IPCC concluded that global mean surface air temperature had increased by 0.3 °C to 0.6 °C over the past 100 years and the size of this increase was consistent with the predictions of climate models.

Greenhouse gas emissions are essentially a product of the number of people times the level of development. The energy intensity of development and the carbon intensity of energy use modify the effects of the level of development. Under IPCC’s business-as-usual scenario, greenhouse gas emissions would increase following historically based trajectories. The IPCC predicted that the consequence of the increased concentration of long-lived gases in the atmosphere would be an increase in the global mean surface temperature of about 0.3 °C per decade. This would result in an increase of about 1 °C by 2025.

The IPCC report and the Second World Climate Conference, where the report was discussed and its conclusions reaffirmed, provoked concern and also made it obvious how complicated dealing with climate change would be. Governments responded with what in the history of international diplomacy must be
regarded as alacrity. The first session of the Intergovernmental Negotiating Committee for a Framework Convention on Climate Change was held in February 1991.

The goal was to have a treaty ready for signature at the United Nations Conference on Environment and Development, which was held in Rio de Janeiro in June 1992. The Rio Conference was an immense gathering (Brenton 1994, pp. 223–35). Representatives of 178 governments attended including 117 heads of state or government. More than 1,400 NGOs were represented and there were more than 35,000 accredited representatives. The deadline set by the date of the Rio Conference and the publicity associated with the conference put considerable pressure on governments.

Despite this pressure, the USA insisted that it would not sign a treaty that contained binding emissions limitations. Many NGOs and several governments wanted the treaty to include emission limitations. The US administration led by President George H. W. Bush did not believe that at that time the USA could fulfill a commitment to limit emissions, and it doubted the capacity of other countries to do so. Eventually the US position was accepted (Soroos 1997, pp. 191–200), and in June 1992, 154 states and the European Union (EU; in legal terms the European Economic Community) signed the United Nations Framework Convention on Climate Change (UNFCCC). The treaty came into effect in March 1994 after 50 states had become parties. As of mid-2000, there were 184 parties to the convention.

2. The United Nations Framework Convention on Climate Change

The UNFCCC is one of the most far-reaching treaties ever negotiated. Since greenhouse gas emissions are the result of so many aspects of modern life, becoming a party to the treaty commits states to subjecting almost all aspects of their economic activities to some form of international scrutiny and supervision. The legal instruments that will be negotiated under the UNFCCC will have profound impacts on national economies.

The objective of the treaty is ‘… stabilization of greenhouse gas concentrations in the atmosphere at a level that would prevent dangerous anthropogenic interference with the climate system’ (Article 2). Greenhouse gases are defined as ‘gaseous constituents of the atmosphere, both natural and anthropogenic, that absorb and re-emit infrared radiation.’ In its first scientific assessment IPCC concluded that during the decade from 1980 to 1990 carbon dioxide had contributed 55 percent of the change in radiative forcing attributable to greenhouse gases, chlorofluorocarbons 24 percent, methane 15 percent, and nitrous oxide 6 percent. Carbon dioxide emissions result primarily from fossil fuel combustion and land use and land use changes. Agriculture contributes a large share of methane and nitrous oxide emissions.

The production and use of chlorofluorocarbons were controlled under the 1985 Vienna Convention for the Protection of the Ozone Layer and the 1987 Montreal Protocol on Substances that Deplete the Ozone Layer and subsequent amendments. These accords phase out the production and use of chlorofluorocarbons.

The first principle of the UNFCCC is that ‘The Parties should protect the climate system for the benefit of present and future generations of humankind, on the basis of equity and in accordance with their common but differentiated responsibilities and respective capabilities’ (Article 3). Other principles include exhorting the parties to give special consideration for developing countries, take ‘precautionary measures,’ promote sustainable development, and promote a ‘supportive and open international economic system.’

All parties to the treaty undertake commitments (Article 4) to meet the treaty’s objective. Though these commitments are stated in general terms in the framework convention, they are to be made more specific through additional legal instruments negotiated later. The UNFCCC defines various categories of countries and establishes differential responsibilities for them. There are three categories of parties to the treaty: developed countries, developed countries with special financial responsibilities, and developing countries.

Developed countries, or Annex I countries, are exhorted to take ‘immediate action’ to limit greenhouse gas emissions. Annex I includes 38 states, of which 13 were Eastern European states in transition to democracy and market economies, and the European Union (in formal legal terms the European Community). Article 4, paragraph 2, explicitly requires Annex I countries ‘to adopt national policies and take corresponding measures on the mitigation of climate change’ by limiting their anthropogenic emissions of greenhouse gases. The Article further requires them to report on the steps that they take with the aim of ‘returning individually or jointly to their 1990 levels these anthropogenic emissions of carbon dioxide and other greenhouse gases not controlled by the Montreal Protocol.’

The countries that are listed in Annex II of the UNFCCC are required ‘to provide new and additional financial resources to meet the full agreed costs incurred by developing country Parties in complying with their obligations’ to produce national inventories of their ‘emissions by sources and removals by sinks of all greenhouse gases not controlled by the Montreal Protocol.’ They also have responsibilities for providing financial assistance to the developing countries for other agreed tasks (Article 4, paragraph 3). Annex
II includes Annex I countries except those in transition to democracy and market economies.

The requirements for developing countries are more modest. They are specifically required to submit their inventories to the UNFCCC secretariat. Beyond that they are exhorted to adopt policies and take measures to mitigate climate change and adapt to it.

3. The Kyoto Protocol

The process of negotiating legal instruments under the UNFCCC was launched at the first Conference of the Parties (COP-1) in Berlin in 1995. The IPCC published its second assessment report in 1996 (Houghton et al. 1996, Watson et al. 1996, Bruce et al. 1996). The report confirmed that climate change was occurring because of human actions and analyzed the impacts of climate change and the measures that could be taken to adapt to and mitigate climate change. The report provided a strong stimulus for further action.

An initial agreement to establish legally binding quantified emissions limitations and reduction commitments was reached at COP-3 in Kyoto, Japan on 10 December 1997. Article 3, paragraph 1 of the Kyoto Protocol would require Annex I countries collectively to reduce their overall greenhouse gas emissions ‘by at least five percent below their 1990 levels in the commitment period 2008 to 2012.’ Specifically, the Kyoto Protocol required the USA to reduce its greenhouse gas emissions to 93 percent of the 1990 levels, Japan to 94 percent, and the European Economic Community to 92 percent. Norway, on the other hand, was allowed to increase its emissions to 101 percent of the 1990 level, Australia to 108 percent, and Iceland to 110 percent. The European Community’s obligation applied to the Community as a whole. Under arrangements made within the EU some countries, especially Germany and the UK, which had the largest emissions, agreed to reduce their emissions more than 8 percent below the 1990 levels, which would allow others, such as Greece and Portugal, to increase their emissions above the 1990 levels. Greece would be allowed to increase its emissions by 25 percent and Portugal by 27 percent.

The Kyoto Protocol covered six greenhouse gases that were not covered by the Montreal Protocol: carbon dioxide, methane, nitrous oxide, hydrofluorocarbons, perfluorocarbons, and sulphur hexafluoride. The limitations apply to ‘net changes in greenhouse gas emissions by sources and removals by sinks resulting from direct human-induced land-use changes and forestry activities, limited to afforestation, reforestation and deforestation since 1990, measured as verifiable changes in carbon stocks in each commitment period’ (Article 3, paragraph 3). The emissions from the other five greenhouse gases that were covered in the protocol were measured in terms of their carbon dioxide equivalents calculated in terms of global warming potential. The methodologies for measuring emissions and calculating global warming potential were those developed by the IPCC.

The Kyoto Protocol included three flexible mechanisms. One was emission trading. Annex I countries would be able to trade emission allowances. For instance, if a party had difficulty meeting its required limitations, it would be able to purchase emission allowances from another party that had emissions that were lower than its limitation. Joint Implementation among Annex I countries was another flexible mechanism. If a party engaged in a project to increase the sinks on another party’s territory through reforestation, the two parties could share the credit for the increased sinks. The third flexible mechanism was the Clean Development Mechanism. Annex I countries could engage in projects in non-Annex I countries that reduced prospective emissions and share the credit for this. The US government and many economists argued that these flexible mechanisms would make it possible to limit emissions at the least possible cost. The US administration of President William J. Clinton argued that the USA could only meet its commitment in the Kyoto Protocol if it were allowed to use flexible mechanisms.

How well these flexible mechanisms would work was unknown. Emission trading mechanisms were included in US domestic legislation and the 1979 Convention on Long-Range Transboundary Air Pollution. Various pilot projects have been undertaken. Whatever the fate of the Kyoto Protocol, flexible mechanisms in some form will likely be part of the climate change regime. Flexible mechanisms will put heavy demands on the capacity of UNFCCC organs and the parties to the convention.

The Kyoto Protocol did not include requirements that specify policies and measures that parties must adopt, although it could have under the terms of the UNFCCC. It did not require that countries impose a carbon tax or ensure that appliances or automobiles meet efficiency standards. Many European states argued that any effort to limit greenhouse gas emissions should require uniform policies and measures. Some US economists also made this argument. The US government preferred that international accords state obligations and that parties should be free to adopt whatever policies and measures they choose to meet their obligations. Policies and measures, whether they are uniform or country-specific, will inevitably be part of the climate change regime. They will also place heavy demands on the capacity of countries that are parties to the UNFCCC.

Of the Annex I countries in the late 1990s only Austria, Germany, Luxembourg, the UK, and the Eastern European countries in transition to democracy and market economies had emissions below 1990 levels (Grubb et al. 1999, p. 82). The UK’s emissions fell because of its transition from coal to natural gas
and Germany’s because of the collapse of East German industries. Industries had also collapsed in the countries in transition.

As of January 2000, 84 countries had signed the Kyoto Protocol and 22 had ratified it. None of the Annex I countries had ratified the Kyoto Protocol. Prior to COP-3 the US Senate adopted a resolution by a vote of 95 to 0 stating that the USA would not agree to a treaty limiting emissions unless limitations also applied to developing countries. After COP-3 adopted the Kyoto Protocol, Senators from both parties stated that the protocol was not ratifiable. Senators and representatives of industry and labor argued that unless developing countries were included in the emission limitation requirement, developing countries would have an economic advantage. Some argued that industry would move from developed to developing countries.

Without US ratification, it would be difficult for the Kyoto Protocol to come into effect. To come into effect, 55 states, which together accounted for 55 percent of the 1990 carbon dioxide emissions of Annex I countries, must have become parties to the protocol (Article 25). US emissions accounted for more than a third of the 1990 totals for Annex I countries.

Several factors contributed to the differences between the USA and other countries and to the US reluctance to accept the Kyoto Protocol. Under the US legal system, individual citizens or NGOs can sue the government to force it to comply with an international treaty to which the USA is a party. The USA does not ratify treaties unless the government is confident that it can comply. Other states have different legal systems where suing the government is much more difficult. For other states becoming a party to a treaty is frequently a statement of intention to try to comply. The size and climate of the USA are factors in the US propensity to consume energy. More compact countries with milder climates have more modest energy requirements. Unlike many developed countries, the US population was growing in the late twentieth century and was projected to continue to grow, leading to increased demands for energy and other products that would increase greenhouse gas emissions. At least partly because the USA was a petroleum producer, the US public was addicted to inexpensive energy. Efforts to increase energy prices regularly produced political outcries.

If the Kyoto Protocol did not come into effect, a new agreement would have to be negotiated. If the protocol did enter into force, it would require that additional limitations should be negotiated for subsequent periods (that is, those after 2008–2012) and that these negotiations should begin no later than 2005 (Article 3, paragraph 9). Eventually a global bargain will have to be struck involving all parties, both developed and developing countries. The process of negotiating instruments to implement the UNFCCC will be long-lasting.

Bringing developing countries into the climate change regime will be essential but extraordinarily difficult to achieve. Developing countries accounted for slightly more than 30 percent of greenhouse gas emissions in 1990, and their emissions were growing rapidly. As of 2000 developing countries accounted for just under 80 percent of the world’s population. Future population growth was projected to occur primarily in developing countries. Developing country governments desire rapid economic growth. Population and economic growth will increase developing country greenhouse gas emissions. Developing countries will become the dominant source of greenhouse gas emissions in the twenty-first century.

Developing countries are also extremely vulnerable to climate change. The small island states could be submerged by sea-level rise. Sea-level rise could threaten coastal zones where developing country populations are often concentrated. Countries that are heavily dependent on agricultural production could suffer greatly from climate change. Petroleum-producing countries could see their incomes cut if petroleum consumption were to decrease.

Promoting economic growth is arguably one of the most effective steps developing countries could take to deal with climate change. The more developed it is, the more resources a country has to devote to limiting greenhouse gas emissions. Developed countries also can adapt more easily to climate change. Developing countries would never agree to limit their economic growth. To become effective the climate change regime will have to find ways to promote sustainable development.

4. Issues

The climate is a classic common good. Human action anywhere affects the climate and humans everywhere are affected by the climate. Mitigating climate change will require the combined efforts of governments and, in response to government policies, of individuals throughout the world. Article 4 of the United Nations Framework Convention on Climate Change requires that these efforts be based on equity. This acknowledges the common good character of the climate. Parties to the UNFCCC will only act if they feel that the burdens of dealing with climate change are borne equitably. The treaty embodies one concept of equity. The special requirements placed on Annex I and Annex II countries recognize their relative wealth and the fact that because of their historical lead in industrializing they have contributed more to current concentrations of greenhouse gases in the atmosphere. The resolution adopted by the US Senate reflects another concept of equity, the necessity of all states taking at least some action, particularly in view of the growing share of greenhouse gases emitted by non-Annex I countries. These different concepts of equity
will have to be reconciled for effective action to be taken.

Knowledge is a second issue. Decades of work have greatly improved the natural science of climate change. Climate change models have become more sophisticated and the amount of data has increased. There are roughly two dozen large-scale models that are used in this work. They produce somewhat different results, particularly with respect to forecasts of regional impacts, but the range of differences with respect to the global average mean surface temperature is not different from the range that has existed since the beginning of the twentieth century. Even though work on the economic and social dimensions of climate change started later, substantial progress has been made. Sophisticated models have been developed to estimate future emissions of greenhouse gases and the costs of mitigation strategies (Nordhaus 1994, Watson et al. 1996, pp. 263–396).

Estimates produced by these models vary because they are sensitive to assumptions about appropriate model structure and demographic and economic growth and the availability of demand-side (energy efficiency) and supply-side (alternative sources of supply) energy options. Research has produced better understanding of the mechanisms that promote compliance with international accords. Despite this progress in the natural and social sciences much remained to be done, especially concerning the economic and social dimensions. Efforts to adapt to and mitigate climate change will require modifying human behavior. Understanding how to promote appropriate modifications in behavior is a crucial issue for research.

See also: Climate Change and Health; Climate Change, Economics of; Climate, History of; Climate Impacts; Tropospheric Ozone: Agricultural Implications; United Nations: Political Aspects

Bibliography

Arrhenius S 1896 On the influence of carbonic acid in the air upon the temperature on the ground. Philosophical Magazine 41: 237–76
Brenton T 1994 The Greening of Machiavelli: The Evolution of International Environmental Politics. Earthscan, London
Bruce J P, Lee H, Haites E F (eds.) 1996 Climate Change 1995: Economic and Social Dimensions of Climate Change: Contribution of Working Group III to the Second Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, UK
Grubb M, Vrolijk C, Black D 1999 The Kyoto Protocol: A Guide and Assessment. Royal Institute of International Affairs, London
Houghton J T, Jenkins G J, Ephraums J J 1990 Climate Change: The IPCC Scientific Assessment, Report Prepared for IPCC by Working Group I. Cambridge University Press, Cambridge, UK
Houghton J T, Meira Filho L G, Callander B A, Harris N, Kattenberg A, Maskell K 1996 Climate Change 1995: The Science of Climate Change: Contribution of Working Group I to the Second Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, UK
Nordhaus W D 1994 Managing the Global Commons: The Economics of Climate Change. MIT Press, Cambridge, MA
Revelle R, Suess H E 1957 Carbon dioxide exchange between atmosphere and ocean and the question of an increase in atmospheric CO₂ during the past decade. Tellus 9: 18–27
Richardson L F 1922 Weather Prediction by Numerical Process. Cambridge University Press, Cambridge, UK
Rowlands I H 1995 The Politics of Global Atmospheric Change. Manchester University Press, Manchester, UK
Soroos M 1997 The Endangered Atmosphere: Preserving a Global Commons. University of South Carolina, Columbia, SC
United States of America, National Academy of Sciences (USA/NAS) 1977 Climate, Climatic Change, and Water Supply. National Academy Press, Washington, DC
United States of America, President’s Science Advisory Council (USA/PSAC) 1965 Restoring the Quality of Our Environment: Report of the Environmental Pollution Panel. The White House, Washington, DC
Watson R T, Zinyowera M C, Moss R H 1996 Climate Change 1995: Impacts, Adaptations and Mitigation of Climate Change: Scientific-Technical Analyses: Contribution of Working Group II to the Second Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, UK
World Meteorological Organization (WMO) 1987 Developing Policies for Responding to Climatic Change: A Summary of the Discussion and Recommendations on the Workshop held in Villach, 28 September–2 October 1987 and Bellagio, 9–13 November 1987, (WMO/TD 225). WMO, Geneva
World Meteorological Organization (WMO), Environment Canada, United Nations Environment Program 1988 The Changing Atmosphere: Implications for Global Security, Toronto Canada, 27–30 June 1988, Conference Proceedings. WMO, Geneva

H. K. Jacobson

Clinical Assessment: Interview Methods

The single most common method of assessment in both clinical practice and research is an interview, whereby the clinician speaks directly to a person to obtain the clinical assessment (Widiger and Saylor 1998). Additional methods of assessment, such as self-report inventories, projective instruments, or laboratory tests, are often used to supplement or inform a clinical interview, but only under quite special circumstances would a clinician rely solely upon one of these other techniques, whereas clinicians and researchers will often rely solely upon a clinical interview.

1. Advantages of a Clinical Interview

Many of the advantages of a clinical interview are somewhat obvious, but worth noting for the record
nevertheless (Groth-Marnat 1997). First, clinical interviews are substantially more flexible than alternative methods. Interviewers can alter the focus, depth, or even the style of an interview to be optimally responsive to the particular demands, interests, or needs of the respondent or the assessment. Response sets (intentional, habitual, or unconscious tendencies to provide false or misleading responses) can affect the validity of a clinical interview (Rogers 1995), but interviewers can themselves be sensitive and responsive to symptom exaggeration, distortion, or denial during the course of an interview. Interviewers may notice if a respondent is being excessively acquiescent or defensive, if the mood state of a respondent is contributing to excessive self-denigration, or if the responses are inconsistent across the interview. The interviewer can then immediately alter the format, style, or scoring of the interview to make adjustments for the response sets, can conduct follow-up queries to assess for the presence of problematic response sets, or even discuss a response set directly with the respondent in order to decrease its effects.

A disadvantage of other methods of clinical assessment is that they will routinely cover domains of functioning that will not be particularly relevant or necessary for an issue or patient at hand and yet, at the same time, fail to cover in adequate depth the domain of functioning of most interest or relevance to the patient and clinician. For example, most omnibus self-report inventories attempt to cover virtually all domains of psychopathology but must then provide an inadequate assessment for any one of them. An interviewer has the unique advantage of being able virtually to abandon a focus of inquiry during the course of an assessment to spend more time and effort on a particular line of investigation.

Finally, the clinician conducting the interview is usually the person who will ultimately provide the clinical report, and seeing or hearing for oneself is usually much more compelling than being told by something or someone else. The clinician is able to see and experience firsthand the person’s behaviors, feelings, statements, and manner of relatedness. The presentation of the patient’s psychopathology in his or her speech, affect, and behavior within the clinician’s office can provide a powerfully vivid and compelling portrayal.

2. Limitations of Unstructured Clinical Interviews

Many of the benefits and advantages of a clinical interview, however, fail to be realized in routine clinical practice, as the freedom and authority provided by a clinical interview do not come without substantial responsibilities, costs, and limitations. The reliability and validity of a clinical interview depend substantially upon the conscientiousness, skills, and talents of the clinician. Seeing for oneself can be very compelling, but it can be equally illusory and deceiving. Many clinicians may place excessive faith, or at least have excessive confidence, in their own perceptions and judgments, despite the fact that studies have shown repeatedly that unstructured clinical assessments often obtain poor agreement across different interviewers (Dawes 1995, Garb 1998) (see Clinical Psychology: Validity of Judgment; Clinical versus Actuarial Prediction). Two clinicians relying upon their own skills, talents, and abilities will often provide different conclusions regarding the same patient. At least one of them will be wrong, but both will believe it is the other clinician. The instrument of the unstructured clinical interview is for the most part the clinician, and there are perhaps few clinicians who truly recognize or adequately appreciate their own limitations, deficits, and flaws.

Many studies have documented that unstructured clinical interviews tend to be unreliable and are highly susceptible to primacy effects, halo effects, false expectations, misleading assumptions, and confirmatory biases (Dawes 1994, Garb 1998, Widiger and Saylor 1998), and this research appears to have had only a marginal effect upon the beliefs or behavior of most individual practitioners (e.g., Westen 1997). The lack of an adequate impact of this research is perhaps due in part to the ability of persons to believe that the research is primarily relevant to persons other than themselves. An advantage of other methods of assessment is that the research indicating, for example, systematic errors within a laboratory instrument, will clearly be relevant to almost any administration of that instrument. Research indicating the systematic errors of a sample of clinicians might not be applicable to a clinician who did not actually participate in that particular study. Clinicians can then argue and believe that the decision-making research is not really applicable to them because they are in fact adequately sensitive and responsive to the issues, errors, biases, or concerns identified in this research.

A major source for the failure of unstructured clinical interviews to provide reliable or valid assessments is the failure to conduct systematic or comprehensive assessments. An innovation of the American Psychiatric Association’s (APA) Diagnostic and Statistical Manual of Mental Disorders (DSM-IV; APA 1994) is the provision of relatively specific and explicit diagnostic criteria for each mental disorder (see Mental and Behavioral Disorders, Diagnosis and Classification of). Reliable and valid clinical diagnoses are now readily obtained as long as the interviewer does indeed comprehensively assess every diagnostic criterion in a systematic manner (Nathan and Langenbucher 1999). Many studies, however, have indicated that clinicians will often reach a conclusion after determining the presence of only a small subset of the diagnostic criteria set, and will fail to assess for the presence of additional symptomatology of other possible disorders (Garb 1998, Widiger and Saylor 1998). Zimmerman and Mattia (1999), for example,
compared the clinical diagnoses provided for 500 patients structured interview schedules in which a set of
who were assessed with unstructured clinical inter- specified questions must be administered, the inter-
views with the diagnoses provided by a semistruc- pretation and scoring of the responses to which are
tured interview that systematically assessed for the guided by an accompanying manual. These interview
presence of the diagnostic criteria for most of the schedules can vary substantially in the extent to which
commonly occurring Axis I mental disorders (i.e., the questions are open-ended, observations of the
mental disorders other than personality disorders or respondent are included, and the interviewer is allowed
mental retardation). More than 90 percent of the to conduct follow-up queries. A fully structured
patients receiving the unstructured clinical interview interview would be essentially equivalent to a verbally
were provided with only one diagnosis, whereas more administered self-report inventory (Widiger and
than a third of the patients assessed with the semi- Saylor 1998). Most interview schedules, however, are
structured interview were discovered to have met the more accurately described as being semistructured, as
diagnostic criteria for at least three different mental they will include subtle and indirect questioning, will
disorders. Comorbidity among mental disorders require follow-up queries, and will include open-ended
has substantial significance and importance to clini- questions, the responses to which will require pro-
cal treatment, yet it appears to be grossly under- fessional judgment and expertise for interpretation
recognized in general clinical practice (Zimmerman and scoring. The only difference between some
and Mattia 1999). semistructured interviews and a skilled clinician is that
A variety of studies have also indicated that the inclusion of a semistructured interview documents
clinicians relying upon unstructured clinical interviews explicitly the obtainment of a reliable, replicable,
routinely fail to assess for the presence of the specified systematic, objective, and comprehensive assessment
diagnostic criteria (Widiger and Sanderson 1995). One of all of the relevant symptomatology.
of the more compelling demonstrations of this failure Semistructured interviews are the preferred method
was provided by Morey and Ochua (1989). Morey and for obtaining clinical assessments in research, but are
Ochua provided 291 clinicians with the 166 DSM-III perhaps rarely used in general clinical practice.
(APA 1994) personality disorder diagnostic criteria Clinicians perceive semistructured interviews as being
and asked them to indicate which DSM-III personality constraining, impractical, or superficial (Westen
disorder(s) were present in one of their patients and to 1997). Semistructured interviews are indeed constrain-
indicate which of the 166 diagnostic criteria were ing, as they are a means by which to ensure that the
present. Kappa for the agreement between their findings are reliable, replicable, systematic, compre-
diagnoses and the diagnoses that would be given based hensive, and objective by constraining the clinician
upon the diagnostic criteria they indicated to be from failing to assess all of the necessary criteria in a
present was very poor, ranging from 0.11 (schizoid) to minimally adequate manner (Segal 1997). Most semi-
only 0.58 (borderline). In other words, their clinical structured interviews, however, allow and do in fact
diagnoses agreed poorly with their own assessments of encourage clinicians to have a significant impact
the diagnostic criteria for each of the personality through the administration of follow-up queries and
disorders. Comparable results have since been re- the reliance upon their professional judgment for the
ported in many subsequent studies (Widiger and scoring of the responses.
Saylor 1998). One of the major impediments to the implemen-
Among the more consistently documented errors in tation of a semistructured interview in general clinical
clinical practice are gender and racial biases in the practice is the amount of time that is required for their
application of diagnostic criteria (Garb 1998). DSM- complete administration. Researchers will often pay
IV diagnostic criteria sets will contain a degree of both interviewers and patients for two or three hours
gender and racial bias (Hartung and Widiger 1998). of interviewing; no such funding luxury is available in
However, racial and gender biases that have been general clinical practice. However, the amount of time
documented empirically have been due in large part to required for the administration of a semistructured
a failure of clinicians to adhere to the specified interview can be reduced substantially by first ad-
diagnostic criteria set for a respective mental disorder ministering a screening questionnaire to narrow the
(Whaley 1997, Widiger 1998). When clinicians are line of inquiry. Screening questionnaires with a high
compelled to follow closely the criteria set for a mood, false positive rate (i.e., err in the direction of identi-
psychotic, or personality disorder, gender and racial fying too much rather than too little psychopathology)
biases are significantly less likely to occur. are also useful in alerting the clinician to domains of
functioning that might have been otherwise neglected.
Many clinicians will also perceive some of the
3. Adantages of Semistructured Clinical required questioning to be simplistic or superficial.
Interiews However, it is important to appreciate that a sub-
stantial amount of research has informed the de-
Limitations of unstructured clinical interviews can be velopment of a particular line of questioning. Semi-
addressed in part through the administration of more structured interviews can in fact be an excellent source

2018
Clinical Assessment: Interiew Methods

for discovering new and effective methods of inquiry. Semistructured interviews, however, will not be as effective as an unstructured interview in establishing rapport. Most clients will appreciate the comprehensive and thorough nature of a semistructured interview, but if the establishment of rapport is a major clinical issue, then a lengthy semistructured interview will at times be problematic.

4. Recommendations for Future Research

A clinical and scientific limitation of many semistructured interviews is the absence of data normally obtained through the course of the development and validation of a psychometric instrument. For example, semistructured interview reliability data are often simply confined to an agreement with respect to the scoring of a previous or concurrent administration of the interview. The poor reliability obtained in general clinical practice is due to inconsistent, incomplete, or idiosyncratic interviewing. It is unclear if some of the semistructured interviews have actually resolved this problem given the absence of studies on the interrater (or test–retest) reliability of independent administrations of the interview (Rogers 1995, Segal 1997, Widiger and Saylor 1998).

There are a variety of different interview schedules to assess the same domains of psychopathology. An advantage of this diversity is the availability of different options to choose from. However, current research suggests that these different interview schedules are providing different findings and it is unclear if the failure to replicate findings across studies is due to idiosyncratic administration of interviews, differences in setting, or differences in the interview schedules. One suggestion has been to confine future research to just one interview schedule (Regier et al. 1998). This confinement would contribute to the obtainment of more uniform results, but at the cost of the failure to appreciate the extent to which the results in fact reflect unique aspects of a particular interview schedule. What is needed are studies comparing directly the concurrent and predictive validity of alternative interview schedules within the same patient sample.

Normative data are also lacking for many of the semistructured interviews. The test manuals that accompany the publication of a semistructured interview are often surprisingly weak in their coverage of reliability and validity data. Diagnoses obtained through the administration of a semistructured interview are used as the criterion by which the validity of other instruments is evaluated, but semistructured interview schedules may rely too heavily for their own derivation on simply face validity. In defense of the validity of semistructured interviews, the most compelling published research concerning the etiology, course, pathology, and treatment responsivity of various mental disorders has relied substantially on the administration of a semistructured interview. The results of this extensive research provide considerable support for the construct validity of the respective semistructured interviews that were used in these studies. In addition, detailed summaries of the reliability and validity of alternative interview schedules are provided in a number of published papers and texts (e.g., Rogers 1995, Segal 1997, Widiger and Sanderson 1995).

5. Conclusions

In sum, the many advantages of semistructured interviews clearly outweigh their limitations and disadvantages. Many are now being used in general clinical practice when the results of the clinical assessment might be subsequently questioned or reviewed (e.g., custody, disability, and forensic assessments). A highly talented clinician can provide a more valid assessment than a semistructured interview, but it is risky to assume that one is indeed that talented clinician. It would at least seem desirable for a talented and insightful clinician to be fully informed by a systematic and comprehensive assessment. Semistructured interviews are used routinely in general clinical research and perhaps will eventually be used routinely in general clinical practice. Individually administered intelligence tests are comparable to a fully structured clinical interview, particularly an assessment of verbal intelligence that involves a series of specified questions, the responses to which are scored according to a test manual. Very few clinicians would attempt to diagnose mental retardation in the absence of the administration of one of these structured interviews. Perhaps in the future no clinician will attempt to diagnose an anxiety, mood, psychotic, dissociative, personality, or other mental disorder without at least considering the results obtained by the administration of a respective semistructured interview.

See also: Clinical Psychology: Validity of Judgment; Clinical versus Actuarial Prediction; Minnesota Multiphasic Personality Inventory (MMPI)

Bibliography

American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders, 4th edn. American Psychiatric Association, Washington, DC
Dawes R M 1994 House of Cards: Psychology and Psychotherapy Built on Myth. Free Press, New York
Garb H N 1998 Studying the Clinician. Judgment Research and Psychological Assessment. American Psychological Association, Washington, DC
Groth-Marnat G 1997 Handbook of Psychological Assessment, 3rd edn. Wiley, New York
Hartung C M, Widiger T A 1998 Gender differences in the diagnosis of mental disorders: Conclusions and controversies of DSM-IV. Psychological Bulletin 123: 260–78
Morey L C, Ochua E S 1989 An investigation of adherence to diagnostic criteria: Clinical diagnosis of the DSM-III personality disorders. Journal of Personality Disorders 3: 180–92
Nathan P, Langenbucher J W 1999 Psychopathology: Description and classification. Annual Review of Psychology 50: 79–107
Regier D A, Kaelber C T, Rae D S, Farmer M E, Knauper B, Kessler R C, Norquist G S 1998 Limitations of diagnostic criteria and assessment instruments for mental disorders. Implications for research and policy. Archives of General Psychiatry 55: 109–15
Rogers R 1995 Diagnostic and Structured Interviewing. A Handbook for Psychologists. Psychological Assessment Resources, Odessa, FL
Segal D L 1997 Structured interviewing and DSM classification. In: Turner S M, Hersen M (eds.) Adult Psychopathology and Diagnosis. Wiley, New York, pp. 24–57
Westen D 1997 Divergences between clinical and research methods for assessing personality disorders: Implications for research and the evolution of Axis II. American Journal of Psychiatry 154: 895–903
Whaley A L 1997 Ethnicity/race, paranoia, and psychiatric diagnoses: Clinician bias versus sociocultural differences. Journal of Psychopathology and Behavioral Assessment 19: 1–20
Widiger T A 1998 Sex biases in the diagnosis of personality disorders. Journal of Personality Disorders 12: 95–118
Widiger T A, Sanderson C J 1995 Assessing personality disorders. In: Butcher J N (ed.) Clinical Personality Assessment. Practical Approaches. Oxford University Press, New York, pp. 380–94
Widiger T A, Saylor K I 1998 Personality assessment. In: Bellack A S, Hersen M (eds.) Comprehensive Clinical Psychology. Pergamon, New York, pp. 145–67
Zimmerman M, Mattia J I 1999 Psychiatric diagnosis in clinical practice: Is comorbidity being missed? Comprehensive Psychiatry 40: 182–91

T. A. Widiger


Clinical Psychology: Animal Models

Over the years, considerable debate has surrounded the question of whether psychopathology and its treatment can be studied meaningfully in animals. One side has argued that psychopathological syndromes such as anxiety and depression are uniquely human and cannot be experienced in animals. Another side argues that there are both naturally occurring and experimentally induced psychopathological states in animals that closely parallel those seen in humans. This side also argues that there is so much to be learned from the systematic study in controlled settings of emotional or otherwise disturbed behavior in animals that this more than offsets any problems created by potential species differences. Proponents of this position bring animals into laboratory settings and study them under controlled experimental conditions to help us better understand various aspects of human disorders. The goal in these attempts is to develop an animal model of a disorder or its treatment.

1. Historical Background

Although the study of emotions such as fear and sadness in animals dates back at least to Darwin (1872), the experimental study in animals of the neurotic extremes of emotional states and other aspects of psychopathology did not begin until some years later (e.g., Pavlov 1927). Shortly after that time in the United States, where the methods of Pavlov and Thorndike to study learning were enthusiastically embraced, interest in Pavlov’s so-called ‘neuroses of the experiment’ spread during the 1930s and 1940s. Indeed, a number of well-known laboratories were established to study what came to be called experimental neurosis (e.g., Liddell, Gantt, Masserman, and N. R. F. Maier). At the time, this work was reasonably influential, partly because other extant models of what causes psychopathology were generally very primitive. By contrast, ideas for studying human psychopathologies through developing animal models seemed remarkably advanced. Indeed, the success of various experimental manipulations in producing disordered behavior and emotions in several different species was clearly influential in establishing the foundations of behavioral approaches to the etiology and treatment of anxious and depressive disorders.

Unfortunately, this early work on experimental neurosis was fairly unsystematic. Investigators explored the effects of experimental variants on traditional learning paradigms (often discovered accidentally) that seemed to produce disturbed behavior in their animals. However, the next steps were not taken. One would have been to manipulate systematically various aspects of the procedures to determine what the critical (causal) features were. Investigators also needed to demonstrate (but did not) compelling phenotypic (symptomatic) and/or functional similarities between the ‘neurotic symptoms’ seen in animals and human patients. Instead, somewhat superficial, and often anthropomorphic, assertions of similarity were made. As a consequence of these failures, the study of animal models for psychopathological disorders fell into relative obscurity for several decades (see Mineka and Kihlstrom 1978).

2. Contemporary Use of Animal Models

A resurgence of interest started about 1970, when a number of investigators began to make persuasive arguments that animal models of some disorders could
be very useful if certain criteria are adhered to in developing the model. For example, Seligman (1975) and McKinney (1974) both argued that animal models can be useful if one attempts to document similarities and parallels in the symptoms, etiology, therapy, and prevention of the animal and human syndromes. Obviously, at the outset not all of the parallels will be possible to detail because little may be known about some of these factors (e.g., prevention) for either the animal or the human disorder. Nevertheless, herein lies one of the special advantages of developing an animal model. Initially some compelling similarities must be drawn between the human disorder and the animal model. Then, however, the animal model can be used to test hypotheses about other possible parallels (e.g., prevention) that often cannot easily be tested experimentally with humans. Some of the work developing such full-fledged animal models has been quite successful, as discussed below.

Alternatively, others have argued that requiring adherence to all these criteria may be unnecessarily restrictive, given that for most disorders no such complete models have yet been discovered. This is partly because of significant limitations on the range of human symptoms that can be modeled in animals, especially if expression of that symptom is mediated by higher cortical structures in the brain not shared by most species. There are, however, many very interesting and important ‘mini-models’ which help illuminate different aspects of the symptomatology, or etiology, or prevention, or treatment of these disorders. Mini-models are emotional, behavioral, cognitive, and/or physiological phenomena studied in animals (or humans) that may clarify some of the most prominent features of the origins or treatment of a disorder. The behaviors and emotional responses are manipulated experimentally through either behavioral or physiological experimental manipulations. Although any given mini-model may illuminate only a subset of prominent features of a disorder, in nature such factors would operate in interaction with other factors in the etiology or treatment of a human disorder.

3. Animal Models in Psychopharmacology and Behavioral Pharmacology

Nowhere has the use of animal models been more prominent than in the fields of psychopharmacology and behavioral pharmacology, where researchers develop new medications to treat mental disorders and try to understand how the medications work to ameliorate symptoms. Nearly all medications are initially tested on animals before being approved for use by humans, to determine both their effectiveness and their safety. However, before this can be done, researchers must first develop and validate an animal model of the human disorder (or often just a subset of symptoms of a disorder) before they can test their medications. For example, determining if a new medication serves as an anxiolytic (anxiety-reducing) medication requires knowing how to produce strong symptoms of anxiety in animals (as well as knowing that the measure of anxiety is functionally, if not phenotypically, related to human anxiety). The same, of course, applies to treatments of other disorders.

Staying with the anxiety example, at least 30 different animal models have been used to test the effectiveness of anxiolytic drugs. One of the most common uses a conflict/punishment procedure. Rats are first trained to press a bar to obtain food reinforcement on an occasional basis. Later they can still obtain food on some trials but now they also receive an unpleasant electric shock following the food. Not surprisingly, this punishment procedure puts hungry rats in a state of conflict: anxiety about punishment now conflicts with hunger. Typically, rates of responding for the food are diminished substantially unless an effective anxiolytic medication is given (e.g., diazepam from the benzodiazepine category). Having been validated as a model in numerous studies with medications known to reduce anxiety in humans, the model is then used to test potential new anxiolytic compounds.

Researchers must beware, however, that this method produces both false positives and false negatives. Sometimes a medication that seems to work in the animal model is later shown not to work in humans. Alternatively, a medication that does not seem effective with animal models may nonetheless be effective in humans. In such cases further work is needed to determine the source of the discrepancy. To illustrate, a relatively new anxiolytic compound known as buspirone (not from the benzodiazepine category) did not initially seem effective using the conflict/punishment procedure in rats. However, buspirone operates through different physiological mechanisms, and has a different time course of action, than do traditional benzodiazepine compounds. Animal models incorporating this knowledge do show buspirone to be effective. Thus researchers must beware, when an established animal model does not demonstrate effectiveness of a novel compound, that the new compound should not be dismissed without further work. Such work is needed to avoid the risk of prematurely screening out potentially effective novel medications (e.g., Rodgers 1997).

Another use of pharmacological models involves testing theories of the physiological underpinnings of different disorders, rather than their treatment. Here researchers use various pharmacological agents to induce physiological and behavioral states in animals that resemble those seen in a human disorder. Caution is also needed here. Observing that a drug induces a state in animals resembling that in humans neither justifies the conclusion that the same system is normally involved in the human case, nor speaks to the issue of whether there may be a number of alternative routes to the human disorder (e.g., Weiss and Uhde 1990).

4. Animal Models of Anxiety and Anxiety Disorders

4.1 Specific and Social Phobias

Historically, research on animal models of anxiety (especially fears and phobias) began before models of other disorders. The role of classical conditioning (see Classical Conditioning and Clinical Psychology) in the etiology of specific phobias, first proposed by Watson and Rayner (1920), has been the subject of some controversy since about 1970. Some theorists ask how classical conditioning can play an important role both: (a) when some people recall no traumatic conditioning experiences; and (b) when others who can recall traumatic experiences are not fearful/phobic. They have also wondered why some objects and situations are much more likely to become the objects of fears and phobias than others (e.g., phobias for snakes, spiders, and heights are far more common than for cars, guns, or knives, which may also be associated with trauma).

However, contemporary research with animal models illustrates that Pavlov’s and Watson’s core ideas about the role of classical conditioning were sound, but need to be expanded to incorporate the broader knowledge now available about the complexities of conditioning (see also Classical Conditioning and Clinical Psychology). For example, a primate model showed that conditioning sometimes occurs observationally or vicariously; that is, watching someone behave fearfully with, for example, a snake, may be sufficient to induce a fear of snakes in the observer (without any direct trauma occurring). Moreover, the role of conditioning (vicarious or direct) must be considered in light of various vulnerability and invulnerability factors that influence the outcome of a traumatic conditioning experience (cf. Mineka and Zinbarg 1996). For example, animal research by Pavlov and others laid the foundation for showing how individual differences in personality/temperament (such as levels of trait anxiety) affect conditioning and the likelihood of acquiring fears and phobias.

Animal work also illustrates a wide range of experiential differences across individuals that strongly affect the outcome of direct or indirect conditioning experiences (and therefore why many without phobias will have such histories that involved putative conditioning events). For example, having extensive previous neutral or positive experiences with a potentially phobic object (e.g., a dog) can prevent the acquisition of dog phobia if the individual is later bitten by a dog. Moreover, a primate model showed that being reared with a strong sense of mastery and control over one’s environment makes one less susceptible to the effects of the stressors that later may be involved in conditioning incidents.

Finally, there are evolutionarily based predispositions to acquire fears and phobias of certain objects or situations (e.g., snakes, water) that once posed a threat to our early ancestors more readily than other objects and situations not present in our early evolutionary history (e.g., guns, knives). Thus animal models have shown us that personality, experiential, and evolutionary factors may serve as diatheses or vulnerability factors for the development of phobias in certain individuals, given appropriate experiential input (cf. Mineka and Zinbarg 1996).

In social phobias, people have strong and persistent fears of various types of social interaction where they fear they may be evaluated and judged unfavorably. Again, traumatic conditioning experiences (direct or vicarious) are thought to play an important role, but with the same caveat as for specific phobias. That is, animal models demonstrate that personality, experiential, and evolutionary variables determine a person’s level of vulnerability/invulnerability to developing social phobias. Considerable knowledge in this area stems from animal models of social anxiety (such as occurs following defeats in physical fighting, when the animal typically becomes afraid of all dominant conspecifics rather than simply the one involved in the defeat) (cf. Mineka and Zinbarg 1996).

4.2 Panic Disorder with Agoraphobia

Individuals with panic disorder have recurrent unexpected panic attacks, usually associated with persistent anxiety about having another attack (anticipatory anxiety). Many also develop some degree of agoraphobic avoidance, learning to avoid situations in which they fear panicking. A full-fledged animal model of panic disorder remains elusive, although some pharmacological agents that provoke panic attacks in humans with the disorder do seem to produce a panic-like state in some primates. Nevertheless, animal mini-models of panic and anxiety together have proved important in the development of a new theory of the origins of panic disorder that is largely based on contemporary principles of learning studied in animals (Bouton et al. 2001). The essence of this complex theory is that the occurrence of panic attacks sets the stage for the conditioning of anxiety to both internal and external cues that preceded a panic attack, thus explaining the development of both anticipatory anxiety and anxiety leading to agoraphobic avoidance. In addition, internal cues associated with the beginning of an attack can become conditioned to elicit panic attacks themselves. For example, a few heart palpitations that often occur early during an attack
could come to serve as conditional stimuli that trigger cognitive and somatic symptoms. Some of these can be
full-blown panic attacks (see Classical Conditioning modeled better in animals than others (the cognitive
and Clinical Psychology). ones are especially difficult). Since the 1960s, various
animal models of depression have produced useful
insights into human depression, ranging from new
ideas concerning etiology, prevention, and treatment.
4.3 Post-traumatic Stress Disorder (PTSD) The models have differed in several ways, perhaps
most notably in the methods used for causing ‘de-
This disorder develops in some individuals following pression’ in the animals.
exposure to a traumatic event in which the person Some of the earliest and most striking work with
experienced or witnessed events involving actual or primates used a social separation paradigm. Following
threatened death to themselves or others. Symptoms on earlier work demonstrating that human infants
include persistent re-experiencing of the event (e.g., undergoing prolonged separations from their mothers
through nightmares and flashbacks), persistent avoid- showed a biphasic response to the separation, Harlow
ance of stimuli associated with the trauma, and arousal and colleagues in the 1960s began to study this
symptoms such as difficulty concentrating and exag- phenomenon in infant rhesus monkeys separated from
gerated startle responses. Recently, animal models of their mothers. Both the monkey and human infants
PTSD have been the focus of much research, providing typically go through an initial state of intense agitation
useful insights into the nature of this disorder as well and distress (the protest phase—often seen as a
as its etiology and treatment. Animal models involve prototype of anxiety), followed several days later (if
exposure to unpredictable and/or uncontrollable the separation persists) by a phase of despair/depres-
stress which initiates the emotional, behavioral, and sion, characterized by social withdrawal and rejection.
physiological symptoms resembling those seen in Although some have argued this is at best a model of
human PTSD. The intense physical stressors used in infant depression, others have argued that infant
animal studies resemble those in some forms of human depression is in fact a prototype for adult human
traumatization associated with PTSD, including tor- depression, with most of the prominent symptoms
ture, abuse, and assault. This has in turn drawn being functionally quite similar (except the cognitive
attention to the role that perceptions of uncontroll- ones).
ability\unpredictablity play in the development of By studying social separation in monkeys one can
human PTSD symptoms and has led to studies testing manipulate experimentally numerous variables, both
these ideas. before and during the separation, to test hypotheses
This animal model has also shown the powerful role about factors promoting minimal versus exaggerated
that various vulnerability and invulnerability factors responses to separation—something that obviously
play in determining who is more or less likely to cannot be done in human infants. Research using
develop PTSD, given exposure to the same trauma. animal models made it clear, for example, that having
For example, the animal model has revealed that prior a sibling or alternate caregiver present during the
exposure to uncontrollable stressors prior to the separation from mother can attenuate (but usually not
relevant trauma sensitizes the animal (or human), eliminate) the response to separation. But similar
making it more likely to develop PTSD-like symptoms research with different species of monkeys, in which
than animals without prior exposure to the uncon- separated infants automatically get adopted by
trollable stressors. Conversely, prior exposure to ‘aunts,’ also showed that reducing behavioral signs of
controllable stressors before the relevant trauma leads distress is not tantamount to reducing physiological
to an immunization effect, making it less likely the signs of distress and arousal which can remain high
animal will develop PTSD-like symptoms. These (e.g., Coe et al. 1985). Another line of primate work
hypotheses are beginning to be corroborated in human showed the importance of preseparation experiences
research. Finally, the idea that perceptions of uncon- in determining the outcome of any given separation.
trollability and unpredictability mediate many aspects For example, infant monkeys of one species (bonnet
of PTSD has led to the formulation of hypotheses macaques) whose mothers are relatively permissive,
about the importance of re-instilling a sense of control allowing the infants considerable freedom to interact
and predictability as part of treatment. Such hy- with other adults, cope reasonably well with separa-
potheses are currently being tested (see Mineka and tions. By contrast, infants of another species (pigtail
Zinbarg 1996). macaques) whose mothers are quite possessive and
restrictive of their infant’s freedom, do not cope as
well with separations (e.g., Kaufman 1973). Observa-
tional follow-up work with human infants has often
5. Animal Models of Depression corroborated the hypotheses developed based on
experimental animal models which are better able to
Clinically significant depression is a surprisingly com- pinpoint causal factors (see Mineka and Zinbarg
mon condition. There are emotional, motivational, 1991).

Clinical Psychology: Animal Models

Another influential animal model of depression derives from the learned helplessness phenomenon and theory (e.g., Seligman 1975). In the late 1960s, Seligman and his colleagues Maier and Overmier noted that laboratory dogs initially exposed to uncontrollable shocks later showed major deficits in learning to control shock in different situations; indeed, they mostly seemed to accept the shock passively rather than trying to escape it. Yet animals first exposed to equal amounts of controllable shock showed no such deficits. Learned helplessness theory proposed that exposure to uncontrollable events leads one to learn that responses are ineffective in bringing relief, i.e., one is helpless to control important outcomes. The expectation of helplessness leads to: (a) cognitive deficits (difficulty learning to control shock); (b) motivational deficits (reduced incentive to try responding in other situations because of a belief that responses will be ineffective); (c) emotional changes (e.g., feelings of sadness, depression, and anxiety); and (d) physiological changes that occur with uncontrollable but not controllable stress. Seligman (1975) later proposed that the primary symptoms of depression quite strongly resembled these primary changes seen with learned helplessness.

Seligman also proposed etiological similarities. A large percentage of humans experiencing clinical depression have had one or more significant life stressors in the recent past (such as a major loss or unemployment). His argument was that all such precipitants could be seen as inducing a sense of lack or loss of control over important aspects of one’s environment. Thus perceptions of helplessness may be the proximal cause of such cases of depression. Finally, Seligman also developed the corollary hypothesis that re-instilling a sense of control may be a core ingredient in the most effective treatments for depression (see Peterson et al. 1993).

This animal model of depression led to an enormous amount of animal and human research, much of it continuing today. Interestingly, although originally developed as a model of depression, its relevance has expanded to include a role in theories of several anxiety disorders (most notably PTSD, see above). For example, the initial emotional state during and following uncontrollable stress is one of intense and diffuse anxiety, which may lapse later into a depressive state. This idea is highly consistent with neurochemical results, indicating a shift from an intense aroused anxious state to a later depressed state, sometimes called conservation-withdrawal (e.g., Woodson et al. 1998). Interestingly, as noted above, anxious symptoms also precede depressive symptoms during primate and human infant separations, as well as following major losses in adults (Bowlby 1980). In addition, far more people who suffer depression at some point in their lives have suffered from an anxiety disorder first than the reverse (cf. Maser and Cloninger 1990). Thus both animal models of depression discussed here illustrate important features of the sequential relationship between anxious and depressed symptoms that also occurs in humans.

Other animal models of depression have also provided somewhat different but partially overlapping information about both the symptom picture and etiological factors in depression. The learned helplessness model has also generated hypotheses about effective treatment and prevention. Pharmaceutical companies make ample use of the most practical animal models to test their antidepressant compounds. Although recent etiological theories of human depression have generally now come to incorporate features that cannot be modeled in animals (such as feelings of hopelessness about the future), many useful insights from the original models remain generally intact.

6. Other Animal Models

Animal models of numerous other disorders have also been studied, including psychopathy, schizophrenia, and addictions. For example, regarding psychopathy, Newman (1997) and his colleagues used the well-known animal syndrome that stems from dysfunction in septo-hippocampal areas of the brain to model various features of psychopathic behavior, most notably the failure to inhibit inappropriate behaviors and the inability to delay gratification. Fowles and Missel (1994) also summarized an important series of findings using animal and human mini-models, indicating the centrality of deficits in passive avoidance learning (learning what not to do to avoid punishment) to many important features of psychopathy. Both lines of work have led to new insights on psychopathy.

In schizophrenia, one early example of the use of animal models came when researchers first attempted to discover how traditional antipsychotic medications work to reduce schizophrenic symptoms in humans. Animal research helped to show the important role that the neurotransmitter dopamine plays in the effects of these medications, and also helped lead to the development of the dopamine hypothesis of schizophrenia (including its etiology). This very influential theory, now known to be overly simplistic, was developed and tested in good part with animal mini-models.

Finally, the use of animal models to study addictions has proved very important for understanding how addictive substances induce brain and behavioral changes. Using alcohol as an example, initially this work necessitated developing methods through which animals became addicted to alcohol (or another drug of interest). Subsequently, animal research with alcohol has revealed the numerous different neurotransmitter pathways in the brain with which alcohol interacts, leading to numerous alterations in brain function and behavior. Animal research has also facilitated understanding of brain mechanisms involved in maintaining the motivation and desire to drink. In addition, animal research focusing on environmental determinants of alcohol use, such as availability of alternative reinforcers, stress, and ease of access, has provided important information. Some of this work has also focussed on genetic determinants of alcohol preference and alcohol use, and on how genetic factors interact with environmental factors (National Institute on Alcohol Abuse and Alcoholism 1997).

7. Conclusions

As indicated by this necessarily selective review of historical and contemporary research on animal models, such research has made important contributions to our understanding of various forms of human psychopathology and its treatment (see also Classical Conditioning and Clinical Psychology). Some of this research has attempted to develop full-fledged models by uncovering parallels between symptoms, etiology, treatment, and prevention in the animal model and the human disorder (e.g., the learned helplessness model of depression). However, an even greater amount of work exemplifies the mini-model approach, in which only a subset of cardinal features of a human disorder are studied. An important advantage of studying animal models is that through experimental manipulation one can better determine what the critical causal features are than is generally possible in human research. With this background of success, further research with animal models certainly will continue to provide important new insights and information about many areas of human psychopathology and its treatment.

See also: Animal Rights in Research and Research Application; Anxiety and Anxiety Disorders; Anxiety Disorder in Children; Childhood Depression; Depression; Depression, Clinical Psychology of; Genes and Behavior: Animal Models; Panic Disorder; Pavlov, Ivan Petrovich (1849–1936); Spatial Memory Loss of Normal Aging: Animal Models and Neural Mechanisms

Bibliography

Bouton M, Mineka S, Barlow D H 2001 A modern learning theory perspective on the etiology of panic disorder. Psychology Review 108: 4–32
Bowlby J 1980 Attachment and Loss, III: Loss, Sadness and Depression. Basic Books, New York
Coe C, Wiener S, Rosenberg L, Levine S 1985 Endocrine and immune responses to separation and maternal loss in nonhuman primates. In: Reite M, Field T (eds.) The Psychobiology of Attachment and Separation. Academic Press, New York
Darwin C R 1872 Expression of Emotion in Man and Animals. John Murray, London
Fowles D C, Missel K A 1994 Electrodermal hyporeactivity, motivation, and psychopathy: Theoretical issues. In: Fowles D C, Sutker P, Goodman S H (eds.) Progress in Experimental Personality and Psychopathology Research. Springer, New York
Kaufman I C 1973 Mother–infant separation in monkeys: An experimental model. In: Scott J P, Senay B (eds.) Separation and Depression: Clinical and Research Aspects. AAAS, Washington, DC, pp. 33–52
Maser J, Cloninger C (eds.) 1990 Comorbidity in Anxiety and Mood Disorders. American Psychiatric Press, Washington, DC
McKinney W T 1974 Animal models in psychiatry. Perspectives in Biology and Medicine 17: 529–41
Mineka S, Kihlstrom J F 1978 Unpredictable and uncontrollable aversive events. Journal of Abnormal Psychology 87: 256–71
Mineka S, Zinbarg R 1991 Animal models of psychopathology. In: Walker C E (ed.) Clinical Psychology: Historical and Research Foundations. Plenum Press, New York, pp. 51–86
Mineka S, Zinbarg R 1996 Conditioning and ethological models of anxiety disorders: Stress-in-dynamic-context anxiety models. In: Hope D (ed.) Perspectives on Anxiety, Panic, and Fear. 43rd Annual Nebraska Symposium on Motivation. University of Nebraska Press, Lincoln, NE, pp. 135–211
Newman J P 1997 Conceptual models of the nervous system: Implications for antisocial behavior. In: Stoff D M, Breiling J, Maser J D (eds.) Handbook of Antisocial Behavior. Wiley, New York, pp. 324–35
National Institute on Alcohol Abuse and Alcoholism (NIAAA) 1997 Ninth special report to the US Congress on Alcohol and Health. NIH Publication No. 97–4017, Washington
Pavlov I P 1927 Conditioned Reflexes. Oxford University Press, London
Peterson C, Maier S F, Seligman M E P 1993 Learned Helplessness: A Theory for the Age of Personal Control. Oxford University Press, New York
Rodgers R J 1997 Animal models of ‘anxiety’: Where next? Behavioral Pharmacology 8: 477–96
Seligman M E P 1975 Helplessness: On Depression, Development, and Death. Freeman, San Francisco
Watson J B, Rayner R 1920 Conditioned emotional reactions. Journal of Experimental Psychology 3: 1–14
Weiss S R, Uhde T W 1990 Animal models of anxiety. In: Ballenger J (ed.) Neurobiology of Panic Disorder. Wiley-Liss, New York, pp. 3–27
Woodson J C, Minor T R, Job R F S 1998 Inhibition of adenosine deaminase by erythro-9-(2-hydroxy-3-nonyl) adenine (EHNA) mimics the effect of inescapable shock on escape learning in rats. Behavioral Neuroscience 112: 399–409

S. Mineka


Clinical Psychology in Europe, History of

1. Introduction

Europe is characterized by its social, linguistic, and cultural diversity (e.g., Drenth et al. 1990, Poortinga 1996), which is rooted in the very different histories and traditions of its 48 or so countries. Even the geographical or political extent of Europe is a matter for some debate, though it is assumed that the growth of the European Union (currently 15 states, all of them in western Europe) will clarify this in future years. Although psychology as a discipline originated with Wundt’s laboratory in Leipzig, Germany, in 1879, the profession of clinical psychology was considerably slower to develop in Europe than in the USA (Sexton and Misiak 1976, Eysenck 1990, Sexton and Hogan 1992).

The major influence on the development of clinical psychology in many western European countries was the rebuilding of Europe after World War II in the 1940s and 1950s and the development of mental health facilities by European states, as the challenge of mental problems and mental health presented itself to health authorities (Lunt 2000). In eastern European countries clinical psychology developed much later, as these countries emerged in the 1980s from political regimes which were critical of the discipline (e.g., Pawlik 1996). It is only relatively recently that these countries have begun to develop a profession of clinical psychology.

2. The Field of Clinical Psychology

Although there are differences between countries, it is possible to provide a general definition of the field which would be agreeable to all European countries; clinical psychologists may be defined as psychologists who

apply psychology in a clinical context, usually a hospital, medical or community setting, with people (patients or staff) who consider themselves to be in need of a psychological perspective on their lives. In practice, the majority of clinical psychologists contribute to the assessment and treatment of people who see themselves as having psychological problems, such as those with mental health difficulties, but they also work with the handicapped, families, those with learning difficulties, and, more widely, with staff and organisations (Llewelyn 1994).

There are some differences in the practice of clinical psychology, from countries where the majority are employed by the state, such as the United Kingdom, where most clinical psychologists work for the National Health Service (NHS) and there is relatively little private practice, to countries such as Switzerland and Germany, where many clinical psychologists work in private practice, with charges reimbursed by medical insurance companies. Furthermore, there are differences in the dominant activity of clinical psychologists: in many European countries the main activity is psychotherapy, whereas in others there is a broader role which includes assessment and other forms of intervention, and also a role in training and consulting with other staff. While the dominant paradigm informing clinical psychology in the United Kingdom is cognitive behavior therapy, other European countries such as France and Italy historically have been more influenced by psychoanalytic ideas. Although all European countries have been influenced by the USA, this influence has been perhaps most striking in the Scandinavian countries, which embraced the ‘scientist–practitioner’ approach to practice from an early stage.

2.1 Europe

As mentioned, the political face of Europe is developing and changing rapidly (see Lunt 1998). Many of the central and eastern European countries have applied for membership of the European Union, and are working towards criteria which will enable them to join. This will mean an expanded and changed Europe in the not too distant future, bringing together the early members of the European Union and nonmember states, many of the latter in eastern Europe and formerly belonging to the Soviet Union. The countries of Europe constitute a very diverse group that may nevertheless be grouped into broad regions with some commonality in their organization and practice of clinical psychology. For these purposes, at a very broad level it is possible to identify a Nordic group of countries with considerable commonalities in their practice (the Scandinavian countries), Great Britain, which has much in common with education and training in the USA and other parts of the English-speaking world, the German-speaking countries, a southern European group, and an eastern European group. Of course, within these regions, the countries are characterized by their individuality and their diversity, and the different histories that have had a profound influence on the development of clinical psychology.

3. The Early Years of Clinical Psychology

There are a number of models in the field of clinical psychology, a field which emerged in European countries substantially after World War II (see above). It emerged as a recognizable profession at different times in different countries. In the United Kingdom the two World Wars provided a significant impetus to the emergence of this profession, initially through the need to develop psychological tests to recruit suitable personnel, and later, in 1948, through the formation of the NHS, which provided a considerable impetus to the development of this new profession. In these early days, the role of the clinical psychologist was largely that of a laboratory technician administering psychometric and other tests, usually for medical practitioners (Eysenck 1990). However, in the United Kingdom, the development of behavior therapy, in particular under the influence of Hans Eysenck who


established clinical psychology as a profession in England in 1949, led to clinical psychologists developing a therapeutic role, and by the 1960s they had become clinical practitioners in their own right.

In the Nordic countries, also, clinical psychology emerged substantially after World War II, in the 1940s and early 1950s; the Norwegian Psychological Association was founded in 1934, the Danish Psychological Association in 1947, the Swedish Psychological Association in 1955, and the Finnish Union of Psychologists in 1957. The formation of these Psychological Associations reflected the emergence in these countries of a profession of psychology, mainly clinical psychology, which progressed through broad major phases similar to those in the UK; that is, a phase focusing on diagnostic examination and testing of individual patients, a phase focusing on therapeutic work involving mainly psychotherapy, and a phase focusing on indirect work with other professional groups through techniques such as consultation and training.

In Spain, on the other hand, clinical psychology has had a shorter history (Belloch and Olabarria 1994) and is said to have emerged in the 1970s along with fundamental changes in Spanish society. Similarly in Italy, clinical psychology as a professional application emerged substantially in the 1960s, and there remain some tensions over the autonomy and role of clinical psychologists and their relationship with medical practitioners, especially psychiatrists. The Association of Greek Psychologists, founded in 1963, reflects the existence of professional psychology in Greece, although at that time and until very recently, all clinical psychologists received their training overseas, and the profession was very much dominated by the medical profession.

As mentioned above, clinical psychology emerged much later in eastern European countries, and it was not until the 1980s that psychotherapy was fully recognized as a profession (Pawlik 1996). Until recently, nearly all psychologists were employed in state institutions, though in recent years there has been a growth in demand which has resulted in a growth in private practice.

In all European countries, clinical psychologists represent the largest group of psychologists, and experienced a rapid growth in their number between the 1960s and the 1990s with the expansion of mental health provision and the growing awareness of the contribution of clinical psychology to a wide and diverse range of areas of work (EFPPA 1997). Indeed, the 31 Member Associations of EFPPA (see below) represent 150,000 psychologists, the majority of whom are clinical psychologists.

Since the 1990s, in many European countries the emergence of the specialty of health psychology, with a focus on prevention rather than treatment, and the promotion of health rather than a more therapeutic or curative function, has led to attempts to define and differentiate a new field of psychological activity within the mental health field. This has been supported by the WHO commitment to Health for All by the Year 2000; EFPPA has had three Task Forces, focusing respectively on clinical psychology, health psychology, and psychotherapy, which have drawn up a model defining overlapping and separate areas of activity within the health field, with corresponding commonalities and differences in education and training (EFPPA 1997).

4. Training of Clinical Psychologists

Education and training of clinical psychologists varies across European countries though, again, there are major regional groupings. Across Europe, as in other regions of the world, the education and training period has increased, with moves in some European countries towards doctorate training for clinical psychologists, and more demanding requirements in all countries for the internship period. In the United Kingdom, there is a strong commitment to a scientist–practitioner model: ‘the clinical psychologist is first and foremost an “applied scientist” or “scientist–practitioner” who seeks to use scientific knowledge to a beneficial end’ (Marzillier and Hall 1990). This commitment also characterizes the Nordic countries, which have been influenced substantially by the USA, and also by the UK. These countries would sign up to the definition that:

clinical psychologists share several common attributes. They are psychologists because they have been trained to use the guidelines of knowledge of psychology in their professional work. They are clinicians because they attempt to understand people in their natural complexity and in their continuous adaptive transformations … they are scientists because they utilize the scientific methods to achieve objectivity and precision in their professional work. Finally, they are professionals because they render important human services by helping individuals, social groups, and communities to solve psychological problems and improve the quality of life (Kendall and Norton-Ford 1982, p. 4).

This model is also espoused by Spanish clinical psychology, where Belloch and Olabarria (1994) state that clinical psychology training is ‘very similar to that proposed in the 1940s, in the famous Boulder Conference, organised by the APA.’ In France, where there is a strong clinical psychoanalytic tradition, there is less of a commitment to the ‘Boulder’ model, and more of a philosophical or hermeneutic tradition in relation to education and practice.

However, since 1957, when the Treaty of Rome provided the foundation for the European Community (later Union), there have been requirements on individual countries (‘states’) to provide procedures for the mutual recognition of psychologists’ qualifications across national boundaries (see McPherson 1988). Wider moves within the European Union such


as the Bologna Agreement, which was signed by 29 Ministers in 1999 and commits them to greater convergence in terms of university degree structures, mean that even within Europe there is likely to be greater similarity in terms of structures of education and training for clinical psychologists.

5. Professionalization of Clinical Psychology

The period since the 1950s has seen a greater professionalization of clinical psychology in all European countries, with the development of codes of ethics (Lindsay 1996) and increased regulation and laws for clinical psychologists (Lunt 2000) across European countries. These political and professional developments have been supported by EFPPA, which has brought together clinical psychologists from all over Europe to work on the professional aspects of practice at a European level and to support individual European countries seeking to develop their ethical codes, laws protecting the title of psychologist, and education and training in clinical psychology.

6. Organization of Clinical Psychology

It is also possible to trace the history of clinical psychology through its organization in Europe (Gilgen and Gilgen 1987, Lunt 1998). Many European countries founded scientific societies for psychologists in the early twentieth century whose purpose was to foster research and psychological science. At the time of World War II, separate professional associations to meet the needs of professional psychologists, often mainly clinical psychologists, were founded in a number of European countries. These associations had, as a focus, issues concerning professional practice and emerged in some countries as Trade Unions, negotiating salaries and terms and conditions of work for clinical psychologists as well as wider professional issues such as regulation, legislation, and ethical codes. In 1981, at a time when the provisions of the European Community demanded that European member states encourage mobility of professionals across Europe, the European Federation of Professional Psychologists’ Associations (EFPPA) was formed to bring together professional associations in Europe and to collaborate on matters of common professional concern.

EFPPA currently has 31 member associations representing all the countries of the EU, all other countries in western Europe, and a growing number of member associations from central and eastern Europe. The federation provides a unique opportunity for comparison between the practices of different European countries and a forum for discussion and debate of important issues. The formation of EFPPA in 1981, when matters of mobility and mutual recognition were becoming more pressing, was due to a realization by psychologists and psychology associations of member states that a federation would provide a professionally and politically useful way to move forward and to begin to develop common policies in this area. Clinical psychologists, in particular, were faced with the growing prospects of mobility between countries in the European Union, and the implications of the Treaty of Rome that provided the foundation for the European Community in 1957.

As a federation of Professional Psychology Associations representing around 150,000 professional psychologists in Europe, EFPPA spends most of its efforts on clinical psychology and clinical psychologists, who are also the largest group of psychologists within Europe (as in the rest of the world). In many countries psychology has become one of the most popular subjects to study at university, if not the most popular. The majority of students studying psychology aspire to become clinical psychologists, and for this reason, in many European countries, there is an oversupply of qualified practitioners. Many countries now operate a so-called numerus clausus, either at the start of the psychology study or during the study. This controls the numbers in training. In the United Kingdom, where specialist training in clinical psychology is funded by the NHS (where the vast majority of clinical psychologists work), the number of ‘trainee’ posts is strictly limited and is planned according to staffing needs in the different regions of the country. In other countries, where there is a tradition of predominantly private practice in clinical psychology, there are large numbers of qualified psychologists unable to find work. In all European countries, the ratio of female to male students is between 6:1 and 3:1, leaving the profession in danger of becoming an almost feminized profession in the future (Schorr and Saari 1995). There has also been some difficulty in many countries in recruiting students from the range of ethnic groups represented in Europe’s increasingly multiethnic population. This clearly has implications for the clinical treatment of different client groups.

7. The Future

As new fields of practice in the health field evolve, in particular health psychology, there are pressures on clinical psychology. For example, in some countries it has been said that there may no longer be a field of clinical psychology, since there are strong moves towards a broader field of health psychology and a greater focus on preventive work. These newer areas, such as health psychology, neuropsychology, and forensic psychology, are leading to greater specialization within and outside clinical psychology. Although European countries differ in the extent and nature of their specialisms within the health field, there is an increasing trend for specialization and demands for higher qualifications. In one respect, psychologists


working within the health system could be said to be becoming more generic, while on the other hand there are increasing specializations within this field of work.

8. Summary

The brief 50-year history of clinical psychology in Europe has seen an enormous increase in numbers both of students and of practitioners, such that the majority of psychologists are now clinical psychologists. This rapid professionalization has been accompanied by higher qualifications, greater regulation, the development of ethical codes, and all the characteristics of traditional professions. However, as the number of clinical psychologists increases, and the question of mobility across Europe becomes more pressing, there will be increasing attempts to develop more common frameworks and standards for education and practice; the challenge will be to achieve a balance between allowing individual countries their own autonomy, which reflects their differing history and culture (‘subsidiarity’ as it is called), and developing more common agreed frameworks of practice which reflect a possible future ‘federalization’ of Europe.

See also: Clinical Psychology in North America, History of; Freud, Sigmund (1856–1939); Psychiatry, History of; Psychoanalysis, History of; Psychoanalysis in Clinical Psychology; Psychotherapy: Ethical Issues; Psychotherapy, History of: Psychiatric Aspects; Training in Clinical Psychology in the United States: Practitioner Model; Training in Clinical Psychology in the United States: Scientist–Practitioner Model

Contemporary Psychology in Europe. Hogrefe & Huber, Göttingen
Lunt I 1998 Psychology in Europe: challenges and opportunities. The European Psychologist 3(2): 93–101
Lunt I 2000 Psychology as a profession. In: Pawlik K, Rosenzweig M (eds.) The International Handbook of Psychology. Sage, London
Marzillier J, Hall J 1990 What is Clinical Psychology?, 2nd edn. Oxford University Press, Oxford, UK
McPherson F 1988 Psychologists and the EEC. The Psychologist 9: 353–5
Pawlik J 1996 The situation of the psychologists in Eastern European countries today. In: Georgas J, Manthouli M, Besevegis E, Kokkevi A (eds.) Contemporary Psychology in Europe. Hogrefe & Huber, Göttingen
Poortinga Y 1996 Cultural diversity in Europe: Extrapolations from cross-cultural research for professional psychology. In: Georgas J, Manthouli M, Besevegis E, Kokkevi A (eds.) Contemporary Psychology in Europe. Hogrefe & Huber, Göttingen
Pilgrim D, Treacher A 1992 Clinical Psychology Observed. Routledge, London
Schorr A, Saari S (eds.) 1995 Psychology in Europe. Hogrefe, Göttingen
Sexton V S, Hogan J D (eds.) 1992 International Psychology. Views from around the World. University of Nebraska Press, Lincoln & London
Sexton V S, Misiak H (eds.) 1976 Psychology around the World. Brooks/Cole, Monterey, CA

I. Lunt


Clinical Psychology in North America, History of

Clinical psychology is concerned primarily with the
study of psychopathology and with its diagnosis and
Bibliography treatment. It shares this domain with several other
Belloch A, Olabarria B 1994 Clinical psychology: Current status mental health disciplines, including psychiatry, social
and future prospects. Applied Psychology: An International work, nursing, and various types of counseling.
Reiew 43(2): 193–211 Compared to these other disciplines, clinical psy-
Drenth P J D, Sergeant J A, Takens R J 1990 European Perspec- chology is distinctive for its training in research and
ties in Psychology, vol. 2. Wiley, Chichester, UK for its expertise in psychometrics and the behavior
EFPPA 1997 Report to the General Assembly of EFPPA of
therapies. North America played an important role in
Clinical Psychology Task Force. Available from EFPPA
secretariat the emergence of clinical psychology. The field usually
Eysenck H 1990 Clinical psychology in Europe and in the US: dates its origin from the founding of the first psy-
Development and future. In: Drenth P J D, Sergeant J A, chology clinic in 1896 by Lightner Witmer (1867–1956)
Takens R J (eds.) European Perspecties in Psychology. Wiley, at the University of Pennsylvania (Routh 1996).
Chichester, UK, Vol 2 Clinical psychology in the English-speaking parts of
Gilgen A R, Gilgen C K (eds.) 1987 International Handbook of Canada developed in a pattern similar to that seen in
Psychology. Greenwood Press, New York the US (with doctoral training required for inde-
Kendall P C, Norton-Ford J D 1982 Clinical Psychology. Scien- pendent practice) but somewhat later in time. The field
tific and Professional Dimensions. Wiley, New York
developed in French-speaking Canada and in Mexico
Llewelyn S P 1994 Assessment and therapy in clinical psy-
chology. In: Spurgeon P, Davies R, Chapman A (eds.) in a way more resembling that of European countries,
Elements of Applied Psychology. Harwood Academic Pub- with master’s or licenciate-level training required for
lishers, Chur, Switzerland independent practice. The North American Free
Lindsay G 1996 Developing an ethical psychological practice. Trade Agreement (NAFTA) now exerts pressure on
In: Georgas J, Manthouli M, Besevegis E, Kokkevi A (eds.) all three countries and their states and provinces to

coordinate these differences to a greater degree, to permit more freedom of movement to qualified clinical psychologists.

1. The Prehistory of the Mental Health Field

The need for humans to deal with the problems now called mental illness did not emerge suddenly a century ago. It seems reasonable to assume that such problems have existed in some form in every society through all the millennia of human experience. The ancient literatures of India, Egypt, China, Greece, and Rome contain descriptions of disturbed behavior, often interpreted in religious terms as some type of retribution by magical or divine forces. Legal systems as they developed in all of these civilizations necessarily included provisions for seeing to the management of the affairs and property of persons who were temporarily or permanently unable to manage for themselves (Routh 1998).

2. Greek Ideas Concerning Psychopathology

Western concepts of psychopathology have their roots in those of the ancient Greeks, including the writings attributed to the physician Hippocrates (460–377 BC). These Hippocratic writings include terms such as melancholia, mania, paranoia, and dementia, with meanings not all that different from their present ones, albeit with different explanations. For example, in Greek, the word ‘melancholia’ simply means ‘black bile.’ In the Hippocratic theory of the humors, a person suffering from severe mental depression had an overabundance of black bile, a substance thought to be produced by the spleen. Within this system, one aspect of treatment quite reasonably aimed to reduce the amount of black bile by administering a purgative such as hellebore. Another disorder was that of ‘phrenitis,’ literally meaning an inflammation of the mind. This referred to mental disturbance accompanied by fever, and the approach taken was simply to wait for the fever to abate. Though some of these Hippocratic ideas may seem strange to us now, it is worth reflecting why some of them lasted well into the eighteenth century and beyond.

3. Emergence of Psychiatry

Although according to Herodotus (ca. 484–425 BC), medical specialties existed even in ancient Greece, the one we now call psychiatry did not emerge until the late eighteenth century in Europe. The ancients did not conceptualize mental disorders as a separate category but regarded them as being illnesses like any other. When psychiatry did emerge, with the work of such pioneers as Philippe Pinel (1745–1826), Benjamin Rush (1745–1813), and Vincenzo Chiarugi (1759–1820), it was associated with the development of mental asylums or hospitals as separate locations for the care of those with mental derangements. The theory of ‘moral treatment’ that was typical of that time tried to minimize the use of coercive methods such as chaining patients to restrain them and instead insisted that they be treated with kindness and courtesy. It was often found that even some very disturbed patients responded positively to such a regimen.

4. Modern Psychology and the Study of Psychopathology

Long before a formal discipline of psychology existed, people in every society still no doubt reflected upon human experience and behavior. As was the case of Hippocrates in relation to medicine and psychiatry, the influence of ancient Greek philosophers such as Plato (427–347 BC) and Aristotle (384–322 BC) upon our present psychological concepts was pervasive. There is conventional agreement that psychology emerged as a formal academic discipline only in the late nineteenth century in Europe. Wilhelm Wundt (1832–1920) of the University of Leipzig is usually named as the founder of the field, and 1879, the year in which he set up his psychology laboratory there, is celebrated as the key event in its origin. Wundt’s work and that of the other early psychologists often focused on sensory processes, reaction time, and memory. It is also noteworthy that the study of psychopathology was a possible topic of psychological study even in those days. The eminent psychiatrist Emil Kraepelin (1856–1926) was influenced by Wundt’s writings and was interested in psychology. His main motive for becoming a psychiatrist was that this was the only way he could see to make a living while doing psychological research. He later actually studied under Wundt, who encouraged him to keep on with his work combining psychology and psychiatry. Kraepelin set up psychology laboratories at his psychiatric clinics in both Heidelberg and Munich.
Many of the pioneers in psychology in both Europe and the US were trained as medical doctors. In France these included Théodule Ribot (1839–1916), who wrote about diseases of memory and about personality disorders. An important French colleague was Pierre Janet (1859–1947), who studied anxiety, hysteria, and obsessions and developed concepts of dissociation that continue to be influential today. In the US the leading pioneer in psychology, William James (1842–1910), was originally trained in medicine but wrote a psychology textbook that proved to be the most influential of all. In 1896, James gave his Lowell lectures on exceptional mental states, much influenced by the work of Janet. Boston neurologist Morton Prince also became interested in Janet’s writings and published a description of a woman with multiple
personalities who had been his patient. Prince established the Journal of Abnormal Psychology in 1906 and later gave it to the American Psychological Association. Subsequently, in 1926, he established the Harvard Psychological Clinic, which was a research facility rather than one delivering mental health services. The most influential medically trained student of psychology of this time was no doubt Sigmund Freud (1856–1939). Breuer and Freud’s book Studies in Hysteria was published in 1895 and Freud’s book on the interpretation of dreams in 1900. The first international psychoanalytic meeting was held in Salzburg in 1908. In 1909, Freud came to the US for the first and only time.

5. Lightner Witmer and Clinical Psychology

As the above paragraphs make clear, Witmer was hardly the first to suggest that psychologists study psychopathology. Instead, his main contribution was to go beyond that to advocate that psychologists try to help people as well as study them. Witmer had been an undergraduate at the University of Pennsylvania and then for a time, before going to Leipzig to obtain his Ph.D. under Wundt, served as a school teacher. He had as a student a young man with marked difficulty in reading and was able to help the youngster succeed in school and go on to attend college. This turned out to be a formative experience for Witmer. After Witmer had obtained his Ph.D. and returned to his alma mater as a psychology professor, a school teacher named Margaret Maguire asked his advice about one of her pupils with a spelling problem. Witmer reasoned that if psychology was of any practical use, it should be able to be of help in a case of this kind. Thus were the psychological clinic and the field of clinical psychology launched.
Witmer’s clinic worked more with children than with adults and tended to concentrate on academic difficulties such as reading, spelling, or general backwardness in school as opposed to emotional or behavioral problems. His historic forebears are thus not Hippocrates and Pinel but rather eighteenth and nineteenth century French physicians and special educators such as Jacob Pereire (1715–1780) (who taught deaf-mutes to speak), J. M. G. Itard (1775–1838) (who worked with the ‘wild boy’ of Aveyron), and Edouard Seguin (1812–1880) (a physician who devised a ‘physiological method’ of sensory and motor training in an attempt to remediate mental retardation). Witmer used existing laboratory procedures including the Seguin formboard and sensory/motor procedures adapted from Wundt to evaluate the children referred to him and often tried to teach them simple tasks as a part of his diagnostic efforts. In his treatment activities he often collaborated with school teachers, as well as with physicians, thus serving as more of a consultant than doing anything resembling present-day psychotherapy. As a matter of fact, he was little influenced by the activities of the Boston School of psychotherapy that was contemporary with his work, nor later by Freud and his psychoanalytic movement.
Witmer is not remembered for any noteworthy scientific discoveries but rather for his persistence in enacting this new role of the clinical psychologist. At the University of Pennsylvania’s Ph.D. program, he essentially trained most of the first generation of clinical psychologists. He maintained his clinic as a service and training facility and in 1907 began a journal, the Psychological Clinic, to publicize these activities (Witmer 1907).

6. The Binet Test

There is still no consensus among psychologists as to precisely how to interpret its findings. Still, Alfred Binet’s ‘metric scale’ of intelligence (Binet and Simon 1905) may be the most noteworthy piece of technology developed by psychology in its first century. Certainly it had a major impact on the new field of clinical psychology. In fact, before World War II, probably the most characteristic activity of the typical clinical psychologist was the administration of the Binet test and other similar measures (Routh 1994). This was true despite the fact that Lightner Witmer, the founder of the field, was quite critical of the Binet test and used it only as one part of his extensive battery of laboratory procedures.
In the light of how influential his test was, it is interesting to note that Alfred Binet himself was not particularly identified with the field of clinical psychology. Originally trained as a lawyer, Binet became part of the circle around the influential neurologist Jean Charcot at the Salpêtrière in Paris. Much of his psychology was self-taught, through extensive reading at the Bibliothèque Nationale. In France, Binet became known as one of the founders of the entire field of psychology (often characterized worldwide as ‘experimental psychology’) and edited the influential journal, Année psychologique. As is well known, Binet’s successful attempts to devise an intelligence test departed from the conventional approach of using relatively ‘pure’ sensory and motor tasks to use complex work samples of the kinds of things schoolchildren might be expected to know and do.
For some reason, Binet’s new test did not create as much of a stir in his homeland as it did in the US. Psychologist Henry Goddard, who directed the psychology laboratory at the Vineland Training School, in New Jersey, had the Binet test translated and soon confirmed its impressive validity in identifying persons with mental retardation. The use of the Binet spread like wildfire among the early clinical psychologists in the US, beginning with those employed in the field of mental retardation. Goddard founded the first psychology internship in 1908 at Vineland, NJ. Goddard went on to become the first professor of clinical psychology at Ohio State University, like the University of Pennsylvania an important early training center in the field (Routh 1994).
Lewis M. Terman (1877–1956) at Stanford University developed a standardized version of Binet’s test, collected normative data for it, and introduced certain refinements such as the ratio IQ score (originally suggested by Wilhelm Stern of Hamburg). The 1916 Stanford–Binet, as it was called, dominated this field for many years. At about the same time, 1915, Robert Yerkes pointed out the unsuitability of the concept of mental age and of this testing format for use with adults and introduced his own ‘point scale’ as a substitute for it. Yerkes and his colleagues were also responsible for the development of group intelligence tests, the Army Alpha (for those who could read) and Army Beta (for the illiterate), used for mass testing of military recruits during World War I. Another wartime development was Robert S. Woodworth’s Personal Data Sheet published in 1917. This was one of the first rationally developed self-report questionnaires intended to detect neurotic tendencies (Routh 1994).

7. The Child Guidance Center Movement

The development of child guidance centers was another factor that influenced early clinical psychologists in the direction of working with children more than with adults. The first child guidance clinic was the Institute of Juvenile Research, established in 1909 by physician William Healy in conjunction with the juvenile court of Chicago. The idea behind such facilities was that careful clinical study of children engaging in antisocial activities could assist in guiding them away from crime. Healy was joined at first by clinical psychologist Grace Fernald and subsequently by her replacement, Augusta Bronner (Healy and Bronner 1926). The child guidance clinic was the origin of the ‘clinical team’ of psychiatrist, psychologist, and social worker that later spread to other settings. The typical pattern was that the psychiatrist saw the child, the social worker saw the family, and the psychologist did the testing. The child guidance movement was supported by the Harkness family’s philanthropy in the form of the Commonwealth Fund and replicated in many US cities and abroad.

8. The First Clinical Psychology Organization

In 1917, in Pittsburgh, a group of eight clinical psychologists organized themselves into what they called the American Association of Clinical Psychologists (AACP) and invited 48 colleagues to join them (Routh 1994). They were led by J. E. W. Wallin (1876–1969) and Leta Hollingworth (1886–1939). Incidentally, Hollingworth was the first to suggest in 1918 that a person trained in clinical psychology receive a distinctive type of degree, the doctor of psychology. The new organization was viewed by many as divisive and was soon incorporated into the American Psychological Association as its Clinical section. An effort by the same group to introduce procedures for certifying qualified clinical psychologists failed, however.

9. Psychometric Developments

The interwar years were a fertile time for the emergence of various new psychometric procedures, many of which continue to be in use today. For example, in 1921, the Swiss psychiatrist Hermann Rorschach (1884–1922) published his well-known inkblot test. It was brought to the US by a child psychiatrist who taught it to a clinical psychology graduate student at Columbia named Samuel Beck (1896–1976). Beck then proceeded to do his dissertation on this new test and eventually to develop his own system for administering and scoring it. Psychologist Bruno Klopfer (1900–1971), a disciple of Carl Jung (1875–1961), also introduced the Rorschach to the US and developed a separate system for administering and scoring it (see Projective Methods in Psychology). In 1935 the Thematic Apperception Test was introduced by Henry A. Murray (1893–1988) of the Harvard Psychological Clinic, and a colleague. Also in 1935 Edgar A. Doll (1889–1969) introduced the Vineland Social Maturity Scale, an interview-based method involving informants familiar with the person, for assessing the social competence of individuals suspected of mental retardation. In 1939, David Wechsler (1896–1981) published the original version of his Wechsler–Bellevue intelligence test for adults. This was but the first of many Wechsler tests of intelligence and memory. It introduced the use of the deviation IQ, a standard score comparing the individual to age-matched normative subjects. In 1943, psychologist Starke R. Hathaway and psychiatrist J. C. McKinley introduced the first edition of the Minnesota Multiphasic Personality Inventory (MMPI). The MMPI had novel ‘validity’ indicators, and its measures of psychopathology were empirically keyed to psychiatrically defined groups (see Minnesota Multiphasic Personality Inventory (MMPI)).

10. Organizational Activities

In 1937 a new organization known as the American Association of Applied Psychology (AAAP) split off from the American Psychological Association (APA) to provide a home for various professionally oriented groups including the clinical psychologists, who at this same time dissolved the Clinical Section of the APA.
The AAAP began to publish the Journal of Consulting Psychology, which subsequently developed into a high-prestige clinical psychology journal (Routh 1994).
It was also at this time that some preliminary developments in psychology began in other parts of North America. In 1937, for example, the first psychology curriculum was devised at UNAM, the National Autonomous University of Mexico, in Mexico City. In 1939, the Canadian Psychological Association was founded. It had 38 members to begin with, and it has been estimated that there were only 53 psychologists in all of Canada at the time. Needless to say, there were few prewar developments in Canada or Mexico specifically relating to clinical psychology.

11. The Post-World War II Boom in US Clinical Psychology

Clinical psychology expanded so greatly in the US after World War II that many brief historical accounts of the field even consider its development to have begun at that time. The war effort tended to draw everyone into it, either on the battlefield or on the home front. Many psychologists whose interests prior to the war had been strictly in research and in the academic side of the field found themselves assigned to carry out psychological testing or to help medical staff in treating psychiatric casualties. After the war, it was clear that the Veterans Administration (VA) would have to be vastly expanded to deal with the need for residential care, psychotherapy, or at least vocational counseling of some of those returning from military service.
In 1945, the VA and the newly established National Institute of Mental Health in the US came to the APA to ask it to establish a system for accrediting training programs in clinical psychology. The government intended to pour millions of dollars into training such individuals and needed to know which programs were competent to carry this out. In response, APA created a system of accreditation, and for the first time, it began to be possible to say who was a well-trained clinical psychologist and who was not. David Shakow (1901–1981) was the architect of the 1949 conference held in Boulder, Colorado, which ratified what has come to be called the ‘scientist–practitioner’ model of training clinical psychologists (Raimy 1950) (see Training in Clinical Psychology in the United States: Scientist–Practitioner Model).
Postwar clinical psychologists continued in their role as mental testers, but gave more emphasis to the assessment of personality and psychopathology, not just cognitive status. This was the heyday of projective tests, and at least for a time the Rorschach inkblot was an appropriate symbol for the clinical practitioner of psychology. A well-known exemplar of the clinical psychologist as projective tester in this era was David Rapaport (1911–1960), chief psychologist at the Menninger Clinic in Kansas. A two-volume set of books published at this time by Rapaport and coworkers (Rapaport et al. 1945, 1946) established the Rorschach, the TAT, and the Wechsler test to be a ‘full test battery’ for almost a half century to come.
The clinical psychologists of this era were also eager to become full-fledged psychotherapists as well as mental testers. Their route to therapeutic training was blocked to some extent by the American Psychoanalytic Association’s 1938 policy decision (contrary to Freud’s own views) that only psychiatrists were to be trained in psychoanalysis. Clinical psychologists thus became very ingenious in devising ways of becoming therapists. The best known of them was perhaps Carl Rogers (1902–1987), whose original therapy supervisor was the social worker Jessie Taft. Taft, in turn, had received her training from Otto Rank, a psychologist from Vienna who had received orthodox psychoanalytic training there and had been a close colleague of Freud’s. Rogers was successfully assertive in other ways. At one point in his career he was director of a child guidance clinic when such administrative positions were supposed to be held only by physicians. Rogers also was determined to combine his psychology training with his role as a therapist. He was among the first to produce recordings of actual psychotherapy sessions, and was a pioneer in doing controlled research on the outcome of psychotherapy. Rogerian therapy (‘client centered’ or ‘person centered’ therapy, as it was later called) is still practiced and studied both in North America and elsewhere (Routh 1994, 1998) (see Person-centered Therapy).
In 1945, the first state law certifying psychologists for independent practice was passed by Connecticut. By 1977, all states in the US had passed such certification or licensing laws regulating the use of the title ‘psychologist’ or the practice of psychology (Routh 1994).

12. The Behavior Therapy Movement

Some psychologists were of the opinion that clinical psychologists in their professional activities should not simply try to duplicate the activities of psychiatrists. In 1913, John Watson had boldly proclaimed a behavioristic approach to psychology, which was widely influential at least in academic psychology in the US. In a famous paper reporting research carried out under Watson’s supervision, Jones (1924) described the case of ‘Peter,’ whose fear of rabbits she desensitized. Not even Jones herself realized the wider implications of this study at the time, but in the light of events many years later some considered her to have been ‘the mother of behavior therapy’ (see Behavior Therapy: Psychological Perspectives).
The behavior therapy movement progressed not only by following its own agenda but by attacking its
opponents. Psychologist Eysenck (1952) in England thus skeptically reviewed the evidence for the effectiveness of psychotherapy. It was not enough, he noted, to show simply that psychotherapy patients improved. One also needed to consider the rate of spontaneous improvement of patients who did not receive psychotherapy.
In 1962 in Charlottesville, Virginia, a behavior therapy conference was held, sponsored by psychiatrist Joseph Wolpe and psychologists Andrew Salter and Leo Reyna. At the time Wolpe had just published a book on his success in treating patients with phobias using the behavioral method of systematic desensitization. Soon afterward, Lang and Lazovik (1963) published the first controlled study of desensitization, in treating snake phobia. Soon the behavior therapy movement was in full swing, with its own organizations, journals, and many adherents. Sidney Bijou and his colleagues established behavioral treatments based on the research of B. F. Skinner. These came to be known as applied behavior analysis and were especially influential in work with the behavior disorders of children and of those with mental retardation (see Behavior Analysis, Applied).

13. Canadian Clinical Psychology

It was in the 1960s that clinical psychology finally came of age in Canada. It was and is the largest applied specialty in psychology numerically in that country, as it is in the rest of the world. In 1965, the Couchiching Conference basically endorsed the ‘Boulder model’ of scientist–practitioner training for clinical psychology. Some Canadian doctoral programs such as the one at McGill even sought accreditation by the American Psychological Association. There was even something of a boom north of the border. It is said that by 1966, more than half the doctoral psychologists in Canada were either American born or trained in the US. In 1983, the Canadian Psychological Association established its own program of accreditation. The first CPA-accredited doctoral programs in clinical psychology were those at McGill, Concordia, and Simon Fraser Universities. The success of doctoral training in clinical psychology tells only part of the story there, however. In Quebec and some other eastern provinces, the master’s degree was accepted as the entry level of training for independent clinical practice. By 1996, Canada had 88 graduate training programs in professional psychology (including clinical): 57 doctoral and 31 terminal master’s.
Canadian clinical psychology is noted for particular strength in the area of neuropsychology, which built on Canadian strengths in the neurosciences, including the work of neurosurgeon Wilder Penfield at the Montreal Neurological Institute. In academic psychology, the research and writings of Donald Hebb concerning the CNS (conceptual nervous system) were influential. It was at McGill University that Olds and Milner published their famous 1954 paper on the reinforcement of an animal’s behavior by electrical stimulation of its brain. On the clinical side, Ronald Melzack elaborated his theory of gating mechanisms influencing the experience of pain. Brenda Milner explored the role of the hippocampus in semantic memory, including work with her famous patient, ‘H. M.’ This man developed permanent memory deficits after surgery inadvertently destroyed his hippocampus bilaterally. Doreen Kimura documented the left cerebral hemisphere advantage in dichotic listening.

14. Clinical Psychology in Mexico

In 1997, the Division of Clinical Psychology of the APA held its midwinter board meeting in Mexico City, hosted by Juan Jose Sanchez-Sosa, the director of the school of psychology at UNAM, the National Autonomous University of Mexico. Sanchez-Sosa provided his US colleagues with a tour of this school, itself as large as many an entire college campus in the US. The school offers both Master’s and Ph.D. degrees in psychology, but as in much of Europe, these degrees are intended for those headed for academic and research careers. Those who intend to practice psychology, including clinical psychology, in Mexico need only a ‘licentiate’ or diploma to do so, which is awarded after 6 years of what to persons trained in the US seems to be undergraduate training. But students in such a program spend essentially full time on psychology, without the need for a broad liberal arts distribution of courses. This includes a significant amount of practicum experience. The psychology clinic, one of the practicum facilities used at UNAM, features a variety of clinical activities, including psychological testing, psychodynamic therapy, behavior therapy, group therapy, and even biofeedback. Since there is no certification or licensing system beyond the licentiate degree itself, it is difficult to be sure how many of these graduates are practicing clinical psychology in a way parallel to what would be seen in the US or Canada.

15. Independent Practice of Clinical Psychology

Although Lightner Witmer in his clinical work often collaborated with school teachers, physicians, or others, psychologists working in his clinic were never supervised by members of any other profession. This tradition of independent work has continued within the field of clinical psychology, somewhat in contrast with social work and nursing. The post-World War II expansion of the field in the US was primarily in the
public sector, typically VA hospitals, but also child guidance centers and eventually community mental health centers. The large government training grants supporting clinical psychology programs at the time presupposed that the graduates would go to such public sector jobs or teach in colleges and universities. In the 1980s, the Reagan administration made most such training grants a thing of the past (Routh 1994).

David Mitchell, a Ph.D. student of Lightner Witmer, was one of the first individuals to make his living primarily in the private practice of psychology. Eventually, he was joined by many others. After all, the state and provincial laws that developed to regulate psychology after 1945 specified what qualifications were necessary to offer one’s services to the public as a psychologist. Many psychologists trained in Boulder-model Ph.D. programs did no research after graduation, and eventually the idea of training psychologists as practitioners rather than scientist–practitioners emerged. Beginning with the University of Illinois in 1966, a number of programs began to offer the doctor of psychology (Psy.D.) degree rather than the Ph.D. The conference in Vail, Colorado in 1973 officially legitimized such practitioner training for the first time. In fact, a number of nonuniversity-affiliated schools of professional psychology sprang up beginning in the 1970s, many of them offering Psy.D. degrees. Often these programs were supported only by student tuition, and many students assumed substantial loans to finance their education (Routh 1994). No such private school of professional psychology has emerged in Canada (nor in Mexico) (see Training in Clinical Psychology in the United States: Practitioner Model).

As psychologists in private practice emerged in larger numbers, they also became more active politically. The first practitioner became president of the APA in 1977, and within 20 years the first Psy.D. was elected to this position. The practitioners began to dominate both the APA and the Canadian Psychological Association to a greater and greater extent. In response, many academic psychologists retreated to form national organizations of their own that were more research-oriented. Thus, the American Psychological Society was founded in 1988. Similarly, in 1989, academic and research psychologists in Canada founded the Canadian Society for Brain, Behaviour, and Cognitive Science.

Practicing psychologists both in the US and Canada battled psychiatrists for their share of the mental health ‘market.’ Thus, they fought to obtain hospital privileges. They supported ‘freedom of choice’ legislation to become eligible as health providers reimbursable by health insurance companies and Health Maintenance Organizations (HMOs). In 1988, a lawsuit by clinical psychologist Bryant Welch and others forced the American Psychoanalytic Association to begin to admit psychologists for training at its local institutes (Routh 1994). Most recently, a number of practicing psychologists in the US, led by Patrick DeLeon, have been trying to obtain the right to prescribe medications for mental health conditions, so far without much success.

16. The Continued Commitment of Clinical Psychology to Research

Well before the founding of Witmer’s clinic, there was a strong interest on the part of psychology in research on psychopathology, including its diagnosis and treatment. This continues to be the case. In fact, psychologists are far more likely than psychiatrists or those in other fields to be principal investigators on research grants from the National Institute of Mental Health. Such research is international in scope and is a collaborative interdisciplinary enterprise. The turf battles that characterize the marketplace of practice are much less typical in the research arena, where clinical psychologists often cooperate smoothly with experimental psychologists and statisticians as well as with medical colleagues. In research, it is neither possible nor necessary to draw any bright line to show the boundaries between these fields.

The diversity of such research is so great that it would be impossible to cover it in an article as brief as this one. Instead, a few examples must suffice. In 1954, psychologist Evelyn Hooker received her first NIMH grant to study homosexuality. Her work, much of it using the types of projective testing that were typical of clinical psychological work at the time, suggested that homosexuals might be essentially normal psychologically. This research was related to the later decisions to delete homosexuality as a pathological category from the Diagnostic and Statistical Manual of the APA.

In the 1950s, psychologist Leonard Eron began collecting peer-rating data on aggressive behavior in 8-year-old children. The subsequent longitudinal research he and his colleagues did helped establish the fact that aggressive behavior is highly stable well into adulthood. Psychologist Gerald R. Patterson studied aggressive behavior in children using direct behavioral observations, relating it to coercive processes in the parent–child dyad and doing controlled intervention studies showing how it could be reduced.

In the late 1950s, psychologist C. Keith Conners devised a simple teacher rating scale for the assessment of varied types of disordered behavior in school children. Together with child psychiatrist Leon Eisenberg he helped carry out the first controlled studies of the effects of stimulant medications on children’s disruptive behavior (Conners and Eisenberg 1963). Such research formed an important basis of present concepts of Attention Deficit Hyperactivity Disorder (ADHD), the most commonly diagnosed type of child psychopathology and one still typically treated with stimulant medications. By the 1970s, Thomas Achenbach and his colleagues had begun to develop the use of parent, teacher, and self-ratings for


child behavior into the most widely used forms of assessment by mental health workers.

In 1962, Meehl published a classic paper on ‘schizotaxia, schizotypy, and schizophrenia,’ elaborating his concepts as to how genetic factors might be involved in the development of this disorder (Meehl 1962). In 1973, psychologist Holzman and co-workers announced their discovery of smooth pursuit eye-movement difficulties in patients with schizophrenia (Holzman et al. 1973). This neurological symptom proved to be an important ‘trait’ marker in the first-degree relatives of schizophrenics as well, whether or not they manifested any overt psychopathology. In Denmark, psychologist Sarnoff A. Mednick carried out a series of studies using the excellent public registers that characterize that country to do longitudinal, epidemiological studies of schizophrenia, implementing the type of research that Meehl had only been able to imagine.

Problems of the dysregulation of affect and emotion are the most ancient in the field of psychopathology, having been with us since Hippocrates. Beginning in the 1960s, psychologist Richard Lazarus elaborated his concepts of stress, appraisal, and coping and demonstrated experimentally how his subjects’ use of different coping strategies could dampen or heighten their physiological stress (Lazarus 1966). Beginning in the 1970s, Charles Spielberger and his colleagues began the development and validation of measures of state as well as trait anxiety and later of state and trait anger as well. In the 1970s, Martin E. P. Seligman and his colleagues showed how his studies of learned helplessness in dogs could be used to reconceptualize human depression.

In conclusion, clinical psychology has both international and important North American roots. In its first century, it has developed into a viable science and profession, and there is good reason to suppose that its trajectory will continue in the twenty-first century.

See also: Clinical Psychology in Europe, History of; Psychiatry, History of; Psychoanalysis, History of; Psychotherapy, History of: Psychiatric Aspects; Training in Clinical Psychology in the United States: Practitioner Model; Training in Clinical Psychology in the United States: Scientist–Practitioner Model

Bibliography

Binet A, Simon T 1905 A new method for the diagnosis of intellectual level of abnormal persons. Année Psychologique 11: 191–244
Conners C K, Eisenberg L 1963 The effects of methylphenidate on symptomatology and learning in disturbed children. American Journal of Psychiatry 120: 458–64
Eysenck H J 1952 The effects of psychotherapy: An evaluation. Journal of Consulting Psychology 16: 319–24
Healy W, Bronner A F 1926 Delinquents and Criminals: Their Making and Unmaking. Macmillan, New York
Holzman P S, Proctor L R, Hughes D W 1973 Eye tracking patterns in schizophrenia. Science 181: 179–81
Jones M C 1924 The elimination of children’s fears. Journal of Experimental Psychology 7: 382–90
Kraepelin E 1987 Memoirs. Springer, Berlin
Lang P J, Lazovik A D 1963 Experimental desensitization of a phobia. Journal of Abnormal and Social Psychology 66: 519–25
Lazarus R S 1966 Psychological Stress and the Coping Process. McGraw-Hill, New York
Meehl P E 1962 Schizotaxia, schizotypy, and schizophrenia. American Psychologist 17: 827–38
Raimy V C (ed.) 1950 Training in Clinical Psychology. Prentice-Hall, New York
Rapaport D, Gill M M, Shafer R 1945 Diagnostic Psychological Testing, Vol. 1. Yearbook, Chicago
Rapaport D, Gill M M, Shafer R 1946 Diagnostic Psychological Testing, Vol. 2. Yearbook, Chicago
Routh D K 1994 Clinical Psychology since 1917: Science, Practice, and Organization. Plenum, New York
Routh D K 1996 Lightner Witmer and the first 100 years of clinical psychology. American Psychologist 51: 244–7
Routh D K 1998 Hippocrates meets Democritus: A history of psychiatry and clinical psychology. In: Bellack A S, Hersen M (eds.) Comprehensive Clinical Psychology, Vol. 1. Pergamon, New York, pp. 1–48
Witmer L 1907 Clinical psychology. Psychological Clinic 1: 1–9

D. K. Routh

Clinical Psychology: Manual-based Treatment

The practice of psychotherapy has seen dramatic and sweeping changes since the 1950s. One of the most dramatic changes affecting the delivery and dissemination of specific psychotherapeutic services has been the development of detailed explanatory manuals for complex psychological treatments. More recently, client ‘workbooks’ have become available that guide the client through therapeutic direction. The main function of these treatment manuals is to ‘outline the procedures, techniques, and strategies which comprise an acceptable implementation of a given [psychotherapeutic approach]’ (Luborsky and DeRubeis 1984, p. 7) and to make the process of psychotherapy more accessible to clients and clinicians alike. Typically these manuals describe session-by-session strategies for specific problems such as depression or phobias for therapists to follow.

Many changes in the field have provided the impetus for developing specific treatment manuals, including the demand for ‘good and better research,’ in which the therapeutic intervention is specified for all to examine, and the burgeoning of the hegemonic managed behavioral healthcare system. In response to these many demands, the number of manuals has proliferated. Although the psychotherapeutic community has received the majority of manuals positively, some front-line clinicians object to this approach, stating, among other things, that manuals damage the therapeutic relationship and restrict clinical innovation (Addis et al. 1996). Despite these obstacles, treatment manuals appear entrenched in the forefront of clinical psychology.

1. Developmental History of Treatment Manuals

The initial impetus for the development of treatment manuals came from psychotherapy researchers who in the early 1960s began to test broadly the effectiveness of specific treatments in controlled outcome studies. Looking to demonstrate successfully that psychological interventions could withstand rigorous scientific investigation, similar to that of existing pharmacological treatments (Luborsky and DeRubeis 1984), scientist–practitioners realized that they needed treatment tools that would allow for systematic replication and comparison. Wilson (1996) more specifically pointed out that treatment manuals sought to eliminate any ‘error’ associated with ‘clinical judgment’ or intuition that might cause one therapist to behave in a substantially different manner from another. Thus, to study the effectiveness of these therapies, treatments were condensed into manuals that could then be reviewed and used across studies. Many researchers hoped that by utilizing treatment manuals, presented in this fashion, psychological interventions would be able to withstand the methodological constraints of research protocols. More specifically, ‘treatment manuals help support the internal validity of a given study by ensuring that a specific set of treatment procedures exists, that procedures are identifiable, and that they can be repeated in other investigations’ (Dobson and Shaw 1988, p. 675). This is in contrast to the conduct of treatment outcome research prior to manualization, in which specific therapeutic techniques were often not explained and thus could not be compared to other treatments or be replicated by other investigative groups. As a consequence, the use of a treatment manual is currently a prerequisite to receive federal funding for psychotherapy research in many countries.

Another push to develop specific treatment manuals came from the development of the Agency for Health Care Policy and Research (AHCPR) in the United States in 1989. The sole purpose of this Agency was to facilitate identification of the effectiveness of specific strategies for specific disorders, with the aim of increasing the quality and reducing the cost of healthcare (Barlow 1996). One major mechanism of accomplishing this goal was the creation of clinical practice guidelines that explicitly articulate the optimal strategies for assessing and treating a variety of psychological problems (see Psychotherapy: Clinical Practice Guidelines). Treatments recommended in these clinical practice guidelines are typically based on two specific factors: (a) efficacy, or internal validity of the specific treatment, the determination of which is based on the results of a systematic evaluation of the intervention in a controlled setting, and (b) effectiveness, or clinical utility of the treatment, which is based on the feasibility, generalizability, and cost-effectiveness of the intervention actually being delivered in a local setting. Based on these equally important and rigorous bases of evidence, the development of treatment manuals that could produce the necessary evidence was encouraged. As a result, manual-based treatments have been incorporated as one of the major components of evidence-based service delivery (Strosahl 1998).

2. Manualized Treatment in Behavioral Health Care Settings

One of the final steps in the progression of manual-based treatments is their incorporation into managed care treatment settings. Strosahl (1998) points out that these manuals are especially appealing to these settings because they are essentially an easily discernible roadmap for the most appropriate way to implement clinical practice guidelines. Within managed care organizations, where psychotherapy (a) must demonstrate its overall effectiveness, (b) may be limited to a certain number of sessions, and (c) increasingly is delivered by practitioners with less than doctoral degrees, treatment manuals are embraced for their ability to facilitate the delivery of empirically supported treatments at lower costs. Treatment manuals increasingly are also adopted because they aid in the training of master’s level clinicians.

3. Specific Treatment Manuals

The earliest treatment manuals were based on behavioral treatment techniques (Dobson and Shaw 1988). These treatments were the logical outgrowth of behavior therapy’s de-emphasis of therapist variables in favor of specific procedures (Parloff 1998). Nonbehavioral varieties of treatment manuals are currently available, although behavioral and cognitive behavioral techniques tend to predominate. The majority of treatment manuals do not adhere to more traditional psychotherapy models, which tend to emphasize a more individualized theory-based understanding of underlying patient problems (Wilson 1996). Rather, most manuals target the specific diagnostic categories specified by DSM-IV. For example, specific manuals exist for a wide variety of anxiety disorders including panic disorder, generalized anxiety disorder, obsessive-compulsive disorder, post-traumatic stress disorder, and social phobia. Most of


these manuals utilize cognitive and behavioral techniques with an emphasis on identifying and challenging maladaptive cognitions and eliminating behaviors that serve to increase and maintain anxiety. Other manuals with a cognitive behavioral emphasis include treatments for bulimia nervosa, weight reduction, stress reduction, depression, addictive behaviors, and sexual dysfunction.

As noted above, other forms of therapy have been manualized into specific books such as interpersonal psychotherapy for depression (IPT), which stresses the alleviation of destructive interpersonal patterns that are maintaining depressive mood, and Dialectical Behavior Therapy (DBT) for borderline personality disorder, which draws on cognitive-behavioral techniques and eastern philosophy. More recently, important new treatments for bipolar disorder and schizophrenia, featuring family systems therapy directed at emotional lability in these families, have been developed. All of these treatments have received empirical support and are also incorporated into many aspects of clinical practice guidelines.

4. Pros and Cons of Treatment Manuals

One distinct advantage of treatment manuals is that they have been shown to be effective in controlled treatment outcome studies (Barlow and Hofmann 1997, Nathan and Gorman 1998). Thus, when practitioners implement techniques they can do so with the confidence that they are delivering services that have a high probability of success. Furthermore, manuals guide clinicians to use strategies that are spelled out and are clearly discernible.

Wilson (1996) points out that treatment manuals can reduce the errors that might be associated with unrestrained clinical judgment. ‘What is typically overlooked, however, and what research on human judgement so clearly documents, is that individualized clinical judgement can just as easily produce worse results than standardized treatments, by introducing errors and inappropriate strategies that are not part of manual-based treatments’ (Wilson 1996, p. 302). Manuals, however, outline specific techniques to be used in each session, which have been created by systematic investigation, rather than relying upon clinical judgment to choose a session topic.

Due to their structure, treatment manuals may also facilitate a more highly focused and efficient therapy. With a limited number of sessions and specific goals and strategies outlined for each of these sessions, Wilson (1998) suggests that this more focused approach may actually lead to a more active engagement in therapy for patients.

Treatment manuals also make it easier to train and supervise therapists in specific clinical techniques and strategies (Calhoun et al. 1998). By providing nascent clinicians and supervisors with specific guidelines and techniques to monitor, treatment manuals streamline the learning process and delineate a specific set of therapeutic skills to be learned. This in turn may lead to a greater aptitude and ability for learning therapeutic techniques in general, and a larger armamentarium of clinical skills.

Treatment manuals, as structured interventions, can also help clinicians deliver therapy in a brief and in many cases non-traditional format (Craske et al. 1995). For example, even a very brief and unstructured intervention for patients presenting to an emergency room with panic attacks may be effective if delivered early enough to less severe patients (Swinson et al. 1992). Considering the high prevalence rates of clinical and subclinical mental disorders in primary care settings (Fifer et al. 1994), such interventions are clearly cost effective.

Finally, utilization of treatment manuals can promote innovation in the delivery of clinical services. That is, when highly specified treatments are delivered, and assuming appropriate outcome measures are collected, clinical administrators can determine when a specific treatment is working and for whom. In the case of failures, either individually or systematically, it will then be clear that innovations to the treatment program are needed and should be incorporated and evaluated.

Despite these many advantages, a sizeable segment of practicing clinicians object to the notion of treatment manuals, pointing out the many disadvantages in the delivery of what has been called, derisively, robot-like therapy (Parloff 1998). For example, dissenters have argued that randomized controlled trials, utilized to validate most manualized treatments, exclude patients with co-morbid diagnoses, among other constraints, and that the results of these types of research studies inherently have limited applicability to ‘real-world’ settings (Bologna et al. 1998). These same individuals note that highly structured, manualized psychological interventions used in clinical trials cannot generalize to practice settings where psychotherapy is often applied more flexibly and adapted to meet the needs of the patients, who may present with multiple problems (Barlow 1996). This set of objections is based on the fundamental idea that the scientific method, as it is known, is incapable of proving the effectiveness or ineffectiveness of psychotherapy, and that some alternative methodological approach to psychotherapy research is preferable. These are important criticisms that cannot be ignored. But Wilson (1998) outlines how there are significant and important differences between the use of treatment manuals in therapy protocols and their use in clinical practice, with the latter context allowing for greater flexibility. Furthermore, the beginning of much needed research in this area suggests generalizability of manualized treatments to front line clinical settings despite high co-morbidity and a diverse population base (Barlow et al. 1999).


Other clinicians bemoan that structured treatment manuals will compromise the integrity of the therapeutic relationship and thus interfere with the therapy process (Garfield 1996). For example, many of these clinicians fear that manualized therapy may reduce the therapy process to mechanized and robotic delivery of techniques, thereby eliminating the importance of the therapist as a clinical tool. Further, they fear that innovation and the development of unique and creative strategies will stagnate under reliance on prescribed treatment strategies. Parloff (1998) summarized these sentiments by stating that many saw the reliance on actuarial decision making over honed clinical judgment as ‘perverse.’ Many clinicians also feel hindered by having to adhere to a specific set of goals for a given session, and prefer to rely on their own judgment to guide them. However, Wilson (1996) notes, ‘if clinical artistry is taken to connote such necessary therapeutic elements as developing a therapeutic relationship and engaging patients in the change process, then treatment manuals do not obfuscate it; rather they demand it’ (p. 305). It can also be assumed that even though the therapist utilizing a treatment manual is encouraged to deliver the techniques outlined, within this delivery they are also encouraged to rely on their own personal skills to convey the therapy to the patient (Dobson and Shaw 1988). Finally, the data indicate that therapists engaged in manualized treatments in clinical practice can form strong alliances (Addis et al. 1996). Simply because the therapist has to spend a segment of time conveying specific therapeutic techniques does not mean that they are not building alliances with their patients. Furthermore, a great deal of time is allotted in manual-based therapies to emphasize alliance building. Examples include such strategies as identifying client and therapist expectations for treatment and eliciting client feedback (Addis et al. 1996).

Criticisms of treatment manuals also encompass the belief that manuals may undermine successful case formulation. That is, many clinicians argue that manual-based treatments will be less effective because these approaches assume that all individuals with a given disorder are uniform and have the same symptoms for the same reasons (Wilson 1996). As such, these techniques will ‘miss the boat’ for a certain subset of patients and will attempt to treat them with techniques that are not applicable. The fact that manuals do not allow for clinical exploration that might lead to the discovery of what works for patients with dissimilar etiology is a commonly expressed concern that may lead to diminished treatment innovation. In fact, treatment manuals do emphasize the current symptom picture over past developmental processes and idiosyncratic etiology. But research has yet to suggest that therapeutic attention to past developmental processes contributes in a substantial way to the amelioration of the disorder.

Many of these criticisms of manualized treatment have some validity and will be grist for the research mill. Research on the effectiveness or external validity of therapeutic interventions will be the ground on which these competing ideas will be evaluated and tested.

5. The Future of Structured Treatment Manuals

The science and practice of psychotherapy underwent comprehensive shifts in the 1990s, none arguably more central than the proliferation of semistructured treatment manuals. In this era of treatment guidelines, best practice algorithms, and behavioral healthcare management organizations, treatment manuals increasingly are being embraced. Despite objections and lingering doubts, treatment manuals have many merits, with demonstrated efficacy the principal attribute. As psychotherapeutic services become more integrated with other forms of healthcare in the twenty-first century, treatment manuals with demonstrated outcomes may become the preferred method of nondrug interventions, and ‘formularies’ of effective psychological treatments may begin to appear in mental health services or managed behavioral healthcare organizations. In this context, research will progress to address the reasonable concerns that have arisen regarding the implementation of these strategies.

See also: Psychological Treatments, Empirically Supported; Psychological Treatments: Randomized Controlled Clinical Trials; Psychotherapy: Case Study; Psychotherapy: Clinical Practice Guidelines; Psychotherapy: Ethical Issues; Psychotherapy Process Research

Bibliography

Addis M E, Wade W A, Hatgis C 1999 Barriers to dissemination of evidence based practices: Addressing practitioners’ concerns about manual based psychotherapies. Clinical Psychology: Science and Practice 6(4): 430–41
Barlow D H 1996 Health-care policy, psychotherapy research, and the future of psychotherapy. American Psychologist 51(10): 1050–58
Barlow D H, Hofmann S G 1997 Efficacy and dissemination of psychological treatments. In: Clark D M, Fairburn C G (eds.) Science and Practice of Cognitive Behaviour Therapy. Oxford University Press, Oxford, UK, pp. 95–117
Barlow D H, Levitt J T, Bufka L F 1999 The dissemination of empirically supported treatments: A view to the future. Behaviour Research and Therapy 37: S147–62
Calhoun K S, Moras K, Pilkonis P A, Rehm L P 1998 Empirically supported treatments: Implications for training. Journal of Consulting and Clinical Psychology 66: 151–62
Craske M G, Maidenberg E, Bystrisky A 1995 Brief cognitive-behavioral versus non-directive therapy for panic disorder. Journal of Behaviour Therapy and Experimental Psychiatry 26: 113–20


Dobson K S, Shaw B F 1988 The use of treatment manuals in cognitive therapy: Experience and issues. Journal of Consulting and Clinical Psychology 56(5): 673–80
Fifer S K, Mathias S D, Patrick D L, Mazonson P D, et al. 1994 Untreated anxiety among adult primary care patients in a health maintenance organization. Archives of General Psychiatry 5: 740–50
Garfield S L 1996 Some problems associated with ‘validated’ forms of psychotherapy. Clinical Psychology: Science and Practice 3(3): 218–29
Luborsky L, DeRubeis R J 1984 The use of psychotherapy treatment manuals: A small revolution in psychotherapy research style. Clinical Psychology Review 4: 5–14
Nathan P E, Gorman J M 1998 A Guide to Treatments that Work. Oxford University Press, New York
Parloff M B 1998 Is psychotherapy more than manual labor? Clinical Psychology: Science and Practice 5(3): 376–81
Strosahl K 1998 The dissemination of manual-based psychotherapies in managed care: Promises, problems, and prospects. Clinical Psychology: Science and Practice 5(3): 382–6
Swinson R P, Soulios C, Cox B J, Kuch K 1992 Brief treatment of emergency room patients with panic attacks. American Journal of Psychiatry 149: 944–6
Wilson G T 1996 Manual-based treatments: The clinical application of research findings. Behavior Research and Therapy 34(4): 295–314
Wilson G T 1998 Manual-based treatment and clinical practice. Clinical Psychology: Science and Practice 5(3): 363–75

D. H. Barlow and K. A. I. Greene

Clinical Psychology: Validity of Judgment

A large body of research exists on the validity of judgments made by mental health professionals (Garb 1998). Most of the studies have been conducted by clinical psychologists, counseling psychologists, and psychiatrists, but important studies have also been conducted by neuropsychologists, social workers, and sociologists. The studies describe how well mental health professionals perform on a range of tasks, e.g., how well they make diagnoses and treatment decisions.

Research on mental health professionals should be of interest to the general public. Many of the studies can help us to understand important social issues, e.g., the occurrence of race bias, gender bias, and other types of biases. Also, a large number of the studies bear on questions that are important for our justice system. These questions include: should mental health professionals be allowed to testify as expert witnesses? (see Expert Witness and the Legal System: Psychological Aspects). Are mental health professionals able to make accurate predictions of violence? Are they able to make accurate decisions regarding child abuse and domestic violence? Do they make appropriate judgments when petitioning to have indi-

the empirical findings should also be of particular interest to individuals who devise health care policy and decide what services to reimburse. For example, there is a new and intense controversy over the validity of judgments made by clinicians who use the Rorschach Inkblot Test (Wood et al. 1996) (see Projective Methods in Psychology). This controversy is well known within the field of clinical psychology, but is not yet well known by people in other professions.

There is another group that should be interested in the results from studies on mental health professionals. Consumers of mental health services should be especially interested in the results of studies on clinicians. If one includes family members of consumers of mental health services and potential consumers of mental health services, then one can conclude that virtually everyone should be interested in this research.

Overall, a huge number of studies (over 1,000) have been published on the validity of clinical judgments and most of the studies have been well designed. Yet, they are not well known outside of mental health fields. In this article, highlights of the research will be described. Topics include: (a) assessment of personality and psychopathology, (b) diagnosis, (c) case formulation, (d) prediction of behavior, and (e) treatment decisions. In general, mental health professionals are able to make reliable and valid judgments for some tasks, but not for others (Garb 1998).

1. Assessment of Personality and Psychopathology

Mental health professionals almost always evaluate a client’s personality traits and psychiatric symptoms. Some clinicians also evaluate a client’s defense mechanisms. Personality traits can include characteristics like narcissism and dependence, while psychiatric symptoms can include things like hallucinations or panic attacks. A defense mechanism, as defined by psychoanalytic theory, is an unconscious strategy that protects the ego from anxiety. For example, a client may push impulses and thoughts that are unacceptable to the ego into the unconscious.

Results on reliability and validity vary for the tasks of describing psychiatric symptoms, personality traits, and defense mechanisms. Mental health professionals are often good at describing psychiatric symptoms. This should not be surprising. Clients are often able to report if they have had hallucinations, panic attacks, or other symptoms. On the other hand, inter-rater reliability varies widely for describing personality traits, and it is poor for describing defense mechanisms. Perhaps this is because these tasks require clinicians to make more inferences. Given the poor results for describing defense mechanisms, it is important to point out that many clinicians do not perform this task. Psychodynamic therapists regularly evaluate
viduals committed to psychiatric hospitals? Finally, clients’ defense mechanisms, but other clinicians (e.g.,

cognitive behavior therapists) rarely concern themselves with this task (see Psychoanalysis in Clinical Psychology).

2. Diagnosis

Diagnostic classification systems have been constructed to help clinicians make diagnoses. The most commonly used classification system in the United States is the American Psychiatric Association’s Diagnostic and Statistical Manual of Mental Disorders, 4th edition (1994, generally referred to as DSM-IV). This classification system contains specific and explicit criteria for making diagnoses (see Mental and Behavioral Disorders, Diagnosis and Classification of).

Clinicians’ diagnoses are reliable and moderately valid, but only when they attend to diagnostic criteria. Unfortunately, there is evidence that a significant number of clinicians do not adhere to criteria when making diagnoses. That is, many clinicians may think that they are making diagnoses according to the DSM-IV criteria, but they do not refer to the criteria when making a diagnosis, and examination of their diagnoses reveals that they are not made in accordance with the DSM-IV criteria. This can lead to different types of problems including race bias, gender bias, age bias, and the underdiagnosis or overdiagnosis of some mental disorders. These problems are described below.

The most widely replicated finding for race bias involves the differential diagnosis of schizophrenia and psychotic affective disorders. African-Americans and Puerto Rican Hispanics with bipolar affective disorder (formerly called manic depression) are more likely than Whites with bipolar affective disorder to be misdiagnosed as having schizophrenia. For this reason, Black patients and Puerto Rican Hispanic patients are more likely than White patients to be overmedicated with neuroleptic medications, and their depressive symptoms are more likely to be untreated.

The most widely replicated finding for gender bias involves the differential diagnosis of histrionic personality disorder and antisocial personality disorder. When different groups of mental health professionals have been given identical case histories except for the designation of gender, clinicians have been more likely to diagnose women as having a histrionic personality disorder and men as having an antisocial personality disorder. Histrionic personality disorder is characterized by overly dramatic, attention-seeking behaviors (e.g., uncomfortable when not the center of attention), and antisocial personality disorder is characterized by antisocial behaviors (e.g., habitual lying, having no regard for others, showing no remorse after hurting others) (see Personality Disorders).

The most widely replicated finding for age bias involves the differential diagnosis of organic impairment and depressive disorder. Compared to young and middle-aged patients, elderly patients are more likely to be diagnosed as having organic impairment and they are less likely to be diagnosed as having a depressive disorder, even when all of the clients are described by the same case history except for the designation of age. Of course, someone diagnosed as having organic impairment will be less likely to receive psychotherapy and antidepressant medicine.

It should be noted that even when clinicians attend to diagnostic criteria and apply them the same way for different groups of patients (e.g., for African-American and White patients), diagnoses can be biased (Widiger 1998). For example, diagnoses can be biased because diagnostic criteria, not the cognitive processes of clinicians, are biased. Diagnostic criteria are said to be biased if they are more valid for one group than for another (e.g., if diagnostic criteria for a particular disorder are more valid for males than for females). In general, little is known about whether diagnostic criteria are biased.

Research has also described other types of errors. Mental health professionals disagree strongly over whether dissociative identity disorder (formerly called multiple personality disorder) is overdiagnosed or underdiagnosed. There is also a controversy over whether attention-deficit/hyperactivity disorder (ADHD) is overdiagnosed. Diagnoses of ADHD have doubled in frequency in recent years (see Attention-deficit/Hyperactivity Disorder (ADHD)), while diagnoses of dissociative identity disorder have increased 10-fold. Finally, research suggests that clinicians underdiagnose mental disorders in the mentally retarded; they also underdiagnose mental disorders (e.g., major depression) in terminally ill patients; they frequently underdiagnose personality disorders; they underdiagnose substance abuse in psychiatric patients; and they underdiagnose mental disorders in individuals admitted to substance abuse treatment programs.

3. Case Formulations

The most difficult task for mental health professionals involves making causal judgments. When making causal judgments, clinicians try to explain the causes of their clients’ behaviors and symptoms. Inter-rater reliability and validity for this task are often poor. This has been true when psychodynamic clinicians described clients using psychoanalytic theory and when behavior therapists conducted functional analyses to understand the relations that exist between causal variables and behavior problems. One study on reliability will be described in detail. In this study (DeWitt et al. 1983), case formulations were made by two teams of psychodynamically trained clinicians after viewing videotapes of intake evaluations that lasted from 60 to 90 minutes. Both teams were composed of three clinicians. Descriptions were made by consensus. Each team was to, ‘Define the basic
neurotic conflict(s) that lie at the core of the patients’ difficulties. Include the kind of stress to which the patient is vulnerable’ (p. 1124). Each team wrote formulations for 18 adults who sought psychotherapy for pathological grief reactions after the death of a parent. With regard to the results, agreement between the two teams was poor. Typically, they mentioned different symptoms, emphasized different conflictual areas, and formulated different cause–effect explanations.

4. Prediction of Behavior

With regard to predicting behavior, mental health professionals have been able to make reliable and moderately valid judgments. For the prediction of suicide and the prediction of violence, inter-rater reliability has ranged from fair to excellent. However, validity has been poor for the prediction of suicidal behavior (suicidal behavior refers to suicide gestures, suicide attempts, and suicide completions). For example, in one study (Janofsky et al. 1988), mental health professionals on an acute psychiatry in-patient unit interviewed patients and then rated the likelihood of suicidal behavior in the next seven days. They were not able to predict at a level better than chance. In contrast to the prediction of suicidal behavior, predictions of violence have been more accurate than chance. In fact, though the long-term prediction of violence is commonly believed to be a more difficult task than the short-term prediction of violence, both short- and long-term predictions of violence have been moderately valid. Finally, the validity of clinicians’ prognostic ratings has rarely been studied, but there is reason to believe that clinicians’ prognostic ratings for patients with schizophrenia may be too pessimistic (patients with schizophrenia may do better than many clinicians expect).

5. Treatment Decisions

Perhaps the most important task is to make treatment decisions. Unfortunately, many problems exist with the treatment decisions made by clinicians. Problems that will be described include: poor inter-rater reliability, nonconformance with ethical and legal issues, and the use of controversial techniques.

Inter-rater reliability has been poor for several judgment tasks. Two examples will be given. In one study (Felton and Nelson 1984), six clinical psychologists, all trained in behavioral assessment, were asked to formulate specific treatment plans for three clients. When psychologists conducted interviews, treatment plans were in agreement only 59 percent of the time. When psychologists conducted interviews and also used questionnaires and role-playing sessions to collect additional assessment information, treatment plans were in agreement only 62 percent of the time. Inter-rater reliability has also been poor when psychiatrists have made decisions about whether a patient with major depressive disorder should receive psychotropic medicine, electroconvulsive treatment, and/or psychotherapy (e.g., Keller et al. 1986). Differences in the amount and type of treatment could not be explained by variation in the clinical characteristics of patients. The best predictor of treatment was medical center, indicating that the type of treatment severely depressed patients receive depends largely on which hospital they go to.

For some decision tasks, many mental health professionals make decisions that are not in agreement with legal and ethical principles. For example, several studies have reported that many psychiatrists do not make appropriate judgments about committing a patient to a hospital. For example, in one study (Bagby et al. 1991), 26 percent of the individuals depicted as meeting the criteria for involuntary hospitalization were not recommended for commitment. At the same time, 20 percent of those who did not meet the legal standard for commitment were recommended for involuntary hospitalization.

Another example will also illustrate how many mental health professionals make decisions that are not in agreement with legal principles. Mental health professionals are mandated by law to report child abuse, but a widely replicated finding in clinical judgment research is that large numbers of clinicians do not report child abuse (e.g., Brosig and Kalichman 1992). This may sometimes occur because they are unfamiliar with mandatory reporting laws, and it may sometimes occur because they believe it will interfere with treatment.

Some mental health professionals use treatment interventions that are controversial. For example, one controversy involves the recovery of lost memories. Some clinicians will inform their clients that they believe their emotional problems are due to having been abused. They may say this even when the clients have no memory of having been abused. They may believe the client has been abused because of some of the client’s symptoms, e.g., low self-esteem or sexual dysfunction. Clinicians may then use a variety of techniques to help clients ‘remember’ having been abused. For example, they may tell the clients that they were abused and repeatedly ask them to remember the events. Unfortunately, these interventions may lead a client to ‘remember’ an episode of abuse that never occurred.

6. Discussion

Though biases and errors sometimes occur, it is important to note that clinicians’ judgments are frequently reliable and valid. For example, diagnoses are moderately useful when they are made by clinicians
who attend to the DSM-IV criteria. When made appropriately, diagnoses can inform us about the nature, course, outcome, and recommended treatment for patients. It is also important to note that many clinicians use treatment interventions that have been supported by empirical research. Empirically validated treatment interventions used by psychologists are listed in the Winter 1995 issue of The Clinical Psychologist and in subsequent issues of the journal.

See also: Clinical versus Actuarial Prediction

Bibliography

American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders DSM-IV, 4th edn. American Psychiatric Association, Washington, DC
Bagby R M, Thompson J S, Dickens S E, Nohara M 1991 Decision-making in psychiatric civil commitment: an experimental analysis. American Journal of Psychiatry 148: 28–33
Brosig C L, Kalichman S C 1992 Clinicians’ reporting of suspected child abuse: a review of the empirical literature. Clinical Psychology Review 12: 155–68
DeWitt K N, Kaltreider N B, Weiss D S, Horowitz M J 1983 Judging change in psychotherapy: reliability of clinical formulations. Archives of General Psychiatry 40: 1121–8
Felton J L, Nelson R O 1984 Inter-assessor agreement on hypothesized controlling variables and treatment proposals. Behavioral Assessment 6: 199–208
Garb H N 1998 Studying the Clinician: Judgment Research and Psychological Assessment. American Psychological Association, Washington, DC
Janofsky J S, Spears S, Neubauer D N 1988 Psychiatrists’ accuracy in predicting violent behavior on an inpatient unit. Hospital and Community Psychiatry 39: 1090–4
Keller M B, Lavori P W, Klerman G L, Andreasen N C, Endicott J, Coryell W, Fawcett J, Rice J P, Hirschfeld R M A 1986 Low levels and lack of predictors of somatotherapy and psychotherapy received by depressed patients. Archives of General Psychiatry 43: 458–66
Widiger T A 1998 Invited essay: Sex biases in the diagnosis of personality disorders. Journal of Personality Disorders 12: 95–118
Wood J M, Nezworski M T, Stejskal W J 1996 The comprehensive system for the Rorschach: A critical examination. Psychological Science 7: 3–10

H. N. Garb

Clinical Treatment Outcome Research: Control and Comparison Groups

The essential paradigm for determining if a treatment is effective is to compare it with something else. However, the issue of what that ‘something else’ (a control or comparison group) should be in any given clinical trial is quite complicated and has enormous theoretical and practical implications. This article will briefly review the role of control and comparison conditions in clinical treatment outcome research and describe advantages and disadvantages of several different types of control and comparison designs. For readers interested in more detailed review of this important area, the classic texts by Kazdin (1980, 1998) are excellent sources.

1. Background

The randomized controlled trial is the sine qua non of treatment efficacy research (see Psychological Treatments: Randomized Controlled Clinical Trials). A given treatment cannot be considered ‘empirically validated’ (see Chambless and Hollon 1998) unless trials purporting to establish its effectiveness have employed at least two fundamental methodological features: (a) a credible control condition, and (b) random assignment to the experimental and control conditions. These critical features offer enormous advantages in evaluating a treatment’s efficacy. For example, random assignment of participants to the experimental vs. control condition is essential in ruling out alternate explanations of findings, including regression to the mean, maturation, history, effects of repeated testing, and selection bias (Kazdin 1998). The selection of an appropriate comparison or control group also is essential in determining, all other things being equal, if a treatment is effective for a given disorder, and whether it has benefits relative to other treatments for the disorder, as well as how it may achieve its effects and what its ‘active ingredients’ may be.

2. What Should Control Groups Control for?

The historical paradigm for control groups in behavioral research is the placebo control for pharmacotherapy efficacy trials. Placebo (I shall please) controls provide a means of evaluating a pharmacotherapy’s efficacy compared with those ‘placebo’ or ‘nonspecific’ effects which may occur simply because participants believe they are receiving a credible treatment (Frank 1961). Placebo effects in pharmacotherapy trials, which may be associated with substantial and durable improvement, may occur because of participants’ expectations of improvement, demand characteristics, the instillation of hope, the provision of support and education, and other nonspecific effects that are associated with simply being in treatment. Thus, because (a) the pill (or, sometimes, injection) in placebo control conditions is medically inert, and (b) nonspecific elements are assumed to be comparable in the ‘active medication’ and pill
placebo conditions, differences in outcome are assumed to be due to the unique ‘active ingredients’ of the experimental medication.

However, the issue of what constitutes an appropriate ‘placebo’ condition (and even if the term ‘placebo’ is appropriate for control conditions in behavioral research) in psychotherapy efficacy research is much more complex and has been the subject of much controversy for many years (see Critelli and Neumann 1984, O’Leary and Borkovec 1978, Parloff 1986, Wilkins 1986). Soon after Rosenthal and Frank (1956) declared that the only adequate control condition in psychotherapy research was a placebo group, several problems with this strategy were raised. Defining a ‘psychologically inert’ approach that is an appropriate ‘placebo’ control for behavioral research raised a number of conceptual, practical, and ethical concerns (O’Leary and Borkovec 1978).

Conceptually, defining an appropriate nonspecific control condition for psychotherapy is difficult because many of the ‘nonspecific’ elements that are common to many psychotherapies (e.g., formation of a therapeutic alliance, provision of support and education, and the instillation of hope) are not inert, in that they are associated with legitimate psychological processes of change and have been shown to have powerful effects on psychological conditions (O’Leary and Borkovec 1978). Moreover, Parloff (1986) noted that referring to these ‘common elements’ or ‘nonspecific elements’ as placebos suggests that these powerful mechanisms of change are somehow spurious.

Practically, it can be difficult to conceive of a psychologically inert approach (e.g., an approach that has no theoretically supported rationale for influencing the behavior in question) that is sufficiently credible to participants (O’Leary and Borkovec 1978). This is particularly difficult in the case of lengthy trials where it is essential to retain participants in those conditions over periods of several weeks or months. Furthermore, there has been little agreement on what constitutes an acceptable ‘placebo’; thus, there is no single, widely recognized or standardized placebo condition. This makes comparison of effect sizes (a standardized estimate of the magnitude of the effect of the experimental condition relative to the control) across studies very complex (Parloff 1986). This can pose problems, for example, in meta-analyses aimed at comparing the effect sizes for different types of behavioral approaches for a given disorder.

Ethically, for many treatment-seeking populations it is highly questionable whether providing a placebo, a treatment that is expected to have no effect on the presenting problem, can be justified. Furthermore, many placebo conditions, in presenting a credible rationale but no interventions that are expected to improve the participant’s problem, are also inherently deceptive (Kazdin 1980, O’Leary and Borkovec 1978).

Given these issues, a variety of different control conditions have been proposed for psychotherapy research. Several of the most common will be described below.

3. Types of Control Groups

3.1 No-treatment Controls

This strategy compares the experimental approach to no treatment, addressing the question ‘Does this treatment work better than no treatment at all?’ In this paradigm, participants are assessed, randomized to treatment condition, and reassessed after completion of treatment (or after an equivalent length of time for the no-treatment control group).

3.1.1 Advantages. No-treatment controls are generally seen as the ‘minimal’ or basic standard for evaluating the effectiveness of an intervention (Chambless and Hollon 1998). No-treatment control conditions are sometimes referred to as assessment-only controls, as they control for the effects of the study assessments and the passage of time. Thus, they are useful in evaluating conditions that have a high likelihood of improving without intervention (e.g., spontaneous remission) or when the natural history of a disorder is not well established.

3.1.2 Disadvantages. There are a number of problems with no-treatment control conditions. No-treatment controls do not control for the effects of participant expectancies, ‘common elements’ or nonspecific effects, or time spent in treatment. Ethical and practical issues arise when no-treatment controls are used with populations seeking treatment for a particular condition, and particularly when the condition is life-threatening or associated with significant medical, psychosocial, or other problems. Furthermore, withholding treatment from individuals who request it raises ethical issues, particularly for those disorders where effective alternative treatments have been demonstrated to exist. No-treatment controls may also raise practical issues, because individuals assigned to the no-treatment condition may seek alternative therapies or become demoralized and drop out of the study, particularly for more lengthy trials. This can lead to problems associated with confounding of conditions or differential attrition.

Some of the practical and ethical concerns associated with no-treatment control conditions can be avoided in designs where the experimental treatment is ‘added on’ to a standard treatment program and the control group receives the standard treatment only. Thus, all participants would, at minimum, receive the standard treatment. This design addresses the question, ‘Does the addition of Intervention X improve
outcomes over standard treatment for this disorder?’ and avoids many ethical and practical problems. However, this strategy still does not control for time spent or nonspecific effects.

3.2 Wait List/Delayed Treatment Controls

This strategy is similar to the no-treatment control approach, but differs in that some form of treatment is offered to those assigned to the no-treatment control condition after completion of study assessments. Thus, after completing all post-treatment assessments, participants originally assigned to the delayed-treatment condition may be given the option of no further treatment or receiving the experimental treatment or an alternative approach.

3.2.1 Advantages. Like the no-treatment control condition, this strategy is one that addresses the question, ‘Is Treatment X effective compared with no treatment?’ and can help clarify the effects of the experimental intervention with respect to the natural history of the disorder. Provided that participants in the delayed treatment control are adequately monitored and the disorder of interest is one where the likelihood of clinical exacerbation and negative consequences is minimal, this strategy avoids some of the ethical problems associated with withholding treatment from individuals who need or request it. Some prospective participants who might not accept randomization to a no-treatment control condition may be more likely to accept randomization to a delayed treatment control.

3.2.2 Disadvantages. Like the no-treatment control strategy, delayed treatment controls do not control for differences in nonspecific effects of treatment, including expectancies, demand characteristics, attention, and so forth. Moreover, evaluation of long-term effects of treatments is difficult with delayed treatment controls, as provision of treatment to the participants in the delayed-treatment condition may obfuscate determination of long-term effectiveness of the treatment in follow-up evaluations. This can, of course, be avoided by waiting to provide treatment to the delayed-treatment condition until after the completion of all follow-up assessments, but this may in turn result in the same practical and ethical problems associated with no-treatment controls.

3.3 Attention/Discussion/Minimal Treatment Controls

This strategy most closely resembles what has been referred to as a psychotherapy ‘placebo.’ In this paradigm, the participants are assigned to a condition that provides some common elements of psychotherapy (e.g., time spent, presentation of a credible rationale for the approach, receiving support and attention), but does not offer the hypothesized ‘active ingredients’ of the experimental intervention. The specific nature of this type of control group varies widely from trial to trial, but may include interventions such as having the participants meet with a clinician who provides education and support, ‘discussion control’ groups where several participants meet to provide mutual support or are presented with education materials, ‘bibliotherapy’ conditions where participants receive written reading or other educational materials such as videotapes, and several more.

3.3.1 Advantages. Attention-control conditions, by controlling for time spent and providing some common elements of psychotherapy, are typically seen as a more stringent control than no-treatment or delayed-treatment control conditions. If well conceived, carefully implemented, and monitored, attention controls can provide a test of the hypothesized active ingredients of the experimental condition and may allow for some evaluation of mechanisms of action of the experimental condition, which is not usually feasible with no-treatment or delayed-treatment controls.

3.3.2 Disadvantages. As noted above, there have been several problems with attention/discussion control conditions. First, it is unclear whether many of these approaches actually adequately control for nonspecific elements of the experimental intervention (Basham 1986). Second, given that a key ingredient of such approaches is the presentation of a convincing rationale, but at the same time the intervention ‘… should have no currently supported theoretical reason why the placebo would influence the behavior under question’ (O’Leary and Borkovec 1978, p. 821), a truly convincing rationale for these conditions is not so easily formulated. If participants are not convinced or do not see themselves as deriving adequate benefit from the approach, they may be more likely to drop out of the condition, leading to problems of differential attrition in the control vs. experimental condition. Furthermore, since the attention/discussion is intended to be ‘inert’ and not directly affect the disorder or problem, ethical questions regarding the withholding of treatment may be raised for those conditions where effective alternative approaches exist. Finally, as noted above, because there has been little consensus on what such controls should consist of (either within or across different disorders), attention control conditions vary widely from study to study, making cross-study comparisons difficult.
3.4 Active Treatment or Comparative Controls

In this strategy, the experimental intervention is compared with another ‘active’ intervention, ideally one of demonstrated efficacy for the disorder. Thus, a comparative design directly compares two treatment conditions without conceptualizing either as being a formal control group (Basham 1986). Rather than addressing the issue of ‘Does this treatment work?’ this strategy addresses questions such as ‘Is Treatment A more effective than Treatment B’ and ‘Does Treatment A offer advantages over standard treatment for this disorder?’ It should be noted that for many disorders, the comparison condition may be a psychotherapy or a pharmacotherapy.

3.4.1 Advantages. When well conceived and executed, comparison controls do control for many artifacts and demand characteristics, including time spent in treatment, the provision of a credible rationale, and other common elements or nonspecific effects (although it should be noted that equivalence of nonspecific elements across conditions must be supported by process evaluations). This strategy has the additional advantage of permitting more thorough evaluation of treatment process and mechanisms of action, and it can help identify what treatment is best for a given disorder, as well as what type of treatment is best for what type of individual (Basham 1986, Kazdin 1986, O’Leary and Borkovec 1978). Furthermore, by comparing the experimental intervention to a treatment of demonstrated efficacy for the disorder (e.g., a ‘standard’ or ‘reference’ condition), this approach avoids deception and many of the ethical problems discussed earlier.

3.4.2 Disadvantages. Comparison controls, by controlling for nonspecific elements and essentially pitting the hypothesized ‘active ingredients’ of one approach against those of another, are highly stringent tests of novel interventions. Thus, such studies often lead to small effect sizes and findings of no significant differences between interventions (Luborsky et al. 1975), which in turn may lead to problems in interpretation of the study outcomes. For example, in a trial comparing two active treatments where no significant difference in effectiveness is found, interpretation of the ‘absolute’ effectiveness of the treatments is problematic. That is, although the treat-

pists or participants). Finally, for those disorders where a number of viable comparison approaches are available, but no one clear standard or ‘reference’ condition that is widely acknowledged as the most effective, choice of the comparison condition is challenging. Kazdin (1986) has provided thoughtful recommendations for the selection and implementation of appropriate active treatment control conditions.

The relative stringency of comparison controls relative to attention/discussion, delayed-treatment, or no-treatment controls has practical implications as well. For example, within a given field, if one type of intervention (Treatment A) routinely uses no-treatment controls in efficacy studies, while another type of intervention (Treatment B) uses discussion control conditions, while trials evaluating Treatment C routinely use active treatment controls, the estimated effect size of Treatment A is likely to be larger than that for Treatment B, which is in turn likely to be larger than for Treatment C, even though all three approaches may be comparably effective. Progressively smaller effect sizes for different types of control conditions (no treatment, placebo, active treatment) have been suggested in several meta-analyses (Basham 1986).

3.5 Dismantling Controls

Also referred to as component control condition, this strategy essentially requires specification of a given treatment or treatment ‘package’ into its component elements, and then evaluating the effectiveness of the treatment with or without one specific element or technique of interest. For example, a cognitive behavioral approach might be evaluated with and without a relaxation training component.

3.5.1 Advantages. By altering only a single element of a treatment, component control strategies offer an elegant strategy for evaluating well-defined techniques and components. Component control conditions can address issues of particular theoretical or clinical relevance, by isolating the most effective ‘ingredients’ of complex or multifaceted treatment packages. Furthermore, because the interventions compared differ only with respect to a single element, many of the practical, ethical, and theoretical issues associated with placebo and no-treatment controls
ments compared were not significantly different, it is can be avoided (O’Leary and Borkovec 1978).
often hard to gauge the magnitude of the effect of
either treatment on the symptoms targeted (Nathan
1997). Kazdin (1986) has also pointed out that com- 3.5.2 Disadantages. Before treatments or treat-
parative designs may overemphasize elements that dif- ment packages can be dismantled, they must first be
ferentiate the treatments compared (e.g., use of demonstrated to be effective. Thus, for a given dis-
specific techniques) over other important variables order, there may only be a small number of
that may affect outcome (e.g., characteristics of thera- approaches with sufficient empirical support that

Clinical Treatment Outcome Research: Control and Comparison Groups

would justify their ‘dissection’ in a dismantling study. Similarly, not all approaches are easily parsed into discrete components. Furthermore, to justify a full dismantling trial, a compelling rationale for the component to be evaluated must be made, e.g., the component might be ‘… very risky, costly, or theoretically interesting’ (Strayhorn 1987).

3.6 Pill Placebo Conditions

Citing the problems noted above with placebo conditions for psychotherapy research, Klein (1997) and others have argued for wider use of pill placebo controls for psychotherapy trials. In this strategy, the experimental psychotherapy would be compared with a pill placebo condition. This strategy was used in the landmark NIMH Treatment of Depression Collaborative Research Program (Elkin et al. 1985), which compared cognitive therapy and interpersonal psychotherapy to a placebo/clinical management condition (with imipramine/clinical management as a reference condition).

3.6.1 Advantages. The use of pill placebo conditions in psychotherapy research has a number of potential advantages. First, when carefully implemented, pill placebos provide control for participant expectations and many demand characteristics. Klein (1997) also points out that this approach enables researchers to evaluate psychotherapy effects in the context of a pharmacotherapy-responsive population. This approach avoids problems with ‘psychotherapy placebo’ conditions, as it compares the effects of a given psychotherapy with those nonspecific effects of a pill placebo. This strategy may also allow clearer evaluation of treatment processes and mediation effects (Crits-Christoph 1997).

3.6.2 Disadvantages. Pill placebo controls require the availability of a widely recognized effective pharmacotherapy for the disorder in question, as the rationale for the pill placebo control requires that participants expect to improve by taking the pill. Thus, a major drawback of this approach is that for many conditions for which psychotherapies may be appropriate, no such reference pharmacotherapy exists. Moreover, because most pharmacotherapies are delivered in conjunction with a minimal supportive psychotherapy condition to foster compliance and enhance retention (conditions which closely resemble placebo psychotherapies), the problems associated with conceiving and implementing minimal or placebo psychotherapies may not be avoided entirely (Carroll 1997).

4. Summary

Just as there is no one ‘perfect’ clinical trial, there is no one perfect control group. The particular control group selected does determine, however, the nature of the questions that can be addressed by the trial (e.g., ‘Does this treatment work’ to ‘Does this treatment work for the reasons we believe it does’ to ‘Does this treatment have specific benefits compared with other available treatments?’).

Selection of an optimal control condition for a given study thus depends on many factors, including the level of development of the treatment being evaluated, the availability of alternative treatments for the disorder, the severity or natural history of the disorder, and many others (Crits-Christoph 1997). Thus, early in the development of a field, where questions remain about a disorder’s natural course, and available treatments are scarce, no-treatment or delayed treatment controls may be appropriate choices. However, as a field becomes more sophisticated and alternative treatments are identified, research questions also become more fully developed and require the use of control groups that can address progressively more complex questions.

See also: Censuses: Demographic Issues; Internal Validity; Psychological Treatments: Randomized Controlled Clinical Trials; Psychological Treatments, Empirically Supported

Bibliography

Basham R B 1986 Scientific and practical advantages of comparative design in psychotherapy outcome research. Journal of Consulting and Clinical Psychology 54: 88–94
Chambless D L, Hollon S D 1998 Defining empirically supported therapies. Journal of Consulting and Clinical Psychology 66: 7–18
Carroll K M 1997 Manual guided psychosocial treatment: A new virtual requirement for pharmacotherapy trials? Archives of General Psychiatry 54: 923–8
Critelli J W, Neumann K F 1984 The placebo: Conceptual analysis of a construct in transition. American Psychologist 39: 32–9
Crits-Christoph P 1997 Control groups in psychotherapy research revisited. Treatment, 1, Comment 1, posted September 22
Elkin I, Parloff M B, Hadley S W, Autry J H 1985 NIMH treatment of depression collaborative research program: Background and research plan. Archives of General Psychiatry 42: 305–16
Frank J D 1961 Persuasion and Healing. Johns Hopkins University Press, Baltimore, MD
Kazdin A E 1980 Research Design in Clinical Psychology. Harper and Row, New York
Kazdin A E 1986 Comparative outcome studies of psychotherapy: Methodological issues and strategies. Journal of Consulting and Clinical Psychology 54: 95–105
Klein D F 1997 Control groups in pharmacotherapy and psychotherapy evaluations. Treatment, 1, Article 1, posted September 22
Luborsky L, Singer B, Luborsky L 1975 Comparative studies of psychotherapies: Is it true that ‘everyone has won and all must have prizes’? Archives of General Psychiatry 32: 995–1007
Nathan P E 1997 Would a pill placebo have redeemed Project MATCH? Treatment, 1, Comment 3, posted September 22
O’Leary K D, Borkovec T D 1978 Conceptual, methodological, and ethical problems of placebo groups in psychotherapy research. American Psychologist 33: 821–30
Rosenthal D, Frank J D 1956 Psychotherapy and the placebo effect. Psychological Bulletin 53: 294–302
Strayhorn J M 1987 Control groups for psychosocial intervention outcome studies. American Journal of Psychiatry 144: 275–82
Wilkins W 1986 Placebo problems in psychotherapy research: Social-psychological alternative to chemotherapy concepts. American Psychologist 241: 551–6

K. M. Carroll

Clinical versus Actuarial Prediction

Paul Meehl’s book Clinical Versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence (Meehl 1954) concluded that the prediction of numerical criterion variables of psychological interest (e.g., faculty ratings of graduate students who had just obtained a Ph.D.) from numerical predictor variables (e.g., scores on the Graduate Record Examination, grade point averages, ratings of letters of recommendation) is better done by a proper linear model than by the clinical intuition of people presumably skilled in such prediction. The point of this article is to review summaries and conclusions subsequent to Meehl’s original one and to present evidence that even what can be termed ‘improper’ linear models (Dawes 1979) often yield predictions superior to human intuition.

1. Type of Statistical Models

A proper linear model is one in which the weights given to the predictor variables are chosen in such a way as to optimize the relationship between the prediction and the criterion. Simple regression analysis, where the predictor variables are weighted in order to maximize the correlation between the subsequent weighted composite and the actual criterion, is the most common example. Discriminant function analysis is another example; weights are given to the predictor variables in such a way that the resulting linear composites maximize the discrepancy between two or more groups. Ridge regression analysis, another example, attempts to assign weights in such a way that the linear composites correlate maximally with the criterion of interest in a new set of data.

2. Review of the Empirical Findings

Meehl was concerned primarily with the statistical vs. clinical methods for integrating information; thus, he compared instances in which both types of prediction had been made on the basis of exactly the same data. (He also insisted that the accuracy of the statistical model should not be checked on the same data on which it was derived, or that the sample size be so large that it will not appear superior owing to capitalizing on chance fluctuations.) Twelve years later, Jack Sawyer (1966) published a review of about 45 studies; again, in none was clinical prediction superior. Unlike Meehl, Sawyer also included studies in which the clinician had access to more information than that used in a statistical model, for example, interviews of people about whom the predictions were made, or interviews by experts who had access to the statistical model information prior to the interview. Such interviews did not improve the clinical predictions. In fact, the predictions were better when the opinions of the interviewers were ignored.

A prototypical study by Carroll et al. (1988) supports Sawyer’s conclusion. A Pennsylvania parole board considered about 25 percent of the 743 parolees to be failures within one year of being released, for reasons such as being recommitted to prison, absconding, being apprehended on a criminal charge, or committing a technical parole violation. A parole board interviewer’s ratings had predicted none of these outcomes; the largest correlation was only 0.06. In contrast, a three-variable model based on the type of offense that had led to imprisonment, the number of past convictions, and the number of noncriminal violations of prison rules did have a modest predictability, correlating about 0.22, a result consistent with earlier findings that actuarial predictions based on prior record predict with a correlation of about 0.30 across a large number of settings. When parolees were convicted of new offenses, the seriousness of their crimes was correlated 0.27 with the interviewers’ ratings of assaultive potential, but a simple dichotomous evaluation of past heroin use correlated 0.46. All parole board interviewers had access to all this statistical information, but did worse.

None of these correlations is particularly high: first, the sample is highly select, being limited to those who have been convicted of a crime; second, not all the parolees who committed crimes were caught; and third, these types of behaviors are not as predictable as we believe they are or would like them to be. The difference in the effectiveness of actuarial vs. clinical prediction is, however, clear.

Moreover, this difference is consistent with comparisons of actuarial vs. clinical methods for predicting violence (see, e.g., Werner et al. 1984, Monahan 1997).


An important qualification: the best prediction in general is that neither violence nor criminal behavior will be repeated. Although the general ‘base rate’ prediction is that people will not repeat problems, judges (professional and nonprofessional alike) have a bias to believe that repetition is common. The studies show that judgments about who is more likely than whom to repeat are much better made on an actuarial than on a clinical basis.

After his book had been out about 30 years, Meehl (1986) was able to conclude, ‘There is no controversy in social science which shows such a large body of qualitatively diverse studies coming out so uniformly in the same direction as this one.’ Since that time, even more evidence has been accumulating in favor of Meehl’s generalization and practical conclusion; 110 studies that Dawes et al. (1989) reviewed favored it. Subsequently, Grove and Meehl (1996) published a meta-analysis involving even more studies. Their conclusion (Grove and Meehl 1996, p. 293) was, ‘The clinical method relies on human judgment that is based on informal contemplation and, sometimes, discussion with others (e.g., case conferences). The mechanical method involves a formal, algorithmic, objective procedure (e.g., equation) to reach the decision. Empirical comparison of the accuracy of the two methods (136 studies over a wide range of predictors) shows that the mechanical method is almost invariably equal to or superior to the clinical method.’

There are four logical relationships possible between the information set on which a clinician or expert makes a prediction and the information set on which a formal model (e.g., equation) is based. These sets may be identical, which was the original requirement for a comparison in Meehl’s 1954 book. Or one information set may be a subset of the other; in the studies reviewed by Sawyer and in almost all subsequent studies having this structure, the set on which the model is based is the subset of the set available through the clinician (who, for example, is allowed to supplement a sparse information set with an interview). In the field of psychology, the model prediction has always been superior. There are, however, exceptions in the field of medicine, in the situations where the clinical physician has access to more information than is used by the model (not in situations where the inputs are identical). Even in these, however, the models may be modified on the basis of interviewing the clinicians themselves to ‘distill’ what they are responding to, so that once again the statistical prediction becomes superior. For example, a predictive system termed APACHE-II did not predict who would survive in emergency wards as well as physicians did, yet the modified model termed APACHE-III did in fact predict who would survive the first 24 hours better than physicians did (Knaus et al. 1991). Another possible relationship is that the information sets are overlapping, which has not been studied much.

3. Explanation for the Empirical Findings

Why the consistent results? The answer involves an understanding of which factors are favorable to the linear model and which disfavor the intuitive prediction. Whatever additional factors there may be that disfavor the model or favor the clinician are outweighed by the former types of factors in almost all contexts.

First, consider the statistical prediction. Each instance to be predicted is characterized in terms of its aspects that allow its location in a category or along a dimension. These categories and locations have been found in the past to be in general predictive. Aspects that have no such predictive power are automatically ignored. The model involves the weighting of the predictive aspects. Moreover, these aspects often are diverse, e.g., an undergraduate grade point average, a number of prison violations, an instance of violence that led to hospitalization (even a rating routinely made on the first day of hospitalization). The statistical model automatically makes the predictive variables comparable, by assigning weights to integrate them.

Now, consider the drawbacks of attempting to make an intuitive integration of the information. Suppose even that we are not interested in making an optimal prediction, just in integrating information from diverse and incomparable dimensions. Suppose, for example, we were deciding between two jobs where the important considerations are pay and enjoyment of the activities involved. It may be very easy, knowing our preferences for different types of activities, to judge which job will be more enjoyable. But now job A pays more than job B, but is less enjoyable. Which should be chosen? We must weight the two dimensions, at least implicitly, if we are to make a choice. What psychologists have found (e.g., Svenson 1992, Langer 1994) is that in such conflicting-dimension situations people generally search for reasons to dismiss one or another of the dimensions as ‘not really that important.’ Then, there is no conflict between dimensions, and all that must be done is to make an ordinal judgment, again of the form ‘more is better.’

Now, consider the additional complications of trying to decide on an intuitive basis which of two instances is more predictive of something. While we might at least have some insight, explicit or implicit, into how we assess differences and how much to care about them, we often have less insight into how to assess differences and to weight them in order to predict. (Feedback observing all outcomes without being affected by our judgment, e.g., of success or failure, is necessary, but not sufficient; see Einhorn and Hogarth 1978.)

As noted, statistical integration has the advantage that weighting obviates the problem of conflicting dimensions. The question then arises of whether the weighting system need be optimal in order to maintain the advantage. The answer, based on mathematical
considerations, simulations, and empirical investigation, is no. The robust result is that so long as the dimensions are weighted in the correct direction, ad hoc weighting (e.g., ‘intuitive weighting,’ unit weighting, or even weights chosen according to some random sampling scheme) yields results that are close enough to those provided by optimal weighting that the resulting linear composites still outperform clinical intuition, especially when the predictive variables tend to be positively correlated with each other. Composites based on such nonoptimal weights have been termed ‘improper linear models’ (Dawes 1979). In fact, not only may such models yield predictions similar enough to those of optimal models that they outperform clinical judgment, but they may even outperform optimal models on cross-validation, because they are not subject to the ‘overfitting’ problem that plagues many optimal models. For example, the ‘robustness’ of unit-weighted models on cross-validation has been noted as far back as the 1930s by Wilks (1938). The mathematical rationale for this robustness of unit weighting has been provided by Wainer (1976) and Wainer and Thissen (1976).

In fact, improper models such as unit-weighted models may be particularly advantageous in situations where models developed in one context are to be applied in a slightly different context, which many of us (Dawes 1997) believe to be the norm in social science, as opposed to applying a model developed on one sample of a particular population to another sample drawn from this exact same population. (‘Cross-validation’ actually refers to such a subsequent application to a sample from the exact same population; a far more descriptive term would be simply ‘validation,’ where what is commonly termed a ‘validation’ sample should be termed a ‘development’ sample. The point here is that when we move across contexts that vary to some degree, whether known or unknown, even the estimate based on what is standardly termed ‘cross-validation’ may be overly optimistic.)

An empirical overview of the success of improper models in general is provided by Dawes (1979). Perhaps the most famous example of an improper model is that provided by Goldberg (1965) to predict a diagnosis of psychosis vs. neurosis by using MMPI profiles. The unit-weighted composite obtained by adding together three scaled scores indicating psychosis (L, Pa, Sc) and subtracting two indicating neurosis (Hy, Pt) not only ‘outperformed all diagnosticians’ (Goldberg 1965, p. 24) but was more stable across subsamples than were nonlinear complex scores, and it was not for want of trying enough of the latter.

Later, Dawes and Corrigan (1974) demonstrated that improper weights (in fact two weighting systems chosen on random bases except for the direction of the weights) not only outperformed clinical judgment in the MMPI diagnosis problem and several others, but did as well as the linear composites based on the diagnostic experts’ judgments. Unit weighting was even better.

4. Conclusions and Recommendations

Thus, ‘the whole trick is to know what variables to look at and then to know how to add’ (Dawes and Corrigan 1974, p. 105). Of course, there are some contexts where configural, multiplicative, or even more complicated models are more appropriate than are simple additive models; such models may, for example, be found in the area of predator–prey population dynamics. But the decision making discussed here involves a prediction of important human outcomes where, although the human experience may be complex, what can be best distilled from it are simple predictive variables where more (or less) is better (e.g., test scores, indicators of past performance, past criminal or psychiatric record). Not only do we find such simple monotone relationships in our studies, but we search for such relationships and tend to code our social world in terms of variables capturing monotone relationships to what is important to us. (Occasionally, a predictor variable has a single-peaked relationship to the criterion, as when moderate aggression is more desirable in a business person; such a variable is easily transformed into a monotone variable by evaluating distance from the ideal.) When there are strong scientific reasons for hypothesizing a complex model, naturally they should not be ignored. In the absence of such theory or evidence, however, the claim that it is possible to construct a valid nonadditive model ‘in our head’ (particularly one representing some sort of valid, ineffable intuition) is pure hubris.

In contrast, an understanding of the research reviewed in this article leads to ‘an awareness of the modest results that are often achieved by even the best methods, [an awareness which] can help to counter unrealistic faith in our predictive powers and our understanding of human behavior. It may well be worth exchanging inflated beliefs for an unsettling sobriety, if the result is openness to new approaches and variables that ultimately increase our explanatory and predictive powers’ (Dawes et al. 1989, p. 1673). For the most recent survey and analysis, see Swets et al. (2000).

See also: Clinical Psychology: Validity of Judgment

Bibliography

Carroll J S, Werner R L, Coates D, Galegher J, Alibrio J J 1988 Evaluation, diagnosis, and prediction in parole decision making. Law and Society Review 17: 199–228
Dawes R M 1979 The robust beauty of improper linear models in decision making. American Psychologist 34: 571–82
Dawes R M 1997 Qualitative consistency masquerading as quantitative fit. In: Dall Chiara M L, Doets K, Mundici D
(eds.) Structures and Norms in Science. Kluwer, Dordrecht, The Netherlands, pp. 387–94
Dawes R M, Corrigan B 1974 Linear models in decision making. Psychological Bulletin 81: 95–106
Dawes R M, Faust D, Meehl P E 1989 Clinical versus actuarial judgment. Science 243: 1668–74
Einhorn H J, Hogarth R M 1978 Confidence in judgment: persistence in the illusion of validity. Psychological Review 85: 395–416
Goldberg L R 1965 Diagnosticians vs. diagnostic signs: the diagnosis of psychosis vs. neurosis from the MMPI. Psychological Monographs 79: 1–28
Grove W M, Meehl P E 1996 Comparative efficiency of informal (subjective, impressionistic) and formal (mechanical, algorithmic) prediction procedures: the clinical–statistical controversy. Psychology, Public Policy, and Law 2: 293–323
Knaus W A, Wagner D P, Lynn J 1991 Short-term mortality predictions for critically ill hospitalized adults: science and ethics. Science 254: 389–94
Langer E 1994 The illusion of calculated decisions. In: Schank R, Langer E (eds.) Beliefs, Reasoning and Decision Making: Psycho-Logic in Honor of Bob Abelson. L. Erlbaum, Hillsdale, NJ, pp. 33–53
Meehl P E 1954 Clinical Versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence. University of Minnesota Press, Minneapolis, MN
Meehl P E 1986 Causes and effects of my disturbing little book. Journal of Personality Assessment 50: 370–5
Monahan J 1997 Clinical and actuarial predictions of violence. In: Faigman D, Kaye D, Saks M, Sanders J (eds.) Modern Scientific Evidence: the Law and Science of Expert Testimony. West, St. Paul, MN, Vol. 1, pp. 300–18
Sawyer J 1966 Measurement and prediction, clinical and statistical. Psychological Bulletin 66: 178–200
Svenson O 1992 Differentiation and consolidation theory of human decision making: a frame of reference for the study of pre- and post-decision processes. Acta Psychologica 80: 143–68
Swets J A, Dawes R M, Monahan J 2000 Psychological science can improve diagnostic decisions. Psychological Science in the Public Interest 1: 1–26
Wainer H 1976 Estimating coefficients in linear models: it don’t make no nevermind. Psychological Bulletin 83: 213–7
Wainer H, Thissen D 1976 Three steps toward robust regression. Psychometrika 41: 9–34
Werner P D, Rose T L, Yesavage J A, Seeman K 1984 Psychiatrists’ judgments of dangerousness in patients on an acute care unit. American Journal of Psychiatry 141: 263–6
Wilks S S 1938 Weighting systems for linear functions of correlated variables when there is no dependent variable. Psychometrika 8: 20–6

R. M. Dawes

Clitics, Linguistics of

Clitics straddle the boundary between words and affixes because they have some properties of each. Therefore clitics pose a quandary for the longstanding division of labor in grammar between syntax and morphology. Clitics can be found in nearly every language; however, they may be hard to recognize, because they may be disguised as contracted forms, for instance. Clitics also exhibit some unique properties that have been especially difficult to account for in grammar. For that reason, they help linguists to reformulate hypotheses about the grammatical structure that underlies language (Nevis 2000).

1. Illustration of Clitics

Grammatical analysis offers a basic distinction between words and affixes: words are independent elements used for phrase and sentence formation (see Word, Linguistics of), whereas affixes are word-building units that attach to roots and stems (see Morphology in Linguistics). Problematic for this division are two sorts of objects: those that form phrases but are not fully independent, and those that help to build words but exhibit a loose attachment that typical affixes do not. Both kinds have been labeled clitics, a term derived from Ancient Greek enklitikón ‘leaning on a previous (word),’ which in various guises has been used in grammatical description for two millennia.

As an illustration of clitics, consider English contracted forms, such as ’ve in you’ve. It lacks the independence of have, but otherwise acts like a full verb by helping to form sentences and by showing agreement with the subject. If ’ve is a verb, then it is a verb pronounced without a vowel (since the e is silent here). Can a word in English consist of a single pronounced consonant? While most contractions in English and in other languages qualify as clitics, not all do.

A different sort of example is the English possessive ’s (see Possession (Linguistic), and Jespersen 1922). Unlike other affixes in English, it attaches to an entire phrase, as in the King of Morocco’s death, where the ’s is affixed to the phrase the King of Morocco rather than just to the word Morocco (that is, this is not Morocco’s death). If ’s is to be treated as an affix, then grammarians will have to adopt an expanded view of affixation that includes such phrasal affixes.

The study of clitics is important because their position on the edge of the word/affix distinction sheds light on the complexities involved in determining how languages are constructed. The passage in Sect. 3 below will survey some of the unique properties attributed to clitics that have forced linguists to accommodate new facts into their theories of grammar.

2. Types of Clitics

Clitics have traditionally been cited according to the way they attach: they may be enclitics (attaching leftward onto a host), proclitics (attaching rightward), or endoclitics (attaching inside the host). In Spanish
me ‘me’ illustrates both the enclitic and the proclitic 1990s, with some scholars arguing for affixal status
options: when me follows its host word, as in díga-me and others for word status. Positing the clitic as a
‘tell me,’ it functions as an enclitic, but when it precedes unit distinct from the word and the affix has led scho-
the host as in no me díga ‘don’t tell me’ it is a proclitic. lars to attribute to it a special set of properties (Zwicky
Most scholars assume that clitics attach in a very loose 1977, Spencer 1991, Katamba 1993). Usually the
fashion to their hosts, and therefore must occur main thrust of this approach is to present a scheme
outside of regular affixes. Following that reasoning, locating and attaching the clitic in the appropriate
proclitics precede any and all prefixes, and enclitics place in a phrase and attaching the clitic to one of
follow all suffixes. Endoclitics are so rare—if they exist two neighbors.
at all—that specialists have examined any putative Unique characteristics such as clitic doubling and
cases closely to see whether the positing of endoclitics special syntax have been claimed to be shared with
can be avoided altogether. neither affixes nor words. Such special properties help
In technical descriptions of languages, a category highlight differences between clitics on the one hand
‘clitic’ remained commonplace throughout the twen- and words or affixes on the other, or to enrich linguists’
tieth century, but seldom had it been incorporated into understanding of the syntactic patterns of pronouns
grammatical theories until the 1970s, when the pheno- and other word classes. Some unique properties are
menon was surveyed by Zwicky (1977). Zwicky’s early particular to clitic pronouns, and in the last quarter of
investigations established three types of clitics: simple the twentieth century many scholars have devoted
clitics, special clitics, and bound words. In later work, their efforts to the study of these aspects of clitic
bound words and special clitics merged, producing pronouns. In particular, the clitic pronouns in Spanish
two categories: simple and special clitics. and French have attracted considerable attention
Simple clitics can be substituted for full words. The because the pronouns may occur in places where
simple clitic is thus considered a reduced version of an regular nouns may not. French and Spanish direct
independent word, sharing its meaning and manifest- objects follow the verb if they are noun phrases (in
ing a similar pronounciation. An example is the pro- French tu as u le chien ‘you have seen the dog’)
noun them in a sentence like we like them, which is often whereas they precede the verb if they are pronouns (tu
reduced in casual speech to we like ’em. The clitic form l’as u ‘you have seen it,’ literally you it hae seen).
’em is relatable to full form them and the clitic’s distri- There are other differences as well (van Riemsdijk
bution in a sentence is a subset of the distribution of the 1999).
full word, with the contraction occuring in most places Clitic doubling occurs when a pronoun appears in
the nonclitic is found, but when the pronoun occurs the same clause as the noun phrase to which it refers.
under emphasis or in isolation, clitic ’em is not permit- For example, in certain South American Spanish
ted and the full form is used. For example, in response varieties, the proclitic pronoun lo can be found
to the question Who do you like better, us or them?, the together with the direct object:
full word them is an acceptable answer, but the con- Lo imos a Juan
traction ’em is not. Although simple clitics might be Him saw to Juan
related to their free forms, they are not mere reductions ‘We saw John’
due to fast speech: in this regard note that the English Normally pronouns replace noun phrases rather
simple clitics like ’em are used in slow speech as well. than repeat them, so this behavior is noteworthy.
Special clitics, on the other hand, either lack a Another pattern unique to clitics is Wackernagel’s
freeform equivalent or show some special syntax (to be Law, which refers to the restriction of elements to a
discussed in the next section). The English possessive ’s slot immediately after the first unit of a phrase or
lacks a nonclitic alternative, therefore it constitutes a clause; that initial unit can be a phrase or word (Steele
special clitic. As illustrated below, the Tagalog pron- 1977, Nevis et al. 1994). An example of Wackernagel
ouns are restricted to the second position of the clitics comes from Tagalog. In Tagalog, pronouns and
sentence, so they constitute special clitics in Zwicky’s certain adverbs must occur after the first word or
scheme. phrase of the sentence. Usually this is the verb, but it
can also be the negative hindi or some other word. The
following sentences offer several clitic pronouns and
3. Unique Properties of Clitics particle clitics (na ‘already,’ lamang ‘only,’ and pol-
iteness marker po; such ‘particle clitics’ function as
Accommodating clitics in theories of grammar has adverbs in Tagalog), which are highlighted to indicate
proven challenging. Initial investigations into clitics the second position in the sentence.
focused on their status in between words and affixes Nakita ko na siya
and pondered whether the notion clitic should be Seen I already her\him
viewed as a special unit of grammar (distinct from I have already seen her\him’
the affix and the word) or as a sort of deviant affix Hindi ka niya kapatid
or deviant word. Although early views opted for separ- Not you his\her sibling
ate treatment, the debate persevered through the ‘You aren’t his\her sibling’


Tatlo lamang po sila
Three only POLITE them
‘There are only three of them’

The Tagalog clitics must remain in the second slot in the sentence, no matter what comes before them (verb, negative, or numeral).

4. Clitics vs. Leaners

Studies on clitic elements must establish that the objects under investigation exhibit mixed status between affixes and words. Diagnostic tests can be used to identify properties of clitics, words, and affixes (Zwicky 1977). Because the phonological attachment of a clitic to its host cannot be assumed and must be proved, it is important to distinguish clitics from leaners (also called quasiclitics or semiclitics), which are simply unstressed words. Although clitics also usually lack stress, they nevertheless show clear evidence of being part of a word’s pronunciation. By contrast, a leaner is merely unstressed and does not become part of another word’s pronunciation.

For example, the contraction of English is to ’s entails a change in pronunciation according to the last sound of the word it attaches to. Accordingly it is pronounced z at the end of dog but s after cat:

The dog’s barking.
The cat’s purring.

This is the behavior of the plural suffix as well, which has two pronunciations, z and s, under the same conditions as the contracted form of is:

The dogs are barking.
The cats are purring.

Thus, contracted ’s, which is a clitic, and plural s, which is a suffix, share a pattern of pronunciation that depends crucially on the last sound of a word. This fact of pronunciation distinguishes clitic ’s from a word that is merely unstressed, as demonstrated by the pronunciation of the word who when used to introduce a relative clause (e.g., the one who got elected); it is usually pronounced with a reduced vowel in casual conversation (that is, more like huh and not with full vowel u). Here unstressed who is considered a leaner rather than a clitic because it does not attach phonologically to a neighboring word as does contracted ’s.

It is not always easy to decide what is a clitic and what is not. When speakers contract want to to wanna, investigators determine whether wanna can be derived in a principled way from want plus to using regular rules of grammar. If it can, then wanna may contain a clitic form of to. However, if additional rules are needed to handle this case in an idiosyncratic way, then the derivation is not considered valid and wanna might instead be considered a single indivisible word rather than a composite of want and to. In English, wanna is not just a matter of fast speech reduction because it can convey a difference in meaning from the sequence want plus to: a sentence like Tung Chee-hwa I want to succeed offers two meanings (I want Tung Chee-hwa to succeed and I want to succeed Tung Chee-hwa), whereas the sentence Tung Chee-hwa I wanna succeed is restricted for most speakers to one meaning (I want to succeed Tung Chee-hwa) and resists the other interpretation.

5. Clitics as Derived Words and Phrasal Affixes

Many of the scholars researching clitic doubling, and to a certain extent Wackernagel’s Law, seem to assume that clitics are wordlike in nature. A number of studies treat clitics as derived words. But even if most clitics are analyzable as derived words, there is nevertheless a small residue of clitics that are better handled as affixes. English possessive ’s is an example of one such clitic: it is best treated as an affix attaching not to any individual word but to a group of words, insofar as it offers no wordlike characteristics beyond phrasal affiliation and clearly patterns with the other inflectional affixes of English in morphology and pronunciation. To the chagrin of theoreticians, both kinds of clitics appear to be required to handle such puzzles as the wordlike contracted auxiliaries of English (e.g., ’ve) and its phrasal affix possessive.

Clitics can sometimes cluster into longer sequences. These sequences exhibit looser order than affix chains but stricter order than word order generally allows. In Tagalog, for example, all pronoun clitics consisting of a single syllable precede all particle clitics, which in turn precede all pronoun clitics consisting of two syllables (e.g., ko ‘I’ precedes na ‘already,’ and na precedes siya ‘her/him’).

Nakita ko na siya
Seen I already her/him
‘I have already seen her/him’

But among the two-syllable pronouns order is not restricted; nila ‘they’ and ako ‘me’ may occur in either order:

Hindi nila ako nakita ~ Hindi ako nila nakita
Not they me seen
‘They didn’t see me’

Similarly, order need not be limited among the particle clitics; lamang ‘only’ and the politeness marker po occur in two orders as well.

Tatlo lamang po sila ~ Tatlo po lamang sila
Three only POLITE them
‘There are only three of them’

Surprisingly, such clitic sequences have been used to argue for the more wordlike standing of clitics, for the affixal nature of clitics, and for the unique status of clitics. Clearly more work needs to be done in this area.

6. Clitics in Historical Studies

In the 1980s and 1990s the focus was on accommodating clitics into current theories of grammar, but prior to this most investigations of clitics had scrutinized historical change and the development of
affixes from separate words. Affixes can arise from several sources historically; one such source is a former word. A word can become less stressed, begin to lean on a neighboring word, and eventually turn into a fully dependent element. This is known as the agglutination cycle (see Grammaticalization). For instance, the Modern English adjective suffix -ly developed from a Germanic word that also yielded Modern English like. One hypothesis is that full words first cliticize before becoming affixes, as demonstrated in the nearly extinct Balto-Finnic language Livonian, which has a comitative suffix -ks meaning ‘with’ (e.g., suu-ks ‘with the mouth’) that came from a former word kansa ‘with’ (that is, this word would originally have been the phrase *suun kansa, literally ‘mouth with’), which cliticized into -kas before becoming suffix -ks. However, it is by no means evident that cliticization is a necessary stage in the development of affixes from words.

Another question is whether the agglutination cycle operates in one direction only. Normally a word weakens into a clitic, and then the clitic fuses as an affix into the host. For example, German dem ‘the’ has fused with zu ‘to’ to form zum ‘to the.’ So if the agglutination cycle is limited to one direction of development (that is, it is unidirectional), then affix sequencing might be a good indication of an earlier word order. Since the affixes would have been separate words previously, they could provide a tool for linguistic reconstruction of word order patterns in earlier stages of a language (see Linguistics: Comparative Method).

But exceptions to this scenario have been noted. An affix can loosen and become a clitic, and a clitic can likewise become a separate word. The Modern English possessive is a case in point because it is a clitic derived from a regular suffix in Old English. Older English exhibited attachment of the possessive to the head noun in a phrase (as is expected of an affix) rather than to the end of the phrase (as a phrasal-affix type clitic):

Earlier: The King’s crown of England
Now: The King of England’s crown

An illustration of a clitic turning into a word can be found in the Sami languages (formerly called Lappish), where the original abessive ending developed first into a clitic and then in some varieties of Sami into a separate word. The abessive expresses absence (and is translated as ‘without’). In Northern Sami, the abessive is an enclitic on a preceding noun (e.g., airoj-taga ‘without oars’), whereas in the Enontekiö subdialect it has become an independent adverb, standing even without a preceding noun:

Mun báhcen taga
I go without
‘I do without’

Clitics may play a role in word order change. When verbal clitics are attracted to the second position, the Wackernagel slot, that pattern may be generalized to other, nonclitic verbs. If a language had originally placed verbs at the end of a sentence, generalizing this second position pattern would cause verbs to migrate to second position. That is, a subject–object–verb pattern will become subject–verb–object (see Word Order). Although the details of this proposal may not be widely accepted, it demonstrates that the role of clitics should not be ignored in the study of word order change.

7. Conclusion

Most, if not all, languages offer at least one clitic element. However, clitics themselves do not seem to form a homogeneous class, as illustrated in English by the rather wordlike contracted auxiliary ’ve and the more affixal possessive ’s. Clitics cross-cut most word classes (see also Word Classes and Parts of Speech) and may appear as adverbs, pronouns, prepositions, auxiliary verbs, and the like. Apparently, the only word classes inaccessible to clitics are nouns and verbs, as well as adjectives (and this is because putative ‘clitic nouns’ and ‘clitic verbs’ would be treated separately as incorporations; see Linguistics: Incorporation).

Apart from the historical studies, much of the literature on clitics in the first three-quarters of the twentieth century was pretheoretical in nature insofar as investigators struggled to understand the fundamentals of what a clitic is. Subsequent debates were often couched in particular theoretical frameworks or focused exclusively on certain kinds of clitics (especially pronominal clitics). Although numerous generalizations have been established, no real consensus had resulted in a single theory of clitics by the end of the twentieth century, probably because clitics do not constitute a uniform phenomenon. To supplement such argumentation, future research on clitics will have to incorporate evidence from areas such as psycholinguistic experimentation (see also Psycholinguistics: Overview) and language disorders as well as acquisition of language (see also First Language Acquisition: Cross-linguistic), and will have to better integrate studies on the historical evolution of clitics. But above all, the study of clitics will continue to assist scholars in defining and delimiting related areas such as phonology, morphology, and syntax.

See also: First Language Acquisition: Cross-linguistic; Grammaticalization; Linguistics: Comparative Method; Linguistics: Incorporation; Morphological Case in Linguistics; Morphology in Linguistics; Phonology; Possession (Linguistic); Syntax; Valency and Argument Structure in Syntax; Word Classes and Parts of Speech; Word, Linguistics of; Word Order

Bibliography

Bauer L 1988 Introducing Linguistic Morphology. Edinburgh University Press, Edinburgh, UK
Carstairs A 1981 Notes on Affixes, Clitics, and Paradigms. Indiana University Linguistics Club, Bloomington, IN
Jespersen O 1922 Language, Its Nature, Development and Origin. Allen and Unwin, London
Katamba F 1993 Morphology. St Martin’s Press, New York
Nevis J 2000 Clitics. In: Booij G, Lehmann C, Mugdan J, Skopeteas S (eds.) Morphology: An International Handbook on Inflection and Word Formation. de Gruyter, Berlin, Article 41
Nevis J A, Joseph B D, Wanner D, Zwicky A 1994 Clitics: A Comprehensive Bibliography 1892–1991. Benjamin, Amsterdam
van Riemsdijk H (ed.) 1999 Clitics in the Languages of Europe. de Gruyter, Berlin
Spencer A 1991 Morphological Theory: An Introduction to Word Structure in Generative Grammar. Blackwell, Oxford, UK
Steele S 1977 On the count of one. In: Juilland A (ed.) Studies Presented to Joseph Greenberg. Anma Libri, Saratoga, CA
Zwicky A 1977 On Clitics. Distributed by the Indiana University Linguistics Club, Bloomington, IN

J. A. Nevis

Closed and Open Systems: Organizational

Closed and open systems refer to whether organizational entities, such as groups and organizations, are viewed as relatively closed or open to their environment. Such a perspective has a profound influence on how organizations are described and studied. When treated as closed systems, organizations are unaffected by their environment, and attention is directed inward to internal structures and behaviors. As open systems, organizations are interdependent with their environment, and focus is outward to how such interaction is managed.

Knowledge of closed and open systems derives from the broad framework of general systems theory (GST), which seeks to explain the structure and behavior of complex wholes called systems (e.g., von Bertalanffy 1956, Miller 1978). This broad metatheory is based on related research from the physical, biological, and social sciences, and seeks to discover general laws which apply to all levels of systems from single cells to societies.

This article shows how GST applies to organizations. It describes the systemic properties of organizations as closed and open systems, and explains how they are structured and managed.

1. Organizations as Systems

Organizational scholars use GST to describe the general properties of organizational systems. This includes defining their constituent parts and how they are interrelated to form a system. It also involves identifying different levels of organizational systems and explaining how they interact with each other.

1.1 Organizational Parts, Relationships, and Wholes

A key premise of GST has to do with the definition of a system and how it forms an organized whole. A system is composed of parts and relationships among them. The system provides the framework or organizing principle for structuring the parts and relationships into an organized whole. This makes systems capable of behaving in ways that are greater than merely the sum of the behaviors of their parts, thus leading to the common adage: ‘the whole is greater than the sum of its parts.’ Organizational scholars have expended considerable effort in identifying the constituent members or subunits of organizational systems and examining relationships among them. They have sought to discover the organizing principles through which the parts are arranged into a coherent whole. For example, job design researchers have identified different elements of jobs; they have shown how they can be combined to affect employee motivation. Group dynamics scholars have spent considerable time addressing issues of group membership and member interaction. They have discovered different ways of structuring groups to perform tasks that members could not do working alone. They have shown how, under certain conditions, groups can outperform individuals, thus leading to various high-performing group designs, such as self-managing teams, quality circles, and cross-functional teams. Similarly, organization theorists have identified the different components of organizations and examined relations among them. They have found different ways to organize the components and relationships for competitive advantage.

1.2 Levels of Organizational Systems

A second principle of GST has to do with the multilevel nature of systems. Systems exist at different levels; the levels exhibit a hierarchical ordering, with a higher level of system being composed of systems at lower levels. For example, societies are composed of organizations; organizations are composed of groups; groups are composed of individuals; and so on. Higher-level systems provide constraints and opportunities for how a system organizes its parts, and the nature of those parts affects the system’s organizing possibilities. Thus, to describe a system at a particular level and explain its behavior, it is necessary to look both upward to the higher level system within which it is embedded and downward to its constituent parts.

This multilevel perspective has led scholars to identify different levels of organizational systems, and to focus on understanding them and how they interact with each other. Considerable attention has been directed at specifying appropriate levels of analysis, both for conceptualizing about organizational systems and for aggregating and disaggregating data that apply
to different levels. For example, scholars have developed analytical guides for aggregating data from individuals to devise measures of group and organization functioning. As researchers have developed more comprehensive theories and more powerful analytical methods, they have made finer distinctions among levels of organizational systems, particularly above the organization level. This has led to at least five levels of organizational systems:
(a) individual member, role, or job;
(b) group;
(c) organization;
(d) population of organizations and/or alliance among organizations; and
(e) community of populations and/or community of alliances.

2. Organizations as Closed Systems

Closed systems do not interact with the environment, and consequently their behavior depends largely on the internal dynamics of their parts. They seek to maintain a steady state or equilibrium among their parts while performing goal-directed behaviors. Because the environment is inconsequential for goal achievement, system behavior is highly specified and maximally controlled within the system.

Until the late 1960s, organizational scholars tended to employ a closed-system perspective for studying organizations. Organizational environments were seen as relatively simple and predictable, and thus were not problematic or significant for how organizations behaved. Attention was directed at the internal dynamics of groups and organizations, particularly at how member behavior was controlled to achieve specific objectives and goals. This led to extensive knowledge of organizational control mechanisms and a search for the best way to structure organizational systems. For example, at the individual level, job design researchers discovered how to analyze and maximally specify the most efficient way to perform various tasks. At the group level, researchers showed how group structures and processes could contribute to member conformity to group norms. At the organization level, scholars studied a variety of devices for controlling member behavior, such as managerial hierarchy, rules/procedures, and functional design.

3. Organizations as Open Systems

In the late 1960s, organizational scholars began to broaden their focus to external forces affecting organizational systems. This open systems view was fueled by growing applications of GST to the social sciences and by the realization that the behavior of organizational systems could not be adequately explained without examining environmental relationships and their effects on the system (e.g., Aldrich 1979, Buckley 1967, Katz and Kahn 1966, Lawrence and Lorsch 1967, Scott 1981, Thompson 1967). It has led to considerable research about organizational environments, their dynamics and effects, and how organizational systems interact with them. An open systems perspective draws attention to how organizations exchange information and resources with their environment, and how the two mutually influence each other. Moreover, it provides a number of useful concepts for understanding how organizations maintain functional autonomy while influencing and adapting to external forces.

3.1 Critical Functions

To survive and prosper, open systems need to perform at least four critical functions:
(a) transformation of inputs of energy and information to produce useful outputs;
(b) transaction with the environment to obtain needed inputs and to dispose of outputs;
(c) regulation of system behavior to achieve stable performance; and
(d) adaptation to changing conditions.

Because these different functions often place conflicting demands and tension on open systems, system viability depends on maintaining a dynamic balance among them. Organizational researchers have devoted considerable time to identifying and explaining how these four functions operate and contribute to organizational survival and effectiveness. This has led to knowledge about how organizations and groups produce products and services through operating and developing different technologies; how they protect their technologies from external disruptions while acquiring raw materials and marketing finished products; how they regulate themselves for stable performance while initiating and implementing innovation and change. This research defines a key role of management in organizational systems as sustaining a dynamic balance among these four functions, one that allows the organization sufficient stability to operate rationally yet requisite flexibility to adapt to changing conditions.

3.2 Information and Resource Flows

As open systems, organizations seek to sustain a cycle of activities aimed at taking in inputs of information and resources from the environment, transforming them into outputs of goods and services, and exporting them back to the environment. This cycle enables organizations to replenish themselves continually, so long as the environment provides sufficient inputs and the organization delivers valued outputs. Considerable research has gone into understanding how organizations manage these information and resource flows. One perspective focuses on how organizations process information to learn how to improve themselves and to relate to their environments. Organizations must be capable of learning from their experiences and of disseminating such knowledge widely if they are to change themselves to respond to emerging conditions. Consequently, research has been directed at organizational learning and knowledge management, particularly at discovering mechanisms to enhance learning capability, such as shared databases, groupware, and decision-support systems. Another view concentrates on how organizations compete for resources through managing key resource dependencies. Attention has been directed at how organizations gain access to resources without becoming overly dependent on those who supply them. Still another perspective focuses on how organizations gain legitimacy from environmental institutions, so they can continue to function with external support. Researchers have sought to identify the institutional demands of powerful resource providers, and how organizations respond to them.

3.3 Boundaries

In managing information and resource flows, organizations, like all open systems, seek to establish boundaries around their activities. These organizational boundaries must be sufficiently permeable to permit necessary environmental exchange, yet afford the organization adequate protection from external demands to allow for rational operation. Organizational scholars have devoted considerable attention to understanding the dual nature of organizational boundaries. They have studied various boundary-spanning roles that relate the organization to its environment, such as sales, public relations, and purchasing. They have examined how organizational members perceive and make sense out of environmental input, and how organizational boundaries vary in sensitivity to external influences. Researchers have also identified different strategies for protecting transformation processes from external disruptions while remaining responsive to suppliers and customers.

3.4 Self-regulation

Viewed as open systems, organizations use information about how they are performing to modify future behaviors. Referred to as cybernetics, this information feedback enables organizations to be self-regulating (e.g., Ashby 1956). They can adjust their behavior to respond to deviations in expected performance. To be effective, however, organizations must have sufficient diversity of responses to match the variety of disturbances encountered. Thus, as technologies and environments become more complex and uncertain, organizations seek to become more flexible and nimble. They can enhance their adaptive capabilities through such innovations as downsizing, employee empowerment, and re-engineering.

Extensive research has been devoted to understanding how organizations regulate themselves. Using modern information technology, organizations have developed a variety of methods for setting goals, obtaining information on goal achievement, and making necessary changes. For example, database technologies enable organizations to store large batches of information and to logically connect events, actions, and outcomes. Employees can access and revise this information through on-line, common databases.

3.5 Equifinality

As open systems, organizations display the property of equifinality. They can achieve objectives with varying inputs and in different ways. Consequently, there is no one best way to design and manage organizations, but there are a variety of ways to achieve effective performance. This contrasts sharply with a closed system perspective which seeks the one best way to structure and control organizations.

Organizational researchers have devoted considerable attention to identifying different choices for designing and managing organizations. This has led to a virtual revolution in new organizational designs aimed primarily at making organizations leaner, more flexible, and more responsive to human resources. For example, traditional bureaucratic structures that emphasize efficiency and control have increasingly been supplanted with designs that emphasize flexibility and innovation, such as matrix organizations, horizontal organizations, network organizations, and virtual organizations. Moreover, researchers have developed contingency theories which specify under what technological, environmental, and human conditions different organizational designs are likely to be most successful. For example, these newer, more innovative designs are best suited to situations where technologies are complex, environments are unpredictable, and people have high growth needs.

4. Promising Trends

Organizational researchers have recently applied systems concepts to understand how organizations can adapt to rapidly changing, unpredictable environments. They have borrowed heavily from complexity theory, which seeks to explain the behaviors and changes that can occur when the parts of complex systems interact (e.g., Holland 1995, Brown and Eisenhardt 1998).
Such systems tend to be self-organizing in response to environmental feedback; they can change in nonlinear and dynamic ways and can invent entirely new responses to external forces. These concepts provide a dynamic, change-oriented perspective on organizations. They help to explain how organizations can restructure themselves continually to keep pace with fast-changing environments.

See also: Bureaucracy and Bureaucratization; Bureaucratization and Bureaucracy, History of; Leadership in Organizations, Psychology of; Leadership in Organizations, Sociology of; Organization: Overview; Organizational Climate; Organizational Decision Making; Organizations and the Law; Organizations, Sociology of

Bibliography
Aldrich H E 1979 Organizations and Environments. Prentice-Hall, Englewood Cliffs, NJ
Ashby W R 1956 An Introduction to Cybernetics. Wiley, New York
Brown S L, Eisenhardt K M 1998 Competing on the Edge. Harvard Business School Press, Boston
Buckley W 1967 Sociology and Modern Systems Theory. Prentice-Hall, Englewood Cliffs, NJ
Holland J H 1995 Hidden Order: How Adaptation Builds Complexity. Addison-Wesley, Reading, MA
Katz D, Kahn R L 1966 The Social Psychology of Organizations. Wiley, New York
Lawrence P R, Lorsch J W 1967 Organization and Environment. Harvard University Press, Boston
Miller J G 1978 Living Systems. McGraw-Hill, New York
Scott W R 1981 Organizations: Rational, Natural, and Open Systems. Prentice-Hall, Englewood Cliffs, NJ
Thompson J O 1967 Organizations in Action. McGraw-Hill, New York
von Bertalanffy L 1956 General systems theory. General Systems 1: 1–10

T. G. Cummings

Co-constructivism in Educational Theory and Practice

Ever since Piaget’s dynamically Kantian epistemology, it has been widely accepted as a pervasive assumption that learning is a constructive process. In contrast to the epistemological assumption of empiricism that what we know is a direct reflection of ontological reality, learning is considered as an active construction of knowledge. Learners, as they strive to make sense of their world, do not passively receive stimulus information matching independent physical structures, but genuinely interpret their experience by (re)organizing their mental structures in increasingly sophisticated ways, while interacting with the physical and symbolic environment. According to Piaget and most of his successors in cognitive, developmental, and educational psychology, this process of adaptive and viable reality construction is enabled and constrained both by biologically grounded structures (the strength and scope of which, however, are not yet well known) and by the already existing preknowledge (concepts, operative schemas, and structures) of the individual.

Even though the constructivist assumption makes some traditional problems in both psychology and education easier to solve, it also raises some new ones. An important problem is how we can think of achieving intersubjectivity. How can individuals who personally construct their knowledge independently of each other come to the same or similar cognitive structures? How can we share a knowledge of our culture if people are conceived of as being solo learners, and socially isolated Robinson Crusoe figures?

One striking answer, which at the same time challenges traditional (Western) epistemological constructivism, stems from symbolic interactionist (Mead 1934) and sociocultural theory (Vygotsky 1962). It claims that learning is fundamentally a social activity. Learning and enculturation are not bounded by the individual brain or mind but are intrinsically social endeavors, embedded in a society and reflecting its knowledge, perspectives, and beliefs. People construct their knowledge, not only from direct personal experience, but also from being told by others and by being shaped through social experience and interaction. The basis of personal development and enculturation, thus, is not the socially isolated construction of knowledge, but its co-construction in a social and cultural space. Or, as Bruner puts it: ‘Most learning in most settings is a communal activity, a sharing of the culture. It is not just that the child must make his knowledge his own, but that he must make it his own in a community of those who share his sense of belonging to a culture’ (Bruner 1986, p. 86). Knowledge, from this perspective, is no longer seen as solely residing in the head of each individual, but as being distributed across individuals whose joint interactions and negotiations determine decisions and the solution of problems.

1. Concept and Process of Co-construction

No precise and widely accepted definition of the concept and process of co-construction can be found in psychological or educational literature. What has been provided is very diverse and depends on the theoretical context in which it is embedded. Differences can be found with regard to at least three aspects:


(a) the social type of discourse eligible to be called co-constructive: mother–child dialog, peer interaction, teacher–student interaction, learning in teams, computer-supported collaborative work;
(b) the psychopedagogical processes involved in productive co-constructive activity: productive dialog such as exploratory talk and collective argumentation, collaborative negotiation after sociocognitive conflict or as a process of reciprocal sense-making, joint construction of a shared understanding, elaboration on mutual knowledge and ideas, giving and receiving help, tutoring and scaffolding;
(c) the expected outcomes of collaboration: taken-as-shared individual vs. socially shared cognitions; convergence and intersubjectivity; academic task fulfillment, student motivation, and conceptual development; effects on skills in listening, discussion, disputation, and argumentation.

Common to most theoretical contexts of co-constructivism is the implication of some kind of collaborative activity and, through joint patterns of awareness, of seeking some sort of convergence, synthesis, intersubjectivity, or shared understanding, with language as the central mediator. Theorists, moreover, largely converge in the adopted methodology of microgenetic analysis that has been used to examine the inherently fragile processes of co-construction.

1.1 (Neo-)Piagetian Perspective

In a (neo-)Piagetian framework, true dialog becomes possible and facilitates the individual cognitive construction of operatory structures when children are able to take other persons’ points of view into consideration and when they are able to resolve sociocognitive conflict. Although regarded (by the early Piaget) as a developmental factor, social interaction (specifically, peer interaction) remains more of a catalyst for individual cognitive development. According to studies carried out by co-workers of Piaget (e.g., Doise and Mugny 1984, Perret-Clermont et al. 1991), social factors, such as the need to deal with conflicting perspectives, can have a productive impact on cognitive behavior. For example, in a Piagetian conservation task, pupils more easily progressed to a subsequent level of development after having been confronted by contradictory judgments given by an adult or another child.

1.2 (Neo-)Vygotskyan context

In Vygotsky’s cultural-historical view of development as a process of meaningful appropriation of culture, the interactive foundation of the cognitive is at the core of the developmental process. In contrast to Piaget’s view, however, ‘the constructivist principle of the higher mental functions lies outside the individual – in psychological tools (such as ‘language’) and interpersonal relations’ (Kozulin 1998, p. 15). According to Vygotsky’s claim that interpersonal interactions on a social plane serve as prototypes for intrapersonal processes, i.e., for functions to be internalized, co-construction can be seen as (asymmetrical) adult–child interaction, or interaction between a child and a more capable peer, in the ‘zone of proximal development.’ ‘What a child can do today in cooperation, tomorrow he will be able to do on his own’ (Vygotsky 1962, p. 87). The quality and development of higher order thinking is prepared by the co-constructive patterns and distinctive properties of social interaction. Meaningful new learning emerges by embedding mental functions (like logical argumentation, proof, reflection, or problem solving) into specific forms of goal-directed interaction and dialog, where more knowledgeable individuals tailor a task in such a way that a child can successfully coperform it. The acquisition of a new concept or mental function becomes progressively more skillful as the child learns to respond in gradually more sophisticated and personally more meaningful ways to the co-constructive, sense-mediating context of adult regulations, and eventually takes over responsibility for his or her own learning.

1.3 Perspective of Situated and Socially Shared Cognition

Situated learning theory views human cognition as being embedded in and inseparable from specific sociocultural contexts. The goal of learning is to enter a community of practice and its culture, i.e., to learn, like an apprentice, to use tools as practitioners use them (Brown et al. 1989). As a process, learning takes place through the interaction and transaction between people and their environments. Co-construction, from a situated cognition perspective, can be seen as having two or more individuals collaboratively construct a shared understanding, or a solution to a problem, which neither partner entirely and necessarily possesses beforehand (Chi 1996). In a widely quoted definition proposed by Roschelle and Teasley (1995): ‘Collaboration is a coordinated, synchronous activity that is the result of a continued attempt to construct and maintain a shared conception of a problem’ (p. 70).

At the heart of this concept of co-construction are two coexisting activities: collaboratively solving the problem, and constructing and maintaining a joint problem space. Both activities require constant negotiations and recreations of meaning, i.e., trying to find out what can reasonably be said about the task in hand, and occur in structured forms of conversation and discourse utilizing language and physical actions as their most important mediators and resources. With the use of symbolic tools, it becomes possible for the conversants to express and objectify meanings, to


compare and change them deliberately, to exchange and renegotiate them with others, and to reflect on the organization of judgments and arguments (see van Oers 1996). However, as observational studies show, co-constructive learning is hardly a homogeneous but an inherently fragile process in the service of convergence and mutual intelligibility. The achievement of a shared conceptual structure cannot be reliably predicted, nor does the iterative construction of a joint problem space through cycles of displaying, confirming and repairing occur by simply putting two students together. As Roschelle and Teasley (1995, p. 94) remark:

Students’ engagement with the activity sometimes diverged and later converged. Shared understanding was sometimes unproblematic but oftentimes troublesome. The introduction of successful ideas was sometimes asymmetric, although it succeeded only through coordinated action. These results point to the conclusion that collaboration does not just happen because individuals are co-present: individuals must make a conscious, continued effort to coordinate their language and activity with respect to shared knowledge.

1.4 Context of Discourse Linguistics: Grounding

From the perspective of communication or conversation analysis, co-constructive or collaborative learning requires individuals to establish, maintain, and update some degree of mutual understanding. The basic process by which this is accomplished between individuals is called grounding (Clark and Brennan 1991). Grounding as a basic form of collaboration means the moment-by-moment coordination and synchronization of the content-specific as well as the procedural aspects and steps of co-constructive activity. There is no need, however, to fully ground every aspect of an utterance. Clark and Brennan (1991, p. 148) frame a pragmatic criterion for grounding: The conversants ‘mutually believe that they have understood what [they] meant well enough for current purposes.’ Thus, the techniques that are used for grounding are shaped by the goal and the medium of communication. That is, the criterion of grounding and the techniques exploited for its maintenance dramatically change according to the purpose of communication (e.g., planning a party, swapping gossip, or gaining deep understanding) and the constraints of its medium (copresence and visibility in face-to-face communication; sequentiality and reviewability in letter communication, e-mail, or computer-supported collaborative work).

1.5 Pedagogical Context of Tutoring

Another aspect of concern for the social nature of learning (and for a crucial way in which it is supported by culture) is instructional dialog or conversation. This term refers to a discursive activity in classrooms that permits the co-construction of meaning between teachers and students, tutors and tutees, the more and the less experienced. Consistent with Vygotsky’s theory of the constructive role played by adults in children’s acquisition of knowledge, the teacher’s goal of assistance can be seen as trying to get the students to share his or her understanding and knowledge. However, because of the asymmetrical distribution of knowledge between teachers and students, understanding might be expected to be less jointly constructed in instructional conversation than it is observed to be in peer-cooperative dialog. Actions that tutors or teachers can take in order to elicit responses, including some co-constructive behavior from a tutee, are, for example, described in literature on reciprocal teaching and on cognitive apprenticeship (Collins et al. 1989). They can be subsumed under two broad categories: (a) modeling, scaffolding, and fading as content-specific ways of providing hints, strategies, and situational forms of coaching and guidance that are tailored to the needs of individual students; and (b) prompting as a more content-neutral invitation by the tutor to elicit elaborations, reflections, and self-explanations from students (Chi 1996).

2. Pedagogical Facilitation of Co-construction

The question of how best to support co-constructive learning is concerned with the design of effective collaborative learning environments. Much empirical work has addressed the conditions under which productive collaborative interaction is most likely to occur, and a whole range of possible ways to enhance its quality has been provided. Among the input characteristics that exert a complex influence upon the quality of interaction are: the preparation of the students for collaborative learning (including training for cooperation and discourse prior to the collaborative learning event), the establishment of a culture of dialog and of problem-based learning, group characteristics (composition, size, ability and sex), the goal and incentive structure of the task, and the structuring of group interaction (see, for a review, Webb and Palincsar 1996).

2.1 Importance of Dialog

Probably the most important single feature of a culture of collaborative learning is dialog as opposed to, e.g., solo learning and teacher monologs. Emphasis on joint learning and instructional conversation among peers, and between teachers and students, is associated with the internal mediating processes that are essential for an understanding of how co-construction through discourse operates and influences outcomes. The pedagogical cultivation of processes such as negotiation of meaning, reciprocal sense-making, revising one’s cognitions in situations of sociocognitive conflict, precise verbalization of reasoning and knowl-


edge, listening to others’ lines of argumentation, tuning one’s own information to that of a partner, giving and receiving help, or modeling cognitive and metacognitive activities to be internalized by the participating individuals should, thus, be placed at the core of instructional design.

2.2 Support Structures for Collaborative Thinking and Problem Solving

A means of improving the quality of collaborative thinking is explicit process-related and task-related support structures in the learning environment (Pauli and Reusser 2000). Process-related support structures refer to the structuring of the interaction through the implementation of scripts for collaboration, such as ‘reciprocal teaching’, ‘scripted cooperation’, or ‘prompting’ for questions and elaborations. These techniques have in common the fact that a set of cognitive and metacognitive strategies which have to be used in a prescribed way is provided. A complementary way of supporting collaborative learning is to provide students with task-specific support structures and help. The main goal of task-related assistance, including more or less explicit instructions, domain-specific formats of task representation, and the modeling of strategies, is to scaffold students’ domain knowledge construction, understanding, and skill acquisition. What is not yet clear, however, is how much structuring of the interaction is actually beneficial. Ideally, the quality and quantity of guidance and help has to be adjusted to the learners’ subjective needs. As Cohen (1994) has pointed out, overstructuring interaction may be counterproductive and have detrimental effects, if it ‘prevent[s] students from thinking for themselves and thus gaining the benefits of the interaction’ (p. 22).

One promising possibility for making environments more supportive for collaborative learning is to enrich learning situations with technology. Well-designed computer-based cognitive tools provide users with both process-related and task-related instruments of thought and communication. As mediational resources and cognitive tools for the representation, negotiation, and modeling of concepts and activities, educational software has the potential, by making conceptual structures and processes visible, accessible, and manipulable on a computer screen, to facilitate processes of sharing understanding and of achieving convergence and intersubjectivity (Reusser 1993).

2.3 Structuring the Role of Teachers

The role of teachers in the co-constructive activities of learners can be described within the didactic framework of ‘cognitive apprenticeship’ (Collins et al. 1989). According to the ethnographic model in which practices and principles of traditional craftsmanship are applied to cognitive learning activities, teachers, experts, or more capable peers provide guidance and support to learners as they participate as apprentices in authentic and task-related, structured social interactions. As opposed to a transmissionist view of instruction, teachers should provide aid in the intellectual development of students in ways that leave room for negotiation and joint expansion of meaning: (a) as scaffolds and role models for the behavior that students are expected to engage in; (b) as active participants in learning groups aiming at shaping the group’s dialog; (c) as monitors of co-constructive norms in social interactions in which negotiation of taken-as-shared meaning is essential (Webb and Palincsar 1996); (d) as advocates of content-specific standards and of the achievement of convergence and intersubjectivity in understanding and problem solving.

Associated with this shift in the pedagogical orientation of teachers is a shift in the role of learners and the organization of classrooms. In the wake of a view that sees learning essentially as sociocultural interaction, classrooms should develop from aggregations of solo learners to communities engaged in co-constructive learning. That is, individuals should become acculturated members of a culture and community through collaboration and negotiation. Or, as Bruner (1986, p. 123) has put it: culture as ‘a forum for negotiating and renegotiating meaning and for explicating action ... is constantly in process of being recreated as it is interpreted and renegotiated by its members.’

See also: Cooperative Learning in Schools; Piaget, Jean (1896–1980); Piaget’s Theory of Human Development and Education; Situated Cognition: Contemporary Developments; Situated Cognition: Origins; Vygotskij, Lev Semenovic (1896–1934); Vygotskij’s Theory of Human Development and New Approaches to Education

Bibliography
Brown J S, Collins A, Duguid P 1989 Situated cognition and the culture of learning. Educational Researcher 18(1): 32–42
Bruner J S 1986 Actual Minds, Possible Worlds. Harvard University Press, Cambridge, MA
Chi M T H 1996 Constructing self-explanations and scaffolded explanations in tutoring. Applied Cognitive Psychology 10: 10–49
Clark H H, Brennan S E 1991 Grounding in communication. In: Resnick L B, Levine J, Teasley S D (eds.) Perspectives on Socially Shared Cognition. American Psychological Association, Washington, DC pp. 127–49
Cohen E G 1994 Restructuring the classrooms: Conditions for productive small groups. Review of Educational Research 64: 1–35
Collins A, Brown J S, Newman S E 1989 Cognitive apprenticeship: Teaching the crafts of reading, writing, and mathematics. In: Resnick L B (ed.) Knowing, Learning, and In-


struction: Essays in the Honor of Robert Glaser. Erlbaum, Hillsdale, NJ
Doise W, Mugny G 1984 The Social Development of the Intellect. Pergamon Press, Oxford
Kozulin A 1998 Psychological Tools. A Sociocultural Approach to Education. Harvard University Press, Cambridge, MA
Mead G H 1934 Mind, Self, and Society. University of Chicago Press, Chicago
Pauli C, Reusser K 2000 Cultivating students’ argumentation and reasoning in solving mathematical text problems through the use of a computer tool: a video-based analysis of dialogues. Research Report. University of Zurich Institute of Education (http://www.didac.unizh.ch)
Perret-Clermont A-N, Perret J-F, Bell N 1991 The social construction of meaning and cognitive activity in elementary school children. In: Resnick L B, Levine J, Teasley S D (eds.) Perspectives on Socially Shared Cognition. American Psychological Association, Washington, DC pp. 41–62
Reusser K 1993 Tutoring systems and pedagogical theory: Representational tools for understanding, planning, and reflection in problem-solving. In: Lajoie S P, Derry S (eds.) Computers as Cognitive Tools. Erlbaum, Hillsdale, NJ pp. 143–78
Roschelle J, Teasley S D 1995 The construction of shared knowledge in collaborative problem solving. In: O’Malley C E (ed.) Computer-supported Collaborative Learning. Springer, Berlin pp. 69–97
van Oers B 1996 Learning mathematics as meaningful activity. In: Steffe L P, Nesher P, Cobb P, Goldin G A, Greer B (eds.) Theories of Mathematical Meaning. Erlbaum, Mahwah, NJ pp. 91–113
Vygotsky L S 1962 Thought and Language. Harvard University Press, Cambridge, MA
Webb N M, Palincsar A S 1996 Group processes in the classroom. In: Berliner D C, Calfee R C (eds.) Handbook of Educational Psychology. Macmillan, New York pp. 841–73

K. Reusser

Code switching: Linguistic

Code-switching (CS) refers to the mixing, by bilinguals (or multilinguals), of two or more languages in discourse, often with no change of interlocutor or topic. Such mixing may take place at any level of linguistic structure, but its occurrence within the confines of a single sentence, constituent, or even word, has attracted most linguistic attention. This article surveys the linguistic treatment of such intrasentential switching.

In combining languages intrasententially, various problems of incompatibility may arise. The most obvious derive from word order differences: under what conditions, if any, can the boundary between constituents ordered differently in two languages host a switch? Other potential combinatorial difficulties involve mismatches in grammatical categories, subcategorization patterns, morphology, and idiomatic expressions. Systematic examination of the spontaneous speech of bilinguals resident in a wide range of communities suggests, however, that speakers generally manage to circumvent these difficulties. CS tends not to produce utterances that contain monolingually ungrammatical sentence fragments. Discovery of the mechanisms enabling such ‘grammatical’ CS is the major goal of current research. Central questions include locating permissible switch sites and ascertaining the nature (hierarchical or linear, variable or categorical) of the constraints on switching.

1. Background

Though CS is apparently a hallmark of bilingual communities world-wide, it has only begun to attract serious scholarly attention in the last few decades. Researchers first dismissed intrasentential code-switching as random and deviant (e.g., Weinreich 1953/1968) but are now unanimous in the conviction that it is grammatically constrained. The basis for this conviction is the empirical observation that bilinguals tend to switch intrasententially at certain (morpho)syntactic boundaries and not at others. Early efforts to explain these preferences proceeded by proscribing certain switch sites, for example, between pronominal subjects and verbs or between conjunctions and their conjuncts. However, these particular sites were soon reported to figure among the regular CS patterns of some bilingual communities.

The first more general account of the distribution of CS stemmed from the observation that CS is favored at the kinds of syntactic boundaries which occur in both languages. The equivalence constraint (Poplack 1980) states that switched sentences are made up of concatenated fragments of alternating languages, each of which is grammatical in the language of its provenance (see also Muysken 2000). The boundary between adjacent fragments occurs between two constituents that are ordered in the same way in both languages, ensuring the linear coherence of sentence structure without omitting or duplicating lexical content.

That general principles, rather than atomistic constraints, govern CS is now widely accepted, though there is little consensus as to what they are or how they should be represented. Much current research assumes unquestioningly that the mechanisms for language switching follow directly from general principles of (monolingual) grammar. Theories based on this assumption tend to appeal to such abstract grammatical properties as inter-constituent relationships (e.g., government, case assignment) and/or language-specific features of lexical categories (i.e., subcategorization of grammatical arguments, inherent morphological features).

Since Klavans’s (1985) proposal that CS was constrained by structural relations, the formal linguistic theories successively in vogue have each been extended to encompass the data of CS. Di Sciullo et al. (1986), for example, identified the relevant relations as C-


command and government: CS cannot occur where a government relation holds. Replacement of the function of government in standard theory by the notion of feature agreement led to a parallel focus on feature matching in CS studies. The Functional Head Constraint (Belazi et al. 1994) adds language choice to the features instantiated in functional and lexical categories, prohibiting CS where a mismatch occurs. A more recent Minimalist proposal (MacSwan 1999) restricts CS at structural sites showing cross-language differences in monolingual features.

This distinction between lexical and functional categories is not new to CS research. It is a hallmark of theories invoking the complement structure of individual lexical items to characterize permissible CS sites (e.g., Joshi 1985) and its sequel, the Null Theory of CS (Santorini and Mahootian 1995; see also Bentahila and Davies’s Subcategorisation Constraint 1983). Perhaps the most detailed model involving the contrast between lexical properties and functional (or ‘system’) morphemes is the Matrix Language Frame model (Azuma 1993, Myers-Scotton 1993). Here, structural constraints on CS result from a complex interaction between a dominant matrix language and the prohibition against embedding ‘system’ morphemes from the ‘embedded’ language in matrix language structure.

The assumption that bilingual syntax can be explained by general principles inferred from the study of monolingual grammar has not yet been substantiated. While formal theories of grammar may account well for monolingual language structure, including that of the monolingual fragments in CS discourse, there is no evidence to suggest that the juxtaposition of two languages can be explained in the same way. Bilingual communities exhibit widely different patterns of adapting monolingual resources in their code-mixing strategies, and these are not predictable through purely linguistic considerations. The equivalence constraint, as formalized by Sankoff (1998), is a production-based explanation of the facts of CS, which incorporates the notions of structural hierarchy and linear order, and accounts for a number of empirical observations in addition to the equivalent word order characterizing most actual switch sites. These include the well-formedness of the monolingual fragments, the conservation of constituent structure, and the essential unpredictability of CS at any potential CS site. The mechanisms of monolingual and bilingual grammars are not assumed a priori to be identical.

2. Evaluating CS Theories

There has been remarkably little cross-fertilization among CS theories; indeed, each has been greeted with a host of counter-examples. Testing the fit of competing models against the data of CS should be a straightforward matter since they often make competing predictions. But their disparate assumptions, goals, and domains of application have hindered such efforts. Assessment of the descriptive adequacy of a theory of CS requires that at least two methodological issues be resolved. One involves classification of other-language phenomena, the other, confronting the predictions of the theory with the data of actual bilingual behavior.

It is uncontroversial that CS differs from the other major manifestation of language contact: lexical borrowing. Despite etymological identity with the donor language, established loanwords assume the morphological, syntactic, and often, phonological, identity of the recipient language. They tend to be recurrent in the speech of the individual and widespread across the community. The stock of established loanwords is available to monolingual speakers of the recipient language, who access them normally along with the remainder of the recipient-language lexicon. Loanwords further differ from CS in that there is no involvement of the morphology, syntax, or phonology of the lexifier language.

Recent research has shown that borrowing is actually much more productive than implied above (see the papers in Poplack and Meechan 1998). In particular, the social characteristics of recurrence and diffusion are not always satisfied. This results in what has been called, after Weinreich (1953/1968), nonce borrowing (Sankoff et al. 1990). Like its established counterpart, the nonce loan tends to involve lone lexical items, generally major-class content words, and to assume the morphological, syntactic, and often, phonological identity of the recipient language. Like CS, on the other hand, nonce borrowing is neither recurrent nor widespread, and necessarily requires a certain level of bilingual competence. Distinguishing nonce borrowings from single-word CS is conceptually easy but methodologically difficult, especially when they surface bare, giving no apparent indication of language membership.

The classification of lone items is at the heart of a fundamental disagreement among CS researchers over (a) whether the distinction between CS and borrowing should be formally recognized in a theory of CS, (b) whether these and other manifestations of language contact can be unambiguously identified in bilingual discourse, and (c) criteria for determining whether a given item was switched or borrowed. Researchers who consider lone other-language items to be CS tend to posit an asymmetrical relationship, in which one language dominates and other-language items are inserted (e.g., Joshi 1985, Myers-Scotton 1993). Where the class of CS is (in the first instance) limited to unambiguous multiword fragments, both languages are postulated to play a role (Belazi et al. 1994, Sankoff 1998, Woolford 1983). Muysken (2000) admits the possibility of both strategies.

The appropriateness of data is also relevant to evaluating CS theories. The literature on CS largely is


characterized by the ‘rule-and-exception’ paradigm. Despite the onslaught of counter-examples provoked by successive CS theories, very few have in fact been tested systematically against the data of spontaneous bilingual usage. Instead, both the theories and tests of their applicability tend to be based on isolated examples, drawn from judgments, informant elicitation, and linguist introspection. The relation between such data and actual usage is not known; nor do they permit us to distinguish between the recurrent and systematic patterns of everyday interaction and examples which may be judged ‘acceptable’ in some sense, but which rarely or never occur.

The equivalence constraint has been verified as a general tendency in Spanish-English (Poplack 1980), Finnish-English (Poplack et al. 1987), Arabic-French (Naït M’Barek and Sankoff 1988), Tamil-English (Sankoff et al. 1990), Fongbe-French and Wolof-French (Meechan and Poplack 1995), Igbo-English (Eze 1998), French-English (Turpin 1998) and Ukrainian-English (Budzhak-Jones 1998) bilingual communities. But most of the voluminous literature on CS, especially of the ‘insertional’ type, is based on data which represents, properly speaking, lexical borrowing. As only the grammar and word order of the recipient language are pertinent to borrowing, attempts to understand the structure of CS based on a mixture of borrowing and true CS (e.g., Myers-Scotton 1993 and many others) appear unwieldy or descriptively inadequate.

3. Identifying the Results of Language Contact

Insofar as CS and borrowing are based on some principled combination of elements of the monolingual (i.e., unmixed) vernaculars of the bilingual community, it is important to have as explicit an idea as possible of the nature of these vernaculars before concluding that a code-mixed element is behaving like one or the other. The analysis of code-mixing as a discourse mode requires access to the grammars of the contact languages as they are spoken, and spoken language is characterized by structural variability. In confronting, rather than evading this variability, Sankoff et al. (1990) and Poplack and Meechan (1998) developed a method to compare bilingual structures with the unmixed source languages of the same speakers. Making use of the framework of linguistic variation theory (Labov 1969), the inherent variability of such forms is used to determine their status. If the rate and distribution of, for example, case-marking of the contentious lone other-language items show quantitative parallels to those of their counterparts in the (unmixed) recipient language, while at the same time differing from relevant patterns in the donor language, the lone other-language items are inferred to be borrowed, since only the grammar of the recipient language is operative. If they pattern with their counterparts in the (unmixed) donor language, while at the same time differing from the patterning in the unmixed recipient language, the lone other-language items must result from CS.

Quantitative analysis of language mixing phenomena in typologically distinct language pairs shows that lone other-language items, especially major-class content words, are by far the most important component of mixed discourse. These lone items show the same fine details of quantitative conditioning of phonological, morphological, and syntactic variability as dictionary-attested loanwords, both of which in turn parallel their unmixed counterparts in the recipient language (Poplack and Meechan 1998). This tendency is apparent regardless of the linguistic properties of the language pair. This is evidence that most lone items are borrowed, even if only for the nonce, despite the lack, in some cases, of dictionary attestation or diffusion within the community.

4. Future Directions

Lack of consensus characterizing the discipline is related to a number of methodological problems. Foremost among them is failure to distinguish code-switching from other types of language mixture, which, despite similarities in surface manifestation, are fundamentally different mechanisms for combining languages. The current state of knowledge suggests that borrowing, nonce or established, is the major manifestation of language contact in most bilingual communities. Its linguistic structure is well accounted for in the traditional language contact literature (Weinreich 1953/1968). Intrasentential CS involving multiword fragments of two or more languages is also attested in some communities. Achievement of consensus on an empirically verifiable characterization of the rules for juxtaposing these fragments within the sentence remains an important goal for CS research. Fit between theories and data could be improved by a broader empirical base. This would permit researchers to situate bilingual behavior with respect to the monolingual vernaculars implicated in language mixing, account for the disparate CS strategies that have evolved in different bilingual communities, and distinguish among incommensurable manifestations of bilingual language contact.

See also: Bilingual Education: International Perspectives; Bilingualism and Multilingualism; Bilingualism: Cognitive Aspects; First Language Acquisition: Cross-linguistic; Second Language Acquisition; Sociolinguistics

Bibliography

Azuma S 1993 The frame-content hypothesis in speech production: Evidence from intrasentential code switching. Linguistics 31: 1071–93


Belazi H M, Rubin E J, Toribio A J 1994 Code-switching and X-bar theory: The functional head constraint. Linguistic Inquiry 25(2): 221–37
Bentahila A, Davies E E 1983 The syntax of Arabic-French code-switching. Lingua 59(4): 301–30
Budzhak-Jones S 1998 Against word-internal code-switching: Evidence from Ukrainian-English bilingualism. International Journal of Bilingualism 2(2): 161–82
di Sciullo A-M, Muysken P, Singh R 1986 Government and code-mixing. Journal of Linguistics 22(1): 1–24
Eze E 1998 Lending credence to a borrowing analysis: Lone English-origin incorporations in Igbo discourse. International Journal of Bilingualism 2(2): 183–201
Joshi A K 1985 Processing of sentences with intrasentential code-switching. In: Dowty D R, Karttunen L, Zwicky A M (eds.) Natural Language Parsing: Psychological, Computational and Theoretical Perspectives. Cambridge University Press, Cambridge, UK, pp. 190–204
Klavans J L 1985 The syntax of code-switching: Spanish and English. In: King L D, Maley C A (eds.) Selected Papers from the XIIIth Linguistic Symposium on Romance Languages. Benjamins, Amsterdam, pp. 213–21
Labov W 1969 Contraction, deletion, and inherent variability of the English copula. Language 45(4): 715–62
MacSwan J 1999 A Minimalist Approach to Intrasentential Code Switching. Garland Press, New York
Meechan M, Poplack S 1995 Orphan categories in bilingual discourse: Adjectivization strategies in Wolof-French and Fongbe-French bilingual discourse. Language Variation and Change 7(2): 169–94
Muysken P C 2000 Bilingual Speech: A Typology of Code-mixing. Cambridge University Press, Cambridge, MA
Myers-Scotton C 1993 Duelling Languages. Clarendon Press, Oxford, UK
Naït M’Barek M, Sankoff D 1988 Le discours mixte arabe/français: des emprunts ou des alternances de langue? Revue Canadienne de Linguistique 33(2): 143–54
Poplack S 1980 Sometimes I’ll start a sentence in Spanish Y TERMINO EN ESPAÑOL: Toward a typology of code-switching. Linguistics 18(7/8): 581–618
Poplack S, Meechan M (eds.) 1998 Instant Loans, Easy Conditions: The Productivity of Bilingual Borrowing; Special Issue of the International Journal of Bilingualism. Kingston Press, London
Poplack S, Wheeler S, Westwood A 1987 Distinguishing language contact phenomena: Evidence from Finnish-English bilingualism. In: Lilius P, Saari M (eds.) The Nordic Languages and Modern Linguistics. University of Helsinki Press, Helsinki, pp. 33–56
Sankoff D 1998 A formal production-based explanation of the facts of code-switching. Bilingualism: Language and Cognition 1(1): 39–50
Sankoff D, Poplack S, Vanniarajan S 1990 The case of the nonce loan in Tamil. Language Variation and Change 2(1): 71–101
Santorini B, Mahootian S 1995 Code-switching and the syntactic status of adnominal adjectives. Lingua 96: 1–27
Turpin D 1998 “Le français c’est le last frontier”: The status of English-origin nouns in Acadian French. International Journal of Bilingualism 2(2): 221–33
Weinreich U 1953/1968 Languages in contact. Mouton, The Hague
Woolford E 1983 Bilingual code-switching and syntactic theory. Linguistic Inquiry 14(3): 519–36

S. Poplack

Coeducation and Single-sex Schooling

Coeducation or single-sex schooling does not only refer to an organizational form of separating or mixing girls and boys, but also refers to such questions as: should there be different goals, curricula, rights, and outcomes for the two genders? Equality is not perceived in all countries and not always as an important principle for men and women. For a long time, education was oriented to prepare girls and boys for their different spheres in adult life. Today, equality is the goal, but it is not at all clear whether separation or coeducation is the better way to reach this goal.

1. A Brief History of Coeducation

The history of educational systems shows, for most countries, that especially for education beyond elementary levels, single-sex schools have been the preferred form, although coeducation was not unusual. Financial restrictions in mass education forced coeducational schooling, but for ideological reasons, separation was preferred by the authorities. While this is mainly true for Europe, in the United States, soon after establishing public schools, coeducation became the norm. David Tyack and Elisabeth Hansot describe the beginning as ‘smuggling in the girls’ which led to the adoption of coeducation by gradually moving from ‘why to why not’ (Tyack and Hansot 1992, p. 47). Even in secondary education, only 12 cities out of 628 reported that they had single-sex high schools at the end of the nineteenth century. European secondary education was for boys only, especially in Germany; girls received secondary education only on a private basis and were not allowed to attend higher education at all. Prussian universities enrolled women, but not before 1908.

Democratic movements in all countries valued equality of education and therefore called for coeducation. The debates emphasized the question of difference or similarity of women and men as human beings. While the assumption of similarity historically was associated with the preference for coeducation, those who postulated differences were divided into advocates and opponents of coeducation. The advocates thought coeducation would ensure that girls and boys themselves would make sure they behaved like girls and boys. The opponents feared that girls and boys together would encourage the loss of their engendered behavior.

In most European countries, the years between 1960 and 1980 brought big educational reforms, and coeducation was one of those (Wilson 1991, p. 203), although it was more of a side-effect of the reforms, and astonishingly, after the heated discussions in the first half of the twentieth century, a change nearly without debate. Efforts to establish ‘schools for all’


would not go along with separation. Poland has had equal opportunity programs since 1965; Sweden since the late 1960s; and most other European countries since the beginning of the 1970s, except for Greece (mid-1970s) and Spain (1980s). Today all European countries have coeducation with the exception of Ireland and, to a lesser degree, England and Wales. Some countries still have single-sex classes in some subjects, such as Physical Education and Crafts.

In the US, with its long tradition of coeducation, Title IX of the Education Amendments of 1972 was the major legal tool in implementing equal education by stating that discrimination on the basis of sex is illegal in any educational program receiving federal funding. This already made clear that coeducation does not necessarily guarantee equality and the absence of sexism.

2. New Women’s Movement and Coeducation

‘Sexism’ was the new term to mark inequality for women and men in society. Evaluations of the educational reforms showed differences in attainment between the countries, between social groups and between genders. A close look at processes in the educational system reveals successes as well as failures: women participate much more in higher education but there is under-representation of young women in some educational fields, especially in vocational education and in science. At the same time, women’s self-confidence remains at lower levels than men’s. The new Women’s movement criticized the outcomes of coeducation, and showed that it put women at a disadvantage. Beginning in the 1980s, the critics started a new debate on coeducation. It provided a new research area, although one has to admit that the basis of the literature, even in the late 1990s, was still rather thin; the reports are mostly ‘anecdotal’ (AAUW 1998).

3. Research about Coeducation and Single-sex Schooling

Compilations of research show the advantages and disadvantages of coeducation. The Organization for Economic Co-operation and Development (OECD) published in 1986 a report on ‘Girls and Women in Education,’ and the American Association of University Women (AAUW) released their report, ‘How Schools Shortchange Girls,’ in 1992. There are at least two fields in which research found relevant gender inequalities:

(a) Coeducation retains the gender-hierarchical division of the world. It strengthens gender-specific interests and seldom encourages thinking and behavior against the traditional gender-stereotypes. The school curriculum serves as a ‘hidden curriculum’ to ensure those processes.

(b) Classroom interaction processes show unbalanced communication structures. Dominance and forced attention for boys are tolerated; self-effacement by and disregard of girls remain unnoticed.

The central reason for this outcome lies in gender stereotypes and in the gender-division of labor: both are still to such a degree ‘normal’ that the ‘masculine dominance’ still remains effective, unquestioned by most people.

The hidden curriculum was a major theme in the OECD report, describing three forms of sex bias:

(a) Picture books, reading schemes and children’s literature are characterized by the lack of representation of women and female roles.

(b) The curriculum and textbooks in school tend to be either male-oriented or female-oriented rather than ‘bicultural.’

(c) The presentation of men and women in the curriculum shows a picture of the social world that is ‘more sexist than reality’ (OECD 1986, p. 75).

These results did force many of the political institutions responsible for the educational system to set up new criteria for school books and material. Admittedly, there have been some changes, although the situation still has not changed radically.

Quite a lot of mostly qualitative studies deal with the interaction process. Maggie Wilson summarizes their results:

Although teachers often deny that differential treatment of boys and girls exists in classrooms, case studies from Belgium, Spain, Sweden, Greece, West Germany and England and Wales chart the great amount of attention which teachers give young boys, albeit in the form of both praise and censure, and their overtly expressed preference for teaching boys, especially at the upper levels of school and in certain subject areas, such as in Science and Mathematics (Wilson 1991, p. 213).

Research in the USA is very much in line with these findings (Riordan 1990).

Florence Howe, and others, spoke of the ‘myth of coeducation.’ She wrote: ‘One of the central ideas of coeducation provides a central myth: that if women are admitted to men’s education and treated exactly as men are, then all problems of sexual equity will be solved’ (Howe 1984, p. X). The valuing of admittance as the only criterion for equality made us forget that there were other qualities in school education that proved to be unequal, e.g., the hidden curriculum and the interaction processes.

Although the AAUW report did not recommend single-sex schooling, it mentioned that research studies would indicate that girls often learn and perform better in same-sex work groups than in mixed-sex groupings (AAUW 1992, p. 130). This, together with many other publications, supported the remaining


women’s colleges in the USA and also helped to keep, and even to start, private girls’ schools. In Europe, especially in Germany, campaigns to enroll young women in science and technology subjects experimented with the separation of gender.

At the end of 1997, the AAUW organized a round table entitled ‘Separated by Sex: A Critical Look at Single-sex Education for Girls,’ in order to summarize the outcomes of the debate about coeducation and single-sex education since their 1992 report. The main results are consistent with research from other countries and could therefore be regarded as the state of the art for coeducation today:

(a) There is no significant correlation between either students’ self-concept or gender-stereotyping and coeducational or single-sex schooling.

(b) Students in single-sex schools rarely believe that mathematics and science are specifically male subjects, while students in coeducational schools believe this more often.

(c) These beliefs do not lead to differences in capacities between the students from the two types of schooling.

(d) Better results were found in the US single-sex schools only for ‘at-risk students,’ especially Spanish-American girls from low socioeconomic families. The differences are very small, however, and they are due to the academic orientation of these schools.

(e) Sexism could be found everywhere; not the separation of gender or coeducation but mainly the awareness of teachers is responsible for non-sexist environments.

(f) The majority of students wish to attend coeducational schools.

Altogether, the report called for more complexity in the research design as well as in the theorizing of gender. Both could help to deal with educational questions of gender equity more adequately than just criticizing coeducation and valuing separation.

4. Perspectives

Since the beginning of the new debate on coeducation, working with girls was added to working with boys about masculinity (Connell 1995), and, slowly but surely, is changing the coeducational classroom.

The research reports, as well as the practical experiences in different schools, show that single-sex education can help to deal with some of the problems of gender. In particular, courses dealing with gender stereotypes, and physical education programs that let girls and boys have new untypical experiences, can help them gain more self-respect and a better understanding of gender-processes. But it cannot be done just by separating girls and boys; it requires awareness and cautious acting!

Coeducational settings do need more sensitive reflections about what is going on in classrooms. Barrie Thorne analyzed the ‘gender play’ in schools and she found: ‘Gender boundaries are episodic and ambiguous, and the notion of “borderwork” should be coupled with a parallel term – such as “neutralization” – for processes through which girls and boys (and adults who enter into their social relations) neutralize or undermine a sense of gender as division and opposition’ (Thorne 1993, p. 84).

Further research, gender studies in programs for teaching credentials as well as teacher training for more awareness of interaction processes dealing with gender questions, would help to make education gender-sensitive and valuable for both girls and boys.

See also: Education and Gender: Historical Perspectives; Education (Higher) and Gender; Education (Primary and Secondary Schools) and Gender; Gender and School Learning: Mathematics and Science; Gender Differences in Personality and Social Behavior; Mathematical Education; Sex-role Development and Education

Bibliography

American Association of University Women Educational Foundation 1992 How Schools Shortchange Girls. A Study of Major Findings on Girls and Education. Marlowe, New York
American Association of University Women Educational Foundation 1998 Separated by Sex: A Critical Look at Single-sex Education for Girls. Washington, DC
Connell R W 1995 Masculinities. University of California Press, Berkeley, CA
Diller A, Houston B, Morgan K P, Ayim M 1996 The Gender Question in Education. Westview Press, Boulder, CO
Eder D, Evans C C, Parker S 1995 School Talk. Rutgers University Press, New Brunswick, NJ
Faulstich-Wieland H 1991 Koedukation - enttäuschte Hoffnungen? [Coeducation: disappointed hopes?] Wissenschaftliche Buchgesellschaft, Darmstadt, Germany
Faulstich-Wieland H, Horstkemper M 1995 ‘Trennt uns bitte, bitte nicht!’. Koedukation aus Mädchen- und Jungensicht. [‘Please don’t separate us!’ Coeducation as Girls and Boys see it]. Leske and Budrich, Opladen, Germany
Howe F 1984 Myths of Coeducation. Indiana University Press, Bloomington, IN
OECD 1986 Girls and Women in Education. A Cross-national Study of Sex Inequalities in Upbringing and in Schools and Colleges. Paris
Riordan C 1990 Girls and Boys in School. Together or Separate? Teachers College, Columbia University, New York
Thorne B 1993 Gender Play. Rutgers University Press, New Brunswick, NJ
Tyack D, Hansot E 1992 Learning Together. A History of Coeducation in American Public Schools. Russell Sage Foundation, New York
Wilson M (ed.) 1991 Girls and Young Women in Education. A European Perspective. Pergamon Press, Oxford, UK

H. Faulstich-Wieland

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7



Cognition, Distributed

Like all other branches of cognitive science, distributed cognition seeks to understand the organization of cognitive systems. Like most of cognitive science, it takes cognitive processes to be those that are involved in memory, decision making, inference, reasoning, learning, and so on. Also following mainstream cognitive science, it characterizes cognitive processes in terms of the propagation and transformation of representations.

What distinguishes distributed cognition from other approaches is the commitment to two related theoretical principles. The first concerns the boundaries of the unit of analysis for cognition. While boundaries are often a matter of tradition in a field, there are some general rules one can follow. Bateson (1972) says one should bound the unit so that things are not left inexplicable. This usually means putting boundaries on units where the traffic is low. The second principle concerns the range of mechanisms that may be assumed to participate in cognitive processes. While mainstream cognitive science looks for cognitive events in the manipulation of symbols (Newell et al. 1989), or more recently, patterns of activation across arrays of processing units (Rumelhart et al. 1986, McClelland et al. 1986) inside individual actors, distributed cognition looks for a broader class of cognitive events and does not expect all such events to be encompassed by the skin or skull of an individual.

When one applies these principles to the observation of human activity ‘in the wild,’ at least three interesting kinds of distribution of cognitive process become apparent: cognitive processes may be distributed across the members of a social group, cognitive processes may be distributed in the sense that the operation of the cognitive system involves coordination between internal and external (material or environmental) structure, and processes may be distributed through time in such a way that the products of earlier events can transform the nature of later events. The effects of these kinds of distribution of process are extremely important to an understanding of human cognition.

The roots of distributed cognition are deep, but the field came into being under its current name in the mid-1980s. In 1978, Vygotsky’s Mind in Society was published in English. Minsky published his Society of Mind in 1985. At the same time, Parallel Distributed Processing was making a comeback as a model of cognition (Rumelhart et al. 1986). The nearly perfect mirror symmetry of the titles of Vygotsky’s and Minsky’s books suggests that something special might be happening in systems of distributed processing, whether the processors are neurons, connectionist nodes, areas of a brain, whole persons, groups of persons, or groups of groups of persons.

1. Mind in Society

For many people, distributed cognition means cognitive processes that are distributed across the members of a social group (Salomon 1993). The fundamental question here is how the cognitive processes we normally associate with an individual mind can be implemented in a group of individuals. A wide range of disciplines in the social sciences has explored this question.

Treating memory as a socially distributed cognitive function has a long history in sociology and anthropology. Durkheim, and his students, especially Halbwachs (1925), maintained that memory could not even be coherently discussed as a property of an isolated individual. Roberts (1964) proposed that social organization could be read as a sort of architecture of cognition at the community level. He characterized the cognitive properties of a society (its memory capacity and ability to manage and retrieve information) by looking at what information there is, where it is located, and how it can move in a society. Schwartz (1978) proposed a distributional model of culture that emphasized the distribution of beliefs across the members of a society. Romney et al. (1986) created quantitative models of the patterns of cultural consensus. The identification of patterns raised the question of why such patterns form. Sperber (1985) introduced the idea of an epidemiology of representations. He suggested an analogy in which anthropology is to psychology as epidemiology is to pathology. In the same way that epidemiology addresses the distribution of pathogens in a population, anthropology should treat questions about the distribution of representations in a community. A similar set of developments followed from Dawkins’ (1976) discussion of ‘memes’ as the cultural analog of genes. These ideas have now coalesced in the field of memetics (Blackmore 1999). March and Simon (1958) argued that organizations can be understood as adaptive cognitive systems. Juries are an important class of distributed problem solving organization and they have been intensely studied by social psychologists (Hastie 1993). Of course, in social psychology there is a vast literature on small-group decision making, some of which discusses the properties of aggregates.

Scientific communities have received special attention because the work of science is fundamentally cognitive and distributed. The phenomena that have been explored include how the organization of communication media in a scientific community affects the kinds of things the community can learn (Thagard 1993), how conditions external to the individual scientists can affect their individual choices in ways that lead different high-level structures to emerge (Kitcher 1990), how the distribution of cognitive activity within social networks and between people and inscriptions accounts for much of the work of science (Latour 1987), and how scientific facts are


created by communities in a process that simply could not be fit into the mind of an individual (Fleck 1935).

Economists have been interested in the tension between what is individually rational and what is rational at the aggregate level. This theme has been explored in game theory under the rubric of the Prisoner’s Dilemma, the paradox of the commons, and other cases where individual rationality and group rationality diverge (Von Neumann and Morgenstern 1964).

Anthropologists and sociologists studying knowledge and memory, social psychologists studying small-group problem solving and jury decision making, organizational scientists studying organizational learning, philosophers of science studying discovery processes, and economists and political scientists exploring the relations of individual and group rationality, all have taken stances that lead them to a consideration of the cognitive properties of societies of individuals. There is ample evidence that the cognitive properties of a group can differ from the cognitive properties of the members of the group.

2. The Society of Mind

The work described above looks for mind-like properties in social groups. This is the Mind in Society reading. The metaphor can be run the other way as is done in Minsky’s Society of Mind (1985). Rather than using the language of mind to describe what is happening in a social group, the language of social groups can be used to describe what is happening in a mind.

Minsky argued that to explain intelligence we need to consider a large system of experts or agencies that can be assembled together in various configurations to get things done. Minsky also allowed that a high-level agency itself could be composed of low-level agencies. With Papert (Minsky and Papert 1988), he argued that the low-level agencies (the ones that take on ‘toy-sized problems’) could be implemented as distributed computations in connectionist nets. Minsky said, ‘… each brain contains hundreds of different types of machines, interconnected in specific ways which predestine that brain to become a large, diverse society of partially specialized agencies’ (1988). What this means of course is that the cognition of an individual is distributed cognition too.

A problem that remained unsolved by Minsky’s work was ‘how such systems could develop managers for deciding, in different circumstances, which of those diverse procedures to use’ (1988). That is, how can the relations among the agencies get organized to perform new functional skills? To solve this problem, Minsky and Papert invoked biological maturation. An alternative way to approach this problem is to note that each ‘society of mind’ resides and develops in a community of similar societies of mind. This means, of course, that both what’s in the mind, and what the mind is in are societies. Getting internal agencies into coordination with external structure can provide the organization of the relations between the internal agencies that is required to perform the new functional skill.

Vygotsky developed this idea of the social origins of individual psychological functions in Mind in Society (Vygotsky 1978, Wertsch 1985). Vygotsky argued that every high-level cognitive function appears twice: first as an interpsychological process and only later as an intrapsychological process. The new functional system inside the child is brought into existence in the interaction of the child with others (typically adults) and with artifacts. As a consequence of the experience of interactions with others, the child eventually may become able to create the functional system in the absence of the others. This could be seen in Minsky’s terms as a mechanism for the propagation of a functional skill from one society of mind to another. From the perspective of distributed cognition, this sort of individual learning is seen as the propagation of a particular sort of pattern through a community.

Cultural practices assemble agencies into working assemblages and put the assemblages to work. Some of these assemblages may be entirely contained in an individual, and some may span several individuals and material artifacts. The patterns of activity that are repeatedly created in cultural practices may lead to the consolidation of functional assemblages, the atrophy of agencies that are rarely used, and the hypertrophy of agencies that are frequently employed. The result can be individual learning or organizational learning, or both.

2.1 Interaction as a Source of Novel Structure

An important property of aggregate systems is that they may give rise to forms of organization that cannot develop in the component parts. Freyd (1983) argued that some of the features of language that are identified as linguistic universals could arise out of the necessity of sharing the linguistic code. For instance, the reason that linguistic categories tend to approximate discrete structures may have little to do with the organization of the brain, and everything to do with the problem of pushing a complex representation through a very narrow channel. As Minsky and Papert point out, symbols can be expected to arise where there are bottlenecks in communication. That means we should look for the origins of symbols at places where the information ‘traffic’ is relatively low, or at the boundaries of our various units of analysis.

The phenomena related to the social distribution of cognition are most often investigated using ethnographic methods. In some cases, however, simulation models may be used to test hypotheses about the behavior of such distributed systems. For example, Hutchins and Hazlehurst (1995) explored Freyd’s


ideas in a series of simulation models in which Consider an example from the world of ship
individuals (modeled by connectionist networks) in- navigation (Hutchins 1995). Navigators frequently
teract with one another. They developed a robust face the problem of computing the ship’s speed from
procedure in which a shared lexicon emerges from the distance traveled over a given period of time. If a ship
interactions of individuals. Hazlehurst and Hutchins travels 1,500 yards in 3 minutes, what is the speed of
(1998) demonstrated the emergence of reduced con- the ship in knots? There are many ways to solve this
ventional sequences of lexical items—which they take problem. Most readers of this article would probably
to be the beginnings of syntax. These conventional attempt to use a paper and pencil plus their knowledge
sequences arise only in the condition of negotiated of algebra to solve it. That procedure is effective, but
learning where the representing structures must sim- not nearly as efficient as the ‘so-called’ 3-minute rule.
ultaneously come to accurately represent the world An experienced navigator need only see the problem
and be shared among individuals, that is, be able to stated to see that the answer is 15 knots. The speed in
pass the communication bottleneck between individ- knots equals the number of hundreds of yards covered
uals. Representations that are learned inside an in- in 3 minutes. The use of this rule is a case of situated
dividual, without the requirement of sharing them seeing. The rule itself is an internal cognitive artifact.
with others, come to represent the world, but do not But suppose the ship covered 4,000 yards in 7 minutes?
show the reduced conventional code aspects that are For that problem a material artifact called the three-
the hallmarks of language and syntax. scale nomogram is more appropriate. A nomogram
By simultaneously considering the society of mind has three logarithmic scales: one each for distance,
and mind in society, the distributed cognition ap- time, and speed. If the values of any two variables in a
proach provides a new place to look for the origins of distance/rate/time problem are known, the other can
complexity. Phenomena that are not predictable from be determined by laying a straight edge on the
the organization of any individual taken in isolation nomogram so that it touches the known values. The
may arise in the interactions among individuals. Once straight edge will touch the third scale at the value of
having developed in this larger system, they may the answer. It is clear that cognitive work is being
become elements of cultural practices and thereby done, but it is also clear that the processes inside the
become available for appropriation by individuals. person are not, by themselves, sufficient to accomplish
This sort of scheme may be a partial solution to the the computation. A larger unit of analysis must be
paradox of how simple systems can lead to more considered. The skills of scale reading and inter-
complex ones. polation are coordinated with the manipulation of
objects to establish a particular state of coordination
between the straight edge and the nomogram. This is a
3. The Material Environment very different set of agencies than was involved in
doing the problem via algebra and paper and pencil. In
A second major thread in the fabric of distributed fact, the skills that are needed to use the nomogram are
cognition concerns the role of the material environ- the things that humans are good at: pattern matching,
ment in cognitive activity. Again, the question of manipulation of objects in the world, and mental
where to bound the unit of analysis arises. The simulation of simple dynamics (Rumelhart et al. 1986).
potential of the material environment to support A computation was performed via the manipulation
memory is very widely recognized. But the environ- of a straight edge and nomogram. The nomogram was
ment can be more than a memory. Cognitive activity is designed in such a way that the errors that were
sometimes situated in the material world in such a way possible in algebra are impossible when using the
that the environment is a computational medium. nomogram. It is essential to distinguish the cognitive
Cognitive artifacts are the Things that Make Us properties required to manipulate the artifact from the
Smart in the title of Norman’s (1993) book. The computation that is achieved via the manipulation of
notion that cognitive artifacts amplify the cognition of the artifact. This is a key point, and the failure to see
the artifact user is fairly commonplace. If one focuses it clearly has been the source of many difficulties in
on the products of cognitive activity, cognitive arti- cognitive science.
facts do seem to amplify human abilities. A calculator
seems to amplify one’s ability to do arithmetic, writing
down something one wants to remember seems to 4. Distributing Cognition in Time
amplify one’s memory. Cole and Griffin (1980) point
out that this is not quite correct. When I remember Simon (1998) offered a parable as a way of emphasiz-
something by writing it down and reading it later, my ing the importance of the environment for cognition.
memory has not been amplified. Rather, I am using a He argued that, as we watch the complicated move-
different set of functional skills to do the memory task. ments of an ant on a beach, we might be tempted to
Cognitive artifacts are involved in a process of attribute to the ant some complicated program for
organizing functional skills into cognitive functional constructing the path taken. In fact, Simon says, that
systems. trajectory tells us more about the beach than about the


ant. Similarly, in watching people thinking in everyday Dawkins R 1976 The Selfish Gene. Oxford University Press,
settings, we may be learning as much about their Oxford
environment for thinking as about what is inside them. Fleck L 1979 The Genesis and Development of a Scientific Fact.
The environments of human thinking are not ‘natural’ University of Chicago Press, Chicago
environments. They are artificial through and through. Freyd J 1983 Shareability: the social psychology of epistemology.
Cognitive Science 7: 191–220
They develop over time. The crystallization of partial
Gardner H 1985 The Mind’s New Science. Basic Books
solutions to frequently encountered problems in arti- Halbwachs M 1925 Les Cadres Sociaux de la Mémoire. Librairie
facts such as the 3-minute rule and the nomogram is a Félix Alcan, Paris
ubiquitous strategy for the stabilization of knowledge Hastie R 1993 Inside the Juror: The Psychology of Juror Decision
and practice. Humans create their cognitive powers in Making. Cambridge University Press, Cambridge, UK
part by creating the environments in which they Hazlehurst B, Hutchins E 1998 The emergence of propositions
exercise those powers. from the coordination of talk and action in a shared world.
Language and Cognitive Processes (special issue on Connectionist
Approaches to Language Development edited by Kim Plunkett).
2 & 3 April/May
5. Conclusion Hutchins E 1995 Cognition in the Wild. MIT Press, Cambridge,
MA
It does not seem possible to account for the cognitive Hutchins E, Hazlehurst B 1995 How to invent a lexicon: the
accomplishments of our species by reference to what is development of shared symbols in interaction. In: Gilbert N,
inside our heads alone. One must also consider the Conte R (eds.) Artificial Societies: The Computer Simulation of
cognitive roles of the social and material world. But, Social Life. UCL Press, London
how shall we understand the relationships of the social Kitcher P 1990 The distribution of cognitive labor. The Journal
and the material to cognitive processes that take place of Philosophy 87(1): 5–22
inside individual human actors? This is the problem Latour B 1987 Science in Action. Harvard University Press,
that distributed cognition attempts to solve. Cambridge, MA
According to Gardner (1985), a more or less explicit March J, Simon H 1958 Organizations. Wiley, London
decision was made in cognitive science to leave culture, McClelland J, Rumelhart D, PDP research group 1986 Parallel
context, history, and emotion out of the early work. Distributed Processing: Explorations in the Microstructure of
These were recognized as important phenomena, but Cognition. Vol 2: Psychological and Biological Models. MIT
their inclusion made the problem of understanding Press, Cambridge, MA
Minsky M 1985 Society of Mind. Simon and Schuster, Hemel
cognition too complex. The ‘Classical’ vision of
Hempstead, UK
cognition that emerged was built from the inside out, Minsky M, Papert S 1988 Perceptrons. MIT Press, Cambridge,
starting with the idea that the mind is a central logic MA
engine. From that starting point, it followed that Newell A, Rosenbloom P, Laird J 1989 Symbolic architectures
memory could be seen as retrieval from a stored for cognition. In: Posner M (ed.) Foundations of Cognitie
symbolic database, that problem solving is a form of Science. MIT Press, Cambridge, MA
logical inference, that the environment is a problem Norman D 1993 Things that Make Us Smart. Addison Wesley,
domain, and that the body is an input device (Clark Location
1996). Attempts to reintegrate culture, context, and Roberts J 1964 The self-management of cultures. In: Good-
history into this model of cognition have proved very enough V (ed.) Explorations in Cultural Anthropology: Essays
frustrating. The distributed cognition perspective in honor of George Peter Murdock. McGraw-Hill
aspires to rebuild cognitive science from the outside in, Romney A K, Weller S, Batchelder W 1986 Culture consensus:
beginning with the social and material organization of A theory of culture and informant accuracy. American
cognitive activity. Anthropologist 88(2): 313–38
Rumelhart D, McClelland J, PDP research group 1986 Parallel
Distributed Processing: Explorations in the Microstructure of
See also: Cognitive Science: Overview; Situated Cognition. Vol 1: Foundations. MIT Press, Cambridge, MA
Cognition: Origins Rumelhart D, Smolensky P, McClelland J, Hinton G 1986
Schemata and sequential thought processes in PDP models. In:
McClelland J, Rumelhart D, PDP research group (eds.)
Parallel Distributed Processing: Explorations in the Micro-
Bibliography structure of Cognition. Vol 2: Psychological and Biological
Bateson G 1972 Steps to an Ecology of Mind. Ballantine Books, Models. MIT Press, Cambridge, MA
Location Salomon G 1993 Distributed Cognitions. MIT Press, Cambridge,
Blackmore S 1999 The Meme Machine. Oxford University Press, MA
Oxford Simon H 1998 The Sciences of the Artificial 3rd edn. MIT Press,
Clark A 1996 Being There: Putting Brain, Body and World Cambridge, MA
Together Again. MIT Press, Cambridge, MA Schwartz T 1978 The size and shape of a culture. In: Barth F (ed.)
Cole M, Griffin M 1980 Cultural amplifiers reconsidered. In: Scale and Social Organization. Universitetesforlaget
Olson D (ed.) The Social Foundations of Language and Sperber D 1985 Anthropology and psychology: towards an
Thought. Norton, London epidemiology of representations. Man 20: 73–89


Thagard P 1993 Societies of minds: Science as distributed defined populations, first assessed at a particular life
computing. Studies in History and Philosophy of Science 24: stage, whether in early adulthood or in early old age.
49–67 Although descriptive studies often begin as cross-
Von Neumann J, Morgenstern O 1964 Theory of Games and
sectional inquiries, they are most frequently conducted
Economic Behavior. Wiley, Chichester
Vygotsky L 1978 Mind in Society: The Development of Higher as longitudinal analyses since the interest is often in
Psychological Processes. Harvard University Press, individual differences in intraindividual change, or in
Cambridge, MA the elucidation of typologies of individuals who follow
Wertsch J 1985 Vygotsky and the Social Formation of Mind. different growth trajectories. These are frequently
Harvard University Press, Cambridge, MA large-sample studies, and the use of correlational or
quasi-experimental approaches is typical (Baltes et al.
E. Hutchins 1999, Schaie 1996b).

2. Methodological Issues

Cognitive Aging 2.1 Age-comparative vs. Age Change Designs


Much of the experimental cognitive aging literature is
Cognitive aging is concerned with age-related changes based on age-comparative studies, which typically
in adulthood in the basic processes of learning and contrast a group of young adults (typically college
memory, as well as the complex higher order processes students) with convenience samples of community-
of language and intellectual competence or executive dwelling older adults in their sixties and seventies. It
functioning. Although most of the literature has been should be recognized that such comparisons are
concerned with explaining the mechanism of cognitive fraught with the problem that it is often unreasonable to
decline, there is also a substantial interest in issues assume that the two age groups can be adequately
such as compensation and the role of external support, matched for other status variables that might provide
including collaborative problem solving. a rival explanation for any observed age difference on
the dependent variable. This creates particular prob-
lems for identifying the mechanisms that may be
1. Definition of Cognitive Aging implicated in age-related decline from young adulthood
There have been two distinct traditions in the study of into old age. Age-comparative designs are also
cognitive aging. The first grew out of experimental inadequate in explaining individual differences in age
child psychology while the second derives from psy- changes. The latter can only be investigated by means
chometric roots. of longitudinal paradigms (Schaie 1965).

2.2 The Role of Response Speed


1.1 Experimental Study of Memory Functions and
Language A number of theorists have argued that changes in the
central nervous system are the primary common cause
The concern in this literature is to explicate possible for the observed age-related declines in cognitive
causal variables that would explain why many adults performance. In fact, there have been many published
suffer memory loss and decline in the complex ma- analyses that show a substantial reduction in age
nipulation of language variables such as text pro- differences, if some measure or measures of reaction
cessing. The typical approach here is to design ex- time or perceptual speed is partialled out of the relation
periments testing for the effects of single variables in between a given cognitive process and chronological
carefully controlled laboratory settings requiring only age (Salthouse 1999). This issue is of particular
limited numbers of subjects. Because there is often concern because it is not clear whether the observed
little interest in individual differences, or population average increase in reaction time (generally assumed
parameters, study participants are typically drawn to be of the magnitude of approximately 1.6 from the
from convenience samples (McKay and Abrams early twenties to the late sixties), while reliably
1996). demonstrable in the laboratory, is of significance in
many or most tasks of daily living.
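
The idea of "partialling out" speed from the relation between age and a cognitive measure can be illustrated with the standard first-order partial correlation formula. The sketch below is a minimal illustration, not the procedure of any particular study cited here; the function name and the three correlation values are hypothetical and chosen only to show the direction of the effect.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denominator


if __name__ == "__main__":
    # Hypothetical values, for illustration only (not data from the literature).
    r_age_cognition = -0.40   # older age, lower cognitive score
    r_age_speed = -0.50       # older age, lower perceptual speed score
    r_speed_cognition = 0.60  # faster speed, higher cognitive score

    adjusted = partial_corr(r_age_cognition, r_age_speed, r_speed_cognition)
    print(round(adjusted, 3))
```

With these made-up inputs the age-cognition correlation shrinks from -0.40 to roughly -0.14 once speed is controlled, which is the kind of reduction in age differences the analyses described above report.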
1.2 Descriptive Study of Adult Intellectual
Development 3. Basic Findings from the Experimental
Literature on Cognitive Aging
Descriptive studies of adult intellectual development
often stem from the longitudinal follow up of samples Most of this literature is cross-sectional in nature and
first assessed in childhood or adolescence. Other such usually consists of a comparison of convenience
studies may represent carefully stratified samples from samples of young adults (often sophomore psychology


students) and of community-dwelling older adults material that provides priming of association because
(often participants in adult education programs). The it contains learned semantically linked information
major findings regarding age differences in cognitive (McKay and Abrams 1996).
performance include the following.

4. Basic Findings from the Descriptive Literature


3.1 Memory on Age Changes in Intellectual Competence
It is currently thought that older persons are at a Changes in intellectual competence over the adult life
disadvantage in retrieving information from memory span have been studied primarily with either the
when the information to be retrieved is complex and Wechsler Intelligence Scale or with ability batteries
when there are few cues or other environmental derived from the Thurstonian Primary Mental Ability
support. Hence, age differences are far greater in recall framework.
than in recognition of information. It is also thought Distinctions are made between fluid abilities
that the magnitude of age difference in memory is far thought to be innate and crystallized abilities which
greater when a task involves effortful processing than involve the utilization of culturally acquired knowledge
when automatic processing is involved. Hence, greater (Cattell 1963). More recently further distinctions have
age differences have been found for explicit than been introduced between the mechanics (or basic
implicit memory. Older persons are also thought to processes) of intellectual competence and the prag-
have greater difficulty in integrating the context of matics that involve cultural mediation (Baltes et al.
information they are trying to remember. It is also 1984).
thought that working memory capacity (i.e., the What has been found in most longitudinal studies is
information kept in immediately accessible store) that the adult life course of mental abilities is not
becomes reduced with increasing age. On the other uniform. The so-called fluid abilities (sometimes de-
hand there is little evidence for age differences in long- fined as cognitive mechanics or primitives) peak in
term storage. Memory deficits occurring with age early midlife and begin to decline in the early sixties.
include nonverbal tasks such as memory for spatial By contrast, crystallized abilities, or the pragmatics of
location, memory for faces, and for actions and intellectual competence that represent abilities ac-
activities. Studies of prospective memory (i.e., re- quired in a given cultural context (particularly verbal
membering something to be done in the future) suggest abilities), do not usually peak until the fifties are
that older people do well in remembering simple and reached and begin to show significant decline only in
event-based tasks, but are at a disadvantage when the seventies and often show only minimal decline
tasks become complex or are time-based. In sum, it even in the eighties (Schaie 1996a). However, recent
appears that age differences are known to increase in work in advanced old age suggests increasing con-
magnitude as a function of the processing requirement vergence and steeper decline for both aspects of
of a given task (Salthouse 1999, Smith 1996). intellectual competence, probably caused by the in-
creasing decline of sensory and central nervous system
functions (Baltes and Lindenberger 1997, Baltes and
Mayer 1999).
3.2 Language
However, at any particular time, a cross-sectional
Age-related differences in language behavior are snapshot may yield very different ability profiles
closely related to the processes of encoding and because of the fact that subsequent population cohorts
retrieving verbal materials discussed above. But in reach different asymptotes in midlife. For example,
addition there appear to be greater age differences in there has been a positive linear cohort trend in the
textual tasks that involve recent connections than twentieth century for inductive reasoning, the basic
in those that involve recollection of older connections. component of most problem-solving tasks, while there
Language production also seems to be adversely has been a negative trend in numeric skills. The
affected in older persons under intense time pressure. magnitude of cohort differences in abilities since the
The interesting tip-of-the-tongue phenomenon involv- 1950s has been comparable to the average age changes
ing word-finding difficulty, however, seems to be more observed from young adulthood into the seventies.
likely with infrequently used words. Significant age Hence, many older persons may appear to have
differences have also been found in language planning, declined markedly in comparison to young peers, even
that is, in planning what one intends to say and how to though the age difference may be primarily due to
say it during language production. Hence, older what might be called the obsolescence of earlier
persons are more likely to engage in hesitations, false cohorts (Schaie 1996b).
starts, and repetitions. Age-linked deficits in story most persons have declined on some aspect of intellectual
recall are thought to be more of a general deficit in functioning from their own peak as the
connection formation than in specific communication sixties are reached, specific patterns of decline
ability. Older persons tend to benefit from textual sixties are reached, that specific patterns of decline


may well depend on complex patterns of individual life such training will also be effective in enhancing the
experience. Most healthy community-dwelling per- performance of young adults such that age differences
sons are able to maintain a high level of function until tend to remain robust (cf. Baltes and Kliegl 1992).
advanced old age (but see Baltes and Mayer 1999 for
the consequences of sensory dysfunctions). Because
most tasks of daily living represent complex com- 6. Other Related Topics in Cognitive Aging
binations of basic cognitive processes, many indiv-
iduals can maintain their abilities above the minimally Much of the work on cognitive aging in the past has
necessary threshold level for independent functioning been concerned with age-related development in the
by often rather complex compensatory processes (cf. mechanics and basic processes of cognition. It should
Baltes et al. 1984, Baltes et al. 1999). be recognized that current attention in the study of
cognitive aging is turning to the discovery of how these
basic processes operate within more complex domains.
5. Can Cognitive Aging be Slowed or Reversed? Of particular interest here are the study of wisdom
Once the course of adult intellectual development had (e.g., Baltes and Staudinger 1993, Sternberg 1990), the
been described and a number of antecedents of application of the basic processes to social cognition
individual differences had been identified, it then (e.g., Staudinger 1999), the development of expert
became useful for researchers to think about ways in systems (e.g. Charness and Bosman 1990), and in
which normal intellectual aging might be slowed or everyday problem solving (Willis 1996). The extensive
reversed. literature on these topics is beyond the scope of this
In a number of laboratories (primarily in the USA article.
and in Germany) cognitive training programs have See also: Aging and Health in Old Age; Aging,
been developed that have been applied in the lab-
Theories of; Aging Mind: Facets and Levels of
oratory, and more recently in cooperative multisite
intervention trials. In contrast to training young Analysis; Brain Aging (Normal): Behavioral, Cogni-
children, where it can be assumed that new skills are tive, and Personality Consequences; Differential
conveyed, older adults are likely to have access to the Aging; Ecology of Aging; Lifespan Theories of
skills being trained, but through disuse have lost their Cognitive Development; Memory and Aging, Cogni-
proficiency. Information from longitudinal studies is tive Psychology of; Memory and Aging, Neural Basis
therefore particularly useful in distinguishing indi- of; Old Age and Centenarians; Social Cognition and
viduals who have declined from those who have Aging; Spatial Memory Loss of Normal Aging:
remained stable. In the former, training is directed Animal Models and Neural Mechanisms
towards remediation of loss, while in the latter the
enhancement of previous levels of functioning is
sought with the intention of compensating for possibly Bibliography
cohort-based disadvantage of older persons.
Baltes P B, Dittmann-Kohli F, Dixon R A 1984 New per-
Results from such cognitive interventions allow the spectives on the development of intelligence in adulthood:
conclusion that cognitive decline in old age, for many Toward a dual process conception and a model of selective
older persons, is likely to be a function of disuse rather optimization with compensation. In: Baltes P B, Brim O G Jr
than of the deterioration of the physiological or neural (eds.) Lifespan Development and Behavior. Academic Press,
substrates of cognitive behavior. For example, a brief New York, Vol. 6, pp. 33–76
five-hour training program for persons over 65 re- Baltes P B, Kliegl R 1992 Further testing of limits of cognitive
sulted in average training gains of about one half SD plasticity: Negative age differences in a mnemonic skill are
on the abilities of spatial orientation and inductive robust. Developmental Psychology 28: 121–5
Baltes P B, Lindenberger U 1997 Emergence of a powerful
reasoning. Of those for whom significant decrement
connection between sensory and cognitive functions across
could be documented over a 14-year period, roughly the adult life span: A new window at the study of cognitive
40 percent were returned to the level at which they had aging. Psychology and Aging 12: 12–21
functioned when first studied. The analyses of struc- Baltes P B, Mayer K U (eds.) 1999 The Berlin Aging Study:
tural relationships among the ability measures prior to Aging from 70 to 100. Cambridge University Press, Cam-
and after training further allow the conclusion that bridge, UK
training does not result in qualitative changes in ability Baltes P B, Staudinger U M 1993 The search for a psychology of
structures, and is thus highly specific to the targeted wisdom. Current Directions in Psychological Science 2: 75–80
abilities. A seven-year follow up further demonstrated Baltes P B, Staudinger U M, Lindenberger U 1999 Lifespan
psychology: Theory and application to intellectual
that those subjects who showed significant decline at
functioning. Annual Review of Psychology 50: 471–507
initial training do remain at substantial advantage Cattell R B 1963 Theory of fluid and crystallized intelligence: A
over untrained comparison groups (Willis and Schaie critical experiment. Journal of Educational Psychology 54:
1994). It should be noted, however, that while cog- 1–22
nitive training may improve performance in the elderly Charness N, Bosman E A 1990 Expertise and aging: Life in the
and may function to reduce effects of age decrement, lab. In: Hess T H (ed.) Aging and Cognition: Knowledge


Organization and Utilization. Elsevier, Amsterdam, pp. chologists. This article will provide a brief overview of
343–85 their respective rationales, methods, and current appli-
McKay D G, Abrams L 1996 Language, memory, and aging: cations. It concludes with some speculations on future
Distributed deficits and the structure of new-versus-old
developments.
connections. In: Birren J E, Schaie K W (eds.) Handbook of
the Psychology of Aging, 4th edn. Academic Press, San Diego,
CA, pp. 251–65 1. Cognitive Psychotherapy
Salthouse T 1999 Theories of cognition. In: Bengtson V L,
Schaie K W (eds.) Handbook of Theories of Aging. Springer,
New York, pp. 196–208
1.1 Overview
Schaie K W 1965 A general model for the study of developmental All forms of cognitive therapy (CT) work from the
problems. Psychological Bulletin 64: 92–107 premise that common mental disorders are consequent
Schaie K W 1996a Intellectual Development in Adulthood, The to and/or maintained by faulty thinking (rather than
Seattle Longitudinal Study. Cambridge University Press, New
York
the reverse). Cognitive therapists, therefore, set out to
Schaie K W 1996b Intellectual functioning and aging. In: Birren identify cognitions associated with a target problem
J E, Schaie K W (eds.) Handbook of the Psychology of Aging, (such as depressed mood or hypochondriacal behav-
4th edn. Academic Press, San Diego, CA, pp. 266–86 ior) then use this analysis as the basis for an explicit
Schaie K W, Willis S L 1999 Theories of everyday competence. therapeutic plan. Treatments addressing these cog-
In: Bengtson V L, Schaie K W (eds.) Handbook of Theories of nitions can adopt a number of techniques, depending
Aging. Springer, New York, pp. 174–95 upon the problem addressed, the formulation of an
Smith A D 1996 Memory. In: Birren J E, Schaie K W (eds.) individual case, and the specific form of cognitive
Handbook of the Psychology of Aging, 4th edn. Academic psychotherapy favored by individual practitioners.
Press, San Diego, CA, pp. 236–50
Staudinger U M 1999 Social cognition and psychological ap-
proach to an art of life. In: Blanchard-Fields F, Hess T B 1.2 Rationale
(eds.) Social Cognition, Adult Development and Aging.
Academic Press, San Diego, CA, pp. 343–75 Although cognitive therapy has only gained currency
Sternberg R J 1990 Wisdom and its relation to intelligence and since 1970, its basic aims are not new. Attempts to
creativity. In: Sternberg R J (ed.) Wisdom: Its Nature, Origins, restore mental health by arguing sufferers out of their
and Development. Academic Press, New York, pp. 142–9 false beliefs underpinned much ‘moral therapy’ in
Willis S L 1996 Everyday problem solving. In: Birren J E, Schaie eighteenth-century asylums. As a modern movement,
K W (eds.) Handbook of the Psychology of Aging, 4th edn. CT emerged from behavior therapy (qv). This had its
Academic Press, San Diego, CA, pp. 287–307
theoretical rationale in learning theory (qv), derived
Willis S L, Schaie K W 1994 Cognitive training in the normal
elderly. In: Forette F, Christen Y, Boller F (eds.) Plasticité from experimental manipulation of contingent responses
cérébrale et stimulation cognitive. Fondation Nationale de in laboratory animals. Behavioral treatments
Gérontologie, Paris, pp. 91–113 had sought behavioral change through functional
analysis of target symptoms. Behavioral theory paid
K. W. Schaie little attention to the ‘black box’ of the mind and any
mediating role it played between environmental stimu-
lus and behavioural response. Cognitive therapy de-
Cognitive and Interpersonal Therapy: veloped in reaction to this denial of the importance of
thought, but was helped by clinical evidence that
Psychiatric Aspects thinking could actively obstruct the progress of behav-
ioral treatment unless it was explicitly attended to. The
Cognitive and interpersonal therapies are each struc- behavioral and cognitive approaches to psychological
tured psychological treatments. They differ consider- treatment have retained many common features,
ably in their rationale and therapeutic procedures, but including emphasis on explicit formulation, and an
both have been subject to research since their in- empirical and collaborative approach. Behavioral and
ception. In recent years, each has been demonstrated cognitive techniques may be combined within a treat-
to offer effective symptomatic help for a range of ment, and the close relationship between the two is
specific mental disorders. This has facilitated their reflected in the designation of ‘cognitive-behavior
adoption in publicly or insurance funded health therapy’ or ‘CBT’ for much work that remains
services, where major changes have been taking place essentially cognitive.
in the pattern of therapeutic provision and in the Cognitive therapists have used a variety of models
training of mental health professionals. In English- to account for how cognitive processes contribute m
speaking countries, where psychiatrists trained in psychopathology. While these can be impressive in
psychological treatments were, even 10 years ago, their orderliness and ingenuity, and can be of great
more likely to offer psychoanalytic psychotherapy heuristic value in practice, they nearly always derive
than any other kind, provision of cognitive or inter- from clinical experience. At the same time, inde-
personal therapy is increasingly common. Cognitive pendent support for such models has been sought
therapy is also provided frequently by clinical psy- from experimental psychology. Few cognitive findings


in emotional disorders remain robust when stringently tested, although a good deal of evidence exists that selective preference for negative memories is common in people with depressed mood, and that irrational expectations of future danger are more common in people prone to anxiety (Williams et al. 1990).

The development of cognitive therapy has gone through a succession of stages of increasing theoretical complexity. These have accompanied a tendency, shared with other maturing psychotherapies, to follow early successes with relatively straightforward cases by attempts to deal with those that are more resistant to simple measures. While the theoretical ramifications of this are beyond the scope of the present article, it is helpful to distinguish between two kinds of cognition that have been seen as pathogenic: ‘surface’ and ‘deep.’ Surface cognitions are available to introspection, transient, and situation specific. Deep cognitions are less easy to access, enduring, and more global in scope.

The most influential package of models informing therapeutic practice derives from Beck and co-workers (Beck 1989). As applied to common disorders in which the main features are anxiety symptoms or depression, cognitive changes lead to evident differences in appraisal of a ‘cognitive triad’ of three interlinked areas. These are someone’s automatic attitudes concerning their world, themselves, and what is likely to happen. Beck terms surface cognitions of these kinds ‘automatic thoughts.’ While their content is colored by the particulars of an individual’s experience and personal values, some themes have been demonstrated as common to people having a particular kind of emotional disorder. For instance, when anxiety is marked, automatic thoughts about future danger are common; with depression, personal helplessness and perceived inability to change their personal circumstances; with hostility, thoughts that the world is a bad or very unreliable place. Beck identifies two other kinds of cognitive pathology that accompany these developments. One is a set of ways in which cognitive processes frequently are distorted. These ensure that appraisals based on automatic thinking are likely to be confirmed. Examples include making of arbitrary or personal inferences, selective abstraction and overgeneralization from experience, and all or nothing (‘dichotomous’) thinking. At the level of deep cognitions, Beck recognizes the presence of ‘schemas’ as constellations of assumptions and beliefs which, being more latent but enduring cognitions that can be reactivated by events, are associated with predisposition to emotional disorders. The main functional relationships between these elements are summarized in Fig. 1.

Figure 1
A map of problem cognitions

While a good deal of reasoning in cognitive therapy represents extensions or amendments to this framework, a significant conceptual departure came with revision of the concept of the ‘schema.’ Used to refer to a variety of kinds of deep cognition, the term’s adoption by Young (1994) to refer to ‘early maladaptive schemas’ (EMS) has had important practical implications. An EMS is not only latent and enduring, but presumed to result from experiences early in life. It is rigid in restricting the scope of thinking and likely to be associated with very strong affect if challenged. Because of this emotional valence, EMSs can appear to be quasi-autonomous, acting to preserve themselves (schema maintenance). There can be a preference for executive functions that either fail to activate the schema (schema avoidance) or that disguise it behind displays of contrary traits (schema compensation). EMSs’ resistance to change means they cannot be inferred straightforwardly from behavior or simple enquiries.

1.3 Methods

In general, therapeutic strategies aim to counteract cognitive distortions by teaching patients skills by which they can recognize and revise problematic cognitions, as well as working to challenge specific current cognitions and to establish a more adaptive way of thinking. Essentially, this will involve techniques for the identification of surface and deep cognitions and their sequelae that are not otherwise amenable to simple introspection, and for the modification of beliefs and cognitive distortions so encountered. Although this account will concentrate on interventions directed at cognitions, both of these tasks are carried out in practice through the prescription of behavioral experiments and exercises. For instance, when working with people subject to panic


attacks who fear they may die from a heart attack during a panic episode, the assumption that physical symptoms such as dizziness, breathlessness, and awareness of accelerated heartbeat indicate that collapse and death are imminent is addressed by a behavioral procedure. In order to break the link between these experiences and expectations, a patient agrees to hyperventilate under controlled conditions until these sensations are induced. The strength of their beliefs concerning the consequences of the hyperventilation exercise would be rated and recorded at its outset and its conclusion, as part of a therapeutic examination of the validity of the expectations in the light of a disconfirmatory experience. This example illustrates the general principle that, even when using a behavioral manipulation, a cognitive therapist would insist it subserves a cognitive goal, and that attention is paid to its cognitive impact throughout.

Measures to identify automatic thoughts include analysis of situations in which a target problem occurs. This is unlikely to be successful through generalized recall: specific instances need to be examined in detail. These might be recounted in the session; lived through between sessions after a patient is instructed on how to maintain a detailed and contemporaneous record of associated negative thoughts; or reconstructed as a patient has a shift of affect in a session. Other techniques such as role-plays or induction of imagery may be employed in sessions to stimulate automatic thoughts. Measures intended to facilitate reattribution expose not only the irrationality of automatic thoughts but their incompatibility with experience. The importance of active testing in modifying them, alongside regular and explicit reappraisal of the validity of the thoughts, has already been referred to. Patients and therapists are likely to collaborate in drawing up a set of alternative explanations whose fit with the facts of experience can then also be tested.

As experience of cognitive techniques has developed, there has been a progressive shift of interest away from factors leading simply to inaccurate appraisal of situations, to those by which faulty cognitions, and problems associated with them, are maintained (Salkovskis 1991). These may be automatic thoughts associated with plans of action that perpetuate a problem by protecting a faulty cognition from challenge (safety behaviors) or attentional shifts that have a similar impact through distraction. Interventions that address these maintaining factors directly can allow cognitive techniques to be effective where attention to errors in appraisal alone would not.

Deep cognitions in the form of schemas are not only more difficult to identify than surface cognitions, but harder to attempt to change therapeutically. Identification needs a more probing and hypothetical approach in the course of guided discovery by use of techniques such as the ‘vertical arrow.’ This is a reiterative questioning of patients’ suppositions about the implications of a negative idea being the case, until a fundamental and general belief about themselves emerges. Attempts to change schemas would normally only be made when there had been some progress with emotional symptoms. As the previous description of the kinds of concealment and resistance associated with EMSs implies, schema-focused work is likely to be more complicated and prolonged than work with automatic thoughts. Because schemas may embody templates for how someone relates to others, and be associated with manifest relationship problems, therapeutic work with them is likely to resemble psychodynamic psychotherapy (qv) more closely than other forms of cognitive therapy (Safran and Segal 1990).

1.4 Applications

Unlike psychoanalytic psychotherapies, cognitive psychotherapy has developed through its application to conditions in which symptoms or problematic habits are associated with specific patterns of thinking. Its growth since 1970 has been accompanied by progressive developments in psychiatric nosology. The diagnostic and statistical manuals of the American Psychiatric Association chart diagnostic developments which have involved the progressive refinement of diagnoses based on anxiety, depression, psychosis, and distortions of personality, with the effective invention of categories to encompass the so-called somatoform, dissociative, adjustment, and eating disorders (APA 1968, 1980, 1994). These developments have facilitated understanding of the common cognitive patterns associated with these disorders and the development of specific treatment techniques. Use of cognitive therapy in depression and anxiety disorders is best established, while diagnostic subclassifications that have clarified the characteristics of bipolar depressive disorder and panic disorder have been followed by specific cognitive techniques for their management. The category of personality disorders has not only been refined through this period, but, with the advent of multiaxial classification, been designated as an independent axis for summary clinical description (axis II). Given that the basic difference between disorders of personality and the symptom-focused categories of axis I lies in their early onset and relative stability, models of their cognitive pathology have emphasized ‘deep’ cognitions over surface ones, and more recently the early maladaptive schema model of Young. Efforts to link specific personality types with consistent schema formations continue. A further development of particular significance to psychiatrists has been the application of cognitive therapy to schizophrenia. This has included treatments for specific symptoms (delusions and hallucinations) as well as measures to live with the impact of illness and enhance coping capacities (Chadwick et al. 1996).

The vast majority of cognitive therapy is provided as individual therapy. However, group treatments


have been pioneered for specific disorders including depressive, anxiety, and eating disorders. Beck has advocated its use with couples experiencing conflict.

More recently, attempts have been made to differentiate between psychological therapies with reference to the strength of independent evidence concerning their clinical effectiveness. An empirically supported treatment is one which is clearly defined and, for a given clinical problem, is consistently more efficacious than placebo treatments on the evidence of controlled clinical trials. On this basis, cognitive therapies have emerged as empirically supported in adults for anxiety disorders (including panic disorder, generalized anxiety disorder, and social phobia); unipolar depression; and anorexia nervosa and bulimia nervosa (Roth and Fonagy 1996).

1.5 Comment

The remarkable growth of cognitive therapy has been facilitated by a number of external developments: the growth of clinical psychology as a profession whose members receive training in cognitive theory and techniques within their core curriculum; compatibility between cognitive models of psychopathology and classifications adopted for diagnosis, care management, and organization of the evidence base; and an intellectual Zeitgeist that favors explanations based on information processing or neurocognitive paradigms. The relative rapidity of the clinical effects of CT has invited comparisons of their efficacy with those of pharmacological treatments, while their longer term impact is not always clear. They enjoy a high level of acceptability that reflects their collaborative and transparent style. However, the emphasis on monitoring, rating, and homework exercises can be seen as excessively demanding by some patients, and the relative lack of attention to the historical origins of difficulties is not to all patients’ taste. Practical methods are relatively easily learned and taught, although growing evidence concerning the added value of intensive training and supervision for clinical outcomes suggests this ease is deceptive. Important developments in the theoretical base and practice of cognitive therapies continue to be made. Some of these, in indicating the importance of formative experience, of the relationship with the therapist and patterns of relating, and of increasingly inaccessible (‘unconscious’) cognitive structures underlying behavior, are narrowing the gap between cognitive and psychodynamic models of psychotherapy in practice.

2. Interpersonal Therapy

2.1 Overview

Interpersonal therapy (IPT) is an increasingly common model of brief psychotherapy that was developed as a specific treatment for depression in the 1970s. Findings from research into the interpersonal precipitants of depression were used in designing an intervention that could bring symptomatic relief through improvements in interpersonal functioning (Klerman et al. 1984). Different therapeutic strategies are used, depending upon which of four basic kinds of interpersonal problem is paramount in an individual case. IPT is now being adapted to other disorders and settings (Klerman and Weissman 1993).

2.2 Rationale

IPT’s basic rationale is that depressed mood is usually secondary to deterioration in interpersonal relationships, and can be reversed by deliberate attention to the quality of current relationships. Historically, it was developed in deliberate contrast to psychodynamic therapies, in which the importance of interpersonal relationships within the therapy itself as well as in a patient’s life is paramount. A key difference has been that psychodynamic psychotherapies not only pay considerable attention to the formative influence of past relationships, but they focus on (maladaptive) patterns of relating that are seen as characteristic of a person. Interpersonal therapy pays attention to current relationships for their own sake. It does not conceptualise underlying patterns, although it would expect someone’s style of relating to improve through reinforcement of positive changes achieved in the relationships targeted in therapy. Although the ‘interpersonal psychiatry’ of Sullivan (1953) is frequently quoted as an antecedent of IPT, its focus on in-session interactions and on enduring patterns of relating is inconsistent with this new therapy, despite their mutual emphasis on the importance of good interpersonal relationships for personal mental health.

IPT has used theoretical developments in a number of fields, from attachment theory to life events research, to highlight the association of onset of depression with loss through grief or major life changes, conflict, and isolation. Its model identifies whether a person’s most pressing need is resolution of grief, role transitions, interpersonal disputes, or interpersonal deficits.

The model was designed to be researchable from the outset, its method being summarized in a manual (Klerman et al. 1984) to promote consistent application in practice. Studies have not only addressed its efficacy for specific conditions (see below) but the impact of process on outcome. An important literature has therefore also developed concerning the impact of training on the practice of psychotherapy (Rounsaville et al. 1988).

2.3 Method

As a time-limited therapy, IPT was pioneered over 12 sessions per treatment, now often 16. A typical


treatment is subdivided into initial, treatment, and termination phases. During the initial phase, the patient is educated to see their difficulties as the consequence of having a depressive illness, and to allow themselves to occupy a sick role. (This means they should not feel responsible for this state of affairs, and allow others to take on some of their normal duties so they can concentrate on recovery.) A detailed inventory is drawn up with the therapist summarizing all current relationships, however insignificant. This not only provides a map of potential areas of difficulty, but of sources of potential support and opportunities for the development of relationships in future.

Detailed questioning in the first phase allows a treatment focus to be identified that reflects one of four distinct forms of interpersonal need: grief (where the loss of a significant other through death has not been overcome); role disputes (where conflict in a key relationship, perhaps in the form of an impasse rather than overt fighting, cannot be resolved); and role transitions (where adaptation to a different situation, commonly following loss events, is required). In some residual cases, where a patient’s interpersonal situation is particularly impoverished through an inability to establish relationships, a fourth category of interpersonal deficits applies.

Different therapeutic techniques are likely to be required in each of these instances, the model being sufficiently flexible to accommodate these. Examples would be assisted mourning with grief, and attention to communication in interpersonal disputes. In all cases, there is careful attention to affect and considerable emphasis on its successful expression throughout treatment.

Therapy concludes with explicit attention to termination, both in anticipating and working through loss of the therapy and in planning for continuing progress along the lines tried out during the therapy.

2.4 Applications

IPT’s use in the treatment of major depression was highlighted by a large randomized controlled trial sponsored by the US National Institute of Mental Health (Elkin et al. 1989). This remains one of the largest comparative trials of psychological and pharmacological treatments ever conducted, providing information on the relative benefits from two psychotherapies (CT and IPT), imipramine, and a drug placebo. The results were encouraging for IPT, showing it to be as effective as CT in relieving symptoms overall, while having the lowest attrition, and significantly better results than CT among the most severely depressed patients.

Since its original application in studies of clinical depression, IPT’s use has been broadened to incorporate depressed populations with special needs, patients with distinct mental disorders, and adaptations of the therapeutic process to fit different working contexts (cf. Klerman and Weissman 1990). Depression in adolescents and the elderly, as well as the chronic low grade depression known as dysthymic disorder, have all been shown to benefit from treatment (Markowitz 1998). Modifications of therapeutic technique can be involved, for instance greater involvement of significant others in the treatment process with adolescents, and use of less frequent maintenance sessions after the phase of regular sessions in dysthymia. Adjustment of content is likely to be involved with use of IPT with other disorders. A great deal of exploratory work in the field of substance misuse has so far failed to show significant benefits. An area of greater promise has been in the treatment of bulimia nervosa, where lasting clinical improvements comparable to those from CT have been achieved (Agras et al. 2000).

The principal adaptation of the model to suit different contexts has been its shortening to six brief sessions for use with subclinical populations in primary care settings (cf. Klerman and Weissman 1990). Attempts to develop a group model of IPT for treatment of social phobia are promising but at an early stage.

3. Conclusion

Although cognitive and interpersonal therapeutic packages are relatively new, each has elaborated principles of good psychiatric management (respectively, the importance of a patient’s attitude or social relationships to their health and recovery) that are widely recognized. Their rapid growth reflects their proven efficacy for specific disorders, the relative ease with which they can be learned and disseminated, and the promise of brief psychological treatment for clinical conditions that account for a large proportion of psychiatric practice. Both are widely used alongside physical treatments such as psychotropic medication. Developments are likely to include continuing influence of basic research in psychology and the neurosciences on the refinement of therapeutic models and methods, and increasing rapprochement with other therapeutic models as integrative models of treatment are developed for more complex and treatment-resistant conditions.

Bibliography

Agras W S, Walsh B T, Fairburn C G, Wilson G T, Kraemer H C 2000 A multicenter comparison of cognitive-behavioural therapy and interpersonal therapy for bulimia nervosa. Archives of General Psychiatry 57: 459–66
American Psychiatric Association 1968 Diagnostic and Statistical Manual of Mental Disorders, 2nd edn. APA, New York
American Psychiatric Association 1980 Diagnostic and Statistical Manual of Mental Disorders, 3rd edn. APA, New York


American Psychiatric Association 1994 Diagnostic and Statistical Manual of Mental Disorders, 4th edn. APA, New York
Beck A T 1989 Cognitive Therapy and the Emotional Disorders. International Universities Press, New York
Chadwick P D J, Birchwood M J, Trower P 1996 Cognitive Therapy for Hallucinations and Delusions. Wiley, Chichester, UK
Elkin I, Shea M T, Watkins J T, Imber S D, Sotsky S M, Collins J F, Glass D R, Pilkonis P A, Leber W R, Docherty J P, Fiester S J, Parloff M B 1989 National Institute of Mental Health Treatment of Depression Collaborative Research Programme: General effectiveness of treatment. Archives of General Psychiatry 46: 971–83
Klerman G L, Weissman M M, Rounsaville B J, Chevron E S 1984 Psychotherapy for Depression. Basic Books, New York
Klerman G L, Weissman M M (eds.) 1990 New Applications of Interpersonal Psychotherapy. American Psychiatric Press, New York
Markowitz J C 1998 Interpersonal Psychotherapy for Dysthymic Disorder. American Psychiatric Press, New York
Roth A, Fonagy P 1996 What Works for Whom? Guilford Press, New York
Rounsaville B J, O’Malley S, Foley S, Weissman M M 1988 Role of manual-guided training in the conduct and efficacy of interpersonal psychotherapy for depression. Journal of Consulting and Clinical Psychology 56: 681–8
Safran J D, Segal Z V 1990 Interpersonal Process in Cognitive Therapy. Basic Books, New York
Salkovskis P M 1991 The importance of behaviour in the maintenance of anxiety and panic: A cognitive account. Behavioural Psychotherapy 19: 6–19
Sullivan H S 1953 The Interpersonal Theory of Psychiatry. Norton, New York
Williams J M G, Watts F N, MacLeod L, Matthews A 1990 Cognitive Psychology and the Emotional Disorders. Tavistock, London
Young J E 1994 Cognitive Therapy for Personality Disorders: A Schema Focused Cognitive Therapy. Professional Resource Exchange, Sarasota, FL

C. J. Mace

Cognitive Anthropology

Cognitive anthropology attempts to link anthropology with the cognitive sciences. Culture, as one of anthropology’s central objects of research, has an effect in two respects: on the one hand, in a material sense in the form of cultural phenomena; on the other hand, mentally, in the form of cultural contents. Cultural contents are based on mental representations. While cultural phenomena are public and thus easy to document ethnographically, cultural contents are not directly accessible since mental representations cannot be observed. What actually happens inside people’s heads is the object of research of the cognitive sciences. Cognitive sciences see the human experience of reality and human thinking as acts of processing information. In addition to the study of perception, mental representation, and memory, the objective is to make transparent those mental faculties that enable all humans to internalize the social and cultural characteristics of the society into which they are born. But these are also central topics of anthropological research.

1. From Ethnoscience to the Cognitive Sciences

The year of birth of the ‘cognitive revolution’ in which, apart from social anthropology, various human sciences participated, to some extent independently of each other, is considered to be 1956. That year, at a conference on information theory at the Massachusetts Institute of Technology, Newell and Simon presented a paper on computer programs, Miller presented his famous treatise The Magical Number Seven, and the 28-year-old Chomsky read excerpts from his thesis Three Models of Language. In the same year, Bruner’s book A Study of Thinking was published and two anthropologists, Goodenough and Lounsbury, published the first two programmatic articles on cognitive anthropology, Componential Analysis and the Study of Meaning and A Semantic Analysis of the Pawnee Kinship Usage.

With cognitive anthropology or, more precisely, with the ethnoscience phase of cognitive anthropology, a new field of research came to the fore. Its goal was to describe other cultures in their own conceptualization, that is, in emic terms or from the inside. Different cultures categorize the world differently and apply a different type of logic in dealing with their environment. This difference should be recorded as it reveals different cognitive worlds. The underlying question was an old one: What does the ‘order out of chaos’ look like? More precisely, the basic questions were ‘How do other cultures label the things in their environment, and how are these labels related to each other?’ This followed from the assumption that, with the help of cultural phenomena, the cultural contents, and thus the mental representations, could also be documented.

1.1 Three Premises

One can argue that three premises formed the background to this ambitious program:

1.1.1 Premise 1: Culture is common (shared) knowledge. A highly significant definition of culture was coined by Goodenough: ‘A society’s culture consists of whatever it is one has to know or believe in order to operate in a manner acceptable to its members, and to do so in any role that they accept for any one of themselves. … It is the forms of things that people have in mind, their models for perceiving, relating, and otherwise interpreting them. … Culture does not consist of things, people, behaviour, or emotions, but


in the forms or organizations of the things in the minds of people’ (Goodenough 1957, pp. 167–8). Seen in this way, culture is a mental phenomenon.

1.1.2 Premise 2: Knowledge has the form of a cultural grammar. Anthropologists must, on the basis of the statements made by their informants, inductively discover this abstract and shared knowledge as a systematic mental representation. In principle, they can, in order to do so, limit themselves to one person, as applies when learning a foreign language, where at first sight it suffices to have one speaker at one’s disposal. The knowledge system of a culture is understood to be a conceptual model which embraces the organizational principles of the culture and the behavior of its members. The model is, so to speak, a cultural grammar.

1.1.3 Premise 3: Language is the best means of access to mental phenomena. With the reduction of chaos, certain phenomena and characteristics are selected from the environment as being significant, named, and given a classificatory meaning. The main (but not the only) proof of the existence of a category is its label.

The equation of culture and knowledge proved to be very fruitful. In the 1960s, ethnoscience experienced rapid success. Innumerable studies were published, as was to be expected, about terminologically densely structured individual fields, such as those on kinship, colors, ethnozoology, ethnobotany, or illness. One succumbed, as it were, to the great theoretical temptation to reduce complex and ostensibly heterogeneous things to a few rules (inclusion, exclusion, and intersection) and to present them as elegant models (taxonomy, paradigm, see below) which were looked upon as mental representations of individuals. At the beginning of the 1970s, however, only a very few studies appeared, and Keesing (1972, p. 229) was justified in starting a paper with the sentence: ‘Tell me, whatever happened to ethnoscience?’ Here lies the irony: when, at the end of the 1950s, ethnoscience took over this model from linguistics and it became the focal point in the 1960s, it had already been swept aside in linguistics itself by Chomsky’s new generative linguistics.

The result was an opening up and turning towards modern trends in neighboring disciplines. Computer science proved to have a particularly strong influence; when the first computer programs appeared which played chess, the question arose, ‘If computers can have programs, why can’t people, too?’ In endeavoring to reproduce human cognitive representations in the model, cognitive anthropology adopted the ‘information-processing approach.’ In so doing, the assumption (at that time) that cognitive processes follow the same pattern universally, whether in humans, animals, or the machine (in other words, that the software is the same everywhere, irrespective of the hardware in which it is processed) was, from a philosophical point of view, highly explosive.

1.2 Revised Premises

As a consequence, the three premises of ethnoscience were reconsidered and extended; the emphasis turned to where knowledge is sought and how it is represented.

1.2.1 Revised Premise 1: Turning towards the individual. Attention at the end of the twentieth century was no longer focused on the collective knowledge system as the whole of a culture (as représentation collective in the sense of Durkheim), supposed to be recorded as the ideal type, but on the scattered, variable knowledges acquired, stored (memorized), and applied by individuals in their everyday life. The focus shifted because it was recognized that inferences cannot be directly drawn from cultural phenomena and linguistic material in order to elaborate individual cognitive processes or representations.

1.2.2 Revised Premise 2: Operationalization instead of categorization. As soon as knowledge is no longer defined as an isolated, static system (that is, simply as grammar) but as something which is evident (verbally or nonverbally) in everyday use by individuals, it becomes clear that many categories and semantic fields have no fixed boundaries, and cannot be defined in the classical sense (fuzzy sets). They are now also grouped according to what a person can do in daily life (‘taskonomy’ instead of taxonomy) or else according to prototypes (best example from a category). It is everyday cognition that a housewife needs when she shops, a milkman when he distributes dairy products to his customers according to a certain pattern, a Yakan in the Philippines when he wants to enter a house correctly.

1.2.3 Revised Premise 3: Turning away from language as the only instrument to code knowledge. Knowledge is also expressed by means of actions or emotions. Habitual actions in particular can be very ‘eloquent’ in the sense of tacit knowledge. In 1977, the computer specialist Schank and the sociopsychologist Abelson introduced the significant term ‘script’ to describe stereotyped sequences of actions in certain situations. Thus, although language remains one of the focal points, it is treated differently;

Cognitive Anthropology

Figure 1
Ndanda, an old Yupno man (Papua New Guinea), draws a picture of the world.

Figure 2
Gwarane, an old Yupno woman, sorts objects (sorting task) following a hot/cold schema, after having first verbally given a taxonomic order which, however, plays no role in her everyday life.

no longer as a lexicon, but in everyday use as discourse from which inferences must be drawn as to the intended message (Hutchins 1980). Beyond this, the (controversial) idea is that the structure of knowledge (as stored in the mind as representation) is not necessarily language-like: ‘Knowledge organized for efficiency in day-to-day practice is not only nonlinguistic, but also not language-like in that it does not take a sentential logical form’ (Bloch 1991, pp. 189–190). In addition, the question arises whether language should not be ignored more frequently because certain kinds of knowledge cannot be externalized linguistically, or only with great difficulty (Wassmann 1993) (Fig. 1 and Fig. 2).
If the individual acting in his or her daily life now arouses interest, this is a consequence of a paradigmatic change: cognitive anthropology now considers itself more clearly as part of the cognitive sciences, which, inter alia, leads to the word ‘cognition’ being better understood. Cognition is no longer an expression of a culture as a whole and abstracted from linguistic material, but understood as the mental activity of individuals who actively apply knowledge in different contexts, in that they think, generalize, draw inferences, perceive, recognize and categorize; analyze, combine, assess possibilities, solve problems and make decisions; classify, differentiate and choose; remember and master new situations. These activities are performed individually or between individuals but, nevertheless, take place somehow within the broad framework of the ‘Culture.’

2. The Representation of Knowledge
The most important models to represent knowledge in cognitive anthropology are briefly described below.

2.1 Taxonomy
The inner order of a semantic field (e.g., ‘kinship,’ ‘colors’) depends on a small number of ordering principles which structure the lexical components


(lexemes) of the field (e.g., ‘mother,’ ‘red’). The ordering principles are inclusion, exclusion (contrast), and intersection. A taxonomy is the description of a semantic field; it lists the categories (lexemes) and shows how they are connected with each other, i.e., according to the principles of inclusion and exclusion in hierarchic order. Categories at the same level exclude each other (exclusion) while categories at the lower level are included in the categories at the higher level (inclusion). The grouping in Table 1 shows that ‘dining tables’ are different from ‘smokers’ tables’ but that both are a kind of table, and also that chairs are distinguished from tables. It does not say what the distinguishing characteristics are.

2.2 Paradigm
If a semantic field is structured according to the principle of intersection, it is a paradigm. Hierarchy, inclusion, and exclusion are missing; instead the distinctive features (or criterial attributes) are stated which distinguish the different categories. In order to construct such a model, a componential analysis is made (Tyler 1969). A complete lexicon of a semantic field, e.g., ‘blood relationship,’ is established and then those characteristics (components) are looked for according to which the lexemes differ, e.g., ‘male’ and ‘female’ in the dimension ‘sex,’ as well as the distance from Ego in the dimension ‘generation.’ Every single lexeme is now defined by a bundle of components. In the graphic representation in Table 2 these components intersect at the defined lexeme.
Taxonomy and paradigm are the classic models of ethnoscience. They are strongly idealized, consistently emic and only to a small degree ‘cognitive.’ They are understood as the ‘mind’ of a whole culture and (as we know today) they are not really instruments of thinking.

Table 2
Paradigm
                     Sex
                     a1            a2
Generation    b1     grandfather   grandmother
              b2     father        mother
              b3            Ego
              b4     son           daughter
              b5     grandson      granddaughter

2.3 Prototype
Within a category of objects, that object is a prototype (for the whole category) which is thought to be the best example or the clearest case. Thus, for example, in the category ‘furniture’ the ‘chair’ is a better example than, for instance, ‘radio.’ Of course one could define ‘furniture’ as a category of objects having certain semantic characteristics or attributes in common (and whatever does not have these characteristics does not belong). However, this (checklist) definition does not allow any grading such as ‘chair’ being a better example, more of a prototype of ‘furniture’ than, for instance, ‘radio’ (although both belong to ‘furniture’). It may happen that there are no criterial attributes common to all parts (i.e., no semantic field, no class according to ethnoscience), but only a large number of characteristics which may ‘match’ some but by no means all. As a consequence, they have only a ‘family resemblance’ structure (a term Rosch adopted from Wittgenstein (Rosch and Mervis 1975)). Following Lévi-Strauss, prototypes are particularly good to think (Table 3).

Table 3
Prototypes
Prototype     Category
chair         furniture
car           vehicle
orange        fruit
gun etc.      weapon etc.

2.4 Script, Schema and Cultural Model
The information theory approach forces the anthropologist to be explicit. Exactly that has to be made explicit which normally remains implicit. This additional information is called ‘script.’ It is the tacit knowledge enabling us to also understand incomplete descriptions and suggestions: we automatically add what is missing by an inference process. Every situation requires specific knowledge and accordingly there are scripts for ‘eating in a restaurant,’ ‘playing football,’ ‘attending a birthday party.’ Not only our actions are based on scripts, but our language as well, as exemplified briefly in the following; a story told with all the details would be tedious.
I am in New York and somebody asks me the way to Coney Island; I tell him to take the ‘N’-train to the terminus. This instruction only makes sense ‘if this


improperly specified algorithm can be filled out with a great deal of knowledge about how to walk, pay for the subway, get in the train and so on’ (Schank and Abelson 1977, p. 20).
If the typical characteristics of a situation are grasped, so that the stereotypical, the standard-like, is stressed but raised to a higher level of abstraction, we can talk about schemata and cultural models (which partly replace the older term of the folk model). All the knowledge we acquire, remember, and communicate about this world is neither a simple reflection of this world nor does it consist of a series of categories (as ethnoscience assumed), but it is organized into different situation-relevant, prototypical, simplified sequences of events. We basically think in simplified worlds: ‘… cultural models are composed of prototypical event sequences set in simplified worlds’ (Quinn and Holland 1987, p. 32). What is more, these models are probabilistic and partial; they are actual frames we can use to react to new situations as well. They are world-proposing yet cannot be directly observed, since they are not presented but merely represented by the behavior of the people. They are models of the mind and in the mind. The organizing principle behind these models seems to be metonymy: the prototypical, conspicuous part is passed off as the whole, i.e., a whole is represented by one of its parts (cf. the frame theory of Minsky, and story grammar by Rumelhart).

2.5 Mental Images
Knowledge may also be represented by inner images. The image schema consists of schematized, simplified images. These images are able to make comprehensible and imaginable physical objects or logical relationships difficult to conceptualize. The organizing principle behind them appears to be the metaphor; through analogy, information from the physical world is introduced into the nonphysical world, for example, rage can be imagined as a hot liquid in a container, evaporation as rising molecules springing out of the water like popcorn, electricity as a crowd of people (in front of the gate of a stadium at a sports event).

3. How Deep?
The structure of the representations of knowledge as presented here under Sects. 2.3, 2.4, and 2.5 allows us to answer three questions central to cognitive anthropology (Quinn and Holland 1987, pp. 3–4).
(a) It is able to explain the apparent systematicity of cultural contents (knowledge) by pointing to a number of general-purpose models which can repeatedly be integrated into other concrete models which are special-purpose orientated.
(b) Mastering the enormous amount of knowledge every one of us has is possible because we only store what is prototypical, reduced, but are able to actualize it on demand (in concrete situations), i.e., supplement it with details (instantiation).
(c) It is possible to interpret new experiences because these models not only represent knowledge but also allow us to draw conclusions from them to new situations.
When answering these three questions, modern cognitive anthropology faces three more general problem areas.

3.1 Knowledge and Knowing
Giddens writes ‘The vast bulk of the “stocks of knowledge” … incorporated in encounters is not directly accessible to the consciousness of the actors. Most of such knowledge is practical in character, it is inherent in the capability to “go on” within the routines of social life’ (Giddens 1984, p. 4). Imagine having to describe to somebody how to ride a bicycle; doing so, one has to question the traditional conception of knowledge. It seems to be advantageous to distinguish between knowledge (what is known, as an abstract pool of information, declarative and verbalized) and knowing (how to do something in practice, implicit and hidden, primarily accessed through performance), the focus being on the latter. We may even reorient our analysis and reverse the process by starting with knowing and seeing how knowledge is constituted from it (Borofsky 1994).

3.2 Language and Cognition
Does language shape our thinking? This question seems to be receiving attention once more (Gumperz and Levinson 1996). Thus spatial orientation and communicating it certainly belong to the cognitive and linguistic basic equipment of all people and societies. However, there are different linguistic systems of orientation, and, for us, very basic spatial categories such as ‘left,’ ‘right,’ ‘in front’ or ‘behind’ cannot be taken for granted. Many languages do not know these terms and instead use a geocentric system based, for example, on the cardinal points and use this not only in navigation but also in everyday life (the glass is not placed to the left of the plate but to the east, for instance). However, these differences of a linguistic and cultural kind (probably) also influence the (cognitive) perception of spatial relationships as well as their storage in memory (Wassmann and Dasen 1998). Here cognitive anthropology contradicts the prevailing school of thought of cognitive linguistics in which the worldwide diversity of languages is only seen as a cultural phenomenon and hence one of surface.

3.3 Universals?
Cognitive processes such as categorizing, classification, memorizing, and perception are the foundations of the contents of knowledge and structure these. In principle they are thought to be universal but can be applied in different ways.
‘We found evidence of differences across cultural groups, differences in habitual strategies for classifying and for solving problems, differences in cognitive style, and differences in rates of progression through developmental stages … These differences, however, are in performance rather than in competence. They are differences in the way basic cognitive processes are applied to particular contexts, rather than in the presence or absence of the processes. Despite these differences, then, there is an underlying universality of cognitive processes’ (Segall et al. 1990, p. 184).
It seems, however, that the deep structure might be influenced by culture in a more lasting way than has been assumed. In the cognitive sciences, the fact that a major part of the mental representations and processes studied might be of a cultural nature receives little attention. Frequently universality is simply postulated without taking into consideration the possibility of a cultural cogeneration. To sensitize cognitive scientists to the ambitious question of the range of cultural variability is a task for which no discipline is better suited than anthropology. A promising field of work for both disciplines could be the PDP-models (Parallel Distributed Processing), also called ‘connectionistic models’ or neuronal networks, which Rumelhart and McClelland (1986) developed as computer models. These are able to construct models of the functioning of schemata: not as loading in a set of instructions, but as gradually building up associative links among repeated or salient aspects of experience. This vagueness and dependency on empirical knowledge from our everyday life, hence of cultural phenomena, make these models attractive to anthropologists as well (Shore 1996, Strauss and Quinn 1997).

See also: Cognitive Modeling: Research Logic in Cognitive Science; Cognitive Science: History; Cognitive Science: Overview; Conceptual Blending; Cultural Studies of Science; Grammatical Relations; Mental Models, Psychology of; Mental Representation of Persons, Psychology of

Bibliography
Bloch M 1991 Language, anthropology and cognitive science. Man 26: 183–98
Borofsky R 1994 On the knowledge and knowing of cultural activities. In: Borofsky R (ed.) Assessing Cultural Anthropology. McGraw-Hill, New York, pp. 331–46
Giddens A 1984 The Constitution of Society: Outline of the Theory of Structuration. University of California Press, Berkeley, CA
Goodenough W 1957 Cultural anthropology and linguistics. In: Garvin P L (ed.) Reports of the Seventh Annual Round Table Meeting on Linguistics and Language Study. Georgetown University Press, Washington, DC, pp. 167–73
Gumperz J, Levinson S C (eds.) 1996 Rethinking Linguistic Relativity. Cambridge University Press, Cambridge, UK
Hutchins E 1980 Culture and Inference. Harvard University Press, Cambridge, MA
Keesing R M 1972 Paradigms lost: The new ethnography and the new linguistics. Southwestern Journal of Anthropology 28: 299–332
Keesing R M 1987 Models, ‘folk’ and ‘cultural’: Paradigms regained? In: Holland D, Quinn N (eds.) Cultural Models in Language and Thought. Cambridge University Press, Cambridge, UK, pp. 368–93
Quinn N, Holland D 1987 Culture and cognition. In: Holland D, Quinn N (eds.) Cultural Models in Language and Thought. Cambridge University Press, Cambridge, UK
Rosch E, Mervis C 1975 Family resemblances: Studies in internal structure of categories. Cognitive Psychology 7: 573–605
Rumelhart D E, McClelland J L (eds.) 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, Cambridge, MA
Schank R C, Abelson R P 1977 Scripts, Plans, Goals and Understanding: An Enquiry into Human Knowledge Structures. Erlbaum, Hillsdale, NJ
Segall M H, Dasen P, Berry J, Poortinga Y 1990 Human Behavior in Global Perspective: An Introduction to Cross-cultural Psychology. Allyn and Bacon, Boston
Shore B 1996 Culture in Mind: Cognition, Culture, and the Problem of Meaning. Oxford University Press, New York
Strauss C, Quinn N 1997 A Cognitive Theory of Cultural Meaning. Cambridge University Press, Cambridge, UK
Tyler S (ed.) 1969 Cognitive Anthropology: Readings. Holt, Rinehart, and Winston, New York
Wassmann J 1993 Das Ideal des Leicht Gebeugten Menschen. Eine Ethno-Kognitive Analyse der Yupno von Papua Neuguinea. Reimer Verlag, Berlin
Wassmann J, Dasen P R 1998 Balinese spatial orientation. Some empirical evidence of moderate linguistic relativity. Journal of the Royal Anthropological Institute 44: 689–711

J. Wassmann

Cognitive Archaeology

Cognitive archaeology is an approach to research explicitly emphasizing the central role of cognition and mental phenomena in explanations of the past. It rejects behaviorism, with its emphasis on stimulus–response relationships, and the overriding concern with environmental adaptation which underlies much processual archaeology. The theoretical underpinnings of cognitive archaeology derive instead from post-positivist philosophy of science, cultural anthropology, linguistics, psychology and, more broadly, the cognitive neurosciences. Primary research topics include the origin of art, belief, language, tool use, and the human mind; and the reconstruction of prehistoric religion and ideology. Cognitive archaeologists look to a broad range of empirical data to investigate these interests, including stone tools, settlement patterns, ceramics, and art and iconography. Ethnohistorical


data are also commonly employed as a starting point, as many cognitive archaeologists are advocates of the direct-historical approach. While concerns with art and belief and an emphasis on individual cognition and motivation ally cognitive archaeologists with recent postmodernist concerns, typically they maintain a commitment to scientific method, including the testing of hypotheses and the development of scientific knowledge.

1. Definitions
Cognitive archaeology has been defined in various ways. According to Renfrew (1982) it is ‘the archaeology of the mind.’ Flannery and Marcus (1993, p. 261) define it as ‘a study of all those aspects of ancient culture that are the product of the human mind … cosmology … religion … ideology … iconography … and all other forms of human intellectual and symbolic behaviour that survive in the archaeological record.’
These definitions reflect a rejection of a long-held archaeological belief that cognitive phenomena or mental products cannot be observed in the archaeological record, let alone reconstructed from it. Cognitive archaeologists, in contrast, commit a significant part of their effort in reconstructing the past to just this area of concern. While they do not deny the relevance of culture-history, technology, adaptation, subsistence, trade, and other traditional archaeological topics, cognitive archaeologists argue that a holistic interpretation or explanation of the past requires more than these traditional topics alone can provide.

2. Philosophical and Theoretical Foundations
The development of cognitive archaeology as an intellectual trend has involved the rejection of certain key tenets of scientific archaeology as practiced by Anglophone researchers (variously called new, processual or settlement-subsistence archaeology) in the latter half of the twentieth century. One of these tenets is logical positivism as a philosophy of science, with its emphasis on explanatory cover laws, the hard distinction between theory and fact, and a belief in unequivocal singular tests for hypotheses. Another is behaviorism which implicitly serves as the explanatory paradigm for much processual archaeology (Peebles 1992). This emphasizes the importance of stimulus–response relationships in explanation, especially concerning adaptation to the environment, and denies the relevance of mind, intellect, and cognition in human action, along with their byproducts such as belief, ritual, and art. To the processual archaeologist these are epiphenomenal, meaning derivative in origin and secondary in importance, and therefore analytically irrelevant.
Cognitive archaeology has instead adopted ongoing developments in the philosophy of science, including especially post-positivist approaches. Commonly these maintain a commitment to scientific knowledge, albeit acknowledging that over time science only increasingly approximates the ‘real’ truth, with scientific method based on ‘inference to the best hypothesis’ rather than singular critical tests (Kelley and Hanen 1988). Implicit in much cognitive archaeology moreover is embodied scientific realism. This accepts the existence of a world independent of our understanding of it, as well as the fact that we can have stable knowledge of this world. But it also holds that knowledge is relative to our bodies, minds, and interactions with the environment. This conclusion derives from empirical results of cognitive neuroscience studies which show that certain universal aspects of human conceptual development result from the embodied nature of the mind (Lakoff and Johnson 1999).
These philosophical commitments resolve two fundamental even if largely unrecognized conceptual contradictions in processual archaeology. First, by adopting embodied scientific realism, cognitive archaeology supports a model of humankind and its development that is fully reconcilable with evolutionary principles. Processual archaeology, even when explicitly claiming to be evolutionarily based, instead implicitly invokes a pre-Darwinian model of disembodied reason and mind due to its disavowal of any real relevance for cognitive phenomena. Because cognitive capabilities (which we all share) are ignored, they must be taken as a given; they ‘just are’ in processual archaeological theory. Ultimately this position then must resort to creationism and divine intervention to explain the totality of the human condition, which must include our ability to think and reason. Second, by insisting that mental phenomena are irrelevant but also conceding that the goal of archaeological science is to create knowledge (a cognitive construct), processual archaeology further requires a difference in kind between prehistoric peoples, whose thinking is putatively irrelevant and epiphenomenal, and contemporary westerners, including archaeologists whose professional purpose is to create knowledge. Such a distinction in kind between different peoples precludes the kinds of law-like explanations that processual archaeology has sought as its goal. Cognitive archaeology gives equal weight to the importance of mental phenomena in the lives of both prehistoric and contemporary humans.
Other theoretical aspects of cognitive archaeology derive from its relationship with cultural anthropology. These partly reflect the definition of culture as a cognitive worldview, a system of beliefs and meanings, thereby helping to situate aspects of cognitive archaeological research as a kind of anthropological archaeology in the interpretation of prehistoric culture. Similarly, anthropological studies of the nature


of traditional thought and belief systems (e.g., Sperber 1982) have dispelled the notion that these are necessarily irrational and therefore somehow inherently inaccessible to rationalist science. As anthropologist Robin Horton (1982) has instead shown, all cultures share a core of cognitive rationality that involves the development of theory, based on deductive, inductive, and analogical reasoning, used in the explanation, prediction, and control of events. The analysis and interpretation of culture and its expression in symbolism, meaning, and worldview need not then necessarily reduce to particularistic and empathic statements based on some kind of privileged access to the feelings of people in the past, but instead can be based upon empirically grounded generalizations that stand up to empirical analysis.
Perhaps most importantly, linguistic, psychological, and neuropsychological models of the human mind–brain and how it operates have been instrumental in cognitive archaeological studies. And while some of the topics that cognitive archaeologists include in their analyses, such as ritual, belief, symbolism, and meaning, are the same as those that are highlighted in humanist and postmodernist studies and approaches (called ‘post-processualism’ in archaeology), the models, techniques, and methods that the cognitive archaeologists employ to understand these aspects of the past are quite different from those that postmodernists might allow, as the scientific models used by cognitive archaeologists themselves imply.

3. Analytical Approaches: Linguistic and Psychological Models
A measure of the versatility of cognitive archaeology is seen in the fact that many of its applications have occurred at opposite ends of the archaeological spectrum: Paleolithic or Stone Age archaeology, which involves studies of the earliest human societies (even including in some cases the archaeology of prehuman hominids); and late prehistoric, protohistoric and historical archaeology, where direct cultural and linguistic links tie the archaeological past to recent peoples studied by ethnographers and historians.
Perhaps the first example of an in-depth cognitive archaeological study was provided by Paleolithic archaeologist André Leroi-Gourhan (1967) in his analysis of western European cave paintings dating from about 35,000 to 10,000 years ago. A similar approach was used by historical archaeologist James Deetz (1977) in his study of Euro-American Colonial artifacts. In both cases analysis was based on a structuralist model of mind. Originating in the linguistic theories of Ferdinand de Saussure and Roman Jakobson, structuralism posits a human mind that organizes empirical phenomena and concepts in terms of binary oppositions or dualities: black versus white; male versus female; good versus bad; etc. Structuralist analysis then concerns not so much the dual objects themselves but the relationships between the pairs. Both Leroi-Gourhan and Deetz showed that such a relational structure could be identified in their respective Paleolithic and Colonial data, thereby significantly amplifying their abilities to make inferences about the symbolic meanings of their archaeological remains.
Psychological models of the human brain–mind have figured prominently in other cognitive archaeological studies. Paleolithic archaeologist Thomas Wynn (1989), for example, has used psychologist Jean Piaget’s theories on the development of thought in children and what these imply about spatial abilities to infer kinds of intelligence and levels of cognitive development in Homo habilis and Homo erectus, based on analyses of the kinds of stone tools that these ancient hominids created. Psychologist William Noble and archaeologist Iain Davidson (e.g., 1996) have collaborated in a series of studies, bringing together archaeological evidence on early symbol-making with theories of perception and communication in order to chart the evolution of the human mind and the appearance of language. Steven Mithen (1996) has invoked neural models of brain modularity along with cognitive scientist Howard Gardner’s theory of multiple intelligences in an effort to explain the ‘Upper Paleolithic Revolution’: the seemingly sudden appearance of art, symbolism, and presumably belief roughly 50,000 years ago, literally tens of thousands of years after the earliest skeletal evidence for anatomically modern humans. In these cases psychological models have not been used to help explain the archaeological record: rather, the archaeological record has been employed in order to make inferences, informed by psychological theories, about prehuman and human intellectual and cognitive evolution.

4. Analytical Approaches: Neuropsychological Models
A subset of the psychological models used in cognitive archaeological studies involves neuropsychology: models based on theories of the mind–brain and the nervous system and how the two interact. These have been particularly influential in analyses of hunter-gatherer rock art tied to shamanistic religions where trance or altered states of consciousness (ASC) were considered key kinds of religious experiences. Because all humans share the same neural architecture, the mental, visual, emotional, and bodily reactions of ASC fall within a predictable range, regardless of cause, culture, or time period (Hobson 1994). This fact creates a kind of ‘neuropsychological bridge’ promoting analytical access to aspects of the ancient mind that involved ASC experiences.
Using the results of clinical studies, Lewis-Williams and Dowson (1988) constructed a model of the effects


of ASC on mental imagery, the ‘visions’ of a shaman’s trance. Their model has three components: three progressive stages of ASC; the principles by which mental imagery is perceived at each stage; and the seven most commonly perceived entoptic phenomena, the phosphenes or geometric light images generated in the brain and visual system during the first stage of an ASC. After confirming the validity of their model using corpora of rock art known from ethnographic evidence to have been produced by shamans to depict visionary imagery, these archaeologists then used it to test whether European Paleolithic rock art, which lacks any ethnographic record, was also shamanistic in origin.

Central to their analysis is the use of analogy, but in this case analogy based on timeless and unchanging determining structures (human neuropsychology and its effects on ASC imagery), not analogy based on formal similarities. Note too that their concern was the origin of Upper Paleolithic art, not its meaning. While neuropsychology predicts human reactions to ASC, it does not tell us how such experiences will be interpreted or understood by different peoples or cultures and at different times. Subsequent neuropsychological studies of rock art have also considered the origins of universal aspects of shamanic symbolism, such as ‘mystical flight’ or ‘death and rebirth,’ as metaphoric expressions of the bodily and emotional hallucinations of ASC which were used verbally and graphically to describe what otherwise is a largely ineffable experience (Whitley 2000).

5. The Direct Historical Approach

Another analytical strategy, called the direct historical approach, involves the development of interpretive models of aspects of traditional cultures, like religion or ideology, based on ethnographic and ethnohistoric evidence. If derived from directly relevant cultures, these models can be cautiously applied to the prehistoric past to both chart continuity and identify change in cognitive systems. Marcus and Flannery (1994), for example, used sixteenth and seventeenth century Spanish documents to reconstruct Zapotec Indian religion in southern Mexico. Comparisons with archaeological evidence of religion and ritual indicated that the ethnohistorically described religious system first appeared between 200 BC and AD 100. Thomas Huffman (1996) has likewise used African Nguni ethnography concerning settlement structure, religion, and ideology to interpret the site of Great Zimbabwe, dating from the thirteenth to the fifteenth centuries.

Though sometimes criticized for projecting ethnography on to the prehistoric past, careful applications of the direct historical approach need do no such thing. Indeed, Huffman has shown that they can result in a rewriting of aspects of the ethnographic record when the archaeological evidence contributes subsidiary information not previously documented in writing. Moreover, the applicability of the direct historical approach is enhanced by the fact that religious and belief systems tend to be very conservative and slow to change, as other recent archaeological analyses have shown (e.g., Taçon et al., 1996).

6. Future Trends

Two circumstances favor the continued development and growth of cognitive archaeology. The first is its scientific robustness, based on the fact that many practitioners successfully incorporate different methodologies and kinds of data in their analyses and interpretations. Rock art researchers, for example, have combined the direct historical approach with neuropsychological and physical models, creating convergent scientific theories (e.g., Whitley et al., 1999). These theories are advantaged because the use of different kinds of data tested with distinct methodologies greatly enhances the degree of confirmation of their hypotheses and ensures that they do not follow from any one set of methodological assumptions. The results are scientific interpretations and explanations that are as empirically grounded and as well-tested as any that archaeology can offer.

Equally importantly, cognitive archaeology increasingly borrows from the cognitive neurosciences, an area of research that has advanced dramatically in the 1990s and which promises to influence greatly, if not change, all disciplines concerned with human behavior. Archaeologists have not yet exploited the range of information that is relevant to their research and that is currently available in the cognitive neurosciences, nor can we yet predict what future advances in this field will imply for archaeology. But what is certain is that, as cognitive neuroscientists improve our understanding of how the contemporary human mind-brain operates, this will contribute further to our understanding of cognitive aspects of the prehistoric past.

See also: Archaeology and Philosophy of Science; Archaeology and the History of Languages; Cognitive Anthropology; Cognitive Neuroscience; Cognitive Science: History; Cognitive Science: Overview; Structuralism

Bibliography

Deetz J J F 1977 In Small Things Forgotten: The Archaeology of Early American Life. Anchor Press-Doubleday, New York
Flannery K V and Marcus J 1993 Cognitive archaeology. Cambridge Archaeological Journal 3: 260–7
Hobson J A 1994 The Chemistry of Conscious States: Toward a Unified Model of the Brain and the Mind. Little, Brown and Company, Boston


Horton R 1982 Tradition and modernity revisited. In: Rationality and Relativism, edited by Hollis M and Lukes S, pp. 201–60, MIT Press, Cambridge, MA
Huffman T 1996 Snakes & Crocodiles: Power and Symbolism in Ancient Zimbabwe. Witwatersrand University Press, Johannesburg, South Africa
Kelley J H and Hanen M P 1988 Archaeology and the Methodology of Science. University of New Mexico Press, Albuquerque, NM
Lakoff G and Johnson M 1999 Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought. Basic Books, New York
Leroi-Gourhan A 1967 Treasures of Prehistoric Art. Abrams, New York
Lewis-Williams J D and Dowson T A 1988 The signs of all times: entoptic phenomena in Upper Paleolithic art. Current Anthropology 29: 201–45
Marcus J and Flannery K V 1994 Ancient Zapotec ritual and religion: an application of the direct historical approach. In: The Ancient Mind: Elements of Cognitive Archaeology, edited by Renfrew C and Zubrow E B W, Cambridge University Press, Cambridge, UK, pp. 55–74
Mithen S 1996 The Prehistory of the Mind: A Search for the Origins of Art, Religion and Science. Thames and Hudson, London
Noble W and Davidson I 1996 Human Evolution, Language and Mind: A Psychological and Archaeological Inquiry. Cambridge University Press, Cambridge, UK
Peebles C S 1992 Rooting out latent behaviorism in prehistory. In: Gardin J-C, Peebles C S (eds.) Representations in Archaeology, Indiana University Press, Bloomington, IN, pp. 357–84
Renfrew C 1982 Towards an Archaeology of Mind. Cambridge University Press, Cambridge, UK
Sperber D 1982 Apparently irrational thoughts. In: Rationality and Relativism, Hollis M, Lukes S (eds.) MIT Press, Cambridge, MA, pp. 149–80
Taçon P S C, Wilson M and Chippindale C 1996 Birth of the Rainbow Serpent in Arnhem Land rock art and oral history. Archaeology in Oceania 31: 103–24
Whitley D S 2000 The Art of the Shaman: The Rock Art of California. University of Utah Press, Salt Lake City, UT
Whitley D S, Dorn R I, Simon J M, Rechtman R and Whitley T K 1999 Sally’s rockshelter and the archaeology of the vision quest. Cambridge Archaeological Journal 9: 221–247
Wynn T 1989 The Evolution of Spatial Competence. University of Illinois Press, Urbana, IL

D. S. Whitley

Cognitive Control (Executive Functions): Role of Prefrontal Cortex

Control refers to the ability to direct mental function and behavior in accord with an internally represented set of intentions. This is manifest in higher cognitive function in many forms: the ability to direct attention to a specific stimulus within a large array of other competing, and perhaps more salient stimuli (e.g., finding the face you are looking for within a crowd), maintaining a new and important piece of information in mind against distraction (e.g., remembering a telephone number you just got from directory assistance until you dial, while in a phone booth at a noisy airport terminal), overcoming a compelling but undesirable behavior (e.g., suppressing the urge to scratch a mosquito bite) or pursuing a complex but unfamiliar behavior (e.g., playing a new piano piece) and, perhaps most importantly, the ability to respond flexibly and productively in novel circumstances (e.g., mentally exploring the consequences of a complex sequence of moves in a game of chess).

Figure 1
Relative size of prefrontal cortex (grayed region) in five species (http:\\www.psychol.ucl.ac.uk\kate.jeffery\C567\Lecture13IFrontal\overheads\img003.jpg)

The distinction between controlled and automatic processing is one of the central concepts within modern cognitive psychology. Controlled processing is considered to be effortful, and to rely on a limited capacity system, while automatic processing is assumed to occur independently of this system (Posner and Snyder 1975, Shiffrin and Schneider 1977). This concurs with common experiences, such as the ability to carry on a conversation while driving a car (an automatic process) but not while conducting multi-


digit arithmetic in one’s head (a process that relies on control). Many contemporary theories posit that there is actually a continuum between controlled and automatic processing (for a review, see Cohen et al. 1990). Nevertheless, virtually all theorists acknowledge the need for some mechanism, or set of mechanisms, responsible for the coordination of processing in a flexible fashion, particularly in novel or demanding tasks. This idea figures centrally in Baddeley’s classic theory of working memory (Baddeley 1986), which postulates two critical components: a storage component responsible for the active maintenance of information in a short-term store, and an executive control component responsible for the manipulation and coordinated use of this information. For example, in a multi-digit multiplication problem, the storage component maintains the intermediate products while the executive carries out the arithmetic operations.

The postulation of a central executive closely paralleled theorizing regarding the nature of frontal lobe function (for a review, see Shallice 1982), based on the clinical observation that patients with frontal lesions exhibit a ‘dysexecutive syndrome.’ The frontal lobes are the area of the brain most highly expanded in humans relative to other species (see Fig. 1). Damage to this area is associated with impairments in characteristically human cognitive functions that are directly dependent upon control (such as planning, or the ability to respond adaptively in novel circumstances), and disturbances of this structure have been implicated consistently in neuropsychiatric diseases such as schizophrenia that also appear to be uniquely human.

Indeed, the earliest neurologists, neuroscientists and neuropsychologists recognized the importance of the frontal lobes in the control of behavior. Perhaps this observation was made most dramatically in the classic case of Phineas Gage, the unfortunate railroad foreman who suffered damage to his prefrontal cortex (PFC) when a 1-1/2-inch-diameter rod penetrated his skull in a construction accident. Originally considered to be thoughtful, responsible, and of sound judgment, following the accident he was described as ‘capricious … and unable to settle on any of the plans he devised for future action’ (Harlow 1848). Such changes have been observed repeatedly in patients with damage to the frontal cortex (for a review, see Stuss and Benson 1986). Furthermore, patients with frontal damage perform poorly in tasks that require even the simplest forms of cognitive control, such as the Wisconsin Card Sort Task (WCST) (Milner 1964), and, most recently, brain imaging studies consistently have revealed increased activity of the prefrontal cortex while subjects are performing tasks that demand cognitive control (for a review, see Miller and Cohen 2001). Although these findings have led most investigators to assume that the prefrontal cortex plays a critical role in cognitive control, they have provided little insight into the specific contribution that it makes, or the mechanisms by which it operates. However, more detailed neurobiological studies have begun to shed light on this question.

Using single unit recording techniques, Fuster and Alexander (1971) and Niki (1974) reported the remarkable finding that some neurons in the prefrontal cortex continue to fire during the delay between a stimulus and a contingent response, even after the stimulus has disappeared. Goldman-Rakic and her colleagues followed up on this finding and, in a series of elegant studies, demonstrated that the firing of such neurons was stimulus-selective, and was critical for performance in delayed-response tasks (Goldman-Rakic 1990).

These findings strongly implicate neuronal activity in PFC as the site of temporary storage of information in working memory. Recent neuroimaging studies of humans have provided strong convergent support for this idea, demonstrating sustained activity in PFC during the performance of simple working memory tasks (for a review, see Smith and Jonides 1999). However, this discovery seemed to pose an interesting puzzle. Most of the neuropsychological data suggested that the PFC housed the central executive which, at least within Baddeley’s influential theory of working memory, was clearly distinguished from the storage component thought by most to be housed primarily in more posterior structures. Computational models of cognitive control have offered a different perspective on this problem that may reconcile the role of the PFC in control and storage.

1. Computational Models

Shallice (1982) presented the first account of the role of PFC in cognitive control within an explicitly computational framework. He proposed that the PFC housed a supervisory attentional system (SAS), a mechanism by which PFC coordinates complex cognitive processes. Shallice’s theory was described in terms of a production system architecture. This has appeal, as it relates the executive functions of frontal cortex to the well characterized mechanisms of other production system theories, which include the active representation of goal states to coordinate the sequences of production firings involved in complex behaviors (e.g., Anderson 1983). One feature of goal representations is that they must be maintained actively throughout the course of a sequence of actions to direct behavior effectively. This coincides with the observation that PFC appears to be specialized for the active maintenance of task-relevant information. Thus, it is possible that PFC is specialized for the maintenance of a particular type of information, as a means of executing control over performance.

While Shallice’s theory was not implemented originally as a functioning model, Kimberg and Farah (1993) proposed a model of frontal function, also


using a production system architecture, that simulated performance in a variety of tasks considered to rely on frontal lobe function (including the WCST), and illustrated that damage to this component of the model produced impairments similar to those observed in patients with frontal damage. However, while such models have produced insights into the functional role of PFC in cognitive control, their components do not have a transparent mapping on to specific neurobiological mechanisms. Neurobiological plausibility is not a requirement, per se, of a theory that seeks to explain the cognitive functions of PFC. Nevertheless, the question of how goal-directed behavior arises from the firing of millions of neurons presents a mystery in its own right. Furthermore, understanding how different forms of cognitive impairment arise from different forms of damage to PFC (e.g., the effects of stroke or injury vs. schizophrenia) provides an important motivation for understanding how cognitive control may arise from the specific neurobiological mechanisms housed within PFC.

Recently, investigators have begun to use neural network models to better understand how PFC may carry out its functions. Such models (also known as connectionist, or parallel distributed processing models) simulate the behavioral performance of human subjects (or animals) in cognitive tasks using neurally-plausible processing mechanisms (e.g., the spread of activity among simple processing units along weighted connections; see Rumelhart and McClelland 1986). The goal of this effort is to identify principles of neural function that are most relevant to behavior. Using this approach, Dehaene and Changeux (1989), Levine and Prueitt (1989), and Cohen and Servan-Schreiber (1992) have all described models of prefrontal function, and used these to simulate the performance of normal and frontally-damaged patients in tasks that are sensitive to PFC damage, such as the WCST and others. All of these models simulate PFC function as the activation of a set of units that represent the ‘rules’ of the task; that is, units whose activation leads to a task-relevant response, even when this may not be the one most strongly associated with the stimulus. However, in most models, the PFC units themselves are not responsible for generating the response directly. Rather, they influence the activity of other units whose responsibility this is. This is clearly illustrated by a model of the Stroop task developed by Cohen et al. (1990).

In the Stroop task (Stroop 1935), subjects respond to words either by reading them, or by naming the color in which they are displayed. In the critical condition, the word itself conflicts with the color in which it is displayed (e.g., the word GREEN displayed in red). Subjects have no trouble reading such words (e.g., saying ‘green’). However, when they are asked to name the color (e.g., say ‘red’), they are significantly slower, and sometimes even make errors. This reflects the highly practiced, and therefore prepotent tendency to read written words, which interferes with the ability to name the color of a Stroop conflict stimulus. The ability to name the color in the face of such interference (that is, to produce the weaker, but task-relevant response) is a simple but clear example of cognitive control.

Figure 2
Model of the Stroop Task (adapted from Cohen et al. 1990), showing the pathways for word reading, color naming, and top-down control. Connections between units in different layers are excitatory and within layers are inhibitory (closed circles). Shown with color dimension unit active, biasing color-naming pathway

Cohen et al. (1990) constructed a model of this task, as shown in Fig. 2. The connections among the units in this model defined two processing pathways, one for word reading and another for color naming. The connections in the word-reading pathway were stronger, capturing the assumption that this was the more practiced task. Because of these stronger connections, information flowing along the word pathway interfered with color naming, simulating the interference effects observed when human subjects perform this task. Indeed, the model’s ability to produce a response to the color in the face of such interference required the addition of a set of units (labeled ‘Dimension’ in Fig. 2), which provided additional activation of the units in the color-naming pathway. This additional activation biased processing in favor of this pathway, allowing it to compete more effectively with, and prevail over, activity flowing along the stronger word pathway. This biasing effect corresponds precisely to the role of top-down control in neurophysiological models of attention (such as the Biased Competition Model of Desimone and Duncan 1995), and has been proposed as a mechanism by which PFC exerts control over processing (Cohen and Servan-Schreiber 1992, Miller and Cohen 2001).
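
The biasing idea in the Stroop model can be illustrated with a minimal sketch in Python. This is not the Cohen et al. (1990) implementation: that model used trained weights and time-dependent activation, whereas the static version below uses hand-picked weights and omits within-layer inhibition. All names and numbers here (STRENGTH, HIDDEN_BIAS, TASK_GAIN, and the control parameter) are assumptions chosen only for illustration.

```python
import math

LABELS = ("red", "green")

# Hypothetical parameters, chosen for illustration only.
STRENGTH = {"word": 3.0, "color": 2.0}  # word reading is the more practiced pathway
HIDDEN_BIAS = -4.0  # pathway units are nearly silent without support
TASK_GAIN = 4.0     # top-down input from the task ("Dimension") unit


def logistic(x):
    """Standard logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def evidence(color, word, task, control=1.0):
    """Summed input to each response unit for one stimulus.

    The task unit adds the same top-down input to every unit in its own
    pathway, so it carries no information about which response is correct.
    """
    stimulus = {"color": color, "word": word}
    totals = dict.fromkeys(LABELS, 0.0)
    for pathway, strength in STRENGTH.items():
        top_down = TASK_GAIN * control if pathway == task else 0.0
        for label in LABELS:
            bottom_up = strength if stimulus[pathway] == label else -strength
            hidden = logistic(bottom_up + HIDDEN_BIAS + top_down)
            totals[label] += strength * hidden
    return totals


def respond(color, word, task, control=1.0):
    """Return the response with the most evidence."""
    totals = evidence(color, word, task, control)
    return max(LABELS, key=totals.get)


def margin(color, word, task, control=1.0):
    """Evidence for the task-relevant response minus evidence for the other."""
    totals = evidence(color, word, task, control)
    correct = color if task == "color" else word
    other = "green" if correct == "red" else "red"
    return totals[correct] - totals[other]
```

Under these assumed parameters, the conflict stimulus (the word GREEN displayed in red) produces `respond("red", "green", "color")` equal to `"red"` with full control, and `"green"` when `control=0.0`, which loosely mirrors the loss of top-down bias. The margin for color naming is smaller than for word reading on the same stimulus; it is used here only as a crude stand-in for the slower responses described in the text.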


Models such as this have been used to simulate the performance of both normal subjects and patients with frontal damage in a wide range of tasks that tap cognitive functions commonly associated with PFC function, such as working memory, attention, behavioral inhibition, planning, and problem solving.

2. Guided Activation as a Method of Control

The Stroop model brings several features of PFC function into focus. First, it emphasizes the view that the role of the PFC in control is modulatory, guiding the flow of activity along pathways in other parts of the brain that are responsible for task performance. For example, activating the color unit does not in itself transmit information about a particular response (red or green). Rather, it simply ensures that activity flowing along the color-naming pathway will have a greater influence over the response than activity flowing along the word pathway. In this way, representations within the PFC can function as intentions, rules, or goals (comparable to those in a production system architecture), by setting up the appropriate relationship between a stimulus (or category of stimuli) and an associated response (or set of responses) through the proper guidance of activity along pathways in other parts of the brain. Recent neurophysiological findings regarding the firing properties of neurons in PFC provide strong support for this view (for a review, see Miller and Cohen 2001). Recent neuroimaging studies also provide convergent support, indicating that PFC activity occurs when behavior relies upon explicit knowledge about rules or arbitrarily determined conjunctions of stimulus features (for a review, see Miller and Cohen 2001). Note that this function is not necessarily restricted to mappings from stimuli to responses, but applies equally well to mappings involving internal states (e.g., thoughts, memories, emotions, etc.). Note also that this function is not necessarily unique to PFC. There may also be more local forms of control, responsive to regionally specific needs for the biasing of processing. However, the wide range of anatomic connections that the PFC shares with virtually all associative areas of the brain places it in a strategic position to guide the flow of activity between and among these other regions (for a review, see Miller and Cohen 2001), a position well suited to its presumed role in the control of higher cognitive processes.

The emphasis of this guided activation model of PFC on its modulation of other brain areas actually responsible for task execution is consistent with the classic pattern of neuropsychological deficits associated with frontal lobe damage: The individual elements of a complex behavior are usually left intact, but the subject is not able to coordinate them in a task-appropriate way (for example, a patient who, when preparing coffee, first stirred and then added cream; Shallice 1982). The guided activation model also captures another critical feature of theories about the role of PFC in executive function: the importance of sustained activity as a critical component of control. For a representation to have a biasing influence, it must be activated. For this influence to endure (e.g., over the course of performing a task), its activity must be sustained. For example, to continue color naming, the activity of the color unit must be maintained, lest the word begin to dominate processing. Similarly, an increase in the demand for control requires greater or more enduring activity of the corresponding units in PFC. This concurs with accumulating evidence from the neuroimaging literature that tasks thought to rely on controlled processing consistently engage the PFC (for a review, see Smith and Jonides 1999). It is also consistent with both behavioral and neuroimaging findings regarding the effects of practice on automaticity and the involvement of PFC. Increased practice on a task should strengthen its underlying pathway, reducing its reliance on control. Indeed, consistent practice on a task reduces the amount of PFC activity observed, and PFC damage can impair new learning but spare performance on well-practiced tasks (for a review, see Miller and Cohen 2001).

Finally, this approach helps to unify the role that PFC plays in the variety of cognitive functions with which it has been associated, such as selective attention, behavioral inhibition, and executive function in working memory. These can all be seen as varying reflections, in behavior, of the operation of a single underlying mechanism of cognitive control: the biasing effects of representations in PFC on processing in pathways responsible for task performance. For example, selective attention and behavioral inhibition may be viewed as two sides of the same coin: Attention is the effect of biasing competition in favor of task-relevant information, and inhibition is the consequence that this has for the irrelevant information (cf. the Biased Competition Model of Desimone and Duncan 1995). This assumes that inhibition occurs due to local competition between conflicting representations (e.g., between the two responses in the Stroop model), rather than centrally by the PFC. The ‘binding’ function of selective attention can also be explained by such a mechanism, by assuming that PFC representations can select the desired combination of stimulus features (over other competing combinations) to be mapped on to a response.

3. Outstanding Questions

The guided activation model provides an integrative and mechanistically explicit framework for considering the role of PFC in cognitive control. At the same time, it brings into focus a number of important


and unanswered questions. For example, despite the longstanding observation of delay-period activity for PFC neurons, and the centrality of this property in neural network models of PFC, little is known about the actual mechanisms by which neuronal activity is sustained. This could reflect a cellular property of PFC neurons (e.g., bistability), or a circuit-level phenomenon (e.g., recirculation of activity within the PFC, or between PFC and other structures). Several models have proposed the latter, assuming that representations are maintained in PFC as attractor states (for a review, see O’Reilly et al. 1999). However, neurobiological studies are needed to confirm this hypothesis. A closely related question concerns the mechanisms by which patterns of activity are updated in PFC. These must be able to satisfy two conflicting demands: On the one hand, they must be responsive to relevant changes in the environment; and on the other, they must be resistant to updating by irrelevant changes. Neurophysiological studies suggest that PFC representations are selectively responsive to task-relevant stimuli and robust to interference from distractors (for a review, see Miller and Cohen 2001). Not surprisingly, two hallmarks of damage to PFC are perseveration (failure to update) and distractibility (inappropriate updating). These observations suggest the operation of mechanisms that ensure the appropriate updating of PFC activity in response to behavioral demands.

Recent modeling work has suggested that brainstem dopaminergic systems and the basal ganglia may play an important role both in updating representations in PFC and learning how and when to do so (for a review, see Miller and Cohen 2001). Other work has suggested that the anterior cingulate cortex, a midline structure within the frontal lobes, may play an important role in monitoring task performance, and identifying the need to allocate control (Botvinick et al. 2001). Such theories make important predictions about the neural mechanisms underlying cognitive control that serve as valuable challenges to future neurobiological research in this area.

An equally fundamental question concerns the nature of representations in PFC and how they arise. For example, the Stroop model assumes there are units that represent each of the two dimensions of the stimulus (color and word), with the appropriate connections to the corresponding pathways. However, are there such units in the PFC for every possible combination of stimulus and response of which tasks may be composed? This seems unlikely. Yet, there must be sufficient representational richness to support the flexibility in behavior that the PFC seems to afford. What principles define this set of representations, how they are organized, and how they are learned are important questions at both the computational and neurobiological levels.

A better understanding of the mechanisms underlying active maintenance may also provide insight into one of the most striking and perplexing properties of cognitive control: its severely limited capacity. This has long been recognized in cognitive psychology (Posner and Snyder 1975, Shiffrin and Schneider 1977), and is painfully apparent to anyone who has tried to talk on the phone and read email at the same time. The resource limitation of cognitive control has played an explanatory role in many important models of human cognition. However, to date, no theory has provided an explanation of the limitation itself. This limitation is a sine qua non of cognitive control, and therefore provides an important benchmark for any theory that seeks to explain its underlying mechanisms.

Finally, it is important to recognize that the PFC is certainly not the only brain structure involved in cognitive control. As noted above, mechanisms similar to those within PFC may operate locally in other parts of the brain. Furthermore, there are certainly other types of mechanism critical to cognitive control. For example, the mechanisms responsible for keeping an immediate goal in mind (e.g., working on a book chapter) are not likely to be the same ones responsible for realizing long-term goals (e.g., getting the book published). While the former may be guided by representations actively maintained in PFC, the latter almost certainly engage mechanisms of long-term storage. Some suggestions have been made about how the PFC may interact with the hippocampus to orchestrate the storage and retrieval of goal representations at appropriate times (O’Reilly et al. 1999); however, this remains another area in need of further research.

4. Conclusion

One of the great mysteries of the brain is how purposeful, goal-directed behavior emerges from the millions of relatively simple processing units that are its basic computational elements. Behavioral, neuropsychological, and neurobiological data converge on the idea that the frontal lobes play a critical role in cognitive control. Neural network models have begun to suggest how this may be carried out, as sustained patterns of activity within PFC modulate, or ‘guide,’ the flow of activity along pathways in other parts of the brain responsible for performing a task. However, many fundamental questions remain to be addressed. The human brain is arguably the most complex device in the known universe, and its capacity for higher cognitive function continues to be one of its deepest mysteries. Unraveling this mystery stands as one of the most exciting challenges in science, and the rapid development of sophisticated new empirical methods (such as functional brain imaging) and theoretical tools (such as neural network modeling) offers hope that this challenge can be met.


See also: Dysexecutive Syndromes; Prefrontal Cortex; Smith E E, Jonides J 1999 Storage and executive processes in the
Prefrontal Cortex Development and Development of frontal lobes. Science 283: 1657–61
Stroop J R 1935 Studies of interference in serial verbal reactions.
Cognitive Function
Journal of Experimental Psychology 18: 643–62
Stuss D T, Benson D F 1986 The Frontal Lobes. Raven, New
York
Bibliography J. D. Cohen
Anderson J R 1983 The Architecture of Cognition. Harvard
University Press, Cambridge, MA
Baddeley A 1986 Working Memory. Clarendon Press, Oxford,
UK Cognitive Development: Child Education
Botvinick M M, Braver T S, Carter C S, Barch D M, Cohen J D
2001 Conflict monitoring and cognitive control. Psychological
Review By the nature of their subject matter, education and
Cohen J D, Dunbar K, McClelland J L 1990 On the control of cognitive development are closely intertwined. The
automatic processes: A parallel distributed processing account goal of education, to produce knowledgeable problem
of the Stroop effect. Psychological Review 97: 332–61 solvers, who can apply their knowledge flexibly in real
Cohen J D, Servan-Schreiber D 1992 Context, cortex and world contexts, and who have the skills and motivation
dopamine: A connectionist approach to behavior and biology to acquire effectively new knowledge and understand-
in schizophrenia. Psychological Review 99: 45–77
ing, is what models of cognition and cognitive de-
Dehaene S, Changeux J P 1989 A simple model of prefrontal
cortex function in delayed-response tasks. Journal of Cognitie velopment attempt to describe and explain—that is,
Neuroscience 1: 244–61 how we learn, remember and know, and how these
Desimone R, Duncan J 1995 Neural mechanisms of selective processes change and are affected by the interplay of
visual attention. Annual Reiew of Neuroscience 18: 193–222 environment and person over development.
Fuster J M, Alexander G E 1971 Neuron activity related to This article first presents a brief historical overview
short-term memory. Science 173: 652–4 and summary of the assumptions underlying psycho-
Goldman-Rakic P S 1990 Cellular and circuit basis of working logical and especially cognitive developmental theories
memory in prefrontal cortex of nonhuman primates. Progress and research as they pertain to education, and then
in Brain Research 85: 325–35 considers a set of current issues that highlight the link
Harlow J M 1848 Passage of an iron rod through the head.
Boston Medical and Surgical Journal 39: 389–93
between the two and the potential for dynamic
Kimberg D Y, Farah M J 1993 A unified account of cognitive interaction. ‘Education’ is clearly a broad term that
impairments following frontal lobe damage: The role of can encompass learning throughout the lifespan. In
working memory in complex organized behavior. Journal of this article the central focus will be on education
Experimental Psychology 122: 411–28 defined as formal instruction during the school years.
Levine D S, Prueitt P S 1989 Modeling some effects of frontal
lobe damage—novelty and perseveration. Neural Networks 2:
103–16 1. Historical Oeriew
Miller E K, Cohen J D 2001 An integrative theory of prefrontal
cortex function. Annual Reiew of Neuroscience 24: 167–202 Historically, the impact of understanding of cognitive
Milner B 1964 Some effects of frontal lobectomy in man. In: development on educational practice can be traced to
Warren J M, Akert K (eds.) The Frontal Granual Cortex and the influences of psychology in general. The great
Behaior. McGraw-Hill, New York, pp. 313–31 movements of functionalism and behaviorism that
Niki H 1974 Prefrontal unit activity during delayed alternation shaped psychology in the first half of the twentieth
in the monkey. 1. Relation to direction of response. Brain century also affected models of education and edu-
Research 68: 185–96 cational practice. The subsequent ‘cognitive revol-
O’Reilly R C, Braver T S, Cohen J D 1999 A biologically-based
computational model of working memory. In: Miyake A,
ution’ affected education in a variety of ways. Theory
Shah P (eds.) Models of Working Memory: Mechanisms of and research on cognitive skill development (e.g.,
Actie Maintenance and Executie Control. Cambridge Uni- strategy acquisition and use) and on conceptual
versity Press, New York structure (e.g., concept organization and change;
Posner M I, Snyder C R R 1975 Attention and cognitive control. logicomathematical reasoning) led to a concern with
In: Solso R L (ed.) Information Processing and Cognition. facilitating reasoning and problem-solving strategies,
Erlbaum, Hillsdale, NJ and with facilitating concept acquisition and retrieval,
Rumelhart D E, McClelland J L 1986 Parallel Distributed especially through active, hands-on problem solving.
Processing: Explorations in the Microstructure of Cognition. Although educational practice has been influenced
MIT Press, Cambridge, MA
Shallice T 1982 Specific impairments of planning. Philos. Trans.
at the most general level by ideas about cognition and
R. Soc. London Ser. B 298: 199–209 cognitive development, a dynamic interplay between
Shiffrin R M, Schneider W 1977 Controlled and automatic cognitive developmental research and educational
human information processing: II. Perceptual learning auto- practice has been neither widespread nor systematic in
maticity, attending and a general theory. Psychology Reiew the typical classroom. This is not surprising, as noted
84: 127–90 by Olson and Bruner (1996, p. 9): ‘Theoretical knowl-

2094
Cognitie Deelopment: Child Education

edge of how cognition develops continues to grow but models of cognitive functioning, led to a focus on
just how to relate this knowledge to the practical specifying task structure and problem solving strat-
contexts … to educate … remains almost as mysteri- egies, and on the information processes underlying
ous as when such efforts first began.’ This is beginning memory, reasoning, and learning.
to change as new tools emerge. Over the past decades,
cognitive science research has produced detailed ana-
lyses of tasks and learners that allow more soph- 2. Cognitie Deelopment
isticated models of the knowledge that a learner brings
to the educational setting, and of the component skills Examples of the specific influences of cognitive de-
necessary for successful performance within specified velopment theories and research can be captured by
domains such as mathematics and physics. In addition, answers to the questions: who is the learner, what is
there are powerful models of early and late conceptual being learned, and how does learning proceed?
skill development, and of reasoning processes that
emerge within and outside of formal education. These
models carry the promise of better specification of the 2.1 Who Is the Learner?
design and implementation processes for successful Probably one of the most revolutionary ideas of
learning environments. scholarly study and research on childhood and educa-
tion undertaken in the twentieth century was that
neither education nor cognitive development can be
understood without taking account of the active
1.1 Specific Influences child\learner. Although this view informed ideas
about education from an early point (cf. White 1992),
The influence of cognitive developmental theory and it was strengthened by widespread interest in the
research on education has changed as models of theories of Jean Piaget and their adaptation to
cognition, learning, and education have altered. Dur- educational issues, especially mathematics and science
ing the heyday of behaviorism, especially in the USA, education. The central idea is that the child actively
the explanatory mechanisms underlying both devel- assimilates new information on the basis of general
opment and education were based on elementary reasoning structures, which themselves change in
principles of associative learning; educational strat- regular ways over the course of development. Recent
egies were based on behavior analysis, association, research on the cognitive abilities of the very young
and reinforcement contingencies; and educational child and infant have complemented this idea to
outcome was assessed in quantitative terms. At the suggest that knowledge and knowledge acquisition in
beginning of the twenty-first century, although general some domains (such as language, arithmetic, and the
learning principles inform educational practice (e.g., elementary natural and social worlds) is precocious.
the relative value of internal vs. external reinforce- One conclusion from this work is that the child brings
ment, the effectiveness of spaced vs. massed trials, to the formal educational context a set of robust ideas
learning and forgetting curves, curricula organized as (or ‘theories’) about the physical, social, and natural
an accumulation of simple units) a strict learning worlds.
model approach has largely been restricted to beha- The goal of information-processing\cognitive sci-
vioral modification programs and some programmed ence models is to develop descriptions of learners,
instruction, most notably for foreign language learn- tasks, processes, and performance that will allow a
ing. detailed specification of the componential skills and
The most general result of the cognitive revolution processes underlying cognitive activities. From this
was a new focus on the active, problem-solving child more information-processing perspective, the learner
and classroom. Although an oversimplification, two is seen as a consumer who gathers, stores, organizes,
influences were particularly important. One influence and uses information in problem solving, and
is from the broad class of structuralist models, most who, with development and instruction, becomes
notably Jean Piaget’s genetic epistemology. Most of increasingly adept at directing, monitoring, and
these models have been based on the assumption that manipulating these skills strategically across a variety
the child is an active participant in the acquisition and of content areas.
organization of knowledge—that is, that there is a
dynamic interplay between the knower’s current level
of understanding and the information and problems
2.2 What Is Being Learned?
presented by the environment. This perspective led to
a focus on understanding the child’s view of events, on There is of course no single ‘cognitive developmental’
describing the nature and organization of concepts answer to the question of what is learned, but rather a
and reasoning structures, and on matching instruction family of answers that have in common a focus on
to developmental level. A second influence, arising knowledge and mental representation as the products
from the broad array of information-processing of experience and\or education.


The gist of most structuralist viewpoints is that cognitive development consists of qualitative changes in the ways that concepts (i.e., knowledge about the mathematical, material, and social worlds) are organized. These changes enable the child to reason in increasingly complex ways. The description of conceptual structure varies: according to Piagetian models, conceptual structure is described in terms of general and universal logical-mathematical relations; other models refer to causal or semantic structures that may vary across content domains.

The answer to ‘what is learned’ from an information-processing perspective on cognitive development is skill-based. A large number of cognitive developmental studies have documented changes in the speed and control of processing skills (perception, attention, and memory), and associated increases in cognitive performance, as well as changes in conceptual organization and rule use. Catalyzed by a seminal paper by Flavell (1979) on metacognition and cognitive monitoring, cognitive developmental research focused on the acquisition of cognitive processing strategies, and metacognitive skills and knowledge, as important mechanisms of developmental change in cognitive performance. Teaching these skills to improve general performance across content areas (e.g., self-monitoring, strategy retrieval and use) has informed research and practice, as has teaching these skills within a variety of content areas, including reading, writing, mathematics, and science.

2.3 How Does Learning Proceed?: Implications for Education

The central tenet of constructivist models is that cognitive change/learning is a reciprocal process of interpreting information according to current conceptual structure and adapting that structure to task demands. Learning does not proceed by recording received information directly or passively. According to constructivist models, learning proceeds best when the learner can experiment actively to discover, invent, or infer the solutions to problems. This approach is reflected in active, hands-on classrooms, where a host of strategies attempt to foster active engagement: the discovery method, activity centers, and inquiry-based instruction that focuses on acquiring principles, not just facts (e.g., Hedegaard and Lompscher 1999).

From a cognitive science/information-processing perspective, learning proceeds by acquiring declarative and procedural knowledge within content domains, domain-specific and domain-general rules for operating with that content, strategies and skills for interpreting and attending to new information, and an ability to monitor, access, and control these skills intentionally, including learning how to learn (cf. Bruer 1993, Bransford et al. 1999).

3. Contemporary Issues for Cognitive Development and Education

The following section discusses four complementary areas in cognitive developmental research that have a strong potential to affect educational models and practice: (a) the metaphor of the child as ‘universal novice’, (b) conceptual change models (the role of naive theories and ‘misconceptions’), (c) individual differences, (d) cognitive development and education as social phenomena.

3.1 The Development of Novice-to-Expert

The discovery that experts and novices, regardless of age, differ with respect not only to the sheer amount of knowledge they possess, but also with respect to the organization of that knowledge, offers a metaphor for the processes of education in which universal novices become educated experts. Many studies have demonstrated that prior knowledge predicts learning outcomes, and that the knowledge base of novices in any particular domain differs from that of experts in systematic ways (Bransford et al. 1999). This approach has the potential to make a powerful impact on educational practice because it suggests that the acquisition of domain-specific information, not just domain-general skills, is necessary for deep knowledge acquisition. Expert content knowledge can compensate for other factors that predict performance, such as age, aptitude, or metacognitive skills. The expert–novice distinction, along with attempts to design curricula that move novices toward expertise, has become a catchword of general education in math and science, and in specialized curricula such as medicine.

3.2 Intuitive Knowledge and Misconceptions

Studies of reasoning, concept formation, and conceptual understanding in young children have demonstrated convincingly that there is probably no period during development when the child is a ‘blank slate.’ Rather, even young children show rich knowledge about a variety of concept domains, including number, biological kinds, physics, social phenomena, and the like. Although this knowledge can provide a strong initial base for formal instruction, some early knowledge may clash with information presented in formal learning contexts for mathematical, literary, and scientific domains, because the concepts to be learned in these domains do not match everyday experience. For example, children’s difficulties in performing mathematical operations on fractions or negative numbers arise in part from inappropriately generalizing knowledge about natural, whole numbers (Gelman 1994). Similarly, informal knowledge about biological kinds, movement through space, physical causality, and even cosmology, can clash with learning


about formal, complex systems in school (e.g., acceleration, force, gravity, and biology). Most clearly demonstrated in math and science, the study of ‘misconceptions’ has shown the power of children’s implicit and everyday concepts. It underscores the importance of basing instruction on a good diagnosis of current understanding, and of devising ways to promote conceptual change through active experimentation and confrontation in rich everyday contexts. There is general agreement that ‘a logical extension of the view that new knowledge must be constructed from existing knowledge is that teachers need to pay attention to the incomplete understandings, the false beliefs, and the naive renditions of concepts that learners bring with them to a given subject’ (Bransford et al. 1999, p. 10).

3.3 Individual Differences

In its attempts to explain mental growth and change, mainstream cognitive developmental research has focused more on phenomena that are believed to characterize development universally, and less on individual differences. Nonetheless, a number of traditional and emerging areas suggest that systematic differences not only in cognitive style and cognitive strategies, but also in more basic conceptual structure, may provide a means to tailor educational practice more closely to children’s specific learning needs. For example, well documented sex differences in spatial skills that may be tied to differences in representation mode or organization, not just to experiential differences, may be exploited to develop alternative methods of mathematics instruction. Analogously, descriptions of multiple forms of intelligence (cf. Gardner 1993) may offer to the teacher different approaches to a topic and different modes of presenting key concepts (Bransford et al. 1999).

Detailed studies tracing the processes of conceptual change, strategy acquisition and discovery, and the like, also illustrate the large range of individual differences in developmental rate, style, and pattern. Microgenetic studies, i.e., investigations that follow the emergence, development, and consolidation of cognitive skills at an intensive individual level over a period of time, have been applied to a variety of content domains such as language, mathematical skills and problem solving, scientific reasoning, and memory and concept development (cf. Siegler and Crowley 1991, Weinert and Schneider 1999). These studies have illustrated the large variability in skill learning and use, and have shown that average developmental functions do not characterize developmental change at the individual level. Insight into the conditions that facilitate problem solving and that help consolidate newly formed competencies may inform the development of more individualized learning assessment and curricula.

3.4 Cognition in Context

Researchers have recently returned to classic questions concerning the role of culture in cognition, the importance of context and motivation in explaining and understanding cognition in everyday contexts, and the influences of formal and informal learning contexts on cognitive development. Two phenomena have heightened such interest: national differences in cognitive performance, especially in mathematics and science; and research findings showing large discrepancies between cognitive performance in formal educational settings and informal everyday contexts. Both of these perspectives have motivated new research on the types and effects of formal and informal education and have amply illustrated the effects of schooling on a variety of cognitive tasks tapping mathematics, logic, classification, and memory strategies (cf. Rogoff and Chavajay 1995).

A cognitive model that has been highly influential in education is based on the theories of Lev Vygotsky. Vygotsky (1962) characterized cognition as the internalization of external and culturally transmitted structure, rules, and principles that are mediated by language. According to this model, development proceeds most effectively when there is adequate environmental support within the ‘zone of proximal development,’ a construct to indicate the difference between a child’s actual and potential performance. The zone of proximal development is usually measured as the difference between tasks a child can solve working independently, and those a child can solve with assistance from adults, instructors or other competent models. This approach underlies ‘reciprocal education’ and ‘reciprocal teaching’, in which the learner acquires strategies from expert models in social settings. The educational goal is to develop supporting social contexts in which a ‘community of learners’ collaborates in fostering learning outcomes (Brown and Campione 1994).

4. Conclusions

As noted above, mainstream cognitive developmental research and theory have influenced educational practice at only the most general levels. As the relatively new field of multidisciplinary cognitive science has become established and institutionalized, its methods and results are being tested in school contexts (cf. Bruer 1993, Bransford et al. 1999). The fields of cognitive development and education are ripe for forging collaborations that allow the science of learning to inform the practice of education in classroom contexts.

See also: Cognitive Development in Childhood and Adolescence; Cognitive Development: Learning and Instruction; Instructional Technology: Cognitive Science Perspectives; Piaget, Jean (1896–1980);


Piaget’s Theory of Human Development and Education; Situated Cognition: Contemporary Developments; Situated Learning: Out of School and in the Classroom; Vygotskij’s Theory of Human Development and New Approaches to Education

Bibliography

Bransford J, Brown A L, Cocking R (eds.) 1999 How People Learn: Brain, Mind, Experience and School. National Academy Press, Washington, DC
Brown A L, Campione J C 1994 Guided discovery in a community of learners. In: McGilly K (ed.) Classroom Lessons: Integrating Cognitive Theory and Classroom Practice. MIT Press, Cambridge, MA
Bruer J T 1993 Schools for Thought. MIT Press, Cambridge, MA
Flavell J H 1979 Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. American Psychologist 34(10): 906–11
Gardner H 1993 Frames of Mind. Basic Books, New York
Gelman R 1994 Constructivism and supporting environments. Human Development 6: 55–82
Hedegaard M, Lompscher J (eds.) 1999 Learning Activity and Development. Aarhus University Press, Aarhus, Denmark
Olson D, Bruner J 1996 Folk psychology and folk pedagogy. In: Olson D, Torrance N (eds.) The Handbook of Education and Human Development. Blackwell, Cambridge, MA, pp. 9–27
Rogoff B, Chavajay P 1995 What’s become of research on the cultural basis of cognitive development? American Psychologist 50(10): 859–77
Siegler R S, Crowley K 1991 The microgenetic method: A direct means for studying cognitive development. American Psychologist 46(6): 606–20
Vygotsky L S 1962 Thought and Language. MIT Press, Cambridge, MA
Weinert F, Schneider W (eds.) 1999 Individual Development from 3 to 12: Findings from the Munich Longitudinal Study. Cambridge University Press, New York
White S H 1992 G. Stanley Hall: From philosophy to developmental psychology. Developmental Psychology 28(1): 25–34

M. Bullock

Cognitive Development in Childhood and Adolescence

In a typical textbook on cognitive development, one would find chapters on representation, memory, language, conceptual development, reasoning, problem solving, and strategy development and use. Currently, no single theory unites the study of all of these areas of cognition. Indeed, little communication exists among them. Not unlike the Indian parable in which six blind men give different answers to ‘what is an elephant,’ researchers in each area suggest that something different is ‘what develops’ in cognitive development in childhood and adolescence. In this article the domain of causal reasoning is used to describe general trends in cognitive development. After a review of age-related developments in causal reasoning, processes that have been implicated as mechanisms of change in causal reasoning specifically and cognitive development more generally (including analogical reasoning, attention, working memory, selection and use of strategies, and domain-specific knowledge) are described. The article closes with a brief review of postnatal neural developments thought to underlie developmental changes in these processes.

1. Causal Reasoning

Although no one theory unites the numerous disparate domains of cognitive development, some cognitive operations are common across them. Causal reasoning provides an illustrative example of how several areas of cognition develop in concert to support a higher-level skill. Reasoning can be defined as goal-directed activity that often relies on the use of inferences to reach a conclusion (DeLoache et al. 1998). Causal relations are ‘the cement of the universe’ (Mackie 1980) in that they provide systematic links between precursors and consequences. Causal reasoning thus allows for identification of the cause–effect relations observed in the world. Inhelder and Piaget (1958, 1964), founders of scholarship in cognitive development, began the tradition of research on causal reasoning in the developing child and set the stage for much of the research done to date. Motivated by the desire to understand how children and adolescents think and reason about events in the world, researchers have examined infants’ perceptions of causality, young children’s ability to perform causally linked actions in sequential order so as to achieve a goal, and older children’s abilities to infer causal mechanisms and to test for cause–effect relations (i.e., to reason scientifically).

For more detailed discussion of the above topics and for additional references see Transfer of Learning, Cognitive Psychology of; Early Concept Learning in Children; Infant Development: Physical and Social Cognition; Piaget’s Theory of Child Development; Scientific Concepts: Development in Children; and Problem Solving (Everyday), Psychology of.

2. Early Developments in Causal Reasoning

2.1 Perception of Causality in Infancy

Within a few months of birth, infants demonstrate sensitivity to causes and their consequents. Oakes and Cohen (1990) showed 7- and 10-month-old infants different events in which one toy traveled across a screen and either (a) made contact with another toy, which immediately moved; (b) made contact with another toy, which moved only after a brief delay; or


(c) stopped before contacting the other toy, which, after a brief delay, moved. Only the first event contained the causal elements of spatial and temporal contiguity. The researchers found that the 10-month-olds were sensitive to the causal structure of the events, as demonstrated by longer looking at the events that violated causal principles. Leslie (1984) found that infants as young as six and a half months are sensitive to causal structure when simpler stimuli (e.g., colored blocks) are used.

2.2 Enacting Causal Sequences to Achieve a Goal

By the second half of the first year of life infants not only appear to recognize basic conditions of causality, they also use them to guide their own actions to reach desired goals. In Willatts (1984), nine-month-old infants were presented with a cloth within their reach on which rested a desired toy just outside their reach. The infants needed to remove a barrier in order to pull the cloth and thereby obtain the toy. Compared to infants in a control condition in which the toy rested just off the cloth, infants in the ‘causal’ condition more often removed the barrier and pulled the cloth. By 12 months of age infants are able to solve a more complex version of this means–ends task: they successfully navigate multiple steps including removing a barrier to reach a cloth, pulling the cloth to reach a string, and reeling in a toy attached to the string. By 18 months, in age-appropriate versions of means–ends tasks, children use a variety of strategies and monitor the effectiveness of their strategies for reaching their goals (see Willatts 1990 for a review).

2.3 Planning a Path from Cause to Effect

Shortly after children show appreciation of existing causal connections, they begin to create their own paths from causes to effects. In Bauer et al. (1999), 21- and 27-month-old children were required to plan a course of action to achieve an effect. The children were shown either the initial state or the goal state of a problem that required a three-step solution. Each of the steps was necessary, but not sufficient, to achieve the goal. For example, for the problem of ‘making a rattle’ (Step 1: put a block in a cup; Step 2: cover the cup and block with a second cup; Step 3: shake the cups to make a rattling sound), an experimenter modeled either the initial step of putting the block in the cup or the goal step of shaking the rattle. In both cases, the experimenter verbally provided the children with the goal of the activity (i.e., ‘make a rattle’). Note that the children were given the same amount of information (one step of the three-step solution) in each of the two conditions. When the children were shown the initial step of the causal sequence, neither the 21- nor the 27-month-olds solved the problem on more than 8 percent of the trials. In contrast, when provided with the goal step of the solution, 23 percent of the 21-month-olds and 50 percent of the 27-month-olds were able to plan a path to the goal. This study not only demonstrates the primacy of the goal state in aiding children’s production of cause–effect sequences, but also shows rapid developments in planning abilities.

2.4 Identifying Causal Relations in the Preschool Years

Whereas in the toddler years children show facility with planning a course of action to achieve a specific effect, in the preschool years they reason as effectively about effects, causes, and the steps that unite them. Gelman et al. (1980) trained three- and four-year-old children to read three-picture ‘stories’ that depicted the initial state of an object, a causal agent, and the object in a transformed state. For example, children were shown an intact coffee cup (initial state), a hammer (causal agent), and a broken coffee cup (transformed state). The children then were presented with a series of stories in which one of the three pictures was missing. They were asked to complete the story by selecting one of three choice cards. The researchers found that both three- and four-year-olds were able to infer the causal agent as well as the initial and transformed states. In related research it has been shown that children understand a number and variety of causal relations, including melting, cutting, and burning (Goswami and Brown 1989).

Just as important as the ability to infer causes, consequences, and causal agents is the ability to recognize that not all cause–effect relations are deterministic. In the real world, causality tends to be probabilistic. For example, if an object is tremendously heavy, pulling the cloth on which it rests might not be sufficient to retrieve it. Kalish (1998) tested three- and five-year-olds’ and adults’ understanding of probabilistic causal relations in the domain of illness. Whereas adults recognize that causes of illness are probabilistic (e.g., coming into contact with a sick person does not inevitably result in getting sick), children treat them as deterministic. This research suggests that the development of sensitivity to probabilistic causality is more protracted, relative to understanding of causal events with definite outcomes. Thus, lack of sensitivity to probabilistic outcomes represents a limitation of the causal understanding of preschool-age children.

3. Later Developments in Causal Reasoning

In addition to demonstrated competencies in reasoning about cause–effect relations, preschool-age children also show early manifestations of scientific reasoning. The goal of scientific reasoning is to test


hypotheses in order to identify cause–effect relations.
Children as young as four and five years of age will
search for the causal mechanism that resulted in an
effect even when they do not see it (Bullock 1984).
Therefore, even very young children are sensitive to
the necessity of a cause. Nevertheless, just as they
initially overgeneralize and treat all cause–effect
relations as deterministic (i.e., failing to recognize the
probabilistic nature of some causal relations), as
shown below, in the preschool and early school years,
children also overgeneralize one of the strongest cues
to causality, namely, contiguity.

Schlottman (1999) presented five-, seven-, and
nine-year-old children and adults with a mystery box that
could contain one of two mechanisms. Both
mechanisms caused a bell to ring when a ball was dropped
in one of two holes in the box. One causal mechanism
was ‘slow’ in ringing the bell because the ball had to
travel down a runway in order to ring the bell at the
opposite end of the box. The other causal mechanism
was ‘fast’ because the ball dropped onto a lever that
acted like a seesaw and immediately rang the bell. In
the task, a ball was dropped into one of two holes in
the box. Seconds later, the second ball was dropped
and the bell immediately rang. Participants were asked
to identify which of the two balls had caused the bell to
ring. When they did not have knowledge of which of
the two mechanisms was in the box, participants of all
ages attributed causality to the contiguous event. That
is, they selected the ball that was dropped immediately
before the bell rang. Even after they were informed of
which mechanism was in the box, five- and
seven-year-olds continued to select the contiguous event,
regardless of mechanism. In contrast, when appropriate,
the nine-year-olds and adults ignored contiguity and
made decisions based on the properties of the
mechanism. This research makes clear both the profound
influence that prior knowledge and experience have on
causal and scientific reasoning and the great strides in
treatment of data made by children in the early
elementary school years.

Even beyond the early elementary school years,
individuals’ beliefs influence the ways in which they
design experiments and the hypotheses that they test.
Schauble (1996) asked fifth and sixth graders and
noncollege adults to design experiments to identify the
variables that affect the speed of a boat down a canal
and the extension of a spring into water when a weight
is attached. Both adults and children entered the
experiment with beliefs about the objects and the
relations about which they were to reason. In general,
adults’ beliefs were more appropriate than children’s
beliefs. For example, only 30 percent of the adults
compared with 80 percent of the children expressed
the belief that ‘big things weigh more than small
things.’ A priori beliefs about causal variables
influenced the experiments that both the children and
the adults designed. Specifically, adults used their
experimentation trials to understand the variables for
which they did not hold prior beliefs. In contrast,
children used their experiments to confirm the beliefs
they held prior to investigation. This difference was
partially responsible for the overall higher
performance of adults in identifying the causal variables.

Schauble’s (1996) research also makes clear that
children and adults differ in the systematicity with
which they approach the experimental space. First,
although the children and the adults conducted the
same total number of experiments, the children often
inadvertently duplicated their experiments (even
though they were provided with data cards to record
their experimental manipulations). Second, within an
experiment, the children were less systematic in the
conduct of trials. They were less likely to control
variables across two trials and often changed two
variables at once in their attempts to test causal
hypotheses. Because they based their causal inferences
on confounded tests, children showed lower levels of
performance relative to adults. These patterns indicate
that in addition to domain-specific knowledge (as
assessed by prior beliefs), domain-general
experimentation strategies influence children’s abilities to design
experiments and to determine causal mechanisms.

Scientists are required not only to design
experiments to test hypotheses, but are also required to draw
conclusions on the basis of data obtained by others. In
drawing conclusions, scientists must attend to a
number of features, including the presence or absence
of co-variation, the availability of a plausible causal
mechanism, the size of the sample, and the sampling
method used. Koslowski et al. (1989) presented sixth
and ninth graders and college students with a series of
story problems, each of which described the way
evidence was gathered (direct intervention or
correlation), the sample size (large or small), the causal
mechanism (present or absent), and the results
(co-variation or no co-variation). Participants were then
asked to judge the extent to which they believed the
proposed cause was responsible for an observed effect.
Whereas all participants were sensitive to the
co-variation of cause and effect, developmental changes
were apparent in sensitivity to each of the other types
of information. Sixth graders continued to give high
ratings for the proposed cause when co-variation was
present even when there was no causal mechanism
provided and when a small sample size was used.
Ninth graders provided high ratings for their
confidence in the proposed cause with either a small or a
large sample size, but only when a causal mechanism
was present. College students showed even more
refined scientific reasoning. They were less confident in
the proposed cause when a causal mechanism was
absent (even when co-variation was present) and when
a small sample size was used. These findings suggest a
developmental progression in scientific reasoning in
which co-variation is primary, followed by a sensitivity
to the presence of causal mechanism, with a later
developing sensitivity to sample size. Notice that
absent from the list of features to which college
students were sensitive is the nature of the evidence:
Koslowski et al.’s participants did not discriminate
direct intervention studies from correlational
approaches. Thus although, with development,
participants evidenced greater awareness of the types of
information that scientists use in their evaluation of a
potential cause and effect relation, the additional
distinction of sampling method is needed to truly
‘think like a scientist.’

4. Mechanisms of Cognitive Change

From infancy through adolescence, children’s
understanding of causality undergoes tremendous
refinements. Developments in several cognitive domains
support these advances in causal reasoning. The major
abilities include analogical reasoning, attention,
working memory, selection and use of strategies, and
domain-specific knowledge. Developments in each
domain are described in turn.

Analogical reasoning refers to the process by which
knowledge is transferred from one domain to another.
It is appropriate when the domains share fundamental
or ‘deep structure’ similarities. Analogical reasoning
plays an important role in causal and scientific
thinking because it allows for extension of knowledge
from a well-known or better known situation to
another less well-known domain. Children as young as
16 to 20 months of age show evidence of generalization
of knowledge from one domain to another (Bauer and
Dow 1994). Goswami and Brown (1989) demonstrated
that preschoolers can use their understanding of causal
relations to complete analogies wherein the higher
order relation between the elements is the causal
mechanism (e.g., chocolate is to melted chocolate as a
snowman is to a melted snowman). Analogical
reasoning also permits transfer of strategies from one task to
another. For example, Chen and Klahr (1999) found
that fourth graders transferred the strategy of control
of variables even after a seven-month period. With
development there are changes in the efficiency and
proficiency with which analogical transfer is employed
(see DeLoache et al. 1998 for a review).

Attention also shows tremendous developments
from infancy to adolescence. There are documented
developmental differences in the ability to sustain
attention on a task, to switch attentional focus between
tasks, and within a task, to focus on relevant versus
irrelevant features (e.g., Ruff and Lawson 1990). In
research on causal reasoning, the tasks presented to
young children typically require focus on a limited
number of features. The absence of irrelevant features
makes it easier for young children to focus on those
that are related to the causal structure. As tasks and
problems become more complicated and more
variables and features are involved, it is necessary to
differentiate what is relevant from what is irrelevant,
and to ignore ‘distractions.’ A comparison of Willatts’
(1984) research on means–ends problem solving by
nine-month-olds with Schlottman’s (1999) research on
the use of contiguity to identify causal mechanism by
five- to nine-year-olds makes clear that the older
children are required to attend to many more features
to produce or identify the cause–effect relation. In
addition, with development, people become better
able to attend selectively to the variables at hand.
Selective attention may promote the control of
variables when designing experiments and assist children
in ignoring deceptive surface-feature similarities (Chen
and Siegler 2000).

Age-related changes in working memory also play a
large part in the development of causal reasoning
abilities. Working memory refers to the ability to hold
information in mind and simultaneously process it. A
major model of working memory proposes that it is
comprised of short-term information stores or
‘buffers,’ access to and use of which is controlled by a
central executive function that also retrieves
information from long-term memory and controls action
planning and goal-directed behaviors (Baddeley 1986).
Over the course of development, there are age-related
changes in each of these functions. Whether
developmental changes are the product of increases in
available memory capacity (Halford et al. 1994) or of
increasingly efficient use of available capacity (Case
1992) is under debate. Nevertheless, what is clear is
that with development, there are increases in the
length of time over which information can be held or
maintained and the facility and speed with which it can
be processed (see Gathercole 1998 for a review). For
example, absent sustained, focused attention or
rehearsal, auditory information remains persistent in
short-term memory for 12 seconds in adults, 10
seconds in 10–12 year olds, and for only eight seconds
in six–seven-year-olds (Keller and Cowan 1994). As
reviewed in Swanson (1996), there also are age-related
increases in working memory span, or the number of
items that can be recalled in the context of a concurrent
processing task. As demonstrated in Schauble’s (1996)
research, each of these elements plays a supportive
role in causal reasoning. That is, compared with
adults, children were more likely to lose sight of their
plans for experimentation and were less systematic in
their execution of plans. In each case, the presumed
source of lower performance was difficulty
remembering what already had been tested and what
had been found.

Also implicated in developments in causal reasoning
are changes in the selection and use of
problem-solving strategies. For example, as demonstrated in
Schauble (1996), relative to children, adults are more
likely to use a control of variables strategy when
testing hypotheses. Developmental changes in the
selection and use of more and less sophisticated and
effective strategies can be characterized as a series of
overlapping ‘waves’ of approaches or solutions to
problems (Siegler 1996). Multiple strategies exist in the
repertoire simultaneously. They compete with one
another both over time and within a given problem
resulting in waves of use as some strategies gradually
decline and the frequency of others gradually
increases. The implication is that strategies such as
control of variables likely exist in the repertoires of
both children and adults, alongside less advanced
strategies of experimentation. Consistent with this
suggestion, it has been shown that second graders can
be trained to use a control of variables strategy (Chen
and Klahr 1999). Nevertheless, the presence of the
strategy in the child’s repertoire does not guarantee
that it will be used when appropriate or that it will be
executed successfully when it is deployed.

Finally, analogical reasoning, attention, working
memory, and selection and use of strategies all are
influenced by the individual’s familiarity with the
domain. For example, Chi (1977) documented higher
levels of reasoning by children in domains in which
they had expertise (e.g., chess) than in domains in
which they were relative novices. Goswami and Brown
(1989) found that analogical reasoning was facilitated
by knowledge of the causal mechanisms involved in
the transformations that children were asked to judge.
Similarly, Schauble (1996) found that both children’s
and adults’ beliefs about causality influenced the
hypotheses that they tested. Thus, the influence of
domain knowledge is consistent throughout the life
span. The facilitative effects of knowledge also extend
to the component processes that support problem
solving. Specifically, working memory span is greater
for familiar words than for nonwords (Hulme et al.
1991). In addition, greater experience within a domain
allows participants to guide their attention to the
relevant attributes of the problem to be solved, and to
disregard those features that are irrelevant to the
solution (Goswami and Brown 1989).

5. Neurological Changes Related to Cognitive Development

Although the precise linkages are not clear, it is widely
assumed that cognitive developmental changes in
processes such as attention and working memory are
related to developments in the neural substrates that
subserve them. The prefrontal cortex supports
planning and working memory and undergoes a protracted
course of development (see Nelson et al. 2000 for
review). For example, although adult levels of
metabolic activity are approximated in the first year of life,
development continues into adolescence (Chugani
1994), and frontal pruning of synapses (Huttenlocher
1994) and myelination (Jernigan et al. 1991) continue
into adolescence. Thatcher (1992) demonstrated that
the coherence of EEG patterns between frontal and
posterior lobes increases throughout middle
childhood. Case (1992) has shown that the patterns of
coherence in EEG mirror the patterns in the
development of attention over the same time period (see
Case 1998, for a review).

Other neurological developments include brainwide
changes such as myelination that results in faster
transmission of impulses between neurons. There also
are age-related increases in white matter density that
are consistent with greater myelination of fiber tracts
throughout adolescence (Paus et al. 1999). Increased
speed of processing benefits working memory and
attention.

6. Summary

The refinements that occur through brain
development produce changes in the cognitive processes that
support causal understanding. With development,
children accrue a greater knowledge base, process
information more rapidly, attain greater attentional and
strategic resources, and improve in the efficiency with
which they utilize resources. These component
processes support a range of behaviors that are united by
the appreciation of causal mechanisms, from
behaviors as basic as infants’ perception of causality to
adolescents’ abilities to design and evaluate
experiments to test causal relations. Although we observe
qualitative differences in these behaviors through the
course of development, what underlies these advances
are continuous changes in the brain and the resulting
cognitive component processes that develop to
support them.

Bibliography

Baddeley A D 1986 Working Memory. Oxford University Press, Oxford, UK
Bauer P J, Dow G A A 1994 Episodic memory in 16- and 20-month-old children: Specifics are generalized, but not forgotten. Developmental Psychology 30: 403–17
Bauer P J, Schwade J A, Wewerka S S, Delaney K 1999 Planning ahead: Goal-directed problem solving by 2-year-olds. Developmental Psychology 35: 1321–37
Bullock M 1984 Preschool children’s understanding of causal connections. British Journal of Developmental Psychology 2: 139–48
Case R 1992 The role of the frontal lobes in the regulation of cognitive development. Brain and Cognition 20: 51–73
Case R 1998 The development of conceptual structures. In: Kuhn D, Siegler R S, Damon W (eds.) Handbook of Child Psychology, 5th edn. John Wiley, New York, Vol. 2, pp. 745–800
Chen Z, Klahr D 1999 All other things being equal: Acquisition and transfer of the control of variables strategy. Child Development 70: 1098–120
Chen Z, Siegler R S 2000 Across the great divide: Bridging the gap between understanding of toddlers’ and older children’s thinking. Monographs of the Society for Research in Child Development, Vol. 65.
Chi M T H 1977 Age differences in memory span. Journal of Experimental Child Psychology 23: 266–81
Chugani H T 1994 Development of regional brain glucose metabolism in relation to behavior and plasticity. In: Dawson G, Fischer K (eds.) Human Behavior and the Developing Brain. Guilford Press, New York, pp. 153–75
DeLoache J S, Miller K F, Pierroutsakos S L 1998 Reasoning and problem solving. In: Kuhn D, Siegler R S, Damon W (eds.) Handbook of Child Psychology, 5th edn. Wiley, New York, Vol. 2, pp. 801–50
Gathercole S E 1998 The development of memory. Journal of Child Psychology and Psychiatry 39: 3–27
Gelman R, Bullock M, Meck E 1980 Preschoolers’ understanding of simple object transformations. Child Development 51: 691–9
Goswami U, Brown A L 1989 Melting chocolate and melting snowmen: Analogical reasoning and causal relations. Cognition 35: 69–95
Halford G S, Maybery M T, O’Hare A W, Grant P 1994 The development of memory and processing capacity. Child Development 65: 1338–56
Hulme C, Maughan S, Brown G D A 1991 Memory for familiar and unfamiliar words: Evidence for a long-term memory contribution to short-term memory span. Journal of Memory and Language 30: 685–701
Huttenlocher P R 1994 Synaptogenesis, synapse elimination, and neural plasticity in human cerebral cortex. In: Nelson C A (ed.) Minnesota Symposium on Child Psychology: Vol. 27—Threats to Optimal Development: Integrating Biological, Psychological, and Social Risk Factors. Erlbaum, Hillsdale, NJ, pp. 35–54
Inhelder B, Piaget J 1958 The Growth of Logical Thinking From Childhood to Adolescence. Basic Books, New York
Inhelder B, Piaget J 1964 The Early Growth of Logic in the Child. Harper & Row, New York
Jernigan T L, Trauner D A, Hesselink J R, Tallal P A 1991 Maturation of human cerebrum observed in vivo during adolescence. Brain 114: 2037–49
Kalish C W 1998 Young children’s predictions of illness: Failure to recognize probabilistic causation. Developmental Psychology 34: 1046–58
Keller T A, Cowan N 1994 Developmental increase in the duration of memory for tone pitch. Developmental Psychology 30: 855–63
Koslowski B, Okagaki L, Lorenz C, Umbach D 1989 When covariation is not enough: The role of causal mechanism, sampling method, and sample size in causal reasoning. Child Development 60: 1316–27
Leslie A M 1984 Spatiotemporal continuity and the perception of causality in infants. Perception 13: 287–305
Mackie J L 1980 The Cement of the Universe: A Study of Causation. Clarendon Press, Oxford, UK
Nelson C A, Monk C S, Lin J, Carver J L, Thomas K M, Truwit C L 2000 Functional neuroanatomy of spatial working memory in children. Developmental Psychology 36: 109–16
Oakes L M, Cohen L B 1990 Infant perception of a causal event. Cognitive Development 5: 193–207
Paus T, Zijdenbos A, Worsley K, Collins D L, Blumenthal J, Giedd J N, Rapoport J L, Evans A C 1999 Structural maturation of neural pathways in children and adolescents: in vivo study. Science 283: 1908–11
Ruff H A, Lawson K R 1990 Development of sustained, focused attention in young children during free play. Developmental Psychology 26: 85–93
Schauble L 1996 The development of scientific reasoning in knowledge-rich contexts. Developmental Psychology 32: 102–19
Schlottman A 1999 Seeing it happen and knowing how it works: How children understand the relation between perceptual causality and underlying mechanism. Developmental Psychology 35: 303–17
Siegler R S 1996 Emerging Minds: The Process of Change in Children’s Thinking. Oxford University Press, New York
Swanson H L 1996 Individual and age-related differences in children’s working memory. Memory and Cognition 24: 70–82
Thatcher R W 1992 Cyclic cortical reorganization during early childhood. Brain and Cognition 20: 24–50
Willatts P 1984 The Stage-IV infants’ solution of problems requiring the use of supports. Infant Behavior and Development 7: 125–34
Willatts P 1990 Development of problem-solving strategies in infancy. In: Bjorklund D F (ed.) Children and Strategies: Contemporary Views of Cognitive Development. Erlbaum, Hillsdale, NJ, pp. 23–66

P. J. Bauer and M. M. Burch

Cognitive Development in Infancy: Neural Mechanisms

1. Background

Until the past decade the study of cognitive
development in human infants has been conducted
relatively independently of any consideration of the brain.
This relative neglect of biological factors in the study
of behavioral development is surprising since the
origins of developmental psychology can be traced to
biologists. Darwin was one of the first to take a
scientific approach to human behavioral development,
and to speculate on the relations between phylogenetic
and ontogenetic change. Piaget, who was originally
trained as a biologist, used then-current theories of
embryological development to generate his accounts
of human cognitive development. McGraw and Gesell
tried to integrate brain development with what was
known of behavioral development. While they focused
on motor development, they also extended their
conclusions to mental and social development (Gesell
1928, McGraw 1943). While both these authors
developed sophisticated informal theories that
attempted to capture non-linear and dynamic
approaches to development, their efforts to relate brain
development to behavioral change remained very
speculative due to the paucity of knowledge at the
time.

From the 1960s to the late 1980s biological
approaches to human behavioral development were
neglected for a variety of reasons, including the widely
held belief among cognitive psychologists during that
period that the ‘software’ of the mind is best studied
without reference to the ‘hardware’ of the brain.
However, the recent expansion of knowledge at the
end of the twentieth century on brain development
makes the task of relating it to behavioral changes
considerably more viable than previously. In addition,
new molecular and cellular methods, along with
theories based on artificial neural networks, have led
to great advances in our understanding of how primate
brains are constructed during ontogeny. These
advances, along with those in functional neuroimaging,
have led to the recent emergence of the
interdisciplinary science of developmental cognitive
neuroscience (see Johnson 1997).

What benefits can accrue from taking a
developmental cognitive neuroscience approach to infants?
First, considering evidence from brain development
may help constrain, or even change, the type of
cognitive theories that we consider. Second, being able
to relate brain to cognitive development will
potentially allow a more complete explanation not only
of normal development, but also of developmental
disorders resulting from genetic abnormality, and the
long-term effects of early brain damage.

2. Human Postnatal Brain Development

A number of lines of evidence indicate that there are
substantive changes during postnatal development of
the human brain. Perhaps most obviously, the volume
of the brain quadruples between birth and adulthood.
This increase comes from a number of sources such as
more extensive fiber bundles, and nerve fibers
becoming myelinated. In addition, there is a dramatic
increase in size and complexity of the dendritic tree of
many neurons. Less apparent with standard
microscopy, but evident with electron microscopy, is a
corresponding increase in the density of synapses.
Huttenlocher (1990) and colleagues have reported a
steady increase in the density of synapses in several
regions of the human cerebral cortex. For example, in
parts of the visual cortex, the generation of synapses
(synaptogenesis) begins around the time of birth and
reaches a peak at around 150 percent of adult levels
toward the end of the first year. In the frontal cortex
(the anterior portion of cortex, considered by most
investigators to be critical for many higher cognitive
abilities), the peak of synaptic density occurs later, at
around 24 months of age (see Goldman-Rakic et al.
1997). Although there is variation in the timetable, in
all regions of cortex studied so far, synaptogenesis
begins around the time of birth and increases to a peak
level well above that observed in adults.

Somewhat surprisingly, regressive events are
commonly observed during the development of nerve cells
and their connections in the brain. Due to the paucity
of human data, the regional timetable for this decrease
is still unclear and there is controversy about whether
or not it shows differences between regions.
Nevertheless, in humans, most neocortical regions and
pathways appear to undergo this ‘rise and fall’ in
synaptic density, with the density stabilizing to adult
levels at different ages during later childhood. The
postnatal rise-and-fall developmental sequence can
also be seen in other measures of brain physiology and
anatomy. For example, using PET, Chugani et al.
(1987) observed an adult-like distribution of resting
brain activity within and across brain regions by the
end of the first year. However, the overall level of
glucose uptake reaches a peak during early childhood,
which is much higher than that observed in adults. The
rates return to adult levels after about 9 years of age
for some cortical regions. The extent to which these
changes relate to those in synaptic density is currently
the topic of further investigation.

A controversial issue in developmental neuroscience
concerns the extent to which the differentiation of the
cerebral neocortex into areas or regions with particular
cognitive, perceptual, or motor functions can be
shaped by postnatal interactions with the external
world. This issue reflects the debate in cognitive
development about whether infants are born with
domain-specific ‘modules’ for particular cognitive
functions such as language, or whether the formation
of such modules is an activity-dependent process (see
Elman et al. 1996, Karmiloff-Smith 1992). Since
around 1900 neuropsychology has taught us that the
majority of normal adults tend to have similar
functions within approximately the same regions of cortex.
However, we cannot necessarily infer from this that
this pattern of differentiation is intrinsically
prespecified (the product of genetic and molecular
interactions), because most humans share very similar
pre- and postnatal environments. In developmental
neurobiology this issue has emerged as a debate about
the relative importance of neural activity for cortical
differentiation, as opposed to intrinsic molecular and
genetic specification of cortical areas. Supporting the
importance of the latter processes, Rakic (1988)
proposed that the differentiation of the cortex into
areas is due to a protomap. The hypothesized
protomap either involves prespecification of the tissue that
gives rise to the cortex during prenatal life or the
presence of intrinsic molecular markers specific to
particular areas of cortex. An alternative viewpoint,
advanced by O’Leary (1989) among others, is that
genetic and molecular factors build an initially
undifferentiated ‘protocortex,’ and that this is then
subsequently divided into specialized areas as a result of
neural activity. This activity within neural circuits
need not necessarily be the result of input from the
external world, but may result from intrinsic,
spontaneous patterns of firing within sensory organs or
subcortical structures that feed into the cortex, or from
activity within the cortex itself (e.g., Katz and Shatz
1996).

Although the neurobiological evidence is complex,
and probably differs between species and regions of the
cortex, overall it tends to support the importance of
neural activity-dependent processes (see Johnson 1997
for a review). With several notable exceptions, it seems
likely that activity-dependent processes contribute to
the differentiation of functional areas of the cortex,
especially those involved in higher cognitive functions
in humans. During prenatal life, this neural activity
may be largely a spontaneous intrinsic process, while
in postnatal life it is likely also to be influenced by
sensory and motor experience.

3. Methods for Studying Human Postnatal Brain Development

Part of the reason for the recently renewed interest in
relating brain development to cognitive change comes
from advances in methodology which allow
hypotheses to be generated and tested more readily than
previously (see also Nelson and Bloom 1997). One set
of tools relates to brain imaging. Some of these
imaging methods, such as positron emission
tomography (PET), are of limited utility for studying
transitions in behavioral development in normal
infants and children due to their invasive nature and
their relatively coarse temporal resolution. However,
two other methods may prove more useful.

Since the 1960s, scalp-recorded event-related
potentials have been used to assess brain function in
infants and children. These
recordings can either be of the spontaneous natural
rhythms of the brain (EEG), or the electrical activity
evoked by the presentation of a stimulus (ERP).
Recent developments of the ERP method allow the
relatively quick and easy installation of large numbers
of sensors, thus making the method easier to use and
also improving spatial resolution. Functional MRI
allows the noninvasive measurement of cerebral blood
flow with fine spatial resolution and temporal
resolution on the order of seconds. Although this
technique has been applied to children (Casey et al. 1997),
the distracting noise and vibration, and the presently
unknown possible effects of high magnetic fields on
the developing brain, make its usefulness for healthy
children under 4 or 5 years of age unclear. However,
there has been at least one MRI study of infants
initially scanned for clinical reasons (Tzourio et al.
1992), and the advent of ‘open’ scanners in which the
mother can hold the infant may increase possibilities
further.

Apart from brain imaging, the neural basis of
cognitive development in infants can be examined by
administering behavioral ‘marker tasks’ to infants
who have suffered perinatal brain damage or
developmental disorders of genetic origin. These marker
tasks are adapted from tasks previously linked to a
brain region or pathway in adult primates and humans
by cognitive neuroscience studies. By testing infants or
children with versions of such a task at different ages,

of the relevant regions of the brain. Finally, there is a
continuing need for the neuroanatomical study of
postmortem tissue. For a variety of reasons such
studies are difficult to conduct.

4. Relating Brain to Cognitive Development

A number of different approaches have been taken to
relating brain to cognitive development. These
different approaches depend on very different sets of
assumptions about development.

4.1 Maturational Models

The most common approach to developmental
cognitive neuroscience is based on a maturational
framework, in which it is assumed that as particular brain
regions mature they allow or enable new cognitive
functions to come on line. By this view postnatal brain
development is assumed to be heavily governed by
genetic and molecular factors, and relatively (though
not completely) independent of experience. In brief,
postnatal brain development is seen as a necessary, but
not sufficient, cause of change in cognitive abilities.
Two areas in which this approach has been applied
concern the transition from subcortical to cortical
control over visually guided behavior, and the later
onset of frontal and prefrontal cortex control.

In one of the first specific attempts to relate changes
in behavior to brain development in infants, Bronson
(1974) presented evidence that the subcortical
retinocollicular visual pathway primarily controls visually
guided action in the newborn human infant. He also
showed that it is only by around 3 months of age that
visually guided behavior switches to cortical
pathways. More recent research indicates that there is
probably some, albeit limited, cortical activity in
newborns, and that the onset of cortical control over
behavior is a gradual, rather than all-or-none,
transition. Johnson (1990) updated Bronson’s thesis to
incorporate several different cortical pathways now
known to underlie visually guided action in adult
primates. The logic underlying this model was that
changes in visually guided behavior of infants over the
first months of life could be attributed to the graded
onset of each of several different cortical pathways.
Further, which pathways were active could be
predicted from the developmental neuroanatomy of the
primary visual cortex at that age, since this structure
was the gateway to most of these pathways. While this
model had reasonable success in accounting for the
sequence of changes in behavior observed, in the past few
years studies involving ERPs, and studies of infants
with focal cortical damage, show that frontal cortical
regions are active earlier than more posterior regions,
a sequence not predicted by the original Johnson
(1990) model.
the researcher can use the success or otherwise of Another prominent maturational model has con-
individuals as indicating the functional development cerned the onset of prefrontal cortex functioning. In

Cognitive Development in Infancy: Neural Mechanisms

terms of structural neuroanatomy, this part of the contacts. These initial connections are labile, but either
cortex shows the most prolonged development of any stabilize or regress, depending on the activity in the
region of the human brain, with changes in synaptic postsynaptic cell, a process referred to as ‘selective
density detectable even into the teenage years (Hutten- stabilization.’ Changeux and Dehaene (1989) sugges-
locher 1990). Diamond (1991) has argued that the ted that this model could be used to bridge from the
maturation of prefrontal cortex during the period brain to cognitive and behavioral levels, and that the
6–12 months accounts for a number of transitions same process of ‘Darwinian’ change could occur.
observed in the behavior of infants in object per- Perhaps the best example of this type of change at the
manence and object retrieval tasks. In one such task behavioral level comes from the work on phonemic
infants younger than 8 months often fail to accurately discrimination in infants, showing that while they can
retrieve a hidden object after a short delay period if the initially discriminate a very large range of phonetic
object’s location is changed from one where it was boundaries used in speech, including those not found
previously successfully retrieved. The basis for Dia- in their native language, this ability becomes restricted
mond’s claims comes from the observations that (a) to those boundaries important for their native lan-
monkeys with lesions to the dorsolateral prefrontal guage around 12 months of age.
cortex (DLPC) show the same patterns of impairment Selectionist models have recently been criticized for
as young human and monkey infants, and (b) there are the assumption that the initial stage of overproduction
neurochemical and neuroanatomical changes in the is not sensitive to experience (Quartz and Sejnowski
human DLPC at around the age they begin to perform 1997), and for focusing too heavily on only one aspect
successfully. Diamond (1991) has speculated that the of neural development (Purves 1994). It is also
DLPC is critical for performance when (a) information important to remember that neuroanatomical
has to be retained or related over time or space, and (b) measures of synaptic density are a static measure of
a prepotent response has to be inhibited. She argues dynamic processes. Since there is constant turnover of
that prior to the maturation of the DLPC, infants do synapses, it is unlikely that there are clearly distinct
not successfully perform tasks that require both of phases of growth and pruning. Rather, both stages are
these abilities. likely to be simultaneously occurring within cortical
Further evidence linking success in the object regions.
permanence task to frontal cortex maturation in the
human infant comes from two sources. The first of
these is a series of EEG studies with normal human
4.3 Activity-dependent Models
infants (e.g., Bell and Fox 1992), in which increases in
frontal EEG responses correlate with the ability to A number of factors suggest that the field needs to
respond successfully over longer delays in delayed move beyond the maturational framework. First,
response tasks. The second source is work on cognitive increasing evidence from developmental neuroscience
deficits in children with a neurochemical deficit in the suggests that neuronal activity itself plays a vital role
prefrontal cortex resulting from Phenylketonuria in prenatal brain development, and it would seem
(PKU). Even when treated, this inborn error of reasonable to suggest that the same processes may
metabolism can have the specific consequence of extend into postnatal life. Second, there is evidence
reducing the levels of a neurotransmitter, dopamine, from neuroimaging and the study of infants with focal
in the dorsolateral prefrontal cortex. These reductions brain damage to suggest that there are dynamic
in dopamine levels in the dorsolateral prefrontal cortex changes in the timing and pattern of cortical activation
result in these infants and children being impaired on in infants relative to adults. These dynamic changes
tasks thought to involve parts of the prefrontal cortex, take a number of forms, including changes in the
such as the object permanence task and an object overall spatial extent in cortex-activated ones (loca-
retrieval task, and being relatively normal in tasks lization), changes in the extent to which the activation
thought to depend on other regions of the cortex of a cortical region is stimulus-specific (specialization),
(Diamond et al. 1997, Welsh et al. 1990). and changes in the temporal stage of cortical pro-
cessing at which specialization can be observed (see
Johnson 2000).
Event-related potential experiments with infants
4.2 Selectionist Models
have indicated that both for word learning (Neville
As discussed earlier, during the postnatal 1991) and face processing (de Haan et al. submitted)
development of the cortex there is a rise and fall in there is increasing localization of processing with
synaptic density. This observation has led to a age/experience of a stimulus class. That is, more
number of ‘selectionist’ theories being advanced in widespread scalp leads show a difference between the
which the essential notion is that there is experience words or faces and control stimuli in younger infants
related sculpting of neural connectivity. For example, than in older ones. In the example of face processing,
Changeux (1985) proposes that molecular and genetic both the left and the right ventral visual pathways are
processes specify the initial overproduction of synaptic differentially activated by faces in early infancy, but in


many (but not all) adults this further localizes only to de Haan M, Pascalis O, Johnson M H (submitted) Spatial and
the right ventral pathway (de Haan et al. 1998). In the temporal characteristics of cortical activation in adults and
example of word recognition, differences are initially infants viewing faces
found over widespread cortical areas, but narrow to Diamond A 1991 Neuropsychological insights into the meaning
of object concept development. In: Carey S, Gelman R (eds.)
left temporal leads after further experience with this The Epigenesis of Mind: Essays on Biology and Cognition.
class of stimulus (Neville 1991). Johnson (2000) Erlbaum, Hillsdale, NJ, pp. 67–110
presented an ‘interactive specialization’ framework Diamond A, Hurwitz W, Lee E Y, Bockes T, Grover W,
within which changes in localization are a direct Minarcik C 1997 Cognitive deficits on frontal cortex tasks in
consequence of increases in specialization within and children with early-treated PKU: Results of two years of
between cortical pathways. He suggests that this longitudinal study. Monographs of the Society for Research in
framework also provides a way of thinking about the Child Development, Monographs No. 252: 1–207
fact that the same behavior can be mediated by Elman J L, Bates E, Johnson M H, Karmiloff-Smith A, Parisi D,
different patterns of cortical activation in infants from Plunkett K 1996 Rethinking Innateness: A Connectionist
Perspective on Development. MIT Press, Cambridge, MA
those observed in adults.
Gesell A L 1928 Infancy and Human Growth. Macmillan, New
Activity-dependent models can also be extended York
more broadly to the view that, at a given stage in Goldman-Rakic P S, Bourgeois J, Rakic P 1997 Synaptic
postnatal development, the human infant may actually substrate of cognitive development: Life-span analysis of
seek out the sensory input it needs to enable the further synaptogenesis in the prefrontal cortex of the nonhuman
specialization of its own brain. In other words, the primate. In: Krasnegor N A, Reid Lyon G, Goldman-Rakic
infant is not a passive absorber of experience, but P S (eds.) Development of the Prefrontal Cortex. Evolution,
rather an active and selective seeker of it. Thus, the Neurobiology and Behavior. Paul H Brookes, Baltimore, MD,
infant changes its ‘effective environment’ during de- pp. 27–48
velopment. One example of this comes from the Huttenlocher P R 1990 Morphometric study of human cerebral
cortex development. Neuropsychologia 28: 517–27
development of face processing, where it has been
Johnson M H 1990 Cortical maturation and the development of
argued that primitive tendencies for the newborn to visual attention in early infancy. Journal of Cognitive Neuro-
orient to face-like stimuli ensure that developing science 2: 81–95
cortical circuitry is preferentially exposed to that class Johnson M H 1997 Developmental Cognitive Neuroscience: An
of stimulus (see Johnson 1997). Introduction. Blackwell, Oxford, UK
Johnson M H, de Haan M 2001 Developing cortical spec-
See also: Brain Development, Ontogenetic Neuro- ialization for visual-cognitive function: The case of face
biology of; Functional Brain Imaging; Infant Devel- recognition. In: McClelland J L, Siegler R S (eds.) Mech-
anisms of Cognitive Development: Behavioral & Neural Pers-
opment: Physical and Social Cognition; Neural Plas- pectives. Lawrence Erlbaum Associates, Mahwah, NJ
ticity; Prefrontal Cortex; Prefrontal Cortex Devel- Johnson M H 2000 Functional brain development in infants:
opment and Development of Cognitive Function; elements of an interactive specialization framework. Child
Prenatal and Infant Development: Overview; Sensitive Development 71(1): 75–81
Periods in Development, Neural Basis of Karmiloff-Smith A 1992 Beyond Modularity: A Developmental
Perspective on Cognitive Science. MIT Press/Bradford Books,
Cambridge, MA
Katz L C, Shatz C J 1996 Synaptic activity and the construction
Bibliography of cortical circuits. Science 274: 1133–8
McGraw M B 1943 The Neuromuscular Maturation of the
Bell M A, Fox N A 1992 The relations between frontal brain Human Infant. Columbia University Press, New York
electrical activity and cognitive development during infancy. Nelson C A, Bloom F E 1997 Child Development and Neuro-
Child Development 63: 1142–63 science. Child Development 68: 970–87
Bronson G 1974 The postnatal growth of visual capacity. Neville H J 1991 Neurobiology of cognitive and language
Child Development 45: 873–90 processing: Effects of early experience. In: Gibson K R,
Casey B J, Trainor R J, Orendi J L, Schubert A B, Nystrom L E, Petersen A C (eds.) Brain Maturation and Cognitive Devel-
Giedd J N, Xavier Castellanos J L et al. 1997 A developmental opment: Comparative and cross-cultural Perspectives. Aldine
functional MRI study of prefrontal activation during per- de Gruyter Press, Hawthorne, NY, pp. 355–80
formance of a go-no-go task. Journal of Cognitive Neuro- O’Leary D D 1989 Do cortical areas emerge from a protocor-
science 9: 835–47 tex? Trends in Neuroscience 12: 400–6
Changeux J P 1985 Neuronal Man: The Biology of Mind. Purves D 1994 Neural Activity and the Growth of the Brain.
Pantheon Books, New York Academia Nazionale Dei Lincei, Cambridge University Press,
Changeux J P, Dehaene S 1989 Neuronal models of cognitive Cambridge, UK
functions. Cognition 33: 63–109 Quartz S R, Sejnowski T J 1997 A neural basis of cognitive
Chugani H T, Phelps M E, Mazziotta J C 1987 Positron development: A constructivist manifesto. Behavioural and
emission tomography study of human brain functional de- Brain Sciences 20: 537–96
velopment. Annals of Neurology 22: 487–97 Rakic P 1988 Specification of cerebral cortical areas. Science
de Haan M, Oliver A, Johnson M H 1998 Electrophysiological 241: 170–6
correlates of face processing by adults and 6-month-old Tzourio N, de Schonen S, Mazoyer B, Bore A, Pietrzyk U, Bruck
infants. Journal of Cognitive Neuroscience 36 Supp. S B, Aujard Y, Deruelle C 1992 Regional cerebral blood flow in


two-month old alert infants. Society for Neuroscience Ab- In such a general conception, learning processes and
stracts 18: 1121 their external conditions take a subordinate role.
Welsh M C, Pennington B F, Ozonoff S, Rouse B, McCabe Instructions may not only promote natural devel-
E R 1990 Neuropsychology of early-treated phenylketo-
opment but also disrupt or impair it, and although
nuria: Specific executive function deficits. Child Development
61: 1697–713 cognitive development is also a condition and outcome
of education, it is, nonetheless, a particularly im-
M. H. Johnson portant goal when the interest is in maintaining the
dynamics and continuity of the developmental process
for as long as possible in order to reach the highest
possible level of cognitive development (Kohlberg and
Mayer 1972).
Cognitive Development: Learning and At the beginning of the twenty-first century, many
Instruction scientists are skeptical about such purist theories of
cognitive development. This applies not only to the
Scientific conceptualizations of the relations between role of learning and of external learning opportunities
cognitive development and learning on the one side, but also to the impact of instruction and schools on
and learning and instruction on the other side, have cognitive development. For example, Geary (1995)
always been controversial in psychology and continue discriminates between primary and secondary bio-
to be so. Theoretical standpoints depend on whether logical abilities. He views primary abilities as innate
they take a universal or differential perspective, mental dispositions that enable even infants to learn
whether they are dominated by evolutionary-genetic from experience under almost any sociocultural condi-
or environment-oriented approaches, and whether tions through the assistance of a strong intrinsic
they are biased ideologically toward an optimistic or motivation. This is how they acquire domain-specific
pessimistic view on education. For these reasons, this competencies (e.g., a mother tongue; elementary nu-
entry will (a) present the universal relations between merical skills; physical, biological, psychological, and
cognitive development, learning, and instruction; (b) social knowledge).
analyze differential aspects of these three concepts; (c) In comparison, secondary skills vary greatly be-
describe theoretical approaches to the relation be- tween cultures, subcultural groups, and cohorts. The
tween cognitive development and learning; and (d) learning processes needed to acquire secondary com-
present theoretical conceptualizations of the relation petencies (scientific knowledge, higher mathematical
between learning and instruction. The final section will skills, metacognitive skills) are cumulative, proceed at
contain some conclusions for educational practice. a relatively slow pace, require individual effort, and
generally call for extrinsic motivation and didactic
support. The level of secondary abilities that may be
1. Cognitive Development, Learning, and achieved collectively, and is actually achieved indi-
Instruction from a Universal Perspective vidually, depends on the state of cultural development
and the availability of schools or equivalent institu-
Inspired by evolutionary ideas in general, and the tions.
epistemological approach of Jean Piaget (1970) in Hence, Geary’s (1995) model assumes that early
particular, most classic theories on childhood cog- cognitive development depends on a large number and
nitive development can be characterized by four basic variety of necessary, autonomous, and automatically
principles. Their theoretical models are: effective learning processes, and that although instruc-
(a) universal (valid for all human beings), tions (models, feedback, reinforcement, indications,
(b) general (valid for all cognitive phenomena), availability of learning opportunities, etc.) take an
(c) structural (valid for all basic changes in the important role, they are not decisive, because the
cognitive systems and functions that internally de- acquisition of primary abilities is universal, genetically
termine the acquisition of knowledge and skills), predetermined, and intrinsically motivated. In con-
(d) naturalistic-descriptive (stating that develop- trast, cognitive development in middle childhood,
mental changes are caused by the species-specific adolescence, and adulthood is influenced far more
nature of human beings and may be influenced by strongly by a person’s cultural, familial, and individual
environmental variables but not produced by them). situation. Instructionally guided and promoted learn-
A typical example is Flavell’s (1970) theoretical ing is decisive for the course, the contents, and the level
assumption ‘that cognitive changes during childhood of cognitive development. However, there are large
have a specific set of formal ‘‘morphogenetic’’ proper- interindividual differences—even under similar cul-
ties, that presumably stem from the biological-matur- tural conditions—in the speed, efficacy, and quality of
ational growth process underlying these changes. learning. Such individual differences in cognitive
Thus, childhood cognitive modifications are largely learning and, thus, in cognitive development, cannot
inevitable, momentous, directional, uniform, and ir- be explained completely as consequences of prior
reversible’ (p. 247). learning.

Cognitive Development: Learning and Instruction

2. Cognitive Development, Learning, and individual differences in cognitive abilities and
Instruction from a Differential Perspective achievements was possible and could be achieved
through intervention. The concept of mastery learning
Each attained state of cognitive development deter- was viewed as the key for this. The idea was to grant
mines learning, and the outcomes of cumulative less gifted and unsuccessful students a temporary
learning processes influence the course of cognitive increase in necessary learning time, while simultan-
development. This general rule is moderated by the eously adapting instruction to match their learning
strength and stability of individual differences in aptitudes as closely as possible. The educational and,
intellectual abilities. to some extent, ideological aspirations invested in this
Ever since Binet and Simon constructed the first model of learning have never paid off. Students with
intelligence test in 1905, psychometrically oriented low abilities and poor prior knowledge differ greatly in
research has assumed that interindividual differences what they can learn during a similar amount of
in abilities are relatively stable from middle childhood learning time. As a result, large differences in the
onward, and that these differences in the speed and amount of learning time are required if equal achieve-
quality of cognitive development permit long-term ments are to be attained in a heterogeneous population
predictions. In a theoretically sophisticated but skep- (Slavin 1987).
tical review of the available empirical findings, The opposing standpoint is well-characterized by
Wohlwill (1980) concluded, ‘A reasonable coherent Herrnstein and Murray’s (1994) book The Bell Curve.
picture of the stability of IQ, over different portions of According to their research in the USA, the interplay
the age span, and some of the variables affecting it, has of differences in genetic endowment and sociocultural
emerged from … research’ (p. 401). How can we conditions leads, at a very early stage, to stable
explain this high stability in individual IQ differences? interindividual differences in intellectual abilities, mot-
The most plausible answer is to assume the existence ivational tendencies, and patterns of social behavior.
of cognitive and social mechanisms with the same The attempts to increase equality through compensa-
effect as the Matthew Principle of ‘For unto everyone tory education in recent decades have led to no long-
that hath shall be given.’ lasting reduction in intellectual differences. As a result,
Despite numerous theoretical and methodological the authors strongly recommend discontinuing inter-
controversies, the available research on twins leads to ventions for disadvantaged children and investing the
the conclusion that one half or slightly more of the available financial resources in educating those stu-
variance in IQ in the industrialized nations is de- dents whose high mental potential means that they will
termined genetically (which naturally also means that be responsible for generating the largest part of the
almost one half of this variance is not determined gross national product in the years to come.
genetically). In contrast to much prejudiced belief, Both radical positions give a very one-sided in-
genetic factors and environmental conditions do not terpretation of the current state of research. They
impact independently on the individual’s cognitive overlook the fact that all children have to learn all
development, but covary to a major degree. As a rule, competencies, and that all students profit from good
biological parents are also the most important actors instruction, while, at the same time, major differences
in the young child’s social environment. Persons in cognitive abilities and performance cannot be
already react differentially in accordance with the leveled out through intensive developmental interven-
infant’s genotype, and children become actively in- tions and academic instruction.
volved in selecting preferential segments of the en-
vironment at an early age. These covariations help to
strengthen and stabilize interindividual differences.
Moreover, more intelligent children profit to a larger
extent from the same learning opportunities and 3. Cognitive Development and Learning: Which
instructional aids than their less intelligent peers. Comes First?
Those who are more successful as a result of a
cumulation of favorable factors develop a higher Psychology has a long tradition of assuming that
degree of self-confidence that encourages them to learning is the only mechanism for explaining cog-
make a greater effort to solve difficult problems. nitive development. With the expert approach, this
Finally, young persons with greater abilities and assumption has even survived the shift from beha-
higher achievements generally get a better education, viorist theories to cognitive models. Indeed, the expert
which leads to greater career opportunities that are approach has even been applied to cognitive devel-
also associated in the long term with more cognitive opment (Carey 1984, Sternberg 1998). For example,
stimulation and challenge. Carey (1984) has postulated: ‘Children differ from
Scientists and educators differ greatly in their adults only in accumulation of knowledge’ (p. 37). As
response to the empirical evidence of stable differences a result: ‘Children know less than adults. Children are
in cognitive development. In the 1960s and 1970s, novices in almost every domain in which adults are
people thought that an egalitarian leveling out of experts’ (p. 64).


Gagné (1968) had adopted a similar position in his psychology and the scientific study of teaching
criticism of the structuralist stage theories of cognitive methods. Some examples of what are, in part, very
development (e.g., Piaget 1970): different conceptions of teaching methods are non-
In an oversimplified way, it may be said that the directive instruction as an aid toward self-generated
stage of intellectual development depends upon what insight in the student (the Socratic method); the
the learner knows already and how much he has yet to adaptation of learning goals, learning conditions, and
learn in order to achieve some particular goals. Stages of learning methods to the mental state of the individual
development are not related to age, except in the sense learner; the application of psychological laws to
that learning takes time. They are not related to logical initiate learning and make it successful (experimental
structures, except in the sense that the combining of teaching methods at the beginning of the twentieth
prior capabilities into new ones carries its own inherent century); open but stimulating learning environments
logic (Gagné 1968, p. 189). to arouse students mentally (the educational reform
Does learning really determine all of cognitive movement of the twentieth century); the planning,
development during childhood? There is still not organization, structuring, and evaluation of learning
enough empirical evidence to answer this question. in the classroom through the teacher (direct in-
Nonetheless, the confirmation of domain-specific struction); or the moderation of the activities of
knowledge in infancy, of the acquisition of demand- independent and responsible student learners (self-
ing (linguistic or numerical) competencies during regulated learning approach).
early childhood without any recognizable signs of a The theoretical (and sometimes ideological) contro-
prior acquisition of the necessary cognitive precon- versies over the role and function of instruction for
ditions, and the interindividual similarity in cognitive learning are particularly relevant at the start of the
development despite different learning opportunities, twenty-first century. In a comparison of psychological
supports the idea that innate predispositions must also research on learning and educational research on
be involved in not only specific but also unspecific teaching, Shulman (1982) even believed that he could
cognitive learning processes. recognize a potential paradox: ‘Although the research
As children grow older, the externally controlled on learning has taught us the importance of the active,
acquisition (through learning) of competencies seems transforming role of the learner, the research on
to become more important for cognitive development teaching continues to demonstrate the importance of
than internally guided learning processes. For ex- direct instruction, an approach which seems to suggest
ample, the information-processing approach has re- a passive view of the learner’ (p. 97).
vealed large intraindividual differences in solving The apparently paradoxical pattern of results from
structurally similar tasks that nonetheless differ in experimental and developmental research on the one
content. The influence of learned declarative as well as hand, and research on classroom instruction on the
procedural knowledge on the development of cog- other hand, is not at all contradictory when one goes
nitive competencies has been confirmed convincingly. beyond the negative stereotype associated with direct
Studies on how novices develop into experts in specific instruction and considers the features and operation-
contexts have also shown what outstanding levels of alizations used in many models of direct instruction.
performance can be achieved through long-term, Direct instruction can be characterized by the fol-
deliberate practice (Siegler 1991). lowing points: (a) The teacher’s classroom manage-
Even though relations between innate and acquired ment is effective and the rate of students’ interruptive
competencies, between ontological and logical restric- behavior is very low; (b) the teacher maintains a strong
tions to learning, between domain-specific and academic focus and uses instructional time intensively
domain-unspecific learning opportunities, between to initiate and facilitate active, constructive, and goal-
explicit and implicit learning modes, between neurobi- directed learning activities; (c) the teacher ensures that
ological and behavioral indicators of learning, are as many students as possible achieve successful learn-
currently a field of intensive research, new methodo- ing processes by carefully choosing appropriate tasks,
logical paradigms, and numerous theoretical specula- clearly presenting subject-matter information, con-
tions, they have still not been subjected to solid and tinuously diagnosing each student’s learning progress
consistent theory formulation (Richardson 1998). and learning difficulties, and providing effective help
through remedial instruction.
Many studies have shown that instruction in which
4. Learning and Instruction: Are Learners Really the teacher actively supports the learning process of
Their Own Best Instructors? actively and constructively working students is more
effective than an educational strategy in which the
The role and function of instruction for learning, teacher’s only role is to provide for external conditions
particularly for academic learning, were already con- that make individual learning possible (Weinert and
troversial topics in the philosophical tradition of Helmke 1995).
educational science, and this has not changed during Nonetheless, self-regulated learning has become a
the more than 100-year-old history of educational broad field of psychological research in recent years as


well as a powerful movement for educational reform (Boekaerts 1999). This reveals a broad consensus that self-directed learning is one of the most important goals of education. The only controversy is over whether this goal can be attained exclusively through self-directed learning activities by students, regardless of whether or not these learners possess appropriate learning strategies or how far the contents of learning are close to the individual learner’s experience.

5. Conclusions

Although it has been possible to solve or clarify many scientific problems regarding the relation between cognitive development, learning, and instruction during the last century, psychology and educational science are still far from possessing satisfactory theoretical models. Nonetheless, the current state of knowledge reveals a few interesting perspectives and conclusions for future psychological research and current educational practice:
(a) At the present time, neurobiological and psychological research are still unable to work in concert, particularly with babies and young children. There is a need for improvement in the forms of interdisciplinary cooperation and a speeding up of knowledge transfer.
(b) A very important question seems to be what we have to teach in order to ensure that certain competencies are acquired at all and lead to the emergence of the desired special patterns of cognitive development.
(c) As well as examining universal changes in development, there is a need for studies on interindividual differences in cognitive structures and processes to improve the understanding of learning effects and the effectiveness of instruction. Which deficient competencies can be compensated or substituted through learning and through instruction, and which trade-offs can be anticipated?
(d) As well as studying representative samples of children at different stages of development, basic research needs to pay more attention to the developmental course of both gifted and mentally retarded children. Which qualitative differences in ability and development can be found? What consequences do they have for the possibility, speed, and quality of learning? How can instruction be adapted in line with individual differences in intelligence, talent, and learning?
(e) It has proved to be unacceptable theoretically and also dysfunctional in terms of research strategy to study cognitive development in middle age and old age in an (inverted) analogy to childhood development. Developmental processes in the early and late sections of the lifespan differ in fundamental terms.
(f) No universal, unequivocal, and concrete recommendations for educational practice can be derived from the available general theories on development, learning, and instruction.
At the beginning of the twenty-first century, it therefore seems wise for school administrators and teachers not to follow radically one-sided scientific recommendations, but to apply variable combinations of instruction methods to meet different educational goals and comply with the different learning preconditions in their students.

See also: Childhood and Adolescence: Developmental Assets; Cognitive Development: Child Education; Cognitive Development in Childhood and Adolescence; Cognitive Psychology: Overview; Cognitive Styles and Learning Styles; Education: Phenomena, Concepts, and Theories; Educational Learning Theory; Instructional Design; Instructional Psychology; Intelligence, Prior Knowledge, and Learning; Learning and Instruction: Social-cognitive Perspectives; Learning Theories and Educational Paradigms; Piaget’s Theory of Human Development and Education; School Achievement: Cognitive and Motivational Determinants

Bibliography
Binet A, Simon R 1905 Application des méthodes nouvelles au diagnostic du niveau intellectuelle chez des enfants normaux et anormaux d’hospice et d’école primaire. Année Psychologique 11: 245–336
Boekaerts M 1999 (ed.) Self-regulated learning: Where we are today. International Journal of Educational Research 31: 443–551
Carey S 1984 Cognitive development. The descriptive problem. In: Gazzaniga M S (ed.) Handbook of Cognitive Neuroscience. Freeman, New York, pp. 37–66
Flavell J H 1970 Cognitive change in adulthood. In: Goulet R, Baltes P B (eds.) Life-span Developmental Psychology: Research and Theory. Academic Press, New York, pp. 247–53
Gagné R M 1968 Contributions of learning to human development. Psychological Review 75: 177–91
Geary D C 1995 Reflections of evolution and culture in children’s cognition. American Psychologist 50: 24–36
Herrnstein R J, Murray C 1994 The Bell Curve. The Free Press, New York
Kohlberg L, Mayer R 1972 Development as the aim of education. Harvard Educational Review 49: 294–303
Piaget J 1970 Piaget’s theory. In: Mussen P H (ed.) Carmichael’s Manual of Child Psychology. Wiley, New York, Vol. 1, pp. 703–32
Richardson K 1998 Models of Cognitive Development. Psychology Press, Hove, UK
Shulman L S 1982 Educational psychology returns to school. In: Kraut A G (ed.) The G Stanley Hall Lecture Series. American Psychological Association, Washington, DC, Vol. 2, pp. 77–117
Siegler R S 1991 Children’s Thinking, 2nd edn. Prentice-Hall, Englewood Cliffs, NJ
Slavin R E 1987 Mastery learning reconsidered. Review of Educational Research 57: 175–213
Sternberg R S 1998 Abilities are forms of developing expertise. Educational Researcher 27: 11–20


Weinert F E, Helmke A 1995 Learning from wise mother nature or big brother instructor: The wrong choice as seen from an educational perspective. Educational Psychologist 30: 135–42
Wohlwill J F 1980 Cognitive development in childhood. In: Brim O G, Jr., Kagan J (eds.) Constancy and Change in Human Development. Harvard University Press, Cambridge, MA, pp. 359–444

F. E. Weinert


Cognitive Dissonance

1. Foundations of Dissonance Theory

The theory of cognitive dissonance is elegantly simple: it states that inconsistency between two cognitions creates an aversive state akin to hunger or thirst that gives rise to a motivation to reduce the inconsistency. According to Leon Festinger (1957), cognitions are elements of knowledge that people have about their behavior, their attitudes, and their environment. As such, a set of cognitions can be unrelated, consonant, or dissonant with each other. Two cognitions are said to be dissonant when one follows from the obverse of the other. The resultant motivation to reduce dissonance is directly proportional to the magnitude and importance of the discrepant cognitions, and inversely proportional to the magnitude and importance of the consistent cognitions. This tension is typically reduced by changing one of the cognitions, or adding new cognitions until mental ‘consonance’ is achieved.
Festinger’s original formulation proved to be one of the most robust, influential, and controversial theories in the history of social psychology. Although a number of challenges and revisions have been suggested, the basic behavioral observation remains uncontested and continues to stimulate fresh research.
Application of this theory has yielded many surprising and nonintuitive predictions. For example, conventional wisdom suggests that behavior follows from attitudes; dissonance theory, however, identifies conditions under which just the opposite occurs. An early and often replicated experiment illustrates the power and counterintuitiveness of the theory. In what is now known as the induced compliance effect, Festinger and Carlsmith (1959) asked individuals to perform 30 minutes of a mind-numbingly tedious activity, and then to persuade a waiting participant that the activity was in fact quite interesting. This situation created cognitive dissonance in most individuals: they believed that the task was boring, yet inexplicably found themselves arguing quite the opposite. Half of the participants were given a ready excuse for telling this lie (they were paid $20 to do so), while the other half, paid only $1, had no such excuse. Those with a clear justification for their odd behavior experienced no dissonance and, as one would expect, later reported that the task was rather boring. The other half, however, given insufficient justification for their behavior, experienced dissonance between the knowledge that the task was boring and the reality that they were misleading a fellow participant into believing the opposite. Rather than endure the aversive experience of believing one thing but saying another, these individuals changed their opinion and convinced themselves that the task was actually interesting. In other words, their attitude was shaped by their behavior.
Subsequent studies have confirmed the basic theory of cognitive dissonance and demonstrated its far-reaching impact. For example, cognitive dissonance explains the increased commitment so frequently observed following a severe initiation into a group. The theory also explains why, when faced with a choice among several desirable options, we observe the tendency to highlight positive aspects of the chosen option and negative aspects of the rejected alternatives after (and only after) the choice has been made. In the course of such studies, we have learned much about the boundary conditions associated with the theory and have identified anomalies not easily explained by the original theory. Since the 1960s, a number of theoretical revisions have sought to subsume these limitations under a unifying theory. This article summarizes briefly the leading reformulations of dissonance theory and speculates on future directions.

2. An Early Theoretical Challenge: Self-perception Theory

Dissonance did not come quietly into psychology, nor did controversy begin over the finer points of the underlying process. Rather, the theory challenged the reigning theoretical paradigm of behaviorism of the 1950s by questioning the sovereign utility of basic learning theory. Rather than increased rewards leading to more positive attitudes, Festinger and Carlsmith (1959) had shown quite the opposite: participants who received a smaller reward for counterattitudinal advocacy developed more positive attitudinal responses. The majority of early criticisms, therefore, focused not on the details of the dissonance theory but rather on its fundamental legitimacy. During parts of the next two decades, one of the enduring intellectual feuds in social psychology debated whether dissonance phenomena were the result of complex cognitive processes in the mind of the participant, or whether they were merely the result of complex cognitive processes in the mind of the experimenter. Daryl Bem (1972) took the position that one could derive similar predictions through far more parsimonious behavioral processes. He argued that participants in dissonance experiments were not experiencing negative psychological tension due to inconsistency, but rather were simply inferring their attitudes from their behavior and the situation in


which it occurred. In essence, he suggested that people view their own behavior as though they were outside observers, and infer their underlying attitude from an analysis of their behavior. In support of this position, Bem replicated Festinger and Carlsmith (1959) and demonstrated that independent observers, aware of the monetary inducements, attributed attitudes to the participants that were nearly identical to the participants’ actual attitudes. The critical test between these theories revolved around the search for physiological arousal. Festinger was quite clear that inconsistent cognitions created an aversive motivational state that could presumably be measured. Bem’s self-perception theory, by contrast, predicted neither psychological nor physiological tension. However, Zanna and Cooper (1974) found indirect evidence for physiological arousal by showing that if arousal were misattributed to an irrelevant source, the effects of dissonance disappeared. The debate over dissonance vs. self-perception was finally laid to rest by a series of experiments that identified the precise conditions under which each was operative. Fazio et al. (1977) found that small discrepancies between attitude and behavior (defined as those within a person’s latitude of acceptance) tended to elicit self-perception processes, but larger discrepancies (those that fall in the person’s latitude of rejection) were more likely to generate dissonance processes. Croyle and Cooper (1983) added direct support for Festinger’s original position by showing that engaging in counterattitudinal advocacy, at least outside people’s latitude of acceptance, is marked by measurable increases in people’s skin conductance responses. Ultimately, the importance of self-perception theory lay in its contributions to identifying boundary conditions of dissonance processes and in provoking research that established physiological arousal as one of the hallmarks of dissonance.

3. Introduction of the Self

It may be that not all cognitive inconsistencies are psychologically equivalent. According to some theorists, inconsistencies that implicate aspects of the self maintain a privileged position. This suspicion led Elliot Aronson (1968) and Claude Steele (1988) to consider the role of self-concept in the dissonance process. Aronson concurred with Festinger’s view that dissonance was caused by inconsistencies, but argued that it was a particular inconsistency that mattered most in arousing dissonance, i.e., the discrepancy between a person’s general expectations for the self and his or her actual behavior. In other words, the arousal due to dissonance came about when a person’s belief that he or she was a good and rational individual was called into question by behavior that was neither good nor rational. Aronson predicted that dissonance arousal would be more frequent and more powerful among those with high self-esteem, that is, among those whose past history had led them to believe that their high internal standards of behavior were likely to be achieved. By contrast, Aronson predicted that those with low self-esteem, who were accustomed to behaving less competently, would not be surprised or discomfited to find themselves once again behaving in an incompetent manner.
To illustrate his point, Aronson (1968) argued that the dissonance aroused in Festinger and Carlsmith’s (1959) experiment was not due to inconsistency between the thoughts ‘I believe the task was dull’ and ‘I told someone the task was interesting.’ Instead, Aronson proposed that dissonance was aroused by inconsistency between cognitions about the self (e.g., ‘I am a decent and truthful human being’) and cognitions about the behavior (e.g., ‘I have misled a person … (and) conned him into believing something that just isn’t true,’ p. 24). Aronson concluded: ‘at the very heart of dissonance theory, where it makes its clearest and neatest prediction, we are not dealing with just any two cognitions; rather we are usually dealing with the self-concept and cognitions about some behavior. If dissonance exists it is because the individual’s behavior is inconsistent with his self-concept’ (1968, p. 23).
Claude Steele and his colleagues (1988) suggested a different interpretation of the role of the self in creating dissonance. Like Aronson, Steele viewed inconsistent cognitions as a threat to the self; but unlike Aronson’s self-consistency model, he suggested that the primary function of dissonance reduction was not to rescue the specific self-cognitions threatened by a behavioral outcome, but instead to restore the completeness of the overarching self-system. The difference, then, was not about the origin of dissonance arousal, but rather about the purpose and mechanism underlying dissonance reduction. Whereas Aronson focused on individual susceptibility to dissonance arousal, Steele’s self-affirmation theory focused on individual resilience to dissonance. Interestingly, from Aronson’s point of view, it is those individuals who have a positive self-regard that are most likely to experience dissonance arousal. Steele, by contrast, asserted that it is those same individuals, wrapped in their armor of self-resources, who feel immune from the need to reduce dissonance. Based on the data provided by Aronson and Steele, it is probably fair to say that those with high self-esteem are both more susceptible to dissonance arousal and more resistant to its effects because they can focus on their many other strengths. Holding all else constant, high self-esteem serves both as a catalyst that generates dissonance and as a buffer that mitigates the need to reduce dissonance.

4. A New Look at Dissonance

In 1984, Cooper and Fazio provided a comprehensive review of the dissonance literature and challenged the


dominant assumption that dissonance was driven by a need for psychological consistency. According to their ‘New Look’ model, dissonance is aroused when people perceive that their behavior has been responsible for bringing about consequences that are unwanted or aversive. If there are no such consequences, then inconsistent behavior will not produce the state of dissonance. For example, Cooper and Worchel (1970) replicated Festinger and Carlsmith’s (1959) study with a condition in which the waiting participant was not convinced by the subject’s lie. In this condition, the aversive consequence of misleading a fellow participant was removed along with all evidence of the dissonance process. Cooper and Fazio concluded from this and many other studies that responsibility for an aversive event rather than cognitive inconsistency plays the vital role in producing cognitive dissonance.

5. Future Directions: Self-standards Model of Dissonance

Each of these reformulations of dissonance theory differs with respect to one major issue: what is the role of self-concept in dissonance processes? Is it a problem, a benefit, or completely irrelevant to the arousal and reduction of cognitive dissonance? There are data in favor of each position, data that are not easily reconcilable. However, a recent synthesis discussed by Cooper (1999) and Stone (1999) suggests that dissonance is caused by a discrepancy between the outcome of a behavioral act and the standard to which it is compared. According to the self-standards model, sometimes the standards that people use to measure their behavioral outcomes are personal and idiosyncratic. In such cases, people’s views of themselves will play a crucial role. At other times, the assessment of an act is based on broad, normative standards that are shared in the culture. At these times, the self will not play a role in the dissonance process.
In summary, it is still useful to think of dissonance as involving inconsistency among cognitive elements and to conclude that inconsistency produces motivation for change. It is also fair to conclude that the future of dissonance theory will include a role for behavioral consequences, an assessment of the self, and an analysis of the contextual variables that make different standards the basis of judgment for behavioral outcomes. The evolution of cognitive dissonance calls for an integration that will likely include insights from the currently dominant perspectives.

See also: Attitudes and Behavior; Motivation and Actions, Psychology of; Motivation: History of the Concept; Self-monitoring, Psychology of; Social Comparison, Psychology of

Bibliography
Aronson E 1968 Dissonance theory: Progress and problems. In: Abelson R P, Aronson E, McGuire W J, Newcomb T M, Rosenberg M J, Tannenbaum P H (eds.) Theories of Cognitive Consistency: A Sourcebook. Rand McNally, Chicago
Bem D J 1972 Self-perception theory. Advances in Experimental Psychology 6: 1–62
Cooper J 1999 Unwanted consequences and the self: In search of the motivation for dissonance reduction. In: Harmon-Jones E, Mills J (eds.) Cognitive Dissonance: Progress on a Pivotal Theory in Social Psychology. American Psychological Association, Washington, DC, pp. 149–73
Cooper J, Fazio R H 1984 A new look at dissonance theory. Advances in Experimental Psychology 17: 229–62
Cooper J, Worchel S 1970 Role of undesired consequences in arousing dissonance. Journal of Personality and Social Psychology 16: 199–206
Croyle R, Cooper J 1983 Dissonance arousal: Physiological evidence. Journal of Personality and Social Psychology 45: 782–91
Fazio R H, Zanna M P, Cooper J 1977 Dissonance and self-perception: An integrative view of each theory’s proper domain of application. Journal of Experimental Psychology 13: 464–79
Festinger L 1957 A Theory of Cognitive Dissonance. Row, Peterson, Evanston, IL
Festinger L, Carlsmith J M 1959 Cognitive consequences of forced compliance. Journal of Abnormal Social Psychology 58: 203–10
Steele C M 1988 The psychology of self-affirmation: Sustaining the integrity of the self. Advances in Experimental Psychology 12: 261–302
Stone J 1999 What exactly have I done? The role of self-attribute accessibility in dissonance. In: Harmon-Jones E, Mills J (eds.) Cognitive Dissonance: Progress on a Pivotal Theory in Social Psychology. American Psychological Association, Washington, DC, pp. 175–200
Zanna M P, Cooper J 1974 Dissonance and the pill: An attribution approach to studying the arousal properties of dissonance. Journal of Personality and Social Psychology 29: 703–9

J. Cooper and K. M. Carlsmith


Cognitive Functions (Normal) and Neuropsychological Deficits, Models of

The field of cognitive neuropsychology centers around two coupled goals: to use patterns of cognitive deficits in brain-damaged patients to inform theories and models of how cognitive processes are carried out by the brain, and to apply existing models to explain the specific deficits of individual patients in order to design more effective strategies for remediating these deficits. The roots of this effort can be traced back to the pioneering work of Broca, Wernicke, and Lichtheim in the mid- to late nineteenth century. These neurologists attempted to decompose complex cognitive functions, such as language, into the joint operation of multiple functional ‘centers’ with specific patterns of connectivity between them. Damage either to the centers themselves or to the pathways between them was thought to give rise to distinct patterns of cognitive deficits. Indeed, the traditional scheme used today to categorize patterns of language impairment into distinct clinical syndromes (such as Broca’s aphasia, Wernicke’s aphasia, transcortical sensory or motor aphasia, etc.) derives from the Wernicke–Lichtheim model of the organization of the language system developed in the late nineteenth century (see Fig. 1).

Figure 1
Lichtheim’s (1885) model of aphasia. Grey bars indicate lesions to processing centers (circles) or pathways between them (arrows), giving rise to the following classic aphasia syndromes (in modern terminology): (a) Broca’s aphasia; (b) Wernicke’s aphasia; (c) conduction aphasia; (d) transcortical motor aphasia; (e) dysarthria; (f) transcortical sensory aphasia; (g) pure word deafness; (h) anomic aphasia

1. Box-and-arrow Modeling

Although the form of explanation offered by these so-called ‘diagram makers’ was criticized roundly in the early twentieth century, it is echoed in modern-day cognitive neuropsychology in the form of box-and-arrow information processing models (see Fig. 2). While the functions ascribed to the centers or components are far more specific than in the nineteenth century, the same underlying explanatory logic is applied: the patterns of performance of brain-damaged patients are explained by positing one or more ‘lesions’ to the so-called functional architecture of the cognitive system. Typically, the predictions of the model, both in normal operation and under damage, have consisted of verbal descriptions based on fairly general notions about how the various modules would operate and interact. While these types of predictions may suffice for capturing the more general characteristics of normal and impaired cognitive functioning, they become increasingly unreliable as the model is elaborated to account for more detailed phenomena.
Two recent trends have increased the usefulness of box-and-arrow theorizing within cognitive neuropsychology. First, with improvements in techniques for structural lesion localization in patients and for functional brain imaging in both patients and normal subjects, there has been a more concerted effort to situate the components and pathways in specific brain regions. This is important because information on neuroanatomic localization places strong constraints on how components participate in various tasks and on how the system must be damaged to account for the performance of specific patients.
The second, and perhaps more important, trend has been the development of working computer simulations of cognitive models that can both reproduce the characteristics of normal performance and exhibit the appropriate deficits when damaged in a manner analogous to brain damage. Computational modeling makes it possible to demonstrate the sufficiency of the underlying theory in accounting for the phenomena by making the behavior of a detailed cognitive model explicit. A working simulation guarantees that the underlying theory is neither vague nor internally inconsistent, and the behavior of the


Figure 2
A box-and-arrow model of written and spoken word processing (from Howard and Franklin 1988)

simulation can be used to generate specific predictions of the theory.
One example of computational modeling based on box-and-arrow theorizing is the work of Coltheart and co-workers (Coltheart et al. 1993) in simulating a dual-route model of word reading. In the model, one pathway from print to sound applies grapheme-phoneme correspondence rules (e.g., B at the beginning of a word is pronounced /b/), while the other uses memorized whole-word correspondences. (These are the two pathways in Fig. 2 from the ‘Orthographic input buffer’ to the ‘Phonological output buffer’ that bypass the ‘Cognitive System.’) The rule route is effective for regular words and for pronounceable but meaningless pseudowords; however, the lexical route is needed to pronounce exception words whose pronunciations violate the rules. In Coltheart and co-workers’ implementation, the rule route consists of a collection of template-matching rules that operate left-to-right on the input string and generate single phonemes at fixed intervals. The lexical route is a version of a highly influential model of word recognition developed by Rumelhart and McClelland (1982), known as the Interactive Activation model, which contains a separate processing unit for each word in the vocabulary. Damage to the lexical route yields a reading pattern in which exception words are regularized, analogous to that of patients with acquired surface dyslexia (Patterson et al. 1985). Conversely, damage to the rule route causes impaired reading of pseudowords relative to words, corresponding to acquired phonological dyslexia (Beauvois and Derouesné 1979).

2. Connectionist Modeling

Although substantial progress has been made within the framework of box-and-arrow theorizing, many


Figure 3
A distributed connectionist framework for lexical processing. The output of each unit is a smooth, nonlinear
function of the summed, weighted input from other units (adapted from Plaut 1997)
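
The caption's description of a unit's output can be illustrated with a minimal Python sketch. It assumes the logistic function as the smooth nonlinearity, which the text does not specify, and every name, weight, and input value below is made up for illustration rather than taken from Plaut (1997).

```python
import math


def logistic(x):
    """Smooth, nonlinear squashing function mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def unit_output(inputs, weights, bias=0.0):
    """Output of one unit: the logistic of its summed, weighted input."""
    net = sum(w * a for w, a in zip(weights, inputs)) + bias
    return logistic(net)


def layer_output(inputs, weight_rows, biases):
    """Outputs of a whole layer; each row holds one unit's incoming weights."""
    return [unit_output(inputs, row, b) for row, b in zip(weight_rows, biases)]


# Hypothetical toy network: 3 input units -> 2 hidden units -> 1 output unit.
input_pattern = [1.0, 0.0, 1.0]  # made-up activity over the input units
hidden_weights = [[0.5, -0.3, 0.8], [-0.6, 0.9, 0.1]]
hidden_biases = [0.0, 0.0]
output_weights = [[1.2, -0.7]]
output_biases = [0.0]

hidden = layer_output(input_pattern, hidden_weights, hidden_biases)
output = layer_output(hidden, output_weights, output_biases)
print(hidden, output)
```

Because each activity is a graded value between 0 and 1 rather than an all-or-none signal, small changes to the weights produce small changes in the output, which is what allows such weights to be adjusted gradually.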

researchers have come to believe that, in order to capture the full range of cognitive and neuropsychological phenomena, a formalism is needed that is based more closely on the style of computation employed by the brain. One such formalism that is widely used is connectionist modeling (see, e.g., McClelland et al. 1986, McLeod et al. 1998, Rumelhart et al. 1986).
In connectionist models (sometimes called neural networks or parallel distributed processing systems), cognitive processes take the form of cooperative and competitive interactions among large numbers of simple, neuron-like processing units (Fig. 3). Typically, each unit has a real-valued activity level, roughly analogous to the firing rate of a neuron. Unit interactions are governed by weighted connections that encode the long-term knowledge of the system and are learned gradually through experience. The activity of some of the units encodes the input to the system; the resulting activity of other units encodes the system’s response to that input. The patterns of activity of the remaining units constitute learned, internal representations that mediate between inputs and outputs. Units and connections generally are not considered to be in one-to-one correspondence with actual neurons and synapses. Rather, connectionist systems attempt to capture the essential computational properties of the vast ensembles of real neuronal elements found in the brain through simulations of smaller networks of units. In this way, the approach is distinct from computational neuroscience (Sejnowski et al. 1988), which aims to model the detailed neurophysiology of relatively small groups of neurons. Although the connectionist approach uses physiological data to guide the search for underlying principles, it tends to focus more on overall system function or behavior, attempting to determine what principles of brain-style computation give rise to the cognitive phenomena observed in human behavior.
The simplest type of connectionist system is a feedforward network, in which information flows unidirectionally from input units to output units, typically via one or more layers of hidden units (so called because they are not visible to the environment). Such networks are useful in many contexts but have a limited ability to process time-varying information. In such contexts, recurrent networks, which permit any pattern of interconnection among the units, are more appropriate. In one common type of recurrent network, termed an attractor network, unit activities gradually settle to a stable pattern in response to a fixed input. Recurrent networks can also learn to process sequences of inputs and/or to produce sequences of outputs. For example, in a simple recurrent network (Elman 1990), the internal representation generated for each element in a sequence is made available as an additional input to provide context for processing subsequent elements. Critically, the internal representations themselves adapt so as to encode this context information effectively, enabling the system to learn to represent and retain relevant information at multiple time scales.
An issue of central relevance in the study of cognition is the nature of the underlying representation of information. Connectionist models divide roughly into two classes in this regard. In localist models, such as the Interactive Activation model mentioned earlier, each unit corresponds to a distinct,


familiar entity such as a letter, word, concept, or proposition. By contrast, in distributed models, such entities are encoded not by individual units but by alternative patterns of activity over the same group of units, so that each unit participates in representing many entities. Both localist and distributed models are ‘connectionist’ in the sense that the system’s knowledge is encoded in terms of weights on connections between units.

Because localist models specify the form and content of representations, they tend to de-emphasize the role of learning. With a distributed model, by contrast, there is greater emphasis on the ability of the system to learn effective internal representations. Thus, instead of attempting to stipulate the specific form and content of the knowledge required for performance in a domain, the approach stipulates the tasks the system must perform, including the nature of the relevant information in the environment, but then leaves it up to learning to develop the necessary internal representations and processes.

Learning in a connectionist system involves modifying the values of weights on connections between units in response to feedback on the behavior of the network. A variety of specific learning procedures are employed in connectionist research; most that have been applied to cognitive domains, such as back-propagation (Rumelhart et al. 1986), take the form of error correction: change each weight in a way that reduces the discrepancy between the correct response to each input and the one actually generated by the system. Although it is unlikely that the brain implements back-propagation in any direct sense, there are more biologically plausible procedures that are computationally equivalent (see, e.g., O’Reilly 1996).

In an early application of error-correcting learning, Rumelhart and McClelland (1986) showed that a single network could learn to generate the past-tense forms of both regular verbs (e.g., bake → ‘baked’) and irregular verbs (e.g., take → ‘took’), thereby obviating the need for dual rule-based and exception mechanisms (Pinker 1999), analogous to those in the dual-route reading models mentioned earlier. Although aspects of the approach were criticized strongly (Pinker and Prince 1988), many of the specific limitations of the model have been addressed in subsequent simulation work (see, e.g., MacWhinney and Leinbach 1991, Plunkett and Marchman 1993, 1996). Of particular interest is recent work by Joanisse and Seidenberg (1999) showing that damage either to phonological or to semantic representations within a single processing system can account for the observation of selective impairments in performance on regular vs. irregular verbs following Parkinson’s disease vs. Alzheimer’s disease, respectively (Ullman et al. 1997).

A similar line of progress has taken place in the domain of English word reading. An early connectionist model (Seidenberg and McClelland 1989) provided a good account of word reading but was poor at pronouncing word-like pseudowords (Besner et al. 1990). A more recent series of simulations (Plaut et al. 1996) showed that the limitations of this preliminary model stemmed from the model’s use of poorly structured orthographic and phonological representations. By contrast, networks with more appropriate representations were able to learn to pronounce both regular and exception words, and yet also pronounce pseudowords as well as skilled readers. Moreover, damage to semantic representations in such networks gave rise to surface dyslexia, in which patients produce regularization errors to exception words. In closely related work (Hinton and Shallice 1991, Plaut and Shallice 1993), damage to the pathway between orthography and phonology, combined with secondary damage to the semantic pathway, yielded the complementary pattern of deep dyslexia (often viewed as a severe form of phonological dyslexia), in which patients are extremely poor at pronouncing pseudowords and make semantic errors in reading words aloud (e.g., producing ‘ocean’ for a semantically related word; see Coltheart et al. 1980). With the addition of an attentional mechanism, fully recurrent networks have also been used to account for the interaction of both perceptual and lexical/semantic factors in the reading errors of neglect dyslexic patients (Mozer and Behrmann 1990). In fact, such networks have been applied to a wide range of neuropsychological phenomena, including selective impairments in face recognition (Farah et al. 1993), visual object recognition (Humphreys et al. 1992), spatial attention (Cohen et al. 1994), semantic memory (Farah and McClelland 1991), anomia and aphasia (Dell et al. 1997), spelling (Brown and Ellis 1994), and executive control (Cohen and Servan-Schreiber 1992).

One of the main attractions of distributed connectionist models is their ability to discover the structure implicit in ensembles of events and experiences. Accomplishing this, however, requires making only very small changes in response to each input so that the resulting weight values reflect the long-term experience of the system. Attempts to teach such networks the idiosyncratic properties of specific events one after the other do not generally succeed, since the changes made in learning each new case produce ‘catastrophic interference’ with what was stored previously in the weights (McCloskey and Cohen 1989). McClelland et al. (1995) observed, however, that catastrophic interference does not occur if continued training of old knowledge is interleaved with the training of new knowledge. They proposed that the brain employs two complementary learning systems: a cortical system for gradual learning using highly overlapping distributed representations, and a subcortical, hippocampal-based system for rapid learning using much sparser, less-overlapping representations. On their account, stored instances in the hippocampus provide the training input for past experience that


must be interleaved with ongoing experience to prevent interference in cortex. The argument was that learning in cortex and in distributed networks are similarly constrained, so that the strengths and limitations of structure-sensitive learning in networks explained why the brain employs two complementary learning systems in hippocampus and neocortex.

Although fully recurrent networks are capable of learning to exhibit complex temporal behavior, for reasons of efficiency it is more common to apply simple recurrent networks in temporal domains. For example, Elman (1991) demonstrated that a simple recurrent network could learn the structure of an English-like grammar, involving number agreement and variable verb argument structure across multiple levels of embedding, by repeatedly attempting to predict the next word in processing sentences. St. John and McClelland (1990) also showed how such networks can learn to develop a representation of sentence meaning by attempting to answer queries about thematic role assignments throughout the course of processing a sentence. There have, however, been relatively few attempts at applying simple recurrent networks to neuropsychological phenomena.

3. Future Directions

In many ways, the application of computational modeling to understanding normal and impaired cognition is still in its infancy. Only a small fraction of the relevant behavioral phenomena have been addressed in any detail by existing models. Certainly considerable fruitful work remains in applying existing methods to a broader range of empirical issues. Even so, it seems clear that existing computational frameworks have a number of limitations which hamper their broader application. This is particularly true with regard to the application of connectionist networks to complex temporal domains, such as language, reasoning, and problem solving. While there have been some promising initial steps in these areas, substantial development of the computational methodology itself is likely to be necessary before satisfactory models will be possible.

4. Summary

Researchers interested in human cognitive processes have long used computer simulations to try to identify the principles of cognition. The strategy has been to build computational models that embody a set of principles and then examine how well the models capture human performance in cognitive tasks. A number of formalisms have been used to model cognitive processing in normal individuals. Those based on general principles of neural computation, including connectionist or neural-network models, have proven most effective at capturing the effects of brain damage on cognition. Considerable work remains, however, in extending such models to address more complex temporal phenomena.

See also: Artificial Intelligence: Connectionist and Symbolic Approaches; Artificial Neural Networks: Neurocomputation; Cognitive Neuropsychology, Methodology of; Computational Neuroscience; Connectionist Approaches; Connectionist Models of Language Processing; Language Acquisition; Language Development, Neural Basis of; Lexical Processes (Word Knowledge): Psychological and Neural Aspects; Neural Networks: Biological Models and Applications; Neuropsychological Functioning, Assessment of; Syntactic Aspects of Language, Neural Basis of; Word Recognition, Cognitive Psychology of

Bibliography

Beauvois M-F, Derouesné J 1979 Phonological alexia: Three dissociations. Journal of Neurology, Neurosurgery, and Psychiatry 42: 1115–24
Besner D, Twilley L, McCann R S, Seergobin K 1990 On the connection between connectionism and data: Are a few words necessary? Psychological Review 97(3): 432–46
Brown G D A, Ellis N C (eds.) 1994 Handbook of Normal and Disturbed Spelling. Wiley, New York
Cohen J D, Romero R D, Servan-Schreiber D, Farah M J 1994 Mechanisms of spatial attention: The relation of macrostructure to microstructure in parietal neglect. Journal of Cognitive Neuroscience 6(4): 377–87
Cohen J D, Servan-Schreiber D 1992 Context, cortex, and dopamine: A connectionist approach to behavior and biology in schizophrenia. Psychological Review 99(1): 45–77
Coltheart M, Curtis B, Atkins P, Haller M 1993 Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review 100(4): 589–608
Coltheart M, Patterson K, Marshall J C (eds.) 1980 Deep Dyslexia. Routledge and Kegan Paul, London
Dell G S, Schwartz M F, Martin N, Saffran E M, Gagnon D A 1997 Lexical access in normal and aphasic speakers. Psychological Review 104: 801–38
Elman J L 1990 Finding structure in time. Cognitive Science 14(2): 179–211
Elman J L 1991 Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning 7: 195–225
Farah M J, McClelland J L 1991 A computational model of semantic memory impairment: Modality-specificity and emergent category-specificity. Journal of Experimental Psychology: General 120(4): 339–57
Farah M J, O’Reilly R C, Vecera S P 1993 Dissociated overt and covert recognition as an emergent property of a lesioned neural network. Psychological Review 100(4): 571–88
Hinton G E, Shallice T 1991 Lesioning an attractor network: Investigations of acquired dyslexia. Psychological Review 98(1): 74–95
Howard D, Franklin S 1988 Missing the Meaning? MIT Press, Cambridge, MA


Humphreys G W, Freeman T, Müller H J 1992 Lesioning a connectionist model of visual search: Selective effects on distractor grouping. Canadian Journal of Psychology 46: 417–60
Joanisse M F, Seidenberg M S 1999 Impairments in verb morphology after brain injury: A connectionist model. Proceedings of the National Academy of Sciences, USA 96: 7592–7
Lichtheim L 1885 On aphasia. Brain 7: 433–84
MacWhinney B, Leinbach J 1991 Implementations are not conceptualizations: Revising the verb learning model. Cognition 40: 121–53
McClelland J L, McNaughton B L, O’Reilly R C 1995 Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review 102: 419–57
McClelland J L, Rumelhart D E, PDP Research Group (eds.) 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 2: Psychological and Biological Models. MIT Press, Cambridge, MA
McCloskey M, Cohen N J 1989 Catastrophic interference in connectionist networks: The sequential learning problem. In: Bower G H (ed.) The Psychology of Learning and Motivation. Academic Press, New York, Vol. 24, pp. 109–65
McLeod P, Plunkett K, Rolls E T 1998 Introduction to Connectionist Modelling of Cognitive Processes. Oxford University Press, Oxford, UK
Mozer M C, Behrmann M 1990 On the interaction of selective attention and lexical knowledge: A connectionist account of neglect dyslexia. Journal of Cognitive Neuroscience 2(2): 96–123
O’Reilly R C 1996 Biologically plausible error-driven learning using local activation differences: The generalized recirculation algorithm. Neural Computation 8(5): 895–938
Patterson K, Coltheart M, Marshall J C (eds.) 1985 Surface Dyslexia. Erlbaum, Hillsdale, NJ
Pinker S 1999 Words and Rules: The Ingredients of Language. Basic Books, New York
Pinker S, Prince A 1988 On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition 28: 73–193
Plaut D C 1997 Structure and function in the lexical system: Insights from distributed models of naming and lexical decision. Language and Cognitive Processes 12: 767–808
Plaut D C, McClelland J L, Seidenberg M S, Patterson K 1996 Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review 103: 56–115
Plaut D C, Shallice T 1993 Deep dyslexia: A case study of connectionist neuropsychology. Cognitive Neuropsychology 10(5): 377–500
Plunkett K, Marchman V A 1993 From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition 48(1): 21–69
Plunkett K, Marchman V A 1996 Learning from a connectionist model of the acquisition of the English past tense. Cognition 61(3): 299–308
Rumelhart D E, Hinton G E, Williams R J 1986 Learning representations by back-propagating errors. Nature 323(9): 533–6
Rumelhart D E, McClelland J L 1982 An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model. Psychological Review 89: 60–94
Rumelhart D E, McClelland J L 1986 On learning the past tenses of English verbs. In: McClelland J L, Rumelhart D E, PDP Research Group (eds.) Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 2: Psychological and Biological Models. MIT Press, Cambridge, MA, pp. 216–71
Rumelhart D E, McClelland J L, PDP Research Group (eds.) 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations. MIT Press, Cambridge, MA
Seidenberg M S, McClelland J L 1989 A distributed, developmental model of word recognition and naming. Psychological Review 96: 523–68
Sejnowski T J, Koch C, Churchland P S 1988 Computational neuroscience. Science 241: 1299–1306
St. John M F, McClelland J L 1990 Learning and applying contextual constraints in sentence comprehension. Artificial Intelligence 46: 217–57
Ullman M T, Corkin S, Coppola M, Hickok G, Growdon J H, Koroshetz W J, Pinker S 1997 A neural dissociation within language: Evidence that the mental dictionary is part of declarative memory and that grammatical rules are processed by the procedural system. Journal of Cognitive Neuroscience 9: 266–76

D. C. Plaut

Cognitive Maps

1. Introduction

A cognitive map is a representative expression of an individual’s cognitive map knowledge, where cognitive map knowledge is an individual’s knowledge about the spatial and environmental relations of geographic space. For example, a sketch map drawn to show the route between two locations is a cognitive map: a representative expression of the drawer’s knowledge of the route between the two locations. Because a cognitive map represents or demonstrates an individual’s geographic knowledge, geographers, psychologists, and others use such maps as the principal means by which to assess how we learn, process, and store geographic information gained from primary (e.g., walking through an area) and secondary (e.g., reading a map) sources. An understanding of how we undertake these mental tasks is thought to be important because it reveals the fundamental cognitive processes and structures that underlie spatial decision and choice making, and thus spatial behavior: why we choose certain routes, places to visit, locations to live, and so on (see Spatial Cognition; Behavioral Geography).

2. Cognitive Mapping

The combined process by which we learn, store, and use information relating to the geographic world is

known as cognitive mapping, and this term is used to define the field of study which investigates this process. Cognitive mapping as a field of enquiry is relatively young. Whilst there are a handful of studies which predate 1960, the vast majority of research has occurred after this date and the publication of Kevin Lynch’s seminal work ‘The Image of the City’. Ever since Lynch’s ground-breaking book, cognitive mapping research has been a multidisciplinary endeavor, undertaken by geographers, psychologists, anthropologists, and computer and information scientists. However, whilst a vast amount of research has been conducted, as yet there is only limited consensus as to the fundamental processes of cognitive mapping. As such, there are a number of theories that seek to explain how we construct, process, and store cognitive map knowledge and how this knowledge is used to make spatial decisions and choices. These theories generally have been formulated on the basis of evidence provided by cognitive maps (see Downs and Stea 1977, Gärling and Golledge 1993, Portugali 1996, Golledge and Stimson 1997).

3. Cognitive Maps

Until very recently the bases and processes of human cognitive mapping were exclusively measured and assessed through cognitive (sketch) maps and other external forms of knowledge representation (e.g., estimating distances). More recently these methods of data generation have been supplemented by qualitative and neurological approaches. Moreover, there has been a transference of experimental setting from the laboratory to the natural environment.

Whilst a cognitive map is any ‘map’ which represents an individual’s knowledge of an area, it generally takes the form of a sketch map drawn on a sheet of paper (it could, however, be a drawing in sand or a map constructed out of natural material). In experimental conditions, subjects are given a sheet of paper and are asked to draw a map of a certain location, area, or route between locations. The scale of the geographic realm to be drawn can vary substantially from the global (e.g., draw a map of the world) to the local (e.g., draw a map of your neighborhood). Variants on this simple sketch mapping exercise include providing respondents with a small portion of the map to provide a scale and reference, and teaching subjects a sketch map language where specific symbols are used to denote particular features. By aggregating the cognitive maps of several individuals it is possible to determine their shared level of knowledge and which elements of an environment are most salient. This is the technique pioneered by Lynch. He analyzed individuals’ maps by classifying their elements into five different classes (see Table 1), which he then used to produce a composite map where the symbol size or shading density is proportional to the number of times an element appeared on the individual maps. Using this technique he aggregated the sketch maps of residents in Boston, Jersey City, and Los Angeles to create composite cognitive maps of these cities (see Fig. 1).

Table 1
Lynch’s classification

Category     Description
Paths        Paths are the channels along which an individual moves. They may include streets, walkways, railways
Edges        Edges are the linear elements not considered as paths. They are the boundaries between two phases, linear breaks in continuity such as shores or walls
Districts    Districts are the medium-to-large scale sections of the city, conceived as having a two-dimensional extent, which the observer mentally enters, and which have some common identifiable character
Nodes        Nodes are points, the strategic spots in the city into which an observer can enter, and which are the intensive foci to and from which (s)he is traveling. They may be primarily junctions, transportation changeovers, a crossing, or convergence of paths
Landmarks    Landmarks are another type of point-reference. They are usually a physical object such as a building, sign, store, or mountain

Figure 1
Cognitive maps of Boston, Jersey City and Los Angeles

The analysis used by Lynch is a content classification. A number of other classification schemes have been used to analyze cognitive maps. For example, there have been classifications that assess map style, structure, and accuracy. In these cases the focus moves beyond what elements an individual draws to assessing the relationship between the elements and their correspondence to the real world. In addition, the accuracy of the spatial relations portrayed can be analyzed statistically using spatial statistics. For example, in many studies bidimensional regression has been used to compare the geometry of the cognitive map to a cartographic map. Bidimensional regression is a two-dimensional equivalent of linear regression that quantitatively assesses scale, rotation, and translation differences between the actual and estimated pattern of responses.

Using the technique of sketch mapping to generate data about a person’s cognitive map knowledge and their ability to use this knowledge is not without criticism. For example, a number of researchers have argued that sketch maps have a number of qualities that make them unreliable and inaccurate measures of spatial knowledge (note not geographic knowledge): they are dependent upon drawing abilities and familiarity with cartographic conventions; they suffer from associational dependence where later additions to the sketch will be influenced by the first few elements that are drawn; their content and style are influenced by the size of paper used for sketching; they are difficult to subjectively score and code; and they often show less information than the respondent knows. As a consequence of these criticisms it is becoming less common for researchers to use sketch mapping as an analytical tool to assess individual and collective cognitive mapping. Instead, researchers are turning to a range of other techniques.

4. Other Ways to Measure Cognitive Map Knowledge

4.1 Other Action Measures

Other forms of action (e.g., drawing) measures can be divided into those that, like sketch mapping, are two-dimensional in nature and those that are unidimensional. Other two-dimensional measures include completion tasks and recognition tasks. Completion tasks require an individual to complete a task that has already been started for them. For example, spatial cued response tests require subjects to place locations in relation to locations that are preplaced. A highly cued version of this is the cloze procedure test, which requires a subject to ‘fill in’ a missing space (an aspatial example of which would be ‘a dog barks but a cat --- ---?’). Recognition tasks measure how successful subjects are at identifying spatial relationships. Iconic tests require the respondent to identify correctly features on a map or aerial photograph. Configuration tests require a subject to identify correctly which configuration, out of several, displays the correct spatial relations. Verifiable statement tests require subjects to identify whether a textual description of a spatial relationship is true or false. Spatial cued response data are often analyzed like sketch maps using bidimensional regression. Cloze procedure and recognition tests are analyzed by constructing an accuracy score that expresses the number of correct placements or recognitions as a percentage.

Unidimensional tests seek to uncover one-dimensional aspects of cognitive map knowledge such as distance and direction. These dimensions are thought to be representative of spatial knowledge in general, but are particularly useful for measuring levels of route (procedural) knowledge. Distance tasks are used to assess an individual’s knowledge of the distance between locations. In his review, Montello (1991) identifies five groups of tests designed to measure cognitive distance estimates: ratio scaling, interval and ordinal scaling, mapping, reproduction, and route choice. Ratio scaling adapts traditional psychophysical scaling techniques to a distance context, with subjects estimating the distance to a location as a ratio of some other known distance, such as an arbitrary scale or the length of a ruler. Interval and ordinal scaling are similar to ratio scaling but differ in their level of measurement: paired comparison requires a respondent to decide which one of a pair of distances is longer; ranking requires a respondent to rank various distances in order along the dimension of length; rating requires a respondent to assign the distance between places to a set of predetermined classes that represent relative length; partition scales require a respondent to assign distances to classes of equal-appearing intervals of length. Mapping is the measurement of distances from a sketch map for comparison with the actual distances. Reproduction requires a respondent to provide distance estimates at the scale of the estimated distance. Route choice consists of inferring judgments of cognitive distance from the choice of route an individual makes when asked to take the shortest route between two locations.

Like sketch map data, distance data can be analyzed individually or aggregated beforehand to provide data about a group. A common method to analyze ratio scaling, mapping, and reproduction data is to regress the cognitive distance estimates onto the objective distance values, observing the relationship between the two. Another common strategy, and one also used with interval or ordinal distance data, is to analyze the data using multidimensional scaling (MDS) techniques. MDS techniques explore the latent structure of a set of distance estimates by assessing the dimensionality of the data. They do this by constructing a two-dimensional space from one-dimensional data using a series of algorithms. In essence, they construct a ‘map’ showing the relationship between a number of objects. This ‘map’ can then be compared to an actual map using techniques such as bidimensional regression.

Direction tasks assess an individual’s knowledge of the direction between two locations. The most common direction task is pointing. Pointing involves standing at, or imagining being at, a location and pointing to another location. An alternative technique involves drawing on a compass the direction to a location from another. Direction estimates have been analyzed by comparing the estimates to the actual directions, often through a simple subtraction process. In other cases, a technique of projective convergence has been used to construct a ‘map’ from estimates by calculating where estimates to the same location but from different sites intersect.


4.2 Qualitative Approaches

In recent years there has been an increase in the use of qualitative methodologies to investigate cognitive map knowledge. In some cases, this has involved a scientific approach and in others an interpretative approach. The scientific approach continues the tradition described above, but rather than externally representing their knowledge through actions (e.g., drawing, pointing) individuals are required to describe verbally routes or layouts in experimental conditions. Like cognitive maps these data can be analyzed for content, style, structure, and accuracy. The interpretative approach, however, is less structured in terms of data collection. It posits that talking to and observing individuals as they interact with an environment reveals information concerning spatial behavior. Such an approach might seek to gain a spatial understanding of an area by adopting a strategy of in-depth interviews, discussing the reasoning behind spatial decision making.

4.3 Neurological Approaches

In contrast to the approaches above, which seek to understand the process of cognitive mapping by examining external measures (action or verbal), neurological approaches measure neural activity within the brain. A set of brain-scan techniques exists, differentiated by their temporal and spatial resolution. Magnetoencephalography (MEG) has a high temporal resolution (0–300 ms) but low spatial resolution (providing only a general indication of neural activity). Positron Emission Tomography (PET) is the converse, with functional Magnetic Resonance Imaging (fMRI) providing a middle ground for both parameters. To reveal information about spatial cognition, scans are taken as the individual undertakes a series of spatial, problem-solving tasks. As such, neurological approaches seek to explain spatial thought and spatial behavior by identifying their neural, physiological bases rather than their psychological basis. To date there has been very little work that has sought to marry the findings of neurological and psychological research to provide a comprehensive, physiological and psychological model of cognitive mapping.

4.4 Naturalistic Settings

Many studies, particularly from psychology, which seek to understand cognitive map knowledge are conducted in controlled laboratory settings. For example, respondents may be required to learn the layout of objects in an experimental laboratory, and then to map the location of objects, or estimate distance and directions between objects. The laboratory is attractive to the researcher because they can control and monitor all the variables that may affect spatial learning and processing. Some researchers, however, question the ecological validity of this approach given that cognitive map knowledge concerns the geographic environment and it is this environment in which spatial behavior occurs. For these reasons they suggest that testing should occur in the natural environment, and this is becoming increasingly common.

5. Summary

Cognitive maps are representative expressions of spatial knowledge. They are part of a wider set of analytical measures which seek to determine how we learn, store, and process knowledge of the geographic environment. These measures are important because they reveal fundamental aspects of cognition and reveal the cognitive processes that underlie spatial decision and choice making.

See also: Hippocampus and Related Structures; Knowledge (Explicit and Implicit): Philosophical Aspects; Mental Maps, Psychology of; Spatial Thinking in the Social Sciences, History of

Bibliography

Downs R M, Stea D 1977 Maps in Minds: Reflections on Cognitive Mapping. Harper & Row, New York
Gärling T, Golledge R G (eds.) 1993 Behavior and Environment: Psychological and Geographical Approaches. North Holland, Amsterdam
Golledge R G, Stimson R J 1997 Spatial Behavior: A Geographic Perspective. Guilford Press, New York
Lynch K 1960 The Image of the City. Technology Press, Cambridge, MA
Montello D R 1991 The measurement of cognitive distance: Methods and construct validity. Journal of Environmental Psychology 11: 101–22
Portugali J (ed.) 1996 The Construction of Cognitive Maps. Kluwer, Dordrecht

R. Kitchin

Cognitive Modeling: Research Logic in Cognitive Science

Cognitive science is a genuinely interdisciplinary field, which owes its existence to the insight that, in different disciplines, interesting research was based on the common assumption that cognition could be regarded as computation (see Artificial Intelligence in Cognitive Science; Cognitive Science: Overview). It follows that if cognition is computation, theories of cognition should be specified in terms of representations and the computational steps performed on them. Thus, cognitive modeling follows naturally from the basic tenet of cognitive science.

Cognition has been addressed by philosophy for at least 2,500 years, by psychology since well into the nineteenth century, and by artificial intelligence since the mid-twentieth century; anyone would be ill-advised to mistake cognitive science for the only science of cognition. In fact, it overlaps considerably with cognitive psychology and parts of several other disciplines. Cognitive modeling is a unifying methodology for the whole field of cognitive science.

Cognitive modeling combines research methods of vastly different origin. The first group consists of techniques of formal analysis of tasks and systems, usually from philosophy, logic, theoretical linguistics, mathematics, physics, and the foundations of computer science. The second group consists of the empirical methods used predominantly in experimental psychology and in neuroscience, which are used to test models for cognitive adequacy. Finally, the third group of methods consists of the programming techniques developed in artificial intelligence, which are used to build working computer models. As a whole, the methodology combines formal and empirical analysis with constructive synthesis.

To identify cognitive modeling with computer simulation would be wrong for two reasons: first, we would ignore that building the computer model is just one, albeit essential, part of the methodology. Second, a computer simulation may be successful if it produces the same kinds of results, such as a commercial chess

environment. Cognition therefore implies modeling the environment, the system itself, and other systems (as in discourse models used in communication).

Science in general aims at constructing models. Cognitive science attempts to model cognition in biological as well as technical systems. And since model building (in the sense discussed above) is an essential part of cognition, scientific models of cognition are about how cognitive systems construct models of their environment and of themselves. In short, cognitive modeling amounts to second-order modeling (and it is important to keep both levels well apart).

1.2 General Characteristics of Models

A model is a mapping from an empirical domain (a set of elements and certain relations defined between them) to another one, the model domain (often a numerical one, as in measurement). Modeling is constrained on both sides. The empirical domain, as viewed for modeling, is a highly reduced abstraction, and the model domain (a formal system) often includes relations that are not relevant to the model at all.

Scientific models are abstractions in the sense that the empirical domain grasped by the model comprises only part of the objects (elements) and their relations. Which ones to select depends on the epistemological
program, but a cognitive model (e.g., of a human interest, i.e., on the theoretical perspective as well as
cognitive function) must arrive demonstrably at on intended applications. For instance, a typical
the same results through the same kinds of comput- driver’s map of some region ignores geology, climate,
ations (see also Social Simulation: Computational biology, etc., and concentrates exclusively on roads,
Approaches). abstracting even there from most of the details. In
cognitive modeling, we usually focus on certain aspects
of mental representations (e.g., of a memory trace),
and some relation or relations, which may be as
1. Cognitie Modeling as Second-order Model different as association (linking two elements) and
Construction entailment (the semantic relation of logical implication
between two statements).
On the model side, we must be careful to define
1.1 Epistemological Perspectie
which of the many known relations in the formal
Cognition comprises sophisticated means of a system’s system being used to model the empirical domain is
adaptation to its environment, notably planning, part of the model. This is well known from psycho-
which in turn draws on anticipation of the results of logical scaling, where only a few of the relations
actions. Anticipation rests on learning and, in its most known to hold among real numbers (if those have been
advanced forms, on episodic memory. Planning and chosen as the model domain) are valid model charac-
decision rely on mental representations of the system’s teristics. For example, only the relation  may be
environment (world model) and the system as an actor valid for an ordinal scale, while differences and ratios
in it (the system’s self-model). The construction of between numbers are meaningless. Likewise, the com-
these models is constrained by the interaction between puter programs typically chosen in cognitive modeling
system and environment. In the case of organisms, the exhibit a wealth of parameters and other details of
necessities of evolution have ensured that internal data structures and control flow, some of which are
world models are sufficiently realistic to be adaptive certainly irrelevant as an aspect of the model. With
and ensure the survival of the species. In technical computer programs, however, it is much more difficult
cognitive systems (e.g., autonomous robots), the need to analyze just which relations are valid parts of the
also arises to represent important features of the model (see Sect. 5.2).
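
The point about ordinal scales can be illustrated with a minimal sketch. It assumes a made-up set of four task ratings (the names `coding_a`, `coding_b`, and the task labels are hypothetical, not taken from any study): two numerical codings related by a strictly increasing transformation agree on every order comparison, but disagree on a comparison of differences, which is why only the order relation counts as part of the model.

```python
# Hypothetical ordinal ratings of four tasks (names and values are illustrative).
coding_a = {"task1": 1, "task2": 2, "task3": 3, "task4": 4}
# A strictly increasing recoding models the same empirical order equally well.
coding_b = {name: value ** 3 for name, value in coding_a.items()}


def at_least(x, y, coding):
    """Order relation: is x rated at least as high as y under this coding?"""
    return coding[x] >= coding[y]


def gap_is_larger(pair1, pair2, coding):
    """Difference comparison: is the gap in pair1 larger than the gap in pair2?"""
    (a, b), (c, d) = pair1, pair2
    return abs(coding[a] - coding[b]) > abs(coding[c] - coding[d])


names = list(coding_a)
# Every pairwise order comparison gives the same answer under both codings.
order_preserved = all(
    at_least(x, y, coding_a) == at_least(x, y, coding_b)
    for x in names
    for y in names
)
# The same comparison of differences gives different answers.
gap_a = gap_is_larger(("task4", "task3"), ("task2", "task1"), coding_a)
gap_b = gap_is_larger(("task4", "task3"), ("task2", "task1"), coding_b)

print(order_preserved, gap_a, gap_b)  # prints: True False True
```

Because the answer to the difference question depends on an arbitrary choice of coding, it says nothing about the empirical domain; the order comparisons, by contrast, are invariant.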


2. Model Domains for Cognitive Modeling

The traditional framework of cognitive modeling has been defined by Marr (1982) and Newell (1982). They distinguish an abstract level of cognitive theory (the knowledge level) from the level of description typically chosen in cognitive modeling (symbol level, or algorithmic level). The level below that (implementation level) is considered as irrelevant to cognitive modeling. This does not exclude the construction of computational models for specific implementations, however. Such models have been used with much success as mediators between psychological and neurobiological analyses of brain functioning.

2.1 Symbol-processing Approaches

According to Newell and Simon (see Artificial Intelligence: Connectionist and Symbolic Approaches; Cognitive Science: Overview; Problem Solving and Reasoning, Psychology of), cognitive processes are symbol transformations on arbitrarily complex symbol structures (i.e., mental representations). Accordingly, the classical approach to cognitive modeling aims to construct programs that manipulate symbol structures of compositional semantics by means of algorithms taken from artificial intelligence (e.g., heuristic search). This approach adheres closely to the Turing Machine model of computation.

This approach provides considerable degrees of freedom regarding how to go about constructing models. In practice, most of the work has made use of a production system architecture, or of a declarative knowledge base coupled with an inference algorithm. Therefore, the if–then rules of production systems and logical statements (e.g., the Horn clauses used in Prolog) are among the most widely used formalisms for cognitive modeling. Semantic networks, or frames, with inheritance (the is-a relation) are another well-known example of this approach.

2.2 Connectionist Approaches

These approaches are different with respect to the algorithmic level. Simple elements or ‘nodes’ (which may be regarded as abstract neurons, see Artificial Intelligence: Connectionist and Symbolic Approaches; Connectionist Approaches) are connected in a more or less pre-specified way, the connectionist network’s architecture. Each element’s output is a function of its inputs integrated over time, and is passed on to other nodes that are connected with it. Two groups of connectionist models can be distinguished according to the semantics of representation employed: parallel distributed processing (PDP) and localist networks. In the latter each node is a representation of something (e.g., a concept), whereas in PDP it is the vector of activation values taken over a number of nodes that has representative character. This aspect of PDP models has been highlighted as pertaining to a ‘subsymbolic’ level by Smolensky (1988), who also stresses that artificial neural networks define a computational architecture that is nearer to symbol processing than to biological neural networks.

2.3 Nonlinear Dynamics and Other Approaches

Connectionist models, relying on differential equations rather than logic, paved the way to simulations of nonlinear dynamic systems (imported from physics) as models of cognition (see also Self-organizing Dynamical Systems).

Purely descriptive mathematical models have also been used in cognitive science, of course, but they do not take the form of an implemented computer program, and hence cannot be considered to be at the heart of cognitive modeling, but rather to be part of the formal analyses typically executed to arrive at sound specifications for cognitive models (see Mathematical Models in Philosophy of Science).

3. Cognitive Modeling and Cognitive Architectures

3.1 Unified and Modular Theories of Cognition

The process of cognitive modeling makes use of computational architectures, as we have seen for symbol processing and connectionist frameworks. As a special case, the framework may be a general theory about the architecture of the human mind, usually called a cognitive architecture.

Relying on a general cognitive architecture is to assume that all cognitive processes instantiate the same principle (e.g., firing a production rule). The present state of cognitive science casts doubt on the reasonableness of this assumption. The functional organization of the human brain is such that cognitive functions may be highly specialized, neuroanatomically focused, not open to introspection, and working in parallel with other cognitive processes (Kolb and Wishaw 1990).

3.2 General Cognitive Architectures

All these architectures take the form of a production system. This computational architecture, developed in the 1960s, comprises knowledge in the form of production rules (if–then rules), contained in a permanent memory, plus a working memory of unlimited capacity and a rule interpreter for control. Several lines of development have led to a number of systems that claim to be both a unified theory of human
cognition and a program development environment for cognitive modeling, among them SOAR (Laird et al. 1987, Newell 1990), ACT, known best in the versions ACT* (Adaptive Control of Thought; Anderson 1983) and its revision, ACT-R (Atomic Components of Thought; Anderson and Lebiere 1998), and others such as C, E, or P. Apart from being production systems at heart, all these architectures differ markedly. For instance, SOAR relies on productions only, whereas ACT* also has a declarative memory (a spreading activation network of unlimited capacity). Both address learning, but differently (see Johnson (1998) for a more extensive comparison of ACT and SOAR). ACT-R (following E there) now also includes a number of perceptual and motor components. This means that at least some of these architectures have grown beyond a uniform approach.

The advantages of modeling within a cognitive architecture are obvious. Much of the work of modeling has already been done, and programmers can use special predefined functions. At least the architectures of long-standing tradition (ACT and SOAR) boast a number of successful empirical tests and applications (see also Cognitive Theory: ACT; Cognitive Theory: SOAR). Drawbacks are that the modeler is confined to a specific architecture and its means, and that these architectures have grown so big that it becomes difficult to assess an architecture’s relevance to some specific small-scale model.

3.3 Alternative Computational Frameworks

Cognitive modeling may be based on almost any computational model. These may be specialized (tailored) processing architectures of artificial intelligence, such as the blackboard architecture (Hayes-Roth 1985), case-based reasoning (see Problem Solving and Reasoning: Case-based), or architectures for autonomous social agents (see Autonomous Agents). Beyond symbol processing, we find connectionist models of the PDP (see McClelland (1999) for an overview) or the localist orientation (Page 2000). Arbitrary combinations of different approaches (hybrid systems) may also be employed (see also Artificial Intelligence: Connectionist and Symbolic Approaches).

It may be advantageous to base one’s model of a specific cognitive function on a computational framework that is ideally suited for the task, and do without the massive overhead of the big cognitive architectures. The usual drawback is that one cannot build on the long work of others. COGENT (Cooper and Fox 1998) promises to ease the work of individual model building by providing, in modern object-oriented programming style, a toolkit for cognitive modeling with a user-friendly graphical interface for development.

At present, it is difficult to recommend a particular computational framework for cognitive modeling. What works best, or what is easier to develop, depends heavily on one’s own experience, and on the cognitive function one wants to model.

4. Cognitive Modeling Produces Theories

The computer programs resulting from cognitive modeling have the status of a well-formulated theory about cognition. They have the advantage of being explicit (no computer would execute a ‘magic’ command) and fully specified (for the same reason), which means that theories cannot focus on certain issues while leaving others in the dark, something that is just too easily done on paper. As soon as the program has been implemented successfully on a computer, and produces the expected output, this is proof of the logical consistency of the theory and positive evidence (proof being impossible) for its adequacy.

4.1 Generative Theories as Compared with Scientific Laws

The most important characteristic of cognitive modeling is that it results in generative theories: computer programs that not only explain, but actually produce, the cognitive phenomena in question. This is very different from ‘natural laws’ stating, for example, that in our universe, nothing can travel with a speed beyond that of light.

Models in psychology and other behavioral and social sciences usually take the form of a system of equations for computing the value of some variables given the value of some other variables. These models, however, do not give a detailed explanation (as in an algorithm, i.e., a step-by-step computation without any ‘magical’ operations) of how the resulting values are arrived at. Being generative is the main advantage of cognitive models.

4.2 Validation Strategies in Cognitive Modeling

Cognitive modeling is more than just constructing a computer program that produces data more or less indistinguishable from ‘real data’ as gathered in psychological experiments. As a methodology, it includes testing and comparing models. This is where the pitfalls of cognitive modeling lie: cognitive models do not lend themselves easily to the falsification strategy (Popper 1968) usually recommended in science.

First, anyone who has constructed a model that seems to work well is reluctant to focus on its weaknesses (but this is typical of all science: no one should misinterpret Popper as obliging individual scientists to falsify their own theories; it is sufficient
that there are other scientists around to do that.) Hayes-Roth B 1985 A blackboard architecture for control.
Second, the programs that are the result of cognitive Artificial Intelligence 26: 251–321
modeling are often highly complex, and their relation Johnson T R 1998 A comparison between ACT-R and SOAR.
In: Schmid U, Krems J F, Wysotzki F (eds.) Mind Modelling:
to empirical data cannot been tested exhaustively, but
a Cognitie Science Approach to Reasoning, Learning and
only for specific cases. Third, there is a fundamental Discoery. Pabst, Berlin, pp. 17–37
problem: it is always the case that many models (each Kolb B, Wishaw I Q 1990 Fundamentals of Human Neuro-
consisting of a representational and an operational, or psychology. Freeman, San Francisco
process, part) may fit the same data. Laird J E, Newell A, Rosenbloom P S 1987 SOAR: An archi-
Criteria for the empirical assessment of models are: tecture for general intelligence. Artificial Intelligence 33: 1–64
good fit to data in essential aspects (e.g., difficulty of Marr D 1982 Vision. A Computational Inestigation into the
tasks is reflected in computational effort in the model; Human Representation and Processing of Visual Information.
the model produces preferences, or errors, of the kind Freeman, San Francisco
McClelland J L 1999 Cognitive modeling: Connectionist. In:
found in human subjects, etc.), and formal qualities
Wilson R A, Keil F C (eds.) The MIT Encyclopedia of the
(e.g., the fewer parameters on which the model Cognitie Sciences. MIT Press, Cambridge, MA, pp. 137–44
depends, the better: a variant of Occam’s razor). Newell A 1982 The knowledge level. Artificial Intelligence 18:
Typically, some empirical studies give rise to the 87–127
model, and further empirical studies are used to test it. Newell A 1990 Unified Theories of Cognition. Harvard University
Formal analyses of the tasks used in the domain Press, Cambridge, MA
(e.g., analyzing the syntactic structure and complexity Page M 2000 Connectionist modelling in psychology: A localist
of sentences in computational psycholinguistics) can manifesto. Behaioral and Brain Sciences 23: 443–512
and should be used not only to find hints as to how a Popper K R 1968 The Logic of Scientific Discoery, 5th edn.
Hutchinson, London
model could be constructed, but also to exclude classes
Scarborough D, Sternberg S (eds.) 1999 Methods, Models, and
of models. Conceptual Issues. MIT Press, Cambridge, MA
Smolensky P 1988 On the proper treatment of connectionism.
Behaioral and Brain Sciences 11: 1–74
4.3 An Ealuation of Cognitie Modeling
Cognitive modeling is the research methodology which G. Strube
follows from the basic tenet of cognitive science that
the essential aspect of cognition is that it is a
computational process. Its main assets are that it
produces theories which are explicit, complete, con-
sistent, and generative. Formal analyses of the domain Cognitive Neuropsychology, Methodology
as well as empirical studies are required both as a
prerequisite for the construction of models, and as the
of
means of validating these models, e.g., to test their
generalizability. Cognitive science since the 1970s has Cognitive neuropsychology is the study of what one
demonstrated that this is an extremely successful can learn about the organization of the cognitive
research strategy. system from observing the behavior of neurological
For further information, the reader is referred to patients. Its utility derives from one aspect of the
Scarborough and Sternberg (1999) for an excellent organization of the cerebral cortex, namely the relative
collection of approaches to cognitive modeling. localization of function, such that different regions of
the cortex are differentially involved in different types
See also: Computational Approaches to Model Eva- of cognitive process. Thus patients can have disorders
luation; Connectionist Models of Concept Learning; that are relatively specific to a single cognitive process,
Connectionist Models of Development; Connectionist and so investigation of their impairments can provide
Models of Language Processing; Knowledge Re- critical information on the nature of the process.
presentation; Network Models of Tasks; Neural One still controversial example is that of the so-
Networks: Biological Models and Applications; Sci- called ‘category-specific disorders’ in which the patient
entific Discovery, Computational Models of loses the ability to identify one particular class of item.
Thus certain patients with herpes simplex encephalitis
can virtually entirely lose the ability to identify
Bibliography animals, plants, and foods, but show a far better
preservation of their knowledge of human artifacts
Anderson J R 1983 The Architecture of Cognition. Harvard
(Warrington and Shallice 1984). This is a loss of
University Press, Cambridge, MA
Anderson J R, Lebiere C 1998 Atomic Components of Thought. knowledge and not of early perceptual processes or the
Lawrence Erlbaum Associates, Hillsdale, NJ lower levels of language. By contrast other patients
Cooper R, Fox J 1998 COGENT: A visual design environment can be considerably better at identifying the ‘animate
for cognitive modeling. Behaior Research Methods, Instru- categories’ than human artifacts (Warrington and
ments and Computers 30: 553–64 McCarthy 1983). This double dissociation was orig-

2128
Cognitie Neuropsychology, Methodology of

inally interpreted in terms of the semantic repre- novel theoretical possibility. An example of the first
sentations of sensory quality knowledge being stored type of situation is the study of patient KF (Shallice
separately from knowledge about function, the latter and Warrington 1970), where it was shown that the
being the more critical for identifying artifacts. Such patient performed quite normally on a variety of tests
an approach has been elaborated into quite complex of auditory-verbal long-term memory but was grossly
models (e.g., Farah and McClelland 1991). Alternative impaired on measures of what would now be called the
ways of modeling the double dissociation have, how- ‘phonological buffer’ (Baddeley 1986), such as the
ever, been produced (e.g., Gonnerman et al. 1997, but recency component of the (auditory-verbal) free recall
see Garrard et al. 1998). Moreover the original results serial position curve. This was in conflict with memory
have led to extensive functional-imaging investigations theory of the time, which held that the laying down of
of the regions activated when different types of information in the long-term (secondary) memory
knowledge are concerned (e.g., Moore and Price 1999). system required that it be retained initially in a short-
Very different methodologies have been used when term (primary) memory system. The study showed
patients are studied experimentally for what they can that measures of auditory-verbal long-term memory
tell about the organization of cognitive functions. In could be normal even if measures of auditory-verbal
the early twentieth century, when the nineteenth- short-term memory were grossly impaired, and was
century method of rather casual clinical descriptions held to be in conflict with a serial organization of the
was rejected for a more scientific approach, patients short- and long-term verbal memory systems.
began to be studied as groups. Initially this took the An example of the second type is the analysis of the
form of a series of case histories (e.g., Head 1926), but writing disorder of patient LB (Caramazza et al. 1987).
in the 1950s and 1960s it became normal to average In this study it was shown that semantic variables did
behavior over patients with particular characteristics. not influence LB’s rate of spelling errors, and also that
Groups could be defined by their gross lesion site lexical variables had little effect, but that performance
(e.g., Milner 1958), or in terms of their global syn- was strongly dependent on word length. Moreover the
drome (e.g., in language fluent vs. nonfluent aphasia errors could nearly always be interpreted in terms of
deriving from the work of Goodglass et al. 1964). single or double operations on single letters, such as
This approach has remained the standard one for substitutions, deletions, insertions, and transpositions.
those approaching neuropsychology with either a The pattern was held to support the existence of a
physiological psychology or clinical background. graphemic buffer which held abstract letter-level in-
However in the 1960s and 1970s researchers more formation while words were being written.
influenced by cognitive psychology began to use single- Studies of these sorts should involve either two or
case studies increasingly because they were held to three stages following the stage of patient selection.
capture more satisfactorily from a theoretical per- First, one needs to carry out standard clinical (baseline)
spective the essence of selective disorders correspond- tests. The single-case study methodology differs from
ing to impairments of single cognitive processes. that in most other sciences in that replication of any
Indeed, in the 1980s it became popular amongst this given result is not generally possible for other re-
group of researchers to argue that only single-case searchers in the field. If a highly counterintuitive result
studies could be properly said to produce behavioral is obtained with apparently major theoretical conse-
findings from neurological patients from which one quences, such as the initial observations of category-
could draw theoretical conclusions about the organiz- specific semantic memory dissociations (e.g., Nielson
ation of the normal cognitive system (Ellis 1987). The 1946) or the observations of object-based neglect
position was best articulated by Caramazza (1986), described in patient NG (Caramazza and Hillis 1990),
whose argument was essentially that the behavior of then it is not open to other workers in the field to
the average of a group of subjects was not necessarily attempt to replicate it. They generally do not have
equivalent to that of any possible patient in the group. access to the patient. Therefore it is necessary that as
The argument will be discussed in more detail below. rich a description of the patient’s general cognitive
One may now distinguish a variety of different types characteristics are given so as to provide the raw
of experimental methods in regular use in cognitive material for other researchers in the field to produce
neuropsychology. However I will begin with the alternative explanations of any novel experimental
simplest and arguably the most powerful method. findings. This is best provided by giving the patient’s
performance on standard clinical tests in the particular
cognitive domain.
Second, the inestigation will inole tests used to
examine particular theoretical possibilities for the
1. The Single-case Study patient’s difficulties. These can be obtained from the
experimental psychological or neuropsychological
In this type of study one selects for study a patient literature or be specifically constructed. It is normally
whose disorder appears to be in conflict with the necessary that control data from normal subjects also
implications of an existing theory, or to suggest a be provided. Again because of the general impossi-

2129
Cognitie Neuropsychology, Methodology of

bility of replicating investigations in findings in single- (d) The tasks N to Nj, on which performance is
case studies, all theoretical inferences drawn from the impaired, are not "more demanding of some critical
patient’s performance should be based on the results cognitive resource than tasks Y to Yi where per-
of at least two experiments, or on a patient-internal formance is intact. "
replication of a single experiment. It is inappropriate The most complex of these is the last, the pre-
to draw theoretical inferences from a single unrepli- supposition that for most behavioral purposes the
cated experimental finding. effects of damage to a subsystem can be ordered on a
The third and optional stage of the procedure is a single dimension—resource. It has been argued by
detailed analysis of the patient’s performance in certain Glymour (1994) that the introduction of this concept
tests, rather than just using oerall scores. This can makes cognitive neuropsychology a theoretically in-
include the effects of relevant stimulus variables, of tractable enterprise, as far as the determination of the
different error types, of consistency of performance on modular functional architecture using the observed
the same items across replications of the same material, overall pattern of performance of patients is con-
and so on. cerned. Glymour argues that the degrees of freedom
The theoretical utility of the single-case study needs open to possible explanations would be greater than
to be considered in the context of two different types of the constraints imposed by the observations of task
model. The first are models of the global cognitive performance. However, that analyses based on the
architecture in a given domain in which the com- resource concept can in practice be used to differentiate
ponents are functional subsystems (Posner 1978) that between alternative models has been shown in the
are anatomically isolable and so can be individually concrete case of phonological output buffer disorder
damaged by neurological disease. Such models can be by Shallice et al. (2000).
characterized as having a modular functional archi- The introduction of the resource concept allows one
tecture where the term ‘modular’ is used with a less to infer that if a complementary set of dissociations is
specific referent than in Fodor’s (1983) sense. For such observed in a second patient producing a double
models, when a single subsystem is damaged, one will dissociation, in at least one of the two cases, the pattern
obtain a dissociation between impaired performance of Y to Yi contrasting with N to Nj cannot be
on tasks which require the subsystem and intact "
explained "
merely in terms of differential resource
performance on tasks which do not require it. A demands on a single subsystem being made across the
standard move in cognitive neuropsychology is to set of tasks (see Shallice 1988, Chap. 10 for extended
treat the existence of a set of dissociations in a patient discussion). It should be noted, however, that these
between tasks Y to Yi and task N to Nj as evidence in arguments are restricted to modular cognitive archi-
favor of models" where the pattern " can arise from tectures. The existence of a double dissociation does
damage to one or more subsystems or connections, not entail that the underlying system is modular; other
and against models where no such combination can possibilities include a continuum of processing space,
produce the observed oerall pattern of performance. within which different regions are damaged, over-
This however depends on the following assumptions lapping processing regions, and systems in which
which are derived in part from those of Caramazza different levels or aspects of operation of a subsystem
(1986) (see Shallice 1988 for a much more extensive discussion):
(a) Cognitive subsystems are qualitatively and quantitatively similar across individuals, at least by comparison with the effects of neurological disease.
(b) The task procedure used by the patient, in the sense of the cognitive schemas that are controlling the overall operation of the collection of functional subsystems involved, does not differ markedly from that in use by normal subjects. This implies that the patient is not using a strategy unusual in normals, as, for instance, in letter-by-letter reading, which involves sounding out each letter in turn and then attempting to read by combining them into a word, as is found in the syndrome pure alexia.
(c) Reorganization of function following the lesion has had no more than a secondary effect in task performance. Clearly if the patient’s cognitive system is qualitatively different from a normal system in its organization of subsystems, it will not be possible to make inferences from the patient’s performance to the organization of the normal system.

can be separately damaged, as for instance on-line processing and the capacity to learn (see Shallice 1988, Chap. 11). However, one alternative which does seem to be ruled out is an equipotential distributed system (Bullinaria and Chater 1995, but see Plaut 1995).
Inferences in neuropsychology from impaired behavior to mechanism are not, however, restricted to inferences based on the overall pattern of performance and the global organization of a modular functional architecture (see McCloskey 2001). The second type of model comprises those specifying how particular subsystems operate. More detailed aspects of the patient’s performance can give more specific information about the operation of a processing system. Thus, in attentional dyslexia, patients show migrations of letters from one word to another (e.g., ‘flip snag’ to ‘snip flag’) (Saffran and Coslett 1996). This is compatible with damage to a mechanism which specifies over which parts of the visual field the results of letter-form level analysis are admitted for higher-level processing (see Mozer 1991 for more detailed discussion of the theoretical implications of the disorder).
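
As a purely illustrative sketch, and not the model discussed by Mozer (1991), the letter-migration pattern can be mimicked with a toy gate that decides which words' letters are admitted for word-level processing. The function name `candidate_readings`, the `window_intact` flag, and the assumption that migrating letters keep their within-word position are all hypothetical choices made for this example.

```python
from itertools import product


def candidate_readings(words, attended, window_intact=True):
    """Return the letter strings that could reach word-level processing.

    words: equal-length strings shown side by side.
    attended: index of the word the reader is trying to report.
    window_intact: if True, only letters of the attended word are admitted;
        if False (a stand-in for the damaged mechanism), letters of every
        word are admitted, so each letter position has several candidates.
    """
    length = len(words[attended])
    if any(len(w) != length for w in words):
        raise ValueError("this toy model assumes equal-length words")
    admitted = [words[attended]] if window_intact else list(words)
    # For each letter position, collect the letters admitted at that position.
    slots = [sorted({w[i] for w in admitted}) for i in range(length)]
    return {"".join(letters) for letters in product(*slots)}


display = ["flip", "snag"]
intact = candidate_readings(display, attended=0, window_intact=True)
damaged = candidate_readings(display, attended=0, window_intact=False)

print(intact)             # only the attended word is available
print("snip" in damaged)  # a migration response becomes possible
print(len(damaged))       # number of position-preserving letter mixtures
```

With the gate intact only ‘flip’ is available; with the gate open, position-preserving mixtures such as ‘snip’ and ‘flag’ also become possible responses. The sketch says nothing about how often such errors would occur, only that this kind of damage makes them possible.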


2. The Multiple-case Study and the Functional Syndrome

Any individual patient’s performance can be influenced by a variety of factors which reduce its relevance for inferences to normal processing. The patient may have an idiosyncratic functional or anatomical organization of their cognitive system, use an unusual strategy in some tests, or the disorder may even be influenced by psychiatric factors, so-called ‘functional overlay.’ One pragmatic way of partially guarding against these possibilities is to report two or more patients showing the same pattern of symptoms.
Second, if a dispute arises about the interpretation of empirical aspects of the report of an individual case, there is no way it can be resolved, other than by further investigation of the patient’s difficulties, or reanalysis of existing findings. In practice these approaches are only rarely undertaken. Therefore, the recourse tends to be to find alternative arguments on the basic theoretical position held, in which case the original findings become redundant. (An example of an unresolvable theoretical question in the individual patient is the argument over whether evidence for syllabic structure in graphemic buffer disorder patient LB represents the effects of orthographically or phonologically based processes; see Caramazza and Miceli 1990, Jonsdottir et al. 1996).
One way of making the neuropsychological database more solid is to establish functional syndromes, or characteristics held in common by all of a group of patients. However, a combination of characteristics can arise because of the anatomical proximity of different functional systems, producing so-called associated deficits. The concept of functional syndrome has therefore often been rejected metatheoretically on the grounds that a single counterexample can lead to the collapse of the putative functional syndrome. This is not necessarily the case. The functional syndrome may fractionate with both subvarieties being theoretically interpretable; one example is the two varieties of so-called letter-by-letter readers who circumvent the word-reading difficulty induced by their pure alexia. One group uses the names of the letter forms and knowledge of the spelling; the other uses a sounding-out strategy (see Patterson and Kay 1982). Moreover, the overall performance of a single patient can also derive from multiple impairments. This is not therefore an argument against the functional syndrome perspective, given that one restricts functional syndromes to ones in which all patients behave in the same fashion, and treats the subsyndromes resulting from fractionation of an original functional syndrome as each requiring theoretical explanation. The functional syndrome is therefore a concept which has the utility of producing a more solid database on which to assess alternative theories (see, for example, Shallice et al. 2000 for a concrete example of the approach).
Functional syndromes are of relevance for theories other than modular ones. Connectionist models can predict functional syndromes with complex characteristics too (see, for example, Mozer and Behrmann 1990). However, a different type of multiple single-case study is more usually relevant to this type of model. This is the use of a database for confronting theory of all patients showing deficits in a particular processing domain; their behavior is then individually fitted, but using different parameter values within a common underlying model. The prototypic example of this approach is the study of Dell et al. (1997) on naming errors in aphasics (but see Ruml and Caramazza 2000 for detailed criticism).

3. The Anatomically Based Group Study

A method which is very frequently used is to group patients according to the location of lesion and average the results of each group. A potential problem with this approach is that patients in a given group may vary qualitatively as well as quantitatively. The reason is simple. The defining anatomical region will typically involve subsystems other than that which is the critical theoretical focus of the study. Thus Caramazza’s (1986) argument on the functional heterogeneity of the group becomes relevant, in particular that the pattern of performance of the average of the group may not correspond to that of any individual representative of the group.
This possibility should not, however, be thought to invalidate the approach completely. While it presents a theoretical danger, its potency depends upon the nature of the averaged data and the nature of the theory. Thus, for certain types of theories the same problem occurs when the findings of normal subjects are confronted with theory. Second, for certain types of evidence, in particular double dissociations, it requires conditions that are a priori highly unlikely for the pattern to arise as an averaging artifact and not be present in individual complementary pairs of subjects within the contrasting groups (see Shallice 1988, Chap. 9). Thus the existence of the double dissociation in the group data virtually always entails it occurring in specific pairs of individual subjects. This means that the situation when making inferences from the overall pattern of performance of groups to theory is virtually equivalent to that when making inferences from individual subjects. I know of no actual case where theoretical inferences from an anatomically based group study have been criticized as being vitiated by an averaging artifact.
A further key issue in anatomically based group studies is how to define the relevant region for patient inclusion. Typically, regions are defined a priori, such as lesions being confined to or involving a particular lobe. However, approaches are being developed in which determination of group membership is indirectly influenced by statistical procedures for analyzing functional-imaging results, such as statistical parametric mapping (SPM) (Friston 1997). This has resulted in the development of more complex procedures for determining the specification of the anatomical regions most appropriate for most sharply differentiating good and poor performance on a given measure. For instance, Stuss et al. (1998) have used the classification and regression tree (CART) procedure (Breiman et al. 1984) to assign patients to subgroups.

4. The Functionally Based Group Study Approach

An alternative methodology is to constitute groups according to functional characteristics, such as when studies contrast Broca’s aphasics and Wernicke’s aphasics, or amnesics and normal controls. In many respects the methodological issues raised are similar to those of the previous approach. However, a critical problem relates to the criteria used for group definition, and hence the possibility of replication. In studies contrasting amnesic and control groups, for instance, this has often been a critical issue (see Butters and Cermak 1974). However, in aphasia research it has been a key issue since the 1980s. Thus there has been a very long-standing and as yet unresolved controversy on whether the comprehension performance of Broca’s aphasics mimics their production problems, in the sense of showing a dissociation between comprehension of active and passive sentences. Some early cases seemed to indicate that in certain patients it did not (see Berndt et al. 1996). However, Grodzinsky et al. (1999) reject this meta-analysis on the grounds that it applied too loose criteria for patient selection. Some patients included, it was argued, were not Broca’s aphasics. Caramazza et al. (2001) have pointed out that Berndt et al.’s findings are still obtained when the more restrictive criteria, endorsed by Grodzinsky et al., are applied.
To be satisfactory, this approach must be based on effective functional syndromes that do not fractionate and on good operational criteria for diagnosing them; Broca’s and Wernicke’s aphasia probably do not pass the test. The danger of the clinically based group study approach is that if the functional syndrome which is used to define group membership does fractionate, then the critique of Caramazza (1986) will be potentially highly relevant, and only very simple characteristics of the averaged findings will prove inferentially relevant.

See also: Brain Damage: Neuropsychological Rehabilitation; Case Study: Logic; Case Study: Methods and Analysis; Case-oriented Research; Cognitive Functions (Normal) and Neuropsychological Deficits, Models of; Neuropsychological Functioning, Assessment of

Bibliography

Baddeley A D 1986 Working Memory. Clarendon Press, Oxford, UK
Berndt R S, Mitchum C C, Haendiges A W 1996 Comprehension of reversible sentences in agrammatism: A meta-analysis. Cognition 58: 289–308
Breiman L, Friedman J H, Olshen R A, Stone C J 1984 Classification and Regression Trees. Wadsworth, Belmont, CA
Bullinaria J A, Chater N 1995 Connectionist modelling: Implications for cognitive neuropsychology. Language and Cognitive Processes 10: 227–64
Butters N, Cermak L S 1974 Some comments on Warrington and Baddeley’s report of normal short-term memory in amnesic patients. Neuropsychologia 12: 283–5
Caramazza A 1986 On drawing inferences about the structure of normal cognitive systems from the analysis of patterns of impaired performance: The case for single-patient studies. Brain and Cognition 5: 41–66
Caramazza A, Hillis A E 1990 Spatial representation of words in the brain implied by studies of a unilateral neglect patient. Nature 346: 267–9
Caramazza A, Miceli G 1990 The structure of graphemic representations. Cognition 37: 243–97
Caramazza A, Capitani E, Rey A, Berndt R S 2001 Agrammatic Broca’s aphasia is not associated with a single pattern of comprehension performance. Brain and Language 76: 158–84
Caramazza A, Miceli G, Villa G, Romani C 1987 The role of the graphemic buffer in spelling: Evidence from a case of acquired dysgraphia. Cognition 26: 59–85
Dell G S, Schwartz M F, Martin N, Saffran E M, Gagnon D A 1997 Lexical access in normal and aphasic speech. Psychological Review 104: 801–38
Ellis A W 1987 Intimations of modularity, or, the modelarity of mind: Doing cognitive neuropsychology without syndromes. In: Coltheart M, Sartori G, Job R (eds.) The Cognitive Neuropsychology of Language. Erlbaum, London
Farah M J, McClelland J L 1991 A computational model of semantic memory impairment: Modality specificity and emergent category specificity. Journal of Experimental Psychology: General 120: 339–57
Fodor J A 1983 The Modularity of Mind. MIT Press, Cambridge, MA
Friston K 1997 Analysing brain images: Principles and overview. In: Frackowiak R S J et al. (eds.) Human Brain Function. Academic Press, San Diego, CA, pp. 25–42
Garrard P, Patterson K, Watson P C, Hodges J R 1998 Category specific semantic loss in dementia of Alzheimer’s type. Functional-anatomical correlations from cross-sectional analyses. Brain 121: 633–46
Glymour C 1994 On the methods of cognitive neuropsychology. British Journal for the Philosophy of Science 45: 815–35
Gonnerman L, Anderson E S, Devlin J T, Kempler D, Seidenberg M S 1997 Double dissociation of semantic categories in Alzheimer’s disease. Brain and Language 57: 254–79
Goodglass H, Quadfasel F A, Timberlake W H 1964 Phrase length and the type and severity of aphasia. Cortex 1: 133–53
Grodzinsky Y, Pinango M M, Zurif E, Drai D 1999 The critical role of group studies in aphasia: Comprehension regularities in Broca’s aphasia. Brain and Language 67: 134–47
Head H 1926 Aphasia and Kindred Disorders of Speech. Cambridge University Press, Cambridge, UK
Jonsdottir M, Shallice T, Wise R 1996 Language-specific processes in graphemic buffer disorder. Cognition 59: 169–97


McCloskey M 2001 The future of cognitive neuropsychology. In: Rapp B (ed.) The Handbook of Cognitive Neuropsychology. Psychology Press, Philadelphia, pp. 593–610
Milner B 1958 Psychological deficits produced by temporal-lobe excision. Research Publications, Association for Research in Nervous and Mental Disease 36: 244–57
Moore C J, Price C J 1999 A functional neuroimaging study of the variables that generate category-specific object processing differences. Brain 122: 943–62
Mozer M C 1991 The Perception of Multiple Objects. MIT Press, Cambridge, MA
Mozer M C, Behrmann M 1990 On the interaction of selective attention and lexical knowledge: A connectionist account of neglect dyslexia. Journal of Cognitive Neuroscience 2: 96–123
Nielsen J M 1946 Agnosia, Apraxia, Aphasia: Their Value in Cerebral Localization. Hoeber, New York
Patterson K E, Kay J 1982 Letter-by-letter reading: Psychological descriptions of a neurological syndrome. Quarterly Journal of Experimental Psychology, Series A 34: 411–41
Plaut D C 1995 Double dissociation without modularity: Evidence from connectionist neuropsychology. Journal of Clinical and Experimental Neuropsychology 17: 291–321
Plaut D C, Shallice T 1993 Deep dyslexia: A case study of connectionist neuropsychology. Cognitive Neuropsychology 10: 377–500
Posner M I 1978 Chronometric Explorations of Mind. Erlbaum, Hillsdale, NJ
Ruml W, Caramazza A 2000 An evaluation of a computational model of lexical access: Comment on Dell et al. (1997). Psychological Review 107: 609–34
Saffran E M, Coslett B 1996 Attentional dyslexia in Alzheimer’s disease: A case study. Cognitive Neuropsychology 13: 205–28
Shallice T 1988 From Neuropsychology to Mental Structure. Cambridge University Press, Cambridge, UK
Shallice T, Warrington E K 1970 Independent functioning of the verbal memory stores: A neuropsychological study. Quarterly Journal of Experimental Psychology 22: 261–73
Shallice T, Rumiati R I, Zadini A 2000 The selective impairment of the phonological output buffer. Cognitive Neuropsychology 17: 517–46
Stuss D T, Benson D F 1984 Neuropsychological studies of the frontal lobes. Psychological Bulletin 95: 3–28
Stuss D T, Alexander M P, Hamer L, Palumbo C, Dempster R, Binns M, Levine B, Izukawa D 1998 The effects of focal anterior and posterior brain lesions on verbal fluency. Journal of the International Neuropsychological Society 4: 265–78
Warrington E K, McCarthy R 1983 Category specific access dysphasia. Brain 106: 859–78
Warrington E K, Shallice T 1984 Category specific semantic memory impairment. Brain 107: 829–54

T. Shallice

Cognitive Neuroscience

The discipline has emerged in the 1990s at the interface between the neural sciences and the cognitive and computational sciences. On one side, it grows out of the traditions of cognitive psychology and neuropsychology, which use behavioral experiments to uncover the processes and mechanisms lying behind human cognitive functions, and of computational approaches within cognitive psychology, which rely on computational models to develop explicit mechanistic accounts of these functions. On the other side, it grows out of the traditions of behavioral, functional, and systems neuroscience, which use neurophysiological and neuroanatomical methods to explore the mechanisms underlying complex functions. It draws on findings and principles of cellular and molecular neuroscience. It joins these approaches with the use of new functional brain imaging methods, such as functional magnetic resonance imaging (fMRI), positron emission tomography (PET), as well as other methods including electroencephalography (EEG) and magnetoencephalography (MEG), and with a growing research tradition in computational neuroscience.

1. The Microstructure of Cognition

1.1 Patterns of Activity Arising in Ensembles of Simple Elements

A starting point for cognitive neuroscience is the idea that a cognitive or mental state consists of a pattern of activity distributed over many neurons. For example, the experience an individual has when holding, sniffing, and viewing a rose is a complex pattern of neural activity, distributed over many brain regions, including the participation of neurons in visual, somatosensory, and olfactory areas, and possibly extending to language areas participating in representing the sound of the word ‘rose’ and/or other areas where activity represents the content of an associated memory that may be evoked by the experience.
These patterns of activation arise from excitatory and inhibitory interactions among the participating neurons, mediated by connections called synapses. The inputs neurons receive cause them to ‘fire’ or emit impulses called spikes or action potentials, which travel down their axons to synaptic terminals where they cause the release of chemicals that then have excitatory or inhibitory influences on the neurons on the other side of the synapse. The combined effect of the incoming signals to each neuron, together with its recent history, determines whether it will fire at a particular moment. Figure 1 indicates something of the fundamental circuitry involved, though it should be noted that only one out of 100 of the neurons in the tiny region shown (about 3 × 3 mm) are indicated.
While the computations performed by individual neurons should not be underestimated (see Neurons and Dendrites: Integration of Information), it seems likely that what gives the system its power and complexity is the number of neurons involved (most estimates place the number in the human brain between 10 and 100 billion) and the density of


connections among them (typical cortical neurons receive between 10,000 and 100,000 individual synapses from other neurons).

Figure 1
An early camera lucida drawing of the circuitry of the neocortex, based on the Golgi stain method, which impregnates just one out of every 100 cortical neurons. The diagram depicts the rich dendritic branching structure of the individual neurons present, whose cell bodies appear as small, pyramid-shaped blobs. The dendrites (and the little spines visible on the surfaces of some of the larger dendrites) are the structures on which the neurons receive most of their inputs from other neurons

1.2 Distributed Representations

A great deal of research has concerned the nature of the active representations the brain uses for objects of perception or cognition, such as the rose discussed above. There is now a great deal of support for the view that the brain’s representations typically consist of patterns of activity involving fairly large ensembles of neurons. Individual neurons are often described as ‘detectors’ for particular stimulus or situational features or conjunctions of features (e.g., the ‘edge detectors’ introduced by Hubel and Wiesel 1962 in their seminal studies in visual cortex), but most such neurons are fairly broadly tuned, so that they will also be partially activated by a wide range of stimuli overlapping in one way or another with the optimal stimulus, and thus will participate at least partially in the representation of many different inputs.
A prime example of distributed representation is the representation of the direction of arm movements in the motor cortex. It appears that the representation of a particular direction of reaching is a pattern over a large population of neurons, each of which responds maximally to a particular preferred direction, but responds to a lesser degree to neighboring directions, and thus participates partially in the representation of many different directions of reaching (Georgopoulos et al. 1986). There are other types of distributed representations used in the brain, in which a neuron can participate in two different representations, without there being a clear shared feature or other similarity between the situations that cause the neuron to fire. For example, in the hippocampus, individual neurons participate in distributed representations of the animal’s location in external space and other aspects of the current behavioral situation. Interestingly, the same neurons may participate in different ways in the representation of different environments, or even of two distinct representations of the same environment when the animal is performing different tasks (Markus et al. 1995).

1.3 Knowledge and Learning in the Strengths of Connections

The particular pattern of activation that arises in experiencing an input (or in reconstructing a memory or formulating an imagined experience) is determined by the connections among the neurons. A key issue, then, is to understand the processes that lead to the formation of the specific excitatory and inhibitory connections that shape the processes of perception, cognition, and action. Generally, it is thought that largely activity-independent processes establish an initial skeleton framework of connectivity early in development, for example, causing connections to form between neurons in the retina of the eye and other neurons in the lateral geniculate nucleus, a way station for visual information on the way to the cortex. Then, activity-dependent processes selectively refine and stabilize some of the connections, and perhaps cause new ones to form, while other connections are pruned away.
Activity-dependent processes continue throughout life, at least in many parts of the brain, and appear to provide the basis of both explicit and implicit learning. They have been the subject of intense scrutiny in neuroscience. Donald Hebb, the mid-twentieth century neuropsychologist, proposed that if one neuron participates in firing another, the connection from the first to the second will be strengthened (Hebb 1949). Hebb’s idea has been encapsulated in the phrase ‘cells that fire together wire together.’ While there is no direct proof that this is a principal basis of learning in the brain, the idea has received a great deal of support from experiments that have been carried out in slices of brain tissue (see Neural Plasticity). It should be understood that there may also be plasticity at the level of the whole neuron (in some specialized brain areas, neurons are continually created and incorporated into circuits while others are continually being lost). There is likely also to be some plasticity at the level of the branches of axons and/or dendrites, which provide the scaffolding underlying the formation and loss of synaptic connections.
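
Hebb’s proposal is often written as a simple weight-update rule in computational models. The following is a minimal sketch under assumptions of our own: activities are plain numbers, each connection changes by a learning rate times the product of the activities on its two sides, and the function name `hebbian_update` and the `rate` parameter are hypothetical. It is not a model of any specific experiment mentioned in the text.

```python
def hebbian_update(weights, pre, post, rate=0.1):
    """Return new weights after one Hebbian step.

    weights[j][i] is the connection strength from presynaptic unit i to
    postsynaptic unit j. Each weight grows in proportion to the product of
    the activities on its two sides, so only co-active pairs are strengthened.
    """
    return [
        [w + rate * post[j] * pre[i] for i, w in enumerate(row)]
        for j, row in enumerate(weights)
    ]


# Two presynaptic and two postsynaptic units, all connections starting at zero.
weights = [[0.0, 0.0], [0.0, 0.0]]
pre = [1.0, 0.0]   # only presynaptic unit 0 is active
post = [1.0, 0.0]  # only postsynaptic unit 0 fires

for _ in range(5):
    weights = hebbian_update(weights, pre, post)

# Only the connection between the two co-active units has grown.
print(weights)
```

Under this plain rule a weight can only grow for non-negative activities, so computational models generally add some form of decay or normalization; that refinement is omitted here to keep the sketch short.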

2. System-level Organization: The Macrostructure of Cognition in the Brain

2.1 Specialization of Brain Regions


A central and important fact about the organization of
cognition in the brain is that individual brain regions
are specialized. The cerebral cortex can be partitioned
conceptually into primary, secondary, and tertiary
cortical zones (Luria 1966). According to this conception, the primary areas contain neurons whose responses can be largely characterized as reflecting relatively simple, local properties of inputs or outputs within a given modality, such as the presence of an oriented line segment at a particular position on the retina of the eye, the presence of acoustic energy in a particular frequency band, or the presence of a tactile stimulus at a particular point on the skin surface. Corresponding motor areas contain neurons whose responses may correspond to the activation of specific muscles or elementary movement elements. Secondary areas contain neurons whose responses represent higher-order stimulus attributes within a given modality, such as conjunctions of features, and the representations in these areas may be relatively invariant over some lower-level properties, such as position of the stimulus containing the feature on the sensory surface (Tanaka 1996). Tertiary areas are responsible for representations that transcend individual modalities, such as representations of the current task context, or representation of one’s location in extra-personal space, or representation of semantic content. It should be noted that this picture is only a very crude approximation, and many so-called primary areas appear to participate in the representation of the global structure of a stimulus or response situation, and many areas that are treated as modality specific can be modulated by influences from other modalities (see below). It should also be noted that structures outside the neocortex also play very important roles in cognitive functions. Among these are the diffuse neuromodulatory systems that regulate behavioral/cognitive states such as alertness, wakefulness, and mood; and other systems in the thalamus, limbic system, and cerebellum.

Figure 2
The interactive, distributed framework for modeling individual word reading of Seidenberg and McClelland 1989. All of the relevant processing pathways are assumed to be bidirectional

2.2 Modular vs. Interactive Approaches to the Organization of Function

The above provides only the starting place for the formulation of an understanding of how cognitive processes arise from neural activity. There are two contrasting views: (a) The modular approach, championed by David Marr for vision and Noam Chomsky for language, and systematized as a general approach by Fodor (1983), holds that the brain consists of many separate modules that are informationally encapsulated in that their operation is informed only by a very limited range of constraining sources of information. The modular view also holds that the principles of function are specific to each domain, and that distinct and individualized mechanisms are used to subserve each distinct function. For example, the initial assignment of the basic grammatical structure to a sentence is thought to be based only on the syntactic classification of words and their order and is thought to be governed by the operation of a system of structure-sensitive rules. The module that carries out this assignment is thought to be structured specifically so that it will acquire and implement structure-sensitive rules, and to contrast in the principles that it employs internally with other modules that carry out other tasks, including other aspects of language processing, such as the assignment of meanings to the words in a sentence. In Fodor’s view, there are many specialized modules (corresponding approximately to


primary and secondary cortical areas and their subcortical inputs and outputs). These are complemented by a general-purpose cognitive system that is completely open-ended in the computations that it can undertake and in the range of informational sources that it can take into consideration.
The alternative, interactive approach has its seeds in the ideas of Luria (1966), and has been championed by Mesulam (2000) and by Rumelhart et al. (1986), and overlaps with the ideas of Damasio (1989). On this view, cognitive outcomes such as the assignment of an interpretation to a sentence arise from mutual, bidirectional interactions among neurons in populations representing different types of information. An example of a system addressing the representations and interactions involved in reading individual words aloud is shown in Fig. 2. As the figure suggests, the formation of the sound of a word from a visual input specifying its spelling arises from an interactive process involving orthographic (i.e., letter identity), semantic, phonological, and contextual information. Both the modular and the interactive view are consistent with the idea that neurons in the brain are organized into populations specialized for representing different types of information. Where they differ is in the extent and the role of bidirectional interactions among participating brain areas.

2.3 Evidence of Interactive Processes in the Brain

The debate between modular and interactive approaches is a long-standing one, and can be seen as the modern legacy of a history of diverse views on the localization of functions within the brain (Luria 1966). While the debate is likely to continue to evolve with additional empirical evidence, it may be worth considering a few elements of evidence that support the idea that processing may be interactive. One relevant anatomical point is the fact that connectivity within and between brain areas is generally reciprocal: when there are connections from region A to region B, there are nearly always return connections.
While there is no consensus on the function of reciprocal connections, there is some evidence that they subserve distributed, interactive computations, at least in particular cases. For example, there is evidence that interactive processes influence the activation of individual neurons in primary visual cortex (Area V1). Traditionally, individual neurons in this area have been seen as encoding the presence of segments of oriented edges at particular positions in a visual display. Recent evidence suggests, however, that primary visual cortex participates in a distributed and interactive process that contributes to the representation of global stimulus properties such as figure-ground organization. The firing of neurons in primary visual cortex is strongly affected by temporary inactivation of corresponding portions of secondary visual cortex, suggesting that reciprocal interactions between these areas shape neuronal responses in V1 (Hupé et al. 2001). Although the initial response of neurons in V1 is determined by appropriately oriented line segments at a specific location, by about 80 ms their firing is heavily dependent on the global structure of the display (Lee et al. 1998). Furthermore, neurons in V1 respond to illusory contours that fall in their receptive field. The response occurs at a lag of about 80 ms, suggesting an indirect source, perhaps arising from feedback from higher cortical areas (Lee and Nguyen 2001). There is also considerable evidence of between-modality interactions. For example, activity in auditory processing areas associated with speech perception is enhanced by visible speech (Callan et al. 2001). There are many corresponding examples of cross-modal influences in single neuron recording studies in animals.

3. Methods and Approaches in Cognitive Neuroscience

Cognitive neuroscience is a highly interdisciplinary endeavor, and draws on a wide range of research methods and approaches, each with its own history and underlying theoretical frame of reference. One important challenge for the field is to find ways of integrating the insights gained from the different methods to allow the field as a whole to converge on a common theoretical framework. Here the predominant research approaches are briefly described, and some of the prospects for integration are considered.

3.1 Lesion and Behavior Approaches (Cognitive Neuropsychology and Behavioral Neuroscience)

These research approaches within the field have the oldest historical roots, based as they often are in the assessment of the effects of naturally-occurring brain damage on cognitive function. A seminal case study was the report by Broca (1861) of a man with a severe disturbance of language arising from a large brain lesion in the posterior portion of the left frontal lobe. Since Broca’s day, neurologists and neuropsychologists have investigated the effects of accidental or therapeutic brain lesions in humans, and many key insights have arisen from these studies (see Agnosia; Amnesia; Aphasia; Dyslexia (Acquired) and Agraphia). The subdiscipline of cognitive neuropsychology has arisen specifically around the study of the effects of brain lesions (see Cognitive Neuropsychology, Methodology of). A complementary tradition has grown up around the use of brain lesions in animals carried out with specific experimental intent. This tradition is relevant to human cognitive neuroscience in view of the very close homology between many structures in the human brain and corresponding structures in the primate and


rodent brains. This work has obvious advantages in that lesions can be carefully targeted to particular brain areas to test specific hypotheses (see Lesion and Behavior Approaches in Neuroscience). Many key insights have emerged from this work, including the discovery of complementary processing streams in the visual system (Ungerleider and Mishkin 1982; see Neural Plasticity). However, the approach is not without its pitfalls, since a lesion may have unintended and unobserved effects in other brain regions; the refinement and extension of experimental lesion techniques is ongoing.

3.2 Neuronal Recording Studies

Studies relying on microelectrodes to record from neurons in the brains of behaving animals can allow researchers to study the representations that the brain uses to encode information, and the evolution of these representations over time. Several fundamental observations have emerged, some of which have been discussed above. These studies indicate, among other things, that the brain relies on distributed representations, that neurons participate dynamically and interactively in the construction of representations of external inputs, and that the representational significance of the firing of a particular neuron can vary as a function of context. Neuronal recording studies have had a profound impact on our understanding of the nature of representations of extrapersonal space. There are neurons in the brain that encode the location of objects in extrapersonal space simultaneously in relation to many different parts of the body, including the limbs and the head (Duhamel et al. 1991, Graziano and Gross 1993) and other neurons that encode the locations of objects in relation to other objects (Olson and Gettner 1995). Furthermore, recordings from neurons in parietal cortex suggest that when we move our eyes from one location to another, we update our internal representations of the locations of important objects in space, based on where we anticipate they will be after the upcoming eye movement (Duhamel et al. 1992).

An important recent development is the ability to record from up to 100 individual neurons at a time (see Perception and Cognition, Single-/Multi-neuronal Recording Studies of). A key finding that has come out of this work is the confirmation in studies with rodents that the simultaneous and successive patterns of activity acquired during behavior may be reactivated in the brain during subsequent sleep (Wilson and McNaughton 1994). Such methods are in their infancy, but their potential to shed light on the moment-by-moment relations between activations of different neurons and between distributed brain representations and specific inputs and outputs makes them essential to the future of cognitive neuroscience.

3.3 Functional Brain Imaging

Cognitive neuroscience has arisen as a separate discipline in tandem with the emergence of functional brain imaging methods (PET and fMRI) as major tools for the analysis of human cognition, and it may be that the prospect of visualizing specifically human cognitive activity has been a major catalyst. First used to analyze cognitive functions by the St. Louis group (Petersen et al. 1988; see Functional Brain Imaging), these methods are now coming into widespread use. While these methods currently have low temporal and spatial resolution compared to neuronal recording studies, they still provide our best opportunity to explore the neural mechanisms underlying distinctly human cognitive functions.

To date the observations arising from functional imaging studies have tended to corroborate findings from other methods, and/or to explore commonalities and differences in human and animal brain organization. As one recent example, it has now been possible to visualize the alternating strips in visual cortex reflecting what are known as ocular dominance stripes. Beyond corroboration, a great deal of new information has also been provided by functional brain imaging studies. For example, in a fairly early PET study, investigators found that an area of the cerebellum became active when subjects were required to generate the action that goes along with a concrete object (e.g., the word HAMMER requires a response such as ‘pound’). Subsequent investigation of an individual with damage to this region of the cerebellum indicated that the patient had considerable difficulty with the generation task, confirming the importance of this area in the task.

Brain imaging studies, like lesion studies, have often been used to try to determine the loci in the brain associated with particular cognitive functions. However, in addition to this, brain imaging has begun to reveal a great deal about the plasticity of the brain, since patterns of brain activation can change dramatically with practice (Karni et al. 1998). Imaging is also being used in search of distributed networks in the brain that contribute to particular cognitive functions. For example, Just et al. (1996) have shown that as sentences become more complex, there is an increase in neural activity in an ensemble of brain regions, including Broca’s and Wernicke’s areas on the left and, to a lesser degree, the right side of the brain. As another example, investigators have begun to use covariation in neural activity in different brain regions in an effort to determine which brain regions are influencing each other’s activation (Maguire et al. 2000) in different task situations.

Imaging methods (including magneto- and electroencephalography, as well as fMRI and PET) are likely to improve dramatically over time, allowing far higher spatial and temporal resolution. The potential for this to bring us closer to the goal of understanding the


details of information processing in the human brain will be discussed below.

3.4 Computational and Mathematical Modeling Approaches

While investigations relying on lesion and behavior approaches, neuronal recording studies, and functional brain imaging have provided and will continue to provide the empirical evidence on which to build our understanding of the basis of cognitive functions in the brain, these approaches, even when used in a convergent way, may still fail to provide a complete understanding of how cognitive functions emerge from underlying neural activity. This may require the use of additional tools provided by mathematical modeling and computer simulation. These approaches allow researchers to formulate possible accounts of specific processes in the form of explicit models that can be analyzed mathematically or simulated using computers to determine whether they can account for all of the relevant neural and behavioral evidence.

Three examples of cases in which computational models have already led to new thinking will be briefly considered. First, a number of computational modeling studies have shown that many aspects of the receptive field properties of neurons and their spatial organization in the brain can arise through the operation of very simple activity-dependent processes shaped by experience and a few rather simple additional constraints (Linsker 1986, Miller et al. 1989; see Neural Development: Mechanisms and Models). Second, models may aid in the understanding of the pattern of deficits seen in patients with brain lesions. Certain patients with an acquired dyslexic syndrome known as deep dyslexia make a striking form of error known as semantic errors; for example, the patient may misread APRICOT as ‘peach.’ In addition, all such patients also make visual errors, for example, reading SYMPATHY as ‘symphony.’ Early, noncomputational accounts postulated that there must be two separate lesions, one affecting visual processing and the other affecting semantic processing. However, computational models of the reading process (Hinton and Shallice 1991; see Cognitive Functions (Normal) and Neuropsychological Deficits, Models of) have shown that a single lesion affecting either the visual or the semantic part of an interactive neural network will lead to errors of both types. Thus, the coexistence of these errors may be an intrinsic property of the underlying processing architecture rather than a reflection of multiple distinct lesions. A third area where computational models have shed considerable light is in the interpretation of the receptive field properties of individual neurons (Zipser and Andersen 1988). While initial interpretations were based on verbally describable features such as oriented bars or edges, such properties are not always apparent, and even when they are, a more detailed characterization may be possible in computational terms (Pouget et al. 1999). A further area of fertile research is in the use of computational models to explain and catalog the ways in which neuronal activation changes dynamically in the course of task performance (Moody et al. 1998).

4. Open Issues in Cognitive Neuroscience

Cognitive neuroscience is young, and there is a great deal of work to be done. No aspect of cognition is fully understood, and in general, the more abstract or advanced the cognitive function, the less is known about its neural basis. A few of the most important and interesting issues that remain to be addressed are considered briefly here.

4.1 How Does the Brain Learn?

There is a great deal known about the basic mechanisms of synaptic plasticity, but typically these are studied in highly reduced preparations such as brain slices. The basic processes that are studied in slices surely play a role in the shaping of neural connections in the whole, living brain, but they are also undoubtedly modulated by processes that are usually eliminated in slices. We know that attention and engagement in processing are essential for learning, and there is good reason to believe that learning is gated by various neuromodulatory mechanisms in the brain, but the details of the modulation and gating processes are only beginning to be explored.

4.2 What Makes an Experience Conscious?

Although some considerable progress has been made in characterizing the concomitants of consciousness (see Consciousness, Neural Basis of), there is no overall understanding of exactly what it is about the activity of the brain that gives it the attribute of consciousness. It appears likely that consciousness will not be localizable; although it may be highly dependent on specific brain structures (e.g., those that regulate sleep vs. wakefulness, etc.), it may well depend on the intact functioning of many interacting parts of the brain. Exactly why or how consciousness arises from these interactions is not at all understood.

4.3 What is the Basis for the Unique Cognitive Capacities of the Human Brain, Relative to that of Other, Simpler Organisms?

The issue of what sets humans apart from other organisms remains one of the central unresolved questions. The similarity of the human genome to that of closely related species can be taken in different


ways. It can suggest to some that a very small number of specific faculties have been added which differentiate the human from, say, the chimpanzee; or it could suggest that rather than new faculties, the human brain really differs only in the expansion and extension of structures already present to a degree in other organisms. The idea that the highest cognitive functions are emergent functions rather than localizable or locally encoded in genes remains an attractive, though elusive, possibility.

5. The Future of Cognitive Neuroscience

Nobel laureate Eric Kandel has suggested that cognitive neuroscience will increasingly assume center stage in the neurosciences in the twenty-first century, and it has begun to make dramatic inroads into the field of cognitive psychology, where many leading investigators have redirected their research to exploit ideas and methods from neuroscience. Future research in cognitive neuroscience will address the general issues raised above as well as many other topics. What makes the future of the field so exciting is the prospect of further development of a number of important contributing methodologies. Breakthroughs in functional brain imaging and other related methods are likely to provide far greater spatial and temporal resolution of brain activity. Another very important area of methodological advance is the ability to create genetically altered brains, especially in small mammals and invertebrates, and thereby to explore the consequences of these alterations for function (see Memory: Genetic Approaches). These methods have already reached the point where it is possible to allow an organism to develop normally, and then induce a region-specific gene knockout, thereby providing the opportunity to investigate, for example, the effect of the alteration of synaptic plasticity in a specific part of the brain. Breakthroughs should be expected in many other areas of cognitive neuroscience as well, including neuronal recording, functional imaging, and computational modeling approaches. Together, these methods will lead to a deeper understanding of how the highest capabilities of the human mind arise from the underlying physical and chemical processes in the brain.

See also: Animal Cognition; Brain, Evolution of; Cerebral Cortex: Organization and Function; Cognitive Control (Executive Functions): Role of Prefrontal Cortex; Cognitive Neuropsychology, Methodology of; Cognitive Psychology: History; Cognitive Psychology: Overview; Cognitive Science: History; Cognitive Science: Overview; Cognitive Science: Philosophical Aspects; Comparative Neuroscience; Computational Neuroscience; Evolutionary Social Psychology; Human Cognition, Evolution of

Bibliography

Broca P 1861 Remarques sur le siège de la faculté de la parole articulée, suivies d’une observation d’aphémie (perte de parole). Bulletin de la Société d’Anatomie 36: 330–57
Callan D E, Callan A M, Kroos C, Vatikiotis-Bateson E 2000 Multimodal contribution to speech perception. Cognitive Brain Research
Damasio A R 1989 Time-locked multiregional retroactivation: A system-level proposal for the neural substrates of recall and recognition. Cognition 33: 25–62
Duhamel J-R, Colby C L, Goldberg M E 1991 Congruent representations of visual and somatosensory space in single neurons of monkey ventral intraparietal cortex (area VIP). In: Paillard J (ed.) Brain and Space. Oxford University Press, Oxford, UK, pp. 223–6
Duhamel J-R, Colby C L, Goldberg M E 1992 The updating of the representation of visual space in parietal cortex by intended eye movements. Science 255: 90–2
Fodor J A 1983 Modularity of Mind: An Essay on Faculty Psychology. MIT Press, Cambridge, MA
Georgopoulos A P, Schwartz A B, Kettner R E 1986 Neuronal population encoding of movement direction. Science 233: 1416–9
Graziano M S A, Gross C G 1993 A bimodal map of space: Somatosensory receptive fields in the macaque putamen with corresponding visual receptive fields. Experimental Brain Research 97: 96–109
Hebb D O 1949 The Organization of Behavior. Wiley, New York
Hinton G E, Shallice T 1991 Lesioning an attractor network: Investigations of acquired dyslexia. Psychological Review 98(1): 74–95
Hubel D H, Wiesel T N 1962 Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. Journal of Physiology 166: 106–54
Hupé J M, James A C, Girard P, Lomber S G, Payne B R, Bullier J 2001 Feedback connections act on the early part of the responses in monkey visual cortex. Journal of Neurophysiology 85: 134–45
Just M A, Carpenter P A, Keller T A, Eddy W F, Thulborn K R 1996 Brain activation modulated by sentence comprehension. Science 274(5284): 114–6
Karni A, Meyer G, Rey-Hipolito C, Jezzard P, Adams M M, Turner R, Ungerleider L G 1998 The acquisition of skilled motor performance: Fast and slow experience-driven changes in primary motor cortex. Proceedings of the National Academy of Sciences USA 95(3): 861–8
Lee T S, Mumford D, Romero R, Lamme V A F 1998 The role of primary visual cortex in higher level vision. Vision Research 38: 2429–54
Lee T S, Nguyen M 2001 Dynamics of subjective contour formation in early visual cortex. Proceedings of the National Academy of Sciences USA
Linsker R 1986 From basic network principles to neural architecture, I: Emergence of orientation columns. Proceedings of the National Academy of Sciences USA 83: 7508–12
Luria A R 1966 Higher Cortical Functions in Man. Basic Books, New York
Maguire E A, Mummery C J, Buchel C 2000 Patterns of hippocampal-cortical interaction dissociate temporal lobe memory subsystems. Hippocampus 10(4): 475–82
Markus E J, Qin Y, Leonard B, Skaggs W E, McNaughton B L, Barnes C A 1995 Interactions between location and task affect the spatial and directional firing of hippocampal neurons. Journal of Neuroscience 15: 7079–94


Mesulam M M 2000 Principles of Behavioral and Cognitive Neurology. Oxford University Press, New York
Moody S L, Wise S P, di Pellegrino G, Zipser D 1998 A model that accounts for activity in primate frontal cortex during a delayed matching-to-sample task. Journal of Neuroscience 18(1): 399–410
Olson C R, Gettner N 1995 Object-centered direction selectivity in the supplementary eye field of the macaque monkey. Science 269: 985–8
Petersen S E, Fox P T, Posner M I, Mintun M, Raichle M E 1988 Positron emission tomographic studies of the cortical anatomy of single-word processing. Nature 331: 585–9
Pouget A, Deneve S, Sejnowski T J 1999 Frames of reference in hemineglect: A computational approach. Progress in Brain Research 121: 81–97
Rumelhart D E, McClelland J L, the PDP Research Group 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 1: Foundations. MIT Press, Cambridge, MA
Seidenberg M S, McClelland J L 1989 A distributed, developmental model of word recognition and naming. Psychological Review 96: 523–68
Tanaka K 1996 Inferotemporal cortex and object vision. Annual Review of Neuroscience 19: 109–39
Ungerleider L G, Mishkin M 1982 Two cortical visual systems. In: Ingle D J, Goodale M A, Mansfield R J W (eds.) Analysis of Visual Behavior. MIT Press, Cambridge, MA
Wilson M A, McNaughton B L 1994 Reactivation of hippocampal ensemble memories during sleep. Science 265: 676–9
Zipser D, Andersen R A 1988 A back propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature 331: 679–84

J. L. McClelland

Cognitive Psychology: History

Since the beginning of experimental psychology in the nineteenth century, there had been interest in the study of higher mental processes. But something discontinuous happened in the late 1950s, something so dramatic that it is now referred to as the ‘cognitive revolution,’ and the view of mental processes that it spawned is called ‘cognitive psychology.’ What happened was that American psychologists rejected behaviorism and adopted a model of mind based on the computer. The brief history that follows (adapted in part from Hilgard (1987) and Kessel and Bevan (1985)) chronicles mainstream cognitive psychology from the onset of the cognitive revolution to the beginning of the twenty-first century.

1. Beginnings

From roughly the 1920s through the 1950s, American psychology was dominated by behaviorism. Behaviorism was concerned primarily with the learning of associations, particularly in nonhuman species, and it constrained theorizing to stimulus–response notions. The overthrow of behaviorism came not so much from ideas within psychology as from three research approaches external to the field.

1.1 Communications Research and the Information Processing Approach

During World War II, new concepts and theories were developed about signal processing and communication, and these ideas had a profound impact on psychologists active during the war years. One important work was Shannon’s 1948 paper about information theory. It proposed that information was communicated by sending a signal through a sequence of stages or transformations. This suggested that human perception and memory might be conceptualized in a similar way: sensory information enters the receptors, then is fed into perceptual analyzers, whose outputs in turn are input to memory systems. This was the start of the ‘information processing’ approach: the idea that cognition could be understood as a flow of information within the organism, an idea that continues to dominate cognitive psychology.

Perhaps the first major theoretical effort in information processing psychology was Donald Broadbent’s Perception and Communication (Broadbent 1958). According to Broadbent’s model, information output from the perceptual system encountered a filter, which passed only information to which people were attending. Although this notion of an all-or-none filter would prove too strong (Treisman 1960), it offered a mechanistic account of selective attention, a concept that had been banished during behaviorism. Information that passed Broadbent’s filter then moved on to a ‘limited capacity decision channel,’ a system that has some of the properties of short-term memory, and from there on to long-term memory. This last part of Broadbent’s model, the transfer of information from short- to long-term memory, became the salient point of the dual-memory models developed in the 1970s.

Another aspect of information theory that attracted psychologists’ interest was a quantitative measure of information in terms of ‘bits’ (roughly, the logarithm to the base 2 of the number of possible alternatives). In a still widely cited paper, George Miller (1956) showed that the limits of short-term memory had little to do with bits. But along the way, Miller’s and others’ interest in the technical aspects of information theory and related work had fostered mathematical psychology, a subfield that was being fueled by other sources as well (e.g., Estes and Burke 1953, Luce 1959, Garner 1962). Over the years, mathematical psychology has frequently joined forces with the information


processing approach to provide precise claims about memory, attention, and related processes.

1.2 The Computer Modeling Approach

Technical developments during World War II also led to the development of digital computers. Questions soon arose about the comparability of computer and human intelligence (Turing 1950). By 1957, Allen Newell, J. C. Shaw, and Herb Simon had designed a computer program that could solve difficult logic problems, a domain previously thought to be the unique province of humans. Newell and Simon soon followed with programs that displayed general problem-solving skills much like those of humans, and argued that these programs offered detailed models of human problem solving (a classic summary is contained in Newell and Simon (1972)). This work would also help establish the field of artificial intelligence.

Early on, cross-talk developed between the computer modeling and information-processing approaches, which crystallized in the 1960 book Plans and the Structure of Behavior (Miller et al. 1960). The book showed that information-processing psychology could use the theoretical language of computer modeling even if it did not actually lead to computer programs. With the ‘bit’ having failed as a psychological unit, information processing badly needed a rigorous but rich means to represent psychological information (without such representations, what exactly was being processed in the information processing approach?). Computer modeling supplied powerful ideas about representations (as data structures), as well as about processes that operate on these structures. The resultant idea of human information processing as sequences of computational processes operating on mental representations remains the cornerstone of modern cognitive psychology (see e.g., Fodor 1975).

1.3 The Generative Linguistics Approach

A third external influence that led to the rise of modern cognitive psychology was the development of generative grammar in linguistics by Noam Chomsky. Two of Chomsky’s publications in the late 1950s had a profound effect on the nascent cognitive psychology. The first was his 1957 book Syntactic Structures (Chomsky 1957). It focused on the mental structures needed to represent the kind of linguistic knowledge that any competent speaker of a language must have. Chomsky argued that associations per se, and even phrase structure grammars, could not fully represent our knowledge of syntax (how words are organized into phrases and sentences). What had to be added was a component capable of transforming one syntactic structure into another. These proposals about transformational grammar would change the intellectual landscape of linguistics, and usher in a new psycholinguistics.

Chomsky’s second publication (1959) was a review of Verbal Behavior, a book about language learning by the then most respected behaviorist alive, B. F. Skinner (Skinner 1957). Chomsky’s review is arguably one of the most significant documents in the history of cognitive psychology. It aimed not merely to devastate Skinner’s proposals about language, but to undermine behaviorism as a serious scientific approach to psychology. To some extent, it succeeded on both counts.

1.4 An Approach Intrinsic to Psychology

At least one source of modern cognitive psychology came from within the field. This approach had its roots in Gestalt psychology, and maintained its focus on the higher mental processes. A signal event in this tradition was the 1956 book A Study of Thinking, by Bruner, Goodnow, and Austin (Bruner et al. 1956). The work investigated how people learn new concepts and categories, and it emphasized strategies of learning rather than just associative relations. The proposals fit perfectly with the information-processing approach (indeed, they were information-processing proposals) and offered still another reason to break from behaviorism.

By the early 1960s all was in place. Behaviorism was on the wane in academic departments all over America (it had never really taken strong root in Europe). Psychologists interested in the information-processing approach were moving into academia, and Harvard University went so far as to establish a Center for Cognitive Studies directed by Jerome Bruner and George Miller. The new view in psychology was information processing. It likened mind to a computer, and emphasized the representations and processes needed to give rise to activities including pattern recognition, attention, categorization, memory, reasoning, decision making, problem solving, and language.

2. The Growth of Cognitive Psychology

The 1960s brought progress in many of the above-mentioned topic areas, some of which are highlighted below.

2.1 Pattern Recognition

One of the first areas to benefit from the cognitive revolution was pattern recognition, the study of how people perceive and recognize objects. The cognitive approach provided a general two-stage view of object recognition: (a) describing the input object in terms of


Figure 1
(a) Part of a Collins and Quillian (1969) semantic network. Circles designate concepts and lines (arrows) between
circles designate relations between concepts. There are two kinds of relations: subset–superset (‘Robin is a bird’)
and property (e.g., ‘Robins can fly’). The network is strictly hierarchical, as properties are stored only at the highest
level at which they apply. (b) Part of an Anderson and Bower (1973) propositional network. Circles represent
concepts and lines between them labeled relations. All propositions have a subject–predicate structure, and the
network is not strictly hierarchical. (c) Part of a simplified connectionist network. Circles represent concepts, or
parts of concepts, lines with arrowheads depict excitatory connections, and lines with filled circles designate
inhibitory connections; typically, numbers put on the lines indicate the strength of the connections. The network
is not strictly hierarchical, and is more interconnected than the preceding networks.
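
The strictly hierarchical network in panel (a) can be illustrated with a minimal Python sketch. This is not Collins and Quillian's own implementation: the names `Concept` and `has_property` are hypothetical, and the properties other than 'can fly' are invented for illustration. The sketch assumes only what the caption states, namely that concepts are linked by subset–superset relations and that each property is stored once, at the highest level at which it applies, so a lookup has to climb the hierarchy.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A node in a strictly hierarchical semantic network (hypothetical sketch)."""
    name: str
    superset: Optional["Concept"] = None                # subset-superset link
    properties: set[str] = field(default_factory=set)   # stored at this level only


def has_property(concept: Concept, prop: str) -> tuple[bool, int]:
    """Climb the superset links; return (found, number of links traversed)."""
    links = 0
    node: Optional[Concept] = concept
    while node is not None:
        if prop in node.properties:
            return True, links
        node = node.superset
        links += 1
    return False, links


# Illustrative three-level hierarchy; each property is stored only once.
animal = Concept("animal", properties={"can breathe"})
bird = Concept("bird", superset=animal, properties={"can fly"})
robin = Concept("robin", superset=bird, properties={"has a red breast"})

print(has_property(robin, "has a red breast"))  # (True, 0): stored at 'robin'
print(has_property(robin, "can fly"))           # (True, 1): inherited from 'bird'
print(has_property(robin, "can breathe"))       # (True, 2): inherited from 'animal'
```

Storing 'can fly' at 'bird' rather than at every kind of bird is what makes the network economical; the cost is that 'Robins can fly' is answered only after following one subset–superset link.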

relatively primitive features (e.g., ‘it has two diagonal lines and one horizontal line connecting them’); and (b) matching this object description to stored object descriptions in visual memory, and selecting the best match as the identity of the input object (‘this description best matches the letter A’). This two-stage


view was not entirely new to psychology, but expressing it in information-processing terms allowed one to connect empirical studies of object perception to computer models of the process. The psychologist Ulrich Neisser (1964) used a computer model of pattern recognition (Selfridge 1959) to direct his empirical studies and provided dramatic evidence that an object could be matched to multiple visual memories in parallel.

Other research indicated that the processing underlying object perception could persist after the stimulus was removed. For this to happen, there had to be a visual memory of the stimulus. Evidence for such an ‘iconic’ memory was supplied by Sperling in classic experiments in 1960 (Sperling 1960). Evidence for a comparable brief auditory memory was soon provided as well (e.g., Crowder and Morton 1969). Much of the work on object recognition and sensory memories was integrated in Neisser’s influential 1967 book Cognitive Psychology (Neisser 1967). The book served as the first comprehensive statement of existing research in cognitive psychology, and it gave the new field its name.

2.2 Memory Models and Findings

Broadbent’s model of attention and memory stimulated the formulation of rival models in the 1960s. These models assumed that short-term memory (STM) and long-term memory (LTM) were qualitatively different structures, with information first entering STM and then being transferred to LTM (e.g., Waugh and Norman 1965). The Atkinson and Shiffrin (1968) model proved particularly influential. With its emphases on information flowing between memory stores, control processes regulating that flow, and mathematical descriptions of these processes, the model was a quintessential example of the information-processing approach. The model was related to various findings about memory. For example, when people have to recall a long list of words they do best on the first words presented, a ‘primacy’ effect, and on the last few words presented, a ‘recency’ effect. Various experiments indicated that the recency effect reflected retrieval from STM, whereas the primacy effect reflected enhanced retrieval from LTM due to greater rehearsal for the first items presented (e.g., Murdock 1962, Glanzer and Cunitz 1966). At the time these results were seen as very supportive of dual-memory models (although alternative interpretations would soon be proposed, particularly by Craik and Lockhart 1972).

Progress during this period also involved empirically determining the characteristics of encoding, storage, and retrieval processes in STM and LTM. The results indicated that verbal material was encoded and stored in a phonologic code for STM, but a more
that forgetting in STM reflected a loss of information from storage due to either decay or interference (e.g., Wickelgren 1965), whereas some apparent losses of information in LTM often reflected a temporary failure in retrieval (Tulving and Pearlstone 1966). To a large extent, these findings have held up through more than 30 years of research, although many of the findings would now be seen as more limited in scope (e.g., the findings about STM are now seen as reflecting only one component of working memory, e.g., Baddeley (1986), and the findings about LTM are seen as characterizing only one of several LTM systems, e.g., Schacter (1987)).

One of the most important innovations of 1960s research was the emphasis on reaction time as a dependent measure. Because the focus was on the flow of information, it made sense to characterize various processes by their temporal extent. In a seminal paper in 1966, Saul Sternberg reported (Sternberg 1966) that the time to retrieve an item from STM increased linearly with the number of items in store, suggesting that retrieval was based on a rapid scan of STM. Sternberg (1969) gave latency measures another boost when he developed the ‘additive factors’ method, which, given assumptions about serial processing, allowed one to attribute changes in reaction times to specific processing stages involved in the task (e.g., a decrease in the perceptibility of information affected the encoding of information into STM but not its storage and retrieval). These advances in ‘mental chronometry’ quickly spread to areas other than memory (e.g., Fitts and Posner 1967, see also Schneider and Shiffrin 1977).

2.3 The New Psycholinguistics

Beginning in the early 1960s there was great interest in determining the psychological reality of Chomsky’s theories of language (these theories had been formulated with ideal listeners and speakers in mind). Some of these linguistically inspired experiments presented sentences in perception and memory paradigms, and showed that sentences deemed more syntactically complex by transformational grammar were harder to perceive or store (Miller 1962). Subtler experiments tried to show that syntactic units, like phrases, functioned as units in perception, STM, and LTM (Fodor et al. (1974) is the classic review). While many of these results are no longer seen as critical, this research effort created a new subfield of cognitive psychology, a psycholinguistics that demanded sophistication in modern linguistic theory.

Not all psycholinguistic studies focused on syntax. Some dealt with semantics, particularly the representation of the meanings of words, and a few of these studies made use of the newly developed mental
meaning-based code for LTM (Conrad 1964, Kintsch chronometry. One experiment that proved seminal
and Buschke 1969). Other classic studies demonstrated
was reported by Collins and Quillian (1969). Participants were asked simple questions about the meaning of a word, such as ‘Is a robin a bird?’ and ‘Is a robin an animal?’; the greater the categorical difference between the two terms in a question, the longer it took to answer. These results were taken to support a model of semantic knowledge in which meanings were organized in a hierarchical network, e.g., the concept ‘robin’ is directly connected to the concept ‘bird,’ which in turn is directly connected to the concept ‘animal,’ and information can flow from ‘robin’ to ‘animal’ only by going through ‘bird’ (see the top of Fig. 1). Models like this were to proliferate in the next stage of cognitive psychology.

3. The Rise of Cognitive Science

3.1 Memory and Language

Early in the 1970s the fields of memory and language began to intersect. In 1973 John Anderson and Gordon Bower published Human Associative Memory (Anderson and Bower 1973), which presented a model of memory for linguistic materials. The model combined information processing with recent developments in linguistics and artificial intelligence (AI), thereby linking the three major research directions that led to the cognitive revolution. The model used networks similar to that considered above to represent semantic knowledge, and used memory-search processes to interrogate these networks (see the middle of Fig. 1). The Anderson and Bower book was quickly followed by other large-scale theoretical efforts that combined information processing, modern linguistics, and computer models. These efforts included Kintsch (1974), which focused on memory for paragraphs rather than sentences, and Norman, Rumelhart, and the LNR Research Group (1975), Anderson (1976), and Schank and Abelson (1977), which took a more computer-science perspective and focused on stories and other large linguistic units.

As psychologists became aware of related developments in linguistics and artificial intelligence, so researchers in the latter disciplines became aware of pertinent work in psychology. Thus evolved the interdisciplinary movement called ‘cognitive science.’ In addition to psychology, AI, and linguistics, the fields of cultural anthropology and philosophy of mind also became involved. The movement eventuated in numerous interdisciplinary collaborations (e.g., Rumelhart et al. 1986), as well as in individual psychologists becoming more interdisciplinary.

3.2 Representational Issues

In the 1970s and early 1980s, cognitive science was much concerned with issues about mental representations. Whereas the memory-for-language models described earlier had assumed representations that were language-like, or propositional, other researchers argued that representations could also be imaginal, like a visual image. Shepard and Cooper (1972) provided evidence that people could mentally rotate their representations of objects, and Kosslyn (1980) surveyed numerous phenomena that further implicated visual imagery. In keeping with the interdisciplinariness of cognitive science, AI researchers and philosophers entered the debate about propositional versus imaginal representations (e.g., Block 1981, Pylyshyn 1981). In addition to questions about the modality of representations, there were concerns about the structure of representations. While it had long been assumed that propositional representations of objects were like definitions, researchers now proposed that the representations were prototypes of the objects, fitting some examples better than others (Tversky 1977, Mervis and Rosch 1981, Smith and Medin 1981). Again the issues sparked interest in disciplines other than psychology (e.g., Lakoff 1987).

The cognitive science movement affected most areas of cognitive psychology, ranging from object recognition (Marr 1982) to reasoning (e.g., Johnson-Laird 1983) to expertise in problem solving (e.g., Chase and Simon 1973). The movement continues to be influential and increasingly focuses on computational models of cognition. What has changed since its inception in the 1970s is the kind of computational model in favor.

4. Newer Directions: Connectionism and Cognitive Neuroscience

4.1 Connectionist Modeling

The computer models that dominated cognitive psychology from its inception used complex symbols as representations, and processed these representations in a rule-based fashion (for example, in a model of object recognition, the representation for a frog might consist of a conjunction of complex properties, and the rule for recognition might look something like ‘If it’s green, small, and croaks, it’s a frog’). Starting in the early 1980s, an alternative kind of cognitive model started to attract interest, namely ‘connectionist’ (or ‘parallel distributed processing’) models. These proposals have the form of neural networks, consisting of nodes (representations) that are densely interconnected, with the connections varying in strength (see the bottom of Fig. 1).

In 1981 Hinton and Anderson published a book surveying then-existent connectionist models (Hinton and Anderson 1981), and in the same year McClelland and Rumelhart (1981) presented a connectionist model of word recognition that explained a wide variety of experimental results. The floodgates had been opened,
and connectionist models of perception, memory, and language proliferated, to the point where they now dominate computational approaches to cognition. Why the great appeal? One frequently cited reason is neurological plausibility: the models are clearly closer to brain function than are traditional rule-based models. A second reason is that connectionist models permit parallel constraint satisfaction: different sources of activation can converge simultaneously on the same representations or response. A third reason is that connectionist models manifest graceful degradation: when the model is damaged, performance degrades slowly, much as is found in human neurological disorders (Rumelhart et al. 1986).

From a historical viewpoint, there is an ironic aspect about the ascendancy of connectionist models. Such models return to the pure associationism that characterized behaviorism. While connectionist models hardly fit with all behaviorist dictums (their representations are not restricted to stimuli and responses, and they routinely assume massively parallel processing), still their reliance on associations runs counter to Chomsky’s arguments that sheer associationism cannot explain language (Chomsky 1957, 1959). This issue has formed part of the basis for critiques of connectionist models (e.g., Fodor and Pylyshyn 1988). Newell (1990), one of the founders of traditional computational models, suggested a plausible resolution: lower-level cognitive processes like object recognition may be well modeled by connectionist models, but higher-level cognitive processes like reasoning and language may require traditional symbolic modeling.

4.2 Cognitive Neuroscience

The other major new direction in cognitive psychology is the growing interest in the neural bases of cognition, a movement referred to as ‘cognitive neuroscience.’ There had been little interest in biological work in the research that brought about the cognitive revolution. That early work was as much concerned with fighting behaviorism as it was with advancing cognitive psychology, and consequently much of the research focused on higher-level processes and was completely removed from anything going on in the neurobiology of its day. Subsequent generations of cognitive psychologists solidified their commitments to a purely cognitive level of analyses, by arguing that the distinction between cognitive and neural levels of analyses was analogous to that between computer software and hardware, and that cognitive psychology (and cognitive science) was concerned primarily with the software. Since the early 1990s, views about the importance of neural analyses have changed dramatically. There is a growing consensus that the standard information processing analyses of cognition can be substantially enlightened by knowing how cognition is implemented in the brain.

While many factors may have been responsible for the change in view, three will be mentioned here. First, the rise of connectionist models that were loosely inspired by brain function had the side effect of increasing interest in what was known in detail about brain function. Indeed, these two recent movements have become increasingly intertwined as connectionist models have increasingly incorporated findings from cognitive neuroscience. Second, in the 1970s and 1980s there were breakthroughs in systems-level neuroscience that had implications for mainstream cognitive psychology. As one example, research with neurological patients as well as with nonhuman species established that structures in the medial temporal lobe were essential for the formation of memories that could serve as the basis for recall (‘explicit memory’), but not for other kinds of ‘implicit memories’ (e.g., Schacter 1987, Squire 1992). As another example, research with nonhuman primates showed that two distinct systems were involved in the early stages of visual perception: a ‘where’ system that is responsible for spatial localization, and a ‘what’ system that is responsible for recognition of the object (e.g., Ungerleider and Mishkin 1982). Both of these discoveries had major implications for the cognitive level of analyses, e.g., computational models of object recognition or memory would do well to divide the labor into separate subsystems.

The third factor responsible for the rise of cognitive neuroscience is methodological: the development of neuroimaging techniques that produce maps of neural activity while the brain is performing some cognitive task. One major technique is positron emission tomography (PET). In 1988, Michael Posner, Marcus Raichle, and colleagues used PET in a groundbreaking experiment to localize neurally different subprocesses of reading: PET images obtained while participants looked at words showed activated regions in the posterior temporal cortex; images obtained while participants read the words revealed additional regions activated in the motor cortex; and images obtained while participants generated constrained associations to the words revealed still additional regions in frontal areas (Posner et al. 1988). PET analyses of object recognition, memory, and language soon followed. A more recent technique is functional magnetic resonance imaging (fMRI). It is now being used to study virtually every domain of human cognition.

5. Conclusion

This article has given short shrift to important contributions that tend to fall off the mainstream of cognitive psychology. One such case is the work done by Dan Kahneman and Amos Tversky (e.g., Kahneman and Tversky 1973, Tversky and Kahneman 1983) on the use of heuristics in decision making, which can result
in deviations from rational behavior. Another example is the cognitively inspired study of memory and language deficits in neurological patients (Shallice (1988) provides a review). There are other cases like these which deserve a prominent place in a fuller history of cognitive psychology.

Bibliography

Anderson J R 1976 Language, Memory, and Thought. Erlbaum, Hillsdale, NJ
Anderson J R, Bower G H 1973 Human Associative Memory. Winston, Washington, DC
Atkinson R C, Shiffrin R M 1968 Human memory: a proposed system and its control processes. In: Spence K, Spence J (eds.) The Psychology of Learning and Motivation. Academic Press, San Diego, CA, Vol. 2
Baddeley A D 1986 Working Memory. Oxford University Press, Oxford, UK
Block N J 1981 Imagery. MIT Press, Cambridge, MA
Broadbent D E 1958 Perception and Communication. Pergamon, New York
Bruner J S, Goodnow J J, Austin G A 1956 A Study of Thinking. Wiley, New York
Chase W G, Simon H A 1973 Perception in chess. Cognitive Psychology 4: 55–81
Chomsky N 1957 Syntactic Structures. Mouton, The Hague, The Netherlands
Chomsky N 1959 Review of B F Skinner’s Verbal Behavior. Language 35: 26–58
Collins A M, Quillian R M 1969 Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior 8: 240–7
Conrad R 1964 Acoustic confusions in immediate memory. British Journal of Psychology 55: 75–84
Craik F I, Lockhart R S 1972 Levels of processing: a framework for memory research. Journal of Verbal Learning and Verbal Behavior 11: 671–84
Crowder R G, Morton J 1969 Precategorical acoustic storage (PAS). Perception and Psychophysics 5: 365–73
Estes W K, Burke C J 1953 A theory of stimulus variability in learning. Psychological Review 60: 276–86
Fitts P M, Posner M I 1967 Human Performance. Brooks/Cole, Belmont, CA
Fodor J A 1975 The Language of Thought. Crowell, New York
Fodor J A, Bever T G, Garrett M F 1974 The Psychology of Language: an Introduction to Psycholinguistics and Generative Grammar. McGraw-Hill, New York
Fodor J A, Pylyshyn Z W 1988 Connectionism and cognitive architecture: a critical analysis. Cognition 28: 3–71
Garner W R 1962 Uncertainty and Structure as Psychological Concepts. Wiley, New York
Glanzer M, Cunitz A R 1966 Two storage mechanisms in free recall. Journal of Verbal Learning and Verbal Behavior 5: 351–60
Hilgard E R 1987 Psychology in America: a Historical Survey. Harcourt, Brace Jovanovich, San Diego, CA
Hinton G E, Anderson J A 1981 Parallel Models of Associative Memory. Erlbaum, Hillsdale, NJ
Johnson-Laird P N 1983 Mental Models: Towards a Cognitive Science of Language, Inference, and Consciousness. Cambridge University Press, Cambridge, MA
Kahneman D, Tversky A 1973 Psychology of prediction. Psychological Review 80: 237–51
Kessel F S, Bevan W 1985 Notes toward a history of cognitive psychology. In: Buxton C E (ed.) Points of View in the Modern History of Psychology. Academic Press, New York
Kintsch W 1974 The Representation of Meaning in Memory. Erlbaum, Hillsdale, NJ
Kintsch W, Buschke H 1969 Homophones and synonyms in short-term memory. Journal of Experimental Psychology 80: 403–7
Kosslyn S M 1980 Image and Mind. Harvard University Press, Cambridge, MA
Lakoff G 1987 Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. University of Chicago Press, Chicago
Luce R D 1959 Individual Choice Behavior: a Theoretical Analysis. Wiley, New York
Marr D 1982 Vision: a Computational Investigation into the Human Representation and Processing of Visual Information. Freeman, San Francisco
McClelland J L, Rumelhart D E 1981 An interactive activation model of context effects in letter perception: I. An account of basic findings. Psychological Review 88: 375–407
Miller G A 1956 The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review 63: 81–97
Miller G A et al. 1960 Plans and the Structure of Behavior. Holt, New York
Miller G A 1962 Some psychological studies of grammar. American Psychologist 17: 748–62
Murdock Jr B B 1962 The serial position effect in the recall. Journal of Experimental Psychology 64: 482–8
Neisser U 1964 Visual Search. Scientific American 210: 94–102
Neisser U 1967 Cognitive Psychology. Appleton-Century-Crofts, New York
Neisser U 1968 The processes of vision. Scientific American 219: 204–14
Newell A 1990 Unified Theories of Cognition. Harvard University Press, Cambridge, MA
Newell A, Shaw J C, Simon H A 1958 Elements of a theory of human problem solving. Psychological Review 65: 151–66
Newell A, Simon H A 1972 Human Problem Solving. Prentice-Hall, Englewood Cliffs, NJ
Norman D A, Rumelhart D E, and the LNR Research Group 1975 Explorations in Cognition. Freeman, San Francisco
Posner M I, Petersen S E, Fox P T, Raichle M E 1988 Localization of cognitive operations in the human brain. Science 240: 1627–31
Pylyshyn Z W 1981 The imagery debate: analogue media versus tacit knowledge. Psychological Review 88: 16–45
Rosch E, Mervis C B 1981 Categorization of natural objects. Annual Review of Psychology 32: 89–115
Rumelhart D E, Hinton G E, Williams R J 1986 Learning representations by back-propagating errors. Nature 323: 533–6
Rumelhart D E, McClelland J L, PDP Research Group 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, Cambridge, MA
Schacter D L 1987 Implicit memory: history and current status. Journal of Experimental Psychology: Learning, Memory, and Cognition 13: 501–18
Schank R C, Abelson R P 1977 Scripts, Plans, Goals and Understanding: an Inquiry into Human Knowledge Structures. Erlbaum, Hillsdale, NJ
Schneider W, Shiffrin R M 1977 Controlled and automatic human information processing: I. Detection, search, and attention. Psychological Review 84: 1–66
Selfridge O G 1959 Pandemonium: A Paradigm for Learning. The Mechanization of Thought Processes. Her Majesty’s Stationery Office, London
Selfridge O G, Neisser U 1960 Pattern recognition by machine. Scientific American 203: 60–8
Shallice T 1988 From Neuropsychology to Mental Structure. Cambridge University Press, Cambridge, UK
Shannon C E 1948 A mathematical theory of communication. Bell System Technical Journal 27: 379–423, 623–56
Shannon C E, Weaver W 1949 The Mathematical Theory of Communication. University of Illinois Press, Urbana, IL
Shepard R N, Cooper L A 1972 Mental Images and their Transformations. MIT Press, Cambridge, MA
Skinner B F 1957 Verbal Behavior. Appleton-Century-Crofts, New York
Smith E E, Medin D L 1981 Categories and Concepts. Harvard University Press, Cambridge, MA
Sperling G 1960 The information available in brief visual presentations. Psychological Monographs 77(3, no. 478)
Squire L R 1992 Memory and the hippocampus: A synthesis from findings with rats, monkeys, and humans. Psychological Review 99(2): 195–231
Sternberg S 1966 High-speed scanning in human memory. Science 153: 652–4
Sternberg S 1969 The discovery of processing stages: extensions of Donders’ method. Acta Psychologica 30: 276–315
Treisman A M 1960 Contextual cues in selective listening. Quarterly Journal of Experimental Psychology 12: 242–8
Tulving E, Pearlstone Z 1966 Availability versus accessibility of information in memory for words. Journal of Verbal Learning and Verbal Behavior 5: 381–91
Turing A M 1950 Computing machinery and intelligence. Mind 59: 433–60
Tversky A 1977 Features of similarity. Psychological Review 84: 327–52
Tversky A, Kahneman D 1983 Extensional versus intuitive reasoning: the conjunction fallacy in probability judgment. Psychological Review 90: 293–315
Ungerleider L G, Mishkin M 1982 Two cortical visual systems. In: Engle D J, Goodale M A, Mansfield R J (eds.) Analysis of Visual Behavior. MIT Press, Cambridge, MA, pp. 549–86
Waugh N C, Norman D A 1965 Primary memory. Psychological Review 72: 89–104
Wickelgren W A 1965 Acoustic similarity and intrusion errors in short-term memory. Journal of Experimental Psychology 70: 102–8

E. E. Smith

Cognitive Psychology: Overview

1. Meanings of the Term ‘Cognitive Psychology’

Cognitive Psychology has at least three different meanings. First, the term refers to ‘a simple collection of topic areas,’ that is, of behaviorally observable or theoretically proposed phenomena that are studied within the boundaries of the field of Cognitive Psychology. Second, the term alludes to the fact that cognitive psychologists attempt to explain intelligent human behavior by reference to a cognitive system that intervenes between environmental input and behavior. The second meaning of Cognitive Psychology thus refers to a set of assumptions governing the operations of the proposed cognitive system. Third, Cognitive Psychology means a particular methodological approach to studying, that is, to empirically addressing potential explanations of human behavior. The two latter meanings of Cognitive Psychology are discussed in some depth below, after a very brief consideration of the scope of modern Cognitive Psychology and its historical roots.

2. The Scope of Cognitive Psychology

At present, Cognitive Psychology is a broad field concerned with many different topic areas, such as, for instance, human memory, perception, attention, pattern recognition, consciousness, neuroscience, representation of knowledge, cognitive development, language, and thinking. The common denominator of these phenomena appears to be that all of the phenomena reflect the operation of ‘intelligence’ in one way or another, at least if intelligence is broadly defined as skill of an individual to act purposefully, think rationally, and interact efficiently with the environment. Thus, at a general level, Cognitive Psychology is concerned with explaining the structure and mental operations of intelligence as well as its behavioral manifestations.

3. Historical Roots of Cognitive Psychology

3.1 The Term Cognitive Psychology

The term Cognitive Psychology (Latin: cognoscere; Greek: gignoskein = to know, perceive) is rather young. Although we do find the related term ‘cognition’ mentioned occasionally in the psychologies of the late nineteenth and early twentieth century (e.g., James, Spence, Wundt) where it denoted the basic elements of consciousness and their combinations, the present meanings of the term cognitive psychology owe little to the early theoretical and philosophical considerations of the human mind. Rather, the current modern meanings of the term owe much more to (a) the fact that the study of cognition emerged in opposition to the prevailing behavioristic view in the 1940s and 1950s that was trying to explain human behavior primarily in terms of its antecedent environmental conditions, and to (b) the availability of both new theoretical concepts (e.g., information theory [Shannon], cybernetics [Wiener], systems theory
[von Bertalanffy]) and practical computing machines that offered new insights into the potential nature of the mental device intervening between the outside world and human behavior.

Accordingly, the history of Cognitive Psychology can be parsed into four distinct periods: philosophical, early experimental, the cognitive revolution, and modern cognitive psychology. Early philosophy (ancient Egyptians, Greek philosophers, British empiricists) provided a context for understanding the mind and its processes (i.e., associations), and identified many of the major theoretical issues that were later studied empirically (e.g., how does perception work? how are concepts represented?). Early experimental work began, at the latest, during the middle part of the nineteenth century, when Fechner empirically studied the relation between stimulus properties (e.g., weight) and accompanying internal sensations, and when, in 1879, Wilhelm Wundt founded the first department of psychology in Leipzig, Germany. The early experimental phase of Cognitive Psychology was in full swing when, during the early 1900s, Donders and Cattell conducted perception experiments on imageless thought, and Frederic Bartlett investigated memory from a naturalistic viewpoint.

4. Modern Cognitive Psychology

In the mid-1950s and early 1960s, Cognitive Psychology experienced a renaissance. Cognitive Psychology, a textbook that systematized the re-emerged science, was written by Neisser and was published in the United States (1967). Neisser’s book was central to the solidification of Cognitive Psychology as it gave a label to the field and defined the topical areas. Neisser used the computer metaphor to capture the selection, storage, reception, and manipulation of information in the human cognitive system. In 1966, Hilgard and Bower added a chapter to their book Theories of Learning in which the idea of using computer programs to serve as models of theories of cognition was developed.

The 1970s saw the emergence of professional journals devoted to Cognitive Psychology, such as Cognitive Psychology, Cognition, Memory & Cognition, and a series of symposia volumes, including the Loyola Symposium on Cognition and the Carnegie–Mellon Series. Cognitive laboratories were built, symposia and conferences appeared at both national and international levels, courses in Cognitive Psychology were added to the curricula, textbooks written on the topic, and professors of Cognitive Psychology hired.

In the 1980s and 1990s, serious efforts to discover the neural components that are linked to specific cognitive constructs began, and the field underwent transformations due to major changes in computer technology and brain science. As a result, Cognitive Psychology converged with computer science and neuroscience to create a new discipline called ‘Cognitive Science.’

Even more recently, with the advent of new ways to see the brain at work (e.g., functional magnetic resonance imaging [fMRI], positron emission tomography [PET], electroencephalogram [EEG]), cognitive psychologists have expanded their operations to neuroscience, in the hope of being able to empirically localize the components of the brain that are involved in specific operations of the cognitive system.

Figure 1
A basic information-processing system

5. The Properties of the Cognitive System: The Computer Metaphor

Currently, the dominant metaphor underlying theoretical and empirical research in Cognitive Psychology is the computer metaphor. According to the computer metaphor, the cognitive system of humans, that is, the device intervening between environmental input and behavior, can be understood best in analogy to an information-processing framework. A basic information-processing system (see Fig. 1) contains two basic components, a memory and a processor, that interact with each other. In addition, the processor interacts with the environment through receptors and effectors. Newell and Simon argue that any physical-symbol system, such as an information-processing system, has the necessary and sufficient means to generate intelligent action (physical symbol systems hypothesis).

In layman’s terms, the information-processing framework formally described by Newell and Simon (1972, pp. 20–21) can be said to be based on seven basic ideas (Lachman et al. 1979): (a) humans are viewed as autonomous, intentional beings who interact with the external world; (b) the cognitive system is a general-purpose, symbol-processing system; (c) there exists a fundamental distinction between processes and data (i.e., memories). Data are acted on by processes that
manipulate and transform data; (d) cognitive processes take time, such that predictions about response times can be made if it is assumed that processes occur in sequence and have specifiable complexity; (e) the cognitive system is a limited-capacity processor that has both structural and resource limitations; (f) the cognitive system depends on, but is not entirely constrained by, a neurological substrate; (g) the goal of psychological research is to specify the processes and representations underlying intelligent performance on cognitive tasks.

6. Current Themes of the Computer Metaphor

The idea that a human cognitive system can be viewed as an information-processing device has had a dramatic impact on both theoretical and empirical research on the functioning of the human mind. A few select current themes in Cognitive Psychology reflecting this approach are the following.

6.1 Data-driven or Conceptually Driven Processes?

A data-driven mental process is one that relies almost exclusively on the ‘data,’ that is, on the stimulus information being presented in the environment. Whereas data-driven processes are assisted very little by already known information, conceptually driven processes are those that rely heavily on such information. Thus, a conceptually driven process uses the information already in memory, and whatever expectations are present in the situation, to perform a task; data-driven processes use only the stimulus information.

The distinction between data-driven and conceptually driven processes has been studied intensely in, among others, the area of pattern recognition. Models of perception attempt to explain, in large part, how patterns are recognized. Early models assumed that this process was primarily data-driven. However, the results of more recent research suggest that pattern recognition is also influenced by top-down conceptual processes.

6.2 Attention

Attention is often assumed to be a critical mental resource that is necessary for the operation of any mental process. Most theories that discuss attention assume that it is a limited mental resource and that the amount of attention that is available determines how many separate processes can be simultaneously performed.

One of the perhaps most interesting problems surrounding the role of attention has been concerned with the question of whether attentional selection occurs early or late within the cognitive system. Much of the early research on this topic focused on the extent to which unattended stimuli are processed. Early selection models hold that selection occurs at a relatively early level in the cognitive system, that is, before meaning has been extracted. Initial support for this notion came from dichotic listening tasks in which listeners had to verbally repeat information presented to one ear while different (or the same) information was simultaneously presented to the other ear. Results suggested that little of the information presented to the unattended ear was noticed. However, it was soon realized that attentional selection was not an all-or-none phenomenon. Thus, at least under some circumstances unattended information seems to reach higher levels of the cognitive system.

The topic of attentional selection has received rather widespread interest not only in studies with normal adults but also with neuropsychological samples because in some patient populations (e.g., attention-deficit disorder, schizophrenia), the attentional selection system appears to be impaired.

6.3 Separate or Unitary Memory Systems?

The debate over memory types has a long history in Cognitive Psychology. For example, in the 1960s Atkinson and Shiffrin introduced an information-processing model containing sensory, short-term, and long-term memory stores. Craik and Lockhart, a few years later, advanced a unitary view of memory. More recently, distinctions have been made between declarative/explicit (intentionally recollecting earlier experiences) and procedural/implicit (nonintentional influences from earlier exposure) memory systems, that, in turn, have been challenged by the argument that many apparent dissociations can be accommodated when one considers the match between encoding operations and retrieval operations (transfer-appropriate processing). At present, it is unclear how many distinct memory systems exist in the human cognitive system. However, recent studies with amnesic patients and brain-imaging studies seem to suggest that memory may not be unitary.

Figure 2
The basic architecture of ACT* (Anderson 1983). Labels in the figure: Application; Declarative Memory; Production Memory; Storage; Match; Execution; Retrieval; Working Memory; Encoding; Performances; Outside World.

6.4 The Nature of the Cognitie Architecture


Cognitive architectures specify the permanent proper-
ties of the human cognitive system, akin to the
hardware of a modern computer. Recent proposals
sketch the human cognitive architecture in a much
more fine-grained and detailed manner than was
apparent in Newell and Simon’s earlier proposal of a
basic information-processing device.
Fig. 2 depicts the basics of a cognitive architecture,
that has been proposed by Anderson (1983), called Figure 3
ACT* (adaptive control of thought). In Anderson’s A basic connectionist model
architecture, the existence of a long-term declarative
memory for basic facts that are connected to each have been proposed. First, some theorists have argued
other in a semantic net is assumed. In addition, that the human cognitive system might be better
Anderson proposes a second long-term, procedural, understood in terms of a brain metaphor, assuming
memory that consists of productions. Each production that cognitive systems consist of elementary, neuron-
has a set of conditions that test elements of working like units that are connected and produce behavior as
memory and a set of actions that create new structures a whole. Second, at least some areas within Cognitive
in working memory. Strengths are associated with Psychology have adopted an ecological, or context
each long-term memory element (both network nodes metaphor, arguing that cognitive systems need to be
and productions) as a function of its use. Working understood in terms of organism-environment rela-
memory itself is activation based and contains the tions.
activated portion of declarative memory plus declara-
tive structures generated by production firings and
perception.
Activation spreads automatically (as a function of 7.1 The Brain Metaphor
node strength) through working memory and from According to the brain metaphor, human cognition is
there to other connected nodes in declarative memory. best understood in terms of the properties of the brain.
Activation, along with production strength, deter- The brain metaphor, and more specifically, so-called
mines how fast the matching of production proceeds. neural-like, connectionist networks as computational
Selection of productions to fire is a competitive process implementations of how our brain might work, have
between productions matching the same data. become highly popular in recent years and have
New productions are created by compiling the seriously challenged the premier status of the com-
effects of a sequence of production firings and re- puter metaphor when it comes to theorizing about the
trievals from declarative memory. Whenever a new nature of human cognition.
element is created in working memory, there is a fixed Connectionist networks, neural networks, or par-
probability that it will be stored in declarative memory. allel distributed processing models as they are vari-
Cognitive architectures like ACT* are important ously called, differ from theories based on the com-
not only in their own rights, that is, because they are puter metaphor in various respects. For example, in
theories of the structural and processing components theories adhering to the computer metaphor, all
of the human cognitive system, but also because they processes assumed to underlie human behavior need
set important constraints for more specific theories to be explicitly described. Connectionist networks, on
that address more local characteristics of the cognitive the other hand, can to some extent ‘program’ them-
system. selves in that they can learn to produce specific outputs
when certain inputs are given to them. Furthermore,
connectionist theorists often reject the use of explicit
7. Recent Challenges to the Computer Metaphor rules and symbols and use distributed representations,
in which concepts are characterized as patterns of
In recent years, an increasing number of theorists have activation in a network.
come to reject the view that the human cognitive Current connectionist networks typically have the
system operates like a computer. Two new metaphors following characteristics:


(a) the network consists of elementary or neuron-like units or nodes that are connected to each other such that a single unit has many links to other units;
(b) units affect other units by exciting or inhibiting them;
(c) each unit usually takes the weighted sum of all input links and produces a single output to another unit if the weighted sum exceeds some threshold value;
(d) the network as a whole is characterized by the properties of its units, by the manner in which the units are connected to each other, and by the algorithms or rules used to change the strength of connections among units;
(e) networks can have different structures of layers; they can have a layer of input units, intermediate layers (of so-called ‘hidden units’), and a layer of output units;
(f) a representation of a concept is stored in a distributed manner by a pattern of activation throughout the network;
(g) the same network can store many different patterns without them necessarily interfering with each other;
(h) one algorithm or rule used in networks to permit learning to occur is known as backward propagation of errors.

How do individual units act when activation impinges on them? Any given unit can be connected to several other units (see Fig. 3). Each of these other units can send an excitatory or an inhibitory signal to the first unit. This unit generally takes a weighted sum of all these inputs. If the sum exceeds some threshold, then the unit produces an output that may feed into other units.

This type of network can model cognitive behavior without recourse to the kinds of explicit rules found in the domain of the computer metaphor. The networks do so by associating various inputs with certain outputs, and by storing patterns of activation in the network. The networks typically make use of several layers to deal with complex behavior. One layer consists of input units that encode a stimulus as a pattern of activation. Another layer is an output layer that produces some response, again as a pattern of activation. When the network has learned to produce a particular response at the output layer following the presentation of a particular stimulus at the input layer, it can exhibit behavior that very much looks like a rule being applied.

One of the most critical aspects of connectionist networks is the learning rule or algorithm used to form patterns of activation. One algorithm that has been used to permit connectionist networks to learn is called backward propagation. At the beginning of a learning episode, the network is set up with random connection weights among the units. During the early stages of learning, when the input pattern has been presented, the output units often produce a response that is not the required output pattern. Backward propagation compares this imperfect pattern with the known required response, noticing the differences. It then propagates the resulting error backward through the network, adjusting the connection weights so that the units will tend to produce the required pattern on the next learning cycle. This process is repeated with a particular stimulus pattern until the network produces the required response pattern.

Networks have been used to produce very interesting results. For example, Sejnowski and Rosenberg produced a connectionist network called NETtalk that takes an English text as its input and produces reasonable English speech as its output. Thus, the network appears to have learned the ‘rules of English pronunciation’ but has done so without requiring explicit rules that combine and encode sounds in various ways.

7.2 The Ecological Metaphor

Ecological psychology focuses specifically on the interdependencies of humans and their environments, which typically are studied under real-world conditions rather than in the laboratory. The approach has given rise to a variety of very different approaches and lines of research, only two of which will be briefly considered. One approach is concerned with explaining and understanding perception, and can be traced to E. Brunswik and, more recently, J. J. Gibson; the other approach is concerned with social behavior and appears to go back to K. Lewin and R. Barker.

Although the two lines of research share the ecological perspective of examining functional adaptations of organisms to their environment, they are concerned with different issues and employ different methods. The two theoretical lines are much less explicit about the actual properties of the assumed cognitive system that intervenes between perception and action than are the computer and brain metaphors discussed above, and both have much less influence in contemporary Cognitive Psychology than do their two contenders. Nevertheless, they represent a distinctly different understanding of how the cognitive system might function and belong, at least in part, to the realm of Cognitive Psychology.

7.2.1 Gibson’s ecological psychology. Following classical ecological theory, Gibson regards organism and environment to be an inseparable pair. A critical feature of this conception is that environment is not defined independently of organisms, nor are organisms defined independently of environments.

Gibson considers the first task of ecological psychology to be an adequate description of the environment. Environment consists of a medium, substances, and surfaces separating substance from medium. In a successful adaptation, organisms need to


perceive which aspects of surface, substance, and medium persist and which aspects change in regard to specific environmental events.

The ecological approach to visual perception assumes that senses represent evolved adaptations to an organism’s environment. These adaptations develop in relation to environmental factors contributing to an organism’s survival. Evolutionary success requires sensory systems that directly and accurately depict the environment. The key stimulus features that contribute to an organism’s survival, which Gibson termed ‘affordances,’ are invariant. Affordances differ according to situation and species and are perceived directly from the pattern of stimulation arising from them. They do not change as the needs of observers change; affordances have both objective and subjective properties, becoming a fact of the environment and a fact of behavior.

Gibson’s approach is sometimes referred to as the theory of direct perception, and has received empirical support from, among others, research by E. J. Gibson, studying the development of perceived invariance in infancy. E. J. Gibson’s research, for instance, has demonstrated that one of the properties of perceptual learning and development does appear to be the increasing ability to extract information about the permanent properties of objects.

7.2.2 Barker’s ecological psychology. According to R. Barker, behavior should be studied without outside manipulation or imposition of structure. Rather than contrive artificial settings, Barker advocated the study of behavioral settings that already exist, using methodology that exerts minimal influence upon the situation.

To collect systematic records of behavior in natural contexts, Barker argued for the establishment of so-called field stations, established organizational units that were to continue over time and whose staff included both continuing and visiting researchers. Barker and co-workers established the Midwest Psychological Field Station in Oskaloosa, Kansas, where for a period of 24 years detailed systematic records were kept of community life. Observers were stationed throughout the town, and recorded everyday activities of children. Barker concluded that the behavior of a child could often be predicted more accurately from knowing the situation the child was in, than from knowing individual characteristics of the child.

Barker’s conception of ecological psychology rests on several assumptions: (a) human behavior must be studied at a level that recognizes the complexity of systems of relations linking individuals and groups with their social and physical environments; (b) environment–behavioral systems have properties that develop over long periods of time; (c) change in one part of the system is likely to affect other parts of the system; and (d) the challenge of ecological psychology is to obtain sufficient understanding to be able to predict and control the effects of planned and unplanned interventions.

Barker’s conception of ecological psychology has been extended and refined in recent years (e.g., J. Barker, K. Fox, A. Wicker). For example, Fox has linked it to social accounting theory, showing how data from large-scale inventories of behavior settings can reveal changes in the quality of life in communities.

From the brief description of ecological psychology above, it should be clear that despite the fact that the ecological metaphor has more than some intuitive appeal to it, both its level of precision and its scope are far smaller than those of the computer and the brain metaphors. Consequently, the ecological metaphor plays a very minor role in current Cognitive Psychology.

8. Research Methods in Cognitive Psychology

Cognitive psychologists rely heavily on the experimental method, in which independent variables are manipulated and dependent variables are measured to provide insights into the specifics of the underlying cognitive system. To statistically evaluate the results from experiments, Cognitive Psychology relies on standard hypothesis testing, along with inferential statistics (e.g., analyses of variance).

The research methods Cognitive Psychology utilizes depend, in part, on the area of study and consist primarily of chronometric methods, memory methods, cross-population studies, case studies, measures of brain activity, and computational modeling.

8.1 Chronometric Methods

Beginning with early work by Donders (1868/1969), cognitive psychologists have used reaction times to measure the speed of mental operations. Donders developed the so-called subtractive method. For example, task A might be assumed to require Process 1, whereas task B might require Processes 1 and 2. Donders assumed that cognitive operations are independent of each other and are processed in a strictly serial manner. Thus, the duration of Process 2 can be estimated by subtracting the reaction time for task A from the reaction time for task B.

To circumvent some of the restrictive assumptions of the subtractive method, Saul Sternberg introduced an additive factors logic. According to this logic, if a task contains distinct processes, then there should be variables that selectively influence the speed of each process. Thus, if two variables influence different processes, their effects should be statistically additive. By contrast, if two variables affect the same process, their effects should statistically interact.

More recently, researchers have empirically identified so-called cascaded systems in which neither the


assumptions of Donders nor of Sternberg hold because mental processes occur simultaneously at different information-processing levels. New measures have been developed that combine reaction time measurement with the measurement of other properties of the cognitive system (e.g., speed-accuracy tradeoff functions, eye-tracking methods).

8.2 Memory Methods

One of the first to experimentally study human memory was Hermann Ebbinghaus, who developed the savings technique to assess retention of nonsense syllables. Retention was measured in terms of the number of trials necessary to relearn a list of syllables relative to the number of trials necessary to learn the list for the first time.

More recently, researchers have begun to distinguish between three different aspects of memory (encoding, retention, and retrieval) and have developed methods to study the three aspects in isolation. For example, one way of investigating encoding processes is to manipulate humans’ expectancies by way of intentional vs. incidental learning instructions. By contrast, retrieval processes are often studied in one of two general ways. On an explicit memory test, participants are presented a list of materials and at some later point in time are given a test in which they are asked to retrieve the earlier presented material. Retrieval is measured in terms of recall, recognition, or cued recall. On implicit memory tests, participants are not directly asked to recollect an earlier episode, but rather are asked to engage in a task where performance might benefit from earlier exposure to the stimulus items. Interestingly, recent research has demonstrated that some neuropsychological populations (e.g., amnesic patients) can be unimpaired on implicit memory tests, but can show considerable impairment in explicit memory tests.

8.3 Case Studies

Although relatively rarely used in Cognitive Psychology, single case studies can provide vital information on how the cognitive system may be structured and which specific processes might be necessary to complete specific tasks. The classic case of HM, who, as a consequence of an earlier epilepsy operation, acquired severe memory loss on explicit tasks though not implicit tasks, might serve as a striking example arguing for the dissociation of different memory structures. Case studies can provide rather convincing constraints for cognitive psychologists’ understanding of the architecture of human cognition.

8.4 Cross-population Studies

Cognitive Psychology relies heavily on college students as its research participants. Recently, however, there has been an increasing interest in comparing both the structure and the processes of the cognitive system across distinct populations. For example, studies of cognition from early childhood to older adulthood attempt to trace developmental changes in specific mental operations, such as speed of processing and memory. In addition, studies with special clinical populations are conducted in order to understand breakdowns in mental functioning, such as those that occur in Alzheimer’s disease, schizophrenia, or amnesia.

8.5 Measures of Brain Activity

In recent years a variety of techniques for measuring some correlates of mental activity in the brain (e.g., evoked potentials, positron emission tomography [PET], functional magnetic resonance imaging [fMRI]) have become available. Evoked potentials measure the electrical activity of systems of neurons; PET measures blood flow. Because the measures differ widely in their invasiveness and in terms of their temporal and spatial resolutions, a combination of these methods, together with behavioral measurements (e.g., reaction time, accuracy), appears to be an extremely promising candidate for increasing understanding of the interaction between Neuropsychology and Cognitive Psychology.

8.6 Computational Modeling

Since the early research of Newell and Simon on the General Problem Solver, theoretical assumptions concerning both the structure of, and the processing within, the cognitive system have been tested by implementing the assumptions in running computer programs. Recently, the modeling has been of two different varieties, connectionist versus symbolic modeling (see above). For example, in order to model the processes underlying human language learning, the symbolic modeling approach assumes that humans acquire a set of rules that specify how language constituents can be combined within a language and that can be specified in a running program. Alternatively, the cognitive system might acquire the ‘rules of language’ without directly specifying these rules as symbols at all, that is, within a distributed representational system (connectionist theory). At present, the controversy surrounding these two alternative modeling accounts is far from settled, and, importantly, reflects a fundamental issue regarding the nature of the human cognitive system that was addressed in more detail above (computer metaphor vs. brain metaphor).

9. The Future of Cognitive Psychology

If the past is a good predictor of the future, then the future of Cognitive Psychology is difficult to predict.


Most likely, however, neither the theoretical scope nor the empirical methods of the field are going to change dramatically. If there is disagreement among scientists, it will concern the nature of the mental system intervening between environmental input and behavior. The most desirable future scenario is perhaps one in which the three main metaphors (i.e., computer, brain, ecological) will be integrated into a coherent one. Because the three metaphors deal with distinct levels of the human mind, this scenario is perhaps not an unlikely, remote theoretical possibility.

See also: Brain Development, Ontogenetic Neurobiology of; Cognitive Neuroscience; Cognitive Psychology: History; Cognitive Science: History; Cognitive Science: Overview; Cognitive Science: Philosophical Aspects; Experimentation in Psychology, History of; Human Cognition, Evolution of; Information Processing Architectures: Fundamental Issues; Intelligence, Evolution of; Intelligence: Historical and Conceptual Perspectives; Mathematical Psychology; Problem Solving and Reasoning, Psychology of; Psychology: Historical and Cultural Perspectives; Psychology: Overview

Bibliography

Anderson J R 1983 The Architecture of Cognition. Harvard University Press, Cambridge, MA
Anderson J R 1995 Cognitive Psychology and its Implications. Freeman, New York
Broadbent D E 1958 Perception and Communication. Pergamon, New York
Chomsky N 1959 Review of Skinner’s verbal behavior. Language 35: 26–58
Fodor J A 1983 The Modularity of Mind. MIT Press/Bradford Books, Cambridge, MA
Fodor J A, Pylyshyn Z W 1988 Connectionism and cognitive architecture: A critical analysis. Cognition 28: 3–71
Gardner H 1985 The Mind’s New Science: A History of the Cognitive Revolution. Basic Books, New York
Gibson J J 1979 The Ecological Approach to Visual Perception. Houghton Mifflin, Boston
Lachman R, Lachman J L, Butterfield E C 1979 Cognitive Psychology and Information Processing. Erlbaum, Hillsdale, NJ
McClelland J L, Rumelhart D E (eds.) 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 2. MIT Press/Bradford Books, Cambridge, MA
Miller G A, Galanter E, Pribram K H 1960 Plans and the Structure of Behavior. Holt, Rinehart & Winston, New York
Minsky M 1985 The Society of Mind. Simon and Schuster, New York
Neisser U 1967 Cognitive Psychology. Appleton-Century-Crofts, New York
Newell A 1980 Physical symbol systems. Cognitive Science 4: 135–83
Newell A 1991 Unified Theories of Cognition. Cambridge University Press, Cambridge, MA
Newell A, Simon H A 1972 Human Problem Solving. Prentice-Hall, Englewood Cliffs, NJ
Pylyshyn Z W 1984 Computation and Cognition. MIT Press, Cambridge, MA
Rumelhart D E, McClelland J L (eds.) 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1. MIT Press/Bradford Books, Cambridge, MA
Simon H A 1969 The Sciences of the Artificial. MIT Press, Cambridge, MA

P. A. Frensch

Cognitive Science: History

The roots of cognitive science extend back far in intellectual history, but its genesis as a collaborative endeavor of psychology, computer science, neuroscience, linguistics, and related fields lies in the 1950s. Its first major institutions (a journal and society) were established in the late 1970s. This history describes relevant developments within each field and traces collaboration between the fields in the last half of the twentieth century.

A key contributor to the emergence of cognitive science, psychologist George Miller, dates its birth to September 11, 1956, the second day of a Symposium on Information Theory at MIT. Computer scientists Allen Newell and Herbert Simon, linguist Noam Chomsky, and Miller himself presented work that would turn each of their fields in a more cognitive direction. Miller left the symposium ‘with a strong conviction, more intuitive than rational, that human experimental psychology, theoretical linguistics, and the computer simulation of cognitive processes were all pieces from a larger whole, and that the future would see a progressive elaboration and coordination of their shared concerns’ (Miller 1979).

This early conference illustrates an enduring feature of cognitive science: it is not a discipline in its own right, but a multidisciplinary endeavor. Although a few departments of cognitive science have been created at universities in subsequent decades, most of its practitioners are educated and spend their careers in departments of the contributing disciplines. The relative prominence of these disciplines has varied over the years. Computer science and psychology have played a strong role throughout. Neuroscience initially was strong, but in the years immediately following the 1956 conference its role declined as that of linguistics dramatically increased. By the 1970s, such disciplines as philosophy, sociology, and anthropology were making distinctive contributions. Recently, with the emergence of cognitive neuroscience, neuroscience has once again become a central contributor.


1. Intellectual Ancestors of Cognitive Science

1.1 Artificial Intelligence

One of the central inspirations for cognitive science was the development of computational models of cognitive performance, which bring together two ideas. First, conceiving of thought as computation was an offshoot of the development of modern logic. In his 1854 book, The Laws of Thought, the British mathematician George Boole demonstrated that formal operations performed on sets corresponded to logical operators (and, or, not) applied to propositions; Boole proposed that these could serve as laws of thought. Second, conceiving of computers as devices for computation can be traced back to Charles Babbage’s plans in the 1840s for an ‘analytical engine’ and his collaboration with Lady Lovelace (Ada Augusta Byron) in developing ideas for programming the device. These ideas gained new life in the 1930s and 1940s with the development of automata theory (especially the Turing machine), cybernetics (centered on Norbert Wiener’s feedback loops), designs for implementing Boolean operations via electric on/off switches (Claude Shannon), and information theory (also Shannon). Implementation became possible with the invention of electrical circuits, vacuum tubes, and transistors and was put on a fast track by World War II (ENIAC was completed at the University of Pennsylvania in 1946). By the mid-1950s, Newell and Simon (at RAND and then Carnegie-Mellon) produced the first functioning program for reasoning, a theorem-prover called Logic Theorist, and the first list-processing language, IPL. Meanwhile, John McCarthy and Marvin Minsky at MIT were developing a broad-based agenda for the field they named artificial intelligence (AI), and more specialized endeavors also got underway (e.g., machine translation of languages and chess-playing programs, neither truly successful until the 1990s).

1.2 Psychology

During the same period psychology began emerging from a long domination by behaviorism, especially in North America. Behaviorism had the lasting impact of focusing experimental psychology on explaining behavior and relying on behavior as its primary source of evidence. Radical behaviorists, such as B. F. Skinner, actively opposed positing internal processes and focused on what was observable: describing how behavioral responses changed with contingencies of reinforcement. Other behaviorists, such as Clark Hull, were willing to posit variables intervening between stimulus and response, such as drive, but emphasized doing so in the context of developing a mathematico-deductive theory accounting for behavior. Edward Tolman, an atypical behaviorist, went so far as to propose that rats navigate their environments by constructing cognitive maps. Most psychologists, however, took learning rather than cognition as their domain of concern. The sentiment was well captured by George Mandler, looking back on his graduate student days at Yale: ‘… cognition was a dirty word for us … because cognitive psychologists were seen as fuzzy, hand-waving, imprecise people who never really did anything that was testable’ (quoted in Baars 1986).

Although behaviorism cast a broad shadow in the US, alternatives which later influenced cognitive science thrived elsewhere: Continental Europe (especially Jean Piaget’s genetic epistemology), the UK (e.g., Sir Frederic Bartlett’s appeal to schemas to explain memory distortions and Donald Broadbent’s analyses of memory and attention), Germany and Austria (Gestalt psychology), and the Soviet Union (Lev Vygotsky and Alexander Luria). Within the US, psychophysics (and to some extent developmental, social, and clinical psychology) functioned largely outside behaviorism’s influence. Several psychologists who later pioneered a more cognitive approach, including Miller, Ulric Neisser, and Donald Norman, received their training in S. S. Stevens’s Psychoacoustic Laboratory at Harvard. Miller earned his Ph.D. in 1946 for research on optimal signals for spot jamming of speech. Just 10 years later, he was talking about the structure of internal information processing systems, such as the ‘seven plus or minus two’ limitation in such domains as short-term memory. In 1960, Miller, Eugene Galanter, and Karl Pribram broke new ground in their Plans and the Structure of Behavior. In the same time frame, Miller’s Harvard colleague Jerome Bruner was pioneering several strands of cognitive research: he showed that internal mental states influence perception and, arguing that categories are central to thought, Bruner, Goodnow, and Austin traced how people acquire them in their 1956 book, A Study of Thinking.

1.3 Neuroscience

Research into the brain was long thought to be relevant to understanding mental processes. One line of research focused on deficits stemming from brain lesions, such as Broca’s classic nineteenth-century work linking articulate speech to what is now called Broca’s area. Although a holistic tradition in early twentieth-century brain research temporarily turned investigators away from localization studies, Norman Geschwind and others gave new life to this approach in the 1950s. As well, improvements in electrophysiological techniques, including brain stimulation, single cell recording, and EEG recording, provided additional clues. As evidenced by a 1948 conference, ‘Cerebral Mechanisms in Behavior,’ there was eager engagement at the time between neurophysiologists, biologically oriented psychologists, and computer scientists.

One of the fruitful products of this engagement in the 1940s to 1960s was the development of neural networks, a kind of computational modeling pioneered by neurophysiologist Warren McCulloch and logician Walter Pitts. Donald Hebb proposed to build cell-assemblies by strengthening connections between neurons that fired simultaneously, a technique still in use. Oliver Selfridge had layers of units competing in parallel to recognize patterns in his Pandemonium simulation. Frank Rosenblatt built layered networks that learned through error correction (Perceptrons). Neural networks lost influence due to a devastating critique of Perceptrons by Minsky and Seymour Papert in 1969, but were revived when more effective techniques became available in the ‘new connectionism’ of the 1980s and beyond.

1.4 Linguistics
Linguistics began to move towards a central role in the emerging interdisciplinary discussion of mind and brain around the time of the 1956 MIT conference. In the early decades of the twentieth century, linguistics had changed its emphasis from reconstructing the history of languages to studying the structure of languages. Structuralist linguists such as Franz Boas, Edward Sapir, and the positivist Leonard Bloomfield focused on lower-level structural units (phonemes and morphemes). In the 1950s, post-Bloomfieldian Zellig Harris turned his attention to syntax and introduced the idea of transformations that normalized complex sentences by relating them to simpler kernel sentences. This idea launched a revolution in linguistics when it was further developed by Harris’s student Noam Chomsky at the University of Pennsylvania and then MIT. In his 1957 Syntactic Structures, Chomsky proposed the idea of a grammar as a generative system (a set of rules that would generate all and only members of the infinite set of grammatically well-formed sentences of a human language) and argued that finite state and phrase structure grammars, though generative, were inadequate. A series of transformations was needed to obtain an appropriate surface structure from an initial deep structure created by means of phrase structure rules.

2. The Maturation of Cognitive Science
If the 1956 symposium represented the birth of cognitive science, it had a lot of maturing to do before it solidified into a major recognizable area of scientific inquiry. It did not even obtain its name and institutional identity until the mid- to late 1970s. But in the intervening two decades, interaction and collaboration between computer science, psychology, and linguistics developed and began to bear fruit.

2.1 Artificial Intelligence
Though based in computer science, much artificial intelligence (AI) research was directed towards accounting for the kinds of behavior studied by psychologists. Newell and Simon soon went beyond their initial Logic Theorist program to a General Problem Solver they used in less formal domains, such as solving Tower of Hanoi problems. They developed such concepts as subgoals, heuristics, and satisficing and introduced the production system framework, which employs rules that operate on the contents of working memory when their antecedent conditions are satisfied. Their collaborative work culminated in their 1972 book Human Problem Solving, but each went on to develop further systems, such as SOAR and extensions of EPAM. At MIT McCarthy developed a list processing language (LISP), which became a standard tool of AI. Students working with him and Minsky wrote LISP programs to perform such tasks as retrieving semantic information (Raphael’s SIR) and solving algebra word problems (Bobrow’s STUDENT) and geometric analogies (Evans’s ANALOGY). A 1968 book reporting this work also included a seminal chapter by Ross Quillian introducing semantic networks. At Stanford, a team headed by Charles Rosen built a computer-controlled robot named Shakey that could reason backwards from goals and take appropriate actions with boxes that were found in its environment. Working with a simulated box world rather than a physical one, Terry Winograd’s SHRDLU at MIT offered innovations in data structures and planning and had the most successful natural language interface of the early 1970s. Around the same time William Woods developed Augmented Transition Network (ATN) grammars at Harvard and BBN. As the 1970s progressed, AI researchers recognized the limitations of reasoning with only atomized information. Some proposed larger-scale knowledge structures, such as Roger Schank’s scripts and MOPs and Minsky’s frames. Also, considerable progress was made in such specialized areas as expert systems, speech understanding programs, and computational linguistics.

2.2 Psycholinguistics
One of the most fruitful collaborations was between psychology and linguistics. Modern psycholinguistics had already begun to emerge in the early 1950s, especially in the context of a summer seminar sponsored by the Social Science Research Council in 1953. One of the aims of this interaction between post-Hullians and post-Bloomfieldians was to investigate the psychological reality of linguistic constructs such as the phoneme. Though many of their empirical strategies still thrive in some form, the field was impacted by the echoes of Chomsky’s revolution in linguistics. In his 1959 review of B. F. Skinner’s Verbal Behavior,


Chomsky emphasized that linguistic behavior does not consist in reproduction of acquired responses but is creative in the sense that there is no bound to the novel but grammatically well-formed sentences one might produce or hear. Focusing on the poverty of the stimulus argument (that the input to children is inadequate to induce a language), Chomsky also later argued for an innate language capacity (Universal Grammar). A long-standing conflict developed between Chomskian developmental psycholinguists and those taking a cognitive interactionist perspective (e.g., Elizabeth Bates). Meanwhile, Chomskian-inspired psychologists showed that sentences with more transformations in their derivation were more difficult to process. Later, changes in linguistic theory led to a more nuanced psycholinguistics, including some Chomskian approaches.

2.3 Psychology
Information processing psychology drew explicitly or implicitly on computational ideas from information theory and AI. Some of the first glimmers of an information processing approach to psychology appeared in the work of Miller and Bruner, who established a Center for Cognitive Studies at Harvard in 1960. Research in the center focused on a host of topics including conceptual organization, language processing and development, visual imagery, memory, and attention. Eleanor Rosch, a Bruner student, began work that led in the 1970s to a view of categories that emphasized prototypes, fuzzy boundaries, and the primacy of basic-level categories. And in 1967 Ulric Neisser, one of the Center’s many research fellows and visitors, published Cognitive Psychology. This book introduced and synthesized the newly burgeoning work on information processing, particularly emphasizing attention and pattern recognition, and it quickly became the bible for a new generation of students.

Although the Center closed in 1970, information-processing approaches to psychology had already begun to spread to other universities. Stanford University, for example, built from its existing strength in mathematical psychology (including the work of William K. Estes) quickly to emerge as a premier center for information processing. In 1968 Richard Atkinson and Richard Shiffrin developed a model that integrated previous work on control processes, sensory memory (Sperling), short-term memory (Peterson and Peterson), and the distinction between short-term and long-term memory (William James; Waugh and Norman). Roger Shepard did elegant work in mathematical psychology (e.g., he pioneered nonparametric multidimensional scaling), but is best known for his research on mental imagery and mental rotation with such students as Lynn Cooper and Jacqueline Metzler. For example, they demonstrated that when subjects had to decide whether a comparison stimulus was a rotation or a mirror image of a geometrical form, their reaction times increased linearly with the degree of rotation. This suggested that subjects mentally rotated the comparison stimulus, an attention-grabbing claim at a time when mentalism was still suspect in many quarters. A third researcher, Gordon Bower, moved from mathematical models of learning towards more cognitively oriented work on the nature of mental representations. One of his students, John Anderson, worked with Bower on a very influential semantic network model (HAM) that was described in their 1973 book, Human Associative Memory. Later Anderson combined it with a production system component in ACT* and its predecessors.

Another exemplar is University of California, San Diego, where the trio of Peter Lindsay, Donald Norman, and David Rumelhart created a collaborative research group (LNR) in a new department and institution. In 1975 they published Explorations in Cognition, which ended with what may have been the first published use of the term cognitive science: ‘The concerted efforts of a number of people from … linguistics, artificial intelligence, and psychology may be creating a new field: cognitive science’ (Norman et al. 1975). (A second candidate for first use is a 1975 book by Bobrow and Collins, Representation and Understanding: Studies in Cognitive Science.) In addition to work on varied topics including memory, word recognition, problem solving, imagery, and analogy, the group implemented an ambitious memory model, MEMOD. It featured active structural networks, which used a common semantic network format to represent both data and process (e.g., declarative and procedural knowledge). A decade later yet another collaborative group coalesced at UCSD around Rumelhart and James McClelland; known as the PDP (parallel distributed processing) group, it helped bring neural networks (connectionist models) back to center stage.

Sensing the potential to catalyze cognitive science programs at these and other universities, the Alfred P. Sloan Foundation launched an initiative that eventually provided $17.4 million over 10 years to such institutions as MIT and the University of California, Berkeley. During 1982–84 another foundation, The System Development Foundation, contributed $26 million for computational linguistics and speech, with its largest support going to the Center for the Study of Language and Information at Stanford.

During the same period, linguist-turned-computer-scientist Roger Schank, psychologist Allan Collins, and computer scientist Eugene Charniak began a new journal called Cognitive Science. Describing a converging view of natural and artificial intelligence in his introduction to the first issue in 1977, Collins wrote: ‘This view has recently begun to produce a spate of books and conferences, which are the first trappings of an emerging discipline. This discipline might have been called applied epistemology or intelligence theory, but someone on high declared it should be cognitive science and so it shall. In starting the journal we are just adding another trapping in the formation of a new discipline.’ Drawing upon some of the early Sloan money, Donald Norman and his colleagues at UCSD organized the La Jolla Conference on Cognitive Science. But as planning proceeded the idea of a new society germinated, and the conference (held August 13–16, 1979) became the first annual meeting of the Cognitive Science Society.

3. Developing New Identities
By 1980 cognitive science had developed an institutional profile and was the focus of serious funding initiatives. It also had an identity, one that emphasized computational modeling of cognitive and linguistic processes, but also incorporated linguistics and psycholinguistics. The subsequent two decades have seen major efforts to revise this initial identity. Three contributing factors, each involving perspectives from disciplines not central to the cognitive science of the 1970s, deserve brief mention. First, the development of cognitive science gave philosophers, long interested in issues surrounding the mind, a chance to address such issues in the context of ongoing empirical research. Jerry Fodor argued that cognitive processes are autonomous from the neural substrate and capable of being realized in multiple ways; this view, congenial to the cognitive science of the 1970s and 1980s, was subsequently challenged by Patricia and Paul Churchland, who have emphasized the co-evolution of cognitive science and neuroscience.

Second, mavericks in a variety of disciplines found cognitive science’s nearly exclusive focus on processes within the head a limitation. Philosopher Hubert Dreyfus challenged the attempt to analyze cognition as formal computational processes. Neisser integrated information processing with the ecological psychology of James J. and Eleanor J. Gibson. Edwin Hutchins and many others began to focus on how embodiment and the situatedness of agents contribute to their cognitive performance.

Finally, although study of the brain largely disappeared from cognitive science in the 1960s and 1970s, partly because research on brain processes seemed too remote to contribute to understanding cognitive operations, the late 1980s and 1990s saw the emergence of cognitive neuroscience. Michael Posner, Marcus Raichle, and Steven Petersen collaborated at Washington University to show how images from PET could be used to link brain processes to cognitive processes. More recently greater spatial resolution has been gained using fMRI and greater temporal resolution using EEG-based methods such as ERP. Whether cognitive science can successfully incorporate these pilgrimages out into the world and down into the brain, or whether it will ultimately fractionate, is a question still unanswered.

See also: Artificial Intelligence in Cognitive Science; Behaviorism, History of; Cognitive Neuroscience; Cognitive Psychology: History; Cognitive Science: Overview; Cognitive Science: Philosophical Aspects; Psychology: Historical and Cultural Perspectives; Neuroscience, Philosophy of

Bibliography
Baars B J 1986 The Cognitive Revolution in Psychology. Guilford Press, New York
Bechtel W, Abrahamsen A, Graham G 1998 The life of cognitive science. In: Bechtel W, Graham G (eds.) A Companion to Cognitive Science. Blackwell, Malden, MA
Gardner H 1985 The Mind’s New Science. Basic Books, New York
Hirst W (ed.) 1988 The Making of Cognitive Science. Cambridge University Press, Cambridge, UK
Miller G A 1979 A Very Personal History (Occasional paper no. 1). Center for Cognitive Science, Cambridge, MA
Norman D A, Rumelhart D E, and the LNR Research Group 1975 Explorations in Cognition. Freeman, San Francisco, CA

W. Bechtel, A. Abrahamsen, and G. Graham

Cognitive Science: Overview

Cognitive science (CS) is a young discipline that emerged from a research program started in 1975. It partially overlaps with its mother disciplines: psychology, artificial intelligence, linguistics, philosophy, anthropology, and the neurosciences. By no means the only discipline dedicated to the study of cognition, cognitive science is unique in its basic tenet that cognitive processes are computations, a perspective which allows for direct comparison of natural and artificial intelligence, and emphasizes a methodology that integrates formal and empirical analyses with computational synthesis. Computer simulations as generative theories of cognition (see Cognitive Modeling: Research Logic in Cognitive Science) have therefore become the hallmark of CS methodology.

Today, CS is an internationally established field. The dominant tradition of the early years, close to artificial intelligence and its symbol-processing architectures, has been enriched by alternative computational architectures (e.g., artificial neural networks) and by the recognition that natural, especially human, cognition rests on biological as well as on social and cultural foundations. CS studies cognitive systems, which may be organisms, machines, or any combination of these acting in an environment that may be open and dynamically changing. Cognition in CS denotes a class of advanced control mechanisms that allow for sophisticated adaptation to changing


needs (e.g., learning and planning) through computations operating on mental representations. Cognition typically coexists with simpler regulatory mechanisms, like reflexes. CS recognizes that cognition in biological systems is implemented in brain processes, but emphasizes the importance of analyses at the functional level, with cognitive neuroscience relating both domains. Applications of cognitive science may be found in human–computer interaction and in the design of software and information systems, as well as in human factors engineering, health care, and, most notably, in education.

1. Cognition and the Cognitive Science Approach
Although reflection on the mind dates back at least to Plato, the term ‘cognition,’ etymologically based on ancient Greek gignoskein and Latin cognoscere, is relatively recent. It surfaces in nineteenth-century psychology, which exclusively dealt with the phenomenology of consciousness, e.g., in Spencer’s characterization of the interrelations of human feelings. At about the same time, the triad of thinking, feeling, and willing of eighteenth century Vermögenspsychologie became the well-known taxonomy of the mind, dividing it into cognition, emotion, and volition.

We all have an intuitive understanding of what ‘cognition’ refers to, and there is common agreement that thinking, memory, and language, and ‘the use or handling of knowledge’ (Gregory 1987, p. 149) are correctly subsumed under that term. On the other side, it is difficult to define the term strictly. Minimalist approaches (e.g., Searle 1990) would like to reserve its use for the contents of consciousness, whereas a maximum approach has been taken by Maturana and Varela (1980, p. 13), where they claim that ‘living as a process is a process of cognition’ (cf., Boden 2000, for a critique). The leading opinion, however, will consider unconscious processes as well as conscious ones (Neisser 1967, Norman 1981) without attributing cognitive abilities to every living organism (e.g., a tree) and, indeed, without confining the use of the term to biological systems. Still, ‘cognition’ continues to be a rather ill-defined term, which even as ambitious a project as The MIT Encyclopedia of the Cognitive Sciences (Wilson and Keil 1999) has not dared to treat in an article of its own.

The hallmark of the CS approach to cognition is to identify cognitive processes with computation: cognition is information processing. Not every computation is cognition, however, which means that these computational processes must be characterized (Newell and Simon 1976), and further restrictions must be named, such as referential content or ‘intentionality’ (Fodor 1975), or system complexity (Smolensky 1988). The rest of this section will sketch the way towards a computational theory of mind.

1.1 Formal Approaches in Philosophy, Computing, and Neuroscience
The philosophical tradition on which CS draws may be traced from the invention of number systems to medieval algebra (Raimundus Lullus) and on to Descartes and Leibniz, and from the Aristotelian syllogisms to Frege’s seminal work on formal logic. Apart from being a history of the development of formal systems, this philosophical tradition can be seen as an analysis of thinking and reasoning aimed at separating content and form of argumentation. The syllogisms of Aristotle, which continued to constitute the core of logic, virtually unchanged, over more than 2,000 years, are the first milestone: In analyzing an argument such as ‘All human beings will die. Socrates is a human being. Therefore, Socrates will die’ as p → q, p ⊢ q (modus ponens) in the Aristotelian tradition (combined with the formal advancements of Lullus and Descartes), the specific content of the argument is separated from the form of reasoning, and it becomes clear that the latter is sufficient for warranting a true conclusion, provided that the input to this logical vehicle consists of premises that are true. An important generalization about reasoning had been found. It was Leibniz in the late-seventeenth century who advocated the use of formal reasoning in the hope that all fruitless discussions might be ended just by formalizing the arguments and computing the conclusions: hopelessly optimistic from our view (there would be endless debate on how to formalize the premises), but instrumental for an account of thinking as symbol processing.

Frege’s (1879/1967) reformulation of logic laid the foundations of modern logical semantics and marks the beginning of the modern tradition that led to CS. The theory of symbol processing is formalized as a general theory of computation in the works of Gödel, Turing, Church, and Post between 1931 and 1943, soon to become the foundation of computer science and, especially, artificial intelligence. Logical positivism (as epitomized in Wittgenstein’s Tractatus, written during World War I) and logical semantics (foremost, Tarski’s work in the 1920s) constitute the philosophical legacy on which CS could draw.

The invention of computing machinery in the 1940s (Zuse, with Babbage as an isolated forerunner in the nineteenth century) was instrumental in promoting a computational perspective to cognition. A major step was the founding of artificial intelligence (AI) in 1956. The work done in AI, notably on human problem solving by Newell and Simon, may well be regarded as CS at a time when the term was not around yet.

The symbol processing tradition was complemented in the neurosciences by the invention of the formal neuron (an abstraction from biological neurons; McCulloch and Pitts 1943), as well as by analog approaches to computing and self-regulation and general systems theory (cybernetics; Wiener 1948).
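As a brief illustration of the formal neuron mentioned in Sect. 1.1 (McCulloch and Pitts 1943), the following Python sketch treats a neuron as a threshold unit over binary inputs. It is a simplified textbook-style rendering under stated assumptions, not the notation of the 1943 paper: the function names, the integer weights, and the use of a negative weight to model inhibition are all choices made here for illustration.

```python
from typing import Sequence


def formal_neuron(inputs: Sequence[int], weights: Sequence[int], threshold: int) -> int:
    """Fire (return 1) if the weighted sum of binary inputs reaches the threshold, else return 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Hypothetical helper names; each logical function is a single threshold unit.
def logical_and(a: int, b: int) -> int:
    # Both inputs must be active to reach a threshold of 2.
    return formal_neuron([a, b], [1, 1], threshold=2)


def logical_or(a: int, b: int) -> int:
    # One active input is enough to reach a threshold of 1.
    return formal_neuron([a, b], [1, 1], threshold=1)


def logical_not(a: int) -> int:
    # Assumption: inhibition is modeled as a negative weight with threshold 0.
    return formal_neuron([a], [-1], threshold=0)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", logical_and(a, b), "OR:", logical_or(a, b))
    print("NOT 0:", logical_not(0), "NOT 1:", logical_not(1))
```

The point of the sketch is only that simple logical functions can be computed by such units, which is one way to read the link between neural modeling and symbol processing drawn in the text.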


The detection of representational functions of single neurons in the visual system (receptive fields; Hubel and Wiesel 1962) paved the way for a new, scientific approach to the concept of mental representation, while the development of artificial neural networks, notably the perceptron (Rosenblatt 1958), showed a possible way to combine these new discoveries with the idea of computation.

1.2 Other Disciplines: Linguistics and Anthropology
Linguistics made a huge step forward when researchers such as Jakobson and Troubetzkoy succeeded in discovering the common abstract structure behind the phonemes utilized in different languages. Linguistic structuralism, later to be applied to syntax by Harris and Chomsky, became another driving force in the development of CS, as well as structuralism developed in ethnology by Lévi-Strauss, and in cognitive anthropology. In addition, analysis of formal languages (Chomsky 1959a) provided the link to computer science.

1.3 The Rise of Cognitive Psychology
Nineteenth-century psychology has only recently been rediscovered, but failed to contribute to CS’s early development because it had been all but extinguished by behaviorism from about 1915 to 1960. Only the later behaviorists considered internal variables (notably Hull), or even memory and mental representation (Tolman). It was information theory (developed by Shannon 1948), or rather its insufficiency to explain central psychological phenomena such as the memory span, which brought G. A. Miller in the mid-1950s to reconsider human information processing. His collaboration with Noam Chomsky, and especially the latter’s poignant critique of behaviorist approaches to language (Chomsky 1959b), served to reorient psychological research towards issues of internal storage and processing, towards a psychology that no longer ignored mind and consciousness, yet was careful to stay within the limits of scientific rigor (Miller et al. 1960). Cognitive psychology underwent a rapid development and in 1967, Ulric Neisser wrote the first textbook of the new field, coining its name. Gardner (1985) names further influential researchers in psychology, among them Bruner (notably his work on strategies) and Jean Piaget. Although Piaget’s numerous monographs on cognitive development did not become available in English before the 1960s, he is certainly a forerunner of CS who always insisted on the importance of formal principles for explaining cognitive development.

1.4 The Origin of Cognitive Science
A state-of-the-art report on CS by the Alfred P. Sloan Foundation concludes that ‘What has brought the field into existence is a common research objective: to discover the representational and computational capacities of the mind and their structural and functional representation in the brain’ (Sloan Foundation 1978, p. 6). But it would not be false to say that the Sloan Foundation acted as midwife at the birth of CS. Its committee diagnosed convergent approaches visible across disciplines and went one step beyond that diagnosis to unite what still were very different approaches. Institutionalization followed soon: a journal first (Cognitive Science, in 1977), then, two years later, a society and a yearly conference and, still later, doctoral programs and research projects, all evolving into a flourishing new field.

2. The Basic Tenet: Cognition as Computation
The modern idea of computation was formulated in the 1930s. It owes much to Hilbert’s program for the complete axiomatization of mathematics, limited by Gödel’s (1931) proof that not all theorems about a formal system are provable with the means provided by that very same system. Some people have tried to use this result against the idea that cognition could be computation, but did not realize that there may well be truths about the human mind that it fails to arrive at, let alone prove them formally.

The best-known definitions of computation rely on recursive functions (Church 1941), logical productions (Post 1943), or the abstract machine designed by Turing (1936). All these approaches have been proved to be equivalent in scope, and especially the Turing machine may be considered the direct forerunner of today’s computers. The common core is the idea of a formal system that uses symbols, i.e., variables and operators combined to form symbolic expressions that are manipulated according to fixed rules and to the internal state of the system.

2.1 Physical Symbol Systems
When they received the Turing Award for their ground-breaking work in AI, Newell and Simon expanded the theory of symbol processing and coined the Physical Symbol Systems Hypothesis (PSSH): ‘A physical symbol system has the necessary and sufficient means for intelligent action’ (Newell and Simon 1976, p. 117).

A physical symbol system is a formal system. Like all formal systems, it has an ‘alphabet’ of (at least two) arbitrarily defined symbols, as well as operators to create and transform symbol structures (symbolic expressions) of arbitrary complexity from the elementary symbols of the alphabet according to syntactic rules. This system is ‘physical’ in the sense that it has been implemented in a suitable way. One such way is the encoding of the symbols as levels of voltages in an array of transistors, and the operations by hard-wired connections between transistors; thus it is done in semiconductor chips. A different way would be to encode the symbols as ‘spikes’ (action potentials) of neurons. Other ways are possible, but those two already show how the theory can be applied to organisms as well as to technical systems. The point is that a system like that needs the physical implementation in order to function in the real world and become more than just an idea. The kind of implementation is arbitrary, however, because the system is functioning according to its symbolic expressions and syntactic rules, completely independent of its implementation.

Symbols are arbitrary signs, but they designate objects or processes, including processes in the system itself. Their semantics is defined either by reference to an object (in the sense that depending on the respective symbolic expression, the system exerts an influence on the object or is influenced by it), or by the symbolic expression being executable as a kind of program.

Physical symbol systems may have a lot of symbol structures, which means that they need to have a symbol store, or memory, which is unlimited according to Newell and Simon. In fact, physical symbol systems are Turing-equivalent computing devices and, therefore, the PSSH is equivalent to the notion of cognition being computation. The PSSH cannot be proven, nor can it be refuted formally. It gained plausibility, however, through empirical studies of human problem solving and its simulation (Newell and Simon 1972).

The PSSH has integrated AI into computer science through the common reference to the theory of automata, a platform which gives a foundation to CS as well. The distinction between functional and implementation levels, elaborated by Newell (1982) and Marr (1982), enabled CS as a science of biological as well as technical cognitive systems.

2.2 Philosophical Foundations: Functionalism and the Computational Theory of Mind
Mental states have been analyzed as ‘intentional attitudes’ in the philosophy of mind, consisting of a propositional content (e.g., P = the sun is shining) and an attitude that characterizes one’s own relation to that proposition (e.g., I wish that P would become true). Fodor (1975) developed this approach further, arriving at a ‘language of thought’ that treats the propositional content as data and the intentional relation as an algorithmic one.

If we accept these as the elements of a ‘language of thought,’ then the question arises of how mental states relate to brain states: a well-known problem in philosophy. Following Putnam (1960), Fodor and others conceptualize the relation between brain and mental states as being parallel to the relation between a computer (i.e., the hardware) and a program running on that computer: the mind as the software of the brain. This approach is known as the computational theory of mind. It fits well with the PSSH, and it soon became the dominant framework in CS. However, it addresses (potentially) conscious thought only, ignoring lower cognitive processes.

2.3 Achievements and Drawbacks of the Classical Symbol-processing Approach
From the twenty-first-century perspective, the PSSH and the computational theory of mind together constitute the classical period of CS, spanning the decade from 1975 to 1985. Within that period, cognitive modeling (see Cognitive Modeling: Research Logic in Cognitive Science) emerges as CS’s characteristic methodology. Its applications include the following fields.

2.3.1 Problem solving. The former general model of problem solving as heuristic search (Newell and Simon 1972) was enlarged by recognizing the importance of domain-specific knowledge, which became the foundation of an AI technology (knowledge-based systems or ‘expert systems’; Hayes-Roth et al. 1983, Buchanan and Shortliffe 1984) and inspired much psychological research on expertise (e.g., Ericsson and Smith 1991).

2.3.2 Cognitive architectures. Rule-based architectures evolved into ambitious models of human cognition in general, comprising memory, problem solving, learning, and some natural language processing. The best-known of these systems are SOAR (Laird et al. 1987, Newell 1990) and the impressive series of ACT, ACT*, and ACT-R frameworks, all developed by John Anderson (Anderson 1976, 1983, Anderson and Lebière 1998).

2.3.3 Natural language processing. From the late 1950s on, theoretical linguistics has been dominated by Noam Chomsky. His theories (notably Chomsky 1981) are framed as theories of human linguistic competence and have inspired CS research on human parsing (Frazier 1987, Mitchell 1994) and on language acquisition (Pinker 1984).

2.3.4 Computers and education. Knowledge-based systems have been built for purposes of instruction, so-called ‘intelligent tutorial systems’ (Psotka et al.


1988). Although their cost-efficiency ratio proved poorly suited to general education, they have been used with success for the training of specialists.

The main drawback of research during the classical period of CS was that it excluded many important issues that could well have been covered by assuming mental representation and cognitive algorithms. The original objective for AI, stated by Simon (1955), was to address problems whose solution by a human being would lead us to attribute intelligence to that person. This objective led CS and AI largely to ignore problems that do not seem to require intelligence in people. However, these problems turned out to be the real ‘tough nuts,’ e.g., navigation and other skilled action.

The computer technology of that period was still mainframe oriented, and interactive and graphic technologies were scarcely developed. This state of the art did not encourage researchers to model real-time agent–environment interaction, although there were some exceptions, e.g., Winograd (1972).

The ‘methodological solipsism’ advocated by Fodor (1980) in connection with his theory of mental representation, and the dominance of logic-based approaches in AI (especially during the 1980s), made CS researchers believe that all interesting aspects of cognition happened within a single symbol system.

In Fodor’s ‘language of thought,’ content is defined as a truthful representation of (a part of?) the world, as in Tarski semantics and similar approaches, which unduly constrains mental representation and ignores its constructive nature.

To summarize, CS research in fact fell short even of the scope that its original symbol-processing framework provided.

3. Alternative Computational Frameworks

From a very abstract viewpoint, a Turing machine is all one ever needs to compute. However, different architectures or virtual machines may make some computations easy and others difficult. Classical CS had adopted symbol-processing frameworks. There is a large gap, however, between a functional specification and the way in which a brain is built. The new connectionist movement, which surfaced in the mid-1980s, attempted to bridge the gap with a framework that was all but forgotten at that time.

3.1 Artificial Neural Networks: Connectionism

McCulloch and Pitts (1943) presented the biological neuron as an abstract computing device. Hebb (1949) added a number of hypotheses, most of which have since turned out to be correct (e.g., that enduring changes in the transmission efficiency of certain synapses are the neurophysiological basis of memory). Rosenblatt (1958) built a simple artificial neural network (ANN) that could be used for pattern recognition, the ‘perceptron.’ Soon, connectionism (the name adopted for this line of research) flourished. Its boom came to an end, however, when perceptrons ran into trouble with certain distinctions (such as exclusive-or), and their limitations were mathematically proven (Minsky and Papert 1969). From then on, only a few researchers, mostly in biology and biophysics, continued the connectionist tradition.

The sudden rebirth of connectionism started with the discovery of an architectural change (the introduction of ‘hidden layers’ in ANNs) that overcame the limitations of the perceptron (Rumelhart, McClelland and the PDP group 1986). After a period of heated dispute, connectionism has now been integrated into mainstream CS. Numerous ANN architectures have been developed, and hybrid connectionist–symbolic systems have also been constructed.

3.2 Distributed Representations as Subsymbols

At the heart of the dispute over connectionism was the issue of mental computation. Symbolic and ‘localist’ connectionist architectures (see Page 2000 for an overview) maintain that variables (or the nodes of an ANN) can be interpreted as being meaningful. In ‘parallel distributed processing’ (PDP), however, entities are represented by patterns, usually by a ‘feature vector’ containing the activation values of formal neurons. Smolensky (1988) claimed that only the subsymbolic approach (referring to the elements of a feature vector) grasps the essence of cognition, whereas symbol processing approaches were confined to a mere approximation. By contrast, Fodor and Pylyshyn (1988) argued that connectionism either was just an implementation of symbol processing (hence banal), or inadequate for modeling productivity and systematicity in natural language. The debate was never resolved, but it is generally recognized that suitability for cognitive modeling is more important than an abstract decision about which framework is ‘better,’ and that, in fact, both frameworks share important characteristics, being computational as well as representational.

3.3 Beyond Connectionism: Nonlinear Dynamics

Since the early 1990s, another computational framework has been claimed to be useful for cognitive modeling: the theory of nonlinear dynamic systems, initially known as ‘chaos theory’ (Port and van Gelder 1995). Fine-grained analyses of movement (e.g., in phonetics) and of developmental changes are the intended area of application. Although this renders a more precise picture of how cognitive processes are implemented (like the neural models used in biology


are more detailed than connectionist accounts), it could well be that these characteristics are less important on the functional level (Eliasmith 2000).

4. The ‘New’ Cognitive Science: Interacting Cognitive Systems

The technical cognitive systems of classical AI and CS were systems that were only cognitive at their functional level. This is never the case in biological systems: animals live, move and eat, and reproduce. All this basic behavior is made possible without cognition in the sense of thoughtful decision (in lower animals, at least). Built-in physiological regulations, reflexes, instincts and species-specific behaviors serve to achieve the necessary adaptation. Learning in its most primitive form (classical conditioning) has been demonstrated in some worms and, of course, higher species; categorization of stimuli (the forerunner of concepts) in birds and mammals; episodic memory at least in some mammals, especially in apes; full-fledged language in humans only.

Cognition has evolved because of the adaptive value of learning (and the cultural tradition that is based upon it) and thinking (in mental simulation of an act and its consequences). But cognition came late; it has to coexist with those basic regulatory mechanisms mentioned above. In the human species, most of an individual’s knowledge, and even specialized cognitive processes, e.g., reading and writing, have to be learned and trained within the culture that provides them. Sensorimotor processes (e.g., driving in a city) rely heavily on a system’s interaction with its environment and with special tools (like a car). Also, we are embedded in a social world: many cognitive processes, foremost language, are acquired through social interaction. All this cannot be modeled adequately by a single symbol-processing system that scarcely (if at all) interacts with its environment and with other cognitive systems. In CS, this paradigm shift occurred gradually, starting in the mid-1980s. Having extended its scope since its classical period, modern CS appears as the science of cognitive systems in a fuller sense, as described above.

4.1 Situated Cognition

Using the methodology of field studies, Suchman (1987), Hutchins (1995a, 1995b) and others demonstrated how much people rely on cues and representations provided by the environment. While traditional accounts of planning rely on mental representations and processes exclusively, human planning can be shown to rely on maps, road signs, guidance by other people, and other external sources. The importance of external representations in problem solving has been repeatedly demonstrated (Zhang 1997).

As is so often the case, ‘situated cognition’ was introduced as an alternative to the ‘old’ paradigm of CS. Vera and Simon (1993), however, insisted that CS from its very start (e.g., Newell and Simon 1976) does not exclude situatedness, but rather emphasizes its necessity. Indeed, to realize that cognition is situated can also solve the problem of symbol grounding: mental representations arise and get their meaning in the context of acting in the world (a solution already envisioned by Newell 1980). Situated cognition, it seems, highlights a formerly neglected aspect of CS.

Situatedness also includes the body, not only in the basic sense that cognition coexists with other, more basic processes in organisms. Rather, mental representations are in many ways influenced by the cognitive representation of the body, as Lakoff and Johnson (1980) have shown for metaphors.

4.2 The Social Aspects of Cognition

Culture and social action shape many of our thoughts and cognitive skills: ‘Sociality and culture are made possible by cognitive capacities, contribute to the ontogenetic and phylogenetic development of these capacities, and provide specific inputs to cognitive processes’ (Sperber and Hirschfeld 1999, p. cxi). By focusing on a single individual concerned with a single task only, CS had followed the model provided by experimental psychology, its most prominent source of empirical data. The recent change of view in favor of social and cultural factors was mainly due to work in an applied CS area where social interaction is central: education. Lave (1988) gives numerous demonstrations of cultural influences on the solution of apparently context-free tasks in mathematics. More generally, culture-oriented theories of development and learning (beyond sociobiology as in Lumsden and Wilson 1981) have been advanced by Cole (1991) and Tomasello et al. (1993).

Because of the sheer amount of detail provided in cultures, especially in modern industrial ones with libraries and a high degree of specialization in work, it is very difficult to integrate cultural aspects in cognitive theories. The problem is eased when analyses focus on narrowly defined tasks.

A related field is research on groups (e.g., in computer-supported cooperative work (CSCW); Olson and Olson 1997) and on teams of experts (Hinsz et al. 1997). Here, as well as in knowledge engineering (Strube et al. 1996), the task itself comprises the exchange of knowledge and, at least to a certain degree, the development of ‘shared knowledge’ or ‘shared mental models’ (Lewis and Sycara 1993).

4.3 A New Paradigm: Autonomous Social Agents and Mixed Groups

According to the dominant paradigm in CS, cognitive systems are conceived as autonomous social agents,


situated in a complex dynamic environment. It is no accident that this view resembles the shift in AI instigated by new architectures in robotics (Brooks 1991) and the development of intelligent agents (see Wooldridge and Jennings 1995 for an overview).

Agent approaches emphasize the complex action control needed for an agent (robot or organism) that pursues its own goals, which are always many, and potentially conflicting (Maes 1990). Because of restricted resources, agents in the real world cannot be fully rational; however, ‘bounded rationality’ (Simon 1955) has been claimed to be a universal principle of human thought (Gigerenzer et al. 1999).

Distributed AI (Bond and Gasser 1988) studies cooperation and competition among agents. The annual RoboCup contest (Kitano et al. 1997) epitomizes this line of research, having robot or simulated agent teams playing soccer games against each other. The relevance for CS lies in the development of integrated architectures that comprise cognitive and more primitive (so-called ‘reactive’) regulation, as well as social interaction. This fits well with the current emphasis on situated cognition and with applied problems of CS, e.g., in the field of office automation and cooperation in mixed groups that comprise human workers as well as technical systems, as in the case of air traffic (Hutchins 1995b).

5. Achievements and Present State

CS in the twenty-first century has grown from an innovative interdisciplinary field to an academic discipline of its own, albeit in a still early stage of institutionalization. CS has been evolving differently, however, in different parts of the world. It has a solid infrastructure of departments, graduate programs, etc., in the UK and North America, in France, and in some other countries, but remains at an early stage of institutionalization in Germany, for instance, and is even less established in many other countries. In the USA, cognitive neuroscience has split away from cognitive science, at least organizationally. In other countries like France, however, it would be impossible to imagine CS without brain research and neuroscience.

The necessity and importance of real-world applications of CS in education, industry, and other fields is generally recognized, but CS still lacks a well-developed professional profile. On the other hand, CS has an impressive record of success in research, an infrastructure of international and national academic societies (foremost the Cognitive Science Society, founded in 1979), international and national conferences, and dedicated CS journals (Cognitive Science, founded in 1977, and Cognitive Science Quarterly, founded in 2000). An analysis of publications in Cognitive Science in 1977–95 found evidence not only for a dominance of psychologists and computer scientists among the CS community, but also for CS ‘as a discipline of its own [...] becoming increasingly more common’ (Schunn et al. 1998, p. 117). CS, it seems, is still steadily evolving.

See also: Cognitive Psychology: History; Cognitive Psychology: Overview; Cognitive Science: History; Cognitive Science: Philosophical Aspects

Bibliography

Anderson J R 1976 Language, Memory, and Thought. Erlbaum, Hillsdale, NJ
Anderson J R 1983 The Architecture of Cognition. Harvard University Press, Cambridge, MA
Anderson J R, Lebière C 1998 Atomic Components of Thought. Erlbaum, Hillsdale, NJ
Boden M A 2000 Autopoiesis and life. Cognitive Science Quarterly 1: 135–46
Bond A H, Gasser L (eds.) 1988 Readings in Distributed Artificial Intelligence. Kaufmann, San Mateo, CA
Brooks R A 1991 Intelligence without representation. Artificial Intelligence 47: 139–59
Buchanan B G, Shortliffe E H 1984 Rule-based Expert Systems. Addison-Wesley, Reading, MA
Chomsky N 1959a On certain formal properties of grammars. Information and Control 2: 137–67
Chomsky N 1959b Review of Verbal Behavior by B F Skinner. Language 35: 26–58
Chomsky N 1981 Lectures on Government and Binding. Foris, Dordrecht, The Netherlands
Church A 1941 The Calculi of Lambda-Conversion. Princeton University Press, Princeton, NJ
Cole M 1991 A cultural theory of development: what does it imply about the application of scientific research? Special issue: culture and learning. Learning and Instruction 1: 187–200
Eliasmith C 2000 Is the brain analog or digital? The solution and its consequences for cognitive science. Cognitive Science Quarterly 1: 147–70
Ericsson K A, Smith J (eds.) 1991 Toward a General Theory of Expertise: Prospects and Limits. Cambridge University Press, Cambridge, UK
Fodor J A 1975 The Language of Thought. Crowell, New York
Fodor J A 1980 Methodological solipsism considered as a research strategy in cognitive psychology. Behavioral and Brain Sciences 3: 63–73
Fodor J A, Pylyshyn Z W 1988 Connectionism and cognitive architecture: a critical analysis. Cognition 28: 3–71
Frazier L 1987 Sentence processing: a tutorial review. In: Coltheart M (ed.) The Psychology of Reading. Attention and Performance. Erlbaum, Hove, UK, Vol. 12, pp. 559–86
Frege G 1879/1967 Begriffsschrift: a formula language, modeled upon that of arithmetic, for pure thought. In: van Heijenoort J (ed.) From Frege to Gödel: a Source Book on Mathematical Logic, 1879–1931. Harvard University Press, Cambridge, MA, pp. 5–82
Gardner H 1985 The Mind’s New Science: a History of the Cognitive Revolution. Basic Books, New York
Gigerenzer G, Todd P M, ABC Research Group 1999 Simple Heuristics That Make Us Smart. Oxford University Press, New York


Gödel K 1931 Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme. Monatshefte für Mathematik und Physik 38: 173–98
Gregory R L (ed.) 1987 The Oxford Companion to the Mind. Oxford University Press, Oxford, UK
Hayes-Roth F, Waterman D A, Lenat D B 1983 Building Expert Systems. Addison-Wesley, Reading, MA
Hebb D O 1949 The Organization of Behavior. Wiley, New York
Hinsz V B, Tindale R S, Vollrath D A 1997 The emerging conceptualization of groups as information processors. Psychological Bulletin 121: 43–64
Hubel D H, Wiesel T N 1962 Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. Journal of Physiology 160: 106–54
Hutchins E 1995a Cognition in the Wild. MIT Press, Cambridge, MA
Hutchins E 1995b How a cockpit remembers its speeds. Cognitive Science 19: 265–88
Kitano H, Asada M, Kuniyoshi Y, Noda I, Osawa E, Matsubara H 1997 RoboCup: a challenge problem for AI. AI Magazine 18: 73–85
Laird J E, Newell A, Rosenbloom P S 1987 SOAR: an architecture for general intelligence. Artificial Intelligence 33: 1–64
Lakoff G, Johnson M 1980 Metaphors We Live By. Chicago University Press, Chicago
Lave J 1988 Cognition in Practice: Mind, Mathematics and Culture in Everyday Life. Cambridge University Press, Cambridge, UK
Lewis C M, Sycara K P 1993 Reaching informed agreement in multispecialist cooperation. Group Decision and Negotiation 2: 279–99
Lumsden C J, Wilson E O 1981 Genes, Minds and Culture. Harvard University Press, Cambridge, MA
Maes P (ed.) 1990 Designing Autonomous Agents. Theory and Practice from Biology to Engineering and Back. MIT Press, Cambridge, MA
Marr D 1982 Vision. A Computational Investigation into the Human Representation and Processing of Visual Information. Freeman, San Francisco
Maturana H R, Varela F J 1980 Autopoiesis and Cognition: the Realisation of the Living. Reidel, Dordrecht, The Netherlands
McCulloch W S, Pitts W 1943 A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5: 115–33
Miller G A, Galanter E, Pribram K H 1960 Plans and the Structure of Behavior. Holt, New York
Minsky M, Papert S 1969 Perceptrons. MIT Press, Cambridge, MA
Mitchell D C 1994 Sentence parsing. In: Gernsbacher M A (ed.) Handbook of Psycholinguistics. Academic Press, San Diego, CA, pp. 375–409
Neisser U 1967 Cognitive Psychology. Prentice-Hall, Englewood Cliffs, NJ
Newell A 1980 Physical symbol systems. Cognitive Science 4: 135–83
Newell A 1982 The knowledge level. Artificial Intelligence 18: 87–127
Newell A 1990 Unified Theories of Cognition. Harvard University Press, Cambridge, MA
Newell A, Simon H A 1972 Human Problem Solving. Prentice-Hall, Englewood Cliffs, NJ
Newell A, Simon H A 1976 Computer science as empirical enquiry: symbols and search. Communications of the ACM 19: 113–26
Norman D A 1981 Categorization of action slips. Psychological Review 88: 1–15
Olson G M, Olson J S 1997 Research on computer supported cooperative work. In: Helander M, Landauer T K, Prabhu P V (eds.) Handbook of Human–Computer Interaction, 2nd rev. edn. Elsevier, Amsterdam, pp. 1433–56
Page M 2000 Connectionist modelling in psychology: a localist manifesto. Behavioral and Brain Sciences 23
Pinker S 1984 Language Learnability and Language Development. Harvard University Press, Cambridge, MA
Port R, van Gelder T J 1995 Mind as Motion: Explorations in the Dynamics of Cognition. MIT Press, Cambridge, MA
Post E 1943 Formal reductions of the general combinatorial decision problem. American Journal of Mathematics 65: 197–268
Psotka J, Massey D L, Mutter S A (eds.) 1988 Intelligent Tutoring Systems: Lessons Learned. Erlbaum, Hillsdale, NJ
Putnam H 1960 Minds and machines. In: Hook S (ed.) Dimensions of Mind. New York University Press, New York, pp. 138–64
Rosenblatt F 1958 The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review 65: 386–408
Rumelhart D E, McClelland J L & the PDP Group 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, Cambridge, MA
Schunn C D, Crowley K, Okada T 1998 The growth of multidisciplinarity in the Cognitive Science Society. Cognitive Science 22: 107–30
Searle J R 1990 Consciousness, explanatory inversion, and cognitive science. Behavioral and Brain Sciences 13: 585–95
Shannon C 1948 A mathematical theory of communication. Bell System Technical Journal 27: 379–423, 623–656
Simon H A 1955 A behavioral model of rational choice. Quarterly Journal of Economics 69: 99–118
Sloan Foundation 1978 Cognitive Science 1978. Report of the State of the Art Committee. Alfred P. Sloan Foundation, New York
Smolensky P 1988 On the proper treatment of connectionism. Behavioral and Brain Sciences 11: 1–23
Sperber D, Hirschfeld L 1999 Culture, cognition and evolution. In: Wilson R A, Keil F C (eds.) The MIT Encyclopedia of the Cognitive Sciences. MIT Press, Cambridge, MA, pp. cxi–cxxxii
Strube G, Janetzko D, Knauff M 1996 Cooperative construction of expert knowledge: the case of knowledge engineering. In: Baltes P B, Staudinger U M (eds.) Interactive Minds. Cambridge University Press, Cambridge, UK, pp. 366–93
Suchman L A 1987 Plans and Situated Actions: the Problem of Human–Machine Communication. Cambridge University Press, New York
Tomasello M, Kruger A C, Ratner H H 1993 Cultural learning. Behavioral and Brain Sciences 16: 495–511
Turing A 1936 On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society (Series 2) 42: 230–65; 43: 544–546 (addendum)
Vera A H, Simon H A 1993 Situated action: a symbolic interpretation. Cognitive Science 17: 7–48
Wiener N 1948 Cybernetics. Wiley, New York
Wilson R A, Keil F C (eds.) 1999 The MIT Encyclopedia of the Cognitive Sciences. MIT Press, Cambridge, MA
Winograd T 1972 Understanding natural language. Cognitive Psychology 3(1)


Wooldridge M, Jennings N R 1995 Intelligent agents: theory and practice. Knowledge Engineering Review 10: 115–52
Zhang J J 1997 The nature of external representations in problem solving. Cognitive Science 21: 179–217

G. Strube

Cognitive Science: Philosophical Aspects

Cognitive science emerged as a distinct field in the mid-1950s, influenced by two important developments. The first was the construction of digital computers and their capacity to perform operations that apparently require intelligent thinking. The second was Noam Chomsky’s idea that linguistic capacities involve, in some sense, knowledge of and conformity to unconscious grammatical rules, and that at least some of this knowledge is innate. The first suggested that the mind’s operations could be understood on the model of a computer implementing an internally represented program (Turing 1950). The second suggested that mental capacities are best described in intentional terms like ‘knowledge,’ ‘belief,’ and ‘following a rule.’ Both of these developments rejected behaviorism (Skinner 1953). In its extreme forms behaviorism endorsed the view that human (and other animal) mental capacities are best understood in terms of causal connections between stimuli and responses and that learning is best understood as the change of such connections under reinforcement. Behaviorists tended to reject questions about the internal structures and processes that mediate between stimuli and responses. They viewed intentional concepts as meaningless and/or unscientific.

Some mental states are attributed to a person, e.g., beliefs, memories, perceptions, desires, while others are properly attributed only to subpersonal parts or faculties, e.g., the language faculty and the visual system. Both kinds of mental states, and the processes involving them, are intentional. ‘Intentionality’ refers to the fact that mental states represent. For example, the thought that New York is tropical represents New York as being tropical. It has long been observed that a thought can represent what doesn’t exist (e.g., thoughts about Santa Claus) and misrepresent what does exist. Intentional mental states possess semantic properties. They refer and are evaluable as true or false. The thought that New York is tropical refers to New York and is false. Semantic features are shared with natural language expressions and with other kinds of representations (e.g., maps and pictures). For example, the belief that New York is tropical and ‘New York is tropical’ possess the same semantic content. It is widely held that the semantic features of nonmental items are ultimately derived from intentional mental states. Many theories in cognitive science assume that mental processes consist of mental states that are connected by virtue of their semantic contents. Such processes include ones that are ‘rational’ in that they tend to produce true beliefs or promote survival. Mental information processing consists of sequences of rationally related mental states.

Most cognitive scientists assume that psychological states are part of the natural physical order. However, it is usually thought that cognitive science will, like other special sciences, employ its own taxonomy of states, and in particular one that differs from that of neurophysiology. The ‘received’ view is that psychological states are certain kinds of functional states. A functional state or property is one that is individuated in terms of a causal role. For example, what makes something a carburetor is its causal role in taking gas and air as inputs and producing a mixture of the two as output. In the case of psychological states the causal role involves causal connections to other psychological states, to stimuli, and to behavior. An important feature of functionalist accounts is that physically different kinds of structures can realize a given functional state by satisfying its functional specification. Thus carburetors can be made out of metal or plastic and minds out of brains or computers.

There are three big issues in the philosophy of mind. How is rational thinking possible? How can the mind represent the world? What is consciousness? Cognitive science has much to say about the first and has given new twists to the second and third. Below are discussions of these and related issues.

(a) Is cognitive science possible? There are philosophical traditions which claim that there can be no science of the mind. The main contemporary objection revolves around the idea that intentionality and rationality are normative categories and that this disqualifies them from being the subject of scientific laws and causal explanations. Ruminations along these lines can be found in Wittgenstein (1953) and Ryle (1949) and were later taken up by philosophers as diverse as Kripke (1982), Davidson (1980), McDowell (1994), and Quine (1960). Davidson’s arguments have been especially influential among philosophers. He (1969) argues that in attributing beliefs, thoughts, desires, and other propositional attitudes to one another we are engaged in a project of interpretation. Further, he claims that interpretation is guided by a holistic ‘principle of charity.’ This principle dictates that when attributing mental states we ought, other things being equal, to maximize the subject’s rationality. Attributions guided by charity are holistic since whether or not a belief or preference, etc., is rational depends on other beliefs, etc. The normativity of rationality makes intentional states normative as well. Davidson claims that interpretation is so different from the way physical properties are assigned as to make it impossible for there to be any strict laws connecting intentional psychological states with physical states or with each


other. But while Davidson’s (and other such) argu- explanations are often mistaken or vacuous and
ments have been influential it is far from clear that they predicts that a developed science of the mind will
are sound or even whether if sound they really would abandon it in much the same way that physics
undermine the employment of intentional vocabulary abandons folk physics. At the current stage of de-
in scientific laws and explanations. In any case, velopment many cognitive theories do employ folk
cognitive scientists persist in employing intentional psychological concepts (or refinements of them) and
concepts in formulating explanations and theories. produce theories that sometimes explain folk psycho-
For example, there is a lively area of cognitive science logical regularities.
concerning how people engage in everyday logical and (c) Are there unconscious and subpersonal mental
statistical inference that proposes hypotheses con- states? Folk psychology recognizes that there are
cerning causes of errors, selection of conclusions, and unconscious mental states and processes. For example,
so forth (Johnson-Laird 1983). The mental states and most of a person’s memories are not present to
processes studied are, of course, intentional ones. The consciousness and the mental processes a person
issues of whether the mind and what aspects of engages in when driving a car might well be un-
mentality can be scientifically studied and whether conscious. Psychoanalytic theory, which to some
there are laws involving intentional states are more extent has been appropriated by folk psychology,
likely to be resolved by the success (or lack thereof ) of recognizes unconscious thoughts and desires that are
the theories produced in the cognitive sciences than by not normally accessible to consciousness, at least not
philosophical argument. without therapeutic assistance. Theories in cognitive
Most cognitive scientists think that a science of the science go quite a bit further, often positing intentional
mind like other sciences contains laws, which explain states that are in principle inaccessible to conscious-
psychological capacities and processes. The question ness and are properly attributed only to parts of the
arises of how such laws are related to biological laws. mind. For example, some psycholinguistic theories
According to Fodor (1975), psychological laws are posit a language module that cognizes grammatical
neurophysiologically implemented. In this view psy- rules formulated in terms of concepts that are un-
chology is autonomous in that its taxonomy and laws known to the person but in some way guide com-
are specific to it but every instance of a psychological prehension and production of her speech.
law is also an instance of a more basic law or causal Some functional structures engage in mental pro-
mechanism. If this is right then various kinds of cesses relatively independently of others. Such struct-
mechanisms can implement psychological laws so ures are said to be cognitiely encapsulated modules.
psychology is not restricted to humans but can apply The mental processing engaged in by many modules is
to Martians and computers. Cognitive scientists dis- not accessible to consciousness although the end
agree about exactly how much can be learned about product of the processing may be. For example, there
psychology by studying neurophysiology. ‘Top down’ is evidence that the mind contains a ‘face detecting’
theorists (Fodor 1981) think that very little beyond module that is able to determine whether or not a
crude neural geography of mental capacities can be person has seen a particular face previously. The
learned from neural sciences. At the other extreme, picture of the mind that emerges from this is one in which
‘bottom up’ theorists think quite a lot can be learned it consists of many modules each dedicated to a
about the mind and perhaps even that neurophys- particular task (e.g., face recognition, speech pro-
iology should replace cognitive psychology. duction, character trait attribution) and a general
(b) What is the status of folk psychology? The reasoner which receives information from the modules
mental states people attribute to one another (and and whose operations are (partly) consciously ac-
sometimes animals) include belief, knowledge, mem- cessible. There are disagreements concerning the ex-
ory, desire, perception, and emotions. Normal humans tent to which cognitive capacities are modularized.
are capable of attributing mental states with reliability Some (e.g., Pinker 1995) seem to think that the mind is
and employing them in explaining other mental states massively modularized, while others (Fodor 1983)
and behaviors. Such explanations seem to conform to ascribe much more importance to general reasoning
general principles like ‘if a person wants q and believes capacities.
that doing A is the only way to get q then unless she (d) What are propositional attitudes? Belief, knowl-
has a reason not to do A she will intend to do A’ and edge, desire, and other folk psychological mental states
also to somewhat more specific principles like ‘if a are said to be propositional attitudes. The reason is the
person sees a friend coming then unless she has some widely held view that the ‘that’ clauses used for
special reason not to she will greet him.’ The collection attributing such states refer to propositions. There are
of such principles has come to be known as ‘folk various views about the nature of propositions but
psychology.’ One issue concerning folk psychology is they all have in common that propositions are the part
whether it is approximately true. Fodor (1981, 1987) of the meaning of a sentence that is or determines the
argues that it is and provides the appropriate starting conditions under which the sentence is true. The
place of a deeper theory of the mind. In contrast question arises of exactly what it is to have a
Churchland (1995) argues that folk psychological propositional attitude; for example, the belief that


New York is tropical. This question is often seen (e.g., Fodor 1981) as having two parts: (i) what is it
to have a belief? (ii) what is it to have a belief that expresses a particular proposition? The most
widely accepted answer to the first question is that a belief is a functional state, i.e., it has a
certain causal role. One approach to answering the second question is that a belief expresses the
proposition that p in virtue of involving a representation whose content is that of the intentional
state. So to believe that New York is tropical is to be in a functional state (believing) that involves
a representation that has the intentional content that New York is tropical. This provides a nice
explanation of why beliefs are truth evaluable, are about things, can be involved in inferences, etc.
It is because these are features of the representations that they contain. The view that the mind
contains representations is usually called ‘the representational theory of mind’ or RTM. It should be
cautioned that RTM is restricted to explicit beliefs. Implicit beliefs (e.g., the belief you had prior
to reading this sentence that no giraffe is bigger than the Empire State Building) are dispositions to
form explicit beliefs under suitable circumstances. This account of propositional attitudes is the
beginning of an account of how mental states can have intentional content. It has its dissenters. Some
philosophers (Dennett 1981) suggest that there may be no internal representations with the same content
as the ‘that’ clause we use in attributing a belief, but it is still appropriate to attribute the belief
because of the person’s behavioral (including linguistic) dispositions, which themselves may involve
states that represent at the subpersonal level.

(e) What is the nature of mental representations? Cognitive science is up to its neck (and beyond) in
representations. Some representations are involved in propositional attitudes, others in relatively
high-level unconscious cognitive systems like the language faculty, and others in lower level systems
that may implement the higher level systems. There are various views about the nature of these
representations. One widely held view (Fodor 1975) is that most mental representations, at least those
involved in propositional attitudes, belong to an internal language of thought, ‘mentalese.’ Mentalese
contains basic expressions (predicates, names, connectives, etc.) and rules of combination. The rules
allow for the construction of a potential infinity of complex expressions from a finite basic
vocabulary. There are important differences between a natural language like English and mentalese. The
most important difference (as will be further discussed in item (g)) is that mentalese syntax is logical
form. Whereas English can be used for communication, mentalese is used for thinking. Understanding a
natural language can be accounted for in terms of processes that ‘translate’ natural language into
mentalese. But understanding mentalese cannot be accounted for in the same way, on pain of regress.
Rather, mentalese is not literally ‘understood’ but is the language in which thinking and other mental
processes take place. Whereas natural languages must be learned, many proponents of mentalese think that
in some sense it is innate.

(f) What is thinking? Perception and rational thinking involve mental states that are semantically
related. For example, a person sees McGwire swing at a ball and the ball go over the centerfield fence.
He thinks that it is home run number 62 and then that McGwire has broken Maris’ home run record. On the
representational theory of mind the process running from the visual perception to the thought that the
home run record has been broken consists of many representations, e.g., a visual representation of
shapes and colors, a perception that the ball is going over the fence, etc. These are not arbitrarily
related but are related to each other by virtue of their semantic features. These semantic features
involve relations to things external to the mind, e.g., to McGwire, the property of being a home run,
and so on. How does the mind ‘know’ to go from a representation that refers to one thing to a
representation that refers to a related thing if it has access only to the representations and not to
their references? A closely related question is how the mind can engage in reasoning that leads it from
some true representations (say about light striking the retinas) to other true representations (say
about the scene in front of the eyes). The computational theory of mind suggests answers to these
questions. It says that mental processes like thinking and perceiving are computations on mental
representations. A computer is a device which is able to follow a program for manipulating
representations on the basis of their syntax. So any relation which can be reduced to or encoded in
syntactical relations can, in principle, be computed. Since logical relations, e.g., logical
implication, can be reduced to syntactical relations among sentences whose syntactic forms are logical
forms (as is the case for Mentalese), computation can account for logical inference. By encoding
semantic features in syntax the computer can manipulate representations in ways that respect their
semantics. Researchers in computer science have shown how many tasks that apparently require
intelligence can be reduced to computations.

According to the computational theory of mind (CTM), the mind is a computer and mental processes consist
of computations on Mentalese representations. In this way the mind is able to manipulate representations
that are semantically related. Of course there is a big difference between a computer and a mind. The
computer is programmed by a human programmer who supplies interpretations for the symbols it
manipulates. If the CTM is correct, the program that the mind implements is a product of the structure
of the brain and that, presumably, is at least partly dependent on evolution. The interpretation of the
symbols that the mind manipulates is not provided by a programmer. Exactly what determines the semantic


features of mental representations is discussed in item (i).

Not every cognitive scientist is impressed by CTM as an account of mental processes. Some go along with
the idea that mental processes involve a kind of computation but conceive of computation very
differently from the CTM account. They suggest that the mind has a connectionist architecture. I discuss
this view in the next section. The most famous philosophical objection to CTM is due to John Searle
(1980). Suppose that understanding a language, e.g., Chinese, involves following a program. Searle
argues that this cannot be correct since it is possible for a person who knows no Chinese, but is able
to follow program instructions, to implement the program just as a computer does. The program is written
so that questions in Chinese are input and answers in Chinese are output. Searle observes that although
the man implements the program he clearly has no clue as to the meanings of the Chinese symbols. There
have been many replies to this objection. Perhaps the most convincing from proponents of CTM is that
implementing a program is necessary but not sufficient for language understanding. The language must
also be translated into the person’s Mentalese. Of course this still leaves open the question of what it
is for a symbol of Mentalese to represent, which will be discussed in item (i).

A different worry about CTM is whether it can account for certain kinds of reasoning, specifically
inductive inference. Inductive reasoning involves considering various hypotheses and coming up with the
one that is best supported by the evidence. We employ it in producing explanations, identifying causes,
and so forth. For example, Sherlock Holmes solved a case when he realized that the fact that no dog
barked was evidence that the murderer was known to the dogs. The worry, which is sometimes called ‘the
frame problem’ (Pylyshyn 1987), is that no computational program can realistically perform this kind of
task. In inductive reasoning almost any bit of information may be relevant. We seem to have the ability
to survey a great deal of what we know and come up with what is evidentially relevant. But a program
that operated by surveying a vast number of representations and evaluating them for relevance would seem
to be completely impractical. There are too many computations to be performed. One response to this is
to think that CTM may provide good accounts of the information processing that occurs in mental modules
but that it is not very good at accounting for the mental capacities of a general reasoner. Thus those
who like CTM are attracted to the view that the mind is massively modular.

(g) Classical or connectionist architecture? In the classical account of computation discussed in item
(f), the architecture of the mind is that of a classical computer (or system of such computers) and
mental processes are operations on linguistic-like representations by virtue of their syntactic forms.
There is an alternative account of cognitive structure and computation, called ‘connectionist
architecture.’ Roughly, a connectionist system consists of a network of nodes joined together in a
pattern of connections. Each node can be activated (or activated to a certain degree) and can receive
signals and send signals to certain connected nodes. Whether or not a signal travels via a connection
from node A to node N depends on the weight of the connection. Some nodes are activated by external
stimuli (input nodes) and others send signals outside the network. At any time the state of the network
is determined by the weights of the connections and the activation of the nodes. Signals that activate
the input nodes are propagated throughout the system to the output nodes. Thus connectionist systems can
be thought of as computing certain outputs given certain inputs. Further, the connectionist network can
be ‘trained’ by altering the connection weights depending on whether or not a given output is
‘appropriate’ for a given input. Connectionist networks have been constructed that, through such
training, perform a number of tasks, e.g., recognizing letters of the alphabet, ‘reading’ text,
recognizing faces, and so forth.

Connectionist cognitive architecture is apparently quite different from a classical computer. There is
no ‘executive’ that is following a program. Computations are not performed on sentences but on the
totality of connections among nodes. Although the state of a connectionist system can be thought of as a
representation, say as representing that the cat is on the mat, unlike sentences of mentalese it needn’t
contain any parts that correspond to ‘the cat,’ ‘is on,’ and ‘the mat.’ Proponents of connectionism
think that it provides a model of mental states and processes that is more plausible than classical
accounts. One reason is that some find it difficult to believe in mentalese. Another is that it seems
natural for connectionist systems to implement vague concepts since a connectionist system can be
trained to respond in a graded manner. The holism of connectionist representation is also appealing to
some and there are suggestions that its holistic features may help with the frame problem mentioned in
item (f). Finally, connectionist networks are reminiscent of assemblies of neurons and so strike some as
biologically realistic. Proponents of classicism point out that connectionism is more than reminiscent
of behaviorism. Like behaviorism it is an associationist psychology that construes mental processes in
terms of establishing and modifying associations. Although it is in a sense holistic it is far from
clear how that will help account for inductive inference. In fact critics of connectionism set up a
dilemma (Fodor and Pylyshyn 1988, Fodor and McLaughlin 1990): either the connectionist architecture
implements a classical architecture, in which case it is not really an alternative to classicism, or it
fails to account for essential features of thought. These features are productivity and systematicity.
Productivity involves the fact that once a thinker has basic concepts she is able to produce a potential
infinity of novel thoughts involving those concepts. Systematicity is the feature that any thinker who
can think a thought can think related thoughts that apparently have the same components. For example, if
one can think Jack loves Jill then one can also think Jill loves Jack. These features are easily
accounted for by classicism since the thoughts correspond to syntactically structured representations.
But a connectionist system can be capable of being in a state that represents that Jack loves Jill
without being capable of being in a state that represents that Jill loves Jack, since it need not have
parts that correspond to Jack, loves, Jill. If the connectionist system does have such parts then it is
merely implementing a classical system.

(h) What are concepts? The concept plays an important role in the cognitive sciences. Thoughts (beliefs,
memories, desires, etc.) are composed of concepts and so what mental processes a thinker can engage in
depends on what concepts she possesses. Developmental psychology is interested in how people acquire
concepts and whether some concepts are innate. There are various views about the nature of concepts.
Advocates of RTM think of concept tokens as representations but there is a wide diversity of views on
what makes a particular representation a particular concept, say the concept horse. One view is that a
concept is something like a definition. For example, the concept horse may be the definition ‘is a large
land mammal that has been domesticated for riding.’ To possess the concept horse is to know the
definition. This view has come under much deserved criticism. One problem is that not all concepts can
have definitions without circularity. Even more serious is that most words (and the concepts they are
associated with) do not seem to have definitions at all. There are horses that are not large and large
animals that have been domesticated for riding (e.g., elephants) that are not horses. Another view is
that concepts are prototypes. A prototype consists of a core exemplar (a representation of something
that is a paradigm example of the concept) and a similarity metric that determines how close something
is to the paradigm. For example, the concept bird consists of the representation robin and a metric that
makes eagles pretty good birds and penguins pretty bad ones. But while there is evidence that thinkers
do judge instances of a concept as better or worse examples, the account faces some of the same
difficulties faced by the definition account. A somewhat more general approach considers the inferences
that a thinker is disposed to make concerning thoughts containing a concept as individuating the
concept. Such ‘conceptual role’ theories of concepts face a dilemma. Either all of the inferences
involving the concept are individuative of it (holism) or only some are (molecularism). If the first,
then as our beliefs change so do our concepts, and it is unlikely that two people ever share the same
concept. If the second, then the question arises of what makes some inferences concept constituting.
There are arguments in the philosophical literature (Quine 1960, Fodor and LePore 1992) that there is no
principled distinction between the two but also some proposals (Peacocke 1992) for how to make the
distinction. Finally, there is the view that concepts are expressions in mentalese that are individuated
by their syntax and by their reference (Fodor 1998). This view allows for thinkers with very different
beliefs to share the same concept. But it also allows for the bizarre possibility of someone possessing
the concept horse while believing that horses are edible fruits.

Cognitive theories that posit innate knowledge are also committed to the innateness of the concepts that
constitute the knowledge. Some cognitive scientists go much further and claim that many of our concepts
are innate. One reason for this is the difficulty in accounting for how concepts can be learned. As
Fodor (1980) observed, they cannot be learned by testing various hypotheses about them since the
formulation of the hypotheses already requires possessing the concept. At one time Fodor thought that
this line of argument showed that even the concept carburetor is innate. Fodor has since moderated his
view but there is no consensus concerning how concepts are acquired.

(i) How does the mind represent the world? What makes a component of a mental state a representation is
that it possesses semantic properties, e.g., it refers, has truth-value, and so on. But what exactly
determines that a given representation possesses a certain semantic property? The Cartesian tradition
generally thought that intentionality is a distinct and basic feature of mental substance. But most
philosophers of cognitive science who think that there are mental representations think that whatever
determines semantic features has to be within the realm of natural science. On the view that was once
widely held in philosophy that concepts are images the answer to this question is that resemblance makes
for representation. But even cognitive scientists who posit mental images do not think that these
literally resemble their references. The two views that are currently most widely advocated are
informational semantics and teleological semantics or some combination of the two (Millikan 1984, Fodor
1987, 1990, Loewer 1998). Simplified informational semantics says that the fact that a certain state
carries certain information under certain circumstances or is reliably caused by certain properties
under certain circumstances determines its semantic properties. Teleological semantics says that
semantic properties of a representation are determined by its biological function. A simplified combined
view is that the function of carrying certain information under certain circumstances determines a
representation’s semantic properties. For example, it is not implausible that there is a certain system
of a frog’s brain with the function of being in a particular state R


if and only if a fly is nearby when the circumstances are normal (e.g., in a pond, good light, etc.). If
the circumstances are normal then an occurrence of R in the frog’s brain carries the information (to
other parts of the frog’s brain) that a fly is present. This kind of account fits very nicely with the
view that the mind is a kind of information processor. But whether it can be developed so as to provide
a plausible account of the semantics of the mental representations involved in human thought is a big
and very open question.

See also: Cognitive Modeling: Research Logic in Cognitive Science; Cognitive Neuroscience; Cognitive
Psychology: Overview; Cognitive Science: History; Cognitive Science: Overview; Consciousness and
Sensation: Philosophical Aspects; Intentionality and Rationality: A Continental-European Perspective;
Intentionality and Rationality: An Analytic Perspective; Knowledge (Explicit and Implicit):
Philosophical Aspects; Marr, David (1945–80); Reference and Representation: Philosophical Aspects

Bibliography

Block N 1995 On a confusion about a function of consciousness. Behavioral and Brain Sciences 18: 227–47
Chomsky N 1954 Syntactic Structures. Mouton, The Hague
Chomsky N 1959 A Review of Skinner’s Verbal Behavior. Language 35: 26–58
Churchland P M 1995 The Engine of Reason, The Seat of the Soul. MIT Press, Cambridge, MA
Davidson D 1980 Essays on Actions and Events. Oxford University Press, Oxford, UK
Dennett D 1981 Brainstorms. MIT Press, Cambridge, MA
Dennett D 1994 Consciousness Explained. Little Brown, New York
Descartes R 1641/1970 Meditations on first philosophy. In: Haldane E S, Ross G R T (trans.) The Philosophical Works of Descartes, Cambridge University Press, Cambridge, UK, Vol. 1, pp. 131–200
Fodor J A 1975 The Language of Thought. Crowell, New York
Fodor J A 1981 RePresentations: Philosophical Essays on the Foundations of Cognitive Science. MIT Press, Cambridge, MA
Fodor J A 1983 The Modularity of Mind. MIT Press, Cambridge, MA
Fodor J A 1987 Psychosemantics. MIT Press, Cambridge, MA
Fodor J A 1990 A Theory of Content: And Other Essays. MIT Press, Cambridge, MA
Fodor J A 1998 Concepts: Where Cognitive Science Went Wrong. Oxford University Press, Oxford, UK
Fodor J A, LePore E 1992 Holism: a Shopper’s Guide. Blackwell, Oxford, UK
Fodor J A, McLaughlin B 1990 Connectionism and the problem of systematicity. Cognition 35(2): 185–204
Fodor J A, Pylyshyn Z 1988 Connectionism and cognitive architecture: a critical analysis. In: Pinker S, Mehler J (eds.) Connections and Symbols. MIT Press, Cambridge, MA
Johnson-Laird P 1983 Mental Models: Towards a Cognitive Science of Language, Inference, and Consciousness. Cambridge University Press, Cambridge, UK
Kosslyn S M 1980 Image and Mind. Harvard University Press, Cambridge, MA
Kripke S A 1982 Wittgenstein on Rules and Private Language. Harvard University Press, Cambridge, MA
Loewer B 1998 Guide to naturalizing semantics. In: Hale B, Wright C (eds.) The Companion to the Philosophy of Language. Blackwell, Oxford, UK
Loewer B, Rey G 1991 Meaning in Mind. Blackwell, Oxford, UK
Marr D 1982 Vision. Freeman and Co.
Millikan R 1984 Language, Thought, and Other Biological Categories. MIT Press, Cambridge, MA
McDowell J 1994 Mind and World. Harvard University Press, Cambridge, MA
Nagel T 1979 Mortal Questions. Cambridge University Press, Cambridge, UK
Peacocke C 1992 A Study of Concepts. MIT Press, Cambridge, MA
Pinker S 1997 How the Mind Works. Norton, New York
Pylyshyn Z W 1984 Computation and Cognition. MIT Press, Cambridge, MA
Pylyshyn Z W 1987 The Robot’s Dilemma: The Frame Problem in Artificial Intelligence. Ablex, Norwood, NJ
Rey G 1997 Contemporary Philosophy of Mind. Cambridge, MA
Ryle G 1949 The Concept of Mind. Hutchinson’s University Library, London
Quine W 1960 Word and Object. Technology Press of MIT, Cambridge, MA
Searle J 1980 Minds, brains, and programs with commentaries. The Behavioral and Brain Sciences 3: 417–57
Searle J R 1992 Rediscovery of the Mind. MIT Press, Cambridge, MA
Skinner B F 1953 The Science of Human Behavior. Macmillan, New York
Smolensky P 1988 On the proper treatment of connectionism. Behavioral and Brain Sciences 11: 1–74
Turing A 1950 Computing machinery and intelligence. Mind 59: 433–60
Wittgenstein L 1953 Philosophical Investigations [trans. Anscombe GEM]. Macmillan, New York

B. Loewer

Cognitive Styles and Learning Styles

1. Style Differences in Cognition and Learning

The chorus line of a popular song once claimed that … it’s not what you do but the way that you do it.
This idea, making the how as important as the what, is intriguing. Furthermore, as the song insists, the
character of an individual is invariably woven into how a task is completed. This can be seen in any
number of human endeavors, for example, sport, art, handwriting, thinking, learning, even conversation.


In any performance then, a sense of person, as well as context, combines to produce a typical pattern,
hallmark, leit-motif, signature, or style. The very same idea underlies the suggestion that individuals
possess a personal way of thinking (cognitive style) or learning (learning style). The following
discussion will introduce the construct of style differences in cognition and learning and consider its
significance for lifelong learning in both education and the workplace.

2. The Style Construct in Psychology

The term construct refers to a psychological idea or notion. Examples of constructs are intelligence,
personality, or self-concept. A style construct appears in a number of academic disciplines; in
psychology it has been used in a number of different areas such as personality, cognition,
communication, motivation, perception, learning, and behavior.

The theory of style, unfortunately, has been characterized by a tendency for researchers to (a) work in
isolation; (b) develop their own instruments for the assessment of style; and (c) create independent
style labels with little reference to the field. A widespread use of the term ‘style’ has led to a
number of different definitions and terminology. Consequently, those interested in validity or
verifiability, and an accepted nomenclature for a theory of style, have faced considerable difficulty.
The idea of style in educational psychology, nonetheless, is recognized as a key construct of individual
differences in human performance.

3. Cognitive Styles

While Allport (1937), in work which developed the idea of ‘lifestyles,’ was probably the first
researcher to deliberately use the ‘style’ construct in association with cognition, the following key
areas of psychology contributed to an emerging field of cognitive style.

3.1 Perception

Experimental work, reflecting an emphasis on the ‘regularities’ of information-processing which were
derived from the German gestalt school of perceptual psychology, led to an early development of the
‘style construct’ of field dependence–independence (Witkin et al. 1971).

Individuals were found to rely upon the surrounding ‘field’ or ‘context’ to a greater or lesser extent
when reorienting an object relative to the vertical. This was subsequently found to correlate with
competence in disembedding shapes from a field, and experimental participants were found to rely either
heavily on the field for orientation or shape discrimination (field-dependent) or little or not at all
(field-independent).

3.2 Cognitive Controls and Cognitive Processes

A second significant influence in the development of cognitive style was the study of cognitive
processes related to individual adaptation to the environment, exemplified by the work of Gardner and
co-workers at the Menninger Clinic in the USA. This work was shaped, originally, by psychoanalytic
theories of ego psychology, which was typified by studies focusing upon variables in ego adaptation to
the environment. This led to the identification of several cognitive processes including perceptual
attitudes, cognitive attitudes, and cognitive controls. Further work related to this area led to several
stylistic labels and models, supporting the general notion of a cognitive style (see Messick 1976).

3.3 Mental Imagery

A third key influence in the development of cognitive style reflected work looking at mental
representation. Early in the scientific study of psychology, attention was given to the notion that some
people have a predominantly verbal way of representing information in thought, while others are more
visual or imaginal. Paivio (1971) further developed this notion with a dual coding measurement of mental
imagery. Riding and Taylor (1976) identified the verbal-imagery dimension as fundamental to the
construct of cognitive style.

3.4 Personality Constructs

A fourth and separate influence on the field of cognitive style involved researchers utilizing
personality-based constructs to develop a model of learning style (Myers 1978). The most influential
model was the Myers–Briggs Type Indicator, developed from Jung’s typology of personality constructs and
‘psychoanalytic ego psychology’ (Jung 1923). It offered an alternative model of cognitive style in
contrast to those flowing from cognitive psychology.

4. The Further Development of Cognitive Styles

A contemporary resurgence of interest in style differences over the last three decades has resulted in
three distinct developments, all involving the generation of new models of cognitive style or learning
styles.

The first development involved researchers independently constructing new models of individual
difference in various aspects of cognitive functioning. This approach tended to conceptualize styles as
the discovery of new psychological phenomena, for example, styles of thinking, intuition, creativity,
decision-making, and motivation.


Examples of some of this work generating additional labels of cognitive style include a model of perceptual style (Gregorc 1982); the Adaptor–Innovator cognitive style of decision-making (Kirton 1994); and the Assimilator–Explorer cognitive style of creativity (Kaufmann 1989). Another more elaborate model of mental self-government presented by Sternberg (1996, 1997) represented a theory of style derived from notions of government. According to this theory, people can be understood in terms of mental government, that is, processes of function, form, level, scope, and learning, which impact upon thinking and learning.
The second development reflected work aimed at a synthesis of theory and a consensus in the understanding of cognitive style. This approach was characterized by a focus upon the construct validity of cognitive style and its application. The work of Curry (1987) and Rayner and Riding (1997), for example, both attempted to synthesize or integrate existing theory of cognitive and learning styles.
The work of Riding over 20 years involved a reclassification of cognitive style models (see Riding and Rayner 1998). The structure of cognitive style was defined as two-dimensional, comprising the Wholist–Analytic style dimension, relating principally to cognitive organization, and the Verbal–Imagery style dimension, relating principally to mental representation.
An individual’s cognitive style was defined as a person’s tendency to process information wholistically or analytically, that is, either as a whole piece or piecemeal, while at the same time mentally representing information using imagery or language. While each dimension was thought to be independent, they were conceptualized as continua and it was not suggested that an individual could only use one or the other way of thinking.
The further development of a computer-based assessment for cognitive style analysis (CSA) by Riding reflected a deliberate attempt to integrate both the Wholist–Analytic and Verbal–Imagery dimensions of cognitive style (see Riding 1991). An extensive number of empirical studies over a number of years were conducted using the CSA at the University of Birmingham and provided evidence to support this model of cognitive style (see Riding and Rayner 1998).

5. Learning Styles
A third more widespread development in contemporary work on style differences in cognition and learning led to the generation of yet more labels, which were described as models of learning styles. This learning-centered tradition of ‘style’ is arguably distinguished by four major features:
(a) the intention of developing new concepts of learning to reduce reliance upon tests of intelligence or ability;
(b) a focus on the learning process and achievement;
(c) a primary interest in the effect of individual differences upon pedagogy;
(d) a parallel construction of new assessment instruments and models of learning style.
The primary concern for educationists working within this learning-centered tradition lay with the process of learning and its context. It focused on individual differences in the process of learning rather than within the individual learner. Models of learning styles in this tradition included the following four groups.

5.1 Models Focusing on the Learning Process: Based on Experiential Learning
These derived from a theory of experiential learning, and the most influential example was the work of Kolb (1976), who described learning style as the individual’s preferred method for assimilating information, in an active learning cycle. Kolb constructed a two-dimensional model comprising perception (concrete/abstract thinking) and processing (active/reflective information processing) as fundamental aspects of an experiential learning cycle.

5.2 Models Focusing on the Learning Process: Based on Orientation to Study
These derived from a theory of information processing and learning processes, and the most influential example, developed by Entwistle, was the ‘Approaches to Study Inventory.’ Entwistle (1981) found that approaches to study often reflected either a surface or deep engagement with the study task. This was later extended to include four key orientations to study: meaning, reproducing, achieving, and holistic. The model was further refined as an integrated conception of the learning process, which described a series of actions linked to specific learning strategies identified in his original model.

5.3 Models Focusing on Instructional-preference
These set out to measure a range of environmental or instructional factors affecting an individual’s learning behavior. A leading example of this type was the Learning Styles Inventory (LSI) developed by Dunn et al. (1989). The learning style elements identified in the LSI were: environmental stimulus (light, temperature); emotional stimulus (persistence, motivation); sociological stimulus (peers, adults); physical stimulus (perceptual strengths, time of day, morning vs. afternoon); and psychological stimulus (global/analytic, impulsive/reflective).


5.4 Models Focusing on Cognitive Skills, and Learning Strategy Development
The fourth group of learning style labels focused on an individual’s developing cognitive ability and repertoire of cognitive skills or ability to learn, together with related behavioral characteristics, which were understood to comprise an individual’s learning profile. Learning style was typically perceived as a multimodal construct and understood to describe a range of intellectual functioning relating to the learning activity.
An example of this type of model was the Learning Styles Profile developed by the North American Association of Secondary School Principals (Keefe 1988). This style construct described 24 key elements in learning style, grouped into three categories: cognitive skills, relating to aspects of information processing; perceptual responses, encompassing perceptual responses to data; and study and instructional preference, referring to motivational and environmental elements affecting learning preferences.

6. Cognitive Style or Learning Styles
The models in the learning-centered tradition shared several limitations. First, they reflected a construct that by definition was not stable: it was grounded in process and therefore susceptible to rapid change. Second, they did not describe a developmental rationale for the concept of learning style nor easily correspond to other models of assessment, thereby suggesting a problem for conceptual validity. Third, they attracted peer criticism for lacking psychometric rigor and a systematically developed theory supported by empirical evidence (see Grigorenko and Sternberg 1995).
The learning-centered tradition, however, reflected a continuing need for a theory of individual differences which could be applied to the learning context. It also reinforced previous work in the area of cognitive style and pointed to the potential for profiling the personal learning style of an individual (see Rayner 2000).

7. Implications of Style for Lifelong Learning
At the 1997 Seventh International Conference on Thinking, Howard Gardner argued that in the not too distant future people will look back to the end of this millennium, and laugh at the ‘uniform school.’ They will be greatly amused, he suggested, by the idea that educationists actually believed they could teach the same things to all children at the same time and in the same way. To believe that the uniform school can provide efficient or effective education, he concluded, was to endorse educational failure!
The extent to which an awareness of learning style or the self as a learner is currently considered and managed within the educational context raises key questions for the design of instruction and pedagogy, including a consideration of:
(a) assessment-based learning;
(b) differentiation in the curriculum;
(c) learning method routines;
(d) professional development.
Each of these approaches, if adopted with an eye to considering the benefit of a pedagogy which is style-friendly, encourages interactive learning, builds upon the principles of individual difference, and adopts the idea of developing a learning expertise within the learner, will provide a foundation for lifelong learning (see Rayner 2000).
Implicit in all of this work, and equally relevant to the classroom as to the workplace, is the notion of the matching hypothesis. The full value and significance to the professional of cognitive and learning styles rest ultimately with the belief that if it is possible to make a better match between person and environment, then performance will improve and achievement will be enhanced. Moreover, formal education might be made more effective by matching style to materials, to presentation, mode and structure, through nurturing strategy development to maximize style effectiveness. The delivery of a curriculum, whether for schooling or workplace training, will improve with an increasing depth of differentiation and a match between individual differences with targeted learning or activity. Such an approach will build upon personal strengths, sow seeds of success, and reap the benefits of learning enhancement.
The final word in offering an overview of style differences in thinking, learning, and behavior is left to Sternberg (1996, p. 363). He succinctly stated that in the world of learning and education, ‘styles matter’! This view mirrors that of many workers in the field who remain interested in knowing more about those individual differences which affect human performance.

See also: Mental Imagery, Psychology of; Mental Representations, Psychology of; Metacognitive Development: Educational Implications; Self-regulated Learning

Bibliography
Allport G W 1937 Personality: A Psychological Interpretation. H. Holt and Co., New York
Curry L 1987 Integrating Concepts of Cognitive or Learning Style: A Review with Attention to Psychometric Standards. Canadian College of Health Service Executives, Ottawa, ON, Canada
Dunn R, Dunn K, Price G E 1989 Learning Styles Inventory. Price Systems, Lawrence, KS
Entwistle N J 1981 Styles of Learning and Teaching: An Integrated Outline of Educational Psychology for Students, Teachers and Lecturers. Wiley, Chichester, UK


Gregorc A R 1982 Style Delineator. Gabriel Systems, Maynard, MA
Grigorenko E L, Sternberg R J 1995 Thinking styles. In: Saklofske D H, Zeidner M (eds.) International Handbook of Personality and Intelligence. Plenum, New York
Jung C G 1923 Psychological Types. Harcourt Brace, New York
Keefe J W 1988 Profiling and Utilising Learning Style. National Association of Secondary School Principals, Reston, VA
Kaufmann G 1989 The Assimilator-Explorer Inventory. University of Bergen, Bergen, Norway
Kirton J W (ed.) 1994 Adaptors and Innovators, 2nd edn. Routledge, London
Kolb D A 1976 Learning Style Inventory: Technical Manual. Prentice-Hall, Englewood Cliffs, NJ
Messick S 1976 Individuality in Learning, 1st edn. Jossey-Bass, San Francisco
Myers I 1978 Myers–Briggs Type Indicator. Consulting Psychologists Press, Palo Alto, CA
Paivio A 1971 Styles and strategies of learning. British Journal of Educational Psychology 46: 128–48
Rayner S 2000 Reconstructing style differences in thinking and learning: profiling learning performance. In: Riding R, Rayner S (eds.) International Perspectives on Individual Differences. Ablex, Stanford, CT, Vol. 1
Rayner S, Riding R 1997 Towards a categorisation of cognitive styles and learning styles. Educational Psychology 17: 5–28
Riding R J 1991 Cognitive Styles Analysis. Learning and Training Technology, Birmingham, UK
Riding R J, Rayner S G 1998 Cognitive Styles and Learning Strategies. David Fulton Publishers, London
Riding R J, Taylor E M 1976 Imagery performance and prose comprehension in 7 year old children. Educational Studies 2: 21–7
Sternberg R J 1996 Styles of Thinking. In: Baltes P B, Staudinger U M (eds.) Interactive Minds. Cambridge University Press, Cambridge, UK
Sternberg R J 1997 Thinking Styles. Cambridge University Press, Cambridge, UK
Witkin H A, Oltman P, Raskin E, Karp S 1971 A Manual for Embedded Figures Test. Consulting Psychologists Press, Palo Alto, CA

S. G. Rayner

Cognitive Theory: ACT

ACT is a theory of human cognition that posits particular ways of representing knowledge and the mechanisms by which such knowledge is acquired and used (Anderson 1983, 1993, Anderson and Lebiere 1998). The theory is implemented as a computer simulation system that generates the theory’s predictions, thereby facilitating quantitative comparisons with experimental data.

1. What’s in a Name?
The etymology of the acronym ACT is the subject of some debate. The original definition, Adaptive Control of Thought, has not been consistently used, and several books written about the theory suggest alternative candidates, e.g., The Adaptive Character of Thought (Anderson 1990) and The Atomic Components of Thought (Anderson and Lebiere 1998). However, these publications postdate ACT’s modest beginnings by more than a decade, making the simpler moniker A Cognitive Theory seem the most parsimonious answer.

2. Architectures and Models
Among computational systems designed to model cognition, there is a critical distinction between a cognitive architecture and a cognitive model built within an architecture. A cognitive architecture defines a specific way of representing knowledge and a fixed set of mechanisms for processing knowledge. A cognitive model, on the other hand, specifies the knowledge that is required to perform a particular task. Any one architecture supports a wide variety of models, all of which use the same mechanisms to capture behavior in different tasks, just as the brain presumably employs a common set of mechanisms across a variety of tasks.
The ACT theory is a cognitive architecture. It provides mechanisms for the retrieval and learning of facts (declarative knowledge) and the selection, application, and learning of skills (procedural knowledge). All cognitive models built within ACT share these mechanisms; what differs across ACT models is the task-specific knowledge (i.e., the facts and skills themselves) input to the system. ACT models have successfully fit behavioral data across a wide variety of tasks, including arithmetic, navigation, categorization, and game playing (see Table 1). The success of these ACT models across such varied domains provides support not only for the models themselves but for the explanatory power of the ACT architecture.
This article focuses on the current version of the ACT theory, called ACT-R (Anderson and Lebiere 1998). Nevertheless, a historical sketch of the theory’s evolution and its relation to contemporary research is presented, as well as a brief discussion of future research issues.

3. Basic Features of the Theory
The four main tenets of the ACT theory are as follows:
(a) the ability to perform a complex task can be decomposed into separate pieces of knowledge;
(b) these pieces of knowledge are learned through experience, i.e., using a piece of knowledge is akin to practice;
(c) at any given time, the current focus of attention (also called the goal) influences what knowledge is used; and
(d) there are two types of knowledge, declarative (for facts) and procedural (for skills), that have distinct representations and learning mechanisms.
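
The four tenets can be made concrete with a small sketch. The Python below is illustrative only and is not the ACT-R simulation system: the names `Chunk`, `Production`, `recall_sum`, and `run`, the addition example, and the decay value d = 0.5 are all assumptions chosen for the example. It keeps declarative facts and procedural rules in separate structures (tenet d), lets the goal decide which rule applies (tenet c), and records each use of a fact so that its base-level activation, computed here as the logarithm of the summed terms (t − t_j)^(−d) in line with the article's base-level equation, rises with practice (tenet b).

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Chunk:
    """A declarative fact plus the times at which it was used (hypothetical layout)."""
    slots: dict
    uses: list = field(default_factory=list)

    def base_level(self, now: float, d: float = 0.5) -> float:
        # ln of the summed, power-law-decayed traces of past uses.
        # d = 0.5 is an illustrative decay rate, not a value from the article.
        return math.log(sum((now - t) ** -d for t in self.uses))


@dataclass
class Production:
    """A procedural condition-action pair keyed to the current goal."""
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict, list], Optional[int]]


def recall_sum(goal: dict, memory: list) -> Optional[int]:
    # Retrieve the addition fact that matches the goal's two addends.
    for chunk in memory:
        s = chunk.slots
        if s["a"] == goal["a"] and s["b"] == goal["b"]:
            chunk.uses.append(goal["time"])  # using a fact counts as practice
            return s["sum"]
    return None


ADD_RULE = Production(
    name="answer-by-retrieval",
    condition=lambda goal: goal["task"] == "add",
    action=recall_sum,
)


def run(goal: dict, memory: list, rules: list) -> Optional[int]:
    # Fixed mechanism: fire the first rule whose condition matches the goal.
    for rule in rules:
        if rule.condition(goal):
            return rule.action(goal, memory)
    return None


# Task-specific knowledge: one addition fact, last used at time 1.
memory = [Chunk({"a": 3, "b": 4, "sum": 7}, uses=[1.0])]
before = memory[0].base_level(now=2.0)
answer = run({"task": "add", "a": 3, "b": 4, "time": 2.0}, memory, [ADD_RULE])
after = memory[0].base_level(now=3.0)
```

With one prior use at time 1, the fact's base-level activation at time 2 is ln(1) = 0. After the retrieval at time 2, it rises to ln(1 + 2^(-0.5)), roughly 0.53, at time 3, and `answer` is 7. Noise, attentional activation, and rule selection by expected gain are deliberately left out of this sketch.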


Table 1
A list of tasks and phenomena modeled with the ACT theory

1 Visual search including menu search
2 Subitizing
3 Dual tasking including Psychological Refractory Period (PRP)
4 Similarity judgments
5 Category learning
6 List learning experiments
7 Paired-associate learning
8 The fan effect
9 Individual differences in working memory
10 Cognitive arithmetic
11 Implicit learning (e.g., sequence learning)
12 Probability matching experiments
13 Hierarchical problem solving tasks
14 Strategy selection
15 Analogical problem solving
16 Dynamic problem solving tasks including military command and control
17 Learning of mathematical skills including interacting with Intelligent Tutoring Systems (ITSs)
18 Development of expertise
19 Scientific experimentation
20 Game playing
21 Metaphor comprehension
22 Learning of syntactic cues
23 Syntactic complexity effects and ambiguity effects
24 Dyad communication

Declarative knowledge is represented as nodes in an associative network. The more activated a given node, the more easily the corresponding fact can be accessed. Each node has an associated base-level activation B that increases each time the fact corresponding to that node is accessed (learning) and decays with time (forgetting) according to the equation

$$B = \ln\Bigl(\sum_{j} (t - t_j)^{-d}\Bigr) \qquad (1)$$

where t is the current time, $t_j$ is the time of the jth use of the node, and d is a global decay rate. The quantity B offers a summary description of a node’s past history of use and hence a reasonable estimate of its likelihood to be needed in the future.
Each link between two nodes in the network has a continuous-valued quantity S that measures the strength of association between those two nodes. The current focus of attention works by selecting a subset of nodes in the network to be attended. Each attended node gets a share of a limited amount of attentional activation W. (Because attentional activation is limited, the more nodes in the focus, the less attentional activation for each.) This activation then propagates from each attended node to related nodes in the network in proportion to the corresponding link strengths. This attentional activation produces context effects because only those nodes related to the current focus of attention receive extra activation.
The total activation at node i is the sum of its base-level activation $B_i$ and the attentional activation it receives from nodes in the focus of attention:

$$A_i = B_i + \sum_{j} W_j S_{ji} \qquad (2)$$

where the sum is over attended nodes j. Retrieval of facts is determined by the total activation of the corresponding nodes. Specifically, the node that is retrieved is the one whose total activation plus some added noise is highest, given that this sum is above a global threshold. This added noise represents stochasticity in the system. The time it takes to recall this node decreases exponentially as a function of its (noisy) activation which, combined with Eqn. (1), produces a power-law speedup with practice (see Learning Curve, The).
Procedural knowledge is represented by a set of condition–action pairs called production rules (e.g., IF the goal is to add two numbers a and b, and the fact that c is the sum of a and b can be recalled, THEN say c is the answer). Each production has several associated quantities that reflect how useful it has been in past applications. These quantities are learned by experience and used to compute an estimate of the expected gain from applying each rule. For example, the cost associated with a particular production rule is the weighted average of the rule’s past costs of application (measured in units of time) and a prior estimate of cost. When several rules are candidates at the same time, the one whose expected gain plus some added noise is highest gets selected. In this way, production rules that have been more useful (i.e., more successful and less costly) in the past are more likely to be selected. In addition, each piece of procedural knowledge is strengthened with each application, and that strength decays with time. A production rule’s speed of application increases exponentially with strength.
The ACT theory also posits more complicated mechanisms by which production rules and declarative nodes are initially created. In both cases, the goal plays an important role in the form that new knowledge takes. These separate mechanisms highlight ACT’s distinction between acquiring new pieces of knowledge and refining the continuous quantities associated with each piece of knowledge. Moreover, these two modes of knowledge representation, symbolic (nodes, production rules) and sub-symbolic (activations, costs, strengths), make ACT a hybrid system. This
distinguishes it from wholly symbolic (e.g., Soar) and wholly sub-symbolic (e.g., connectionist) systems.

4. Historical Development
ACT has been under development since the 1970s. It began as a theory of semantic memory and now encompasses learning, memory, problem solving, attention, perception, and action. The following provides a brief sketch of its development and places each version of the theory in its historical context. Until the 1970s, mathematical models were the typical formalism used to describe and predict cognitive psychological phenomena. A disadvantage of mathematical models, however, is that they are limited in the complexity of processes they can describe. Thus, to provide a mechanistic account of complex cognitive processes, computational models began to be developed. The first well-defined version of the ACT theory, called ACTE (Anderson 1976), was one such model. It introduced the distinction between declarative (i.e., factual) and procedural (i.e., skill-based) memory and the notion of declarative activation.
By the 1980s, however, some researchers were questioning whether the development of computational models (and the theories they implemented) was sufficiently constrained to produce reasonable models of the ‘true’ underlying representations and processes (e.g., Anderson 1978, Newell 1973). One approach that reduces this problem involves developing models within a cognitive architecture, where a fixed set of representations and mechanisms are used to test a variety of models. That is, models are constrained to work within the strictures of the architecture. The next version of the ACT theory, ACT* (Anderson 1983) was a cognitive architecture, like others developed around this time (see also Cognitive Theory: SOAR; Newell 1990). ACT* extended its predecessor by specifying an activation calculus for declarative knowledge and a new mechanism for acquiring procedural knowledge.
Besides using ACT* to model a variety of task domains, from language processing to paired-associate learning, this version of the theory was applied to the practical problem of improving computer-aided instruction. So-called ‘intelligent’ tutoring systems were built based on ACT* cognitive models of algebra problem solving, geometry theorem proving, and computer programming (e.g., Anderson et al. 1989, 1990). Because these models could solve the required problems in each domain, they enabled the corresponding tutoring systems to follow students’ problem-solving, give feedback when students made a mistake, and offer hints when students were confused. Moreover, because these systems tracked the steps students were taking and hence the knowledge they were using, specific predictions of the theory could be generated and tested in this scaled-up, real-world learning context.
The latest incarnation of the ACT theory, ACT-R (Anderson 1993, Anderson and Lebiere 1998), was developed in the 1990s, inspired by a rational analysis of cognition (Anderson 1990). Among other things, ACT-R includes a refined activation calculus and a more plausible mechanism for acquiring procedural knowledge. These changes were designed to reflect the way human cognition adapts to the structure of the environment. In addition, ACT-R has been put to the toughest challenges in testing its fidelity to empirical data: ACT-R models have been able to fit fine-grained, multivariate data simultaneously across several experiments (e.g., Anderson and Matessa 1997), and they have captured patterns of performance in individual subjects across tasks (e.g., Lovett et al. 2000).
In sum, the ACT theory has evolved into its current form by virtue of the guiding force of several kinds of constraints. Throughout ACT’s development, experimental data have been used to test the veridicality of the theory. In some cases, this empirical constraint has invoked a reevaluation of some aspect of the theory (e.g., single-trial learning of production rules). More generally, even when the theory’s predictions have been met, refinements were made so that more detailed, fine-grained datasets could be modeled. Besides these empirical constraints on the theory there are theoretical constraints imposed top-down from the architectural status of ACT’s claims. That is, an ACT model may need to be designed in a certain way so that the knowledge it specifies is sufficient to perform the given task when ACT mechanisms are applied. Finally, based on the rational analysis of cognition (Anderson 1990), constraints have been imposed on the theory so that it includes the kind of processing that is necessary and sufficient to meet the demands of the environment.

5. Future Issues
The ACT theory is still under active development. An extension has been added (called ACT-R/PM, Byrne and Anderson 1998) that incorporates perception and motor modules (e.g., eyes, ears, and hands). This extension enables the system to model interaction with the environment. In addition, ACT models have been developed for a variety of new domains, including complex, dynamic tasks such as air-traffic control. In some cases, the data sets being modeled even include eye-movement protocols. Yet another area of development involves exploring the relationship between the algorithmic level of description of the ACT-R computer simulation system (cf. Marr 1982) and a corresponding neural level implementation.

See also: Cognitive Psychology: History; Cognitive Psychology: Overview; Cognitive Theory: SOAR; Knowledge Spaces; Knowledge Representation; Logics for Knowledge Representation; Mathematical


Psychology; Problem Solving and Reasoning, Psychology of;

Bibliography
Anderson J R 1976 Language, Memory, and Thought. Erlbaum, Hillsdale, NJ
Anderson J R 1978 Arguments concerning representations for mental imagery. Psychological Review 85: 249–77
Anderson J R 1983 The Architecture of Cognition. Harvard University Press, Cambridge, MA
Anderson J R 1990 The Adaptive Character of Thought. Erlbaum, Hillsdale, NJ
Anderson J R 1993 Rules of the Mind. Erlbaum, Hillsdale, NJ
Anderson J R, Lebiere C 1998 The Atomic Components of Thought. Erlbaum, Mahwah, NJ
Anderson J R, Boyle C F, Corbett A T, Lewis M W 1990 Cognitive modeling and intelligent tutoring. Artificial Intelligence 42: 7–49
Anderson J R, Conrad F G, Corbett A T 1989 Skill acquisition and the LISP tutor. Cognitive Science 13: 467–505
Anderson J R, Matessa M 1997 A production system theory of serial memory. Psychological Review 104: 728–48
Byrne M D, Anderson J R 1998 Perception and action. In: Anderson J R, Lebiere C (eds.) The Atomic Components of Thought. Erlbaum, Mahwah, NJ
Lovett M C, Daily L Z, Reder L M 2000 A source activation theory of working memory: Cross-task prediction of performance in ACT-R. Cognitive Systems Research 1: 99–118
Marr D 1982 Vision. Freeman, San Francisco
Newell A 1973 You can’t play 20 questions with nature and win: Projective comments on the papers of this symposium. In: Chase W C (ed.) Visual Information Processing. Academic Press, New York
Newell A 1990 Unified Theories of Cognition. Harvard University Press, Cambridge, MA

M. C. Lovett

Cognitive Theory: SOAR

SOAR is a computational theory of human cognition that takes the form of a general cognitive architecture (Laird et al. 1987, Newell 1990, Rosenbloom et al. 1992). SOAR (not an acronym) is a major exemplar of the architectural approach to cognition, which attempts the unification of a range of cognitive phenomena with a single set of mechanisms, and addresses a number of significant methodological and theoretical issues common to all computational cognitive theories (Anderson and Lebiere 1998, Newell 1990, Pylyshyn 1984). SOAR is also characterized by a set of specific theoretical commitments shaped primarily by attempting to satisfy the functional requirements for supporting human-level intelligence, manifest in SOAR’s parallel existence as a state-of-the-art artificial intelligence system (Laird et al. 1987). This focus on functionality, and its attendant theoretical commitments, is what makes SOAR both distinctive and controversial in cognitive psychology. SOAR represents the last major work of Allen Newell, one of the founders of modern cognitive science and artificial intelligence, and a pioneer in the development of architectures as a class of cognitive theory.

1. Multiple Constraints on Mind and Computational Theories of Cognition
Newell (1980a, 1990) described the human mind as a solution to a set of functional constraints (e.g., exhibit adaptive (goal-oriented) behavior, use language, operate with a body of many degrees of freedom) and a set of constraints on construction (a neural system, grown by embryological processes, arising through evolution). The structure of SOAR is shaped primarily by three of the functional constraints: (a) exhibiting flexible, goal-driven behavior, (b) learning continuously from experience, and (c) exhibiting real-time cognition (elementary cognitive behavior must be evident within about a second).
The emergence of computational models of cognition in information processing psychology (and artificial intelligence) represented a significant theoretical advance by providing the first proposals for physical systems that could, in principle, satisfy the functional constraints of exhibiting intelligence (Newell et al. 1958, Newell and Simon 1972). However, they raised a set of difficult methodological and theoretical issues that cognitive science still grapples with today. Among these issues are: (a) the problem of irrelevant specification (in a complex computer program, which of the myriad aspects of the program carry theoretical content, and which are irrelevant implementation details) (Reitman 1965); (b) the problem of too many degrees of freedom (an unconstrained computer program can be modified to fit any data pattern); and (c) the problem of identifiability (any sufficiently general proposal for processing schemes or representations can mimic the input/output characteristics of any other general processing or representation scheme) (Anderson 1978, Pylyshyn 1973).

2. SOAR as a Confluence of Five Major Technical Ideas in Cognitive Science
SOAR can be seen as a confluence of five major technical ideas in cognitive science, which, taken together, are intended to address the three functional constraints summarized above, as well as the fundamental methodological issues concerning computational models.

2.1 Physical Symbol Systems
SOAR is a physical symbol system. The physical symbol system hypothesis asserts that physical symbol systems


are the only class of systems that can in principle satisfy the constraint of supporting intelligent, flexible behavior. Physical symbol systems are a reformulation of Turing universal computation (Church 1936, Turing 1936) that identifies symbol processing as a key feature of intelligent computation. The requirement is that the system be capable of manipulating and composing symbols and symbol structures, that is, physical patterns with associated processes that give the patterns the power to denote either external entities or other internal symbol structures (Newell 1980a, 1990, Simon 1996). The key to the universality of Turing machines and physical symbol systems is their programmability: content can be added to the systems (in the form of programs) to change their behavior, yielding indefinitely many response functions.

2.2 Cognitive Architectures

SOAR is a cognitive architecture. A cognitive architecture is a theory about the fixed computational structure of cognition (Anderson and Lebiere 1998, Newell 1990, Pylyshyn 1984). Computational systems that are programmable must have some kind of fixed structure that processes the variable content: a set of primitive processes, memories, and control structures. The theoretical status of this underlying structure has not always been clear in cognitive models. For example, when a cognitive model is programmed in Lisp, the theorist intends to make some theoretical claims about the program (e.g., that the steps of the program correspond in some way to the cognitive steps of the human performing the task), but probably intends to make no theoretical claims about Lisp as the architecture that executes the program (e.g., the fact that unused memory structures are reclaimed via a garbage collection process is theoretically irrelevant).

A cognitive architecture explicitly specifies a fixed set of processes, memories, and control structures that are capable of encoding content and executing programs. Cognitive models for specific tasks can be developed in such architectures by programming them. The theoretical status of various parts of a programmed implementation is now considerably clarified: what counts is the structure of the architecture (not its particular implementation), and the cognitive model’s program, which makes a set of specific commitments about the form and content of knowledge used in a specific task. Thus, implemented cognitive architectures go a long way toward solving the irrelevant specification problem.

Cognitive architectures, especially those with temporal mappings and integrated learning mechanisms, can also address the degrees of freedom problem and identifiability problem in four ways. First, to the extent that architectures have a constrained temporal mapping, the space of possible programs that yield both the required functionality and temporal profile is considerably reduced. Second, to the extent that architectures have learning components that can acquire new knowledge (e.g., about a specific task), the form of that knowledge is no longer freely under control of the theorist. Third, to the extent that architectures are programmable (and are also constrained by a temporal mapping or learning mechanism), they permit a single set of processing assumptions to be applied to a diverse range of tasks, constraining that theory by a broader range of data. Fourth, to the extent that cognitive architectures are comprehensive and include some perceptual and motor components, they can be used to provide closed-loop models of complete tasks, so that no explanatory power need be ascribed to anything external to the model.

2.3 Production Systems

All long-term memory in SOAR is held in the form of productions (Anderson 1993, Newell 1973). Each production is a condition-action pair. The conditions form access paths and the actions form the memory contents. Productions continuously match against a declarative working memory that contains the momentary task context, and matching productions put their contents (actions) back into the working memory. Productions are the lowest level of elementary memory access available in SOAR, and Newell’s (1990) temporal mapping onto human cognition places them approximately at the 10 ms level. This mapping provides strong constraints on the shape of cognitive models built in SOAR that must operate in real time.

SOAR’s productions form a recognition memory. Such recognition memories have a number of features that make them attractive as models of human memory: they are associational in nature (access is via the contents of working memory); they are fine-grained and independent (which makes them a good match for continuous, incremental learning mechanisms); they are dynamic (a production system by itself defines a computationally complete system that can yield behavior; other processes are not needed to access or execute the memory structures); and they are cognitively impenetrable (their contents and structure may not be arbitrarily searched over, examined, or modified, but only accessed via automatic association). All of these properties place them in sharp contrast to memories in digital computers, which are static structures (not processes), freely addressable by location.

3. Search in Problem Spaces Supported by a Two-level Automatic/Deliberate Control Structure

SOAR achieves all cognition by search in problem spaces, and architecturally supports this by a flexible,
two-level recognize–decide–act control structure. Problem spaces are based in part on the idea that search in combinatoric spaces is the fundamental process for attainment of difficult tasks. The nature of such search is seen most easily in tasks like chess that have a well-defined set of operators and states. A search space consists of a set of (generated) representational states and operators that transition between states.

Problem spaces as realized in SOAR extend the standard notion of search in an important direction: problem spaces are taken to be the fundamental way that humans accomplish all cognitive tasks, including routine (i.e., well-practiced) tasks. SOAR is, therefore, one realization of the problem-space hypothesis (Newell 1980b), which asserts that all deliberate cognitive activity occurs in problem spaces. The key to this move lies in the role of knowledge in problem spaces: problem spaces freely admit of any amount of knowledge for guiding search, executing operators, or formulating the space initially in response to a task. Because SOAR provides a set of mechanisms (described next) that support this kind of knowledge use, behavior in SOAR spans the well-known continuum between knowledge-intensive processing (little search) and knowledge-lean processing (much search) (Newell 1990).

Supporting knowledge-driven search places strong functional demands on the architecture’s control structure: at any step in the problem-solving process (selecting the next operator, generating the next state, etc.), any relevant knowledge must be brought to bear. There are two parts to the solution to this problem: the mechanisms for appropriate indexing of the knowledge, and the mechanisms for retrieving and applying the relevant knowledge during search. The indexing concerns learning, discussed below.

For retrieving and applying the knowledge during search, SOAR relies on a two-level control structure that separates the automatic access of knowledge via the productions from the deliberate level of problem solving. Each cognitive step is accomplished by a recognize–decide–act cycle. In the recognize phase, all productions that match the current state fire, producing new content in the working memory. Part of this retrieved content is about what the system should do next: the possible operators to try in the current state, the relative desirability of these operators (e.g., operator A is better than operator B), and so on. Next, in the decide phase, a fixed (domain independent) decision procedure sorts out these preferences in working memory to determine if they converge on a consistent decision. In the event that this processing clearly determines the next step, the decision procedure places in working memory an assertion about what that step should be. In the act phase, that step is taken (by additional production rule firings): the move to the next state in internal problem space search, or the release of motor intentions in external interaction. If it is not clear what to do next (e.g., several operators have been proposed, but no knowledge is evoked to prefer one option to another, or there are conflicts in the retrieved knowledge), an impasse has arisen, and the decision procedure records in working memory the type of the impasse, and sets a subgoal of resolving that impasse. In this way, SOAR’s problem solving gives rise automatically to a cascade of subgoals whenever the knowledge delivered by the recognition memory is insufficient for the current task.

The critical feature of this control structure is its run-time, least-commitment nature: each local decision in the problem space is made at execution time by assembling whatever relevant bits of knowledge can be retrieved (by automatic match) at that moment. Decisions are not fixed in advance, and there are no architectural barriers to the kinds of knowledge that can be brought to bear on the decisions.

3.1 Continuous, Impasse-driven Learning

SOAR continuously acquires new knowledge in its long-term memory through an experience-based learning mechanism called chunking (Laird et al. 1987, Rosenbloom and Newell 1983). This mechanism generates new productions in the long-term memory by preserving the results of problem solving that occurred in response to impasses. The conditions of the new production consist of aspects of the working memory state just before the impasse, and the actions of the production consist of the new knowledge that resolved the impasse (e.g., an assertion that one of the proposed operators is to be preferred to the other in the current situation). Upon encountering a similar situation in the future, the production will automatically match and retrieve the knowledge that permits SOAR to avoid the impasse. Thus, chunking is a mechanism that converts problem solving into recognition memory, continuously moving SOAR from knowledge-lean to knowledge-rich processing.

Chunking in SOAR has two important functional properties. First, it begins to provide a solution to the knowledge-indexing problem raised earlier. The system assembles its own indices out of the contents of working memory in a way that is directly aimed at making the knowledge retrievable when it is relevant to the immediate demands of the task at hand. Second, learning permeates all aspects of cognition in SOAR. Chunking applies to all kinds of impasses, so any problem space function is open to learning improvements: problem-space formulation, operator generation, operator selection, and so on.

4. Major Architectural Implications and Specific Domains of Application

SOAR can be used as a theory in multiple ways (Newell 1990). Qualitative predictions can be drawn from SOAR
as a verbal theory, without actually running detailed computer simulations. These qualitative predictions can be both domain-general (cutting across all varieties of cognitive behavior) and domain-specific. The theory can also be applied to specific domains by developing detailed computational models of a task; this involves programming SOAR by adding domain-specific production rules to its long-term memory, and generating behavioral traces.

4.1 Domain-independent Predictions

A principal prediction of a theory of human cognition is that humans are intelligent; the only way to clearly make that prediction is to demonstrate it operationally. SOAR makes this prediction only to the extent that the system has been demonstrated to exhibit intelligent behavior. As a state-of-the-art AI system that has been applied to difficult tasks (ranging from algorithm design to scheduling problems), SOAR makes the prediction to a greater degree than other psychological theories.

SOAR makes a number of general predictions related to long-term memory and skill (Newell 1990). These include the prediction that procedural skill transfer is essentially by identical elements, and will usually be highly specific (Singley and Anderson 1989, Thorndike 1903); the bias of Einstellung will occur, that is, the preservation of learned skill when it is no longer useful (Luchins 1942); the encoding specificity principle (Tulving 1983) holds; and recall will generally take place by a generate-and-recognize process (Kintsch 1970). The best known of SOAR’s general predictions is the power law of practice, which relates the time to do a task to the number of times the task has been performed (Newell and Rosenbloom 1981, Snoddy 1926).

4.2 Domain-specific Predictions

SOAR models have been constructed across a range of task domains, and the behavior of the models has been compared with human data on those tasks. One area that has received considerable attention is human-computer interaction (HCI). Some of the successes in this area, such as a detailed model of transcription typing (John 1988), are a result of SOAR inheriting the results of the GOMS theory (Goals, Operators, Methods, and Selection rules), a theory developed in HCI to predict the time it takes expert users to do routine tasks (Card et al. 1983). (GOMS can be seen at one level as a specialization of SOAR, missing features such as learning and impassing.) Other SOAR HCI models depend crucially on SOAR’s real-time interruptability (a function of the two-level control structure) and SOAR’s learning mechanism. SOAR models have been developed of real-time interaction and learning in video games (John et al. 1994), novice-to-expert transitions in computer menu navigation (Howes and Young 1997), and a programmer’s interaction with a text editor (Altmann and John 1999), among others. SOAR models have also been developed of problem solving (Newell 1990), sentence processing (Lewis 2000), concept acquisition (Miller and Laird 1996), and interaction with educational microworlds (Miller et al. 1999). In all SOAR models (as with any cognitive model), the explanatory power is shared to varying degrees by both the content posited by the theorist for the particular task and the architectural mechanisms. For example, in the sentence processing model, SOAR’s control structure and learning mechanism, coupled with the real-time constraint, lead directly to a theory of ambiguity resolution that yields a novel explanation of apparent modularity effects and their malleability (Lewis 1996a, Newell 1990), but the architecture provides little apparent constraint on the choice of grammatical theory, which also plays a role in the empirical predictions (Lewis 1996b). Similarly, the general theory of episodic indexing of attention events embodied in the text editor model depends critically on SOAR’s continuous chunking mechanism (Altmann and John 1999), while the specific behavioral traces are a function, in part, of task strategies that could be accommodated by alternative architectures.

5. Critiques of SOAR, and Future Directions

Critiques of SOAR fall into three major classes: critiques of specific models built within SOAR, critiques of the architecture itself, and critiques of the general methodological approach of building comprehensive architectural theories. For example, specific empirical critiques have been made of SOAR models of the Sternberg memory search task (Lewandowsky 1992) and immediate reaction tasks (Cooper and Shallice 1995). The theoretical challenge is understanding the extent to which the empirical problems can be resolved within the existing architecture, or whether they point back to problems in the architecture itself (Newell 1992b). (The fact that the latter is a real possibility demonstrates that the architectural approach has made some headway on the identifiability and degrees of freedom problems.)

At the architectural level, nearly every major assumption of SOAR has been challenged in the literature (see the multiple book review in BBS for a range of assessments; Newell 1992a). Many of these architectural-level criticisms have been aimed at the uniformity assumptions in SOAR (all tasks as problem spaces, all long-term memory as productions, all
learning as chunking), which appear at first to run strikingly against the prevailing mode of theorizing in both cognitive psychology and cognitive neuroscience, which emphasizes functional specialization and distinctions over computational generality. The evaluation of SOAR in light of these concerns is not always transparent, however. For example, the analysis of SOAR’s implications for modularity (particularly in language processing) revealed that SOAR is not only consonant with, but even predicts, many of Fodor’s diagnostics of modular systems (Lewis 1996a, Newell 1990).

Finally, the general approach to cognitive theory that SOAR embraces has come under sharp criticism (most notably by Cooper and Shallice 1995) for not living up to the promise of addressing the methodological concerns identified above, and for not yielding theories with deep empirical coverage that clearly gain their explanatory power from general architectural mechanisms. To the extent that these critiques depend on practice with the SOAR theory specifically, their implications for the broader approach are insecure. Other architectural theories (e.g., ACT-R (Anderson and Lebiere 1998) and EPIC (Meyer et al. 1995)) exist in the field, and each has adopted somewhat different ways of dealing with these methodological issues that may or may not make them subject to the same criticisms.

The evolution of SOAR as a theory, and its broader role in cognitive science, is likely to proceed along two fronts. First, SOAR will remain an important source of ideas for developing theories of complex cognition, even for those theorists who do not embrace the architecture whole cloth, or reject the architectural methodology. A harbinger of this can be seen in cognitive neuroscience: as researchers begin to tackle the problem of understanding the nature of ‘executive’ processes and their realization in the brain, models like SOAR can provide concrete proposals for a set of functionally sufficient mechanisms for the control of deliberate cognition (see the recent volume on working memory and executive control by Miyake and Shah 1999 for evidence of such interaction). Second, SOAR will continue to evolve as a unified set of mechanisms itself, informed in part by the continued application of SOAR to difficult AI problems, and in part by the continued construction and empirical evaluation of detailed models of cognitive tasks that focus on unique aspects of the architecture.

See also: Artificial Intelligence: Connectionist and Symbolic Approaches; Artificial Intelligence in Cognitive Science; Artificial Intelligence: Search; Cognitive Theory: ACT; Deductive Reasoning Systems; Expert Systems in Medicine; Intelligence: History of the Concept; Production Systems in Cognitive Psychology; Scientific Discovery, Computational Models of

Bibliography

Altmann E M, John B E 1999 Episodic indexing: A model of memory for attention events. Cognitive Science 23(2): 117–56
Anderson J R 1978 Arguments concerning representations for mental imagery. Psychological Review 85(4): 249–77
Anderson J R 1993 Rules of the Mind. Erlbaum, Hillsdale, NJ
Anderson J, Lebiere C 1998 Atomic Components of Thought. Erlbaum, Hillsdale, NJ
Card S K, Moran T P, Newell A 1983 The Psychology of Human–Computer Interaction. Erlbaum, Hillsdale, NJ
Church A 1936 An unsolvable problem of elementary number theory. The American Journal of Mathematics 58: 345–63
Cooper R, Shallice T 1995 SOAR and the case for unified theories of cognition. Cognition 55(2): 115–49
Howes A, Young R M 1997 The role of cognitive architecture in modelling the user: SOAR’s learning mechanism. Human–Computer Interaction 12: 311–43
John B E 1988 Contributions To Engineering Models of Human–Computer Interaction. Carnegie Mellon University, Pittsburgh, PA
John B E, Vera A H, Newell A 1994 Toward real-time GOMS: A model of expert behavior in a highly interactive task. Behavior and Information Technology 13: 255–67
Kintsch W 1970 Models for free recall and recognition. In: Norman D A (ed.) Models of Human Memory. Academic Press, New York
Laird J E, Newell A, Rosenbloom P S 1987 SOAR: An architecture for general intelligence. Artificial Intelligence 33: 1–64
Lewandowsky S 1992 Unified cognitive theory: Having one’s apple pie and eating it. Behavioral and Brain Sciences 15(3): 449–50
Lewis R L 1996a Architecture matters: What SOAR has to say about modularity. In: Steier D M, Mitchell T M (eds.) Mind Matters: Contributions to Cognitive and Computer Science in Honor of Allen Newell. Lawrence Erlbaum Associates, Mahwah, NJ
Lewis R L 1996b Interference in short-term memory: The magical number two (or three) in sentence processing. Journal of Psycholinguistic Research 25(1): 93–115
Lewis R L 2000 Specifying architectures for language processing: Process, control, and memory in parsing and interpretation. In: Crocker M W, Pickering M, Clifton C Jr (eds.) Architectures and Mechanisms for Language Processing. Cambridge University Press, Cambridge, UK
Luchins A S 1942 Mechanization in problem solving. Psychological Monographs 54(6): no. 28
Meyer D E, Kieras D E, Lauber E, Schumacher E H, Glass J, Zurbriggen E, Gmeindl L, Apfelblat D 1995 Adaptive Executive Control: Flexible Human Multiple-task Performance Without Pervasive Immutable Response-selection Bottlenecks. University of Michigan, Ann Arbor, MI
Miller C S, Laird J E 1996 Accounting for graded performance within a discrete search framework. Cognitive Science 20: 499–537
Miller C S, Lehman J F, Koedinger K R 1999 Goals and learning in microworlds. Cognitive Science 23(3): 305–36
Miyake A, Shah P (eds.) 1999 Models of Working Memory: Mechanisms of Active Maintenance and Executive Control. Cambridge University Press, Cambridge, UK
Newell A 1973 Production systems: Models of control structures. In: Chase W G (ed.) Visual Information Processing. Academic Press, New York
Newell A 1980a Physical symbol systems. Cognitive Science 4: 135–83
Newell A 1980b Reasoning, problem solving and decision processes: The problem space as a fundamental category. In: Nickerson R (ed.) Attention and Performance VIII. Erlbaum, Hillsdale, NJ
Newell A 1990 Unified Theories of Cognition. Harvard University Press, Cambridge, MA
Newell A 1992a Precis of unified theories of cognition. Behavioral and Brain Sciences 15(3): 425–92
Newell A 1992b SOAR as a unified theory of cognition: Issues and explanations. Behavioral and Brain Sciences 15(3): 464–92
Newell A, Rosenbloom P 1981 Mechanisms of skill acquisition and the law of practice. In: Anderson J R (ed.) Cognitive Skills and Their Acquisition. Erlbaum, Hillsdale, NJ
Newell A, Shaw J C, Simon H A 1958 Elements of a theory of human problem solving. Psychological Review 65: 151–66
Newell A, Simon H A 1972 Human Problem Solving. Prentice-Hall, Englewood Cliffs, NJ
Pylyshyn Z W 1973 What the mind’s eye tells the mind’s brain: A critique of mental imagery. Psychological Bulletin 80(1): 1–24
Pylyshyn Z W 1984 Computation and Cognition. Bradford/MIT Press, Cambridge, MA
Reitman W 1965 Cognition and Thought. Wiley, New York
Rosenbloom P S, Laird J E, Newell A (eds.) 1992 The SOAR Papers: Research on Integrated Intelligence. MIT Press, Cambridge, MA
Rosenbloom P S, Newell A 1983 The chunking of goal hierarchies: A generalized model of practice. In: Michalski R S, Carbonell J, Mitchell T (eds.) Machine Learning: An Artificial Intelligence Approach II. Morgan Kaufman, Los Altos, CA
Simon H A 1996 The patterned matter that is mind. In: Steier D M, Mitchell T M (eds.) Mind Matters: Contributions to Cognitive and Computer Science in Honor of Allen Newell. Erlbaum, Hillsdale, NJ
Singley M K, Anderson J R 1989 The Transfer of Cognitive Skill. Harvard University Press, Cambridge, MA
Snoddy G S 1926 Learning and stability. Journal of Applied Psychology 20: 1–36
Thorndike E L 1903 Educational Psychology. Lemke and Buechner, New York
Tulving E 1983 Elements of Episodic Memory. Oxford University Press, New York
Turing A M 1936 On computable numbers, with an application to the Entscheidungsproblem. Paper presented at the Proceedings of the London Mathematics Society

R. L. Lewis

Cognitive Therapy

1. Definition

Cognitive therapy is one of a large number of psychotherapy approaches that were developed in the latter part of the twentieth century. It is associated principally with its originator, Dr. Aaron Beck. The treatment model falls within a broader class of treatments, which are referred to as the cognitive-behavioral therapies (Dobson 2001). The cognitive-behavioral therapies share certain theoretical underpinnings, notably: (a) that cognitive events affect behavior (the mediational hypothesis); (b) that cognitive events can be assessed and systematically modified (the accessibility hypothesis); and (c) that cognitive change can be employed to cause therapeutic changes in behavior or adaptive functioning. In this regard, cognitive therapy is similar to other schools of cognitive-behavioral psychotherapy, such as rational-emotive psychotherapy.

Cognitive therapy distinguishes itself from other cognitive-behavioral therapies by the particular organization of its theoretical constructs. The model proposes that the manner in which an individual views, appraises, or perceives events around himself/herself is what dictates their subsequent emotional responses and behavioral choices. Although the content of situation-specific appraisals will vary as a function of the person’s activities, the model also posits that these appraisals may be accurate or distorted, positive or negative. For example, an individual reacts to what another individual said in an interpersonal interaction, as well as to what he or she thinks about those statements.

The extent to which the individual sees situations accurately, or conversely may be distorting, is in part a function of the individual’s core beliefs (also referred to as underlying assumptions, or schemas). These core beliefs are hypothesized to be intrapsychic phenomena that emerge over the person’s lifetime, based on their experiences. Once established, these beliefs not only increase the likelihood of certain cognitive reactions to life events, but also then influence proactively the way in which an individual chooses to spend their time, career, partner choices, etc., which also tend to reinforce these beliefs. Thus, over time cognitive beliefs become the basic factor in what may become an increasingly negative and closed feedback loop (see Fig. 1).

Cognitive therapy is a systematic treatment, which is founded on the cognitive model of distress. The principal emphases in therapy are assisting the patient to identify, evaluate, and modify the potentially faulty information processing they engage in, as well as the underlying beliefs or schemas that drive that information processing. This process typically begins with an assessment of adaptive functioning and behavioral patterns that might either enhance or interfere with the process of treatment. In the case of depression, the patient may be encouraged to increase their activities to alleviate negative mood, and to ensure that the patient is at least engaged in their life sufficiently to generate negative information processing that can then be the focus of treatment. In the case of marital distress, negative interactional patterns (e.g., fighting, spousal abuse) will be assessed and modified before interventions aimed at relationship beliefs will be attempted.


Figure 1
The cognitive therapy model of emotional distress

Once the patient is able to engage in a review of their negative thinking, a number of methods can be employed to assess this thinking. Strategies include counts of particular thoughts, questioning in the therapy session, thoughts about imagined situations, or the use of a written thought record (J. Beck 1995). Dependent on the pattern of dysfunctional thinking that is identified, various therapeutic techniques can be employed to change these patterns. For example, if a patient recurrently believes they ‘cannot’ be assertive with certain people, the therapist and patient can collaborate to set up circumstances in which they can test this idea empirically. In doing so, adaptive thinking and various behavioral competencies can be enhanced.

At a certain stage of treatment, it is typically the case that a patient’s situation-specific automatic thoughts are accurate and adaptive, and that they are functioning more adaptively than when they first arrived for treatment. At this stage of treatment, the focus will move to the more general beliefs that prompted the patient’s problems in the first instance. This work includes a search for the common themes underlying specific thoughts. These themes can take the form of beliefs about how ‘the world’ operates, or schemas that have developed about the self. Techniques that can be used include inductive questioning, a review of the negative thinking patterns seen by the therapist and patient, examination of historical patterns of thought and behavior, and a review of life events related to the problem. Once identified, dysfunctional beliefs can be challenged systematically through behavioral experiments, bibliotherapy, open discussion with key people in the patient’s life, and other ‘assumptive techniques’ (J. Beck 1995).

In summary, cognitive therapy is a systematic and progressive form of treatment that typically includes assessment and modification of adaptive functioning, situation-specific automatic thinking, and more engrained, long-term beliefs and self-schemas. The cognitive content of various forms of disorders is increasingly understood, and it includes themes of danger in anxiety (Beck and Emery 1985), loss in depression (Beck et al. 1979, Clark et al. 1999), and transgression in anger (Beck 1999). Further, the cognitive processes associated with various disorders (e.g., magnification of perceived danger in anxiety) are also increasingly understood.

2. Intellectual Context

Beck’s original interest was in understanding and treating depression. He began his departure from his original psychoanalytic training through a series of studies of the dreams and daytime thoughts of depressed patients, in which he discovered that these thoughts had stereotypical content. Further, he hypothesized that these depressed patients engaged in negative distortions of their world in order to obtain these thoughts. Based on these observations, he developed a treatment in which the negative cognitions were systematically tested, in order to undermine cognitive distortions and the negative thinking seen in depression (Beck et al. 1979).

The early formulation of cognitive therapy was put to its first empirical test in an outcome study in the late 1970s. Based on a developed treatment manual, cognitive therapy was contrasted with antidepressant medication. This study revealed that cognitive therapy had equal outcome to medication in the short term and superior outcome at follow-up. As a consequence of the original success of cognitive therapy, a number of outcome studies were conducted in the area of depression in subsequent years.

The first meta-analysis of these studies (Dobson 1989) used a standard outcome measure, the Beck Depression Inventory, and compared cognitive therapy to other forms of treatment. This analysis revealed that cognitive therapy patients ended treatment 0.5 standard deviations less depressed than patients receiving other treatments. Further, compared to wait list or placebo conditions, the effect size for cognitive therapy indicated that it was over 2 standard deviations superior to no treatment. Although the results
of the above meta-analysis were criticized due to questions about the selection process for including studies, the conclusion that cognitive therapy is at least equal to pharmacotherapy has not been challenged seriously. Further, a more recent meta-analysis (Gloaguen et al. 1998) has confirmed the results found earlier by Dobson (1989).

More specifically, Gloaguen et al. (1998) found similar effect sizes to the earlier analysis, although the overall magnitude of these effects was somewhat attenuated. Notably, in an examination of long-term effects, it was also reported that relapse rates from cognitive therapy were approximately one-half of those observed for drug therapy for depression. Most recently, DeRubeis et al. (1999) conducted a ‘mega-analysis’ of treatment for depression, in which they collapsed the data from four outcome studies and analyzed the effects of cognitive therapy for depression, relative to drug therapy. They concluded that these two forms of therapy were equally effective, even for severely depressed patients. As such, it is an established fact that cognitive therapy is an effective treatment for clinical depression.

Notwithstanding the demonstrated efficacy of cognitive therapy for depression, a number of questions continue to require theoretical and empirical attention. The model underlying cognitive therapy states that cognitive distortions and situation-specific negative thoughts emerge from an interaction between core beliefs (also referred to as cognitive schemas, or underlying assumptions) and the life events that impinge on these beliefs. Once the core beliefs are activated and negative cognitions are produced, the emotional and behavioral consequences of these thoughts naturally flow (see Fig. 1). This mediational

Further issues exist with regard to cognitive therapy of depression. These include the role of patient characteristics that potentially affect treatment outcome, the role of life stress in depression, the recent focus on schema-focused cognitive therapy (Clark et al. 1999), the risk of relapse relative to other treatments, and assessment issues relative to the adherence and competence of therapists in providing cognitive therapy.

3. Changes in Focus or Emphasis Over Time

Branching out from the early work on depression (Beck et al. 1979), Beck and his colleagues extended the cognitive model to other clinical conditions. Beck and Emery published a treatment manual for anxiety disorders in 1985, which was followed by works dedicated to marital disorder (Beck 1988), personality disorders (Beck et al. 1990), substance use and abuse (Beck et al. 1993), and, most recently, anger and aggression (Beck 1999). Other works on cognitive therapy have appeared for bipolar disorder, as well as numerous chapters in various sources (see J. Beck 1995 for a review).

Throughout the development of cognitive therapy, some features have remained constant. First, while the content of cognition related to various forms of disorders necessarily varies, there has been a consistent emphasis on the process of thinking in cognitive therapy that permits many of the treatment techniques to be applied across disorders. For example, a common intervention in cognitive therapy is assessing the reality basis, or veridicality, of certain perceptions. This intervention can be used successfully in many
model, while intuitive and having lead to a successful forms of anxiety disorders, depression, marital dis-
treatment technology, has yet to be validated through tress, and anger-related problems.
research. In fact, based on a comprehensive exam- A second consistent emphasis in cognitive therapy
ination of the research that has tested the various has been a focus on treatment efficacy. From the
assumptions of cognitive therapy (Clark et al. 1999), outset of the development of cognitive therapy, Beck
the conclusion is that cognitive mediation remains to and colleagues have maintained that efficacy studies
be proven. are required to establish the clinical utility of cognitive
Another challenge for the cognitive therapy of therapy. No doubt it is in part due to this emphasis
depression model comes from a component analysis of that many of the psychological treatments now being
this treatment. Jacobson et al. (1996) conducted a recognized as empirically supported are variants of
randomized clinical trial in which depressed patients cognitive therapy.
received either only the behavioral interventions asso-
ciated with cognitive therapy, behavioral interventions
and those aimed at situation-specific distortions, or 4. Methodological Issues or Problems
the complete cognitive therapy program. Contrary to
predictions, all three treatment conditions had equal Two primary sets of methodological issues have
outcomes, both in the short term and at up to two constrained the development of cognitive therapy. The
years follow-up. This study suggests that the cognitive first of these is related to the methodological issues
interventions that are the hallmark features of the inherent in clinical research. The randomized clinical
treatment may not, in fact, contribute to outcome trial methodology, which has become the standard in
more than the behavioral components of the treat- the development of psychotherapy, makes a number
ment. If replicated, these results suggest that the active of requirements on investigators. These include the
ingredients of cognitive therapy should be recon- need for a well-defined independent variable, which in
ceptualized. the context of psychotherapy means a treatment

manual. Unfortunately, a consequence of this requirement is that fully developed treatment manuals are often developed before the research that is needed to substantiate them has been conducted. A second requirement of clinical trials is that the subject group be clearly specified. In psychotherapy research, this requirement has often been translated into using diagnostically related groups, ideally with few or no complicating co-morbid conditions. The relative homogeneity of research participants, while providing a good test of the intervention, unfortunately leads to what may be relatively poor generalizability of research findings. The requisite training for research trials has also been controversial. A fourth area of controversy surrounds the measurement and evaluation of outcome. Typically, the outcomes of a given treatment can be assessed in a number of ways (diagnostically, at the symptom level, or using specialized assessment tools), and the analysis of the outcomes can also be handled in several ways. As a consequence of these various strategies to conduct clinical trials, the scientific status of various treatments can be challenged. Given that most psychological treatments are developed in universities with publicly funded grants, and further given the necessarily limited set of questions that any one treatment study can address, it is not surprising that considerable efficacy and effectiveness research in the area of cognitive therapy is needed.
The second major methodological issue for cognitive therapy is the measurement of mechanisms of change. The cognitive model of distress makes a series of assumptions about the nature of cognitive processes involved in the development and treatment of various disorders (see Fig. 1). The measurement of cognition has been a difficult task, however, particularly as some of the constructs in the model are hypothetically latent until activated (Ingram et al. 1998). Even in the area of depression, where the greatest concentration of work has taken place to assess these constructs, some of the hypothesized mechanisms of change have yet to be established (Clark et al. 1999, Ingram et al. 1998). Research at the level of both theory and therapy is required to further evaluate the relationship between cognitive change and other parameters of clinical change in cognitive therapy.

5. Probable Future Directions of Theory and Research

Despite the success of cognitive therapy, and its phenomenal growth over the past two decades, there is no doubt that considerable development remains (Dobson and Khatri 2000). Some of the directions for this work that have been identified in the literature include:
(a) A need for continued emphasis on efficacy research, particularly as cognitive therapy is applied to new and innovative areas.
(b) A need for effectiveness research that assesses such issues as the utility of cognitive therapy relative to other treatments, the clinical acceptability of treatment to patients, and other issues that affect how practical the treatment is to apply in varied clinical contexts.
(c) Continued work on the measurement of cognition (Ingram et al. 1998) and the mechanisms of change in cognitive therapy is clearly warranted. Cognitive therapy is a complex, multi-component treatment, the mechanisms of which are only beginning to be studied. The results of this research will contribute both to the theory and application of the treatment.
(d) Theoretical therapy development and efficacy work are needed to address the issue of whether cognitive therapy is an exhaustive theory that can integrate other models (Dobson and Khatri 2000), or whether it might be integrated optimally into other treatment models.
(e) There is a need for research that assesses the success and failure of cognitive therapy for patients of diverse backgrounds. This ‘aptitude by treatment’ research will help the theory of cognitive therapy to develop, as well as to ensure the most appropriate treatment is provided to different patient populations.
(f) Predicated on the assumption that cognitive therapy continues to enjoy strong clinical outcomes and popularity in the treatment community, further research is needed to understand the optimal method to disseminate this treatment approach. Tied to this development are issues related to how to best measure therapist adherence and competence in cognitive therapy.

See also: Behavior Psychotherapy: Rational and Emotive; Behavior Therapy: Psychiatric Aspects; Behavior Therapy: Psychological Perspectives; Cognitive and Interpersonal Therapy: Psychiatric Aspects; Depression, Clinical Psychology of

Bibliography
Beck A T 1988 Love is Never Enough. Harper and Row, New York
Beck A T 1999 Prisoners of Hate: The Cognitive Bases of Anger, Hostility and Violence. Harper Collins, New York
Beck A T, Emery G 1985 Anxiety Disorders and Phobias: A Cognitive Perspective. Basic Books, New York
Beck A T, Freeman A et al. 1990 Cognitive Therapy of Personality Disorders. Guilford Press, New York
Beck A T, Rush A J, Shaw B F, Emery G 1979 Cognitive Therapy of Depression. Guilford Press, New York
Beck A T, Wright F D, Newman C, Liese B S 1993 Cognitive Therapy of Substance Abuse. Guilford Press, New York

Beck J 1995 Cognitive Therapy: Basics and Beyond. Guilford Press, New York
Clark D A, Beck A, Alford B A 1999 Cognitive Theory and Therapy of Depression. Wiley, New York
DeRubeis R J, Gelfand L A, Tang T Z, Simons A D 1999 Medications vs. cognitive behavioral therapy for severely depressed outpatients: A mega-analysis of four randomized comparisons. The American Journal of Psychiatry 156: 1007–13
Dobson K S 1989 A meta-analysis of the efficacy of cognitive therapy for depression. Journal of Consulting and Clinical Psychology 57: 414–9
Dobson K S (ed.) 2001 Handbook of Cognitive-behavioral Therapies, 2nd edn. Guilford Press, New York
Dobson K S, Khatri N 2000 Cognitive therapy: Looking backward, looking forward. Journal of Clinical Psychology 56: 907–23
Gloaguen V, Cottraux J, Cucherat M, Blackburn I 1998 A meta-analysis of the effects of cognitive therapy in depressed patients. Journal of Affective Disorders 49: 59–72
Ingram R, Miranda J, Segal Z V 1998 Cognitive Vulnerability to Depression. Guilford Press, New York
Jacobson N S, Dobson K S, Truax P, Addis M, Koerner K, Gollan J, Gortner E, Prince S 1996 A component analysis of cognitive behavioral treatment for depression. Journal of Consulting and Clinical Psychology 64: 295–304

K. S. Dobson

Copyright © 2001 Elsevier Science Ltd. All rights reserved.

International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7

Cohort Analysis

A cohort is a set of individuals entering a system at the same time. Individuals in a cohort are presumed to have similarities due to shared experiences that differentiate them from other cohorts. Cohort analysis seeks to explain an outcome through exploitation of differences between cohorts, as well as differences across two other temporal dimensions: ‘age’ (time since system entry) and ‘period’ (times when an outcome is measured). This article exposits difficulties inherent to cohort analysis, indicates promising directions, and provides context.

1. Cohorts

Cohorts as analytic entities appear in the social sciences, the life sciences, epidemiology, and elsewhere. A cohort can be a set of people, automobiles, trees, whales, buildings; the possibilities are endless. System entry can refer to birth (a person is born) or to any dated event (a machine is assembled on a particular date). A set of individuals who begin serving a prison term at the same point in time might also define a cohort for certain purposes, in which case ‘birth’ refers to initiation into a particular role system and ‘age’ becomes duration (time since system entry). The breadth of the time interval that defines membership in a particular cohort depends on analytic considerations and the nature of the phenomenon under study.
The main difficulties inherent to cohort analysis can be illustrated with data from the NORC General Social Survey, a national sample survey conducted annually or biennially in the United States. Respondents in this repeated cross-sectional survey are queried about their emotional wellbeing. Each cell in Table 1 shows the percentage, organized by age and survey year, of those who identified themselves as being ‘very happy.’ Each row shows how happiness levels change across survey year (period) for people within a given age group. Each column reveals age variation in any survey year. The diagonals permit tracking of a single birth cohort over time. Those aged 20–29 in 1973 were born between 1943 and 1953, as were those aged 30–39 in 1983, and those aged 40–49 in 1993. Ten-year age categories lead here to cohorts operationalized as individuals born in contiguous 10-year intervals.
Nominal attempts to extract information from Table 1 would summarize percent ‘very happy’ by rows, columns, and diagonals. The most immediately visible pattern is the association between age and happiness. In all three periods the percentages of respondents who identify themselves as ‘very happy’ are smaller for those under age 40. This reading of the table ignores the possibility that observed variation is due to birth cohort, with younger cohorts less likely to report being happy. Although visual inspection of the diagonals does not suggest a consistent association between decade of birth and eventual happiness, the absence of an observable relationship does not confirm that no such relationship exists. A potential association between birth cohort and adult happiness may have been obscured by the effects of age and period. Finally, there appears to be little systematic variation between years in percent ‘very happy.’ In this instance, however, a potential association between period and

Table 1
Percent ‘very happy’ by age and period

         Period
Age      1973         1983         1993
20–29    29% (347)    30% (372)    28% (278)
30–39    36% (294)    28% (354)    29% (381)
40–49    40% (247)    31% (228)    30% (329)
50–59    41% (253)    29% (212)    38% (205)
60–69    38% (192)    37% (201)    33% (166)
70–79    38% (117)    38% (123)    33% (155)

N = 4454
Source: General Social Survey, 1973–1993
Note: Numbers in parentheses are base Ns for the percentages
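
The reading of Table 1 by diagonals can be sketched in a few lines of Python. This is only an illustration, not part of the original analysis: the names `PERCENT_VERY_HAPPY`, `cohort_label`, and `by_cohort` are hypothetical, and each age group is keyed by its lower bound. The birth intervals follow the convention used in the text (those aged 20–29 in 1973 were born between 1943 and 1953).

```python
# Percent 'very happy' from Table 1, keyed by (lower bound of age group, survey year).
PERCENT_VERY_HAPPY = {
    (20, 1973): 29, (20, 1983): 30, (20, 1993): 28,
    (30, 1973): 36, (30, 1983): 28, (30, 1993): 29,
    (40, 1973): 40, (40, 1983): 31, (40, 1993): 30,
    (50, 1973): 41, (50, 1983): 29, (50, 1993): 38,
    (60, 1973): 38, (60, 1983): 37, (60, 1993): 33,
    (70, 1973): 38, (70, 1983): 38, (70, 1993): 33,
}


def cohort_label(age_lower, year):
    """Birth interval implied by a 10-year age group observed in a survey year."""
    # Mirrors the text: ages 20-29 in 1973 correspond to births in 1943-1953.
    return (year - age_lower - 10, year - age_lower)


def by_cohort(table):
    """Group the cells of the age-by-period array along its diagonals."""
    cohorts = {}
    for (age_lower, year), pct in sorted(table.items()):
        label = cohort_label(age_lower, year)
        cohorts.setdefault(label, []).append((year, age_lower, pct))
    return cohorts


cohorts = by_cohort(PERCENT_VERY_HAPPY)
for label in sorted(cohorts):
    print(label, cohorts[label])
```

For the cohort born between 1943 and 1953 this returns the three cells 29%, 28%, and 30% (ages 20–29, 30–39, and 40–49 in 1973, 1983, and 1993). The grouping also shows that the earliest and latest cohorts are each observed in a single cell only.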

current happiness may have been obscured by age and cohort. A further, linked difficulty is that the data structure is imbalanced. Although periods are represented over all ages, and ages are represented over all periods, the observed age spans of cohorts necessarily differ.
This imbalance cannot be extirpated from analyses that take into account age, period, and cohort simultaneously. Comparable points hold for other data structures supporting the analysis of multiple cohorts.
Table 1 shows that data structures that allow for measurements on multiple cohorts necessarily measure age and period. Furthermore, knowledge of placement on any two of age, period, and cohort determines placement on the third. This dependency can be expressed as:

$$\text{Cohort} = \text{Period} - \text{Age}$$

which raises the questions of whether and how all three of age, period, and cohort can be included in cohort models. The linear dependency between age, period, and cohort, also known as the cohort analysis identification problem (see Statistical Identification and Estimability), is the point of departure for all modern discussions of techniques of cohort analysis. The identification problem is present irrespective of data structure.

2. Data Structures

Three commonly seen data structures engender cohort analysis (for a fuller discussion of data structures, see Fienberg and Mason 1985). The single cross section, however, is excluded from this discussion. Reducing Table 1 to a single column indicates the data structure of a single cross section. With this design, differences on an outcome variable by age can be interpreted either as age or cohort differences. The data structure provides no basis for choosing because there is but a single period. Interpretation of age differences on an outcome as being attributable to factors thought to vary with age or duration requires the assumption that there are no cohort or period differences, or that they are known. Parallel assumptions are necessary for interpreting the age differences as attributable to cohort. Thus, the single cross section does not permit cohort analysis.
Some cross-sectional surveys elicit retrospective data on birth, marital, or other kinds of histories. Figure 1(a) structures this design, a single cross section with retrospective data, as an upper-triangular age by period array with cohorts defined by diagonals.
The retrospective data structure provides information on ages, cohorts, and periods, and introduces longitudinal information on individuals (see Longitudinal Data). When the longitudinal data in this design are used without regard to age, period, and cohort, the analyst is implicitly, and possibly inadvertently, assuming that at least one of age, period, and cohort is not essential. The same point holds for panel study data structures (Fig. 1(b)). A panel study begins with a single cross section, and is followed by one or more panels on the same individuals (units of analysis). This design is intended for the creation of longitudinal data. Figure 1(b) structures the panel study as a lower-triangular age by period array with cohorts on the diagonal. It is triangular because, in the simplest case, panel designs do not replenish the data structure with the addition of new cohorts after the initial cross section. A panel study could also be designed to include new cohorts at successive waves of data collection; the age, period, cohort dependency would remain.
In the replicated cross-section design or time series of cross sections (see Longitudinal Data; Event History Analysis in Continuous Time) illustrated by Table 1,

Figure 1
Retrospective and prospective data structures that
engender age, period, and cohort
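
The dependency Cohort = Period − Age stated in Section 1 can be checked numerically. The sketch below is a hedged illustration only: it assumes hypothetical age-group midpoints (25, 35, ..., 75) for the six rows of Table 1, and it uses a small exact rank routine (`matrix_rank`, a helper written for this example rather than taken from any library) so that no third-party package is needed. It shows that a design with an intercept, age, period, and cohort has one fewer independent column than it has columns, which is the identification problem in its simplest linear form.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via exact Gaussian elimination on fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


AGES = [25, 35, 45, 55, 65, 75]   # assumed midpoints of the age groups in Table 1
PERIODS = [1973, 1983, 1993]      # survey years in Table 1

# One row per cell of the age-by-period array:
# intercept, age, period, cohort (defined as period - age).
design = [[1, a, p, p - a] for a in AGES for p in PERIODS]
without_cohort = [row[:3] for row in design]

rank_full = matrix_rank(design)             # four columns, one exact dependency
rank_reduced = matrix_rank(without_cohort)  # three columns, no dependency

print(rank_full, rank_reduced)
```

Both ranks equal 3: adding the cohort column contributes no new information once age and period are present, so separate linear coefficients for all three cannot be determined from the data alone.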

typically the same individuals are not tracked from one period to the next. However, each cross section can have a retrospective component, and thus this design can be longitudinal. Like the other multiple cohort designs, this structure permits cohort analysis, although the class of models it supports is less rich when retrospective information is unavailable.

3. Cohort Models

Cohort models may be fixed or random effect (see Hierarchical Models: Random and Fixed Effects); terms for age, period, and cohort may enter the model as discrete or continuous; one or more of the age, period, and cohort dimensions may be included in the model via an explicit, substantive measure of that dimension; interactions are possible. These are the most prominent possibilities in the literature on cohort analysis.

3.1 Fixed Effect: Discrete Age, Period, and Cohort

Assume an $I \times J$ age by period array (Table 1 is a $6 \times 3$ illustration), with age groups and period intervals of identical widths. The $K = I + J - 1$ diagonals of the array correspond to cohorts. The basic fixed-effect model treats a parameter ($\theta_{ijk}$) associated with a response variable as a linear function of discrete age, period, and cohort. Using dummy coding for age, period, and cohort, let:

$$\theta_{ijk} = \beta_0 + \sum_{i=2}^{I} \beta_i A_i + \sum_{j=2}^{J} \gamma_j P_j + \sum_{k=2}^{K} \delta_k C_k \qquad (1)$$

where the $A_i$, $P_j$, and $C_k$ ($k = i - j + J$) are dummies for ages, periods, and cohorts, respectively. This is a fixed-effect model because inference is conditional on the ages, periods, and cohorts represented by a particular dataset. Although Eqn. (1) manifests usual normalization restrictions (omission of one dummy from each classification), this is insufficient to break the linear dependency between age, period, and cohort. Omission of all terms in one of the age, period, or cohort classifications eliminates the dependency. This is a satisfactory strategy if prior theory and information suggest that age, or period, or cohort is superfluous.
On the other hand, if all three dimensions are deemed indispensable to the analysis, the dependency must be eliminated by one or more further restrictions on coefficients if the fixed-effect, discrete model is to be employed. Problems with this approach include collinearity among terms on the right hand side of Eqn. (1) remaining after additional coefficient restrictions have been introduced, and coefficient bias. Moreover, restrictions that might appear to be innocuously different, such as equating the coefficients of different subsets of adjacent categories, can lead to quite different sets of age, period, and cohort effects, with the various models all fitting the data equally well, or nearly so. One response to the problems of estimating Eqn. (1) is to massively overidentify one or more dimensions. For example, a priori knowledge may suggest that period can be represented by several, rather than many, categories. The data cannot, however, be relied on to contain the information on which to base overidentifying restrictions (Fienberg and Mason 1985, Glenn 1989, Heckman and Robb 1985, Kupper et al. 1983, Mason and Smith 1985, Wilmoth 1990).

3.2 Fixed Effect: Continuous Time

Let age ($A$), period ($P$), and cohort ($C$) be measured in continuous time, with $C = P - A$. Then the continuous time equivalent to Eqn. (1) is:

$$\theta_{APC} = f_A(A) + f_P(P) + f_C(C) \qquad (2)$$

where $f_A(A)$ is an $(I-1)$th-order polynomial in $A$, $f_P(P)$ is a $(J-1)$th-order polynomial in $P$, and $f_C(C)$ is a $(K-1)$th-order polynomial in $C$. Because of the $C = P - A$ linear dependency, the coefficients of the linear terms for $A$, $P$, and $C$ are not estimable. As in the discrete case, at least one linear restriction must still be imposed and in any event the discrete variable approach is generally preferable.

3.3 Random Effect, Discrete Age, Period, Cohort

It is possible to view the modeling of age, period, and cohort effects from a Bayesian, hierarchical perspective (Nakamura 1986; see Bayesian Statistics; Hierarchical Models: Random and Fixed Effects). In this approach, it is convenient to characterize age, period, and cohort effects through first differences, as in

$$\theta_{ijk} = \beta^*_0 + \sum_{i=2}^{I} \beta^*_i A^*_i + \sum_{j=2}^{J} \gamma^*_j P^*_j + \sum_{k=2}^{K} \delta^*_k C^*_k \qquad (3)$$

where $A^*_i = A_i - A_{i-1}$, $P^*_j = P_j - P_{j-1}$, and $C^*_k = C_k - C_{k-1}$. The approach assumes that the $\beta^*_i$, $\gamma^*_j$, and $\delta^*_k$ are separately distributed, and that they are random or exchangeable (Nakamura 1986, Sasaki and Suzuki 1987, 1989, Glenn 1989, Miller and Nakamura 1996). The exchangeability assumption requires that within the age dimension, within the period dimension, and within the cohort dimension, all permutations of the first-difference coefficients must be equally acceptable. This assumption makes possible the determination of age, period, and cohort effects without resorting to restrictions on coefficients. To the extent that the assumption fails, and it can when there are shocks

2191
Cohort Analysis

(e.g., war, famine, plague, a stock market crash) in the and cohort dimensions, and this is likely to be more
process being modeled, the random effects approach is defensible—because substantively plausible underly-
not a panacea. ing, measured variables should covary with the
phenomena that produce shocks. Moreover, the
exchangeability assumption becomes comparable to
the assumption of a random-error term in a fixed
3.4 Substantie Measurement of Age, Period, or effects approach with measured variables.
Cohort
When one or more of the age, period, and cohort
dimensions, whether represented as discrete or con- 4. Interactions
tinuous, is replaced by a variable chosen to measure
the underlying process thought to be captured by age, The need for interactions often arises, either for
or period, or cohort, the linear dependency is almost substantive reasons (Converse 1976), or for technical,
always broken. Attention is then appropriately fo- adjustment purposes (Mason and Smith 1985). It is
cused on the theoretical and substantive merits of the possible to include interactions in both fixed- and
specification. Models that include age, period, and random-effect models, regardless of the presence of
cohort should be thought of as starting points, given measured variables. Fienberg and Mason (1985) elab-
their inferiority relative to models that are able to test orate readily implemented strategies for doing so in
ideas of how and why a cohort, or period, or age the fixed-effect framework, using either discrete or
mechanism affects an outcome. Concretely, suppose continuous age, period, and cohort. Not all inter-
the cohort dimension is held to reflect the impact of actions are estimable, and hence they cannot be added
measured variable X, then Eqn. (1) might change to: into the model at will.

I J
θijk l β j βiAij γjPjjδX (4)
! 5. Conclusions
i=# j=#
Panel studies, cross-sectional studies with retrospec-
where X is either constant over ages within a cohort, or tive information, and replicated cross sections (in-
varies by age within a cohort. Relative cohort size is an cluding age by period arrays created from process-
example of a variable that could be defined at the birth generated data) engender the analysis of a response
of each cohort, or allowed to vary as cohorts ‘age’ variable as a function of age, period, and cohort as
through the life cycle. Measured variables can also be well as other factors. Such analyses must contend with
employed in extensions and revisions of Eqns. (2)–(3). the linear dependency between age, period, and cohort
For example, in Eqn. (2) the polynomial in C can be membership. The use of one or more measured
omitted upon inclusion of some function of X (X itself; variables held to underlie at least one of age, period, or
a polynomial in X; interactions between X and age or cohort can break the linear dependency. So too can
period). application of credible prior information, whether
In the extension of the random-effects approach expressed as constraints on coefficients in fixed-effect
that includes measured variables, cohort analysis models, or as exchangeability assumptions in random-
becomes a specific case of random effects multilevel effects or hierarchical models. Measured variables
analysis (see Statistical Analysis: Multileel Methods). can, of course, be incorporated into both fixed-effect
This development can be expected in the course of the and random-effect models. This strategy is to be
continued deployment of Bayesian statistical solu- preferred, since it makes it possible to test ideas about
tions, because the use of measured variables can substantive processes in the most direct way. Models
enhance the validity of the exchangeability assum- that include age, period, and cohort can also include
ption. In the random-effects approach, substantive interactions between these dimensions, though not all
variables can be written into the model in the following such terms have estimable coefficients. Models that do
way, for one or all of age, period, and cohort: not explicitly consider all three of age, period, and
cohort, and yet are based on data structures that
β*i l λ Ajλ AXAjτiA permit their inclusion, rest on the implicit assumption
! " that age, or period, or cohort is irrelevant. This
γ*j l λ Pjλ PXPjτjP (5)
! " assumption should and can be assessed.
δ*k l λ Cjλ CXCjτkC
! "
where, for example, XC is a measured variable for 6. Further Reading
cohort, and for simplicity only one measured variable
per dimension has been included, and only as a linear Fienberg and Mason (1985) discuss the formalization
term. In this extension it is the τiA, τjP, and τkC that are of the identification problem; identifiability of non-
assumed to be exchangeable within the age, period, linear components; identifiability of certain inter-

2192
Cohort Analysis

actions beyond those implicit in the simultaneous cussions of this kind can help cohort analysts become
inclusion of age, period, and cohort; polynomial more substantively grounded. Research using measur-
models; and other topics. Kupper et al. (1983) and ed variables in place of accounting categories (e.g.,
Kupper et al. (1985) explore in depth several issues cohort size instead of a cohort classification) is not,
raised by the identification problem for the fixed effect however, in need of such assistance (Easterlin 1980,
discrete case, and take the stance (Kupper et al. 1985) Ahlburg and Shapiro 1984, Welch 1979). Blossfeld
that age-period-cohort models are no more informa- (1986) provides a persuasive example of the use of
tive than exploratory graphical displays. Ploch and massive overidentification within a single dimension.
Hastings (1994) illustrate the use of smoothed per-
spective plots. Robertson and Boyle (1998b) provide See also: Bayesian Statistics; Hierarchical Models:
an overview of different strategies for graphical display Random and Fixed Effects; Longitudinal Data;
of age-period-cohort data. Longitudinal Data: Event–History Analysis in Dis-
In Nakamura’s (1986) Bayesian formulation, identi- crete Time; Markov Chain Monte Carlo Methods;
fying linear restrictions are replaced by the assumption Statistical Analysis: Multilevel Methods; Statistical
of exchangeability of first differences in age, period, Identification and Estimability
and cohort effects (although Nakamura chooses to
emphasize the closely related assumption of ‘smooth-
ness’). Hodges’ (1998, Sect. 2) development of hi-
erarchical models as linear models contributes to an Bibliography
understanding of how Nakamura’s specification over- Ahlburg D A, Shapiro M O 1984 Socioeconomic ramifications
comes the singularity of the fixed effects model matrix. of changing cohort size: An analysis of U.S. postwar suicide
rates by age and sex. Demography 21: 97–108
Becker H A (ed.) 1992 Dynamics of Cohort and Generations Research. Thesis Publishers, Amsterdam
Berzuini C, Clayton D 1994 Bayesian analysis of survival on multiple time scales. Statistics in Medicine 13: 823–38
Blossfeld H-P 1986 Career opportunities in the Federal Republic of Germany: A dynamic approach to the study of life-course, cohort, and period effects. Eur. Sociol. Rev. 2: 208–25
Converse P E 1976 The Dynamics of Party Support: Cohort-Analyzing Party Identification. Sage, Beverly Hills, CA
Easterlin R A 1980 Birth and Fortune: The Impact of Numbers on Personal Welfare. Basic Books, New York
Elder G H Jr 1999 [1974] Children of the Great Depression: Social Change in Life Experience, 25th Anniversary edn. University of Chicago Press, Chicago
Fienberg S E, Mason W M 1985 Specification and implementation of age, period, and cohort models. In: Mason W M, Fienberg S E (eds.) Cohort Analysis in Social Research: Beyond the Identification Problem. Springer-Verlag, New York, pp. 44–88
Glenn N D 1989 A caution about mechanical solutions to the identification problem in cohort analysis: Comment on Sasaki and Suzuki. American Journal of Sociology 95: 754–61
Heckman J, Robb R 1985 Using longitudinal data to estimate age, period and cohort effects in earnings equations. In: Mason W M, Fienberg S E (eds.) Cohort Analysis in Social Research: Beyond the Identification Problem. Springer-Verlag, New York, pp. 137–50
Hobcraft J, Menken J, Preston S 1985 Age, period, and cohort effects in demography: A review. In: Mason W M, Fienberg S E (eds.) Cohort Analysis in Social Research: Beyond the Identification Problem. Springer-Verlag, New York, pp. 89–135
Hodges J S 1998 Some algebra and geometry for hierarchical models, applied to diagnostics (with discussion). Journal of the Royal Statistical Society: Series B 60: 497–536

Berzuini and Clayton’s (1994) Bayesian formulation, which is focused on second differences of effects, does not solve the identification problem. The social sciences literature in which Nakamura’s specification is used or discussed (Sasaki and Suzuki 1987, 1989, Glenn 1989, Miller and Nakamura 1996) fails to make clear that it is not the Bayesian approach per se that provides an alternative to linear restrictions on coefficients, but rather Nakamura’s particular formulation. Nakamura’s (1986) computational approach has been supplanted by the use of Markov Chain Monte Carlo methods (The BUGS Project 2000; see Monte Carlo Methods and Bayesian Computation: Overview; Markov Chain Monte Carlo Methods).

Robertson and Boyle (1998a), and Robertson et al. (1999) review the largely discipline-specific epidemiological literature on the methodology of cohort analysis, which has focused primarily on additive models, and conclude that only the nonlinear components of age, period, and cohort can be used reliably. Holford et al. (1994) employ substantive reasoning about cell malformation in carcinogenesis, develop specifications in which age is inherently nonlinear (e.g., logarithmic) and thus eliminate the identification problem through choice of functional form. Mason and Smith’s (1985) extended study of tuberculosis mortality combines substantive reasoning based on prior information and expectations, uses one body of data to guide modeling of another, includes an interaction term, employs a substantive, measured variable, and concludes that potential interactions require at least as much attention as the identification
problem itself.
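
The identification problem referred to here can be illustrated numerically. What follows is only a minimal sketch and is not taken from any of the cited studies: the age groups, calendar years, and variable names are assumptions made for illustration. Because cohort = period − age, a design matrix with linear age, period, and cohort terms is rank deficient, whereas entering age through a nonlinear form such as log(age), in the spirit of the functional-form strategy described for Holford et al. (1994), yields a full-rank design.

```python
import numpy as np

# Hypothetical age-by-period table: 5 age groups observed in 4 periods.
ages = np.array([20.0, 30.0, 40.0, 50.0, 60.0])
periods = np.array([1970.0, 1980.0, 1990.0, 2000.0])

# One row per (age, period) cell; birth cohort is fixed by the other two.
age, period = (g.ravel() for g in np.meshgrid(ages, periods, indexing="ij"))
cohort = period - age


def design(age_term):
    """Design matrix: intercept, an age term, linear period, linear cohort."""
    return np.column_stack([np.ones_like(age), age_term, period, cohort])


X_linear = design(age)       # age enters linearly
X_log = design(np.log(age))  # age enters as log(age), an assumed nonlinear form

rank_linear = np.linalg.matrix_rank(X_linear)
rank_log = np.linalg.matrix_rank(X_log)

print(rank_linear)  # 3: four columns, but cohort = period - age
print(rank_log)     # 4: the exact linear dependence is gone
```

A full-rank design only shows that the coefficients can be estimated; whether the chosen functional form is substantively justified is a separate question, which is the point made about substantive reasoning in the studies above.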
Hobcraft et al. (1985) focus on theoretical reasons for the use of age, period, and cohort in different areas within demography. Ní Bhrolcháin (1992) takes aim at the relevance of the cohort dimension for understanding temporal variation in human fertility. Dis-

Holford T R, Zhang Z, McKay L A 1994 Estimating age, period and cohort effects using the multistage model for cancer. Statistics in Medicine 13: 23–41
Kupper L L, Janis J M, Karmous A, Greenberg B G 1985 Statistical age-period-cohort analysis: A review and critique. Journal of Chronic Diseases 38: 811–30
Kupper L L, Janis J M, Salama I A, Yoshizawa C N, Greenberg B G 1983 Age-period-cohort analysis: An illustration of the problems in assessing interaction in one observation per cell data. Commun. Statist.–Theor. Meth. 12: 2779–807
Mason W M, Fienberg S E (eds.) 1985 Cohort Analysis in Social Research: Beyond the Identification Problem. Springer-Verlag, New York
Mason W M, Smith H L 1985 Age-period-cohort analysis and the study of deaths from pulmonary tuberculosis. In: Mason W M, Fienberg S E (eds.) Cohort Analysis in Social Research: Beyond the Identification Problem. Springer-Verlag, New York, pp. 151–227
Miller A S, Nakamura T 1996 On the stability of church attendance patterns during a time of demographic change: 1965–1988. Journal for the Scientific Study of Religion 35: 275–84
Nakamura T 1986 Bayesian cohort models for general cohort table analyses. Ann. Inst. Statist. Math. 38: 353–70
Ní Bhrolcháin M 1992 Period paramount? A critique of the cohort approach in fertility. Pop. Dev. Rev. 18: 599–629
Ploch D R, Hastings D W 1994 Graphic presentations of church attendance using general social survey data. Journal for the Scientific Study of Religion 33: 16–33
Robertson C, Boyle P 1998a Age-period-cohort models of chronic disease rates. I: Modeling approach. Statistics in Medicine 17: 1305–23
Robertson C, Boyle P 1998b Age-period-cohort models of chronic disease rates. II: Graphical approaches. Statistics in Medicine 17: 1325–39
Robertson C, Gandini S, Boyle P 1999 Age-period-cohort models: A comparative study of available methodologies. Journal of Clinical Epidemiology 52: 569–83
Ryder N B 1965 The cohort as a concept in the study of social change. Am. Sociol. Rev. 30: 843–61
Sasaki M, Suzuki T 1987 Changes in religious commitment in the United States, Holland, and Japan. American Journal of Sociology 92: 1055–76
Sasaki M, Suzuki T 1989 A caution about the data to be used for cohort analysis: Reply to Glenn. American Journal of Sociology 95: 761–5
The BUGS Project 2000 http://www.mrc-bsu.cam.ac.uk/bugs/
Welch F 1979 The effects of cohort size on earnings: The baby boom babies’ financial bust. Journal of Political Economy 87: S65–S97
Wilmoth J R 1990 Variation in vital rates by age, period, and cohort. Sociological Methodology 20: 295–335

W. M. Mason and N. H. Wolfinger

Cold War, The

The term is widely accepted in historical writings. It refers to the epoch between 1947–8 and 1989–90. The nature of the Cold War, its causes and effects, and the reasons for its long duration are, however, controversial. Its meaning would exclude Asia because of the outbreak of wars there; the wars in Korea, Vietnam, and Afghanistan made a difference to how the United States (US) and the Soviet Union (USSR), the two superpowers, waged their global contest in Europe. The situation in Europe resembled neither peace nor war; East–West Conflict: Confrontation and Détente expresses the ambiguity, but this label is less eye-catching.

1. The Concept and its Salience

The persistent central features of the Cold War are the global contest between the US and the USSR, the dependence of the allies on the security guarantee of their respective superpower, and bipolarity. The latter was reinforced in the core area, the arms race, due to the widening gap between the extensive and highly diversified weapons systems of the superpowers and the arsenals of Britain, France, and the People’s Republic of China (PRCh), the other three established nuclear powers. This explains why American and Russian writings hold to the Cold War as a concept fitting the entire epoch. It allows for distinguishing between moments of imminent war (Cuba, October 1962, and the Yom Kippur war, October 1973), times of high tensions and war outside Europe (Korea, 1950–53; Vietnam, 1964–75; Afghanistan, 1979–88), long intervals of détente, and even moments of concerted crisis management (June 1967 Near East war) or collaboration (during the Laos crisis and Vietnam war in the 1960s and ending wars in Africa in the late 1980s).

This version pays less attention to the fact that throughout the four decades not all parties to the Cold War were involved in the same issue at stake and at the same time. France joined Britain (UK) and the US in 1948 in founding the West German state as the Western allies’ response to Stalin’s anchoring of Poland, Bulgaria, Romania, Hungary, and Czechoslovakia firmly into the Soviet security-zone, and maintained staunch opposition to proposals for a neutralized, but nationally rearmed united Germany (Hitchcock 1998). From the mid-1960s, France became the spokesman for a Europe less dependent on the US and discussing terms of settlement with ‘Russia.’ The Federal Republic of Germany (FRG) converted Cold War into cold peace with the ratification of the Ost- und Deutschlandverträge (1970–3). Both governments cooperated in defending détente in and for Europe; by implication, this aspiration for a European peace-zone rejected the American concept of the indivisibility of the global contest between East and West. The PRCh, the only principal ally the USSR ever had (1950–58/9), made its peace with the US in 1972 on Peking’s terms of ‘One China.’ While the PRCh had provoked the US during the 1950s to continuous nuclear sabre-rattling, it turned its antihegemonic posture throughout the 1970s and 1980s against the USSR.

1.1 The Components of the Term

The definition refers to four aspects of the conflict: (a) the ideological antagonism between the Western concept of freedom of choice for the people of the domestic
regime and external alignments of their state and the Soviet-imposed monopoly of the ‘workers’ and land-labor party and subjugation to the community of socialist states; (b) the geostrategic struggle for bases of power-projection; (c) the domestic political contest, but almost exclusively within Western countries, about commitments to the military alliance or equidistance to both of the superpowers; (d) the dispute about where to draw the line between permissible and noncompatible elements in the policy mix of market-oriented and government-controlled economies. Of these four components, the last became least important for refueling the confrontation; the convergence of the mixed economies rather promoted the idea that Osthandel, credit facilities, and technology transfer might not only enhance ‘liberalization’ in the East European economies, but also generate transformation of the political system.

1.2 The Ideological Antagonism

The ideological antagonism was the first impulse for confrontation, but it was also subject to changing threat perceptions and shifts of emphasis in the balance between confrontation and détente in the overall relationship. The wider notion of East–West Conflict posits the Cold War as a distinctive period within the ideological struggle, originating in 1917–18, between the Wilsonian Impulse and Lenin’s urge for peoples’ democracy as the basis for securing peace (Link 1980). From this perspective, Stalin and Truman likewise took the lead in splitting the world along ‘alternative ways of life.’ Western leaders argued that the West had to learn its lessons from the vain hope of appeasing an expansionist regime. ‘No more Munichs’ informed the confrontation stance of containment as the West’s answer to the external threat. The concomitant stance at the home-front is based on the thesis that authoritarian regimes abuse state-power against large parts of their own population; therefore Communists must not be given a chance to occupy crucial positions of government such as Interior, Public Transport, or Justice.

Conversely, Communist ideology maintained that capitalism on the one hand inspires the state to persecute the working-class and thus breeds fascism, and on the other hand is inherently expansionist and therefore knows no boundary to its domination. The belief, however, that capitalist nations are bound to fight to defeat rivals, made Stalin expect that Britain would resist the US’ aspiration to become heir to the British Empire. The new feature of the post-World War II international system is that the US and the UK were competitive partners in founding the International Organizations destined to develop and monitor rules of conduct for international trade, currency exchange, and development aid (Gardner 1969). The UK and France, but also Italy, expected the up-to-then reluctant US government to come to the rescue and infuse capital into the European economies as the only strategy which could render support to the parties willing to exclude Communists from governments (Lundestad 1986). The formation of Western-oriented governing coalitions ensured that the European allies, as well as Japan, cooperated with the US in the evolution of the International Economy, to which the USSR and the PRCh were not negotiating parties after the outbreak of the Cold War. Most allies did not comply, however, with the American demand to restrict ‘trading with the communist enemy,’ except in conspicuous moments of Soviet war activities.

The ideological component became a wasted asset after its overexploitation during the first phase (1947–53). Khrushchev’s condemnation of Stalin’s death toll in February 1956 and the NATO allies’ perception that the Kremlin was unlikely to launch a general war on European territory (1956/7) put a brake on the momentum of ideology as a driving force in the East–West conflict. But it was up to the Kremlin to end the system conflict. In 1987 Gorbachev rescinded the Brezhnev doctrine and made it known that hardliners in Eastern European governments should not reckon with Moscow calling on the Red Army to back up unpopular communist regimes (Adomeit 1998).

1.3 The Long Duration of the Cold War

To explain the longevity of the Cold War, the account therefore has to focus on the other two factors: the structures which emerged with the consolidation and reaffirmation of the divisions of Europe and of Germany, and the peculiarities of the military–geostrategic balance between the two sides.

1.3.1 The divisions of Europe and Germany. The key feature is the hinge between the two divisions. The stark contrast between Stalin’s refusal to consider concessions on Poland, Bulgaria, and Romania in exchange for American economic assistance and his demand to have a say on questions concerning primarily intra-West relations, e.g., the status of the Ruhr area or of Norway, provoked the US and the UK to tie their zones firmly to Western institutions (Deighton 1990). The implementation of this basic strategy was, however, complicated by clashes of interest between the US, France, and Britain. These tensions prevented them from advancing as far as the USSR had with incorporating its German state, the GDR, into the Soviet Empire. Hence they could not, as wanted, negotiate from a position of strength, but expected to be asked to give way on what ‘the West’ did not yet have, whereas the USSR would persist in its refusal to put its reign over Eastern Europe on the agenda. The stalemate was compounded when the reimposition of the Ulbricht regime in 1953 revealed that the Kremlin considered the GDR as its west-side lever for control over Poland and Czechoslovakia.
The structural impediment to a negotiated settlement disappeared when Moscow tacitly bowed to Germany’s entry into NATO (1955) and when the ‘West’ acknowledged the fact that the USSR was strong enough to prevail in Eastern Europe (1956). German Ostpolitik put the final stamp on the ‘normativity of the facts.’ On this platform, the so-called Helsinki process, the Conference on Security and Cooperation in Europe (CSCE), lingered on throughout the 1973–89 period. Neither side was ready to proceed towards peace-making.

The Cold War might have ended earlier if Gorbachev’s predecessors had responded to the US view that Germany’s double-bind into a US-centered NATO and European integrated structures would prevent Germany from posing a threat to the security of the USSR (Schmidt 1993–5). Instead, Moscow’s attempts to push the divisiveness of nuclear diplomacy within NATO rather than wait for the outcome of such conflicts provoked the ‘Atlanticists’ to close ranks and reiterate the standard thesis that the West could prevent the USSR from winning the Cold War if its members resisted the temptation to court Moscow and make separate deals with the USSR. The Soviet leadership’s ‘stupidity’ is said to have helped the US or NATO to get rearmament projects, such as NSC-68 or the 1979 dual-track decision, through.

1.3.2 The peculiarities of the military–geostrategic balance. The key feature is the fundamental asymmetry between the nature of America’s and Russia’s predominance in their respective sphere. This asymmetry is reinforced by the imbalance of military forces between NATO-Europe and the USSR’s forward deployed forces (Kugler 1993).

The advantage of the US in its global contest with the USSR was that all other principal powers, including Germany and Japan, were allied to the US. This helped in the build-up of the international economy, but not necessarily in defense. Consequently, US diplomacy was absorbed as much in intra-West crisis management as in the context of East–West relations. The allies wanted to be assured that there were no long-term security risks involved for them in America’s option for sponsoring the resurgence of the enemies and occupiers of World War II as strongholds of the West. Having imposed limits on Germany’s and Japan’s military status, the US could not reckon with a defense contribution for some years to come; both were prohibited from engaging in ‘out-of-area’ defense activities (Schmidt and Doran 1996).

By contrast, the USSR gained a formidable ally in the PRCh, which was keen to test the credibility of the American commitment to South Korea and Taiwan. Although the USSR copied the West’s community building by setting up COMECON (1949) and the Warsaw Treaty Organisation (1955), the Kremlin relied for exerting influence de facto on the penetration of the bloc partners’ central agencies: the State Party, Secret Service, and top military echelon. The USSR also made sure through logistic measures that the Soviet military could operate from its allies’ territory independently, and if need be without the consent of the incumbent governments (Wolfe 1970).

The military imbalance between NATO-Europe and the USSR existed from 1947 until the mid-1980s. The US found no takers (except the FRG from 1961 onwards) for its concept of balanced collective forces in the sense that the US was the sole provider of strategic forces, that is, weaponry intended to hit targets located in the USSR, whereas the allies would recruit the ground forces and tactical air forces required for the defense of the region against infiltration or surprise attacks. Britain and France preferred to duplicate the US strategic role, albeit on a much smaller scale, and took the relaxation of tensions in Europe and their entanglement in the legacies of their imperial rule as the rationale for reducing their force assignments to NATO.

2. The Intersection of Economic and Strategic Decisions: The European Structure

The failure of the ‘Big Four’ to cooperate on a German Peace Treaty is identified as a crucial turning point towards the Cold War (Leffler 1992). Stalin did not want to give the impression that the US, thanks to the atomic bomb and its economic wealth, could impose its will on the USSR; he therefore pressed his claims on Iran and Turkey, during the Peace Conferences with Germany’s war-time allies, and in the Allied Control Council. This in turn was taken in Washington as a signal that hope of Russian cooperation must be abandoned. Truman and Byrnes had been reluctant to confront the USSR when the Cold War started over the fate of Eastern European nations (1944–7). Now, in view of the havoc the 1946–7 crisis wreaked on the economies of Britain and France, the US government conceived that Western Europe could not save itself (Hogan 1992).

From there on, the ‘West’ displayed a dynamic of its own. The first act was that the economic recovery of Europe was said to require the inclusion of (West) Germany; the price for getting France to change its German policy was the assurance that the US and the UK would back up the French quest for security against a resurgent Germany. The second act followed: Britain and France pleaded with Washington that the Marshall Plan initiative would not suffice to attain stability without an American security guarantee. Against the background of the Berlin blockade, Truman in October 1948 ordered that the US provide the major counterbalance to the ‘ever-present threat of the Soviet military power.’ In September 1948, the Brussels Treaty Organization (BTO) had already resolved that the defense of Western Europe should be
as far to the east (of West Germany) as possible. Because this was beyond the means available to the BTO and not yet compatible with the military strategy of American and British defense planning, it was only a question of developments in the Cold War that the third issue be placed on the agenda: German rearmament. The logic behind the build-up of conventional forces sounded compelling: why do the old allies expect the US to provide ground and air forces and the Supreme Allied Commander to NATO’s integrated forces with a view to realizing the concept of forward defense when they deny the indispensability of a sensible German defense contribution, on whose territory the French in particular wanted to stop the enemy’s offensive? At that time, however, all the US military could offer was an air offensive to deter the USSR; they did not yet want to rely on ‘the bomb’ (Ross), and the USSR had acquired by then (Fall 1949) an atomic capability.

The outbreak of the Korean War generated the fear of a similar war-by-proxy in divided Germany. The US not only reversed its stance on what countries were of absolute importance to US security by now declaring Korea the test case of what became the domino syndrome, that no US ally should fall prey to a Communist invader, but also designated the National Security Paper NSC-68 as the platform for a massive conventional rearmament of the US, adequate force deployment abroad, and military aid to upgrade the defense capabilities of its European and Asian allies (May 1993).

The issue of German rearmament was divisive, but the US compromised with France on the understanding that German forces be integrated into a European Defense Community (EDC), whereas the EDC would delegate strategic planning and fixing force requirements to NATO. The project became the victim of French domestic politics; the US and the UK had to comply with the French request that no German soldier be officially recruited until France had ratified the EDC and German Contractual Agreements. The demise of the EDC meant that Washington had to forgo the hope that US force deployment would be a temporary stop-gap until the European allies, and especially the FRG, were able to do more towards meeting NATO’s minimum force requirements.

The fourth act, staged in parallel (1952–4), produced the definitive structure of the Cold War in Europe: NATO, to which the FRG was finally admitted, had made itself dependent on the credibility of the US (and UK) strategic deterrent. This raised the issue of how the USSR would react to the logic that NATO’s new member would have to be equipped with the same weaponry as the allied forces into which the German divisions were to be incorporated. Chancellor Adenauer made it clear that the FRG did not want to provoke Moscow unnecessarily by stationing medium-range ballistic missiles on German soil, whose later replacements might be able to hit targets in Russia, but he and his successors insisted on having at least a crucial German vote in all decisions on the use of nonstrategic weapons in NATO’s custody in case deterrence to all types of war failed (Heuser 1998, Trachtenberg 1999).

This final act of incorporating the two German states into alliances dominated by the nuclear-strategic superpowers had a far-reaching effect: plastering the ‘front-line’ states with short-range atomic weapons was the best insurance premium that the Germans would not launch war from their soil (Ullman 1991). As long as the US and USSR stationed troops there and resolved to maintain control over the warheads, the danger of accidental war could be excluded. The parallel to the military factor written into the strategic landscape is the interest of the Western nuclear powers in confirming the status quo and discussing on that territorial basis the questions of putting ceilings on rearmament. Separately, the two German states relaunched their rivalry and demanded (1955–73) allies and Third World countries alike to subscribe to the Alleinvertretungsanspruch of the Western democratic or peoples’ democratic German republic.

3. The History of the Concept, Major Developments, and Empirical Results

The invention of the term is accredited to Walter Lippmann who took issue with G. F. Kennan’s ‘X’-article ‘Sources of Soviet Conduct.’ Kennan’s long telegram of February 22, 1946 molded the agenda of America’s containment policy. His arguments were selectively used by top officials in Washington to assert that the ‘Soviets’ would develop all means and methods ranging from threat of military aggression via propaganda warfare to clandestine activities to a degree without precedent in history. Therefore, the US had to strengthen its executive branch, introduce a national security policy, establish a professional intelligence agency, and develop an unassailable military–industrial base (Leffler 1992, Paterson 1979).

The opposite Communist view emphasizes three aspects: the economic aggressiveness of the long-standing hegemonic US project to attain a ‘one-world market’ and abuse of the open-door doctrine for penetrating other nations’ economy; the manipulation of anti-communism as a club to suppress claims for social justice and equality of rights within the American and other Western capitalist systems; and America’s acquisition of bases bordering on the USSR for purposes of encirclement and monitoring inside the USSR. In geostrategic terms the US became a direct neighbor of the USSR, whereas the USSR required long-distance bombers (it introduced such jets by 1955) and Inter-Continental Ballistic Missiles (ICBM) to show the US its vulnerability. The strong engagement of the US in the post-1990 contests about Caucasian and Middle East oil and gas concessions
and pipeline routes indicates that the US–USSR rivalry did not end with the Cold War.

The ‘Revisionists’ (LaFeber 1991, Paterson 1979) deny that Stalin pursued an offensive strategy aiming at extending Soviet rule; they discern rather the US’ responsibility for causing the Cold War. The postrevisionist approach accounts for the Soviet aspect of the fundamental change in the Cold War since about 1954–5, and explains the US interest in détente, i.e., the coinciding interest in Long Peace (Gaddis 1987). Moscow settled the Austrian question, tolerated the FRG’s entry into NATO, and opted for reconciliation with Tito, the only East European leader who had got away with breaking ranks with Stalin. The culmination point was the declaration of the doctrine of peaceful co-existence. The Western allies’ match was tacit satisfaction with consolidating the status quo after they had absorbed the FRG. Stability in Europe turned the UK’s, US’s, and France’s attention to the Third World. In reaction to Communist China’s revolutionary foreign policy, the Soviet leadership in 1960 declared wars of liberation from colonial rule legitimate and thus distinguished peaceful co-existence in the developed world from just wars in the ‘southern’ part of the globe. In this sense, the East–West conflict was exported to the Third World.

The US resolved on noninterference in the internal affairs of the Soviet bloc and relaxation of tensions. The US was somewhat dependent on Russia’s reluctance to become the air and nuclear strategic arm of Peking’s violent anti-American activities in the Far East. The emerging picture is that the US, in parallel with the Berlin and Cuba crises (1958–62), wanted to concert the superpowers’ activities in China’s ‘hinterland’ (former ‘Indochina’) in the sense that both exert pressure on the parties to a conflict amenable to their respective influence and through such agreement attain the neutralization of the conflict area, including the installation of all-party or power-sharing coalition governments, however unstable (Nelson 1995).

3.1 European Perspectives on the Cold War

A different strand in the history of the concept is the perception of the impact of the other Western powers, especially Britain and France, on how the struggle was waged (Greenwood 2000, Bozo 1991). The contribution is twofold: (a) the Cold War is viewed as a new stage in great power rivalry. These studies stress the Europeans’ interest in devoting resources to restoring or preserving their assets and commitments overseas (Kent 1993, Bossuat 1992). (b) The second contribution takes a different direction: it challenges the assumption that Western Europe by 1947–8 was in such a critical state that the leading nations had to ‘invite’ the US to reconstruct and protect Western Europe. Examining the economic potential of the European nations and how their governments used the financial means provided under the Marshall Plan, Milward (1984) presents the thesis of the rescue of the nation-state. His complex argument also draws on the observation that at that time the West did not reckon with an immediate Soviet threat. The implication is that an unwarranted haste was imposed on a process which in any case depended on what the European countries did to and for one another; formally, US assistance was tied to the Uniting of (Western) Europe and to parallel bilateral agreements between the US and the recipient country.

3.2 Competitive Cooperation Between the Superpowers

By the end of the 1960s, researchers discerned the convergence of five developments which demonstrated the continuing relevance of the ‘competitive cooperation’ between the superpowers for the changing structures of the Cold War: (a) Soviet policy shifted towards accepting the US’s and Canada’s presence in Germany; (b) German Ostpolitik presumed the engagement of the US in NATO; (c) de Gaulle realized in the context of the Prague 1968 crisis that Brezhnev’s Russia was no party to his vision that dissolving Cold War structures in the West, i.e., France’s disengagement from NATO’s military organization, might induce ‘Russia’ to allow for more evolution of national communist and then independent states in Eastern Europe; (d) the US embroilment in the Vietnam war caused a change in the intellectual climate; the vilification of the USSR as the cause of every evil became obsolete; instead the US became the villain in the piece; (e) new developments in arms technology (e.g., high-precision weaponry as a substitute for atomic weapons; MIRV-technique; antiballistic missiles) were beyond Britain’s and France’s capability to follow; this reinforced NATO Europe’s self-elected dependency on the US with respect to security and defense (Hanrieder 1989).

Renewed US pressures on its allies to extend their conventional military capabilities and share the burdens for the ‘Defense of the West’ more evenly, the re-escalation of the strategic arms race between the superpowers under the aegis of limitation treaties, and the evidence presented by German Ostpolitik that negotiations with Russia generate tolerable results, combined to raise the basic questions: what are the costs of the ongoing Cold War? On what terms could the conflict be ended and converted into politico-economic competition? (Garthoff 1985).

3.3 From the Second Cold War to the End of the Cold War

The USSR, after getting Germany’s pledge to observe the inviolability of the territorial status quo in Europe, revived the global contest with the US, which
was immersed in domestic turbulences and disputes with all its allies about oil and the dollar. The Kremlin demonstrated its self-confidence by expanding Soviet naval forces and establishing bases, however short-lived, stretching from Vietnam via Mozambique, Somalia, and Angola to Central America. In contrast to the 1950–3 period, the US catchword ‘arch of crises’ did not resonate well with its allies. The latter wanted to develop détente, notwithstanding the fear, expressed by German Chancellor Helmut Schmidt in 1977, that the Soviets’ new equipment (Backfire bombers; SS-20) made NATO Europe hostage to the Kremlin’s whip of the will. This raised the question whether the USSR was about to win the Cold War or whether the Kremlin was overstretching the country’s resources and would thus expose the USSR sooner rather than later to the need for radical changes in its system. The wishful thinking that the USSR would fall victim to its inability to adapt continuously was borne out after three decades in which it aspired to military-strategic and political parity with the US. Some authors argue that Reagan’s defence build-up deliberately forced the Soviet leadership to acknowledge the failure of its domestic system and hence the USSR’s inability to persist in the contest and sustain bipolarity (Adomeit 1998, Gaddis 1987, Wells’ article in Hogan 1992). The way the global contest ended invited Americans to believe that they had won the Cold War.

4. Methodological Problems
The immersion of the concept of the Cold War in the perpetual clash of interests between East and West and within each ‘bloc’ subjects the interpreter on the one hand to the political climate of his own times, and may thus reload or de-emphasize the contentions of the past; on the other hand, access to newly available records reveals new insights into previous phases of the Cold War and thus demands rethinking the past, but in a different way. Getting the balance right between these two operations is difficult when knowledge about one party to the conflict is the result of generations of research, whereas the more recent presentations of findings from Soviet, GDR, Czech, or Polish sources are both less systematic or structured and more liable to be taken instantly as evidence for one interpretation or another, e.g., of Stalin or Khrushchev. The task of studying the records comprehensively, while remaining prepared to modify one’s assessments in response to newly available empirical evidence, is prone to collide with the other obligation, namely to explain what were the basic causes of a conflict and which causes (meaning, in politics, sins of commission and omission) are accountable for what developments.
A second main problem is the exact phase-by-phase and overall intersection between the interpretation of the collaboration leading to the evolution of ‘Western’ structures in the economic sphere and the assessment of the concomitant events in the politico-strategic (military) sphere, where the USSR appeared to be the winner, but then ended the Cold War on western terms.

5. Future Direction of Research
‘East’ and ‘West’ waged the struggle by all means and methods except ‘hot’ war in the area stretching from Vancouver to Vladivostok. Hence, research must attend to different subject areas and to the expertise developed in the many scholarly communities. Some areas of research, e.g., diplomatic or intellectual history and biography, are more established than others. The impact of intelligence on policy-makers is a relatively new area of systematic research. International cooperation projects have done much to promote nuclear history and strategic studies, but the history of the military alliances and of national defense organizations and policies still depends on the authorities’ grant of access as well as clearance and permission for publishing inspected material. The studied ambiguity of the political leaderships and the top military echelon about the worth of strategic nuclear weapons in case deterrence should fail, and about the use of short-range atomic weapons, deserves further study in order to understand the implications of nuclear weaponry for the conduct of Cold War diplomacy and for governmental guidance to the military.
Future research should be more systematic in the sense of extending the conceptualization beyond the national and bilateral focus to the regional context and intensifying the approach by addressing the fundamental questions about the changing nature of the struggle, its costs, the persistence or recreation of patterns of conflict, and above all the question whether the Cold War structure affected all other bilateral or intraregional conflicts or whether re-emerging older conflicts pervaded the East–West conflict, so that the parties to such conflicts used the Cold War for the purpose of engaging wealthy allies on their side.

See also: Arms Control; Communism, History of; Contemporary History; Diplomacy; Eastern European Studies: History; Eastern European Studies: Politics; International Relations, History of; Military and Politics; National Security Studies and War Potential of Nations; Peacemaking in History; Revolutions of 1989–90 in Eastern Central Europe; Second World War, The; Soviet Studies: Politics; War, Sociology of; Warfare in History

Bibliography
Adomeit H 1998 Imperial Overstretch: Germany in Soviet Policy From Stalin to Gorbachev. Nomos Verlagsgesellschaft, Baden-Baden, Germany
Bossuat G 1992 La France, l’aide américaine et la construction européenne, 1944–1954. Impr. nationale, Paris
Bozo F 1991 La France et l’OTAN. De la guerre froide au nouvel ordre européen. Masson, Paris
Deighton A 1990 The Impossible Peace: Britain, the Division of Germany and the Origins of the Cold War. Clarendon Press, Oxford, UK
Gaddis J L 1987 The Long Peace: Inquiries into the History of the Cold War. Oxford University Press, New York
Gardner R N 1969 Sterling–Dollar Diplomacy. The Origins and the Prospects of Our International Economic Order, new expanded edn. McGraw-Hill, New York
Garthoff R L 1985 Détente and Confrontation: American–Soviet Relations from Nixon to Reagan. Brookings Institution, Washington, DC
Greenwood S 2000 Britain and the Cold War 1945–91. Macmillan, Houndmills, UK
Hanrieder W F 1989 Germany, America, Europe: Forty Years of German Foreign Policy. Yale University Press, New Haven, CT
Heuser B 1998 NATO, Britain, France and the FRG. Nuclear Strategies and Forces in Europe, 1949–2000. Macmillan, Basingstoke, UK
Hitchcock W I 1998 France Restored. Cold War Diplomacy and the Quest for Leadership in Europe, 1944–1954. University of North Carolina Press, Chapel Hill, NC
Hogan M (ed.) 1992 The End of the Cold War. Cambridge University Press, Cambridge, UK
Kent J 1993 British Imperial Strategy and the Origins of the Cold War, 1944–1949. Leicester University Press, Leicester, UK
Kugler R L 1993 Commitment to Purpose. How Alliance Partnership Won the Cold War. RAND, Santa Monica, CA
LaFeber W 1991 America, Russia, and the Cold War, 1945–1990, 6th edn. McGraw-Hill, New York
Leffler M P 1992 A Preponderance of Power: National Security, the Truman Administration, and the Cold War. Stanford University Press, Stanford, CA
Link W 1980 Der Ost-West-Konflikt. Die Organisation der internationalen Beziehungen im 20. Jahrhundert. Kohlhammer, Stuttgart, Germany
Lundestad G 1986 East, West, North, South: Major Developments in International Politics, 1945–1986. Oxford University Press, New York
May E R (ed.) 1993 American Cold War Strategy: Interpreting NSC 68. Bedford Books, Boston
Milward A S 1984 The Reconstruction of Western Europe, 1945–1951. Methuen, London
Nelson K L 1995 The Making of Détente: Soviet–American Relations in the Shadow of Vietnam. Johns Hopkins University Press, Baltimore, MD
Paterson T G 1979 On Every Front: The Making of the Cold War. Norton, New York
Schmidt G (ed.) 1993/1995 Ost-West-Beziehungen: Konfrontation und Détente 1945–1989, 3 Vols. Brockmeyer, Bochum, Germany
Schmidt G, Doran C F (eds.) 1996 Amerikas Option für Deutschland und Japan. Die Position und Rolle Deutschlands und Japans in regionalen und internationalen Strukturen. Die 1950er und die 1990er Jahre im Vergleich. Brockmeyer, Bochum, Germany
Trachtenberg M 1999 A Constructed Peace. The Making of the European Settlement 1945–1963. Princeton University Press, Princeton, NJ
Ullman R H 1991 Securing Europe. Adamantine Press, Twickenham, UK
Wolfe T W 1970 Soviet Power and Europe, 1945–1970. Johns Hopkins University Press, Baltimore, MD

G. Schmidt

Coleman, James Samuel (1926–95)

James S. Coleman was born on May 12, 1926 in Bedford, Indiana. He died on March 25, 1995 in Chicago, Illinois. Coleman was among the most important American sociologists of his generation. By significantly influencing both intellectual and policy debate, Coleman was unique among sociologists. By making important contributions to a large range of areas of scholarly concern, including mathematical sociology, sociology of education, and social theory, Coleman was unique among social scientists.

1. Life
James Samuel Coleman was born into a family of teachers, with roots in the landed gentry of the South on his father’s side, and in various places in the Mid-West on his mother’s side. His grandfather, Samuel, was a minister. The family moved frequently in Coleman’s youth between places in Ohio, Arkansas, and Kentucky to settle, finally, in Louisville, Kentucky, where Coleman graduated from Manual High School in 1941. These origins are important for his basic sociological interests and positions. The marginality that came from moving around between southern and northern American cultures, and the diversity of his origins, instilled a strong curiosity about social relations. The teaching occupations of both parents created a strong interest in schools and education. Samuel, the minister grandfather, and Coleman’s mother were important for his preoccupation with moral issues that profoundly influenced his work.
Coleman’s choice of sociology as a vocation came quite late. He graduated from Purdue University in 1949, with a degree in Chemical Engineering, and his first job was as a chemist with Eastman Kodak. He had almost no undergraduate education in any social science. Nonetheless, in 1951 he began graduate study in sociology at Columbia University. Coleman’s dual attraction to science and moral engagement made sociology an impeccable choice, or so it would seem in 1951. He had found industry frustrating and a likely eventual career in management unappealing. He wanted to devote his life to discovery and concluded it could only be about people, their relationships, and their social organization.
Columbia’s sociology department gave Coleman four intense years and three important teachers: Paul
Lazarsfeld, Robert Merton, and Seymour Martin Lipset. Coleman is usually regarded as Paul Lazarsfeld’s student. This is not quite correct. Lazarsfeld was not his dissertation advisor; it was Lipset. Lazarsfeld was not the teacher who had the most profound intellectual influence on Coleman; it was Merton. Lazarsfeld did involve and use Coleman for the development of mathematical and statistical tools for social analysis, and these activities created the point of departure for some of Coleman’s most important later work. However, there is an important difference between Coleman’s Introduction to Mathematical Sociology (1964) and Lazarsfeld’s branch of mathematical thinking in the social sciences. Coleman’s main objective with the use of mathematics was the development of theoretical insights and concepts. Lazarsfeld’s major contributions were to the codification of research procedures, that is, methodology. Coleman made important contributions to methods, but his most remarkable quality as a sociologist was his ability to develop sociological ideas and sustain them with empirical evidence. This is much closer to Merton’s style of theorizing about empirical matters, though Merton often relied on evidence produced by others.
Coleman obtained his first faculty position in 1956 as an Assistant Professor at the University of Chicago. In 1959 he moved to Johns Hopkins University to create his own sociology department. He developed a small organization with an intellectual intensity and excitement that was truly remarkable. It was perhaps unsustainable and, for a variety of reasons, Coleman went back to the University of Chicago in 1973, now as a University Professor, and stayed there until his death in 1995.

2. Scholarship
The body of scholarly work is large. It includes 30 books and monographs and more than 300 articles and chapters in books. These contributions have profoundly influenced and, in several cases, defined the agenda for important subfields of sociology: sociological theory, sociology of education, sociology of the family, mass-communication, social stratification, political sociology, mathematical sociology, and sociological methods. The contributions include three major pieces of policy-oriented research on schools and educational opportunities that are the major examples of ‘sociology making a difference’ in the 1960s, 1970s, and 1980s. The scholarly work covers a phenomenal range of approaches to social research. There is work about social systems (Coleman 1961) and about individual behavior (Coleman 1990b). There is basic research as well as applied. There is quantitative (Coleman et al. 1966, Coleman 1987) as well as qualitative analysis (Coleman 1961). There are contributions to economics (Coleman 1966), political theory (Coleman 1982), moral philosophy (Coleman 1974), statistics and probability theory (Coleman 1981), and education (Coleman 1990a).
It is difficult to characterize James Coleman’s contributions to the discipline because there are so many components. They include two major and very different paradigms of what sociology is about. One is a Durkheimian paradigm: it sees the task as studying how social structure creates and constrains individual action and causal social processes. The other paradigm starts from individual purposeful action in order to develop properties of social systems and structures. The former project governs most of Coleman’s empirical work on educational processes and social processes in educational institutions. The latter project, the development of a sociological, rational action theory, moves, so to speak, in the opposite direction. It is theory aimed at understanding social systems themselves, their development, and properties, beginning with a theory of individual action. It occupied most of Coleman’s attention in the last years of his life.
Almost all of Coleman’s empirical research was about growing up, about schools, and learning and educational opportunity. His first study of schools was a study of adolescent subcultures based on surveys of students in ten high schools in northern Illinois. The report on this research was published as The Adolescent Society (1961). In the preface, Coleman notes two reasons for undertaking this study. One was his wish to study how schools might be more effective; he states that the study had its origin in ‘a deep concern I have had, since my own high school days, with high schools and with … ways to make an adolescent experience with learning more profitable …’ (1961, p. viii). The second concern was an interest in different types of social systems and the value systems they reflect. These two interests, in creating effective schools and other social organizations and in the values and cultures of social systems, would guide Coleman’s research throughout his life.
The wish to study the effects of high schools on learning did not receive much attention in The Adolescent Society, possibly because the empirical evidence Coleman obtained about these effects, and their relationship to status systems, is quite difficult to interpret. The Adolescent Society is primarily a study of adolescent subcultures. The policy prescriptions Coleman drew from his analysis were about how to change these subcultures to encourage learning and academic achievement. This would be done by changing the structure of competition so that students would see academic achievement as a collective good, like sports.
The interest in the effects of schools and school social organizations on learning became the main preoccupation of Coleman’s educational research for almost four decades. The link between the two concerns was always clear, and the interest in the social systems created in schools re-emerges as the
dominant concern again in his latest research. All of his educational research sees the effectiveness of schools as related to the social systems created in schools and in the family and community that form the context for the school social systems. Coleman was very much a sociologist of schools.
Coleman performed three major studies of schools and educational processes after The Adolescent Society. The first so-called ‘Coleman Report’ (for a while referred to by many as the Coleman–Campbell report) was a report on the massive Equality of Educational Opportunity survey (Coleman et al. 1966). It is probably the largest social science survey research ever undertaken, with some 600,000 students surveyed. The Report’s main finding was that the instructional facilities and resources, including per pupil expenditures, have small and comparatively insignificant effects on academic achievement relative to the influence of family background. This result created considerable controversy, a controversy that continues today. A very large number of studies have focused on the issue since then. The best of this research has replicated the main conclusion: there is very little relationship between the amount of expenditure on schools and achievement outcomes, net of the family backgrounds of students. Next to family background, the family background of a student’s peers, as measured for example by the racial composition of schools, seems most relevant, especially for minorities. This conclusion had an important consequence for American schools as it was used to justify racial desegregation in public schools. The second Coleman Report focused on this effect on policy: the use of busing to integrate schools and reap some of the benefits of desegregation for the academic achievement of minorities. The Report concluded that the policy had been ineffective. This conclusion was widely interpreted as denying the benefits of desegregation documented in the first Report and was strongly attacked by many, including many of Coleman’s fellow sociologists. The interpretation was not correct. Coleman did not doubt the relevance of social systems created by student body composition for schools’ educational climates. He doubted the long-term benefits of busing as a remedy for segregation because it encouraged white flight from desegregated public schools.
The allegation that Coleman had reversed his position would be repeated with the publications from the third major piece of educational research, published in the early 1980s. In a series of publications, for example Public and Private High Schools (Coleman and Hoffer 1987), Coleman and co-authors concluded that Roman Catholic parochial schools produced more learning and superior educational outcomes than public schools because of superior social resources, or social capital created by the involvement of parents with schools and their personnel. These conclusions resulted in another set of published controversies, now about the existence of school effects, the very effects that were said to have been denied by Coleman in the first Report. Now, the tables had been reversed. The critics now denied the existence of school effects, in particular the apparent superiority of Catholic schools.
Coleman strongly believed in the use of mathematics as a language for the development of social theory. His Introduction to Mathematical Sociology (1964) remains unsurpassed as a text on how to use mathematics, in particular stochastic process models, to mirror social processes of diffusion and influence in groups. It is an enormously rich collection of tools for the analysis of social processes. Coleman remained committed to the use of mathematics for the development of theory. Over time, he changed the type of theory focused upon, from modeling the mechanism of social processes in his early work, to modeling the outcomes of purposive action in his later work. Throughout, Coleman’s contributions provide perhaps the most important demonstration of the power of mathematics as a tool for social theory in contemporary sociology.
Already in the 1960s, Coleman published several articles on a model for collective decisions where actors trade control over events and resources to maximize their interests. This work came about as a result of Coleman’s interest in developing games for learning and for social theory, where players engage in transactions designed to mirror important learning tasks or to simulate basic rules of social systems. One of these games was called the legislative game, and was designed to simulate the transactions going on in a legislative body with vote trading. Coleman formalized this game in a simple model that yielded equilibrium outcomes providing measures of the power of actors and the value of events. These ideas became the basis for what became his major preoccupation in the 1970s and 1980s: creating a rational action basis for social theory where the emergence of macro social structures is clearly linked to micro foundations.
The outstanding result of many years of effort became Foundations of Social Theory (1990). It is a major book in ambition, achievement, and size. It provides theory and theoretical tools for the analysis of existing society and the creation of better societies. The book moves from the analysis of elementary action and social relations of trust and authority to the development of norms and more complex authority systems or corporate actors, such as large corporations, and then on to the analysis of the major institutions of modern society. A major concern for Coleman was the distribution of power in society and the problem of holding corporate actors responsible for their actions. Foundations of Social Theory is an unfamiliar contribution to a discipline where theory often seems to be an exegesis of the theory created by those who are safely dead, or abstract conceptual development with little if any empirical reference. This
major book aims to shape the discipline by providing a theory and a mathematical structure that may have extraordinary potential for research. The realization of this potential depends not only on the quality of the ideas, but also on the discipline’s ability to retool. Some concepts from the book have become prominent in sociology, for example Coleman’s ideas about the micro–macro link and about social capital. However, this is only a part of this enormously rich contribution. Coleman knew that an extraordinary educational effort was needed, and he devoted much of his attention to this effort in his last years, for example, by founding and editing the journal Rationality and Society.
Coleman’s theoretical work created a landmark in social theory. His concern for developing a mathematical language for the analysis of empirical processes created a rich set of tools that eventually may change the way sociologists work. His empirical work often includes imaginative interpretation and brilliant insights. Though his emphasis increasingly was on the project that resulted in Foundations of Social Theory, Coleman repeatedly returned to empirical research on causal processes with individual actions and lives as the outcome. The synthesis was accomplished in some of Coleman’s latest empirical work, on schools, family, and community.

3. Impact
There was no lack of recognition of Coleman’s scholarly contributions by the bodies that confer the highest prestige upon scientists. Coleman was elected to the American Academy of Arts and Sciences in 1966, the National Academy of Education in 1966, the American Philosophical Society in 1970, the National Academy of Sciences in 1972, and the Royal Swedish Academy of Sciences in 1984. He was a Fellow at the Center for Advanced Study in the Behavioral Sciences, a Guggenheim Fellow, and Fellow at the Wissenschaftskolleg zu Berlin. He received numerous honorary degrees from universities in the USA and Europe.
Despite all the recognition, American sociology had, and still has, considerable ambivalence toward Coleman. His international reputation, particularly in Germany, the Netherlands, and Scandinavia, and his reputation outside of sociology often seemed more solid than his reputation in American sociology. He served as President of the American Sociological Association (ASA) from 1991 to 1992, but he never held another elected office in the ASA. His election to the Presidency in 1990 was the result of a write-in campaign and not of a nomination by the Association. Indeed, the leadership of the ASA, including the ASA President, tried to censure him in the mid-1970s for producing subversive sociology, because his work on the impact of busing on white flight threatened traditional liberal beliefs. It was an ignominious act that almost succeeded.
The ambivalence of the profession toward Coleman had two main sources. One was Coleman’s unwillingness to specialize in one of the usual three main roles for sociologists: as theorist, methodologist, or researcher. This contradicts the implicit theory most of us have that one cannot be outstanding in all three roles. Coleman was outstanding in all three, but specialists in theory, methods, or research had difficulties recognizing his achievements. The second, and probably major, source of ambivalence was Coleman’s use of research to draw policy inferences. Coleman stated what his research meant for policy and was not hesitant to do so when it contradicted conventional wisdom. He loved controversy. Each of the three Coleman Reports stated a conclusion that infuriated many: that school resources have little impact on academic achievement compared to the family resources of a child; that busing to achieve racial integration speeds up the process of white flight from our central cities; that schools organized as many private Roman Catholic schools are produce more learning and less inequality in learning than schools organized like the typical public school. In every instance, an army of researchers tried to find fault with the evidence for these conclusions and largely failed. They concentrated on specific statistical issues. This was a mistake when confronting Jim Coleman. He anticipated criticisms by demonstrating the main finding in several ways. Moreover, his powerful intuitions about the mechanisms that create observed outcomes produced coherent arguments that Coleman effectively integrated with empirical analysis, while the opposition failed to formulate an alternative argument that could be sustained with evidence.
Sociology is profoundly influenced by James Coleman. So are many individual lives and careers. None of his students and followers moved with him through all the projects; they remain in one or another of his many areas of interest. Those who had the privilege to know him were always profoundly touched by his excitement in translating into empirical analysis theoretical ideas about how social structure affects individuals. Coleman’s enormous mental (and physical) energy never ceased to amaze. This energy and creativity were sustained by his certitude about the importance of the project of making sociology a better tool for a better society.
Appraisals of Coleman’s scholarly and policy contributions may be found in Sørensen and Spilerman (1993) and Clark (1996). The latter volume, published after Coleman’s death, includes a complete bibliography and several autobiographical fragments.

See also: Adolescence, Sociology of; Affirmative Action: Empirical Work on its Effectiveness; Educational Sociology; Family and Schooling; Family as Institution; Public Policy Schools; Racial Relations; Rational Choice Theory: Cultural Concerns; Rational
Choice Theory in Sociology; Sociology, History of; Theory: Sociological

Bibliography
Clark J (ed.) 1996 James S. Coleman. Falmer Press, London
Coleman J S 1961 The Adolescent Society. Free Press, Glencoe, IL
Coleman J S 1964 Introduction to Mathematical Sociology. Free Press, Glencoe, IL
Coleman J S 1966 The possibility of a social welfare function. American Economic Review 56(5): 1105–12
Coleman J S 1974 Inequality, sociology, and moral philosophy. American Journal of Sociology 80: 739–64
Coleman J S 1981 Longitudinal Data Analysis. Basic Books, New York
Coleman J S 1982 Re-contracting, trustworthiness, and the stability of vote exchanges. Public Choice 40: 89–94
Coleman J S 1990a Equality and Achievement in Education. Westview Press, Boulder, CO
Coleman J S 1990b Foundations of Social Theory. Belknap Press, Cambridge, MA
Coleman J S, Campbell E Q, Hobson C J, McPartland J, Mood A, Weinfeld F D, York R L 1966 Equality of Educational Opportunity. Government Printing Office, Washington, DC
Coleman J S, Hoffer T 1987 Public and Private High Schools: The Impact of Communities. Basic Books, New York
Sørensen A B, Spilerman S (eds.) 1993 Social Theory and Social Policy: Essays in Honor of James S. Coleman. Praeger, Westport, CT

A. B. Sørensen

Collective Behavior, Sociology of

1. Introduction
The sociology of collective behavior begins simultaneously with the inception of modern sociology. Gustave Le Bon published The Crowd in 1895, two years before Emile Durkheim’s Suicide, and two years after Durkheim’s The Division of Labor in Society. Le Bon’s descriptions and analyses of crowd behavior in the wake of the French Revolution and Paris Commune stimulated further study of crowds and, more generally, collective behavior.
Collective behavior is noninstitutionalized, unconventional group activity such as panics, crazes, mass delusions, incited crowds, riots, and reform or revolutionary movements. A sociological approach to collective behavior focuses on social conditions such as political structures and shared beliefs as these conditions influence patterns of collective behavior.
Le Bon approached the study of crowds skeptically. He argued that individuals can lose their self-identity in crowds and can commit acts they would not do alone, including physical aggression. He felt the anonymity of crowds permitted passions to surface which could be translated into violent behavior. These ideas were parallel to the distinction made by Gabriel Tarde between a public that rationally considers political issues and a crowd which acts hastily without rational consideration of the matters at hand. Yet Le Bon also felt that crowds were progressive in that they challenged existing social arrangements and often led to social change that would not have occurred without them (Wood and Jackson 1982).
Pre-dating even Le Bon, Alexis de Tocqueville argued in 1856, in The Old Regime and the French Revolution, that the French Revolution resulted from social changes occurring over several centuries, culminating in the Revolution of 1789, aided by what is now termed a widespread sense of relative deprivation among the bourgeoisie (French middle class) and sans-culottes (French masses). He posed a paradox, and then resolved it, by indicating that the country less constrained by classical feudalism, France, was where the Revolution occurred instead of the more constrained England or Germany, for example. This occurred because France, by gradually dismantling feudalism, had improved the political and economic circumstances of the bourgeoisie and the sans-culottes over the centuries, but these people still felt deprived because it was clear the Old Regime would permit only so much advancement for them. This sense of deprivation relative to what they felt was their just station in life (that is, the gap between their expectations and their experience) motivated participation in this monumental movement that toppled the Old Regime and became the exemplar for many subsequent revolutions.

2. Smelser’s Theory of Collective Behavior
The modern, post-World War II sociology of collective behavior was initiated by Neil J. Smelser’s Theory of Collective Behavior (Smelser 1963). This book introduced a thoroughly sociological approach to the investigation of collective behavior. Smelser’s theory contrasted with the previously dominant social psychological approach pioneered by scholars such as Robert Park and E. W. Burgess, Herbert Blumer, and Ralph Turner and Lewis Killian, who focused on topics such as crowd interaction, types of collective behavior, and emergent norms in undefined social situations.
Smelser, by stressing such variables as structural conduciveness and structural strain, along with the other variables in his theory (growth and spread of generalized beliefs, precipitating factors, mobilization of participants for action, and weakening of social control), highlighted the importance of sociological conditions as determinants of collective behavior.
In focusing on these six sociological factors, he moved the field of collective behavior away from
2204
Collective Behavior, Sociology of

social psychological theories stressing single variables such as anonymity of crowds, economic deprivation, alienation, and strong leadership (Wood and Jackson 1982).
Smelser showed that these single variables had a place in understanding the development of collective behavior, but that they had to be seen in combination with other relevant variables in order to increase the likelihood of collective behavior actually occurring. Indeed, Smelser argued that the six variables in his system constituted necessary, or required, conditions for the occurrence of collective behavior; and when combined at the same time and place, the six variables constituted a sufficient, or fully predictive, condition for the occurrence of some type of collective behavior. Thus, when the six conditions occurred simultaneously, Smelser predicted that a craze, panic, riot, reform, or revolutionary movement would occur.
Smelser furthermore argued that different aspects of the six conditions, and their particular combinations, could help predict which of the five types of collective behavior mentioned above would be the one to occur (e.g., a reform or revolutionary movement). If, for example, there was considerable economic strain in society, the spread of radical political beliefs, precipitating factors such as the arrest of dissidents, attempts by movement leaders to mobilize politically large numbers of people, and the weakening effectiveness of agents of social control such as the police and army, Smelser’s theory would then ask, in what kind of society did all this occur? If the society was democratic, Smelser argues that this kind of society would be structurally conducive to the development of a reform, or norm-oriented, movement such as the Civil Rights Movement in the USA. If instead the society was authoritarian, Smelser argues that this kind of society would be structurally conducive to the development of a revolutionary, or value-oriented, movement such as the Russian Revolution.
Smelser’s theory attracted considerable attention from the start, and it generated a large empirical literature within a decade of its publication, much of which supported the theory (Marx and Wood 1975).

3. Subsequent Approaches

3.1 Resource Mobilization

Smelser’s theory also attracted critics, some of whom developed their own approaches that form the basis of current collective behavior analysis. These other approaches will now be examined for their contributions to the sociology of collective behavior, many of which owe at least indirect inspiration from Theory of Collective Behavior, as is indicated by several themes such as conditions of conduciveness and strain in the seminal collection of articles covering the last decade of social movement research organized by McAdam and Snow (1997). As Klandermans (1997, p. 201) states, ‘One need not concur with Smelser … to appreciate the usefulness of these categories and assess maybe with some surprise how akin they are to other more recent conceptualizations.’
The first well-known approach signaling not only a critique of, but also an attempted alternative to, Theory of Collective Behavior, was resource mobilization theory pioneered by Mayer Zald and John McCarthy. While retaining an interest in the origin of collective behavior and social movements (a key question for Smelser), the resource mobilization approach began to shift the basic question addressed. Resource mobilization, and other subsequent approaches, became particularly interested in social movement success or failure, and the conditions (such as availability of resources like money and media attention) that are associated with movement success or failure.

3.2 Political Process Model

Doug McAdam pioneered another important approach to understanding the success, as well as development, of social movements by focusing on the political process involved in such movement development and success. In his first book on the topic, Political Process and the Development of Black Insurgency 1930–1970, McAdam (1982) outlined, and documented the importance of, four conditions that he showed to be associated with the rise of militant black politics and black political achievements in the USA in this crucial 40-year period. These four conditions were as follows: (a) ‘the degree of organizational “readiness” within the minority community’; (b) ‘the level of “insurgent consciousness” within the movement’s mass base’; (c) the ‘structure of political opportunities’; and (d) the ‘level of social control’ (McAdam 1982, pp. 40, 53). Using historical examples to show variations in these four conditions, McAdam does a masterful job of explaining the ensuing black political movements of mid-twentieth century America.
In focusing on the four conditions, McAdam has both drawn upon, and extended, the scope of Smelser’s six conditions of social movement development. There is overlap between McAdam’s variables and Smelser’s with, for example, organizational readiness being an aspect of mobilization of participants for action, and insurgent consciousness being an aspect of generalized beliefs. Yet McAdam also extends Smelser’s model by using the four conditions to help explain the successes of black protest in the USA as well as understand its initial development in this historical period. Finally, McAdam, his associates, and others have further extended and applied the political process model to yet other instances of social movement activity (McAdam and Snow 1997).

3.3 Political Opportunity Model

One of the conditions in McAdam’s model is ‘the structure of political opportunities,’ a concept elaborately developed by Sidney Tarrow. In a series of books and monographs, Tarrow (1991, 1998) has utilized the political opportunity model to understand the circumstances whereby contentious social movements can achieve success. Using a wide variety of case studies, often ranging from the social movements of the 1960s to the movements, conflicts, and changes in Eastern Europe in the 1990s, Tarrow shows how a variety of political structures have influenced the success, or lack of success, of these social movements.
Drawing on the seminal work of Michael Lipsky and others, Tarrow (1991, p. 32) defines political opportunity structure as ‘political situations in which states become vulnerable to collective action and in which ordinary people amass the resources to overcome their disorganization and learn where and how to use their resources.’ This definition clearly incorporates an important element of resource mobilization theory, namely the significance for movement success of masses of people obtaining and utilizing resources such as money, media attention, and organizational connections. But it also puts these resources in the special political context of governments being particularly susceptible to mass political pressure.
Regarding several scholars’ use of political opportunity theory, such as J. Craig Jenkins and Charles Perrow, Tarrow (1991, pp. 34–6) points to five variables often used by these scholars to identify conditions associated with governments being especially susceptible to the influence of social movements: (a) the degree of openness or closure of the polity, with open polities more favorable to social movements; (b) the stability or instability of political alignments, with less stable alignments more favorable to social movements; (c) the presence or absence of allies or support groups, with more allies and support groups favorable to social movements; (d) divisions or lack of divisions within the elite, with elite divisions favoring social movements; and (e) tolerance or intolerance of protest by the elite, with tolerance of protest favoring social movements. These and other elements of political opportunity theory are woven throughout various complex analyses by Tarrow and others to show how shifting political circumstances significantly affect the fortunes of social movements, often resulting in cycles of protest for the movements.

3.4 Recruitment: Social Network Analysis

One of the more informative later approaches to social movements is social network analysis that helps explain why people with similar ideologies are differentially recruited, that is, why some of them join social movements and others do not. As discussed, Smelser (1963) was the first modern collective behavior analyst to note that single variables such as ideology, by themselves, did not have considerable explanatory power. Snow, Zurcher, and Ekland-Olson (in McAdam and Snow 1997) demonstrated that those who belong to social networks are considerably more likely to act upon their ideology by joining a social movement than those without network affiliations. This goes a long way toward explaining why empirical studies since the 1970s have shown low to moderate, as compared to strong, relations between ideological commitment and movement participation (Wood and Jackson 1982). That is, the relationship between ideology and movement participation is influenced by the extent of social network involvement, so that taking network involvement into account strengthens the relationship between ideology and movement participation.
In fact, this latter social network finding was preceded, and influenced by, the earlier finding that membership in secondary organizations (such as business, trade union, church, or even recreational associations) facilitated recruitment to social movements (Marx and Wood 1975, Wood and Jackson 1982). In contrast, mass society theory, as elaborated by William Kornhauser, had previously argued that the absence of such memberships facilitated recruitment. Other discussions of social network analysis have included topics such as the types of networks needed to facilitate recruitment, the role of conflicting group memberships in recruitment, and the size of networks for recruitment (McAdam and Snow 1997). Thus, with its organizational focus, social network analysis provides valuable insights into issues ranging from how political ideology gets translated into social movement participation to the influence of organizational size on movement recruitment.

3.5 The New Social Psychology

Bert Klandermans and his associates have developed a new social psychological approach to social movements, aptly summarized in his book, The Social Psychology of Protest (Klandermans 1997). This new social psychology analyzes recruitment to social movements, as did an earlier social psychological approach, but also includes such topics as framing of social movements, identity politics, new social movements, the transformation of discontent into movement participation, and the relation of social psychology to social movement organizations and interorganizational analysis. Whereas an earlier social psychology of social movements emphasized the influence of single variables such as relative deprivation or ideology on movement participation (Wood and Jackson 1982), or focused on emergent norms influencing such participation (Turner and Killian 1972), the new social psychology sees potentially sympathetic individuals responding to a variety of

social and cultural influences that may, or may not, lead to actual movement participation. Among these influences are the social networks and organizations discussed previously, which are seen to promote participation. The new approach also discusses circumstances, such as a deepening political consciousness, favoring (or inhibiting) sustained participation, as discussed by Molly Andrews in Lifetimes of Commitment.
Among the influences more recently discussed is the manner in which a movement is framed, which is its presentation to the public, especially by the media, an approach significantly inspired by Ralph Turner’s important article, ‘The public perception of protest’ (Turner 1969). Is a movement for female reproductive rights portrayed as a movement to enhance the social, economic, and political rights of women, or is it presented as a movement to deprive unborn children the right to live? Is a movement for better hours, wages, and working conditions portrayed as a just cause for exploited working people, or is it an excuse for ‘union bosses’ to expand their power? The new social psychology goes into detail explaining how movements and their oppositions attempt to frame the movements in ways to encourage or discourage actual participation by those with initial interest in the movements.
One of the more controversial emphases discussed by the new social psychology (and which has its own literature by scholars such as Alberto Melucci) focuses on the role of personal identity in understanding the development of recent movements and why certain people are drawn to these movements. In an informative book entitled Ideology and the New Social Movements, Alan Scott (1990) indicates why social movements such as the student movement of the 1960s, the modern women’s movement, and ethnic movements such as the black movement in the USA should be called new social movements. He argues that these movements are new because, first, they are not the traditional labor movement that arose from conflicts and contradictions between groups of labor and capital involved in the means of economic production, and second, because they are instead based on, and arise from, issues of personal identity such as one’s gender, status as a student, and racial or ethnic background. This type of position draws loosely on Orrin Klapp’s imaginative, though less empirically oriented book, Collective Search for Identity.
Whereas Scott argues that these considerations of identity are paramount in forming the new social movements, and figure in any success of the movements, others such as Todd Gitlin (1995) have argued that excessive involvement with personal identity issues such as these can detract from more general or universal themes such as the fight for decent conditions of work, fair allocation of income in a society, and improvement of lifestyles for all instead of just for particularistic group interests. Still other arguments have focused on the fact that a key basis for organizing women and ethnic minorities, especially, has been the need to improve their status economically, occupationally, educationally, legally, and in terms of lifestyles and life opportunities. Responding to higher education budget cuts in the 1990s, students from varying backgrounds rallied together, along with faculty from varying backgrounds, to fight these cuts and preserve educational and career opportunities (Wood 1998). Thus, identity politics may well be fairly important in organizing given modern social movements, but the continuing issues of economics, education, legal rights, lifestyles, and life opportunities remain as important to the new movements as to the labor movement.

3.6 Historical–Comparative Analyses of Revolutions

There is a body of highly scholarly, detailed historical and comparative analyses of revolutions. Barrington Moore, George Rudé, E. P. Thompson, Charles Tilly, Theda Skocpol, Eric Hobsbawm, and Jack Goldstone are among the major contributors to the understanding of such classical social and political upheavals as the French Revolution, Russian Revolution, English Revolution, and Chinese Revolution. These authors combine analyses of conditions common to several revolutions, such as the weakening of state authority, with analyses of conditions unique to some upheavals but not others, such as differing cultural ideology.
Skocpol (1979), for example, analyzes social revolutions that entail not only political change at the top of society, but also large-scale social–economic change throughout the society. She argues that weakening of the state and military, peasant rebellions, and alienated marginal elites combined to bring about social revolutions in France, Russia, and China. Goldstone (1991) also looks at similar conditions of revolutions, as well as demographic conditions, but adds in an ideological dimension of why social revolutions were more likely to occur in the West with its Judeo-Christian ideology that emphasized large-scale transformations of society. Thus, for Goldstone, social revolutions instead of palace coups or military conquests, for example, were seen as the secular application of Western ideology as compared with non-Western ideologies such as Hinduism or Buddhism that were associated with different political results. This body of literature is complex, but highly informative for deeper understandings of the processes whereby revolutions develop.

3.7 Analyses of Other Forms of Collective Behavior: Riots, Panics, Crazes, and Mass Delusions

The sociology of collective behavior has been interested in longer-term, more organized social movements, as well as shorter-term, less organized, and less political forms of noninstitutionalized group behavior. A. C. Kerckhoff and K. W. Back’s interesting study of mass delusion, June Bug, analyzed perceptions in an American city that its inhabitants were being bitten each summer by a June Bug. The two researchers were in this city, determined that no such insect existed, and then utilized Smelser’s (1963) theory of collective behavior to understand the development of this unusual craze.
In the late twentieth century, various cults have been the focus of collective behavior analysis, with discussions including Jim Jones’ People’s Temple cult that committed mass suicide, the Heaven’s Gate cult that acted similarly while anticipating transportation to another planet, and the San Diego-based cult focusing on a goddess, with a public access television show, who also promised her followers transportation to an extraterrestrial land (Tumminia and Kirkpatrick 1995). These studies have focused on the circumstances, social and personal, that would lead people to join such cults, as well as circumstances that keep them involved, especially in terms of members’ susceptibility to suggestions and new influences due to their isolation from familiar surroundings and social networks.
Riots are a type of crowd behavior studied since Le Bon’s The Crowd. For many years riots (that is, violent crowd activities) were seen as irrational, spontaneous, and unorganized group activities. This picture has changed over the years, with riots now more often seen as outcomes of oppression and deprivation, having some rational connection to anticipated outcomes, and having at least some organization (McPhail 1989). Indeed, Gary Marx felt compelled to add a correction to the increasingly ideological characterization of riots by discussing ‘issueless riots,’ such as crowd violence after athletic victories. Nonetheless riots, along with social movements, are typically seen as responses to difficulties that groups experience that are at least understandable, or even as rationally calculated responses to these difficulties (Tarrow 1991, 1998).

4. Conclusion

The sociology of collective behavior is one of the most dynamic and innovative fields of study in sociology. Indeed, Alain Touraine has stated that the study of social movements is sociology. Collective behavior has always had the ability to inspire and appall. Those sympathetic to the French Revolution remain in awe of the social class changes brought about by it, whereas those in opposition remain dismayed by the violence connected with it. Even a harsh critic of collective behavior like Le Bon felt that progress would not occur without the crowds he nonetheless disliked. As such, collective behavior will continue inspiring and appalling large numbers of people since it persists in dealing with the most electrifying of human dramas.

See also: Action Theory: Psychological; Action, Collective; Crowds in History; Deprivation: Relative; Identity: Social; Ideology: Political Aspects; Ideology, Sociology of; Network Analysis; Networks: Social; Panic, Sociology of; Revolutions, Sociology of; Revolutions, Theories of; Social Movements: Resource Mobilization Theory; Social Movements, Sociology of

Bibliography

Gitlin T 1995 The Twilight of Common Dreams: Why America is Wracked by Culture Wars. 1st edn. Metropolitan Books, New York
Goldstone J A 1991 Ideology, cultural frameworks, and the process of revolution. Theory and Society 20: 405–53
Klandermans B 1997 The Social Psychology of Protest. Blackwell, Cambridge, MA
Marx G T, Wood J L 1975 Strands of theory and research in collective behavior. Annual Review of Sociology 1: 363–428
McAdam D 1982 Political Process and the Development of Black Insurgency 1930–1970. University of Chicago Press, Chicago
McAdam D, Snow D A 1997 Social Movements: Readings on Their Emergence, Mobilization, and Dynamics. Roxbury, Los Angeles
McPhail C 1989 Blumer’s theory of collective behavior: the development of a non-symbolic interaction explanation. Sociological Quarterly 30: 401–23
Scott A 1990 Ideology and the New Social Movements. Unwin Hyman, London
Skocpol T 1979 States and Social Revolutions. Cambridge University Press, Cambridge, UK
Smelser N J 1963 Theory of Collective Behavior. Free Press of Glencoe, New York
Tarrow S G 1991 Struggle, Politics, and Reform: Collective Action, Social Movements, and Cycles of Protest. Cornell University/Western Societies Program, Occasional Paper No. 21, 2nd edn. Cornell University Press, Ithaca, NY
Tarrow S G 1998 Power in Movement, 2nd edn. Cambridge University Press, New York
Tumminia D, Kirkpatrick R G 1995 Unarius: emergent aspects of an American flying saucer group. In: Lewis J R (ed.) The Gods Have Landed: New Religions from Other Worlds. State University of New York, Albany, NY, Chap. 4 pp. 85–104
Turner R 1969 The public perception of protest. American Sociological Review 34: 815–31
Turner R H, Killian L M 1972 Collective Behavior, 2nd edn. Prentice-Hall, Englewood Cliffs, NJ
Wood J L 1998 With a little help from our friends: student activism and the crisis at San Diego State University. In: DeGroot G J (ed.) Student Protest: The Sixties and After. Addison Wesley Longman, New York, Chap. 19 pp. 264–79
Wood J L, Jackson M 1982 Social Movements: Development, Participation, and Dynamics. Wadsworth, Belmont, CA

J. L. Wood

Copyright © 2001 Elsevier Science Ltd. All rights reserved.
International Encyclopedia of the Social & Behavioral Sciences ISBN: 0-08-043076-7



Collective Beliefs: Sociological Explanation

The sociological tradition considers the explanation of collective beliefs as an essential task. In his famous Elementary Forms of Religious Life (1912, 1995), Durkheim presents this explanatory project as the center of gravity of the sociological discipline. Although of decisive importance, this project remains confronted with certain difficulties. The first is the extreme diversity of collective beliefs. The members of modern and traditional collectivities develop many and varied forms of thought: ideological, religious, magic, scientific, etc. Impressed by this diversity, sociologists have sometimes multiplied the explanatory categories and by doing so have forgotten that the vocation of any explanatory theory is to establish the simplicity behind the apparent complexity.
Another difficulty is the permanent and undoubtedly irreducible tension on which the sociological explanation of collective beliefs rests. The sociologist must maintain sufficient proximity to the observed beliefs to be able to identify their specific qualities, while remaining sufficiently distant to preserve the autonomy of the sociological interrogation. The history of the various ideological drifts of sociological analysis shows that this balance is not always easily achieved.
A short detour through the sociological tradition will first remind us of the diversity of the sociological explanations of collective beliefs. Second, we will identify some of the general criteria used by sociologists to evaluate the respective merits of these various explanations. Finally, we will be interested in the reasons why many contemporary sociologists consider the generalization of the principle of ‘symmetry’ as significant progress for the sociological explanation of collective beliefs.

1. The Diversity of Sociological Explanations

In a classical article intended to provide a general theoretical framework for the sociology of knowledge, R. K. Merton (1945) summarizes the nature of the sociological explanation in the following way: to establish the correlations between collective beliefs, conceived as dependent variables, and ‘the other existential factors of society and culture’ conceived as independent variables. If such a formula makes it possible to unify many sociological studies devoted to the explanation of collective beliefs, this unity remains however purely ‘formal.’ The diversity of the sociological explanatory modes must be considered on at least two levels: on the one hand the nature of these ‘existential factors’ mentioned by Merton, on the other the nature of the ‘relation’ which links these factors to the collective beliefs.
Some examples may be useful. Ideology constitutes a traditional object of the sociology of collective beliefs. These sociopolitical belief systems play a central role in the legitimization of the social order of modern societies. The sociological explanation of ideologies generally consists of placing them back in the specific interaction system in which they appear. There is not, however, a single way to conceive the nature of this system. The Marxist tradition, for example, identifies this system with a complex set of ‘social interests.’ This explanatory scheme initially suggested by Marx (1852) is primarily of a ‘utilitarist’ nature. If a social group believes in the value of this or that politico-social organization, it is not because of its intrinsic value, but because this precise type of organization reinforces directly or indirectly its social domination, and consequently its material interests. The social utility of the ideological belief overrides its truth or falseness.
If this utilitarist approach to collective beliefs has been presented by K. Mannheim (1929, 1991) as a fundamental stage for the emergence of the sociology of knowledge, it is not the only possible one. The Paretian study of religious beliefs, for example, rests on a very different conception of the existential factors mentioned by Merton. In his famous Traité de sociologie générale, Pareto (1916, 1968) analyzes the social diffusion of religions as the consequence of social ‘feelings.’ He stresses the importance of clearly separating the respective influence of ‘derivations’ (ideas, theories, theologies, etc.) and ‘residues’ (deep-rooted feelings): ‘the social value of the religions,’ Pareto writes, ‘depends very little on derivations, enormously on the residues. In several religions, there is a significant group of residues (…) which correspond to feelings of discipline, submission, hierarchy’ (§1854). In a way, this explanatory scheme developed by Pareto is very different from the one proposed by Marx for the ideologies: if a social group massively adopts a religion, it is not because of its direct or indirect social ‘utility’ but because it somehow manages to satisfy the dominant ‘passions’ of this group. In another way, however, these two explanatory schemes are quite similar. Marx for the ideologies and Pareto for the religions both explicitly consider that the link between the dependent and independent variables has to be conceived as a ‘causal’ relation. The adoption of a belief by a social group appears in both cases as the mechanical consequence of ‘forces’ (interest or passions) which dominate the conscience of its members. These forces remain out of the control of the social actors.
This causal approach has sometimes been used to explain the existence of magic beliefs. Lévy-Bruhl (1922, 1960) sees the persistence of magic beliefs in traditional societies as the mechanical consequence of a specific mental structure: the ‘primitive mentality.’ This mentality, Lévy-Bruhl suggests, prevents the members of these communities from perceiving the

objective difference between verbal similarity and real inventions are intended to bring a solution to maritime
similarity and, in a more general way, the difference transport, mining industry, or military technology
between the relations between the words and the problems. The production of the scientific beliefs may
relations between the things. However, works of be partially interpreted as an attempt of the scientific
Durkheim and Weber show that this approach of community to satisfy an explicit or diffuse social
magic beliefs is far from being the most fruitful. In demand.
their respective analyses of the magic beliefs, they both
identify the ‘existential’ factor to the immediate
environment of the social actors. They both also
conceptualize the relation between this environment 2. Three Elementary Conditions of Sociological
and the collective belief as ‘rational’ or more precisely Explanations
as subjectively rational. ‘The acts motivated by magic,’ As shown by these classical examples, sociologists
Weber writes (1922, 1979), ‘are acts at least relatively refer in the course of their studies to multiple ex-
rational (…): they follow the rules of the experience istential factors: interest (Marx), passion (Pareto),
even if they are not necessarily acts according to the structure of interaction (Durkheim, Weber), cultural
means and ends.’ The same point is emphasized by framework (Sorokin), socioeconomic development
Durkheim (1912, 1995), in particular, when he com- (Merton). They diverge moreover in their way of
pares the rationality of traditional ‘rites’ and the representing the nature of the relation between these
rationality of modern ‘techniques’: ‘the rites which factors and the studied collective beliefs. Three general
[the primitive] use to ensure the fertility of the ground approaches of this relation are regularly observable:
(…) are not, to him, more irrational than are, to us, the causal (Marx, Pareto), functional (Merton), and
technical processes used by our agronomists (…). The rational (Durkheim, Weber, Sorokin). It also happens
powers linked to these rites do not appear particularly sometimes that sociologists try to evaluate the ob-
mysterious. For those who believe in them, those jective range of their own explanations. The question
forces are not more unintelligible than are gravity or they then have to face can be summarized in the
electricity to a contemporary physicist.’ The explan- following way: which conditions do sociologists need to
atory strategy proposed by Weber and Durkheim respect in order to produce a solid explanation of
thus exists mainly to identify the role of the collective collective beliefs? Three elementary conditions deserve
belief in the adaptation process of the social actor to to be briefly underlined.
his immediate environment, and thus to reconstruct
the ‘meaning’ of the belief for this actor. First condition: the sociological explanation of collective
The sociologists have also paid great attention to belief must be more than the translation of the phenomenon
scientific beliefs. Sorokin (1937), for example, attempts to explain a new idiom
to demonstrate ‘that what a given society regards as
true or false, scientific or unscientific (…) is funda- The sociologists of beliefs generally start from the
mentally conditioned by the nature of the dominant identification of an enigmatic phenomenon. Why does
culture.’ He analyzes the relation between the social the industrial bourgeoisie of the nineteenth century
‘credibility’ of scientific representations of reality and believe in the virtues of parliamentary monarchy
the evolution of cultural values. The Sorokinian (Marx)? Why do Zunis Indians adopt a representation
‘existential’ factor is different from the factors pre- of the world in seven categories (Durkheim)? Why are
viously observed in Marx, Pareto, Weber, or corpuscular and ondulatory theories of light widely
Durkheim’s works: it consists mainly of a ‘cultural accepted during the nineteenth century (Sorokin)? etc.
framework’ evolving in a cyclic way. The ‘relation’ But it sometimes happens that their answers are only
between dependent and independent variables remains apparent. The methodological analysis of the theory
however similar to the relation theorized by Durkheim of ‘primitive mentality’ shows thus that if the theory of
and Weber. The social groups, Sorokin says, select Le! vy-Bruhl does not convince anymore it is essentially
their scientific beliefs according to a general principle because it does not explain anything. The key concepts
of ‘logical dependence’ or ‘logical consistency.’ of this theory, Boudon underlines (1990, 1994), ‘only
Merton (1938, 1970) shares with Sorokin the will to paraphrase the beliefs. They explain the confusion of
establish the social conditions of the scientific deve- the primitives between verbal associations and causal
lopment. However, he emphasizes, the ‘functional’ relations by the primitive’s tendency to make this
nature of the relation which linked the natural sciences confusion, tendency whose reality is guaranteed by the
of the seventeenth century to their socio-historical fact that they do make it indeed.’ A similar point is
contexts. Scientific knowledge, Merton observes, is made by Merton (1945) concerning Sorokin’s theory.
obviously developed on the basis of cognitive cons- ‘It seems tautological,’ Merton writes, ‘to say, as
traints but this development also integrates, in variable Sorokin does, that in a sensualist society and culture,
proportions, the influence of social factors. The the sensualist system of the truth based on the
statistical analysis applied to technological inventions testimony of the sensory organs must prevail. Because
shows in particular that a great number of these he had already defined the sensualist mentality as one


which conceives reality as being only what is perceived with the sensory organs.’ More recently,
S. Cole (1996) pointed out that the analysis of the production of scientific beliefs by certain
constructivist sociologists of science is primarily circular in nature. These sociologists generally
claim ‘that it is impossible to separate the technical from the social; that all science is
inherently social. This turns their entire argument into a tautology. If science is inherently
social, this means that the technical aspects of scientific discoveries by necessity must be
determined socially. This is indeed the question we are examining.’

Second condition: the sociological explanation of collective beliefs must avoid ambiguous and/or
occult concepts

Sociologists use a great number of conceptual categories. Many propositions produced by combining
these categories make clear and univocal sense; some, however, remain more ambiguous. For example,
certain sociological explanations rest on an ambiguous concept of the ‘social,’ or more precisely on
a confusion between the ‘social’ and ‘historical’ dimensions of a collective belief. To show that a
belief appears in a particular historical context is not, however, the same as to demonstrate its
social character. L. Laudan (1977) is right to observe that ‘the fact that certain assertions are
contextual, that one adhered to it only to a certain time and in a certain place establishes neither
that they are necessarily of social origin, nor even that they are opened to the sociological
analysis.’ The sociological use of the concept of ‘interest’ constitutes another well-known example
of ambiguity. This concept is sometimes used by sociologists of science in order to describe the
origin of the prestige or credibility of scientific beliefs. In an underdetermined situation,
contemporary sociologists of science say, scientists choose between two competing theories the one
which appears to conform to their ‘interests.’ But how are we to conceive the exact nature of these
interests? Are they material interests, symbolic interests, technical or cognitive interests? As
long as the concept of interest remains ambiguous, the sociologist cannot legitimately claim to
explain the observed beliefs. In addition, following a Weberian principle, many sociologists closely
associate the solidity of an explanation with the manner in which it avoids using occult categories,
i.e., invisible ‘forces’ (such as the Paretian concept of ‘residue’ or the ‘primitive mentality’ of
Lévy-Bruhl) which determine the behaviors of the social actors. The methodological principle is
simple and easily acceptable: the sociologist seeking to account for the origin of a collective
belief may refer to the existence of an occult cause only insofar as he cannot explain this belief
more easily, in particular by identifying its meaning. The obvious difference between the
explanations of magic beliefs proposed on the one hand by Lévy-Bruhl and on the other by Durkheim
and Weber seems to be directly linked to the application of this general principle by the latter.

Third condition: the sociological explanation of collective belief must be compatible with
observable facts

This third condition can be interpreted in various ways. The first and simplest one consists of
assuming that explanatory theories of scientific beliefs are not different from other scientific
theories. They must not only possess various internal qualities (absence of circularity, precision,
etc.) but also be able to be confronted with precise facts, and further still to withstand this
empirical test. These facts might be elementary observations. The observation, for example, of the
persistence of magic beliefs in Western societies seems hardly compatible with the assumption of a
causal link between magic beliefs and a ‘primitive mentality’ (Lévy-Bruhl). In the same way, the
observation of the development of new religious movements in large cities openly contradicts the
hypothesis that the contemporary process of urbanization is essentially a vector of disintegration
of collective belief (Borhek and Curtis 1975). These facts can also be more complex products of the
interpretative reconstruction work done by the sociologist or anthropologist. When R. Horton (1993)
argues, for example, against the proponents of the symbolist theory of magic, he points out that
this theory, for which magic beliefs are mainly the expression of a symbolic ‘desire,’ flagrantly
contradicts what the members of traditional societies say about their own rituals: ‘(…) in denying
the paramount importance of explanation, prediction and control as guiding aims of traditional
African religious life, “symbolists” (…) are committing the cardinal interpretative sin of flouting
the actor’s point of view.’ Another way to interpret this third condition consists of associating
the solidity of an explanatory theory with its capacity to account for beliefs other than the one
for which it was initially elaborated. Is it possible, for example, to apply the explanatory scheme
imagined by Durkheim and Weber for magic beliefs to other types of beliefs, such as ideological
ones? The fact that this transposition is not only possible but brings an objective gain of
intelligibility for the comprehension of the ideological phenomenon (Boudon 1986, 1989) demonstrates
the real value of their general conception of sociological explanation.

3. The Symmetry Principle

Sociologists frequently add to these three general conditions a final one: a condition known as the
‘principle of symmetry.’ This principle implies that the explanation of collective beliefs should be
equally applied to phenomena of various natures. This last condition is not reducible to the
conditions previously mentioned (the third one in particular) insofar as, to identify the nature of
this plurality, it introduces a


criterion of truth and falseness, rationality and irrationality. A sociological theory of collective
beliefs is said to be symmetrical if it explains in the same terms true beliefs and false beliefs,
rational beliefs and irrational beliefs. This principle, implicitly used by Sorokin (1937), is
directly opposed to the ‘a-rationality principle’ according to which the sociological explanation of
collective beliefs cannot be indifferent to the content of these beliefs.

The interest of the principle of symmetry is twofold. From a methodological point of view, it
requires the suspension of all evaluation related to the potential validity or invalidity of the
analyzed collective belief. It thus represents a guarantee of neutrality. From the point of view of
empirical research, it makes it possible to follow in unified and thus simplified terms the manner
in which collective beliefs obtain social recognition or, inversely, fail to obtain it.

The sociological use of this principle has, however, followed two paths of unequal importance. The
first, chosen by some sociologists of science, consists mainly of explaining scientific beliefs in
terms of explanatory categories previously devoted to false beliefs. Linked to an exclusively causal
approach to the production of scientific belief, this first use of the principle of symmetry has
greatly contributed to the diffusion of a relativistic representation of science (Dubois 2001).
There is, however, another way to apply this principle. It consists of following the major intuition
of the theory proposed by Weber and Durkheim: to seek behind all collective beliefs, true or false,
socially validated or socially marginal, the influence of ‘reasons,’ objective and/or subjective.
Thus, in reconstructing the limited rationality of certain African rituals, R. Horton (1993) defines
the main aim of his explanatory project as follows: elaborating ‘an explanatory framework which
would deal “symmetrically” with the traditional and the modern, the prescientific and the
scientific.’

See also: Action, Theories of Social; Belief, Anthropology of; Cognitive Dissonance; Collective
Memory, Anthropology of; Culture, Sociology of; Durkheim, Emile (1858–1917); Ideology, Sociology of;
Knowledge, Sociology of; Mannheim, Karl (1893–1947); Rational Choice Theory in Sociology; Social
Movements, Sociology of; Theory: Sociological; Tocqueville, Alexis de (1805–59); Weber, Max
(1864–1920)

Bibliography
Borhek J T, Curtis R F 1975 A Sociology of Belief. Wiley, New York
Boudon R 1986/1989 The Analysis of Ideology. University of Chicago Press, Chicago
Boudon R 1990/1994 The Art of Self-Persuasion: The Social Explanation of False Beliefs. Polity Press, London
Cole S 1996 Voodoo sociology: recent developments in the sociology of science. Annals of the New York Academy of Sciences 775: 274–87
Dubois M 2001 La nouvelle sociologie des sciences. Presses Universitaires de France, Paris
Durkheim E 1912/1995 The Elementary Forms of Religious Life. Free Press, New York
Horton R 1993 Patterns of Thought in Africa and the West. Cambridge University Press, Cambridge, UK
Laudan L 1977 Progress and its Problems. University of California Press, Berkeley, CA
Lévy-Bruhl L 1922/1960 La mentalité primitive. Presses Universitaires de France, Paris
Mannheim K 1929/1991 Ideology and Utopia. Routledge & Kegan Paul, London
Marx K 1852 The eighteenth brumaire of Louis Bonaparte. In: Marx K, Engels F (eds.) Collected Works. Lawrence & Wishart, London, Vol. 11
Merton R K 1938/1970 Science, Technology and Society in Seventeenth Century England. Fertig, New York
Merton R K 1945/1971 Sociology of knowledge. In: Gurvitch G, Moore W E (eds.) Twentieth Century Sociology. Books for Libraries Press, Freeport, New York
Pareto V 1916/1968 Traité de sociologie générale. Droz, Genève, Switzerland
Sorokin P 1937 Social and Cultural Dynamics, Fluctuation of Systems of Truth, Ethics, and Law. American Book Company, New York, Vol. 2
Weber M 1922/1979 Economy and Society. University of California Press, Berkeley, CA

M. Dubois


Collective Identity and Expressive Forms

If key concepts or expressions can be identified that function to capture the animating spirit of
different epochs, then certainly one candidate concept for the latter quarter of the twentieth
century is the concept of collective identity. Indeed, it is a concept that came of age in the
latter part of the past century, as reflected in the outpouring of scholarly work invoking the
concept directly or referring to it indirectly through the linkage of various collectivities and
their identity interests via such concepts as identity politics, identity projects, contested
identities, insurgent identities, and identity movements. This article provides an analytic overview
of scholarly work on the concept by considering, in order, its conceptualization, its various
empirical manifestations, the analytic approaches informing its discussion and analysis, and several
unresolved theoretical and empirical issues.

1. Conceptualization

The concept of collective identity, just as the base concept of identity, is rooted in the
observation that interaction between two or more sets of actors minimally requires that they be
situated or placed as social objects. To do so is to announce or impute identities. Hence,
interaction among individuals and groups, as social objects, is contingent on the reciprocal
attribution and avowal of identities. This character of identity is highlighted in Stone’s (1962)
conceptualization of identity as the ‘coincidence of placements and announcements.’ This process
holds for both individuals and collectivities, and it probably has always been a characteristic
feature of human interaction, whether the interaction was among early preliterate humans or among
those in the modern social world. To note this is not to ignore the sociological truism that the
issue of identity becomes more problematic and unsettled as societies become more structurally
differentiated, fragmented, and culturally pluralistic (Castells 1997, Giddens 1991). But historical
variation in the extent to which matters of identity are problematic does not undermine the
double-edged observation that the reciprocal imputation and avowal of identities is a necessary
condition for social interaction and that identities are thus rooted in the requisite conditions for
social interaction.

Delineating the interactional roots of identities does not explain what is distinctive about
collective identity, as there are at least three conceptually distinct types of identity: personal,
social, and collective. Although they often overlap, one cannot be inferred from the other. Hence
the necessity of distinguishing among them.

Social identities are the identities attributed or imputed to others in an attempt to situate them
in social space. They are grounded typically in established social roles, such as ‘teacher’ and
‘mother,’ or in broader and more inclusive social categories, such as gender categories or ethnic
and national categories, and thus are often referred to as ‘role identities’ (Stryker 1980) and
‘categorical identities’ (Calhoun 1997). Whatever their specific sociocultural base, social
identities are fundamental to social interaction in that they provide points of orientation to
‘alter’ or ‘other’ as a social object.

Personal identities are the attributes and meanings attributed to oneself by the actor; they are
self-designations and self-attributions regarded as personally distinctive. They are especially
likely to be asserted during the course of interaction when other-imputed social identities are
regarded as contradictory, as when individuals are cast into social roles or categories that are
insulting and demeaning (Snow and Anderson 1987). Thus, personal identities may derive from role
incumbency or category-based memberships, but they are not necessarily comparable since the relative
salience of social roles or category membership with respect to personal identity can be quite
variable.

Just as social and personal identities are different yet typically overlapping and interacting
constructs, such is the relationship between collective and social and personal identities. Although
there is no consensual definition of collective identity, discussions of the concept invariably
suggest that its essence resides in a shared sense of ‘one-ness’ or ‘we-ness’ anchored in real or
imagined shared attributes and experiences among those who comprise the collectivity and in relation
or contrast to one or more actual or imagined sets of ‘others.’ Embedded within the shared sense of
‘we’ is a corresponding sense of ‘collective agency.’

This latter sense, which is the action component of collective identity, not only suggests the
possibility of collective action in pursuit of common interests but even invites such action. Thus,
it can be argued that collective identity is constituted by a shared and interactive sense of
‘we-ness’ and ‘collective agency.’ This double-edged sense can be culled from classic sociological
constructs such as Durkheim’s ‘collective conscience’ and Marx’s ‘class consciousness,’ but is
reflected even more clearly in most conceptual discussions of collective identity, although the
agentic dimension is sometimes implied rather than directly articulated (e.g., Castells 1997, Cerulo
1997, Eisenstadt and Giesen 1995, Jasper and Polletta 2001, Jensen 1995, Levitas 1995, Melucci 1989,
1995).

A common theme running throughout a segment of the literature is the insistence that collective
identity is, at its core, a process rather than a property of social actors. Such work acknowledges
that collective identity is ‘an interactive and shared definition’ that is evocative of ‘a sense of
we,’ but then highlights the process through which social actors recognize themselves as a
collectivity, contending that this process is more vital to conceptualizing collective identity than
any resultant product or property (e.g., Melucci 1989, pp. 34, 218, passim). Few scholars would take
exception to the importance of the process through which collective identities develop, but it is
both questionable and unnecessary to contend that the process is more fundamental than the product
to understanding the character and functionality of collective identity. Not only is the product or
‘shared we’ generative of a sense of agency that can be a powerful impetus to collective action, but
it functions, as well, as the orientational identity for other actors within the field of action.
More concretely, it is the constructed social object to which the movement’s protagonists,
adversaries, and audience(s) respond (Hunt et al. 1994), and which, in turn, may have implications
for the operation of its organizational carrier, affecting the availability and character of allies,
resources, and even tactical possibilities (Jensen 1995). The initial projected collective identity
may be short-lived and transient, subject to modification and even transformation during the course
of ongoing collective (inter)action, but the set of properties that makes up the initial collective
identity, as well as whatever subsequent ones emerge, constitute objects of orientation and
interaction for other collectivities within the field of action.


If it is acknowledged that there is something of substance to collective identities, how are they
distinguished from social and personal identities? Several factors appear to be at work. First,
collective identities may or may not be embedded in existing social identities, since they are often
emergent and evolving rather than firmly rooted in prior social categories. This is often the case
with the collective identities that emerge in the course of dynamic social protest events (for
illuminating examples, see Walder’s research on the Beijing Red Guard Movement (2000), and Calhoun’s
account of the Chinese student movement of 1989).

Second, the collective, shared ‘sense of we’ is animating and mobilizing cognitively, emotionally,
and sometimes even morally. The shared perceptions and feelings of a common cause, threat, or fate
that constitute the shared ‘sense of we’ motivate people to act together in the name of, or for the
sake of, the interests of the collectivity, thus generating the previously mentioned sense of
collective agency. That potential inheres within social identities, but they typically function more
like orientational markers as the routines of everyday life are negotiated. When they are activated
or infused affectively and morally, it is arguable that they have been transformed into collective
identities. Third, the emergence and operation of collective identities means that other social
identities have subsided in relevance and salience for the time being. In other words, collective
identities, when they are operative, generally have claims over (not so much normatively as
cognitively and emotionally) other identities in terms of the object of orientation and character of
corresponding action. Examples abound, as observed frequently in the case of many protest
gatherings, gripping fads, joyous and celebratory sports crowds, and the concerted campaigns and
actions associated with social movement activism. Fourth, while collective identities and personal
identities are obviously different, they are still very much interconnected in the sense that
collective identities are predicated, in part, on constituents’ embracement of the relevant
collective identity as a highly salient part of their personal identity and sense of self (Gamson
1991). Finally, while the attribution or avowal of all identities is interactionally contingent,
collective identities tend to be more fluid, tentative, and transient than either categorically
based social identities or even personal identities.

2. Empirical Manifestations

Empirical manifestations of collective identity can vary in a number of significant ways. One
important axis of variation is the size of the collectivity and the corresponding scope of its
claims. If the essence of collective identity resides in a sense of ‘we-ness’ associated with real
or imagined attributes in contrast to some set of others, then it follows that collective identities
can surface among almost any grouping or aggregation in a variety of contexts, ranging from
relatively small cliques and gangs to sports fans and celebrity devotees, to laborers and
occupational groupings, to neighborhoods and communities, to even broader categories such as sexual
and gender categories, religions, ethnic groups, and nations.

The preponderance of empirical research on collective identity has focused on the last, more
inclusive set of categories: sexuality and gender, religion, ethnicity, and nationality.
Illustrative are Taylor and Whittier’s (1992) research on lesbian identity and lesbian social
movements; Nagel’s (1996) work on the resurgence of collective identity among American Indians;
Cornell and Hartmann’s (1998) overview of the construction of ethnic and racial identities in the
modern world; and Anderson’s (1991) and Calhoun’s work on nationalism, which the latter defines, in
part, as one ‘way of constructing collective identities’ (1997, p. 29). An additional characteristic
of research on collective identity is its association with the study of social movements, no doubt
because such mobilizations tend to be both generative of and dependent on collective identities
(e.g., Gamson 1991, Hunt et al. 1994, Jasper and Polletta 2001, Melucci 1989, Snow and McAdam 2000).

Although collective identities can congeal in various aggregations and contexts, they appear not to
do so on a continuous basis historically. Instead, their emergence and vitality appear to be
associated with conditions of sociocultural change or challenge, socioeconomic and political
exclusion, and political breakdown and renewal, thus suggesting that they cluster historically in
social space. The latter part of the twentieth century has generally been regarded as one such
period of collective identity effervescence and clustering, with some scholars characterizing this
period in terms of identity crises and collective searches for identity (e.g., Castells 1997, Gergen
1991, Giddens 1991, Klapp 1969).

In The Power of Identity, Castells captures both this characterization and the kinds of conditions
thought to be associated with the various manifestations of collective identity during this period:

Along with the technological revolution, the transformation of capitalism, and the demise of
statism, we have experienced, in the last quarter of the century, the widespread surge of powerful
expressions of collective identity that challenge globalization and cosmopolitanism on behalf of
cultural singularity and people’s control over their lives and environment (Castells 1997, p. 2).

3. Analytic Approaches

To note that expressions of collective identities cluster historically according to the conjunction
of various social conditions does not specify the character or


content of the emergent collective identities. This issue has been addressed from the vantage point
of three contrasting perspectives: primordialism, social structuralism, and social constructionism.

Both the primordialist and structuralist views can be construed as variants of an overarching
essentialist perspective which posits that a collectivity’s identity basically flows naturally from
some underlying set of characteristics, often reduced to a single determinative attribute regarded
as the collectivity’s ‘defining essence.’ From the primordialist point of view, the defining
characteristic is typically an ascriptive attribute, such as race, gender, or sexual orientation, or
sometimes a deep, underlying psychological or personality disposition. From a structuralist
perspective, the critical characteristic is typically a kind of master social category implying
structural commonality, such as social class, ethnicity, or nationality; a set of relational ties or
networks suggesting structural connectedness; or a mixture of both. Individuals who are similarly
situated structurally, such that they are incumbents of similar roles, work in similar enterprises,
are linked to the same social networks, or are members of the same social class, religion, or ethnic
group, are presumed to have a shared collective identity or at least be candidates for such.

The constructionist perspective, in general, rejects both the primordialist and structuralist
variants of the essentialist argument, seeing the presumed link between identities and their
ascriptive or structural moorings as being more indeterminate than postulated. Instead, attention is
shifted to the construction and maintenance of collective identities. Collective identities are seen
as invented, created, reconstituted, or cobbled together rather than being biologically preordained
or structurally or culturally determined.

Assessment of the three perspectives in terms of their relative analytic utility for explaining the
character and content of collective identities reveals considerable support for the constructionist
thesis. This may be due in part to the currents of fashion, influenced by the winds of
multiculturalism, postmodernism, and identity politics, but it is due, more importantly, to other
factors. One is that the hypothesized link between identities and the primordial attributes or
structural categories in which they are presumably anchored is too mechanistic. In its hard version,
it is contradicted by the sociological observation that people are often members of the same
categories or groups in different ways and with varying degrees of commitment and identification,
thus suggesting that inferring correspondence between personal, social, and collective identities
solely on the basis of primordial or structural categories is empirically suspect. Additionally, the
claims of primordialist and structuralist arguments do not fare well when confronted with the
observation that people generally have multiple identities (e.g., family, work, leisure, gender,
ethnic, religious, political, and national) that are differentially invoked or avowed depending on
their relative salience and their situational pervasiveness.

Salience refers to the relative importance of an identity in relation to other identities (Stryker
1980); pervasiveness or comprehensiveness refers to the situational relevance or reach of any
particular identity and the corresponding degree to which it organizes social life, including
collective action (Cornell and Hartmann 1998, Snow and McAdam 2000). Given the fact that increasing
numbers of individuals live in a world in which they are the carriers of multiple and often
conflicting identities, what determines any particular identity’s relative salience and
pervasiveness, and thus the influence of its claims, vis-à-vis others? Clearly such matters are not
determined solely by an identity’s primordial roots or structural footing.

Finally, much of the empirical evidence is consistent with the constructionist argument. Two highly
evocative examples include Trevor-Roper’s (1983) account of the retrospective invention of the
distinctive Highland culture and tradition so redolently associated with all of Scotland, and James’
parallel conclusion, based on extensive archeological and archival research, regarding the origins
of the modern Celts:

… the idea of a race, nation or ethnic group called Celts in Ancient Britain and Ireland is indeed a
modern invention. It is an eighteenth- and nineteenth-century reification of a people that never
existed, a factoid … assembled from fragments of evidence drawn from a wide range of societies
across space and time. This reification served the interests of a range of cultural expectations,
aspirations, and political agendas, and still does (James 1999, p. 136).

Such conclusions should not be read as unequivocal refutations of the primordialist and
structuralist arguments, since constructed identities are not fabricated whole cloth but typically
knit together by drawing on threads of past and current cultural materials and traditions,
structural arrangements, and even primordial attributes. These materials and attributes constitute
the kinds of stuff from which collective identities (particularly ethnic, religious and national
ones) are fashioned, and thereby function, in varying degrees, to constrain the construction
process. Interpretative constraint also may be exercised by the institutional contexts and relations
of power in which contestants are embedded (Castells 1997, Jensen 1995). Additionally, analyses of
the relationship between collective identity and participation in social movements repeatedly point
to the experience of collective action itself as a fertile seedbed for the generation of collective
identities (Calhoun 1991, Fantasia 1988, Melucci 1989, Walder 2000). Thus, while collective
identities are undeniably constructed, they rarely are constructed carte blanche; rather, they
typically are forged not only with the


materials suggested by the primordialist and structuralist perspectives, but with and through the
experience of collective action itself.

4. Theoretical and Empirical Issues

Relevant to a thoroughgoing understanding of collective identity, whatever its empirical locus, are
several theoretical and empirical issues that require more careful consideration than often accorded.

4.1 Identity Work (the Expression of Collective Identities)

Fundamental to understanding collective identity, particularly from a constructionist standpoint, are the
processes through which it is created, expressed, sustained, and modified. These processes have been
conceptualized as variants of ‘identity work,’ which encompasses the range of activities people engage in,
both individually and collectively, to signify and express who they are and what they stand for in
relation or contrast to some set of others (Schwalbe and Mason-Schrock 1996, Snow and Anderson
1987, Snow and McAdam 2000). At its core is the generation, invocation, and maintenance of symbolic
resources used to bound and distinguish the collectivity both internally and externally by accenting
commonalities and differences (Eisenstadt and Giesen 1995, Schwalbe and Mason-Schrock 1996, Taylor and
Whittier 1992). Symbolic resources include the interpretive frameworks (or frames), avowed and imputed
names, and dramaturgical codes of expression and demeanor (e.g., particularistic styles of storytelling,
dress, adornment, and music) that are generated and employed during the course of a collectivity’s efforts to
distinguish itself from one or more other collectivities.

Concrete examples include the various forms of identity talk such as ‘atrocity tales’ and ‘war stories’
that group members repeatedly tell each other, prospective adherents, and the media (Hunt and Benford
1994); particular songs and styles of music that invite participation and that are politically and emotionally
evocative, such as ‘We Shall Overcome’ (Eyerman and Jamison 1998); key words and slogans that function
in a similar fashion, such as ‘Liberté, Fraternité, and Égalité’ and ‘Workers of the World Unite’; and
systems of gestures and signs, such as the raised clenched fist and the peace sign, that function similarly
to the tradition of heraldry (Pastoureau 1997).

These and other symbolic resources function as boundary markers of collective differentiation,
distinguishing insiders from outsiders, or protagonists from antagonists, in a fashion that heightens
awareness of in-group commonalities and connections and out-group differences. Together they congeal into a
kind of ‘semiotic bricolage’ (Schwalbe and Mason-Schrock 1996) that gives symbolic substance to the
claimed distinctive ‘we,’ and it is largely through this bricolage that collective identity is expressed and
known publicly. While the boundary making and maintenance functions of these symbolic resources, or
bricolage, are widely acknowledged, what accounts for the differential resonance or carrying power of
different symbolic markers is less well understood.

4.2 The Problem of Identity Correspondence

A not uncommon problem with analyses of collective identity is the tendency to reify the collective identity,
and thus take for granted the link between the individuals that make up the collectivity and the
shared, overarching identity. This gloss is particularly troublesome in light of the observation that people
typically have multiple identities that vary in salience and pervasiveness. Thus, how is any particular
collective identity reconciled with other identities adherents possess? How, in other words, do the shared
cognitions and feelings indicative of a collective identity move center stage at the individual level? Such
questions allude to what has been referred to as the problem of ‘identity correspondence’—the alignment
or linkage of personal and collective identities (Snow and McAdam 2000).

One general answer is that this alignment ‘is accomplished by enlarging the personal identities of a
constituency to include the relevant collective identities as part of their definition of self’ (Gamson 1991,
p. 41). But what are the processes through which prospective adherents come to embrace the relevant
collective identity, such that personal and collective identity are correspondent or congruous? Two broad
processes have been suggested: identity convergence and identity construction.

Identity convergence refers to the union of personal and collective identities when both are congruent, such
that an extant collectivity provides a venue for an individual to act in accordance with her or his personal
identity. The analytic problem is not one of identity construction or transformation, but one of linkage or
bridging and the identification of the mechanisms that facilitate the convergence. A number of such
mechanisms have been identified. One operates at the organizational level, entailing the occasional
appropriation of existing solidary networks by movement organizations (Snow and McAdam 2000); the other
mechanisms are variants of rational choice processes. One is based on a ‘tipping’ or ‘threshold’ model,
which posits that collective identities are assumed by individuals when the perceived actions of others reach
a point that suggests that the payoffs for adopting, or at least acting in accordance with, the collective
identity outweigh doing otherwise. Illustrative is the contention that such tipping points played a critical
role in explaining language and identity shifts among


Russian-speaking immigrants ‘beached’ in four of the republics of the former Soviet Union (Laitin 1998). A
related mechanism is the existence of intergenerational investments in personal identities that may have
implications for the embracement of future collective identities (Laitin 1998). A third rational choice
explanation holds that the collective identities associated with social movements—and, by implication, with
other collectivities as well—can be regarded as ‘selective incentives’ for those who seek to express and
affirm their personal identities (Friedman and McAdam 1992). The explanatory utility of these
arguments, as well as the network appropriation thesis, is contingent on two underlying assumptions:
that there is pre-established congruence among some number of personal identities and the proffered or
available collective identities; and that collective identities are constituted by the aggregation or
convergence of parallel personal identities. While the first assumption is empirically tenable, the second is
questionable from the vantage point of many scholars of collective identity (e.g., Jasper and Polletta 2001,
Melucci 1989, 1995).

In the absence of correspondence between personal identities and collective identities, some variety of
identity work is necessary in order to facilitate their alignment. This alignment can vary significantly,
ranging from the elevation of the salience of a particular identity to a fairly dramatic change in one’s
sense of self. Four identity construction processes have been identified that capture this variation:
identity amplification, identity consolidation, identity extension, and identity transformation (Snow and
McAdam 2000).

Identity amplification effects a change in an individual’s identity salience hierarchy, such that an
existing but lower-order identity becomes sufficiently salient to ensure engagement in collective action, as
in the many cases in which the identity of woman was elevated and expanded in conjunction with the
Women’s Movement; identity consolidation involves the adoption of an identity that is a blend of two prior
but seemingly incompatible identities, as in the case of the union of environmentalists and labor activists and
‘Jews for Jesus’; identity extension entails the expansion of the situational pervasiveness of an individual’s
personal identity so that its reach is congruent with the collectivity’s, as when individuals come to see
themselves as representatives for a specific cause that transcends other role obligations and identities; and
identity transformation involves a dramatic change in identity, such that individuals now see themselves as
remarkably different than before, as often occurs in the case of conversion to a new group or movement.

The mechanisms or processes underlying these various forms of identity construction include framing
processes in which identities are announced or renounced, embraced or rejected, and modified or
reframed in the course of various interactions with adherents, antagonists, and bystander audiences
(Benford and Snow 2000, Hunt et al. 1994); engagement in collective action, as when direct
observation or experience functions as a demonstration event that gives rise to a situationally specific collective
identity or affirms collective claims and thus helps to render salient, and perhaps pervasive, what was
previously a secondary or marginal personal or social identity (Melucci 1989, Calhoun 1991, Walder 2000);
or some combination of both framing and actual engagement.

Given the variety of ways in which identity correspondence can be effected, the question arises as to
whether the relevance of the convergence and construction processes varies by type of collectivity. In the
case of social movements, for example, it has been hypothesized that movements that are culturally
different, greedy in terms of the cognitive and behavioral demands, and/or politically radical are likely to rely
more on identity construction than convergence processes (Snow and McAdam 2000). Whether this is
the case is an empirical question, but it does caution against presuming that what accounts for identity
correspondence and shifts in one context necessarily holds for another.

Just as there may be variation between types of collectivities and the processes of identity convergence
and identity construction, so it is reasonable to ask whether these processes might vary in importance at
different points in the life course of a social movement or ethnic or nationalist mobilization. Rather than
assuming that a particular process, such as the tipping process or identity amplification, operates routinely
with respect to the emergence of a collective identity, might not these processes be more relevant at
particular junctures in the career of a movement’s collective identity? Preliminary consideration of
such questions suggests that network appropriation, rational choice, and constructionist explanations,
rather than being mutually exclusive and antithetical, may interact and combine in interesting ways in
explaining the emergence, institutionalization, and diffusion of collective identities across different
contexts (Snow and McAdam 2000).

4.3 Dimensions, Layers, and Types of Collective Identity

Although there is an extensive literature on collective identity, with considerable discussion regarding its
conceptualization and sources, this literature has been relatively mute regarding variation in its form. The
concept is most often invoked as if it is an invariant, uniform collective phenomenon. This is not the case,
however, as collective identities can be multidimensional and be multilayered within a specific locus, and
they may also vary by type. The multidimensionality of collective identity is indicated by reference to its


cognitive, emotional, and moral dimensions (Jasper and Polletta 2001, Melucci 1989). The relative
importance of each of these dimensions to the vitality and motivational force of a collective identity has not
been elaborated, however. Presumably the presence of each of these dimensions yields a more robust and
vital collective identity. Clearly a collective identity in which the boundaries between ‘us’ and ‘them’ are
unambiguously drawn, in which there is strong feeling about those differences, and in which there is a sense of
moral virtue associated with both the perceptions and feelings, should be a more potent collective identity
than one in which either the emotional or moral dimensions are weakly developed.

Similarly, several analyses have noted that collective identity can be multilayered, such that there can be
variation in its locus. Three such layers have been noted with respect to social movements (Gamson
1991, Stoecker 1995). They include, beginning with the broadest and potentially most inclusive layer: the
social movement community or solidary group, which can be thought of as the constituent layer, as in the
case of black Americans in relation to the civil rights movement; the social movement layer, as in the case of
the civil rights movement; and the organizational layer, as represented by the Southern Christian
Leadership Conference (SCLC), the Congress of Racial Equality (CORE), and the Student Nonviolent
Coordinating Committee (SNCC) in the context of the civil rights movement. In principle, each successive
layer may be embedded in the larger more inclusive layer, giving rise to a generalized, cohesive collective
identity at the community or national level. But clearly the existence of a collective identity at one level does
not automatically generalize to or incorporate another level. Thus, collective identities can be built around the
organizational carriers of a movement, as in the case of SCLC and SNCC, without necessarily representing
the broader movement, which indicates the potential for identity conflicts at the collective level and the
potential for schism and factionalization. Such observations suggest the need for more careful
consideration of the often multilayered character of collective identities and of greater specification of the
ways in which they can interact and combine, and with what consequences. Also, these observations call
for caution in generalizing about the scope and functioning of collective identities, particularly with
respect to broader social categories such as ethnicities and nationalities.

Finally, it is reasonable to wonder if collective identities vary by type. At the most general level, Hunt
et al. (1994) distinguish among protagonist, antagonist, and audience or bystander identities, arguing
that even though protagonist or oppositional identities have received most of the scholarly attention, each
type or field of identity is fundamental to understanding the interactive dynamics underlying the
emergence, character, and functioning of collective identities. Noting as well that collective identities
arise and operate within an interactive context ‘marked by power relationships,’ Castells distinguishes
among legitimizing, resistance, and project collective identities (1997, pp. 7–10). Legitimizing identities
are associated with dominant institutions or the state, whereas both resistance and project identities
represent two basic forms of the antagonist identity, the former generated by devalued and stigmatized
collectivities, and thus constituting the seedbed for identity movements and politics, and the latter
associated with movement beyond resistance to the construction not only of alternative identities but also
a new system that valorizes rather than defiles the new identity. The important issue is not whether
such typologies of collective identities are exhaustive, but the emphasis on their contextually embedded
and interactional character and their different consequences.

See also: Collective Beliefs: Sociological Explanation; Discourse and Identity; Identity and Identification:
Philosophical Aspects; Identity Movements; Identity: Social

Bibliography

Anderson B 1991 Imagined Communities: Reflections on the Origin and Spread of Nationalism. Verso, London
Benford R D, Snow D A 2000 Framing processes and social movements: an overview and assessment. Annual Review of Sociology 26: 611–39
Calhoun C 1991 The problem of identity in collective action. In: Huber J (ed.) Macro–Micro Linkages in Sociology. Sage Publications, Newbury Park, CA
Calhoun C 1997 Nationalism. University of Minnesota Press, Minneapolis, MN
Castells M 1997 The Power of Identity. Blackwell Publishers, Oxford, UK
Cerulo K A 1997 Identity construction: new issues, new directions. Annual Review of Sociology 23: 385–409
Cornell S, Hartmann D 1998 Ethnicity and Race: Making Identities in a Changing World. Pine Forge Press, Thousand Oaks, CA
Eisenstadt S N, Giesen B 1995 The construction of collective identity. Archives Européennes de Sociologie 36: 72–102
Eyerman R, Jamison A 1998 Music and Social Movements: Mobilizing Traditions in the Twentieth Century. Cambridge University Press, Cambridge, UK
Fantasia R 1988 Cultures of Solidarity: Consciousness, Action, and Contemporary American Workers. University of California Press, Berkeley, CA
Friedman D, McAdam D 1992 Collective identity and activism: networks, choices, and the life of a social movement. In: Morris A D, Mueller C M (eds.) Frontiers in Social Movement Theory. Yale University Press, New Haven, CT
Gamson W A 1991 Commitment and agency in social movements. Sociological Forum 6: 27–50
Gergen K 1991 The Saturated Self. Dilemmas of Identity in Contemporary Life. Basic Books, New York


Giddens A 1991 Modernity and Self-identity: Self and Society in the Late Modern Age. Stanford University Press, Stanford, CA
Hunt S A, Benford R D 1994 Identity talk in the peace and justice movement. Journal of Contemporary Ethnography 22: 488–517
Hunt S A, Benford R D, Snow D A 1994 Identity fields: Framing processes and the social construction of movement identities. In: Larana E, Johnston H, Gusfield J R (eds.) New Social Movements: From Ideology to Identity. Temple University Press, Philadelphia, PA
Jasper J M, Polletta F 2001 Collective identity and social movements. Annual Review of Sociology 27: 283–305
James S 1999 The Atlantic Celts: Ancient People or Modern Invention? British Museum Press, London
Jensen J 1995 What’s in a name? Nationalist movements and public discourse. In: Johnston H, Klandermans B (eds.) Social Movements and Culture. University of Minnesota Press, Minneapolis, MN
Klapp O E 1969 Collective Search for Identity. Holt, Rinehart and Winston, New York
Laitin D D 1998 Identity in Formation: The Russian-speaking Populations in the Near Abroad. Cornell University Press, Ithaca, NY
Levitas R 1995 We: Problems in identity, solidarity and difference. History of the Human Sciences 8: 89–105
Melucci A 1989 Nomads of the Present: Social Movement and Identity Needs in Contemporary Society. Temple University Press, Philadelphia, PA
Melucci A 1995 The process of collective identity. In: Johnston H, Klandermans B (eds.) Social Movements and Culture. University of Minnesota Press, Minneapolis, MN
Nagel J 1996 American Indian Ethnic Renewal: Red Power and the Resurgence of Identity and Culture. Oxford University Press, New York
Pastoureau M 1997 Heraldry: Its Origins and Meanings. Thames and Hudson Ltd., London
Schwalbe M L, Mason-Schrock D 1996 Identity work as group process. Advances in Group Processes 13: 113–47
Snow D A, Anderson L 1987 Identity work among the homeless: The verbal construction and avowal of personal identities. American Journal of Sociology 92: 1336–71
Snow D A, McAdam D 2000 Identity work processes in the context of social movements: Clarifying the identity/movement nexus. In: Stryker S, Owens T, White R W (eds.) Self, Identity and Social Movements. University of Minnesota Press, Minneapolis, MN
Stoecker R 1995 Community, movement, organization: The problem of identity convergence in collective action. Sociological Quarterly 36: 111–30
Stone G P 1962 Appearance and the self. In: Rose A (ed.) Human Behavior and Social Processes. Houghton Mifflin, Boston
Stryker S 1980 Symbolic Interactionism: A Structural Version. Benjamin/Cummings, Menlo Park, CA
Taylor V, Whittier N E 1992 Collective identity in social movement communities: Lesbian feminist mobilization. In: Morris A D, Mueller C M (eds.) Frontiers in Social Movement Theory. Yale University Press, New Haven, CT
Trevor-Roper H 1983 The invention of tradition: The Highland tradition of Scotland. In: Hobsbawm E, Ranger T (eds.) The Invention of Tradition. Cambridge University Press, Cambridge/London
Walder A 2000 Identities and interests in the Beijing Red Guard Movement: Structural explanations reconsidered. Unpublished Manuscript

D. A. Snow

Collective Memory, Anthropology of

Focused on shared, collective, or social memory practices, this article examines the linkages between
memory and history with attention to the fields of power, in which the struggle for domination over
remembrance and tradition, the manipulation of retrievable historical consciousness and collective
forgetting, takes place.

1. Definition of Concepts: Memory and History

The term history is often used to describe representations of the past that appear in written or narrative
form, a primary medium through which states, elites, or dominant descent groups confiscate linear time and
proclaim official chronologies as master narratives. The writing of history is ‘a colonization’ of time ‘by
the discourse of power’ (de Certeau 1988). The term memory, by contrast, is conventionally applied to
those oral, visual, ritual, and bodily practices through which a community’s collective remembrance of the
past is produced or sustained (Connerton 1989). It includes the vast complex of unofficial,
noninstitutionalized knowledge that is not yet sedimented into formal traditions and that represents the
‘collective consciousness’ of whole groups (Halbwachs 1980), forming a counterweight to the knowledge that is
privatized and monopolized by certain elites for the defense of established interests. While memory may be
a moving reservoir of history, furnishing the ‘raw material’ for representing the past, it is not the same
as history (Le Goff 1992). Individual remembrance, collective memory, and narrative history interact in
highly complicated ways, shaping each other as different versions of the past are constructed and
reconstructed, modified, and invented.

2. The Politics of Time: History and Anthropology

The theories, models, and methods that were developed by anthropologists to study non-European
worlds tended toward a ‘rejection of historical research of any kind’ (Evans-Pritchard 1962), at least within
the limits of the discipline. Having as its central theme ‘the unlettered and forgotten peoples,’ the
anthropological endeavor was inclined to disregard the existence of indigenous histories, which included, in
particular, the trauma of the colonial encounter. But if


making history is a social practice that produces peoples, bodies, and places, as de Certeau (1988)
suggests, then the very erasure of such a universe of time and being must be understood as a figuration of
power. The effacement of historicity is a political operation.

Nineteenth-century anthropology was deeply implicated in the projects of Western modernity: the
construction of nation-states and the creation of colonial systems with which Europeans came to
dominate other worlds. During this process of global expansion, and the establishment of modern
hegemonic polities, the past was increasingly mobilized as a symbolic discourse ‘in the definition of, and the
marking of, the boundaries of states and nations’ (Cohn 1981, p. 228). Anthropologists of this period
contributed to the colonial undertaking by a reliance on analytic tools that used Western notions of time as
an index of unequal evolutionary development. The prerogative to know and control the past, by defining
it as ‘history’ (an order of chronological or linear time), was an entitlement of civilization. The social
worlds outside of Europe came to be marked as timeless: a closed symbolic universe inhabited by ‘the
people without history’ (Wolf 1982). Displaced into the sphere of alterity, colonial subjects were
incarcerated by the temporal logic of an expansive European world system.

The past as history became a way to construct schemes of classification which differentiated
Europeans from others. Anthropologists began to delineate stages of human development via the comparative
method: by identifying phenomena of a general type (taboo, totemism, polygamy), their goal was to map
out a global history of social institutions. In this endeavor, however, analytic units were ‘taken out of
the flux of time’ (Evans-Pritchard 1962, p. 175): social forms, stripped of their unique features and
context-bound meanings, escaped temporality, appearing static and unchanging. The comparative method, in its
attempt to achieve conceptual stability as a sociological proposition, was explicitly anti-historical in
emphasis and thereby served to obscure the effects of Western contact and domination.

By the early twentieth century, anthropologists had decisively abandoned the comparative paradigm. But
in spite of the shift to new methods of analysis, the discipline’s aversion to historical research intensified.
Scholars in Britain, Germany, and the United States asserted that the customs of native peoples were to be
found not in colonial archives, missionary reports, or dubious travel accounts but in the field: reliable data
about non-Western societies demanded first-hand observation. This was accomplished at first through
brief expeditions, and later by long-term residence of the anthropologist in native communities. British
scholars, in their interpretation of the ethnographic venture, reasserted the enlightenment program:
anthropology was to become ‘a natural science of society, based on a descriptive analysis of the structure
and functioning of social forms’ (Cohn 1981, p. 232). American scholars, by contrast, adopted a romantic
idealist stance: anthropology emerged as a hermeneutic science, focused on the descriptive reconstruction
and symbolic capture of entire cultural systems. Although the nation-bound projects of anthropology
began to diverge, the ethnographic field method was adopted as a common research standard, thereby
imparting and codifying the now conventional disciplinary gaze: the observers’ eyes were trained to see
society only at a particular point in time. Ethnography, in its obsession with reifying the authenticity of
peoples, fabricated a closed temporal frame: a present present or a past present, as Evans-Pritchard (1962)
termed it in a critical commentary on the discipline’s ‘breach with history,’ illuminating the distortions
which flow from the imposition of a priori temporal limits upon ‘traditional societies.’

But it was precisely this aversion to history that led anthropologists to discover ‘memory’ as a valuable
ethnographic tool. Evans-Pritchard (1962), for instance, suggested that the past was always
‘incapsulated in a context of present thought’: the past, as memory, was embedded in material, ritual, and
narrative practices and therefore ‘part of the social life which the anthropologist can directly observe’ (pp.
177–8). The pursuit of collective memory was to constitute a major change in ethnographic practice:
the ‘life story’ of local worlds—of bodies, magic, and markets—could be rendered intelligible through the
oral archive. But in the colonizing imagination of Western anthropologists, the work of memory was
consigned to the uneventful register of structural, mythical, and sacred time: an order of signs without
history. Peoples’ means of remembering traumatic events (famine, migration, conquest) ‘were always
seen to reinforce the system in place, never to transform it’ (Comaroff and Comaroff 1992, p. 21). In a
shift from historical past to ethnographic present, anthropologists erected ‘counterfeit signposts’ of a
primitive world of enduring traditions—an analytic fiction. The anthropology of collective memory, in
consequence, began as a kind of anti-history.

3. The Shape of Memory: Modes of Historical Consciousness

In their ethnographic explorations, anthropologists encountered indigenous memory forms that did not
seem to fit the European idea of historical time: an objective, unmediated, linear account of past events.
By a focus on myths, ritual, oratory, and body practices, they uncovered the narrative and
performative possibilities of a history of the present: the social sense of time whereby people attempted ‘to grasp the
world as both a synchronic and a diachronic totality’ (Lévi-Strauss 1966, p. 263). In the realms of memory,


past and present—although theoretically distinct—were conjoined to link, from generation to generation,
the living with the dead.

Among Australian aborigines, for instance, historical or commemorative rites ‘recreate the sacred and
beneficial atmosphere of mythical times—the ‘dream age’—mirroring its protagonists and their great deeds,’
and transport ‘the past into the present’; rites of mourning ‘correspond to an inverse procedure: instead
of charging living men with the personification of remote ancestors, these rites assure the conversion of
men who are no longer living into ancestors,’ thereby bringing ‘the present into the past’ (Lévi-Strauss 1966,
pp. 236–7).

These images of time are different, perhaps, but not ‘other.’ In the rites performed, past and present are
propelled into a social space—the body. The act of remembering takes corporeal form: memory is
embodied (Connerton 1989). But this transposition of temporal worlds, which connects the living and dead
to the ancestors via the mimetic power of embodiment, is not an abrogation of time. It is a generative
reconstitution of society: a practice of empowerment, a political engagement with the matrix of life (Stoller
1995).

Other anthropological studies suggest that collective memory may be systematized on the basis of relative
temporalities, that is, a flexible system of polymorphous reference points. Among certain peoples of the
Ivory Coast, like the Guéré, ‘the consciousness of a historical past has developed alongside a multiplicity
of other times’ (Le Goff 1992, pp. 8–9): mythical, genealogical, lived, and projected time. Such a register
of memory forms belongs to the aftermath of the colonial project. Under the impact of prolonged social
trauma, the disparate logics of time and memory chart the precarious matrices of life: a synchrony of struggle,
violence, and perseverance.

4. Memory-sites: The Localization of Remembrance

The anthropological analysis of collective memory moves toward a renunciation of linear temporality in
favor of multiple kinds of time as experienced at the levels where the individual takes root in the social:
language, demography, economics, politics. Memory production is in this sense not dependent on a single
individual’s experience of the past. A remembrance, a social understanding of events that is represented as
memory, can be constructed by sharing with others ‘sets of images that have been passed down to them
through the media of memory—through paintings, architecture, monuments, ritual, storytelling, poetry,
music, photos, and film’ (Watson 1994, p. 8). In every society, we can identify an array of memory-sites
or places of commemorative record and practice where remembrance anchors the past: topographical
places (archives, libraries, museums); monumental places (cemeteries, architectural edifices); symbolic
places (commemorative rites, pilgrimages, emblems); functional places (manuals, autobiographies,
associations); and places of power (states, elites, milieux) which ‘constitute their historical archives in relation to
the different uses they make of memory’ (Le Goff 1992, pp. 95–6, Nora 1986). These memory-sites furnish a
series of locations where knowledges of the past are conveyed and sustained by a circulation of signs that
calls attention to its own logic of inclusion, exclusion, and selective incompleteness.

5. Unsanctioned Archives: Memory and Countermemory

The anthropology of collective remembrance is not only concerned with affirmation and heritage, but with
processes of contestation, the aftermath of violence, and the emergence of countermemories (Stoller 1995).
A social history of remembering must take account of ‘the mechanisms that make unsanctioned
remembrance possible’ under repressive political regimes: in asking how people remember what is meant to be
forgotten, ‘matters of transmission’—the mechanics of shared memory and hidden histories—take center
stage (Watson 1994). Here, interpersonal memories, commemorations, theater, drama, folklore, secret,
and oppositional histories are ‘the venues within which alternative remembrances’—unauthorized and
unapproved memories of the past—can be located and analyzed (Watson 1994, p. 2). Collective memory
practices are not only linked to sites of domination that attempt to legitimate a given social order, but also
to unsanctioned sites of struggle, opposition, and transformation.

6. The Traffic in Memory: Global Hegemonies and Local Struggle

In all societies, there exist distinct moments when new representations of the past are forged, contested, and
put to cultural and ideological use. Battles waged over ethnic origin or national identity may be linked to the
constitution of radically new kinds of memory-archives (Hobsbawm and Ranger 1983). The creation
of nations, ethnicities, and subjects inevitably alters the shape of the past. ‘Memory’ is therefore not a
generic term of analysis, but itself an object appropriated, transformed, and politicized. Or, put
differently, memory can be nationalized, medicalized, aestheticized, gendered, bought, and sold.

Historical consciousness shaped by nation-building, scientific ideology, and global capitalism plays a
significant role in the formation of social identities. Detailing the politics of remembrance in postsocialist
China, Schein (2000) shows how state-orchestrated

2221
Collectie Memory, Anthropology of

reforms, staged against the chaos of the Cultural missionary records. The Hawaiian confiscation of
Revolution and implemented during a period of these colonial memory-stores proceeded, as Friedman
increasing exchange and fascination with the West, documents, by forging an indigenous past uncoupled
produced a popular appetite for exotic others: com- from the larger world, a Western-imposed modernity,
modified images of minority peoples, like the Miao, which had obliterated a population and absorbed its
were fashioned by an acute nostalgia for ‘ancient’ social history into the projects of global economic
traditions and those feminized cultural essences on systems.
which to recraft a distinctive Chinese nationness. The Anthropologists have firmly established the need to
result was a boom in memory recuperation among attend to these paradoxes of mobile cultural and
China’s urbanites and Miao elites, who simultaneously symbolic forms—this traffic in memory—during
played to a Western valorization and high-spending periods of transnational crisis and restructuring
tourist consumption of dwindling pasts. But such a (Appadurai 1996). In the context of global systems,
newly forged cultural attachment to memory artifacts, where the practice of identity is brought on stage, the
the ‘fossil bed’ phantasms of folk heritage, also entails logic of time and the meaning of memory have been
transformation. In other words, the work of symbolic radically transposed. In the global arena, the pro-
restoration ‘may conceal very new projects’ (Le Goff duction of historical consciousness becomes increas-
1992, p. 9). The recruitment of the past for revolu- ingly entangled with the commodity form: encoded by
tionary or political ends, as in the case of China, temporal longings and mythic value, it furnishes the
is entangled in a twofold venture: the forging of common stock for a memory market saturated by
a distinctive ethnonational imaginary and the re- cargo-type products of surplus and lack, bearing the
negotiation or recalibration of place in the global deep imprint of the globally hegemonic mode in which
order. people’s ethnonational remembrances are manu-
The pursuit, rescue, and celebration of collective factured, exchanged, and consumed (Taussig 1980).
memory is always socially motivated and has, thus, But memory consumption can also be a subordinate
to be understood in positional terms. The global production, one that creatively and tactically resists
context—the theater of transnational consumer the homogenizing tendency of centralized systems.
capitalism—is, according to Friedman (1994), par- This suggests, as Schein (2000) observes, an alternate
ticularly relevant in understanding how local memory sense for cultural production: ‘the myriad ways in
forms are affected by the formative impact of world which people make use of received [memory] products
systemic processes. In the case of modern Greece, as presents a measure of possibility, not only for auton-
Friedman shows, national selfhood was crafted by omy, but also for subversion of the dominant order’
internalizing the ways in which European elites, in (p. 16). Put differently, the modalities of collective
constructing their own ‘civilized’ origins, mytholo- memory practice attest to the possibilities of symbolic
gized classical antiquity as the cradle of Western creatiity—even in the province of globalizing
civilization. Greek nationalists found their past in the modernities.
institutional memory of the West, thereby accelerating
incorporation into an expanding European world See also: Collective Memory, Psychology of; Cul-
system. Such acts of cosmological repositioning are tural History; Expressive Forms and the Evolution
symptomatic of a transglobal system that fuses socio- of Consciousness; History and Memory; Memory:
economic transformation and collective memory to Collaborative; Memory Retrieval
‘the reconfiguration of the map of the world’s peoples’
(Friedman 1994, p. 123).
Yet even in the global arena, there exist other
possibilities of memory production. In the Hawaiian Bibliography
case, Friedman’s instructive contrast, indigenous com- Appadurai A 1996 Modernity at Large. University of Minnesota
munities attempted to extricate themselves from West- Press, Minneapolis, MN
ern dominance by projecting a value system, produced Cohn B S 1981 Toward a rapprochement. Journal of Inter-
in the modern context, onto an aboriginal past. The disciplinary History 12 (2): 227–52
result was a body of memories composed of mythic Comaroff J, Comaroff J 1992 Ethnography and the Historical
history, a work of folklore and folklorization ‘rife with Imagination. Westview Press, Boulder, CO
simulacra of tropical fantasies.’ Social and economic Connerton P 1989 How Societies Remember. Cambridge Uni-
marginalization, in the aftermath of World War II, versity Press, Cambridge, UK
gave rise to an intense period of Hawaiian cultural de Certeau M 1988 The Writing of History. Columbia University
Press, New York
revival. Recuperation efforts, centered on the recon- Evans-Pritchard E E 1962 Social Anthropology and Other Essays.
stitution of ethnic history by retrieving precolonial The Free Press of Glencoe, New York
social forms, brought into play a museology of memory Friedman J 1994 Cultural Identity and Global Process. Sage,
that was staged in opposition to the West: a precontact London
Hawaii stocked with all the items to be found in Halbwachs M 1980 The Collectie Memory. Harper & Row,
Western historical archives—libraries, museums, New York


Hobsbawm E, Ranger T (eds.) 1983 The Invention of Tradition. Cambridge University Press, Cambridge, UK
Le Goff J 1992 History and Memory. Columbia University Press, New York
Lévi-Strauss C 1966 The Savage Mind. University of Chicago Press, Chicago
Nora P 1986 Les lieux de mémoire. Gallimard, Paris, 3 Vols.
Schein L 2000 Minority Rules. Duke University Press, Durham, NC
Stoller P 1995 Embodying Colonial Memories. Routledge, New York
Taussig M T 1980 The Devil and Commodity Fetishism in South America. University of North Carolina Press, Chapel Hill, NC
Watson R S (ed.) 1994 Memory, History, and Opposition under State Socialism. School of American Research Press, Santa Fe, NM
Wolf E R 1982 Europe and the People Without History. University of California Press, Berkeley, CA

U. Linke

Collective Memory, Psychology of

1. Introduction

Collective memory is a sociological concept, though shot through with psychological presumptions. Its most important theorist is Maurice Halbwachs, a second-generation student of Emile Durkheim, who wrote Les Cadres sociaux de la mémoire (The Social Frameworks of Memory 1992) and La Mémoire collective (Collective Memory) ([1950], 1992). Halbwachs and those working in the collective memory tradition posit that not only do individuals remember, but collectivities as well. Just as there is great interest among psychologists, psychoanalysts, and neuroscientists to discover, describe, and specify the processes through which individuals remember—how, what, when, and why remembering occurs—so too do social scientists—sociologists, historians, anthropologists, political scientists—attend to the social processes through which collectivities—families, groups, nations—recover the past, conceptualize it through narrative structures, and memorialize it through memorials, museums, rites, and other commemorative forms.

Halbwachs’ essay was first published in English in 1980 with an introduction by Mary Douglas, herself a neo-Durkheimian. In her introduction, Douglas drew the link between Halbwachs’ concept and the research of the British psychologist Sir Frederick Bartlett, who, in Remembering: A Study in Experimental and Social Psychology (1932), contextualized memory by arguing that what is remembered of the past depends upon frameworks or schemas of understanding imposed upon it from the present. Since the publication of Halbwachs’ writings in English, the interest in collective memory has greatly accelerated the development of the themes defined by Halbwachs, Bartlett, and Douglas and their influence on disparate academic fields. In the last decades of the twentieth century, collective memory generated a large literature focusing on the relation between the past and the present, raising questions especially about how the present influences the reconstruction of the past and the ways in which historical events and circumstances impinge or intrude on the present. These issues have often been addressed through empirical studies of particular rites, festivals, commemorations, and memorials in different national and historical settings.

What follows is a discussion of the sources of convergence and divergence in the psychology of collective memory as expressed in the academic literature and as it impacts on these broader politics of memory. There is widespread agreement concerning both the reality of collective memory and its malleability, i.e., memory’s susceptibility to revision, manipulation, etc. However, differences among scholars exist as well, focusing particularly on the question of the motives and interests that lie beneath the phenomenon and determine its form. A neo-Durkheimian framework will be characterized where the past is understood as a symbolic resource often employed to reduce tensions and strains presently confronting the collectivity; collective memory is understood as a complex social process—including instrumental and symbolic struggles by various members of the group over the definition of the past—to strengthen the ties that bind individual members together within the collectivity. Others adopt a neo-Freudian perspective where tensions and strains within a present-day collectivity are largely understood as deriving from the unavoidable intrusion of a traumatic past. The past is the source, rather than a resource, of challenges confronting a present-day collectivity. Collective memory, here, is often seen as an impressive impediment to the establishment of a healthy, post-traumatic community.

2. Individual and Collective Memory: Convergence and Consensus

Students of collective memory, like researchers on individual memory, understand memory to be a constitutive feature of the present. In both, memory is distinguished from history; the former describes a process of reconstruction, occurring ex post facto and possibly bearing little relation to the historical past being remembered. Nonetheless, an understanding of oneself in relation to an ‘as if it were true’ past is believed to be an essential feature of individual life. Personal identity depends on locating oneself in a particular past in which the present is bounded and made meaningful through history. The person experiences him or herself in a continuous, comprehensible relation to a past; an orientation toward the present and the future is only coherent if in relation, in part, to an understood past. Collective memory similarly appreciates the temporal dimension of social life, where sociality depends on a past to support and give continuity and meaning to a present. No collectivity can function without a sense of its own traditions and its own continuity with the past (Shils 1983, Elias 1994). Where no such past exists—as in new collectivities, for example—traditions, as Eric Hobsbawm has argued, are ‘invented.’ ‘The invention of tradition’ (Hobsbawm 1983) is a social process in which memory is confabulated on behalf of a present requiring a history, for the purpose of creating a collective or national identity. But if collective memory emphasizes the diachronic relation between a present that requires a past, it also underscores the fundamentally social and synchronous character of memory formation. Collective memory concerns itself with dialogical, discursive processes occurring as a result of communication between various groups and institutions comprising the collectivity.

Just as the autobiographical memory of the individual is not identical to the history being recalled—always subject to reformulation, reconstitution, and distortion as a result of the temporal distance between remembering and what is remembered (Schachter 1995, Loftus and Ketcham 1994)—so, too, is collective (or historical) memory vulnerable to similar possibilities of confabulation, revision, and remodeling (Kammen 1995, Schudson 1995). Remembering, whether individually or collectively, always involves the representation after the fact of past events or experiences, never the return to the actual past. Freud himself described this parallel process between individual and collectivity when he wrote:

If we do not wish to go astray in our judgment of their historical reality, we must above all bear in mind that people’s ‘childhood memories’ are only consolidated at a later period, usually at the age of puberty; and that this involves a complicated process of remodeling, analogous in every way to the process by which a nation constructs legends about its early history. (1913, p. 206n)

Particular memories, rather than being an enduringly stable feature of either individual or group life, are continually susceptible to instability and disruption. However, temporal distance is not the only explanation for the breach between history and memory. Interest and/or affect—present-day requirements of the individual or the collectivity—dictate the need for remembering. The imperatives of the present are decisive, it is believed, in the nature and character of remembering that occurs. As those present-day needs change, or are altered by circumstance, so too are memories altered. While there is probably not complete alteration, memory researchers are impressed by the capacity for substantial revision to occur.

3. The Neo-Durkheimian Perspective: The Embeddedness of Collective Memory

American sociology of collective memory has been dominated by a ‘presentist’ psychology, building upon, refining, and revising elements of Halbwachs’ insight concerning the present’s contribution to the recovery of the past. Here, research has focused on the relation between memory as a mechanism of response and contemporary strains in the collectivity and other social practices similarly related to responding to tensions within the social fabric. The unifying idea behind this research rests on the importance of a common ‘collective consciousness’ to social solidarity and cohesion—as originally defined by Durkheim (1997)—and the important role that memory plays in constituting that consciousness. Memory, it is argued, is ‘embedded’ (Prager 1998) in ‘mnemonic communities’ (Zerubavel 1996), meaning that it is constituted, elaborated, and altered in response to social processes and practices occurring in a collectivity at any given point in time. Armed with an uncontested understanding of the past, collective consciousness is strengthened because of a shared identity with the past; divergent memories, in contrast, express disunity and challenges to social solidarity.

3.1 Rites and Sites of Remembrance

Collective consciousness is heavily structured through historical referents, and the practices of rituals, rites, and commemorations in part serve to strengthen a common sense of the collectivity through shared history. Shils and Young (1953; see also Cannadine 1983), writing about the British coronation ceremonies, and Ozouf (1991), describing ritual forms of performative remembering of the French Revolution, identify the social significance of these forms of political practices and, through careful readings of both the content and the form of commemoration, describe the symbolic process of historical recovery on behalf of present-day political and ideological needs and interests (Connerton 1989). While it has been noted by Bakhtin (1968) and others that commemorations, like other moments of liminality (Turner), have the potential to take on a life of their own (e.g., Davis 1977) by either anticipating the restructuring of the collective consciousness or promoting social division, collective memory research has largely concerned itself with the particular stabilizing effects of past events remembered (e.g., Olick 1999, Schwartz 1996, Wagner-Pacifici 1991, Zelizer 1992) and its contribution to the collective consciousness. Nora (1989) calls attention to les lieux de mémoire, or sites of memory, highlighting the necessity in the modern world to locate special sites for memory to occur. Because of the discontinuity in modern societies between memory and history—in contrast to traditional societies where ‘real environments of memory’ exist—memory sites occupy sacred ground self-consciously constructed to mark the present’s continued link with its past and to combat forgetting.

3.2 Frameworks of Remembering

Contemporary researchers have identified narrative, discursive, and other semantic features of collective memory to demonstrate the socially embedded, constructivist, and presentist quality of collective memory. In Presenting the Past, Prager (1998) describes how individual memory relies upon available cultural narratives to provide context and meaning to what otherwise would be relatively meaningless and random events in an individual’s life (see also Schudson 1995). These narratives—for example, a tale of victimization or of physical or sexual abuse—speak to contemporary understandings of the ways in which the past informs or frames the present. Collective memory, too, relies on culturally provided narratives coding the past so as to make it meaningful and available to the present. The past can only be remembered discursively and rhetorically, as part of a dialogical process between social members or organizations collectively remembering (Middleton and Edwards 1990, Douglas 1986, Shotter 1990).

3.3 History as Constraint

‘The past in the present versus the present in the past’ (Schudson 1989) captures one important emendation to Halbwachs’ presentism, arguing against a too strong unmooring of the present from the past. Seeking to explicate the ways in which history imposes certain constraints upon what can be remembered, researchers have sought to restrict an unbridled social constructivism seemingly encouraged by Halbwachs and others. Collective memory, despite its role as a resource for legitimation, cannot completely override history. As Schudson argues, the past imposes itself on memory (a) through the strength of prior remem-

that the reinterpretation of the past de novo is inhibited by counteracting social imperatives to understand the present as continuous with and coherent with what has preceded it. Finally, Olick (1999) strikes a similar cautionary note by emphasizing the ways in which earlier commemorations of historical events themselves serve as constraining influences on current memory. He refers to the dialogue between current memorializations and prior ones as constituting ‘genre memories,’ in which the present is discursively steeped in these mediated pasts.

4. Neo-Freudianism: Trauma and the Embodiment of the Past in the Present

Neo-Durkheimian approaches to collective memory emphasize the continuity of the present to the past, underscoring the important role that remembering plays in the constitution of the present. A neo-Freudian perspective, in contrast, insists that collective memory is a social process in response to social ruptures, or discontinuities, that have occurred in the past that, because not fully assimilated in conscious experience, subsequently interfere with the smooth functioning of collective life. The title of Friedlander’s autobiographical work, When Memory Comes (1991), an account of his discovery of his Jewish roots in pre-Holocaust Czechoslovakia, captures this idea that memory is an imperative, driven from the past, intruding upon and interfering with the present. Memory, as the title suggests, will come. Memory and Freud’s ‘return of the repressed’ are complementary concepts: together they define the present as always susceptible to intrusions from the past. Collective memory expresses not merely social practices but rather processes driven by psychological law.

Friedlander (1992) acknowledges the large debt that scholars of the Holocaust owe to Freud when they invoke his ideas in the remembering and understanding of this event. The study of the Holocaust, it should be noted, has emerged as the paradigmatic literature for those researchers studying the phenomenon of collective memory in various world settings
