

Journal of Experimental Psychology: Animal Behavior Processes
1985, Vol. 11, No. 3, 429-452
Copyright 1985 by the American Psychological Association, Inc. 0097-7403/85/$00.75

Variability Is an Operant

Suzanne Page and Allen Neuringer
Reed College

This article was derived from an undergraduate thesis submitted by the first author to Reed College. We thank Scott Gillespie, Rick Wood, Steve Luck, and Richard Crandall for invaluable technical advice and assistance and Barry Schwartz for helpful discussion and suggestions. Order of authorship was determined by a quasi-random process. Requests for reprints should be sent to Allen Neuringer, Department of Psychology, Reed College, Portland, Oregon 97202.

Pigeons were rewarded if their pattern of eight pecks to left and right response
keys during the current trial differed from the patterns in each of the last n trials.
Experiments 1 and 2 compared Schwartz's (1980, 1982a) negative findings
(variability was not controlled by reinforcement) with the present positive results
and explained the difference. Experiment 3 manipulated n and found that the
pigeons generated highly variable patterns even when the current response
sequence had to differ from each of the last 50 sequences. Experiment 4
manipulated the number of responses per trial; variability increased with increasing
responses per trial, indicating that the pigeons were acting as quasi-random
generators. Experiment 5 showed that for high levels of variability to be engendered,
reinforcement had to be contingent on response variability. In a yoked condition,
where variability was permitted but not required, little response variability was
observed. Experiment 6 demonstrated stimulus control: Under red lights the
pigeons generated variable patterns, and under blue lights they repeated a
particular fixed pattern. We concluded that behavioral variability is an operant
dimension of behavior controlled by contingent reinforcement.

Is response variability controlled by contingent reinforcers, as are other behavioral dimensions, such as response rate, location, duration, force, and topography? That is, can behavioral variability be increased or decreased by reinforcers contingent on such increases or decreases? Variability is necessary for many behavioral phenomena. The process of operant shaping depends on a variable substrate (Skinner, 1938). Successive approximations to some goal response are selected for reinforcement, and without sufficient variation, selection is difficult or impossible. Behavioral variability is also important for problem solving and creativity. It would be useful to know, in these instances, whether variability is controlled by its consequences. The question also has theoretical importance. Reinforcers are said to increase the probability of those specific responses that produce them. Thus, reinforcement might inexorably lead to response repetition and therefore to decreased variability. If variability can be reinforced, what function does reinforcement serve?

Schedules of reinforcement clearly affect behavioral variability (Antonitis, 1951; Crossman & Nichols, 1981; Eckerman & Lanson, 1969; Herrnstein, 1961; Lachter & Corey, 1982; Notterman, 1959; Pisacreta, 1982; Schwartz, 1980, 1981, 1982a, 1982b). Intermittent schedules, which reinforce only occasional responses, generally engender higher variability along many dimensions than does reinforcement of every response (e.g., Lachter & Corey, 1982). However, these demonstrations are of respondent effects and are therefore orthogonal to the present question. An analogous case would be to reward rats for running in a running wheel. One might observe changes in the rat's heart rate, but reinforcement was not contingent on heart rate. Whether heart rate could be manipulated through contingent reinforcement is a separate issue (Miller, 1978). So, too, the question of whether behavioral variability can be reinforced is independent of schedule-eliciting effects.

There have been few attempts to reinforce variability directly, and the results are inconsistent. Pryor, Haag, and O'Reilly (1969)
reinforced novel behaviors in two porpoises. Reinforcers were delivered for any movement that in the opinion of the trainer did not constitute normal swimming motions and that had not been previously reinforced. Different types of behaviors emerged, including several that had never before been observed in animals of that species. In a more controlled setting, Schoenfeld, Harris, and Farmer (1966) delivered rewards to rats only if successive interresponse times fell into different temporal class intervals. This procedure produced a low level of variability in the rat's rate of bar pressing. In the most sophisticated and demanding experiment to date, Blough (1966) reinforced the least frequent of a set of interresponse times and obtained interresponse time distributions in three pigeons that approximated a random exponential distribution. Similarly, Bryant and Church (1974) rewarded rats 75% of the time for alternating between two levers and 25% of the time for repeating a response on the same lever and found that for 3 of 4 subjects, the resulting choice behaviors could not be distinguished from chance. People have also been rewarded for behaving variably in more natural environments (Holman, Goetz, & Baer, 1977).

Although the above studies indicate that variability can be reinforced, the conclusion is not secure. When the response in question is a complex behavior, such as the responses studied by Pryor et al. (1969) and Holman et al. (1977), the observer plays an important role in defining novel response, and the novelty observed might be as much a function of the observer as of the subject. Moreover, Schwartz (1982b) has argued that the Pryor et al. findings with porpoises can be attributed to the effects of repeated extinction and reconditioning, with extinction occurring increasingly rapidly in progressive instances, and not to the direct reinforcement of novelty. Because extinction increases variability (e.g., Antonitis, 1951), the variability observed may have been a by-product of the reinforcement schedule. A similar argument can be made with respect to both Blough (1966) and Bryant and Church (1974). Although unlikely, it is possible that the reinforcement schedules somehow elicited interresponse time variability in the Blough experiment and stay-alternate variability in the work of Bryant and Church. Here, too, the argument is that variability was not reinforced (variability is not an operant) but the schedules of reinforcement engendered variability as a respondent by-product. The control experiments necessary to distinguish between the operant and respondent alternatives were not performed.

The most serious evidence against variability as an operant dimension of behavior comes from Schwartz (1980, 1982a). Pigeons were required to generate on two response keys a pattern of eight responses that differed from the last sequence of eight responses. A 5 × 5 matrix of lights provided the pigeons with continual feedback concerning their distribution of pecks. The light started in the upper left-hand corner; each left-key peck (L) moved the light one column to the right and each right-key peck (R) moved the light down one row. In preliminary training, pigeons were simply required to move the light from upper left to lower right. There was no requirement to vary the pattern. If the birds responded more than four times on either key (thereby moving the light off the matrix), the trial terminated without reinforcement. The pigeons were highly successful, obtaining 70% to 90% of available reinforcers by generating repetitive and energy-conserving patterns (e.g., LLLLRRRR). In the experiments of most concern to the present work (Schwartz, 1980, Experiment 4; 1982a, Experiment 1), in addition to requiring four pecks on each key, the schedule demanded that each eight-response sequence differ from the immediately preceding sequence. Now, when the pigeons had to vary their response patterns to be rewarded, only 40% of the available reinforcers were earned. Number of different sequences emitted, an index of learned variability, did not appreciably increase.

The Schwartz findings, though apparently robust, appeared to conflict with those of Pryor et al. (1969), Blough (1966), and Bryant and Church (1974). Furthermore, Neuringer (1984, 1985) showed that when provided with feedback, people learned to generate highly variable response sequences, indeed sequences so variable that they could be described as random. To explore the apparent disagreement between Schwartz's work and these
findings, the present research attempted to reinforce variability in an environment similar to that used by Schwartz, explain Schwartz's negative results, and demonstrate that the variability dimension of behavior is reinforceable and, as with other reinforceable dimensions, sensitive to discriminative stimulus control.

Experiment 1: Variability Versus Variability-Plus-Constraint

The rationale for Experiment 1 derives from the query, How frequently would a random generator be rewarded under the Schwartz (1980, Experiment 4; 1982a, Experiment 1) contingencies where a sequence of eight responses had to differ from the previous sequence for reinforcement? If the Schwartz contingencies reinforced response variability, as was intended, many reinforcers would be given to a random response generator. However, we found that a computer-based random number generator, programmed to respond randomly left or right under Schwartz's contingencies, was rewarded on only 29% of the trials. This random generator was successful on somewhat fewer trials than were Schwartz's pigeons. The random generator's performance can be explained on the following theoretical grounds. The number of distinct eight-response sequences, given two alternatives, is 256, or 2⁸. The number of such sequences meeting the requirement of no more than four responses on each key is 70. Therefore, the probability that a sequence meets the no-more-than-four requirement is 0.27, or approximately the percentage of times that the random generator was successful. According to this reasoning, the no-more-than-four requirement was responsible for the random simulator's low success rate. To determine whether these theoretical considerations can explain Schwartz's findings, the present experiment compared Schwartz's schedule with one in which the requirement of no more than four responses per key was omitted.
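The counting argument can be reproduced directly; the short Python sketch below (an illustration added here, not part of the original procedure) enumerates the 256 eight-response sequences and counts those satisfying the no-more-than-four constraint.

```python
from itertools import product
from math import comb

# All 2**8 = 256 possible eight-response sequences of left (L) and right (R) pecks.
sequences = list(product("LR", repeat=8))

# With eight responses per trial, "no more than four per key" means exactly four on each key.
constrained = [s for s in sequences if s.count("L") == 4]
assert len(constrained) == comb(8, 4)          # 70

print(len(sequences))                          # 256
print(len(constrained))                        # 70
print(len(constrained) / len(sequences))       # ~0.273, close to the 29% success rate of the random generator
```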
Method

Subjects

Three experimentally naive pigeons (Nos. 28, 68, and 71) and 1 with previous experience in conditioning experiments (No. 58) were housed in individual 28-cm × 32-cm × 31-cm cages with grit and water freely available, and they were maintained at approximately 80% of their free-feeding body weights. A 12-hour light-dark cycle was in effect. Sessions were conducted daily unless a bird's weight was more than 20 g over 80% of free-feeding weight.

Apparatus

The experimental chamber measured 28 cm along the sides, 30 cm along the front and rear walls, and 29.5 cm high. The front and rear walls were aluminum, one side wall was pressed board, the other was Plexiglas, and the floor and ceiling were wire mesh. The chamber was housed in a sound-attenuating box equipped with a fan for ventilation and masking noise. A one-way mirror in the outer box, parallel to the Plexiglas chamber wall, permitted observation.

The front wall contained two Gerbrands pigeon keys, each 2 cm in diameter, that were operated by withdrawal of an applied force. The keys were 21.5 cm above the floor; one key was directly above a 4.5-cm × 5.5-cm hopper opening that was itself 7.5 cm above the floor and centered on the panel, and the second key was 4.5 cm (center to center) to the right of the first. A Gerbrands food magazine delivered mixed pigeon grain through the hopper opening. Each key could be transilluminated with a 7.5-W white light, and a 7.5-W white house light was located above the mesh ceiling, directly above the center key. The house light was continuously illuminated except during reinforcement, when the hopper was lighted by a 7.5-W white light.

A 5 × 5 matrix of yellow 5-W cue lights, each 0.75 cm in diameter, was located along the wall to the left of the response keys. The last column of the matrix was 4 cm from the front wall. Columns were 2.5 cm, center to center, and rows were separated by 2 cm, center to center. Each of the 25 lights could be separately illuminated. Stimulus events were controlled and responses were recorded by a Commodore VIC-20 computer through an integrated circuit and relay interface. Data were recorded on cassette tape at the end of each session and were later transferred to a Digital Equipment PDP-11/70 computer for analysis.

Procedure

Pretraining. All pigeons were trained to peck the keys under a modified autoshape procedure derived from Schwartz (1980). After variable intertrial intervals (keys dark) averaging 40 s, one of the two keys, randomly selected, was illuminated for 6 s, followed by presentation of grain reinforcement for 4 s. Pecks to a lighted key immediately produced the reinforcer and darkened the key. Autoshape training continued for four to six sessions after the first session in which a pigeon pecked both lighted keys. Each session terminated after the 60th reinforcer.

A comparison of two main conditions followed. In the variability (V) condition, a sequence of eight left and right responses had to differ from the previous sequence for reinforcement. This was compared with a variability-plus-constraint (VC) condition, where, in addition, exactly
four responses had to occur on each key, a condition analogous to Schwartz (1980, Experiment 4; 1982a, Experiment 1).

Variability: Lags 1 and 5. Each trial consisted of eight pecks distributed in any manner over the two keys; each session comprised 50 trials. At the start of each trial, both keys were illuminated with white light. A peck to either key immediately darkened both key lights and initiated a 0.5-s interpeck interval. Pecks during the interpeck interval reset the interval but were not counted and had no other consequence. The eighth peck to a lighted key terminated the trial with either a reinforcer (3.5 s of access to mixed grain) or a 3.5-s time-out, during which the keys were darkened but the house light remained illuminated. Pecks during the time-out reset the 3.5-s timer but had no other consequence. The only difference between time-out and interpeck interval was the duration of these events. Under the Lag 1 condition, whether a trial ended in reinforcement or time-out depended on whether the sequence of eight pecks on that trial differed from the last sequence of eight pecks, that is, whether the response pattern on trial n differed from trial n - 1. If the sequences on the two trials differed, a reinforcer was given; if the sequences were identical, time-out occurred. Thus, for example, if on Trial 10 the bird pecked LRLLLRRL, Trial 11 would be reinforced unless the bird repeated that sequence. The three experimentally naive birds (Nos. 28, 68, and 71) received 5 to 13 sessions of this Lag 1 training. Because most trials ended with a reinforcer (the pigeons successfully varied their sequences), greater variability was demanded by increasing the look-back value to Lag 5. Now, for reinforcement, the response sequence on trial n had to differ from each of the sequences on trials n - 1 through n - 5, inclusive; otherwise time-out occurred. The one previously experienced bird (No. 58) began the experiment with Lag 5 contingencies. Subjects received 15 to 18 sessions of Lag 5 until the percentage of reinforced trials remained stable over at least 5 sessions. The matrix lights were not used under these variability procedures.

Variability-plus-constraint: Lag 1. Except where noted, the contingencies and parameters here were the same as in the Lag 1 variability condition. As in Schwartz (1980, Experiment 4; 1982a, Experiment 1), to be reinforced, the birds had to peck exactly four times on each key with a sequence that differed from the last trial. Thus, there were two ways for a subject to produce time-out: (a) Respond four times on the left and four times on the right with a sequence that was identical to the last sequence, or (b) respond more than four times on either of the keys. The major difference between the V and the VC conditions was that the eight responses could be distributed in any manner under the former schedule, whereas exactly four responses per key were required under the latter.

The 5 × 5 cue light matrix functioned as in Schwartz. At the start of each trial, the top left matrix light was illuminated. A peck to the left key darkened that light and lit the one to its right. Pecks to the right key darkened the presently illuminated light and lit the one immediately below it. A fifth peck to either key moved the matrix light off the board, darkening the currently illuminated light and initiating time-out. The VC procedure was continued until the percentage of reinforced trials for each subject was stable over 5 sessions, or 22 to 31 sessions.
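The matrix display follows a simple rule; the minimal sketch below (hypothetical names, the original ran on the VIC-20) tracks the lit cell for one VC trial.

```python
def run_vc_trial(pecks):
    """Track the lit cell of the 5 x 5 matrix for one VC trial.

    Each left-key peck ('L') moves the light one column right; each right-key
    peck ('R') moves it one row down. A fifth peck on either key moves the
    light off the board and the trial ends in time-out."""
    col = row = 0                      # top left cell lit at trial start
    for peck in pecks:
        if peck == "L":
            col += 1
        else:
            row += 1
        if col > 4 or row > 4:
            return "time-out"          # light moved off the matrix
    return "sequence complete"         # eight pecks, four on each key

print(run_vc_trial("LLLLRRRR"))   # sequence complete
print(run_vc_trial("LLLLLRRR"))   # time-out (fifth left-key peck)
```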
Return to variability: Lag 5. The Lag 5 variability contingencies used in the first phase were reinstated for 6 to 20 sessions. The eight responses could again be distributed in any manner; that is, the requirement of no more than four responses per key was removed. This variability phase differed from the original in that during the first five trials of each session, the bird's sequences had to differ from the last five sequences, including sequences emitted during the last five trials from the previous session. That is, the comparison set was made continuous over sessions so that, for example, the fourth trial in a session would be reinforced only if the fourth sequence differed from the preceding three sequences of that session and the last two sequences of the previous session.

Owing to a programming error, the timing of experimental events (reinforcement, time-out, and interpeck interval) was altered when the comparison set was made continuous over sessions: All timed events were lengthened by a factor of 10/6; reinforcement and time-out were 5.83 s and interpeck interval was 0.83 s. The effect of these changes was examined in this experiment and more directly in Experiment 5, below, which shows that these timing parameter changes had no discernible effect on sequence variability.

After completion of the research with pigeons, the VIC-20 computer's random number generator was used to generate left and right responses under both V and VC conditions identical to those experienced by the pigeons. This simulation by a random generator permitted comparison of the pigeon's sequence variability with a random standard.¹

¹ Computer-based random number generators are often referred to as pseudorandom or quasi random because the output is generated from an equation. For ease of presentation we shall refer simply to the random generator.
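The look-back contingency is easy to state in code; the sketch below (a Python reconstruction, not the original VIC-20 program) reinforces a trial only when the current sequence differs from each of the last n sequences, and uses a random generator to estimate the expected success rate under the V condition.

```python
import random

def reinforced(current, history, lag):
    """A trial is reinforced only if its sequence differs from each of the last `lag` sequences."""
    return all(current != previous for previous in history[-lag:])

def simulate_session(trials=50, lag=5, length=8, seed=0):
    rng = random.Random(seed)
    history, rewarded = [], 0
    for _ in range(trials):
        sequence = "".join(rng.choice("LR") for _ in range(length))
        if reinforced(sequence, history, lag):
            rewarded += 1
        history.append(sequence)
    return rewarded / trials

print(simulate_session(lag=5))   # well above .9; the article's simulated generator earned 96% under V
```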
Results

Two basic measures were derived from Schwartz (1980, 1982a). First, percentage of reinforced trials per session (number of reinforced trials divided by total trials) indicated how often the pigeons met the contingency requirements, or the probability that the sequence was correct. The second measure, which provided a more direct index of variability, was the percentage of different sequences per session (the number of distinct eight-response patterns divided by the total number of trials). A sequence was termed distinct if it differed in any way from all previous sequences in a given session. Note that subjects could demonstrate mastery of the contingency requirement (percentage of reinforced trials could be very high) even
though the percentage of different sequences remained low. Under Lag 1 conditions, for example, where a sequence had to differ only from the last trial, if a bird alternated between two sequences, all trials would be reinforced. However, percentage different would be 2/50, or 4%, a low value.
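Both session measures reduce to simple counts over a session's list of trial sequences; the brief sketch below (illustrative only, with comparisons restricted to the current session for simplicity) computes them and reproduces the alternation example above.

```python
def session_measures(sequences, lag):
    """Percentage of reinforced trials and percentage of different sequences for one session."""
    reinforced = sum(
        all(seq != sequences[j] for j in range(max(0, i - lag), i))
        for i, seq in enumerate(sequences)
    )
    distinct = len(set(sequences))    # sequences that differ from all earlier ones in the session
    n = len(sequences)
    return 100 * reinforced / n, 100 * distinct / n

# A bird alternating between two sequences under Lag 1: every trial is reinforced,
# yet only 2 of 50 sequences are distinct (4% different).
session = ["LLLLRRRR", "RRRRLLLL"] * 25
print(session_measures(session, lag=1))   # (100.0, 4.0)
```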
Figure 1 shows the percentage of reinforced trials for each of the four subjects averaged over the last five sessions of each condition. The bars show medians for the 4 subjects' performances. (This same convention will be used in all following figures, except where otherwise stated.) Under the VC condition, shown by the middle bar, 42% of trials were reinforced per session, whereas under the Lag 5 V conditions, shown by the first and last bars, more than 90% of all trials were reinforced. (The median percentage reinforced for the 3 subjects that experienced Lag 1 V training was a similarly high 94%.)

Figure 1. Percentage of reinforced trials per session during variability (V), variability-plus-constraint (VC), and replication of variability (V) conditions. (Filled points = arithmetic averages over the final five sessions; bars = medians of the pigeons' performances; open diamonds = simulated performance from a computer-based random number generator.)

Performances under the original and replicated V conditions did not differ statistically, and for purposes of further statistical analyses, the initial and replicated V data were averaged and then compared with VC data. The percentage of reinforced trials under V was significantly higher than the percentage of reinforced trials under VC, t(3) = 14.44, p < .001.

Figure 1 also shows that the performance of a simulated random generator (open diamonds) yielded percentage reinforcement values of 96% under the V condition and 29% under VC. In terms of successfully meeting the schedule requirements, the pigeons and random generator were similarly affected by the conditions of this experiment.

Figure 2 shows that during the last five sessions of V and VC conditions, respectively, more than 70% of response patterns differed from all previous sequences in the session. The two conditions did not differ statistically. Two pigeons showed a decrease in percentage of different sequences from initial V to VC conditions and a subsequent increase when V was reinstated; all 4 pigeons showed an increase of percentage difference scores from VC to the replicated V condition. This slight tendency for there to be a greater number of different sequences under the V condition was also observed in the simulated bird's performance, shown by the open diamonds.

Because the output of the random generator could not have been affected by the reinforcers, the only explanation is that there were fewer possible sequences under VC (where trials were terminated by a fifth response to either key) than under V (where all trials required eight responses).

Discussion

Under variability-plus-constraint conditions, only 42% of trials were reinforced, a finding that essentially replicates Schwartz (1980), in which 36% of trials ended in reinforcement. In sharp contrast, under the variability condition, 90% of trials were reinforced. The main question for discussion is why V and VC conditions yielded such different success rates. The procedural differences between the two conditions complicate interpretation: Matrix lights were present under VC and not V; sequences had to differ from the last trial under VC and from the last five trials under V; a total of 70 sequences were potentially reinforceable under VC (all sequences of eight responses having exactly
four responses on each key), whereas 256 sequences were reinforceable under V (all eight-response sequences); and there were two ways to commit an error under VC (repeat the last sequence or emit more than four responses on a given key), whereas only the former error was possible under V.

Figure 2. Percentage of different sequences (those differing from all previous sequences in a session) emitted by each subject under variability (V), variability-plus-constraint (VC), and replicated variability (V) conditions. (Filled points = arithmetic averages over the final five sessions; open diamonds = simulations from a computer-based random generator; bars = medians of the pigeons' performances.)

The fact that the simulating random generator also was almost always correct under V but only infrequently correct under VC helps to explain the obtained results. The random generator was not responding to the matrix lights. Furthermore, a random generator would be expected to gain slightly more reinforcers under a Lag 1 contingency (VC) than under a Lag 5 contingency (V). Therefore it is unlikely that either lights or lag can account for the different results under V versus VC. The remaining possible explanations involve (a) the difference in the numbers of reinforceable sequences and (b) a related factor, the different ways to make errors under the two conditions. If a pigeon responded randomly under the VC contingencies, only approximately one third of the trials would end with a reinforcer. On the other hand, the same random responses would be reinforced on more than 99% of trials under an analogous Lag 1 contingency where the constraint of no more than four responses per key was absent.

An additional analysis of the computer-simulated data showed that during 250 simulated trials under the VC condition, whenever a trial was not reinforced, it was because more than four responses had been made to one key. Further analysis of the pigeons' data showed the same effect. Over the last five sessions of the VC condition, for 3 of the subjects, more than 99% of the nonreinforced trials were due to a fifth response on one of the two keys; less than 1% of the errors were due to a repetition of the last sequence. The fourth pigeon (No. 68) was a slight exception, with 81% of nonreinforced trials due to more than four responses on a key and 19% due to repetition.

Schwartz (1982a, p. 177) concluded that "reinforcement of variable response sequences in pigeons does not succeed." The present analysis suggests that the lack of success was due to the presence of the arbitrary four-responses-per-key constraint.

Experiment 2: Exact Replication of Schwartz

Parameters differed among the variability-plus-constraint and variability procedures in Experiment 1 and the Schwartz procedures. The present experiment therefore attempted to repeat the present Experiment 1 with parameters in both V and VC conditions as close as possible to Schwartz (1982a, Experiment 1). This provided a conclusive test of whether the four-response-per-key constraint was responsible for the low frequencies of reinforcement under Schwartz.

Method

Subjects and Apparatus

The subjects and apparatus were the same as in Experiment 1.

Procedure

Variability plus constraint. The procedure was identical to VC in Experiment 1 (eight responses per trial, four
responses required on each key, Lag 1 look back), except that, as in the work of Schwartz (1982a), (a) there was no interpeck interval (i.e., a free-operant procedure was employed); (b) reinforcement consisted of 4 s of access to mixed grain; and (c) all trials, both reinforced and not, were followed by a 0.5-s intertrial interval. The subjects received 17 to 20 sessions, until performances were stable.

Variability. This procedure was identical to the above VC except in two respects. Trials did not terminate until the eighth peck; that is, the requirement of no more than four responses per key was omitted, as in the V condition in Experiment 1. Second, although matrix lights were used for feedback, it was necessary to reverse the direction of the lights whenever more than four pecks were emitted on a given key. For example, each of the first four pecks to the left key moved the point of matrix illumination one position to the right; each additional left-key peck moved the illumination one position back to the left. An analogous reversal occurred for the up-down direction as a function of right-key pecks. Subjects received 15 sessions under this procedure.

Results

The left portion of Figure 3 shows average percentage of reinforced trials per session over the last five sessions in each condition. These results essentially replicate Experiment 1: When pigeons were permitted to distribute their behaviors on the two keys in any fashion, they achieved reinforcement on more than 80% of the trials. However, when a four-response-per-key constraint was added, percentage reinforcements fell to slightly higher than 40%, a significant decrease, t(3) = 6.667, p < 0.01.

Figure 3. Percentage of reinforced trials (left two bars) and percentage of different sequences (right bars) in the variability (V) and variability-plus-constraint (VC) conditions. (Filled points = averages over the final five sessions; bars = medians of the pigeons' performances.)

The right side of Figure 3 shows the percentage of sequences that differed from every previous sequence in each session averaged over the last five sessions. The percentage different scores under the VC condition were higher than under V in three of the four pigeons. That is, the birds were more variable under VC than they were under V. This difference was not statistically significant.

Figure 4 shows why the relatively high variability under VC resulted in relatively infrequent rewards. The percent of nonreinforced trials under VC (left bar) is divided into two categories, shown by the two right bars. One cause for nonreinforcement was termination of a trial by a fifth response on either key. The other cause was repetition of the immediately preceding eight-response sequence. Terminated trials accounted for 53% of all trials; repetitions accounted for less than 2%.

Discussion

The contingencies and parameters in the present VC condition were identical to those used by Schwartz (1982a). The contingencies under V were, as far as possible, identical to the VC condition with one important exception: Under V there were no constraints on the distribution of responses. We conclude that constraining the response distribution to four responses per key in VC and in Schwartz (1980, 1982a) caused the low frequencies of obtained rewards.

The pigeons generated fewer different sequences under V than under VC. As noted above, little variability was required by a Lag 1 contingency (one successful strategy being to alternate between only two sequences). Thus, the variability engendered by the V contingencies appeared to be sensitive to the degree of variability required (see below for confirmation). Furthermore, despite the
relatively high variability demonstrated by the birds in the VC condition, reinforcement frequency remained low because of the response-distribution requirement.

Figure 4. Percentage of trials that were not reinforced and division of nonreinforced trials into terminated (more than four responses emitted on a key) and repeated (a given eight-response sequence repeated on two consecutive trials). (Filled points = averages from final five sessions under the variability-plus-constraint condition; bars = medians of the four pigeons' performances.)

Although the percentage of reinforced trials under the present VC contingencies approximated those obtained by Schwartz (1980, 1982a), one result differed. Schwartz found that birds emitted few different sequences, that one modal sequence came to dominate, and that overall variability of responding was low. There are two possible reasons for the relatively high variability shown by the present subjects under VC. First, in Schwartz (1980), all pigeons were given extensive preliminary training on a sequencing task in which the only requirement was to move the light from upper left to lower right. Under this sequencing task, patterns did not have to differ from previous patterns, and all pigeons eventually came to repeat highly stereotyped patterns (e.g., LLLLRRRR). Only following this stereotypy-inducing training were the pigeons required to vary their sequences. In his second experiment (Schwartz, 1982a), birds again received prior training on the sequencing task (although the duration of this training was not stated). In the present experiments, pigeons had to respond variably (V condition in Experiment 1) before being placed under the Schwartz contingencies (VC). Second, Schwartz continued his experiments for more sessions than we did. Although there was no indication that variability under V conditions decreased with training (see Experiment 5, below, for the opposite conclusion), it is possible that with continued training under VC, response stereotypies would have developed.

Experiment 3: How Variable? Lag as Parameter

Variability is a continuum. The two previous experiments showed that pigeons earned frequent rewards under Lag 1 and Lag 5 look-back contingencies when there were no additional constraints on response distribution. This experiment asked whether pigeons could maintain high success rates when the variability requirement became increasingly stringent, that is, when the look back was increased. By the end of this experiment, to be rewarded, a pigeon had to generate a sequence that differed from every one of its last 50 trials.

Method

Subjects and Apparatus

Two experimentally naive pigeons (Nos. 70 and 73) and two previously experienced pigeons (Nos. 59 and 61) were housed and maintained as in Experiment 1. The apparatus used in this experiment was the same as in Experiment 1, but the matrix lights were not used.

Procedure

The four pigeons were autoshaped as described in Experiment 1. The two previously naive subjects (Nos. 70 and 73) then received training under Lag 1 conditions with parameters the same as in the Lag 1 variability phase of Experiment 1: Reinforcement and time-out were 3.5 s each; an interpeck interval consisting of darkened keys lasted 0.5 s; eight responses constituted a trial; the sequence on each trial had to differ from that on the last trial for reinforcement; and there were 50 trials per session. Throughout this and the remaining experiments, there were no additional response constraints; that is, the eight responses could be distributed in any manner across the two keys. After 12 or 13 sessions, the
lag requirement was increased to 5: For reinforcement, sequences had to differ from those in each of the last 5 trials. The two experimentally sophisticated pigeons began the experiment at this Lag 5 value. There followed 10 to 21 sessions under Lag 10, 8 to 25 sessions under Lag 15, 10 to 38 sessions under Lag 25 (No. 59 remained at Lag 25 until the end of the experiment because its performance never reached stability), and (for the 3 remaining subjects) 23 to 45 sessions under Lag 50. Throughout the experiment, the lag value was changed only after a subject's percentage of reinforced trials had become stable over at least 5 sessions. Midway in the Lag 10 condition, the procedure was changed (as described in Experiment 1) so that the comparison trials included the final trials of the previous session. Thus, for example, under the Lag 50 condition, if the subject were responding on its 11th trial in a session, for a reward to be presented, the current sequence had to differ from the 10 trials already completed in the present session and the last 40 trials of the previous session. Because of the same programming error described in Experiment 1, from the midpoint of Lag 10 through the end of the experiment, all timed events were increased by a factor of 10/6: Reinforcement and time-out were 5.83 s rather than the original 3.5 s, and interpeck interval was 0.83 s rather than the original 0.5 s.

Results and Discussion

Figure 5 shows average percentage of reinforced trials over the last five sessions at each lag, or look-back value. The solid line connects the medians of the four pigeons, and the broken line shows a random generator simulation under identical conditions. From Lags 5 through 25, more than 85% of the pigeons' sequences met the variability requirements and were reinforced. At Lag 50, there was a decrease to 67%. This same decrease in percentage reinforced was seen in the random generator's simulated data. Thus, the pigeon's data again paralleled the data of a random generator.

Figure 5. Percentage of reinforced trials per session as a function of lag (look back) value. (Filled points = averages over final five sessions at each lag value for each subject; solid line = medians; broken line = performance of a simulating random generator under identical conditions.)

To obtain the high frequencies of reinforcement shown in Figure 5, the pigeons must have generated highly variable response sequences. One index of this high variability is shown in Figure 6. As the lag requirement increased from 5 to 25, the percentage of sequences that differed from all previously emitted sequences in a session increased from 66% to 87%. As discussed above, to maintain a high frequency of reinforcement under a low lag requirement, the bird did not have to emit very many different sequences. As the length of the look back increased, however, increasingly greater numbers of different sequences were demanded. Sensitivity to these changing requirements is indicated by the increasing function from Lag 5 to Lag 25. The small decrease to 81% different at the
Lag 50 value may be a respondent effect, and correlated with the lowered frequencies of obtained reinforcements. As shown by the performance of the random generator in Figure 5, no matter how variable the behavior, as lag requirement continued to increase, a decrease in the number of rewards gained resulted.

Figure 6. Percentage of different sequences per session as a function of lag value. (Filled points = averages over final five sessions at each lag value for each subject; solid line = medians; broken line = performance of a simulating random generator under identical conditions.)

More detailed indices of overall response variability are given in Figure 7, which shows three measures of variability (U values, or average uncertainty) as functions of lag. The U values were computed according to the following equations:

U1 = [-Σ (pi log2 pi)] / log2(2)

U2 = [-Σ (pi log2 pi) - U1] / log2(4)

U3 = [-Σ (pi log2 pi) - U2] / log2(8)

where, for U1, pi equals the probabilities of L and R responses; for U2, pi equals the probabilities of LL, LR, RL, and RR response pairs; and for U3, pi equals the probabilities of LLL, LLR, LRL, LRR, RLL, RLR, RRL, RRR triplets. The U measure, derived from information theory (Miller & Frick, 1949), varies between 0.0 and 1.0, with 0.0 indicating that all responses are perfectly predictable or ordered and 1.0 indicating maximum uncertainty. The U values were calculated by concatenating all responses without regard to reinforcement or time-out and computing the relative frequencies of left and right responses (U1), pairs of responses (U2), and triplets of responses (U3). When left and right were approximately equal, U1 approached 1.0; when all possible pairs of responses approached equality, U2 approached 1.0; and when all possible sequences of responses taken three at a time approached equality, U3 approached 1.0. Figure 7 shows that as lag values increased (i.e., as requirement for variability increased) the averages of the 4 pigeons' U1, U2, and U3 values increased. (These averages well represent the individual functions and are presented to save space.) At the lag value of 25, the average of the 4 pigeons' U values approximated the U value of the random number generator (shown by the open diamonds; only a single line is drawn because U1, U2, and U3 values were approximately the same for the random generator).
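The U values can be computed as sketched below (an illustration following the equations as printed above, with probabilities estimated from overlapping pairs and triplets of the concatenated responses).

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy, in bits, of a Counter of observed frequencies."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values() if c)

def u_values(responses):
    """U1, U2, U3 for a concatenated string of 'L'/'R' responses, per the equations above."""
    h1 = entropy(Counter(responses))
    h2 = entropy(Counter(responses[i:i + 2] for i in range(len(responses) - 1)))
    h3 = entropy(Counter(responses[i:i + 3] for i in range(len(responses) - 2)))
    u1 = h1 / log2(2)
    u2 = (h2 - u1) / log2(4)
    u3 = (h3 - u2) / log2(8)
    return u1, u2, u3

print(u_values("LRLLRRLRRLLRLRRL" * 20))
```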

Figure 7. Average sequence variability as functions of lag. (Each of the lines connects the medians of the pigeons' average performances over the final five sessions of each lag value. U1 = uncertainty for responses taken one at a time; U2 = uncertainty for responses considered in pairs; U3 = uncertainty for response triplets. Open diamonds = analogous data for the simulating random number generator, where, because the three U values were almost identical, a single point indicates the values.)

Once again, there was a slight tendency for variability to decrease at Lag 50. The closeness of the three U functions to one another indicates high variability and absence of higher order biases or stereotypies. (Note that we also examined U4 through U8, and these contained similar information to that shown in U1 through U3.)

Two possibly confounding influences were present in Experiment 3. First, the timing parameter changed during the Lag 10 condition, and therefore the Lag 15 through Lag 50 conditions contained longer reinforcement and time-out durations than the Lag 5 and Lag 10 conditions. However, Experiment 1 showed that there was no statistically significant effect of these timing differences, a result supported in Experiment 5, below. Second, the lag requirements increased from low values to high, and therefore the form of the obtained function may partly be due to the order of experience. We thought that pigeons could not tolerate high lag requirements before they experienced training under lower requirements, a hypothesis shown to be incorrect in Experiment 5. The general form of the subjects' percentage reinforced function paralleled that of the simulated random function; this finding was again consistent with the hypothesis that the pigeons were generating quasi-random sequences. Although the variability of the computer-based random generator was, of course, unaffected by the reinforcement schedule, the pigeons' variability appeared to be controlled by the reinforcers. When the schedule demanded relatively little variability (Lag 5), variability was relatively low. As the variability requirement increased to Lag 25, so too did variability of performance. However, at Lag 50, when the obtained frequency of reinforcement decreased despite random performance (as indicated by the simulating random generator), again the birds' variability decreased. Thus, high variability was engendered only when it was differentially reinforced.

Experiment 4: Quasi-Random Versus Memory Strategies: Number of Responses as Parameter

The present experiment asked how the pigeons generated their variable response sequences: What mechanism or strategy
accounts for such highly variable performance? The previous experiments alluded to one possible strategy, that of a quasi-random generator, but alternative strategies involving remembering previous sequences would also do. For example, a subject could learn a long random response sequence (see Popper, 1968) or utilize a rational system (e.g., first emit 8 left responses; then 7 left and 1 right; then 6 left, 1 right, and 1 left; etc.). It is not here being suggested that pigeons can count in binary but rather that their behavior could be described as systematic. Any systematic strategy would involve a rather large memory load, for the pigeon would have to remember where it was in its system. The present experiment attempted to contrast the quasi-random and memory hypotheses in the following way. If the number of responses required for each sequence were increased, performance based on a memory strategy should be adversely affected, for it is easier to remember a four-response sequence than an eight-response sequence. On the other hand, if the bird were acting as a quasi-random generator, success rates should improve with increasing responses per trial, because by the laws of chance, a random generator would be more likely to repeat sequences comprising four responses (1 chance in 16 under Lag 1) than eight responses (1 chance in 256). Thus, the memory and quasi-random generator hypotheses make opposite predictions. If the subject's rewards increased with increasing responses per trial, the quasi-random hypothesis would be supported; if the rewards decreased with increasing responses per trial, a memory strategy would be supported.
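The chance figures behind this prediction follow from the number of possible sequences; a minimal sketch (assuming an unbiased generator) makes the comparison explicit.

```python
def p_repeat_lag1(length):
    """Probability that an unbiased random sequence exactly matches the single previous sequence."""
    return 1 / 2 ** length

for n in (4, 6, 8):
    print(n, p_repeat_lag1(n), 1 - p_repeat_lag1(n))
# length 4: 1/16 chance of repeating, so ~94% of trials reinforced under Lag 1
# length 8: 1/256 chance of repeating, so ~99.6% reinforced
```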
Method

Subjects and Apparatus

The subjects and apparatus were the same as in Experiment 1.

Procedure

The procedure was identical to that in Experiment 3, except the lag requirement was kept constant at Lag 5 and the number of responses per trial, or sequence length, varied in an ABCB format. The pigeons were first given 24 to 29 sessions under a six-responses-per-trial condition (there were 64 possible sequences); then 12 to 38 sessions under eight responses per trial (identical to the Lag 5 variability conditions in Experiments 1 and 3 where there were 256 possible sequences); 9 to 23 sessions under four responses per trial (16 possible sequences); and another 6 to 9 sessions under eight responses per trial. The reinforcement and time-out intervals were 5.83 s throughout, and interpeck interval was 0.83 s.

Results and Discussion

Figure 8 shows the average percentage of the 50 available rewards obtained over the last five sessions at each responses-per-trial value. The eight-response value represents the mean of the two eight-response phases, which were statistically indistinguishable from one another. For all subjects, as number of responses per sequence increased, percentage of reinforced trials increased monotonically. An analysis of variance with repeated measures showed an overall significant difference among the three conditions, F(2, 6) = 38.21, p < .001, and analytical pairwise comparisons showed that each condition differed from every other: four versus eight responses, F(1,
6) = 77.37, p < .001; six versus eight responses, F(1, 6) = 16.17, p < .01; and four versus six responses, F(1, 6) = 22.80, p < .005. The random number simulator's percentage of reinforced trials, shown by the open diamonds, also increased monotonically as a function of increasing responses per trial.

Figure 8. Percentage of reinforced trials as a function of number of responses per trial. (Filled points = averages over final five sessions at each condition for each of the pigeons; solid line = medians; broken line = data from simulating random generator under identical conditions.)

Figure 9 shows that U values (measures of overall response variability, as explained in Experiment 3) varied nonmonotonically over a relatively small range. (Note that the percentage different statistic is inappropriate in the present case because the number of possibly different sequences varied with responses per sequence.) Analyses of variance showed no significant effects for U1, U2, or U3. Thus, once again, the pigeons' function approximated the simulated random generator (open diamonds), and the quasi-random hypothesis was supported.

Figure 9. Average response variability as a function of number of responses per trial. (U1 = uncertainty measured by taking responses singly; U2 = uncertainty by taking pairs of responses; U3 = uncertainty by taking response triplets; open diamonds = uncertainty of responses generated by simulating random generator.)

Experiment 5: Is Variability a Reinforceable Dimension? Lag 50 Versus Yoked Variable Ratio

Neither the experiments described above nor any previously published study conclusively demonstrated that variability is an operant dimension controlled by reinforcement. A distinction was drawn at the beginning of this article between elicited, or respondent, effects of reinforcement schedules and reinforcing effects. For example, in Herrnstein (1961), variability of responding along a strip was monitored as a function of different reinforcement schedules. As long as the pigeon pecked anywhere along the strip, responses were effective. Variability was therefore orthogonal to those dimensions required by the schedule contingencies; reinforcement did not depend on variability. (In Herrnstein and the other respondent cases described at the beginning of the article, variability may have been reinforced adventitiously. See Neuringer, 1970, for a related analysis of superstitious key pecking in pigeons.)

The question now raised is whether the variability observed in Experiments 1 through 4 was a by-product of the particular schedules used or whether the observed variability depended on direct reinforcement of that variability. For an answer, subjects were first presented with a schedule in which, to be reinforced, each eight-response sequence had to differ from the previous 50 sequences (Lag 50 variability condition). After stable performances were attained, each pigeon was presented with the exact same frequency and pattern of rewards that it had received over its last six sessions of Lag 50 variability, but now the rewards depended only on an emission of eight responses and not on sequence variability. With this self-yoking procedure, we determined whether the variability observed under a Lag 50 schedule was due to respondent effects of the schedule or to reinforcement of operant variability. A reinforcement-of-variability hypothesis would be supported only if sequence variability were appreciably higher under the Lag 50 variability condition than under the self-yoke condition.

Method

Subjects

Two experimentally naive (Nos. 49 and 50) and two previously experienced (Nos. 44 and 45) pigeons were maintained as described in Experiment 1.

Apparatus

Two 30-cm × 40-cm × 30-cm chambers made of aluminum walls, with a Plexiglas door inset in the rear and a wire mesh floor and ceiling were housed in two sound-attenuating outer boxes. A Gerbrands masking noise generator provided additional sound masking. On the front wall of each chamber were three 2-cm-in-
diameter Gerbrands response keys with their centers 7.5 1, the duration of events (reinforcement, time-out, and
cm from each other and from the side walls and 21.5 cm interpeck interval) was shorter (by a factor of 10/6) in
above the mesh floor. Keys could be transilluminated yoked-VR than in either the preceding or the following
with 7.5-W blue bulbs. A response registered when Lag 50 phases. The reinforcer and time-out were 3.5 s
applied force was withdrawn. The middle of the three in yoked-VR, as opposed to 5.83 s in Lag 50. The
keys was covered with black tape and could not be interpeck interval was 0.83 s in Lag 50 phases but 0.5 s
operated. Directly below this middle key was a round in the yoked-VR. To compare directly the effects of the
hopper opening, 5 cm in diameter, with its midpoint l0 different time values on sequence variability, after each
cm from the floor, through which a Gerbrands magazine pigeon had reached stability on Lag 50 (second A phase
could provide mixed pigeon grain reinforcement illumi- containing the long times), the durations of the reinforcer,
nated with a 7.5-W white bulb. House light was provided time-out, and interpeck interval were changed to those
by two 7.5°W white bulbs above the wire mesh ceiling. under yoked-VR with everything else held constant, thus
As described in Experiment l, VIC-20 computers con- permitting comparison of responding under Lag 50 long
trolled the experiment. Each pigeon was arbitrarily as- times with Lag 50 short times. After stable performances
signed to one of the two experimental chambers. were reached in 10 to 20 sessions, the yoked-VR contin-
gencies were reinstated for 17 to 32 sessions, thereby
permitting comparison of Lag 50 and yoked-VR when
Procedure all time parameters were identical. Schedules of reinforcers
The four subjects were first given autoshaping training and time-outs in this second yoked-VR phase were
as described in Experiment 1. The experimental procedure derived from performances of each pigeon in the last 6
then followed an ABAA'B design, with A and A' repre- sessions of the A' Lag 50 (short times) phase.
senting Lag 50 variability contingencies and B representing
a yoked variable ratio (yoked-VR) contingency in which Results and Discussion
variable sequences were not required. The variability
procedure was identical to that in Experiment 1 except T h e m a i n results w e r e t h a t (a) v a r i a b i l i t y
as follows. A Lag 50 requirement was present from the was significantly h i g h e r u n d e r L a g 50 t h a n
outset. As in the latter phases of Experiment 1, the yoked-VR, thereby demonstrating that the
comparison criterion was continuous over sessions. For
example, for Trial 10 to be reinforced, the sequence of variability depended on contingent reinforce-
responses in that trial had to differ from each of the m e n t ; (b) e x p e r i m e n t a l l y naive pigeons p l a c e d
sequences in the 9 previous trials of the current session d i r e c t l y o n t h e L a g 50 s c h e d u l e v e r y q u i c k l y
and the last 41 completed trials of the preceding session. l e a r n e d to vary, t h e r e b y i n d i c a t i n g t h a t vari-
The initial Lag 50 phase continued for 26 to 38 sessions
or until each pigeon maintained a stable percentage of a b i l i t y was easy to c o n d i t i o n ; a n d (c) t h e 10/
reinforced trials for 5 or more sessions. Throughout the 6 d i f f e r e n c e s in reinforcer, t i m e - o u t , a n d in-
present experiment, sessions terminated after the 50th terpeck interval times had no discernible
reinforcer, or after 100 trials, whichever occurred first. effect o n p e r f o r m a n c e , t h e r e b y i n d i c a t i n g t h e
At the start of the B phase, the contingencies of reinforcement were changed so that the pigeons were reinforced on a yoked-variable ratio schedule derived from their individual performances under Lag 50. Under this yoked-VR, eight responses again constituted a trial, and trials were again sometimes followed by grain reinforcers and sometimes by time-out, but reinforcement and time-out presentation were now independent of sequence variability. Each pigeon's last 6 sessions under the Lag 50 variability contingencies were used to create its schedule of reinforcement under yoked-VR. Thus, each subject was yoked to itself, and the schedule of reinforced and nonreinforced trials under yoked-VR replicated the pattern of reinforcers and time-outs obtained under Lag 50 variability. The yoked reinforcement schedule lasted for 6 consecutive sessions and then was repeated. To illustrate, if Subject 44 had been rewarded after Trials 2, 5, 6, and 8, and so on, in the last session under the Lag 50 condition, then in yoked-VR Sessions 6, 12, 18, and so on, Subject 44 would be rewarded after Trials 2, 5, 6, and 8, and so on, regardless of which eight-response sequence was emitted. Trials 1, 3, 4, and 7, and so on, would be terminated by time-out. The yoked-VR contingencies continued until performance was stable, from 24 to 31 sessions, whereupon Lag 50 variability contingencies were reinstated and maintained for 17 or 18 sessions.
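The self-yoking amounts to replaying a stored outcome pattern. The sketch below is a hypothetical restatement of that bookkeeping (the data structure and names are ours): outcomes are copied trial for trial from the bird's own last six Lag 50 sessions and cycled in six-session blocks, independent of the sequence actually emitted.

    def build_yoked_schedule(last_six_lag50_sessions):
        """Each stored session is a list of booleans, one per trial:
        True = that trial earned a grain reinforcer under Lag 50,
        False = it ended in a time-out."""
        return list(last_six_lag50_sessions)

    def yoked_outcome(schedule, session_index, trial_index):
        """Outcome of a yoked-VR trial: read from the stored pattern,
        cycling through the six stored sessions; the eight-response
        sequence emitted on this trial plays no role."""
        pattern = schedule[session_index % len(schedule)]
        return pattern[trial_index]      # True = reinforcer, False = time-out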
Due to the programming error described in an earlier experiment, the duration of each event (reinforcer, time-out, and interpeck interval) was shorter (by a factor of 10/6) in yoked-VR than in either the preceding or the following Lag 50 phases. The reinforcer and time-out were 3.5 s in yoked-VR, as opposed to 5.83 s in Lag 50. The interpeck interval was 0.83 s in Lag 50 phases but 0.5 s in the yoked-VR. To compare directly the effects of the different time values on sequence variability, after each pigeon had reached stability on Lag 50 (second A phase containing the long times), the durations of the reinforcer, time-out, and interpeck interval were changed to those under yoked-VR with everything else held constant, thus permitting comparison of responding under Lag 50 long times with Lag 50 short times. After stable performances were reached in 10 to 20 sessions, the yoked-VR contingencies were reinstated for 17 to 32 sessions, thereby permitting comparison of Lag 50 and yoked-VR when all time parameters were identical. Schedules of reinforcers and time-outs in this second yoked-VR phase were derived from performances of each pigeon in the last 6 sessions of the A' Lag 50 (short times) phase.

Results and Discussion

The main results were that (a) variability was significantly higher under Lag 50 than yoked-VR, thereby demonstrating that the variability depended on contingent reinforcement; (b) experimentally naive pigeons placed directly on the Lag 50 schedule very quickly learned to vary, thereby indicating that variability was easy to condition; and (c) the 10/6 differences in reinforcer, time-out, and interpeck interval times had no discernible effect on performance, thereby indicating the robustness of the reinforcement-of-variability effect.

The left two bars of Figure 10 show mean percentages of different sequences per session for the first five (open bar) and last five (striped bar) sessions under the first Lag 50 (long times) schedule. Over the first five sessions, more than 50% of sequences differed from all previous sequences in the same session, and this value increased to more than 75% by the last five sessions. By the end of this Lag 50 phase, the pigeons were being rewarded after approximately 70% of their trials, an increase from 50% during the first five sessions. The increases from first to last five sessions in both percentage different trials, t(3) = 6.68, p < .01, and percentage reinforced trials, t(3) = 5.11, p < .025, were statistically significant.

The second set of bars, representing yoked-VR, shows that when the variability contingencies were removed, percentage of different

sequences fell immediately until fewer than 20% of the sequences were different from previous sequences in the session. The difference between the last five sessions of the Lag 50 variability contingencies and the last five sessions of the yoked-VR was significant, t(3) = 7.46, p < .005.

Upon reintroduction of the Lag 50 contingencies, percentages of different sequences rose immediately. The leftmost of the four bars above the repeated Lag 50 condition shows percentage different during the first five sessions of Lag 50 following yoked-VR, and the second of the four bars shows the last five sessions under this phase. Because of the differences in timing values (long times in Lag 50 and short times in yoked-VR), the observed effects might have been confounded. The reinforcement, time-out, and interpeck interval times were therefore changed under Lag 50 so that these times were now identical to those under yoked-VR. The third and fourth bars above the replicated Lag 50 show percentage different sequences during the first five and last five sessions under short time values. There was essentially no change in percentage of different sequences due to the different time values.

Figure 10. Percentage of different sequences per session under one condition where each sequence had to differ from the previous 50 sequences for reinforcement (Lag 50) and another condition where reinforcements were given independently of response variability (yoked-VR). (Open bars = first five sessions of each condition; striped bars = final five sessions. L = long timing values; S = short times. Each point = 1 pigeon's arithmetic average performance over five sessions; bars = medians of the 4 pigeons' performances.)

Upon return to the yoked-VR condition, variability of responding again decreased immediately, and once again the difference between the last five sessions of Lag 50 (now with short times) and last five sessions of yoked-VR (with identical short times) was significant, t(3) = 6.771, p < .01.

The fact that response variability depended on contingent reinforcement is shown in Figure 11 by the percentage of modal sequences, defined as the single sequence emitted more frequently than any other in a session. The ordinate shows the percentage of trials per session in which the modal sequence occurred. By the last five sessions of the first phase of Lag 50, the modal sequences accounted for about 4% of the sequences emitted per session. On the other hand, by the end of the first yoked-VR phase, modal sequences accounted for almost 50% of the sequences. Absence of significance between Lag 50 and yoked-VR, t(3) = 2.493, p = .0873, was due to the large spread of the individual subjects' data under the yoked-VR condition. When the schedule demanded high variability (Lag 50), all subjects emitted very few repetitions of any given sequence; when the schedule permitted variability but did not require it (yoked-VR), there were large intersubject differences. These same patterns of modal frequencies were replicated with return to Lag 50 (only the short time phase of Lag 50 is shown in the figure) and then to yoked-VR, with the difference during the last five sessions of these conditions being statistically significant, t(3) = 3.46, p < .05. In almost all cases, the sequence defined as modal under yoked-VR represented exclusive responding on one or the other key (e.g., eight left responses or eight right responses). Because reinforcement did not depend on any particular sequence, the final behavior was

Figure 11. Percentage of modal sequences per session (number of trials in which the most common pattern occurred divided by the total number of trials) as a function of Lag 50 versus yoked variable ratio (yoked-VR) conditions.

probably a function of minimizing energy expenditure (it takes more energy to alternate keys than to respond on a single key) and adventitious reinforcement of superstitious patterns. Under Lag 50, however, the modal sequences generally comprised a mixture of responses on the two keys (e.g., LLRLRLLL, RLLLRRRR, LRLLRRRR, and RRRRRLRR [one example from each of the 4 pigeons]).

Figure 12. Average response variability under Lag 50 and yoked variable ratio (yoked-VR) conditions. (U1 = responses taken one at a time; U2 = responses in pairs; U3 = responses in triplets. Open diamonds are from a simulating random generator under identical conditions. F = averages over the first five sessions; L = averages over the final five sessions of each condition.)

Average U values, an index of overall response variability as discussed above, are plotted for each of the conditions in Figure 12. For comparison, the random number generator's U values under identical simulated conditions are also drawn. During the last five sessions of both Lag 50 conditions, the pigeons' U values closely approximated the U value of the random generator. Under both yoked-VR conditions, however, the pigeons' U values were greatly lowered. The relatively low U1 values show that the birds were forming position preferences. The differences between U1 and U2 and, similarly, between U1 and U3 show that in addition to the position preferences, second- and third-order patterns of responding were being generated with high probability.
probability. quences in the presence of key lights of one
Whether the results are considered in terms color and stereotyped, or fixed, sequences in
of percentages of different sequences, relative the presence of a different color? An affir-
frequencies of modal sequences, or average mative answer would support the thesis that
uncertainty across the entire array of respon- behavioral variability is controlled by envi-
ses, a single robust conclusion is reached: ronmental contingencies.
The variability-requiring Lag 50 condition
caused significantly more variability of se- Method
quence patterns than did the yoked-VR. The
behavioral variability generated in the present Subjects and Apparatus
experiment depended on the variability's Four experimentally naive pigeons (Nos. 29, 3 t, 38,
being reinforced. Absence of high variability and 39) were maintained as described in Experiment 1.
under yoked-VR may indicate that variable For the apparatus, the two chambers described in Exper-
iment 5 were again used. The key lights could be
responding is an energy-expensive strategy illuminated with either blue or red 7.5-W bulbs.
and, hence, nonpreferred. That there was
some variability under the yoked-VR condi-
tion may indicate a small respondent effect.
Procedure
An autoshaping procedure was identical to that in
Experiment 1, with one exception. A trial consisted of
Experiment 6: Stimulus Control: Multiple one of four equally probable events: left key light red,
Variability-Stereotypy left key light blue, right key light red, and right key light
blue. Subjects received 5 sessions of autoshape training
Operant, or reinforceable, dimensions of after their first response. Because Subject 38 did not peck
behavior (location, duration, rate, force, and the key after 20 sessions of autoshaping, the experimenter
topography) are sensitive to stimulus control. shaped key pecking by reinforcing successive approxi-
The question raised in this final experiment mations to the key peck response and then provided 5
additional autoshaping sessions.
is whether the variability dimension can also There were three experimental phases. The first ex-
come under stimulus control. In particular, amined acquisition of stimulus control over variable and

stereotyped responding. Following an informal exploration of the effects of parameter manipulations, the second phase attempted to equalize the number of responses per trial in the two conditions and to generate approximately equal and intermediate levels of percentage of reinforced trials. A reversal of stimulus conditions followed.

Phase 1: Acquisition of stimulus control. Following autoshaping, the four pigeons were put on a multiple variability-stereotypy schedule in which the two components alternated after every 10th reinforcement. In the variability (V) component, both key lights were blue, and subjects had to meet a Lag 5 variability criterion identical to that described in Experiment 1, where reinforcement was contingent on a sequence of eight pecks on the two keys that differed from each of the preceding five sequences. The Lag 5 criterion was continuous with respect to other V components in the same session and across sessions.

In the stereotypy (S) component, both key lights were red and the pigeons had to emit an arbitrarily defined three-response sequence, namely, LRR, in that order. The first two correct responses in the sequence produced the same interpeck interval as in the V component, and the third correct response produced the reinforcer. An error during an S trial (e.g., if the second response were to the left key) immediately produced the same time-out as in V and reset the sequence to the beginning. Whereas the time-out in V occurred only after the eighth response, the time-out in S immediately followed any incorrect response in the sequence.
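The two components therefore differ both in what is reinforced and in when a time-out can occur. The following sketch is our paraphrase of those trial rules under the Phase 1 values; it is illustrative only, and the function names are ours.

    def stereotypy_trial(pecks, target="LRR"):
        """S component: the bird must emit the fixed sequence `target`.
        Any incorrect peck immediately produces a time-out and resets
        the sequence, so grain follows only the third correct peck."""
        for position, peck in enumerate(pecks):
            if peck != target[position]:
                return "time-out"              # immediate; the sequence resets
            if position == len(target) - 1:
                return "reinforcer"
        return "incomplete"                    # fewer pecks than the target so far

    def variability_trial(sequence, prior_v_sequences, lag=5):
        """V component: an eight-peck sequence is reinforced only if it
        differs from each of the preceding `lag` V-component sequences;
        the time-out, if earned, comes only after the eighth peck."""
        return "reinforcer" if sequence not in prior_v_sequences[-lag:] else "time-out"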
The V and S components continued to alternate after every 10th reinforcement until the bird earned a total of 60 rewards or until a maximum of 120 trials was reached in either component, whichever occurred first. After initial longer times, the durations of reinforcement and time-out were reduced to 5.0 s in both components (times for Subject 38 were decreased to 4.2 s after five sessions owing to its weight gain). Pecks in both V and S were separated by a 0.83-s interpeck interval.

Because most of the pigeons were at first unable to attain 10 rewards in S, the first few sessions began with the V component. After the 8th session, when responding in S had improved, the V and S components alternately began each session. In an attempt to further improve performances during S, the salience of time-out was increased by flashing the house light on for 0.33 s and off for 0.08 s during time-out in both the V and S components. Phase 1 training continued for 12, 16, 24, and 24 sessions for each of the 4 pigeons, respectively.

Phase 2: Equalization of responses. An attempt was made to equalize the number of responses required in V and S components and to approximately equalize the percentages of correct responses at an intermediate level of proficiency to avoid ceiling and floor effects. At the end of this phase, which lasted approximately 50 sessions, the schedule in the V component was six responses, Lag 10. The schedule requirement in S was the fixed pattern LRRLL. Reinforcement and time-out in both components lasted for 3 s.

Phase 3: Stimulus reversal. At the start of Phase 3, the number of responses in V was reduced to five to equal the number of responses in S. Otherwise, the contingencies in effect at the end of Phase 2 were maintained. Birds 29, 31, 38, and 39 received 10, 14, 12, and 22 sessions, respectively. The key light colors signaling V and S were then reversed, with red key lights now associated with V and blue key lights with S. No other changes were made. Two pigeons (Nos. 29 and 38) began the first session of reversed color in the S component, and the other two began in V. There were 18 to 24 sessions of reversal.

Results

Figure 13 shows percentages of reinforced trials in variability and stereotypy components separately for 1 subject (No. 31) during each session of Phases 1 and 3. The performance shown is representative of all birds. During acquisition of performance under the first multiple schedule, shown in the left panel, all birds initially received a higher percentage of reinforcements in the variability component (eight responses, Lag 5) than in the stereotypy component (three responses). Over sessions, performances in both components improved in accuracy, indicating that all subjects acquired the variability-stereotypy discrimination.

Figure 14 shows average U-value measures (U1 showing variability for responses taken one at a time, U2 for responses taken two at a time, and U3 for responses taken three at a time) in each of the components. The left set of bars indicates the median performance over the last five sessions in the initial condition. As would be expected if the pigeons were performing variably in V and emitting a fixed sequence in S, there were large and significant differences between the U values in the two components: For U1, t(3) = 15.515, p < .001; for U2, t(3) = 15.617, p < .001; and for U3, t(3) = 11.533, p < .005.

During the exploratory phase that is not shown, attempts were made to increase the stereotypy component to eight responses (equal to the variability component) and to change the contingencies in S so that time-outs for incorrect sequences occurred only at the end of the sequence (as was the case in V). But both of these attempts failed. Therefore, to equalize performances under V and S, a five-response sequence was used in each component and the lag criterion was increased to 10 in the V component. Performances under these conditions are shown in the middle panels of Figures 13 and 14. When the stimulus conditions were reversed (blue key lights now signifying the S component


Figure 13. Percentage of reinforced trials per session in variability (V) and stereotypy (S) components of
the multiple schedule for Pigeon 31. (Left panel = initial acquisition where the contingencies in V were
eight responses, Lag 5, and the S contingencies reinforced left-right-right patterns; middle panel =
performance under a V schedule of five responses, Lag 10, and an S schedule of left-right-right-left-left;
right panel = reversal of the middle schedule conditions [key light colors were reversed].)

and red key lights signifying the V component) performances immediately deteriorated but then improved (right panels). At the end of both these phases, U values in S and V differed significantly at the .01 level or better. Thus, we conclude that stimulus control was established over variable and stereotyped responding.

General Discussion

Experiments 1 and 2 showed that when hungry pigeons were given grain for generating sequences of eight responses that differed from their last sequence, they successfully varied their sequences. When, in addition to meeting this same variability requirement, the pigeons had to peck exactly four times on each key (as in Schwartz, 1980, Experiment 4; 1982a, Experiment 1), success rates fell significantly. We concluded that the inability of pigeons to gain high rates of reward under the Schwartz procedure was due to an artifact of the four-response-per-key requirement. This conclusion was strengthened by comparison of the pigeon's performance with a computer-based random number simulation.

Experiment 3 increased the look back, or the number of prior sequences of eight responses from which the current sequence had to differ. Eventually, to be reinforced, the pigeon had to respond with a sequence that differed from each of its last 50 sequences. This look back included sequences from the previous session. The subjects generated highly variable sequences, with more than 80% of the patterns differing from all previous patterns in a session. Probabilities of correct sequences again paralleled the probability of correct sequences of the simulating random generator.

Experiment 4 compared two possible accounts of this variability. The memory hypothesis was that the pigeons learned a long sequence of responses or used a rational strategy to meet the schedule requirements. The variability hypothesis was that the pigeons behaved as a quasi-random generator.

The first hypothesis predicts that increasing the number of responses per trial would be correlated with a lowered success rate, for it is easier to remember fewer responses than more. The quasi-random hypothesis made the opposite prediction: By chance, given few responses per trial, consecutive trials would repeat one another; given many responses per trial, there would be few repetitions by chance. When the required number of responses per trial was increased from four to eight, success rates improved significantly, thereby supporting the quasi-random generator hypothesis. Once again, the pigeons' performances paralleled the performance of a simulating random generator.
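The chance predictions can be made explicit. Under the simplifying assumptions that the two keys are equally likely and pecks are independent, an r-response trial has 2^r equally likely sequences, so the probability of meeting a Lag n criterion by chance is roughly (1 - 2^-r)^n. The snippet below is a back-of-the-envelope illustration of the direction of the effect, using illustrative lag values rather than the exact parameters of Experiment 4.

    def p_meet_lag_by_chance(responses_per_trial, lag):
        """Probability that a random binary sequence of the given length
        differs from each of the previous `lag` sequences, assuming all
        2 ** responses_per_trial sequences are equally likely and trials
        are generated independently."""
        p_repeat_one = 2 ** -responses_per_trial
        return (1 - p_repeat_one) ** lag

    # Four responses per trial:  (1 - 1/16) ** 50  is about 0.04
    # Eight responses per trial: (1 - 1/256) ** 50 is about 0.82
    print(p_meet_lag_by_chance(4, 50), p_meet_lag_by_chance(8, 50))

With only four responses per trial a memoryless generator repeats itself often and earns few reinforcers; with eight responses per trial the same generator rarely repeats, which is the direction of the result reported above.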

Figure 14. Average response variability under three phases of Experiment 6, from left to right. (U1 = responses taken one at a time; U2 = responses taken in pairs; U3 = responses taken in triplets.)

The first four experiments generated high behavioral variability. However, neither these nor any previously published experiments demonstrated that response variability depended on the contingency between variability and reinforcement. The observed variability could have been elicited by the reinforcement schedule (a respondent effect) rather than directly reinforced (an operant effect). Experiment 5 tested these alternatives by comparing the performance of pigeons under two identical schedules, except that one schedule required variability whereas the other permitted it. In the first, a pigeon had to respond eight times with a sequence that differed from each of its last 50 sequences. The patterns and frequencies of rewards under this condition were duplicated to form a self-yoked schedule where eight responses were required for reward but sequence variability was no longer necessary. The results showed that sequence variability was generated only when it was required. Under the yoked schedule, variability decreased significantly and the pigeons responded with highly repetitive patterns. Variability is therefore an operant dimension of behavior.

Experiment 6 demonstrated discriminative control over behavioral variability. Pigeons learned to respond variably in the presence of key lights of one color and with a fixed, stereotyped sequence in the presence of a second color. When the stimulus conditions were reversed, performances reversed. Thus, the variability dimension of behavior is controlled by environmental stimuli in much the same manner as other operant dimensions are.

The present series of experiments therefore conclusively demonstrates the existence and strength of operant variability, variability that is engendered and maintained because presentation of a reinforcer depends on the variability. This conclusion is consistent with Blough (1966), Pryor et al. (1969), and Bryant and Church (1974), among others.

Previous studies have examined respondent variability. Different schedules of reinforcement reliably engender differing degrees of behavioral variability with no contingency between the variability and the reinforcement schedules. For example, if identical and independent fixed-ratio 5 (FR 5) schedules are programmed on each of two response keys, most pigeons peck exclusively on one or the other key. If the schedules are changed to FR 150, there is considerable switching between keys. This observation from our laboratory is a clear example of respondent variability caused by reinforcement schedules. Variability was neither required nor differentially reinforced. The state of the environment, as well as the contingencies in the environment, influences behavioral variability, the former through respondent effects and the latter through operant effects.

Both respondent and operant variability may be adaptive. When reinforcers are infrequent or absent, variability increases the likelihood that the animal will improve its lot: learn a new strategy for obtaining reinforcement or change its environment. Even when reinforcement densities are relatively high, variability may improve the schedule or provide knowledge of the environment in anticipation of possible future decrements in reinforcement. Variability is an adaptive response to a changing or potentially changing environment.

Operant variability has unique adaptive functions not shared by respondent variability. Whenever an animal is operantly conditioned to generate a new response, whether the conditioning is through the process of shaping (successive approximations to some desired goal response) or trial and error (Thorndikian conditioning), it is adaptive for the animal to vary its behaviors. Reinforcement works through selection of some previously emitted behavior. If the to-be-selected behavior does not occur, reinforcement cannot select. On the other hand, in an environment where an operant response has previously been learned and is now being maintained by an acceptable schedule of reinforcement, high variability may not be functional, for variation may require relatively high energy expenditure and sometimes result in less frequent reinforcement. It is advantageous for an animal to discriminate situations in which new responses must be learned from those in which previously learned behaviors must be repeated. We hypothesize that this discrimination is based on the reinforcement of diverse responses and response classes in the former case versus reinforcement of fixed, or stereotyped, responses and response classes in the latter. (The discrimination is in some ways analogous to that between contingent and noncontingent reinforcement; Killeen, 1978.) When an animal is differentially rewarded for a variety of responses, it generates variable behaviors. We posit that this describes all

operant learning (as opposed to operant maintaining) situations. Explicit reinforcement of variable behaviors prior to initiation of operant learning, or shaping, procedures might speed the learning process. This hypothesis should be tested.

There are other instances where operant control of variability is adaptive. Some environments punish variability (e.g., many school classrooms), and most children can discriminate these from other situations (e.g., the game of hide-and-seek, where variability is reinforced). Environments that require brainstorming, problem solving, or creativity reinforce variability of behaviors. One attribute of successful art is the uniqueness of the artist's work. If an animal is to avoid predation or injury, it is functional for the animal to vary its behavior in the presence of specific predators or in specific environments (see Humphries & Driver, 1970; Serpell, 1982), and at least some aspects of this variability might be controlled by consequences. Operant variability is also functional in sports, gambling, war, or other competitive environments.

The interaction between elicited respondent variability and reinforced operant variability is an important area for future study. Respondent variability may set the boundaries within which reinforcing contingencies control operant variability. For example, both very low and very high densities of food to a hungry animal may prohibit high behavioral variability despite reinforcement of that variability (see Experiment 3). Alternatively, reinforcement of variability may extend the boundaries of elicited variation.

Reinforcement of variability raises theoretical problems. How can behavioral variation be reinforced if reinforcement increases the probability of those responses that produced it, thereby necessarily increasing response stereotypy (e.g., Schwartz, 1982a)? There are at least three possible ways in which reinforcement can serve both to increase the probability of prior responses and increase response variability. First, a schedule of reinforcement may rapidly condition and extinguish response patterns, thereby only apparently reinforcing variable sequences. It is the intermittent extinction, according to this interpretation, that elicits the variability, a respondent effect. However, when we attempted to train a single, fixed eight-response sequence in the stereotypy component of Experiment 6, we were unsuccessful. Despite hundreds of reinforcements over many sessions, the pigeons failed to learn. It therefore seems unlikely that they could learn an eight-response sequence after a single reinforcement. Furthermore, the present results did not show persistence of previously reinforced sequences.

A second hypothesis is that the pigeons learned a long sequence of responses or a rational strategy to meet the variability requirements. Experiment 4 showed that this, too, is unlikely.

The third interpretation, one supported by the present research, is that variability is a dimension of behavior much like other operant dimensions. Reinforcement does not necessarily lead to response stereotypy. Variability is as susceptible to control by reinforcement as are frequency, force, duration, location, or topography. But this does not imply that the existence of variability depends on its reinforcement. As indicated above, behavioral variability is a respondent consequence of environmental events as well as an operant progenitor. Furthermore, variable behavior must precede its consequences. The following analogy may be useful: The pigeon enters the operant conditioning experiment with a class of behaviors described as pecking already intact. When the experimenter shapes key pecking, the pecking response is not being trained. Rather, the pigeon is taught where, when, and possibly how fast or hard, and so on, to peck. Analogously, there may be a dimension of all behaviors, described as variability, with which the organism enters our experiments. The rapidity with which inexperienced pigeons acquired variable performance under the initial Lag 50 condition in Experiment 5 supports this view. Turning on or off a variability generator may be under the control of reinforcement, but the variability generator is not itself created through reinforcement. An animal may be born with the variability generator intact.

The present results also raise a methodological issue. Skinner has argued for the ultimate predictability of behavior (Skinner, 1971) and therefore for the study of highly controlled behaviors (Skinner, 1984). In fact,

operant conditioning studies have emphasized highly controlled acts (see Schwartz, Schuldenfrei, & Lacey, 1978), but a complete analysis of behavior must include analyses of the variability of behavior, variability maintained through respondent influences as well as that directly engendered and maintained by reinforcing consequences. It may be impossible to predict or control the next instance of a variable behavior, but lack of prediction and control should not rule out study. When behaviors are variable, experimental analyses can determine whether the variability is noise (i.e., experimental or observational error), under respondent control, or under operant control. Experiments can define the class or classes from which the individual instances of behavior are selected, the conditions under which the variability will occur or not, the variables controlling onset and offset of the variability, and the functions served by the variability. Operant analysis must not limit itself to predictable and controllable behaviors. Doing so ignores an essential characteristic of operant behavior.

Finally, we note one sociopolitical implication. Freedom (cf. Carpenter, 1974; Lamont, 1967) often means the absence of constraining contingencies (no gun to the head) and the presence of a noncontingent benign environment, one where adequate food, shelter, and so forth, are available independent of any particular behavior. However, operant contingencies may also be crucial. To maximize freedom, an animal or person must have a wide variety of experiences (Catania, 1980; Rachlin, 1980). If, for example, to obtain food, an animal has always entered the same 1 of 10 cubicles, each cubicle containing a different type of food, and therefore has never experienced any but the 1 food, it makes little sense to say that there is free choice among the 10 foods. The present results suggest that diversity of choice is controlled by reinforcers contingent on diversity. Absence of variability-maintaining contingencies, such as in the yoked condition of Experiment 5, increases stereotyped behaviors and therefore limits experiences. If these speculations are correct, a laissez-faire environment will not secure freedom of choice. Despite the absence of aversive constraints and the presence of positive respondent influences (e.g., good food, clothing, and company), behavior may still be highly constrained. Contingencies that explicitly reinforce behavioral variability are necessary to maximize freedom.

References

Antonitis, J. J. (1951). Response variability in the white rat during conditioning, extinction, and reconditioning. Journal of Experimental Psychology, 42, 273-281.
Blough, D. S. (1966). The reinforcement of least frequent interresponse times. Journal of the Experimental Analysis of Behavior, 9, 581-591.
Bryant, D., & Church, R. M. (1974). The determinants of random choice. Animal Learning & Behavior, 22, 245-248.
Carpenter, F. (1974). The Skinner primer: Behind freedom and dignity. New York: Macmillan.
Catania, A. C. (1980). Freedom of choice: A behavioral analysis. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 14, pp. 97-145). New York: Academic Press.
Crossman, E. K., & Nichols, M. B. (1981). Response location as a function of reinforcement location and frequency. Behavior Analysis Letters, 1, 207-215.
Eckerman, D. A., & Lanson, R. N. (1969). Variability of response location for pigeons responding under continuous reinforcement, intermittent reinforcement, and extinction. Journal of the Experimental Analysis of Behavior, 12, 73-80.
Herrnstein, R. J. (1961). Stereotypy and intermittent reinforcement. Science, 133, 2067-2069.
Holman, J., Goetz, E. M., & Baer, D. M. (1977). The training of creativity as an operant and an examination of its generalization characteristics. In B. Etzel, J. Le Blanc, & D. Baer (Eds.), New developments in behavioral research: Theory, method and application. Hillsdale, NJ: Erlbaum.
Humphries, D. A., & Driver, P. M. (1970). Protean defense by prey animals. Oecologia, 5, 285-302.
Killeen, P. R. (1978). Superstition: A matter of bias, not detectability. Science, 199, 88-90.
Lachter, G. D., & Corey, J. R. (1982). Variability of the duration of an operant. Behavior Analysis Letters, 2, 97-102.
Lamont, C. (1967). Freedom of choice affirmed. New York: Horizon Press.
Miller, G. A., & Frick, F. C. (1949). Statistical behavioristics and sequences of responses. Psychological Review, 56, 311-324.
Miller, N. E. (1978). Biofeedback and visceral learning. Annual Review of Psychology, 29, 373-404.
Neuringer, A. (1970). Superstitious key pecking after three peck-produced reinforcements. Journal of the Experimental Analysis of Behavior, 13, 127-134.
Neuringer, A. (1984). Melioration and self-experimentation. Journal of the Experimental Analysis of Behavior, 42, 397-406.
Neuringer, A. (1985). "Random" performance can be learned. Manuscript submitted for publication.
Notterman, J. M. (1959). Force emission during bar pressing. Journal of Experimental Psychology, 58, 341-347.
Piscaretta, R. (1982). Some factors that influence the acquisition of complex stereotyped response sequences in pigeons. Journal of the Experimental Analysis of Behavior, 37, 359-369.

Popper, K. R. (1968). The logic of scientific discovery. New York: Harper & Row.
Pryor, K. W., Haag, R., & O'Reilly, J. (1969). The creative porpoise: Training for novel behavior. Journal of the Experimental Analysis of Behavior, 12, 653-661.
Rachlin, H. (1980). Behaviorism in everyday life. Englewood Cliffs, NJ: Prentice-Hall.
Schoenfeld, W. N., Harris, A. H., & Farmer, J. (1966). Conditioning response variability. Psychological Reports, 19, 551-557.
Schwartz, B. (1980). Development of complex stereotyped behavior in pigeons. Journal of the Experimental Analysis of Behavior, 33, 153-166.
Schwartz, B. (1981). Control of complex, sequential operants by systematic visual information in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 7, 31-44.
Schwartz, B. (1982a). Failure to produce response variability with reinforcement. Journal of the Experimental Analysis of Behavior, 37, 171-181.
Schwartz, B. (1982b). Reinforcement-induced stereotypy: How not to teach people to discover rules. Journal of Experimental Psychology: General, 111, 23-59.
Schwartz, B., Schuldenfrei, R., & Lacey, H. (1978). Operant psychology as factory psychology. Behaviorism, 6, 229-254.
Serpell, J. A. (1982). Factors influencing fighting and threat in the parrot genus Trichoglossus. Animal Behavior, 30, 1244-1251.
Skinner, B. F. (1938). The behavior of organisms. New York: Appleton-Century.
Skinner, B. F. (1971). Beyond freedom and dignity. New York: Alfred A. Knopf.
Skinner, B. F. (1984, May). The future of the experimental analysis of behavior. Paper presented at the meeting of the Association for Behavior Analysis, Nashville, TN.

Received July 13, 1984
Revision received February 26, 1985
