Chap. 5


Schedules of Reinforcement
1. Learn about the basic schedules of reinforcement.
2. Investigate rates of reinforcement and resistance to change.
3. Inquire about behavior during transition between schedules of reinforcement.
4. Discover how schedules of reinforcement are involved in cigarette smoking.
5. Distinguish molecular and molar accounts of performance on schedules.

The events that precede operant behavior and the consequences that follow may be arranged in
many different ways. A schedule of reinforcement describes this arrangement. In other words,
a schedule of reinforcement is a prescription that states how and when discriminative stimuli and
behavioral consequences will be presented (Morse, 1966). In the laboratory, sounding a buzzer in an
operant chamber may be a signal (SD) that sets the occasion for lever pressing (operant) to produce
food (consequence). A similar schedule operates when a dark room sets the occasion for a person to
turn on a lamp, which is followed by illumination of the room.
At first glance, a rat pressing a lever for food and a person turning on a light to see appear
to have little in common. Humans are very complex organisms—they build cities, write books,
go to college, use computers, conduct scientific experiments, and do many other things that rats
cannot do. In addition, pressing a lever for food appears to be very different from switching on a
light. Nonetheless, performances controlled by schedules of reinforcement have been found to be
remarkably similar across different organisms, behavior, and reinforcers. When the same schedule
of reinforcement is in effect, a child who solves math problems for teacher approval may generate a
pattern of behavior comparable to a bird pecking a key for water.

IMPORTANCE OF SCHEDULES OF REINFORCEMENT


Schedules of reinforcement were a major discovery first described by B. F. Skinner in the 1930s.
Subsequently, Charles Ferster and B. F. Skinner reported the first and most comprehensive study
of schedules ever conducted (Ferster & Skinner, 1957). Their work on this topic is unsurpassed
and represents the most extensive study of this critical independent variable of behavior science.
Today, few studies focus directly on simple, basic schedules of reinforcement. The lawful rela-
tions that have emerged from the analysis of reinforcement schedules, however, remain central to
the science of behavior—being used in virtually every study reported in the Journal of the Exper-
imental Analysis of Behavior. The knowledge that has accumulated about the effects of schedules
is central to understanding behavior regulation. G. S. Reynolds underscored this point and wrote:

Schedules of reinforcement have regular, orderly, and profound effects on the organism’s rate of
responding. The importance of schedules of reinforcement cannot be overestimated. No description,


account, or explanation of any operant behavior of any organism is complete unless the schedule of
reinforcement is specified. Schedules are the mainsprings of behavioral control, and thus the study
of schedules is central to the study of behavior. . . . Behavior that has been attributed to the supposed
drives, needs, expectations, ruminations, or insights of the organism can often be related much more
exactly to regularities produced by schedules of reinforcement.
(Reynolds, 1966b, p. 60)

Modern technology has made it possible to analyze performance on schedules of reinforcement


in increasing detail. Nonetheless, early experiments on schedules remain important. The experi-
mental analysis of behavior is a progressive science in which observations and experiments build
on one another. In this chapter, we present early and later research on schedules of reinforcement.
The analysis of schedule performance ranges from a global consideration of cumulative records to
a detailed analysis of the time between responses.

FOCUS ON: C. B. Ferster and Schedules of Reinforcement
In 1957, Charles Bohris Ferster (Figure 5.1) together with B. F. Skinner published Sched-
ules of Reinforcement, the most comprehensive description of the behavior (performance)
generated by different schedules of reinforcement. Charles was born in Freehold, New
Jersey, on 1 November 1922 in the Depression years and, even though life was difficult for
the Ferster family, Charles completed high school and entered Rutgers University in 1940.
After receiving his BS degree at Rutgers and doing military service from 1943 to 1946,
Charles went on to Columbia University where he studied the reinforcing effects of stimuli
(conditioned reinforcers) presented during intermittent reinforcement. He obtained his
PhD in 1950 and took a Research Fellowship at Harvard in the behavioral laboratory of
B. F. Skinner.
At the Harvard laboratory, Charlie (as he was called)
impressed Skinner by vastly improving the design of the
equipment used to study the performance of pigeons
on a variety of reinforcement schedules. For example,
Charlie made improvements to the cumulative recorder
to depict changes in response rate, which eventually
resulted in the design patented by Ralph Gerbrands of
the first modern-style instrument, as shown in Chapter 4
(see Figures 4.7 and 4.9). In the laboratory, he worked
night and day and, together with Skinner, made Grand
Rounds each morning—inspecting cumulative records
of pigeons’ rates of response and making changes to
the programmed schedules. Often there were surprises
as Ferster and Skinner tried to predict the performance
of the birds under complex schedules; they would then
add a new piece of control equipment such as a clock or timer to see what would happen—and often the researchers found the results surprising and dramatic. Charlie said that over a year he stopped predicting the outcomes of experiments, as the predictions were often incorrect and “the pigeon really did know best what it was he was likely to do and the conditions under which he would do it” (Ferster, 2000, p. 306).

FIG. 5.1 Photograph of C. B. Ferster.
Source: Copyright 1961 and republished with permission of the Society for the Experimental Analysis of Behavior.
Ferster and Skinner noted that the only contact that the pigeons had with the pro-
gramming equipment was at the moment of reinforcement but that many stimuli could be
involved at this moment, determining the current rate of response and changes in response
rates. Stimuli arising from the passage of time and from the number of responses made on
the schedule were obvious sources of stimulus control. Ferster and Skinner designed exper-
iments to enhance these stimuli so as to observe the effects in the cumulative records. In
recalling these experiments and interactions with Skinner, Ferster stated:

There were many personal and natural consequences of completing a successful experiment
[other than the results from the actual behavior of the birds]. A successful experiment led to
conversations about the data, the new devices we could build, the new experiments that had
to be started, and the new ways we could organize our past experience from the laboratory.
When we discovered a new degree of orderliness or an unexpected but rewarding
result on morning rounds, there was always much excitement and talk about where the
experiment might go next and how to manage the equipment for the next experiment that
was burning to be done because of the new result.
When new discoveries accumulated too fast . . . there were planning sessions, which
were always great fun and very exciting. . . . I learned the value of large sheets of paper,
which we used to aid our thought and to chart our progress. . . . The theoretical structures
[organized by the charts] and programmatic aspects of our work appeared as [headings] . . .
to appear as chapter and subchapter titles in Schedules of Reinforcement. Each entry
prompted rearrangements of the theoretical pattern and suggested new experiments and
programs, which in turn prompted further rearrangements of the data. The interactions
between these theoretical exercises and changes in ongoing experiments in the laboratory
were continuous and constituted an important reinforcer.
(Ferster, 2000, p. 307)

Looking back on his own experience of writing Schedules of Reinforcement and what
he had learned, Charlie Ferster indicated that:

[A] potential reinforcing environment exists for every individual, however, if he will only emit
the required performances on the proper occasion. One has merely to paint the picture, write the
symphony, produce the machine, tell the funny story, give affection artfully or manipulate the
environment and observe the behavior of the animal, and the world will respond in kind with
prestige, money, social response, love, and recognition for scientific achievement.

(Ferster, 2000, p. 311)

BEHAVIOR ANALYSIS: A PROGRESSIVE SCIENCE


The experimental analysis of behavior is a progressive enterprise. Research findings are accumu-
lated and integrated to provide a general account of the behavior of organisms. Often, simple ani-
mals in highly controlled settings are studied. The strategy is to build a comprehensive theory of
behavior that rests on direct observation and experimentation.

The field of behavior analysis emphasizes a descriptive approach and discourages speculations
that go substantially beyond the data. Such speculations include reference to the organism’s mem-
ory, thought processes, expectations, and undocumented accounts based on presumed physiological
states. For example, a behavioral account of schedules of reinforcement provides a detailed descrip-
tion of how behavior is altered by contingencies of reinforcement. One such account is based on
evidence that a particular schedule sets up differential reinforcement of the time between responses
(interresponse times, or IRT; see later in this chapter). An alternative account is that behavior is inte-
grated into larger units of performance according to the molar or macro contingencies of reinforce-
ment (overall rate of reinforcement). Both of these analyses contribute to an understanding of an
organism’s behavior in terms of specific environment-behavior relationships—without reference
to hypothetical cognitive events or presumed physiological processes.
Recall that behavior analysts study the behavior of organisms, including people, for its own
sake. Behavior is not studied to make inferences about hypothetical mental states or real phys-
iological processes. Although most behaviorists acknowledge and emphasize the importance of
biology and neurophysiological processes, they focus more on the interplay of behavior with the
environment during the lifetime of an organism. Of course, direct analysis of neurophysiology of
animals provides essential details about how behavior is changed by the operating contingencies of
reinforcement and behavioral neuroscientists currently are providing many of these details, as we
discuss throughout this textbook.
Contemporary behavior analysis continues to build on previous research. The extension of
behavior principles to more complex processes and especially to human behavior is of primary
importance. The analysis, however, remains focused on the environmental conditions that control
the behavior of organisms. Schedules of reinforcement concern the arrangement of environmental
events that regulate behavior. The analysis of schedule effects is currently viewed within a bio-
logical context. In this analysis, biological factors play several roles. One way in which biology
affects behavior is through specific neurophysiological events (e.g., release of neurotransmitters)
that function as reinforcement and discriminative stimuli. Biological variables may also constrain
or enhance environment-behavior relationships (see Chapter 7). As behavior analysis and the other
biological sciences progress, an understanding of biological factors becomes increasingly central to
a comprehensive theory of behavior.

Schedules and Patterns of Response


Response patterns develop as an organism interacts with a schedule of reinforcement (Ferster &
Skinner, 1957). These patterns come about after an animal has extensive experience with the con-
tingency of reinforcement (SD : R → Sr arrangement) defined by a particular schedule. Subjects
(usually pigeons or rats) are exposed to a schedule of reinforcement and, following an acquisition
period, behavior typically settles into a consistent or steady-state performance (Sidman, 1960).
It may take many experimental sessions before a particular pattern emerges, but once it does, the
orderliness is remarkable. In fact, B. F. Skinner provided the first description of systematic schedule
performance in his book, The Behavior of Organisms (Skinner, 1938). In the preface to the seventh
printing of that book, Skinner writes that “the cumulative records . . . purporting to show orderly
changes in the behavior of individual organisms, occasioned some surprise and possibly, in some
quarters, suspicion” (p. xii). Any suspicion was put to rest when Skinner’s observations were rep-
licated in many other experiments (see Morse, 1966 for a review of early work on schedules of
reinforcement).
The steady-state behavior generated when a fixed number of responses are reinforced illustrates
one of these patterns. For example, a hungry rat might be required to press a lever 10 times to get a

food pellet. Following reinforcement, the animal has to make another 10 responses to produce the
next bit of food, then 10 more responses. In industry, this requirement is referred to as piece rate
and the schedule has characteristic effects on the job performances of the workers. When organisms
(rats, pigeons, or humans) are reinforced after a fixed number of responses, a break-and-run pattern
of behavior often develops. Responses required by the schedule are made rapidly and result in rein-
forcement. A pause in responding follows each reinforcement, followed by another quick burst of
responses (see “Fixed Ratio” section in this chapter for more details). This pattern repeats over and
over again and occurs even when the ratio size of the schedule is changed.

NOTE ON: Inner Causes, Schedules, and Response Patterns
We sometimes speak of people being “highly motivated” when we observe them
investing energy or time in some project. Motivation seems to explain why peo-
ple behave as they do. Schoolchildren are said to be unmotivated when they put
off or fail to do assignments; in contrast, children are highly motivated when they
study hard and overachieve. From a behavioral perspective, there is no need to infer
a hypothetical internal process of motivation or drive to understand this kind of
behavior. Schedules of reinforcement generate unique and predictable patterns of
behavior that are often taken as signs of high motivation; other schedules produce
pausing and low rates of response used as indicators of low motivation or even clin-
ical depression. In both cases, behavior is due to environmental contingencies rather
than the inferred inner cause called motivation.
Similarly, habits or personality traits are said to be “response dispositions that
are activated automatically by context cues that co-occurred with responses during
past performance” (Neal, Wood, & Quinn, 2006, p. 198). Here reference is made to
internal dispositions that account for regular and frequent actions or habits. Instead
of inferring dispositions as internal causes, one might say that habits or traits are
patterns of steady-state responding; these regularities of behavior are maintained
by the consistency of the schedule of reinforcement. Consistent or reliable schedules
of reinforcement generate habitual, stable rates and patterns of responding. It is
these characteristic patterns of behavior that people use to infer dispositional causes.
A behavior analysis indicates that the actual causes are often the behavioral contin-
gencies rather than dispositional states within us (Phelps, 2015).
The stability of behavior patterns generated by reinforcement contingencies,
which allows people to infer others’ dispositions and personality, also allows for reli-
able inferences of emotional states. Based on behavioral stability and consistency,
computer programs are now able to recognize human faces and “read” emotions
from a person’s facial expressions. Our faces evolved as organs of emotional com-
munication and there is money to be made with emotionally responsive machines.
Computer programs with visual inputs are able to code facial expressions and, some-
times together with voice analysis, predict buying, voting, depression, attention, and
additional affective behaviors (Khatchadourian, 2015). Our point here is that peo-
ple use stable overt behavior generated by reinforcement schedules to infer disposi-
tional and emotional states and, it turns out, those visible behaviors can be computer
defined and used for commercial purposes.

Schedules and Natural Contingencies


In everyday life, behavior is often reinforced on an intermittent basis. On an intermittent schedule
of reinforcement, an operant is reinforced occasionally rather than each time it is emitted. Every
time a child cries, she is not reinforced with attention. Each time a predator hunts, it is not success-
ful. When you dial the number for airport information, sometimes you get through, but often the
exchange is busy. Buses do not immediately arrive when you go to a bus stop. It is clear that per-
sistence is often essential for survival or achievement of success; thus, an account of perseverance
on the basis of the maintaining schedule of reinforcement is a major discovery. In concluding his
review of schedule research, Michael Zeiler stated:

It is impossible to study behavior either in or outside the laboratory without encountering a schedule
of reinforcement: whenever behavior is maintained by a reinforcing stimulus, some schedule is in
effect and is exerting its characteristic influences. Only when there is a clear understanding of how
schedules operate will it be possible to understand the effects of reinforcing stimuli on behavior.
(Zeiler, 1977, p. 229)

Consider a bird foraging for food. The bird turns over sticks or leaves and once in a while finds
a seed or insect. These bits of food occur only every now and then, and the distribution of reinforce-
ment is the schedule that maintains the animal’s foraging behavior. If you were watching this bird
hunt for food, you would probably see the animal’s head bobbing up and down. You might also see
the bird pause and look around, change direction, and move to a new spot. This sort of activity is
often attributed to the animal’s instinctive behavior patterns. Labeling the behavior as instinctive,
however, does not explain it. Although evolution and biology certainly play a role in this foraging
episode, perhaps as importantly, so does the schedule of food reinforcement.
Carl Cheney (an author of this textbook) and his colleagues created a laboratory analog of for-
aging. In this arrangement, pigeons were able to choose between two food patches by pecking keys
(Cheney, Bonem, & Bonem, 1985). Based on two concurrent progressive-ratio schedules, the density
of food (ratio requirement) on either key increased or decreased with the amount of foraging (see
“Progressive-Ratio Schedules” in this chapter; and see discussion of concurrent schedules in Chap-
ter 9). As food reinforcers were obtained from one key, the density of food reinforcement on that key
decreased and more responses were required to produce bits of food—a progressively increasing ratio
schedule (depleting patch of food). Concurrently, the number of responses for each reinforcement
decreased on the other key (repleting patch of food)—a progressively decreasing ratio schedule. As
would be expected, this change in reinforcement density up and down generated switching back and
forth between the two patches. To change patches, however, the bird had to peck a center key—simu-
lating travel time and effort between patches (the side keys). Cheney and his colleagues found that the
cost of hunting—represented by the increasing ratio schedule for pecking in a patch—the effort (num-
ber of responses) required to change patches, and the rate of food replacement in the alternative patch
all contributed to the changing patches. This experiment depicts an animal model of foraging—using
schedules of reinforcement to simulate natural contingencies operating in the wild.
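For readers who want the arrangement spelled out, here is a minimal Python sketch of the depleting/repleting patch logic just described. The class name, starting ratio, step size, and changeover requirement are illustrative assumptions, not parameters from Cheney, Bonem, and Bonem (1985).

```python
# A minimal sketch (not the original procedure) of two patches arranged as
# progressive-ratio schedules: working a patch depletes it (its ratio grows),
# while the alternative patch repletes (its ratio shrinks). Values are assumed.

class ForagingAnalog:
    def __init__(self, start_ratio=5, step=2, changeover_pecks=4):
        self.ratio = {"left": start_ratio, "right": start_ratio}  # current FR size of each patch
        self.step = step                          # depletion/repletion per reinforcer
        self.changeover_pecks = changeover_pecks  # center-key pecks needed to switch patches
        self.patch = "left"                       # patch currently being worked
        self.count = 0                            # responses toward the current ratio

    def peck_patch(self):
        """One peck in the current patch; returns True if it produces food."""
        self.count += 1
        if self.count < self.ratio[self.patch]:
            return False
        # Reinforcement: the worked patch depletes and the other patch repletes (never below 1).
        other = "right" if self.patch == "left" else "left"
        self.ratio[self.patch] += self.step
        self.ratio[other] = max(1, self.ratio[other] - self.step)
        self.count = 0
        return True

    def change_patch(self):
        """Switch patches; the changeover pecks model travel time and effort."""
        self.patch = "right" if self.patch == "left" else "left"
        self.count = 0
        return self.changeover_pecks  # unreinforced responses spent on the center key
```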

Ongoing Behavior and Schedule Effects


Zeiler’s (1977) point that schedules of reinforcement typically affect operant behavior is well
taken. Experimenters risk misinterpreting results when they ignore possible schedule effects. This
is because schedules of reinforcement may interact with a variety of other independent variables,
producing characteristic effects. For example, when every response on a fixed-ratio schedule of
reinforcement (reinforcement after a fixed number of responses) is shocked, the pause length after

reinforcement increases (Azrin, 1959). Once the animal emits the first response, however, the oper-
ant rate to finish the run of responses is unaffected. In other words, the pause increases with continu-
ous punishment, but otherwise behavior on the schedule remains the same. A possible conclusion is
that punishment (shock) reduces the tendency to begin responding; once started, however, behavior
is not suppressed by contingent aversive stimulation.
This conclusion is not completely correct, as further experiments have shown that punishment has
other effects when behavior is maintained on a different schedule of reinforcement (e.g., Azrin & Holz,
1961). When behavior is reinforced after a fixed amount of time (rather than responses), an entirely differ-
ent result occurs. On this kind of schedule, when each operant is punished, the pattern of behavior remains
the same and the rate of response declines. Obviously, conclusions concerning the effects of punishment
on pattern and rate of response cannot be drawn without considering the schedule of reinforcement main-
taining the behavior. That is, the effects of punishment depend on the schedule of reinforcement. These
findings have applied importance for the regulation of human behavior by social punishment (fines, taxes,
and sentencing) administered through the legal system. When punishment “doesn’t work,” one thing to
check is the schedule of reinforcement maintaining the behavior labeled as illegal.
In summary, schedules of reinforcement produce reliable response patterns, which are con-
sistent across different reinforcers, organisms, and operant responses. In our everyday experience,
schedules of reinforcement are so common that we take such effects for granted. We wait for a taxi
to arrive, line up at a store to have groceries scanned, or solve 10 math problems for homework.
These common episodes of behavior and environment interaction illustrate schedules of reinforce-
ment operating in our everyday lives.

FOCUS ON: A System of Notation


We have found that using a notation system greatly improves the understanding of con-
tingencies and reinforcement schedules. Our notation system is based on Mechner’s (1959)
description of reinforcement contingencies (see Mechner, 2011 for a behavioral notation
system appropriate to the social sciences). We have simplified the notation and relabeled
some of the symbols. The system of notation only describes independent variables, and is
similar to a flow chart sometimes used in computer programming. Thus, Mechner notation
describes what the experimenter (instrumentation or computer) does, not the behavior
of organisms. In other words, Mechner notation represents the way (sequence of events
and the response requirements) that schedules of reinforcement are arranged. Cumulative
records or other data collected by computers such as rate of response describe what the
organism does on those schedules (the dependent variable).

SYMBOL EVENT
S Stimulus or event
Sr Reinforcer
Sr+ Positive reinforcer
Sr− Negative reinforcer (aversive stimulus)
SD Discriminative stimulus (event signaling reinforcement)
S∆ S-delta (a discriminative stimulus that signals extinction)
Save Conditioned aversive stimulus (an event that has signaled punishment)
R Response (operant class)
Ra Response of type a (i.e., a response on lever a)

TIME AND NUMBER SYMBOLS


F Fixed
V Variable
T Time
N Number

Relationships
The horizontal arrow connecting two events (i.e., A → B) indicates that one event follows
another. When the arrow leads to a consequence, as in R → Sr, the arrow is read as produc-
es. In this case, a response (R) produces a consequence (Sr). If the arrow leads to a response,
as in Ra → Rb, it is read as produces a condition where. In other words, response Ra “sets
up” or allows response Rb to produce an effect. For example, a press on lever “a” creates
a situation where a press on lever “b” results in food.

Brackets
All conditions listed vertically inside a bracket go into effect simultaneously (Figure 5.2). For example, A and B are conditions that occur at the same time, and the occurrence of B leads to event C.
When a vertical arrow cuts across a horizontal arrow (Figure 5.3), it means that the diagrammed event is prevented. In the diagram, conditions A and B occur at the same time. Event A leads to condition C, but event B blocks the A → C relationship. In other words, A leads to C but not if A and B occur together.
When events repeat (Figure 5.4), a horizontal arrow is used that starts at the end of a sequence and goes back to the beginning. In the presence of condition A, the event B produces C, and after C occurs the sequence repeats.
Mechner notation is especially helpful when complex contingencies are involved and
the experimenter has to program a computer or other instrumentation for contingencies
arranged in an operant chamber. Using this notation system also aids students in specify-
ing exactly what the events, requirements, and their interactions are in an experiment.
Finally, the notation makes explicit the programmed contingencies that control the be-
havior of organisms.
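As a rough illustration of how the notation maps onto programmed contingencies, the hypothetical Python sketch below implements the Ra → Rb → Sr reading described above, in which a response on lever "a" sets up the condition under which a response on lever "b" produces reinforcement. The function and variable names are invented for this example, and only the independent variable is modeled, as the text notes.

```python
# A minimal, hypothetical sketch of the "Ra sets up Rb -> Sr" relation in
# Mechner notation. It models what the programming equipment does, not the
# behavior of the organism.

def run_chain(responses):
    """responses: sequence of 'a' and 'b' presses; returns number of reinforcers."""
    b_armed = False      # becomes True once Ra has occurred (Ra "sets up" Rb)
    reinforcers = 0
    for r in responses:
        if r == "a":
            b_armed = True                 # Ra produces a condition where Rb is effective
        elif r == "b" and b_armed:
            reinforcers += 1               # Rb -> Sr+
            b_armed = False                # the sequence then repeats from the start
    return reinforcers

print(run_chain(["b", "a", "b", "b", "a", "a", "b"]))  # -> 2
```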

FIG. 5.2 Relations within brackets in Mechner notation are shown. A and B occur and B produces event C.
FIG. 5.3 Relations within brackets in Mechner notation are shown. A and B occur; A produces event C but not if A and B occur together.
FIG. 5.4 Relations within brackets in Mechner notation are shown. A and B occur and B produces event C. After C occurs the sequence repeats.

SCHEDULES OF POSITIVE REINFORCEMENT

Continuous Reinforcement
Continuous reinforcement, or CRF, is probably the simplest schedule of reinforcement. On this
schedule, every operant required by the contingency is reinforced. For example, every time a hungry
pigeon pecks a key, food is presented. When every operant is
followed by reinforcement, responses are emitted relatively
quickly depending upon the time to consume the reinforcer.
The organism continues to respond until it is satiated. Sim-
ply put, when the bird is hungry (food deprived), it rapidly
pecks the key and eats the food until it is full (satiated). If the
animal is again deprived of reinforcement and exposed to a
CRF schedule, the same pattern of responding followed by
satiation is repeated. Figure 5.5 is a typical cumulative record of performance on continuous reinforcement. As mentioned in Chapter 4, the typical vending machine delivers products on a continuous (CRF) schedule.

FIG. 5.5 Performance is shown on a continuous reinforcement schedule. Hatch marks indicating reinforcement are omitted since each response is reinforced. The flat portion of the record occurs when the animal stops making the response because of satiation.

Conjugate reinforcement is a type of CRF schedule in which properties of reinforcement, including the rate, amplitude, and intensity of reinforcement, are tied to particular dimensions of the response (see Weisberg & Rovee-Collier,
1998 for a discussion and examples). For example, loud,
energetic, and high-rate operant crying by infants is often
correlated with rapid, vigorous, and effortful caretaking (reinforcement) by parents. Basically, a
repetitive “strong” response by the infant results in proportionally quick, “strong” caretaking (rein-
forcement) by the parents. Many repetitive behavior problems (stereotypy), such as head banging
by atypically developing children, are automatically reinforced by perceptual and sensory effects
(Lovaas, Newsom, & Hickman, 1987), in which high-rate, intense responding produces equally
rapid, strong sensory reinforcement—making this behavior difficult to manage (see Rapp, 2008 for
a brief review). Research with infants on conjugate schedules, involving leg thrusting for visual/
auditory stimulation (e.g., stronger leg thrusts produce a clearer image), has shown rapid acquisition
with higher peak responding than on simple CRF schedules (Voltaire, Gewirtz, & Pelaez, 2005).
Additional research has used college students responding to clarify pictures on a computer monitor;
in this study, students’ responding was sensitive to change in intensity of the visual stimulus, rate of
decrease during extinction, and rate of decrease with conjugate negative punishment (MacAleese,
Ghezzi, & Rapp, 2015). Further experimental analysis of this type of CRF schedule seems warranted.
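A minimal sketch, assuming a simple linear mapping, of how a conjugate CRF arrangement might be programmed: every response is reinforced, but the intensity of the reinforcer is tied to response amplitude. The gain and ceiling values are illustrative assumptions, not parameters from the studies cited above.

```python
# Conjugate CRF sketch: reinforcer intensity (e.g., picture clarity, music
# volume) is proportional to the amplitude of the response. Values assumed.

def conjugate_reinforcer(response_amplitude, gain=0.8, max_intensity=1.0):
    """Return reinforcer intensity on a 0-1 scale, proportional to amplitude."""
    return min(max_intensity, gain * response_amplitude)

# Stronger responses produce proportionally stronger reinforcement.
for amplitude in (0.2, 0.5, 1.0, 1.5):
    print(amplitude, "->", round(conjugate_reinforcer(amplitude), 2))
```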

CRF and Resistance to Extinction


Continuous reinforcement (CRF) generates weak resistance to extinction compared with intermittent
reinforcement (Harper & McLean, 1992). Recall from Chapter 4 that resistance to extinction is a mea-
sure of persistence when reinforcement is discontinued. This perseverance can be measured in several
ways. The most obvious way to measure resistance to extinction is to count the number of responses
and measure the length of time until operant level is reached. Again, remember from Chapter 4 that

operant level refers to the rate of a response before behavior is reinforced. For example, a laboratory
rat could be placed in an operant chamber with no explicit contingency of reinforcement in effect. The
number of times the animal presses the lever during a 2-h exploration of the chamber is a measure
of operant level, or in this case baseline. Once extinction is in effect, measuring the time taken and
number of responses made until operant level is attained is the best gauge of resistance to extinction.
Although continuing extinction until operant level is obtained provides the best measure of
behavioral persistence, this method requires considerable time and effort. Thus, arbitrary measures
that take less time are usually used. Resistance to extinction may be estimated by counting the
number of responses emitted over a fixed number of sessions. For example, after exposure to CRF,
reinforcement could be discontinued and the number of responses made in three daily 1-h sessions
counted. Another index of resistance to extinction is based on how fast the rate of response declines
during unreinforced sessions. The point at which no response occurs for 5 min may be used to index
resistance. The number of responses and time taken to that point are used as indicators of behavioral
persistence or resistance to extinction. The important criterion for any method is that it must be
quantitatively related to extinction of responding.
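As an illustration of one of these arbitrary measures, the short Python sketch below counts responses and elapsed time up to the first pause that meets a 5-min no-response criterion. The timestamps are made-up illustration data and the function name is hypothetical.

```python
# A minimal sketch of one resistance-to-extinction index: responses and time
# until the first criterion pause (here 5 min = 300 s) during extinction.

def resistance_to_extinction(response_times, criterion_s=300):
    """Return (responses, seconds) up to the first pause of criterion_s seconds."""
    last = 0.0
    count = 0
    for t in response_times:
        if t - last >= criterion_s:   # a criterion pause ends the measurement
            break
        count += 1
        last = t
    return count, last

times = [2, 9, 15, 40, 90, 200, 260, 900, 905]   # hypothetical data, seconds into extinction
print(resistance_to_extinction(times))           # -> (7, 260)
```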
Hearst (1961) investigated the resistance to extinction produced by CRF and intermittent schedules.
In this experiment, birds were trained on CRF and two intermittent schedules that provided reinforce-
ment for pecking a key. The number of extinction responses that the animals made during three daily
sessions of nonreinforcement was then counted. Basically, Hearst found that the birds made many more
extinction responses after training on an intermittent schedule than after continuous reinforcement.

Response Stereotypy on CRF


On CRF schedules, the form or topography of response becomes stereotypical. In a classic study,
Antonitis (1951) found that operant responses were repeated with very little change or variability in
topography on a CRF schedule. In this study, rats were required to poke their noses anywhere along
a 50-cm horizontal slot to get a food pellet (see Figure 5.6). Although not required by the contin-
gency, the animals frequently responded at the same position on the slot. Only when the rats were
placed on extinction did their responses become more variable. These findings are not limited to
laboratory rats, and may reflect a principle of behavior—reinforcement narrows operant variability
while extinction increases it; one might say that “failure creates innovation” (see research on this
issue in Chapter 4, “Focus On: Reinforcement and Problem Solving”).
Further research with pigeons suggests that response variability may be inversely related to the rate of
reinforcement. In other words, as more and more responses are reinforced, less and less variation occurs in the
members of the operant class. Herrnstein (1961a)
reinforced pigeons for pecking on an intermittent
schedule. The birds pecked at a horizontal strip
and were occasionally reinforced with food. When
some responses were reinforced, most of the birds
pecked at the center of the strip—although they
were not required to do so. During extinction, the
animals made fewer responses to the center and
more to other positions on the strip. Eckerman and
Lanson (1969) replicated this finding in a subse-
quent study, also with pigeons. They varied the
rate of reinforcement and compared response vari-
ability under CRF, intermittent reinforcement, and
extinction. Responses were stereotypical on CRF and became more variable when the birds were on extinction or on an intermittent schedule.

FIG. 5.6 The apparatus used by Antonitis (1951). Rats could poke their noses anywhere along the 50-cm horizontal slot to obtain reinforcement.

One interpretation of these findings is that organisms become more variable in their respond-
ing as reinforcement becomes less frequent or predictable. When a schedule of reinforcement is
changed from CRF to intermittent reinforcement, the rate of reinforcement declines and response
variability increases. A further change in the rate of reinforcement occurs when extinction is started.
In this case, the operant is no longer reinforced and response variation is maximal. The general
principle appears to be “When things no longer work, try new ways of behaving.” Or, as the saying
goes, “If at first you don’t succeed, try, try again.”
When solving a problem, people usually use a solution that has worked in the past. When
the usual solution does not work, most people—especially those with a history of reinforcement
for response variability and novelty—try novel approaches to problem solving. Suppose that
you are a camper who is trying to start a fire. Most of the time, you gather leaves and sticks,
place them in a heap, strike a match, and start the fire. This time the fire does not start. What do
you do? If you are like most of us, you try different ways to get the fire going, many of which
may have worked in the past. You may change the kindling, add newspaper, use lighter fluid,
swear at the fire pit, or even build a shelter. Clearly, your behavior becomes more variable and
inventive when reinforcement is withheld after a period of success. This increase in topographic
variability during extinction after a period of reinforcement has been referred to as resurgence
(Epstein, 1985), possibly contributing to the development of creative or original behavior on
the one hand (Neuringer, 2009), and relapse of problem behavior during treatment on the other
(Shahan & Sweeney, 2013).
In summary, CRF is the simplest schedule of positive reinforcement. On this schedule, every
response produces reinforcement. Continuous reinforcement produces weak resistance to extinction
and generates stereotypical response topographies. Resistance to extinction and variation in form of
response both increase on extinction and intermittent schedules.

RATIO AND INTERVAL SCHEDULES OF REINFORCEMENT
On intermittent schedules of reinforcement, some rather than all responses are reinforced. Ratio
schedules are response based—these schedules are set to deliver reinforcement following a pre-
scribed number of responses. The ratio specifies the
number of responses required for reinforcement.
Interval schedules pay off when one response is
made after some amount of time has passed. Inter-
val and ratio schedules may be fixed or variable.
Fixed schedules set up reinforcement after a fixed
number of responses have occurred, or after a con-
stant amount of time has passed. On variable sched-
ules, response and time requirements vary from one
reinforcer to the next. Thus, there are four basic
schedules—fixed ratio, variable ratio, fixed interval,
and variable interval. In this section, we describe
these four basic schedules of reinforcement (shown
in Figure 5.7) and illustrate the typical response patterns that they produce. We also present an analysis of some of the reasons for the effects produced by these basic schedules.

FIG. 5.7 A table is shown of the four basic schedules of positive reinforcement.
Source: Adapted from C. B. Ferster, S. Culbertson, & M.C.P. Boren (1975). Behavior principles. Englewood Cliffs, NJ: Prentice Hall.

Ratio Schedules
Fixed Ratio
A fixed-ratio (FR) schedule is programmed to deliver reinforcement after a fixed number of
responses have been made. Continuous reinforcement (CRF) is defined as FR 1—the ratio is one
reinforcer for one response. Figure 5.8 presents a fixed-ratio schedule diagrammed in Mechner
notation. The notation is read, “In the presence of a discriminative stimulus (SD), a fixed number (N)
of responses (R) produces unconditioned reinforcement (SR+).” In a simple animal experiment, the
SD is sensory stimulation arising from the operant chamber; the response is a lever press and food
functions as reinforcement. On fixed-ratio 25 (FR 25), 25 lever presses must be made before food is
presented. After reinforcement, the returning arrow indicates that another 25 responses are required
to again produce reinforcement.
The symbol N is used to indicate that fixed-ratio schedules can assume any value. Of course, it
is unlikely that very high values (say, FR 100,000,000) would ever be completed. Nonetheless, this
should remind you that Mechner notation describes the independent variable, not what the organ-
ism does. Indeed, FR 100,000,000 could be easily programmed, but this schedule is essentially an
extinction contingency because the animal probably never will complete the response requirement
for reinforcement.
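A minimal sketch of how an FR contingency might be programmed, following the description above: in the presence of the SD, a fixed number of responses produces reinforcement and the count then resets. The class and method names are hypothetical.

```python
# Fixed-ratio controller sketch: every Nth response is reinforced and the
# count resets (the returning arrow in the Mechner diagram). FR 1 is CRF.

class FixedRatio:
    def __init__(self, n):
        self.n = n          # ratio requirement (FR n)
        self.count = 0      # responses emitted since the last reinforcer

    def response(self):
        """Record one response; return True if it produces reinforcement."""
        self.count += 1
        if self.count >= self.n:
            self.count = 0  # the sequence repeats
            return True
        return False

fr25 = FixedRatio(25)
reinforcers = sum(fr25.response() for _ in range(100))
print(reinforcers)  # 100 responses on FR 25 -> 4 reinforcers
```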
In 1957, Ferster and Skinner described the FR schedule and the characteristic effects, patterns,
and rates, along with cumulative records of performance on about 15 other schedules of reinforce-
ment. Their observations remain valid after literally thousands of replications: FR schedules produce a rapid run of responses, followed by reinforcement, and then a pause in responding (Ferster & Skinner, 1957). An idealized cumulative record of behavior on fixed ratio is presented in Figure 5.9. The record looks somewhat like a set of stairs (except at very small FR values, as shown by Crossman, Trapp, Bonem, & Bonem, 1985). There is a steep period of responding (run of responses), followed by reinforcement (oblique line), and finally a flat portion (the pause)—a pattern known as break and run.

FIG. 5.8 A fixed-ratio schedule of positive reinforcement is diagrammed in Mechner notation. In the presence of an SD, a fixed number of responses (NR) results in reinforcement (Sr+). As indicated by the returning arrow, the sequence repeats such that another fixed number of responses will again produce reinforcement.
FIG. 5.9 A cumulative record is shown of a well-developed performance on an FR 100 schedule of reinforcement. The typical break-and-run pattern is presented. Reinforcement is indicated by the hatch marks. This is an idealized record that is typical of performance on many fixed-ratio schedules.

During extinction, the break-and-run pattern shows increasing periods of pausing followed by high rates of response. In a cumulative record of a pigeon’s performance for the transition from FR 60 (after 700 reinforcements) to extinction, the pausing after reinforcement comes to dominate the record. A high rate of response (approximately 5 pecks per second), however, is also notable when it does occur (Ferster & Skinner, 1957, Bird 31, p. 58).
The flat part of the cumulative record is often called the postreinforcement pause (PRP), to indicate where it occurs. The pause

in responding after reinforcement does not occur because the organism is consuming the food.
Skinner (1938, p. 288) indicated that the length of the PRP depended on the preceding reinforce-
ment, and called it a postreinforcement pause. He noted that on FR schedules (and fixed-interval
schedules) one reinforcer never immediately follows another. Thus, the occurrence of reinforcement
became discriminative for nonreinforcement (SΔ), and the animal paused. Subsequent research has
shown that the moment of reinforcement contributes to the length of the PRP, but is not the only
controlling variable (Schlinger, Derenne, & Baron, 2008).
Detailed investigations of PRP on FR schedules indicate that the upcoming ratio requirement
is perhaps more critical. As the ratio requirement increases, longer and longer pauses appear in the
cumulative record. At extreme ratios there may be almost no responding. If responding occurs at all,
the animal responds at high rates even though the number of responses per reinforcement is very
high. Mixed FR schedules described later in this chapter also illustrate the influence of to-be-com-
pleted response requirements on FR pausing. The number of responses required and the size of the
upcoming reinforcer have both been shown to influence PRP (Inman & Cheney, 1974). Calling this
pause a “post”-reinforcement event accurately locates the pause, but the upcoming requirements
exert predominant control over the PRP. Thus, contemporary researchers often refer to the PRP as a
preratio pause (e.g., Schlinger et al., 2008).
Conditioned reinforcers such as money, praise, and successful completion of a task also pro-
duce a pause when they are scheduled on fixed ratio. Consider what you might do if you had five sets
of 10 math problems to complete for a homework assignment. A good bet is that you would solve
10 problems, and then take a break before starting on the next set. When constructing a sun deck,
one of the authors bundled nails into lots of 50 each. This had an effect on the “nailing behavior”
of friends who were helping to build the deck. The response pattern that developed was to put in 50
nails, then stop, take a drink, look over what had been accomplished, have a chat, and finally start
nailing again. In other words, this simple scheduling of the nails generated a break-and-run pattern
typical of FR reinforcement.
These examples of FR pausing suggest that the analysis of FR schedules has relevance for
human behavior. We often talk about procrastination and people who put off or postpone doing
things. It is likely that some of this delay in responding is similar to the pausing induced by the ratio
schedule. A person who has a lot of upcoming work to complete (ratio size) may show a period of
low or no productivity. Human procrastination may be modeled by animal performance on ratio
schedules; translational research linking human productivity to animal performance on ratio sched-
ules, however, has yet to be attempted (Schlinger et al., 2008).
In addition to pausing and procrastination, fixed-ratio schedules have been used to investigate
the economics of work. Researchers in behavioral economics often design experiments using FR
schedules to manipulate the price (ratio size) per reinforcement, holding the “unit of cost” constant.
The equal cost assumption holds that each response or unit toward completion of the ratio on an FR
schedule is emitted with equal force or effort—implying that the cost of response does not change
as the animal completes the ratio. But evidence is mounting that the force of response changes as
the animal fulfills the ratio requirement—suggesting that each response does not have an equal cost.

Variable Ratio
Variable-ratio (VR) schedules are similar to fixed ratio except that the number of responses
required for reinforcement changes after each reinforcer is presented. A variable-ratio schedule is
literally a series of fixed ratios with each FR of a different size. The average number of responses
to reinforcement is used to define the VR schedule. A subject may press a lever for reinforcement
5 times, then 15, 7, 3, and 20 times. Adding these response requirements for a total of 50 and
then dividing by the number of separate response runs (5) yields the schedule value, VR 10. The

symbol VR in Figure 5.10 indicates that the number of responses required for any one reinforcer is variable. Other than this change, the contingency is identical to fixed ratio (see Figure 5.8).

FIG. 5.10 A variable-ratio schedule of positive reinforcement is depicted. The symbol VR indicates that the number of responses required for reinforcement is varied from one sequence to the next. The average number of responses required for reinforcement indexes the schedule. That is, VR 10 requires an average of 10 responses before reinforcement is presented.

In general, ratio schedules produce a high rate of response. When VR and FR schedules are compared, responding is typically faster on VR. One reason for this is that pausing after reinforcement (PRP) is reduced or eliminated when the ratio contingency is changed from
fixed to variable. This provides further evidence
that the PRP does not occur because the animal
is tired or is consuming the reinforcer (i.e., eat-
ing food). A rat or pigeon responding for food
on VR does not pause as many times, or for as
long, after reinforcement. When VR schedules
are not excessive, PRPs do occur, although
these pauses are typically smaller than those
generated by FR schedules (Mazur, 1983). Figure 5.11 portrays a typical pattern of response on a VR schedule of positive reinforcement.

FIG. 5.11 A cumulative graph is shown of typical responding on a VR schedule of reinforcement. Reinforcement is indicated by the hatch marks. Notice that PRPs are reduced or eliminated when compared with FR performance.

A VR schedule with a low mean ratio can contain some very small ratio requirements. For example, on a VR 10 schedule there cannot be many ratio requirements above 20 responses because, to offset those high ratios and average 10, there will have to be many very low ratios. It is the occasional occurrence of a reinforcer right after another reinforcer, the short runs to reinforcement, that reduces the likelihood of pausing on a VR schedule of reinforcement. Variable-ratio schedules with high mean ratios (e.g., VR 100) have fewer short ratios following one another and typically generate longer PRPs.
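To make the arithmetic of a VR schedule concrete, the sketch below builds a VR 10 from the individual ratios used in the text (5, 15, 7, 3, and 20) and cycles through them in a shuffled order, as a programmed series of ratios might. The shuffling scheme is an illustrative assumption.

```python
import random

# The mean of the programmed ratios defines the VR value; the sequence
# repeats after all programmed ratios have been completed, as the text notes.

ratios = [5, 15, 7, 3, 20]                  # sums to 50 over 5 runs
print(sum(ratios) / len(ratios))            # -> 10.0, i.e., VR 10

def vr_sequence(ratios):
    """Yield ratio requirements in a shuffled order, repeating after each pass."""
    while True:
        order = ratios[:]
        random.shuffle(order)
        for r in order:
            yield r

gen = vr_sequence(ratios)
print([next(gen) for _ in range(5)])        # one shuffled pass through the ratios
```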
The change from VR reinforcement to extinction initially shows little or no change in rate of
response. A pigeon on VR 110 shows a high steady rate of response (approximately 3 pecks per
second). With the onset of extinction, the bird continues to respond at a similar high rate for about
3000 responses, followed by a shift to a somewhat lower rate of response for 600 responses. The last
part of the record shows long pausing and short bursts of responses at a rate similar to the original
VR 110 performance. The pauses become longer and longer and eventually all responding stops, as
it does on FR schedules (Ferster€& Skinner, 1957, pp. 411–412).
An additional issue concerning VR schedules is that the number of responses for reinforcement
is unpredictable, but it is not random. In fact, the sequence repeats after all the programmed ratios
have been completed and, on some VR schedules, short ratios may occur more frequently than with
a random sequence. A schedule with a pseudo-random pattern of response to reinforcement values
is called a random-ratio (RR) schedule of reinforcement. Research has shown that performance on
RR schedules resembles that on a VR schedule, but these probabilistic schedules “lock you in” to
high rates of response, as in gambling, by early runs of payoffs and by the pattern of unreinforced
responses (Haw, 2008).
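One common way to program a probabilistic ratio contingency is to reinforce each response with a fixed probability p, so the ratio to reinforcement averages 1/p; the pseudo-random series of ratio values described above is an alternative arrangement. The sketch below uses the per-response probability version, with an illustrative p value.

```python
import random

# Random-ratio sketch: each response is reinforced with probability p,
# so an RR 10 is approximated with p = 0.1.

def rr_response(p=0.1):
    """Return True if this single response is reinforced."""
    return random.random() < p

responses = 10000
reinforcers = sum(rr_response(0.1) for _ in range(responses))
print(responses / max(reinforcers, 1))   # average responses per reinforcer, near 10
```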
In everyday life, variability and probability are routine. Thus, ratio schedules involving prob-
abilistic payoffs (or RR schedules) are more common than strict VR or FR contingencies from the

laboratory. You may have to hit one nail three times to drive it in, and the next nail may take six
swings of the hammer. It may, on average, take 70 casts with a fly rod to catch a trout, but any one
strike is probabilistic. In baseball, the batting average reflects the player’s schedule of reinforce-
ment. A batter with a .300 average gets 3 hits for 10 times at bat on average, but nothing guarantees
a hit for any particular time at bat. The schedule depends on a complex interplay among conditions
set by the pitcher and the skill of the batter.

Interval Schedules
Fixed Interval
On fixed-interval (FI) schedules, an operant is reinforced after a fixed amount of time has passed.
For example, on a fixed-interval 90-s schedule (FI 90 s), one bar press after 90 s results in reinforce-
ment. Following reinforcement, another 90-s period goes into effect, and after this time has passed
another response will produce reinforcement. It is important to note that responses made before the
time period has elapsed have no effect. Notice that in Figure 5.12, one response (R) produces reinforcement (Sr+) after the fixed amount of time (FT) has passed. [Note: there is a schedule called fixed time (FT) in which reinforcement is delivered without a response following a set, or fixed, length of time. This is also referred to as a response-independent schedule. Unless otherwise specified, one should always assume that a response is required on whatever schedule is in effect.]

FIG. 5.12 A fixed-interval schedule. In the presence of an SD, one response (R) is reinforced after a fixed amount of time (FT). Following reinforcement (S), the returning arrow states that the sequence starts again. This means that the fixed-time interval starts over and, after it has elapsed, one response will again be reinforced.

When organisms are exposed to interval contingencies, and they have no way of telling time, they typically produce many more responses than the schedule requires. Fixed-interval schedules produce a characteristic steady-state pattern of responding. There is a pause after reinforcement (PRP), then a few probe responses, followed by more and more rapid responding to a constant high rate as the interval times out. This pattern of response is called scalloping. Figure 5.13 is an idealized cumulative record of FI performance. Each interreinforcement interval (IRI) can be divided into three distinct classes—the PRP, followed by a period of gradually increasing rate, and finally a high terminal rate of responding.

FIG. 5.13 Fixed-interval schedules usually produce a pattern that is called scalloping. There is a PRP following reinforcement, then a gradual increase in rate of response to the moment of reinforcement. Less common is the break-and-run pattern. Break-and-run occasionally develops after organisms have considerable experience on FI schedules. There is a long pause (break) after reinforcement, followed by a rapid burst (run) of responses.
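A minimal sketch of an FI contingency as described above: responses before the interval elapses have no effect, and the first response after it elapses is reinforced, restarting the interval. The times and event list are hypothetical illustration data.

```python
# Fixed-interval controller sketch (times in seconds).

class FixedInterval:
    def __init__(self, interval_s):
        self.interval_s = interval_s
        self.start = 0.0                 # time the current interval began

    def response(self, t):
        """Record a response at time t; return True if it is reinforced."""
        if t - self.start >= self.interval_s:
            self.start = t               # the next interval starts at reinforcement
            return True
        return False                     # responses before the interval elapses do nothing

fi90 = FixedInterval(90)
for t in (10, 50, 95, 100, 170, 185):
    print(t, fi90.response(t))           # only the presses at 95 s and 185 s pay off
```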
Suppose that you have volunteered to be in an operant experiment. You are brought into a small room, and on one wall there is a lever with a cup under it. Other than those objects, the room is empty. You are not allowed to keep your watch while in the room, and you are told, “Do anything you want.” After some time, you press the lever to see what it does. Ten dollars fall into the cup. A good prediction is that you will press the lever again. You are not told this,

but the schedule is FI 5 min. You have 1 h per day to work on the schedule. If you collect all 12 (60
min ÷ 5 min€=€12) of the scheduled reinforcers, you can make $120 a day.
Assume you have been in this experiment for 3 months. Immediately after collecting a $10 rein-
forcer, there is no chance that a response will pay off (discriminated extinction). But, as you are stand-
ing around or doing anything else, the interval is timing out. You check out the contingency by making
a probe response (you guess the time might be up). The next response occurs more quickly because
even more time has passed. As the interval continues to time out, the probability of reinforcement
increases and your responses are made faster and faster. This pattern of responding is described by the
scallop shown in Figure 5.13, and is typical for FI schedules (Ferster & Skinner, 1957).
Following considerable experience with FI 5 min, you may get very good at judging the time
period. In this case, you would wait out the interval and then emit a burst of responses. Perhaps you
begin to pace back and forth during the session, and you find out that after 250 steps the interval has
almost elapsed. This kind of mediating behavior may develop after experience with FI schedules
(Muller, Crow, & Cheney, 1979). Other animals behave in a similar way and occasionally produce a
break-and-run pattern of responding, similar to FR schedules (Ferster & Skinner, 1957).
Humans use clocks and watches to keep track of time. Based on this observation, Ferster and Skin-
ner (1957, pp. 266–278) asked about the effects of adding a visible clock to an FI schedule. The “clock”
for pigeons was a light that grew in size as the FI interval ran out. The birds produced FI scallops that
were much more uniform than without a clock, showing the control exerted by a timing stimulus.
Another indication of the stimulus control occurred when the clock was reversed (i.e., the light grew
smaller with the FI interval). Under these conditions, the scallop also reversed such that immediately
following reinforcement a high response rate occurred, leading to a pause at the end of the interval. The
FI contingencies, however, quickly overrode the stimulus control by the reverse clock, shifting the pat-
tern back to a typical curve. When a stimulus such as a clock results in inefficient behavior with respect
to the schedule, behavior eventually conforms to the schedule rather than the controlling stimulus.
In everyday life, FI schedules are arranged when people set timetables for trains and buses.
Next time you are at a bus stop, take a look at what people do while they are waiting for the next
bus. If a bus has just departed, people stand around and perhaps talk to each other for a while. Then,
the operant of “looking for the bus” begins at a low rate of response. As the interval times out, the
rate of looking for the bus increases and most passengers are now looking for the arrival of the next
bus. The passengers’ behavior approximates the scalloping pattern we have described in this section.
Schedules of reinforcement are a pervasive aspect of human behavior, but we seldom recognize the
effects of these contingencies.

FOCUS ON: Generality of Schedule Effects
The assumption of generality implies that the effects of contingencies of reinforcement
extend over species, reinforcement, and behavior (Morse, 1966, p. 59; Skinner, 1969, p. 101).
For example, a fixed-interval schedule is expected to produce the scalloping pattern for a
pigeon pecking a key for food, and for a child solving math problems for teacher approval.
Fergus Lowe (1979) conducted numerous studies of FI performance with humans who
press a button to obtain points later exchanged for money. Figure 5.14 shows typical perfor-
mances on fixed-interval schedules by a rat and two human subjects. Building on research by
Harold Weiner (1969), Lowe argued that animals show the characteristic scalloping pattern,
and humans generally do not. Humans often produce one of two patterns—an inefficient
Schedules of Reinforcement╇╇ 151

steady, high-rate of response or an


efficient low-rate, break-and-run per-
formance. Experiments by Lowe and
his colleagues focused on the condi-
tions that produce the high- or low-
rate patterns in humans.
The basic idea is that sched-
ule performance in humans reflects
the influence of language (see FIG. 5.14╇ Typical animal performances on FI schedules
Chapter€12 on verbal behavior). In are shown along with the high- and low-rate performance
conditioning experiments, people usually seen with adult humans.
generate some verbal rule and pro- Source: Adapted from C. F. Lowe (1979). Reinforcement
ceed to behave according to the and the organization of behavior. Wiley: New York, p.
162. Copyright 1979 held by C. F. Lowe. Published with
self-generated rule rather than the permission.
experimentally arranged FI contin-
gencies. Lowe, Beasty, and Bentall
(1983) commented that:

Verbal behavior can, and does, serve a discriminative function that alters the effects of
other variables such as scheduled reinforcement. Unlike animals, most humans are capable
of describing to themselves, whether accurately or inaccurately, environmental events and
the ways in which those events impinge upon them; such descriptions may greatly affect
the rest of their behavior.
(Lowe et al., 1983, p. 162)

In most cases, people who follow


self-generated rules satisfy the require-
ments of the schedule, obtain reinforce-
ment, and continue to follow the rule. For
example, one person may say, “I€should
press the button fast,” and another says,
“I€should count to 50 and then press the
button.” Only when the contingencies
are arranged so that self-generated rules
conflict with programmed reinforce-
ment do people reluctantly abandon
the rule and behave in accord with the
contingencies (Baron€& Galizio, 1983).
Humans also naturally find it easy to fol-
low a self-instruction or rule and effort-
ful to reject it (Harris, Sheth,€& Cohen, FIG. 5.15╇ Infant Jon in study of FI schedules had to
touch the metal cylinder to receive small snack items
2007).
like pieces of fruit, bread, and candy. The order of
One implication of Lowe’s analysis the FI values for Jon were 20, 30, 10, and 50 s.
is that humans without language skills Source: From C. Fergus Lowe, A. Beasty,€& R.╛P.
would show characteristic effects of Benthall (1983). The role of verbal behavior in
schedules. Lowe et€al. (1983) designed human learning: Infant performance on fixed-interval
schedules. Journal of the Experimental Analysis
an experiment to show typical FI perfor- of Behavior, 39, pp. 157–164. Reproduced with
mance by children less than 1€year old. permission and copyright 1983 held by the Society
Figure€5.15 shows an infant (Jon) seated for the Experimental Analysis of Behavior.
152╇╇ Schedules of Reinforcement

in a highchair and able to touch a round metal cylinder. Touching the cylinder produced
a small bit of food (pieces of fruit, bread, or candy) on FI schedules of reinforcement.
A€second infant, Ann, was given 4 s of music played from a variety of music boxes on the
same schedules. Both infants produced response patterns similar to the rat’s performance
in Figure€5.14. Thus, infants who are not verbally skilled behave in accord with the FI con-
tingencies. There is no doubt that humans become more verbal as they grow up; however,
many other changes occur from infancy to adulthood. A€possible confounding factor is the
greater experience that adults compared to infants have with ratio-type contingencies of
reinforcement. Infants rely on the caregiving of other people. This means that most of an
infant’s reinforcement is delivered on the basis of time and behavior (interval schedules).
A€baby is fed when the mother has time to do so, although fussing may decrease the inter-
val. As children get older, they begin to crawl and walk and reinforcement is delivered
more and more on the basis of their behavior (ratio schedules). When this happens, many
of the contingencies of reinforcement change from interval to ratio schedules. The amount
of experience with ratio-type schedules of reinforcement may contribute to the differ-
ences between adult human and animal/infant performance on fixed-interval schedules.
In fact, research by Wanchisen, Tatham, and Mooney (1989) has shown that rats per-
form like adult humans on FI schedules after a history of ratio reinforcement. The ani-
mals were exposed to variable-ratio (VR) reinforcement and then were given 120 sessions
on a fixed-interval 30-s schedule (FI 30 s). Two patterns of response developed on the
FI schedule—a high-rate pattern with little pausing and a low-rate pattern with some
break-and-run performance. These patterns of performance are remarkably similar to the
schedule performance of adult humans (see Figure€5.14). One implication is that human
performance on schedules may be explained by a special history of ratio-like reinforce-
ment rather than self-generated rules. At this time, it is reasonable to conclude that both
reinforcement history and verbal ability contribute to FI performance of adult humans (see
Bradshaw€& Reed, 2012 for appropriate human performance on random-ratio (RR) and
random-interval (RI) schedules only for those who could verbally state the contingencies).

Variable Interval
On a variable-interval (VI) schedule, responses are reinforced after a variable amount of time
has passed (see Figure€5.16). For example, on a VI 30-s schedule, the time to each reinforcement
changes but the average time is 30 s. The symbol V indicates that the time requirement varies from
one reinforcer to the next. The average amount of time required for reinforcement is used to define
the schedule.
Interval contingencies are common in the ordinary
world of people and other animals. For example, people
stand in line, sit in traffic jams, wait for elevators, time a
boiling egg, and are put on hold. In everyday life, variable
time periods occur more frequently than fixed ones. Wait-
ing in line to get to a bank teller may take 5 min one day
FIG. 5.16╇ The illustration depicts a and half an hour the next time you go to the bank. A€wolf
variable-interval schedule. The symbol pack may run down prey following a long or short hunt.
VI stands for variable interval and
indicates that the schedule is indexed A€baby may cry for 5 s, 2 min, or 15 min before a parent
by the average time requirement for picks up the child. A€cat waits varying amounts of time in
reinforcement. ambush before a bird becomes a meal. Waiting for a bus
Schedules of Reinforcement╇╇ 153

is rarely reinforced on a fixed schedule, despite the


efforts of transportation officials. The bus arrives
around an average specified time and waits only a
given time before leaving. A€carpool is an example of
a VI contingency with a limited hold. The car arrives
more or less at a specified time, but waits for a rider
only a limited (and usually brief) time. In the lab-
oratory, this limited-hold contingency—where the
reinforcer is available for a set time after a variable
interval—when added to a VI schedule increases the
rate of responding by reinforcing short interresponse FIG. 5.17╇ Idealized cumulative pattern of
response produced by a variable-interval
times (IRTs). In the case of the carpool, people on the schedule of reinforcement.
VI schedule with limited hold are ready for pick-up
and rush out of the door when the car arrives.
Figure€5.17 portrays the pattern of response generated on a VI schedule. On this schedule, rate
of response is moderate and steady. The pause after reinforcement that occurs on FI does not usually
appear in the VI record. Notably, this steady rate of response is maintained during extinction. Ferster
and Skinner (1957, pp. 348–349) described the cumulative record of a pigeon’s performance for the
transition from VI 7 min to extinction. The bird maintains a moderately stable rate (1.25 to 1.5 pecks
per second) for approximately 8000 responses. After this, the rate of response continuously declines
to the end of the record. Generally, VI response rates initially continue to be moderate and stable on
extinction, showing an overall large output of behavior (resistance to extinction). Because the rate of
response remains steady and moderate to high, VI performance is often used as a baseline for evaluat-
ing other independent variables. Rate of response on VI schedules may increase or decrease as a result
of experimental manipulations. For example, tranquilizing drugs such as chlorpromazine decrease the
rate of response on VI schedules (Waller, 1961), while stimulants increase VI performance (Segal,
1962). Murray Sidman has commented on the usefulness of VI performance as a baseline:

An ideal baseline would be one in which there is as little interference as possible from other vari-
ables. There should be a minimal number of factors tending to oppose any shift in behavior that
might result from experimental manipulation. A€variable-interval schedule, if skillfully programmed,
comes close to meeting this requirement.
(Sidman, 1960, p. 320)

In summary, VI contingencies are common in everyday life. These schedules generate a mod-
erate steady rate of response, which is resistant to extinction. Because of this characteristic pattern,
VI performance is frequently used as a baseline to assess the effects of other variables, especially
performance-altering drugs.

NOTE ON: VI Schedules, Reinforcement Rate,


and Behavioral Momentum
Behavioral momentum refers to behavior that persists or continues in the presence
of a stimulus for reinforcement (SD) despite disruptive factors (Nevin, 1992; Nevin€&
Grace, 2000; see PRE effect and behavioral momentum in Chapter€4). Furthermore,
154╇╇ Schedules of Reinforcement

response rate declines more slowly relative to its baseline level in the presence of an
SD for high-density than low-density reinforcement (Shull, Gaynor,€& Grimer, 2002).
When you are working at the computer on a report, and keep working even though
you are called to dinner, your behavioral persistence indicates behavioral momen-
tum. Also, if you continue messaging on Facebook despite alternative sources of
reinforcement (watching a favorite TV show), that too shows behavioral momentum.
In the classroom, students with a higher rate of reinforcement (correct answers) for
solving math problems are less likely to be distracted by the sights and sounds out-
side the classroom window than other students with a lower rate of reinforcement
for problem solving.
At the basic research level, Nevin (1974) used a multiple schedule of reinforce-
ment to investigate behavioral momentum. The multiple schedule arranged two
separate VI reinforcement components, each with a discriminative stimulus (SD)
and separated by a third darkened component. Rates of responding were naturally
higher in the richer VI component. But, when free food was provided in the third
darkened component (disruption), responding decreased less in the VI condition with
the higher rate of reinforcement. Thus, behavior in the component with the rich VI
schedule (high rate of reinforcement) showed increased momentum. It continued to
keep going despite the disruption by free food (see also Cohn, 1998; Lattal, Reilly,€&
Kohn, 1998).
Another study by John Nevin and associates compared the resistance to change
(momentum) of behavior maintained on ratio and interval schedules of reinforce-
ment (Nevin, Grace, Holland,€& McLean, 2001). Pigeons pecked keys on a multiple
schedule of random-ratio (RR) random-interval (RI) reinforcement to test relative
resistance to change. On this multiple schedule, a distinctive stimulus (SD) signaled
each component schedule, either RR or RI, and the researchers ensured that the
reinforcement rates for the RR component were equated with those of the RI seg-
ment. Disruptions by free feeding between components, extinction, and pre-feed-
ing (before the session) were investigated. The findings indicated that, with similar
obtained rates of reinforcement, the interval schedule is more resistant to change,
and has higher momentum, than performance on ratio schedules. Notice that resis-
tance to change is exactly the opposite to the findings for rate of response on
these schedules—ratio schedules maintain higher rates of response than interval
schedules.
Currently, researchers are using behavioral momentum theory to evaluate the
long-term effects of reinforcement programs on targeted responses challenged
by disruptions (Wacker et al., 2011). In one applied study, two individuals with
severe developmental disabilities performed self-paced discriminations on a com-
puter using a touch-sensitive screen and food reinforcement (Dube€& McIlvane,
2001). Responses on two separate problems were differentially reinforced. On-task
behavior with the higher reinforcement rate showed more resistance to change
due to pre-feeding, free snacks, or alternative activities. Thus, the disruptive fac-
tors reduced task performance depending on the prior rates of on-task reinforce-
ment. When performance on a task received a high rate of reinforcement, it was
relatively impervious to distraction compared with performance maintained on a
lower rate schedule. One applied implication is that children with attention deficit
Schedules of Reinforcement╇╇ 155

hyperactivity disorder (ADHD) who are easily distracted in the school classroom
may suffer from low rates of reinforcement for on-task behavior and benefit more
from a change in rates of reinforcement than from administration of stimulant
medications with potential adverse effects.

Basic Schedules and Biofeedback


The major independent variable in operant conditioning is the program for delivering consequences,
called the schedule of reinforcement. Regardless of the species, the shape of the response curve for a
given schedule often approximates a predictable form. Fixed-interval scallops, fixed-ratio break and
run, and other patterns were observed in a variety of organisms and were highly uniform and regular
(see exceptions in “Focus On: Generality of Schedule Effects” in this chapter). The predictability
of schedule effects has been extended to the phenomenon of biofeedback and the apparent willful
control of physiological processes and bodily states.
Biofeedback usually is viewed as conscious, intentional control of bodily functions, such as
brainwaves, heart rate, blood pressure, temperature, headaches, and migraines—using instruments
that provide information or feedback about the ongoing activity of these systems. An alternative
view is that biofeedback involves operant responses of bodily systems regulated by consequences,
producing orderly changes related to the schedule of “feedback.”
Early research showed schedule effects of feedback on heart rate (Hatch, 1980) and blood
pressure (Gamble€& Elder, 1990). Subsequently, behavioral researchers investigated five different
schedules of feedback on forearm-muscle tension (Cohen, Richardson, Klebez, Febbo,€& Tucker,
2001). The study involved 33 undergraduate students who were given extra class credit and a
chance to win a $20 lottery at the end. Three electromyogram (EMG) electrodes were attached
to the underside of the forearm to measure electrical activity produced by muscles while partic-
ipants squeezed an exercise ball. They were instructed to contract their arm “in a certain way”
to activate a tone and light; thus, their job was to produce the most tone/light presentations they
could. Participants were randomly assigned to groups that differed in the schedule of feedback
(tone/light presentations) for EMG electrical responses. Four basic schedules of feedback (FR,
VR, FI, or VI) were programmed, plus CRF and extinction. Ordinarily, in basic animal research,
sessions are run with the same schedule until some standard of stability is reached. In this applied
experiment, however, 15-min sessions were conducted on three consecutive days with a 15-min
extinction session added at the end.
Cumulative records were not collected to depict response patterns, presumably because the
length and number of sessions did not allow for stable response patterns to develop. Instead,
researchers focused on rate of EMG activation as the basic measure. As might be expected, ratio
schedules (FR and VR) produced higher rates of EMG electrical responses than interval contingen-
cies (FI or VI). Additionally, the VI and VR schedules showed the most resistance to extinction (see
“Note On: VI Schedules, Reinforcement Rate, and Behavioral Momentum” in this chapter). CRF
produced the most sustained EMG responding, while FR and VR schedules engendered more mus-
cle pumping action of the exercise ball.
The EMG electrical responses used in this study were sensitive to the schedule of feedback, indi-
cating the operant function of electrical activity in the forearm muscles. Together with studies of bio-
feedback and responses of the autonomic nervous system, the Cohen et€al. (2001) experiment shows
that responses of the somatic nervous system also are under tight operant control of the schedule
156╇╇ Schedules of Reinforcement

of reinforcement (feedback). Further detailed analyses of biofeedback schedules on physiological


responses clearly are warranted, but have been lacking in recent years. In this regard, we recommend
the use of steady-state, single-subject designs that vary the interval or ratio schedule value over a wide
range to help clarify how schedules of feedback regulate seemingly automatic bodily activity.

PROGRESSIVE-RATIO SCHEDULES
On a progressive-ratio (PR) schedule of reinforcement, the ratio requirements for reinforcement
are increased systematically, typically after each reinforcer (Hodos, 1961). In an experiment, the
first response requirement for reinforcement might be set at a small ratio value such as 5 responses.
Once the animal emits 5 responses resulting in reinforcement, the next ratio requirement might
increase by 10 responses (the step size). Now reinforcement occurs only after the animal has pressed
the lever 15 times, followed by ratio requirements of 25 responses, 35, 45, 55, and so on (adding
10 responses on each step). The increasing ratios (5, 15, 25, 35, and so on) are the progression and
give the schedule its name. At some point in the progression of ratios, the animal fails to achieve the
requirement. The highest ratio value completed on the PR schedule is designated the breakpoint.
The type of progression on a PR schedule may be arithmetic, as when the difference between
two ratio requirements is a constant value such as 10 responses. Another kind of progression is geo-
metric, as when each ratio after the first is found by multiplying the previous one by a fixed number.
A€geometric progressive ratio might be 2, 6, 18, and so on, where 3 is the fixed value. The type of
progression (arithmetic or geometric) is an important determinant of behavior on PR schedules.
In one study of PR schedules, Peter Killeen and associates found that response rates on arithmetic
and geometric PR schedules increased as the ratio requirement progressed and then at some point
decreased (Killeen, Posadas-Sanchez, Johansen,€& Thraikill, 2009). Response rates maintained on
arithmetic PR schedules decreased in a linear manner—as the ratio size increased, there was a linear
decrease in response rates. Responses rates on geometric PR schedules, however, showed a negative
deceleration toward a low and stable response rate—as ratio size increased geometrically, response
rates rapidly declined and then leveled off. Thus, the relationship between response rates and ratio
requirements of the PR schedule depends on the type of progression—arithmetic or geometric.
These relationships can be described by mathematical equations, and this is an ongoing area of
research (Killeen et al., 2009).

Progressive-Ratio Schedules and Neuroscience


The PR schedule has also been used in applied research (Roane, 2008). Most of the applied research
on PR schedules uses the giving-up or breakpoint as a way of measuring reinforcement efficacy or
effectiveness, especially of drugs like cocaine. The breakpoint for a drug indicates how much oper-
ant behavior the drug will sustain at a given dose. For example, a rat might self-administer morphine
on a PR schedule as the dose size is varied and breakpoints are determined for each dose size. It is
also possible to determine the breakpoints for different kinds of drugs (e.g., stimulants or opioids),
assessing the drugs’ relative reinforcement effectiveness. In these tests, it is important to recognize
that the time allotted to complete the ratio (e.g., 120 min) and the progression of the PR schedule
(progression of ratio sizes) have an impact on the breakpoints—potentially limiting conclusions
about which drugs are more “addictive” and how the breakpoint varies with increases in drug dose
(dose–response curve).
Schedules of Reinforcement╇╇ 157

Progressive-ratio schedules 3000


d-Amph
allow researchers to assess the rein- Meth
forcing effects of drugs prescribed 2500
to control problem behavior. A€drug

Responses at breakpoint
prescribed to control hyperactivity
2000
might also be addictive—an effect
recommending against its use. The
1500
drug Ritalin® (methylphenidate) is
commonly used to treat attention defi-
cit hyperactivity disorder (ADHD) 1000
and is chemically related to Dexe-
drine® (d-amphetamine). Amphet- 500
amine is a drug of abuse, as are other
stimulants, such as cocaine. Thus, 0
people who are given methylpheni- Placebo Low Medium High
date to modify ADHD might develop Dose Level
addictive behavior similar to behav-
ior maintained by amphetamine. FIG. 5.18╇ Breakpoints produced by 10 drug-abusing
volunteers who self-administered low, medium, and
In one study, human drug-abusing high doses of two stimulants, methylphenidate and
volunteers were used to study the d-amphetamine, as well as a placebo control. The medium
reinforcing efficacy of three doses doses of each drug differ from the placebo but the two drugs
of methylphenidate (16, 32, and 48 do not reliably differ from each other.
mg) and d-amphetamine (8, 16, and Source: Based on data for average breakpoints in W.â•›W.
Stoops, P.E.A. Glaser, M.╛T. Fillmore,€& C.╛R. Rush (2004).
24 mg), including a placebo (Stoops, Reinforcing, subject-rated, performance and physiological
Glaser, Fillmore,€& Rush, 2004). The effects of methylphenidate and d-amphetamine in stimulant
reinforcing efficacy of the drug was abusing humans. Journal of Psychopharmacology, 18,
assessed by a modified PR schedule pp. 534–543, 538.
(over days) where participants had to
press a key on a computer (50, 100,
200, 400, 800, 1600, 3200, and 6400 times) to earn capsules of the drug; completion of the ratio
requirement resulted in oral self-administration of the drug. Additional monetary contingencies
were arranged to ensure continued participation in the study. As shown in Figure€5.18, the results
indicated that the number of responses to the breakpoint increased at the intermediate dose of meth-
ylphenidate and d-amphetamine compared with the placebo control. Thus, at intermediate doses
methylphenidate is similar in reinforcement effectiveness to d-amphetamine. One conclusion is that
using Ritalin® to treat ADHD may be contraindicated due to its potential for abuse, and interventions
based on increasing behavioral momentum may be a preferred strategy, as previously noted in this
chapter (see Stoops, 2008 for a review of the relative reinforcing effects of stimulants in humans
on PR schedules; also Bolin, Reynolds, Stoops,€& Rush, 2013 provide an assessment of d-amphet-
amine self-administration on PR schedules, including verbal ratings of drug effects).
In another context, PR schedules have been used to study the reinforcement efficacy of pal-
atable food on overeating and obesity. Leptin is a hormone mostly produced in the adipocytes (fat
cells) of the white adipose (fat) tissue. The basic function of leptin is to signal when to stop eating—
counteracting neuropeptides that stimulate feeding. Based on a genetic mutation, the ob/ob (obese-
prone) mouse is deficient in the production of leptin—overeating and gaining excessive body weight
compared with the lean-prone littermates of the same strain (see Figure€5.19). Generally, overeating
and obesity vary with genotype (obese-prone vs. lean-prone) of these rodents.
Researchers have investigated whether the reinforcing efficacy of palatable food varies by gen-
otype of the ob/ob mouse (Finger, Dinan, and Cryan, 2010). Obese-prone and lean-prone mice were
158╇╇ Schedules of Reinforcement

trained to make nose-poke responses for flavored


sucrose pellets (detected by photo beams in the
food magazine). Next, the mice were tested on
a PR3 schedule, requiring an increase or step of
3 responses for each pellet, using a linear pro-
gression of ratio values (3, 6, 9, 12, 15, and so
on) with 3 as the first ratio requirement. The
highest ratio completed within a 15-min period
was defined as the breakpoint. All mice received
16 daily sessions to attain stability on the PR3
schedule. After establishing the PR3 baselines,
both obese-prone and lean-prone mice were
administered low and high doses of an anorexic
FIG. 5.19╇ Photograph is shown of the ob/ob
obese-prone mouse and lean-prone littermate. drug (fenfluramine, withdrawn from commer-
The ob/ob genotype has a deficiency in leptin cial use) and given more sessions on the PR3
production that results in obesity when food is schedule. The results for breakpoints showed
freely available. that the obese-prone (leptin-deficient) genotype
Source: Public access photo. did not reliably differ from the lean-prone mice.
The reinforcing efficacy of palatable food was
similar for both genotypes. Also, the anorexic
drug reduced PR3 breakpoints in a dose–response manner, but this effect did not differ by genotype
(obese-prone vs. lean-prone). Apparently, the overeating and excessive weight gain of leptin-deficient
(ob/ob) mice is not due to differences in the reinforcement efficacy of palatable food.
One problem with this conclusion is that animals were only given 15 min to complete the ratio
requirement, and some animals did not achieve stable baselines on the PR3 schedule, even after
16 days. The safest conclusion is that further studies of ob/ob mice on PR schedules are necessary
to determine the reinforcing efficacy of palatable food for the obese-prone rodents (see Kanoski,
Alhadeff, Fortin, Gilbert,€& Grill, 2014 for reduced PR responding for sucrose by rats after leptin
administration to the medial nucleus tractus dolitarius (mMTS) of the brain, amplifying satiation
signals in the gastrointestinal (GI) tract).

PR Schedule of Wheel Running for Food Reinforcement


In the wild, animals usually obtain food and other resources by locomotion and travel within (and
between) food locations or patches. Thus, in the everyday life of animals, food often is contingent on
distance traveled or covered in a day. Viewed as behavior, traveling for food is an operant controlled
by its consequences—the allocation of food arranged by the location or patch. In the laboratory, a
progressive-ratio (PR) schedule can be used to simulate increasing travel demands for food. On this
schedule, increasingly more work or effort is required to obtain the same daily food ration.
Typical operant PR experiments are conducted in an open economy where animals receive bits
of food (reinforcers) for responses within the experiment, but receive most of their food after an
experimental session to maintain adequate body weight. To model the problem of travel for food in
the wild, a closed economy is used where animals that meet the behavioral requirements receive all
of their food (all they are going to get) within the experimental setting. This difference in economy
(open vs. closed) may have effects on behavior independent of the operating PR contingencies,
especially for food consumption and maintenance of body weight.
A novel experiment by a group of biologists from Brazil and England arranged a variation on
the traditional PR schedule, which involved increasing the travel distance in meters (m€=€meters;
Schedules of Reinforcement╇╇ 159

where 1000€m€=€0.6 miles) required to obtain the animal’s free-feeding daily food supply (Fonseca
et al., 2014). Adult male rats were divided into wheel-running contingent (CON) and wheel-running
noncontingent (NON) groups. The CON group was placed in a cage with a running wheel where
the acquisition of food was contingent upon the distance traveled (closed economy). Every 3 days
the distance required to maintain free-feeding levels was increased above the distance set for the
previous 3 days. The NON group was housed and treated identically to the CON group, but food
acquisition was not dependent on running in the wheel (open economy).
During a baseline period, all rats were given 3 days of free food and access to running wheels.
The animals consumed on average 24 g of food a day for an average consumption rate of 1 g per hour.
On average, rats ran 1320€m/day in the wheels during the baseline phase. The next phase involved
arranging a PR schedule for the rats in the CON group. To obtain the initial PR value, the 1320€m of
baseline wheel running was divided by 24 g of food, yielding 1 g of food for each 55€m. A€program-
mable dispenser contained six 4-g pellets (24 g), and rats received these pellets at the completion of
each 220€m (55€m/1 g × 4€=€220€m/4 g) of wheel running (1st pellet 220€m, 2nd pellet 440€m, 3rd
pellet 660€m, 4th pellet 880€m, 5th pellet 1100€m, and 6th pellet 1320€m). The initial PR requirement
stayed in effect for 3 days at which point the ratio (travel distance) increased by 1188€m. With 2508
as the new distance (1 g/104.5€m), rats obtained 4 g of food for each 418€m of wheel running. This
PR value again remained in effect for 3 days at which point the distance requirement increased again
by adding the constant distance 1188€m to 2508€m, yielding 3696€m as the new distance (1 g/154€m)
or 4 g of food for each 616€m of wheel running. This procedure of increasing distance required for
food (PR value) every 3 days continued over the 45 days of the experiment (description of PR con-
tingencies based on Fonseca
et al., 2014 and personal
communication from Dr.
Robert Young, the English
author of the article).
Figure€5.20 shows the
average distance (m) trav-
eled for food for each 3 days
of the experiment by the
rats in the contingent (CON,
open circles) and noncon-
tingent (NON, filled circles)
groups. The line joining the
grey triangles depicts the
increasing distance on the
PR schedule for animals to
obtain their daily free-feed- FIG. 5.20╇ Distance traveled (m) by rats in the progressive-ratio
ing level of food, six 4-g contingent (CON) group (open circles) and the progressive-ratio
pellets. For the rats in the noncontingent (NON) group (filled circles). The linear line with filled
NON group, the distance triangles depicts the increase in the progressive-ratio (PR) value over
3 day periods. Notice the increase in wheel running distance that
traveled each day is low approximates the PR value until day 15 (6000€m) and then falls short
and constant, about 1300€m of the PR requirements and leveling off above 8000€m, without an
on average, as in baseline. obvious breakpoint. See text for an analysis of the PR contingencies
These rats maintained daily and the effects on consumption of food and body weight.
food intake at about 24 g or Source: From I. A. T. Fonseca, R.â•›L. Passos, F.â•›A. Araujo, M. R. M. Lima,
D.╛R. Lacerda, W. Pires, et€al. (2014). Exercising for food: Bringing
6 pellets (data not shown) the laboratory closer to nature. The Journal of Experimental Biology,
and showed increasing body 217, pp. 3274–3281. Republished with permission of the Journal of
weight over days (data not Experimental Biology and The Company of Biologist Limited.
160╇╇ Schedules of Reinforcement

shown). In contrast, rats in the CON group that had to meet the increasing distance requirement to
obtain food at first closely matched their average level of wheel running (travel distance) to the PR
scheduled distance.
Although distance traveled matched the early PR values and rats received the six 4-g pellets,
food consumed (actually eaten) showed a sharp drop (from 24 g to 18 g) for the first 3 days of
wheel running on the PR schedule. Following this initial drop, food consumed partially recovered;
however, consumption by CON rats remained suppressed relative to the NON group (21 g vs. 24 g).
When the distance requirement increased to approximately 6000€m per day (1000€m/pellet), CON
rats’ average distance traveled no longer approximated the PR value—even though rats did complete
longer distances at higher ratios, exceeding 8000€m a day (see Figure€5.20). Rats now traveled less
than required by the PR value, giving up some of the daily food ration that they could have obtained.
One possibility is that the animals were sensitive to energy balance or homeostasis—balancing as
best as possible energy expenditure by wheel running with energy intake from food consumption.
In fact, body weight initially increased, but then leveled off and decreased as distance traveled fell
considerably off the PR requirement and food availability decreased. At PR values between 8000€m
to 9000€m on average, distance traveled leveled off (asymptote) showing no breakpoint (giving-up
value) typical of PR schedules; however, food availability substantially declined, and body weight
plummeted.
The PR schedule and closed economy used in this study generated a severe energy imbalance,
which ultimately would result in eventual death of the animal. Other research addressed in this
textbook shows that rats develop activity anorexia when faced with a restricted food supply and free
access to running wheels. The animals run more and more, eat less at each meal, and die of self-star-
vation (see Chapter€7, “On the Applied Side: Experimental Analysis of Activity Anorexia”). Given
these findings, it would be informative to remove the PR requirements for wheel running once
energy imbalance is induced, delivering only four pellets (16 g) of food daily no longer contingent
on travel distance (open economy). If animals give up wheel running, the four pellets of food would
be sufficient to maintain energy stores. On the other hand, animals might continue to run and self-
starve under these conditions—demonstrating how food reinforcement contingencies may induce
life-threatening, non-homeostatic behavior.

SCHEDULE PERFORMANCE IN TRANSITION


We have described typical performances generated by different schedules of reinforcement. The
patterns of response on these schedules take a relatively long time to develop. Once behavior has
stabilized, showing little change from day to day, the organism’s behavior is said to have reached a
steady state. The break-and-run pattern that develops on FR schedules is a steady-state performance
and is only observed after an animal has considerable exposure to the contingencies. Similarly, the
steady-state performance generated on other intermittent schedules takes time to develop. When
an organism is initially placed on any schedule of reinforcement, typical behavior patterns are not
consistent or regular. This early performance on a schedule is called a transition state. Transition
states are the periods between initial steady-state performance and the next steady state (see Sidman,
1960 for steady-state and transition-state analysis).
Consider how you might get an animal to press a lever 100 times for each presentation of food
(FR 100). First, you shape the animal to press the bar on CRF (see Chapter€4). After some arbitrary
steady-state performance is established on CRF, you are faced with the problem of how to program
the steps from CRF to FR 100. Notice that in this transition there is a large shift or step in the ratio
Schedules of Reinforcement╇╇ 161

of reinforcement to bar pressing. This problem has been studied using a progressive-ratio schedule,
as we described earlier in this chapter. The ratio of responses following each run to reinforcement is
programmed to increase in steps. Stafford and Branch (1998) employed the PR schedule to inves-
tigate the behavioral effects of step size and criteria for stability. If you simply move from CRF to
the large FR value, the animal will probably show ratio strain in the sense that it pauses longer
and longer after reinforcement. One reason is that the time between successive reinforcements con-
tributes to the postreinforcement pause (PRP). The pause gets longer as the interreinforcement
interval (IRI, or time between reinforcement) increases. Because the PRP makes up part of the IRI
and is controlled by it, the animal eventually stops responding. Thus, there is a negative feedback
loop between increasing PRP length and the time between reinforcements in the shift from CRF to
the large FR schedule.
Transitions from one schedule to another play an important role in human development. Devel-
opmental psychologists have described periods of life in which major changes in behavior typically
occur. One of the most important life stages in Western society is the transition from childhood to
adolescence. Although this phase involves many biological and behavioral processes, one of the
most basic changes involves schedules of reinforcement.
When a youngster reaches puberty, parents, teachers, peers, and others require more behav-
ior and more skillful performance than they did during childhood. A€young child’s reinforcement
schedules are usually simple, regular, and immediate. In childhood, food is given when the child
says “Mom, I’m hungry” after playing a game of tag, or is scheduled at regular times throughout the
day. On the other hand, a teenager is told to fix her own food and clean up the mess. Notice that the
schedule requirement for getting food has significantly increased. The teenager may search through
the refrigerator, open packages and cans, sometimes cook, get out plates, eat the food, and clean up.
Of course, any part of this sequence may or may not occur depending on the disciplinary practices
of the parents. Although most adolescents adapt to this transition state, others may show signs of
ratio strain and extinction. Poor eating habits among teenagers may reflect the change from regular
to intermittent reinforcement.
Many other behavioral changes may occur during the transition from childhood to adolescence.
Ferster, Culbertson, and Boren (1975) noted the transition to intermittent reinforcement that occurs
in adolescence:

With adolescence, the picture may change quite drastically and sometimes even suddenly. Now
money becomes a reinforcer on a fixed-ratio schedule instead of continuous reinforcement as before.
The adolescent may have to take a job demanding a substantial amount of work for the money, which
heretofore he received as a free allowance. Furthermore, he now needs more money than when
he was younger to interact with people he deals with. A€car or a motorcycle takes the place of the
bicycle. Even the price of services such as movies and buses is higher. Money, particularly for boys,
frequently becomes a necessary condition for dealing with the opposite sex. The amount of work
required in school increases. Instead of simple arithmetic problems, the adolescent may now have to
write a long term paper, cover more subjects, or puzzle through a difficult algebra problem, which
will require much trial and error.
(Ferster et al., 1975, pp. 416–417)

There are other periods of life in which our culture demands large shifts in schedules of rein-
forcement. A€current problem involves a rapidly aging population and the difficulties generated by
forced or elected retirement. In terms of schedules, retirement is a large and rapid change in the
contingencies of reinforcement. Retired people face significant alterations in social, monetary, and
work-related consequences. For example, a professor who has enjoyed an academic career is no
longer reinforced for research and teaching by the university community. Social consequences for
these activities may have included approval by colleagues, academic advancement and income, the
162╇╇ Schedules of Reinforcement

interest of students, and intellectual discussions. Upon retirement, the rate of social reinforcement
is reduced or completely eliminated. It is, therefore, not surprising that retirement is an unhappy
time of life for many people. Although retirement is commonly viewed as a problem of old age,
a behavior analysis points to the abrupt change in rates and sources of reinforcement (Skinner€&
Vaughan, 1983).

ON THE APPLIED SIDE:


Schedules and Cigarettes
As we have seen, the use of drugs is operant behavior maintained in part by the reinforcing
effects of the drug. One implication of this analysis is that reinforcement of an incompatible
response (i.e., abstinence) can reduce the probability of taking drugs. The effectiveness of an
abstinence contingency depends on the magnitude and schedule of reinforcement for nondrug
use (e.g., Higgins, Bickel,€& Hughes, 1994).
In applied behavior analysis, contingency management involves the systematic use
of reinforcement to establish desired behavior and the withholding of reinforcement or
punishment of undesired behavior (Higgins€& Petry, 1999). An example of contingency
management is seen in a study using reinforcement schedules to reduce cigarette smoking.
Roll, Higgins, and Badger (1996) assessed the effectiveness of three different schedules of
reinforcement for promoting and sustaining drug abstinence. These researchers conducted
an experimental analysis of cigarette smoking because cigarettes can function as reinforce-
ment, smoking can be reduced by reinforcement of alternative responses, and it is relatively
more convenient to study cigarette smoking than illicit drugs. Furthermore, cigarette smok-
ers usually relapse within several days following abstinence. This suggests that reinforce-
ment factors regulating abstinence exert their effects shortly after the person stops smoking
and it is possible to study these factors in a short-duration experiment.
Sixty adults, who smoked between 10 and 50 cigarettes a day, took part in the experi-
ment. The smokers were not currently trying to give up cigarettes. Participants were randomly
assigned to one of three groups: progressive reinforcement, fixed rate of reinforcement, and a
control group. They were told to begin abstaining from cigarettes on Friday evening so that
they could pass a carbon monoxide (CO) test for abstinence on Monday morning. Each per-
son in the study went for at least 2 days without smoking before reinforcement for abstinence
began. On Monday through Friday, participants agreed to take three daily CO tests. These
tests could detect prior smoking.
Twenty participants were randomly assigned to the progressive reinforcement group. The
progressive schedule involved increasing the magnitude of reinforcement for remaining drug
free. Participants earned $3.00 for passing the first carbon monoxide test for abstinence.
Each subsequent consecutive CO sample that indicated abstinence increased the amount of
money participants received by $0.50. The third consecutive CO test passed earned a bonus
of $10.00. To further clarify, passing the first CO test yielded $3.00, passing the second test
yielded $3.50, passing the third test yielded $14.00 ($4.00 and bonus of $10.00), and passing
the fourth test yielded $4.50. In addition, a substantial response cost was added for failing a
CO test. If the person failed the test, the payment for that test was withheld and the value of
payment for the next test was reset to $3.00. Three consecutive CO tests indicating abstinence
following a reset returned the payment schedule to the value at which the reset occurred (Roll
et al., 1996, p. 497), supporting efforts to achieve abstinence.
Schedules of Reinforcement╇╇ 163

Participants in the fixed reinforcement group (N€=€20) were paid $9.80 for passing each CO
test. There were no bonus points for consecutive abstinences and no resets. The total amount of
money available for the progressive and fixed groups was the same. Smokers in both the pro-
gressive and fixed groups were informed in advance of the schedule of payment and the criterion
for reinforcement. The schedule of payment for the control group was the same as the average
payment obtained by the first 10 participants assigned to the progressive condition. For these
people, the payment was given no matter what their carbon monoxide levels were. The control
group was, however, asked to try to cut their cigarette consumption, reduce CO levels, and
maintain abstinence.

FIG. 5.21╇ Figure shows the percentage of participants in each group who obtained three
consecutive drug-free tests, but then resumed smoking (A). Also shown is the percentage of
smokers in each group who were abstinent on all trials during the entire experiment (B).
Source: From J.╛M. Roll, S.╛T. Higgins,€& G.╛J. Badger (1996). An experimental comparison of three
different schedules of reinforcement of drug abstinence using cigarette smoking as an exemplar.
Journal of Applied Behavior Analysis, 29, pp. 495–505. Published with permission from John
Wiley€& Sons, Ltd.

Smokers in the progressive and fixed reinforcement groups passed more than 80% of the
abstinence tests, while the control group only passed about 40% of the tests. The effects of
the schedule of reinforcement are shown in Figure€5.21A. The figure indicates the percentage
of participants who passed three consecutive tests for abstinence and then resumed smoking
over the 5 days of the experiment. Only 22% of those on the progressive schedule resumed
smoking, compared with 60% and 82% in the fixed and control groups, respectively. Thus, the
progressive schedule of reinforcement was superior in terms of preventing the resumption of
smoking (after a period of abstinence).
Figure€5.21B shows the percentage of smokers who gave up cigarettes throughout the
experiment. Again, a strong effect of schedule of reinforcement is apparent. Around 50% of
those on the progressive reinforcement schedule remained abstinent for the 5 days of the
experiment, compared with 30% and 5% of the fixed and control participants, respectively.
In a subsequent experiment, Roll and Higgins (2000) found that a progressive reinforce-
ment schedule with a response–cost contingency increased abstinence from cigarette use
compared with a progressive schedule without the response cost or a fixed incentive-value
schedule. Overall, these results indicate that a progressive reinforcement schedule, combined
164╇╇ Schedules of Reinforcement

with an escalating response cost, is an effective short-term intervention for abstinence from
with an escalating response cost, is an effective short-term intervention for abstinence from
smoking. Further research is necessary to see whether a progressive schedule maintains
abstinence after the schedule is withdrawn. Long-term follow-up studies of progressive and
other schedules are necessary to assess the lasting effects of reinforcement schedules on absti-
nence. What is clear, at this point, is that schedules of reinforcement may be an important
component of stop-smoking programs (see more on contingency management in Chapter€13).

ADVANCED SECTION: Schedule Performance


Each of the basic schedules of reinforcement (FR, FI, VR, and VI) generates a unique pat-
tern of responding. Ratio schedules produce a higher rate of response than interval sched-
ules. A€reliable pause after reinforcement (PRP) occurs on fixed-ratio and fixed-interval
schedules, but not on variable-ratio or variable-interval schedules.

Rate of Response on Schedules


The problem of the determinants of rapid responding on ratio schedules, and moderate
rates on interval schedules, has not been resolved. The two major views concern molecular
versus molar determinants of schedule control. Molecular accounts of schedule perfor-
mance focus on small moment-to-moment relationships between behavior and its con-
sequences. The molecular analysis is based on the fact that some behavior precedes the
response (peck or lever press), which is reinforced. This is the behavior that occurs between
successive responses, and it is measured as the interresponse time (IRT). On the other hand,
molar accounts of schedule performance are concerned with large-scale factors that occur
over the length of an entire session, such as the overall rate of reinforcement and the rela-
tion between response rate and reinforcement rate (called the feedback function).

Molecular Account of Rate Differences


The time between any two responses, or what is called the interresponse time (IRT), may
be treated as an operant. Technically, IRTs are units of time and cannot be reinforced. The
behavior between any two responses is mea-
sured indirectly as IRT, and it is this behavior
that produces reinforcement. Consider Fig-
ure€5.22, in which 30-s segments of perfor-
mance on VR and VI schedules are present-
ed. Responses are portrayed by the vertical
marks, and the occurrence of reinforcement
is denoted by the familiar symbol Sr+. As you
can see, IRTs are much longer on VI than on
VR. On the VR segment, 23 responses occur
FIG. 5.22╇ Idealized distributions of response
on VR and VI schedules of reinforcement. in 30 s, which gives an average time between
Responses are represented by the vertical marks, responses of 1.3 s. The VI schedule generates
and Sr+ stands for reinforcement. longer IRTs with a mean of 2.3 s.
Schedules of Reinforcement╇╇ 165

Generally, ratio schedules produce shorter IRTs, and consequently higher rates of response,
than interval schedules. Skinner (1938) suggested that this came about because ratio and
interval schedules reinforce short and long interresponse times, respectively. To understand
this, consider the definition of an operant class. It is a class of behavior that may increase or
decrease in frequency on the basis of contingencies of reinforcement. In other words, if it
could be shown that the time between responses changes as a function of selective reinforce-
ment, then the IRT is by definition an operant in its own right. To demonstrate that the IRT
is an operant, it is necessary to identify an IRT of specific length (e.g., 2 s between any two
responses) and then reinforce that interresponse time, showing that it increases in frequency.
Computers and other electronic equipment have been used to measure the IRTs gen-
erated on various schedules of reinforcement. A€response is made and the computer starts
timing until the next response is emitted. Typically, these interresponse times are slotted into
time bins. For example, all IRTs between 0 and 2 s are counted, followed by those that fall
in the 2- to 4-s range, and then the number of 4- to 6-s IRTs. This method results in a distri-
bution of interresponse times. Several experiments have shown that the distribution of IRTs
may in fact be changed by selectively reinforcing interresponse times of a particular duration
(for a review, see Morse, 1966). Figure€5.23 shows the results of a hypothetical experiment
in which IRTs of different duration are reinforced on a VI schedule. On the standard VI, most
of the IRTs are 2–4 s long. When an additional contingency is added to the VI schedule that
requires IRTs of 10–12 s, the IRTs increase in this category. Also, a new distribution of IRTs is
generated. Whereas on a VR the next response may be reinforced regardless of the IRT, on
VI the combination pause plus response is required for reinforcement.
Anger (1956) conducted a complex experiment demonstrating that IRTs are a property
of behavior, which can be conditioned. In this experiment, the IRT also was considered as
a stimulus that set the occasion for the next response (SD). Reynolds (1966a) subsequently
showed that the IRT controlled performance that followed it. In other words, IRTs seem to
function as discriminative stimuli for behavior. The difficulty with this conception is that
stimulus properties are inferred from the performance. Zeiler has pointed out:

If the IRT is treated as a differentiated response unit [an operant], unobservable stimuli
need not be postulated as controlling observable performance. Given the one-to-one cor-
respondence between response and inferred stimulus properties, however, the two treat-
ments appear to be equivalent.
(Zeiler, 1977, p. 223)

FIG. 5.23╇ Hypothetical distributions are shown of interresponse times (IRTs) for an animal
responding on a standard VI schedule of reinforcement and on a VI that only reinforces IRTs that fall
between 10 and 12 s.
166╇╇ Schedules of Reinforcement

We treat the IRT as an operant rather than as a discriminative stimulus. As an operant,


the IRT is considered to be a property of the response that ends the time interval between
any two responses. This response property may be increased by reinforcement. For exam-
ple, a rat may press a lever R1, R2, R3, R4, and R5 times. The time between lever presses R1
and R2 is the IRT associated with R2. In a similar fashion, the IRT for R5 is the elapsed time
between R4 and R5. This series can be said to constitute a homogeneous chain, which is
divisible into discrete three-term contingency links.
As part of Anger’s experiment, animals were placed on a VI 300-s schedule of rein-
forcement (Anger, 1956). On this schedule, the response that resulted in reinforcement
had to occur 40 s or more after the previous response. If the animal made many fast
responses with IRTs of less than 40 s, the schedule requirements would not be met. In other
words, IRTs of more than 40 s were the operant that was reinforced. Anger found that this
procedure shifted the distribution of IRTs toward 40 s. Thus, the IRT that is reinforced is
more likely to be emitted than other IRTs.
Ratio schedules generate rapid sequences of responses with short IRTs (Gott€& Weiss,
1972; Weiss€& Gott, 1972). On a ratio schedule, consider what the probability of reinforce-
ment is following a burst of very fast responses (short IRTs) or a series of responses with
long IRTs. Recall that ratio schedules are based on the number of responses that are emit-
ted. Bursts of responses with short IRTs rapidly count down the ratio requirement and are
more likely to be reinforced than sets of long IRT responses (slow responding). Thus, ratio
schedules, because of the way they are constructed, differentially reinforce short IRTs.
According to the molecular IRT view of schedule control, this is why the rate of response
is high on ratio schedules.
When compared with ratio schedules, interval contingencies generate longer IRTs and
consequently a lower rate of response. Interval schedules pay off after some amount of time
has passed and a response is made. As the IRTs become longer the probability of reinforce-
ment increases, as more and more of the time requirement on the schedule elapses. In other
words, longer IRTs are differentially reinforced on interval schedules (Morse, 1966). In keep-
ing with the molecular view, interval contingencies differentially reinforce long IRTs, and the
rate of response is moderate on these schedules.

Molar Accounts of Rate Differences


There are several problems with the IRT account of rate differences on ratio and inter-
val schedules. One problem is that experiments on selective reinforcement of IRTs do not
prove that IRTs are controlled in this way on interval or ratio schedules. Also, there is
evidence that when long IRTs are reinforced, organisms continue to emit short bursts of
rapid responses. Animals typically produce these bursts even on schedules that never re-
inforce a fast series of responses (differential reinforcement of low rate, DRL). For these
reasons, molar hypotheses have been advanced about the rate of response differences on
reinforcement schedules.
Molar explanations of rate differences are concerned with the overall relationship
between responses and reinforcement. In molar terms, the correlation between responses
and reinforcement or feedback function produces the difference in rate on interval and
ratio schedules. Generally, if a high rate of response is correlated with a high rate of
reinforcement in the long run, animals will respond rapidly. When an increased rate of
response does not affect the rate of reinforcement, organisms do not respond faster
(Baum, 1993).
Schedules of Reinforcement╇╇ 167

Consider a VR 100 schedule of reinforcement. On this schedule, a subject could respond


50 times per minute and in a 1-h session obtain 30 reinforcements. On the other hand, if
the rate of response now increases to 300 responses per minute (not outside the range of
pigeons or humans), the rate of reinforcement would increase to 180 an hour. According
to supporters of the molar view, this correlation between increasing rate of response and
increased rate of reinforcement is responsible for rapid responding on ratio schedules.
A different correlation between rate of response and rate of reinforcement is set up
on interval schedules. Recall that interval schedules program reinforcement after time has
passed and one response is made. Suppose you are responding on a VI 3-min schedule
for $5 as reinforcement. You have 1 h a day to work on the schedule. If you respond at a
reasonable rate, say 30 lever presses per minute, you will get most or all of the 20 payouts.
Now pretend that you increase your rate of response to 300 lever presses a minute. The
only consequence is a sore wrist, and the rate of reinforcement remains at 20 per hour. In
other words, after some moderate value, it does not pay to increase the rate of response
on interval schedules—hence low to moderate response rates are maintained on interval
schedules.
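The two correlations just described can be summarized as feedback functions, which give obtained reinforcers per hour as a function of response rate. The sketch below reproduces the arithmetic of the VR 100 and VI 3-min examples; the hyperbolic form used for the VI function is a common approximation, not the only possible one.

```python
# Feedback functions: obtained reinforcers per hour as a function of
# response rate. On a ratio schedule the relation is linear; on an
# interval schedule it flattens out near the programmed maximum.

def vr_reinforcers_per_hour(resp_per_min, ratio=100):
    # every `ratio` responses produce one reinforcer
    return (resp_per_min * 60) / ratio

def vi_reinforcers_per_hour(resp_per_min, mean_interval_min=3.0):
    # common approximation: r = 1 / (t + 1/B), where t is the mean
    # scheduled interval and B is the response rate
    if resp_per_min <= 0:
        return 0.0
    per_min = 1.0 / (mean_interval_min + 1.0 / resp_per_min)
    return per_min * 60

for rate in (5, 30, 50, 300):
    print(f"{rate:>3} resp/min: VR 100 -> {vr_reinforcers_per_hour(rate):6.1f}/h, "
          f"VI 3-min -> {vi_reinforcers_per_hour(rate):5.1f}/h")
```

With these functions, 50 responses per minute on VR 100 yields 30 reinforcers per hour and 300 responses per minute yields 180, whereas on the VI 3-min schedule any rate above a modest value stays near the ceiling of about 20 per hour.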

Molar and Molecular Control of Response Rates


A substantial number of studies have attempted to experimentally separate the molecular
and molar determinants of response rates on schedules of reinforcement (Reed, 2007). One
way to analyze the control exerted by the molar and molecular determinants of response
rate is to use a blended schedule of reinforcement, having both VR and VI properties.
McDowell and Wixted (1986) designed a schedule with the molar properties of a
VR schedule, higher rates of response correlating with higher rates of reinforcement
(linear feedback), but with the molecular properties of a VI schedule—differential rein-
forcement of longer IRTs. The schedule is called the variable interval plus linear feed-
back (VI+). In this study, humans pressed a lever for monetary reinforcement on VR and
VI+ schedules. The results indicated that both schedules produced high response rates,
a finding consistent with molar control by the feedback function (the correlation between
rates of response and rates of reinforcement) but inconsistent with control by the molecular
contingency of differential reinforcement of long IRTs, which should have produced low response rates.
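As a rough way of thinking about how such a blended schedule might be arranged, the sketch below adjusts a VI-style interval from the subject's recent response rate so that obtained reinforcement rises roughly linearly with responding. This is a hypothetical illustration only, not the procedure published by McDowell and Wixted; the parameter names and the one-reinforcer-per-100-responses target are assumptions.

```python
# Conceptual sketch (not the published VI+ algorithm): reinforcement is
# still set up by an interval timer and collected by the next response
# (VI-like molecular contingency), but the mean interval is recomputed
# from the subject's recent response rate so that obtained reinforcement
# rate rises roughly linearly with responding (VR-like molar feedback).

def vi_plus_mean_interval(recent_resp_per_min, reinf_per_response=0.01):
    # target a linear feedback of about 1 reinforcer per 100 responses
    target_reinf_per_min = reinf_per_response * recent_resp_per_min
    if target_reinf_per_min == 0:
        return float("inf")
    return 1.0 / target_reinf_per_min  # minutes between setups, on average

for rate in (10, 50, 100, 300):
    print(f"{rate:>3} resp/min -> mean setup interval "
          f"{vi_plus_mean_interval(rate):.2f} min")
```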
Studies with rats generally have not found equivalent high rates of response on VR
and VI+ schedules. Evidence from rodents indicates that VI+ generates response rates sim-
ilar to conventional VI schedules when matched for the rate of reinforcement (Reed, Soh,
Hildebrandt, DeJongh, & Shek, 2000). Thus, the IRTs on the VI+ schedules were more simi-
lar to VI and considerably longer than on the VR schedule. Overall, rats show sensitivity to
the molecular contingency of IRT reinforcement and minimal control by the molar feed-
back function arranged by the VI+ schedule.
As you can see, studies using a VI+ schedule to separate molar and molecular control
of response rates reach different conclusions for rats and humans. Rats are sensitive to
the reinforcement of IRTs arranged by the VI+ schedule, whereas humans show sensitiv-
ity to the feedback function (correlation between response rate and reinforcement rate)
of the schedule. One difference between rats and humans (other than species) is that
humans may have responded at higher rates than rats on the VI+. The higher response
rates would allow differential conditioning of high rates. Also, variability in high rates
would allow sampling of the correlation between response rate and reinforcement rate,
resulting in sensitivity to the molar feedback function. One implication of this is that
rats would be sensitive to the molar feedback function of the VI+ schedule if the animals
responded at high rates.
In a series of experiments that compared VR, VI, and VI+ schedules, Reed (2007)
demonstrated that at low rates of response rats were controlled by the reinforcement of
IRTs, showing higher response rates on VR than on VI or VI+ schedules. In contrast, when
procedures were arranged to maintain high response rates, rats showed sensitivity to the
molar feedback function, responding as fast on VI+ as on VR, and faster on both than on
a yoked VI schedule (obtained reinforcement the same as VI+). Variability in the response
rate also resulted in more sensitivity to the molar characteristics of the schedule. Overall,
sensitivity to molecular and molar determinants of schedules of reinforcement requires
contact with the contingencies. Low rates of responding contact the molecular contingen-
cies related to IRTs. High response rates contact the molar contingencies, which involve
the correlation between rate of response and rate of reinforcement (see also Tanno,
Silberberg, & Sakagami, 2010, for molar control of preference and molecular control of local
response rate in a choice situation).
In terms of human behavior, Baum (2010) has argued that IRT regularities at molecular
levels may be of little use. The contingencies that regulate children responding for teacher
attention, or employees working for pay, seldom involve moment-to-moment contiguity
between responses and reinforcement. Thus, employees would be more likely to contact
the molar correlation between rates of productivity and wages, varying their work rates
over time to match the rate of payoff. It is the molar contingencies that control human
behavior from Baum’s point of view.
Although human behavior may not show obvious control by molecular contingen-
cies, there are industries such as Lincoln Electric that use incentive systems directly tied to
IRTs (assembly piece rate) as well as molar profit sharing (Hodgetts, 1997). Notably, Lin-
coln Electric has been highly successful even in hard economic times, and has never used
employee layoffs to cut costs. Planned incentive systems that arrange both molecular and
molar contingencies may yield high performance and satisfaction for both workers and
management (Daniels & Daniels, 2004).

Postreinforcement Pause on Fixed Schedules


Fixed-ratio and fixed-interval schedules generate a pause that follows reinforcement. Ac-
counts of pausing on fixed schedules also may be classified as molecular and molar. Mo-
lecular accounts of pausing are concerned with the moment-to-moment relationships that
immediately precede reinforcement. Such accounts address the relationship between the
number of bar presses that produce reinforcement and the subsequent postreinforcement
pause (PRP). In contrast, molar accounts of pausing focus on the overall rate of reinforce-
ment and the average pause length for a session.
Research shows that the PRP is a function of the interreinforcement interval (IRI). As
the IRI becomes longer, the PRP increases. On FI schedules, in which the experimenter
controls the time between reinforcement, the PRP is approximately half of the IRI. For
example, on an FI 300-s schedule (in which the time between reinforcements is 300 s), the
average PRP will be 150 s. On FR schedules, the evidence suggests similar control by the
IRI—the PRP becomes longer as the ratio requirement increases (Powell, 1968).
There is, however, a difficulty with analyzing the PRP on FR schedules. On ratio sched-
ules, the IRI is partly determined by what the animal does. Thus, the animal’s rate of
pressing the lever affects the time between reinforcements. Another problem with ratio
schedules for an analysis of pausing is that the rate of response goes up as the size of the
ratio is increased (Boren, 1961). Unless the rate of response exactly coincides with changes
in the size of the ratio, adjustments in ratio size alter the IRI. For example, on FR 10 a rate
of 5 responses per minute produces an IRI of 2 min. This same rate of response produces an
IRI of 4 min on an FR 20 schedule. Thus, the ratio size, the IRI, or both may cause changes
in PRP.
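The relationships in this paragraph, together with the half-interval rule for the PRP described above, can be made explicit with simple arithmetic. This is only a back-of-the-envelope sketch that assumes a constant response rate, which real FR performance only approximates:

```python
# Back-of-the-envelope relations described in the text:
#   IRI on a ratio schedule   = ratio requirement / response rate
#   PRP (on FI schedules)    ~= half of the interreinforcement interval
# A constant response rate is assumed; actual rates rise with ratio size
# (Boren, 1961), which is exactly why ratio size and IRI are confounded.

def iri_minutes(ratio, responses_per_min):
    return ratio / responses_per_min

print(iri_minutes(10, 5))   # FR 10 at 5 responses/min -> 2.0 min IRI
print(iri_minutes(20, 5))   # FR 20 at 5 responses/min -> 4.0 min IRI

fi_interval_s = 300
print(fi_interval_s / 2)    # expected PRP on FI 300 s -> about 150 s
```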

Molar Interpretation of Pausing


We have noted that the average PRP is half of the IRI. Another finding is that the PRPs are
normally distributed (bell-shaped curve) over the time between reinforcements. In other
words, on an FI 320-s schedule, pauses will range from 0 to 320 s, with an average pause
of around 160 s. As shown in Figure 5.24, these results can be accounted for by consider-
ing what would happen if the normal curve moved upward so that the mean pause was
225 s. In this case, many of the pauses would exceed the FI interval and the animal would
get fewer reinforcements for the session. An animal that was sensitive to overall rate of
reinforcement or the long-range payoffs (maximization; see Chapter 9) should come to
emit pauses that are on average half the FI interval, assuming a normal distribution. Thus,
maximization of reinforcement provides a molar account of the PRP (Baum, 2002).
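A small Monte Carlo sketch can illustrate the argument. Pauses are drawn from a normal distribution as in Figure 5.24; the standard deviation (80 s) and the simplification that a pause longer than the interval merely delays the next reinforcer by the excess time are assumptions for illustration only.

```python
import random

# Molar account of pausing on an FI 320-s schedule (see Figure 5.24).
# Pauses longer than the interval add needless delay before the next
# reinforcer, so a distribution centered at 225 s wastes more time per
# reinforcer than one centered at 160 s.
FI, SD, N = 320, 80, 100_000

def pause_cost(mean_pause):
    random.seed(1)
    pauses = [max(0.0, random.gauss(mean_pause, SD)) for _ in range(N)]
    excess = [p - FI for p in pauses if p > FI]
    fraction_over = len(excess) / N
    mean_delay = sum(excess) / N  # average delay added per reinforcer
    return fraction_over, mean_delay

for mean_pause in (160, 225):
    frac, delay = pause_cost(mean_pause)
    print(f"mean pause {mean_pause} s: {frac:.1%} of pauses exceed the "
          f"interval, adding {delay:.1f} s of delay per reinforcer")
```

Under these assumptions only a few percent of pauses exceed the interval when the mean pause is 160 s, whereas with a mean of 225 s the proportion grows severalfold and each reinforcer is delayed more on average, so the session yields fewer reinforcers overall.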

Molecular Interpretations of Pausing


There are two molecular accounts of pausing on fixed schedules that have some degree of
research support. One account is based on the observation that animals often emit other
behavior during the PRP (Staddon & Simmelhag, 1971). For example, rats may engage
in grooming, sniffing, scratching, and stretching after the presentation of a food pellet.
Because this other behavior reliably follows reinforcement, we may say it is induced by
the schedule. Schedule-induced behaviors (see Chapter 6) may be viewed as operants that
automatically produce reinforcement. For example, stretching may relieve muscle tension,
and scratching may eliminate an itch.

FIG. 5.24  The figure shows two possible distributions of PRPs on a fixed-interval 320-s schedule. The distribution given by the open circles has a mean of 160 s and does not exceed the interreinforcement interval (IRI) set on the FI schedule. The bell curve for the distribution with the dark circles has an average value at 225 s, and many pauses exceed the IRI.

One interpretation is that pausing occurs because the animal is maximizing local rates of reinforcement. Basically, the rat gets food for bar
pressing as well as the automatic reinforcement from the induced activities (see Shull,
1979). The average pause should therefore reflect the allocation of time to induced behav-
ior and to the operant that produces scheduled reinforcement (food). At present, experi-
ments have not ruled out or clearly demonstrated the induced-behavior interpretation of
pausing (e.g., Derenne & Baron, 2002).
A second molecular account of pausing is based on the run of responses or amount
of work that precedes reinforcement (Shull, 1979, pp. 217–218). This “work-time” inter-
pretation holds that the previously experienced run of responses regulates the length of
the PRP. Work time affects the PRP by altering the value of the next scheduled reinforce-
ment. In other words, the more effort or time expended for the previous reinforcer, the
lower the value of the next reinforcer and the longer it takes for the animal to initiate
responding (pause length). Interestingly, Skinner made a similar interpretation in 1938
when he stated that pausing on FR schedules occurred because “the preceding run which
occurs under reinforcement at a fixed ratio places the [reflex] reserve in a state of strain
which acts with the temporal discrimination of reinforcement to produce a pause of some
length” (p. 298). Skinner’s use of the strained reserve seems to be equivalent to the more
current emphasis on work time. Overall, this view suggests that the harder one works for
reinforcement, the less valuable the next reinforcement is, and therefore the longer it
takes to start working again.
Neither the induced-behavior account nor the work-time account of pausing is sufficient
to handle all that is known about patterning on schedules of reinforcement. A schedule
of reinforcement is a procedure for combining a large number of different conditions
that regulate behavior. Some of the controlling factors arise from the animal’s behavior,
while the experimenter sets others via the programmed contingencies. This means that it
is exceedingly difficult to unravel the exact processes that produce characteristic schedule
performance. Nonetheless, the current interpretations of pausing point to some of the
more relevant factors that regulate behavior on fixed schedules of reinforcement.

The Dynamics of Schedule Performance


There are reasons for detailed research on the PRP and IRT. The hope is to analyze sched-
ule effects in terms of a few basic processes. This area of research is called behavioral
dynamics, or the study of behavior allocation through time. Behavioral dynamics involve
feedback processes that move the system (organism) from an unstable, transitional state
toward steady-state equilibrium. If performance on schedules can be reduced to a small
number of fundamental principles, either laws of dynamics or equilibrium, then reason-
able interpretations may be made about any particular arrangement of the environment
(schedule). Also, it should be possible to predict behavior more precisely from knowledge
of the operating contingencies and the axioms that govern dynamic behavior systems.
Evidence has accumulated that the basic principles may be molar, involving laws of equi-
librium and the matching of time allocation among activities. Even at the smallest time
scales of the key peck or the switch from one activity to another, a molar law describes
the behavioral dynamics of the system (Baum, 2010; see Chapter 9’s Advanced Section and
“Preference Shifts: Rapid Changes in Relative Reinforcement”).
Behavioral dynamics is currently at the leading edge of behavior analysis and, like
most scientific research, it requires a high level of mathematical sophistication (see Grace &
Hucks, 2013, pp. 325–326, on dynamics of choice). Both linear and nonlinear calculus are
used to model the behavioral impact of schedules of reinforcement. In the 1990s, an entire
issue of the Journal of the Experimental Analysis of Behavior (1992, vol. 57) was devoted
to this important subject, and included topics such as a chaos theory of performance on FI
schedules, dynamics of behavioral structure, behavioral momentum, resistance to behav-
ior change, and feedback functions for VI schedules. In this issue, Peter Killeen, a professor
at Arizona State University, builds on his previous work and suggests that “behavior may
be treated as basic physics” with responses viewed as movement through behavioral space
(Killeen, 1992, p. 429). Although these issues are beyond the scope of this book, the stu-
dent of behavior analysis should be aware that the analysis of schedule performance is an
advanced area of the science of behavior.

CHAPTER SUMMARY
A schedule of reinforcement describes the arrangement of discriminative stimuli, operants, and con-
sequences. Such contingencies were outlined by Ferster and Skinner (1957) and are central to the
understanding of behavior regulation in humans and other animals. The research on schedules and
performance patterns is a major component of the science of behavior, a science that progressively
builds on previous experiments and theoretical analysis. Schedules of reinforcement generate con-
sistent, steady-state performances involving runs of responses and pausing that are characteristic
of the specific schedule (ratio or interval). In the laboratory, the arrangement of progressive-ratio
schedules can serve as an animal model of foraging in the wild, and intermittent reinforcement plays
a role in most human behavior, especially social interaction.
To improve the description of schedules as contingencies of reinforcement, we have intro-
duced the Mechner system of notation. This notation is useful for programming contingencies in
the laboratory or analyzing complex environment–behavior relations. In this chapter, we described
continuous reinforcement (CRF) and resistance to extinction on this schedule. CRF also results in
response stereotypy based on the high rate of reinforcement. Fixed-ratio (FR) and fixed-interval (FI)
schedules were introduced, as well as the postreinforcement pause (PRP) generated by these contingencies.
Adult humans have not shown classic scalloping or break-and-run patterns on FI schedules, and
the performance differences of humans relate to language or verbal behavior as well as histories of
ratio reinforcement. Variable-ratio (VR) and variable-interval (VI) schedules produce less pausing
and higher overall rates of response. Adding a limited hold to a VI schedule increases the response
rate by reinforcing short interresponse times (IRTs). When rates of reinforcement are varied on VI
schedules, the higher the rate of reinforcement the greater the behavioral momentum.
Behavior during the transition between schedules of reinforcement has not been
well researched, due to the boundary problem of steady-state behavior. Transition states, however,
play an important role in human behavior—as in the shift in the reinforcement contingencies from
childhood to adolescence or the change in schedules from employment to retirement. Reinforce-
ment schedules also have applied importance, and research shows that cigarette smoking can
be regulated by a progressive schedule combined with an escalating response–cost contingency.
Finally, in the Advanced Section of this chapter, we addressed molecular and molar accounts of
response rate and rate differences on schedules of reinforcement. We emphasized the analysis of
IRTs for molecular accounts, and the correlation of overall rates of response and reinforcement
for molar explanations.

KEY WORDS

Assumption of generality
Behavioral dynamics
Break and run
Breakpoint
Continuous reinforcement (CRF)
Fixed interval (FI)
Fixed ratio (FR)
Interreinforcement interval (IRI)
Interresponse time (IRT)
Interval schedules
Limited hold
Mechner notation
Molar account of schedule performance
Molecular account of schedule performance
Postreinforcement pause (PRP)
Preratio pause
Progressive-ratio (PR) schedule
Ratio schedules
Ratio strain
Reinforcement efficacy
Resurgence
Run of responses
Scalloping
Schedule of reinforcement
Steady-state performance
Transition state
Variable interval (VI)
Variable ratio (VR)

ON THE WEB
www.thefuntheory.com/ Control of human behavior by programming for fun (called Fun Theory)
is shown in these short videos; schedules of reinforcement (fun) are arranged for seatbelt use,
physical activity, and cleaning up litter. See if you can think up new ways to use reinforcement
schedules in programming fun to regulate important forms of human behavior in our culture.
www.youtube.com/watch?v=I_ctJqjlrHA This YouTube video discusses basic schedules of rein-
forcement, and B. F. Skinner comments on variable-ratio schedules, gambling, and the belief
in free will.
www.pigeon.psy.tufts.edu/eam/eam2.html This module is available for purchase and demon-
strates basic schedules of reinforcement as employed in a variety of operant and discrimination
procedures involving animals and humans.
http://opensiuc.lib.siu.edu/cgi/viewcontent.cgi?article=1255&context=tpr&sei-redir=1-
search=“conjugate schedule reinforcement” A review of the impact of Ferster and Skinner’s
publication of Schedules of Reinforcement (Ferster & Skinner, 1957), from the study of basic
schedules to the operant analysis of choice, behavioral pharmacology, and microeconomics
of gambling. Contingency detection and causal reasoning by infants, children, and adults are
addressed as areas influenced by schedules of reinforcement.
www.wadsworth.com/psychology_d/templates/student_resources/0534633609_sniffy2/sniffy/
download.htm If you want to try out shaping and basic schedules with Sniffy the virtual rat, go
to this site and use a free download for 2 weeks of fun. After this period, you will have to pay to
continue your investigation of operant conditioning and schedules of reinforcement.

BRIEF QUIZ
1. Schedules of reinforcement were first described by:
(a) Charles Ferster
(b) Francis Mechner
(c) B. F. Skinner
(d) Fergus Lowe
2. Infrequent reinforcement generates responding that is persistent. What is this called?
(a) postreinforcement pause
(b) partial reinforcement effect
(c) molar maximizing
(d) intermittent resistance
3. Mechner notation describes:
(a) stimulus effects
(b) dependent variables
(c) response contingencies
(d) independent variables
4. Resurgence happens when:
(a) behavior is put on extinction
(b) reinforcement magnitude is doubled
(c) high-probability behavior persists
(d) response variability declines
5. Schedules that generate predictable stair-step patterns are:
(a) fixed interval
(b) fixed ratio
(c) variable ratio
(d) random ratio
6. Variable-ratio schedules generate:
(a) postreinforcement pauses
(b) locked rates
(c) break-and-run performance
(d) high rates of response
7. Schedules that combine time and responses are called:
(a) partial reinforcement schedules
(b) complex schedules
(c) interval schedules
(d) fixed-time schedules
8. The shape of the response pattern generated by an FI is called a:
(a) scallop
(b) ogive
(c) break and pause
(d) accelerating dynamic
9. Human performance on FI differs from animal data due to:
(a) intelligence differences
(b) self-instruction
(c) contingency effects
(d) alternative strategies
10. Behavior is said to be in transition when it is between:
(a) the PRP and IRI
(b) stable states
(c) one schedule and another
(d) a response run
Answers to Brief Quiz: 1, c (p. 135); 2, b (p. 153); 3, d (p. 141); 4, a (p. 145); 5, b (p. 146);
6, d (p. 148); 7, c (p. 145); 8, a (p. 149); 9, b (p. 151); 10, b (p. 160).

You might also like