mid semester exam

o 5 questions, 100 words per question (40 minutes)
o Associative learning (Professor Westbrook’s content)
o You will be asked to interpret hypothetical experiments and identify the
role of stimuli in these experiments
o You will be asked to describe real world and/or lab examples of
associative learning

Critical analysis
o Associative learning models
o Presented 2 hypothetical experiments – answer questions identifying and
describing the learning phenomena and behaviours present
o Template to answer questions

week 1 asynchronous tutorial prep: associative learning

associative learning (2 types)

 Pavlovian conditioning
- Learn about the relationship between events in the world
- Does not require the subject to perform a response to learn association

 Instrumental conditioning
- Learn about the relationship between our behaviour and events in the world
- Requires the subject to perform a response to learn association

Pavlovian conditioning
 US – unconditioned stimulus: a biologically relevant stimulus, such as food or danger, that elicits the UR
 UR – unconditioned response: the response that occurs naturally to the US, e.g. the fight or flight response to danger
 CS – conditioned stimulus: a previously neutral stimulus that, following conditioning, predicts the US
 CR – conditioned response: the response elicited by the CS following conditioning; commonly the same
as the UR

 Involves an appetitive outcome e.g. food or an aversive outcome e.g. shock

Pavlov’s observations

 Extinction – gradual decline in the CR when the CS is presented repeatedly in the
absence of the US
 Spontaneous recovery – recovery of a previously extinguished conditioned response
following the passage of time


 Pavlovian conditioning with an appetitive outcome
 US = food for the dog, UR = salivating, CS = footsteps of the dog’s owner, CR = salivating on
hearing the footsteps
 Real life example: US = park, UR = excitement/happiness, CS = soccer ball in hand,
CR = being happy on seeing the soccer ball

 Pavlovian conditioning with aversive outcome

 US = electric shock, UR = fear e.g., increased heart rate, CS = light, CR = increased heart rate
when seeing the light
 Real life example: training your child not to touch spiders using a toy spider that
produces a loud tone when you touch it
 US = loud tone, UR = fear, CS = toy spider, CR = increased heart rate when the child sees any spider


Instrumental conditioning
 Learn the relationship between a voluntary response and positive or negative outcomes
 Increases the instrumental response through reinforcement
 To learn an instrumental association the subject must first perform a response

Reinforcement and punishment

 positive reinforcement
 giving a dog a treat for doing a trick
 rewarding your child when they do well in an exam
 scholarships for top students
 must finish your dinner for dessert
 mother giving their child lollies for cleaning up toys
 giving your child pocket money for doing chores
 all these examples are expected to increase desired behaviour
Everyday example: asking your child to finish their dinner before they can have dessert.
Adding dessert is a positive reward, which will increase the child’s behaviour of
finishing dinner.

 negative reinforcement
 seatbelt warning sound when the seatbelt is off: putting on the seatbelt removes the sound
 putting on sunscreen when going out in the sun to reduce the risk of skin cancer

 positive punishment
 slapping a child if they steal, to reduce stealing
 fine for driving through a red light, to decrease the behaviour

Positive punishment is an attempt to influence behaviour by adding something unpleasant.

For example, giving a student an academic warning for not submitting an assessment. The
warning is unpleasant and will deter the student from failing to submit their assignment,
therefore decreasing the behaviour of not submitting.

 negative punishment
 taking away Xbox if they swear
 removing play time for a primary student who is disruptive in class

The aim of negative punishment is to influence behaviour by removing a pleasant stimulus. For
example, removing play time for a primary student who is disruptive in class. Removing
the pleasant stimulus of play time will decrease the student’s disruptive behaviour in class.
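
The four categories above follow from two questions: is a stimulus added or removed, and does the behaviour increase or decrease? A minimal Python sketch of that two-by-two classification is given here. The function name and the argument labels (`added`, `removed`, `increases`, `decreases`) are illustrative assumptions, not terms from the lecture.

```python
def classify_consequence(stimulus_change, behaviour_change):
    """Name the instrumental contingency from two facts about a consequence.

    stimulus_change:  "added" or "removed" (hypothetical labels)
    behaviour_change: "increases" or "decreases" (hypothetical labels)
    """
    # "positive" means a stimulus is added, "negative" means one is removed.
    sign = {"added": "positive", "removed": "negative"}
    # Reinforcement increases the behaviour, punishment decreases it.
    kind = {"increases": "reinforcement", "decreases": "punishment"}
    if stimulus_change not in sign or behaviour_change not in kind:
        raise ValueError("unrecognised label")
    return f"{sign[stimulus_change]} {kind[behaviour_change]}"
```

For instance, the seatbelt alarm is a stimulus removed with the behaviour increasing, which the sketch labels negative reinforcement; taking away the Xbox is a stimulus removed with the behaviour decreasing, i.e. negative punishment.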

Animal ethics
 Animals are used for many purposes by humans
- Food
- Clothing
- Research/medicine
- Pets
- Working animals
 It is important to consider the arguments for using animals in research and the ethics
which guide the treatment of animals
Governing principles

 All of the experiments have been conducted with adherence to the ethical guidelines
and have been approved by the ethics committees

Introduction lecture

 Explaining a range of behaviours and their neural bases

 The basics of associative learning and behaviour
 Pavlovian conditioning
 Instrumental conditioning
 Pavlovian to instrumental transfer
 The neural basis of a range of psychological phenomena
 The neural structures involved in fear and appetitive learning

Animal learning
 Focus on animal models of learning, cognition and memory
 How animal models inform clinical practice
 How animal models are useful for neuroscience
 The neural basis of clinical disorders
 Schizophrenia
 Addiction
 Feeding
 Attachment
 Choice


Historical introduction – 19th century

Psychology as a discipline
 Ebbinghaus remarked that psychology has a long past and a short history

First laboratory was established in 1879

 Long past because its subject matter is of intrinsic interest. People are and always
have been psychologists
 Psychology has a short history because the discipline has only been in existence since 1879
 Wundt established the first laboratory at Leipzig University in Germany

Late 19th century (1879)

 Contrast between philosophy and theology on the one hand, and science on the other
 Philosophy still debated the merits of the questions asked and answers given by Plato
and Aristotle
 Theology was unable to command any universal assent to its agreement, and religions
differed widely in their beliefs and practices

Natural sciences: physics and chemistry

 Science was the endeavour which met with unqualified success
 The laws of motion of material bodies, from pendulum clocks to the solar system, had
been formulated and an account of matter given in the terms of elementary particles
 Instrumentation and measurement, combined with experimentation and the
formulation of mathematical models, appeared able to resolve any complex physical
system into its elementary component parts

Science and psychology

 Natural that many came to believe the application of the scientific method to human
affairs was the only hope for understanding people and changing society for the better

The study of the body: physiology

 By the late 19th century, much was known about the sensors in our bodies that detect heat,
pressure, stretch, acceleration, sound, light, smells, tastes and other forms of energy and that
transmit this energy in a form that can be understood by our brains

The study of the mind: psychology

 The study of sensory systems led to psychophysics:
 How physical measurements of energy (e.g. light, sound) correlate with experienced sensation
 Because the mental could be measured and subjected to mathematical treatment (e.g.
Weber’s law), psychology could be a quantitative science
 Energy (physics) – sensory coding (physiology) – experienced sensation (psychology)
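
As a reminder of what such a mathematical treatment looks like, Weber's law in its standard textbook form states that the just-noticeable change in a stimulus is a constant fraction of its current intensity. The symbols and the value of k in the worked line are illustrative assumptions, not taken from the lecture: ΔI is the smallest detectable change, I is the baseline intensity, and k is the Weber fraction for a given sense.

```latex
\[
\frac{\Delta I}{I} = k
\]
% Worked example with a hypothetical Weber fraction k = 0.1:
\[
I = 100\,\mathrm{g} \Rightarrow \Delta I = 10\,\mathrm{g},
\qquad
I = 200\,\mathrm{g} \Rightarrow \Delta I = 20\,\mathrm{g}
\]
```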

The study of life: biology

 A central question was the origin of species

 One view was that species were immutable, having been fixed by an act of creation.
Humans were unique, distinct from all other species in their possession of a conscious
mind and soul
 The alternative was that the present species (perhaps including humans) had
descended from earlier ones through some process of evolution
 Jean-Baptiste de Lamarck – first to propose a mechanism for evolutionary change

Lamarck’s theory
 Organic and inorganic matter are fundamentally different, primarily in that each living
species possesses an innate drive to perfect itself. This drive to perfection results in
changes in its body and in its mentality
 The acquired characteristics are hereditary, so that the physical and mental efforts
made across the course of an individual’s life are reflected directly in the physical and
mental organs of their offspring

Lamarckian inheritance
1. Environmental changes can produce new habits
2. New habits produce physical changes: law of use and disuse
3. Physical changes are heritable: inheritance of acquired characteristics

Charles Darwin
 Studied medicine at Edinburgh University but was repelled by surgery and bored by
the lectures
 Studied at Cambridge University with the aim of becoming a country priest
 This occupation would allow him to indulge his passion for natural history

Voyage of the Beagle

 His interest in natural history brought him into contact with Henslow, professor of
botany at Cambridge, who recommended him to Fitzroy, captain of HMS Beagle,
as the ship’s naturalist on a surveying voyage

Charles Darwin
 After his return in 1836, Darwin wrote several works on geology, edited all the work
done by the various specialists on the zoological material that he had collected, and
wrote a popular account of the voyage
 He also gave a good deal of thought to how species of animals came to be as they are

 Argued that in nature plants and animals produce far more offspring than can survive,
and that humans too are capable of overproducing if left unchecked

“a theory by which to work”

 Darwin realised that competition for scarce resources would favour those individuals
whose attributes gave them an advantage
 He knew that plants and animals could be changed over successive generations by
artificial selection
 He saw that the characteristics of animals in the wild could also be changed by natural
selection

Darwin’s theory

 There is variation among individuals belonging to any population
 Much of the variation is genetically inherited
 More individuals are born than can survive to maturity
 Those who possess variations that favour survival are more likely to have offspring
who will inherit these adaptations
 Certain attributes will thus be perpetuated or selected through the action of natural
selection, eventually giving rise to distinct species

The origin of species – Darwin

 Committed his ideas on evolution to paper but refrained from publishing
 In 1858 he received a letter from the naturalist Alfred Wallace, in which he found a
description of the theory of evolution by natural selection. Darwin arranged that a
paper be read at a scientific meeting in the names of Darwin and Wallace and
proceeded to publish

The origin of species had two aims

 Aim 1 was to show that evolution had occurred
 Easily accomplished – many accepted that the present species of animals had their
origins in simple forms of life that existed millions of years ago
 Aim 2 was to show that the mechanism was natural selection
 Less easily accomplished – evolution by natural selection was not accepted until the 1930s,
when it became the unifying force in biology
 One factor that predisposed against acceptance of the theory was implications for

Thomas Huxley
 Comparative neuroanatomist
 Darwin’s ‘bulldog’, who relished his role as public champion of evolutionary theory,
while Darwin remained in the country, often ill and avoiding controversy

Alfred Wallace
 Maintained that: natural selection explained the human body but not the human mind
 Divine intervention had created life, consciousness in the higher animals and uniquely
human mental faculties (e.g., language, numbers, music, morals, metaphysics)

Darwin’s reply: The Descent of man

 Sexual selection is a factor in human evolution and can explain certain
mental faculties
 Human intellect not unique, mental continuity with other species e.g. reasoning as in
apes, speech as in parrots

The expression of the emotions in man and animals

 Similarity in emotional expression across species and emphasis on their common
function (e.g., the behaviours and physiological changes elicited by danger serve to
protect the organism)
 Emotions in humans (partly due to social learning) are inherited from our animal
ancestors. They are instincts that originate in attachment (happiness), loss (sadness),
aggression (anger), danger (fear), rejection (disgust)

Summary of Darwin’s arguments

 No qualitative differences between humans and higher animals in mental faculties

 Behaviour of any species is partly instinctive and partly governed by individuals
 Elements of language present in other animals and these, combined with other mental
abilities, led to language in humans
 Moral sense resulted from possession of intellectual abilities in conjunction with
processes of attachment
 Basic emotions present in other animals



George Romanes
 In 1874 Darwin handed over his notes on animal behaviour to the neurophysiologist
Romanes, who expanded this corpus of anecdotal evidence, adding some
experimental results
 Animal Intelligence (1882) reported these observations on a huge range of species

Mental evolution in animals (1884)

 Linear progress of evolution from lower to higher with humans as the highest forms
of evolved life
 Lamarckian inheritance played a role in mental evolution
 Criteria for mind were:
1. The organism must have a nervous system
2. Its behaviour must be sensitive to past experience – must show evidence for
learning and memory

 Hierarchies for emotions and

Shift to the laboratory study of learning – 20th century
 Morgan’s canon
 Thorndike’s study of trial-and-error learning
 Watson’s behaviourist manifesto
 Pavlov’s study of the conditioned reflexes
 Skinner’s study of operant conditioned responses
 Herrnstein’s study of choice
 Rediscovery of the mind and rejection of behaviourism (1960–1970)

Lloyd Morgan
 He argued that Romanes had:
 Ignored previous opportunities for an animal to learn a behaviour
 Confused testable and non-testable inferences from behaviour to mental events
 Used unnecessarily complex terms to explain behaviour

Trial and error learning

 One-off observations of an animal’s behaviour can lead to wrong inference
 Repeated observations reveal how the behaviour develops e.g., Morgan recorded the
history of how his dog learned to open a gate
 Romanes’ database consisted mostly of one-off observations

Edward Thorndike
 Aim: systematic, quantitative experiments on trial-and-error learning in animals
 Thorndike’s puzzle boxes for cats and dogs
 Time from placement in the box to the animal making the effective response plotted
as a function of trial number to produce the first learning curves
 Cross species comparisons: no detectable difference
 Learning by imitation: no evidence
 Memory: no forgetting over time

Law of effect
 “Of several responses made to the same situation, those which are accompanied or
closely followed by satisfaction to the animal will, other things being equal, be more
firmly connected with the situation, so that, when it recurs, they will be more likely to
recur”
 Learning consists in the formation of connections between situations and responses –
animals (and people) learn what to do in a situation – now termed procedural
knowledge (habits)
 Satisfaction (rewards) establishes these connections while dissatisfaction (punishers)
breaks them

Metaphor for the mind

 The metaphor for the mind was a telephone exchange where an operator connected
and disconnected one line to another, thereby regulating communication from one line
to the other
 Satisfaction connected the situation with the response and dissatisfaction broke this
connection so that the situation would not subsequently provoke the response

John Watson
 Studied maze learning in rats and innate behaviour of birds
 In 1913, he published the behaviorist manifesto

psychology should study behaviour rather than mind

 Watson argued the traditional study of consciousness through introspection had failed
because of its reliance on subjective data
 Statements based upon such data cannot be subjected to the public scrutiny required
in science
 Watson also argued that introspection was irrelevant to everyday problems and that the
few areas where psychology had been effective (e.g., mental tests) were effective because it had
abandoned a concern with consciousness and the use of introspection

Watson went on to study infants

 He asked:
 Do neonates display instinctive behaviours or just simple reflexes?
 What stimuli elicit emotional responses in infants?
 How do stimuli come to elicit fear in “Little Albert”?
 His answer: Pavlovian conditioning of emotional responses

Two scientific traditions in psychology

 Natural history (Darwin, Huxley, Romanes)
 Observational
 Species differences
 Function
 Evolutionary theory
 Psychological (Thorndike, Watson, Pavlov)
 Experimental
 General mechanisms
 Applications
 Reflex and S-R theory

Ivan Pavlov
 Nobel prize winner in medicine for studies on the physiology of digestion
 Noticed “psychic-secretions” in his subjects (dogs) and switched to the study of these
 His lectures, published as Conditioned Reflexes (1927), made his work known to the
English-speaking world

Conditioned reflexes
 Pavlov implanted a fistula in the dog’s mouth via a small incision in its cheek and
connected the fistula to the tube and finally, to devices that collected the salivary
responses and recorded their occurrence
 He was interested in the nature of salivation elicited by food in the mouth and how
this contributed to digestion
 However, he noticed that dogs salivated not only when the food was in their mouths
but also when they saw the food or when they saw the attendant who regularly fed them

Why conditioned?
 Pavlov called the salivary responses elicited by the food unconditioned reflexes; they
were unconditioned because their occurrence was not conditional on any prior
experience. Any dog salivated the first time that some food was placed into its mouth
 Pavlov called the anticipatory responses conditioned reflexes because their
occurrence was conditional on a specific history. The dogs salivated when they saw the
attendant who regularly fed them but did not salivate when they saw someone else,
equally familiar, who had never fed them
 The “psychic reflexes” were due to the dogs having learned that the attendant
signalled or predicted the imminent arrival of food

Pavlov’s studies
 Substituted stimuli for the attendant to study the development of conditioned reflexes
 Every few minutes, a sound was presented for a few seconds and the food was
delivered to the dog. The dog quickly learned to expect the food when it heard the
sound and began to salivate
 Critically, food was presented to the dog whether or not it salivated. The dog did not
need to salivate in order to procure the food
 Rather, learning that the sound signalled the arrival of food caused the dog to salivate
when it heard the sound

Some conditioning phenomena

 Pavlov discovered several phenomena
 One was that a stimulus from any modality could be used as a signal for food,
eliciting conditioned salivation
 The learning system therefore accepted any detectable stimulus as a signal for food
 A second was extinction which refers to the gradual decline in salivation when the
signal was repeatedly presented in the absence of food
 The learning system adjusts to the relations between events in the world. It codes the
signalling relation and produces responses. It codes the change in the signalling
relation and ceases to produce responses

External inhibition and disinhibition

 External inhibition – this refers to the loss of salivation when the sound is combined
with a novel stimulus e.g., light. The dog is first trained with the sound-food pairings
and then is presented with the sound in compound with a light. The decrease in
salivation suggested that the dog had switched attention from the sound to the light
and therefore, failed to salivate
 Loss of CR

 Disinhibition – This refers to the finding that the salivation that has been
extinguished re-appears when the sound is combined with the light. The dog has
learned to inhibit salivation to the sound. Attending to the light removes the attention
required for inhibition of salivation to the sound.
 Removing extinction of CR

External inhibition – loss of the CR when the CS is combined with a novel stimulus. This suggests
that the animal pays attention to the novel stimulus and therefore fails to salivate to the CS. E.g., an
animal learns that a tone predicts food and salivates (CR) in the presence of the tone (CS). Then we
present the tone (CS) with a novel stimulus. The animal does not salivate to this compound.

Disinhibition – the reappearance of an extinguished CR when a novel stimulus is presented
with the extinguished CS. This suggests that the novel stimulus takes away the attention that is
needed to inhibit the CR to the CS. E.g., an animal learns that a tone (CS) predicts a foot shock.
The animal freezes (CR) in response to the tone. Then the conditioned response is
extinguished through extinction training, such that presentations of the tone no longer elicit
freezing. Later, the tone is presented in compound with a bright light, and the animal now
freezes (CR).

 Spontaneous recovery. Pavlov found that extinction was not necessarily permanent:
the salivation that ceased to occur because the conditioned stimulus had been
repeatedly presented in the absence of the food re-occurred when the conditioned
stimulus was presented days or weeks later.


 Arguments for use of animals in research

 Beneficial to animal species

 benefits of research for humans outweigh the impact on animals
 testing for medical treatments/drugs
 cost effective
 reduce variables e.g. constant control
 study diseases and their impacts, and how to resolve/treat them
 quality of life in lab can be good for animal

 List any points against animal testing

 animals are very different to humans. findings from animal research may not be
relevant to human beings.
 psychological and physical impacts for animals
 some types of research are unnecessary e.g. animal testing for new cosmetics -
although cosmetics enhance confidence, they are not essential to life/do not benefit
the individual
 animals cannot consent
 research might not be successful
 What standards need to be met for animal care

 reduce harm as much as possible by avoiding unnecessary measures/techniques

 comfortable living conditions e.g. providing them with enough food and water and
having enough space for them to move around freely
 treat animals if they are physically wounded during the research

 What ethical guidelines might you put in place for animal research

 minimising the types of research that can be done using animals e.g. remove cosmetic testing
 reduce the number of animals being used in each experiment (just enough to reach
significance in findings)
 prohibit the infliction of physical pain e.g. rats being hung by their tail to suspend
them from using it (used in the rat tail suspension paradigm to determine critical and
sensitive periods)

the 3 r’s
 replacement
 reduction
 refinement

BF skinner
 studied feeding in rats and arranged that presses of a bar provided access to food
 realised that he could study instrumental (or operant) conditioned responses
 this “skinner box” allowed automatic scheduling of events and recording of responses

 Pavlov’s dogs salivated when they heard the sound that signalled food. They did not
have to salivate to procure the food: the sound caused the salivation
 skinner’s rats had to press the bar in order to procure the food: if they failed to press,
then they failed to obtain food
 salivary responses, therefore, were involuntary, elicited mechanistically by the signal
for food; lever presses were voluntary, due to the animal learning that it could procure food
by lever pressing

 defined reinforcement in terms of its effects on operant behaviour: it increased the
subsequent likelihood of that behaviour. Thus, anything that acts in this way is a
reinforcer for the contingent behaviour
 he found that occasionally reinforcing a response was more effective in maintaining
that response than was reinforcing a response on each of its occurrences
 he also studied relations between stimuli, responses, and reinforcement. he called
these relations schedules of reinforcement

schedules of reinforcement

 one type of schedule is where reinforcement depends on some number of responses

 a fixed ratio n is where n responses are required to procure the reinforcer, while a
variable ratio n is where n responses on average are required to procure the reinforcer
 another type of schedule is where reinforcement depends on some response but only
if some period of time has elapsed
 a fixed interval schedule is where behaviour is reinforced after a fixed period of time,
while a variable interval schedule is where behaviour is reinforced at variable
intervals of time
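
A minimal Python sketch of the four schedules just described, assuming responses are given as a list of times in arbitrary units. The function names, the uniform random draws used for the variable schedules, and the `seed` parameter are assumptions made for illustration; the lecture does not specify how the variable requirements are generated.

```python
import random


def fixed_ratio(response_times, n):
    """Reinforce every n-th response."""
    return [t for i, t in enumerate(response_times, start=1) if i % n == 0]


def variable_ratio(response_times, n, seed=0):
    """Reinforce after a random number of responses that averages n."""
    rng = random.Random(seed)
    reinforced, count = [], 0
    needed = rng.randint(1, 2 * n - 1)  # uniform draw with mean n (assumption)
    for t in response_times:
        count += 1
        if count >= needed:
            reinforced.append(t)
            count, needed = 0, rng.randint(1, 2 * n - 1)
    return reinforced


def fixed_interval(response_times, interval):
    """Reinforce the first response made once `interval` has elapsed."""
    reinforced, available_at = [], interval
    for t in response_times:
        if t >= available_at:
            reinforced.append(t)
            available_at = t + interval  # the clock restarts after each reinforcer
    return reinforced


def variable_interval(response_times, mean_interval, seed=0):
    """Reinforce the first response after a random wait averaging mean_interval."""
    rng = random.Random(seed)
    reinforced = []
    available_at = rng.uniform(0, 2 * mean_interval)  # uniform draw (assumption)
    for t in response_times:
        if t >= available_at:
            reinforced.append(t)
            available_at = t + rng.uniform(0, 2 * mean_interval)
    return reinforced
```

With one response per time unit (`list(range(1, 21))`), `fixed_ratio(..., 5)` reinforces responses 5, 10, 15 and 20, while `fixed_interval(..., 6)` reinforces the responses at times 6, 12 and 18 however many extra responses are made in between.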

Fixed ratio (fixed number of responses, then reinforced): e.g., getting paid for the number of
items you make, e.g., 100 envelopes

Variable ratio (unpredictable number of responses, then reinforced): reinforcement after an
unpredictable number of responses, regardless of time, e.g., jackpot

Fixed interval (wait a specified time, then reinforced): need to wait a specified period after
completing the correct response before being reinforced, e.g., being paid every 2 weeks (does
not depend on how fast or how much they work)

Variable interval (unpredictable time): need to wait for an unpredictable amount of time,
e.g., waiting for an elevator (does not matter how often you press the button)

examples of schedules

 fixed ratio: e.g., piece work is where reinforcement (money) is contingent on a fixed
amount of some designated activity e.g., a fruit picker filling a set number of baskets
 variable ratio: when fishing, casting the line is reinforced (fish biting the hook) intermittently,
which results in a constant rate of casting

 fixed interval: doing well in an exam (reinforcement) is contingent on studying. Exams occur
at regular intervals. Studying occurs shortly before the exam, ceases after the exam, and
restarts before the next exam
 variable interval: poker machines pay off on something like a variable interval schedule. This
schedule maintains constant playing

skinner’s view of society

 Skinner argued that our behaviour was controlled by its past interactions with the environment
 behaviours that were reinforced by money, praise and so on were those which were
maintained whereas behaviours that were not or no longer reinforced disappeared
from our repertoire
 he advocated society using the “science of behaviour” to effectively control what
people did: not via coercion but via arranging appropriate contingencies between their
behaviour and reinforcements
 Skinner’s views were implemented in clinical contexts as behaviour modification. For
example, long-term patients in a psychiatric ward were first observed to determine what
could be used as a reinforcer to control their activities (cigarettes). Provision of
cigarettes was made contingent on activities (e.g. getting up in the morning, making
the bed) and, once these were established, on additional activities (engaging in social activities)

choice and the matching law - RJ Herrnstein

 Used these schedules to study choice: how an organism allocated its activity across
various behaviours (sport, studying, movies)

A lab example of choice

 A pigeon in a Skinner box is presented with two patches of colour (red and green)
on a screen and can peck at the patches to earn food rewards
 Suppose it can earn a max of 40 rewards per hour. For example, pecking at the red
patch (termed key 1 on the next slide) is rewarded on average every 6 min (i.e., 10
rewards per hour), whereas pecking at the green patch (key 2) is rewarded on average
every 2 min (i.e., 30 rewards per hour).
 Red gives the smaller number of rewards per hour; green gives the larger

Herrnstein’s matching law

the matching law and behaviour

 Each day we engage in a range of different activities e.g., cook, clean, work, meet
friends, study, play sport etc. How do we allocate our time?
 One idea is that we always try to maximise our benefits and minimise our costs
 Therefore, we make decisions about what to do by comparing the benefits and costs
of different courses of action
 If one activity has a higher value than another, we choose the former. E.g., socialising
over studying

Matching rather than maximising

 The matching law says that we do not try to maximise some value
 Rather, we allocate behaviour in proportion to the value derived from each
 We measure the value of one activity and that of another and allocate behaviour
according to the proportion of value from each activity.
 If the value of socialising is twice the value of studying, we allocate twice as much
behaviour to socialising as to studying.
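
The formula itself is not reproduced in these notes. In its standard textbook form, Herrnstein's matching law says the share of behaviour allocated to an option equals that option's share of the total reinforcement: B1 / (B1 + B2) = R1 / (R1 + R2). The Python sketch below applies that form; the function name and the dictionary layout are illustrative assumptions.

```python
def matching_allocation(reinforcement_rates):
    """Predicted share of behaviour per option under strict matching.

    reinforcement_rates: dict mapping option name -> rewards earned per hour
    (or any other measure of value; the units cancel out).
    """
    total = sum(reinforcement_rates.values())
    if total <= 0:
        raise ValueError("total reinforcement must be positive")
    return {option: rate / total for option, rate in reinforcement_rates.items()}


# Pigeon example from the notes: red pays 10 rewards/hour, green pays 30.
pigeon = matching_allocation({"red": 10, "green": 30})

# Socialising valued at twice studying (arbitrary value units).
student = matching_allocation({"socialising": 2, "studying": 1})
```

Under these assumptions the pigeon is predicted to direct a quarter of its pecks to red and three quarters to green, rather than pecking green exclusively, and the student's behaviour splits two to one in favour of socialising.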

commitment and self-control (Rachlin & Green 1972)

 Direct choice procedure and concurrent choice procedure
 Bird learns through trial and error about the small and large reward

Discount function

 We discount the future very steeply

 Depending on the time, at some point we will wait to pick the large reward
 Addicts will always pick the small reward
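
The notes do not give a formula for the discount function. A common choice in the delay-discounting literature is the hyperbolic form V = A / (1 + kD), where A is the reward amount, D the delay and k a steepness parameter. The sketch below assumes that form, with hypothetical amounts, delays and k, to show how preference can flip from the small to the large reward as both are pushed further into the future.

```python
def discounted_value(amount, delay, k=1.0):
    """Hyperbolic discounting (assumed form): value falls as delay grows."""
    return amount / (1.0 + k * delay)


def preferred_reward(wait, k=1.0):
    """Which reward has the higher present value after an added wait.

    Hypothetical options: a small reward (2 units) available after `wait`,
    and a large reward (4 units) available 4 time units later than that.
    """
    small = discounted_value(2, wait, k)
    large = discounted_value(4, wait + 4, k)
    return "small" if small > large else "large"
```

With k = 1, `preferred_reward(0)` returns `"small"` (2.0 against 0.8), whereas `preferred_reward(10)` returns `"large"` (about 0.18 against 0.27): when both rewards are far away, waiting for the large one wins.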

Demand and choice

 Demand: how much of a commodity will be consumed at a given price
 Bread (inelastic): its consumption is independent of cost; movies (elastic): their
consumption is dependent on price
 Rats are given a choice: press left for food vs right for rewarding brain stimulation.
They are allocated an income – a specified number of presses which they can spend
 Outcome: rats prefer brain stimulation if cost is low but prefer food if cost is high
 Demand is greater for stimulation when it is cheap but demand for food is greater
when costs are high relative to income
 e.g., going to a concert before an exam
 if a friend asks you 2 hours before a concert and you have an exam tomorrow, you
are more likely to go to the concert than study for the exam. However, if your friend
asks you a day before, you are more likely to stay home and study for the exam
(this is because the reward of going to the concert is further away in time)

skinner and radical behaviourism

 skinner advocated radical behaviourism
 this focused on how behaviour changed the environment and was in turn changed by it
 it rejected explanations based on appeals to inner variables, including cognitions
and motivations, as these processes themselves had to be explained
 it proposed that our current behaviours were the products of the past contingencies
between situations, behaviours, and reinforcers

behaviourism rejected: circa 1965

 developments in cognate disciplines provided ways of studying the mind in a rigorous
manner (e.g., artificial intelligence)
 environmental (empirical) approaches could not explain language
 much more was built into the mind than implied in the behaviorist approach
 psychology shifted from the study of behaviour to the use of behaviour as a means of
formulating and testing theories about the mind


The study of associative learning

 critical distinction is between knowledge acquired about the world and its expression
in behaviour. This distinction led to various questions about learning:
 what conditions produce learning?
 What is learned and how is it expressed in behaviour?

Pavlovian conditioning
 Once identified with the salivary responses studied by
Pavlov and the emotional responses observed by Watson, and dismissed as of little
general interest
 It was the psychology of spits (dogs) and twitches (Watson’s Little Albert)
 Pavlovian conditioning produces a range of responses including approach and withdrawal

Bird video
 Colour of light and food
 Recognises the colour and tries to eat the colour (knows the association between the
light and the food but acts in an irrational manner)

Attractive signals: approach and contact

 Stimuli that signal attractive events (food, fluid, sexual partners) come to elicit
approach and consummatory responses. The pigeon eats the signal for food, drinks the
signal for fluid and courts the signal for a sexual partner
 Other attractions include the effects produced by drugs of abuse (e.g., heroin, cocaine,
nicotine, alcohol). Stimuli that signal such effects elicit approach and a range of
drug-related responses in people

Omission schedule: depends on whether the behaviour is voluntary or involuntary

 Sign-tracking responses persist even when their occurrence prevents the outcome
(Williams & Williams 1969)
 Sign tracking: behaviour that is directed towards a stimulus as a result of a
learned association between that stimulus and reward. The sign-tracking response
develops even though reward delivery is not contingent upon a response
 Presentations of the red dot were followed by access to grain, but only if pecks
did not occur. If pecks occurred, grain was not delivered
 Pecking eventually ceased because grain was not delivered. The cessation of
pecking meant that presentations of the red dot were again followed by access to grain.
The signalling relation between red dot and food reinstated sign tracking (pecking the
dot), which was again extinguished and again reinstated
 The pigeons were unable to suppress pecking at the CS even when pecking caused the
omission of food
 Birds that have approached, courted, and copulated with a CS (a stuffed toy)
signalling a sexual partner continue to direct these behaviours towards the CS even
when the partner is present

Attentional capture in people

 Le Pelley et al. 2015 exposed participants to a display in which the presence of a
particular colour signalled a high reward, and monitored eye gaze to the elements in
the display
 They arranged that participants procured the large reward only if they did not look at
the signal (the colour)
 Just as pigeons were unable to inhibit pecking, people were unable to inhibit looking at
the colour, even though looking at it caused the omission of the large reward

Sign tracking in rats

 Every few minutes, a lever is inserted into an experimental chamber for a few
seconds, removed and food is presented to hungry rats
 Some rats come to approach, manipulate, and gnaw the lever (sign trackers) whereas
others use the lever as a signal to enter the aperture where food is delivered (goal
trackers)
 The individual differences are related to how readily the rats subsequently acquire
drug-related habits: the sign trackers are more likely to become addicted than the goal
trackers
 The sign trackers show: more approach to a cocaine cue, more cue-induced and
cocaine-primed reinstatement following extinction of self-administration, greater
preference for a cocaine-associated tactile cue in a conditioned cue preference
procedure, a higher break point for cocaine in a progressive ratio schedule (Flagel,
Akil, & Robinson, 2019)
 Sign tracking is held to reflect the processes involved in drug addiction, where
drug-related cues acquire so-called incentive salience: sign trackers are more likely
than goal trackers to imbue the lever, and likewise drug-related cues, with such salience
 Incentive salience refers to motivation for rewards that is driven by both
physiological state and previously learned associations about a reward cue

Attractive signals: preferences

 People and rats are omnivores: almost anything that can be eaten is eaten
 The human infant is born with a liking for sweet tastes and a dislike of bitter ones but
has no innate preferences for smells
 But we come to like a range of flavours (taste-smell experiences)

Flavour nutrient learning (Myers 2003)

 Rats are fitted with a fistula into their stomach through which nutrients can be infused,
and are presented with two bottles of flavoured water, one orange flavoured and the other
lemon flavoured
 Drinking the orange flavour is followed by intra gastric (IG) infusions of a nutrient
(glucose, starch, protein) whereas drinking the lemon flavour is followed by infusion
of physiological saline
 Rats ingest the orange flavour and show liking responses
 The association between the flavour and the nutrient results in the flavour being liked

Drug conditioning of flavour preferences

 People come to like the flavours associated with the effects produced by drugs,
such as the flavour of cigarettes, alcoholic beverages, and coffee
 In each case, the flavour starts from neutral or even negative hedonic value but
acquires a positive hedonic value because it signals the effects produced by
nicotine, alcohol, or caffeine

Why are these conditioned responses?

 In terms of reproductive fitness
 If conditioned responding evolved because it increases reproductive fitness, subjects
exposed to a conditioned stimulus should sire more offspring than subjects not
exposed to the CS
 Male birds learn that a place signals access to a female and are then allowed to copulate
with a female in that place

Domjan (2007)


 Learning to identify cues that signal danger is an adaptation that emerged early in the
evolution of animal life
 Such cues allow people and animals to anticipate the danger and respond
 These responses include arousal, fight/flight/freezing, sympathetic nervous system
arousal, decreased pain sensitivity, protective reflexes


Biological function of conditioned responses

 Male fish establish a territory to attract females. They defend the territory against
other males
 Males are repeatedly exposed to a stimulus (a light) which signals the appearance
of a rival (located in an adjacent glass water tank). They learn that the light signals
the appearance of the rival
 Finally, the rival is introduced into their territory after presentation of the stimulus
 The males were more successful than controls in the subsequent fights and
therefore more successful in maintaining their territory and attracting females
 Pavlovian conditioning had conferred a fitness advantage

Biological function (Hollis 1984)

Cue-to-consequence effect (Garcia & Koelling 1966)

Flavour-illness learning

 Rats can learn that a sweet taste signals pain and will suppress ingestion of that
taste
 However, this suppression is specific to the place where the taste-pain
experiences occurred
 Elsewhere, rats avidly ingest the sweet taste and show liking responses
 In contrast, rats who learn that a sweet taste signals malaise refuse to ingest that
taste anywhere and show disgust responses
 Flavours associated with nausea become disliked

Conditions for learning

 Association by temporal contiguity is one of the oldest ideas in the study of learning
 Two events (CS and US) are more likely to be associated when they occur together
than when their presentations are separated in time
 Why:
 One reason is the nature of the physical world: events which occur together in time are
more likely to be related to each other than are events which occur separately
 Association by temporal contiguity is an innate central mechanism that models such
relations
 Temporal contiguity is necessary but not sufficient for associative learning

The blocking effect (Kamin 1969)

 Two groups: one was pre-trained with the light, and one was not

Contingency effect (Rescorla 1969)

 Contingency: an event or circumstance which is possible but cannot be predicted with
certainty / a provision for a possible event or circumstance


 Temporal contiguity is not sufficient for associative learning

 The additional requirement is that the CS provides information about the occurrence
of the US
 Learning about the CS fails when it is accompanied by a better predictor of the US
 In the blocking effect, the pre-trained light was the better predictor
 In the contingency effect, the environment was the better predictor
 These and other results led to the idea that learning is regulated by prediction error
 Initially, the US is surprising, and subjects learn about the CS. The more they
learn, the less surprising the US becomes; learning ceases when the CS fully predicts the US


 Learning allows an organism to adjust its behaviour to bring it into line with changes
in the world
 Extinction is an example. It occurs when the positive contingency between a CS and
US is broken by exposures to the CS in the absence of the US
 The responses elicited by the CS (approach or withdrawal) decline across the CS
alone exposures and eventually cease to occur
 Responding to the CS is said to be extinguished
 Extinction is used to model cue exposure which is a component of cognitive
behavioural therapy (CBT)

Cue exposure and extinction

 Clinical trials have shown that CBT is more effective than other therapies in the
treatment of anxiety disorders (e.g. PTSD)
 The cue exposure component of CBT consists of the patient confronting
trauma-related cues (the CS) in the absence of any overt danger (the US)
 The aim of these confrontations (CS-alone exposures) is to reduce the ability of the
trauma-related cues to elicit the fear memories which are destroying the patient’s life

The restoration of extinguished responses

1. Recovery of fear

You train a rat to fear a light CS by pairing it with an electric shock. Every time the light
flashes, a shock is delivered, so the rat comes to show a conditioned fear response to the
light, such as increased heart rate (HR). Over the next 4 days you extinguish the fear CR by
presenting the flashing light without the shock. A week later you place the rat back into the
box and present the flashing light alone, and the rat again shows the fear CR. This return of
fear after extinction and the passage of time is a demonstration of spontaneous recovery.

2. Fear is renewed

In the first phase (Context A), rats are put into a green box and trained with a light as the
CS, which signals shock (the US). Every time the light is turned on, the rat receives a shock,
so the light comes to elicit a fear response such as increased heart rate.

Over the next few days, you place the rat in a different, blue box (Context B) and extinguish
the CR by presenting the light without the shock. The day after extinction in Context B, you
put the rat back into Context A and present the light without the shock, and the CR of
increased heart rate returns. The return of the CR to the light in Context A demonstrates
fear renewal.

3. Fear is reinstated

You train a rat to fear a light CS by pairing it with an electric shock (the US). Every time
the light flashes, a shock is delivered, so the rat comes to show a conditioned fear response
such as increased HR. Over the next 4 days you extinguish the fear CR by presenting the
flashing light without the shock. The following day, you present the shock (US) alone,
without the flashing light. After a period of time, you test the rat with the flashing light
alone (no shock), and the rat shows the fear CR. This is an illustration of fear reinstatement.

Implications for exposure therapy

 Restoration phenomena suggest that symptoms of PTSD will return after exposure
therapy with the lapse of time (spontaneous recovery)
 Subjects emerge from fear extinction, or patients from exposure therapy, with the
original fear/trauma memories intact but inhibited. Subsequent behaviour (fear vs no
fear) depends on which memory is retrieved
 The extinction data suggest that the inhibitory memory is situation-specific: tied to
time, place and emotional state (safety). In contrast, the original fear memory is
relatively independent of these cues. Thus, the fear or trauma memory returns when the
situational cues differ (different time, place or emotional state)

Extinction and prediction error

 The omission of the predicted US constitutes the error which initiates the inhibitory
learning that underlies the reduction in conditioned responses (fear responses)
 The size of the error will determine the amount of the inhibitory learning, just as
it is the size of the error which determines the amount of excitatory learning (e.g., the
blocking effect).
 Evidence for this comes from designs in which one CS (e.g., a noise) is strongly
conditioned, a second (e.g., a light) is moderately conditioned, and a third CS (e.g., a
tone) is weakly conditioned.
 Half of the subjects are then extinguished to a compound composed of the light and
the noise, while the remainder are extinguished to a compound containing the light
and the tone.
 A subsequent test of the light showed that it elicited less fear (i.e., greater extinction)
when it had been extinguished in compound with the strongly conditioned noise than
with the weakly conditioned tone
 The expectation of the feared outcome elicited by the noise-light was greater and,
hence, the error when that outcome did not occur was also greater.
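
The compound design above can be sketched with a simple error-correction rule in which every cue present on a trial shares the same prediction error. This is only an illustration: the starting strengths (0.9, 0.5, 0.1), the learning rate and the number of extinction trials are assumed values, and the function name is hypothetical.

```python
def extinguish_compound(v_light, v_partner, rate=0.1, trials=3):
    """Extinguish the light together with a partner CS (no US).

    Returns the light's associative strength after extinction.
    """
    for _ in range(trials):
        # Prediction error: actual outcome (0, no US) minus the summed expectation.
        error = 0.0 - (v_light + v_partner)
        # Both cues are present, so both change by the same shared error.
        v_light += rate * error
        v_partner += rate * error
    return v_light


V_LIGHT = 0.5  # moderately conditioned (assumed value)
V_NOISE = 0.9  # strongly conditioned (assumed value)
V_TONE = 0.1   # weakly conditioned (assumed value)

light_after_noise = extinguish_compound(V_LIGHT, V_NOISE)
light_after_tone = extinguish_compound(V_LIGHT, V_TONE)

print(f"light extinguished with noise: {light_after_noise:.4f}")
print(f"light extinguished with tone:  {light_after_tone:.4f}")
```

Under these assumptions the light ends up with less associative strength after extinction alongside the noise, because the summed expectation (and so the error when no shock arrives) is larger on noise-light trials.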

Instrumental conditioning
 Can we learn about relations between any behaviour and its consequences?
 In Lewis Carroll’s Through the Looking-Glass, there is an incident in which the Red
Queen and Alice are constantly running but remaining in the same spot.
 "Well, in our country," said Alice, still panting a little, "you'd generally get to
somewhere else — if you run very fast for a long time, as we've been doing.”
 "A slow sort of country!" said the Queen. "Now, here, you see, it takes all the running
you can do, to keep in the same place. If you want to get somewhere else, you must
run at least twice as fast as that!"

The looking glass world

 A laboratory example of the Red Queen’s world is where a dog is presented with a
bowl of food. Each time the dog approaches, the bowl moves away; each time the dog
stays still, the bowl moves towards the dog.
 If approach is an instrumental response acquired because of its consequences, then the
dog will learn to remain still. But it doesn’t. It remains still and the bowl approaches;
but as soon as this occurs, it subsequently approaches the bowl.
 This is an example of sign tracking where the Pavlovian association causes approach
even when its consequence is the omission of food
 The omission test is used to determine whether a response is under the control of
its consequences, that is, whether it is an instrumental response.

Conditions for learning

 Dickinson et al. (1984) used a video game in which people could fire a shell at a
tank moving through a minefield. Their task was to determine whether the shell was
effective in destroying the tank. On some occasions, they fired the shell, and the tank
was destroyed but there were other occasions on which they did not fire, and the tank
was destroyed by the minefield.
 Their judgments accurately reflected the differences in these likelihoods: Actions
were judged to be effective if the likelihood of the outcome (the tank being
destroyed) was greater when they had fired the shell than its likelihood when
they had not fired.
 Similar results were obtained with rats that received food for pressing a bar and either
did or did not receive food in the absence of bar pressing. The amount learned about the
action-food association was influenced by the frequency with which food occurred in
the absence of the action.

Action-outcome associations
 Any point above the diagonal line means that the outcome (reinforcement) is
more likely if the action has occurred than if it has not occurred: the action, then,
causes the outcome.
 Any point below the diagonal means that the outcome is more likely if the action has
not occurred than if it has occurred. The action prevents the outcome
 Any point on the diagonal means that the action and outcome are unrelated.
 Points on the diagonal may lead to the phenomenon of learned helplessness,
where learning that outcomes are unrelated to your actions: 1) makes it more
difficult to subsequently learn that outcomes are contingent on your actions; 2) makes
you less inclined to engage with the problem; and 3) produces mood changes (depression).

 Animals and people are sensitive to the contingency between an action and an
outcome. They take into account not only the likelihood of the outcome when the
action occurs but also the likelihood of the outcome when the action has not occurred.
 Learning mechanisms are designed to detect predictive relations that exist between
events (Pavlovian conditioning) and the causal relations that exist between actions
and events (instrumental conditioning).
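
The three regions of the action-outcome space can be sketched as a comparison of two probabilities. This assumes the missing figure plots the likelihood of the outcome when the action occurs against its likelihood when the action does not occur; the function name and the example probabilities are hypothetical.

```python
def classify_contingency(p_outcome_given_action, p_outcome_given_no_action, tol=1e-9):
    """Classify an action-outcome relation from two conditional probabilities."""
    difference = p_outcome_given_action - p_outcome_given_no_action
    if difference > tol:
        # Above the diagonal: outcome more likely after the action.
        return "action causes outcome"
    if difference < -tol:
        # Below the diagonal: outcome less likely after the action.
        return "action prevents outcome"
    # On the diagonal: outcome equally likely either way.
    return "action and outcome unrelated"


print(classify_contingency(0.8, 0.2))
print(classify_contingency(0.1, 0.6))
print(classify_contingency(0.5, 0.5))
```

The third case is the one linked to learned helplessness in the notes: the outcome is just as likely whether or not the action is performed.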

Mechanisms of performance

 An animal or person has learned that an action is instrumental in procuring an
outcome (e.g., pressing produces food)
 How is this knowledge expressed in behaviour? What causes the agent to perform the
action?
 One explanation is in terms of desires: the agent performs the action because the
outcome of the action is desired.
 We have learned that some action produces the outcome
 We desire the outcome
 Therefore, we perform the action.
 Desires, therefore, are critical in explaining how learning is related to behaviour.

Incentive learning

 One idea is that animals and people learn about the value of an outcome (e.g., food)
relative to an internal state (hunger-satiety) and that it is this value which controls
the action that produces the outcome.
 Animals and people rate food as hedonically pleasant when they are hungry and
hedonically neutral or mildly unpleasant when satiated.
 Effectively, animals and people learn that food tastes good when hungry but not when
satiated

How learned value affects actions (Dickinson & Balleine 2002)

 Hungry rats first learn that pressing a bar produces a novel food
 Then half of the rats (Groups A and B) are provided with this food in their home cages
where they can eat as much as they want.
 The remainder (Groups C and D) are provided with their familiar lab chow in the
home cages.
 Finally, all rats are tested for how much they press the bar.
 Rats in Groups A and C are tested hungry. They show substantial pressing.
 Those in Groups B and D are tested satiated. Rats in Group D show just as much
pressing as those in Groups A and C, while those in Group B refuse to press the bar

satiated rats and bar pressing

 Why did rats tested satiated (Group D) press the bar as much as those tested hungry
(Groups A and C) whereas the other rats tested satiated (Group B) refused to press the
bar?
 Rats in Groups A and B had learned that the novel food lost its palatability when they
were satiated, whereas those in Groups C and D only knew the food as
hedonically pleasant. These latter rats were not given the opportunity to learn that
the value of that food is changed by satiety.
 Desires are evaluative beliefs about goals where your experience with the goal allows
you to relate its value (its hedonics) to motivational state and thereby to actions.

An application: drug seeking and drug taking

 Taking a drug (e.g., alcohol) results in hedonically pleasant effects (feeling good even
euphoric) followed by aversive ones (nausea, headache, irritability, depression)
 Cues (e.g., a pub, sight or smell of alcohol) signalling these hedonically pleasant
effects become Pavlovian CSs, exciting a representation of these effects, drug-related
responses (changes in body temperature) and an appetitive motivational system
(arousal and approach).
 The actions which produce the effects (obtaining money, going to the liquor store and
so on) are then initiated.
 In addition, taking the drug during the aversive after-effects allows incentive learning,
where the agent learns about the value of the drug relative to that aversive state: it
alleviates that state.
 The drug user comes to desire the drug not only because it produces hedonically
pleasant effects but also because it alleviates aversive ones. The range of conditions
which produce drug seeking and taking is increased.


Associative learning
 Encode statistical regularities in the environment
 Allows organisms to build up representation of environment
 and to alter/adapt behaviour to exploit statistical regularities in the environment (gain
rewards, avoid punishments)

Pavlovian/classical conditioning
 Learning about signalling relationships
 E.g., McDonalds sign signals greasy food = salivating

Instrumental/operant conditioning
 Learn about action-outcome relationships
 Perform responses that lead to positive outcomes (rewards)
 Avoid responses that lead to negative outcomes (punishments)
 E.g., studying hard = good grade

 E1 → E2?
 Does whether E2 happens or not depend on whether E1 happens or not?
 Does E2 depend on E1?
 Is E2 contingent on E1?

2x2 contingency table

                       E2 happens    E2 does not happen
E1 happens                 A                 B
E1 does not happen         C                 D

 However, if you always took your lucky pen with you, you might think it really was
lucky (you never get to see what happens without it)
 This may be part of the reason why people continue to believe in the effectiveness of
homeopathy – this is an example of a causal illusion: the belief that one event causes
another when in reality they are unrelated
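
The contingency between E1 and E2 can be computed from the four cells of the table as ΔP = A/(A+B) - C/(C+D). The sketch below uses made-up counts for the lucky-pen case; the function name and the numbers are hypothetical and are only meant to show why many "pen and success" cases (cell A) do not by themselves show a contingency.

```python
def delta_p(a, b, c, d):
    """Contingency from a 2x2 table: P(E2 | E1) - P(E2 | no E1).

    Returns None when either row is empty, because one of the two
    probabilities cannot be estimated.
    """
    if a + b == 0 or c + d == 0:
        return None
    p_e2_given_e1 = a / (a + b)
    p_e2_given_no_e1 = c / (c + d)
    return p_e2_given_e1 - p_e2_given_no_e1


# Made-up counts: exams passed (E2) with and without the lucky pen (E1).
same_pass_rate = delta_p(a=16, b=4, c=8, d=2)   # 80% passes either way
real_effect = delta_p(a=18, b=2, c=5, d=15)     # 90% vs 25% passes
always_took_pen = delta_p(a=20, b=5, c=0, d=0)  # no "without pen" trials

print(same_pass_rate, real_effect, always_took_pen)
```

When the pen is always taken, cells C and D stay empty, so the comparison that defines contingency can never be made. This is one way to read the causal illusion described above.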

Dickinson, Shanks & Evenden (1984)

 Told participants that on each trial of an experiment they would see a tank → they
could choose to fire at the tank and see whether it would explode
 They were also told that the tanks were travelling through a minefield and that they
could explode as a result of a mine or the gun
 5 conditions, within subjects, one after the other
 40 trials per condition
 Participants were asked to rate how effective firing had been in the previous set

 +100 = firing always destroyed the tank
 0 = firing completely ineffective
 -100 = firing always prevented the tank from being destroyed
 Judgments vary as a function of contingency
 People are sensitive to differences in contingency
 Though not always perfectly accurate: ΔP = 0 was judged as a positive relationship
 Is associative learning based on ΔP?

Changes across training

 Lopez & Shanks (unpublished)

 Learning is gradual, even though actual ΔP does not change

o Contingency: how strongly are events related?
o Learning is influenced by contingency
o But that is not the whole story
 Learning is gradual even though ΔP is constant, “learning curves”
o Maybe learning is not the same as calculating contingency
o What happens when people learn about more than one cue at a time?

Learning about multiple cues

 In the examples above, people have been learning about a single “cue” stimulus
 What if instead people have to learn about multiple cues at the same time?
 Do we learn about each cue independently, based on its contingency?
 Or do cues compete with one another?

Blocking – Aitken, Larkin & Dickinson (2001)

 Told that Mr X is allergic to apples
 Trials show different types of food and participants are asked whether he would suffer an
allergic reaction → they then receive feedback, e.g., no allergic reaction
 Participants were then asked how likely Mr X was to suffer an allergic reaction after
eating a certain food, on a scale of 1-6
o When we learn about multiple cues at the same time, we don’t learn about each one
individually based on its contingency
o Instead cues compete with each other
 Blocking is an example of “cue competition”
 It’s not just the contingency between the cue and the outcome that’s important;
learning is also influenced by what we’ve previously learned about other cues
o Learning is not the same as calculating contingency. What are people doing when they
learn?


An alternative view: association formation

 We can understand learning as the gradual formation of an association (a link)
between representations
 What determines whether (and how fast) this association strengthens?

The Rescorla-Wagner model

 Proposed by Rescorla & Wagner (1972)
 Surprise is the key to learning
 We only learn when something happens that we don’t expect → we learn when we are
surprised by something
 If everything occurs as expected, we don’t learn anything new → we don’t learn if we
are not surprised
 They captured this idea in an equation: an algorithm that provides a model of the
learning process
 When prediction error is zero, there is no learning
 Learning = reduction in PE
 Learning = things become more expected, and less surprising

Vegemite example

 V (associative strength) begins at 0 because Vegemite hasn’t been eaten before
 Alpha = salience of the cue (e.g., a loud noise is more salient than a soft noise)
 Beta = salience of the outcome (illness) = 0.8
 Lambda (observed magnitude of the outcome) = illness occurring = 1
 Change in associative strength from 0 → 0.4 as a result of the learning episode
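
The single learning episode above can be written out with the standard Rescorla-Wagner update for one cue. The cue salience (alpha = 0.5) is an assumption here: it is the value that makes the stated change of 0.4 come out, given an outcome salience of 0.8 and lambda = 1.

```latex
\[
\Delta V = \alpha \, \beta \, (\lambda - V)
         = 0.5 \times 0.8 \times (1 - 0)
         = 0.4
\]
```

The bracketed term is the prediction error: the outcome that occurred (lambda) minus the outcome that was expected (V).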

 The graph plots associative strength as a function of the number of trials
 Associative strength changes quickly at first but then slows down
 When associative strength reaches 1 (lambda), learning stops
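
The shape of the learning curve can be reproduced with a few lines of code. This is a minimal sketch for a single cue; alpha = 0.5 is an assumed cue salience, beta = 0.8 and lambda = 1 follow the Vegemite example, and the function name is hypothetical.

```python
def rescorla_wagner(trials, alpha=0.5, beta=0.8, lam=1.0, v=0.0):
    """Return the associative strength V after each of `trials` pairings."""
    history = []
    for _ in range(trials):
        prediction_error = lam - v            # how surprising the outcome is
        v += alpha * beta * prediction_error  # change in associative strength
        history.append(v)
    return history


curve = rescorla_wagner(trials=10)
for trial, strength in enumerate(curve, start=1):
    print(f"trial {trial:2d}: V = {strength:.3f}")
```

Each trial closes a fixed fraction of the remaining gap between V and lambda, so the steps get smaller as the outcome becomes less surprising.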

After establishing a strong associative strength – what happens if we eat Vegemite, and it
doesn’t make us sick?

Acquisition and extinction

Changes to alpha and beta
 The values we chose don’t matter too much; the general pattern of
learning is the same in each case
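
Acquisition followed by extinction, and the effect of changing alpha and beta, can be sketched as below. The parameter pairs and the number of trials are assumed values chosen for illustration, and the function names are hypothetical. Note that in this sketch extinction simply lowers V back towards zero.

```python
def train(v, trials, lam, alpha, beta):
    """Apply the Rescorla-Wagner update for `trials` trials; return the V history."""
    history = []
    for _ in range(trials):
        v += alpha * beta * (lam - v)
        history.append(v)
    return history


def acquire_then_extinguish(alpha, beta, trials=20):
    # Acquisition: the outcome occurs, so lambda = 1.
    acquisition = train(0.0, trials, lam=1.0, alpha=alpha, beta=beta)
    # Extinction: the outcome is omitted, so lambda = 0.
    extinction = train(acquisition[-1], trials, lam=0.0, alpha=alpha, beta=beta)
    return acquisition, extinction


fast_acq, fast_ext = acquire_then_extinguish(alpha=0.5, beta=0.8)
slow_acq, slow_ext = acquire_then_extinguish(alpha=0.2, beta=0.5)

print(f"fast: V after acquisition = {fast_acq[-1]:.3f}, after extinction = {fast_ext[-1]:.3f}")
print(f"slow: V after acquisition = {slow_acq[-1]:.3f}, after extinction = {slow_ext[-1]:.3f}")
```

Smaller alpha and beta values slow both phases down, but the pattern is the same: V rises towards lambda during acquisition and falls towards zero during extinction.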

Rescorla-Wagner and contingency

Rescorla-Wagner model and blocking
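
Blocking falls out of the Rescorla-Wagner rule because all cues on a trial share one prediction error. The sketch below assumes a two-phase design (cue A alone, then A and B together) with a control group that skips the first phase; the cue names, learning rate and trial counts are illustrative assumptions.

```python
def rw_trial(strengths, cues, lam, rate=0.3):
    """Update every cue present on the trial using the shared prediction error."""
    error = lam - sum(strengths[cue] for cue in cues)
    for cue in cues:
        strengths[cue] += rate * error


def run_group(pretrain_a):
    v = {"A": 0.0, "B": 0.0}
    if pretrain_a:
        for _ in range(20):
            rw_trial(v, ["A"], lam=1.0)      # Phase 1: A -> outcome
    for _ in range(20):
        rw_trial(v, ["A", "B"], lam=1.0)     # Phase 2: A + B -> outcome
    return v


blocked = run_group(pretrain_a=True)
control = run_group(pretrain_a=False)

print(f"blocking group: V_B = {blocked['B']:.4f}")
print(f"control group:  V_B = {control['B']:.4f}")
```

In the pre-trained group A already predicts the outcome when B is introduced, so there is almost no surprise left to drive learning about B.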

Summary so far
 Lecture 1: learning = calculating contingency?
 Different view: learning = forming association between representations
 The Rescorla-Wagner model
 Surprise drives association formation (learning)
 RW is one algorithmic model of association formation – there are many others
 RW is probably the most influential model of associative learning
 A simple, intuitive account – but very powerful
 It explains the effects of contingency, learning curves, and cue competition
 Still often used as a default view of learning

An alternative view - the propositional nature of human associative learning (Chris Mitchell,
Jan De Houwer & Peter Lovibond)

 The link-based approach, e.g., Rescorla-Wagner

The propositional approach

 Idea: Last time I ate vegemite I was ill, perhaps vegemite makes me ill
 Form a memory of eating vegemite and becoming ill
 Active reasoning

 The “thinking about it” approach

 Remember what went with what in the past, try and work out most likely

Propositions & Blocking

Evidence for the propositional account

 Blocking is stronger when the outcome is at a submaximal level

 Predicted by propositional account

 Cannot be predicted by the RW model
 All that matters in the RW is that the phase 2 outcome is not surprising,
and this is the case in both groups
 So should have equal blocking in both groups

Blocking of memory

propositional account
Mitchell et al (2006)

 Mr X experiment
 Asked to answer 2 types of questions
1. What illness followed bread during training? Options: Daryosis,
Xianethis etc.
2. To what extent did the food cause this illness, on a scale of 0-8?

o It’s still not clear: more research is needed
o Seems likely that both types of process operate
 Simple experiments encourage “thinking about it”
 More complex experiments discourage thinking about it, so more
automatic link formation mechanisms dominate

o Associative learning may be due to more than one process


 Process of learning
 The type of learning can lead to either an increase or a decrease in behaviour

Pavlovian conditioning
 Repeated contingent presentations of the CS with the US
 The CS becomes associated with the US
 The subject has acquired the CS-US association

How do we test this?

 The CS elicits the conditioned response CR
 The CR increases as the association increases

 Repeated presentations of the CS without the US
 Produces a gradual decline in the conditioned response
 At test the conditioned response is very small or unobservable
 E.g., rat, light and electric shock
 Acquisition = fear response when the light is presented alone
 Extinction = no fear response when light is presented alone

The RW model
 Computational model that allows us to simulate learning phenomena
 It is a trial-level model that calculates the change in associative strength for each trial
 The total associative strength must be split between all stimuli present on any trial
 The model describes negatively accelerated learning (on each trial you will learn
less than on the trial before)
 V = associative strength
 ΔV = change in associative strength
 (λ − ΣV) = prediction error (element of surprise)

Compound conditioning
 the total associative strength is shared between all cues present on each trial
 e.g., condition the light only
 light: V reaches Lambda = 100
 e.g., condition the light and tone together (compound)
 the maximum (Lambda) for the compound = 100
 if the light and tone have equal salience, they will accrue the same associative strength
 light: V = 50
 tone: V = 50
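
The 100 versus 50/50 split can be checked with a short simulation in which the prediction error is computed from the summed strength of all cues present. Equal salience is modelled as an equal learning rate for both cues; the rate and the number of trials are assumed values, and the function name is hypothetical.

```python
def condition(cues, lam=100.0, rate=0.2, trials=50):
    """Condition the given cues together; return each cue's final strength."""
    v = {cue: 0.0 for cue in cues}
    for _ in range(trials):
        # The error depends on the total expectation from all cues present.
        error = lam - sum(v.values())
        for cue in v:
            v[cue] += rate * error  # equal salience: same rate for every cue
    return v


light_only = condition(["light"])
compound = condition(["light", "tone"])

print({cue: round(value, 2) for cue, value in light_only.items()})
print({cue: round(value, 2) for cue, value in compound.items()})
```

Learning stops once the summed strength reaches lambda, so two equally salient cues trained together each end up with about half of it.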

Overexpectation

Conditioned inhibition

 the minus sign means that the US is no longer present
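
Conditioned inhibition can be sketched with the same update rule. The design assumed here is the common one in which cue A is paired with the US when presented alone (A+) but not when presented together with cue X (AX-); the cue names, learning rate and number of training blocks are illustrative assumptions.

```python
def conditioned_inhibition(rate=0.2, blocks=200):
    """Alternate A+ and AX- trials; return the final strengths of A and X."""
    v = {"A": 0.0, "X": 0.0}
    for _ in range(blocks):
        # A+ trial: A alone, US present (lambda = 1).
        v["A"] += rate * (1.0 - v["A"])
        # AX- trial: A and X together, US absent (lambda = 0).
        error = 0.0 - (v["A"] + v["X"])
        v["A"] += rate * error
        v["X"] += rate * error
    return v


strengths = conditioned_inhibition()
print(f"V_A = {strengths['A']:.3f}, V_X = {strengths['X']:.3f}")
```

Under these assumptions X acquires a negative associative strength that cancels the expectation produced by A, which is how the model represents a cue that signals the absence of the US.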

Super conditioning

Evaluative conditioning: learning to like and dislike

 a change in liking of a stimulus that is due to pairing of that stimulus with a liked
(positive) or disliked (negative) event

 Pavlovian conditioning that produces a change in liking

 Potentially a very powerful source of beliefs and preferences

Attempts to demonstrate EC in the lab – Todrank et al (1995)

 Participants rated 3 faces
 Pairing a face with a pleasant smell produced an increase in its
attractiveness rating (relative to neutral)
 Pairing a face with an unpleasant smell produced a decrease in its
attractiveness rating (relative to neutral)
 Is this real evaluative conditioning, or a demand effect?

Attempts to demonstrate EC in the lab

Staats & Staats (1958)

 “This is an experiment to see how well we can learn to separate lists
simultaneously through two different sensory modalities”

 Rate each name from 1(pleasant) to 7 (unpleasant)

 Asked to write down “anything they had thought during the experiment,
especially the purpose of it, and so on”
 Any students indicating awareness of the name-word relationships were excluded

 “Meaning responses had been conditioned to the name without subjects’ awareness”

 one of the participants in the experiment reported that “I don’t really
dislike the name Bill, that’s my husband’s name, but for the purposes of
this experiment, I marked Bill bad”
 the way they asked participants to write down any thoughts they had in
the experiment was not a very sensitive test of awareness

Page (1969)

Early attempts to demonstrate EC in the lab

 Early “evaluative conditioning” effects are likely to have been mainly due to demand
characteristics

 Is there evidence for a real change in liking because of pairing with
positive/negative stimuli?

Lab demonstrations of EC

Olson & Fazio (2001)

 “Video surveillance” task
 100s of images to be presented, press button when target image appears

 Experiment 2 used an ‘implicit’ test of liking (implicit association test,
IAT), which goes even further to rule out effects of experimenter demand

Olson & Fazio – IAT (2001)

 Respond as rapidly as possible
 First condition – compatible
 Second condition – incompatible

 the IAT is less susceptible to demand effects

 the aim is to respond as rapidly as possible, not to evaluate stimuli
 significantly faster in compatible condition
 evidence for conditioning of automatic evaluations
 “real” evaluative conditioning

Summary so far
o Evaluative conditioning is real
 Not just a consequence of demand characteristics
 Pairings produce changes in automatic evaluations of stimuli: “genuine”

Conditioned taste aversion (CTA)

 Conditioned aversion: a disliking of something because of pairings with
an unpleasant event

 Conditioned taste aversion: a specific form of evaluative conditioning
in which we learn to dislike a food/flavour because it has been previously
paired with unpleasant consequences
 CTA can be used as a force for good
 Maintaining healthy food consumption in people undergoing chemotherapy

CTA and chemotherapy

 Chemotherapy often causes severe nausea and vomiting
 Cancer patients frequently suffer loss of appetite
 Contributes to weight loss

Bernstein (1978) – chemotherapy and CTA

 Chemotherapy can induce CTA
 This will be bad if patients learn an aversion to foods that are good for
 Is there anything we can do to prevent this?

Broberg & Bernstein (1987) – CTA and chemotherapy intervention

o A form of Pavlovian conditioning that leads to a change in liking
o Potentially a powerful source of (and influence on) preferences
 Social relationships might be influenced by smells, music, experiences
 Heavily used in advertising: change opinions about products, politicians
by association


Attention and selection

 Attention has a huge impact on the way we behave
 If we want to understand behaviour, then we need to understand which stimuli will be
selected under a given set of circumstances

Attention and stimulus properties

 Stimulus is more likely to receive attention if it is:
 Perceptually salient e.g., bright, loud, abrupt (Folk, Remington & Johnson 1992)
 Emotionally relevant e.g., anxiety/depression: more attention to negative/threatening
words (MacLeod, Mathews & Tata 1986); positive mood: more attention to positive
words (Tamir & Robinson 2007)

Attention and learning

 Learning about the relationships between stimuli and other events also influences
attention to those stimuli

Attention and predictiveness

 Predictiveness: how well a stimulus predicts its consequences

 Predictiveness principle: pay more attention to cues that are the most accurate predictors
of outcomes

Roberts, Robbins & Everitt (1988)

 Intradimensional shift and extradimensional shift

 Learning should be much faster in the intradimensional condition
Owen et al. 1992

Latent inhibition
 Rascle et al 2001
 e.g., “count how many TOG and PEC appeared on the screen”
 the square signalled the beep

Attention and predictiveness

 majority of studies of attention and learned predictiveness in humans are consistent
with the predictiveness principle
 we learn faster about predictive cues
 we look for longer at predictive cues
 we respond faster to events appearing in the same location as predictive cues
 we are better at detecting predictive cues when they are presented very briefly
 All suggest an attentional advantage for predictive cues

Attention and reward value

Anderson, Laurent & Yantis (2011)

Attention and reward value

 The likelihood that a stimulus will capture our attention can change because of
experience and learning
 Our attention system automatically prioritizes processing of stimuli that have
previously signalled reward, even when this is contrary to our current goals
 Influence of learned value on attention
 Makes sense to have a system that prioritises reward-related stimuli
 But attention to reward-related stimuli can be maladaptive

Attention and addiction

 Drugs of abuse provide potent neural signals of reward (Hyman 2005)

 Learned attentional bias to drug-related stimuli
 Drug addicts show attentional bias towards drug-related stimuli e.g., emotional Stroop
(Marissen 2006)
 Anderson, Faulkner, Rilee, Yantis & Marvel (2013)
 Individual differences in reward related attentional bias determine
susceptibility to addiction?
 Attentional bias towards reward-related stimuli can be automatic
(Anderson et al 2011)
 Instruction may not be enough to reduce maladaptive biases
 May require re-training
 Relearn to allocate attention in adaptive way
 Eberl et al (2013)
 Retraining procedure with alcohol dependent individuals
 Reduced relapse rates at one year by approx. 10%

o Attention and learning are intimately related
o Attention is influenced by learned predictiveness
 Predictiveness principle: greater attention to most predictive stimuli
o Attention is influenced by learned reward value
 Our attention system automatically prioritizes processing of stimuli that
have previously signalled high value reward
 Learned attentional biases to reward-related stimuli may be implicated in


 Lifetime prevalence 0.4-0.8%

 Onset of symptoms typically in young adulthood
Positive symptoms
 Delusions
 Hallucinations
 Disorganised thinking
 Racing thoughts
Negative symptoms
 Lack of emotion
 Low energy and motivation
 Affective flattening
 Lack of social skills
 Social isolation

Dopamine and learning

 Dopamine plays a key role in learning about reward

Waelti, Dickinson & Schultz 2001

 Anticipating reward increases the level of dopamine in the brain, and drugs of abuse
typically increase rates of dopamine release (Berridge, Robinson & Aldridge 2009)
 Dopamine neurons mediate attribution of “motivational salience”
 Events and thoughts grab attention and drive action by virtue of association with
reward/punishment (Berridge & Robinson 1998)

Dopamine and psychosis

 Psychotic schizophrenia patients show heightened dopamine synthesis & level of
synaptic dopamine
 Dopamine agonists (e.g., amphetamine) produce psychotic symptoms in healthy adults
 Dopamine antagonists (e.g., haloperidol) reduce them in patients
 Dopamine plays a critical role in psychosis
 Psychosis is characterised by unusually high dopamine levels in certain critical brain

Aberrant salience
 Dopamine neurons mediate attribution of “motivational salience”
 Psychotic patients have excess dopamine
 Stimuli are attributed too much salience in psychosis (Kapur 2003)
 Hallucinations: internal representations have abnormal salience
 Delusions: attempts to make sense of abnormally salient events
Chadwick (2001)

Schizophrenia and latent inhibition – Rascle et al (2001)

Schizophrenia, dopamine, and latent inhibition

 Latent inhibition in schizophrenia

- Inconsequential stimuli remain salient, aberrant salience
- Result of excess dopamine?
 Dopamine agonists reduce latent inhibition
 Dopamine antagonists enhance latent inhibition
 Schizophrenia associated with reduced ability to learn to ignore inconsequential
- May contribute to hallucination and delusions

Schizophrenia and aberrant salience – Morris, Griffiths, Le Pelley & Weickert (2013)
 Delusional schizophrenia is associated with a reduced ability to learn to ignore irrelevant
and inconsequential stimuli
- Aberrant salience
- May contribute to hallucinations and delusions

1. Regulation
2. Incentives

 Motivated behaviours can be understood in terms of homeostasis or
homeostatic mechanisms:
 Homeostasis (Bernard 1865) is the maintenance of a steady internal state.
This state is the condition of optimal functioning for the organism, keeping
critical biological systems within certain pre-set limits
 These limits are setpoints – optimal operating characteristics for that
 Behaviour is motivated by deviations and serves to return the system to
the set point e.g., eating when we are hungry
 These are negative feedback systems
 A critical feature of negative feedback is that it tends towards stable
equilibrium (i.e., set point)

Regulation: example: Glucostatic hypothesis of feeding (Mayer, 1955)

regulation: example: drug addiction
 Allostasis and allostatic (definitions in above diagram)

Examples of homeostatic theories in psychology

Homeostasis and negative feedback

1. Detect
2. Compare (actual v model)
3. Correct any error between actual and model via behaviour
 homeostasis and negative feedback are fundamental principles of biology. By framing
our theories of motivation in the language and ideas of homeostasis, we are moving
close to a unified biological and psychological explanation of behaviour
 our brain has a ‘model’ of the world in terms of what a parameter should be (set point or
goal), deviations from this model initiate behaviours to return the parameter to the set
point. Learning can allow us to generate compensatory responses in advance of any
deviation, thereby minimizing impact of these disturbances and more effectively
defending homeostasis
 this is the ‘thermostat’ model of motivation – It only reacts to changes
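The detect, compare, correct loop above can be sketched as a simple negative feedback simulation. This is an illustrative toy, not a physiological model: the function name `regulate`, the `gain` parameter and the temperature-like numbers are all assumptions.

```python
def regulate(start, set_point, gain=0.5, steps=10):
    """Toy negative feedback loop (the 'thermostat' model).

    On every step the system detects the current value, compares it
    with the set point, and corrects a fraction (gain) of the error.
    """
    value = start
    trajectory = [value]
    for _ in range(steps):
        error = set_point - value   # compare actual with model
        value += gain * error       # corrective behaviour
        trajectory.append(value)
    return trajectory

# Hypothetical example: parameter starts below a set point of 37
for step, value in enumerate(regulate(start=30.0, set_point=37.0)):
    print(step, round(value, 3))
```

Whether the parameter starts above or below the set point, the error shrinks on every step, which is the stable equilibrium that negative feedback tends towards. Note that this loop only reacts after a deviation has occurred.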

learned control of homeostatic systems

1. Detect
2. Compare (actual v model)
3. Correct any error between actual and model via behaviour
4. Learn about the antecedents of error in the parameter (i.e., what caused perturbation)
5. Predict the perturbation and respond in anticipation of this
 Perturbation is a deviation of a system caused by an outside influence
 Learning in the form of Pavlovian conditioning can allow us to predict when
biological systems might be perturbed and allow us to mount a conditioned response
that defends homeostasis “before” any deviation occurs
 Example – Siegel’s theory of compensatory conditioned response (or conditioned
opponent process)
 Learning can allow us to generate compensatory responses in advance of any
deviation, thereby minimizing impact of these disturbances and more effectively
defending homeostasis
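Steps 4 and 5 can be sketched as a cue-elicited compensatory response that grows with experience. This is a simplified illustration in the spirit of a compensatory conditioned response, not Siegel’s actual model: `anticipatory_trials`, the perturbation size and the learning rate are assumptions.

```python
def anticipatory_trials(perturbation=10.0, learning_rate=0.3, trials=15):
    """Toy learned (anticipatory) control of a homeostatic system.

    A cue reliably precedes a perturbation of fixed size. On each trial
    the cue elicits a compensatory response; whatever deviation is still
    experienced is used to strengthen that response for the next trial.
    """
    compensation = 0.0        # strength of the conditioned compensatory response
    deviations = []
    for _ in range(trials):
        net_deviation = perturbation - compensation   # deviation actually experienced
        deviations.append(net_deviation)
        compensation += learning_rate * net_deviation  # learn from the residual error
    return deviations

for trial, dev in enumerate(anticipatory_trials(), start=1):
    print(trial, round(dev, 3))
```

On the first trial the full perturbation is experienced; after repeated cue-perturbation pairings the anticipatory response cancels most of it, so homeostasis is defended before a large deviation occurs.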

Beyond learning: social facilitation of eating (De Castro & De Castro 1989)

The dessert effect (Rolls 1981)

Incentive theories of motivation

 whereas homeostatic mechanisms push us into action to satisfy our needs, incentives
pull us towards or repel us from them to satisfy our wants. They are motivational
 incentives are properties of the world, and their significance can be ‘innate’ (primary
incentives) or learned (secondary incentives)
 they can be positive or negative
 primary positive incentives include food, water, attachment, sex
 negative primary incentives include pain, illness
 secondary positive incentives include money, drugs, social status, achievement (e.g.,
your motivation to do well in this course)
 secondary negative incentives include fear, loss, risk
 other things being equal, we seek positive incentives and avoid negative ones

how do incentives control our behaviour?

 Robinson & Berridge

 Dickinson & Balleine


1. Assessing drug reward in animals

2. How do drugs of abuse affect the brain?
 aand neurotransmission
3. Why do people persist in taking drugs?
 Pavlovian incentive learning
 Opponent process and Pavlovian conditioning
 Instrumental incentive learning
4. Treatment
 How do we treat addiction and are we effective?
5. conclusions

what is addiction?

 Addiction is not recreational drug use, but some proportion of recreational drug users
will be ‘addicted’
 APA, DSM-IV diagnosis of ‘substance dependence’
 an individual must exhibit 3 of the following in the past 12 months:
 tolerance – refers to reduction in the effectiveness of the drug
 withdrawal symptoms – changes in physiology that occur when the drug is
not taken
 increasing doses
 unsuccessful effort to reduce the intake of the substance
 a considerable amount of time spent in obtaining the substance or using it
 interference with important social, occupational, or recreational activities
 continuation of use of the substance despite recognition of the fact that doing so
causes physical and psychological problems
 addiction to both licit (e.g., nicotine, alcohol) and illicit (e.g., opiates, cocaine) drugs is a
leading cause of death and disability worldwide

requirement of adequate explanation of addiction

1. how do drugs of abuse work?

2. Why do people start taking drugs?
3. Why do people persist in taking drugs?
4. Can it lead to an effective treatment?

Assessing drug reward and drug taking in animals

 Pavlovian learning – conditioned place preference
 Instrumental learning – self administration
why are drugs rewarding? Mesolimbic dopamine system
 Mesolimbic system pathway is a dopaminergic pathway in the brain
 VTA  dopamine producing neurons are most rich in this area
 Cell bodies/neurons in the VTA project up to the nucleus accumbens to form

activation of human mesolimbic dopamine system

 Sell et al (1999)
 Use of fMRI
 Large changes in activity around the VTA
 Heroin increases brain activity in the VTA

 Fowler et al (2001)
 High levels of dopamine located around the nucleus accumbens
 Cocaine increases release of dopamine in the nucleus accumbens

Changes in dopamine release in rat nucleus accumbens: Nicotine

cocaine and amphetamine

 Amphetamines are psychostimulant drugs  speed up messages traveling between
the brain and the body
dopaminergic neurotransmission
 Dopamine is synthesised and stored in vesicles in the presynaptic neuron

 Dopamine binds to two receptors:

1. D1 dopamine receptors on the postsynaptic neuron raise the concentration of cAMP.
These are rewarding
2. D2 dopamine receptors on the pre-synaptic neuron reduce levels of cAMP in the
terminal and reduce further dopamine release. These can be aversive or punishing
 Dopamine is removed from the synaptic cleft via a reuptake pump called the
dopamine transporter
 Inside the pre-synaptic terminal monoamine oxidase (MAO) enzymes deactivate
 Dopamine remaining in the synaptic cleft is left deactivated

How do drugs of abuse increase dopamine?

 Cocaine
 Cocaine binds to and deactivates the dopamine transporter  increases the amount
of dopamine in the synapse, therefore allowing more dopamine to bind to the
postsynaptic receptors

 Opiates (pain relief, sedation, anxiety reduction and euphoria)
 Opiates act indirectly on other neurons that are regulating dopamine neurons
 Bind to opioid receptors  prevent GABA neurons from sending inhibitory signals
to the dopamine neurons  dopamine neurons are free to send more dopamine

theories of addiction - why do people persist in taking drugs?


 Olds and Milner (1954)

 Animals readily learned to push a lever to obtain electrical stimulation of various
brain regions (medial forebrain bundle, hypothalamus)
 4,500/hr 15 days +  medial forebrain bundle: a group of axons with cell bodies
located in the midbrain and which course to a variety of forebrain regions and include
axons of VTA dopamine neurons
 Reward could occur because of activation of particular parts of the brain and could
therefore be studied physiologically

 The brain has circuits which when activated give rise to the sensation of pleasure (the
pleasure pathway)
 This pleasure is both “liked” and “wanted”
 These brain circuits, normally activated in response to naturally occurring rewards
(e.g., water, sex, money etc) are ‘hijacked’ by drugs of abuse
Incentive sensitisation (Robinson & Berridge)

 Positive reinforcement model

 Addictive drugs share the ability to enhance transmission within the mesolimbic
dopamine pathway (increase dopamine)
 One function of this pathway is to attribute incentive salience (‘wanting’) to the
stimuli associated with the activation of the pathway. Such stimuli will attract or
pull us towards them
 Wanting (incentive salience) is not liking (the hedonic or affective responses to drugs)
 Dopamine mediates incentive salience (i.e., wanting)
 Repeated administrations of addictive drugs sensitise the mesolimbic dopamine
 This sensitisation is gated by the associative learning which causes incentive
value to be attributed to the act of drug-taking and to stimuli associated with
drug taking

 Attempts to integrate a psychological explanation with dopamine function
 obvious link to neurobiology
 consistent with the important role accorded craving by addicts
 does not accord drug withdrawal a causal status in addiction and thus could be applied
to many drugs

 lack of clinical evidence for sensitisation of drug responses among human addicts
 not much evidence for dissociations between wanting and liking in human addicts or
in animal studies based on drugs of abuse

opponent process models

 negative reinforcement model
 addiction is the result of compensatory responses that drive drug-withdrawal. The
motivation to take drugs shifts across the course of addiction
 A process is drug reward whereas B process is aversive
 Overall state or experience is the sum of A and B process
 B process is strengthened with use and weakened with disuse
 Across time B process increases and although drug administrations arouse the A
process, the B process has grown significantly and thus the drugs fail to produce an A-
state. They simply alleviate the B process

 Initially we take drugs because they are rewarding (recreational use), eventually
we take drugs to alleviate withdrawal
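The A and B process account above can be sketched numerically. This is a toy illustration only: `opponent_process`, the fixed size of the A process, and the growth rule and ceiling for the B process are assumptions chosen to show the qualitative pattern, and weakening of B with disuse is not modelled.

```python
def opponent_process(uses=20, a=1.0, b_max=1.2, growth=0.15):
    """Toy opponent-process model of repeated drug use.

    a       : size of the rewarding A process (held constant)
    b       : size of the aversive B process, strengthened with each use
    on_drug : overall state while the drug is active (A minus B)
    after   : overall state once the drug wears off (B alone, i.e. withdrawal)
    """
    b = 0.0
    history = []
    for _ in range(uses):
        on_drug = a - b                  # overall state is the sum of A and B
        b += growth * (b_max - b)        # B process is strengthened by use
        after = -b                       # aversive after-state
        history.append((on_drug, after))
    return history

for use, (on_drug, after) in enumerate(opponent_process(), start=1):
    print(use, round(on_drug, 2), round(after, 2))
```

Early uses produce a clearly positive state (recreational use). With repeated use the B process grows, the drug no longer produces a positive A-state, and taking it mainly relieves the increasingly negative after-state (withdrawal).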

 Explains many features of addiction
 Would meet DSM criterion for diagnosis of substance dependence
 Does not accord drugs of abuse any special role
 Can be reconciled with neurobiology
 May have fundamental importance and relevance to understanding causes of opioid
 Fails to explain persistent drug taking in the absence of a withdrawal state (e.g.,
cannabis, psychostimulants)
 Tolerance is not an inevitable consequence of drug exposure
 The model system (morphine analgesia) has little relevance to understanding addiction

How can withdrawal change behaviour?

 We must learn about the relevance of a reward to our current motivational state to
seek that reward – Dickinson & Balleine
 Withdrawal from opiates functions as a motivational state that enhances the value of
the drug, thereby enabling withdrawal to trigger drug taking but only in animals that
have had experience with the drug alleviating withdrawal
Hutcheson et al (2001)

treating addiction

 In 1898 the drug company Bayer introduced a wonder drug (heroin) as a cure for
morphine addiction, as a cough suppressant, and for other uses
 Other drugs had other therapeutic uses e.g., cocaine for tooth aches
 In 1995 the drug company Purdue Pharma introduced the products MS Contin (morphine)
and OxyContin (oxycodone – synthetic opioid) as revolutionary treatments for pain
 Use a drug to treat a drug addiction

 Agonist and antagonist ^^


 If learning is important, surely we can reduce the power of drug-associated stimuli?
E.g., via extinction such as cue exposure in cognitive behavioural therapy

Sell et al (1999)
Does cue exposure work?
 Conklin & Tiffany (2001)

Contingency management

 A type of behavioural therapy in which individuals are rewarded for not taking drugs
 Types of contingencies
 Paid in vouchers for negative drug test
 Allowed to work after negative drug tests
 Urine samples are collected multiple times each week (to detect brief periods of
abstinence) and abstinence is reinforced each time negative samples are submitted
 Can increase reward across time

Lussier et al (2006)
 One of the most effective methods

Improve social connectedness

 Primates are social species. We are sensitive to social cues, social isolation, and
dominance hierarchies
 In monkeys, social status predicts response to cocaine
 Social rewards can be effective rewards in contingency management therapies

Heilig et al (2016)

Are we getting better at treating addiction? No

 Hunt et al (1973) and Sinha et al (2011)



 Positive symptoms (something added in those with schizophrenia that is not present
in the general population)
 Hallucinations; audio and visual
 Delusions e.g., people believe that others are trying to kill them
 Irrational thoughts and beliefs e.g., thinking that there are messages left for you on tv
or the news
 Negative symptoms (something absent in those with schizophrenia that is present in the general population)
 Reduced affect (emotion)
 Catatonia (complete absence of movement or anything related to movement)
 Cognitive symptoms
 Deficits in working memory
 Inability to perform the Stroop task

Associative learning and schizophrenia

 Several associative learning experiments have provided evidence to explain the
positive symptoms of schizophrenia
 The focus of this research is the regulation of attention
 The two learning protocols used are:
 Blocking
 Latent inhibition
 Theory is that dysregulation in attention leads to the development of aberrant
salience – all cues are of equal importance

Blocking in schizophrenia

Latent inhibition in schizophrenia

Attention and schizophrenia
 Attend to all stimuli equally
 Unable to ignore irrelevant stimuli
 Contribute to hallucinations as they attend to all information meaning that thoughts
and patterns are more salient than they should be
 Contribute to delusions as you learn about what you attend to
 Learn about irrelevant cues so they are as real and important as the relevant cues

How does associative learning relate to the attentional theory of schizophrenia?

 Mackintosh – attentional theory – we attend to cues that are good predictors of
 Schizophrenia
 Attend to irrelevant stimuli  don’t display latent inhibition or blocking effect
 Leads to aberrant salience
 Delusions are based on explaining the over attending to irrelevant stimuli
 The more delusional the patient, the less able they are to ignore irrelevant cues

How does associative learning relate to dopamine hypothesis of schizophrenia?

 RW model
 Dysfunctional prediction error
 Aberrant salience  abnormal salience to internal representations  explanations
 Top-down process explains delusions
 Attempt to make sense of abnormally salient events
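For reference, the prediction error in the RW (Rescorla-Wagner) model is the difference between the outcome received and the summed associative strength of all cues present; each cue changes by a fraction of that error. The sketch below shows how an intact prediction error produces blocking. The function name, the learning rate `alpha`, the asymptote `lam` and the trial counts are illustrative assumptions.

```python
def rescorla_wagner(trials, alpha=0.3, lam=1.0):
    """Return associative strengths V after a sequence of trials.

    trials: list of (cues, reinforced) pairs, e.g. (("A", "B"), True).
    Update rule for every cue present: V += alpha * (outcome - sum of V).
    """
    V = {}
    for cues, reinforced in trials:
        prediction = sum(V.get(c, 0.0) for c in cues)
        outcome = lam if reinforced else 0.0
        error = outcome - prediction          # prediction error
        for c in cues:
            V[c] = V.get(c, 0.0) + alpha * error
    return V

# Blocking group: A is trained alone first, then A and B together
blocked = rescorla_wagner([(("A",), True)] * 20 + [(("A", "B"), True)] * 20)
# Control group: only the compound AB is trained
control = rescorla_wagner([(("A", "B"), True)] * 20)

print("blocking group B:", round(blocked["B"], 3))
print("control group  B:", round(control["B"], 3))
```

In the blocking group A already predicts the outcome, so the prediction error on compound trials is close to zero and B gains almost no strength. If the prediction error signal were dysfunctional, the redundant cue B would be learned about anyway, which is one way to read the reduced blocking reported in schizophrenia.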

Tutorial discussion – articles

1. What is the evidence for the dopaminergic hypothesis of schizophrenia? (Kapur 2003)

 Antipsychotics “dampen the salience of abnormal experiences and by doing so permit
the resolution of symptoms”. However, once patients stop taking them, dysregulated
neurochemistry returns.

Dopamine hypothesis of antipsychotic medication/action

 Antipsychotics increase the turnover of dopamine

Dopamine hypothesis of psychosis

 post-mortem studies showed abnormalities in dopaminergic levels in those
with schizophrenia
 neuroimaging studies showed an increase in the synthesis of dopamine, increase in
dopamine release in response to impulsiveness, and increase of synaptic dopamine in
schizophrenic patients
 psychostimulants that trigger the release of dopamine are associated with psychosis
and cause the worsening of psychosis in patients

2. How does the dopamine hypothesis and aberrant salience explain the positive
symptoms of schizophrenia? (Kapur 2003)

 it is proposed that in psychosis there is a dysregulated dopamine transmission that
leads to stimulus-independent release of dopamine.
 This neurochemical aberration usurps the normal process of contextually driven
salience attribution and leads to aberrant assignment of salience to external objects
and internal representations
 Thus, dopamine, which under normal conditions is a mediator of contextually relevant
saliences, in the psychotic state becomes a creator of saliences, albeit aberrant ones
 it is postulated that before experiencing psychosis, patients develop an exaggerated
release of dopamine, independent of and out of synchrony with the context. This leads
to the assignment of inappropriate salience and motivational significance to external
and internal stimuli
 Once the patient arrives at such an explanation, it provides an “insight relief” or a
“psychotic insight” and serves as a guiding cognitive scheme for further
thoughts and actions  try to make meaning of internal stimuli as a coping
 Unable to discriminate real from fake because all stimuli have the same weight
3. What is learned irrelevance and how does this relate to positive symptoms of
schizophrenia? (Morris et al 2013)

 Schizophrenia is associated with deficits in attention and recent theories of psychosis
have argued that positive symptoms such as delusions and hallucinations are related to
a failure of selective attention
 Aim of the experiments was to assess whether healthy adults and people with
schizophrenia discriminate previously predictive and non-predictive cues
 Learning about non-predictive cues is correlated with more severe symptoms
 Blocking effect does not occur in people with schizophrenia (they essentially learn
the same about cues A and B)

 People with schizophrenia will fail to discriminate between relevant and irrelevant
 The amount of learning about the irrelevant cues will be related to the severity of

Results and discussion

 People with schizophrenia did not show a significant learned irrelevance effect
 Learners with schizophrenia learned significantly more about non-predictive cues

Results and discussion of experiment 2

 Easier task
 Evidence of normal bias towards predictive cues and bias against nonpredictive cues
existed in both groups
 Normal selective learning can occur in schizophrenia under easier task conditions
 Patients with more severe positive symptoms failed to ignore non-predictive cues, suggesting the
bias also varies with the severity of positive symptoms


Attachment and love
1. Mother – infant attachment
 Implications and mechanisms for infant wellbeing and development
 Hormonal and brain mechanisms
2. Fathers
3. Infant – mother attachment
 Imprinting
 Conditioning
 Attachment theory
4. Adult couples
 Hormonal and brain mechanisms
 Are we addicted to love?
5. Unifying principles
 Common strategies and common biology

Mother infant attachment: deprivation

 The infant – parent relationship is the first attachment relationship
 The formation of an emotional bonding between infants and their mothers is essential
 The absence of such a bond can have profound and long-lasting effects
 E.g., Romanian orphans and reactive attachment disorder

O’Connor & Rutter

Mother infant attachment: normal variation

 Even normal variations in maternal care can have profound influences on
 Individual differences in rat maternal care
 High versus low levels of licking/grooming of pups (LG)
 High versus low levels of “arched back” nursing (ABN)
 Liu et al. (1997)
 Examined the relationship between naturally occurring variations in maternal care and
stress responding of their offspring
 Studied, as adults, the offspring of high versus low nursing mothers
 corticosterone is the stress hormone

mother infant attachment: gene x environment interaction
Kaufman et al (2004)
 101 children: 57 were removed from their parents’ care by the state of Connecticut
within the past 6 months because of allegations of abuse and/or neglect, and 44 were
community controls
 Physical abuse, sexual abuse, neglect, emotional abuse, and exposure to domestic
 Assessed for depression
 Assessed for presence and frequency of social support
 Talk about personal things; count on to buy the things they need; share good news
with; get together with to have fun and go to if they need advice
 Genotyped on the 5-HTT transporter (serotonin transporter)
 Genotype: investigate the genetic constitution

mother – infant attachment: hormonal and brain mechanisms
 Vasotocin
 Vasotocin has evolved into two distinct hormones: oxytocin and vasopressin
 Oxytocin and vasopressin are important for mammalian reproductive and attachment

 Dam will pick up her own pups to ensure that they survive
 Naïve virgin rat will not

Pup retrieval test

 Pedersen et al (1994) – blocking oxytocin
 Marlin et al (2015) – increasing oxytocin
Father - infant

Father – infant: brain mechanisms

 Increasing vasopressin in the male prairie vole brain increases paternal behaviour
 Decreasing vasopressin in the male prairie vole brain decreases paternal behaviour
Wang et al (1994)

Kozorovitskiy et al (2006)

Parent – infant attachment: hormonal and brain mechanisms

 In humans and other primates, the role of oxytocin and vasopressin in parent infant
interactions is more subtle
 E.g., females of most primate species find newborns and infants highly attractive and
will readily make contact and display maternal behaviour
 Oxytocin also contributes to mother-infant interactions in primates
 Increasing oxytocin in non-human primates increase female nurturing behaviour
towards infants
 Decreasing oxytocin levels in non-human primates decreases female nurturing
behaviour towards infants
 Oxytocin also plays a central role in birth and breastfeeding
 These hormones co-ordinate biological and behavioural responses in parents
Infant – parent attachment

 What variables determine the formation of an emotional bond between infants and
their parents?
 Historically there have been very different views about how infant – parent bonds are
formed and what the consequences of these bonds are

Infant – parent attachment: imprinting

 Early view of attachment behaviour

 Behaviour is elicited by stimuli (‘releasing stimuli’ or ‘sign stimuli’) in the
 The elicitation is innate: without any prior learning or experience
 The behaviour is fixed, complex, and automatic (“fixed action pattern” and is
probably just a fixed circuit in the nervous system)

 Attachment by an infant to its mother can be explained by these processes

 Newly hatched goslings or ducklings follow and become socially bonded to their
 Lorenz called this attachment process “imprinting”
 An image of the mother is stamped irreversibly on the nervous system
 Lorenz distinguished imprinting from learning
 Innate and formed immediately
 Irreversible (no forgetting)
 Only develops during a brief “critical period” (the first few hours after hatching)

Infant – parent attachment: conditioning

 early conditioning theorists (1900-1960) held a very different view about infant mother
 JB Watson (1928) “Psychological Care of Infant and Child”
 Attachment is no different to any other behaviour, it is not innate, instead it is
acquired through experience, specifically through conditioning
 during early human infancy  Mother is CS, breast milk is US and emotional
attachment is CR
 because attachment behaviour is a learned response, parents should minimise any
displays of affect in case this encourages attachment in the infant

Infant – parent attachment: Harlow

 experimentally tested the predictions of conditioning theory using surrogate mothers

 infant monkeys were raised by inanimate ‘surrogate’ mothers
 cloth mother: wire frame covered in cloth and heated with lamp
 wire mother differs only by the absence of cloth cover
 for half the monkeys, the cloth mother contained food (milk); for the other half, the wire
mother contained food
 if food is the critical variable for attachment, then monkeys should prefer the mother
that fed them
 if contact is the critical variable, then all monkeys should prefer the “cloth mother”

Harlow 1958
 no evidence that nursing (feeding) is a critical variable in infant monkey attachment to
a mother
 contact comfort is a far more important variable than nursing in determining infant
attachment as measured either by time spent with surrogate or by responses to a
fearful stimulus
 later research showed that many other variables were just as important as contact
comfort (e.g., rocking and cradling)
 led to the development of attachment theory

infant – parent attachment: attachment theory

 sensitive responding by the parent to the infant’s needs results in an infant who
demonstrates secure attachment, while lack of such sensitive responding results in
insecure attachment
 secure infants either seek proximity or contact or else greet the parent at a distance
with a smile or wave
 avoidant infants avoid the parent
 resistant/ambivalent infants either passively or actively show hostility toward the

 attachment styles in adults are based on experiences (‘mental models’) during

 secure adults find it relatively easy to get close to others and do not worry about being
 avoidant adults are uncomfortable being close to others; they find it difficult to trust
others completely and difficult to allow themselves to depend on others. Avoidant adults
are nervous when anyone gets too close
 anxious/ambivalent adults worry that their partner doesn’t really love them or won’t
want to stay with them. Want to merge completely with another person and this desire
sometimes scares people away
adult attachment: pair bonds
 human adult couples are typically monogamous but only 5% of all mammalian
species are monogamous, the rest are polygamous
 is there a brain mechanism for monogamy?
 If so, could we design a ‘commitment pill’?
 How do the behavioural and brain mechanisms for adult relationships differ from
parent-infant relationships?
 Insight from studies of two species of rodents (voles)
 Prairie voles: form lifelong monogamous bonds
 Montane voles: polygamous

Adult attachment: partner preference test

 Insel and Young (2001)

adult attachment: oxytocin and monogamy in females

adult attachment: vasopressin and monogamy in males

 increasing vasopressin
in montane/mountain
adult attachment
 Has a strong biological basis
 The same hormones which mediate parent-infant interactions contribute to formation
of stable long-term monogamous relationships in adults
 Oxytocin: mother infant attachment and monogamous relationships
 Vasopressin: father infant and monogamous relationships
 Differences in mating strategies are due, at least in part, to differences in levels of these
two key hormones and their receptors

Are we addicted to love?

 Attachment behaviour and drug use are both examples of motivated behaviour
 Are the behavioural and brain mechanisms for these different motivations similar or different?

An opponent process account of attachment

are we addicted to love?
separation distress/breakup
 Loneliness
 Crying
 Loss of appetite
 Depression
 Sleeplessness
 Irritability
 Psychological pain
 Crying
 Loss of appetite
 Depression
 Insomnia
 Irritability
Unifying principles of attachment

 What do these studies of infant, maternal and romantic attachment tell us about
attachment motivation?
 Behaviour: approach the individual
 Learn the identity of that individual
 Invest in this individual while rejecting others
 Biology: oxytocin and vasopressin are important for attachment behaviour in
mammals, including humans
 Interactions between these hormones and reward circuit are important for attachment
behaviours (i.e., for approaching and learning about the individual)
 There are shared behavioural and brain mechanisms for different kinds of attachment
and love
 These mechanisms are slightly different for males and females within a species
 Although there are some important differences, the behavioural and brain mechanisms
for attachment are also shared across mammalian species, including humans,
indicating that they have been conserved over evolutionary history
 The brain mechanisms for attachment and love overlap with those underlying other
motivated behaviours


The opponent process model

 Originally applied to visual research
 Yellow/blue
 Red/green
 Black/white

 Activation of one receptor leads to the inhibition of the other in the pair
 The first member of the pair (blue) is excited and then fatigued
 When the blue is fatigued, inhibition of the opponent (yellow) is reduced
 This leads to seeing the opposite colour and produces the after image

The A process
 The A process occurs in response to the administration or ingestion of the drug
 The physiological response that occurs to a particular substance – this will change
depending on the type of drug e.g., heroin vs amphetamine
 It can be thought of as the process which produces the feeling of euphoria or the
rush of consuming the drug
 The A state never changes, it will always produce the same response to the drug.
It is the B process that changes over time

The B process
 It produces the opponent physiological effect to that of the drug
 The B process is slower to activate and lasts longer than the A process
 As the figure displays, over time the A process remains the same, but the B process
grows in length and strength of activation. It can be thought of in the same way that a
muscle grows with repeated use
 The B process is thought to explain tolerance, withdrawal, and craving
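
The A and B processes described above can be sketched numerically. This is only an illustration: the function name and every number (starting strength, growth per exposure, ceiling) are assumptions chosen for the example, not values from the lecture.

```python
def opponent_process(n_prior_exposures, a_strength=1.0,
                     b_initial=0.2, b_growth=0.1, b_max=1.5):
    """Sketch of one drug exposure under the opponent process model.

    All parameter values are hypothetical and chosen only for illustration.
    Returns (a, b, net_during_drug, net_after_drug).
    """
    a = a_strength  # A process: the same on every exposure
    # B process: grows with repeated use, up to an assumed ceiling
    b = min(b_initial + b_growth * n_prior_exposures, b_max)
    net_during_drug = a - b  # felt effect while the drug is active
    net_after_drug = -b      # A has ended but the slower B lingers
    return a, b, net_during_drug, net_after_drug


first_use = opponent_process(0)
later_use = opponent_process(10)
```

With these assumed numbers the A value is identical on both exposures, the net effect during the drug is smaller on the later exposure (tolerance), and the after-effect is more negative (withdrawal).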

Limitations of the opponent process model

 Purely physiological account of addiction
 It is a non-associative model of addiction
 Cannot account for the role of conditioned cues or environmental stimuli in addiction
 Cues and context are critical and known to trigger relapse
Classical conditioning model of tolerance (Siegel 1978)

Classical conditioning and addiction

 According to this model:
 Before persistent drug taking, the neutral stimuli are the cues associated with
drug taking
 Needle and spoon for heroin
 Might be the environment the drug is taken in

 US is the physiological effect of the drug on the body

 E.g., increase in HR/blood pressure from amphetamine

 UR is the opponent process your body produces to deal with the effects of the drug
 E.g., a decrease in HR/blood pressure from amphetamine
 This process is designed to bring the body back to homeostasis

 According to this model after conditioning has occurred

 The CS is the drug-associated cues
 Needle and spoon for heroin
 Environment the drug is taken in

 CR is the opponent process your body produces to deal with the effects of the drug
 E.g., decrease in HR/blood pressure for amphetamine
 This process is designed to bring the body back to homeostasis
 The CR occurs following exposure to the CS alone
 So they intensify craving and explain withdrawal and tolerance

Example: drug cues in health campaigns to reduce drug taking

 Cigarette quit smoking campaigns
 If you include an image of a cigarette or anything associated with smoking you end up
producing a CR which induces smoking cravings
 Increases the likelihood of relapse smoking

Drug environments and the Siegel model

 When people take drugs, the environment becomes a drug associated cue
 The environmental cue acts as a predictive cue that you will take the drug
 This produces activation of the CR (opponent process to drug effect)
 As people increase the dose, the environmental cue signals the body to activate the CR
 If someone addicted to drugs takes the drug in a novel environment, the CS is not
present and so they do not activate the CR
 Leads to overdose (commonly seen in injecting room situations)
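
The overdose logic above can be sketched as simple arithmetic: the compensatory CR is subtracted from the drug effect only when the drug-associated cues are present. The function name and the numbers are hypothetical, in arbitrary units, and only illustrate the reasoning.

```python
def net_drug_effect(drug_effect, learned_cr, cue_present):
    """Net physiological effect of a dose (arbitrary, hypothetical units).

    The compensatory CR only occurs when the CS (drug cues or the usual
    drug-taking environment) is present.
    """
    compensation = learned_cr if cue_present else 0.0
    return drug_effect - compensation


# Same dose, same learning history; only the environment differs
usual_environment = net_drug_effect(10.0, 6.0, cue_present=True)
novel_environment = net_drug_effect(10.0, 6.0, cue_present=False)
```

In this sketch the same dose has a much larger net effect in the novel environment because nothing triggers the compensatory response.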

Classical conditioning model of tolerance

 Accounts for tolerance and withdrawal
 Explains the role of the drug associated cues
 Cannot explain persistent drug taking in the absence of withdrawal and tolerance
 Tolerance does not always develop following drug use

Incentive learning model (Dickinson and Balleine)

 Animals learn the relevance of a reward in relation to their current motivational state
 Withdrawal state  acts as a motivation state
 Enhances the incentive value of the drug as it can alleviate the withdrawal state
 This can only occur in animals that have experienced the drug alleviating withdrawal
 Seek drug to remove withdrawal

 Animal studies that demonstrate this effect

 Trained animals to self-administer
 Gave half of the animals naloxone (mimics and increases the heroin withdrawal effect)
 They then tested drug seeking (lever pressing in the absence of the drug) and found that the
naloxone group, with increased withdrawal symptoms, showed more persistent drug seeking
than the control group
 Demonstrates that when the animal’s motivation state is in a withdrawal state:
 Increase in craving
 Increase in drug seeking

Opponent process and attachment

 The A process is the reward from being around a loved one

 The B is the separation distress from being away from them
 To begin with, contact with the loved one activates the A process  produces the
enjoyable affective response
 This also activates the B process, so that when you are not with your loved one  separation
distress (not for long)
 Over time the B process grows
 Separation from loved one becomes much worse  increased separation anxiety
 The A process remains the same does not grow
 The B process is now larger than the A process
 Stay together to simply prevent the distress of separation
 BUT don’t have the same excited and affective response to your loved one

 Children that were sick in hospital for extended periods of time

 Parent only allowed to visit once a week
 In this scenario – the parents have created an attachment A process with the children
prior to them being hospitalised
 The B process  when the children are separated from the parents, they experience
separation distress and anxiety
 When the parents visit intermittently, they activate the A process again
 The B process separation anxiety grows over time
 When parents visit the children are in a state of distress and often angry at being
 They act with aggression to the parents
 These findings have led to open access for parents to visit children

Discussion questions
1. What are the behavioural similarities and differences between drug addiction and
attachment relationships?

Similarities
 Over time the B process grows  leads to increased behaviour i.e., taking more drugs,
spending more time with loved ones
 Participate in the behaviour to alleviate withdrawal (i.e., withdrawal from drugs and
distress from not seeing loved ones)

Differences
 Cues lead to increased behaviour of taking drugs

2. What are the brain similarities and differences between drug addiction and attachment
relationships?

Similarities
 Mechanism is the same for addiction and attachment  both are associated with a positive
and negative outcome
 Increase in dopamine to reward

Differences
 Physiological response never changes (A process) – always produces the feeling of
euphoria when taking drug
 Excitement and affective response to your loved one is not the same (opponent process
and attachment)
 Withdrawal state in drug taking is motivation  enhances incentive value of drug
 Attachment involves oxytocin and vasopressin

3. Which theories of motivation explain these similarities

 Opponent process model – A process and B process

 Classical conditioning model of tolerance – US, UR, CR, CS
 Incentive learning – related to reward and value
Topics covered:
 Associative learning
 Pavlovian/instrumental transfer (PIT) – how predictive relationships influence
our interactions with the environment
 Extinction of Pavlovian conditioning – updating predictive relationships in our
environment
Pavlovian conditioning
Why is predictive learning important?
 Learning about stimuli that predict important events is critical for successful
adaptation. It allows us to use the past and predict the future and thereby behave
accordingly in the present
 Pav conditioning is used to study how we learn about these predictive relationships,
specifically, it is used to demonstrate how we learn that certain stimuli in our
environment predict biologically significant events
 Pavlovian conditioning is expressed as reflexive behaviours, rather than voluntary
behaviours. This means that the subject is a passive observer and does not need to
interact with its environment
 A classic example is Pavlov’s dogs that began salivating to a bell when they learned it
predicted the arrival of food

Appetitive vs aversive USs

 The two types of US engage distinct motivational systems
 Positive outcomes (food, drugs, reward)  appetitive motivational system
 Negative outcomes (danger, pain, frustration)  aversive motivational system
 After conditioning, CS’s will come to activate the motivational system of their
associated US

Discrete vs context CS’s

 CS’s can be classified in two distinct categories: discrete and contextual
 Discrete CS’s have a definitive onset and offset. They are usually auditory (e.g.,
tone) or visual (a light). The US is typically delivered just before or just after the
termination of a discrete CS. E.g., tone or light
 Contextual CS’s have no onset or offset. They typically refer to the physical
environment where the US is delivered. E.g., animal chamber

Excitatory and inhibitory conditioning

 Typically, we think about Pavlovian conditioning as learning about a stimulus that
predicts the arrival of an event. Positive association formed between the CS and the
US: CS  US. This is called excitatory conditioning
 However, we can also learn that a CS predicts the omission of an event. A negative
association is formed between the CS and the US: CS  No US. This is called inhibitory
conditioning
Various types of
studying fear conditioning in the lab

Measuring fear in rodents

 Freezing is a species-specific behaviour exhibited by rodents when scared
 Characterised by the absence of all movement except for breathing
 Fear can also be measured by heart rate, arterial pressure, and ultrasonic vocalisations
(rarely used because they are difficult to record and require surgery to implant)

Stages of memory formation

 Acquisition  consolidation  retrieval/expression
what is needed to form a fear memory
 Brain must process information about the CS
 Brain must process information about US
 The brain must associate the two types of information to establish CS-US memory
 The brain must then retrieve this memory to coordinate the conditioned fear responses

Where in the brain are these processes occurring?

Processing the CS
 The brain regions involved in processing the CS depend on the modality of the CS
that is being used
 Auditory CS (e.g., tone, clicker): auditory thalamus (fast), auditory cortex (slow but more
 Visual CS (e.g., light): visual thalamus, visual cortex
 Context (e.g., conditioning chambers): hippocampus

The hippocampus

 Located in the medial temporal lobe

 Important in the formation of new memories, learning and emotions
Lesion of the hippocampus

 Rats received context fear conditioning: they were placed in a conditioning chamber
and received a shock a few seconds later
 4 groups
 G1: received sham lesion of the hippocampus before conditioning (control group)
 G2: received a lesion of the hippocampus before conditioning
 G3: received sham lesion after conditioning (control group)
 G4: received lesion of the hippocampus after conditioning
 G1, G2 and G3 displayed a similar amount of fear: this implies that lesion of
the hippocampus before conditioning (G2) did not prevent the freezing response
 Results are surprising  impairment in G4 suggests that the hippocampus is
required for context fear conditioning. But if this were true, G2 should also have
been impaired

Context: configural and elemental features

 Context: shared office space

 Composed of elemental features: computers, chairs, light, storage
 By default, we establish a configural representation of this context  bind all
elements together to form a complex and accurate representation of what this office
space looks like
 Advantages:
 It allows pattern completion: retrieve the configural representation by simply
perceiving a few of the elements
 Reduces cognitive load by preventing the processing of each element
 Our brain will by default establish a configural representation of the context
 The hippocampus is responsible for establishing this representation  if the hippocampus
is unavailable  each element is processed individually by the various cortical areas

Dual process theory of the hippocampus

 3 central assumptions
1. Context can be processed as an independent set of features e.g., shape, floor, texture,
odour, illumination. This involves cortical areas
2. A unique representation of a context can be formed, where all independent features
are bound together. This involves the hippocampus
3. By default, the unique representation dominates and prevents the processing of each
single element

 Group 2
 The hippocampus is lesioned during conditioning  rats cannot form a
configural/unique representation of the conditioning chamber  single elements are
processed independently in the cortex and are associated with shock  associations
retrieved at test and the animals displayed fear
 Group 4
 The hippocampus was intact  unique representation was formed and was associated
with shock
 Hippocampus was lesioned after conditioning  prevents the retrieval of the unique
context representation  no fear displayed
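
The dual process account of the four lesion groups can be written as a small decision rule. This is a sketch of the reasoning only; the function and argument names are made up for the example.

```python
def predicted_context_fear(lesion_before, lesion_after):
    """Predict whether rats freeze at test under the dual process account.

    lesion_before / lesion_after: True for a real hippocampal lesion,
    False for a sham lesion or no surgery at that time point.
    """
    if lesion_before:
        # No hippocampus at conditioning: single elements are associated
        # with shock in the cortex, and those associations survive to test
        return True
    # Hippocampus intact at conditioning: the configural representation is
    # associated with shock, so a later lesion prevents its retrieval
    return not lesion_after


groups = {
    "G1 sham before": predicted_context_fear(False, False),
    "G2 lesion before": predicted_context_fear(True, False),
    "G3 sham after": predicted_context_fear(False, False),
    "G4 lesion after": predicted_context_fear(False, True),
}
```

Only G4 is predicted to show no fear, which matches the pattern that looked surprising in the lesion experiment.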

Processing the US

 Just like the CS, the brain regions involved in processing the US depend on the
nature of the US being used
 Foot shock  somatosensory thalamus and somatosensory cortex

Site of the CS-US association: amygdala

 Structure located in the medial temporal lobe and plays an important role in the
processing of emotional information

The basolateral amygdala (BLA)

 BLA consists of:
 Lateral amygdala
 Basal amygdala
 Basomedial amygdala
 BLA receives projections from brain regions that process the CS and US. Thus, it is
widely agreed that the BLA is the site of CS and US convergence
 Where the CS-US association is formed
 Lateral amygdala – discrete fear conditioning (receives projections from the visual
and auditory cortices and thalamus)
 Basal amygdala – context fear conditioning (receives projections from the hippocampus)

Inactivation of the BLA

 Studies have demonstrated the role of the BLA in fear conditioning by temporarily
inactivating the structure at various points in time during training and testing
 Experiment:
 Rats were presented with pairings of a tone and shock in one context
 24 hours later  2 tests  one tested the fear to tone in different context and the
second test assessed fear to the conditioning context
 Disrupted activity in the BLA by infusing a drug called muscimol (GABA-A receptor
agonist – activates these receptors)
 GABA is the main inhibitory neurotransmitter  activating its receptors produces
inhibition of neural activity
 Control animals received saline
 Results:
 Control animals (saline-saline) showed considerable fear to both the tone and context
 Muscimol-saline group showed little fear to the tone and to the context  suggests that
BLA is required during the acquisition of fear memories about discrete cues and contexts
 Saline-muscimol group showed no fear to the tone or context  BLA is required for
the retrieval/expression of the fear memory
 Muscimol-muscimol showed no fear to the tone or context
 Thus, BLA is required for the acquisition and retrieval/expression of conditioned fear

Cellular molecular changes

 The formation of memory requires a chain or cascade of molecular events in the
neurons supporting memory  events produce durable physiological changes in the
neurons that are believed to be the underlying substrates of the memory
 The cascade of events involved in the formation of Pavlovian fear memories in the
BLA are described in the picture above – they start at the time of acquisition and end
once consolidation takes place
 Consider 2 events:
1. Activation of NMDA receptors at the time of acquisition – NMDA receptors are a type of
glutamate receptor  excite the post-synaptic neuron when activated  initiates the
cascade of events
2. Synthesis of new proteins during consolidation  leads to long-term changes in the neurons

Blockade of NMDA in the BLA

 NMDAr (NMDA receptors)

 Study:
 Same basic design as the one previously shown
 CS was paired with foot shock and animals were later tested for fear to the CS and context
 Infused drug called ifenprodil  selectively blocks NMDAr
 Control rats: received “vehicle”  does not contain the active drug and does not
produce any behavioural or neural changes

 Results:
 blocking NMDAr before training impaired the acquisition of fear to the tone and
the context: animals that were infused with either a high dose or a low dose of
ifenprodil showed less fear than animals infused with vehicle.
 By contrast, infusions of ifenprodil before the tests, as seen from the graph on the
right, did not impair expression of fear to either the tone or context, as animals that
were infused with both doses of the drug showed as much fear as control animals.
 Data indicates that NMDAr activation in the BLA is required for the acquisition but
not the expression/retrieval of fear

Blockade of protein synthesis

 Protein synthesis is another important feature of memory formation and occurs during
the consolidation stage
 The synthesis of new proteins allows the implementation of structural and
physiological changes in neurons that support a memory
 Study:
 After CS-US pairings, during the consolidation stage, anisomycin was infused into the BLA
 Anisomycin inhibits the synthesis of new proteins
 Tested 24 hours later for fear to the tone
 Results:
 Anisomycin impaired fear in a dose-dependent manner  the higher the dose the
stronger the impairment that was observed
 Data indicate that protein synthesis in the BLA is necessary for consolidation of the
fear memory

 BLA is required for:
 Acquisition of conditioned fear
 Consolidation of conditioned fear
 Retrieval/expression of conditioned fear

The central nucleus of the amygdala (CeA)

 Lies medial to the BLA
 Receives excitatory projections from the BLA
 Projects to brainstem structures mediating various components of fear responses
e.g., projects to the ventrolateral periaqueductal grey (PAG) which is critical to the
freezing response
 CeA is viewed as the output structure of the amygdala that coordinates fear responses
Neural model of Pavlovian fear conditioning

 one model of Pavlovian fear conditioning is that the BLA and CeA function serially,
where the CS-US association is formed and stored in the BLA
 Subsequent retrieval of the memory in the BLA then activates the CeA, which
coordinates fear responses through its projection to the brainstem.
 Although this serial model of the amygdala has been widely accepted, current
evidence suggests that it is an oversimplified view.
 The main issue with this model is that it implies a very restricted role for the CeA. In
fact, the role of the CeA may be more important than originally thought.

Inactivation of the CeA

 left panel experiment
 rats were conditioned to fear a tone and were infused with muscimol into the CeA
immediately before test.
 During this test, rats with CeA inactivation showed less fear during the test, indicating
that the CeA is critical for fear expression.
 When these rats were tested drug-free 24 hours later, there was no impairment in fear
expression, thus supporting the serial model of the amygdala in fear conditioning
 However, the experimenters also assessed the effects of CeA inactivation before conditioning
 Right panel experiment
 rats that were infused with muscimol into the CeA before conditioning showed little
fear when tested 24 hours later
 If the CeA was only involved in fear expression, its inactivation should have left fear
at test intact: the CeA was not inactivated at that time.
 These data therefore suggest that the CeA is involved in both the acquisition and
expression of fear memory, and contrasts with the serial model of the amygdala.

Limitations of the serial model

 Some data suggest that fear conditioning and the interactions between amygdala
subnuclei are more complicated than the model suggests.
 For example, rats can learn conditioned fear in the absence of the BLA if they are
overtrained. Thus, the CeA may support learning of conditioned fear in some cases.
 Furthermore, studies using motivational systems other than fear, such as appetitive
conditioning and a phenomenon known as pavlovian-instrumental transfer, suggest
that the BLA and CeA function independently, not serially

 Formation of the CS-US association occurs in the BLA
 Activity in the BLA is important for acquisition, consolidation, and expression of
conditioned fear
 NMDAr activity in the BLA is critical for acquisition
 Protein synthesis in the BLA is critical for consolidation of conditioned fear memories
 The serial model of the amygdala suggests that the CeA is important for the expression
of conditioned fear, but this view may be oversimplified and the role of the CeA
might not be restricted to fear expression


Instrumental conditioning

Topics covered
 Goal directed behaviours vs habitual behaviour
 How we assess whether behaviours are goal directed or habitual in the lab
 Neurobiology of goal-directed behaviour and habitual behaviour
Instrumental behaviours
 Pavlovian learning is expressed as reflexive rather than voluntary behaviours
 However, we also learn about predictive relationships by interacting with our
environment through volitional or voluntary behaviours
 Instrumental behaviours are ones that we perform voluntarily e.g., driving home
from university
 Instrumental learning occurs as a result of performing actions and learning about
their consequences

Goal-directed actions vs habits

 Instrumental actions are initially very cognitively demanding e.g., driving a car (hands
at 10, check mirrors, check side view mirrors)  each of these behaviours is directed
towards a specific goal – in other words these actions are goal directed
 After years of practice, these actions become seamless and we can do them without
thinking  the cognitive demand of driving a car lessens and actions while driving
become automatic
 These goal directed actions have now become habitual

Goal-directed actions
 Goal-directed actions are those that we and other animals perform in order to satisfy our
basic needs and desires
 Goal directed actions have 2 main characteristics
1. They rely on the presence of a causal relationship between their occurrence and their consequence i.e.,
contingency requirement  you perform an action because you know that doing so
will produce the particular outcome you have in mind e.g., press coke button on
vending machine = receiving coke
2. They depend on value attributed to their consequence i.e., goal requirement –
outcome of performing that action is valuable to you e.g., press button for coke
because of thirst  drinking coke will quench thirst
 Goal directed actions are said to be driven by action-outcome (A-O associations)
 GDA are essential to survival, allowing us to interact with the environment, and they
are flexible. However, they are cognitively demanding

 Habits emerge after GDA have been overtrained
 Habits are insensitive to contingency and value requirements
 Driven by stimulus-response (S-R associations)
 For example, you have been working on level 5 of your work building for years and
years, and one day you are moved to level 7. However, you still find yourself getting to
work, going into the lift, and hitting the button for level 5. Pressing 5 has
become a habit.
 The antecedent stimuli (being at work, using the same lift, seeing co-workers, the same
time of day) all elicit the response of pressing level 5 instead of 7
 Habits occur without cognitive oversight, freeing up resources for other tasks
 Important for refining and perfecting motor skills; stable and long lasting
 Relatively inflexible and it can be difficult not to perform them once they’ve been
established if our circumstances change
 The S-R association that leads to the performance of habits can also lead to
maladaptive behaviours such as addiction
Assessing instrumental behaviours in the lab

 In labs we can assess instrumental behaviours by training rodents to perform actions
to obtain outcomes  often we train them to press a lever in a conditioning chamber
to gain access to food or sucrose solution
 Rats are usually hungry when placed in chamber  therefore love these rewards and
quickly learn that pressing the lever produces the outcome
 These outcomes are delivered into a magazine which is a food dispenser in the wall
 Often these chambers are equipped with two different levers on opposite sides  2
different actions can be performed to get different outcomes
 These chambers have variety of visual and auditory stimuli

Goal-directed actions vs habits

 A fundamental principle in instrumental conditioning is establishing whether the
behaviour is goal-directed or habitual
 GDA behaviours are sensitive to outcome value and the contingency between the
action and the outcome. Habits on the other hand are not
 Outcome devaluation: manipulating the value of the outcome
 Contingency degradation: manipulating the contingency between the action and the outcome
 If animals show sensitivity to these manipulations, they are said to be GD, if they
are not, they are said to be habitual

Outcome devaluation
 Used to assess whether behaviours are GD or habitual
 3 stages
1. Trained to perform 2 different actions for two different outcomes e.g., right lever =
food pellet, left lever = sucrose solution
2. Devalue the outcome (devalue = the reduction or underestimation of the worth or
importance of something)
 2 main approaches
 Sensory specific satiety – involves giving the animal free access to one of the foods
either the pellets or the sucrose. The animals fill themselves up with the specific food
and don’t want anymore
 Once outcomes are devalued  test is given immediately
 Conditioned taste aversion (CTA)
 One of the outcomes is given freely to the rats for some period of time e.g., 30
 They usually fill up on it
 Immediately after, they are injected with lithium chloride  makes them feel mildly sick
 Once the food has been paired with sickness, its appeal is completely diminished 
tested on the day after the last pairing of the food and the lithium chloride
3. Choice test
 Given the opportunity to press the two levers
 Successful devaluation is shown when the rats select the “valued” lever – the one
that delivered the food outcome they were not sated on or the one that was not paired
with lithium chloride (the one that delivered the food outcome that has now been
 It’s important to note that these tests are conducted in extinction – where no
outcomes are delivered  this ensures the performance seen at test reflects the
animal’s knowledge of what was encoded during the training stage and precludes the
animal from using any immediate feedback to adjust their actions
Lab example
 Rats were trained to perform two actions that led to two outcomes
 They were given outcome devaluation by specific satiety and were immediately tested
in a choice test

 Devaluation effect  this means that the rats selectively chose to perform the action that
led to the still valued outcome, i.e., the one they were not sated on
 This devaluation effect is indicative of GD behaviours

Contingency degradation

 During contingency degradation, rats might initially learn that pressing the right lever
produces food pellets and pressing the left lever produces sucrose solution.
 In positive contingency – the probability of earning the outcome given performance
of the action is greater than the probability of gaining the outcome in the absence of
the action
 Degrade contingency  arranging it so that the outcome is freely delivered without
having to press the lever e.g., the left lever might still earn sucrose solution, but the
sucrose solution is also delivered into the magazine without having to press the lever
 the probability of the outcome given the action is the same as the probability of the outcome
in the absence of performing the action
 When contingencies are degraded, GD animals will cease performing the action
with the degraded contingency
 Same refers to the action that earns the same outcome as the freely delivered one, i.e., the
degraded action
 Different refers to the action that earns the outcome that is different to the freely
delivered one, i.e., the non-degraded action
 Responding more on the non-degraded action than on the degraded action is referred to as a
contingency degradation effect  indicative of GDA
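
The comparison of the two probabilities described above can be summarised as a single difference, often written ΔP. The sketch below is only illustrative: the function name and the probability values are hypothetical.

```python
def contingency(p_outcome_given_action, p_outcome_given_no_action):
    """Delta-P: P(outcome | action) minus P(outcome | no action)."""
    return p_outcome_given_action - p_outcome_given_no_action


# Hypothetical values: the lever earns sucrose on half of the presses
intact = contingency(0.5, 0.0)    # no free sucrose  -> positive contingency
degraded = contingency(0.5, 0.5)  # free sucrose just as likely -> degraded
```

A positive value means the action makes the outcome more likely; a value of zero means pressing the lever adds nothing, which is the degraded case.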

From actions to habits

Goal-directed actions develop into habits in one of 2 ways
1. Through overtraining  repetition
2. Due to the schedule of reinforcement

 Varying the amount of training the rats receive
 E.g., three groups of rats were trained to learn that two actions led to two different outcomes
 R1  O1
 R2  O2
 Groups varied in training
 One group received 2 days of training
 Second group received 5 days of training
 Third group received 20 days of training
 Training was followed by devaluing one of the outcomes by specific satiety  choice test
 Results:
 Rats that received 2 and 5 days of training showed devaluation effect  Respond
more to the non-devalued lever
 Rats trained for 20 days did not show the devaluation effect  responded
equivalently on the devalued and non-devalued lever  overtrained rats were
insensitive to outcome devaluation  demonstrating that overtraining an action
makes it habitual

 The ability of overtraining to produce habits can also be shown using contingency degradation

 2 groups learn that two actions delivered two distinct outcomes
 2 groups differed in training
 One group received 360 pairings of the action and outcomes  overtrained
 One group received 120 pairings
 Then one contingency between one of the actions and its outcome was degraded
 Overtrained rats responded just as much to the degraded lever and the non-degraded
lever  habitual
 Lightly trained rats responded less on the degraded lever  GD

From actions to habits: schedules of reinforcement

 in the simplest case, instrumental outcomes are delivered whenever an action is performed
 sometimes we must perform an action many times to obtain the outcome e.g., pokies
machine and other times we must wait a specified amount of time for an outcome to
be delivered after performing an action e.g., mail carrier
 these relationships between the performance of an action and the rate of reinforcement
are called schedules of reinforcement
1. ratio schedules  outcome delivery is based on a specified number of responses. The
more the animal performs the action, the more rewards it earns e.g., hunting in animals
2. interval schedules  a response is reinforced only after a specified time has elapsed e.g., an animal grazing on
grass  it must wait for more grass to grow before it can eat again

from actions to habits: schedules of reinforcement

 GD behaviours rely on the feedback between the response and the rate of reward
 Under interval schedules the rate of responding does not necessarily correlate with the
rate of the reward, particularly if the specified interval is long the rate of responding is
 Interval schedules  produce habits more rapidly than ratio schedules. This is
because interval schedules produce less feedback between the response and the rate of reward
 Lab experiment
 Researchers trained 2 groups to press a lever for a food pellet
 One group was trained on a ratio schedule – 5 lever presses = food
 The other group was trained on an interval schedule – a lever press was reinforced every 15 seconds
 Half of the rats in each group then had the pellet devalued by pairing its consumption
with lithium chloride injections. The other half received saline injections  tested
the following day
 Rats trained on ratio showed a devaluation effect: those that had the pellet devalued by
conditioned taste aversion showed a decrease in responding compared to the non-devalued
rats
 By contrast, rats that were trained on interval showed insensitivity to outcome
devaluation: they responded equivalently whether the outcome had been paired with
sickness or not
 These data demonstrate that interval schedules produce habits more rapidly
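
The difference in response-reward feedback between the two schedules can be illustrated with a small simulation. This is a rough sketch under stated assumptions: the 5-press ratio and 15-second interval come from the experiment above, but the 60-second session, the evenly spaced presses and all function names are made up for the example.

```python
def ratio_rewards(n_presses, ratio=5):
    """Ratio schedule: one reward for every `ratio` presses."""
    return n_presses // ratio


def interval_rewards(press_times, interval=15.0):
    """Interval schedule: a press is reinforced only if at least
    `interval` seconds have passed since the last reward."""
    rewards = 0
    last_reward = 0.0
    for t in press_times:
        if t - last_reward >= interval:
            rewards += 1
            last_reward = t
    return rewards


def evenly_spaced(n_presses, session=60.0):
    """Hypothetical press times spread evenly over one session (seconds)."""
    return [session * (i + 1) / n_presses for i in range(n_presses)]


for n in (20, 40):
    print(n, "presses:",
          "ratio rewards =", ratio_rewards(n),
          "| interval rewards =", interval_rewards(evenly_spaced(n)))
```

In this sketch, doubling the number of presses doubles the rewards on the ratio schedule but leaves the interval rewards unchanged, which is the weak feedback between responding and reward that the notes link to faster habit formation.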

Interim summary
 Instrumental behaviours are those in which we interact with our environment to
produce outcomes
 Instrumental behaviours can be GD or habitual
 GD behaviours are sensitive to outcome devaluation and contingency degradation
 Overtraining and interval schedules caused GDA to become habits

Neural structures of GDA

 GD behaviours require the interactions of several brain regions – these regions are the
amygdala, the prelimbic region (PL) of the medial frontal cortex and the dorsal striatum
 The striatum is the rodent homologue to the human caudate/putamen
 It receives dense projections from the cortex and processed information from the
thalamus  specifically, the medial aspect of the dorsal striatum (DS) receives
projections from the prelimbic cortex as well as the BLA
 The medial aspect of the DS, PL and the BLA are all important for different
aspects of GD behaviours
Prelimbic cortex (PL): contingency

 This experiment used contingency degradation to show that animals without a PL
are not sensitive to changes in action-outcome contingencies
 Rats were initially given lesions to the PL or sham lesions (control)
 Rats were trained to press two different levers, R1 and R2, for O1 and O2
 One of these action-outcome contingencies was positive (the outcome was only delivered upon a
lever press)
 The other one was contingency degraded
 Responding during training for the sham operated rats is presented in the top graph
 sham controls show a contingency degradation effect i.e., they decreased their
responding on the lever that freely produced the outcome and maintained responding
on the non-degraded lever
 The rats that received the PL lesions did not show a preference for the non-degraded
lever; they responded similarly on both levers.
 PL rats were unable to adjust their behaviour according to the contingency between
actions and outcomes
 This shows that the PL is necessary for the acquisition of GDA

PL: outcome value

 Pretraining lesions of the PL or sham lesions were given before the rats were trained
to perform two actions for outcomes
 They were given outcome devaluation by specific satiety – had free access to 1 of the
2 outcomes for 30 minutes  immediately tested in a choice test between R1 and R2
– test was conducted in extinction (no outcome was given), this means that the rats
had to use knowledge encoded during conditioning and devaluation to choose
between actions and precludes using feedback to adjust actions
 Outcome devaluation test:
 sham rats show an outcome devaluation effect  more responding to the lever
associated with the still valued outcome
 PL rats did not show this effect  equivalent levels of responding to both levers
 unable to adjust their behaviour according to the value of the outcome that an
action procures
 Reward test:
 When the rats were given feedback on their responding by presenting the outcomes
again, their performance reflected knowledge of the current outcome values
 PL rats showed an outcome devaluation effect that was just like the sham control
 These data demonstrate that the PL is necessary for acquisition but not for
retrieval/expression of GD learning

Posterior Dorsomedial striatum (pDMS): contingency

 PL sends dense projections to the striatum (specifically the posterior aspect of the DMS)
 The DMS has been shown to be critical in the acquisition of GD actions and plays a particular role
in updating actions based on current action-outcome contingencies
 Experiment:
 Rats were given either sham lesions, lesions of the posterior DMS or lesions of the
anterior DMS and trained to perform two actions for 2 outcomes
 One of the action-outcome contingencies was degraded  its outcome was freely delivered
 Rats were given a choice test
 Sham lesions:
 Responded more on the non-degraded lever (contingency degradation effect)
 Anterior DMS:
 Showed similar pattern to the sham rats
 Posterior DMS:
 Did not show contingency degradation effect
 Responded similarly for both levers  showed difficulty in updating the action-
outcome contingencies
 pDMS is necessary for the acquisition of GD behaviours

pDMS: outcome value – specific satiety

 lesions of the pDMS also impair choice between the actions after outcome devaluation
 rats were given either sham lesions, anterior DMS lesions or pDMS lesions and were
then trained that R1  O1, R2  O2
 one of the outcomes was devalued by specific satiety  choice test
 sham rats:
 no difference in responding before the outcome devaluation
 similar responding during training
 clear outcome devaluation effect  selected the still valued outcome
 anterior DMS:
 similar pattern to the sham rats
 pDMS:
 did not show devaluation effect  providing further support for the role of the
pDMS in the acquisition of GD behaviours

pDMS: acquisition vs expression/retrieval (devaluation)

 this result has been replicated by temporarily inactivating the pDMS with
intracranial muscimol infusions
 rats trained with two action-outcome contingencies  one of the outcomes was
devalued  choice test
 immediately before the test they were infused with either muscimol or CSF (control)
into the pDMS
 rats in the control showed an outcome devaluation effect
 rats injected with muscimol did not show a devaluation effect  responded similarly
to both levers
 demonstrates that the pDMS is important for the expression of goal-directed actions

pDMS: acquisition vs expression/retrieval (contingency degradation)

 infusions of muscimol into the pDMS similarly impair the performance of GDA when
outcome contingencies are manipulated
 rats were trained with 2 action-outcome associations  one action-outcome
contingency was degraded
 infusions of muscimol or CSF occurred immediately before the choice test
 control rats:
 final day of training  rats responded more on the non-degraded lever
 test  difference persisted at test, with significantly more responding on
the non-degraded lever
 muscimol rats:
 showed a preference for the non-degraded lever during training, but during test the
muscimol infusion impaired the expression of GD behaviours
 pDMS is necessary for the expression of GD behaviours

BLA: contingency

 lesions of the BLA have been shown to impair GDA as assessed by contingency degradation
 BLA or sham lesions were made prior to training the rats on 2 action outcome
contingencies  one was degraded  choice test in extinction
 Sham rats showed sensitivity to contingency degradation  decreased responding on
degraded lever
 BLA rats showed insensitivity to contingency degradation  responded similarly
to both levers

BLA: outcome value

 lesions of the BLA have been shown to produce insensitivity to outcome devaluation
 pre-training lesions were given before rats were trained on 2 action-outcome
contingencies  one was devalued by specific satiety  choice test
 sham lesion rats showed a clear devaluation effect  responded more on the non-devalued
lever
 BLA rats did not show the devaluation effect
 Thus, the effects on the contingency degradation and outcome devaluation provide
evidence that the BLA is required in the acquisition of GDA

BLA attributes “incentive value”

 BLA has a role that is distinct from the roles played by the PL and the pDMS
 The BLA processes changes to outcome value which is critical for GD behaviour 
important in assigning incentive value
 Rats were trained with two levers that predicted 2 distinct food outcomes  one was
devalued by specific satiety
 Immediately before devaluation rats were infused with either the NMDAr antagonist
ifenprodil or vehicle into the BLA
 Data from rats given pre-devaluation infusions are presented here. Vehicle treated rats
showed outcome devaluation, pressing more on the valued lever than the devalued lever
 However, infusions of ifenprodil into the BLA impaired the expression of goal-directed
behaviours. These rats showed similar responding on the devalued and
non-devalued levers
 Importantly, the timing of the infusion suggests that the BLA is important for
encoding outcome value, as the NMDAr antagonism occurs during the specific satiety
manipulation. This suggests that the impairment caused by the BLA manipulation may
not be because of an inability to retrieve instrumental contingencies

 In support, in another group of rats, these infusions occurred before testing rather
than before devaluation
 When the infusions occurred before the choice test  both vehicle and ifenprodil
infused rats showed a devaluation effect  behaviour was GD despite the blockade of
NMDAr in the BLA  this strengthens the view that the BLA is not involved in
encoding instrumental contingencies
 These data suggest that the BLA is involved in encoding outcome value or assigning
incentive value  retrieving this value is essential for the expression of GD
behaviour  thus, the BLA is not directly involved in GDA but it provides
information that is necessary to display the behaviour

Interim summary
 PL is necessary for acquisition but not the performance of GDB
 pDMS is necessary for the acquisition and performance of GDB
 BLA encodes outcome value which is required for the performance of GDB

Neural structure of habits

 Focus on 3 complementary regions:

1. The infralimbic cortex which lies just ventral to the PL
2. The dorsolateral striatum which lies lateral to the DMS
3. Central amygdala which sits medial to the BLA

Infralimbic cortex (IL): retrieval/expression of habits

 Rats were trained to press a lever, R1  O1 – this was done for many weeks to
overtrain the rats on this action
 Rats received outcome devaluation by specific satiety
 During specific satiety half of the rats were pre fed on the outcome they had earned
during training sessions, O1, whereas the others were pre fed on a different outcome,
O2
 In the between subjects design, rats that were pre fed on O1 were the devalued group
and rats that were pre fed on O2 were the non-devalued group
 After specific satiety, rats were infused with either muscimol or vehicle into the IL
 Muscimol inactivates the structure
 Test was conducted in extinction – requires rats to use knowledge encoded during the
conditioning and devaluation
 Control rats:
 Showed similar levels of responding in the devalued and non-devalued groups
 Rats pre fed on outcome 1 (the earned outcome) during specific satiety showed the
same amount of responding as the rats that were pre-fed on the different outcome
during satiety
 This insensitivity to outcome devaluation is indicative of habitual performance caused
by extensive overtraining of the instrumental response
 Muscimol rats:
 Displayed sensitivity to outcome devaluation despite overtraining
 Rats in the non-devalued group responded more than rats that were in the devalued group
 Inactivation of the IL before testing restored sensitivity to outcome devaluation
 Demonstrates that the IL is necessary for the retrieval and expression of habits

Infralimbic cortex vs prelimbic cortex

 Experimenters tested the effects of PL and IL lesions on GDB and habits by
manipulating the amount of training 2 actions received
 Initially rats were given lesions to the prelimbic cortex, the infralimbic cortex or
sham lesions, which control for the surgical procedure but do not produce any neural damage
 Following surgery rats were trained to perform 2 different actions for 2 outcomes in
2 different chambers. However, they differed in the amount of training received.
 One action was overtrained  20 sessions of training on that action
 The other action was not overtrained  5 sessions
 One action is habitual, one is GD
 After training rats were given outcome devaluation by specific satiety
 rats were either pre-fed on outcome 1 (overtrained) or outcome 2 (undertrained)
 they were tested in both the overtrained and undertrained context for responding on
R1 and R2

 undertrained rats:
 sham lesions showed an outcome devaluation effect
 rats with IL lesions showed a similar devaluation effect  IL lesions do not impair
the acquisition of GD behaviours
 PL rats responded similarly for both outcomes  supports the role of the PL in the
acquisition of GDB

 Overtrained rats:
 Sham rats showed insensitivity to outcome devaluation  similar levels of
responding for both outcomes – suggests that overtraining made this response habitual
 IL rats showed GD performance  responded more on the non-devalued lever –
suggests that IL is necessary for habits
 PL lesions had no effect on habit performance – PL rats showed insensitivity to outcome
devaluation

 PL and IL have dissociable roles in GDA and habits

 The PL is necessary for GDA
 IL is required for habits

Dorsolateral striatum (DLS): acquisition of habits

 DLS or sham lesions were made before overtraining a response that led to a food outcome
 That outcome was devalued by conditioned taste aversion  specifically half the
rats had the consumption of the earned outcome (O1) paired with injections of lithium
chloride which induces illness – devalued group
 The other half had the consumption of O1 paired with saline injections – O1 was still
valuable to these rats – non-devalued group
 Rats were then tested for responding on R1 in extinction.
 Sham rats showed insensitivity to outcome devaluation  overtraining produced
habitual behaviours
 DLS rats did not show habits despite overtraining  DLS is necessary for the
acquisition of habits

DLS: retrieval/expression of habits

 The DLS has been shown to be critical for the expression of habits
 Rats were trained to press a lever for sucrose solution over many days until there was
a stable and high rate of responding
 Rats were given an omission schedule – pressing the lever delayed the delivery of the
sucrose solution  rats had to withhold their responding to access the outcome
 Control rats received no contingent relationship between the action and outcome
 Immediately before omission training rats were infused with either muscimol or
saline into the DLS  tested in extinction

 Control rats:
 Control rats showed insensitivity to the imposition of the omission contingency and
failed to withhold responding during omission training in order to earn the sucrose
 Loss of control is indicative of habitual responding mediated by S-R associations

 DLS infused rats:

 Enhanced learning of the omission contingency
 Learned to withhold lever pressing  suggests that DLS inactivation during training
increased sensitivity to this contingency
 The results demonstrate the role of the DLS in the performance of overtrained
habits and underscore its importance in maintaining habitual behaviours

IL and DLS: interim summary

 IL is necessary for acquisition, retrieval/expression of habits
 DLS is necessary for acquisition and retrieval/expression of habits
 IL and DLS are not anatomically connected
 Projections from the IL to the central nucleus of the amygdala (CeA) suggest that the
CeA interacts with these regions to promote habits

CeA: acquisition of habits

 Rats were given lesions of the anterior and posterior CeA or sham lesions before
being overtrained on a response that earned sucrose solution
 Devaluation  conditioned taste aversion  extinction test
 Lesions in the anterior CeA restored sensitivity to outcome devaluation despite overtraining
 Restoration was not seen in the sham rats or the posterior CeA rats
 Suggests that the anterior CeA is necessary for the acquisition of habits

aCeA  DLS pathway

 This study supports that the CeA and DLS interact during habit acquisition
 The experimenters disconnected the communication between the two regions
before overtraining a response  lesioning the DLS and the CeA in either ipsilateral
or contralateral hemispheres
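 Ipsilateral (both lesions in the same hemisphere)  disrupts the communication between these
structures in one hemisphere only; the pathway in the opposite hemisphere is left intact
 Contralateral (lesions in opposite hemispheres)  disrupts the communication between these
structures in both hemispheres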
 Surgery  overtrained  outcome devaluation by conditioned taste aversion 
extinction test
 Ipsilateral rats:
 Showed insensitivity to outcome devaluation  responding was habitual
 Contralateral rats:
 Restored sensitivity to outcome devaluation
 These data provide evidence that the CeA and DLS interact during habit formation

 PL is necessary for acquisition but not performance of GDB
 pDMS is necessary for acquisition and performance
 BLA encodes outcome value which is required for performance of GDB
 The IL/DLS is necessary for the acquisition and retrieval/expression of habits
 The CeA interacts with the DLS in the acquisition of habits

Dual process theory of actions and habits

 Goal-directed actions:
 Driven by action-outcome (A-O) associations
 Sensitive to outcome value and action-outcome contingencies
 Habits:
 Driven by stimulus-response (S-R) associations
 Insensitive to outcome devaluation and contingency degradation
 A popular explanation of interactions between these two behaviours is the dual
process theory
 It states that the behavioural output (or total strength) is determined by the sum
of A-O and S-R associations. A-O associations are strong when an animal initially
learns about its actions and their consequences, but their influence declines with training
 S-R associations start out weak but gain strength as training progresses
 This results in the performance of habitual responses that are relatively insensitive to
outcome devaluation and contingency degradation after extended training
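
The summation idea can be made concrete with a toy calculation. Everything numerical here is an assumption chosen for illustration (exponential curves, a rate of 0.1 per session, strengths scaled to 1); the lecture only states that A-O strength starts high and declines while S-R strength starts low and grows.

```python
import math


def ao_strength(session, start=1.0, decay=0.1):
    """Hypothetical A-O strength: strong early, declining with training."""
    return start * math.exp(-decay * session)


def sr_strength(session, ceiling=1.0, rate=0.1):
    """Hypothetical S-R strength: weak early, growing with training."""
    return ceiling * (1.0 - math.exp(-rate * session))


def total_strength(session):
    """Behavioural output as the sum of the two associations."""
    return ao_strength(session) + sr_strength(session)


def dominant_controller(session):
    """Which association contributes more to responding at this point."""
    if ao_strength(session) > sr_strength(session):
        return "goal-directed"
    return "habitual"


for session in (2, 5, 20):
    print(session, round(ao_strength(session), 2),
          round(sr_strength(session), 2), dominant_controller(session))
```

With these made-up parameters the A-O term dominates after 2 and 5 sessions and the S-R term dominates after 20, mirroring the pattern in the overtraining experiment. The A-O term never reaches zero in this sketch, in line with the point that goal-directed control is not erased by habits.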

 GDA and habits are supported by two independent and parallel systems  two forms
of learning co-exist and compete for control
 Although the figure suggests a decline in the A-O associations, GDA are not removed
and replaced by habits
 For example, driving to work – the behaviour is done seamlessly (habitually) because
you drive every day, but when you drive past a cop car your behaviour changes  it
switches from habitual back to goal-directed behaviour in an instant
 On a neural level, the systems that drive GD and habitual behaviours function in the
same way  they are parallel, independent and compete with each other  disrupting GD
related structures makes behaviour habitual, disrupting habit related structures makes
behaviour GD


Topics covered:
 Influence of Pavlovian stimuli (CS) on instrumental behaviour
 Pavlovian to instrumental transfer (PIT): general and specific
 Effects of change in primary motivational states and outcome value
 Neural circuitry underlying general and specific PIT

Pavlovian and instrumental interactions

 Pavlovian and instrumental conditioning interact with each other
 Pavlovian CSs influence performance of instrumental actions
 This is true whether a CS is aversive or appetitive
 Pavlovian instrumental transfer (PIT) allows the study of these interactions
 PIT is observable in many species: humans, monkeys, rodents
 The two forms of PIT are general and specific

General PIT

 A stimulus that predicts food enhances performance on action that procures food
 E.g., sitting at desk and notice that the time on the clock is almost 12  this time is
associated with lunch and therefore predicts food  go into kitchen or buy food
 E.g., smell food that your colleague has brought  motivates you to buy food
 These examples show that a stimulus (time of day, smell of food) predicting food
enhances performance on actions procuring food – this is the general PIT effect
 This technique is used in advertising/marketing
Specific PIT

 Predictive stimuli can exert a more selective influence on our instrumental behaviour

 E.g., lunch example: what should I have for lunch? Options: burger, salad, noodles
 one factor in influencing our choice is the value that we attribute to the outcome of
our actions: if I ate a salad at lunch, I may not want to eat a salad again at dinner
 another factor that influences our choices is the presence of stimuli that have history
with the outcomes earned by our actions – this influence is more specific
 e.g., you are walking on the street and hesitate between eating a burger or salad. As
you walk you see the McDonald’s sign  this will lead you to eat a burger
 seeing the sign does not motivate you to buy just any food, it selectively guides you to go
and buy a burger  a stimulus that predicts a particular food outcome biases choice
towards an action producing that same outcome

studying PIT in the lab

 PIT involves 3 stages
1. Pavlovian stage
2. Instrumental stage
3. Transfer test stage

 The content of these stages differs between general and specific PIT
 General PIT – Pavlovian stage
 One stimulus predicts the food outcome and the other predicts no outcome
 E.g., tone predicts food and clicker predicts nothing (S1  O1, S2  nothing)
 This training takes place over the course of a few days
 General PIT – Instrumental training
 Lever press earns a different type of food e.g., sucrose solution
 Action 1 leads to outcome 2 (A1  O2)
 Transfer test
 S1 and S2 are presented and the amount of responding on the lever is measured in the
presence of each stimulus
 This is the first time that the lever and stimuli are presented in the same session
 Results: more pressing on the lever when S1 is turned on – when this happens, it is
referred to as a “general transfer effect”

 Specific PIT
 2 outcomes are used in each stage
 Pavlovian stage
 Tone predicts food and noise predicts sucrose solution
 S1  O1 and S2  O2
 Instrumental training
 2 different actions lead to two outcomes
 Left lever = food
 Right lever = sucrose solution
 A1  O1
 A2  O2
 Transfer test
 Each stimulus is turned on and the amount of responding is measured
 Results: when S1 is on there is more responding on A1 than A2, and when S2 is on 
more responding on A2 than A1
 In other words, a stimulus predicting a particular outcome increases performance
on an action delivering the same outcome

General and specific PIT

 When we quantify responding during the transfer test, we look at the amount of
responding on levers in the presence of each stimulus as well as when neither stimulus
is on
 General PIT
 A general PIT effect would be demonstrated by higher levels of responding during
the stimulus that predicts food (S1), than during the stimulus that predicts
nothing (S2) or during the “No CS” period
 Generally, rats will show some levels of performance during the no cs period –
baseline level of responding or sometimes called “Pre CS” or intertrial interval (ITI)
 Evidence of general PIT can come from two comparisons – S1 triggers more
responding than when no CS is on or S1 triggers more responding than the neutral cue
of S2

 Specific PIT
 Comparing responding on two different levers during 3 different times
 Responding to the no CS is very limited – considered baseline levels
 The two columns in the middle are responding to the levers during s1 (tone=food) 
respond more to the left lever because the left lever earned the food pellets during the
instrumental stage
 Right column = responding on S2 (noise=sucrose)  pressed right lever more
because the right lever earned the sucrose during the instrumental phase

 To simplify things the data on the right collapses the two stimuli and compares
performance on the lever that earned the same outcome with the performance on the
lever that earned the different outcome
 So, the black bar in this figure is the average rate of responding on the left lever when
the tone turned on AND the rate of responding on the right lever when the white noise
turned on.
 The white bar is the average rate of responding on the levers that earned the different
outcome when the stimuli turn on. So, it would be the right lever during the tone, and
the left lever during the white noise.
 NB: Baseline responding is sometimes absent on PIT graphs. This responding is
typically subtracted from responding during the stimuli. Usually, the term net responding
appears on the y axis
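
The Same/Different collapse and the baseline subtraction described above amount to simple arithmetic, sketched here in Python. The response rates are invented for illustration (presses per minute), and the function names are hypothetical; only the tone/left-lever and noise/right-lever pairings follow the example in these notes.

```python
def collapse_same_different(rates):
    """Average the two 'Same' cells and the two 'Different' cells.

    `rates` maps (stimulus, lever) to a response rate.
    Tone predicted food (left lever); noise predicted sucrose (right lever).
    """
    same = (rates[("tone", "left")] + rates[("noise", "right")]) / 2
    different = (rates[("tone", "right")] + rates[("noise", "left")]) / 2
    return same, different


def net_responding(cs_rate, baseline_rate):
    """Responding during the stimulus minus baseline (No CS) responding."""
    return cs_rate - baseline_rate


# Hypothetical data, not taken from any experiment.
example_rates = {
    ("tone", "left"): 12.0,
    ("tone", "right"): 5.0,
    ("noise", "right"): 10.0,
    ("noise", "left"): 3.0,
}
example_baseline = 4.0

same, different = collapse_same_different(example_rates)
print("Same:", same, "Different:", different)
print("Net Same:", net_responding(same, example_baseline))
print("Net Different:", net_responding(different, example_baseline))
```

A specific PIT effect would show up as net Same responding that is clearly above net Different responding, as it is with these made-up numbers.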

3 CS design
 It is possible to study general and specific PIT in the same animal at the same time 
3 CS design
 During Pavlovian conditioning stage, rats learn that 3 CSs (s1, s2, s3) predict 3
distinct food outcomes
 During the instrumental stage, two of these food outcomes can be earned by
performing 2 distinct actions e.g., left lever press and right lever press
 During the test, the amount of responding on each lever is assessed during the
presentation of the 3 stimuli and when no stimulus is on at all. The critical point here is that
S1 and S2 predict outcomes earned by the actions, which should generate specific
PIT, whereas S3 predicts a food outcome but one that is different from the ones
earned by the instrumental actions. This should generate general PIT
 What’s found is that S1 and S2 trigger specific PIT: S1 increases A1 but not A2,
whereas S2 increases A2 but not A1. In other words, responding on the “Same” lever
is greater than the “Different” lever, and greater than baseline
 In addition, S3 generates general PIT. It increases responding on both A1 and A2
compared to baseline. Responding on A1 and A2 during S3 is generally collapsed to
create ‘General’ responding.
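
The logic of the 3 CS design can be written out as a lookup: a stimulus should produce specific PIT if its outcome is also earned by one of the actions, and general PIT otherwise. This is a sketch only; the outcome labels O1 to O3 and the function name are assumptions for the example.

```python
# Pavlovian stage: each stimulus predicts a distinct food outcome.
PAVLOVIAN = {"S1": "O1", "S2": "O2", "S3": "O3"}

# Instrumental stage: only two of those outcomes can be earned.
INSTRUMENTAL = {"A1": "O1", "A2": "O2"}


def predicted_pit(stimulus):
    """Return the expected PIT type and the actions expected to increase."""
    outcome = PAVLOVIAN[stimulus]
    matching = [action for action, earned in INSTRUMENTAL.items()
                if earned == outcome]
    if matching:
        # The stimulus shares its outcome with an action: selective boost.
        return "specific", matching
    # The outcome is food but is not earned by any action: both actions boosted.
    return "general", sorted(INSTRUMENTAL)


for stimulus in ("S1", "S2", "S3"):
    print(stimulus, predicted_pit(stimulus))
```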

 the 3 CS design reveals an important dilemma
 S3 enhances performance on both A1 and A2  a stimulus that predicts food
enhances performance on actions procuring food
 If true, why isn’t S1 (or S2) enhancing performance on A2 (or A1)? In other words,
why aren’t S1 and S2 producing general PIT?
 The reason is that general and specific are two distinct phenomena and, as such, they
involve distinct mechanisms
 To understand these mechanisms, we must first understand how outcomes or USs are
being processed.

Processing of outcomes/USs

 Need to consider 2 properties:

 Motivational: appetitive vs aversive  all the outcomes used in PIT procedures are
food which are appetitive
 Sensory specific e.g., taste, odour, texture
 Can this distinction explain and support the two forms of PIT?  One way to answer
this question is to manipulate one of these properties and determine whether it affects
the two forms of PIT in a similar way
 The easiest property to manipulate is motivation  outcome devaluation = decreases
desirability of an outcome
 New procedure:
 Sated  no longer view food as appetitive (shift in motivational states)

Changes in motivational state

 3 CS design to evaluate general (S3) and specific PIT (S1 and S2)
 Pavlovian and instrumental stage was conducted when the rats were hungry
 During this first test (left panel), there was evidence of both specific and general PIT.
There was more responding on the same lever than the different lever, indicating a
specific PIT effect, and there was more general responding during S3 than during the
baseline, or the Pre CS period.
 The second test (right panel) was conducted while the rats were sated. Note that the
overall performance was lower than when the rats were hungry. This simply reflects
that the animals were less motivated to press the levers to get food because they were not
hungry. Nevertheless, specific PIT was still present: there was more responding on the
Same lever than the different lever. However, general PIT was abolished. General
responding during S3 did not differ from baseline.
 This experiment therefore demonstrates that lowering the value of the outcome
predicted by a stimulus abolishes general PIT but leaves intact specific PIT.
Consistent with the latter, outcome devaluation spares specific PIT.
 A specific PIT design was used
 Pavlovian  instrumental  PIT test
 Before the test, the two outcomes were devalued by conditioned taste aversion  O1
and O2 were paired with injections of lithium chloride to induce illness
 Results:
 Even though the outcomes were devalued  specific PIT emerged: more responding
on the lever that earned the same outcome
 despite having outcomes paired with sickness, the influence of Pavlovian cues to bias
choice was still apparent
 Outcome devaluation fails to remove specific PIT

Outcome processing in the amygdala

 The amygdala contains 2 important subnuclei, BLA and CeA – these 2 structures work
together in a parallel and complementary fashion
 CeA encodes information about the motivational properties of the CS-US association
 By contrast, the BLA computes information about the sensory specific properties of
the CS-US association
 As a result, the CeA and BLA are thought to trigger distinct forms of conditioned responses
 The CeA produces preparatory CRs based on the valence of the US. For example,
aversive CSs will elicit withdrawal, and appetitive CSs will elicit approach responses.
These responses are due to the motivational properties of the stimuli.
 By contrast, the BLA produces consummatory responses, which are specific to the
outcome employed. For example, some aversive CSs elicit freezing and some
appetitive CSs elicit chewing. These are specific to their sensory specific properties
 the best evidence for this dichotomy probably comes from the dissociation seen in
general and specific PIT.

CeA vs BLA lesions in PIT

 To test the prediction that the BLA and CeA encode distinct properties of outcomes,
and thus, should support the different forms of PIT, lesions of the BLA or CeA were
made before submitting the rats to a PIT procedure using the 3 CS design
 Control rats were given sham lesions
 Recall that the 3 CS design allows us to look at specific PIT by looking at responding
during S1 and S2, as well as general PIT by looking at responding during S3.
 Results:
  The data show the net effects of the predictive stimuli (baseline responding, in the absence of
the stimuli, is subtracted)  positive values mean there is more responding during a
particular stimulus than during baseline  negative values mean there is more
responding during the baseline period than during that stimulus
 Sham rats show a lot of responding on the lever that predicted the same outcome as
the CS being presented, and very little responding on the different lever  specific
PIT effect
  Sham rats also showed general PIT: S3 increased responding on both levers above baseline
 specific PIT was abolished in rats given BLA lesions: Performance on the same
lever was equivalent to that on the different lever
 the opposite was observed in rats given CeA lesions. Specific PIT was preserved:
there was more responding on the same than the different lever, but general PIT was
abolished: general responding did not differ from baseline
 So, the BLA is necessary for specific PIT, but not general PIT, and the CeA is
necessary for general PIT, but not specific PIT.
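
The baseline subtraction and the two PIT comparisons described in these results can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the response rates are hypothetical (they are not data from the experiment), and general PIT is simplified to a single S3 response rate rather than separate scores for each lever.

```python
def net_score(responding_during_stimulus, baseline):
    """Net effect of a stimulus: responding during it minus baseline responding."""
    return responding_during_stimulus - baseline


def specific_pit(same_lever, different_lever, baseline):
    """Specific PIT: net responding is higher on the same lever than on the different lever."""
    return net_score(same_lever, baseline) > net_score(different_lever, baseline)


def general_pit(s3_responding, baseline):
    """General PIT: S3 raises responding above baseline (positive net score)."""
    return net_score(s3_responding, baseline) > 0


# Hypothetical lever presses per minute, invented purely for illustration.
groups = {
    "sham": {"same": 12, "different": 5, "s3": 9, "baseline": 5},
    "BLA lesion": {"same": 7, "different": 7, "s3": 9, "baseline": 5},
    "CeA lesion": {"same": 12, "different": 5, "s3": 5, "baseline": 5},
}

for name, r in groups.items():
    print(
        name,
        "| specific PIT:", specific_pit(r["same"], r["different"], r["baseline"]),
        "| general PIT:", general_pit(r["s3"], r["baseline"]),
    )
```

With these made-up numbers, the sham group shows both effects, the BLA group loses only specific PIT, and the CeA group loses only general PIT, which mirrors the pattern reported in the notes.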

PIT and instrumental learning

 Overtraining an action makes it habitual

  Specific and general PIT are observed regardless of the amount of instrumental training
  In fact, there may be evidence that extensive training increases the size of the PIT effect
  So, what effect does the instrumental training have on PIT? Can manipulation of the
brain regions involved in goal-directed and habitual learning provide some insight?


  Researchers examined the role of the dorsolateral striatum (DLS) and the posterior dorsomedial striatum (pDMS) in specific PIT
  The DLS is critical for habits
  The posterior DMS is critical for goal-directed actions.
 Experiment:
 standard PIT design was used to look at specific PIT.
  Immediately before the test, rats were infused into either the DLS or posterior DMS
with either muscimol, which inactivates the structure, or saline
  Results for DLS rats:
 Saline treated rats showed specific PIT: their responding on the same lever was
greater than the different lever and greater than baseline
 Rats given muscimol during test showed much less responding overall than the saline
treated rats, but specific PIT was still observed.
 DLS is important for performance but not choice between actions
 Results of DMS rats:
 the saline control rats showed a clear specific PIT effect:
 Rats given muscimol into the DMS showed as much responding as the saline controls,
but the specificity of their performance was gone (specific PIT was removed)
 the pDMS is important for specific PIT. Given its role in goal-directed behaviours,
it is likely to provide information about the action-outcome associations that are
necessary for specific PIT.

How can we explain these results?

  Amount of instrumental training was moderate  the instrumental behaviour generated
was likely to be goal-directed (GD)
  pDMS inactivation disrupted goal-directed behaviour and removed specific PIT
  DLS inactivation did decrease performance, indicating that some aspect of the behaviour
was habitual, as suggested by the dual-process theory
 One obvious question is whether specific PIT could be disrupted in situations where
instrumental training is extensive and leads to habits. This question is yet to be
answered. Likewise, it remains unknown whether the DLS or pDMS are involved in
general PIT.

Energising vs choosing

  These data, along with the data from the experiment examining the shift in
motivational state, demonstrate that there are two mechanisms at play: one that
energises action performance, and one that controls action selection (i.e., choice).

Nucleus accumbens (NAc)

 Another area that has been implicated in both forms of PIT

 It is the ventral part of the striatum and is therefore just below the dorsal striatum.
 plays an important role in predictive learning and rewarding experiences
 comprised of two regions, the core, and the shell, which have been shown to have
dissociable roles in general and specific PIT.

Nucleus accumbens: Shell vs core

 For instance, the core and the shell were each inactivated before the PIT test using the
3 CS design.
 in one group of rats, the shell was infused with either saline or muscimol
 in another group of rats, the core was infused with either saline or muscimol.
 Note that the data present the net effects of the stimuli. Meaning that baseline
responding was subtracted.
 Results for specific PIT
  Control rats given saline infusions into either the shell or the core displayed
specific PIT
  The same was true for rats given muscimol into the core (specific PIT effect)
  However, rats given infusions of muscimol into the shell did not show a specific PIT effect
 These data demonstrate that the nucleus accumbens shell, but not the core, is critical
for specific PIT.
 Results for general PIT:
 Unsurprisingly, control rats infused with saline into both the shell and core exhibited
general PIT
 The same was true for rats that had muscimol into the shell.
 However, rats that had muscimol infusions into the core did not show general PIT
 Demonstrates that the nucleus accumbens core, but not the shell, is necessary for
general PIT

General and specific PIT

General PIT
  Stimulus predicting food elevates performance on an action delivering food
  A1: S1 > S2, S1 > no S
  Abolished by shifts in motivational states
  Triggered by general motivational properties of the outcome predicted by the stimulus
  Relies on CeA, NAc core

Specific PIT
  A stimulus predicting a particular food increases performance on an action earning the same food
  S1: A1 > A2; S2: A1 < A2
  Survives shifts in motivational states and outcome devaluation
  Triggered by the sensory specific properties of the outcome predicted by the stimulus
  Relies on BLA, NAc shell

Pavlovian stimuli and choice

 Although it’s important to determine the neural structures involved in the acquisition
and performance of actions in the absence of any stimuli, this does not really reflect
what happens in real-life.
 We live in a world full of predictive stimuli, and these stimuli as we’ve seen have a
major influence on how we behave
 these stimuli make our actions transiently habitual and resistant to outcome value.
  Thus, animals, including humans, engage in behaviours to procure outcomes they do
not necessarily desire. This obviously poses a hurdle to overcome when considering
issues such as obesity and drug addiction.

 Pavlovian cues have a strong influence on our instrumental behaviours and this
influence is assessed using general and specific PIT
  The 3 CS design of PIT allows us to examine specific and general forms of PIT
 Specific PIT is mediated by sensory specific properties of outcome; general PIT is
mediated by the outcome’s motivational properties
 General PIT requires activity in the CeA and the nucleus accumbens core
  Specific PIT requires the BLA, pDMS and the nucleus accumbens shell
 PIT exemplifies the influence environmental stimuli have on our actions: this
influence is adaptive but also problematic
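
The double dissociations in this summary can be written as a small lookup, shown here as a hedged Python sketch. The dictionary `REQUIRED_REGIONS` and the function `pit_forms_abolished` are hypothetical names introduced for illustration, and the table encodes only what these notes state: a region that is not listed (for example the DLS) is simply not claimed to be required, which is different from saying it has been shown to be unnecessary.

```python
# Regions these notes describe as required for each form of PIT.
# (Whether the DLS or pDMS contribute to general PIT is stated to be unknown.)
REQUIRED_REGIONS = {
    "general": {"CeA", "NAc core"},
    "specific": {"BLA", "NAc shell", "pDMS"},
}


def pit_forms_abolished(inactivated_region):
    """Return the forms of PIT expected to be lost when a region is lesioned or inactivated."""
    return sorted(
        form
        for form, regions in REQUIRED_REGIONS.items()
        if inactivated_region in regions
    )


for region in ["BLA", "CeA", "NAc shell", "NAc core", "pDMS", "DLS"]:
    print(region, "->", pit_forms_abolished(region) or "no form abolished (per these notes)")
```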


Topics covered
 Extinction in lab
 Restoration of phenomena
 Neurobiology of extinction
 Clinical implication of fear extinction

Living in a changing environment

 Successful adaptation depends on the capacity to extract predictive information from
the environment to anticipate motivationally significant events
  In many situations, this information is unambiguous, and reliably predicts an
outcome or the occurrence of some event. An example of this unambiguous situation
is in Pavlovian fear conditioning, which we have discussed
 However, our environment is constantly changing, and information that was once
reliable can become ambiguous. A classic example of this ambiguity involves
experimental extinction of Pavlovian conditioning

Extinction of conditioned fear

 Extinction refers to a procedure during which a conditioned stimulus that has been
trained to predict an unconditioned stimulus is repeatedly presented on its own.
Because of these CS-alone presentations, conditioned responses gradually decrease,
and eventually cease
 Recall that during fear conditioning, a rat is placed in a chamber, and pairings of the
CS and US are presented. This results in excitatory CS-US associations
  During extinction, the CS is repeatedly presented on its own, without the US. These
CS-alone presentations result in inhibitory CS-US associations, or a CS-noUS association
 Typically, we administer fear extinction 24 hours after conditioning, although it can
be administered days or even weeks after conditioning
 The extinction session may seem like the test for fear conditioning. Recall that testing
the retrieval of the fear conditioning memory consists of presenting the CS on its own
 However, tests of fear conditioning are generally shorter than extinction training to
ensure conditioned responding does not cease

Extinction of conditioned fear

 When quantified, conditioned responses, such as freezing, increase during
conditioning (or during the CS-US pairings).
  Then, during extinction, the repeated exposures to the CS in the absence of the US
produce a short-term loss of conditioned responses, as seen by the gradual reduction
in conditioned responses across the extinction session.
 Extinction also produces a long-term loss of conditioned responses, which can be
observed when animals are later tested in a retention test
  Testing the fear extinction memory consists of presenting the CS on its own, requiring
the animal to retrieve and express the extinction memory
  As can be seen in the panel on the right, animals that experienced extinction exhibit
fewer conditioned responses compared to those that did not experience extinction

Extinction is new learning

 Acquisition  consolidation  retrieval/expression

 Just like conditioning, extinction requires the same 3 stages of memory formation we
have previously discussed. Only now, the acquisition of extinction occurs during the
CS-noUS trials. Consolidation occurs after these CS alone presentations and lasts
minutes to hours and establishes it into long-term memory. And retrieval and
expression of the extinction memory occurs when we later test the CS by itself after
extinction in a retention test

CS has ambiguous meaning

 However, presenting the CS after extinction during these tests puts the animal in an
interesting predicament
 During fear conditioning, the rat developed an excitatory CS-US association, learning
that the CS predicted shock.
 But during extinction, it developed a CS-noUS association, learning the CS does not
predict shock. The CS is now ambiguous, and these two associations compete for
expression. So how does the rat behave given this conflicting information?
 Answering this question helps us understand how organisms adapt to changes in
predictive information.

Extinction is new learning

 It is important to emphasise that extinction produces new learning of an inhibitory
association that co-exists with the original excitatory association.
 This inhibitory association inhibits the expression of conditioned responses.
Extinction does not erase the original memory and is not “forgetting”. We know this
because of a variety of so-called ‘restoration phenomena’

The restoration phenomena

Return of conditioned responding following extinction:
 Spontaneous recovery
 Renewal
 Reinstatement
 These restoration phenomena are not exclusive to the extinction of conditioned fear,
but are also seen among a range of other behaviours
 Additionally, these restoration phenomena have important clinical implications,

Spontaneous recovery

  Spontaneous recovery refers to the return of extinguished conditioned responses
following the passage of time.
 For example, fear is conditioned by pairing a CS, such as a tone, with a US, such as a
footshock. Fear to the tone is then extinguished by presenting the tone alone. This
extinction session produces a short-term reduction in conditioned responses,
observable by the within-session reduction in freezing. It also produces a long-term
reduction in responding, so if we test 24 hours later, the rat will exhibit low levels of
conditioned responses
 However, if we interpolate a delay between extinction and test, 48 hours for example,
rats will exhibit more conditioned responding than those tested only 24 hours after
extinction learning.
 Similarly, if we test even longer after extinction, for example, one week following
extinction, rats will exhibit even more conditioned responding than those tested after
1 or 2 days. The passage of time after extinction produces a return of conditioned
responding.

Renewal

  Renewal refers to the return of conditioned responding due to a shift away from the extinction
context
 For example, conditioning might take place in a particular context. We will call this
context A.
 Then, extinction might take place in a different context, and we’ll call this context B.
 Then, we can return rats to context A for testing

Renewal (ABA)
 What we observe is that following extinction in context B, rats tested in context A,
the conditioning context, show more fear than those tested in context B, the extinction
context. In other words, the extinguished fear responses are renewed by the context
shift. This type of renewal, where conditioning takes place in context A, extinction
takes place in context B, and testing occurs back in context A is called ABA renewal.
However, renewal can be seen in other situations.

Renewal (AAB)

  Fear conditioning and extinction might both take place in context A.
  We then test rats in a new context, context B
  We again see renewal of conditioned responding when the animals are shifted from
their extinction context. Specifically, rats tested in context B show more freezing than
those tested in the extinction context, context A.
  This type of renewal, where conditioning and extinction take place in context A, and
testing occurs in context B, is called AAB renewal.

Renewal (ABC)
 renewal is also observed when conditioning occurs in context A, extinction occurs in
context B, and testing occurs in context C.
 Rats tested in context C will show more conditioned responding than those tested in
context B, their extinction context
 This type of renewal is called ABC renewal. This is the most convincing type of
renewal, as the test context is completely new.

Renewal and context specificity

  AAB and ABC renewal demonstrate that the return of conditioned responding is not
due to being returned to the conditioning context, per se, but rather being removed
from the extinction context. This is important, as it tells us that extinction is context
specific.
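
The rule that ties the three renewal designs together (responding returns whenever the test context differs from the extinction context) can be expressed as a one-line check. The sketch below is illustrative only: `renewal_expected` is a hypothetical helper, and each design is written as a three-letter string giving the conditioning, extinction and test contexts in that order.

```python
def renewal_expected(design):
    """Predict renewal from a design string such as "ABA".

    The three letters are the conditioning, extinction and test contexts.
    Renewal is expected when testing occurs outside the extinction context,
    regardless of whether the test context is the conditioning context.
    """
    conditioning, extinction, test = design
    return test != extinction


for design in ["ABA", "AAB", "ABC", "ABB", "AAA"]:
    print(design, "renewal expected:", renewal_expected(design))
```

Note that the conditioning context is unpacked but never used in the comparison, which is exactly the point made by AAB and ABC renewal.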

Reinstatement

  Reinstatement refers to the return of extinguished conditioned responses following exposure to the
US alone
 In reinstatement, rats learn an association between a CS and a US, for example, a tone
and footshock. Fear to the tone is then extinguished. After extinction, the rat is
presented with the US alone, but NOT the CS-US pairings.
 For example, the rat will be placed back in the chamber and footshocks will be
delivered without the tone
  What is observed is more conditioned responding to the CS when it is presented at
test in rats that had been given the shock after extinction compared to rats that
had not
 Interestingly, conditioned responses can be reinstated after other stressful events are
experienced. For example, an acute stressor, like mild restraint can serve to reinstate
the conditioned responses
 Thus, reinstatement doesn’t necessarily require reexperiencing the US, per se, but can
be caused by some other trauma.

Pavlovian extinction and PIT

  Although the three fear restoration phenomena are very popular, the best evidence that
extinction does not erase the original learning comes from studies using specific PIT.
 Experiment:
  Rats were submitted to a standard appetitive PIT procedure.
  They were initially given Pavlovian training, where they learned that two stimuli, S1 and S2,
predicted two food outcomes, O1 and O2, respectively
  They then received instrumental training, where those two food outcomes could be earned by responding
on two distinct levers, R1 and R2
  Before the transfer test, they were given Pavlovian extinction, where one of the
Pavlovian stimuli was extinguished by presenting it without the outcome
 results:
 On the left of this figure is the responding on the same lever, the different lever, and
the baseline responding during the CS that was not extinguished  specific PIT effect
 Similarly, a clear PIT effect can be seen during the CS that was extinguished (data on
the right) - Even though this stimulus was extinguished before the test, it still elevated
performance on the action that earned the same outcome more than the other action,
and more than baseline. In other words, a specific PIT effect was found regardless
of whether the CS was extinguished.
 So, extinction of the Pavlovian association does not remove specific PIT. Extinction is
not erasure or unlearning.
 Why is PIT the best evidence? Because it does not involve any other manipulation
than extinction.

Interim summary

  Fear extinction results in an inhibitory CS-US (or a CS-noUS) association that
competes with the excitatory CS-US association formed during conditioning
 Fear extinction causes a reduction of conditioned responses
 Extinction is new learning (not erasure of memory, or forgetting)
 Restoration phenomena, such as spontaneous recovery, renewal, and reinstatement,
demonstrate that extinction is new learning, and that extinction is context specific

Neurobiology of extinction

  Contemporary models of fear extinction emphasise the importance of several brain
regions, most notably the BLA and the medial prefrontal cortex (mPFC), and the interactions
between them, in learning to inhibit fear

BLA inactivation – acquisition extinction context fear

 In this experiment rats were first given context fear conditioning, where they were
placed in a conditioning chamber and footshocks were delivered.
 The following day, they received extinction to the context, meaning they were placed
in the chamber and no shocks were delivered.
 However, immediately before this extinction session, half of the rats received an
intracranial infusion into the BLA of muscimol.
  Muscimol is a GABA-A receptor agonist, meaning it temporarily inactivates the structure by
suppressing neuronal activity in the region
 The other half of the rats received vehicle infusions.
 24 hours after extinction, rats were tested drug-free for fear to the context by
measuring freezing responses in the chamber.
 Results:
  The vehicle control group showed very little freezing when tested after
extinction  This demonstrates the long-term reduction in freezing produced by the
extinction session.
 rats that had muscimol infusions before extinction showed significantly more fear at
test  inactivating the BLA before extinction impaired extinction learning.
 These data provide evidence that activity in the BLA is required for the acquisition
of extinction of conditioned fear.

BLA inactivation – acquisition extinction discrete fear

 Further support for the role of the BLA in fear extinction comes from studies
examining the extinction of discrete cues.
 immediately before the extinction session, rats were infused into the BLA with either
muscimol or saline, which served as the control
 results:
 The arrow indicates the point of BLA infusion of saline or muscimol
 saline control rats (shown as the open circles) initially showed high levels of
freezing responses during extinction, indicating that fear conditioning the
previous day was successful – the rats feared the CS that had been paired with shock.
 These fear responses gradually decline, however, over the extinction session as the CS
is repeatedly presented without the shock
 By contrast, rats that had the BLA inactivated by the muscimol infusions showed
very little fear during the extinction session.
 This is consistent with the idea that activity in the BLA is necessary for the expression
of conditioned fear

 After extinction, the rats were tested the following day, drug-free.
 Results:
 Rats that had received the control infusions of saline the day before showed low levels
of fear during test, indicating that extinction successfully reduced fear to the CS in
this group.
 However, the rats that received muscimol infusions showed significantly more
freezing than the controls, indicating that BLA inactivation prevented the acquisition
of fear extinction.
 Thus, activity in the BLA is required for the acquisition of extinction of
conditioned fear.

BLA acquisition
  Many experimenters have investigated the specific mechanisms within the BLA that
are necessary for the formation of extinction memories. For example, one
experiment examined the role of NMDA receptors (NMDArs) in the BLA
 a context was paired with a footshock and then fear to the context was extinguished.
 Immediately before extinction, half of the rats received an infusion of a drug called
ifenprodil, which is an antagonist of the NMDAr  blocks function of NMDAr,
preventing the intracellular cascade of events required for memory formation
 other half were infused with a control vehicle.
 Results:
  Control rats showed high levels of freezing at the beginning of the session,
indicating that fear conditioning the previous day was successful, and we also see a
gradual decline in freezing across extinction
  Rats that received ifenprodil also showed high levels of fear at the beginning of the
session; however, these rats showed an impairment in the rate of extinction learning 
These rats were slower to extinguish than the vehicle control rats, indicating an
impairment in extinction learning due to the blockade of NMDAr
 This impairment persisted when the rats were tested drug free the following day
 Again, we see control rats showed low levels of freezing to the context. However, rats
that had the infusions of ifenprodil during extinction the previous day showed high
levels of freezing, indicating that blocking NMDAr activity before extinction
impaired this learning.
 Thus, NMDAr activity in the BLA is necessary for the acquisition of extinction of
conditioned fear.

BLA inactivation – consolidation extinction context fear

  Disrupting neuronal activity in the BLA after extinction has also been shown to impair
extinction
  These experimenters inactivated the BLA after extinction
 immediately after extinction was conducted, rats were infused into the BLA with
muscimol or a control vehicle
 24 hours later, the rats were tested drug-free for fear to the context
 Results:
  What was observed was higher levels of freezing in rats that had the BLA inactivated
after extinction,
  Demonstrating that activity in the BLA is necessary for consolidation of
extinction

BLA consolidation extinction NMDAr activity

  Although activity in the BLA is necessary for extinction consolidation, activation of
BLA NMDAr after extinction does not appear to be
  These experimenters used the same design as the previous experiment, except that the
NMDAr antagonist ifenprodil, rather than muscimol, was infused into the BLA after
extinction
  When rats were tested drug-free 24 hours after extinction learning, no impairment in
extinction learning caused by NMDAr blockade was observed
  Both the vehicle and ifenprodil rats froze at similarly low levels at test  Indicating
that NMDAr activation in the BLA is not required for consolidation of extinction of
conditioned fear

BLA- retrieval/expression of extinction

  Electrophysiological recordings have demonstrated that there are two distinct
populations of neurons in the BLA.
  One population responds to CS-US pairings and is often referred to as “fear
neurons”
  Another responds to the CS-only presentations during extinction, and is thus referred
to as “extinction neurons”
  Silencing these extinction neurons has been shown to impair extinction learning, but
not the expression of extinction
 Experiment:
  Extinction neurons were specifically targeted and silenced with muscimol during
either the acquisition of extinction or the retrieval of extinction
  Silencing during acquisition impaired extinction learning, as can be seen in the left hand panel of
this figure
  When the infusions occurred before extinction, vehicle treated rats showed
significantly less freezing than rats that had muscimol infusions into the BLA
 However, when extinction was conducted drug-free and the infusions occurred at test,
there was no difference in the expression of the extinction memory between the
groups  That is, silencing extinction neurons in the BLA did not impair the
expression of extinction
  Indicating that the BLA is not required for the retrieval and expression of
extinction

Interim summary
  Acquisition of extinction requires:
 Neuronal activity in the BLA
 NMDAr activation
 Consolidation of fear extinction requires:
 Activity in the BLA
 No NMDAr activation
 Expression of fear extinction
 Does not require BLA

The mPFC

  In addition to the BLA, the extinction of conditioned fear also involves the medial prefrontal cortex (mPFC).
  The rodent medial prefrontal cortex consists of two main subregions: the prelimbic cortex (PL) and the infralimbic cortex (IL).
  The PL has been shown to be critical in the expression and potentiation of
conditioned fear
  The IL has been implicated in the inhibition of fear responses during fear
extinction

IL/PL inactivation – fear extinction and expression
 A straightforward demonstration of the dissociable roles of the subregions of the
mPFC has been shown by inactivating each specific region before the extinction of
conditioned fear.
 Experiment:
 rats were conditioned to fear a CS, by pairing it with a footshock  extinguished
 Immediately before this extinction session, some rats were infused into the PL with
either muscimol or saline
 In other rats, muscimol or saline was infused into the IL.
 The rats were then tested for fear to the CS the following day drug-free.
 Results for PL
 The arrow represents the point of infusion, immediately before the extinction session
 PL inactivation caused a reduction in fear responses during the session, evidenced by
the lower levels of freezing in muscimol treated rats, which are shown as the black
circles  This is consistent with the view that the PL is important for expressing fear
 When the rats were tested the following day, PL inactivation had no effect on long-
term extinction learning  Rats that had the PL inactivated before extinction froze
as much as their saline controls
  Thus, the PL does not seem to be critical for the acquisition of the extinction
memory

 Results for IL:

 inactivation of the IL disrupted extinction learning
 Muscimol infused rats showed more fear across the extinction session than saline-
treated rats
 This impairment persisted when the rats were tested the following day drug-free
  Rats that had the IL inactivated during extinction showed more fear at test than the
saline controls

 Experiment suggests 2 points

1. The first is that the IL and PL have dissociable roles in the expression and inhibition of conditioned
fear.
2. The second is that the IL appears to be important for the acquisition of
extinction learning. However, we will see that the role of the IL in the acquisition of
extinction is still a matter of debate, and the effects seen during extinction may be
due to methodological considerations.

IL inactivation – extinction fear context

 researchers used context fear conditioning to assess the role of the IL in extinction
 Foot shocks were presented in a conditioning chamber  extinguished
 Immediately before extinction, they were infused into the IL with muscimol or saline.
 Results:
 no impairment caused by IL inactivation was observed during extinction
 Rats that had the IL inactivated with muscimol, showed the same reduction in
freezing across the extinction session as rats infused with saline
 Unsurprisingly, when rats were tested the following day, those that had the IL
inactivated during extinction showed impaired extinction learning: they froze more
than the saline controls when tested.

IL inactivation: acquisition?
 How do we account for these discrepancies in the 2 experiments?
 Recall that in one, infusions of muscimol into the IL produced an impairment in
extinction learning during the session, and in another, no impairment was seen during
the extinction session
 One explanation is that the effect of IL inactivation during extinction learning may
depend on the modality of the CS – in one case discrete CSs were used, and in the
other case context conditioning was used.
 What is important, however, is that in both instances, long-term extinction was
impaired when the rats were tested drug-free.
  In both experiments, muscimol was used. Muscimol has a relatively long half-life,
meaning it is metabolised slowly and stays in the brain for a long time. This means it is
still exerting its effects after the extinction session is finished.
 So, the muscimol infusions into the IL may impair extinction by disrupting
consolidation rather than acquisition, and there is substantial evidence to support the
view that the IL is critical for the consolidation of extinction

IL inactivation – consolidation context extinction

  Support for this view comes from studies manipulating neuronal activity after
extinction training
 Experiment:
 context was paired with shock  extinguished (conducted drug free)
 immediately after extinction, the IL was infused with either vehicle or muscimol to
temporarily inactivate the structure.
  Unsurprisingly, there were no differences between groups during extinction, as the infusions took place after
this session
 Rats were then tested drug-free the following day for fear to the context  What was
found at test was that muscimol infusions after extinction disrupted the long-term
reduction in fear produced by the extinction learning the day prior. Rats treated with
muscimol after extinction froze more at test than their vehicle controls,
  Providing support for the role of the IL in the consolidation of extinction of
conditioned fear

IL- protein synthesis blockade & consolidation extinction

 Further support for this view comes from other studies manipulating neuronal activity
after extinction training
 Experiment:
  Context paired with footshock  extinction (drug free)
  Immediately after extinction, the IL was infused with either vehicle, lidocaine or
anisomycin
  Lidocaine is a sodium channel blocker that effectively functions to inactivate the
structure
  Anisomycin is a protein synthesis inhibitor; as we have previously seen, protein synthesis is critical
in the consolidation of new memories into long-term memories.
 Rats were then tested 24 hours later drug-free for fear to the context.
  What was observed during this drug-free test was that rats that had the IL infusions of lidocaine or
anisomycin after extinction showed more fear than the controls
  This gives direct support for the view that activity and protein synthesis in the IL
after extinction learning are necessary for the consolidation of the extinction
memory

IL - NMDAr activity & consolidation

 Additionally, NMDAr activity in the IL appears to be necessary for the consolidation
of extinction
 Experiment
  Context conditioning and extinction were given
  After extinction, rats were infused into the IL with vehicle or ifenprodil, an NMDAr
antagonist, which means it blocks the function of the NMDAr, preventing the
intracellular cascade of events required for memory formation.
 during extinction, there were no differences between groups since the infusions
occurred after extinction
 Both groups showed high levels of fear at the beginning of extinction, suggesting fear
conditioning was successful and rats were scared of the context, and this fear declined
across the extinction session, indicating successful short-term extinction learning.
 However, when rats were tested 24 hours after extinction and the IL infusions, the rats
that had ifenprodil infusions after extinction showed more fear to the extinguished
context than the vehicle controls.
 Demonstrating that NMDAr activity in the IL is necessary for extinction consolidation

IL- retrieval extinction

 Electrophysiological recordings of IL neurons have also revealed its importance in the

retrieval of the extinction memory
 IL neurons were recorded from during extinction learning as well as during a
subsequent test of extinction. The bottom panel of this figure shows the firing of IL
neurons
 What can be seen from this figure is that firing of IL neurons only increased during
testing of the extinguished CS, which is seen in the bottom right panel
 There was no increase in firing in IL neurons during the acquisition of extinction, as
seen in the bottom middle panel.
 Furthermore, burst firing of IL neurons during extinction has been shown to be
correlated with better expression of extinction at test.
 Therefore, the IL is critical for retrieval/expression of extinction but not its
acquisition

IL inactivation – retrieval extinction

 Consistent with single unit recordings, muscimol-induced inactivation has shown that
the IL is involved in retrieval/expression of extinction.
 in this study, context fear conditioning and extinction were administered, and then rats
were infused into the IL with muscimol or saline before testing
 What was found was an impairment in the retrieval and expression of the extinction
memory caused by the IL inactivation. Muscimol-infused rats showed more fear
during the test than the vehicle-infused rats.
 Providing further support that the IL is necessary for the retrieval and expression of
the extinction memory

How does the IL inhibit fear?

 The findings described so far suggest that the IL inhibits conditioned fear responses to
an extinguished CS at test.
 This is thought to occur in two ways
1. Through its reciprocal projections to the extinction neurons in the BLA. We
saw that these neurons in the BLA are necessary for the acquisition of extinction
2. Through projections to a network of inhibitory (GABAergic) interneurons
located between the BLA and the CeA. These interneurons are called
intercalated cells (ITC)

Intercalated cells (ITC)

 The intercalated cells lie in between the BLA and the central nucleus of the amygdala
(CeA)
 Shown in the dark brown and bright red in the pictures
 The intercalated cells inhibit the CeA efferent neurons that project to the brainstem
and mediate conditioned fear responses.

Lesions of the ITC

 lesions of the intercalated cells disrupt the retrieval and expression of the fear
extinction memory
 experiment:
 intercalated cells were lesioned by infusions of a drug called D-Sap, the day after
extinction training
 D-Sap lesions the intercalated cells by targeting a neuropeptide that is specifically
expressed on the intercalated cells, but not the surrounding regions.
 Control rats were given a control substance called U-Sap, or D-Sap into the BLA
or CeA, but not the intercalated cells.
 When subsequently tested one week later, rats that had lesions of the intercalated
cells, seen as the red lines in the figure on the right, showed impaired extinction
retrieval
 These rats froze more than the controls that received U-Sap, and the controls that had
D-Sap infused into the BLA or CeA
 Demonstrating that the intercalated cells are necessary for the retrieval and expression
of extinction

Summary of extinction

 BLA
 Role in extinction: extinction neurons (acquisition, consolidation)
 Projections: Ext neurons: IL; Fear neurons:
 IL
 Role in extinction: consolidation, ret/exp
 Projections: ITC, Ext neurons
 ITC
 Role in extinction: ret/exp
 Projections: CeA fear neurons
 CeA
 Role in extinction: none (role in fear expression)

Neural model of extinction

Clinical implications
 Anxiety disorders
 Characterised by irrational fears and beliefs
 Lifetime prevalence rate of approx. 20% in Australia

 Cognitive behavioural therapy (CBT)

 Modelled after extinction
 Effective treatment for anxiety disorders
 Patient confronts trauma related cues in order to reduce or eliminate the ability of
these to elicit fear and associated avoidance that impair the quality of a patient’s life
 First, compliance is an issue: exposure therapy is itself a highly aversive experience
in which patients confront the source of their fear, an experience that many
subsequently avoid by dropping out of therapy
 Second, anxiety disorders are chronically relapsing conditions. Multiple sessions
are required for patients to learn to control their fear, and patients who have learned
this control are nevertheless still prone to relapse, requiring additional therapy
 For example, a patient may confront trauma-related cues in their therapist’s office.
Therapy may be successful in reducing fear and anxiety to these cues  however,
when confronted with these cues elsewhere, fear and anxiety may return. This is a
real-life example of fear renewal that is common to many who are undergoing therapy
and is the reason that therapists conduct sessions in multiple settings and use virtual
reality to mimic multiple contexts

 Delayed onset PTSD (spontaneous recovery)

 Return of PTSD symptoms after experiencing other stressors (reinstatement), e.g., death
of a loved one, financial hardship

Extinction in multiple contexts

 Understandably, there have been many attempts to find ways to reduce or eliminate
the incidence of fear relapse
 One common area of focus is extinction in multiple contexts; however, the data
suggesting that it prevents renewal have been inconclusive
 E.g., in one study, rats were extinguished by presenting the CS 144 times in one
context or in 3 different contexts
 When extinction occurred in 3 different contexts, renewal was eliminated.
 However, when extinction consisted of only 36 trials, the number of extinction
contexts made no difference in subsequent renewal, indicating that extensive
extinction is required in order for renewal to be eliminated
 Furthermore, other studies have not been so conclusive. For instance, one study
showed that extinction in multiple contexts produces more fear during extinction and
does not necessarily produce less renewal.
 Thus, inhibiting fear and reducing relapse are complex and require much further
research

Facilitating extinction
 To combat issues of compliance and relapse, pharmacological agents are often used
in conjunction with CBT
 Benzodiazepines and selective serotonin reuptake inhibitors (SSRIs) are first-line
medications prescribed for people suffering from anxiety disorders
 Such drugs alleviate many of the symptoms of anxiety disorders and reduce the
aversiveness of exposure therapy, thereby encouraging compliance.
 Meta-analysis of clinical trials has shown that such drugs are more effective in
the treatment of anxiety disorders than placebo

 However, comparison trials have also shown that CBT is more effective in
isolation than in combination with pharmacotherapy. But there is some evidence to
suggest that administration of some pharmacotherapies, such as benzodiazepines,
might be more effective in reducing fear once extinction has been established
drug-free

 D-cycloserine (DCS)
 Antibiotic that also acts as an NMDAr partial agonist
 Facilitates extinction learning in rodents
 In clinical trials it has been shown to improve symptoms of some anxiety disorders
and OCD (although may be less effective for PTSD)

 SSRI (selective serotonin reuptake inhibitors)

 Some SSRIs, e.g., fluoxetine, have also been shown to facilitate extinction and prevent
relapse in rodents, although there is conflicting evidence to suggest that other SSRIs
impair extinction in rodents
 Clearly, further evidence is needed to establish which pharmacotherapies should be
administered, and when, to aid treatment.


 Dopamine agonists produce psychotic symptoms and dopamine antagonists reduce
psychotic symptoms

 Lambda is what occurs

 EV = what we expect to occur
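
A minimal sketch of how lambda and EV combine into a prediction error, assuming the Rescorla-Wagner update rule (change in strength = alpha x (lambda - EV)). The learning rate alpha, the cue names A and B, the trial counts and the function name are illustrative assumptions, not values from the lectures. It also shows why blocking falls out of the rule: once cue A already predicts the outcome, EV is close to lambda, so there is almost no error left for cue B to learn from.

```python
def rescorla_wagner(trials, alpha=0.3, lam_present=1.0):
    """Return the associative strength of each cue after a list of trials.

    trials: list of (cues, outcome_occurs) pairs, where cues is a tuple of names.
    alpha: learning rate (assumed value, chosen only for illustration).
    lam_present: lambda on trials where the outcome occurs (0 otherwise).
    """
    strengths = {}
    for cues, outcome_occurs in trials:
        lam = lam_present if outcome_occurs else 0.0          # what occurs
        ev = sum(strengths.get(cue, 0.0) for cue in cues)     # what is expected
        error = lam - ev                                       # prediction error
        for cue in cues:
            # Every cue present on the trial shares the same error term.
            strengths[cue] = strengths.get(cue, 0.0) + alpha * error
    return strengths


# Blocking group: A alone predicts the outcome, then A + B predict the outcome.
blocked = rescorla_wagner([(("A",), True)] * 50 + [(("A", "B"), True)] * 50)

# Control group: only the A + B compound is ever trained.
control = rescorla_wagner([(("A", "B"), True)] * 50)

print("Blocking group, strength of B:", round(blocked["B"], 3))
print("Control group, strength of B:", round(control["B"], 3))
```

In the blocking group B ends up with almost no associative strength, whereas in the control group A and B share the available strength roughly equally.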


 Increased levels of maternal care increase the levels of the glucocorticoid receptor
 Sensory specific satiety is used to assess whether behaviours are goal-directed (GD) or
habitual
 Example of blocking effect – Croissant predicts hug  croissant + cheese predicts
hug  cheese is not learned as a predictor of the hug
 AutoShaping --> consistently perform behaviour after reinforcement, even if it is

