Zhong-Lin Lu, Barbara Dosher-Visual Psychophysics - From Laboratory To Theory-The MIT Press (2013)
Visual Psychophysics
All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means
(including photocopying, recording, or information storage and retrieval) without permission in writing from the
publisher.
MIT Press books may be purchased at special quantity discounts for business or sales promotional use. For
information, please email [email protected] or write to Special Sales Department, The MIT Press,
55 Hayward Street, Cambridge, MA 02142.
This book was set in Syntax and Times New Roman by Toppan Best-set Premedia Limited, Hong Kong. Printed
and bound in the United States of America.
Lu, Zhong-Lin.
Visual psychophysics : from laboratory to theory / Zhong-Lin Lu and Barbara Dosher.
pages cm
Includes bibliographical references and index.
ISBN 978-0-262-01945-3 (hardcover : alk. paper)
1. Vision. 2. Psychophysics. 3. Visual perception. 4. Image processing. I. Dosher, Barbara, 1951–
II. Title.
QP475.L776 2013
612.8'4–dc23
2013002296
10 9 8 7 6 5 4 3 2 1
As a matter of course, we cannot in general deny that the mind is subject to quantitative principles.
This is because apart from distinguishing stronger and weaker sensations, we can also distinguish
stronger and weaker intensities of drive, higher and lower degrees of attention or vividness of recollections and fantasies, as well as different stages of consciousness and different intensities of individual thoughts. . . . Consequently, the higher mental processes can – in much the same way as sensory processes – be measured quantitatively, and the activity of the mind can be measured quantitatively in its entirety as well as in its components.
Gustav Theodor Fechner, 1860
Contents
Preface ix
Acknowledgments xiii
3 Generating Stimuli 27
4 Stimulus Presentation 61
5 Visual Displays 109
6 Response Collection 161
7 Scaling 199
8 Sensitivity and Signal-Detection Theory 221
9 Observer Models 267
Index 443
Preface
We wrote this book for anyone who wants to set up or work in a visual psychophysics
laboratory and to develop theories and models of visual functions. Our aim was to write
a book useful for beginning researchers and students and for experienced researchers or
clinicians in allied areas.
Vision science is one of the most important areas of biomedical research. Practitioners
in this area include those in basic science disciplines such as psychology, neuroscience,
and computer sciences and those in clinical and applied areas of optometry, ophthalmol-
ogy, and biomedical engineering. Visual psychophysical techniques are one of the foun-
dational methodologies for this research enterprise.
Our own experience is that every time a new person joins our laboratories, whether he
or she is an undergraduate, graduate student, or postdoc from another training area, we
need to teach this new colleague a set of skills and techniques for visual psychophysics.
We have done this again and again on an individual basis. Although we pointed to books
and articles for our new laboratory members to consult, it was difficult to find integrated
coverage of all the information and all the skills that were needed to carry out psychophysi-
cal research independently.
Our intention in writing this book was to provide the coverage for teaching a new vision
scientist the necessary techniques in visual psychophysics and to orient him or her to the
basic tools and the capabilities and limitations of these methodologies. Visual psychophys-
ics is a classical field but has widespread applications in modern vision science. Our hope
is that a new vision scientist, after studying this book, will have the skills to apply visual
psychophysics to cutting edge vision research.
In this book, we focus on bridging the gap between theory and practice with suffi-
cient detail and coverage to take the reader from the initial design and practicalities of
understanding display devices and how to program an experiment all the way to analy-
sis of psychophysical data in the context of example models and theories. We cover the
basic software and hardware specifications for visual psychophysical experiments,
including an extensive treatment of calibration procedures. Issues of precise timing and
integration of displays with measurements of brain activities (electroencephalography,
functional magnetic resonance imaging) and other relevant techniques are considered. A
section on psychophysical methodologies includes many classic and modern experi-
mental designs and data analysis techniques. The treatment of experimental design
incorporates the most commonly used psychophysical paradigms and the extension of
these basic paradigms in sophisticated, theory-driven psychophysical experiments. All
these procedures are considered within a signal-detection theory framework. We also
cover representative data analysis, model fitting, and model comparison, as well as
sampling methods and computations for estimating the variability of parameters and the
statistical power in complex psychophysical designs. We analyze many representative
examples of new adaptive testing methods and provide a general framework for the
development of adaptive testing designs. We end the book with sample applications and
potential future directions in combining psychophysics with neuroscience techniques, or
neuropsychophysics.
In writing the book, we aimed to integrate the practical with the theoretical. We use
contemporary examples and provide about 100 sample programs written in MATLAB.
The experimental programs also use the subroutines provided by Psychtoolbox Version 3
(Psychtoolbox-3), which is a public-domain toolbox for real-time experimental control.
Psychtoolbox is used in the generation and presentation of stimuli, in timing and synchro-
nization, and in the collection and recording of data.
We designed the book to be used interactively by alternately reading, working through,
and executing sample programs. The reader will need access to a computer with MATLAB
and Psychtoolbox-3. Psychtoolbox-3 can be downloaded and installed, and procedures to
do this are described on the Psychtoolbox website (http://psychtoolbox.org/HomePage).
The sample programs provided in the book are available for downloading from the
website mitpress.mit.edu/visualpsychophysics. This should allow the reader to see some
experiments in action. It also provides a programming core that can be modified for the
reader's own use. As each new topic is covered, we have tried to incorporate citations to
serve as pointers to the relevant literature and as the starting point for more extensive
investigations.
Several items, described and provided on the book's web page, need to be downloaded
to execute the programs provided in the book (i.e., routines ReadKey.m, WaitTill.m,
the image Church.jpg, and the video file introhigh.mpg). We encourage the reader
to make extensive use of the online help functions in MATLAB. Many useful discussions
and information related to the real-time functions of Psychtoolbox are available on the
Psychtoolbox website.
This new book project has taken us almost two years to complete. Writing the book has
left us more excited about the future of visual psychophysics and the related theoretical
and computational approaches than when we began. Psychophysics had its origins more
than a century ago, and, like the integration of mathematics into many areas of science,
the methods and theories of visual psychophysics have been integrated into many areas of
biomedical research related to human behavior. The role of visual psychophysics has not
diminished but has become stronger through the integration with other methods and
modern measurement tools. From the use of visual psychophysical paradigms in the mea-
surement of brain activity to applications in clinical testing and system design, we are
seeing new applications of visual psychophysics in areas not even imagined by early
practitioners. We hope that by assisting in the training of new researchers, this book will
make a small contribution to the exciting new frontiers in biomedical research.
Acknowledgments
We salute the contributions of the many scientists who developed the field of visual psy-
chophysics and the applications of signal-detection theory. These include the seminal
contributions of historical figures but also the inspiring work of many contemporary sci-
entists. In a book like this, we cannot possibly have cited all of the relevant sources. We
request the tolerance of those whose citations we may have overlooked.
We especially acknowledge the contributions of the developers of Psychtoolbox, Dennis
Pelli, David Brainard, and the large and active community of current Psychtoolbox
programmers.
Zhong-Lin would like to thank Sam Williamson and Lloyd Kaufman for their introduc-
tion to scientific research. Barbara would like to thank Wayne Wickelgren for insights and
training in the process of developing a theory and the sophisticated application of signal-
detection theory in human memory and cognition. We both wish to take this opportunity
to especially thank George Sperling for his introduction to the world of visual psychophys-
ics and his many inspiring contributions to the field.
Many individuals assisted in the completion of the book project itself. Special thanks are
due to Xiangrui Li for his help with experimental programs and with several figures. Fang
Hou also tested some of the experimental programs. We thank Jongsoo Baek who re-drew
several figures and assisted in the reformatting of the text for submission. Several individu-
als assisted in the creation of single figures, including Greg Appelbaum, Yukai Zhou, and
Mae Lu. Rachel Miller provided useful comments on several chapters of an earlier draft.
Lina Hernandez and Stephanie Fowler coordinated printouts and practical details.
We wish to thank the highly professional staff of the MIT Press, including Bob Prior,
executive editor, and Susan Buckley, acquisitions editor. We also wish to thank those who
assisted with the copyediting and production, including Christopher Curioli, Katherine A.
Almeida, Sharon Deacon Warne, and Kate Elwell.
We would each also like to thank the current and former members of our laboratories
for their many contributions to our research and for the opportunity to work with them.
Our interactions with them provided much of the inspiration for this book.
Thanks to my parents, Daode Zhong and Hongchun Lu, for their unconditional love.
Thanks to my wife, Wei Sun, son, James, and daughter, Mae, for their understanding,
tolerance, and accommodation of my busy work schedule. I would also like to acknowl-
edge all my collaborators and my colleagues of the Departments of Psychology of The
Ohio State University and of the University of Southern California for numerous scientific
discussions and for support.
Zhong-Lin Lu
Barbara Dosher
I INTRODUCTION TO VISUAL PSYCHOPHYSICS
1 Understanding Human Visual Function with Psychophysics
Humans are visual creatures. The amazing abilities of human vision far exceed the capa-
bilities of the most sophisticated machines and may inspire new machine design. Human
visual capabilities and processes can only be known through an interdisciplinary approach
that combines visual psychophysics, understanding of brain responses, and computational
models of visual function. Visual psychophysics studies the relationship between the physical stimulus in the outside world and human performance. It has
played a central role in the understanding of human visual capabilities and the brain. In
addition, the assessment of human vision through applications of human visual psycho-
physics has been the core discipline behind the design of standards for visual devices and
has many other practical applications.
Survival and reproduction of living organisms require interactions with other organisms and adaptation to environments. Millions of years of evolution have developed five senses in humans and other higher organisms that provide the doorways to interaction with the environment. Of the five major senses – taste, smell, touch, hearing, and sight – sight is perhaps the most highly developed in humans and the most important.1–4 More than 50% of the cerebral cortex of the human brain is involved with processing of visual inputs.5–7
Vision begins with light that impinges on the eyes. In humans, the components of the
visual system are the eye, including the retina, the optic nerves, the lateral geniculate
nucleus (LGN), primary visual cortex, and other visual association cortices (figure 1.1,
plate 1).8,9
The cornea and lens of the eye act as a compound lens that projects an inverted image
of the visual input onto photoreceptors on the retina at the back of the eye.10 Light-
sensitive photoreceptors are more densely packed close to the fovea or center of gaze
where vision is best.11 The visible light spectrum for humans corresponds to approximately 390- to 750-nm wavelengths of the electromagnetic spectrum. Rods respond to most
[Figure 1.1 (plate 1) The human visual system. (a) The pathway from the eye and retina through the optic nerve and LGN to the occipital lobe. (b) Stages of cortical visual processing (V1, V2, V3/V3a, V4, MT, MST, VIP, PIT, AIT, and others), including the ventral temporal-lobe "what" pathway for form and color leading to object representation.]
visible wavelengths of light and serve vision in low illumination, while several families of cone receptors with differential wavelength or color sensitivity support vision in daylight conditions.12–14
The photoreceptors convert light energy into electrical signals that are further processed in the retina and then transmitted through the axons of approximately one million neurons to the cell bodies of neurons of the LGN in the thalamus.15 Axons from the LGN distribute information to the primary visual cortex in the striate cortex of the occipital lobe and then onward to a series of interconnected stages of visual processing.16 Much of the primary visual cortex has a retinotopic organization that preserves the spatial organization or topology of the visual image.17–20 In creating a cortical representation of the world, the visual system represents images that stimulate adjacent parts of the retina with adjacent neurons.
Figure 1.1b diagrams our current understanding of the stages of visual processing.21 Each stage of visual processing carries out different computations and is sensitive to different aspects of the stimulus. Surprisingly, the visual system first breaks an image into features or cues and then binds the encoded aspects of objects – such as color, depth, or motion – back together to form our percepts of the world.22
Vision supports many critical functions. Animals that can see well are better adapted for finding food and mates and for avoiding dangers.23 Visual cues are used in signaling
for reproduction and attractiveness.24,25 Vision is important in navigation through the local
terrain and integrates with motor systems in the efficient guidance of reaching, grasping,
or pushing in physical interaction with the world.26,27
The importance of clear vision can be even better appreciated from the challenges for
individuals with major visual impairments. Imagine navigating even your own home in
complete darkness. Try making a cup of tea with your eyes closed. Visual diseases chal-
lenge individuals and create medical and economic challenges for society. An estimated
2.4 million Americans over the age of 40 years have low vision, and an estimated 1 million
Americans over the age of 40 are legally blind. The total annual economic impact of adult
vision problems in the United States exceeds $50 billion.28,29
Vision also supports many higher-level mental processes. Often, visual images or visual details are integral to our memory of places and people.30–32 Humans use visual imagery to construct mental models of the environment.33,34 Mental imagery may also be used to help manipulate abstract relations and concepts.35–38
For humans, vision is increasingly important for communication through reading, visual
media, and device interaction.39,40 For most of us, the modern world involves interaction
with visual devices such as iPhones or computers. Vision drives much of the human appre-
ciation of art and the enjoyment of the environment.
For all these and other reasons, vision is the subject of extensive basic, applied, and
clinical research. How the brain processes visual stimuli and integrates the visual world
with the auditory and tactile inputs, what kinds of displays lead to what kinds of
perception, the nature of the impairments in low vision, and many other issues are actively
studied today.
Vision research is one of the important areas of biomedical research. Practitioners in this
area include those in basic science disciplines such as psychology, neuroscience, and
computer science and those in clinical and applied areas of optometry, ophthalmology, and
biomedical engineering. Visual psychophysical techniques are one of the foundational
methodologies in this area.
All of these cues to the visual world are encountered in every instant of our waking
lives. As we write this chapter in our local coffee shop (figure 1.2, plate 2), all of these
cues, and many more, create our perception of the environment.
The light in the shop combines full-spectrum daylight from a cloudless California
summer sky passing through partly shaded windows behind the image and light from a
collection of large and small standard indoor floodlights. The illumination is reflected from
mustard yellow walls, matte-black surfaces, and metal fixtures. There are specular reflec-
tions from glass and matte reflections from the clothing, hair, and skin of the customers.
Each object has spatial textures characteristic of those materials. Larger texture patterns,
such as the woven chair back, are seen in larger-scale spatial patterns of the light array.
Clusters in the light array are recognizable as objects or people. We have purposely not
included faces – to guarantee anonymity for reasons of privacy. The spatial arrangements
of objects are conveyed through the spatial array of light interpreted through perspective
and the complex decoding of light reflections between objects. As we enter or leave the
coffee shop, our locomotion splashes our retinas with systematic flows of the spatial array,
or optic flow.
Plate 2 shows the original color photograph of the coffee shop (upper left). A black-
and-white version preserves the luminance distribution without color (upper right). Edges
extracted from the spatial array of light approximate the outlines of objects (lower left).
The average color in each region emphasizes color without detailed texture (lower right).
The visual system processes the visual image in these ways and many more. Then, the
visual system pieces the puzzle back together.
in the LGN, for example, responded positively to red inputs but negatively to green inputs and were labeled +R −G cells, and so on. So both models identified through visual psychophysics were implemented in biology: trichromacy in photopigments of the retina and opponent
processing combining these inputs in the LGN. In both cases, the functional understanding
from psychophysics set the stage for the biological investigation.
Computational models of color vision incorporate our understanding of biological aspects of color and aim to predict major phenomena of color appearance, including color matching, context-dependence of color perception, and color constancy, which is the constancy of color perception in the face of changing lighting.63–65
As you can see, the case of color vision starts with the physics of the stimulus and ends
with functional human behavior. Although our understanding of color vision is not perfect
and color vision is still under active investigation, discoveries from visual psychophysics
and physiology have been tremendously useful in application to technology. These have
led to international standards of color, namely the 1931 Commission Internationale de l'Éclairage (CIE) standards66 based on the trichromatic theory of color vision. The CIE standards motivate the use of three phosphors (red, green, blue) in color computer monitors, three-color bases of some printing processes, and so forth.
Precise characterization of the physical stimulus, psychophysical experimentation, neu-
rophysiology, and computational modeling are all important tools used toward understand-
ing of human visual function. A good understanding of human visual function can lead to
powerful applications.
behavior. A multiplicity of methods and tools are used in biological investigation of the
visual brain. These range from molecular biology of subcellular reactions to brain-imaging
techniques that measure the responses of populations of neurons in different spatial regions
of the brain. Each one of these techniques contributes to understanding the puzzle of visual
perception leading to detailed descriptions of anatomy, behavior, and function of the bio-
logical hardware that implements visual functions.
Computational models are also critical in elaborating our understanding of visual func-
tion. At the psychophysical level, they provide quantitative descriptions of the relationship
between the physical environment and human behavior. At the neuronal level, they provide
candidate representations of stimuli and relate these representations to behavior. One major
role played by computational models is to bridge the gaps between different levels of
description and to provide an understanding of how different systems, or populations of
units, work together to bring about behavior. These models make explicit the interactions
between representation and activity at different levels of the system and how that informa-
tion is used to determine behavior. Computational models must be constrained by all the
known facts of both biological and psychophysical experimentation. They should provide
testable predictions.
To summarize, the goal of scientific investigation of visual function is to understand
the response of a functioning goal-directed organism operating in the visual environ-
ment. In many ways, these investigations begin and end with visual psychophysics. It is
visual psychophysics that presents the first functional understanding of behavior. Bio-
logical measurements are increasingly made in the context of awake and behaving
organisms to guarantee that the measured properties of the hardware are relevant to the
behavior. Finally, the goal of computational models is to predict the functional architec-
ture of the visual system and the laws of response that are inferred from the visual
psychophysics.
The publication of Elemente der Psychophysik by Gustav Theodor Fechner in 1860 marked
the official beginning of a new scientific study of psychology that Fechner called psy-
chophysics.67 The new scientific discipline incorporated contributions of many other
scientists, most famously including the physiologist Ernst Heinrich Weber,68 the physiologist and surgeon Hermann Ludwig Ferdinand von Helmholtz,69 and Fechner's psychologist colleague Wilhelm Maximilian Wundt.70 In counterpoint to philosophers such as Kant71
who questioned whether psychology was accessible to scientific study, from the very
beginning the aim of psychophysics was to formulate a quantitative relationship between
the physical world and human perception. It is the quantification of perceptual experience,
or the relationship between perception and the physical world, that is the foundational goal
of the psychophysics enterprise.
Scaling methods were introduced to measure the perceived intensity or perceived mag-
nitude of physical variation. At the same time, methods were developed to infer the
minimal threshold levels to perceive a stimulus, to perceive the identity of the stimulus,
or to perceive a difference between two stimuli.67 Over the subsequent 150 years, these methods were refined and new measurement techniques were developed.72–76 The methods, analyses of
data, theories, and models of psychophysics have also become core to the scientific study
of many areas of psychology. They have influenced the study of learning and memory, the
measurement of attitudes and beliefs, and social and affective psychology. Psychophysical
methods have also been extended and applied in areas of engineering, quality control, and risk management – areas far from the origins in human perception. Visual psychophysics
has had direct applications in areas of human factors, individual differences, and clinical
vision (see chapter 13).
From its inception, pioneers in psychophysics recognized the strong relationship between
physiology and behavior. Fechner recognized the importance not just of the neurophysiol-
ogy (the relationship between the stimulus and neural activity) but also of what he termed
the inner psychophysics (the relationship between neural activity and perception).67 With
the modern advancement of techniques in neurophysiology and brain imaging, the study
of the relation extending from physical stimulus to neural activity to response, which we
term neuropsychophysics, is entering a new phase of scientific inquiry into brain and mind.
1.3.1 Scaling
Scaling is a collection of psychophysical methods that quantify the relationship between
perceptual intensity or perceptual qualities and physical manipulations of the stimulus.77
For example, people might be asked to assign a number from 1 to 100 to reflect the per-
ceived intensity of stimuli varying in lightness.78 Psychophysical approaches to scaling
have been applied not just in vision but also in many other modalities to measure percep-
tions of sound, weight of objects, taste, or even emotion.
In the simplest cases, such as the perception of light intensity, the stimulus and the
perception both vary along a single dimension, or are unidimensional. Almost all such
cases show lawful input–output relationships, such as power functions.79 In other cases,
physical variations along a single physical dimension create perceptions that are more
complicated and must be characterized in multidimensional space. One example is in color
perception (see earlier), where physical variations along a single dimension of the wave-
length of light are perceived by three classes of receptors and so perception of color is
encoded in a three-dimensional space.80 In other cases, physical manipulations along
multiple dimensions result in a perception that is characterized in a single dimension. An
example of this is the manipulation of visual intensity and duration, which combine, for short durations, into a unidimensional percept of intensity.81 In the most general case,
stimulus variations along multiple dimensions are mapped into perceptions that are char-
acterized in multidimensional spaces. For example, faces can differ in age, gender, and
eye width, and judgments of similarity between faces reflect distances in a multidimen-
sional similarity space.82 In cross-modal scaling, observers are asked to relate perception
in one modality to that in another modality. Observers may judge whether a light intensity
is more or less than the intensity of a sound.83
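The lawful power-function relationships mentioned above can be made concrete with a short sketch. (The book's own examples use MATLAB; this illustration is in Python, and the exponent value is purely illustrative, not a measured one.) A power law is a straight line on log-log axes, which is how such relationships are usually identified in scaling data:

```python
import math

def stevens_power_law(intensity, k=1.0, a=0.33):
    """Perceived magnitude under a power law S = k * I**a.
    The exponent a = 0.33 is an illustrative, brightness-like value."""
    return k * intensity ** a

# A power law is linear on log-log axes: log S = log k + a * log I.
intensities = [1, 10, 100, 1000]
log_I = [math.log10(i) for i in intensities]
log_S = [math.log10(stevens_power_law(i)) for i in intensities]

# The slope between successive log-log points recovers the exponent a.
slopes = [(log_S[j + 1] - log_S[j]) / (log_I[j + 1] - log_I[j])
          for j in range(len(log_I) - 1)]
print(slopes)  # each slope equals the exponent a (0.33, up to rounding)
```

Fitting a line to log-transformed magnitude estimates is the classic way of estimating the exponent from scaling data; an exponent below 1 describes a compressive percept, above 1 an expansive one.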
Psychophysical scaling investigations have led to discoveries that have been widely
incorporated into everyday life. For example, understanding of the color space of percep-
tion is directly reflected in the paint color wheels that relate color chips by hue and by
saturation, where a particular strip shows colors of increasing saturation or intensity of
color, and adjacencies in the wheel of strips express color similarities.
The visual psychophysics of scaling provides a quantitative specification of the lawful
relationships between physical variations of the stimulus and of the percept.67,79 It also
helps to identify the relevant dimensions in both physical stimulus and in the perceptual
system.14
1.3.2 Sensitivity
Another major focus in psychophysics is measurement of the sensitivity of the perceptual
system. Sensitivity refers to the minimum physical stimulus that is just detectable or the
minimum physical difference between stimuli that is just detectable – the absolute or difference thresholds.67,68 Sensitivity measurements and theories about sensitivity and signal
detection have been a foundational part of psychophysics that estimates and describes the
limitations of the sensory system.84 Most people are familiar with visual acuity tasks such
as eye charts that are used by optometrists or ophthalmologists. These charts measure the
finest detail a person can see. Visual acuity tests are still the most widely used measures
of functional vision in practical applications.
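In practice, a threshold is defined as the stimulus level at which performance reaches a criterion point on a psychometric function. A minimal sketch (in Python here; the Weibull-style form and all parameter values are illustrative assumptions, not the book's own code) shows the idea of inverting the function to read off a threshold:

```python
import math

def psychometric(c, alpha=0.05, beta=2.0, gamma=0.5):
    """Proportion correct in a two-alternative task as a function of
    stimulus contrast c: a Weibull-style function with guessing rate
    gamma, threshold parameter alpha, and slope beta (illustrative)."""
    return gamma + (1 - gamma) * (1 - math.exp(-(c / alpha) ** beta))

def threshold(p_criterion=0.75, alpha=0.05, beta=2.0, gamma=0.5):
    """Invert the psychometric function: the contrast yielding the
    criterion proportion correct."""
    # Solve gamma + (1 - gamma) * (1 - exp(-(c/alpha)**beta)) = p for c.
    inner = -math.log(1 - (p_criterion - gamma) / (1 - gamma))
    return alpha * inner ** (1 / beta)

c75 = threshold(0.75)
# Performance at the recovered contrast equals the criterion level.
print(round(psychometric(c75), 6))  # 0.75
```

The same logic underlies threshold estimation from real data, except that the function's parameters must first be fit to the observed trial-by-trial responses.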
Often, sensitivities or thresholds are measured as a function of systematic manipulations
of the physical characteristics of the stimulus, and it is this more detailed description of
the operation under variation that has become fundamental to understanding of sensory
systems. In the laboratory, perhaps the best-known example of a sensitivity function is the
contrast sensitivity function (CSF).85,86 Contrast is the ratio between the luminance deviation from the background and the mean luminance; it reflects the size of the differences
between the dark and light parts of a stimulus. The CSF measures contrast threshold as a
function of the spatial frequency of sinusoidal gratings. This is the minimum visible con-
trast as a function of the coarseness or fineness of the pattern. The CSF provides a good
test of the limitations of the early visual system. It is used in modeling behavior in complex
tasks and can summarize the properties and limitations of the first stages of visual
neurons.87,88 Measurements of the CSF also provide one of the best ways to characterize
loss of visual capability or deficits in functional vision due to visual diseases.89
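The contrast definition above can be computed directly. For a sinusoidal grating, the luminance deviation from the background is the grating's amplitude, and dividing by the mean luminance gives the Michelson contrast, (Lmax − Lmin)/(Lmax + Lmin). A quick sketch (Python, with made-up luminance values for illustration):

```python
def michelson_contrast(l_max, l_min):
    """Michelson contrast of a grating: the luminance deviation from
    the background (amplitude) divided by the mean luminance, which
    simplifies to (Lmax - Lmin) / (Lmax + Lmin)."""
    return (l_max - l_min) / (l_max + l_min)

# A grating swinging between 30 and 70 cd/m^2 about a 50 cd/m^2
# background (illustrative values): amplitude 20 over mean 50.
print(michelson_contrast(70, 30))  # 0.4
```

Contrast sensitivity is the reciprocal of the contrast threshold, so a CSF plots 1/threshold against the spatial frequency of such gratings.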
Sensitivity or thresholds may also be measured in multiple physical dimensions to
evaluate a sensitivity surface as a function of two or more variables. For example, we may
measure sensitivity for combinations of spatial frequency and temporal frequency (time
variation) of the stimulus.90
1.3.3 Neuropsychophysics
With major advancements in neuroscience and brain imaging, it is increasingly possible
to identify candidate neural systems that mediate perception of the stimulus, selection and
generation of response, and implementation of a goal structure for behavior. This leads to
the possibility of controlling the physical variations of the stimulus, measuring the neural
responses to those variations, and also measuring the overt behavior of the animal or the
person. This provides a platform to conduct neuropsychophysics, which seeks to under-
stand the complete relationship between the physical stimulus, the neural responses, and
the behavior. Moreover, in some cases it may be possible to directly manipulate aspects
of the neural responses, thereby creating new tests of causal relationships in this chain.93,94
Single-unit recording measures the neural responses of a few selected neurons. Multi-
electrode arrays measure the simultaneous neural responses of sets of neurons. Brain-
imaging techniques such as electroencephalography (EEG),95 magnetoencephalography
(MEG),96 and functional magnetic resonance imaging (fMRI)97 can measure responses of
populations of neurons in psychophysical experiments. Current injection techniques in
animals or transcranial magnetic stimulation (TMS) or transcranial direct current stimulation (tDCS) may alter the response properties of either neurons or brain regions, providing
direct causal tests of cognitive or neuropsychophysical models of the brain.98,99
By combining psychophysical methods, physiological investigation, and computational
modeling in neuropsychophysics, we can understand the relationship between the physical stimulus and the internal representation, and the relationship between the internal representation and the behavior, closing the full loop from stimulus to response.
This book, Visual Psychophysics: From Laboratory to Theory, aims to provide a compre-
hensive treatment of visual psychophysics. Bridging the gap between theory and practice,
the book begins with practical information about how to set up a vision lab and integrates
this with the creation, manipulation, and display of visual images, experimental designs,
and estimation of behavioral functions. The treatment of experimental design incorporates
the most commonly used psychophysical paradigms, applications of these basic paradigms
in sophisticated, theory-driven psychophysical experiments, and the analysis of these
procedures and data in a signal-detection theory framework.
The book provides examples of commonly used psychophysical paradigms, extensions
to modern adaptive methods, and procedures to measure some of the most important visual
sensitivity functions and surfaces.
The book treats the theoretical underpinnings of data analysis and scientific interpreta-
tion. Data analysis approaches include model fitting, model comparison, and examples of
optimized adaptive testing methods. The treatment also includes methods to synchronize
visual displays with measurements of brain activities such as EEG or fMRI. The book
includes many sample programs in MATLAB100 with functions from Psychtoolbox,101,102
which is a public-domain toolbox for real-time experimental control.
The book ends with a discussion of the important questions that can be addressed using the laboratory methods and theoretical testing principles covered in the book, and of how to extend and integrate this methodology with other tools to study important questions in human factors and in biomedical and clinical research.
References
1. Gregory RL. Eye and brain: The psychology of seeing. Princeton, NJ: Princeton University Press; 1997.
2. Wandell BA. Foundations of vision. Sunderland, MA: Sinauer; 1995.
3. Bruce V, Green PR, Georgeson MA. Visual perception: Physiology, psychology, & ecology. New York: Psychology Press; 2003.
4. Palmer SE. Vision science: Photons to phenomenology. Cambridge, MA: MIT Press; 1999.
5. Zeki SM. 1978. Functional specialisation in the visual cortex of the rhesus monkey. Nature 274(5670): 423–428.
6. Allman JM, Kaas JH. 1974. The organization of the second visual area (V II) in the owl monkey: A second order transformation of the visual hemifield. Brain Res 76(2): 247–265.
7. Felleman DJ, Van Essen DC. 1991. Distributed hierarchical processing in the primate cerebral cortex. Cereb Cortex 1(1): 1–47.
8. Polyak S, Kluver H. The vertebrate visual system. Chicago: University of Chicago Press; 1968.
9. Hubel DH, Wensveen J, Wick B. Eye, brain, and vision. New York: Scientific American Library; 1988.
10. Atchison DA, Smith G. Optics of the human eye. Boston: Butterworth-Heinemann Medical; 2000.
11. Curcio CA, Sloan KR, Kalina RE, Hendrickson AE. 1990. Human photoreceptor topography. J Comp Neurol 292(4): 497–523.
12. Cornsweet TN. Visual perception. New York: Academic Press; 1974.
13. Kaiser PK, Boynton RM, Swanson WH. Human color vision, 2nd ed. Washington, DC: Optical Society of America; 1996.
14. Wyszecki G, Stiles WS. Color science: Concepts and methods, quantitative data and formulas. New York: Wiley; 1967.
15. Brodal P. The central nervous system: Structure and function. Oxford: Oxford University Press; 2003.
16. Schiller PH. 1986. The central visual system. Vision Res 26(9): 1351–1386.
17. Holmes G. 1918. Disturbances of visual orientation. Br J Ophthalmol 2(9): 449–468.
18. Holmes G. 1945. Ferrier Lecture: The organization of the visual cortex in man. Proc R Soc Lond B Biol Sci 132: 348–361.
19. Horton JC, Hoyt WF. 1991. The representation of the visual field in human striate cortex: A revision of the classic Holmes map. Arch Ophthalmol 109(6): 816–824.
20. Engel SA, Glover GH, Wandell BA. 1997. Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cereb Cortex 7(2): 181–192.
21. Movshon A. Visual processing of moving images. In: Barlow H, Blakemore C, Weston-Smith M, eds. Images and understanding: Thoughts about images, ideas about understanding. New York: Cambridge University Press; 1990: pp. 122–137.
22. Treisman A. 1996. The binding problem. Curr Opin Neurobiol 6(2): 171–178.
23. Goldsmith TH. 2006. What birds see. Sci Am 295(1): 68–75.
24. Walster E, Aronson V, Abrahams D, Rottman L. 1966. Importance of physical attractiveness in dating behavior. J Pers Soc Psychol 4(5): 508–516.
25. Thornhill R, Gangestad SW. 1999. Facial attractiveness. Trends Cogn Sci 3(12): 452–460.
26. Goodale MA, Milner AD. 1992. Separate visual pathways for perception and action. Trends Neurosci 15(1): 20–25.
27. Loomis JM, Da Silva JA, Fujita N, Fukusima SS. 1992. Visual space perception and visually directed action. J Exp Psychol Hum Percept Perform 18(4): 906–921.
28. Rein DB, Zhang P, Wirth KE, et al. 2006. The economic burden of major adult visual disorders in the United States. Arch Ophthalmol 124(12): 1754–1760.
29. Frick KD, Gower EW, Kempen JH, Wolff JL. 2007. Economic impact of visual impairment and blindness in the United States. Arch Ophthalmol 125(4): 544–550.
30. Perky CW. 1910. An experimental study of imagination. Am J Psychol 21(3): 422–452.
31. Kosslyn SM. Image and mind. Cambridge, MA: Harvard University Press; 1980.
32. Pylyshyn ZW. 1973. What the mind's eye tells the mind's brain: A critique of mental imagery. Psychol Bull 80(1): 1–24.
33. McNamara TP. 1986. Mental representations of spatial relations. Cognit Psychol 18(1): 87–121.
34. Maguire EA, Gadian DG, Johnsrude IS, et al. 2000. Navigation-related structural change in the hippocampi of taxi drivers. Proc Natl Acad Sci USA 97(8): 4398–4403.
35. Piaget J. 1972. Intellectual evolution from adolescence to adulthood. Hum Dev 15(1): 1–12.
36. Miller AI. Imagery in scientific thought: Creating 20th-century physics. Cambridge, MA: MIT Press; 1986.
37. Lakoff G, Johnson M. Philosophy in the flesh: The embodied mind and its challenge to western thought. New York: Basic Books; 1999.
38. Davidson D. Mental events. In: Block N, ed. Readings in philosophy of psychology, Vol. 1. Cambridge, MA: Harvard University Press; 1980: pp. 107–119.
39. Boyer MC. Cybercities: Visual perception in the age of electronic communication. New York: Princeton Architectural Press; 1996.
40. Raupp G, Cranton W. Handbook of visual display technology. New York: Springer; 2009.
41. Gibson JJ. The ecological approach to visual perception. Hillsdale, NJ: Lawrence Erlbaum; 1986.
42. Barlow HB. 1981. The Ferrier Lecture, 1980: Critical limiting factors in the design of the eye and visual cortex. Proc R Soc Lond B Biol Sci 212(1186): 1–34.
43. Barlow HB. Dark and light adaptation: Psychophysics. In: Hurvich L, Jameson D, eds. Handbook of sensory physiology, Vol. VII/4. New York: Springer-Verlag; 1972: pp. 1–28.
44. Born M, Wolf E, Bhatia AB. Principles of optics. Oxford: Pergamon Press; 1970.
45. Motoyoshi I, Nishida S, Sharan L, Adelson EH. 2007. Image statistics and the perception of surface qualities. Nature 447(7141): 206–209.
46. Adelson EH. On seeing stuff: The perception of materials by humans and machines. In: Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, Vol. 4299. 2001: pp. 1–12.
47. Julesz B. 1981. Textons, the elements of texture perception, and their interactions. Nature 290(5802): 91–97.
48. Ullman S. High-level vision: Object recognition and visual cognition. Cambridge, MA: MIT Press; 2000.
49. Marr D. Vision. New York: WH Freeman; 1982.
50. Julesz B. Foundations of cyclopean perception. Chicago: University of Chicago Press; 1971.
51. Kaufman L. Sight and mind: An introduction to visual perception. Oxford: Oxford University Press; 1974.
52. Kolers PA. Aspects of motion perception. Oxford: Pergamon Press; 1972.
53. Wuerger S, Shapley R, Rubin N. 1996. "On the visually perceived direction of motion" by Hans Wallach: 60 years later. Perception 25: 1317–1368.
54. Gibson JJ. 1958. Visually controlled locomotion and visual orientation in animals. Br J Psychol 49(3): 182–194.
55. Warren WH, Jr. 1998. Visually controlled locomotion: 40 years later. Ecol Psychol 10(3–4): 177–219.
56. Newton I. Opticks: or, a treatise of the reflexions, refractions, inflexions and colours of light. Also two treatises of the species and magnitude of curvilinear figures. London: Royal Society; 1704.
57. Helmholtz H. Treatise on physiological optics. III. The perceptions of vision. Southall JPC, ed. New York: Optical Society of America; 1925.
58. Young T. 1802. On the theory of light and colour. Phil Trans R Soc Lond 92: 12–48.
59. Hering E. Outlines of a theory of the light sense. Cambridge, MA: Harvard University Press; 1964.
60. Marks WB, Dobelle WH, MacNichol EF. 1964. Visual pigments of single primate cones. Science 143(3611): 1181–1183.
61. Pugh EN. 1999. Molecular mechanisms of vertebrate photoreceptor light adaptation. Curr Opin Neurobiol 9(4): 410–418.
62. De Valois RL, Abramov I, Jacobs GH. 1966. Analysis of response patterns of LGN cells. JOSA 56(7): 966–977.
63. D'Zmura M, Lennie P. 1986. Mechanisms of color constancy. JOSA A 3(10): 1662–1672.
64. Maloney LT, Wandell BA. 1986. Color constancy: A method for recovering surface spectral reflectance. JOSA A 3(1): 29–33.
65. Brainard DH, Freeman WT. 1997. Bayesian color constancy. JOSA A 14(7): 1393–1411.
66. Commission Internationale de l'Éclairage. Proceedings of the Eighth Session. Cambridge, UK: Cambridge University Press; 1931.
67. Fechner G. Elemente der Psychophysik. Leipzig: Breitkopf & Härtel; 1860.
68. Weber EH. De pulsu, resorptione, auditu et tactu. Leipzig: Koehler; 1834.
69. Von Helmholtz H. Handbuch der physiologischen Optik: Mit 213 in den Text eingedruckten Holzschnitten und 11 Tafeln. Leipzig: Voss; 1866.
70. Wundt WM. Lectures on human and animal psychology. Creighton JG, Titchener EB, trans. London: Allen. Translation of Wundt, 1863.
71. Kant I. Critique of pure reason. Riga: Johann Friedrich Hartknoch; 1781.
72. Gescheider GA. Psychophysics: Method and theory. Hillsdale, NJ: Lawrence Erlbaum; 1976.
73. Falmagne JC. Elements of psychophysical theory. Oxford: Oxford University Press; 2002.
74. Kingdom FAA, Prins N. Psychophysics: A practical introduction. New York: Academic Press; 2009.
75. Green DM, Swets JA. Signal detection theory and psychophysics. Melbourne, FL: Robert E. Krieger; 1974.
76. Stevens SS. Psychophysics: Introduction to its perceptual, neural, and social prospects. New Brunswick, NJ: Transaction Publishers; 1975.
77. Krantz DH, Luce RD, Suppes P, Tversky A. Foundations of measurement: Additive and polynomial representations, Vol. 1. New York: Academic Press; 1971.
78. Stevens SS. 1946. On the theory of scales of measurement. Science 103(2684): 677–680.
79. Stevens SS. 1957. On the psychophysical law. Psychol Rev 64(3): 153–181.
80. Derrington AM, Krauskopf J, Lennie P. 1984. Chromatic mechanisms in lateral geniculate nucleus of macaque. J Physiol 357(1): 241–265.
81. Kahneman D, Norman J. 1964. The time-intensity relation in visual perception as a function of observer's task. J Exp Psychol 68(3): 215–220.
82. Valentine T. 1991. A unified account of the effects of distinctiveness, inversion, and race in face recognition. Q J Exp Psychol 43(2): 161–204.
83. Krantz DH. 1972. A theory of magnitude estimation and cross-modality matching. J Math Psychol 9(2): 168–199.
84. Graham NVS. Visual pattern analyzers. Oxford: Oxford University Press; 2001.
85. Campbell FW, Robson JG. 1968. Application of Fourier analysis to the visibility of gratings. J Physiol 197(3): 551–566.
86. Enroth-Cugell C, Robson JG. 1966. The contrast sensitivity of retinal ganglion cells of the cat. J Physiol 187(3): 517–552.
87. Movshon JA, Thompson ID, Tolhurst DJ. 1978. Spatial and temporal contrast sensitivity of neurones in areas 17 and 18 of the cat's visual cortex. J Physiol 283(1): 101–120.
88. Watson AB, Ahumada AJ. 2005. A standard model for foveal detection of spatial contrast. J Vis 5(9): 717–740.
89. Schwartz SH. Visual perception: A clinical orientation. New York: McGraw-Hill Medical; 2009.
90. Kelly DH. 1979. Motion and vision. II. Stabilized spatio-temporal threshold surface. JOSA 69(10): 1340–1349.
91. Lu ZL, Dosher BA. 2008. Characterizing observers using external noise and observer models: Assessing internal representations with external noise. Psychol Rev 115(1): 44–82.
92. Fahle M, Poggio TA. Perceptual learning. Cambridge, MA: MIT Press; 2002.
93. Salzman CD, Britten KH, Newsome WT. 1990. Cortical microstimulation influences perceptual judgements of motion direction. Nature 346(6280): 174–177.
94. Shadlen MN, Newsome WT. 2001. Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. J Neurophysiol 86(4): 1916–1936.
95. Nunez PL, Srinivasan R. Electric fields of the brain: The neurophysics of EEG. Oxford: Oxford University Press; 2006.
96. Lu ZL, Kaufman L. Magnetic source imaging of the human brain. Hillsdale, NJ: Erlbaum; 2003.
97. Huettel SA, Song AW, McCarthy G. Functional magnetic resonance imaging. Sunderland, MA: Sinauer; 2004.
98. Nitsche MA, Cohen LG, Wassermann EM, et al. 2008. Transcranial direct current stimulation: State of the art 2008. Brain Stimulat 1(3): 206–223.
99. Wassermann E, Epstein CM, Ziemann U. The Oxford handbook of transcranial stimulation. Oxford: Oxford University Press; 2008.
100. The MathWorks Inc. MATLAB [computer program]. Natick, MA: The MathWorks Inc.; 1998.
101. Pelli DG. 1997. The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spat Vis 10(4): 437–442.
102. Brainard DH. 1997. The psychophysics toolbox. Spat Vis 10(4): 433–436.
2 Three Examples from Visual Psychophysics
Contrast threshold represents the minimum stimulus energy that can be detected by the
observer.2,7 To measure threshold for a particular visual stimulus, you need to try many
different levels of stimulus energy and determine the particular level at which the stimulus
Figure 2.1
Magnitude estimation for patches of different light intensity (perceived magnitude as a function of the physical intensity of the light patch).
Figure 2.2
Psychometric function for detecting sine-wave gratings of different contrasts (probability correct as a function of grating contrast, 0.25%–4.48%).
allows the observer to achieve 75% correct, which is the usual criterion for threshold when guessing is 50% correct.
In the next experiment, we use measurements of contrast threshold as basic building blocks
to construct a more sophisticated measure of visual sensitivity: the contrast sensitivity
function.9,10 A contrast sensitivity function is a description of the sensitivity of the visual
system to sine-wave patterns of different spatial frequencies, from coarse to fine. In
example 2, we used a sine-wave pattern of a single spatial frequency and only varied the
contrast energy of the sine pattern. Here in example 3, both the contrast energy of the
pattern and its spatial frequency are varied (figure 2.3).
We choose a number of spatial frequencies from the coarsest to the finest. For each
spatial frequency, we choose contrast levels to measure the psychometric function at that
spatial frequency. On each trial, the computer selects a particular sine pattern and a par-
ticular contrast level to show to the observer. For each combination of frequency and
contrast, we test a reasonable number of trials, so this experiment requires the number of trials in example 2 multiplied by the number of tested spatial frequencies.
Figure 2.3
Contrast sensitivity function (sensitivity as a function of grating spatial frequency in cycles/degree).
2.4 Discussion
These three example experiments illustrate the fundamental elements of visual psycho-
physics. They represent some of the simplest paradigms and procedures that make up the
toolkit of the visual psychophysicist. Psychophysicists have developed many different
standardized methods of quantifying the relationship between perception and the physical
environment.8 Example 1 shows one of the simplest experiments in psychophysical scaling,
the method of magnitude estimation. Examples 2 and 3 illustrate one classical method of
threshold measurement, the method of constant stimuli. This method uses prior informa-
tion to select a set of relevant stimulus conditions that are fixed throughout the experiment.
All stimulus conditions are tested a given number of times, which provides one or a set
of psychometric functions under a range of conditions. This in turn allows us to derive
Figure 2.4
Demonstration of a contrast sensitivity function with increasing spatial frequency from left to right and decreas-
ing contrast from bottom to top.
measures of visibility (or discriminability) such as thresholds that can be used to character-
ize human sensitivity over different kinds of stimulus variations. There are, however, other
methods to accomplish scaling such as magnitude production or similarity ratings.3 There
are also other methods to estimate thresholds, such as method of limits, method of adjust-
ment, and various adaptive methods.2
The quality of the visual psychophysics is only as good as the physical control and
precision of the displays. The experimenter must make sure that the physical properties
of the stimuli are precisely known and quantified. Often in visual psychophysics, the
stimuli are displayed on a computer monitor. For example, in our second and third experi-
ments, the physical variations involved contrast for stimuli of given spatial frequency and
timing. Contrast thresholds in many of these conditions are remarkably low, perhaps 0.1% of the full contrast range of the displays, and computers often generate only 256 programmable levels of intensity, so specialized methods are required to expand
the number of programmable levels in certain restricted ranges. The displays also require
precise control over the time of onset and duration of the displays. All of these properties
of a proper display can be challenging for the beginning experimenter.
Measurement of human performance in visual psychophysics involves the collection of
responses or judgments. A range of standardized methods may be used to collect these
responses on standard computer equipment, such as the keyboard, computer mouse, joy-
sticks, or touch-screens. If very precise and accurate measurements are required, such as
for response times, specialized equipment might be involved. In addition, investigators
might choose to measure other behavior such as eye movements, EMG, EEG, MEG, or
fMRI. In all of these cases of specialized equipment, a critical consideration is synchro-
nization with the stimulus displays.
Once the psychophysical experiment is complete, the data that have been collected must
be tabulated, usually through simple statistical analysis. Then, these data are used to
estimate some critical measure, such as the psychological scaling function, or the psycho-
metric function and threshold, or the contrast sensitivity function. In example 2, we col-
lected responses in the method of constant stimuli for a set of stimulus conditions, from
which we construct a table of response accuracy as a function of the signal contrast in
different stimulus conditions. We then quantify the empirical psychometric functions using
descriptive functions from which we then estimate the contrast thresholds by interpolation.
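As a small illustration of this interpolation step (our own sketch, not one of the book's numbered displays, with hypothetical proportions correct at the six contrasts of figure 2.2), the 75% threshold can be interpolated on a log-contrast axis:

```matlab
% Hypothetical psychometric data: proportion correct at six contrasts (%)
contrasts = [0.25 0.44 0.79 1.41 2.50 4.48];
pCorrect  = [0.52 0.55 0.63 0.74 0.88 0.97];

crit = 0.75;                           % threshold criterion
i = find(pCorrect >= crit, 1) - 1;     % last point below criterion

% Linear interpolation on log contrast, the natural axis for contrast
logC = log10(contrasts);
logT = logC(i) + (crit - pCorrect(i)) / (pCorrect(i+1) - pCorrect(i)) ...
       * (logC(i+1) - logC(i));
threshold = 10^logT;                   % contrast (%) at 75% correct
```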
The set of contrast threshold measurements allows us to construct the visibility window of vision.
In the remainder of the book, we take you to the laboratory, where you will see how to
construct an experiment. Then, we provide a tour of methods and models in psychophysics
that allow theoretical interpretation of empirical findings. And finally, we integrate psychophysics (the laboratory methods and theoretical testing principles covered in the book) with applications to important questions in visual function in applied and clinical settings.
References
1. Stevens SS. 1946. On the theory of scales of measurement. Science 103(2684): 677–680.
2. Fechner G. Elemente der Psychophysik. Leipzig: Breitkopf & Härtel; 1860.
3. Gescheider GA. 1988. Psychophysical scaling. Annu Rev Psychol 39(1): 169–200.
4. Krantz DH, Luce RD, Suppes P, Tversky A. Foundations of measurement: Additive and polynomial representations, Vol. 1. New York: Academic Press; 1971.
5. Falmagne JC. Elements of psychophysical theory. Oxford: Oxford University Press; 2002.
6. Stevens SS. 1957. On the psychophysical law. Psychol Rev 64(3): 153–181.
7. Weber EH. De pulsu, resorptione, auditu et tactu. Leipzig: Koehler; 1834.
8. Gescheider GA. Psychophysics: Method and theory. Hillsdale, NJ: Lawrence Erlbaum; 1976.
9. Campbell FW, Robson JG. 1968. Application of Fourier analysis to the visibility of gratings. J Physiol 197(3): 551–566.
10. Enroth-Cugell C, Robson JG. 1966. The contrast sensitivity of retinal ganglion cells of the cat. J Physiol 187(3): 517–552.
11. Watson AB, Ahumada AJ. 2005. A standard model for foveal detection of spatial contrast. J Vis 5(9): 717–740.
12. Movshon JA, Thompson ID, Tolhurst DJ. 1978. Spatial and temporal contrast sensitivity of neurones in areas 17 and 18 of the cat's visual cortex. J Physiol 283(1): 101–120.
II IN THE VISUAL PSYCHOPHYSICS LAB
3 Generating Stimuli
The implementation of experiments generally starts with the definition of what is displayed
to the observer. This chapter is designed to take the reader through a simple introduction
to digital images and to show how to generate typical psychophysical stimuli. The methods
include creation of simple geometric patterns and more complex images. The examples
also provide a tour of common image manipulations. Taken together, these methods will
allow the creation of the most commonly used classes of psychophysical stimuli.
Display 3.1
by an RGB triplet. A fourth channel, the alpha channel, stores a separate bitmap that when
multiplied with the three primary channels determines the intensity of the RGB image by
a factor from 0 to 1 at each location.2 So, the alpha channel can be used to emphasize
some regions of the image while hiding other regions. Addition of an alpha channel
expands 24-bit images to 32 bits per pixel.
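A minimal sketch of the multiplication described here (not one of the book's displays; the Gaussian window and image are arbitrary illustrations) applies a smooth alpha channel that reveals the center of an RGB image and hides its edges:

```matlab
% Sketch: apply an alpha channel to an arbitrary RGB image
[x, y] = meshgrid(-128:127, 128:-1:-127);
alphaMap = exp(-(x.^2 + y.^2)/(2*48^2));       % factors from 0 to 1
rgb = rand(256, 256, 3);                        % an arbitrary RGB image
windowed = rgb .* repmat(alphaMap, [1 1 3]);    % scale each primary channel
figure; image(windowed); axis image off;
```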
The bitmap of an 8 × 8 checkerboard (figure 3.1, plate 3) is given in display 3.1. (Notice that although 1-bit bitmaps normally take on values of 0 or 1 and 8-bit bitmaps take on values from 0 to 255, in MATLAB3 a 1-bit image is represented by 1 and 2 and an 8-bit image is represented by numbers from 1 to 256, because these values are actually used to index locations in a colormap with index beginning at 1.4,5)
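One way such a bitmap can be built (a sketch; the book's display 3.1 may differ in detail) uses indexed assignment to fill the alternating squares with the colormap indices 1 and 2:

```matlab
% Sketch: an 8 x 8 checkerboard bitmap with values 1 and 2,
% suitable for indexing a two-entry MATLAB colormap
M = ones(8, 8);
M(1:2:8, 2:2:8) = 2;   % odd rows, even columns
M(2:2:8, 1:2:8) = 2;   % even rows, odd columns
```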
Display 3.2
Display 3.3
bottom left corner is (I, 1), and the bottom right corner is (I, J). Once a coordinate system
is established, assigning an intensity value to each matrix location defines the image.
A different way of computing the bitmap of the checkerboard image in figure 3.1 (plate
3) is shown in display 3.2.
It is also possible to transform the image coordinate system, for example to place (0,
0) in the center of an image. MATLAB provides a function called meshgrid that sets up
coordinate systems for a matrix of size [I, J]. A call to meshgrid that places (0, 0) at the
center of the image is:
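The call itself is missing from this extracted page; judging from the coordinate ranges described next (and the usage in display 3.12), it is presumably of the form:

```matlab
I = 256; J = 256;   % image size: I rows, J columns (values for illustration)
[x, y] = meshgrid(-J/2 : J/2-1, I/2 : -1 : -I/2+1);
```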
where x is a matrix that contains the x coordinates of all the pixels, going from −J/2 to J/2 − 1 from left to right, and y contains the y coordinates, going from −I/2 + 1 to I/2 from bottom to top.
The bitmap of the same checkerboard from display 3.1 can be also constructed using
the meshgrid function as shown in display 3.3.
on the screen by leaving the image unchanged but altering the lookup table, or CLUT.
Changing the CLUT rather than the contents of the displayed image can sometimes be a
very useful tool in controlling experimental displays.
The checkerboard image defined earlier, with digital values of 1 or 2, could be shown
as black and white, or red and green, or different levels of gray. Specifically, the CLUT
may translate 1 to black and 2 to white; or the CLUT may translate 1 to red and 2 to green;
or it may translate 1 to one gray level and 2 to another. One could reverse the checkerboard that is actually displayed not by rewriting the entire I × J pattern of 1s and 2s in the matrix, but rather by changing the CLUT to translate 1 to green and 2 to red, which requires changing only two values of the CLUT.
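The idea can be sketched as follows (our own illustration, not the book's display 3.4): the image matrix is drawn once, and swapping the two rows of the CLUT reverses the displayed checkerboard:

```matlab
% Sketch: reverse a displayed checkerboard by editing only the CLUT
M = ones(8, 8);
M(1:2:8, 2:2:8) = 2; M(2:2:8, 1:2:8) = 2;   % values 1 and 2
figure; image(M); axis image off;
map = [1 0 0;    % index 1 -> red
       0 1 0];   % index 2 -> green
colormap(map);                  % red-and-green checkerboard
map = map([2 1], :);            % swap the two CLUT rows...
colormap(map);                  % ...and the display reverses
```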
In MATLAB, a matrix that codes a colormap or CLUT may have any number of rows,
but it must have exactly three columns. Each row is interpreted as a color, with the first
element specifying the intensity of red light, the second green, and the third blue. The ith
row of the colormap specifies the color translation of intensity value i in the image bitmap.
For example, the fifth row of the colormap specifies the color translation of intensity value
5 in the bitmap. Color intensity is specified on the interval 0.0 to 1.0. For example, [0 0
0] is black, [1 1 1] is white, [1 0 0] is pure red, [0.5 0.5 0.5] is gray, and [127/255 1
212/255] is aquamarine.
In display 3.4, we show how to use different color maps to display the same checkerboard image M computed in displays 3.1–3.3. The results are shown in figure 3.1 (plate 3).
Display 3.4
This section illustrates a range of images and how to create them. Examples include simple
images, such as dots, lines, and letters, but also other images that are commonly used in
experiments, such as sine-wave gratings, external noise images, and textures.
Before we start, we define a new MATLAB function, showImage, to display images.
The function, shown in display 3.5, should be saved with the file name showImage.m
and included in the PATH of MATLAB. The showImage code is a new function that displays images without altering the aspect ratio, using either a gray lookup table or MATLAB's default colormap. A call to the MATLAB function image displays an image M in an unnecessary axis frame and the default colormap. Calls to the showImage function eliminate repetitive parts in subsequent programs in this chapter.
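A minimal version of such a function, consistent with the description above (a sketch; the book's display 3.5 may differ in detail), would be saved as showImage.m:

```matlab
function showImage(M, mapName)
% showImage: display image M at its native aspect ratio without an
% axis frame, using a gray colormap when mapName is 'grayscale' and
% MATLAB's default colormap otherwise.
figure;
image(M);
if nargin > 1 && strcmp(mapName, 'grayscale')
    colormap(gray(256));
end
axis image off;   % preserve aspect ratio; hide the axis frame
```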
Display 3.5
Display 3.6
Figure 3.2
A fixation dot in the center of the display.
l(x, y) = l0 (1.0 + c sin{2πf [y sin(θ) + x cos(θ)]}),

where l(x, y) is the gray level for a pixel at location (x, y) in the image, l0 is the mean gray level, f is the frequency of the sine wave in (1/pixels), and θ is the angular tilt of the sine wave. In this example, f is 1/32 and θ is 35°.
We also show how to create a Gabor, which is a sine-wave patch windowed by a 2D
Gaussian (normal or bell-shaped) function. The Gabor has a well-defined spatial frequency
range and is contained in a windowed region of space (figure 3.4b). It is another of the
most frequently used stimuli in vision research.7
In this example, the Gabor is also tilted 35 from horizontal. An equation that creates
this Gabor is
l(x, y) = l0 (1.0 + c sin{2πf [y sin(θ) + x cos(θ)]} exp[−(x² + y²)/(2σ²)]),   (3.3)
Display 3.7
Figure 3.3
A fixation cross in the center of the display.
a b
Figure 3.4
A sine-wave pattern (a) and a Gabor pattern (windowed sine-wave grating) (b).
Display 3.8
where σ = 24 pixels is the standard deviation of the spatial window that defines the Gabor
patch. Display 3.8 shows the program to create both these stimuli.
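A sketch of such a program (not the book's actual display 3.8; image size and contrast c are illustrative) computes the tilted grating and then windows it by the Gaussian of equation 3.3, with f = 1/32, θ = 35°, and σ = 24 pixels as in the text:

```matlab
% Sketch: a tilted sine-wave grating and the corresponding Gabor
[x, y] = meshgrid(-128:127, 128:-1:-127);
l0 = 128; c = 0.5; f = 1/32; theta = 35*pi/180; sigma = 24;
grating  = sin(2*pi*f*(y*sin(theta) + x*cos(theta)));
sineImg  = l0*(1 + c*grating);                               % figure 3.4a
gaborImg = l0*(1 + c*grating.*exp(-(x.^2 + y.^2)/(2*sigma^2))); % figure 3.4b
```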
Display 3.9
Figure 3.5
A Gaussian white noise image.
Display 3.10
Figure 3.6
A contrast modulated sine-wave grating.
larger disparities or shifts in the image in the two eyes. Some people, "free fusers," are able to process and combine two separated images, one for the left eye and one for the right eye, without a stereo viewer or other specialized laboratory equipment.
Display 3.11 shows the generation of a stereo pair of 256 × 256 binary noise images in which a common 64 × 64 binary noise image is embedded at slightly different locations near the middle. The two are offset by 2 pixels of disparity in this example. The pair of
images for the random dot stereogram is shown in figure 3.7. An individual who can fuse the two images without devices by relaxing the convergence of the eyes, or someone who is viewing the two images through a stereo device, will see a square floating above the background.
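A generator in the spirit of display 3.11 can be sketched as follows (our own illustration; the embedding coordinates are arbitrary choices near the middle of the image):

```matlab
% Sketch: stereo pair of binary noise images with a shared 64 x 64
% patch embedded at positions offset by 2 pixels of disparity
patch = rand(64, 64) > 0.5;          % common binary noise patch
left  = rand(256, 256) > 0.5;        % binary noise background
right = left;                        % same background in both eyes
left( 97:160, 97:160) = patch;       % patch location in the left eye
right(97:160, 99:162) = patch;       % shifted 2 pixels in the right eye
```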
Display 3.11
Figure 3.7
A random dot stereogram made of a left-eye and a right-eye image.
Display 3.12
%%% Program TrueColorImage.m
[x, y] = meshgrid(-128:127, 128:-1:-127);
r = sqrt(x.^2 + y.^2);
RI = 128 + (r < 32)*127 + (r >= 128)*127;
GI = 128 + (r >= 28 & r < 80)*127 + (r >= 128)*127;
BI = 128 + (r >= 72)*127;
M = cat(3, RI, GI, BI)/255; % cat takes 3 2-dimensional
                            % RGB matrices to create
                            % an image with 3 color planes
showImage(M, '');
the function text. Finally, getframe is used to create an image of the content of the
figure, and the relevant subset is extracted. This same approach can be used to draw lines,
or other patterns, using predefined plot or graphics functions in MATLAB and then extract-
ing them as images for your experiments. Display 3.13 shows an example program for
creating an image of a letter, shown in figure 3.9.
Display 3.13
Figure 3.9
Image of a character extracted from a MATLAB figure.
defines these wedge and ring images in figure 3.10 is shown in display 3.14. To represent fine geometric shape well, we used 2048 × 2048 images. Regions of alternating intensity were defined by different radii and angled lines in the pattern. To do this, we first translated the pixel locations in [x, y] into a polar coordinate system [r, theta] of radius and angle.
Display 3.14
Figure 3.10
A wedge (left) and a ring pattern (right) are frequently used in fMRI visual experiments to measure retinotopic areas in visual cortex.
color profiles, blur edges, cartoon, and many other effects to achieve a creative goal. In
visual psychophysics, we are interested in quantifying the relationship between image
properties and human performance. This requires us to manipulate images to control for
certain properties, such as size or intensity, in order to test ideas about human visual pro-
cessing. For these purposes, it is important that we control image manipulations exactly
and can characterize the resulting images that are displayed. Starting with images stored
in a particular format, we use MATLAB routines to create new images with exact
properties.
Display 3.15
showImage(T, 'grayscale');
Figure 3.11
A texture made of many Gabors.
gif (graphics interchange format), jpeg (joint photographic experts group), pbm (portable
bitmap), pgm (portable graymap), png (portable network graphics), ppm (portable pixmap),
or tiff (tagged image file format). The different formats were developed by different user
groups or companies for somewhat different purposes. The various formats follow different
conventions in storing images. Some of them are used to represent true color images; some
of them are used to represent indexed color images. Some represent color in RGB space
(e.g., JPEG); some represent color in CMYK color space (e.g., TIFF), which is a special
four-color model used in color printing where four inks (cyan, magenta, yellow, and black)
are used.
Display 3.16
Display 3.17
Figure 3.13
A grayscale image converted from the color photograph of the church.
Display 3.18
Figure 3.14
A black and white image converted from the color photograph of the church by thresholding.
image to binary by applying a threshold. The binary image contains values of 1 (white) for all
pixels with a value greater than the threshold and values of 0 (black) for all other
pixels. The conversion im2bw returns values of 0 or 1, which are converted to 1 and 256
for display, corresponding to the lowest and highest values in the lookup table. Display 3.18
shows a program using im2bw to convert the RGB color image of the church into the binary
black and white image seen in figure 3.14.
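The thresholding operation itself is easy to reproduce outside MATLAB; a NumPy sketch of the same idea (with the threshold expressed on a normalized 0 to 1 scale, as in im2bw) might look like this:

```python
import numpy as np

def to_binary(gray, threshold=0.5):
    """Binarize an 8-bit grayscale image: 1 above threshold, 0 elsewhere."""
    normalized = np.asarray(gray, dtype=float) / 255.0   # rescale to [0, 1]
    return (normalized > threshold).astype(np.uint8)

img = np.array([[0, 100], [150, 255]])
print(to_binary(img))   # [[0 0]
                        #  [1 1]]
```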
In addition to transforming RGB images to grayscale images, it may be useful to
convert between the two forms of grayscale image: indexed and intensity images.
The MATLAB function ind2gray converts an indexed image and the associated colormap
into an intensity image. The MATLAB function gray2ind takes a grayscale image
and creates an indexed image and a colormap with a specified number of gray levels. In
display 3.19, we show an example program that combines a grayscale image M1 with a
new lookup table to generate a new grayscale image M2 with an inverted color lookup
table (figure 3.15). We then convert M2 into an indexed image M3 that includes a lookup
table.
46 Chapter 3
Display 3.19
Figure 3.15
A new grayscale image created by combining an indexed image with a color lookup table.
Display 3.20
Figure 3.16
Resized, rotated, and cropped images of the church photograph.
Display 3.21
%%% IntensityTransformation.m
[M, map] = imread('Church.jpg', 'jpeg');
M1 = rgb2gray(M);
bkgrd = mean2(M1); % mean2 computes the matrix mean.
M2 = uint8(0.50*(M1 - bkgrd) + bkgrd);
% Reduce image contrast by 50%
showImage(M2, 'grayscale');
Figure 3.17
A digital photograph of the church in lower contrast.
Display 3.22
Figure 3.18
Extracted edges of the church photograph.
Display 3.23
a b c
Figure 3.19
(a) A windowed region, (b) a Gaussian windowed region, and (c) a notch windowed region of the church
photograph.
Display 3.24
s = 96;
noiseI = s*randn(Sz(1), Sz(2));
M3 = uint8(double(M1) + noiseI);
showImage(M3, 'grayscale');
Figure 3.20
Gaussian white noise masked church photograph.
Noise masking manipulates the signal-to-noise ratio in images and thereby controls the
quality of the images. Noise masking has been a major instrument in understanding visual
processing.8,9 A similar process can be used to combine any two images.
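To make the operation of display 3.24 explicit outside MATLAB, here is a NumPy sketch of noise masking. The standard deviation of 96 follows the display; the explicit clipping replaces the saturation that MATLAB's uint8 conversion performs automatically.

```python
import numpy as np

def add_gaussian_noise(image, sigma=96.0, seed=0):
    """Add zero-mean Gaussian noise to an 8-bit image and clip to [0, 255]."""
    rng = np.random.default_rng(seed)   # fixed seed for reproducibility
    noisy = image.astype(float) + sigma * rng.standard_normal(image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```

Replacing the noise array with any second image, suitably weighted, combines two images in the same way.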
3.3.9 Filtering
Filtering is used to remove or reduce certain features or components of an image.19 A filter
is a device or a computational procedure that removes or attenuates some components or
features from an image. Filtering is used in some practical situations to reduce noise in a
stimulus. In the vision laboratory, spatial frequency filtering is used to create specialized
test stimuli.
In many applications, this is accomplished through the fast Fourier transform (fft2),
which codes the quantity of each spatial frequency in the image. This mimics the opera-
tion of the early visual system that analyzes images into spatial frequency compo-
nents.5,20 The Fourier analysis describes an image in terms of its spatial frequency
content. Low spatial frequencies correspond with slow undulations in intensity over
space, whereas high spatial frequencies correspond with rapid changes in intensity over
space. Visibility of different spatial frequencies was shown in the earlier discussion of
the contrast sensitivity function (CSF) in section 1.3.2 in chapter 1 and section 2.3 in
chapter 2.
A Fourier transform takes an input image and redescribes the image in terms of coef-
ficients representing the amount, or magnitude, of each spatial frequency and orientation
and its phase. The Fourier representation of the input image makes it easy to alter the
spatial frequency and orientation content of images. For example, in filtering, some spatial
frequencies are reduced or filtered out while others are left intact. Finally, the filtered
image is reconstructed through the inverse fast Fourier transform (ifft2).
Display 3.25 shows a program that carries out low-pass, high-pass, and band-pass filter-
ing of the church image. Filtering removes certain frequencies from the grayscale church
image by multiplying the coefficients of the original image magnitude spectrum and the
filter in the domain of the Fourier representation. Figure 3.21 shows the original church
image and images resulting from low-pass, high-pass, and band-pass transformations from
top to bottom. A low-pass filter attenuates high spatial frequencies while passing
(keeping) low spatial frequencies, which blurs the original image. A high-pass filter does
the opposite; it attenuates low spatial frequencies while keeping high spatial frequencies,
retaining the sharper edges in the image. Because low spatial frequencies are less important
in object recognition, the high-pass image appears similar to the original. A band-
pass filter attenuates both very low and very high spatial frequencies. Band-pass filters
can be either broad or narrow. Figure 3.21 shows a narrow band-pass filter that is the part
in common between the low- and high-pass examples; it extracts the more
obvious edges in the image.
The middle column of figure 3.21 shows the log of the Fourier spectra of the images
in the left column. (It is necessary to take the log to make small coefficient values for
many spatial frequencies more visible in displaying Fourier spectra.) In these spectra,
orientation is depicted by the radial angle, and the spatial frequency is low in the center
and increasing with distance from the center. The spectra of the filtered images were
created by multiplying the Fourier spectrum of the original church images and the spatial
frequency filters shown in the right column of figure 3.21. The filters are white (multipliers
of 1) where frequencies are being passed or retained and black (multipliers of 0) where
frequencies are being filtered out.
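The whole pipeline, forward transform, multiplication by a radial mask, and inverse transform, can be sketched in NumPy for the low-pass case. The cutoff r0 and Gaussian rolloff s1 mirror display 3.25 but are otherwise illustrative.

```python
import numpy as np

def lowpass_filter(image, r0=64, s1=10):
    """Low-pass filter an 8-bit grayscale image via its Fourier spectrum."""
    img = image.astype(float)
    dc = img.mean()                                   # mean luminance (DC)
    spec = np.fft.fftshift(np.fft.fft2(img - dc))     # low frequencies centered
    h, w = img.shape
    y, x = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
    r = np.hypot(x, y)                                # radial spatial frequency
    mask = np.where(r <= r0, 1.0,                     # pass low frequencies,
                    np.exp(-(r - r0) ** 2 / (2 * s1 ** 2)))  # Gaussian rolloff
    out = np.fft.ifft2(np.fft.ifftshift(spec * mask)).real + dc
    return np.clip(out, 0, 255).astype(np.uint8)
```

A uniform image passes through unchanged, since only its DC component carries any energy.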
The mathematical treatment of the discrete fast Fourier transform21 is not provided here,
but these transformed images, the compositions of the fft spectra, and the filters should provide
Display 3.25
%%% FourierFiltering.m
[M, map] = imread('Church.jpg', 'jpeg');
M1 = rgb2gray(M);
fM = fftshift(fft2(M1 - mean2(M1))); % Fourier transformation
maxfM = max(max(log(abs(fM))));
showImage(uint8(256*log(abs(fM))/maxfM), 'grayscale');
Sz = size(M1);
[x, y] = meshgrid(-Sz(2)/2:(Sz(2)/2 - 1), ...
    Sz(1)/2:-1:-(Sz(1)/2 - 1));
r = sqrt(x.^2 + y.^2);
r0 = 64;
s1 = 10;
% Lowpass filtering
lowpass = (r <= r0) + (r > r0).*exp(-(r - r0).^2/2/s1^2);
% Construct a lowpass filter
showImage(uint8(256*lowpass), 'grayscale');
fMl = fM.*lowpass; % Apply the lowpass filter to the
                   % Fourier spectrum
showImage(uint8(256*log(abs(fMl) + 1)/maxfM), 'grayscale');
M2 = uint8(ifft2(fftshift(fMl)) + mean2(M1));
% Inverse Fourier transformation
showImage(M2, 'grayscale');
% Highpass filtering
highpass = (r >= r0) + (r < r0).*exp(-(r - r0).^2/2/s1^2);
% Construct a highpass filter
showImage(uint8(256*highpass), 'grayscale');
fMh = fM.*highpass; % Apply the highpass filter to the
                    % Fourier spectrum
showImage(uint8(256*log(abs(fMh) + 1)/maxfM), 'grayscale');
M3 = uint8(ifft2(fftshift(fMh)) + mean2(M1));
% Inverse Fourier transformation
showImage(M3, 'grayscale');
% Bandpass filtering
bandpass = exp(-(r - r0).^2/2/s1^2);
% Construct a bandpass filter
showImage(uint8(256*bandpass), 'grayscale');
fMb = fM.*bandpass; % Apply the bandpass filter to the
                    % Fourier spectrum
showImage(uint8(256*log(abs(fMb) + 1)/maxfM), 'grayscale');
M4 = uint8(ifft2(fftshift(fMb)) + mean2(M1));
% Inverse Fourier transformation
showImage(M4, 'grayscale');
Figure 3.21
The original church image and its Fourier spectrum (top row). A low-pass filtered church image, its Fourier
spectrum, and the low-pass filter (second row). A high-pass filtered church image, its Fourier spectrum, and the
high-pass filter (third row). A band-pass filtered church image, its Fourier spectrum, and the band-pass filter
(bottom row).
the reader with a strong intuition about the importance of different spatial frequencies in
the perception of objects and how images can be transformed for visual experiments.
The program in display 3.25 reads in the church picture, which is in color. It then trans-
forms it to grayscale using rgb2gray. It then computes the fft2 of the image after
removing the DC component, which is the mean value. The Fourier spectrum is rear-
ranged with fftshift so that the low spatial frequencies are in the center and orientations
are radial. Multiplying the coefficients of the shifted spectrum with different masks carries
out different filtering operations. Taking the inverse Fourier transformation with ifft2
and adding back the DC reconstructs the new filtered image.
3.3.11 Convolution
Signal processing of images is often used to find features in the image, and some of these
image transformations may mimic the processing of the visual system. Many important
forms of image transformation are carried out by a mathematical operation called convolu-
tion. Convolution computes the outputs of a spatial array of detectors looking for a par-
ticular spatial profile or pattern. The convolution operation computes the dot product or
match between the detector and the image in every spatial location.
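This "dot product at every location" description maps directly onto code. The following NumPy sketch builds a DOG kernel and convolves it over the valid region of an image; the kernel size and the two Gaussian widths are illustrative choices, not the book's exact values.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """2-D Gaussian kernel, normalized so its entries sum to 1."""
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    g = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def dog_kernel(size=9, s_center=1.0, s_surround=2.0):
    """Difference of Gaussians: narrow center minus broad surround (sums to 0)."""
    return gaussian_kernel(size, s_center) - gaussian_kernel(size, s_surround)

def convolve2d_valid(image, kernel):
    """Dot product of the flipped kernel with the image at every position."""
    k = np.flipud(np.fliplr(kernel))   # flip for convolution (vs. correlation)
    kh, kw = k.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out
```

Because the kernel sums to zero, the response over any uniform patch is zero, which is why the convolved image highlights only the edges.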
Display 3.27 shows a program that computes a convolution of the image with a so-called
difference of Gaussians (DOG) detector at each spatial position. The DOG detector looks
Display 3.26
Figure 3.22
A phase scrambled image of the grayscale church photograph.
Display 3.27
Figure 3.23
DOG convolved church photograph.
for transitions in intensity. If the DOG is located over a patch where all intensities in
the original image are the same, the correlation is 0. The DOG convolution provides a
new image that highlights the edges in the pattern of the original image, as seen in
figure 3.23.
3.4 Summary
The basic concepts of computer graphics and digital manipulation of images described in
this chapter provide a basis for generating some of the most commonly used stimuli in
psychophysics. The examples were selected to illustrate important principles and tools that
can be used to generate many other kinds of visual images for display. Although all of the
current examples create individual images, compositions of multiple images can be used
to create more complex displays. And a series of images can be used to generate dynamic
sequences in time. The display of multiple images in space or in time is considered in
chapter 4.
References
1. Foley JD. Computer graphics: Principles and practice. New York: Addison-Wesley Professional; 1996.
2. Porter T, Duff T. 1984. Compositing digital images. ACM SIGGRAPH Computer Graphics 18(3): 253–259.
3. The MathWorks Inc. MATLAB [computer program]. Natick, MA: MathWorks; 1998.
4. Gonzalez RC, Woods RE, Eddins SL. Digital image processing using MATLAB, 2nd ed. Knoxville, TN: Gatesmark Publishing; 2009.
5. Marchand P, Holland OT. Graphics and GUIs with MATLAB. Boca Raton, FL: CRC Press; 2003.
6. Campbell F, Robson J. 1968. Application of Fourier analysis to the visibility of gratings. J Physiol 197(3): 551–566.
7. Porat M, Zeevi YY. 1988. The generalized Gabor scheme of image representation in biological and machine vision. IEEE Trans Pattern Anal Mach Intell 10(4): 452–468.
8. Pelli DG, Farell B. 1999. Why use noise? J Opt Soc Am A Opt Image Sci Vis 16(3): 647–653.
9. Lu ZL, Dosher BA. 2008. Characterizing observers using external noise and observer models: Assessing internal representations with external noise. Psychol Rev 115(1): 44–82.
10. Sperling G. 1965. Temporal and spatial visual masking. I. Masking by impulse flashes. JOSA 55(5): 541–559.
11. Chubb C, Sperling G. 1989. Two motion perception mechanisms revealed through distance-driven reversal of apparent motion. Proc Natl Acad Sci USA 86(8): 2985–2989.
12. Cavanagh P, Mather G. 1989. Motion: The long and short of it. Spatial Vision 4(2/3): 103–129.
13. Julesz B. Foundations of cyclopean perception. Chicago: University of Chicago Press; 1971.
14. Engel SA, Glover GH, Wandell BA. 1997. Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cereb Cortex 7(2): 181–192.
15. Sereno M, Dale A, Reppas J, Kwong K, Belliveau J, Brady T, Rosen B, Tootell R. 1995. Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science 268(5212): 889–893.
16. Biederman I, Ju G. 1988. Surface versus edge-based determinants of visual recognition. Cognit Psychol 20(1): 38–64.
17. Geisler WS, Perry JS. 1998. A real-time foveated multi-resolution system for low-bandwidth video communication. SPIE Proceedings 3299: 294–305.
18. de Jong PTVM. 2006. Age-related macular degeneration. N Engl J Med 355(14): 1474–1485.
19. Jain AK. Fundamentals of digital image processing. Englewood Cliffs, NJ: Prentice-Hall; 1989.
20. Enroth-Cugell C, Robson JG. 1966. The contrast sensitivity of retinal ganglion cells of the cat. J Physiol 187(3): 517–552.
21. Nussbaumer HJ. Fast Fourier transform and convolution algorithms. Berlin: Springer-Verlag; 1982.
22. Piotrowski LN, Campbell FW. 1982. A demonstration of the visual importance and flexibility of spatial-frequency amplitude and phase. Perception 11(3): 337–346.
23. Olman CA, Ugurbil K, Schrater P, Kersten D. 2004. BOLD fMRI and psychophysical measurements of contrast response to broadband images. Vision Res 44(7): 669–683.
24. Rainer G, Augath M, Trinath T, Logothetis NK. 2001. Nonmonotonic noise tuning of BOLD fMRI signal to natural images in the visual cortex of the anesthetized monkey. Curr Biol 11(11): 846–854.
4 Stimulus Presentation
4.1.1 Movies
A movie is a sequence of images shown in succession over time. The movie definition for
each image M in the sequence consists of the display time t, a display duration, a particular
screen position x, y, with magnification factors zoomx, zoomy, and a color lookup table
(CLUT). This definition is repeated for each image in the movie sequence.
Whether computed in the user program or loaded from an archive on a computer disk,
M is an image matrix that is stored in computer memory. To display M, it must be copied
from computer memory to a special memory on a graphics card along with the instructions
about when, where, and how to display the image. The display device, usually a computer
monitor, is a passive device that displays the contents of a currently active frame buffer.
A frame buffer is a section of memory on a graphics device that drives the contents of one
image on the display device.
Modern graphics cards usually have large memory capacity that can store multiple
image frames of a movie. During a given display sequence, the user program specifies
which section of graphics memory is the active frame buffer being displayed. In this multi-
buffering mode, one can display an active frame while simultaneously loading other image
frames into graphics memory.
The rate of screen re-display is called the refresh rate, which is either a preset property or a programmable
property of the display device. The duration of each movie frame is therefore defined in
terms of the number of refreshes during which an image frame is displayed. The beginning
of each image frame coincides with the refresh of the display device. The contents of a
frame buffer must be completely specified before it becomes active. If a frame buffer is
still in the process of being constructed while it is active, mergers of several images, visible
discontinuities, or "tears" in those images may occur. We describe tools to test for and
avoid these failures of display synchronization in chapter 5.
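Because durations are quantized to whole refreshes, the mapping from a desired duration to the nearest achievable one is worth making explicit. A small sketch (function names are ours):

```python
def refreshes_for_duration(duration_s, refresh_rate_hz):
    """Whole number of refreshes that best approximates a desired duration."""
    return max(1, round(duration_s * refresh_rate_hz))

def actual_duration(n_refreshes, refresh_rate_hz):
    """Duration actually produced by showing a frame for n refreshes."""
    return n_refreshes / refresh_rate_hz

# A nominal 250-ms frame on a 60-Hz display lasts exactly 15 refreshes:
n = refreshes_for_duration(0.250, 60)
print(n, actual_duration(n, 60))   # 15 0.25
```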
This book includes many example programs for psychophysical experiments. All the
sample experimental programs are written in MATLAB with Psychtoolbox Version 3
extensions (subroutines). Psychtoolbox Version 3 is a collection of MATLAB and GNU/
Octave functions for generating and presenting accurately controlled visual and auditory
stimuli and collecting responses from the observer.3 GNU/Octave is a high-level inter-
preted language for numerical computations. The Psychtoolbox package is freeware and
has more than 15,000 active users. It continues to be developed on a voluntary basis by a
large community of users. The Psychtoolbox wiki (http://psychtoolbox.org/wiki) contains a
forum and information for downloading and installing the package and about system
requirements.
Our intention is not to provide full coverage and instruction for all of the many Psych-
toolbox functions. Instead, we focus on those functions that are essential to the psycho-
physical laboratory and provide a general orientation to basic functions and what they
are designed to accomplish. The Psychtoolbox website includes a general overview and
introduction as well as various tutorials (http://psychtoolbox.org/PsychtoolboxTutorial).
Readers may also use the help function in MATLAB to access the document pages con-
cerning toolbox functions.
Unlike many software packages developed to run experiments, the philosophy of Psych-
toolbox is not to write and run a small number of highly structured experiments through
a descriptive language. Rather, it provides the interface between a MATLAB interpretive
programming environment and devices such as graphics devices, sound devices, clocks,
keyboards, joysticks, and touchpads by providing function calls to control the devices. The
use of general programming in a MATLAB environment allows the user to define any
number of different experiments in a very flexible manner.
To speed the execution of time-critical operations during an experiment, key Psych-
toolbox routines are written in C code that are called as functions from MATLAB. The
functions present a simple software interface that provides full control of the various
devices to the user. This allows experiments programmed in MATLAB with Psychtool-
box functions to access all of the display, timing, and response collection hardware with
Stimulus Presentation 63
millisecond timing accuracy. In the comments of the programs in this book, Psychtoolbox
is sometimes referred to as PTB.
Our first sample program has some calls to Psychtoolbox functions that are used in all
experiments but that may not be fully explained until later chapters in this book. In these
cases, we aim to provide a brief functional description of what a set of calls accomplishes
and refer the reader to the relevant subsequent section that treats this topic. In most cases,
the specific details that cannot be understood in full now will be explained or become clear
as you proceed through the book.
Each experimental program in this book consists of three modules: the Display Setup,
Experimental, and System Reinstatement modules. The first module of the program per-
forms general actions that set up and specify the experiment. The second module specifies
and runs the experiment. The last module resets the computer and screens to the state in
which they were found before the experiment.
The Display Setup module of each sample experimental program includes the definition
of basic aspects of the experimental setup such as properties of the screen (e.g., size and
resolution), the distance of the screen from the viewer, and a function to carry out some-
thing called Gamma correction (see chapter 5) in order to guarantee that what is shown
on the screen corresponds precisely to what is programmed for display in terms of lumi-
nance or relative contrast.
The Experimental module of each sample experimental program will define and compute
the stimuli to be shown to the observer, perform randomization or ordering of testing trials,
generate visual and/or auditory displays, supervise the delivery and timing of the displays,
and collect responses. In some cases, the Experimental module includes a simple data
summary table at the end. In all cases, the relevant information about the experimental
trials and collected data are saved in a raw data trace that could be used in subsequent
programs to provide more detailed analyses.
The final module of the sample programs carries out all operations to return the com-
puter and display to their original state so that others can use it.
Our philosophy in developing these sample experimental programs has been to provide
as much portability to new computer and display setups as feasible. For this reason, we
generally set up and define visual stimuli in terms of their size on the retina, or degrees
of visual angle. Sine-wave gratings are defined by their spatial frequencies in cycles per
degree (c/d) of visual angle, or the number of periods of the sine wave in 1 degree of visual
angle. Timing is defined in terms of seconds. A few basic facts about a particular experi-
mental setup, such as the physical size of a pixel (or of a 100-pixel line or square), the
viewing distance of the observer, and the refresh rate of the graphics display card, specify
the experimental setup. These key parameters can be used to express pixel sizes of images
from sizes in degrees of visual angle and duration of displays that correspond to a number
of screen refreshes. Once some of these basic parameters are set for your display and
system, the program should run in a portable way.
64 Chapter 4
Figure 4.1
Illustration of the concept of visual angle.
Figure 4.1 shows some of the key parameters of the display setup that determine the
translation between pixel size, viewing distance, and degrees of visual angle. In viewing
angle, the size of an object is expressed as the degrees of arc that it subtends on the retina.
The visual angle at the retina is a function of the size of the image, S, and the distance
from the eye, D:
Visual angle = 2 arctan[S/(2D)] ≈ (S/D) × 57.3 (degrees), if S << D.   (4.1)
The larger the object and the smaller the distance, the larger the visual angle subtended
by the object on the retina.
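Equation 4.1 is easy to check numerically; a small sketch (any consistent length unit works for S and D):

```python
import math

def visual_angle(size, distance):
    """Exact visual angle in degrees subtended by size S at distance D."""
    return 2.0 * math.degrees(math.atan(size / (2.0 * distance)))

def visual_angle_approx(size, distance):
    """Small-angle approximation, valid when S << D."""
    return 57.3 * size / distance

# A 1-cm object at 57.3 cm subtends almost exactly 1 degree:
print(round(visual_angle(1.0, 57.3), 4))   # 0.9999
print(visual_angle_approx(1.0, 57.3))      # 1.0
```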
Finally, many key parameters of the experiment in the sample programs are defined and
saved in a MATLAB structure. Structures are a way to group variables together. For
example, instead of defining individual variables ScreenDistance, ScreenHeight,
and so forth, they may be defined as fields in a structure: p.ScreenDistance,
p.ScreenHeight, and so on. The advantage of storing variables of different kinds in a
structure is that they can be grouped together even if they are of different lengths, sizes,
or types. This allows the experimenter to save all of the relevant variables together in a
single command. For example, after storing the list of parameter variables in the structure
p, you may save them all together as save('p.mat', 'p'). The structure of variables
can be reloaded together with load p.mat.
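Readers porting this pattern to another environment can do the same grouping with a dictionary. A rough Python analogue (the field names mirror the text; the values and file name are ours):

```python
import os
import pickle
import tempfile

p = {
    "ScreenDistance": 57,   # viewing distance in cm (illustrative value)
    "ScreenHeight": 30,     # physical screen height in cm (illustrative)
    "ScreenFrameRate": 60,  # refresh rate in Hz (illustrative)
}

path = os.path.join(tempfile.gettempdir(), "p.pkl")
with open(path, "wb") as f:      # analogous to save('p.mat', 'p')
    pickle.dump(p, f)
with open(path, "rb") as f:      # analogous to load p.mat
    restored = pickle.load(f)
assert restored == p             # the whole parameter set round-trips
```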
a b
Figure 4.2
A motion stimulus defined by successive image frames. Five frames of a leftward (a) and a rightward (b) moving
sine-wave grating.
In this section, we explain a very simple program that runs an experiment to measure the
observer's ability to see the motion direction of a drifting sine-wave grating (figure 4.2).
The example program is divided into several modules. A Display Setup module defines
the various screen parameters, initializes the display screen for experimental displays,
loads the color lookup table, and specifies the screen font. The Experimental module
creates images for the movies, executes the movies, collects responses, and saves data.
The System Reinstatement module restores the settings of the screen and the computer to
the original state at the beginning of the program, which leaves it in a standard state for
other users or applications. Each module is considered in turn.
The experiment tests motion perception by showing a small number of frames of a
moving stimulus and asking the observer to identify the direction of motion as left or right.
Figure 4.2 shows five image frames of a moving sine-wave grating with the first frame at
the top and frames that would be shown in successive time intervals going down. Leftward
motion is shown on the left and rightward motion on the right.
The sample program is a very simple example of an experiment with 100 trials. Fifty
of the 100 trials, randomly chosen, present a sine-wave grating moving to the left. The
remaining 50 trials present a rightward moving sine-wave grating. The experiment mea-
sures the observer's ability to see the motion direction of a drifting 8 × 6 degree sine-wave
grating with 20% contrast, a spatial frequency of 1 c/d, and a temporal frequency of 4 Hz.
Each trial starts with a 250-ms fixation, followed by the presentation of the moving
grating (250 ms or 1 temporal cycle). The observer presses the left or right arrow on the
keyboard to indicate the perceived direction of motion after each stimulus presentation.
The program saves the observer's response along with the stimulus condition for each
trial.
The general approach in this program is to specify the stimulus in terms of its physical
parameters and then compute the corresponding images that are required with respect to
the screen size, pixel resolution, and screen refresh rate of your display. For example, the
number of pixels of a sine-wave grating of a given size and spatial frequency are computed
from the viewing distance and the known physical parameters of the display screen. The
temporal frequency of 4 Hz corresponds with a period, or full repeat, of the position of
the sine wave four times per second, or one cycle per 250 ms. The required phase shift of the
sine wave in each new frame of the movie is computed based on the known refresh rate
of the display screen.
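Arithmetic like this is the heart of making the program portable. A sketch of the per-refresh phase step (the function and parameter names are ours):

```python
def phase_step_deg(temporal_freq_hz, refresh_rate_hz):
    """Degrees of grating phase advanced on each screen refresh."""
    return 360.0 * temporal_freq_hz / refresh_rate_hz

# A 4-Hz grating on a 60-Hz display steps 24 degrees per refresh,
# so one full 360-degree cycle takes 15 refreshes, i.e., 250 ms:
print(phase_step_deg(4.0, 60.0))   # 24.0
```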
% Open the display window, set up lookup table, and hide the
% mouse cursor
if exist('onCleanup', 'class'), oC_Obj = onCleanup(@() sca); end
% close any pre-existing PTB Screen window
% Prepare setup of imaging pipeline for onscreen window.
PsychImaging('PrepareConfiguration'); % First step in starting
                                      % pipeline
PsychImaging('AddTask', 'General', ...
    'FloatingPoint32BitIfPossible');
% set up a 32-bit floating-point framebuffer
PsychImaging('AddTask', 'General', ...
    'NormalizedHighresColorRange');
% normalize the color range ([0, 1] corresponds
% to [min, max])
PsychImaging('AddTask', 'General', 'EnablePseudoGrayOutput');
% enable high gray level resolution output with
% bitstealing
PsychImaging('AddTask', 'FinalFormatting', ...
    'DisplayColorCorrection', 'SimpleGamma');
% setup Gamma correction method using simple power
% function for all color channels
[windowPtr, p.ScreenRect] = PsychImaging('OpenWindow', ...
    whichScreen, p.ScreenBackground);
% Finishes the setup phase for imaging pipeline,
% creates an onscreen window, and performs all remaining
% configuration steps
PsychColorCorrection('SetEncodingGamma', windowPtr, ...
    1/p.ScreenGamma);
% set Gamma for all color channels
HideCursor; % Hide the mouse cursor
You can learn more about the PsychImaging function by using the MATLAB help command.
In this particular program, we set up a 32-bit floating-point frame buffer and set the range
of color values from 0 to 1. We also enable high gray-level resolution with bit stealing and
set up the method of Gamma correction as a simple power function for all color channels.
Bit stealing and Gamma correction are discussed in detail in chapter 5. Each one of these
choices is seen in a line of the code in display 4.2 along with a comment. The purpose of
this series of PsychImaging commands is to create displays with calibrated high gray-
level resolution.
The last call to PsychImaging opens the display window:
PsychImaging('OpenWindow', whichScreen, p.ScreenBackground).
This call returns a pointer to the display area in the variable windowPtr. It returns the
coordinates of the upper left and lower right of the screen in the parameter p.ScreenRect.
The values in p.ScreenRect correspond to the selected resolution of the display (i.e., [0
0 1024 768]). The following few lines of code set up the Gamma correction, compute and
load the color lookup table, and make the cursor invisible.
A call to FrameRate returns the current frame rate setting of the experimental window
identified by windowPtr and stores the result in p.ScreenFrameRate.
Finally, a font for alphanumeric character displays, chosen by the experimenter, is speci-
fied in the command Screen('TextFont', windowPtr, 'Times'). The font size
is loaded by a call to Screen('TextSize', windowPtr, 24).
Display 4.2
%% Experimental Module
% Save Results
save DriftingSinewave_rst.mat rec p; % save the results
Next, the program creates sine-wave grating images using a function called Cre-
ateProceduralSineGrating. This function creates a texture on the graphics processing
unit (GPU) for sine-wave images and returns the pointer to tex. Programming the sine
wave on a GPU using this function has the advantage of rapid computation. Your system
must include a graphics card that supports it: a recent Direct 3D-10/11 capable or
OpenGL-3/4 capable system (see Psychtoolbox help files for a list of capable graphics
cards).
If your system is older and does not have a compatible graphics card, it is straightfor-
ward to use standard MATLAB programming to create the images (using meshgrid) and
compute the luminance of each image pixel for the appropriate sine waves (see chapter 3
for examples).
The experiment itself is set up in a 100 × 4 matrix rec representing the conditions to
be tested and the responses to be stored for the 100 trials. The labels of each column are
saved in p.recLabel. Each of the 100 rows of the matrix rec contains four columns.
The first column of the matrix holds the trial number, and the second column assigns
exactly 50 trials for left-drifting sine waves and 50 trials for right-drifting sine waves (−1,
1). The third and fourth columns are set up in advance to store the response accuracy and
the response time for the trial once it has been run. The routine Shuffle is used to random-
ize the testing order of the different trials, and so randomizes the direction of drift over
trials.
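The same design matrix and shuffling logic can be sketched in NumPy. The column layout follows the text; coding the two directions as −1 and 1 is our reading of the original program.

```python
import numpy as np

n_trials = 100
rng = np.random.default_rng(seed=1)

rec = np.zeros((n_trials, 4))
rec[:, 0] = np.arange(1, n_trials + 1)          # column 1: trial number
directions = np.repeat([-1, 1], n_trials // 2)  # exactly 50 left, 50 right
rng.shuffle(directions)                          # randomize testing order
rec[:, 1] = directions                           # column 2: motion direction
# columns 3 and 4 stay zero until accuracy and RT are recorded per trial

print((rec[:, 1] == -1).sum(), (rec[:, 1] == 1).sum())  # 50 50
```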
Just before the experiment is actually run, the priority of the display window is set to
highest priority using the Psychtoolbox function call Priority(MaxPriority(windowPtr)).
It is quite important to give maximum priority to the experimental display window
in order to improve the accuracy of the timing of the display. If priority is not given to the
experiment, random actions by the operating system may delay or distort the timing of
displays.
First, at the beginning of the experimental session, instructions are shown on the screen
indicating the arrow keys to be used for responses, and then providing a fixation point as
a simple centered rectangle.
Next, the trials are presented and data are collected. Each trial has two parts. The first
computes and displays the movie of the moving sine wave. The second collects the
response and reads the response time for that response.
Each frame of the movie corresponds with a call to the sine-wave procedure on the
GPU. These were set up earlier in the program using CreateProceduralSineGrating.
The function call to Screen('DrawTexture', windowPtr, tex, [], [], 0, [],
[], [], [], [], params) computes and stores each new frame with the parameters
in params using the sine-wave procedure pointed to by tex. The general form of the call is
Screen('DrawTexture', windowPointer, texturePointer [, sourceRect]
[, destinationRect] [, rotationAngle]), where [ ] denotes optional arguments. Use the help function
for descriptions and to see more options of the function Screen.
Each frame of the movie is shown one by one at the frame rate by a call to
Screen('Flip'). This flips what is shown on the display from one image to the
next with a timing that is synchronized with the frame rate of the display device.
Finally, at the end of the nFrames of the drifting sine wave, we show the fixation
display. The response of the observer is collected and recorded along with the response
time from the beginning of the trial and stored in the third and fourth columns of the
matrix rec.
After all the trials are completed, a date string is recorded in p.finish. At this point,
the record of all trials in the matrix rec contains, in the order of running, a trial number
(1 to 100), the trial type (left or right motion), whether that trial was correct or incorrect,
and the associated response time. The trial records are saved along with all of the
parameters of the experiment in DriftingSinewave_rst.mat.
Stimulus Presentation 73
Display 4.3
In section 4.3, we reviewed in detail a sample program that carried out a simple experiment
in motion perception. In this section, we provide a number of sample programs to carry
out other experiments, including measurement of magnitude estimation, contrast psycho-
metric function, contrast sensitivity function, rapid serial visual presentation for a memory
task, playing a movie read in from an outside source, and measurement of retinotopy of
visual cortex for functional magnetic resonance imaging (fMRI). These examples are
selected to reflect a representative sample of important paradigms and to include often
used programming requirements and functions. Taken together, this set of examples should
provide the basis for devising and running a very wide range of psychophysical and cogni-
tive experiments in the MATLAB plus Psychtoolbox environment.
Although the example experiments differ in many details, there are many similarities
between the different programs. The similarities mean that understanding the first example
with detailed annotation in section 4.3 should make it easy to understand the new experi-
ments. For the programs and experiments here, we show only the Experimental module
the Display Setup module and System Reinstatement module would be the same as the
first example. Each program here includes inline comments. The help command for Psych-
toolbox functions can be used to learn more details about individual function calls
as the reader works through each experimental program.
Display 4.4
%% Experimental Module
Screen('Flip', windowPtr);
% show reference instruction text
key = WaitTill({'space' 'esc'});
% wait for SPACE response
if strcmp(key, 'esc'), break; end
% if response is <escape>, then stop experiment
Secs = Screen('Flip', windowPtr);
% turn off text by flipping to background image
end
to understand the relationship between physical manipulations of the stimulus and the
internal scale. This experiment measures the effect of variations in luminance. The experi-
ment consists of a total of 140 trials. In each trial, the subject is shown a 3° radius disk
at one of seven possible luminance levels (1, 11, 21, 31, 41, 51, and 61 cd/m²). The disk
is displayed for 200 ms. Observers are asked to assign a numerical value to the perceived
intensity using the computer keyboard. At the beginning and after every 14 trials (two
presentations of each of the seven luminance levels), the subject is presented a patch with
luminance set at 25 cd/m², whose perceived intensity is defined as 10, to provide an anchor
for the rating scale. The same set of luminance levels is tested 20 times in random order.
The observer's rating in each trial is saved. The average rating of a particular luminance
level provides an estimate of its perceived intensity (figure 4.3).
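The analysis of the saved ratings is simple averaging by level. As a sketch (the variable names and the simulated power-law ratings are illustrative, not taken from the experiment's program):

```matlab
% Hypothetical per-trial data: luminance level shown and the observer's rating.
levels = [1 11 21 31 41 51 61];            % the seven luminances, cd/m^2
lum    = repmat(levels, 1, 20);            % 140 trials, 20 per level
rating = 10 * (lum / 25) .^ 0.33 ...       % simulated ratings: power law
         + 0.5 * randn(size(lum));         % anchored at 10 for 25 cd/m^2, plus noise

% Average the ratings at each luminance level to estimate perceived intensity.
meanRating = zeros(size(levels));
for k = 1:numel(levels)
    meanRating(k) = mean(rating(lum == levels(k)));
end
```

Plotting meanRating against levels produces a curve of the form shown in figure 4.3b.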
The topic of magnitude estimation is treated in section 7.2.1 of chapter 7, where we
consider the theoretical assumptions and interpretation of magnitude estimation scaling.
Figure 4.3
Magnitude estimation of perceived light intensity. (a) Trial sequence in the magnitude estimation experiment.
(b) Estimated magnitudes of patches with different light intensity.
Display 4.5
%% Experimental Module
p.contrasts = logspace(log10(conRange(1)), log10(conRange(2)), nContrast);
Figure 4.4
Measuring the contrast psychometric function with the method of constant stimuli and two-interval forced-choice
(2IFC). (a) Illustration of the 2IFC paradigm for several trials. (b) The measured contrast psychometric
function.
The two intervals are each marked by the presentation of fixation crosshairs, with onsets
of the intervals separated by 1000 ms. The observer reports the interval most likely to
contain the grating. The computer records the stimulus condition and the observer's
response on each trial. At the end of the 700 trials of the experiment, the computer tabulates
the accuracy with which the correct interval was selected by the observer at each contrast.
The function that relates the accuracy of performance to the stimulus contrast is the psy-
chometric function (figure 4.4b).
Figure 4.4a shows a sample test sequence over several trials that randomly vary the
contrast of the grating and the interval in which it appears (first or second). Table 4.1
shows a tabulation of hypothetical data from the experiment. Figure 4.4b graphs the psy-
chometric function. The error bars for each observed proportion are estimated from the
variance of the binomial distribution, pq/N, where p is the proportion correct, q = 1 − p is
the proportion incorrect, and N is the number of trials. The data from psychometric function
experiments of this kind are analyzed in detail in section 10.4.2 of chapter 10. In that
Table 4.1
A sample psychometric function
section, the data are modeled with a function, and a threshold contrast is estimated for a
particular accuracy level such as 75% correct.
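The binomial error bars can be computed directly from the tabulated proportions. A sketch (pc, nTrialsPer, and the contrast values are illustrative, hypothetical data):

```matlab
% Proportion correct at each of seven contrasts, 100 trials per contrast.
contrasts  = [0.25 0.44 0.79 1.41 2.50 4.48 7.91];  % percent contrast
pc         = [0.52 0.55 0.64 0.73 0.86 0.94 0.98];  % hypothetical data
nTrialsPer = 100;

% Standard error of a binomial proportion: sqrt(p*q/N), with q = 1 - p.
se = sqrt(pc .* (1 - pc) / nTrialsPer);

% Plot the psychometric function with error bars on a log contrast axis.
errorbar(log10(contrasts), pc, se);
xlabel('log_{10} contrast (%)'); ylabel('Probability correct');
```

Note that the error bar shrinks as p approaches 0 or 1, which is why the upper points of a psychometric function tend to look better constrained than the middle ones.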
Display 4.6
%% Experimental Module
p.contrasts = logspace(log10(conRange(1)), log10(conRange(2)), nContrast);
p.SFs = logspace(log10(sfRange(1)), log10(sfRange(2)), nSF);
SFs = p.SFs / ppd;
% compute spatial frequency in cycles per pixel
for i = 1 : nTrials
    con = p.contrasts(rec(i, 2));
    % use contrast index from rec to set contrast for this trial
    sf = SFs(rec(i, 3));
    % use spatial frequency index from rec to set spatial
    % frequency for this trial
    flipSecs = Secs + p.ISI + [0 p.interval];
    % define the start time of the two intervals
    for j = 1 : 2
        if rec(i, 4) == j  % draw the grating in the interval
                           % defined in rec(i,4) only
            Screen('DrawTexture', windowPtr, tex, [], [], ...
                0, [], [], [], [], [], [180 sf con 0]);
        end  % draw the sine grating with phase 180,
             % spatial frequency, and contrast
        Screen('DrawLines', windowPtr, fixXY, 3, 0);
        % add the fixation crosshairs
        t0 = Screen('Flip', windowPtr, flipSecs(j));
        % show the stimulus and return the time
        Screen('Flip', windowPtr, t0 + p.stimDuration);
        % turn off the stimulus by flipping to
        % background image after p.stimDuration secs
    end
Figure 4.5
Measuring a contrast sensitivity function. (a) Sample trial sequence. (b) Psychometric functions at nine spatial
frequencies.
Figure 4.6
Contrast sensitivity function.
Table 4.2
A sample contrast sensitivity function
Spatial frequency (c/d)   0.125   0.25   0.50    1.0    2.0    4.0    8.0   16.0   32.0
Threshold (%)              1.43   0.69   0.49   0.49   0.53   0.54   1.01   2.17   4.74
Sensitivity                70.1  144.7  202.2  204.6  189.0  183.5   98.7   46.0   21.1
correct was estimated for each spatial frequency. These thresholds and the corresponding
sensitivity, or 1/threshold, are listed in table 4.2. The sensitivity estimates are graphed
in figure 4.6 as the measured contrast sensitivity function. There is an extensive treatment
of how to model contrast sensitivity functions in section 10.4.3 of chapter 10.
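The conversion from thresholds to sensitivities in table 4.2 is a simple reciprocal; with thresholds expressed in percent contrast, sensitivity = 100/threshold. A sketch using the table's values:

```matlab
% Thresholds (percent contrast) at each spatial frequency, from table 4.2.
sfs       = [0.125 0.25 0.50 1.0 2.0 4.0 8.0 16.0 32.0];   % c/d
threshPct = [1.43 0.69 0.49 0.49 0.53 0.54 1.01 2.17 4.74];

% Sensitivity is the reciprocal of threshold contrast (as a proportion).
sensitivity = 100 ./ threshPct;   % e.g., a 1.43% threshold gives about 70

% Plot on log-log axes, the conventional form of the CSF (cf. figure 4.6).
loglog(sfs, sensitivity, 'o-');
xlabel('Spatial frequency (c/d)'); ylabel('Sensitivity');
```

Small discrepancies from the tabled sensitivities (e.g., 69.9 versus 70.1) reflect rounding of the printed thresholds.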
information across the visual field.9–11 RSVP paradigms have been influential in the
development of theories in many of these areas.
Display 4.7 shows a program for a simplified condition of an attention paradigm by
Reeves and Sperling9 and Weichselgartner and Sperling.7 The purpose of the experiment
is to measure the temporal properties of attention leading to storage and report. This
program considers the simple example of a single location, sometimes called a single
stream, of rapidly presented letters. The task of the observer is to report the first four
items following a report cue, which in this case is a square surrounding one of the
letters. Opening an attention window prior to the cue would flood a memory store with
irrelevant items, so observers must try to open an attention window as soon as they see
the square cue. The items are processed through this attention window, and then the
most salient four items are selected by the observer for report. In typical data, each
reported item is drawn from a temporal position either just before, simultaneous with,
or just after the cue. While the observer is trying to report the four items simultaneous
with the cue and following it, in fact the distribution of the locations of reported items
broadly follows the interval after the cue, with a maximum at some typical delay. In
this paradigm, the data are analyzed in a very clever way to estimate the temporal
properties of the attention window, which like a gate opens to its full extent and then
closes again.10
On each trial, a string of 23 letters is displayed in random order, appearing in the same
location on the screen. The letters are chosen from a set of 23 letters (excluding I, Q, and
V because of visual similarity with J, O, and U, respectively). A new letter is shown every
150 ms, with a display duration near 50 ms (here three frames); the rest of the interval is
blank. The report cue appears at one of the temporal positions from 6 to 15 equally often.
Observers are asked to report the first four letters starting
with the one that is simultaneous with the cue. In each trial, the observer produces an
ordered report string of four items (i.e., the observer might report H X O E). There are
200 trials.
Figure 4.7a shows a sample sequence in temporal order and the surrounding box that
is the report cue. Each letter is translated to a position score based on its position in the
RSVP stream that was presented and in turn converted to positions relative to the cue. A
position code for letters relative to the ordinal position of the target is shown below each
letter. The target letter, which is simultaneous with the cue, is labeled 0 while letters before
the cue have negative position scores, and letters after the cue have positive position scores.
For example, a reported 4-tuple <H X O E> has position scores of <+2 +4 +1 +3> relative
to the report cue. These reports can be scored in several ways. First, the frequencies of
report (regardless of report order) are computed by summing the number of reports at each
relative temporal position −2, −1, 0, 1, 2, . . . . Second, the reports can be scored for sys-
tematic report orders usually called iBj scores. This refers to the probability of reporting
the letter displayed in temporal position i before the letter displayed in temporal position
Display 4.7
%% Experimental Module
for j = 1 : nLetters
    if j == rec(i, 2)
        Screen('FrameRect', windowPtr, 1, cueRect, 3);
    end
    Screen('DrawText', windowPtr, p.seq(j, i), x1, y1, 1);
    t0 = Screen('Flip', windowPtr, t0 + blankDur);
    % letter on
    t0 = Screen('Flip', windowPtr, t0 + stimDur);
    % letter off
end
Screen('TextSize', windowPtr, 48);
Screen('TextFont', windowPtr, 'Times');
DrawFormattedText(windowPtr, 'Please input the 4 letters', ...
    'center', 'center', 1);
t0 = Screen('Flip', windowPtr); % show prompt string
[Figure 4.7a letter stream: Y F C R L G S O H E X U P A Z B D T N J W K M, with a cue box around one letter; panels b and c plot probability of reporting and order scores PiBj against relative temporal position −1 to 8.]
Figure 4.7
Measuring an attention reaction time with an RSVP experiment. (a) A sample display sequence. (b) Item scores
for each individual response position (the probability of a letter from stimulus position i appearing in
response position j). (c) The proportion PiBj of trials in which letters from stimulus position i are reported earlier in the
response than those from position j, as a function of position j, with the curve parameter i.
j, which is defined as 0.5 when i = j. Both kinds of scoring lead to highly systematic pat-
terns in the data that reveal something about the window of attention.
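Both scoring schemes can be computed mechanically from the reported 4-tuples. A sketch for a single trial (the variable names are illustrative; in practice the counts are accumulated over all trials):

```matlab
% One trial's report, scored relative to the cue: the example report
% <H X O E> corresponds to relative position scores <+2 +4 +1 +3>.
posScores = [2 4 1 3];

% Item scores: tally reports at each relative temporal position (-2 ... 8).
positions = -2:8;
itemCount = histc(posScores, positions);   % frequency of report per position

% iBj score for one pair on one trial: 1 if the letter from relative
% position i appears earlier in the report than the letter from position j,
% 0 if later, and 0.5 by definition when i = j.
i = 1; j = 3;
iBj = find(posScores == i) < find(posScores == j);   % here: 1 reported before 3
```

Averaging the iBj indicator over trials for every pair (i, j) yields the order-score curves of figure 4.7c.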
Figure 4.7b shows the item scores pooled over all response positions (solid line) and
for each position in the report order (dashed lines). The figure shows typical results for
medium display speeds. The observer does a reasonably good job of reporting the correct
items, although the order is somewhat blurred. Figure 4.7c shows the results of the iBj
analysis. For example, the curve marked 0 shows that the item in the 0 position, simultane-
ous with the cue, is reported first more often than any other position, and nearly perfectly
before all letters in position 3 or greater. The fact that the curves are largely laminar
(layered like laminated surfaces) reflects a consistent ordering even when items are not
reported in the correct order. These highly ordered data have been interpreted as reflecting
the operation of attention triggered by the cue and the reporting of items in proportion to
their perceived strength rather than explicitly coded order (see ref. 9).
Display 4.8
t = Screen('Flip', windowPtr);
toTime = toTime - fromTime + t;
% compute stop time relative to computer time
Display 4.9
Screen('CloseMovie', movie) releases the pointer movie and all OpenGL resources
associated with it.
The Psychtoolbox functions for handling movies that are briefly described here are
explained in more detail by the help function. The reader should check the help function
for the definitions of additional parameters and options.
4.4.6 Retinotopy
The next example program is somewhat more complicated. It is included here to illustrate
an important function for data collection in visual fMRI. The program shows how to
present a series of displays of the sort that has been widely used to measure the retinotopy
in early visual cortex in human brain imaging. Many regions of visual cortex have a reti-
notopic organization in which different regions of the visual field are represented in the
brain in a systematic way. The purpose of the retinotopy display sequences is to measure
those voxels of the fMRI (and the corresponding group of neurons) that respond to each
point in the visual field. A given point on the cortical surface represents neurons from
several cortical layers whose receptive fields center on the same point in visual space. In
retinotopic organizations, adjacent points on the cortical surface correspond to adjacent
points in the visual field. The purpose of retinotopy experiments is to use a sequence of
displays to find the correspondence between stimulation in a part of the visual field and
activity in a cortical area in fMRI.
Wedges and rings made of flickering radial color checkerboard patterns are widely
used to identify retinotopic visual areas of each observer.12,13 The relevant display
sequences over time stimulate different regions of visual space with flickering stimuli.
Figure 4.8a shows a series of black and white checkerboard wedges that cycle radially
through stimulating different regions of the visual field, like positions of the hand of an
old-fashioned analog clock. Figure 4.8b shows a series of black and white checkerboard
rings that cycle through stimulating regions of the visual field of different eccentricity
starting from fixation. The aspect ratio of the dark and light checkered regions at differ-
ent eccentricities is computed by setting their (radial) height equal to their (tangential)
mid-width. For the wedge displays, a cycle of 32 images is shown over a period of 32
s, 1 s each. During the 1 s of each image, the checkerboards reverse polarity (change
which is black and which is white) at 7.5 Hz. This flicker drives neural activity in visual
cortex at that set of locations in the visual field. The wedge has a radius of 8.5° of
visual angle and a width of one-eighth of a disk, or 45°, and is
divided into four subsectors. Every second, the image is rotated counterclockwise one-
fourth of its width, so the wedge sweeps the whole visual field in 32 s. The wedge
sequence is shown eight times continuously. Each ring is made of two checkers in the
radial direction. The rings expand from the center of the display at the speed of one
checker per second. The entire cycle takes 20 s. The ring displays are also shown a
total of eight times continuously. During both the wedge and the ring display cycles, a
central fixation square changes from black to red or vice versa at randomly chosen
intervals between 5 and 15 s.
Figure 4.8
Image sequences for defining fMRI retinotopy. (a) Display sequence of the wedge. (b) Display sequence of the
ring.
For an fMRI retinotopy session, observers are asked to maintain fixation and to press
a key as soon as the fixation region changes color: '1' indicates a white fixation and '2'
indicates a red fixation. Image sequences of the rotating wedges or expanding rings are
shown while the observer maintains fixation. An fMRI analysis correlates the activity in
different voxels representing a location on the cortical surface with the stimulation in a
region of the visual field from the wedge and ring flickering displays.
To run an actual fMRI experiment, the experimental computer that generates the visual
images and collects the responses would need to receive triggers from the MRI system.
For example, a Siemens scanner sends trigger signals that are represented as equivalent to
pressing '5' on the keyboard of the experimental computer. Additionally, behavioral
responses would likely not be collected on a keyboard, but rather would make use of an
fMRI-compatible response collection apparatus.
The retinotopy programs introduce the use of several Psychtoolbox function calls for
creating image textures on the GPU as functions under the main function Screen. The
call to Screen('MakeTexture', ...) serves a basic function of converting a two-
dimensional (2D) or three-dimensional (3D) matrix into an OpenGL texture (a display on the
GPU) and returning a pointer that can then be used by various other functions to specify
the texture. The form of the call is textureIndex = Screen('MakeTexture',
WindowIndex, imageMatrix). See the Psychtoolbox help function for other options. The
2D image(s) containing the black and white circular checkerboards are first set up with
calls to MakeTexture in display 4.10.
Once the images are converted to OpenGL textures, we need to specify the content of
the textures. In the case of the wedge program in display 4.10, the arcs of the circular
checkerboard were drawn with Screen('FrameArc', ...). This function draws an arc
(here, essentially one of the black or white checks in the circular checkerboard display)
by specifying a starting angle, an angle for the arc, and a length or pen width. Angles
are measured clockwise from vertical. This function is called as Screen('FrameArc',
windowPtr, color, [rect], startAngle, arcAngle, penWidth). There are
a number of other optional variables that can be looked up using the Psychtoolbox help
function.
In the wedge experimental program, first the entire visible wedge of texture1 is drawn
as white by calling Screen('FrameArc', tex(1), 1, [], 0, p.wedgeDeg,
nPixels). The FrameArc subfunction is also subsequently used to paint each black
check in the wedge, with the color set to 0, the location specified by rect, the initial angle
specified by ang0, the angle to be swept by the arc of dA, and the pen width (or radial
length) of nPixels (see display 4.10).
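Putting the two uses of FrameArc together, drawing one wedge texture might be sketched as follows. This is a simplified, one-ring version; p.wedgeDeg, nPixels, and the alternation rule are assumptions patterned on the description above, not the exact code of display 4.10:

```matlab
% Draw one wedge frame into an offscreen texture: white base, black checks.
Screen('FrameArc', tex(1), 1, [], 0, p.wedgeDeg, nPixels);  % whole wedge, white
dA = p.wedgeDeg / 4;                  % angular width of one subsector
for s = 1:4                           % four angular subsectors
    if mod(s, 2) == 1                 % alternate subsectors get black checks
        ang0 = (s - 1) * dA;          % starting angle of this subsector
        Screen('FrameArc', tex(1), 0, [], ang0, dA, nPixels);
    end
end
```

The full program additionally alternates the checks across eccentricity (by varying rect and the pen width) and swaps the polarity pattern at 7.5 Hz to produce the flicker.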
Another function used to draw the textures once they are created is
Screen('DrawTexture', ...). To display the image, this function draws a texture pointed
to by a texture pointer, here tex(i), into an (active) screen window pointed to by a
window pointer, here windowPtr.
Display 4.10
%% Experimental Module
key = ReadKey(keys);
if strcmp(key, 'esc'), break; end
% stop the display if esc is pressed
if isempty(key)
    validResp = 1;
    % reset response flag, the next response will be valid
elseif validResp
    validResp = 0;
    rec(i, 2) = vbl - t1; % record response time
end
end
Figure 4.9 (plate 6) shows the results of a retinotopy experiment using the wedge and
ring stimuli from displays 4.10 and 4.11. The fMRI responses to different parts of the
visual field are shown overlaid on a flattened representation of the visual cortex. Lines are
drawn to label different visual cortical regions, such as V1, V2, V3, and V4. The retinotopy
is based on the organizational principles of the visual cortex: The borders between the
early visual areas coincide with the representations of the vertical and horizontal meridians
in the visual field, and neighboring areas contain mirror-reversed maps of the visual
space.12,13
4.5 Summary
In this chapter, we have shown how to use movies for controlled visual presentations.
Movies are simply sequences of images that are displayedtypically but not necessarily
rapidlythat may or may not be accompanied by audio. We started with the detailed
presentation of the anatomy of a simple experiment using movies programmed in MATLAB
with Psychtoolbox.1,2 The chapter provides samples of implementations of representative
stimuli and tasks. These include cases where the images of the movies are constructed
Display 4.11
%% Experimental Module
key = ReadKey(keys);
if strcmp(key, 'esc'), break; end % to stop
if isempty(key)
    validResp = 1; % key released, next response
                   % will be valid
elseif validResp
    validResp = 0; % debouncing
    rec(i, 2) = vbl - t1; % record response time
end
end
either ahead of time or on the fly by the experimenter and also an example of presenting
modified clips from native movies. Combined with some of the image processing examples
in chapter 3, these simple examples provide the basis of the expertise that allows the reader
to generate a wide range of interesting experiments and demonstrations and to control their
timing and display.
[Figure 4.9, plate 6: two flattened cortical maps (a, b) with fixation marks and retinotopic areas labeled V1d, V2d, V3, V3a, V1v, V2v, VP, and V4v.]
References
1. Pelli DG. 1997. The VideoToolbox software for visual psychophysics: Transforming numbers into movies.
Spat Vis 10(4): 437–442.
2. Brainard DH. 1997. The psychophysics toolbox. Spat Vis 10(4): 433–436.
3. Psychtoolbox-3. [computer program] Available at: http://psychtoolbox.org/.
4. Fechner G. Elemente der Psychophysik. Leipzig: Breitkopf & Härtel; 1860.
5. Stevens SS. 1946. On the theory of scales of measurement. Science 103(2684): 677–680.
6. Chun MM, Potter MC. 1995. A two-stage model for multiple target detection in rapid serial visual presenta-
tion. J Exp Psychol Hum Percept Perform 21(1): 109–127.
7. Weichselgartner E, Sperling G. 1987. Dynamics of automatic and controlled visual attention. Science
238(4828): 778–780.
8. Holmes V, Arwas R, Garrett M. 1977. Prior context and the perception of lexically ambiguous sentences. Mem
Cognit 5(1): 103–110.
9. Reeves A, Sperling G. 1986. Attention gating in short-term visual memory. Psychol Rev 93(2): 180–206.
10. Sperling G, Weichselgartner E. 1995. Episodic theory of the dynamics of spatial attention. Psychol Rev
102(3): 503–532.
11. Shih SI, Sperling G. 2002. Measuring and modeling the trajectory of visual spatial attention. Psychol Rev
109(2): 260–305.
12. Engel SA, Glover GH, Wandell BA. 1997. Retinotopic organization in human visual cortex and the spatial
precision of functional MRI. Cereb Cortex 7(2): 181–192.
13. Sereno M, Dale A, Reppas J, Kwong K, Belliveau J, Brady T, Rosen B, Tootell R. 1995. Borders of multiple
visual areas in humans revealed by functional magnetic resonance imaging. Science 268(5212): 889–893.
14. Li X, Lu Z-L, Tjan BS, Dosher B, Chu W. 2008. BOLD fMRI contrast response functions identify multiple
mechanisms of attention in early visual areas. Proc Natl Acad Sci USA 105(16): 6202–6207.
5 Visual Displays
The evaluation, calibration, and choice of visual displays are critical to the success of the
psychophysics enterprise. In this chapter, we show how to evaluate and calibrate visual
displays and summarize the qualities of some current display technologies. A number of
simple sample programs for calibration and evaluation of displays are provided. Obviously,
the specifics of particular devices at any given time will be subject to change as technolo-
gies emerge in the marketplace. However, the goals and principles of evaluation and cali-
bration will be applicable to any new device.
The quality of the visual display is critical in visual psychophysics. The purpose of visual
psychophysics is to measure the behavioral response of the observer to a visual stimulus
with known physical properties of size, luminance, timing, and so forth. A characterization
of the human (or animal) observer requires the experimenter to design a particular visual
stimulus and to then generate that stimulus in a visual display. To understand exactly what
has been presented, we must either know or measure the properties of the display. Consider
several examples. To test differences in response to a pattern in different orientations, we
must ensure that the luminance and contrast properties of the stimulus shown to the
observer are equivalent at different orientations. To measure responses to color stimuli,
we must control the color information provided in the display. To measure the perception
of motion, the experimenter must also control the temporal characteristics of the displays.
Assessment of visual memory persistence requires that we can attribute persistence to the
viewer and not to persistence on the display.
Any visual display device should be evaluated for the following properties: spatial reso-
lution, refresh rate, pixel depth, color and luminance homogeneity, geometric distortions,
temporal response function, pixel independence, and display synchronization.1–7
Even though the nominal spatial resolution of a display is specified by the device
design, the homogeneity and independence of the values of pixels in the display may be
imperfect. Similarly, the refresh rate provided by the manufacturer may not fully reflect
the exact temporal properties of the display. There may be persistence of the physical
stimulus or delays in the onset of images because of various possible conversions between
the graphics card and the display device. The display luminance has an unknown physical
range and a nonlinear transformation from the programmed image values specified in a
bitmap to physical values on the display device. This nonlinear transformation needs to
be measured. In some cases, even monochrome display devices have a characteristic color
shift away from gray; that is, the specific phosphors of a cathode ray tube (CRT) device
may have a distinctive color.
Measurement devices are required for some kinds of evaluation. A precise ruler is used
to measure the spatial properties in relation to a physical standard. A colorimeter is used
to measure the spectral composition and intensity of displays. Photodiodes, an oscillo-
scope, and a video splitter are used to measure the persistence and delay of displayed
images.
These properties of displays and how to evaluate them will be discussed in the next
sections.
Display 5.1
VRAM stands for video random access memory, which is available on the graphics
card, where models and textures may be stored leading up to display.
The refresh rate, as described earlier, is the rate at which contents of the display buffer
are refreshed or shown on the display device. This sets the fastest rate at which new
images can be presented to an observer. The duration of an image display is the time to
complete an integer number of refresh counts.
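Because display durations are quantized by the refresh, a requested duration is realized as a whole number of frames. A sketch of the arithmetic (the variable names are illustrative):

```matlab
refreshHz = 60;                     % nominal refresh rate of the display
frameDur  = 1 / refreshHz;          % about 16.67 ms per frame

wantedDur = 0.050;                  % requested 50-ms presentation
nFrames   = round(wantedDur / frameDur);   % realized as 3 frames
actualDur = nFrames * frameDur;            % 50.0 ms at exactly 60 Hz
```

If the measured refresh rate differs slightly from nominal (e.g., 60.22 Hz, as in display 5.1), the same whole number of frames yields a slightly different actual duration, which is why timing calculations should use the measured rate.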
Many experimental setups have several displays. For example, one display might control
MATLAB10 and the display of messages or information to the experimenter, while another
is used as the testing display, which presents images to an observer. ScreenTest tests
and generates a report for each visual display device.
Display 5.1 shows the output of ScreenTest for one of the screens on one of our
systems. The ScreenTest report begins with the version of MATLAB and other licensing
information (omitted here). It then reports the specific manufacturer and version of the
OpenGL-Renderer associated with the graphics card. ScreenTest reports 256 MB of
VRAM, of which 237 MB is usable as texture memory.
The other information concerns the refresh rate and timing of the display. ScreenTest
measures the refresh rate multiple times and reports the mean and standard deviation. In
this example, the nominal refresh rate for Screen 0 was 60 Hz, which corresponds to about
16.67 ms per frame (1000 ms/60 refreshes per second). ScreenTest reports two empirical
measurements of the refresh rate by slightly different methods: 60.222498 Hz and
16.605090 ms per frame and 60.221456 Hz and 16.605377 ms per frame. One is based
Figure 5.1
Illustration of refresh intervals and vertical blanking in visual displays.
on the vertical blanking interval, and the other is based on the beam position (figure 5.1).
Both are within measurement tolerance of the nominal rate. The definitions of horizontal
and vertical blanking intervals are explained next.
Most of the terminology and the standards for video and television were developed
around the physical constraints of the CRT (figure 5.1). Originally, an electron beam
activated a phosphor on the surface of the CRT in a conventional order, from top left to
bottom right in rows.11 Very brief horizontal blanking intervals allow time for the beam
to shift from the end of one row to the beginning of the next, and a longer vertical blank-
ing interval12 (VBI) allows the beam to reset back to the top left of each display. The image
on the screen is unchanged during vertical blanking intervals. The beam position refers to
the location of the electronic beam in this pattern of display, which is associated with a
rather accurate measure of the time during this display interval. In early applications
without buffered graphics memory, the VBI might be 15% of the total refresh interval in
order to allow new information to be transferred into the frame buffer. For example, of the
16.67 ms per frame at 60 Hz, the VBI might be about 2.5 ms. With modern technology,
the physical requirement for a long VBI is no longer relevant, but the conventions are
retained for compatibility with television and other video standards, and the VBI is now used
to send other kinds of data (e.g., closed captions in television) or to carry out graphics
computations.
Psychtoolbox Version 3 uses a double buffering system with an onscreen display buffer
and an offscreen back buffer.8 Switching the pointer to the next display as the two are
flipped is almost instantaneous, usually taking less than a microsecond. Psychtoolbox
can track both the beam position and the flag (VBLsync) that marks the beginning of the
VBI, and it reports measurements of the refresh rate based on both of these. The screen
being tested in display 5.1 was set for a spatial resolution of 1280 × 800, so reporting that
the start and end line of the VBI occurred at line 800 indicates that it occurred before
starting the next display.
ScreenTest is provided by Psychtoolbox to assist in evaluating visual displays;
however, there are many aspects of visual displays that ScreenTest does not handle. In
this chapter, we discuss many other calibration and evaluation procedures and implement
them in a series of programs. These programs are relatively straightforward, and so we
provide fewer comments. They serve as additional sample programs that may be helpful
for students of Psychtoolbox.
Display 5.2
intended shape.13 For example, a square may appear as a rectangle or a trapezoid or may
show some other warping of the shape. One simple way to test this is to display a large
square of known pixel size on the display device and use a good ruler to measure the hori-
zontal and vertical size of the square (see display 5.3 for the program). Many display
devices have parameters that can be adjusted to determine the overall size and shape
(aspect ratio and straight-line properties) of the display. Geometric distortion could be
measured in multiple regions of the display, or at least in the regions of the display that
will be used in the experiment.
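The ruler test above reduces to a few ratios; a minimal sketch, with hypothetical ruler readings (the 256-pixel square size matches display 5.3):

```python
# Sketch of the geometric-distortion check: display a square of known pixel
# size, measure its physical width and height with a ruler, and derive the
# pixel pitch and aspect-ratio error. The ruler readings are hypothetical.
def distortion_check(size_px, width_mm, height_mm):
    px_w = width_mm / size_px                  # horizontal pixel pitch (mm/pixel)
    px_h = height_mm / size_px                 # vertical pixel pitch (mm/pixel)
    aspect_error = width_mm / height_mm - 1.0  # 0 for a perfect square
    return px_w, px_h, aspect_error

px_w, px_h, err = distortion_check(256, 72.0, 70.5)  # hypothetical readings
```

Repeating the measurement in several regions of the screen indicates whether distortion varies across the display surface.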
Display 5.3
%% Experimental Module
% Specify the stimulus
p.stimSize = 256; % stimulus size in pixels
Display 5.4
%% Experimental Module
Display 5.5
vbl = Screen('Flip', windowPtr);
%% Experimental Module
Figure 5.3
Circuit diagram for testing temporal properties of visual displays (components: monitor, video splitter, photodiode, trigger, oscilloscope).
[Axes: voltage (V) versus time (s).]
Figure 5.4
Temporal response functions of (a) a CRT display and (b) an LCD as measured by an oscilloscope. Responses
to 1, 2, 3, and 6 consecutive refreshes are shown. For the CRT, the trace of each refresh is the same in all condi-
tions. For the LCD, the traces of 2, 3, and 6 refreshes (solid curves) are different from the predictions of the
combinations of multiple single refreshes (dotted curves); multiple refreshes smooth out the response
functions.
tion, including the rising from and return back to baseline, within the interval of the frame
rate. The shape should be consistent across repetitions.
The purpose of the video splitter in the setup in figure 5.3 is to provide a simultaneous
signal to the monitor, which will be measured by the photodiode as one input to the oscil-
loscope, and a direct signal to the oscilloscope in order to assess whether the image is
actually presented at the programmed time. Assessing a delay requires comparing the
actual onset of the image patch with the onset of the video signal. Another
method is to use the video signal from the graphics card via a special-purpose device, such
as a video switch device,16 to trigger the oscilloscope and measure the onset of the pho-
todiode response relative to the trigger.
If the video signal of the graphics card is not directly available, it is possible to use the
onset of a pattern on a CRT displaywhich generally has extremely small delaysto
calibrate the delays of other non-CRT devices. The output of the graphics card run through
a video splitter may be fed to the CRT and, for example, an LCD. A photodiode measuring
the CRT triggers the oscilloscope, and the photodiode measurement of the LCD evaluates
the time course in relation to that onset.
Figure 5.5
Image tearing due to synchronization failure. A single left-to-right moving bar that spans the entire height of
the screen is broken due to synchronization failure.
Most current experimental test setups for visual psychophysics use raster displays, such
as the CRT, which paint light in a series of rows. Historically, visual psychophysics used
Display 5.6
%% Experimental Module
offset = 0;
tEnd = vbl + 30;
other display devices to present stimuli, such as tachistoscopes, projectors with shutters
and optics, or point-plot or vector-plot displays. Tachistoscopes or slide projectors present
either printed materials or slides for controlled times by creating a brief illumination or
by briefly opening a shutter. Point-plot or vector-plot display devices activate phosphors
only in specific screen locations determined either as a set of individual illuminated points
or as a sequence of points along a vector, leaving all other locations inactive. A raster scan
display device, as described earlier, is based on television technology, in which an electron
beam is moved across the screen, painting pixels from left to right and from top to bottom,
one row at a time. Raster displays use either a non-interlaced mode in which each row is
painted in turn (see section 5.1.1) or an interlaced mode in which first the odd rows are
painted as one image frame followed by even rows in a separate image frame. The
interlaced mode is still used in some broadcast television. Interlacing was designed to
compensate for earlier devices with slow transfer of display information and to mitigate
the appearance of an image building up over an extended time. Display devices currently
in use, including most monitors, projectors, and goggles, are raster devices using a
non-interlaced mode that paints all rows in order.
Display 5.7
%% Experimental Module
In this section, we present a brief review of the current state of each display technology
and the advantages and disadvantages of each. Attributes of each display type are sum-
marized in table 5.1, followed by a section on each. This table deals with most of the major
display technologies used in the laboratory today. We provide this section for individuals
who may be setting up a psychophysics laboratory or choosing a new display technology
for an existing laboratory. Other readers may choose to skip to section 5.3 to address
general issues of calibration and testing.
screen. The chemical compositions of the phosphors painted on the fluorescent screen
determine the brightness, color, and the temporal response of the pixels. Color CRTs use
three different phosphors packed together around each pixel that correspond to red,
green, and blue. Painting an RGB triplet in each pixel with three separate but tempo-
rally synchronized electron guns portrays different colors.
CRTs provide good quality displays because they have relatively high numbers of pixels
and a range of native (built-in programmable) pixel resolutions, relatively rapid response
times, and good control and quality of colors and color balance. They also are less
sensitive to off-axis viewing angles so that the screen can be viewed from a range of side
angles and still be seen with little loss of contrast. CRTs are analog devices and often have
some issues with homogeneity. Brightness is adequate, and contrast is usually very high.
Most phosphors have excellent temporal responses with rapid rise times and relatively
short persistence. Some, however, do not. Geometric distortions in flat-panel CRTs can be
adjusted to be minimal. Pixel independence can be achieved with special-purpose, high-
bandwidth monitors.
A liquid crystal display15 (LCD) is a display based on the light-modulating properties
of liquid crystals. Liquid crystals do not emit light directly. LCDs consist of pixels filled
with liquid crystals in front of a light source or light reflector, called the backlight; the
current state of the crystals modulates or filters the light available to the viewer to create
either color or monochrome images.
At the current time, there are four major versions of LCD technology that differ in the
details of arrangement of the liquid crystals. These are the twisted nematic (TN), in-plane
switching (IPS), multi-domain vertical alignment (MVA), and pattern vertical alignment
(PVA) technologies. LCDs are lightweight, energy efficient, and are now the dominant
display method in computers. However, as devices for visual psychophysics, they have
many limitations:
1. The pixel response time for many LCD displays is more than 20 ms. In comparison,
the refresh period for a 60-Hz monitor is 16.7 ms. In rapid, dynamic displays, the long
pixel response time manifests as low-contrast ghosting, in which a previous image appears
as a low-contrast but visible component of the current image.
2. To make things worse, the pixel response time depends on the brightness of the pixel,
so the pixel response time is not a constant, but differs depending upon the image and the
brightness/contrast settings of the monitor.
3. Another problem for LCD displays in visual testing is the dependence on viewing angle.
Pictures look different when seen from different viewing angles, whereas there is relatively
little sensitivity to viewing angle in CRTs. Although this problem has been moderated in
more recent LCD displays, it may still be significant for purposes of visual testing.
4. Because the LCD matrix is a passive device that only modulates the output of a back-
light unit, and because the LCD matrix is not fully opaque, some light is seen even in a
black state, showing not a deep black color, but only dark gray. For this reason, LCDs
often have a limited contrast ratio of 200–300 to 1.
5. Contrary to the claims of some manufacturers, many LCDs are limited in the number
of colors they can depict by a 6-bit resolution in each of three color channels.
6. In many cases, the backlight is not strictly uniform across the display, and so neither
is the LCD modulated light.
On the positive side, LCD displays have excellent geometric properties and good pixel
independence. For this reason, LCDs may be a good option in studies with static or slowly
varying stimuli in which the details of the brightness and contrast are unimportant. These
conditions may hold in many cognitive or memory experiments, for example. Additionally,
special-purpose LCD technology for visual psychophysics is under development.17
Light emitting diode18 (LED) displays are essentially LCDs that use semiconductor light
sources in backlighting rather than the fluorescent backlights more often used in LCDs.
The LCD technology is used in both to filter the light that appears on the display panel.
Although sharing many of the disadvantages of the conventional LCD, the LED backlight
improves spatial uniformity across the display surface. Certain LED displays also can reduce the
backlighting source in dark regions to achieve a darker black, and so increase the achiev-
able contrast ratio. If RGB LEDs are used, then this may also improve the representation
of color relative to the standard LCD.
Another emerging display contender is the organic light emitting diode19 (OLED)
display. It uses organic carbon-based compounds that emit red, green, and blue light
when stimulated by electric current, and so emits light rather than filtering a backlight. It
uses much less power and can be made extremely thin. It has a rapid pixel response time,
improves color representation over the LCD, and achieves a high contrast ratio. The
current technology has several issues. It has limited pixel resolution per inch, is expen-
sive, and has a limited life expectancy. Although not yet ready for general use, the OLED
technology is interesting and may become a contender for visual psychophysics in the
future.
the DLP is populated with millions of hinge-mounted microscopic mirrors that control
the amount of the light emitted for each pixel. The amount of light is controlled by
switching each mirror on (toward the projection surface) or off extremely rapidly, and
the relative amount of on to off in a rapid time cycle controls the gray scale up to 1024
gray levels.
Filtering white light through red, green, and blue filters operating in rapid sequence
creates image colors. The single-chip DLP projection systems create many millions of
colors by mixing different proportions of on and off mirror states of this colored light
rapidly alternated over time. In a single-chip DLP, only one color filter operates at any
given instant, so the components of the RGB image cannot occur strictly simultaneously.
Another issue with the single-chip DLP technology is that the synchronization of the color
filters and the mirrors can cause visible color separation due to slips in synchronization.
For example, when a yellow light is programmed, the R and G components may become
visible separately (see section 5.1.7).
A three-chip DLP uses three independent DLP chips to reflect light from red, green,
and blue light sources to achieve good synchronization and produce about 12 bits per
component, leading to excellent depiction of more than 35 trillion colors. However, the
three-chip DLP projection systems are quite expensive, and so are only likely to be used
in commercial applications such as large theaters or in specialized laboratories.
The contrast ratio of a DLP can be as high as 2000 to 1. It has excellent luminance
homogeneity. Color depth or resolution is excellent, especially with three-chip DLPs.
Geometric distortion and pixel independence are excellent. Temporal resolution is
excellent in three-chip DLP, but may be an issue in single-chip DLP projection
systems.
combines the two monocular images to generate the perception of 3D depth. Displaying
different images to the two eyes is done in four major ways: (1) using goggles to
present different images to the two eyes, (2) using optics (e.g., prism glasses, stereo-
scopes) to present different images to the two eyes, (3) using filters (e.g., complemen-
tary color anaglyphs, polarized glasses) or shutter-glasses to present different images to
the two eyes, and (4) generating images for the two eyes on the same display device
in different locations and displaying them to different eyes through a physical projection
scheme.
The diagram in figure 5.6 shows a four-mirror stereoscope,21 a setup often used in the
laboratory. The images for the left and right eyes are presented spatially separated on the
display device and delivered to each eye through half-silvered, front-coated 45° mirrors,
where they are fused in the percept of the observer. The distance between the centers of
the two inner mirrors should match the distance between the two eyes. One way to check
this setup is to create a device with two parallel laser pointers that are separated by the
same distance as the eyes and that reverse the optical path. When the two pointers are
placed in the stereoscope in the position of the eyes, the points projected on the display
screen should be centered in the image for each eye.
[Figure: the display at distance D presents an image for the left eye and an image for the right eye; four mirrors route them to the left and right eyes (LE, RE).]
Figure 5.6
A four-mirror stereoscope for displaying separate images to the two eyes.
In addition to considering all the display properties relevant for 2D displays, one major
concern for 3D display technologies based on filtering and directional lights is ghosting.
In this context, ghosting reflects blending of the two monocular images and so failure to
segregate the images to the two eyes. The presence of ghosting may be important in many
experiments and should be evaluated.
The basic input to the visual system is light. All of our knowledge about the properties of
the visual system is predicated on careful measurements of the visual response to different
patterns of light. It is therefore critical for psychophysical experiments to represent accu-
rately the patterns of light in the stimulus. This requires the measurement of light and color
in our display devices.
First, we need to know how to measure light and color. Then, systematic measurements
of display devices generate mathematical descriptions of the display through proper cali-
bration. It is then possible to represent faithfully any new visual stimuli based on the known
mathematical properties of the display rather than on individual physical measurements
of every stimulus.
We focus on the issues and methods of calibration for monitors and projection
devices.
Figure 5.7
A Gamma function with Lmin = 1 cd/m², Lmax = 53 cd/m², and γ = 1.9. [Axes: luminance (cd/m²) versus gray level.]
where Lmin, Lmax, and γ are determined either by using photometers or by a combination
of psychophysical procedures and photometric measurements.22,23 The constant γ is called
the Gamma constant.
Display 5.8 provides a program that allows you to set the RGB value for a displayed
square at a starting value, measure the luminance with the photometer, and then increase
the value of one or more guns and measure again. The results of many such measure-
ments are fit with equation 5.1 to obtain the Gamma characteristic γ of the display. This program
for calibration includes calls to functions that set up the display using PsychImaging.
These calls were treated in the discussion about and comments for display 4.1 in
chapter 4.
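The photometric fit can be sketched briefly. This assumes equation 5.1 has the standard form L(U) = Lmin + (Lmax − Lmin)U^γ, consistent with figure 5.7 and equation 5.2; the "measurements" below are simulated rather than real photometer readings:

```python
# Hedged sketch of the Gamma fit of equation 5.1, assumed to be of the form
# L(U) = Lmin + (Lmax - Lmin) * U**gamma (consistent with figure 5.7).
def luminance(U, Lmin, Lmax, gamma):
    return Lmin + (Lmax - Lmin) * U ** gamma

def fit_gamma(samples, Lmin, Lmax):
    """Least-squares grid search for gamma over photometer samples (U, L)."""
    best_g, best_err = None, float("inf")
    for i in range(100, 400):           # gamma candidates 1.00 .. 3.99
        g = i / 100.0
        err = sum((luminance(U, Lmin, Lmax, g) - L) ** 2 for U, L in samples)
        if err < best_err:
            best_g, best_err = g, err
    return best_g

# Simulated photometer readings from a display with gamma = 1.9 (figure 5.7).
data = [(U / 10.0, luminance(U / 10.0, 1.0, 53.0, 1.9)) for U in range(1, 11)]
```

In practice a continuous optimizer (as in display 5.9's use of fminsearch) replaces the grid search; the grid keeps the sketch self-contained.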
The human eye is even more sensitive than the photometer in discriminating lumi-
nance, so photometric calibration can be augmented with calibration by eye. We use a
bisection matching procedure to match successively different proportional values of
output luminance (figure 5.8). The eye combines equal mixtures of maximum and
minimum luminance pixels in one mixed intensity patch, through temporal and spatial
Display 5.8
if showHelp
    DrawFormattedText(windowPtr, hStr, 4, 64, ...
        texColor, [], 0, 0, 1.5);
end
Screen('Flip', windowPtr);
blur. This physically defines the halfway point in luminance output. The experimenter
then measures which single programmed grayscale level in another patch visually matches
this mixture. This halfway point can then in turn be bisected above and below, and so on,
until one has measured the function relating programmed intensity to actual output lumi-
nance from minimum to maximum. These data are fit with the function in equation 5.1 to
describe the relationship. Display 5.9 shows the corresponding visual Gamma calibration
program.
Most visual experiments in fact define stimuli in terms of contrast on a background
with luminance L0 halfway between Lmin and Lmax, rather than by luminance values. The
following equations define the relationship between the contrast c(U) and gray level U (0
to 1) as
c(U) = \frac{L(U) - L_0}{L_0} = \frac{L_{\max} - L_{\min}}{L_{\max} + L_{\min}} (2U^{\gamma} - 1). \quad (5.2)
This equation can be inverted to solve for the programmed grayscale level for a desired
contrast:
U = \left[ \frac{1}{2}\left( \frac{c\,(L_{\max} + L_{\min})}{L_{\max} - L_{\min}} + 1 \right) \right]^{1/\gamma}. \quad (5.3)
Although U can take a continuous value from 0 to 1 in the equation, only a subset of
values can actually be displayed because of the color resolution of the display device. The
achievable values of U for each device are quantized. For example, for an 8-bit display, only
256 values between 0 and 1 are possible. U must be rounded to the nearest achievable value.
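Equations 5.2 and 5.3, together with the quantization step, can be sketched as follows. This assumes equation 5.1 has the standard form L(U) = Lmin + (Lmax − Lmin)U^γ; the Lmin, Lmax, and γ values are the figure 5.7 example, not measurements:

```python
# Sketch of equations 5.2 and 5.3 with 8-bit quantization. Assumes the
# standard Gamma form L(U) = Lmin + (Lmax - Lmin) * U**gamma, so that
# c(U) = ((Lmax - Lmin) / (Lmax + Lmin)) * (2 * U**gamma - 1).
def gray_for_contrast(c, Lmin, Lmax, gamma, bits=8):
    ratio = (Lmax + Lmin) / (Lmax - Lmin)
    U = ((c * ratio + 1.0) / 2.0) ** (1.0 / gamma)   # equation 5.3
    levels = 2 ** bits - 1
    return round(U * levels) / levels                # nearest displayable value

def contrast_for_gray(U, Lmin, Lmax, gamma):
    return (Lmax - Lmin) / (Lmax + Lmin) * (2 * U ** gamma - 1)   # equation 5.2

U = gray_for_contrast(0.0, 1.0, 53.0, 1.9)   # gray level for the mid-gray background
```

The residual contrast after rounding illustrates the quantization error discussed in the text.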
The Gamma correction parameter in equation 5.3 is a characteristic of a particular
display device that should be incorporated into most experimental running programs to
linearize or correct the output luminance. Failure to calibrate a display device for the
Gamma correction can lead to distortions in the luminance profile of stimuli; for
example, an incorrect γ could alter an intended sine-wave profile into something closer
to a square-wave luminance pattern. The γ of a display device is likely to change when
device settings are changed. It also can change with the aging of the device. The display
device should be calibrated periodically, and in some very demanding experiments
should be calibrated very frequently. Some display devices may need to be turned on
for a period of stabilization, usually a few minutes, prior to calibration or experimental
running.
The human visual system is highly sensitive. The human eye can sometimes be sensitive
to contrasts on the order of 0.1%, or 0.001.24–26 To display a sine wave with a peak
contrast of 0.1% requires a gray-level resolution of at least 0.02%, or 1 part in 5000. This
Display 5.9
% Instruction 2
str = ['Use Left/right arrow keys for coarse adjustments,\n' ...
    'and Up/down keys for fine adjustments.\n' ...
    'Press Enter key when you are done'];
DrawFormattedText(windowPtr, str, 100, 200, 255, [], 0, 0, ...
    1.5);
Screen('Flip', windowPtr, 0, 1);
%% Experimental Module
% Specify the stimulus
sz = 128; % size of the middle patch in pixels
high = vol(hI(i));
low = vol(lI(i));
ind = (hI(i) + lI(i)) / 2;
img(:) = high; img(2 : 2 : sz, :) = low;
tex(1) = Screen('MakeTexture', windowPtr, img, 0, 0, 2);
tex(2) = Screen('MakeTexture', windowPtr, ...
    img([2 : sz 1], :), 0, 0, 2);
mid = vol(ind); % start with guessed value
while 1
KbReleaseWait; % avoid continuous change
while 1
for j = 1 : 2 % generate flickering flankers
Screen('DrawTexture', windowPtr, tex(j));
Screen('FillRect', windowPtr, mid, rect);
Screen('Flip', windowPtr, 0, 1);
end
key = ReadKey(keys);
if isempty(key), break; end
end
switch key
case 'left', mid = mid - step(2);
case 'right', mid = mid + step(2);
case 'up', mid = mid + step(1);
case 'down', mid = mid - step(1);
case 'esc', sca; error('ESC pressed.');
otherwise % return or enter
if i == 1
guessGamma = log(0.5) / log(mid);
vol = lum .^ (1 / guessGamma);
else
vol(ind) = mid;
end
Beeper;
break; % go to next step
end
% Fit Gamma
costfunc = @(x) sum((lum .^ (1 / x) - vol) .^ 2);
gamma = fminsearch(costfunc, guessGamma);
rsq = 1 - costfunc(gamma) / var(vol, 1) / 17;
% plot the result
figure(3);
x = linspace(0, 1, 256);
plot(x, x .^ gamma, 'r'); hold on;
plot(vol, lum, 'o'); hold off;
xlabel('Normalized Output');
ylabel('Normalized Luminance');
text(0.2, 0.9, sprintf('RGB = [%g %g %g]', rgb));
text(0.2, 0.8, sprintf('Gamma = %.3g', gamma));
text(0.2, 0.7, sprintf('Rsq = %.4g', rsq));
Figure 5.8
Calibration by eye. (a) In the bisection procedure, the single luminance value of M is adjusted to match that of
the physical mixture of luminance values L and H. (b) The resulting matching values and Gamma fit for a CRT
display. Note that the axes are flipped relative to figure 5.7 because here gray level U is the measured
variable.
corresponds to about 12.4 bits of gray-level resolution. However, typical computer graph-
ics cards provide only 8-bit intensity resolution per channel. Certain graphics cards provide
10-bit intensity resolution per channel. Many experiments in visual psychophysics require
higher-resolution control of gray or color levels. Several different approaches have been
developed to achieve high gray-level resolution in display systems. We review a number
of methods and issues in some detail.
For analog display devices such as the CRT, the limiting factor is the bit resolution
(pixel depth) of the digital to analog conversion of the video card. A higher-resolution card
or a combination of several channels of a lower-resolution card can be used to deliver a
net higher resolution to the display device.
For digital display devices such as LCD monitors, achieving high gray-level resolution
requires both a high-resolution video card and a high gray-level resolution display, such
as a 10- or 12-bit LCD display.
Here, we focus on the input to the display device. Display 5.10 provides a program that
uses a number of options to show a luminance ramp that uses high gray-level resolution.
It displays a series of rectangles that step up the luminance from the lowest to the highest
programmed value in fine steps. The function PsychImaging is used to set up and enable
display options for testing. Some display options are described in more detail in the fol-
lowing sections. The sample program uses the native capability of the graphics device,
such as a 10-bit graphics display. A number of other software and hardware options may
be used to improve resolution. We provide syntax in the sample program for these different
approaches to high grayscale resolution in the lines that are commented out. Only one
option at a time can be implemented in a running program.
Display 5.10
%%% HighResolutionLuminance.m
function HighResolutionLuminance(rgb)
Screen('Flip', windowPtr);
%% Experimental Module
if strcmp(key, 'esc')
break;
else
highBit = 1 - highBit; % toggle between the 32-
% and 8-bit textures
KbReleaseWait;
end
end
When R1 >> R2, the attenuated R signal provides the fine resolution. This video attenu-
ator is capable of generating video signals with up to 16 bits of gray-level resolution,
satisfying the needs of most visual psychophysical experiments.
One issue with this attenuator is that high-quality monochrome CRTs are very hard to
find and very expensive; color CRTs are easier to find and less expensive. For this reason,
a video switcher that modifies the outputs of conventional computer graphics cards to
generate high luminance resolution monochromatic displays on color monitors was devel-
oped.16,30 The design incorporates an attenuator to generate a single-channel high-
resolution video signal and then uses amplifiers to duplicate the same signal to drive the
three RGB channels of color monitors. This special device design also includes a pushbut-
ton to switch between a normal color mode and a high gray-level resolution monochrome
mode as well as a trigger output that can be used to synchronize other equipment (e.g.,
fMRI recording) to the video signal.
The purpose of color calibration is to specify the color of a stimulus in a color space. All
color spaces are intrinsically multidimensional. The two major color spaces are the Inter-
national Color Standards (Commission Internationale de l'Éclairage; CIE) space and the
Derrington–Krauskopf–Lennie (DKL) space.33 The CIE space approximates the excitation
in cone types: long, medium, and short wave, or red, green, and blue sensitive. The DKL
space represents color opponent systems that code red–green, yellow–blue, and luminance
(black–white) responses. DKL is used in many experiments that study isoluminant color
perception because it separates color variations from luminance variations.
Colors as presented on display devices are programmed values of the intensities in RGB
channels. In this section, we illustrate how to convert programmed RGB values in com-
puter memory into the corresponding descriptions of the stimulus in CIE or DKL spaces;
or, conversely, how to translate a CIE or DKL specification of a desired color stimulus
into programmable RGB values on a display device. The diagram in figure 5.9 summarizes
the steps and equations that are explained in the next several sections.
[Figure: the graphics card outputs the background triplet G0 = (.5, .5, .5) and the stimulus triplet G = (r, g, b); these determine the background luminance and the luminance of the three guns, e.g., L0R = LR(min) + [LR(max) − LR(min)] (0.5)^γR.]
Figure 5.9
Diagram of the components, steps, and equations that convert RGB values from a graphics card into expected
cone excitations and coordinates in DKL color space.
When the CIE standards were established, spectral sensitivities of the cone photorecep-
tors in the eye were not known independently but were inferred from behavioral study of
perceived colors. Color matching experiments showed that a light of any pure wavelength
could be perceptually matched by a weighted linear combination of three wavelengths
(700.0 nm, 546.1 nm, and 435.8 nm), mixtures of three primary colored lights.34
Color matching functions graph the weights on each of the three primary colors required
for a perceived color match as a function of the pure wavelength to be matched. These
functions are analogous to the spectral sensitivity of the three cone types to different
wavelengths. The resulting functions for a standard observer are x̄(λ), ȳ(λ), and z̄(λ),
which represent the weights of the three primary colors. These functions are one descrip-
tion of the sensitivity of the observer to different wavelengths of light; the functions depend
on the choice of the three primary light colors. The wavelengths of the three primary colors
are approximately equal to the wavelengths that generate the maximum responses from
the three cone types in the human eye.
Every physical stimulus has a spectral radiance distribution W(λ) that determines the
perceived color. Because the percept of any single wavelength stimulus can be matched
by a linear combination of three selected wavelengths, any distribution of wavelengths can
also be matched. The CIE space assigns each stimulus to an (X, Y, Z) value. It does this
by weighting the different wavelengths in the stimulus by the color matching functions
[x̄(λ), ȳ(λ), and z̄(λ)] of a standard CIE observer and correcting by a constant Km called
the maximum photopic luminance efficacy.
The equations are
X = K_m \int W(\lambda)\,\bar{x}(\lambda)\,d\lambda
Y = K_m \int W(\lambda)\,\bar{y}(\lambda)\,d\lambda \quad (5.5)
Z = K_m \int W(\lambda)\,\bar{z}(\lambda)\,d\lambda.
So, if the color matching functions of the CIE observer, the constant, and the spectral
distribution of the image are known, you can compute the color of the stimulus in (X, Y,
Z) space. Images with different spectral radiance distributions but the same tri-stimulus
descriptions are perceptually color matched for the standard CIE observer.
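For sampled spectra, the integrals of equation 5.5 become wavelength sums. A minimal sketch follows; the three-sample spectrum and matching-function values are made-up toy numbers (only Km = 683 lm/W is a standard value), so the point is the computation, not the data:

```python
# Discrete version of equation 5.5: X, Y, Z as wavelength sums of the spectral
# radiance W(lambda) weighted by the color matching functions. The sample
# values below are made-up toy numbers, not real CIE tables.
def tristimulus(W, xbar, ybar, zbar, d_lambda, Km=683.0):
    X = Km * sum(w * x for w, x in zip(W, xbar)) * d_lambda
    Y = Km * sum(w * y for w, y in zip(W, ybar)) * d_lambda
    Z = Km * sum(w * z for w, z in zip(W, zbar)) * d_lambda
    return X, Y, Z

# Toy 3-sample spectrum and matching functions on a 10-nm spacing.
W    = [0.001, 0.002, 0.001]
xbar = [0.2, 0.4, 0.1]
ybar = [0.1, 0.5, 0.2]
zbar = [0.3, 0.1, 0.0]
X, Y, Z = tristimulus(W, xbar, ybar, zbar, 10.0)
```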
The XYZ system based on the CIE space combines hue and luminance information.
An alternative description of the stimulus transforms the XYZ into normalized values
with chromaticity coordinates x and y along with luminance L. Luminance is set as
equal to Y.
The transformation from XYZ space to normalized chromaticity space is
x = \frac{X}{X + Y + Z}
y = \frac{Y}{X + Y + Z} \quad (5.6)
z = \frac{Z}{X + Y + Z} = 1 - x - y.
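Equation 5.6 is a direct normalization; a minimal sketch:

```python
# Equation 5.6: normalized chromaticity coordinates from tristimulus values.
def chromaticity(X, Y, Z):
    s = X + Y + Z
    x, y = X / s, Y / s
    return x, y, 1.0 - x - y    # z is redundant: z = 1 - x - y

x, y, z = chromaticity(1.0, 1.0, 1.0)   # equal tristimulus values give (1/3, 1/3, 1/3)
```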
The values of LR, LG, and LB can be computed from the results of Gamma calibration
of each gun.
Conversely, to display a pixel with tri-stimulus values XYZ, the programmed luminance
of the red, green, and blue guns can be computed as:
\begin{pmatrix} L_R \\ L_G \\ L_B \end{pmatrix} =
\begin{pmatrix} x_R/y_R & x_G/y_G & x_B/y_B \\ 1 & 1 & 1 \\ z_R/y_R & z_G/y_G & z_B/y_B \end{pmatrix}^{-1}
\begin{pmatrix} X \\ Y \\ Z \end{pmatrix}. \quad (5.9)
The corresponding RGB triplet, G = (r, g, b), on the graphics card can be computed from
the Gamma calibration of each gun.
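Equation 5.9 is a 3 × 3 linear solve; a self-contained sketch using Cramer's rule follows. The gun chromaticities are hypothetical phosphor values, not measurements of a real monitor:

```python
# Sketch of equation 5.9: gun luminances (LR, LG, LB) that realize a target
# (X, Y, Z), given each gun's chromaticity (x_i, y_i), with z_i = 1 - x_i - y_i.
def solve3(A, b):
    """Solve a 3x3 linear system by Cramer's rule."""
    def det(M):
        return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
              - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
              + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
    d = det(A)
    out = []
    for j in range(3):
        Aj = [row[:] for row in A]
        for i in range(3):
            Aj[i][j] = b[i]
        out.append(det(Aj) / d)
    return out

def gun_luminances(XYZ, chroma):
    # chroma: [(xR, yR), (xG, yG), (xB, yB)]
    A = [[x / y for x, y in chroma],
         [1.0, 1.0, 1.0],
         [(1 - x - y) / y for x, y in chroma]]
    return solve3(A, XYZ)

# Hypothetical CRT-like primaries and a target stimulus.
chroma = [(0.62, 0.34), (0.29, 0.60), (0.15, 0.07)]
LR, LG, LB = gun_luminances([30.0, 40.0, 20.0], chroma)
```

The middle row of the matrix enforces LR + LG + LB = Y, since luminance is set equal to Y.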
(5.11)
Combining equations 5.8 and 5.10 allows us to use what we know about the chromaticity
and luminance of a display device and what we know about the responses of cones in the
visual system to compute expected cone excitations from luminances set for the three guns
of the display device:
\begin{pmatrix} P_L \\ P_M \\ P_S \end{pmatrix} =
\begin{pmatrix} 0.15516 & 0.54308 & -0.03287 \\ -0.15516 & 0.45692 & 0.03287 \\ 0 & 0 & 0.01608 \end{pmatrix}
\begin{pmatrix} x_R/y_R & x_G/y_G & x_B/y_B \\ 1 & 1 & 1 \\ z_R/y_R & z_G/y_G & z_B/y_B \end{pmatrix}
\begin{pmatrix} L_R \\ L_G \\ L_B \end{pmatrix}. \quad (5.12)
By inverting equation 5.12, we can specify the luminance values of each color gun to
achieve a given state of expected cone excitation:
\begin{pmatrix} L_R \\ L_G \\ L_B \end{pmatrix} =
\begin{pmatrix} x_R/y_R & x_G/y_G & x_B/y_B \\ 1 & 1 & 1 \\ z_R/y_R & z_G/y_G & z_B/y_B \end{pmatrix}^{-1}
\begin{pmatrix} 2.9448 & -3.5001 & 13.1745 \\ 1 & 1 & 0 \\ 0 & 0 & 62.1890 \end{pmatrix}
\begin{pmatrix} P_L \\ P_M \\ P_S \end{pmatrix}. \quad (5.13)
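The cone matrix in equation 5.12 (first row 0.15516, 0.54308, −0.03287, as printed) and its inverse in equation 5.13 (first row 2.9448, −3.5001, 13.1745, as printed) should multiply to the identity. The sketch below checks this; the second and third rows of both matrices are our completion under the L + M = Y convention, not values printed in the text:

```python
# Consistency sketch for equations 5.12 and 5.13: the cone-excitation matrix
# times its inverse should give the 3x3 identity. Rows 2-3 of both matrices
# are an assumed completion (L + M = Y convention); rows 1 are as printed.
T = [[ 0.15516, 0.54308, -0.03287],
     [-0.15516, 0.45692,  0.03287],
     [ 0.0,     0.0,      0.01608]]
Tinv = [[2.9448, -3.5001, 13.1745],
        [1.0,     1.0,     0.0],
        [0.0,     0.0,     62.1890]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

I = matmul(T, Tinv)   # approximately the 3x3 identity matrix
```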
Often, the most important aspect of color vision is not the absolute color but instead
color contrast relative to the background color. For this reason, color stimuli may also be
described in terms of expected cone contrasts:
C_L = \frac{\Delta P_L}{P_{L0}}, \quad C_M = \frac{\Delta P_M}{P_{M0}}, \quad C_S = \frac{\Delta P_S}{P_{S0}}. \quad (5.14)
For a given background on a display device specified as G0 = (r0, g0, b0) and characterized
by L0R, L0G, and L0B, expected cone excitations PL0, PM0, and PS0 can be computed
from equation 5.12. One can compute luminance increments or decrements of the three
different color channels by the following equation:
\begin{pmatrix} \Delta L_R \\ \Delta L_G \\ \Delta L_B \end{pmatrix} =
\begin{pmatrix} x_R/y_R & x_G/y_G & x_B/y_B \\ 1 & 1 & 1 \\ z_R/y_R & z_G/y_G & z_B/y_B \end{pmatrix}^{-1}
\begin{pmatrix} 2.9448 & -3.5001 & 13.1745 \\ 1 & 1 & 0 \\ 0 & 0 & 62.1890 \end{pmatrix}
\begin{pmatrix} \Delta P_L \\ \Delta P_M \\ \Delta P_S \end{pmatrix}. \quad (5.15)
Figure 5.11
The 3D DKL color space. The primary axes are L + M (luminance), L − M (red–green), and S − (L + M) (yellow–blue). Isoluminant stimuli vary within a plane in the space, varying [L − M, S − (L + M)] at a constant value of (L + M) luminance.
\begin{pmatrix} k_{Lum} R_{Lum} \\ k_{L-M} R_{L-M} \\ k_{S-Lum} R_{S-Lum} \end{pmatrix} =
\begin{pmatrix} 1 & 1 & 1 \\ 1 & -\frac{P_{L0}}{P_{M0}} & 0 \\ -1 & -1 & \frac{P_{L0} + P_{M0}}{P_{S0}} \end{pmatrix}
\begin{pmatrix} \Delta P_L \\ \Delta P_M \\ \Delta P_S \end{pmatrix}. \quad (5.16)
Conversely,

\begin{pmatrix} \Delta P_L \\ \Delta P_M \\ \Delta P_S \end{pmatrix} =
\begin{pmatrix} 1 & 1 & 1 \\ 1 & -\frac{P_{L0}}{P_{M0}} & 0 \\ -1 & -1 & \frac{P_{L0} + P_{M0}}{P_{S0}} \end{pmatrix}^{-1}
\begin{pmatrix} k_{Lum} R_{Lum} \\ k_{L-M} R_{L-M} \\ k_{S-Lum} R_{S-Lum} \end{pmatrix}. \quad (5.17)
The constants k_{Lum}, k_{L-M}, and k_{S-Lum} in equations 5.16 and 5.17 define or scale the units
of response for each dimension in the DKL space. One standard convention is to choose
the scaling constants such that, when (R_{Lum}, R_{L-M}, R_{S-Lum}) = (1, 0, 0), (0, 1, 0), and
(0, 0, 1), C = \sqrt{C_L^2 + C_M^2 + C_S^2} = 1.0.36
To summarize, color calibration of your display yields the chromaticity parameters for
that device. The expected cone excitations for a chosen background, PL0, PM0, and PS0 can be
computed from the device-specific chromaticity parameters. From the color calibration
constants and the expected cone excitations for a selected background, a system of equations
to estimate the scaling constants kLum , kL M, and kS Lum is generated by setting (RLum , RL M,
RS Lum ) = (1, 0, 0), (0, 1, 0), and (0, 0, 1). Every color can be specified in the DKL space.
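The scaling-constant procedure just summarized can be sketched numerically. In the Python/NumPy sketch below, the background cone excitations are hypothetical values chosen for illustration, and the mechanism matrix follows the form of equation 5.16.

```python
import numpy as np

# Hypothetical background cone excitations
PL0, PM0, PS0 = 10.0, 5.0, 1.0
P0 = np.array([PL0, PM0, PS0])

# Mechanism matrix of equation 5.16
M = np.array([[1.0,   1.0,                1.0],
              [1.0,  -PL0 / PM0,          0.0],
              [-1.0, -1.0, (PL0 + PM0) / PS0]])

# Estimate each scaling constant by setting (R_Lum, R_L-M, R_S-Lum) to a
# unit vector and requiring the cone-contrast length C to equal 1.0
k = np.empty(3)
for i in range(3):
    dP = np.linalg.solve(M, np.eye(3)[i])  # cone increments for unit response
    k[i] = 1.0 / np.linalg.norm(dP / P0)   # scale so contrast length is 1

# Verify: with k applied, unit DKL responses give unit cone-contrast vectors
for i in range(3):
    dP = np.linalg.solve(M, k[i] * np.eye(3)[i])
    assert np.isclose(np.linalg.norm(dP / P0), 1.0)
```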
Isoluminant stimuli, important in many studies of color vision, vary only in color without
contamination from luminance signals. Calibration in DKL space provides candidate
stimuli for isoluminance. The DKL specification of a stimulus depends upon calibration by a photometer using the filter settings of a standard observer. However, an isoluminant stimulus varying solely in the color dimensions of DKL space is only nominally isoluminant based on photometric measurements and may not be perceptually isoluminant.
Random variation in responses at every level of the visual system means that what is
isoluminant for one neuron may not be perfectly isoluminant for other neurons. Isolumi-
nance may also depend on characteristics of the pattern, the size, and the location of the
stimulus. The unintended luminance signals may help an observer solve a task and so
contaminate the study of pure color vision.
Special calibration based on human perceptual tests, sometimes called higher-order
calibration, is necessary to remove unintended luminance impurities from signal transmis-
sion in the visual system.42,43
Another area of vision research that requires functional, higher-order calibration is the
study of motion perception. According to some theories, human motion perception is
Display 5.11
z = 1 - x - y; % zR zG zB
Display 5.12
imshow(img);
set(gcf, 'color', background);
[Figure 5.12 panels: (a) a grid of alternating R and G patches shifting across frames; (b) probability of rightward motion (0 to 1.0) versus modulation amplitude of added luminance, with the minimum at amplitude m_c.]
Figure 5.12
The minimum motion procedure to generate functional color isoluminance. (a) A representation of alternating isoluminant red and green (R, G) patches that move rightward in successive frames (top to bottom). (b) A functional isoluminance test adds a luminance modulation in a particular phase (i.e., in phase with green) and searches for the modulation amplitude that minimizes the perception of rightward motion.
in a particular phase to the candidate isoluminant stimulus. A search is then made for the
m that produces minimum motion. This added m is assumed to cancel the contamination
component. Usually, only one phase is tested. In the minimum motion method, the two
motions, the possible contamination and the added canceling modulation, both move in
the same direction.
A more sensitive method to remove luminance contamination is based on sandwich displays in which the experimenter alternates the original image frames with amplifier frames. The procedure we discuss here is based on five-frame calibration displays. The odd frames are candidate isoluminant red–green sine-wave gratings with 180° phase shifts between them. The even amplifier frames are luminance sine-wave gratings, also with 180° phase shifts between them. The phase shift between successive frames is 90° (figure 5.13). There is no motion signal in the odd frames alone or the even frames alone. The perception of motion requires the combination of information in odd and even frames. No consistent motion would be perceived in a color–luminance sandwich display if the isoluminant red–green frames were truly isoluminant. If there were residual luminance contamination in the isoluminant frames so that, for example, the green areas were slightly less luminous than the red areas, motion would be seen in a consistent leftward direction based on luminance contamination in the color frames. If green were more luminous than red, motion would be seen in the rightward direction. The high-contrast luminance frames in the sandwich displays amplify the luminance contaminants in the nominally isoluminant frames. In the sandwich procedure, luminance is added to the red–green isoluminant stimuli to cancel luminance artifacts.
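The logic of the sandwich display can be illustrated with a small simulation. The Python/NumPy sketch below builds the luminance content of one four-frame cycle of the sandwich sequence (the fifth frame repeats the first phase) and measures directional energy with a 2D Fourier transform. The contamination amplitude `eps` is a made-up quantity for illustration: the directional signal vanishes at true isoluminance, its sign follows the sign of `eps`, and its magnitude grows with the amplifier contrast, which is the amplification principle.

```python
import numpy as np

def sandwich_frames(eps, amp=1.0, nx=256):
    """Luminance content of a four-frame sandwich cycle (phases 0, 90,
    180, 270 deg). Chromatic frames carry only the residual luminance
    contamination eps; amplifier frames carry luminance amplitude amp."""
    x = np.linspace(0, 2 * np.pi, nx, endpoint=False)
    frames = []
    for n in range(4):
        a = eps if n % 2 == 0 else amp
        frames.append(a * np.sin(x + n * np.pi / 2))
    return np.array(frames)

def directional_energy(frames):
    """Difference in Fourier energy between the two drift directions."""
    P = np.abs(np.fft.fft2(frames)) ** 2
    return P[1, 1] - P[-1, 1]

# No contamination: counterphase flicker only, no net motion signal
assert abs(directional_energy(sandwich_frames(0.0))) < 1e-5
# Contamination produces a consistent drift whose sign follows eps
assert directional_energy(sandwich_frames(0.02)) > 0
assert directional_energy(sandwich_frames(-0.02)) < 0
```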
[Figure 5.13 panels: (a) the five-frame sequence of alternating color (R/G) and luminance frames, with luminance frames at 2× threshold; (b) probability of rightward motion (0 to 1.0) versus modulation amplitude of added luminance, crossing 0.5 near the cancellation amplitude m_c.]
Figure 5.13
An illustration of the sandwich display for isoluminant color calibration. (a) Five frames consisting of odd frames
of candidate color isoluminant stimuli and even frames of luminance stimuli. (b) Psychometric function to cancel
motion artifacts.
Several examples of higher-order calibration methods for motion displays are described
in Lu and Sperling (2001).42 A program based on the sandwich method is shown in display
5.13 for interested readers.
5.7 Summary
This chapter was designed to be a road map to the evaluation, calibration, and selection of visual displays for the psychophysicist's laboratory. The capabilities and limitations of
many of the currently available display technologies were considered. Although new
technologies may be developed or the existing technologies may be improved, the same
goals and principles of calibration remain. We have provided simple methods to evaluate
the geometric distortions of displays and to calibrate monochromatic and color luminances
and temporal and persistence properties of the medium. We have explained methods to
describe visual stimuli in color spaces and how to translate between different stimulus
representations. Knowing exactly which physical stimuli are being presented to observers
is the first step in high-quality psychophysical investigation.
Display 5.13
%% Experimental Module
References
1. Pelli DG. 1997. Pixel independence: Measuring spatial interactions on a CRT display. Spat Vis 10(4): 443–446.
2. Brainard DH, Pelli DG, Robson T. Display characterization. In: Encyclopedia of imaging science and technology. Hoboken, NJ: Wiley; 2002.
3. Bach M, Meigen T, Strasburger H. 1997. Raster-scan cathode-ray tubes for vision research – limits of resolution in space, time and intensity, and some solutions. Spat Vis 10(4): 403–414.
4. Packer O, Diller LC, Verweij J, Lee BB, Pokorny J, Williams DR, Dacey DM, Brainard DH. 2001. Characterization and use of a digital light projector for vision research. Vision Res 41(4): 427–440.
5. García-Pérez MA, Peli E. 2001. Luminance artifacts of cathode-ray tube displays for vision research. Spat Vis 14(2): 201–215.
6. Compton K. 2001. Factors affecting cathode ray tube display performance. J Digit Imaging 14(2): 92–106.
7. Zele AJ, Vingrys AJ. 2005. Cathode-ray-tube monitor artefacts in neurophysiology. J Neurosci Methods 141(1): 1–7.
8. Psychtoolbox-3 [computer program]. Available at: http://psychtoolbox.org/.
9. Segal M, Akeley K. The OpenGL graphics system: A specification. Available at: www.opengl.org/registry/doc/glspec20.20041022.pdf.
10. The MathWorks Inc. MATLAB [computer program]. Natick, MA: MathWorks; 1998.
11. Tyson J, Carmack C. How computer monitors work. Available at: http://www.howstuffworks.com/monitor.htm.
12. Pelli DG. 1997. The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spat Vis 10(4): 437–442.
13. Samei E, Badano A, Chakraborty D, Compton K, Cornelius C, Corrigan K, et al. 2005. Assessment of display performance for medical imaging systems: Executive summary of AAPM TG18 report. Med Phys 32: 1205–1225.
14. Winterbottom MD, Geri GA, Morgan B, Pierce BJ. 2004. An integrated procedure for measuring the spatial and temporal resolution of visual displays. Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC); paper 1855, pp 1–8.
15. Cristaldi DJR, Pennisi S, Pulvirenti F. Liquid crystal display drivers: Techniques and circuits. Berlin: Springer-Verlag; 2009.
16. Li X, Lu ZL, Xu P, Jin J, Zhou Y. 2003. Generating high gray-level resolution monochrome displays with conventional computer graphics cards and color monitors. J Neurosci Methods 130(1): 9–18.
17. Wang P, Nikolić D. 2011. An LCD monitor with sufficiently precise timing for research in vision. Front Human Neurosci 5: 1–10.
18. Okuno Y. Light-emitting diode display: Google Patents, US patent 4298869; 1981.
19. Müllen K, Scherf U. Organic light emitting devices: Synthesis, properties and applications. Hoboken, NJ: Wiley; 2006.
20. Hornbeck LJ. 1997. Digital Light Processing™ for high brightness, high-resolution applications. Proc SPIE 3013: 27–40.
21. Wheatstone C. 1852. The Bakerian Lecture. Contributions to the physiology of vision. Part the second. On some remarkable, and hitherto unobserved, phenomena of binocular vision (continued). Philos Trans R Soc Lond 142: 1–17.
22. Lu ZL, Sperling G. 1999. Second-order reversed phi. Atten Percept Psychophys 61(6): 1075–1088.
23. Colombo E, Derrington A. 2001. Visual calibration of CRT monitors. Displays 22(3): 87–95.
24. Kelly D. 1979. Motion and vision. II. Stabilized spatio-temporal threshold surface. JOSA 69(10): 1340–1349.
25. Burr DC, Ross J. 1982. Contrast sensitivity at high velocities. Vision Res 22(4): 479–484.
26. Lu ZL, Sperling G. 1995. The functional architecture of human visual motion perception. Vision Res 35(19): 2697–2722.
27. Tyler CW. 1997. Colour bit-stealing to enhance the luminance resolution of digital displays on a single pixel basis. Spat Vis 10(4): 369–377.
28. Allard R, Faubert J. 2008. The noisy-bit method for digital displays: Converting a 256 luminance resolution into a continuous resolution. Behav Res Methods 40(3): 735–743.
29. Pelli DG, Zhang L. 1991. Accurate control of contrast on microcomputer displays. Vision Res 31(7–8): 1337–1350.
30. Li X, Lu ZL. 2012. Enabling high grayscale resolution displays and accurate response time measurements on conventional computers. J Vis Exp 60: e3312.
31. BITS++. Available at: http://www.crsltd.com/tools-for-vision-science/visual-stimulation/bits-sharp/.
32. vPixx. Available at: http://www.vpixx.com/.
33. Derrington AM, Krauskopf J, Lennie P. 1984. Chromatic mechanisms in lateral geniculate nucleus of macaque. J Physiol 357(1): 241–265.
34. CIE. Commission Internationale de l'Eclairage proceedings, 1931. Cambridge, UK: Cambridge University Press; 1932.
35. Kaiser PK, Boynton RM, Swanson WH. Human color vision, 2nd ed. Washington, DC: Optical Society of America; 1996.
36. Brainard D. 1996. Cone contrast and opponent modulation color spaces. Human Color Vision 2: 563–579.
37. Gregory RL. 1977. Vision with isoluminant colour contrast: 1. A projection technique and observations. Perception 6(1): 113–119.
38. D'Zmura M. 1991. Color in visual search. Vision Res 31(6): 951–966.
39. Sekiguchi N, Williams DR, Brainard DH. 1993. Efficiency in detection of isoluminant and isochromatic interference fringes. JOSA A 10(10): 2118–2133.
40. Lindsey DT, Teller DY. 1990. Motion at isoluminance: Discrimination/detection ratios for moving isoluminant gratings. Vision Res 30(11): 1751–1761.
41. Lu ZL, Lesmes LA, Sperling G. 1999. The mechanism of isoluminant chromatic motion perception. Proc Natl Acad Sci USA 96(14): 8289–8294.
42. Lu ZL, Sperling G. 2001. Sensitive calibration and measurement procedures based on the amplification principle in motion perception. Vision Res 41(18): 2355–2374.
43. Scott-Samuel NE, Georgeson MA. 1999. Does early non-linearity account for second-order motion? Vision Res 39(17): 2853–2865.
44. Lu ZL, Sperling G. 2001. Three-systems theory of human visual motion perception: Review and update. JOSA A 18(9): 2331–2370.
45. Anstis SM, Cavanagh P. A minimum motion technique for judging equiluminance. In: Mollon JD, Sharpe LT, eds. Colour vision: Physiology and psychophysics. London: Academic Press; 1983: 155–166.
6 Response Collection
Once the stimulus is generated and displayed on the monitor or other display device, the
next step is to measure aspects of the observers response, which may be as simple as a
key-press or as complex as an implicit behavior such as eye movements or physiological
responses observed through external devices. In this chapter, we describe the use of many
standard methods of collecting responses and consider issues of synchronization between
stimulus presentation and response collection or between multiple devices collecting dis-
tinct responses. The first two sections describe the collection of discrete responses and the
measurement of response times, respectively. The third section provides relevant programs
and examples. The fourth section treats the measurement of eye movements, and the fifth
section describes the synchronization issues in the collection of physiological data.
6.1.1 Keyboard
The simplest method of collecting a persons response to a visual display is to ask the
individual to press a key on the computer keyboard. He or she may be asked to press dif-
ferent keys to indicate different responses. Computer keyboards are inexpensive, do not
require any special hardware setup or programming, and naturally work with any computer
hardware, operating system, and standard software toolkit.
KeyboardInput.m is a program that displays the first character you type on your
keyboard and the elapsed time since the onset of the instruction screen. The Experimental
module of the program is shown in display 6.1.
6.1.2 Mouse/Touchpad
Most modern computers are equipped with a mouse or touchpad. A mouse is a device that
controls the position of a cursor in the two dimensions of a screen or other graphic device.
The mouse can be used to select different response categories by clicking a different button
or by clicking at a different cursor location. It may also be used to trace the trajectory of
the movements of a mouse-controlled cursor on the screen.
Display 6.1
%% Experimental Module
6.1.3 Joystick
A joystick is an input device consisting of a stick that pivots on a base and reports its
angle or direction to the computer. A two-dimensional (2D) joystick is like a mouse
moving the stick left or right signals movements along the X-axis and moving it up or
down signals movement along the Y-axis. A three-dimensional (3D) joystick uses twist
(clockwise or counterclockwise) to signal movement along the Z-axis. These three
axes (X, Y, and Z) correspond to roll, pitch, and yaw in the movements of an aircraft. Some joysticks also have one or more simple on/off switches, called fire buttons,
to be used for other kinds of responses or actions. Some also have haptic feedback
capability.
An analog joystick returns an angular measure of the movement in any direction as a
continuous value. A digital joystick gives only on/off signals for four different directions
and those combinations that are mechanically possible. Most I/O interface cards for PCs
have a game control port for a joystick. Modern joysticks mostly use a USB interface for
connection to the PC. The current version of Psychtoolbox does not support joysticks.
6.1.4 Touchscreen
A touchscreen is an electronic visual display that detects the presence and location of a
touch of a finger or a stylus within the display area. It enables a user or observer to interact
directly with what is displayed, rather than indirectly with a cursor controlled by a mouse
or touchpad. The typical touchscreen detects an event about 10 ms after contact.
The Psychtoolbox treats a touchscreen in the same way as the left mouse button. You
can use MouseButton.m to test a touchscreen if you have one.
Response times (RTs) provide valuable measures of human performance.1,2 RT and accuracy together define a performance point in a trade-off between speed and accuracy, the speed–accuracy trade-off.3–5 There are areas in psychology where RT is treated as the primary measured variable of interest because RT varies while there may be only small variations in accuracy. Defined as the elapsed time between stimulus or task onset and a subject's response, RTs have been measured in a large variety of tasks. Although computer keyboards and the mouse are widely used to collect responses, they can introduce substantial variability and bias into the measurement of RTs. Specialized button boxes must be used to obtain more accurate measurements.6,7
Display 6.2
%% Experimental Module
Display 6.3
%% Experimental Module
lineWidth = 1; % pixels
% Set the cursor to its initial location
SetMouse(100, 100);
str = ['Press the left mouse button and move the mouse ' ...
    'to draw. Release to finish.'];
Screen('DrawText', windowPtr, str, 50, 50, 1);
Screen('Flip', windowPtr);
% Wait for a click, then hide the cursor
while 1
    [x0, y0, buttons] = GetMouse(windowPtr);
    if buttons(1), break; end
end
HideCursor;
thePoints = nan(10000, 2); % pre-allocate record of mouse locations
thePoints(1, :) = [x0 y0]; % first point
i = 2;
while 1
    [x, y, buttons] = GetMouse(windowPtr);
    if ~buttons(1), break; end % exit when the button is released
    if (x ~= x0 || y ~= y0) % make sure the mouse is moving
        Screen('DrawLine', windowPtr, 128, x0, y0, x, y, lineWidth);
        Screen('Flip', windowPtr, 0, 1); % update the drawing on the display
        thePoints(i, :) = [x y];
        x0 = x;
        y0 = y;
        i = i + 1;
    end
end
highly accurate timing. There are a number of sources of timing error in registering a
keyboard/mouse response, including mechanical lag in depressing the keys (how far the
key has to travel and the pressure required), debouncing of multiple key presses (dis-
counting mechanical bouncing of the contact), sequential scanning of key presses, polling
of keyboard events, and event handling of the computer operating system.
Cumulatively, the various sources of timing errors in keyboard responses can add up to
delays and uncertainties of between 20 and 70 ms. For some tasks, this variability is
comparable to or greater than the variability of human RTs in simple detection tasks, which
is typically about 20 ms.2 Importantly, some of the timing variability in keyboard/mouse
responses is biased. The bias may make comparisons of RTs between different conditions
invalid. In situations in which the possible RT difference is small, the additional variability
from the measurement device may make the difference undetectable. An external device,
such as a special-purpose RT box, makes RT measurements more accurate.
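The effect of such delays can be illustrated with a simple simulation. In this hypothetical Python sketch, a keyboard adds a fixed 20-ms lag plus quantization to a 16-ms polling cycle on top of simulated human RTs; the lag and polling-period parameters are illustrative assumptions, not measurements of any particular device.

```python
import math
import random
import statistics

random.seed(1)

def keyboard_measured(rt, poll_period=0.016, lag=0.020):
    """Hypothetical keyboard: a fixed mechanical/driver lag plus
    quantization to the next polling cycle (all times in seconds)."""
    return math.ceil((rt + lag) / poll_period) * poll_period

# Simulated human simple RTs: mean 300 ms, SD 20 ms (cf. the ~20-ms
# variability of simple detection RTs mentioned in the text)
true_rts = [random.gauss(0.300, 0.020) for _ in range(5000)]
measured = [keyboard_measured(rt) for rt in true_rts]

# The keyboard adds a systematic bias (lag plus roughly half a polling
# period) and extra variance on top of the human variability
bias = statistics.mean(measured) - statistics.mean(true_rts)
extra_sd = statistics.stdev(measured) - statistics.stdev(true_rts)
print(f"added bias: {bias * 1000:.1f} ms, added SD: {extra_sd * 1000:.2f} ms")
```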
Table 6.1
Properties of some existing button boxes
a. PST Serial Response Box from Psychology Software Tools Inc. (http://www.pstnet.com/).
b. Engineering Solutions, Inc. (http://www.response-box.com/tools.shtml).
c. Stewart9: A PC parallel port button box.
d. PsyScope Button Box (http://psy.cns.sissa.it/).
e. Cedrus Corporation RB Series Response Pad (http://www.cedrus.com/support/rb_series/).
f. Li et al.6: RTBox.
In this section, we consider several RT devices for which specifications are available,
summarized in table 6.1. Several commercial RT boxes, such as the device by Empirisoft
Corporation,8 have good mechanical keys but still use a keyboard emulator and the
standard keyboard intake drivers, and so do not eliminate the biases of delayed polling
cycles. Other devices (see table 6.1) use either standard serial or parallel port communica-
tion thatwith appropriate softwarecan be polled frequently enough to reduce the
polling bias and allow millisecond accuracy.
Some RT devices use only simple button boxes and solve the various logging and timing
problems in software through the use of specially programmed device drivers. One example
is the software by Stewart9 for the Linux operating system that can be used to achieve
1-ms accuracy in RT measurement.
At the other end of the spectrum are very sophisticated devices that have their own real-time clocks, microprocessors, and ports to take inputs from other devices that
allow autonomous action. They use very rapid or continuous polling by the onboard
microprocessor of the RT inputs in order to minimize polling lags. The physical response
of the keys is debounced either in hardware or software. A microprocessor may keep a
sequence of time-stamped RTs and key identities and other events until this information
can be communicated to a main computer during a non-time-critical part of the trial.
The more elaborate RT devices also allow other events to trigger inputs that can be
time-stamped. If the device and the main computer each have their own real-time clocks
and the main computer is controlling, for example, the onset and duration of the display,
it is necessary to coordinate or synchronize the timing of the two devices. If we know the
time-stamp of a display onset relative to the clock of the RT device and we know the
time-stamp of the display onset by the main computer clock, we can infer the relative
timing and use this information to coordinate or synchronize the two.
One example of a full-service device is the RTBox.6,7 It combines many of the good
properties of the full-service devices. A software driver is provided to control the RTBox
in MATLAB, with Psychtoolbox-3 extensions.10 Once the RTBox is connected to the host
computer through a USB connection, the software driver detects the device and is used to
control and use all of its functions.
objects). The second variant measures the time course of visual search by interrupting the observer after different amounts of processing time and scoring the accuracy of response: a speed–accuracy trade-off (SAT) measure of speed of processing.
This visual search example illustrates the differential difficulty of searching for a C
among Os and searching for an O among Cs, where the former is easier than the latter.12,13
This is called a search asymmetry. Visual search asymmetry occurs when the speed and/
or accuracy of visual search differs depending upon which of the same two items is
assigned as the target of the search. Another example is the relative ease of finding a tilted
line among vertical lines compared to finding a vertical line among tilted ones.14 Researchers have suggested that search asymmetries reveal the existence of coded features of primary visual analysis, such as gap detectors or tilt detectors.13,15–19
Display 6.4 shows the experimental code for using the computer keyboard to measure
the RT and accuracy of responses for visual search for C among Os or O among Cs for
display sizes of 4, 8, and 12. Display 6.5 shows the same experiment using the RTBox
without an external trigger to record RTs.
Observers decide whether a search display contains a target or not and respond yes or
no. In the RT variant, RT is measured from the onset of the search display, which remains
on the screen until response, and then is erased. Sample displays of different set sizes with
target present or absent are shown in figure 6.1, along with corresponding mean RT and
accuracy data based on actual data in this experiment.12
It takes longer to respond to larger displays for the more difficult O-in-C searches, whereas RT is nearly unaffected by display size for the C-in-O searches. This pattern of RT data has often been interpreted as evidence for serial search (evaluation of one item at a time) in the O-in-C searches, but not in the C-in-O searches. Further analysis of visual
search with an SAT experiment suggests instead that the time course of these two kinds
of searches is largely overlapping, with differences in the relative accuracy of performance
for a given processing time12 (see later).
The SAT version of the search experiment manipulates the amount of time spent pro-
cessing the display and measures the corresponding accuracy.12 This experiment tests two
display sizes, 4 and 12. The visual search display is exposed only briefly (50 ms), and
accuracy of target absent or present discrimination is measured at seven different points
in time, with delays to an auditory response cue (beep) of 0, 0.05, 0.15, 0.30, 0.50, 1.15,
and 1.80 s after display offset. Display 6.6 shows experimental code for the SAT experi-
ment with a 50-ms display duration. It measures the time from the offset of the display to
the auditory cue, the RT of the response relative to the auditory cue, and displays the RT
as feedback to the observer.
In these SAT paradigms, observers are trained to keep the RTs to the response cue quite short (150–350 ms) so that the experimenter manipulates processing time while measuring
the corresponding accuracy of the responses. At very short cue delays, the observer
has not had time to process the content of the search display, and the responses are
guesses. Performance accuracy increases as observers are allowed more processing time.
Sample data from an actual SAT experiment of this design12 are shown in figure 6.2. The
accuracy curves are fit with best-fitting time-course curves, as described in section 10.4.5
of chapter 10.
These sample RT and SAT experiments and the corresponding code to run the experi-
ments provide some of the typical building blocks of coordination of visual displays,
presentation and timing of simple auditory cues, and measurement of RTs. These building
blocks can be combined to create many RT and SAT experiments.
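A minimal Python sketch of how a time-course curve like the one in figure 6.2 is derived from raw SAT data: at each cue delay, the hit and false-alarm rates are converted to the sensitivity measure d′. The rates below are invented for illustration, showing chance-level guessing at the shortest delay and rising accuracy with processing time.

```python
from statistics import NormalDist

def dprime(hit_rate, fa_rate):
    """Discrimination accuracy d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Invented (hit, false alarm) rates at four increasing cue delays
sat_rates = [(0.52, 0.50), (0.60, 0.45), (0.75, 0.30), (0.88, 0.15)]
curve = [dprime(h, f) for h, f in sat_rates]

assert curve[0] < 0.2                                # early responses are guesses
assert all(b > a for a, b in zip(curve, curve[1:]))  # d' grows with processing time
```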
In many experiments, the point of gaze (where we are looking) and/or the movement
of the eye relative to the head are measured.20 Eye position is sometimes used as an
implicit measure of what an observer is thinking about or where the observer is getting
information. It is also the object of study in experiments that investigate the functioning
of the eye movement system. How eye movements depend upon physical characteris-
tics of the display or on context, expectancy, or attention is sometimes the topic of
study.2124
There is a long history of the measurement of the eye position. Several generations of
technology have been used in its measurement. The different methods for measuring eye
position reflect a trade-off between precision of measurement and degree of invasiveness
of the procedure. For example, hard-core studies of the dynamic system properties,
including onset, acceleration, maximum eye speed, slowing of the eye, and so forth,
require more accurate measures, and investigators may use more invasive methods of
measurement. At the other extreme, investigations of infant looking behavior may use
only very rough measures of where an infant looks and for how long in order to provide
a completely noninvasive method of measurement.25,26 Earlier research often fixed the
position of the observers head with a bite barwax impressions of the teeth attached
to a fixed metal bar. More often today, the observer stabilizes the head by placing it
against a chin and a forehead rest. Some systems are designed to tolerate modest head
movements.
Display 6.4
%% Experimental Module
We discuss only the infrared or near-infrared video-based eye trackers in this section.
These video-based eye trackers are noninvasive, provide relatively good measurements for
many though not all people, and the simpler ones can be moderate in price.
Video-based eye trackers use a video camera that focuses on one or both eyes to record
eye movements. Typically, an infrared light generated by the device is reflected from the
eye and sensed by a video camera. The information is then analyzed to extract eye rota-
tions from changes in reflections from the eye.
Some eye trackers infer the orientation of the eye by using the corneal reflection (the
first Purkinje image) of the infrared light source created by the front surface of the cornea,
Display 6.5
%% Experimental Module
    'respCorrect' 'respTime'};
rec = nan(nTrials, length(p.recLabel)); % matrix rec initialized with NaN
rec(:, 1) = 1 : nTrials; % count the trial numbers from 1 to nTrials
sizeIndex = repmat(1 : nDispSize, [2 1 repeats]);
targetPresent = zeros(2, nDispSize, repeats); % first set all to 0
targetPresent(:, :, 1 : repeats / 2) = 1; % change first half to 1
[rec(:, 2), ind] = Shuffle(sizeIndex(:)); % shuffle size index
rec(:, 3) = targetPresent(ind);
% shuffle target presence index in the same order
% Initialize RTBox
RTBox('ButtonNames', {'left' 'left' 'right' 'right'});
% define the first two buttons as left; the last two as right
RTBox(inf); % wait till any button is pressed
Secs = Screen('Flip', windowPtr);
p.start = datestr(now); % record start time
[Figure 6.1 panels: (a, b) sample search displays with homogeneous distractors; (c) mean RT (ms, 600–1,600) and (d) error rate (proportion, 0–0.35) as a function of display size (0–15), for C-in-O and O-in-C searches, target present (TP) and target absent (TA).]
Figure 6.1
Response time study of visual search. (a) This display of size 4 illustrates a search for a C in Os. (b) This sample
display of size 12 illustrates a search for an O in homogeneous Cs. (c) and (d) Average correct RTs and error
rates as a function of display size for search with free viewing of unlimited-time displays (after Dosher, Han,
and Lu12). TP, target present; TA, target absent.
which acts like a convex mirror, and the center of the pupil as features to track the eye
position over time (figure 6.3).29 An image-processing algorithm looking for a dark circular
or elliptical pattern in the video image estimates the location of the pupil and its center.
A more sensitive type of eye tracker, the dual-Purkinje eye tracker,27 uses reflections from
the front of the cornea (first Purkinje image) and the back of the lens (fourth Purkinje
image) as features to track eye movements.
Eye trackers measure the rotation of the eye relative to a reference frame. Different
kinds of systems measure the rotation of the eye relative to different reference frames. A
head-mounted system, in which the light source and video camera are mounted on the
subject's head, measures the eye-in-head angles. A head-mounted system in which the observer may look around at the external world by moving the head as well as the eyes requires a separate measurement of the images seen by the eye to determine where in an image the eye is looking. Goggle systems for measuring eye movements are an example
of head-mounted displays where not just the measurement system but also the display
system is head-mounted.
Display 6.6
%% Experimental Module
str = ['Press the left buttons for target absent and the ' ...
    'right buttons for target present responses.\n\n' ...
    'Please respond as quickly and accurately as ' ...
    'possible.\n\nPress any button to start.'];
DrawFormattedText(windowPtr, str, 'center', 'center', 1);
% Draw instruction text string centered in window
Screen('Flip', windowPtr);
% flip the text image into active buffer
Beeper;
% Initialize RTBox
RTBox('ButtonNames', {'left' 'left' 'right' 'right'});
WaitTill(Secs + p.ITI);
RTBox('clear'); % clear RTBox, sync clocks
Screen('DrawLines', windowPtr, fixXY, 3, 0); % fixation
t0 = Screen('Flip', windowPtr, 0, 1);
Screen('DrawTextures', windowPtr, tex, [], rects); % C and O
t0 = Screen('Flip', windowPtr, t0 + p.fixDuration);
t0 = Screen('Flip', windowPtr, t0 + stimDur); % turn off stimulus
tCue = WaitSecs('UntilTime', t0 + dt);
Beeper; % please double-check whether any delay is
        % introduced by your operating system
[Figure 6.2: accuracy (d′), from −0.5 to 2.5, as a function of total processing time (0–3 s).]
Figure 6.2
Visual search discrimination performance (d′) as a function of total processing time (test onset to response) for display sizes of 4 and 12 (data after Dosher, Han, and Lu12).
[Figure 6.3 labels: the eye (retina, fovea, lens, iris, pupil center, corneal surface, center of rotation, optic axis of the eye, visual axis), the camera, the light source, the virtual image of the light source (first Purkinje image), and the image of the pupil center.]
Figure 6.3
Ray-tracing diagram showing schematic representations of the eye, a camera, and a light source.
A table-mounted (remote) system has a camera mounted on the table and measures gaze angles relative to a physical layout. If the head position is fixed (e.g., using a bite bar, or a chin rest and forehead support), then eye position and gaze are directly related. Head direction is subtracted from gaze direction to determine eye-in-head position.
Using one camera and one light source, the gaze can be estimated for a stationary head.
Using one camera and multiple light sources, the gaze can be estimated with free head
movements.
In some experiments, eye trackers monitor eye fixation and track eye movements while
the experimenter alters the content depending upon where an observer is looking. These
are called gaze contingent displays. The host computer of the eye tracker extracts informa-
tion about the eye position in real time. The information is then communicated to the
experimental computer to control the displays. Synchronization of the two computers is
very important for accurate timing of eye movements relative to stimulus events.
position for new locations and comparing this with the predicted values for these locations.
An accurate and reliable calibration is essential for obtaining valid and repeatable eye
movement data properly registered to external references.
Eye movement measurement systems can experience drift in their responses over time
or with movement of the viewer's head within the measurement frame of the device. For
this reason, calibration should be performed periodically. It may be necessary to perform
a calibration every dozen trials or so if very precise measurements are required.
For adult humans with healthy eye control systems who are willing and able to stay
relatively still, the calibration is relatively straightforward. In contrast, for young children
or animals, the situation is more challenging: we cannot simply say exactly where to look
during a calibration process and expect these observers to comply. Similarly, there may
be measurement issues with certain ocular or other diseases that exhibit variability or bias
in the actual control of eye gaze. For such individuals, the experimenter may need to design
a more qualitative experiment or use specially designed methods to ensure calibration.
Figure 6.4
Photograph of an Eyelink-1000 eye-tracking system. A chinrest is shown in the foreground attached to two poles.
The system of cameras to photograph the eye and light sources are in the small unit placed in front of the monitor.
The computer and equipment for eye-movement tracking are shown. Often, a separate computer (not shown)
drives the display device and performs timing operations.
ware package provides programs for calibration, measurement, and analysis of eye posi-
tion. Other functions for the Eyelink eye tracker have been developed and circulated by
Psychtoolbox user groups.
Display 6.7 provides the code for RT_Eyelink.m, a sample program for tracking eye
fixation and generating gaze contingent displays for the Eyelink device. The experiment
is the same as the one described in section 6.3.1 except that an Eyelink eye tracker is used
to track a subject's eye fixation during each trial. The Eyelink systems use an Ethernet
communication protocol between the main computer and the Eyelink computer, with
estimated synchronization accuracies of 1–2 ms.
Other commercial eye-tracking systems may use different protocols and rely on different
methods of communication between the eye movement and main experimental computers.
Subroutines for eye-tracking systems will either be provided by the device developer or
by second-source user groups. However, the general principles of operation, calibration,
and testing will be similar to the example provided here.
6.5.1 Electroencephalography
EEG measures the electrical activity of the brain through sensors attached to the scalp
and a reference sensor attached to another part of the body, such as the ear. The electrical
fields of brain activity are measured through the skull and the scalp and reflect the group
activity of many active neurons. Many psychophysics and other laboratories include EEG
systems.35
The level of noise or random variation in the brain response on any given trial is suf-
ficiently high that the stereotypical response to a stimulus and task is usually measured
by averaging more than 100 trials synchronized to the stimulus input.34–36 Individual trial
noise can reflect random or slow cyclical variations in intrinsic brain activity or in elec-
tromagnetic interference from the environment. The average EEG locked to an event is
called an event-related potential (ERP). The temporal precision of the EEG measurement
is at a millisecond level and is determined by the sampling rate of the EEG device. Mea-
surable responses of the average ERP may occur within the first few milliseconds follow-
ing the stimulus and continue for a second or more. Figure 6.5 shows an example of
continuous EEG, with several event markers, and an average ERP trace, showing the
stereotypical form of the response to a repeated visual stimulus. Experimenters infer properties
of the brain response from where an ERP signal is localized in the brain and the shape
and time course of the average response to the stimulus.
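The trial-averaging logic behind the ERP can be illustrated with a short simulation. The sketch below is in Python with NumPy rather than MATLAB, and the sampling rate, epoch window, and response template are hypothetical values chosen only for the demonstration.

```python
import numpy as np

def average_erp(eeg, event_samples, pre, post):
    """Average EEG epochs time-locked to event markers to obtain the ERP."""
    epochs = [eeg[s - pre : s + post] for s in event_samples
              if s - pre >= 0 and s + post <= len(eeg)]
    return np.mean(epochs, axis=0)

# Synthetic demonstration: a fixed response buried in trial-to-trial noise.
rng = np.random.default_rng(0)
fs = 1000                                   # assume 1-kHz sampling (1-ms resolution)
t = np.arange(-100, 400) / fs               # epoch: 100 ms before to 400 ms after onset
template = np.exp(-((t - 0.1) ** 2) / (2 * 0.02 ** 2))  # response peaking at 100 ms
eeg = rng.normal(0, 1, 200_000)             # single-trial noise as large as the peak
events = np.arange(1000, 199_000, 1000)     # 198 stimulus onsets
for s in events:                            # embed the stereotyped response
    eeg[s - 100 : s + 400] += template
erp = average_erp(eeg, events, pre=100, post=400)
# Averaging ~200 trials shrinks the noise by sqrt(198), about 14-fold, so the
# 100-ms peak that is invisible in single trials stands out clearly in `erp`.
```

The same time-locking step is what an event time-stamp in the EEG data stream makes possible, whether the averaging is done online or offline.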
Source localization of an ERP tries to identify which brain region or regions are the
source(s) of the primary ERP response. Localization of brain responses to a specific part
of the brain depends upon source-localization modeling. By assuming a net current flow
of a certain magnitude at a given source location and applying a model of the conduction
properties of the brain, a source-localization model generates a prediction about the dis-
tribution of ERP at the scalp where the EEG sensors are placed. Mathematically inverting
such models estimates the location of sources of the electrical brain activity from the scalp
measurements.44
In a typical EEG setup, one computer interacts with the EEG device setup and records
the EEG responses while another computer manages visual, auditory, or tactile stimulus
display and perhaps collects responses through a button box or on the keyboard. The
synchronization of the three sets of activities is the key experimental feature. In almost all
setups, synchronization with the stimulus occurs by sending a stimulus onset signal or
mark to the EEG computer and including this time-stamp as part of the EEG data stream.
This time mark may then be used to mark the beginning of the EEG interval either for
online averaging routines or for later offline data processing.
The specifics of the synchronization are dependent upon the setup and the manufacturer
of the EEG system. Synchronization signals may be sent via TTL, light or other pulses,
or via messages to data acquisition computer cards for the particular EEG system. Details
and the specific accompanying function calls are device-specific.
Display 6.7
%% Experimental Module
disp(str)
Screen('DrawTextures', windowPtr, tex, [], rects);  % C and O
t0 = Screen('Flip', windowPtr, t0 + p.fixDuration);
[key, Secs] = WaitTill(keys);   % wait till response
Screen('Flip', windowPtr);      % remove stimulus
if iscellstr(key), key = key{1}; end
% take the first in case of multiple key presses
if strcmp(key, 'esc'), break; end
rec(i, 4) = strcmp(key, 'right') == rec(i, 3);
% record correctness
rec(i, 5) = Secs - t0;          % record respTime
if rec(i, 4), Beeper; end       % beep if correct
end
p.finish = datestr(now);        % record finish time
save RT_Eyelink_rst.mat rec p;  % save the results
Eyelink('StopRecording');       % stop eye data recording
Eyelink('CloseFile');           % close data file
Eyelink('ReceiveFile');         % download data file to current folder
Eyelink('Shutdown');            % close the connection to eye PC
Figure 6.5
Continuous EEG traces time-locked to the onset of a visual stimulus are averaged to obtain the ERP. Topographic
map of the ERP at selected latencies.
6.5.2 Magnetoencephalography
MEG37,38 provides a converging method for measuring brain activity through the induced
magnetic fields of underlying brain activity; it is a complementary measure of neuronal
activities to the electrical traces of the EEG. Magnetic fields of the brain are very small
compared with the ambient magnetic field of the earth. MEG devices require magnetically
shielded facilities and arrays of superconducting quantum interference devices (SQUIDs)
operated at low temperatures using liquid helium as a cooling agent. The experimental
laboratory setup involves special considerations for
removing electromagnetic activity of computers or display devices from the measurement
room. For these reasons, MEG facilities are typically regionally shared devices that are
operated at major university or industry research centers.
Magnetic signals that are detectable reflect currents that are perpendicular to cortical
surfaces of the brain and likely involve 50,000 active neurons or more.45 MEG responses
can be measured at about 1 ms accuracy. The neurogenerators of MEG signals are the
same as those of EEG signals, and source localization is also performed through a math-
ematical source localization model. MEG has better source localization than the EEG
because magnetic fields are less susceptible to distortions from measurement through the
skull. MEG is most accurate in localizing sources in brain regions near the scalp, such as
primary auditory, somatosensory, and motor cortices, where the localization exceeds that
of the EEG.46 Figure 6.6 shows two sample topographic maps of MEG responses to audi-
tory tones.47
Like EEG, synchronization is one important experimental issue. Here too, synchroniza-
tion is typically managed through providing a time-mark input signal for the stimulus
onset. The time mark often comes from a signal from the parallel port of the main computer
that is managing the stimulus display and collecting response data into the computer that
is controlling the imaging device and recording imaging data. The details are specific to
the device and setup in the laboratory.
Dewar Squid
electronics
Liquid helium
Bandpass
Amplifier filters
Squid
Detection coil
A/D converter
Magnetic field
Field map
Computer
Disk
Display
Figure 6.6
(a) Schematic of a single-channel neuromagnetometer measuring the brain's magnetic field, or MEG. (b) and
(c) Isofield contours for subject ZL characterizing the measured field pattern over the left hemisphere 100 ms
following the onset of a tone burst stimulus with interstimulus intervals of 1.2 and 6 s. Arrows denote the direc-
tion of current dipole sources that account for each pattern, with their bases placed at the dipoles' surface locations.
In the right panels, the upper arrow is the N100m source and the lower arrow is the L100m source. Insets illustrate
response waveforms obtained at the indicated positions. Both waveforms also exhibit a 200-ms component (after
Williamson, Lu, Karron, and Kaufman46 and Lu, Williamson, and Kaufman47).
lots of electricity, special cooling, magnetic shielding, and special protocols to protect
subjects who may have metals within their bodies or special medical conditions that make
them more vulnerable. Access to fMRI is typically provided in centralized research facili-
ties. There are special control systems and computers to program the timing of changes
in magnetic fields within the MRI machine and to record data and provide on-site checks
of 3D reconstruction and signal verification.
These systems, if they are outfitted for psychophysical experimentation, will also
have systems for the projection of images from a shielded source for the observer to see,
noise isolation and sound presentation through suitable earphones, and response devices,
all designed to work in the magnetic environment (figure 6.7).
The responsibility of the psychophysical experimenter using fMRI is to control the
stimulus display in the correct temporal relationship to the sequence of MRI samples, to
collect and time observer responses, and to store the sequences and timings of new stimu-
lus response cycles. To preserve the integrity of the MRI system, synchronization is usually
managed by having the MRI system send out time-mark signals to the experimental com-
puter, and for the experimental computer to arrange to deliver the stimulus at the correct
time and measure RT in relation to stimulus delivery.
Typically, MRI systems provide a TTL trigger at the beginning of each repetition time
(TR), defined as the amount of time between successive pulse sequences applied to the
Figure 6.7
Schematics of the functional MRI setup in the Center for Cognitive and Behavioral Brain Imaging at The Ohio
State University.
same brain volume. In fMRI brain imaging, the TR is typically 1–3 s. The TTL trigger
input is recorded on the stimulus presentation computer. In some cases, the TTL trigger
is connected via a keyboard, which has the same limitations on timing as the keyboard
has for measuring RT. For better timing, the TTL trigger of an MRI system can be connected
to an RTBox, in which case timing will be at millisecond accuracy.
6.6 Summary
References
13. Treisman A, Gormican S. 1988. Feature analysis in early vision: Evidence from search asymmetries. Psychol Rev 95(1): 15–48.
14. Wolfe JM, Friedman-Hill SR. 1992. Visual search for oriented lines: The role of angular relations between targets and distractors. Spat Vis 6(3): 199–207.
15. Wolfe JM. 2001. Asymmetries in visual search: An introduction. Atten Percept Psychophys 63(3): 381–389.
16. Nagy AL, Cone SM. 1996. Asymmetries in simple feature searches for color. Vision Res 36(18): 2837–2847.
17. Rosenholtz R. 2001. Search asymmetries? What search asymmetries? Atten Percept Psychophys 63(3): 476–489.
18. Williams D, Julesz B. 1992. Perceptual asymmetry in texture perception. Proc Natl Acad Sci USA 89(14): 6531–6534.
19. Rubenstein BS, Sagi D. 1990. Spatial variability as a limiting factor in texture-discrimination tasks: Implications for performance asymmetries. JOSA A 7(9): 1632–1643.
20. Kowler E. Eye movements and their role in visual and cognitive processes, Vol. 4. Amsterdam: Elsevier Science Ltd; 1990.
21. Hoffman JE, Subramaniam B. 1995. The role of visual attention in saccadic eye movements. Atten Percept Psychophys 57(6): 787–795.
22. Rayner K. 1998. Eye movements in reading and information processing: 20 years of research. Psychol Bull 124(3): 372–422.
23. Hayhoe M, Ballard D. 2005. Eye movements in natural behavior. Trends Cogn Sci 9(4): 188–194.
24. Allopenna PD, Magnuson JS, Tanenhaus MK. 1998. Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. J Mem Lang 38(4): 419–439.
25. Kremenitzer JP, Vaughan HG, Jr, Kurtzberg D, Dowling K. 1979. Smooth-pursuit eye movements in the newborn infant. Child Dev 50(2): 442–448.
26. Young LR, Sheena D. 1975. Survey of eye movement recording methods. Behav Res Methods 7(5): 397–429.
27. Crane HD, Steele CM. 1985. Generation-V dual-Purkinje-image eyetracker. Appl Opt 24(4): 527–537.
28. Woestenburg J, Verbaten M, Slangen J. 1983. The removal of the eye-movement artifact from the EEG by regression analysis in the frequency domain. Biol Psychol 16(1): 127–147.
29. Guestrin ED, Eizenman M. 2006. General theory of remote gaze estimation using the pupil center and corneal reflections. IEEE Trans Biomed Eng 53(6): 1124–1133.
30. SR Research. Available at: http://www.sr-research.com/.
31. Jung R, Kornhuber H. Neurophysiologie und Psychophysik des visuellen Systems. [The visual system: Neurophysiology and psychophysics] Berlin: Springer-Verlag; 1961.
32. Jung R. 1961. Neuronal integration in the visual cortex and its significance for visual information. In: Rosenblith W, ed. Sensory communication. Cambridge, MA: MIT Press; 1961: 627–674.
33. Fechner G. Elemente der Psychophysik. Leipzig: Breitkopf & Härtel; 1860.
34. Nunez PL, Srinivasan R. Electric fields of the brain: The neurophysics of EEG. New York: Oxford University Press; 2006.
35. Regan D. Human brain electrophysiology: Evoked potentials and evoked magnetic fields in science and medicine. New York: Elsevier; 1989.
36. Niedermeyer E, Da Silva FHL. Electroencephalography: Basic principles, clinical applications, and related fields. Philadelphia: Lippincott Williams & Wilkins; 2005.
37. Lu ZL, Kaufman L. Magnetic source imaging of the human brain. Hillsdale, NJ: Lawrence Erlbaum Associates; 2003.
38. Hansen PC, Kringelbach ML, Salmelin R. MEG: An introduction to methods. Oxford: Oxford University Press; 2010.
39. Huettel SA, Song AW, McCarthy G. Functional magnetic resonance imaging, Vol. 1. Sunderland, MA: Sinauer Associates; 2004.
40. Poldrack RA, Mumford J, Nichols T. Handbook of functional MRI data analysis. Cambridge, UK: Cambridge University Press; 2011.
41. Buxton RB. Introduction to functional magnetic resonance imaging: Principles and techniques. Cambridge, UK: Cambridge University Press; 2002.
42. Villringer A, Planck J, Hock C, Schleinkofer L, Dirnagl U. 1993. Near infrared spectroscopy (NIRS): A new tool to study hemodynamic changes during activation of brain function in human adults. Neurosci Lett 154(1): 101–104.
43. Kato T, Kamei A, Takashima S, Ozaki T. 1993. Human visual cortical function during photic stimulation monitoring by means of near-infrared spectroscopy. J Cereb Blood Flow Metab 13: 516–520.
44. Michel CM, Murray MM, Lantz G, Gonzalez S, Spinelli L, Grave de Peralta R. 2004. EEG source imaging. Clin Neurophysiol 115(10): 2195–2222.
45. Lu ZL, Williamson S. 1991. Spatial extent of coherent sensory-evoked cortical activity. Exp Brain Res 84(2): 411–416.
46. Williamson SJ, Lu ZL, Karron D, Kaufman L. 1991. Advantages and limitations of magnetic source imaging. Brain Topogr 4(2): 169–180.
47. Lu ZL, Williamson SJ, Kaufman L. 1992. Human auditory primary and association cortex have differing lifetimes for activation traces. Brain Res 572(1–2): 236–241.
III THEORETICAL FOUNDATIONS
7 Scaling
Every visual stimulus generates an internal representation in the perceptual system of the
observer. The goal of scaling in psychophysics is to quantify the relationship between
perceptual intensities or qualities and properties of the physical stimulus.1–5 This is accom-
plished through various empirical and theoretical methods of scaling including magnitude
estimation, magnitude production, and similarity/dissimilarity judgments. The goal of the
scaling enterprise is to derive an internal representation of the stimuli in either a single-
dimensional or multiple-dimensional space. This internal representation is then used to
predict performance and brain responses for other stimulus combinations.
Every psychophysical experiment involves the creation of an internal representation in
the brain from the physical stimulus, and also the production of one or more responses
using a decision structure or decision rule (figure 7.1). In visual psychophysics, we specify
and control the physical stimulus and measure the response. To infer accurately the rela-
tionship between the physical stimulus and the internal representation, we must also
understand the transformation from the internal representation to the response.6
The advancement of neurophysiology and brain imaging by electroencephalography
(EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging
200 Chapter 7
Figure 7.1
Mapping the physical stimulus onto an internal representation and then an overt response.
(fMRI) has created new tools that may provide insights into both the internal representations
of stimuli and the decision rules underlying responses. A combination of the information
obtained from psychophysical scaling and neurophysiology, or neuropsychophysics, will
prove useful in resolving ambiguities in both domains.1,7,8
Some of the most important quantitative laws or relationships between simple stimu-
lus variations and perceptual qualities that model human performance have been discov-
ered through scaling. In cases where the perceptual representations are more complicated,
such as for complex objects, characterizing stimulus variations in multidimensional psy-
chological spaces can yield important metrics for the stimulus variations themselves.
Understanding these complicated representational spaces may improve the interpretation
of the neural codes that represent the stimuli. This in turn may be critical to the develop-
ment of devices that augment human perception and cognition.
interpolations to estimate the function relating the magnitude rating responses to manipula-
tions of stimulus intensity.
Magnitude estimation and magnitude production experiments have been widely studied
and have generated well-known functional relationships between the stimulus variation
and the magnitude response. The most famous of these is the power-law relationship
known as Stevens's law. The power function is

g(I) = αI^β,  (7.1)

where I is the physical intensity or magnitude of the stimulus, β is the power exponent,
and α is a scaling factor. There are three kinds of relationships (figure 7.2): linear when
β is 1, compressive when β is less than 1, and expansive when β is greater than 1. In a
compressive scale, larger and larger stimulus differences are needed to yield a given dif-
ference on the internal scale as the stimulus intensity increases. In an expansive scale,
the opposite is true. The measured exponents are typically compressive for dimensions
Figure 7.2
Illustrations of compressive, linear, and expansive relationships between stimulus values and perceptual strength.
such as loudness and brightness, have been reported occasionally to be near 1 for line
length, and are often expansive when dealing with some aspects of taste or noxious
stimuli.10,16
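The three regimes can be verified numerically. In the following Python sketch the exponent values are arbitrary illustrations, not measured ones: with β = 0.5 equal steps on the internal scale require growing steps in physical intensity, whereas with β = 2 doubling the intensity quadruples the predicted percept.

```python
def stevens(I, alpha=1.0, beta=1.0):
    """Stevens's power law: perceived magnitude g(I) = alpha * I**beta."""
    return alpha * I ** beta

# Compressive scale (beta < 1): equal internal steps require larger and
# larger steps in physical intensity (here 1, 4, 9, 16).
g = [stevens(I, beta=0.5) for I in (1, 4, 9, 16)]
diffs = [b - a for a, b in zip(g, g[1:])]    # equal internal steps of 1.0

# Expansive scale (beta > 1): doubling the stimulus quadruples the percept.
ratio = stevens(2.0, beta=2.0) / stevens(1.0, beta=2.0)
```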
Direct scaling seeks to understand how changes in the physical stimulus are reflected
in the internal representation that underlies the psychological percept. However, the direct
scaling methods actually estimate the relationship between the magnitude response and
the stimulus. Inferring the internal representation from this stimulus–response relationship
requires that we know how the internal representation is mapped to the overt response.
Usually, it is implicitly assumed that the magnitude score directly accesses the internal
representation; that is, the function that relates the internal representation to the response is
linear. Discovery of the accurate relationship between the external stimulus and the internal
representation depends upon the validity of this assumption.6 This assumption is rarely
tested, although there is evidence that the function is nonlinear in at least some
circumstances.17
One way to check the validity of a linear linking function between the internal repre-
sentation and the reported magnitude is to relate the results of one power-law to another
by measuring the relationship through either within-modality matching or cross-modality
matching.17 For example, if loudness and brightness are each described by their own
power-law functions with different exponents A and V and the linking function is linear
in both cases, then this implies specific relationships between pairs of auditory and visual
stimuli. So, if you start with one loudness and one brightness that are perceived to be
matched on intensity, and if you double the perceived loudness by using the auditory power
function and do the same for brightness, then the new loudness and the new brightness
should also match. Consistency in such tests provides evidence that the two linking func-
tions are the same and are more likely to be linear. Empirically, such tests sometimes fail
to provide consistent results.2
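This doubling test is easy to state in code. In the Python sketch below, the exponents βA and βV and the starting magnitude are hypothetical values chosen for illustration, and a linear linking function is assumed throughout.

```python
def intensity_for_magnitude(g, alpha, beta):
    """Invert the power law g = alpha * I**beta to get the required intensity."""
    return (g / alpha) ** (1 / beta)

beta_A, beta_V = 0.6, 0.33      # hypothetical loudness and brightness exponents
alpha_A = alpha_V = 1.0

# A tone and a light initially matched in perceived magnitude (both 10).
g0 = 10.0
I_tone = intensity_for_magnitude(g0, alpha_A, beta_A)
I_light = intensity_for_magnitude(g0, alpha_V, beta_V)

# Double each perceived magnitude using its own power function.
I_tone2 = intensity_for_magnitude(2 * g0, alpha_A, beta_A)
I_light2 = intensity_for_magnitude(2 * g0, alpha_V, beta_V)

# With linear linking functions the new pair should still match:
g_tone2 = alpha_A * I_tone2 ** beta_A
g_light2 = alpha_V * I_light2 ** beta_V
# Note the asymmetry in the required physical changes: the light's intensity
# must grow by 2**(1/0.33), about 8.2x, the tone's by only 2**(1/0.6), about 3.2x.
```

A failure of such a cross-modality match in real data would implicate a nonlinear linking function in at least one modality.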
Recent advancements in neurophysiology and brain imaging may bring tools that will
contribute to understanding both the representation and the linking functions or decision
processes that take the internal representation(s) to responses. This is a point we return to
later in the chapter.
Weber originally measured the increment ΔI in the physical stimulus intensity needed
for the observer to perceive the larger stimulus to be greater 75% of the time, defined to
be the just noticeable difference (JND). He found that the increment ΔI was proportional
to the magnitude of the baseline stimulus, or ΔI/I = k over a wide range of intensities I.
This relationship has been called Weber's law.18
One example experiment measures the JND for line length. On each trial, line segments
with lengths Li and Li + ΔLi are presented, and the subject judges which is longer. A stair-
case procedure (see section 11.2 in chapter 11) or a method of limits procedure (see section
8.2 of chapter 8) is used to find ΔLi such that Li + ΔLi is perceived as longer 75% of
the time. The experiment measures this for eight different baseline lengths Li over a wide
range. The experimenter then graphs ΔLi as a function of Li. This functional relationship
is approximately linear and can be fit by a straight line to estimate the Weber fraction, k
(figure 7.3).
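The straight-line fit can be sketched as follows; the JND data here are made-up illustrative values that obey Weber's law with k = 0.05 plus small measurement noise, standing in for measurements like those in figure 7.3 (a Python sketch rather than MATLAB).

```python
import numpy as np

# Hypothetical JNDs (Delta L) at eight baseline line lengths, generated to obey
# Weber's law Delta L = k * L with k = 0.05 plus small measurement noise.
L = np.arange(1.0, 9.0)                              # baseline lengths 1..8
noise = np.array([0.002, -0.003, 0.001, 0.004,
                  -0.002, 0.003, -0.001, 0.000])
dL = 0.05 * L + noise

# Fit the straight line dL = k*L + b; the slope k estimates the Weber fraction
# and the intercept b should be near zero if Weber's law holds.
k, b = np.polyfit(L, dL, 1)
```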
Weber's law is an empirical statement about the constancy of the ratio of the JND
to stimulus intensity. However, the law by itself does not specify the internal scale,
Figure 7.3
Just noticeable difference (JND) as a function of baseline length.
g(Ii). Fechner reformulated Weber's law to create a psychological scale from the mea-
sured JNDs. He showed that if Weber's law holds, then the function of intensity is
logarithmic: g(I) = A log(I) + C. This formulation is called Fechner's law.1 Fechner's
reformulation of Weber's law makes a critical assumption that a function F maps the
internal representations of a pair of stimuli to their discriminability in a way that
depends only on the difference between the internal scale values, [g(I + ΔI) − g(I)],
and that this difference is constant for a particular level of discriminability. In this formulation,
P(g(I + ΔI) > g(I)) = F[g(I + ΔI) − g(I)] = k1 and [g(I + ΔI) − g(I)] = F⁻¹(k1) = k2, or

Δg/ΔI = k2/ΔI = (k2/k)(1/I),

corresponding with a logarithmic scale.
This critical assumption of Fechner's law still must be tested empirically. Some
experiments suggest that the function F may also depend in a subtle way on the base
intensity I and other factors.20–23 There are cases in which Weber's law does not hold.
In auditory perception, the Weber fraction k is not exactly a constant. It is slightly
lower for more intense auditory stimuli, a phenomenon called the "near miss to Weber's
law."13
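Fechner's construction can also be checked numerically: under Weber's law each JND step multiplies the intensity by (1 + k), so the number of JND steps above a reference intensity grows logarithmically with I. A Python sketch with a hypothetical Weber fraction:

```python
import math

def jnd_steps(I0, I, k):
    """Count JND steps from I0 up to I when each step is Delta I = k * I."""
    n = 0
    while I0 * (1 + k) <= I:
        I0 *= 1 + k
        n += 1
    return n

k, I0 = 0.05, 1.0                     # hypothetical Weber fraction and reference
n2, n4, n8 = (jnd_steps(I0, I, k) for I in (2.0, 4.0, 8.0))

# Fechner: the step count tracks log(I/I0) / log(1 + k), a logarithmic scale.
for n, I in ((n2, 2.0), (n4, 4.0), (n8, 8.0)):
    assert abs(n - math.log(I / I0) / math.log(1 + k)) <= 1

# Equal intensity *ratios* (successive doublings) add equal numbers of steps:
assert n4 - n2 == n8 - n4
```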
where dij is the z-score of the probability that Si is judged greater than Sj.
In Thurstone scaling, the experimenter measures the probability that one stimulus is
judged greater than another for many different stimulus combinations. These data are used
to estimate scale values g(Si) for all tested stimuli. Most researchers simplify the estima-
tion of scale values by assuming that the variabilities of the internal representations are
independent and identically distributed (Thurstone's case V).25–27
This means that all the noise variances are equal, and all the noise correlations are 0. Other
cases considered by Thurstone (cases I to IV) relax these simplifying assumptions in
various ways.
Thurstone's approach could be used in visual psychophysics to measure the positions
of stimuli along an arbitrary single internal dimension. For example, Thurstone scaling
could estimate the distance on an internal representation between sine waves of different
spatial frequencies. The experimenter selects sine waves with a number of different spatial
frequencies between the lowest and the highest spatial frequency to be tested and performs
pairwise comparative judgments of which has higher frequency for each pair. These data
are used to estimate the distances between near pairs, and the distances between highly
separated pairs are estimated by cumulating shorter distances.
In figure 7.4, we present data from a hypothetical spatial frequency scaling experiment.
In this experiment, seven different spatial frequencies (indexed by numbers 1 to 7, from
low to high) are used.

Figure 7.4
An example of Thurstone scaling of different spatial frequency stimuli. (a) Sine-wave gratings of seven different
spatial frequencies. (b) Internal representation of the seven sine-wave gratings on a single dimension assuming
independence and equal variance. (c) and (d) Hypothetical probability and z-score data in which the spatial
frequency indexed by i is judged to be higher than the spatial frequency indexed by j.

In each trial, the observer is presented with a pair of spatial
frequencies and asked to judge which one is higher. This is repeated many times. At
the end of the experiment, we can summarize the data in terms of PiHj, the observed
proportion that spatial frequency i is judged to be higher than spatial frequency j
(figure 7.4c). The proportions are also converted into corresponding z-scores, ziHj (figure
7.4d). If we set the first spatial frequency (i = 1) at the origin and estimate the distance of
all the other spatial frequencies (i = 2, …, 7) to the first spatial frequency as di, we can
express ziHj as
Displays 7.1 and 7.2 present MATLAB programs for recovering the internal psychological
distances from the data shown in figure 7.4. The recovered distances are 0.347, 0.776,
1.268, 1.558, 1.764, and 2.050. For Thurstone case V, these distances are in units of the
common standard deviation. The program enters (hypothetical) PiHj data and provides
initial guesses for the locations of stimuli on the internal dimension. A fitting routine,
fminsearch, is used to find the internal locations that optimize the fit to the data. This
kind of model estimation is treated in detail in chapter 10.
Display 7.1
Display 7.2
L = 0;
for i = 1:7
    for j = 1:7
        z(i, j) = sqrt(2) * (d(j) - d(i));
        L = L + (zData(i, j) - z(i, j))^2;
    end
end
r2 = 1 - L / sum(sum((zData - mean2(zData)).^2));
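The same least-squares fit can be sketched in Python. The model below uses one common case V convention, ziHj = (di − dj)/√2 with unit variances, and synthesizes the "observed" z-scores from assumed scale values instead of the data in figure 7.4; a simple gradient-descent loop stands in for fminsearch.

```python
import numpy as np

rng = np.random.default_rng(1)
d_true = np.array([0.0, 0.35, 0.78, 1.27, 1.56, 1.76, 2.05])  # assumed locations

def z_model(d):
    # Case V prediction (one common convention): z_iHj = (d_i - d_j) / sqrt(2)
    return (d[:, None] - d[None, :]) / np.sqrt(2)

# Synthetic "observed" z-scores: model predictions plus a little noise.
z_data = z_model(d_true) + rng.normal(0, 0.02, (7, 7))

# Minimize the summed squared error by gradient descent, pinning d_1 = 0 as
# the origin (this plays the role of fminsearch in the MATLAB displays).
d = np.zeros(7)
for _ in range(5000):
    resid = z_data - z_model(d)
    grad = (resid.sum(axis=0) - resid.sum(axis=1)) / np.sqrt(2)  # ~ dL/dd
    d -= 0.01 * grad
    d -= d[0]               # keep the origin fixed at the first stimulus
d_hat = d                   # recovered internal locations
```

The recovered `d_hat` reproduces the assumed scale values to within the added measurement noise.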
Thurstone scaling relies on pairwise measurements, and only close pairs yield meaning-
ful probability data that are not saturated near 0 or near 1. Long-range scales that are
constructed from these local measures usually assume that the scale is unidimensional and
lies on a straight line embedded in a Euclidean space. This corresponds to assuming that you
can add many small distances to estimate a long distance, but these long distances cannot
be directly tested in the data in a meaningful way.
In general Thurstone scaling, not just the mean scale values but also the variances and
the correlations are free parameters in the solution. Most meaningful comparisons (those
where one stimulus does not completely dominate another) are limited to near neighbors.
For this reason, the simplifying assumptions of Thurstone case V of equal variances and
zero correlations may not be easy to test. The estimated distances of highly separated
stimuli may be altered, for example, by variances that change systematically with the
mean, so as to compress or expand the distance scale at one end. Yet these may not be
empirically detectable. Also, the assumption of unidimensionality may be violated in some
instances. Researchers have developed other techniques to place stimuli in a multidimen-
sional space (see section 7.3).
measurement theory. Measurement theory asks whether the internal or psychological scale
preserves the strong properties of a physical scale, such as length, which has a physical
ratio scale, or instead has weaker properties with respect to various mathematical opera-
tions.28–30 In other words, to quote Ashby31(p. 103): "When are we justified in representing
phenomena with properties X by numerical structure Y?" or "How much are we allowed
to read into the numbers that result?"
The field of measurement theory often tests relationships between stimuli, or between
pairs of stimuli, that probe necessary mathematical relations implied by ratio, interval, or
ordinal scale representations.32 Good primary references to the extensive mathematical
development in measurement theory can be found in the three volumes of Foundations of
Measurement and in others.13,33–36 Several examples are considered next.
In direct estimation, subjects are asked to assign numbers to stimuli. However, numbers
are not just words, but carry with them mathematical operations. Numbers can be used to
indicate names or categories (nominal scale), in which case they provide no information
about relationships between corresponding stimuli. Or, numbers may indicate order or
ranking information between examples (ordinal scale). Numbers may also preserve sums
and differences (interval or ratio scales). Understanding the relational and other properties
of the assigned numbers provides information about the internal representations. If the
assigned numbers of direct estimation obey interval scale properties, then we infer that
the internal representation must also preserve those same interval properties. In contrast,
if the internal representations preserve only ordinal properties, then there would be no
basis for assigned numerical values to show interval scale consistency.
Conjoint measurement theory asks whether it is possible to measure the
corresponding psychological scales for objects with distinct attributes A and X through
stimulus combinations and judgments.37-39 The theory assumes that stimulus attributes A
and X combine non-interactively to determine the value of a perceived variable P. The
psychological scales for A and X are unknown. The theory tests whether the values of P
resulting from different combinations of different values of A and X are consistent in the
data. For example, if a_i x_k is judged greater than a_j x_k, then a_i x_l should be judged
greater than a_j x_l. Elaborate versions of these and other similar tests, the so-called
cancellation theorems, are sometimes used to infer interval properties of the scales of A
and X as well as to validate the assumption of non-interactive combination. If all these
properties hold, then we conclude that the internal representations of A and X are on an
interval scale.
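The simplest of these tests, independence (single cancellation), can be checked mechanically: in a table of judged values for every combination of A and X levels, the ordering of the A levels must be the same at every X level. A Python sketch with hypothetical judgment data (the table and its values are invented for illustration):

```python
# Hypothetical table of mean judged magnitudes P[i][k] for
# attribute-A level i combined with attribute-X level k.
P = [
    [2.1, 3.4, 5.0],   # a1 with x1, x2, x3
    [2.9, 4.2, 5.8],   # a2 with x1, x2, x3
    [3.5, 5.1, 6.6],   # a3 with x1, x2, x3
]

def independent(table):
    """Single cancellation: if a_i x_k > a_j x_k at one level of X,
    then a_i must beat a_j at every level of X (assumes no exact ties)."""
    n_a, n_x = len(table), len(table[0])
    for i in range(n_a):
        for j in range(n_a):
            # Collect the direction of the (a_i, a_j) comparison
            # at each level of X.
            signs = {table[i][k] > table[j][k] for k in range(n_x)}
            if len(signs) > 1:   # the order of a_i and a_j flips across X
                return False
    return True

print(independent(P))  # True for these data
```

Failures of such tests falsify the non-interactive combination rule before any scale values are estimated.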
Conjoint measurement has been widely used in marketing and other applications where
people are asked to judge objects with multiple attributes.40-42 For example, fuel efficiency
(mpg) and engine power (rpm) of cars both contribute to people's judgment of their value
or desirability. Conjoint measurements have also been used in psychophysical applications.
In classic examples from audition, the two attributes were the loudness of a
stimulus in the two ears.43,44 Ideas from conjoint measurement have recently been applied
to vision45 to examine the perceptual combination of different levels of gloss (shiny appearance)
and surface texture.
Scaling 209
The earlier sections of this chapter focused on understanding the internal representations
of stimuli that differ along a single dimension, such as a series of line lengths of
increasing magnitude. However, internal perceptual representations are often multidimensional.
Multidimensional scaling (MDS) is a set of statistical procedures or algorithms
that position stimuli within a multidimensional geometric representation that
expresses the psychological distances between the stimuli, based on dissimilarity judgments
or other measures of proximity or distance. MDS algorithms explain many
pairwise measures of similarity or dissimilarity by estimating the locations of the stimuli
within a multidimensional space. An MDS solution provides a compact interpretation
of the pairwise data and allows predictions about all stimulus combinations from a small
set of estimated parameters.
Metric MDS assumes a particular distance metric, so that distances between items in
the internal representation are computed as Euclidean, city-block, or some other Minkowski
distance. The Minkowski distance between two points (x_1, x_2, ..., x_n) and (y_1, y_2, ..., y_n) is

\[ d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}. \]
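The two most common special cases are p = 1 (city-block) and p = 2 (Euclidean). A minimal Python check of the formula:

```python
def minkowski(x, y, p):
    """Minkowski distance between two equal-length points."""
    return sum(abs(xi - yi) ** p for xi, yi in zip(x, y)) ** (1.0 / p)

x, y = (0.0, 0.0), (3.0, 4.0)
print(minkowski(x, y, 1))  # city-block distance: 7.0
print(minkowski(x, y, 2))  # Euclidean distance: 5.0
```

As p grows, the distance is increasingly dominated by the single largest coordinate difference.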
Judged similarities between pairs of colored lights (wavelengths in nanometers):

      434  445  465  472  490  504  537  555  584  600  610  628  651  674
434  1.00 0.86 0.42 0.42 0.18 0.06 0.07 0.04 0.02 0.07 0.09 0.12 0.13 0.16
445  0.86 1.00 0.50 0.44 0.22 0.09 0.07 0.07 0.02 0.04 0.07 0.11 0.13 0.14
465  0.42 0.50 1.00 0.81 0.47 0.17 0.10 0.08 0.02 0.01 0.02 0.01 0.05 0.03
472  0.42 0.44 0.81 1.00 0.54 0.25 0.10 0.09 0.02 0.01 0.00 0.01 0.02 0.04
490  0.18 0.22 0.47 0.54 1.00 0.61 0.31 0.26 0.07 0.02 0.02 0.01 0.02 0.00
504  0.06 0.09 0.17 0.25 0.61 1.00 0.62 0.45 0.14 0.08 0.02 0.02 0.02 0.01
537  0.07 0.07 0.10 0.10 0.31 0.62 1.00 0.73 0.22 0.14 0.05 0.02 0.02 0.00
555  0.04 0.07 0.08 0.09 0.26 0.45 0.73 1.00 0.33 0.19 0.04 0.03 0.02 0.02
584  0.02 0.02 0.02 0.02 0.07 0.14 0.22 0.33 1.00 0.58 0.37 0.27 0.20 0.23
600  0.07 0.04 0.01 0.01 0.02 0.08 0.14 0.19 0.58 1.00 0.74 0.50 0.41 0.28
610  0.09 0.07 0.02 0.00 0.02 0.02 0.05 0.04 0.37 0.74 1.00 0.76 0.62 0.55
628  0.12 0.11 0.01 0.01 0.01 0.02 0.02 0.03 0.27 0.50 0.76 1.00 0.85 0.68
651  0.13 0.13 0.05 0.02 0.02 0.02 0.02 0.02 0.20 0.41 0.62 0.85 1.00 0.76
674  0.16 0.14 0.03 0.04 0.00 0.01 0.00 0.02 0.23 0.28 0.55 0.68 0.76 1.00
Figure 7.5
MDS solution for the similarity judgment data in Ekman50 for colored lights. (a) Two-dimensional configuration
obtained by MDS. (b) A graph of the relation between judged similarities and corresponding Euclidean distances
between points in the MDS solution.
Display 7.3 enters the similarity data used by Shepard and calls the MATLAB
multidimensional scaling function mdscale, specifying a 2D solution, to recover the
representation of the 14 colored lights from Ekman's data. The solution is expressed as
14 (x_i, y_i) locations in a two-dimensional coordinate system. The analysis closely
recapitulates Shepard's,49 with essentially identical results. What was a demanding
project in 1980 is now routine due to the development of faster computer hardware
and algorithms.
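For readers without MATLAB, the core of metric MDS can be sketched in a few lines of Python using classical (Torgerson) scaling, which double-centers the squared distances and takes the top eigenvectors of the resulting matrix. The four-point configuration below is a hypothetical test case, not Ekman's data:

```python
import numpy as np

def classical_mds(D, dims=2):
    """Classical (Torgerson) MDS: recover coordinates from a matrix of
    pairwise distances D, up to rotation and reflection."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)            # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:dims]       # keep the top `dims` dimensions
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

# Distances among the corners of a unit square (hypothetical test case).
pts = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

X = classical_mds(D, dims=2)
D_hat = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
print(np.allclose(D, D_hat))  # True: pairwise distances are reproduced
```

Because only distances constrain the solution, the recovered configuration matches the original square only up to rotation, reflection, and translation, the same indeterminacy discussed for MDS solutions below.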
Another good example of the application of MDS is the shape-variation study of
Cutzu and Edelman51 (figure 7.6). They studied the perception of several sets of artificial
animal-like shapes created from generalized cylinders in different configurations (figure
7.6a).
The different artificial animal stimulus sets were created with distinct kinds of spacing
within the 2D variations of shape along physical dimensions. All these animal-like shapes
had the same parts, modeled by generalized cylinders, and were controlled by 70 parameters
that determined the geometry of the parts and their mutual arrangement via nonlinear
functions. MDS did an excellent job of recovering the relationships in all these different
sets using empirical similarity scores collected in several different ways. The empirical
methods included pairwise ratings, comparisons of one pair to another pair, long-term
memory confusions, and delayed match-to-sample tests. These tests served as a validation
of the MDS method.
Display 7.3
figure;
for i = 1:14
    plot(Representation(i, 1), -Representation(i, 2), 'kx');
    text(Representation(i, 1), -Representation(i, 2), ...
        num2str(Wavelength(i)));
    hold on;
end
plot([Representation(:, 1); Representation(1, 1)], ...
     -[Representation(:, 2); Representation(1, 2)], 'k-');
axis('square');
axis('off');

sdRelation = [];
for i = 1:14
    for j = 1:14
        Distance = sqrt( ...
            (Representation(i, 1) - Representation(j, 1)).^2 ...
            + (Representation(i, 2) - Representation(j, 2)).^2);
        sdRelation = [sdRelation; Distance, ...
            Judged_similarities(i, j)];
    end
end

figure;
plot(sdRelation(:, 1), sdRelation(:, 2), 'ko');
axis('square');
xlabel('Obtained Euclidean distance', 'Fontsize', 24, ...
    'Fontname', 'Arial');
ylabel('Similarity between colors', 'Fontsize', 24, ...
    'Fontname', 'Arial');
MDS algorithms recover two things. The first, and primary, is the positions of all
the tested stimuli within the multidimensional space (figure 7.6), here for the two artificial
animal sets with different patterns of similarity between objects. The second is the
functional relationship, or transfer function, between the internal distances and the
behavioral measurement, such as rated similarity or rate of confusion errors. The
quality of the model in relation to the data is summarized by an index of stress, a
fidelity criterion that weights the errors between predicted and observed distances.
Often, the dimensionality of the internal representation is unknown when applying MDS
to a new experimental example. An MDS solution with a smaller dimensionality is generally
preferred, as it provides an account with fewer estimated parameters in the representation
and also is more easily visualized and interpreted.52 For example, n objects located
in 2D space require 2n parameters and are visualized by locations in a 2D plane, whereas
n objects in a three-dimensional (3D) space require 3n parameters and are visualized in
3D projections; and so on for higher-dimensional spaces. Dimensions are added or subtracted
from the MDS solution until the best solution with the smallest number of dimensions
is identified. Because MDS depends upon distances only, there are many equivalent
solutions that differ in orientation in the dimensional space. The spatial configuration
can sometimes be rotated or realigned to find a set of axes along which the stimulus
variations make the dimensions easier to interpret.
The number of dimensions in an MDS solution can be known only approximately.
Indeed, MDS is viewed by some researchers primarily as a simple form of data
reduction: a solution that takes a large set of pairwise data and displays it in terms of a
small number of locations in a small number of dimensions. In practice, researchers find
the minimum dimensionality that usefully describes the data and can be interpreted. This
recovered dimensionality may only approximate the relevant psychological dimensions in
a particular experimental situation.
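In practice the choice is often guided by computing a stress index for solutions of increasing dimensionality and looking for the point where added dimensions stop reducing it. A Python sketch of one common index, Kruskal's stress-1, computed from lists of observed and model distances (the numbers are hypothetical):

```python
import math

def stress1(d_obs, d_hat):
    """Kruskal's stress-1: square root of the sum of squared errors
    between observed and model distances, normalized by the sum of
    squared model distances."""
    num = sum((o - h) ** 2 for o, h in zip(d_obs, d_hat))
    den = sum(h ** 2 for h in d_hat)
    return math.sqrt(num / den)

d_obs = [1.0, 2.0, 2.2, 3.1]            # hypothetical observed distances
print(stress1(d_obs, d_obs))            # perfect fit -> 0.0
print(stress1(d_obs, [1.1, 1.9, 2.4, 3.0]))  # small misfit -> small stress
```

Plotting stress against dimensionality and choosing the "elbow" is the usual heuristic; it approximates, rather than determines, the psychologically relevant dimensionality.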
MDS is more complex than unidimensional scaling because it requires the choice of a
distance metric, such as Euclidean or city-block, and the determination of the dimensional-
ity of the recovered similarity space. The MDS method in common metrics such as
Euclidean or city-block implicitly makes assumptions of independent and equal variance
Figure 7.6
MDS analysis of similarity data on sets of artificial animal-like shapes. (a) The four 2D parameter-space
configurations illustrating the similarity patterns built into the experimental stimuli used by Cutzu and Edelman.
(b) The 2D MDS solution for all subjects in the pairwise comparison experiment. (c) The 2D MDS solution for
all subjects in the long-term memory experiment (after Cutzu and Edelman,51 figures 1 and 2).
in internal representations, and the distances in the solutions are de facto scaled as signal-
to-noise ratios.
This brief introduction to MDS only touches on a vast set of technical developments.
Classical MDS led to several more complicated implementations designed to incorporate,
for example, the variations between the ratings of different observers46 or the use of only
ordinal properties of the data.48 The original MDS algorithms were deterministic and
ignored variability in the data and the model. Probabilistic MDS53,54 incorporated a more
serious treatment of error variation. Some of these methods were recently extended using
Bayesian estimation algorithms.55-58 The probabilistic and Bayesian approaches may relax
the assumptions of independence and equal variance. However, data from typical paradigms
may not provide sufficient constraints to model internal noise effectively. For some
kinds of stimuli, such as word meanings or knowledge spaces, non-geometric representations
such as graphs or tree structures may provide an alternative and better representation
of the relationships between objects embodied in the data.59-63
Neurophysiology and brain imaging provide a new window into the relationship between
the physical stimulus and the internal representations. They may also identify the brain
activity associated with making a decision. The field is just beginning to see an interactive
development of brain-imaging data and methods of scaling.
For example, consider the internal response to stimuli varying in intensity, the contrast
response function. Single-unit recordings measure the contrast response function of a
single neuron.64-68 Multi-electrode arrays measure the simultaneous responses of hundreds
of neurons.69-72 fMRI measures the contrast response function of a cortical area.73-76 EEG
and MEG measure population responses.77-79 All these measurements provide evidence of
internal neural representations of stimulus intensity within certain stages of neural
processing. Exploring the relationship between these neural functions and the psychological
scales is a new direction in neuropsychophysics.
Another important direction will be to understand the relationship between the internal
representation and the overt response: the linking, or transfer, function. The transfer
function is the relationship between the internal representation and the magnitude estimate
in direct scaling, or the relationship between rated similarity and internal distances
in MDS. Recent neuropsychological and imaging studies are beginning to address these
issues.74,80-83
Multivariate pattern analysis in fMRI and emerging methods in analyzing data from
multi-array neural recordings offer neural measures of similarity of internal representations
for different stimuli and brain regions.69,70,84 Exploring the relationship between neural
similarity measures and MDS and other related scaling technologies represents a major
direction in cognitive neuroscience. For example, Haushofer, Livingstone, and Kanwisher85
used fMRI to measure the physical, behavioral, and neural similarity between pairs of
novel shapes. They obtained perceptual similarity measures for each pair of shapes in a
psychophysical same/different task, physical similarity measures from the stimulus parameters,
and neural similarity measures from multivoxel pattern analysis applied to the
anterior and posterior lateral occipital complex (LOC). Pattern analysis of the similarity
between the responses to two stimuli in the same locations or brain regions found that the
pattern of pairwise shape similarities in the posterior LOC is highly correlated with physical
shape similarities, whereas shape similarities in the anterior LOC most closely matched
perceptual shape similarities. Results from such experiments could be used to specify
stimulus manipulations and critical validations for neuropsychophysical investigations and
to identify the brain regions that underlie the psychological scales. In turn, results from
neuropsychophysical investigations could provide an important basis for further investigation
of multidimensional representations of visual stimuli.
7.5 Summary
Scaling methods were among the earliest of the empirical methods applied in psychophysics,
dating back to the end of the nineteenth century. The goal of the scaling
enterprise is to quantify perceptual experience. In this chapter, we reviewed early
advances in the development of perceptual scales and the extension of methods of
scaling over large stimulus variations. We treated the scaling of multidimensional
stimuli along with some key examples. We also pointed to alternative developments in
measurement theory. Finally, we considered the amazing possibilities for future research
in neuropsychophysics, which combines new technologies in brain imaging with the power
of quantitative scaling methods. Together, these approaches have the potential to identify
and specify the translation from stimuli to internal representations and from internal
representations to observer responses. Understanding these relationships may provide
some of the keys that unlock our further understanding of the brain mechanisms of
information processing.
References
8. Scheerer E. Fechner's inner psychophysics: Its historical fate and present status. In: Geissler HG, Link SW, Townsend JT, eds. Cognition, Information Processing and Psychophysics. Hillsdale, NJ: Lawrence Erlbaum; 1992: pp. 3-22.
9. Stevens SS. 1946. On the theory of scales of measurement. Science 103: 677-680.
10. Stevens SS. 1957. On the psychophysical law. Psychol Rev 64: 153-181.
11. Pfanzagl J. Theory of measurement. New York: John Wiley & Sons; 1968.
12. Narens L. 1996. A theory of ratio magnitude estimation. J Math Psychol 40: 109-129.
13. Falmagne JC. Elements of psychophysical theory. Oxford: Oxford University Press; 2002.
14. Green DM, Luce RD, Duncan JE. 1977. Variability and sequential effects in magnitude production and estimation of auditory intensity. Atten Percept Psychophys 22: 450-456.
15. Krantz DH. 1972. A theory of magnitude estimation and cross-modality matching. J Math Psychol 9: 168-199.
16. Stevens SS. Psychophysics: Introduction to its perceptual, neural, and social prospects. Piscataway, NJ: Transaction Publishers; 1975.
17. Krantz DH. 1972. A theory of magnitude estimation and cross-modality matching. J Math Psychol 9: 168-199.
18. Weber EH. De Pulsu, resorptione, auditu et tactu: Annotationes anatomicae et physiologicae. Leipzig: CF Koehler; 1834.
19. Luce RD, Galanter E. Discrimination. In: Luce RD, Bush RR, Galanter E, eds. Handbook of mathematical psychology, Vol. 1. New York: Wiley; 1963: pp. 191-243.
20. Foley JM, Legge GE. 1981. Contrast detection and near-threshold discrimination in human vision. Vision Res 21: 1041-1053.
21. Wilson HR, Humanski R. 1993. Spatial frequency adaptation and contrast gain control. Vision Res 33: 1133-1149.
22. Lu ZL, Sperling G. 1996. Contrast gain control in first- and second-order motion perception. JOSA A 13: 2305-2318.
23. Watson AB, Solomon JA. 1997. Model of visual contrast gain control and pattern masking. J Opt Soc Am A 14: 2379-2391.
24. Thurstone LL. 1927. A law of comparative judgment. Psychol Rev 34: 273-286.
25. Reeves A, Sperling G. 1986. Attention gating in short-term visual memory. Psychol Rev 93: 180-206.
26. Galanter E, Messick S. 1961. The relation between category and magnitude scales of loudness. Psychol Rev 68: 363-372.
27. Woods RL, Satgunam PN, Bronstad PM, Peli E. 2010. Statistical analysis of subjective preferences for video enhancement. Human Vision and Electronic Imaging XV, vol. 7527, article 14.
28. Hand D. Measurement theory and practice: The world through quantification. London: Arnold Publishers; 2004.
29. Michell J. Measurement in psychology: Critical history of a methodological concept. Cambridge, UK: Cambridge University Press; 1999.
30. Narens L. Theories of meaningfulness. Hillsdale, NJ: Lawrence Erlbaum; 2002.
31. Ashby F. 1991. Book reviews: Foundations of Measurement, Volumes II and III. Appl Psychol Meas 15: 103-108.
32. Díez JA. 1997. A hundred years of numbers. An historical introduction to measurement theory 1887-1990. Part I: The formation period. Two lines of research: Axiomatics and real morphisms, scales and invariance. Studies in History and Philosophy of Science Part A 28: 167-185.
33. Krantz DH, Suppes P, Luce RD. Foundations of measurement, Vol. 1. New York: Academic Press; 1971.
34. Suppes P, Krantz DM, Luce RD, Tversky A. Foundations of measurement, Vol. 2: Geometrical, threshold, and probabilistic representations. New York: Academic Press; 1989.
35. Luce RD, Krantz DH, Suppes P, Tversky A. Foundations of measurement, Vol. 3: Representation, axiomatization, and invariance. New York: Academic Press; 1990.
36. Narens L. Abstract measurement theory. Cambridge, MA: The MIT Press; 1985.
37. Luce RD, Tukey JW. 1964. Simultaneous conjoint measurement: A new type of fundamental measurement. J Math Psychol 1: 1-27.
38. Gustafsson A, Herrmann A, Huber F. Conjoint measurement: Methods and applications. Berlin: Springer-Verlag; 2007.
39. Krantz DH, Tversky A. 1971. Conjoint-measurement analysis of composition rules in psychology. Psychol Rev 78: 151-169.
40. Steenkamp JBEM. 1987. Conjoint measurement in ham quality evaluation. J Agric Econ 38: 473-480.
41. Reid GB, Shingledecker CA, Eggemeier FT. 1981. Application of conjoint measurement to workload scale development. Proceedings of the 1981 Human Factors Society Annual Meeting, 522-526.
42. Green PE, Rao VR. 1971. Conjoint measurement for quantifying judgmental data. J Mark Res 8(3): 355-363.
43. Falmagne JC, Iverson G, Marcovici S. 1979. Binaural loudness summation: Probabilistic theory and data. Psychol Rev 86: 25-43.
44. Falmagne JC, Iverson G. 1979. Conjoint Weber laws and additivity. J Math Psychol 20: 164-183.
45. Ho YX, Landy MS, Maloney LT. 2008. Conjoint measurement of gloss and surface texture. Psychol Sci 19: 196-204.
46. Carroll JD, Chang JJ. 1970. Analysis of individual differences in multidimensional scaling via an N-way generalization of Eckart-Young decomposition. Psychometrika 35: 283-319.
47. Borg I, Lingoes JC. Multidimensional similarity structure analysis. Berlin: Springer-Verlag; 1987.
48. Borg I, Groenen PJF. Modern multidimensional scaling: Theory and applications. Berlin: Springer-Verlag; 2005.
49. Shepard RN. 1980. Multidimensional scaling, tree-fitting, and clustering. Science 210: 390-398.
50. Ekman G. 1954. Dimensions of color vision. J Psychol 38: 467-474.
51. Cutzu F, Edelman S. 1996. Faithful representation of similarities among three-dimensional shapes in human vision. Proc Natl Acad Sci USA 93: 12046-12050.
52. Lee MD. 2001. Determining the dimensionality of multidimensional scaling representations for cognitive modeling. J Math Psychol 45: 149-166.
53. MacKay DB. 1989. Probabilistic multidimensional scaling: An anisotropic model for distance judgments. J Math Psychol 33: 187-205.
54. MacKay DB, Zinnes JL. 1986. A probabilistic model for the multidimensional scaling of proximity and preference data. Mark Sci 5(4): 325-344.
55. Oh MS, Raftery AE. 2001. Bayesian multidimensional scaling and choice of dimension. J Am Stat Assoc 96: 1031-1044.
56. Okada K, Shigemasu K. 2010. Bayesian multidimensional scaling for the estimation of a Minkowski exponent. Behav Res Methods 42: 899-905.
57. Lee MD. 2008. Three case studies in the Bayesian analysis of cognitive models. Psychon Bull Rev 15: 1-15.
58. Okada K, Shigemasu K. 2009. BMDS: A collection of R functions for Bayesian multidimensional scaling. Appl Psychol Meas 33: 570-571.
59. Buneman OP. The recovery of trees from measures of dissimilarity. In: Hodson FR, Kendall DG, Tautu PT, eds. Mathematics in the archaeological and historical sciences. Edinburgh: Edinburgh University Press; 1971: pp. 387-395.
60. Pruzansky S, Tversky A, Carroll JD. 1982. Spatial versus tree representations of proximity data. Psychometrika 47: 3-24.
61. Carroll JD, Pruzansky S. 1975. Fitting of hierarchical tree structure (HTS) models, mixtures of HTS models and hybrid models, via mathematical programming and alternating least squares. Proceedings of the U.S.-Japan Seminar on Multidimensional Scaling, 9-19.
62. Corter JE. Tree models of similarity and association. Thousand Oaks, CA: Sage Publications; 1996.
63. Falmagne JC, Koppen M, Villano M, Doignon JP, Johannesen L. 1990. Introduction to knowledge spaces: How to build, test, and search them. Psychol Rev 97: 201-224.
64. Tolhurst DJ, Movshon JA, Thompson ID. 1981. The dependence of response amplitude and variance of cat visual cortical neurons on stimulus contrast. Exp Brain Res 41: 414-419.
65. Albrecht DG, Hamilton DB. 1982. Striate cortex of monkey and cat: Contrast response function. J Neurophysiol 48(1): 217-237.
66. Albrecht DG, Geisler WS. 1991. Motion selectivity and the contrast-response function of simple cells in the visual cortex. Vis Neurosci 7: 531-546.
67. Sclar G, Maunsell JHR, Lennie P. 1990. Coding of image contrast in central visual pathways of the macaque monkey. Vision Res 30: 1-10.
68. Tolhurst D, Movshon J, Thompson I. 1981. The dependence of response amplitude and variance of cat visual cortical neurones on stimulus contrast. Exp Brain Res 41: 414-419.
69. Cohen MR, Kohn A. 2011. Measuring and interpreting neuronal correlations. Nat Neurosci 14: 811-819.
70. Averbeck BB, Latham PE, Pouget A. 2006. Neural correlations, population coding and computation. Nat Rev Neurosci 7: 358-366.
71. Smith MA, Kohn A. 2008. Spatial and temporal scales of neuronal correlation in primary visual cortex. J Neurosci 28: 12591-12603.
72. Montani F, Kohn A, Smith MA, Schultz SR. 2007. The role of correlations in direction and contrast coding in the primary visual cortex. J Neurosci 27: 2338-2348.
73. Li X, Lu ZL, Tjan BS, Dosher BA, Chu W. 2008. Blood oxygenation level-dependent contrast response functions identify mechanisms of covert attention in early visual areas. Proc Natl Acad Sci USA 105: 6202-6207.
74. Boynton GM, Demb JB, Glover GH, Heeger DJ. 1999. Neuronal basis of contrast discrimination. Vision Res 39: 257-269.
75. Buracas GT, Boynton GM. 2007. The effect of spatial attention on contrast response functions in human visual cortex. J Neurosci 27: 93-97.
76. Gardner JL, Sun P, Waggoner RA, Ueno K, Tanaka K, Cheng K. 2005. Contrast adaptation and representation in human early visual cortex. Neuron 47: 607-620.
77. Nunez PL, Srinivasan R. Electric fields of the brain: The neurophysics of EEG. Oxford: Oxford University Press; 2006.
78. Lu ZL, Kaufman L. Magnetic source imaging of the human brain. New York: Psychology Press; 2003.
79. Hansen PC, Kringelbach ML, Salmelin R. MEG: An introduction to methods. Oxford: Oxford University Press; 2010.
80. Ress D, Backus BT, Heeger DJ. 2000. Activity in primary visual cortex predicts performance in a visual detection task. Nat Neurosci 3: 940-945.
81. Ress D, Heeger DJ. 2003. Neuronal correlates of perception in early visual cortex. Nat Neurosci 6: 414-420.
82. Gold JI, Shadlen MN. 2007. The neural basis of decision making. Annu Rev Neurosci 30: 535-574.
83. Shadlen MN, Newsome WT. 1998. The variable discharge of cortical neurons: Implications for connectivity, computation, and information coding. J Neurosci 18: 3870-3896.
84. Kriegeskorte N, Mur M, Bandettini P. 2008. Representational similarity analysis: connecting the branches of systems neuroscience. Front Syst Neurosci 2: 4.
85. Haushofer J, Livingstone MS, Kanwisher N. 2008. Multivariate patterns in object-selective cortex dissociate perceptual and physical shape similarity. PLoS Biol 6: e187.
8 Sensitivity and Signal-Detection Theory
This chapter treats the measurement of visual sensitivity, or how we detect threshold or
near-threshold stimuli. It begins with a brief review of the classical methods of measuring
thresholds. It then presents the signal-detection theory framework and an analysis of
several classical and modern procedures. The chapter ends with the signal-detection
approach to more complex situations involving inputs from multiple detectors and tasks
that involve stimuli in multidimensional space.
This chapter treats many of the most common methods for measuring sensitivity in
a single condition. In later chapters, we consider how multiple measurements in several
conditions are combined to estimate sensitivity functions or sensitivity surfaces over
several dimensions of stimulus variation. Signal-detection theory8-12 provides the theoretical
framework that separates decision factors (bias, etc.) from sensitivity, and it is used in
this chapter to analyze and understand the different paradigms for the measurement of
sensitivity and threshold.
Three of the most common methods first introduced to measure threshold are the method
of limits, the method of adjustment, and the method of constant stimuli.13,14 These methods
date to the turn of the past century. In this section, we discuss each of these in turn.
The method can also be used to measure difference thresholds. Starting with a very
small or no difference between two visible stimuli, one can gradually increase the difference
until it becomes visible. Or, starting with a large difference between two stimuli,
one can gradually reduce the difference until it becomes invisible.
Display 8.1
%% Experimental Module
p.stimSize = [6 6];     % horizontal and vertical stimulus
                        % size in visual angle
p.stimDuration = 0.1;   % stimulus duration in seconds
p.sf = 2;               % spatial frequency in cycles/degree
p.ITI = 0.5;            % seconds between trials
if nargin < 1, descending = true; end
if descending
    respKey = 'down';
    p.startContrast = 0.02;  % starting visible grating contrast
    inc = -0.001;            % contrast increment
else
    respKey = 'up';
    p.startContrast = 0;     % starting invisible grating contrast
    inc = 0.001;
end
keys = {respKey 'esc'};      % allowed response keys
Figure 8.1
The method of limits. (a) A descending sequence. (b) An ascending sequence. Filled circles indicate "visible"
responses; open circles indicate "invisible" responses. The horizontal lines indicate the true threshold. (Axes:
stimulus contrast as a function of the number of trials.)
stimulus is visible. In each case, the sensory stimulation is one ingredient in performance,
but the decision about how much evidence is required, the criterion for decision, is
another.
The method of limits and the method of adjustment are both useful for getting an approximate
threshold measurement relatively quickly, but these estimates can be systematically
affected by how the observer chooses to make decisions. The individual could require that
the stimulus be clearly visible before he or she gives a "visible" response. Such a high
criterion shifts the estimate of the threshold higher. Or, the individual could give a "visible"
response whenever there is a hint of visibility. Such a low criterion shifts the estimate of
the threshold lower.
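This criterion effect is easy to demonstrate in simulation: give a model observer noisy internal responses whose mean grows with contrast, have the observer report "visible" whenever the response exceeds a criterion, and run a descending sequence. A Python sketch, with all parameter values (gain, noise level, starting contrast) hypothetical:

```python
import random

def descending_threshold(criterion, gain=100.0, noise=1.0, seed=1):
    """Run one descending method-of-limits sequence: start at a visible
    contrast and lower it until the first 'invisible' response, i.e.,
    the first trial on which the internal response falls below the
    observer's criterion."""
    rng = random.Random(seed)
    contrast = 0.05
    while contrast > 0:
        internal = gain * contrast + rng.gauss(0.0, noise)
        if internal < criterion:      # first "invisible" response
            return contrast
        contrast -= 0.001
    return 0.0

# Same noise sequence (same seed), different criteria.
low = descending_threshold(criterion=1.0)
high = descending_threshold(criterion=3.0)
print(low, high)  # the stricter criterion stops at a higher (or equal) contrast
```

With identical noise, the run with the stricter criterion can never continue past the point where the lax run stops, so its threshold estimate is at least as high, which is exactly the bias described above.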
Psychologists have developed a formal theory, signal-detection theory, to analyze and
separate the sensory and the decision factors in responses.8-12 The application of these
methods can improve the estimation of the sensory threshold separately from other
properties of the observer.
Display 8.2
%% Experimental Module
and the observer responds "no," this is a miss. If the stimulus was not presented but the
observer responds "yes," this is a false alarm. If the stimulus was not presented and the
observer responds "no," this is a correct rejection. The proportions of hits and false
alarms summarize the data. The proportions of misses and correct rejections are not
independent quantities but are directly related to hits and false alarms: the proportion of
misses = 1 - the proportion of hits, and the proportion of correct rejections = 1 - the
proportion of false alarms.
The primary assumption of SDT is that the strength of the internal representation of a
stimulus is not identical on each trial, but is variable. That is, the observer's internal
response to the stimulus is not exactly repeatable, but is noisy, presumably due to
randomness in neural responses.8-12 The distribution of internal responses on trials where the
Figure 8.2
The method of adjustment. Filled circles indicate "target higher" responses; open circles indicate "target lower"
responses. The horizontal line indicates the true matching contrast. (Axes: stimulus contrast as a function of the
number of trials.)
stimulus is not presented is the "noise" distribution, while the distribution of internal
responses on trials where the stimulus is presented is the "signal + noise" distribution. The
signal + noise distribution has a higher mean. (In this labeling, the noise refers to noise
or variability in the internal response, not to external masking noise that is part of the
physical stimulus.)
In most cases, the noise and the signal + noise distributions overlap. So, the particular
internal response value or strength sensed by the observer on a given trial could arise either
from a signal-present or a signal-absent stimulus presentation. Signal-present trials are
more likely to produce larger values in the internal representation.
Hit and false alarm rates provide key data for testing and applying SDT. A simple detec-
tion task measures performance for a single contrast. On each trial of an example experi-
ment, the observer first sees a fixation frame followed after 150 ms by a very brief
presentation of the low-contrast Gabor (signal) or a blank frame (noise or blank).
Observers participate in 200 trials with the signal Gabor appearing randomly in half of
230 Chapter 8
Display 8.3
%% Experimental Module
p.contrasts = logspace(log10(conRange(1)), ...
    log10(conRange(2)), nContrast);
% use logspace function to choose nContrast contrasts
% (7) at log intervals between the minimum and
% maximum specified contrasts
Table 8.1
Definitions of hit, miss, false alarm, and correct rejection

                      Respond "yes"     Respond "no"
Stimulus presented    Hit               Miss
Stimulus absent       False alarm       Correct rejection
the trials. On each trial, the observer says "signal present" or "yes" if he or she
believes the Gabor was present in the display and "signal absent" or "no" otherwise, and
we tabulate the number of hits, misses, false alarms, and correct rejections. Display 8.4
shows a simple program that carries out this experiment and tabulates the responses. The
data from one observer are shown in table 8.2. Further SDT treatment of the data is considered
in section 8.3.3, along with the resulting response table.
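The trial loop and tabulation described here can also be sketched outside MATLAB. The following Python analog (with hypothetical d′, criterion, and seed values; the book's own program is the MATLAB display 8.4) simulates an equal-variance SDT observer in this Yes/No task and builds the four response counts:

```python
import random

def simulate_yes_no(d_prime=1.0, criterion=0.5, n_trials=200, seed=1):
    """Simulate an equal-variance SDT observer in a Yes/No detection task:
    signal on a random half of trials, respond "yes" whenever the noisy
    internal response exceeds the criterion."""
    rng = random.Random(seed)
    trials = [True] * (n_trials // 2) + [False] * (n_trials // 2)
    rng.shuffle(trials)  # signal Gabor appears randomly in half the trials
    counts = {"hit": 0, "miss": 0, "fa": 0, "cr": 0}
    for signal in trials:
        x = rng.gauss(d_prime if signal else 0.0, 1.0)  # internal response
        yes = x > criterion
        if signal:
            counts["hit" if yes else "miss"] += 1
        else:
            counts["fa" if yes else "cr"] += 1
    return counts

counts = simulate_yes_no()
print(counts)
```

Because hits + misses and false alarms + correct rejections each exhaust their trial type, only the hit and false alarm counts carry independent information, as noted above.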
Display 8.4
%% Experimental Module
Table 8.2
Frequency and percentage (in parentheses) of hits, misses, false alarms, and correct rejections from an observer

                      Respond "yes"     Respond "no"
Signal presented      80 (80%)          20 (20%)
Signal absent         30 (30%)          70 (70%)
[Figure 8.3 panels: (a) N and S + N internal response distributions over internal response, with the criterion marking miss, hit, correct rejection (CR), and false alarm (FA) regions; (b) ROC, hit rate versus false alarm rate; (c) ROC on z(hit rate) versus z(false alarm rate) axes.]
Figure 8.3
SDT in a Yes/No task. (a) The two distribution curves in each panel represent internal response distributions
for signal present (S + N) and for signal absent (N) trials. A subjective criterion determines whether the observer
reports that the signal is present (Yes) or not (No). Internal responses greater than the criterion lead the
observer to decide that the signal is present. The probabilities of hit, false alarm, miss, and correct rejection
responses, as indicated by the shaded areas, are jointly determined by the internal response distributions and the
criterion. (b) The ROC plots the hit rate as a function of the false alarm rate as the observer varies his or her
criterion. (c) ROC shown transformed in a zz plot.
The proportions of false alarms and hits are the complements of the proportions of correct rejections and
misses, respectively:

P_FA(c) = 1 − P_CR(c) = 1 − G(c/σ_N, 0, 1),   (8.4)

P_Hit(c) = 1 − P_Miss(c) = 1 − G((c − μ_S+N)/σ_S+N, 0, 1).   (8.5)
Assuming σ = σ_S+N = σ_N, the SDT parameters d′ and c can be estimated from the observed
hits and false alarms:

d′ = μ_S+N/σ = z(P_Hit) − z(P_FA),   (8.6)

c = −σ z(P_FA).   (8.7)

Here, the mean of the stimulus-absent or noise distribution, μ_N, is set to 0. The z refers to
the z-score of the corresponding probability; it converts a cumulative probability for the
normal distribution into units of standard deviation.
In SDT, the measure of discrimination d′ is described as a pure index of discrimination
sensitivity, while criterion c estimates the decision bias. Discrimination d′ in SDT
provides a true measure of the sensitivity of the observer independent of criterion. (Note,
this use of the word sensitivity is related to but not the same as the sensitivity that is esti-
mated as the inverse of threshold in other contexts in psychophysics.)
To see how SDT is implemented in the example, consider again the response tabulation
in table 8.2 for the simple low-contrast Gabor detection experiment in section 8.3.2. We
can transform the observed proportions of hits and false alarms (or correspondingly of
misses and correct rejections) into z-scores with the MATLAB function norminv. These
z-scores are used to compute the observed discriminability d′ and the criterion c. For this
example, with a hit rate of 80% and a false alarm rate of 30%, corresponding to an overall
percent correct of 75%, the estimated discriminability d′ and criterion c are 1.366 and
0.524, respectively. These calculations are shown in display 8.5. Alternatively, MATLAB
has a function dprime that performs these computations ([d,c]=dprime(pHit,pFA)).
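The same two-line computation, equations 8.6 and 8.7 with σ = 1, can be sketched in Python using the standard library in place of norminv (the function name dprime_yesno is ours, not the book's):

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # z-score of a cumulative probability (norminv analog)

def dprime_yesno(p_hit, p_fa):
    """Equal-variance SDT estimates from Yes/No data (equations 8.6 and 8.7),
    with the noise mean at 0 and sigma = 1."""
    d = z(p_hit) - z(p_fa)  # d' = z(P_Hit) - z(P_FA)
    c = -z(p_fa)            # criterion c = -z(P_FA)
    return d, c

d, c = dprime_yesno(0.80, 0.30)
print(round(d, 3), round(c, 3))  # 1.366 0.524, as in display 8.5
```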
To provide an intuition about SDT with different parameters, we use MATLAB func-
tions for the probability density function of the normal distribution, normpdf, in a program
that plots the signal and noise distributions for the d′ and c estimated for this example data,
and we use the function for the cumulative distribution function, normcdf, to show the
corresponding proportions for hits and false alarms (display 8.6). This program, plotsdt.m,
can be used with different values of d′ and c to visualize and better understand the func-
tional relations between these model parameters and the observable response categories.
Display 8.5
Display 8.6
All of these examples assume that the variances of the signal and noise distributions are
the same. We consider unequal variance examples in the following section on receiver
operating characteristics.
[Figure 8.4 panels: (a, d) N and S + N internal response distributions; (b, e) ROC, hit rate versus false alarm rate; (c, f) ROC on z(hit rate) versus z(false alarm rate) axes.]
Figure 8.4
SDT in a Yes/No task for signal and noise distributions with different variances. (a) S + N has a higher vari-
ance than N. (b, c) ROC curves for the distributions in (a). (d) N has a higher variance than S + N. (e, f) ROC
curves for the distributions in (d).
probability axes, in which case the ROC is a concave curve. The ROC can also be graphed
on z ( PHit ) versus z ( PFA ) axes (figure 8.3c). This is convenient because the ROC corre-
sponds to a straight line in zz space for normally distributed internal responses.
The example shown in figure 8.3c assumes a simple case in which the variance of the
internal response on signal-present and signal-absent trials is equal (the equal-variance
SDT). In this case, the slope of the ROC curve on zz axes is 1.
If the variance of the internal responses on signal-present and signal-absent trials is not
equal (the unequal-variance SDT), then the slope of the ROC on zz axes will deviate
from 1. Figure 8.4 shows two cases. If the variance of the internal responses on signal-
present trials is larger, then the slope will be less than 1 (figure 8.4a-c); if the variance of
the internal responses on signal-present trials is smaller, then the slope will be greater than
1 (figure 8.4d-f). The slope of the ROC has been considered a signature feature for certain
perceptual processes16-18 and so has been extensively studied.
Whether or not the variances are equal, larger discrimination d′ values have ROC lines
that are further from the chance diagonal (hit rate = false alarm rate). ROC functions for higher d′s move toward the
upper left of the ROC graph, which corresponds with more hits and fewer false alarms.
The definition of d′ is altered slightly for the unequal-variance case to scale the differ-
ence between the means of the signal + noise and noise distributions by the square root of the
average of the two variances:

d_a = (μ_S+N − μ_N)/√[(σ²_S+N + σ²_N)/2] = √[2/(1 + s²)] [z(P_Hit) − s z(P_FA)],   (8.9)

where s = σ_N/σ_S+N is the slope of the zz ROC.
Our ROC experimental example, RatingExperiment.m (display 8.7), uses a rating
scale variant of the simple Gabor detection experiment from display 8.4. Instead of simple
"yes" and "no" responses, we collect 6-point rating scale responses, corresponding to
high-, middle-, and low-confidence target-present and high-, middle-, and low-confi-
dence target-absent responses. We increase the number of trials to 300 to allow better
estimates of the probabilities of the different response categories. Hypothetical data for
this experiment and the corresponding proportions are shown in table 8.3. Display 8.8a-c
provides an ROC analysis program that estimates the standard deviation of the signal +
noise distribution (the inverse of the slope of the zz ROC curve) and the corresponding
d′ and the five c_i criteria that separate the different rating responses. The results are shown
in figure 8.5. Estimating the slope of the ROC involves issues about how to treat deviations
between predicted and observed proportions. We simply present this estimation in displays
8.8a-c; the issues behind the computation will be considered again in chapter 10 on data
analysis and model fitting.
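A rough version of this estimation can be sketched with a simple least-squares line through the z-transformed cumulative rates. This Python sketch uses the rating counts from display 8.8c and assumes the vectors run from rating 1 (high-confidence "absent") to rating 6 (high-confidence "present"); the book's display 8.8 uses a different fitting procedure, so the values agree only approximately:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf

# Rating counts from display 8.8c; assumed order: rating 1 = high-confidence
# "absent" through rating 6 = high-confidence "present"
s_rating = [7, 15, 27, 35, 20, 46]   # signal-present trials
n_rating = [5, 36, 68, 38, 4, 1]     # signal-absent trials

def z_points(counts):
    """z-scores of cumulative rates, accumulated from the most confident
    'present' end, omitting the final point (cumulative proportion 1.0)."""
    total, acc, pts = sum(counts), 0, []
    for k in reversed(counts):
        acc += k
        pts.append(acc / total)
    return [z(p) for p in pts[:-1]]

z_hit, z_fa = z_points(s_rating), z_points(n_rating)

# Ordinary least-squares fit of the line z_hit = intercept + slope * z_fa
n = len(z_fa)
mx, my = sum(z_fa) / n, sum(z_hit) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(z_fa, z_hit))
         / sum((x - mx) ** 2 for x in z_fa))
intercept = my - slope * mx

sigma_sn = 1 / slope           # sd of the S + N distribution, with sigma_N = 1
d_est = intercept * sigma_sn   # mu_S+N in sigma_N units
print(round(d_est, 2), round(sigma_sn, 2))
```

The zz slope comes out near 0.5, so σ_S+N is near 2, close to the 1.494 and 1.994 reported by ROCanalysis in display 8.8c.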
An alternative index of discriminability based on the measured ROC is A′, the area
under the ROC curve, or

A′ = Σ_{k=1}^{n} (X_k − X_{k−1})(Y_k + Y_{k−1})/2,   (8.10)

where n reflects the number of points on the ROC curve from (0, 0) to (1, 1), and Y_k and
X_k are the hit and false alarm rates (figure 8.5a).
A′ is an alternative measure to d′. It estimates the proportion of times that a randomly
chosen response from the internal signal + noise distribution exceeds a randomly chosen
response from the noise distribution. Researchers may prefer the A′ measure in some
situations because it is independent of the exact distribution assumed or whether the vari-
ances of the two distributions are equal. The value of A′ for the data shown in table 8.3
is 0.7376.
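Equation 8.10 can be checked directly against the display 8.8c rating counts. A Python sketch (assuming the rating vectors run from high-confidence "absent" to high-confidence "present"; the book's analysis is the MATLAB ROCanalysis function):

```python
s_rating = [7, 15, 27, 35, 20, 46]   # signal-present rating counts
n_rating = [5, 36, 68, 38, 4, 1]     # signal-absent rating counts

def roc_points(signal, noise):
    """(false alarm rate, hit rate) pairs from (0, 0) to (1, 1),
    cumulated from the most confident 'present' rating downward."""
    pts, hit, fa = [(0.0, 0.0)], 0, 0
    for s, n in zip(reversed(signal), reversed(noise)):
        hit, fa = hit + s, fa + n
        pts.append((fa / sum(noise), hit / sum(signal)))
    return pts

def area_under_roc(pts):
    """Trapezoidal area under the ROC, equation 8.10."""
    return sum((x1 - x0) * (y1 + y0) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

a_prime = area_under_roc(roc_points(s_rating, n_rating))
print(round(a_prime, 4))  # 0.7376, matching the final ROCanalysis output
```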
The previous section introduced SDT using the Yes/No paradigm. The theoretical frame-
work of SDT is designed to separate discriminability from shifts in criterion. However,
Pc = ∫₀^{+∞} g(x, μ_B − μ_A, √(σ²_A + σ²_B)) dx   (8.11)
   = 1 − G(0, (μ_B − μ_A)/√(σ²_A + σ²_B), 1).
Display 8.7
%% Experimental Module
Equations 8.11 and 8.12 are mathematically equivalent for 2IFC/2AFC. Although equa-
tion 8.11 is more intuitive, equation 8.12 is more readily extended to situations with more
intervals or samples (see section 8.4.2 for n-alternative forced-choice paradigms) or
to situations with decision uncertainty (see section 8.5.2 for a full discussion of
uncertainty).
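In the equal-variance case, equation 8.11 reduces to Pc = G(d′/√2, 0, 1), and the prediction can be checked against a max-rule simulation. A Python sketch with a hypothetical d′ of 1:

```python
import random
from statistics import NormalDist

def pc_2afc_closed(d_prime):
    """Equation 8.11 with equal variances: Pc = G(d'/sqrt(2), 0, 1)."""
    return NormalDist().cdf(d_prime / 2 ** 0.5)

def pc_2afc_simulated(d_prime, n_trials=200_000, seed=7):
    """Max rule: the response is correct whenever the sample from the
    signal distribution exceeds the sample from the noise distribution."""
    rng = random.Random(seed)
    hits = sum(rng.gauss(d_prime, 1.0) > rng.gauss(0.0, 1.0)
               for _ in range(n_trials))
    return hits / n_trials

closed = pc_2afc_closed(1.0)      # 0.7602...
simulated = pc_2afc_simulated(1.0)
print(round(closed, 4), round(simulated, 3))
```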
Display 8.8b
zHits_observed = data(1,:);
zFA_observed = data(2, :);
d = guess(1);
sigma = guess(2);
criteria = guess(3:7);
Display 8.8c
>> Srating = [ 7 15 27 35 20 46 ];
>> Nrating = [ 5 36 68 38 4 1 ];
>> ROC = ROCanalysis(Srating, Nrating)
ROC = 1.494 1.994 2.484 1.831 0.579 -0.6112 -1.842 0.7376
Table 8.3
Data from a hypothetical rating procedure

Rating:                  1     2     3     4     5     6
Signal-present trials:   7    15    27    35    20    46
Signal-absent trials:    5    36    68    38     4     1
from the target stimulus category, such as "stimulus present," and n − 1 stimuli from
the other category, such as "stimulus absent." An advantage of adding intervals or
stimuli is that random guessing corresponds with lower accuracy, and so a larger n
provides a larger range over which to measure performance accuracy. For example, pure
guessing corresponds to 50% correct in 2IFC, but 25% in 4IFC, while high levels of
discrimination approach 100% correct in both. A disadvantage is that the increased
number of alternatives or intervals may pose additional processing or memory load on
the observer.
The n-AFC/n-IFC paradigms that take one sample from category B and n − 1 samples
from category A are simplest to think about within the context of the max rule formulation
introduced for 2AFC. In this case, the correct response occurs when the internal response
to the singular category B stimulus exceeds the internal response to the n − 1 samples of
category A:
[Figure 8.5 panels: (a) ROC in probability coordinates (pHits versus false alarm rate); (b) ROC in z coordinates (zHits versus z(false alarm rate)); (c) probability densities of the signal and noise distributions with the five criteria.]
Figure 8.5
An example of a hypothetical ROC curve shown as (a) a probability plot and (b) a zz plot and (c) graphs of the
signal and noise distributions and the location of the five criteria recovered from the sample data. The mean and
standard deviation of the noise distribution are set to μ_N = 0 and σ_N = 1, which scales the x-axis. In this example, σ_S+N > σ_N,
corresponding with a zz ROC slope of less than 1.
[Figure 8.6 panels: (a) internal response distributions for stimulus categories A and B, with sample values x1 and x2; (b) distribution of the difference between internal responses, with the decision boundary.]
Figure 8.6
SDT in a 2AFC task. A 2AFC task presents two stimuli to an observer in each trial, one from each of two catego-
ries, and forces the observer to decide the correspondence between the stimuli and the categories. (a) The two
bell curves represent the internal response distributions for the two stimulus categories with different means and
standard deviations. The circles represent hypothetical values of x1 and x2 for a given trial. (b) The probability
density distribution of the difference between the internal responses of stimuli in the two categories. The vertical
line indicates the decision boundary.
248 Chapter 8
Pc = ∫_{−∞}^{+∞} g(x, μ_B, σ_B) G^{n−1}(x, μ_A, σ_A) dx.   (8.14)
The probability correct for an n-AFC/n-IFC paradigm can be computed from the under-
lying theoretical discriminability d′ that reflects the difference between the means of the
internal response distributions from the two categories divided by the standard deviation.
The equal-variance case is usually assumed. The underlying discriminability is the key
theoretical measure of sensitivity. This sometimes leads to confusion. The same percent
correct observed in a 2AFC and an n-AFC experiment corresponds to a higher level
of discrimination with larger n. Conversely, the identical underlying d′ discriminability
corresponds to different, lower, observed percent correct performance in designs with
higher n.
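Equation 8.14 can be evaluated numerically to see both points. A Python sketch (midpoint-rule integration, equal variances with μ_A = 0, μ_B = d′, σ = 1, and a hypothetical d′ of 1):

```python
from statistics import NormalDist

nd = NormalDist()

def pc_nafc(d_prime, n, lo=-8.0, hi=12.0, steps=20_000):
    """Midpoint-rule integration of equation 8.14 with equal variances:
    Pc = integral of g(x - d', 0, 1) * G(x, 0, 1)^(n - 1) dx."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += nd.pdf(x - d_prime) * nd.cdf(x) ** (n - 1) * dx
    return total

pc2 = pc_nafc(1.0, 2)  # the same d' gives about 76% correct in 2AFC...
pc4 = pc_nafc(1.0, 4)  # ...but a lower percent correct in 4AFC
print(round(pc2, 3), round(pc4, 3))
```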
This formulation assumes no observer biases toward particular locations or intervals.
However, in principle it is possible for observers to show response biases or differential
criteria for the different locations of simultaneously presented stimuli in n-AFC experi-
ments or for particular interval(s) of the n-IFC experiment (see ref. 19). The equation must
be generalized to include criterion terms if biases are present. Forced-choice experiments
are generally believed to exhibit fewer issues of bias than the Yes/No experiments, and, in
fact, in many cases where it can be checked, observed biases tend to be small while the
number of alternatives or intervals is small. The data from an experiment can be examined
by tabulating responses for each location or interval over all stimuli to look for major
departures from unbiased performance. Other types of decision rules (than the max rule)
have also been formulated for the n-AFC/n-IFC cases; these may lead to slightly different
estimates.
8.4.3 Same-Different
The same-different paradigm compares two stimuli jointly and asks observers to judge
whether they are the same. The simple same-different experiment consists of two catego-
ries of stimuli, A and B. Two stimuli are presented on each trial, and the observer responds
either "same" or "different." On any given trial of the experiment, stimuli from AA, AB,
BA, or BB trial types are presented. The correct response to AA or BB trial types is "same"
and to AB or BA trial types is "different." The two stimuli in each trial are chosen inde-
pendently and randomly across trials. Alternatively, the numbers of trials of the four types
are set to be equal for a perfectly balanced design, and the order of the trials is random-
ized. As an example, four stimulus pairs manipulating the angle of Gabors are shown in
figure 8.7.
The same-different paradigm is more complicated than some of the other tasks because
it involves the use of multiple criteria or multiple decisions that are combined together.
No single criterion (or linear boundary in two dimensions) can solve the problem.
[Figure 8.7 panels: (a) stimulus pairings of categories A and B over interval one and interval two; (b) distributions of the difference between internal responses, with criteria c_low and c_high around 0.]
Figure 8.7
The same-different task. (a) In each interval, a stimulus from either category A or category B is shown, resulting
in four possible trial types: AA, AB, BA, and BB. Observers must judge if the stimuli in the two intervals are
the same or different. (b) Difference distributions.
There are several strategies, or ways for an observer to go about making a decision, in
a same-different task. The most commonly cited decision rule (see ref. 9) takes the dif-
ference between the internal responses to the stimuli appearing either successively in two
intervals or simultaneously in two locations. Researchers usually assume that the variances
σ² are equal and that the internal responses are uncorrelated. For trials where the stimuli are
the same, AA or BB, the distribution of the difference scores should be centered about
zero and the variance is 2σ². For trials where the stimuli are different, the distribution of
difference scores is centered either around +d′ or around −d′ depending upon whether the
pair is AB or BA; the variance is 2σ². In this case, the subject should say "same" whenever
the difference is sufficiently close to 0 and say "different" otherwise. This requires two
criterion values, c_low and c_high. If the two criteria are symmetric, only one parameter c is
estimated. One way the criteria may be displayed on a graph is as a pair of lines diagonal
to the axes (figure 8.7).
A different, and more optimal, way to make the same-different decision is to
combine independent classifications of the two stimuli. The first stimulus is classified as
an A or B, the second stimulus is classified as an A or B, and then these two classifications
are combined to determine the "same" or "different" response. In the two-dimensional
representation of the internal responses in figure 8.7, this corresponds to partitioning the
four quadrants of the two-dimensional space. If the criteria applied to both dimensions are
unbiased, then the multiple classification strategy is optimal.
250 Chapter 8
The probability correct for the optimal independent decisions strategy is related to
d′ by

Pc = [G(d′/2)]² + [G(−d′/2)]².   (8.15)
An approximation to d′ from the standard equation is often used, d′ = z_Hit − z_FA. For the
same-different paradigms, hits are defined as responding "different" to AB and BA trials,
or correctly detecting a difference. False alarms are defined as responding "different" to
AA and BB trials, or incorrectly detecting a difference.
For the independent decisions rule,

d′ = 2z([1 + (2Pc − 1)^{1/2}]/2).
For a given d′ between the distributions of As and Bs, the observed probability correct
is slightly higher for the (optimal) independent decisions rule than for the differencing
rule. Both decision rules in same-different test designs will lead to percent correct perfor-
mance that is noticeably less than the percent correct for a Yes/No design with the same
two stimuli. For example, for a d′ of 1, Yes/No performance yields 69% correct, indepen-
dent decisions same-different yields 57% correct, and differencing same-different yields
55% correct.
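These percent correct values can be verified numerically. The Python sketch below computes all three for d′ = 1, assuming equal variances; the differencing value is obtained by searching for the accuracy-maximizing symmetric criterion and comes out near 55 to 56 percent:

```python
from statistics import NormalDist

G = NormalDist().cdf
d = 1.0  # hypothetical discriminability

pc_yesno = G(d / 2)                      # unbiased Yes/No
pc_ind = G(d / 2) ** 2 + G(-d / 2) ** 2  # equation 8.15

def pc_differencing(d, k):
    """Differencing rule with symmetric criteria at -k and +k: the
    difference score is N(0, 2) on same trials, N(+/-d, 2) on different."""
    s = 2 ** 0.5
    p_same_correct = G(k / s) - G(-k / s)
    p_diff_correct = 1 - (G((k - d) / s) - G((-k - d) / s))
    return (p_same_correct + p_diff_correct) / 2

# search for the criterion that maximizes differencing-rule accuracy
pc_diff = max(pc_differencing(d, i * 0.01) for i in range(400))

print(round(pc_yesno, 2), round(pc_ind, 2), round(pc_diff, 2))
```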
A final approach to the same-different decision is to choose a response based on the
likelihood ratio. Here, the subject would say "same" whenever the observed internal
responses are more likely to have arisen from same than from different pairs. The likeli-
hood ratio formulation is another way to define optimal performance. It has been argued
that the likelihood ratio requires particularly complex processing on the part of the
observer: it assumes that the observer has full knowledge of the form of the distributions
of internal responses.
See ref. 9 for a more detailed treatment of the same-different tasks and issues
of response bias. Several researchers have provided more detailed analyses of these dif-
ferent approaches to the same-different task and their optimality.20-22 For a recent develop-
ment of ways to test for various strategies, including those with asymmetric criteria, see
ref. 23.
8.4.4 Identification
In an identification experiment, a single stimulus is presented on any trial, and the stimulus
is drawn from one of two or more stimulus types or conditions. The subject must classify,
or identify, the stimulus. If the stimuli vary along a single dimension, such as contrast, the
same single detector might process all the stimuli. In this case, different conditions are
assumed to produce distributions shifted along the internal response axis.
Unidimensional identification tasks default to the SDT formulation of a Yes/No task. If
there are only two stimulus conditions and the distribution of internal responses is higher
for one and lower for the other, then identification is completely analogous to simple Yes/
No. Simply respond "B" if the internal response sampled on that trial exceeds criterion c.
If there are n stimulus categories corresponding to n shifted distributions on a single
internal response dimension, then the subject will need n − 1 criteria to yield n response
categories, and this is theoretically related to an n-point rating scale. Whether an identifica-
tion task is intrinsically unidimensional depends upon the stimuli and how they are pro-
cessed in the visual (or auditory or tactile) system. In some cases, what may seem like a
variation of stimuli along a single dimension, such as orientation, may in fact be processed
in the brain by a set of competing detectors tuned to different ranges of orientations. We
take up the extensions of SDT to multiple dimensions in the next section.
The standard signal detection experiments and analysis involve stimuli that differ in a
single dimension, or whose differences are projected onto a single relevant decision
dimension (section 8.4.4). However, stimuli often differ along more than one dimension
of variation. For example, Gabor patches varying in contrast have other attributes such
as spatial size and spatial frequency. In some circumstances, the experimenter, or
the world, presents stimuli that vary in several dimensions, in the same experiment or
setting, such as varying color and shape, or spatial frequency and orientation, or hue and
saturation.
SDT provides a theoretical analysis for many detection and discrimination paradigms
for single-dimensional situations. The general recognition theory (GRT) extends the prin-
ciples of SDT to multidimensional stimuli for several different kinds of decisions.24 Some
cases of GRT are simple extensions of the original unidimensional SDT. Other cases can
be quite complicated. This section describes some of the simpler ones.
Consider an experiment with four distinct stimuli created with two values or levels on
each of two stimulus dimensions. For example, the stimuli might be a set of four lines
made from one of two lengths and one of two orientations. As in SDT, the external
stimuli are encoded in internal representations that vary from trial to trial due to internal
variability or internal noise. Presenting a given stimulus leads to an internal value on each
dimension, corresponding to a point in a two-dimensional (2D) space representing length
and orientation. Over trials, the variability in the internal representations of length and
orientation leads to four distributions in the 2D space, corresponding to the four stimuli
that can be presented in different trials: (length_1, orientation_1), (length_1, orientation_2),
(length_2, orientation_1), and (length_2, orientation_2).
Figure 8.8 illustrates the simple case where the internal response distributions for each
of four stimuli (listed above) are drawn from bivariate normal distributions that are equiva-
lent. The means and standard deviations for stimuli varying in length (dimension 1) are
not dependent on orientation (dimension 2) and vice versa.
[Figure 8.8: joint distributions for the four stimuli (levels A1, A2 on dimension 1 crossed with B1, B2 on dimension 2), plotted over internal response in dimension 1 and internal response in dimension 2.]
Figure 8.8
GRT representations of four stimulus types in two dimensions that exhibit perceptual separability and perceptual
independence. Joint probability density distributions for the four stimuli in two dimensions are shown as contour
circles along with the marginal probability density distributions shown above or to the right of the relevant
dimensions. The dotted lines indicate the optimal decision boundaries that maximize proportion correct for
identification of stimuli in the four categories.
When the values of the 2D internal representation are projected onto the length axis,
the marginal distributions are identically and normally distributed for the two levels of
orientation. Because they are identical, we see only a single marginal distribution curve.
(Projection onto the length axis means that we look at the distribution of values on that
dimension regardless of the value on the other one.) Similarly, when the values of the 2D
internal representation are projected onto the orientation axis, the marginal distributions
are identically and normally distributed for the two levels of length. If the aspect ratios of
the two axes have been scaled to visually equate the standard deviations in the two dimen-
sions, and the two dimensions are perceptually independent, then the equal-probability
contour lines that represent the bivariate normal probability density distributions will
appear circular in the (scaled) 2D space as in figure 8.8. The contour lines connect the
points in the distribution that have the same value.
In standard SDT, a single decision criterion demarks the regions of the one-dimensional
(1D) line associated with each response. By analogy, in the simple case of GRT depicted
here, a single criterion for perceived length corresponds with a line in the 2D space that
is perpendicular to the length axis, and does not depend upon the sensed orientation. The
same is true for a decision about the stimulus orientation. Jointly, these two lines separate
four regions in space that would correspond with an identification response.
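In this simple case, identification accuracy factors into the product of two independent unidimensional decisions, each equivalent to an unbiased Yes/No judgment. A Python sketch with hypothetical per-dimension d′ values checks the factorization by Monte Carlo:

```python
import random
from statistics import NormalDist

G = NormalDist().cdf

def identification_accuracy(d1, d2, n_trials=100_000, seed=3):
    """Monte Carlo four-alternative identification under perceptual
    independence and separability, with separable decision bounds
    placed midway (d/2) along each dimension."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        level1 = rng.random() < 0.5                # true level, dimension 1
        level2 = rng.random() < 0.5                # true level, dimension 2
        x = rng.gauss(d1 if level1 else 0.0, 1.0)  # independent internal
        y = rng.gauss(d2 if level2 else 0.0, 1.0)  # responses, sigma = 1
        if (x > d1 / 2) == level1 and (y > d2 / 2) == level2:
            correct += 1
    return correct / n_trials

d1 = d2 = 1.5                      # hypothetical per-dimension d' values
predicted = G(d1 / 2) * G(d2 / 2)  # product of the marginal accuracies
simulated = identification_accuracy(d1, d2)
print(round(predicted, 3), round(simulated, 3))
```

The factorization holds only under the perceptual independence, perceptual separability, and decision separability conditions discussed next; the more complicated cases below break it.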
In the most general case of the GRT, the internal representations of different stimulus
conditions are described completely by their means and variances on the different dimen-
sions and by the covariance matrix, all of which are free to vary. The GRT was developed
to handle not just the simple special cases but also other more complicated cases. There
are many ways in which four stimuli in two dimensions might depart from the simple case.
The means and/or standard deviations on one dimension may depend upon the level of its
own dimension and/or on the level of the other variable. Even if the marginal distributions
are the same, the variability in the two dimensions may be correlated, exhibiting distribu-
tions with contours on the 2D graph that are oblique to the axes.
Discussions of the GRT have focused on three simplifying properties of the multidi-
mensional description: perceptual independence, perceptual separability, and decision
separability.25,26 Two dimensions are perceptually independent if the bivariate distribution
of the internal representation is the product of the marginal distributions. This only occurs
if the internal values of the two dimensions are statistically independent, chosen inde-
pendently and at random.24,27
Two dimensions are perceptually separable if the marginal distributions are independent
of the level of the other variable. Dimensions that are perceptually separable are not neces-
sarily perceptually independent, and vice versa.28,29
Decision separability corresponds to the simple cases where the decision boundaries
for classifying each of the single dimensions are straight lines parallel to the coordinate
axes and perpendicular to the axis of the decision. Decision boundaries are lines or curves
in the 2D decision space that demark regions for different responses.28,29
Figure 8.8 showed a particularly simple case in which perceptual separability, perceptual
independence, and decision separability all hold.
Figure 8.9 shows a special case where variances of the internal representations on a
dimension depend upon the level of that dimension, but the representations still exhibit
perceptual independence and perceptual separability. The perceptual independence cor-
responds to equal probability ellipses of the bivariate normal distributions that are not tilted
relative to the axes in the 2D space. The variance in the internal representations of each
individual dimension is larger for the larger level, yet does not depend upon the level of
the other variable, meeting perceptual separability. Each individual decision (i.e., for
longer or shorter or for more or less oriented) shows decision separability, and the two
decision bounds (dashed lines) taken together provide decision separable bounds for iden-
tifying stimuli in the four categories.
Figure 8.10 shows a case in which perceptual separability holds, yet perceptual inde-
pendence and decision separability fail. Indeed, the means and variances of the distribu-
tions are the same regardless of the level of either dimension. Yet, the variability in the
internal representation is correlated; higher values on one dimension tend to go with higher
values on another. This correlation suggests optimal decision boundaries that do not show
decision separability; they are not parallel to the dimensional axes.
Empirically, observers may use simplified but nonoptimal decision strategies that exhibit
decision separability in complicated perceptual situations in which there are failures of
perceptual independence or perceptual separability. If the distributions in perceived lengths,
for example, differ substantially for lines of different orientations, an observer might
exhibit separable decision boundaries even though more complicated decision boundar-
ies, no longer parallel to the coordinate axes or even straight lines, may be required to
optimize accurate classification. The observer may default to a strategy of decision sepa-
rability simply because he or she does not know how to set up the more complicated
decision boundaries.
Developments in GRT have created tests for perceptual independence, perceptual sepa-
rability, and decision separability based on response categorization data. Other tests have
been developed based on response time signatures, in some cases by assuming that
response times are inversely related to distance from a decision boundary,26,30-33 or adding
dynamic noise to the GRT to derive response time predictions.34 Many of these develop-
ments are quite technical. Interested readers should consult key publications in this area,
such as refs. 24, 25, and 35.
[Figure 8.9: contour ellipses for the four stimuli (A1, A2 on dimension 1 crossed with B1, B2 on dimension 2), plotted over internal response in dimension 1 and internal response in dimension 2.]
Figure 8.9
GRT representations of four stimuli in two dimensions that exhibit perceptual and decision separability and
perceptual independence in the presence of variance differences. Joint probability density distributions for two
stimulus values in each of two dimensions are shown as contour ellipses along with the marginal probability
density distributions shown as curves to the right or above the relevant dimension. The dotted lines indicate
the optimal decision boundaries that maximize proportion correct for identification of stimuli in the four
categories.
[Figure 8.10: tilted contour ellipses for the four stimuli (A1, A2 on dimension 1 crossed with B1, B2 on dimension 2), plotted over internal response in dimension 1 and internal response in dimension 2.]
Figure 8.10
GRT representations of four stimuli in two dimensions that exhibit perceptual separability but neither perceptual
independence nor decision separability. Joint probability density distributions for two stimulus values in each of
two dimensions are shown along with the marginal probability density distributions. The dotted lines indicate
the optimal decision boundaries that maximize proportion correct for recognition of stimuli in the four
categories.
[Figure 8.11 plot: internal responses in the two channels; abscissa, internal response in channel 1.]
Figure 8.11
Two-alternative identification. Each stimulus generates two internal responses, one from each channel.
orthogonal to the criterion, lead to distributions that correspond to the two shifted distri-
butions of standard SDT. Then, a single linear boundary in the 2D space (the positive
diagonal line in figure 8.11) represents the single criterion in standard SDT. This projec-
tion rule is also equivalent to sampling each channel for the presented stimulus and
choosing the response that corresponds to the maximally active channel. In some cases,
the responses of the two channels are not orthogonal, in which case SDT application is
more complicated.24,28,35,38
Under stimulus uncertainty, the decision is based on the internal responses of all the detectors that the observer considers broadly task relevant.
Over the two intervals or locations, the observer monitors a total of 2(U + 1) internal
responses, of which only one is associated with the signal stimulus. For this development,
we assume that the internal responses of all the 2(U + 1) detectors are independent and
Gaussian distributed. The signal or target stimulus generates an internal response distribution with mean $\mu_{S+N}$ and standard deviation $\sigma_{S+N}$ in the task-relevant detector, while the internal responses of all the other 2U + 1 detectors have mean 0 and standard deviation $\sigma_N$.
There are several different decision rules that the observer may use. The optimal rule
sums (U + 1) internal responses for each of the two stimuli or intervals and chooses
the interval or the stimulus with the larger sum as containing the signal. For the sum of
the (U + 1) random variables for each interval or stimulus, the mean is equal to the sum
of the means; the variance is the sum of the variances.
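With Gaussian internal responses these sums have a simple closed form. A minimal sketch (Python for illustration rather than the book's MATLAB, assuming equal unit variances in all detectors so that the task-relevant detector's mean equals d′):

```python
import math

def pc_sum_rule(d, U):
    """Proportion correct for the summation rule.

    Each interval's sum of its U + 1 unit-variance responses is Gaussian;
    only the signal interval's sum has mean d (from the task-relevant
    detector).  The response is correct when the signal-interval sum exceeds
    the noise-interval sum, i.e., when a Gaussian with mean d and variance
    2 * (U + 1) exceeds zero.
    """
    z = d / math.sqrt(2 * (U + 1))
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))
```

At U = 0 this is the familiar 2AFC result Φ(d′/√2); every added task-irrelevant detector adds noise variance to the sums and lowers proportion correct.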
Often, a maximum rule, or max rule, is used instead of the summation rule. In many
conditions, the max rule is a reasonable approximation43 to the optimal decision rule. With
the maximum rule, the observer compares all the 2(U + 1) internal responses and chooses
the interval or sample with the maximum as the signal location or signal interval. A correct
response may arise in two different ways: (1) the internal response of the task-relevant
detector to the stimulus containing the signal (with mean $\mu_{S+N}$) is greater than the other
2U + 1 internal responses (each with mean 0), and (2) the internal response of one of the
task-irrelevant detectors to the signal is greater than the other 2U + 1 internal responses,
including the one from the task-relevant detector for the signal. In the second case, the
observer generates the correct response for the wrong reason. The two possibilities are
reflected in the two terms of the following equation:
$$P_c = \int_{-\infty}^{+\infty} \Big[ g(x - \mu_{S+N}, 0, \sigma_{S+N})\, G^{2U+1}(x, 0, \sigma_N) + U\, g(x, 0, \sigma_N)\, G^{2U}(x, 0, \sigma_N)\, G(x - \mu_{S+N}, 0, \sigma_{S+N}) \Big]\, dx. \quad (8.16)$$
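The integral for the max rule is generally evaluated numerically. A minimal sketch (Python for illustration rather than the book's MATLAB, assuming unit standard deviations, σ_N = σ_{S+N} = 1, so that the task-relevant detector's mean plays the role of d′; the grid limits are arbitrary):

```python
import math

def g(x, mu, sigma):
    """Gaussian probability density function."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def G(x, mu, sigma):
    """Gaussian cumulative distribution function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def pc_max_rule(d, U, lo=-8.0, hi=12.0, n=4000):
    """Evaluate eq. 8.16 by the trapezoidal rule, with sigma_N = sigma_{S+N} = 1
    so that the task-relevant detector has mean d (= d') and all others mean 0."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        f = (g(x, d, 1.0) * G(x, 0.0, 1.0) ** (2 * U + 1)      # signal detector is the max
             + U * g(x, 0.0, 1.0) * G(x, 0.0, 1.0) ** (2 * U)  # an irrelevant detector in
             * G(x, d, 1.0))                                   # the signal interval is max
        total += f * (0.5 if i in (0, n) else 1.0)
    return total * h
```

With U = 0 the second term vanishes and the result matches the standard 2AFC value Φ(d′/√2), about 0.760 at d′ = 1; increasing U drives proportion correct toward chance, as in figure 8.13.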
The maximum rule can also be formulated in a different but equivalent way. The
observer could first extract the maximum of the (U + 1) internal responses to each interval
or stimulus in a trial and decide that the one that contains the greater maximum internal
response is generated by the signal. For the (U + 1) internal responses generated by nonsignals, all with the same probability density function $g(x, 0, \sigma_N)$, the probability density function of the maximum is

$$p_1(x \mid \max) = (U+1)\, g(x, 0, \sigma_N)\, G^{U}(x, 0, \sigma_N). \quad (8.17)$$
For the (U + 1) internal responses generated by the interval or the stimulus containing the signal, the probability density function of one of them is $g(x - \mu_{S+N}, 0, \sigma_{S+N})$; the other U have probability density function $g(x, 0, \sigma_N)$. The probability density function of the maximum of these U + 1 responses is

$$p_2(x \mid \max) = g(x - \mu_{S+N}, 0, \sigma_{S+N})\, G^{U}(x, 0, \sigma_N) + U\, g(x, 0, \sigma_N)\, G^{U-1}(x, 0, \sigma_N)\, G(x - \mu_{S+N}, 0, \sigma_{S+N}). \quad (8.18)$$

A correct response occurs when this maximum exceeds the maximum of the U + 1 noise-only responses in the other interval, whose cumulative distribution is $P_1(x \mid \max) = G^{U+1}(x, 0, \sigma_N)$:

$$P_c = \int_{-\infty}^{+\infty} p_2(x \mid \max)\, P_1(x \mid \max)\, dx$$
$$= \int_{-\infty}^{+\infty} \Big[ g(x - \mu_{S+N}, 0, \sigma_{S+N})\, G^{2U+1}(x, 0, \sigma_N) + U\, g(x, 0, \sigma_N)\, G^{2U}(x, 0, \sigma_N)\, G(x - \mu_{S+N}, 0, \sigma_{S+N}) \Big]\, dx. \quad (8.19)$$
The maximum distributions of the internal responses of the signal-present and signal-
absent intervals are shown in figure 8.12. As the number of irrelevant channels increases,
the two distributions get closer and closer.
Figure 8.12
Internal response distributions for signal-present (right) and signal-absent intervals, with the number of irrelevant
channels (U) indicated at the upper left corner of each panel.
[Figure axes: probability correct (0.5 to 0.9) as a function of the number of task-irrelevant channels (0 to 200), with separate curves for d′ = 1, 2, and 3.]
Figure 8.13
Dependence of probability correct on the number of task-irrelevant channels in a 2AFC/2IFC design using the max rule for decision. Three d′ levels are shown.
Figure 8.13 illustrates how probability correct depends on the number of task-irrelevant channels under the max rule. Uncertainty calculations can be very important in the correct interpretation of the decisions of observers. Adding uncertainty lowers the observed percent correct for the same internal distance between signal and noise, $\mu_{S+N}$, because all the additional detectors or channels provide more opportunities for a noise sample to exceed the sample from the signal, leading to more opportunities for false alarms.
Uncertainty calculations have been especially important in several areas of research.
One such example is the area of attention, in which early researchers attributed reductions in percent-correct performance to limited attention processes when those reductions were instead completely consistent with unlimited, ideal observers once the uncertainty considerations of SDT were taken into account.39,41,44,45
To the extent that the decision criterion varies from trial to trial, the standard SDT formulation absorbs criterion noise into perceptual noise.46 However, more than half a century of research has provided a great deal
of evidence contradicting the assumption of a static decision criterion, and correct inter-
pretation of sensory processes would ideally segregate criterion noise from perceptual
noise. Traditional SDT is quite good at estimating large differences in bias. It provides an
excellent approximation as long as decision noise is small relative to the variability in the
internal perceptual representations.
Several recent studies47,48 have suggested a dramatic reassessment of previous research
findings from traditional SDT for cases where criterion noise is large. Although extensions
of SDT accounting for decision noise have appeared recently, efforts to obtain separate
estimates of decision and encoding noise have so far relied on restrictive assumptions about
the relationship between various noise components. Recent work has started to make new
headway in this classic problem. This new work49 derives methods capable of independent
estimates of the noise components at the decision stage.
8.7 Summary
This chapter provided a basic treatment of the measurement of visual sensitivity or the
detection, discrimination, and identification of near-threshold or threshold stimuli. The
chapter first reviewed several classical paradigms for measuring perceptual sensitivity. The
middle sections of the chapter provided an introduction and explanation of SDT as a
framework for understanding detection and discrimination performance and for the separa-
tion of measures of sensitivity from measures of criterion or bias. We provided examples
and treatments of standard detection paradigms and their extensions to rating experiments
that measure ROCs. This section provided an introduction to many of the standard detec-
tion, discrimination, and identification paradigms.
We provided a brief overview of the extensions of SDT to multiple-dimensional stimuli,
such as stimuli varying in length and orientation, for discrimination and for full identification. Situations where stimuli vary in several dimensions are considered more generally within the framework of GRT. Finally, we considered several specialized extensions of SDT, such as the uncertainty calculations often used in attention research. This chapter should
provide the basic theoretical tools for understanding most prominent paradigms for mea-
suring threshold performance.
References
1. Campbell F, Robson J. 1968. Application of Fourier analysis to the visibility of gratings. J Physiol 197: 551.
2. Enroth-Cugell C, Robson JG. 1966. The contrast sensitivity of retinal ganglion cells of the cat. J Physiol 187: 517.
3. Movshon JA, Thompson ID, Tolhurst DJ. 1978. Spatial and temporal contrast sensitivity of neurones in areas 17 and 18 of the cat's visual cortex. J Physiol 283: 101-120.
4. Watson AB. Temporal sensitivity. In: Boff K, Kaufman L, Thomas J, eds. Handbook of perception and human performance, Vol. 1. New York: Wiley; 1986, pp. 1-43.
5. Chung STL, Legge GE, Tjan BS. 2002. Spatial-frequency characteristics of letter identification in central and peripheral vision. Vision Res 42: 2137-2152.
6. Watson AB, Ahumada AJ. 2005. A standard model for foveal detection of spatial contrast. J Vis 5(9): 717-740.
7. Graham NVS. Visual pattern analyzers. Oxford: Oxford University Press; 2001.
8. Green DM, Swets JA. Signal detection theory and psychophysics. New York: Wiley; 1966.
9. Macmillan NA, Creelman CD. Detection theory: A user's guide. Hillsdale, NJ: Lawrence Erlbaum; 2005.
10. Marcum JA. Statistical theory of target detection by pulsed radar. Project RAND, Douglas Aircraft Company, Inc., RA-15061, Dec. 1947.
11. Peterson W, Birdsall T, Fox W. 1954. The theory of signal detectability. Transactions of the IRE Professional Group on Information Theory 4(4): 171-212.
12. Tanner WP, Jr, Swets JA. 1954. A decision-making theory of visual detection. Psychol Rev 61: 401-409.
13. Gescheider GA. Psychophysics: Method and theory. Hillsdale, NJ: Lawrence Erlbaum; 1976.
14. Kingdom FAA, Prins N. Psychophysics: A practical introduction. New York: Academic Press; 2009.
15. DeCarlo LT. 2002. Signal detection theory with finite mixture distributions: Theoretical developments with applications to recognition memory. Psychol Rev 109: 710-721.
16. Cohn TE, Lasley DJ. 1974. Detectability of a luminance increment: Effect of spatial uncertainty. JOSA 64: 1715-1719.
17. Eskew RT, Jr, Stromeyer CF, III, Picotte CJ, Kronauer RE. 1991. Detection uncertainty and the facilitation of chromatic detection by luminance contours. JOSA A 8: 394-403.
18. Lasley DJ, Cohn T. 1981. Detection of a luminance increment: Effect of temporal uncertainty. JOSA 71: 845-850.
19. Wenger MJ, Rasche C. 2006. Perceptual learning in contrast detection: Presence and cost of shifts in response criteria. Psychon Bull Rev 13: 656-661.
20. Noreen D. 1981. Optimal decision rules for some common psychophysical paradigms. Math Psychol Psychophysiol 13: 237-279.
21. Dai H, Versfeld NJ, Green DM. 1996. The optimum decision rules in the same-different paradigm. Atten Percept Psychophys 58: 1-9.
22. Versfeld NJ, Dai H, Green DM. 1996. The optimum decision rules for the oddity task. Atten Percept Psychophys 58: 10-21.
23. Petrov AA. 2009. Symmetry-based methodology for decision-rule identification in same-different experiments. Psychon Bull Rev 16: 1011-1025.
24. Ashby FG, Townsend JT. 1986. Varieties of perceptual independence. Psychol Rev 93: 154-179.
25. Ashby FG. Multidimensional models of perception and cognition. Hillsdale, NJ: Lawrence Erlbaum; 1992.
26. Maddox WT. Perceptual and decisional separability. Hillsdale, NJ: Lawrence Erlbaum; 1992.
27. Thomas RD. 1995. Gaussian general recognition theory and perceptual independence. Psychol Rev 102: 192-200.
28. Kadlec H, Townsend JT. Signal detection analyses of dimensional interactions. In: Ashby FG, ed. Multidimensional models of perception and cognition. Hillsdale, NJ: Lawrence Erlbaum; 1992, pp. 181-227.
29. Kadlec H, Townsend JT. 1992. Implications of marginal and conditional detection parameters for the separabilities and independence of perceptual dimensions. J Math Psychol 36: 325-374.
30. Garner WR. The processing of information and structure. Hillsdale, NJ: Lawrence Erlbaum; 1974.
31. Garner WR. 1976. Interaction of stimulus dimensions in concept and choice processes. Cognit Psychol 8: 98-123.
32. Posner MI. 1964. Information reduction in the analysis of sequential tasks. Psychol Rev 71: 491-504.
33. Wickens TD, Olzak LA. Three views of association in concurrent detection ratings. In: Ashby FG, ed. Multidimensional models of perception and cognition. Hillsdale, NJ: Lawrence Erlbaum; 1992, pp. 229-252.
34. Ashby FG. 2000. A stochastic version of general recognition theory. J Math Psychol 44: 310-329.
35. Ashby FG, Gott RE. 1988. Decision rules in the perception and categorization of multidimensional stimuli. J Exp Psychol Learn Mem Cogn 14: 33-53.
36. Kadlec H, Townsend J. Signal detection analysis of multidimensional interactions. In: Ashby FG, ed. Probabilistic multidimensional models of perception and cognition. Hillsdale, NJ: Lawrence Erlbaum; 1992, pp. 181-231.
37. Wickens TD. Elementary signal detection theory. Oxford: Oxford University Press; 2002.
38. Thomas RD. 1996. Separability and independence of dimensions within the same-different judgment task. J Math Psychol 40: 318-341.
39. Sperling G, Dosher B. Strategy and optimization in human information processing. In: Boff KR, Kaufman L, Thomas JP, eds. Handbook of perception and human performance, Vol. I: Sensory processes and perception. New York: Wiley; 1986, pp. 2-1 to 2-65.
40. Shaw ML. 1980. Identifying attentional and decision-making components in information processing. Attention and Performance 8: 277-295.
41. Palmer J, Verghese P, Pavel M. 2000. The psychophysics of visual search. Vision Res 40: 1227-1268.
42. Palmer J, Ames CT, Lindsey DT. 1993. Measuring the effect of attention on simple visual search. J Exp Psychol Hum Percept Perform 19: 108-130.
43. Nolte LW, Jaarsma D. 1967. More on the detection of one of M orthogonal signals. J Acoust Soc Am 41: 497-505.
44. Shaw ML. 1982. Attending to multiple sources of information: I. The integration of information in decision making. Cognit Psychol 14: 353-409.
45. Eckstein MP. 1998. The lower visual search efficiency for conjunctions is due to noise and not serial attentional processing. Psychol Sci 9: 111-118.
46. Wickelgren WA, Norman DA. 1966. Strength models and serial position in short-term recognition memory. J Math Psychol 3: 316-347.
47. Benjamin AS, Diaz M, Wee S. 2009. Signal detection with criterion noise: Applications to recognition memory. Psychol Rev 116: 84-114.
48. Rosner BS, Kochanski G. 2009. The Law of Categorical Judgment (Corrected) and the interpretation of changes in psychophysical performance. Psychol Rev 116: 116-128.
49. Cabrera C, Lu ZL, Dosher B. 2011. Separating decision noise and encoding noise in perceptual decision making. J Vis 11: 805.
50. Graham N, Kramer P, Yager D. 1987. Signal-detection models for multidimensional stimuli: Probability distributions and combination rules. J Math Psychol 31: 366-409.
9 Observer Models
Observer models specify computations that are sufficient to predict the behavior of an
observer for many different input stimuli with a small number of parameters. Observer
models specify the transformations leading to the relevant internal representations from
the stimulus, and define the decision rules for particular tasks. These models have a
remarkable ability to summarize compactly the behavioral outcomes in many conditions
and provide a conceptual framework within which to understand the responses of the
sensory system. In this chapter, we consider both modern single-channel and multichannel
observer models. Each observer in a task can be described by a small number of parameters
that fully specify how the stimulus is recoded in an internal response and then subjected
to a task-relevant decision. Once these parameters have been estimated from specific
experimental tests, it is possible to make predictions about an observer's responses to a
wide range of stimuli and paradigms.
The goal of visual psychophysics is to fully specify the relationship between the stimulus,
the internal representation, and the observer's response. A successful theory of perception
should include models that specify all of these components, including a transformation
from the stimulus inputs to the internal representations, and then a function that takes the
internal representation into a response. Chapter 7 on scaling focused on the relationship
between perceptual intensity or perceptual qualities and properties of the physical stimulus.
Chapter 8 showed how to use signal-detection theory to separate perceptual sensitivity
from response bias and subjective criterion. A complete psychophysical theory for any
experiment requires detailed specification of the transformations from the input to the
output. Signal-detection theory by itself is silent about how the stimulus quality or intensity
is related to the internal representation. Scaling infers a quantitative relationship between
stimuli, but may or may not specify the transformation from the stimulus to the internal
representation, and often fails to provide an explicit model of how the observer generated
the scaling responses from the internal representation.
268 Chapter 9
The observer models solve the problem of how to predict an observer's behavioral
performance over a wide range of possible stimuli. Even a simple example illustrates the
value. Suppose that we want to know the observer's sensitivity to any Gabor pattern. There
are hundreds of combinations of contrast and external noise that we might want to predict.
How can we make predictions about these hundreds of cases by measuring only a few
conditions? One brute-force possibility is to sample the contrast and external noise space
with some experimental tests (and hope they are well placed) and then attempt to use
interpolation to predict the performance in all the other cases. The observer models provide
a sophisticated way to make these predictions from several key observations in a way that
also improves our conceptual understanding of how the sensory systems function.
An observer model specifies the transformation from the stimulus to the internal repre-
sentation and then the transformation to the response. The focus is on specifying the
transformation from the stimulus to the internal response. Most observer models then
incorporate a signal-detection or other standard decision module for response generation.
Modeling the noise or variability in the internal representation is a key component of any
observer model.1 The noises make the system stochastic, meaning that it involves random
variables or probabilities that generate trial-to-trial variability. There are many models in
vision; however, only a subset incorporates a significant treatment of all the aspects of
observer models.
The simplest observer model must have a component or module that is sensitive to the
target pattern or patterns. This is usually called a template. It can be thought of as a detec-
tor for the target. The output of the template depends both on the match of the test pattern
to the template and the contrast of the pattern. The template response is increased when
the match to the template is better and the contrast is higher. To mimic the variability of
observers, including the neural and sensory variability in encoding, the observer model
must incorporate internal noise sources to generate a noisy, stochastic internal response
(figure 9.1). Then, it must have a decision module that maps the noisy internal response
into an action, or external response of the observer.
In some applications of observer models, a single template may be applied at the location of the signal stimulus, and the output is a single number that is the basis for the subsequent
decision. More generally, a template may be applied at many locations, and the output is
an array of numbers that drive a decision. Figure 9.1 illustrates template matching by the
application (convolution) of a Gabor template at many points in the stimulus. The template
is looking for a vertical Gabor pattern. It returns essentially no response over the cloudless sky, stronger responses where the orientation and spatial frequency in the image match the template, and the strongest responses at the vertical edges in the stimulus. The middle
panel of figure 9.1 represents a matching value at each spatial location in the image. More
complicated observer models may incorporate multiple channels or multiple templates,
physiological response nonlinearities and interactions, multiple sources of internal noise,
and more complex decision rules.2,3
Observer Models 269
Figure 9.1
An illustration of template matching at many points in the image with additive noise. An input stimulus is con-
volved with a Gabor template; 30% noise is added to the output.
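At a single location, the matching operation illustrated in figure 9.1 is just an inner product between an image patch and the template. A small sketch (Python for illustration rather than the book's MATLAB; the template size, spatial frequency, and the 30% additive noise level are illustrative choices):

```python
import math
import random

def gabor_template(size=9, freq=0.25, sigma=2.0):
    """Vertical Gabor: Gaussian envelope times a cosine carrier along x."""
    half = size // 2
    return [[math.exp(-(x * x + y * y) / (2 * sigma ** 2))
             * math.cos(2 * math.pi * freq * x)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]

def template_response(patch, template):
    """Template response at one location: inner product of patch and template."""
    return sum(p * t
               for prow, trow in zip(patch, template)
               for p, t in zip(prow, trow))

rng = random.Random(0)
size, freq, half = 9, 0.25, 4
noisy = lambda v: v + rng.gauss(0.0, 0.3)   # additive external noise (30%)

# A vertical grating matches the template's orientation; a horizontal one does not.
vertical = [[noisy(math.cos(2 * math.pi * freq * x))
             for x in range(-half, half + 1)] for y in range(-half, half + 1)]
horizontal = [[noisy(math.cos(2 * math.pi * freq * y))
               for x in range(-half, half + 1)] for y in range(-half, half + 1)]

template = gabor_template(size, freq)
r_match = template_response(vertical, template)
r_mismatch = template_response(horizontal, template)
```

Even with the noise added, the response to the orientation-matched grating is much larger than to the mismatched one; this is the single-location version of the convolution map in the middle panel of figure 9.1.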
Once constructed and fully specified, an observer model provides the basis to generate
predictions and generalize the results from a small number of experimental tests to a wide
range of experimental conditions for a particular observer or class of observers.4,5 Observer
models also provide a theoretical framework to investigate and quantify how the observer
changes as a function of the individual's state. An observer state can include such things as adaptation, training, or attention state, or the observer's classification as, for example,
dyslexic or amblyopic. The effects of changing the state of the observer are understood
by how some model parameters change as a function of state while others remain fixed.5
The components of observer models can be specified and tested in a number of ways.
Psychophysical paradigms and estimates from neurophysiology or brain imaging may all
provide constraints that specify aspects of observer models. In the next section, we describe
one observer model and provide illustrations of experiments that are used to specify the
model.
Observer models for detection and discrimination have been under development for many
years, at least since the 1950s.6 All of the early models begin with a relevant perceptual
template for the signal stimulus, a source of additive internal noise, and a decision module.
Other models may include important functions such as nonlinearity, multiplicative noise,
and decision uncertainty. A number of observer models have been developed over the past
few decades. The most prominent observer models include the linear amplifier model,7 the
induced noise model,8 the linear amplifier model with decision uncertainty,9 the induced
noise and uncertainty model,10 and the perceptual template model.5,11 The perceptual tem-
plate model (PTM) incorporates and integrates the major components of the previous
observer models and has been shown to provide an excellent account of a range of psy-
chophysical data.5
The components of the PTM are illustrated in figure 9.2. The PTM includes (1) a per-
ceptual template tuned for the signal stimulus in the target task. For example, the template
[Diagram: the stimulus passes through a perceptual template and a nonlinear transducer; multiplicative noise (Nm) and additive noise (Na) are then added before the decision stage.]
Figure 9.2
The PTM of a human observer (see text for explanation).
may be tuned for an oriented Gabor if that is the signal in a particular experiment. The
PTM incorporates (2) a nonlinear transducer function that mimics known nonlinearity in
physiological responses and can partially capture the nonlinear relationship between stimu-
lus intensity and the internal response. The PTM also includes two different internal noise
sources, (3) multiplicative and (4) additive internal noises, which account for the stochastic
or variable nature of the internal representations to the same stimulus. Multiplicative
internal noise increases with the contrast energy in the stimulus, and so is related to Weber's-law behavior. Additive internal noise limits absolute threshold for very low contrast stimuli. This part of the PTM (the template, nonlinearity, and internal additive and multiplicative noises) specifies the mean internal representation and its variability. The
PTM then incorporates a signal-detection module for the specific experimental task that
operates on the internal representation(s). The PTM provides a link from the stimulus input
to noisy internal representations, a link that is missing in signal-detection theory (SDT).
Here we develop the predictions of the PTM for a two-alternative forced-choice decision
task. The PTM, first introduced by Lu and Dosher,12 processes the input stimuli in two
pathways. In the signal pathway, input stimuli are processed through a perceptual tem-
plate that has selectivity for the particular characteristics of the signal stimulus (e.g., spatial
frequency, orientation, etc.). This pathway is where the template is used to look for the
signal in the stimulus. The response or gain of the template to a signal stimulus is quantified by the model parameter $\beta$. The gain of the template to Gaussian white noise is set (normalized) to 1.0. The signal pathway has a nonlinear transducer function [$\mathrm{Output} = \mathrm{sign}(\mathrm{Input})\,|\mathrm{Input}|^{\gamma_1}$], specified by the model parameter $\gamma_1$. This form of nonlinearity is also often used in modeling pattern vision.13,14
The output of the second pathway, the multiplicative noise pathway, controls the
amount of multiplicative internal noise. It also responds to the input stimuli but is normally
more broadly tuned than the signal pathway and may integrate (be sensitive to) energy
over a broad range of space, time, and features. This is implemented in equations as another
Observer Models 271
perceptual template with gain parameter $\beta_2$ to the signal stimulus and 1.0 to white external noise. The output of this gain-control template is also submitted to a nonlinear transducer function [$\mathrm{Output} = |\mathrm{Input}|^{\gamma_2}$], with parameter $\gamma_2$. The variance of multiplicative noise is
proportional to the output of the multiplicative noise path. The outputs of the two pathways
are combined with an additive internal noise to form the internal representation that is the
input to the decision module.
For paradigms where observers discriminate between two stimuli, the final response
may be determined by a difference rule; that is, the difference in the response of two dif-
ferent templates to the presented stimulus. If, for example, the observer is discriminating
two oriented Gabors at angles of +45° or −45° relative to vertical, the PTM compares the
response of the two templates to a test stimulus on each trial. One of the templates provides
a good match to the signal stimulus, the other mismatches.
In the template detector that matches the signal, the mean of the internal response is $(\beta c)^{\gamma_1}$, and the total variance of the internal response is

$$\sigma_{\mathrm{total}\,1}^2 = \sigma_{\mathrm{ext}}^{2\gamma_1} + N_{\mathrm{mul}}^2 \left[ \sigma_{\mathrm{ext}}^{2\gamma_2} + (\beta_2 c)^{2\gamma_2} \right] + \sigma_{\mathrm{add}}^2. \quad (9.1)$$
In the template detector that mismatches the signal (but matches another very different target stimulus), the mean of the internal response is 0, and the total variance of the internal response is

$$\sigma_{\mathrm{total}\,2}^2 = \sigma_{\mathrm{ext}}^{2\gamma_1} + N_{\mathrm{mul}}^2\, \sigma_{\mathrm{ext}}^{2\gamma_2} + \sigma_{\mathrm{add}}^2. \quad (9.2)$$
Using the decision module and equations for a two-alternative forced-choice task from chapter 8, the average signal-to-noise ratio ($d'$) in the PTM is

$$d' = \frac{(\beta c)^{\gamma_1}}{\sqrt{\sigma_{\mathrm{ext}}^{2\gamma_1} + N_{\mathrm{mul}}^2 \left[ \sigma_{\mathrm{ext}}^{2\gamma_2} + \dfrac{(\beta_2 c)^{2\gamma_2}}{2} \right] + \sigma_{\mathrm{add}}^2}}. \quad (9.3)$$

The numerator or signal part of the equation reflects the difference in response to the test stimulus between the two templates, $(\beta c)^{\gamma_1} - 0$.
$$P_c = \int_{-\infty}^{+\infty} g\!\left( x - (\beta c)^{\gamma_1},\, 0,\, \sqrt{\sigma_{\mathrm{ext}}^{2\gamma_1} + N_{\mathrm{mul}}^2 \left[ \sigma_{\mathrm{ext}}^{2\gamma_2} + (\beta_2 c)^{2\gamma_2} \right] + \sigma_{\mathrm{add}}^2} \right) G\!\left( x,\, 0,\, \sqrt{\sigma_{\mathrm{ext}}^{2\gamma_1} + N_{\mathrm{mul}}^2\, \sigma_{\mathrm{ext}}^{2\gamma_2} + \sigma_{\mathrm{add}}^2} \right) dx. \quad (9.4)$$
That is, percent correct is determined by the probability that the response of the matching
template is greater than the mismatching template, integrated over all possible values.
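Equations 9.1–9.4 can be bundled into a small calculator. A sketch (Python for illustration rather than the book's MATLAB; function and parameter names are ours, not the book's). Because equation 9.4 is the probability that one Gaussian variable exceeds another, it has the closed form Φ(Δμ/√(σ²_total1 + σ²_total2)), which equals Φ(d′/√2) for the d′ of equation 9.3:

```python
import math

def ptm_internal_stats(c, n_ext, beta, beta2, g1, g2, n_mul, s_add):
    """Mean of the matching template's response and the two total variances
    (eqs. 9.1 and 9.2); n_ext plays the role of sigma_ext."""
    mean = (beta * c) ** g1
    var_match = (n_ext ** (2 * g1)
                 + n_mul ** 2 * (n_ext ** (2 * g2) + (beta2 * c) ** (2 * g2))
                 + s_add ** 2)                                   # eq. 9.1
    var_mismatch = (n_ext ** (2 * g1)
                    + n_mul ** 2 * n_ext ** (2 * g2)
                    + s_add ** 2)                                # eq. 9.2
    return mean, var_match, var_mismatch

def ptm_dprime(c, n_ext, beta, beta2, g1, g2, n_mul, s_add):
    """Eq. 9.3: mean difference over the root of the average variance."""
    mean, v1, v2 = ptm_internal_stats(c, n_ext, beta, beta2, g1, g2, n_mul, s_add)
    return mean / math.sqrt((v1 + v2) / 2)

def ptm_pc(c, n_ext, beta, beta2, g1, g2, n_mul, s_add):
    """Eq. 9.4 in closed form: P(matching response > mismatching response)."""
    mean, v1, v2 = ptm_internal_stats(c, n_ext, beta, beta2, g1, g2, n_mul, s_add)
    z = mean / math.sqrt(v1 + v2)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))
```

Setting n_ext = 0 recovers the classical SDT situation without external noise.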
The equations for the PTM used here are analytic approximations to a fully stochastic
model. The PTM formulation uses equations that approximate the full stochastic model
by replacing several random variables with their expectations in specifying the noises. The
approximation has similar or identical properties to the full stochastic model in most
parameter regimes that have been observed so far.5,15
The PTM defines the relationship between the physical stimulus, the internal representa-
tion, and the response of the observer. It extends SDT by specifying the signal strength
and the variability of the internal representations from physical properties of the stimulus.
The classical SDT experiments typically correspond to a situation in which external noise is not present ($\sigma_{\mathrm{ext}}^2 = 0$), while the PTM is designed to account for performance in multiple
levels of external noise masking. Once the parameters of the PTM are specified by experi-
mentation, it provides comprehensive predictions of performance under many stimulus
conditions, including the condition typically used in SDT without external noise. The
properties of the PTM and how to specify them are described in the next section.
Figure 9.3
Demonstration of the equivalent input noise method. Three imagesa vertical sine wave (signal) with increasing
contrast from bottom to top, an external noise image with increasing variance from left to right, and an internal
noise image with a constant varianceare superimposed. Three contours of equal visibility of the signal grating
are traced with the smooth curves. The contour is nearly flat in low external noise conditions and rises with
external noise in high external noise conditions. The amplitude of the external noise at the elbow of the contour
provides an estimate of the variance of the internal noise.
At high levels of external noise, contrast threshold increases with the amount of external noise. In this region of the curve, the external noise is the limiting factor in performance. However, at very low levels of external noise, the contrast threshold for detection or discrimination barely depends upon the external noise. In this region of the curve,
the internal noises are the limiting factor in performance. The transition point where
external noise starts to control performance is related to the magnitude of the internal
noises. By referencing to the external noise, it is possible to peg the scale of the internal
noises to a physical quantity.
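The logic of figure 9.3 can be written down directly in the simplest slope-1 case. A hedged sketch (assuming a linear-amplifier-style TvC, c = k√(N_ext² + N_eq²), i.e., a special case of the PTM threshold equation with γ = 1 and the gain terms absorbed into the constants k and N_eq; the function names are ours):

```python
import math

def lam_threshold(n_ext, n_eq, k):
    """Slope-1 TvC: threshold contrast grows as the quadratic sum of the
    external noise contrast and the equivalent internal noise N_eq."""
    return k * math.sqrt(n_ext ** 2 + n_eq ** 2)

def equivalent_noise(c0, n_ext, c):
    """Recover N_eq from the zero-noise threshold c0 and one threshold c
    measured at external noise n_ext, using (c/c0)^2 = 1 + (n_ext/N_eq)^2."""
    return n_ext / math.sqrt((c / c0) ** 2 - 1)
```

At N_ext = N_eq (the elbow of the contour), the threshold is exactly √2 times the zero-noise threshold, which is why the elbow pegs the internal noise to a physical noise contrast.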
Frequently in experimental tests, the rising portion of the threshold versus external noise contrast (TvC) functions has a slope of 1.0. In this special case, we can set $\gamma = \gamma_1 = \gamma_2$, and rearrange the fundamental signal-to-noise equation to obtain threshold signal contrast $c$ as a function of external noise contrast $\sigma_{\mathrm{ext}}$ at a given performance criterion (i.e., $d'$):

$$c = \left\{ \frac{d'^2 \left[ (1 + N_{\mathrm{mul}}^2)\, \sigma_{\mathrm{ext}}^{2\gamma} + \sigma_{\mathrm{add}}^2 \right]}{\beta^{2\gamma} - N_{\mathrm{mul}}^2\, \beta_2^{2\gamma}\, d'^2 / 2} \right\}^{\frac{1}{2\gamma}}. \quad (9.5)$$
[Figure axes: (a) threshold signal contrast (c) versus contrast of external noise (Next), with curves for d′ = 1.0, 1.414, and 2.0; (b) threshold ratio (1.0 to 1.5) versus contrast of external noise, for the ratios c(d′ = 2.0)/c(d′ = 1.414) and c(d′ = 1.414)/c(d′ = 1.0).]
Figure 9.4
Illustration of hypothetical results of a triple-TvC paradigm. (a) TvC functions at three performance levels, corresponding to d′ = 1.0, 1.414, and 2.0. (b) Threshold ratios between two performance levels in each external noise condition: ratios between thresholds at d′ = 2.0 and d′ = 1.414, and ratios between thresholds at d′ = 1.414 and d′ = 1.0.
This is the equation for a TvC function at a given threshold level of performance accuracy. It follows directly from this equation that for any given external noise contrast, the ratio between the threshold signal contrasts at two performance levels (corresponding to $d_2'$ and $d_1'$) is

$$\frac{c_2}{c_1} = \left\{ \frac{d_2'^2 \left[ \beta^{2\gamma} - N_{\mathrm{mul}}^2\, \beta_2^{2\gamma}\, d_1'^2 / 2 \right]}{d_1'^2 \left[ \beta^{2\gamma} - N_{\mathrm{mul}}^2\, \beta_2^{2\gamma}\, d_2'^2 / 2 \right]} \right\}^{\frac{1}{2\gamma}}. \quad (9.6)$$
The PTM predicts that the ratio of threshold signal contrasts at two criterion performance levels for any given external noise contrast is a nonlinear function of the two corresponding $d'$ values and does not depend on the particular external noise level (figure 9.4b).
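The invariance of these ratios is easy to confirm numerically. A sketch (Python for illustration rather than the book's MATLAB; the parameter values are illustrative, and dprime_at is just the equal-exponent signal-to-noise equation, used to verify the threshold solution):

```python
import math

def tvc_threshold(dp, n_ext, beta, beta2, gamma, n_mul, s_add):
    """Eq. 9.5: threshold contrast at criterion dp (= d'), with
    gamma = gamma1 = gamma2; n_ext plays the role of sigma_ext."""
    num = dp ** 2 * ((1 + n_mul ** 2) * n_ext ** (2 * gamma) + s_add ** 2)
    den = beta ** (2 * gamma) - n_mul ** 2 * beta2 ** (2 * gamma) * dp ** 2 / 2
    return (num / den) ** (1 / (2 * gamma))

def dprime_at(c, n_ext, beta, beta2, gamma, n_mul, s_add):
    """Eq. 9.3 with equal exponents, for checking the inversion."""
    var = ((1 + n_mul ** 2) * n_ext ** (2 * gamma)
           + n_mul ** 2 * (beta2 * c) ** (2 * gamma) / 2
           + s_add ** 2)
    return (beta * c) ** gamma / math.sqrt(var)

pars = dict(beta=1.2, beta2=1.0, gamma=1.6, n_mul=0.4, s_add=0.02)

# Threshold ratio between two criterion levels at several external noise contrasts.
ratios = [tvc_threshold(2.0, n, **pars) / tvc_threshold(1.414, n, **pars)
          for n in (0.0, 0.05, 0.2, 0.8)]
```

The ratios agree to machine precision across external noise levels, which is the flat-line prediction plotted in figure 9.4b.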
Measurement of multiple TvC functions at different criterion performance levels pro-
vides sets of pairwise tests of these ratio properties. One way is to measure full psycho-
metric functions at each external noise level tested in TvC measurements. In this case,
estimating the thresholds for a given accuracy level and graphing these as a function of
external noise contrast provides a useful visual display and useful ratio tests as described
earlier. Measuring performance at three criterion levelsa three-point proxy for measure-
ment of the full psychometric functionsshould be sufficient to estimate the parameters,
including the nonlinearity parameters, of a PTM.11,15 A paradigm that measures the TvC
at three threshold levels is sometimes called a triple-TvC paradigm.
Display 9.1 shows a MATLAB program that implements a triple-TvC experiment and
a double-pass procedure (see section 9.3.2 for a description of double pass). The experi-
ment measures the contrast thresholds at multiple external noise levels in a two-alternative
Gabor orientation identification task. Observers determine the tilt of a Gabor as ±10° from vertical by responding "top tilted right" or "top tilted left." The experiment measures
thresholds using the method of constant stimuli and tests seven suitably chosen contrasts
for each of eight levels of external noise. Specific random seeds are used in the program
to set the random number generators that control all random events in the experiment (see
section 4.3.2 of chapter 4), including the randomized trial sequence and the specific
samples of external noise. The random seeds of the first four sessions are saved and then
used again to reconstruct the exact same sequence of trials and noise displays in sessions
five to eight. A double-pass experiment measures the response of the observer to exactly
the same stimuli and trial sequence to estimate the variability in the internal response. This
experiment corresponds to one described in Lu and Dosher.5 TvC functions at three per-
formance levels from such an experiment are shown in table 9.1 and figure 9.5.
Display 9.1
%% Experimental Module
if session > 4
    fileName = sprintf('TvC_rst_%s_%02.0f.mat', subjID, session - 4);
    S = load(fileName, 'p');
    p.randSeed = ClockRandSeed(S.p.randSeed);
    % reuse the seed saved by session (session - 4)
else
    p.randSeed = ClockRandSeed;
    % use the clock to seed the random number generator
end
if i > 1
    if WaitTill(Secs + p.ITI, 'esc'), break; end  % wait for the intertrial interval
    if strcmp(key, 'esc'), break; end  % press esc to stop
end
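The seed-saving logic in Display 9.1 can be sketched language-neutrally. In the sketch below, make_trial_sequence is a hypothetical stand-in for the experiment's randomization; the first pass records its seed and the second pass reuses it, so every random event replays identically.

```python
import random

def make_trial_sequence(seed, n_trials=5):
    """Hypothetical stand-in for the experiment's randomization: a shuffled
    contrast order plus one external noise sample per trial."""
    rng = random.Random(seed)                     # private, explicitly seeded
    contrasts = [0.02, 0.04, 0.08, 0.16, 0.32]
    rng.shuffle(contrasts)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_trials)]
    return contrasts, noise

# Pass 1 (e.g., session 1): pick a seed and save it with the data file.
saved_seed = 20130221
pass1 = make_trial_sequence(saved_seed)

# Pass 2 (e.g., session 5): reload the saved seed to replay the same trials.
pass2 = make_trial_sequence(saved_seed)
print(pass1 == pass2)  # True: identical trial order and noise samples
```

Because every random event flows through one explicitly seeded generator, saving the seed is sufficient to reconstruct the full stimulus and trial sequence, which is exactly what the double-pass design requires.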
Table 9.1
Threshold versus external noise contrast (TvC) functions at three performance levels (65%, 75%, and 85% correct)
Each value of the ratio between total internal noise and external noise results in a specific function relating percent correct to percent agreement (see figure 9.6). The variation in accuracy
may reflect a manipulation such as signal contrast, for example. For each function, percent
correct goes from chance, usually 50%, to very good, near 100%. As percent correct
approaches 100%, percent agreement also approaches 100%. As percent correct becomes
poor, the agreement in responses between the two copies of the exact same trial depends
on whether the errors reflect independent samples of internal noise, and so are not related,
or the same sample of external noise that is controlling the error response in the same way.
Figure 9.6 shows several hypothetical functions relating percent correct (PC) to
percent agreement (PA) for different ratios of the total amount of internal noise and
the amount of external noise. The performance for any given condition specified by
signal contrast and external noise level (c, N ext ) will generate a single (PC, PA) point
in this graph. Because the experimenter knows the external noise level as well as the
signal contrast, it is possible to estimate the total internal noise from the curve upon which the data point lies. By varying external noise and signal contrast, one can estimate the total internal noise for many conditions. By plotting estimated total internal noise as a function of the two stimulus variables, contrast and external noise, it is possible to get a sense of the functional relationship between total internal noise and the input stimulus.
Figure 9.5
TvC functions at three different performance levels (65%, 75%, and 85% correct). Thresholds (%) are plotted as a function of external noise contrast (%).
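Reading total internal noise off the theoretical curve on which an observed (PC, PA) point lies can be done by numerical inversion. The sketch below is a minimal implementation under standard SDT double-pass assumptions (Gaussian noises, unbiased 2AFC); the grid sizes and bisection bounds are arbitrary choices. It fabricates an "observed" point from a known internal/external noise ratio and then recovers that ratio from the point alone.

```python
import math

SQRT2 = math.sqrt(2.0)

def norm_pdf(x, sd):
    return math.exp(-x * x / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

def norm_cdf(x, sd):
    return 0.5 * (1.0 + math.erf(x / (sd * SQRT2)))

def pc_pa(S, s_int, s_ext=1.0, n=600):
    """Theoretical (PC, PA) for an unbiased 2AFC double pass under SDT:
    integrate over the frozen external-noise difference x."""
    lo, hi = S - 14.0 * s_ext, S + 14.0 * s_ext
    h = (hi - lo) / n
    pc = pa = 0.0
    for i in range(n + 1):
        x = lo + i * h
        w = h if 0 < i < n else h / 2            # trapezoid rule weights
        g = w * norm_pdf(x - S, SQRT2 * s_ext)   # density of the frozen noise
        G = norm_cdf(x, SQRT2 * s_int)           # P(correct | frozen noise)
        pc += g * G
        pa += g * (G * G + (1.0 - G) * (1.0 - G))
    return pc, pa

def signal_for_pc(target_pc, ratio):
    """Invert PC to find the signal strength S (bisection; bounds are ad hoc)."""
    lo, hi = 0.0, 40.0
    for _ in range(45):
        mid = 0.5 * (lo + hi)
        if pc_pa(mid, ratio)[0] < target_pc:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def estimate_ratio(pc_obs, pa_obs):
    """Find the internal/external noise ratio whose curve contains (PC, PA)."""
    lo, hi = 0.01, 20.0
    for _ in range(30):
        mid = 0.5 * (lo + hi)
        pa_mid = pc_pa(signal_for_pc(pc_obs, mid), mid)[1]
        if pa_mid > pa_obs:   # too much agreement -> internal noise too small
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

true_ratio = 1.0                                  # sigma_int / sigma_ext
S = signal_for_pc(0.80, true_ratio)
pc_obs, pa_obs = pc_pa(S, true_ratio)
est = estimate_ratio(pc_obs, pa_obs)
print(round(pc_obs, 3), round(pa_obs, 3), round(est, 2))  # recovers a value near true_ratio
```

Each observed condition contributes one such inversion; sweeping signal contrast and external noise level then maps out total internal noise as a function of the input stimulus, as described above.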
The PC versus PA double-pass functions for different ratios of total internal noise and
external noise are based on SDT, but are otherwise model independent. These generic
predictions assume Gaussian signal-plus-noise and noise distributions and are derived for
two-alternative forced-choice tasks without bias, where the total noise variance is \sigma_{total}^2 = \sigma_{int}^2 + \sigma_{ext}^2. The relevant
equations for PC and PA are

PC = \int g(x - S, 0, \sqrt{2}\sigma_{ext})\, G(x, 0, \sqrt{2}\sigma_{int})\, dx  (9.7)

PA = P[(S + N_{ext1} + N_{int1a}) > (N_{ext2} + N_{int2a})] \cdot P[(S + N_{ext1} + N_{int1b}) > (N_{ext2} + N_{int2b})]
   + P[(S + N_{ext1} + N_{int1a}) < (N_{ext2} + N_{int2a})] \cdot P[(S + N_{ext1} + N_{int1b}) < (N_{ext2} + N_{int2b})]
   = \int g(x - S, 0, \sqrt{2}\sigma_{ext}) \{ G^2(x, 0, \sqrt{2}\sigma_{int}) + [1 - G(x, 0, \sqrt{2}\sigma_{int})]^2 \}\, dx,  (9.8)

where g and G are the Gaussian density and cumulative distribution functions.
Figure 9.6
Probability correct (PC) versus probability consistent (PA) for a range of internal-to-external noise ratios σ = σ_int/σ_ext.
The subscripts 1 and 2 index two intervals/alternatives or two detectors; subscripts a and
b index different instances of internal noise.
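These agreement probabilities can also be checked by direct simulation: freeze the external noise samples across two passes, redraw the internal noise on each pass, and tally accuracy and agreement. A minimal sketch (trial counts and noise levels are arbitrary illustrative choices):

```python
import random

def double_pass(n_trials, S, sigma_ext, sigma_int, seed=0):
    """Simulate a 2AFC double-pass experiment: external noise samples are
    frozen across the two passes; internal noise is drawn fresh each pass."""
    rng = random.Random(seed)
    n_correct = 0
    n_agree = 0
    for _ in range(n_trials):
        e1, e2 = rng.gauss(0, sigma_ext), rng.gauss(0, sigma_ext)  # frozen
        resp = []
        for _ in (0, 1):  # two passes over the same external noise
            i1, i2 = rng.gauss(0, sigma_int), rng.gauss(0, sigma_int)
            resp.append((S + e1 + i1) > (e2 + i2))  # True = correct choice
        n_correct += resp.count(True)
        n_agree += (resp[0] == resp[1])
    return n_correct / (2 * n_trials), n_agree / n_trials

pc, pa = double_pass(50_000, S=1.5, sigma_ext=1.0, sigma_int=1.0)
print(round(pc, 3), round(pa, 3))
```

With zero internal noise the two passes agree perfectly; as internal noise dominates, agreement falls toward the independent-responses floor PC^2 + (1 - PC)^2, which is the leverage the double-pass method exploits.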
Unlike this general SDT treatment, the PTM as an observer model further specifies how
the internal noises arise and provides an explicit equation for the total internal noise as a
function of external noise and the model parameters N_m, N_a, γ, and β. The PTM predicts (PC,
PA) data points for all of the manipulated conditions in a given experiment through estimat-
ing these system parameters. From this, you can estimate the internal noises directly by
fitting PTM equations (see sections 10.4.2 and 10.4.4 of chapter 10).
The PTM parameters can be estimated from a triple-TvC experiment. It is also possible
to implement both the triple-TvC and double-pass procedures jointly in the same experiment.
The data on the relationship between percent correct and percent agreement provide addi-
tional complementary constraints for estimating the PTM parameters. Additional empirical
constraints can improve the quality of estimation and also provide additional theoretical
challenges to test the model.5
Figure 9.7
Results of a double-pass experiment: probability correct (PC) versus probability consistent (PA) for an average of three observers, with a separate series for each of the eight external noise levels (based on data reported in Lu and Dosher5).
In the example experiment in display 9.1, double pass can be introduced by exactly
repeating sessions. In this program, sessions 5–8 repeat the exact sequence of trials,
signals, and external noises as sessions 1–4, and sessions 13–16 repeat sessions 9–12.
Figure 9.7 shows a resulting graph of percent correct as a function of percent agreement
for an average of three observers.
Empirically, most or all of the observed functions relating PC (percent correct) to PA
(percent agreement) in the literature are quite similar across external noise conditions. This
suggests that the ratio of internal to external noise approaches a constant,5,8 which in
turn implies the dominance of multiplicative noise over additive noise, consistent with
Weber's law.27
9.3.3 Discussion
Observer models provide an important approach to specifying the relationship between
the stimulus and the internal representation and the response in a psychophysical task. The
PTM observer model has been applied to data from TvC functions at multiple criteria from
many different tasks and manipulations of attention, learning, and other observer states.
It also has been applied to several double-pass experiments.5 The model framework has
provided an excellent account of a wide range of data and has done well in comparison
with other models of its general class. As a comprehensive model that incorporates the
ideas of many earlier models, its success validates this approach to understanding the
relationship between the stimulus and the internal representation and decision of the psychophysical observer.
Most external noise experiments in the traditional literature only measured the TvC
function at a single performance level. In these cases, a simple linear amplifier model7
consisting of a single template and additive internal noise, followed by decision, provides
an adequate account of performance. However, many other experiments suggest that the
simple linear amplifier model cannot account for nonlinearities in the responses of the
visual system. These nonlinearities were revealed by the triple-TvC or full psychometric
function methods in the previous section. Indeed, in the visual domain, d′ is thought to
increase as a power function of signal contrast.13,2834 The linear amplifier model essentially
always fails as soon as even two performance levels are considered jointly; it requires
different internal system parameters to account for the data at the different performance
levels.
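This failure can be demonstrated in a few lines. The linear amplifier model forces the threshold ratio between two criterion levels to equal d'_2/d'_1 at every external noise level, whereas an observer whose d′ grows as a power function of contrast (exponent GAMMA below, an assumed value purely for illustration) produces a different ratio.

```python
# Hypothetical nonlinear observer: d' grows as a power function of contrast.
GAMMA = 2.0  # transducer exponent (assumed for illustration)

def threshold_nonlinear(d, n_ext, beta=1.0, n_add=0.05):
    # d' = (beta*c)^GAMMA / sqrt(n_ext^2 + n_add^2)  =>  solve for c
    return ((d * (n_ext ** 2 + n_add ** 2) ** 0.5) ** (1 / GAMMA)) / beta

d1, d2 = 1.0, 2.0
lam_ratio = d2 / d1  # the linear amplifier model predicts exactly this ratio
obs_ratio = threshold_nonlinear(d2, 0.2) / threshold_nonlinear(d1, 0.2)
print(lam_ratio, round(obs_ratio, 3))  # 2.0 vs 2**(1/GAMMA) = 1.414
```

A linear amplifier model fit separately at each criterion level can absorb the mismatch into different internal parameters, but no single parameter set can produce both ratios at once.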
Decision uncertainty (see section 8.6.2 of chapter 8) has been proposed as one explana-
tion for the observed nonlinearities.9 The idea is that many task-irrelevant or hidden
channels may contribute to a detection or discrimination decision because the observer is
in some way uncertain about the nature of the signal or where the signal is encoded in the
visual system. For example, perhaps the observer samples not just the spatial frequency
or orientation of the stimulus, but some other orientation and spatial frequency channels
as well. The uncertainty account presumes that the observer does not know which evidence
to consider, and so includes information from irrelevant sources.
Results from double-pass experiments clearly show strong evidence for multiplicative
internal noise, which was not incorporated into the linear amplifier model. Models have
been proposed to include multiplicative noise8 and to include multiplicative noise and
uncertainty,10 as well as additive internal noise.
The PTM observer model uses nonlinear transducers and multiplicative noise (that
reflects the total energy in the stimulus and not just external noise in the stimulus) instead
of incorporating uncertainty and an unknown number of hidden channels. The PTM
has outperformed all of the other observer models in accounting for a wide range of
experiments.5
The formulation of the PTM described in this section is mathematically equivalent
to a development in which system nonlinearities are recast as contrast gain control.35
In contrast gain control, the magnitude of the internal representation is scaled or
normalized by a measure of the relevant total contrast energy in the input stimulus.
284 Chapter 9
This contrast gain control form is consistent with a rearrangement of the standard
PTM equations.
The PTM as an observer model was developed in the context of external noise masking, in which the external noises are random. A significant parallel development of observer
models occurred in the context of pattern-masking experiments.13,14,32,3641 In pattern
masking, a pattern stimulus rather than a noise masking stimulus is combined with the
signal stimulus that the observer is judging. Although pattern masking and external noise
are somewhat different experimental techniques, they both test the same visual system.
Models of performance in the two domains should share core properties. The PTM
model is functionally very similar to the models developed to account for performance
in pattern masking.14 Many of the other observer models are not. The parallels between
the two developments are especially obvious for the gain-control formulation of
the PTM.
All of the existing observer models, including the PTM, have been developed to account
for the detection or discrimination of orthogonal or nearly orthogonal stimulistimuli that
differ substantially from one another. To handle the discrimination of very similar stimuli,
the PTM must consider close and overlapping templates, both of which may respond to
the same stimulus, although the response of one template will be stronger than the other.
Very closely similar templates will be correlated in their responses to external noise, and
so extensions to discrimination of close stimuli may also consider the correlations in
response. One such development has been proposed in an elaborated PTM.42
The methods we have discussed so far estimate the response gain of the perceptual tem-
plate to the stimulus as well as the different sources of perceptual noise, including multi-
plicative and additive noise, and system nonlinearity. Critical band masking and reverse
correlation, or classification images, can also be used to further estimate and specify the
properties of the perceptual template. Critical band masking uses masks with specific pat-
terns to discover which features drive the response of the perceptual template. Reverse
correlation, or classification images, estimates template sensitivity by categorizing external
noise samples by the response they produce to infer which external noise features drive
the response. These other masking methods estimate the sensitivity of the perceptual
template model to stimulus properties including spatial frequency, orientation, spatial
location, and temporal location.
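The reverse-correlation idea can be sketched with a simulated observer whose one-dimensional template is known in advance (the template, trial count, and noise levels below are all made-up illustrative values): categorize the noise samples by the response they produced, and the difference of the category means recovers the template shape.

```python
import random

random.seed(3)
TEMPLATE = [0.0, 0.2, 0.6, 1.0, 1.0, 0.6, 0.2, 0.0]  # assumed 1-D template
N_PIX, N_TRIALS = len(TEMPLATE), 20_000

sums = {True: [0.0] * N_PIX, False: [0.0] * N_PIX}
counts = {True: 0, False: 0}
for _ in range(N_TRIALS):
    noise = [random.gauss(0.0, 1.0) for _ in range(N_PIX)]
    drive = sum(w * x for w, x in zip(TEMPLATE, noise)) + random.gauss(0.0, 1.0)
    resp = drive > 0                     # simulated "signal present" response
    counts[resp] += 1
    for i, x in enumerate(noise):
        sums[resp][i] += x

# Classification image: mean noise on "yes" trials minus mean on "no" trials.
ci = [sums[True][i] / counts[True] - sums[False][i] / counts[False]
      for i in range(N_PIX)]
print([round(v, 2) for v in ci])         # roughly proportional to TEMPLATE
```

The recovered profile is proportional to the template because only noise components aligned with the template systematically push the decision variable across the criterion.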
Stimulus energy outside the sensitivity of the perceptual template will not influence behavior. The
overall response of the perceptual template to a signal stimulus in the PTM is captured by
the single parameter . However, the response of the template can also be specified as a
function of any manipulated variable, v, where v could stand for the spatial frequency,
orientation, time, or spatial location (or some combination).
The template response to the stimulus along the manipulated variable v is T_S(v), the
amplitude of the signal stimulus is S_S(v), and the amplitude of the external noise is F(v).
Then the output of the template-matching stage through the signal path for the stimulus
(signal and the noise) can be expressed as43

S_1 = \beta c \int T_S(v) S_S(v)\, dv  (9.9)

N_1^2 = \sigma_{ext}^2 \int T_S^2(v) F^2(v)\, dv.  (9.10)
The parameter β is the gain of the template to a signal stimulus relative to external noise.
Similarly, the sensitivity of the template in the multiplicative noise pathway to the stimulus
is T_N(v). Then, the output of the template for signal and external noise through the multiplicative noise path can be expressed as

S_2 = c \int T_N(v) S_S(v)\, dv  (9.11)

N_2^2 = \sigma_{ext}^2 \int T_N^2(v) F^2(v)\, dv.  (9.12)
Analogous to the simple PTM, the discriminability d′ depends upon the output of the
signal path compared with the total noise:

d' = \frac{S_1}{\sqrt{N_1^2 + N_{mul}^2 (S_2^2 + N_2^2) + \sigma_{add}^2}}.  (9.13)
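With the stimulus dimension v discretized, these template-overlap integrals reduce to sums. The sketch below uses made-up Gaussian profiles for the templates and signal spectrum (not the fitted templates of ref. 43) simply to show the computation and its qualitative behavior.

```python
import math

# Discretized stimulus dimension v (e.g., orientation in degrees). All profiles
# below are illustrative assumptions, not fitted templates.
V = [i * 0.5 for i in range(-90, 91)]          # -45 to 45 deg in 0.5-deg steps
DV = 0.5

def gaussian(v, mu, sd):
    return math.exp(-((v - mu) ** 2) / (2 * sd ** 2))

T_S = [gaussian(v, 0, 10) for v in V]          # signal-path template
T_N = [gaussian(v, 0, 20) for v in V]          # multiplicative-noise-path template
S_S = [gaussian(v, 0, 5) for v in V]           # signal amplitude spectrum
F = [1.0] * len(V)                             # white external noise spectrum

def d_prime(c, sigma_ext, beta=1.0, n_mul=0.1, sigma_add=0.05):
    S1 = beta * c * sum(t * s for t, s in zip(T_S, S_S)) * DV                    # signal path
    N1sq = sigma_ext ** 2 * sum(t * t * f * f for t, f in zip(T_S, F)) * DV      # filtered noise
    S2 = c * sum(t * s for t, s in zip(T_N, S_S)) * DV                           # mult.-noise path
    N2sq = sigma_ext ** 2 * sum(t * t * f * f for t, f in zip(T_N, F)) * DV
    return S1 / math.sqrt(N1sq + n_mul ** 2 * (S2 ** 2 + N2sq) + sigma_add ** 2)

print(round(d_prime(0.1, 0.05), 3), round(d_prime(0.1, 0.5), 3))
```

As expected, d′ grows with signal contrast and falls as external noise contrast increases, because only the noise energy passed by the template contributes to the denominator.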
In figure 9.8, we show an example of a series of high-pass and low-pass filters in the
spatial frequency domain and examples of external noise through the low-pass and high-
pass filters. Figure 9.8a and b depict different spatial frequency cutoffs in a Fourier space
representation. (See section 3.3.9 of chapter 3 for a discussion of Fourier representations
and the fast Fourier transform and how to program band-pass filters.) Figure 9.8c shows
corresponding example noises for the low-pass series, and figure 9.8d shows example
noises for the high-pass series. Figure 9.8e shows the estimated thresholds as a function
of cutoff for three observers. Figure 9.8f shows the template sensitivities for different
spatial frequencies estimated from these threshold profiles. This example is taken from
ref. 43.
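Constructing filtered external noise follows the recipe in section 3.3.9 of chapter 3: transform the noise, zero the coefficients outside the pass band, and transform back. A minimal one-dimensional sketch, using a naive O(n²) DFT as a stand-in for the fast Fourier transform (cutoff and sample count are arbitrary):

```python
import cmath, math, random

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

random.seed(1)
n = 64
noise = [random.gauss(0.0, 1.0) for _ in range(n)]

# Low-pass filter: keep only frequencies up to the cutoff (and their conjugates).
cutoff = 8
spectrum = dft(noise)
filtered_spec = [X if (k <= cutoff or k >= n - cutoff) else 0j
                 for k, X in enumerate(spectrum)]
lowpass_noise = idft(filtered_spec)

# Energy outside the pass band has been removed.
hp_energy = sum(abs(X) ** 2 for k, X in enumerate(dft(lowpass_noise))
                if cutoff < k < n - cutoff)
print(round(hp_energy, 6))  # ~0: no energy remains outside the pass band
```

The two-dimensional case used in figure 9.8 works the same way, with a radially defined cutoff region in the (f_x, f_y) plane; keeping the conjugate-symmetric coefficients guarantees a real-valued noise image.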
In figure 9.9, we show another example in which the orientation content of the external
noise is filtered around 45° with increasing orientation bandwidths to include energy from
more and more orientations, until all orientations are represented. Examples of external
noise samples are shown from left to right in figure 9.9a for increasing width of the pass
band. Filtered external noise in the orientation domain has been used to estimate the ori-
entation bandwidth of perceptual templates.35
In our next example, the spatial footprint of the perceptual template is estimated by
adding external noise in different-sized spatial rings that cover different parts of a signal
stimulus, which in this example is an oriented Gabor. Testing different combinations of
external noise rings estimates the spatial footprint or profile of the template (and tests for
interactions between regions of space).44 If external noise appears in regions that do not
drive the template, the noise will not elevate threshold, whereas if the external noise
appears in regions that are heavily weighted in the template, there will be significant
threshold elevation. Figure 9.10 shows examples of different spatial patterns of external
noise rings and the weights on different rings in the spatial templateor the template
footprint. The threshold data from which the weights are estimated (16 multipoint psycho-
metric functions) are omitted.
Finally, an analogous manipulation can be carried out in the time domain; external noise
is added at various points in time before and after the signal test stimulus. Figure 9.11
shows examples of external noise frames at various points surrounding the signal image
frame, from which we can estimate the temporal window of the perceptual template.45
In these examples, we have provided sample estimates of the spatial frequency, orienta-
tion tuning, spatial footprint, and temporal window of the perceptual template. Each result
is associated with a relatively large psychophysical experiment. These estimates of the
perceptual template may be related to neurophysiological measures of the tuning properties
of visual neurons or visual areas. These same methods have also been used to study how
the template may change under manipulations of the state of the observer, for example
due to adaptation, attention, or perceptual learning.35,44–46
In an ideal case, the perceptual template could be tested in all these dimensions simul-
taneously. However, joint measurements of different properties, such as spatial frequency
Figure 9.8
Estimating the spatial frequency sensitivity of the template. (a, b) Two-dimensional low-pass and high-pass
spatial-frequency filters with seven different passbands. (c, d) From left to right, examples produced by filtering
a Gaussian white noise image through the seven filters in (a) and (b). (e) Contrast threshold as a function of pass
band of the low-pass and high-pass filters for three observers at 70% correct performance level. The curves were
from fits to the PTM. (f) The best-fitting spatial-frequency sensitivity of the perceptual templates. The arrows
on the x-axis indicate the center frequency of the signal stimulus (from Lu and Dosher43).
Figure 9.9
Estimating the orientation sensitivity of the template. (a) Orientation-filtered external noise around 45° by filters
of different orientation bandwidths. The number above each panel indicates the filter bandwidth in degrees of
orientation. (b, c) Estimated perceptual templates in the signal and gain-control paths (after Dao, Lu, and
Dosher35).
Figure 9.10
Estimating the spatial footprint of the template. (a) Sixteen external noise conditions, consisting of all possible
combinations of spatial rings 1, 2, 3, and 4, indicated by the labels above the panels. For example, [124] indicates
the combination of these three rings. (b) Estimated weights from the PTM for each of the four rings of external
noise in attended and unattended conditions. Also graphed is the proportion root mean square (RMS) contrast
for each ring in the signal Gabor stimulus. The spatial profiles of both the attended and the unattended
conditions match the spatial profile of the stimulus (after Dosher, Liu, Blair, and Lu44).
Figure 9.11
Estimating the temporal window of the template. (a) Signal and external noise temporal configurations. The zero
noise [N0] and the four non-overlapping basic configurations, [N1], [N2], [N3], and [N4], are shown. Additional
mixtures of the basic temporal configurations can also be constructed. (b) Derived temporal characteristics of
the perceptual templates (template weights as a function of target–noise SOA in milliseconds) from the best-fitting PTM (after Lu, Jeon, and Dosher45).
and orientation and space and time, are relatively impractical due to the large data collec-
tion demands. Investigating one dimension at a time could provide information about
optimal positioning for tests in an experiment that simultaneously manipulates multiple
dimensions.
Characterizing the detection or discrimination template within the context of the PTM
observer framework is essential to correctly understand and estimate all observer proper-
ties, including the contributions of internal noises and nonlinearities in performance. The
PTM framework allows us to separate perceptual factors from decision factors in the data.
This allows the estimation of the underlying perceptual template without contamination
by nonlinearity or decision factors.
The observer models in the previous sections account for visual detection and discrimination behavior by a single system describing the input–output functions exhibited by observers. A few equations summarize the overall behavior of the observer. The advantage of
this approach is that estimating a small number of parameters allows us to predict perfor-
mance in a wide range of conditions. In addition, good estimation protocols have been
developed for this purpose. The framework has been extremely successful in characterizing
and understanding changes in observer state because usually only one parameter or a few
parameters change from one state to another.12,35,58
Figure 9.12
Classification images for simple vernier stimuli. (a, b) The vernier stimuli (1 pixel = 1.26 min arc). (c, d) A raw
classification image (c) and the same image smoothed and quantized (d), so only weights significantly different
from zero are colored differently from the gray background (after Ahumada47).
The template in the PTM represents the sensitivity of the whole observer system to the
set of stimuli in a particular task. In the visual system, the overall template must be created
from neural receptors or visual detectors. The human visual system that embodies the
behavior involves the complex interaction of many neural subsystems and many kinds of
detectors. In an effort to understand more fully how neural systems produce the observer's
behavior, researchers have developed models that include many detectors or channels that
correspond more closely with known physiologic properties of the visual system. Any
model of this nature is still an approximation of the real neural system. It will be closer
to neural reality, but also more complex to specify.5965
Receptors in the visual system are selectively responsive to visual inputs. They are tuned
to respond to stimuli with certain properties, such as a particular orientation or spatial
frequency. One theoretical approach tries to predict the response to the stimulus from the
responses of groups of cells, or channels, tuned to somewhat different properties. These
are called multichannel observer models.
A multichannel observer model combines responses from many channels. Each channel
may be thought of as a unit with PTM-like propertieseach has tuning, noisiness, gain
control, or other nonlinearity. The tuning could be in terms of features such as spatial
frequency or orientation but may also include tuning in space. Most models assume channels are duplicated in detectors at many spatial locations. More complex models include systematic variations in channel properties, or their distribution in space, as a function of visual eccentricity.66
Figure 9.13
A multichannel observer model. The signal-plus-noise image is shown on the left and is subsequently processed through standard spatial-frequency-tuned visual channels (shown schematically) to yield the filtered images shown. The image power is passed through the nonlinear transducer functions. Each visual channel is illustrated with its own internal multiplicative and additive noise sources representing processing inefficiencies. Finally, the integrated output of these visual channels is input to a decision process, illustrated at right (after Dosher and Lu58).
Figure 9.13 shows a schematic of a hypothetical multichannel observer model.15,58 A
bank of channels or detectors processes a single image. This schematic shows a stimu-
lus image processed through spatial frequency channels, those tuned to very slowly
varying low-frequency patterns up to those tuned to very-high-frequency patterns. The
images attached to each channel show the stimulus image as filtered by that channel.
The most useful information will appear in channels whose spatial frequency tuning
matches the spatial frequency content of the signal stimuli. The output of each channel
(possibly at many spatial positions) is subject to gain control nonlinearities and internal
noises.
The information available in principle to the observer is the output of each of these
channels. The observer's job is to integrate all of these pieces of information into a detection or discrimination decision. In this schematic, the decision is made by adding together
information from various channels with different weights.
The multichannel observer model illustrated in figure 9.13 has been implemented by
Petrov, Dosher, and Lu67,68 in the context of perceptual learning. The model has success-
fully accounted for performance for stimuli of different contrasts at several performance
levels and is able to reproduce the classical patterns in threshold versus contrast functions
observed in external noise experiments.69
Different multichannel observer models have different implementations of the channels
and their properties and make different assumptions about how evidence is combined, or
pooled over both space and channels, depending on the task.36,60,62,70–75 At the theoreti-
cal extreme, the channels might be individual neurons or groups of neurons, the properties
could be based on reported physiological properties of those neurons, and the model would
be a population model consisting of many neurons.76,77
In a major effort to specify and test multichannel observer models in visual detection,
a large group of visual psychophysicists produced a set of baseline data for foveal detec-
tion of representative spatial patterns and used these data as a test bed to evaluate models.
This enterprise was called Modelfest.63–65,78–80 The Modelfest data set includes contrast
threshold measurements from several laboratories for 43 foveal achromatic contrast stimuli
(figure 9.14). Included are a series of stimuli with different spatial frequencies to specify
the contrast sensitivity function; Gabors of several sizes; Gabors with fixed numbers of
cycles, Gabors of fixed extent, and elongated Gabors of different spatial frequencies; oriented
stimuli with different spatial arrangements; edge and line stimuli; several compound
pattern stimuli such as a checkerboard; and a few more naturalistic stimuli.
Watson and Ahumada63 built a framework consisting of a number of stages as a structure
for examining multichannel observer models. In the most complete version of the frame-
work, a luminance pattern is converted to contrast, passed through a contrast sensitivity
function (CSF), subjected to an oblique effect, cropped by a spatial aperture, and followed
by channel analysis. Then the outputs of the channels are pooled together with a so-called
Minkowski metric62,73–75:

c_{T,q} = \left( p_x p_y \sum_{y=1}^{N_y} \sum_{x=1}^{N_x} |r_{x,y}|^{\beta} \right)^{-1/\beta}.  (9.14)

This metric specifies the pooling over space for a given channel q, with Minkowski exponent β. The value c_{T,q} is the
contrast threshold, p_x and p_y are the width and height of each pixel in degrees, and the r_{x,y}
are the pixel values of the new image after preprocessing. For multiple channels, the pooled
value is

c_T = \left( \sum_{q=1}^{Q} c_{T,q}^{-\beta} \right)^{-1/\beta}.  (9.15)
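Minkowski pooling over space and then over channels can be sketched directly. The channel responses, pixel sizes, and pooling exponent below are all made-up illustrative values, not fitted Modelfest parameters.

```python
# Minkowski pooling: pool unit-contrast channel responses over space to get a
# per-channel threshold, then pool the per-channel thresholds across channels.
BETA_POOL = 4.0   # Minkowski exponent (illustrative; fitted values vary)
PX = PY = 0.01    # pixel width and height in degrees (assumed)

def channel_threshold(responses):
    """responses: 2-D list of pixel responses for one channel at unit contrast."""
    pooled = sum(PX * PY * abs(r) ** BETA_POOL for row in responses for r in row)
    return pooled ** (-1.0 / BETA_POOL)

def overall_threshold(per_channel_thresholds):
    return sum(c ** -BETA_POOL for c in per_channel_thresholds) ** (-1.0 / BETA_POOL)

# Two hypothetical channels: one well matched to the stimulus, one poorly matched.
strong = [[0.9, 0.7], [0.7, 0.5]]
weak = [[0.1, 0.1], [0.1, 0.0]]
ct_strong, ct_weak = channel_threshold(strong), channel_threshold(weak)
ct = overall_threshold([ct_strong, ct_weak])
print(ct_strong < ct_weak, ct <= ct_strong)  # True True: pooling only lowers threshold
```

A large exponent makes the pooled value approach the most sensitive pixel or channel (winner-take-all), while an exponent near 1 approximates linear summation; intermediate fitted values describe partial probability summation.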
Figure 9.14
(a) Modelfest stimuli. Each is a monochrome image subtending 2.133° × 2.133°. The index numbers have been
added for identification and were not present in the stimuli. (b, c) Average Modelfest thresholds in dB (b) and
in dBB (c). Each point is the mean of 16 observers, and the error bars indicate ±2 SE. The dBB is a measure
of the contrast energy of a stimulus, normalized by a nominal minimum threshold of 10^-6 deg^2 s. Zero dBB is
defined so as to approximate the minimum visible contrast energy for a sensitive human observer (after Watson
and Ahumada63).
A number of variants of this full model did a reasonable job of accounting for the pattern
of thresholds over the 43 stimuli.63 Although the results are excellent, they also illustrate
the limited power of even this large data set in its ability to distinguish between different
models. This partly reflects the fact that the Modelfest data set does not include psycho-
metric functions, or multiple performance levels, or external noise. Ideal extended tests
might involve examining all of these patterns using both external noise and contrast psy-
chometric functions. Another important conclusion from this exercise63 is that the template
models, such as the PTM, could also account for these results without multiple channels
by assuming known templates for a few basic patterns. For example, some researchers
have concluded that three basic templates (a spot, a bar, and a grating) account for the
Modelfest data.78
9.6 Summary
Observer models aim to provide a strong functional understanding of the internal processes
in vision or perception. They provide a quantitative framework that allows us to specify
and understand how stimuli are transformed into internal representations and then choice
behavior in the mind of the observer. The best models can provide a very compact formu-
lation that predicts behavioral outcomes across many stimulus variations and conditions
through the estimation of a few parameters that specify the sensitivity, nonlinearity, and
internal noises of the observer. Often, the observer model framework provides key insights
into the nature of the processes that determine the internal representation.
This chapter considered a range of modern single-channel observer models, with a focus
on the PTM. The PTM incorporates many features of earlier models and has proved
quite successful. We also discussed several approaches that involve multichannel
observer models that, to varying degrees, approximate the complexity of human brain
processes.
The observer approach and associated paradigms such as the external noise method
provide one complementary approach to psychophysical scaling. By measurement of
sensitivity or thresholds across a wide range of stimulus contrasts, external noises, and
performance levels, observer models derive the internal representation of stimuli. If the
goal of scaling is to measure the internal representation, then the observer approach suc-
ceeds in specifying that internal representation through the transducer functions and
contrast gain control systems.
The observer models, whether single-channel or multichannel, also provide a framework
that can be further exploited in order to understand how the system changes when the state
of the observer changes due to factors such as attention, learning, or adaptation. The
behavioral consequences of these manipulations can often be summarized by a change in
one or two parameters in the observer model, and these in turn provide insights and set
the stage for many other investigations of the human perceptual systems.
References
1. Sperling G. 1989. Three stages and two systems of visual processing. Spat Vis 4: 183–207.
2. Petrov AA, Dosher BA, Lu ZL. 2005. The dynamics of perceptual learning: An incremental reweighting model. Psychol Rev 112: 715–743.
3. Bejjanki VR, Beck JM, Lu ZL, Pouget A. 2011. Perceptual learning as improved probabilistic inference in early sensory areas. Nat Neurosci 14: 642–648.
4. Pelli DG, Farell B. 1999. Why use noise? J Opt Soc Am A Opt Image Sci Vis 16: 647–653.
5. Lu ZL, Dosher BA. 2008. Characterizing observers using external noise and observer models: Assessing internal representations with external noise. Psychol Rev 115: 44–82.
6. Barlow HB. 1956. Retinal noise and absolute threshold. J Opt Soc Am 46: 634–639.
7. Pelli DG. 1981. Effects of visual noise. PhD dissertation, Physiology Department, Cambridge University.
8. Burgess AE, Colborne B. 1988. Visual signal detection: IV. Observer inconsistency. J Opt Soc Am A 2: 617–627.
9. Pelli DG. 1985. Uncertainty explains many aspects of visual contrast detection and discrimination. J Opt Soc Am A 2: 1508–1532.
10. Eckstein MP, Ahumada AJ Jr, Watson AB. 1997. Visual signal detection in structured backgrounds: II. Effects of contrast gain control, background variations, and white noise. J Opt Soc Am 14: 2406–2419.
11. Lu Z-L, Dosher BA. 1999. Characterizing human perceptual inefficiencies with equivalent internal noise. J Opt Soc Am A Opt Image Sci Vis 16: 764–778.
12. Lu Z-L, Dosher BA. 1998. External noise distinguishes attention mechanisms. Vision Res 38: 1183–1198.
13. Foley JM, Legge GE. 1981. Contrast detection and near-threshold discrimination in human vision. Vision Res 21: 1041–1053.
14. Foley JM. 1994. Human luminance pattern-vision mechanisms: Masking experiments require a new model. J Opt Soc Am A Opt Image Sci Vis 11: 1710–1719.
15. Dosher BA, Lu Z-L. 1999. Mechanisms of perceptual learning. Vision Res 39: 3197–3221.
16. North DO. 1942. The absolute sensitivity of radio receivers. RCA Review 6: 332–344.
17. Friis HT. 1944. Noise figures of radio receivers. Proceedings of the IRE 32: 419–422.
18. Mumford WW, Schelbe EH. 1968. Noise performance factors in communication systems. Dedham, MA: Horizon House-Microwave Inc.
19. Nagaraja NS. 1964. Effect of luminance noise on contrast thresholds. J Opt Soc Am 54: 950–955.
20. Burgess AE, Wagner RF, Jennings RJ, Barlow HB. 1981. Efficiency of human visual signal discrimination. Science 214: 93–94.
21. Legge GE, Kersten D, Burgess AE. 1987. Contrast discrimination in noise. J Opt Soc Am A 4: 391–404.
22. Swets JA, Shipley EF, McKey MJ, Green DM. 1959. Multiple observations of signals in noise. J Acoust Soc Am 31: 514–521.
23. Green DM. 1964. Consistency of auditory detection judgments. Psychol Rev 71: 392–407.
24. Levi D, Klein S. 2003. Noise provides some new signals about the spatial vision of amblyopes. J Neurosci 7: 2522–2526.
25. Gold J, Bennett PJ, Sekuler AB. 1999. Signal but not noise changes with perceptual learning. Nature 402:
176178.
26. Chung STL, Levi DM, Tjan B. 2005. Learning letter identification in peripheral vision. Vision Res 45:
13991412.
27. Weber EH. De Pulsu, resorptione, auditu et tactu: Annotationes anatomicae et physiologicae. Leipzig: CF
Koehler; 1834.
28. Cohn TEA. 1974. New hypothesis to explain why the increment threshold exceeds the decrement threshold.
Vision Res 14: 12771279.
29. Leshowitz B, Taub HB, Raab DH. 1968. Visual detection of signals in the presence of continuous and pulsed
backgrounds. Percept Psychophys 4: 207213.
30. Nachmias J. 1981. On the psychometric function for contrast detection. Vision Res 21: 215223.
31. Nachmias J, Kocher EC. 1970. Visual detection and discrimination of luminance increments. J Opt Soc Am
60: 382389.
32. Nachmias J, Sansbury RV. 1974. Grating contrast: Discrimination may be better than detection. Vision Res
14: 10391042.
33. Stromeyer CF, Klein S. 1974. Spatial frequency channels in human vision as asymmetric (edge) mechanisms.
Vision Res 14: 14091420.
34. Tanner WP, Jr. 1961. Physiological implications of psychophysical data. Ann N Y Acad Sci 89: 752765.
35. Dao DY, Lu Z-L, Dosher BA. 2006. Adaptation to sine-wave gratings selectively reduces the contrast gain
of the adapted stimuli. J Vis 6: 739759.
36. Fredericksen RE, Hess RF. 1997. Temporal detection in human vision: Dependence on stimulus energy. J Opt Soc Am 14: 2557–2569.
37. Gorea A, Sagi D. 2001. Disentangling signal from noise in visual contrast discrimination. Nat Neurosci 4: 1146–1150.
38. Klein SA, Levi DM. 1985. Hyperacuity thresholds of 1 sec: Theoretical predictions and empirical validation. J Opt Soc Am A 2: 1170–1190.
39. Kontsevich LL, Chen CC, Tyler CW. 2002. Separating the effects of response nonlinearity and internal noise psychophysically. Vision Res 42: 1771–1784.
40. Legge GE, Foley JM. 1980. Contrast masking in human vision. J Opt Soc Am 70: 1458–1471.
41. Watson AB, Solomon JA. 1997. Model of visual contrast gain control and pattern masking. J Opt Soc Am 14: 2379–2391.
42. Jeon ST, Lu Z-L, Dosher BA. 2009. Characterizing perceptual performance at multiple discrimination precisions in external noise. JOSA A 26: 43–58.
43. Lu Z-L, Dosher BA. 2001. Characterizing the spatial-frequency sensitivity of perceptual templates. J Opt Soc Am A Opt Image Sci Vis 18: 2041–2053.
44. Dosher BA, Liu SH, Blair N, Lu Z-L. 2004. The spatial window of the perceptual template and endogenous attention. Vision Res 44: 1257–1271.
45. Lu Z-L, Jeon ST, Dosher BA. 2004. Temporal tuning characteristics of the perceptual template and endogenous cuing of spatial attention. Vision Res 44: 1333–1350.
46. Lu Z-L, Dosher BA. 2004. Spatial attention excludes external noise without changing the spatial frequency tuning of the perceptual template. J Vis 4(10): 955–966.
47. Ahumada A, Jr. 1996. Perceptual classification images from Vernier acuity masked by noise. [abstract] Perception 26: 18.
48. Ahumada AJ, Lovell J. 1971. Stimulus features in signal detection. J Acoust Soc Am 49: 1751–1756.
49. Ahumada AJ, Jr. 2002. Classification image weights and internal noise level estimation. J Vis 2(1): 121–131.
50. Eckstein MP, Ahumada AJ, Jr. 2002. Classification images: A tool to analyze visual strategies. J Vis 2(1): 1x.
51. Abbey CK, Eckstein MP. 2002. Classification image analysis: Estimation and statistical inference for two-alternative forced-choice experiments. J Vis 2(1): 66–78.
52. Gold JM, Murray RF, Bennett PJ, Sekuler AB. 2000. Deriving behavioural receptive fields for visually completed contours. Curr Biol 10: 663–666.
53. Jones JP, Palmer LA. 1987. The two-dimensional spatial structure of simple receptive fields in cat striate cortex. J Neurophysiol 58: 1187–1211.
54. Ohzawa I, DeAngelis GC, Freeman RD. 1996. Encoding of binocular disparity by simple cells in the cat's visual cortex. J Neurophysiol 75: 1779–1805.
55. Ringach DL, Hawken MJ, Shapley R. 1997. Dynamics of orientation tuning in macaque primary visual cortex. Nature 387: 281–284.
56. Murray RF, Bennett PJ, Sekuler AB. 2002. Optimal methods for calculating classification images: Weighted sums. J Vis 2(1): 79–104.
57. Tjan BS, Nandy AS. 2006. Classification images with uncertainty. J Vis 6(4): 387–413.
58. Dosher BA, Lu Z-L. 1998. Perceptual learning reflects external noise filtering and internal noise reduction through channel reweighting. Proc Natl Acad Sci USA 95: 13988–13993.
59. Graham NVS. Visual pattern analyzers. Oxford: Oxford University Press; 2001.
60. Wilson HR, Bergen JR. 1979. A four mechanism model for threshold spatial vision. Vision Res 19: 19–32.
61. Wilson HR, Humanski R. 1993. Spatial frequency adaptation and contrast gain control. Vision Res 33: 1133–1149.
62. Graham N. 1977. Visual detection of aperiodic spatial stimuli by probability summation among narrowband channels. Vision Res 17: 637–652.
63. Watson AB, Ahumada AJ. 2005. A standard model for foveal detection of spatial contrast. J Vis 5(9): 717–740.
64. Carney T, Klein SA, Tyler CW, Silverstein AD, Beutter B, Levi D, et al. 1999. The development of an image/threshold database for designing and testing human vision models. Proc SPIE 3644: 542–551.
65. Walker L, Klein S, Carney T. Modeling the Modelfest data: Decoupling probability summation. Proceedings of the Optical Society of America Annual Meeting, Santa Clara, CA: OSA; 1999, pp. SuC5.
66. Najemnik J, Geisler WS. 2005. Optimal eye movement strategies in visual search. Nature 434: 387–391.
67. Petrov A, Dosher BA, Lu Z-L. 2005. Perceptual learning through incremental channel reweighting. Psychol Rev 112: 715–743.
68. Petrov A, Dosher B, Lu Z-L. 2006. Comparable perceptual learning with and without feedback in non-stationary contexts: Data and model. Vision Res 46: 3177–3197.
69. Lu Z-L, Liu J, Dosher B. 2010. Modeling mechanisms of perceptual learning with augmented Hebbian reweighting. Vision Res 50: 375–390.
70. Graham N, Sutter A, Venkatesan C. 1993. Spatial-frequency- and orientation-selectivity of simple and complex channels in region segregation. Vision Res 33: 1893–1911.
71. Graham N, Sutter A. 1998. Spatial summation in simple (Fourier) and complex (non-Fourier) texture channels. Vision Res 38: 231–257.
72. Sutter A, Beck J, Graham N. 1989. Contrast and spatial variables in texture segregation: Testing a simple spatial-frequency channels model. Atten Percept Psychophys 46: 312–332.
73. Watson AB. 1979. Probability summation over time. Vision Res 19: 515–522.
74. Robson J, Graham N. 1981. Probability summation and regional variation in contrast sensitivity across the visual field. Vision Res 21: 409–418.
75. Quick R. 1974. A vector-magnitude model of contrast detection. Biol Cybern 16: 65–67.
76. Averbeck BB, Latham PE, Pouget A. 2006. Neural correlations, population coding and computation. Nat Rev Neurosci 7: 358–366.
77. Pouget A, Dayan P, Zemel R. 2000. Information processing with population codes. Nat Rev Neurosci 1: 125–132.
78. Chen CC, Tyler CW, Rogowitz B, Pappas T. ModelFest: Imaging the underlying channel structure. In: Human vision, visual processing, and digital display. Vol. 159. Bellingham, WA: SPIE; 2000, pp. 152–159.
79. du Buf J. 2005. Modelfest and contrast-interrelation-function data predicted by a retinal model. Perception 34 ECVP Abstract Supplement.
80. Klein SA, Tyler CW. 2005. Paradoxical, quasi-ideal, spatial summation in the Modelfest data. J Vis 5: 478.
IV EXPERIMENTAL DESIGN, DATA ANALYSIS, AND MODELING
10 Data Analysis and Modeling
This chapter provides a guidebook to the basic issues in quantitative data analysis and
modeling. It shows how we test the quality of a model and fit the model to observed data.
The quality of a model or theory includes qualitative assessments related to internal consistency, breadth of application, and the ability to make new and useful predictions.
Another assessment of the quality of a model is its ability to predict or fit the observed
behavioral data in relevant domains quantitatively.
Two criteria for fitting a model to data are considered: a least-squares error criterion and a maximum likelihood criterion, along with methods of estimating the best-fitting
parameters of the models. Bootstrap methods are used to estimate the variability of derived
data and model parameters. Several methods of comparing and selecting between models
are considered. The chapter uses typical psychophysical testing situations to illustrate
several standard applications of these methods of data analysis and modeling.
Informal and formal models serve an important role in the development of scientific theories. An informal model is an idea or set of ideas about the mechanisms that underlie some phenomenon, an ability or performance. A formal model is one that has been developed and quantified to the point where precise predictions about that phenomenon can be generated and tested. Developing a formal model often clarifies whether the ideas in an informal model have any likelihood of explaining the phenomenon. For example, people knew for many years that the movement of an object depends on its mass and the force applied to it; Newton's second law quantified this as a testable relationship between mass, force, and acceleration. People had observed the motions and orbits of the planets for many centuries. Newton's law of universal gravitation provided a formalism that allowed scientists to test many ideas about the orbits of the planets.
One common meaning of the word model is a surrogate or stand-in, often a small-scale representation of something real. It is meant to allow the researcher to explore the
more general theory. Just as with a model train or a model plane, certain details may be glossed over in a model of the sensory system. However, as vision or auditory scientists, we are more concerned with the inner workings than with the superficial aspects of a model. The construction of a model often serves two slightly different purposes. A model is sometimes seen as "a description or analogy used to help visualize something . . . that cannot be directly observed." Or, a model is "a system of postulates, data and inferences presented as a mathematical description."1 The first statement highlights the heuristic value of a model. Construction of models can help us visualize or think through problems. The second statement emphasizes the formal aspects of a model. Formalizing a model of a phenomenon or process will allow us to test precisely how good a model we have.
Quantitative modeling serves several scientific functions. The development of a quan-
titative model from a set of qualitative ideas (a qualitative theory) can provide clear tests
of those ideas (see chapter 12). This is especially important when the qualitative theory is
complex. The fit of a quantitative model can be evaluated to see how well it captures all the details of the pattern of performance. This establishes to what degree the model provides an accurate or a complete account of the target phenomenon.
A quantitative model also allows the estimation of specific parameters that summarize
performance. This is often critically important in external validation of the model.
The primary technique for external validation of a model is to manipulate a relevant
aspect of the stimuli or the procedures to see if this results in a sensible change in
the values estimated for parameters. Attempts at validation often form the most powerful tests of a model. For example, the introduction of high spatial frequency external
noise in an image should disproportionately limit the contrast sensitivity function in
parameters that specify high-frequency performance. If the model is substantially
correct, then validation tests will support both the model and our interpretation of the
models parameters.
Finally, a successful quantitative model allows the researcher to summarize existing
observations efficiently and to make predictions about likely performance under new
conditions. Quantitative testing can motivate modifications of the existing model and lead
to the creation of a better theory.
One often-used method of assessing a quantitative model is least-squares fitting and estimation. It seeks to find a model and a set of estimated parameters
for the model that provide a good fit to the data. The fidelity criterion, or function defining
a good fit, is that the model with optimized parameters minimizes the sum of squared
errors. This is equivalent to minimizing the root mean squared error, which is analogous
to minimizing the variance about the predictions.
Figure 10.1
A linear model of log contrast increment threshold as a function of log set size in a hypothetical visual search experiment. (a) Contrast increment threshold is plotted as a function of set size. The squares indicate the mean thresholds of four observers, and the solid line is the best-fitting linear regression line on this log–log plot. Error bars are standard errors of the mean. (b) A close-up of a subset of the data (set sizes 2 and 4) that shows the prediction error (indicated by the arrows) for each point.
$\log \Delta \hat{c}(m) = a + b \log(m)$, (10.1)

where m is the set size in the visual search display, and $\Delta \hat{c}(m)$ is the predicted contrast increment threshold at set size m. The y-intercept in this case is defined at log m = 0, which corresponds to a set size of 1, and the intercept a is $\log \Delta c(1)$, or the log contrast increment threshold for set size 1. Then, b is the slope of the contrast increment threshold in relation to log m. The value a is the intercept and b is the slope on a log–log plot. The notation $\log \Delta \hat{c}(m)$ (log delta-c hat) stands for the predicted log contrast increment threshold. The number of items in a display, m, is manipulated by the experimenter.
Linear models are far simpler in form than most quantitative models but nonetheless illustrate the general principles of fitting models to data. The model should be concise, parsimonious, and testable and should explain the behavior over a representative range of situations.
The model is fit to the data by choosing values for the slope b and the intercept a. The model specifies that $\log \Delta c$ increases linearly with log m, the log of the set size. The model does not specify the values of a and b. The values of a and b are chosen to provide
a good fit to the data and are called free parameters. A good fit between a model and data
is intuitively related to how visually close the model predictions are to the data. However,
formal model evaluation requires a precise definition of the quality of fit, a fidelity criterion. One common fidelity criterion is the sum of squared errors.
Figure 10.1a shows a linear model and data for set sizes between 1 and 8. Figure 10.1b
shows an enlarged graph of the data, which makes clearer the fact that experimental data
almost always differ to some degree from the values that the model predicts. Even if the
model is correct, deviations between the actual data and the model will arise for reasons
such as sampling variability in the data. An error refers to the deviation between an
observed data value and the model prediction for that data point. Errors are shown as
arrows in figure 10.1b. The sum of squared errors, or SSE, is the sum of the squared
deviations for all data points:
$SSE = \sum_{i=1}^{n} ( y_i^{\mathrm{predicted}} - y_i^{\mathrm{observed}} )^2$. (10.2)
In this equation, $y_i^{\mathrm{observed}}$ is an observed data value, $y_i^{\mathrm{predicted}}$ is the predicted value for that data point, and n is the number of observed data values. Model parameters are selected to minimize this measure of error, here the SSE.
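The computation can be sketched for the linear search model; the observed values below are hypothetical placeholders, not the four-observer means of figure 10.1:

```python
import math

def predict_log_threshold(m, a, b):
    """Linear model on log-log axes: predicted log contrast increment
    threshold at set size m (equation 10.1)."""
    return a + b * math.log(m)

def sse(params, set_sizes, observed):
    """Sum of squared errors (equation 10.2) between model and data."""
    a, b = params
    return sum((predict_log_threshold(m, a, b) - y) ** 2
               for m, y in zip(set_sizes, observed))

# Hypothetical observed log thresholds at set sizes 1, 2, 4, 8.
set_sizes = [1, 2, 4, 8]
observed = [-2.51, -2.35, -2.09, -1.93]
print(sse((-2.53, 0.303), set_sizes, observed))
```

A perfect fit gives SSE = 0; any deviation of prediction from data increases it.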
To find a close correspondence between model predictions and observed data points and
a low SSE, we need to find a good selection of values for the free parameters, the slope
and intercept. A poor selection of parameter values will lead to a poor fit of the model to
the data and a high SSE. The variation in error, or SSE, as a function of different parameter
values is the error surface or error function for the model given the data.
Figure 10.2 shows a three-dimensional (3D) projection of the variation in SSE as a
function of the slope and intercept parameters. The parameter values that best fit the data
are those that minimize the SSE function. Finding the best parameter values corresponds
to finding the minimum of the error surface. The best estimates for this data set are a = −2.53 and b = 0.303. In this case, the error surface is fairly flat near the optimal parameter
values, so similar parameter values might be nearly as good in fitting the data. Another
fact that is visible in these graphs is the interrelationship between the two estimated parameters: A lower intercept can compensate for a too-high slope, as seen in the 3D graph.
The best-fitting model and parameters correspond to the minimum of an error surface.
From this two-parameter surface it might seem that finding the minimum would be
obvious. However, for more complicated models, the full calculation of a multidimensional
error surface is prohibitive and difficult to visualize. For the simplest case of a linear model,
Figure 10.2
The error surface of the linear model of visual search contrast thresholds as a function of the intercept and slope
parameters.
there is a direct formula for finding the best parameter values in a linear regression.3
However, in general the solution for any arbitrary model is not so straightforward, and the
minimum must be found by searching the parameter space at many points. There are now
many different computational approaches to minimize error functions, including so-called
gradient descent methods, grid-search methods, genetic algorithms, and many more.
Regardless of the precise search algorithm that is used, the goal is the sameto find the
values of parameters that optimize the fit of a particular model to the observed data.
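A brute-force grid search, the simplest of the approaches just listed, can be sketched as follows; the data here are noise-free values generated from the model itself, purely for illustration:

```python
import math

def sse(a, b, data):
    """Sum of squared errors of the linear log-log model on (m, y) pairs."""
    return sum((a + b * math.log(m) - y) ** 2 for m, y in data)

def grid_search(data, a_range, b_range, steps=201):
    """Evaluate the error surface on a regular grid of (intercept, slope)
    values and return the minimum found."""
    best_err, best_a, best_b = float("inf"), None, None
    for i in range(steps):
        a = a_range[0] + i * (a_range[1] - a_range[0]) / (steps - 1)
        for j in range(steps):
            b = b_range[0] + j * (b_range[1] - b_range[0]) / (steps - 1)
            err = sse(a, b, data)
            if err < best_err:
                best_err, best_a, best_b = err, a, b
    return best_err, best_a, best_b

# Noise-free data generated from a = -2.53, b = 0.303 for illustration.
data = [(m, -2.53 + 0.303 * math.log(m)) for m in (1, 2, 4, 8)]
err, a, b = grid_search(data, (-3.0, -2.0), (0.0, 0.6))
print(err, a, b)
```

Grid search is transparent but scales poorly with the number of parameters; gradient descent or genetic algorithms are preferred for higher-dimensional error surfaces.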
Visual impressions concerning the quality of the fit of the model to the data can be
quantified by measures of goodness of fit. For the fidelity criterion of sum of squared
errors, there are three related measures that quantify the goodness of fit of the model: the SSE, the root mean squared error (RMSE), and $r^2$.
The SSE defines an error surface from which we estimate the best fit of a model to a
set of data with the best values of the model parameters. The SSE provides one measure
of the goodness of fit of a model. Because the best-fitting model has minimized the SSE,
other less well fitting models will have higher SSE. Still, it is difficult to have an intuitive
interpretation of the SSE as a measure of quality of fit. This is because SSE, which is
306 Chapter 10
expressed in units of squared deviations, depends not just on the size of the prediction
errors but also on n, the number of data points.
For this reason, it is sometimes useful to convert SSE into the more interpretable measure of RMSE, the square root of the average squared error:

$RMSE = \sqrt{SSE/n}$. (10.3)
The RMSE is in the same units as the measured performance. For example, it could be in
response times or in log percent contrast. You can see from the formula that the RMSE is
similar to a standard deviation formula, except that it measures the difference from the predicted value rather than the difference from the mean value. It is quite easy to understand and visualize.
Another very useful measure of quality of fit is $r^2$, the proportion of variance accounted for by the model:

$r^2 = 1 - \frac{\sum_i ( y_i^{\mathrm{predicted}} - y_i^{\mathrm{observed}} )^2}{\sum_i ( y_i^{\mathrm{observed}} - \bar{y}^{\mathrm{observed}} )^2} = \frac{\sum_i ( y_i^{\mathrm{observed}} - \bar{y}^{\mathrm{observed}} )^2 - \sum_i ( y_i^{\mathrm{predicted}} - y_i^{\mathrm{observed}} )^2}{\sum_i ( y_i^{\mathrm{observed}} - \bar{y}^{\mathrm{observed}} )^2}$, (10.4)

where $\bar{y}^{\mathrm{observed}}$ is the mean of the observed data values. The denominator of this equation is the sum of squared deviations between the observations and the mean value, $\sum_i ( y_i^{\mathrm{observed}} - \bar{y}^{\mathrm{observed}} )^2$. This quantity is the total variability in the observed data. The numerator is the variability that can be accounted for by the model: the prediction errors, $\sum_i ( y_i^{\mathrm{predicted}} - y_i^{\mathrm{observed}} )^2$, subtracted from the total sum of squared deviations, $\sum_i ( y_i^{\mathrm{observed}} - \bar{y}^{\mathrm{observed}} )^2$.
Because $r^2$ is the proportion of variance in the observed data accounted for by a given model, larger values of $r^2$ represent better fits of the model, and $r^2$ is 1 if the model fits the data perfectly. The $r^2$ formula compares the variance accounted for by the model to the variance accounted for using the mean as the best prediction. If the mean is a good predictor because there is little variation in the data itself, then even a model with small differences between predicted and observed values may yield a relatively low $r^2$; conversely, higher $r^2$ values are possible if the data have major variations to predict.
It is also good practice to examine the fit of the model to the data visually by graphing or listing the observed data values and the predicted data values. Sometimes the $r^2$ goodness of fit can seem relatively high even when the model has significant deviations or seems to have a mediocre fit to the data. This is because models that capture even gross
trends in the data can capture quite a bit of the variance when compared with using the
mean value as the best predictor. And sometimes there may be systematic misfits of the
model to the data even when the model fit is truly excellent. Sometimes small but sys-
tematic deviations can be useful in developing new variants of the same model.
In the example that we provided of the linear model for the relationship between log threshold increment and log set size in visual search, the minimum SSE is 0.0075, the RMSE is 0.043, and the $r^2$ is 0.9840. Overall, this is a quite good fit of the linear model to this relationship.
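The arithmetic connecting these three measures can be checked directly, assuming the reported SSE of 0.0075 is taken over the n = 4 set sizes (an assumption consistent with the reported RMSE of 0.043):

```python
import math

def rmse_from_sse(sse, n):
    """RMSE = sqrt(SSE / n) (equation 10.3)."""
    return math.sqrt(sse / n)

def r_squared(sse, total_ss):
    """r^2 = 1 - SSE / (total sum of squares) (equation 10.4)."""
    return 1.0 - sse / total_ss

sse = 0.0075
n = 4
rmse = rmse_from_sse(sse, n)
# Total variability implied by r^2 = 0.9840 is SSE / (1 - r^2).
total_ss = sse / (1 - 0.9840)
print(round(rmse, 3), round(r_squared(sse, total_ss), 3))   # 0.043 0.984
```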
Figure 10.3
Bootstrap evaluation of the parameter variability in the linear model for visual search. (a, b) Distribution of the
intercept and slope. (c) Scatterplot of slope versus intercept.
You could also look at the distribution of the intercept and of the slope to estimate the confidence intervals that contain the middle 90% of the values for each (figure 10.3a and b).
Finally, you could graph the slope and intercept for each resample as a point in two-
dimensional (2D) space; over resampled data, this would reveal the correlation between
the two estimated parameter values (figure 10.3c).
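The resampling loop itself is simple; a minimal sketch with hypothetical data (the helper fit_line and the data values below are illustrative, not the chapter's):

```python
import math
import random

def fit_line(xs, ys):
    """Closed-form least-squares intercept and slope."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def bootstrap_fits(xs, ys, n_resamples=1000, seed=1):
    """Refit the model to data resampled with replacement; the spread of
    the refitted parameters estimates their variability."""
    rng = random.Random(seed)
    n = len(xs)
    fits = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        rx = [xs[i] for i in idx]
        if len(set(rx)) < 2:   # skip degenerate resamples (all one x)
            continue
        fits.append(fit_line(rx, [ys[i] for i in idx]))
    return fits

# Hypothetical data: two observers' log thresholds at set sizes 1, 2, 4, 8,
# generated from a = -2.53, b = 0.303 plus small fixed perturbations.
xs = [math.log(m) for m in (1, 2, 4, 8)] * 2
noise = [0.02, -0.02, 0.03, -0.03, -0.01, 0.01, -0.02, 0.02]
ys = [-2.53 + 0.303 * x + e for x, e in zip(xs, noise)]
fits = bootstrap_fits(xs, ys)
slopes = [b for _, b in fits]
```

The histograms and scatterplot of figure 10.3 correspond to plotting the collected intercepts and slopes from such a loop.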
$F(df_1, df_2) = \frac{(r_{\mathrm{full}}^2 - r_{\mathrm{reduced}}^2)/df_1}{(1 - r_{\mathrm{full}}^2)/df_2}$.

Here, $df_1 = k_{\mathrm{full}} - k_{\mathrm{reduced}}$, $df_2 = N - k_{\mathrm{full}}$, $k_{\mathrm{full}}$ is the number of parameters of the full model, $k_{\mathrm{reduced}}$ is the number of parameters of the reduced model, and N is the number of predicted data points. An F test compares variances to see if the variance in the numerator statistically exceeds that in the denominator. An examination of the formula shows this is a test of whether the difference in the $r^2$s (essentially the difference between the SSEs) between the fuller and the reduced model per added free parameter is greater than would be expected from the error per degree of freedom in the data given the fuller model.
In our visual search example, consider the relationship of contrast increment thresholds
as a function of set size. One model of visual search predicts that all items are processed in parallel and any reductions in performance (increases in threshold) are the result of having multiple sources of false alarms. Some authors7 have argued that in this case, the
slope of the log threshold function should be 0.33. We can test this hypothesis for the
sample data by using the nested-F test to compare a model in which the slope is free to
vary with a submodel in which the slope is set to 0.33. The $r^2$ of the full model is 0.9840
and of the reduced model is 0.9763. This leads to a value of F of 0.9625, with degrees of
freedom 1 and 2. By comparing these to an F distribution, we see that the reduced and
full models provide statistically equivalent fits to the data (p > 0.40). We conclude that the
slope is consistent with the predicted slope of 0.33 of the parallel search model.
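The F value above follows directly from the two $r^2$ values; a minimal computation, with the full model having k = 2 free parameters, the reduced model k = 1, and N = 4 set sizes:

```python
def nested_f(r2_full, r2_reduced, k_full, k_reduced, n_points):
    """Nested F test comparing a full model with a reduced (nested) model,
    computed from their r^2 values."""
    df1 = k_full - k_reduced
    df2 = n_points - k_full
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2

# Full model: free intercept and slope; reduced model: slope fixed at 0.33.
f, df1, df2 = nested_f(0.9840, 0.9763, 2, 1, 4)
print(round(f, 4), df1, df2)   # 0.9625 1 2, matching the text
```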
One alternative to the least-squares model estimation and testing uses a maximum likelihood framework. The maximum likelihood approach to fitting and testing models seeks to find a model and a set of estimated parameters to provide a good fit to the data. The
fidelity criterion for a good fit is to maximize the likelihood that the data were sampled
from the model.
Figure 10.4
Likelihood and log(likelihood) as functions of parameter p.
with $df = k_{\mathrm{full}} - k_{\mathrm{reduced}}$. The distribution of this $G^2$ statistic from the nested likelihood ratio test is well approximated by a $\chi^2$ (chi-squared) distribution with the same degrees of freedom. In the programs and the subsequent examples, we simply refer to this as a $\chi^2$ test.
In our example, the observed values are $m_{\mathrm{early}} = 75$ and $m_{\mathrm{late}} = 90$. The nested $\chi^2$ test has a value of 8.007 and a probability of 0.0017, so we conclude that the fuller model provides a statistically better fit to the joint data and that perceptual learning has occurred in the experiment.
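The $G^2$ value can be reproduced with binomial likelihoods, assuming (hypothetically, but consistently with the reported statistic) 100 trials in each of the early and late phases:

```python
import math

def binom_loglike(k, n, p):
    """Binomial log likelihood; the combinatorial term is omitted because
    it cancels in the likelihood ratio."""
    return k * math.log(p) + (n - k) * math.log(1 - p)

# Hypothetical counts: 75/100 correct early in training, 90/100 late.
k_early, k_late, n = 75, 90, 100

# Full model: separate probabilities for early and late (2 parameters).
ll_full = (binom_loglike(k_early, n, k_early / n)
           + binom_loglike(k_late, n, k_late / n))
# Reduced model: one shared probability (1 parameter).
p_shared = (k_early + k_late) / (2 * n)
ll_reduced = (binom_loglike(k_early, n, p_shared)
              + binom_loglike(k_late, n, p_shared))

G2 = 2 * (ll_full - ll_reduced)
print(round(G2, 3))   # 8.007, matching the value in the text
```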
So far, we have discussed model selection only between models in a nested structure.
In general, it may be useful to compare models that are not related to one another in
this way.
One approach for the comparison of non-nested models penalizes models with more
free parameters. Models with more free parameters should, all things being equal, be better
able to account for the data. For this reason, a direct comparison of the likelihood of the
data given the respective models should discount or adjust for the number of free param-
eters in assessing the quality of fit. Several different measures have been developed to
assist in the selection of models in non-nested model comparisons.
The Akaike Information Criterion (AIC) is a measure of relative goodness of fit that
penalizes the model for the number of free parameters.9 The AIC is defined as
$AIC = -2 \ln(L) + 2 k_{\mathrm{model}}$,

where $k_{\mathrm{model}}$ is the number of parameters in that model. The AIC does not provide an
intuitively interpretable measure of the goodness of fit of the model. It produces a
number that characterizes the model for a given data set. It is designed to be useful
specifically in comparing several models for the same data. We compute the AIC value
for the different models that are being compared and choose the model with the minimum
AIC value. In the current example, the AIC for the model with two independent prob-
abilities, one for early and one for late in training, is 12.83, and the AIC for the single-
probability model is 18.84. This model choice is consistent with the results of the
nested model test in which we concluded that the two-parameter model was needed to
fit the data.
Another commonly used metric for comparing the fits of different models is the Bayesian Information Criterion, or BIC.10 The BIC is related to the AIC. It is also based on the likelihood function, and like the AIC, the purpose is to compare models to one another.
The BIC is defined as

$BIC = -2 \ln(L) + k_{\mathrm{model}} \ln(n)$.

The parameter $k_{\mathrm{model}}$ is the number of parameters of that model, and n is the number of
data points. In this simple example, the number of data points, n, is 2, for the two observed
proportions. (It is not the sample size of either proportion. These sample sizes are implicitly
incorporated in the likelihood values.)
The BIC imposes a heavier penalty on the number of model parameters than the AIC.
Here, too, the point is to choose the best model as that with the lower BIC. In our example,
the BIC of the full model is 10.21 and of the reduced model is 17.53.
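Both criteria can be reproduced numerically, again assuming (hypothetically, but consistently with the values in the text) binomial likelihoods with 100 trials in each phase; here the combinatorial term of the binomial likelihood is included via math.lgamma:

```python
import math

def binom_loglike(k, n, p):
    """Binomial log likelihood, including the binomial coefficient."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def aic(ll, k_model):
    """AIC = -2 ln(L) + 2 k."""
    return -2 * ll + 2 * k_model

def bic(ll, k_model, n_points):
    """BIC = -2 ln(L) + k ln(n)."""
    return -2 * ll + k_model * math.log(n_points)

# Hypothetical counts: 75/100 correct early, 90/100 correct late.
ll_full = binom_loglike(75, 100, 0.75) + binom_loglike(90, 100, 0.90)
ll_reduced = binom_loglike(75, 100, 0.825) + binom_loglike(90, 100, 0.825)

n_points = 2  # the two observed proportions
print(round(aic(ll_full, 2), 2), round(aic(ll_reduced, 1), 2))   # 12.83 18.84
print(round(bic(ll_full, 2, n_points), 2),
      round(bic(ll_reduced, 1, n_points), 2))                    # 10.21 17.53
```

Both criteria select the two-probability model, as in the text.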
The more general AIC and BIC indexes that use penalties for free parameters to allow comparison between quite different models are replaced by a comparison of (the ratio between) the (marginal) likelihoods of the data given one or another model, the so-called Bayes factor. The basic logic of Bayesian inference and the incorporation of Bayesian logic into adaptive methods is treated in chapter 11. These newer methods are of increasing interest in the field today.11–16
In this section, we provide a series of examples of model estimation, model selection, and
estimation of parameter variability. The examples are designed to cover the methods
described more abstractly in the previous section. Sample code for the model estimation
and fitting is provided for several common problems and experiments in psychophysics.
At the same time, we also show some examples of model comparison and selection.
with different levels of confidence.17–19 The theory and purpose underlying the ROC curve and analysis were considered in section 8.4.3 of chapter 8, where we described the theory but not the details of how data are analyzed.
The data for an ROC analysis often come from a rating experiment in which, for
example, an observer indicates both a Yes/No (Present/Absent) response but also the con-
fidence associated with that response. In the example of section 8.3.4, this led to six ratings,
Absent-3, Absent-2, Absent-1, Present-1, Present-2, and Present-3. If we run an experiment
with 150 signal-present trials (S + N) and 150 signal-absent trials (N), the observations
are the frequencies of each of the six response categories in the two test conditions. It is
these frequencies that are the input to an ROC analysis. These frequencies are first con-
verted into cumulative frequencies of hits and false alarms, and then to the corresponding
proportions of hits and false alarms. In this case, the cumulative frequencies of hits and
false alarms start with Absent-3, then consider Absent-3 or Absent-2, and then Absent-3
or Absent-2 or Absent-1 (all the no responses), and so forth.
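For concreteness, the conversion can be sketched in a few lines of MATLAB (our sketch, not one of the book's numbered displays; it cumulates from the Present-3 end, as the loop in display 10.2a does, using the rating frequencies of display 10.1c):

```matlab
Srating = [7 15 27 35 20 46];      % signal trials, Absent-3 ... Present-3
Nrating = [5 36 68 38  4  1];      % noise trials, Absent-3 ... Present-3
cumHits = cumsum(fliplr(Srating)); % cumulate from Present-3 downward
cumFA   = cumsum(fliplr(Nrating));
pHits = cumHits(1:5)/cumHits(6);   % drop the final point (always 1.0)
pFA   = cumFA(1:5)/cumFA(6);
zHits = norminv(pHits);            % z-scores for the ROC on z-z axes
zFA   = norminv(pFA);
```

Each successive entry pools one more rating category, so pHits and pFA trace out the five interior points of the ROC.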
Figure 8.3 of chapter 8 shows examples of hit and false alarm probabilities graphed as
an ROC curve on probability-probability axes and on z-z axes. The next sections consider
in more detail how to go about analyzing such data.
In a rating procedure, the z_i(Hit) and z_i(FA) in category i are functions of the distance between the signal and noise distributions d′, criterion c_i, the standard deviation of the noise distribution σ_N, and the standard deviation of the signal plus noise distribution σ_{S+N}:

z_i(Hit) = G^{−1}[1 − G(c_i, d′, σ_{S+N})]
z_i(FA) = G^{−1}[1 − G(c_i, 0, σ_N)].    (10.11)
As usual in signal-detection theory analyses, G is the cumulative Gaussian (the proportion of the distribution up to criterion c_i), and G^{−1} is the inverse cumulative Gaussian that returns a z-score from a proportion for the standard normal distribution. Because σ_{S+N} and σ_N can only be known up to their ratio, we typically set σ_N to 1.0 and set the mean of the noise distribution to 0. These parameters set the scale. For a rating experiment with n categories, the model has n + 1 parameters: d′, σ_{S+N}, and criteria c_i (i = 1 to n − 1). The next task in the analysis is to fit this functional form to observed data.
In this example, we choose to fit the observed z-scores for hits and false alarms, z_i(Hit) and z_i(FA), for the n different rating categories. We use a least squares procedure to fit the model in equation 10.11 to the empirical ROC function. The cost function is defined as

L = Σ_{i=1}^{n−1} [z_i^observed(Hit) − z_i^predicted(Hit)]² + Σ_{i=1}^{n−1} [z_i^observed(FA) − z_i^predicted(FA)]².    (10.12)
Display 10.1a–c shows a program fittingROC1.m that performs this ROC analysis.
This program takes as input the frequencies of each confidence rating for the target-present and the target-absent trials, Srating and Nrating. The program computes
cumulative frequencies and cumulative proportions and the corresponding z-scores for
hits and false alarms associated with each rating criterion. A MATLAB search function
fminsearch is called to find parameter values that minimize a cost function, ROC1costfunc. The cost function computes the predicted values corresponding to each data point for given parameter values and returns the fidelity score L. The fminsearch algorithm uses the simplex search (modified gradient descent) method20 to optimize the selection of parameters to find the best-fitting model. The program returns the best estimates of d′, σ_{S+N}, and the n − 1 criteria c_i (i = 1 to n − 1). It also returns r² as a summary of the goodness of fit (see equation 10.13).
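The wrapper itself (display 10.1a) is not reproduced above. A minimal sketch of such a wrapper, assuming the cost function ROC1costfunc of display 10.1b and six rating categories, might look like the following (the tabulation mirrors display 10.2a; the details of the book's fittingROC1.m may differ):

```matlab
function ROC = fittingROC1sketch(Srating, Nrating)
% Hypothetical ROC-fitting wrapper: tabulate z-scores, then search.
freqHits(1) = Srating(6); freqFA(1) = Nrating(6);
for i = 2:6
    freqHits(i) = freqHits(i-1) + Srating(7-i);
    freqFA(i) = freqFA(i-1) + Nrating(7-i);
end
data = [norminv(freqHits(1:5)/freqHits(6));
        norminv(freqFA(1:5)/freqFA(6))];
guess = [1 1 1.5 1 0.5 -0.5 -1.5];   % d', sigma, five criteria
options = optimset('fminsearch');
ROC = fminsearch(@ROC1costfunc, guess, options, data);
```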
Display 10.1b
zHits_observed = data(1, :);
zFA_observed = data(2, :);
d = guess(1);      % d'
sigma = guess(2);  % sigma_{S+N}
criteria = guess(3:7);
zHits_predicted = norminv(1 - normcdf(criteria, d, sigma));
zFA_predicted = norminv(1 - normcdf(criteria, 0, 1));
L = sum((zHits_observed - zHits_predicted).^2 + ...
    (zFA_observed - zFA_predicted).^2); % equation 10.12
Display 10.1c
>> Srating = [ 7 15 27 35 20 46 ];
>> Nrating = [ 5 36 68 38 4 1 ];
>> ROC = fittingROC1(Srating, Nrating)
r2 = 0.9999
ROC = 1.494 1.994 2.484 1.831 0.5791 -0.6112 -1.842
In ROC curves, both the x- and y-values (false alarms and hits) are estimated from data.
For this reason, the fidelity criterion in the cost function that is minimized sums the deviations from predicted values in both x and y. This is different from the principles of fitting
a simple regression model, where it is assumed that one value is given or manipulated and
the second, observed value is being predicted.
The degrees of freedom for the fit of the model to the data is df = 2(n − 1) − (n + 1) = n − 3. In MATLAB, fminsearch.m is typically used to minimize the cost function.
The goodness of fit is described with r²:

r² = 1 − 0.5 Σ_{i=1}^{n−1} [z_i^observed(Hit) − z_i^predicted(Hit)]² / Σ_{i=1}^{n−1} {z_i^observed(Hit) − mean[z^observed(Hit)]}²
       − 0.5 Σ_{i=1}^{n−1} [z_i^observed(FA) − z_i^predicted(FA)]² / Σ_{i=1}^{n−1} {z_i^observed(FA) − mean[z^observed(FA)]}².    (10.13)
One important question is whether the simplifying assumption of equal variance fits the ROC curve; that is, whether σ_{S+N} = σ_N = 1. If the variances are equal, the signal-detection model is a reduced model of the fuller model where σ_{S+N} is a free parameter used to fit the data. Display 10.2a–c shows the function program fittingROC2.m that fits this reduced, equal-variance model.
Display 10.2a
% Convert rating frequencies to cumulative frequencies, proportions,
% and z-scores, cumulating from the Present-3 end
freqHits(1) = Srating(6);
freqFA(1) = Nrating(6);
for i = 2:6
    freqHits(i) = freqHits(i-1) + Srating(7-i);
    freqFA(i) = freqFA(i-1) + Nrating(7-i);
end
pHits = freqHits(1:5)/freqHits(6);
pFA = freqFA(1:5)/freqFA(6);
zHits = norminv(pHits);
zFA = norminv(pFA);
data = [zHits; zFA];
% compute r2 (equation 10.13)
zHits_observed = data(1, :);
zFA_observed = data(2, :);
d = ROC(1);
criteria = ROC(2:6);
zHits_predicted = norminv(1 - normcdf(criteria, d, 1));
zFA_predicted = norminv(1 - normcdf(criteria, 0, 1));
mean_zHits = mean(zHits_observed);
mean_zFA = mean(zFA_observed);
r2 = 1 - 0.5*sum((zHits_observed - zHits_predicted).^2) ...
    /sum((zHits_observed - mean_zHits).^2) ...
    - 0.5*sum((zFA_observed - zFA_predicted).^2) ...
    /sum((zFA_observed - mean_zFA).^2)
Display 10.2b
zHits_observed = data(1, :);
zFA_observed = data(2, :);
d = guess(1);
criteria = guess(2:6);
% equal-variance (reduced) model: sigma_{S+N} fixed at 1
zHits_predicted = norminv(1 - normcdf(criteria, d, 1));
zFA_predicted = norminv(1 - normcdf(criteria, 0, 1));
L = sum((zHits_observed - zHits_predicted).^2 + ...
    (zFA_observed - zFA_predicted).^2);
Display 10.2c
>> Srating = [ 7 15 27 35 20 46 ];
>> Nrating = [ 5 36 68 38 4 1 ];
>> ROC=fittingROC2(Srating, Nrating)
r2 = 0.8489
ROC = 1.102 2.043 1.546 0.6131 -0.0032 -1.209
where df1 = kfull − kreduced, df2 = N − kfull, and N is the number of predicted data points. The fuller model has parameters d′, σ_{S+N}, and n − 1 criteria c_i, or n + 1 parameters, while the reduced model has only d′ and n − 1 criteria c_i, or n parameters. In our example, kfull = 7 parameters, kreduced = 6 parameters, df1 = 1 (the number of parameters difference between the models), and df2 = 5. In figure 8.3 and table 8.3 (section 8.3.4 of chapter 8), we simulated data from a situation in which σ_{S+N} > σ_N = 1. Fitting the fuller model to the data using fittingROC1.m led to an r² of 0.9999. Fitting the reduced model to the data using fittingROC2.m led to an r² of 0.8489. When we apply the F test for nested models to these data, we find F(1,5) = 7550.0, p < 5 × 10^{−9}. Consistent with the true situation (that is, the simulated situation), we reject the hypothesis that σ_{S+N} = σ_N = 1.
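The F value reported above can be reproduced directly from the two r² values (a sketch; the formula is the standard nested-model F for least squares, and fcdf is MATLAB's F cumulative distribution function from the Statistics Toolbox):

```matlab
r2_full = 0.9999; r2_reduced = 0.8489;  % from displays 10.1c and 10.2c
df1 = 1; df2 = 5;
F = ((r2_full - r2_reduced)/df1) / ((1 - r2_full)/df2)  % F = 7550
p = 1 - fcdf(F, df1, df2);              % upper-tail p-value
```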
In this example, we chose to use bootstrap methods to generate variance estimates for
the parameters. The bootstrap method resamples individual experimental trials and
responses in a bootstrap procedure. Display 10.3a and b shows a program ROCstd.m that
takes the original rating frequency data as input. It creates many new ROC data sets that
are then subjected to tabulation and fitting. It computes the standard deviation of each
parameter of the best-fitting model from the resampled data sets to provide an estimate of
parameter variability. The program outputs the variance estimates for d′, σ_{S+N}, and the n − 1 criteria c_i (display 10.3b).
% Expand the rating frequencies into sorted single-trial lists; the
% lines building sorted_S_trials are restored by analogy with the
% sorted_N_trials lines below
sorted_S_trials = [1*ones(Srating0(1),1)
                   2*ones(Srating0(2),1)
                   3*ones(Srating0(3),1)
                   4*ones(Srating0(4),1)
                   5*ones(Srating0(5),1)
                   6*ones(Srating0(6),1)];
sorted_N_trials = [1*ones(Nrating0(1),1)
                   2*ones(Nrating0(2),1)
                   3*ones(Nrating0(3),1)
                   4*ones(Nrating0(4),1)
                   5*ones(Nrating0(5),1)
                   6*ones(Nrating0(6),1)];
ROC0 = [];
for i = 1:1000
    resampled_S_trials = [];
    resampled_N_trials = [];
    for j = 1:150 % sample 300 trials from the sorted
                  % single trial data with replacement
        index1 = floor(150*rand(1)) + 1;
        index2 = floor(150*rand(1)) + 1;
        resampled_S_trials = [resampled_S_trials;
            sorted_S_trials(index1)];
        resampled_N_trials = [resampled_N_trials;
            sorted_N_trials(index2)]; % index2: independent draw for noise trials
    end
    Srating = [];
    Nrating = [];
    for j = 1:6 % compute Srating and Nrating from the
                % resampled data
        Srating = [Srating;
            length(find(resampled_S_trials == j))];
        Nrating = [Nrating;
            length(find(resampled_N_trials == j))];
    end
    Srating(Srating == 0) = 1; % avoid empty rating categories
    Nrating(Nrating == 0) = 1;
    ROC = fittingROC1(Srating', Nrating'); % refit each resampled set (restored)
    ROC0 = [ROC0; ROC];
end
ROCstd = std(ROC0);
Display 10.3b
>> Srating0 = [ 7 15 27 35 20 46 ];
>> Nrating0 = [ 5 36 68 38 4 1 ];
>> ROCstd=ROCstd(Srating0, Nrating0)
ROCstd = 0.1901 0.1485 0.0426 0.1646 0.2830 0.1155 0.2244
In section 2.2 of chapter 2, we showed an example that used the method of constant stimuli and two-interval forced-choice to measure a 7-point contrast psychometric function for detecting a sine-wave grating. The purpose of measuring a psychometric function is to find the
best estimate of the threshold and to measure the slope of the psychometric function.
Together, the threshold, the slope, and therefore the shape provide important information
about the visual system.
The data for psychometric functions often come from the method of constant stimuli
that tests a predefined set of six to nine stimulus values spanning from near chance to
maximal performance. Section 4.3.2 of chapter 4 provided an example of measuring a
contrast psychometric function with 7 contrasts and 100 trials per contrast in a two-interval
forced-choice (2IFC) task, leading to a tabulation of the number of correct and incorrect
trials for each contrast (see table 4.1 of chapter 4).
Psychometric functions are often modeled by the Weibull function21:

P(c) = ξ + (1 − ξ − λ)(1 − e^{−(c/τ)^η}).    (10.15)

In this formula, c is the signal contrast, τ is the threshold, η is the slope of the psychometric function, ξ represents the chance performance level, and λ represents the observer's lapse rate, which sets a maximum accuracy below 100%. In a 2IFC task, ξ is set to 0.5; λ is often set to a value less than 0.05.21
The theoretical basis and implications of fitting Weibull functions to psychometric
data have been debated.22,23 A number of other similarly shaped functions have been
used as well.24 A more theoretically driven approach is to specify the functional form
of the psychometric functions based on the internal responses of the observer. For
example, we can use the PTM observer model to specify d as a function of stimulus
contrast and external noise and use SDT to relate d to percent correct, as in section 9.2
in chapter 9.25,26
In this section, we consider model estimation using the Weibull, which is one of the
most popular descriptive psychometric functions. The principles of fitting any of the other
functional forms would be analogous. The maximum likelihood procedure8 is typically
used to fit psychometric functions. As described earlier, this procedure maximizes the
likelihood of the observed data given the probabilities predicted by the model or function
with a particular set of parameter values. We assume that the observed data in each condi-
tion are drawn from a binomial distribution. So, the likelihood is a function of the predicted
probability for a given contrast condition, p_i, the total number of trials, n_i, and the number of correct trials, m_i, in each experimental condition i:

Likelihood = Π_i [n_i!/(m_i!(n_i − m_i)!)] p_i^{m_i} (1 − p_i)^{n_i − m_i}.    (10.16)
The product runs across all the experimental conditions i for an observer.
We maximize the likelihood by minimizing the cost function, L = −log(likelihood). Minimizing L maximizes the likelihood. Taking the log changes multiplication to summation. To avoid values of infinity, model probabilities of 0 and 1 are adjusted up or down slightly by 1/(2n_i).19
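The contribution of a single condition to this cost function can be sketched as follows (n, m, and p here are illustrative values, not data from the chapter; the binomial coefficient of equation 10.16 is dropped because it does not depend on the parameters):

```matlab
n = 100; m = 78;   % hypothetical total and correct trial counts
p = 0.80;          % model-predicted probability correct
p = min(max(p, 1/(2*n)), 1 - 1/(2*n));  % keep p away from 0 and 1
L = -(m*log(p) + (n - m)*log(1 - p));   % negative log likelihood term
```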
Display 10.4a–c provides a sample program Weibull.m. The program reads in a
matrix with a column for each condition, each with three rows with the contrast (or
other variable) and the numbers of correct and incorrect trials. It sets the chance performance level to 0.5 and the lapse rate to 0.05 and provides starting values or guesses for the threshold (tau) and slope (eta) parameters of the psychometric function. It returns the best estimates of τ and η, estimates of thresholds at selected levels of performance (see later), and the resulting likelihood fidelity criterion for the best fit, maxloglikelihood.
Display 10.4a
Display 10.4b
tau = guess(1);
eta = guess(2);
Nconditions = size(data, 2); % # of stimulus conditions
xi = 0.50;    % chance performance level (2IFC)
lamda = 0.02; % lapse rate
L = 0;
for i = 1:Nconditions
    p = xi + (1 - xi - lamda)*(1 - exp(-(data(1,i)/tau).^eta));
    % Eq. 10.15
    if (p < 1/2/(data(2, i)+data(3, i)))
        % putting lower and upper boundaries on p
        p = 1/2/(data(2, i)+data(3, i));
    elseif (p > 1 - 1/2/(data(2, i)+data(3, i)))
        p = 1 - 1/2/(data(2, i)+data(3, i));
    end
    L = L - (data(2, i)*log(p) + data(3, i)*log(1-p)); % negative log likelihood
end
Display 10.4c
Threshold signal contrast at a given performance level can be calculated from the best-
fitting model to the psychometric function (figure 10.5). The model for the psychometric
function is used to interpolate a contrast level associated with some probability correct, p.
For a Weibull function, the threshold at performance level p can be solved by rearranging terms:

c_p = τ {log[(1 − ξ − λ)/(1 − λ − p)]}^{1/η}.    (10.17)
This formula can be used to estimate a 75% threshold, or an 85% threshold, or any other
target value. For the data in section 4.3.2 of chapter 4, the best-fitting Weibull has set
Figure 10.5
A psychometric function with a best-fitting Weibull model. Square symbols are measured data.
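Equation 10.17 can be evaluated in one line; the sketch below uses illustrative parameter values (tau and eta are placeholders, not the fitted values for the chapter 4 data):

```matlab
xi = 0.5; lamda = 0.02;  % chance and lapse, as in display 10.4b
tau = 1.0; eta = 2.0;    % hypothetical Weibull threshold and slope
p = [0.75 0.85];         % target performance levels
c_p = tau * (log((1 - xi - lamda)./(1 - lamda - p))).^(1/eta);
```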
Display 10.5a
tau_std = std(tau0);
eta_std = std(eta0);
eta_iqr = iqr(eta0);
thresholds_std = std(thresholds0);
Display 10.5b
to share one or more parameters (e.g., refs. 29 and 30). We can compare different (nested) fits of two or more psychometric functions using χ² statistics.
equivalent slope gives a good account of the data from the three levels of external
noise.
In this example, the likelihood is directly computed from the underlying binomial
process generating the data, or at least the binomial provides a good approximation to the
actual process. In other cases, where likelihood is not directly available, we often use
either least squares or weighted least squares. Likelihood can be computed for continuous
measurements if one is able to specify predicted distributions. More often, weighted least
squares criteria would be selected as an approximation to a maximum likelihood model
solution.31
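A weighted least-squares cost of this kind weights each squared deviation by the inverse variance of the corresponding measurement; a sketch with illustrative numbers:

```matlab
observed  = [0.51 0.67 0.84];  % hypothetical measurements
predicted = [0.50 0.69 0.82];  % model predictions
sigma     = [0.04 0.04 0.05];  % estimated standard deviations
L = sum(((observed - predicted)./sigma).^2);  % weighted least squares
```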
Display 10.6a
fullmodel_maxloglikelihood = fullmodel_maxloglikelihood ...
    + maxloglikelihood;
end
% chi-square statistic comparing the full and reduced fits
chi2 = 2*(fullmodel_maxloglikelihood - ...
    reducedmodel_maxloglikelihood);
p = chi2cdf(chi2, NPF-1);
Display 10.6b
for i = 1:Nconditions
    p = xi + (1 - xi - lamda) * (1 - exp( ...
        -(data1(1,i)/tau).^eta)); % Eq. 10.15
    if (p < 1/2/(data1(2, i)+data1(3, i)))
        % putting lower and upper boundaries on p
        p = 1/2/(data1(2, i)+data1(3, i));
    elseif (p > 1 - 1/2/(data1(2, i)+data1(3, i)))
        p = 1 - 1/2/(data1(2, i)+data1(3, i));
    end
    L = L - (data1(2, i)*log(p) + ...
        data1(3, i)*log(1-p));
end
end
Display 10.6c
sine wave. In sections 2.3 of chapter 2 and 4.3.3 of chapter 4, we showed an example of
a contrast sensitivity function estimated from an experiment.
We chose to test nine spatial frequencies from coarse to fine. For each spatial frequency, we chose seven contrast levels. On each trial, the computer selects a particular
sine pattern and a particular contrast level to show to the observer. For each combination
of frequency and contrast, we might measure 100 trials, so this experiment requires 9
times the number of trials as the example experiment in section 2.2 of chapter 2. The
contrast sensitivity function was estimated by graphing the inverse of the 75% correct
threshold as a function of spatial frequency. The contrast sensitivity function is usually
analyzed by estimating a best-fitting summary function to these data.
Based on a comprehensive review of nine functional forms yielding similar shapes that
have been used to describe empirical contrast sensitivity functions (CSFs) in normal vision,
Watson and Ahumada24 concluded that all provide a roughly equivalent description of the
CSFs in a shared Modelfest data set (see section 9.5 of chapter 9).
Qualitatively, the CSF has a shape similar to a band-pass filter. That is, it has a peak at a mid-level spatial frequency and attenuated sensitivity at both high and low spatial frequencies relative to the peak. Here, we have chosen to describe the CSF as the truncated log-parabola (see figure 10.3). This model has four parameters: (1) the peak gain (sensitivity) γ_max; (2) the peak spatial frequency f_max; (3) the bandwidth β; and (4) δ, the truncation level at low spatial frequencies.
The log-parabola form of the contrast sensitivity function, log10[S(f)], is

log10[S(f)] = log10(γ_max) − log10(2) [(log10(f) − log10(f_max)) / (log10(2β)/2)]².    (10.19)

The truncated log-parabola is identical above the peak, but at low spatial frequencies it replaces the parabola by a constant: for f < f_max, any value of log10[S(f)] that falls below log10(γ_max) − δ is set to log10(γ_max) − δ (compare the code in display 10.7b).
Display 10.7a–c shows the program for analyzing the CSF, fittingCSF.m. The data
from a measured CSF experiment are entered into the program as a matrix with two rows.
The first row contains the spatial frequency of each test condition. The second row is
the sensitivity for those conditions. The program has selected initial starting guesses for
the parameters [Gmax Fmax beta delta], or [γ_max f_max β δ]. It returns the best estimates of these parameters and the r². With an r² of 0.9884, the fit to the data is excellent (figure 10.6).
Here, we illustrate a somewhat different resampling procedure for estimating the standard deviations of the model parameters: a Monte Carlo resampling method. In resampling, each sensitivity that is measured, S_observed(f), is assumed to be drawn from a normal distribution that is estimated from the psychometric function for each spatial frequency by the method in section 10.2.3 of chapter 10. We resample a new hypothetical S_observed(f) for each spatial frequency, and refit the truncated log-parabola model to each set of resampled data. Another possible approach to resampling, not shown here, is to use a bootstrap
procedure on the raw data to construct new psychometric functions and new CSFs based
on the bootstrapped data sets.
The number of resampled data sets in this example is 1000. Display 10.8a and b shows
the sample program CSFstd.m using this procedure to estimate the variances of the fitted
parameters. This program takes as input the CSF data and the estimated standard deviation
Display 10.7a
Gmax = guess(1);
Fmax = guess(2);
beta = guess(3);
delta = guess(4);
Display 10.7b
Gmax = guess(1);
Fmax = guess(2);
beta = guess(3);
delta = guess(4);
f = data(1, :);
S_observed = log10(data(2, :));
Nfrequencies = size(data, 2);
% # of spatial frequency conditions
S_predicted = [];
for i = 1:Nfrequencies
    % log-parabola (equation 10.19)
    S = log10(Gmax) - log10(2) * ((log10(f(i)) - ...
        log10(Fmax))/(log10(2*beta)/2))^2;
    if (f(i) >= Fmax)
        S = S;           % no truncation above the peak
    elseif (f(i) < Fmax & S < log10(Gmax) - delta)
        S = log10(Gmax) - delta; % truncate at low frequencies
    end
    S_predicted = [ S_predicted S ];
end
Display 10.7c
>> CSF =
[.125 .25 .50 1.00 2.00 4.00 8.00 16.0 32.0
70.1 144.7 202.2 204.6 189.1 183.5 98.7 46.0 21.1];
>> [Gmax, Fmax, beta, delta, r2] = fittingCSF(CSF)
Gmax = 222.8, Fmax = 1.168, beta = 17.58, delta = 0.4777
r2 = 0.9884
Figure 10.6
A contrast sensitivity function with the truncated log-parabola model fit. The symbols represent measured data,
or 1/threshold from psychometric functions for tests of different spatial frequencies.
of sensitivity at each spatial frequency. It outputs the standard deviations of the estimated
parameters of the CSF model fit. For the data in section 4.3.3 of chapter 4, the best-fitting
CSF model parameters are γ_max = 222.8 ± 7.3, f_max = 1.17 ± 0.04, β = 17.6 ± 1.2, and δ = 0.48 ± 0.08.
One practical application of the comparison of several CSFs is in the study of amblyopia, where the CSF of the amblyopic eye is compared to the CSF of the fellow or nonamblyopic eye.35,36 This comparison is sometimes carried out using analysis of variance
on the two sets of empirical CSFs (figure 10.7). An alternative (and we believe better)
approach is to test whether the same values of the four parameters of the truncated log-
parabola model can be used jointly to fit the two CSFs. Display 10.9a–c shows a program that compares a model in which each CSF has four parameters, for a total of eight parameters, to a situation where the two CSFs share the same four-parameter description. If the
two fits are statistically equivalent, we infer that the two CSFs are equivalent; otherwise, we infer that they are different. For this purpose, with least-squares model fitting, the nested-F test (equation 10.14) yields F(4, 10) = 58.62, p = 6.63 × 10^{−7}, indicating that the two CSFs are statistically different.
Display 10.8a
Nfrequencies = size(CSF0, 2);
Gmax0 = []; Fmax0 = []; beta0 = []; delta0 = [];
for i = 1 : 1000
    % resample each sensitivity from a normal distribution whose
    % standard deviation is given in the third row of CSF0
    CSF = [CSF0(1,:); CSF0(2,:) + CSF0(3,:).*randn(1, ...
        Nfrequencies)];
    % refit and accumulate the parameter estimates (restored lines)
    [Gmax, Fmax, beta, delta, r2] = fittingCSF(CSF);
    Gmax0 = [Gmax0 Gmax]; Fmax0 = [Fmax0 Fmax];
    beta0 = [beta0 beta]; delta0 = [delta0 delta];
end
Gmax_std = std(Gmax0);
Fmax_std = std(Fmax0);
beta_std = std(beta0);
delta_std = std(delta0);
Display 10.8b
>> CSF0=
[.125 .25 .50 1.00 2.00 4.00 8.00 16.0 32.0
70.1 144.7 202.2 204.6 189.1 183.5 98.7 46.0 21.1
4.9 10.1 11.6 12.2 12.9 12.5 7.6 3.2 1.4];
>> [Gmax_std, Fmax_std, beta_std, delta_std] = CSFstd(CSF0)
Gmax_std = 7.3, Fmax_std = 0.043, beta_std = 1.2, delta_std =
0.085
Using the same logic, we can evaluate other intermediate models in which some subset of the four parameters is shared between the two empirical CSF measurements. This statistical comparison is valid as long as the reduced model is nested within the fuller model, in which the reduced model equates certain parameters. For example, an eight-parameter model could be compared to a seven-parameter model in which the center frequency parameter f_max is constrained to be identical for the two empirical CSFs.
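In the cost function, such reduced models differ from the full model only in how the parameter vector guess is unpacked; a sketch of the two layouts (hypothetical variable names, following display 10.9b):

```matlab
% Full model: eight parameters, four per CSF
Gmax = guess(1:2); Fmax = guess(3:4);
beta = guess(5:6); delta = guess(7:8);

% Reduced model: seven parameters, fmax shared by the two CSFs
Gmax = guess(1:2); Fmax = [guess(3) guess(3)];
beta = guess(4:5); delta = guess(6:7);
```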
Display 10.9a
Display 10.9b
L = 0;
for n = 1 : NCSF
    f = data((n-1)*2+1, :);
    S_observed = log10(data(2*n, :));
    S_predicted = [];
    for i = 1:Nfrequencies
        S = log10(Gmax) - log10(2) * ((log10(f(i)) - ...
            log10(Fmax))/(log10(2*beta)/2))^2;
        if (f(i) >= Fmax)
            S = S;
        elseif (f(i) < Fmax & S < log10(Gmax) - delta)
            S = log10(Gmax) - delta;
        end
        S_predicted = [ S_predicted S ];
    end
    L = L + sum( (S_observed - S_predicted).^2 );
end
Display 10.9c
>> CSFs =
[.125 .25 .50 1.00 2.00 4.00 8.00 16.0 32.0
70.1 144.7 202.2 204.6 189.1 183.5 98.7 46.0 21.1
.125 .25 .50 1.00 2.00 4.00 8.00 16.0 32.0
60.1 110.7 160.2 150.6 110.1 45.5 19.0 10.0 2.1];
>> [F, df1, df2, p] = CSFcomp(CSFs)
F = 58.62, df1 = 4, df2 = 10, p = 6.6347e-07
Figure 10.7
Comparing two contrast sensitivity functions. The solid curves represent predictions of the full model; the dotted
curve represents the predictions of the reduced model assuming no difference between the two functions. The
two data sets are shown as square and circle symbols, respectively.
detection or discrimination task to the amount of external noise added to the signal
stimulus. In section 9.3.1 of chapter 9, we showed empirical TvC functions estimated
from 8-point contrast psychometric functions measured at each of eight external noise
levels. The TvC is traditionally graphed on log-log axes. Taking the log contrast threshold helps to equalize the variability of the thresholds across a large range of measured values.
Here, we use a function from the perceptual template model (PTM) to describe the TvC data. Fitting a set of TvCs at different criterion threshold levels for a single condition requires four parameters: the internal multiplicative and additive equivalent noises, N_m and N_a; the gain response of the template to a signal stimulus, β; and the nonlinearity parameter, γ. The PTM model and its parameters were extensively discussed in section 9.3.1 of chapter 9. The PTM model predictions are fit to empirical TvC functions through least squares methods, using the predicted log contrast as:
log(c) = (1/(2γ)) {2 log(d′) + log[(1 + N_m²) N_ext^{2γ} + N_a²] − log(1 − N_m² d′²/2)} − log(β).    (10.23)
Display 10.10a
data = TvC;
NperformanceLevels = size(TvC, 1) - 1;
NnoiseLevels = size(TvC, 2) - 1;
guess = [4 2 0.1 0.01]; % beta, gamma, Nm, Sa
options = optimset('fminsearch');
[guess, L] = fminsearch(@TvCcostfunc, guess, options, ...
    data); % (function-handle syntax restored)
beta = guess(1);
gamma = guess(2);
Nm = guess(3);
Sa = guess(4);
Log_observedThresholds = ...
    log(TvC(2:(NperformanceLevels+1), 2:(NnoiseLevels+1)));
meanT = mean(mean(Log_observedThresholds));
r2 = 1 - L/sum(sum((Log_observedThresholds - meanT).^2));
Display 10.10b
beta = guess(1);
gamma = guess(2);
Nm = guess(3);
Sa = guess(4);
NperformanceLevels = size(data, 1) - 1;
NnoiseLevels = size(data, 2) - 1;
Next = data(1, 2:(NnoiseLevels+1));
% convert proportion correct to d' (2IFC)
dprime = norminv(data(2:(NperformanceLevels+1), 1)) - ...
    norminv(1 - data(2:(NperformanceLevels+1), 1));
L = 0;
for i = 1:NperformanceLevels
    Log_observedThresholds = ...
        log(data(i+1, 2:(NnoiseLevels+1)));
    d = dprime(i);
    % equation 10.23
    Log_predictedThresholds = 1/(2*gamma) * (2*log(d) ...
        + log((1+Nm^2) * Next.^(2*gamma) + Sa^2) - ...
        log(1-Nm.^2*d.^2/2)) - log(beta);
    L = L + sum((Log_observedThresholds - ...
        Log_predictedThresholds).^2);
end
Display 10.10c
>> TvC=
[0 0.000 0.030 0.045 0.067 0.100 0.149 0.223 0.330
0.65 0.051 0.050 0.052 0.052 0.058 0.085 0.125 0.168
0.75 0.067 0.069 0.066 0.068 0.078 0.111 0.163 0.239
0.85 0.084 0.090 0.082 0.085 0.102 0.140 0.206 0.322]
>> [beta, gamma, Nm, Sa, r2] = fittingTvC(TvC)
beta = 1.635, gamma = 1.96, Nm = 0.135, Sa = 0.0095
r2 = 0.9946
Figure 10.8
TvC functions showing thresholds as a function of external noise contrast. Symbols show the data with error
bars estimated by bootstrap methods; smooth curves represent the best-fitting PTM observer model.
thresholds at three performance levels are input to the program. In this sample experiment,
there are eight levels of external noise. The input data includes the 8 × 4 data matrix in which the first row is the external noise contrast and the next three rows are the estimated thresholds at 0.65, 0.75, and 0.85 proportion correct accuracies. The input matrix is, however, 9 × 4, which includes a left-most column that contains the proportion correct targets for the thresholds and a placeholder 0 in the external noise row. The program
assumes starting values, then estimates and returns the best values for the PTM parameters [beta gamma Nm Sa] along with the r². The fit of this PTM model to the triple-TvC data is excellent, with r² = 0.9946.
Fitting contrast thresholds in the log form, log(c), roughly equates the variability of the different contrasts. This approximates a weighted least squares solution and so approximates maximum likelihood estimation.37
As in the previous examples, it is possible to use bootstrapped resampling to estimate
the variability of the four parameters of the PTM model. Display 10.11a and b shows the
program TvCstd.m that carries out this resampling analysis. The input to the program is
Display 10.11a
data = TvC0;
NperformanceLevels = (size(TvC0, 1) - 1)/2;
NnoiseLevels = size(TvC0, 2) - 1;
beta0 = []; gamma0 = []; Nm0 = []; Sa0 = [];
for n = 1 : 1000
    TvC = TvC0(1, :);
    for i = 1 : NperformanceLevels
        % resample each threshold from a normal distribution with the
        % estimated standard deviation of that threshold
        TvC = [TvC; TvC0((i+1), 1) ...
            TvC0((i+1), 2:(NnoiseLevels+1)) + ...
            TvC0((NperformanceLevels+i+1), 2:(NnoiseLevels+1)) ...
            .*randn(1, NnoiseLevels)];
    end
    [beta, gamma, Nm, Sa, r2] = fittingTvC(TvC);
    % accumulate parameter estimates (restored lines)
    beta0 = [beta0 beta]; gamma0 = [gamma0 gamma];
    Nm0 = [Nm0 Nm]; Sa0 = [Sa0 Sa];
end
Display 10.11b
>> TvC0 =
[ 0 0.000 0.030 0.045 0.067 0.100 0.149 0.223 0.330
0.65 0.051 0.050 0.052 0.052 0.058 0.085 0.125 0.168
0.75 0.067 0.069 0.066 0.068 0.078 0.111 0.163 0.239
0.85 0.084 0.090 0.082 0.085 0.102 0.140 0.206 0.322
0.65 .0044 .0066 .0040 .0042 .0053 .0074 .0109 .0130
0.75 .0043 .0049 .0034 .0039 .0041 .0064 .0089 .0129
0.85 .0044 .0045 .0035 .0043 .0048 .0071 .0102 .0191]
>> [beta_std, gamma_std, Nm_std, Sa_std] = TvCstd(TvC0)
beta_std = 0.066, gamma_std = 0.42, Nm_std = 0.17,
Sa_std = 0.0044
the contrast thresholds in the different external noise conditions, the three levels of contrast
thresholds, and the corresponding standard deviations of the threshold estimations, input
in a format analogous to fitting the TvC. The output includes the PTM parameter estimates
and their standard deviations.
Nested-F tests can compare the TvCs from two or more conditions using fuller or more
reduced models. The nested model structure can be used to identify which of several
parameters change when the state of the observer changes. For example, nested model
testing has been used to identify mechanisms of attention and perceptual learning.29,38–40
Consider an example from an experiment on perceptual learning39 in which observers
were trained for 10 sessions, nearly 14,400 trials, on an orientation discrimination task in
the lower right visual field, while performing a letter/number identification task in a rapid
stream of characters in the center of the display. The orientation task was tested in eight
different levels of external noise from 0 contrast to 0.33 contrast, all intermixed. Contrast
thresholds were estimated using staircase procedures (see chapter 11) at two accuracy
levels, 70.7% and 79.4%. In figure 10.9, we show the TvC functions from every two
practice sessions to illustrate the effects of perceptual learning. The TvCs at both performance levels were fit simultaneously with PTM equations.
Display 10.12a–c provides the program TvCcomp.m to analyze and test the PTM models
of TvC data at two accuracy criteria. The input consists of five sets of two TvC functions.
The data are input, as in the last example, as a matrix in which the first column is a placeholder 0 in the first row and indicates the target accuracy level in the other rows; the first row lists the external noise contrasts; and subsequent rows have the thresholds for the
lower and higher accuracy staircases for each of 5 days. The search algorithm fminsearch is used to find best estimates of the parameters of several different PTM model variations.
Perceptual learning may improve performance through several mechanisms expressed
in the PTM model. Learning may reduce the impact of external noise through improved
filtering, it may reduce internal additive noise, or it may reduce internal multiplicative
Figure 10.9
TvC functions in a perceptual learning experiment, fit with a PTM model. The left panel and right panels show
thresholds at two different accuracy criteria, and each curve represents 2 days of data collection. Practice
improved external noise filtering and reduced internal additive noise (data from Dosher and Lu39).
noise, each associated with a different kind of change in the TvC functions following
practice. Reducing the impact of external noise improves contrast thresholds in high external noise. Reducing additive internal noise improves contrast thresholds in low external
noise. Reducing internal multiplicative noise improves contrast thresholds in both low
and high noise, but by a different amount (in log contrast threshold) depending upon the
performance threshold level. Perceptual learning is expressed in the PTM equations for the TvC, then, by added parameters (<1) that multiplicatively reduce the impact of these different noises after training: A_f² N_ext² replaces N_ext² after training, A_a² N_a² replaces N_a² after training, and A_m² N_m² replaces N_m² after training, if performance has improved in all these ways.
Nested model tests allowed us to infer that perceptual learning reduced the impact of external noise (A_f² N_ext²) and reduced additive internal noise (A_a² N_a²), but did not alter internal multiplicative noise (A_m² N_m² = N_m²). The two factors of improvement A_f² and A_a² were
both needed to account for the data, but they were not necessarily the same. This is typical
of observed changes due to perceptual learning, where one or the other, but usually both
factors are needed to explain improved TvC performance, but the factors may be affected
partially independently.41
This example is meant to illustrate how the framework of a quantitative model, combined with model comparison, can assist the researcher in summarizing and quantifying
differences between conditions.
Display 10.12a
% Unpack the parameter vector: beta and gamma (PTM gain and nonlinearity),
% Nm (multiplicative noise), Sa (additive noise), and the per-state learning
% factors Aa and Af (fixed at 1 for the initial training state).
beta = guess(1);
gamma = guess(2);
Nm = guess(3);
Sa = guess(4);
Aa = [1 guess(5 : (5 + Nstates - 2))];
Af = [1 guess((5+Nstates-1) : (5 + 2*Nstates - 3))];
Display 10.12b
% Next (row vector of external noise contrasts, from row 1 of the data
% matrix) and L = 0 are set before the loop.
for i = 1 : Nstates
    for j = 1 : NperformanceLevels
        % d' at this performance level (column 1 holds percent correct)
        d = norminv(data((i-1)*NperformanceLevels+1+j, 1)) - ...
            norminv(1 - data((i-1)*NperformanceLevels+1+j, 1));
        Log_observedThresholds = ...
            log(data((i-1)*NperformanceLevels+1+j, 2:(NnoiseLevels+1)));
        % PTM prediction of the log contrast thresholds at each noise level
        Log_predictedThresholds = 1/(2*gamma) * (2*log(d) ...
            + log((1+Nm^2) * (Af(i)*Next).^(2*gamma) ...
            + (Aa(i)*Sa)^2) - log(1 - Nm^2*d^2/2)) - log(beta);
        % Accumulate the least-squares cost in log threshold units
        L = L + sum((Log_observedThresholds - Log_predictedThresholds).^2);
    end
end
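The same PTM threshold expression can be written as a stand-alone function. The sketch below is in Python for illustration (the book's own code is MATLAB, and the function name is ours); it computes the predicted log contrast threshold at one external noise contrast:

```python
import math

def ptm_log_threshold(next_contrast, d_prime, beta, gamma, Nm, Sa,
                      Af=1.0, Aa=1.0):
    """Predicted log contrast threshold of the perceptual template model
    at sensitivity level d_prime and external noise contrast next_contrast.
    Af and Aa are the learning factors (1.0 before training)."""
    return (1.0 / (2.0 * gamma)) * (
        2.0 * math.log(d_prime)
        + math.log((1.0 + Nm**2) * (Af * next_contrast)**(2.0 * gamma)
                   + (Aa * Sa)**2)
        - math.log(1.0 - Nm**2 * d_prime**2 / 2.0)
    ) - math.log(beta)
```

As a sanity check, with no multiplicative noise, zero external noise, and beta = gamma = d' = 1, the predicted contrast threshold reduces to the additive noise level Sa.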
processing time (see section 6.3.1 of chapter 6). Usually, visual discrimination improves
with added processing time for short times, and then levels off when more processing time
ceases to yield new information. Originally used in the measurement of memory retrieval
for verbal materials,42,43 the time course of processing has also been measured for several
cases of visual discriminations.44–48
As in the CSF example earlier, the rapid early improvements and asymptotic form of
the SAT function can be fit by a number of functional forms.42,49–53 Here we consider the
exponential approach to a limit as one of the best descriptive equations for the functional
form of the SAT function:
$d'(t) = \lambda\left(1 - e^{-\beta(t-\delta)}\right)$ for $t > \delta$, and 0 otherwise, (10.24)
where $\lambda$ is the asymptotic accuracy, $\beta$ is the rate, and $\delta$ is the intercept of the time-course function.
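Equation 10.24 can be sketched directly. The fragment below is in Python for illustration, mirroring the lamda, beta, delta parameter names used in the MATLAB displays later in the chapter:

```python
import math

def sat_dprime(t, lamda, beta, delta):
    """Exponential approach to a limit (equation 10.24): accuracy is 0
    up to the intercept delta, then rises at rate beta toward the
    asymptote lamda as total processing time t grows."""
    if t <= delta:
        return 0.0
    return lamda * (1.0 - math.exp(-beta * (t - delta)))
```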
Data Analysis and Modeling 343
Display 10.12c
>> TvCs =
[0 0.0000 0.0205 0.0410 0.0820 0.1230 0.1640 0.2460 0.3281
.707 0.0553 0.0510 0.0712 0.0854 0.1458 0.1530 0.3669 0.3455
.794 0.0847 0.1363 0.1281 0.2041 0.2300 0.3059 0.4111 0.5137
.707 0.0462 0.0453 0.0421 0.0658 0.0779 0.1073 0.1461 0.2007
.794 0.0588 0.0786 0.0808 0.1134 0.1369 0.2038 0.2119 0.3176
.707 0.0349 0.0342 0.0343 0.0403 0.0561 0.0823 0.1190 0.1312
.794 0.0382 0.0418 0.0479 0.0548 0.0804 0.0937 0.1279 0.1840
.707 0.0334 0.0321 0.0323 0.0454 0.0628 0.0689 0.0953 0.1292
.794 0.0374 0.0378 0.0423 0.0572 0.0758 0.0862 0.1299 0.1707
.707 0.0307 0.0305 0.0333 0.0336 0.0565 0.0661 0.0899 0.1289
.794 0.0343 0.0389 0.0411 0.0530 0.0706 0.0794 0.1238 0.1631]
>> [beta, gamma, Nm, Sa, Aa, Af, r2_full, r2_reduced, F,
df1, df2, p] = TvCcomp(TvCs, 5)
beta = 0.9050, gamma = 1.3138, Nm = 0.1759, Sa = 0.0225
Aa = 1.0000 0.5971 0.3774 0.3161 0.3876
Af = 1.0000 0.5662 0.3508 0.3493 0.2793
r2_full = 0.9490, r2_reduced = 0.4896
F = 76.5759, df1 = 8, df2 = 68, p = 0
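The F statistic reported above follows directly from the two r² values and the parameter counts of the nested models. A Python sketch (our helper; the p-value would additionally require the F distribution, e.g., from a statistics library):

```python
def nested_f_test(r2_full, r2_reduced, k_full, k_reduced, n_points):
    """F test for a reduced model nested within a fuller model:
    F = ((r2_full - r2_reduced)/df1) / ((1 - r2_full)/df2),
    with df1 = k_full - k_reduced and df2 = n_points - k_full."""
    df1 = k_full - k_reduced
    df2 = n_points - k_full
    F = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return F, df1, df2
```

For the fit above (80 data points; 12 parameters in the full model, 4 in the reduced one), `nested_f_test(0.9490, 0.4896, 12, 4, 80)` reproduces F of about 76.6 with df1 = 8 and df2 = 68.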
[Graph axes: accuracy (d′) versus total processing time (s)]
Figure 10.10
SAT functions with the best-fitting exponential model. Symbols are measured data at seven different interruption
times for display sizes of 4 (top curve, squares) and 12 (bottom curve, ×). The best-fitting exponential functions
differ in asymptotic accuracy, but not in rate or intercept (data after Dosher, Han, and Lu45).
summarize the quality of fit with r2. Nested model tests indicate that the two display sizes
differ in asymptotic levels, and only modestly in rate, but not in intercept.
Display 10.13ad shows a program used for model comparison and fitting of SAT data,
fittingSATs.m. The input to the program is a matrix of the SAT data in two conditions,
for set size 4 and 12, for 7 points of interruption. The first two rows have the total processing time (lag plus latency) and d′ of the set size 4 condition at the seven interruption
times, while the third and fourth rows have total processing time and d′ for set size 12. The output is
the best-fitting parameters of the reduced model, the r2 of the fuller model, the r2 of a
reduced model, and the value of the F test for the nested models with degrees of freedom,
and the resulting p-value for the test.
As in other examples, bootstrap resampling could be used to estimate the variability in
the parameters of the fits of the descriptive model to the data. In this example, we have
shown how to fit the descriptive exponential approach to a limit as the functional form of
Data Analysis and Modeling 345
Display 10.13a
L = 0;
for i = 1:NSATs
    t = data((i-1)*2+1, :);            % total processing times
    Observed_dp = data((i-1)*2+2, :);  % observed d' values
    % Full model: each condition has its own lamda, beta, and delta
    Predicted_dp = lamda(i) .* (1 - exp(-beta(i) ...
        .* (t - delta(i))));
    L = L + sum((Observed_dp - Predicted_dp).^2);
end
Display 10.13c
%%% SATreduced_costfunc.m
function L = SATreduced_costfunc(guess, data)
% Reduced model: separate lamda and beta per condition, shared delta
% (unpacking of NSATs, lamda, beta, and delta from guess is omitted
% in this display)
L = 0;
for i = 1:NSATs
    t = data((i-1)*2+1, :);
    Observed_dp = data((i-1)*2+2, :);
    Predicted_dp = lamda(i) .* (1 - exp(-beta(i) ...
        .* (t - delta)));
    L = L + sum((Observed_dp - Predicted_dp).^2);
end
Display 10.13d
>> SATs =
[0.420 0.443 0.511 0.638 0.837 1.474 2.124
0.392 0.424 0.500 0.636 0.836 1.480 2.128
0.418 0.740 0.892 1.393 1.738 2.091 2.094
0.171 -0.016 0.677 0.820 0.882 1.124 1.087]
>> [lamda, beta, delta, r2_full, r2_reduced, F, df1, df2, p]
= fittingSATs(SATs)
lamda = 2.093 1.101, beta = 4.007 4.365, delta = 0.3604
r2_full = 0.9506, r2_reduced = 0.9443
F = 1.0279, df1 = 1, df2 = 8, p = 0.3403
Data Analysis and Modeling 347
the SAT, or time-course functions. Other functional forms of very similar shape have been
derived for process models of visual search.45,46 Yet other functional forms have been
derived from general models of decision, such as the diffusion model.56 All of these func-
tional forms are quite similar to the exponential and can be used to test the process models
that they are designed to capture.
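The bootstrap variability estimate mentioned above follows a simple recipe: resample the data with replacement, recompute the statistic (here, a full refit of the model), and summarize the spread of the resampled results. A minimal Python sketch of the resampling step, with an assumed helper name:

```python
import random

def bootstrap_se(values, statistic, n_boot=2000, seed=1):
    """Nonparametric bootstrap standard error: resample the observations
    with replacement n_boot times, recompute the statistic each time,
    and return the standard deviation of the resampled statistics."""
    rng = random.Random(seed)
    n = len(values)
    stats = []
    for _ in range(n_boot):
        resample = [values[rng.randrange(n)] for _ in range(n)]
        stats.append(statistic(resample))
    mean = sum(stats) / n_boot
    return (sum((s - mean) ** 2 for s in stats) / (n_boot - 1)) ** 0.5
```

In the SAT application, `statistic` would refit the exponential model to each resampled data set and return the parameter of interest.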
10.5 Summary
In this chapter, we have focused on classical methods of model fitting, parameter estima-
tion, estimation of parameter variability, and model comparison based on least-squares
and maximum likelihood methods. The initial part of the chapter provided a general
introduction and intuitive analysis of these methods, and the second part of the chapter
provided example applications and programs that are central to typical experiments in
psychophysics. The methods and the examples treated in this chapter should provide the
reader with the tools to develop their own programs to carry out model testing and esti-
mation for their applications.
References
1. Grolier. New Webster's dictionary and thesaurus of the English language. New York: Grolier; 1992.
2. Palmer J, Verghese P, Pavel M. 2000. The psychophysics of visual search. Vision Res 40(10): 1227–1268.
3. Graybill FA, Iyer HK. Regression analysis. Belmont, CA: Duxbury Press; 1994.
4. Efron B, Tibshirani RJ. An introduction to the bootstrap. Monographs on statistics & applied probability. Boca Raton: Chapman & Hall/CRC Press; 1994.
5. Davison AC, Hinkley DV. Bootstrap methods and their application. Cambridge, UK: Cambridge University Press; 1997.
6. Myung JI, Pitt MA. 2004. Model comparison methods. Methods Enzymol 383: 351–366.
7. Palmer J, Ames CT, Lindsey DT. 1993. Measuring the effect of attention on simple visual search. J Exp Psychol Hum Percept Perform 19(1): 108–130.
8. Hays WL. Statistics for the social sciences, Vol. 410. New York: Holt, Rinehart and Winston; 1973.
9. Akaike H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19(6): 716–723.
10. Schwarz G. 1978. Estimating the dimension of a model. Ann Stat 6(2): 461–464.
11. Kruschke JK. Doing Bayesian data analysis: A tutorial with R and BUGS. Burlington, MA: Academic Press; 2010.
12. Lee MD. 2008. Three case studies in the Bayesian analysis of cognitive models. Psychon Bull Rev 15(1): 1–15.
13. Raftery AE. 1995. Bayesian model selection in social research. Sociol Methodol 25: 111–164.
14. Pitt MA, Myung IJ, Zhang S. 2002. Toward a method of selecting among computational models of cognition. Psychol Rev 109(3): 472–491.
15. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian data analysis. Boca Raton, FL: CRC Press; 2004.
16. Shiffrin RM, Lee MD, Kim W, Wagenmakers EJ. 2008. A survey of model evaluation approaches with a tutorial on hierarchical Bayesian methods. Cogn Sci 32(8): 1248–1284.
17. Green DM, Swets JA. Signal detection theory and psychophysics. New York: Wiley; 1966.
18. Wickens TD. Elementary signal detection theory. Oxford: Oxford University Press; 2002.
19. Macmillan NA, Creelman CD. Detection theory: A user's guide. Hillsdale, NJ: Lawrence Erlbaum; 2005.
20. Lagarias JC, Reeds JA, Wright MH, Wright PE. 1998. Convergence properties of the Nelder–Mead simplex method in low dimensions. SIAM J Optim 9: 112–147.
21. Wichmann FA, Hill NJ. 2001. The psychometric function: I. Fitting, sampling, and goodness of fit. Atten Percept Psychophys 63(8): 1293–1313.
22. Mortensen U. 2002. Additive noise, Weibull functions and the approximation of psychometric functions. Vision Res 42(20): 2371–2393.
23. Tyler CW, Chen CC. 2000. Signal detection theory in the 2AFC paradigm: Attention, channel uncertainty and probability summation. Vision Res 40(22): 3121–3144.
24. Watson AB, Ahumada AJ. 2005. A standard model for foveal detection of spatial contrast. J Vis 5(9): 717–740.
25. Lu ZL, Dosher BA. 2008. Characterizing observers using external noise and observer models: Assessing internal representations with external noise. Psychol Rev 115(1): 44–82.
26. Lesmes LA, Lu ZL, Tran NT, Dosher BA, Albright TD. 2006. An adaptive method for estimating criterion sensitivity (d′) levels in yes/no tasks. J Vis 6(6): 1097.
27. Maloney LT. 1990. Confidence intervals for the parameters of psychometric functions. Atten Percept Psychophys 47(2): 127–134.
28. Lu Z-L, Dosher BA. 1999. Characterizing human perceptual inefficiencies with equivalent internal noise. J Opt Soc Am A Opt Image Sci Vis 16(3): 764–778.
29. Dosher BA, Lu ZL. 2000. Noise exclusion in spatial attention. Psychol Sci 11(2): 139–146.
30. Lu ZL, Lesmes LA, Dosher BA. 2002. Spatial attention excludes external noise at the target location. J Vis 2(4): 312–323.
31. Wonnacott TH, Wonnacott RJ. Regression: A second course in statistics. New York: Wiley; 1981.
32. Campbell FW, Robson JG. 1968. Application of Fourier analysis to the visibility of gratings. J Physiol 197(3): 551–566.
33. Enroth-Cugell C, Robson JG. 1966. The contrast sensitivity of retinal ganglion cells of the cat. J Physiol 187(3): 517–552.
34. Rohaly AM, Owsley C. 1993. Modeling the contrast-sensitivity functions of older adults. JOSA A 10(7): 1591–1599.
35. Zhou Y, Huang C, Xu P, Tao L, Qiu Z, Li X, Lu Z-L. 2006. Perceptual learning improves contrast sensitivity and visual acuity in adults with anisometropic amblyopia. Vision Res 46(5): 739–750.
36. Huang CB, Zhou Y, Lu ZL. 2008. Broad bandwidth of perceptual learning in the visual system of adults with anisometropic amblyopia. Proc Natl Acad Sci USA 105(10): 4068–4073.
37. Busemeyer JR, Diederich A. Cognitive modeling. Thousand Oaks, CA: Sage Publications; 2009.
38. Lu Z-L, Dosher BA. 1998. External noise distinguishes attention mechanisms. Vision Res 38(9): 1183–1198.
39. Dosher BA, Lu Z-L. 1998. Perceptual learning reflects external noise filtering and internal noise reduction through channel reweighting. Proc Natl Acad Sci USA 95: 13988–13993.
40. Dosher BA, Lu Z-L. 1999. Mechanisms of perceptual learning. Vision Res 39(19): 3197–3221.
41. Lu ZL, Dosher BA. 2009. Mechanisms of perceptual learning. Learning & Perception 1(1): 19–36.
42. Reed AV. 1973. Speed-accuracy trade-off in recognition memory. Science 181(4099): 574–576.
43. Dosher BA. 1976. The retrieval of sentences from memory: A speed-accuracy study. Cognit Psychol 8(3): 291–310.
44. McElree B, Carrasco M. 1999. The temporal dynamics of visual search: Evidence for parallel processing in feature and conjunction searches. J Exp Psychol Hum Percept Perform 25(6): 1517–1539.
45. Dosher BA, Han S, Lu ZL. 2004. Parallel processing in visual search asymmetry. J Exp Psychol Hum Percept Perform 30(1): 3–27.
46. Dosher BA, Han S, Lu ZL. 2010. Information-limited parallel processing in difficult heterogeneous covert visual search. J Exp Psychol Hum Percept Perform 36(5): 1128–1144.
47. Carrasco M, McElree B. 2001. Covert attention accelerates the rate of visual information processing. Proc Natl Acad Sci USA 98(9): 5363–5367.
48. Carrasco M, McElree B, Denisova K, Giordano AM. 2003. Speed of visual processing increases with eccentricity. Nat Neurosci 6(7): 699–700.
49. Reed AV. 1976. List length and the time course of recognition in immediate memory. Mem Cognit 4(1): 16–30.
50. Dosher BA. 1979. Empirical approaches to information processing: Speed-accuracy tradeoff functions or reaction time – A reply. Acta Psychol (Amst) 43(5): 347–359.
51. Dosher BA. 1981. The effects of delay and interference: A speed-accuracy study. Cognit Psychol 13(4): 551–582.
52. Ratcliff R. 1978. A theory of memory retrieval. Psychol Rev 85(2): 59–108.
53. Ratcliff R. 1980. A note on modeling accumulation of information when the rate of accumulation changes over time. J Math Psychol 21(2): 178–184.
54. McElree B, Dosher BA. 1989. Serial position and set size in short-term memory: The time course of recognition. J Exp Psychol Gen 118(4): 346–373.
55. McElree B, Dosher BA. 1993. Serial retrieval processes in the recovery of order information. J Exp Psychol Gen 122(3): 291–315.
56. Ratcliff R. 1985. Theoretical interpretations of the speed and accuracy of positive and negative responses. Psychol Rev 92(2): 212–225.
11 Adaptive Psychophysical Procedures
Adaptive procedures are developed to reduce the burden of data collection in psychophys-
ics by creating more efficient experimental test designs and methods of estimating either
statistics or parameters. In some cases, these adaptive procedures may reduce the amount
of testing by as much as 80% to 90%. This chapter begins with a description of classical
staircase procedures for estimating the threshold and/or slope of the psychometric function,
followed by a description of modern Bayesian adaptive methods for optimizing psycho-
physical tests. We introduce applications of Bayesian adaptive procedures for the estima-
tion of psychophysically measured functions and surfaces. Each method is accompanied
by an illustrative example, sample results, and a discussion of the practical requirements
of the procedure.
Previous chapters have described classical methods in visual psychophysics for measuring
important perceptual properties of detection, such as the threshold or the psychometric
function. One such method, the method of constant stimuli, requires the measurement of
behavioral performance for a number of stimulus values and requires a large number of
test trials. Such procedures are relatively expensive in data collection. They also presume
a certain a priori knowledge about where to place stimulus values to cover a range of
performance levels from chance to maximum performance. The experimental demand is
multiplied in situations where we are measuring either a threshold or a psychometric func-
tion for multiple groups or different conditions. This is more typical than not in testing
different hypotheses about vision. Furthermore, in some circumstances we may not really
know in advance which stimulus values to test for a given observer, a special population,
or for novel stimulus manipulations.
Adaptive procedures serve a critical function in the measurement of psychophysical
properties. They provide procedures and methods that use each new observation to adjust
the stimulus values tested, making the most of each new observation. This leads to
methods that are more efficient. Good estimates may be achieved with much less testing.
352 Chapter 11
Furthermore, within limits, adaptive procedures are self-ranging, selecting stimulus values to test that are better matched to the observer.
The adaptive methods are all the more critical when multiple measurements or condi-
tions are required or when measuring overall perceptual functions, such as the contrast
sensitivity function. A method of constant stimuli approach may require the measurement
of a full psychometric function at each of a number of stimulus conditions or different
groups. Without adaptive methods, certain studies simply require too much testing to
be feasible. Increasingly in vision science, it is functions, like the contrast sensitivity
function or the threshold versus external noise contrast function, that provide important
information in the specification of individuals or tasks. Adaptive testing procedures
have the potential for an especially large payoff in efficiency, especially for these new
applications.
With the advancement of new computer technology and the availability of new algo-
rithms for search and optimization, the area of adaptive testing is making rapid advance-
ments and is likely to be increasingly important in the field. This chapter begins with some
of the simpler, classical approaches to adaptive testing and ends with some examples of
adaptive measurements of complex visual functions.
The simplest is the 1-up and 1-down staircase, which converges on 50% correct (or yes) responses.1 The
transformed staircases such as 2/1, 3/1, or 4/1 reduce contrast (or increase difficulty) after
2, 3, or 4 consecutive correct responses and increase contrast after every error.
They converge to probabilities of 70.7%, 79.4%, and 84.1%, respectively.
A common use of truncated staircases is for the estimation of a point of subjective
equality or the point at which a test stimulus subjectively matches a standard stimulus.
For example, a truncated staircase might be used to estimate the perceived length of a line
ending in arrows by comparing it to a line alone (the Müller-Lyer illusion2). Or, the subjectively equal luminance of a patch in a black surround may be estimated by comparing
it to a sample patch in a neutral gray surround. The two percepts are equal when one
chooses the test over the standard 50% of the time.
Figure 11.1a shows a possible psychometric function for a contrast matching experi-
ment. The psychometric function graphs the percent yes for classifying a test patch as
having more contrast than a standard patch. The function goes from near zero at the lowest
contrast values where the test clearly has less contrast, through the 50% point, and
approaches 100% at the highest contrasts that are clearly more than the standard. The 1-up
and 1-down (or 1/1) staircase estimates the 50% point in a relatively small number of
trials.
The 1/1 adaptive staircase increases the manipulated variable, here the contrast of the
test, after every no response (when the test stimulus is estimated as lower) and decreases
the contrast of the test after every yes response. Over a sequence of responses, the
staircase tracks the point of subjective equality. Figure 11.1b shows an example of such a
staircase that tracks a threshold stimulus value for a simulated observer with the psychometric function in figure 11.1a. Because there is noise or randomness in the observer's
[Graphs: (a) probability of yes versus contrast of the test stimulus; (b) stimulus contrast versus number of trials]
Figure 11.1
A psychometric function and sample staircase for contrast matching. (a) A psychometric function for the match-
ing task. (b) A sample trial sequence of a truncated (1/1) staircase. The filled circles indicate yes responses;
the unfilled circles indicate no responses.
internal response, each run of the staircase leads to a slightly different sequence of stimuli
and responses.
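A 1/1 staircase of this kind is easy to simulate. The sketch below is in Python (the helper name and the logistic psychometric function are our assumptions for illustration): the test contrast is lowered after each simulated yes and raised after each no, so the track settles near the 50% point:

```python
import math, random

def simulate_1up_1down(p_yes, start, step, n_trials=80, seed=0):
    """Simulate a truncated 1-up, 1-down staircase.  p_yes(c) gives the
    observer's probability of a "yes" response at contrast c; the track
    converges on the contrast where p_yes(c) = 0.5."""
    rng = random.Random(seed)
    c, track = start, []
    for _ in range(n_trials):
        track.append(c)
        if rng.random() < p_yes(c):
            c -= step          # "yes": lower the test contrast
        else:
            c += step          # "no": raise the test contrast
        c = min(max(c, 0.0), 1.0)
    return track

# Logistic psychometric function with its point of subjective equality at 0.3
pf = lambda c: 1.0 / (1.0 + math.exp(-(c - 0.3) / 0.05))
track = simulate_1up_1down(pf, start=0.6, step=0.02)
```

Averaging the later trials of `track` recovers a value near the 50% point of the assumed psychometric function.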
In a 2-down, 1-up (or 2/1) staircase, the contrast or signal level is decreased after every
new set of two consecutive correct responses and is increased after every incorrect
response.3 When the staircase converges, the performance level p can be computed by
equating the probability that the staircase goes up and down:
$(1 - p) + p(1 - p) = p^2$ (11.1)
$p = \sqrt{2}/2 = 0.707$,
corresponding to 70.7% correct.
In a 3-down, 1-up (or 3/1) staircase, the signal level is decreased after every new set of
three consecutive correct responses, and the signal level is increased after each error. When
the staircase converges, the performance level p again can be computed by equating the
probability of the staircase going up or down:
$(1 - p) + p(1 - p) + p^2(1 - p) = p^3$ (11.2)
$p = \sqrt[3]{1/2} = 0.794$,
corresponding to 79.4% correct. A similar computation for the 4-down, 1-up (or 4/1)
staircase leads to a performance level of 84.1%.
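Equations 11.1 and 11.2 share a pattern: a k-down, 1-up staircase steps down with probability p^k and up otherwise, so at convergence p^k = 1/2. A one-line Python check of the tracked percent correct (helper name is ours):

```python
def kdown_1up_convergence(k):
    """Percent correct tracked by a k-down, 1-up staircase: setting the
    down probability p**k equal to the up probability 1 - p**k gives
    p = 0.5 ** (1/k)."""
    return 0.5 ** (1.0 / k)
```

For k = 2, 3, and 4 this reproduces the 70.7%, 79.4%, and 84.1% levels quoted above.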
Figure 11.2a shows a psychometric function of percent correct versus stimulus contrast
starting at 50% correct for zero or small contrasts and increasing to 100% at high contrasts,
a typical psychometric function for two-alternative forced-choice discrimination. The
psychometric function in this example is typical for the discrimination of the orientation
of a 15° Gabor in the periphery. Sample trial sequences for 2/1, 3/1, and 4/1 staircases
are shown in figures 11.2b–d.
All up–down staircases begin at a starting value specified by the experimenter and increase or decrease the next presented stimulus value by a step-size. Following the suggestions of Levitt, we illustrated trial runs in which the step-size is decreased
by half after the first, third, seventh reversals, and so on. A reversal occurs if the stimulus
value moves up when it was last moved down, or vice versa. All staircase procedures have
a stop rule, a rule that determines when to stop the estimation. The stop rule might
involve testing a certain number of trials, but often the rule itself is also adaptive, and
specifies testing until after a certain number of reversals. The procedures in figures 11.1
and 11.2 are illustrated for a stop rule of 80 trials, but many investigators use a stop rule
of a fixed number of reversals (e.g., 10 reversals). A standard practice in psychophysics
is to exclude the first three reversals in computing the threshold if the number of total
reversals is odd or to exclude the first four reversals if the total number of reversals is
even. The estimate of the threshold in the example in figure 11.2 averages the last six
endpoints. This eliminates some of the early range-finding trials from the threshold estima-
[Graphs: (a) probability correct versus contrast of the test stimulus; (b–d) stimulus contrast versus number of trials]
Figure 11.2
Sample transformed staircases for 2/1, 3/1, and 4/1 methods for the same psychometric function tracking 70.7%,
79.4%, and 84.1% correct, respectively. (a) A psychometric function for Gabor orientation discrimination. (b) A
sample trial sequence of a 2-down, 1-up staircase. The estimated threshold is 0.263. (c) A sample trial sequence
of a 3-down, 1-up staircase. The estimated threshold is 0.309. (d) A sample trial sequence of a 4-down, 1-up
staircase. The estimated threshold is 0.356. The filled circles indicate correct responses; the unfilled circles
indicate incorrect responses.
tion and guarantees a balanced number of up and down reversal points, which reduces bias
in the estimate.
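The reversal bookkeeping described above can be sketched as follows (Python; the helper is ours, recording the stimulus value at each direction change and averaging the last several):

```python
def reversal_threshold(track, n_last=6):
    """Find reversals (changes in the direction of stimulus movement) in
    a staircase track and estimate threshold as the mean stimulus value
    at the last n_last reversals.  Returns (estimate, reversal_count)."""
    reversals = []
    last_dir = 0
    for prev, cur in zip(track, track[1:]):
        d = (cur > prev) - (cur < prev)      # +1 moving up, -1 moving down
        if d != 0:
            if last_dir != 0 and d != last_dir:
                reversals.append(prev)       # stimulus value at the turn
            last_dir = d
    tail = reversals[-n_last:]
    return sum(tail) / len(tail), len(reversals)
```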
A staircase procedure to estimate a particular threshold is judged in three ways: bias,
precision, and efficiency. Bias refers to whether the average threshold estimated by the
staircase corresponds to the actual value or is higher or lower than it. Precision refers to
the variability of the estimate. The relative efficiency is related to the number of trials
required to achieve a given precision. The technical definitions of these criteria are dis-
cussed in section 11.2.5. These three aspects of performance depend on how far the starting
value is from the actual threshold, the step-size, and the stop rule.
Figure 11.3 (plate 8) illustrates estimates of bias, precision, and typical trial numbers
for a 2-down, 1-up staircase for a psychometric function like the one in figure 11.2a.
[Figure 11.3 (plate 8): (a) bias (dB), (b) precision, and (c) the frequency distribution of the number of trials for the 2/1 staircase; (d–f) maps of bias, precision, and number of trials as a function of starting value (dB) and step-size (dB)]
We used simulation studies to investigate the behavior of these 2/1 staircases with different
starting values and step-sizes, ranging from 10 dB below the true threshold to 10 dB
above the true threshold. [A decibel is a logarithmic function of the ratio of two amplitude
values, $20\log_{10}(x/y)$; 1 dB corresponds to an amplitude ratio of about 1.122.]
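The decibel bookkeeping used throughout these simulations is a one-liner (Python, illustrative naming):

```python
import math

def amplitude_db(x, y):
    """Decibel difference between two amplitude values: 20*log10(x/y)."""
    return 20.0 * math.log10(x / y)

# 1 dB corresponds to an amplitude ratio of 10**(1/20), about 1.122
one_db_ratio = 10.0 ** (1.0 / 20.0)
```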
Consistent with the early reports of Levitt,3 the optimal starting value for the staircase
is near the actual threshold. Here, we know the true threshold because we are simulating
the process. In experimental situations, this is an unknown, and experimenters do their
best to estimate a reasonable starting value based on available information.
The optimal initial step-size depends on both the threshold and the slope of the psychometric function. It should be large enough to traverse the rising portion of the psychometric
function in one or two steps. For the 2/1 staircase, this corresponds to a step-size on the
order of 0.5 divided by the (ordinary) slope of the psychometric function at the threshold,
which is $2^{-1}\log(2)\,\beta/\alpha$ for a Weibull function with threshold $\alpha$ and slope parameter $\beta$.
Figure 11.3ac (plate 8) assumes a reasonable initial starting value (about 0.65 dB above
the true threshold) and step-size selection and shows (a) the bias of the staircase estimate
as a function of the number of trials, (b) the precision of the staircase estimate as a func-
tion of the number of trials, and (c) the distribution of number of trials needed for a
Adaptive Psychophysical Procedures 357
10-reversal stop rule. Figure 11.3df (plate 8) shows how bias, precision, and numbers of
trials vary as the starting value and step-size are changed. For this particular step-size rule
(in which the step size is halved after the first, third and seventh reversal) and stop rule,
it is very important that the initial step-size is large enough to achieve good accuracy. As
mentioned previously, this should be about 0.5 divided by the slope of the psychometric
function near threshold (which is graphed as 0 dB step-size). Reductions in bias trade off against decreases in precision. The best starting values and starting step-sizes depend on the step-size reduction scheme and on the stop rule. A starting value and initial step-size for the
staircase should be chosen to achieve desirable bias and precision by choosing appropriate
values in the contour plots. For a cautious experimenter, a simulation study such as this
one may prove very useful in planning a particular experiment or study.
[Graphs: (a) probability versus contrast of the test stimulus; (b) stimulus contrast versus number of trials]
Figure 11.4
Psychometric function and sample accelerated stochastic approximation (ASA) staircases. (a) A psychometric
function for discrimination. (b) Trial sequences of two ASA staircases, converging at 60% and 80% correct. The
filled circles indicate yes responses; the unfilled circles indicate no responses.
Figure 11.4 shows (a) a psychometric function for two-alternative forced-choice dis-
crimination and (b) sample trial histories corresponding to target probabilities of 60% and
80% for the ASA staircase.
In an influential review of adaptive psychophysical procedures, Treutwein6 recom-
mended the ASA as the best available adaptive procedure for measuring thresholds. Figure
11.5ac (plate 9) shows the results of simulation studies of an ASA procedure for (a) bias,
(b) precision, and (c) total trial number for a given starting value and step size. Graphs in
Figure 11.5df (plate 9) show the sensitivity of (d) bias, (e) precision, and (f) trial number
to the starting value and initial step-size. The simulation studies suggest that the optimal
starting value c1 is at the (unknown) threshold, and the optimal initial step-size s should
traverse the psychometric function in one or two steps. Just as for the transformed stair-
cases, these initial values may be informed by prior knowledge or pilot data.
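The ASA update itself is compact. The sketch below follows Kesten's accelerated stochastic approximation as summarized in Treutwein's review (Python; all names are ours, and the exact formulation in a given implementation may differ): the step starts at s and shrinks to s/(2 + m) once responses begin to alternate, where m counts those shifts:

```python
import math, random

def asa_staircase(p_resp, phi, start, s, n_trials=200, seed=0):
    """Accelerated stochastic approximation staircase converging on the
    stimulus level where the response probability equals phi.  z is the
    binary response; the stimulus moves by -step*(z - phi), with step
    s/n on the first two trials and s/(2 + m) after m response shifts."""
    rng = random.Random(seed)
    c, m, prev_z, track = start, 0, None, []
    for n in range(1, n_trials + 1):
        track.append(c)
        z = 1 if rng.random() < p_resp(c) else 0
        if prev_z is not None and z != prev_z:
            m += 1                     # a shift in response direction
        prev_z = z
        step = s / n if n <= 2 else s / (2 + m)
        c = max(c - step * (z - phi), 0.0)
    return track

# Assumed 2AFC psychometric function rising from 0.5, with its 75% point at 0.3
pf = lambda c: 0.5 + 0.5 / (1.0 + math.exp(-(c - 0.3) / 0.05))
track = asa_staircase(pf, phi=0.75, start=0.5, s=0.3)
```

Because the step shrinks with the number of shifts, the track tightens quickly around the targeted performance level, which is also why a too-small initial step can leave it stranded.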
The ASA staircases have many good properties, most notably the ability to converge on
any arbitrary accuracy level relatively quickly. They also have a fairly large region of
starting values in threshold and step-size that yield relatively unbiased estimates. The
profile of bias and precision is somewhat better than that of the 2/1 or 3/1 transformed
staircases tracking similar percent correct levels. Because the ASA staircase shifts to small
step-sizes fairly quickly, it may be less robust to unfortunate selections of initial value,
random errors early in the trial sequence, or drifts in the level of observer performance
over a session. For this reason, the initial step-sizes should be chosen to be large enough.
[Figure 11.5 (plate 9): (a) bias (dB), (b) precision, and (c) the frequency distribution of the number of trials for the ASA staircase; (d–f) maps of bias, precision, and number of trials as a function of starting value (dB) and step-size (dB)]
It is important in a real experiment to optimize the selection of the starting value and
the step-size for a given stop rule and to choose a stop rule appropriate for the goal of the
estimation. If the threshold and the slope of the psychometric function are known in
advance, this is quite easy. Of course, in general, one does not know either, and the entire
point of the adaptive procedure is to be able to do a quick estimate of threshold on the
basis of a smaller number of trials. In such cases, it is useful to collect pilot data to increase
the quality of the initial guesses from estimates of the first few observers.
Another important practical issue is the potential impact of trial-to-trial dependencies,
often called sequential dependencies, on the staircase performance. Sequential dependencies, a lack of independence in responses from trial to trial, arise due to fluctuations in
alertness, or systematic variation in criteria for Yes/No procedures, or predictability of the
stimulus, to name just a few possible causes. For this reason, experimenters often interleave
several staircases in a single experimental session so that adjacent trials may be drawn
from different independent staircases whose stimulus values will be decoupled from one
another.3 Sometimes these are staircases for different conditions. If an experiment is
designed to measure only a single condition, it is often still a good idea to include several
independent copies of staircases measuring the same target performance interleaved in a
session.
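Interleaving is purely a scheduling matter: on each trial, the next stimulus is drawn from a randomly chosen staircase. A minimal Python sketch of the schedule (helper name is ours):

```python
import random

def interleave_schedule(n_staircases, n_trials, seed=0):
    """Randomly assign each trial of a session to one of several
    independent staircases, so that adjacent trials usually come from
    different staircases with decoupled stimulus values."""
    rng = random.Random(seed)
    return [rng.randrange(n_staircases) for _ in range(n_trials)]

schedule = interleave_schedule(3, 120)
```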
In some cases, quick staircase estimates may be used to choose good values for a more
trial-intensive measurement of the full psychometric function. For example, if 2/1 and 3/1
updown staircases are used as a pretest to estimate the contrasts corresponding to the
70.7% and 79.4% performance levels, these in turn can be used to set contrasts to test for
a 5-point psychometric function [c1, c2, c3, c4, c5, where c2 = c70.7%, c4 = c79.4%, c1 = 0.5c2,
c3 = 0.5(c2 + c4 ), and c5 = 2c4]. These five signal contrast levels represent a relatively
efficient sampling of the psychometric functions when tested with a randomized method
of constant stimuli.12
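This placement rule can be sketched as a small helper function (a hypothetical helper; the pretest values below are made up for illustration):

```python
def five_point_contrasts(c_707, c_794):
    """Place five contrasts for the method of constant stimuli from
    2/1 and 3/1 up-down staircase pretest estimates (c2 and c4)."""
    c2, c4 = c_707, c_794          # 70.7% and 79.4% contrasts from the pretest
    c1 = 0.5 * c2                  # half of the 70.7% contrast
    c3 = 0.5 * (c2 + c4)           # midway between the two estimates
    c5 = 2.0 * c4                  # twice the 79.4% contrast
    return [c1, c2, c3, c4, c5]

# e.g., pretest estimates of 4% and 6% contrast
levels = five_point_contrasts(0.04, 0.06)
```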
$$\text{bias} = \frac{\sum_{i=1}^{n_{\text{runs}}} (v_{\text{estimated},i} - v_{\text{actual}})}{n_{\text{runs}}}.$$
Ideally, a procedure will lead to an unbiased estimator, one where the bias = 0 when nruns
approaches infinity. In practice, it may be sufficient to achieve a bias that approximately
equals the precision of the measurement or slightly better.
The variability in the estimates yielded by the procedure is

$$s_{\text{estimated}}^2 = \frac{\sum_{i=1}^{n_{\text{runs}}} (v_{\text{estimated},i} - \bar{v}_{\text{estimated}})^2}{n_{\text{runs}} - 1}.$$
The higher the variability, the lower the precision of the estimate. Precision is defined as the inverse of the variability of the estimates, or $1/s_{\text{estimated}}^2$, and this too is usually estimated by simulation.
As summarized in Treutwein,6 the efficiency of the threshold procedure is related to the
so-called sweat factor K, which is the product of the variance of the threshold estimate
and the fixed number of test trials used to obtain the estimate,10,13 or

$$K = n_{\text{trials}}\, s_{\text{estimated}}^2 = n_{\text{trials}} \frac{\sum_{i=1}^{n_{\text{runs}}} (v_{\text{estimated},i} - \bar{v}_{\text{estimated}})^2}{n_{\text{runs}} - 1}.$$
The sweat factor increases with the number of trials needed to provide the estimate and
also increases with the variability of the estimate. Recovering a low variability estimate
in few trials makes a procedure more efficient.
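Bias, variance, and the sweat factor are typically estimated by Monte Carlo simulation. A minimal sketch, where `fake_procedure` is a hypothetical stand-in for a full adaptive run:

```python
import random

def evaluate_procedure(run_procedure, v_actual, n_runs=2000, seed=1):
    """Estimate bias, variance, and sweat factor K for an adaptive
    procedure by running it n_runs times."""
    rng = random.Random(seed)
    estimates, n_trials = [], 0
    for _ in range(n_runs):
        v_est, n_trials = run_procedure(rng)  # each run returns (estimate, trials used)
        estimates.append(v_est)
    mean_est = sum(estimates) / n_runs
    bias = sum(e - v_actual for e in estimates) / n_runs
    var = sum((e - mean_est) ** 2 for e in estimates) / (n_runs - 1)
    return bias, var, n_trials * var  # K = n_trials * s^2

# Stand-in "procedure": 40 trials yielding a noisy, unbiased threshold estimate.
def fake_procedure(rng):
    return rng.gauss(0.05, 0.01), 40

bias, var, K = evaluate_procedure(fake_procedure, v_actual=0.05)
```

In a real evaluation, `run_procedure` would execute the staircase or Bayesian procedure against a simulated observer with a known threshold.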
The efficiency of a psychophysical procedure is quantified by comparison to a standard
or ideal procedure. The ideal procedure is usually benchmarked by the asymptotic variance
of a Robbins–Monro process, which is

$$s_{RM}^2 = \frac{\phi(1 - \phi)}{n_{\text{trials}} \left[ \left. \dfrac{d\Psi(x)}{dx} \right|_{x = \text{threshold}} \right]^2},$$

where φ is the target probability level and Ψ(x) is the psychometric function.
The efficiency of a given procedure is then efficiency_procedure = K_ideal / K_procedure. By definition, because a given procedure can at best equal the ideal procedure (or else the ideal procedure isn't ideal), the efficiency is at most 1.
In the next section, we describe the Bayes rule and Bayesian updating before describing
the remaining adaptive methods.
The Bayes rule has been an extremely important computational principle in computer
science and in several areas of modern statistics. Starting with a set of prior probabilities,
the Bayes rule provides a powerful method to improve or update the probability estimates
using available observations or evidence. Posterior probability estimates will favor events
that are more consistent with the observations.
Next, we show the Bayes rule in a simple computational example. We start with two urns.
Each urn contains black and white marbles, but in different mixtures. At the beginning of
a game, one urn is chosen, and all marbles are drawn from that urn throughout the game.
The observer does not know which urn was chosen, but is told that the two urns will be
chosen with equal probability. All of the marbles (samples) throughout the game are chosen
from the urn that was chosen. The job of the observer is to guess which urn is being sampled.
The observer knows (was told) that the prior probability of the two urns is the same,
or p(Urn1) = p(Urn2) = 0.5. They are told that Urn1 has 25% white marbles, so that the
conditional probability of a white marble from Urn1 is 25%, or p(White | Urn1) = 0.25 and
p(Black | Urn1) = 0.75. They are also told that Urn2 has 60% white marbles, so that the
conditional probability of a white marble from Urn2 is 60%, or p(White | Urn2) = 0.60
and p(Black | Urn2) = 0.40 (figure 11.6). Now, the observer draws 10 marbles and observes
that 7 are white and 3 are black. Intuitively, Urn2 is more likely to be the source
than Urn1 because it has more white marbles. The Bayes rule provides the equation to
update the estimate of the probability of which urn was selected. It computes the posterior
probability of Urn1 and Urn2 given the observation of 7 white marbles out of 10:
$$p(\text{Urn}_1 \mid W = 7, B = 3) = \frac{p(\text{Urn}_1)\, p(W \mid \text{Urn}_1)^7\, p(B \mid \text{Urn}_1)^3}{\sum_{i=1}^{2} p(\text{Urn}_i)\, p(W \mid \text{Urn}_i)^7\, p(B \mid \text{Urn}_i)^3}.$$
Figure 11.6
An illustration of the various probability entries used in the Bayes rule for a simple example.
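Assuming independent draws (so the likelihood of 7 white and 3 black marbles is the product of per-draw probabilities, and the binomial coefficient cancels), the posterior for this example can be computed directly:

```python
def posterior_urns(prior, p_white, n_white, n_black):
    """Posterior probability of each urn given the observed draws."""
    likelihoods = [pw ** n_white * (1 - pw) ** n_black for pw in p_white]
    unnorm = [pr * lk for pr, lk in zip(prior, likelihoods)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Priors 0.5/0.5; Urn1 is 25% white, Urn2 is 60% white; 7 white and 3 black drawn.
post = posterior_urns([0.5, 0.5], [0.25, 0.60], n_white=7, n_black=3)
# Urn2 now carries almost all of the posterior probability.
```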
11.3.3 QUEST
The first of the Bayesian methods developed for estimating the psychophysical threshold
was the QUEST procedure.14 QUEST assumes that the psychometric function is a Weibull
Adaptive Psychophysical Procedures 365
expressed as a function of intensity in decibels (dB). This is a log form. This form of the
Weibull differs slightly from the earlier forms provided in this book because the input, x ,
is expressed in decibels:
$$\Psi_T(x) = (1 - \lambda) - (1 - \gamma - \lambda)\,\exp\!\left[-10^{\beta(x - T + \varepsilon)/20}\right]. \quad (11.8)$$

In this equation, γ specifies the chance level, which is 1/n for n-alternative forced-choice (i.e., 50% for two-alternative forced-choice) or the false alarm rate for a Yes/No paradigm. The parameter γ is determined by the testing paradigm. The value β specifies the slope of the psychometric function. The value T is the threshold or location parameter, and the value ε is associated with the proportion correct for the selected threshold performance level. The parameter λ is the lapse rate, which sets the highest performance level at less than 100%. Although in principle β might be estimated from the data, the QUEST procedure requires that it be specified in advance; it is usually assigned a value of 3.5 for two-alternative forced-choice paradigms. The only parameter to estimate is T.
The function fT (T ) is the prior probability distribution for the threshold T . The experi-
menter sets this before the experiment. In QUEST, the priors are usually set to a Gaussian
distribution with a large standard deviation and a mean that is chosen by the experimenter
based on any prior experimental evidence that is available.
The stimulus value on the next trial, x_{n+1}, is set to be the mode of the QUEST function Q_n(T). The QUEST function is the log posterior distribution of threshold T after trial n, based on the Bayesian update rule:

$$Q_n(T) = \ln f_T(T) + \sum_{i=1}^{n} \{ r_i \ln \Psi_T(x_i) + (1 - r_i) \ln[1 - \Psi_T(x_i)] \}. \quad (11.9)$$
The value r_i is set to 1 for a success (correct trial) and 0 for a failure (error) on the trial. The term ln Ψ_T(x_i) or ln[1 - Ψ_T(x_i)] is the log likelihood of the success or failure response on trial i. Before the first trial (n = 0), the first stimulus is set as the mode of the prior distribution.
In this scheme, the next value of the stimulus is the current maximum likelihood estimate
of the threshold. As more samples are taken, the tested values converge on the estimated
threshold value. The QUEST procedure sometimes uses a stop rule based on the confidence
interval around the maximum likelihood estimate, which is set to a particular value (e.g.,
1 dB). Alternatively, it uses a total number of trials as the stop criterion. At the end, the
best estimate of the threshold is the final maximum likelihood estimate, or the maximum
of the QUEST function on the last trial.
Display 11.1 shows a sample program, QUEST.m, that runs a simple experiment using
the QUEST function from Psychtoolbox. The experiment uses a two-alternative forced-
choice Gabor orientation discrimination task. For the QUEST function, the experimenter
Display 11.1
%% Experimental Module
provides an initial guess for the threshold Tguess, the standard deviation of the prior distribution, Tprior, and the values for the function parameters γ, β, and ε. For the two-alternative forced-choice paradigm shown in this example, γ = 0.5, β = 3.5, and ε = 1.5 dB, which corresponds to 92% correct at threshold. Estimation of the threshold using QUEST usually requires 50–60 trials. Estimating a 75% threshold sets ε = −0.91 dB, which is less efficient and requires more testing, perhaps as many as 100 trials.
Figure 11.7 shows (a) the assumed psychometric function, here graphed in decibels of
contrast, and (b) and (c) two sample trial sequences corresponding with two different target
threshold accuracies of 75% and 92% correct, respectively, for the QUEST procedure.
Watson and Pelli14 simulated the QUEST procedure for 4, 8, 16, 32, 64, and 128 trials.
They estimated that convergence occurs by 64 trials in most circumstances.
Figure 11.7
Assumed psychometric function and sample simulated trial sequences from the QUEST method for estimating
two thresholds. (a) A psychometric function for a two-alternative forced choice task. (b, c) Two sample trial
sequences from QUEST for estimating 75% and 92% correct thresholds.
QUEST is one of the most popular procedures for estimating the threshold if the
approximate slope of the psychometric function is known in advance. An alternative to
QUEST is ZEST,15 which is identical except it uses the mean rather than the mode of the
QUEST function Qn (T ) to estimate the threshold.
In this form of the psychometric function, x is the stimulus contrast, and the percent correct is given by the standard cumulative Gaussian G (mean 0, standard deviation 1) evaluated at (1/√2)(x/a)^b. The parameter λ is the lapse rate, a probability with which the observer guesses, which limits the maximum percent correct at high values of x. The value of a is the threshold at the midpoint of the psychometric function, usually about 75% correct, and b is the slope. This corresponds to a psychometric function in which d′(x) = (x/a)^b. A slope of 2.8 in this equation is equivalent to the QUEST slope of 3.5 because of the different functional forms of the psychometric function.
Before the first trial, the experimenter defines the prior probability distribution for the
two parameters, the threshold and slope of the psychometric function, as p0 ( a, b ). The
priors represent a best guess about these two parameters. A lapse rate such as 0.05 is typi-
cally assumed and not estimated.
370 Chapter 11
Before every trial t + 1, starting at t = 0, the expected probability of a response r for any stimulus value x is

$$p_{t+1}(r \mid x) = \sum_{a}\sum_{b} p_t(a, b)\,\{r\,\Psi(x \mid a, b) + (1 - r)[1 - \Psi(x \mid a, b)]\}. \quad (11.11)$$
In this equation, r = 1 is for a success (correct trial), and r = 0 is for a failure (error).
The probability of a correct trial is predicted by Ψ(x | a, b), and the probability of an error
is predicted by 1 - Ψ(x | a, b). That is, the probabilities of a correct response and an error
response are predicted by the location x along the psychometric function Ψ(x | a, b). Every
pair of threshold and slope values yields a different predicted probability of a correct or
error response for a stimulus value x. The expected probability of response r given stimulus
value x is computed by summing over all possible values of the two parameters weighted
by their prior probability p_t(a, b) as of trial t. This computation provides the expected
probability of either response r for any stimulus x.
To select the stimulus for the next trial, the expected entropy is computed for any stimu-
lus value x. To do this, the expected posterior probability distribution is computed for any
stimulus value x for both potential responses r using the Bayes rule:
$$p_{t+1}(a, b \mid x, r) = \frac{p_t(a, b)\,\{r\,\Psi(x \mid a, b) + (1 - r)[1 - \Psi(x \mid a, b)]\}}{\sum_{a}\sum_{b} p_t(a, b)\,\{r\,\Psi(x \mid a, b) + (1 - r)[1 - \Psi(x \mid a, b)]\}}. \quad (11.12)$$
The entropy is a measure of the uncertainty associated with the values of random variables. Uncertainty is the opposite of predictability: the more we know about the random variables (here the parameters), the lower the entropy. To choose the next stimulus, the expected entropy is computed for all possible stimulus values,

$$E[H_{t+1}(x)] = \sum_{r=0}^{1} p_{t+1}(r \mid x) \left\{ -\sum_{a}\sum_{b} p_{t+1}(a, b \mid x, r) \log[p_{t+1}(a, b \mid x, r)] \right\}, \quad (11.13)$$

and the stimulus value with the smallest expected entropy is selected for presentation on the next trial.
This method uses a greedy search algorithm that looks one step ahead and asks which stimulus value x is most likely to reduce the variability in the posterior probability distributions for the two parameters a and b; that stimulus is defined as the one that minimizes the expected entropy. Greedy algorithms are heuristics that aim to find the true overall optimum by optimizing each step by itself.
After the stimulus value to test xt +1 has been selected, we run the trial and observe the
response rt +1. Then, the posterior distribution is updated using the Bayes rule for the
observed data:
$$p_{t+1}(a, b \mid x_{t+1}, r_{t+1}) = \frac{p_t(a, b)\,\{r_{t+1}\,\Psi(x_{t+1} \mid a, b) + (1 - r_{t+1})[1 - \Psi(x_{t+1} \mid a, b)]\}}{\sum_{a}\sum_{b} p_t(a, b)\,\{r_{t+1}\,\Psi(x_{t+1} \mid a, b) + (1 - r_{t+1})[1 - \Psi(x_{t+1} \mid a, b)]\}}. \quad (11.14)$$
Figure 11.8
Assumed psychometric function and sample estimates of threshold and slope using the Psi (Ψ) procedure. (a) A psychometric function with threshold at 0.027, slope of 2.8, and lapse rate of 0.05. (b) A sample simulated trial sequence. (c) The estimated threshold and (d) the estimated slope of the psychometric function.
This provides the new posterior distribution. This procedure is reiterated until the criterion
for the stop rule is met.
The standard stop rule for the method is based on the number of trials. Kontsevich
and Tyler16 simulated a two-alternative forced-choice experiment and determined that
threshold estimation within 2 dB (or about 23%) precision usually required 30 trials if you
can assume a given slope, whereas estimating the slope with a similar precision may take
as many as 300 trials. Figure 11.8 shows a simulated trial sequence, estimated threshold,
and slope of the psychometric function from the method.
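The one-step-ahead computation of the Ψ method can be sketched on a deliberately coarse (a, b) grid. The psychometric form below (proportion correct Φ(d′/√2) with d′(x) = (x/a)^b and a simplified lapse correction) follows the description above but is an illustrative assumption:

```python
import math

def pc(x, a, b, lapse=0.05):
    """2AFC proportion correct: Phi(d'/sqrt(2)) with d'(x) = (x/a)**b,
    plus a simple lapse correction (a sketch, not the exact published form)."""
    d = (x / a) ** b
    phi = 0.5 * (1 + math.erf(d / 2))   # Phi(d/sqrt(2))
    return lapse / 2 + (1 - lapse) * phi

def expected_entropy(x, grid, prior):
    """Expected posterior entropy over (a, b) if we test at stimulus x."""
    total = 0.0
    for r in (1, 0):
        like = [pc(x, a, b) if r else 1 - pc(x, a, b) for a, b in grid]
        p_r = sum(p * l for p, l in zip(prior, like))        # cf. eq. 11.11
        post = [p * l / p_r for p, l in zip(prior, like)]    # cf. eq. 11.12
        H = -sum(p * math.log(p) for p in post if p > 0)     # posterior entropy
        total += p_r * H                                     # cf. eq. 11.13
    return total

# Coarse grids over threshold a, slope b, and candidate stimulus contrasts.
grid = [(a / 100, b) for a in range(2, 11) for b in (1.0, 2.0, 3.0)]
prior = [1.0 / len(grid)] * len(grid)
stimuli = [c / 100 for c in range(1, 21)]
x_next = min(stimuli, key=lambda x: expected_entropy(x, grid, prior))
```

After the observed response, the posterior from the Bayes rule replaces the prior and the loop repeats.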
8.3 of chapter 8). One cannot interpret a threshold associated with a percent yes level
such as 75% without considering decision bias.
An adaptive method, the quick Yes/No (q-YN) method, was developed by Lesmes
et al.17 to estimate thresholds for Yes/No tasks. It takes bias into consideration by estimating
the threshold for a fixed d′ rather than a target percent yes. The method combines signal-detection theory (SDT) with Bayesian adaptive inference to rapidly measure the sensitivity
and decision parameters of the observer.
In the q-YN method, the d′ psychometric function is assumed to be of this form:

$$d'(c; \tau, \gamma, \delta) = \frac{\delta\,(c/\tau)^{\gamma}}{\sqrt{(\delta^2 - 1) + (c/\tau)^{2\gamma}}}. \quad (11.15)$$

The value τ is the sensitivity threshold at d′ = 1, γ controls the steepness of the d′ versus contrast psychometric function, and δ is the maximum or saturating d′ at high contrast. For applications with nearly perfect performance corresponding with d′ > 4, the asymptote parameter δ can be set to 5. With δ set, the q-YN method estimates the two free parameters that describe the d′ psychometric function: the threshold, τ, and the steepness parameter, γ. To estimate the threshold that corresponds with d′ of 1, the q-YN procedure must estimate a decision criterion, λ, or the false alarm rate when the signal is absent. It uses the estimated false alarm rate to construct the psychometric function for percent yes:

$$\Psi_{\text{yes}}(c) = 1 - G\big(\lambda - d'(c; \tau, \gamma, \delta)\big), \quad (11.16)$$
where G(x) is the standard cumulative Gaussian function. If a lapse rate for inattention errors, ε (usually set at 4%), is included, the corresponding psychometric function is

$$\Psi'_{\text{yes}}(c) = \varepsilon + (1 - \varepsilon)\,\Psi_{\text{yes}}(c) = \varepsilon + (1 - \varepsilon)\left[1 - G\!\left(\lambda - \frac{\delta\,(c/\tau)^{\gamma}}{\sqrt{(\delta^2 - 1) + (c/\tau)^{2\gamma}}}\right)\right]. \quad (11.17)$$
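The d′ and percent yes psychometric functions can be written directly. The symbol names below (tau, gamma, delta, lam, eps) follow the reconstruction above, with the criterion expressed in d′ units:

```python
import math

def d_prime(c, tau, gamma, delta=5.0):
    """d' psychometric function (equation 11.15): d' = 1 at c = tau,
    saturating at delta for high contrast c."""
    u = (c / tau) ** gamma
    return delta * u / math.sqrt((delta ** 2 - 1) + u ** 2)

def p_yes(c, tau, gamma, lam, delta=5.0, eps=0.04):
    """Probability of a 'yes' response with criterion lam and lapse eps
    (equation 11.17); G is the standard cumulative Gaussian."""
    G = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return eps + (1 - eps) * (1 - G(lam - d_prime(c, tau, gamma, delta)))
```

At zero contrast d′ is 0, so p_yes reduces to the false alarm rate implied by the criterion (plus the lapse contribution).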
There are three different variants of the q-YN procedure: one for simple Yes/No
detection, one that allows separate conservative and liberal criteria, and another that allows
three response categories: yes, no, and unsure. The simple Yes/No procedure estimates
three parameters: the threshold τ, the steepness of the psychometric function γ, and the false
alarm rate λ. A procedure that cues either a strict or a liberal criterion on different trials
requires four parameters: τ, γ, and λ_strict and λ_liberal to define two different percent yes
psychometric functions, one for the strict and one for the liberal criterion. These two
functions both express the same d′ psychometric function with different false alarm
rates. A similar elaboration with multiple criteria describes performance in the rating
procedure.
In the q-YN method, psychometric functions are defined with either three or four
parameters. Before any testing, the experimenter defines a prior probability distribution
for each of the parameters. The principle of estimation and selection of the next stimulus
is the same as for the other Bayesian methods. The stimulus value for the next trial is
selected to minimize the expected entropy and therefore to provide the most expected new
information. A Bayesian update procedure is used to compute the posterior distributions
for each parameter given the priors and the observed data. The posterior distribution of
the threshold parameter is estimated during the q-YN procedure directly, so it is simple
to use a stop rule defined as a target precision for the distribution of that parameter. Alter-
natively, a set sample size may also define the stop rule.
While the Bayesian principles are the same, the expansion from the two parameter distributions of the Ψ method, for example, to the three or four parameters of the q-YN applications poses significant additional demands on computer memory and computation, which must be optimized to allow the efficient on-line computation that occurs between trials in choosing the next stimulus.
An example of the q-YN procedure applied to a simple Yes/No detection experiment
is shown in figure 11.9. This shows (a) a d′ psychometric function along with (b) the
corresponding percent yes psychometric functions for a strict criterion and a lax criterion.
Figure 11.9c and d show sample testing sequences for the strict and lax criterion
applications. The horizontal lines in figure 11.9c and d mark the true threshold, which
is known here because this is a simulation. The sample testing sequences include a
number of very low contrast trials, which provide data to the Bayesian model related to
the false alarm rate.
The q-YN method has been tested by simulation and psychophysically compared with
results from the method of constant stimuli.17 The simple q-YN, the two-criterion, and the
rating methods all showed excellent convergence. All estimates of threshold were closely
equated to the threshold estimated by the method of constant stimuli, and to each other.
The results suggest that about 25 trials may be sufficient to estimate a threshold corresponding to d′ of 1 with a precision of about 2–3 dB.
The q-YN is a reliable new method for estimating the threshold and the bias in Yes/No
paradigms. These Yes/No methods deliver criterion-free thresholds that have previously
been exclusive to forced-choice paradigms and methods.
Many of the fundamental properties of the visual system measured in visual psychophysics
are expressed as functions or surfaces of stimulus properties. One example is the contrast
sensitivity function, which captures the sensitivity to stimuli of different spatial frequencies
and has been widely used as a front end in models of early visual processing. Another
example might be sensitivity as a function of several characteristics such as spatial fre-
quency and temporal frequency. The traditional ways of measuring these functions for
individual observers are quite laborious. For example, for contrast sensitivity functions,
Figure 11.9
An illustration of the quick Yes/No (q-YN) method of threshold estimation. (a) A d′ psychometric function and (b) corresponding percent yes psychometric functions for a strict criterion and a lax criterion. (c, d) Sample testing sequences for the strict and lax criteria. Filled circles are yes responses, and open circles are no responses.
ability. The methods compute a trial-by-trial update of the posterior probability distribu-
tions of the parameters. The next stimulus that will be most informative for testing is
selected by minimizing the one-step ahead expected entropy, and the procedure chooses
an appropriate stop rule.
Quick adaptive methods for the estimation of parameters of known functions have the
potential for significant reductions in the testing requirements in estimating these basic
perceptual functions for individuals.
$$\log_{10}[S(f)] = \begin{cases} \log_{10}(\gamma_{\max}) - \kappa\left(\dfrac{\log_{10}(f) - \log_{10}(f_{\max})}{\beta'/2}\right)^{2}, & \text{if } f \geq f_{\max}, \\[1.5ex] \log_{10}(\gamma_{\max}) - \delta, & \text{if } f < f_{\max} \text{ and the log-parabola falls below } \log_{10}(\gamma_{\max}) - \delta, \end{cases} \quad (11.18)$$

where κ = log₁₀(2) and β′ = log₁₀(2β). Each set θ = {γ_max, f_max, β, δ} of parameter values, together with this functional form, defines a different CSF.
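A sketch of the truncated log-parabola, implementing the low-frequency truncation as a floor (this reading of the truncation rule is an assumption):

```python
import math

def log_csf(f, gamma_max, f_max, beta, delta):
    """Log contrast sensitivity at spatial frequency f for CSF parameters
    (gamma_max, f_max, beta, delta); cf. equation 11.18."""
    kappa = math.log10(2)
    beta_prime = math.log10(2 * beta)
    parabola = math.log10(gamma_max) - kappa * (
        (math.log10(f) - math.log10(f_max)) / (beta_prime / 2)) ** 2
    if f < f_max:  # truncate (flatten) on the low-frequency side
        return max(parabola, math.log10(gamma_max) - delta)
    return parabola
```

The function peaks at log₁₀(γ_max) when f = f_max and flattens to log₁₀(γ_max) − δ at very low frequencies.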
The probability of a correct response for a grating of frequency f and contrast c is given
by the log-Weibull psychometric function:
The q-CSF method assumes that the steepness parameter of the psychometric functions does not change with spatial frequency; it is usually set to 2 based on the experimental evidence in this domain. The q-CSF allows for a small proportion of lapse errors (usually set to 4%).29,30
After choosing this functional form, the experimenter sets a prior probability density
for the four parameters of the CSF. To choose the stimulus to test on the next trial, a one-step ahead search finds the grating stimulus (a combination of spatial frequency and
contrast) that minimizes the expected entropy, or maximizes the expected information
gain, about the CSF parameters. Data collected at one spatial frequency improve the estimates of the parameters of the whole CSF, and so each trial of data improves the estimates
across all spatial frequencies. The stop rule for the q-CSF aims to achieve a certain precision for the parameter estimates. Alternatively, one could set the stop rule to a certain
number of trials.
Although the Bayesian update procedure in the quick methods is analogous to that in the Ψ method,16 it involves optimization over three or four parameters rather than two and may vary the stimulus in a multidimensional space (here spatial frequency and contrast).
The multidimensional parameter space is labeled θ, where θ = (f_max, γ_max, β, δ), and
the multidimensional stimulus space is labeled x, where x = (f, c) for the spatial frequency
and contrast of the test stimulus. Before the first trial, the experimenter defines the a priori
probability distribution p_0(θ) that represents prior evidence about typical values of the
parameters of the function describing the shape of the CSF.
Then, before every trial t + 1, starting at t = 0, the q-CSF computes the expected
probability of each of the responses r = 0 or r = 1 in every possible stimulus condition
x = ( f , c ):
$$p_{t+1}(r \mid x) = \sum_{\theta} \{r\,p(\text{correct} \mid \theta, x) + (1 - r)[1 - p(\text{correct} \mid \theta, x)]\}\; p_t(\theta). \quad (11.20)$$
It applies the Bayes rule to compute the posterior probability distributions p_{t+1}(θ | x, r) following each possible response to each possible stimulus x:

$$p_{t+1}(\theta \mid x, r) = \frac{p_t(\theta)\,\{r\,p(\text{correct} \mid x, \theta) + (1 - r)[1 - p(\text{correct} \mid x, \theta)]\}}{\sum_{\theta} p_t(\theta)\,\{r\,p(\text{correct} \mid x, \theta) + (1 - r)[1 - p(\text{correct} \mid x, \theta)]\}}. \quad (11.21)$$
It then computes the expected entropy of the posterior for every possible stimulus, following either a correct or an incorrect response to x:

$$E[H_{t+1}(x)] = \sum_{r=0}^{1} p_{t+1}(r \mid x) \left\{ -\sum_{\theta} p_{t+1}(\theta \mid x, r) \log[p_{t+1}(\theta \mid x, r)] \right\}. \quad (11.22)$$
The stimulus condition chosen for testing on the next trial is the stimulus that minimizes
the expected entropy:
$$x_{t+1} = \arg\min_{x} E[H_{t+1}(x)]. \quad (11.23)$$
The actual response on trial t + 1 is used to compute the posterior probability func-
tion corresponding to xt +1 and used as the prior probability function for the subsequent
trial:
$$p_{t+1}(\theta \mid x_{t+1}, r_{t+1}) = \frac{p_t(\theta)\,\{r_{t+1}\,p(\text{correct} \mid x_{t+1}, \theta) + (1 - r_{t+1})[1 - p(\text{correct} \mid x_{t+1}, \theta)]\}}{\sum_{\theta} p_t(\theta)\,\{r_{t+1}\,p(\text{correct} \mid x_{t+1}, \theta) + (1 - r_{t+1})[1 - p(\text{correct} \mid x_{t+1}, \theta)]\}}. \quad (11.24)$$
The estimation of one-step ahead expected entropy over four parameters in a four-
dimensional parameter space for each stimulus specified in a two-dimensional space would
require a prohibitive amount of computation if it were done exhaustively. The integration
of modern computational algorithms, such as Markov chain Monte Carlo (MCMC),31 into
the q-CSF speeds this computation. MCMC methods are a class of algorithms that are
based on sampling or estimating the posterior distributions as a function of the multidi-
mensional parameter and multidimensional stimulus spaces. It is estimated that MCMC
algorithms may reduce the computational load by a factor of 100 or more. Integration of
the MCMC algorithms into the search for the next stimulus is what makes the q-CSF
feasible in real-time testing.
Figure 11.10a shows a hypothetical underlying CSF along with marks showing the
simulated sequence of test trials used by q-CSF. Figure 11.10b shows an actual empiri-
cal sequence of test trials from a q-CSF run along with the final best-fitting CSF for
those data.
Figure 11.10
An illustration of a quick contrast sensitivity function (q-CSF) method. (a) A hypothetical underlying CSF along
with marks showing the simulated sequence of test trials used by the q-CSF. (b) An actual empirical sequence
of test trials from a q-CSF run along with the final best-fitting CSF for those data.
The q-CSF method was validated with a psychophysical study. The q-CSF with about
100 trials and 5 min of testing had excellent agreement with a CSF measured for the same
individuals using classic methods, 1000 trials and 50 min of testing, to a precision of 2–3
dB across the different spatial frequencies.19 These validation results were consistent with
simulation analyses of the method.
The q-CSF method provides an effective and efficient method to measure the CSF in
both normal and clinical applications.20 Measurement of this function is now far more accessible for integration into larger studies of visual function.
The corresponding expected percent correct as a function of external noise and signal
contrast is defined using the log-Weibull psychometric function:
The q-TvC uses the Bayesian adaptive procedure described in the previous section for the q-CSF. In this case, the multidimensional parameter space is v = (c_0, N_c, β), where c_0 is the threshold contrast in low noise, N_c is the critical value of external noise, and β is the common slope of the log psychometric function at all levels of external noise. The multidimensional stimulus space is x = (N_ext, c), the level of external noise and the signal contrast.
Figure 11.11 shows typical empirical results using the q-TvC method. The q-TvC
method has been evaluated with both simulation and psychophysical validation experi-
ments. Simulations showed that fewer than 300 trials are needed to estimate TvC func-
tions at three widely separated criteria with a bias less than 5% and a precision of 1.5 dB
or less.18 The method showed excellent agreement between estimates of the TvCs with
the q-TvC and a traditional method of constant stimuli in an orientation identification
task, with correlations greater than 0.95. The q-TvC used 240 trials, and the method
of constant stimuli used 1920 trialsa nearly 90% reduction in the amount of required
testing.
The q-TvC provides a relatively efficient measurement of threshold versus external
noise contrast functions for the characterization of observers. The rapidity of measurement makes
Figure 11.11
An illustration of the quick threshold versus external noise contrast (q-TvC) method. TvC functions (at 65%,
79%, and 92% correct) estimated by the q-TvC method from (a) 240 and (b) 480 trials. Shaded regions represent ±1 SD. The TvC functions collected with the method of constant stimuli are presented as circles, with error bars reflecting variability estimates (±1 SD).
it a candidate for use in clinical populations, for children, and in situations in which this
might be tested in multiple conditions.
Bayes factors greater than 3 or 4 indicate that Model1 is substantially better than Model 2.
The Bayes factor is one component of the more general Bayesian formulation of the pos-
terior beliefs about the two models:
$$\frac{p(\text{Model}_1 \mid \text{Data})}{p(\text{Model}_2 \mid \text{Data})} = \frac{p(\text{Data} \mid \text{Model}_1)}{p(\text{Data} \mid \text{Model}_2)} \cdot \frac{p(\text{Model}_1)}{p(\text{Model}_2)}. \quad (11.28)$$
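The computation in equation 11.28 is trivial, but writing it out makes the interplay between evidence and priors concrete:

```python
def posterior_odds(bayes_factor, prior_model1, prior_model2):
    """Posterior odds of Model1 over Model2: the Bayes factor
    (evidence ratio) times the prior odds (equation 11.28)."""
    return bayes_factor * (prior_model1 / prior_model2)

# With equal priors, the posterior odds equal the Bayes factor;
# a strong prior for Model2 can overturn a Bayes factor of 4.
equal = posterior_odds(4.0, 0.5, 0.5)
skewed = posterior_odds(4.0, 0.1, 0.9)
```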
The first factor following the equal sign is the Bayes factor, and the second factor
takes into account the experimenters priors about the two competing models. Pos-
terior beliefs may disagree with the Bayes factor if the priors for one model over
another are sufficiently strong. It has been claimed that Bayesian model selection
automatically takes model complexity into consideration.32 This is because models with
more potential parameter values have more diffuse prior probability distributions. This
contrasts with the approach of the Akaike Information Criterion (AIC) and Bayesian
Information Criterion (BIC), which directly punish models for the number of free
parameters.
Some researchers have used the Bayesian machinery to formulate an alternative to
standard hypothesis testing. They have replaced traditional hypothesis tests such as t-tests,
analysis of variance, and linear and nonlinear regression with a Bayesian approach focused
11.6 Summary
Adaptive methods have the potential to convert measurements that were the purview of
a few laboratories and heroic observers into rapid testing protocols. This opens the field
to consideration of individual differences in visual function and to measurement of indi-
viduals for whom long protocols are impossible, such as children, or special populations.
In combination with mobile testing methods that use mobile devices (such as the iPad),
we may take visual testing into the field.
An additional value for adaptive methods in the laboratory is the ability to measure
changes in state, such as the effect of practice, in a smaller number of trials. It also allows
us to consider measuring performance in a larger number of conditions.
The adaptive psychophysical methods offer important and wide-ranging advantages.
There remain, however, issues with these methods that require care in application and that
may be addressed in further development. One possible issue is that of sequential depen-
dencies in testing, or predictability or incentive in the testing sequence. A related issue is
the impact of lapse trials that may be overweighted in estimation, especially if they occur
at particular points in testing. The introduction of a certain number of trials drawn more
broadly from the stimulus range and so sampling more of the parameter space may guard
against atypical early trials and generally improve the robustness of testing.
Another component that may be improved by future development involves the greedy
one-step ahead methods that currently optimize sampling based on computations involving
the next trial alone. As new algorithms are developed and computational speed is improved,
we may expect that new methods may be able to consider more global optimization involv-
ing multiple steps ahead.
Finally, the parametric methods we described here assume a single functional form or
estimate an index, for example the threshold, for one condition. Further work is under
way to develop protocols using adaptive methods to help choose between several differ-
ent functional forms, or optimize the testing for differences in parameter values (for the
same form) between two or more conditions. The further development of adaptive testing
is an important wave of the future that will contribute to new theory and to the applica-
tion of measurement and estimation for individuals in biomedical and related research
areas.
References
1. Dixon WJ, Mood A. 1948. A method for obtaining and analyzing sensitivity data. J Am Stat Assoc 43: 109–126.
2. Müller-Lyer FC. 1889. Optische Urteilstäuschungen. Archiv für Physiologie 2 (Suppl): 263–270.
3. Levitt H. 1971. Transformed up–down methods in psychoacoustics. J Acoust Soc Am 49: 467–477.
4. Kesten H. 1958. Accelerated stochastic approximation. Ann Math Stat 29: 41–59.
5. Robbins H, Monro S. 1951. A stochastic approximation method. Ann Math Stat 22(3): 400–407.
6. Treutwein B. 1995. Adaptive psychophysical procedures. Vision Res 35(17): 2503–2522.
7. Derman C. 1957. Non-parametric up-and-down experimentation. Ann Math Stat 28(3): 795–798.
8. Kaernbach C. 1991. Simple adaptive testing with the weighted up–down method. Atten Percept Psychophys 49(3): 227–229.
9. Tyrrell RA, Owens DA. 1988. A rapid technique to assess the resting states of the eyes and other threshold phenomena: The modified binary search (MOBS). Behav Res Methods 20(2): 137–141.
10. Taylor MM, Creelman CD. 1967. PEST: Efficient estimates on probability functions. J Acoust Soc Am 41: 782–787.
11. Pentland A. 1980. Maximum likelihood estimation: The best PEST. Atten Percept Psychophys 28(4): 377–379.
12. Green DM. 1990. Stimulus selection in adaptive psychophysical procedures. J Acoust Soc Am 87: 2662–2674.
13. Taylor M. 1971. On the efficiency of psychophysical measurement. J Acoust Soc Am 49: 505–508.
14. Watson AB, Pelli DG. 1983. QUEST: A Bayesian adaptive psychometric method. Atten Percept Psychophys 33(2): 113–120.
15. King-Smith PE, Grigsby SS, Vingrys AJ, Benes SC, Supowit A. 1994. Efficient and unbiased modifications of the QUEST threshold method: theory, simulations, experimental evaluation and practical implementation. Vision Res 34(7): 885–912.
16. Kontsevich LL, Tyler CW. 1999. Bayesian adaptive estimation of psychometric slope and threshold. Vision Res 39(16): 2729–2737.
17. Lesmes LA, Lu ZL, Tran NT, Dosher BA, Albright TD. 2006. An adaptive method for estimating criterion sensitivity (d′) levels in yes/no tasks. J Vis 6(6): 1097.
18. Lesmes LA, Jeon ST, Lu ZL, Dosher BA. 2006. Bayesian adaptive estimation of threshold versus contrast external noise functions: The quick TvC method. Vision Res 46(19): 3160–3176.
19. Lesmes LA, Lu ZL, Baek J, Albright TD. 2010. Bayesian adaptive estimation of the contrast sensitivity function: The quick CSF method. J Vis 10(3): 17.1–21.
20. Hou F, Huang CB, Lesmes L, Feng LX, Tao L, Zhou YF, Lu ZL. 2010. qCSF in clinical application: Efficient characterization and classification of contrast sensitivity functions in amblyopia. Invest Ophthalmol Vis Sci 51(10): 5365–5377.
21. Weber EH. De pulsu, resorptione, auditu et tactu. Leipzig: Koehler; 1834.
22. Fechner G. Elemente der Psychophysik. Leipzig: Breitkopf & Härtel; 1860.
23. Graham NVS. Visual pattern analyzers. Oxford: Oxford University Press; 2001.
24. Campbell FW, Robson JG. 1968. Application of Fourier analysis to the visibility of gratings. J Physiol 197(3): 551–566.
25. Enroth-Cugell C, Robson JG. 1966. The contrast sensitivity of retinal ganglion cells of the cat. J Physiol 187(3): 517–552.
26. Movshon JA, Thompson ID, Tolhurst DJ. 1978. Spatial and temporal contrast sensitivity of neurones in areas 17 and 18 of the cat's visual cortex. J Physiol 283(1): 101–120.
27. Watson AB, Ahumada AJ. 2005. A standard model for foveal detection of spatial contrast. J Vis 5(9): 717–740.
28. Schwartz SH. Visual perception: A clinical orientation. New York: McGraw-Hill Medical; 2009.
29. Wichmann FA, Hill NJ. 2001. The psychometric function: I. Fitting, sampling, and goodness of fit. Percept Psychophys 63(8): 1293–1313.
30. Swanson W, Birch E. 1992. Extracting thresholds from noisy psychophysical data. Percept Psychophys 51(5): 409–422.
31. Gilks WR, Richardson S, Spiegelhalter DJ. Markov chain Monte Carlo in practice. Boca Raton: Chapman & Hall/CRC Press; 1996.
32. Kruschke JK. Doing Bayesian data analysis: A tutorial with R and BUGS. Burlington, MA: Academic Press; 2010.
33. Lee MD. 2008. Three case studies in the Bayesian analysis of cognitive models. Psychon Bull Rev 15(1): 1–15.
34. Lee MD. 2008. BayesSDT: Software for Bayesian inference with signal detection theory. Behav Res Methods 40(2): 450–456.
35. Raftery AE. 1995. Bayesian model selection in social research. Sociol Methodol 25: 111–164.
36. Pitt MA, Myung IJ, Zhang S. 2002. Toward a method of selecting among computational models of cognition. Psychol Rev 109(3): 472–491.
37. Shiffrin RM, Lee MD, Kim W, Wagenmakers EJ. 2008. A survey of model evaluation approaches with a tutorial on hierarchical Bayesian methods. Cogn Sci 32(8): 1248–1284.
38. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian data analysis. Boca Raton, FL: CRC Press; 2004.
39. Sivia DS, Skilling J. Data analysis: A Bayesian tutorial. Oxford: Oxford University Press; 2006.
40. Carlin BP, Louis TA. Bayesian methods for data analysis. Boca Raton: Chapman & Hall/CRC Press; 2009.
41. Box GEP, Tiao GC. Bayesian inference in statistical analysis. New York: John Wiley; 1992.
12 From Theory to Experiments
A theory is a set of principles that explains the functioning of a system. From this set of
theoretical principles, the experimenter must decide which aspects of the behavior of the
system will be the focus of investigation and (ideally) develop a model of the system and
generate a specific hypothesis and predictions. This is the process of starting with a theory
and making that theory testable. Given a specific prediction or hypothesis, choices must
be made about the general approach and the specific methods and design of testing.
Chapter 10 discussed the methods of quantitative data analysis and model fitting. Chapter
11 discussed efficient testing procedures. This chapter considers the process of translating
a theory to a hypothesis and then constructing a strong experiment. We focus on the
development of experiments to provide strong tests of theories.
The researcher's next job is to make predictions and choose a paradigm to test those
predictions. This is an iterative process. The selection of paradigm and predictions may
be refined in successive attempts. The selection of a general approach to testing a theory
may suggest a good paradigm for collecting data, but then the specific theoretical predic-
tions need to be developed in more detail for the specific testing paradigm.
Next, researchers collect data or observations, summarize the results, and compare the
observed results to the theory or model predictions. If the data are incompatible with the
theory or model predictions, this requires a modification of the form of the model that
applies in the particular domain or leads to a small or sometimes larger change in the
theory itself. If the data are compatible with the predictions, this provides further support
for the theory, and new predictions can be developed for further detailed testing. The next
section provides several concrete examples.
the amounts of internal additive or multiplicative noise, or the nonlinearity in the visual
system of an observer.
The insight was that this kind of model could be adapted to measure how the observer
changed between two or more states. For example, if the state of attending versus not
attending has no effect on visual perception, then the same system parameters would
characterize performance in the two states. Or, if visual perception is different in the
attended and unattended states, then the framework allows us to identify how attention
changes the specific parameters of the system.
This strong theoretical framework of the PTM and external noise paradigm suggested
testable predictions of the two verbal ideas about mechanisms by which attention might
operate. The PTM framework created a natural way to develop performance signatures for
the two mechanisms of attention for threshold versus external noise contrast (TvC) func-
tions by measuring contrast thresholds in different amounts of external noise in the stimu-
lus. Varying external noise provides a specific test of whether attention improves external
noise filtering. External noise filtering by attention improves performance differentially in
high-noise tests. In contrast, low-noise conditions allow us to ask whether attention still
improves performance by making the stimulus somehow clearer even when there is no
external noise to be filtered. We called this mechanism stimulus enhancement. External
noise filtering was modeled as a reduction in the impact of external noise, whereas stimulus
enhancement was modeled as a reduction in the internal additive noise.
Figure 12.1 shows the signature performance patterns for different mechanisms of atten-
tion as measured with external noise paradigms. It shows three distinct patterns or behav-
ioral signatures, each corresponding to a particular mechanism of attention: (i) filtering
out external noise by changing the template; (ii) enhancing the stimulus through reductions
in internal additive noise; (iii) reductions in internal multiplicative noise or changes in
nonlinearity parameter.
Effects of filtering in high external noise, or in the presence of visual distractors, cor-
respond to the verbal idea of attention that operates like a filter by changing the template.
Effects of attending in clear or noiseless displays at threshold correspond to the verbal
idea of attention that operates by clarifying the visual representation. Both of these patterns
have been observed in the data of attention experiments that measured TvC functions of
spatially cued attention and of divided attention.10–15 Of the two, filtering is the mechanism
seen most often in attention experiments and may be the dominant mechanism across many
different attention manipulations and paradigms. Sometimes, both mechanisms are found
together in an experimental condition.12,13 To date, there is no evidence for the third pos-
sible mechanism of attention through reduction of multiplicative noise or changes in
nonlinearity that improve behavioral performance.
This analysis of visual attention started with verbal ideas or theories and created a
framework for understanding attention effects by casting them within a quantitative model
or theory of the observer. This in turn allowed the generation of testable predictions that
[Figure 12.1 image: panel (a) is the PTM flow diagram (template, nonlinearity, multiplicative noise N_m, additive noise N_a, decision); panels (b)–(d) plot signal contrast (%) against external noise contrast for the three attention signatures.]
Figure 12.1
Mechanisms of attention within the perceptual template model (PTM) tested in TvC functions. (a) The perceptual
template model. (b) Signature TvC changes by stimulus enhancement, where attention reduces contrast threshold
in low external noise. (c) Signature TvC changes by external noise exclusion, where attention reduces contrast
threshold in high external noise. (d) Signature TvC changes by internal multiplicative noise reduction, where
attention alters the nonlinearity or multiplicative noise in the system affecting contrast threshold in all external
noise conditions.
were signatures of each mechanism. This framework has provided one useful way to clas-
sify and think about effects of attention.16,17
[Figure 12.2 image: three panels of response plotted against stimulus contrast.]
Figure 12.2
Potential effects of attention on the fMRI BOLD contrast response function: (a) contrast gain, (b) response gain,
and (c) baseline increase. Solid lines show responses in attended conditions compared to unattended conditions
shown with dashed lines.
the BOLD response to signal stimuli of different contrasts in several visual cortical areas:
V1, V2, V3, V3A, and V4.18 Cortical contrast response functions in the fMRI differed in
attended and unattended conditions.
Using fMRI to measure neural responses to signals of increasing contrast in the absence
of visual noise can discriminate several different effects of attention in the absence of
external noise. Distinctive patterns, shown in figure 12.2, illustrate three functionally dif-
ferent ways in which attention can affect the contrast response of the visual system.
Amplification of contrast increases responses only to stimuli with intermediate
contrasts: a particular contrast when attended has the same effect as a larger contrast in
the unattended condition. This pattern is labeled contrast gain in the physiology litera-
ture.19,20 If neural responses are multiplied for attended stimuli, this leads to small differ-
ences at low contrasts and increasing differences between attended and unattended
conditions at higher contrasts. This pattern is labeled response gain.21 Finally, attention
might simply raise the baseline fMRI response, shifting the response upward across the
contrast response function.19,21
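These three patterns can be made concrete with a descriptive contrast response function. The sketch below uses a Naka–Rushton form, a common descriptive model that we adopt here for illustration only; all function names and parameter values are invented, not estimates from the study:

```python
def naka_rushton(c, rmax=1.0, c50=0.2, n=2.0, baseline=0.0):
    # Hypothetical contrast response function (Naka-Rushton form);
    # parameter values are illustrative.
    return baseline + rmax * c ** n / (c ** n + c50 ** n)

# Three attention patterns from figure 12.2, expressed as parameter changes:
def contrast_gain(c):
    return naka_rushton(c, c50=0.1)        # attended contrast acts like a larger contrast

def response_gain(c):
    return naka_rushton(c, rmax=1.3)       # responses multiplied at all contrasts

def baseline_shift(c):
    return naka_rushton(c, baseline=0.2)   # constant additive offset
```

Contrast gain produces its largest attended-versus-unattended difference at intermediate contrasts, response gain produces differences that grow with contrast, and a baseline shift adds the same offset everywhere.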
In one experiment we measured fMRI responses to sine waves of different contrasts.
Observers either attended to a peripheral annulus containing a sine wave or did not
attend to the sine wave and performed a task at fixation instead. Attention conditions
were blocked together in this design. Each block of attended or unattended trials included
a mixed sequence of trials of different contrast levels for the sine wave in an event-
related design. The fMRI BOLD responses were measured for several seconds after each
stimulus onset. This fMRI investigation discovered that attention does two things. It
increases the responses at all contrasts through a baseline shift in activity. Attention also
amplifies the response to contrast, showing a contrast gain pattern with increased
responses for intermediate contrasts. The difference between attended and unattended
responses was modeled as a combination of the effect patterns seen in figure 12.2a and
c. The contrast gain was stronger in early cortical areas like V1 and somewhat smaller
in higher cortical areas like V4. This study found an effect of attention on brain responses
to visual stimulation that corresponds to stimulus enhancement within the PTM frame-
work. It also shows evidence of baseline shifts that cannot be easily measured in the
behavior alone.18
A companion fMRI experiment tested orientation discrimination for a peripheral
sine-wave annulus in external noise. As in the previous case, the contrast of the signal
was manipulated to measure contrast response functions, here in the presence of exter-
nal noise. This experiment demonstrated evidence for external noise filtering in early
visual areas, corresponding to the external noise filtering previously reported in psy-
chophysical TvC experiments. If the sine-wave contrast is especially low, then the
stimulus mostly consists of external noise, so a reduced response in this condition is a
direct indication of external noise filtering. The responses in the earliest visual areas to
external noise, with a very low contrast stimulus, were actually reduced in the attended
condition.22
These measures of brain activity in different visual areas provide converging evidence
that is consistent with the model-based behavioral signatures in the TvC functions and the
PTM. This provides an especially strong interrelated body of evidence in support of these
fundamental mechanisms of attention. The fMRI results specify how these mechanisms
are embodied in the internal responses to the contrast stimuli in different visual cortical
areas.
[Figure 12.3 image: panels (a)–(c) plot sensitivity (log axis, 3.16–316) against spatial frequency.]
Figure 12.3
Potential effects of training on the contrast sensitivity function (CSF): (a) contrast gain, (b) increasing peak
spatial frequency, and (c) increasing bandwidth.
We have given several examples of starting with a theory or even a simple idea and devel-
oping testable predictions. We believe that it is especially useful to develop hypotheses
using quantitative models or descriptive functions, as in the three case studies detailed
earlier. In some situations, we may already have a good idea of what needs to be measured
(the TvC curves or the CSF functions in the examples). In other cases, especially
for broad or qualitative hypotheses, the choice of paradigm and design for testing allows
a wide range of possible experiments. In either case, there are many detailed choices
that must be made in determining an experimental design for testing hypotheses and
theories.
This section describes a number of issues that arise while deciding on the design of an
experiment. How do you go from specific predictions of a model to the details of an
experimental design? How many conditions should be tested? What task should you use?
Should multiple conditions be intermixed or tested in isolation? How many trials are
needed for each condition of the experiment?
An experienced investigator in a particular area of research will have developed prefer-
ences for one kind of experiment over another and an intuitive sense of many of these
aspects of design. Even so, whenever proposing an experiment or test that is new to you,
it is important to understand the ways in which alternative experiments might reveal con-
verging constraints on models or tests of hypotheses.
advantage of allowing different spatial frequency stimuli to be displayed in any order just
by changing the display image. It has the disadvantage that the display window shows
different numbers of cycles of the sine-wave gratings for different spatial frequency condi-
tions. Alternatively, the window size in the sine-wave image could change to equate the
number of grating bars for different spatial frequencies. Another approach, which is more
cumbersome, is to manipulate spatial frequency by changing the viewing distance of the
observer to the same stimulus. The same stimulus at a greater viewing distance subtends
fewer degrees of visual angle. Changing viewing distance requires changing the position
of either the observer or the screen between trials or blocks of trials, but equates all the
details of the stimuli in different spatial frequency tests. All of these methods have been
used in the vision testing literature.23
From prior research, it is known that the visible range of spatial frequency for humans
is from below 0.25 cycles per degree (c/d) to almost 60 c/d. The CSF is traditionally
measured from about 0.5 c/d to 32 c/d. The bandwidth of the sensitivity of visual receptors
to spatial frequency is about an octave,2427 and so it makes sense to test spatial frequencies
with a logarithmic spacing within the tested range. An octave corresponds to a doubling
of frequency. The experimenter measures sine-wave detection at spatial frequencies that
are an octave apart at 0.25, 0.5, 1, 2, 4, 8, 16, and 32 c/d, corresponding to the eight values
of 2^i for integers i = −2 to 5.
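The octave spacing just described amounts to one line of code; a minimal sketch:

```python
# Eight octave-spaced spatial frequencies, 2**i c/d for i = -2..5.
spatial_frequencies = [2.0 ** i for i in range(-2, 6)]
# -> [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```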
If sine-wave spatial frequency is manipulated by changing the image on the screen, a
viewing distance in relation to the pixel resolution of the screen must be carefully selected
to span this range of spatial frequencies with a reasonable presentation of sine variation
for high spatial frequencies and a minimum number of cycles for low spatial frequencies.
The display screen also needs sufficient contrast resolution to allow the measurement of
the very low contrast thresholds at the peak of the sensitivity function, and the screen
should be carefully calibrated (see section 5.2 of chapter 5).
The basic design includes two measurements of the CSF, one before and one after a
practice manipulation. Each copy of the CSF is measured at eight spatial frequencies. The
experimenter has decided, as is traditional in this area, to measure contrast threshold by
staircase methods that are able to adjust to the range of visual sensitivity of observers. The
70.7% accuracy thresholds may be measured with the 2-down, 1-up staircase procedure
(see section 11.2 of chapter 11). From testing and experience, these staircases are known
to generally converge in 30–50 trials; the stop rule is set at 50 trials. Thresholds are often
measured several times in this literature and averaged, so the measurements are repeated
three times.
Measuring the initial CSF requires 8 spatial frequencies × 50 trials × 3 repeated
threshold measures, or 1200 trials. This can be tested in one long session with several
breaks. The post-training CSF also requires 1200 trials.
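A 2-down/1-up staircase of the kind just described can be sketched against a simulated observer. The Weibull observer parameters and the step size below are illustrative assumptions (not values from the text), and the threshold estimate is taken as the mean contrast at the staircase reversals, one common convention:

```python
import math
import random

def p_correct(contrast, threshold=0.1, slope=3.0, guess=0.5):
    # Weibull psychometric function for a simulated 2AFC observer
    # (threshold and slope values are illustrative).
    return guess + (1.0 - guess) * (1.0 - math.exp(-(contrast / threshold) ** slope))

def staircase_2down_1up(start=0.5, step=0.05, n_trials=50, seed=1):
    """2-down/1-up staircase: steps down after two consecutive correct
    responses, up after each error; converges near 70.7% correct."""
    rng = random.Random(seed)
    contrast, run, last_dir = start, 0, 0
    reversals = []
    for _ in range(n_trials):
        if rng.random() < p_correct(contrast):
            run += 1
            if run == 2:                       # two correct in a row -> harder
                run = 0
                contrast = max(contrast - step, step)
                if last_dir == +1:
                    reversals.append(contrast)
                last_dir = -1
        else:                                  # any error -> easier
            run = 0
            contrast += step
            if last_dir == -1:
                reversals.append(contrast)
            last_dir = +1
    # conventional threshold estimate: mean contrast at the reversals
    return sum(reversals) / len(reversals) if reversals else contrast
```

Because two consecutive correct responses are required before the contrast is reduced, the staircase tracks the 70.7%-correct point on the psychometric function.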
The last and perhaps most important aspect of the experiment is the training protocol
that may produce learned changes in the CSF. More elaborate experiments might compare
and assess different training protocols in different groups of observers. One training
method involves several, perhaps eight, sessions of 1000 training trials for detecting sine-
wave patches at a high spatial frequency near the high-frequency cut-off of the contrast
sensitivity function for the observer. This training protocol targets high spatial frequency
limits on visibility and has in fact been used as an approach to rehabilitation in adult
amblyopes, where it has been shown to improve the overall CSF, especially in the high-
frequency limb.28,29
The value in assaying a quantitatively specified function of visibility such as the CSF
as a measure of learning, as we have shown by example, is that it may lead to new signature
changes attributed to learning that may then allow the refinement of hypotheses and theo-
ries (figure 12.3). A computational study of the power of the experimental design to dis-
cover different kinds of improvements could be useful. For example, we may discover that
even more repetitions of the threshold measures are required to discriminate changes in
the bandwidth of the measured CSF. This is considered again in section 12.3.6 on power
analysis of experiments.
[Figure 12.4 image: tree graph of predicted and observed mean correct left-response reaction times (about 270–370 ms) for observer Pam, P(left) = .65, for trial histories such as L, LL, LLL, RL, RRL, LRRL, and RRRL.]
Figure 12.4
An illustration of sequential dependencies in choice reaction times (RTs) shown as a tree graph. The x-axis is
the number of the earliest preceding trial in the conditional analysis (where the current trial is n), and the y-axis
is the mean RT of a correct left response. Each node in the diagram is the mean correct RT conditionalized on
the stimulus history of preceding trials, with three-trial histories on the left and the unconditional average on the
right. The trial history (e.g., RRRL) is indicated for seven representative nodes. Data points are filled if the earli-
est trial in the history is left and unfilled if it is right (from Falmagne, Cohen, and Dwivedi30).
There are other examples of sequential effects. Observers may change criteria if they
believe they have just made a mistake or to compensate for what they believe are too many
same responses in a row. Observers running in a single simple staircase protocol may find
the next stimulus predictable and adjust their criteria. Criteria for different scaling responses
change as a function of the history of stimuli.31,32
Although both adaptation and sequential dependencies are phenomena that have them-
selves been studied, experimenters who are investigating other phenomena, such as atten-
tion or learning in our case examples, may seek to minimize such extraneous factors and
choose an experimental design to achieve that goal. In particular, experimenters often
choose mixed designs in which all conditions are mixed together for testing rather than
blocked designs where many trials of one type are tested one after the other. This is cer-
tainly our general preference. The advantage of a mixed design is that, on average, the
observer will be in the same state while testing each condition. A test of one condition is
likely to be preceded by different conditions and only occasionally by the same condition.
This reduces predictability and averages over sequential dependency patterns.
One example where a mixed design is especially critical is in testing observer models
such as the PTM. The parameters of the model estimate properties of the visual system
and observer, and these properties might be state-dependent. For situations such as this
one, it is important for every test trial to occur while the observer is, on average, in the
same state. If individual conditions, such as individual levels of external noise, are tested
separately in blocks, then the system properties might change with the adaptive state. In
that case, the parameters for each external noise level would define a different state of the
system, and the blocked experiment would not measure a coherent TvC function.
Conversely, there may be some cases where it is important for certain trial variables to
be blocked in a test design. For example, if a researcher is studying the effects of persistent
attention set (as distinct from transitory shifts in attention), testing sets of 50–100 trials
at a time with one attention set or another is a reasonable choice. In another example,
blocks of trials with conservative or liberal response criteria may result in criterion setting
with less variability than trial-by-trial cueing of decision state. If the goal is to measure
the best performance of an observer in a given condition, then blocked testing may be
appropriate. There are many other examples where blocked testing may be advantageous
for a particular purpose.
Decisions about mixed versus blocked designs have direct implications for trial random-
ization. Mixed designs randomize the order of trials over all conditions, whereas blocked
designs randomize trials over nonblocked stimulus manipulations.
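The two randomization schemes can be sketched in a few lines; the condition labels here are hypothetical:

```python
import random

def blocked_order(conditions, trials_per_cond):
    # Blocked design: all trials of one condition, then the next;
    # randomization would apply only to nonblocked stimulus variables.
    return [c for c in conditions for _ in range(trials_per_cond)]

def mixed_order(conditions, trials_per_cond, seed=0):
    # Mixed design: the same trials interleaved in one shuffled sequence,
    # so each condition is tested, on average, in the same observer state.
    trials = blocked_order(conditions, trials_per_cond)
    random.Random(seed).shuffle(trials)
    return trials
```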
[Figure 12.5 image: panels (a) and (b) plot thresholds (dB, 0–20) for individual observers in the two designs, with group means (M1, M2), difference scores, mean difference (MD), and t statistics.]
Figure 12.5
The advantage of within-observer designs. (a) Hypothetical data from a between-observer design. Each group
consists of 10 observers. The mean and standard error of the two groups and the t statistic are graphed as gray
bars (M1, M2, and t). (b) Hypothetical data from a within-observer design. Ten observers participated in both
versions of the hypothetical experiment. The difference scores of each observer in the two conditions are shown,
along with the mean difference (MD) and t statistic.
between-observer designs may eliminate concerns about the order in which conditions are
tested or about contaminating effects of learning or context.
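The advantage figure 12.5 illustrates can be computed directly. Below is a generic sketch of the two t statistics (pooled-variance independent-groups versus paired on difference scores); any data fed to these functions would be hypothetical:

```python
import math

def t_between(group1, group2):
    # Independent-groups t statistic with pooled variance.
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))

def t_within(pairs):
    # Paired t statistic on within-observer difference scores.
    diffs = [a - b for a, b in pairs]
    n = len(diffs)
    md = sum(diffs) / n
    vd = sum((d - md) ** 2 for d in diffs) / (n - 1)
    return md / math.sqrt(vd / n)
```

When each observer shows a consistent condition difference against large between-observer variability, the paired statistic is far larger than the independent-groups one.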
condition should have higher percent correct at intermediate contrasts. A standard sta-
tistical test of the difference between proportions is
$$ z = \frac{p_a - p_u}{\sqrt{\dfrac{p_a(1-p_a)}{n_a} + \dfrac{p_u(1-p_u)}{n_u}}}, \qquad (12.2) $$
where p_a and p_u are the proportions correct, and n_a and n_u the numbers of trials, in the attended and unattended conditions.
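Equation 12.2 translates directly into code; the proportions in the usage note below are illustrative:

```python
import math

def two_proportion_z(pa, pu, na, nu):
    # z statistic for the difference between two proportions (eq. 12.2):
    # (pa - pu) divided by the standard error of the difference.
    se = math.sqrt(pa * (1 - pa) / na + pu * (1 - pu) / nu)
    return (pa - pu) / se
```

For example, with p_a = 0.85 and p_u = 0.75 on 200 trials each, z ≈ 2.52, exceeding the 1.96 criterion for a two-tailed test at α = .05.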
Another direct way to estimate the effect of attention is to compare the full psychometric
functions using a model. If attention has no effect, then the two psychometric functions
should be the same and may be fit with the same Weibull function; if they are different,
then we expect, for example, that the location (threshold) parameter of the attended condi-
tion should be smaller. This plan for statistical testing fits Weibull functions to the two
psychometric functions (see section 10.4.2 of chapter 10) while comparing models in
which the threshold, slope, and asymptotic levels are equal for the two conditions with
models where, for example, the threshold levels differ. The data are the numbers of correct
and incorrect responses, so maximum likelihood estimation is a natural choice. The χ²
nested model contrasts evaluate whether there are significant differences in the threshold
parameter. The size of the attention effect is summarized and understood by comparing
the estimated thresholds for the attended and unattended conditions. Even if the signifi-
cance of the difference is known from the nested model test, we may want to use bootstrap
methods to estimate the variability in the estimated thresholds (see section 10.4.2). An
example of this kind is considered in more detail in section 12.3.6 on power analysis.
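The nested-model plan just described can be sketched as follows. For brevity this sketch fixes the slope, guess, and lapse rates and fits only the threshold by a grid search, whereas the full plan fits all Weibull parameters; the function names and parameter values are our own assumptions:

```python
import math

def weibull_p(c, thresh, slope=3.0, guess=0.5, lapse=0.02):
    # Weibull psychometric function; slope/guess/lapse fixed for the sketch.
    return guess + (1 - guess - lapse) * (1 - math.exp(-(c / thresh) ** slope))

def log_lik(data, thresh):
    # Binomial log-likelihood of (contrast, n_correct, n_total) triples.
    total = 0.0
    for c, k, n in data:
        p = min(max(weibull_p(c, thresh), 1e-9), 1 - 1e-9)
        total += k * math.log(p) + (n - k) * math.log(1 - p)
    return total

def fit_thresh(data, grid):
    # Maximum likelihood threshold by grid search.
    return max(grid, key=lambda t: log_lik(data, t))

def threshold_lrt(attended, unattended, grid):
    """Chi-square statistic (1 df) comparing a shared-threshold model
    against separate attended/unattended thresholds."""
    ll_sep = (log_lik(attended, fit_thresh(attended, grid))
              + log_lik(unattended, fit_thresh(unattended, grid)))
    ll_shared = log_lik(attended + unattended,
                        fit_thresh(attended + unattended, grid))
    return 2 * (ll_sep - ll_shared)
```

Doubling the log-likelihood difference between the separate-threshold and shared-threshold fits gives a chi-square statistic with one degree of freedom; values above 3.84 indicate a significant threshold difference at α = .05.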
Measuring the TvC functions and using the PTM to understand the mechanisms of
attention in low and high noise requires a more complicated design. The aim is to test
whether both external noise exclusion and stimulus enhancement, one or the other, or
neither attention signature occurs in a particular set of experimental threshold data. Suppose
we measured psychometric functions at each of eight external noise levels, with different
contrast levels selected to be appropriate to span the entire function for each noise level.
One could use the plan just detailed to compare the psychometric functions individually
in each external noise level tested. The true aim, however, is to use the data to estimate
and test the hypotheses within the context of the PTM.
There are (at least) two ways to carry out this analysis. One starts with estimating the
thresholds at three different criterion accuracy levels at all external noise levels and then
using a PTM to fit these data. The design has 16 psychometric functions, an attended and
an unattended one at each of eight external noise levels. Each psychometric function is fit
with a Weibull function by maximum likelihood methods, and the best-fitting Weibull is
used to interpolate the estimates of three contrast thresholds corresponding to, say, 65%,
75%, and 85% correct. These 48 threshold estimates (2 × 8 × 3) are then fit with the PTM
model using a least-squares procedure. The basic PTM without attention has four parameters:
template gain β, nonlinearity γ, internal multiplicative noise N_m, and internal additive
noise N_a. There are two kinds of attention effects, implemented as multipliers (between
0 and 1): A_f, which reduces or filters external noise (A_f^2 N_ext^2), and A_a, which reduces internal
additive noise (A_a^2 N_a^2) with attention. The first captures an attention effect in high-noise
conditions, and the second captures an attention effect in low noise. The values are set to
1 in unattended conditions. Several model variants are compared: a basic model without
either attention factor (all A values set to 1), a model with A_f but not A_a, a model with A_a
but not A_f, and a model with both A_f and A_a. These nested models are compared
to determine whether adding each parameter yields a significant improvement in the fit of
the model to the data. In the literature on attention in visual discrimination, we have found
that central cueing often corresponds with an attention effect in filtering external noise but
not in reducing internal additive noise.11–13
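The threshold signatures of the attention multipliers can be sketched directly. Solving the PTM d′ equation for signal contrast gives the threshold below; the parameter values are illustrative, and this sketch applies A_f to the external noise before the nonlinearity (one common convention) rather than committing to the chapter's exact formulation:

```python
def ptm_threshold(n_ext, dprime=1.5, beta=1.0, gamma=2.0,
                  n_mul=0.3, n_add=0.01, a_f=1.0, a_a=1.0):
    """Contrast threshold at performance level dprime from a PTM sketch.
    a_f scales external noise, a_a scales internal additive noise;
    both equal 1 in unattended conditions. Parameter values are illustrative."""
    num = dprime ** 2 * ((a_f * n_ext) ** (2 * gamma) + (a_a * n_add) ** 2)
    den = 1.0 - dprime ** 2 * n_mul ** 2
    return (num / den) ** (1.0 / (2 * gamma)) / beta
```

With these settings, A_f < 1 (external noise filtering) lowers thresholds only when external noise is high, while A_a < 1 (stimulus enhancement) lowers thresholds only when external noise is low, reproducing the two signatures in figure 12.1.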
An alternative to first estimating thresholds at three accuracy levels and then fitting the
PTM to the thresholds is to fit the PTM directly to the psychometric functions. This analy-
sis is presented in more detail in section 12.3.6. One advantage of first estimating thresh-
olds and then fitting the PTM is that graphs of the threshold and model fits show nice
signature patterns of threshold changes that are easy to see and understand. In contrast,
fitting the PTM directly to the family of 16 psychometric functions uses all the data at once
via maximum likelihood methods. The resulting data pattern is more subtle, denser to
graph, and more difficult to visualize, but this approach may in some circumstances provide
a more powerful test of the model.
Experiments that collect physiological measures or other forms of data such as eye
movements or reaction times must consider not just the data analysis and model testing
for the task performance (i.e., accuracy). They must anticipate the special demands of
analysis of the collateral measures. For example, if measuring response time in addition
to response accuracy, the experimental design and the analyses depend upon whether the
focus of testing a hypothesis is about mean response time, or the distributions of response
times for correct and incorrect responses. EEG responses are averaged over trials to dis-
cover the waveform after preprocessing of the data to remove artifacts. Consideration and
preplanning of the analysis of fMRI data is especially important and may drive many other
considerations in the selection of the experimental design. Because these data are so
expensive to collect and to analyze, careful advance consideration of the plan for analysis
is especially important.
The analysis plan for an experiment should almost always be undertaken before choosing
the final experimental design. Preliminary analysis programs applied to pilot data may
catch errors in programming or in design. More serious consideration of the experimental
design may focus on its power to detect important effects, informing the selection of
sample sizes as well as the conceptual design of experiments. This topic is addressed next.
404 Chapter 12
ably not the right experimental design, as even 100-plus trials per point fails to detect such
a small effect almost half the time. Another computational study could check whether
adding points to the psychometric function would improve the detection of small
differences between the psychometric functions.
This analysis assumes the inductive logic of standard hypothesis testing.34 An alternative
Bayesian analysis might focus instead on the posterior distribution and the relative
likelihood of different models.35 If the focus is on the precision of the estimated thresholds,
or of the difference between thresholds, rather than on significance, analogous
computations can provide a corresponding analysis of precision.
A more complex example analyzes experiments that test a full quantitative model. Here,
consider an experiment that measures full psychometric functions at different external
noise levels and estimates attention effects using the PTM of the observer. The experiment
tests 7-point psychometric functions at eight external noise levels in attended and unat-
tended conditions. The purpose is to evaluate external noise exclusion.
Display 12.2a–c shows the programs that carry out a power analysis of this attention
study, fitting the PTM with template gain β, nonlinearity γ, internal multiplicative noise
Nm, internal additive noise Na, and an attention effect Af on filtering external noise (which
scales the external noise variance to Af²N²ext). The program simulates the response of a
known PTM observer with given parameter values for several
potential magnitudes of attention effects and different sample sizes. For each combination
of these factors, it generates new data sets and fits them with the PTM to determine if the
effect of external noise exclusion is significant. The PTMs are fit to the numbers of correct
and incorrect trials. The output of the power analysis is the probability of rejecting the
null hypothesis for combinations of effect size and sample size.
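The PTM computations underlying this simulation can be illustrated compactly. The Python sketch below uses a simplified, self-consistent form of the model with invented parameter values (the book's displays include additional details of the two-interval task); it computes d′ from stimulus contrast and inverts that relation to obtain the contrast needed for a criterion d′:

```python
import numpy as np

# Illustrative PTM parameter values (assumed for this sketch).
beta, gamma = 1.5, 2.0   # template gain and transducer nonlinearity
Nm, Na = 0.1, 0.01       # internal multiplicative and additive noise

def dprime(c, Next, Af=1.0):
    """PTM d-prime for contrast c, external noise Next, attention factor Af."""
    s = (beta * c) ** gamma
    return s / np.sqrt((1 + Nm**2) * (Af * Next) ** (2 * gamma)
                       + Nm**2 * s**2 + Na**2)

def threshold_contrast(dp, Next, Af=1.0):
    """Contrast yielding criterion d-prime dp, by inverting dprime()."""
    num = dp**2 * ((1 + Nm**2) * (Af * Next) ** (2 * gamma) + Na**2)
    return (num / (1 - dp**2 * Nm**2)) ** (1 / (2 * gamma)) / beta
```

An attention factor Af < 1 lowers thresholds only where external noise dominates the denominator, producing the high-noise signature pattern described in the text.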
The computational study considers effect sizes between 0.25 and 2.5 dB and sample
sizes from 10 to 100 per point on the psychometric functions, with significance level
α = 0.01. The results are shown in figure 12.7. Many conditions (16 psychometric functions
rather than 2) constrain the PTM. Here, the power computations suggest that an effect as
small as 1 dB can be detected with sample sizes of around 40 per condition.
In all of these computational studies of power, the more you know, the more accurate
your power analyses of an experimental design will be. Knowing the typical threshold
values, the typical psychometric function slopes or normative PTM parameter values for
the experiment being studied, and the typical effect sizes for attention mechanisms will
all increase confidence in the selection of the design and in the power computation that
determines the sample size. The more you know, the more you can learn through
computational study.
Sometimes, in highly studied areas, a full power study is omitted because hundreds of
very similar experiments have been carried out, and one can follow the (informal) power
analysis encapsulated in those studies by simply copying a typical design. But if the
Display 12.1a
xi = 0.50;                 % guessing rate
lamda = 0.02;              % lapse rate
tau1 = 0.0112;             % unattended threshold
eta1 = 2.84;               % unattended slope
tau2 = 0.0112 * 10 ^ (-effectSize/20);   % attended threshold
eta2 = 2.84;               % attended slope
totalN = 500;              % number of simulated experiments
success = 0;
for i = 1 : totalN
    PFS(1, 1:7) = c_pre;   % stimulus contrasts
    PFS(2:3, 1:7) = 0;     % counts of correct and incorrect trials
    for j = 1 : 7
        prob = xi + (1 - xi - lamda) * (1 - exp(-(c_pre(j)/tau1).^eta1));
        for k = 1 : sampleSize
            r = rand;
            if r < prob
                PFS(2, j) = PFS(2, j) + 1;
            else
                PFS(3, j) = PFS(3, j) + 1;
            end
        end
    end
    % ... (the simulation of the attended psychometric function with tau2
    % and eta2, and the full-model fit that yields
    % fullmodel_maxloglikelihood, are elided in this excerpt)
    NPF = 2;
    % reduced model
    data = PFS;
    guess = [(0.01 + 0.01*10^(effectSize/20))/2 3];
    options = optimset('fminsearch');
    [psy, minus_maxloglikelihood] = ...
        fminsearch('Weibullcostfunc2', guess, options, data);
    reducedmodel_maxloglikelihood = -minus_maxloglikelihood;
    chi2 = 2*(fullmodel_maxloglikelihood - reducedmodel_maxloglikelihood);
    p = 1 - chi2cdf(chi2, 1);
    if p < alpha
        success = success + 1;
    end
end
beta = success/totalN;     % estimated power (probability of rejection)
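The significance test at the heart of this display is a standard likelihood-ratio test for nested models. In Python form (a generic sketch; the two log-likelihood values are invented placeholders for those produced by the full- and reduced-model fits):

```python
from scipy.stats import chi2

# Maximized log likelihoods for the full model (separate thresholds for the
# attended and unattended conditions) and the reduced model (one shared
# threshold). The numbers are invented placeholders for illustration.
ll_full, ll_reduced = -512.3, -516.8
df = 1                                  # one parameter constrained

lr_stat = 2 * (ll_full - ll_reduced)    # likelihood-ratio statistic
p_value = chi2.sf(lr_stat, df)          # equivalent to 1 - chi2cdf in MATLAB
significant = p_value < 0.01
```

Asymptotically, twice the log-likelihood difference is chi-square distributed with degrees of freedom equal to the number of parameters constrained in the reduced model.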
experiment is expensive to run in any way, a preliminary computational study may save
disappointment and aggravation.
Once an experiment is completed, information from it may help to guide the design of
follow-up experiments, and computational power analyses may again be useful in
planning the next experiment.
Display 12.1c
[Contour plot for figure 12.6: axes are sample size (20–200 per point) and effect size (0–4 dB); contours mark rejection probabilities from 0.1 to 1.0.]
Figure 12.6
A power analysis for an experiment comparing two psychometric functions, in attended and unattended conditions,
that may differ in threshold. The contour plot shows the probability of a significant effect when comparing the
two psychometric functions for different sample sizes and threshold effect sizes. See the text for a description of
the experimental design.
From the perspective of hypothesis testing, there are two possible outcomes of the
experiment: rejecting or failing to reject the null hypothesis. The null hypothesis usually
assumes no difference between conditions whereas the theory or hypothesis predicts a
difference or effect. For example, attention should improve performance, so we predict a
difference (in a particular direction) between attended and unattended conditions, and so
expect rejection of the null hypothesis. If the null hypothesis is rejected with a pattern
consistent with the original prediction, then the data are consistent with the theory. The
next step is to extend the theory to new predictions that might be falsified by new data or
to extend the test of the theory to new experimental contexts.
If the data fail to reject the null hypothesis, then the experimenter must decide either
that the experimental test had insufficient power for the size of the effect, that the theory
or hypothesis should be modified, or that the particular application of the theory to the
experiment was ill conceived. If there is a suspicion that the experiment had insufficient
power or inadequate control conditions, then the investigator will
Display 12.2a
for i = 1 : NnoiseLevels
    data1(i, 1) = Next(i);
    data2(i, 1) = Next(i);
    for j = 1 : NperformanceLevels
        logC1 = 1/(2*gamma) * (2*log(dp0(j)) + ...
            log((1+Nm^2) * Next(i).^(2*gamma) + Sa^2) - ...
            log(1 - Nm.^2*dp0(j).^2/2)) - log(beta);
        data1(i, j+1) = exp(logC1);
        % stimulus contrast level in the unattended condition
        logC2 = 1/(2*gamma) * (2*log(dp0(j)) + ...
            log((1+Nm^2) * (Af*Next(i)).^(2*gamma) + Sa^2) - ...
            log(1 - Nm.^2*dp0(j).^2/2)) - log(beta);
        data2(i, j+1) = exp(logC2);
        % stimulus contrast level in the attended condition
    end
end
success = 0;
for m = 1 : totalN
    data1((NnoiseLevels+1):(3*NnoiseLevels), :) = 0;
    data2((NnoiseLevels+1):(3*NnoiseLevels), :) = 0;
    for i = 1 : NnoiseLevels
        for j = 1 : NperformanceLevels
            for k = 1 : sampleSize
                x = (-100:100)/10;
                dx = 0.1;
                prob = sum(normpdf(x - dp0(j)).*normcdf(x).^3)*dx;
                % compute percent correct from target d-prime
                % ... (the simulation of each trial's outcome and its
                % accumulation into data1 and data2 are elided in this
                % excerpt)
            end
        end
    end
    % ... (the full-model fit that yields fullmodel_maxloglikelihood is
    % elided in this excerpt)
    % reduced model
    guess = [1.5 2 0.1 0.01 10^(-effectSize/20)];
    options = optimset('fminsearch');
    [ptm, minus_maxloglikelihood] = ...
        fminsearch('PTMcostfunct2', guess, options, data);
    reducedmodel_maxloglikelihood = -minus_maxloglikelihood;
    chi2 = 2*(fullmodel_maxloglikelihood - reducedmodel_maxloglikelihood);
    p = 1 - chi2cdf(chi2, 1);
    if p < alpha
        success = success + 1;
    end
end
beta = success/totalN;     % estimated power (probability of rejection)
Display 12.2b
L = 0;
% unattended
for i = 1 : NnoiseLevels
    for j = 1 : NperformanceLevels
        Next = data1(i, 1);
        c = data1(i, j+1);
        m = data1(NnoiseLevels + i, j+1);
        k = data1(2*NnoiseLevels + i, j+1);
        dp = (beta*c)^gamma / sqrt(Next^(2*gamma)*(1+Nm^2) ...
            + Nm^2*(beta*c).^(2*gamma) + Sa^2);
        x = (-100:100)/10;
        dx = 0.1;
        p = sum(normpdf(x-dp).*normcdf(x).^3)*dx;
        % compute percent correct
        if (p < 1/2/(m+k))
            % putting lower and upper boundaries on p
            p = 1/2/(m+k);
        elseif (p > 1 - 1/2/(m+k))
            p = 1 - 1/2/(m+k);
        end
        L = L - (m*log(p) + k*log(1-p));
    end
end
% attended
for i = 1 : NnoiseLevels
    for j = 1 : NperformanceLevels
        Next = data2(i, 1);
        c = data2(i, j+1);
        m = data2(NnoiseLevels + i, j+1);
        k = data2(2*NnoiseLevels + i, j+1);
        dp = (beta*c)^gamma / sqrt((Af*Next)^(2*gamma)*(1+Nm^2) ...
            + Nm^2*(beta*c).^(2*gamma) + Sa^2);
        x = (-100:100)/10;
        dx = 0.1;
        p = sum(normpdf(x-dp).*normcdf(x).^3)*dx;
        % compute percent correct
        if (p < 1/2/(m+k))
            % putting lower and upper boundaries on p
            p = 1/2/(m+k);
        elseif (p > 1 - 1/2/(m+k))
            p = 1 - 1/2/(m+k);
        end
        L = L - (m*log(p) + k*log(1-p));
    end
end
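The line `p = sum(normpdf(x-dp).*normcdf(x).^3)*dx` in the display computes the probability of a correct response in a four-alternative task: the target response (mean d′) must exceed three standard normal distractor responses. A Python check of the same quadrature, using scipy's normal density and distribution functions:

```python
import numpy as np
from scipy.stats import norm

def pc_mafc(dp, m=4):
    """Proportion correct in m-alternative forced choice for sensitivity dp,
    integrating P(target response exceeds the m-1 distractor responses)."""
    x = np.arange(-10, 10.001, 0.1)     # same integration grid as the display
    dx = 0.1
    return np.sum(norm.pdf(x - dp) * norm.cdf(x) ** (m - 1)) * dx
```

At d′ = 0 the integral evaluates to 1/4, the guessing rate for four alternatives, and it approaches 1 as d′ grows.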
decide either to collect more data in the current experiment or to create a new and
better one.
If instead the hypothesis and the corresponding prediction are not consistent with the
data, whether through a convincing failure to reject the null hypothesis or because the
null hypothesis was rejected in an unexpected direction (i.e., attention unexpectedly
damaged performance), then it is time to revise the theory.
No amount of experimentation can ever prove me right; a single experiment can prove me wrong.
A theory can be proved by experiment; but no path leads from experiment to the birth of a theory.
Attributed to Albert Einstein;36 quoted in the Sunday Times, July 18, 1976
The relationship between experiment and theory is a delicate and important one.
Experiment and observation are the basis for developing the set of facts about the world, or
Display 12.2c
        % ... (the beginning of display 12.2c is elided in this excerpt)
        x = (-100:100)/10;
        dx = 0.1;
        p = sum(normpdf(x-dp).*normcdf(x).^3)*dx;
        % compute percent correct
        if (p < 1/2/(m+k))
            % putting lower and upper boundaries on p
            p = 1/2/(m+k);
        elseif (p > 1 - 1/2/(m+k))
            p = 1 - 1/2/(m+k);
        end
        L = L - (m*log(p) + k*log(1-p));
    end
end
[Contour plot for figure 12.7: axes are sample size (10–100 per point) and effect size (0.4–2.4 dB); contours mark rejection probabilities from 0.1 to 1.0.]
Figure 12.7
A power analysis for detecting attention effects on external noise exclusion in a set of psychometric functions
at different noise levels defining a TvC. The contour plot shows the probability of rejecting the null hypothesis
when comparing the psychometric functions in attended versus unattended conditions as a function of the size
of the attention effect and the sample size. See the text for a description of the experiment.
about the human perceptual system, that theory must explain. Experiments generate data
and tests of theory.
Scientific theories integrate a wide range of observations and phenomena. The better
the theory, the more extensive the range over which its predictions will hold. Stephen
Hawking has said that "a theory is a good theory if it satisfies two requirements: It must
accurately describe a large class of observations on the basis of a model which contains
only a few arbitrary elements, and it must make definite predictions about the results of
future observations."37 Similarly, Einstein said, "A theory is the more impressive the
greater the simplicity of its premises is, the more different kinds of things it relates, and
the more extended is its area of applicability."38
The inductive logic behind the testing of theories holds that no amount of experimentation
can prove a theory to be true, although our belief in its truth is stronger to the extent that
it predicts or accommodates the widest range of observations. The more experiments of
different kinds that have been carried out to test the theory without falsifying it, the more
weight the theory carries. Again, in the inductive logic of theory testing, a single
experimental inconsistency with a theory may be sufficient to reject it. In practice, this
too is subject to interpretation. A single falsification of a powerful and successful theory
may not lead to outright rejection but instead to the addition of a footnote or a special-case
process. Only the development of a better theory that accounts for all the known
phenomena is likely to replace a strong original theory.
The development of theory in psychophysics has an added challenge. As in some areas
of the physical sciences, theories of visual perception can be developed at several different
levels: at the level of behavior, function, biological systems, neural systems, or cellular
function. Theories of visual perception at each of these levels have their own sets of facts
and phenomena to be explained and their own experimental tests. The challenge is in some
ways greater because the theories and hypotheses at different levels, like a set of nesting
dolls, must fit together. The theory of visual perception should be integrated, or at least
not inconsistent, from one level to the next. This integration across the neural, functional,
and behavioral levels of analysis is the promise of the new area of neuropsychophysics.
Einstein said that no path leads from experiment to the birth of a theory. Yet the data
provided by experimentation and observation are the ground from which the next
theoretical invention will grow. Experiments should be driven by theory, and the next
theory will be enriched by the right experiments.
References
1. Merriam-Webster online. Theory [definition 3]. Merriam-Webster's Collegiate Dictionary. Springfield, MA:
Merriam-Webster; 2003. Available at: http://www.merriam-webster.com.
2. Wundt W. Outlines of psychology. Judd CH, trans. Leipzig: W. Engelmann; 1902.
3. Mertens J. 1956. Influence of knowledge of target location upon the probability of observation of peripherally
observable test flashes. JOSA 46(12): 1069–1070.
4. Posner MI, Nissen MJ, Ogden WC. Attended and unattended processing modes: The role of set for spatial
location. In: Pick HL, Saltzman E, eds. Modes of perceiving and processing information. Hillsdale, NJ: Lawrence
Erlbaum; 1978, pp. 137–157.
5. Dosher B, Lu Z-L. Mechanisms of visual attention. In: Chubb C, Dosher B, Lu Z-L, Shiffrin RM, eds. Vision,
memory, attention. Irvine, CA: APA Press; 2013.
6. Davis ET, Graham N. 1981. Spatial frequency uncertainty effects in the detection of sinusoidal gratings. Vision
Res 21(5): 705–712.
7. Shiu L, Pashler H. 1994. Negligible effect of spatial precuing on identification of single digits. J Exp Psychol
Hum Percept Perform 20(5): 1037–1054.
8. Bashinski HS, Bacharach VR. 1980. Enhancement of perceptual sensitivity as the result of selectively attending
to spatial locations. Atten Percept Psychophys 28(3): 241–248.
9. Downing CJ. 1988. Expectancy and visual-spatial attention: Effects on perceptual quality. J Exp Psychol Hum
Percept Perform 14(2): 188–202.
10. Lu Z-L, Dosher BA. 1998. External noise distinguishes attention mechanisms. Vision Res 38(9):
1183–1198.
11. Dosher BA, Lu ZL. 2000. Noise exclusion in spatial attention. Psychol Sci 11(2): 139–146.
12. Dosher BA, Lu ZL. 2000. Mechanisms of perceptual attention in precuing of location. Vision Res 40(10):
1269–1292.
13. Lu ZL, Dosher BA. 2000. Spatial attention: Different mechanisms for central and peripheral temporal
precues? J Exp Psychol Hum Percept Perform 26(5): 1534–1548.
14. Lu ZL, Liu CQ, Dosher BA. 2000. Attention mechanisms for multi-location first- and second-order motion
perception. Vision Res 40(2): 173–186.
15. Lu ZL, Lesmes LA, Dosher BA. 2002. Spatial attention excludes external noise at the target location. J Vis
2(4): 312–323.
16. Carrasco M. 2011. Visual attention: The past 25 years. Vision Res 51: 1484–1525.
17. Logan GD. 2004. Cumulative progress in formal theories of attention. Annu Rev Psychol 55: 207–234.
18. Li X, Lu ZL, Tjan BS, Dosher BA, Chu W. 2008. Blood oxygenation level-dependent contrast response
functions identify mechanisms of covert attention in early visual areas. Proc Natl Acad Sci USA 105(16):
6202–6207.
19. Reynolds JH, Pasternak T, Desimone R. 2000. Attention increases sensitivity of V4 neurons. Neuron 26(3):
703–714.
20. Treue S. 2002. Attentional modulation strength in cortical area MT depends on stimulus contrast. Neuron
35(2): 365–370.
21. Williford T, Maunsell JHR. 2006. Effects of spatial attention on contrast response functions in macaque area
V4. J Neurophysiol 96(1): 40–54.
22. Lu ZL, Li X, Tjan BS, Dosher BA, Chu W. 2011. Attention extracts signal in external noise: A BOLD fMRI
study. J Cogn Neurosci 23(5): 1148–1159.
23. Rovamo J, Franssila R, Näsänen R. 1992. Contrast sensitivity as a function of spatial frequency, viewing
distance and eccentricity with and without spatial noise. Vision Res 32(4): 631–637.
24. Stromeyer CF III, Julesz B. 1972. Spatial-frequency masking in vision: Critical bands and spread of masking.
JOSA 62(10): 1221–1232.
25. Henning GB, Hertz BG, Hinton J. 1981. Effects of different hypothetical detection mechanisms on the shape
of spatial-frequency filters inferred from masking experiments: I. Noise masks. JOSA 71(5): 574–581.
26. Losada M, Mullen KT. 1995. Color and luminance spatial tuning estimated by noise masking in the absence
of off-frequency looking. JOSA A 12(2): 250–260.
27. Lu ZL, Dosher BA. 2001. Characterizing the spatial-frequency sensitivity of perceptual templates. JOSA A
18(9): 2041–2053.
28. Zhou Y, Huang C, Xu P, Tao L, Qiu Z, Li X, Lu ZL. 2006. Perceptual learning improves contrast sensitivity
and visual acuity in adults with anisometropic amblyopia. Vision Res 46(5): 739–750.
29. Huang CB, Zhou Y, Lu ZL. 2008. Broad bandwidth of perceptual learning in the visual system of adults with
anisometropic amblyopia. Proc Natl Acad Sci USA 105(10): 4068–4073.
30. Falmagne J, Cohen SP, Dwivedi A. Two-choice reactions as an ordered memory scanning process. In: Rabbitt
P, Dornic S, eds. Attention and performance V. New York: Academic Press; 1975, pp. 296–344.
31. Treisman M, Williams TC. 1984. A theory of criterion setting with an application to sequential dependencies.
Psychol Rev 91(1): 68–111.
32. Petrov AA, Anderson JR. 2005. The dynamics of scaling: A memory-based anchor model of category rating
and absolute identification. Psychol Rev 112(2): 383–416.
33. Green DM. 1990. Stimulus selection in adaptive psychophysical procedures. J Acoust Soc Am 87:
2662–2674.
34. Hays WL. Statistics for the social sciences. New York: Holt, Rinehart and Winston; 1973.
35. Kruschke JK. Doing Bayesian data analysis: A tutorial with R and BUGS. Burlington, MA: Academic Press;
2010.
36. Calaprice A, ed. The quotable Einstein. Princeton, NJ: Princeton University Press; 1996.
37. Hawking SW. The illustrated A brief history of time. New York: Bantam; 1996.
38. Schilpp PA. Albert Einstein: Philosopher-scientist. La Salle, IL: Open Court; 1970.
V TAKING VISUAL PSYCHOPHYSICS OUT OF THE LABORATORY
13 Applications and Future Directions
The goal of this book is to enable the reader to become a practicing psychophysics
researcher. The book takes an integrated theoretical approach to understanding perceptual
systems and human information processing. Most of its examples have focused on
measuring and testing basic properties of visual function. These examples illustrate
research principles that can be widely exploited by applying the same or very similar test
designs, analyses, and adaptive testing methods to research questions in many different
applications. The current chapter outlines possible applications of this common
philosophy, experimental approach, and computational method in a variety of domains and
highlights future directions in neuropsychophysics that integrate psychophysics,
neurophysiology, and computational modeling in vision research.
Standard clinical testing of vision includes tests of acuity, perimetry (peripheral vision),
depth perception, and color vision. All of these tests relate to scientific laboratory
paradigms from visual psychophysics. The most obvious example is the widespread use of
visual acuity testing. The letter eye charts or tumbling E charts used by optometrists and
ophthalmologists use identification to measure acuity (figure 13.1). Although visual acuity
testing is still the gold standard in clinical vision,1 recent studies indicate that visual acuity
is not the only, and may not even be the best, predictor of visual function. Many patients
who test at or near normal 20/20 visual acuity may nonetheless exhibit functional deficits
of vision.2–9
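The acuity criterion itself is a simple visual-angle computation: by convention, a 20/20 Snellen letter subtends 5 arcminutes overall (with 1-arcminute strokes) at the test distance. A quick Python check of the implied letter size, assuming the standard 6-m test distance:

```python
import math

distance_m = 6.0                  # standard 6-m (20-ft) test distance
letter_arcmin = 5.0               # a 20/20 letter subtends 5 arcmin overall
angle_rad = math.radians(letter_arcmin / 60.0)
letter_height_mm = 2 * distance_m * 1000 * math.tan(angle_rad / 2)
# roughly 8.7 mm tall; a 20/40 letter would be twice that size
```

The same computation scales each chart line: the denominator of the Snellen fraction is the distance at which a standard observer could just identify letters of that size.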
Visual psychophysicists have developed many methods for measuring contrast
sensitivity. Rather than measuring the limits of size resolution at very high contrast, as in
the standard Snellen letter chart, the Pelli–Robson charts measure the limits of visibility
imposed by the low contrast of letters of a comfortable size.10 This test detects the general
contrast sensitivity deficits associated with cataract, macular degeneration, and diabetic
retinopathy.11 Yet the broadband letter stimuli do not provide information about
frequency-specific defects.4
Figure 13.1
Applications of visual testing in the eye clinic: (a) Snellen and (b) tumbling E eye charts.
Even for individuals with apparently good visual acuity and normal color vision,
significant individual differences exist in visual function. Certain individual differences in
visual performance affect behavior in many domains, whereas others may be specific to a
trained activity. For example, athletes in sports that rely on visual inputs, such as baseball
players, seem to achieve better performance in either the speed or the accuracy of detecting
differences in motion patterns.21 Video gamers may have special abilities in identifying
visual patterns or in attending across the visual field.22 In contrast, others, such as those
with dyslexia, may perform relatively poorly in some visual tasks although they test
normal on standard vision tests.23 Understanding individual differences in perception or
perceptual task performance can be important in predicting many aspects of regular daily
function.
Figure 13.2
An illustration of a useful field of view test.
such tasks would of course depend on eccentricity, contrast, optical limitations, and any
other noisiness or difficulty in early visual processing stages. However, the useful field of
view is also affected by attention limitations. Peripheral deficits in the useful field of view
with aging, for example, may reflect fundamental limitations in the deployment of
attention in addition to limitations in optics or visual transmission per se.33–35 It is important
to assess pure vision in order to interpret the results of the test correctly. Performance at a
peripheral location in a useful field of view test can be compared with single-location
detection or discrimination at the same location to determine the relative contributions of
vision and attention.
Reports of visual or other sensory deficits in clinical populations are now quite common.
For example, reduced responses to the contrast of visual stimuli have been cited in
depression,41 and some researchers have reported that schizophrenic patients are less
susceptible to certain classes of visual illusions.42 These reported relationships between
sensory processing and clinical conditions can arise in a number of ways. A sensory deficit
or atypical response pattern may be a causal factor in the clinical condition, altered sensory
processing may be a consequence of the clinical condition, or a common causal
mechanism may be implicated in both.
In many examples, sensory deficits are not among the primary criteria for classification
of the clinical population. Whether altered sensory processing is causal or merely
epiphenomenal in a particular clinical condition, measurements of sensory function could
in the future provide converging evidence in initial assessment or an added means of
evaluating the efficacy of a treatment. Even if the sensory deficit is neither causal nor core
to a clinical condition, sensory deficits bring their own consequences that must be
managed, and it may be possible to treat or overcome the sensory deficit with visual
prosthetics or through training.
One very interesting example is the role of sensory deficits in developmental dyslexia,
or what is sometimes more broadly termed reading deficiency (RD).23,43–49 Patients with
developmental dyslexia have a range of reading deficits, including reduced scores in
general reading ability, phonological processing and awareness, and orthographic skills.
One classic claim is that developmental dyslexia reflects abnormal function of the
magnocellular (M) visual pathway.23,43–46 The M-pathway in the visual cortex is dominant
in the processing of luminance stimuli of low spatial frequency and high temporal
frequency or motion, at least in comparison to the parvocellular (P) pathway, which is
more dominant in the processing of color, form, and high spatial frequencies at low
temporal frequencies.50,51 Some studies have reported reduced performance on tasks
associated with M-pathway processing in those with dyslexia or RD. Other reports focus
on temporal processing of sensory stimuli, whether visual or auditory.
An alternative view argues that, although phonological deficits may be the proximal
cause of poor reading, the core issue is reduced identification or recognition from noisy
displays.52 Much of the evidence favoring reduced M-pathway performance in those with
dyslexia utilized noisy displays, or very low luminance displays in which the signal-to-
noise ratio is dominated by internal noise. Unusual noise susceptibility may be directly
related to deficits in the development of sophisticated phonological recognition templates
and word templates.53–56 Sensory deficits may be only one part of a more complicated
picture. Individuals with dyslexia often show collateral impairments not just in visual
processing but also in motor processing, selective attention, and social and affective
function.57
Another example of a special population with modified perceptual phenomena is
schizophrenia.42 The vast majority of research in schizophrenia has focused on cognitive
aspects of the condition.58 However, possible sensory dysfunction in the disease is now
widely discussed. Atypical responses of schizophrenic patients to visual testing are of two
broad types: reduction in gain control59–61 and limited long-range interactions.62–64 In
contrast gain control, surrounding patterns reduce the perceived contrast of a target region:
a central texture is perceived to be of lower contrast when surrounded by textures of high
contrast with similar spatial frequencies and orientations.60 Contrast gain control reduces
or silences the response to a pattern that is like other patterns and enhances the perception
of unique patterns in the visual field. Schizophrenic patients show much reduced effects
of contrast-gain reduction. Their perception of the central pattern in such tests may be
more accurate, but they are more susceptible to interference from other patterns in noisy
fields (figure 13.3a).
Schizophrenic patients also show deficits in long-range pattern integration. One test asks
observers to detect the presence of orientation "snakes," lines or curves of Gabor patches
with locally similar orientation and scale, among a cluttered field of Gabor patches
(figure 13.3b).62 Longer spatial gaps between Gabor patches and jitter in local orientation
make the snakes harder to perceive. Contour integration, an ability that continues to
develop into adolescence, is significantly reduced in the schizophrenic population.64 There
have been parallel, though less well documented, effects of schizophrenia in auditory
perception.
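Contour-integration displays of this kind are built from Gabor patches, sinusoidal gratings windowed by Gaussian envelopes. A minimal numpy sketch of a single patch (all parameter values are arbitrary choices for illustration):

```python
import numpy as np

def gabor(size=64, sf=0.1, theta=0.0, sigma=10.0, phase=0.0):
    """Gabor patch: a grating of sf cycles/pixel at orientation theta,
    windowed by a circular Gaussian envelope of standard deviation sigma."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half]
    xt = x * np.cos(theta) + y * np.sin(theta)   # rotated coordinate
    grating = np.cos(2 * np.pi * sf * xt + phase)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return grating * envelope                    # values in [-1, 1]

patch = gabor()
```

A snake stimulus is then assembled by placing many such patches along a smooth path, with orientations tangent to the path, among patches of random orientation.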
The ties between sensory and perceptual function and clinical classification have been
most strongly documented in dyslexia and schizophrenia. More diffuse or less well
understood perceptual anomalies have been cited in other clinical populations. Issues with
contrast perception have been reported for individuals with clinical depression.41
Deterioration of visual function over and above normal aging is reported in Alzheimer's
disease and some dementias.65,66 Diffuse issues of visual function, many involving
dysfunctions of eye movements or pupil response, are cited in Parkinson's disease
patients.67,68 These
Figure 13.3
Examples of contrast gain control and long-range interaction phenomena in vision. (a) The contrast–contrast
illusion. (b) Visual contour integration.
issues with visual function may reflect a decoupling of vision from motor control. Some
visual deficits have also been cited in attention deficit hyperactivity disorder (ADHD).69
In each of these examples, rapid and efficient diagnosis of perceptual deficits with good
psychophysical measures may be an important tool in the arsenal of assessment. Further
research is needed to determine whether the visual or perceptual dysfunctions are causal
or epiphenomenal. However, regardless of the causal relationship, it is important to
understand the perceptual aspects of these conditions in order to address a full range of
coping behaviors.
Understanding perceptual deficits in clinical conditions might suggest very practical
interventions that improve visual or auditory perceptual behaviors. One example is the
role of contrast enhancement in compensating for visual losses in Alzheimer's disease.
One study showed that contrast enhancement can equate letter detection in early
Alzheimer's disease with that in control groups.70 Another study showed that contrast
enhancement on the plate improved food and liquid intake in late-stage Alzheimer's
disease.71
Figure 13.4
JPEG compression losses. (a) The original image at high resolution. (b, c, d) The same image saved as JPEG
at high, medium, and low image quality.
demands. JPEG image formats are commonly used in digital camera applications and as
the format for color photographs on the Web.
Any compression system must lose information relative to a high-quality digital color
image (figure 13.4). JPEG compression was designed to minimize the loss for human
perception. The JPEG algorithms were optimized for photographs and realistic paintings
rather than line drawings or computer renderings, where the compression may introduce
artifacts or aliasing. The compression algorithms take advantage of the typically smooth
variations in color and intensity in scenes of the natural world. They exploit the relative
insensitivity of human vision to variations in color compared to luminance, and its relative
insensitivity to high spatial frequencies, by reducing the number of bits allocated to color
and to high spatial frequencies. Psychophysical tests for perceptible loss of information
quality in human users guided the selection of the compression algorithm and specified
the circumstances in which JPEG compression is appropriate.78,79
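The first of these economies, spending fewer bits on color than on luminance, can be illustrated with the chroma subsampling step that precedes the frequency transform in typical JPEG encoders. The following numpy sketch is a simplified stand-in for a real codec, using full-range BT.601 conversion weights and a random stand-in image:

```python
import numpy as np

rng = np.random.default_rng(1)
rgb = rng.random((64, 64, 3))       # stand-in image, values in [0, 1]

# Full-range RGB -> YCbCr (ITU-R BT.601 luma weights).
r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
y = 0.299 * r + 0.587 * g + 0.114 * b
cb = 0.5 + 0.564 * (b - y)
cr = 0.5 + 0.713 * (r - y)

def down_up(channel):
    """4:2:0-style subsampling: 2x2 block average, then nearest upsampling."""
    small = channel.reshape(32, 2, 32, 2).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, 2, axis=0), 2, axis=1)

cb2, cr2 = down_up(cb), down_up(cr)   # chroma stored at quarter resolution
chroma_err = np.abs(cb2 - cb).mean()  # loss confined to the color planes
luma_err = 0.0                        # luminance is kept at full resolution
```

Because the error is confined to the chroma planes while luminance is untouched, the fourfold reduction in chroma samples is largely invisible at normal viewing distances.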
Compressed JPEG images are not in general appropriate input for other image-processing
functions such as threshold manipulations, cropping, shifting, or resizing. Successive
applications of image-processing manipulations can reveal or even highlight the losses
introduced by JPEG compression. For this reason, uncompressed or raw image formats
such as the tagged image file format (TIFF) are used in programs such as Adobe Illustrator
and Photoshop and in special-purpose image-manipulation programs. Original source
images of high quality may also be important in the processing and generation of derived
images in support of virtual reality environments or forms of augmented vision. Many
clinical or industrial applications require display systems that match or exceed human
visual capacity.80,81
High-fidelity images render real-world stimulus situations given direct viewing and
retain many details of the outside environment. Display systems of sufficient fidelity
should have adequate spatial, temporal, and gray-level resolution. For some applications,
replication of a wide field of view or small pixel resolution may be important. Understand-
ing the resolution limits of the human visual system assists in design of display systems
that just exceed those limits, without wasting excess fidelity that cannot be perceived by
the human operator.82
Luminance resolution and linearity have been seriously considered in the area of medical
X-ray and diagnostic imaging, where the perceived luminance information codes physical
tissue density.83 Assessment of the adequacy of image and display fidelity for particular
purposes relies on assessment of human performance in specially constructed tests,
informed by visual psychophysics.84
Human factors testing is especially important in designing the layout and content in
dense information displays such as those in large airplane cockpits.85 Optimizing operator
or pilot performance in such dense displays incorporates knowledge about limitations of
the visual system and the human ability to attend to relevant information (figure 13.5, plate
10). It includes how best to visualize and code information about the flight systems, routing
information, and multiple augmented displays such as multiview environmental cameras
and route maps. Such applications in human factors go beyond simple visual information
limits. Many of the designs for testing human performance developed for sensory psycho-
physics can be easily extended to testing these more complex situations.86
One especially important class of applications of human factors design and testing is
aimed at optimizing performance and training in synthesized flight simulators or driving
simulators.87 There is also an increasing interest in remote-operator systems for clean-up
of hazardous materials, undersea exploration, or in military applications where some sce-
narios revolve around remote pilots. Research will determine to what degree environmental
fidelity is necessary to optimize performance or training effectiveness. Systems develop-
ment teams in these areas may include perceptual, psychophysical, and testing expertise
as a part of engineering and design teams.88
At the other end of the spectrum, knowledge of basic sensory psychophysics under
certain extreme environmental circumstances is lacking, and simple tests of discrimination
and detection can quantify human performance in these unusual settings. For example,
researchers at NASA are now testing basic visual abilities in high-G environments. The
experimental challenges have to do with carrying out vision testing while subjects are
exposed to high-G simulations in large centrifuges that mimic perception during the critical
high-G moments of takeoff.89,90
The methods developed so far were applied to basic visual tasks, such as simple detection,
CSFs, or orientation identification, usually classified as low-level vision. However,
similar or slightly more complicated displays, together with related paradigms of detec-
tion or discrimination and other methods, can be used to study midlevel and high-level
vision.91,92 These methods are also applicable to somewhat more complex stimulus
domains including the processing of faces, objects, and scenes or of other objects critical
to daily function, such as letters or words. Questions in these domains often parallel
questions asked in low-level vision. How easy are these objects to detect or discriminate?
And what is the scaled similarity of their internal representations among different
examples?
As in basic visual domains, careful description and measurement of midlevel and high-
level stimuli is very important, although often more complex. Progress in these areas
depends critically upon a sophisticated understanding of the stimuli. For example, low-
level descriptions of faces at the level of pixel or spatial frequency content are not adequate
to account for the complex configural properties of perception of these stimuli. Indeed,
the heart of the problem of understanding face, object, or scene perception is developing
an understanding of how to specify the stimulus space.
To make this point more explicit, consider the recognition of faces or facial expression.
Images of faces may of course be described and analyzed in terms of standard image
attributes, such as contrast, spatial frequency, or orientation content. However, at another
level, face recognition systems extract facial features and configurations, such as the dis-
tance between the eyes or the distance from the eyes to the mouth, and many others.93
Alternatively, some systems use computer algorithms to analyze large sets of face images
mathematically. One example uses principal components analysis. Each face is described
as a mixture of an average face plus some weighting on other face components or
dimensions.94,95
Figure 13.6 illustrates one such computational system for face recognition. Figure 13.6a
shows component faces with decreasing importance for describing a set of real face
images, the so-called eigenfaces. An algorithm applied to a large set of training images
extracts these descriptors. Each individual face is represented as a weighted sum of the
eigenfaces. Figure 13.6b shows that by successively adding back each weighted eigenface,
a more and more realistic image of the original face is created.96
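The pipeline in figure 13.6 can be sketched in miniature. The Python fragment below is a toy illustration, not a working face recognizer: the four 4-pixel "faces" are made-up data, and power iteration stands in for the singular value decomposition used in practice. It centers the faces on the mean face, extracts the leading component, and rebuilds each face from a single weight on that component.

```python
import math

# Toy "faces": each row is a 4-pixel image. These made-up faces differ
# only in overall level, so one component captures them exactly.
faces = [
    [100, 102,  98, 101],
    [110, 112, 108, 111],
    [ 90,  92,  88,  91],
    [105, 107, 103, 106],
]
n, d = len(faces), len(faces[0])

mean_face = [sum(f[j] for f in faces) / n for j in range(d)]
centered = [[f[j] - mean_face[j] for j in range(d)] for f in faces]

# Pixel-by-pixel covariance of the centered faces.
cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
       for i in range(d)]

def leading_eigenvector(m, steps=100):
    """Power iteration: a small stand-in for the singular value
    decomposition used by real eigenface systems."""
    v = [1.0] * len(m)
    for _ in range(steps):
        w = [sum(mrow[j] * v[j] for j in range(len(m))) for mrow in m]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

eigenface = leading_eigenvector(cov)

# Each face is coded by one weight: face = mean face + weight * eigenface.
weights = [sum(c[j] * eigenface[j] for j in range(d)) for c in centered]
recon = [[mean_face[j] + w * eigenface[j] for j in range(d)]
         for w in weights]
```

Real eigenface systems do the same thing with thousands of pixels and many components, adding weighted eigenfaces back one at a time as in figure 13.6b.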
In each of these cases, each face stimulus is quantified with a set of weights on features
within a feature space. This serves as the basis for manipulation in psychophysical inves-
tigations. Manipulations of feature dimensions can be tested using standard psychophysi-
cal paradigms described in earlier chapters. Related methods have been used to identify
the critical feature dimensions in the recognition of gender, age, and aspects of emotion
from images of faces.93–95,97
Figure 13.6
Describing natural faces as a weighted combination of eigenfaces. (a) A set of eigenfaces derived from a face
database by performing singular value decomposition on a set of training faces. Each eigenface accentuates
certain characteristics. (b) Each new image from left to right adds one more eigenface in the face reconstruction.
The face of a particular person becomes recognizable around the seventh or eighth image
(from http://www.cs.princeton.edu/~cdecoro/eigenfaces/).
Figure 13.7
Piazza San Marco, Venice, Italy. Two pictorial cues are available to define the slant and tilt of the plaza with
respect to the line of sight of the camera: linear perspective, and the texture gradient formed by humans, pigeons,
other miscellaneous objects, and their shadows (from Oruç, Maloney, and Landy128).
Another challenge in midlevel and high-level vision is to model how cues or aspects of
the stimulus, once separated in early visual analyses, are combined to lead to an overall
perception. One approach to this issue measures how multiple cues to perception are
combined by the perceptual system. Figure 13.7 is a natural image that shows several
different cues to depth in the real world, most obviously linear perspective and texture
gradients. The issue of cue combination has been brought into the laboratory. Straightfor-
ward extensions of simple psychophysical experiments manipulate multiple cues in a
stimulus and observe the resulting behavior. For example, there has been an extensive
study of what kinds of cues are used to infer three-dimensional (3D) depth from the two-
dimensional (2D) images that are projected on the eye. These cues include manipulations
of stereo, but also of motion parallax, or texture, or luminance. Careful independent or
partially independent manipulations of the different cues together can be used to test the
importance of each cue in determining the perceived depth of different objects or of dif-
ferent parts of the same object. These investigations can also reveal the principles of cue
interaction or cue integration using quantitative models of how the information from each
cue is integrated (cue integration).98–100 The same approach has been used in multimodal
cue integration, studying the impact of simultaneous auditory cues on visual perception,
or vice versa.101–103 The issue has been important in speech perception, where extensive
work has studied the biasing of speech perception by the visual cues from movements of
the mouth.104,105
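One standard quantitative model treats each cue as an independent noisy estimate and weights it by its reliability, the inverse of its noise variance, as in the weak-fusion framework.99 The Python sketch below uses made-up slant estimates and noise levels; the fused estimate comes out more reliable (lower variance) than any single cue.

```python
# Each cue is treated as an independent, noisy estimate of surface slant;
# the estimates (in degrees) and noise standard deviations are made up.
cues = {
    "texture":     (32.0, 4.0),
    "disparity":   (28.0, 2.0),
    "perspective": (30.0, 8.0),
}

# Maximum-likelihood weights are proportional to each cue's reliability
# (the inverse of its noise variance) and sum to one.
precisions = {name: 1.0 / sd ** 2 for name, (est, sd) in cues.items()}
total = sum(precisions.values())
weights = {name: p / total for name, p in precisions.items()}

combined = sum(weights[name] * cues[name][0] for name in cues)
combined_var = 1.0 / total  # always below the best single cue's variance
```

Manipulating one cue while holding the others fixed, as in the experiments described above, recovers the weights the perceptual system actually assigns, which can then be compared with these reliability-based predictions.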
Many cognitive processes start with perception. The theories we develop in perception
become the front end for cognitive processes. In addition, the effects of cognitive manipu-
lations and functions can be measured through their impact on perception. Finally, the
paradigms of visual psychophysics have been imported into other areas where they provide
a rigorous framework for experiments in the cognitive domain.
Psychophysical methods have been extremely important in studying attention and per-
ceptual learning. Cognitive manipulations of attention are sometimes measured through
tests of the accuracy of low-level visual perceptual tasks. A substantial literature has com-
bined manipulations of visual attention such as spatial cueing with classic measures of
contrast threshold, orientation discrimination, or the CSF.106–109 A similar analysis has
incorporated visual psychophysical testing into the analysis of practice effects and of
practice in video game playing.22,110,111
The theoretical innovations in psychophysics, such as the observer models, provide a
strong framework for understanding these cognitive processes that begins with the sensory
inputs and ends with the decision mechanisms. Perceptual processes are an integral part
of models of many cognitive processes. We must consider the functionally important ele-
ments in sensory and perceptual processes in the development of complete cognitive
models.112,113 For example, comprehensive models of reading must incorporate the percep-
tual systems that take in the written information. These integrated models may improve
our understanding of reading deficits.114
Finally, the decision architectures, such as signal detection and other choice theories
that guide decision in cognitive tasks,115 were in many cases first developed in the domain
of either auditory or visual psychophysics and remain highly influential in the study of
human cognitive abilities.
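Under the equal-variance Gaussian model of signal detection theory, sensitivity and criterion follow directly from a hit rate and a false-alarm rate via the standard formulas d' = z(H) - z(F) and c = -(z(H) + z(F))/2. The helper below is a minimal sketch (the function name is ours):

```python
from statistics import NormalDist

def dprime_and_criterion(hit_rate, fa_rate):
    """Equal-variance Gaussian SDT: sensitivity d' = z(H) - z(F) and
    criterion c = -(z(H) + z(F)) / 2, where z is the inverse normal CDF."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate), -(z(hit_rate) + z(fa_rate)) / 2.0

dp, c = dprime_and_criterion(0.84, 0.16)  # a symmetric, unbiased observer
```

Because the hit and false-alarm rates in this example are symmetric about 0.5, the criterion comes out at zero, the signature of an unbiased observer.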
The goal of classic psychophysics is to quantify the relationship between stimuli and
responses and to understand the nature of the internal representations of the external world.
Neuropsychophysics aims to develop integrated computational models of psychophysics
and brain responses that support behavior. Advancements in neurophysiology and brain
imaging are yielding new insights about internal representations and the brain networks
involved in representing stimuli, making decisions, and producing responses. The stimulus
and task specifications of psychophysical tasks and new knowledge about neural responses
will be central in developing new theories in both psychophysics and sensory and cognitive
neuroscience. Insights about the neural coding of internal sensory and cognitive representations derived
from physiology and brain imaging will improve our understanding of the brain substrates
for behavior. Together, both psychophysics and neuroscience will contribute to the devel-
opment and refinement of increasingly realistic and detailed models of representation and
processes.
The future of neuropsychophysics will lead to the development of processing models
of the visual system and human behavior, provide the basis for a better understanding of
cognitive functions, and generate new tests in clinical vision and biomedical research. The
new paradigm will also improve our understanding of the mechanisms leading to indi-
vidual differences and various clinical conditions and improve design for human factors
applications.
With parallel technical and theoretical developments in visual psychophysics, neuro-
physiology and brain imaging, and computational modeling, a solution to Fechner's original
challenge to quantify mental processes is finally within reach. We hope that this book
will contribute to the development of a new generation of researchers working on the
frontiers of vision science.
References
1. Cline D, Hofstetter HW, Griffin JR. Dictionary of visual science. Radnor, PA: Chilton Book Company; 1980.
2. Wood JM, Owens DA. 2005. Standard measures of visual acuity do not predict drivers' recognition performance under day or night conditions. Optom Vis Sci 82(8): 698–705.
3. Comerford J. 1983. Vision evaluation using contrast sensitivity functions. Am J Optom Physiol Opt 60(5):
394398.
4. Ginsburg AP. 2003. Contrast sensitivity and functional vision. Int Ophthalmol Clin 43(2): 515.
5. Huang C, Tao L, Zhou Y, Lu ZL. 2007. Treated amblyopes remain deficient in spatial vision: A contrast
sensitivity and external noise study. Vision Res 47(1): 2234.
6. Faye EE. 2005. Contrast sensitivity tests in predicting visual function. Int Congr Ser 1282: 521524.
7. Jindra LF, Zemon V. 1989. Contrast sensitivity testing: A more complete assessment of vision. J Cataract
Refract Surg 15(2): 141148.
8. Ginsburg A, Tedesco J. 1986. Evaluation of functional vision of cataract and YAG posterior capsulotomy patients using the Vistech contrast sensitivity chart. Invest Ophthal Vis Sci 27(3) (Suppl): 107.
9. Kurzer AR. 1986. Contrast sensitivity signals pituitary adenoma. Rev Opt 123(4): 119.
10. Pelli D, Robson J, Wilkins A. 1988. The design of a new letter chart for measuring contrast sensitivity. Clin
Vis Sci 2(3): 187199.
11. Ismail GM, Whitaker D. 1998. Early detection of changes in visual function in diabetes mellitus. Ophthalmic
Physiol Opt 18(1): 312.
12. Holladay JT, Dudeja DR, Chang J. 1999. Functional vision and corneal changes after laser in situ keratomi-
leusis determined by contrast sensitivity, glare testing, and corneal topography. J Cataract Refract Surg 25(5):
663669.
13. Rubin GS, Adamsons IA, Stark WJ. 1993. Comparison of acuity, contrast sensitivity, and disability glare
before and after cataract surgery. Arch Ophthalmol 111(1): 5661.
14. Muñoz G, Albarrán-Diego C, Montés-Micó R, Rodríguez-Galietero A, Alió JL. 2006. Spherical aberration and contrast sensitivity after cataract surgery with the Tecnis Z9000 intraocular lens. J Cataract Refract Surg 32(8): 1320–1327.
15. Kurz S, Krummenauer F, Thieme H, Dick HB. 2007. Contrast sensitivity after implantation of a spherical
versus an aspherical intraocular lens in biaxial microincision cataract surgery. J Cataract Refract Surg 33(3):
393400.
16. Arden G, Jacobson J. 1978. A simple grating test for contrast sensitivity: Preliminary results indicate value
in screening for glaucoma. Invest Ophthalmol Vis Sci 17(1): 2332.
17. Ginsburg A. 1984. A new contrast sensitivity vision test chart. Am J Optom Physiol Opt 61(6): 403407.
18. Ginsburg AP. Next generation contrast sensitivity testing. In: Rosenthal B, Cole R, eds. Functional assessment
of low vision. St. Louis: Mosby Year Book; 1996: pp. 7788.
19. Owsley C. 2003. Contrast sensitivity. Ophthalmol Clin North Am 16(2): 171178.
20. Lesmes LA, Lu ZL, Baek J, Albright TD. 2010. Bayesian adaptive estimation of the contrast sensitivity function: The quick CSF method. J Vis 10(3): 17.1–21.
21. Williams AM, Davids K, Williams JGP. Visual perception and action in sport. Philadelphia: Taylor & Francis;
1999.
22. Green CS, Bavelier D. 2003. Action video game modifies visual selective attention. Nature 423(6939):
534537.
23. Stein J, Walsh V. 1997. To see but not to read; the magnocellular theory of dyslexia. Trends Neurosci 20(4):
147152.
24. Webster MA, MacLeod DIA. 1988. Factors underlying individual differences in the color matches of normal
observers. JOSA A 5(10): 17221735.
25. Neitz J, Jacobs GH. 1986. Polymorphism of the long-wavelength cone in normal human colour vision. Nature
323: 623625.
26. Wilmer JB, Nakayama K. 2007. Two distinct visual motion mechanisms for smooth pursuit: Evidence from
individual differences. Neuron 54(6): 9871000.
27. Lu ZL, Dosher BA. 2008. Characterizing observers using external noise and observer models: Assessing
internal representations with external noise. Psychol Rev 115(1): 4482.
28. McKee SP, Levi DM, Movshon JA. 2003. The pattern of visual deficits in amblyopia. J Vis 3(5): 380
405.
29. Ginsburg A. 1987. Contrast sensitivity, drivers' visibility, and vision standards. Transportation Research Record 1149: 32–39.
30. Lovegrove WJ, Bowling A, Badcock D, Blackwood M. 1980. Specific reading disability: Differences in
contrast sensitivity as a function of spatial frequency. Science 210(4468): 439440.
31. Brown B. 1981. Reading performance in low vision patients: Relation to contrast and contrast sensitivity.
Am J Optom Physiol Opt 58(3): 218226.
32. Ball KK, Beard BL, Roenker DL, Miller RL, Griggs DS. 1988. Age and visual search: Expanding the useful
field of view. JOSA A 5(12): 22102219.
33. Scialfa CT, Kline DW, Lyman BJ. 1987. Age differences in target identification as a function of retinal loca-
tion and noise level: Examination of the useful field of view. Psychol Aging 2(1): 1419.
34. Myers RS, Ball KK, Kalina TD, Roth DL, Goode KT. 2000. Relation of useful field of view and other
screening tests to on-road driving performance. Percept Mot Skills 91(1): 279290.
35. Sekuler AB, Bennett PJ, Mamelak M. 2000. Effects of aging on the useful field of view. Exp Aging Res
26(2): 103120.
36. Peterzell DH, Teller DY. 2000. Spatial frequency tuned covariance channels for red-green and luminance-
modulated gratings: Psychophysical data from human adults. Vision Res 40(4): 417430.
37. Billock VA, Harding TH. 1996. Evidence of spatial and temporal channels in the correlational structure of
human spatiotemporal contrast sensitivity. J Physiol 490(Pt 2): 509517.
38. Morrone MC, Burr DC, Pietro SD, Stefanelli MA. 1999. Cardinal directions for visual optic flow. Curr Biol
9(14): 763766.
39. Kanai R, Bahrami B, Rees G. 2010. Human parietal cortex structure predicts individual differences in per-
ceptual rivalry. Curr Biol 20(18): 16261630.
40. Wilmer JB. 2008. How to use individual differences to isolate functional organization, biology, and utility of visual functions; with illustrative proposals for stereopsis. Spat Vis 21(6): 561–579.
41. Bubl E, Kern E, Ebert D, Bach M, Tebartz van Elst L. 2010. Seeing gray when feeling blue? Depression can
be measured in the eye of the diseased. Biol Psychiatry 68(2): 205208.
42. Butler PD, Silverstein SM, Dakin SC. 2008. Visual perception and its impairment in schizophrenia. Biol
Psychiatry 64(1): 4047.
43. Breitmeyer BG, Ganz L. 1976. Implications of sustained and transient channels for theories of visual pattern
masking, saccadic suppression, and information processing. Psychol Rev 83(1): 136.
44. Cornelissen P, Hansen P, Hutton J, Evangelinou V, Stein J. 1998. Magnocellular visual function and children's single word reading. Vision Res 38(3): 471–482.
45. Lovegrove W, Martin F, Slaghuis W. 1986. A theoretical and experimental case for a visual deficit in specific
reading disability. Cogn Neuropsychol 3(2): 225267.
46. Ramus F, Rosen S, Dakin SC, Day BL, Castellote JM, White S, Frith U. 2003. Theories of developmental
dyslexia: Insights from a multiple case study of dyslexic adults. Brain 126(4): 841865.
47. Cavanagh SHMBP. 1996. Low level visual processing skills of adults and children with dyslexia. Cogn
Neuropsychol 13(7): 9751016.
48. Hulme C. 1988. The implausibility of low-level visual deficits as a cause of childrens reading difficulties.
Cogn Neuropsychol 5(3): 369374.
49. Skottun BC. 2000. The magnocellular deficit theory of dyslexia: The evidence from contrast sensitivity.
Vision Res 40(1): 111127.
50. Zeki S. A vision of the brain. New York: Wiley; 1993.
51. Zihl J, Von Cramon D, Mai N. 1983. Selective disturbance of movement vision after bilateral brain damage.
Brain 106(2): 313340.
52. Sperling AJ, Lu ZL, Manis FR, Seidenberg MS. 2005. Deficits in perceptual noise exclusion in developmental
dyslexia. Nat Neurosci 8(7): 862863.
53. Boets B, Wouters J, Van Wieringen A, De Smedt B, Ghesquière P. 2008. Modelling relations between sensory processing, speech perception, orthographic and phonological ability, and literacy achievement. Brain Lang 106(1): 29–40.
54. Ziegler JC, Pech-Georgel C, George F, Lorenzi C. 2009. Speech perception in noise deficits in dyslexia. Dev
Sci 12(5): 732745.
55. Boets B, Vandermosten M, Poelmans H, Luts H, Wouters J, Ghesquière P. 2011. Preschool impairments in auditory processing and speech perception uniquely predict future reading problems. Res Dev Disabil 32(2): 560–570.
56. Sperling AJ, Lu ZL, Manis FR, Seidenberg MS. 2006. Motion-perception deficits and reading impairment.
Psychol Sci 17(12): 10471053.
57. Ramus F. 2003. Developmental dyslexia: Specific phonological deficit or general sensorimotor dysfunction?
Curr Opin Neurobiol 13(2): 212218.
58. Nuechterlein KH, Barch DM, Gold JM, Goldberg TE, Green MF, Heaton RK. 2004. Identification of sepa-
rable cognitive factors in schizophrenia. Schizophr Res 72(1): 2939.
59. Butler PD, Zemon V, Schechter I, et al. 2005. Early-stage visual processing and cortical amplification deficits
in schizophrenia. Arch Gen Psychiatry 62(5): 495504.
60. Dakin S, Carlin P, Hemsley D. 2005. Weak suppression of visual context in chronic schizophrenia. Curr Biol 15(20): R822–R824.
61. Tadin D, Kim J, Doop ML, Gibson C, Blake R, Lappin JS, Park S. 2006. Weakened center-surround interac-
tions in visual motion processing in schizophrenia. J Neurosci 26(44): 1140311412.
62. Keri S, Kelemen O, Benedek G, Janka ZN. 2005. Lateral interactions in the visual cortex of patients with
schizophrenia and bipolar disorder. Psychol Med 35(7): 10431051.
63. Uhlhaas PJ, Phillips WA, Mitchell G, Silverstein SM. 2006. Perceptual grouping in disorganized schizophre-
nia. Psychiatry Res 145(2): 105117.
64. Silverstein SM, Kovács I, Corry R, Valone C. 2000. Perceptual organization, the disorganization syndrome, and context processing in chronic schizophrenia. Schizophr Res 43(1): 11–20.
65. Calderon J, Perry R, Erzinclioglu S, Berrios G, Dening TR, Hodges J. 2001. Perception, attention, and working memory are disproportionately impaired in dementia with Lewy bodies compared with Alzheimer's disease. J Neurol Neurosurg Psychiatry 70(2): 157–164.
66. Lu ZL, Neuse J, Madigan S, Dosher BA. 2005. Fast decay of iconic memory in observers with mild cognitive
impairments. Proc Natl Acad Sci USA 102(5): 17971802.
67. Mosimann UP, Mather G, Wesnes K, O'Brien J, Burn D, McKeith I. 2004. Visual perception in Parkinson disease dementia and dementia with Lewy bodies. Neurology 63(11): 2091–2096.
68. Demirci M, Grill S, McShane L, Hallett M. 1997. A mismatch between kinesthetic and visual perception in Parkinson's disease. Ann Neurol 41(6): 781–788.
69. Barkley RA. Attention-deficit hyperactivity disorder: A handbook for diagnosis and treatment, New York:
The Guilford Press; 2006.
70. Gilmore GC, Cronin-Golomb A, Neargarder SA, Morrison SR. 2005. Enhanced stimulus contrast normalizes visual processing of rapidly presented letters in Alzheimer's disease. Vision Res 45(8): 1013–1020.
71. Dunne TE, Neargarder SA, Cipolloni P, Cronin-Golomb A. 2004. Visual contrast enhances food and liquid intake in advanced Alzheimer's disease. Clin Nutr 23(4): 533–538.
72. CIE. Commission Internationale de l'Éclairage proceedings, 1931. Cambridge, UK: Cambridge University Press; 1932.
73. Elsaesser T, Barker A. Early cinema: Space, frame, narrative. London: BFI Publishing: 1990.
74. Edgerton GR. The Columbia history of American television. New York: Columbia University Press; 2007.
75. Sanders MS, McCormick EJ. Human factors in engineering and design. New York: McGraw-Hill; 1987.
76. Wallace GK. 1991. The JPEG still picture compression standard. Commun ACM 34(4): 3044.
77. Pennebaker WB, Mitchell JL. JPEG still image data compression standard. Berlin: Springer; 1993.
78. Watson AB. 1993. DCT quantization matrices visually optimized for individual images. Proc SPIE 1913:
202216.
79. Rosenholtz R, Watson AB. Perceptual adaptive JPEG coding. In: Proceedings of the IEEE International Conference on Image Processing; 1996, pp. 901–904.
80. Steuer J. 1992. Defining virtual reality: Dimensions determining telepresence. J Commun 42(4): 7393.
81. Cruz-Neira C, Sandin DJ, DeFanti TA. 1993. Surround-screen projection-based virtual reality: The design
and implementation of the CAVE. ACM SIGGRAPH 135142.
82. Bowman DA, McMahan RP. 2007. Virtual reality: How much immersion is enough? Computer 40(7):
3643.
83. Samei E, Badano A, Chakraborty D, Compton K, Cornelius C, Corrigan K, et al. 2005. Assessment of display
performance for medical imaging systems: Executive summary of AAPM TG18 report. Med Phys 32: 12051225.
84. Burgess AE, Jacobson FL, Judy PF. 2001. Human observer detection experiments with mammograms and
power-law noise. Med Phys 28: 419437.
85. Coombs L. Control in the sky: The evolution and history of the aircraft cockpit. South Yorkshire, UK: Pen
& Sword Aviation; 2005.
86. Hooey BL, Foyle DC, Andre AD. A human-centered methodology for the design, evaluation, and integration
of cockpit displays. In: Proceedings of the NATO RTO SCI and SET Symposium on Enhanced and Synthetic
Vision Systems. September 1012, 2002. Ottawa, Canada.
87. Foyle DC, Ahumada AJ, Larimer J, Townsend Sweet B. 1993. Enhanced/synthetic vision systems: Human
factors research and implications for future systems. SAE Transactions 101: 17341741.
88. NASA. Human factors. Available at: http://human-factors.arc.nasa.gov/.
89. Lackner JR, DiZio P. 2000. Human orientation and movement control in weightless and artificial gravity
environments. Exp Brain Res 130(1): 226.
90. Clément G, Bukley AP. Artificial gravity, Vol. 20. Berlin: Springer; 2007.
91. Ullman S. High-level vision: Object recognition and visual cognition. Cambridge, MA: The MIT Press; 2000.
92. Wang JYA, Adelson EH. 1994. Representing moving images with layers. IEEE Transactions on Image Pro-
cessing 3(5): 625638.
93. Wiskott L, Fellous JM, Kuiger N, von der Malsburg C. 1997. Face recognition by elastic bunch graph match-
ing. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(7): 775779.
94. Turk M, Pentland A. 1991. Eigenfaces for recognition. J Cogn Neurosci 3(1): 7186.
95. Belhumeur PN, Hespanha JP, Kriegman DJ. 1997. Eigenfaces vs. fisherfaces: Recognition using class specific
linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(7): 711720.
96. de Coro C. Face recognition using Eigenfaces. Available at: http://www.cs.princeton.edu/~cdecoro/
eigenfaces/.
97. Valentine T. 1991. A unified account of the effects of distinctiveness, inversion, and race in face recognition.
Q J Exp Psychol 43(2): 161204.
98. Dosher BA, Sperling G, Wurst SA. 1986. Tradeoffs between stereopsis and proximity luminance covariance
as determinants of perceived 3D structure. Vision Res 26(6): 973990.
99. Landy MS, Maloney LT, Johnston EB, Young M. 1995. Measurement and modeling of depth cue combina-
tion: In defense of weak fusion. Vision Res 35(3): 389412.
100. Hillis JM, Watt SJ, Landy MS, Banks MS. 2004. Slant from texture and disparity cues: Optimal cue com-
bination. J Vis 4(12): 967992.
101. Welch R, Warren D. Intersensory interactions. In: Boff K, Kaufman L, Thomas JP, eds. Handbook of per-
ception and human performance. New York: Wiley; 1986, pp. 25.125.36.
102. Calvert GA. 2001. Crossmodal processing in the human brain: Insights from functional neuroimaging
studies. Cereb Cortex 11(12): 11101123.
103. Stein BE, Meredith MA. The merging of the senses. Cambridge, MA: The MIT Press; 1993.
104. McGurk H, MacDonald J. 1976. Hearing lips and seeing voices. Nature 264: 746748.
105. Massaro DW. Speech perception by ear and eye: A paradigm for psychological inquiry. Hillsdale, NJ:
Lawrence Erlbaum; 1987.
106. Dosher BA, Lu ZL. 2000. Noise exclusion in spatial attention. Psychol Sci 11(2): 139146.
107. Lu Z-L, Dosher BA. 1998. External noise distinguishes attention mechanisms. Vision Res 38(9):
11831198.
108. Yeshurun Y, Carrasco M. 1998. Attention improves or impairs visual performance by enhancing spatial
resolution. Nature 396(6706): 7275.
109. Smith PL, Ratcliff R, Wolfgang BJ. 2004. Attention orienting and the time course of perceptual decisions:
Response time distributions with masked and unmasked displays. Vision Res 44(12): 12971320.
110. Dosher BA, Lu Z-L. 1998. Perceptual learning reflects external noise filtering and internal noise reduction
through channel reweighting. Proc Natl Acad Sci USA 95: 1398813993.
111. Dosher BA, Lu Z-L. 1999. Mechanisms of perceptual learning. Vision Res 39(19): 31973221.
112. Meyer DE, Kieras DE. 1997. A computational theory of executive cognitive processes and multiple-task
performance: Part I. Basic mechanisms. Psychol Rev 104(1): 365.
113. Anderson JR. The architecture of cognition, Vol. 5. Hillsdale, NJ: Lawrence Erlbaum; 1995.
114. Harm MW, Seidenberg MS. 1999. Phonology, reading acquisition, and dyslexia: Insights from connectionist
models. Psychol Rev 106(3): 491528.
115. Wixted JT. 2007. Dual-process theory and signal-detection theory of recognition memory. Psychol Rev
114(1): 152176.
116. Boynton GM, Demb JB, Glover GH, Heeger DJ. 1999. Neuronal basis of contrast discrimination. Vision
Res 39(2): 257269.
117. Li X, Lu ZL, Tjan BS, Dosher BA, Chu W. 2008. Blood oxygenation level-dependent contrast response
functions identify mechanisms of covert attention in early visual areas. Proc Natl Acad Sci USA 105(16):
62026207.
118. Ress D, Backus BT, Heeger DJ. 2000. Activity in primary visual cortex predicts performance in a visual
detection task. Nat Neurosci 3(9): 940945.
119. Ress D, Heeger DJ. 2003. Neuronal correlates of perception in early visual cortex. Nat Neurosci 6(4):
414420.
120. Haushofer J, Livingstone MS, Kanwisher N. 2008. Multivariate patterns in object-selective cortex dissociate
perceptual and physical shape similarity. PLoS Biol 6(7): e187.
121. Cohen MR, Kohn A. 2011. Measuring and interpreting neuronal correlations. Nat Neurosci 14(7):
811819.
122. Xue G, Dong Q, Chen C, Lu L, Mumford JA, Poldrack RA. 2010. Greater neural pattern similarity across
repetitions is associated with better memory. Science 330(6000): 97101.
123. Kriegeskorte N, Mur M, Bandettini P. 2008. Representational similarity analysis: connecting the branches of systems neuroscience. Front Systems Neurosci 2: 4.
124. Newsome WT, Britten KH, Movshon JA. 1989. Neuronal correlates of a perceptual decision. Nature
341(6237): 5254.
125. Wassermann E, Epstein CM, Ziemann U. The Oxford handbook of transcranial stimulation. Oxford: Oxford
University Press; 2008.
126. Nitsche MA, Cohen LG, Wassermann EM, et al. 2008. Transcranial direct current stimulation: State of the
art 2008. Brain Stimulat 1(3): 206223.
127. Xue G, Juan CH, Chang CF, Lu ZL, Dong Q. 2012. Lateral prefrontal cortex contributes to maladaptive
decisions. Proc Natl Acad Sci USA 109(12): 44014406.
128. Oruç I, Maloney LT, Landy MS. 2003. Weighted linear cue combination with possibly correlated error. Vision Res 43: 2451–2468.
Index
1/threshold, 82, 87, 221, 325, 375
2AFC, 241, 244, 246, 248, 257–258, 393, 401
2IFC, 241, 244, 246, 258, 320
χ2, 312, 325

A', 240
Accelerated stochastic approximation method (ASA), 357–358
Acuity, 12, 290, 421, 423–424
Adaptation, 3, 269, 286, 295, 397, 399–400
Adaptive Bayesian testing, 362
Adaptive method, 14, 23, 313, 351–352, 359, 362–363, 372, 375, 380–383
Additive internal noise, 269–271, 283, 340
Adobe Photoshop, 40, 44, 431
Ahumada, A. J., 290, 293, 328
Akaike Information Criterion (AIC), 312–313, 380
Alpha channel, 28
Alzheimer's disease, 427–429
Amblyope, 397, 423–424
Amblyopia, 331
applycform, 44
Ashby, F. G., 208
Aspect ratio, 31, 96, 114, 253
Attention, 13, 87–88, 92–93, 171, 261, 263, 269, 283, 286, 295, 339, 371, 386–390, 393–395, 397, 399–403, 405, 409, 413, 425, 427–429, 435
Attention deficit hyperactivity disorder (ADHD), 428
Attention window, 88
Attenuator, 140

Back buffer, 112, 120
Background, 12, 31, 34, 36–37, 66, 121, 133, 145, 147, 149, 194
Band-pass, 52, 286, 328
Bandwidth, 115, 125, 286, 328, 375, 391, 396–397
Baseline, 120, 203, 221, 293, 389–390, 404, 423
Bayes
  factor, 313, 380
  rule, 362–364, 370, 376, 380
Bayesian, 215, 311, 313
Bayesian Information Criterion (BIC), 313, 380
Behavioral signature, 387, 390, 393
Between-observer design, 399–401
Bias, 167–168, 184, 222, 237, 248, 250, 261–263, 267, 280, 355–358, 360–361, 371–373, 379
Binomial distribution, 81–82, 310–311, 320, 323, 325
Bit stealing, 68
Bitmap, 27–30, 41, 43, 110
BITS++, 141
Blocked design, 399
Blood oxygenation level-dependent (BOLD) response, 191, 388–389
Bootstrap, 262, 301, 307, 311, 318, 323, 329, 338, 344, 402
Brain imaging, 10–11, 13, 95, 185, 191, 194, 199, 202, 215–216, 269, 435–437
Button box, 163, 165, 168, 186

Calibration, 66, 109, 113, 121, 123, 127, 129–131, 133, 141, 145–146, 149, 152–153, 183–185, 392
Carrier, 34
Cataract, 421, 423
Cathode ray tube (CRT), 110, 112–113, 115–116, 120–121, 123, 137, 140, 145
Channel, 254, 257–258, 260–261, 267–268, 283, 291–295
Chromaticity space, 141, 144
City-block distance, 209, 213
Classification image, 284, 290
Clinical vision, 11, 421, 437
Color
  channel, 68, 121, 126, 137, 147
  CMYK color space, 43
  depth, 27, 127
  gun, 110, 121, 147
  homogeneity, 109
  image, 27, 36, 43, 45, 121, 429–430
  lookup table (CLUT, colormap), 28–31, 41, 43, 45, 61, 65, 68, 73
  matching, 9, 143–144
  matching function, 143–144
  plane, 120–121
  space, 12, 43–44, 141, 146–147, 153, 211
Compression, 429–430
Computational model, 3, 9–10, 13, 421, 435, 437
Cone contrast, 146–147
Cone excitation, 146–147, 149
Confusion error, 213
Conjoint measurement, 208
Contrast, 12
  gain, 283–284, 295, 389–390, 427
  gain control, 283–284, 295
  modulated, 34
  psychometric function, 73, 77, 294, 320, 335, 369, 372
  sensitivity function (CSF), 12, 21–22, 24, 52, 73, 82, 87, 221, 293, 302, 325, 328–329, 331–332, 342, 352, 373, 375, 379, 381, 390–392, 395–397, 423–424, 432, 435
  threshold, 12, 19–21, 23–24, 77, 82, 140, 272–273, 275, 293, 332, 338–340, 374, 378, 387, 393, 396, 402, 435
Convolution, 55, 57, 268
Correct rejection, 228
Cost function, 310, 314–316, 321, 328
CreateProceduralSineGrating, 71–72
Criteria/criterion, 21, 200, 213, 225–226, 233, 236–238, 240–241, 248–251, 253, 258, 261–263, 267, 273–274, 283, 301–305, 310–316, 318, 321, 325, 335, 339, 355, 360, 362, 365, 371–373, 379–380, 393, 397, 399, 402, 404, 426
  noise, 261–262
Critical band masking, 284
Cropping, 46, 293, 430
Cue, 5–7, 35, 88, 92–93, 170–171, 372, 386–387, 392, 399, 403, 434–435
  combination, 434
  integration, 434–435
  visual, 5, 435
Cutzu, F., 211
Cycles per degree, 63, 68, 82, 395–396

d', 233, 237, 239, 240, 244, 248, 250, 260, 262
Data analysis, 14, 19, 301, 380, 386, 401, 403
DATAPixx, 141
Debouncing, 167
Decision
  boundary, 254
  rule, 199–200, 241, 244, 248–250, 257, 259, 268
  separability, 253–254
  uncertainty, 244, 269, 283, 290
Degrees of visual angle, 63–64, 68, 96, 396
Dementia, 427
Depression, 426–427
Depth, 5–6, 27, 35, 41, 109–110, 127–128, 137, 421, 434
Depth perception, 35, 421
Derrington-Krauskopf-Lennie (DKL) color space, 141, 147–149
Derrington-Krauskopf-Lennie (DKL) coordinates, 147
Design, 3, 14, 109, 130, 140–141, 184, 248, 250, 258, 386, 389, 392, 394–397, 399–400, 402–405, 429, 431, 437
de Valois, R. L., 8
Diabetic retinopathy, 421
Difference between proportion, 402
Difference of Gaussians (DOG), 55, 57
Difference rule, 241, 271
Diffusion model, 347
Digital image, 27–29, 41, 61
Digital light processing (DLP), 121, 126–127
Digital-to-analog converter (DAC), 140–141
Direct estimation, 200, 208
Direct scaling, 199, 202, 215
Discriminability, 23, 202, 204, 233, 237–238, 240, 248, 285, 343
Discrimination, 170, 237–239, 246, 248, 251, 261–263, 269, 272–273, 283–285, 289–290, 293, 318, 335, 339–340, 342, 354, 358, 378, 390, 393, 397, 403, 425, 429, 431–432, 435–436
Discrimination task, 335, 339, 365
Disparity, 35–36
Display
  device, 29, 61–62, 66, 68, 72, 109–111, 113–114, 116, 120–123, 127–130, 133, 137, 141, 145–147, 149, 161, 190
  setup, 63–66, 73, 110
  setup module, 63, 65–66, 73
  synchronization, 62, 109, 121
  window, 66, 68, 72–73
Dosher, B., 270, 275, 293
Double pass, 275, 280–283
dprime, 237
DrawTexture, 72, 97
Dyslexia, 423, 426–427

Eccentricity, 38, 96, 292, 375, 425
Edelman, S., 211
Edge, 47, 293
Effect size, 404–405
Efficiency, 352, 355, 360–362
Efficient testing procedure, 385
Eigenface, 433
Einstein, A., 413, 416
Ekman, G., 209, 211
Electroencephalography (EEG), 13–14, 23, 185–186, 190–191, 199, 215, 392–393, 403, 436
Electromyography (EMG), 23
Electrooculography (EOG), 171
Encoding, 262, 268
Equivalent input noise, 272
Error
  bar, 81–82
  function, 304
  surface, 304–305
Euclidean space, 207, 209, 213
Event-related potential (ERP), 186
Prior probability, 362–363, 365, 369–370, 372, 376–377, 380
  distribution, 362, 365, 369, 372, 380
Probabilistic MDS, 215
Probability density, 233, 236, 237, 241, 244, 253, 259–260, 376
Programmable levels of intensity, 23
Projector, 122–123, 126
Psi (ψ) method, 369, 371, 376
PsychImaging, 66, 68, 131, 137
Psychological function, 362, 373–374, 380–381
Psychological scaling function, 24
Psychometric function, 20–22, 24, 73, 77, 81–82, 223, 272, 274, 283, 286, 294, 318, 320–323, 325, 329, 335, 351–362, 364–365, 368–375, 378–379, 381, 393–395, 401–405
Psychophysical function, 19, 364
Psychophysical paradigm, 14, 269, 433
Psychtoolbox, 14, 61–63, 66, 71–73, 93, 95, 97, 101, 110, 112–113, 120, 163, 169, 185, 365
Purkinje image, 174, 178

Qualitative theory, 302
Quantitative model, 302, 303, 340, 387, 390, 392, 401, 405
QUEST, 364–365, 368–369
  function, 365, 369
Quick
  contrast sensitivity function (q-CSF), 375–379, 423
  threshold versus external noise contrast (q-TvC), 378–379
  Yes/No (q-YN), 371, 372–373

r2, 305–307, 309, 315–316, 318, 329, 338, 344
Random
  dot stereogram, 35–36
  order, 19, 76, 88
  seed, 68, 275
Rapid serial visual presentation (RSVP), 73, 87–88, 93
Raster display, 121–122
Reading, 435
Reading deficiency (RD), 426–427
Receiver operating characteristic (ROC), 238–240, 262, 313–314, 316, 318
Reduced model, 309, 311, 313, 316, 318, 325, 332, 339, 344
Reeves, A., 88
Refresh, 62–63, 66, 109–113, 116, 120–121, 125, 141
Refresh rate, 62–63, 66, 109–111, 113, 120
Region of interest, 49
Remote-operator system, 431
Renderer, 110–111
Representation, 5, 9–10, 13, 27, 52, 55, 101, 126, 147, 153, 199–200, 202, 204–205, 207–209, 211, 213, 215–216, 228–229, 233, 249, 251, 253–254, 257, 262, 267–268, 270–272, 282–283, 286, 295, 302, 386–387, 432, 435–437
Resizing, 46, 430
Response
  accuracy, 24, 72, 357
  collection, 19, 62, 96, 161, 194
  gain, 284, 389
  key, 20
  time (RT), 23, 72, 125–126, 161, 163, 165, 167–171, 185, 193–194, 254, 306, 308, 392, 397, 403
Retina, 3–5, 8, 9, 35, 63–64, 141
Retinotopic organization, 5, 96
Retinotopy, 73, 95–97, 101
Reverse correlation, 284, 290
RGB
  image, 28, 44–45, 127
  space, 43
  triplet, 28, 36, 125, 141, 145–146
rgb2gray, 44, 55
Robbins-Monro, 361
Root mean squared error (RMSE), 302, 305–307, 309
Rotation, 46, 115, 174, 178
RTBox, 169–170, 194, 392

Salience, 151
Same-different, 216, 248–250
Sample size, 313, 359, 373, 394–395, 403–405
Saturation, 12, 44, 141, 251
Scaling, 11–12, 19, 22–24, 76, 149, 200–202, 204–205, 207, 209, 211, 213, 215–216, 267, 295, 399, 436
Schizophrenia, 426–427
Screen, 66, 68, 72, 93, 95, 97, 120
ScreenTest, 110–111, 113
Second-order processing, 34, 151
Sensitivity surface, 12–13, 222
Sensory deficit, 426–427
Sequential dependency, 360, 382, 397, 399
SetMovieTimeIndex, 93
Shape, 6, 13, 32, 40, 114, 116, 120, 186, 211, 216, 251, 272, 320, 328, 347, 352, 375–376, 378, 390, 394
showImage, 31
Signal
  contrast, 24, 272–274, 279–280, 283, 320, 322, 360, 379
  detection, 13–14, 251, 268, 435
  detection theory (SDT), 14, 221, 225–226, 228–229, 233, 237–241, 250–251, 253, 254, 257–258, 261–263, 267, 270, 272, 281, 313–314, 320, 372
  plus noise distribution, 229, 236, 240, 314
Similarity, 12, 23, 88, 199, 209, 211, 213, 215–216, 432, 436
Sine wave, 20–22, 31–32, 34, 46, 55, 63, 65–66, 68, 71–72, 77, 82, 133, 152, 205, 221, 223, 320, 328, 389–390, 395–397
Sine-wave grating, 31, 46, 63, 65–66, 71, 77, 82, 152, 320, 396
Single-unit recording, 13, 215
Slope of the psychometric function, 320, 351, 356–357, 360–361, 365, 369, 371, 374, 378
Snellen, 421
Source localization, 186, 190
Spatial
  footprint, 286
  frequency, 12, 21, 23, 32, 51–52, 66, 68, 82, 87, 205–206, 221, 251, 268, 283–286, 292–293, 302, 328–329, 331, 375–376, 391, 395–397, 426, 432–433
  frequency filtering, 51
  resolution, 109–110, 113
Spectral sensitivity, 143
Speed-accuracy trade-off (SAT), 163, 169–171, 340, 342–344, 347
Sperling, G., 88
Staircase, 203, 339, 351–360, 362, 378, 381, 395, 396, 399
Standard CIE observer, 144
Standard hypothesis testing, 380, 405
Standard observer, 143, 146, 149
Starting value, 131, 223, 321, 338, 354–358, 360
Statistical analysis, 23
Step-size, 354–358, 360
Stereopsis, 35
Stereoscope, 128
Stevens' law, 201
Stewart, N., 168
Stimulus energy, 19–20
Stimulus enhancement, 387–388, 390, 394, 402
Stop criterion, 362, 365
Stop rule, 354–355, 357, 360, 362, 365, 371, 373, 375–376
Subjective equality, 353
Sum of squared errors (SSE), 304–307, 309
Sweat factor, 361
Synchronization, 23, 62, 109, 120–121, 127, 141, 161, 169, 183, 185–186, 191, 193–194, 393
System reinstatement module, 63, 65, 73

Table-mounted, 183
Tachistoscope, 122
Tagged image file format (TIFF), 43, 431
Template, 194, 268–271, 283–286, 289–291, 294, 335, 386–387, 403, 405, 424, 427
Temporal
  frequency, 12, 66, 68, 373, 375, 426
  response, 109, 116, 125
  response function, 109, 116
  window, 286
Testable prediction, 10, 387, 391
Text, 37–38
TextFont, 68
TextSize, 68
Texture, 6–8, 31, 34–35, 40, 71, 73, 97, 110–111, 151, 209, 427, 434
Texture gradient, 434
Threshold, 11–12, 19–24, 45, 77, 82, 87, 140, 221–223, 226, 237, 262–263, 270, 272–275, 285–286, 293–295, 303, 307, 309, 320–323, 325, 328, 332, 335, 338–340, 351–362, 364–365, 368–375, 378–379, 381–382, 387, 393–397, 400–405, 424, 430, 435
Threshold versus external noise contrast (TvC) function, 273–275, 281, 283, 332, 335, 338–340, 378–379, 387, 390, 392–395, 399, 402, 424
Thurstone scaling, 204–205, 207
Time
  course, 116, 120, 170–171, 186, 340, 342–343, 347
  of onset, 23
  stamp, 167–169, 186, 393
Timing, 23, 61–63, 72, 105, 110–111, 126, 163, 167–169, 171, 183, 186, 194, 392
  error, 167
Touchpad, 62, 161, 163
Touchscreen, 23, 163
Training, 13, 199, 269, 312, 340, 357, 390–391, 394–397, 400, 424, 426, 431, 433
Transcranial direct current stimulation (tDCS), 13, 436
Transcranial magnetic stimulation (TMS), 13, 436
Transducer function, 270–271, 295
Transfer function, 209, 213, 215
Transistor-transistor logic (TTL), 169, 186, 193–194
Treutwein, B., 358, 361
Trichromatic theory, 8
Trigger, 97, 120, 140, 168–170, 193–194
Triple-TvC paradigm, 274
Tri-stimulus value, 145
True color image, 36, 43
Tuning, 286, 291–292

Uncertainty, 244, 254, 258, 261, 263, 268, 283, 290, 369, 370
Unidimensional scale, 11, 200, 207, 213, 240, 250–251
Useful field of view, 424–425

V1, 101, 389–390
V2, 101, 389
V3, 101, 389
V4, 101, 389–390
Variance accounted for, 306
VBLsync, 112
Vector-plot display, 122
Vertical blanking interval (VBI), 112–113, 120
Video splitter, 110, 116, 120