
Modeling Human Eye Behavior During Mammographic Scanning: Preliminary Results

This document discusses using an eye tracking system to study how radiologists examine mammograms to detect breast abnormalities. Four experts examined 14 mammograms while their eye movements and pupil diameter were recorded. The data was analyzed to study relationships between visual scanning patterns, pupil response, and diagnostic accuracy. Results were consistent with prior studies examining how experts scan chest radiographs. This preliminary study provides a proof-of-concept for using eye tracking to better understand mammogram examination and potentially help with computer-aided diagnosis.


494 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 27, NO. 4, JULY 1997

Modeling Human Eye Behavior During Mammographic Scanning: Preliminary Results
K. Preston White, Jr., Senior Member, IEEE, Tonya L. Hutson, and Thomas E. Hutchinson

Manuscript received January 15, 1995; revised April 14, 1996 and August 29, 1996. An earlier version of this paper was presented at the 1994 Symposium on Human Interaction with Complex Systems. The authors are with the Erica Laboratory, Department of Systems Engineering, University of Virginia, Charlottesville, VA 22903-2442 USA (e-mail: [email protected]). Publisher Item Identifier S 1083-4427(97)03476-0. 1083-4427/97$10.00 © 1997 IEEE

Abstract: Understanding how people acquire information from pictures (radiographs, maps, charts, photographs, drawings, and other static images) can be an important component in understanding, aiding, and eventually automating a wide range of diagnostic tasks. In the experiment reported here, we investigate the use of an inexpensive and unobtrusive eye-tracking system to explore relationships between visual scanning patterns, pupillary response, and the clinical diagnoses of mammographic experts. One radiologist and three radiological technicians each examined a series of 14 mammograms for indications of abnormalities associated with breast cancer. The status of each mammogram was verified by biopsy. The eye-tracking system was used to measure and record eye position and pupil diameter as a function of time as the subjects scanned the mammograms. Three treatments were applied to the scan data to model the experts’ eye behaviors. These included quantification of dwell time and pupil diameter as a function of diagnostic accuracy in regions of the mammogram where abnormalities existed or were perceived; independent clustering of lookpoints without respect to abnormalities; and analysis of scan transitions between lookpoint clusters. Results of the analysis were consistent with extensive prior studies of eye-scan measures recorded during the diagnosis of abnormalities on chest radiograms. This preliminary investigation provides a proof of concept for use of the eye-tracking technology, experimental protocols, and analysis methodologies as the basis for expanded mammographic studies, with the promise of eventual adaptation as a source of diagnostic information in clinical practice.

I. INTRODUCTION

A. Breast Cancer and Mammography

Breast cancer is the most common malignant disease in women, with one in ten women developing some form of the disease during their lifetime [6], [11]. In 1995 alone, the American Cancer Society [2] estimates that 182 000 cases of invasive breast cancer will be diagnosed in the United States. Once fatal, breast cancer is now curable through early diagnosis and advanced treatment techniques. Regardless of the treatment, however, the prognosis for surviving breast cancer ultimately depends on how early the malignancy is detected. (See [37] for an excellent and readable summary of the epidemiology of breast cancer.)

Mammography is a means of imaging breast tissue using X-rays. Applied since the 1960s, mammography is the only proven method of detecting breast cancers which are not present as palpable abnormalities [6]. Performed optimally, mammography can target an estimated 90% of all breast disease [11]. The primary diagnostic technique in mammography is the visual evaluation of films by skilled radiologists. The objective is to identify areas of suspicion in the breast which may indicate cancer. If an abnormality is detected on the mammogram, then a biopsy may be ordered to ascertain if the corresponding area is benign or malignant.

When examining mammograms, radiologists look for certain visual cues and characteristics which are known indicators of disease. These diagnostic features include microcalcifications and masses. A microcalcification is a small calcium deposit between 0.1 mm and 1.0 mm across that has accumulated in the breast tissue, which appears as a bright speck on the mammogram [45]. Between 30% and 50% of breast cancers demonstrate clustered microcalcifications (three or more microcalcifications within a 1 cm region), and in approximately one third of these cases they are the only indication of malignancy [12], [14], [43].

A mass or nodule is a space-occupying lesion with margins seen on two different projections of the breast. Visually, masses are three-dimensional, distinct from the surrounding tissue, and most often asymmetric with an outward convex contour [1]. Masses are classified as circumscribed, microlobulated, obscured, indistinct, or spiculated, depending on the nature of their margins in terms of regularity and definition. Masses are a common finding in mammography, with benign lesions tending to have medium to low density and well-defined margins and malignant lesions tending to have higher density and fine irregularity or nodularity at their edges [12].

Because of their ambiguous appearance and small size (especially in minimal breast cancer), features which are actually associated with malignancy are difficult to detect with certainty. Mistaken diagnoses present two forms of risk to the patient. False negatives fail to identify malignancies on screening and lead to the delay of treatment. These are perhaps the most common errors and are considered to be the most serious by the medical community. False positives also can have serious consequences, however, resulting in unnecessary stress on patients recalled for further examination and, in some cases, unnecessary procedures associated with increased risk and cost [20], [26].

B. Computer-Aided Diagnosis

Computer technology for improving radiological and mammographic diagnosis has received increasing attention over the past two decades. Digital mammography appears to be
WHITE et al.: MODELING HUMAN EYE BEHAVIOR 495

on the horizon, although significant barriers (such as limited resolution and limited field of view) are still to be overcome. While awaiting advances in digital mammography that will permit widespread implementation, computer-aided diagnosis (CAD) has been shown to have clinical potential for both the depiction and analysis of abnormalities [36].

One line of CAD research seeks to automate diagnosis, using pattern recognition and image enhancement techniques to detect abnormalities in digitized images. Applied to mammography, computer-aided detection of microcalcifications [43] and masses [16] are examples of this promising line of investigation. As second observers in clinical practice, such CAD tools could review images and suggest considerations to the mammographer, with the anticipation of improving unaided diagnostic accuracy.

An alternative and complementary line of CAD research seeks a more basic understanding of the human diagnostic process, by analyzing the eye behavior of radiologists during the search for abnormalities. Although not yet attempted in mammography, this approach is exemplified by the extensive research of Kundel, Nodine, and their associates at the University of Pennsylvania in detecting abnormalities on chest images [21]–[28], [30], [31], [33], [34]. Studies have shown that 30% of all lung nodules can be missed on the first reading of a chest radiogram, even though the same nodules were clearly present on prior radiograms of the same patient [3], [33]. Reasons proposed for these false negatives include factors relating to the image, such as the location of nodules within the chest [29], the size and number of nodules present [35], and the diagnostic quality of the images [27], [36]; as well as factors relating to the diagnostician, such as level of training [4], search versus nonsearch protocols [38], premature termination of the search [8]–[10], [31], and conservative decision criteria [26].

C. Motivation and Overview

In this paper, we seek to improve on the technology and expand the range of measures and models applied to understanding the diagnostic processes used by radiologists. Specifically, we investigate the use of an inexpensive and unobtrusive eye-tracking system [19] to explore relationships between visual scanning patterns and the clinical diagnoses of mammographic experts [40]. This system represents an advance over equipment used in previous (chest radiogram) studies, in that it does not require the radiologist to wear head-mounted gear. The noninvasive and comparatively unobtrusive nature of the technology provides the subject mammographer the greatest freedom from physical and psychological impediments, offering the greatest (if still imperfect) similitude to a typical radiographic reading room.

In addition to tracking eye-gaze position on the mammogram, the system tested here also records the reader’s pupil diameter. This additional measurement adds a new dimension to the recorded scanning patterns for study, not available in prior work on chest radiograms. Prior research on pupillary response during various information processing tasks [32] has shown a correlation between the change in pupil dilation and the level of difficulty of mathematical problems being solved, and a linear relationship between pupil size and the amount of information processed during a memory task. In particular, [39] describes pupil behavior at points of (correct and incorrect) recognition of an on-screen image as the details of this image were gradually revealed over time. This characteristic signature may have potential implications for mammographers recognizing microcalcifications and masses during scanning.

In the experiments reported here, the eye-tracking system was used to measure and record eye position and pupil diameter as subjects scanned the mammograms. One radiologist and three radiological technicians each examined a series of 14 mammograms for indications of abnormalities associated with breast cancer. The status of each mammogram was verified by biopsy. Three treatments were applied to the scan data to model the experts’ eye behaviors. These included quantification of dwell time and pupil diameter as a function of diagnostic accuracy (i.e., true positives, false positives, false negatives, and true negatives) in regions of the mammogram where abnormalities existed or were perceived; independent clustering of lookpoints without respect to abnormalities; and analysis of scan transitions between lookpoint clusters. Results of the analysis were consistent with extensive prior studies of eye-scan measures recorded during the diagnosis of abnormalities on chest radiograms as outlined above.

The preliminary study reported here is motivated by the difficulty of obtaining skilled subjects and the value of their time. Our express purpose in this investigation is to provide a proof of concept for use of the eye-tracking technology, experimental protocols, and analysis methodologies, before embarking on expanded mammographic studies requiring more extensive and costly data collection efforts. The experiment is not representative of a large screening program in clinical practice (because of the large proportion of abnormal images included in the sample) and the results are insufficient to make definitive statements regarding diagnostic behaviors (because of the small sample size). Nevertheless, the study as a whole is sufficiently suggestive to encourage future work, with the promise of eventual adaptation as a source of diagnostic information in clinical practice.

Statistical analyses of visual dwell and pupil diameter for lookpoints in known feature areas revealed that false-negative diagnoses are marked by significant dwell times in 67% of cases (with dwell times nearly comparable to both true positive and false positive diagnoses) and by increased pupil diameter in 75% of cases (in contrast to decreased diameter for both true positive and false positive diagnoses). Clustering analysis of lookpoints without explicit reference to features revealed that 21% of the largest unreported clusters also are associated with false negatives. These results are consistent with prior studies and suggest that scanning patterns may be a valuable source of diagnostic information when applied in clinical practice.

Finally, an analysis of lookpoint transitions between clusters showed that individual diagnosticians employed consistent, but highly individual, scanning strategies. Moreover, in spite of the often marked differences in individual scan paths, the pattern of search along these paths was similar for all four diagnosticians. The sparse transition matrix which captures this pattern may be useful for future work, such as comparisons between novice and expert behaviors.
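
As a rough illustration of the transition analysis mentioned above, the following Python sketch tallies moves between lookpoint clusters, assuming each lookpoint has already been assigned an integer cluster label. The function name `transition_counts` and the decision to ignore consecutive samples that stay in the same cluster are illustrative assumptions, not the procedure used in this study.

```python
from collections import Counter
from typing import Iterable


def transition_counts(labels: Iterable[int]) -> Counter:
    """Count moves between consecutive, different cluster labels.

    Returns a sparse mapping {(from_cluster, to_cluster): count};
    pairs that never occur are simply absent from the mapping.
    """
    counts: Counter = Counter()
    previous = None
    for label in labels:
        # Only a change of cluster counts as a transition (assumption).
        if previous is not None and label != previous:
            counts[(previous, label)] += 1
        previous = label
    return counts


if __name__ == "__main__":
    # Hypothetical sequence of cluster labels for one reading.
    scan = [0, 0, 1, 1, 0, 2, 2, 1]
    for pair, n in sorted(transition_counts(scan).items()):
        print(pair, n)
```

Storing only the observed pairs keeps the representation sparse, which suits readings in which most cluster pairs are never visited in sequence.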

II. EXPERIMENT

A. Experimental Set-Up
Recording the eye movement of mammographic experts is accomplished using the Eye-gaze Response Interface Computer Aid (ERICA) [15], [19], [42], a unique computer system developed at the University of Virginia. The specific ERICA configuration applied in this study is built around an IBM 486 PS/2 Model 70 personal computer; a Sanyo high-resolution, high-speed-shutter, CCD camera with zoom lens and visible light filter; and a Tecmar Video Capture Adapter image-processing board. A GE 1.5 mW gallium-arsenide LED is mounted coaxially with the camera lens. The LED provides a collimated beam of near-infrared (880 nm) light that illuminates the subject’s face. The system determines where the subject is looking by repeatedly analyzing the near-infrared light reflected from one of the user’s eyes, as captured by the camera and recorded digitally by the image board. Changes in pupil diameter also are determined in this way. A Hitachi monitor is used to display the camera field of view for positioning the camera and subjects, and an 8514 color monitor with 1024 × 768 pixel resolution is used to play back recorded scan paths over a digitized image of the corresponding mammogram.

Frames are captured at 60 Hz and experiments confirmed that changes in eye-gaze position and pupil diameter occurring above this frequency (at greater than 0.0167 s between changes) could be observed and recorded. Prior studies [13] demonstrated that, in the absence of head motion, a properly calibrated ERICA system can determine the user’s point of regard with nearly 100% accuracy to within approximately 0.6° of visual angle, i.e., within a confidence area of approximately 0.5 cm in radius on a display screen located approximately 45 cm in front of the user.

The principal limitation of the corneal reflection technique embodied in ERICA is that, to achieve the stated accuracy and precision, the subject’s head must remain nearly stationary during use. Backward or forward movement of the head relative to the camera lens causes the image of the eye to blur, whereas side-to-side motion of the head in the plane parallel to the lens removes the eye from the camera field of view. Head movement in any direction relative to the calibration position of less than about 3 cm is generally acceptable. Because the total visual angle of the film display box (described below) was less than about 20°, the entire film could be viewed without head motion. Head motion during the reading of a film did not appear to cause any measurable error in tracking eye position for any film or subject; head motion between readings was corrected by asking the subject to reposition her head, or by refocusing the camera. Thus, head restraints proved unnecessary and were not used.

The arrangement of equipment depicted in Fig. 1 was employed. Actual film mammograms attached to a light box were used to replicate the diagnostic situation. To increase the relative luminance of the light box, all portions of the box were obscured by black poster board except the viewing area. EyePsych [13], the psychological testing software of the ERICA system, was used to run the tests and gather data. EyePsych collects and records data describing the location of a diagnostician’s point of regard on the mammogram, called the lookpoint, and pupil diameter. It also can superimpose lookpoints on the computer screen, on which a digitized image of the mammogram pair appears, with an indication of pupil diameter at each point for replay and analysis purposes.

Fig. 1. Experimental setup for reading mammograms and recording scan data.

The ERICA system collects gaze position and pupil diameter at a sampling frequency of 60 Hz. These data are recorded as quadruples in time-sequence order, each consisting of the horizontal computer screen pixel position x, the vertical computer screen pixel position y, the pupil diameter, and the number of the lookpoint observation (which also represents time, since there are 60 points in a second of data). Thus, the problem is one of analyzing these four dimensions of data.

Images were obtained from the Primary Care unit of the University of Virginia Hospital. Ten patient cases were chosen with proven biopsy results of microcalcifications and abnormal masses. In addition, four images from patients with no visible signs of cancer were included. Cases were selected to be representative of different levels of difficulty, with a range of locations of the features.

In mammography, at least two radiographic views of the breast are used to pinpoint areas of suspicion. The mediolateral oblique (side) and craniocaudal (top-down) views, abbreviated MLO and CC respectively on mammogram films, are the standard views for all screening and diagnostic examinations. The MLO and CC views of the breast are made at right angles to each other, permitting a three-dimensional assessment of the breast [5].

For each case, the MLO and CC views of one breast were duplicated, spliced together, and cut to a size of 11 in × 8 in (27.9 cm × 20.3 cm) in order to allow placement on the display box. While the films used therefore were smaller than standard, no areas of tissue image were removed or obscured; only parts of the film with no image on them were eliminated. Thus, there was no loss of diagnostic ability from an actual diagnosis based on the two views in the primary care setting.
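
To make the recorded quantities concrete, the following Python sketch models one lookpoint record and converts a lookpoint index to elapsed time and a visual angle to a radius on the display. The names `Lookpoint`, `index_to_seconds`, and `angle_to_cm` are hypothetical and are not part of ERICA or EyePsych; only the 60 Hz sampling rate and the 0.6° and 45 cm figures are taken from the text.

```python
import math
from dataclasses import dataclass

SAMPLE_RATE_HZ = 60.0  # sampling rate stated in the text


@dataclass(frozen=True)
class Lookpoint:
    """One recorded sample; the field names are illustrative assumptions."""
    x: int        # horizontal screen position, in pixels
    y: int        # vertical screen position, in pixels
    pupil: float  # pupil diameter, in the recorder's own units
    index: int    # sequence number of the observation


def index_to_seconds(index: int) -> float:
    """Elapsed time implied by a lookpoint index at 60 samples per second."""
    return index / SAMPLE_RATE_HZ


def angle_to_cm(angle_deg: float, distance_cm: float) -> float:
    """Radius on a flat display subtended by a visual angle at a distance."""
    return distance_cm * math.tan(math.radians(angle_deg))


if __name__ == "__main__":
    sample = Lookpoint(x=512, y=384, pupil=40.0, index=60)  # made-up values
    print(index_to_seconds(sample.index))    # seconds since the first sample
    print(round(angle_to_cm(0.6, 45.0), 2))  # accuracy radius in cm
```

At 45 cm, a 0.6° angle corresponds to a radius of roughly 0.47 cm, in line with the approximately 0.5 cm quoted above.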

B. Experimental Protocol

Subjects were seated directly in front of the light box at the center of the viewing area. Their viewing distance was approximately 18 in (45.7 cm) from the mammograms. Four experts at the University hospital were tested: three senior technicians and one radiologist. Table I outlines each subject’s profile in terms of title (radiologist or technician), years of experience, and visual aid (contact lenses or glasses). Experience refers to full-time daily reading of mammograms. Visual aids are significant because these may introduce artifacts into the data; occasionally, small reflections from a glasses nose-piece or contact edge can cause a mistaken identification of the gaze position. The subjects are identified as A, B, C, and D.

TABLE I
PROFILES OF FOUR MAMMOGRAPHIC EXPERTS PARTICIPATING IN EXPERIMENTS

Data collection was divided into three distinct segments: preparation, testing, and post-testing analysis. The procedure for each is described in the following sections. The entire testing procedure, including preparation and execution, took approximately 1.5 h for each subject. The first portion of the preparation phase, subject registration, entailed calibrating the subject’s eye-gaze. Briefing the subject on the method of testing was the second step in preparation. The method of image presentation and testing was described and the subjects were allowed to ask questions for verification. The subjects were told that some of the images contain masses and/or microcalcifications. They were told to indicate locations of microcalcifications, masses, and abnormalities after each image evaluation.

Subjects began testing immediately after successful preparation. For each image, the subject’s eyes were closed as the image was affixed to the viewing area of the light box. Testing began when the subjects opened their eyes and continued until 2100 sequential lookpoints were collected (35 s). Because in true diagnoses the experts do not look at the images for a predetermined amount of time, the subjects could signal that they had completed their assessment of the image at any time by looking down at the center of the camera. After lookpoints for each image were recorded, the subjects marked their diagnoses by circling areas with masses, microcalcifications, and other features (such as lymph nodes) on a clear overlay on top of the mammogram.

This procedure was repeated for each of the 14 mammograms. Discussions with each subject followed the completion of the testing session. Subjects indicated the reasoning behind their diagnoses, as well as the course of action they would have taken.

III. EXPERIMENTAL RESULTS AND ANALYSIS

The raw data consists of four sets of 14 series consisting of up to 2100 points each, where each point has a location, pupil diameter, and implicit time stamp. A scatter plot of the position data alone, as illustrated in Fig. 2, displays the series of lookpoint positions observed during one representative reading in terms of x and y pixel position on the screen. The plot provides the most easily interpreted representation of where the reader looked on the actual mammogram image, independently of time and pupil diameter.

Fig. 2. A typical scatter plot of the x and y position data independent of time. This plot provides the most easily interpreted representation of where the reader looked on the actual mammogram image.

Fig. 3 graphs the complete data set (pupil diameter and screen position as a function of the lookpoint index) for a typical reading. Each of the variables is plotted as pixel value for each lookpoint. The left y-axis represents screen position, while the right y-axis is the range for pupil diameter. The x-axis is the lookpoint number; alternatively, this axis represents time, since a lookpoint is taken every 0.01667 s. This plot allows the pattern of pupillary dilation and contraction over time to be cross referenced to gaze-position on the scatter plot. Diagnosis of the particular image displayed in Fig. 3 was completed at a lookpoint value of approximately 1250, or 20.8 s. The data for subject A, images 6 and 7, were found to be badly placed upon review; lookpoint coordinates were not in the field of the screen and thus unsuitable for analysis. This most likely resulted from extraneous glasses reflections.

Once the diagnoses were indicated, these were compared to the true biopsy state for each mammogram. Such a comparison reveals four potential outcomes. There are two possible truth states for each feature (present, absent), with two possible responses (present, absent), resulting in four cases: true positive, false positive, true negative, and false negative. The true state of the image is obtained from the biopsy results.

A. Feature Analysis

The first step in data analysis is to examine the areas containing features of interest. These features are either masses or microcalcifications and their actual location in the image has been verified with biopsies. These areas constitute either true positives or false negatives, depending on whether or not the

subjects marked their location. Of additional interest are the areas falsely targeted as containing features.

Fig. 3. A typical plot of pupil diameter and x- and y-screen position (in pixels) versus the observation sequence in time. This plot allows the pattern of pupillary dilation and contraction over time to be cross referenced to gaze-position on the scatter plot.

Two aspects of the data can be analyzed in these feature regions. The first is dwell time. Prior studies by Kundel and others [26] have linked length of dwell in regions on the image to positive and negative diagnoses through the examination of visual dwell on lung nodules in relation to false positive and false negative decisions. In order to determine dwells, Kundel’s study took the (x, y)-coordinates, calculated centers of coordinate groups based on a running-mean calculation, and clustered fixations of groups with dispersions of less than 2.5°. From analysis of these dwells, they determined that 96% of true positive regions, 83% of false positive regions, and 68% of false negative regions accumulated a significant visual dwell, averaging 2 s or more.

For purposes of the analysis here, which also examines pupil diameter, the data are not first condensed into groups before forming the visual dwells. This analysis examines the data within the feature areas of interest. The area of interest is determined from the estimate of a 5° angle covered by the average human eye fovea, as used in prior studies. According to [26], focal attention around a visual structure extends to 2.5° from the center of the gaze in all directions. The radius which defines this area of foveal acuity is termed the threshold and is used in this section as well as the clustering analysis to determine sets of points corresponding to a dwell or cluster. At a viewing distance of 18 in (45.7 cm), the threshold is approximately 0.8 in (1.98 cm) or 75 image pixels.

Examination of pupil response does not have a clearly defined threshold to describe the data. The aim of analyzing the pupil is to decide if significant pupil response can be associated with a region in the image. But determination of what constitutes a “significant” response is difficult, because of the many variables involved in pupil fluctuation, such as blinking. Comparison between subjects is even more difficult, since some people naturally have larger pupils or seem to have larger pupil changes than others, as described in the qualitative analysis.

To account for this, the local mean of pupil diameter within an area of interest was compared to the total pupil diameter mean for the image. In this way, an arbitrary threshold was not needed. Moreover, large rises caused by blinks would not overshadow smaller pupil responses caused by the feature area of interest. Also, the absolute pupil size information would not be lost, as would be the case with rate of change. Both the dwell analysis and the local mean pupil analysis were executed in a Borland C++ Version 3.0 program implemented on a 486 IBM-compatible computer. Pupil diameter is represented as the percent difference from the mean, while dwell is quantified as the percentage of time spent in the feature region. Using percentages affords comparison within and between subject data sets. Specifically, the following measures were used:

$$\text{Local Mean Change} = \frac{\text{Local Mean} - \text{Global Mean}}{\text{Global Mean}}$$

$$\text{Dwell Time} = \frac{\text{Number of Lookpoints at Feature}}{\text{Total Number of Lookpoints}}$$

These equations were used to evaluate the dwell and pupil behavior in the true positive, false positive, and false negative regions based on subject diagnoses. The center of the region was used as feature coordinates. An analysis of the mean values for pupil change and dwell time appears in Table II. The largest deviation from the mean pupil diameter occurred for false positives, where the local feature mean averaged over 3% less than the overall mean. The only average rise in local pupil diameter occurred for false negatives, with an average rise of almost 2%. In fact, 75% of pupil changes for false negatives were increases over the average diameter, as displayed in the previous tables. In the true diagnostic

situation, this information could be used as feedback to the radiologist, prompting reevaluation of the areas with rises in pupil diameter.

TABLE II
SUMMARY OF PERCENT CHANGE IN PUPIL DIAMETER AND VISUAL DWELL TIME DIFFERENTIATED BY DIAGNOSTIC CLUSTER TYPE AND AVERAGED OVER ALL MAMMOGRAPHIC READINGS

The correlation coefficients for each of the three cases ranged between 0 and 0.5. None of these values indicate a large linear correlation. As such, dwell time is not necessarily predictive of pupil response in terms of a linear model, nor does pupil response serve as a gauge for the dwell time.

All of the average dwell times constitute significant dwells, with false positives having the largest percent of time spent in the feature regions. Significance is based on an expected dwell over a blank image. Dividing the image region, of a given width and height in pixels, into areas with radius 75 pixels (as calculated previously) yields

$$\text{Areas} = \frac{\text{Total Area}}{\text{Circular Area}}$$

$$\text{Time at Each Area} = \frac{1}{\text{Areas}}$$

Thus, assuming an equal proportion of time for each area, the percentage of attention in each 75-pixel radius region would be 2.25%. Dwells for the diagnosed areas of false and true positive, as well as for false negative, represent over four times the expected amount of time for a uniform image (such as a blank screen).

Using the previous tables as well as this notion of significant dwells, the percentage of features receiving significant dwells can be calculated. Out of 36 true positive diagnoses, 26 (72%) received significant dwells. False positives have a higher percentage of significant dwells with 12 out of 15 (80%). Finally, false negatives accrue significant dwells in eight out of 12 cases (67%). These findings are consistent with Kundel [26], who found 83% of the false positive cases and 68% of the false negative cases to have dwells.

The percentage for true positives, however, is less than the

features, may improve diagnoses by prompting reappraisal of the areas.

B. Clustering

Feature analysis examined the data based on the true state of the image, looking at the data within regions containing masses or microcalcifications, as well as regions falsely targeted as containing one of these features. In the actual diagnostic situation, however, the true state of the image is not known. Thus, it is worthwhile to inspect the problem from the opposite frame of reference, examining first the visual behavior of the viewer. Conclusions drawn in this manner may then be used with images for which the true biopsy diagnosis is not known.

From this point of view, one way to analyze the data is through clustering, which aggregates the screen pixel locations into sets which may define areas of interest in the image. Transforming these individual pixel locations at a given time into meaningful data comprises the task of this problem. Preliminary analysis of the data reveals certain areas of a given image which appear to be more interesting (i.e., more gazes in this area) and a tentative pattern of the gazes (i.e., between the areas of interest), where a gaze is defined as a screen pixel location.

Several methods of clustering may be applied to this data; the method of choice is subsequently described. Many algorithms have the constraint that the number of clusters must be known. For the image diagnosis problem, this is not a valid assumption since the images contain a different number of features, and gazes may cluster around an unknown number of areas. The Leader Algorithm is, therefore, a good choice for clustering in this domain. Rather than assuming that the number of clusters is known, the Leader Algorithm assumes a known threshold which defines the maximum distance for a point to be in the cluster. The assumption of a threshold is quite applicable to this problem, because of the biological limits of vision. The specific calculation of this threshold is described previously.

Once the threshold is determined, the Leader Algorithm is executed with the steps shown in Table III. The first step takes the first point and assigns it as the leader of the first cluster. The rest of the algorithm is an iterative procedure, where if the distance between the leader and a point is less than the threshold, the point is assigned to that cluster; if not, the point becomes the leader of its own cluster.

This algorithm has advantages other than not assuming the number of clusters. It is easy to implement in a programming language, and it is generally fast, being of order in the worst case. The order of the algorithm is an indication
Kundel, which was 96%. This may be because the lung of its efficiency, and is proportional to the amount of time it
nodule study looked only at masses, while in this study will take. For the data in this problem, where the breast images
both microcalcifications and masses were possible features. and features are in distinct locations and the gaze looks around
Looking only at the true positive diagnoses for masses in the these areas, the algorithm will perform consistently better than
images here, 12 out of 14 received significant dwells, which is the worst case, which assigns each point to its own cluster.
86%. This is closer to Kundel’s original findings and reveals Two disadvantages of the Leader Algorithm can be noted.
the possible differences in the subjects’ perception of masses The first is that it is susceptible to the choice of thresh-
versus microcalcifications. As with pupil response, indicating old. In this case, the natural decision of threshold based on
which areas were dwelled upon, including the false negative the biological capabilities of human vision diminishes this

Fig. 4. Leaders of the 11 clusters determined for the eye-location scatter data presented in Fig. 2, after application of the Leader Algorithm.
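The Leader Algorithm as described in the text (the first point leads the first cluster; each later point joins a cluster whose leader lies within the threshold, or else becomes a new leader) can be sketched in Python. This is an illustrative sketch, not the authors' implementation: the names `leader_cluster` and `significant_clusters` and the sample lookpoints are hypothetical, the 75-pixel threshold and the 1/44.5 significance cutoff come from the text, and assigning a point to the first leader found within the threshold is an assumption.

```python
import math

THRESHOLD_PX = 75.0    # cluster threshold stated in the text
EXPECTED_AREAS = 44.5  # uniform-image divisor stated in the text
# (a 1024 x 768 screen divided by pi * 75**2 gives about 44.5;
#  that screen size is an assumption, not stated in the text)


def leader_cluster(points, threshold=THRESHOLD_PX):
    """Single pass over time-ordered (x, y) lookpoints.

    Returns a list of clusters; each cluster is a dict holding the
    leader point and the indices of its member points.
    """
    clusters = []
    for idx, (x, y) in enumerate(points):
        for cluster in clusters:
            lx, ly = cluster["leader"]
            # Assumption: join the first leader found within the threshold.
            if math.hypot(x - lx, y - ly) < threshold:
                cluster["members"].append(idx)
                break
        else:
            # No leader is close enough: this point leads a new cluster.
            clusters.append({"leader": (x, y), "members": [idx]})
    return clusters


def significant_clusters(clusters, total_points, areas=EXPECTED_AREAS):
    """Keep clusters holding more points than a uniform image would give."""
    cutoff = total_points / areas
    return [c for c in clusters if len(c["members"]) > cutoff]


# Hypothetical lookpoints in screen pixels, in the order they were recorded.
lookpoints = [(100, 100), (110, 105), (400, 300),
              (95, 120), (405, 310), (700, 50)]
clusters = leader_cluster(lookpoints)
for c in clusters:
    print(c["leader"], len(c["members"]))
```

Because the data are processed in a single pass, reordering `lookpoints` can change which points become leaders, which is the order sensitivity the text notes.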

TABLE III
PSEUDOCODE FOR THE LEADER ALGORITHM, USED TO CLUSTER LOOKPOINT DATA

concern. Second, the algorithm passes through the data once, so changing the order of the data may change the clustering results. Again, the nature of the data, which is time dependent and so has a given order, decreases this concern.

The Leader Clustering Algorithm was implemented in Borland C++ 3.0. Clusters are formed using a threshold of 75 pixels. In addition, only clusters with a significant number of points in them are recorded. Significance is determined by having greater than the expected number of points based on a uniform image. This was calculated previously as 2.247% of total visual attention being devoted to a region. For the cluster analysis, this is based on the total number of points in the data set, where the total lookpoints are tabulated for each subject and image. Specifically, only clusters with greater than (Total Number of Lookpoints)/44.5 are recorded. These values for the number of lookpoints are input by the user to the program, since they are different for almost every data set. The output of the program is a set of files which enumerate the clusters and list the number of points in each, $x$ pixel position, and $y$ pixel position.

Output of the clustering program consists of a list of clusters denoted by the number of points in the cluster and the leader point of the cluster for each subject and image. Each cluster corresponds to a position on the screen. Fig. 4 shows a plot of clusters, which corresponds to the same data set presented in Figs. 2 and 3.

Clusters for each subject and image reveal further similarities and differences in diagnostic styles as first highlighted by qualitative and feature analysis. For 10 of the 14 images, each expert had a different number of significant eye clusters. The mean number of clusters over all subjects and data is 12.78, with the maximum number for one test image at 18 and the minimum at 5. Shorter diagnosis times do not necessarily result in a smaller number of significant clusters, since the decision of significance is determined using the diagnosis time, as previously explained.

The clusters themselves are somewhat representative of the sequence in which the subject viewed those areas on the image. As such, the order of the clusters is different for each image and subject because of individual scanning differences as well as the different subject matter of the images. In numerous instances, clusters from different subjects will have the same proximity in the image, although these similar clusters are not identified by the same number or even the exact same pixel screen coordinates.

In many cases, three or four clusters appeared to dominate the others in terms of the number of points these contain. Based on this observation, the three largest clusters for each data set were examined in relation to the feature areas and diagnoses of true positive, false positive, and false negative. Each of the dominant clusters, identified spatially by the leader’s coordinates, was compared to the center coordinates of the features using the Euclidean distance norm. For this comparison, a distance threshold of 150 pixels was used, in order

TABLE IV
PERCENTAGE OF LARGEST (SIZE 1), SECOND LARGEST (SIZE 2), AND THIRD LARGEST (SIZE 3) VISUAL CLUSTERS CATEGORIZED BY DIAGNOSTIC TYPE
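The comparison behind this table, in which each dominant cluster leader is matched to a feature center using the Euclidean distance and a 150-pixel threshold, might look like the following Python sketch. The function name `classify_cluster`, the feature coordinates, and the rule that the nearest feature wins when several lie within the threshold are illustrative assumptions, not details reported in the study.

```python
import math
from collections import Counter

# 150 pixels, as stated in the text: 75-pixel feature plus 75-pixel cluster.
MATCH_THRESHOLD_PX = 150.0


def classify_cluster(leader, features, threshold=MATCH_THRESHOLD_PX):
    """Return the diagnostic type of the nearest feature within the
    threshold, or "other" if no feature center is close enough.

    features: list of ((x, y), label) pairs, where label is one of
    "true positive", "false positive", or "false negative".
    """
    best_label, best_dist = "other", threshold
    for (fx, fy), label in features:
        dist = math.hypot(leader[0] - fx, leader[1] - fy)
        if dist < best_dist:  # assumption: the nearest feature wins
            best_label, best_dist = label, dist
    return best_label


# Hypothetical feature centers and dominant cluster leaders (screen pixels).
features = [((200, 300), "true positive"), ((650, 420), "false negative")]
leaders = [(230, 310), (640, 500), (900, 100)]

labels = [classify_cluster(leader, features) for leader in leaders]
counts = Counter(labels)
percentages = {label: 100.0 * n / len(labels) for label, n in counts.items()}
print(labels)
print(percentages)
```

Tabulating `percentages` separately for the largest, second largest, and third largest clusters of each data set would give rows of the kind the table reports.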

to account for overlapping points of the 75 pixel feature and 75 pixel cluster.

For all the images, the percent of clusters in the region of a feature was calculated for the three largest clusters. If the cluster did not correspond to any true positive, false positive, or false negative features, it was considered part of the “other” category. This analysis resulted in Table IV. The cluster sizes (ordered by the number of lookpoints in the cluster) are denoted by 1 (largest), 2 (second largest), and 3 (third largest). The table shows the percentage of occurrences of each type of diagnosis.

The table shows that approximately 60% of the largest clusters correspond to features in the image which were diagnosed as true positive, false positive, or false negative, while approximately 40% of the largest clusters correspond to other areas of the images. These other areas apparently represent features which aroused the suspicion of the diagnostician, but which (properly) did not result in a positive diagnosis. The table shows also that the areas of significant interest were roughly evenly divided between positive and negative diagnoses, with false positives representing something under a third of all positives and false negatives something over a quarter of all negatives.

These results also are consistent with Kundel’s work. Kundel found that out of 5117 regions with fixations, 4878 (95%) of them had a dwell in a true negative region. The mean dwell in these true negative areas was 0.51 s. Based on the previous definition of a significant dwell, any image with a diagnosis time of less than 23 s would have significant dwells of 0.51 s, since $0.51/23$, or 2.21%, of the visual attention was spent in that region. Thus, for many of the images, significant clusters occurred in areas which are true negative.

While these results are not as conclusive as the dwell study, which only examined areas of interest rather than the total image, the observation of clusters along the scan path still appears to be diagnostic, as well as practical for clinical implementation. Approximately 11% of all of the largest clusters are associated with false negatives. More importantly, false negatives represent approximately 21% of the largest clusters not marked by the diagnosticians. Because of the relatively large number of clusters in true negative areas, the occurrence of a cluster does not conclusively lead to the deduction that an image contained a feature of interest. However, if adopted in clinical practice, the occurrence of an unmarked cluster should trigger reexamination of that area for such a feature.

C. Dynamic Analysis

Cluster analysis, as well as the feature analysis, provides only a static representation of the data, with no description of the subject’s visual behavior over time. The next level of analysis looks at where visual attention is focused and how the areas of interest are related over time. The time-series behavior of the eye scans can be modeled by a stochastic process. The hypothesis is that areas of the image may be modeled as states which the subject-image system may be in at a given time. Additionally, the pattern of eye movement between these areas of interest may be modeled as state transitions. Presumably, the state transitions are probabilistic, resulting from the intrinsic uncertainty governing human behavior.

The implications of this model are that it affords a psychologist (or anyone studying eye behavior) an analytical method to compare individual performances based on the transitions between states. Reference [17] points to the usefulness of Markov modeling to go beyond visual inspection of eye movement traces. There are at least two dimensions of analyses over an image allowed by these comparisons: 1) performance of an individual to previous performance by that individual and 2) performance of an individual to a selected population’s performance. The first may be useful in determining impairment of the individual performing a control task based on visual scanning in order to monitor or seize control of the system. The second basis of comparison may yield areas for training or targeting aptitude in a visual search task, such as comparing a novice to a population of experts.

A Markov process uses the temporal data to define transition probabilities between areas of the image, denoted as states. The variation in scanning patterns displayed in a static visual image test can be represented by a probabilistic model. Defining the location as a function of screen coordinates $x$ and $y$, the location at observation number (or time) $t$ may be expressed as a random variable $X_t$. Thus the scan path is one sample path generated by the stochastic process $\{X_t, t \in T\}$, where the index set $T$ is the set of all observations. At a particular point in time, the system is found to be in exactly one of a finite number of states. Labeled $0, 1, \ldots, M$, these states are both mutually exclusive and exhaustive. The

implications for the mammogram representation are that the states defined by cluster analysis are non-overlapping (mutually exclusive) areas of interest and number $M + 1$, where state $M$ is all area outside the areas of interest so that the entire screen area is specified by a state (exhaustive). This formulation leads to a mathematical model of a stochastic process $\{X_t\}$ where the random variables $X_t$ observed at $t = 0, 1, 2, \ldots$ may take on any one of the $M + 1$ cluster values.

In order to obtain analytical results of this model, certain assumptions must be made. Foremost in these assumptions is that the conditional probability of any future event, given any past event and the present state of the process, depends only on the present state and is independent of the past event. This Markov property is represented for the stochastic process $\{X_t\}$ as

$$P\{X_{t+1} = j \mid X_0 = k_0, X_1 = k_1, \ldots, X_{t-1} = k_{t-1}, X_t = i\} = P\{X_{t+1} = j \mid X_t = i\}$$

where $i$ is the present state, $j$ is the future state, for any $t$ and states $k_0, k_1, \ldots, k_{t-1}$. The conditional probabilities $P\{X_{t+1} = j \mid X_t = i\}$, called transition probabilities, denote the probability that the system will be in state $j$ given that it was last in state $i$. A second assumption is that the transition probabilities are stationary

$$P\{X_{t+1} = j \mid X_t = i\} = P\{X_1 = j \mid X_0 = i\} = p_{ij}$$

where $i, j = 0, 1, \ldots, M$ and over all $t = 0, 1, 2, \ldots$ This implies that the transition probabilities do not change in time. The transition probabilities may be represented by the stochastic matrix $P = [p_{ij}]$. The last assumption is that the initial probabilities $P\{X_0 = i\}$ are known for all states $i$. If all of these assumptions are valid, the stochastic process is a finite-state Markov chain.

Preliminary data examination supports most of these assumptions. The areas of interest are definitely finite in number because there are a finite number of pixels on the screen; additionally, the areas represent groups of numerous pixels. The initial probabilities of being in any state may be assumed to be a function of the number of pixels composing the state only, given that the subject has no prior knowledge of the composition of the image on the screen. In addition, the Markov property appears to be a reasonable assumption based on the preliminary data. The transition probabilities may not be stationary over the entire time series, however. This is discussed further in Section III.

A good state definition would be based on the location of the features of interest on the image, which is only possible with pre-diagnosed images and therefore of little use in aiding real diagnoses. Instead, states can be defined based on regions resulting from the previously described cluster analysis. Using the clusters as state definitions, the notion of the Markov transition probability matrix can be applied to examine the visual relationship between the states.

A C program was coded to quantify the transitions between states. Files of raw gaze data and the clustering results serve as input to the program. The program tabulates the transitions from one state to another. Results are output in the form of a square matrix which has entries that consist of the number of times a transition occurred between the states (see Table V).

The matrix shows transitions between the states, which are the significant clusters resulting from clustering. In reality, lookpoints also occur in areas which are not defined by those clusters. Initially, an additional state was added to represent all other areas not encompassed by the clusters, as discussed above. This added state, however, obscured the movement from one state to another. For example, if the eye were moving from a feature in the left view to a feature in the right view, a few lookpoints would occur as the path between the two features. This would be recorded as a transition from the left feature state into the additional state, and then from the additional state to the state of the right feature. For this reason, all points not in one of the defined states were not incorporated in the transition matrix. Although disallowing points to enter the $M$th state means that the states are no longer exhaustive, it increases the number of meaningful state transitions.

The diagonal of the matrix represents the number of transitions a state makes to itself. In reality, this corresponds to a dwell and results in a very large number in comparison to the other transitions. For example, a two-second dwell results in 120 transitions from the state into itself. An average of 85% of the total transitions are those on the diagonal. In order that the movement implied by the matrix was not masked by these large values, the diagonal of the matrix was written as *’s.

The transition matrices exhibit a strong pattern, and the pattern makes sense intuitively. The result of relatively few transitions between states in relation to the amount of time spent in a state is logical in light of the visual task and the manner in which humans perceive visual stimuli. Essentially, this result means that the majority of time is spent looking at something rather than rapidly glancing around all areas of the image several times.

Dominance of the diagonal is also an artifact of the sampling rate. At a frequency of 60 Hz, the data is sampled fast enough to record rapid eye transitions, but results in large numbers of points associated with a dwell. This effect would be further pronounced with a faster sampling rate; the number of transitions is limited because that is actually the way the subjects scanned the image, and increasing the rate

TABLE V
NUMBER OF TRANSITIONS BETWEEN DIFFERENT VISUAL CLUSTERS DURING A TYPICAL MAMMOGRAPHIC READING
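The tabulation of transitions between cluster states (Table V) can be illustrated with a short Python sketch. It is not the authors' C program: the function names and the sample state sequence are hypothetical, lookpoints outside every significant cluster are marked `None` and skipped (one reading of the decision in the text to leave such points out of the transition matrix), and the row normalization into estimated off-diagonal transition probabilities is an added illustration rather than a step reported in the study.

```python
def transition_counts(states, n_states):
    """Count transitions between consecutive lookpoint states.

    states: one entry per gaze sample, holding a state index, or None
    when the lookpoint falls outside every significant cluster.
    Samples marked None are skipped (an assumption of this sketch).
    """
    visited = [s for s in states if s is not None]
    counts = [[0] * n_states for _ in range(n_states)]
    for current, nxt in zip(visited, visited[1:]):
        counts[current][nxt] += 1
    return counts


def off_diagonal_probabilities(counts):
    """Row-normalize the counts after masking self-transitions (dwells)."""
    probs = []
    for i, row in enumerate(counts):
        moves = [c if j != i else 0 for j, c in enumerate(row)]
        total = sum(moves)
        probs.append([m / total if total else 0.0 for m in moves])
    return probs


# Hypothetical sample sequence: dwell in state 0, cross unclustered
# screen area, dwell in state 1, return to state 0, then move to state 2.
sequence = [0, 0, 0, None, None, 1, 1, 1, 1, 0, 0, 2]
counts = transition_counts(sequence, 3)
probs = off_diagonal_probabilities(counts)

diagonal = sum(counts[i][i] for i in range(3))
total = sum(sum(row) for row in counts)
print(counts)
print(diagonal, "of", total, "transitions lie on the diagonal")
```

Even in this tiny example most transitions fall on the diagonal, which mirrors the dominance of dwells that the text attributes partly to the 60 Hz sampling rate.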

would likely not increase the number of transitions between states. Likewise, using a slower sampling rate would have decreased the dominance of the diagonal, but much slower rates may prevent the full realization of the eye trajectory, since sometimes transitions occur very quickly.

Filtering out the diagonal transitions leaves sparse data because of the small number of other transitions. In hindsight, the pattern of the matrix seems obvious when one thinks about how people visually search a static scene, but this was not evident at the outset. As such, Markov analyses seemed to be a very fruitful method to model and analyze the data. With the limited data, however, this analysis cannot avail itself of the power of Markovian analyses and tools. Although the underlying model may well be a Markov chain, there is not enough data for significant Markov chain computations. Moreover, because of the differences between subjects and their clusters, the data set cannot be increased by combining individuals.

The assumption of stationary transition probabilities, essential for a stationary Markov chain, may not be valid. From a cognitive standpoint, the process of diagnosing an image is transient, where the viewer goes from one state of knowledge to another. The viewer progresses from ignorance through the identification process until the viewer decides to complete the diagnosis. All of these states of knowledge may be steady state, but at steady state the eye motions may not reflect the internal cognitive processes. Virtually all of the pupil and eye-scan path studies discussed in the literature present eye behavior of people executing a mental or visual task rather than merely existing at a constant mental state. In fact, the steady-state and long-term probabilities afforded by a Markov chain analysis may not have much relevance in the visual diagnosis of a static scene; this corresponds to a person staring at an area of the image for a long period of time, which the data does not support.

Even if stationarity does not apply to the diagnostic situation, a piecewise stationary model may exist, where segments of the time series are characterized by similar probabilities. For example, the first time segment may be the acquaintance phase, the second may be the search phase, and the third phase may consist of feature identification where each phase is specified by distinct stationary transition probabilities. The sparse nature of the data sets, however, prevents verification of this theory. The diagnostic activity in this situation does not take long enough to provide sufficient data on meaningful transitions from which to calibrate a model.

Despite restrictions of the model which prevent its direct application here, the Markov chain theory is useful as an intellectual construct, especially through the use of the probability transition matrix. A pattern is evident among the matrix entries around the diagonal, representing transitions between adjacent states. As noted in the clustering analysis, the clusters themselves are dependent on the order of the eye scanning path. The resultant pattern of high proportions of transitions among adjacent states is logical based on their definition, which follows the trajectory of the eye scan.

Transitions are also evident between paired features in different locations in the image. Because of potential overlap between the features and the cluster, as described in the clustering section, more than one state may correspond to a feature of interest. As such, a pattern of transitions between features is not consistent based on the present state definition. Nevertheless, representing the behavior of experts based on the Markov model may afford comparisons to other populations, such as novices, based on the similarities of the state transition matrix.

IV. CONCLUSIONS

Understanding scanning strategies is a key to understanding the process of human visual diagnosis. Modeling this search behavior may be useful in several ways. Information gained from models can improve scientific understanding of the behavior, which in turn may contribute to targeting individuals with diagnostic aptitude, or to the training of novice diagnosticians. Moreover, models of visual performance could potentially be used as feedback to improve diagnoses or in developing intelligent systems for CAD.

In this paper, we have demonstrated an improved eye-tracking technology and expanded the range of measures and models for studying potential relationships between visual scanning patterns and the clinical diagnoses of radiographers and mammographers. As such, this work represents a contribution in the research tradition pioneered by Kundel, Nodine, and their associates in the context of reading chest radiograms. Together with the fruition of digital mammography and advances in pattern recognition and image enhancement techniques, this eye-tracking technology appears to have clinical potential for mammographic CAD.

A principal advantage of the ERICA eye-tracking system, besides the ready availability and low cost of its off-the-shelf components, is that the system does not require the radiologist to wear any head-mounted or other invasive gear. The comparatively unobtrusive nature of the technology provides the radiologist or mammographer the greatest freedom from physical and psychological impediments currently available. This freedom in turn offers the greatest similitude to a typical radiographic reading room and the widest hope for ultimate adaptation to clinical practice.

The principal limitation of the ERICA system is that the radiographer’s head must remain nearly stationary during each reading. While this limitation did not create any significant problems in the conduct of the experiments reported here, the requirement clearly is undesirable (if not, in fact, unacceptable) in practice. However, White et al. [42] demonstrate a dynamic calibration for the system which permits a far greater range of side-to-side head motion than the setup used in this study. In addition, current work on an autofocus mechanism for the system promises to permit a far greater range of front-to-back head motion. Taken together, these further improvements ultimately may yield an eye-tracking system that is wholly transparent to the practicing radiographer.

A second advantage of the technology used here is that the ERICA system also records the reader’s pupil diameter while tracking gaze position. This additional measurement adds a new dimension to the recorded scanning patterns, not available in prior work on chest radiograms. While the limited number of observations collected during this preliminary study showed differences in pupillary response for different diagnostic outcomes (such as the rise in pupil diameter in clusters associated with false negative diagnoses), clearly, the sample of both readers and films is too small to draw meaningful conclusions. Nevertheless, it is tempting to speculate on the potential usefulness of these results if verified by more extensive experimentation.

Results of the gaze-position analysis were consistent with extensive prior studies of eye-scan measures recorded during the diagnosis of chest radiograms. Again, because the sample of both readers and films is too small, the results are insufficient to make definitive statements regarding diagnostic behaviors. Nevertheless, the study as a whole is sufficiently suggestive to encourage future work with larger samples of both mammograms and mammographic experts.

A principal shortcoming of this study (beyond the small sample) was the failure to assess the level of confidence of the reader with respect to each of the diagnoses. It is well known in signal detection theory that individual diagnoses can be influenced both 1) by the reader’s ability to perceive abnormalities where these exist (the reader’s sensitivity) and 2) by the reader’s willingness to commit to such observations in the presence of ambiguity (the reader’s response criteria). An assessment of confidence level would allow the construction and analysis of familiar ROC curves, which distinguish between these sources of diagnostic error. This additional model would be valuable in accounting for the variability in the data subject-to-subject and film-to-film and will be included in future experiments.

Because of the large proportion of abnormal images used in this study, the experimental conditions tested here best represent a clinical setting in which a follow-up reading is requested (such as a follow-up every six months). In such cases the mammographer is alerted to the large proportion of abnormals and probably is not doing the same things he or she might be doing during a routine screening. Designing an experiment that is truly representative of a large-scale screening program remains problematic, however. Given that a mammographer might read at most 80 to 100 films in a day and given that an average of only eight cases in 1000 are positive for cancer, it is clear that true positives are a comparatively rare occurrence in practice. (Of course, this same problem also occurs in training residents in screening programs in a clinical setting.) Nevertheless, the future development of an eye-tracking system that is truly transparent to the radiographer offers the eventual promise of collecting clinical eye-scan data in real time for screening studies using the measures, models, and analysis techniques developed here.

REFERENCES

[1] D. D. Adler, “Mammographic evaluation of masses,” in Syllabus: A Categorical Course in Breast Imaging, D. B. Kopans and E. B. Mendelson, Eds. Oak Brook, IL: Radiological Soc. North America, 1995.
[2] Cancer Facts and Figures: 1995. Atlanta, GA: American Cancer Soc.
[3] J. H. M. Austin, B. M. Romney, and L. S. Goldsmith, “Missed bronchogenic carcinoma: Radiographic findings in 27 patients with potentially resectable lesion evident in retrospect,” Radiology, vol. 123, pp. 115–122, 1992.
[4] J. C. Bass and C. Chiles, “Visual skill: Correlation with detection of solitary pulmonary nodules,” Investigative Radiology, vol. 25, pp. 994–989, 1990.
[5] L. W. Bassett, “Quality determinants of mammography: clinical image evaluation,” in Syllabus: A Categorical Course in Breast Imaging, D. B. Kopans and E. B. Mendelson, Eds. Oak Brook, IL: Radiological Soc. North America, 1995.
[6] L. W. Bassett and R. H. Gold, Breast Cancer Detection. New York: Harcourt Brace Jovanovich, 1987.
[7] M. S. Belofsky and D. R. Lyon, “Modeling eye movement sequences through clustering techniques,” Air Force Human Resources Lab. microfilm, Williams Air Force Base, AZ, 1988.
[8] K. S. Berbaum et al., “Satisfaction of search in diagnostic radiology,” Investigative Radiology, vol. 25, pp. 133–140, 1990.
[9] K. S. Berbaum et al., “Time course of satisfaction of search,” Investigative Radiology, vol. 26, pp. 641–648, 1991.
[10] K. S. Berbaum and E. A. Franken, “Correspondence: Reply,” Investigative Radiology, vol. 27, p. 252, 1992.
[11] The Breast, K. I. Bland and E. M. Copeland, III, Eds. Philadelphia, PA: Saunders, 1991.
[12] E. S. deParedes, Atlas of Film-Screen Mammography. Baltimore, MD: Williams and Wilkins, 1992.
[13] N. R. Desai, “ERICA: Measuring pupil diameter using computer vision,” M.S. thesis, Dept. Syst. Eng., Univ. Virginia, Charlottesville, 1991.
[14] S. A. Feig, “Mammographic evaluation of calcifications,” in Syllabus: A Categorical Course in Breast Imaging, D. B. Kopans and E. B. Mendelson, Eds. Oak Brook, IL: Radiological Soc. North America, 1995.
[15] L. A. Frey, K. P. White, Jr., and T. E. Hutchinson, “Eye-gaze word processing,” IEEE Trans. Syst., Man, Cybern., vol. 20, pp. 944–950, 1990.
[16] M. L. Giger, C. J. Vyborny, and R. A. Schmidt, “Computerized characterization of mammographic masses: analysis of spiculation,” Cancer Lett., vol. 77, pp. 201–211, 1994.
[17] S. S. Hacisalihzade, L. W. Stark, and J. S. Allen, “Visual perception and sequences of eye movement fixations: A stochastic modeling approach,” IEEE Trans. Syst., Man, Cybern., vol. 22, pp. 474–481, 1992.
[18] E. H. Hess and J. M. Polt, “Pupil size in relation to mental activity during simple problem-solving,” Science, vol. 143, pp. 1190–1192, 1964.
[19] T. E. Hutchinson, K. P. White, Jr., W. Martin, K. C. Reichert, and L. A. Frey, “Human-computer interaction using eye-gaze input,” IEEE Trans. Syst., Man, Cybern., vol. 19, pp. 1527–1534, 1989.
[20] D. B. Kopans, “Screening mammography and the controversy concerning women aged 40-49 years,” in Syllabus: A Categorical Course in Breast Imaging, D. B. Kopans and E. B. Mendelson, Eds. Oak Brook, IL: Radiological Soc. North America, 1995.
[21] H. L. Kundel, “The predictive value and threshold detectability of lung tumors,” Radiology, vol. 139, pp. 25–29, 1981.
[22] H. L. Kundel and P. S. LaFollette, Jr., “Visual search patterns and experience with radiological image,” Radiology, vol. 103, pp. 523–528, 1972.
[23] H. L. Kundel and C. F. Nodine, “A visual concept shapes image perception,” Radiology, vol. 146, pp. 363–368, 1983.
[24] H. L. Kundel and C. F. Nodine, “Interpreting chest radiographs without visual search,” Radiology, vol. 116, pp. 527–532, 1975.
[25] H. L. Kundel, C. F. Nodine, and D. P. Carmony, “Visual scanning, pattern recognition, and decision making in pulmonary nodule detection,” Investigative Radiology, vol. 13, pp. 175–181, 1978.
[26] H. L. Kundel, C. F. Nodine, and E. A. Krupinski, “Searching for lung nodules: Visual dwell indicates locations of false-positive and false-negative decisions,” Investigative Radiology, vol. 24, pp. 472–478, 1989.
[27] H. L. Kundel and G. Revesz, “Lesion conspicuity, structure noise, and film reader error,” Amer. J. Radiology, vol. 126, pp. 1233–1238, 1976.
[28] H. L. Kundel and D. Wright, “The influence of prior knowledge on visual search strategies during the viewing of chest radiographs,” Radiology, vol. 93, pp. 315–320, 1969.
[29] J. D. Newell et al., “Computed radiographic evaluation of simulated pulmonary nodules: Preliminary results,” Investigative Radiology, vol. 23, pp. 267–270, 1988.
[30] C. F. Nodine and H. L. Kundel, “Using eye movements to study visual search and to improve detection,” Radiographics, vol. 7, pp. 1241–1250, 1987.
[31] C. F. Nodine, E. A. Krupinski, H. L. Kundel, L. Toto, and G. T. Herman, “Correspondence: SOS–satisfaction of search or satisfaction of serendipity,” Investigative Radiology, vol. 27, pp. 571–572, 1992.
[32] W. S. Peavler, “Pupil size, information overload, and performance differences,” Psychophysiology, vol. 11, pp. 559–566, 1974.
[33] G. Revesz and H. L. Kundel, “Psychophysical studies of detection errors in chest radiology,” Radiology, vol. 123, pp. 559–562, 1977.
[34] G. Revesz, H. L. Kundel, and M. A. Garber, “The influence of structured noise on the detection of radiographic abnormalities,” Investigative Radiology, vol. 9, pp. 479–486, 1974.
[35] R. H. Sherrier, G. A. Johnson, S. A. Suddarth, C. Chiles, C. Hulka, and C. E. Ravin, “Digital synthesis of lung nodules,” Investigative Radiology, vol. 20, pp. 933–937, 1985.
[36] R. A. Schmidt, D. E. Wolverton, and C. J. Vyborny, “Computer-aided diagnosis in mammography,” in Syllabus: A Categorical Course in Breast Imaging, D. B. Kopans and E. B. Mendelson, Eds. Oak Brook, IL: Radiological Soc. North America, 1995.
[37] R. A. Smith, “The epidemiology of breast cancer,” in Syllabus: A Categorical Course in Breast Imaging, D. B. Kopans and E. B. Mendelson, Eds. Oak Brook, IL: Radiological Soc. North America, 1995.
[38] R. G. Swensson and G. H. Theodore, “Search and nonsearch protocols for radiographic consultation,” Radiology, vol. 177, pp. 851–856, 1990.
[39] S. E. Thomas, “Using ERICA to examine the point of recognition,” B.S. thesis, Dept. Syst. Eng., Univ. Virginia, Charlottesville, 1992.
[40] T. L. Viscomi, “Modeling human eye behavior during static scene diagnosis: An analysis of mammographic screening,” M.S. thesis, Dept. Syst. Eng., Univ. Virginia, Charlottesville, 1993.
[41] Human Factors in Aviation, E. Weiner and D. C. Nagel, Eds. New York: Academic, 1988.
[42] K. P. White, Jr., T. E. Hutchinson, and J. M. Carely, “Spatially-dynamic calibration of an eye-tracking system,” IEEE Trans. Syst., Man, Cybern., vol. 23, pp. 1162–1168, 1993.
[43] K. S. Woods, C. D. Christopher, K. W. Bowyer, J. L. Solka, C. E. Priebe, and W. P. Kegelmeyer, Jr., “Comparative evaluation pattern recognition techniques for detection of microcalcifications in mammography,” Int. J. Patt. Recognit. Artif. Intell., vol. 7, pp. 1417–1436, 1993.

K. Preston White, Jr. (M’81–SM’88) was born in Port Chester, NY, in 1948. He received the B.S.E. degree in mechanical engineering and the M.S. and Ph.D. degrees in systems and control engineering from Duke University, Durham, NC, in 1970, 1972, and 1976, respectively.
He served as Assistant Professor in the ORSA Department at Polytechnic University from 1975 to 1977, and in the Mechanical Engineering and Engineering and Public Policy Departments at Carnegie Mellon University, Pittsburgh, PA, from 1977 to 1979. In 1979 he joined the faculty of the University of Virginia, Charlottesville, where he is currently Associate Professor of Systems Engineering. His industrial experience includes positions as Faculty-in-Residence in the Advanced Technologies Division of Newport News Shipbuilding and as Distinguished Visiting Professor in the Modeling and Statistical Methods Division of SEMATECH. His research interests include modeling, simulation, and control of continuous and discrete-event dynamic systems, with applications in manufacturing, human–machine interaction, and health care delivery. He has published more than 70 scholarly articles in these areas.
Dr. White is U.S. Editor of International Abstracts in Operations Research and Associate Editor for Automatica, International Journal of Intelligent Automation, and IEEE TRANSACTIONS ON COMPONENTS, PACKAGING, AND MANUFACTURING TECHNOLOGY. He is a Member of INFORMS and SCS and a Senior Member of IIE. He has served on the Administrative Committee of the IEEE Systems, Man, and Cybernetics Society, and is currently chairman of the Society’s Technical Committee on System Simulation. He also is a member of Tau Beta Pi, Pi Tau Sigma, and Sigma Xi and a charter member of Omega Rho. He is a past winner of an NSF Graduate Fellowship and the SAE Ralph R. Teetor Educational Award and he received the first ABET Award for Educational Innovation on behalf of the Systems Engineering program at Virginia. He is listed in numerous biographical directories.

Tonya L. Hutson was born in Arlington, VA, in 1970. She received the B.S. and M.S. degrees in systems engineering in 1993 from the University of Virginia, Charlottesville, graduating with high distinction.
She was Systems Engineer for Systems Research and Applications from 1993 to 1994, performing front-end requirements analysis for an in-patient clinical information system for Department of Defense hospitals. She was with Cybermedix Inc. from 1994 to 1995, where she developed an expert system to automate hospital emergency room clinical information. She has been a Systems Engineer at APACHE Medical Systems, Inc. since 1995. Her ongoing projects include developing new systems to implement acute care and cardiovascular predictive methodologies for hospitals. Her other professional interests include independent consulting for small medical and dental offices.
Ms. Hutson is a member of Tau Beta Pi Engineering Honor Society.

Thomas E. Hutchinson received the B.S. and M.S. degrees in physics from Clemson University, Clemson, SC, in 1958 and 1959, respectively, and the Ph.D. degree in physics from the University of Virginia, Charlottesville, in 1963.
He was a Professor of Biomedical and Chemical Engineering at the University of Minnesota, Minneapolis, from 1968 to 1975, and the University of Washington, Seattle, from 1975 to 1982. He is former Associate Dean of the School of Engineering and Applied Science, University of Virginia, where he currently holds the William Stantsfield Calcott Professorship. Together with colleagues at Virginia Polytechnic and State University and Virginia Commonwealth University, he was instrumental in forming the Virginia Center for Innovative Technology. His research interests include man–machine interfaces, particularly the development of an eye-gaze controlled computer for the severely and chronically disabled. His other research interests include bioengineering, materials science, experimental psychology, fault detection devices for aircraft, and the philosophy of science. In addition to his academic pursuits, he consults for IBM, Morgan Bank, and several government agencies.
Dr. Hutchinson has been an Atomic Energy Commission Research Fellow and is a continuing Senior Research Fellow of the University of Glasgow and an Elected Fellow of Cambridge University. He is a regular panel member for NSF and NIH and for the past six years has been chairman of the Southeastern Universities Research Association Committee on the Future of Materials Science.
