SYSTEM FACTORS. Michael, 2019. How the Order of Questions Affects
To cite this article: Robert B. Michael & Maryanne Garry (2019): How do ordered questions bias
eyewitnesses?, Memory, DOI: 10.1080/09658211.2019.1607388
CONTACT Robert B. Michael [email protected] Department of Psychology, University of Louisiana Lafayette, PO Box 43644, Lafayette, LA,
USA
*Both authors contributed to the conception of the research and its design. Robert B. Michael collected and analyzed data. Both authors interpreted the data
and contributed to the writing of the article, including drafts and critical revisions. Robert B. Michael and Maryanne Garry approved the final version of the
article.
Supplemental data for this article can be accessed at doi:10.1080/09658211.2019.1607388
© 2019 Informa UK Limited, trading as Taylor & Francis Group
2 R. B. MICHAEL AND M. GARRY
We know from research that at least two theories provide explanations that are unlikely. The first of these theories – the affect heuristic – proposes that people’s feelings can quickly and automatically influence their subsequent information processing (for a review, see Slovic, Finucane, Peters, & MacGregor, 2007). Within the context of ordered questions, early easy questions should produce positive affect, while early difficult questions should produce negative affect. These affective states could then influence people’s interpretation of the later questions. If this explanation were true, then we should expect people’s confidence in their answers to questions to vary depending on where in a sequence those questions appear. For example, if the first few questions were difficult, then people’s confidence for a subsequent easy question should be lower than if that same easy question had appeared early on. But it is not. Results from both the educational and eyewitness domains show that confidence ratings for specific questions are similar, regardless of when those questions appear in a sequence (Michael & Garry, 2016; Weinstein & Roediger, 2010, 2012).

The second theory – the availability heuristic – proposes instead that people rely on the information that most easily springs to mind when making decisions and evaluations (Tversky & Kahneman, 1973). Within the context of ordered questions, early questions suffer less from a buildup of interference and can be rehearsed more than later questions (Rundus, 1971). Therefore, when people are later asked to estimate their performance on the whole test, we might expect that what springs to mind are the early parts of the test. If this explanation were true, we should see that people can most easily remember the early test questions. But that is not what we see. In fact, what little research there is instead finds that people tend to remember the later questions best (Franco, 2015; Jones & Roediger, 1995). Moreover, other work shows that differences in beliefs develop while people take the test, and not solely afterward as a result of remembering the experience (Weinstein & Roediger, 2012).

The affect and availability heuristic explanations seem inadequate. Where, then, does that leave us? One promising alternative theory – the anchoring-and-adjustment heuristic – proposes that in situations of uncertainty, people rely on an initial piece of information as a starting point when providing estimated answers to questions (Tversky & Kahneman, 1974). This “anchor” need not be given explicitly; it can be self-generated. For example, when asked to estimate the freezing point of vodka, what springs to mind for most people who are uncertain of the true answer is the freezing point of water – an anchor that people’s estimates are skewed towards (Epley & Gilovich, 2006). But why are adjustments away from these self-generated anchors typically insufficient? Research suggests that the adjustment process is effortful and stops once people reach a plausible value (Epley & Gilovich, 2006). Because a plausible range of values will often include values between the anchor and the true answer, insufficient adjustment becomes a likely outcome.

How would the anchoring-and-adjustment heuristic explain the influence of ordered questions on people’s beliefs? We hypothesise that people generate their initial beliefs about test performance and memory confidence based on the ease or difficulty of early test questions. More specifically, that easy-to-difficult subjects hold initial beliefs of relatively good test performance and high confidence in their memory, while difficult-to-easy subjects hold initial beliefs of relatively poor test performance and low confidence in their memory. We further hypothesise that people adjust these beliefs as the test becomes progressively easier or more difficult, but only to the degree that the adjustment is plausible. The result? People’s final beliefs about how they performed on the test, or their confidence in their memory, are skewed toward their initial anchor. In other words, despite both question arrangement groups answering the same overall set of questions, their resulting beliefs are not the same.

Some evidence from the existing research fits with the anchoring-and-adjustment theory. As noted earlier, differences in people’s beliefs emerge as the test progresses, and not only at the end (Weinstein & Roediger, 2012). But we are still missing a finer-grained examination of how these biases develop. For example, one important but unanswered question is: How do these differences emerge between people who answer easy-to-difficult questions and those who answer difficult-to-easy questions? Is it all over after the very first question, or is some minimum number necessary before these groups start to diverge? In addition, we know nothing about how or why people adjust their beliefs over the course of questioning. One possibility – consistent with the anchoring-and-adjustment explanation – is that people who answer easy-to-difficult questions develop an initial impression that the test is easy and they’re performing well, then adjust this belief as the test becomes progressively more difficult. People who answer difficult-to-easy questions might do exactly the opposite. The problem is that we do not know if the theory is correct and that this approach is really how people behave.

To address this problem, we first conducted two experiments (Experiments 1 and 2) in which we asked people to predict, after every test question, how many of the 30 total questions they would answer correctly. Across both experiments, we found initial support for an anchoring-and-adjustment explanation. Then, in an effort to add nuance to this theoretical account, we conducted two additional experiments (Experiments 3 and 4). Specifically, we know that people with a relatively strong desire to engage in effortful thinking tend to make more sufficient adjustments than people with a relatively weak desire (Epley & Gilovich, 2006). We hypothesised that if people’s beliefs are indeed the product of an anchoring-and-adjustment heuristic, then the desire to engage in effortful thinking, or Need For Cognition (NFC; Cacioppo et al., 1984), should
influence the magnitude of those beliefs. Across both experiments, however, we found only partial support for this explanation.

Experiment 1

If the anchoring-and-adjustment explanation is correct, then subjects who answer questions arranged from the easiest to most difficult should initially believe they are doing well, but should then adjust their estimates downward over the course of the test. Conversely, subjects who answer questions arranged from the most difficult to the easiest should show the opposite pattern, initially believing they are doing poorly, but adjusting their estimates upward over the course of the test. In addition, subjects should make insufficient adjustments to these estimates, resulting in group differences even at the end of the test (Epley & Gilovich, 2006). To investigate these predictions, we tracked how subjects’ beliefs about their performance changed over the course of questioning. We repeatedly asked subjects to predict how many of the 30 total questions they would answer correctly.

difficulty (see Michael & Garry, 2016).¹ Subjects were randomly assigned one of these test versions. For each test question, subjects used a scale from 1 (Not at all confident) to 5 (Very confident) to report their confidence they had selected the correct answer. This item-confidence measure served primarily as a manipulation check. Critically, between each test question we asked subjects, “This test consists of 30 questions total. How many of those questions do you think you will get correct?” Subjects responded with a number between 0 and 30.

The fourth phase followed the test. Subjects answered two randomly ordered questions. One question asked: “The memory test about Eric the Electrician consisted of 30 questions. How many of those questions do you think you answered correctly?” Subjects responded with a number between 0 and 30. This question, in combination with those asked in the third phase, results in 30 estimates of performance for each subject, staggered across the test. The other question asked: “How confident are you about the accuracy of your memory for the video?” Subjects responded on a scale from 1 (Not at all confident) to 5 (Very confident).
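The predictions for Experiment 1 can be made concrete with a toy simulation. This is only a sketch of the hypothesised anchoring-and-adjustment process, not the authors' analysis or data: the function `simulate_estimates`, the linear ease values, and the adjustment `rate` are all hypothetical choices made for illustration. The first question sets the anchor, and each later question moves the belief only part of the way toward what the questions seen so far would imply.

```python
def simulate_estimates(ease, rate=0.3, total=30):
    """Return the running predicted score after each question.

    ease -- per-question ease in presentation order, between 0 and 1 (assumed)
    rate -- fraction of the gap to the current evidence closed per question;
            a rate below 1 models insufficient adjustment (assumed value)
    """
    estimates = []
    belief = None
    seen = 0.0
    for k, e in enumerate(ease, start=1):
        seen += e
        target = total * seen / k  # score implied by the questions seen so far
        if belief is None:
            belief = target  # the first question sets the anchor
        else:
            belief += rate * (target - belief)  # partial move toward the evidence
        estimates.append(belief)
    return estimates


# Hypothetical ease values falling linearly from 0.95 to 0.35 over 30 items.
easy_to_difficult = [0.95 - 0.6 * i / 29 for i in range(30)]
difficult_to_easy = list(reversed(easy_to_difficult))

etd = simulate_estimates(easy_to_difficult)
dte = simulate_estimates(difficult_to_easy)
print(f"final estimate, easy-to-difficult: {etd[-1]:.2f}")
print(f"final estimate, difficult-to-easy: {dte[-1]:.2f}")
```

Both orders contain the same items, so a fully adjusted belief would end at the same value in each group. Under these assumptions the lagging adjustment leaves the easy-to-difficult estimate above that value and the difficult-to-easy estimate below it, which is the kind of end-of-test group difference the hypothesis predicts.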
Figure 1. Top panel: Mean estimated total test scores reported after each test question as a function of question arrangement. Bottom panel: Mean confidence of a correct answer for each test question as a function of question arrangement. Error bars represent 95% confidence intervals of means. Data are from Experiment 1.
people’s beliefs about their test performance. But on the other hand, we did not replicate our earlier findings with respect to memory confidence. One possible explanation is that people believe their test performance reflects the ease or difficulty of the test questions themselves, rather than reflecting the quality of their memory. That potential difference in attribution fits with research showing that people rely on anchors less as their compatibility with target judgments decreases (Chapman & Johnson, 2002). If this explanation is true, then it is plausible that the arrangement of questions influences estimates of test performance, but does little to influence judgments of memory confidence. Another explanation is that the arrangement of questions has a smaller true influence on confidence than we estimated in our earlier work; these results might therefore reflect ordinary sampling variability.

We now turn to our primary question: How do people adjust their beliefs over the course of questioning? To answer this question, we examined the mean predicted test scores people reported after each test question; these data appear in the top panel of Figure 1.
Figure 2. Top panel: Mean estimated total test scores reported after each test question as a function of question arrangement. Bottom panel: Mean confidence of a correct answer for each test question as a function of question arrangement. Error bars represent 95% confidence intervals of means. Data are from Experiment 2.
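The error bars in these figures are 95% confidence intervals of group means at each time point. A minimal sketch of that computation follows, assuming a normal approximation (the authors may have used a t-based interval instead); the function name `mean_ci` and the example scores are hypothetical and are not data from the experiments.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    m = mean(values)
    se = stdev(values) / sqrt(len(values))  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return m, m - z * se, m + z * se


# Hypothetical predicted scores (0-30) from one group after one question.
scores = [22, 18, 25, 20, 19, 24, 21, 17, 23, 21]
m, lo, hi = mean_ci(scores)
print(f"M = {m:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Repeating this for each group at each time point would give the points and error bars of one panel. With small groups, a t-based critical value would produce slightly wider intervals than this approximation.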
$15.5)^2 - 0.001 \times (\text{Time point} - 15.5)^3$; $R^2 = .08$, F(3, 2846) = 81.83, p < .01.

In addition, a repeated-measures ANOVA revealed an interaction between Time point and Question Order, F(29, 168) = 11.58, p < .01. Follow-up Bonferroni-corrected comparisons (i.e., α = .05 / 30 = 0.00167) revealed statistically significant differences between the two groups at every time point. The maximum difference in predictions occurred after the 6th test question, $M_{\text{difficult-to-easy}} = 7.74$, $SD_{\text{difficult-to-easy}} = 6.78$; $M_{\text{easy-to-difficult}} = 20.97$, $SD_{\text{easy-to-difficult}} = 6.23$; $M_{\text{diff}} = 13.23$, 95% CI [11.41, 15.06], t(196) = 14.31, p < .0001, and the minimum difference in predictions occurred after the final test question, $M_{\text{difficult-to-easy}} = 10.09$, $SD_{\text{difficult-to-easy}} = 4.37$; $M_{\text{easy-to-difficult}} = 13.55$, $SD_{\text{easy-to-difficult}} = 6.54$; $M_{\text{diff}} = 3.46$, 95% CI [1.89, 5.03], t(196) = 4.33, p < .0001.

Taken together, Experiments 1 and 2 are consistent with the idea that people developed beliefs using an anchoring-and-adjustment heuristic. The results suggest that the ease or difficulty with which people experienced the first test question provided an anchoring point that constrained adjustments across the remainder of the test. The end result was a difference in what people believed about their performance – even though everyone answered the
questions and NFC, F(1, 396) = 0.03, p = .85, nor a main effect of NFC, F(1, 396) = 1.80, p = .18. These results remained virtually unchanged when we controlled for the slight difference in test accuracy between people with low and high NFC. For subjects’ confidence in the accuracy of their memory, we replicated the null findings from Experiments 1 and 2, finding no statistically significant differences in subjects’ post-test confidence ratings, all ps > .08.

We now turn to our primary question: How does the desire to engage in effortful cognition influence the adjustments eyewitnesses make to their developing beliefs about performance? To answer this question, we examined the mean predicted test scores people reported after each test question; these data appear in Figure 3.

As the figure shows, the influence of a question again depended on the difficulty of that question and when it appeared. But the figure also reveals that the influence of NFC was more complicated than we predicted. We had anticipated that people with low NFC would make smaller adjustments than their high NFC counterparts. That is, we expected that the lines or curves in Figure 3 for low NFC subjects would look “flatter” than those of the high NFC subjects. But they do not. In fact, in the difficult-to-easy conditions, low and high NFC subjects look virtually identical, adjusting similarly across the test. Put another way, regression analyses showed that both groups’ adjustments fit to quadratic curves: $\text{estimate}_{\text{low}} = 12.35 - 0.01 \times \text{Time point} + 0.01 \times (\text{Time point} - 15.5)^2$; $R^2 = .02$, F(2, 3237) = 27.77, p < .01; $\text{estimate}_{\text{high}} = 11.57 + 0.002 \times \text{Time point} + 0.02 \times (\text{Time point} - 15.5)^2$; $R^2 = .04$, F(2, 2967) = 54.76, p < .01. The same cannot be said about the easy-to-difficult conditions. Here, NFC mattered. Specifically, people with high NFC consistently reported higher estimates across the test than their low NFC counterparts. Put another way, regression analyses showed that the low NFC group’s adjustments fit to a simple line, but the high NFC group’s adjustments fit to a quadratic curve: $\text{estimate}_{\text{low}} = 22.90 - 0.14 \times \text{Time point}$; $R^2 = .02$, F(1, 2818) = 72.23, p < .01; $\text{estimate}_{\text{high}} = 26.63 - 0.17 \times \text{Time point} - 0.01 \times (\text{Time point} - 15.5)^2$; $R^2 = .02$, F(2, 2967) = 144.00, p < .01.

In addition, a repeated-measures ANOVA revealed a three-way interaction, F(28, 369) = 1.93, p < .01. We decomposed this interaction with two additional repeated-measures ANOVAs, examining the influence of NFC within each question arrangement condition. For the difficult-to-easy subjects, this analysis revealed only a main effect of Time point, F(28, 178) = 7.27, p < .01. But for the easy-to-difficult subjects, we found a statistically significant interaction between Time point and NFC, F(28, 164) = 2.02, p < .01. Follow-up Bonferroni-corrected comparisons (i.e., α = .05 / 30 = 0.00167) revealed statistically significant differences between the easy-to-difficult low and high NFC groups after questions 9, 16, and 19 only – although we note that the mean is always numerically greater for people with high NFC. The maximum difference in predictions between high and low NFC subjects occurred after the 9th test question, $M_{\text{high}} = 24.66$, $SD_{\text{high}} = 5.31$; $M_{\text{low}} = 21.34$, $SD_{\text{low}} = 7.87$; $M_{\text{diff}} = 3.32$, 95% CI [1.42, 5.21], t(191) = 3.45, p = .0007, and the minimum difference in predictions occurred after the 28th test question, $M_{\text{high}} = 20.08$, $SD_{\text{high}} = 5.50$; $M_{\text{low}} = 18.93$, $SD_{\text{low}} = 7.09$; $M_{\text{diff}} = 1.16$, 95% CI [−0.64, 2.95], t(191) = 1.27, p = .21.

How are we to explain these results? On the one hand, the patterns are consistent with Experiments 1 and 2, in that the overall shape of developing beliefs fits with an anchoring-and-adjustment explanation. Moreover, the consistently higher predictions from people with high NFC in the easy-to-difficult condition fits with our earlier idea about people hitting a subjective ceiling – one that is slightly higher for people with high NFC, who are more capable adjusters. But on the other hand, we did not anticipate the lack of any meaningful differences according to NFC in the difficult-to-easy conditions, and it is difficult to reconcile that finding with an anchoring-and-adjustment explanation.

One possible problem with interpreting these data is that in asking people to repeatedly predict their test performance, we altered their behaviour from how it would unfold in the absence of these repeated requests. Specifically, the repeated requests for predictions might have encouraged people to more carefully monitor and think effortfully about their ongoing performance, reducing their reliance on the anchoring-and-adjustment heuristic (Simmons, LeBoeuf, & Nelson, 2010). To address this issue, we conducted Experiment 4 in an effort to examine the influence of NFC when people are asked to provide only one final estimate of their test performance. We hypothesised – in accord with the theoretical account – that people with high NFC would adjust more sufficiently than their low NFC counterparts. We therefore predicted that: (1) in the easy-to-difficult condition, people with high NFC would report a smaller final test estimate than people with low NFC; (2) in the difficult-to-easy condition, people with high NFC would report a larger final test estimate than people with low NFC.

Experiment 4

Method

Subjects
We aimed to recruit 400 Mechanical Turk workers, and ultimately recruited 408.

Design
The design was the same as Experiment 3.

Procedure
The procedure was the same as Experiment 3, except that we no longer asked people to predict their test performance after every test question. Instead – as in our earlier
Figure 3. Mean estimated total test scores reported after each test question as a function of question arrangement and Need For Cognition (NFC). Error bars represent 95% confidence intervals of means. Data are from Experiment 3.
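The regression fits reported in the text for Experiment 3 can be evaluated directly to see the shapes that Figure 3 summarises. The sketch below copies the reported coefficients; the names `FITS`, `centred_sq`, and `predicted` are hypothetical helpers, and the range of 29 time points is an assumption based on the number of in-test predictions.

```python
def centred_sq(t):
    """Squared distance from the midpoint (15.5) used in the reported fits."""
    return (t - 15.5) ** 2


# Coefficients copied from the regression fits reported for Experiment 3.
FITS = {
    ("difficult-to-easy", "low"): lambda t: 12.35 - 0.01 * t + 0.01 * centred_sq(t),
    ("difficult-to-easy", "high"): lambda t: 11.57 + 0.002 * t + 0.02 * centred_sq(t),
    ("easy-to-difficult", "low"): lambda t: 22.90 - 0.14 * t,
    ("easy-to-difficult", "high"): lambda t: 26.63 - 0.17 * t - 0.01 * centred_sq(t),
}


def predicted(order, nfc, t):
    """Fitted estimate for a question order, NFC group, and time point."""
    return FITS[(order, nfc)](t)


# Print the four fitted values at a few time points (1 to 29 assumed valid).
for t in (1, 9, 15, 29):
    row = ", ".join(f"{o[:4]}/{n}: {predicted(o, n, t):5.2f}" for (o, n) in FITS)
    print(f"t={t:2d}  {row}")
```

At question 9 the easy-to-difficult fits give about 24.68 for high NFC and 21.64 for low NFC, close to the reported group means of 24.66 and 21.34. Across time points 1 to 29 the high NFC curve stays above the low NFC line, in keeping with the observation that the mean is always numerically greater for people with high NFC.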
work – we asked people to estimate their test performance only once, at the end of the test. We know from this earlier work that reliable differences emerge in beliefs about test performance as a function of question arrangement (Michael & Garry, 2016).

Results and Discussion

Recall that, as in Experiment 3, our primary question of interest is the extent to which NFC influences the use of an anchoring-and-adjustment heuristic in producing the question arrangement effect. To answer that question, we first split our sample into two groups based on the median NFC score of 4.61 (high: M = 5.45, SD = 0.57, n = 201; low: M = 3.79, SD = 0.71, n = 207; overall: M = 4.61, SD = 1.05, n = 408). Consistent with Experiments 1–3, we found that the order of questions had no meaningful influence on overall test performance, $M_{\text{difficult-to-easy}} = 20.25$, $SD_{\text{difficult-to-easy}} = 2.94$; $M_{\text{easy-to-difficult}} = 20.64$, $SD_{\text{easy-to-difficult}} = 3.23$; $M_{\text{diff}} = 0.39$, 95% CI [−0.20, 1.00], t(406) = 1.29, p = .20. As in Experiment 3, however, people with high NFC answered slightly more questions correctly than their low NFC counterparts, $M_{\text{high}} = 21.01$, $SD_{\text{high}} = 2.97$; $M_{\text{low}} = 19.90$, $SD_{\text{low}} = 3.12$; $M_{\text{diff}} = 1.12$, 95% CI [0.52, 1.71], t(406) = 3.70, p < .01. We found no statistically significant interaction between the order of questions and NFC, F(1, 404) = 0.64, p = .42.

Next, we examined subjects’ final test estimates and post-test reports of confidence in the accuracy of their memory. Overall, subjects behaved similarly, regardless of differences in NFC. More specifically, for test estimates, we replicated only the typical finding wherein difficult-to-easy subjects believed they performed more poorly on the test than easy-to-difficult subjects, $M_{\text{difficult-to-easy}} = 14.25$, $SD_{\text{difficult-to-easy}} = 5.54$; $M_{\text{easy-to-difficult}} = 17.61$, $SD_{\text{easy-to-difficult}} = 5.03$; $M_{\text{diff}} = 3.37$, 95% CI [2.34, 4.40], t(406) = 6.43, p < .01. For confidence in the accuracy of their memory, difficult-to-easy subjects were also less confident than easy-to-difficult subjects, $M_{\text{difficult-to-easy}} = 2.73$, $SD_{\text{difficult-to-easy}} = 0.88$; $M_{\text{easy-to-difficult}} = 3.10$, $SD_{\text{easy-to-difficult}} = 0.88$; $M_{\text{diff}} = 0.37$, 95% CI [0.20, 0.54], t(406) = 4.25, p < .01.

In other words, for subjects’ final test estimates we found no statistically significant interaction between the order of questions and NFC, F(1, 404) = 0.01, p = .91, nor a main effect of NFC, F(1, 404) = 1.49, p = .22. As in Experiment 3, these results remained virtually unchanged when we controlled for the slight difference in test accuracy between people with low and high NFC. For subjects’ reports of confidence in the accuracy of their memory, we found no statistically significant interaction, F(1, 404) = 1.16, p = .28, nor a main effect of NFC, F(1, 404) = 0.02, p = .90.

Overall, these results are consistent with our earlier work and show that the biasing influence of question arrangement happens both when people make repeated predictions during testing, and when they make a single post-test prediction (Michael & Garry, 2016). The patterns depicted in Figures 1–3 may therefore represent how people’s beliefs develop implicitly. But importantly, we found no meaningful moderation in the size of the question arrangement effect due to NFC. This unexpected result is, as in Experiment 3, difficult to reconcile with an anchoring-and-adjustment explanation. Finally, the difference in post-test memory confidence could suggest that question arrangement only influences this judgment when people are not making explicit, repeated predictions about their performance. Of course, the alternative
explanation – that the bouncing around of this small effect reflects ordinary sampling variability – is still viable.

General Discussion

Across four experiments, we aimed to determine what drives the finding that the order in which we ask eyewitnesses questions about an event can shape how well those eyewitnesses believe they answered those questions. To achieve this aim, in Experiments 1 and 2 we repeatedly asked subjects to report how well they thought they would perform on an eyewitness memory test, tracking how this belief changes over the course of questioning. We found that even with two different test formats, flipping the order of questions does not simply flip the pattern of beliefs people develop. Instead, the two orders produce markedly different experiences.

In Experiments 3 and 4, we further aimed to identify the role of Need For Cognition, an individual difference measure known to affect the extent to which people make adjustments to numerical estimates (Cacioppo et al., 1984; Epley & Gilovich, 2006). We anticipated that people high in NFC would make greater adjustments to their estimates than their low NFC counterparts, both when people repeatedly provided estimates over the course of the test (Experiment 3) and when people provided only one estimate after the test (Experiment 4). Such findings, if present, would fit with the idea that people rely on an anchoring-and-adjustment heuristic when forming beliefs about their performance. But instead, both experiments produced results that are difficult to reconcile with an anchoring-and-adjustment explanation.

In Experiment 3, people with high NFC adjusted differently compared to people with low NFC only when the test was arranged from the easiest to most difficult question. And, in Experiment 4, we found no evidence that NFC affected people’s single, post-test estimates of performance – estimates that were now free of the potential influence of repeated test score predictions. Across both experiments, we had anticipated instead that people with high NFC would adjust more than their low NFC counterparts, reducing the difference in final test estimates between the question arrangement conditions (see, e.g., Epley & Gilovich, 2006). Overall, the results from these two experiments suggest that effortful thinking may not protect people from the influence of ordered questions. But we state this suggestion only tentatively, because an alternative explanation is that there are, in fact, small differences in adjustment due to NFC that require greater precision to detect.

Considered as a package, a critic might wonder if these four experiments have value, given that they do not support firm conclusions about the mechanisms responsible for the influence of ordered questions. On the contrary, we think they do. In particular, the patterns of developing beliefs in Experiments 1 and 2 raise an important question: Why do these beliefs develop in a qualitatively different way, when everyone ultimately sees the same set of questions? Put another way, why is it that difficult questions dramatically change people’s beliefs about test performance when encountered first, but those exact same questions produce almost no change in beliefs about test performance when encountered last? Our results also add to the small but growing body of literature investigating explanations for the influence of question arrangement. The available evidence to date suggests that a number of other explanations are unlikely, including the possibility that people remember the first test questions best (Franco, 2015); their affect changes across the test (Weinstein & Roediger, 2010, 2012); and their attention declines across the test (Michael & Garry, 2016).

In line with our prior work, we consistently found that eyewitnesses who first answered easy questions believed they answered more questions correctly than eyewitnesses who first answered difficult questions. That finding replicated across all four experiments, and fits with research investigating the influence of question arrangement in an educational paradigm (Jackson & Greene, 2014; Weinstein & Roediger, 2010, 2012). But in contrast to our previous work, we found in three of the four experiments that eyewitnesses who first answered easy questions were just as confident in the accuracy of their memory as eyewitnesses who first answered difficult questions. This finding is at odds with our previous work (Michael & Garry, 2016).

How are we to explain this disconnection between judgments of test performance and memory confidence? We suspect that it may be due to different attributions people make across these two judgments. More specifically, test performance is a consequence of both the quality of memory and the nature of the test questions. If initially asked difficult questions that virtually no one could answer correctly, people might develop an impression that their test performance is poor – but not because of a shaky memory. Instead, that poor performance can be attributed to some unfairly difficult questions. A similar difference in attribution could arise if initially asked easy questions that virtually everyone could answer correctly. One way to test this speculative explanation would be to ask people to explain their test performance and memory confidence judgments. If our hypothesised explanation is correct, we would expect that people attribute their test performance to the ease or difficulty of the test, rather than the quality of their memory. As we acknowledged earlier, however, an alternative explanation – one that is simpler, but perhaps less interesting – is that the true size of this effect is smaller than we estimated in our prior work (Michael & Garry, 2016).

Our research adds nuance to the literature because it shows that a seemingly trivial and non-suggestive manipulation can influence eyewitness metacognition (Wells &
Loftus, 2003). Moreover, the results have implications for the mechanisms responsible for the effects that occur when people answer questions arranged in certain orders (Weinstein & Roediger, 2012). As a whole, the theory of effortful adjustment seems an inadequate explanation for our results (Epley & Gilovich, 2006). But very recently, a new paper appeared providing empirical support for an alternative theory that may prove fruitful in future investigations. This theory proposes that anchoring effects are the result of an aversion to extreme adjustments (Lewis, Gaertig, & Simmons, 2018).

It is also worth noting a methodological difference between the work presented here and other investigations of the anchoring phenomenon. In our paradigm, people provide an estimate of their performance after a series of questions. In other work, people typically provide an estimate in response to a single question (Epley & Gilovich, 2006; Tversky & Kahneman, 1974). Perhaps the serial nature of our paradigm reduces the reliance on an anchoring-and-adjustment heuristic because it provides people with multiple retrieval cues that can lead to recall of event details, reducing the necessity of relying on other information – like how easy or difficult it feels to answer questions (Greifeneder, Bless, & Pham, 2011).

What recommendations could we make – if any – for applied contexts, such as eyewitness interviewing? We know that best practice interviewing techniques often recommend an initial rapport-building phase that could be construed as a set of easy questions before the “real,” more difficult questioning begins (Collins, Lincoln, & Frank, 2002). So it is plausible that question arrangement may have some influence when interviewing eyewitnesses. But we state this possibility cautiously, because a rapport-building technique differs in a number of ways from the serially ordered question manipulation we used, and thus might not meaningfully bias eyewitnesses at all. Furthermore, we also know that best practice techniques typically recommend that the types of questions we asked should be used only toward the end of interviewing, after extensive free report procedures (Paulo, Albuquerque, & Bull, 2013). We therefore also don’t know, yet, whether question arrangement would make any appreciable difference in people’s beliefs if those people have already had an opportunity to engage in extensive recall. Finally, it is difficult to see how forensic interviewers could possibly know a priori the difficulty of their questions. Perhaps the only reasonable conclusion to draw, then, is that we may need to think more carefully about how the experience of difficulty changes for eyewitnesses over the course of questioning, because that experience can plausibly distort what people believe.

Notes

confidence is a good proxy for subjective difficulty (r = −.82, 95% CI [−.66, −.91]; Michael & Garry, 2016).

2. We present these line and curve data because they are intuitively understandable. But the careful reader will note they are statistically problematic due to autocorrelation. We therefore ran additional regression analyses that included a lag variable of the estimates, and in each case this approach improved model fit and successfully removed autocorrelation. These data can be found in Table 1 of the Supplementary Materials.

Disclosure of interest

The authors report no conflict of interest.

Data availability statement

The data for all four experiments reported in this manuscript are available from the Open Science Framework at the following address: https://osf.io/8hkmj/

Disclosure statement

No potential conflict of interest was reported by the authors.

ORCID

Robert B. Michael http://orcid.org/0000-0001-5275-7636

References

Cacioppo, J. T., Petty, R. E., & Feng Kao, C. (1984). The efficient assessment of need for cognition. Journal of Personality Assessment, 48, 306–307. doi:10.1207/s15327752jpa4803_13
Chapman, G. B., & Johnson, E. J. (2002). Incorporating the irrelevant: Anchors in judgments of belief and value. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive judgment (pp. 120–138). Cambridge, UK: Cambridge University Press.
Collins, R., Lincoln, R., & Frank, M. G. (2002). The effect of rapport in forensic interviewing. Psychiatry, Psychology and Law, 9, 69–78. doi:10.1375/pplt.2002.9.1.69
Cumming, G. (2012). Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. New York, NY: Routledge.
Cutler, B. L., Penrod, S. D., & Dexter, H. R. (1990). Juror sensitivity to eyewitness identification evidence. Law and Human Behavior, 14, 185–191. doi:10.1007/BF01062972
Douglass, A. B., Neuschatz, J. S., Imrich, J., & Wilkinson, M. (2010). Does post-identification feedback affect evaluations of eyewitness testimony and identification procedures? Law and Human Behavior, 34, 282–294. doi:10.1007/s10979-009-9189-5
Douglass, A. B., & Steblay, N. (2006). Memory distortion in eyewitnesses: A meta-analysis of the post-identification feedback effect. Applied Cognitive Psychology, 20, 859–869. doi:10.1002/acp.1237
Epley, N., & Gilovich, T. (2006). The anchoring-and-adjustment heuristic: Why the adjustments are insufficient. Psychological Science, 17, 311–318. doi:10.1111/j.1467-9280.2006.01704.x
Franco, G. (2015). The order of questions on a test affects how well students believe they performed. (Unpublished doctoral thesis), Victoria University of Wellington, Wellington, New Zealand.
Frenda, S. J., Nichols, R. M., & Loftus, E. F. (2011). Current issues and advances in misinformation research. Current Directions in Psychological Science, 20, 20–23. doi:10.1177/0963721410396620
1. In our prior work we established that reported confidence Greifeneder, R., Bless, H., & Pham, M. T. (2011). When do people rely on
closely aligns with reported difficulty, suggesting that affective and cognitive feelings in judgment? A review. Personality
12 R. B. MICHAEL AND M. GARRY
and Social Psychology Review, 15(2), 107–141. doi:10.1177/ Rundus, D. (1971). Analysis of rehearsal processes in free recall.
1088868310367640 Journal of Experimental Psychology, 89, 63–77. doi:10.1037/
Innocence Project. (2018). The Causes of Wrongful Conviction. h0031185
Retrieved from https://www.innocenceproject.org/causes/ Simmons, J. P., LeBoeuf, R. A., & Nelson, L. D. (2010). The effect of accu-
eyewitness-misidentification/. racy motivation on anchoring and adjustment: Do people adjust
Jackson, A., & Greene, R. L. (2014). Impression formation of tests: from provided anchors? Journal of Personality and Social
Retrospective judgments of performance are higher when easier Psychology, 99, 917–932. doi:10.1037/a002140
questions come first. Memory & Cognition, 42, 1325–1332. doi:10. Slovic, P., Finucane, M. L., Peters, E., & MacGregor, D. G. (2007). The
3758/ s13421-014-0439-5 affect heuristic. European Journal of Operational Research, 177,
Jones, T. C., & Roediger, H. L. (1995). The experiential basis of serial pos- 1333–1352. doi:10.1016/j.ejor.2005.04.006
ition effects. European Journal of Cognitive Psychology, 7, 65–80. Takarangi, M. K., Parker, S., & Garry, M. (2006). Modernising the misin-
doi:10.1080/09541449508520158 formation effect: The development of a new stimulus set. Applied
Lewis, J., Gaertig, C., & Simmons, J. P. (2018). Extremeness aversion is a Cognitive Psychology, 20, 583–590. doi:10.1002/acp.1209
cause of anchoring. Psychological Science, 30, 1–15. doi:10.1177/ Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging
0956797618799305 frequency and probability. Cognitive Psychology, 5, 207–232. doi:10.
Loftus, E. F. (2005). Planting misinformation in the human mind: A 30- 1016/0010-0285(73)90033-9
year investigation of the malleability of memory. Learning & Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty:
Memory, 12, 361–366. doi:10.1101/lm.94705 Heuristics and biases. Science, 185, 1124–1131. doi:10.1126/
Loftus, E. F., Donders, K., Hoffman, H. G., & Schooler, J. W. (1989). science.185.4157.1124
Creating new memories that are quickly accessed and confidently Weinstein, Y., & Roediger, H. L. (2010). Retrospective bias in test per-
held. Memory & Cognition, 17, 607–616. doi:10.3758/Bf03197083 formance: Providing easy items at the beginning of a test makes
Michael, R. B., & Garry, M. (2016). Ordered questions bias eyewitnesses students believe they did better on it. Memory & Cognition, 38,
and jurors. Psychonomic Bulletin & Review, 23, 601–608. doi:10.3758/ 366–376. doi:10.3758/MC.38.3.366
s13423-015-0933-1 Weinstein, Y., & Roediger, H. L. (2012). The effect of question order on
Michael, R. B., & Weinstein, Y. (2018). The influence of ordered question evaluations of test performance: How does the bias evolve?
difficulty: A meta-analysis of two paradigms. Manuscript in Memory and Cognition, 40, 727–735. doi:10.3758/s13421-012-
preparation. 0187-3
Paulo, R. M., Albuquerque, P. B., & Bull, R. (2013). The enhanced cogni- Wells, G. L., & Loftus, E. F. (2003). Eyewitness memory for people and
tive interview: Towards a better use and understanding of this pro- events. In A. M. Goldstein (Ed.), Handbook of psychology: Forensic
cedure. International Journal of Police Science and Management, 15, Psychology, Vol. 11 (pp. 149–160). Hoboken, NY: John Wiley &
190–199. doi:10.1350/ijps.2013.15.3.311 Sons Inc.