Mental Health Services Research Methodology 2002
1Institute of Psychiatry, London & 2Bart’s and The London School of Medicine, London, UK
Summary
Evidence-based mental health is an important goal, and randomized controlled trials (RCTs) are currently used as the currency.
Significant gains have been made in overcoming technical difficulties with RCTs, but conceptual issues with the use of RCTs as ‘best’
evidence can also be identified. Some limits of RCTs for research into individual patients, local services, and national policy will be
identified. The central thesis is that RCTs have an important contribution to make, but are only one form of evidence. Another framework
for research, realistic evaluation, is described, in which the context and mechanisms of action are considered, as well as the
outcome. Realistic evaluation will lead to different forms of evidence, including but not limited to RCTs, and will be more illuminating
for some research questions than solely considering RCTs.
Correspondence to: Dr Mike Slade, MRC Clinician Scientist Fellow, PRISM, Health Services Research Department, Institute of Psychiatry, De Crespigny Park, Denmark Hill, London, SE5 8AF, UK. Tel: 020 7848 0735; Fax: 020 7277 1462; E-mail: [email protected]
ISSN 0954–0261 print/ISSN 1369–1627 online/02/010012–07 Institute of Psychiatry
DOI: 10.1080/09540260120114014
maintaining blindness, insufficient follow-up, methodological bias (e.g. insufficient use of intention-to-treat analyses), and defining ‘improvement’. However, creative and methodologically-sound solutions to all these problems have been proposed (Blacker & Mortimore, 1996; Bradley, 1997; Taylor & Thornicroft, 1996). Technical problems are all potentially amenable to methodological resolution.
Conceptual problems are not so easily resolved. The biomedical sciences methods have been developed most fully in drug trials, in which both the object of enquiry (the dose of medication) and the entity in which change will be measured (the patient) are relatively clear. Conceptual limitations become most evident when considering attempts to use these methods in other domains. The limitations will be illustrated with reference to the three levels proposed in the matrix model for mental health services: patient, locality, and regional/national (Thornicroft & Tansella, 1999).
interaction between patient and therapist characteristics: some patients will ‘connect’ with some therapists, and some with others. Since therapists are also people, differences apply not just to skills and techniques, but also to personal characteristics such as appearance, sense of humour, and speech accent. Many aspects may impact on the outcome of therapy. Given the complexity of a dynamic process of interaction, attempting to control fully for individual differences through larger, more targeted studies may prove an elusive goal.
RCTs undoubtedly have an important role to play in answering the question ‘Does intervention X work for condition Y?’ However, the question ‘Which patients with condition Y does intervention X work for?’ may prove to have more clinical relevance, and answering this question may involve asking the question ‘How does intervention X work?’, a question which cannot be answered just by using RCTs.
simply too complex to be adequately characterized. In different examples of the ‘same’ programme, there will be important differences in resources (such as quality of buildings, locations relative to patients, amount of money for continuing professional development of staff), processes (what is done, by whom, and in what way), and structures, such as the debate regarding the necessary components of assertive community treatment (e.g. Deci et al., 1995; McGrew & Bond, 1995; Teague et al., 1998). Indeed, it would be easy to compile a list of several hundred service characteristics which may impact on outcome for specific patients. Most of these factors are not measured, especially given the lack of standardization regarding what characteristics of a service to report. The limited efforts to develop standardized assessments of services, such as the European Service Mapping Schedule (Johnson et al., 1998), have highlighted the complexity of characterizing services. Therefore, although in any individual study random allocation will ensure that any initial differences in the two groups are due to chance, it will be impossible to generalize the findings, since the ‘intervention’ (the social programme) will be inadequately characterized (Slade & Priebe, 2001).
High-quality RCTs have an important contribution to make in generating evidence about programmes of care. Perhaps the best hope for the RCT approach, as expressed by the UK700 authors, is that ‘Real progress will be made when essential ingredients in complex interventions are individually subject to equally rigorous evaluation’ (Burns et al., 1999a, p. 1386). However, even such a knowledge base would be unable to account for interactions between different components of the programme, or emergent properties of the system. The goal of ensuring internal validity by adequately characterizing all relevant characteristics, whilst retaining external validity such that what is investigated generalizes to what can be done in routine practice, may be impossible to meet. RCTs cannot generate all the necessary evidence for developing mental health services.

Conceptual problems at the national level
It is at the national level that the conceptual problems with biomedical sciences methods become most prominent. Questions which are asked at this level include: How will a higher national expenditure on mental healthcare impact on outcome criteria? What are the effects of different approaches to distributing this money? What is the best balance between central and localized control of spending? What is the best balance between clinical governance and professional self-regulation? What effects can be expected from the introduction of Community Treatment Orders? How can stigma be reduced? What type of mental health research funding should be prioritized?
For all of these questions, the type of information which is used in practice to make decisions does not routinely come from RCTs. For some questions, current practice is based on anecdote: community treatment orders worked elsewhere, so they should work here. For others, current practice is based on precedent: current resource allocation formulae have been developed iteratively from previous estimates (Jarman et al., 1992). Current practice is not based on evidence of the RCT form (Kane, 2002).
Indeed, consider what gathering such evidence would involve. To decide how much to spend on mental healthcare in England, for example, one could design a study which used Health Authorities as cases. But to identify the effectiveness of the intervention (funding level) a number of factors need to be controlled for, including characteristics of the population, current service development levels, current levels of mental health spending, and population-based needs assessment. One would soon run out of Health Authorities for matching. Similarly, what would be the intervention? Give half the Health Authorities high funding and half low, and investigate the resulting health gain? But all Health Authorities would want more money, so some would feel they are getting a bad deal from this trial, leading to a demoralized work-force, whose more able members might move to working in nearby Health Authorities who were randomized into high funding. What if some Health Authorities decided to invest heavily in primary care services, some in specialist services such as early intervention in psychosis, and some in generic mental health services? This would immediately confound the trial, since health gain differences might be due to service configuration rather than money spent.

Unresolvable problems
The above example is elaborated to underline the fact that some questions simply cannot be investigated using RCT methodologies, and for other questions RCTs are not the best form of enquiry. For example, RCTs require the use of groups of patients who differ only by chance in all relevant respects. The technical solution is large RCTs (‘mega-trials’) to allow the ‘true’ effect size to be identified. However, it may not be practical to undertake RCTs of a size sufficient to discriminate between groups when investigating complex psychological, interpersonal, social, ethnic, cultural, ethical or political questions. Examples of research questions which are difficult to address using RCTs are shown in Table 1.
Some of these areas are, of course, subject to substantial research efforts using RCTs, by changing the question to fit the methodology. For example, by changing the individual differences question to
Table 1. Examples of research questions which are difficult to investigate using RCTs
Level | Topic | Research question
Patient | Culture | Is it better to be seen by a therapist of the same cultural background?
 | Ethnicity | Do service-related factors account for any of the association between being Afro-Caribbean and compulsory admission?
 | Social context | Should a patient whose depression occurs in the context of domestic violence be prescribed anti-depressants?
 | Individual differences | Will this treatment work for this patient?
Local | Social programmes | What skills and competencies are needed in this team?
 | Inter-agency working | How can communication be improved between health and social services?
 | Mixed economy of care | What is the best balance between voluntary and statutory sector provision of services?
 | Service structures | Should we start an assertive community treatment team in this area?
National | Social change | How do we get the media to report mental illness in a more balanced way?
 | Relationship with other funding demands | Which Government department should be responsible for long-term nursing care?
 | Role of professions | How far should patients be involved in planning and developing services?
 | Research funding priorities | What type of research should be commissioned?
‘Does this treatment work for groups of patients similar in some defined way to this patient?’ allows the generation of Number Needed to Treat data. However, the findings are often equivocal, for example ‘sometimes yes, sometimes no’ (for the culture question) or ‘it helps some patients, so it’s worth trying’ (for the social context question). The technical solution of matching control and experimental groups on all relevant characteristics cannot sufficiently account for the differences which impact on the intervention, leading to inconsistent and non-generalizable results. Although these problems occur in pharmacotherapy research, they become more evident when evaluating psychotherapy, service configurations and national programmes, due to the increasing complexity of the intervention and the difficulty in operationalizing all defining elements. The attempt to overcome conceptual problems with RCTs by adopting technical solutions may be likened to the process of trying to increase the speed of a stage-coach by adding another horse: there is a limit to the gains arising from just doing more of the same thing.
To summarize the argument, RCTs are an important method of enquiry, which can provide high-quality answers to some questions. They should not be abandoned. However, sometimes RCTs cannot be used to answer the question of interest, and hence they are not always the ‘best’ evidence. This is not a new observation: it has been noted by other authors that RCTs are not suitable for questions of (for example) aetiology, diagnosis and prognosis (Sackett & Wennberg, 1997). However, a hierarchy of evidence with meta-analyses of RCTs at the top continues to be widely propagated (e.g. Guyatt et al., 1995; Roth & Fonagy, 1996; Geddes & Harrison, 1997; Greenhalgh, 1997; Department of Health, 1999). This (or any other) universal hierarchy is, in our view, unlikely to be universally applicable to the evaluation of any but the most simple of interventions.
The development of knowledge in the physical sciences happens through a process of progressive development, testing and refinement of causal hypotheses. Refinement focuses, at least partly, on the limits within which proposed rules appear to hold good, and the imperfections in the hypotheses giving rise to these limits. The limits are often as illuminating as the rules. This article has highlighted some of the limits, and it will now be argued that what is needed as well is a different type of research, aimed at answering different questions. Such a framework may be offered by an alternative methodology, which has been termed ‘Realistic Evaluation’ in a recent book (Pawson & Tilley, 1997).

Another framework for research: realistic evaluation
Realistic evaluation is the process of evaluating the effectiveness of particular social programmes targeted at specific social problems. Pawson & Tilley (1997) discuss social programmes, which they define as ‘the interplays of individual and institution, of agency and structure, and of micro and macro social processes’ (p. 63). The central message of the book is the need to move from a successionist model to a generative model of causation (Harré, 1972).
Successionist theory holds that causation is unobservable (following Hume, 1739), and observational data are the only mechanism for inferring causality. This theory leads directly to the methods of experimental manipulation, and the pre-post-comparison of experimental and control groups which permeate mental health services research today. In other words, an observed statistical association between a defined service configuration and individual outcomes is the basis for predicting effects of services, since a full understanding of why a service achieves a more or less favourable outcome is not possible. The dynamic processes linking service configuration and
outcome remain unknown. Generative theory, by contrast, holds that there is an observable connection between causally connected events, and that internal features of the thing being changed are central to the understanding of causality. To illustrate, a successionist notion of causality is involved in the statement ‘gravity causes an apple to fall to Earth’, and a generative understanding in the statement ‘bereavement causes depression’. Generative theory suggests that ‘causal outcomes follow from mechanisms acting in contexts’ (Pawson & Tilley, 1997, p. 58). An understanding of the causal mechanisms linking input with outcome and of the contextual factors influencing these processes provides the basis for a prediction of what may happen in a concrete situation.
So what is the relevance of this to mental health services research? The above definition of a social programme accords exactly with the subject matter of mental healthcare, whether the ‘intervention’ be a psychiatrist prescribing for a patient, the development of an assertive community treatment service, or attempts to reduce the stigma of mental illness. It is proposed that the current ethos of ‘evidence-based mental health’, where the term ‘evidence’ is equated with RCTs and meta-analysis (i.e. successionist methods), can only provide one form of evidence.
The seminal example of this fate in other areas of social research is the review by Martinson (1974) of offender rehabilitation programmes. The review considered all published reports in English between 1945 and 1967, and the full version ran to 1400 pages. He concluded:
I am bound to say that these data, involving over two hundred studies and hundreds of thousands of individuals as they do, are the best available and give us very little reason for hope that we have in fact found a sure way of reducing recidivism through rehabilitation. This is not to say that we have found no instances of success or partial success; it is only to say that these instances have been isolated, producing no clear pattern to indicate the efficacy of any particular method of treatment. (Martinson, 1974, p. 49)
As Pawson & Tilley (1997) note, the problem at one level is the impossible criteria for success, in which an intervention ‘works’ only if it produces positive outcomes in all trials in all contexts. The RCT proponent might argue that the Martinson review simply did not have access to an adequate (i.e. RCT-based) evidence base. However, the pattern of developments in mental health research is depressingly similar, with an increased emphasis on methodologies which cost more and more to implement (larger samples, increased programme fidelity, etc.), in the belief that interventions will ultimately be categorized into ‘effective’ and ‘ineffective’.
Pawson & Tilley (1997) use as an example the installation of closed circuit television cameras in an attempt to reduce thefts in car parks. The authors identify eight possible mechanisms by which the intervention could reduce crime and six contextual issues which could limit the potential for some of these effects. Even if only a few of these were really operative, it is predictable that testing the intervention using RCTs will produce conflicting results. The best hope of progress is to develop and test a series of more detailed hypotheses, based on causal models of how the approach might be operating. The relevance to mental health services research is the challenge of whether the right questions are being asked, or rather being asked in the right way.
The overall approach of realistic evaluation is to go back a stage in scientific research. It involves starting with hypotheses about mechanisms which produce particular outcomes, and the context within which these mechanisms operate. These hypotheses suggest initial patterns likely to be found in whatever problem is being investigated. When a specific intervention is targeted at altering particular aspects of the mechanism in particular ways, hypotheses about the relationship between context and outcome allow the formulation of specific research questions. As in meta-analysis, studies seldom stand alone. However, the goal of grouping studies is to identify ‘middle range theories’, which lie between minor hypotheses and all-inclusive systematic theories of social programmes (Merton, 1968). The central difference between this approach and the current focus on RCT methodology is that theories of mechanism under test and the contextual issues hypothesized to be influencing them need to be spelt out, so that related studies (including but not limited to RCTs) can be linked. This already happens to some extent, notably in pharmacotherapeutic and psychological intervention research. However, middle range theories do not appear to be central to research at the local and national levels.

Implications for mental health services research
What kind of studies are undertaken within this research framework? There would be more focus on identifying mechanisms of change and contexts in which these mechanisms are activated. One approach is to use RCTs, as is beginning to be apparent in trials of psychological interventions. For example, the London and East Anglia RCT of cognitive-behavioural therapy for psychosis identified ‘response to hypothetical contradiction’ as a moderator of outcome (Garety et al., 1997). This both accords with the cognitive model of schizophrenia, and is of practical clinical utility when assessing an individual for suitability for treatment. However, investigating contexts and mechanisms involves more detailed questions than RCTs have been designed to answer, indicating a need for a broader range of methodologies.
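The claim that context-dependent mechanisms lead to conflicting trial results can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `expected_trial_effect`, the benefit of 10 outcome units, and the context proportions are all hypothetical and are not taken from any study cited here. It assumes an intervention that yields a fixed benefit wherever the context activates its mechanism, and no benefit elsewhere.

```python
def expected_trial_effect(share_enabling_context: float,
                          effect_when_active: float) -> float:
    """Expected average treatment effect in a trial whose participants come
    from a mix of contexts, when the mechanism only operates in some of them."""
    if not 0.0 <= share_enabling_context <= 1.0:
        raise ValueError("share_enabling_context must lie between 0 and 1")
    # Participants in non-enabling contexts contribute zero benefit.
    return share_enabling_context * effect_when_active


# Hypothetical numbers: the 'same' intervention trialled in two settings.
EFFECT_WHEN_ACTIVE = 10.0  # assumed benefit, in arbitrary outcome units
trial_a = expected_trial_effect(0.8, EFFECT_WHEN_ACTIVE)  # mostly enabling contexts
trial_b = expected_trial_effect(0.1, EFFECT_WHEN_ACTIVE)  # mostly non-enabling contexts

print(f"Trial A expected effect: {trial_a:.1f}")
print(f"Trial B expected effect: {trial_b:.1f}")
```

Under these assumptions the two trials evaluate an identical intervention yet would report very different average effects, and only information about context explains the discrepancy.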
Remediation | Iatrogenic effects are minimized | Service is less institutional than standard care
 | Specific intervention is of benefit to the individual patient | Service provides relevant intervention with expertise
 | More attention from expert staff | Low caseloads, high expertise in team
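One way to see how mechanism and context pairings translate into research questions is to treat each pairing as a record. The sketch below is a hypothetical illustration: the `Pairing` class, its field names, and the question template are assumptions introduced here, not part of the article. It also assumes that the first item in each pairing listed above is the mechanism and the second is the context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pairing:
    mechanism: str  # hypothesized route to improved outcome
    context: str    # service feature thought to activate that mechanism


# The three pairings listed above (column interpretation is an assumption).
PAIRINGS = [
    Pairing("Iatrogenic effects are minimized",
            "Service is less institutional than standard care"),
    Pairing("Specific intervention is of benefit to the individual patient",
            "Service provides relevant intervention with expertise"),
    Pairing("More attention from expert staff",
            "Low caseloads, high expertise in team"),
]


def research_question(p: Pairing) -> str:
    # Hypothetical wording; the article does not prescribe a template.
    return (f"Is the mechanism '{p.mechanism}' activated when the context is "
            f"'{p.context}'?")


questions = [research_question(p) for p in PAIRINGS]
for q in questions:
    print(q)
```

Each generated question names one mechanism and one context, so a study can state which pairing it tests before any method, RCT or otherwise, is chosen.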
At the local level, consider an attempt to evaluate a service aspiring to offer early intervention in psychosis. Possible mechanisms for improved outcome (compared with standard care) in terms of remoralization, remediation and rehabilitation are shown in Table 2.
Each context-mechanism combination indicates a research question. Since multitudinous mechanisms might account for improved outcome, attempts to evaluate an early psychosis service as identical in all implementations will result in conflicting findings. What is needed is to identify what putative mechanism is being investigated, and what context is required for that mechanism to be activated. Once identified, a range of methodologies might then be appropriate for testing hypotheses, including but not limited to RCTs.
At the national level, research needs to draw from other disciplines (e.g. marketing, politics), in a bid to understand population-level mechanisms of change. For example, the stigma of homosexuality has probably been reduced in a number of ways, including legislative changes, reductions in censorship, the dissemination of research findings, and vociferous lobbying. Identifying the mechanisms of action for these various strategies may have implications for reducing the stigma of mental illness.
The result of this research agenda would be an understanding, rather than an explanation, of what aspects of a social programme produce change in what way, and for what people. Only in this way will mental health services move towards being rationally planned, developed and evaluated. RCTs will still have an important role, but the term ‘evidence’ will have a broader meaning, encompassing the results of many types of research which investigate many types of questions. Meaningful and important evidence is likely to come, for instance, from individual case studies (Alderman, 2002), anthropological research (Bartlett, 2002), and qualitative studies (Williams, 2002). If the ultimate goal of the mental health system is to operate on the basis of realistic evidence, rather than historical precedent or clinical anecdote, then the challenge for health planners, practitioners and researchers is to understand not only which services work, but also why, how, when and where. New methodologies have to be developed and applied in mental health services research for achieving such an understanding.

Acknowledgements
The ideas for this article have been developed through discussion with many colleagues, including Jonathan Bindman, Derek Bolton, Gyles Glover, Morven Leese, James Tighe and Graham Thornicroft.

References
ALDERMAN, N. (2002). Individual case studies. In: S. PRIEBE & M. SLADE (Eds), Evidence in mental health care. London: Routledge.
BARTLETT, A. (2002). Anthropological studies. In: S. PRIEBE & M. SLADE (Eds), Evidence in mental health care. London: Routledge.
BLACKER, C. & MORTIMORE, C. (1996). Randomized controlled trials and naturalistic data: time for a change? Human Psychopathology, 11, 353–363.
BRADLEY, C. (1997). Psychological issues in clinical trial design. Irish Journal of Psychology, 18, 67–87.
BRYANT, M., SIMONS, A. & THASE, M. (1999). Therapist skill and patient variables in homework compliance: controlling an uncontrolled variable in cognitive therapy outcome research. Cognitive Therapy & Research, 23, 381–399.
BURNS, T., FAHY, T., THOMPSON, S., TYRER, P. & WHITE, I. (1999a). Intensive case management for severe psychotic illness (letter). Lancet, 354, 1385–1386.
BURNS, T., CREED, F., FAHY, T., THOMPSON, S., TYRER, P. & WHITE, I. (1999b). Intensive versus standard case management for severe psychotic illness: a randomised trial. Lancet, 353, 2185–2189.
DECI, P., SANTOS, A., HIOTT, D., SCHOENWALD, S. & DIAS, J. (1995). Dissemination of assertive community treatment programs. Psychiatric Services, 46, 676–678.
DEPARTMENT OF HEALTH (1999). National service framework for mental health. London: HMSO.
GARETY, P., FOWLER, D., KUIPERS, E., FREEMAN, D., DUNN, G., BEBBINGTON, P., HADLEY, C. & JONES, S. (1997). London-East Anglia randomised controlled trial of cognitive-behavioural therapy for psychosis. II: Predictors of outcome. British Journal of Psychiatry, 171, 420–426.
GEDDES, L. & HARRISON, P. (1997). Closing the gap between research and practice. British Journal of Psychiatry, 171, 220–225.
GILBODY, S. & HOUSE, A. (1999). Variations in psychiatric practice. British Journal of Psychiatry, 175, 303–305.
GREENHALGH, T. (1997). How to read a paper: getting your bearings. British Medical Journal, 315, 243–246.
GUYATT, G., SACKETT, D., SINCLAIR, J., HAYWARD, R., COOK, D. & COOK, R. (1995). Users’ guide to the medical literature: IX. A method for grading health care recommendations. Journal of the American Medical Association, 274, 1800–1804.
HARRÉ, R. (1972). The philosophies of science. Oxford: Oxford University Press.
HUME, D. (1739). A treatise of human nature. London: John Noon.
JARMAN, B., HIRSCH, S., WHITE, P. & DRISCOLL, R. (1992). Predicting psychiatric admission rates. British Medical Journal, 304, 1146–1151.
JOHNSON, S., SALVADOR-CARULLA, L. & THE EPCAT GROUP (1998). Description and classification of mental health services: a European perspective. European Psychiatry, 13, 333–341.
KANE, E. (2002). The policy perspective: what evidence is influential? In: S. PRIEBE & M. SLADE (Eds), Evidence in mental health care. London: Routledge.
KUIPERS, E., GARETY, P., FOWLER, D., DUNN, G., BEBBINGTON, P., FREEMAN, D. & HADLEY, C. (1997). The London East Anglia randomised control trial of cognitive behaviour therapy for psychosis. I: Effects of the treatment phase. British Journal of Psychiatry, 171, 319–327.
MARSHALL, M., BOND, G., STEIN, L., SHEPHERD, G., MCGREW, J., HOULT, J., ROSEN, A., HUXLEY, P., DIAMOND, R., WARNER, R., OLSEN, M., LATIMER, E., GOERING, P., CRAIG, T., MEISLER, N. & TEST, M. (1999). PRiSM psychosis study. Design limitations, questionable conclusions. British Journal of Psychiatry, 175, 501–503.
MARTINSON, R. (1974). What works? Questions and answers about prison reform. Public Interest, 35, 22–54.
MCGOVERN, D. & OWEN, A. (1999). Intensive case management for severe psychotic illness (letter). Lancet, 354, 1384.
MCGREW, J. & BOND, G. (1995). Critical ingredients of assertive community treatments: judgment of experts. Journal of Mental Health Administration, 22, 113–125.
MERTON, R. (1968). Social theory and social structure. New York: Free Press.
PAWSON, R. & TILLEY, N. (1997). Realistic evaluation. London: Sage.
ROTH, A. & FONAGY, P. (1996). What works for whom? London: Guilford Press.
SACKETT, D.L. & WENNBERG, J.E. (1997). Choosing the best research design for each question. British Medical Journal, 315, 1636.
SASHIDHARAN, S., SMYTH, M. & OWEN, A. (1999). PRiSM psychosis study. Thro’ a glass darkly: a distorted appraisal of community care. British Journal of Psychiatry, 175, 504–507.
SENSKY, T., TURKINGTON, D., KINGDON, D., SCOTT, J.L., SCOTT, J., SIDDLE, R., O’CARROLL, M. & BARNES, T.R. (2000). A randomized controlled trial of cognitive-behavioral therapy for persistent symptoms in schizophrenia resistant to medication. Archives of General Psychiatry, 57, 165–172.
SLADE, M. & PRIEBE, S. (2001). Are randomised controlled trials the only gold that glitters? British Journal of Psychiatry, 179, 286–287.
TAYLOR, R. & THORNICROFT, G. (1996). Uses and limits of randomised controlled trials in mental health service research. In: G. THORNICROFT & M. TANSELLA (Eds), Mental health outcome measures (pp. 143–151). Berlin: Springer.
TEAGUE, G., BOND, G. & DRAKE, R. (1998). Program fidelity in assertive community treatment: development and use of a measure. American Journal of Orthopsychiatry, 68, 216–232.
THORNICROFT, G. & TANSELLA, M. (1999). The mental health matrix. Cambridge: Cambridge University Press.
THORNICROFT, G., BECKER, T., HOLLOWAY, F., JOHNSON, S., LEESE, M., MCCRONE, P., SZMUKLER, G., TAYLOR, R. & WYKES, T. (1999). Community mental health teams: evidence or belief? British Journal of Psychiatry, 175, 508–513.
THORNICROFT, G., STRATHDEE, G., PHELAN, M., HOLLOWAY, F., WYKES, T., DUNN, G., MCCRONE, P., LEESE, M., JOHNSON, S. & SZMUKLER, G. (1998). Rationale and design: PRiSM psychosis study I. British Journal of Psychiatry, 173, 363–370.
WILLIAMS, B. (2002). Qualitative studies. In: S. PRIEBE & M. SLADE (Eds), Evidence in mental health care. London: Routledge.