Utility of The Child Behavior Checklist As A Screener For Autism Spectrum Disorder
The Child Behavior Checklist (CBCL) has been proposed for screening of autism spectrum disorders (ASD) in clinical
settings. Given the already widespread use of the CBCL, this could have great implications for clinical practice. This
study examined the utility of CBCL profiles in differentiating children with ASD from children with other clinical
disorders. Participants were 226 children with ASD and 163 children with attention-deficit/hyperactivity disorder,
intellectual disability, language disorders, or emotional disorders, aged 2–13 years. Diagnosis was based on comprehensive
clinical evaluation including well-validated diagnostic instruments for ASD and cognitive testing. Discriminative
validity of CBCL profiles proposed for ASD screening was examined with area under the curve (AUC) scores,
sensitivity, and specificity. The CBCL profiles showed low discriminative accuracy for ASD (AUC 0.59–0.70). Meeting cutoffs
proposed for ASD was associated with general emotional/behavioral problems (EBP; mood problems/aggressive
behavior), both in children with and without ASD. Cutoff adjustment depending on EBP-level was associated with
improved discriminative accuracy for school-age children. However, the rate of false positives remained high in
children with clinical levels of EBP. The results indicate that use of the CBCL profiles for ASD-specific screening would
likely result in a large number of misclassifications. Although taking EBP-level into account was associated with
improved discriminative accuracy for ASD, acceptable specificity could only be achieved for school-age children with
below clinical levels of EBP. Further research should explore the potential of using the EBP adjustment strategy to
improve the screening efficiency of other more ASD-specific instruments. Autism Res 2016, 9: 33–42.
Keywords: early detection; diagnosis; emotional/behavioral problems; Child Behavior Checklist (CBCL)
From the Center for Autism and the Developing Brain, Weill Cornell Medical College, White Plains, New York (K.A.H., M.H., C.L.); Lovisenberg
Diaconal Hospital, Oslo, Norway (K.A.H.); Norwegian Institute of Public Health, Oslo, Norway (K.A.H.); Department of Psychology, University of
Oslo, Oslo, Norway (S.T.); Department of Psychiatry, University of California San Francisco, California (S.L.B.)
Received March 22, 2015; accepted for publication June 13, 2015
Address for correspondence and reprints: Karoline Alexandra Havdahl, Weill Cornell Medical College, Center for Autism and the Developing
Brain, 21 Bloomingdale Road, White Plains, NY, 10605. E-mail: [email protected] or [email protected]
Published online 3 July 2015 in Wiley Online Library (wileyonlinelibrary.com)
DOI: 10.1002/aur.1515
© 2015 International Society for Autism Research, Wiley Periodicals, Inc.
                                Preschool                                 School-age
Characteristic                  ASD           Non-ASD       t or χ²       ASD           Non-ASD       t or χ²
Age, years, m (SD)              4.2 (1.1)     4.4 (0.9)     −1.4          9.4 (1.8)     9.2 (2.0)     0.8
Gender, male, n (%)             85 (81.7)     43 (75.4)     0.9           91 (74.6)     69 (65.1)     2.4
Nonverbal IQ, m (SD)            76.9 (25.7)   97.6 (20.5)   −5.6***       87.0 (28.3)   90.7 (19.7)   −1.2
Verbal IQ, m (SD)               69.5 (32.7)   92.3 (22.1)   −5.2***       81.2 (29.6)   91.3 (22.1)   −3.0**
ADOS module, n (%)                                          14.5**                                    9.9**
  1: Single words or less       54 (51.9)     12 (21.1)                   16 (13.1)     2 (1.9)
  2: Phrase speech              27 (26.0)     25 (43.9)                   9 (7.4)       8 (7.5)
  3: Fluent speech              23 (22.1)     20 (35.1)                   97 (79.5)     96 (90.6)
ADOS comparison score, m (SD)   7.3 (1.8)     2.3 (2.2)     12.9***       7.2 (2.2)     2.4 (1.8)     18.3***
High EBP-level, n (%)           34 (32.7)     12 (21.1)     2.4           45 (36.9)     35 (33.0)     0.4
Non-ASD diagnoses, n (%)a                                                                             34.3***
  ADHD                                        14 (24.6)                                 48 (45.3)
  Intellectual disability                     5 (8.8)                                   21 (19.8)
  Language disorder                           33 (57.9)                                 15 (14.2)
  Emotional disorder                          5 (8.8)                                   22 (20.8)
Note. CBCL = child behavior checklist, ASD = autism spectrum disorder, ADOS = autism diagnostic observation schedule, EBP = emotional/behavioral problems, ADHD = attention deficit/hyperactivity disorder.
One preschool ASD case had missing IQ data.
a Comparison between preschool and school-age non-ASD groups.
*P < 0.05; **P < 0.01; ***P < 0.001.
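As an illustration of the chi-square comparisons summarized in the table above, the following minimal sketch recomputes the statistic for gender proportions from the tabulated counts. The group sizes (104, 57, 122, and 106) are not stated in the table itself; they are assumptions inferred here from the reported counts and percentages (e.g., 85 males at 81.7% implies roughly 104 children). The function name is hypothetical, and the sketch uses the uncorrected Pearson formula for a 2×2 table, which may or may not be the exact variant the authors ran in SPSS.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Assumed group sizes, inferred from the counts and percentages in the table.
N_PRESCHOOL_ASD, N_PRESCHOOL_NON_ASD = 104, 57
N_SCHOOL_ASD, N_SCHOOL_NON_ASD = 122, 106

# Rows: ASD vs. non-ASD; columns: male vs. female (female = group size - males).
preschool_gender = chi_square_2x2(
    85, N_PRESCHOOL_ASD - 85, 43, N_PRESCHOOL_NON_ASD - 43
)
school_age_gender = chi_square_2x2(
    91, N_SCHOOL_ASD - 91, 69, N_SCHOOL_NON_ASD - 69
)

print(round(preschool_gender, 1), round(school_age_gender, 1))  # 0.9 2.4
```

Under these assumptions the rounded values agree with the gender row of the table (0.9 and 2.4), which supports reading the third and sixth columns as test statistics for the preschool and school-age comparisons, respectively.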
Scale-Revised [Conners, Sitarenios, Parker, & Epstein, 1998], the Spence Children’s Anxiety Scale [Spence, 1998], and the Multidimensional Anxiety Scale for Children [March, Parker, Sullivan, Stallings, & Conners, 1997]. Following completion of all measures, clinicians met to discuss their impressions and assign a consensus diagnosis. Although the CBCL was available at time of diagnosis, this instrument was not used in determining the presence or absence of ASD.
Data Analysis
Analyses were carried out separately for the CBCL/1.5-5 and the CBCL/6-18, using the Statistical Package for Social Sciences (SPSS) version 21. The significance level was set at alpha = 0.05 (two-tailed). Characteristics of the ASD and non-ASD groups were compared using chi-square tests (Fisher’s exact test if cells had <5 observations) and t-tests.
First, we examined whether the CBCL scales suggested for ASD screening (i.e., Withdrawn, PDP, Withdrawn/depressed, Social problems, and Thought problems) showed diagnostic group differences when controlling for other child characteristics. Multivariate Analysis of Covariance (MANCOVA) was used to examine diagnostic group differences on (a) composite scales, (b) syndrome scales, and (c) DSM-oriented scales, with gender, nonverbal IQ, and age as covariates. Raw scores were used in the MANCOVA, as recommended by Achenbach and Rescorla [2000, 2001]. Individual ANCOVAs were only analyzed if the MANCOVA was significant. Effect sizes are reported as partial eta squared (η²p), interpreted as small: 0.01–0.05, medium: 0.06–0.13, and large: ≥0.14.
Logistic regression was used to determine whether scale combinations resulted in incremental discriminative validity compared with the individual scales. Discriminative validity was examined using area under the curve (AUC) scores from nonparametric receiver operating characteristic (ROC) analyses; the ROC curve plots the true positive rate against the false positive rate. Swets [1988] suggested the following benchmarks for interpreting AUC scores: 0.50–0.70 (low accuracy), 0.70–0.90 (moderate accuracy), and >0.90 (high accuracy). A sample size calculation, using the StatsToDo website (https://www.statstodo.com/SSizSenSpc_Pgm.php), indicated that 50 cases in each group were needed to detect a difference between chance-level and moderate discrimination (AUC = 0.50/0.70, α = 0.05, power = 0.80). For the profile demonstrating the highest AUC-score in each age group, we calculated sensitivity, specificity, and positive likelihood ratio (LR+; sensitivity divided by 1 − specificity). Confidence intervals (95%) were calculated based on the Wilson score method [Newcombe, 1998]. T scores were used to facilitate comparison with previous studies.
Note. CBCL = child behavior checklist, ASD = autism spectrum disorder, PDP = pervasive developmental problems, ADHD = attention deficit/hyperactivity, ODD = oppositional/defiant, η²p = partial eta squared.
One case was excluded from the MANCOVA due to missing IQ data.
*P < 0.05; **P < 0.01; ***P < 0.001.
Stratified analyses were performed to examine whether discriminative accuracy was associated with level of EBP, ID, and/or previous ASD diagnosis. The CBCL has multiple scales intended to capture emotional problems (e.g., Internalizing, Emotionally reactive, Anxious/depressed, Anxiety problems, and Affective problems) and behavioral problems (e.g., Externalizing, Attention problems, Attention deficit/hyperactivity problems, Oppositional/
defiant problems). In operationalizing a clinically significant level of EBP, avoiding overlap with core ASD behaviors was a priority. Therefore, scales with item content clearly overlapping with core ASD behaviors were not considered (e.g., Emotionally reactive, Internalizing). Few studies have examined concordance between CBCL scales and co-occurring emotional/behavioral disorders in children with ASD. An exception is a recent study of school-aged children with ASD, which found the highest discriminative validity for the Affective problems and Aggressive behavior scales (AUC = 0.90) [Gjevik, Sandstad, Andreassen, Myhre, & Sponheim, 2015]. To avoid the multiple comparisons problem, we based the choice of the particular emotional and behavioral scale on this finding. Therefore, EBP-level was operationalized as high when the T score (age- and gender-normed) on Aggressive behavior and/or Affective problems was in the clinical range (≥70). For the EBP classification to be useful in children with problems specific to the emotional or behavioral domain, high EBP was defined as scoring in the clinical range on either of the scales (results were very similar when using only one of the scales).
All results should be interpreted in light of their confidence intervals. Charman et al. [2007] found a difference in specificity of 0.41 and 0.93 for another ASD screener between subgroups with high and low EBP. A
sample size calculation indicated that 13 cases in each group were needed to have 80% power to detect a difference of this size (α = 0.05; StatsToDo).
Results
Sample Characteristics
As shown in Table 1, there were large differences in ADOS scores between the ASD and non-ASD groups. The ASD group also showed lower intellectual ability, with significant differences in verbal IQ in both age samples, and in nonverbal IQ in the preschool sample. No significant differences were found for age or gender proportions. Among children with non-ASD disorders, the proportion with language disorders was higher in the preschoolers, whereas the proportion with ADHD and emotional disorders was higher in the school-age children. The prevalence of high EBP was 33% in the total sample, with no significant differences between the ASD and non-ASD groups. The two scales comprising EBP-level did not significantly correlate with age, nonverbal IQ, or verbal IQ (Pearson’s r ranged from −0.09 to 0.09, P ≥ 0.16).
Group Differences on the CBCL
Table 2 presents mean raw CBCL scores and MANCOVA results for the ASD and non-ASD groups (mean T scores are provided as supplementary information). Controlling for gender, age, and nonverbal IQ, preschoolers with ASD scored significantly higher than preschoolers with non-ASD disorders on Withdrawn and PDP (medium effect sizes, ES). The ASD group also scored significantly higher on Total problems, Internalizing, Emotionally reactive, Aggressive behavior, and Anxiety problems (small ES). In the school-age sample, the ASD group scored significantly higher than the non-ASD group only on the scales suggested for ASD screening (i.e., Withdrawn/depressed, Social problems, and Thought problems, small-to-medium ES), controlling for gender, age, and nonverbal IQ.
Overall Discriminative Validity
As shown in Table 3, overall discriminative validity of the two CBCL/1.5-5 scales proposed for ASD screening was in the low range (AUC 0.68–0.69). Logistic regression showed no incremental discriminative value of combining the scales. Only Withdrawn made a significant unique contribution to discrimination (B = 0.22, P = 0.01), while the nonoverlapping items from PDP did not contribute significantly (B = 0.00, P = 0.99), χ²(2) = 15.02, P < 0.01. Due to similar findings, further results are only presented for Withdrawn.
The CBCL/6-18 scales suggested for ASD screening also resulted in AUC-scores in the low range (AUC = 0.59–0.67). Logistic regression showed that combining the scales had incremental discriminative value compared to the individual scales. Withdrawn/depressed and Thought problems made statistically significant unique contributions to discrimination (B = 0.06, P < 0.01 and B = 0.05, P < 0.01, respectively), whereas Social problems did not contribute significantly (B = −0.02, P = 0.32), χ²(3) = 29.04, P < 0.01. The aggregated scale of T scores from Withdrawn/depressed and Thought problems, hereafter referred to as Withdrawn-Thought Problems (WTP), yielded an AUC-score of 0.70.
Given the site differences between the ASD and non-ASD groups, we examined the possible covariate effect of site (UMACC vs. CCHMC) using ROC regression in Stata version 13. Site did not show a significant covariate effect on either the preschool Withdrawn scale (P = 0.87) or the school-age WTP scale (P = 0.85).
Note. CBCL = Child behavior checklist, ASD = autism spectrum disorder, WTP = Withdrawn-Thought Problems, EBP = emotional/behavioral problems, ID = intellectual disability.
Sensitivity, Specificity, and Likelihood Ratio
Sensitivity, specificity, and LR+ of the Withdrawn and WTP scales were examined at two previously suggested T score cutoffs of 65 and 62 [Muratori et al., 2011; Narzisi et al., 2013], using the aggregated mean scale cutoff when combining scales (130 and 124 for WTP) [Biederman et al., 2010]. At the higher cutoff consistent with the CBCL “borderline clinical” cut-point, sensitivity and specificity were 63% (95% CI = 53–73)
and 65% (95% CI = 51–77) for Withdrawn, and 58% (95% CI = 50–68) and 68% (95% CI = 58–76) for WTP, respectively. LR+ was 1.8 for both Withdrawn (95% CI = 1.2–2.7) and WTP (95% CI = 1.3–2.5).
The lower cutoff resulted in moderate sensitivity (74% for Withdrawn, 78% for WTP) and low specificity (53% for Withdrawn, 55% for WTP). Change in probability of ASD diagnosis given scores above the lower cutoff was small both for Withdrawn (1.6) and WTP (1.7). The cutoff required to identify at least 80% of children with ASD resulted in specificity of 39% for Withdrawn (95% CI = 26–51, cutoff 58) and 53% for WTP (95% CI = 43–63, cutoff 123).
Factors Associated With Discriminative Validity
Table 4 presents the results of the subgroup analyses for the more sensitive lower cutoff by level of EBP, ID, and previously/first diagnosed ASD. Subgroup analysis by gender was attempted, but was not possible due to confounding of gender and high EBP within children with ASD, with a significantly higher proportion of high EBP in girls compared to boys in preschoolers (53% vs. 29%), χ²(1, N = 104) = 4.20, P = 0.04, and school-age children (55% vs. 31%), χ²(1, N = 122) = 5.75, P = 0.02. There was no significant difference in the proportions of high EBP between girls and boys with non-ASD disorders in preschoolers (14% vs. 23%), Fisher’s exact P = 0.71, or in school-aged children (32% vs. 33%), χ²(1, N = 106) = 0.01, P = 0.93.
Level of EBP
Discriminative utility of the Withdrawn and WTP showed substantial variability depending on EBP-level. For both scales, discriminative validity was in the moderate range for children with low EBP (AUC = 0.70–0.79) and in the low range for children with high EBP (AUC = 0.62).
With regard to the CBCL/6-18 WTP, scores at or above 124 were associated with a 3.2 increase in likelihood of ASD among children with low EBP, in contrast to no increase among children with high EBP (1.0). Optimal cutoffs (maximized specificity with sensitivity ≥80%) differed widely between children with high and low EBP-level. In the low EBP subgroup, a cutoff of 117 correctly classified 82% (95% CI = 71–89) of children with ASD and 62% (95% CI = 50–73) of children with non-ASD disorders. For children with high EBP, compared to cutoff 124, a cutoff of 134 resulted in improved specificity from 6% (95% CI = 1–13) to 40% (95% CI = 24–58) while maintaining sensitivity at 81% (95% CI = 67–91) (see Fig. 1).
Although a similar pattern was found for the CBCL/1.5-5 Withdrawn, CIs were wider, especially in the small high EBP subgroup (n = 46). In the larger low EBP subgroup (n = 115), discriminative accuracy was somewhat lower than for the school-age low EBP subgroup (AUC 0.70 vs. 0.79). The cutoff required to identify at least 80% of preschoolers with ASD in the low EBP subgroup resulted in only 33% specificity (cutoff 54, sensitivity: 87%). Thus, it was not possible to achieve acceptable discriminative accuracy by using adjusted cutoffs.
Intellectual Disability
Due to few children with ID in the preschool non-ASD group (n = 5), this analysis was only performed for the school-age sample. Although discriminative accuracy of