
Proceedings of the Society for Computation in Linguistics
Volume 1 Article 3
2018
Detecting Language Impairments in Autism: A Computational Analysis of Semi-structured Conversations with Vector Semantics
Adam Goodkind
Northwestern University, [email protected]
Michelle Lee
Gary E. Martin
St. John's University, [email protected]
Molly Losh
Klinton Bicknell
Recommended Citation
Goodkind, Adam; Lee, Michelle; Martin, Gary E.; Losh, Molly; and Bicknell, Klinton (2018) "Detecting Language Impairments in
Autism: A Computational Analysis of Semi-structured Conversations with Vector Semantics," Proceedings of the Society for
Computation in Linguistics: Vol. 1 , Article 3.
DOI: https://doi.org/10.7275/R56W988P
Available at: https://scholarworks.umass.edu/scil/vol1/iss1/3
This Paper is brought to you for free and open access by ScholarWorks@UMass Amherst. It has been accepted for inclusion in Proceedings of the
Society for Computation in Linguistics by an authorized editor of ScholarWorks@UMass Amherst. For more information, please contact
[email protected].
Detecting language impairments in autism: A computational analysis of
semi-structured conversations with vector semantics
Adam Goodkind1, Michelle Lee2,3, Gary E. Martin4, Molly Losh3, and Klinton Bicknell1
1 Dept. of Linguistics, Northwestern Univ., Evanston, IL 60208
2 Clinical Psychology, Feinberg School of Medicine, Northwestern Univ., Chicago, IL 60611
3 Dept. of Communication Sciences and Disorders, Northwestern Univ., Evanston, IL 60208
4 Dept. of Communication Sciences and Disorders, St. John’s Univ., Staten Island, NY 10301
{a.goodkind, michelleannemarie2017}@u.northwestern.edu,
[email protected], {m-losh, kbicknell}@northwestern.edu
Abstract

Many of the most significant impairments faced by individuals with autism spectrum disorder (ASD) relate to pragmatic (i.e. social) language. There is also evidence that pragmatic language differences may map to ASD-related genes. Therefore, quantifying the social-linguistic features of ASD has the potential to both improve clinical treatment and help identify gene-behavior relationships in ASD. Here, we apply vector semantics to transcripts of semi-structured interactions with children with both idiopathic and syndromic ASD. We find that children with ASD are less semantically similar to a gold standard derived from typically developing participants, and are more semantically variable. We show that this semantic similarity measure is affected by transcript word length, but that these group differences persist after removing length differences via subsampling. These findings suggest that linguistic signatures of ASD pervade child speech broadly, and can be automatically detected even in less structured interactions.

1 Introduction

From its earliest descriptions (Kanner, 1943), autism spectrum disorder (ASD) has been associated with language impairment, and pragmatic language impairment in particular. Problems with pragmatic language are a key component of current diagnostic criteria for ASD, and both atypical and idiosyncratic language are noted as features of ASD under current standards used in both the DSM-IV and DSM-5 (American Psychiatric Association, 2000, 2013). However, current methods for assessing pragmatic language impairment are often subjective, can be very time intensive, and distal from underlying mechanisms. Computational models of language production in ASD thus have the potential to improve diagnostic assessments, contribute to research into the basis of language impairment in ASD, and may also show strong utility in clinical treatment as objective and quantitative measures of response to intervention.

Additionally, evidence that more subtle language differences are evident at elevated rates among relatives of individuals with ASD points towards pragmatic language as a genetically meaningful domain in ASD, with potential for informing molecular genetic studies, which examine more specific ties to component phenotypes in ASD that may segregate independently and relate to distinct genetic underpinnings (Losh, Sullivan, Trembath, & Piven, 2008). The development of computational tools for quantifying language impairment in ASD, such as the present study, may therefore contribute to future studies of ASD genetics as well. This can be accomplished by applying the present study’s methods to large-scale datasets, which are appropriate for broad genetic studies. In addition, the methods presented below provide a continuous measure of pragmatic impairment, which can be more readily compared against genetic data.

The pragmatic language impairments in ASD are
Proceedings of the Society for Computation in Linguistics (SCiL) 2018, pages 12-22.
Salt Lake City, Utah, January 4-7, 2018
evident in a range of linguistic features. For instance, limited frequency and diversity of complex syntax has been shown to significantly impact narrative and conversational quality in ASD (Losh & Capps, 2003; Prud’hommeaux, Roark, Black, & Van Santen, 2011). Individuals with ASD produce more non-contingent discourse in narrative and conversation (Capps, Kehres, & Sigman, 1998; Losh & Capps, 2003, 2006). In a similar vein, people with ASD tend to fixate on a single topic, even though a conversation may have moved away from that single topic in another direction (Nazeer & Ghaziuddin, 2012). Additionally, both inappropriate semantic and pragmatic language has been demonstrated (Tager-Flusberg & Sullivan, 1995). From a semantic standpoint, one way these differences may manifest is when children with ASD use very uncommon words in a context in which a common word suffices. Children with ASD can also fail to make common pragmatic inferences, such as understanding the semantics of a question like Can you close the door? but failing to understand its pragmatics, and so responding by saying Yes, I can.

Given this evidence that individuals with ASD exhibit such pragmatic impairments, prior work has used computational models to distinguish individuals with ASD from typically developing individuals, using distributional semantic word models (Rouhizadeh, Prud’hommeaux, Roark, & Van Santen, 2013; Losh & Gordon, 2014). For example, both Losh and Gordon (2014) and Lee et al. (2017) used Latent Semantic Analysis (Deerwester, Dumais, Furnas, Landauer, & Harshman, 1990) with transcripts from picturebook narratives, a narrative recall task, and a less structured narrative elicitation task. Both studies showed that narratives from individuals with ASD diverged significantly in vector semantic space from a gold standard (either the original narrative, or a narrative derived from the TD group of participants) compared to (non-gold standard) typically developing controls.

Narrative recall and picturebook description tasks afforded clear gold standards for comparisons, with a very clear objective semantics to communicate the original narrative. Thus, they were optimal for a computational linguistic approach. However, it is less clear whether such an approach can generalize to other, more variable and naturalistic language contexts, in which there may be no objective gold standard. Arguably, though, these naturalistic studies are more ecologically valid, and also constitute the discourse context posing the most serious challenges to individuals with ASD.

1.1 Goals

The primary goal of the present work is to investigate whether this computational approach could be applied in a more open-ended conversational setting, in which there is no objective gold standard. The primary contributions of this work are as follows:

• We show that language from individuals with ASD can be distinguished from that of typically developing individuals, by applying vector semantic models, even on semi-structured conversational data.

• We demonstrate that semantic similarity metrics are affected by transcript length, raising the question of whether such metrics can yield valid conclusions with very small language samples, such as often occur with children or lower functioning populations.

• We present a method for adapting semantic similarity analyses to accommodate possibly-small language samples from younger or lower functioning populations who have more limited language abilities.

To do this, we analyzed semi-structured conversational interactions, consisting of relatively free ranging conversation in a number of somewhat consistent situations. We construct an approximate gold standard of comparison from the transcripts of a few individuals with typical development who had similar and typical language and cognitive abilities.

We focus on conversational interactions because this language context is among the most challenging for individuals with ASD, since lack of structure and high interpersonal demands pose serious barriers to effective communication (as opposed to picturebook narratives, for example) (Losh & Capps, 2003). Furthermore, prior studies of computational linguistic approaches to characterizing discourse in ASD have focused on more structured contexts. This study is the first to apply this technique to conversational interaction in ASD. If this technique can successfully
differentiate neurotypical individuals and individuals with ASD, based on semi-structured conversation, it would suggest that the semantic differences in the language of individuals with ASD are quite widespread, and detectable across a range of everyday tasks. Finally, because the semi-structured interactions we analyze come from a standard ASD diagnosis task (the ADOS, see below), such a computational model has the potential to help with diagnosis.

The remainder of this paper is structured as follows. The following section describes our transcript dataset. Following that, Section 3 describes how we use word embeddings to quantify distances to gold standards. Section 4 describes our first experiment applying this method to conversation data. Section 5 points out that this similarity metric is confounded by transcript length and presents a method to remove this confound. Section 6 concludes.

2 Interaction session transcripts

2.1 Participants

We selected 109 participants, in three groups: (1) (younger) typically developing children used as a control due to comparable cognitive ability to the clinical groups (TD); (2) school aged children with idiopathic ASD, unrelated to any other known genetic disorders (ASD); and (3) children with ASD comorbid with fragile X syndrome (FXS-ASD). For all three groups, children were selected based upon the nonverbal mental age from the Leiter International Performance Scale (Wechsler, 2008). For typically developing children, mental age should on average match chronological age. However, for children with developmental impairments, mental age is often lower than chronological age.

Fragile X syndrome (FXS) is the most common heritable intellectual disability, and has common comorbidity with ASD (Rogers, Wehner, & Hagerman, 2001; Kaufmann et al., 2004; Martin et al., 2017; inter alia). Like ASD, fragile X syndrome often shows pragmatic deficits as well. Evidence also exists that language impairment within fragile X syndrome affects males with FXS more than it affects females with FXS (Abbeduto, McDuffie, & Thurman, 2014). For this reason, all selected participants were male. This also eliminates sex as a possible confound.

All participants were selected based on a mental age of approximately 5;0. All participants had a mean length of utterance (MLU) of at least three words per utterance and were L1 English speakers. Participants were drawn from a larger longitudinal study reported in Martin et al. (2017). Additionally, individuals with idiopathic ASD were required to have a previous clinical diagnosis, confirmed by administration of the Autism Diagnostic Observation Schedule (ADOS) (Lord et al., 2000) and/or the Autism Diagnostic Interview - Revised (Lord, Rutter, & Le Couteur, 1994). Individuals with FXS-ASD were confirmed based only on the ADOS. The average chronological and mental age for each group are provided in Table 1.

2.2 Procedure

Language samples were derived from the ADOS and/or ADOS-2, gold standard diagnostic tools for ASD. The ADOS includes several structured activities as well as opportunities for naturalistic interaction, in order to probe for social-communication skills and the presence of restricted and repetitive behaviors. Play-based activities included the opportunity to play with action figures and other toys. Non-play based activities included conversation between tasks, describing a picture, or telling a story from a book.

2.3 Transcription

Subsections of entire language samples were transcribed from high quality audio recordings by trained transcribers. The transcripts were based on a subset of the full assessment: specifically, 55 intelligible play based turns and 55 non-play based turns were transcribed (or fewer in the rare case that there were not 55 intelligible turns).

2.4 Processing

All child utterances were extracted from the transcripts, including filled pauses and stop words. Although stop word removal is common practice in distributional semantics and NLP (Levy, Goldberg, & Dagan, 2015), this class of words can be psychologically informative (Chung & Pennebaker, 2007). This also seems especially relevant to ASD, where incorrect pronoun usage is common, e.g. using the second-person you when referring to oneself, instead of the correct first-person I (Naigles et al., 2016).
Diagnostic Group                        n    Chronological Age (SD)   Mental Age Equivalent (SD)
Typically developing (TD)               22   4.7 (1.1)                5.1 (1.2)
Autism spectrum disorder (ASD)          39   8.7 (2.9)                6.9 (3.4)
Fragile X syndrome + ASD (FXS-ASD)      48   10.6 (2.6)               5.0 (0.6)
Table 1: Participant chronological age and mental age equivalent for each diagnostic group.
2.5 Gold standard

Two of the transcripts from children with typical development were designated gold standard transcripts. This designation was performed by two researchers who were both familiar with the tasks in the interactions. Gold standards were selected based on detailed clinical-behavioral ratings. We selected TD participants who, based on this coding, demonstrated minimal pragmatic language deficits and highly-rated core features of conversation, such as contingency, reciprocity, and initiation. For the purposes of analyses conducted below, these two transcripts were excluded from the TD group, so as not to bias the results.

3 Word embeddings

A number of previous studies have used word embeddings (vector semantics) to study language transcripts of people with autism (Rouhizadeh et al., 2013; Rouhizadeh, 2015). A vector semantic model specifies an embedding, or mapping, from each word in the vocabulary to a point in a continuous vector space. A document in such models typically consists of an unordered collection of words. A vector semantic representation of a document can be obtained by combining the embeddings of the words it contains in some way, such as summing.

For the present study, each word in the transcript was converted to a vector using the word2vec model, via the pretrained Google News embeddings (Mikolov, Sutskever, Chen, Corrado, & Dean, 2013), which are 400-dimensional. A vector semantic representation of each document was created by summing the vectors for each of its words, and then normalizing to have unit length.

The gold standard vector was calculated as the mean of the two gold standard transcript vectors, and used as the basis of comparison for semantic similarity.[1] The transcripts identified to be the gold standard transcripts were excluded from the TD group for all analyses.

Semantic distance of a transcript vector to the gold standard was measured as the cosine distance between them. That is, for a given transcript vector $\vec{v}_i$ and the gold standard vector $\vec{g}$, the distance was then calculated as

\[ d(\vec{v}_i, \vec{g}) = \frac{\vec{v}_i \cdot \vec{g}}{\lVert \vec{v}_i \rVert_2 \, \lVert \vec{g} \rVert_2} \tag{1} \]

Because the transcript vectors are all normalized to have unit length, this reduces to a simple dot product.

A lower cosine distance means that the vectors being compared have more similar dimensions. Given this, we then defined the semantic similarity of a vector to the gold standard as one minus the cosine distance ($1 - d(\vec{v}, \vec{g})$), so that a lower distance resulted in a higher semantic similarity score.

The code for converting transcripts to vectors and computing similarities is freely available on an open-source repository.[2] Although our transcripts themselves cannot be shared because of privacy concerns, we used a standard format for transcription, making our tools readily usable by other investigators.

[1] Other bases for comparison were also considered, including using the mean vector from all transcripts as well as the mean vector of all of the typically developing transcripts. Future studies will compare and contrast the utility of selecting different bases for comparison.
[2] https://github.com/langcomp/vectoraut

4 Experiment 1

We performed three sets of analyses to compare TD individuals to the two populations with ASD.

4.1 Similarity to gold standard

For each transcript, we calculated the cosine distance between its vector embedding and the gold standard, yielding a single similarity score for each transcript. Figure 1 illustrates the mean semantic
similarity for each group, as well as the 95% bootstrapped confidence intervals. We then ran nonparametric Wilcoxon tests comparing the semantic similarity scores across groups, specifically comparing the TD group to the ASD group, and the TD group to FXS-ASD, the results of which are shown in Table 2, middle column. The results show reliable differences between TD and each of the two other groups, with the TD group being more semantically similar to the gold standard transcripts than the groups with ASD. The two groups with ASD appear very similar to each other.

Figure 1: Semantic similarity for each diagnostic group as compared to the semantic content of Gold Standard transcripts

Comparison       All Words      Word Sampling
TD vs ASD        p < 0.001      p < 0.05
TD vs FXS-ASD    p < 0.00001    p < 0.001
Table 2: Significance levels for differences in semantic similarity between diagnostic groups and a gold standard. Both full transcripts as well as random word sampling from transcripts are reported.

4.2 Visualizing semantic space

To try to visualize the semantic space these transcripts are embedded in, we used Principal Component Analysis (Jolliffe, 2011) to reduce the 400-dimensional vectors to two dimensions. The results are visualized in Figure 2, where each transcript is identified by its group, and the two transcripts from which the gold standard was constructed have also been added. As can be seen, the ASD and FXS-ASD groups are much more dispersed than the typically developing group, which is relatively tightly clustered around the gold standard. This suggests a possible reason why the two groups with ASD were less similar to the gold standard than was the TD group: because they are more semantically variable.

Figure 2: Most informative dimensions using full transcripts, as selected by PCA, for different diagnostic groups. Clusters for children with autism are more diffuse.

4.3 Within-group variability

Finally, to test this hypothesis that variability within groups is higher for groups with ASD as opposed to the TD group, we measured the Manhattan distance between all pairs of transcript vectors within each group, using the full 400-dimensional transcript vectors. This distance is an indication of how far apart semantically two transcripts are. We elected to use Manhattan distance rather than Euclidean distance because of the former’s robustness in high-dimensional space (Aggarwal, Hinneburg, & Keim, 2001). The average distance between each vector pair in each group is reported in Table 3. The results show that the TD group is much more homogeneous, while the two groups with ASD exhibit larger variability. Interestingly, the two groups with ASD again appear very similar to each other.

Diagnostic Group    Manhattan Distance (mean)
TD                  3.44
FXS-ASD             4.42
ASD                 4.49
Table 3: Manhattan distance between all pairs of word vectors within each diagnostic group, for full transcripts.
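The vector operations described in Sections 3 and 4 can be sketched as follows. This is a minimal illustration, not the code in the released repository: the two-dimensional `toy_embeddings`, the function names, and the choice to skip words that have no embedding are all assumptions made for this example, whereas the study itself used pretrained word2vec vectors. The `cosine` function evaluates the normalized dot product on the right-hand side of Equation 1.

```python
import math
from itertools import combinations


def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def document_vector(words, embeddings):
    """Sum the embeddings of the words in a transcript, then normalize."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for word in words:
        vec = embeddings.get(word)
        if vec is None:
            continue  # assumption: words without an embedding are skipped
        total = [t + x for t, x in zip(total, vec)]
    return normalize(total)


def mean_vector(vectors):
    """Component-wise mean, used here to build the gold standard vector."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Normalized dot product of two vectors (right-hand side of Equation 1)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_manhattan(vectors):
    """Average L1 distance over all pairs of vectors within one group."""
    distances = [
        sum(abs(a - b) for a, b in zip(u, v))
        for u, v in combinations(vectors, 2)
    ]
    return sum(distances) / len(distances)


# Hypothetical two-dimensional embeddings, for illustration only.
toy_embeddings = {"dog": [1.0, 0.0], "cat": [0.0, 1.0], "toy": [1.0, 1.0]}

# Gold standard: mean of two (hypothetical) gold standard transcript vectors.
gold = mean_vector([
    document_vector(["dog", "toy"], toy_embeddings),
    document_vector(["cat", "toy"], toy_embeddings),
])

# "um" has no embedding in the toy vocabulary and is skipped.
transcript = document_vector(["dog", "dog", "cat", "um"], toy_embeddings)
score = cosine(transcript, gold)
```

Because the gold standard is a mean of unit vectors, it is not itself guaranteed to have unit length, which is why `cosine` divides by both norms rather than relying on a plain dot product.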
5 Experiment 2

The results of Expt. 1 showed strong evidence that individuals in the ASD and FXS-ASD groups were on average less semantically similar to a gold standard than TD individuals were, and that individuals from both groups with ASD were more semantically variable. However, it is also a feature of this dataset that there is a systematic relationship between the word count of a particular individual’s transcript and their semantic similarity. This is shown in Figure 3, where transcripts with a higher word count on average have higher similarity scores to the gold standard, within each of the three groups (although there is some suggestion for the ASD group that this effect may reverse for especially high word counts).

Figure 3: Child word counts versus semantic similarity to gold standard. As children produce more words, similarity to the gold standard also increases. Smoothing lines were calculated using a generalized additive model (GAM).

There are multiple, non-exclusive hypotheses for where this relationship arises. It may be an objective feature of language production that individuals with more language impairment talk less on average. Alternatively, it may be that the semantic similarity metric becomes noisier and thus lower with smaller language samples.

There are also systematic differences between groups in transcript length, visualized in Figure 4. On the hypothesis that differences in transcript length are an artifact of the measure, not necessarily related to language proficiency, this raises the possibility that the reason the TD group had more semantic similarity to the gold standard on average was that it had longer transcripts on average. To rule out this possibility, and gain some insight into the relationship between semantic similarity and transcript length, we developed a simple method to remove the variance in transcript length.

Figure 4: Child word counts for different diagnostic groups. Different diagnostic groups have markedly divergent distributions.

5.1 Word sampling

To remove this variance in transcript length, we applied a random sampling procedure. Specifically, we selected a target transcript length of 300 words, which was lower than the majority of transcript lengths in every diagnostic group. Then, we sampled, without replacement, 300 words from every transcript. For the 2 transcripts that fell below the 300-word threshold, we selected the entire transcript. With this new set of transcripts of a uniform length, we performed the same analyses as in Expt. 1. To leave intact full information about the gold standard, we left it unchanged from Expt. 1.

5.2 Similarity to gold standard

The similarity of these uniform-length transcripts to the gold standard is shown in Figure 5. As can be seen, even without transcript length differences, semantic similarity is still lower for the ASD and FXS-ASD groups than for the TD group. The results of a Wilcoxon test are given in the rightmost column of Table 2, showing that these differences are still significant, suggesting that the group differences obtained in Expt. 1 were not merely an artifact of shorter transcript lengths.

Comparing the values in Figure 5 to those in Figure 1 from Expt. 1, however, we can see that the similarity values here are a bit lower, especially those
from the TD group. That suggests that these semantic similarity metrics are indeed somewhat biased lower by the smaller samples. For work such as this applying vector semantic similarity metrics to transcripts from younger and/or lower functioning individuals, analyses like this may be a useful tool.

Figure 5: Semantic similarity for each diagnostic group as compared to the semantic content of Gold Standard transcripts, using a 300 word random sample

5.3 Visualizing semantic space

As in Expt. 1, we reduced all of the transcript vectors to two dimensions using PCA. Even when randomly selecting words, the same spatial relationships seen in Expt. 1 still hold, with greater dispersion for the ASD and FXS-ASD groups, and more concentration for the TD and gold standard groups.

Figure 6: Most informative dimensions using random word sampling, as selected by PCA, for different diagnostic groups. Clusters for children with autism are more diffuse.

5.4 Within-group variability

We aimed to understand whether the transcript vectors from randomly sampled words still had a very short distance between them, or whether the random word sampling obfuscated the similarities seen in Expt. 1. As seen in Table 4, the uniform-length transcripts still had the same qualitative distance relationships, with the vectors in the TD group being closer together than those within either of the two groups with ASD.

Diagnostic Group    Manhattan Distance (mean)
TD                  3.76
FXS-ASD             4.54
ASD                 4.67
Table 4: Manhattan distance between all pairs of word vectors within each diagnostic group, for random word samples.

Comparing Table 4 to its Expt. 1 equivalent, Table 3, we see that the mean distance between TD group transcripts is about 0.3 units higher here, whereas the ASD and FXS-ASD group distances are only 0.1 to 0.2 units higher. This result suggests again, however, that there is some evidence that the variability within groups seen in Expt. 1 was biased to be somewhat higher for the groups with shorter transcripts, and that this method can correct for that bias.

6 Discussion

In this study, we used vector semantics to show that semi-structured conversations produced by individuals with ASD were semantically further from a gold standard conversational sample by children with typical development, and that there was more variability within the groups with ASD than within the typically developing group. We also presented evidence that these semantic similarity and distance measures were moderately biased by transcript length, with low transcript lengths yielding larger semantic distances from a gold standard as well as yielding more within-group variability. Finally, we showed that this bias could not explain the differences in semantic similarity and distance across groups.

Many previous studies applying vector semantics to the language of autism relied upon narratives (Losh & Capps, 2003; Prud’hommeaux et al.,
2011; Rouhizadeh, Prud’hommeaux, van Santen, & Sproat, 2014; Losh & Gordon, 2014; Lee et al., 2017; inter alia), for which there is an objective gold standard of semantic evaluation, i.e., the original narrative. The present study demonstrates that such methods can be extended to the more naturalistic context of semi-structured conversation. This is important because narrative retellings are expected to use a more constrained vocabulary; thus, unexpected words are even more surprising when the vocabulary is expected to be more minimal. On the other hand, it is easy to assume that a naturalistic conversation would have a large variety of vocabulary, making unexpected words less surprising and less telling, since a conversation can have a wider array of topic areas to discuss. Nonetheless, even when participating in a more unconstrained conversation, our study still picked up semantic differences for the ASD groups, despite the more freewheeling dialogue.

The results of this study of semi-structured conversation parallel those of Losh and Gordon (2014) and Lee et al. (2017) on narratives. All studies showed that individuals with ASD produced language semantically further from a gold standard of typical development. All studies also showed that language produced by typically developing children clustered together more closely in semantic space, whereas that from children with ASD was more variable and diffuse in semantic space (cf. Rouhizadeh et al., 2014). These findings suggest that greater semantic variability may be a general property of the language of ASD not confined to narrative retellings.

To our knowledge, this is the first study to present a method for analyzing semantic similarities in the presence of strong differences in word length of transcripts across individuals and groups, which may be a common occurrence in work analyzing populations that differ in conversation length. Since MLU has been commonly found to be strongly correlated to (chronological) age (Miller & Chapman, 1981), and would lead to more words in a conversation, this may be a common issue for younger populations, as well as those that are lower functioning.

Interestingly, even when reducing the lengths of the transcripts by a substantial amount (often more than 50%) via random sampling, the transcripts of individuals with ASD were still significantly semantically further from the gold standard than were those of typically developing individuals. This suggests that the language of ASD may be pervasive in speech and detectable from even shorter samples.

The question remains, though: what exactly are we detecting through different mean semantic similarities for each group? Are we picking up different styles of language use, or are we picking up the same type of language, just describing different topics? Either of those differences could affect semantic similarity in a similar fashion.

To test this, we looked at the transcripts’ connections to type-token ratio, which measures lexical diversity. We ran both Pearson and Spearman correlations between semantic similarity and type-token ratio (TTR). TTR is the ratio of the number of word types to the number of word tokens. If lexical diversity was significantly different between groups in the same way that semantic similarity was different between groups, this would be a strong indicator that the difference in group semantic similarity was due to the use of different types of language, or at least different levels of lexical diversity. However, this correlation was very low and neither individual groups nor overall correlations approached statistical significance.

This result suggests that linguistic style (at least as indexed by the type-token ratio) is not a driving factor. Instead, we are left with the most glaring differences being due to using the same type of language while discussing different topics, and thereby using different words to describe different topics.

Perhaps this underscores the idiosyncratic nature of utterances from both the ASD and FXS-ASD groups. In other words, even though conversational language is more unconstrained than narrative retellings, Figures 2 and 6 still seem to illustrate that TD participants use a more confined vocabulary. This may seem counter-intuitive, as a more natural conversation can be assumed to be diverse, with a less constrained vocabulary. This seems to point to the robustness of word vectors, since they can still capture the diversity of the uncommon words often selected by children with ASD and FXS-ASD, while downplaying the natural diversity that is expected from conversations with typically-developing children.
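Two of the procedures discussed here, the fixed-length word sampling of Section 5.1 and the type-token ratio check, are simple enough to sketch in a few lines. The following is a minimal illustration under stated assumptions rather than the released implementation: the function names, the `seed` parameter, and the average-rank handling of ties in `ranks` are choices made for this example only.

```python
import math
import random
from statistics import mean


def sample_words(words, target=300, seed=None):
    """Draw `target` words without replacement; keep short transcripts whole."""
    if len(words) <= target:
        return list(words)
    rng = random.Random(seed)  # seed is an illustrative convenience for reproducibility
    # Order is not preserved, which is harmless for a summed (bag-of-words) vector.
    return rng.sample(list(words), target)


def type_token_ratio(words):
    """Number of distinct word types divided by the number of word tokens."""
    return len(set(words)) / len(words)


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical transcript of 500 distinct tokens, reduced to 300 words.
long_transcript = [f"w{i}" for i in range(500)]
sampled = sample_words(long_transcript, target=300, seed=0)
short_transcript = sample_words(["a", "b"], target=300)
```

In use, `type_token_ratio` would be computed per transcript and correlated with the per-transcript similarity scores via `pearson` and `spearman`, either within each diagnostic group or over all transcripts.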
6.1 Conclusion

The primary contribution of this study is to show that vector semantics distinguishes the language of individuals with ASD from the language of individuals who are typically developing, even when that language was produced in a semi-structured conversational setting with no objective semantic standard, which is the language context where individuals with ASD exhibit the most severe impairments. It also presented evidence that such semantic metrics can be applied to populations who yield smaller word counts (such as younger, lower functioning children with intellectual disability), despite short transcripts biasing the semantic metrics, and presented a simple method to help quantify and control this bias that can be implemented across age and ability levels.

This work represents a step toward developing a metric of language impairment in ASD that is empirically quantifiable, objective, and automatically generated. Such a metric has the potential to improve clinical assessments, offer objective quantitative indices of language impairment that could be used to stratify groups in biological studies, and possibly serve as a sensitive measure of response to clinical interventions. The results here were based on conversational data from the ADOS assessment, a standard assessment for ASD that includes semi-structured conversational samples. This is a more generalized form of dialogue than the narrative retellings used in many previous studies. This finding, then, opens the door to investigating more diverse discourse settings where semantic similarity tests might be effectively implemented. For instance, if a classroom or family dinnertime setting could be used as a reliable source, then a child would not need to be removed from their typical daily activities in order to be tested. Such assessments would also capture pragmatic impairments in more naturalistic settings, affording more generalizable findings.

Future work will investigate the seeming disparity between extremely uncommon word choice and word choice that is too similar to that employed by a conversation partner. Perhaps these phenomena cancel each other out when an entire conversation is considered, leaving both of these idiosyncrasies diluted by averaging. We plan on investigating this alongside semantic similarity as a means to predict, not merely quantify, ASD.

We also have data collected from the same participants at subsequent time points, which will allow us to test the rate of change for individuals. For example, if language production did not advance at an expected rate over a period of time, this could also be a sign of developmental deficits.

Finally, by quantifying the language of ASD in a continuous way, this method has possible applications to genetic studies of ASD, where quantitative as opposed to categorical phenotypic measures afford greater power to detect molecular genetic associations for complex traits and disorders such as ASD. For example, it could be used to quantify the extent to which family members of individuals with ASD, who are themselves typically developing, nevertheless exhibit values of this continuous measure that more closely resemble the language of ASD. More generally, these findings suggest that the linguistic signatures of ASD pervade child speech, and may be automatically detectable under wider conditions than previously demonstrated.

Acknowledgements

This research was supported by NIDCD R01DC010191 (to Losh), NICHD R01 HD038819 (to Martin), and a Northwestern Data Science Institute grant (to Bicknell and Losh).

References

Abbeduto, L., McDuffie, A., & Thurman, A. J. (2014). The fragile X syndrome–autism comorbidity: what do we really know? Frontiers in genetics, 5.
Aggarwal, C. C., Hinneburg, A., & Keim, D. A. (2001). On the surprising behavior of distance metrics in high dimensional spaces. In ICDT (Vol. 1, pp. 420–434).
American Psychiatric Association. (2000). Task force on DSM-IV. Diagnostic and statistical manual of mental disorders: DSM-IV-TR. Washington, DC: American Psychiatric Association, 4.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders: DSM-5. Washington, DC: American Psychiatric Association.
Capps, L., Kehres, J., & Sigman, M. (1998). Conversational abilities among children with autism and children with developmental delays. Autism, 2(4), 325–344.
Chung, C., & Pennebaker, J. W. (2007). The psychological functions of function words. Social communication, 343–359.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American society for information science, 41(6), 391.
Jolliffe, I. T. (2011). Principal component analysis. In International encyclopedia of statistical science.
Kanner, L. (1943). Autistic disturbances of affective contact. Nervous child, 2(3), 217–250.
Kaufmann, W. E., Cortell, R., Kau, A. S., Bukelis, I., Tierney, E., Gray, R. M., . . . Stanard, P. (2004). Autism spectrum disorder in fragile X syndrome: communication, social interaction, and specific behaviors. American Journal of Medical Genetics Part A, 129(3), 225–234.
Lee, M., Martin, G. E., Hogan, A., Hano, D., Gordon, P. C., & Losh, M. (2017). What’s the story? A computational analysis of narrative competence in autism. Autism, 1362361316677957.
Levy, O., Goldberg, Y., & Dagan, I. (2015). Improving distributional similarity with lessons learned from word embeddings. TACL, 3, 211–225.
Lord, C., Risi, S., Lambrecht, L., Cook, E. H., Leventhal, B. L., DiLavore, P. C., . . . Rutter, M. (2000). The autism diagnostic observation schedule-generic: A standard measure of social and communication deficits associated with the spectrum of autism. Journal of autism and developmental disorders, 30(3), 205–223.
Lord, C., Rutter, M., & Le Couteur, A. (1994). Autism diagnostic interview-revised: a revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. Journal of autism and developmental disorders, 24(5), 659–685.
Losh, M., & Capps, L. (2003). Narrative ability in high-functioning children with autism or Asperger’s syndrome. Journal of autism and developmental disorders, 33(3), 239–251.
Losh, M., & Capps, L. (2006). Understanding of emotional experience in autism: insights from the personal accounts of high-functioning children with autism. Developmental psychology, 42(5), 809.
Losh, M., & Gordon, P. C. (2014). Quantifying narrative ability in autism spectrum disorder: A computational linguistic analysis of narrative coherence. Journal of autism and developmental disorders, 44(12), 3016–3025.
Losh, M., Sullivan, P. F., Trembath, D., & Piven, J. (2008). Current developments in the genetics of autism: from phenome to genome. Journal of Neuropathology & Experimental Neurology, 67(9), 829–837.
Martin, G. E., Barstein, J., Hornickel, J., Matherly, S., Durante, G., & Losh, M. (2017). Signaling of noncomprehension in communication breakdowns in fragile X syndrome, Down syndrome, and autism spectrum disorder. Journal of Communication Disorders, 65, 22–34.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (pp. 3111–3119).
Miller, J. F., & Chapman, R. S. (1981). The relation between age and mean length of utterance in morphemes. Journal of Speech, Language, and Hearing Research, 24(2), 154–161.
Naigles, L. R., Cheng, M., Rattanasone, N. X., Tek, S., Khetrapal, N., Fein, D., & Demuth, K. (2016). “You’re telling me!” The prevalence and predictors of pronoun reversals in children with autism spectrum disorders and typical development. Research in autism spectrum disorders, 27, 11–20.
Nazeer, A., & Ghaziuddin, M. (2012). Autism spectrum disorders: clinical features and diagnosis. Pediatric Clinics of North America, 59(1), 19–25.
Prud’hommeaux, E. T., Roark, B., Black, L. M., & Van Santen, J. (2011). Classification of atypical language in autism. In Proceedings of the 2nd workshop on cognitive modeling and computational linguistics (pp. 88–96).
Rogers, S. J., Wehner, E. A., & Hagerman, R. (2001). The behavioral phenotype in fragile X: symptoms of autism in very young children with fragile X syndrome, idiopathic autism, and other developmental disorders. Journal of developmental & behavioral pediatrics, 22(6), 409–417.
Rouhizadeh, M. (2015). Computational analysis of language use in autism. Oregon Health & Science University.
Rouhizadeh, M., Prud’hommeaux, E., Roark, B., & Van Santen, J. (2013). Distributional semantic models for the evaluation of disordered language. In Proceedings of the conference. association for computational linguistics. north american chapter. meeting (Vol. 2013, p. 709).
Rouhizadeh, M., Prud’hommeaux, E., van Santen, J., & Sproat, R. (2014). Detecting linguistic idiosyncratic interests in autism using distributional semantic models. In The workshop on computational linguistics and clinical psychology: From linguistic signal to clinical reality (pp. 46–50).
Tager-Flusberg, H., & Sullivan, K. (1995). Attributing mental states to story characters: A comparison of narratives produced by autistic and mentally retarded individuals. Applied Psycholinguistics, 16(3), 241–256.
Wechsler, D. (2008). Wechsler adult intelligence scale–fourth edition (WAIS–IV). San Antonio, TX: NCS Pearson, 22, 498.