Journal of Second Language Writing 28 (2015) 53–67

Different topics, different discourse: Relationships among writing topic, measures of syntactic complexity, and judgments of writing quality
Weiwei Yang a,*, Xiaofei Lu b, Sara Cushing Weigle c
a College of Foreign Languages, Nanjing University of Aeronautics and Astronautics, 29 Jiangjun Ave., Nanjing, Jiangsu 211106, China
b Department of Applied Linguistics, The Pennsylvania State University, 304 Sparks Building, University Park, PA 16802, USA
c Department of Applied Linguistics and ESL, Georgia State University, P.O. Box 4099, Atlanta, GA 30302-4099, USA

Abstract
This study examined the relationship between syntactic complexity of ESL writing and writing quality as judged by human
raters, as well as the role of topic in the relationship. Syntactic complexity was conceptualized and measured as a multi-dimensional
construct with interconnected sub-constructs. One hundred and ninety ESL graduate students each wrote two argumentative essays
on two different topics. It was found that topic had a significant effect on syntactic complexity features of the essays, with one topic
eliciting a higher amount of subordination (finite and non-finite) and greater global sentence complexity and the other eliciting more
elaboration at the finite clause level (in particular, coordinate phrases and complex noun phrases). Local-level complexity features
that were more prominent in essays on one topic (i.e., subordination and elaboration at the finite clause level) tended not to correlate
with scores for that topic. Rather, a reversed pattern was observed: the less prominent local-level complexity features for essays on
one topic tended to have a stronger correlation with scores for that topic. Regression analyses revealed global sentence and T-unit
complexity as consistently significant predictors of scores across the two topics, but local-level features exhibited varying predictive power for scores on the two topics.
© 2015 Elsevier Inc. All rights reserved.

Keywords: Syntactic complexity; ESL writing performance; Topic effect

Introduction

The inquiry into syntactic complexity of writing and its relationship with writing quality is not new. However, as
Ortega (2003) points out, many early second language (L2) studies in this area suffer from problems of small sample
sizes and homogeneity of learner proficiency, often yielding conflicting findings. Furthermore, given the relatively
large number of syntactic complexity measures that have been used (see Lu, 2011; Ortega, 2003; Wolfe-Quintero,
Inagaki, & Kim, 1998), we cannot assume that the relationship between syntactic complexity and writing quality is the
same across the different measures (Norris & Ortega, 2009). The number of measures that exist also invites the

* Corresponding author. Tel.: +86 15261807574.


E-mail addresses: [email protected] (W. Yang), [email protected] (X. Lu), [email protected] (S.C. Weigle).

http://dx.doi.org/10.1016/j.jslw.2015.02.002
1060-3743/© 2015 Elsevier Inc. All rights reserved.

question of what the construct really is and what measures are appropriate. Norris and Ortega (2009) usefully propose
examining syntactic complexity as a multi-dimensional construct. To date, however, this proposal has been adopted by
very few studies (see, e.g., Byrnes, Maxim, & Norris, 2010). Additionally, while some research suggests that variations
in writing tasks can influence the linguistic features of texts and the writing scores given to those texts, the role of
writing topic has not been given due attention in studies of the relationship between syntactic complexity and writing
quality, although the very few studies that touched upon this issue suggest that topic effects can be expected
(Crowhurst & Piche, 1979; Tedick, 1990). In this study, we hope to circumvent the limitations of previous studies by
measuring syntactic complexity as a multi-dimensional construct and using a larger sample size. We also explore the
role of writing topic in the relationship between syntactic complexity and writing quality. In the rest of this section, we
review the related literature, first establishing syntactic complexity as a multi-dimensional construct and then
synthesizing related studies. Then, we present the methodology and results of our study and discuss the findings as
well as their implications for syntactic complexity research and L2 writing assessment.

Syntactic complexity as a multi-dimensional construct

In linguistic theories, syntactic complexity traditionally refers to compound and complex sentences, i.e., clausal
complexity (see Diessel, 2004; Ravid & Berman, 2010). In some linguistic traditions, the notion of syntactic complexity
has not extended to phrasal complexity (see, e.g., Givón, 2009; Givón & Shibatani, 2009). However, in another view
emerging in L1 and L2 developmental studies focusing on syntactic maturity (e.g., Cooper, 1976; Crossley, McNamara,
Weston, & McLain Sullivan, 2011; Hunt, 1965; Lu, 2011; Ravid & Berman, 2010) and discourse analysis of texts in
different genres (e.g., Biber, 2006; Biber, Gray, & Poonpon, 2011; Ravid & Berman, 2010), phrasal complexity
(particularly noun phrase complexity) has been considered an integral part of syntactic complexity.
What complicates the construct of syntactic complexity further is that the notion of clause has not been defined
consistently across disciplines. Notably, linguistic theories of grammar (Cristofaro, 2003; Givón, 2009; Halliday &
Matthiessen, 2004; Langacker, 2008) count both finite and non-finite clauses as clauses. In writing research, however,
following Hunt’s (1965) definition, the term clause has been predominantly used to refer only to finite clauses.
Therefore, when calculating an index such as number of clauses per sentence as a syntactic complexity measure,
discrepancy in results may arise due to the different definitions of clause adopted. There may be no easy answer as to
which definition of clause is more appropriate, but we adopt the view that both finite clauses and non-finite elements
should be examined as part of the construct. However, to maintain consistency with previous writing research, we
use the term clause to refer to finite clauses only and use the term non-finite element to refer to non-finite clauses.
In alignment with grammar theories, we see both finite dependent clauses and non-finite elements as representing subordination.

Overall Sentence Complexity (Mean Length of Sentence: MLS)
    Clausal Coordination (T-units per Sentence: TU/S)
    Overall T-unit Complexity (Mean Length of T-unit: MLTU)
        Clausal Subordination, Finite (Dependent Clauses per T-unit: DC/TU)
        Elaboration at Clause Level (Mean Length of Clause: MLC)
            Non-finite Elements/Subordination (Non-finite Elements per Clause: NFE/C)
            Phrasal Coordination (Coordinate Phrases per Clause: CP/C)
            Noun-Phrase Complexity (Complex Noun Phrases per Clause: CNP/C)

Fig. 1. A multi-dimensional representation of syntactic complexity.



In general, we agree with Norris and Ortega's (2009) conceptualization of syntactic complexity as a
multi-dimensional construct represented at the levels of global complexity, clausal subordination (finite), clausal
coordination, and sub-clausal elaboration (including non-finite elements/subordination, phrasal coordination, and
noun phrase complexity).
Based on Norris and Ortega (2009), the diagram in Fig. 1 displays our conceptualization of this multi-dimensional
construct and the hierarchical relationships among the sub-constructs. Laid out in the parentheses for each sub-
construct are the indices selected from previous literature that can best measure the constructs in our framework. The
distinct, discrete sub-constructs for syntactic complexity are found at the terminal nodes, thus including clausal
coordination, clausal subordination (finite), non-finite elements/subordination, phrasal coordination, and noun phrase
complexity. Non-terminal nodes in the diagram are composites of discrete sub-constructs, including elaboration at the
clause level, overall T-unit complexity, and overall sentence complexity, with each composite forming a higher level
and with overall sentence complexity essentially encapsulating all sub-constructs. In this study, mean length of
sentence and mean length of T-unit are seen as global complexity measures, and the other six measures are seen as
local-level complexity measures.
When selecting the measures for the discrete sub-constructs (i.e., the ones at the terminal nodes in the diagram),
we ensured that they not only represent the sub-constructs well but also show the sub-constructs to be non-overlapping
and distinct ones. The following complex sentence illustrates how the five discrete sub-constructs are represented
distinctly with the measures we selected:
It was quite a difficult decision for him, but he decided that he would go back to college but keep his job in order to pay off the tuition, fees, and living expenses.

The sentence contains 2 coordinate clauses, 1 finite dependent clause, 1 non-finite element, 2 coordinate phrases, and 1 complex noun phrase.

The measures selected for the composites of sub-constructs (i.e., the ones at the non-terminal nodes in the diagram)
are overall length measures that have been commonly used in the writing literature. These measures represent the sub-
constructs holistically rather than discretely, with the assumption that the length of the analytical unit for a composite
level will increase when one or more sub-constructs under that level are utilized. An obvious drawback of such
measures is that one would not know which discrete sub-constructs, if any, make a difference in the analysis in hand.
However, such composites of sub-constructs are useful when none of the discrete sub-constructs makes a difference on
its own. For the same example sentence shown above, we display below how the composite sub-constructs are
represented with these length measures.1

It was quite a difficult decision for him, but he decided¹ that he would go back to college but keep his job in order to pay off the tuition, fees, and living expenses.

Segmented for the length measures, the example consists of 1 sentence (33 words), 2 T-units (8 and 25 words), and 3 clauses (8, 3, and 22 words).
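Putting the two annotated versions of the example together, the worked calculation below (our illustration, not part of the original figures) shows the value each of the eight indices takes for this single sentence:

```python
# Worked illustration for the example sentence: 33 words, 1 sentence, 2 T-units,
# 3 finite clauses, 1 finite dependent clause, 2 coordinate phrases,
# 1 complex noun phrase, and 1 non-finite element, as annotated above.
words, sentences, t_units, clauses = 33, 1, 2, 3
dep_clauses, coord_phrases, complex_nps, nonfinite_elements = 1, 2, 1, 1

mls   = words / sentences              # MLS   = 33.0
tu_s  = t_units / sentences            # TU/S  = 2.0
mltu  = words / t_units                # MLTU  = 16.5
dc_tu = dep_clauses / t_units          # DC/TU = 0.5
mlc   = words / clauses                # MLC   = 11.0
cp_c  = coord_phrases / clauses        # CP/C  ≈ 0.67
cnp_c = complex_nps / clauses          # CNP/C ≈ 0.33
nfe_c = nonfinite_elements / clauses   # NFE/C ≈ 0.33
```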

¹ One anonymous reviewer questioned counting "but he decided" as a three-word clause, although the dependent clause following it serves as the object of the main clause. To the best of our knowledge, to calculate the index of "mean length of clause", text length (i.e., the total number of words in a text) is divided by the total number of clauses in the text. "[But] he decided" in the example is counted as a clause and is thereby technically counted as a three-word clause. See Hunt (1965) for such a mean clause length calculation method. Byrnes, Maxim, and Norris (2010) have similar examples of clause segmentation. The computational tool we used for our analysis counts such a unit as three words as well. We do see the limitation with this method of counting, particularly when finite object clauses are used frequently in a text, which can greatly reduce mean clause length.

Syntactic complexity and writing quality

A number of L2 and L1 studies have looked into the relationship between syntactic complexity and writing quality,
with the latter typically indicated by holistic or analytic ratings of essays. While language proficiency is often assumed
as a given in L1 writing studies, writing specialists agree that L2 writing ability involves both L2 proficiency and
writing ability (see, e.g., Cumming, 1989; Weigle, 2002). A reasonable hypothesis regarding L2 writing is that
increased language proficiency involves control over increasingly complex syntactic structures, while increased
writing ability involves the successful deployment of these linguistic resources in the service of specific writing goals.
A score on an L2 writing test may thus be an indicator of language proficiency, writing ability, or both, depending on
the nature and purpose of the assessment and the scoring criteria. In this paper, we adopt the view that L2 writing
quality (as judged by raters) is a function of both writing ability and language proficiency.
Among the multiple dimensions of syntactic complexity, few have been examined in terms of their relationship
with L2 writing quality. Such studies have tended to employ overall length measures, with mean length of T-unit
(MLTU) being the most commonly used, along with mean length of sentence (MLS) and mean length of clause
(MLC). Clausal subordination (finite) has been of considerable interest as well, typically measured by clauses per
T-unit (C/TU). Previous results on these measures have been rather mixed (see also Ortega, 2003). For the
relationship between MLTU and writing quality, both significant (e.g., Homburg, 1984; Kameen, 1979) and non-
significant findings (e.g., Larsen-Freeman & Strom, 1977; Nihalani, 1981) have been reported. Similarly, for the
relationship between finite clausal subordination and writing quality, both significant (e.g., Flahive & Snow, 1980;
Homburg, 1984) and non-significant relationships (e.g., Bardovi-Harlig & Bofman, 1989; Kameen, 1979; Perkins,
1980) have been identified. Fewer studies have examined MLS and MLC, but these studies have revealed a
significant relationship between complexity at those levels and writing quality (Homburg, 1984; Kameen, 1979).
The same set of complexity variables have been examined in L1 studies on the relationship between syntactic
complexity and writing quality as well, and these studies paint an equally unclear picture (for reviews, see
Crowhurst, 1983; Hillocks, 1986).
As Ortega (2003) points out, previous L2 inquiries into the relationship between syntactic complexity and writing
quality suffered from several limitations. Many of them used homogenous language-proficiency groups (thereby
yielding small between-group variance) and had small sample sizes. Further, they typically employed analysis of
variance rather than correlation, so that much of the information in the continuous data was discarded. These limitations left early studies underpowered for statistical testing, i.e., less able to reveal significant findings when they existed. Consequently,
conclusions from previous studies must be interpreted with caution. More empirical studies that can avoid or reduce
these limitations are much needed.

Topic effect on syntactic complexity

The effects of variables related to writing tasks or prompts on textual features of the written product and scores or
ratings have been studied in both the L1 and L2 writing literature (for overviews of this literature, see Shaw & Weir,
2007; Weigle, 2002). These variables include, among others, genre (e.g., letters, essays, and reports), discourse mode
(e.g., narrative, expository, and argumentative) and dimensions of the topic or subject matter itself (e.g., personal or
impersonal, discipline-specific or general, and familiar or unfamiliar). The existing literature points to a general
agreement that discourse mode affects syntactic complexity in writing, with potentially different effects for different
syntactic complexity dimensions (e.g., Crowhurst & Piche, 1979; Lu, 2011; Ravid, 2004; San Jose, 1972), as well as an
emerging picture that the relationship between syntactic complexity and writing quality is dependent on discourse
mode (e.g., Beers & Nagy, 2009; Crowhurst, 1980; Spaan, 1993). However, less is known about the effect of topics
within genres or discourse modes on syntactic complexity and how topic may play a role in the relationship between
complexity and writing quality. For the purposes of this paper, we define topic as what is exactly construed by the
writing prompt (i.e., the actual wording of the writing task) and what the writers are invited to specifically write about.
For a prompt of ‘‘What are the advantages and disadvantages of homeschooling?’’, the topic is not simply
homeschooling, but specifically the pros and cons of homeschooling. In assessment settings, an examination of topic is
as interesting and as important as other task variables, since much is to be learned about topic features that may affect
equivalence of writers’ linguistic and writing performance across different topics, a condition for the reliability of an
assessment.

In the analysis of topics, we have found the literature on task-based language teaching (TBLT) relevant and useful.
TBLT is an approach to L2 curriculum design, including instruction and assessment. In this approach, real-world tasks
involving the use of language and pedagogical tasks conducive to promoting learners’ ability to perform real-world
tasks are used as the basic units to organize instruction and assessment (Ellis, 2003; Long & Crookes, 1993; Skehan,
1998). Writing an essay, as we see it, is a real-life task that may need to be performed in various situations. In the TBLT
literature, there is explicit theorizing of the effect of task variables on syntactic complexity of language production,
along with other performance features. There are some overlaps in terms of what dimensions of topics have been
examined between the TBLT literature and the L1 and L2 writing literature, such as personal versus impersonal topics
and familiar versus unfamiliar topics (see the task complexity framework in Skehan, 1998, 2014 in particular). What is
perhaps more enlightening and relevant to the current study are the resource-directing dimensions in Robinson's (2001, 2007, 2011) task cognitive complexity framework, which currently includes six dimensions: +/− here and now, +/− few elements, +/− spatial reasoning, +/− causal reasoning, +/− intentional reasoning, and +/− perspective-taking, where the +/− signs denote with/without or more/less. These dimensions are seen to make cognitive/conceptual
demands on learners that can direct the learners’ attention to form-function mappings. Robinson hypothesizes that
increased task complexity along these dimensions will lead to higher syntactic complexity in language production.
For example, learners will produce syntactically more complex language for tasks that require causal reasoning than
for those that require no or less causal reasoning. Robinson's theorizing of such an effect is based in large part on Givón's (1985) notion that "greater structural complexity tends to accompany greater functional complexity in
syntax’’ (p. 1021). We found Robinson’s task complexity dimensions pertinent to the tasks used in our study and will
therefore employ them in our analysis of topic effect.
Overall, the current study aims to fill the research gaps by examining syntactic complexity as a multi-dimensional
construct, and by considering the role of writing topic in the relationship between syntactic complexity and writing
quality.
The study aims to address the following three research questions:

1. What is the effect of topic on syntactic complexity (with its different dimensions) of ESL students’ writing?
2. What is the relationship between syntactic complexity (with its different dimensions) and quality of ESL students’
writing?
3. What is the predictive power of syntactic complexity (with its different dimensions) on ESL students’ writing
quality?

Material and methods

Participants and data

The dataset used in this study was a subset of the essays collected by Weigle (2011) for a study investigating the
validity of automated scores of TOEFL iBT independent writing tasks. Weigle collected essays on two different
prompts from each of 386 nonnative English-speaking students studying at eight different institutions in the United
States. Her participants included matriculated undergraduate and graduate students from 10 fields of study as well as
non-matriculated English language students enrolled in language programs. The first prompt asked students to discuss
whether people place too much emphasis on personal appearance (hereafter, the appearance topic). The second prompt
asked students to discuss whether careful planning while young ensures a good future (hereafter, the future topic). The
two writing tasks do not differ in genre or discourse mode, as they are both argumentative essays, nor do they differ in
the dimensions of being personal versus impersonal, or familiar versus unfamiliar. However, they do differ in one of
the cognitive complexity dimensions in Robinson's (2001, 2007, 2011) framework, namely, causal reasoning. Robinson
(2005) defines causal reasoning as ‘‘justify[ing] beliefs, and support[ing] interpretations of why events follow each
other by giving reasons.’’ The future topic tends to elicit causal reasoning in the sense that it requires the writers to
justify why a good future follows or does not follow careful planning, whereas the appearance topic does not.
The participants had 30 minutes to write each essay. Half of them wrote on the appearance topic first and half on the
future topic first. Each essay was rated on a five-point scale by two human raters (out of a pool of six trained raters)
using the TOEFL iBT Independent writing scoring guide (ETS, 2008), with a third rater adjudicating if the scores
of the two raters were separated by more than one point; only four percent of essays required such adjudication.

Table 1
Descriptive statistics for essay scores.
Topic         N      Length (words): mean (SD)    Score: mean (SD)
Appearance    190    309.22 (84.11)               3.60 (0.70)
Future        190    348.15 (104.53)              3.70 (0.80)

The TOEFL rubric covers rating descriptors in the areas of task fulfillment, ideas development and organization, unity
and coherence, and language use. The rubric does not explicitly address syntactic complexity as a criterion; however,
higher scoring essays are expected to demonstrate ‘‘syntactic variety.’’
In our study, we used 380 essays written by 190 matriculated graduate students whose essay and score data were
available for both prompts from Weigle’s (2011) study. We chose to use graduate student data because the reliability of the
automated tool we used to calculate the syntactic complexity indices has not been established for lower proficiency ESL
writers, and English proficiency requirements tend to be more stringent for graduate programs than for undergraduate
admission (typically above 79 for TOEFL iBT or above 550 for paper-based TOEFL). Thus, these participants’ L2
English proficiency can be said to be at the range of intermediate-high to advanced levels. Participant age ranged from
21 to 46 years, with a mean of 27.27; 109 were female, and 80 were male. Ten major fields of study were represented, the
most common being social sciences (60 participants), business (37 participants), and natural sciences (23 participants);
38 L1 backgrounds were represented, with Chinese being the most common native language (87), followed by Korean
(18) and Japanese (10). Table 1 provides the descriptive statistics for the essays in the dataset. The average of the ratings
given by the two human raters was taken as the measure of writing quality for each essay.

Syntactic complexity measurement

The syntactic complexity of each essay was assessed using eight different measures representing the eight
interconnected sub-constructs laid out in the Introduction. These include mean length of sentence (MLS), T-units per
sentence (TU/S), mean length of T-unit (MLTU), mean length of clause (MLC), dependent clauses per T-unit (DC/
TU), coordinate phrases per clause (CP/C), complex noun phrases per clause (CNP/C), and non-finite elements per
clause (NFE/C). The definitions of the eight measures and the sub-constructs they represent are summarized in
Table 2.
The essays were analyzed using the L2 syntactic complexity analyzer (L2SCA) (Lu, 2010), with some minor
adaptations. This analyzer takes a written English text as input, produces frequency counts of nine linguistic units
in the text—word, sentence, clause, dependent clause, T-unit, complex T-unit, coordinate phrase, complex
nominal, and verb phrase—and generates 14 indices of syntactic complexity for the text. We followed Lu’s (2010,
2011) definitions for most of the linguistic units and computed six measures—MLS, MLTU, MLC, TU/S, DC/TU,
and CP/C—with the original version of L2SCA.

Table 2
Syntactic complexity measures.
Sub-construct                        Measure                                    Definition
Overall sentence complexity          Mean length of sentence (MLS)              Number of words divided by number of sentences
Clausal coordination                 T-units per sentence (TU/S)                Number of T-units divided by number of sentences
Overall T-unit complexity            Mean length of T-unit (MLTU)               Number of words divided by number of T-units
Clausal subordination                Dependent clauses per T-unit (DC/TU)       Number of dependent clauses divided by number of T-units
Elaboration at clause level          Mean length of clause (MLC)                Number of words divided by number of clauses
Phrasal coordination                 Coordinate phrases per clause (CP/C)       Number of coordinate phrases divided by number of clauses
Noun phrase complexity               Complex NPs per clause (CNP/C)             Number of complex NPs divided by number of clauses
Non-finite elements/subordination    Non-finite elements per clause (NFE/C)     Number of non-finite elements divided by number of clauses

Then, following Biber et al. (2011), we defined complex noun phrases as noun phrases that contain one or more of the following: pre-modifying adjectives, post-modifying prepositional phrases, and post-modifying appositives. The pattern used to identify complex nominals in the
original L2SCA was modified accordingly to match this definition in order to calculate CNP/C. This measure does
not show the full range of NP complexity measures (cf. Bulté & Housen, 2012; Ravid & Berman, 2010); however,
other types of NP complexity such as relative clauses and non-finite modifications of nouns are captured within the
measures of DC/TU and NFE/C. Finally, to calculate non-finite elements per clause, we subtracted 1 from the
measure of verb phrases per clause, since by definition a clause contains one finite VP, and the other VPs are
therefore non-finite.
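To make the computation of these indices concrete, the sketch below (our own illustration, not the L2SCA source code; the count field names are hypothetical) derives the eight measures from per-essay frequency counts of the kind the analyzer outputs, including the NFE/C derivation just described (verb phrases per clause minus 1):

```python
# Illustrative sketch: deriving the eight syntactic complexity indices used in this
# study from frequency counts per essay. Field names are hypothetical.
from dataclasses import dataclass

@dataclass
class EssayCounts:
    words: int
    sentences: int
    t_units: int
    clauses: int             # finite clauses only, following Hunt (1965)
    dependent_clauses: int   # finite dependent clauses
    coordinate_phrases: int
    complex_nps: int         # NPs with pre-modifying adjectives, post-modifying PPs, or appositives
    verb_phrases: int        # finite and non-finite verb phrases

def complexity_indices(c: EssayCounts) -> dict:
    return {
        "MLS":   c.words / c.sentences,               # overall sentence complexity
        "TU/S":  c.t_units / c.sentences,             # clausal coordination
        "MLTU":  c.words / c.t_units,                 # overall T-unit complexity
        "DC/TU": c.dependent_clauses / c.t_units,     # finite clausal subordination
        "MLC":   c.words / c.clauses,                 # elaboration at the clause level
        "CP/C":  c.coordinate_phrases / c.clauses,    # phrasal coordination
        "CNP/C": c.complex_nps / c.clauses,           # noun phrase complexity
        "NFE/C": c.verb_phrases / c.clauses - 1,      # non-finite elements per clause (VP/C minus 1)
    }
```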

Statistical analysis

The complexity indices and writing scores of the essays were analyzed to answer the three research questions. First,
dependent samples t tests were conducted to examine the effect of writing topic on the syntactic complexity of the students’
writing. Second, Pearson’s product-moment correlations between syntactic complexity indices and writing scores were
calculated for each topic to identify the relationship between syntactic complexity and the quality of the essays.
Finally, regression analyses were run for each topic to assess the predictive power of syntactic complexity on the
writing scores.² We took two approaches for the regression analyses: the first to look at global syntactic complexity
features and the second to look at local-level complexity features. In the first approach, we used MLS and MLTU,
separately, to examine the predictive power of these global syntactic complexity features on writing scores. In the
second, we conducted all-possible-subsets regression analyses, with the measures for the six local-level complexity
sub-constructs as predictors in the full model (i.e., TU/S, DC/TU, MLC, CP/C, CNP/C, and NFE/C). These predictor
variables did not have problems with multicollinearity, i.e., high inter-correlations among predictor variables, as
tolerance values for each of the measures were all above 0.10. The all-possible-subsets regression, in contrast to the
often-used step methods (forward, backward, and forward stepwise), makes possible an exhaustive analysis of all
subsets (often combinations) of predictor variables and their predictive power. Instead of producing only one
regression model, the all-possible-subsets regression method provides several regression models that can predict the
dependent variable well, often as well as what the step methods may produce (Huberty, 1989; Kutner, Neter,
Nachtsheim, & Li, 2005; Stevens, 2009). The researcher can then choose the ‘‘best’’ regression model based on other
relevant criteria and can also observe patterns based on the best models produced. The all-possible-subsets regression
analyses for each of the topics in our study were conducted with the Automatic Linear Modeling function in SPSS
version 21. Akaike Information Criterion Corrected (AICC, as defined by Hurvich & Tsai, 1989) was used as the
information criterion to determine the best regression models. The best models based on AICC are the ones that have
an SSE (sum of squares for the error) as small as the one for the full model and have a smaller number of predictors.
The smaller the AICC is, the better a model is.
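For readers who want to reproduce this kind of analysis, the sketch below shows one way an all-possible-subsets regression ranked by AICC might be implemented. It is our illustrative code, not the SPSS Automatic Linear Modeling procedure we actually used, and the AICC formulation shown is one common textbook form that may differ in detail from the one SPSS applies:

```python
# Illustrative sketch: all-possible-subsets ordinary least squares regression,
# with candidate models ranked by a corrected AIC (smaller = better).
from itertools import combinations
import numpy as np

def fit_ols(X, y):
    """OLS with an intercept; returns the coefficient vector and the error sum of squares (SSE)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    sse = float(np.sum((y - X1 @ beta) ** 2))
    return beta, sse

def aicc(sse, n, n_predictors):
    """Corrected AIC; one common formulation (parameters = predictors + intercept + error variance)."""
    k = n_predictors + 2
    return n * np.log(sse / n) + 2 * k + (2 * k * (k + 1)) / (n - k - 1)

def all_subsets_regression(X, y, names):
    """Fit every non-empty subset of predictors and rank the resulting models by AICC."""
    n, results = len(y), []
    for size in range(1, X.shape[1] + 1):
        for subset in combinations(range(X.shape[1]), size):
            _, sse = fit_ols(X[:, list(subset)], y)
            results.append(([names[i] for i in subset], sse, aicc(sse, n, size)))
    return sorted(results, key=lambda model: model[2])

# Hypothetical usage: X holds the six local-level indices per essay (columns in the
# order given by `names`), y holds the averaged writing scores for one topic.
# names = ["TU/S", "DC/TU", "MLC", "CP/C", "CNP/C", "NFE/C"]
# best_five = all_subsets_regression(X, y, names)[:5]
```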

Results

Research question 1: effect of topic on syntactic complexity

Table 3 displays the descriptive statistics for the syntactic complexity features used in the essays for the appearance
topic and the future topic; the t statistics and the p values indicate the statistical testing results for the topic comparison
for each of the features, and Cohen’s d values show the effect sizes. In comparison to essays on the future topic, essays
on the appearance topic in general showed a significantly higher amount of elaboration at the finite clause level, as can
be observed in the significantly higher values for MLC, CP/C, and CNP/C. On the other hand, essays on the future
topic utilized a significantly higher amount of subordination—both finite and non-finite, as can be seen in the
significantly higher values for DC/TU and NFE/C. Essays on the future topic also displayed significantly greater
overall sentence complexity, as measured by MLS. There were, however, no statistical differences in overall T-unit
complexity, as measured by MLTU, or in clausal coordination, as measured by TU/S, for essays on the two topics.

² Although path analyses were deemed more appropriate for such a question in our study, since they can take into account the hierarchical relationships among the predictors, we were unable to successfully run path analyses on our data. This was probably due to the complexity of the model based on Fig. 1, the relatively small sample size, and potentially other unknown factors.

Table 3
Syntactic complexity indices by writing topic (mean, with standard deviation in parentheses).
Sub-construct                        Measure                                    Appearance topic   Future topic    t        p [a]       Cohen's d
Overall sentence complexity          Mean length of sentence (MLS)              18.55 (4.27)       19.47 (4.61)    3.36     0.001 [b]   0.21
Clausal coordination                 T-units per sentence (TU/S)                1.11 (0.11)        1.13 (0.13)     2.11     0.036       0.17
Overall T-unit complexity            Mean length of T-unit (MLTU)               16.70 (3.63)       17.22 (3.80)    2.01     0.046       0.14
Clausal subordination                Dependent clauses per T-unit (DC/TU)       0.75 (0.36)        0.92 (0.34)     6.10     0.000 [b]   0.51
Elaboration at clause level          Mean length of clause (MLC)                9.62 (1.64)        8.94 (1.63)     5.08     0.000 [b]   0.41
Non-finite subordination             Non-finite elements per clause (NFE/C)     0.35 (0.13)        0.43 (0.18)     6.21     0.000 [b]   0.55
Phrasal coordination                 Coordinate phrases per clause (CP/C)       0.32 (0.15)        0.18 (0.11)     11.75    0.000 [b]   1.09
Noun phrase complexity               Complex NPs per clause (CNP/C)             0.94 (0.30)        0.72 (0.28)     9.42     0.000 [b]   0.78
[a] The alpha value for the analysis was adjusted to 0.05/8, or 0.00625, by the Bonferroni correction for multiple tests, as eight tests were done.
[b] p < 0.00625.

Table 4
Pearson correlations between syntactic complexity indices and writing scores.
Sub-construct                        Measure                                    Appearance topic   Future topic
Overall sentence complexity          Mean length of sentence (MLS)              0.27 [a]           0.25 [a]
Clausal coordination                 T-units per sentence (TU/S)                0.15               0.11
Overall T-unit complexity            Mean length of T-unit (MLTU)               0.21 [a]           0.22 [a]
Clausal subordination                Dependent clauses per T-unit (DC/TU)       0.14               0.08
Elaboration at clause level          Mean length of clause (MLC)                0.13               0.23 [a]
Non-finite subordination             Non-finite elements per clause (NFE/C)     0.03               0.20 [a]
Phrasal coordination                 Coordinate phrases per clause (CP/C)       0.03               0.21 [a]
Noun phrase complexity               Complex NPs per clause (CNP/C)             0.12               0.20 [a]
The alpha value for the analysis was adjusted to 0.05/8, or 0.00625, by the Bonferroni correction for multiple tests, as eight tests were done.
[a] p < 0.00625.

As the Cohen's d values in Table 3 show, except for the small effect size observed for MLS, the effect sizes for all the significant differences found for the local-level features are moderate to large, showing the practical meaning of such differences.³

Research question 2: relationship between syntactic complexity and writing quality

Table 4 summarizes the correlations between each of the eight syntactic complexity indices and writing scores for
the two different topics. First, MLS and MLTU, indicating overall sentence complexity and overall T-unit complexity
respectively, significantly positively correlated with writing scores for both topics. Second, all four measures
pertaining to elaboration at the finite clause level (MLC, NFE/C, CP/C, and CNP/C) significantly positively correlated
with writing scores for the future topic, but not the appearance topic. Third, DC/TU, measuring finite clausal
subordination, did not significantly correlate with writing scores for either topic, but its correlation for the appearance
topic was almost twice as large as that for the future topic. Finally, TU/S, showing clausal coordination, did not
correlate with writing scores for either topic. It should also be noted that the strength of the relationship for all
significant findings is overall rather low, ranging from 0.20 to 0.27.

Research question 3: predictive power of syntactic complexity on writing quality scores

In our first approach in the regression analyses, MLS and MLTU were separately used as predictors of writing
scores. For the appearance topic, MLS was found to be a significant predictor of writing scores (R² = 0.07, F(1, 188) = 14.33, p < 0.001; b = 0.05); so was MLTU (R² = 0.04, F(1, 188) = 8.71, p < 0.01; b = 0.04). For the future topic,

³ Cohen (1988) considers 0.20 to be a small effect for t tests, 0.50 to be a moderate effect, and 0.80 to be a large effect.

Table 5
Best regression models based on all-possible-subsets regression.
Topic              Regressors                            SSE       R²      Adj. R²   AICC
Appearance topic   MLC, DC/TU, TU/S                      93.99     0.09    0.07      125.52
                   DC/TU, TU/S, CNP/C                    94.69     0.08    0.06      124.11
                   MLC, DC/TU, TU/S, NFE/C               93.67     0.09    0.07      124.06
                   MLC, DC/TU, TU/S, CNP/C               93.85     0.09    0.07      123.68
                   MLC, DC/TU, TU/S, CP/C                93.93     0.09    0.07      123.53
Future topic       MLC, TU/S, DC/TU                      108.02    0.09    0.08      99.09
                   DC/TU, TU/S, CP/C, NFE/C              107.30    0.10    0.08      98.24
                   DC/TU, TU/S, NFE/C, CNP/C             107.33    0.10    0.08      99.09
                   MLC, DC/TU, TU/S, CP/C                107.38    0.10    0.08      98.86
                   DC/TU, TU/S, CP/C, NFE/C, CNP/C       106.20    0.11    0.08      98.63
MLC = mean length of clause; DC/TU = dependent clauses per T-unit; TU/S = T-units per sentence; CP/C = coordinate phrases per clause; CNP/C = complex noun phrases per clause; NFE/C = non-finite elements per clause.

MLS was also found to be a significant predictor of writing scores (R² = 0.06, F(1, 188) = 12.50, p < 0.001; b = 0.04); so was MLTU (R² = 0.05, F(1, 188) = 9.64, p < 0.01; b = 0.05). The analyses showed that both MLS and MLTU were significant, consistent predictors of scores across the two topics, accounting for a small proportion of the variance in the scores: 4–7%.
MLS was a slightly stronger predictor of scores than MLTU. Further, the regression coefficients (b) showed that with
one additional word in each sentence or T-unit, there was an increase of 0.04–0.05 in scores.
In our second approach in the regression analyses, we used all-possible-subsets regression and entered measures for
all six local-level sub-constructs (i.e., TU/S, DC/TU, MLC, CP/C, CNP/C, and NFE/C) as predictors. Table 5 displays
the five best regression models for the two topics. The first row shows the ‘‘best’’ regression model for each topic, and
the order of the variables in the first row is based on their importance in predicting the scores for that topic, with the
most important listed first. The order of presentation of the five best models for each topic is based on AICC values,
with lower values considered better. As can be seen in Table 5, the ‘‘best’’ model for both topics consisted of MLC,
DC/TU, and TU/S, and MLC was the most important predictor in both cases; DC/TU was the second most important
for the appearance topic, while TU/S was the second most important for the future topic. For the appearance topic,
these three predictors accounted for 9% of the variance in scores (F(3, 186) = 5.83, p < 0.001); for the future topic, they also explained 9% of the variance in scores (F(3, 186) = 6.37, p < 0.001). The other best models for the appearance topic, particularly the second best
one, further show that among the sub-constructs subsumed under MLC, only CNP/C was relatively important in predicting scores for this topic. In contrast, the other four best models for the future topic demonstrate the importance of all the sub-constructs subsumed under MLC (i.e., CP/C, NFE/C, and CNP/C) in predicting scores for this topic.
Table 6 lists the predictors in the ‘‘best’’ model for each topic, presented in the order of their importance and with
their b (unstandardized regression coefficient), β (standardized regression coefficient), and p (significance) values. The table further demonstrates the relative importance of the three predictors in the "best" regression models. Based on the β values, comparatively speaking, MLC is a more important predictor of scores for the future topic than for the appearance topic, while DC/TU and TU/S have greater importance in predicting scores for the appearance topic than for the future topic. For example, one additional word in each finite clause is associated with an increase of 0.14 in scores for the future topic but only 0.10 for the appearance topic.

Table 6
"Best" regression models and regression coefficients b (β).
Appearance topic                                 Future topic
Regressors    b (β)           p                  Regressors    b (β)           p
MLC           0.10 (0.22)     0.003              MLC           0.14 (0.28)     0.000
DC/TU         0.43 (0.21)     0.005              TU/S          0.98 (0.16)     0.022
TU/S          1.21 (0.18)     0.013              DC/TU         0.31 (0.13)     0.064
MLC = mean length of clause; DC/TU = dependent clauses per T-unit; TU/S = T-units per sentence.

In summary, the regression findings for both topics showed that syntactic complexity was a significant predictor of
writing scores, explaining a consistent yet rather small proportion of variance in writing scores across the two topics.
However, although the importance of the two global syntactic complexity measures (i.e., MLS and MLTU) in
predicting scores was consistent across the two topics, the importance of the local-level complexity measures in
predicting scores varied across the two topics.

Synthesis of the research findings

Merging the results for the three research questions, we observed two main patterns, categorized according to
the level of syntactic complexity dimensions. At the global complexity levels, as indicated by MLS and MLTU,
topic did not have much effect on the syntactic complexity features. These global features were also found to have
a positively significant and consistent relationship with writing quality scores across the two topics. At the local
level though, topic was found to exert significant and greater effects on the syntactic complexity features, with the
exception of clausal coordination (measured by TU/S). Essays on the appearance topic included a significantly
higher amount of elaboration at the finite clause level, primarily attributable to the use of more coordinate phrases
and complex noun phrases. The future topic elicited significantly more use of subordination—both finite and non-
finite. This suggests that specific topics may naturally elicit more use of certain syntactic complexity features.
What was particularly intriguing was that this topic effect connected with the relationship between the local-level
syntactic complexity features and writing quality scores in a patterned manner. Specifically, although the
appearance topic elicited a significantly higher amount of elaboration at the finite clause level, through more use
of coordinate phrases and complex noun phrases, these features did not significantly correlate with writing scores
for essays on this topic and were not as important in predicting scores; on the other hand, while essays on the
future topic used significantly fewer of these features, scores on these essays had a significant, positive correlation
with the frequency of these features and were explained more by these features. Similarly, although finite
subordination was used more in the essays for the future topic, it did not have a significant relationship with
writing quality scores for those essays and was not as important in predicting the scores. Meanwhile, a reversed
pattern was observed for the appearance topic: the lower usage of finite subordination in the essays on this topic
was accompanied by a much stronger, positive relationship between finite subordination and writing quality
scores for those essays. The only syntactic complexity feature that did not show such a reversed pattern was
non-finite subordination: it was used more in the essays on the future topic and also showed a significant
relationship with scores for those essays. What the combined results suggest is that the writers who were able to
use not only topic-intrinsic complexity features but also other types of local-level complexity features were
awarded higher scores, which could well be an acknowledgment of their higher linguistic ability and/or
writing ability.
We show in Appendix A the use of the syntactic complexity features in essays for the two topics at two different
score points, demonstrating how a higher level of syntactic complexity and variation in features are achieved in the more highly rated essays and how the more topic-intrinsic complexity features are also prominent in the lower-scored essays
for each topic.
The first two samples are excerpts from two essays on the appearance topic rated at 3.5 and 5 points, respectively. In
both, coordinate phrases and complex noun phrases were frequently employed, providing descriptions and
lengthening the clauses. However, in the lower-scored essay sample, the use of finite and non-finite subordination was
much more limited than that in the higher-scored essay sample. The higher-scored essay demonstrated much greater
syntactic complexity and variation.
The second two essay samples illustrate how for the future topic, both essays utilized finite and non-finite
subordination, making both essays highly propositional. However, the lower-scored essay sample showed use of only a few complex noun phrases and no coordinate phrases. In contrast, the higher-scored essay sample contained much more
use of complex noun phrases and several coordinate phrases, making the essay highly propositional as well as
descriptive. Similarly, this higher-scored essay embodied both high syntactic complexity and high syntactic variation.
It should be noted that the lower- and higher-scored essay samples for each topic also differ in other linguistic
features, notably lexical sophistication, which has been found to significantly correlate with scores on argumentative
essays (e.g., Yang & Weigle, 2011). Collectively, however, these samples illustrate how syntactic complexity differs in
writing samples of different quality and how it may be affected by topic.

Discussion

Examining syntactic complexity as a multi-dimensional construct (Norris & Ortega, 2009) with different levels of
sub-constructs, the study revealed complex yet patterned findings about the relationship between syntactic complexity
and writing quality and the role of topic in this relationship. The discussion centers on two main areas that our study is
able to illuminate: (1) the systematic and patterned findings associated with topic in linguistic performance; (2)
measurement issues pertaining to syntactic complexity.
One particularly intriguing question from our study results is why such topic effects on syntactic complexity were
observed and whether such findings are generalizable in any way to other topics. In our study, with one topic eliciting
a higher amount of elaboration at the finite clause level and the other one inviting greater subordination, we concluded
that certain topics may naturally call for more use of certain local-level syntactic complexity features. One possible
explanation for our findings is that the future topic demands causal reasoning in task performance while the
appearance topic does not, a comparison we laid out in the Material and methods section. Since causal reasoning
requires juxtaposing the relationship between two or more entities or events, it is reasonable that the future topic
elicited more frequent use of multi-propositional sentences containing subordination. In contrast, the appearance topic
does not demand causal reasoning; it simply involves one entity (i.e., appearance) and one proposition (i.e., people are
placing too much emphasis on appearance) and asks about the truth value of such a proposition. Most likely due to
these factors, the appearance topic elicited more descriptions, rather than propositions.
Since Robinson's (2001, 2007, 2011) framework for task complexity makes hypotheses about the relationship
between causal reasoning and syntactic complexity in language production, the current study is able to illuminate the
proposed relationship. Robinson predicts that an increase in causal reasoning will lead to an increase in the syntactic
complexity of language production. The findings of the current study provide partial support for the prediction and
show the prediction to be true for some of the syntactic complexity sub-constructs but not others. Overall though, the
findings support Robinson's prediction, which is primarily concerned with the amount of subordination as the main syntactic complexity construct, since the future topic, which required causal reasoning, indeed elicited more use of
subordination. However, the view of the syntactic complexity construct in the TBLT literature has also been expanded
to include other sub-constructs such as elaboration at the clause level (see Skehan, 2014). Examined this way,
Robinson’s prediction is not supported regarding the other syntactic complexity sub-constructs.
The findings of the study can also certainly illuminate measurement choices for syntactic complexity. Such
considerations are two-fold in examinations of task effects on syntactic complexity and in examinations of the
relationship between syntactic complexity and writing quality (i.e., writing proficiency and linguistic proficiency).
First of all, based on the findings of the current study, when we investigate task effects on syntactic complexity,
syntactic complexity is ideally measured multi-dimensionally. The study demonstrates that task may have different
effects on different sub-constructs of syntactic complexity, as Crowhurst and Piche’s (1979) study also indicated.
When claims or hypotheses are made about task effects on syntactic complexity, consideration should be given to which
sub-constructs may be affected and why, and predictions should be made explicitly in relation to different sub-
constructs. Currently, in the task-based language literature, the most commonly used sub-construct is the amount of
subordination (Skehan, 2014). Global complexity, clausal coordination, overall elaboration at the finite clause level,
phrasal coordination, and noun-phrase complexity have not been given adequate attention.
Secondly, in examinations of the relationship between syntactic complexity and writing quality, our study suggests
that either of the two global measures (MLS and MLTU) can work well as a generic syntactic complexity measure
since both were found to significantly and consistently predict writing scores across topics. However, it is equally
important to identify which local-level complexity features are at work in predicting scores. The challenge of using
local-level complexity features, though, is that different topics may need different constellations of these features in
predicting writing scores or that the importance of each local-level feature in predicting scores varies across topics.
Based on our findings, MLC, TU/S and DC/TU could be used to collectively predict scores across topics. However, the
regression coefficients (or their relative importance) for each of these measures are likely to differ across topics.
Examined in conjunction with the syntactic complexity measures commonly used in the writing literature (see Ortega,
2003), the most commonly used measure of MLTU could work, ideally along with the clausal coordination measure of
TU/S. However, using local-level measures such as MLC or DC/TU as the only syntactic complexity measure or the
only local-level measure in addition to the global measure of MLTU to make claims about the studied relationships can
be problematic given that choices of local-level measures should be contingent upon topics and three or more local-
level measures are needed to function collectively to predict writing scores. Such complex relationships between
syntactic complexity and writing quality mediated by the topic could also explain the mixed findings reported about
the syntactic complexity–writing quality relationship in earlier studies (Crowhurst, 1983; Hillocks, 1986; Ortega,
2003).
In general, the study illustrates the importance of considering syntactic complexity measurement choices in
view of potential influences from the topic of the discourse and the cognitive operations that may be invited by the
topic. Syntactic complexity is also studied and measured to answer some other research questions, such as
trajectories of syntactic development and maturity (e.g., Hunt, 1965; Ravid & Berman, 2010; Nippold, Hesketh,
Duthie, & Mansfield, 2005) and a comparison of syntactic complexity features in different registers (e.g., Biber
et al., 2011). In investigations of these other questions, topic and other task factors must also be carefully
considered.
Finally, the study demonstrates that diversity and variations in the local-level syntactic complexity features
employed, rather than mere use of the complexity features called for by the topic, contribute to higher ratings of the
writing. This indicates that one essence of linguistic development and L2 writing development is seen in learners’
ability to stretch their linguistic repertoire and achieve linguistic complexity in ways not constrained by the task or the
topic, shown in the greater linguistic resources and means to attain greater diversity and sophistication in language use.
In not only meeting the task demands but also going beyond what is expected by enriching the discourse through other
complex constructions and the meanings embodied, learners demonstrate their highest linguistic ability and convey
their thoughts in sophisticated and linguistically impressive ways. The observation of greater variations in the
syntactically complex structures employed in writing as a benchmark of higher writing quality is also seen in the work of Myhill (2008, 2009), who examined L1 young children's writing. Berman (2008) similarly found that L1 speakers/
writers’ ability to ‘‘stack’’ or ‘‘nest’’ different finite clauses through coordination and subordination developed as a
function of age. Our findings thus support syntactic variety (specifically, variety of complex structures used) as one
criterion in writing rating rubrics, as found in some existing rating rubrics (e.g., Jacobs, Zinkgraf, Wormuth, Hartfiel,
& Hughey, 1981; Gentile, Riazantseva, & Cline, 2002).

Conclusions

The study revealed intricate relationships among writing topic, syntactic complexity of writing, and writing
quality. The relationship between syntactic complexity and writing quality was found to be significant and rather
constant at the global syntactic complexity levels (global sentence complexity and global T-unit complexity) across different topics. Yet, such a relationship was found to vary across topics at the local complexity levels:
clausal coordination, finite subordination, overall elaboration at the finite clause level, non-finite subordination,
phrasal coordination, and noun-phrase complexity. The generic length measures of mean length of sentence and
mean length of T-unit may work in predicting writing scores across topics, but they fall short in their capacity to
indicate how syntactic complexity is exactly achieved by ESL writers on a given topic. In general, however,
syntactic variety achieved by writers, using not only features naturally called for by a certain topic, but also other
features to add to the variation, appeared to contribute to higher writing quality as judged by human raters. The
study had a number of strengths, such as the use of repeated measures, a relatively large sample size, and a
thorough examination of the syntactic complexity construct. The findings and our interpretations should however
be viewed in relation to other features of our study design. In particular, our writer sample was limited to ESL
graduate students who are likely to be more linguistically and cognitively mature than many other ESL
populations.⁴ Second, the writing tasks were argumentative tasks only, so the findings should not be generalized
to other rhetorical tasks. Third, only two writing topics were examined, and thus the findings, although revealing, may be confined to certain topics.

⁴ Whether our findings will be borne out for lower-proficiency and younger or older writers is an open question. Both language proficiency and learners' age have been found to affect syntactic complexity in language production (see Ortega (2003) for L2 cross-sectional studies, and Berman (2008) and Hunt (1965) as examples of L1 studies examining different age groups), and learners' age may affect the relationship between writing quality and syntactic complexity, as Crowhurst's (1980) study suggests. However, there is also evidence that the effects of task on syntactic complexity can possibly be the same across age groups; studies comparing syntactic complexity in argumentative and narrative essays have reported greater global syntactic complexity and a greater amount of subordination in argumentative essays for different age groups (e.g., Lu, 2011; Beers & Nagy, 2009; Crowhurst, 1980; Crowhurst & Piche, 1979).

Finally, there were not as many lower-scored essays as higher-scored essays in
our data, which may have affected the strength of relationships reported in our study. We certainly welcome future
inquiries that replicate our study with other writing topics and writer populations. Further, for future work,
syntactic complexity sub-constructs can be conceptualized and measured in a more fine-grained manner, taking a
more functional perspective and taking into account different types of subordination—complement, adverbial, and
adjective, finite and non-finite, respectively, since these different subordination types may not develop with age
and proficiency in similar ways and may function differently in different discourses, as suggested in the work of
Hunt (1965), Nippold et al. (2005), Nippold, Mansfield, and Billow (2007), and Biber et al. (2011), which all
primarily examined L1 language samples. Likewise, with increasing interest in noun-phrase complexity,
sophisticated and more sensitive measures to tap the complexity that arises from the use of different types of
modifications for head nouns (see Bulté & Housen, 2012; Ravid & Berman, 2010) can be pursued in our
understanding of syntactic complexity and its relationship with other variables.

Acknowledgment

The essay data for this paper were collected for a project sponsored by the TOEFL Committee of Examiners and the
TOEFL program at Educational Testing Service entitled ‘‘Validation of Automated Scoring of TOEFL iBT Tasks
Against Non-Test Indicators of Writing Ability.’’

Appendix A

Essay samples
Annotation symbols:
underlined: subordinate clause or element
italicized: complex noun phrases
bold: coordinate phrases
italicized and bold: coordinate phrases as modifiers of nouns, or complex noun phrases in a coordinate noun phrase
Appearance topic
Score: 3.5
First, you are so easily judged by your appearance when you first time meeting with people. It is right. You and a
stranger met each other. You know nothing about him/her, and he/she completely has no clue about you. The
appearance and fashion somehow at this time become a proxy for your personal characteristics. You could be
labeled as "neat", "cute", "sharp" or "nasty" by your appearance. Job interview is good example for such
case. You want to look as smart and professional, then wear the business suit.
Score: 5
The world has gone through various changes in the recent years, among which the way people dress and appear in
public. The advances in technology have contributed to the development of new fibers and textile material that have
helped people find the best attire for the best situation. This has created a huge dependence on appearance in various
societies—not to say all. Fashion is, indeed, a big and industry nowadays, satisfying the needs for people to feel better
about themselves and to please others around them in society.
Future topic
Score: 3.5
Planning is something we should all learn to do when little. By learning how to do it early in life, it becomes a habit,
that will ensure success in other activities. There are a lot of factors that come in to play in when planning something,
one of those factors are the unknown. I believe that by learning to take the unknown into consideration you are able to
react better to changes in life.
Score: 5
I believe that these aspirations and careful planning will guide young people in other steps they might take. For
example, a young boy who desires to be a medical doctor will know for sure that college education is undebatable.
Keeping this goal in mind will also prevent such youngster from non-normative or delinquent behaviors that could
hinder him from achieving his goals. In other words, aspirations and goals for the future promote academic resilience
in young people.

References

Bardovi-Harlig, K., & Bofman, T. (1989). Attainment of syntactic and morphological accuracy by advanced language learners. Studies in Second
Language Acquisition, 11, 17–34.
Beers, S. F., & Nagy, W. E. (2009). Syntactic complexity as a predictor of adolescent writing quality: Which measures? Which genre? Reading and
Writing, 22, 185–200.
Berman, R. A. (2008). The psycholinguistics of developing text construction. Journal of Child Language, 35, 735–771.
Biber, D. (2006). University language: A corpus-based study of spoken and written registers. Amsterdam: John Benjamins.
Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammatical complexity in L2 writing
development? TESOL Quarterly, 45, 5–35.
Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. In A. Housen, F. Kuiken, & I. Vedder (Eds.), Dimensions of L2
performance and proficiency: Complexity, accuracy and fluency in SLA (pp. 21–46). Amsterdam: John Benjamins.
Byrnes, H., Maxim, H. H., & Norris, J. M. (Eds.). (2010). Realizing advanced foreign language writing development in collegiate education:
Curricular design, pedagogy, assessment [Special issue]. The Modern Language Journal, 94(S1), i–iv, 1–235.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum.
Cooper, T. C. (1976). Measuring written syntactic patterns of second language learners of German. The Journal of Educational Research, 69,
176–183.
Cristofaro, S. (2003). Subordination. Oxford: Oxford University Press.
Crossley, S. A., Weston, J. L., McLain Sullivan, S. T., & McNamara, D. S. (2011). The development of writing proficiency as a function of grade
level: A linguistic analysis. Written Communication, 28, 282–311.
Crowhurst, M. (1980). Syntactic complexity in narration and argument at three grade levels. Canadian Journal of Education, 5, 6–13.
Crowhurst, M. (1983). Syntactic complexity and writing quality: A review. Canadian Journal of Education, 8, 1–16.
Crowhurst, M., & Piche, G. L. (1979). Audience and mode of discourse effects on syntactic complexity in writing at two grade levels. Research in the
Teaching of English, 13, 101–109.
Cumming, A. (1989). Writing expertise and second-language proficiency. Language Learning, 39, 81–135.
Diessel, H. (2004). The acquisition of complex sentences. Cambridge, England: Cambridge University Press.
Ellis, R. (2003). Task-based language learning and teaching. Oxford, UK: Oxford University Press.
ETS. (2008). iBT/Next Generation TOEFL Test Independent Writing Rubrics (Scoring Standards). Retrieved from www.ets.org/Media/Tests/
TOEFL/pdf/Writing_Rubrics.pdf
Flahive, D., & Snow, B. (1980). Measures of syntactic complexity in evaluating ESL compositions. In J. W. Oller, Jr., & K. Perkins (Eds.), Research
in language testing (pp. 171–176). Rowley, MA: Newbury House.
Gentile, C., Riazantseva, A., & Cline, F. (2002). A comparison of handwritten and word-processed TOEFL essays: Final report (Internal document).
Princeton, NJ: Educational Testing Service.
Givón, T. (1985). Function, structure, and language acquisition. In D. I. Slobin (Ed.), The crosslinguistic study of language acquisition (Vol. 1,
pp. 1008–1025). Hillsdale, NJ: Lawrence Erlbaum.
Givón, T. (2009). The genesis of syntactic complexity: Diachrony, ontogeny, neuro-cognition, evolution. Amsterdam: John Benjamins.
Givón, T., & Shibatani, M. (Eds.). (2009). Syntactic complexity: Diachrony, acquisition, neuro-cognition, evolution. Amsterdam: John Benjamins.
Halliday, M. A. K., & Matthiessen, C. (2004). An introduction to functional grammar (3rd ed.). London: Arnold.
Hillocks, G. (1986). Research on written composition: New directions for teaching. Urbana, IL: ERIC Clearinghouse on Reading and Communication
Skills and the National Conference on Research in English.
Homburg, T. J. (1984). Holistic evaluation of ESL compositions: Can it be validated objectively? TESOL Quarterly, 18, 87–107.
Huberty, C. J. (1989). Problems with stepwise methods: Better alternatives. In B. Thompson (Ed.), Advances in social science methodology (Vol. 1,
pp. 43–70). Greenwich, CT: JAI Press.
Hunt, K. W. (1965). Grammatical structures written at three grade levels. Champaign, IL: National Council of Teachers of English.
Hurvich, C. M., & Tsai, C.-L. (1989). Regression and time series model selection in small samples. Biometrika, 76, 297–307.
Jacobs, H. L., Zinkgraf, S. A., Wormuth, D. R., Hartfiel, V. F., & Hughey, J. B. (1981). Testing ESL composition: A practical approach. Rowley, MA:
Newbury House.
Kameen, P. T. (1979). Syntactic skill and ESL writing quality. In C. Yorio, K. Perkins, & J. Schachter (Eds.), On TESOL’79: The learner in focus (pp.
343–364). Washington, DC: TESOL.
Kutner, M. H., Nachtsheim, C. J., Neter, J., & Li, W. (2005). Applied linear statistical models (5th ed.). New York: McGraw-Hill.
Langacker, R. W. (2008). Cognitive grammar: A basic introduction. Oxford: Oxford University Press.
Larsen-Freeman, D., & Strom, V. (1977). The construction of a second language acquisition index of development. Language Learning, 27,
123–134.
Long, M. H., & Crookes, G. (1993). Units of analysis in syllabus design: The case for task. In G. Crookes & S. M. Gass (Eds.), Tasks in a pedagogical
context: Integrating theory and practice (pp. 9–54). Clevedon, UK: Multilingual Matters.
Lu, X. (2010). Automatic analysis of syntactic complexity in second language writing. International Journal of Corpus Linguistics, 15, 474–496.
Lu, X. (2011). A corpus-based evaluation of syntactic complexity measures as indices of college-level ESL writers’ language development. TESOL
Quarterly, 45, 36–62.
Myhill, D. (2008). Towards a linguistic model of sentence development in writing. Language and Education, 22, 271–288.
Myhill, D. (2009). Becoming a designer: Trajectories of linguistic development. In R. Beard, D. Myhill, J. Riley, & M. Nystrand (Eds.), The Sage
handbook of writing development (pp. 402–414). London: Sage.
Nihalani, N. K. (1981). The quest for the L2 index of development. RELC Journal, 12, 50–56.

Nippold, M. A., Hesketh, L. J., Duthie, J. K., & Mansfield, T. C. (2005). Conversational versus expository discourse: A study of syntactic
development in children, adolescents, and adults. Journal of Speech, Language, and Hearing Research, 48, 1048–1064.
Nippold, M. A., Mansfield, T. C., & Billow, J. L. (2007). Peer conflict explanations in children, adolescents, and adults: Examining the development
of complex syntax. American Journal of Speech-Language Pathology, 16, 179–188.
Norris, J. M., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA: The case of complexity. Applied
Linguistics, 30, 555–578.
Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied
Linguistics, 24, 492–518.
Perkins, K. (1980). Using objective methods of attained writing proficiency to discriminate among holistic evaluations. TESOL Quarterly, 14, 61–69.
Ravid, D. (2004). Emergence of linguistic complexity in written expository texts: Evidence from later language acquisition. In D. Ravid & H. Bat-
Zeev Shyldkrot (Eds.), Perspectives on language and language development (pp. 337–355). Dordrecht: Kluwer.
Ravid, D., & Berman, R. A. (2010). Developing noun phrase complexity at school age: A text-embedded cross-linguistic analysis. First Language,
30, 3–26.
Robinson, P. (2001). Task complexity, cognitive resources and syllabus design: A triadic framework for examining task influences on SLA. In P.
Robinson (Ed.), Cognition and second language instruction (pp. 287–318). Cambridge: Cambridge University Press.
Robinson, P. (2005). Cognitive complexity and task sequencing: Studies in a componential framework for second language task design. International
Review of Applied Linguistics in Language Teaching, 43, 1–32.
Robinson, P. (2007). Criteria for classifying and sequencing pedagogic tasks. In M. P. Garcı́a Mayo (Ed.), Investigating tasks in formal language
learning (pp. 7–26). Clevedon, UK: Multilingual Matters.
Robinson, P. (2011). Second language task complexity, the cognition hypothesis, language learning, and performance. In P. Robinson (Ed.), Second
language task complexity: Researching the cognition hypothesis of language learning and performance (pp. 3–38). Amsterdam: John
Benjamins.
San Jose, C. P. M. (1972). Grammatical structures in four modes of writing at fourth-grade level (Unpublished doctoral dissertation). Syracuse
University, Syracuse, NY.
Shaw, S. D., & Weir, C. J. (2007). Examining writing: Research and practice in assessing second language writing (Studies in Language Testing 26).
Cambridge: Cambridge University Press.
Spaan, M. (1993). The effect of prompt on essay examinations. In D. Douglas & C. Chapelle (Eds.), A new decade of language testing research (pp.
98–122). Alexandria, VA: TESOL.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Skehan, P. (2014). The context for researching a processing perspective on task performance. In P. Skehan (Ed.), Processing perspectives on task
performance (pp. 1–26). Amsterdam: John Benjamins.
Stevens, J. (2009). Applied multivariate statistics for the social sciences (5th ed.). Mahwah, NJ: Lawrence Erlbaum.
Tedick, D. J. (1990). ESL writing assessment: Subject-matter knowledge and its impact on performance. English for Specific Purposes, 9, 123–143.
Weigle, S. C. (2002). Assessing writing. Cambridge: Cambridge University Press.
Weigle, S. C. (2011). Validation of automated scores of TOEFL iBT tasks against non-test indicators of writing ability. TOEFL iBT Research Report
(TOEFL iBT-15). Princeton, NJ: Educational Testing Service.
Wolfe-Quintero, K., Inagaki, S., & Kim, H.-Y. (1998). Second language development in writing: Measures of fluency, accuracy, and complexity.
Honolulu, HI: University of Hawai’i, Second Language Teaching and Curriculum Center.
Yang, W., & Weigle, S. C. (2011). Lexical richness of ESL writing and the role of prompt. Paper presented at the 10th Conference for the American
Association for Corpus Linguistics (AACL).

Weiwei Yang is Associate Professor of English at Nanjing University of Aeronautics and Astronautics. Her research interests include cognition and
discourse, discourse analysis, second language literacy development and assessment, and second language teaching and learning. She holds a PhD in
Applied Linguistics from Georgia State University.

Xiaofei Lu is Gil Watz Early Career Professor in Language and Linguistics and Associate Professor of Applied Linguistics and Asian Studies at The
Pennsylvania State University. His research interests are primarily in computational linguistics, corpus linguistics, and intelligent computer-assisted
language learning. He is the author of Computational Methods for Corpus Annotation and Analysis (2014, Springer).

Sara Cushing Weigle is Professor of Applied Linguistics at Georgia State University. She has conducted research in the areas of assessment, second
language writing, and teacher education and is the author of Assessing Writing (2002, Cambridge University Press). Her most recent research has
focused on the validity of automated scoring of ESL writing and the use of integrated tasks in writing assessment.
