ALTE 2014 Masashi Negishi
Research into How to Identify Criterial Features for the CEFR(-J) Levels Using Textbook & Learner Corpora:
The Development of the CEFR-J
Masashi Negishi
TUFS
A Brief Summary of ALTE 2011
Construction of the CEFR-J & Relevant Resources
[Flow diagram]
• Preparation Phase: Can-do descriptors DB; teacher survey; expert survey → CEFR-J alpha
• Revision Phase: rewriting descriptors; school piloting → CEFR-J beta
• Validation Phase: comparing self-assessment & actual skills; empirical data & final revision → CEFR-J Ver. 1
• Resources: sample tasks; CEFR-J Voc; ELP descriptor DB; re-examining the list
[Figure: Students’ self-assessment, distribution across levels Pre-A1 to C2]
The Development of the CEFR-J:
The Principles
• Add Pre-A1
• Divide A1 into three levels: A1.1, A1.2, A1.3
• Divide A2 into two levels: A2.1, A2.2
• Divide B1 into two levels: B1.1, B1.2
• Divide B2 into two levels: B2.1, B2.2
• No change for C1, C2
The Validation of the CEFR-J
• Learners’ Self-assessment
• Learners’ Assessment by their Teachers
• Descriptor Sorting Exercise
• Comparing Self-assessment and Actual Performance
The Validation of the CEFR-J
• Applying IRT to learners’ self-assessment data
Revision of the descriptors based on the results of IRT analysis
[Figure: item difficulty line graphs for the CEFR-J Listening Can Do descriptors]
Some of the problems and solutions for CEFR-J “Can Do” descriptors
Problem 1: The perceived difficulties were not necessarily ordered as we had expected.
Solution 1: Reordering the descriptors according to the item difficulty.
Problem 2: “Can Do” descriptors which the participants had never experienced seemed to be judged to be more difficult.
Solution 2: Eliminating the elements unfamiliar to Japanese learners.
After the revision process:
• The release of the CEFR-J in 2012
• The publication of the CEFR-J Guidebook in 2013
Developing resources for using the CEFR-J
Yukio Tono
TUFS
After the release of the CEFR-J Version 1
• Resource development: Wordlist; Descriptor DB; Handbook
• Profiling research: Corpus building; Criterial feature selection
Companion resources for using the CEFR-J
CEFR-J Wordlist
ELP "Can Do" Descriptor DB
CEFR-J Handbook
CEFR-J Wordlist Version 1

CEFR level                        Pre-A1/A1   A2     B1     B2    Total
Text analysis                        976     1057   1884   1722    5639
Our target                          1000     1000   2000   2000    6000
Final version (+ EVP integrated)    1068     1358   2359   2785    7570
Using the wordlist for task development
The "Can Do" Descriptor DB
European Language Portfolio
647 descriptors
Retrieval of descriptors (sample entries)

Level A1, code IS1-A1
ELP descriptor: I can say who I am, ask someone’s name and introduce someone.
General descriptor (Japanese, translated): I can say who I am, ask someone’s name, and introduce someone.
Descriptor for children (Japanese, translated): I can say my name, ask someone’s name, and introduce someone.

Level A1, code IS1-A1-1
ELP descriptor: I can ask and answer simple questions, initiate and respond to simple statements in areas of immediate need or on very familiar topics. [1.2000-CH]
General descriptor (Japanese, translated): I can ask and answer simple questions; I can also make statements and respond on matters of immediate need or familiar topics.
Descriptor for children (Japanese, translated): I can ask and answer simple questions; I can also talk about familiar things and answer questions about them.
Tono, Y. (ed.) (2013) The CEFR-J Handbook.
RLDs (Reference Level Descriptions) for the CEFR-J
Corpus-based approach: a methodological contribution to L2 profiling research
[Flow diagram]
• Coursebook corpora based on the CEFR (A1–B2) → finding language points for CEFR levels → syllabus/textbook/materials development
• Learner corpora based on the CEFR (A1–B2) → extraction of criterial features
• Other resources (EP/Core Inventory, etc.) → linking to the CEFR-J; an inventory for the CEFR-J
Corpora

2012–2013: Learner corpora
• JEFLL Corpus (written; junior & senior high school; 10,000 samples; c. 670,000 tokens)
• NICT JLE Corpus (spoken; OPI-like interview data; 1,281 subjects; c. 2 million tokens)
• MEXT Data (1,600 third-year junior high school students; randomly sampled; written & spoken)
• GTEC for STUDENTS Writing Corpus (written; exam scripts; 30,000 samples; 2.5 million tokens)

2014–2015: Textbook corpora
• Exam materials
• Major ELT coursebooks based on the CEFR
• English textbooks used in Japan (for comparison)
Method of identifying criterial features
• Grammar
  – Data-driven approach
  – Extract all the grammar points taught at secondary school
  – Use machine learning to find out which features classify CEFR levels best; compare different classifiers: Decision Tree, Support Vector Machine, Random Forest, etc.
• Learner errors
  – Automatic error tagging
• Hypothesis testing
  – Theory-driven approach, focusing on particular grammatical properties
  – Verb subcategorization; postnominal modifiers; to-infinitives; articles; tense; collocation, etc.
• Lexical profiling
  – Measures of text characteristics:
    • Lexical richness measures: Guiraud; Yule’s K
    • Complexity measures: sentence length; T-unit length; VP per T-unit; clauses per sentence; complex nominals per clause/T-unit, etc.
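Of the lexical richness measures listed above, both Guiraud’s index (types divided by the square root of tokens) and Yule’s K (a repeat-rate measure based on the frequency spectrum) can be computed directly from token counts. A minimal sketch in plain Python (the function name is ours, not from the project):

```python
import math
from collections import Counter

def lexical_richness(tokens):
    """Return (Guiraud's index, Yule's K) for a list of word tokens."""
    n = len(tokens)                       # N: number of tokens
    freqs = Counter(tokens)
    v = len(freqs)                        # V: number of types
    guiraud = v / math.sqrt(n)
    # Yule's K = 10^4 * (sum_r r^2 * V_r - N) / N^2,
    # where V_r is the number of types occurring exactly r times
    spectrum = Counter(freqs.values())
    s2 = sum(r * r * v_r for r, v_r in spectrum.items())
    yule_k = 1e4 * (s2 - n) / (n * n)
    return guiraud, yule_k
```

Higher Guiraud values indicate richer vocabulary, while higher K values indicate more repetition; because both vary systematically across proficiency levels, they are natural candidate criterial features.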
The Impact of the CEFR-J
Masashi Negishi
TUFS
The Impact of the CEFR-J
• In 2011, the impact of the CEFR-J wasn’t yet clear.
Impact Analyses
• How?
Analyse Big Data.
Big Data Analysis
• The data analysed: 15,579,018 texts, written in Japanese, from August 2012 to September 2013
• The analyses: carried out by Jetrun Technology Inc.
• The results of the analyses: “Positive/Negative Graphs” and “Word Maps”
• The “Positive/Negative Graphs”: created by analysing the comments in terms of the writer’s attitude, based on the semantic database
• The “Word Maps”: indicate the relationships between the key words in the writing. The words were automatically analysed based on Jetrun Technology Inc.’s tailored database. The connections shown in the “Word Maps” link key words that occur in the same sentence. The relationships between the key words must be interpreted by looking not only at the main webs but also at the extended webs.
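The polarity classification behind such graphs is typically lexicon-based: each comment is scored against lists of positive and negative expressions. The actual analysis used Jetrun Technology Inc.’s proprietary semantic database; the word lists and function below are invented purely for illustration:

```python
# Hypothetical polarity lexicons -- the real analysis used Jetrun
# Technology Inc.'s proprietary semantic database, not these lists.
POSITIVE = {"useful", "helpful", "clear", "practical"}
NEGATIVE = {"confusing", "useless", "vague", "unnecessary"}

def polarity(comment):
    """Label a comment 'positive', 'negative', or 'neutral' by lexicon hits."""
    words = comment.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Aggregating such labels over all matched comments gives the kind of percentage breakdown shown in the graphs.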
Big Data Analysis
• TOEIC: 741
• TOEFL: 405
• Juken eigo (English for entrance exams): 117
• CEFR: 10
• CEFR-J: 3
The Positive/Negative Graphs
[Two pie charts. One: Positive 39%, Negative 17%, Neutral 44%. The other: Positive 75%, Negative 11%, Neutral 14%.]
[Word map for “CEFR”: key words include level, language, junior high school, elementary school, high school, teacher, student, policy, Taiwan, Europe, University of Bonn, self-assessment, CAN-DO, plural, below, half, use, English, and the levels A1, B1, B2]
[Word map for “CEFR-J”: key words include subdivision (of levels), goal, research, choice, sub-levels, learner(s), level, bottom-up, school, degree of achievement, can-do, development, adaptation, feature, existing levels, compatibility with the CEFR, setting, textbook, low, A1]
Impact on Testing, not on Teaching
North (2009: 307) argues that “... the impact of the descriptive
scheme or other aspects of the CEFR on curriculum or teaching
have as yet been very limited”, and he quotes Little (2007) as
follows:
A Price to Pay
CEFR-J “Can Do” descriptors: too narrow to reflect on teaching
[Diagram: levels A1, A2, B1]
Impact on language policy
Impact on teaching of other languages
Conclusion