ALTE 2014 Masashi Negishi
Research into How to Identify Criterial Features for the CEFR(-J) Levels Using Textbook & Learner Corpora:
The Development of the CEFR-J
Masashi Negishi
TUFS
A Brief Summary of ALTE 2011
Construction of the CEFR-J & Relevant Resources
[Flow diagram]
• Preparation Phase: Can-do descriptors DB; teacher survey; expert survey → CEFR-J alpha
• Revision Phase: rewriting descriptors; school piloting → CEFR-J beta
• Validation Phase: comparing self-assessment & actual skills; empirical data & final revision → CEFR-J Ver. 1
• Resources: sample tasks; CEFR-J Voc; ELP descriptor DB; re-examining the list
[Figure: Students’ self-assessment, distribution across levels Pre-A1 to C2]
The Development of the CEFR-J:
The Principles
• Add Pre-A1
• Divide A1 into three levels: A1.1, A1.2, A1.3
• Divide A2 into two levels: A2.1, A2.2
• Divide B1 into two levels: B1.1, B1.2
• Divide B2 into two levels: B2.1, B2.2
• No change for C1, C2
The Validation of the CEFR-J
• Learners’ Self-assessment
• Learners’ Assessment by their Teachers
• Descriptor Sorting Exercise
• Comparing Self-assessment and Actual Performance
The Validation of the CEFR-J
• Applying IRT to learners’ self-assessment data
Revision of the descriptors based on the results of IRT analysis
[Figure: item difficulty line graphs for the CEFR-J Listening Can Do descriptors]
Some of the problems and solutions for CEFR-J “Can Do” descriptors
Problem 1: The perceived difficulties were not necessarily ordered as we had expected.
Solution 1: Reordering the descriptors according to the item difficulty.
Problem 2: “Can Do” descriptors which the participants had never experienced seemed to be judged to be more difficult.
Solution 2: Eliminating the elements unfamiliar to Japanese learners.
After the revision process:
• The release of the CEFR-J in 2012
• The publication of the CEFR-J Guidebook in 2013
Developing resources for using the CEFR-J
Yukio Tono
TUFS
After the release of the CEFR-J Version 1
• Resource development: Wordlist; Descriptor DB; Handbook
• Profiling research: Corpus building; Criterial feature selection
Companion resources for using the CEFR-J
CEFR-J Wordlist
ELP "Can Do" Descriptor DB
CEFR-J Handbook
CEFR-J Wordlist Version 1

CEFR level                        Pre-A1/A1   A2     B1     B2    Total
Text analysis                        976     1057   1884   1722    5639
Our target                          1000     1000   2000   2000    6000
Final version (+ EVP integrated)    1068     1358   2359   2785    7570
Using the wordlist for task development
The "Can Do" Descriptor DB
European Language Portfolio
647 descriptors
Retrieval of descriptors (sample entries)

Level A1, code IS1-A1
ELP descriptor: I can say who I am, ask someone’s name and introduce someone.
General descriptor (Japanese, translated): I can say who I am, ask someone’s name, and introduce someone.
Descriptor for children (Japanese, translated): I can say my name, ask someone’s name, and introduce someone.

Level A1, code IS1-A1-1
ELP descriptor: I can ask and answer simple questions, initiate and respond to simple statements in areas of immediate need or on very familiar topics. [1.2000-CH]
General descriptor (Japanese, translated): I can ask and answer simple questions; I can also make statements and respond on matters of immediate need or familiar topics.
Descriptor for children (Japanese, translated): I can ask and answer simple questions; I can also talk about familiar things and answer questions about them.
Tono, Y. (ed.) (2013) The CEFR-J Handbook.
RLDs (Reference Level Descriptions) for the CEFR-J
Corpus-based approach: a methodological contribution to L2 profiling research
[Flow diagram]
• Coursebook corpora based on the CEFR (A1–B2) → finding language points for CEFR levels → syllabus/textbook/materials development
• Learner corpora based on the CEFR (A1–B2) → extraction of criterial features
• Other resources (EP/Core Inventory, etc.) → linking to the CEFR-J; an inventory for the CEFR-J
Corpora

2012–2013: Learner corpora
• JEFLL Corpus (written; junior & senior high school; 10,000 samples; c. 670,000 tokens)
• NICT JLE Corpus (spoken; OPI-like interview data; 1,281 subjects; c. 2 million tokens)
• MEXT Data (1,600 third-year junior high school students; randomly sampled; written & spoken)
• GTEC for STUDENTS Writing Corpus (written; exam scripts; 30,000 samples; 2.5 million tokens)

2014–2015: Textbook corpora
• Exam materials
• Major ELT coursebooks based on the CEFR
• English textbooks used in Japan (for comparison)
Method of identifying criterial features
• Grammar
  – Data-driven approach
  – Extract all the grammar points taught at secondary school
  – Use machine learning to find out which features classify CEFR levels best; compare different classifiers: Decision Tree, Support Vector Machine, Random Forest, etc.
• Learner errors
  – Automatic error tagging
• Hypothesis testing
  – Theory-driven approach, focusing on particular grammatical properties
  – Verb subcategorization; postnominal modifiers; to-infinitives; articles; tense; collocation, etc.
• Lexical profiling
  – Measures of text characteristics:
    • Lexical richness measures: Guiraud; Yule’s K
    • Complexity measures: sentence length; T-unit length; VP per T-unit; clauses per sentence; complex nominals per clause/T-unit, etc.
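Of the lexical richness measures listed above, both Guiraud’s index (types divided by the square root of tokens) and Yule’s K (a repeat-rate measure based on the frequency spectrum) can be computed directly from token counts. A minimal sketch in plain Python (the function name is ours, not from the project):

```python
import math
from collections import Counter

def lexical_richness(tokens):
    """Return (Guiraud's index, Yule's K) for a list of word tokens."""
    n = len(tokens)                       # N: number of tokens
    freqs = Counter(tokens)
    v = len(freqs)                        # V: number of types
    guiraud = v / math.sqrt(n)
    # Yule's K = 10^4 * (sum_r r^2 * V_r - N) / N^2,
    # where V_r is the number of types occurring exactly r times
    spectrum = Counter(freqs.values())
    s2 = sum(r * r * v_r for r, v_r in spectrum.items())
    yule_k = 1e4 * (s2 - n) / (n * n)
    return guiraud, yule_k
```

Higher Guiraud values indicate richer vocabulary, while higher K values indicate more repetition; because both vary systematically across proficiency levels, they are natural candidate criterial features.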
The Impact of the CEFR-J
Masashi Negishi
TUFS
The Impact of the CEFR-J
• In 2011, the impact of the CEFR-J wasn’t yet clear.
Impact Analyses
• How?
Analyse Big Data.
Big Data Analysis
• The data analysed: 15,579,018 texts, written in Japanese, from August 2012 to September 2013
• The analyses: carried out by Jetrun Technology Inc.
• The results of the analyses: “Positive/Negative Graphs” and “Word Maps”
• The “Positive/Negative Graphs”: created by analysing the comments in terms of the writer’s attitude, based on the semantic database
• The “Word Maps”: indicate the relationships between the key words in the writing. The words were automatically analysed based on Jetrun Technology Inc.’s tailored database. The connections shown in the “Word Maps” link key words that occur in the same sentence. The relationships between the key words must be interpreted by looking not only at the main webs but also at the extended webs.
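The polarity classification behind such graphs is typically lexicon-based: each comment is scored against lists of positive and negative expressions. The actual analysis used Jetrun Technology Inc.’s proprietary semantic database; the word lists and function below are invented purely for illustration:

```python
# Hypothetical polarity lexicons -- the real analysis used Jetrun
# Technology Inc.'s proprietary semantic database, not these lists.
POSITIVE = {"useful", "helpful", "clear", "practical"}
NEGATIVE = {"confusing", "useless", "vague", "unnecessary"}

def polarity(comment):
    """Label a comment 'positive', 'negative', or 'neutral' by lexicon hits."""
    words = comment.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Aggregating such labels over all matched comments gives the kind of percentage breakdown shown in the graphs.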
Big Data Analysis
• TOEIC: 741
• TOEFL: 405
• Juken eigo (English for entrance exams): 117
• CEFR: 10
• CEFR-J: 3
The Positive/Negative Graphs
[Two pie charts. One: Positive 39%, Negative 17%, Neutral 44%. The other: Positive 75%, Negative 11%, Neutral 14%.]
[Word map for “CEFR”: key words include level, language, junior high school, elementary school, high school, teacher, student, policy, Taiwan, Europe, University of Bonn, self-assessment, CAN-DO, plural, below, half, use, English, and the levels A1, B1, B2]
[Word map for “CEFR-J”: key words include subdivision (of levels), goal, research, choice, sub-levels, learner(s), level, bottom-up, school, degree of achievement, can-do, development, adaptation, feature, existing levels, compatibility with the CEFR, setting, textbook, low, A1]
Impact on Testing, not on Teaching
North (2009: 307) argues that “... the impact of the descriptive
scheme or other aspects of the CEFR on curriculum or teaching
have as yet been very limited”, and he quotes Little (2007) as
follows:
A Price to Pay
CEFR-J “Can Do” descriptors: too narrow to reflect on teaching
[Diagram: levels A1, A2, B1]
Impact on language policy
Impact on teaching of other languages
Conclusion