John De Jong: Optimizing Test & Courseware Development

Post on 22-Jan-2018


Optimizing Test & Courseware Development

Lisbon, 23 April 2016

John De Jong, SVP Global Assessment Standards, Pearson

Professor of Language Testing, VU University Amsterdam


PISA: Programme for International Student Assessment

PISA development over time

2000: Reading, Mathematics and Science

2003: Reading, Mathematics and Science

2006: Reading, Mathematics and Science

2009: Reading, Mathematics and Science

+ Optional Electronic Reading

2012: Reading, Mathematics and Science

+ Optional Electronic Mathematics

2015: Electronic: Reading, Mathematics and Science

+ Collaborative Problem Solving

2018: Reading, Mathematics and Science

+ Global Competence


Lessons from PISA

Major drivers of success of countries

• Clear standards defined at national level

• High level of teacher autonomy


… then, how to define standards?

Ranking CPS in higher education and workplace

Applied Skill                          Rank Educ   Rank Work
Oral Communications                    3           1
Teamwork / Collaboration               3           1
Problem Solving                        1           2
Written Communications                 2           2
Information Technology Application     4           3
Lifelong Learning / Self Direction     2           4
Professionalism / Work Ethic           5           4
Ethics / Social Responsibility         6           4
Creativity / Innovation                3           5
Diversity                              7           6
Leadership                             7           7

Survey results

The CPS definition …                                   Agree %
is clearly described                                   97
matches my own understanding of CPS                    95
will help higher ed institutions to understand CPS     88
will help employers to understand CPS                  100
is what is taught in my country                        52

Crucial reformation targets

• Establish needs

• Define learning objectives

• Define coherent and realistic curriculum

• Engage students


Structural approach to defining objectives

[Figure axes: Difficulty; Domain; Language; Domains of language use / Topics]

Self / personal experience

Negotiating with others

Deal with new

Academic

Specialized

Jokes

GE: A1 A2 B1 B2 C1 C2

AE: General MBA

PE: Waiter Politician

Coherent bank of objectives

A General Model of Language Development

[Figure: General Cognition plotted against Language Proficiency]

Measuring within one population of language learners measures both linguistic and general cognitive development.

Measuring across two populations of language learners may measure cognitive development only.

Including an appropriate native-speaker population can help to measure linguistic development only.

[Axis scales: “language age” 0, 1, 2, 3, 4, 5, etc.; “cognitive age” 0, 1, 2, 3, 4, 5, etc.]

The Global Scale of English


Comparison of PTE Academic (GSE scale) with IELTS and TOEFL iBT

Sample page (from B1)

The Pearson Syllabus – General English


The need for English

Overview

• A vocabulary framework linked to the Global Scale of English (GSE) and the CEFR

• Organized by topics and subtopics based on the CoE Vantage specifications categorization

• Describing vocabulary targets for learners of general English

• A probabilistic model of productive vocabulary learning

• Based on the principle of incremental learning of word meanings, from basic to specialized

• Including 20k+ lemmas; 37k+ meanings; 80k+ collocations; 7k+ functional units

• Helping learners, teachers, and materials designers identify level-appropriate vocabulary

Methodology

Combines frequency data and teacher judgements in five main steps:

1. Corpus of 2.5 billion words > extraction of a frequency list

2. Semantic annotation

• Manual tagging of 37k word meanings using the CoE ‘Vantage’ categorization

3. Teacher ratings

• Rating of 37k word meanings, each by 10 teachers (scale: 1 to 5 + 99)

4. Statistical analysis

• Rank word meanings by combining frequency data and teacher ratings

5. Fit the data onto a model and link each meaning to the CEFR / GSE

Lemmas and meanings

Structure vocabulary around pedagogically relevant sets using the CoE Vantage categorization

Example: Specific Notions (Topics)

Fork > FOOD&DRINKS_tableware

Fork > SPORT&HOBBIES_gardening

Fork > TRAVEL_directions
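The "fork" example can be read as one lemma carrying several topic tags, one per meaning. The sketch below is only an illustration of that idea: the dictionary layout and the helper name `index_by_topic` are assumptions, not the structure used in the actual framework.

```python
from collections import defaultdict

# Hypothetical layout: lemma -> list of TOPIC_subtopic tags (taken from the slide).
LEMMA_TOPICS = {
    "fork": [
        "FOOD&DRINKS_tableware",
        "SPORT&HOBBIES_gardening",
        "TRAVEL_directions",
    ],
}


def index_by_topic(lemma_topics):
    """Invert lemma -> tags into topic -> subtopic -> list of lemmas."""
    index = defaultdict(lambda: defaultdict(list))
    for lemma, tags in lemma_topics.items():
        for tag in tags:
            # Split on the first underscore only: TOPIC_subtopic.
            topic, subtopic = tag.split("_", 1)
            index[topic][subtopic].append(lemma)
    return index


topic_index = index_by_topic(LEMMA_TOPICS)
print(sorted(topic_index))                    # topics that contain "fork"
print(topic_index["TRAVEL"]["directions"])    # lemmas tagged TRAVEL_directions
```

Indexing by topic rather than by lemma is what lets a materials designer pull a pedagogically coherent set (for example, all tableware words) in one lookup.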


Theoretical assumptions

A model of vocabulary growth based on current literature:

• Basic (A1) > 500-1k words (500 words as minimum for elementary level, Hill, 2013; 500-1k as a general teaching target)

• Basic (A2) > boundary for high-frequency vocabulary set at 3k word families for everyday conversation (Adolphs & Schmitt, 2003)

• Independent (B1) > 5k word families to read authentic texts (Schmitt, 2007)

• Independent (B2) > minimum target of 10k lemmas at university level for Dutch (Hazenberg & Hulstijn, 1996); 8-9k word families for unassisted comprehension (Nation, 2006)

• Proficient (C1 upwards) > 20k word families known by educated L1 speakers (Nation, 2001); 50k words known by most L1 speakers (Crystal, 1981)

Hill, D. R. (2001). Survey: Graded readers. ELT Journal, 55(3), 300-324.

Adolphs, S. & Schmitt, N. (2003). Lexical coverage of spoken discourse. Applied Linguistics, 24(4), 425-438.

Schmitt, N. (2007). Current perspectives on vocabulary teaching and learning. In J. Cummins & C. Davison (Eds.), International handbook of English language teaching: Part II (pp. 827-841). New York: Springer.

Hazenberg, S. & Hulstijn, J. H. (1996). Defining a minimal receptive second-language vocabulary for non-native university students: An empirical investigation. Applied Linguistics, 17(2), 145-163.

Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? The Canadian Modern Language Review, 63(1), 59-82.

Nation, P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.

Schmitt, N. (2000). Vocabulary in language teaching. Cambridge: Cambridge University Press, pp. 7-8.

Crystal, D. (1981). Clinical linguistics. Vienna: Springer.

Data modelling 1

From GSE to ModelLem

$y = 0.006\,x^{3.539}$, $R^2 = 0.9842$

[Chart: x axis 10 to 90; y axis 0 to 60,000; series: Hypothesis 'CumLem' and Model 'ModelLem']
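The fitted curve on this slide reads as a power law, y = 0.006 · x^3.539. A minimal sketch for evaluating it follows, assuming x is the GSE value (10 to 90) and y the modelled cumulative number of lemmas, as the slide title and axis ranges suggest; the function name `model_lemmas` is illustrative only.

```python
def model_lemmas(gse: float) -> float:
    """Evaluate the fitted power law y = 0.006 * x ** 3.539.

    Assumption: x is a GSE value and y is the modelled cumulative
    lemma count ('ModelLem' on the slide).
    """
    return 0.006 * gse ** 3.539


# Tabulate the curve at the GSE tick marks shown on the chart.
for gse in range(10, 100, 10):
    print(f"GSE {gse}: about {model_lemmas(gse):,.0f} lemmas")
```

Under these assumptions the curve stays within the chart's 0 to 60,000 range across GSE 10 to 90.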

Meanings vs Lemmas

[Chart: Average number of Meanings per Lemma (y axis 1.0 to 2.5) by level: <T, T, A1, A2, A2+, B1, B1+, B2, B2+, C1, C2]

Vocabulary growth

[Chart: Vocabulary growth by level; y axis 0 to 14,000; levels PreT, T, A1, A2, A2+, B1, B1+, B2, B2+, C1, C2; series: New Meanings, New Lemmas]

Cumulative vocabulary growth

[Chart: Cumulative Vocabulary Growth by Level; y axis 0 to 60,000; levels PreT, T, A1, A2, A2+, B1, B1+, B2, B2+, C1, C2; series: Cumul Meanings, Cumul Lemmas]

The vocabulary usefulness rating

1 = Essential words learners would want to acquire first

2 = Important words that become necessary at a next stage

3 = Useful words enabling more detailed and specific language

4 = Nice to have words to express concepts more accurately

5 = Extra words some language users will use occasionally

99 = “Escape” words which are impossible to rate: you have never heard of the word before, or you cannot decide between widely different ratings

Teachers received online training and followed specific guidelines

Each word was rated by a random 10 out of the 19 raters in an overlapping design using a pre-defined scale of 1-5
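The rating design described here (each word seen by a random 10 of 19 raters) can be sketched as below. This is an illustration under assumptions: the slides do not say how raters were drawn or how the 99 "escape" code was handled, so the simple random draw and the exclusion of 99 from the average are both hypothetical choices, as are the function names.

```python
import random

ESCAPE = 99  # "escape" code from the usefulness rating scale


def assign_raters(meanings, n_raters=19, per_meaning=10, seed=0):
    """Draw `per_meaning` distinct raters (numbered 1..n_raters) for each meaning.

    Assumption: a plain random draw per meaning; the real overlapping
    design may have been balanced differently.
    """
    rng = random.Random(seed)
    rater_ids = range(1, n_raters + 1)
    return {m: sorted(rng.sample(rater_ids, per_meaning)) for m in meanings}


def rating_average(ratings):
    """Mean of the 1-5 ratings, skipping escape codes (an assumed treatment)."""
    valid = [r for r in ratings if r != ESCAPE]
    return sum(valid) / len(valid) if valid else None


assignment = assign_raters(["happy", "angry", "shifty"])
for meaning, raters in assignment.items():
    print(meaning, raters)
print(rating_average([1, 2, ESCAPE, 3]))
```

Because every word draws from the same pool of 19, any two raters share many words, which is what makes it possible to estimate rater reliability across the whole set.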

Combine ratings and Frequency data

$$\text{Combine} = \frac{R_a \times r_{\text{Rating}} + F_{\text{rank}} \times (1 - r_{\text{Rating}}) + F_{\text{rank}}}{2}$$

Where

Combine is the optimal combination of ratings and frequency data

$R_a$ is the rating average

$r_{\text{Rating}}$ is the reliability of the rating data

$F_{\text{rank}}$ is the scaled frequency rank.
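As a worked illustration, the combination formula on this slide can be written as a small function: (Ra × rRating + Frank × (1 − rRating) + Frank) / 2. The sketch assumes that the rating average and the scaled frequency rank are already on a common scale and that reliability lies between 0 and 1; the parameter names and the example inputs are hypothetical.

```python
def combine(rating_avg: float, rating_reliability: float, freq_rank: float) -> float:
    """Combine = (Ra * rRating + Frank * (1 - rRating) + Frank) / 2.

    rating_avg          -- Ra, the rating average
    rating_reliability  -- rRating, reliability of the rating data (assumed 0..1)
    freq_rank           -- Frank, the scaled frequency rank (assumed same scale as Ra)
    """
    if not 0.0 <= rating_reliability <= 1.0:
        raise ValueError("reliability is assumed to lie between 0 and 1")
    # Reliability-weighted mix of rating and frequency information.
    weighted = rating_avg * rating_reliability + freq_rank * (1.0 - rating_reliability)
    # Average that mix with the frequency rank itself.
    return (weighted + freq_rank) / 2.0


# Hypothetical inputs, chosen only to show the arithmetic.
print(round(combine(rating_avg=2.0, rating_reliability=0.8, freq_rank=3.0), 3))
```

Two boundary cases show how the formula behaves: with fully reliable ratings the result is the plain mean of rating and frequency rank, and with zero reliability it falls back to the frequency rank alone.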

Adjectives in People & relationships [personal traits]

A1: happy (23), good (22)

A2: angry (34), kind (36)

A2+: noisy (39), silly (40)

B1: upset (47), lonely (48)

B1+: confident (51), nasty (53)

B2: creative (59), sympathetic (63)

B2+: kind-hearted (67), spoiled (70)

C1: hypocritical (76), bashful (80)

C2: shifty (86), sycophantic (88)


$y = -3.8806x^2 + 42.05x - 24.081$, $R^2 = 0.9974$

[Chart: y axis 10 to 90 with level labels Tourist, A1, A2, B1, B2, C1, C2; x axis 1 to 5 with rating labels Essential, Important, Useful, Nice to have, Extra]
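The quadratic fit on this slide, y = -3.8806x² + 42.05x - 24.081, can be evaluated at the five rating points. The sketch assumes x is the usefulness rating (1 to 5) and y a position on the 10 to 90 GSE scale, which is what the axis ticks suggest; the function name is illustrative.

```python
RATING_LABELS = {
    1: "Essential",
    2: "Important",
    3: "Useful",
    4: "Nice to have",
    5: "Extra",
}


def rating_to_gse(rating: float) -> float:
    """Evaluate y = -3.8806 x^2 + 42.05 x - 24.081.

    Assumption: x is the usefulness rating and y is a GSE value.
    """
    return -3.8806 * rating ** 2 + 42.05 * rating - 24.081


for rating, label in RATING_LABELS.items():
    print(f"{rating} ({label}): {rating_to_gse(rating):.1f}")
```

The curve flattens towards rating 5, so under these assumptions the step from "Essential" to "Important" spans far more of the scale than the step from "Nice to have" to "Extra".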

[Chart: Likelihood of Success (0% to 100%) against GSE Task Difficulty (10 to 90)]

A learner at 25 on GSE

Girl, Mother

Boy, Father
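The chart for a learner at 25 on the GSE relates likelihood of success to task difficulty. The slides do not give the underlying function, so the following is only a sketch of one common way to draw such a curve: a logistic function, assumed here to give 50% when task difficulty equals the learner's level. The `scale` parameter is hypothetical and has no basis in the source.

```python
import math


def likelihood_of_success(learner_gse: float, task_gse: float, scale: float = 10.0) -> float:
    """Logistic curve that decreases as task difficulty rises above the learner's level.

    Assumptions (not from the slides): logistic shape, 50% when
    task_gse == learner_gse, and an arbitrary `scale` controlling steepness.
    """
    return 1.0 / (1.0 + math.exp((task_gse - learner_gse) / scale))


# A learner at 25 on the GSE, evaluated at the chart's difficulty ticks.
for task in range(10, 100, 10):
    print(f"task {task}: {likelihood_of_success(25, task):.0%}")
```

Whatever the real parameters are, the qualitative reading is the same: easier tasks are very likely to succeed, harder ones progressively less so.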

www.English.com/GSE
