Estimation of Item Difficulty Index Based on Item Response Theory for Computerized Adaptive Testing

Authors: Shu-Chen Cheng, Guan-Yu Chen

Intelligent System Lab. (iLab), Southern Taiwan University of Science and Technology

Outline

1. Introduction

2. Literature Reviews

3. Methods

4. Experiments and Results

5. Conclusions


1. Introduction (1/2)

• Computerized Adaptive Testing (CAT) based on Item Response Theory (IRT)

– Advantages: personalized tests, shorter test length.

– Shortcoming: calibrating the items requires a large number of pre-test samples.
• IRT-1PL: 20 items, 200 testees (Wright & Stone, 1979)
• IRT-2PL: 30 items, 500 testees (Hulin et al., 1982)
• IRT-3PL: 60 items, 1000 testees (Hulin et al., 1982)

(There are 1,513 items in our item bank!)


1. Introduction (2/2)

• Test System = Item Bank + Item Selection

– Item Bank: Item Difficulty Index, estimated from the Answers Abnormal Rate.

– Item Selection: Dynamic Item Selection Strategy, based on Particle Swarm Optimization.


2. Literature Reviews

2.1 Computerized Adaptive Testing

2.2 Item Difficulty Index

2.3 Item Response Theory


2.1 Computerized Adaptive Testing (1/2)

• Select the item whose difficulty best matches the testee’s ability.

• Assess the testee’s ability immediately.

• The difficulty of the next item depends on the previous answer.


2.1 Computerized Adaptive Testing (2/2)

• Test different ability levels through a dynamic item selection strategy.
– High-ability testee → no items that are too easy.
– Low-ability testee → no items that are too difficult.

• A personalized test.
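The selection idea on these two slides can be illustrated with a minimal sketch. It is a greedy simplification, not the PSO-based strategy used in the system. The function names, the fixed step of 0.1, and the clamping to the range 0.1 to 0.9 are assumptions made for illustration only.

```python
def select_next_item(ability, difficulties, used):
    """Return the index of the unused item whose difficulty is closest to ability."""
    candidates = [i for i in range(len(difficulties)) if i not in used]
    return min(candidates, key=lambda i: abs(difficulties[i] - ability))


def update_ability(ability, correct, step=0.1, low=0.1, high=0.9):
    """Assumed rule: move one level up after a correct answer, one level down otherwise."""
    new = ability + step if correct else ability - step
    return round(min(high, max(low, new)), 1)


# Hypothetical four-item bank and a testee starting at ability 0.5.
difficulties = [0.1, 0.5, 0.9, 0.4]
first = select_next_item(0.5, difficulties, used=set())
ability = update_ability(0.5, correct=True)
second = select_next_item(ability, difficulties, used={first})
```

Because the ability estimate moves after every answer, the difficulty of the next item follows the previous answer, which is the behaviour the slides describe.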


2.2 Item Difficulty Index (1/2)

• Method 1:

$P = \frac{R}{N} \times 100\%$

P: Item difficulty.
R: The number of correct answers.
N: The number of total testees.


2.2 Item Difficulty Index (2/2)

• Method 2:

$P = \frac{P_H + P_L}{2}$

P: Item difficulty.
P_H: Correct rate of the high-score group.
P_L: Correct rate of the low-score group.
(The high and low groups are generally the top and bottom 25%, 27%, 33%, etc. of testees.)
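The two classical indices can be computed as in the sketch below. The function names, the input layout (a list of 0/1 responses, and a list of score/response pairs), and the default group fraction of 27% are assumptions for illustration. Method 1 returns a percentage and Method 2 a proportion, following the formulas as written on the slides.

```python
def difficulty_method1(responses):
    """P = R / N * 100: percentage of testees who answered the item correctly."""
    return 100.0 * sum(responses) / len(responses)


def difficulty_method2(scored_responses, fraction=0.27):
    """P = (PH + PL) / 2 over the top and bottom `fraction` of testees by total score.

    scored_responses: list of (total_score, answered_correctly) pairs, one per testee,
    where answered_correctly is 1 or 0.
    """
    ranked = sorted(scored_responses, key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))  # size of each group
    high, low = ranked[:k], ranked[-k:]
    ph = sum(correct for _, correct in high) / k
    pl = sum(correct for _, correct in low) / k
    return (ph + pl) / 2


# Hypothetical data: eight testees, groups taken as the top and bottom 25%.
data = [(90, 1), (80, 1), (70, 1), (60, 0), (50, 1), (40, 0), (30, 1), (20, 0)]
p1 = difficulty_method1([correct for _, correct in data])
p2 = difficulty_method2(data, fraction=0.25)
```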


2.3 Item Response Theory (1/2)

• Item Response Theory (Lord, 1980)

– Estimates a testee’s ability, aptitude, or position on another continuous psychological scale from the information in their item responses.

– Ability location → item response (psychometric theory).

– Apart from the IRT model itself, no other information is needed to describe the item responses.


2.3 Item Response Theory (2/2)

• Three-Parameter Logistic Model (Birnbaum, 1968)

P_i(θ): Probability of a correct answer to item i at ability θ.
a_i: Discrimination parameter of item i.
b_i: Difficulty parameter of item i.
c_i: Guessing parameter of item i.
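The equation on this slide did not survive extraction. For reference, the commonly cited form of the three-parameter logistic model is $P_i(\theta) = c_i + \frac{1 - c_i}{1 + e^{-a_i(\theta - b_i)}}$. The sketch below evaluates that form. Some formulations also include a scaling constant D of about 1.7 in the exponent; it is omitted here, which is an assumption, as is the function name.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Standard 3PL form (assumed): c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# With no guessing (c = 0), a testee whose ability equals the item
# difficulty has a 50% chance of answering correctly.
p_equal = p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.0)
# With a guessing parameter of 0.2 the same testee reaches 0.2 + 0.8 / 2.
p_guess = p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)
```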


3. Methods (1/4)

• Answers

1) Testees’ ability > item difficulty index: most testees are expected to answer correctly.

2) Testees’ ability < item difficulty index: most testees are expected to answer incorrectly.

3) Testees’ ability = item difficulty index: the correct answer rate is expected to be 50%.


3. Methods (2/4)

• Answers Abnormal
– An answer that violates any of the three assumptions above is an abnormal answer:

1st group (testee’s ability > item difficulty): wrong answers.

2nd group (testee’s ability < item difficulty): correct answers.

3rd group (testee’s ability = item difficulty): correct answer rate ≠ 0.5.


3. Methods (3/4)

• Answers Abnormal Rate

$AAR_{ij}$: Answers abnormal rate of item i with difficulty j.

T: The number of correct answers.
F: The number of wrong answers.
N: The number of total testees.

h: 1st group (testee’s ability > item difficulty).
l: 2nd group (testee’s ability < item difficulty).
e: 3rd group (testee’s ability = item difficulty).


3. Methods (4/4)

• Item Difficulty

$P_i = \arg\min_j AAR_{ij}$, i.e. the difficulty level j for which $AAR_{ij}$ is the smallest.

$P_i$: Item difficulty index of item i.
$AAR_{ij}$: Answers abnormal rate of item i with difficulty j.
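The estimation procedure can be sketched as follows. The exact AAR formula is not recoverable from the slide text, so the form used here is an assumption and not the authors' formula: the abnormal answers are counted as the wrong answers in the 1st group, the correct answers in the 2nd group, and half the imbalance between correct and wrong answers in the 3rd group, all divided by N. The function names and the data layout are also hypothetical.

```python
LEVELS = [round(0.1 * k, 1) for k in range(1, 10)]  # nine levels, 0.1 to 0.9


def answers_abnormal_rate(answers, difficulty):
    """Assumed AAR of one item for one candidate difficulty level.

    answers: list of (testee_ability, answered_correctly) pairs for the item.
    """
    f_h = sum(1 for ab, ok in answers if ab > difficulty and not ok)   # 1st group, wrong
    t_l = sum(1 for ab, ok in answers if ab < difficulty and ok)       # 2nd group, correct
    t_e = sum(1 for ab, ok in answers if ab == difficulty and ok)      # 3rd group, correct
    f_e = sum(1 for ab, ok in answers if ab == difficulty and not ok)  # 3rd group, wrong
    # Assumption: answers beyond an even split in the 3rd group count as abnormal.
    return (f_h + t_l + abs(t_e - f_e) / 2) / len(answers)


def estimate_difficulty(answers, levels=LEVELS):
    """P_i: the level j with the smallest AAR_ij (ties go to the lowest level)."""
    return min(levels, key=lambda j: answers_abnormal_rate(answers, j))


# Hypothetical answers to one item: weaker testees fail, stronger ones succeed.
answers = [(0.2, False), (0.3, False), (0.4, False), (0.5, True),
           (0.5, False), (0.6, True), (0.7, True), (0.8, True)]
estimated = estimate_difficulty(answers)
```

Each item is scored only against its own answers, so a new item can be estimated without recalibrating the rest of the bank.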


4. Experiments and Results

4.1 System Descriptions

4.2 Experiment Descriptions

4.3 Results and Discussions


4.1 System Descriptions (1/3)

http://ilearning.csie.stust.edu.tw/EST/Dedault.aspx



PSO Dynamic Item Selection Strategy

• Item Difficulty

• Knowledge Weights

• Item Exposure Rate


4.2 Experiment Descriptions

• Method: Online test

• Item Bank:
– Items: 1,513
– Initial Difficulty: 0.5 (9 levels, 0.1~0.9)

• Participants:
– Students: 51
– Initial Ability: 0.2 (9 levels, 0.1~0.9)

• Period: 6 weeks


4.3 Results and Discussions (1/3)

[Figure: Number of Items (y-axis, 0 to 200) for each Item Difficulty Index (x-axis, 0.1 to 0.9).]


[Figure: Number of Adjusted Items (y-axis, 0 to 800) per week (x-axis, Weeks 1 to 6). Data labels: 688, 76, 51, 29, 20, 16.]


[Figure: Average Adjusted Levels (y-axis, 0 to 0.25) per week (x-axis, Weeks 1 to 6).]


5. Conclusions

• Each test item is treated as independent, and its difficulty can be estimated individually. Therefore, the item bank can be expanded easily at any time.

• The estimation method proposed in this study, based on the answers abnormal rate, can estimate the item difficulty index quickly and reasonably without requiring many pre-test samples.


The End. Thanks for your attention!
