Statistics of EBO 2010 Examination EBO General Assembly Sunday June 21st, 2010 (Tallin, Estonia)...

Statistics of EBO 2010 Examination

EBO General Assembly, Sunday June 21st, 2010 (Tallinn, Estonia)

Danny G.P. Mathysen, MSc. Biomedical Sciences

EBOD Assessment and Executive Officer

Antwerp University Hospital, Department of Ophthalmology
Wilrijkstraat 10, B-2650 Edegem, Belgium

E-mail: [email protected]

Score calculation for Written Paper
EBOD 2010 Candidate Population

• Written examination (MCQs)
– 310 candidates

• Oral examination
– 308 candidates
– 1 candidate did not show up for Viva Voce
– 1 candidate showed up for only some of the Viva Voce topics

Score calculation for Written Paper
EBOD 2010 Scoring rules

• Question Number (1 to 52)
• Item Number (A to E)

• T (True)
• F (False)
• D (Don’t know)

Marks obtained?

+1
– In case ONLY the correct answer was completed

0
– In case ONLY the D option was completed

–0.5
– In case ONLY the incorrect answer was completed
– In case T AND F were completed
– In case NOTHING was completed (blank item)
– In case D was COMBINED with T and/or F

Score calculation for Written Paper
EBOD 2010 Scoring rules

• Candidate score for MCQ-1 (simulation): +2.5
– A True (Correct Answer: True) +1
– B False (Correct Answer: False) +1
– C True (Correct Answer: True) +1
– D Don’t know (Correct Answer: True) 0
– E True (Correct Answer: False) –0.5
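The scoring rules and the MCQ-1 simulation can be sketched in a few lines of Python. This is only an illustration, not the software used at the examination: the function names `score_item` and `score_question` are hypothetical, and the sketch assumes that every response pattern other than "only the correct answer" or "only D" (wrong answer, T and F together, blank item, D combined with T and/or F) is scored –0.5, as the slide layout suggests.

```python
def score_item(marked, correct):
    """Score one T/F item.

    marked  -- set of options the candidate completed, drawn from {"T", "F", "D"}
    correct -- the keyed answer, "T" or "F"
    """
    if marked == {correct}:
        return 1.0   # ONLY the correct answer was completed
    if marked == {"D"}:
        return 0.0   # ONLY the "Don't know" option was completed
    # Assumption: all remaining patterns (incorrect answer, T and F together,
    # blank item, D combined with T and/or F) are scored -0.5.
    return -0.5


def score_question(responses, key):
    """Sum the item scores of one question (items A to E)."""
    return sum(score_item(responses[item], key[item]) for item in key)


# MCQ-1 simulation from the slide
key = {"A": "T", "B": "F", "C": "T", "D": "T", "E": "F"}
responses = {"A": {"T"}, "B": {"F"}, "C": {"T"}, "D": {"D"}, "E": {"T"}}
mcq1_total = score_question(responses, key)
print(mcq1_total)  # 2.5
```

With 52 questions of 5 items each, the maximum total under these rules is 260.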

• Advantages for EBO candidates of T/F items
– Reliable in case of translation (English, French, German)
    • choice of language will not result in being (dis)advantaged
– Accessibility (e.g. dyslexia)
    • not too complicated for candidates
– Duration of the examination
    • stress level of candidates can be kept to a minimum
– Relatively easy to process
    • results can be presented on-site

• Disadvantage for EBO candidates of T/F items
– Probability of guessing right = 50 %
    • level of weakest candidates is overestimated (oral examination)

EBOD 2010 Negative Marking

• Hypothesis on the influence of negative marking
– Average scores will drop (punishment of incorrect answers)
– Spread of candidate scores will enlarge (→ room for discrimination)
– Rit-value of individual items will increase
– Reliability of EBOD will increase

• Argument against negative marking expressed by European Board of Anaesthesiology
– Negative marking is discriminating towards female candidates

EBOD 2010 Negative Marking

[Figure: 2009 vs. 2010 comparison; axis range 0 to 260]

• How to overcome the disadvantages of T/F items?
– Introduction of negative marking
    • Increase of discriminative power of examination
    • Reduction of guess factor
        – wild guesses will be punished (weakest candidates)
        – guesses by reasoning (partial knowledge) will be rewarded

NEGATIVE MARKING AT EBOD 2010
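A short expected-value calculation illustrates why negative marking reduces the guess factor. The sketch below assumes +1 for a correct answer and –0.5 for an incorrect one, with 0 for ticking "Don't know"; the function name `expected_guess_score` and the example probabilities are hypothetical.

```python
def expected_guess_score(p_correct, penalty=0.5):
    """Expected marks for answering one item when the answer is right
    with probability p_correct (+1 if right, -penalty if wrong)."""
    return p_correct * 1.0 - (1.0 - p_correct) * penalty


# Coin-flip guess without and with negative marking
wild_no_penalty = expected_guess_score(0.5, penalty=0.0)  # 0.5 marks per item
wild_penalty = expected_guess_score(0.5)                  # 0.25 marks per item

# Hypothetical candidate with partial knowledge (right 8 times out of 10)
informed = expected_guess_score(0.8)

# Probability at which answering is worth the same as ticking D (score 0)
break_even = 0.5 / 1.5  # one third
```

Under these assumptions a pure coin-flip guess is worth half as much as it was without negative marking (0.25 instead of 0.5 marks per item, or 65 instead of 130 over 260 items), and answering only pays off relative to the D option when the candidate is right more than one time in three.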

EBOD 2010 Spread of Scores

         2009      2010
Min      154       61.5
Max      230       209
Mean     204.11    145.99
Stdev    13.04     24.76

Score calculation for Written Paper

EBOD 2010 Statistical Output (SpeedWell)

• EBOD 2009
– Degree of Difficulty (P-value) of 0.79 (overestimated due to guessing)
– Estimation of a large proportion of candidates guessing (> 33 %)

• EBOD 2010
– Introduction of the “Don’t know” option
    • reduction of wild guesses
    • used on average for 15 % of items (or 39 items) per candidate
– Degree of Difficulty (P-value) of 0.66
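As a rough illustration of these quantities, the sketch below computes a P-value and a "Don't know" rate for a single item. The slides do not state how SpeedWell derives the P-value, so this assumes the common definition (the proportion of candidates who answered the item correctly); the function names and the ten item scores are hypothetical.

```python
def item_p_value(item_scores):
    """Proportion of candidates who scored +1 on the item (assumed definition)."""
    return sum(1 for s in item_scores if s == 1) / len(item_scores)


def dont_know_rate(item_scores):
    """Proportion of candidates who scored 0, i.e. ticked only the D option."""
    return sum(1 for s in item_scores if s == 0) / len(item_scores)


# Hypothetical scores of ten candidates on one item (+1, 0 or -0.5)
scores = [1, 1, 0, -0.5, 1, 0, 1, 1, -0.5, 1]
p_value = item_p_value(scores)   # 6 of 10 candidates correct
d_rate = dont_know_rate(scores)  # 2 of 10 candidates ticked D

# 15 % of the 260 items corresponds to the 39 items quoted on the slide
d_items_per_candidate = round(0.15 * 260)
```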

EBOD 2010 Degree of Difficulty

• Point biserial correlation coefficient (Rit)
– Estimator of the correlation between the individual item scores Xi (either -0.5, 0 or 1), and the total MCQ scores Yi (ranging from 61.5 to 209) of the candidates
– Scale: -1, 0, +1

$$ Rit = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{X_i-\bar{X}}{s_X}\right)\left(\frac{Y_i-\bar{Y}}{s_Y}\right) $$

→ correlation between item and total MCQ score
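The Rit of one item can be computed as the sample correlation between the candidates' scores on that item and their total MCQ scores. The following is a minimal sketch using only the Python standard library; the function name `rit` and the six candidates' scores are hypothetical, and sample standard deviations (divisor n - 1) are assumed.

```python
from statistics import mean, stdev


def rit(item_scores, total_scores):
    """Correlation between item scores X_i and total scores Y_i,
    computed as the average product of standardised deviations."""
    n = len(item_scores)
    x_bar, y_bar = mean(item_scores), mean(total_scores)
    s_x, s_y = stdev(item_scores), stdev(total_scores)  # sample SDs (n - 1)
    products = (
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(item_scores, total_scores)
    )
    return sum(products) / (n - 1)


# Hypothetical data: six candidates, one item (+1, 0 or -0.5) and their totals
item = [1, 1, 0, -0.5, 1, -0.5]
total = [180, 165, 140, 90, 200, 120]
rit_value = rit(item, total)
```

A value close to +1 means that candidates with high total scores tend to answer the item correctly; a value near 0 means the item hardly discriminates between strong and weak candidates.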

EBOD 2010 Point Biserial Correlation

• Cronbach’s coefficient alpha (r) = 0.87 (2009: 0.78)
– Estimator of the lower bound of the internal consistency (degree to which all MCQ leaves are measuring the same, i.e. knowledge of candidates) of EBOD 2010 (95% CI: 0.86 – 0.89)

$$ r = \frac{260}{260-1}\left(1 - \frac{\sum_{i=1}^{260} s_i^2}{\left(\sum_{i=1}^{260} s_i \, Rit_i\right)^2}\right) = 0.87 $$

→ internal consistency of EBOD MCQ-test is good
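Cronbach's alpha can be sketched from a candidates-by-items score matrix. The example below uses the standard variance-based form, alpha = k/(k-1) · (1 - sum of item variances / variance of total scores), which is an assumption about the computation rather than a description of the SpeedWell output; the function name `cronbach_alpha` and the small score matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(score_matrix):
    """score_matrix: one row per candidate, one column per item."""
    k = len(score_matrix[0])                                  # number of items
    item_vars = [variance(col) for col in zip(*score_matrix)]  # per-item variance
    total_var = variance([sum(row) for row in score_matrix])   # variance of totals
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: five candidates, four items scored +1, 0 or -0.5
scores = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 0, -0.5],
    [0, -0.5, -0.5, -0.5],
    [-0.5, -0.5, -0.5, -0.5],
]
alpha = cronbach_alpha(scores)
```

For EBOD 2010 the matrix would have 310 rows (candidates) and 260 columns (items).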

EBOD 2010 Internal Consistency

• EBOD 2010 Written Examination
– 310 Candidates
– 168 Male Candidates
– 142 Female Candidates

• Percentage of candidates using the “Don’t know” option
– Male candidates: used on average for 13% of items (34 items)
– Female candidates: used on average for 16% of items (42 items)
– Statistically significant (p = 0.02)

EBOD 2010 Male vs. Female Candidates

• Average absolute candidate scores
– Male candidates: 148.21
– Female candidates: 143.36
– NOT statistically significant (p > 0.05)

• Distribution of converted candidate scores (1-10)
– NOT statistically significant (p > 0.05) when comparing all scores
– NOT statistically significant (p > 0.05) when comparing ≤ 5 versus ≥ 6

EBOD 2010 Male vs. Female Candidates

Score     1   2   3   4   5   6   7   8   9  10
Male      0   1   0   5   9  34  27  30  27  35
Female    3   1   0   6   8  21  35  25  23  20
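The comparison of converted scores ≤ 5 versus ≥ 6 can be checked from the table with a 2×2 test. The slides do not name the test that was used, so the sketch below assumes a Pearson chi-square test without continuity correction (1 degree of freedom); the helper name `split_at_five` is hypothetical.

```python
import math

# Number of candidates per converted score (1 to 10), taken from the table
male = [0, 1, 0, 5, 9, 34, 27, 30, 27, 35]
female = [3, 1, 0, 6, 8, 21, 35, 25, 23, 20]


def split_at_five(counts):
    """Return (candidates scoring <= 5, candidates scoring >= 6)."""
    return sum(counts[:5]), sum(counts[5:])


a, b = split_at_five(male)    # male:   <= 5, >= 6
c, d = split_at_five(female)  # female: <= 5, >= 6
n = a + b + c + d

# Pearson chi-square statistic for a 2x2 table (no continuity correction)
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Upper-tail probability of a chi-square distribution with 1 degree of freedom
p_value = math.erfc(math.sqrt(chi2 / 2))

print(p_value > 0.05)  # True
```

Under this assumed test the difference is not significant at the 5 % level, in line with the result reported on the slide.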

• In general:
– Average scores dropped (204.11 → 145.99)
– Spread of results became larger (13.0 → 24.8)
– Internal consistency (Cronbach-α) improved (0.78 → 0.87)
– P-value was less overestimated due to D option (0.79 → 0.66)
– Rit-value improved (0.14 → 0.18)

• When comparing male and female candidates:
– Female candidates (D option ticked for 42 items on average) are more prudent where guessing is concerned than male candidates (D option ticked for 34 items on average) (p = 0.02)
– However, without negative impact on ability to pass EBOD 2010!

EBOD 2010 Negative Marking: Conclusions
