Caveon Webinar Series: Using Decision Theory for Accurate Pass/Fail Decisions
Transcript of Caveon Webinar Series: Using Decision Theory for Accurate Pass/Fail Decisions
Upcoming Caveon Events
• Caveon Webinar Series: Next session, June 19 – Protecting Your Tests Using Copyright Law
– Presenters include Intellectual Property Attorney Kenneth Horton and a member of the Caveon Web Patrol team
– Register at: http://bit.ly/protectingip
• NCSA – June 19-21, National Harbor, MD
– Dr. John Fremer is co-presenting Preventing, Detecting, and Investigating Test Security Irregularities: A Comprehensive Guidebook On Test Security For States
– Visit the Caveon booth!
Latest Publications
• Handbook of Test Security – Now available for purchase! We’ll share a discount code before the end of the session.
• TILSA Guidebook for State Assessment Directors on Data Forensics – coming soon!
Caveon Online
• Caveon Security Insights Blog – http://www.caveon.com/blog/
• Twitter – Follow @Caveon
• LinkedIn – Caveon Company Page; “Caveon Test Security” Group – please contribute!
• Facebook – Will you be our “friend?” “Like” us!
www.caveon.com
“Using Decision Theory to Score Accurate Pass/Fail Decisions”
Lawrence M. Rudner, Ph.D., MBA
Vice President and Chief Psychometrician, Research and Development
GMAC®
May 15, 2013
Caveon Webinar Series:
Jamie Mulkey, Ed.D.
Vice President and General Manager, Test Development Services
Caveon
Agenda for today
• Role of decision theory
• Examples
• Logic
• Tools
• Adaptive Testing
Goal of Measurement Decision Theory
Classify an examinee into one of K groups
– master / non-master
– below basic / basic / proficient / advanced
– A / B / C / D / F
Poll #1
Are you involved with any classification tests as part of your work?
Attendee Responses:
Yes – Pass/Fail – 49%
Yes – Multiple categories, e.g. A, B, C, D, F – 39%
No – 11%
Poll #2
How familiar are you with Item Response Theory?
Attendee Responses:
Very – I understand and routinely apply IRT formulas – 37%
Somewhat – I understand the logic and concepts – 38%
A little – I have heard of it – 20%
Not at all – I have never heard of it – 5%
Poll #3
What is your primary job function?
Attendee Responses:
Teacher or Content Expert – 6%
Item Writer – 8%
Psychometrician – 30%
Manager (non-psychometrician) – 35%
Manager (psychometrician) – 21%
Usual Approach

[Figure: population distribution over an ability scale from −3 to 3]
New Thinking
Probability of being a Master or a Non-Master
A Different Question
Old: Your score was 76 which is above the passing score of 72. You passed.
vs
New: Probability of this response pattern for a master is 85% and the probability for a non-master is 15%. You passed.
IRT Approach
Probability of a correct response to Question 123 given ability level
Question 123
New Thinking
Probability of a correct response to Question 123 for Masters and Non-Masters
Question 123
Advantages
• Simple framework
• Small number of items
• Small calibration sample sizes
• Classifies as well as or better than IRT
• Effective for adaptive testing
• Well developed science
Applications
• Intelligent Tutoring Systems
• Diagnostic Testing
• Personality Assessment
• Automated Essay Scoring
• Certification Examinations
• End-of-course examinations
Examples
A Certification Examination
MDT
Logic
Notation
• K – the number of mastery states
• P(mk) – the probability that a randomly drawn examinee is in mastery state k
• z – an individual’s response vector z1, z2, …, zN, where zi ∈ {0, 1}, for N questions
Want
P(mk | z )
The probability of each mastery state k, mk, given the response vector z.
The probability of being a master given z
The probability of being a non-master given z
Do you recognize these people?
Bayes Theorem
• P(a|b)*P(b) = P(b|a)*P(a)
P(mk | z) = c · P(z | mk) · P(mk)
Mastery state (using Bayes Theorem)
P(mk | z) = c · P(z | mk) · P(mk)
But there are too many possible response vectors z
Simplifying assumption

Assume the item responses are locally independent within each mastery state. The likelihood then factors into a product over the N items:

P(z | mk) = ∏ i=1..N P(zi | mk)
Basic Concept

Conditional probabilities of a correct response, P(zi = 1 | mk):

                   Item 1   Item 2   Item 3
Masters (m1)         .8       .8       .6
Non-masters (m2)     .3       .6       .5

Examinee 1 – Response Vector [1, 1, 0]

Probability of the response vector z for each mastery state:
P(z | m1) = .8 * .8 * (1 - .6) = .26
P(z | m2) = .3 * .6 * (1 - .5) = .09
Normalized:
P(m1 | z) = .26 / (.26 + .09) = .74
P(m2 | z) = .09 / (.26 + .09) = .26
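The worked example above is easy to sketch in code. This is a minimal Python illustration (the function names and data layout are mine, not from the presentation); the conditional probabilities are taken from the table, and the response vector is Examinee 1's, with equal priors assumed for the normalization:

```python
def likelihood(response, p_correct):
    """P(z | m_k): product over items, using p if the item was
    answered correctly and (1 - p) otherwise."""
    prob = 1.0
    for z, p in zip(response, p_correct):
        prob *= p if z == 1 else (1 - p)
    return prob

def posteriors(response, states):
    """Normalized P(m_k | z), assuming equal priors."""
    likes = {k: likelihood(response, ps) for k, ps in states.items()}
    total = sum(likes.values())
    return {k: v / total for k, v in likes.items()}

# Conditional probabilities from the slide's table
states = {"master": [0.8, 0.8, 0.6], "non_master": [0.3, 0.6, 0.5]}

post = posteriors([1, 1, 0], states)  # Examinee 1
# master ≈ .74, non-master ≈ .26, matching the slide
```

Running this reproduces the slide's numbers: likelihoods of about .26 and .09, and normalized posteriors of .74 and .26.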
Examinee 2 – Response Vector [0, 0, 1]

Probability of the response vector z for each mastery state:
P(z | m1) = (1 - .8) * (1 - .8) * .6 = .024
P(z | m2) = (1 - .3) * (1 - .6) * .5 = .14

Normalized:
P(m1 | z) = .024 / (.024 + .14) = .15
P(m2 | z) = .14 / (.024 + .14) = .85
Check Yourself – Examinee 3

Response Vector [1, 0, 1]

Poll: 1. Master  2. Non-master
Probability of the response vector z for each mastery state:
P(z | m1) = .8 * (1 - .8) * .6 = .096
P(z | m2) = .3 * (1 - .6) * .5 = .06

Normalized:
P(m1 | z) = .096 / (.096 + .06) = .62
P(m2 | z) = .06 / (.096 + .06) = .38

Examinee 3 is classified as a master.
Decision Criteria
Decision Rule – Maximum Likelihood

• Probability of the response vector z for each mastery state:
P(z | m1) = .8 * .8 * (1 - .6) = .26
P(z | m2) = .3 * .6 * (1 - .5) = .09

Select the state with the largest likelihood: classify this examinee as a Master.

[Figure: bar chart of P(z | mk) – Master .26, Non-Master .09]
Decision Rule – Maximum a posteriori probability

• Probability of each mastery state, using priors P(m1) = .7 and P(m2) = .3:
P(m1 | z) = c * .26 * .7 = c * .182 = .87
P(m2 | z) = c * .09 * .3 = c * .027 = .13

[Figure: bar chart of P(mk | z) – Master .87, Non-Master .13]
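The MAP rule can be sketched the same way. The priors P(m1) = .7 and P(m2) = .3 are the ones the slide's arithmetic uses; everything else is illustrative:

```python
def map_posteriors(likelihoods, priors):
    """P(m_k | z) ∝ P(z | m_k) * P(m_k), then normalize."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    c = sum(joint)
    return [j / c for j in joint]

# Likelihoods .26 / .09 from the previous slide; priors .7 / .3
post = map_posteriors([0.26, 0.09], [0.7, 0.3])
# ≈ [0.87, 0.13]: classify as Master
```

The only difference from maximum likelihood is the multiplication by the priors before normalizing; with a lopsided population, the two rules can disagree.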
Decision Criteria
Bayes Risk
Given a set of item responses z and the costs associated with each decision, select dk to minimize the total expected cost.
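A hedged sketch of this rule in Python. The cost matrix below is hypothetical, since the slides do not give one; cost[d][k] is the cost of making decision d when the examinee's true state is mk:

```python
def bayes_decision(posteriors, cost):
    """Pick the decision d minimizing the expected cost
    sum_k P(m_k | z) * cost(d, m_k)."""
    expected = {d: sum(p * c for p, c in zip(posteriors.values(), costs))
                for d, costs in cost.items()}
    return min(expected, key=expected.get)

post = {"master": 0.74, "non_master": 0.26}   # Examinee 1's posteriors
cost = {"pass": [0, 10],   # passing a non-master costs 10 (illustrative)
        "fail": [2, 0]}    # failing a master costs 2 (illustrative)
decision = bayes_decision(post, cost)
```

With these illustrative costs, the expected cost of passing (.26 × 10 = 2.6) exceeds that of failing (.74 × 2 = 1.48), so the risk-minimizing decision is "fail" even though the posterior favors mastery. That asymmetry is exactly what Bayes risk lets a program express.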
Tools
Tools and Resources
http://edres.org/mdt
• Paper
• Java Applet
• Excel tool download
• Tools for:
– Data Generation
– Item Calibration
– Scoring
– CAT simulation (in progress)

http://bit.ly/pareonline
Example
Adaptive Testing
1. Sequentially select items to maximize certainty,
2. Administer and score item,
3. Update the estimated mastery state classification probabilities,
4. Evaluate whether there is enough information to terminate testing,
5. Back to Step 1 if needed.
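The five steps above can be sketched as follows. The specifics are illustrative, not from the presentation: this sketch selects the unused item with the lowest expected posterior entropy, updates the posterior with Bayes theorem after each response, and stops once one state's probability clears a .95 threshold:

```python
import math

def entropy(p):
    # H(S) = -sum p_k log2 p_k
    return -sum(q * math.log2(q) for q in p if q > 0)

def update(prior, correct, p_item):
    # Bayes update for one item; p_item[k] = P(correct | m_k)
    like = [p if correct else (1 - p) for p in p_item]
    joint = [pr * l for pr, l in zip(prior, like)]
    c = sum(joint)
    return [j / c for j in joint]

def expected_entropy(prior, p_item):
    # Average posterior entropy over the two possible responses
    p_correct = sum(pr * p for pr, p in zip(prior, p_item))
    return (p_correct * entropy(update(prior, True, p_item))
            + (1 - p_correct) * entropy(update(prior, False, p_item)))

def adaptive_test(items, answer, threshold=0.95, prior=(0.5, 0.5)):
    """items: {id: (P(correct|master), P(correct|non-master))};
    answer(id) -> bool simulates administering the item."""
    prior, unused, administered = list(prior), set(items), []
    while unused and max(prior) < threshold:
        item = min(unused, key=lambda i: expected_entropy(prior, items[i]))
        unused.discard(item)
        prior = update(prior, answer(item), items[item])
        administered.append(item)
    return prior, administered

# Illustrative run: the three items from the earlier table, and an
# examinee who answers everything correctly
items = {1: (0.8, 0.3), 2: (0.8, 0.6), 3: (0.6, 0.5)}
post, used = adaptive_test(items, answer=lambda i: True)
```

With only these three items the .95 threshold is never reached, so all three are administered and the final posterior is about .81 master. Real implementations vary the selection and stopping rules; minimum expected entropy is one common choice.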
Sequential Testing
Claude Shannon
Entropy
A measure of the disorder of a system.
How many bits of information are needed to send:
a) 1,000,000 random signals?
b) 1,000,000 zeros?
H(S) = − Σ (k = 1 to K) pk log2 pk
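The formula is a one-liner in code (a direct transcription, using the base-2 logarithm):

```python
import math

def entropy(p):
    # H(S) = -sum p_k log2 p_k; skip zero probabilities
    return -sum(q * math.log2(q) for q in p if q > 0)

entropy([0.5, 0.5])   # 1.0  – maximum uncertainty for two states
entropy([0.8, 0.2])   # ≈ 0.72 – more peaked, less entropy
```

These two values match the H(S) = 1.00 and H(S) = 0.72 bar charts on the next slide.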
Less peaked = more uncertainty = more entropy

[Figure: two Master/Non-Master probability bar charts – the flat distribution has H(S) = 1.00, the peaked one H(S) = 0.72]
Adaptive Testing
[Figure: percent classified vs. accuracy as a function of the maximum number of items administered, 0–50 (NAEP items)]
Recap
• Simple framework
• Small number of items
• Classifies as well as or better than much more complicated IRT
• Effective for adaptive testing
• Small sample sizes
• Well developed science
An Option For
• Small certification programs
• Large certification programs
• Embedded in instructional systems
• Test preparation
HANDBOOK OF TEST SECURITY
• Editors – James Wollack & John Fremer
• Published March 2013
• Preventing, Detecting, and Investigating Cheating
• Testing in Many Domains
– Certification/Licensure
– Clinical
– Educational
– Industrial/Organizational
• Don’t forget to order your copy at www.routledge.com
– http://bit.ly/HandbookTS (Case Sensitive)
– Save 20% – Enter discount code: HYJ82
Questions?
Please type questions for our presenters in the GoToWebinar control panel on your screen
THANK YOU!
- Follow Caveon on Twitter @caveon
- Check out our blog… www.caveon.com/blog
- LinkedIn Group – “Caveon Test Security”
Lawrence M. Rudner, Ph.D., MBA
Vice President and Chief Psychometrician, Research and Development
GMAC®

Jamie Mulkey, Ed.D.
Vice President and General Manager, Test Development Services
Caveon