On the statistical meaning of the item parameters in IRT models - … · 2018-09-12 ·...
Ernesto San Martín (LIES). Statistical meaning of IRT parameters. August 2018, Juiz de Fora, Brasil.
On the statistical meaning of the item parameters in IRT models
CONBRATRI VI
Ernesto San Martín
1 Faculty of Mathematics, Pontificia Universidad Católica de Chile, Chile
2 The Economics School of Louvain, Université catholique de Louvain, Belgium
3 Laboratorio Interdisciplinario de Estadística Social, LIES, Pontificia Universidad Católica de Chile, Chile
August 2018, Juiz de Fora, Brasil
What does it mean to model an educational phenomenon?
We observe the response patterns of 50 students on 7 items:
             It1  It2  It3  It4  It5  It6  It7
Student 1     1    0    0    1    1    0    1
Student 2     0    1    1    0    1    1    0
Student 3     1    1    1    1    0    1    0
Student 4     0    0    1    0    0    0    1
Student 5     0    0    0    1    1    1    1
Student 6     1    0    0    1    1    0    1
Student 7     0    1    1    0    0    1    0
Student 8     1    1    0    0    1    1    0
Student 9     0    0    1    1    0    0    0
Student 10    1    0    0    1    1    1    1
What does it mean to model an educational phenomenon?
We intend to describe the behavior of items.
We intend to describe the behavior of students.
In particular, we intend to identify test fraud, such as answer copying between two examinees.
We can address these questions using psychometric models.
Detecting answer copying and IRT models
IRT models: a student's answer to an item depends on both an individual characteristic (ability) and item characteristics (difficulty, discrimination, guessing).
3PL model:
\[ P(Y_{pi} = 1) = c_i + (1 - c_i)\,\frac{\exp(\alpha_i\theta_p - \beta_i)}{1 + \exp(\alpha_i\theta_p - \beta_i)}. \]
If $\alpha_i = 1$, then we obtain the 1PL-G model.
If $c_i = 0$, then we obtain the 2PL model.
If $\alpha_i = 1$ and $c_i = 0$, then we obtain the 1PL model.
There are other extensions: the 4PL model . . .
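As a minimal sketch of the 3PL formula and its special cases, the following Python function evaluates the response probability in the parametrization used on this slide. The function name `irt_probability` and the numerical values are illustrative assumptions, not part of the talk.

```python
import math


def irt_probability(theta, beta, alpha=1.0, c=0.0):
    """P(Y_pi = 1) under the 3PL model, slide parametrization.

    theta: person ability; beta: item difficulty;
    alpha: item discrimination; c: guessing parameter (lower asymptote).
    """
    logistic = 1.0 / (1.0 + math.exp(-(alpha * theta - beta)))
    return c + (1.0 - c) * logistic


# Special cases are obtained by fixing parameters (values are hypothetical):
p_3pl = irt_probability(0.5, beta=0.2, alpha=1.3, c=0.2)
p_1plg = irt_probability(0.5, beta=0.2, c=0.2)       # alpha = 1
p_2pl = irt_probability(0.5, beta=0.2, alpha=1.3)    # c = 0
p_1pl = irt_probability(0.5, beta=0.2)               # alpha = 1 and c = 0
```

With `c > 0` the probability never falls below `c`, however low the ability; this is the lower asymptote that the guessing parameter introduces.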
[Figure: curves of the probability of a correct response (vertical axis, 0.0 to 1.0) over the range −4 to 4 (horizontal axis), for beta = 0, beta = 1 and beta = −1.]
[Figure: scatter plot titled "DE=0.31; CIT=0.37". Horizontal axis: total score (10 to 40). Vertical axis: "Valor de CCEI" (0.0 to 1.0).]
Detecting answer copying and IRT models
Zopluoglu (2016) analyzes the statistical behavior of answer-copying indexes under different IRT models (dichotomous and polytomous).
He remarks that copying indexes share the same rationale, but the probability of choosing alternative k of item j for person p is computed using either an IRT model or the CTT framework.
The objective of his contribution is to analyze the type I error behavior and the statistical power of copying indexes.
Data: 40 mathematics items administered to 67896 students.
The 1PL, 2PL and 3PL models are fitted using IRTPRO (which provides MML estimators for the parameters that characterize the items, although it also provides Bayesian estimates).
[Figure, with data from Zopluoglu (2016): 1PL difficulties (horizontal axis, −4 to 1) against 2PL/3PL difficulties (vertical axis, −4 to 1). Legend: 1PL vs 2PL (0.81); 1PL vs 3PL (0.86).]
Detecting answer copying and IRT models
Study 1: manipulates sample sizes, amount of copying, type of copying, and levels of item difficulty.
Item difficulties: easy and medium difficulties . . . “overall test difficulty was manipulated through the b parameters for the dichotomous IRT models” (p. 595).
Remark: Given the high correlations between the estimators of the difficulties, it is possible to keep this part of the design comparable across the different IRT models . . . Or is it? Why?
“For easy test difficulty conditions, test difficulty was around .80 for typical simulated response data which was similar to the real dataset. For medium test difficulty conditions, the b parameters for dichotomous IRT models [. . . ] were manipulated such that test difficulty was around .50 for typical simulated response data. This was accomplished by adding a constant of 1.52 for the 1PL, 1.56 for the 2PL, and 1.53 for the 3PL model to the b parameters used for the easy conditions” (pp. 595-596).
Detecting answer copying and IRT models
Copying types:
Random copying: it is assumed that the student who copies takes responses from a source student at random, so that all items are equally likely to be copied.
Difficulty weighted copying: the items are ordered from the easiest to the most difficult. The probability that an item is copied is proportional to its rank.
Outcome of interest: the AUC is used as a measure of classification accuracy, that is, of how well an index separates answer-copying pairs of students from honest pairs.
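The two copying types can be sketched as two weighting schemes over the items. The code below is a hypothetical illustration built only from the description on this slide; the function names, the sequential sampling without replacement and the example difficulties are assumptions, and Zopluoglu's actual simulation procedure may differ.

```python
import random


def copy_weights(difficulties, scheme="random"):
    """Per-item probability of being selected for copying.

    difficulties: item difficulty values (larger means harder).
    scheme: "random" gives every item the same weight; "difficulty" makes
    the weight proportional to the item's rank (easiest = 1, hardest = n).
    """
    n = len(difficulties)
    if scheme == "random":
        return [1.0 / n] * n
    order = sorted(range(n), key=lambda i: difficulties[i])
    rank = [0] * n
    for r, i in enumerate(order, start=1):
        rank[i] = r
    total = n * (n + 1) / 2          # sum of the ranks 1..n
    return [r / total for r in rank]


def sample_copied_items(difficulties, n_copied, scheme="random", seed=0):
    """Draw n_copied distinct item indices, one at a time, by weight."""
    rng = random.Random(seed)
    weights = copy_weights(difficulties, scheme)
    remaining = list(range(len(difficulties)))
    chosen = []
    for _ in range(n_copied):
        w = [weights[i] for i in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen)


# Hypothetical difficulties for a 5-item test.
example_items = sample_copied_items([-1.0, 0.5, 2.0, 0.0, 1.2], 3, "difficulty")
```

Under the "difficulty" scheme the hardest item is the most likely to be copied, which is the sense in which copying is difficulty weighted.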
Detecting answer copying and IRT models
Results:
The amount of copying is the most important factor affecting classification performance.
The difficulty level of the items and the copying type have a negligible effect.
The performance of the indexes was expected to be reduced to some degree by the presence of guessing, but it is not.
Critical review?
Initial question: What does it mean to model an educational phenomenon?
It means to provide structural definitions of educational concepts
either by introducing explicit structural definitions,
or by introducing statistical models the parameters of which “represent some properties of the population under analysis” (Fisher, 1922).
More questions . . .
What do we mean by the expression "difficulty of an item" in an IRT model?
What do we mean by "guessing parameter"?
How can we structurally define copying? Are we ready to believe that some “irregular” patterns can be “discovered” by a statistical technique? How can we find something that we did not previously define?
More fundamentally, can we measure without having a theory about what we intend to measure?
Statistical model
A statistical model fully describes the observations.
A parameter $\omega$ captures features of the observations only if it is identified, namely, if the mapping $\omega \mapsto P_\omega$ is injective.
[Diagram: arrows linking the parameter $\omega$, the sampling distribution $P_\omega$ and the observations.]
Therefore, only identified parameters can be statistically interpreted.
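Identifiability and the Bayesian approach
Let $(Y \mid \psi, \lambda) \sim N(\psi + \lambda, \sigma_Y^2)$ be the statistical model (likelihood).
The parameters $(\psi, \lambda)$ are not identified.
Prior distribution: $\psi \sim N(\mu_\psi, \sigma_\psi^2)$, $\lambda \sim N(\mu_\lambda, \sigma_\lambda^2)$, and $\psi \perp\!\!\!\perp \lambda$.
We can compute the posterior distribution of $\psi \mid Y$, and therefore the problem seems to be solved: we learn about a non-identified parameter.
However, the posterior distribution of $\psi$ is a function of the posterior distribution of the identified parameter $\psi + \lambda$, because
\[ E[\psi \mid Y] = \eta_{\psi,\lambda} + \frac{\sigma_\psi^2}{\sigma_\psi^2 + \sigma_\lambda^2}\, E[\psi + \lambda \mid Y], \]
where $\eta_{\psi,\lambda}$ is a constant that depends only on the prior.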
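The identity for $E[\psi \mid Y]$ can be checked numerically. The sketch below computes the posterior mean of $\psi$ in two ways: directly from the joint normality of $(\psi, Y)$, and through the posterior mean of the identified sum $\psi + \lambda$. The explicit constant $\eta_{\psi,\lambda} = \mu_\psi - w(\mu_\psi + \mu_\lambda)$, with $w = \sigma_\psi^2 / (\sigma_\psi^2 + \sigma_\lambda^2)$, is worked out for this sketch and is not stated on the slide; the function names and numbers are illustrative.

```python
def posterior_mean_psi_direct(y, mu_psi, mu_lam, var_psi, var_lam, var_y):
    """E[psi | Y = y] from the joint normal distribution of (psi, Y)."""
    var_total = var_psi + var_lam + var_y        # Var(Y); Cov(psi, Y) = var_psi
    return mu_psi + var_psi / var_total * (y - mu_psi - mu_lam)


def posterior_mean_sum(y, mu_psi, mu_lam, var_psi, var_lam, var_y):
    """E[psi + lambda | Y = y]: normal prior, normal likelihood update."""
    m = mu_psi + mu_lam
    tau2 = var_psi + var_lam
    return m + tau2 / (tau2 + var_y) * (y - m)


def posterior_mean_psi_via_sum(y, mu_psi, mu_lam, var_psi, var_lam, var_y):
    """E[psi | Y = y] written as eta + w * E[psi + lambda | Y = y]."""
    w = var_psi / (var_psi + var_lam)
    eta = mu_psi - w * (mu_psi + mu_lam)         # constant fixed by the prior
    return eta + w * posterior_mean_sum(y, mu_psi, mu_lam,
                                        var_psi, var_lam, var_y)


# Hypothetical values: y, mu_psi, mu_lam, var_psi, var_lam, var_y.
args = (2.3, 0.5, -1.0, 2.0, 3.0, 1.5)
direct = posterior_mean_psi_direct(*args)
via_sum = posterior_mean_psi_via_sum(*args)
```

Both routes give the same number: the data enter the posterior of $\psi$ only through the posterior of $\psi + \lambda$, and the weight $w$ and the constant are fixed by the prior alone.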
Identifiability and the Bayesian approach
The previous analysis is an example of a general theorem.
The likelihood is exhaustively described by the identified parameter.
The learning-by-observing process is fully concentrated on the identified parameter in the sense that, conditionally on it, there is nothing more to learn about the unidentified ones.
The identified parameter is the largest parameter that can be represented as a function of the observations.
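1PL model fixed effect I
The DGP corresponds to drawing correct and incorrect answers, with replacement, from an urn. [Figure: urn.]
It is assumed that the $\{Y_{pi}\}$ are mutually independent and that
\[ \frac{P(Y_{pi} = 1)}{P(Y_{pi} = 0)} = \frac{\varepsilon_p}{\eta_i}, \quad \text{that is,} \quad P(Y_{pi} = 1) = \frac{\varepsilon_p}{\varepsilon_p + \eta_i}. \]
The parameters of interest are identified provided that $\eta_1 = 1$.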
1PL model fixed effect II
The identification restriction $\eta_1 = 1$ makes it possible to define person ability relative to the standard item 1:
\[ \varepsilon_p = \frac{P(Y_{p1} = 1)}{P(Y_{p1} = 0)}. \]
This allows the abilities of two persons to be compared with respect to the standard item 1 (Specific Objectivity):
\[ \varepsilon_p > \varepsilon_q \iff P(Y_{p1} = 1) > P(Y_{q1} = 1). \]
1PL model fixed effect III
The identification restriction makes it possible to identify (and, therefore, to define) the difficulty of an item i relative to the standard item 1:
\[ \eta_i = \frac{P(Y_{p1} = 1)}{P(Y_{p1} = 0)} \cdot \frac{P(Y_{pi} = 0)}{P(Y_{pi} = 1)}, \]
\[ \eta_i > \eta_j \iff P(Y_{pj} = 1) > P(Y_{pi} = 1). \]
Joint representation “ability-difficulty”:
\[ \varepsilon_p > \eta_i \iff P(Y_{pi} = 1) > P(Y_{pi} = 0). \]
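The role of the restriction $\eta_1 = 1$ in the fixed-effects 1PL model can be illustrated with a short sketch. It assumes the multiplicative Rasch form in which the odds of a correct answer are $\varepsilon_p / \eta_i$, which is what the identities on these slides imply; the function names and the numerical values are hypothetical.

```python
def prob_correct(eps, eta):
    """P(Y_pi = 1) when the odds of a correct answer are eps / eta."""
    return eps / (eps + eta)


def odds(p):
    """Odds p / (1 - p) of an event with probability p."""
    return p / (1.0 - p)


def normalize(eps_list, eta_list):
    """Rescale abilities and difficulties so that eta_1 = 1 (item index 0)."""
    k = eta_list[0]
    return [e / k for e in eps_list], [h / k for h in eta_list]


eps = [0.5, 2.0]            # two persons (hypothetical values)
eta = [2.0, 1.0, 4.0]       # three items; index 0 plays the role of item 1

# Without a restriction, multiplying every parameter by the same constant
# leaves all response probabilities unchanged: the parameters are not identified.
probs = [[prob_correct(e, h) for h in eta] for e in eps]
probs_scaled = [[prob_correct(3 * e, 3 * h) for h in eta] for e in eps]

# With eta_1 = 1, each parameter becomes a function of response probabilities.
eps_n, eta_n = normalize(eps, eta)
ability_from_item_1 = [odds(row[0]) for row in probs]              # eps_p
difficulty_from_odds = [odds(probs[0][0]) / odds(p) for p in probs[0]]  # eta_i
```

After normalization, `ability_from_item_1` reproduces `eps_n` and `difficulty_from_odds` reproduces `eta_n`, whichever person is used to compute the difficulties.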
1PL model random effects I
Framework: the basic decompositions of Classical Test Theory,
\[ Y = E(Y \mid \theta) + [Y - E(Y \mid \theta)], \]
\[ \mathrm{Var}(Y) = \mathrm{Var}[E(Y \mid \theta)] + E[\mathrm{Var}(Y \mid \theta)], \]
and the Axiom of Local Independence.
The DGP consists of choosing an urn at random. Once the urn has been chosen, correct and incorrect responses are drawn with replacement.
1PL model random effects II
Under the Axiom of Local Independence (ALI), the responses of each person are independent conditionally on $\theta_p$, and therefore correlated with each other marginally.
Considering a semi-parametric version of the model (that is, $\theta_p \overset{iid}{\sim} G$), an identification restriction is needed, namely $\eta_1 = 1$.
The identification restriction makes it possible to identify (and, therefore, to define) the difficulty of an item i:
\[ \eta_i = \frac{P(Y_{p1} = 1, Y_{pi} = 0)}{P(Y_{p1} = 0, Y_{pi} = 1)}, \]
\[ \eta_i > \eta_j \iff P(Y_{pj} = 1) > P(Y_{pi} = 1). \]
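A small numerical check of the random-effects identity, under stated assumptions: abilities are written on the multiplicative scale $\varepsilon = \exp(\theta)$, the conditional odds of a correct answer are $\varepsilon / \eta_i$, and $G$ is a hypothetical discrete distribution with three support points. None of these numerical choices come from the talk.

```python
def p_correct(eps, eta):
    """Conditional probability of a correct answer given ability eps."""
    return eps / (eps + eta)


def joint_prob(y1, yi, eta_1, eta_i, abilities, weights):
    """P(Y_p1 = y1, Y_pi = yi): local independence, ability integrated out."""
    total = 0.0
    for eps, w in zip(abilities, weights):
        p_1 = p_correct(eps, eta_1)
        p_i = p_correct(eps, eta_i)
        total += w * (p_1 if y1 else 1.0 - p_1) * (p_i if yi else 1.0 - p_i)
    return total


ABILITIES = [0.3, 1.0, 2.5]     # hypothetical support of G (epsilon scale)
WEIGHTS = [0.2, 0.5, 0.3]       # hypothetical probability masses
ETA_1, ETA_I = 1.0, 1.7         # standard item and a second item

args = (ETA_1, ETA_I, ABILITIES, WEIGHTS)
# Difficulty recovered from marginal pattern probabilities only.
ratio = joint_prob(1, 0, *args) / joint_prob(0, 1, *args)

# Marginal dependence between the two responses of the same person.
marg_1 = joint_prob(1, 0, *args) + joint_prob(1, 1, *args)
marg_i = joint_prob(0, 1, *args) + joint_prob(1, 1, *args)
covariance = joint_prob(1, 1, *args) - marg_1 * marg_i
```

Here `ratio` equals `ETA_I` whatever the choice of `ABILITIES` and `WEIGHTS`, which is why the difficulty is identified without specifying $G$; `covariance` is positive, reflecting the marginal dependence of the responses.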
1PL model random effects III
What kind of scientific statements can be made with respect to persons? To what extent do such statements relate persons and items?
In the context of CTT, scientific statements at the person level can be made by taking into account the dual decomposition
\[ \theta_p = E(\theta_p \mid Y_p) + [\theta_p - E(\theta_p \mid Y_p)]. \]
Given a response pattern, which is the most probable urn from which the responses come? The conditional distribution of $(\theta_p \mid Y_p)$ can be used to classify persons into “ability-urns”.
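The classification of persons into “ability-urns” can be sketched with Bayes' rule over a discrete ability distribution. This is an illustration under assumptions: three hypothetical urns with equal prior mass, abilities on the multiplicative scale, and conditional odds $\varepsilon / \eta_i$ for a correct answer.

```python
def posterior_over_urns(pattern, etas, abilities, weights):
    """Posterior probability of each ability urn given a 0/1 response pattern."""
    joint = []
    for eps, w in zip(abilities, weights):
        like = w                                  # prior mass of the urn
        for y, eta in zip(pattern, etas):
            p = eps / (eps + eta)                 # P(correct | urn)
            like *= p if y == 1 else 1.0 - p
        joint.append(like)
    total = sum(joint)
    return [v / total for v in joint]


ETAS = [1.0, 0.5, 2.0]            # hypothetical item difficulties, eta_1 = 1
ABILITIES = [0.3, 1.0, 2.5]       # hypothetical urns
WEIGHTS = [1 / 3, 1 / 3, 1 / 3]   # equal prior mass

post_all_correct = posterior_over_urns([1, 1, 1], ETAS, ABILITIES, WEIGHTS)
post_all_wrong = posterior_over_urns([0, 0, 0], ETAS, ABILITIES, WEIGHTS)
most_probable_urn = max(range(len(ABILITIES)), key=lambda g: post_all_correct[g])
```

A person is assigned to the urn with the largest posterior probability; with equal prior masses, an all-correct pattern points to the highest-ability urn and an all-incorrect pattern to the lowest.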
Summary
                      | Rasch (1960) | Lord (1952)
DGP                   | Responses mutually independent across persons and items | For each person, responses mutually dependent
Specific Objectivity  | Valid | Invalid
Classification        | Invalid | Valid
Difficulty            | $\frac{P(Y_{p1} = 1)}{P(Y_{p1} = 0)} \cdot \frac{P(Y_{pi} = 0)}{P(Y_{pi} = 1)}$ | $\frac{P(Y_{p1} = 1, Y_{pi} = 0)}{P(Y_{p1} = 0, Y_{pi} = 1)}$
1PL-G model random effects I
Axiom of Local Independence, $\theta_p \overset{iid}{\sim} G$,
\[ P(Y_{pi} = 1) = c_i + (1 - c_i)\,\frac{\exp(\theta_p - \beta_i)}{1 + \exp(\theta_p - \beta_i)}. \]
What do the parameters $c_i$ and $\beta_i$ mean?
The item parameters are identified provided that $\beta_1 = 0$ and $c_1 = 0$.
1PL-G model random effects II
Conditional odds with respect to the DGP:
\[ O_{k|j} = \frac{P(Y_k = 1 \mid Y_j = 1,\ Y_l = 0 \text{ for all } l \neq k, j)}{P(Y_k = 0 \mid Y_j = 1,\ Y_l = 0 \text{ for all } l \neq k, j)}, \]
\[ O_{k|0} = \frac{P(Y_k = 1 \mid Y_l = 0 \text{ for all } l \neq k)}{P(Y_k = 0 \mid Y_l = 0 \text{ for all } l \neq k)}. \]
With respect to the standard item 1, it follows that
\[ O_{k|1} \geq O_{k|j} \geq O_{k|0} \quad \text{for all } k. \]
For the semiparametric 1PL model, it follows that
\[ O_{k|1} = O_{k|j} \geq O_{k|0} \quad \text{for all } k. \]
1PL-G model random effects III
A measure of the amount of information of item j with respect to the standard item 1 is defined as
\[ IM_j = \frac{O_{k|j} - O_{k|0}}{O_{k|1} - O_{k|0}} \in [0, 1]. \]
It locates $O_{k|j}$ between $O_{k|0}$ and $O_{k|1}$ in $(O_{k|1} - O_{k|0})$-units: $1 - IM_j$ is the distance between $O_{k|j}$ and $O_{k|1}$.
$IM_j = 1$ if and only if the item behaves as a 1PL item.
Therefore, $IM_j < 1$ is equivalent to the item behaving differently from a 1PL item.
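The conditional odds and the measure $IM_j$ can be computed by brute force for a small test. The sketch below assumes a hypothetical 4-item 1PL-G test in which index 0 plays the role of the standard item 1 ($\beta = 0$, $c = 0$), and a hypothetical discrete ability distribution $G$; all names and values are illustrative.

```python
import math


def pattern_prob(pattern, betas, cs, thetas, weights):
    """Marginal probability of a 0/1 pattern under the 1PL-G model with a
    discrete ability distribution (support thetas, masses weights)."""
    total = 0.0
    for theta, w in zip(thetas, weights):
        like = w
        for y, beta, c in zip(pattern, betas, cs):
            p = c + (1.0 - c) / (1.0 + math.exp(-(theta - beta)))
            like *= p if y == 1 else 1.0 - p
        total += like
    return total


def cond_odds(k, j, betas, cs, thetas, weights):
    """O_{k|j}: odds of item k correct given item j correct and every other
    item incorrect. Pass j=None for O_{k|0} (all other items incorrect)."""
    base = [0] * len(betas)
    if j is not None:
        base[j] = 1
    hit, miss = list(base), list(base)
    hit[k] = 1
    return (pattern_prob(hit, betas, cs, thetas, weights)
            / pattern_prob(miss, betas, cs, thetas, weights))


def info_measure(j, k, betas, cs, thetas, weights):
    """IM_j computed through an item k different from j and from item 0."""
    o_kj = cond_odds(k, j, betas, cs, thetas, weights)
    o_k1 = cond_odds(k, 0, betas, cs, thetas, weights)
    o_k0 = cond_odds(k, None, betas, cs, thetas, weights)
    return (o_kj - o_k0) / (o_k1 - o_k0)


BETAS = [0.0, 0.5, -0.3, 1.0]     # hypothetical difficulties
CS = [0.0, 0.2, 0.0, 0.1]         # hypothetical guessing parameters
THETAS = [-1.0, 0.0, 1.5]         # hypothetical support of G
WEIGHTS = [0.3, 0.4, 0.3]

im_guessing_item = info_measure(1, 2, BETAS, CS, THETAS, WEIGHTS)    # c = 0.2
im_rasch_like_item = info_measure(2, 1, BETAS, CS, THETAS, WEIGHTS)  # c = 0
```

In this example the item without guessing has `im_rasch_like_item` equal to 1 up to rounding, the item with guessing falls strictly between 0 and 1, and the value does not depend on which item k is used to compute it.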
1PL-G model random effects IV
If $c_j \neq 0$, then
\[ \exp(\beta_j) = \frac{1 + O_{j|0}(1 - IM_j)}{IM_j} \cdot \frac{O_{1|0}}{O_{j|0}}. \]
$\beta_j$ is decreasing with respect to $O_{j|0}$: it seems “natural”.
$\beta_j$ is decreasing with respect to $IM_j$: if the behavior of the item is “near” to that of a 1PL item, then its value decreases.
1PL-G model random effects V
For an item $k \neq 1$,
\[ c_k = \frac{O_{k|0}(1 - IM_k)}{1 + O_{k|0}(1 - IM_k)}. \]
$c_k$ is increasing with respect to $O_{k|0}$: if, conditionally on answering all the items other than item k incorrectly, answering item k correctly becomes more probable relative to answering it incorrectly, then $c_k$ increases.
$c_k$ is decreasing with respect to $IM_k$: if the behavior of the item is “near” to that of a 1PL item, then $c_k$ decreases.
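As a hedged numerical illustration, the item parameters of a small 1PL-G test can be recovered from pattern probabilities alone. The guessing parameter uses the formula for $c_k$ on this slide. The difficulty uses the rearrangement $\exp(\beta_k) = \frac{1 + O_{k|0}(1 - IM_k)}{IM_k} \cdot \frac{O_{1|0}}{O_{k|0}}$, which was worked out from the model equations for this sketch and should be treated as an assumption. The test, the discrete ability distribution and all names are hypothetical; index 0 plays the role of the standard item 1.

```python
import math


def pattern_prob(pattern, betas, cs, thetas, weights):
    """Marginal probability of a 0/1 pattern under the 1PL-G model with a
    discrete ability distribution (support thetas, masses weights)."""
    total = 0.0
    for theta, w in zip(thetas, weights):
        like = w
        for y, beta, c in zip(pattern, betas, cs):
            p = c + (1.0 - c) / (1.0 + math.exp(-(theta - beta)))
            like *= p if y == 1 else 1.0 - p
        total += like
    return total


def cond_odds(k, j, betas, cs, thetas, weights):
    """O_{k|j}; pass j=None for O_{k|0} (all other items incorrect)."""
    base = [0] * len(betas)
    if j is not None:
        base[j] = 1
    hit, miss = list(base), list(base)
    hit[k] = 1
    return (pattern_prob(hit, betas, cs, thetas, weights)
            / pattern_prob(miss, betas, cs, thetas, weights))


def recover_item_parameters(k, probe, betas, cs, thetas, weights):
    """Recover (c_k, beta_k) from conditional odds. `probe` is any item
    other than k and the standard item; it is used only to compute IM_k."""
    model = (betas, cs, thetas, weights)
    o_k0 = cond_odds(k, None, *model)
    o_10 = cond_odds(0, None, *model)
    o_probe_k = cond_odds(probe, k, *model)
    o_probe_1 = cond_odds(probe, 0, *model)
    o_probe_0 = cond_odds(probe, None, *model)
    im_k = (o_probe_k - o_probe_0) / (o_probe_1 - o_probe_0)
    odds_c = o_k0 * (1.0 - im_k)                  # equals c_k / (1 - c_k)
    c_k = odds_c / (1.0 + odds_c)
    beta_k = math.log((1.0 + odds_c) / im_k * o_10 / o_k0)
    return c_k, beta_k


BETAS = [0.0, 0.5, -0.3, 1.0]     # hypothetical difficulties
CS = [0.0, 0.2, 0.0, 0.1]         # hypothetical guessing parameters
THETAS = [-1.0, 0.0, 1.5]         # hypothetical support of G
WEIGHTS = [0.3, 0.4, 0.3]

c_hat, beta_hat = recover_item_parameters(1, 2, BETAS, CS, THETAS, WEIGHTS)
```

The recovered pair matches the values used to generate the pattern probabilities, which illustrates that $c_k$ and $\beta_k$ are functions of the conditional odds once the standard item is fixed.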
Summary
In a 1PL-G model, the parameters $\beta_j$ and $c_j$ represent how far the behavior of the item is from that of a 1PL item.
This comparison can be made because the identification restriction ensures that the model contains a 1PL-type item.
Discussion/Questions
The item parameters $\beta_j$ and $c_j$ can be written in terms of $O_{j|k}$ and $O_{j|0}$.
These terms are defined conditionally on answering all the other items incorrectly, or all the other items except one.
Is there a conflict with the “definitions” of copying? The question is meaningful because two examinees could copy incorrect alternatives, and therefore all their responses are incorrect. In such a case, the parameters of the 1PL-G model seem to include copying behavior: how is it possible to analyze copying behavior using IRT models?
Discussion/Questions
Difficulty and guessing are not univocal designations: it is not appropriate to compare item parameters obtained by fitting different IRT models.
Our understanding of what these models allow us to analyze still needs to improve.
Acknowledgement
Research supported by the FONDECYT Project Number 1181261 of the Chilean government.
References
E. San Martín, J. González and F. Tuerlinckx (2009). Identified Parameters, Parameters of Interest and Their Relationships. Measurement: Interdisciplinary Research and Perspective 7, 95-103.
E. San Martín, A. Jara, J.-M. Rolin and M. Mouchart (2011). On the Bayesian nonparametric generalization of Rasch-type models. Psychometrika 76, 385-409.
E. San Martín and J.-M. Rolin (2013). Identification of Parametric Rasch-type Models. Journal of Statistical Planning and Inference 143, 116-130.
E. San Martín, J.-M. Rolin and L. M. Castro (2013). Identification of the 1PL Model with guessing parameter: Parametric and Semi-parametric results. Psychometrika 78, 341-379.
E. San Martín (2015). Identification of Item Response Theory Models. In: W. van der Linden (Ed.), Handbook of Item Response Theory, Volume 2, Chapter 8.
E. San Martín and P. De Boeck (2015). What do you mean by a difficult item? On the interpretation of the difficulty parameter in Rasch models. In: Roger E. Millsap, Daniel M. Bolt, L. Andries van der Ark, Wen-Chung Wang (Eds.), Quantitative Psychology Research, Springer Proceedings in Mathematics & Statistics 89, Chapter 1.
E. San Martín, J. González and F. Tuerlinckx (2015). On the Unidentifiability of the Fixed-Effects 3PL Model. Psychometrika 80, 450-467.
E. San Martín (2015). Modelos Rasch: ¿cuán (in-)coherentemente son presentados y utilizados? Actualidades en Psicología 29, 91-102.
P. Fariña, J. González and E. San Martín (2018). An identifiability-based strategy for the interpretation of parameters in IRT models; under revision for Psychometrika.
E. San Martín (2018). Identifiability of Structural Characteristics: How relevant is it in the Bayesian Approach? Brazilian Journal of Probability and Statistics 32, 346-373.