On the statistical meaning of the item parameters in IRT models - … · 2018-09-12 ·...

Post on 22-Jun-2020

Ernesto San Martín (LIES), Statistical meaning of IRT parameters, August 2018, Juiz de Fora, Brazil

On the statistical meaning of the item parameters in IRT models

CONBRATRI VI

Ernesto San Martín
1 Faculty of Mathematics, Pontificia Universidad Católica de Chile, Chile
2 The Economics School of Louvain, Université catholique de Louvain, Belgium
3 Laboratorio Interdisciplinario de Estadística Social (LIES), Pontificia Universidad Católica de Chile, Chile

August 2018, Juiz de Fora, Brazil

What does it mean to model an educational phenomenon?

We observe the response patterns of 50 students on 7 items:

             It1  It2  It3  It4  It5  It6  It7
Student 1     1    0    0    1    1    0    1
Student 2     0    1    1    0    1    1    0
Student 3     1    1    1    1    0    1    0
Student 4     0    0    1    0    0    0    1
Student 5     0    0    0    1    1    1    1
Student 6     1    0    0    1    1    0    1
Student 7     0    1    1    0    0    1    0
Student 8     1    1    0    0    1    1    0
Student 9     0    0    1    1    0    0    0
Student 10    1    0    0    1    1    1    1

What does it mean to model an educational phenomenon?

We intend to describe the behavior of items.

We intend to describe the behavior of students.

In particular, we intend to identify test fraud in the form of answer copying between two examinees.

We can address these questions using psychometric models.

Detecting answer copying and IRT models

IRT models: a student's answer on an item depends on both an individual characteristic (ability) and item characteristics (difficulty, discrimination, guessing).

3PL model:

\[
P(Y_{pi} = 1) = c_i + (1 - c_i)\,\frac{\exp(\alpha_i\theta_p - \beta_i)}{1 + \exp(\alpha_i\theta_p - \beta_i)}.
\]

If $\alpha_i = 1$, we obtain the 1PL-G model.
If $c_i = 0$, we obtain the 2PL model.
If $\alpha_i = 1$ and $c_i = 0$, we obtain the 1PL model.

There are other extensions, such as the 4PL model . . .
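The nesting of the 1PL, 1PL-G and 2PL models inside the 3PL model can be sketched in a few lines of Python. This is only an illustration of the formula above; the function names (`logistic`, `p_3pl`, `p_1plg`, `p_2pl`, `p_1pl`) are hypothetical and do not come from any IRT package.

```python
import math


def logistic(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def p_3pl(theta, alpha, beta, c):
    """P(Y = 1) = c + (1 - c) * logistic(alpha * theta - beta)."""
    return c + (1.0 - c) * logistic(alpha * theta - beta)


def p_1plg(theta, beta, c):
    """1PL-G: the 3PL model with alpha fixed at 1."""
    return p_3pl(theta, 1.0, beta, c)


def p_2pl(theta, alpha, beta):
    """2PL: the 3PL model with c fixed at 0."""
    return p_3pl(theta, alpha, beta, 0.0)


def p_1pl(theta, beta):
    """1PL: the 3PL model with alpha fixed at 1 and c fixed at 0."""
    return p_3pl(theta, 1.0, beta, 0.0)
```

With $c_i > 0$ the probability of a correct answer never falls below $c_i$, which is the sense in which $c_i$ is usually called a guessing parameter.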

[Figure: probability of a correct response (vertical axis, 0.0 to 1.0) over the range −4 to 4 (horizontal axis), with one curve each for beta = 0, beta = 1 and beta = −1.]

[Figure: scatter plot titled "DE=0.31; CIT=0.37". Horizontal axis: total score (Puntaje Total), 10 to 40. Vertical axis: CCEI value (Valor de CCEI), 0.0 to 1.0.]

Detecting answer copying and IRT models

Zopluoglu (2016) analyzes the statistical behavior of answer-copying indexes under different IRT models (dichotomous and polychotomous).

He remarks that copying indexes share the same rationale, but the probability that person p chooses alternative k of item j is computed using either an IRT model or the CTT framework.

The objective of his contribution is to analyze the type I error behavior and the statistical power of copying indexes.

Data: 40 mathematics items administered to 67,896 students.

The 1PL, 2PL and 3PL models are fitted using IRTPRO (which provides MML estimators for the parameters that characterize the items, although it also provides Bayesian estimates).

[Figure: scatter plot, with data from Zopluoglu (2016). Horizontal axis: 1PL difficulties, −4 to 1. Vertical axis: 2PL/3PL difficulties, −4 to 1. Legend: 1PL vs 2PL (0.81); 1PL vs 3PL (0.86).]

Detecting answer copying and IRT models

Study 1: manipulates sample size, amount of copying, type of copying, and level of item difficulty.

Item difficulties: easy and medium difficulties . . . “overall test difficulty was manipulated through the b parameters for the dichotomous IRT models” (p. 595).

Remark: Given the high correlations between the estimators of the difficulties, it is possible to keep this part of the design comparable with respect to the different IRT models . . .

Or not? Why?

“For easy test difficulty conditions, test difficulty was around .80 for typical simulated response data which was similar to the real dataset. For medium test difficulty conditions, the b parameters for dichotomous IRT models [. . . ] were manipulated such that test difficulty was around .50 for typical simulated response data. This was accomplished by adding a constant of 1.52 for the 1PL, 1.56 for the 2PL, and 1.53 for the 3PL model to the b parameters used for the easy conditions” (pp. 595-596).
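The manipulation described in the quotation can be sketched as follows. This is an illustration under explicit assumptions, not a reproduction of the study: the item response function is taken to be a 1PL of the form logistic(θ − b), abilities are taken to be standard normal, and the five b values in `easy_b` are hypothetical (they are not the values estimated in the paper). Only the constant 1.52 comes from the quotation.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def expected_proportion_correct(b_values, step=0.01, bound=6.0):
    """Average P(correct) over items and over theta ~ N(0, 1), by a Riemann sum."""
    n_steps = int(round(2 * bound / step))
    grid = [-bound + i * step for i in range(n_steps + 1)]
    dens = [math.exp(-0.5 * t * t) for t in grid]  # unnormalized N(0, 1) density
    norm = sum(dens)
    total = 0.0
    for b in b_values:
        total += sum(d * logistic(t - b) for t, d in zip(grid, dens)) / norm
    return total / len(b_values)


# Hypothetical 1PL b parameters for an "easy" test (assumed for illustration).
easy_b = [-2.2, -1.9, -1.6, -1.3, -1.0]
# Shift every b by the 1PL constant quoted above.
medium_b = [b + 1.52 for b in easy_b]

easy_difficulty = expected_proportion_correct(easy_b)
medium_difficulty = expected_proportion_correct(medium_b)
```

In this sketch "test difficulty" is the expected proportion of correct answers, so a larger b lowers it. The code only illustrates the mechanics of the shift within one model; whether adding similar constants to the b parameters of different models yields comparable conditions is exactly the question raised in the remark.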

Detecting answer copying and IRT models

Copying types:

Random copying: it is assumed that the student who copies takes responses from a source student at random, so that all items are equally likely to be copied.

Difficulty-weighted copying: the items are ordered from the easiest to the most difficult. The probability that an item is copied is proportional to its rank.

Outcome of interest: the AUC is used as a measure of classification accuracy, that is, of how well an index separates answer-copying pairs from honest pairs of students.
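The two copying mechanisms and the AUC criterion can be sketched as below. This is a minimal illustration, not the simulation code of the study. It assumes that rank 1 is the easiest item, that a fixed number of distinct items is copied, and that items are drawn one at a time without replacement; all function names are hypothetical.

```python
import random


def copy_weights(difficulties, scheme):
    """Per-item selection weights that sum to 1."""
    n = len(difficulties)
    if scheme == "random":
        return [1.0 / n] * n
    if scheme == "difficulty":
        # Rank 1 = easiest item, rank n = most difficult item (assumption).
        order = sorted(range(n), key=lambda i: difficulties[i])
        ranks = [0] * n
        for rank, item in enumerate(order, start=1):
            ranks[item] = rank
        total = sum(ranks)
        return [r / total for r in ranks]
    raise ValueError("unknown scheme: " + scheme)


def choose_copied_items(difficulties, n_copied, scheme, rng):
    """Draw n_copied distinct items, one at a time, using the scheme's weights."""
    weights = copy_weights(difficulties, scheme)
    remaining = list(range(len(difficulties)))
    chosen = []
    for _ in range(n_copied):
        w = [weights[i] for i in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen)


def apply_copying(copier, source, copied_items):
    """Return the copier's responses with the copied items taken from the source."""
    result = list(copier)
    for i in copied_items:
        result[i] = source[i]
    return result


def auc(copying_scores, honest_scores):
    """Probability that a copying pair scores higher than an honest pair (ties count 1/2)."""
    wins = 0.0
    for s in copying_scores:
        for h in honest_scores:
            if s > h:
                wins += 1.0
            elif s == h:
                wins += 0.5
    return wins / (len(copying_scores) * len(honest_scores))
```

A possible use is `choose_copied_items(difficulties, 3, "difficulty", random.Random(0))`, followed by `apply_copying` on a simulated copier and source, and then `auc` on the index values of copying and honest pairs.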

Detecting answer copying and IRT models

Results:

The amount of copying is the most important factor affecting classification performance.

The difficulty level of the items and the copying type have a negligible effect.

It was expected that the performance of the indexes would be reduced to some degree by the presence of guessing, but it was not.

Critical review?

Initial question: What does it mean to model an educational phenomenon?

It means providing structural definitions of educational concepts,

either by introducing explicit structural definitions,

or by introducing statistical models whose parameters “represent some properties of the population under analysis” (Fisher, 1922).

More questions . . .

What do we mean by the expression "difficulty of an item" in an IRT model?

What do we mean by "guessing parameter"?

How can we structurally define copying? Are we ready to believe that some “irregular” patterns can be discovered by statistical techniques? How can we find something that we did not define previously?

More fundamentally, can we measure without having a theory about what we intend to measure?

Statistical model

A statistical model fully describes the observations.

A parameter $\omega$ captures features of the observations only if it is identified, namely, if the mapping $\omega \longmapsto P_\omega$ is injective.

\[
\omega \longrightarrow P_\omega \longrightarrow \text{Observations}
\]

Therefore, only identified parameters can be statistically interpreted.

Identifiability and the Bayesian approach

Let $(Y \mid \psi, \lambda) \sim N(\psi + \lambda, \sigma_Y^2)$ be the statistical model (likelihood).

The parameters $(\psi, \lambda)$ are not identified.

Prior distribution: $\psi \sim N(\mu_\psi, \sigma_\psi^2)$, $\lambda \sim N(\mu_\lambda, \sigma_\lambda^2)$, and $\psi \perp\!\!\!\perp \lambda$.

We can compute the posterior distribution of $\psi \mid Y$, and therefore the problem seems to be solved: we learn about a non-identified parameter.

However, the posterior distribution of $\psi$ is a function of the posterior distribution of the identified parameter, because

\[
E[\psi \mid Y] = \eta_{\psi,\lambda} + \frac{\sigma_\psi^2}{\sigma_\psi^2 + \sigma_\lambda^2}\, E[\psi + \lambda \mid Y].
\]
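The relation between the two posterior means can be checked numerically with standard Gaussian conditioning. The sketch below assumes a single observation Y and the independent normal priors stated above. The constant is not spelled out on the slide; the expression used in `eta_constant` is derived here under those assumptions and should be read as such. Function and argument names are hypothetical.

```python
def posterior_means(y, mu_psi, var_psi, mu_lam, var_lam, var_y):
    """Posterior means of psi and of psi + lambda given one observation y."""
    mu_sum = mu_psi + mu_lam          # prior mean of the identified parameter
    var_sum = var_psi + var_lam       # prior variance of the identified parameter
    var_marginal = var_sum + var_y    # marginal variance of Y
    e_sum = mu_sum + var_sum / var_marginal * (y - mu_sum)
    # Cov(psi, Y) = var_psi because psi and lambda are independent a priori.
    e_psi = mu_psi + var_psi / var_marginal * (y - mu_sum)
    return e_psi, e_sum


def eta_constant(mu_psi, var_psi, mu_lam, var_lam):
    """Constant term in E[psi | Y] = eta + ratio * E[psi + lambda | Y] (derived, assumed)."""
    ratio = var_psi / (var_psi + var_lam)
    return mu_psi - ratio * (mu_psi + mu_lam)


def psi_from_identified(e_sum, mu_psi, var_psi, mu_lam, var_lam):
    """E[psi | Y] written only through the posterior mean of psi + lambda."""
    ratio = var_psi / (var_psi + var_lam)
    return eta_constant(mu_psi, var_psi, mu_lam, var_lam) + ratio * e_sum


prior = dict(mu_psi=1.0, var_psi=2.0, mu_lam=-0.5, var_lam=0.5)
e_psi, e_sum = posterior_means(3.0, var_y=1.5, **prior)
e_psi_again = psi_from_identified(e_sum, **prior)
```

Note that `psi_from_identified` never looks at the data: everything the observation contributes to $E[\psi \mid Y]$ arrives through $E[\psi + \lambda \mid Y]$, and the remaining terms depend only on the prior.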

Identifiability and the Bayesian approach

The previous analysis is an example of a general theorem.

The likelihood is exhaustively described by the identified parameter.

The learning-by-observing process is fully concentrated on the identified parameter, in the sense that, conditionally on it, there is nothing more to learn about the unidentified ones.

The identified parameter is the largest parameter that can be represented as a function of the observations.

1PL model fixed effect I

The data-generating process (DGP) corresponds to drawing, with replacement, correct and incorrect answers from an urn.

It is assumed that the $\{Y_{pi}\}$ are mutually independent and that

\[
P(Y_{pi} = 1) = \frac{\varepsilon_p/\eta_i}{1 + \varepsilon_p/\eta_i}.
\]

The parameters of interest are identified provided that $\eta_1 = 1$.
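Why a restriction such as $\eta_1 = 1$ is needed can be seen in a small sketch. It assumes the 1PL form in which the odds of a correct answer are $\varepsilon_p/\eta_i$; the numerical values and function names are hypothetical.

```python
def p_correct(eps, eta):
    """P(Y = 1) when the odds of a correct answer are eps / eta."""
    odds = eps / eta
    return odds / (1.0 + odds)


def probability_table(eps_list, eta_list):
    """Rows are persons, columns are items."""
    return [[p_correct(e, h) for h in eta_list] for e in eps_list]


def normalize(eps_list, eta_list):
    """Rescale so that the first item has eta = 1 (the identification restriction)."""
    scale = eta_list[0]
    return [e / scale for e in eps_list], [h / scale for h in eta_list]


eps = [0.5, 1.0, 2.0]
eta = [1.0, 0.8, 2.5]

# A different parameter value: everything multiplied by 3.
eps_scaled = [3.0 * e for e in eps]
eta_scaled = [3.0 * h for h in eta]

table = probability_table(eps, eta)
table_scaled = probability_table(eps_scaled, eta_scaled)
eps_back, eta_back = normalize(eps_scaled, eta_scaled)
```

The two parameter values give the same probabilities, so the mapping from parameters to distributions is not injective until the scale is fixed; after normalization both collapse to the same point.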

1PL model fixed effect II

The identification restriction $\eta_1 = 1$ allows us to define person ability relative to the standard item 1:

\[
\varepsilon_p = \frac{P(Y_{p1} = 1)}{P(Y_{p1} = 0)}.
\]

This leads to comparing the abilities of two persons with respect to the standard item 1 (Specific Objectivity):

\[
\varepsilon_p > \varepsilon_q \iff P(Y_{p1} = 1) > P(Y_{q1} = 1).
\]

1PL model fixed effect III

The identification restriction leads us to identify (and, therefore, to define) the difficulty of an item i relative to the standard item 1:

\[
\eta_i = \frac{P(Y_{p1} = 1)}{P(Y_{p1} = 0)} \cdot \frac{P(Y_{pi} = 0)}{P(Y_{pi} = 1)},
\]

\[
\eta_i > \eta_j \iff P(Y_{pj} = 1) > P(Y_{pi} = 1).
\]

Joint representation “ability-difficulty”:

\[
\varepsilon_p > \eta_i \iff P(Y_{pi} = 1) > P(Y_{pi} = 0).
\]
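These definitions can be read as recipes: given the response probabilities, abilities and difficulties are recovered from odds relative to the standard item. The sketch below assumes the 1PL form with odds $\varepsilon_p/\eta_i$ and $\eta_1 = 1$, with the standard item stored at index 0; values and names are hypothetical.

```python
def p_correct(eps, eta):
    """P(Y = 1) when the odds of a correct answer are eps / eta."""
    return eps / (eps + eta)


def odds(p):
    return p / (1.0 - p)


def recover_parameters(table):
    """Recover (eps, eta) from a persons-by-items table of P(Y = 1)."""
    # Ability: odds of a correct answer on the standard item (column 0).
    eps = [odds(row[0]) for row in table]
    # Difficulty: odds on the standard item divided by odds on item i.
    # Any person can be used; the first row is taken here.
    first = table[0]
    eta = [odds(first[0]) / odds(p) for p in first]
    return eps, eta


true_eps = [0.4, 1.3, 2.2]
true_eta = [1.0, 0.6, 1.9, 3.0]   # eta for the standard item is 1
table = [[p_correct(e, h) for h in true_eta] for e in true_eps]

eps_hat, eta_hat = recover_parameters(table)

# Joint representation: eps_p > eta_i exactly when a correct answer is more likely.
joint_ok = all(
    (e > h) == (p_correct(e, h) > 0.5)
    for e in true_eps
    for h in true_eta
)
```

In practice the probabilities are unknown and have to be estimated; the point of the sketch is only that each parameter is a known function of quantities defined by the sampling process.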

1PL model random effects I

Framework: the basic decompositions of Classical Test Theory,

\[
Y = E(Y \mid \theta) + [Y - E(Y \mid \theta)], \qquad
\mathrm{Var}(Y) = \mathrm{Var}[E(Y \mid \theta)] + E[\mathrm{Var}(Y \mid \theta)],
\]

and the Axiom of Local Independence.

The DGP consists of choosing an urn at random. Once the urn has been chosen, correct and incorrect responses are drawn with replacement.
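For a single binary response the variance decomposition can be verified exactly when there are finitely many urns. The sketch assumes a hypothetical discrete distribution over three urns, each with its own probability of a correct response; names and values are illustrative.

```python
def variance_decomposition(success_probs, urn_weights):
    """Return (total, between, within) variances for a binary response Y.

    success_probs[u] is P(Y = 1 | urn u); urn_weights[u] is P(urn u).
    """
    mean = sum(w * p for p, w in zip(success_probs, urn_weights))
    total = mean * (1.0 - mean)                        # Var(Y)
    between = sum(w * (p - mean) ** 2                  # Var[E(Y | theta)]
                  for p, w in zip(success_probs, urn_weights))
    within = sum(w * p * (1.0 - p)                     # E[Var(Y | theta)]
                 for p, w in zip(success_probs, urn_weights))
    return total, between, within


# Hypothetical urns: low, medium and high ability.
probs = [0.2, 0.5, 0.9]
weights = [0.3, 0.5, 0.2]
total, between, within = variance_decomposition(probs, weights)
```

The between-urn term is the part of the variance attributable to differences in ability; it vanishes only when all urns have the same success probability.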

1PL model random effects II

ALI implies that the responses of each person are correlated with one another (marginally, once $\theta_p$ is integrated out).

Considering a semi-parametric version of the model (that is, $\theta_p \overset{iid}{\sim} G$), an identification restriction is needed, namely $\eta_1 = 1$.

The identification restriction allows us to identify (and, therefore, to define) the difficulty of an item i:

\[
\eta_i = \frac{P(Y_{p1} = 1, Y_{pi} = 0)}{P(Y_{p1} = 0, Y_{pi} = 1)},
\]

\[
\eta_i > \eta_j \iff P(Y_{pj} = 1) > P(Y_{pi} = 1).
\]
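Both statements can be checked on a toy example: the marginal dependence between two items of the same person, and the expression of $\eta_i$ as a ratio of two joint probabilities. The sketch assumes the 1PL form with odds $\varepsilon/\eta_i$, $\eta_1 = 1$, and a hypothetical three-point distribution G for $\varepsilon = \exp(\theta)$; names and values are illustrative.

```python
def p_correct(eps, eta):
    """P(Y = 1 | eps) when the odds of a correct answer are eps / eta."""
    return eps / (eps + eta)


def joint_prob(y1, yi, eta_i, eps_values, weights):
    """Marginal P(Y_1 = y1, Y_i = yi), with item 1 the standard item (eta = 1)."""
    total = 0.0
    for eps, w in zip(eps_values, weights):
        p1 = p_correct(eps, 1.0)
        pi = p_correct(eps, eta_i)
        # Local independence: conditional on eps, the two responses are independent.
        total += w * (p1 if y1 == 1 else 1.0 - p1) * (pi if yi == 1 else 1.0 - pi)
    return total


eps_values = [0.5, 1.0, 2.0]   # hypothetical support of G
weights = [0.2, 0.5, 0.3]      # hypothetical probabilities
eta_i = 1.7

# Difficulty as a ratio of two observable joint probabilities.
ratio = joint_prob(1, 0, eta_i, eps_values, weights) / joint_prob(0, 1, eta_i, eps_values, weights)

# Marginal covariance between the two responses of the same person.
p1_marg = joint_prob(1, 1, eta_i, eps_values, weights) + joint_prob(1, 0, eta_i, eps_values, weights)
pi_marg = joint_prob(1, 1, eta_i, eps_values, weights) + joint_prob(0, 1, eta_i, eps_values, weights)
covariance = joint_prob(1, 1, eta_i, eps_values, weights) - p1_marg * pi_marg
```

The ratio equals $\eta_i$ whatever G is, because the factors that depend on $\varepsilon$ are the same in numerator and denominator; this is why the difficulty is identified without specifying G.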

1PL model random effects III

What kind of scientific statements can be made with respect to persons? To what extent do such statements relate persons and items?

In the context of CTT, scientific statements at the person level can be made by taking into account the dual decomposition

\[
\theta_p = E(\theta_p \mid Y_p) + [\theta_p - E(\theta_p \mid Y_p)].
\]

Given a response pattern, which is the most probable urn from which the responses come? The conditional distribution of $(\theta_p \mid Y_p)$ can be used to classify persons into “ability-urns”.
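The classification idea can be sketched with Bayes' rule over a finite set of urns. The sketch assumes the 1PL form with odds $\varepsilon/\eta_i$, known item difficulties, and a hypothetical discrete distribution over three ability-urns; names and values are illustrative.

```python
def p_correct(eps, eta):
    """P(Y = 1 | eps) when the odds of a correct answer are eps / eta."""
    return eps / (eps + eta)


def posterior_over_urns(pattern, etas, eps_values, weights):
    """P(urn | response pattern), using local independence within each urn."""
    joint = []
    for eps, w in zip(eps_values, weights):
        likelihood = 1.0
        for y, eta in zip(pattern, etas):
            p = p_correct(eps, eta)
            likelihood *= p if y == 1 else 1.0 - p
        joint.append(w * likelihood)
    total = sum(joint)
    return [v / total for v in joint]


def classify(pattern, etas, eps_values, weights):
    """Index of the most probable urn for this response pattern."""
    posterior = posterior_over_urns(pattern, etas, eps_values, weights)
    return max(range(len(posterior)), key=lambda u: posterior[u])


etas = [1.0, 0.6, 1.9, 3.0]        # hypothetical item difficulties, eta_1 = 1
eps_values = [0.3, 1.0, 3.0]       # hypothetical low, medium, high ability urns
weights = [1.0 / 3.0] * 3          # urns equally likely a priori

urn_all_correct = classify([1, 1, 1, 1], etas, eps_values, weights)
urn_all_wrong = classify([0, 0, 0, 0], etas, eps_values, weights)
```

The statement made about a person here is a classification given a response pattern, which is a different kind of statement from the pairwise comparison of abilities available in the fixed-effects specification.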

Summary

                      Rasch (1960)                              Lord (1952)
DGP                   Responses mutually independent            For each person, responses
                      across persons and items                  mutually dependent
Specific Objectivity  Valid                                     Invalid
Classification        Invalid                                   Valid
Difficulty            $\frac{P(Y_{p1}=1)}{P(Y_{p1}=0)} \cdot \frac{P(Y_{pi}=0)}{P(Y_{pi}=1)}$        $\frac{P(Y_{p1}=1,\,Y_{pi}=0)}{P(Y_{p1}=0,\,Y_{pi}=1)}$

1PL-G model random effects I

Axiom of Local Independence, $\theta_p \overset{iid}{\sim} G$,

\[
P(Y_{pi} = 1) = c_i + (1 - c_i)\,\frac{\exp(\theta_p - \beta_i)}{1 + \exp(\theta_p - \beta_i)}.
\]

What do the parameters $c_i$ and $\beta_i$ mean?

The item parameters are identified provided that $\beta_1 = 0$ and $c_1 = 0$.

1PL-G model random effects II

Conditional odds with respect to the DGP:

\[
O_{k|j} = \frac{P(Y_k = 1 \mid Y_j = 1,\ Y_l = 0 \text{ for all } l \neq k, j)}{P(Y_k = 0 \mid Y_j = 1,\ Y_l = 0 \text{ for all } l \neq k, j)},
\]

\[
O_{k|0} = \frac{P(Y_k = 1 \mid Y_l = 0 \text{ for all } l \neq k)}{P(Y_k = 0 \mid Y_l = 0 \text{ for all } l \neq k)}.
\]

With respect to the standard item 1, it follows that

\[
O_{k|1} \geq O_{k|j} \geq O_{k|0} \quad \text{for all } k.
\]

For the semiparametric 1PL model it follows that

\[
O_{k|1} = O_{k|j} \geq O_{k|0} \quad \text{for all } k.
\]
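The conditional odds and their ordering can be computed directly on a small example. The sketch assumes a 1PL-G model with local independence and a hypothetical three-point distribution G for $\theta$; item 1 of the slides is stored at index 0 with $\beta_1 = 0$ and $c_1 = 0$. All names and values are illustrative.

```python
import math


def p_item(theta, beta, c):
    """1PL-G: P(Y = 1 | theta) = c + (1 - c) * logistic(theta - beta)."""
    return c + (1.0 - c) / (1.0 + math.exp(-(theta - beta)))


def pattern_prob(pattern, betas, cs, thetas, weights):
    """Marginal probability of a full response pattern under local independence."""
    total = 0.0
    for theta, w in zip(thetas, weights):
        likelihood = 1.0
        for y, beta, c in zip(pattern, betas, cs):
            p = p_item(theta, beta, c)
            likelihood *= p if y == 1 else 1.0 - p
        total += w * likelihood
    return total


def conditional_odds(k, j, betas, cs, thetas, weights):
    """O_{k|j}; pass j=None for O_{k|0}. Items are indexed from 0."""
    base = [0] * len(betas)
    if j is not None:
        base[j] = 1
    numerator = list(base)
    numerator[k] = 1
    denominator = list(base)
    denominator[k] = 0
    # The conditioning event is the same in both terms, so the ratio of
    # conditional probabilities equals the ratio of joint probabilities.
    return (pattern_prob(numerator, betas, cs, thetas, weights)
            / pattern_prob(denominator, betas, cs, thetas, weights))


thetas = [-1.0, 0.0, 1.0]          # hypothetical support of G
weights = [0.3, 0.4, 0.3]
betas = [0.0, 0.5, -0.3, 1.0]      # index 0 is the standard item
cs = [0.0, 0.2, 0.1, 0.25]

k, j = 3, 1
o_k1 = conditional_odds(k, 0, betas, cs, thetas, weights)
o_kj = conditional_odds(k, j, betas, cs, thetas, weights)
o_k0 = conditional_odds(k, None, betas, cs, thetas, weights)

# Same items without guessing: the semiparametric 1PL case.
no_guessing = [0.0] * len(betas)
r_k1 = conditional_odds(k, 0, betas, no_guessing, thetas, weights)
r_kj = conditional_odds(k, j, betas, no_guessing, thetas, weights)
r_k0 = conditional_odds(k, None, betas, no_guessing, thetas, weights)
```

In the 1PL case the odds of item k do not depend on which other item was answered correctly, whereas with guessing a correct answer on item j is weaker evidence of ability than a correct answer on the standard item.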

1PL-G model random effects III

A measure of the amount of information of item j with respect to the standard item 1 is defined as

\[
IM_j = \frac{O_{k|j} - O_{k|0}}{O_{k|1} - O_{k|0}} \in [0, 1].
\]

It measures the distance between $O_{k|j}$ and $O_{k|0}$ in $(O_{k|1} - O_{k|0})$-units, so that $1 - IM_j$ is the corresponding distance between $O_{k|j}$ and $O_{k|1}$.

$IM_j = 1$ if and only if the item behaves as a 1PL item.

Therefore, $IM_j < 1$ is equivalent to an item whose behavior differs from that of a 1PL item.

1PL-G model random effects IV

If $c_j \neq 0$ then

\[
\exp(\beta_j) = \left(1 + O_{j|0}\,\frac{1 - IM_j}{IM_j}\right)\left(\frac{O_{j|0}}{O_{1|0}}\right).
\]

$\beta_j$ is decreasing with respect to $O_{j|0}$: it seems “natural”.

$\beta_j$ is decreasing with respect to $IM_j$: if the behavior of the item is “near” to that of a 1PL item, then its value decreases.

1PL-G model random effects V

For an item $k \neq 1$,

\[
c_k = \frac{O_{k|0}(1 - IM_k)}{1 + O_{k|0}(1 - IM_k)}.
\]

$c_k$ is increasing with respect to $O_{k|0}$: if, conditionally on answering all the items except the k-th incorrectly, it is more probable to answer item k correctly than incorrectly, then $c_k$ increases.

$c_k$ is decreasing with respect to $IM_k$: if the behavior of the item is “near” to that of a 1PL item, then $c_k$ decreases.
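The expression for $c_k$ can be checked numerically by computing the conditional odds and the information measure from a known model and then recovering the guessing parameter. The sketch assumes a 1PL-G model with local independence, a hypothetical three-point distribution G, and the standard item (with $\beta = 0$, $c = 0$) stored at index 0. All names and values are illustrative.

```python
import math


def p_item(theta, beta, c):
    """1PL-G: P(Y = 1 | theta) = c + (1 - c) * logistic(theta - beta)."""
    return c + (1.0 - c) / (1.0 + math.exp(-(theta - beta)))


def pattern_prob(pattern, model):
    """Marginal probability of a response pattern under local independence."""
    betas, cs, thetas, weights = model
    total = 0.0
    for theta, w in zip(thetas, weights):
        likelihood = 1.0
        for y, beta, c in zip(pattern, betas, cs):
            p = p_item(theta, beta, c)
            likelihood *= p if y == 1 else 1.0 - p
        total += w * likelihood
    return total


def conditional_odds(k, j, model):
    """O_{k|j}; pass j=None for O_{k|0}. Items are indexed from 0."""
    base = [0] * len(model[0])
    if j is not None:
        base[j] = 1
    numerator = list(base)
    numerator[k] = 1
    denominator = list(base)
    denominator[k] = 0
    return pattern_prob(numerator, model) / pattern_prob(denominator, model)


def information_measure(j, ref, model):
    """IM_j computed with a third item `ref` playing the role of k in the definition."""
    o_ref_j = conditional_odds(ref, j, model)
    o_ref_0 = conditional_odds(ref, None, model)
    o_ref_1 = conditional_odds(ref, 0, model)
    return (o_ref_j - o_ref_0) / (o_ref_1 - o_ref_0)


def recover_guessing(j, ref, model):
    """c_j = O_{j|0}(1 - IM_j) / (1 + O_{j|0}(1 - IM_j))."""
    x = conditional_odds(j, None, model) * (1.0 - information_measure(j, ref, model))
    return x / (1.0 + x)


model = (
    [0.0, 0.5, -0.3, 1.0],     # betas, index 0 is the standard item
    [0.0, 0.2, 0.1, 0.25],     # guessing parameters
    [-1.0, 0.0, 1.0],          # hypothetical support of G
    [0.3, 0.4, 0.3],           # hypothetical probabilities
)

c_item2 = recover_guessing(1, 3, model)
c_item3 = recover_guessing(2, 3, model)
im_item2 = information_measure(1, 3, model)
```

In this example the recovered values agree with the guessing parameters used to build the model, and $IM_j$ does not depend on which third item is used as reference, which is consistent with reading $c_k$ as a function of quantities defined by the data-generating process.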

Summary

In a 1PL-G model, the parameters $\beta_j$ and $c_j$ represent how far the behavior of the item is from that of a 1PL item.

This comparison can be made because the identification restriction ensures that the model contains a 1PL-type item.

Discussion/Questions

The item parameters $\beta_j$ and $c_j$ can be written in terms of $O_{j|k}$ and $O_{j|0}$.

These terms are defined conditionally on answering all the items but one incorrectly, or on answering all the items incorrectly.

Is there some conflict with the “definitions” of copying? The question is meaningful because two examinees could copy incorrect alternatives, so that all their responses are incorrect. In such a case, the parameters of the 1PL-G model seem to include copying behavior: how is it possible to analyze copying behavior using IRT models?

Discussion/Questions

Difficulty and guessing are not univocal designations: it is not sound to compare item parameters obtained by fitting different IRT models.

There is still room to improve our understanding of what these models allow us to analyze.

Acknowledgement

Research supported by the FONDECYT Project Number 1181261 of the Chilean government.

References

E. San Martín, J. González and F. Tuerlinckx (2009). Identified Parameters, Parameters of Interest and Their Relationships. Measurement: Interdisciplinary Research and Perspective 7, 95-103.

E. San Martín, A. Jara, J.-M. Rolin and M. Mouchart (2011). On the Bayesian nonparametric generalization of Rasch-type models. Psychometrika 76, 385-409.

E. San Martín and J.-M. Rolin (2013). Identification of Parametric Rasch-type Models. Journal of Statistical Planning and Inference 143, 116-130.

E. San Martín, J.-M. Rolin and L. M. Castro (2013). Identification of the 1PL Model with guessing parameter: Parametric and Semi-parametric results. Psychometrika 78, 341-379.

E. San Martín (2015). Identification of Item Response Theory Models. In: W. van der Linden (Ed.), Handbook of Item Response Theory, Volume 2, Chapter 8.

E. San Martín and P. De Boeck (2015). What do you mean by a difficult item? On the interpretation of the difficulty parameter in Rasch models. In: Roger E. Millsap, Daniel M. Bolt, L. Andries van der Ark, Wen-Chung Wang (Eds.), Quantitative Psychology Research, Springer Proceedings in Mathematics & Statistics 89, Chapter 1.

E. San Martín, J. González and F. Tuerlinckx (2015). On the Unidentifiability of the Fixed-Effects 3PL Model. Psychometrika 80, 450-467.

E. San Martín (2015). Modelos Rasch: ¿cuán (in-)coherentemente son presentados y utilizados? Actualidades en Psicología 29, 91-102.

P. Fariña, J. González and E. San Martín (2018). An identifiability-based strategy for the interpretation of parameters in IRT models; under revision for Psychometrika.

E. San Martín (2018). Identifiability of Structural Characteristics: How relevant is in the Bayesian Approach? Brazilian Journal of Probability and Statistics 32, 346-373.
