Bayesian Methods for Testing Axioms of Measurement (Apr 03, 2015)
Bayesian Methods for Testing Axioms of Measurement George Karabatsos University of Illinois-Chicago University of Minnesota Quantitative/Psychometric Methods Area Department of Psychology April 3, 2015, Friday. Supported by NSF-MMS Research Grants SES-0242030 and SES-1156372


Bayesian Methods for Testing

Axioms of Measurement

George Karabatsos

University of Illinois-Chicago

University of Minnesota

Quantitative/Psychometric Methods Area

Department of Psychology

April 3, 2015, Friday.

Supported by NSF-MMS Research Grants SES-0242030 and SES-1156372

Outline

I. Introduction: Axioms of Measurement.

A. Relations of Axioms to IRT models.

B. Rasch, 2PL, Monotone Homogeneity and Double-Monotone IRT models.

II. General Bayesian Model for Axiom Testing

A. Model Estimation (MCMC).

B. Axiom Testing Procedures

III. Empirical Illustrations of Bayesian Axiom Testing.

a) Convict data (orig. analyzed by Perline, Wright & Wainer, 1979, APM).

b) NAEP reading test data

IV. Dealing with axiom violations: A Bayesian Nonparametric outlier-robust IRT model with application to teacher preparation survey from PIRLS.

V. Extensions of the Bayesian axiom testing model.

VI. Conclusions

George Karabatsos, 4/3/2015

I. Introduction

• IRT models aim to represent, via model parameters, persons (examinees) and items on ordinal or interval scales of measurement.

• In IRT practice, such measurement scales are assumed for the parameters.

• The ability to represent persons and items on ordinal or interval scales depends on the data satisfying a set of key cancellation axioms (Luce & Tukey, 1964, JMP).

• These axioms are deterministic, but we can state these axioms in more probabilistic terms, as follows.

• We first briefly consider the deterministic case, to motivate the probabilistic approach.

I. (Deterministic) Axioms of Measurement

Rows: levels of the row variable (i = 0, ..., 6). Columns: levels of the column variable (j = 1, ..., 6).

        j = 1    j = 2    j = 3    j = 4    j = 5    j = 6
i = 0   Y(0,1)   Y(0,2)   Y(0,3)   Y(0,4)   Y(0,5)   Y(0,6)
i = 1   Y(1,1)   Y(1,2)   Y(1,3)   Y(1,4)   Y(1,5)   Y(1,6)
i = 2   Y(2,1)   Y(2,2)   Y(2,3)   Y(2,4)   Y(2,5)   Y(2,6)
i = 3   Y(3,1)   Y(3,2)   Y(3,3)   Y(3,4)   Y(3,5)   Y(3,6)
i = 4   Y(4,1)   Y(4,2)   Y(4,3)   Y(4,4)   Y(4,5)   Y(4,6)
i = 5   Y(5,1)   Y(5,2)   Y(5,3)   Y(5,4)   Y(5,5)   Y(5,6)
i = 6   Y(6,1)   Y(6,2)   Y(6,3)   Y(6,4)   Y(6,5)   Y(6,6)

I. Deterministic Single Cancellation Axiom

Rows: levels of the row variable (i = 0, ..., 6). Columns: levels of the column variable (j = 1, ..., 6).

        j = 1    j = 2    j = 3    j = 4    j = 5    j = 6
i = 0   Y(0,1)   Y(0,2)   Y(0,3)   Y(0,4)   Y(0,5)   Y(0,6)
i = 1   Y(1,1)   Y(1,2)   Y(1,3)   Y(1,4)   Y(1,5)   Y(1,6)
i = 2   Y(2,1)   Y(2,2)   Y(2,3)   Y(2,4)   Y(2,5)   Y(2,6)
i = 3   Y(3,1)   Y(3,2)   Y(3,3)   Y(3,4)   Y(3,5)   Y(3,6)
i = 4   Y(4,1)   Y(4,2)   Y(4,3)   Y(4,4)   Y(4,5)   Y(4,6)
i = 5   Y(5,1)   Y(5,2)   Y(5,3)   Y(5,4)   Y(5,5)   Y(5,6)
i = 6   Y(6,1)   Y(6,2)   Y(6,3)   Y(6,4)   Y(6,5)   Y(6,6)

Annotations on the table: each column (premise, implication); each row (premise, implication). Like a “Guttman scale” (1950).

I. Probabilistic Measurement Theory

Define: θij = probability that a person with score level i answers item j correctly.

Rows: ability level (test score), i = 0, ..., 6. Columns: test items in easiness order, j = 1, ..., 6.

        j = 1   j = 2   j = 3   j = 4   j = 5   j = 6
i = 0   θ01     θ02     θ03     θ04     θ05     θ06
i = 1   θ11     θ12     θ13     θ14     θ15     θ16
i = 2   θ21     θ22     θ23     θ24     θ25     θ26
i = 3   θ31     θ32     θ33     θ34     θ35     θ36
i = 4   θ41     θ42     θ43     θ44     θ45     θ46
i = 5   θ51     θ52     θ53     θ54     θ55     θ56
i = 6   θ61     θ62     θ63     θ64     θ65     θ66

I. Single Cancellation Axiom (rows)

Rows: ability level (test score), i = 0, ..., 6. Columns: test items in easiness order, j = 1, ..., 6.

        j = 1   j = 2   j = 3   j = 4   j = 5   j = 6
i = 0   θ01     θ02     θ03     θ04     θ05     θ06
i = 1   θ11     θ12     θ13     θ14     θ15     θ16
i = 2   θ21     θ22     θ23     θ24     θ25     θ26
i = 3   θ31     θ32     θ33     θ34     θ35     θ36
i = 4   θ41     θ42     θ43     θ44     θ45     θ46
i = 5   θ51     θ52     θ53     θ54     θ55     θ56
i = 6   θ61     θ62     θ63     θ64     θ65     θ66

Annotation on the table: each row (premise, implication).

I. Single Cancellation Axiom (rows)

• Key axiom for representing person ability (test score) on an ordinal scale.

• All Item Response Theory models of the form

Pr(Yj = 1 | θ) = Gj(θ)

for non-decreasing Gj: ℝ → [0,1], assume this axiom.

Examples of such IRT models:

1PL Rasch model: Pr(Yj = 1 | θ) = exp(θ − βj) / [1 + exp(θ − βj)]

2PL: Pr(Yj = 1 | θ) = exp(aj{θ − βj}) / [1 + exp(aj{θ − βj})]

3PL: Pr(Yj = 1 | θ) = cj + (1 − cj) / [1 + exp(−aj{θ − βj})]

MH Model: Pr(Yj = 1 | θ) is non-decreasing in θ.

DM Model: Pr(Yj = 1 | θ) is non-decreasing in θ, AND

IIO: Pr(Y1 = 1 | θ) < Pr(Y2 = 1 | θ) < ⋯ < Pr(YJ = 1 | θ) for all θ.
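The parametric models listed on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names, the item-parameter name `beta_j`, and all numeric values are assumptions made for the example. It evaluates each item response function on a grid of abilities and checks that it is non-decreasing, which is the property the single cancellation axiom over rows requires.

```python
import math

def rasch(theta, beta_j):
    """1PL Rasch: exp(theta - beta_j) / [1 + exp(theta - beta_j)]."""
    return 1.0 / (1.0 + math.exp(-(theta - beta_j)))

def two_pl(theta, a_j, beta_j):
    """2PL: adds an item discrimination a_j > 0."""
    return 1.0 / (1.0 + math.exp(-a_j * (theta - beta_j)))

def three_pl(theta, a_j, beta_j, c_j):
    """3PL: adds a lower asymptote c_j to the 2PL curve."""
    return c_j + (1.0 - c_j) * two_pl(theta, a_j, beta_j)

def is_non_decreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(x <= y for x, y in zip(values, values[1:]))

# Ability grid from -4 to 4 in steps of 0.5 (illustrative values only).
grid = [-4.0 + 0.5 * k for k in range(17)]
curves = {
    "1PL": [rasch(t, 0.3) for t in grid],
    "2PL": [two_pl(t, 1.7, 0.3) for t in grid],
    "3PL": [three_pl(t, 1.7, 0.3, 0.2) for t in grid],
}
monotone = {name: is_non_decreasing(vals) for name, vals in curves.items()}
print(monotone)
```

Each curve is a non-decreasing function of ability, so all three models are consistent with the ordinal representation of persons described above.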

I. Single Cancellation Axiom

Rows: ability level (test score), i = 0, ..., 6. Columns: test items in easiness order, j = 1, ..., 6.

        j = 1   j = 2   j = 3   j = 4   j = 5   j = 6
i = 0   θ01     θ02     θ03     θ04     θ05     θ06
i = 1   θ11     θ12     θ13     θ14     θ15     θ16
i = 2   θ21     θ22     θ23     θ24     θ25     θ26
i = 3   θ31     θ32     θ33     θ34     θ35     θ36
i = 4   θ41     θ42     θ43     θ44     θ45     θ46
i = 5   θ51     θ52     θ53     θ54     θ55     θ56
i = 6   θ61     θ62     θ63     θ64     θ65     θ66

Annotations on the table: each column (premise, implication); each row (premise, implication).

I. Single Cancellation Axiom

• Key axiom for representing person ability (test score) and item easiness (difficulty) on a common ordinal scale.

• Examples of IRT models that (fully) assume single cancellation:

1PL Rasch model: Pr(Yj = 1 | θ) = exp(θ − βj) / [1 + exp(θ − βj)]

OPLM model: Pr(Yj = 1 | θ) = exp(α{θ − βj}) / [1 + exp(α{θ − βj})]

DM Model: Pr(Yj = 1 | θ) is non-decreasing in θ, and

IIO: Pr(Y1 = 1 | θ) < Pr(Y2 = 1 | θ) < ⋯ < Pr(YJ = 1 | θ) for all θ.
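As a small sketch of what "fully assume single cancellation" means for a matrix of response probabilities, the check below tests that probabilities increase with score level within every item and increase with item easiness within every score level. The function name and the example matrices are illustrative assumptions, not material from the talk. A Rasch-type matrix passes, while a matrix built from two logistic curves with different slopes (which cross) fails the item-ordering part.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def satisfies_single_cancellation(theta):
    """theta[i][j]: rows are score levels (low to high),
    columns are items in easiness order (hard to easy)."""
    n_rows, n_cols = len(theta), len(theta[0])
    rows_ok = all(theta[i][j] < theta[i + 1][j]
                  for i in range(n_rows - 1) for j in range(n_cols))
    cols_ok = all(theta[i][j] < theta[i][j + 1]
                  for i in range(n_rows) for j in range(n_cols - 1))
    return rows_ok and cols_ok

abilities = [-1.0, 0.0, 1.0]

# Rasch-type matrix: item locations 1.0, 0.0, -1.0 put the easiest item last.
rasch_matrix = [[logistic(t - b) for b in (1.0, 0.0, -1.0)] for t in abilities]

# Two items with slopes 0.5 and 2.0 and a common location: the curves cross at 0.
crossing_matrix = [[logistic(a * t) for a in (0.5, 2.0)] for t in abilities]

print(satisfies_single_cancellation(rasch_matrix))
print(satisfies_single_cancellation(crossing_matrix))
```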

I. Double Cancellation Axiom

Rows: ability level (test score), i = 0, ..., 6. Columns: test items in easiness order, j = 1, ..., 6.

        j = 1   j = 2   j = 3   j = 4   j = 5   j = 6
i = 0   θ01     θ02     θ03     θ04     θ05     θ06
i = 1   θ11     θ12     θ13     θ14     θ15     θ16
i = 2   θ21     θ22     θ23     θ24     θ25     θ26
i = 3   θ31     θ32     θ33     θ34     θ35     θ36
i = 4   θ41     θ42     θ43     θ44     θ45     θ46
i = 5   θ51     θ52     θ53     θ54     θ55     θ56
i = 6   θ61     θ62     θ63     θ64     θ65     θ66

Annotations on the table: premise, implication. The axiom must hold for all 3 × 3 submatrices.
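The slide's diagram of the premise and implication did not survive extraction, so the sketch below codes one common textbook statement of double cancellation as an assumption: for rows i1 < i2 < i3 and columns j1 < j2 < j3, if θ(i1,j2) ≥ θ(i2,j1) and θ(i2,j3) ≥ θ(i3,j2), then θ(i1,j3) ≥ θ(i3,j1). Only this one orientation is checked, and the function names and example matrices are hypothetical.

```python
import math
from itertools import combinations

def double_cancellation_holds(theta, rows, cols):
    """Check one 3 x 3 submatrix, given three row and three column indices."""
    i1, i2, i3 = rows
    j1, j2, j3 = cols
    premise = (theta[i1][j2] >= theta[i2][j1]
               and theta[i2][j3] >= theta[i3][j2])
    return (not premise) or theta[i1][j3] >= theta[i3][j1]

def count_double_cancellation_violations(theta):
    """Count the 3 x 3 submatrices that violate the condition."""
    n_rows, n_cols = len(theta), len(theta[0])
    return sum(
        not double_cancellation_holds(theta, rows, cols)
        for rows in combinations(range(n_rows), 3)
        for cols in combinations(range(n_cols), 3)
    )

# Additive example: theta[i][j] = logistic(a[i] + b[j]) (illustrative values).
a = [-1.0, 0.0, 1.3]
b = [-2.0, 0.0, 2.0]
additive = [[1.0 / (1.0 + math.exp(-(ai + bj))) for bj in b] for ai in a]

# Rows and columns are increasing here, yet the 3 x 3 condition fails.
non_additive = [[0.10, 0.50, 0.60],
                [0.40, 0.60, 0.80],
                [0.70, 0.75, 0.90]]

print(count_double_cancellation_violations(additive))
print(count_double_cancellation_violations(non_additive))
```

The second matrix satisfies the row and column orderings of single cancellation but still violates the 3 × 3 condition, which is why double cancellation has to be tested separately.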

I. Triple Cancellation Axiom

Rows: ability level (test score), i = 0, ..., 6. Columns: test items in easiness order, j = 1, ..., 6.

        j = 1   j = 2   j = 3   j = 4   j = 5   j = 6
i = 0   θ01     θ02     θ03     θ04     θ05     θ06
i = 1   θ11     θ12     θ13     θ14     θ15     θ16
i = 2   θ21     θ22     θ23     θ24     θ25     θ26
i = 3   θ31     θ32     θ33     θ34     θ35     θ36
i = 4   θ41     θ42     θ43     θ44     θ45     θ46
i = 5   θ51     θ52     θ53     θ54     θ55     θ56
i = 6   θ61     θ62     θ63     θ64     θ65     θ66

Annotations on the table: premise, implication. The axiom must hold for all 4 × 4 submatrices.

I. Single, Double, Triple, and all higher order cancellation axioms

• Key axioms for representing person ability (test score) and item easiness (difficulty) on a common interval scale.

• All these axioms, together, are axioms for additive conjoint measurement.

• Examples of IRT models that (fully) assume single cancellation:

1PL Rasch model (logistic): Pr(Yj = 1 | θ) = exp(θ − βj) / [1 + exp(θ − βj)]

Any 1PL model of the form: Pr(Yj = 1 | θ) = G(θ − βj), for non-decreasing G: ℝ → [0,1] common to all test items.

• All previous discussions about measurement axioms and IRT also apply to polytomous IRT models.

How to Test Measurement Axioms?

• Even the probabilistic measurement axioms are deterministic. They assert deterministic order relations among probabilities.

• Perline, Wright & Wainer (PWW; 1979, APM), to test the Rasch model, analyzed data from a 10-item dichotomous-scored test administered to 2500 released convicts (from Hoffman & Beck, 1974). The test inquires about the subject’s criminal history.

• PWW tested the conjoint measurement axioms on real data by counting the number of axiom violations: for example, the number of rows violating single cancellation, and the number of 3 × 3 submatrices violating double cancellation.

• This axiom testing approach does not distinguish between small and large axiom violations. We illustrate this issue now.
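The limitation of counting can be seen in a toy sketch. The two matrices of observed proportions below are made up for illustration (they are not the convict data), and the function name is an assumption. Each matrix has exactly one place where the proportion correct fails to increase with score level, so a violation count treats them as equally bad, even though one drop is 0.01 and the other is 0.40.

```python
def row_order_violations(p):
    """Sizes of the drops p[i][j] - p[i+1][j] wherever the proportion correct
    fails to increase from score level i to level i + 1 on item j."""
    return [p[i][j] - p[i + 1][j]
            for j in range(len(p[0]))
            for i in range(len(p) - 1)
            if p[i][j] >= p[i + 1][j]]

# Hypothetical observed proportions: rows are score levels, columns are items.
small = [[0.20, 0.40],
         [0.50, 0.60],
         [0.49, 0.80]]
large = [[0.20, 0.40],
         [0.50, 0.60],
         [0.10, 0.80]]

small_drops = row_order_violations(small)
large_drops = row_order_violations(large)

# Same count of violations, very different magnitudes.
print(len(small_drops), len(large_drops))
print(max(small_drops), max(large_drops))
```

With small group sizes, a drop of 0.01 is easily produced by sampling error alone, which is the motivation for a model-based test.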

[Figure: True or random violation of the single cancellation axiom?]

[Figure: True or random violation of the single and double cancellation axioms? Apparent single cancellation axiom violations in red; apparent double cancellation axiom violations in purple.]

How to Test Measurement Axioms?

• The number of axiom violations, as a statistic, has an intractable sampling distribution, for the purposes of hypothesis testing.

• The false discovery rate approach to multiple testing (Benjamini & Hochberg, 1995, JRSSB) is not easily applicable because the different axioms, such as single cancellation and double cancellation, are dependent on one another.

II. Bayesian Model for Axiom Testing

• Data likelihood:

\[
L(n \mid N, \theta) = \prod_{i=0}^{I} \prod_{j=1}^{J} \binom{N_{ij}}{n_{ij}} \theta_{ij}^{n_{ij}} (1 - \theta_{ij})^{N_{ij} - n_{ij}}
\]

The Data:

n = (nij)(I+1)×J , nij : # correct in test score group i for item j

N = (Nij)(I+1)×J , Nij : # in test score group i who completed item j

MLE: p = (pij)(I+1)×J = (nij / Nij)(I+1)×J.

• Prior density, i.e., set of axioms:

\[
\pi(\theta) = \frac{\prod_{i=0}^{I} \prod_{j=1}^{J} be(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A)}{\int \prod_{i=0}^{I} \prod_{j=1}^{J} be(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A) \, d\theta}
\]

• Example: single cancellation axiom (rows & columns),

A = {θ : θij < θi+1,j for i = 0,1,…, I − 1 & θij < θi,j+1 for j = 1,…, J − 1}

(i: test score level; j indexes item in item easiness order)

Notation: be(· | a,b): beta p.d.f. Be(· | a,b): beta c.d.f. Be⁻¹(u | a,b): quantile. 1(θ ∈ A) = 1 if θ ∈ A; 1(θ ∈ A) = 0 if θ ∉ A.

Often in practice, a = b = 1 (truncated uniform prior) or a = b = ½ (truncated reference prior).
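The pieces of this model can be written down directly. The sketch below is an illustration under stated assumptions, not the author's code: the function names and the small data set are hypothetical, the constraint set is the single cancellation example (rows and columns), and the prior is returned only up to its normalizing constant, since that constant is the integral over A in the denominator.

```python
import math

def log_likelihood(n, N, theta):
    """Binomial log-likelihood summed over score groups i and items j."""
    total = 0.0
    for i in range(len(n)):
        for j in range(len(n[0])):
            total += (math.log(math.comb(N[i][j], n[i][j]))
                      + n[i][j] * math.log(theta[i][j])
                      + (N[i][j] - n[i][j]) * math.log(1.0 - theta[i][j]))
    return total

def in_A(theta):
    """Indicator 1(theta in A) for single cancellation over rows and columns."""
    R, C = len(theta), len(theta[0])
    rows_ok = all(theta[i][j] < theta[i + 1][j]
                  for i in range(R - 1) for j in range(C))
    cols_ok = all(theta[i][j] < theta[i][j + 1]
                  for i in range(R) for j in range(C - 1))
    return rows_ok and cols_ok

def log_prior_kernel(theta, a=1.0, b=1.0):
    """Log of prod be(theta_ij | a, b) * 1(theta in A), unnormalized over A."""
    if not in_A(theta):
        return float("-inf")
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return sum(log_norm + (a - 1.0) * math.log(t) + (b - 1.0) * math.log(1.0 - t)
               for row in theta for t in row)

# Hypothetical data: 3 score groups, 2 items, 20 respondents per cell.
n = [[2, 5], [6, 9], [11, 14]]
N = [[20, 20], [20, 20], [20, 20]]
p = [[n[i][j] / N[i][j] for j in range(2)] for i in range(3)]  # MLE

print(in_A(p), log_likelihood(n, N, p), log_prior_kernel(p))
```

With a = b = 1 the kernel is constant on A, which is the truncated uniform prior mentioned on the slide.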

II. Bayesian Model for Axiom Testing

• Posterior Density (Distribution):

\[
\pi(\theta \mid N, n, A)
= \frac{L(n \mid N, \theta)\, \pi(\theta)}{\int L(n \mid N, \theta)\, \pi(\theta)\, d\theta}
= \frac{\prod_{i=0}^{I} \prod_{j=1}^{J} \binom{N_{ij}}{n_{ij}} \theta_{ij}^{n_{ij}} (1 - \theta_{ij})^{N_{ij} - n_{ij}} \, be(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A)}{\int \prod_{i=0}^{I} \prod_{j=1}^{J} \binom{N_{ij}}{n_{ij}} \theta_{ij}^{n_{ij}} (1 - \theta_{ij})^{N_{ij} - n_{ij}} \, be(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A) \, d\theta}
\]

\[
\propto \prod_{i=0}^{I} \prod_{j=1}^{J} be(\theta_{ij} \mid a_{ij} + n_{ij},\; b_{ij} + N_{ij} - n_{ij}) \, 1(\theta \in A)
\]

II. Bayesian Model for Axiom Testing

• Posterior Density (Distribution): (c.d.f. Π(θ | N, n, A))

\[
\pi(\theta \mid N, n, A) = \frac{\prod_{i=0}^{I} \prod_{j=1}^{J} \binom{N_{ij}}{n_{ij}} \theta_{ij}^{n_{ij}} (1 - \theta_{ij})^{N_{ij} - n_{ij}} \, be(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A)}{\int \prod_{i=0}^{I} \prod_{j=1}^{J} \binom{N_{ij}}{n_{ij}} \theta_{ij}^{n_{ij}} (1 - \theta_{ij})^{N_{ij} - n_{ij}} \, be(\theta_{ij} \mid a_{ij}, b_{ij}) \, 1(\theta \in A) \, d\theta}
\]

• Posterior cannot be numerically evaluated.

• MCMC full conditional posterior p.d.f.s (f.c.p.s):

π(θij | N, n, θ\ij) ∝ be(θij | aij + nij, bij + Nij − nij) 1(θ ∈ A), ∀ i, j

• Each MCMC sampling iteration: For every pair i, j in turn, update/sample θij by drawing uij ~ U(0,1), and then taking:

\[
\theta_{ij} = Be^{-1}\Big( Be(\theta_{ij}^{\min} \mid a_{ij} + n_{ij}, b_{ij} + N_{ij} - n_{ij}) + u_{ij} \big[ Be(\theta_{ij}^{\max} \mid a_{ij} + n_{ij}, b_{ij} + N_{ij} - n_{ij}) - Be(\theta_{ij}^{\min} \mid a_{ij} + n_{ij}, b_{ij} + N_{ij} - n_{ij}) \big] \;\Big|\; a_{ij} + n_{ij}, b_{ij} + N_{ij} - n_{ij} \Big)
\]

where θij^min and θij^max are the lower and upper bounds on θij implied by A, given the current values of the other parameters θ\ij (inverse c.d.f. sampling method; Devroye, 1986).

• As the # of MCMC iterations S gets larger, the MCMC chain {θ(s)}s=1,..,S converges to samples from the posterior distribution Π(θ | N, n, A).
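The sampling scheme on this slide can be sketched with the Python standard library alone. This is a rough illustration under several assumptions, not the author's implementation: the beta c.d.f. is computed with a standard continued-fraction routine, the quantile is found by bisection inside the truncation interval, the constraint set is single cancellation over rows and columns (so each bound comes from the neighbouring cells), and the data, function names, and run length are hypothetical.

```python
import math
import random

def _beta_cf(a, b, x, max_iter=500, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def beta_cdf(x, a, b):
    """Be(x | a, b): regularized incomplete beta function."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def truncated_beta_draw(u, a, b, lo, hi, steps=50):
    """Inverse-c.d.f. draw from Beta(a, b) truncated to (lo, hi)."""
    f_lo, f_hi = beta_cdf(lo, a, b), beta_cdf(hi, a, b)
    target = f_lo + u * (f_hi - f_lo)
    left, right = lo, hi
    for _ in range(steps):  # bisection stands in for the quantile function
        mid = 0.5 * (left + right)
        if beta_cdf(mid, a, b) < target:
            left = mid
        else:
            right = mid
    return 0.5 * (left + right)

def gibbs_single_cancellation(n, N, a=1.0, b=1.0, iters=100, seed=1):
    """Gibbs sampler for theta under the row-and-column ordering A."""
    rng = random.Random(seed)
    R, C = len(n), len(n[0])
    # Start inside A: values increase along both rows and columns.
    theta = [[(i + j + 1) / (R + C) for j in range(C)] for i in range(R)]
    draws = []
    for _ in range(iters):
        for i in range(R):
            for j in range(C):
                lo = max(theta[i - 1][j] if i > 0 else 0.0,
                         theta[i][j - 1] if j > 0 else 0.0)
                hi = min(theta[i + 1][j] if i < R - 1 else 1.0,
                         theta[i][j + 1] if j < C - 1 else 1.0)
                theta[i][j] = truncated_beta_draw(
                    rng.random(), a + n[i][j], b + N[i][j] - n[i][j], lo, hi)
        draws.append([row[:] for row in theta])
    return draws

# Hypothetical data: 3 score groups, 2 items, 20 respondents per cell.
n = [[2, 5], [6, 9], [11, 14]]
N = [[20, 20], [20, 20], [20, 20]]
draws = gibbs_single_cancellation(n, N)
post_mean = [[sum(d[i][j] for d in draws) / len(draws) for j in range(2)]
             for i in range(3)]
print(post_mean)
```

Because every update is drawn inside the interval allowed by the neighbouring cells, each sampled matrix respects the ordering, with no proposals rejected. A real analysis would use a much longer run and discard a burn-in period.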

II. Bayesian Model for Axiom Testing

• Possible ways to test axioms from model:

1. Check if pij = nij / Nij is within the 95% posterior interval of the marginal posterior distribution Π(θij | N, n, A). Decide violation of axiom(s) if pij is located outside the 95% posterior interval.

2. Compute the posterior predictive p-value (Karabatsos & Sheu, 2004, APM):

\[
pvalue_{ij} = \iint 1\big\{ \chi^2(p_{ij}^{rep}; \theta_{ij}) \ge \chi^2(p_{ij}; \theta_{ij}) \big\} \, p(p_{ij}^{rep} \mid \theta_{ij}) \, \pi(\theta \mid N, n, A) \, dp_{ij}^{rep} \, d\theta
\]

with:

\[
\chi^2(p_{ij}; \theta_{ij}) = \frac{(N_{ij} p_{ij} - N_{ij} \theta_{ij})^2}{N_{ij} \theta_{ij}}, \qquad
n_{ij}^{rep} \mid N_{ij}, \theta_{ij} \sim bin(n_{ij}^{rep} \mid N_{ij}, \theta_{ij}), \quad \text{with } p_{ij}^{rep} = n_{ij}^{rep} / N_{ij}
\]

Decide violations of axioms if pvalueij < .05 (or smaller).
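Given posterior draws of θij, the posterior predictive p-value can be approximated by simulation. The sketch below is an assumption-laden illustration: the function names are hypothetical, the discrepancy follows the chi-square form on this slide, and the draws in the example come from an unconstrained beta posterior purely as a stand-in for the order-constrained MCMC output.

```python
import random

def chi_square(p, theta, N):
    """Discrepancy (N p - N theta)^2 / (N theta) for one cell."""
    return (N * p - N * theta) ** 2 / (N * theta)

def posterior_predictive_pvalue(n_ij, N_ij, theta_draws, rng):
    """Share of draws whose replicated discrepancy is at least the observed one."""
    p_obs = n_ij / N_ij
    exceed = 0
    for theta in theta_draws:
        # Replicate n_rep ~ bin(N_ij, theta) as a sum of Bernoulli trials.
        n_rep = sum(rng.random() < theta for _ in range(N_ij))
        p_rep = n_rep / N_ij
        if chi_square(p_rep, theta, N_ij) >= chi_square(p_obs, theta, N_ij):
            exceed += 1
    return exceed / len(theta_draws)

rng = random.Random(0)
n_ij, N_ij = 6, 20  # hypothetical cell: 6 correct out of 20

# Stand-in posterior draws (unconstrained beta posterior with a = b = 1).
stand_in_draws = [rng.betavariate(1 + n_ij, 1 + N_ij - n_ij) for _ in range(500)]
pvalue = posterior_predictive_pvalue(n_ij, N_ij, stand_in_draws, rng)
print(pvalue)
```

When the constrained posterior pulls θij far from the observed proportion, the observed discrepancy becomes large relative to the replicated ones and the p-value falls toward zero, which is the signal of a violation.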

II. Bayesian Model for Axiom Testing

• Possible ways to test axioms from model (continued):

3. Consider the Deviance Information Criterion (DIC):

\[
DIC = D(\bar{\theta}) + 2\,[\bar{D} - D(\bar{\theta})]
\]

Deviance:

\[
D(\theta) = -2 \sum_{i=0}^{I} \sum_{j=1}^{J} \left[ n_{ij} \log \theta_{ij} + (N_{ij} - n_{ij}) \log(1 - \theta_{ij}) + \log \binom{N_{ij}}{n_{ij}} \right]
\]

Deviance at posterior mean: \( D(\bar{\theta}) = D(E[\theta \mid N, n, A]) \)

Posterior mean of deviance: \( \bar{D} = \int D(\theta) \, d\Pi(\theta \mid N, n, A) \)

\( D(\bar{\theta}) \) is the goodness (badness) of fit term.

\( 2\,[\bar{D} - D(\bar{\theta})] \) is the model flexibility penalty, given by 2 times the effective number of model parameters.

Consider DIC(A) of the model under axiom (order) constraints, and DIC(U) for the unconstrained model (no order constraints).

Decide violations of axiom(s) if DIC(A) > DIC(U).
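The DIC can be computed from any set of posterior draws by evaluating the deviance at the posterior mean and averaging it over the draws. The sketch below uses hypothetical data and function names, and the draws are generated from independent beta posteriors as a stand-in for MCMC output, so it illustrates the arithmetic rather than a full comparison of DIC(A) with DIC(U).

```python
import math
import random

def deviance(n, N, theta):
    """D(theta) = -2 * binomial log-likelihood summed over all cells."""
    total = 0.0
    for i in range(len(n)):
        for j in range(len(n[0])):
            total += (n[i][j] * math.log(theta[i][j])
                      + (N[i][j] - n[i][j]) * math.log(1.0 - theta[i][j])
                      + math.log(math.comb(N[i][j], n[i][j])))
    return -2.0 * total

def dic(n, N, draws):
    """Return (DIC, deviance at posterior mean, flexibility penalty)."""
    S, R, C = len(draws), len(n), len(n[0])
    theta_bar = [[sum(d[i][j] for d in draws) / S for j in range(C)]
                 for i in range(R)]
    d_at_mean = deviance(n, N, theta_bar)
    mean_dev = sum(deviance(n, N, d) for d in draws) / S
    penalty = 2.0 * (mean_dev - d_at_mean)
    return d_at_mean + penalty, d_at_mean, penalty

# Hypothetical data: 3 score groups, 2 items, 20 respondents per cell.
n = [[2, 5], [6, 9], [11, 14]]
N = [[20, 20], [20, 20], [20, 20]]

# Stand-in draws from independent Beta(1 + n, 1 + N - n) posteriors.
rng = random.Random(0)
draws = [[[rng.betavariate(1 + n[i][j], 1 + N[i][j] - n[i][j])
           for j in range(2)] for i in range(3)] for _ in range(500)]

value, fit, penalty = dic(n, N, draws)
print(value, fit, penalty)
```

Running the same function on draws from the constrained and the unconstrained samplers gives DIC(A) and DIC(U) for the decision rule above.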

[Figure: apparent single cancellation axiom violations in red.]

Test of single cancellation (over rows only), results from Karabatsos (2001, JAM): no significant violation of single cancellation over rows. [Results table not reproduced.]

Test of single cancellation (over rows and columns), results from Karabatsos (2001, JAM): significant violation of the single cancellation axiom. [Results table not reproduced.]

[Figure: True or random violation of the single and double cancellation axioms? Apparent single cancellation axiom violations in red; apparent double cancellation axiom violation in purple.]

Test of single and double cancellation (Karabatsos, 2001, JAM): significant violation of the single and double cancellation axioms. [Results table not reproduced.]

NAEP reading test data

NAEP data: 100 examinees, 6 items; results from Karabatsos & Sheu (2004, APM). Posterior predictive chi-square test of single cancellation (over rows). Violations indicated by bold. [Results table not reproduced.]

NAEP data: 100 examinees, 6 items; results from Karabatsos & Sheu (2004, APM). Posterior predictive chi-square test of single cancellation (over columns). Violations indicated by bold. [Results table not reproduced.]

IV. Dealing With Axiom Violations

• We have seen from the previous two empirical applications that the measurement axioms can be violated, even in data arising from carefully-constructed tests.

• One way to deal with the problem is by defining a more flexible IRT model that can handle outliers.

• A flexible Bayesian Nonparametric outlier-robust IRT model.

• We will present and briefly illustrate the model through the analysis of data arising from a teacher preparation survey from PIRLS.

• 244 respondents (teachers).

• Each rated (0-2) own level of teacher preparation on 10 items: CERTIFICATE, LANGUAGE, LITERATURE, PEDAGOGY, PSYCHOLOGY, REMEDIAL, THEORY, LANGDEV, SPED, SECLANG.

• Also included covariates AGE, FEMALE, Miss:FEMALE.

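BNP-IRT model, Karabatsos (2015, Handbook of Modern IRT)

Persons (examinees) indexed by p = 1,…,P. Test items indexed by j = 1,…,J.

\[
f(D \mid X; \zeta) = \prod_{p=1}^{P} \prod_{j=1}^{J} f(y_{pj} \mid x_{pj}; \zeta)
\]

\[
f(y_{pj} \mid x_{pj}; \zeta) = P(Y_{pj} = 1 \mid x_{pj}; \zeta)^{y_{pj}} \, [1 - P(Y_{pj} = 1 \mid x_{pj}; \zeta)]^{1 - y_{pj}}
\]

\[
\Pr(Y = 1 \mid x; \zeta) = 1 - F^{*}(0 \mid x; \zeta) = \int_{0}^{\infty} f^{*}(y^{*} \mid x; \zeta) \, dy^{*}
= \int_{0}^{\infty} \sum_{k=-\infty}^{\infty} n(y^{*} \mid \mu_k + x^{\top}\beta, \sigma^2) \, \omega_k(x; \beta_\omega, \sigma_\omega) \, dy^{*}
\]

\[
\omega_k(x; \beta_\omega, \sigma_\omega) = \Phi\!\left( \frac{k - x^{\top}\beta_\omega}{\sigma_\omega} \right) - \Phi\!\left( \frac{k - 1 - x^{\top}\beta_\omega}{\sigma_\omega} \right)
\]

\[
(\mu_k, \sigma_\mu) \sim N(\mu_k \mid 0, \sigma_\mu^2) \, U(\sigma_\mu \mid 0, b_{\sigma\mu})
\]

\[
(\beta, \beta_\omega) \sim N(\beta \mid 0, \sigma^2 v \, \mathrm{diag}(\infty, J_{NJ})) \, N(\beta_\omega \mid 0, \sigma_\omega^2 v I_{NJ+1})
\]

\[
(\sigma^2, \sigma_\omega^2) \sim IG(\sigma^2 \mid a_0/2, a_0/2) \, IG(\sigma_\omega^2 \mid a_\omega/2, a_\omega/2).
\]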


Absolutely no item response outliers under the BNP-IRT model.

• The estimated posterior means of the person ability parameters were found to be distributed with mean .00, s.d. .46, minimum −.66, and maximum 3.68 for the 244 persons.

[Figure: For the BNP-IRT model, boxplot of the marginal posterior distributions of the item, covariate, and prior parameters. Dependent variable = itemrespvs0. Vertical axis: Value (ticks at -5, 0, 5, 10). Horizontal axis: beta0; beta:X(1) and beta:X(2) for X = CERTIFICATE, LANGUAGE, LITERATURE, PEDAGOGY, PSYCHOLOGY, REMEDIAL, THEORY, LANGDEV, SPED, SECLANG, AGE, FEMALE, Miss:FEMALE; sigma^2; sigma^2_mu; beta_w0; beta_w:X(1) and beta_w:X(2) for the same X; sigma^2_w.]

V. Conclusions

• The ability to measure persons and/or items on an ordinal or interval scale depends on data satisfying a hierarchy of conjoint measurement axioms, including single, double, and triple cancellation, and higher order cancellation conditions.

• We presented a Bayesian model that can represent a set of one or more axioms in terms of order constraints on binomial parameters, with the constraints enforced by the prior distribution.

• This model provided a coherent approach to test the measurement axioms on real data sets.

V. Conclusions

• Applications of the Bayesian axiom testing model showed that the measurement axioms can be violated even in data arising from carefully constructed tests.

• As a possible remedy to this issue, we propose a more flexible BNP-IRT model that can provide estimates of person and item parameters that are robust to any item response outliers in the data.

• In a sense the BNP-IRT model is not wrong for the data; it is a highly flexible model which makes rather irrelevant the practice of model-checking, axiom testing, or model fit analysis. (For related arguments, see Karabatsos & Walker, 2009, BJMSP.)

V. Conclusions

• The Bayesian axiom testing model of Karabatsos (2001) was later used to

-- test decision theory axioms (e.g., Myung et al., 2005, JMP);

-- test measurement axioms (e.g., Kyngdon, 2011; Domingue, 2012). The latter author suggested a minor modification to the MH algorithm of Karabatsos (2001) to handle more orderings under double cancellation.

Like Karabatsos & Sheu (2004), this talk focused on a Gibbs sampler, which is usually preferable to a rejection sampler like the MH algorithm, for MCMC practice.

etc.

• Karabatsos (2005, JMP) defined a binomial parameter θ as the probability of a choice that satisfied an axiom. Then, under a conjugate beta prior for θ, we may directly calculate a Bayes factor to test the axiom (H0) according to H0: θ > c versus H1: θ < c for some large c, such as .95.
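The slide does not spell out the calculation, so the following is only a sketch of one standard way to obtain such a Bayes factor: the ratio of posterior odds to prior odds for H0: θ > c against H1: θ < c. It assumes a Beta(a, b) prior with positive integer a and b (so that the beta c.d.f. reduces to a finite binomial sum), and the function names and counts are hypothetical.

```python
import math

def beta_tail_probs(x, a, b):
    """Return (P(theta <= x), P(theta > x)) for Beta(a, b) with positive
    integer a, b, using the identity P(theta <= x) = P(Bin(a+b-1, x) >= a)."""
    m = a + b - 1
    terms = [math.comb(m, k) * x ** k * (1.0 - x) ** (m - k) for k in range(m + 1)]
    return sum(terms[a:]), sum(terms[:a])

def bayes_factor(n_ok, n_total, c=0.95, a=1, b=1):
    """Bayes factor for H0: theta > c versus H1: theta < c, where n_ok of
    n_total choices satisfied the axiom and the prior is Beta(a, b)."""
    prior_h1, prior_h0 = beta_tail_probs(c, a, b)
    post_h1, post_h0 = beta_tail_probs(c, a + n_ok, b + n_total - n_ok)
    return (post_h0 / post_h1) / (prior_h0 / prior_h1)

print(bayes_factor(99, 100))  # nearly all choices satisfy the axiom
print(bayes_factor(50, 100))  # half of the choices violate it
```

Values above 1 favour the axiom (H0) and values below 1 favour its violation (H1). Both tail probabilities are summed directly rather than as one minus the other, which keeps very small probabilities from being lost to rounding.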

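Extensions of Axiom Testing Model (1)

• Allow for random orderings for the cancellation axioms.

• Consider the joint posterior distribution:

\[
\Pi(\theta, \vartheta, \beta, A_{\vartheta,\beta} \mid Y, N, n) = \Pi(\vartheta, \beta, A_{\vartheta,\beta} \mid Y) \, \Pi(\theta \mid N, n, A_{\vartheta,\beta})
\]

given Rasch model (person abilities ϑp, item parameters βj):

\[
P(Y_{pj} = 1 \mid \vartheta_p, \beta_j) = \frac{\exp(\vartheta_p - \beta_j)}{1 + \exp(\vartheta_p - \beta_j)}
\]

Posterior distribution:

\[
\pi(\vartheta, \beta \mid Y = (y_{pj})_{N \times J}) = \frac{\prod_{p=1}^{N} \prod_{j=1}^{J} \dfrac{\exp(\vartheta_p - \beta_j)^{y_{pj}}}{1 + \exp(\vartheta_p - \beta_j)} \; n(\vartheta, \beta \mid 0, I_{N+J})}{\int \prod_{p=1}^{N} \prod_{j=1}^{J} \dfrac{\exp(\vartheta_p - \beta_j)^{y_{pj}}}{1 + \exp(\vartheta_p - \beta_j)} \; dN(\vartheta, \beta \mid 0, I_{N+J})}
\]

• As before,

\[
\pi(\theta \mid N, n, A_{\vartheta,\beta}) \propto \prod_{i=0}^{I} \prod_{j=1}^{J} be(\theta_{ij} \mid a_{ij} + n_{ij},\; b_{ij} + N_{ij} - n_{ij}) \, 1(\theta \in A_{\vartheta,\beta})
\]

• A_{ϑ,β} is the random linear rank ordering that the matrix ( P(Ypj = 1 | ϑ, β) )N×J induces on θ = (θij)(I+1)×J.

• This ordering automatically satisfies all cancellation axioms.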

Extensions of Axiom Testing Model (1)

• Then the joint posterior distribution Π(θ, ϑ, β, A_{ϑ,β} | Y, N, n) of the binomial parameters θ, the Rasch person and item parameters (ϑ, β), and the induced ordering A_{ϑ,β} can be estimated by using the usual MCMC methods.

• For each stage of the MCMC chain, {(θ(s), ϑ(s), β(s), A_{ϑ,β}(s))}s=1:S, the Gibbs sampler (inverse c.d.f.) method provides a Gibbs sampling update for θ(s), based on the updated ordering A_{ϑ,β}(s).

• The Bayesian axiom tests then proceed as before, but now they are based on marginalizing these tests over the posterior distribution of A_{ϑ,β}.

Extensions of Axiom Testing Model (2)

• Extend the independent (truncated) Beta priors for the θij's, namely

θ ~ ∏i∏j Be(θij | a, b) 1(θ ∈ A),

to a prior defined by a discrete mixture of beta distributions:

θ = (θij)(I+1)×J ~iid ∏i∏j ∫ Be(θij | a, b) dG(a, b) 1(θ ∈ A),

G ~ DP(α, G0),

where E[G(a, b)] = G0(a, b) := N2(log(a), log(b) | 0, V),

Var[G(a, b)] = G0(a, b) [1 − G0(a, b)] / (α + 1).

• Any smooth distribution defined on (0,1) can be approximated arbitrarily well by a suitable mixture of beta distributions. Such a prior would define a more flexible Bayesian axiom testing model, based on a richer class of prior distributions.

Other Work / Collaborations

• Bayesian nonparametric inference of distribution functions under stochastic ordering: F1 < F2 < ⋯ < FK (Karabatsos & Walker, 2007, SPL).

o Considered Bernstein polynomial priors and Polya tree priors for the Fs. In each case, posterior inference is based on order-constrained beta posterior distributions (as in Karabatsos 2001).

• Bayesian nonparametric score equating model using a novel dependent Bernstein-Dirichlet polynomial prior for the test score distribution functions (FX, FY) used for equipercentile equating (Karabatsos & Walker, 2009, Psychometrika).

• Bayesian inference for test theory without an answer key (Karabatsos & Batchelder, 2003, Psychometrika).

• Comparison of 36 person fit statistics (Karabatsos 2003, AME).

Other Work / Collaborations

• Karabatsos, G., and Walker, S.G. (2012). A Bayesian nonparametric causal model. J Statistical Planning & Inference.

o DP mixture of propensity score models for causal inference in nonrandomized studies.

• Karabatsos, G., and Walker, S.G. (2012). Bayesian nonparametric mixed random utility models. Computational Statistics & Data Analysis, 56, 1714-1722.

o In terms of an IRT model, provides a DP infinite-mixture of nominal response models, with person and item parameters subject to the infinite-mixture.

• Fujimoto, K., and Karabatsos, G. (2014). Dependent Dirichlet Process Rating Model (DDP-RM). Applied Psychological Measurement, 38, 217-228.

o Model allows for clustering of ordinal category thresholds.

o Ken Fujimoto: former Ph.D. student. Now faculty at Loyola U. Chicago.

Other Work / Collaborations

• Karabatsos, G., and Walker, S.G. (2012). Adaptive-Modal Bayesian Nonparametric Regression (EJS).

o IRT version of this model, mentioned in this talk, to appear in Handbook Of Item Response Theory (2015).

o Model extended to meta analysis: Karabatsos, G., Walker, S.G., and Talbott, E. (2014). A Bayesian nonparametric regression model for meta-analysis. Research Synthesis Methods.

o Model extended for causal inference in non-randomized, regression discontinuity designs (Karabatsos & Walker, 2015; to appear in Müller and R. Mitra (Eds.), Nonparametric Bayesian Methods in Biostatistics and Bioinformatics).