Creating Innovative Tests: Applying Universal Design to Assessment Practices Assessment Colloquium...

Creating Innovative Tests: Applying Universal Design to Assessment Practices

Assessment Colloquium
November 30, 2007

Manju Banerjee, Ph.D.
Assistant Professor in Residence

Special Education

Just imagine ---

If there were no tests, no assessment, no accountability as we know it?

Student perspective: Teacher perspective: Policy maker perspective:

“Opportunities borne of new technologies, desires borne of new understandings of learning ---- a new generation of assessment beckons. To realize the vision, we must reconceive how we think about assessment, from purposes and designs to production and delivery.” (Mislevy, Steinberg, & Almond, 1999, p. 6)

The States and Online Testing

Source: Education Week survey of state technology contacts, Technology Counts, 2004

Computer-based tests (CBT) are the “next frontier” in high stakes assessment (Thompson, Johnstone, & Thurlow, 2002)

UD is anchored in the belief that a design that works well for examinees with disabilities improves usability for all individuals (Center for an Accessible Society, 2006)

What is universal design? (Center for Universal Design, 1997)

What makes a test universally designed?

* Seven elements of a universally designed test (Thompson, Johnstone, & Thurlow, 2002)

Opportunity to create tests that support accessibility needs of diverse test takers -- Universal Design (UD)

What is the appeal of computer-based tests?

Maximum usability

Widest range of consumers

Without design adaptations

Minimize construct irrelevant features

Include test taking features

Disabilities, ELL, Non-traditional age

Built-in from the start

Examinee choice is “flexibility to access and express in the mode or methods that best suit the individual” (Hall, 2005, p. 2; Russell, Goldberg, & O’Conner, 2003)

EXAMINEE CHOICE

Application of Universal Design to High Stakes Tests

Inform product development of high stakes tests

I. Objective of Study

Based on current research on features that support “examinee choice” in high stakes test design

Test taking tools
• Goldberg & Pedula, 2002
• Peak, 2005
• Lunz & Bergstrom, 1994
• Vispoel et al., 2000

On-screen item display tools
• Bridgeman, Lennon, & Jackenthal, 2002
• Mazzeo & Harvey, 1988
• Pommerich, 2004
• Pommerich & Burden, 2004

Access tools
• Mandinach et al., 2005
• Sireci, Li, & Scarpati, 2003
• Tindal & Fuchs, 2000

II. Background Information

Features of Examinee Choice

Construct neutral

Construct related

• Tindal, 1998
• CTB/McGraw-Hill, 2004

Test taking tools

On-screen item display

Access tools

II. Background Information (Cont.)

• UD increases accessibility for all examinees

• Accessibility is maximized when examinees have choice over features of test design

• Research on features of test design falls into three broad categories:

(1) Test taking tools (2) On-screen item display (3) Access tools

• Some features are construct neutral/construct irrelevant; others are construct related (including test accommodations)

• Allowing examinees to choose features of test design based on individual preferences needs to be explored for a wide range of features including features that affect test construct

• UD suggests a framework, but research is still emerging on the application of UD to high stakes CBTs.

1. What are college students’ stated preferences for features and combinations of features of test design from among test taking tools, on-screen item display, and access tools for the Passage Comprehension section of the GRE?

2. Are stated preferences for features and combinations of features from among test-taking tools, on-screen item display, and access tools different among students with and without learning disabilities (LD), Attention Deficit Hyperactivity Disorder (ADHD), or both?

Research questions

III. Methodology and Procedures

Exploratory study - Survey design

Participants responded to an online survey instrument:

(1) Student background questionnaire
(2) Demonstration of selected features of CBT
(3) Opportunity for practice
(4) Two choice exercises
* Rank-ordered choice exercise
* Voluntary top feature choice exercise

Two pilot studies

Research Design, Instrumentation, Pilot Study

III. Methodology and Procedures (cont.)

Attribute 1: Test taking tools
• Highlighting
• Tagging
• Strike-out
• Change answer

Attribute 2: On-screen item display tools
• Font size
• Note pad
• Question reorder

Attribute 3: Access tools
• Self-voicing less 20 points
• 50% extra time less 20 points
• Self-voicing less 40 points
• 50% extra time less 40 points
• Self-voicing less 60 points
• 50% extra time less 60 points
• No selection

Instrumentation – Test Features

III. Methodology and Procedures (contd.)

Instrumentation


Introduction http://www.education.uconn.edu/jamison/highstakestesting/intro.cfm

Highlighting feature

http://www.education.uconn.edu/jamison/highstakestesting/tool1video.cfm

Strike out feature


http://www.education.uconn.edu/jamison/highstakestesting/intro.cfm
Instrumentation

III. Methodology and Procedures (cont.)

Choice exercise 1

http://www.education.uconn.edu/jamison/highstakestesting/choice1.cfm

Choice exercise 2
Instrumentation- Creating the 1st choice exercise


Given 4 x 3 x 7 (features) = 84 combinations

Select a unique group of 4 from 84 combinations: ${}^{84}C_{4}$

| Attribute | Range of occurrence of features |
|---|---|
| Test taking tools | 176 - 196 |
| On-screen item display tools | 230 - 250 |
| Access tools | 75 - 99 |
| “No selection” feature | 185 |
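
The counts on this slide can be checked with a short Python sketch. It assumes that each combination takes exactly one feature from each of the three attributes listed under "Instrumentation – Test Features", and that "a unique group of 4" means an unordered selection of 4 of the 84 combinations. The variable names are illustrative and do not come from the study.

```python
import itertools
import math

# Feature lists as named on the "Instrumentation – Test Features" slide.
TEST_TAKING_TOOLS = ["Highlighting", "Tagging", "Strike-out", "Change answer"]
ITEM_DISPLAY_TOOLS = ["Font size", "Note pad", "Question reorder"]
ACCESS_TOOLS = [
    "Self-voicing less 20 points",
    "50% extra time less 20 points",
    "Self-voicing less 40 points",
    "50% extra time less 40 points",
    "Self-voicing less 60 points",
    "50% extra time less 60 points",
    "No selection",
]

# Assumption: a combination is one feature from each attribute (4 x 3 x 7).
combinations = list(
    itertools.product(TEST_TAKING_TOOLS, ITEM_DISPLAY_TOOLS, ACCESS_TOOLS)
)

# Number of distinct unordered groups of 4 combinations drawn from all of them.
possible_groups = math.comb(len(combinations), 4)

print(len(combinations))  # 84
print(possible_groups)    # 1929501
```

The large number of possible groups suggests why each participant could only be shown a small sample of them.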

Research Question 1

Data Analysis

Rank-ordered choice exercise data

Voluntary top feature choice exercise data

Rank-ordered Logit Regression

Multinomial Logit Regression


Research Question 2

Data Analysis – Rank-ordered logit regression


Dependent variable - Ranks assigned to the combinations of features

Independent variables - Features and attributes

$\hat{Y} = \hat{\beta} X$

$\hat{Y}$: Utility/Preference [rank is proxy for preference]

$\hat{\beta}$: Relative Utility

* Non-significant = baseline
* Non-significant ≠ zero probability of selection
* One feature is dropped from each attribute for the model to be determinate

Data Analysis – Rank-ordered logit regression

Three models were estimated:

Model 1: Three attributes as independent variables

Model 2: Three attributes as independent variables, with the “no selection” feature omitted

Model 3: Features within each attribute as independent variables
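
As a rough illustration of how a rank-ordered logit treats a ranking, the sketch below computes the log-likelihood of one respondent's ranking of four combinations. It treats the ranking as a sequence of logit choices, each picking the best option among those not yet ranked. The coefficient values, the choice of baseline features, and the function names are all hypothetical; the presentation does not show the estimation code or the software that was used.

```python
import math

# Hypothetical relative utilities (coefficients). Any feature not listed here
# is treated as the dropped baseline of its attribute, with utility 0.
COEFFICIENTS = {
    "Strike-out": 0.4,
    "Note pad": 0.2,
    "Self-voicing less 40 points": -0.5,
}

def utility(combination):
    """Sum the coefficients of the features in one combination."""
    return sum(COEFFICIENTS.get(feature, 0.0) for feature in combination)

def ranking_log_likelihood(ranked_combinations):
    """Log-probability of an observed ranking, best-ranked combination first."""
    utilities = [utility(c) for c in ranked_combinations]
    log_likelihood = 0.0
    # At each step the top remaining option is a logit choice among those left.
    for position in range(len(utilities) - 1):
        remaining = utilities[position:]
        log_denominator = math.log(sum(math.exp(u) for u in remaining))
        log_likelihood += utilities[position] - log_denominator
    return log_likelihood

# One hypothetical respondent's ranking of four combinations.
ranking = [
    ("Strike-out", "Note pad", "No selection"),
    ("Highlighting", "Note pad", "50% extra time less 20 points"),
    ("Tagging", "Font size", "No selection"),
    ("Change answer", "Question reorder", "Self-voicing less 40 points"),
]
print(ranking_log_likelihood(ranking))
```

Estimation would search for the coefficients that maximize the sum of this quantity over all respondents. With every coefficient at zero, each of the 24 possible orderings of four combinations is equally likely.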


Data Analysis – Multinomial logit regression

Dependent variable – Top pick feature within an attribute

Independent variables – Demographic characteristics
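
For readers unfamiliar with the multinomial logit, the sketch below shows the step that turns one linear predictor per alternative into choice probabilities. The numbers are hypothetical and only illustrate the mechanics. In the study each linear predictor would be built from a respondent's demographic characteristics, a step not reproduced here.

```python
import math

def multinomial_logit_probabilities(linear_predictors):
    """Turn one linear predictor per alternative into choice probabilities."""
    # Subtracting the maximum keeps exp() numerically stable.
    highest = max(linear_predictors.values())
    weights = {
        name: math.exp(value - highest)
        for name, value in linear_predictors.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

# Hypothetical linear predictors for one respondent picking a top on-screen
# item display feature; "Font size" is taken as the reference alternative.
example = {"Font size": 0.0, "Note pad": 0.6, "Question reorder": 0.3}
probabilities = multinomial_logit_probabilities(example)
print(probabilities)
```

The probabilities always sum to one within an attribute, so a higher probability for one feature comes at the expense of the others in that attribute.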


[Figure: Sample Demographics. Bar chart of number of participants (vertical axis, 0 to 200) by profile: All participants, LD/ADHD, No LD/ADHD, Grad, Undergrad, High GPA, Low GPA, No disability, Disability, Prior experience, No prior experience, Male, Female]

IV. Results – Participant demographics

| Demographic characteristics | Test-taking tools (SE) | On-screen item display tools (SE) | Access tools (SE) |
|---|---|---|---|
| All participants | .03(.06) | .04(.09) | .04(.03) |
| No LD/ADHD | .03(.07) | -.03(.09) | .06(.03)** |
| LD/ADHD | -.06(.17) | .51(.24)** | -.12(.08)* |
| Graduate | -.03(.08) | .02(.11) | .11(.03)*** |
| Undergraduate | .10(.09) | .07(.14) | -.05(.04) |
| No disability | .05(.07) | -.03(.11) | .07(.03)** |
| Disability | -.02(.12) | .20(.15) | -.02(.05) |

*p<0.10, **p<0.05, ***p<0.01

IV. Results - Model 1

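| Demographic characteristics | Test-taking tools (SE) | On-screen item display tools (SE) | Access tools (SE) |
|---|---|---|---|
| High GPA (≥3.0) | .04(.07) | -.04(.09) | .04(.03) |
| Low GPA (<3.0) | .004(.17) | .54(.26)** | .06(.08) |
| Prior CBT exp. | .05(.09) | -.10(.12) | .09(.03)** |
| No CBT exp. | .10(.08) | .15(.12) | .01(.04) |
| Male | .02(.11) | -.01(.15) | .09(.04)** |
| Female | .04(.07) | .05(.10) | .01(.03) |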

*p<0.10, **p<0.05, ***p<0.01

IV. Results - Model 1 (contd.)

| Demographic characteristics | Test-taking tools (SE) | On-screen item display tools (SE) | SV + Time (SE) |
|---|---|---|---|
| All participants | .03(.06) | .03(.09) | -.19(.15) |
| No LD/ADHD | .03(.07) | -.04(.09) | -.24(.16) |
| LD/ADHD | -.04(.17) | .49(.24)** | .40(.45) |
| Graduate | -.03(.08) | .02(.11) | -.39(.19)** |
| Undergraduate | .10(.09) | .07(.14) | .12(.24) |
| No disability | .05(.07) | -.04(.11) | -.25(.18) |
| Disability | -.02(.11) | .02(.15) | .01(.26) |

*p<0.10, **p<0.05, ***p<0.01

IV. Results - Model 2

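| Demographic characteristics | Test-taking tools (SE) | On-screen item display tools (SE) | SV + Time (SE) |
|---|---|---|---|
| High GPA (≥3.0) | .03(.07) | -.04(.09) | -.24(.16) |
| Low GPA (<3.0) | -.01(.17) | .50(.26)** | -.05(.43) |
| Prior CBT exp. | -.05(.09) | -.10(.12) | -.09(.04)** |
| No CBT exp. | .10(.08) | .15(.12) | .01(.04) |
| Male | .02(.11) | -.02(.15) | -.55(.28)** |
| Female | .03(.11) | -.02(.16) | -.55(.31) |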

*p<0.10, **p<0.05, ***p<0.01


Features above baseline preference

Features at baseline preference

Features below baseline preference

• Strike-out

• Tagging for review

• Highlighting

• Change answer

• Question reorder

• Extra time

• Font size

• SV less 20 pt.

• SV less 40 pt.

All, LD/ADHD, No LD/ADHD, Grad, Undergrad


Features above baseline preference

Features at baseline preference

Features below baseline preference

• Tagging for review

• Highlighting

• Change answer

• Question reorder

• Extra time

• Note pad

• Strike-out

• SV less 40 pt.

• Extra time less 40 pt

High GPA (≥3.0), Low GPA (<3.0), Prior CBT experience, No prior CBT experience, Male, Female


Test of equality of regression coefficients for LD/ADHD status

| Model | Attribute or feature | Z | p |
|---|---|---|---|
| Model 1 | On-screen item display | 2.107 | 0.02 |
| Model 2 | On-screen item display | 2.07 | 0.02 |
| Model 2 | SV + ET | 1.34 | 0.09 |
| Model 3 | Tagging for later review | 1.76 | 0.04 |
| Model 3 | Strike-out | 2.40 | 0.01 |
| Model 3 | Self-voicing less 40 points | 2.30 | 0.01 |

IV. Results (contd.)
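
The slides do not state which formula was used for the test of equality of regression coefficients. The reported values are, however, consistent with a common large-sample Z test for two independently estimated coefficients, $Z = (b_1 - b_2) / \sqrt{SE_1^2 + SE_2^2}$, with a one-tailed p-value. The sketch below applies that assumed formula to the Model 1 coefficients for on-screen item display tools (LD/ADHD .51(.24) versus No LD/ADHD -.03(.09)). The function names are illustrative.

```python
import math

def coefficient_equality_z(b1, se1, b2, se2):
    """Z statistic for the difference between two independent coefficients."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)

def one_tailed_p(z):
    """Upper-tail probability of a standard normal variate."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Model 1, on-screen item display tools:
# LD/ADHD .51(.24) versus No LD/ADHD -.03(.09)
z = coefficient_equality_z(0.51, 0.24, -0.03, 0.09)
print(round(z, 3))                # 2.107
print(round(one_tailed_p(z), 2))  # 0.02
```

Applying the same function to the Model 2 SV + Time coefficients, .40(.45) and -.24(.16), gives a Z of about 1.34, which also agrees with the reported value.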

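[Figure: Estimated Probability of Selection in Voluntary Top Feature Choice Exercise - All Participants. Bar chart of probabilities (vertical axis, 0 to 80) by feature]

| Attribute | Feature | Estimated probability (%) |
|---|---|---|
| Test taking tools | Highlighter | 33.15 |
| Test taking tools | Tag | 12.16 |
| Test taking tools | Strike-out | 22.1 |
| Test taking tools | Change answer | 32.6 |
| On-screen item display tools | Font size | 24.68 |
| On-screen item display tools | Note pad | 43.09 |
| On-screen item display tools | Question reorder | 32.04 |
| Access tools | Extra time | 79.6 |
| Access tools | Self voicing | 20.4 |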

IV. Results -Voluntary Top Feature Choice Exercise

College students’ preferences for combinations of features of test design varied by demographic strata

At the attribute level (rank-ordered exercise):

- Students with LD and/or ADHD AND students with low GPA indicated above-baseline preference for on-screen item display relative to test-taking tools and access tools

- Students without LD/ADHD, graduates, students with no disabilities, students with prior CBT experience, and males preferred access tools when “no selection” was included, BUT indicated below-baseline preference when “no selection” was removed (except for those with no disabilities)

V. Summary of Results and Discussion

At the features level (rank-ordered exercise)

Strike-out, Tagging* (LD/ADHD; With disabilities; Undergrad*; GPA* < 3.0)

At the features level (voluntary top choice exercise) among Test-taking tools

Highlight (all participants*; No LD/ADHD; High GPA*; Graduates, Undergraduates, No disabilities, Male; Female*)

V. Summary of Results and Discussion (contd.)

At the features level (voluntary top choice exercise) among On-screen item display

Note pad (across all demographic strata)

At the features level (voluntary top choice exercise) among Access tools

Extra time (across all demographic strata)

V. Summary of Results and Discussion (contd.)

Further investigation of “examinee choice” in high stakes computer-based test development

Explore other features of test design

Investigate concept of examinee choice with different college populations

Expand UD in assessment to include construct irrelevant and construct related features.

Provide examinee choice for features within high stakes test preparation material

V. Implications for Future Research

Sample selection did not follow formal procedures for stratified random sampling (Levy & Lemeshow, 1991)

Participants were all from a competitive Research One university (external validity)

Focus on hypothetical choices rather than real time choices (stated vs. revealed preference)

No way to determine if all participants clearly understood the “trade-off” exercise (notion of students with LD/ADHD and penalty).

V. Limitations of Study

**********

END OF PRESENTATION
