15. Accommodations on Large-scale Assessment


  • 7/24/2019 15. Accommodations on Large-scale Assessment


    Sage Publications, Inc. and American Educational Research Association are collaborating with JSTOR to digitize, preserve and extend access to Review of Educational Research.

    http://www.jstor.org

    Accommodations for English Language Learners Taking Large-Scale Assessments: A Meta-Analysis on Effectiveness and Validity
    Author(s): Michael J. Kieffer, Nonie K. Lesaux, Mabel Rivera and David J. Francis
    Source: Review of Educational Research, Vol. 79, No. 3 (Sep., 2009), pp. 1168-1201
    Published by: American Educational Research Association
    Stable URL: http://www.jstor.org/stable/40469092
    Accessed: 15-11-2015 00:20 UTC

    Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp

    JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

    This content downloaded from 161.112.232.221 on Sun, 15 Nov 2015 00:20:46 UTC. All use subject to JSTOR Terms and Conditions.


    Review of Educational Research
    September 2009, Vol. 79, No. 3, pp. 1168-1201
    DOI: 10.3102/0034654309332490
    © 2009 AERA. http://rer.aera.net

    Accommodations for English Language Learners Taking Large-Scale Assessments: A Meta-Analysis on Effectiveness and Validity

    Michael J. Kieffer and Nonie K. Lesaux
    Harvard Graduate School of Education

    Mabel Rivera and David J. Francis
    University of Houston

    Including English language learners (ELLs) in large-scale assessments raises questions about the validity of inferences based on their scores. Test accommodations for ELLs are intended to reduce the impact of limited English proficiency on the assessment of the target construct, most often mathematics or science proficiency. This meta-analysis synthesizes research on the effectiveness and validity of such accommodations for ELLs. Findings indicate that none of the seven accommodations studied threaten the validity of inferences. However, only one accommodation (providing English dictionaries or glossaries) has a statistically significant effect on ELLs' performance, and this effect equates to only a small reduction in the achievement score gap between ELLs and native English speakers. Findings suggest that accommodations to reduce the impact of limited language proficiency on academic skill assessment are not particularly effective. Given this, we posit a hypothesis about the necessary role of academic language skills in mathematics and science assessments.

    Keywords: achievement gap, assessment, English language learners, high stakes testing, language development.

    As the standards movement in education has gained in momentum, policy makers have increasingly focused on test-based accountability systems with the goal of improving academic achievement for all children. The principles of setting high standards, assessing all students relative to those standards, and holding schools accountable for student achievement have long been central to reform movements in public education (e.g., Fuhrman, 2003). However, since the No Child Left Behind Act of 2001 (NCLB), the application of these principles to subgroups of students identified as particularly at risk for academic difficulties has become very important.


    One of these subgroups consists of students who lack full proficiency in English, commonly referred to as English language learners (ELLs). ELLs represent one of the fastest-growing groups among the school-aged population in this nation (e.g., Capps et al., 2005). Speaking a wide variety of languages, this group almost doubled in size between 1980 and 2000, and the most recent estimates place the size of the population at more than [...] million (e.g., Batalova, Fix, & Murray, 2007). The results from many large-scale assessments suggest that when compared to their native English-speaking peers, ELLs lag behind in all grades and content areas. For example, on recent national assessments of reading and math, only a small minority of ELLs scored at proficient levels (4% to 11%, depending on grade and subject), compared to a third or more of native English speakers (National Center for Education Statistics, 2005).

    According to many educators, NCLB has succeeded in increasing awareness of the academic needs and achievement of ELLs through new requirements to evaluate schools, districts, and states based on the English and content outcomes of this group of learners (Center on Education Policy, 2006). However, including ELLs in large-scale assessments is not a straightforward undertaking. ELLs present a unique set of challenges for educators and policy makers because of the central role played by language proficiency in the acquisition and assessment of content area knowledge. Thus, many unanswered questions remain about the inclusion of ELLs in large-scale assessments; foremost among them are questions about how valid inferences about ELLs' abilities can be made based on scores from these assessments. The purpose of this study was to determine the effectiveness and validity of test accommodations for ELLs taking large-scale assessments by using meta-analysis to quantify the impact of the specific accommodations on the performance of ELLs and native English speakers.

    Including ELLs in Large-Scale Assessments

    Historically, ELLs have often been excluded from large-scale assessments because limited English proficiency was thought to prevent students from understanding questions and/or result in invalid test results under standard test administration procedures (Rivera, Collum, & Shafer Willner, 2006). Exclusion of large numbers of students from participation in standards-based tests not only can result in substantial distortion of the percentage of students achieving proficiency but also, more important, can obscure important and systematic differences in student achievement between different demographic groups. Thus, one of the laudable goals of NCLB and state efforts is to increase participation of all learners, including those in identified subgroups, in large-scale assessments.

    However, it is not enough for students to participate in state assessments; students' participation must lead to valid inferences about their achievement. Obtaining valid results is a particularly pressing issue because the stakes of mandated assessments for states, districts, and schools are high. NCLB and state accountability systems not only place considerable pressure on schools and districts to increase participation rates in large-scale assessments but also impose sanctions on schools that cannot move students in all identified subgroups toward proficiency. In addition, performance on large-scale assessments is increasingly high stakes for students: By 2008, 28 states in the United States will require that students pass a state-administered test for high school graduation (Fuhrman, 2003).

    There is reason for concern about the validity of test scores if in fact these reflect individual differences in abilities that are distinct from those that are the target of assessment (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on Measurement in Education [NCME], 1999). Because language plays an integral role in most, if not all, academic learning, any test of academic achievement is also, to some degree, a test of language ability. Consequently, ELLs present a special challenge to schools and those involved in large-scale assessment; if tests are not appropriately designed or if ELLs are not tested under appropriate conditions, then language demands of the test that are not central to the target of assessment may unfairly and negatively influence their performance. Research conducted by Abedi and colleagues has demonstrated that there is indeed a substantial link between students' English language proficiency and their performance on tests of math, science, and social studies (e.g., Abedi & Leon, 1999; Bailey, 2005; Butler & Castellon-Wellington, 2005).

    Furthermore, although there may be substantial differences between ELLs and their peers in content knowledge, research shows that the size of this knowledge gap often depends on the language demands of the assessment. Several correlational studies have found that assessments and individual test items that have more linguistic complexity yield larger performance gaps between ELLs and non-ELLs (e.g., Abedi, Leon, & Mirocha, 2003; Abedi, Lord, Hofstetter, & Baker, 2000; Abedi, Lord, & Plummer, 1997; Martiniello, 2007).

    These findings suggest that, contrary to some popular conceptions, assessments in all domains assess language skills as well as content knowledge and skills. However, such a relationship does not lead directly to the conclusion that valid inferences can never be made about the content knowledge of ELLs from large-scale assessments. Rather, the key question is to what extent the language skills measured by these assessments are essential to the construct targeted by the test and, in turn, to what extent they measure language demands that are irrelevant to the academic skills being assessed.

    Use of Accommodations for ELLs Taking Large-Scale Assessments

    Making specific changes to the test format or the conditions under which students are tested is one method that has been proposed to minimize the influence on content area test performance of variation in ELLs' language skills that is not central to the construct being assessed. Such test accommodations include any alteration to standard test administration procedures designed to provide support for students based on their special needs without changing the construct being assessed (AERA, APA, & NCME, 1999). These procedures include the presentation of the assessment items, the ways in which students respond to the items, any equipment or materials to be used, the period of time allowed to complete the test, and the environment in which students take the test. There are as many as 75 different accommodations currently in use with ELLs, although not all of them are appropriate. Moreover, their selection and implementation vary by state and district (for a review of state policies on accommodations for ELLs, see Rivera et al., 2006).


    An appropriate accommodation focuses on those extraneous factors that affect the test scores of students with special needs but that are not the target of assessment. An example of an appropriate accommodation would be to provide a large-print version of a test to a student with a visual impairment. At the same time, accommodations should not provide inappropriate support or change the nature of the task such that resulting scores no longer allow valid inferences about the central construct being measured. An example of an inappropriate accommodation would be to rewrite the passages in a reading comprehension assessment in a way that alters their fundamental difficulty level. Thus, for ELLs, appropriate accommodations provide direct or indirect linguistic support to minimize the negative impact of irrelevant language demands on students' performance so that the students can demonstrate their content knowledge and academic skills to the greatest extent possible.

    Evaluating Accommodations for ELLs in Large-Scale Assessments

    Theoretically speaking, many accommodations that offer linguistic support, such as providing dictionaries or simplifying the English sentence structure of the test items, may indeed be appropriate for ELLs. However, because content knowledge is inextricably linked to language, the use of certain language supports for ELLs may not be as straightforward as providing a large-print version of an assessment to a student with a visual impairment; even language-based accommodations that are grounded in theory may in practice be ineffective or threaten the validity of scores. Thus, the selection of accommodations for ELLs must be based on empirical evidence for their effectiveness and validity (Abedi, Hofstetter, & Lord, 2004).

    Although accommodations for ELLs can be evaluated along several dimensions, evaluating accommodations for effectiveness and validity is of paramount importance. Effectiveness refers to the extent to which students receiving the accommodation demonstrate improved test scores. In contrast, the validity of an accommodation refers, in part, to the notion that the accommodation should improve the performance of students who require it but not affect the performance of students who do not. If an accommodation affects the performance of students who do not require it, then providing the accommodation to some students but not others would threaten the validity of the resulting test scores. If an assessment is valid for use with a specific group, then students who do not require the accommodation will be neither advantaged nor disadvantaged by receiving it.
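The distinction between effectiveness and validity can be illustrated with a small numerical sketch. It is offered only as an illustration under stated assumptions: the effect size used here is a standardized mean difference (Hedges' g with the usual small-sample correction), which may differ from the estimator the authors used; the scores are invented; and the function and variable names are hypothetical. Under these assumptions, an accommodation looks effective when the ELL effect is positive, and it raises a validity concern when native English speakers who receive it also score noticeably higher.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(accommodated, unaccommodated):
    """Standardized mean difference with Hedges' small-sample correction."""
    n1, n2 = len(accommodated), len(unaccommodated)
    # Pooled variance from the two sample variances (n - 1 denominators).
    pooled_var = ((n1 - 1) * variance(accommodated)
                  + (n2 - 1) * variance(unaccommodated)) / (n1 + n2 - 2)
    d = (mean(accommodated) - mean(unaccommodated)) / sqrt(pooled_var)
    # Approximate correction factor J = 1 - 3 / (4 * df - 1), df = n1 + n2 - 2.
    correction = 1 - 3 / (4 * (n1 + n2 - 2) - 1)
    return d * correction


# Invented scores for illustration only (not data from any study).
ell_with = [14, 16, 15, 17, 18, 16]
ell_without = [13, 15, 14, 16, 17, 15]
native_with = [20, 22, 21, 23, 22, 21]
native_without = [20, 22, 21, 23, 22, 21]

# Effectiveness: does the accommodation raise ELL scores?
effectiveness = hedges_g(ell_with, ell_without)
# Validity check: does it also change scores of students who do not need it?
validity_check = hedges_g(native_with, native_without)

print(f"ELL effect size: {effectiveness:.2f}")
print(f"Native-speaker effect size: {validity_check:.2f}")
```

In this made-up example the accommodation helps ELLs and leaves native English speakers unchanged, which is the pattern the definitions above describe as both effective and valid.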

    A growing body of empirical research has evaluated accommodations for ELLs, but the results of these individual studies have yet to be quantitatively synthesized to produce aggregate estimates of their effectiveness and validity. Moreover, investigation of factors that may potentially moderate the effectiveness of these accommodations (e.g., grade level, domain tested, language of instruction) is needed. It is possible that a given accommodation will be more effective for tests in some domains than for tests in other domains or that accommodations will be more effective at some grade levels than at others. Curricular content and corresponding measures of achievement change with respect to both difficulty (National Center on Education and the Economy, 1998) and the nature of the skills tested (e.g., Koenig & Bachman, 2004; RAND Mathematical Study Panel, 2003; RAND Reading Study Group, 2002) over the course of the grade span, thus potentially influencing the effectiveness of specific accommodations. This is particularly important in the context of ELLs' test performance given the differing language demands of academic tasks over time and the language demands specific to different domains tested. For example, the fourth grade math test may emphasize and prioritize children's calculation skills, whereas the eighth grade tests in the same content area of math may emphasize complex word problems with sophisticated language. Finally, evaluating accommodations for this population must further recognize potential sources of differential effectiveness by focusing on the instructional and linguistic context in which the testing is occurring, given the differing models of instruction offered for ELLs (Abedi et al., 2004).

    Present Study

    The purpose of this study is to evaluate the effectiveness and validity of accommodations for ELLs participating in large-scale assessments. Two narrative reviews (Abedi et al., 2004; Sireci, Li, & Scarpati, 2003) have previously synthesized the findings of studies on test accommodations for ELLs published before 2001. The present study was designed to build on this work in two ways. First, using a meta-analytic approach, the current study quantifies the average effects of the accommodations studied. Second, the current study updates the findings of previous reviews by including the findings of several studies published since 2001 as well as those previously reviewed. Given the potential sources of differential effectiveness of accommodations discussed above, the meta-analysis also includes an examination of several moderators of effects. The analyses were guided by two specific research questions:

    1. What evidence exists that specific test accommodations are effective in improving the performance of ELLs taking large-scale assessments? What evidence exists that these effects differ as a function of the grade level of students, domain tested, provision of extra time, or language of instruction?

    2. What evidence exists that specific test accommodations designed for ELLs are valid in large-scale assessments?

    Method

    Study Inclusion Criteria

    Based on our research questions, we selected four characteristics that formed the criteria for inclusion of studies that provide empirical evidence for evaluating accommodations for ELLs. We included studies in the meta-analysis that (a) examined individual accommodations or individual accommodations bundled with extra time, (b) were articles published in peer-reviewed journals or technical reports available online, (c) employed an experimental, quasi-experimental, or repeated measures design, and (d) reported sufficient data to allow for the estimation of effect sizes.
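The four inclusion criteria above amount to a conjunction of simple checks, which can be sketched as a screening function. This is a minimal sketch, not the authors' procedure: the record fields, category labels, and example studies are all hypothetical and exist only to show how criteria (a) through (d) combine.

```python
from dataclasses import dataclass

# Hypothetical category labels chosen for this illustration.
ACCEPTED_SOURCES = {"peer-reviewed journal", "online technical report"}
ACCEPTED_DESIGNS = {"experimental", "quasi-experimental", "repeated measures"}


@dataclass
class CandidateStudy:
    """Hypothetical record for one study found by a literature search."""
    label: str
    examines_individual_accommodation: bool  # criterion (a), with or without extra time
    source_type: str                         # criterion (b)
    design: str                              # criterion (c)
    reports_effect_size_data: bool           # criterion (d)


def meets_inclusion_criteria(study: CandidateStudy) -> bool:
    """Return True only when all four criteria (a) through (d) hold."""
    return (
        study.examines_individual_accommodation
        and study.source_type in ACCEPTED_SOURCES
        and study.design in ACCEPTED_DESIGNS
        and study.reports_effect_size_data
    )


# Made-up examples, not studies from the review.
candidates = [
    CandidateStudy("Study A", True, "peer-reviewed journal", "experimental", True),
    CandidateStudy("Study B", True, "conference presentation", "experimental", True),
    CandidateStudy("Study C", False, "online technical report", "quasi-experimental", True),
]
included = [s.label for s in candidates if meets_inclusion_criteria(s)]
print(included)
```

Here only the first made-up study passes: the second fails criterion (b) and the third fails criterion (a).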

    Search procedure. Studies for review were obtained through two searches conducted in July 2006 designed to include all studies available up to that time. First, we conducted a comprehensive search of online databases, including Education Resources Information Center, PsycINFO, Modern Language Association, Education Abstracts, and Academic Search Premier (which yielded 114 entries), as well as the online database of the National Center for Research on Evaluation, Standards, and Student Testing (which yielded an additional 27 entries, many of them redundant with the 114 previously found). The abstract of each identified citation was read to determine if it was an empirical study examining the effects of one or more accommodations. Second, we collected citations of studies previously reviewed by Sireci et al. (2003) and/or by Abedi et al. (2004). Based on the list of citations of empirical studies from the two searches, we collected technical reports as well as articles. However, we did not collect presentations at academic conferences because of both practical and quality concerns. In several cases, the results of a single study were reported in multiple documents; in such cases, the documents were linked together and cross-checked for complete information and the most recent document is cited here.

    Excluded studies. The search procedure above yielded 21 studies for possible inclusion in the analyses. However, several of these studies, including some cited in previous reviews, had to be excluded from the meta-analysis for reasons of data reporting or methodology. In three instances (N. E. Anderson, Jenkins, & Miller, 1996; Hafner, 2001; Lotherington-Woloszyn, 1993), the studies did not report the necessary information to quantify the effects of accommodations separately for ELLs and native English speakers. In two cases (Abedi & Hejri, 2004; Shepard, Taylor, & Betebenner, 1998), studies examined the effect of various accommodations chosen for individual students by their teachers and thus were inappropriate for examining the effect of specific accommodations. In one case, a previously cited study (Miller, Okum, Sinai, & Miller, 1999) was a conference presentation.

    After excluding the studies above, a total of 15 studies remained. Of these studies, 4 (Abedi & Lord, 2001; Albus, Thurlow, Liu, & Burlinski, 2005; Castellon-Wellington, 2000; Johnson & Monroe, 2004) employed repeated measures designs in which the same group of students was tested with and without accommodations. Because the preponderance of the studies to be included employed between-groups designs and because effect sizes from repeated measures designs are not strictly comparable to those from between-groups studies, results from these studies were not included in the formal meta-analysis but were considered in our findings.

    Studies Included in Meta-Analysis

    In all, 11 studies were included in the meta-analysis with a total of 23,999 participants (17,445 native English speakers, 6,554 ELLs). Of these studies, 6 were conducted by Abedi and colleagues, whereas 5 others were conducted by other research teams (i.e., M. Anderson, Liu, Swierzbin, Thurlow, & Bielinski, 2000; Brown, 1999; Garcia Duncan et al., 2005; Hofstetter, 2003; Rivera & Stansfield, 2004). With respect to design, 8 were true experiments, in which students were randomly assigned to accommodated or unaccommodated conditions, whereas 3 (Abedi, Courtney, & Leon, 2003a; Abedi, Courtney, & Leon, 2003b; Brown, 1999) were classified as quasi-experiments because of factors specific to each study. In the study by Brown (1999), the mechanism of assignment is unclear in the report and could not be confirmed through communications with the study author or school personnel involved in the study. Observed pretest differences between the two groups were negligible. In the study by Abedi, Courtney, et al. (2003a), students were originally assigned at random to a treatment condition; however, not all students randomly assigned were actually provided the accommodation because of limited space and equipment. Similarly, in the study by Abedi, Courtney, et al. (2003b), only Spanish speakers were randomly assigned to a bilingual dictionary condition, although the control group included students with native languages other than Spanish. The findings reported below were largely robust to the inclusion or exclusion of these three studies.

    All but 1 of the 11 studies used multiple samples to investigate different accommodations and/or a single accommodation provided in multiple grades.1 Thus, together the studies yielded 38 different tests of the effectiveness of specific accommodations for ELLs as well as 30 tests of the validity of accommodations. Of the 38 tests of effectiveness, 34 involved students in fourth grade (n = 11) or eighth (n = 23) grade, whereas 4 involved students in fifth grade (n = 2) or sixth grade (n = 2). Of the 38 tests of effectiveness, 17 used a math test as the outcome measure, 20 used a science test, and 1 used a reading test. Of these effects, 29 used the National Assessment of Educational Progress (NAEP) assessment or NAEP items (n = 23) or items drawn from the NAEP and Trends in International Mathematics and Science Study assessments (n = 6). Only 9 effects were based on a state accountability assessment (8 of which came from two studies using the Delaware State Test and 1 of which came from a study using the Minnesota state test). Of the 11 studies, [...] reported that students were classified as ELLs based on school records of a limited English proficient or ELL designation, whereas ELL classification was not reported in the remaining studies. Although this suggests consistency in ELL classification across studies, it is important to note that the criteria for such school-based designations can vary considerably across states and districts (Ragan & Lesaux, 2006). Appendix A provides detailed information on the design of each study and the characteristics of the participants.

    In their review of state assessment policies regarding ELLs, Rivera and colleagues (2006) identified [...]5 accommodations that are currently made available to ELLs. Of these, they found roughly [...]7 that are considered potentially appropriate insofar as they are specially designed to address the linguistic needs of ELLs. In contrast to this breadth of accommodations offered to ELLs by states, the 11 studies and 38 tests of the effectiveness of specific accommodations focused on only seven different types of accommodation: simplified English (n = 16), English dictionary or glossary (n = 11), bilingual dictionary or glossary (n = 5), extra time (n = 2), Spanish language test (n = 2), dual language questions (n = 1), and dual language booklet (n = 1). In addition to the two effects that included extra time alone, seven estimated effects came from studies that involved extra time bundled with one of three other accommodations: simplified English (n = 2), English dictionary (n = 3), or bilingual dictionary (n = 2). One study (Abedi, Courtney, Mirocha, Leon, & Goldberg, 2005) allowed extra time to participants in both the control and treatment conditions; this study was not coded as evaluating the effect of extra time. All but two of the reported effect size estimates are based on paper and pencil tests; the remaining two used computerized assessments.

    Because technical reports were included in addition to published articles, there is little reason to believe that publication bias would have led to the inflation of effect sizes. Nonetheless, to investigate the possibility that the results of studies with nonsignificant results were more likely to go unreported than those with significant results, we plotted the standard error of Hedges's g statistic against the value of the Hedges's gu statistic for each study. Inspection of this plot revealed the funnel shape we would expect in the absence of substantial publication bias, with samples with more precise estimates yielding effect sizes closer to the mean and little evidence of a gap in which unreported nonsignificant effect sizes would occur.
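The funnel-plot inspection described above can be sketched in a few lines. This is only a minimal illustration, assuming each sample contributes an effect size and its standard error; the function names and the numbers are hypothetical and are not taken from the studies reviewed. It centers the funnel on the precision-weighted mean and flags whether each sample falls inside the pseudo 95% limits (mean plus or minus 1.96 standard errors), which is the region a symmetric funnel is expected to fill.

```python
def weighted_mean(effects, ses):
    """Inverse-variance (precision) weighted mean of the effect sizes."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * g for w, g in zip(weights, effects)) / sum(weights)


def funnel_rows(effects, ses, z=1.96):
    """Return the funnel center and, per sample, (g, se, inside).

    `inside` is True when g lies within center +/- z * se, the pseudo
    confidence limits usually drawn on a funnel plot.
    """
    center = weighted_mean(effects, ses)
    rows = []
    for g, se in zip(effects, ses):
        lower, upper = center - z * se, center + z * se
        rows.append((g, se, lower <= g <= upper))
    return center, rows


# Illustrative values only; NOT effect sizes from the studies reviewed.
effects = [0.10, 0.05, 0.30, -0.20, 0.12]
ses = [0.05, 0.08, 0.20, 0.25, 0.10]
center, rows = funnel_rows(effects, ses)
```

Plotting `se` against `g` for the rows (with the standard error axis inverted) would give the visual funnel; a cluster of imprecise samples on only one side of `center` would be the kind of gap the authors looked for.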

    Accommodations That Have Been Evaluated Empirically

    As mentioned, in the studies reviewed, seven different types of accommodations were evaluated: simplified English, English dictionaries or glossaries, bilingual dictionaries or glossaries, tests in the native language, dual language test booklets, dual language questions for English passages, and extra time. Each of these is theoretically justifiable for ELLs insofar as they are designed to address the language needs of the ELLs by minimizing variation in scores because of construct-irrelevant language abilities. With the single exception of dual language questions, the accommodations were studied exclusively with tests of math and science.

    Simplified English involves changes in the vocabulary and grammar of test items to eliminate irrelevant linguistic complexity while maintaining the same content vocabulary and level of complexity in the content task. These changes include eliminating rare vocabulary unrelated to the content, shortening or simplifying sentence structure, replacing passive voice with active voice, and replacing complex verb forms with present tense verbs (for a description, see Abedi et al., 1997). English dictionaries or glossaries involve providing definitional information in English in some form, including standard dictionaries, dictionaries customized to the assessment, or glossaries for specific words used in the assessment. Here again, the intent is to provide definitional information about words that are necessary to comprehend the task but do not represent key concepts of the content. Similarly, bilingual dictionary, glossary, or marginal glosses provide bilingual students with access to definitions or direct translations of selected noncontent words in students' native language. Another variant on this accommodation involves providing marginal glosses (explanatory notes written in the margin of the text in the students' native language).

    Three other accommodations involve the use of native language in the test itself. Native language versions of tests involve adapting tests into the native language of students. The most common method of adapting a test to another language is to use back translation; the test is translated from the original language into the native language by a biliterate test maker. This adapted test is then translated back into the original language by an independent individual, and the two original language tests are compared for equivalence. This process not only is resource intensive but also can introduce additional threats to validity because of the difficulty in maintaining equivalence in the construct measured (American Institutes of Research, 1999). Dual language assessments involve test booklets, in which English versions and native language versions of the same item are placed on facing pages. Two types of dual language tests have been investigated: dual language booklets in which all items on a math test are presented in two languages and dual language questions in which a reading passage is presented in English, followed by questions read aloud in two languages.

    Finally, one of the most frequently used accommodations for ELLs is to provide extra time to complete the test. The theoretical rationale is that ELLs will be able to demonstrate their content knowledge and skills better if given additional time to work through the language demands of the test. Often, extra time is provided in combination with another type of accommodation, in which case the rationale is to allow students the time required to use the accommodation (e.g., to use a dictionary to look up the meanings of unknown words).

    Methods for Meta-Analysis

    To evaluate the appropriateness and practical importance of test accommodations for ELLs, three sets of meta-analyses were conducted. First, a preliminary meta-analysis was conducted to compare the academic achievement test scores of ELLs in the absence of accommodations with those of native English speakers. This first analysis was undertaken in an effort to describe the magnitude of differences in test scores between ELLs and non-ELLs in the absence of accommodations. It is this set of differences that the accommodations are intended to help ameliorate, and thus it serves as a metric for judging the magnitude of the effect sizes for accommodations. The second analysis addressed the effectiveness of accommodations by estimating the degree to which each accommodation led to improved performance for ELLs. The third analysis addressed validity of the accommodations by estimating the impact of the accommodations on the performance of non-ELLs, with the assumption that a valid accommodation should have no significant effect on their performance.

    To compute average effect sizes, we treated each study sample as the unit of analysis, yielding 38 tests of effectiveness. We made this decision because effects of different accommodations that were derived from the same study were based on different samples of students. Although effect sizes derived from the same study cannot generally be considered independent, in the present case multiple effects from the same study were not generally involved in evaluating the effects of any particular accommodation. That is, studies contributed multiple effects across the set of accommodations but did not typically contribute multiple effect sizes for any single accommodation. Insofar as the net effect of this nonindependence is to reduce the standard error of the mean effect size, it will be seen that any failure of this strategy to fully address the issue of nonindependence would not alter the general conclusions from the analyses of mean effect sizes.

    To compute average effect sizes across the entire set of samples and for all samples addressing specific accommodations, we averaged across different outcomes and grades.2 In averaging the different effect sizes, we weighted the individual effect sizes according to their precision. As our measure of effect size, we first computed the mean difference in performance between ELLs receiving the accommodated test and ELLs taking the test without accommodations. (For analyses of validity, this difference was computed for non-ELLs taking the accommodated test and those taking the test without accommodations.) This difference in mean performance was then standardized using the pooled within-groups estimate of the standard deviation. This measure of effect size is the common Cohen's d, which is known to be biased in small samples. We therefore corrected this measure of effect size using a transformation recommended by Hedges (1981) to produce estimates in Hedges's gu. These estimates were computed directly from the means and standard deviations reported in the studies by using a programmed routine in the Comprehensive Meta-Analysis (Version 2) software (Borenstein, Hedges, Higgins, & Rothstein, 2005).
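The effect size computation described above can be illustrated with a short sketch. It assumes the standard textbook forms: Cohen's d from the pooled within-groups standard deviation, the approximate small-sample correction factor J = 1 - 3/(4df - 1), and inverse-variance weights for the precision-weighted mean. The exact routine inside the Comprehensive Meta-Analysis software is not given in the text, so the function names and formulas here are assumptions for illustration rather than a reproduction of that program.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, se): Cohen's d with a small-sample bias correction.

    t = accommodated group, c = unaccommodated group.
    """
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd
    # Large-sample approximation to the sampling variance of d.
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    j = 1 - 3 / (4 * df - 1)  # approximate correction factor
    return j * d, j * math.sqrt(var_d)


def precision_weighted_mean(estimates):
    """Inverse-variance weighted mean of (g, se) pairs and its standard error."""
    weights = [1 / se ** 2 for _, se in estimates]
    mean = sum(w * g for w, (g, _) in zip(weights, estimates)) / sum(weights)
    return mean, math.sqrt(1 / sum(weights))


# Hypothetical summary statistics, not values from any study reviewed.
g, se = hedges_g(mean_t=52.0, sd_t=10.0, n_t=50, mean_c=50.0, sd_c=10.0, n_c=50)
mean_g, mean_se = precision_weighted_mean([(0.2, 0.1), (0.4, 0.2)])
```

With two groups of 50 the correction shrinks d by less than 1%; it matters more for the small samples that motivate its use. The weighted mean shown is a fixed-effect average, which is simpler than the hierarchical model the authors fit for their moderator analyses.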

    In addition to estimating the mean effect size for each accommodation, we investigated whether other aspects of the accommodation treatment moderated the effect of the accommodations; these moderators included the grade level of the students, the domain tested (math, science, or reading), whether the test was based on the NAEP or a state test, and whether the accommodation was bundled with extra time or provided alone. Using PROC MIXED in SAS (SAS Institute, 1999), a two-level hierarchical linear model (HLM) was fitted, in which Level 1 equations represented the level of the effect size for each observation and Level 2 equations represented the study level, where study characteristics (including type of accommodation as well as moderating factors) that served to explain variation in effect sizes were included (Raudenbush & Bryk, 2002). We first fitted an unconditional model, in which random effects variance at Level 1 was specified to be the variance due to sampling error within sample (which was assumed known and given by the square of the standard error of the Hedges's gu statistic from each effect size estimate) and Level 2 variance was specified to be the variance in Hedges's gu statistics attributable to differences between samples. Next, we fitted a set of conditional models in which dummy variables for the type of accommodation and other potential moderator variables were included at Level 2 to determine if they explained variation in the effect sizes between samples. To determine if a given variable explained statistically significant variation in effect sizes, we examined the change in goodness of fit between models using the change in -2 log likelihood statistic (Δ-2LL) and conducted a significance test by comparing this statistic to a chi-square distribution with 1 degree of freedom. In addition to investigating moderator effects because of type of accommodation, analyses were conducted to determine if the effects for specific accommodations differed as a function of a characteristic of the studies themselves (e.g., whether the study employed an experimental or quasi-experimental design, the grade level of the students, content domain measured).
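The model comparison step can be sketched as follows. This is a minimal illustration that assumes each moderator adds a single parameter, so the change in -2 log likelihood is referred to a chi-square distribution with 1 degree of freedom; the function name and the deviance values are hypothetical and do not come from the authors' SAS output. For 1 degree of freedom the upper-tail probability has the closed form erfc(sqrt(x / 2)), so no statistics library is needed.

```python
import math


def lrt_p_value_1df(neg2ll_reduced, neg2ll_full):
    """Return (delta, p) for the change in -2 log likelihood.

    neg2ll_reduced: -2LL of the model without the moderator.
    neg2ll_full:    -2LL of the model with the moderator added.
    """
    delta = neg2ll_reduced - neg2ll_full
    if delta < 0:
        raise ValueError("the fuller model should not fit worse")
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return delta, math.erfc(math.sqrt(delta / 2))


# Hypothetical deviances chosen so that delta sits at the .05 critical value.
delta, p = lrt_p_value_1df(neg2ll_reduced=103.841459, neg2ll_full=100.0)
```

A moderator coded with several dummy variables would add more than one parameter, and the reference distribution would then need the corresponding degrees of freedom rather than the 1 assumed here.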

    Results

    Preliminary Analyses: Differences in Achievement Test Scores Between ELLs and Native English Speakers

    Before addressing the question of effectiveness of accommodations, we estimated the average difference in academic achievement test scores between ELLs and native English speakers that can be expected in large-scale assessments. These estimates provide a context for evaluating the practical importance of the effects of accommodations. Table 1 presents several estimates of the math and science achievement gaps between ELLs and native English speakers. The top half of Table 1 presents mean effect sizes (reported as Hedges's gu statistics) for the differences in math and science achievement scores between ELLs and native English speakers in the unaccommodated conditions from the studies reviewed. These estimates suggest that there are large achievement score differences between the two groups across these grades and domains of knowledge, with mean effect sizes ranging from six tenths to three fourths of a standard deviation. They also suggest that the achievement gap differs by test domain to some extent, with larger gaps present in science than in math.

    TABLE 1
    Estimates of the achievement score differences between English language learners and native English speakers in math and science from studies reviewed (as Hedges's gu) and from the 2005 National Assessment of Educational Progress (as Cohen's d; National Center for Education Statistics, 2005)

                                                          95% confidence interval
                           Number of    Mean effect
                           studies      size              Lower limit   Upper limit
    By domain^a
      Math                 7            0.604             0.279         0.929
      Science              11           0.748             0.581         0.914
    2005 National Assessment of Educational Progress
      4th grade math                    0.831             0.799         0.864
      8th grade math                    1.006             0.964         1.047
      4th grade science                 1.051             1.008         1.094
      8th grade science                 1.227             1.177         1.277

    a. The achievement score difference in reading was not estimated because only a single study examined this domain.

    Although these differences between ELLs and non-ELLs are quite substantial, they are somewhat small in comparison to estimates of the achievement gap from national studies. For example, as another point of reference, the bottom half of Table 1 presents estimates of the achievement difference between native English speakers and ELLs from the 2005 NAEP.3 These estimates are expressed appropriately as Cohen's d because of the large sample on which the estimates are based. These estimates are appreciably larger than those from the studies reviewed, with three of the four differences greater than one standard deviation. As with the studies reviewed, the gap was larger for science than for math and for eighth grade students compared to fourth grade students. The difference in magnitude between the NAEP estimates and those from the studies reviewed may be because of the confounding of concomitant predictors of achievement, such as poverty, in the national samples, which likely are better controlled by the design of the research studies of accommodations. All of the studies reviewed sampled ELL and native English-speaking students from within the same schools and/or districts, whereas the NAEP estimates are based on a nationally representative sample. The NAEP estimates may thus capture more of the variation due to differences between the schools attended by ELLs and those attended by native English speakers as well as those concomitant demographic characteristics that tend to affect achievement of at-risk populations in national samples but whose effects are masked when results are disaggregated on only a single dimension. Nevertheless, both sets of estimates indicate that there are large observed differences in achievement in both math and science between ELLs and native English speakers on large-scale assessments, suggesting that one metric by which we can judge the effectiveness of accommodations is the extent to which they reduce these apparent achievement gaps.
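The gap-reduction metric suggested above amounts to a simple ratio, sketched here under stated assumptions. The two gap values are the mean effect sizes for math and science from the top half of Table 1; the accommodation effect of 0.15 is a made-up number used only to show the arithmetic, and the function name is hypothetical. Both quantities are treated as being on the same standardized scale.

```python
def share_of_gap_closed(accommodation_effect, achievement_gap):
    """Fraction of the ELL/non-ELL gap offset by an accommodation effect.

    Both arguments are in standard deviation units.
    """
    if achievement_gap <= 0:
        raise ValueError("achievement_gap must be positive")
    return accommodation_effect / achievement_gap


# Gaps from the top half of Table 1 (studies reviewed).
gaps = {"math": 0.604, "science": 0.748}
# Hypothetical accommodation effect; NOT a result reported in the text.
hypothetical_effect = 0.15
shares = {
    domain: share_of_gap_closed(hypothetical_effect, gap)
    for domain, gap in gaps.items()
}
```

Under these illustrative numbers the same effect would close a larger share of the math gap than of the science gap, simply because the science gap is wider.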
