Revision of the Standards for Educational and Psychological Testing: Overview
Society for Industrial and Organizational Psychology
25th Annual Conference, Atlanta, Georgia
April 9, 2010
Jo-Ida Hansen
University of Minnesota
Revising our Test Standards 2April 9, 2010
Presentation: Four Substantive Areas
Access – Nancy Tippins
Accountability – Laurie Wise
Technology – Fritz Drasgow
Workplace – Paul Sackett
Joint Committee Members
Barbara Plake, Co-Chair
Lauress Wise, HumRRO, Co-Chair
Linda Cook, ETS
Fritz Drasgow, University of Illinois
Brian Gong, NCIEA
Laura Hamilton, Rand Corporation
Jo-Ida Hansen, University of MN
Joan Herman, UCLA
Michael Kane, Bar Examiners
Michael Kolen, University of Iowa
Antonio Puente, UNC-Wilmington
Paul Sackett, University of MN
Nancy Tippins, Valtera Corporation
Walter (Denny) Way, Pearson
Frank Worrell, Univ of CA-Berkeley
Scope of Revision
Based on comments each organization received in response to its invitation to comment
Summarized by the Management Committee in consultation with the Co-Chairs
  Wayne Camara, Chair, APA
  Suzanne Lane, AERA
  David Frisbie, NCME
Four Substantive Areas for Revisions
Technology
Accountability
Workplace
Access
Plus attention to format issues
Theme Teams
Working teams
Cross-team collaborations
Chapter Leaders
Focusing on bringing content related to themes into chapters in coherent and meaningful ways
Timeline
First meeting January 2009
Projected 4 meetings per year
Three-year process for completing text of revision
Open comment/Organization reviews
  Projected for December 2010 – April 2011
Projected publication Summer 2012
Revising our Test Standards: Access for All Examinee Populations
Nancy Tippins
Valtera
Overview
Standards related to Access appear throughout many of the chapters but are concentrated in
  Chapter 9: Testing Individuals of Diverse Linguistic Backgrounds
  Chapter 10: Testing Individuals with Disabilities
Comments on Access were received by the Management Committee and summarized for the committee charge
Elements of the Charge
Accommodations/modifications
  Impact/differentiation of accommodation and modification
  Appropriateness for English language learners and examinees with disabilities
  Appropriateness for variety of groups, e.g., pre-K, older populations
Flagging
Comparability/validity
Adequacy and comparability of translations
Universal Design
Key Access Issues Included in our Charge - 1
Impact/differentiation of accommodations/modifications
What are the appropriate ways to determine or establish the impact of accommodations/modifications on inferences, interpretations, and uses of scores?
How do you differentiate clearly between what is an accommodation and what is a modification?
Key Access Issues Included in our Charge - 2
Appropriate ways to accommodate English-language learners and examinees with disabilities
Selecting the appropriate accommodation for the individual
  Who should select the accommodation?
  What evidence should the selection be based on?
Administering the appropriate accommodation
  What evidence is available to determine impact on test scores, given purpose of the test?
  How effective is the accommodation?
Providing alternative assessments/modified achievement standards
Key Access Issues Included in our Charge - 3
Appropriate ways to accommodate a wider variety of groups
Pre-K
Older populations
  Number of older adults with cognitive impairments is rising
  Testing is often used to determine mental status changes
  There are many complexities associated with testing this population
    Combined effects of medical problems, medication side effects, multiple sensory deficits, testing environment
Key Access Issues Included in our Charge - 4
Flagging
  Current treatment needs to be updated to reflect changes in practice since the 1999 Standards
  Most testing organizations no longer flag
  Decisions about flagging should be based on empirical evidence
Key Access Issues Included in our Charge - 5
Comparability and validity of inferences made based on scores from accommodated or modified tests
Foundational issues such as comparability and validity need to be addressed in foundational chapters
If sample sizes do not support analyses such as DIF, other evidence of validity should be pursued
Key Access Issues Included in our Charge - 6
Adequacy and comparability of translations (language to language and language to symbol, e.g., Braille)
Evidence is needed to demonstrate adequacy of translation and comparability of scores from translated tests
Fluency, rather than primary language, should be used to describe target population for a test
Quality of translation/adaptation needs to be emphasized
Interaction of language proficiency and construct needs to be considered
Key Access Issues Included in our Charge - 7
Universal Design
  The 1999 Standards focus too much on accommodations and modifications and not enough on building accessibility features into the design and development process
Revising our Test Standards: Issues for Accountability
Laurie Wise
HumRRO
Overview
There has been a dramatic expansion of the use of tests for various forms of accountability and other uses related to educational policy-setting.
The Joint Committee has been charged with considering how these uses in accountability should impact revisions to the Standards
As with the other themes, comments on the standards that related to accountability were compiled by the Management Committee and summarized in their charge to the Joint Committee
Overview
Standards related to accountability currently are especially relevant to Chapter 13 (Educational Testing and Assessment) and Chapter 15 (Testing in Program Evaluation and Public Policy)
Examples of emerging issues associated with use of tests for accountability:
  Test results have important consequences for third parties such as school administrators and teachers, although not always for the examinees themselves.
  Federal peer review procedures have required assurances of reliability and validity that often go beyond requirements of the current Standards.
Key Accountability Topics Included in our Charge
Validity and reliability requirements
Issues with scores, scaling, and equating
Policy and practice
Formative and interim assessments
1. Validity, Reliability and Reporting Issues for Accountability
• Use of a single test as the sole source of high stakes decisions (e.g., graduation, promotion), including whether scores from retesting or repeat testing are sufficient to count as more than one score for such decisions.
• How test alignment studies should be documented and used to demonstrate the validity of score interpretations regarding mastery of required content standards.
1. Validity, Reliability, and Reporting Issues - continued
Provide additional guidance on score accuracy, especially when used to classify individuals or groups into performance regions or other bands on a score scale.
Validity and reliability requirements for reporting individual or aggregate performance on subscales (skills or diagnostics).
Incorporating error estimates and interpretive guidance in score reports, including subscores and diagnostic reports for individuals and groups.
2. Issues with Scores, Scaling, and Equating
• Growth modeling, gain scores, and other methods of estimating the value added by teachers and schools.
• Issues or requirements when linking different assessments (e.g., concordances, linkages and equating)
3. Policy and Practice
How to balance privacy concerns for individual examinees, teachers, and administrators against the information needs of policy-makers.
Issues related to the appropriate role of practice and test preparation, especially in contrast to admissions testing or credentialing.
4. Addressing formative and interim assessments
Schools are increasingly developing or purchasing interim or formative assessments to identify study problems well before the end-of-year summative assessments
Some issues:
  Appropriate uses of such tests
  Validity evidence required for interpreting scores
    As mastery
    As predictions
Revising our Test Standards: Technological Advances
Fritz Drasgow
University of Illinois
Overview
Technological advances are changing the way tests are delivered, scored, interpreted and in some cases, the nature of the tests themselves
The Joint Committee was charged with considering the implications of technological advances for the Standards
As with the other themes, comments on the standards that related to technology were compiled by the Management Committee and summarized in their charge to the Joint Committee
Key Technology Issues Included in our Charge
Reliability & validity of innovative item formats
Validity issues associated with the use of:
  Automated scoring algorithms
  Automated score reports and interpretations
Security issues for tests delivered over the internet
Issues with web-accessible data, including data warehousing
Resources for Consideration
Guidelines for Computer-Based Testing, Copyright 2002 Association of Test Publishers (ATP)
International Guidelines on Computer-Based and Internet Delivered Testing, Copyright 2005 International Test Commission (ITC)
Reliability & Validity of Innovative Item Formats
What special issues exist for innovative items with respect to access for various groups? How might the standards reflect these issues?
What steps should the standards suggest with regard to “usability” of possibly unfamiliar innovative items?
Automated Scoring Algorithms
What level of documentation/disclosure is appropriate and tolerable for proprietary (i.e., secret) automated scoring algorithms?
What sorts of evidence seem most important for demonstrating the validity and “reliability” of automated scoring systems?
Automated Score Reports and Interpretation
Use of computer for score interpretations
“Actionable” reports (e.g., routing students and teachers to instructional materials and lesson plans based on test results)
Security issues for tests delivered over the internet
Issues include:
  Protecting examinee privacy
  Threats to validity due to breach of security
  Are the reported scores correct?
Considerations likely to affect standards related to test administration and responsibilities of test users
Web-Accessible Data, including Data Warehousing
Applicability of general technology standards?
  Security
  IT standards similar to ISO
Revision to commentary vs. drafting additional standards
Revising our Test Standards: Issues for Work-Place Testing
Paul Sackett
University of Minnesota
Overview
Standards for testing in the work place are currently covered in Chapter 14 (one of the testing application chapters)
Work-place testing includes employment testing as well as licensure, certification, and promotion testing.
Comments on standards related to work place testing were received by the Management Committee and summarized in their charge to the Joint Committee.
Key Work-Place Testing Issues Included in our Charge
1. Validity and reliability requirements for certification and licensure tests.
2. Issues when tests are administered only to small populations of job incumbents.
3. Requirements for tests for new, innovative job positions that do not have incumbents or job history to provide validity evidence.
4. Assuring access to licensure and certification tests for examinees with disabilities that may limit participation in regular testing sessions.
5. Differential requirements for certification and licensure and employment tests.
1. Validity and Reliability Requirements
Some specific issues:
  Documenting and communicating the validity and reliability of pass-fail decisions in addition to the underlying scores
  How cut-offs are determined
  How validity and reliability information is communicated to relevant stakeholders
2. Issues with Small Examinee Populations
Including:
  Alternatives to statistical tools for item screening
    Assuring fairness
    Assuring technical accuracy
  Alternatives to empirical validity evidence
  Maintaining comparability of scores from different test forms
3. Requirements for New Jobs
Issues include:
  Identifying test content
  Establishing passing scores
  Assessing reliability
  Demonstrating validity
4. Assuring Access to Employment Testing
See also separate presentation on fairness
Issues include:
  Determining appropriate versus inappropriate accommodations
  Relating testing accommodations to accommodations available in the work place
5. Certification and Licensure versus Employment Testing
Currently, two sections in the same chapter
Examples of relevant issues:
  Differences in how test content is identified
  Differences in validation strategies
  Differences in test score use
  Who oversees testing: private company versus professional board/organization