1
Learning, Assessment, and Accountability: Priorities for Educational Reform
Eva L. Baker
UCLA Graduate School of Education & Information Studies
National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
British Columbia Ministry of Education Conference
Victoria, BC
March 2004
2
Today’s Topics
YOU can make a difference in educational reform
Coping strategies for lurching change
Review key aspects of current policy reform
Improving assessment, learning, and transfer
New paths from CRESST work
Principles and criteria for improvement
3
Intellectual Goals for Reform Mastery
Understand deep goals of reform and how they can be achieved
Managing assessment knowledge: research-based procedures for use
Development of social capital in education
4
Policy Context
Assumption: Schools are not meeting goals
Need for new instruments and mechanisms
Trade process for accountability, individual scores
Now specify process, accept most outcomes
Devolve responsibility to states and LEAs
Innovations—charters, private managers, vouchers
Retain federal authority—Adequate Yearly Progress
5
15-Year History of U.S. Attempts
1989 NCTM Math Standards, NCEST
America 2000, Goals 2000, Improving America’s Schools Act, VNT, NCLB
Policies have remained relatively consistent—political sides change
NCLB expanded national policies
Bipartisan support
Early in implementation with known problems
Lessons learned?
6
No Child Left Behind
Builds on standards and assessments of IASA
Annual testing in Grades 3-8 plus high school
Growth targets set by states so that all children reach “proficiency level” in 12 years
95% participation required
Disaggregated groups reporting
Tests and proficiency definition left to states
Options if school “fails”
High failure rates
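To see why the growth targets and the participation rule bite, a rough sketch helps. It assumes a straight-line path from a hypothetical baseline to 100% proficiency over 12 years; states set their own trajectories, so the linear form, the 40% baseline, and the function names are illustrative assumptions only.

```python
def linear_targets(baseline_pct, years=12, goal_pct=100.0):
    """Annual percent-proficient targets on a straight line to the goal."""
    step = (goal_pct - baseline_pct) / years
    return [baseline_pct + step * y for y in range(1, years + 1)]


def meets_participation(tested, enrolled, required=0.95):
    """True if the share of enrolled students tested meets the threshold."""
    return enrolled > 0 and tested / enrolled >= required


# Hypothetical school starting at 40% proficient.
targets = linear_targets(40.0)
print([round(t, 1) for t in targets])

# Hypothetical subgroup: 188 of 200 students tested (94%).
print(meets_participation(tested=188, enrolled=200))
```

Under these assumptions the school must gain 5 percentage points every year for 12 years, and a subgroup that tests 94% of its students misses the participation requirement regardless of its scores.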
7
Policy Limits
Accountability crux
Consequences based on outcomes (AYP targets)
Approaches imported from centrally controlled systems; agreement on outcomes—training
Contrasts to schoolhouse traditions—local control
Weak curriculum/instructional “alignment”
Tests may not be sensitive or conceptually connected to instruction and learning
Recruiting and maintaining quality staff
8
Early Policy Consequences
External, varying standards and tests from States
Unrealistic targets (AYP) plus
Short timeline to serious sanctions
Ergo: Raised scores only evidence of learning
Neither likely to measure “high standards” nor to create assessment results that respond to quality instruction
Growing enthusiasm for use of classroom assessment for accountability
Test prep increases
Popularity of reform unstable
9
Gaps in Practice for the “Theory of Action” of Accountability
Problem 1: Alignment is Asserted
At best, links tests with some standards
No complete standards-instruction-test-results loop
No common technical approach to document
Wrong metaphor (geometric congruence)
Goals aligned with instruction and tests measure goals. Feedback on results improves the instruction and learning
10
Description: Extra comfort for senior dogs. Our popular orthopedic pet bed, made extra thick for aging dogs. A full 4" of medical grade convoluted foam supports bones and joints, and the elevated headrest provides proper neck and spine alignment.
http://www.petdiscounters.com/dog/beds/cu_orthopedic.html
12
http://www.fly-ford.com/StepByStep-Front-Series.html
Always check Alignment readings before and after work is performed.
13
How to Monitor and Improve Alignment
Count items for each standard
Understand weighting of results
Analyze, review, and share lessons that exhibit standards and promote transfer
Examples using teacher assignments
Support collaboration
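The first two bullets on this slide can be made concrete with a small tally. The sketch below is only an illustration: the item-to-standard mapping, the point values, and the standard codes are hypothetical, and real alignment studies use richer procedures.

```python
from collections import defaultdict

# Hypothetical test blueprint: (item id, standard code, points).
ITEMS = [
    ("q1", "NUM.1", 1),
    ("q2", "NUM.1", 1),
    ("q3", "ALG.2", 1),
    ("q4", "ALG.2", 4),  # one constructed-response item worth 4 points
    ("q5", "GEO.3", 1),
]


def coverage_by_standard(items):
    """Return {standard: (item count, share of total points)}."""
    counts = defaultdict(int)
    points = defaultdict(int)
    for _item_id, standard, pts in items:
        counts[standard] += 1
        points[standard] += pts
    total = sum(points.values())
    return {s: (counts[s], points[s] / total) for s in counts}


coverage = coverage_by_standard(ITEMS)
for standard, (n_items, weight) in sorted(coverage.items()):
    print(f"{standard}: {n_items} item(s), {weight:.1%} of total score")
```

In this made-up blueprint, NUM.1 and ALG.2 each have two items, yet ALG.2 carries 62.5% of the score and NUM.1 only 25%, which is why item counts alone can misstate how results are weighted.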
14
Gaps in Practice for “Theory of Action” in Assessment and Testing
Problem 2: Assessment Design and Reporting
Multiple purposes, uses, and audiences
Limited designs and types
Unresolved quality issues in traditional testing
A better path
15
Assessment Purposes
Needs sensing
System monitoring
Accountability
Program evaluation
Improvement
Achievement
Certification
Progress
Diagnosis
Selection
Placement
Comparisons
16
Review of Achievement Testing Traditions
Any new approach is compared to the extant commercial standard
Familiar, inexpensive, “trustworthy,” independent of particular learning and teaching, correlated, national norms
One purpose—one test framework
Too many tests with no clear evidence related to accreted purpose(s)
Optimize measurement efficiency
17
What Should a Coherent Assessment System Do?
Detect differences in instruction
Partially guide educational improvement
Impact positively on instructional practice
Reflect current views of learning and sustained performance
Support fairness
18
What Should a Coherent Assessment System Do? (Cont’d)
Promote transfer of learning to new applications
Represent the real range of cognitive task demands
Exhibit technical quality for intended purpose(s)
Support enthusiasm for teaching and learning
19
How to Support Deep Learning? Families of Cognitive Demands
Scientifically based components of school learning
Based on syntheses and targeted research
Map assessment demands to learning processes and products first rather than to psychometrics
Re-emphasize focused thinking, self-management, and transfer of learning skills
20
Intellectual Capital Cognitive Families
Content Understanding
Problem Solving
Teamwork and Collaboration
Metacognition
Learning to Learn
Communication
Learning
21
From Science to Models to Templates
[Diagram labels: Scientific Findings; Cognitive Demands; Domain-Independent Principles; Subject Matter Specific Models; Model; Template (x3); Content (x3); CBA]
22
From Templates to Tasks
[Diagram labels: CBA; Template (x3); Task (repeated many times)]
23
Domain-Independent Definition: Content Understanding
Domain-independent set of principles:
Understanding is based on the demonstrated relationships among principled declarative and procedural knowledge
Ability to express critical relationships
The quality of the relationships is judged from an expert knowledge perspective
24
Domain-Independent Definition: Problem Solving
Depends upon finding the problem (if masked)
Using knowledge to identify critical barriers and ways around them
Selecting procedures to follow, recognizing impasses, and adjusting plan
Has knowledge, metacognitive, motivational, analytic, and feedback components
25
[Knowledge map (ontology): Newton's Laws]
First Law is: A body in motion remains in motion unless...
Second Law is: Force equals Mass times Acceleration (F = MA)
Third Law is: Forces between interacting bodies are equal but opposite
26
Research-Based Model: Deep Understanding of Content
(Domain Independent)
Principles or themes (big ideas)
Key prior knowledge
Explicit relationships
Avoid misconceptions
Expert performance-based scoring
27
Template Ingredients (Specifications)
Task(s)
Format(s)
Prompt(s) and requirements
Scoring
Directions
Sample
28
Common Attributes of Template for Deep Understanding of Content
Present primary source materials in each domain
Student required to integrate prior knowledge and principles
Scored against expert performance by subject matter experts
29
Three Templates for the Model of Deep Understanding of Content
1. Explanation
2. Explanation with explicit knowledge
3. Graphical representation of relationships
30
Content Understanding Template #1: Explanation
An array of primary source materials
A prompt that asks for an explanation in context
Constructed (written) answer
Evaluated by means of a scoring rubric that embodies key elements of learning model
31
Content Knowledge Prompt: Hawaiian History Writing Assignment (Bayonet Constitution)
Imagine you are in a class that has been studying Hawaiian history. One of your friends, who is a new student in the class, has missed all the classes. Recently, your class began studying the Bayonet Constitution. Your friend is very interested in this topic and asks you to explain everything that you have learned about it.
Write an essay explaining the most important ideas you want your friend to understand. Include what you have already learned in class about Hawaiian history, and what you have learned from the texts you have just read. While you write, think about what Thurston and Liliuokalani said about the Bayonet Constitution, and what is shown in the other materials.
Your essay should be based on two major sources:
1. The general concepts and specific facts you know about Hawaiian history, and especially what you know about the period of the Bayonet Constitution.
2. What you have learned from the readings yesterday.
Be sure to show the relationships among your ideas and facts.
32
Excerpts from Hawaiian History Primary Source Documents
LILIUOKALANI
For many years our sovereigns had welcomed the advice of American residents who had established industries on the Islands. As they became wealthy, their greed and their love of power increased. Although settled among us, and drawing their wealth from resources, they were alien to us in their customs and ideas, and desired above all things to secure their own personal benefit.
Kalakaua valued the commercial and industrial prosperity of his kingdom highly. He sought honestly to secure it for every class of people, alien or native. Kalakaua’s highest desire was to be a true sovereign, the chief servant of a happy, prosperous, and progressive people.
And now, without any provocation on the part of the king, having matured their plans in secret, the men of foreign birth rose one day en masse, called a public meeting, and forced the king to sign a constitution of their own preparation, a document which deprived [him] of all power and practically took away the franchise from the Hawaiian race.
33
Content Knowledge Prompt (Cont’d)
It may be asked, “Why did the king give them his signature?” I answer without hesitation, because he had discovered traitors among his most trusted friends and because the conspirators were ripe for revolution, and had taken measures to have him assassinated if he refused.
It has been known ever since that day as “The Bayonet Constitution,” and the name is well-chosen; for the cruel treatment received by the king from the military companies. [text continues]
*From Hawaii’s Story by Hawaii’s Queen, Liliuokalani (Boston: Lee and Shepard Publishers, 1898).
Explain to your friend who missed class the reasons and differences for the Queen and the Senator’s approach to Hawaii’s future.
Scoring Rubric
General impression (on task)
Principles and themes
Prior knowledge
Relevant concrete examples
Avoidance of misconceptions
34
Template #2: Prior Knowledge and Explanation
Explicit measurement of knowledge domain in the explanation
Adds short-answer or selected response
Helps interpret explanation performance
36
Consider the two systems shown above, a balloon and a rocket being launched into space.
Using what you know about physics and the applicable laws, write an essay explaining and comparing the forces present in each system. In your essay discuss all the major similarities and differences between the two systems.
Also address the following questions in your essay:
In which direction will the balloon travel once it is released? How is this similar to the rocket system? How is it different? Explain your answers using what you know about forces.
If you placed both of these systems in space, would you expect the movement of the rocket and the balloon to change compared to their movement through air? Would anything else change?
Explain why a rocket starts off moving slowly and gets faster and faster as it climbs into space. Does the same thing happen to the balloon? Why or why not?
38
Template #3: Knowledge Representation
Same prompts
Key aspects of ideas, supporting facts and views, and their relationships
Relationship is explicit
Organizational options
Core and peripheral
Hierarchical
Cause-and-effect
Chronological
Expert scoring
44
Measuring Transfer
Vary
Content complexity
Number of task elements to address, including distracters or irrelevant content
Graphical support or distraction
Need to prioritize requirements
Linguistic demands
45
Measuring Transfer (Cont’d)
Response types
Constructed response modes
Length
Response support/prompts
Degree of stringency in criteria
46
Evidence for Model-Based Assessment (MBA)
Across age ranges (preschool to adult)
Reliable scores
Teachable
Impact long-range outcomes (HS exit exam)
Automated scoring using a subset of common elements (DI) across topics
Cost low, quality maintained
Reusable elements
47
CRESST Validation Studies
Score reliability
Task and rater generalizability
Stability of student performance over time
Relationships among measures
Instructional sensitivity
Opportunity to Learn (OTL)
Effect of school composition on performance
Cut-score modeling
48
CRESST Validity Criteria for Tests and Assessments at Any Level
Fairness
Cognitive complexity
Content domain
Instructionally sensitive
Transfer and generalization
Learning-focused
Validity evidence reported for each use
Trustworthy
Credible
49
Criteria for Judging Utility of Any Assessment Design
Promote learning of the curriculum
Support cognitive complexity and content richness
Avoid unnecessary language complexity
Support transfer
Reusable components, i.e., templates or objects to save renewal cost
Economical (future, on-the-fly, open-ended scoring)
Engage teachers in challenging instruction
Fair and public
50
Criteria for Useful Assessments in Classrooms
Validity—detects differences in instruction
Samples the domains claimed to be measured
Provides information about where to focus attention rather than success/lack of success
Integrates cognitive skills and content
Includes transfer for situations and response types
Economical, transparent, and usable
Develops rather than constrains teacher growth
51
Why Are Some Schools Successful in Using Assessment
Knowledge?
Focus on learning (students and adults)
Constant use of appropriate information (formal and informal)
Focus on feedback and change
Public display and exchange
Community pride in outcomes of students and place
Knowledge managers
52
Context for Success of Knowledge-Based Reform
Local ownership of knowledge
Infrastructure and stability
Capacity to investigate
Learning by all
Congruence or peace with external mandates
QSP
54
Quality School Portfolio
Individual student longitudinal records
Standards-based
Multi-purpose
District, school, classroom, parent
Disaggregation
Local goals and questions
Evaluation
Easy-to-read reports
Free, Web-based, Fall 2002
1000 schools, 80 districts
57
Summary of Accountability Knowledge Requirements
Knowing why
Knowing what to assess: content plus cognitive demands (problem solving, communication, learning to learn, teamwork, content knowledge)
Knowing how: transfer (application to other topics and situations)
Supporting social capital development
58
Social Capital in Knowledge Management
Trust
Efficacy
Networks
Effort
Transparency
Learning Organization
Teamwork Skills
59
Knowledge Management: Assessment
Usable Knowledge
In a form that can be understood
In a form that can be applied
Timed appropriately
May cause rethinking of the problem
Useful Knowledge
Rethinking indicates a new solution path
Adapted to situation
Sufficient to guide solution
Improved outcomes occur as a result
62
Continuing R&D Areas
New contexts
Trade-offs (limited number of templates vs. wide range of formats)
Performance over time
Scalability in the long run
Authoring systems to support teacher-developed assessments linked to large-scale assessment
63
Brief History of MBA in LAUSD
Content understanding and problem-solving models
Explanation templates
4 subjects, 3 grade levels, 2 languages
Purposes: (1) to clarify expectations; (2) to provide instructionally embedded assessment; (3) to get a measure of school performance
CRESST-managed teacher involvement
64
LAUSD Process
Teacher design teams
LAUSD standards first
Adapted to success standards
Training cadre of scorers
Training trainers
Supervising scoring
65
Present LA Situation
Administered in 2, 3, 4, 5, 6, 7, 8, 9
Purpose added regarding promotion
Teacher scored with an audit reported to school
Local subdistricts managing activity
Ongoing validity studies
District review of alternative assessments
66
LAUSD Grade 7 Student Achievement Levels: Comparison of 2002 California Standards Test and Performance Assignment Scores
Evidence of Predictive Validity
[Bar chart: % of Students in Different Categories of Performance in CA Standards Test, by 2001 Performance Assignment Score]
2001 Performance Assignment Score    Below Basic    Basic    Above Basic
Not Proficient                       73.7%          21.4%    4.9%
Partially Proficient                 49.1%          36.3%    14.5%
Proficient                           25.1%          41.2%    33.8%
Advanced                             9.3%           31.7%    59.0%
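The percentages on this slide can be checked with a few lines of arithmetic. The sketch below assumes the three runs of four values correspond, in order, to the Below Basic, Basic, and Above Basic series across the four performance assignment levels; that reading is inferred from the fact that each column then sums to about 100%, and is not stated on the slide.

```python
LEVELS = ["Not Proficient", "Partially Proficient", "Proficient", "Advanced"]

# Percent of students in each CA Standards Test band, by 2001
# performance assignment level (the series assignment is an assumption).
CST_BANDS = {
    "Below Basic": [73.7, 49.1, 25.1, 9.3],
    "Basic": [21.4, 36.3, 41.2, 31.7],
    "Above Basic": [4.9, 14.5, 33.8, 59.0],
}

# Each performance assignment level should account for roughly 100% of its students.
column_sums = [
    round(sum(band[i] for band in CST_BANDS.values()), 1)
    for i in range(len(LEVELS))
]

# Share of students at Basic or above on the CA Standards Test, per level.
basic_or_above = [
    round(CST_BANDS["Basic"][i] + CST_BANDS["Above Basic"][i], 1)
    for i in range(len(LEVELS))
]

for level, total, share in zip(LEVELS, column_sums, basic_or_above):
    print(f"{level}: column sum {total}%, Basic or above {share}%")
```

Under that reading, the share of students at Basic or above rises at every step from Not Proficient to Advanced, which is the pattern the slide presents as evidence of predictive validity.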