ABCs of Automated Essay Scoring - the Conference Exchange
Fundamentals of Automated Essay Scoring
Mark D. Shermis, Ph.D.
Professor and Dean,
College of Education
The University of Akron
What is Automated Essay Scoring?
• Software technology that automatically grades written English. Graders in other languages have already been developed.
• Has been applied successfully to short essays (high- and low-stakes tests) and longer documents.
• Presently a web-based performance assessment.
• Provides both holistic and trait scores.
• Can provide discourse analysis.
CSSO Conference
AES: How Does It Work?
• Most grading engines use rater behavior as the criterion for their predictions.
• The computer doesn’t “understand” what is written, but can be programmed to evaluate keywords and synonyms.
• It is possible to write a nonsensical essay that gets a good score, but you have to be a good writer to accomplish this.
• Can evaluate both content and writing ability.
Parsers Invested Heavily in Content
• Intelligent Essay Assessor™ (Pearson Knowledge Technologies)
• e-rater® (Educational Testing Service)
• IntelliMetric™ (Vantage Learning)
Content is Slippery, However
• Christopher Columbus – Queen America sailed to Santa Maria with 1492 ships. Her husband, King Columbus, looked to the Indian explorer, Nina Pinta, to find vast wealth on the beaches of Isabella, but would settle for spices from the continent of Ferdinand.
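The Columbus passage above shows why keyword matching alone is a weak measure of content. As a rough illustration only (not any vendor's actual method), the sketch below scores an essay by the fraction of expected keywords it contains. The function name `keyword_score`, the keyword list, and the "accurate" comparison essay are all assumptions made up for this example.

```python
import re


def keyword_score(essay, keywords):
    """Return the fraction of expected keywords found in the essay.

    Word order and meaning are ignored, which is exactly the weakness
    being illustrated.
    """
    words = set(re.findall(r"[a-z0-9]+", essay.lower()))
    hits = [k for k in keywords if k in words]
    return len(hits) / len(keywords)


# Hypothetical keyword list for a Columbus prompt.
KEYWORDS = ["columbus", "isabella", "ferdinand", "1492",
            "nina", "pinta", "santa", "maria", "spices"]

# Hypothetical accurate response, written for this example.
accurate = ("In 1492 Columbus sailed with the Nina, the Pinta, and the "
            "Santa Maria. Queen Isabella and King Ferdinand hoped he "
            "would bring back spices.")

# The scrambled passage from the slide.
scrambled = ("Queen America sailed to Santa Maria with 1492 ships. "
             "Her husband, King Columbus, looked to the Indian explorer, "
             "Nina Pinta, to find vast wealth on the beaches of Isabella, "
             "but would settle for spices from the continent of Ferdinand.")

print(keyword_score(accurate, KEYWORDS))
print(keyword_score(scrambled, KEYWORDS))
```

Both essays contain every keyword, so a scorer of this kind cannot tell them apart.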
Tape Measure Analogy
If you ask a person how to measure length…
Reliability
• Most studies show exact agreement in the 80% range and adjacent agreement in the 90% range for the three major vendors.
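For readers unfamiliar with the two statistics, here is a minimal sketch of how they are commonly computed. It assumes that "adjacent agreement" counts scores within one point of each other (exact matches included), which is a frequent convention but not stated on the slide. The function name and the score lists are invented for illustration.

```python
def agreement_rates(human, machine):
    """Return (exact, adjacent) agreement as proportions.

    exact:    both scores are identical
    adjacent: scores differ by at most one point (assumed convention)
    """
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and the same length")
    n = len(human)
    exact = sum(h == m for h, m in zip(human, machine)) / n
    adjacent = sum(abs(h - m) <= 1 for h, m in zip(human, machine)) / n
    return exact, adjacent


# Hypothetical scores on a 1-6 scale for ten essays.
human_scores = [4, 3, 5, 2, 4, 6, 3, 4, 1, 5]
machine_scores = [4, 3, 4, 2, 4, 6, 3, 5, 3, 5]

exact, adjacent = agreement_rates(human_scores, machine_scores)
print(f"exact: {exact:.0%}, adjacent: {adjacent:.0%}")  # exact: 70%, adjacent: 90%
```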
Validity
• Validity demonstrated through true score analysis, correlations with other (objective) tests, and prediction studies (Keith, 2003).
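One of the validity checks named above, correlation with an objective test, can be sketched as a plain Pearson correlation. This is only an illustration of the statistic, not a reproduction of the cited analyses; the function name and both score lists are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: machine essay scores and scores on a multiple-choice test.
essay_scores = [2, 3, 3, 4, 5, 5, 6]
objective_test = [41, 48, 55, 60, 66, 72, 80]

print(f"r = {pearson_r(essay_scores, objective_test):.2f}")
```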
Writing A Prompt
• A good prompt is a good prompt; no different in the automated world.
• Focused Topic and Expectations
• Clear Task/Charge
• Other Characteristics
– Generates enough content
– Scorable
– Stimulates original writing
– Unemotional/unbiased
Rating Rubrics
• Scoring mechanism that evaluates essays holistically, analytically, or via traits.
• Most analytic and trait rubrics don’t seem to differentiate all that much from holistic scoring, but people like them (Shermis et al., 2002).
• May miss important (unarticulated) aspects of the writing enterprise (Bennett & Bejar, 1999).
6+1 Traits™
• Ideas
• Organization
• Voice
• Word Choice
• Sentence Fluency
• Conventions
• +1 Presentation (not used)
Source: Northwest Regional Educational Laboratory, Portland, OR. 6+1™ is a trademark of NWREL.
6+1 Traits™ Scoring Rubric
e-rater® and Criterion℠
• http://www.ets.org/criterion
Developing The Model
• Ideal: 300 Typical, Scored Responses Drawn From the Population
• Ideal: Strong Representation at the Tails of the Distribution
• Ideal: Scored by Two Well-Trained Scorers
• Cross-validated
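The model-building steps above can be sketched, under heavy simplification, as fitting a statistical model to human scores and checking it on held-out essays. The example below is an assumption-laden toy: it uses a single hypothetical feature (word count), takes the average of two raters as the criterion, and fits a one-predictor least-squares line. Real engines use many features and far more essays; none of the names or numbers here come from the presentation.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the feature must vary across essays")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_mae(xs, ys, k=5):
    """Mean absolute error of predictions on held-out folds (k-fold CV)."""
    n = len(xs)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of essays")
    errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th essay is held out
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        slope, intercept = fit_line(train_x, train_y)
        errors.extend(abs(slope * xs[i] + intercept - ys[i]) for i in test_idx)
    return sum(errors) / len(errors)


# Hypothetical training data: one feature and two human ratings per essay.
word_counts = [110, 180, 240, 300, 350, 410, 150, 270, 390, 450]
rater_a = [1, 2, 3, 4, 4, 5, 2, 3, 5, 6]
rater_b = [2, 2, 3, 3, 4, 5, 1, 4, 5, 6]
human = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]  # assumed criterion

mae = cross_validated_mae(word_counts, human, k=5)
print(f"held-out mean absolute error: {mae:.2f}")
```

Holding essays out of the fit is what keeps the error estimate honest: a model evaluated only on the essays it was trained on will look better than it is.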
Portfolios for Document Storage/Evaluation
• View Reports/Reporting Options
• Set up Assignments/Assignments
• View Setup Options (Tools, feedback)
The Florida Proposal
• Develop norms for automated essay scoring and assess for “vulnerable” groups; replace FCAT+ Writing
[Chart: Score (0 to 6) by Assignment (Assign 1 through Assign 15)]
Future Directions
• Development of general writing models that will speed up formulation of specific statistical models for grading.
• Grading by an “ideal” or “gold standard” essay.
• More work with LSA-like approaches to evaluating content.
• Writing tutorials that will provide additional feedback.
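To make the "gold standard" idea above concrete, the sketch below compares a student essay to a reference essay using cosine similarity over raw word counts. This is deliberately simpler than LSA, which would first project the word counts into a reduced semantic space (typically via singular value decomposition); the function names and sample texts are assumptions for illustration.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words vector: word -> count, lowercased."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Hypothetical gold-standard and student responses.
gold = "Columbus sailed west hoping to reach Asia and its spices."
student = "Columbus hoped to reach Asia by sailing west across the ocean."
off_topic = "My favorite food is pizza with extra cheese."

gold_vec = term_vector(gold)
print(f"student vs gold:   {cosine_similarity(term_vector(student), gold_vec):.2f}")
print(f"off-topic vs gold: {cosine_similarity(term_vector(off_topic), gold_vec):.2f}")
```

Because this version matches surface words only, it shares the weakness shown in the Columbus example; LSA-like methods aim to soften that by also crediting related vocabulary.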
For Further Information…
Lawrence Erlbaum Associates, Inc.
http://www.erlbaum.com