Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy,...
![Page 1: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/1.jpg)
Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience
Vaughn G. Rhudy, Ed.D., NBCT
Office of Assessment
West Virginia Department of Education
June 22, 2015
![Page 2: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/2.jpg)
1984-2004
• Statewide writing assessment began in 1984.
• Traditional paper-pencil assessment administered from 1984-2004.
  o Grades 4, 7 and 10
  o Approximately 20,000 students per grade level
  o Hand scored
  o Grade-level rubrics for scoring
  o Modified holistic scoring on 4-point scale
  o Four genres – narrative, descriptive, expository, persuasive
  o Results not included as part of state accountability data
West Virginia Writing Assessment
![Page 3: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/3.jpg)
1984-2004 Grade 4 Rubric
West Virginia Writing Assessment
![Page 4: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/4.jpg)
1984-2004 Grades 7 and 10 Rubric
West Virginia Writing Assessment
![Page 5: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/5.jpg)
2005-2007
• Online Writing Assessment from 2005-2007, except grade 4.
• Paper-pencil test in grade 4
  o Hand scored
• Computer-based assessment in grades 7 and 10
  o Artificial intelligence engine scoring
• Approximately 20,000 students per grade level
• Grade-level rubrics for scoring
• Analytic trait scoring on a 6-point scale
  o Five traits – Organization, Development, Sentence Structure, Word Choice, Mechanics
• Four genres – narrative, descriptive, informative, persuasive
• Results not included as part of state accountability data
• Scores on each analytic trait added to obtain a Summative Score and Performance Level
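The summative score described above is a plain sum of the five trait scores. A minimal sketch of that arithmetic follows. It assumes the 6-point scale runs from 1 to 6 (so totals range from 5 to 30), and the performance-level labels and cut scores are hypothetical placeholders, since the slides do not give the actual values.

```python
# Trait names as listed on the slide.
TRAITS = ("Organization", "Development", "Sentence Structure",
          "Word Choice", "Mechanics")

# Assumption: the 6-point scale runs from 1 to 6.
MIN_SCORE, MAX_SCORE = 1, 6

# Hypothetical cut scores (lowest total for each level, highest level first).
# These labels and thresholds are illustrative only; they are not from the source.
HYPOTHETICAL_CUTS = ((26, "Level D"), (19, "Level C"), (12, "Level B"), (5, "Level A"))


def summative_score(trait_scores):
    """Add the five analytic trait scores into one summative score."""
    missing = [t for t in TRAITS if t not in trait_scores]
    if missing:
        raise ValueError(f"missing trait scores: {missing}")
    for trait in TRAITS:
        score = trait_scores[trait]
        if not MIN_SCORE <= score <= MAX_SCORE:
            raise ValueError(f"{trait} score {score} is outside {MIN_SCORE}-{MAX_SCORE}")
    return sum(trait_scores[t] for t in TRAITS)


def performance_level(total):
    """Map a summative score to a (hypothetical) performance level."""
    for lower_bound, label in HYPOTHETICAL_CUTS:
        if total >= lower_bound:
            return label
    raise ValueError(f"summative score {total} is below the minimum possible total")


essay = {"Organization": 4, "Development": 3, "Sentence Structure": 4,
         "Word Choice": 5, "Mechanics": 3}
total = summative_score(essay)
print(total, performance_level(total))
```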
WV Online Writing Assessment
![Page 6: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/6.jpg)
2007 Grade 4 Writing Prompt
• Imagine that you are on a magic carpet that takes you anywhere you want to go. Tell about where you might go and what you might do.
WV Online Writing Assessment
![Page 7: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/7.jpg)
2005-2007 Grade 7 Rubric
WV Online Writing Assessment
![Page 8: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/8.jpg)
2005-2007 Grade 10 Rubric
WV Online Writing Assessment
![Page 9: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/9.jpg)
Initial Challenges
• Bandwidth - Connectivity
• Number of testing devices/computer labs
• Computer classes in labs
• Security updates
• Length of testing window
• Concerns about keyboarding skills, particularly among younger students
• Validity and reliability of artificial intelligence scoring engine
WV Online Writing Assessment
![Page 10: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/10.jpg)
Actions
• State and districts increased bandwidth.
• State and districts increased the number of testing devices.
• Nine-week testing window was established to address technology concerns and reduce daily testing load. Window spanned from February to April.
• From 2005-2007, fourth graders continued paper-pencil testing because of concerns about keyboarding skills.
• State engaged teachers in reviewing computer scoring to help with teacher buy-in.
WV Online Writing Assessment
![Page 11: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/11.jpg)
New Online Writing Assessment Field Test - 2008
• Expanded to grades 3-11
  o Hand scored
• Approximately 20,000 students per grade level
• Grade-level rubrics for scoring
• Analytic trait scoring on a 6-point scale
  o Five traits – Organization, Development, Sentence Structure, Word Choice/Grammar Usage, Mechanics
• Four genres – narrative, descriptive, informative and persuasive
  o Only narrative and descriptive at grade 3
• Passages added to prompts
• Results not included as part of state accountability data
WV Online Writing Assessment
![Page 12: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/12.jpg)
2008 Field Test
• New prompts with passages
• 136 prompts field tested – 2 genres at grade 3, 4 genres at grades 4-11, 4 prompts per genre
• 2 operational prompts selected per genre
• New grade-specific, 6-point analytic writing rubrics
• All student essays were hand scored
• State staff and selected teachers participated in range-finding
• Hand scored essays used to train new AI scoring engine
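The 136 figure can be checked by arithmetic. With 4 prompts per genre, 2 genres at grade 3 and 4 genres at each of the eight grades from 4 through 11 (the rest of the grade 3-11 range the field test covered) give exactly 136; a narrower grade span would give fewer. A short sketch, with the function and variable names chosen here for illustration:

```python
def field_test_prompt_count(genres_by_grade, prompts_per_genre=4):
    """Total prompts = sum over grades of (genres at that grade x prompts per genre)."""
    return sum(genres * prompts_per_genre for genres in genres_by_grade.values())


# Grade 3 has 2 genres; grades 4 through 11 have 4 genres each.
genres_by_grade = {3: 2, **{grade: 4 for grade in range(4, 12)}}

total = field_test_prompt_count(genres_by_grade)
print(total)  # 2*4 + 8*(4*4) = 8 + 128 = 136
```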
WV Online Writing Assessment
![Page 13: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/13.jpg)
2009-2014 Sample Grade 3
Descriptive Writing Prompt
WV Online Writing Assessment
![Page 14: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/14.jpg)
WV Online Writing Assessment
This is where you will begin typing your essay. At the end of the paragraph, hit the enter key at least once to skip a line between paragraphs.
Do not hit the tab key to indent your paragraph. It will not work.
![Page 15: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/15.jpg)
2009-2014 Grade 7 Rubric
WV Online Writing Assessment
![Page 16: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/16.jpg)
Grade 3 Student Survey
• 85 percent of grade 3 students indicated they preferred writing their essays on the computer to using traditional paper-pencil.
WV Online Writing Assessment
![Page 17: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/17.jpg)
WESTEST 2 Online Writing Assessment – 2009-2014
• Grades 3-11
  o Artificial intelligence engine scoring
• Approximately 20,000 students per grade level
• Grade-level rubrics for scoring
• Analytic trait scoring on a 6-point scale
  o Five traits – Organization, Development, Sentence Structure, Word Choice/Grammar Usage, Mechanics
WV Online Writing Assessment
![Page 18: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/18.jpg)
WESTEST 2 Online Writing Assessment – 2009-2014
• Four genres – narrative, descriptive, informative and persuasive
  o Only narrative and descriptive at grade 3
• Passages/prompts
• Results not included as part of state accountability data
• Online formative assessment practice program available for schools to use
WV Online Writing Assessment
![Page 19: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/19.jpg)
Later Challenges
• Bandwidth/Connectivity
  o Continued in some districts and schools but improved overall
• Number of testing devices/computer labs
  o Continued in some districts and schools but improved overall
• Computer classes in labs
  o Continued to be an issue but improved overall
• Browser updates
  o Test platform only allowed the use of Internet Explorer
  o Microsoft auto updates sometimes created problems
• Accuracy and reliability of AI scoring in practice program
  o Created lack of confidence in summative scoring engine
WV Online Writing Assessment
![Page 20: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/20.jpg)
Formative Assessment Practice Program – 2009-2014
• Writing Roadmap – shelf product
  o Shelf prompts
  o Shelf rubric
  o AI scoring
• West Virginia Writes – customization of Writing Roadmap for West Virginia
  o WV passages and prompts (field tested)
  o WV writing rubrics
  o Student responses from field test used to train AI engine
WV Online Writing Assessment
![Page 21: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/21.jpg)
AI Scoring Challenges
• Teacher buy-in and understanding of AI scoring
• Field testing sufficient number of prompts
  o WV lost some prompts during psychometric analysis, resulting in the need to repeat prompts in alternate years
• Rubric development for use in AI scoring
• Initial hand scoring
• Range finding
• Training sets
• Sufficient number of student responses to train engine
  o Particularly finding sufficient number of student responses scored in the high range
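One way to spot the high-range shortage described above is to count hand-scored responses at each score point before building a training set. The sketch below is illustrative only: it assumes scores run from 1 to 6, and the `min_per_point` threshold is a hypothetical parameter, since the slides do not say how many responses per score point the engine needed.

```python
from collections import Counter


def underrepresented_score_points(scores, min_per_point, min_score=1, max_score=6):
    """Return {score_point: count} for every score point with too few responses.

    `min_per_point` is a hypothetical threshold chosen by the caller; the
    source does not state the number of responses the engine required.
    """
    counts = Counter(scores)
    return {
        point: counts.get(point, 0)
        for point in range(min_score, max_score + 1)
        if counts.get(point, 0) < min_per_point
    }


# Made-up hand scores for one prompt and one trait, thin at both ends of the scale.
hand_scores = [1] * 5 + [2] * 12 + [3] * 30 + [4] * 25 + [5] * 8 + [6] * 1
print(underrepresented_score_points(hand_scores, min_per_point=10))
```

Score points that appear in the result are candidates for additional sampling or targeted hand scoring before engine training.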
WV Online Writing Assessment
![Page 22: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/22.jpg)
Scoring Reliability
• Validation Papers/Iterations
• Second Reads
• Comparability Studies
Artificial Intelligence Scoring
![Page 23: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/23.jpg)
Importance of Comparability
• Engine to Professional Hand Scorers
• Engine to West Virginia Teachers
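Comparisons like these are commonly summarized with agreement statistics. The slides do not say which statistics West Virginia or its vendor used, so the following is a sketch of three widely used ones, assuming scores from 1 to 6: exact agreement, exact-or-adjacent agreement (scores within one point), and quadratic weighted kappa. The function name and the sample scores are illustrative.

```python
def agreement_stats(human, engine, min_score=1, max_score=6):
    """Compare two lists of scores for the same essays.

    Returns exact agreement, exact-or-adjacent agreement (within one point),
    and quadratic weighted kappa.
    """
    if len(human) != len(engine) or not human:
        raise ValueError("score lists must be non-empty and the same length")
    n = len(human)
    k = max_score - min_score + 1

    exact = sum(h == e for h, e in zip(human, engine)) / n
    adjacent = sum(abs(h - e) <= 1 for h, e in zip(human, engine)) / n

    # Observed confusion matrix: rows are human scores, columns are engine scores.
    observed = [[0] * k for _ in range(k)]
    for h, e in zip(human, engine):
        observed[h - min_score][e - min_score] += 1
    human_totals = [sum(row) for row in observed]
    engine_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Quadratic weights penalize larger disagreements more heavily.
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * human_totals[i] * engine_totals[j] / n

    # Kappa is undefined when both raters gave every essay the same single score.
    kappa = (1.0 - weighted_observed / weighted_expected
             if weighted_expected else float("nan"))
    return {"exact": exact, "adjacent": adjacent, "quadratic_weighted_kappa": kappa}


# Made-up scores for four essays on one trait.
teacher_scores = [3, 4, 5, 2]
engine_scores = [3, 5, 5, 4]
print(agreement_stats(teacher_scores, engine_scores))
```

Running the same function once against professional hand scorers and once against state teachers gives two sets of figures that can be compared side by side.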
Artificial Intelligence Scoring
![Page 24: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/24.jpg)
Vendor Validation
WV Online Writing Assessment
![Page 25: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/25.jpg)
WV Comparability Studies
WV Online Writing Assessment
![Page 26: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/26.jpg)
Benefits of Teacher Participation
• Professional development in using rubrics for hand scoring of student essays
• Improvement of instructional practices
• Teacher buy-in of artificial intelligence scoring
WV Online Writing Assessment
![Page 27: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/27.jpg)
Considerations
• Involve teachers in prompt and rubric development
• Pilot testing and field testing important
• Sufficient number of prompts should be included in field test depending on sample size
• Include teachers in range finding
• Sufficient number of essays at each score point necessary to train engine, particularly for highest score point
Artificial Intelligence Scoring
![Page 28: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/28.jpg)
Considerations
• Quality of training sets important
• Engine must be calibrated to the scoring rubric(s)
• Engine training is key
• Vendor validation and read-behind studies
• State comparability studies with state teachers
• Ongoing engine training to account for potential drift
• Provide practice program for teachers and students
• Professional development for state teachers
Artificial Intelligence Scoring
![Page 29: Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.](https://reader036.fdocuments.in/reader036/viewer/2022062517/56649e8e5503460f94b912b6/html5/thumbnails/29.jpg)
Scoring Strengths and Weaknesses

| Human Scoring | Engine Scoring |
|---|---|
| Scoring accuracy dependent on training | Scoring accuracy dependent on training |
| Get tired, hungry, bored | Doesn’t get tired, hungry, bored |
| Individual Scorer Bias | No Bias |
| Easier to train – quicker | More difficult to train – time-consuming |
| Can make inferences | Has difficulty with inferences |
| Slow, expensive scoring | Quick, less expensive scoring |
Artificial Intelligence Scoring