
Designing interactive assessments to promote independent learning

Sally Jordan, Phil Butcher & Richard Jordan
The Open University

Effective Assessment in a Digital Age Workshop 3rd February 2011

The context: the Open University

• Supported distance learning;
• 180,000 students, mostly studying part-time;
• Undergraduate modules are completely open entry, so students have a wide range of previous qualifications;
• Normal age range from 16 to ??
• 10,000 of our students have declared a disability of some sort;
• 25,000 of our students live outside the UK.

Case study presentation

• What we did

• Why we did it

• How we did it

• Your chance to have a go

• Discussion

S104 website and iCMA47 (screenshots)

Why
We use interactive computer-marked assignments (iCMAs) alongside tutor-marked assignments (TMAs) to:

• Provide instantaneous feedback – and an opportunity for students to act on that feedback;

• Provide ‘little and often’ assessment opportunities and so help students to pace their studies;

• Act as ‘a tutor at the student’s elbow’;
• We also use iCMAs for diagnostic purposes.

We use a range of question types, going beyond those where students select from pre-determined answers to those where they have to write in their own words.

How

• Most of our questions are written in OpenMark and sit within the Moodle virtual learning environment;

• For short-answer free-text questions, we initially used answer-matching software provided by Intelligent Assessment Technologies (IAT), sitting within OpenMark;

• The software is based on the Natural Language Processing technique of ‘Information Extraction’;

• Key point: Real student responses were used in developing the answer matching.

The IAT software

• represents mark schemes as templates;

• synonyms can be added and verbs are usually lemmatised.
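The real IAT system is based on natural-language information extraction and is far more sophisticated than anything shown here. Purely as a toy illustration of the general idea described in the bullets above (mapping word forms and synonyms onto canonical terms, then checking them against a mark-scheme template), a sketch in Java might look like the following; the class name, word lists and template are hypothetical.

import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class TemplateMatchSketch {

    // Hypothetical lemma/synonym table: maps word forms that might appear in a
    // response onto the canonical terms used in the mark-scheme template.
    private static final Map<String, String> CANONICAL = Map.of(
            "rises", "rise",
            "rising", "rise",
            "rose", "rise",
            "increases", "rise",
            "temp", "temperature");

    // A toy "template": the canonical terms that must all appear in a response.
    private static final List<String> TEMPLATE = List.of("temperature", "rise");

    static boolean matches(String response) {
        // Normalise each word, then check that every template term is present.
        List<String> normalised = Arrays.stream(response.toLowerCase().split("\\W+"))
                .map(word -> CANONICAL.getOrDefault(word, word))
                .toList();
        return normalised.containsAll(TEMPLATE);
    }

    public static void main(String[] args) {
        System.out.println(matches("The temperature rises"));  // true
        System.out.println(matches("It gets colder"));         // false
    }
}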

Human-computer marking comparison

• The computer marking was compared with that of 6 human markers;

• For most questions the computer’s marking was indistinguishable from that of the human markers;

• For all questions, the computer’s marking was closer to that of the question author than that of some of the human markers;

• The computer was not always ‘right’, but neither were the human markers.

Percentage of responses where the marking was in agreement with the question author:

Question | Responses in analysis | 6 human markers (range) | 6 human markers (mean) | Computer marking
A | 189 | 97.4 to 100 | 98.9 | 99.5
B | 248 | 83.9 to 97.2 | 91.9 | 97.6
C | 150 | 80.7 to 94.0 | 86.9 | 94.7
D | 129 | 91.5 to 98.4 | 96.7 | 97.6
E | 92 | 92.4 to 97.8 | 95.1 | 98.9
F | 129 | 86.0 to 97.7 | 90.8 | 97.7
G | 132 | 66.7 to 90.2 | 83.2 | 89.4

Computer-computer marking comparison

• An undergraduate student (not of computer science) developed answer matching using two algorithmically based systems, Java regular expressions and OpenMark PMatch (a sketch of the regular-expression approach is given after the comparison table below);

• These are not simple ‘bag of words’ systems;
• Student responses were used in the development of the answer matching, as had been the case for the linguistically based IAT system;

• The results were compared.

Percentage of responses where the computer marking was in agreement with the question author:

Question | Responses in set | IAT (computational linguistics) | OpenMark PMatch (algorithmic manipulation of keywords) | Java regular expressions (algorithmic manipulation of keywords)
A | 189 | 99.5 | 99.5 | 98.9
B | 248 | 97.6 | 98.8 | 98.0
C | 150 | 94.7 | 94.7 | 90.7
D | 129 | 97.6 | 96.1 | 97.7
E | 92 | 98.9 | 96.7 | 96.7
F | 129 | 97.7 | 88.4 | 89.2
G | 132 | 89.4 | 87.9 | 88.6
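As a rough illustration of the ‘algorithmic manipulation of keywords’ approach compared in the table above, answer matching with plain Java regular expressions might look something like the sketch below. The patterns, synonym lists and class name are hypothetical; they are not the answer matching developed for the study.

import java.util.regex.Pattern;

public class RegexAnswerMatcherSketch {

    // Hypothetical synonym groups for two required concepts.
    private static final Pattern HIGH = Pattern.compile(
            "\\b(high(er)?|extreme|immense|increas\\w*)\\b", Pattern.CASE_INSENSITIVE);
    private static final Pattern PRESSURE_OR_TEMPERATURE = Pattern.compile(
            "\\b(press\\w*|compress\\w*|temperature|heat\\w*)\\b", Pattern.CASE_INSENSITIVE);

    // Accept a response only if both concept groups appear somewhere in it.
    static boolean matches(String response) {
        return HIGH.matcher(response).find()
                && PRESSURE_OR_TEMPERATURE.matcher(response).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("It forms under high pressure and temperature")); // true
        System.out.println(matches("It is found deep underground"));                 // false
    }
}

A realistic rule set needs many such patterns, one for each acceptable (or identifiably wrong) way of expressing the answer, which is why real student responses are so valuable when developing and testing the matching.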

Recent work

• We repeated the computer-computer marking comparison - PMatch did even better;

• We have introduced a spellchecker into PMatch;
• We have now transferred the answer matching for our ‘live’ questions from IAT to PMatch;
• Similar software will be available as part of Core Moodle from late 2011.

PMatch is an algorithmically based system, so a rule might be something like:

Accept answers that include the words ‘high’, ‘pressure’ and ‘temperature’ or synonyms, separated by no more than three words.

• This is expressed as:

else if (m.match("mowp3", "high|higher|extreme|inc&|immense_press&|compres&|[deep_burial]_temp&|heat&|[hundred|100_degrees]")) {
    matchMark = 1; whichMatch = 9;
}

• 10 rules of this type match 99.9% of student responses.
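The snippet above is OpenMark/PMatch code. For readers unfamiliar with it, the sketch below shows, in plain Java and with entirely hypothetical patterns and feedback strings, the general shape of such a rule chain: rules are tried in order, and the first one that matches determines both the mark and which feedback message the student sees.

import java.util.List;
import java.util.regex.Pattern;

public class RuleChainSketch {

    // Each rule pairs a pattern with a mark and a feedback message.
    record Rule(Pattern pattern, int mark, String feedback) {}

    // Hypothetical rules, tried in order; the first that matches decides the outcome.
    private static final List<Rule> RULES = List.of(
            new Rule(Pattern.compile("(?i)\\bhigh\\b.*\\b(pressure|temperature)"),
                    1, "Correct: high pressure and high temperature are both needed."),
            new Rule(Pattern.compile("(?i)\\bpressure\\b"),
                    0, "Pressure alone is not enough; think about temperature too."));

    static String markAndFeedback(String response) {
        for (Rule rule : RULES) {
            if (rule.pattern().matcher(response).find()) {
                return "mark=" + rule.mark() + ": " + rule.feedback();
            }
        }
        return "mark=0: Please read the question again and have another try.";
    }

    public static void main(String[] args) {
        System.out.println(markAndFeedback("It happens under high temperature and pressure"));
        System.out.println(markAndFeedback("Lots of pressure"));
    }
}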

Have a go
• Follow the instructions on Worksheet 1, using the student responses on Worksheet 2, to suggest rules for answer matching and feedback.

Notes after the Workshop
• The rules suggested at the JISC workshop on 3rd Feb gave a 100% match (well done!). Our rules for all the student responses available are given in Worksheet 3.

• It was suggested that this question could be reframed as a multiple-choice question where students are given a picture of a slide and asked to indicate the position. This is a very good point, though you’d miss the students who thought the kinetic energy was greatest as the child climbed the steps.

Benefits

• For students – instantaneous feedback on non-trivial e-assessment tasks;

• For associate lecturer staff – saving from the drudgery of marking routine responses; more time to spend supporting students in other ways;

• For module team staff – knowledge that marking has been done consistently and quickly;

• For the institution – cost savings; more information about student misunderstandings.

A proviso

• Our answer-matching accuracy is based on the use of hundreds of student responses, and there is an overall financial saving because the modules are studied by thousands of students per year and have a lifetime of 8–10 years;

• Our approach may be a less practical solution for smaller student numbers;

• However, monitoring student responses to e-assessment questions of all types remains important.

Overall conclusions
• Simple pattern-matching software has been shown to be very effective;
• Very small changes can make a huge difference to the effectiveness of an innovation;
• Student perception is important;
• It is really important to monitor actual student responses and to learn from them.

How far is it appropriate to go?

• I don’t see online assessment as a panacea;
• Some learning outcomes are easier than others to assess in this way;
• Free-text questions require students to construct a response, but there still need to be definite ‘right’ and ‘wrong’ answers (though not necessarily a single right answer);

• However, online assessment provides instantaneous feedback and has been shown to be more accurate than human markers. It can free up human markers for other tasks. It also has huge potential for diagnostic use.

For further information:

• Jordan, S. & Mitchell, T. (2009) E-assessment for learning? The potential of short free-text questions with tailored feedback. British Journal of Educational Technology, 40, 2, 371-385.

• Butcher, P.G. & Jordan, S.E. (2010) A comparison of human and computer marking of short free-text student responses. Computers & Education, 55, 489-499.

Useful links

PMatch demonstration
https://students.open.ac.uk/openmark/omdemo.pm2009/

‘Are you ready for S104?’ (diagnostic quiz, showing a range of question types)
https://students.open.ac.uk/openmark/science.level1ayrf.s104/

OpenMark examples site
http://www.open.ac.uk/openmarkexamples/index.shtml

Intelligent Assessment Technologies (IAT)

http://www.intelligentassessment.com/

Sally Jordan
Staff Tutor in Science
The Open University in the East of England
Cintra House
12 Hills Road
Cambridge
CB2 1PF
s.e.jordan@open.ac.uk
website: http://www.open.ac.uk/science/people/people-profile.php?staff_id=Sally%26%26Jordan
blog: http://www.open.ac.uk/blogs/SallyJordan/