In Focus Presentation: Improving retention: predicting at-risk students by analysing clicking...
Improving retention: predicting at-risk students by analysing clicking behaviour in a virtual learning environment
Annika Wolff and Zdenek Zdrahal
10th December 2013
Student retention
• Struggling students don’t always ask for help – they drop out of the module, or fail, and then don’t progress further
• When timely help is offered, this can make the difference between success and failure.
• It can be hard to know who’s in trouble and where to direct resources
Open University context
students
tutors
Distance learning:
• Content through VLE
• Contact mediated through VLE – how to tell if students are struggling?
Solution: develop predictive models from student data
Data sources and data sets
• VLE: learning content, forums, quizzes, …
• Assessment: ongoing assessments, final exam
• Demographic: age, gender, previous study, …
Typical VLE clicks
[Chart: VLE clicks for Students and Tutors; horizontal axis 1 to 47, vertical axis 0 to 3000]
VLE activity (prior to TMA1)
• No VLE activity … 317 students
• 1-20 clicks … 609 students
• 21-80 clicks … 943 students
• 81-150 clicks … 621 students
• 151-300 clicks … 803 students
• 301-600 clicks … 516 students
• > 600 clicks … 355 students
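To put the band counts above in proportion, the following minimal sketch turns them into shares of the cohort. The counts are copied from the slide; the variable and function names are illustrative only. On these figures the bands sum to 4164 students, of whom roughly 22% made 20 clicks or fewer before TMA1.

```python
# Click bands and student counts copied from the slide above.
click_bands = {
    "No VLE activity": 317,
    "1-20 clicks": 609,
    "21-80 clicks": 943,
    "81-150 clicks": 621,
    "151-300 clicks": 803,
    "301-600 clicks": 516,
    "> 600 clicks": 355,
}


def band_shares(bands):
    """Return each band's share of all students, as a fraction of 1."""
    total = sum(bands.values())
    return {band: count / total for band, count in bands.items()}


total_students = sum(click_bands.values())
shares = band_shares(click_bands)

for band, share in shares.items():
    print(f"{band:<16} {click_bands[band]:>4} students  {share:6.1%}")
print(f"{'Total':<16} {total_students:>4} students")
```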
Problem specification
• Given:
– Demographic data at the start (may include information about the student’s previous modules studied at the OU and his/her objectives)
– Assessments (TMAs) as they become available during the module
– VLE activities between TMAs
– Conditions the student must satisfy to pass the module
• Goal:
– Identify students at risk of failing the module as early as possible, so that OU intervention is meaningful.
Comments on problem specification
• OU intervention is meaningful if the cost of the intervention is lower than the expected gain from retaining the student.
• Modelling the problem:
We are here
History we know
Future we can estimate
… and we can influence!
How can we estimate the future?
… Based on the student’s history and on properties of the upcoming parts of the module, known from previous presentations.
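The cost condition on this slide (intervention is meaningful if it costs less than the expected gain from retaining the student) can be written as a one-line rule. The sketch below is only an illustration: the slides do not say how the expected gain is computed, so the decomposition into a failure probability, a chance that the intervention rescues the student, and a retention value is an assumption, and all numbers in the usage lines are made up.

```python
def expected_gain(p_fail, p_rescued, retention_value):
    """Expected gain from intervening with one student.

    Assumed decomposition (not from the slides):
    p_fail          - predicted probability that the student fails
    p_rescued       - probability that the intervention prevents that failure
    retention_value - value of retaining the student, in the same units as cost
    """
    return p_fail * p_rescued * retention_value


def intervention_is_meaningful(cost, p_fail, p_rescued, retention_value):
    """Rule from the slide: intervene only if cost < expected gain."""
    return cost < expected_gain(p_fail, p_rescued, retention_value)


# Hypothetical figures, for illustration only.
high_risk = intervention_is_meaningful(cost=50, p_fail=0.64, p_rescued=0.25, retention_value=1000)
low_risk = intervention_is_meaningful(cost=50, p_fail=0.063, p_rescued=0.25, retention_value=1000)
print(high_risk, low_risk)
```

Under this reading, the same intervention can be worthwhile for a high-risk student and not for a low-risk one, which is why the prediction quality matters.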
Prediction at TMA1
– Why? TMA1 is a good predictor of success or failure
– There is still enough time to intervene
We are here
History we know
Future we can affect
Building a classifier
[Diagram: training instances labelled Pass or Fail; new instances classified as PASS or FAIL; example decision node "Assessment 1 score?" with branches >40% and <40%]
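As a rough illustration of building a classifier from labelled training instances, the sketch below fits a single decision node on one score, in the spirit of the "Assessment 1 score?" split. It is not the decision tree used in the presentation: the training scores and labels are invented, and the "score >= threshold means Pass" form of the rule is an assumption.

```python
def fit_stump(scores, labels):
    """Pick the threshold that misclassifies the fewest training instances.

    Rule being fitted: predict "Pass" when score >= threshold, else "Fail".
    Ties are resolved in favour of the lowest threshold.
    """
    best_threshold, best_errors = None, len(labels) + 1
    for threshold in sorted(set(scores)):
        errors = sum(
            ("Pass" if score >= threshold else "Fail") != label
            for score, label in zip(scores, labels)
        )
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


def predict(threshold, score):
    """Classify a new instance with the fitted threshold."""
    return "Pass" if score >= threshold else "Fail"


# Made-up training instances: assessment scores with final outcomes.
train_scores = [12, 25, 38, 41, 55, 63, 70, 88]
train_labels = ["Fail", "Fail", "Fail", "Pass", "Fail", "Pass", "Pass", "Pass"]

threshold, errors = fit_stump(train_scores, train_labels)
print(threshold, errors, predict(threshold, 30), predict(threshold, 75))
```

A full decision tree repeats this kind of split recursively over several attributes; the stump only shows the single-node case.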
Decision Tree – first results (no demographics)
Performance drop (VLE+TMA)
Final outcome
Naïve Bayes network
Nodes: Sex, Education, N/C, VLE, TMA1
• Education:
– No formal qualif.
– Lower than A level
– A level
– HE qualif.
– Postgraduate qualif.
• VLE:
– No engagement
– 1-20 clicks
– 21-100 clicks
– 101-800 clicks
• N/C:
– New student
– Continuing student
• Sex:
– Female
– Male
Goal: Calculate probability of failing at TMA1
• either by not submitting TMA1,
• or by submitting with score < 40.
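The goal above, a probability of failing at TMA1 given Sex, Education, N/C and VLE band, can be sketched with a hand-rolled naive Bayes classifier. This is a minimal sketch under stated assumptions, not the presenters' model: the slides do not describe how the network was fitted, Laplace smoothing is my choice, and every row in `TOY_ROWS` is invented purely to make the code runnable.

```python
from collections import Counter, defaultdict

FEATURES = ("sex", "education", "new_or_continuing", "vle")


def train(rows, alpha=1.0):
    """Count class and feature-value frequencies; alpha is the Laplace smoothing constant."""
    class_counts = Counter(row["tma1"] for row in rows)
    value_counts = defaultdict(Counter)  # (feature, class) -> counts of each value
    n_values = {f: len({row[f] for row in rows}) for f in FEATURES}
    for row in rows:
        for f in FEATURES:
            value_counts[(f, row["tma1"])][row[f]] += 1
    return class_counts, value_counts, n_values, alpha


def prob_fail(model, student):
    """Posterior P(fail at TMA1 | features), assuming features are independent given the class."""
    class_counts, value_counts, n_values, alpha = model
    total = sum(class_counts.values())
    joint = {}
    for cls, n_cls in class_counts.items():
        p = n_cls / total
        for f in FEATURES:
            seen = value_counts[(f, cls)][student[f]]
            p *= (seen + alpha) / (n_cls + alpha * n_values[f])
        joint[cls] = p
    return joint["fail"] / sum(joint.values())


def make_row(sex, education, new_or_continuing, vle, tma1):
    return dict(zip(FEATURES + ("tma1",), (sex, education, new_or_continuing, vle, tma1)))


# Invented toy data, NOT from the presentation.
TOY_ROWS = [make_row(*r) for r in [
    ("F", "A level", "new", "0", "fail"),
    ("M", "A level", "new", "0", "fail"),
    ("M", "HE qualif.", "continuing", "1-20", "fail"),
    ("F", "A level", "new", "1-20", "pass"),
    ("F", "HE qualif.", "continuing", "21-100", "pass"),
    ("M", "A level", "continuing", "21-100", "pass"),
    ("F", "HE qualif.", "new", "101-800", "pass"),
    ("M", "HE qualif.", "continuing", "101-800", "pass"),
]]

model = train(TOY_ROWS)
student = make_row("F", "A level", "new", "0", None)
p_no_clicks = prob_fail(model, student)
p_many_clicks = prob_fail(model, {**student, "vle": "101-800"})
print(f"{p_no_clicks:.2f} {p_many_clicks:.2f}")
```

Changing only the VLE band changes the estimate for an otherwise identical student, which is the effect the two demo cases later in the deck illustrate with real data.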
Predicting final result from TMA1
[Diagram nodes: VLE, TMA1, TMA2, TMA7, Final result (Pass/Distinction or Fail); TMA1 splits into TMA1 >= 40 and TMA1 < 40]
Prior probabilities: P(Success) = 0.807, P(Fail) = 0.193
Posterior probabilities:
P(Success|TMA1) = 0.858, P(Fail|TMA1) = 0.142
P(Success|~TMA1) = 0.093, P(Fail|~TMA1) = 0.907
Bayes minimum error classifier
If a student fails TMA1, he/she is likely to fail the module.
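The Bayes minimum error rule on this slide picks, for each TMA1 outcome, the final result with the larger posterior probability. The sketch below applies that rule to the probabilities quoted above. It assumes that "TMA1" means a TMA1 score of at least 40 and "~TMA1" means a score below 40; the implied TMA1 pass rate and the expected error are derived here from the quoted numbers and do not appear in the slides.

```python
# Probabilities quoted on the slide.
P_SUCCESS = 0.807
POSTERIOR = {
    True: {"Success": 0.858, "Fail": 0.142},   # TMA1 >= 40 (assumed reading of "TMA1")
    False: {"Success": 0.093, "Fail": 0.907},  # TMA1 < 40 (assumed reading of "~TMA1")
}


def bayes_min_error(passed_tma1):
    """Predict the final result with the highest posterior probability."""
    posterior = POSTERIOR[passed_tma1]
    return max(posterior, key=posterior.get)


def implied_p_tma1_pass():
    """Solve P(Success) = P(S|TMA1) * p + P(S|~TMA1) * (1 - p) for p."""
    s_pass = POSTERIOR[True]["Success"]
    s_fail = POSTERIOR[False]["Success"]
    return (P_SUCCESS - s_fail) / (s_pass - s_fail)


def expected_error():
    """Error rate of the rule: probability mass of the less likely outcome in each branch."""
    p = implied_p_tma1_pass()
    return p * min(POSTERIOR[True].values()) + (1 - p) * min(POSTERIOR[False].values())


print(bayes_min_error(True), bayes_min_error(False))
print(round(implied_p_tma1_pass(), 3), round(expected_error(), 3))
```

On these figures the rule is wrong for roughly 14% of students, compared with 19.3% when every student is simply predicted to succeed.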
P(Fail|TMA1-score), P(Pass/Dist|TMA1-score)
[Chart: probability (0 to 1) of Fail and Pass/Dist for each TMA1 score band: 0-39, 40-59, 60-69, 70-79, 80-100]
Predicting final result from TMA1
[Diagram nodes: Sex, Education, N/C, VLE, TMA1, TMA2, TMA7, Final result]
Demo Case 1
• Demographic data
– Student fits a certain demographic profile of gender, educational background etc.
Without VLE (nodes: Sex, Education, N/C, TMA1):
Probability of failing at TMA1 = 18.5%
With VLE (nodes: Sex, Education, N/C, VLE, TMA1):
Clicks     Probability of failing at TMA1    No. of students
0          64%                               4
1-20       44%                               3
21-100     26%                               5
101-800    6.3%                              14
Demo Case 2
• Demographic data
– Different demographic profile to previous slide
Without VLE (nodes: Sex, Education, N/C, TMA1):
Probability of failing at TMA1 = 7.7%
With VLE (nodes: Sex, Education, N/C, VLE, TMA1):
Clicks     Probability of failing at TMA1    No. of students
0          39%                               35
1-20       22%                               74
21-100     11.2%                             178
101-800    2.4%                              461
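To make the two demo cases easier to compare, the short sketch below computes how much more likely a student with no clicks is to fail TMA1 than one in the 101-800 band, and which click bands sit above the demographics-only estimate. The probabilities are copied from the two slides; the function names are illustrative and the ratios themselves are not stated in the presentation.

```python
# Probabilities of failing at TMA1, copied from the two demo-case slides.
CASES = {
    "Demo Case 1": {
        "without_vle": 0.185,
        "with_vle": {"0": 0.64, "1-20": 0.44, "21-100": 0.26, "101-800": 0.063},
    },
    "Demo Case 2": {
        "without_vle": 0.077,
        "with_vle": {"0": 0.39, "1-20": 0.22, "21-100": 0.112, "101-800": 0.024},
    },
}


def risk_ratio(case, band, reference="101-800"):
    """How many times more likely failure is in `band` than in the reference band."""
    return case["with_vle"][band] / case["with_vle"][reference]


def bands_above_baseline(case):
    """Click bands whose failure probability exceeds the demographics-only estimate."""
    return [band for band, p in case["with_vle"].items() if p > case["without_vle"]]


for name, case in CASES.items():
    print(name, round(risk_ratio(case, "0"), 1), bands_above_baseline(case))
```

In both profiles a student with no clicks is at least ten times more likely to fail TMA1 than one in the highest click band, even though the demographics-only estimates differ.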
TMA1? … it might be too late!
Can we predict TMA1 from VLE activities 1 week before the TMA1 deadline? How about 2, 3, … weeks?
We are here
History
Future we can affect
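Predicting TMA1 some weeks before its deadline means using only the VLE clicks recorded before a cutoff date. The sketch below shows one way such a feature could be built: it sums each student's clicks before "deadline minus k weeks" and maps the total to the click bands used in the Bayes network. The log format, the student ids, the dates and the handling of totals above 800 are all assumptions made for illustration; the slides do not describe the actual data pipeline.

```python
from collections import defaultdict
from datetime import date, timedelta


def click_band(clicks):
    """Map a click total to the bands listed for the VLE node."""
    if clicks == 0:
        return "No engagement"
    if clicks <= 20:
        return "1-20"
    if clicks <= 100:
        return "21-100"
    return "101-800"  # assumption: totals above 800 are also placed in the top band


def vle_bands_at_cutoff(click_log, student_ids, deadline, weeks_before):
    """Band each student's clicks made strictly before the cutoff date.

    click_log is a hypothetical list of (student_id, day, clicks) tuples.
    """
    cutoff = deadline - timedelta(weeks=weeks_before)
    totals = defaultdict(int)
    for student_id, day, clicks in click_log:
        if day < cutoff:
            totals[student_id] += clicks
    return {sid: click_band(totals[sid]) for sid in student_ids}


# Invented example data.
tma1_deadline = date(2013, 11, 28)
log = [
    ("s1", date(2013, 11, 1), 15),
    ("s1", date(2013, 11, 20), 90),
    ("s2", date(2013, 11, 25), 30),
]
students = ["s1", "s2", "s3"]

one_week = vle_bands_at_cutoff(log, students, tma1_deadline, weeks_before=1)
two_weeks = vle_bands_at_cutoff(log, students, tma1_deadline, weeks_before=2)
print(one_week)
print(two_weeks)
```

Moving the cutoff earlier leaves less click history per student, which is the trade-off between early and accurate prediction that the slide raises.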
Dashboard and Chart
predicted to fail
has not engaged with VLE
average score < 40
However
at least one TMA below 40
has not submitted TMA5
has not engaged with VLE
average score = 81.71 !!!
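The dashboard annotations above read like simple rule-based flags. The sketch below shows how such flags could be computed for one student, and why they can fire for a student whose average score is high. It is a guess at the logic, not the dashboard's implementation: the function, the data layout, the example scores and the choice to average only submitted TMAs are all assumptions.

```python
def risk_flags(tma_scores, vle_clicks, pass_mark=40):
    """Return dashboard-style warning flags for one student.

    tma_scores maps a TMA name to its score, or to None if it was not submitted.
    The average is taken over submitted TMAs only (an assumption).
    """
    submitted = [score for score in tma_scores.values() if score is not None]
    flags = []
    if vle_clicks == 0:
        flags.append("has not engaged with VLE")
    if submitted and sum(submitted) / len(submitted) < pass_mark:
        flags.append("average score < 40")
    if any(score < pass_mark for score in submitted):
        flags.append("at least one TMA below 40")
    flags.extend(
        f"has not submitted {name}" for name, score in tma_scores.items() if score is None
    )
    return flags


# Invented student: strong average, one weak TMA, one missing TMA, no VLE clicks.
example_scores = {"TMA1": 85, "TMA2": 90, "TMA3": 35, "TMA4": 95, "TMA5": None}
flags = risk_flags(example_scores, vle_clicks=0)
print(flags)
```

A student like this one triggers three warnings while still averaging well above the pass mark, so individual flags need to be read together rather than in isolation.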
Dashboard – new design
Conclusions
• In a distance learning context, VLE data provides a valuable source of information for prediction
• Prediction improves as a module progresses, but by then it is too late to intervene!
• We need to optimise methods for early prediction