Effective Skill Assessment Using Expectation Maximization in a Multi Network Temporal Bayesian...



Effective Skill Assessment Using Expectation Maximization in a Multi Network Temporal Bayesian Network

By Zach Pardos, Advisors: Neil Heffernan, Carolina Ruiz, Joseph Beck

Goal

To help teachers track student knowledge and learning during the school year from responses on the ASSISTment tutoring system and to make accurate end of year standardized test score predictions.

Background on ASSISTment

• ASSISTment is a web-based assessment system for 8th-10th grade math that tutors students on items they get wrong. There are 1,443 items in the system.

• The system is freely available at www.assistment.org

• Question responses from 600 students who used the system during the 2004-2005 school year were used.

• Each student completed around 260 items.

The Skill Models

The skill models were created for use in the online tutoring system called ASSISTment, founded at WPI. Each model consists of skill names and a tagging of those skill names to math questions on the system. Models with 1, 5, 39 and 106 skills were evaluated to represent varying degrees of concept generality. Each skill model’s ability to predict student performance on the system, as well as on a standardized state test, was evaluated.

The five skill models used:

• WPI-106: 106 skill names were drafted and tagged to items in the tutoring system and to the questions on the state test by our subject matter expert.

• WPI-5 and WPI-39: 5 and 39 skill names, respectively, drafted by the Massachusetts Department of Education.

• WPI-1: Represents unidimensional assessment.

Learning Results from Temporal Net

• The ASSISTment fine-grained and temporal skill models excel at assessment of student skills and prediction of the MCAS.

• Accurate prediction and parameter learning mean teachers can know when students have attained certain mandated math competencies.

• Skill probabilities are inferred from a student’s responses to questions on the system.

Bayesian Belief Network

Student Test Score Prediction

WPI Department of Computer Science, 2008


Temporal Network Structure

Conclusions

Bayesian Networks

• A Bayesian network is a probabilistic machine learning method. It is well suited for making inferences on unobserved random variables by incorporating prior probabilities with new evidence.

• Arrows represent associations of skills with question items. They also represent conditional dependence in the Bayesian Belief Network.

• Skill values are inferred for each student from their responses on the tutor.

• Inferred skill values are used to predict the probability of a given student answering a question correctly on the tutor system or on the MCAS (Massachusetts Comprehensive Assessment System) test.

• Predicting end of year MCAS scores
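The two inference steps in the bullets above can be sketched in Python for the simplest case: one binary skill with several tagged questions. This is not the poster's implementation. The function names and the guess/slip parameterization of each question are assumptions made for this example.

```python
def skill_posterior(prior, responses, guess, slip):
    """Posterior probability that a skill is known, given tagged responses.

    Assumes (for illustration) one binary skill, responses that are
    independent given the skill, and shared guess/slip values.
    """
    like_known = 1.0
    like_unknown = 1.0
    for correct in responses:
        like_known *= (1.0 - slip) if correct else slip
        like_unknown *= guess if correct else (1.0 - guess)
    joint_known = prior * like_known
    return joint_known / (joint_known + (1.0 - prior) * like_unknown)


def p_correct(p_known, guess, slip):
    """Predicted probability of a correct answer on an item tagged with the skill."""
    return p_known * (1.0 - slip) + (1.0 - p_known) * guess


# Hypothetical student: two correct answers and one wrong answer on one skill.
posterior = skill_posterior(prior=0.30, responses=[True, True, False],
                            guess=0.14, slip=0.09)
print(posterior, p_correct(posterior, guess=0.14, slip=0.09))
```

With the poster's reported averages (prior 30%, guess 14%, slip 9%) and no responses yet, `p_correct(0.30, 0.14, 0.09)` comes to approximately 0.371.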

Addition 87.38%
Ordering-Numbers 80.83%
Multiplication 69.66%
Integers 68.54%
Multiplying-Positive-Negative-Numbers 66.55%

Venn-Diagram 0.11%
Pythagorean-theorem 0.87%
Of-Means-Multiply 1.07%
Interpreting-Linear-Equations 1.16%
Fraction-Multiplication 1.96%

Multiplication 35.94%
Point-Plotting 30.24%
Addition 27.52%
Square-Root 24.12%
Proportion 18.43%

Rate 1.25%
Sum-of-Interior-Angles-Triangle 1.47%
Equation-Concept 1.66%
Venn-Diagram 1.89%
Unit-Conversion 2.04%

22.31% 17.28% 14.45% 12.86% 12.72%

Skills with the most learning between tutor sessions

Skills with the least learning between tutor sessions

Skills with highest incoming 8th grade knowledge level

Skills with lowest incoming 8th grade knowledge level

• Average prior knowledge of skills before using the tutor: 30%

• Average probability of guessing: 14%; of slipping: 9%

• Average probability of learning a skill from one session to the next: 8%

All student data was presented to the temporal Bayesian network with each time slice representing a tutor session. Network parameters were learned using the Expectation Maximization algorithm to reveal student performance characteristics.
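A minimal Python sketch of Expectation Maximization for one skill's temporal network follows, to make the parameter learning step concrete. It rests on assumptions the poster does not state: one response per time slice, no forgetting, arbitrary starting values, and made-up toy response sequences. The names `forward_backward`, `em_fit` and `clamp` are illustrative only.

```python
import math


def forward_backward(obs, prior, learn, guess, slip):
    """Scaled forward-backward pass for one response sequence (1 = correct).

    State 0 means the skill is not yet known, state 1 means it is known.
    Returns (gamma, xi, log_likelihood).
    """
    trans = [[1.0 - learn, learn], [0.0, 1.0]]  # assumption: no forgetting

    def emit(state, o):
        p_correct = (1.0 - slip) if state == 1 else guess
        return p_correct if o == 1 else 1.0 - p_correct

    n = len(obs)
    alpha, scale = [], []
    for t, o in enumerate(obs):
        if t == 0:
            raw = [(1.0 - prior) * emit(0, o), prior * emit(1, o)]
        else:
            raw = [emit(j, o) * sum(alpha[t - 1][i] * trans[i][j] for i in (0, 1))
                   for j in (0, 1)]
        c = raw[0] + raw[1]
        scale.append(c)
        alpha.append([raw[0] / c, raw[1] / c])

    beta = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for i in (0, 1):
            beta[t][i] = sum(trans[i][j] * emit(j, obs[t + 1]) * beta[t + 1][j]
                             for j in (0, 1)) / scale[t + 1]

    gamma = [[alpha[t][k] * beta[t][k] for k in (0, 1)] for t in range(n)]
    xi = [[[alpha[t][i] * trans[i][j] * emit(j, obs[t + 1]) * beta[t + 1][j]
            / scale[t + 1] for j in (0, 1)] for i in (0, 1)]
          for t in range(n - 1)]
    return gamma, xi, sum(math.log(c) for c in scale)


def clamp(x, eps=1e-6):
    """Keep a probability strictly inside (0, 1) so logs stay finite."""
    return min(max(x, eps), 1.0 - eps)


def em_fit(sequences, prior=0.5, learn=0.1, guess=0.2, slip=0.1, iterations=20):
    """Run EM over all students' sequences for one skill."""
    log_likelihoods = []
    for _ in range(iterations):
        # E-step: accumulate expected counts under the current parameters.
        known_at_start = total = 0.0
        learn_num = learn_den = 0.0
        guess_num = guess_den = 0.0
        slip_num = slip_den = 0.0
        for obs in sequences:
            gamma, xi, ll = forward_backward(obs, prior, learn, guess, slip)
            total += ll
            known_at_start += gamma[0][1]
            for t, o in enumerate(obs):
                guess_den += gamma[t][0]
                slip_den += gamma[t][1]
                if o == 1:
                    guess_num += gamma[t][0]
                else:
                    slip_num += gamma[t][1]
            for t in range(len(obs) - 1):
                learn_num += xi[t][0][1]
                learn_den += gamma[t][0]
        log_likelihoods.append(total)
        # M-step: re-estimate each parameter from the expected counts.
        prior = clamp(known_at_start / len(sequences))
        learn = clamp(learn_num / max(learn_den, 1e-12))
        guess = clamp(guess_num / max(guess_den, 1e-12))
        slip = clamp(slip_num / max(slip_den, 1e-12))
    params = {"prior": prior, "learn": learn, "guess": guess, "slip": slip}
    return params, log_likelihoods


# Made-up toy data: one list of responses per student, one response per slice.
toy_sequences = [[0, 0, 1, 1, 1], [0, 1, 1, 1], [1, 1, 1, 1, 1],
                 [0, 0, 0, 1, 1], [0, 1, 0, 1, 1], [1, 0, 1, 1]]
params, log_likelihoods = em_fit(toy_sequences)
print(params)
```

The recorded log-likelihoods should not decrease from one iteration to the next (apart from negligible effects of the clamping), which is a handy sanity check when experimenting with a sketch like this.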

In the temporal network, the 106 skills were split up into their own independent networks due to the intractability of representing all the nodes of the static network in temporal form.

Three questions represented in the static WPI-106

The same questions now with three separate networks in the temporal WPI-106

Hidden Markov Model representation of a generic temporal Bayesian network where the inferred latent skill value from the previous time slice becomes the prior in the next time slice.
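The carry-over described in this caption, where one slice's inferred skill value becomes the next slice's prior, can be illustrated with a short Python sketch. The function name `trace_skill`, the no-forgetting transition, and the example responses are assumptions made for illustration. The parameter values are the averages reported on the poster.

```python
def trace_skill(sessions, prior, learn, guess, slip):
    """Track P(skill known) across tutor sessions for one student and skill.

    `sessions` holds one list of booleans (True = correct) per time slice.
    Returns the estimate at the end of each session and the prior that
    would be carried into the following session.
    """
    p_known = prior
    per_session = []
    for responses in sessions:
        # Within a slice: Bayesian update on each observed response.
        for correct in responses:
            if correct:
                known = p_known * (1.0 - slip)
                unknown = (1.0 - p_known) * guess
            else:
                known = p_known * slip
                unknown = (1.0 - p_known) * (1.0 - guess)
            p_known = known / (known + unknown)
        per_session.append(p_known)
        # Between slices: the posterior becomes the next prior, raised by the
        # chance of learning the skill (assumption: no forgetting).
        p_known = p_known + (1.0 - p_known) * learn
    return per_session, p_known


# Hypothetical student with three sessions on one skill.
estimates, next_prior = trace_skill(
    [[False, True], [True, True], [True]],
    prior=0.30, learn=0.08, guess=0.14, slip=0.09)
print(estimates, next_prior)
```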

Making each skill network independent allows the parameters of all networks to be learned in parallel. The reduced size of each network also speeds up the total computation by an order of magnitude.
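Because the per-skill networks share nothing, fitting them is a simple map over skills. The Python sketch below shows that structure only. `fit_one_skill` is a hypothetical placeholder that computes the fraction of correct responses instead of running a real parameter fit, and the response data are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_one_skill(item):
    """Placeholder for a per-skill parameter fit (for example, an EM run)."""
    skill, responses = item
    fraction_correct = sum(responses) / len(responses)
    return skill, {"fraction_correct": fraction_correct}


def fit_all_skills(responses_by_skill, executor):
    """Fit every skill network independently using the given executor."""
    return dict(executor.map(fit_one_skill, responses_by_skill.items()))


# Made-up responses (1 = correct) for two skill names taken from the poster.
responses_by_skill = {"Addition": [1, 1, 0, 1], "Venn-Diagram": [0, 0, 1, 0]}
with ThreadPoolExecutor(max_workers=2) as pool:
    fitted = fit_all_skills(responses_by_skill, pool)
print(fitted)
```

A thread pool keeps the sketch runnable anywhere. For CPU-bound fitting in CPython, a `ProcessPoolExecutor` exposes the same `map` interface and could be passed in instead.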

Each student’s score on the 29-question, multiple-choice, end of year MCAS test was predicted from their answers on the tutor system. A steady decline can be seen in prediction error rate by model.
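One plausible way to turn per-question probabilities into a test score prediction and an error rate is sketched below in Python. The poster does not define its error measure, so the mean absolute error normalized by test length is an assumption, as are the function names and the example numbers.

```python
def expected_test_score(p_correct_per_question):
    """Expected number of correct answers, given one probability per question."""
    return sum(p_correct_per_question)


def mean_absolute_error_rate(predicted_scores, actual_scores, num_questions):
    """Average absolute score error across students, as a fraction of the test length."""
    total = sum(abs(p - a) for p, a in zip(predicted_scores, actual_scores))
    return total / (len(actual_scores) * num_questions)


# Hypothetical student whose predicted chance of success is 0.5 on all 29 questions.
print(expected_test_score([0.5] * 29))

# Hypothetical predicted and actual scores for two students.
print(mean_absolute_error_rate([14.5, 20.0], [16, 18], num_questions=29))
```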

References

Pardos, Z. A., Heffernan, N. T., Anderson, B. & Heffernan, C. (2007). The effect of model granularity on student performance prediction using Bayesian networks. The International User Modeling Conference 2007.

Pardos, Z., Feng, M., Heffernan, N. T. & Heffernan-Lindquist, C. (2007). Analyzing fine-grained skill models using Bayesian and mixed effect methods. In Luckin & Koedinger (Eds.), Proceedings of the 13th Conference on Artificial Intelligence in Education. IOS Press. pp. 626-628.

Pardos, Z., Heffernan, N., Ruiz, C. & Beck, J. (Draft). Effective Skill Assessment Using Expectation Maximization in a Multi Network Temporal Bayesian Network. In the Proceedings of the 1st Conference on Educational Data Mining. Montreal.