2014UMAP Student Modeling with Reduced Content Models


Transcript of 2014UMAP Student Modeling with Reduced Content Models

Page 1: Doing More with Less: Student Modeling and Performance Prediction with Reduced Content Models

Yun Huang, University of Pittsburgh
Yanbo Xu, Carnegie Mellon University
Peter Brusilovsky, University of Pittsburgh

Page 2: This talk…

- What? More effective student modeling and performance prediction
- How? A novel framework that reduces the content model without loss of quality
- Why? Better and cheaper
  - Content models reduced to 10%–20% of their size while maintaining or improving performance
  - Up to 8% better AUC (a classification metric)
  - Beats expert-based reduction

Page 3: Outline

- Motivation
- Content Model Reduction
- Experiments and Results
- Conclusion and Future Work

Page 4: Motivation

- In advanced learning content, the set of domain knowledge elements (Knowledge Components, KCs) required to complete an item can be very large.
- This complicates modeling by increasing noise and decreasing efficiency.
- We argue that we only need a subset of the most important KCs!

Page 5: Content model

- The focus of our study is the Java programming domain, where each problem involves a complete program.
- Original content model: experts indexed each problem with a set of Java programming concepts, aided by an ontology and a parser.
- In our study, the KCs indexed to a single problem range from 9 to 55!

Page 6: Challenge

- Traditional feature selection focuses on selecting a subset of features for all data points (a domain).
- We instead select important KCs for each item, rather than removing “less important” concepts from the domain model.
- Item level, not domain level.

Page 7: Our intuitions for reduction methods

- Three types of methods, drawing on different information sources and intuitions:
- Intuition 1: “for statement” appears twice in this problem, so it should be important for this problem! “assignment” appears in many problems, so it should be trivial for this problem!
- Intuition 2: When “nested loops” appears, students always get it wrong, so it should be important for this problem!
- Intuition 3: Experts labeled “assignment” and “less than” as prerequisite concepts, and “nested loops” and “for statement” as outcome concepts; outcome concepts should be the important ones for the current problem!

Page 8: Reduction Methods

- Content-based methods
  - A problem = a document, a KC = a word
  - Use the IDF (Inverse Document Frequency) and TF-IDF (Term Frequency–IDF) keyword weighting approaches to compute a KC importance score.
- Response-based method
  - Train a logistic regression (PFA) to predict student responses.
  - Use the coefficient representing the initial easiness of a KC (EASINESS-COEF).
- Expert-based method
  - Use only the OUTCOME concepts as the KCs for an item.
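The content-based scores are standard keyword weighting applied to the content model. A minimal Python sketch of how such scores might be computed (the function name and toy data are ours, not the authors' code):

```python
import math
from collections import Counter

def kc_scores(item_kcs):
    """Score each KC within each item, treating a problem as a document
    and a KC as a word (a sketch of the IDF / TF-IDF intuition).

    item_kcs: dict mapping item id -> list of KCs, with repeats reflecting
    how often a concept occurs in the problem's code.
    """
    n_items = len(item_kcs)
    # Document frequency: in how many items does each KC appear at all?
    df = Counter(kc for kcs in item_kcs.values() for kc in set(kcs))
    # A KC appearing in many items gets a low IDF (trivial for any one item).
    idf = {kc: math.log(n_items / cnt) for kc, cnt in df.items()}
    tfidf = {}
    for item, kcs in item_kcs.items():
        tf = Counter(kcs)  # term frequency within this one item
        tfidf[item] = {kc: tf[kc] * idf[kc] for kc in tf}
    return idf, tfidf
```

Under this weighting, a concept like “assignment” that appears in nearly every problem scores near zero, while a concept concentrated in one problem scores high, matching Intuition 1.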

Page 9: Item-level ranking of KC importance

- For each method, we define a SCORE function assigning a score to a KC in an item.
  - The higher the score, the more important the KC is in that item.
- We then rank KCs by importance score within each item, so that a KC's importance can be differentiated
  - by different score values, and/or
  - by its different ranking positions in different items.

Page 10: Reduction Sizes

- What is the best number of KCs each method should reduce to?
- Reducing non-adaptively to items (TopX): select the x KCs per item with the highest importance scores.
- Reducing adaptively to items (TopX%): select the x% of KCs per item with the highest importance scores.
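Once KCs are scored, both schemes reduce to a single selection step per item. A small sketch (the helper name and the keep-at-least-one rule are our assumptions):

```python
import math

def reduce_item(scores, x=None, pct=None):
    """Keep only the highest-scoring KCs of one item.

    scores: dict KC -> importance score for a single item.
    Pass x for TopX (same count for every item) or pct for TopX%
    (count adapts to how many KCs the item originally had).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if pct is not None:
        x = max(1, math.ceil(len(ranked) * pct / 100))  # keep at least one KC
    return ranked[:x]
```

The TopX% variant is the adaptive one: an item indexed with 55 KCs keeps more concepts than one indexed with 9.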

Page 11: Evaluating Reduction on PFA and KT

- We evaluate by the prediction performance of two popular student modeling and performance prediction models:
  - Performance Factors Analysis (PFA): a logistic regression model predicting student responses
  - Knowledge Tracing (KT): a hidden Markov model predicting student responses and inferring student knowledge levels
- *We select a KT variant that can handle multiple KCs.
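For reference, PFA predicts a response with a logistic regression whose features are the student's prior success and failure counts on each KC of the item; the per-KC intercept is the initial-easiness coefficient that EASINESS-COEF ranks by. A sketch of the standard PFA prediction (variable names are ours):

```python
import math

def pfa_predict(kcs, successes, failures, beta, gamma, rho):
    """Probability of a correct response on an item indexed by `kcs`.

    beta[k]  - initial easiness of KC k (the EASINESS-COEF score),
    gamma[k] - weight of the student's prior successes on KC k,
    rho[k]   - weight of the student's prior failures on KC k.
    """
    logit = sum(beta[k] + gamma[k] * successes[k] + rho[k] * failures[k]
                for k in kcs)
    return 1.0 / (1.0 + math.exp(-logit))
```

Because the logit sums over every KC indexed to the item, each extra KC adds parameters to fit, which is one reason a 55-KC index is costly and noisy.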

Page 12: Outline

- Motivation
- Content Model Reduction
- Experiments and Results
- Conclusion and Future Work

Page 13: Tutoring System

- Data collected from JavaGuide, a tutor for learning Java programming.
- Each question is generated from a template, and students can make multiple attempts.
- [Screenshot: students enter the values of a variable, or the output, for a given piece of Java code]

Page 14: Experimental Setup

- Dataset
  - 19,809 observations, about 69.3% correct
  - 132 students on 94 question templates (items)
  - Each problem is indexed with 9–55 KCs; 124 KCs in total
- Classification metric: Area Under the Curve (AUC); 1 = perfect classifier, 0.5 = random classifier
- Cross-validation: two runs of 5-fold CV; in each fold, 80% of the users are in the train set and the rest are in the test set.
- We report the mean AUC on test sets across the 10 folds, and use the Wilcoxon Signed-Ranks Test (alpha = 0.05) to assess the significance of AUC comparisons.
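AUC here can be read as the probability that a randomly chosen correct response receives a higher predicted score than a randomly chosen incorrect one. A self-contained sketch of that rank-sum formulation:

```python
def auc(labels, scores):
    """AUC via the Mann-Whitney formulation: the fraction of
    (positive, negative) pairs ranked correctly, counting ties as 0.5.

    labels: 1 = correct response, 0 = incorrect; scores: model predictions.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A perfect classifier scores 1.0 and a random one about 0.5, matching the scale quoted on the slide.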

Page 15: Reduction vs. original on PFA

[Figure: mean AUC on PFA (0.67–0.74 range), for TopX% reductions (original, 90% … 10%) and TopX reductions (original, 35 … 1), comparing IDF, TFIDF, EASINESS-COEF, and RANDOM]

- The curves are roughly bell-shaped, with fluctuations.
- Reduction to a moderate size can provide comparable or even better prediction than the original content models.
- Reduction can hurt if the size gets too small (e.g. < 5), possibly because PFA was designed for fitting items with multiple KCs.

Page 16: Reduction vs. original on KT

- Reduction provides a steady, significant gain over a much larger span of sizes and at a larger scale (up to 8% improvement in mean AUC)!
- KT achieves its best performance when the reduction size is small: it may be more sensitive to the size than PFA!
- Our reduction methods select promising KCs: the ones that matter for KT's predictions!

[Figure: mean AUC on KT (0.62–0.71 range), for TopX reductions (original, 35 … 1) and TopX% reductions (original, 90% … 10%), comparing IDF, TFIDF, EASINESS-COEF, and RANDOM]

Page 17: Automatic vs. expert-based (OUTCOME) reduction

- IDF and TFIDF can be comparable to or outperform the OUTCOME method!
- EASINESS-COEF provides a larger gain on KT than on PFA, suggesting PFA coefficients can provide useful extra information for reducing KT content models.

(+/−: significantly better/worse than OUTCOME; *: the optimal mean AUC)

Page 18: Outline

- Motivation
- Content Model Reduction
- Experiments and Results
- Conclusion and Future Work

Page 19:

“Everything should be made as simple as possible, but not simpler.”

-- Albert Einstein

Page 20: Conclusion

- “The content model should be made as simple as possible, but not simpler.”
  - Given the proper reduction size, reduction enables better prediction performance!
- Different models react to reduction differently!
  - KT is more sensitive to reduction than PFA.
  - Different models achieve the best balance between model complexity and model fit in different size ranges.
- We are the first to explore reduction extensively!
  - More ideas for selecting important KCs?
  - What about other domains?

Page 21:

Thank you for listening!

Page 22: Why can RANDOM occasionally be good?

- When the remaining size is relatively large (e.g. > 4 or > 20%), RANDOM can by chance hit one or a few of the important KCs, and then
  - it takes advantage of PFA's logistic regression to adjust the coefficients of the other, unimportant KCs, or
  - it takes advantage of KT to pick out the most important KC in the set by computing the “weakest” KC.
- When the remaining size is relatively small, the proposed methods beat RANDOM more significantly.
- Our proposed methods are not perfect…

(+/−: significantly better/worse than RANDOM; *: the optimal mean AUC)