Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT...

30
Anhui Province Key Laboratory of Big Data Analysis and Application 1 Reporter: Zhenya Huang 43 rd International ACM SIGIR Conference on Research and Development in Information Retrieval VIRTUAL CONFERENCE

Transcript of Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT...

Page 1: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 1

Reporter: Zhenya Huang

43rd International ACM SIGIR Conference on Researchand Development in Information RetrievalVIRTUAL CONFERENCE

Page 2: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 2

Outline

Background1

2 Problem & Overview

Model3

Experiment4

Conclusion & Future work5

Page 3: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 3

BackgroundØCognitive diagnosis for knowledge proficiency

Ø Domain: Education, Recruitment, Sports, Game, etcØ Goal: Evaluating how much students learn about different knowledge

concepts Ø Math subject: Function, Set, Inequality, etc

Ø Fundamental taskØ Evaluation, Testing, Recommendation, etc

Page 4: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 4

BackgroundØLearning activities

Ø Taking courses, Practicing exercises, Taking Tests, etcØ Classroom-based

Ø Rely on expertise of teachersØ Hard to record data

Ø Online learningØ Open environment with computer-aided technologyØ Learning data of students can be recordedØ KhanAcademy, MOOC, etc

Page 5: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 5

BackgroundØCognitive diagnosis Problem

Page 6: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 6

Related workØStatic modeling

Ø IRT: Item Response Theory

Ø DINA:

Ø PMF: Probabilistic Matrix Factorization

Latent trait

Knowledge vector

Latent vector

Page 7: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 7

Related workØDynamic modeling

Ø BKT: Baysian Knowledge TracingØ Hidden Markov ModelØ Tracing for single conceptØ Discrete results (mastered or non-mastered)

Student’s knowledge state

Probability transition matrix

Observation, 0 for wrong, 1 for correct

Page 8: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 8

Related workØDynamic modeling

Ø DKT: Deep Knowledge TracingØ Apply RNNs (LSTM) to model student knowledge over timeØ Tracing all concepts togetherØ Hidden states can represent the latent knowledge states

Knowledge states

DKVMN WWW 2017

DKT-Trees Cognitive Computation 2018

PDKT-C ICDM 2018

DKT+ time factors

WWW 2019

… …

Page 9: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 9

BackgroundØLimitation

Ø Ignoring the dynamic memory factorsØ How can we learn and remember knowledge?Ø Why do we forget what we have learned ?

Ø Lack of interpretabilityØ Don’t know the meaning of latent vectors/ hidden states

Ø Learning records are sparseØ Students practice very few exercises

Page 10: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 10

Outline

Background1

2 Problem & Overview

Model3

Experiment4

Conclusion & Future work5

Page 11: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 11

Problem & OverviewØGiven

Ø Exercising logs as a score tensor: Ø Q-matrix representing exercise-knowledge relation:

ØGoalØ Tracking the change of knowledge proficiency of students from time 1 to T Ø Predicting her proficiency on K concepts and performance scores on

specific exercises at time T + 1

Page 12: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 12

Problem & OverviewØModel overview

Ø KPT: Knowledge Proficiency Tracing modelØ EKPT: Exercise-correlated Knowledge Proficiency model

Page 13: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 13

Outline

Background1

3 Model

Problem & Overview2

Experiment4

Conclusion & Future work5

Page 14: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 14

KPT modelØProbabilistic modeling

Ø For each student and exercise, modeling the responses as:

Ø : proficiency vector of student i, representing how much students learn on K concepts at time t

Ø : knowledge vector of exercise j, denoting the latent correlation between exercise j and K concepts

Ø How to establish the corresponding relationship among students, exercisesand knowledge concepts?

Page 15: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 15

KPT modelØModeling V with Q-matrix prior

Ø Goal: project exercise into knowledge space, enhancing interpretabilityØ Traditional Q-matrix

Ø Denoting exercise-knowledge correlationØ Binary entries: do not fit for probabilistic modeling

Ø Our work assumptionØ If Qjq = 1, then this concept q is more relevant to exercise j than all other

concepts with mark 0

e1, k1, k2e1, k1, k3e2, k3, k1…e5, k1, k2e5, k1, k4

Page 16: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 16

KPT modelØModeling U with learning theories

Ø Goal: explain the dynamic factors in the learning process

Ø Two learning theoriesØ Learning curve: The more exercises she does, the higher

level of proficiency on the related knowledge she will get

Ø Forgetting curve: The longer the time passes, the more knowledge she will forget

Number of practice times

Time interval

Page 17: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 17

EKPT modelØSparsity problem

Ø Students practice very few exercises compared with the huge exercise spaceØ Inaccurate if students just practices few exercises at each time

ØEKPT modelØ Exercise connectivity assumption

Ø Students may get consistent scores on these knowledge-based exercisesØ Learning each exercise vector with its similar ones

Page 18: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 18

EKPT modelØEKPT model

ØModeling V with exercise connectivityØ For exercise j, we define a neighbor set

Ø The knowledge vector of exercise j is influenced by the set:

Ø is the weight influence, which can be any weight function, like

Equal contribution for all neighbor exercicses

Page 19: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 19

ModelØModel Comparasion

KPTEKPT

Page 20: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 20

ModelØModel Learning

KPT EKPT

Page 21: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 21

ModelØApplication

Ø Knowledge Proficiency Estimation

Ø Student Performance Prediction

Ø Diagnosis results explanation and visualization

Page 22: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 22

Outline

Background1

4 Experiment

Problem & Overview2

Model3

Conclusion & Future work5

Page 23: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 23

ExperimentØDataset

ØBaseline

sparse

Static models

Dynamic models

Variants

Ours

Page 24: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 24

ExperimentØ Knowledge Proficiency Estimation

Ø DOA: if a masters better than b on a concept k at time T, then a will have a higher probability to get correct answers to the exercises related to concept kthan b at time T

Ø Our models perform better than baselinesØ EKPT is better than KPT on sparse dataset

Page 25: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 25

ExperimentØ Student Performance Prediction

Ø MAE, RMSE

Ø Dynamic models are better than static onesØ Deep learning based models (DKT) perform not very good

Ø Possible: Time is not longer enough, Data volume may not supportØ Diagnosis results visualization

Ø The student practices many times on K3, knowledge proficiency increasesØ The student practices very few exercises on K4, she may forget what she have learned

Page 26: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 26

ExperimentØModel Analysis

Ø Computational PerformanceØ Though our model needs more time for training, they are competitive compared

with DKT (deep learning based ones)

Ø Parameter sensitivity

Page 27: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 27

ExperimentØModel Analysis

Ø Exercise relationshipØ Exercise with same concepts are grouped together

Page 28: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 28

Outline

Background1

5 Conclusion & Future work

Problem & Overview2

Model3

Experiment4

Page 29: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 29

Conclusion & Future workØConclusion

Ø A focused study on tracking the knowledge proficiency of studentsØ Two explanatory probabilistic models considering different educational

factorsØ Incorporating learning theories for explaining the knowledge changeØ Incorporating Q-matrix for improving the interpretabilityØ Incorporating exercise connectivity property to address sparsity problem

Ø Experiments on different datasets show the both effectiveness and explanatory power of our models

ØFuture workØ Consider different specific modeling for learning and forgetting factorsØ Consider student behaviors and social connections for more precise diagnosisØ Consider different learning scenarios

Ø GameØ Multiple-attempt responseØ Repeated learning

Page 30: Reporter: ZhenyaHuang - USTChome.ustc.edu.cn/~huangzhy/files/slides/ZhenyaHuang-TOIS...KPT EKPT Anhui Province Key Laboratory of Big Data Analysis and Application 21 Model ØApplication

Anhui Province Key Laboratory of Big Data Analysis and Application 30

Thanks for your listening!

[email protected]