Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a...
![Page 1: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/1.jpg)
Week 1, video 2: Regressors
![Page 2: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/2.jpg)
Prediction
- Develop a model which can infer a single aspect of the data (predicted variable) from some combination of other aspects of the data (predictor variables)
- Sometimes used to predict the future
- Sometimes used to make inferences about the present
![Page 3: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/3.jpg)
Prediction: Examples
- A student is watching a video in a MOOC right now.
  - Is he bored or frustrated?
- A student has used educational software for the last half hour.
  - How likely is it that she knows the skill in the next problem?
- A student has completed three years of high school.
  - What will be her score on the college entrance exam?
![Page 4: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/4.jpg)
What can we use this for?
- Improved educational design
  - If we know when students get bored, we can improve that content
- Automated decisions by software
  - If we know that a student is frustrated, let’s offer the student some online help
- Informing teachers, instructors, and other stakeholders
  - If we know that a student is frustrated, let’s tell their teacher
![Page 5: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/5.jpg)
Regression in Prediction
- There is something you want to predict (“the label”)
- The thing you want to predict is numerical
  - Number of hints student requests
  - How long student takes to answer
  - How much of the video the student will watch
  - What will the student’s test score be
![Page 6: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/6.jpg)
Regression in Prediction
- A model that predicts a number is called a regressor in data mining
- The overall task is called regression
![Page 7: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/7.jpg)
Regression
- To build a regression model, you obtain a data set where you already know the answer – called the training label
- For example, if you want to predict the number of hints the student requests, each value of numhints is a training label

| Skill | pknow | time | totalactions | numhints |
|---|---|---|---|---|
| ENTERINGGIVEN | 0.704 | 9 | 1 | 0 |
| ENTERINGGIVEN | 0.502 | 10 | 2 | 0 |
| USEDIFFNUM | 0.049 | 6 | 1 | 3 |
| ENTERINGGIVEN | 0.967 | 7 | 3 | 0 |
| REMOVECOEFF | 0.792 | 16 | 1 | 1 |
| REMOVECOEFF | 0.792 | 13 | 2 | 0 |
| USEDIFFNUM | 0.073 | 5 | 2 | 0 |
| …. | | | | |
![Page 8: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/8.jpg)
Regression
- Associated with each label is a set of “features”, other variables, which you will try to use to predict the label

| Skill | pknow | time | totalactions | numhints |
|---|---|---|---|---|
| ENTERINGGIVEN | 0.704 | 9 | 1 | 0 |
| ENTERINGGIVEN | 0.502 | 10 | 2 | 0 |
| USEDIFFNUM | 0.049 | 6 | 1 | 3 |
| ENTERINGGIVEN | 0.967 | 7 | 3 | 0 |
| REMOVECOEFF | 0.792 | 16 | 1 | 1 |
| REMOVECOEFF | 0.792 | 13 | 2 | 0 |
| USEDIFFNUM | 0.073 | 5 | 2 | 0 |
| …. | | | | |
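One way to picture the split between features and label is to separate each row of the slide's table into the variables used for prediction and the value being predicted. The sketch below is only illustrative: the names `rows`, `FEATURE_NAMES`, `features`, and `labels` are hypothetical and are not part of the lecture.

```python
# Rows copied from the slide's table:
# (skill, pknow, time, totalactions, numhints)
rows = [
    ("ENTERINGGIVEN", 0.704, 9, 1, 0),
    ("ENTERINGGIVEN", 0.502, 10, 2, 0),
    ("USEDIFFNUM", 0.049, 6, 1, 3),
    ("ENTERINGGIVEN", 0.967, 7, 3, 0),
    ("REMOVECOEFF", 0.792, 16, 1, 1),
    ("REMOVECOEFF", 0.792, 13, 2, 0),
    ("USEDIFFNUM", 0.073, 5, 2, 0),
]

# Hypothetical names for the feature columns (everything except numhints).
FEATURE_NAMES = ("skill", "pknow", "time", "totalactions")

# Features: the variables we try to predict from.
features = [dict(zip(FEATURE_NAMES, row[:-1])) for row in rows]

# Labels: the known answers (numhints) for each training row.
labels = [row[-1] for row in rows]

print(features[2], "->", labels[2])
```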
![Page 9: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/9.jpg)
Regression
- The basic idea of regression is to determine which features, in which combination, can predict the label’s value

| Skill | pknow | time | totalactions | numhints |
|---|---|---|---|---|
| ENTERINGGIVEN | 0.704 | 9 | 1 | 0 |
| ENTERINGGIVEN | 0.502 | 10 | 2 | 0 |
| USEDIFFNUM | 0.049 | 6 | 1 | 3 |
| ENTERINGGIVEN | 0.967 | 7 | 3 | 0 |
| REMOVECOEFF | 0.792 | 16 | 1 | 1 |
| REMOVECOEFF | 0.792 | 13 | 2 | 0 |
| USEDIFFNUM | 0.073 | 5 | 2 | 0 |
| …. | | | | |
![Page 10: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/10.jpg)
Linear Regression
- The most classic form of regression is linear regression
- Numhints = 0.12*Pknow + 0.932*Time - 0.11*Totalactions

| Skill | pknow | time | totalactions | numhints |
|---|---|---|---|---|
| COMPUTESLOPE | 0.544 | 9 | 1 | ? |
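Applying a linear regression model to a new row is just a weighted sum of the feature values. The sketch below plugs the COMPUTESLOPE row from this slide into the slide's formula; the function name `predict_numhints` is hypothetical, and the weights are taken directly from the slide.

```python
def predict_numhints(pknow, time, totalactions):
    """Evaluate the slide's linear model for one row of feature values."""
    return 0.12 * pknow + 0.932 * time - 0.11 * totalactions

# Feature values from the COMPUTESLOPE row on this slide.
prediction = predict_numhints(pknow=0.544, time=9, totalactions=1)
print(round(prediction, 2))  # 8.34
```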
![Page 11: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/11.jpg)
Quiz
- Numhints = 0.12*Pknow + 0.932*Time - 0.11*Totalactions
- What is the value of numhints?

A) 8.34
B) 13.58
C) 3.67
D) 9.21
E) FNORD

| Skill | pknow | time | totalactions | numhints |
|---|---|---|---|---|
| COMPUTESLOPE | 0.322 | 15 | 4 | ? |
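For readers who want to check their answer, the arithmetic can be worked through by substituting the row's values into the model on this slide:

$$
\begin{aligned}
\text{Numhints} &= 0.12 \times 0.322 + 0.932 \times 15 - 0.11 \times 4 \\
&= 0.03864 + 13.98 - 0.44 \\
&= 13.57864 \approx 13.58
\end{aligned}
$$

This matches option B.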
![Page 12: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/12.jpg)
Quiz
- Numhints = 0.12*Pknow + 0.932*Time - 0.11*Totalactions
- Which of the variables has the largest impact on numhints? (Assume they are scaled the same)

A) Pknow
B) Time
C) Totalactions
D) Numhints
E) They are equal
![Page 13: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/13.jpg)
However…
- These variables are unlikely to be scaled the same!
- If Pknow is a probability
  - From 0 to 1
  - We’ll discuss this variable later in the class
- And time is a number of seconds to respond
  - From 0 to infinity
- Then you can’t interpret the weights in a straightforward fashion
  - You need to transform them first
![Page 14: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/14.jpg)
Transform
- When you make a new variable by applying some mathematical function to the previous variable
- $X_t = X^2$
![Page 15: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/15.jpg)
Transform: Unitization
- Increases interpretability of relative strength of features
- Reduces interpretability of individual features

$$X_t = \frac{X - M(X)}{SD(X)}$$
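The unitization formula can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `unitize` is hypothetical, and the choice of the sample standard deviation (`statistics.stdev`) is an assumption, since the slide does not say whether SD(X) is the sample or population value. The input is the time column from the earlier table.

```python
from statistics import mean, stdev

def unitize(values):
    """Return (x - M(X)) / SD(X) for every x in values."""
    m = mean(values)
    sd = stdev(values)  # assumption: sample SD; the slide does not specify
    return [(x - m) / sd for x in values]

# Time column from the earlier table (seconds to respond).
times = [9, 10, 6, 7, 16, 13, 5]
unit_times = unitize(times)

# After unitization the variable has mean 0 and standard deviation 1,
# so weights on different unitized features are directly comparable.
print([round(v, 2) for v in unit_times])
```

The unitized values are no longer in seconds, which is the loss of interpretability for individual features that the slide mentions.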
![Page 16: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/16.jpg)
Linear Regression
- Linear regression only fits linear functions…
  - Except when you apply transforms to the input variables
  - Which most statistics and data mining packages can do for you
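To illustrate how a transform lets a linear model fit a curve, the sketch below fits a one-variable least-squares line twice: once on the raw input and once on the squared input. The data are hypothetical (generated from y = 3 + 2x² purely for illustration), and `fit_line` and `sum_squared_error` are hypothetical helper names, not functions from any particular package.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one input)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def sum_squared_error(xs, ys, intercept, slope):
    """Total squared difference between the fitted line and the data."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical curved data: y = 3 + 2 * x^2.
xs = [1, 2, 3, 4, 5, 6]
ys = [3 + 2 * x ** 2 for x in xs]

# Fit on the raw input: a straight line cannot follow the curve.
raw_intercept, raw_slope = fit_line(xs, ys)
raw_error = sum_squared_error(xs, ys, raw_intercept, raw_slope)

# Fit on the transformed input Xt = X^2: the relationship is now linear.
xt = [x ** 2 for x in xs]
intercept, slope = fit_line(xt, ys)
transformed_error = sum_squared_error(xt, ys, intercept, slope)

print(raw_error, transformed_error)
print(intercept, slope)
```

The model is still linear in its weights; only the input variable has changed.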
![Page 17: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/17.jpg)
Ln(X)
[Plot of Ln(X); horizontal axis from -15 to 15, vertical axis from -5 to 3]
![Page 18: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/18.jpg)
Sqrt(X)
[Plot of Sqrt(X); horizontal axis from -15 to 15, vertical axis from 0 to 3.5]
![Page 19: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/19.jpg)
X^2
[Plot of Xt = X^2; horizontal axis from -15 to 15, vertical axis from 0 to 120]
![Page 20: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/20.jpg)
X^3
[Plot of Xt = X^3; horizontal axis from -15 to 15, vertical axis from -1500 to 1500]
![Page 21: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/21.jpg)
1/X
[Plot of 1/X; horizontal axis from -15 to 15, vertical axis from -80 to 80]
![Page 22: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/22.jpg)
Sin(X)
[Plot of Sin(X); horizontal axis from -15 to 15, vertical axis from -1.5 to 1.5]
![Page 23: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/23.jpg)
Linear Regression
- Surprisingly flexible…
- But even without that
  - It is blazing fast
  - It is often more accurate than more complex models, particularly once you cross-validate
    - Caruana & Niculescu-Mizil (2006)
  - It is feasible to understand your model (with the caveat that the second feature in your model is in the context of the first feature, and so on)
![Page 24: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/24.jpg)
Example of Caveat
- Let’s graph the relationship between number of graduate students and number of papers per year
![Page 25: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/25.jpg)
Data
[Plot of papers per year (vertical axis, 0 to 16) against number of graduate students (horizontal axis, 0 to 16)]
![Page 26: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/26.jpg)
Data
[Plot of papers per year (vertical axis, 0 to 16) against number of graduate students (horizontal axis, 0 to 16)]
Too much time spent filling out personnel action forms?
![Page 27: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/27.jpg)
Model
- Number of papers = 4 + 2 * (# of grad students) - 0.1 * (# of grad students)^2
- But does that actually mean that (# of grad students)^2 is associated with less publication?
  - No!
![Page 28: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/28.jpg)
Example of Caveat
- (# of grad students)^2 is actually positively correlated with publications!
  - r = 0.46

[Plot of papers per year (vertical axis, 0 to 16) against number of graduate students (horizontal axis, 0 to 16)]
![Page 29: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/29.jpg)
Example of Caveat
- The relationship is only in the negative direction when the number of graduate students is already in the model…

[Plot of papers per year (vertical axis, 0 to 16) against number of graduate students (horizontal axis, 0 to 16)]
![Page 30: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/30.jpg)
Example of Caveat
- So be careful when interpreting linear regression models (or almost any other type of model)
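The caveat can be checked numerically with a small sketch. It assumes hypothetical, noise-free data generated from the model on the earlier slide for 0 to 15 graduate students, so the correlation it prints will not match the r = 0.46 reported for the lecture's own data. The helper `pearson_r` and the variable names are illustrative only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Weight on the squared term in the slide's model: it is negative.
SQUARED_TERM_WEIGHT = -0.1

# Hypothetical data: 0..15 grad students, papers taken from the model.
grads = list(range(16))
papers = [4 + 2 * g + SQUARED_TERM_WEIGHT * g ** 2 for g in grads]
grads_squared = [g ** 2 for g in grads]

# On its own, the squared term is positively correlated with papers,
# even though its weight inside the model is negative.
r = pearson_r(grads_squared, papers)
print(SQUARED_TERM_WEIGHT, r)
```

The negative weight only describes the squared term's contribution once the linear term is already accounted for, which is why a weight cannot be read in isolation.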
![Page 31: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/31.jpg)
Regression Trees
![Page 32: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/32.jpg)
Regression Trees (non-linear; RepTree)
- If X > 3
  - Y = 2
  - else if X < -7
    - Y = 4
    - Else Y = 3
![Page 33: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/33.jpg)
Linear Regression Trees (linear; M5’)
- If X > 3
  - Y = 2A + 3B
  - else if X < -7
    - Y = 2A - 3B
    - Else Y = 2A + 0.5B + C
![Page 34: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/34.jpg)
Linear Regression Tree
[Plot of papers per year (vertical axis, 0 to 16) against number of graduate students (horizontal axis, 0 to 16)]
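The two kinds of tree from the preceding slides can be written as ordinary branching functions. This is only a sketch of the example rules shown on those slides, not the output of RepTree or M5'; the function names `regression_tree` and `linear_regression_tree` are hypothetical.

```python
def regression_tree(x):
    """Regression tree: each leaf predicts a single constant value."""
    if x > 3:
        return 2
    elif x < -7:
        return 4
    else:
        return 3

def linear_regression_tree(x, a, b, c):
    """Linear regression tree: each leaf holds its own linear model."""
    if x > 3:
        return 2 * a + 3 * b
    elif x < -7:
        return 2 * a - 3 * b
    else:
        return 2 * a + 0.5 * b + c

print(regression_tree(5), regression_tree(-10), regression_tree(0))
print(linear_regression_tree(5, a=1, b=1, c=1))
```

In both cases the split on X picks the branch; the difference is whether the leaf returns a constant or evaluates a linear formula on the other features.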
![Page 35: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/35.jpg)
Later Lectures
- Other regressors
- Goodness metrics for comparing regressors
- Validating regressors
![Page 36: Week 1, video 2: Regressors - University of …...Prediction: Examples ! A student is watching a video in a MOOC right now. ! Is he bored or frustrated?! A student has used educational](https://reader034.fdocuments.in/reader034/viewer/2022042411/5f29a48764ed477af312e704/html5/thumbnails/36.jpg)
Next Lecture
- Classifiers – another type of prediction model