A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... ·...
Transcript of A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... ·...
![Page 1: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/1.jpg)
A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching
Jacob Whitehill1,2
Javier Movellan1,2
1University of California, San Diego (UCSD)2Machine Perception Technologies (www.mptec.com)
Saturday, December 8, 12
![Page 2: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/2.jpg)
Automated teaching machines
• Automated teaching machines, a.k.a. intelligent tutoring systems (ITS), offer the ability to personalize instruction to the individual student.
• ITS offer some of the benefits of 1-on-1 human tutoring at a fraction of the cost.
Saturday, December 8, 12
![Page 3: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/3.jpg)
History of automated teaching• Automated teaching has a 50+ year history:
• 1960s-70s: Stanford researchers (e.g., Atkinson) applied control theory to optimize the learning process for “flashcard”-style vocabulary learning.
Saturday, December 8, 12
![Page 4: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/4.jpg)
History of automated teaching• Automated teaching has a 50+ year history:
• 1980s-90s: John Anderson at CMU started the “cognitive tutor” movementto teach complex skills, e.g.:
• Algebra
• Geometry
• Computer programmingAlgebra Cognitive Tutor
Saturday, December 8, 12
![Page 5: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/5.jpg)
History of automated teaching• Automated teaching has a 50+ year history:
• 2000s-present: cognitive tutors were enhanced with more sophisticated graphics and sound.
• Applications of reinforcement learning to ITS.
Wayang Outpost math tutor
Saturday, December 8, 12
![Page 6: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/6.jpg)
Limited sensors• Over their 50+ year history, one notable feature
about ITS is the limited sensors they use, usually consisting of:
• Keyboard
• Mouse
• Touch screen
Saturday, December 8, 12
![Page 7: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/7.jpg)
Sensors• In contrast, human tutors consider the student’s:
• Speech
• Body posture
• Facial expression
Saturday, December 8, 12
![Page 8: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/8.jpg)
Sensors• In contrast, human tutors consider the student’s:
• Speech
• Body posture
• Facial expression
• It is possible that automated tutors could become more effective if they used richer sensory information.
Saturday, December 8, 12
![Page 9: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/9.jpg)
Affect-sensitive automated teachers
• A hot topic in the ITS community is affect-sensitive automated teaching systems.
• “Affect-sensitive”: use rich sensors to sense and respond to the student’s affective state.
• “Affective state”:
• Student’s motivation, engagement, frustration, confusion, boredom, etc.
Saturday, December 8, 12
![Page 10: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/10.jpg)
Affect-sensitive automated teachers
• Developing an affect-sensitive ITS can be divided into 2 computational problems:
• Perception: how to recognize affective states automatically using affective sensors.
• E.g., how to map image pixels from a webcam into a estimate of the student’s engagement.
Saturday, December 8, 12
![Page 11: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/11.jpg)
Affect-sensitive automated teachers
• Developing an affect-sensitive ITS can be divided into 2 computational problems:
• Perception: how to recognize affective states automatically using affective sensors.
• E.g., how to map image pixels from a webcam into a estimate of the student’s engagement.
• Control: how to use affective state estimates to teach more effectively.
Saturday, December 8, 12
![Page 12: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/12.jpg)
Perception problem• Tremendous progress has been made in machine
learning & vision during last 15 years.
• Real-time automatic face detectors are commonplace.
• Facial expression recognition is starting to become practical.
Saturday, December 8, 12
![Page 13: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/13.jpg)
Control problem• Much less research has addressed how students’
affective state estimates should influence the teacher’s decisions.
Saturday, December 8, 12
![Page 14: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/14.jpg)
Control problem• Much less research has addressed how students’
affective state estimates should influence the teacher’s decisions.
• Thus far, the approaches have been rule-based:
• If student looks frustrated, then:Say: “That was frustrating. Let’s move to something easier.”
(Wayang Outpost Tutor -- Woolf, et al. 2009)
Saturday, December 8, 12
![Page 15: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/15.jpg)
Control problem• So far there is little empirical evidence that affect-
sensitivity is beneficial.
• Comparison of affect-sensitive to affect-blind computer literacy tutor (“AutoTutor”):
Learning gainsLearning gains
Aff.-Sens. Aff.-Blind
Day 1 0.249 0.389
Day 2 0.407 0.377
D’Mello, et al. 2010
Affect-sensitive tutor was less effective on
day 1.
Saturday, December 8, 12
![Page 16: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/16.jpg)
Control problem• Even if rules can be devised for a few scenarios, it
is unlikely that this approach will scale up:
• Multiple sensors, high bandwidth, varying timescales, etc.
Saturday, December 8, 12
![Page 17: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/17.jpg)
Control problem• Even if rules can be devised for a few scenarios, it
is unlikely that this approach will scale up:
• Multiple sensors, high bandwidth, varying timescales, etc.
• Instead, a formal computational framework for decision-making may be useful.
Saturday, December 8, 12
![Page 18: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/18.jpg)
• Stochastic optimal control (SOC) theory may provide such a framework.
• SOC provides:
• Mathematics to define teaching as an optimization problem.
• Computational tools to solve the optimization problem.
Stochastic optimal control
Saturday, December 8, 12
![Page 19: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/19.jpg)
• SOC has well-known computational difficulties:
• Finding exact solutions to SOC problems is usually intractable.
• More research is needed on how to find approximately optimal control policies for automated teaching problems.
• Since the 1960s, a variety of machine learning and reinforcement learning methods have been developed for finding approximately optimal solutions.
Stochastic optimal control
Saturday, December 8, 12
![Page 20: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/20.jpg)
SOC-based ITS• In this talk, I will describe one approach to building an
ITS for language acquisition using approximate methods from SOC.
• Our work draws inspiration from Rafferty, Brunskill, Griffiths, and Shafto (2011).
• I also describe how an SOC-based automated teacher naturally uses affective observations when they are available.
• No ad-hoc rules are necessary.
Saturday, December 8, 12
![Page 21: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/21.jpg)
Teaching word meanings from visual examples
ontbyt
Saturday, December 8, 12
![Page 22: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/22.jpg)
Teaching word meanings from visual examples
ontbyt
Saturday, December 8, 12
![Page 23: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/23.jpg)
Teaching word meanings from visual examples
ontbyt
Saturday, December 8, 12
![Page 24: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/24.jpg)
Teaching word meanings from visual examples
ontbyt (breakfast)
Saturday, December 8, 12
![Page 25: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/25.jpg)
Teaching word meanings from visual examples
• This is the learning approach used in Rosetta Stone language software.
Saturday, December 8, 12
![Page 26: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/26.jpg)
• We wish to teach the meanings of a set of words.
• Each word can mean any one of a set of concepts.
• We have a set of example images.
• At each timestep t, the automated teacher can:
• Teach word j using image k
• Ask student a question about word j
• Give the student a test on all the words in the set
• Teacher’s goal: help student pass the test as quickly as possible.
Teaching task
Saturday, December 8, 12
![Page 27: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/27.jpg)
Teaching task as SOC problem• We pose this teaching task as a SOC problem.
• We use model-based control:
• We develop probabilistic models of how the student learns, and how she responds to questions asked by the teacher.
• We collect data of human students to estimate model parameters.
• Once model is learned, we can optimize the automated teacher using simulation.
Saturday, December 8, 12
![Page 28: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/28.jpg)
Student model
• We model the student as a Bayesian learner, in the manner of Nelson, Tenenbaum and Movellan (2007) for concept learning and Rafferty, et al. (2011) for concept teaching.
• Reduces amount of data needed to fit the model.
Saturday, December 8, 12
![Page 29: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/29.jpg)
C1
Y1
A11 A1n...
W1 Wn...
Timestep 1 Timestep t
Ct
Yt
At1 Atn...
...
Student has a belief P(c | y) about what concept the teacher was trying to convey with the image.
Student model
Saturday, December 8, 12
![Page 30: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/30.jpg)
C1
Y1
A11 A1n...
W1 Wn...
Timestep 1 Timestep t
Ct
Yt
At1 Atn...
...
After t timesteps the student updates her belief:
Student model
mtj.= P (wj | y1:t, a1q1 , . . . , atqt)
Saturday, December 8, 12
![Page 31: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/31.jpg)
• Since a perfectly Bayesian learner is unrealistic (Nelson and Cottrell 2007), we “soften” the model by introducing a “belief update strength” variable βt ∈ (0,1]:
• βt specifies how much the student updates her belief at time t.
• βt may be related to the student’s level of “engagement” in the learning task.
Student inference
Saturday, December 8, 12
![Page 32: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/32.jpg)
• For ask and test actions:
• If student is asked to define the meaning of word j, she responds using probability matching according to mtj.
• Probability matching is a popular response model in psychology (e.g., Movellan and McClelland, 2000).
Student responses
Saturday, December 8, 12
![Page 33: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/33.jpg)
• Let us now consider the problem from the automated teacher’s perspective...
Teacher model
Saturday, December 8, 12
![Page 34: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/34.jpg)
Problem formulation using SOC• State St:
• Student’s knowledge mt of the words’ meanings as well as the belief update strength βt.
St0
0.2
0.4
0.6
0.8
1
man
wom
an boy
girl
eat
drink
milk
breakfast
Saturday, December 8, 12
![Page 35: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/35.jpg)
Problem formulation using SOC• State St:
• The state is assumed to be “hidden” from the teacher because the state is inside the student’s brain.
St0
0.2
0.4
0.6
0.8
1
man
wom
an boy
girl
eat
drink
milk
breakfast
Saturday, December 8, 12
![Page 36: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/36.jpg)
Problem formulation using SOC• Action Ut:
• Teach word j with image k
• Ask word j
• Test
St
Ut
Saturday, December 8, 12
![Page 37: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/37.jpg)
Problem formulation using SOC• Action Ut:
• Ut and St jointly determine the student’s next state St+1 according to the transition dynamicsgiven by the student learning model.
St St+1
Ut
......
Saturday, December 8, 12
![Page 38: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/38.jpg)
Problem formulation using SOC• Observation Ot:
• When the teacher asks a question, it receives a response (“observation”) from the student.
• Ot is determined by St and Ut according to the student response model.
St St+1
Ut
Ot
......
Saturday, December 8, 12
![Page 39: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/39.jpg)
Problem formulation using SOC• Belief Bt:
• The teacher maintains a beliefbt ≐ P(st | o1:t-1, u1:t-1) over the student’s state given the history of actions and observations up to time t.
St St+1
Ut
Ot
......
Saturday, December 8, 12
![Page 40: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/40.jpg)
Problem formulation using SOC• Belief Bt: update from time t to time t+1:
St St+1
Ut
Ot
......
P (st+1 | o1:t, u1:t)
/Z
P (st+1 | st, ut)P (ot | st, ut)P (st | o1:t�1, u1:t�1)dst
Saturday, December 8, 12
![Page 41: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/41.jpg)
Problem formulation using SOC• Belief Bt: update from time t to time t+1:
St St+1
Ut
Ot
......
Prior beliefStudent response likelihood
Student learning dynamics
Posterior beliefP (st+1 | o1:t, u1:t)
/Z
P (st+1 | st, ut)P (ot | st, ut)P (st | o1:t�1, u1:t�1)dst
Saturday, December 8, 12
![Page 42: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/42.jpg)
Problem formulation using SOC• Belief Bt:
• Since St itself is a probability distribution, Bt is a probability distribution over probability distributions.
• We approximate Bt using a finite set of particles.
St St+1
Ut
Ot
......
Saturday, December 8, 12
![Page 43: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/43.jpg)
0
0.2
0.4
0.6
0.8
1
man
wom
an boy
girl
eat
drink
milk
breakfast
Problem formulation using SOC• Reward function r(s,u):
• Teacher may prefer certain states, or certain state+action combinations,over others.
St St+1
Ut
Ot
......
0
0.2
0.4
0.6
0.8
1
man
wom
an boy
girl
eat
drink
milk
breakfast
s
s′
Saturday, December 8, 12
![Page 44: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/44.jpg)
Problem formulation using SOC• Control policy π:
• The teacher chooses its action at time t according to the control policy π.
• π maps the teacher’s belief bt about what the student knows, into an action ut.
Saturday, December 8, 12
![Page 45: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/45.jpg)
Problem formulation using SOC• Control policy π:
• Different policies are better than others, as expressed by their value V:
where τ is the length of the teaching session, measured in # of teacher’s actions.
V (⇡).= E
"⌧X
t=1
r(St, Ut) | ⇡#
Saturday, December 8, 12
![Page 46: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/46.jpg)
Problem formulation using SOC• Control policy π:
• Different policies are better than others, as expressed by their value V:
• An optimal policy π* is a policy that maximizes V:
⇡⇤ .= argmax
⇡V (⇡)
V (⇡).= E
"⌧X
t=1
r(St, Ut) | ⇡#
Saturday, December 8, 12
![Page 47: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/47.jpg)
Computing policies
• Finding π* exactly is intractable.
• Instead, we find an approximately optimal policy using policy gradient to maximize V(π) in simulation using the student model.
Saturday, December 8, 12
![Page 48: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/48.jpg)
Experiment• We created a vocabulary of 10 words from
an artificial language:
Teaching Language to a Bayesian Student by Image Association
Word Meaningduzetuzi manfota womannokidono boymininami girlpipesu dogmekizo catxisaxepe birdbotazi rabbitkoto eatnotesabi drink
Table 1: List of words and associated meanings taught during a word learning experiment.
to 1 and can thus be used as a reasonable estimate of P (c | k) for each concept and imagecombination.
Next, we estimated the time costs and student model parameters from 42 subjectsfrom the Amazon Mechanical Turk. The subjects for this parameter estimation phase weretaught with one of three possible teaching policies described below: a RandomWordTeacher,a HandCraftedTeacher, and OptimizedTeacher. (The parameters of the teachers during thisdata collection phase were set by hand.) Based on these pilot subjects, we computed theaverage time costs to be c(teach) = 10.74, c(ask) = 7.53, and c(test) = 106.46 seconds,respectively. For simplicity, we assumed ↵t,�t were constant across the learning session foreach student.
Finally, we conducted policy gradient on the stochastic logistic policy’s weight vectorsusing the costs and parameters estimated in the previous step. We set the time horizon⌧ = 500 and the learning rate to 0.005. In practice, we found that the choice for the learningrate was important – if it was too high, then policy gradient descent sometimes convergedto a nonsensical solution such as never testing the student at all. We executed gradientdescent for 400 iterations. We call the resultant policy the OptimizedTeacher.
9.1 How the OptimizedTeacher behaves
The parameter matrix {wu}u2U , concatenated row-wise into a matrix, is shown in Figure5. For a vocabulary of n = 10 words there are 21 rows – one “teach” and “ask” action foreach word, plus a “test” action. g(j) represents the goodness of the student’s belief aboutword j as estimated by the teacher’s particles. At run-time, each vector wu is multiplied(inner-product) by the feature vector xt computed from the teacher’s particles, which inturn gives a scalar proportional to the probability of executing action u under the policy.In the figure, dark colors represent low weights, and light colors represent high weights.First, notice that the “bias” feature (last column) for the “test” action (last row) is verydark, i.e., has a very low weight – this means that the teacher tends not to execute “test”actions very often. In addition, notice that the “uncertainty” feature for “test” is also quite
21
Saturday, December 8, 12
![Page 49: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/49.jpg)
Experiment• We collected a set of images from Google Image
Search:
Saturday, December 8, 12
![Page 50: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/50.jpg)
Experiment• To estimate student model parameters as well as
time costs of each action (teach, ask, test), we collected data from human subjects.
• Given the student model and time costs, we used policy gradient to compute π so as to minimize the expected time the student needs to pass the test.
• This control policy constitutes the “SOCTeacher”.
Saturday, December 8, 12
![Page 51: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/51.jpg)
Experiment
• We conducted an experiment on 90 subjects from the Amazon Mechanical Turk.
• Dependent variable: time to pass the test.
Saturday, December 8, 12
![Page 52: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/52.jpg)
Experimental conditions1. SOCTeacher
2. HeuristicTeacher
• Select a word randomly at each round, and teach it using an image sampled according to P(c | y).
• Test every p rounds (p was optimized in simulation).
Saturday, December 8, 12
![Page 53: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/53.jpg)
Results
OptimizedTeacher HandCraftedTeacher RandomWordTeacher500
550
600
650
700
750
800
850
Avg
tim
e t
o f
inis
h (
sec)
Avg time to finish v. teaching strategy
TimeCost(SOCTeacher) is 24% less than TimeCost(HeuristicTeacher) (p < 0.01).
SOCTeacher HeuristicTeacher
Saturday, December 8, 12
![Page 54: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/54.jpg)
Affect while learning• In pilot exploration of students’ affect, we found that
students were usually engaged in the task.
Saturday, December 8, 12
![Page 55: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/55.jpg)
Affect while learning• There were, however, occasional moments of non-
engagement.
Saturday, December 8, 12
![Page 56: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/56.jpg)
How affect could be used• Suppose that the student’s face image zt is
correlated with the student’s belief update strength βt according to P(zt | βt):
• How can this “affective sensor” measurement be used to teach better?
Saturday, December 8, 12
![Page 57: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/57.jpg)
How affect could be used• In an SOC-based automated teacher, the
teacher’s belief update simply gains an additional term:
• The “affective observation” greatly constrains the teacher’s belief of the student’s knowledge.
• Amended belief update emerges naturally from probability theory -- no need for ad-hoc rules.
Affective observation
P (st+1 | o1:t, u1:t)
/Z
P (st+1 | st, ut)P (ot | st, ut)P (zt | �t)P (st | o1:t�1, u1:t�1)dst
Saturday, December 8, 12
![Page 58: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/58.jpg)
0 50 100 150 2000
0.005
0.01
0.015
0.02
0.025
0.03
0.035
0.04
Timestep (t)
Un
cert
ain
ty in
teach
er’
s b
elief
Affect!blindAffect!sensitive
Incorporating affect: simulation
Saturday, December 8, 12
![Page 59: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/59.jpg)
0 50 100 150 2000
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Timestep (t)
Pro
p. o
f stu
den
ts w
ho
passed
test
Affect!blindAffect!sensitive
Incorporating affect: simulation
Saturday, December 8, 12
![Page 60: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/60.jpg)
Summary
• While stochastic optimal control brings with it significant computational challenges, approximate solution methods can be used to create practical ITS.
• SOC provides a principled method of incorporating affective sensor readings into the teaching process.
Saturday, December 8, 12
![Page 61: A Stochastic Optimal Control Perspective on Affect-Sensitive …mozer/Admin/Personalizing... · 2013-01-25 · A Stochastic Optimal Control Perspective on Affect-Sensitive Teaching](https://reader033.fdocuments.in/reader033/viewer/2022060404/5f0eeaec7e708231d44193c0/html5/thumbnails/61.jpg)
Thank you
Saturday, December 8, 12