The interaction plateau, CPI 494, April 9, 2009, Kurt VanLehn
The interaction plateau
CPI 494, April 9, 2009
Kurt VanLehn
1
2
Schematic of a natural language tutoring system, AutoTutor
[Flowchart: Step start; T: Tell; T: Elicit; S: Correct; Step end; S: Incorrect; Remediation: T: Hint or prompt; label: Only if out of hints]
3
Schematic of other natural language tutors, e.g., Atlas, Circsim-Tutor, Kermit-SE
[Flowchart: Step start; T: Tell; T: Elicit; S: Correct; Step end; S: Incorrect; Remediation: a dialogue (T: What is…? S: I don’t know. T: Well, what is… S: … T: …); label: Only if out of hints]
Often called a KCD: Knowledge construction dialogue
4
Hypothesized ranking of tutoring, most effective first
A. Expert human tutors
B. Ordinary human tutors
C. Natural language tutoring systems
D. Step-based tutoring systems
E. Answer-based tutoring systems
F. No tutoring
5
Hypothesized effect sizes
[Bar chart: learning gains, axis 0 to 2.5, for No tutoring, Answer-based tutoring, Step-based tutoring, Nat. lang. tutoring, Ordinary human tutors, Expert human tutors]
6
Hypothesized effect sizes
[Bar chart: learning gains, axis 0 to 2.5, for No tutoring, Answer-based tutoring, Step-based tutoring, Nat. lang. tutoring, Ordinary human tutors, Expert human tutors; additional label: Classroom]
Bloom’s (1984) 2-sigma: 4 weeks of human tutoring vs. classroom
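The "sigma" on these slides is an effect size: a difference between group means expressed in standard deviations. The slides do not say which formula underlies each number, so the following is only a minimal sketch, assuming the common standardized mean difference with a pooled standard deviation (Cohen's d). The function name and the scores are invented for illustration and are not data from any study cited here.

```python
import math
from statistics import mean, stdev


def effect_size(treatment, control):
    """Standardized mean difference (Cohen's d) with a pooled SD.

    Assumption: the talk does not state its formula; this is one common choice.
    """
    n_t, n_c = len(treatment), len(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_t - 1) * stdev(treatment) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


# Invented scores: the tutored group averages two pooled SDs above the control.
tutored = [3, 4, 5]
classroom = [1, 2, 3]
d = effect_size(tutored, classroom)
```

With these made-up scores both groups have a standard deviation of 1 and the means differ by 2, so d is 2, which is what a "2-sigma" advantage would look like under this assumed definition.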
7
Hypothesized effect sizes
[Bar chart: learning gains, axis 0 to 2.5, for No tutoring, Answer-based tutoring, Step-based tutoring, Nat. lang. tutoring, Ordinary human tutors, Expert human tutors; additional label: Classroom]
Kulik (1984) meta-analysis of CAI vs. classroom: 0.4 sigma
8
Hypothesized effect sizes
[Bar chart: learning gains, axis 0 to 2.5, for No tutoring, Answer-based tutoring, Step-based tutoring, Nat. lang. tutoring, Ordinary human tutors, Expert human tutors; additional label: Classroom]
Many intelligent tutoring systems: e.g., Andes (VanLehn et al., 2005), Carnegie Learning’s tutors…
9
My main claim: There is an interaction plateau
[Chart: learning gains, axis 0 to 2.5; series: Expected, Observed]
10
A problem and its steps
Suppose you are running in a straight line at constant speed. You throw a pumpkin straight up. Where will it land?
1. Initially, you and the pumpkin have the same horizontal velocity.
2. Your throw exerts a net force vertically on the pumpkin.
3. Thus causing a vertical acceleration.
4. Which leaves the horizontal velocity unaffected.
5. So when the pumpkin falls, it has traveled the same distance horizontally as you have.
6. Thus, it lands in your hands.
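The chain of reasoning in the steps above can be checked numerically. The following is a minimal sketch, assuming no air resistance, a throw and catch at the same height, and a simple fixed-step integration. The function name, speeds, and time step are illustrative choices and do not come from the slides.

```python
def simulate_throw(runner_speed, throw_speed, g=9.81, dt=1e-4):
    """Return (runner_x, pumpkin_x) at the moment the pumpkin is back at hand height."""
    runner_x = 0.0
    pumpkin_x, pumpkin_y = 0.0, 0.0
    # Step 1: the pumpkin starts with the runner's horizontal velocity.
    vx, vy = runner_speed, throw_speed
    while True:
        # The runner keeps moving in a straight line at constant speed.
        runner_x += runner_speed * dt
        # Steps 2-4: the only force is vertical, so vx never changes.
        pumpkin_x += vx * dt
        vy -= g * dt
        pumpkin_y += vy * dt
        # Stop once the pumpkin has come back down to hand height.
        if pumpkin_y <= 0.0 and vy < 0.0:
            break
    return runner_x, pumpkin_x


runner_x, pumpkin_x = simulate_throw(runner_speed=3.0, throw_speed=5.0)
```

Under these assumptions the two horizontal positions agree at landing, which is steps 5 and 6: the pumpkin comes down where the runner now is, not behind.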
11
A dialogue between a human tutor (T) and human student (S)
Suppose you are running in a straight line at constant speed. You throw a pumpkin straight up. Where will it land?
S: Behind me.
T: Hmm. Let’s think about that. Before you toss the pumpkin and are just carrying it, do you and the pumpkin have the same speed?
S: Yes.
T: Good. When you toss it up, is the net force on it exactly vertical?
S: I’m not sure.
T: You exert a force on the pumpkin, right?
Etc.
12
Schematic of dialogue about a single step
[Flowchart: Step start; T: Tell; T: Elicit; S: Correct; Step end; S: Incorrect; Remediation: T: Hint, or prompt, or explain, or analogy, or …]
13
Comparisons of expert to novice human tutors
[Flowchart: Step start; T: Tell; T: Elicit; S: Correct; Step end; S: Incorrect; T: Hint, or prompt, or explain, or analogy, or …; labels: Novices, Experts]
Experts may have a wider variety
14
Schematic of an ITS handling of a single step
[Flowchart: Step start; T: Tell; S: Correct; Step end; S: Incorrect; T: Hint; label: Only if out of hints]
15
Major differences
Low-interaction tutoring (e.g., CAI)
– Remediation on answer only
Step-based interaction (e.g., ITS)
– Remediation on each step
– Hint sequence, with final “bottom out” hint
Natural tutoring (e.g., human tutoring)
– Remediation on each step, substep, inference…
– Natural language dialogues
– Many tutorial tactics
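The step-based loop described above (remediate each step with a hint sequence that ends in a "bottom out" hint) can be sketched as code. This is a minimal illustration, not how Andes or any other system named in this talk is implemented. The function name, the prompt text, the exact-match answer check, and the returned fields are all hypothetical.

```python
from typing import Callable, List


def tutor_step(correct_answer: str,
               hints: List[str],
               ask_student: Callable[[str], str]) -> dict:
    """Run one step: elicit an answer, then give one hint per wrong answer.

    The last entry in `hints` plays the role of the "bottom out" hint.
    Once the hints are used up, the tutor tells the step and moves on.
    """
    prompt = "What is the next step?"  # hypothetical elicitation prompt
    hints_used = 0
    while True:
        answer = ask_student(prompt)
        if answer == correct_answer:
            # S: Correct, so the step ends.
            return {"solved_by_student": True, "hints_used": hints_used}
        if hints_used == len(hints):
            # Out of hints: the tutor tells the step instead of eliciting it.
            return {"solved_by_student": False, "hints_used": hints_used}
        # S: Incorrect, so remediate with the next hint in the sequence.
        prompt = hints[hints_used]
        hints_used += 1


# A scripted student who needs two hints before answering correctly.
scripted = iter(["behind me", "in front of me", "in my hands"])
result = tutor_step(
    correct_answer="in my hands",
    hints=["Think about horizontal velocity.",
           "Is there any horizontal force?",
           "It lands in your hands."],
    ask_student=lambda prompt: next(scripted),
)
```

Answer-only remediation would call such a loop once per problem, while step-based interaction calls it once per step. The natural-tutoring conditions differ mainly in what replaces the fixed hint list.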
16
Conditions (VanLehn, Graesser et al., 2007)
Natural tutoring
– Expert human tutors
  » Typed
  » Spoken
– Natural language dialogue computer tutors
  » Why2-AutoTutor (Graesser et al.)
  » Why2-Atlas (VanLehn et al.)
Step-based interaction
– Canned text remediation
Low interaction
– Textbook
17
Human tutors (a form of natural tutoring)
[Flowchart: Step start; T: Tell; T: Elicit; S: Correct; Step end; S: Incorrect; T: Hint, or prompt, or explain, or analogy, or …]
18
Why2-Atlas (a form of natural tutoring)
[Flowchart: Step start; T: Tell; T: Elicit; S: Correct; Step end; S: Incorrect; a Knowledge Construction Dialogue]
19
Why2-AutoTutor (a form of natural tutoring)
[Flowchart: Step start; T: Tell; T: Elicit; S: Correct; Step end; S: Incorrect; Hint or prompt]
20
Canned-text remediation (a form of step-based interaction)
[Flowchart: Step start; T: Tell; T: Elicit; S: Correct; Step end; S: Incorrect; Text]
21
Experiment 1: Intermediate students & instruction
[Bar chart: adjusted post-test score, axis 0 to 1, on multiple choice and essay tests, for Human tutors (N=18), Why2-Atlas (N=22), Why2-AutoTutor (N=24), Canned text remediation (N=22)]
22
Experiment 1: Intermediate students & instruction
[Bar chart: adjusted post-test score, axis 0 to 1, on multiple choice and essay tests, for Human tutors (N=18), Why2-Atlas (N=22), Why2-AutoTutor (N=24), Canned text remediation (N=22)]
No reliable differences
23
Experiment 2: AutoTutor > Textbook = Nothing
[Bar chart: adjusted post-test score, axis 0 to 1, on multiple choice and essay tests, for AutoTutor, Textbook, Nothing]
Reliably different
24
Experiments 1 & 2 (VanLehn, Graesser et al., 2007)
[Bar chart: adjusted post-test scores, axis 0 to 1, for Read-only textbook studying, Step-based computer tutoring, Why2-AutoTutor, Why2-Atlas, Human tutoring]
No significant differences
25
Experiment 3: Intermediate students & instruction
[Bar chart: axis 0 to 1; measures: Multiple choice, Near transfer essay, Far transfer essay, Retention multiple choice, Retention essay; conditions: Why2-AutoTutor (N=32), Canned Text Remediation (N=30)]
Deeper assessments
26
Experiment 3: Intermediate students & instruction
[Bar chart: axis 0 to 1; measures: Multiple choice, Near transfer essay, Far transfer essay, Retention multiple choice, Retention essay; conditions: Why2-AutoTutor (N=32), Canned Text Remediation (N=30)]
No reliable differences
27
Experiment 4: Novice students & intermediate instruction
[Bar chart: axis 0 to 1; measures: Multiple choice, Essay; conditions: Spoken human tutoring (N=14), Typed human tutoring (N=20), Canned text remediation (N=20)]
Relearning
28
Experiment 4: Novice students & intermediate instruction
[Bar chart: axis 0 to 1; measures: Multiple choice, Essay; conditions: Spoken human tutoring (N=14), Typed human tutoring (N=20), Canned text remediation (N=20)]
All differences reliable
29
Experiment 5: Novice students & intermediate (but shorter) instruction
[Bar chart: axis 0 to 1; measures: Multiple choice, Near essay, Far essay; conditions: Spoken human tutoring (N=21), Why2-Atlas (N=21), Why2-AutoTutor (N=21), Canned text remediation (N=19)]
Slide annotations: Relearning; Add; Add
30
Experiment 5: Novice students & intermediate instruction
[Bar chart: axis 0 to 1; measures: Multiple choice, Near essay, Far essay; conditions: Spoken human tutoring (N=21), Why2-Atlas (N=21), Why2-AutoTutor (N=21), Canned text remediation (N=19)]
No reliable differences
31
Experiment 5: Low-pretest students only
[Bar chart: axis 0 to 1; measures: Multiple choice, Near essay, Far essay; conditions: Spoken human tutoring (N=9), Why2-Atlas (N=7), Why2-AutoTutor (N=10), Canned text remediation (N=11)]
Aptitude-treatment interaction?
32
Experiment 5, Low-pretest students only
[Bar chart: axis 0 to 1; measures: Multiple choice, Near essay, Far essay; conditions: Spoken human tutoring (N=9), Why2-Atlas (N=7), Why2-AutoTutor (N=10), Canned text remediation (N=11)]
Spoken human tutoring > canned text remediation
33
Experiments 6 and 7 Novice students & novice instruction
[Bar chart: axis 0 to 1; measures: Multiple choice, Fill in the blank, Essay; conditions: Why2-AutoTutor, CTR expt 6, Text only, Why2-Atlas, CTR expt 7]
Was the intermediate text over the novice students’ heads?
34
Experiments 6 and 7 Novice students & novice instruction
[Bar chart: axis 0 to 1; measures: Multiple choice, Fill in the blank, Essay; conditions: Why2-AutoTutor, CTR expt 6, Text only, Why2-Atlas, CTR expt 7]
No reliable differences
35
Interpretation
[Diagram: content complexity of Experiments 1 & 4, Experiments 3 & 5, and Experiments 6 & 7, set against four student groups: high-pretest intermediates, low-pretest intermediates, high-pretest novices, low-pretest novices]
Legend:
– Can follow reasoning only with tutor’s help (ZPD): predict Tutoring > Canned text remediation
– Can follow reasoning without any help: predict Tutoring = Canned text remediation
36
Original research questions
Can natural language tutorial dialog add pedagogical value?
– Yes, when students must study content that is too complex to be understood by reading alone
How feasible is a deep linguistic tutoring system?
– We built it. It’s fast enough to use.
Can deep linguistic and dialog techniques add pedagogical value?
37
When content is too complex to learn by reading alone: Deep > Shallow?
[Bar chart: axis 0 to 1; measures: Multiple choice, Near essay, Far essay; conditions: Spoken human tutoring (N=9), Why2-Atlas (N=7), Why2-AutoTutor (N=10), Canned text remediation (N=11)]
Why2-Atlas is not clearly better than Why2-AutoTutor
38
When to use deep vs. shallow?
Sentence understanding
– Shallow linguistic: LSA, Rainbow, Rappel
– Deep linguistic: Carmel: parser, semantics…
– Use both
Essay/Discourse understanding
– Shallow linguistic: LSA
– Deep linguistic: Abduction, Bnets
– Use deep
Dialog management
– Shallow linguistic: Finite state networks
– Deep linguistic: Reactive planning
– Use locally smart FSA
Natural language generation
– Shallow linguistic: Text
– Deep linguistic: Plan-based
– Use equivalent texts
39
Results from all 7 experiments (VanLehn, Graesser et al., 2007)
Why2: Atlas = AutoTutor
Why2 > Textbook
– No essays
– Content differences
Human tutoring = Why2 = Canned text remediation
– Except when novice students worked with instruction designed for intermediates, then Human tutoring > Canned text remediation
40
Other evidence for the interaction plateau (Evens & Michael, 2006)
[Bar chart: mean gain, axis 0 to 6, for Reading (1993), Reading (1999), Reading (2002), Circsim (1999), Circsim-Tutor (1999), Circsim-Tutor (2002), Human tutors (1999), Human tutors (1993)]
No significant differences
41
Other evidence for the interaction plateau (Reif & Scott, 1999)
[Bar chart: axis 0 to 100, for Untutored, Step-based tutoring, Human tutoring]
No significant differences
42
Other evidence for the interaction plateau (Chi, Roy & Hausmann, in press)
[Bar chart: adjusted deep post-test steps %, axis 0 to 70, for Individuals + video, Individuals + textbook, Pairs + textbook, Pairs + video, Human tutoring]
No significant differences
43
Still more studies where natural tutoring = step-based interaction
Human tutors
1. Human tutoring = human tutoring with only content-free prompting for step remediation (Chi et al., 2001)
2. Human tutoring = canned text during post-practice remediation (Katz et al., 2003)
3. Socratic human tutoring = didactic human tutoring (Rosé et al., 2001a)
4. Socratic human tutoring = didactic human tutoring (Johnson & Johnson, 1992)
5. Expert human tutoring = novice human tutoring (Chae, Kim & Glass, 2005)
Natural language tutoring systems
1. Andes-Atlas = Andes with canned text (Rosé et al., 2001b)
2. Kermit = Kermit with dialogue explanations (Weerasinghe & Mitrovic, 2006)
44
Hypothesis 1: Exactly how tutors remedy a step doesn’t matter much
[Flowchart: Step start; T: Tell; T: Elicit; S: Correct; Step end; S: Incorrect; remediation box labeled: What’s in here doesn’t matter much]
45
Main claim: There is an interaction plateau
[Chart: learning gains, axis 0 to 2.5, expected vs. observed, for Low-interaction instruction, Step-based instruction, Natural tutoring; annotation: Hypothesis 1]
46
Hypothesis 2: Cannot eliminate the step remediation loop
[Flowchart: Step start; T: Tell; T: Elicit; S: Correct; Step end; S: Incorrect; label: Must avoid this]
47
Main claim: There is an interaction plateau
[Chart: learning gains, axis 0 to 2.5, expected vs. observed, for Low-interaction instruction, Step-based instruction, Natural tutoring; annotation: Hypothesis 2]
48
Conclusions
What does it take to make computer tutors as effective as human tutors?
– Step-based interaction
– Bloom’s 2-sigma results may have been due to weak control conditions (classroom instruction)
– Other evaluations have also used weak controls
When is natural language useful?
– For steps themselves (vs. menus, algebra…)
– NOT for feedback & hints (remediation) on steps
49
Future directions for tutoring systems research
Making step-based instruction ubiquitous
– Authoring & customizing
– Novel task domains
Increasing engagement
50
Final thought
Many people “just know” that more interaction produces more learning.
“It ain’t so much the things we don’t know that get us into trouble. It’s the things we know that just ain’t so.” – Josh Billings (a.k.a. Henry Wheeler Shaw)