Post on 23-Aug-2020
Human Learning in Dynamic Environments
Cleotilde (Coty) Gonzalez
Dynamic Decision Making Laboratory
www.cmu.edu/DDMLab
Social and Decision Sciences Department
Carnegie Mellon University
Research supported by the National Science Foundation:
Human and Social Dynamics: Decision, Risk, and Uncertainty
Dynamic Environments
• Combat missions, production scheduling, fire fighting, emergency dispatch, air-traffic control
• Complex
o Number of components: alternatives, events, courses of action, outcomes
o Uncertainty: all possible states of the world and outcomes are unavailable, incomplete, and difficult to imagine
o Constraints: limited time, knowledge, resources, human capacity
• Dynamic Complexity
o Arises from the interactions of components over time
o Environment is autonomous. Everything changes, at many different time scales
o Learning from our actions: feedback delays
Dynamic Decision Making: A Closed-Loop view
[Figure: closed-loop diagram of medical diagnosis linking Symptoms, Hypothesize illnesses, Run tests, Test results, Diagnosis, Treatment, and Health, with delays on the links and an External event acting on the loop.]
Learning in dynamic systems is hard
• People remain suboptimal in these systems even with repeated trials, unlimited time, and performance incentives (Sterman, 1994; Diehl & Sterman, 1995).
• We have difficulty processing feedback. Feedback delay is a problem for learning (Brehmer, 1992; Sterman, 1989).
But… how do we learn in dynamic environments?
• Decision makers recognize typical situations and typical responses. Decision makers use their past knowledge and adapt their strategies “on the fly”.
o Chess studies, Expertise: Chase & Simon, 1973
o Adaptive Decision Making: Payne, Bettman, & Johnson, 1993
o Decision making under uncertainty: “Case-Based Decision Theory”, Gilboa and Schmeidler, 1995
o Theory of automaticity: Logan, 1988
o “Recognition-Primed Decision Making” (RPDM): Intuition, Mental simulations, Klein et al., 1993; Klein, 1998
Pattern recognition is easier if you have experience
Instance Based Learning Theory (Gonzalez, Lerch, & Lebiere, 2003)
• RECOGNITION OF FAMILIAR PATTERNS
o Determining the similarity between a situation and past experience
o Identifying ‘typical’ situations and responses
• ACQUIRING CAUSE-EFFECT KNOWLEDGE
o Accumulation of instances with practice in a task
o Improvement of decision making by bootstrapping on previous knowledge
Implemented in ACT-R (Anderson and Lebiere, 1998)
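The two ideas on this slide, recognizing similar past situations and accumulating instances, can be illustrated with a minimal Python sketch. It is not the ACT-R implementation cited above: the `Instance` class, the exponential similarity function, the default value for unseen decisions, and all numbers are hypothetical choices made only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Instance:
    situation: tuple  # cue values describing the situation (hypothetical encoding)
    decision: str
    outcome: float


def similarity(a, b):
    """Similarity in (0, 1]: 1 for identical situations, decaying with distance."""
    return math.exp(-math.dist(a, b))


def blended_value(memory, situation, decision):
    """Similarity-weighted average of the past outcomes stored for one decision."""
    matches = [inst for inst in memory if inst.decision == decision]
    if not matches:
        return 0.0  # assumed default when there is no experience with this decision
    weights = [similarity(inst.situation, situation) for inst in matches]
    weighted = sum(w * inst.outcome for w, inst in zip(weights, matches))
    return weighted / sum(weights)


def choose(memory, situation, decisions):
    """Pick the decision whose blended past outcome is highest."""
    return max(decisions, key=lambda d: blended_value(memory, situation, d))


# Accumulated experience: each entry is one Situation-Decision-Outcome instance.
memory = [
    Instance((1.0, 0.0), "treat", 10.0),
    Instance((0.9, 0.1), "treat", 8.0),
    Instance((0.0, 1.0), "treat", -5.0),
    Instance((1.0, 0.0), "wait", 2.0),
    Instance((0.0, 1.0), "wait", 6.0),
]

print(choose(memory, (0.95, 0.05), ["treat", "wait"]))  # treat
print(choose(memory, (0.05, 0.95), ["treat", "wait"]))  # wait
```

Adding more instances to `memory` changes the blended values without any change to the decision rule, which is the sense in which practice improves decisions in this sketch.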
IBLT: WHAT do we learn?
[Figure: Situation-Decision-Outcome (S-D-O) instances accumulate over time through a Situation-Decision cycle and an Action-Outcome cycle. Future decisions draw on similarity to past situations and blending of past outcomes, with feedback from the environment.]
IBLT: HOW do we learn?
ACT-R (Anderson & Lebiere, 1998)
The 2x2 levels of ACT-R

              Declarative Memory                               Procedural Memory
Symbolic      Chunks: declarative facts                        Productions: If (cond) Then (action)
Subsymbolic   Activation of chunks (likelihood of retrieval)   Conflict Resolution (likelihood of use)
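As a hedged illustration of the subsymbolic level, the sketch below computes a chunk's base-level activation with the base-level learning equation commonly given in the ACT-R literature, B = ln(sum of (now - t_j)^(-d)) over past uses t_j, and converts activations into retrieval probabilities with a Boltzmann (softmax) rule. The slides do not state these equations; the decay of 0.5 is the conventional default, and the temperature value and the use times are assumptions chosen for illustration.

```python
import math


def base_level_activation(use_times, now, decay=0.5):
    """B = ln(sum over past uses of (now - t)^(-decay)); recent, frequent use raises B."""
    return math.log(sum((now - t) ** -decay for t in use_times))


def retrieval_probabilities(activations, temperature=0.25):
    """Boltzmann (softmax) choice among competing chunks; temperature is an assumed value."""
    top = max(activations)  # subtract the maximum for numerical stability
    exps = [math.exp((a - top) / temperature) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]


# Two chunks used equally often, one recently and one long ago (hypothetical times).
recent = base_level_activation([8.0, 9.0], now=10.0)
old = base_level_activation([1.0, 2.0], now=10.0)
probs = retrieval_probabilities([recent, old])

print(recent > old)         # True: the recently used chunk is more active
print(probs[0] > probs[1])  # True: and therefore more likely to be retrieved
```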
IBLT models compare to human decision making:
• In dynamic resource allocation tasks (Gonzalez et al., 2003)
• In supply chain management control (Martin, Gonzalez & Lebiere, 2004)
• In repeated choice tasks (Lebiere, Gonzalez & Martin, 2007)
• But there is a long way to go to demonstrate the generalizability and utility of IBLT
Decision Making Games (DMGames) used for experimentation
• DMGames embody the essential characteristics of real-world decision environments
o Interactive
o Repeated and interrelated decisions
o External events and team interactions
• Help compress time and space – speed up learning
• Help manipulate experience - learn from simulated cases and on-demand repeated practice
• No risk to individuals and they are FUN.
DMGames used in behavioral research in the DDMLab
• Military Command and Control
• Real-time resource allocation
• Medical Diagnosis
• Supply-Chain Management
• Fire Fighting
MEDIC: Learning tools that represent the dynamics of medical diagnosis (Gonzalez & Vrbin, 2007)
• Concepts adapted from Kleinmuntz (1985):
o Task complexity (numerous diseases and symptoms)
o Disease base rates
o Time pressure
o Test diagnosticity
o Treatment effectiveness
o Treatment risk
• Additions:
o Feedback delays (e.g. receiving test results)
• With the potential for:
o Dynamic diagnostic cues
o Dynamic symptoms
MEDIC demo
Factors that influence learning in dynamic systems
• Time constraints (Gonzalez, 2004)
• Workload (Gonzalez, 2005)
• The similarity and diversity of experiences (Gonzalez and Quesada, 2004; Gonzalez and Madhavan, in preparation)
• Our inherent cognitive abilities (Gonzalez, Thomas and Vanyukov, 2004)
• The type of feedback (Gonzalez, 2005)
• Our difficulty in understanding simple stock and flow structures (Cronin and Gonzalez, 2005; Cronin, Gonzalez and Sterman, 2006; Gonzalez, Sterman and Cronin, in preparation)
Experiment 1: probabilities
• MEDIC incorporated:
o Symptoms-disease associations from 0.1 to 0.9
o Delay in test results
o Time pressure due to patient’s declining health in real-time
o Deterministic treatment needed to be provided
• N = 12 students, paid a flat rate
• Each student resolved 56 cases
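To make "test diagnosticity" concrete, the following sketch applies Bayes' rule to symptom-disease associations of the kind described above. It is an illustration only, not MEDIC's internal model: the disease and symptom names, the probabilities, and the assumption that symptoms are conditionally independent given the disease are all hypothetical.

```python
def posterior(base_rates, likelihoods, observed):
    """Bayes' rule over diseases, assuming conditionally independent symptoms.

    likelihoods[d][s] is P(symptom s present | disease d);
    observed[s] is True if the test found the symptom, False otherwise.
    """
    unnormalized = {}
    for disease, prior in base_rates.items():
        p = prior
        for symptom, present in observed.items():
            p_symptom = likelihoods[disease][symptom]
            p *= p_symptom if present else (1.0 - p_symptom)
        unnormalized[disease] = p
    total = sum(unnormalized.values())
    return {d: p / total for d, p in unnormalized.items()}


# Hypothetical values: "fever" separates the diseases, "cough" does not.
base_rates = {"D1": 0.5, "D2": 0.5}
likelihoods = {
    "D1": {"fever": 0.9, "cough": 0.5},
    "D2": {"fever": 0.1, "cough": 0.5},
}

after_fever = posterior(base_rates, likelihoods, {"fever": True})
after_cough = posterior(base_rates, likelihoods, {"cough": True})
print(after_fever)  # D1 becomes far more likely than D2
print(after_cough)  # beliefs stay at the base rates
```

A test whose probabilities differ sharply across diseases shifts the posterior a lot and is worth running under time pressure; a test with equal probabilities across diseases carries no diagnostic information.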
Results
[Figures: Treatment; Test diagnosticity; Disease base rates; Diagnosticity per disease]
Experiment 1: Conclusions
• Students did learn, though not perfectly
• Showed knowledge of probabilities, tested for the more diagnostic cues, and diagnosed very closely to the real state of the diseases.
• What is the role of feedback and how would that interact with the symptom-probability matrix?
Experiment 2: Probabilities and feedback
• MEDIC:
o Symptomology table: Probability or Certainty
o Either detailed feedback or no feedback
• Participants were assigned to one of four conditions:
o probabilities, full feedback (P1) - 26
o certainty, full feedback (P2) - 30
o certainty, no feedback (P3) - 25
o probabilities, no feedback (P4) - 29
• N = 110. Participants were paid a flat dollar amount
Symptomology tables: Probability and Certainty

Certainty
             Disease 1   Disease 2   Disease 3   Disease 4
Base Rates   0.25        0.25        0.25        0.25
Symptom 1    0.0         0.0         0.0         0.0
Symptom 2    1.0         0.0         0.0         0.0
Symptom 3    1.0         1.0         0.0         0.0
Symptom 4    0.0         0.0         1.0         0.0
Test diagnosticity - probability condition
Test diagnosticity – Certainty condition
Diagnosticity per disease
Experiment 2: Conclusions
• Full feedback was helpful in the probabilistic environment and did not make a difference in the certain environment
• We now know that, with repeated trials, students learn in probabilistic environments with time constraints and feedback delays
• Feedback helps in probabilistic environments
• Probabilistic environments are not the main reason for poor learning in dynamic tasks
Basic Building Blocks of Dynamic Decision Making Tasks
• Stocks (accumulations)
• Flows that increase (Inflow) or decrease (Outflow) the stock
• Feedback Delays & multiple relationships
• Environmental or external effects
• Multiple decisions about flows
These problems of dynamic control over time are important to human life: keeping a healthy weight, bank accounts, company inventory, stress levels, climate change, etc.
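The building blocks above reduce to one update rule: at each time step the stock changes by inflow minus outflow. The sketch below is a minimal illustration using the bank-account example; the function name `simulate_stock` and the deposit and withdrawal figures are hypothetical.

```python
def simulate_stock(initial, inflows, outflows):
    """Return the stock level after each time step: stock += inflow - outflow."""
    stock = initial
    history = []
    for inflow, outflow in zip(inflows, outflows):
        stock += inflow - outflow
        history.append(stock)
    return history


# Hypothetical bank account: steady deposits (inflow), varying withdrawals (outflow).
deposits = [100, 100, 100, 100]
withdrawals = [50, 120, 100, 150]
balance = simulate_stock(1000, deposits, withdrawals)
print(balance)  # [1050, 1030, 1030, 980]
```

The balance falls in the second and fourth periods even though deposits never change, because the stock responds to the net flow rather than to the inflow alone.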
Humans suffer from a poor understanding of accumulation: Stock-Flow failure
Cronin, Gonzalez & Sterman, 2008; Cronin & Gonzalez, 2007; Cronin, Gonzalez and Sterman, 2006; Sweeney & Sterman, 2000; Sterman, 2002
Weight as balance between consumed and expended energy
1. When eaten most?
2. When exercised most?
3. When weight highest?
4. When weight lowest?
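A small worked example shows why these questions are harder than they look. The sketch below treats the cumulative energy balance as a stand-in for weight (a simplifying assumption) and uses hypothetical daily calorie figures. The stock keeps rising as long as intake exceeds expenditure, so it peaks after the day of highest intake, not on it.

```python
from itertools import accumulate

# Hypothetical daily figures; index 0 is the first day.
calories_in = [2200, 2600, 3000, 2600, 2000, 1600, 1200]
calories_out = [2200, 2200, 2200, 2200, 2200, 2200, 2200]

# Net flow per day, and its running total as a proxy for weight change.
net = [eaten - burned for eaten, burned in zip(calories_in, calories_out)]
energy_balance = list(accumulate(net))

day_most_eaten = calories_in.index(max(calories_in))
day_highest_weight = energy_balance.index(max(energy_balance))

print(day_most_eaten)      # 2
print(day_highest_weight)  # 3: the last day on which intake still exceeds expenditure
```

Answering "when was weight highest?" with the day of peak intake is the kind of intuitively appealing but wrong pattern match that the stock-flow studies report.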
Blood glucose level as balance between glucagon and insulin production
1. When most glucagon?
2. When most insulin?
3. When glucose level highest?
4. When glucose level lowest?
Why? (Cronin & Gonzalez, 2007; Cronin, Gonzalez & Sterman, 2008)
• Not an artifact of the graph
• Not due to the form of graphical presentation
• Not due to motivation
• Not due to familiarity with the context
• Stock-Flow failure is one important reason for learning problems in dynamic systems
• Use of heuristics that are intuitively appealing but erroneous
Future work
• Further investigate the correlation heuristic and the Stock-Flow failure
• Use DMGames of Dynamic Stocks and Flows to understand the reasoning and learning problems in dynamic tasks
• Further extend the Instance-Based Learning Theory to other dynamic problems, such as Stock-Flow tasks
• Further investigate ways to identify and overcome the problems in learning in dynamic systems
DDMLab – February,