Influence Diagrams for Robust Decision Making in Multiagent Settings
Transcript of Influence Diagrams for Robust Decision Making in Multiagent Settings
Prashant Doshi, University of Georgia, USA
http://thinc.cs.uga.edu
Yingke Chen, Postdoctoral student
Yifeng Zeng, Reader, Teesside Univ. (previously Assoc. Prof., Aalborg Univ.)
Muthu Chandrasekaran, Doctoral student
Influence diagram
[Influence diagram: decision node A_i, utility node R_i, observation node O_i, state node S]
ID for decision making where the state may be partially observable
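As a concrete sketch of what solving such an ID involves, the snippet below enumerates the actions of a single-agent tiger-style problem; the states, probabilities, and rewards are illustrative assumptions, not values from the talk.

```python
# Minimal sketch: evaluating a one-shot influence diagram by enumeration.
prior = {"TL": 0.5, "TR": 0.5}  # S: tiger behind left (TL) or right (TR) door

# Pr(O_i | S): growl heard on the left (GL) or right (GR), 85% accurate
obs = {("TL", "GL"): 0.85, ("TL", "GR"): 0.15,
       ("TR", "GL"): 0.15, ("TR", "GR"): 0.85}

# R_i(S, A_i): open left (OL), open right (OR), listen (L)
reward = {("TL", "OL"): -100, ("TL", "OR"): 10, ("TL", "L"): -1,
          ("TR", "OL"): 10, ("TR", "OR"): -100, ("TR", "L"): -1}

def best_policy():
    """For each observation, pick the action maximizing expected utility."""
    policy = {}
    for o in ("GL", "GR"):
        post = {s: prior[s] * obs[(s, o)] for s in prior}  # Bayes rule
        z = sum(post.values())
        post = {s: p / z for s, p in post.items()}
        eu = {a: sum(post[s] * reward[(s, a)] for s in post)
              for a in ("OL", "OR", "L")}
        policy[o] = max(eu, key=eu.get)
    return policy

print(best_policy())  # one growl is too little evidence: listen either way
```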
How do we generalize IDs to multiagent settings?
Adversarial tiger problem
Multiagent influence diagram (MAID) (Koller & Milch 01)
MAIDs offer a richer representation for a game and may be transformed into a normal- or extensive-form game. A strategy of an agent is an assignment of a decision rule to every decision node of that agent.
[MAID for the adversarial tiger problem: decision nodes Open-or-Listen_i and Open-or-Listen_j, chance nodes Growl_i, Growl_j and Tiger loc, utility nodes R_i and R_j]
Expected utility of a strategy profile to agent i is the sum of the expected utilities at each of i's decision nodes
A strategy profile is in Nash equilibrium if each agent’s strategy in the profile is optimal given others’ strategies
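The equilibrium condition above can be checked directly on a normal-form reduction of the game; the payoff numbers below are illustrative placeholders, not the actual tiger-game values.

```python
# Sketch: testing whether a pure strategy profile is a Nash equilibrium
# in a two-agent normal-form reduction. Payoffs are illustrative.

# payoff[agent][(a_i, a_j)] = utility to that agent
payoff = {
    "i": {("L", "L"): -1, ("L", "O"): -1, ("O", "L"): 5, ("O", "O"): -50},
    "j": {("L", "L"): -1, ("L", "O"): 5, ("O", "L"): -1, ("O", "O"): -50},
}
actions = ("L", "O")

def is_nash(ai, aj):
    """Each agent's action must be optimal given the other's action."""
    best_i = max(actions, key=lambda a: payoff["i"][(a, aj)])
    best_j = max(actions, key=lambda a: payoff["j"][(ai, a)])
    return (payoff["i"][(ai, aj)] == payoff["i"][(best_i, aj)]
            and payoff["j"][(ai, aj)] == payoff["j"][(ai, best_j)])

print(is_nash("O", "L"))  # True for these illustrative payoffs
```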
Strategic relevance: consider two strategy profiles which differ in the decision rule at D' only. A decision node D strategically relies on another, D', if D's decision rule does not remain optimal in both profiles.
Is there a way of finding all decision nodes that are strategically relevant to
D using the graphical structure?
Yes: s-reachability, analogous to d-separation for determining conditional independence in BNs
Evaluating whether a decision rule at D is optimal in a given strategy profile involves removing decision nodes
that are not s-relevant to D and transforming the decision and utility nodes into chance nodes
What if the agents are using differing models of the same game to make decisions, or are
uncertain about the mental models others are using?
Let agent i believe with probability p that j will listen, and with 1 - p that j will play the best-response decision
Analogously, j believes that i will open a door with probability q, and otherwise play the best response
Network of ID (NID)
Let agent i believe with probability p that j will likely listen, and with 1 - p that j will play the best-response decision
Analogously, j believes that i will mostly open a door with probability q, and otherwise play the best response
[NID for the adversarial tiger problem: a top-level block with edges weighted p and q to Block L (Listen) and Block O (Open)]
Block L: Pr(L, OL, OR) = (0.9, 0.05, 0.05)
Block O: Pr(L, OL, OR) = (0.1, 0.45, 0.45)
(Gal & Pfeffer 08)
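A sketch of how i's prediction of j's action follows from the NID: mix the block CPTs by i's belief p. The Block L and Block O distributions come from the slide; p = 0.7 is an arbitrary illustrative belief.

```python
# Sketch: mixing NID block predictions over (L, OL, OR).
# Agent i believes with probability p that j plays Block L,
# and with 1 - p that j plays the other block.

def mix(p, block_a, block_b):
    """Pointwise mixture of two action distributions."""
    return tuple(p * a + (1 - p) * b for a, b in zip(block_a, block_b))

block_L = (0.9, 0.05, 0.05)   # mostly listen
block_O = (0.1, 0.45, 0.45)   # mostly open a door
pred = mix(0.7, block_L, block_O)
print(pred)  # i's weighted prediction of j's action
```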
Top-level block: MAID
MAID representation for the NID
[MAID representation of the NID: a top-level (TL) MAID with chance nodes Tiger loc^TL, Growl^TL_i, Growl^TL_j, utility nodes R^TL_i, R^TL_j, and decision nodes Open-or-Listen^TL_i, Open-or-Listen^TL_j; Mod[j;D_i] multiplexes i's decision between Open^O and the best-response node BR[i]^TL, and Mod[i;D_j] multiplexes j's decision between Listen^L and BR[j]^TL]
MAIDs and NIDs: rich languages for games based on IDs that model problem structure by exploiting conditional independence
MAIDs and NIDs: the focus is on computing equilibria, which does not allow best response to a distribution of non-equilibrium behaviors
They do not model dynamic games
Generalize IDs to dynamic interactions in multiagent settings
Challenge: Other agents could be updating beliefs and changing strategies
Model node M_{j,l-1}: holds the models of agent j at level l-1
Policy link (dashed arrow): distribution over the other agent's actions given its models
Belief on M_{j,l-1}: Pr(M_{j,l-1} | s)
[Level l I-ID: decision node Open-or-Listen_i, utility node R_i, chance nodes Growl_i and Tiger loc_i, model node M_{j,l-1}, and j's action node Open-or-Listen_j]
Members of the model node: the different chance nodes are solutions of the models m_{j,l-1}
Mod[M_j] represents the different models of agent j
[Model node M_{j,l-1} expanded: Mod[M_j], models m^1_{j,l-1} and m^2_{j,l-1}, action nodes A^1_j and A^2_j, state node S, and decision node Open-or-Listen_j]
m^1_{j,l-1} and m^2_{j,l-1} could be I-IDs, IDs, or simple distributions
The CPT of the chance node A_j is a multiplexer: it assumes the distribution of one of the action nodes (A^1_j, A^2_j) depending on the value of Mod[M_j]
[Diagram: Mod[M_j] selects which of A^1_j, A^2_j feeds A_j]
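Marginalizing the multiplexer over Mod[M_j] yields a mixture of the candidate models' action distributions, as in this sketch; the model solutions a1, a2 and the belief over Mod[M_j] are illustrative assumptions.

```python
# Sketch: the multiplexer CPT of A_j, marginalized over Mod[M_j].

def multiplexer_cpt(mod_dist, action_dists):
    """P(A_j = a) = sum_k P(Mod[M_j] = k) * P(A_j^k = a)."""
    acts = action_dists[0].keys()
    return {a: sum(mod_dist[k] * action_dists[k][a]
                   for k in mod_dist) for a in acts}

a1 = {"L": 1.0, "OL": 0.0, "OR": 0.0}   # solution of model m^1_{j,l-1}
a2 = {"L": 0.2, "OL": 0.4, "OR": 0.4}   # solution of model m^2_{j,l-1}
cpt = multiplexer_cpt({0: 0.5, 1: 0.5}, [a1, a2])
print(cpt)
```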
Could I-IDs be extended over time?
We must address the challenge
[Two time slices of an I-DID: each slice t and t+1 contains decision node A_i, utility node R_i, observation node O_i, state node S, j's action node A_j, and model node M_{j,l-1}; the model nodes of successive slices are connected by the model update link]
Interactive dynamic influence diagram (I-DID)
How do we implement the model update link?
[Implementing the model update link: at time t, model node M^t_{j,l-1} holds models m^{t,1}_{j,l-1} and m^{t,2}_{j,l-1}, whose action nodes A^1_j, A^2_j feed A^t_j and whose observation nodes O^1_j, O^2_j feed O_j; at time t+1, model node M^{t+1}_{j,l-1} holds the four updated models m^{t+1,1}_{j,l-1} through m^{t+1,4}_{j,l-1}, whose action nodes A^1_j through A^4_j feed A^{t+1}_j]
These models differ in their initial beliefs, each of which is the result of j updating its belief given its action and one of its possible observations
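A minimal sketch of that update: each model of j at time t spawns one updated model per observation j may receive, so two models become four. The observation function and beliefs are illustrative, and listening is assumed to leave the tiger where it is.

```python
# Sketch: the model update link for beliefs over the tiger's location.

def update(b, o, obs):
    """Bayes update of j's belief b after hearing observation o."""
    post = {s: b[s] * obs[(s, o)] for s in b}
    z = sum(post.values())
    return {s: p / z for s, p in post.items()}

obs = {("TL", "GL"): 0.85, ("TL", "GR"): 0.15,
       ("TR", "GL"): 0.15, ("TR", "GR"): 0.85}

def expand(models, observations):
    """Each model spawns one updated model per observation."""
    return [update(b, o, obs) for b in models for o in observations]

models_t = [{"TL": 0.5, "TR": 0.5}, {"TL": 0.9, "TR": 0.1}]
models_t1 = expand(models_t, ("GL", "GR"))
print(len(models_t1))  # 2 models x 2 observations = 4
```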
Recap
Prashant Doshi, Yifeng Zeng and Qiongyu Chen, "Graphical Models for Interactive POMDPs: Representations and Solutions", Journal of AAMAS, 18(3):376-416, 2009
Daphne Koller and Brian Milch, "Multi-Agent Influence Diagrams for Representing and Solving Games", Games and Economic Behavior, 45(1):181-221, 2003
Ya'akov Gal and Avi Pfeffer, "Networks of Influence Diagrams: A Formalism for Representing Agents' Beliefs and Decision-Making Processes", Journal of AI Research, 33:109-147, 2008
How large is the behavioral model space?
How large is the behavioral model space?
General definition: a mapping from the agent's history of observations to its actions
How large is the behavioral model space?
The space of mappings from observation histories H to action distributions Δ(A_j) is uncountably infinite
How large is the behavioral model space?
Let's assume computable models: the space becomes countable
A very large portion of the model space is not computable!
Daniel Dennett, Philosopher and Cognitive Scientist
Intentional stance: ascribe beliefs, preferences and intent to explain others' actions (analogous to theory of mind, ToM)
Organize the mental models
Intentional models: e.g., POMDP = ⟨b_j, A_j, T_j, Ω_j, O_j, R_j, OC_j⟩ (using DIDs); BDI, ToM
Subintentional models: e.g., Δ(A_j), finite state controller, plan
Frame (may give rise to recursive modeling)
Finite model space grows as the interaction progresses
Growth in the model space
The other agent may receive any one of |Ω_j| observations
Number of candidate models over time: step 0: |M_j|; step 1: |M_j||Ω_j|; step 2: |M_j||Ω_j|²; ...; step t: |M_j||Ω_j|^t
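The counts can be tabulated directly; the initial model count (2) and observation count (2) below are illustrative values.

```python
# Sketch: the count of candidate models after t steps, |M_j| * |Omega_j|**t.

def model_count(m0, num_obs, t):
    return m0 * num_obs ** t

counts = [model_count(2, 2, t) for t in range(5)]
print(counts)  # [2, 4, 8, 16, 32]
```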
Growth in the model space
Exponential
General model space is large and grows exponentially as the interaction progresses
It would be great if we could compress this space!
Lossless: no loss in value to the modeler
Lossy: flexible loss in value for greater compression
Expansive usefulness of model space compression in many areas:
1. Sequential decision making in multiagent settings using I-DIDs
2. Bayesian plan recognition
3. Games of imperfect information
General and domain-independent approach for compression
Establish equivalence relations that partition the model space and retain representative models from each equivalence class
Approach #1: Behavioral equivalence (Rathnasabapathy et al. 06, Pynadath & Marsella 07)
Intentional models whose complete solutions are identical are considered equivalent
Approach #1: Behavioral equivalence
Behaviorally minimal set of models
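A sketch of pruning to a behaviorally minimal set: models are grouped by their complete solutions, and one representative per distinct solution is kept. The "solver" here is a stand-in lookup table, not an actual I-DID solver.

```python
# Sketch: keep one representative per distinct complete solution.

def behavioral_minimal(models, solve):
    reps = {}
    for m in models:
        reps.setdefault(solve(m), m)  # first model per distinct policy wins
    return list(reps.values())

# Illustrative: models are beliefs; solutions are policies as tuples.
solutions = {0.1: ("L", "L"), 0.5: ("L", "L"), 0.9: ("OL", "L")}
models = [0.1, 0.5, 0.9]
minimal = behavioral_minimal(models, solutions.get)
print(minimal)  # [0.1, 0.9]: model 0.5 is behaviorally equivalent to 0.1
```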
Approach #1: Behavioral equivalence
Lossless
Works even when the intentional models have differing frames
Approach #1: Behavioral equivalence
Impact on I-DIDs in multiagent settings
[Results plots for the multiagent tiger and multiagent MM problems]
Approach #1: Behavioral equivalence
Utilize model solutions (policy trees) to mitigate model growth: model representatives that are not BE may become BE from the next step onwards
Preemptively identify such models and do not update all of them
Thank you for your time
Approach #2: Revisit BE (Zeng et al. 11, 12)
Intentional models whose partial depth-d solutions are identical, and whose vectors of updated beliefs at the leaves of the partial trees are identical, are considered equivalent
Sufficient but not necessary
Lossless if the frames are identical
Approach #2: (ε,d)-Behavioral equivalence
Two models are (ε,d)-BE if their partial depth-d solutions are identical and the vectors of updated beliefs at the leaves of the partial trees differ by at most ε
Example: models are (0.33,1)-BE
Lossy
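A sketch of the pairwise test, assuming beliefs are compared component-wise (the talk's exact belief-distance measure may differ):

```python
# Sketch: an (epsilon, d)-BE test. Each solution is a
# (depth-d policy, leaf-belief vector) pair; all values illustrative.

def eps_d_be(sol1, sol2, eps):
    pol1, b1 = sol1
    pol2, b2 = sol2
    if pol1 != pol2:
        return False          # depth-d behaviors must match exactly
    return all(abs(x - y) <= eps for x, y in zip(b1, b2))

m1 = (("L",), (0.2, 0.8))     # depth-1 policy: listen
m2 = (("L",), (0.4, 0.6))
print(eps_d_be(m1, m2, 0.33))  # True: same policy, beliefs within 0.33
```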
Approach #2: ε-Behavioral equivalence
Lemma (Boyen & Koller 98): the KL divergence between two distributions in a discrete Markov stochastic process reduces or remains the same after a transition, with the mixing rate acting as a discount factor
The mixing rate represents the minimal amount by which the posterior distributions agree with each other after one transition
It is a property of the problem and may be pre-computed
Given the mixing rate and a bound, ε, on the divergence between two belief vectors, the lemma allows computing the depth, d, at which the bound is reached
Compare two solutions up to depth d for equality
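A sketch of deriving d, assuming the Boyen-Koller contraction KL_t <= (1 - F)^t * KL_0 and clamping the F = 0 and F = 1 edge cases; the numeric inputs are illustrative.

```python
# Sketch: computing the comparison depth d from the mixing rate F.
import math

def depth(F, kl0, eps, horizon):
    if F >= 1.0:
        return 1          # beliefs agree after one transition
    if F <= 0.0:
        return horizon    # no contraction: fall back to the horizon
    d = math.ceil(math.log(eps / kl0) / math.log(1.0 - F))
    return max(1, min(d, horizon))

print(depth(0.5, 1.0, 0.1, 10))  # (0.5)**d <= 0.1 first holds at d = 4
```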
Approach #2: ε-Behavioral equivalence
Impact on dt-planning in multiagent settings
[Results plot for the multiagent Concert problem; discount factor F = 0.5]
On a UAV reconnaissance problem in a 5x5 grid, this allows the solution to scale to a 10-step lookahead in 20 minutes
Approach #2: ε-Behavioral equivalence
What is the value of d when a problem exhibits F = 0 or F = 1?
F = 1 implies that the KL divergence is 0 after one step: set d = 1
F = 0 implies that the KL divergence does not reduce: arbitrarily set d to the horizon
Approach #3: Action equivalence (Zeng et al. 09, 12)
Intentional or subintentional models whose predictions (action distributions) at time step t are identical are considered equivalent at t
Approach #3: Action equivalence
Approach #3: Action equivalence
Lossy
Works even when the intentional models have differing frames
Approach #3: Action equivalence
Impact on dt-planning in multiagent settings (multiagent tiger)
AE bounds the model space at each time step to the number of distinct actions
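Grouping models by the action their solutions currently prescribe caps the number of classes at the number of distinct actions, as this sketch shows (model names and prescriptions are illustrative):

```python
# Sketch: action-equivalence classes at a single step t.

def action_classes(models, action_at_t):
    classes = {}
    for m in models:
        classes.setdefault(action_at_t(m), []).append(m)
    return classes

prescribe = {"m1": "L", "m2": "L", "m3": "OL", "m4": "L"}.get
classes = action_classes(["m1", "m2", "m3", "m4"], prescribe)
print(classes)  # {'L': ['m1', 'm2', 'm4'], 'OL': ['m3']}
```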
Approach #4: Influence equivalence (related to Witwicki & Durfee 11)
Intentional or subintentional models whose predictions at time step t influence the subject agent's plan identically are considered equivalent at t
Regardless of whether the other agent opened the left or right door, the tiger resets, thereby affecting the agent's plan identically
Approach #4: Influence equivalence
Influence may be measured as the change in the subject agent's belief due to the action
Groups more models at time step t compared to AE
Lossy
Compression due to approximate equivalence may violate the absolute continuity condition (ACC)
Regain ACC by appending a covering model to the compressed set of representatives
Open questions
N > 2 agents
Under what conditions could equivalent models belonging to different agents be
grouped together into an equivalence class?
Can we avoid solving models by using heuristics for identifying approximately
equivalent models?
Modeling Strategic Human Intent
Yifeng Zeng, Reader, Teesside Univ. (previously Assoc. Prof., Aalborg Univ.)
Yingke Chen, Doctoral student
Hua Mao, Doctoral student
Muthu Chandrasekaran, Doctoral student
Xia Qu, Doctoral student
Roi Ceren, Doctoral student
Matthew Meisel, Doctoral student
Adam Goodie, Professor of Psychology, UGA
Computational modeling of human recursive thinking in sequential games
Computational modeling of probability judgment in stochastic games
Human strategic reasoning ("I think what you think that I think...") is generally hobbled by low levels of recursive thinking (Stahl & Wilson 95, Hedden & Zhang 02, Camerer et al. 04, Ficici & Pfeffer 08)
You are Player I and II is human. Will you move or stay?
[Four-stage alternating-move game; players to move: I, II, I, II; at each stage the mover chooses Move or Stay; the (Payoff for I, Payoff for II) pairs at the successive terminal nodes are (3,1), (1,3), (2,4), (4,2)]
Less than 40% of the sample population performed the rational action!
Thinking about how others think (...) is hard in general contexts
[Same four-stage game, simplified: the payoffs for I at the successive terminal nodes are 0.6, 0.4, 0.2, 0.8, and the payoff for II is 1 minus that decimal]
About 70% of the sample population performed the rational action in this simpler and strictly competitive game
Simplicity, competitiveness and embedding the task in intuitive representations seem to facilitate human reasoning (Flobbe et al. 08, Meijering et al. 11, Goodie et al. 12)
3-stage game
Myopic opponents default to staying (level 0), while predictive opponents think about the player's decision (level 1)
Can we computationally model these strategic behaviors using process models?
Yes! Using a parameterized Interactive POMDP framework
Replace the I-POMDP's normative Bayesian belief update with Bayesian learning that underweights evidence, controlled by a parameter
Notice that the achievement score increases as more games are played, indicating learning of the opponent models; learning is slow and partial
Replace the I-POMDP's normative expected utility maximization with a quantal response model that selects actions proportional to their utilities, controlled by a parameter
Notice the presence of rationality errors in the participants' choices (the action is inconsistent with the prediction); the errors appear to reduce with time
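A sketch of the two parameterized components, with gamma (evidence underweighting) and lambda (quantal response precision) as my own labels for the talk's unnamed parameters; all numbers are illustrative.

```python
# Sketch: underweighted Bayesian learning and quantal response choice.
import math

def underweighted_update(prior, likelihood, gamma):
    """Bayes update with likelihoods raised to gamma in (0, 1]."""
    post = {m: prior[m] * likelihood[m] ** gamma for m in prior}
    z = sum(post.values())
    return {m: p / z for m, p in post.items()}

def quantal_response(utils, lam):
    """Choose actions with probability proportional to exp(lam * utility)."""
    w = {a: math.exp(lam * u) for a, u in utils.items()}
    z = sum(w.values())
    return {a: v / z for a, v in w.items()}

# gamma < 1 keeps the posterior closer to the prior (slow, partial learning)
u = underweighted_update({"myopic": 0.5, "predictive": 0.5},
                         {"myopic": 0.9, "predictive": 0.1}, 0.5)
# a finite lam leaves some probability on the lower-utility action
q = quantal_response({"move": 1.0, "stay": 0.0}, 2.0)
print(u, q)
```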
Underweighting evidence during learning and quantal response for
choice have prior psychological support
Use participants' predictions of the other's actions to learn the evidence-underweighting parameter, and participants' actions to learn the quantal response parameter
Use participants' actions to learn both parameters, and let the quantal response parameter vary linearly
Insights revealed by process modeling:
1. Much evidence that participants did not make rote use of backward induction, and instead engaged in recursive thinking
2. Rationality errors cannot be ignored when modeling human decision making, and they may vary
3. Evidence that participants could be attributing surprising observations of others' actions to their rationality errors
Open questions:
1. What is the impact on strategic thinking if action outcomes are uncertain?
2. Is there a damping effect on reasoning levels if participants need to concomitantly think ahead in time?
Suite of general and domain-independent approaches for compressing agent model
spaces based on equivalence
Computational modeling of human behavioral data pertaining to strategic thinking
2. Bayesian plan recognition under uncertainty
The plan recognition literature has paid scant attention to finding general ways of reducing the set of feasible plans (Carberry 01)
3. Games of imperfect information (Bayesian games)
Real-world applications often involve many player types. Examples: ad hoc coordination in a spontaneous team; an automated Poker player agent
3. Games of imperfect information (Bayesian games)
Real-world applications often involve many player types
Model space compression facilitates equilibrium computation