Design of Multi-Agent Systems
description
Transcript of Design of Multi-Agent Systems
![Page 1: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/1.jpg)
Design of Multi-Agent Systems
TeacherBart Verheij
Student assistantsAlbert HankelElske van der Vaart
Web sitehttp://www.ai.rug.nl/~verheij/teaching/dmas/(Nestor contains a link)
![Page 2: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/2.jpg)
Student presentations
Week 37
* C. Jonker et al. (2002). BDI-Modelling of Intracellular Dynamics.
Joris Ijsselmuiden
* R. Wulfhorst et al. (2003). A Multiagent Approach for Musical Interactive Systems.
Rosemarijn Looije
* M. Dastani, J. Hulstijn, F. Dignum, J.-J.Ch. Meyer (2004). Issues in Multiagent System Development.
Sander van Dijk
![Page 3: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/3.jpg)
Week 38
* W. C. Stirling, M. A. Goodrich and D. J. Packard (2002). Satisficing Equilibria: A Non-Classical Theory of Games and Decisions.
Dimitri Vrehen
* A. Bazzan and R.H. Bordini (2001). A framework for the simulation of agents with emotions. Report on Experiments with the Iterated Prisoner's Dilemma.
Stijn Colen
* I. Dickinson and M. Wooldridge (2003). Towards Practical Reasoning Agents for the Semantic Web.
* E. Norling (2004). Folk Psychology for Human Modelling: Extending the BDI Paradigm.
Student presentations
![Page 4: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/4.jpg)
Some practical matters
Please submit exercises to [email protected].
Please use naming conventions for file names and message subjects.
Please read your student mail.
![Page 5: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/5.jpg)
Overview
IntroductionEvaluation criteria & equilibria
Social welfarePareto efficiencyNash equilibria
The Prisoner’s DilemmaLoose end: dominant strategies
Not or differentin the book
![Page 6: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/6.jpg)
Typical structure of a multi-agent system
![Page 7: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/7.jpg)
Interactions
Communication Influence on environment (‘spheres of
influence’) Organizations, communities, coalitions Hierarchical relations Cooperation, competition
![Page 8: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/8.jpg)
Utilities & preferences
How to measure the results of a multi-agent systems? In terms of preferences and utilities.
Some notation:={1,2, … } ‘outcomes’, future environmental statesgroup preferences (assumes cooperation)individual preferences
''''''
iiii ''''''
![Page 9: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/9.jpg)
Preferences
Strict preferences
PropertiesReflexive: Transitive:Comparable:
'not and ' ifonly and if '
i: allfor '' then ''' and ' if iii
ii ' of ':', allfor
![Page 10: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/10.jpg)
Utilities
According to utility theory, preferences can be measured in terms of real numbers
Example: moneyBut money isn’t always the right measure: think of the subjective value of a million dollars when you have nothing or when you are Bill Gates.
' ifonly and if )'()(:
iii uuu
R
![Page 11: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/11.jpg)
Utility & money
![Page 12: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/12.jpg)
Zero-sum & constant-sum games
Simplification: two agentsConstant sum gamesThe sum of all players' payoffs is the same for any outcome.
ui() + uj() = C for all Zero-sum gamesAll outcomes involve a sum of the players’ payoffs of 0:
ui() + uj() = 0 for all
Chess
0, ½, 1
-½, 0, ½
![Page 13: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/13.jpg)
Zero-sum & constant-sum games
One agent’s gain is another agent’s loss.
Zero-sum games are necessarily always competitive.
But there are many non-zero sum situations.
![Page 14: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/14.jpg)
Overview
IntroductionEvaluation criteria & equilibria
Social welfarePareto efficiencyNash equilibria
The Prisoner’s DilemmaLoose end: dominant strategies
![Page 15: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/15.jpg)
Kinds of evaluation criteria & equilibria
Social welfare
Pareto efficiency
Nash equilibrium
![Page 16: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/16.jpg)
Social welfare
Social welfare measures the sum of all individual outcomes.
Optimal social welfare may not be achievable when individuals are self-interested
Individual agents follow their own (different) utility function.
![Page 17: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/17.jpg)
Example 1
Agent a2
Strategy s2,1 s2,2
s1,1 (5,6) (4,3)
a1
s1,2 (1,2) (6,4)
highest social welfare
![Page 18: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/18.jpg)
Overview
IntroductionEvaluation criteria & equilibria
Social welfarePareto efficiencyNash equilibria
The Prisoner’s DilemmaLoose end: dominant strategies
![Page 19: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/19.jpg)
Pareto efficiency or optimality
An outcome is Pareto optimal if a better outcome for one agent always results in a worse outcome for some other agent
When all agents pursue social welfare, highest social welfare is Pareto optimal. However, a Pareto optimal outcome need not be desirable. E.g., dictatorship
Pareto improvement: change that is an improvement for someone without hurting anyone
![Page 20: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/20.jpg)
Example 1
Agent a2
Strategy s2,1 s2,2
s1,1 (5,6) (4,3)
a1
s1,2 (1,2) (6,4)
Pareto efficient
Pareto improvements
![Page 21: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/21.jpg)
Overview
IntroductionEvaluation criteria & equilibria
Social welfarePareto efficiencyNash equilibria
The Prisoner’s DilemmaLoose end: dominant strategies
![Page 22: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/22.jpg)
Nash equilibrium
Two strategies s1 and s2 are in Nash equilibrium if:1. under the assumption that agent i plays s1, agent j can do no
better than play s2; and2. under the assumption that agent j plays s2, agent i can do no
better than play s1.
No individual has the incentive to unilaterally change strategyExample: driving on the right side of the road
Nash equilibria do not always exist and are not always unique
![Page 23: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/23.jpg)
Example 1
Agent a2
Strategy s2,1 s2,2
s1,1 (5,6) (4,3)
a1
s1,2 (1,2) (6,4)
Nash equilibria
‘Nashincentives’
![Page 24: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/24.jpg)
Example 1
Agent a2
Strategy s2,1 s2,2
s1,1 (5,6) (4,3)
a1
s1,2 (1,2) (6,4)
outcomes corresponding to strategies in Nash
equilibrium
![Page 25: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/25.jpg)
Example 2
Agent a2
Strategy s2,1 s2,2
s1,1 (3,6) (5,3)
a1
s1,2 (6,2) (2,5)
no Nash equilibrium
![Page 26: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/26.jpg)
Example 3 unique Nash equilibrium
Agent a2
Strategy s2,1 s2,2
s1,1 (1,1) (5,0)
a1
s1,2 (0,5) (3,3)
![Page 27: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/27.jpg)
Example 3 unique Nash equilibrium
Agent a2
Strategy s2,1 s2,2
s1,1 (1,1) (5,0)
a1
s1,2 (0,5) (3,3)
highest social welfare & Pareto efficient
![Page 28: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/28.jpg)
Overview
IntroductionEvaluation criteria & equilibria
Social welfarePareto efficiencyNash equilibria
The Prisoner’s DilemmaLoose end: dominant strategies
![Page 29: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/29.jpg)
The Prisoner’s Dilemma
Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating. They are told that:
– if one confesses and the other does not, the confessor will be freed, and the other will be jailed for three years
– if both confess, then each will be jailed for two years
Both prisoners know that if neither confesses, then they will each be jailed for one year
![Page 30: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/30.jpg)
The Prisoner’s Dilemma
The prisoners can either defect or cooperate.
The rational action for each individual prisoner is to defect.
Example 3 is a prisoner’s dilemma (but note that it tables utilities, not prison years: less years in prison has a higher utility).
Real life: nuclear arms reduction, free riders
![Page 31: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/31.jpg)
The Prisoner’s Dilemma
The Prisoner’s Dilemma is the fundamental problem of multi-agent interactions.
It appears to imply that cooperation will not occur in societies of self-interested agents.
![Page 32: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/32.jpg)
Recovering cooperation ...
Conclusions that some have drawn from this analysis:
– the game theory notion of rational action is wrong!
– somehow the dilemma is being formulated wrongly
Arguments to recover cooperation:– We are not all Machiavelli!– The other prisoner is my twin!– The shadow of the future…
![Page 33: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/33.jpg)
The Iterated Prisoner’s Dilemma
One answer: play the game more than once
If you know you will be meeting your opponent again, then the incentive to defect appears to evaporate
When you now how many times you’ll meet your opponent, defection is again rational
![Page 34: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/34.jpg)
Axelrod’s tournament
Suppose you play iterated prisoner’s dilemma against a range of opponents…What strategy should you choose, so as to maximize your overall payoff?
Axelrod (1984) investigated this problem, with a computer tournament for programs playing the prisoner’s dilemma
![Page 35: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/35.jpg)
Strategies in Axelrod’s tournament
ALL-D:Always defect
TIT-FOR-TAT:At the first meeting of an opponent: cooperate.
Then do what your opponent did on the previous meeting
TESTER:First: defect. If the opponent retaliates, play TIT-
FOR-TAT. Otherwise intersperse cooperation and defection.
JOSS:As TIT-FOR-TAT, except periodically defect
![Page 36: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/36.jpg)
Reasons for TIT-FOR-TAT’s success
– Don’t be envious:Don’t play as if it were zero sum!
– Be nice:Start by cooperating, and reciprocate cooperation
– Retaliate appropriately:Always punish defection immediately, but use “measured” force — don’t overdo it
– Don’t hold grudges:Always reciprocate cooperation immediately
![Page 37: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/37.jpg)
Overview
IntroductionEvaluation criteria & equilibria
Social welfarePareto efficiencyNash equilibria
The Prisoner’s DilemmaLoose end: dominant strategies
![Page 38: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/38.jpg)
Dominant strategy
A strategy is dominant for an agent if it is the best under all circumstances
Dominant strategy equilibrium: each agent uses a dominant strategy
A dominant strategy equilibrium is always a Nash equilibrium (but there are ‘more’ of the latter).
![Page 39: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/39.jpg)
Example 4
(2,3)(1,2)s1,2
a1
(4,5)(2,3)s1,1
s2,2s2,1Strateg
y
a2Agen
t
Dominant for a1
Dominant for a2
![Page 40: Design of Multi-Agent Systems](https://reader035.fdocuments.in/reader035/viewer/2022062411/56816841550346895dde1340/html5/thumbnails/40.jpg)
Just to play with: new roads
- There are 6 cars going from A to D each day.
- (A,B) and (C,D) are highways
time(c) = 5 + 2c, where c is the number of cars
- (B,D) and (A,C) are local roads
time(c) = 20 + c
A
B
C
D
What will happen when a new highway is made between B
and C?