Chapter 17: Making Complex Decisions
April 1, 2004
17.6 Decisions With Multiple Agents: Game Theory
• Assume that agents make simultaneous moves
• Assume that the game is a single move game.
Uses
• Agent Design (2 finger Morra)
• Mechanism Design
Game Components
• Players
• Actions
• Payoff Matrix e.g. rock-paper-scissors
Terminology
• Pure Strategy – deterministic policy
• Mixed Strategy – randomized policy, [p: a; (1-p): b]
• Outcome – result of game
• Solution – a strategy profile in which each player adopts a rational strategy
Prisoner’s Dilemma
                B testifies          B refuses
A testifies     A = -5, B = -5       A = 0, B = -10
A refuses       A = -10, B = 0       A = -1, B = -1
Terminology
• (testify, testify) is a dominant strategy equilibrium: testify is a dominant strategy for each player
• s strongly dominates s’ – s is better than s’ for every choice of strategies by the other players
• s weakly dominates s’ – s is better than s’ for at least one choice of strategies by the other players and no worse for any of the rest
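The two dominance definitions can be checked directly against the Prisoner's Dilemma payoffs. The following is a minimal sketch from player A's point of view; the names `PAYOFF`, `ACTIONS`, `strongly_dominates`, and `weakly_dominates` are illustrative and do not come from the slides.

```python
# PAYOFF[(a_action, b_action)] = (A's payoff, B's payoff), taken from the table above.
PAYOFF = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"): (0, -10),
    ("refuse", "testify"): (-10, 0),
    ("refuse", "refuse"): (-1, -1),
}
ACTIONS = ["testify", "refuse"]


def strongly_dominates(s, s_prime):
    """True if A does strictly better with s than with s_prime against every B action."""
    return all(PAYOFF[(s, b)][0] > PAYOFF[(s_prime, b)][0] for b in ACTIONS)


def weakly_dominates(s, s_prime):
    """True if s is never worse than s_prime for A and strictly better at least once."""
    diffs = [PAYOFF[(s, b)][0] - PAYOFF[(s_prime, b)][0] for b in ACTIONS]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)
```

With these payoffs, `strongly_dominates("testify", "refuse")` holds for A, and by symmetry the same is true for B.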
Terminology
• An outcome is Pareto optimal if there is no other outcome that all players would prefer
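• An equilibrium is a strategy profile in which no player can benefit by switching strategies, given that every other player keeps the same strategy
• Nash showed that every finite game has at least one equilibrium, possibly in mixed strategies
• Prisoner’s Dilemma: (testify, testify) is an equilibrium, yet both players would prefer (refuse, refuse)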
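The tension in the Prisoner's Dilemma can be made concrete by testing each outcome against the slide's definition of Pareto optimality (no other outcome that all players would prefer). This is a small sketch; `PAYOFF` and `is_pareto_optimal` are illustrative names, and "prefer" is assumed to mean a strictly higher payoff.

```python
# PAYOFF[(a_action, b_action)] = (A's payoff, B's payoff) for the Prisoner's Dilemma.
PAYOFF = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"): (0, -10),
    ("refuse", "testify"): (-10, 0),
    ("refuse", "refuse"): (-1, -1),
}


def is_pareto_optimal(outcome, payoff):
    """True if no other outcome gives every player a strictly higher payoff."""
    current = payoff[outcome]
    for other, utilities in payoff.items():
        if other == outcome:
            continue
        if all(u_new > u_old for u_new, u_old in zip(utilities, current)):
            return False
    return True


# Outcomes that fail the test; here only the equilibrium (testify, testify) does.
NOT_PARETO_OPTIMAL = [o for o in PAYOFF if not is_pareto_optimal(o, PAYOFF)]
```

Under this reading, the only outcome that is not Pareto optimal is the dominant strategy equilibrium itself.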
Example: Two Nash Equilibria
no dominant strategy!
            B: dvd              B: cd
A: dvd      A = 9, B = 9        A = -4, B = -1
A: cd       A = -1, B = -4      A = 5, B = 5
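The two equilibria in this example can be found by brute force: test every pure strategy profile and keep those where neither player gains by deviating alone. This is a sketch under the payoffs listed above; `pure_nash_equilibria` is an illustrative helper, not something named in the slides.

```python
from itertools import product

ACTIONS = ["dvd", "cd"]
# PAYOFF[(a_action, b_action)] = (A's payoff, B's payoff)
PAYOFF = {
    ("dvd", "dvd"): (9, 9),
    ("dvd", "cd"): (-4, -1),
    ("cd", "dvd"): (-1, -4),
    ("cd", "cd"): (5, 5),
}


def pure_nash_equilibria(payoff, actions):
    """Return every pure strategy profile from which no single player gains by deviating."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        u_a, u_b = payoff[(a, b)]
        # A deviates while B's action stays fixed, and vice versa.
        a_can_improve = any(payoff[(alt, b)][0] > u_a for alt in actions)
        b_can_improve = any(payoff[(a, alt)][1] > u_b for alt in actions)
        if not a_can_improve and not b_can_improve:
            equilibria.append((a, b))
    return equilibria


EQUILIBRIA = pure_nash_equilibria(PAYOFF, ACTIONS)
```

The search returns (dvd, dvd) and (cd, cd). Neither action is best against both of the opponent's actions, which is why there is no dominant strategy.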
Von Neumann’s Maximin
• zero sum game
• E maximizer (2 finger Morra)
• O minimizer (2 finger Morra)
• U(E = 1, O = 1) = 2
• U(E = 1, O = 2) = -3
• U(E = 2, O = 1) = -3
• U(E = 2, O = 2) = 4
Maximin
• E reveals strategy, moves first
• [p: one; 1-p: two]
• O chooses based on p
• O plays one: E's expected payoff is 2p - 3(1-p)
• O plays two: E's expected payoff is -3p + 4(1-p)
• p = 7/12
• U_{E,O} = -1/12
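The step from the two expected payoffs to p = 7/12 can be spelled out. O, moving second, picks whichever of the two expressions is smaller, so E does best where they are equal (one rises with p and the other falls). A worked version, using only the symbols on the slide:

```latex
\begin{align*}
2p - 3(1-p) &= -3p + 4(1-p) \\
5p - 3 &= 4 - 7p \\
12p &= 7 \\
p &= \tfrac{7}{12} \\
U_{E,O} &= 5 \cdot \tfrac{7}{12} - 3 = \tfrac{35}{12} - \tfrac{36}{12} = -\tfrac{1}{12}
\end{align*}
```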
Maximin
• O reveals strategy, moves first
• [q: one; 1-q: two]
• E chooses based on q
• E plays one: E's expected payoff is 2q - 3(1-q)
• E plays two: E's expected payoff is -3q + 4(1-q)
• q = 7/12
• U_{O,E} = -1/12
Maximin
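• [7/12: one; 5/12: two], played by both players, is the maximin equilibrium, which is also a Nash equilibrium
• A maximin equilibrium always exists in a two-player zero-sum game when mixed strategies are allowed
• The value of the game, -1/12 to E, is the maximin outcome for both players.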
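The Morra numbers can be reproduced with exact fractions. The sketch below uses the closed-form solution for a 2x2 zero-sum game and assumes the game has no pure-strategy equilibrium, so that each player is indifferent at the solution. `solve_2x2_zero_sum` and `expected_payoff` are illustrative names.

```python
from fractions import Fraction

# Payoffs to E in two-finger Morra: U[e][o], index 0 means "one", index 1 means "two".
U = [[Fraction(2), Fraction(-3)],
     [Fraction(-3), Fraction(4)]]


def solve_2x2_zero_sum(u):
    """Solve a 2x2 zero-sum game, assuming it has no pure-strategy equilibrium.

    Returns (p, q, value): p is the row player's probability of row 0,
    q is the column player's probability of column 0, and value is the
    row player's expected payoff at the solution.
    """
    denom = u[0][0] - u[0][1] - u[1][0] + u[1][1]
    p = (u[1][1] - u[1][0]) / denom  # makes the column player indifferent
    q = (u[1][1] - u[0][1]) / denom  # makes the row player indifferent
    value = (u[0][0] * u[1][1] - u[0][1] * u[1][0]) / denom
    return p, q, value


def expected_payoff(p, q, u):
    """Row player's expected payoff when row 0 has probability p and column 0 has probability q."""
    return (p * q * u[0][0] + p * (1 - q) * u[0][1]
            + (1 - p) * q * u[1][0] + (1 - p) * (1 - q) * u[1][1])


P, Q, VALUE = solve_2x2_zero_sum(U)
```

Evaluating `expected_payoff(P, q, U)` for any q gives the same value, which is the sense in which E's mixture leaves O nothing to exploit.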
Repeated Move Games
• Application: packet collision in an Ethernet network
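• Prisoner’s Dilemma – fixed number of rounds – no change: by backward induction from the last round, both players still testify every time
• Prisoner’s Dilemma – variable number of rounds (e.g. 99% chance of meeting again) – cooperation becomes possible: perpetual punishment, tit-for-tat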
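The two repeated-game strategies named above are easy to state as code. This sketch assumes the usual definitions: tit-for-tat starts with refuse and then copies the opponent's previous move, and perpetual punishment refuses until the opponent testifies once and testifies forever after. The round count is passed in explicitly to keep the run deterministic; the function names are illustrative.

```python
# PAYOFF[(a_action, b_action)] = (A's payoff, B's payoff) for one round.
PAYOFF = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"): (0, -10),
    ("refuse", "testify"): (-10, 0),
    ("refuse", "refuse"): (-1, -1),
}


def tit_for_tat(my_history, their_history):
    """Refuse first, then echo the opponent's previous move."""
    return their_history[-1] if their_history else "refuse"


def perpetual_punishment(my_history, their_history):
    """Refuse until the opponent has testified once, then testify forever."""
    return "testify" if "testify" in their_history else "refuse"


def always_testify(my_history, their_history):
    """The single-round dominant strategy, repeated every round."""
    return "testify"


def play(strategy_a, strategy_b, rounds):
    """Play the repeated game and return the total payoffs (A, B)."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        u_a, u_b = PAYOFF[(a, b)]
        total_a += u_a
        total_b += u_b
        hist_a.append(a)
        hist_b.append(b)
    return total_a, total_b
```

Over 10 rounds, two tit-for-tat players lose 1 each per round, while a tit-for-tat player facing `always_testify` is exploited only in the first round.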
Repeated Move Games
• Partial Information Games – games that occur in a partially observable environment such as blackjack
17.7 Mechanism Design
• Given rational agents, what game should we design?
• Tragedy of the Commons
Auctions
• Single Item
• Bidder i has a utility v_i for the item
• v_i is known only to bidder i
• English Auction
• Sealed Bid Auction
• Sealed Bid Second Price or “Vickrey” auction
(no communication, no knowledge of others)
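A sealed-bid second-price auction can be sketched in a few lines: the highest bidder wins and pays the second-highest bid. The helper names, the tie rule (a tied bid loses), and the integer bid grid are assumptions made for illustration. The second function checks the well-known property that bidding the true value v_i is never worse than any other bid.

```python
def second_price_auction(bids):
    """bids maps bidder name to bid. Returns (winner, price paid)."""
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bids")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]  # the winner pays the second-highest bid
    return winner, price


def bidder_utility(value, bid, highest_other_bid):
    """Utility of one bidder: value minus price on a win, 0 otherwise (ties lose)."""
    return value - highest_other_bid if bid > highest_other_bid else 0


def truthful_is_never_worse(value, max_bid):
    """Compare bidding the true value against every integer bid and rival bid up to max_bid."""
    return all(
        bidder_utility(value, value, rival) >= bidder_utility(value, bid, rival)
        for bid in range(max_bid + 1)
        for rival in range(max_bid + 1)
    )
```

For example, `second_price_auction({"a": 10, "b": 7, "c": 3})` gives the item to "a" at a price of 7.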