Chapter 17: Making Complex Decisions

April 1, 2004

17.6 Decisions With Multiple Agents: Game Theory

• Assume that agents make simultaneous moves

• Assume that the game is a single move game.

Uses

• Agent Design (2 finger Morra)

• Mechanism Design

Game Components

• Players

• Actions

• Payoff Matrix e.g. rock-paper-scissors

Terminology

• Pure Strategy – deterministic policy

• Mixed Strategy – randomized policy, [p: a; (1-p): b]

• Outcome – result of game

• Strategy Profile – an assignment of a strategy to each player

• Solution – a strategy profile in which each player adopts a rational strategy

Prisoner’s Dilemma

              | B testifies      | B refuses
A testifies   | A = -5, B = -5   | A = 0, B = -10
A refuses     | A = -10, B = 0   | A = -1, B = -1

Terminology

• testify is a dominant strategy for both players, so (testify, testify) is a dominant strategy equilibrium

• s strongly dominates s’ – s is better than s’ for every choice of strategies by the other players

• s weakly dominates s’ – s is better than s’ for at least one choice of strategies by the other players and no worse for all the rest
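The two dominance definitions can be checked mechanically against the Prisoner's Dilemma payoffs. The following is a minimal sketch from player A's point of view; the names `PAYOFF`, `a_payoff`, `strongly_dominates`, and `weakly_dominates` are illustrative and do not come from the slides.

```python
# Prisoner's Dilemma payoffs: PAYOFF[(A's action, B's action)] = (A's payoff, B's payoff)
PAYOFF = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"): (0, -10),
    ("refuse", "testify"): (-10, 0),
    ("refuse", "refuse"): (-1, -1),
}
ACTIONS = ["testify", "refuse"]


def a_payoff(a, b):
    """Payoff to player A when A plays a and B plays b."""
    return PAYOFF[(a, b)][0]


def strongly_dominates(s, s_prime):
    """s is strictly better than s_prime for A against every action of B."""
    return all(a_payoff(s, b) > a_payoff(s_prime, b) for b in ACTIONS)


def weakly_dominates(s, s_prime):
    """s is never worse than s_prime for A, and strictly better at least once."""
    diffs = [a_payoff(s, b) - a_payoff(s_prime, b) for b in ACTIONS]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


print(strongly_dominates("testify", "refuse"))
print(strongly_dominates("refuse", "testify"))
```

Because the game is symmetric, the same check from B's side gives the same answer, which is why testifying is dominant for both players.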

Terminology

• An outcome is Pareto optimal if there is no other outcome that all players would prefer

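• An equilibrium is a strategy profile in which no player can benefit by switching strategies, given that every other player keeps the same strategy

• Nash showed that every finite game has at least one equilibrium (possibly in mixed strategies)

• Prisoner’s Dilemma: the equilibrium (testify, testify) is worse for both players than (refuse, refuse)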

Example: Two Nash Equilibria

no dominant strategy!

         | B: dvd           | B: cd
A: dvd   | A = 9, B = 9     | A = -4, B = -1
A: cd    | A = -1, B = -4   | A = 5, B = 5
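The two equilibria of the dvd/cd game can be found by brute force: test every pure strategy profile and keep those where neither player gains by deviating alone. This is a sketch using the payoffs from the slide; the function name `pure_nash_equilibria` is an illustrative choice, and the search covers pure strategies only.

```python
from itertools import product

ACTIONS = ["dvd", "cd"]

# PAYOFF[(A's action, B's action)] = (A's payoff, B's payoff), taken from the slide
PAYOFF = {
    ("dvd", "dvd"): (9, 9),
    ("dvd", "cd"): (-4, -1),
    ("cd", "dvd"): (-1, -4),
    ("cd", "cd"): (5, 5),
}


def pure_nash_equilibria(payoff, actions):
    """Return all pure strategy profiles where no single player gains by deviating."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        ua, ub = payoff[(a, b)]
        # Can A do strictly better by switching while B stays put?
        a_can_improve = any(payoff[(alt, b)][0] > ua for alt in actions)
        # Can B do strictly better by switching while A stays put?
        b_can_improve = any(payoff[(a, alt)][1] > ub for alt in actions)
        if not a_can_improve and not b_can_improve:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(PAYOFF, ACTIONS))
```

Both (dvd, dvd) and (cd, cd) pass the test, yet neither action dominates the other for either player, which is the point of the example.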

Von Neumann’s Maximin

• zero sum game

• E maximizer (2 finger Morra)

• O minimizer (2 finger Morra)

• U(E = 1, O = 1) = 2

• U(E = 1, O = 2) = -3

• U(E = 2, O = 1) = -3

• U(E = 2, O = 2) = 4

Maximin

• E reveals strategy, moves first

• [p: one; 1-p: two]

• O chooses based on p

• one: 2p - 3(1-p)

• two: -3p + 4(1-p)

• p = 7/12

• U_{E,O} = -1/12
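The step from the two payoff expressions to p = 7/12 can be spelled out as follows. O picks whichever response gives E the smaller expected payoff, so E does best by choosing p where the two expressions are equal.

```latex
\begin{align*}
\text{O plays one:}\quad & 2p - 3(1-p) = 5p - 3 \\
\text{O plays two:}\quad & -3p + 4(1-p) = 4 - 7p \\
\text{E maximizes } \min(5p - 3,\ 4 - 7p):\quad & 5p - 3 = 4 - 7p \;\Rightarrow\; 12p = 7 \;\Rightarrow\; p = \tfrac{7}{12} \\
\text{Value:}\quad & U_{E,O} = 5 \cdot \tfrac{7}{12} - 3 = -\tfrac{1}{12}
\end{align*}
```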

Maximin

• O reveals strategy, moves first

• [q: one; 1-q: two]

• E chooses based on q

• one: 2q - 3(1-q)

• two: -3q + 4(1-q)

• q = 7/12

• U_{O,E} = -1/12

Maximin

• [7/12: one, 5/12: two] is the Maximin equilibrium or Nash equilibrium

• Always exists for mixed strategies!

• The value is a maximin for both players.
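The maximin solution for two-finger Morra can be reproduced with exact arithmetic. The sketch below uses the indifference condition for a 2x2 zero-sum game and assumes the game has no pure-strategy saddle point, which holds for the Morra payoffs. The function name `solve_2x2_zero_sum` is illustrative.

```python
from fractions import Fraction

# U[(E's fingers, O's fingers)] = payoff to E (O receives the negative)
U = {(1, 1): 2, (1, 2): -3, (2, 1): -3, (2, 2): 4}


def solve_2x2_zero_sum(u):
    """Mixed maximin solution of a 2x2 zero-sum game.

    Assumes there is no pure-strategy saddle point, so each player mixes
    to make the opponent indifferent between the two actions.
    Returns (p, q, value): p = P(E plays one), q = P(O plays one).
    """
    a, b = u[(1, 1)], u[(1, 2)]
    c, d = u[(2, 1)], u[(2, 2)]
    denom = Fraction(a - b - c + d)
    p = (d - c) / denom            # makes O indifferent between one and two
    q = (d - b) / denom            # makes E indifferent between one and two
    value = (a * d - b * c) / denom
    return p, q, value


p, q, value = solve_2x2_zero_sum(U)
print(p, q, value)
```

Using `Fraction` keeps the result as 7/12 and -1/12 instead of rounded floats, so it can be compared directly with the values on the slides.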

Repeated Move Games

• Application: packet collision in an Ethernet network

• Prisoner’s Dilemma – fixed number of rounds – no change: both players still testify in every round

• Prisoner’s Dilemma – variable number of rounds (e.g. 99% chance of meeting again): cooperation becomes possible with strategies such as perpetual punishment or tit for tat
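Tit for tat is easy to simulate: refuse in the first round, then copy the opponent's previous move. The sketch below reuses the Prisoner's Dilemma payoffs from the earlier slide. For reproducibility it plays a fixed number of rounds rather than stopping at random, which is a simplification of the "variable number of rounds" setting. The strategy and function names are illustrative.

```python
# PAYOFF[(A's action, B's action)] = (A's payoff, B's payoff)
PAYOFF = {
    ("testify", "testify"): (-5, -5),
    ("testify", "refuse"): (0, -10),
    ("refuse", "testify"): (-10, 0),
    ("refuse", "refuse"): (-1, -1),
}


def tit_for_tat(my_history, their_history):
    """Refuse first, then repeat whatever the opponent did last round."""
    return "refuse" if not their_history else their_history[-1]


def always_testify(my_history, their_history):
    """The single-shot dominant strategy, played every round."""
    return "testify"


def play(strategy_a, strategy_b, rounds):
    """Play the repeated game and return the total payoffs (A, B)."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        total_a += pay_a
        total_b += pay_b
        hist_a.append(a)
        hist_b.append(b)
    return total_a, total_b


print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, always_testify, 10))
```

Two tit-for-tat players refuse throughout and lose only 1 per round, while a player who always testifies gains once and is then punished in every later round.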

Repeated Move Games

• Partial Information Games – games that occur in a partially observable environment such as blackjack

17.7 Mechanism Design

• Given rational agents, what game should we design?

• Tragedy of the Commons

Auctions

• Single Item

• Bidder i has a utility v_i for the item

• v_i is known only to Bidder i

• English Auction

• Sealed Bid Auction

• Sealed Bid Second Price or “Vickrey” auction

(no communication, no knowledge of others)
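The sealed-bid second-price rule is short enough to state as code: the highest bidder wins but pays the second-highest bid. This is a minimal sketch. The function name `vickrey_auction`, the bidder labels, and the tie-breaking rule (earliest entry wins) are assumptions made for illustration.

```python
def vickrey_auction(bids):
    """Run a sealed-bid second-price auction.

    bids maps each bidder to a sealed bid. Returns (winner, price), where
    the price is the second-highest bid. Ties go to the earlier entry
    (an arbitrary choice for this sketch).
    """
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bids")
    # sorted() is stable, so equal bids keep their insertion order
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


# Hypothetical bidders; each bid here equals that bidder's private value v_i
print(vickrey_auction({"bidder1": 10, "bidder2": 7, "bidder3": 3}))
```

Because the winner's payment does not depend on the winner's own bid, bidding the true value v_i is a dominant strategy in this auction.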
