Raga Gopalakrishnan University of Colorado at Boulder Sean D. Nixon (University of Vermont) Jason R. Marden (University of Colorado at Boulder) Stable Utility Design for Distributed Resource Allocation


Page 1:

Raga Gopalakrishnan (University of Colorado at Boulder)
Sean D. Nixon (University of Vermont)
Jason R. Marden (University of Colorado at Boulder)

Stable Utility Design for Distributed Resource Allocation

Page 2:

Resource Allocation

Allocate agents to resources to optimize a system-level objective.

Example: Wireless Frequency Selection
[Figure: two “frequency” axes, each marked F1, F2, F3, and a “?”]

Page 3:

Resource Allocation

Example: Wireless Access Point Assignment

Page 4:

Resource Allocation

Example: Sensor Coverage

Page 5:

Distributed Resource Allocation

Allocate agents to resources to optimize a system-level objective.

Example: Sensor Coverage

Page 6:

Distributed Resource Allocation

Design local control policies for agents that result in desirable global behavior (convergence to an allocation that optimizes the system-level objective).

Example: Sensor Coverage

Page 7:

Distributed Resource Allocation

Game Theoretic Control
• Model agents as players in a non-cooperative game
• Equilibria correspond to stable allocations
• Goal is to design the game such that equilibria
  • exist (stability)
  • are efficient
  • are easy to converge to

UTILITY DESIGN (static)
LEARNING DESIGN (dynamic)

Page 8:

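Formal Model
• $N$ – set of agents
• $R$ – set of resources
• $A_i \subseteq 2^R$ – action set of player $i$
• $A = \prod_{i \in N} A_i$ – joint action (allocation) set
• $U_i : A \to \mathbb{R}$ – utility function of player $i$

A pure Nash equilibrium (PNE) is an action profile $a^* \in A$ such that for each player $i \in N$,
$$U_i(a_i^*, a_{-i}^*) = \max_{a_i \in A_i} U_i(a_i, a_{-i}^*).$$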
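On small instances the PNE condition can be checked by brute force: enumerate every joint action and test each unilateral deviation. The sketch below is illustrative only; the function names (`is_pne`, `pure_nash_equilibria`) and the two-player toy game are assumptions introduced here, not part of the talk.

```python
from itertools import product

def is_pne(profile, action_sets, utility):
    """Return True if no player can gain by a unilateral deviation."""
    for i, actions in enumerate(action_sets):
        current = utility(i, profile)
        for alt in actions:
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if utility(i, deviated) > current:
                return False
    return True

def pure_nash_equilibria(action_sets, utility):
    """Enumerate all joint actions and keep those that are PNE."""
    return [p for p in product(*action_sets) if is_pne(p, action_sets, utility)]

# Hypothetical toy game: each of two players picks resource "r1" or "r2";
# a resource is worth 1 and is split equally among the players that chose it.
def toy_utility(i, profile):
    sharers = sum(1 for a in profile if a == profile[i])
    return 1.0 / sharers

action_sets = [("r1", "r2"), ("r1", "r2")]
equilibria = pure_nash_equilibria(action_sets, toy_utility)
```

In this toy game the two profiles in which the players split up are the only equilibria. Enumeration is exponential in the number of players, so it is only a sanity check for small examples.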

Page 9:

Formal Model

DESIGN must be “scalable”: independent of the specific problem instance (resources, action sets).

Page 10:

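Formal Model
• $N$ – set of agents
• $R$ – set of resources
• $A_i \subseteq 2^R$ – action set of player $i$
• $A = \prod_{i \in N} A_i$ – joint action (allocation) set
• $U_i : A \to \mathbb{R}$ – utility function of player $i$
• $W : A \to \mathbb{R}$ – global objective function or “welfare”
• Separability: $W(a) = \sum_{r \in R} W_r(\{a\}_r)$ and $U_i(a) = \sum_{r \in a_i} f_r(i, \{a\}_r)$, where $\{a\}_r = \{i \in N : r \in a_i\}$ is the set of players that chose $r$
• $W_r$ – local “welfare” generated at resource $r$
• $f_r$ – local “distribution rule” at resource $r$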
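A minimal sketch of the separable model, assuming (as is standard in this line of work) that the global welfare is the sum of local welfares and that a player's utility is the sum of the shares it receives at its chosen resources. All names, the three-player instance, and the equal-share rule are hypothetical and used only for illustration.

```python
def players_at(resource, allocation):
    """The set of players whose action contains `resource`."""
    return frozenset(i for i, a_i in enumerate(allocation) if resource in a_i)

def welfare(allocation, resources, local_welfare):
    """Global welfare: sum over resources of the local welfare of its users."""
    return sum(local_welfare[r](players_at(r, allocation)) for r in resources)

def utility(i, allocation, distribution_rule):
    """Utility of player i: sum of its shares over the resources it chose."""
    return sum(distribution_rule[r](i, players_at(r, allocation))
               for r in allocation[i])

# Hypothetical instance with two resources and three players.
resources = ["r1", "r2"]
local_welfare = {
    "r1": lambda S: 6.0 if S else 0.0,   # fixed value once anyone uses r1
    "r2": lambda S: 2.0 * len(S),        # value grows with the number of users
}
# Assumed distribution rule for illustration: split local welfare equally.
distribution_rule = {
    r: (lambda i, S, r=r: local_welfare[r](S) / len(S)) for r in resources
}
allocation = (frozenset({"r1"}), frozenset({"r1", "r2"}), frozenset({"r2"}))

total = welfare(allocation, resources, local_welfare)
shares = [utility(i, allocation, distribution_rule) for i in range(3)]
```

With an equal-share rule the utilities add up to the global welfare, which is the budget-balance property referred to later in the talk.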


Page 12:

Example: Network formation
[Figure: network with sources S1, S2 and destinations D1, D2; numeric labels 6, 1, 6, 1, 1, 6, 1; outcome “? + ?”]

Page 13:

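Example: Network formation
[Figure: network with sources S1, S2 and destinations D1, D2; numeric labels 6, 1, 6, 1, 1, 6, 1]
• Outcome 3+3: a Nash equilibrium. Also optimal!
• Outcome 1+5: unique Nash equilibrium. Suboptimal.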

Page 14:

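Example: Network formation
[Figure: network with sources S1, S2 and destinations D1, D2; numeric labels 6, 1, 6, 1, 1, 6, 1; outcome “? + ?”]

Key feature: distribution rules determine the outcome.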

Page 15:

Formal Model
• $N$ – set of agents
• $R$ – set of resources
• $A_i$ – action set of player $i$
• $A$ – joint action (allocation) set
• $U_i$ – utility function of player $i$
• $W_r$ – local “welfare” generated at resource $r$
• $f_r$ – local “distribution rule” at resource $r$

UTILITY DESIGN reduces to DISTRIBUTION RULE DESIGN

Page 16:

Most prior work studies two distribution rules:

Marginal Contribution (MC) [Wolpert and Tumer 1999]: externality experienced by all other players
$$f_r^{MC}(i, S) = W_r(S) - W_r(S \setminus \{i\})$$

Shapley Value (SV) [Shapley 1953]: average marginal contribution over player orderings
$$f_r^{SV}(i, S) = \sum_{T \subseteq S \setminus \{i\}} \frac{|T|!\,(|S| - |T| - 1)!}{|S|!} \left( W_r(T \cup \{i\}) - W_r(T) \right)$$

Extensions: “weighted” versions parameterized by weights

Both guarantee PNE in all games!

Question: Are there other such distribution rules?
Prior Work: NO, for any given welfare function. [G., Marden, Wierman 2013]
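A short sketch of the two rules for a single resource, written directly from the standard definitions (marginal contribution as the welfare lost when a player leaves, Shapley value as the average marginal contribution over orderings). The function names and the coverage-style welfare `W` are assumptions chosen for illustration; player `i` is assumed to belong to `S`.

```python
from itertools import combinations
from math import factorial

def marginal_contribution(W, i, S):
    """MC share: welfare with player i minus welfare without player i."""
    S = frozenset(S)
    return W(S) - W(S - {i})

def shapley_value(W, i, S):
    """SV share: average marginal contribution of i over all player orderings."""
    S = frozenset(S)
    others = sorted(S - {i})
    n = len(S)
    total = 0.0
    for k in range(len(others) + 1):
        # Fraction of orderings in which exactly k given players precede i.
        weight = factorial(k) * factorial(n - k - 1) / factorial(n)
        for T in combinations(others, k):
            T = frozenset(T)
            total += weight * (W(T | {i}) - W(T))
    return total

# Hypothetical coverage-style welfare: worth 1 as soon as anyone is present.
W = lambda S: 1.0 if S else 0.0
S = {0, 1, 2}

mc_shares = [marginal_contribution(W, i, S) for i in sorted(S)]
sv_shares = [shapley_value(W, i, S) for i in sorted(S)]
```

For this welfare function every MC share is zero (no single player is pivotal), while the SV shares are one third each and add up to the welfare. The example also shows why the Shapley value is costly to evaluate: the sum ranges over all subsets of the other players.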

Page 17:

Most prior work studies two distribution rules: Marginal Contribution (MC) [Wolpert and Tumer 1999] and Shapley Value (SV) [Shapley 1953]. Both guarantee PNE in all games!

Question: Are there other such distribution rules?
Prior Work: NO, for any given welfare function. [G., Marden, Wierman 2013]

Observation: Many practical problems involve “single-selection”: agents select a single resource.
Question: Are there other such distribution rules if we only require equilibrium existence for all single-selection games?
Our Answer: No, not for all welfare functions.

Page 18:

Single-Selection Scenario

Prior Work:
• “Proportional share” distribution rule guarantees PNE for certain types of coverage problems (certain forms of the welfare function) [Marden and Wierman 2013]

Our Results (characterizations):
• The only linear budget-balanced distribution rules that guarantee PNE in all single-selection games, for all welfare functions, are weighted Shapley values.
• Given any linear welfare function with no dummy players, the only budget-balanced distribution rules that guarantee PNE in all single-selection games are weighted Shapley values.
• Given any welfare function, the only budget-balanced distribution rules that guarantee PNE in all two-player single-selection games are weighted Shapley values.

[G., Nixon, Marden 2013]
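To make “weighted Shapley value” concrete, the sketch below uses one common form: decompose the local welfare into coalition dividends (Harsanyi dividends) and split each dividend among the coalition's members in proportion to their weights. It assumes strictly positive weights; more general variants in the literature are not covered. Function names, the welfare function, and the weights are hypothetical.

```python
from itertools import combinations

def subsets(S):
    """Yield every subset of S as a frozenset."""
    items = sorted(S)
    for k in range(len(items) + 1):
        for T in combinations(items, k):
            yield frozenset(T)

def dividend(W, T):
    """Harsanyi dividend of T: inclusion-exclusion sum of W over subsets of T."""
    return sum((-1) ** (len(T) - len(Q)) * W(Q) for Q in subsets(T))

def weighted_shapley(W, weights, i, S):
    """Share of player i in S: weight-proportional split of each dividend."""
    S = frozenset(S)
    total = 0.0
    for T in subsets(S):
        if i in T:
            total += weights[i] / sum(weights[j] for j in T) * dividend(W, T)
    return total

# Hypothetical coverage-style welfare and weights for two players.
W = lambda S: 1.0 if S else 0.0
S = {0, 1}
unequal = {0: 1.0, 1: 3.0}
equal = {0: 1.0, 1: 1.0}

wsv_unequal = [weighted_shapley(W, unequal, i, S) for i in sorted(S)]
wsv_equal = [weighted_shapley(W, equal, i, S) for i in sorted(S)]
```

The shares add up to the local welfare for any positive weights (budget balance), and equal weights recover the ordinary Shapley value. In this particular example the pair's dividend is negative, so the player with the larger weight ends up with the smaller share.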

Page 19:

Concluding Remarks

• Consequences of the restriction to weighted Shapley values:
  • The resulting game is a weighted potential game, for which several learning dynamics converge to PNE.
  • It is hard for agents to compute their utilities.

• Open Problems:
  • Obtaining a tighter characterization of stable distribution rules for a given welfare function.
  • Obtaining the characterization when budget-balance is relaxed.
  • Optimizing the “weights” for efficiency.
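Best-response dynamics is one simple learning rule that converges in finite potential games. The sketch below runs it on a hypothetical single-selection game in which each resource has a fixed value that is split equally among its users; for that welfare function the equal split coincides with the (unweighted) Shapley value. The function name, resource values, and starting allocation are assumptions for illustration.

```python
def best_response_dynamics(n_players, resources, share, start, max_rounds=100):
    """Let players switch, one at a time, to a strictly better resource.

    share(r, k) is the payoff to each of the k players using resource r.
    Returns the allocation once no player wants to deviate (a PNE),
    or None if max_rounds is exhausted.
    """
    choice = list(start)
    for _ in range(max_rounds):
        changed = False
        for i in range(n_players):
            def payoff(r):
                # Number of users of r if player i selects it.
                k = sum(1 for j, c in enumerate(choice) if c == r and j != i) + 1
                return share(r, k)
            best = max(resources, key=payoff)
            if payoff(best) > payoff(choice[i]):
                choice[i] = best
                changed = True
        if not changed:
            return tuple(choice)
    return None

# Hypothetical instance: resource values split equally among users.
values = {"r1": 6.0, "r2": 2.5}
share = lambda r, k: values[r] / k

result = best_response_dynamics(3, ["r1", "r2"], share, start=("r2", "r2", "r2"))
```

Starting with all three players on the low-value resource, two of them move to `r1` and the third stays, after which no unilateral deviation is profitable.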

Page 20:

Ragavendran Gopalakrishnan (University of Colorado at Boulder)
Sean D. Nixon (University of Vermont)
Jason R. Marden (University of Colorado at Boulder)

Stable Utility Design for Distributed Resource Allocation