Software Multiagent Systems: Lecture 10


Transcript of Software Multiagent Systems: Lecture 10

Page 1: Software Multiagent Systems:  Lecture 10

Software Multiagent Systems: Lecture 10

Milind Tambe, University of Southern California, [email protected]

Page 2: Software Multiagent Systems:  Lecture 10

Announcements

From now on, slides will be posted on our class web site
Password: teamcore

Homework answers will be sent out by email next week

Page 3: Software Multiagent Systems:  Lecture 10

DCOP Definition

Variables {x1,x2,…,xn} distributed among agents

Domains D1, D2, ..., Dn

Link cost functions fij: Di × Dj → ℕ

[Figure: example constraint graph over x1, x2, x3, x4]

Find an assignment A* s.t. F(A*) is minimal, where

F(A) = Σ fij(di, dj), summed over every link (xi, xj) with xi ← di, xj ← dj in A

[Figure: three complete assignments on the example graph, with costs 0, 4, and 7]

Link cost table (values: b = black, w = white):

di   dj   f(di,dj)
b    b    1
b    w    2
w    b    2
w    w    0
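As a concrete illustration, here is a minimal sketch (Python; the set of links is an assumption matching the running example) of how the objective F(A) is evaluated for a complete assignment:

```python
# Minimal sketch of evaluating the DCOP objective F(A) for a complete assignment.
# The cost table is the one on this slide; the set of links is an assumption
# matching the running example (x1-x2, x1-x3, x2-x3, x2-x4).

F = {('b', 'b'): 1, ('b', 'w'): 2, ('w', 'b'): 2, ('w', 'w'): 0}
links = [('x1', 'x2'), ('x1', 'x3'), ('x2', 'x3'), ('x2', 'x4')]

def total_cost(assignment):
    """F(A) = sum of f_ij(d_i, d_j) over every constrained pair in A."""
    return sum(F[(assignment[i], assignment[j])] for i, j in links)

print(total_cost({'x1': 'w', 'x2': 'w', 'x3': 'w', 'x4': 'w'}))   # 0 (the optimum)
print(total_cost({'x1': 'b', 'x2': 'b', 'x3': 'b', 'x4': 'b'}))   # 4
```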

Page 4: Software Multiagent Systems:  Lecture 10

Branch and Bound Search

Familiar with branch and bound search?

Page 5: Software Multiagent Systems:  Lecture 10

Synchronous Branch and Bound (Hirayama97)

• Agents are prioritized into a chain
• Choose a value, send the partial solution (with its cost) to the child
• When the cost exceeds the upper bound, backtrack
• An agent explores all of its values before reporting back to its parent

[Figure: SBB trace over the chain x1 → x2 → x3 → x4, with partial-solution costs 0, 1, 3, and finally 4 = UB; link cost table f(di,dj) as on Page 3]
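A centralized sketch of synchronous branch and bound over the agent chain may help (illustrative only: real SBB passes the partial solution between agents by message; here the recursive call stands in for that hand-off):

```python
# Centralized sketch of Synchronous Branch and Bound over the agent chain.
F = {('b', 'b'): 1, ('b', 'w'): 2, ('w', 'b'): 2, ('w', 'w'): 0}
links = [('x1', 'x2'), ('x1', 'x3'), ('x2', 'x3'), ('x2', 'x4')]   # assumed example graph
chain = ['x1', 'x2', 'x3', 'x4']                                   # priority order
domain = ['w', 'b']

def cost(partial):
    """Cost of all links whose two endpoints are already assigned."""
    return sum(F[(partial[i], partial[j])] for i, j in links
               if i in partial and j in partial)

def sbb(idx=0, partial=None, upper=float('inf'), best=None):
    partial = {} if partial is None else partial
    if idx == len(chain):                    # a complete assignment was reached
        c = cost(partial)
        return (c, dict(partial)) if c < upper else (upper, best)
    for d in domain:                         # the agent explores all of its values
        partial[chain[idx]] = d
        if cost(partial) < upper:            # backtrack once the cost reaches the upper bound
            upper, best = sbb(idx + 1, partial, upper, best)
        del partial[chain[idx]]
    return upper, best

print(sbb())    # (0, {'x1': 'w', 'x2': 'w', 'x3': 'w', 'x4': 'w'})
```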

Page 6: Software Multiagent Systems:  Lecture 10

DCOP before ADOPT

Branch and Bound

Backtrack condition - when cost exceeds upper bound

Problem – sequential, synchronous

Asynchronous Backtracking

Backtrack condition - when constraint unsatisfiable

Problem - only hard constraints allowed

Observation: Backtrack only when sub-optimality is proven

Page 7: Software Multiagent Systems:  Lecture 10
Page 8: Software Multiagent Systems:  Lecture 10

Can we backtrack without proving sub-optimality?

Page 9: Software Multiagent Systems:  Lecture 10

Adopt: Idea #1

Weak backtracking: When lower bound gets too high

Why lower bounds?

Allows asynchrony!

Yet allows quality guarantees

Downside?

Backtrack before sub-optimality is proven

Can’t throw away solutions; need to revisit!

Page 10: Software Multiagent Systems:  Lecture 10

Adopt: Idea #2

Solutions need revisiting

How could we do that?

Remember all previous solutions

Efficient reconstruction of abandoned solutions

Page 11: Software Multiagent Systems:  Lecture 10

Adopt Overview

Agents are ordered in a DFS TREE

Constraint graph need not be a tree

[Figure: the constraint graph arranged as a DFS tree over x1, x2, x3, x4]
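A small sketch of how such an ordering can be computed: a plain depth-first traversal of the constraint graph gives a tree in which every non-tree (back) edge connects an ancestor to a descendant, which is exactly the property needed here. The graph below is the running example; the helper names are illustrative:

```python
# Sketch: order agents into a DFS (pseudo-)tree of the constraint graph.
from collections import defaultdict

edges = [('x1', 'x2'), ('x1', 'x3'), ('x2', 'x3'), ('x2', 'x4')]  # example graph
adj = defaultdict(list)
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

def dfs_tree(root):
    parent, order, seen = {root: None}, [], {root}
    def visit(u):
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                parent[v] = u
                visit(v)
    visit(root)
    return parent, order

parent, order = dfs_tree('x1')
print(order)    # ['x1', 'x2', 'x3', 'x4']
print(parent)   # {'x1': None, 'x2': 'x1', 'x3': 'x2', 'x4': 'x2'}
```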

Page 12: Software Multiagent Systems:  Lecture 10

Adopt Overview

Agents concurrently choose values

VALUE messages sent down

COST messages sent up only to parent

THRESHOLD messages sent down only to child

[Figure: constraint graph and DFS tree over x1, x2, x3, x4, with VALUE messages flowing down, COST messages flowing up to parents, and THRESHOLD messages flowing down to children; link cost table f(di,dj) as on Page 3]
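A sketch of what these three message kinds might carry (field names are illustrative, not taken from the original Adopt implementation):

```python
from dataclasses import dataclass
from typing import Dict

Context = Dict[str, str]   # e.g. {'x1': 'b', 'x2': 'b'}: ancestor values a report assumes

@dataclass
class Value:               # sent down, to every linked descendant
    sender: str
    value: str

@dataclass
class Cost:                # sent up, only to the tree parent
    sender: str
    context: Context       # the ancestor assignment this cost was computed under
    lower_bound: int
    upper_bound: int

@dataclass
class Threshold:           # sent down, only to tree children
    sender: str
    threshold: int
    context: Context
```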

Page 13: Software Multiagent Systems:  Lecture 10

Asynchronous, concurrent search

Each variable has two values: b and w
Each is initialized with a lower bound of 0

[Figure: the example constraint graph; link cost table f(di,dj) as on Page 3]

Page 14: Software Multiagent Systems:  Lecture 10

Asynchronous, concurrent search

Concurrently choose values, send them to descendants
[Figure: initial concurrent choices on the constraint graph]

Optimal Solution
[Figure: the optimal assignment, for reference]

Concurrently report local costs, with context
e.g. x3 sends cost 2 with the context x1=b, x2=b
[Figure: cost reports flowing up to parents]

x1 switches to a “better?” value

• x2, x3 switch to their best values and report costs, with context
• x2 disregards x3’s report (context mismatch)

[Figure: the updated values and cost reports; link cost table f(di,dj) as on Page 3]

Page 15: Software Multiagent Systems:  Lecture 10

Asynchronous, concurrent search
Algorithm:

Agents are prioritized into tree

Agents:

Initialize lower bounds of values to zero

Concurrently choose values, send them to all connected descendants.

Choose the best value given what ancestors chose:

immediately send cost message to parent

Cost = lower bound + cost with ancestors

Costs asynchronously reach parent

Asynchronous costs: context attachment
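A sketch of one agent's step under these rules (illustrative helper names, not the original code): the agent computes LB(d) = local cost with known ancestor values + children's reported lower bounds, picks the best value, and attaches the current context to its COST report:

```python
# context: latest VALUE messages from ancestors; child_lb[d]: sum of lower
# bounds the children have reported for this agent's value d.

F = {('b', 'b'): 1, ('b', 'w'): 2, ('w', 'b'): 2, ('w', 'w'): 0}

def choose_and_report(domain, linked_ancestors, context, child_lb):
    def delta(d):  # local cost with the ancestor values I currently know about
        return sum(F[(d, context[a])] for a in linked_ancestors if a in context)
    lb = {d: delta(d) + child_lb[d] for d in domain}   # LB(d) = delta(d) + children's LBs
    best = min(lb, key=lb.get)                         # best value given the ancestors
    # COST message to the parent: the bound plus the context it was computed under
    return best, {'lb': lb[best], 'context': dict(context)}

# e.g. x3, constrained with x1 and x2 (both currently 'b') and with no children:
print(choose_and_report(['w', 'b'], ['x1', 'x2'],
                        {'x1': 'b', 'x2': 'b'}, {'w': 0, 'b': 0}))
# -> ('b', {'lb': 2, 'context': {'x1': 'b', 'x2': 'b'}}), i.e. "cost 2 with x1=b, x2=b"
```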

Page 16: Software Multiagent Systems:  Lecture 10

Weak Backtracking

Suppose parent has two values, “white” and “black”

Explore “white” first:        LB(w) = 0, LB(b) = 0
Receive cost msg:             LB(w) = 2, LB(b) = 0
Now explore “black”:          LB(w) = 2, LB(b) = 0
Receive cost msg:             LB(w) = 2, LB(b) = 3
Go back to “white”:           LB(w) = 2, LB(b) = 3
. . .
Termination condition true:   LB(w) = 10 = UB(w), LB(b) = 12
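A minimal sketch of the weak-backtracking rule at a single agent (illustrative; thresholds are ignored for now): switch whenever the current value's lower bound is overtaken, and terminate once the chosen value's lower and upper bounds meet:

```python
def weak_backtrack_step(current, lb, ub):
    best = min(lb, key=lb.get)
    if lb[current] > lb[best]:
        current = best                  # backtrack on a lower bound alone:
                                        # sub-optimality of `current` is NOT yet proven
    done = lb[current] == ub[current]   # termination: the bound interval has closed
    return current, done

value = 'w'
INF = float('inf')
# Cost messages tighten the bounds over time, roughly as in the slide's trace:
for lb, ub in [({'w': 2,  'b': 0},  {'w': INF, 'b': INF}),
               ({'w': 2,  'b': 3},  {'w': INF, 'b': INF}),
               ({'w': 10, 'b': 12}, {'w': 10,  'b': INF})]:
    value, done = weak_backtrack_step(value, lb, ub)
    print(value, done)   # b False, then w False, then w True (terminate)
```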

Page 17: Software Multiagent Systems:  Lecture 10

Key Lemma for soundness/correctness

Lemma: Assuming no context change, an agent’s report of cost is non-decreasing and is never greater than the actual cost.

Inductive proof sketch: Leaf agents never overestimate cost. Each agent sums the costs from its children, chooses its best value, and reports the result to its parent.

[Link cost table f(di,dj) as on Page 3]

x2 receives costs from its children and computes a total cost of 2 + 1 + 2 = 5.

Instead of reporting 5, x2 switches to its unexplored value and reports that value’s lower bound.

5 would be an OVERestimate!

[Figure: the corresponding states of the constraint graph during this exchange]

Page 18: Software Multiagent Systems:  Lecture 10

Revisiting Abandoned Solutions

Problem

reconstructing from scratch: inefficient

remembering solutions: expensive

Solution

remember only lower bounds: polynomial space

use lower bounds to efficiently re-search

[Figure: chain ordering; a parent with lower bound = 10 and a single child sends threshold = 10 down to that child]

Page 19: Software Multiagent Systems:  Lecture 10

Revisiting Abandoned Solutions

Solution

remember only lower bounds – polynomial space

use lower bounds to efficiently re-search

Suppose the parent has two values, “a” and “b” (chain ordering: parent with a single child)

Explore “a” first:   LB(a) = 10, LB(b) = 0
Now explore “b”:     LB(b) = 11
Return to “a”:       send threshold = 10 to the child, so the remembered lower bound guides the re-search

Page 20: Software Multiagent Systems:  Lecture 10

Backtrack Thresholds

agent i received threshold = 10 from parent

Explore “white” first:      LB(w) = 0,  LB(b) = 0, threshold = 10
Receive cost msg:           LB(w) = 2,  LB(b) = 0, threshold = 10
Stick with “white”:         LB(w) = 2,  LB(b) = 0, threshold = 10
Receive more cost msgs:     LB(w) = 11, LB(b) = 0, threshold = 10
Now try black:              LB(w) = 11, LB(b) = 0, threshold = 10

Key Point: Don’t change value until LB(current value) > threshold.
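That key point in isolation, as a sketch (illustrative names): the agent only abandons its current value once that value's lower bound exceeds the threshold handed down by its parent:

```python
def threshold_backtrack_step(current, lb, threshold):
    if lb[current] > threshold:          # only now is a value change allowed
        current = min(lb, key=lb.get)
    return current

value, threshold = 'w', 10               # as on the slide: threshold = 10 from parent
for lb in [{'w': 2,  'b': 0},            # stick with white (2 <= 10)
           {'w': 11, 'b': 0}]:           # 11 > 10: now try black
    value = threshold_backtrack_step(value, lb, threshold)
    print(value)                         # w, then b
```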

Page 21: Software Multiagent Systems:  Lecture 10

Tree ordering: a parent with lower bound = 10 and multiple children; how should thresh = ? be split among them?

Time T1: thresh = 5, thresh = 5
Time T2: one child reports cost = 6
Time T3: thresh = 4, thresh = 6 (shifted toward the child that reported the higher cost)

Idea: Rebalance the threshold among the children
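One possible rebalancing rule, sketched below with illustrative names: keep the children's allotments summing to the parent's threshold, and when a child's reported lower bound exceeds its allotment, take the difference from children that still have slack:

```python
def rebalance(alloc, child_lb, total):
    alloc = dict(alloc)
    for c in alloc:                                   # enforce alloc[c] >= lb[c]
        alloc[c] = max(alloc[c], child_lb.get(c, 0))
    excess = sum(alloc.values()) - total
    # pay for the raise by cutting children that still sit above their lower bound
    for c in sorted(alloc, key=lambda k: alloc[k] - child_lb.get(k, 0), reverse=True):
        if excess <= 0:
            break
        cut = min(excess, alloc[c] - child_lb.get(c, 0))
        alloc[c] -= cut
        excess -= cut
    return alloc

# Times T1-T3 from the slide: a threshold of 10 split 5/5; one child then
# reports cost 6, so threshold is shifted from the other child, giving 6/4.
print(rebalance({'A': 5, 'B': 5}, {'A': 6, 'B': 0}, total=10))   # {'A': 6, 'B': 4}
```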

Page 22: Software Multiagent Systems:  Lecture 10

Is Adopt completely distributed?

Page 23: Software Multiagent Systems:  Lecture 10

Evaluation of Speedups

Conclusions
• Adopt’s asynchrony and parallelism yield significant efficiency gains
• Sparse graphs (link density 2) are solved optimally and efficiently by Adopt

Page 24: Software Multiagent Systems:  Lecture 10

Metric: Cycles

Cycle = one unit of algorithm progress in which all agents receive incoming messages, perform computation, and send outgoing messages

Independent of machine speed, network conditions, etc.

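A sketch of a cycle-counting harness in this spirit (the Agent interface and EchoAgent are hypothetical, not the lecture's evaluation code): one cycle lets every agent read its inbox, compute, and send:

```python
class EchoAgent:
    """Dummy agent: forwards each received message to `target`, up to 3 times."""
    def __init__(self, name, target):
        self.name, self.target, self.sent = name, target, 0
    def step(self, msgs):
        if msgs and self.sent < 3:
            self.sent += 1
            return [(self.target, 'ping')]
        return []

def run_and_count_cycles(agents, inbox):
    cycles = 0
    while any(inbox.values()):                       # until the system quiesces
        outgoing = [m for a in agents for m in a.step(inbox.pop(a.name, []))]
        inbox = {a.name: [] for a in agents}
        for dest, msg in outgoing:
            inbox[dest].append(msg)
        cycles += 1                                  # machine- and network-independent
    return cycles

agents = [EchoAgent('A', 'B'), EchoAgent('B', 'A')]
print(run_and_count_cycles(agents, {'A': ['start'], 'B': []}))   # 7
```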

Page 25: Software Multiagent Systems:  Lecture 10

Number of Messages

Conclusion
• Communication grows linearly
• Only local communication (no broadcast)

Page 26: Software Multiagent Systems:  Lecture 10

Is optimality a good goal to reach for?

Page 27: Software Multiagent Systems:  Lecture 10

Bounded error approximation

Motivation: Quality control for approximate solutions

Problem: User provides an error bound b

Goal: Find any solution S where cost(S) ≤ cost(optimal soln) + b

[Figure: the root, with lower bound = 10, sets thresh = 10 + b]

• Adopt’s ability to provide quality guarantees naturally leads to bounded error approximation!
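A tiny sketch of how the root's termination test relaxes under an error bound b (following the slide's thresh = lower bound + b; the exact bookkeeping in Adopt is more involved):

```python
def root_should_terminate(lb, ub, b=0):
    threshold = min(lb + b, ub)   # b = 0 recovers the optimal algorithm
    return threshold >= ub        # stop once the remaining gap is within b

print(root_should_terminate(lb=10, ub=13, b=0))   # False: keep searching
print(root_should_terminate(lb=10, ub=13, b=3))   # True: cost(S) <= cost(opt) + 3
```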

Page 28: Software Multiagent Systems:  Lecture 10

Evaluation of Bounded Error

Conclusion

• Varying b is an effective way to trade off time-to-solution against solution quality.

Page 29: Software Multiagent Systems:  Lecture 10

Adopt summary – Key Ideas

First-ever optimal, asynchronous algorithm for DCOP

polynomial space at each agent

Weak Backtracking

lower bound based search method

Parallel search in independent subtrees

Efficient reconstruction of abandoned solutions

backtrack thresholds to control backtracking

Bounded error approximation

sub-optimal solutions faster

bound on worst-case performance

Page 30: Software Multiagent Systems:  Lecture 10

Discussion

Can we improve Adopt efficiency?

Can we allow n-ary constraints in Adopt?

Does Adopt preserve privacy?

What are some key applications of Adopt?

Page 31: Software Multiagent Systems:  Lecture 10

New Ideas for Efficiency
Communication Structure

Idea: Reach a solution faster if end-to-end messaging is shorter

Application: Shorter depth trees in ADOPT

Intelligent Preprocessing of Bounds

PASSUP heuristic: bounds via one-time message up the tree

PASSUP extended via a framework of several preprocessing heuristics

Page 32: Software Multiagent Systems:  Lecture 10

Performance (EAV)
Orders of magnitude speedup!

Page 33: Software Multiagent Systems:  Lecture 10

OptAPO 2004

[Figure: cycles vs. number of variables (8, 12, 16, 20) for Adopt and OptAPO]

Page 34: Software Multiagent Systems:  Lecture 10

Performance

Page 35: Software Multiagent Systems:  Lecture 10

• J. Davin and P. J. Modi, "Impact of Problem Centralization in Distributed Constraint Optimization Algorithms," Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2005.

Page 36: Software Multiagent Systems:  Lecture 10

Defining DCOP Centralization

Centralization: Aggregating problem information into a single agent

information was initially distributed among multiple agents, and

aggregation results in a larger local search space.

For example, constraints on external variables can be centralized.

Page 37: Software Multiagent Systems:  Lecture 10

Motivation

Adopt and OptAPO:

Adopt does no centralization.

OptAPO does partial centralization.

OptAPO completes in fewer cycles than Adopt for graph coloring

But cycles do not capture performance differences when algorithms use different levels of centralization.

Page 38: Software Multiagent Systems:  Lecture 10

Metric: Cycles

What is missing in measuring cycles?


Page 39: Software Multiagent Systems:  Lecture 10

Key Questions

How do we measure performance of DCOP algorithms that differ in their level of centralization?

How do Adopt and OptAPO compare when we use such a measure?

Page 40: Software Multiagent Systems:  Lecture 10

Results

Tested on graph coloring problems, |D|=3 (3-coloring).

# Variables = 8, 12, 16, 20, with link density = 2n or 3n.

50 randomly generated problems for each size.

Cycles and CCC (concurrent constraint checks):
OptAPO takes fewer cycles, but more constraint checks.

[Figure: CCC vs. number of variables (8, 12, 16, 20) for Adopt and OptAPO]
[Figure: cycles vs. number of variables (8, 12, 16, 20) for Adopt and OptAPO]