Information Sharing in Large Heterogeneous Teams
FRC Seminar - August 13, 2009
Information Sharing in Large Heterogeneous Teams
Prasanna Velagapudi
Robotics Institute
Carnegie Mellon University
Large Heterogeneous Teams
• 100s to 1000s of agents (robots, agents, people)
• Shared goals
• Must collaborate to complete complex tasks
• Dynamic, uncertain environment
Scaling Teams
• Far more data than can be feasibly shared
– Amount of information exchanged often grows faster than amount of available bandwidth
• Vague, incomplete knowledge of large parts of the team
– Often not important
• Shared information improves team performance
Search and Rescue
• Air robots, ground robots, human operators
• Each is generating information
– Humans: classify objects and issue commands
– Robots: explore and map area
• Geometric Random Graph
Search and Rescue
• Video streams (320 kbps x 24, for operators)
• Decentralized evidence grid (14 kbps x 24, for all agents)
• Operator control (<1 kbps x 24, for robots)
• Figure labels: O(N^2), O(N^2)
• Available throughput: Θ(W·N^0.5) [Gupta 2000]
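As a rough check on the figures above, the per-stream rates can be totaled across the 24 streams of each type. This is only arithmetic on the slide's numbers; the dictionary keys and the treatment of "<1 kbps" as an upper bound of 1 kbps are assumptions made for illustration.

```python
# Per-stream rates in kbps, taken from the slide.
# "<1 kbps" is treated as 1 kbps, so the total is an upper bound (assumption).
RATES_KBPS = {"video": 320, "evidence_grid": 14, "operator_control": 1}
NUM_STREAMS = 24


def total_demand_kbps(rates, count):
    """Sum of each stream type's rate times the number of streams."""
    return sum(rate * count for rate in rates.values())


demand = total_demand_kbps(RATES_KBPS, NUM_STREAMS)
print(f"Aggregate demand: at most {demand} kbps ({demand / 1000:.2f} Mbps)")
```

Video dominates the total, which is why the operator-facing streams are the first thing to strain a shared network.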
Available Network Technologies
Source: William Webb - Ofcom
Scaling Teams
• We need to deliver information efficiently
– Get to the agents that can make use of it most
– Don’t waste communication bandwidth
• Key Idea: Different agents have different needs for a given piece of information
Sharing information
• When information generation exceeds network capacity, there are a few options:
– Compression/Fusion (Eliminate redundant data)
– Structuring (Eliminate overhead costs)
– Selection (Eliminate unimportant data)
Related work
• Distributed Data Fusion
– Channel filtering (DDF) [Makarenko 04]
– Particle exchange [Rosencrantz 03]
• Networking
– Gossip [Haas 06], SPIN [Heinzelman 99], IDR [Liu 03]
• Multiagent Coordination
– STEAM [Tambe 97]
– ACE-PJB-COMM [Roth 05], Reward-shaping [Williamson 09], dec-POMDP-com [Zilberstein 03]
Domain assumptions
• Information generated dynamically and asynchronously
• Limited bandwidth and memory
– With respect to size of team
• Significant local computing
• Some predictive knowledge about other agents’ information needs
• Peer-to-peer communications
Domain assumptions
[Figure: related approaches plotted against Inconsistency, Complexity, and Communication: dec-POMDP-com, Gossip, SPIN, IDR, STEAM, Flooding, Particle Exchange, Channel Filter, Tokens, ACE-PJB-COMM, Reward Shaping. The plot also carries the label "Our domains".]
Abstract Problem
• Suppose we are given some metric for team performance in a domain:
– How much information sharing complexity and communication is necessary to achieve good performance in a large team?
– How can we characterize the effects of information sharing on performance in large teams?
A simple example
• Two robots (1 static, 1 mobile) in a maze
• Limited sensing radius, global communication
• Team task: Get mobile robot to goal point
• Team performance = battery power
– Movement and communication use power
• How useful is it to the team for the static robot to share its info with the mobile robot?
A simple example
A simple example
• Without information
• With information
A simple example
• Without information
• With information
The change in path cost is the “utility” of this information
Utility of Information
• Utility: the change in team performance when an agent gets a piece of information
• Often dependent on other information
• Difficult to calculate during execution, even with complete real-time knowledge
– Need to know final state of team
Objective
• Utility: the change in team performance when an agent gets a piece of information
• Communication cost: the cost of sending a piece of information to a specific agent
Objective
• Maximize team performance:
[Equation; annotated terms: utility, communication, agents, info. source, dissemination tree]
In actual systems, this solution must be formed through local decisions!
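A minimal sketch of how such an objective might be scored, assuming the value of a dissemination tree is the summed utility of the agents it reaches minus the summed communication cost of the links it uses. The function name `tree_value`, the dictionary layout, and this additive form are assumptions for illustration, not the formula from the slide.

```python
def tree_value(tree_edges, utility, comm_cost, source):
    """Net value of a dissemination tree rooted at `source` (hypothetical scoring).

    tree_edges: list of (parent, child) links used to forward the information.
    utility:    dict agent -> utility of that agent receiving the information.
    comm_cost:  dict (parent, child) -> cost of sending over that link.
    """
    reached = {source}
    for _parent, child in tree_edges:
        reached.add(child)
    # The source already holds the information, so it contributes no utility.
    gained = sum(utility.get(agent, 0.0) for agent in reached if agent != source)
    spent = sum(comm_cost[edge] for edge in tree_edges)
    return gained - spent


utility = {"a": 5.0, "b": 1.0, "c": 4.0}
comm_cost = {("src", "a"): 1.0, ("a", "b"): 2.0, ("a", "c"): 1.0}

full = tree_value([("src", "a"), ("a", "b"), ("a", "c")], utility, comm_cost, "src")
pruned = tree_value([("src", "a"), ("a", "c")], utility, comm_cost, "src")
print(full, pruned)
```

In this toy case, dropping agent `b` raises the net value, because its utility is smaller than the cost of the link needed to reach it.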
Distributions of Utility
• For large amounts of information, consider the distribution of utility
– May be conditioned on known data, or just independently sampled
• Characterize domains as having specific distributions of utility
• Estimate performance of various algorithms as function of this distribution
Back to the simple example
[Figure: Maze Utility Distribution. Axes: Frequency vs. Utility (Δ path cost)]
Approach
• Useful information sharing algorithms fall between two extremes:
– Full knowledge/high complexity (omniscient)
– No knowledge/low complexity (blind)
• Observe performance of two extremes of information sharing algorithms
– Learn when it is useful to use complex algorithms
– If blind policies do well, other low complexity algorithms will also work well
Utility vs. Communication
[Figure: Team Utility vs. Communication Cost, with labels: Distributional upper bound, Omniscient policy, Blind policy, Efficient policies]
Expected Upper Bound
• Order statistic: expectation of k-th highest value over n samples
– Computable for many common distributions
• Expected best case performance
– What values of utility would we expect to see in a team of n agents?
– Sum of k highest order statistics
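The bound above can be sketched for two of the common distributions, using the standard closed forms for expected order statistics of Uniform(0, 1) and Exponential samples. The function names are hypothetical, and reading k as "the number of agents that receive the information" is an assumption.

```python
def expected_kth_highest_uniform(n, k):
    """E[k-th highest of n i.i.d. Uniform(0, 1) samples] = (n - k + 1) / (n + 1)."""
    return (n - k + 1) / (n + 1)


def expected_kth_highest_exponential(n, k, rate=1.0):
    """E[k-th highest of n i.i.d. Exponential(rate) samples] = (1/rate) * sum_{i=k}^{n} 1/i."""
    return sum(1.0 / i for i in range(k, n + 1)) / rate


def expected_upper_bound(n, k, expected_kth_highest):
    """Sum of the k highest expected order statistics out of n samples."""
    return sum(expected_kth_highest(n, j) for j in range(1, k + 1))


# Best case for delivering to 10 of 100 agents under each distribution.
print(expected_upper_bound(100, 10, expected_kth_highest_uniform))
print(expected_upper_bound(100, 10, expected_kth_highest_exponential))
```

With k = n the bound reduces to n times the distribution mean, which is a quick sanity check on either formula.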
Omniscient Policy
• Lookahead policy
1. Assume we are given estimate of utility for every other node (possibly with noise)
2. Exhaustively search all n-length paths from current node
3. Send information along best path
4. Repeat until TTL reaches 0
– Approximation of best omniscient policy
– Full exhaustive search is intractable
Blind policies
• Random: “Gossip” to randomly chosen neighbor
• Random Self-Avoiding
– Keep history of agents visited
– O(lifetime of piece)
• Random Trail
– Keep history of links used
– O(# of pieces/time step)
Questions
• How well does the lookahead policy approximate omniscient policy performance?
• How wide is the performance gap between the omniscient policy and blind policies?
• How does team size affect performance?
• Is omniscient policy performance better because it knows where to route, or where not to route?
Experiment
• Network of agents with utility sampled from distribution
• Single piece of information shared each trial
• Average-case performance recorded
Distributions:
• Normal
• Exponential
• Uniform
Networks:
• Small-Worlds (Watts-Beta)
• Scale-free (Preferential attachment)
• Lattice (2D grid)
• Hierarchy (Spanning tree)
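A minimal sketch of one cell of this experiment, assuming a 2D lattice, Exponential(1) utilities, and a purely random forwarding policy. The trial structure, the function names, and the rule that each agent's utility counts once are assumptions; the actual experimental code is not shown in the slides.

```python
import random


def lattice(width, height):
    """2D grid network: each agent links to its horizontal and vertical neighbors."""
    graph = {}
    for x in range(width):
        for y in range(height):
            nbrs = [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
            graph[(x, y)] = [(i, j) for i, j in nbrs if 0 <= i < width and 0 <= j < height]
    return graph


def run_trial(graph, ttl, rng):
    """Share one piece of information by random walk; return total utility collected."""
    utility = {node: rng.expovariate(1.0) for node in graph}  # fresh sample each trial
    here = rng.choice(sorted(graph))
    visited, total = {here}, 0.0
    for _ in range(ttl):
        here = rng.choice(graph[here])
        if here not in visited:  # each agent's utility counts once
            visited.add(here)
            total += utility[here]
    return total


def average_performance(graph, ttl, trials, seed=0):
    """Average-case performance over independent trials."""
    rng = random.Random(seed)
    return sum(run_trial(graph, ttl, rng) for _ in range(trials)) / trials


grid = lattice(5, 5)
print(average_performance(grid, ttl=10, trials=200))
```

Swapping `rng.expovariate` for `rng.gauss` or `rng.uniform`, or `lattice` for another graph generator, would cover the other cells of the grid of conditions.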
Lookahead convergence
2-step lookahead: pathological case?
Performance Results
[Figure panels: Normal Distribution; Exponential Distribution]
Policy Performance
(Utility sampled from Exponential distribution)
Utility of knowledge
~120 communications
Scaling effects
The costs of maintaining utility estimates for Lookahead increase with team size, but the costs of Random policy do not.
Noisy estimation
• How does the omniscient policy degrade as its estimates of utility become noisy?
• As noise increases, the omniscient policy approaches an ideal blind policy
• Gaussian noise scaled by network distance:
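A minimal sketch of distance-scaled Gaussian noise, assuming the standard deviation grows linearly with hop count from the agent holding the information. The exact scaling used in the talk is not given here, so `noise_level`, the linear form, and the function names are assumptions.

```python
import random
from collections import deque


def hop_distances(graph, source):
    """Breadth-first hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def noisy_estimates(graph, utility, source, noise_level, rng=None):
    """True utility plus zero-mean Gaussian noise with std = noise_level * hops (assumed form)."""
    rng = rng or random.Random()
    dist = hop_distances(graph, source)
    return {node: utility[node] + rng.gauss(0.0, noise_level * d) for node, d in dist.items()}


graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
utility = {0: 0.0, 1: 1.0, 2: 4.0, 3: 2.0}
print(noisy_estimates(graph, utility, 0, noise_level=0.5, rng=random.Random(0)))
```

Under this form, nearby agents are estimated almost exactly while distant ones become unreliable, so a large `noise_level` leaves the policy with little more to go on than a blind one.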
Noisy estimation
Modeling maze navigation
[Figure: Frequency vs. Utility (Δ path cost)]
Modeling maze navigation
Summary of Results
• Omniscient policy approaches optimal routing on many graphs (not hierarchies)
• Gap between omniscient and blind policies is small when:
– Network is conducive (Small Worlds, Lattice)
– Maintaining shared knowledge is expensive
– Network is massive
– Estimation of value is poor
Improving the model
• Current work on validating this model
– USARSim (Search and Rescue)
– VBS2 (Military C2)
– TREMOR (POMDP)
• Predictive utility estimation and dynamics
• Better solution for optimal policy:
– Prize-collecting Steiner Tree [Ljubić 2007]
Conclusions
• Utility distributions: a mechanism to test information sharing performance
– Computable from real-world data
– Can be conditional/joint/marginal to encode domain dependencies
• Simple random policies: surprisingly competitive in many cases
– No structural or computational overhead
– No expensive costs to maintain utility estimates
Questions?
Outline
• What we mean by large heterogeneous teams
• The common assumptions in our domains
• What we mean by utility and utility distributions
• The experiment
• The results
• Conclusions
• Future work/validation
We need information
• Information generated all over network
• Information consumed all over network
• Team performance is improved by additional information
– More data = better decisions
• However, information loss degrades performance gracefully
– Less data = alright decisions
Scalability of Large Teams
• As size increases, amount of information exchanged grows faster than amount of available bandwidth
– Constant network density: O(n)
Motivation
• Large, heterogeneous teams of agents
– 100s to 1000s of robots, agents, and people
– Must collaborate to complete complex tasks
– Decentralized algorithms
Motivation
• Agents need to share information about objects and uncertainty in the environment to perform roles
– Individual sensor readings unreliable
– Used to reason about appropriate actions
– Maintenance of mutual beliefs is key
• Need effective means to propagate information
– Agent needs for information change dynamically
– Highly redundant data
Utility of Information
• A given piece of data can improve a given agent’s performance by a certain amount
– Need to determine which pieces are useful to deliver to which agents
– Need to determine how a piece of information will affect team performance
Utility of Information
• In our domains, we want to maximize the utility of what we are sending around while minimizing the cost of communication
• There are many possible information sharing strategies. How can we estimate or predict their performance?
USARSim
• In search and rescue/disaster response, network communication is very limited, while information generated must be shipped elsewhere to be processed.
• Video and map information can be compressed, but compression is limited because data must be streamed to operators
• Also, as more autonomous vehicles are added, it becomes impossible for single operators to handle all the information anyway
VBS2
• In military C2, high-level decisions must be made based on available information from a large number of units.
• However, military communications are especially limited, and further constrained by hierarchical organization and classification
• Can we intelligently guarantee that information will get between units and to command units?
TREMOR
• Varakantham et al. present a multiagent POMDP solver that uses reward shaping to decompose joint POMDPs into local POMDPs in situations where most interaction occurs at a small number of “coordination locales”.
• The reward shaping component can be described as an intelligent information sharing problem, and as such, we can create a distributed variant capable of solving much larger multi-agent POMDPs