Carnegie Mellon
AI, Sensing, and Optimized Information Gathering:
Trends and Directions
Carlos Guestrin
joint work with Anupam Gupta, Jon Kleinberg, Brendan McMahan, Ajit Singh, and others…
Monitoring algal blooms
Algal blooms threaten freshwater:
4 million people without water; 1,300 factories shut down; $14.5 billion to clean up.
Other occurrences in Australia, Japan, Canada, Brazil, Mexico, Great Britain, Portugal, Germany, …
Growth processes are still unclear [Carmichael]. Need to characterize growth in the lakes, not in the lab!
[Photo: Tai Lake, China, 10/07, MSNBC]
Can only make a limited number of measurements!
[Figure: measurement locations over depth and location across the lake]
Monitoring rivers and lakes
Need to monitor large spatial phenomena: temperature, nutrient distribution, fluorescence, …
Predict at unobserved locations.
NIMS [Kaiser et al., UCLA]
[Figure: color indicates actual temperature vs. predicted temperature]
Use robotic sensors to cover large areas.
Where should we sense to get the most accurate predictions?
[Singh, Krause, G., Kaiser ’07]
Water distribution networks
Water distribution in a city is a very complex system.
Pathogens in water can affect thousands (or millions) of people.
Currently: add chlorine at the source and hope for the best.
[Figure: EPA water-network simulator; chlorine added at the source; an ATTACK could deliberately introduce a pathogen]
Monitoring water networks [Krause, Leskovec, G., Faloutsos, VanBriesen ’08]
Contamination of drinking water could affect millions of people.
Place sensors to detect contaminations: the “Battle of the Water Sensor Networks” competition.
Where should we place sensors to detect contaminations quickly?
[Figure: EPA simulator; Hach sensor, ~$14K each]
Sensing problems
Want to learn something about the state of the world: detect outbreaks, predict algal blooms, …
We can choose (partial) observations: place sensors, make measurements, …
… but they are expensive/limited: hardware cost, power consumption, measurement time, …
Want to cost-effectively get the most useful information!
Fundamental problem: what information should I use to learn?
Related work
Sensing problems are considered in experimental design (Lindley ’56, Robbins ’52, …), spatial statistics (Cressie ’91, …), machine learning (MacKay ’92, …), robotics (Sim & Roy ’05, …), sensor networks (Zhao et al. ’04, …), and operations research (Nemhauser ’78, …).
Existing algorithms typically either use heuristics (no guarantees; can do arbitrarily badly) or compute optimal solutions via mixed integer programming or POMDPs (very difficult to scale to bigger problems).
This talk
Theoretical: Approximation algorithms that have theoretical guarantees and scale to large problems
Applied: Empirical studies with real deployments and large datasets
Model-based sensing
Model predicts the impact of contaminations.
For water networks: water flow simulator from EPA.
For lake monitoring: learn probabilistic models from data (later).
For each subset A ⊆ V, compute a “sensing quality” F(A), where V is the set of all network junctions.
[Figure: the model predicts high-, medium-, and low-impact locations; a sensor reduces impact through early detection. A placement {S1, …, S4} near the contamination has high sensing quality F(A) = 0.9; a poor placement has low sensing quality F(A) = 0.01]
Optimizing sensing / Outline
Sensing locations, sensing quality, sensing budget, sensing cost
Robust sensing, complex constraints, sequential sensing
Sensor placement
Given: finite set V of locations, sensing quality F
Want: A* ⊆ V such that A* = argmax_{|A| ≤ k} F(A)
Typically NP-hard!
Greedy algorithm:
  Start with A = ∅
  For i = 1 to k:
    s* := argmax_s F(A ∪ {s})
    A := A ∪ {s*}
How well can this simple heuristic do?
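The loop above can be sketched directly in Python. This is a minimal illustration, not the talk’s actual implementation: the toy sensing quality F below (a coverage count over hypothetical contamination events) is an assumption for demonstration, while the real F comes from the EPA simulator.

```python
# Minimal sketch of the greedy algorithm above. The toy sensing quality F
# (coverage of hypothetical contamination events) is an illustrative
# assumption; the talk's real F comes from the EPA water-flow simulator.

def greedy(V, F, k):
    """Greedily pick k sensor locations from V maximizing F."""
    A = []
    for _ in range(k):
        # s* := argmax_s F(A ∪ {s})
        s_star = max((s for s in V if s not in A), key=lambda s: F(A + [s]))
        A.append(s_star)
    return A

# Toy model: each location detects a set of contamination events,
# and F(A) counts how many distinct events A detects.
detects = {"S1": {"e1", "e2"}, "S2": {"e2", "e3"}, "S3": {"e4"}}

def F(A):
    covered = set()
    for s in A:
        covered |= detects[s]
    return len(covered)

A = greedy(["S1", "S2", "S3"], F, k=2)
print(A, F(A))  # two locations covering 3 of the 4 events
```

Each iteration costs one evaluation of F per remaining candidate, which is exactly why fast evaluation of F matters later in the talk.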
Performance of greedy algorithm
[Plot: population protected (higher is better) vs. number of sensors placed, on a small subset of the water networks data; greedy nearly matches optimal]
Greedy score empirically close to optimal. Why?
Key property: Diminishing returns
[Figure: placement A = {S1, S2} vs. placement B = {S1, S2, S3, S4}; adding a new sensor S’ to A helps a lot (large improvement), while adding S’ to B doesn’t help much (small improvement)]
Submodularity: for A ⊆ B, F(A ∪ {S’}) − F(A) ≥ F(B ∪ {S’}) − F(B)
Theorem [Krause, Leskovec, G., Faloutsos, VanBriesen ’08]: the sensing quality F(A) in water networks is submodular!
One reason submodularity is useful
Theorem [Nemhauser et al. ’78]: the greedy algorithm gives a constant-factor approximation:
F(A_greedy) ≥ (1 − 1/e) F(A_opt), i.e. ~63% of optimal.
Greedy gives a near-optimal solution! This guarantee is the best possible unless P = NP.
Many more reasons; sit back and relax…
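The (1 − 1/e) ≈ 63% bound can be sanity-checked numerically on small coverage instances, comparing greedy against exhaustive search. The random set-cover instances below are assumptions for illustration; this is a check, not a proof.

```python
# Numeric sanity check (not a proof) of the Nemhauser bound on random
# set-cover-style instances: greedy >= (1 - 1/e) * optimal.
import itertools
import math
import random

random.seed(0)

def coverage(A):
    """F(A) = size of the union of the chosen sets (a submodular function)."""
    covered = set()
    for s in A:
        covered |= s
    return len(covered)

def greedy(sets, k):
    chosen, remaining = [], list(range(len(sets)))
    for _ in range(k):
        i = max(remaining, key=lambda j: coverage(chosen + [sets[j]]))
        chosen.append(sets[i])
        remaining.remove(i)
    return coverage(chosen)

def optimal(sets, k):
    return max(coverage([sets[j] for j in c])
               for c in itertools.combinations(range(len(sets)), k))

for _ in range(20):
    sets = [frozenset(random.sample(range(8), random.randint(1, 5)))
            for _ in range(6)]
    assert greedy(sets, 3) >= (1 - 1 / math.e) * optimal(sets, 3)
print("bound held on all random instances")
```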
People sit a lot
Activity recognition in assistive technologies; seating pressure as a user interface.
Equipped with 1 sensor per cm²; costs $16,000!
82% accuracy on 10 postures (lean forward, slouch, lean left, …) [Zhu et al.]
Can we get similar accuracy with fewer, cheaper sensors?
Building a Sensing Chair [Mutlu, Krause, Forlizzi, G., Hodgins ’07]
How to place sensors on a chair?
Model the sensor readings at locations V as random variables; predict posture Y using a probabilistic model P(Y, V); pick sensor locations A* ⊆ V to minimize the entropy of the posture given the selected readings.
Theorem: information gain is submodular!* [UAI ’05]
*See store for details
Placed sensors, did a user study:
         Accuracy   Cost
Before   82%        $16,000
After    79%        $100
Similar accuracy at <1% of the cost!
Battle of the Water Sensor Networks Competition
Real metropolitan area network (12,527 nodes); water flow simulator provided by EPA; 3.6 million contamination events; multiple objectives: detection time, affected population, …
Place sensors that detect well “on average.”
BWSN Competition results
13 participants; performance measured on 30 different criteria.
[Chart: total score (higher is better). Our approach, followed by Berry et al., Dorini et al., Wu & Walski, Ostfeld & Salomons, Propato & Piller, Eliades & Polycarpou, Huang et al., Guan et al., Ghimire & Barkdoll, Trachtman, Gueli, Preis & Ostfeld. Methods: G = genetic algorithm, H = other heuristic, D = domain knowledge, E = “exact” method (MIP)]
24% better performance than the runner-up!
Very slow evaluation of F(A): all 3.6M contaminations simulated in 2 weeks on 40 processors; 152 GB of data on disk, 16 GB in main memory (compressed). Very accurate sensing quality.
Naive greedy: 30 hours/20 sensors; 6 weeks for all 30 settings.
Using “lazy evaluations”: 1 hour/20 sensors; done after 2 days!
Advantage through theory and engineering!
[Plot: running time in minutes (lower is better) vs. number of sensors selected, for exhaustive search (all subsets), naive greedy, and fast greedy]
What was the trick? Submodularity to the rescue.
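The “lazy evaluation” trick exploits submodularity: a location’s marginal gain can only shrink as sensors are added, so stale gains cached in a priority queue remain valid upper bounds, and most re-evaluations of the expensive F can be skipped. A sketch (the toy coverage F is an illustrative assumption, not the simulator-based F from the talk):

```python
# Sketch of lazy greedy (CELF-style). Submodularity guarantees cached
# marginal gains are upper bounds, so we only re-evaluate the top element.
import heapq

def lazy_greedy(V, F, k):
    A, fA = [], F([])
    # Max-heap of (-gain, element); stored gains may be stale upper bounds.
    heap = [(-(F([s]) - fA), s) for s in V]
    heapq.heapify(heap)
    while len(A) < k and heap:
        neg_gain, s = heapq.heappop(heap)
        gain = F(A + [s]) - fA  # refresh only this one candidate
        if not heap or gain >= -heap[0][0]:
            A.append(s)  # still beats every (upper-bounded) rival
            fA += gain
        else:
            heapq.heappush(heap, (-gain, s))  # demote and retry
    return A

# Toy model: F(A) counts distinct contamination events detected.
detects = {"S1": {"e1", "e2"}, "S2": {"e2", "e3"}, "S3": {"e4"}}

def F(A):
    covered = set()
    for s in A:
        covered |= detects[s]
    return len(covered)

print(lazy_greedy(["S1", "S2", "S3"], F, 2))
```

Because F only grows more slowly as A grows, the element at the top of the heap after a refresh is provably the true argmax, which is what makes the 30x speedup sound.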
Robustness against adversaries [Krause, McMahan, G., Gupta ’07]
Unified view: robustness to change in parameters, robust experimental design, robustness to adversaries.
SATURATE: a simple but very effective algorithm for robust sensor placement.
What about the worst case? If the sensor locations are known, an adversary can attack vulnerable locations.
[Figure: a placement {S1, …, S4} detects well on “average-case” (accidental) contamination, but knowing the sensor locations, an adversary contaminates elsewhere. Two placements can have very different average-case scores yet the same worst-case score]
Where should we place sensors to quickly detect in the worst case?
Optimizing for the worst case
Associate a separate utility function Fi with each contamination i: Fi(A) = impact reduction by sensors A for contamination i.
[Example: sensors A score high on contamination at node s (Fs(A) high) but low at node r (Fr(A) low); sensors B are the opposite; sensors C score high on both]
Want to solve: A* = argmax_{|A| ≤ k} min_i Fi(A)
Each Fi is submodular, but unfortunately min_i Fi is not submodular! How can we solve this robust sensing problem?
How does the greedy algorithm do?
Theorem [NIPS ’07]: the problem max_{|A| ≤ k} min_i Fi(A) does not admit any approximation unless P = NP.
Example: V = {a, b, c}, can only buy k = 2.
Set A    F1   F2   min_i Fi
{a}      1    0    0
{b}      0    2    0
{c}      ε    ε    ε
{a,b}    1    2    1
Greedy picks c first (score ε); then it can choose only a or b, so the greedy score stays ε, while the optimal score is 1 (pick a and b).
Greedy does arbitrarily badly. Is there something better?
Hence we can’t find any approximation algorithm. Or can we?
Alternative formulation
If somebody told us the optimal value c, could we recover the optimal solution A*?
Need to find A such that min_i Fi(A) ≥ c and |A| ≤ k.
Is this any easier? Yes, if we relax the constraint |A| ≤ k.
Solving the alternative problem
Trick: for each Fi and c, define the truncation F’i,c(A) = min(Fi(A), c).
[Figure: Fi(A) grows with |A|; F’i,c(A) is capped at the value c]
Problem 1 (last slide): find A with min_i Fi(A) ≥ c. Non-submodular; don’t know how to solve.
Problem 2: find A with F’avg,c(A) = (1/m) Σ_i F’i,c(A) = c. Submodular! Can use greedy!
Same optimal solutions: solving one solves the other.
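The truncation can be written directly; the key identity is that min_i Fi(A) ≥ c holds exactly when the truncated average reaches c. A small sketch (the toy coverage functions are illustrative assumptions):

```python
# Sketch of the truncation trick: F'_{i,c}(A) = min(F_i(A), c), averaged.
# Truncation preserves submodularity, and the average reaches c exactly
# when every F_i reaches c, i.e. when min_i F_i(A) >= c.

def truncated_avg(Fs, c):
    return lambda A: sum(min(F(A), c) for F in Fs) / len(Fs)

# Toy functions: F1 rewards covering items {1, 2}, F2 rewards item {3}.
F1 = lambda A: len(set(A) & {1, 2})
F2 = lambda A: len(set(A) & {3})
Favg = truncated_avg([F1, F2], c=1)

print(Favg({1, 2}))  # 0.5: F1 saturates at c=1, but F2 is still 0
print(Favg({1, 3}))  # 1.0: both reach c, so min_i F_i >= c
```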
Back to our example: guess c = 1.
Set A    F1   F2   min_i Fi   F’avg,1
{a}      1    0    0          ½
{b}      0    2    0          ½
{c}      ε    ε    ε          ε
{a,c}                         (1+ε)/2
{b,c}                         (1+ε)/2
{a,b}    1    2    1          1
Greedy on F’avg,1 first picks a, then picks b: the optimal solution!
How do we find c? Do binary search!
Saturate Algorithm [NIPS ’07]
Given: set V, integer k, and submodular functions F1, …, Fm
Initialize cmin = 0, cmax = min_i Fi(V)
Do binary search: c := (cmin + cmax)/2
  Greedily find AG such that F’avg,c(AG) = c
  If |AG| ≤ αk: increase cmin (c is achievable)
  If |AG| > αk: decrease cmax
until convergence
[Figure: truncation threshold c indicated by color]
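Putting the pieces together, the algorithm can be sketched as follows. This is a simplified version under stated assumptions: the toy detection functions are illustrative, and for brevity the sketch checks against the budget k directly rather than the αk blowup from the guarantee.

```python
# Simplified sketch of Saturate: binary search on the target value c,
# greedily covering the truncated average F'_avg,c up to c each round.
# (The full algorithm allows |A| <= alpha*k; this sketch uses k directly.)

def saturate(V, Fs, k, eps=1e-3):
    def favg(A, c):
        return sum(min(F(A), c) for F in Fs) / len(Fs)

    c_lo, c_hi = 0.0, min(F(set(V)) for F in Fs)
    best = set()
    while c_hi - c_lo > eps:
        c = (c_lo + c_hi) / 2
        A = set()
        while favg(A, c) < c - 1e-12 and len(A) < len(V):
            s = max((x for x in V if x not in A),
                    key=lambda x: favg(A | {x}, c))
            if favg(A | {s}, c) <= favg(A, c):
                break  # greedy can make no more progress
            A.add(s)
        if len(A) <= k and favg(A, c) >= c - 1e-12:
            c_lo, best = c, A  # c is achievable within budget
        else:
            c_hi = c
    return best

# Toy robust placement: F1 detects contamination 1, F2 contamination 2.
F1 = lambda A: 1.0 if "x" in A else 0.0
F2 = lambda A: 1.0 if "y" in A else 0.0
A = saturate(["x", "y", "z"], [F1, F2], k=2)
print(A)  # picks x and y, so the worst-case detection min_i Fi is 1.0
```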
Theoretical guarantees
Theorem: The problem max_{|A| ≤ k} min_i Fi(A) does not admit any approximation unless P = NP.
Theorem: Saturate finds a solution AS such that
  min_i Fi(AS) ≥ OPT_k and |AS| ≤ αk,
where OPT_k = max_{|A| ≤ k} min_i Fi(A) and α = 1 + log max_s Σ_i Fi({s}).
Theorem: If there were a polytime algorithm with a better factor β < α, then NP ⊆ DTIME(n^{log log n}).
Example: Lake monitoring
Monitor pH values using a robotic sensor along a transect.
[Plot: pH value vs. position s along the transect; true (hidden) pH values and predictions at unobserved locations]
Where should we sense to minimize our maximum error?
Use a probabilistic model (Gaussian processes) to estimate the prediction error Var(s | A) given observations A; this criterion is (often) submodular [Das & Kempe ’08].
Robust sensing problem!
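The error criterion here is the Gaussian process posterior variance at an unobserved location. A sketch for a 1-D transect (the RBF kernel, length scale, and observation points are illustrative assumptions):

```python
# Sketch of GP posterior variance Var(s | A) along a 1-D transect,
# with an RBF kernel (kernel choice and length scale are assumptions).
import numpy as np

def kern(x, y, ell=1.0):
    return np.exp(-0.5 * (x - y) ** 2 / ell ** 2)

def var_given(s, A, noise=1e-6):
    """Var(s | A) = k(s,s) - k(s,A) [K(A,A) + noise*I]^{-1} k(A,s)."""
    if not A:
        return 1.0  # prior variance k(s, s) for the RBF kernel
    A = np.asarray(A, dtype=float)
    K = kern(A[:, None], A[None, :]) + noise * np.eye(len(A))
    ks = kern(A, s)
    return float(1.0 - ks @ np.linalg.solve(K, ks))

# Observations only ever reduce the predictive variance:
v0, v1, v2 = (var_given(0.5, obs) for obs in ([], [0.0], [0.0, 1.0]))
print(v0, v1, v2)
assert v2 <= v1 <= v0 == 1.0
```

Minimizing the maximum of Var(s | A) over locations s is exactly the min-max structure that Saturate addresses, with one Fi per prediction location.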
Comparison with state of the art
Algorithm used in geostatistics: simulated annealing [Sacks & Schiller ’88, van Groeningen & Stein ’98, Wiens ’05, …], with 7 parameters that need to be fine-tuned.
Environmental monitoring
[Plot: maximum marginal variance (lower is better) vs. number of sensors, for greedy, simulated annealing, and Saturate]
Precipitation data
[Plot: maximum marginal variance vs. number of sensors, for greedy, simulated annealing, and Saturate]
Saturate is competitive and 10x faster, with no parameters to tune!
Results on water networks
60% lower worst-case detection time!
[Plot: maximum detection time in minutes (lower is better) vs. number of sensors, for greedy, simulated annealing, and Saturate; greedy and simulated annealing show no decrease until all contaminations are detected]
Summary so far
Submodularity in sensing optimization: greedy is near-optimal.
Robust sensing: greedy fails badly; Saturate is near-optimal.
Path planning / communication constraints: constrained submodular optimization; pSPIEL gives strong guarantees.
Sequential sensing: exploration-exploitation analysis.
All these applications involve physical sensing
Now for something completely different
Let’s jump from water…
… to the Web!
You have 10 minutes each day for reading blogs / news.
Which of the million blogs should you read?
Information Cascades [Leskovec, Krause, G., Faloutsos, VanBriesen ’07]
[Figure: an information cascade spreading over time; blogs downstream learn about the story after us]
Which blogs should we read to learn about big cascades early?
Water vs. Web
In both problems we are given a graph with nodes (junctions / blogs) and edges (pipes / links), and cascades spreading dynamically over the graph (contamination / citations).
Want to pick nodes to detect big cascades early: placing sensors in water networks vs. selecting informative blogs.
In both applications, the utility functions are submodular.
Performance on blog selection
Outperforms state-of-the-art heuristics; 700x speedup using submodularity!
[Plot: running time in seconds (lower is better) vs. number of blogs selected, for exhaustive search (all subsets), naive greedy, and fast greedy]
Blog selection (~45k blogs)
[Plot: fraction of cascades captured (higher is better) vs. number of blogs, for greedy, in-links, all out-links, # posts, and random]
Naive approach: just pick the 10 best blogs. This selects big, well-known blogs (Instapundit, etc.), which contain many posts and take long to read!
Taking “attention” into account
[Plot: cascades captured vs. number of posts (time) allowed, for cost/benefit analysis vs. ignoring cost]
Cost-benefit optimization picks summarizer blogs!
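Cost-benefit greedy picks the blog with the best marginal gain per unit reading cost within the attention budget. A sketch under stated assumptions: the blogs, costs, and cascades below are toys, and note that in the literature the ratio rule alone can fail badly, so it is typically combined with the uniform-cost greedy solution (taking the better of the two):

```python
# Sketch of cost-benefit greedy: best marginal gain per unit cost,
# subject to a reading-time budget. Toy blogs/costs are assumptions.

def cost_benefit_greedy(V, F, cost, budget):
    A, spent = [], 0.0
    while True:
        affordable = [s for s in V if s not in A and spent + cost[s] <= budget]
        if not affordable:
            return A
        s = max(affordable, key=lambda s: (F(A + [s]) - F(A)) / cost[s])
        if F(A + [s]) <= F(A):
            return A  # no remaining blog adds coverage
        A.append(s)
        spent += cost[s]

# One big blog covers everything but costs too much attention;
# two cheap "summarizer" blogs cover the same cascades together.
cascades = {"big": {"c1", "c2", "c3"}, "s1": {"c1", "c2"}, "s2": {"c3"}}
cost = {"big": 10.0, "s1": 1.0, "s2": 1.0}

def F(A):
    covered = set()
    for b in A:
        covered |= cascades[b]
    return len(covered)

A = cost_benefit_greedy(["big", "s1", "s2"], F, cost, budget=2.0)
print(A, F(A))  # the two cheap blogs capture all 3 cascades
```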
Predicting the “hot” blogs
Want blogs that will be informative in the future. Split the data set: train on historic data, test on the future.
[Plots: #detections per month, Jan to May, for greedy and Saturate; greedy trained on historic data detects well on the training period but poorly afterwards]
[Plot: cascades captured vs. number of posts (time) allowed: greedy trained on the future (“cheating”) vs. greedy trained on historic data, tested on the future]
Poor generalization! Why is that? Blog selection “overfits” to the training data! Let’s see what goes wrong, and find blogs that continue to do well.
Robust optimization
Define Fi(A) = detections in time interval i, and optimize the worst case over intervals.
[Figure: the “overfit” blog selection A has F1(A)=.5, F2(A)=.8, F3(A)=.6, but F4(A)=.01 and F5(A)=.02; the “robust” selection A* optimizes the worst case]
[Plots: #detections per month, Jan to May, using greedy vs. Saturate]
Robust optimization acts as regularization!
Predicting the “hot” blogs
[Plot: sensing quality vs. number of posts (time) allowed: greedy on historic tested on future, robust solution tested on future, and greedy on future (“cheating”)]
50% better generalization!
Summary
Submodularity in sensing optimization: greedy is near-optimal.
Robust sensing: greedy fails badly; Saturate is near-optimal.
Path planning / communication constraints: constrained submodular optimization; pSPIEL gives strong guarantees.
Sequential sensing: exploration-exploitation analysis.
Constrained optimization gives better use of “attention”; robust optimization gives better generalization.
AI-complete dream
Robot that saves the world; robot that cleans your room. It’s definitely useful, but really narrow; hardware is a real issue; it will take a while.
What’s an “AI-complete” problem that will be useful to a huge number of people in the next 5-10 years? What’s a problem accessible to a large part of the AI community?
What makes a good AI-complete problem?
A complete AI system needs:
Sensing: gathering information from the world.
Reasoning: making high-level conclusions from information.
Acting: making decisions that affect the dynamics of the world and/or the interaction with the user.
But also: hugely complex; can get access to real data; can scale up and layer up; can make progress; very cool and exciting.
Data gathering can lead to good, accessible, and cool AI-complete problems.
Example: Factcheck.org. Take a statement, collect information from multiple sources, evaluate the quality of the sources, connect them, then make a conclusion AND provide an analysis.
Automated fact checking: given a query (“fact or fiction?”), produce a conclusion and justification from the Web, models, and inference, with active user feedback on the sources and the proof.
This can lead to a very cool “AI-complete” problem that is useful, and where we can make progress in the short term!
Conclusions
Sensing and information acquisition problems are important and ubiquitous.
Can exploit structure to find provably good solutions: algorithms with strong guarantees that perform well on real-world problems.
Could help focus on a cool “AI-complete” problem.