Topic defense- Situation modeling and detection
1
Situation Modeling and Detection
Vivek Singh
Advisor: Ramesh Jain
2
Introduction
• Trends
– Social media
– Internet of things
– Human (participatory) sensing
• Properties
– Multiple media
– Spatio-temporal
– Realtime
– Cloud
3
Social Life Networks: Connecting People and Resources
Aggregation and Composition
Situation Detection
Alerts
Queries
Information
Situation aware routing
4
Motivating example
STT data
Tweet:‘Urrgh… got the flu’
Loc: NYC, Date: 3rd Jun, 2011, Theme: Swine Flu
Situation Detection User-Feedback
‘Please visit nearest CDC center at 4th St immediately’
Date: 3rd Jun, 2011
Aggregation, Characterization, …
1) Characterization
2) Control action
Alert level = High
5
Aim
• Computational tools to define and detect situations using all available (device and human) data sources.
• Focus:
– STT (Spatio-temporal-thematic) data
– Social and sensor networks
Situations
• Multiple definitions
– Situation awareness
– Situation modeling
– Situation detection
– Situation calculus
– Context based computing
“the perception of elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future” (Endsley, 1988).
“knowing what is going on so you can figure out what to do” (Adam, 1993).
“the complete state of the universe at an instant of time” (McCarthy, 1969).
“a set of past contexts and/or actions of individual devices relevant to future device actions” (Wang, 2004).
“…extensive information about the environment to be collected from all sensors independent of their interface technology. Data is transformed into abstract symbols. A combination of symbols leads to representation of current situations…which can be detected” (Dietrich, 2003).
“A situation is a set of contexts in the application over a period of time that affects future system behavior” (Yau, 2006).
Situation: definition
• Situation:
– “An actionable abstraction of observed spatio-temporal descriptors”
– Revisiting the definitions above
9
Applications
• Healthcare
– Alert me if there is a flu epidemic in my area
• Business analysis
– Where is the most suitable place to open a new ‘iPhone’ store?
• Weather
– Alert me when the fall colors blossom in New England
• Daily living
– Which place (and at what time) is conducive for exercising?
• Weather, climate, politics, traffic, …
10
Generic Situation modeling and detection
A. STT data representation and aggregation
– Unified representation of STT data across scale
B. Situation characterization operators
– Generic operators which can be used declaratively across applications
C. Situation modeling
– Generic building blocks to define ‘actionable’ situations
Situation: “An actionable abstraction of observed spatio-temporal descriptors”
11
Timeline
Step 1) Visualization: iPhone launch in Google Earth
Step 2) Generic data representation
Step 3) Operators for processing
Step 4) Generic list of event processing operators
Step 5) Generic list of declarative operators
Step 6) Generic blocks to define actionable queries
12
Step 1) Visualization: iPhone launch in Google Earth
• iPhone launch: Jun 8th, 2009.
13
S2) STT data representation: Social Pixels
• Focus on commonality across media sources (STT)
• Analogy: photons aggregating at a location
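The photon analogy can be made concrete with a small sketch: geo-tagged posts on one theme are counted per grid cell, and the resulting grid plays the role of an e-mage. This is only an illustration under stated assumptions. The function name `build_emage`, the tuple layout `(lat, lon, theme)`, the 1-degree cell size, and the sample posts are all hypothetical and not taken from the slides.

```python
import math


def build_emage(posts, theme, lat_range, lon_range, cell_deg=1.0):
    """Count posts on `theme` per grid cell; rows are latitude bins, columns longitude bins."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    n_rows = math.ceil((lat_max - lat_min) / cell_deg)
    n_cols = math.ceil((lon_max - lon_min) / cell_deg)
    emage = [[0] * n_cols for _ in range(n_rows)]
    for lat, lon, post_theme in posts:
        if post_theme != theme:
            continue  # a different theme contributes to a different e-mage
        if not (lat_min <= lat < lat_max and lon_min <= lon < lon_max):
            continue  # outside the bounding box of this e-mage
        row = int((lat - lat_min) // cell_deg)
        col = int((lon - lon_min) // cell_deg)
        emage[row][col] += 1  # one more "photon" lands on this social pixel
    return emage


# Hypothetical sample posts: (latitude, longitude, theme)
posts = [
    (40.7, -74.0, "flu"),
    (40.2, -74.9, "flu"),
    (41.5, -73.2, "flu"),
    (40.9, -74.5, "iphone"),
    (50.0, -74.0, "flu"),  # outside the bounding box, ignored
]
emage = build_emage(posts, "flu", (40.0, 42.0), (-75.0, -73.0))
```

Because every source (tweets, check-ins, sensor readings) reduces to the same grid of counts, later operators do not need to know where the data came from.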
14
Why social pixels/Emages?
• Advantages
– Visualization
– Intuitive query and mental model
– Common spatio-temporal data representation
– Data analysis using media processing
• Image/Media Processing operators -> Situation characterization operators
– e.g. convolution, filtering, background subtraction
15
S3) Operators for processing
16
S4) Situation detection operators
17
S5) Situation characterization operators (declarative)
S. No | Operator | Input | Output
1 | Selection | Temporal E-mage Set | Temporal E-mage Set
2 | Arithmetic & Logical | K * Temporal E-mage Set | Temporal E-mage Set
3 | Aggregation | α Temporal E-mage Set | Temporal E-mage Set
4 | Grouping | Temporal E-mage Set | Temporal E-mage Set
5 | Characterization: Spatial | Temporal E-mage Set | Temporal Pixel Set
5 | Characterization: Temporal | Temporal Pixel Set | Temporal Pixel Set
6 | Pattern Matching: Spatial | Temporal E-mage Set | Temporal Pixel Set
6 | Pattern Matching: Temporal | Temporal Pixel Set | Temporal Pixel Set
18
Media processing engine
19
Implementation and results
• Twitter feeds
– Geo-coding user home location
– Loops of location based queries for different terms
– Over 100 million tweets using ‘Spritzer’ stream (since Jun 2009), and the higher rate ‘Gardenhose’ stream since Nov, 2009.
• Flickr feeds
– API
– Tags, RGB values from >800K images
Singh, Gao, Jain, ACM Multimedia conference, 2010
20
[Figure: processing pipeline for choosing a new store location]
1. AT&T retail locations, convolved (*) with the store catchment area, give the AT&T total catchment area.
2. iPhone theme based e-mages, Jun 2 to Jun 11, are added (+) to give the aggregate interest, which is convolved (*) with the store catchment area.
3. Subtracting (-) the total catchment area from the aggregate interest gives the under-served interest areas.
4. Maxima of the under-served interest areas gives the decision: Best location is at geocode [39, -122], just north of Bay Area, CA.
Reverse-geocoding result for the maxima:
<geoname><name>College City</name><lat>39.0057303</lat><lng>-122.0094129</lng><geonameId>5338600</geonameId><countryCode>US</countryCode><countryName>United States</countryName><fcl>P</fcl><fcode>PPL</fcode><fclName>city, village,...</fclName><fcodeName>populated place</fcodeName><population/><distance>1.0332</distance></geoname>
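The add, convolve, subtract, and maxima steps of this slide can be sketched on a toy grid. Everything concrete here is an assumption made for illustration: the 3x4 grids, the plus-shaped `catchment` kernel, and the `coverage_weight` factor that puts store coverage and interest on a comparable scale. The slides do not state how the two e-mages were normalized before subtraction.

```python
def convolve(grid, kernel):
    """'Same'-size 2D convolution with zero padding.

    The kernel used below is symmetric, so no kernel flip is needed.
    """
    rows, cols = len(grid), len(grid[0])
    k = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += grid[rr][cc] * kernel[dr + k][dc + k]
            out[r][c] = total
    return out


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical inputs: two daily interest e-mages and one existing store.
day1 = [[0, 0, 0, 0], [0, 5, 0, 0], [0, 0, 0, 4]]
day2 = [[0, 0, 0, 0], [0, 3, 0, 0], [0, 0, 0, 2]]
stores = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
catchment = [[0, 1, 0], [1, 2, 1], [0, 1, 0]]  # assumed catchment shape
coverage_weight = 10  # assumed interest units served per unit of coverage

aggregate_interest = convolve(add(day1, day2), catchment)
total_catchment = convolve(stores, catchment)
under_served = [
    [i - coverage_weight * s for i, s in zip(ri, rs)]
    for ri, rs in zip(aggregate_interest, total_catchment)
]
# Maxima: the cell with the largest unmet interest is the suggested location.
best = max(
    ((r, c) for r in range(len(under_served)) for c in range(len(under_served[0]))),
    key=lambda rc: under_served[rc[0]][rc[1]],
)
```

On this toy input the cell around the existing store goes negative (already served), while the second cluster of interest remains positive and is picked as the maximum.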
21
Flickr Social E-mages
• Jan – Dec 2009
22
Seasonal characteristics analysis
• Show me the difference between red and green colors for New England region, as it varies throughout the year
(-(sum (t <= 1yr theme = Green R=[(40,-76), (44,-71)] (TES)), sum(t <= 1yr theme = Red R=[(40,-76), (44,-71)] (TES))))
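One way to read the expression above is as three steps: select e-mages by time, theme, and region, sum each selected e-mage, then subtract the two series. The sketch below follows that reading under several assumptions: the temporal e-mage set (TES) is modeled as a dictionary keyed by `(month, theme)`, each e-mage is a dictionary from `(lat, lon)` to a value, and the sample numbers are invented. It subtracts in the argument order of the expression (Green minus Red); swap the arguments for an [R-G] series.

```python
def select(tes, theme, region, months):
    """Selection: keep e-mages of one theme, cropped to a bounding box and a set of months."""
    (lat_lo, lon_lo), (lat_hi, lon_hi) = region
    return {
        month: {
            cell: value
            for cell, value in emage.items()
            if lat_lo <= cell[0] <= lat_hi and lon_lo <= cell[1] <= lon_hi
        }
        for (month, emage_theme), emage in tes.items()
        if emage_theme == theme and month in months
    }


def total(selected):
    """Aggregation: sum the pixel values of each e-mage, one number per month."""
    return {month: sum(emage.values()) for month, emage in selected.items()}


def subtract(a, b):
    """Arithmetic: month-wise difference a - b."""
    return {month: a[month] - b.get(month, 0) for month in a}


# Hypothetical temporal e-mage set: (month, theme) -> {(lat, lon): value}
tes = {
    (1, "Green"): {(42, -72): 5, (30, -90): 9},  # (30, -90) lies outside the region
    (1, "Red"): {(42, -72): 2},
    (10, "Green"): {(42, -72): 3},
    (10, "Red"): {(42, -72): 8, (43, -73): 4},
}
region = ((40, -76), (44, -71))  # R=[(40,-76), (44,-71)] from the query
months = range(1, 13)            # t <= 1yr

green = total(select(tes, "Green", region, months))
red = total(select(tes, "Red", region, months))
diff = subtract(green, red)
```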
23
Variations throughout the year
– Fall colors of New England
– [R-G] channel data
• Total Energy
[Figure: total energy of the [R-G] channel, plotted from Jan to Dec]
24
S6) Generic blocks to define ‘actionable’ queries
Application | End user | Domain expert | IT expert
1) Banking | Action: Apply for loan -> Accepted/rejected | Domain rules (Banker): Check credit history, Check collateral, … | UML: Classes, Attributes, Constraints, …
2) Swine flu | Action: Tweet about sore throat -> Actions recommended | Domain rules (Doctor): Personal condition, Check location affect, Rate of growth, … | SituationML: Emages, Events, Characterizations, …
Aim: Actionable mass personalization for end users
25
Situation Modeling: Problem
• High level (Abstract)
• Vague
• Spatio temporal
• Across different data sources
• Across different abstraction levels
Situation, e.g. Pandemic level
Data sources
Operators
Representation level
Characteristics
1. Model
2. Evaluate
26
Why situation modeling?
• Provides IT experts a short-hand conceptual model to capture domain semantics for STT data
• Decoupled from both:
1. Specific applications
2. Implementation details
– But bridges the gap between the two
• Allows reuse of components:
– Across applications
– Across different queries within same application
27
Modeling Kit
1. Data representation levels
2. Operators:
a) Transform across representation levels
b) Characterize data in any layer
3. Algorithm:
– To model the situation descriptor in terms of 1) and 2) above.
28
The framework
Representations, from more abstraction (less detail) to less abstraction (more detail). Transformations move data between levels; characterizations extract properties at each level.
Level 3: Symbolic Rep. (Events)
Example: Swine flu outbreak NYC, 02/12/11
Level 2: Aggregation (Emage)
Example: NYC, 02/12/11, Flu, 14 persons
Level 1: Unified representation (STT Data)
Examples: {NYC, 02/12/11, Flu, 1 person}; {NYC, 02/12/11, Flu, 13 persons}
Level 0: Raw data e.g. tweets, cameras, traffic, weather, RSS, check-ins, www
Examples: Tweet: Arrggh ! Got sore throat; Check-ins: John checked in at NY CDC w 12 others
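The flu example on this slide can be traced through the levels in a few lines: two raw reports become STT tuples, the tuples are aggregated to 14 persons, and the aggregate is turned into a symbolic event. The sketch is illustrative only. The dictionary layout of a raw report, the function names, and the `threshold` of 10 persons are assumptions; the slides do not say what count qualifies as an outbreak.

```python
from collections import defaultdict


def to_stt(raw_report):
    """Level 0 -> 1: reduce a raw report to a (location, date, theme, count) tuple."""
    return (raw_report["loc"], raw_report["date"], raw_report["theme"], raw_report["persons"])


def aggregate(stt_tuples):
    """Level 1 -> 2: sum counts that share location, date, and theme."""
    totals = defaultdict(int)
    for loc, date, theme, count in stt_tuples:
        totals[(loc, date, theme)] += count
    return dict(totals)


def detect_events(aggregated, threshold):
    """Level 2 -> 3: emit a symbolic event where the count reaches the threshold."""
    return [
        f"{theme} outbreak {loc}, {date}"
        for (loc, date, theme), count in aggregated.items()
        if count >= threshold
    ]


# Raw data from the slide: one tweet (1 person), one check-in with 12 others (13 persons).
raw = [
    {"source": "tweet", "loc": "NYC", "date": "02/12/11", "theme": "Flu", "persons": 1},
    {"source": "check-in", "loc": "NYC", "date": "02/12/11", "theme": "Flu", "persons": 13},
]
stt = [to_stt(report) for report in raw]
aggregated = aggregate(stt)
events = detect_events(aggregated, threshold=10)  # threshold is an assumed value
```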
29
The framework: Building Blocks
Operands:
– Descriptors, e.g. Swine flu level (output space: Low, Mid, High)
– Data sources, e.g. Twitter
– Representation level, e.g. Events (#Reports)
Operators:
– Join, Filter, Transform (symbols shown: Δ, ∏)
– Φ Learn
– @ Characterize
30
Situation Modeling: Algorithm overview
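[Figure: dependency graph in three layers. Situation descriptor: C1 (Low, Mid, High). Intermediate descriptors: v2 to v6, combined through functions f1 and f2. Data sources: D1 to D4, connected through the operators @, ∏ and Δ. A second panel shows the path C1, v3, D3.]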
31
Algorithm Get_dependency_list (v) {
  1. Identify output state space.
  2. Identify component features; v = f1(v1, …, vk)
     a) If (type = imprecise)
        – Identify learning data source.
  3. ForEach (feature vi) {
     a) Identify data sources. DS_list.Add();
     b) ForEach (Rep. level reqd.)
        – Identify variable, theme for transformation;
     c) If (vi.type != (observed || internal))
        – Get_dependency_list(vi);
  }
}
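The recursive part of the algorithm (step 3) can be sketched in Python as a walk over a descriptor tree that collects data sources. This is a minimal sketch, not the author's implementation: the `Descriptor` class, its field names, and the example wiring of the pandemic descriptor to "Twitter" and "Census" are assumptions, and steps 1, 2, and 3b (output space, learning source, representation levels) are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    name: str
    kind: str = "derived"  # "derived", "observed", or "internal"
    sources: list = field(default_factory=list)   # data sources feeding this descriptor
    features: list = field(default_factory=list)  # component features v1..vk


def get_dependency_list(v, ds_list=None):
    """Collect, in first-seen order, the data sources that descriptor v depends on."""
    if ds_list is None:
        ds_list = []
    for feature in v.features:
        # Step 3a: identify data sources and add them to the list.
        for source in feature.sources:
            if source not in ds_list:
                ds_list.append(source)
        # Step 3c: recurse unless the feature is directly observed or internal.
        if feature.kind not in ("observed", "internal"):
            get_dependency_list(feature, ds_list)
    return ds_list


# Hypothetical wiring of a pandemic-level descriptor.
pandemic = Descriptor("Pandemic level", features=[
    Descriptor("Number of outbreak events", kind="observed", sources=["Twitter"]),
    Descriptor("% of population at risk", features=[
        Descriptor("Locations with high activity", kind="observed", sources=["Twitter"]),
        Descriptor("Population at locations", kind="observed", sources=["Census"]),
    ]),
    Descriptor("Size of high activity zone", kind="observed", sources=["Twitter"]),
])
deps = get_dependency_list(pandemic)
```

The recursion stops at observed or internal features, which mirrors the termination test in step 3c of the pseudocode.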
Data Sources List
Representations required
Operators
Input
Output
Internal descriptors
Actionable situation descriptor
32
[Figure: dependency graph for the situation descriptor ‘Pandemic level’ (Low, Mid, High). Features: Number of outbreak events; % of population at risk (∈ ℝ, [0,1]), derived from locations with high activity and population at locations; Size of high activity zone. Data sources shown: CDC reports (with Φ) and Census (with ∏). Representations shown: S-t-t (#reports), S-t-t (population), Emage (#reports), Emage (High activity), Events (#reports), linked by the Δ, ∏ and @ operators.]
33
Results: Asthma
• Asthma affects 15 million Americans, 5 million of whom are children.
• 90% of all asthma cases are Extrinsic, i.e. allergic asthma. 80% of children with asthma also have documented allergies.
• Better planning of daily activities can minimize risk of severe asthma attacks.
http://www.rxlist.com/allergy/article.htm , http://www.rxlist.com/asthma/page6.htm#tocl
34
Application
• Uses:
– Individuals: Planning their daily activities, or combining across their lifetimes to measure their exposure level
– Macro-level policy makers: Noticing sudden changes, identifying healthier years, seasons, locations
– Insurance companies: Care about both levels, e.g. charging different premiums.
35
Pre-processing of Data
• Image transformation of Pollen and Air quality maps
– Rectified images through 25 matching points
– Filtered for only populated US areas
• Downloading tweets through API
• Resolution used:
– Pollen and Air quality = 0.1 lat by 0.1 lon
– Tweets = 1 lat by 1 lon
36
Sample Individual “Query”/concern
Location: Anaheim (33.806299,-117.919185)
Date: May 25, 2011
INDIVIDUAL QUERIES
37
1. Alert me when major Allergy outbreak happens in my location !
*ALI= Asthma like Illness
[Figure: dependency graph for ‘Allergy Outbreak’ (Yes, No). Features: Number of ALI* cases reported, Rate of growth, Pollen Index, Air Quality Index. Data sources shown: Self created DB (past data, current), Weather.com. Representations shown: S-t-t (#reports), Emage (#reports), Emage (Pollen Index), Emage (Air Quality Index), linked by the ∏, Δ and @ operators.]
[Figure panels: Human sensor reports; Growth rate (human reports); Pollen Index; Air quality Index; ALLERGY: Local condition severity]
1. Alert me when major Allergy outbreak happens in my location !
• LCS(33.80,-117.91)= NO ALERT!
Human sensors: High (3/3)
Growth: Neutral (2/3)
Pollen index: Medium (3/5)
Air quality index: Low (1/5)
39
2. How healthy is today for me?
[Figure: dependency graph for ‘Healthiness Rating’ (Conducive, OK, Unhealthy), combining Personal Condition Severity and Locality Condition Severity. Personal Condition Severity is derived from S-t-t (ALI report) via Δ and @. Locality Condition Severity uses the features of Query 1: Number of ALI* cases reported, Rate of growth, Pollen Index, Air Quality Index, with data from Self created DB (past data, current) and Weather.com, and representations S-t-t (#reports), Emage (#reports), Emage (Pollen Index), Emage (Air Quality Index).]
40
2. How healthy is today for me?
• Healthiness Rating = Poor
• White Box details
Personal Condition Severity = 3
Locality Condition Severity = 2
Net Condition Severity = 3 * 6 = 3 i.e. Poor ϵ {Good, Poor, Hazardous}
41
3. What is the best location for me to undertake outdoor activities?
[Figure: dependency graph for ‘Best Location’ (output: Location). Features: Distance, Personal Condition Severity, Locality Condition Severity. Input shown: S-t-t (ALI report), with the Δ and @ operators.]
NOTE: 1) Locality Condition Severity and Personal Condition Severity are the same as those defined in Query 2.
42
3. What is the best location for me to undertake outdoor activities?
• Best location to exercise is at: Irvine (33.7,-117.8) really !
ALLERGY: Local condition severity
White box details
Location recommended = (33.7, -117.8)
Distance = 0.13 degree ≈ 10 miles
Healthiness Rating at rec. loc. = Conducive
Healthiness Rating at your loc = Poor
TBD: Find nearest park using Google API
43
4. What is the National Allergy Risk Index for today ?
[Figure: dependency graph for ‘National Allergy Risk Index’ (Low, Mid, High). Features: Population, from US Census as Emage (population); Locality Condition Severity. Operators shown: Δ and @.]
NOTE: 1) Locality Condition Severity for each location is the same as that defined in Query 2.
MACRO QUERIES
44
4. What is the National Allergy Risk Index for today ?
• National Allergy Risk Index= Mid
Details:
% population under hazardous conditions = 0.0041%
% population under poor conditions = 56.9%
% population under conducive conditions = 43.1%
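Percentages like these can be obtained by weighting each location's severity class by its population. The sketch below shows only that weighting step, with invented cell names and numbers; the function name `population_share_by_class` is hypothetical, and the rule that maps the shares to Low, Mid, or High is not given in the slides, so it is not modeled here.

```python
def population_share_by_class(population, severity):
    """Percent of the total population living in cells of each severity class."""
    totals = {}
    for cell, people in population.items():
        label = severity[cell]
        totals[label] = totals.get(label, 0) + people
    grand_total = sum(totals.values())
    return {label: 100.0 * count / grand_total for label, count in totals.items()}


# Hypothetical population e-mage and locality condition severity per cell.
population = {"cell_a": 600, "cell_b": 300, "cell_c": 100}
severity = {"cell_a": "poor", "cell_b": "conducive", "cell_c": "poor"}
shares = population_share_by_class(population, severity)
```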
ALLERGY: Local condition severity
45
Related problems tackled
1. Situation based control
2. Properties: STT power laws
3. User behavior modeling
46
Situation based control
• Situation Calculus
• Environment-to-environment Communication
1) Best Student Paper: IEEE workshop on situation management, MILCOM, 2009
2) E2E systems paper: Multimedia Tools and App. Journal
47
STT power laws
• 80% of tweets are on 20% of topics.
• There is a fixed relative ratio for the occurrence of events of different magnitude across space or time.
[Figure: Log(Magnitude) vs. Log(Rank), across space: Whole world, Only USA, Around New York city]
[Figure: Log(Magnitude) vs. Log(Rank), across time: 30 mins, 1 day, 1 week, 2 weeks, 3 weeks, 1 month]
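A straight line on a Log(Rank) vs. Log(Magnitude) plot is the usual signature of a power law, and its slope can be estimated with ordinary least squares on the logged values. The sketch below does this on synthetic data that follows magnitude = 1000 / rank exactly, so the fitted slope should be close to -1. The function name and the data are assumptions for illustration and do not reproduce the measurements behind the slide.

```python
import math


def rank_magnitude_slope(magnitudes):
    """Least-squares slope of log(magnitude) against log(rank); magnitudes must be > 0."""
    ordered = sorted(magnitudes, reverse=True)  # rank 1 is the largest magnitude
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(m) for m in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic Zipf-like data: the topic at rank r has magnitude 1000 / r.
magnitudes = [1000 / rank for rank in range(1, 51)]
slope = rank_magnitude_slope(magnitudes)
```

Comparing slopes fitted on different spatial extents or time windows is one way to check the "fixed relative ratio" claim in the bullet above.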
48
User behavior modeling: incentivizing crowd sensing…
• User perspective:
– Optimal contribution strategy, i.e. “when (and when not) should she undertake the social media task”
• System designer perspective:
– “Finding the optimal incentive levels to influence these selfish end-users so that the overall system utility is maximized”
Best Paper, ACM Workshop on Social Media, 2009
49
Summary
• Computationally defined situations
• Proposed a generic situation modeling framework
– STT data representation / aggregation
– Across granularity
– Characterization Operations
– Domain knowledge
• Aggregated human and sensor network data
50
Work Plan
1. Measuring Situation Models?
2. Applications:
– More robust analysis for allergy
– Another application
3. System building?
4. Leave control aspect for future work?
5. Include/Exclude other research threads