SEVERE WEATHER EVENTS AND SOCIAL MEDIA STREAMS: BIGDATA APPROACH FOR IMPACT MAPPING

Post on 27-Jan-2015


Big data seminar Trento : http://www.eitictlabs.eu/news-events/events/article/big-data-seminar-series/#gridView

Seminars BigData Trento 26/03/2013

Via Sommarive, 18 – sala EIT ICT Labs.

Alfonso Crisci - a.crisci@ibimet.cnr.it

Valentina Grasso - grasso@lamma.rete.toscana.it

Image:http://www.greenbookblog.org/2012/03/21/big-data-opportunity-or-threat-for-market-research/

Severe weather events and social media streams:

bigdata approach for impact mapping

Social media and SEO are the rivers of information available on the web.

Are they useful or not? That is the question (W. Shakespeare).

Social Media are Data

•contents (UGC)

•conversation

•connection

•collaboration

•community

A big lens on crowd behaviour, essentially through:

COMMUNITY membership and CONVERSATION

5C

Why Social Media & Big Data?

User-generated content (UGC) is currently the world's largest mine of data for every purpose.

Performing data mining on this kind of data involves many tasks and computational services for parsing and information extraction, concerning:

Georeferencing

Social Network analytics

Semantic processing

Information Rendering and visualisations

Co-Inference with other informative sources

Retrieval and storage of SM streams

Now there are platforms to realize the data meet-up:

MapReduce

Parallel computation

RHadoop

Today, working code…

RHadoop (Revolution Analytics) is only an example!

Social Media versus Weather

Intrinsically Big Data

SM & weather are connected!!

Plenty of weather content on SM

• Weather is a common conversation topic

• Services push the personalization of weather forecast

• Perceived weather has a local dimension

• Weather can become an "emergency" issue

Considering severe weather events...

Where

Who

When

They happen in space and in time and, through media and SM, build and trigger an informative frame on the Web-sphere.

A deep analogy with WEB processes exists!

Weather as emergency issue main features

•FREQUENT: compared to other emergencies

•FAMILIAR: people deal with weather daily

•PREDICTABLE: important for warnings

•LOCATED: specific spatial and temporal dimension

#fires #earthquake #chemical #nuclear #disaster #health #terrorism

Weather as an operational context where a community may strengthen its "resilience" attitude.

In emergency "behaviours" modulate "impacts" on society.

If I'm aware and prepared I act responsibly.

US tornado warning:

people got used to "weather warnings" and learnt to be proactive in protecting themselves.

Enhance the resilience of communities as the aim

Changing climate - changing awareness

In Italy and Europe, over the last 10 years climate change has made us more exposed to extreme weather events - "preparedness"

Tornado hits: US - Italy 1999-2009

Geographical spreading and magnitude of events are important for awareness

Lovely (or less lovely) weather fakes on SM... are everywhere…

Information verification becomes a must!

Welcome Bigdata!

Verification is a question of time, event shape and coherence

start

peak

decline

weather phenomena and social/communication streams as "analogue" time-delayed information waves
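One way to make the "time-delayed waves" idea concrete is to look for the delay at which the social series best follows the weather series. The sketch below is only an illustration under stated assumptions: the function names `pearson` and `best_lag`, the `max_lag` parameter and both sample series are hypothetical and are not taken from the talk.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def best_lag(weather, social, max_lag=3):
    """Return (lag, r): the delay in time steps (0..max_lag) at which the
    social series correlates best with the weather series."""
    best = (0, pearson(weather, social))
    for lag in range(1, max_lag + 1):
        # Pair weather at step t with the social signal at step t + lag.
        r = pearson(weather[:-lag], social[lag:])
        if r > best[1]:
            best = (lag, r)
    return best

# Hypothetical series: a temperature wave and a tweet-volume wave.
weather = [20, 22, 27, 29, 28, 23, 21, 20]
social = [100, 105, 110, 240, 310, 290, 130, 108]
lag, r = best_lag(weather, social)
print(lag, round(r, 2))
```

In this made-up example the tweet volume peaks one step after the temperature, so the search settles on a delay of one step.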

time

... and geography as well

... dynamic information warping means exploring the time coherence between a real physical process [or its mathematical representation!] and information flows

In a multidimensional space, or better in every time-varying system (such as the atmosphere or the "WEB information seas"), some structures can always be detected.

Uncovering the Lagrangian Skeleton of Turbulence. Mathur et al., Phys Rev Lett. 2007 Apr 6;98(14):144502. Epub 2007 Apr 4.

Lagrangian coherent structures (LCS) are well known in ecology and fluid dynamics

When two or more time-varying systems are connected, a supercoherence can be detected if the processes are linked.

The link structure between SM and weather could hypothetically be described by an appropriate hierarchy model (theory of middle-number systems, Weinberg 1975). Social media and weather relationships are surely an Organized Complexity: too many parts to be deterministically predicted, too few to be statistically forecast.

Agent-Based Modeling of Complex Spatial Systems http://www.ncgia.ucsb.edu/projects/abmcss/ May Yuan, University of Oklahoma

SMERST 2013: Social Media and Semantic Technologies in Emergency Response, 15-16 April 2013, University of Warwick, Coventry UK

Disaster 2.0 project

Weather event: early heat wave on 5-7 April 2011

Working case on Italian Twitter-sphere

• investigate time/space coherence between the event extension and its social footprint on Twitter

• semantic analysis of the Twitter stream on peak and off-peak days

Research objectives

Heat wave as a good case

Emergency as consequence of "behaviour"

Communication is key: "how to act"

Heat wave: definition

It's a period with persistent T° above the seasonal mean. The local definition depends on the regional climatic context.
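The definition above can be sketched as a search for runs of consecutive days above the seasonal mean. This is only an illustrative reading: the `margin` and `min_days` thresholds are hypothetical placeholders (the talk notes that the local definition depends on the regional climate), and the sample temperatures are invented.

```python
def heat_wave_periods(tmax, seasonal_mean, margin=5.0, min_days=3):
    """Return (start, end) index pairs (end inclusive) of runs where
    tmax > seasonal_mean + margin for at least min_days consecutive days.
    margin and min_days are assumed values, not an official definition."""
    periods, start = [], None
    for i, (t, m) in enumerate(zip(tmax, seasonal_mean)):
        hot = t > m + margin
        if hot and start is None:
            start = i                      # a hot run begins
        elif not hot and start is not None:
            if i - start >= min_days:      # run was persistent enough
                periods.append((start, i - 1))
            start = None
    # Close a run that lasts until the end of the series.
    if start is not None and len(tmax) - start >= min_days:
        periods.append((start, len(tmax) - 1))
    return periods

# Hypothetical daily maximum temperatures and a flat seasonal mean.
tmax = [19.0, 21.0, 27.5, 28.0, 28.4, 22.0, 20.5]
seasonal_mean = [18.0] * 7
periods = heat_wave_periods(tmax, seasonal_mean)
print(periods)
```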

Severe weather refers to any dangerous meteorological phenomena with the potential to cause damage, serious social disruption, or loss of human life. [WMO]

Types of severe weather phenomena vary, depending on the latitude, altitude, topography, and atmospheric conditions.

Ref: http://en.wikipedia.org/wiki/Severe_weather

A 5-point road map to overcome the SM & weather complexities:

• Identify a 1-dimensional time flux of information from the SM world.

• Detect every local statistical linear association between this flux and a parametric (physical) space-time representation (a time-spatial grid of data).

• Map the significance in previously determined classes.

• Verify the patterns against observations.

• Confirm with semantics and text mining.

• Community analysis of SM streams to detect user filters.

Target and Products

Stakeholders: •forecasters

•institutional stakeholders

•EM communities

•media agents

Products: •DNKT, a semantic-based SM stream metric

•The areas where the association between the SM time vector (DNKT) and the coupled time-gridded data stack of weather parameters is significant = spatial associative map

•A semantic analysis of the Twitter stream:

- clustering

- word clouds

- SNA improvements

Detect areas where it's worth focusing attention, also for communication purposes.

Target

Data used

Heat wave period considered (7-13 April 2011)

Social - Twitter API, key-tagged (CALDO-AFA-SETE)

- 6069 tweets collected through the geosearch service for the Italian area.

- Retweets and replies included (full volume stream)

Climate & Weather (7-10 April 2011)

- Urban daily maximum T°

- Daily gridded data (lon 5-20 E, lat 35-50 N)

- WRF-ARW model daily T°max data (9 km grid box)

Semantized Twitter stream metrics

DNKT shows time coherence with daily profiles of areal averaged temperature

*Critical days identified as the numerical neighbours of the peaks (7-8-9 April): social "heat days"

DNKT - "daily number of key-tagged tweets"
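A minimal sketch of how a DNKT series could be computed, assuming each tweet is available as a (timestamp, text) pair. The key-tags are the ones named in the slides; the function name `dnkt`, the tokenization rule and the sample tweets are assumptions made for illustration.

```python
import re
from collections import Counter
from datetime import datetime

KEY_TAGS = ("caldo", "afa", "sete")  # key-tags named in the slides

def dnkt(tweets, key_tags=KEY_TAGS):
    """Return {date: number of tweets that contain at least one key-tag}.
    tweets is an iterable of (datetime, text) pairs (assumed layout)."""
    counts = Counter()
    for created_at, text in tweets:
        # Lowercase word tokens; "#caldo" is tokenized as "caldo".
        words = set(re.findall(r"\w+", text.lower()))
        if words.intersection(key_tags):
            counts[created_at.date()] += 1
    return dict(sorted(counts.items()))

# Hypothetical sample records.
sample = [
    (datetime(2011, 4, 8, 14, 5), "Oggi troppo caldo a Milano"),
    (datetime(2011, 4, 8, 16, 30), "che #afa oggi"),
    (datetime(2011, 4, 9, 9, 0), "sole e caldo"),
    (datetime(2011, 4, 9, 10, 0), "piove a Firenze"),
]
series = dnkt(sample)
for day, n in series.items():
    print(day, n)
```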


The associative map as a tool

Semantic-based social stream in 1D time space (DNKT)

Weather informative layers in 2D space × time

Statistically based linear association verifier, by pixel

Geographic Associative Map (2D space)

Impacted areas in evidence
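The per-pixel idea can be sketched as follows: for every grid cell, correlate its daily weather series with the DNKT vector and flag the cell when the association is strong. This is a hedged illustration, not the authors' code: the nested-list grid layout, the function names and the `r_crit` threshold are assumptions, and `r_crit` is a placeholder to be replaced by the critical value appropriate to the number of days and the chosen significance classes.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def associative_map(dnkt, stack, r_crit=0.75):
    """stack[t][i][j] is the weather value on day t at pixel (i, j).
    Returns a 2D grid of (r, flagged) tuples, one per pixel.
    Only positive association is flagged; r_crit is a hypothetical threshold."""
    rows, cols = len(stack[0]), len(stack[0][0])
    result = []
    for i in range(rows):
        row = []
        for j in range(cols):
            series = [day[i][j] for day in stack]   # time series of one pixel
            r = pearson(dnkt, series)
            row.append((r, r >= r_crit))
        result.append(row)
    return result

# Hypothetical data: 5 days, a 1 x 2 grid of daily maximum temperature.
dnkt_vector = [300, 900, 1200, 1100, 500]
stack = [[[20.0, 18.0]], [[25.0, 18.0]], [[29.0, 17.0]],
         [[28.0, 18.0]], [[22.0, 19.0]]]
amap = associative_map(dnkt_vector, stack)
print([[flag for _, flag in row] for row in amap])
```

In the sample, the first pixel warms and cools together with the tweet volume and is flagged, while the second one is not.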

It's an X-ray weather map: the Twitter stream is used as a "contrast medium" to visualize impacted areas.

This is not a Twitter map

Associative maps fit well

Urban maximum T° over 28 °C on 9 April

where & when

Semantic analytics

- Corpus creation: DNKT classification by heat-wave peak days:

heat days (7-8-9 April), no-heat days (6-10-11 April).

- Terms Word Clouds (minimum word frequency > 30)

heat days vs no-heat days

Clustering associated terms

Term frequency ranking comparison

- Hashtag Word Clouds heat days vs no-heat days

R Stat 15.2. Packages used: tm (Feinerer and Hornik, 2012) & wordcloud (Fellows, 2012)
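The term-ranking step can be illustrated with a small sketch. The talk's own analysis used the R packages tm and wordcloud; the Python version below is a swapped-in stand-in, and it assumes that a term's percentage is the share of tweets containing it, which the slides do not state. The function name, parameters and sample tweets are hypothetical, and no stop-word removal is applied.

```python
import re
from collections import Counter

KEY_TAGS = ("caldo", "afa", "sete")  # excluded, as in the word clouds

def term_ranking(tweets, exclude=KEY_TAGS, min_freq=1, top=3):
    """Rank terms by the share of tweets that contain them.
    Returns a list of (term, share) pairs, most frequent first."""
    counts = Counter()
    for text in tweets:
        # Count each term once per tweet, skipping the key-tags.
        terms = set(re.findall(r"\w+", text.lower())) - set(exclude)
        counts.update(terms)
    n = len(tweets)
    # Sort by descending count, then alphabetically for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(t, c / n) for t, c in ordered if c >= min_freq][:top]

# Hypothetical "heat days" corpus.
heat_days = [
    "oggi troppo caldo",
    "troppo caldo oggi a milano",
    "oggi sole e afa",
    "che sete",
]
ranking = term_ranking(heat_days, top=2)
print(ranking)
```

Running the same function on the "no-heat days" corpus and comparing the two rankings reproduces the kind of side-by-side comparison used in the talk.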

WordClouds of terms (excluding key-tags caldo-afa-sete): heat days vs no-heat days

Terms association clustering: heat days vs no-heat days

heat days: "heat" is THE conversation topic

no-heat days: "heat" is marginal to the conversation topic

Terms frequency ranking

Rank   no-heat days (N=2608)   heat days (N=3461)
1°     oggi 6.0%               oggi 8.3%
2°     sole 5.5%               troppo 7.7%
3°     troppo 4.1%             sole 5.9%

Hashtags WordClouds: heat days vs no-heat days

On peak days:

- widening of the lexical base during "heat critical days" - heat as a conversation topic

- the ranking of terms (e.g. adjectives such as "troppo"!) is useful to detect changes in communication during climatic stress

- geographic names appear in the term and hashtag word sets ("#milano"!).

This fits with recent research on "social media contribution to situational awareness during emergencies".

Semantic results

Snow events

SNA of key-tagged social media streams

Begin 10 feb 2013

End 11 feb 2013

The graph metrics of SM streams are dynamic.

The graph centrality analysis of media and institutions may provide very useful parameters for weather event follow-up.
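As an illustration of a graph centrality measure, the sketch below computes normalized in-degree centrality on a directed graph of users. It assumes that an edge means "source retweets or mentions target"; that reading, the function name and the account names are hypothetical, and the talk does not say which centrality measure was used.

```python
from collections import defaultdict

def in_degree_centrality(edges):
    """edges: iterable of (source_user, target_user) pairs.
    Returns {user: distinct incoming neighbours / (n - 1)}."""
    nodes = set()
    incoming = defaultdict(set)
    for src, dst in edges:
        nodes.update((src, dst))
        if src != dst:                 # ignore self-loops
            incoming[dst].add(src)
    n = len(nodes)
    if n < 2:
        return {u: 0.0 for u in nodes}
    return {u: len(incoming.get(u, ())) / (n - 1) for u in nodes}

# Hypothetical retweet/mention edges in a key-tagged stream.
edges = [
    ("user_a", "media_1"),
    ("user_b", "media_1"),
    ("user_c", "media_1"),
    ("user_a", "institution_1"),
]
centrality = in_degree_centrality(edges)
for user in sorted(centrality, key=centrality.get, reverse=True):
    print(user, centrality[user])
```

Because the metrics are dynamic, the same function could be applied to the edges of successive time windows (for example, hour by hour during the event) to follow how the central accounts change.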

#firenzeneve

Conclusions

- Methodology for a social "x-ray" of a weather event: a semantized SM stream can act as a "contrast medium" to understand the social impact of severe weather events.

- The methodology of social geosensing mining is able to map severe weather impacts and overcome the weaknesses within social media data.

- Weather is a key emergency context where it's worth working on community resilience, also with the help of insightful social content.

Reproducible R code

socialsensing Code & Data

https://github.com/alfcrisci/socialgeosensing.git

Wiki Recipes in

https://github.com/alfcrisci/socialgeosensing/wiki

#thanks

Contacts: Alfonso Crisci & Valentina Grasso

mail: a.crisci@ibimet.cnr.it

mail: grasso@lamma.rete.toscana.it

Twitter: @alf_crisci @valenitna

www.lamma.rete.toscana.it

www.ibimet.cnr.it

#nowquestions (slowly please, if possible)