Towards the operational exploitation of social data and crowdsourcing in Copernicus EMS

Copernicus EMS Annual Workshop

Ispra, June 21st 2017

E2mC | www.e2mc-project.eu

OBJECTIVE

E2mC Objective - Main

Demonstrating

the technical and operational feasibility of the integration of social media analysis and crowdsourcing within the full Copernicus EMS (Mapping and Early Warning)

Developing

a) a prototype of the innovative Copernicus Witness, a new EMS Service Component conceived to exploit social media analysis and crowdsourcing capabilities, aimed at improving information management for emergency responders

b) a prototype of the Social Crisis Map, a new Product type of the EMS Portfolio

VISION

E2mC Vision – The Problem

Copernicus EMS - Mapping

…timeliness…not yet fully achieved…

…most of the delay is concentrated in the availability of the first usable post-event satellite image…

…map production throughput…

…quality…

…broadening the current scope of the EMS service…

…enlarging the Copernicus EMS range of use…

E2mC vision – The Problem

Copernicus EMS – Early Warning

…forecasting skills…

…forecasts are still subject to relatively high uncertainties…

…forecast verification is important to understand the strengths and weaknesses of the system…

…event based verification…

…impact based forecast…

E2mC vision – The Solution

Video feature extraction

Today the handling of “unconventional” data (e.g. Twitter, news, …) is fully manual. E2mC will provide tools to automate this process as far as possible.

News and social media crawling, filtering and geolocalization
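The crawl, filter and geolocate chain named above can be illustrated with a deliberately naive baseline. This is only a sketch under stated assumptions, not the E2mC implementation: the keyword set, the two-entry gazetteer (with approximate coordinates) and the names `Post`, `is_relevant`, `geolocate` and `pipeline` are all hypothetical, and the crawling step is replaced by an in-memory list of posts.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical keyword list and gazetteer, for illustration only.
KEYWORDS = {"earthquake", "flood", "terremoto", "alluvione"}
GAZETTEER = {
    "amatrice": (42.63, 13.29),  # approximate coordinates
    "norcia": (42.79, 13.09),    # approximate coordinates
}


@dataclass
class Post:
    text: str
    lat: Optional[float] = None
    lon: Optional[float] = None


def tokens(text: str) -> List[str]:
    """Lower-case word tokens; hashtags lose their leading '#'."""
    return re.findall(r"\w+", text.lower())


def is_relevant(post: Post) -> bool:
    """Naive filter: keep a post if it contains any keyword."""
    return any(t in KEYWORDS for t in tokens(post.text))


def geolocate(post: Post) -> Post:
    """Keep native coordinates; otherwise fall back to a place-name lookup."""
    if post.lat is not None and post.lon is not None:
        return post
    for t in tokens(post.text):
        if t in GAZETTEER:
            lat, lon = GAZETTEER[t]
            return Post(post.text, lat, lon)
    return post  # still without a position


def pipeline(posts: List[Post]) -> List[Post]:
    """Filter first, then try to attach a position to each kept post."""
    return [geolocate(p) for p in posts if is_relevant(p)]
```

Calling `pipeline([Post("Terremoto ad Amatrice"), Post("Buongiorno a tutti")])` keeps only the first post and attaches the gazetteer position to it. The state of the art analysis later in this deck reports that plain keyword filtering is not sufficient, so this is a baseline to improve on rather than a proposed design.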

E2mC vision – (Social) Media & Crowdsourcing

(Social) Media Crowdsourcing

…as a source of information…

…as a tool to gather and share information…

…as a platform to create, motivate and maintain crowdsourcing communities…

…as a resource multiplier…

…as a tool to tag (social) media contents…

E2mC Social&Crowd Platform

E2mC vision – The users’ role

TRADITIONAL SCHEMA

• Users: the beneficiaries of a Geo-Information Service; they receive products/services and integrate them into their own workflows and working practices, receiving a tangible benefit from them.

• Service Providers: operators of a value chain based on satellite/in situ data acquisition and post-processing.

• Data generators: satellite images or in situ data provided as an input to the value chain.

E2mC VISION

• Users can be “distributed Service Providers” (a crowd to be motivated), “Data Generators as in situ sensors” (witnesses of a disaster consciously or unconsciously contributing data through social media and exploiting the Social&Crowd Platform to exchange information) and “Users of the service”.

Source: ClouT project, http://clout-project.eu/

E2mC vision – Why is it URGENT?

…because it is already happening…

…and more…

IDEA

E2mC Idea – Copernicus EMS evolved version

Key elements of the Copernicus Witness

• perfectly fitting in the current operational Copernicus EMS

• a Service Component that simultaneously serves all the different components of the running Copernicus EMS, as it takes into account the needs, requirements and constraints expressed by both the Mapping and the Early Warning components

• directly available to Copernicus EMS (Authorized) Users as a standalone service, ready to be further integrated into specific custom and downstream applications or to be used independently for ad hoc and tailored social media analysis or crowdsourcing campaigns

E2mC Idea – Copernicus Witness architecture

E2mC Idea – Technical Challenges

Social Media Monitoring & Analysis

• Multilingual (semantics, ontologies), multidisaster, worldwide

• Access to heterogeneous data streams

• Selection of relevant data streams

• Big data problem for systematic monitoring

• Georeferencing strategies

• Identify relevant and independent contents

• Assess quality and reliability

Federated Crowdsourcing

• Heterogeneous platforms, with different triggering mechanisms and organizational models

• Data exchange, interoperability

• Exploit crowdsourcing in an SLA-ruled environment with specific time constraints

• Crowd building

• Thematic geospatial and crisis-oriented vs general-purpose platforms

• Exploit crowdsourcing also for enriching social media analysis?

ON THE MOVE

E2mC Organization

Study Logic

Project Kick-off: November 29th-30th, 2016

Project duration: 27 months

E2mC – The Project Team

Project Coordinator

Project Partners

Users

E2mC – Current status & Achievements

• WP1 Analysis of requirements, feasibility and Use Cases definition:
  • State of the art analysis & review: COMPLETED
  • Recommendation for S&C platform prototype design: COMPLETED
  • Analysis of Copernicus Witness integration issues: COMPLETED
  • Scenario and use cases definitions: ONGOING (v1, subject to revision)
  • Copernicus Witness Service Specifications: ONGOING (v1, subject to revision)

• WP2 Design&Development of the Social&Crowd Platform:
  • Planning: COMPLETED
  • Architecture design: ONGOING

E2mC State of the art analysis – Earthquake example (Italy 2016)

Number of Tweets in Italian about the 2016 central Italy earthquake on August 24, 2016.

First general analysis

Language: 86.6% are in Italian.

Geographical coordinates: only 533 Tweets have geographical coordinates, approximately 0.35%

Links: 51.23% of Tweets have a link. Links are mainly to Facebook (20,328 links) and News sites (15,989 links).

Media: 26,914 Tweets contain images. However, images lose all metadata, including their geographical coordinates, when stored by Twitter inside the repository that is used to respond to data retrieval queries from the official Twitter APIs.

Geolocated Tweets with links: 53.54% of the 533 geolocated Tweets.

Instagram: 846 Instagram pictures were linked by a Tweet; 68.16% of them do not have geolocation and 23.67% of them do.

Figures: example of approximate geolocation; example of accurate geolocation.
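The percentages above can be turned into approximate absolute counts. The short calculation below is a sketch: the total number of Tweets is not stated in the slides, so it is inferred here from "533 Tweets, approximately 0.35%", and everything derived from that total is a rough estimate (roughly 150,000 Tweets), not a reported figure. All variable names are illustrative.

```python
# Figures quoted in the slides.
GEOTAGGED = 533                 # Tweets with geographical coordinates
GEOTAGGED_SHARE = 0.0035        # "approximately 0.35%"
LINK_SHARE = 0.5123             # Tweets containing a link
IMAGE_TWEETS = 26_914           # Tweets containing images
GEOTAGGED_LINK_SHARE = 0.5354   # share of geolocated Tweets with links
INSTAGRAM_PICTURES = 846
INSTAGRAM_NO_GEO = 0.6816
INSTAGRAM_GEO = 0.2367

# Assumption: total inferred from a rounded percentage, so only indicative.
implied_total = round(GEOTAGGED / GEOTAGGED_SHARE)

tweets_with_links = round(LINK_SHARE * implied_total)
image_share = IMAGE_TWEETS / implied_total
geotagged_with_links = round(GEOTAGGED_LINK_SHARE * GEOTAGGED)

instagram_without_geo = round(INSTAGRAM_NO_GEO * INSTAGRAM_PICTURES)
instagram_with_geo = round(INSTAGRAM_GEO * INSTAGRAM_PICTURES)
# The two Instagram shares do not add up to 100%; the rest is not described.
instagram_unaccounted_share = 1 - INSTAGRAM_NO_GEO - INSTAGRAM_GEO

print(f"implied total Tweets:      ~{implied_total:,}")
print(f"Tweets with links:         ~{tweets_with_links:,}")
print(f"share with images:         ~{image_share:.1%}")
print(f"geolocated with links:     ~{geotagged_with_links}")
print(f"Instagram without geo:     ~{instagram_without_geo}")
print(f"Instagram with geo:        ~{instagram_with_geo}")
print(f"Instagram share not given: {instagram_unaccounted_share:.2%}")
```

Two points stand out from the arithmetic: only a few hundred of the collected Tweets carry usable coordinates, and the two Instagram shares leave about 8% of the pictures unexplained in the source.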

Findings

Processing audio information is not a priority in prototyping the Witness component.

Twitter accounts of official sources need to be monitored to gather links to videos.

The number of images linked by Tweets is very high and a large number of these images are not useful for mapping purposes. Only a small percentage of images is geolocated.

A naive keyword-based approach to filtering does not seem to be effective; there is a need for a more sophisticated semantic approach to text analysis, as well as image processing techniques.

Even with manual processing, a large percentage of images remains without geolocation. This indicates a possible important application of crowdsourcing.

Given that we consider images and videos, YouTube and Instagram should also be considered. Interesting content on Instagram and YouTube can be accessed through Twitter, by following links.

E2mC State of the art analysis – Flood example (UK 2014)

Key findings

Social Media sources. Among the most widespread Social Media (e.g. Twitter, Facebook, Instagram, YouTube, Pinterest, etc.), Twitter proved to be the most relevant channel for obtaining information, not only as a primary source provided by the users, but also as an indirect way to access other social data content.

Type of information: text, photo, video. Data containing panoramic videos or photos of the affected areas are more relevant than messages containing only textual information.

User: private, public, institutional. The most relevant information about the crisis, in terms of infrastructure damage, flooded areas, etc., is provided for the most part by public entities or institutions.

Geolocation. Less than 1% of the analysed Tweets were geotagged and, in all cases, the position of the Tweet was outside the crisis event area.

Information redundancy. Redundant information is another factor to handle, so that only the content that is reliable and relevant for the event is rapidly retrieved and filtered.
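One simple way to reduce the redundancy mentioned above is to collapse retweets and copies that differ only in their links. The sketch below assumes that exact matching on a normalized text is good enough for a first pass; the function names `normalize` and `deduplicate` are hypothetical, the sample messages are invented, and this is not the method used in the project.

```python
import re
from typing import List


def normalize(text: str) -> str:
    """Reduce a message to a comparable form."""
    text = text.lower()
    text = re.sub(r"^rt\s+@\w+:\s*", "", text)  # drop a leading retweet prefix
    text = re.sub(r"https?://\S+", "", text)    # drop links (often shortened)
    text = re.sub(r"[^\w\s]", " ", text)        # drop punctuation, '#', '@'
    return " ".join(text.split())               # collapse whitespace


def deduplicate(messages: List[str]) -> List[str]:
    """Keep the first message seen for each normalized form."""
    seen = set()
    unique = []
    for message in messages:
        key = normalize(message)
        if key and key not in seen:
            seen.add(key)
            unique.append(message)
    return unique


# Invented sample messages, for illustration only.
sample = [
    "Flooding on High Street, water still rising http://example.org/a",
    "RT @someone: Flooding on High Street, water still rising http://example.org/b",
    "River level near the bridge is falling",
]
print(deduplicate(sample))
```

Exact matching misses paraphrases and translations, so a production filter would likely need fuzzy or semantic similarity on top of this.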

E2mC – High level functions

E2mC - Use Cases

List of Use Cases

Use Case | Domain
UC1 - Global alert from keywords and external triggers | Social media exploitation
UC2 - Event confirmation and keywords/hashtags identification | Social media exploitation
UC3 - Topic identification and geospatial hotspot analysis | Social media exploitation
UC4 - Automatic extraction of relevant information from social media data streams | Social media exploitation
UC5 - Translation of keyword dictionaries into local languages | Crowdsourcing resource exploitation
UC6 - Alert on relevant hashtags | Crowdsourcing resource exploitation
UC7 - Social Media content enrichment | Crowdsourcing resource exploitation
UC8 - Simple mapping, change detection and feature identification | Crowdsourcing resource exploitation
UC9 - Request of geo-report from in field people | Crowdsourcing resource exploitation
UC10 - Request of local sources of information | Crowdsourcing resource exploitation
UC11 - Federation of other crowdsourcing platforms | Crowdsourcing resource exploitation
UC12 - Delivery of Witness Service to end-user | Other

Use Case Template

(Source: www.ogcnetwork.net/system/files/use_case_template.doc )

Use Cases are defined in order to provide a common ground of understanding to the E2mC Developers Team and to the E2mC Copernicus Ops Team. Use Cases will also be shared with Stakeholders to gather comments and feedback.
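As an illustration of what the geospatial hotspot analysis in UC3 could involve, the sketch below bins geolocated posts into a regular latitude/longitude grid and reports the cells with the most posts. This is an assumed, simplified technique (fixed-size grid counting), not the approach specified by the project; `grid_cell`, `hotspots`, the 0.1 degree cell size and the threshold are all hypothetical choices.

```python
import math
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees
Cell = Tuple[int, int]       # integer grid indices


def grid_cell(lat: float, lon: float, cell_deg: float = 0.1) -> Cell:
    """Map a position to the index of the grid cell that contains it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def hotspots(points: List[Point], cell_deg: float = 0.1,
             min_count: int = 3) -> List[Tuple[Cell, int]]:
    """Return cells holding at least min_count posts, busiest first."""
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)
    return [(cell, n) for cell, n in counts.most_common() if n >= min_count]


# Invented positions: three close together, one far away.
points = [(42.63, 13.29), (42.64, 13.28), (42.66, 13.25), (41.90, 12.50)]
print(hotspots(points))
```

A fixed grid is easy to reason about but splits clusters that straddle a cell border; density-based clustering would be a natural alternative if that matters.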

E2mC - End-to-end (E2E) Scenarios

E2E Scenario ID | Name | Domain
EW1 | Flood Alert - Europe | Copernicus EMS Early Warning
RM1 | Earthquake damage assessment in Italy | Copernicus EMS Rapid Mapping
RR1 | Exposure assessment and asset mapping – Europe | Copernicus EMS Risk&Recovery
RR2 | Detailed damage assessment and reconstruction monitoring | Copernicus EMS Risk&Recovery
EM1 | International Charter “Space and major Disasters” | Other emergency mapping initiatives
CO1 | Social media and crowdsourcing for Copernicus Land Service | Other Copernicus Service
LU1 | Use of the Copernicus Witness service by local Civil Protection authorities to set up their own crowd of “professional” volunteers | Other local Civil Protection users
LU2 | Use of Witness as stand alone service | Other local Civil Protection users

E2mC - End-to-end (E2E) Scenarios

Copernicus EMS RM - Earthquake damage assessment

E2mC Stakeholders – Categories (WHO)

• EC/Copernicus Institutions and decision makers

• Copernicus Entrusted Entities

• DG ECHO and JRC On Duty Teams

• Copernicus EMS Authorized Users

• Researchers in Universities or Research Centers, Research Networks or Associations

• Copernicus EMS Service Providers

• Crowd Communities

• Industry and SMEs

• Satellite based emergency mapping initiatives

• Working groups focused on the emergency mapping best practices

• Industrial sectorial associations (e.g. EARSC)

• ESA

E2mC Stakeholders – Contribution (WHAT)

Beyond dissemination actions, stakeholders will be engaged also in:

• reviewing/validating the proposed requirements analysis;

• identifying main integration issues;

• reviewing use cases;

• reviewing/validating the Copernicus Witness design.

Expected contribution

Stakeholders categories | Reviewing/validating the proposed requirements analysis | Identifying main integration issues | Defining use cases | Reviewing/validating the Copernicus Witness design
EC/Copernicus Institutions and decision makers (Who) | DG GROW – Copernicus Unit | - | - | DG GROW – Copernicus Unit, Copernicus User Forum (UF)
EC/Copernicus Institutions and decision makers (What) | providing an overall Copernicus perspective | - | - | presentation of Copernicus Witness during a UF session
Copernicus Entrusted Entities (Who) | EEA, ECMWF | EEA, ECMWF | EEA, ECMWF | EEA, ECMWF
Copernicus Entrusted Entities (What) | LAND, CAMS and C3S specific needs | LAND, CAMS and C3S operational workflows | LAND, CAMS and C3S potential use cases revision | revision of the Copernicus Witness service model
EMS Authorized Users (Who) | ANPC (PT), DPC (IT), EA (UK), UN WFP, Campus Vesta | ANPC (PT), DPC (IT), EA (UK), UN WFP, Campus Vesta | ANPC (PT), DPC (IT), EA (UK), UN WFP, Campus Vesta | ANPC (PT), DPC (IT), EA (UK), UN WFP, Campus Vesta
EMS Authorized Users (What) | critical review from an operational Civil Protection and Humanitarian Aid perspective | integration into operational workflows and current working practices | use cases involving Civil Protection (multi-level) and Humanitarian Aid actors | critical review of the Copernicus Witness service and operational model
DG ECHO and JRC On Duty Teams (Who) | DG ECHO, JRC | DG ECHO, JRC | DG ECHO, JRC | DG ECHO, JRC
DG ECHO and JRC On Duty Teams (What) | Copernicus EMS overall service perspective | Copernicus EMS overall service perspective | Copernicus EMS overall service perspective | Copernicus EMS overall service perspective

E2mC – What’s next?

• Finalization of the Use Cases and of the Copernicus Witness Service Specifications (end of June 2017); they will be shared for comments/feedback

• Development of the first version of the Social&Crowd platform (November 2017); it will be used as a first tangible result to be tested in the first Demonstration Scenarios

• Preparation of the first E2mC User Workshop (Q1 2018)

Thank you