Towards the operational exploitation of social data and crowdsourcing in Copernicus EMS
Copernicus EMS Annual Workshop
Ispra, June 21st 2017
E2mC I www.e2mc-project.eu
E2mC Objective - Main
Demonstrating the technical and operational feasibility of integrating social media analysis and crowdsourcing within the full Copernicus EMS (Mapping and Early Warning)
Developing:
a) a prototype of the innovative Copernicus Witness, a new EMS Service Component conceived to exploit social media analysis and crowdsourcing capabilities to improve information management for emergency responders
b) a prototype of the Social Crisis Map, a new Product type of the EMS Portfolio
E2mC Vision – The Problem
Copernicus EMS - Mapping
…timeliness…not yet fully achieved…
…most of the delay is concentrated in the availability of the first usable post-event satellite image…
…map production throughput…
…quality…
…broadening the current scope of the EMS service…
…enlarging the Copernicus EMS range of use…
E2mC vision – The Problem
Copernicus EMS – Early Warning
…forecasting skills…
…forecasts are still subject to relatively high uncertainties…
…forecast verification is important to understand the strengths and weaknesses of the system…
…event based verification…
…impact based forecast…
E2mC vision – The Solution
Today the handling of “unconventional” data (e.g. Twitter, news, …) is fully manual. E2mC will provide tools to automate this process as much as possible:
• News and social media crawling, filtering and geolocation
• Video feature extraction
E2mC vision – (Social) Media & Crowdsourcing
(Social) Media Crowdsourcing
…as a source of information… …as a tool to gather and share information…
…as a platform to create, motivate and maintain crowdsourcing communities…
…as a resource multiplier… …as a tool to tag (social) media contents…
E2mC Social&Crowd Platform
E2mC vision – The users’ role
TRADITIONAL SCHEMA
• Users: the beneficiaries of a Geo-Information Service; they receive products/services and integrate them into their own workflows and working practices, gaining a tangible benefit from them.
• Service Providers: operators of a value chain based on satellite/in situ data acquisition and post-processing.
• Data generators: satellite images or in situ data provided as an input to the value chain.
E2mC VISION
• Users can be “distributed Service Providers” (a crowd to be motivated), “Data Generators as in situ sensors” (witnesses of a disaster consciously or unconsciously contributing data through social media and exploiting the Social&Crowd Platform to exchange information) and “Users of the service”.
Source: ClouT project, http://clout-project.eu/
E2mC vision – Why is it URGENT? …because it is already happening…
…and more…
E2mC Idea – Copernicus EMS evolved version
Key elements of the Copernicus Witness:
• It fits into the current operational Copernicus EMS.
• It is a Service Component that simultaneously serves all the components of the running Copernicus EMS, as it takes into account the needs, requirements and constraints expressed by both the Mapping and the Early Warning components.
• It is directly available to Copernicus EMS (Authorized) Users as a standalone service, ready to be further integrated into specific custom and downstream applications or to be used independently for ad hoc and tailored social media analysis or crowdsourcing campaigns.
E2mC Idea – Technical Challenges
Social Media Monitoring & Analysis
• Multilingual (semantics, ontologies), multi-disaster, worldwide
• Access to heterogeneous data streams
• Selection of relevant data streams
• Big data problem for systematic monitoring
• Georeferencing strategies
• Identify relevant and independent contents
• Assess quality and reliability
Federated Crowdsourcing
• Heterogeneous platforms, with different triggering mechanisms and organizational models
• Data exchange, interoperability
• Exploit crowdsourcing in an SLA-ruled environment with specific time constraints
• Crowd building
• Thematic geospatial and crisis-oriented vs general purpose platforms
• Exploit crowdsourcing also for enriching social media analysis?
E2mC Organization
Study Logic
Project Kick-off: November 29th-30th, 2016
Project duration: 27 months
E2mC – The Project Team
(Figure: logos of the Project Coordinator, the Project Partners and the Users)
E2mC – Current status & Achievements
• WP1 Analysis of requirements, feasibility and Use Cases definition:
  • State of the art analysis & review: COMPLETED
  • Recommendation for S&C platform prototype design: COMPLETED
  • Analysis of Copernicus Witness integration issues: COMPLETED
  • Scenario and use cases definitions: ONGOING (v1, subject to revision)
  • Copernicus Witness Service Specifications: ONGOING (v1, subject to revision)
• WP2 Design&Development of the Social&Crowd Platform:
  • Planning: COMPLETED
  • Architecture design: ONGOING
E2mC State of the art analysis – Earthquake example (Italy 2016)
(Figure: number of Tweets in Italian about the central Italy earthquake of August 24, 2016)
First general analysis
Language: 86.6% are in Italian.
Geographical coordinates: only 533 Tweets have geographical coordinates, approximately 0.35%
Links: 51.23% of Tweets have a link. Links are mainly to Facebook (20,328 links) and News sites (15,989 links).
Media: 26,914 contain images. However images lose all metadata, including their geographical coordinates, when stored by Twitter inside the repository that is used to respond to data retrieval queries from the official Twitter APIs.
Geolocated tweets with links: 53.54% out of the 533 geolocated tweets
Instagram: 846 Instagram pictures were linked by a tweet; 68.16% of them do not have geolocation and 23.67% of them have geolocation.
(Figures: example of approximate geolocation; example of accurate geolocation)
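The shares quoted above can be cross-checked with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not a result of the study: the total number of tweets is not stated on the slide and is inferred here from the rounded "approximately 0.35%" share, so every derived count is only indicative. The function names are illustrative.

```python
# Figures quoted on the slide.
GEOTAGGED_TWEETS = 533
GEOTAGGED_SHARE = 0.0035            # "approximately 0.35%"
GEOTAGGED_WITH_LINK_SHARE = 0.5354  # 53.54% of the geolocated tweets
INSTAGRAM_PICTURES = 846
INSTAGRAM_GEOLOCATED_SHARE = 0.2367


def implied_total(count: int, share: float) -> int:
    """Estimate the size of a whole set from a subset count and its share."""
    return round(count / share)


def implied_count(total: int, share: float) -> int:
    """Estimate a subset count from the whole set and the subset's share."""
    return round(total * share)


# Assumption: the total is inferred from a rounded percentage.
estimated_total_tweets = implied_total(GEOTAGGED_TWEETS, GEOTAGGED_SHARE)
geotagged_with_link = implied_count(GEOTAGGED_TWEETS, GEOTAGGED_WITH_LINK_SHARE)
instagram_geolocated = implied_count(INSTAGRAM_PICTURES, INSTAGRAM_GEOLOCATED_SHARE)

print(estimated_total_tweets, geotagged_with_link, instagram_geolocated)
```

Under these assumptions the data set holds roughly 152,000 tweets, of which about 285 are geotagged and carry a link, and about 200 of the linked Instagram pictures are geolocated.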
Findings
• Processing audio information is not a priority in prototyping the Witness component.
• Twitter accounts of official sources need to be monitored to gather links to videos.
• The number of images linked by tweets is very high, and a large number of these images are not useful for mapping purposes. Only a small percentage of images is geolocated.
• A naive keyword-based approach to filtering does not seem to be effective; a more sophisticated semantic approach to text analysis, as well as image processing techniques, is needed.
• Even with manual processing, a large percentage of images remains without geolocation. This indicates a possible important application of crowdsourcing.
• Given that images and videos are considered, YouTube and Instagram should also be considered. Interesting content on Instagram and YouTube can be accessed through Twitter by following links.
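The limits of naive keyword filtering, and the resulting need for crowdsourced geolocation, can be illustrated with a minimal sketch. This is not the E2mC implementation: the `Post` type, the function names, the keyword set and the sample posts are all hypothetical and invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    text: str
    coordinates: Optional[tuple[float, float]] = None  # (lat, lon) if geotagged


def matches_keywords(post: Post, keywords: set[str]) -> bool:
    """Naive filter: keep a post if any of its words is a keyword."""
    words = set(re.findall(r"\w+", post.text.lower()))
    return not words.isdisjoint(keywords)


def needs_crowd_geolocation(posts: list[Post], keywords: set[str]) -> list[Post]:
    """Posts that pass the keyword filter but carry no coordinates."""
    return [p for p in posts
            if matches_keywords(p, keywords) and p.coordinates is None]


# Hypothetical posts, invented for illustration only.
posts = [
    Post("Terremoto ad Amatrice, palazzi crollati", (42.63, 13.29)),
    Post("Forte scossa sentita anche a Roma"),
    Post("Il derby di ieri è stato un terremoto di emozioni"),  # false positive
    Post("Strada bloccata dalle macerie vicino alla piazza"),   # missed by the filter
]
keywords = {"terremoto", "scossa"}

for post in posts:
    print(matches_keywords(post, keywords), post.text)
```

In this toy sample the keyword filter accepts a post that only uses "terremoto" figuratively and rejects a relevant report about rubble blocking a road, while two of the accepted posts have no coordinates and would have to be geolocated by other means, for example by a crowd.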
E2mC State of the art analysis – Flood example (UK 2014)
Key findings
• Social media sources. Among the most widespread social media (e.g. Twitter, Facebook, Instagram, YouTube, Pinterest, etc.), Twitter proved to be the most relevant channel for obtaining information, not only as a primary source provided by users but also as an indirect way to access other social data content.
• Type of information: text, photo, video. Data containing panoramic videos or photos of the affected areas are more relevant than messages containing only textual information.
• User: private, public, institutional. The most relevant information about the crisis, in terms of infrastructure damage, flooded areas, etc., is provided largely by public entities or institutions.
• Geolocation. Less than 1% of the analysed Tweets were geotagged and, in all cases, the position of the Tweet was located outside the crisis event area.
• Information redundancy. Redundant information is another factor to handle in order to rapidly obtain and filter only the content that is reliable and relevant for the event.
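One simple way to reduce the redundancy mentioned in the last finding is to drop near-identical messages such as retweets. The sketch below uses Jaccard similarity on word sets; this technique is an assumption chosen for illustration, not the method adopted by E2mC, and the function names, the 0.8 threshold and the sample messages are hypothetical.

```python
import re


def tokens(text: str) -> frozenset[str]:
    """Lowercased word set, ignoring URLs, @mentions and the 'RT' marker."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return frozenset(w for w in re.findall(r"\w+", text) if w != "rt")


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b) if a or b else 1.0


def drop_near_duplicates(messages: list[str], threshold: float = 0.8) -> list[str]:
    """Keep the first message of each group of near-identical messages."""
    kept: list[tuple[str, frozenset[str]]] = []
    for message in messages:
        current = tokens(message)
        if all(jaccard(current, seen) < threshold for _, seen in kept):
            kept.append((message, current))
    return [message for message, _ in kept]


# Hypothetical messages, invented for illustration only.
messages = [
    "River Thames has burst its banks near Datchet http://example.org/a",
    "RT @someone: River Thames has burst its banks near Datchet http://example.org/b",
    "Flood warning issued for the Somerset Levels",
]

for message in drop_near_duplicates(messages):
    print(message)
```

The pairwise comparison is quadratic in the number of messages, so a systematic monitoring service would need a more scalable technique; the sketch only shows the idea of keeping one representative per group of redundant messages.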
E2mC - Use Cases
List of Use Cases
Use Case | Domain
UC1 - Global alert from keywords and external triggers | Social media exploitation
UC2 - Event confirmation and keywords/hashtags identification | Social media exploitation
UC3 - Topic identification and geospatial hotspot analysis | Social media exploitation
UC4 - Automatic extraction of relevant information from social media data streams | Social media exploitation
UC5 - Translation of keyword dictionaries into local languages | Crowdsourcing resource exploitation
UC6 - Alert on relevant hashtags | Crowdsourcing resource exploitation
UC7 - Social Media content enrichment | Crowdsourcing resource exploitation
UC8 - Simple mapping, change detection and feature identification | Crowdsourcing resource exploitation
UC9 - Request of geo-report from in field people | Crowdsourcing resource exploitation
UC10 - Request of local sources of information | Crowdsourcing resource exploitation
UC11 - Federation of other crowdsourcing platforms | Crowdsourcing resource exploitation
UC12 - Delivery of Witness Service to end-user | Other
Use Case Template
(Source: www.ogcnetwork.net/system/files/use_case_template.doc )
Use Cases are defined in order to provide a common ground of understanding to the E2mC Developers Team and to the E2mC Copernicus Ops Team. Use Cases will also be shared with Stakeholders to gather comments and feedback.
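As an illustration of what the geospatial hotspot analysis named in UC3 could involve, the sketch below counts geolocated posts per cell of a regular latitude/longitude grid and reports the cells that reach a minimum count. It is a minimal example under stated assumptions, not the E2mC design: the function names, the 0.1 degree cell size, the threshold and the sample points are hypothetical.

```python
import math
from collections import Counter


def grid_cell(lat: float, lon: float, cell_deg: float = 0.1) -> tuple[int, int]:
    """Index of the grid cell (row, column) containing the point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def hotspots(points: list[tuple[float, float]],
             cell_deg: float = 0.1,
             min_count: int = 2) -> dict[tuple[int, int], int]:
    """Cells that contain at least `min_count` points, with their counts."""
    counts = Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)
    return {cell: n for cell, n in counts.items() if n >= min_count}


# Hypothetical (lat, lon) positions of geolocated posts, invented for illustration.
points = [(42.63, 13.29), (42.65, 13.25), (42.66, 13.22), (41.95, 12.45)]

print(hotspots(points))
```

A fixed grid is the simplest possible choice; it ignores the varying ground size of a degree of longitude and splits clusters that straddle cell borders, which a real service would have to address.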
E2mC - End-to-end (E2E) Scenarios
E2E Scenario ID | Name | Domain
EW1 | Flood Alert - Europe | Copernicus EMS Early Warning
RM1 | Earthquake damage assessment in Italy | Copernicus EMS Rapid Mapping
RR1 | Exposure assessment and asset mapping – Europe | Copernicus EMS Risk&Recovery
RR2 | Detailed damage assessment and reconstruction monitoring | Copernicus EMS Risk&Recovery
EM1 | International Charter “Space and major Disasters” | Other emergency mapping initiatives
CO1 | Social media and crowdsourcing for Copernicus Land Service | Other Copernicus Service
LU1 | Use of the Copernicus Witness service by local Civil Protection authorities to set up their own crowd of “professional” volunteers | Other local Civil Protection users
LU2 | Use of Witness as stand alone service | Other local Civil Protection users
E2mC - End-to-end (E2E) Scenarios Copernicus EMS RM - Earthquake damage assessment
E2mC Stakeholders – Categories (WHO)
• EC/Copernicus Institutions and decision makers
• Copernicus Entrusted Entities
• DG ECHO and JRC On Duty Teams
• Copernicus EMS Authorized Users
• Researchers in Universities or Research Centers, Research Networks or Associations
• Copernicus EMS Service Providers
• Crowd Communities
• Industry and SMEs
• Satellite based emergency mapping initiatives
• Working groups focused on the emergency mapping best practices
• Industrial sectorial associations (e.g. EARSC)
• ESA
E2mC Stakeholders – Contribution (WHAT)
Beyond dissemination actions, stakeholders will also be engaged in:
• reviewing/validating the proposed requirements analysis;
• identifying main integration issues;
• reviewing use cases;
• reviewing/validating the Copernicus Witness design.
Expected contribution by stakeholder category

EC/Copernicus Institutions and decision makers
• Reviewing/validating the proposed requirements analysis. Who: DG GROW – Copernicus Unit. What: providing an overall Copernicus perspective
• Identifying main integration issues. Who: -. What: -
• Defining use cases. Who: -. What: -
• Reviewing/validating the Copernicus Witness design. Who: DG GROW – Copernicus Unit, Copernicus User Forum (UF). What: presentation of Copernicus Witness during a UF session

Copernicus Entrusted Entities
• Reviewing/validating the proposed requirements analysis. Who: EEA, ECMWF. What: LAND, CAMS and C3S specific needs
• Identifying main integration issues. Who: EEA, ECMWF. What: LAND, CAMS and C3S operational workflows
• Defining use cases. Who: EEA, ECMWF. What: LAND, CAMS and C3S potential use cases revision
• Reviewing/validating the Copernicus Witness design. Who: EEA, ECMWF. What: revision of the Copernicus Witness service model

EMS Authorized Users
• Reviewing/validating the proposed requirements analysis. Who: ANPC (PT), DPC (IT), EA (UK), UN WFP, Campus Vesta. What: critical review from an operational Civil Protection and Humanitarian Aid perspective
• Identifying main integration issues. Who: ANPC (PT), DPC (IT), EA (UK), UN WFP, Campus Vesta. What: integration into operational workflows and current working practices
• Defining use cases. Who: ANPC (PT), DPC (IT), EA (UK), UN WFP, Campus Vesta. What: use cases involving Civil Protection (multi-level) and Humanitarian Aid actors
• Reviewing/validating the Copernicus Witness design. Who: ANPC (PT), DPC (IT), EA (UK), UN WFP, Campus Vesta. What: critical review of the Copernicus Witness service and operational model

DG ECHO and JRC On Duty Teams
• Reviewing/validating the proposed requirements analysis. Who: DG ECHO, JRC. What: Copernicus EMS overall service perspective
• Identifying main integration issues. Who: DG ECHO, JRC. What: Copernicus EMS overall service perspective
• Defining use cases. Who: DG ECHO, JRC. What: Copernicus EMS overall service perspective
• Reviewing/validating the Copernicus Witness design. Who: DG ECHO, JRC. What: Copernicus EMS overall service perspective
E2mC – What’s next?
• Finalization of the Use Cases and of the Copernicus Witness Service Specifications (end of June 2017); they will be shared for comments/feedback
• Development of the first version of the Social&Crowd platform (November 2017); it will be used as a first tangible result to be tested in the first Demonstration Scenarios
• Preparation of the first E2mC User Workshop (Q1 2018)