Post on 25-Mar-2019
“Accelerating data-driven innovation in Europe”
Ana García Robles, Secretary General BDVA (@RoblesAG)Arne.J.Berre@sintef.no, Lead of BDVA TF6 Technical Priorities
ITU-T FG-DPM – European Commission - OASC WorkshopFebruary 19th, 2018
BDV Reference model / TF6 Technical Priorities and SRIA
BDV PPP Projects
The EU and Industry launched the Contractural Public Private Partnership (cPPP) on
Big Data Value in 2014-10
Big Data Value PPP: Investment
• “The European Commission and Europe's data industry have committed to invest €2.5 billion in a public-private partnership (PPP) that aims to strengthen the data sector and put Europe at the forefront of the global data race.”
• “The EU has earmarked over €500 million of investment over 5 years (2016-2020) from Horizon 2020”
• Private partners are expected to leverage this through sector investments of four times the cPPP budget (ie €2 billion)
194 Members
35 Large companies
63 SMEs
82 Research institutions
14 Others
Present in 28 countries
Industry-driven and fully self-
financed international non–for-profit
organisation under Belgian law
TF1: Programme
TF2: Impact
TF3: Community
TF4: Communication
TF5: Policy
& Societal
Policy & Societal
TF6: Technical
TF6-SG1: Data Management
TF6-SG2: Data Processing
Architectures
TF6-SG3: Data Analytics
TF6-SG4: Data Protection and Pseudonymisation
Mechanisms
TF6-SG5: Advanced Visualisation and User
Experience
TF6-SG6: Standardisation
TF7: Application
TF7-SG1: Emerging Application Areas
TF7-SG2: Telecom
TF7-SG3: Healthcare
TF7-SG4: Media
TF7-SG5: Earth observation & geospatial
TF7-SG6: Smart Manufacturing Industry
TF7-SG7: Mobility and Logistics
TF7-SG8: Smart Cities
TF8: Business
TF8-SG1: Data entrepreneurs
(SMEs and startups)
TF8-SG2: Transforming
traditional business (Large
Enterprise)
TF8-SG3: Observatory
on Data Business Models
TF9:
Skills and Education
TF9.SG1: Skill requirements
from European industries
TF9SG2: Analysis of
current curricula
related to data science
TF9.SG3: Liaison with
existing educational
projects
Stan
dar
dis
atio
n C
ross
Cu
ttin
g Li
nks
Standardisation Cross Cutting Links
TF6 Technical Priorities and Standardisation
Time series,
IoT
GeoSpatioTemp
MediaImageAudio
TextNLP.
Genom
WebGraphMeta
BDV ReferenceModel evolution (earlier version – summer 2017)
Structdata/
BI
Standards
Data Processing Architectures and WorkflowsBatch, Interactive, Streaming/Real-time
Data Visualisation and User Interaction1D, 2D, 3D, 4D, VR/AR
Data Analytics Descriptive, Diagnostic, Predictive, Prescriptive
Machine Learning and AI, Deep Learning, Statistics,
j
Data ManagementCollection, Preparation, Curation, Linking, Access – Data Market
DB types: SQL, NoSQL (Document, Key-Value, Coloum, Array,Graph, …)
Cloud and High Performance Computing (HPC)
Big D
ata Prio
rity Tech
Are
as
Applications/Solutions: Manufacturing, Health, Energy, Transport, BioEco, Media, Telco, Finance, EO, SE
Data Protection, Anonymisation, …
Big dataTypes &semantics
Things/Assets, Sensors and Actuators (Edge, Fog, IoT, CPS)
Co
mm
un
ication
and
Co
nn
ectivity, in
cl. 5G
Cyb
erSe
curity, R
isk, Trust, D
ata Privacy
De
velo
pm
en
t -En
gine
erin
g and
De
vOp
s
Marke
tplace
s, Ecosystem
s, and
Inn
ovatio
n su
pp
ort
Initial ISO JTC1/WG9 – fromNIST Big Data Reference Architecture
SG1-DM
SG3-AN
SG4-DP
with BDVA SGs Technical Priorities
SG2-AR
SG2-AR
SG1-DM
SG1-DM
SG5-VI
SG6-ST
SG6-ST
HPC
Latest ISO JTC1 WG9 Big Data Reference Architecture
ApplicationProviderLayer
ProcessingLayer
PlatformLayer
ResourceLayer
Multi-layerfunctions
IntegrationSecurity&Privacy
Manage-ment
Algorithms
Visualizations
WorkFlowEngine
AccessServices
Transformations
BatchFrameworks
InteractiveFrameworks
StreamingFrameworks
RelationalStorage
DocumentStorage
Key-valuestorageGraphStorage
Wide-columnarstorage
FileSystems
MessagingFrameworks Audit
Frameworks
AuthorizationFrameworks
AuthenticationFrameworks
MonitoringFrameworks
Provisioning/ConfigurationFrameworks
PackageManagementFrameworks
ResourceManagementFrameworks
IngestServices
AnonimizationFrameworks
StateManagementFrameworks
DataLifeCycleManagement
DP-AP
AP-PR
PR-PL
DC-AP
PL-RL
IN-FL
SP-FL
MG-FL
IN-SP SP-MG
IN-MG
MakeDataAvailable
Role:DataProvider Role:BigDataConsumer
ReceiveDataOutput
ResourceAbstractionandControl
PhysicalResources
Columnbasedstorage
Big Data StandardsWorkshop and ISOJTC 1 WG9 meeting, with BDVADublin, 15-22 August 2017
(See also forthcomingISO JTC SC42ArtificialIntelligence)
BDVA Reference Model vs ISO WG9 Big Data Reference Architecture
Updates/changes from the BDVA Reference Model will be submitted into the ISO process
ApplicationProviderLayer
ProcessingLayer
PlatformLayer
ResourceLayer
Multi-layerfunctions
IntegrationSecurity&Privacy
Manage-ment
Algorithms
Visualizations
WorkFlowEngine
AccessServices
Transformations
BatchFrameworks
InteractiveFrameworks
StreamingFrameworks
RelationalStorage
DocumentStorage
Key-valuestorageGraphStorage
Wide-columnarstorage
FileSystems
MessagingFrameworks Audit
Frameworks
AuthorizationFrameworks
AuthenticationFrameworks
MonitoringFrameworks
Provisioning/ConfigurationFrameworks
PackageManagementFrameworks
ResourceManagementFrameworks
IngestServices
AnonimizationFrameworks
StateManagementFrameworks
DataLifeCycleManagement
DP-AP
AP-PR
PR-PL
DC-AP
PL-RL
IN-FL
SP-FL
MG-FL
IN-SP SP-MG
IN-MG
MakeDataAvailable
Role:DataProvider Role:BigDataConsumer
ReceiveDataOutput
ResourceAbstractionandControl
PhysicalResources
Columnbasedstorage
BioTechAgriFood
TransportMobility
Health,Ageing
Industrial& Personal
DataSharing
Platforms+
Research/Urban data
Manufacturing
AI,Cognitive
Computing&
AnalyticsProcessingPlatforms
EnergySmartCities
EarthObs, GeO
Telecomothers
…Retail,
FInance
National levelEU level Worldwide level
ResearchInnovation
Standards
Regulation/policies
Market
Society
AIOTIETP4H
PCECSO 5G
Other Sectorial PPPs (and associations
EIT Digital
Robotics
IDS
Platform Industrie
4.0
NSB
Platform Industrie 4.0
MyData
ISO ITU-T W3C IEEEETSI CEN/CENELEC
RDA IICNational Initiatives
in BD and AI
EC/EU
Big Data Value ecosystem
BDVA Collaborations
Sectorial PPPs: EFFRA
Officially established
Ongoing collaboration with outcomes
Ongoing collaboration (initiated)
Identified collaboration
IoTForum
Industrial& AgriData
SharingPlatforms
+ Research/
Urban,City data
AI,Cognitive
Computing&
AnalyticsProcessingPlatforms
BigData typesSemanticInteroperability
BDVA
BDVA
AIOTI AIOTI
5GOASC
AIOTI AIOTIETP4HPC
BDVA
BDVABDVA
SPARC ISOCENCENELECDINW3COASISOGCGEOSSITU-T…
ITU-T
ECSO
Some mapping of collaborations in BDV Reference Model
BDV Reference Model - AIOTI WG03 HLA Mapping
Data types,Semanticinteroperability
IoT layer
Applicationlayer
Networklayer
AppEntity
IoTEntity
Networks
FarmingFood
TransportMobility
Health,Ageing
Industrial& Personal
DataSharing
Platforms+
Research/Urban data
SmartManufacturing
AI,Cognitive
Computing&
AnalyticsProcessingPlatforms
SmartEnergy
SmartCities
SmartBuildings
Enviro,Water,Air
others…
Telecom
IoT layer
Applicationlayer
Networklayer
SecurityPrivacyProtectionlayer
IoTEntity
AppEntity
AppEntity
IoTEntity
Project –
use case template
(from ISO JTC1 WG9,
used for BDVA and HPC use-cases)
19/02/2018
Could be harmonised with use case templatesfrom other groups ?
Big Data Value post-2020: Main pillars
Trust in Critical Decision Making
Data platforms
Dig
ital
ski
lls a
nd
Kn
ow
Ho
w
Private data
Public data
Personal data
Ind
ust
rial
co
llab
ora
tio
ns
mo
del
s
Bu
sin
ess
an
d le
gal f
ram
ewo
rks
Next Generation Digital Infrastructure BD/AIOTI/HPC/5G/
CyberSecurity
Citizen-Centric Data SocietyKey societal challenges
Core BDV cPPP
Core BDV cPPP in collaboration
AI Platforms
Data Incubators / Data Platforms
(IA)
Topic:ICT-14-WP2016-2017
N=15
Lighthouse Projects / Large scale pilots / test-beds
(IA)
Topic: ICT-15-2016-2017
N=4
Technical Projects (RIA)
Topic:ICT-18-2016ICT-16-2017
N=12
Collab. & Support Actions (CSA)Topic: ICT-17-2016-2017
N=2
BDV PPP Implementation projects (H2020-ICT-2016-2017) (33 projects)
BDV PPP implementation website: www.big-data-value.eu
For data collection and integration
We deliver tools to:
● Collect data from engaged citizens(i-Log app)
● Integrate data into a data lake from disparate sources (Pentaho Kettle, Karma integration, URI-fication, RDF-ization)
● Data collection from crowd-workers
Human factor on urban mobility Data Value Chain
Main data sources
Two main data providers: Municipality of Trento and TomTomOpen data from Trento Municipality and Region (i.e. city–bus stops, bike racks…-, )
Stream data (often managed by 3rd parties) from some urban traffic sensors and data streams in Trento (i.e.
parking occupation, bus positions…).
Historical data about mobility in the area from Trento and TomTom (i.e. several years of TomTom devices in
the area, weather conditions,
Crowdsourced data
Using standards and
existing tools as much
as possible.
● FIWARE and OASC-
compliant
● Big Data Europe
● CKAN, RDF
Architecture view
Human factor on urban mobility Data Value Chain
•Getting data from citizens, i.e. by completing data infrastructure (locating bike racks) and measuring occupancy of parking groups that are not
Data Acquisition
• By helping in the training phase: i.e. citizens with an extra incentive can provide data with more frequent and precise labels, that can be used as training sets for machine algorithms
• Confirming the predictions of transportation mode made by the machine, opening the door to improving the quality of the machine prediction on the go
Data Analysis
• Curate data on mobility infrastructure. i.e. in Trento, data about disabled parking spots is incomplete, needs to be verified and curated
Data Curation
• Detection of inconsistencies: i.e. when trying to merge several data sources
• Entity resolution, disambiguation, missing data: i.e. a mobility point appears at one position in the council's data, in a slightly different in Open Street Map, not at all on TomTom's map
Data Linking and Integration
Main Contribution
Towards a fully distributed architecture which edge and
cloud computing resources are coordinated
Develop a novel software architecture capable of
1. Distribute and coordinate big-data workloads with
real-time requirements along the compute
continuum
2. Combine data-in-motion and data-at-rest analytics
3. Increase productivity in terms of programmability,
portability/scalability and (guaranteed) performance
31
Data Analytics
Software Development Framework
Integrate technologies from different computing domains into a single development framework1. Combine of SoA data-in-motion and data-at-
rest analytics solutions
2. Apply high-performance techniques to distribute computation across edge/cloud resources
3. Apply of timing analysis techniques
4. Use the most advanced parallel heterogeneous embedded platforms
32
Computation Distribution
Edge Software Components
Cloud Software Components
Low Level Resource Managers
Smart System (smart city use-case)
Compute Continuum
Data Analytics Platform
Smart City Use-case
Deployed on a real urban area in the city of
Modena (Italy) with several highly-connected
cars
1. Intelligent traffic management, acting on traffic
lights and smart road signals
2. Advanced driving assistance systems
33
Automotive Smart Area
Prototype car intended to be used
V2X connectivity + sensors
• Data analytics and real-time requirements • 11.4 GB/s of heterogeneous data-sets considering 3
cars and a 1 km2 sensing area
Further Information:
BDVA: http://www.bdva.eu/
Secretarygeneral@core.bdva.eu
info@core.bdva.eu
@BDVA_PPP #Bigdatavalue #Bigdata