The network of ICT companies of the Regions and Autonomous Provinces, serving the national system
Association of Companies for Technological Innovation in the Regions
BIG DATA FOR HEALTHCARE GOVERNANCE
The Cineca HPC point of view. Sanzio Bassini, Director HPC
Bologna, 19 April 2018
BIG DATA
In a global, open, competitive, digital world, the driving element is hyper-connectivity (IoT), and therefore big data.
Big data is not a commodity but, above all, a way of dealing with contemporary complexity.
The enabling capability of big data is the link between science, technology transfer and know-how, competence and skills, global social science and socio-economic challenges.
It provides solutions and answers both to global challenges and to the individual requirements of millions of users and customers.
BIG DATA FOR RESEARCH
CERN LHC: more than 200 PB of data repository; expected to increase 10x in the next cycle
More than 100 PB of data repository
Today 2 PB per week; 250 PB of data repository
Why invest in HPC and big data?
HPC is at the core of major advances and innovations in the digital age.
Strategic value for science and socio-economic challenges
HPC enables breakthrough science: disease treatment; new therapies; brain; climate; chemistry; new materials; cosmology; astrophysics; high-energy physics; environment; transportation; earthquakes; etc.
Strategic value for industry
Market potential: new products, design and production cycles, decision processes, reducing costs, resource efficiency, etc.
National security and defense
Complex encryption technologies, terrorism, forensics, cyber attacks, nuclear simulations
Application areas pictured: brain, drug discovery, climate, cosmology, defense, materials, security, aerospace
Cineca Roadmap
2009: IBM SP6 (Power6); 100 TF; 1 MW
2012: Fermi, IBM BGQ (PowerA2); 2 PF; 1 MW
2016: Marconi, Lenovo (Xeon + KNL); 11 PF + 9 PF; 3.5 MW; 15 PB of storage, >30 PB of repository
2019: to be defined (scalar + vector and/or accelerator); 50 PF + 10 PF; ~4 MW; procurement in progress
2021/2022: to be defined (scalar + vector and/or accelerator); >250 PF + >20 PF; ~8 MW; pre-exascale project
Growth factors marked between generations: 20x; 5x; 5x, 1x (latency cores); 5x, 2x (latency cores); paradigm change; 10x (in total)
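The growth factors on the roadmap can be checked against the peak figures with a few lines of arithmetic. The sketch below is only an illustration: it assumes that the first number of each "A + B" pair is the main partition and the second is the "latency cores" partition, which the transcript does not state explicitly. The 2021/2022 values are lower bounds on the slide (">250 PF", ">20 PF"), so the last ratios are lower bounds too.

```python
# Peak performance in petaflops, transcribed from the roadmap.
# Tuple layout (assumed): (year, main partition, latency-core partition).
# None means the slide gives a single figure with no split.
roadmap = [
    ("2009", 0.1, None),
    ("2012", 2.0, None),
    ("2016", 11.0, 9.0),
    ("2019", 50.0, 10.0),
    ("2021/2022", 250.0, 20.0),  # lower bounds on the slide
]


def ratio(previous, current):
    """Return current / previous, or None if either value is missing."""
    if previous is None or current is None:
        return None
    return current / previous


# Generation-to-generation growth for each partition.
steps = []
for (y0, main0, lat0), (y1, main1, lat1) in zip(roadmap, roadmap[1:]):
    steps.append((y0, y1, ratio(main0, main1), ratio(lat0, lat1)))

# Total 2016 peak (both partitions) relative to the single 2012 figure.
total_2012_to_2016 = (11.0 + 9.0) / 2.0

for y0, y1, main_x, lat_x in steps:
    lat_text = "n/a" if lat_x is None else f"{lat_x:.1f}x"
    print(f"{y0} -> {y1}: main {main_x:.1f}x, latency cores {lat_text}")
print(f"2012 -> 2016 in total: {total_2012_to_2016:.0f}x")
```

Under these assumptions the computed ratios (20x, 5.5x, about 4.5x, 5x for the main partition; about 1.1x and 2x for the latency cores; 10x in total from 2012 to 2016) line up with the rounded factors listed on the slide.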
HPC and Verticals
Value delivered to users
[Diagram: layered stack from hardware to value; labels recovered from the slide]
HW infrastructure (clusters, storage, network, devices); procurement, codesign
Big data management; Accelerated computing (scale out performance); AI; 3D Viz; High throughput / Cloud computing; Connectors to other infrastructures
Applications integration: Meteo, Astro, Matter & Materials, CFD, Precision Medicine, etc.
Data analytics; Numerical simulation
Solutions and resources: HPC, HTC, Cloud, Viz, etc.
Support, Consultancy, Training
Access & production; Applications; People / competence
Technologies for applications; Solutions for services
Big Data success stories
Virtual exhibitor for Terrae Motus, at the Reggia di Caserta
Data integration of unstructured data from different sources
About 100,000 interactions (Italian and English language only)
Social networks analyzed:
Facebook: 10,320 users (27%)
Twitter: 22,780 users (59%)
Tripadvisor: 5,260 reviewers (14%)
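The percentages quoted for each network can be reproduced from the user counts. This is a minimal sketch; the `FB` label for the first count is an assumption, since the transcript gives that count without a network name and only the language tables mention FB alongside TW and TA.

```python
# User counts per social network, as given on the slide.
# "FB" is an assumed label: the first count is unlabeled in the transcript.
users = {"FB": 10_320, "TW": 22_780, "TA": 5_260}

total_users = sum(users.values())

# Share of each network, rounded to whole percent as on the slide.
shares = {name: round(100 * count / total_users) for name, count in users.items()}

for name, pct in shares.items():
    print(f"{name}: {users[name]} users ({pct}%)")
print(f"Total: {total_users} users")
```

The rounded shares come out as 27%, 59% and 14%, matching the figures on the slide.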
Data extraction keywords for social networks: Reggia di Caserta, #Felicori, @ReggiaCe, #MauroFelicori, #FiduciaCaserta
Language of interactions
      English   Italian
FB    2%        98%
TW    14%       86%
TA    10%       90%
Language of users
      English   Italian
FB    2%        98%
TW    20%       80%
TA    10%       90%
Sentiment analysis
How to interpret all this information?
Sentiment analysis: a data mining application to social networks.
Sentiment is calculated for Italian and English language interactions.
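The slides do not say which sentiment method was used. Purely as an illustration of the idea, the sketch below scores a text against a small word list per language; the lexicons, the `sentiment` function and the example sentences are all hypothetical and far smaller than anything used in practice.

```python
import re

# Hypothetical toy lexicons (+1 positive, -1 negative), one per language.
# These are illustrative only and are not taken from the presentation.
LEXICONS = {
    "en": {"beautiful": 1, "wonderful": 1, "great": 1,
           "dirty": -1, "closed": -1, "disappointing": -1},
    "it": {"bellissima": 1, "meravigliosa": 1, "splendido": 1,
           "sporco": -1, "chiuso": -1, "deludente": -1},
}


def sentiment(text, lang):
    """Classify text as positive, negative or neutral by summing word scores."""
    words = re.findall(r"\w+", text.lower())
    score = sum(LEXICONS[lang].get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("The palace is wonderful", "en"))
print(sentiment("Parco sporco e museo chiuso", "it"))
```

Keeping one lexicon per language mirrors the slide's point that sentiment is computed separately for Italian and English interactions.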
Supporting ISTAT
The "standard survey" estimates that, in Italy, 14.98% of enterprises offer web ordering (confidence interval: 13.88% to 16.09%). Web scraping methods give an estimate of 15.56%, which falls inside that confidence interval.
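The comparison between the two estimates is a simple interval check. The sketch below uses only the numbers quoted on the slide; the variable names are illustrative, and the confidence level of the interval is not stated in the source, so none is assumed.

```python
# Figures quoted on the slide, in percent of enterprises offering web ordering.
survey_estimate = 14.98
ci_low, ci_high = 13.88, 16.09
scraping_estimate = 15.56

# Does the web scraping estimate fall inside the survey's confidence interval?
inside_interval = ci_low <= scraping_estimate <= ci_high

# Gap between the two estimates and half-width of the interval,
# rounded to avoid floating-point noise.
difference = round(scraping_estimate - survey_estimate, 2)
half_width = round((ci_high - ci_low) / 2, 3)

print(f"Inside interval: {inside_interval}")
print(f"Difference: {difference} percentage points")
print(f"Interval half-width: {half_width} percentage points")
```

The gap between the two estimates (0.58 percentage points) is roughly half the interval's half-width (about 1.1 points), which is what the slide means by the scraping estimate being inside the interval.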
Services for health and wellbeing
CINECA infrastructure: HPC, data intensive, data management, remote visualization
Services for bioinformatics
Software tools: quality control; alignment; conversion utilities; variant callers; annotation; differential expression; assembling; peak finders; metagenomics
A large number of traditional and emerging applications for NGS processing: alignment, variant callers, assembling
Big data technopole
ECMWF
• Climate change and services
• Welfare, health and aging
• Production and digital transformation
• Cultural heritage, humanities and society
• Sustainable cities
• Security, cyber security and artificial intelligence
• Education and skills
Fundamental analytics and applications. Scientific, industrial and societal challenges.
The regional ecosystem
Integration of CINECA and INFN CNAF infrastructure to provide services to:
• Institutional basic and applied research
• Enabling services for public administrations
• Proof of concept and innovation for private organizations and industries
DATA
Integrated Research Data Infrastructure
Network
Digital single market
• Scale out performance
• Towards exascale
• Hosting member of PRACE, Eudat, Human Brain, Human Technopole…
• #1 in Europe in the TOP500
• Hyper scaling
• Tier1 WLCG
• Hosting member of Indigo Cloud, SKA…
• #1 in Europe in the LHC WLCG
• 70% of the national big data
• Up to 2 terabit/s
• Common software stack
Implementation model
HPC & Big Data
DATA Infrastructure
HTC & Big Data
Network & transport layer
Services layer
Research: open access; scientific merit; national & international
Private & public sectors: proof of concept; open innovation; partnership agreements