Taming the Big Data Beast - Together

Netherlands eScience Center
ICT Synergy Hub, Amsterdam

Taming the Big Data Beast - Together
New Year's meeting of the Kennisalliantie, Delft, 31 January 2013

Prof. dr. Jacob de Vlieg ¹ ²
1. CEO & Scientific Director of Netherlands eScience Center, NWO-SURF
2. Head Computational Design & Discovery, CMBI, Radboud University Medical Center, Nijmegen, Netherlands

description

Kennisalliantie New Year's reception, 31 January 2013. Prof. dr. Jacob de Vlieg, CEO and scientific director of the Netherlands eScience Center (NLeSC): “Taming the Big Data Beast Together”

Transcript of Taming the Big Data Beast - Together

Page 2: Taming the Big Data Beast - Together

Agenda

• Big Data in Science: Challenges & Opportunities

– Top Sector ICT Roadmap theme: “Data, Data, Data”

• Netherlands eScience Center (NLeSC)

– Expert centre for Big Data Research

• Joint NWO-NLeSC “Big Data” project call

– Public-private partnerships

Page 6: Taming the Big Data Beast - Together

Data are the lifeblood of modern science and the digital economy

Managing, analyzing, linking & re-using data to create business value and/or scientific breakthroughs, e.g.

– Social media data to influence consumer choices
– Sensor network data, e.g. sensor-enabled smart dikes
– Imaging & biobanking data in health care, e.g. diagnostics, medicine
– And many more opportunities

Big Data: a complex concept

– 4Vs: Volume, Variety, Velocity, Verification

Big Data inextricably connected to eScience/HPC

– ICT top sector roadmap: e-Science is about intelligent infrastructure to model and/or to access big data

Page 8: Taming the Big Data Beast - Together

Key eScience challenges in Big Data research

– Cross-type data integration
– Data-driven & multi-model simulations
– Visualization & analytics
– High performance computing: connected computers & fast networks

– Stimulate a culture of knowledge sharing: no silos; data stewardship
– Rationalization of ICT landscapes; interoperability & industry data standards
– Training & education

Page 9: Taming the Big Data Beast - Together

Science itself is changing … We need to change with it …

Neelie Kroes in “Giving Europe’s Scientists the Tools to Deliver”

Two key words: multidisciplinary research & data-driven discovery

Page 12: Taming the Big Data Beast - Together

eScience and the mystery of the empty labs

• Much more data per experiment (miniaturization and/or automation)
• External data sources & outsourcing
• Experimental design, data management & analytics (eScience)

Page 13: Taming the Big Data Beast - Together

Quantified Self Movement -> Big Data

Use apps and wearable sensors to monitor daily life, e.g. hours of sleep, food consumed, exercise taken, etc.

Quantified Self = Big Data + Mobile + Sensors + Visualization + Gamification

Page 14: Taming the Big Data Beast - Together

eScience Hero

• Big Data

• Pattern recognition

• Machine learning

• Social Media

Andy Grove (ex-CEO Intel)

Fights for medical innovation; Parkinson’s disease

Page 16: Taming the Big Data Beast - Together

Voice algorithms spot Parkinson's disease: data-driven diagnostics

• Machine learning algorithms that analyse voice recordings to detect Parkinson's symptoms early on (Little et al. @ Media Lab, MIT)

• Social media: looking for volunteers to contribute to the database to improve pattern recognition

Social networking health sites: patient-driven data collection

• 23andMe
• PatientsLikeMe.com
• And so on

Big Data V = Verification: privacy, compliance, etc.

Page 17: Taming the Big Data Beast - Together

'Data Scientist' is now the hottest job title in Silicon Valley…

Tim O'Reilly, founder of O'Reilly Media; supporter of the free software and open source movements

McKinsey projected that the US needs 140,000 to 190,000 more workers with “deep analytical expertise”

Page 18: Taming the Big Data Beast - Together

Netherlands eScience Center

Netherlands organization for scientific research:

Principal Dutch body for ICT innovation for research

NL-eSC SURF Science park, Amsterdam; SARA, EGI

Networked innovation model

Bridge:

• Science & advanced ICT
• Industry & academic research
• Training & education

New ways to do research made possible because of Big Data/eScience

Page 19: Taming the Big Data Beast - Together

NLeSC portfolio divided in themes

• Sustainability & Environment
– Climate
– Water management
– Energy
– Ecology

• Chemistry & Materials
– Chemistry

• Humanities & Social Sciences
– Humanities
– Social Sciences

• Life Sciences
– Green Genetics
– Translational Research IT
– Foods
– Cognition/Neuroscience

• eScience Methodology & ‘Big Data’
– eScience Methodology
– Astronomy

Page 20: Taming the Big Data Beast - Together

Can scientists from digital humanities help food researchers?

Digital Humanities: BiographyNED

Project Leader: Guus Schreiber

Will improve the current version of the Biography Portal by incorporating analytical tools to show interconnections, trends, geographical maps and timelines.

Food Research: Food Specific Ontologies for Food Focused Text Mining

Project Leader: Wynand Alkema

Addresses the absence of domain-specific structured vocabularies, which limits the use of data mining & knowledge management methods in food research.

Page 21: Taming the Big Data Beast - Together

eScience & Big Data: providing leads for new food applications

Page 22: Taming the Big Data Beast - Together

NLeSC eScience engineers: Scientists bridging research and advanced ICT

• Deliver sustainable solutions for data-driven research
• Work both at center and on site

Page 23: Taming the Big Data Beast - Together

NLeSC eScience Engineers work both at center and on site:

• Exchange of eScience expertise
• Re-use of proven eScience (technology hopping)
• Career development & training

Collaborative Innovation Network

Taming the Data Beast Together

SMEs, etc.

Page 24: Taming the Big Data Beast - Together

Grand scientific challenges lead to innovative eScience & Big Data research

• eScience to allow an unprecedented level of detail (large-scale distributed computing)
• State-of-the-art visualization techniques to analyze hundreds of terabytes of output
• Re-use of proven eScience concepts in new areas (e.g. the water sector)

Prof. Henk Dijkstra, Univ. of Utrecht, NLeSC Integrator Climate

eSalsa NLeSC project: data-driven simulations & advanced visualization to understand climate change

Dr. Jason Maassen, eScience Engineer, NLeSC

Page 25: Taming the Big Data Beast - Together

The number of data-driven start-ups is growing, particularly when it comes to social media.

Taming the Big Data Beast

Page 26: Taming the Big Data Beast - Together

Development of a high performance Twitter analysis platform

Hadoop – MapReduce architecture @ a large SARA computer cluster

Smart search & analysis software

Goal is to ask “Big Data” research questions e.g.

• Ability to analyze microblogging data produced over years
• Time-dependent analysis
• Real-time sentiment analysis
• And so on…

Prof. Antal van den Bosch, NLeSC Integrator Humanities, Radboud University Nijmegen

Dr. Erik Tjong Kim Sang, eScience Engineer, NLeSC
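
To make the MapReduce idea behind such a Twitter platform concrete, here is a minimal sketch in plain Python. It is not the project's code and does not use the Hadoop API: the tweet record layout, the toy word lists and all function names are assumptions made for illustration only. It shows the three phases (map, shuffle, reduce) used to compute an average sentiment score per day.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical toy lexicon; a real platform would use a trained model.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}

# Assumed record layout: (ISO date "YYYY-MM-DD", tweet text).
Tweet = Tuple[str, str]


def map_tweet(tweet: Tweet) -> Iterator[Tuple[str, int]]:
    """Map step: emit one (day, score) pair per tweet."""
    day, text = tweet
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    yield day, score


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group values by key, as a framework does between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_day(day: str, scores: List[int]) -> Tuple[str, float]:
    """Reduce step: average the sentiment scores for one day."""
    return day, sum(scores) / len(scores)


def daily_sentiment(tweets: Iterable[Tweet]) -> Dict[str, float]:
    """Run map, shuffle and reduce in sequence on a single machine."""
    pairs = (pair for tweet in tweets for pair in map_tweet(tweet))
    return dict(reduce_day(day, scores) for day, scores in shuffle(pairs).items())
```

On a cluster, the map and reduce functions would run in parallel over partitions of the data, which is what makes questions spanning years of microblogging data tractable.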

Page 28: Taming the Big Data Beast - Together

Cyber-common: a facility for 21st century data-driven research and multidisciplinary team work

SURF-SARA-NLeSC

To link minds and eScience

The key to scientific questions yet unasked!

Page 29: Taming the Big Data Beast - Together

Joint NWO-NLeSC “data sciences” call

• Focus on stimulating public-private partnerships

• Three instruments:

– Industrial Partnership Programme (IPP)
– Technology Areas (TA)
– Knowledge Innovation Mapping SMEs (KIEM MKB)

Rosemarie van der Veen-Oei (NLeSC) [email protected] T 070 3440 851

Mark Kas (NWO) [email protected] T 070 3440 811, M 06 205 93 207

www.nlesc.nl Netherlands eScience Center

Page 30: Taming the Big Data Beast - Together

Thank you

www.esciencecenter.nl