Optimizing discovery in the big science era · 2015-03-12 · Optimizing discovery in the big...
Netherlands eScience Center
Optimizing discovery in the big science era
www.esciencecenter.nl
Prof. Dr. Wilco Hazeleger
The world around us
• Science and society are intimately connected
• Science becomes increasingly problem-driven
• Science increasingly inter-, multi-, trans-disciplinary
The Big Science era
Mission
Enabling digitally enhanced research through efficient use of scientific software, data, and e-infrastructure
Application Domains: Life Sciences & eHealth, Environment & Sustainability, Humanities & Social Sciences, Physical World & Beyond
e-Infrastructure: Computing, Networking, Storage & Visualization
Organisation
Founding organisations (since 2011).
NWO – Netherlands Organisation for Scientific Research (2.7 M€ p.a.)
SURF – Dutch higher education and research partnership for ICT (2.7 M€ p.a.)
Collaborative projects between NLeSC, academic partners and industry, which include our digital scientists; cash and in kind.
NLeSC research program on generic eScience concepts and tools.
NLeSC priority domains (demand-driven from science)
I. Environment & Sustainability
Climate, ecology, energy, logistics,
water management, agriculture & food
II. Life Sciences & eHealth
Next generation sequencing,
biobanking, molecules & man
III. Humanities & Social Sciences
SMART cities, text analysis, eBusiness,
creative technologies
IV. Physics and beyond
Astronomy, high-energy physics,
advanced materials, engineering
& manufacturing
NLeSC eScience competences applied in science
Big data analytics: Statistics, machine learning, visualisation, text mining
Optimized data handling: Database optimization, structured & unstructured data, real-time data
Efficient computing: Distributed & accelerated computing, efficient algorithms
eScience Research Engineers = Digital Scientists
– broadly oriented scientists at the interface of research and IT
– collaborating with domain researchers to implement eScience concepts and tools
– mostly PhDs with domain knowledge and IT skills
– involved in projects, funded in cash and in kind
eScience Technology Platform
Core of NLeSC expertise; promotes exchange
and re-use of best practices
• Repository
– compute kernels, interfaces, libraries, tools, and scientific
workflows
• Knowledge base
– professional coding standards, coding styles, unit and
integration testing, and documentation
• Expertise center & meeting place
Collaborative Project Examples
Astronomy
Project Leader: Marco de Vos, ASTRON
Neuroimaging
Project Leader: Paul Tiesinga, Univ. of Nijmegen
eChemistry
Project Leader: Lars Ridder, Univ. of Wageningen
eScience Engineer: Marijn Sanders
Climatology
Project Leader: Henk Dijkstra, Univ. Utrecht
eScience Engineer: Jason Maassen
eEcology
Project Leader: Willem Bouten, Univ. Amsterdam
eFood Research
Project Leader: Wynand Alkema, Univ. Nijmegen
Life Sciences
Project Leader: Jan Willem Boiten, CTMM
Water Management
Project Leader: Prof. Nick van de Giesen, TU Delft
eHumanities
Project Leader: Guus Schreiber, Free Univ. Amsterdam
Green Genetics
Project Leader: Bernard de Geus, TTi Green Genetics
Collaborative Project Examples
Massive Point Clouds
Project Leader: Peter van Oosterom, Delft University of Technology
Sim-City
Project Leader: Peter Sloot, UVA
SPuDisc
Project Leader: Maarten de Rijke, Univ Amsterdam
Summer in the City
Project Leader: Bert Holtslag, Univ. of Wageningen
eVisualization
Project Leader: Edwin Valentijn, Univ. Groningen
TwiNL
Project Leader: Antal van den Bosch, Univ. of Nijmegen
ODEX4All
Project Leader: Barend Mons, Leiden Univ.
eSiBayes
Project Leader: Willem Bouten, Univ. Amsterdam
AMUSE
Project Leader: Simon Portegies Zwart, Leiden Univ.
Via Appia
Project Leader: Henk Scholten, VU Amsterdam
e-Ecology
NLeSC and UVA (Prof. W. Bouten)
Annotation tool and learning
Acceleration data to behavior
Machine learning
– Labeled train set
– Trained model
Schema from Natural Language Processing with Python, by Steven
Bird, Ewan Klein and Edward Loper, Copyright © 2009
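The "labeled train set → trained model" step for turning acceleration data into behavior can be sketched roughly as below. This is only an illustration under stated assumptions: the slides do not say which classifier or features the project uses, so a nearest-centroid classifier over per-axis mean and standard deviation is swapped in here, and the behavior labels ("resting", "flying") and all sample values are invented.

```python
from statistics import mean, pstdev

def window_features(window):
    """Summarise one window of (x, y, z) accelerometer samples as six numbers."""
    axes = list(zip(*window))  # three tuples: all x, all y, all z
    return [mean(a) for a in axes] + [pstdev(a) for a in axes]

def train(labeled_windows):
    """Build the 'trained model': one centroid feature vector per label."""
    by_label = {}
    for window, label in labeled_windows:
        by_label.setdefault(label, []).append(window_features(window))
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in by_label.items()}

def predict(model, window):
    """Assign the label whose centroid is closest in feature space."""
    feats = window_features(window)

    def dist2(centroid):
        return sum((f - c) ** 2 for f, c in zip(feats, centroid))

    return min(model, key=lambda label: dist2(model[label]))

# Hypothetical labeled train set, as an annotation tool might produce it.
train_set = [
    ([(0.0, 0.0, 1.0), (0.01, 0.0, 1.0), (0.0, 0.01, 0.99)], "resting"),
    ([(0.0, 0.0, 1.0), (0.0, 0.0, 1.01), (0.0, 0.0, 1.0)], "resting"),
    ([(0.5, -0.4, 1.6), (-0.6, 0.5, 0.3), (0.4, -0.5, 1.7)], "flying"),
    ([(-0.5, 0.4, 0.4), (0.6, -0.5, 1.5), (-0.4, 0.6, 0.5)], "flying"),
]
model = train(train_set)

# An unlabeled window with large swings on every axis.
new_window = [(0.45, -0.5, 1.5), (-0.5, 0.45, 0.4), (0.5, -0.4, 1.6)]
predicted = predict(model, new_window)
print(predicted)
```

In practice the annotation tool would supply many more labeled windows, and a stronger classifier would likely replace the centroid rule; the train/predict split stays the same.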
e-Food
NLeSC and Prof. Alkema (Radboud Univ Nijmegen) and Dr Tops (VU and WUR)
Literature sources
Taste ontology, e.g. sweet, sour, bitter, umami, salty, ropiness, TASR1
Ingredient ontology, e.g. mannitol, sucrose, sorbitol, alpha-terpineol, 4-methylpentanoic acid, ethyl propionate, flavonoid, caffeine
tag tag
Calculate
Compound ontology profiles
store
Classifying compounds according to taste
~500 terms
~40.000 terms
Derived from ChEBI
A number of known food proteins
24 million scientific abstracts
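The tag, calculate and store steps of the e-Food pipeline might look something like the sketch below. It assumes, without confirmation from the slides, that a compound's ontology profile is a count of how often each taste term co-occurs with the compound in the same abstract. The tiny vocabularies are taken from the examples above, the four abstracts are invented, and only single-token terms are matched.

```python
import re
from collections import Counter, defaultdict

# Small excerpts of the two ontologies (the real ones have far more terms).
TASTE_TERMS = {"sweet", "sour", "bitter", "umami", "salty"}
INGREDIENT_TERMS = {"sucrose", "caffeine", "mannitol", "sorbitol"}

def tag(text, vocabulary):
    """Return the vocabulary terms that occur as whole tokens in the text."""
    words = set(re.findall(r"[a-z0-9\-]+", text.lower()))
    return words & vocabulary

def build_profiles(abstracts):
    """Count taste terms co-occurring with each ingredient, per abstract."""
    profiles = defaultdict(Counter)
    for text in abstracts:
        tastes = tag(text, TASTE_TERMS)
        for ingredient in tag(text, INGREDIENT_TERMS):
            profiles[ingredient].update(tastes)
    return profiles

def classify(profiles, ingredient):
    """Pick the most frequent co-occurring taste, or None if there is none."""
    profile = profiles.get(ingredient)
    if not profile:
        return None
    return profile.most_common(1)[0][0]

# Invented stand-ins for the scientific abstracts.
abstracts = [
    "Sucrose is perceived as sweet by most subjects.",
    "Caffeine contributes a bitter note to coffee.",
    "Sucrose masks the bitter taste of caffeine, leaving a sweet impression.",
    "Caffeine remains bitter at low concentrations.",
]
profiles = build_profiles(abstracts)
print(classify(profiles, "sucrose"), classify(profiles, "caffeine"))
```

Raw co-occurrence counts are a deliberately simple choice; a real pipeline would probably normalise for term frequency and handle multi-word terms and synonyms.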
Point clouds
NLeSC with TU-Delft
Point clouds
• Set of data points in some coordinate system
• In 3D coordinate system, points defined by X, Y, Z
• Possible to have more attributes. Ex: Color (R, G, B)
• NL surface
• 640 billion points
• 6 – 10 points per m²
• 12 attributes
• 20 bytes/point
• 60000 files
Actual Height Model of the Netherlands (AHN2):
Massive point clouds for eSciences
LAS 11.64 TB
LAZ 1 TB
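The storage figures can be cross-checked from the point count and record size given above. The short calculation below suggests that the LAS figure of 11.64 TB is expressed in binary terabytes (2^40 bytes); that reading is an inference from the arithmetic, not something the slides state, and the compression ratio assumes the LAZ figure uses the same unit.

```python
POINTS = 640 * 10**9        # 640 billion points
BYTES_PER_POINT = 20        # record size from the slide

raw_bytes = POINTS * BYTES_PER_POINT
raw_tb = raw_bytes / 10**12     # decimal terabytes
raw_tib = raw_bytes / 2**40     # binary terabytes (TiB)

# Approximate LAS-to-LAZ compression ratio, taking both slide figures at face value.
laz_ratio = 11.64 / 1.0

print(f"{raw_tb:.2f} TB decimal, {raw_tib:.2f} TiB binary, ratio ~{laz_ratio:.1f}:1")
```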
• Loading
• Organization
• Indexing
• Clustering
• Blocking
• Compressing
• Querying
• Parallel processing
• Level of detail / Data pyramid
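Blocking, indexing and querying from the list above can be illustrated with a small in-memory sketch. It is not the project's implementation: points are grouped into square grid cells (blocks), each block is keyed by a Morton (Z-order) code so that nearby cells tend to get nearby keys, and a box query visits only the overlapping blocks. The Morton key, the cell size and the sample points are all assumptions made for illustration, and coordinates are assumed to be non-negative.

```python
from collections import defaultdict

def interleave_bits(ix, iy, bits=16):
    """Morton (Z-order) code: interleave the bits of two cell indices."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (2 * b)
        code |= ((iy >> b) & 1) << (2 * b + 1)
    return code

def build_blocks(points, cell_size):
    """Blocking: group (x, y, z) points by grid cell, keyed by Morton code."""
    blocks = defaultdict(list)
    for p in points:
        ix, iy = int(p[0] // cell_size), int(p[1] // cell_size)
        blocks[interleave_bits(ix, iy)].append(p)
    return blocks

def query_box(blocks, cell_size, xmin, ymin, xmax, ymax):
    """Querying: scan only blocks overlapping the box, then filter points."""
    hits = []
    for ix in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for iy in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            for p in blocks.get(interleave_bits(ix, iy), []):
                if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax:
                    hits.append(p)
    return hits

# Invented sample points (x, y, z) in metres.
points = [(1.0, 1.0, 0.5), (2.5, 3.5, 1.2), (12.0, 4.0, 0.8), (25.0, 25.0, 3.0)]
blocks = build_blocks(points, cell_size=10.0)
found = query_box(blocks, 10.0, 0.0, 0.0, 15.0, 5.0)
print(len(blocks), sorted(found))
```

Storing blocks sorted by their Morton key is one way to get the clustering the list mentions, since spatially close blocks then tend to sit close together on disk.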
Point cloud databases
e-Watercycle
NLeSC and TU-Delft, Utrecht Univ (Prof. N. vd Giesen)
Enabling digitally enhanced research
through efficient use of scientific
software, data and e-infrastructure
• Deals with data, data, data…and computing
• Domain-overarching solutions need cross-discipline expertise, well-defined interfaces, and standardization
– eScience technology platform: software & expertise
– Application in domain sciences
Thank you