PRACE – The Scientific Case for HPC in Europe
ALL RIGHTS RESERVED. This report contains material protected under International and Federal Copyright Laws and Treaties. Any unauthorized reprint or use of this material is prohibited. No part of this report may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system without express written permission from the author / publisher.
The copyrights of the content and images contained within this report remain with their original owner(s). The PRACE name and the PRACE logo are © PRACE.
All diagrams and most photographs are original to this book. See copyright notice for individual copyrights.
Printed by Bishops Printers Ltd, Spa House, Walton Road, Portsmouth, Hampshire, PO6 1TR, United Kingdom, Telephone +44 239 2334 900
Published by Insight Publishers Ltd, 12-13 King Square, Bristol, BS2 8JH, United Kingdom, Telephone +44 117 2033 120, www.ipl.eu.com
Project Manager: Ellen Haggan
Book designer: Mike Stafford
Additional designers: Thomas Hunt
Layout assistant: Simon Browne
Proofreader: Becky Freeman
Print manager: Paul Rumbold
The PRACE Research Infrastructure is established as an international non-profit association with its seat in Brussels and is named “Partnership for Advanced Computing in Europe aisbl”. It has 24 member countries (June 2012) whose representative organisations are creating a pan-European supercomputing infrastructure, providing access to computing and data management resources and services for large-scale scientific and engineering applications at the highest performance level.
PRACE, October 2012
ISBN 978-0-9574348-0-6
www.prace-ri.eu
This work has received funding from the European Community’s
Seventh Framework Programme (FP7/2007-2013) in the PRACE-1IP Project under grant agreement n° RI-261557.
Lead Author:
Martyn Guest – Cardiff University, UK
Panel Chairs:
Giovanni Aloisio – University of Salento and ENES-CMCC, Italy
Stefan Blügel – Forschungszentrum Jülich, Germany
Modesto Orozco – Institute for Research in Biomedicine, Spain
Philippe Ricoux – TOTAL, France
Andreas Schäfer – University of Regensburg, Germany
Scientific Case Management Group:
Richard Kenway – PRACE Scientific Steering Committee (Chair)
Turlough Downes – PRACE User Forum (Chair)
Thomas Lippert – PRACE-1IP Project Coordinator
Maria Ramalho – Acting Managing Director of PRACE aisbl
Secretary:
Giovanni Erbacci – PRACE-1IP WP4 leader
Editor-in-chief:
Marjolein Oorsprong – PRACE Communications Officer
TABLE OF CONTENTS

Glossary
Executive Summary
Key Recommendations
1 The European HPC Ecosystem and its Potential Impact – 2012–2020
1.1 Introduction and Background
1.2 Objectives and Scope of the Scientific Case Update
1.3 Progress to be Expected During the Petascale Era
1.4 Balance between Scientific, Industrial and Societal Benefits
2 Weather, Climatology and solid Earth Sciences
2.1 Summary
2.2 Computational Grand Challenges and Expected Outcomes
2.3 A Roadmap for Capability and Capacity Requirements
2.4 Expected Status in 2020
3 Astrophysics, High-Energy Physics and Plasma Physics
3.1 Summary
3.2 Computational Grand Challenges and Expected Outcomes
3.3 A Roadmap for Capability and Capacity Requirements
3.4 Expected Status in 2020
4 Materials Science, Chemistry and Nanoscience
4.1 Summary
4.2 Computational Grand Challenges and Expected Outcomes
4.3 A Roadmap for Capability and Capacity Requirements
4.4 Expected Status in 2020
5 Life Sciences and Medicine
5.1 Summary
5.2 Computational Grand Challenges and Expected Outcomes
5.3 A Roadmap for Computational Requirements
5.4 Expected Status in 2020
6 Engineering Sciences and Industrial Applications
6.1 Introduction
6.2 Computational Grand Challenges & Expected Outcomes in Engineering
6.3 Computational Grand Challenges and Expected Outcomes in Industry
6.4 Engineering and Industrial Exascale Issues
6.5 A Roadmap for Computational Requirements
6.6 Expected Status in 2020
7 Requirements for the Effective Exploitation of HPC by Science and Industry
7.1 Introduction
7.2 An Effective and Persistent Infrastructure
7.3 Computational Science Infrastructure in Europe
7.4 The Challenges of Exascale-Class Computing
7.5 A Support Infrastructure for the European HPC Community
7.6 Education and Training of Researchers
7.7 Community Building and Centres of Competence
8 Membership of International Scientific Panel
GLOSSARY

4DVAR – Four-dimensional variational assimilation; a generalisation of 3DVAR for observations that are distributed in time
ACARE – Advisory Council for Aviation Research and Innovation in Europe
AGN – Active Galactic Nuclei
ALD – Atomic Layer Deposition
ARGO – Argo is a global array of 3,000 free-drifting profiling floats that measures the temperature and salinity of the upper 2,000 m of the ocean
B3LYP – A hybrid functional in which the exchange energy, in this case from Becke's exchange functional, is combined with the exact energy from Hartree–Fock theory
BigBOSS – A ground-based dark energy experiment to study baryon acoustic oscillations (BAO) and the growth of structure with an all-sky galaxy redshift survey
BLAS – Basic Linear Algebra Subprograms
BLAST – Basic Local Alignment Search Tool (bioinformatics)
BSC – Barcelona Supercomputing Center (Spain)
BSM – Beyond the Standard Model
CAD – Computer-Aided Design
CAE – Computer-Aided Engineering
CASPT2 – Complete Active Space with Second-order Perturbation Theory
CASSCF – Complete Active Space Self-Consistent Field method; a particularly important multi-configurational self-consistent field approach (MCSCF)
CC – Coupled cluster; a numerical technique used for describing many-body systems
CCSD(T) – Coupled-cluster method that includes singles and doubles fully, while triples are calculated non-iteratively
CEA – Commissariat à l'Energie Atomique
CECAM – Centre Européen de Calcul Atomique et Moléculaire
CECDC – Combustion Exascale Co-Design Center (USA)
CERF – Co-Design for Exascale Research in Fusion (USA)
CERFACS – European Centre for Research and Advanced Training in Scientific Computation (France)
CERN – European Organisation for Nuclear Research
CESAR – Office of Science Center for Exascale Simulation of Advanced Reactors (USA)
CFD – Computational Fluid Dynamics
CI – Configuration Interaction
CINES – Centre Informatique National de l'Enseignement Supérieur (France)
CMCC – Centro Euro-Mediterraneo per i Cambiamenti Climatici (Italy)
CMIP5 – Coupled Model Intercomparison Project Phase 5
CNRS – Centre National de la Recherche Scientifique (France)
COPES – Coordinated Observation and Prediction of the Earth System
CPMD – 'Car–Parrinello' molecular dynamics (ab initio MD)
CPU – Central Processing Unit
CSM – Continuum solvation models
CT-QMC – Continuous-Time Quantum Monte Carlo methods for numerically exact calculation of complicated fermionic path integrals
CTM – Chemical transport models
CUDA – NVIDIA's parallel computing architecture
CVD – Chemical vapour deposition
DAs – Distribution Amplitudes
DEM – Discrete Element Method (particle simulation technology)
DESY – Deutsches Elektronen-Synchrotron, a research centre of the Helmholtz Association, in Hamburg (Germany)
DFT – Density Functional Theory; a quantum mechanical modelling method used to investigate the electronic structure of many-body systems
DMFT – Dynamical Mean-Field Theory
DNA – Deoxyribonucleic Acid
DNS – Direct Numerical Simulation
DPD – Dissipative Particle Dynamics
EBI – European Bioinformatics Institute
EByte – 1 Exabyte = 10^18 bytes of digital information
ECWS – Exascale Climate and Weather Science Co-Design Center
EDF – Electricité de France (France)
EESI – European Exascale Software Initiative (Europe)
Eflop/s – 1 Exaflop/s = 10^18 floating-point operations per second
EFT – Effective Field Theory
EIDA – European Integrated Waveform Data Archive
ELI – Extreme Light Infrastructure; a European project involving nearly 40 research and academic institutions from 13 EU Member States, forming a pan-European laser facility
ELIXIR – European bioinformatics initiative to construct and operate a sustainable infrastructure for biological information in Europe
EMBL – European Molecular Biology Laboratory
EM-PIC – Electromagnetic PIC simulation
ENES – European Network for Earth System modelling
ENSO – El Niño/La Niña–Southern Oscillation; a quasiperiodic climate pattern that occurs across the tropical Pacific Ocean roughly every five years
EPOS – European Plate Observing System
EPR – A third-generation pressurised water reactor (PWR) design
ESA – European Space Agency
ESF – European Science Foundation
ESM – Earth System Model (climate)
ESMF – Earth System Modelling Framework; open-source software for building climate, numerical weather prediction, data assimilation and other Earth science software applications
ESFRI – European Strategy Forum on Research Infrastructures
ETP4HPC – European Technology Platform (ETP) for High-Performance Computing
Euclid – A planned space telescope; an M-class mission of the ESA Cosmic Vision 2015–2025 programme, planned to be launched in 2019
Exascale – Simulations and HPC systems which calculate at around 10^18 floating-point operations per second
FAIR – Facility for Antiproton and Ion Research
FE – Finite Elements
FEA – Finite Element Analysis
FFT – Fast Fourier Transformation
Flash – High-Energy Density Physics Co-Design Center (USA)
FP7 – European Commission – Research: The Seventh Framework Programme (2007–2013)
FZJ – Forschungszentrum Jülich (Germany)
Gauß-Allianz – The Gauß-Allianz e.V. is a German association in which academic computing centres team up to create the necessary infrastructure for the future of HPC and Grid computing on a national level
GCM – Global Climate Model
GENCI – Grand Equipement National de Calcul Intensif (France)
GENE – Gyrokinetic Electromagnetic Numerical Experiment; an open-source plasma microturbulence code
GEO600 – A gravitational wave detector located near Sarstedt, Germany
GHG – Greenhouse Gas
GMES – Global Monitoring for Environment and Security; a European initiative
GPDs – Generalised Parton Distributions
GP-GPU – General-Purpose Graphics Processing Unit
GPU – Graphics Processing Unit
GRAPE – Gravity Pipeline Engine; special-purpose hardware
GW – The GW approximation, derived by Hedin, is based on an expansion in terms of the dynamically screened Coulomb interaction
GYSELA – The GYSELA code simulates the electrostatic branch of the Ion Temperature Gradient turbulence in tokamak plasmas
Hadoop – Open-source software project that enables the distributed processing of large data sets across clusters of commodity servers
HED – High energy density (plasma physics)
HEP – High-Energy Physics
HiPER – High-Power Laser for Energy Research
HLRS – High-Performance Computing Center Stuttgart (Germany)
HPC – High-Performance Computing
HQP – Highly Qualified Personnel
HTS – High-Throughput Screening
I/O – Input and Output
IATA – International Air Transport Association
IBM – Immersed Boundary Method (particle simulation technology)
IceCube – The IceCube Neutrino Observatory; a neutrino telescope constructed at the Amundsen–Scott South Pole Station in Antarctica
ICF – Inertial Confinement Fusion
IDC – International Data Corporation
IESP – International Exascale Software Project
IFERC – International Fusion Energy Research Centre
IGBP – International Geosphere–Biosphere Programme
IGBP-AIMES – The Earth System synthesis and integration project of the IGBP
INCITE – Innovative and Novel Computational Impact on Theory and Experiment
Intel MIC – Intel Many Integrated Core architecture
IPCC – Intergovernmental Panel on Climate Change
IPCC-AR5 – Fifth Assessment Report (AR5) of the IPCC
ITER – ITER Fusion Research Collaboration
JET – Joint European Torus
JWST – The James Webb Space Telescope
K computer – The first machine to achieve 10 Pflop/s (Fujitsu, Japan)
KIT – Karlsruhe Institute of Technology
LBM – Lattice Boltzmann Method
LDA – Local Density Approximation
LDA-DMFT – LDA plus Dynamical Mean-Field Theory, used to address strongly correlated electron systems
LES – Large Eddy Simulation
LHC – Large Hadron Collider
LIGO – Laser Interferometer Gravitational-Wave Observatory (US)
Linpack – The LINPACK Benchmarks are a measure of a system's floating-point computing power
LQCD – Lattice Quantum Chromodynamics
LSST – Large Synoptic Survey Telescope
MapReduce – A patented software framework introduced by Google in 2004 to support distributed computing on large data sets on clusters of computers
MatSEEC – An independent ESF science-based committee in materials science and its applications, materials engineering and technologies, and related fields of science and research management
MByte – 1 Megabyte = 10^6 bytes of digital information
MC – Monte Carlo
MCF – Magnetic Confinement Fusion
MCTDH – Multi-Configuration Time-Dependent Hartree algorithm
MD – Molecular Dynamics
MDGRAPE – Molecular Dynamics Gravity Pipeline Engine
MDO – Multidisciplinary Design and Optimisation
MeMoVolc – European research network in Measuring and Modelling Volcano Eruption Dynamics
MHD – Magneto-Hydrodynamics
MMM – Multiscale Materials Modelling
MP2 – Second-order Møller–Plesset perturbation theory (MP)
MPCD – Multi-Particle Collision Dynamics
MPI – Message Passing Interface (distributed memory system programming model)
MRAM – Magnetic Random Access Memory
MRI – Magnetic Resonance Imaging
MSU – Moscow State University (Russia)
MW – Megawatt
MyOcean – Implementation of the Marine Core Service
NERIES – Integrated Infrastructure Initiative FP6 project aiming at networking the European seismic networks
Neutronics – Neutron transport, or simply neutronics, is the term used to describe the mathematical treatment of neutron and gamma-ray transport through materials
NVH – Noise, Vibration and Harshness
ODEs – Ordinary Differential Equations
OpenMP – Open specification for Multi-Processing (shared memory system programming model)
ORCA12 – Global ocean model including sea ice, at 1/12 degree resolution
ORFEUS – The European non-profit foundation that aims at coordinating and promoting digital broadband (BB) seismology in the European–Mediterranean area
OSIRIS – An integrated framework for parallel PIC simulations
Pan-STARRS – The Panoramic Survey Telescope and Rapid Response System
PByte – 1 Petabyte = 10^15 bytes of digital information
PDE – Partial Differential Equations
Petascale – Simulations and HPC systems which calculate at around 10^15 floating-point operations per second
Pflop/s – 1 Petaflop/s = 10^15 floating-point operations per second
PIC – Particle In Cell
PRACE – Partnership for Advanced Computing in Europe
PRACE-RI – PRACE Research Infrastructure
PRACE-1IP – First Implementation Phase of PRACE
PRMAT – Parallel R-Matrix Program
PWR – Pressurised Water Reactor
QCD – Quantum Chromodynamics
QFT – Quantum Field Theory
QM/MM – Quantum Mechanical / Molecular Mechanics
QMC – Quantum Monte Carlo
QSAR – Quantitative Structure–Activity Relationship
QUEST – QUantitative Estimation of Earth's Seismic Sources and Structure; an Initial Training Network in computational seismology funded within EU FP7
RANS – Reynolds-Averaged Navier–Stokes
Reτ – In fluid mechanics, the Reynolds number (Re) gives a measure of the ratio of inertial forces (which characterise how much a particular fluid resists any change in motion) to viscous forces, and consequently quantifies the relative importance of these two types of forces for given flow conditions; Reτ denotes the friction Reynolds number, based on the friction velocity
SAR – Synthetic-Aperture Radar
SDP – Seismic Data Processing
SHMEM – SHared MEMory
SKA – Square Kilometre Array
SLOOP – SheLf to deep Ocean mOdelling of Processes; the International Training Network SLOOP recently submitted to FP7
SM – Standard Model
SME – Small and Medium Enterprise
SN – A supernova (abbreviated SN, plural SNe for supernovae)
Sn neutronics – A deterministic method for neutronics (neutron transport) in which the particle flux distribution in space, angle and energy is found by solving the transport equation numerically
SNP – Single Nucleotide Polymorphism
SPH – Smoothed Particle Hydrodynamics (particle simulation technology)
SPICE – Seismic wave Propagation and Imaging in Complex media; a Marie Curie Research Training Network in FP6 focusing on research and training in all aspects of computational seismology
SQUIDs – Superconducting QUantum Interference Devices; sensitive sensors for magnetic fields
SSC – PRACE Scientific Steering Committee
SST – Sea-surface temperature
STFC – Science and Technology Facilities Council (UK)
Super-K – The Super-Kamiokande neutrino observatory (Japan)
TByte – 1 Terabyte = 10^12 bytes of digital information
TDDFT – Time-Dependent DFT
Terascale – Simulations and HPC systems which calculate at around 10^12 floating-point operations per second
Tflop/s – 1 Teraflop/s = 10^12 floating-point operations per second
THC – Thermohaline Circulation
Tier-0 – Leadership-class computing systems
Tier-1 – National 'mid-range' HPC systems
Tier-2 – Institutional HPC systems
TMDs – Transverse Momentum-Dependent Distribution functions
TOPO-EUROPE – European initiative in the Geoscience of Coupled Deep Earth–Surface Processes
TOPOMOD – Training project to investigate and model the origin and evolution of topography of the continents over a wide range of spatial and temporal scales, using a multidisciplinary approach coupling geophysics, geochemistry, tectonics and structural geology with advanced geodynamic modelling
UCL – University College London
US – United States
USA – United States of America
VERCE – Virtual Earthquake and seismology Research Community in Europe e-science environment
VIRGO – A gravitational wave detector (Michelson laser interferometer) in Italy with two orthogonal arms, each 3 kilometres long
WCES – Weather, Climate and solid Earth Sciences
WCRP – World Climate Research Programme
WG – Working Group
WORM – A Write Once Read Many drive is a data storage device where information, once written, cannot be modified
WP – Work Package
ZByte – 1 Zettabyte = 10^21 bytes of digital information
Zettascale – Simulations and HPC systems which calculate at around 10^21 floating-point operations per second
Zflop/s – 1 Zettaflop/s = 10^21 floating-point operations per second
Communication from the EC6 on ‘High-Performance Computing: Europe’s Place in a Global Race’, COM(2012) 45 final
‘The race for leadership in HPC systems is driven both by the need to address societal and scientific grand challenges more effectively, such as early detection and treatment of diseases like Alzheimer's, deciphering the human brain, forecasting climate evolution, or preventing and managing large-scale catastrophes, and by the needs of industry to innovate in products and services.’ ‘Industry has a dual role in high-end computing: firstly, supplying systems, technologies and software services for HPC; and secondly, using HPC to innovate in products, processes and services. Both are important in making Europe more competitive. Especially for SMEs, access to HPC, modelling, simulation, product prototyping services and consulting is important to remain competitive. This Action Plan advocates for a dual approach: strengthening both the industrial demand and supply of HPC.’
EXECUTIVE SUMMARY
1. In 2005 and 2006, an international panel produced a White Paper entitled ‘Scientific Case for Advanced Computing in Europe’ that argued the case for High-Performance Computing (HPC) to support European competitiveness.1 The document was published by the ‘HPC in Europe Taskforce’2 in January 2007. The initiative was instrumental in the establishment of PRACE – the Partnership for Advanced Computing in Europe – and the PRACE Research Infrastructure (PRACE-RI) in April 2010.3 This document represents the culmination of an initiative by PRACE to update the Scientific Case and capture the current and expected future needs of the scientific communities. It has involved the PRACE Scientific Steering Committee (SSC)4 and leading scientists from across all major user disciplines, and has been funded through the 1st Implementation Phase of PRACE – the PRACE-1IP project.
2. Five years after the publication of the Scientific Case, the HPC landscape in Europe has changed significantly. The PRACE Research Infrastructure is providing Tier-0 HPC services – large allocations of time on some of the most powerful computers in the world – to researchers in Europe; a global effort has been launched towards achieving exascale HPC5 by the end of this decade; and the importance of HPC in solving socio-economic challenges and maintaining Europe’s competitiveness has become even more evident.
3. The scope of this report is wide-ranging, capturing the conclusions of five scientific areas, each derived from the work of an associated panel of experts. The five panels cover: Weather, Climatology and solid Earth Sciences; Astrophysics, HEP and Plasma Physics; Materials Science, Chemistry and Nanoscience; Life Sciences and Medicine; and Engineering Sciences and Industrial Applications.
4. The position of HPC has evolved since 2007 from a technology crucial to the academic research community to a point where it is acknowledged as central to pursuing ‘Europe’s place in a Global Race’.6 The sidebar extract from the communication from the Commission to the European Parliament bears testimony to this position.
1 www.hpcineuropetaskforce.eu/files/Scientific case for European HPC infrastructure HET.pdf
2 http://www.hpcineuropetaskforce.eu
3 http://www.prace-ri.eu/
4 http://www.prace-ri.eu/Organisation
5 http://www.exascale.org – Experts predict that exascale computers (capable of 10^18 operations per second) will be in existence before 2020.
6 Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions – ‘High-Performance Computing: Europe’s place in a Global Race’, 15.2.2012, http://ec.europa.eu/information_society/newsroom/cf/item-detail-dae.cfm?item_id=7826
5. HPC is currently undergoing a major change as the next generation of computing systems (‘exascale systems’)4 is being developed for 2020. These new systems pose numerous challenges, from a 100-fold reduction of energy consumption to the development of programming models for computers that host millions of computing elements, while addressing the data challenge presented by the storage and integration of both observational and simulation/modelling data. These challenges cannot be met by mere extrapolation but require radical innovation in several computing technologies. This offers opportunities for industrial and academic players in the EU to reposition themselves in the field.
6. All of the panels contributing to this report are convinced that the competitiveness of European science and industry will be jeopardised if sufficiently capable computers are not made available, together with the associated infrastructure and skilled people necessary to maximise their exploitation. The panels have listed multiple areas at risk, concluding that access to high-performance computers in the exascale range is of the utmost importance. Thus, in aerospace, considerable changes in the development processes will lead to a significant reduction in development times while at the same time including more and more disciplines in the early design phases to find an overall optimum for the aircraft configuration. This will enable the European aircraft industry to keep a leading role in worldwide competition, facing both an old challenge, i.e. competing with the USA, and a new, rapidly emerging one – keeping an innovation advantage over China. However, while aerospace can afford its own HPC provision, it may not have the capability to exploit exascale if similar systems are not available to academia for training and software development. In a similar vein, the lack of high-performance computers appropriate for life sciences research will displace R&D activities to the USA, China or Japan, putting European leadership in this field at risk.
7. Providing scientists and engineers with ongoing access to computers of leadership class must be recognised as an essential strategic priority in Europe: there is a compelling need for a continued European commitment to exploit the most powerful computers. Such resources are likely to remain extremely expensive and require significant expertise to procure, deploy and utilise efficiently; some fields even require research into specialised and optimised hardware. The panel stresses that these resources should continue to be reserved for the most exigent computational tasks of high potential value. It is clear that the computational resource pyramid must remain persistent and compelling at all levels, including national centres and access and data grids. The active involvement of the European Community along with appropriate Member States remains critical to maintaining a world-leading supercomputer infrastructure in the European ecosystem. Europe must foster excellence and cooperation in order to gain the full benefits of exascale computing for science, engineering and industry in the European Research Area.
By way of summary, we present below key statements from the Commission and from our thematic panels that emphasise the essential role of a sustainable top-level infrastructure.
PRACE – The Scientific Case for HPC in Europe Executive Summary
13
Communication from the Commission7 to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions – ‘High-Performance Computing: Europe’s place in a Global Race’, COM(2012) 45 final, Brussels, 15.2.2012
‘High-Performance Computing (HPC) is critical for industries that rely on precision and speed, such as automotive and aviation, and the health sector. Access to rapid simulations carried out by ever-improving supercomputers can be the difference between life and death; between new jobs and profits or bankruptcy.’
Weather, Climatology and solid Earth Sciences (WCES)
‘In the last decade, our understanding of climate change has increased, as has the societal need to carry this over into advice and policy. However, while there is great confidence in the fact that climate change is happening, there remain uncertainties. In particular, there is uncertainty about the levels of greenhouse gas emissions and aerosols likely to be emitted and, perhaps more significant, there are uncertainties about the degree of warming and the likely impacts. Increasing the capability and comprehensiveness of ‘whole Earth system’ models that represent in ever-increasing realism and detail scenarios for our future climate is the only way to reduce these latter uncertainties.’
‘A programme of provision of leadership-class computational resources will make it increasingly possible in solid Earth sciences to address the issues of resolution, complexity, duration, confidence and certainty. These challenges have significant scientific and social implications, playing today a central role in natural hazard mitigation, treaty verification for nuclear weapons, increased discovery of economically recoverable petroleum resources, and monitoring of waste disposal.’
‘There is a fundamental need in oceanography and marine forecasting to build and efficiently operate the most accurate ocean models. Improved understanding of ocean circulation and biogeochemistry is critical to properly assess climate variability and future climate change and related impacts on, for example, ocean acidification, coastal sea level, marine life and polar sea-ice cover.’
Astrophysics, High-Energy Physics and Plasma Physics
‘Astrophysics, high-energy physics and plasma physics have, in recent years, shared a dramatic change in the role of theory for scientific discovery. In all three fields, new experiments have become ever more costly, require increasingly long timescales and aim at the investigation of more and more subtle effects. Consequently, theory is faced with two types of demands. First, the precision of theoretical predictions has to be increased to the point that it is better than the experimental one. Since the latter can be expected to increase by further orders of magnitude until 2020, this is a most demanding requirement. In parallel, the need to explore model spaces of much larger extent than previously investigated has also become apparent. For example: in astrophysics, determination of the nature of dark energy and dark matter requires a detailed comparison of predictions from large classes of cosmological models with data from the new satellites and ground-based detectors which will be deployed until 2020. In high-energy physics, one of the tasks is to explore many possible extensions of the Standard Model to such a degree that even minute deviations between experimental data and Standard Model predictions can serve as smoking guns for a specific realisation of New Physics.
7 http://ec.europa.eu/information_society/newsroom/cf/item-detail-dae.cfm?item_id=7826
In plasma physics, one of the tasks is to understand the physics observed at ITER at such a high level that substantially more efficient fusion reactors could be reliably designed based on theoretical simulations which explore a large range of options.’
Materials Science, Chemistry and Nanoscience
‘Computational materials science, chemistry and nanoscience is concerned with the complex interplay of the myriads of atoms in a solid or a liquid, thereby producing a continuous stream of new and unexpected phenomena and forms of matter, characterised by an extreme range of length, time, energy, entropy and entanglement scales. The target of this science is to design materials ranging from the level of a single atom up to the macroscopic scale, and to unravel phenomena and design processes from electronic reaction times in the femtosecond range up to geological periods. Computational materials science, chemistry and nanoscience stand in close interaction with the neighbouring disciplines of biology and medicine, as well as the geosciences, and affect wide fields of the engineering sciences. A large and diverse computational community, which views the conceptualisation, development and implementation of algorithms and tools for cutting-edge HPC as critical assets, will achieve this goal. These tools are used to great benefit in other communities such as medicine and life sciences, and engineering sciences and industrial applications.’
‘The advance from petascale to exascale computing will change the paradigm of computational materials science and chemistry. The move to petascale is broadening this paradigm – to an integrated engine that determines the pace in a design continuum from the discovery of a fundamental physical effect, a process, a molecule or a material, to materials design, systems engineering, processing and manufacturing activities, and finally to the deployment in technology, where multiple scientific disciplines converge. Exascale computing will significantly accelerate the innovation, availability and deployment of advanced materials and chemical agents and foster the development of new devices. These developments will profoundly affect society and the quality of life, through new capabilities in dealing with the great challenges of knowledge and information, sustained welfare, clean energy, health, etc.’
Life Sciences and Medicine
‘In life sciences and medicine, Eflop/s8 capabilities will allow the use of more accurate formalisms (more accurate energy calculations, for example) and enable molecular simulation for high-throughput applications (e.g. the study of a larger number of systems). Molecular simulation is a key tool for computer-aided drug design. The lack of high-performance computers appropriate for this research will displace R&D activities to the USA, China or Japan, putting European leadership in this field at risk. Appropriate exascale resources could revolutionise the simulation of biomolecules, allowing molecular simulators to decipher the atomistic clues to the functioning of living organisms.
‘Biomedical simulation will reduce costs, time to market and animal experimentation. In the medium to long term, simulation will have a major impact on public health, providing insights into the causes of diseases and allowing the development of new diagnostic tools and treatments. It is expected that understanding the basic mechanisms of cognition, memory, perception, etc., will allow the development of completely new forms of energy-efficient computation and robotics.
8 flop/s, for floating-point operations per second; teraflop, 1 Tflop/s = 10^12 flop/s; exaflop, 1 Eflop/s = 10^18 flop/s
The potential long-term social and economic impact is immense.
‘While exaflop machines are essential for specific areas of life sciences (e.g. brain simulation), and higher computational power will enable significantly increased accuracy for current modelling studies, some extremely important fields in life science will be mainly limited by throughput and data management.’
Engineering Sciences and Industrial Applications
‘All of us experience the effects of HPC in our day-to-day lives, although in many cases we are unaware of that impact. We travel in cars and aeroplanes designed using modelling and simulation applications run on HPC systems so that they are efficient and safe. HPC is essential for ensuring that our energy needs are met. Finding and recovering fossil fuels require engineering analysis that only HPC can deliver. Nuclear power generation also relies heavily on HPC to ensure that it is safe and reliable. In the coming years, HPC will have an even greater impact as more products and services come to rely on it.’
‘The automotive industry is actively pursuing important goals that need exaflop computing capability or greater. Examples include (i) vehicles that will operate for 250,000 kilometres on average without the need for repair – this would provide substantial savings for automotive companies by enabling the vehicles to operate through the end of the typical warranty period at minimal cost to the automakers – and (ii) full-body crash analysis, required by insurance companies, that includes simulation of soft tissue damage – today's "crash dummies" are inadequate for this purpose.’
‘The impact of computer simulation in aircraft design has been significant and continues to grow. Numerical simulation allows the development of highly optimised designs and reduced development risks and costs. Boeing, for example, exploited HPC in order to reduce drastically the number of real prototypes from 77 physical prototype wings for the 757 aircraft to only 11 prototype wings for the 787 "Dreamliner" plane. HPC usage saved the company billions of dollars.’
‘In addition to the automotive and aeronautics examples above, many areas within the engineering sciences – seismic wave equation inversion, engine combustion (chemical and multi-physics combustion) and turbulence – demand highly scalable or so-called “hero applications” to deliver long-term social and economic impact.’
Giovanni Aloisio, University of Salento and ENES-‐CMCC, Italy
Chair, Weather, Climatology and solid Earth Sciences Panel
Andreas Schäfer, University of Regensburg, Germany
Chair, Astrophysics, HEP and Plasma Physics Panel
Stefan Blügel, Forschungszentrum Jülich, Germany
Chair, Materials Science, Chemistry and Nanoscience Panel
Modesto Orozco, Institute for Research in Biomedicine, Spain
Chair, Life Sciences and Medicine Panel
Philippe Ricoux, TOTAL, France
Chair, Engineering Sciences and Industrial Applications Panel
Martyn Guest, Cardiff University, Wales, United Kingdom
Lead Author of the Scientific Case
KEY RECOMMENDATIONS
In pointing to the compelling need for a continued European commitment to exploit leadership-class computers, the scientific panels have considered the infrastructure requirements that must underpin this commitment, and present their considerations as part of a review of computational needs. This review covers both the vital components of the computational infrastructure and the user support functions that must be provided to realise the full benefit of that infrastructure. It has led to a set of key recommendations deemed vital in shaping the future provision of resources; these recommendations are justified in full in the Scientific Case9 and outlined below.
Recommendation 1 Need for HPC Infrastructure at the Europe Level
The scientific progress that has been achieved using HPC since the ‘Scientific Case for Advanced Computing in Europe’ was published in 2007, the growing range of disciplines that now depend on HPC, and the technical challenges of exascale architectures make a compelling case for continued investment in HPC at the European level. Europe should continue to provide a world-leading HPC infrastructure to scientists in academia and industry, for research that cannot be done any other way, through peer review based solely on excellence. This infrastructure should also address the need for centres to test the maturity of future exascale codes and to validate HPC exascale software ecosystem components developed in the EU or elsewhere.
Recommendation 2 Leadership and Management
The development of Europe’s HPC infrastructure, its operation and access mechanisms must be driven by the needs of science, industry and society to conduct world-leading research. This public-sector investment must be a source of innovation at the leading edge of technology development, and this requires user-centric governance. Leadership and management of HPC infrastructure at the Europe level should be a partnership between users and providers.
Recommendation 3 A Long-Term Commitment to Europe-Level HPC
Major experiments depend on HPC for analysis and interpretation of data, including simulation of models to try to match observation to theory, and support research programmes extending over 10–20 year time frames. Some applications require access to stable hardware and system software for 3–5 years. Data typically need to be accessed over long periods and require a persistent infrastructure. Investment in new software must realise benefits over at least 10 years, with the lifetime of major software packages being substantially longer. A commitment to Europe-level HPC infrastructure over several decades is required to provide researchers with a planning horizon of 10–20 years and a rolling 5-year specific technology upgrade roadmap.
9 See section 7.
Recommendation 4 Algorithms, Software and Tools
Most applications targeting Tier-0 machines require some degree of rewriting to expose more parallelism, and many face severe strong-scaling challenges if they are to progress effectively to exascale, as their science goals demand. There is an ongoing need for support for software maintenance, for tools to manage and optimise workflows across the infrastructure, and for visualisation. Support for the development and maintenance of community code bases is recognised as enhancing research productivity and the take-up of HPC. There is an urgent need for algorithm and software development to be able to continue to exploit high-end architectures efficiently to meet the needs of science, industry and society.
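The severity of the strong-scaling challenge can be made concrete with Amdahl's law, a standard (and here purely illustrative) model: if a fraction p of a code's runtime is parallelisable, the speedup on N processing elements is bounded by

\[
S(N) = \frac{1}{(1-p) + p/N}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{1-p}.
\]

Even a code that is 99.9% parallel (p = 0.999) can never exceed a 1,000-fold speedup, far short of the roughly million-way parallelism of an exascale machine; this is why exposing more parallelism through algorithmic redesign, rather than incremental tuning, is essential.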
Recommendation 5 Integrated Environment for Compute and Data
Most application areas foresee the need to run long jobs (for months or years) at sustained performances10 of around 100 Pflop/s to generate core data sets, and very many shorter jobs (for hours or days) at lower performances for pre- and post-processing, model searches and uncertainty quantification. A major challenge is the end-to-end management of, and fast access to, large and diverse data sets, vertically through the infrastructure hierarchy. Most researchers seek more flexibility and control over operating modes than they have today, to meet the growing need for on-demand use with guaranteed turnaround times, for computational steering, and to protect sensitive codes and data. Europe-level HPC infrastructure should attach equal importance to compute and data, provide an integrated environment across Tiers 0 and 1, and support efficient end-to-end data movement between all levels. Its operation must be increasingly responsive to user needs and data security issues.
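To give a sense of the data rates involved (the figures below are purely illustrative assumptions, not panel projections), a core-data-generating run that writes one petabyte per day must sustain

\[
\frac{10^{15}\ \text{bytes}}{86\,400\ \text{s}} \approx 1.2 \times 10^{10}\ \text{bytes/s} \approx 12\ \text{GB/s}
\]

of write bandwidth around the clock, before any replication, post-processing or movement between tiers is accounted for. Sustained cross-tier transfers at such rates are what an integrated compute and data environment must provision for.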
Recommendation 6 People and Training
There is grave concern about HPC skills shortages across all research areas, and particularly in industry. The need is for people with both domain and computing expertise. The problems are both insufficient supply and low retention, caused by poor career development opportunities for those supporting academic research. Europe’s long-term competitiveness depends on people with the skills to exploit its HPC infrastructure. It must provide ongoing training programmes to keep pace with the rapid evolution of the science, methods and technologies, and must put in place more attractive career structures for software developers to retain their skills in universities and associated institutions.
Recommendation 7 Thematic Centres
Organisational structure is needed to support large long-term research programmes, bringing together competences to share expertise. This could take the form of virtual or physical thematic centres which might support community codes and data, operate dedicated facilities, focus on co-design, or have a cross-cutting role in the development and support for algorithms, software or tools. While some existing application areas have self-organised in this way, new areas such as medicine might achieve more rapid impact if encouraged to follow this path. Thematic centres should be established to support large long-term research programmes and cross-cutting technologies, to preserve and share expertise, to support training and to maintain software and data.
10 Petaflop, 1 Pflop/s = 10^15 flop/s
1 THE EUROPEAN HPC ECOSYSTEM AND ITS POTENTIAL IMPACT – 2012–2020

1.1 Introduction and Background

In this section, we first provide some of the background to developments since the last report. Following a brief summary of the emerging European strategy for HPC (section 1.1.1) and the achievements of PRACE to date (section 1.1.2), we highlight the major opportunity presented by the advent of the next-generation exascale systems.4 The objectives and scope of the present update, and the resulting scientific perspective from each of the five panels charged with contributing to this case, are summarised in sections 1.2 and 1.3. Our considerations extend beyond scientific impact to the balance between the scientific, industrial and social benefits of HPC (section 1.4).
HPC is currently undergoing a major change as the next generation of computing systems (‘exascale systems’)4 is being developed for 2020. These new systems pose numerous challenges, from a 100-fold reduction of energy consumption to the development of programming models for computers that host millions of computing elements. Exascale systems will be very different from today’s HPC systems, and building, operating and using them will involve severe technological challenges. While the major focus of this report lies in the scientific challenges and outcomes associated with the provision of exascale resources, this panel would stress from the outset that such outcomes are critically dependent on the provision of the associated support infrastructure: without it, the full benefits of an exascale-class infrastructure will simply not be realised. An overview of the key requirements is given in section 7.
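To make the programming-model challenge concrete, the sketch below shows the hybrid MPI + OpenMP pattern that dominates today's petascale codes: message passing between nodes combined with shared-memory threading within a node. It is a minimal illustration written for this update, not a PRACE code; scaling this model to machines hosting millions of computing elements is precisely where new programming models are needed.

```c
/* Minimal hybrid MPI + OpenMP sketch: distributed ranks across nodes,
 * shared-memory threads within each node. Illustrative only. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks;

    /* Request thread support so OpenMP regions can run inside an MPI rank */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Each rank does its share of work, parallelised across on-node threads */
    double local = 0.0;
    #pragma omp parallel reduction(+:local)
    local += 1.0;              /* stand-in for real per-thread computation */

    /* Combine per-rank partial results across the whole machine */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("%d ranks, %.0f threads in total\n", nranks, global);

    MPI_Finalize();
    return 0;
}
```

At exascale, the cost of global operations such as the reduction above, and the energy cost of data movement generally, become dominant constraints, which is one reason mere extrapolation of current models will not suffice.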
Through PRACE, the academic sector is now pooling its leadership-class, or Tier-0, computing systems as a single infrastructure, making them available to all researchers in the EU. Critical mass is thus achieved, and access to these top-of-the-range HPC systems is provided on the basis of scientific excellence rather than the geographical location of a researcher. PRACE is further extending its services to mid-range HPC systems (Tier-1) with the objective of providing a distributed computing ecosystem that serves its users irrespective of their location and the availability of national resources. The scientific panels responsible for this paper are convinced that the PRACE model of pooling and sharing systems and expertise makes optimal use of the limited resources available.
Europe has many of the technical capabilities and human skills needed to tackle the exascale challenge, i.e. to develop native capabilities that cover the whole technology spectrum from processor architectures to applications. Even though the EU is currently weak compared to the US in terms of HPC system vendors, it has particular strengths in microprocessor architectures, applications, low-power computing, systems and integration that can be leveraged to engage successfully in this global race, restoring the EU to the world scene as a leading-edge technology supplier. Progress within Europe has to date been channelled through EESI11 – the European Exascale Software Initiative – co-funded by the European Commission.
EESI’s goal is to build a European vision and roadmap to address the challenge of the new generation of massively parallel systems that provided Pflop/s performance in 2010 and will provide Eflop/s performance in 2020.12 EESI is investigating the strengths and weaknesses of Europe in the overall
11 http://www.eesi-project.eu/pages/menu/homepage.php
12 flop/s, for floating-point operations per second; teraflop, 1 Tflop/s = 10^12 flop/s; petaflop, 1 Pflop/s = 10^15 flop/s; exaflop, 1 Eflop/s = 10^18 flop/s; zettaflop, 1 Zflop/s = 10^21 flop/s
international HPC landscape and competition. In identifying priority actions and the sources of competitiveness for Europe induced by the development of peta/exascale solutions and usages, EESI is investigating and proposing programmes in education and training for the next generation of computational scientists. The Initiative is also seeking to identify and stimulate opportunities for worldwide collaborations.
Leveraging the results of the EESI deliberations in the current exercise has been ensured by including many of the EESI project leads in the present panel membership.
The primary objectives of this update to the Scientific Case are to identify the scientific areas for which PRACE is an important Research Infrastructure and the key challenges within these areas. In highlighting the crucial role of large-scale computer simulation, we identify the potential outcomes in science and engineering to be addressed through PRACE petascale resources, and the impact of computer simulations on the economy and society in general. This impact is quantified through the production of a roadmap of expected achievements in the next 5–8 years.
The importance of exascale-class supercomputing for scientific and economic leadership has been stressed in numerous reports in the USA and Europe. Globally, nations are investing in HPC to tackle some of these issues. In the 1990s, the USA stood out as the world leader in HPC, with Europe and Japan the other major players. Now, countries including India, Russia and China are undertaking ambitious HPC programmes. Europe’s HPC investment has lost ground by 10% since 2007.13 Failure by Europe to increase its investment means not only that it risks falling further behind the world leader, the USA, but, worse, that its position may be threatened by emerging HPC powers.
1.1.1 The Emergence of a European HPC Strategy

The development of HPC has long been a national affair for EU Member States, often driven by military and nuclear energy applications. In recent years, the increasing importance of HPC for researchers and industry, as well as the exponential rise in the investments required to stay competitive at world level, have led to a common understanding that ‘Europeanisation’ of this domain would benefit everyone. This is also true for those Member States which encounter difficulties in creating self-sufficient national HPC infrastructures but which can make valuable contributions to and benefit from EU-level HPC capabilities.
As outlined above, the HPC in Europe Taskforce published in 2007 a White Paper entitled ‘Scientific Case for Advanced Computing in Europe’1 that argued the case for HPC to support EU competitiveness. This work was carried out in the context of the ESFRI14 European Roadmap for Research Infrastructures. It led to the consolidation of national HPC strategies, e.g. in Germany and France with the creation of the Gauß-Allianz15 and of GENCI (Grand Equipement National de Calcul Intensif)16 respectively. In turn, these developments resulted in the setting up of PRACE, as Member States and national institutions realised that only through a joint and coordinated effort would they be able to stay competitive. This was supported in 2009 by the European Council, which called for further efforts in this domain.
1.1.2 Achievements of PRACE to Date

Since the creation of the PRACE legal entity in 2010, the academic sector has been pooling its leadership-class computing systems as a single infrastructure, making them available to all researchers in the EU.
13 http://insidehpc.com/2010/11/23/video-interview-with-idc-on-their-strategic-agenda-for-european-supercomputing-leadership/
14 European Strategy Forum on Research Infrastructures, http://ec.europa.eu/research/infrastructures/index_en.cfm?pg=esfri
15 http://www.gauss-allianz.de/en/
16 http://www.genci.fr/?lang=en
‘PRACE ensures the wide availability of HPC resources on equal access terms. It has to be further strengthened to acquire the competence to (i) pool national and EU funds, (ii) set the specifications and carry out joint (pre-commercial) procurement for leadership-class systems, (iii) support Member States in their preparation of procurement exercises, (iv) provide research and innovation services to industry, and (v) provide a platform for the exchange of resources and contributions necessary for the operation of high-performance computing infrastructure. Additionally, an e-Infrastructure for HPC application software and tools needs to be put in place. It should further consolidate the EU’s strong position in HPC applications by coordinating and stimulating parallel software code development and scaling, and by ensuring the availability of quality HPC software to users.’
Critical mass is achieved, and access to these top-of-the-range HPC systems is provided on the basis of scientific excellence rather than the geographical location of a researcher. PRACE is further extending its services to mid-range HPC systems with the objective of providing a distributed computing platform that serves its users irrespective of their location and the availability of national resources. The PRACE model of pooling and sharing systems and expertise makes optimal use of the limited resources available.
The mission of the PRACE RI is thus to enable high-impact European scientific discovery and engineering research and development across all disciplines, to enhance European competitiveness for the benefit of society.
The PRACE RI seeks to realise this goal through world-class computing and data management resources and services, open to all European public research through a peer review process. With the broad participation of European governments through representative organisations, the PRACE RI can provide a diversity of resources – including expertise throughout Europe in the effective use of those resources.
PRACE encourages industrial use and collaboration with industry, conducting annual Industrial Seminars at locations throughout Europe. It also seeks to strengthen the European HPC industry through various initiatives, and has a strong interest in improving the energy efficiency of computing systems and reducing their environmental impact.
The PRACE RI is established as an international non-profit association located in Brussels and is named the ‘Partnership for Advanced Computing in Europe AISBL’. It has 24 member countries whose representative organisations are creating a pan-European supercomputing infrastructure, providing access to computing and data management resources and services for large-scale scientific and engineering applications at the highest performance level. PRACE is funded by member governments through their representative organisations and by the EU’s Seventh Framework Programme (FP7/2007-2013).17
The first PRACE computer systems and their operations are funded by the governments of the representative organisations hosting the systems.
It is clear that the position of HPC has evolved since the last Scientific Case in 2007 from a discipline crucial to the academic research community to a point where it is acknowledged to be a central asset in pursuing ‘Europe’s place in a Global Race’.18
The sidebar below provides an extract from the Communication from the Commission to the European Parliament:

'PRACE ensures the wide availability of HPC resources on equal access terms. It has to be further strengthened to acquire the competence to (i) pool national and EU funds, (ii) set the specifications and carry out joint (pre-commercial) procurement for leadership-class systems, (iii) support Member States in their preparation of procurement exercises, (iv) provide research and innovation services to industry, and (v) provide a platform for the exchange of resources and contributions necessary for the operation of high-performance computing infrastructure. Additionally, an e-Infrastructure for HPC application software and tools needs to be put in place. It should further consolidate the EU's strong position in HPC applications by coordinating and stimulating parallel software code development and scaling, and by ensuring the availability of quality HPC software to users.'
17 Under grant agreement n° RI-261557
18 Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions – 'High-Performance Computing: Europe's place in a Global Race', 15.2.2012, http://ec.europa.eu/information_society/newsroom/cf/item-detail-dae.cfm?item_id=7826
Apex of Resources
The PRACE leadership systems form the apex of resources for large-scale computing and data management for scientific discovery, engineering research and development for the benefit of Europe, and are well integrated into the European HPC ecosystem.
Access
The PRACE RI offers three different forms of access: Project Access, Programme Access and Preparatory Access. Project Access is the norm for individual researchers and research groups. It is open to academic researchers worldwide and to industry for projects deemed to have significant European and international impact. 'Calls for Proposals', issued twice a year, are evaluated by leading scientists and engineers in a peer review process governed by a PRACE Scientific Steering Committee comprising leading European researchers from a broad range of disciplines. Programme Access is available to major European projects or infrastructures that can benefit from PRACE resources and for which Project Access is not appropriate. Preparatory Access is a simplified form of access to limited resources for the preparation of resource requests in response to Project Access Calls for Proposals.
Education and Training
PRACE has an extensive education and training effort for effective use of the RI through seasonal schools, workshops and scientific and industrial seminars throughout Europe. Seasonal schools target broad HPC audiences, whereas workshops focus on particular technologies, tools, disciplines or research areas. Education and training material and documents related to the RI are available on the PRACE website, as is the schedule of events (http://www.training.prace-ri.eu/). Six PRACE Advanced Training Centres were established in 2012.
Software and Hardware Technology Initiatives
PRACE undertakes software and hardware technology initiatives with the goal of preparing for changes in the technologies used in the RI, and provides the proper tools, education and training for the user communities to adapt to those changes. A further goal of these initiatives is to reduce the lifetime cost of systems and their operation, in particular the energy consumption of systems and their environmental impact.
ETP4HPC
The European Technology Platform (ETP) for High-Performance Computing (HPC) was created to improve Europe's position in the domain of HPC technologies and to foster collaboration among all players in the HPC supply chain. ETP4HPC will promote the growth of Europe's HPC vendors by maintaining a Strategic Research Agenda for HPC technologies, complementing the support provided by PRACE for academic and industrial user communities.
1.2 Objectives and Scope of the Scientific Case Update

The preceding section provides a summary of the developments since the Scientific Case was last published in 2007. The primary objectives of this update to the Scientific Case19 are to identify the scientific areas for which PRACE is an important Research Infrastructure and the key challenges within these areas, highlighting the crucial role that large-scale computer simulation is playing in many areas of science. In addition to identifying the potential outcomes in science and engineering to be addressed through PRACE petascale resources and the anticipated approach of exascale capabilities, the update focuses on the potential impact of computer simulations on the economy and society in general. This impact is quantified through the production of a roadmap of expected achievements in the next 5–8 years.
The scope of this update is to cover the period 2012–2020, including the same panels and scientific areas as before, while extending the life sciences panel to include medicine. The aim is to place a greater emphasis on socio-economic challenges, business and innovation than the original Case.

19 Highlighted by the PRACE SSC, the PRACE User Forum Programme Committee and the Board of Directors
While leveraging the results from the EESI11, the update reports on the status of implementation of the recommendations of the original Scientific Case, in particular through the establishment of the PRACE Research Infrastructure. The final aim is to provide strategic input to the European Commission, national funding agencies, decision makers, the science communities and the PRACE Research Infrastructure.
The following five sections of this report are devoted to the description of a scientific roadmap, detailing the major challenges, the scientific and societal benefits of making progress towards their resolution, and the prerequisites for being able to tackle these challenges. Five scientific areas are presented in sections 2–6, each derived from the work of an associated panel of experts. The five panels and their Chairs are as follows:
• Weather, Climatology and solid Earth Sciences: Chair Giovanni Aloisio, University of Salento and ENES-CMCC, Italy. The focus is on climate change, oceanography and marine forecasting, meteorology, hydrology and air quality, and solid Earth sciences.
• Astrophysics, HEP and Plasma Physics: Chair Andreas Schäfer, University of Regensburg, Germany. A compelling and recurring theme is that of theory and modelling providing fresh insight and hence contributing to the success of major experimental facilities – from space experiments such as the European Planck Surveyor satellite to those at large European centres like CERN, FAIR and ITER.
• Materials Science, Chemistry and Nanoscience: Chair Stefan Blügel, Research Centre Jülich, Germany. The key challenges highlight the crucial role that computer simulations now play in almost all aspects of the study of materials, not only in traditional materials science, physics and chemistry, but also in nanoscience, surface science, electrical engineering, Earth sciences, biology and drug design. The demands of environmental constraints – cleaner catalysis-based chemical processes, materials able to withstand extreme stress and environments, nanotechnologies, etc. – drive many of these developments.
• Life Sciences and Medicine: Chair Modesto Orozco, Institute for Research in Biomedicine, Spain. The focus is on the key challenges and scientific objectives in genomics, systems biology, molecular dynamics, biomolecular simulation and medicine.
• Engineering Sciences and Industrial Applications: Chair Philippe Ricoux, TOTAL, France. The key objectives and challenges present compelling exemplars of the sheer breadth and impact of simulation. These range from innovation in technology and design (complete helicopter simulation for next-generation rotorcraft, 'green' aircraft and the virtual power plant) to enhanced understanding and modelling of physical phenomena in, for example, optimising gas turbines (aero-engines or power generation) or internal combustion engines (in terms of costs, stability, higher combustion efficiency, reduced fuel consumption, near-zero pollutant emissions and low noise). The scientific and societal impacts are widespread, from disaster management (e.g. forest fires) to improvements in medical care (biomedical flows).
An outline of the key components of the Computational Infrastructure, and of the user support functions that must be provided to realise the full benefit of that infrastructure, is presented in section 7. The full membership of each panel is given in section 8. Furthermore, representatives from each PRACE Partner86 have been appointed as national contact points to disseminate the activity and to identify scientists in their country who could also contribute to the process.
We consider it essential that the updated Scientific Case has broad support from the scientific community. The PRACE partners have therefore spread the information widely and encouraged participation in the preparation of the document itself. Drafts have been made publicly available for comment on the PRACE website.
1.3 Progress to be Expected During the Petascale Era

By way of introduction to the domain areas, we first present a summary of the key scientific objectives and challenges from each of the scientific areas under consideration. We then present this summary in tabular form to impress on the reader the sheer scope of accomplishments promised through provision of a sustainable top-level infrastructure. Thus, Table 1.1 is organised to identify the key challenges from the five distinct areas listed above.
1.3.1 Weather, Climatology and solid Earth Sciences

Weather, Climatology and solid Earth Sciences (WCES) encompass a wide range of disciplines, from the study of the atmosphere, the oceans and the biosphere to issues related to the solid part of the planet. They are all part of Earth system sciences or geosciences. Earth system sciences address many important societal issues, from weather prediction, air quality, ocean prediction and climate change to natural hazards such as seismic and volcanic hazards, for all of which the development and use of high-performance computing plays a crucial role.
Research in the fields of weather, climatology and solid Earth sciences is of key importance for Europe for:
• Informing and supporting preparation of EU policy on environment and climate mitigation and adaptation
• Understanding the likely impact of the natural environment on EU infrastructure, economy and society
• Enabling informed EU investment decisions in ensuring sustainability within the EU and globally
• Developing civil protection capabilities to protect the citizens of the EU from natural disasters
• Supporting the joint EU and ESA initiative on Global Monitoring for Environment and Security
The challenges and outcomes in the WCES scientific domains to be addressed through petascale HPC provision are summarised below. More details are provided in Table 1.1; the associated societal benefits are fully developed in section 1.4.
Climate Change
Quantify uncertainties on the degree of warming and the likely impacts on nature and society. In particular this implies: (i) increasing the capability and complexity of 'whole Earth system' models that represent in ever-increasing realism and detail the scenarios for our future climate; (ii) performing process studies with ultra-high-resolution models of components of the Earth system (e.g. cloud-resolving models of the global atmosphere); (iii) running large member ensembles of these models.
Oceanography and Marine Forecasting
Build and efficiently operate the most accurate ocean models in order to assess and predict how the different components of the ocean (physical, biogeochemical, sea-‐ice) evolve and interact. Produce realistic reconstructions of the ocean's evolution in the recent past and accurate predictions of the ocean's future state over a broad range of time and space scales, to provide policy makers, environment agencies and the general public with relevant information and to develop applications and services for government and industry.
Meteorology, Hydrology and Air Quality
Predict weather and flood events with high socio-‐economic and environmental impact a few days in advance – with enough certainty and early warning to allow practical mitigation decisions to be
taken. Understand and predict the quality of air at the Earth's surface: the development of advanced real-time forecasting systems to allow early enough warning and practical mitigation in the event of a pollution crisis.
Solid Earth Sciences
Challenges span a wide range of disciplines and have significant scientific and social implications, playing today a central role in natural hazard mitigation (seismic, volcanic, tsunami and landslides), treaty verification for nuclear weapons, increased discovery of economically recoverable petroleum resources, and monitoring of waste disposal. Exascale-class computing capability will make it increasingly possible to address the issues of resolution, complexity, duration, confidence and certainty, and to resolve explicitly phenomena that were previously parameterised; it will also lead to operational applications in other European centres, national centres and industry.
1.3.2 Astrophysics, High-Energy Physics and Plasma Physics

In recent years, astrophysics, high-energy physics and plasma physics have shared a dramatic change in the role of theory for scientific discovery. In all of these fields, new experiments have become ever more costly, require increasingly long timescales and aim at the investigation of more and more subtle effects. Consequently, theory is faced with two types of demand. First, the precision of theoretical predictions has to be increased to the point that it is better than experimental precision; since the latter can be expected to improve by further orders of magnitude by 2020, this is a most demanding requirement. In all of these research fields, well-established theoretical methods have existed for many decades. Achieving dramatic progress therefore requires a dramatic increase in theoretical resources, including computer resources for numerical studies.
Second, the need to explore model spaces of much larger extent than previously investigated has also become apparent. For example, determining the nature of dark energy and dark matter requires a detailed comparison of predictions from large classes of cosmological models with data from the new satellites and ground-based detectors that will be deployed by 2020.
These predictions can be generated only by massive numerical simulations. In high-‐energy physics, one of the tasks is to explore many possible extensions of the Standard Model to such a degree that even minute deviations between experimental data and Standard Model predictions can serve as smoking guns for a specific realisation of New Physics. In plasma physics, one of the tasks is to understand the physics observed at ITER at such a high level that substantially more efficient fusion reactors can be reliably designed based on theoretical simulations which explore a large range of options.
While the three fields covered in this section are distinctly different, they also have substantial overlap. For example, the Big Bang is as much a topic of astrophysics as of high-energy physics, while nucleosynthesis depends on nuclear physics as well as on the modelling of supernova explosions. Plasma physics is crucial for many aspects of astrophysics as well as, for example, for a better understanding of high-energy heavy-ion collisions at CERN.
As the experimental roadmap to 2020 is already fixed in all three research fields, it is possible to quantify with some reliability what these demands imply for HPC in Europe. If one requires that theory keeps up with the experimental progress, which is crucial to maximise the scientific output of the latter, these three fields together will require at least one integrated sustained Eflop/s-year. This corresponds to a dedicated compute power of roughly 1 Eflop/s peak for roughly one decade.
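A rough, illustrative check of the arithmetic behind this estimate is sketched below; the 10% sustained-to-peak efficiency is our assumption for illustration, not a figure supplied by the panels:

    # Back-of-envelope check of the Eflop/s-year estimate (illustrative only)
    peak_power = 1.0            # assumed dedicated peak capability, in Eflop/s
    sustained_fraction = 0.10   # assumed sustained/peak efficiency (~10%)
    duration_years = 10         # roughly one decade of dedicated use

    integrated = peak_power * sustained_fraction * duration_years
    print(integrated)           # -> 1.0 integrated sustained Eflop/s-year

Under these assumptions, a machine of 1 Eflop/s peak dedicated for a decade delivers the one sustained Eflop/s-year quoted above.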
1.3.3 Materials Science, Chemistry and Nanoscience

We all experience in many aspects of life the increasing use of diverse materials-science-based technologies, from ingestible radio transmitters and fluorescent quantum dots for medical diagnosis and treatment to modern multifunctional cell phones that take photographs, receive and transmit
electronic mail and connect us ad hoc to digital information available via the Internet, the latter made possible by virtue of nanotechnology-enabled electronics. The objective of computational materials science, chemistry and nanoscience is to create novel materials or chemical agents ranging from the level of a single atom up to macroscopic measures, with effects ranging from electronic reaction times in the femtosecond range up to the geological periods that enter materials formation. Computational materials science is thus part of a processing and manufacturing activity that will ultimately be deployed in technologies that affect our society today and determine our options to shape our future.
Computational materials science, chemistry and nanoscience is concerned with the complex interplay of the myriads of electrons and atoms in a solid or a liquid, thereby producing a continuous stream of new and unexpected phenomena, processes and forms of matter. The science deals with the complexity of the quantum dimension and the complexity introduced by the large configuration space of the many particles that interact and compete at a large range of length, time, energy and entropy scales. It focuses on the conceptualisation, development, implementation and application of analytical models and novel, technically very complex and computationally very demanding computer-‐based classical and quantum mechanical methods. The goals are to analyse and to interpret experimental characterisation, to assist the design and optimisation of routes for materials synthesis, processing and manufacturing – ranging from chemical reactions for growth to long-‐term annealing and recovery routes of materials. The aim is to facilitate the discovery and design of materials with new functionalities and desired properties, and to provide the methods and computational tools for neighbouring fields such as life sciences and medicine, and engineering sciences and industrial applications that will result, for example, in new drugs or more efficient solar cells.
The advance from petascale to exascale computing is a driving force for improving the robustness and predictability of the computational models, for instance by extending and improving the degree of quantum mechanics that is included in the models. One example is the field of strongly correlated electron materials exhibiting a wealth of exciting properties such as the absence of freezing down to the lowest of temperatures, high-temperature superconductivity, colossal magneto-resistance, orbitaltronics, or the simultaneous presence of magnetism and ferroelectricity. Many of these phenomena are investigated today by combining dynamical mean field theory with density functional theory, but the analysis of one system in a petascale environment can easily take up to two years. Capability computing in an exascale environment is necessary to address grand challenges in computational nanoscience. It not only enables the investigation of such phenomena and materials by larger computational models to improve the robustness of the predictions, but also allows the scanning of parameter spaces as a function of temperature, pressure, external fields and stimuli for a large number of systems. More memory per computer node permits the application of new algorithms with finer energy resolution.
Electron excitations and dynamics and non-adiabatic molecular or spin dynamics are, for example, responsible for the description of photosynthesis, photovoltaics and chemical reactions, for processes relevant to the ultrafast writing and reading of information, and ultimately even for the van der Waals interaction. This is another area that benefits greatly from an exascale facility providing the required computational resources – computing power and memory – for simulations relevant to the fields of information technology, chemistry and the life sciences, as well as energy management.
Finally, on the road from a fundamental principle to synthesis and growth and then to functionality in a device or a biological environment that is integrated in a certain technology, materials are part of a heterogeneous system driven by simulation. The latter involves the concurrent coupling between atomic-‐scale and macro-‐scale dynamics through a multiscale, multidisciplinary approach that stretches from fundamental science to chemical and process engineering. Exascale computing will provide the resources to change in part the paradigm of computational materials science, evolving from explanation and analysis to discovery and eventually to control of materials properties and
processes at the quantum scale. It will become the driving engine for a design continuum from fundamental discovery, through systems engineering and processing, to technology that accelerates the availability of new devices with new functionalities. In addition to promoting the role of computational materials science in guiding the field with respect to experimental and theoretical approaches, exascale computing is also needed to perform the atomistic-level simulations that provide the fundamental data for the coarse-grained methods used in multiscale simulation. The exascale infrastructure offers high throughput through massive capacity computing. It facilitates the link between materials science and materials informatics, enabling new discoveries through combinatorial search among a vast number of alternatives in a feedback loop with processing and manufacturing.
In conclusion, there is no doubt that the materials science, chemistry and nanoscience community in Europe requires a large allocation of computing resources, exceeding 1 Eflop/s in aggregate. There is considerable demand for Tier-0 capability computing for dynamical mean-field theory, (ab-initio) molecular dynamics, and multiscale and device simulation. Obviously, a European exascale environment must take into account that this field also requires capacity computing to search the immense phase space of opportunities. A heterogeneous infrastructure therefore best serves this field. We re-emphasise that a critical requirement of this community is the optimal and efficient use of massively parallel supercomputers for this very broad range of complex problems in soft matter. The 'know-how' surrounding suitable parallelisation strategies needs to be strengthened and expanded.
1.3.4 Life Sciences and Medicine

The benefits of the continuous development of more powerful computation systems are visible in many areas of life sciences. For example, at the beginning of 2000, the Human Genome Project20 was an international flagship project that took several months of CPU time using a hundred-Gflop/s computer with 1 terabyte of secondary data storage.
Today, genomic sequencing has changed from being a scientific milestone to a powerful tool for the treatment of diseases, in particular because it is able to deliver results in days, while the patients are still under treatment. As an example, the Beijing Genomics Institute is capable of sequencing more than 100 human genomes a week using Next Generation Sequencing instruments and a 100 Tflop/s computer that will migrate in the near future to a 1 Pflop/s capability.21 Today, genome sequencing technology is ineffective if the data analysis needs to be carried out on a grid or cloud-like distributed computing platform. First, such systems cannot achieve the necessary dataflow, of the order of 10–100 petabytes/year;22 second, research involving living patients requires both speed and high security; last, but not least, ethical and confidentiality issues hinder the distribution of patient data across public clouds. In coming years, sequencing instrument vendors expect to decrease costs by one to two orders of magnitude, with the objective of sequencing a human genome for $1,000. This will make it possible to integrate genomic data into clinical trials (which typically involve thousands of human tests) and into the health systems of European countries, making drug development easier and faster and having a dramatic impact on the development of personalised therapies.
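A back-of-envelope estimate, written as a short Python sketch, illustrates the scale of this dataflow; the per-genome data volume below is our assumption for illustration only:

    # Illustrative sequencing-dataflow estimate (sizes are assumptions)
    gbytes_per_genome = 100      # assumed raw data per sequenced human genome
    genomes_per_week = 100       # throughput quoted for a large centre
    weeks_per_year = 52

    pbytes_per_year = gbytes_per_genome * genomes_per_week * weeks_per_year / 1e6
    print(f"{pbytes_per_year:.2f} PBytes/year")   # ~0.5 PBytes/year per centre

One such centre already generates of the order of half a petabyte per year; population-scale projects sequencing 10^5–10^6 genomes per year would reach the 10–100 petabytes/year quoted above.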
We should not forget, however, that all these possibilities will develop only if computer resources can deal with the complexity of the large interconnected data sets that serve the large life science community. For example, the EBI (which hosts Europe's major core bio-resources) nearly doubled its storage from 6,000 TBytes (in 2009) to 11,000 TBytes (in 2010), and has received an average of 4.6 million requests per day (see Figure 1.1).
20 International Human Genome Sequencing Consortium. Nature 2001
21 http://www.genomics.cn/en/platform.php?id=248
22 The byte is a unit of digital information that most commonly consists of eight bits; terabyte, 1 TByte = 10^12 bytes; petabyte, 1 PByte = 10^15 bytes; exabyte, 1 EByte = 10^18 bytes; zettabyte, 1 ZByte = 10^21 bytes
Figure 1.1. The exponential growth of data storage in the EBI (TBytes). Figure reproduced from the EBI annual report.23
There are also many other steps along the drug discovery pipeline that will benefit from advances in supercomputing. For example, the identification of potential drug candidates for identified disease targets will be fuelled by next-generation supercomputers. Most lead discovery projects currently involve high-throughput screening (HTS) instruments that can scan 100,000 molecules per day looking for those showing activity against the target. The cost of this technique is very high and the typical success rate is very low. In contrast, virtual screening is a computational technique that can scan the ability of a therapeutic target to recognise molecules from a virtual library, extending the chemical search space and dramatically reducing the costs of drug discovery. Current virtual libraries can contain one billion drug-like compounds24 and they are expected to grow still larger.25 Only multi-petascale supercomputers are capable of scanning such chemical spaces while simultaneously treating a large number of potential targets. The improvement of drugs during the process of 'lead optimisation' relies largely on structure-based drug design procedures, requiring very large computer resources when thousands of potential modifications of the lead need to be analysed.
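The scale of the virtual-screening task can be sketched with equally rough numbers; the per-compound docking cost and the size of the target panel below are our assumptions for illustration:

    # Illustrative virtual-screening cost estimate (all inputs are assumptions)
    library_size = 1e9           # drug-like compounds in a current virtual library
    cpu_seconds_per_dock = 10    # assumed cost of one docking calculation
    n_targets = 100              # assumed panel of therapeutic targets

    cpu_hours = library_size * cpu_seconds_per_dock * n_targets / 3600
    print(f"{cpu_hours:.2e} CPU hours")   # ~2.8e8 CPU hours

Roughly 3 x 10^8 CPU hours corresponds to months of wall time even on a system with some 10^5 cores, which is why multi-petascale resources are needed when many targets are screened simultaneously.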
Finally, we are fast approaching an information-‐rich scenario, where we will have detailed structural information on biomacromolecules, complete information on DNA, RNA and protein expression in different cellular situations, complete metabolomic data and accurate imaging of sub-‐cellular structures, complete cells and tissues. In the near future, we will need to integrate all this data into predictive mathematical models that will be able to represent not only individual macromolecules but entire cells and even organs. Flagship efforts, such as the Human Brain Project, which targets simulating the behaviour of a human brain, will open new views in the medical field. The computational cost of these multiscale simulations, ranging from macromolecules to entire organs, is still far beyond current computational resources.
The priorities set out by the life sciences panel include new techniques for: (i) data management and large storage systems (increase of shared memory capacity), (ii) interactive supercomputing, (iii) data analysis and visualisation, (iv) multi-level simulation, and (v) training. Because life sciences and health is such a heterogeneous field, it will be necessary to have several application-oriented initiatives developed in parallel, although they can share similar agendas. A flexible access protocol to Tier-0 resources will be as important as absolute computer power for this community.

23 http://www.ebi.ac.uk/Information/Brochures/pdf/Annual_Report_2010_hi_res.pdf
24 Reymond et al. (2009) J. Am. Chem. Soc.
25 Bohacek et al. (1996) Medical Research Reviews
1.3.5 Engineering Sciences
The engineering sciences domains considered in this overview include turbulence, combustion, aeroacoustics, biomedical flows, and general process technologies and chemical engineering. The challenges and outcomes in the engineering scientific domains to be addressed through petascale HPC provision are summarised below, with more detail provided in Table 1.1; the associated societal benefits are fully developed in section 1.4.
Turbulence
Turbulence is characterised by many degrees of freedom, measured by the Reynolds number, which implies large computational grids. Present simulations have Reynolds numbers of a few thousand, involve 10^10 grid points, and run over hundreds of millions of CPU hours on O(10^5) processors. Direct Numerical Simulations (DNS), which centred on simple turbulent channels five years ago, have turned to jets and boundary layers, and the trend towards 'useful' flows is likely to continue. The Reynolds numbers have increased by a factor of roughly five, implying a work increase of three orders of magnitude.
However, this can only be considered an intermediate stage in turbulence research. There is a tentative consensus that a 'breakthrough' boundary layer free of viscous effects requires Reynolds numbers of the order of Reτ = 10,000 – five times higher than present simulations. That implies computer times 1,000 times longer than at present (Re^4), and storage capacities 150 times larger (Re^3). Turbulence research requires the storage and sharing of large data sets, becoming O(20 PBytes) per case within the next 5–10 years. The rewards in the form of more accurate models, increased physical understanding and better design strategies will grow apace.
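The quoted factors follow directly from the Re^4 and Re^3 scaling laws; a minimal sketch, assuming present simulations at Reτ ≈ 2,000:

    # Scaling of DNS cost with friction Reynolds number (illustrative)
    re_now = 2_000       # assumed present-day Re_tau
    re_target = 10_000   # 'breakthrough' boundary layer quoted above

    ratio = re_target / re_now    # factor of 5
    print(ratio ** 4)             # 625: compute-time growth (Re^4), order 1,000
    print(ratio ** 3)             # 125: storage growth (Re^3), order 150

The exact prefactors depend on the flow and the numerical method, but the steep quartic growth of compute time with Reynolds number is what drives the demand for exascale resources.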
Combustion
Scientific challenges in combustion are numerous. First, a large range of physical scales must be considered, from fast chemical reaction characteristics and pressure wave propagation up to burner or system scales. Turbulent flows are, by nature, strongly unsteady. Handling chemistry and pollutant emissions in numerical simulations requires adapted models, the treatment of fuels demands that two-phase flows are taken into account, while solid particles such as soot may also be encountered. Interactions between flow hydrodynamics, acoustics and combustion may induce strong combustion instabilities or cycle-to-cycle variations, decreasing burner performance and, in extreme cases, leading to destruction of the system. The design of cooling systems requires knowledge of the heat transfer to walls due to conduction, convection and radiation, as well as of flame/material interactions.
Aeroacoustics
In the development of new aircraft, engines, high-speed trains, wind turbines and so forth, the prediction of the flow-generated acoustic field becomes vital as society expects a quieter environment and noise regulations become stricter. The future of noise prediction, and one day even noise-oriented design, belongs to unsteady three-dimensional numerical simulation from first principles, but the contribution of such methods to industrial activities in aerospace seems to be years away. Certification often depends on a fraction of a dB, whereas at present predicting noise to within, say, 2 dB without adjustable parameters is decidedly impressive.
The state of the art is limited to simplified components or geometries that can be tackled using manually generated structured meshes, in contrast to the actually installed systems that need to be simulated. Massively parallel machines in the Eflop/s range and higher are essential for solving aeroacoustics problems not only at a generic but also at an industrial scale.
Biomedical Flows
Surgical treatment in human medicine can be optimised using virtual environments, where surgeons perform pre-surgical interventions to explore best-practice methods for the individual patient. The treatment of the pathology is supported by analysing the flow field, for example optimising nasal cavity flows or understanding the development of aneurysms. The computational requirements for such flow problems have constantly increased over recent years and have reached the limits of petascale computing. It is vital to understand fully the details of the flow physics in order to draw conclusions about medical pathologies and to propose, for instance, shape optimisations for surgical interventions. Such an in-depth analysis can be obtained only by a higher resolution of the flow field, which in turn increases the overall problem size.
As an example of the demands of biomedical flow requirements, computations that have to be performed for the nasal cavity problem under high-frequency conditions involve Reynolds numbers in the range of Re ≈ 15,000. Tackling problems where the entire fluid and structural mechanics of the respiratory system is simulated will demand the next generation of exascale computers.
General Process Technologies, Chemical Engineering
Chemical engineering and process technology are traditional users of HPC for dimensioning and optimising reactors in the design stage. Computational techniques are also used for improving the operation of processes, for example through model predictive optimal control, or through inverse modelling for estimating system parameters. The computational models used in chemical engineering span a wide range of scales. On the microscopic level, chemical reactions may be represented by molecular dynamics techniques, while on the mesoscopic level, flows through pores or around an individual particle may be of interest. The macroscopic scale eventually considers the operation including heat and mass transfer in a full industrial-‐scale reactor or even the operation of a full facility.
Exascale systems will permit a better understanding of highly dispersed phenomena or of very large up- (or down-) scaling problems, such as aggregate formation and growth, through the development of much improved particle simulation technologies, for example for describing multiscale interactions between fluids and structures, fluid-solid suspensions, interfaces and multi-physics coupling.
1.3.6 Industrial Applications
A variety of industrial applications are considered in this overview, including aeronautics, turbo machines and propulsion, energy, automotive, oil and gas, and other applications. Requirements are outlined below and presented in greater detail in Table 1.1.
Aeronautics
Aircraft companies are now heavily engaged in trying to solve problems such as calculating maximum lift using HPC resources. This problem has an insatiable appetite for computing power and, if solved, would enable companies designing civilian and military aircraft to produce lighter, more fuel-efficient and environmentally friendlier planes. The challenges of future aircraft transportation ('Greening the Aircraft') demand the ability to flight-test a virtual aircraft with all its multidisciplinary interactions in a computer environment, and to compile all of the data required for development and certification – with guaranteed accuracy – in a reduced time frame. For these challenges, a complete digital aircraft will require systems at the Zflop/s scale and beyond.
Turbo Machines, Propulsion
Numerical simulation and optimisation are pervasive in the aeronautics industry, and notably in the design of propulsion engines. The main targets are substantial reductions in specific fuel consumption and environmental nuisance – in particular, greenhouse gases, pollutant emissions and noise – as put forward by bodies such as ACARE and IATA. On the engine side, these ambitious
goals are pursued by increasing propulsive and thermodynamic efficiency, reducing weight and finally controlling sources of noise. The development of disruptive propulsive technology is needed, relying even more heavily on numerical tools to overcome the lack of design experience. There are two major HPC-‐related challenges – the use of high-‐fidelity numerical tools leading to a more direct representation of turbulence and the evolution of optimisation strategies.
Energy
Objectives include improved safety and efficiency of facilities (especially nuclear plants), plus optimisation of their maintenance, operation and life span. This is one area in which physical experimentation – for example with nuclear plants – can be both impractical and unsafe. Computer simulation, in both the design and operational stages, is therefore indispensable.
Considering thermal hydraulic CFD applications, the improvement of efficiency typically involves mainly steady CFD calculations on complex geometries, while the improvement and verification of safety may involve long transient calculations on slightly less complex geometries. As safety studies increasingly require assessment of CFD code uncertainty, sensitivity to boundary conditions and resolution options must be studied, but turbulence models may still induce a bias in the solution. Doing away with turbulence models and running DNS-type calculations, at least for a set of reference cases, would be a desirable way of removing this bias. Such studies will require sustained access to multi-Eflop/s capacities over several weeks.
Neutronics applications
These include the capability to model very complex, possibly coupled phenomena over extended spatial and temporal scales. In addition, uncertainty quantification and data assimilation are considered key to industrial acceptance, so their associated computational needs, which depend on the complexity of the model considered, must also be met.
Automotive
The automotive industry is actively pursuing important goals that need Eflop/s computing capability or greater. These include (i) vehicles that will operate for 250,000 kilometres on average without the need for repairs, (ii) full-body crash analysis that includes simulation of soft tissue damage, and (iii) longer-lasting batteries for electrically powered and hybrid vehicles.
For both aerodynamics and combustion, at least LES, and if possible DNS, simulations are required on an industrial scale and Eflop/s applications must be developed at the right scale.
Crash: Most computations are currently done in parallel (8–64 cores), and scalability tests have shown that up to 1,024 cores may be reasonable for 10 million finite elements. It is likely that future simulations will require model sizes for a full car ranging from 1.5 to 10 billion finite elements, demanding the development of new codes (mainly open source) for Eflop/s systems.
Oil and Gas
The petroleum industry is driven to increase the efficiency of its processes, especially in exploration and production, and to reduce risks through the deployment of HPC. Typical steps in the business process are: geoscience for the identification of oil and gas underground; development of reservoir models; design of facilities for the production of hydrocarbons; drilling of wells and construction of plant facilities; operations during the life of the fields; and eventually decommissioning of facilities at the end of production.
Other Industrial Applications
Banks and insurance companies are increasingly using HPC, mostly for embarrassingly parallel Monte-Carlo solutions of stochastic ODEs; but high-frequency trading will inevitably require better models and faster calculation. They also face the challenge of interconnecting supercomputers and several private clouds.
In common with many other industries mentioned in this report, they are faced with the 'big data' problem, in the sense that massive market data are available (e.g. from Reuters) and current calibration algorithms cannot exploit such large inputs. Note that 41 machines are characterised as 'finance' in the November 2011 Top 500 list.
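As a flavour of this workload, the sketch below shows a minimal embarrassingly parallel Monte-Carlo solution of one common stochastic ODE (geometric Brownian motion, integrated with the Euler-Maruyama scheme); all parameter values are placeholders:

    import numpy as np

    # Monte-Carlo paths for dS = mu*S dt + sigma*S dW (illustrative)
    rng = np.random.default_rng(0)
    mu, sigma, s0 = 0.05, 0.2, 100.0      # placeholder drift, volatility, spot
    n_paths, n_steps = 100_000, 252
    dt = 1.0 / n_steps

    s = np.full(n_paths, s0)
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)   # independent increments
        s += mu * s * dt + sigma * s * dw            # each path is independent

    print(s.mean())   # Monte-Carlo estimate of the expected value E[S(T)]

Because the paths are independent, the work scales trivially across cores or nodes, which is what makes such calculations 'embarrassingly parallel'.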
Pharmaceutical industries, firmly established in Europe, already use ab-initio and molecular simulation in their domains, and will increase R&D efforts in this field, for example in drug design (GSK, Sanofi) or biomedical applications (L'Oréal) (see sections 4 and 5 of this report).
1.3.7 A Schematic Overview of the Scientific Roadmap

To complete our introduction to the domain areas, we present the key scientific objectives and challenges in tabular form to impress on the reader the sheer scope of accomplishments promised through provision of a sustainable top-level infrastructure.
Table 1.1 is organised to identify the key challenges from the five distinct areas listed above.
Table 1.1. The challenges and outcomes in science and engineering to be addressed through petascale HPC provision.
Application / Science Challenges and Potential Outcomes
Weather, Climatology and Solid Earth Sciences
Climate Change
In the last decade, our understanding of climate change has increased, as has the societal need for pull-through to advice and policy. However, while there is great confidence in the fact that climate change is happening, there remain uncertainties. In particular, there is uncertainty about the levels of greenhouse gases and aerosols likely to be emitted and, perhaps more significantly, there are uncertainties about the degree of warming and the likely impacts. These latter uncertainties can only be reduced by increasing the capability and complexity of 'whole Earth system' models that represent in ever-increasing realism and detail the scenarios for our future climate. A further challenge is to provide more robust predictions of regional climate change at the decadal, multi-decadal and centennial timescales to underpin local adaptation policies. In many regions of the world, there is still considerable uncertainty in model predictions of the local consequences of climate change. Model resolution plays a key role. A dual-track approach should be taken, involving multi-model comparisons at the current leading-edge model resolution (about 20 km), alongside the longer-term aim of developing a global convection-resolving model (about 1 km). Reducing these uncertainties in climate projections requires a coordinated set of experiments and multi-year access to a stable HPC platform. Issues relating to mass data storage, and the dissemination of model outputs for analysis to a wide-ranging community of scientists over a long period, will need to be resolved. A multi-group programmatic approach could allow a set of European model inter-comparisons focused on a number of priority climate science questions.
Oceanography and Marine Forecasting
The ocean is a fundamental component of the Earth system. Improving understanding of ocean circulation and biogeochemistry is critical to assess properly climate variability and future climate change and related impacts on, for example, ocean acidification, coastal sea level, marine life and polar sea-‐ice cover. The ocean greatly influences the climate system at shorter timescales, i.e. for weather forecasting and seasonal climate prediction, but its influence grows as timescales increase. Beyond climate, ocean scientists are being called on to help assess and maintain the wealth of services that the ocean provides to society. Human activities, including supply of food and energy, transport of goods, etc., exert an ever-‐increasing stress on the open and coastal oceans. These stressors must be evaluated
and regulated in order to preserve the ocean's integrity and resources. Society must also protect against marine natural hazards. Marine safety concerns are becoming more acute as the coastal population and maritime activities continue to grow. For all these concerns, there is a fundamental need to build and efficiently operate the most accurate ocean models in order to assess and predict how the different components of the ocean (physical, biogeochemical, sea-‐ice) evolve and interact. The main perspective is to produce realistic reconstructions of the ocean's evolution in the recent past (e.g. reanalyses) and accurate predictions of the ocean's future state over a broad range of time and space scales, to provide policy makers and the general public with relevant information, and to develop applications and services for government and industry.
Meteorology, Hydrology and Air Quality
Weather and flood events with high socio-economic and environmental impact may be infrequent, but the consequences of occurrence can be catastrophic to those societies and Earth systems that are affected. There is, of course, a link to climate prediction and climate change impacts, if severe meteorological and hydrological events are to become more frequent and/or more extreme. Predicting these low-frequency, high-impact events a few days in advance – with enough certainty and early warning to allow practical mitigation decisions to be taken – remains difficult. Understanding and predicting the quality of air at the Earth's surface is an applied scientific area of increasing relevance. Poor air quality can cause major environmental and health problems affecting both industrialised and developing countries around the world (e.g. adverse effects on flora and fauna, and respiratory diseases, especially in sensitive people). Advanced real-time forecasting systems are basic and necessary tools for providing early warning advice to populations and practical mitigation strategies in the event of an air pollution crisis.
Solid Earth Sciences
Computational challenges in solid Earth sciences span a wide range of scales and disciplines, and address fundamental problems in understanding the Earth system – its evolution and structure – and its near-surface environment. Solid Earth sciences have significant scientific and social implications, playing today a central role in natural hazard mitigation (seismic, volcanic, tsunami and landslides), hydrocarbon and energy resource exploration, containment of underground wastes and carbon sequestration, and national security (nuclear test monitoring and treaty verification). In the realm of seismic hazard mitigation alone, it is well to recall that, despite continuous progress in building codes, one critical remaining step is the ability to forecast the earthquake ground motion to which a structure will be exposed during its lifetime. Until such forecasting can be made reliably, complete success in the design process will not be achieved. All these areas of expertise require increased computing capability in order to deliver breakthrough science. A programme of provision of leadership-class computational resources will make it increasingly possible to address the issues of resolution, complexity, duration, confidence and certainty, and to resolve explicitly phenomena that were previously parameterised. Each of the challenges represents an increase by a factor of at least 100 over the individual national facilities currently available. A large number of the numerical models and capability-demanding simulations described below will lead to operational applications in other European centres, national centres and industry.
Astrophysics, HEP and Plasma Physics
Astrophysics
Understanding our place in time and space has been central to human scientific and sociological advancement throughout the past four centuries. An appropriate mix of computing infrastructure, software development and observational facilities will allow significant progress in the next decade in answering a number of outstanding fundamental questions in astrophysics and cosmology:
• What is the identity of the cosmic dark matter and dark energy?
• How did the universe emerge from the dark ages following the Big Bang?
• How did galaxies form?
• How do galaxies and quasars evolve chemically and dynamically, and what is the cause of their diverse phenomenology?
• How does the chemical enrichment of the universe take place?
• How do stars form? How do stars die?
• How do planets form?
• Where is life outside the Earth?
• How are magnetic fields in the universe generated, and what role do they play in particle acceleration and other plasma processes?
• How can we unravel the secrets of the sources of strongest gravity?
• What will as yet unexplored windows into the universe, such as neutrinos and gravitational waves, reveal?
Answering these questions requires accurate numerical treatment of a range of coupled complex non-‐linear physical processes including gravitation, hydrodynamics, non-‐equilibrium gas chemistry, magnetic fields, radiative transfer and relativistic effects. Although the equations that describe these physical processes are well known, solutions are attainable only by numerical simulation. Exascale resources together with efficient algorithms are essential for this task.
Elementary Particle Physics
While particle physics has been very successful in recent decades in unravelling the fundamental laws governing our world, it is certain that very important aspects are not yet understood. In particular, it is clear that physics has to change dramatically at much larger energy scales than those explored so far. In view of the enormous costs of accelerators with much higher energy than the existing and planned ones – in Europe in particular the LHC and FAIR – the missing pieces of the puzzle must primarily be searched for at the precision frontier. This requires very high luminosity accelerators on the one hand, and very precise theoretical calculations and the systematic exploration of possible scenarios for Beyond-the-Standard-Model (BSM) physics on the other. The latter two aspects depend crucially on input from lattice field theory. Within the Standard Model, the theoretical uncertainties are dominated by non-perturbative aspects of QCD (quantum chromodynamics), many of which can be explored with lattice QCD (LQCD). Some of the most relevant topics are:
• The QCD phase diagram, i.e. QCD at finite temperature and baryon density, which is also very relevant for astrophysics
• The three-‐dimensional quantum structure of hadrons as encoded in precisely defined QCD quantities like Generalised Parton Distributions (GPDs), distribution amplitudes (DAs), transverse momentum-‐dependent distribution functions (TMDs), Bag parameters, etc.
• The quark masses, which are some of the fundamental parameters of the Standard Model
A crucial advantage of computer simulations is that masses, coupling constants and other aspects of the Standard Model can be varied at will, which allows accidental correlations to be distinguished from fundamental ones and thus clarifies many aspects of the physics. The latter, namely the possibility to explore theoretical model spaces, is also a most important aspect of the search for BSM physics: without such numerical simulation of non-perturbative aspects it is not possible to know where best to search for signals of the new physics.
Plasma Physics
Some of the most demanding scientific and computational grand challenges in plasma physics are closely tied to the development of plasma-confinement devices for fusion energy research and to recent developments in ultra-intense laser technology, which open the possibility of exploring astrophysical scenarios with a fidelity that was previously not accessible due to limited computing power. The main scientific challenges are in:
• Plasma accelerators (either laser or beam driven) and possible advanced radiation sources based on these, which have promising applications in bio-‐imaging and medical therapy
• Magnetic confinement fusion devices and in particular ITER, the international fusion experiment
• Inertial fusion energy and advanced concepts with ultra-‐intense lasers, which aim to demonstrate nuclear fusion ignition in the laboratory
• Collisionless shocks in plasma astrophysics, associated with extreme events such as gamma ray bursters, pulsars and AGNs
• Solar physics addressing the combined Sun and Earth magneto-‐plasma system by means of observations and simulations
These are topics of relevance not only from a fundamental perspective but also in terms of potential direct economic benefits. Research in plasma accelerators is exploring the route to a new generation of more compact and cheaper particle and light sources. The Magnetic Confinement Fusion (MCF) and Inertial Confinement Fusion (ICF) approaches to nuclear fusion are critical for sustainable energy production – a driving force for economic growth.
Materials Science, Chemistry and Nanoscience
Materials Informatics
Materials informatics offers a pathway to cope with the challenge of developing new materials technologies at a faster rate, in a more cost-effective way, and in closer alignment with the product development cycle than was previously possible. Combining quantum methods for computing the stability of materials with information techniques such as data mining or data analytics offers a novel approach to materials design. Due to the increased availability of computational resources, it is possible to run many thousands of potential material calculations and generate notable 'theoretical databases'. Databases of derived materials, with calculated physical and engineering properties, will no doubt be an increasingly important tool for researchers and engineers working in fields related to materials development.
Multiscale Modelling
A pressing research challenge involves the integration of the various length and time scales relevant for materials science. Multiscale materials simulation is currently a high-‐impact field of research, where much effort is focused towards more seamless integration of the length and time scales, from electronic structure calculations, atomistic and molecular dynamics, kinetic and statistical modelling to the continuum. Together with new and emerging techniques, the provision of increased computational power can yield answers to versatile and complex questions central to materials manufacture, non-‐equilibrium processing – growth, processing, patterning using electron or ion beams, or plasma sources – properties, performance and technological applications. A sufficiently detailed and realistic computational modelling and understanding of these highly complex physical and technological processes can be achieved only by large-‐scale computer simulations combining a range of simulations that cover different length and time scales.
Soft Matter Systems – Structure and Flow
A sufficiently detailed theoretical modelling and understanding of highly complex soft matter systems is possible only through large-scale computer simulations. This is a huge challenge, as the relevant structures span many orders of magnitude in length scale. The timescale challenges are even greater. Thus, exceptional amounts of computer time are needed to
simulate soft matter and soft materials in thermal equilibrium. Describing the behaviour of soft matter under flow is even more challenging. To address these challenges, mesoscale hydrodynamics simulation techniques, such as Lattice-Boltzmann, Dissipative Particle Dynamics and Multi-Particle Collision Dynamics, have been developed in recent years; these allow the investigation of many interesting and important issues.
Photo-‐chemistry
Sunlight is the predominant energy source on Earth and a key factor in photosynthesis and photovoltaics. The in-depth understanding of the nature of electronic excited states in biological or other complex systems is unquestionably one of the key subjects in present-day chemical and physical sciences, and there are wide-ranging technological applications of these processes. It is a challenge to simulate realistic photo-activated processes of interest in biology and materials science. These phenomena usually involve non-adiabatic transitions among the electronic states of the system induced by the coupled motion of electronic and nuclear degrees of freedom. Consequently, their simulation requires both accurate ab-initio calculations of the (many) electronic states of the system and of the couplings among them, and the non-adiabatic time evolution of its components.
Nanoscience: Quantum Device Simulation Ab-‐Initio
Our understanding of self-assembly, programmed materials, and complex nanosystems and their corresponding architectures is still in its infancy, as is our ability to support design and nano-manufacturing. The advance towards faster and less energy-consuming information processing, and the development of new generations of processors, requires the shrinking of devices, which demands a more detailed understanding of nano-electronics. As semiconductor devices get smaller, it becomes more difficult to design or predict their operation using existing techniques. Given this reduction in size, the next generation of supercomputers will enable us to perform simulations of whole practical nanoscale devices, based on electronic structure and transport theory, and to develop guidelines for designing new devices that incorporate the quantum effects that control nano-level phenomena.
Life Sciences and Medicine
Genomics
The fast evolution of genomics is fuelling the future of personalised medicine. Genetic variability affects how drugs react with each patient, sometimes in a positive manner (increasing the healing effect), sometimes in a negative manner (increasing toxic side effects) or simply by reducing drug response. Personalised medicine is a concept that will replace the outdated idea that a single drug is the solution for an entire population. Thanks to recent advances in high-‐throughput genome sequencing, we can already access the full genomic profile of a patient in a single day, and the throughput of next generation sequencing techniques is increasing much faster than Moore’s law.
Currently, sequencing centres require multi-petabyte systems to store patient data, and data processing is carried out on supercomputers in the 100 Tflop/s to 1 Pflop/s range. Requirements are expected to increase dramatically as sequencing projects are extended to entire populations, making linkage studies possible; however, for most genomics challenges, an Eflop/s computer that is ‘unbalanced’ in its ratio of compute nodes to I/O and memory capacity would constitute a substantial barrier to efficient utilisation.
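To give a feel for the data volumes behind these requirements, the following back-of-envelope sketch estimates raw storage for population-scale sequencing. The genome size, coverage depth and bytes-per-base values are illustrative assumptions, not figures taken from this report.

```python
# Illustrative storage estimate for population-scale sequencing.
# All constants below are assumptions for the sketch, not report figures.

GENOME_BASES = 3.2e9    # assumed size of a human genome, in bases
COVERAGE = 30           # assumed sequencing depth (30x is a common choice)
BYTES_PER_BASE = 1.0    # assumed stored bytes per sequenced base

def raw_storage_petabytes(n_patients: int) -> float:
    """Raw sequence data volume in petabytes for n_patients genomes."""
    bytes_total = n_patients * GENOME_BASES * COVERAGE * BYTES_PER_BASE
    return bytes_total / 1e15

for n in (1_000, 100_000, 10_000_000):   # cohort, biobank, population scale
    print(f"{n:>10,} genomes -> {raw_storage_petabytes(n):8.1f} PB")
```

Even under these rough assumptions, moving from thousand-patient cohorts to population-scale projects takes storage from the multi-petabyte regime described above towards the exabyte scale, which is why data handling dominates the genomics requirement.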
Systems Biology
The perturbation of biological networks is a major underlying cause of adverse drug effects. Intense research is being carried out today to develop models for identifying protein network pathways that will help us to understand the undesired effects of drugs and explore how they are related to network connectivity. Detailed knowledge of the structure and dynamics of biological networks will undoubtedly uncover new pharmacological targets. The use of complex network medicine is expected to have a dramatic impact on therapy in several areas: the discovery of alternative targets; reducing toxicity risks associated with drugs; opening new therapeutic strategies based on the use of ‘dirty’ drugs targeting different proteins; helping to discover new uses for existing drugs.
Systems biology is now at the stage of collecting data to build models for complex simulations that will, in the near future, describe the presently unknown dynamics of cells and organs. Progress is rapid, and systems biology will allow us to couple the simulations of these models with biomedical problems. This will require large computational resources, and systems biology will benefit from Eflop/s capabilities, but aspects related to data management are going to be as important as pure processing capability.
Molecular Simulation
Eflop/s capabilities will allow the use of more accurate formalisms (more accurate energy calculations, for example) and enable molecular simulation for high-throughput applications (e.g. the study of a larger number of systems). Unfortunately, if Eflop/s capabilities are achieved simply by aggregating a vast number of slow processors, this will not favour studies of longer timescales (a key tool for computer-aided drug design), since it will not be possible to scale up to hundreds of thousands of cores (the simulated systems typically have fewer than 1 million atoms). The lack of high-performance computers appropriate for this research will displace R&D activities to the USA, China or Japan, putting European leadership in this field at risk. Appropriate exascale resources could revolutionise the simulation of biomolecules, allowing molecular simulators to decipher the atomistic clues to the functioning of living organisms.
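The strong-scaling argument above can be made concrete with a small sketch. The floor of atoms per core below which communication dominates is an illustrative assumption (real molecular dynamics codes differ), but the arithmetic shows why a fixed-size biomolecular system cannot exploit arbitrarily many slow cores.

```python
# Strong-scaling arithmetic for a fixed-size molecular dynamics problem.
# MIN_ATOMS_PER_CORE is an assumed threshold, purely for illustration.

SYSTEM_ATOMS = 1_000_000      # typical upper bound for biomolecular systems
MIN_ATOMS_PER_CORE = 500      # assumed floor below which MD stops scaling

for cores in (1_000, 10_000, 100_000, 1_000_000):
    atoms_per_core = SYSTEM_ATOMS / cores
    useful = atoms_per_core >= MIN_ATOMS_PER_CORE
    print(f"{cores:>9,} cores: {atoms_per_core:>9,.0f} atoms/core -> "
          f"{'useful' if useful else 'communication-bound'}")
```

Whatever the exact threshold, the per-core workload shrinks towards nothing long before hundreds of thousands of cores are reached, so exascale machines built from many slow cores do not shorten the wall time per simulated nanosecond.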
Biomedical Simulation
The extensive use of simulation will help to integrate knowledge and data about the body, tissues, cells, organelles and biomacromolecules into a common framework that will facilitate the simulation of the impact of factors that perturb the basal situation (drugs, pathology, etc.). Simulation will reduce costs, time to market and animal experimentation. In the medium to long term, simulation will have a major impact on public health, providing insights into the causes of diseases and allowing the development of new diagnostic tools and treatments. It is expected that understanding the basic mechanisms of cognition, memory, perception, etc., will allow the development of completely new forms of energy-efficient computation and robotics. The potential long-term social and economic impact is immense.
Engineering Sciences and Industrial Applications
Turbulence
Direct Numerical Simulations (DNS, using no models), which centred on simple turbulent channels five years ago, have turned to jets and boundary layers, which are much closer to real-life applications, and the trend towards ‘useful’ flows is likely to continue. The Reynolds numbers have increased by a factor of roughly five, implying a work increase of three orders of magnitude. However, this is only an intermediate stage in turbulence research. A ‘breakthrough’ boundary layer free of viscous effects requires Reynolds numbers of the order of Reτ = 10,000 – five times higher than present simulations. That implies computer times 1,000 times longer than at present (scaling as Re^4) and storage capacities 150 times larger (scaling as Re^3). Keeping wall times constant implies increasing processor counts from the present O(32 Kproc) to O(32 Mproc), which will require rewriting present codes but is probably not insurmountable. Storage might be a tougher problem. Turbulence research requires storing and sharing large data sets, presently O(100 TBytes) per case and becoming O(20 PBytes) within the next 5–10 years. Archiving, transmitting and post-processing those data will require work, but the rewards in the form of more accurate models, increased physical understanding, and better design strategies will grow apace.
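The cost factors quoted above follow directly from the stated scaling laws; a minimal sketch, using only the numbers given in the text:

```python
# DNS scaling laws as stated in the text: compute time ~ Re^4, storage ~ Re^3.
# A factor-five jump in Reynolds number reproduces the cited cost factors.

RE_FACTOR = 5                       # five times today's Reynolds number

time_factor = RE_FACTOR ** 4        # 625, i.e. the ~1,000x longer runs cited
storage_factor = RE_FACTOR ** 3     # 125, i.e. the ~150x storage growth cited

print(f"compute-time growth: {time_factor}x")
print(f"storage growth:      {storage_factor}x")

# Holding wall time fixed multiplies processor counts by the same Re^4 factor:
# from O(32K) cores towards tens of millions, as stated above.
print(f"cores at fixed wall time: {32_000 * time_factor:,}")
```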
Combustion
Scientific challenges in combustion are numerous. First, a large range of physical scales must be considered, from the characteristic scales of fast chemical reactions and pressure wave propagation up to burner or whole-system scales. Turbulent flows are, by nature, strongly unsteady. Chemistry and pollutant emissions involve hundreds of chemical species and thousands of chemical
reactions, and they cannot be handled in numerical simulations without suitably adapted models. The usual fuels are liquid, storing a large amount of energy in small volumes. Accordingly, two-phase flows must be taken into account (fuel pulverisation, spray evolution, vaporisation, mixing and combustion). Solid particles, such as soot, may also be encountered. Interactions between flow hydrodynamics, acoustics and combustion may induce strong combustion instabilities (gas turbines, furnaces) or cycle-to-cycle variations (piston engines), degrading burner performance and, in extreme cases, destroying the system within a short time. Control devices may help to avoid these instabilities. The design of cooling systems requires knowledge of heat transfer to walls due to conduction, convection and radiation, as well as of flame/material interactions. Europe is very well positioned in this field through the development of scalable and mature codes based on LES and DNS simulations. Fire simulations are less mature than gas turbine or internal combustion engine computations, but prediction in terms of safety, prevention and fire fighting is a challenging goal. Forest fires regularly and severely affect southern European countries and, because of climate change, may concern northern regions in the future. Their social impact is very important. The simulation of fire fighting – for example, by dropping fluids with or without retardant – is also a challenging research area of crucial importance.
Aeroacoustics
In the development of new aircraft, engines, high-speed trains, wind turbines and so forth, the prediction of the flow-generated acoustic field becomes more and more important, since society expects a quieter environment and noise regulations, not only near airports, become stricter every year. The future of noise prediction, and one day even noise-oriented design, belongs to unsteady three-dimensional numerical simulation from first principles, but the contribution of such methods to industrial activities in aerospace seems to be years away. Certification often depends on a fraction of a dB, whereas presently predicting noise to within, say, 2 dB without adjustable parameters is decidedly impressive. The state of the art is limited to simplified components or geometries which can be tackled using manually generated structured meshes, in contrast to the systems actually installed, which need to be simulated most probably with adaptive unstructured body-fitted or Cartesian grids. The latter can be decomposed into an arbitrary number of blocks such that the computations can be done on massively parallel machines in the Eflop/s range and higher. Such machines are essential.
Biomedical Flows
Surgical treatment in human medicine can be optimised using virtual environments, where surgeons perform pre-surgical interventions to explore best-practice methods for the individual patient. The treatment of the pathology is supported by analysing the flow field, for example optimising nasal cavity flows or understanding the development of aneurysms. The computational requirements for such flow problems have increased steadily over recent years and have reached the limits of petascale computing, not only in computational effort but also in required storage. It is vital to understand fully the details of the flow physics in order to draw conclusions about medical pathologies and to propose, for instance, shape optimisations for surgical interventions. Such an in-depth analysis can be obtained only with a higher resolution of the flow field, which in turn increases the overall problem size. Tackling, for example, the nasal cavity problem, where the entire fluid and structural mechanics of the respiratory system is simulated, demands the next generation of exascale computers, which are expected to be available in 2020.
General Process Technologies – Chemical Engineering
Chemical engineering and process technology are traditional users of HPC for dimensioning and optimising reactors in the design stage. Computational techniques are also used for improving the operation of processes, for example through model predictive optimal control or through inverse modelling for estimating system parameters. The computational models used in chemical engineering span a wide range of scales.
On the microscopic level, chemical reactions may be represented by molecular dynamics techniques; on the mesoscopic level, flows through pores or around an individual particle may be of interest. The macroscopic scale eventually considers the operation including
heat and mass transfer in a full industrial-scale reactor or even the operation of a full facility. Exascale systems will permit a better understanding of highly dispersed phenomena and of very large up- (or down-) scaling problems, such as aggregate formation and growth, through the development of much improved particle simulation technologies – for example, for describing multiscale interactions between fluids and structures, fluid-solid suspensions, interfaces and multi-physics coupling.
Industrial Applications
• Aeronautics: full Multidisciplinary Design and Optimisation (MDO), CFD-based noise and in-flight simulation – the digital aircraft
• Turbo machines, propulsion: aircraft engines, helicopters, etc.
• Structure calculation: design of new composite compounds, deformation
• Energy: turbulent combustion in closed engines and open furnaces, explosions in confined areas, power generation, hydraulics, nuclear plants
• Automotive: combustion, crash, external aerodynamics, thermal exchanges, etc.
• Oil and gas industries: full 3D inverse waveform problem (seismic), reservoir modelling, multiphase flows in porous media at different scales, process plant design and optimisation, CO2 storage
• Engineering (in general): multiscale CFD, multi-fluid flows, multi-physics modelling, computer-aided engineering, stochastic optimisation and uncertainty quantification, etc.
• Special chemistry: molecular dynamics (catalysts, surfactants, tribology, interfaces), nano-systems
• Others (banking/finance, medical industry, pharma industry, etc.): big data, data mining, image processing, etc.
Aeronautics
Aircraft companies are now heavily engaged in trying to solve problems such as calculating maximum lift using HPC resources. This problem has an insatiable appetite for computing power and, if solved, would enable companies designing civilian and military aircraft to produce lighter, more fuel-‐efficient and environmentally friendlier planes.
To meet the challenges of future aircraft transportation (‘Greening the Aircraft’), it is vital to be able to flight-test a virtual aircraft, with all its multidisciplinary interactions, in a computer environment and to compile all of the data required for development and certification with guaranteed accuracy in a reduced time frame. For these challenges, exascale is not the final goal – a complete digital aircraft will require more than Zflop/s systems. In parallel, future aircraft concepts require deeper basic understanding in areas such as turbulence, transition and flow control, to be achieved by dedicated scientific investigations.
Turbo Machines, Propulsion
Numerical simulation and optimisation is pervasive in the aeronautics industry, and in particular in the design of propulsion engines. The main driving force of technological evolution is the substantial targeted reduction of specific fuel consumption and environmental nuisance – in particular greenhouse gases, pollutant emissions and noise – as put forward by bodies such as ACARE and IATA. On the engine side, these ambitious goals are pursued by increasing propulsive and thermodynamic efficiency, reducing weight and, finally, controlling sources of noise. The targets can probably not be achieved simply through gradual improvement of current concepts. The development of disruptive propulsive technology is needed, relying even more heavily on numerical tools to
overcome the lack of design experience. We can foresee two major challenges related to HPC: the use of high-‐fidelity numerical tools towards a more direct representation of turbulence and the evolution of optimisation strategies.
Energy
The objectives are multiple: first, improvement of the safety and efficiency of facilities (especially nuclear plants), and second, optimisation of maintenance operations and lifespan. This is one field in which physical experimentation, for example with nuclear plants, can be both impractical and unsafe. Computer simulation, in both the design and operational stages, is therefore indispensable.
Thermal hydraulic CFD applications: improvement of efficiency may typically involve mainly steady CFD calculations on complex geometries, while improvement and verification of safety may involve long transient calculations on slightly less complex geometries. Note that, as safety studies increasingly require assessment of CFD code uncertainty, sensitivity to boundary conditions and resolution options must be studied, but turbulence models may still induce a bias in the solution. Doing away with turbulence models and running DNS-type calculations, at least for a set of reference cases, would be a desirable way of removing this bias. Such studies will require access to multi-Eflop/s capacities over several weeks.
Neutronics applications will also require access to Eflop/s capacities, for moving from current Sn neutron transport codes to full Monte Carlo transport calculations on millions of cores, implementing time-dependent and multi-physics coupling. Many other applications exist beyond those mentioned: new generations of power plants, innovation in renewable energies and storage, customers’ energy efficiency, development of new technologies and materials for homes and buildings, etc.
Automotive
The automotive industry is actively pursuing important goals that need Eflop/s computing capability or greater, including the following examples:
• Vehicles that will operate for 250,000 kilometres (150,000 miles) on average without the need for repairs – this would save automotive companies substantial money by enabling the vehicles to operate through the end of the typical warranty period at minimal cost to the automakers
• Full-‐body crash analysis that includes simulation of soft tissue damage – today's ‘crash dummies’ are inadequate for this purpose – required particularly by insurance companies
• Longer-lasting batteries for electrically powered and hybrid vehicles
For both aerodynamics and combustion, at least LES and, if possible, DNS simulations are required at industrial scale, and Eflop/s applications must be developed at the right scale.
Crash: most computations are currently done in parallel (8–64 cores), and scalability tests have shown that up to 1,024 cores may be reasonable on 10 million finite elements. It is likely that future simulations will require model sizes for a full car ranging from 1.5 to 10 billion finite elements, demanding the development of new codes (mainly open source) for Eflop/s systems. Such codes must support coupling (a standardised mapping between manufacturing simulation and crash simulation), optimisation and stochastic analysis, with embedding into a simulation data management system with automated pre- and post-processing, including monitoring and coupling to other fields and functionalities. On a 10-year horizon, a variety of major challenges, including true virtual testing, will be addressed, leading to an increase by a factor of more than 1,000 in the required number of computations (see section 6.3.5 for a complete list).
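A rough sketch of the core-count arithmetic implicit in those crash figures, assuming (purely for illustration) that today's demonstrated elements-per-core load is held fixed:

```python
# Core counts implied by the crash-simulation figures above, under the
# illustrative assumption of a constant per-core element load.

ELEMENTS_TODAY = 10e6               # 10 million finite elements, as cited
CORES_TODAY = 1_024                 # demonstrated reasonable core count
elements_per_core = ELEMENTS_TODAY / CORES_TODAY   # ~10,000 elements/core

for future_elements in (1.5e9, 10e9):   # cited full-car model sizes
    cores_needed = future_elements / elements_per_core
    print(f"{future_elements:.1e} elements -> ~{cores_needed:,.0f} cores")
```

Under this assumption, a full-car model lands in the range of roughly 150 thousand to a million cores, which is why the text calls for new Eflop/s-ready codes rather than incremental scaling of today's solvers.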
Oil and Gas
The petroleum industry is strongly motivated to increase the efficiency of its processes, especially in exploration and production, and to reduce risks through the deployment of HPC. Typical steps in the business process are: geoscience for the identification of oil and gas underground; development of reservoir models; design of facilities for the recovery
of hydrocarbons; drilling of wells and construction of plant facilities; operations during the life of the fields; and eventually decommissioning of facilities at the end of production. Geoscience analyses seismic data with numerical techniques for inverse problems. The economic impact of HPC is definitely high, and the best possible tools are deployed. Eflop/s is not the ultimate goal for this industry either: the complete inverse-problem resolution of the wave equation needs still more computational resources. The objective of this application is to produce, from a seismic campaign, the best estimation of the underground topography in order to optimise reservoir delineation and production by solving the Full Wave Inversion. This application is largely embarrassingly parallel, and the higher performing the HPC system, the better the approximation of the underground topography. However, large HPC simulations are also required to understand (i) multi-phase, multi-fluid flows of different viscosities in porous media at high pressure and high temperature, and now in carbonates, and (ii) multi-fluid flows in risers (e.g. the BP Macondo well in the Gulf of Mexico), with safe and optimised flows, plus transport and storage.
Others
Banks and insurance companies are increasingly using HPC, mostly for embarrassingly parallel Monte-Carlo solutions of stochastic ODEs (a minimal sketch of such a calculation follows the list below); high-frequency trading will inevitably require better models and faster calculation. They also face the challenge of interconnecting supercomputers and several private clouds. In common with many other industries, they are confronted with the ‘big data’ problem, in the sense that massive market data are available (Reuters) and current calibration algorithms cannot exploit such large inputs. Note that 41 machines are characterised as ‘finance’ in the latest Top 500 list (November 2011). They also need new and much more efficient data mining methods. Industrial pharma applications: all these industries, firmly established in Europe, already use ab-initio and molecular simulation applied to their domains, and will increase R&D efforts in this field (see sections 4 and 5 of this report) for drug design (GSK, Sanofi) or biomedical applications (L’Oréal). The main issues for these industries include:
• Big data management, generation, transport, storage, etc., due to screening simulations
• Exascale-efficient MD software
• New data mining methods for massively parallel QSAR (Quantitative Structure-Activity Relationships)
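The Monte Carlo workload mentioned above for banking has a simple canonical form. The sketch below prices a European call option by Euler-Maruyama simulation of a geometric Brownian motion; all parameters are hypothetical and production models are far richer, but the structure (many independent paths) is what makes the workload embarrassingly parallel.

```python
# Minimal Monte Carlo pricing sketch: Euler-Maruyama paths of a geometric
# Brownian motion dS = S (r dt + sigma dW), then a discounted payoff average.
# All parameter values are hypothetical, chosen only for illustration.

import numpy as np

def mc_call_price(s0=100.0, strike=105.0, rate=0.02, vol=0.2,
                  maturity=1.0, n_steps=252, n_paths=100_000, seed=0):
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    s = np.full(n_paths, s0)                 # all paths start at s0
    for _ in range(n_steps):                 # Euler-Maruyama time stepping
        dw = rng.standard_normal(n_paths) * np.sqrt(dt)
        s += s * (rate * dt + vol * dw)
    payoff = np.maximum(s - strike, 0.0)     # European call payoff at maturity
    return float(np.exp(-rate * maturity) * payoff.mean())

print(f"estimated option value: {mc_call_price():.3f}")
```

Because every path is statistically independent, the path set can be split across any number of nodes with no communication until the final average, which is precisely the sense in which this workload is embarrassingly parallel.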
Academic and Industry Common Issues
Common issues include operating software with load balancing and fault tolerance, and coupling that matches user needs. Beyond that, what is clearly expected by 2020 includes:
• Standard coupling interfaces and software tools
• Mesh-generation tools: automatic and adaptive meshing, highly parallel (from meshes of about 100 million tetrahedra to 10 billion)
• Coupling of multi-physics and refined chemistry
• Meshless methods and particle simulation, up to billion-particle simulations
• New numerical methods, algorithms, solvers/libraries (BLAS, etc.)
• Uncertainty quantification
• Optimisation, data assimilation
• Large databases, big data, new methods for data mining and valorisation
1.4 Balance between Scientific, Industrial and Societal Benefits
While the focus of this report lies in the description of the scientific roadmap and the major challenges associated with it, we should emphasise the increasing economic and societal benefits that arise from making progress towards their resolution. In this section, we consider the potential impact of computer simulations on the economy and society in general, providing a number of compelling examples from each of the scientific domains central to this report. HPC is a key enabler for economic growth in Europe today. It delivers a competitive edge to companies operating in the global marketplace, allowing them to design and produce products and services that differentiate them from their competitors. A 2004 study of HPC users by IDC found that almost 100% indicated that HPC was indispensable for their business. All of us experience the effects of HPC in our day-to-day lives, although in many (and probably most) cases we are unaware of that impact. We travel in cars and aeroplanes designed using modelling and simulation applications run on HPC systems so that they are efficient and safe. HPC is essential for ensuring that our energy needs are met. Finding and recovering fossil fuels requires engineering analysis that only HPC can deliver. Nuclear power generation also relies heavily on HPC to ensure that it is safe and reliable. In the coming years, HPC will have an even greater impact as more products and services come to rely on it.
Besides the above benefits that industry can draw from HPC to achieve increased innovation rates and competitiveness, HPC is also a critical and essential technology as we address some of the societal challenges ahead, such as generating clean and efficient energy, predicting and mitigating the effects of climate change, and ensuring safe and efficient travel. Clearly, one of the most important challenges facing our world is to design and provide clean and climate-friendly transportation systems and energy-producing technologies that would not lead to the current levels of CO2 and other greenhouse gases. HPC is therefore essential in two ways:
1. It is the only way to study and design new processes, as physical mock-up systems are becoming unaffordable, if not impossible, to construct (as, for example, in the design of low-pressure stable combustion chambers, or of more fuel-economic terrestrial and airborne vehicles)
2. It is the only way to check the real impact of these new designs through the use of advanced climate models
To capture the economic and societal benefits through investment in HPC resources, we show in Table 1.2 below a summary of these benefits as a function of scientific domain. We have attempted to colour-‐code these benefits according to economic (red) and societal (green) – and highlight those developments that impact on both (purple).
Table 1.2. The economic and societal benefits arising through HPC provision as a function of scientific domain.
Economic and Societal Benefits in Weather, Climatology and Solid Earth Sciences
Quantifying the Certainty and Impact of Forecasts
Natural disasters claim hundreds of thousands of lives annually and cause vast property losses. To what extent anthropogenic climate change will lead to an increase in occurrence and severity of extreme events and natural disasters is one of today’s most important and challenging scientific questions. The countries that will have access to the highest performance in computing will be able to perform experiments that will become the references for future scientific assessments and associated political decisions. Even though Europe has world-‐class expertise in climate, oceanography, weather and air quality, earthquake and tsunami modelling issues, European scientists may lose their current prominence if they cannot access the most powerful computing systems.
The economic benefit to society of quantifying the certainty and impact of forecasts, on whatever timescale, is enormous. By providing probabilistic results to agencies involved in assessing impacts of extreme events or climate change adaptation, mitigation strategies can be developed and the impacts constrained.
The societal benefits range from mitigation of high-‐impact weather by having a more accurate and timely weather nowcasting system to a better air-‐quality forecast with direct impact on health, traffic (e.g. fog and rain), agriculture (e.g. ozone influence), etc. In the last few years, the emphasis on ‘policy relevance’ has moved beyond the issues of the existence of a human effect on climate and how to mitigate future change. As it becomes more likely that a certain level of climate change is inevitable, interest has extended to how climate will change in the next few decades, and to evaluate the most effective ways to adapt to it. Hence, from a policy point of view, there are two important timescales that have to be understood: the next few decades during which vulnerabilities can be assessed and adaptation responses planned, and the centennial scale on which we can understand how global strategies could mitigate climate change.
Seismic, Volcanic and Tsunami Hazard Mitigation
It is well to recall that, despite continued progress in building code development in Europe and abroad, all these catastrophic events have disruptive implications for society, with high associated economic cost, and they must be properly managed.
This applies particularly to critical facilities such as nuclear power plants and long-term waste repositories. The critical steps are: first, the ability to predict a wide range of possibilities in the process of designing and planning new facilities, and second, the means to perform simulation and near-term prediction to support decision-making strategies for retrofitting existing structures and to manage specific issues. Predictions have to rely on high-resolution models and on the ability to run vast ensembles of simulations with these models, and to integrate the high-volume data outputs using appropriate analytics for uncertainty quantification and extreme-scale prediction.
Monitoring of nuclear testing treaties and underground explosion activities relies on the detection and characterisation of nuclear explosions from the seismic signals recorded by the global seismic networks. Detecting and discriminating nuclear explosions from earthquakes and other types of wave sources has to rely on a deep understanding of the wave propagation physics, and on the ability to perform vast ensembles of high-resolution simulations of the seismic wave propagation generated by a wide variety of sources. There must also be the ability to process these synthetic waveforms and to compare them with the actual observations.
Research and Development in the Energy Industry
Research and development in the energy industry depends heavily on innovative computational analysis and leading-edge computing and data capabilities: to explore the containment of underground wastes, carbon sequestration to reduce global warming, and new energy resources, and to find innovative means to tap these resources and monitor their exploitation.
All these tasks require high-‐resolution 3D tomographic images of the Earth’s interior and now time-‐lapse repeated tomography to detect changes. The capability to perform simulation-‐based inversion and optimisation faster, integrating multiscale and multi-‐physics methodology, in high-‐dimensional and complex parameter spaces is extremely important to the overall competitiveness of Europe in these industrial and societal issues.
Marine Monitoring and Forecasting Systems
The societal benefit of marine monitoring and forecasting systems is crucial in the areas of marine safety (e.g. wave models predicting sea state are significantly improved by accurate prediction of ocean currents), marine resources, marine environment and climate (e.g. forecasting of toxic algae blooms, storm surges, regional sea-‐level changes in the long term), seasonal and weather forecasting.
The impact of these services will be enormous, as they will be crucial contributions to the environmental information base allowing Europe to evaluate independently its policy responses in a reliable and timely manner.
The societal benefit of an intensive use of higher-‐resolution ocean models in the research community will also be of crucial importance.
Observations alone cannot provide the fundamental knowledge of the ocean that modelling activities can deliver. This knowledge will be a key factor in the improvement of forecasting and climate prediction models themselves, in the evaluation of uncertainties related to the non-linear character of ocean flows, and in the development and rationalisation of the ocean observational networks.
Economic and Societal Benefits in Astrophysics, High-‐Energy Physics and Plasma Physics
Alternative Sources of Energy
Ultra-compact laser-plasma accelerators will also produce high-quality electron and ion beams directly relevant for biomedical applications. Ion beams from these accelerators (at 200 MeV and above) would be particularly suited to the treatment of deep tumours or radio-resistant cancers, at a very small fraction of the cost of existing facilities, or, at lower energies, to producing radioisotopes for medical diagnostics. Coherent X-rays produced from high-quality electron beams with ultra-short pulse duration can have a tremendous impact on structural biology and on bio-imaging of viruses, again at a much smaller fraction of the cost/space of existing light sources.
Besides their interest as natural phenomena (aurorae, etc.), magnetospheric perturbations are also of practical interest, since they can affect satellites, disrupt telecommunications and occasionally affect power grids. Thus, solar physics modelling, coupled with observations, can anticipate disruptive solar activity and contribute to the strategy for protecting investments in satellites and telecommunications. Europe is also the leading partner in several research satellites, such as Ulysses, SoHO and the Cosmic Vision Solar Orbiter.
Figure 1.2. Spot price for a barrel of oil versus global oil production, suggesting saturation in the supply rate from 2004 (from [26]).
Figure 1.2 suggests that global oil production may have peaked in 2004 [26], an event that, whenever it happens, will inevitably have enormous consequences for the world economy.
This, in addition to climate change, provides a further timely reminder of society’s urgent need to develop alternative sources of energy that are independent of fossil fuels.
HPC research makes very important contributions to the development of carbon-‐free sources of energy (e.g. from nuclear fusion).
The possible economic benefit from top-‐level computing capacity in the field of thermonuclear fusion research is self-‐evident. Achieving thermonuclear fusion as a possible future energy source is a great dream of mankind. It requires very large investments, and, as in many other fields, computer simulations are an effective way to steer applied research in the right directions.
Europe is the world leader in magnetic confinement with the Joint European Torus (JET) as the top running experiment and as the partner contributing the largest fraction of the construction costs of the international project ITER. Europe is also establishing a strong effort in Inertial Confinement Fusion through large-‐scale pan-‐European laser projects such as the High Power Laser for Energy Research – HiPER.
Forefront computing capabilities are an essential cost-‐effective means to back all these investments and to strengthen or maintain leadership in these fields.
Economic and Societal Benefits in Materials Science, Chemistry and Nanoscience
Finding Substitutes for Critical Minerals
Minerals are important components of many products people use in daily life (e.g. ultra-high-coercivity magnets, cell phones, computers, lasers, navigation systems and automobiles). Yet the European Union does not mine or process much of that raw material. The Enterprise and Industry Directorate of the European Commission issued a report defining critical raw materials [27] as ones whose supply chain is at risk, or for which the impact of a supply restriction would be severe, or both. The report recommends that substitution should be encouraged, notably by promoting research on substitutes for critical raw materials in different applications. The exascale infrastructure will assist combinatorial materials discovery and design in rapidly discovering and developing substitutes for technologies and applications that are currently dependent on critical minerals for which no alternative is known today.
Computational materials science is intrinsically a rather diverse interdisciplinary community that plays a very prominent role in PhD and post-‐doctoral research training. The young scientists working in this field, developing complex computational methods and applying them in a multi-‐ and trans-‐disciplinary environment, are in great demand in both industry and academia.
Computational materials science, chemistry and nanoscience have a proven track record of direct impact on our society. One may think of the Internet and smart-‐phone revolution or the efficiency of modern detergents. There is no reason to believe that this may stop; instead, the impact will accelerate due to the steep progress in materials design that is foreseen by the use of an exascale infrastructure. This is necessary to contribute to the pressing issues on energy harvesting, storage, conversion and saving, environmental protection and toxicity management, decontamination, air cleaning, biotechnology and health care.
26 James Murray and David King, ‘Oil’s tipping point has passed’, Nature 481, 433, 26 January 2012.
27 http://ec.europa.eu/enterprise/policies/raw-materials/files/docs/report-b_en.pdf
Energy Materials
Computational materials science assisted by exascale computing can provide invaluable input. Examples where design of new materials is needed are high-‐efficiency photovoltaic cells, fuel cells for electricity production by hydrogen, energy-‐efficient solid-‐state lighting, batteries for energy storage or thermoelectric materials. Given the need for green energy production and green information technology, computational materials science can have a global impact on the grand challenges of energy, environment and sustainability.
Economic and Societal Benefits in Life Sciences and Medicine
Rational Drug Design and the Pharmaceutical Industry
Europe has a very competitive industry that launches almost 40% of the pharmaceutical products on the worldwide market. Advances in genomics, systems biology and molecular simulation are making rational drug design a powerful alternative to trial-and-error methods. Computing systems incapable of delivering such high-power computations within a time frame of weeks would not be profitable for the drug industry. Assuming a 20-year patent, drug companies have an average of 6 years’ exclusivity in which to recover fully their $1.2 billion investment [28] while maintaining their operating costs. This introduces a strong pressure that is damaging pharmaceutical companies in Europe, forcing them to design new approaches to reduce the time required for drug development and to design new therapeutic scenarios.
In this field, personalised medicine appears to be one of the low-hanging fruits. Its market is expected to show annual growth of about 10%. The core diagnostic and therapeutic segment of the market comprises primarily pharmaceutical, medical device and diagnostics companies, and the projections for 2015 are for it to reach $452 billion [29]. Any computational strategy leading to increased efficiency in target discovery, lead finding and lead optimisation will have an enormous impact on the European pharmaceutical industry and will represent enormous savings for the national public health systems.
Additional savings will be derived from the partial replacement of animal testing ($150 million per launch [30] associated with the development of a new pharmaceutical entity) by in-silico approaches. In this field, we cannot ignore that, according to the REACH initiative, major animal testing will be required to evaluate the safety of the major chemicals sold in Europe over the coming decade [31]; this will involve millions of laboratory animals, with an estimated total cost of €1.3–9.5 billion.
Researchers in medical simulation, systems biology and molecular simulation are working towards the development of in-‐silico models that can simulate systems from cells to complex organs such as the brain. These models, when complete, will certainly require exascale computing.
Economic and Societal Benefits in Engineering Sciences and Industrial Applications
In the Energy Domain
Every human activity and energy process – nuclear, thermal, hydro – needs continuously to improve its environmental impact (thermal discharge of nuclear/thermal power plants, long-term geomorphology in rivers due to hydropower, water or air quality, etc.). Achieving this environmental risk assessment will also require the use of multi-physics (sedimentology, water quality), multiscale (from the local scale of the near field to the large scale of a river basin) and complex multidimensional, time-dependent models (sometimes simulating five years of possible evolution).
28 Tufts University CSDD Outlook 2008
29 The Science of Personalised Medicine: Translating the Promise into Practice (2009), PricewaterhouseCoopers
30 Paul et al. (2010) Nature Reviews Drug Discovery
31 http://ec.europa.eu/environment/chemicals/reach/reach_intro.htm
The main challenge facing the nuclear industry is, today more than ever, to design safe nuclear power plants. This is presently being done for the so-called third generation (e.g. the French EPR), and is under active preparatory study for the forthcoming fourth generation (e.g. sodium- or gas-cooled fast-neutron reactors). HPC, and in particular Eflop/s computing, will contribute to improving nuclear power plant design, involving the use of multi-physics, multiscale, complex three-dimensional time-dependent models. It is worth noting that HPC 3D simulations are crucial to EDF for assessing the 10-year extension of the lifetime of nuclear power plants, each such extension representing hundreds of millions of euros of savings. Another example concerns the cost of drilling by oil companies, which amounts to several tens of millions of dollars per well; avoiding an unproductive well, thanks to extensive HPC analysis of seismic data, is clearly worth the cost!
The easily accessible oil reserves are decreasing rapidly, with an oil peak forecast to occur sometime around the middle of the century. Improving the efficiency of the search for new oil reservoirs, including non-traditional reservoirs, can be done only through very advanced wave-propagation models and image-processing methods. Such methods are being pursued by oil companies as they contribute crucial competitive advantages. Equally, production from non-traditional fields, like marginal or deep-water fields, is often characterised by oil and gas qualities that are more difficult to produce. Flow from wells to production plants needs particular attention and engineering abilities; handling of the produced fluids is more complex. Energy supply can be safeguarded only with extended capabilities in reservoir modelling and enhanced techniques for production (e.g. processing and transport of crude oil streams in harsh environments). Increased levels and quality of computer simulations enable engineers and scientists to turn potential risks into well-managed opportunities for the energy-consuming society.
In the Chemical Engineering Domain
The chemical industry relies heavily on oil as a carbon source. With the prospect of increasing oil prices and decreasing oil reserves, gasification of low-grade coals and biomass (with its low CO2 footprint), providing synthesis gas for further chemical processing, has become an important technology. Gasification, with reactor sizes of several metres, residence times of 10–100 seconds and physical processes occurring on micrometre and millisecond scales, is a challenging multiscale, multi-physics problem. Future (virtual) design and optimisation of these reactors will be driven by numerical simulation. Development for a large variety of feedstocks requires a large number of simulations, and each single full-3D, dynamic simulation requires efficient numerical models and the corresponding computational resources.
In the Transportation Domain
Future air transport systems will have to meet the constantly increasing needs of European citizens for travel and transport, as well as the pressing requirements to preserve the environment and quality of life. Within the ACARE (Advisory Council for Aeronautics Research in Europe) Vision 2020, ambitious goals have been set for air traffic in the coming decades. These include a reduction of emissions by 50% and a decrease in the perceived external noise level by 10–20 dB. Continuous improvement of conventional technologies will not be sufficient to achieve these goals: a technological leap forward is required. Numerically based design and flight testing will be a key technology in aiming at more affordable, safer, cleaner, quieter and hence greener aircraft. Access to high-performance computers in the exascale range is of the utmost importance. Considerable changes in the development processes will lead to significant reductions in development times while, at the same time, including more and more disciplines in the early design phases to find an overall optimum for the aircraft configuration. This will enable the European aircraft industry to retain a leading role in worldwide competition, facing both an old challenge, i.e. competing with the US, and a new, rapidly emerging one, i.e. keeping an innovation advantage over China.
Designing efficient engines, motors and reactors for airplanes and cars is critical for both the efficiency and the safety of current and future propulsion modes. This is the challenge facing advanced European industries: how can we use less energy to propel the new vehicles presently under development while meeting the corresponding environmental challenge of emitting fewer greenhouse gases? How do we design safe reactors which could perform at low pressure without exhibiting dangerous instabilities?
2 WEATHER, CLIMATOLOGY AND SOLID EARTH SCIENCES
2.1 Summary
Weather, Climatology and solid Earth Sciences (WCES) encompass a wide range of disciplines from the study of the atmosphere, the oceans and the biosphere to issues related to the solid part of the planet. They are all part of Earth system sciences or geosciences. Earth system sciences address many important societal issues, from weather prediction to air quality, ocean prediction and climate change to natural hazards such as seismic, volcanic and tsunami hazards, for which the development and the use of high-‐performance computing plays a crucial role.
Research in the fields of weather, climatology and solid Earth sciences is of key importance for Europe for:
• Informing and supporting preparation of the EU policy on environment, climate and natural hazard mitigation and adaptation
• Understanding the likely impact of the natural environment on EU infrastructure, economy and society
• Enabling informed EU investment decisions in ensuring sustainability within the EU and globally
• Developing civil protection capabilities to protect the citizens of the EU from natural disasters
• Supporting the joint EU and ESA initiative on Global Monitoring for Environment and Security
The following paragraphs introduce the WCES scientific domains, the societal benefits and the international background.
Climate Change
In the last decade, our understanding of climate change has increased, as has the societal need for that understanding to be carried into advice and policy. However, while there is great confidence in the fact that climate change is happening, there remain uncertainties. In particular, there is uncertainty about the levels of greenhouse gas emissions and aerosols likely to be emitted and, perhaps even more significant, there are uncertainties about the degree of warming and the likely impacts. Increasing the capability and comprehensiveness of ‘whole Earth system’ models that represent, in ever-increasing realism and detail, scenarios for our future climate is the only way to reduce these latter uncertainties. A further challenge is to provide more robust predictions of regional climate change at the decadal, multi-decadal and centennial timescales to underpin local adaptation policies. In many regions of the world, there is still considerable uncertainty in the model predictions of the local consequences of climate change at different timescales. Model resolution plays a key role. A dual-track approach should be taken, involving multi-member, multi-model comparisons at the current leading-edge model resolution (about 20 km, limited to a few decades), alongside the longer-term aim of developing a global convection-resolving model (down to 1 km resolution). Reducing these uncertainties in climate projections requires a coordinated set of experiments and multi-year access to a stable High-Performance Computing (HPC) platform. Issues relating to mass data storage, and to the dissemination of model outputs for analysis to a wide-ranging community of scientists over a long period, will need to be resolved. A multi-group programmatic approach could allow a set of model inter-comparisons in Europe focused on a number of priority climate science questions.
Oceanography and Marine Forecasting
The ocean is a fundamental component of the Earth system. Improving understanding of ocean circulation and biogeochemistry is critical to assessing properly climate variability and future climate change and related impacts on, for example, ocean acidification, coastal sea level, marine life and polar sea-ice cover. The ocean greatly influences the climate system at shorter timescales (i.e. for weather forecasting and seasonal climate prediction), but its influence grows as timescales increase. Beyond climate, ocean scientists are being called on to help assess and maintain the wealth of services that the ocean provides to society. Human activities, including the supply of food and energy, transport of goods, etc., exert an ever-increasing stress on the open and coastal oceans. These stressors must be evaluated and regulated in order to preserve the ocean’s integrity and resources. Society must also protect itself against marine natural hazards. Marine safety concerns are becoming more acute as the coastal population and maritime activities continue to grow. For all these concerns, there is a fundamental need to build and efficiently operate the most accurate ocean models in order to assess and predict how the different components of the ocean (physical, biogeochemical, sea-ice) evolve and interact. The main perspective is to produce realistic reconstructions of the ocean’s evolution in the recent past (e.g. reanalyses) and accurate predictions of the ocean’s future state over a broad range of time and space scales, to provide policy makers and the general public with relevant information, and to develop applications and services for government and industry.
Meteorology, Hydrology and Air Quality
Weather and flood events with high socio-economic and environmental impact may be infrequent, but the consequences of occurrence can be catastrophic for the societies and Earth systems that are affected. There is, of course, a link to climate prediction and climate change impacts, if severe meteorological and hydrological events are to become more frequent and/or more extreme. Predicting these low-frequency, high-impact events a few days in advance – with enough certainty and early warning to allow practical mitigation decisions to be taken – remains difficult. Understanding and predicting the quality of air at the Earth’s surface is an applied scientific area of increasing relevance. Poor air quality can cause major environmental and health problems affecting both industrialised and developing countries around the world (e.g. adverse effects on flora and fauna, and respiratory diseases, especially in sensitive people). Advanced real-time forecasting systems are basic and necessary tools for allowing early-warning advice to populations and practical mitigation strategies in case of an air pollution crisis.
Solid Earth Sciences
Computational challenges in solid Earth sciences span a wide range of scales and disciplines and address fundamental problems in understanding the Earth system – its evolution and structure – in its near-surface environment. Solid Earth sciences have significant scientific and social implications, playing today a central role in natural hazard mitigation (seismic, volcanic, tsunami and landslides), hydrocarbon and energy resource exploration, containment of underground wastes and carbon sequestration, and national security (nuclear test monitoring and treaty verification). In the realm of seismic hazard mitigation alone, it is well to recall that, despite continuous progress in building codes, one critical remaining step is the ability to forecast the earthquake ground motion to which a structure will be exposed during its lifetime.
All these areas of expertise require increased computing capability in order to provide breakthrough science. A programme of provision of leadership-‐class computational resources will make it increasingly possible to address the issues of resolution, complexity, duration, confidence and certainty, and to resolve explicitly phenomena that were previously parameterised. Each of the challenges represents an increase by a factor of at least 100 over individual national facilities currently available. A large number of the numerical models and capability-‐demanding simulations described below will lead to operational applications in other European centres, national centres and industry.
2.2 Computational Grand Challenges and Expected Outcomes
2.2.1 Climate of the Earth System Motivation
There is a vital need for high-‐performance computing in order to predict the future evolution of the climate and answer key societal questions about the impact of global warming on human activities. Even though the scientific community has little doubt that climate is sensitive to mankind’s activity, many questions remain unsolved at the quantitative level. There is a need to better qualify and quantify uncertainty of the predictions, estimate the probability of extreme events and regional impacts, quantify the feedbacks between climate and biogeochemical cycles such as carbon dioxide and methane, and identify the impacts of climate change on marine and terrestrial ecosystems and on societies. All these questions are strongly linked to the amount of computing power and data storage capacity available since they ask for increased model resolution, large numbers of experiments, increased complexity of Earth system models, and longer simulation periods compared to the current state of climate models. It is also important to perform coordinated ensembles of simulations using different models to ensure the robustness of the model results. Such coordinated multi-‐model activities are carried out within the framework of the IPCC but considerably more could be done within a European context. Sustained computing power of the order of 100 Tflop/s to 1 Pflop/s or more is already required today for Europe to maintain its scientific weight in climate change research worldwide.
Challenges: description and state of the art
Fundamental questions facing climate change research can be summarised in four key challenges.
Challenge #1: The need for very high-‐resolution models to better understand, quantify and predict extreme events and to better assess the impact of climate change on society and economy on the regional scale
Modelling the climate system is a challenge because it requires the simulation of a myriad of interacting and complex processes as well as their analysis at different time and spatial scales. Climate system modelling requires sophisticated numerical models, due to the inherently non-‐linear governing equations. Huge computational resources are needed to solve billions of individual equations describing the physical processes at different scales. Indeed, model simulations are required to represent both modification of the larger-‐scale, global, state (inside which extreme events are developing) and the fine-‐scale temporal and spatial structure of such events (storms, cyclones, intense precipitation, etc.).
Currently, global climate models have typical grid spacing of 100–200 km and are limited in their capacity to represent processes such as clouds, orography effects, small-‐scale hydrology, etc. The latest generation of models, under development or just starting to be used, have grid spacing in the 20–50 km range, and there is evidence that a number of important climate processes are better represented at this resolution (e.g. ENSO, blocking, tropical storm numbers, etc.). A priority is to continue the development of coupled models at such high resolution and use them in multi-‐member multi-‐model inter-‐comparisons focused on key climate processes.
In weather forecasting, limited-area models at much higher, convection-resolving resolutions are now used operationally. However, these models cannot be run globally for climate because of the prohibitive cost of the associated computing resources and limits in model scalability. The climate community's first 'grand challenge' in the longer term is therefore to develop global climate models that resolve convective-scale motions (nominally around 1 km horizontal resolution). These very high-resolution models will directly resolve convective systems, allow a better representation of orographic effects and of atmosphere and ocean energy and matter transport, and provide greater regional detail.
They will allow determination of whether convective-scale resolution is necessary for credible predictions of some important aspects of regional climate change. Reaching such very high resolutions will require developing scalable and more efficient dynamical cores and improving physical parameterisations.
Figure 2.1. Snapshot of Hadley Centre HadGEM3 simulation at a resolution of 25 km (N512). These simulations, part of the UPSCALE project, were performed on the HLRS HERMIT supercomputer, with funding from PRACE. Ocean temperatures (in colour going from blue = cold to violet = warm) are shown in the background, while clouds (B/W scale) and precipitation (colour) are shown in the foreground. Over land, snow cover is shown in white. (Credits: P. L. Vidale and R. Schiemann (NCAS-‐Climate, University of Reading) and the PRACE-‐UPSCALE team.)
Very high-resolution global models are expected to improve our predictions and understanding of the effect of global warming on high-impact weather events on seasonal, decadal and century timescales. Another issue is the simulation of regional-scale climate features, which are of crucial importance for assessing impacts on society and economic activities (farming, fisheries, health, transportation, etc.) and for which improved regional models, embedded in global climate models, are necessary. Such regional models, currently run at 10–50 km, also call for spatial resolutions of a few kilometres.
Increasing model resolution down to 1 km requires increases by factors of at least 100 to 1,000 in computing power compared to the current state, i.e. in the multi-petascale to exascale range towards the end of the decade. It should be noted that each increase of the spatial resolution by a factor of 2 in each direction demands at least an eightfold increase in computing power, since the time step length must also be decreased as resolution increases.
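This scaling can be made explicit with a back-of-the-envelope estimate based on the CFL stability condition (an illustrative sketch, not a figure taken from any specific model):

$$
\text{cost} \;\propto\; \left(\frac{L}{\Delta x}\right)^{2} \cdot \frac{T}{\Delta t}, \qquad \Delta t \propto \Delta x \;\;\Rightarrow\;\; \text{cost} \;\propto\; (\Delta x)^{-3}
$$

where $L$ and $T$ are the domain size and simulated period and $\Delta x$, $\Delta t$ the grid spacing and time step. Halving $\Delta x$ thus multiplies the cost by at least $2^{3} = 8$, and by more if the vertical resolution is refined at the same time.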
Challenge #2: The need to move from current climate models towards Earth system models
Today it is clear that models must also include more sophisticated representations of non-physical processes and subsystems that are of major importance for long-term climate development, such as the carbon cycle. Scientists are keen to discover the sensitivity of predictions not only to unresolved physical processes (e.g. the cloud feedbacks mentioned above) but also to non-physical ones, such as those related to biology and chemistry (including, for example, those involving the land surfaces and GHG-reactions). In the last few years, biological and chemical processes have begun to be
included in long-‐term simulations of climate change, albeit in a simplified way. In addition to the value of being able to predict changes in vegetation and atmospheric composition, it turns out that these additional processes can have quite a marked effect on the magnitude of climate change. For example, the European modelling groups were the first to show in coupled Earth system models that the global carbon cycle accelerates climate change.
However, the carbon cycle itself is intertwined with other biogeochemical cycles, such as the nitrogen cycle, so other matter cycles also need to be included. Moreover, other processes, such as GHG-‐reactions or aerosol-‐related processes and their indirect effect on clouds or interactive vegetation, still need to be better accounted for.
It should be noted that including the representation of biogeochemical cycles, using different biochemical tracers and aerosols, typically increases computing time by a factor of between 5 and 20 (depending on the complexity of the parameterisations and the number of tracers). An increase in computing power by the same factor of 5 to 20 is therefore required to account properly for the complexity of the system.
Challenge #3: Quantifying uncertainty
Future projections of climate change are uncertain for a number of reasons. The future forcing by greenhouse gases and aerosols is uncertain, and climate variations have both a natural and anthropogenic component, both of which need to be represented in climate models. The models are also inherently imperfect owing to physical processes that are either not completely understood or yet to be adequately represented because of limited computer power.
To better understand and predict global and regional climate change and climate variability using numerical models, a wide range of underlying scientific issues needs to be solved by the international community, as reported in the WCRP strategy COPES ‘Coordinated Observation and Prediction of the Earth System’ 2005–2015 (http://wcrp.ipsl.jussieu.fr/).
Taking into account the interests and strengths of the European climate science community and the aim to answer societal needs, the major issues are related to:
• The predictability – and its limits – of climate on a range of timescales
• The range of uncertainty that can be fully represented using the models currently available
• The sensitivity of climate and how much we can reduce the current uncertainty in the major feedbacks, including those due to clouds, atmospheric chemistry and the carbon cycle
The consensus approach to solving these problems is to assume that the uncertainty can be estimated by combining multi-‐model multi-‐member experiments. Running multi-‐model experiments in a coordinated European way allows investigation of the sensitivity of results to model parameters. Moreover, running multi-‐member ensembles of climate integrations allows the chaotic nature of climate to be accounted for and thereby enables systematic assessment of the relative roles of natural climate variability and man-‐made climate change.
Different emission scenarios for greenhouse agents must also be used to probe the possible future courses of the climate. Furthermore, quantifying uncertainty about future climate change will require investigating the sensitivity of results to the specification of the initial state; this initialisation issue is particularly important for decadal-timescale predictions, where both natural climate variability and man-made climate change need to be predicted within the model.
It should be noted that computing requirements scale directly with the number of ensemble members required to better represent the uncertainties associated with both internal variability and model parameterisations; moreover, the number of members required to maintain the same signal-to-noise ratio in climate forecasts increases as the spatial resolution increases. Ensemble experiments are therefore computationally expensive – a factor of 10 to 100 for each experiment – but will bring
enormous economic benefit as they will improve reliability of models and our understanding of uncertainties in forecasts.
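As a rough illustration of how these factors multiply (a minimal sketch; the member counts and the per-run cost below are hypothetical placeholders, not figures from this report):

```python
# Illustrative ensemble cost estimate; all numbers are hypothetical.
# Total cost = members x scenarios x cost of a single model integration.

def ensemble_cost(single_run_core_hours: float,
                  members: int,
                  scenarios: int) -> float:
    """Total core hours for a multi-member, multi-scenario ensemble."""
    return single_run_core_hours * members * scenarios

# Assume a single century-scale coupled run costs 2 million core hours.
single_run = 2.0e6

# 50 members x 4 emission scenarios: a factor of 200 over a single run,
# of the same order as the factor of 10 to 100 per experiment cited above.
total = ensemble_cost(single_run, members=50, scenarios=4)
print(f"Total: {total:.1e} core hours")  # -> Total: 4.0e+08 core hours
```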
Challenge #4: The need to investigate the possibility of climate surprises
In a complex non-linear system such as the Earth system, minute perturbations can cause long-term, large-scale changes. These changes could be abrupt, surprising and unmanageable. Paleoclimatic data indicate the occurrence of such rapid changes in the past. For example, it is crucial to determine whether there are thresholds in greenhouse gas concentrations beyond which climate change could become irreversible. The Atlantic thermohaline circulation (THC) might undergo abrupt changes, as inferred from paleo-records as well as from some long simulations of future climate. The possible climatic consequences of such a slowdown in the Atlantic THC are still under debate. Surprises may also arise from ice-sheet collapse and the release of large amounts of fresh water into the ocean.
Some key questions arise concerning the possibility:

• To model and understand glacial–interglacial cycles, including changes in the carbon cycle and major ice sheets
• To use observational evidence from past climates to calibrate the sensitivity of complex climate models and their adjustable parameters
• To attribute signals in the period of the instrumental record in order to understand Earth system processes (from weather scales to those typical of anthropogenic climate change)

The need for longer historical runs, both current-era hindcasts and palaeoclimates, and for longer runs to investigate possible future non-linear changes is evident, and the computing needs scale accordingly.
Investigating climate surprises requires longer simulations of future and past periods at medium to high resolution and with various degrees of complexity. A factor of 10 to 1,000 in computing power is required, which cannot be achieved simply by increasing the number of cores but also demands greater performance from each individual core.
Roadmap
To improve the reliability and capability of climate and Earth system simulations, both physical and software infrastructures are required. Computing power is a strong constraint on the type of problem that can be addressed. At the European level, Tier-0 (international – the PRACE infrastructure), Tier-1 (national) and Tier-2 (institutional) facilities are available. A suitable strategy for Earth system modelling is therefore needed to exploit efficiently all the systems belonging to this European 'computing ecosystem'.
Climate models are fundamentally difficult to scale on supercomputers because the problems they represent are tightly coupled, algorithmically between the components of the Earth system and physically across all spatial dimensions. The significant communication this requires results in growing overheads as the domain decomposition becomes finer. Both capability and capacity computing are therefore important for Earth system modelling. Capability is needed given the long timescales over which every coupled model configuration must be spun up to a stable state. As described above, paleo-studies in particular also require relatively long runs and therefore capability. In both cases, this will remain true as long as no technique for parallelisation in time is available. Higher-resolution simulations, of course, also benefit strongly from capability. In contrast, carrying out control and transient multi-member ensemble runs for the modern climate, starting from the above-mentioned stable state, is a typical capacity problem. For example, producing the set of experiments for IPCC AR5, organised through CMIP5, requires running a large number of simulations (typically a cumulative 10,000 simulated years for each of the ~25 modelling centres) and certainly demands capacity, as all of these runs must be considered part of the same experiment.
These capacity-demanding ensemble-type runs with high-resolution models are generally carried out most efficiently on central HPC systems rather than in a distributed manner. There are applications for which distributed systems would provide good performance, but these generally depend on models with very good portability and relatively low input/output volumes, criteria that Earth system models do not in general fulfil. Systems suited to Earth system modelling therefore need to provide both capability and capacity, with a good balance between compute power, storage system size and read/write efficiency.
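The communication overhead behind this scaling difficulty can be illustrated with a simple surface-to-volume argument (an idealised sketch; the grid size and process counts are arbitrary examples):

```python
# Minimal sketch of the communication-to-computation ratio in a 2D domain
# decomposition: each subdomain computes on its interior points but must
# exchange halo rows/columns with its neighbours at every time step.

def comm_to_comp_ratio(n_global: int, procs_per_dim: int, halo: int = 1) -> float:
    """Ratio of halo points exchanged to interior points computed, for a
    square n_global x n_global grid split over a square process grid."""
    n_local = n_global // procs_per_dim      # points per side of a subdomain
    interior = n_local * n_local             # work per process per step
    halo_points = 4 * halo * n_local         # perimeter exchanged per step
    return halo_points / interior

# The finer the decomposition, the larger the ratio: communication
# eventually dominates, which is why tightly coupled climate models
# scale with difficulty.
for p in (4, 16, 64, 256):
    print(p * p, "processes:", f"{comm_to_comp_ratio(2048, p):.4f}")
# 16 processes: 0.0078 ... 65536 processes: 0.5000
```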
The climate community is only beginning to use the Tier-0 machines available within the PRACE infrastructure (e.g. the UPSCALE project on HLRS, based on an atmosphere-only model at 20 km resolution). Most climate models are executed (e.g. for IPCC scenarios) on Tier-1 national systems, sometimes purpose-built, or on dedicated Tier-2 national machines that are sufficiently tailored towards climate applications, providing, for example, a good balance between computing performance, bandwidth to storage, and storage capacity. The limited scalability of multi-component (e.g. atmosphere, ocean, land, coupler) climate models is only one reason why they have not often been run on Tier-0 machines. Another important aspect is that access requirements for these Tier-0 platforms are based on capability alone rather than on both capability and capacity. A further key reason for this limited use is that running a coordinated set of experiments requires multi-year access to a platform that is stable in terms of hardware and middleware, because the loss of bit-level reproducibility is very costly.
In summary, climate modelling requires:
• Computing platforms offering both capability and capacity, with access requirements that reflect both aspects
• Multi-year access, so that simulations can be carried out on the same Tier-0 machine and in the same environment as used during the model porting and validation phase
• The possibility of multi-year access for multiple modelling groups, to investigate scientific questions through targeted multi-model experiments
• Appropriate mass storage and dissemination mechanisms for the model output data
• Outputs of the coordinated set of experiments easily available for analysis by a wide-ranging community of scientists over a long period
These requirements will be central to the usefulness of the PRACE infrastructure for the climate community. ENES (European Network for Earth System Modelling – http://www.enes.org) has started a collaboration with PRACE aimed at fostering the use of Tier-0 machines by the climate community. PRACE has started with general-purpose machines that serve all research fields. The specific requirements of the climate community, such as appropriate queue structures and access to high-volume data archiving, would be better met in the future by a dedicated world-class machine for Earth system modelling. Such a facility would also allow the production of ensembles of very high-resolution simulations of the future climate, relevant for the development and provision of climate services. It would also align well with the proposed establishment of an Exascale Climate and Weather Science (ECWS) Co-Design Centre, where integrated teams of climate and weather science researchers, applied mathematicians, computer scientists and computer architects would cooperate closely. At the very least, facilities customised for climate models are expected.
2.2.2 Oceanography and Marine Forecasting

Motivation
Progress in ocean science is intricately linked to the computing power available because of the need for increasingly higher model resolutions, many more simulations, and greater complexity in ocean system models. Operational oceanography is a new and rapidly growing sector, providing key
assessments for coastal water quality, fisheries and marine ecosystems, offshore, military, transport, etc. The advent of satellite measurements of sea level (altimetry) and of the global Argo array of profiling floats has led to major breakthroughs by overcoming the historically sparse data coverage in the surface ocean. Yet the subsurface and deep ocean remains drastically under-‐sampled. We must therefore assimilate available data into models and make sure that those models account for the key ocean physical and biogeochemical processes to be able to predict the evolution of ocean characteristics and of marine ecosystems at all relevant scales. A key concern in the ocean, as in the atmosphere, is eddies. Their ubiquitous nature causes them to play a fundamental role in setting the mean circulation, in transporting heat, carbon and other key properties across frontal structures and between basins. Small eddies (sub-‐mesoscale) drive variations in vertical motion that affect nutrient supply and thus ocean biota. Eddies and other non-‐linear ocean processes generate intrinsic low-‐frequency variability that cannot be quantified by present observation systems and that require large-‐ensemble modelling strategies. These key concerns can be addressed only by building upon recent advances in ocean modelling to construct more accurate, high-‐resolution models.
Challenges: description and state of the art
Peta-‐ to exa-‐flops computing capabilities would greatly help resolve three major issues facing the oceanographic research community.
Challenge #1: High-‐resolution ocean circulation models
Spatial resolution is of prime importance because major ocean currents are critically governed by small-‐scale topographic features, such as narrow straits and deep sills, and by energetic small-‐scale eddies. These factors are essentially unresolved in current climate models. Existing eddy-‐permitting (O(10 km) grid) models have now begun to capture eddy processes in the subtropics and mid-‐latitudes (with strong effects, for example, on the Gulf Stream), but much higher resolution is needed to achieve comparable progress in the subpolar and polar oceans. Yet it remains a challenge to run realistic global or regional ocean/sea-‐ice models at resolutions high enough to ensure dynamical consistency over a wide range of resolved scales. The immediate challenge is to use such models in many ensemble simulations to quantify and understand the broadband chaotic variability that spontaneously emerges within the eddying ocean. This will further help quantify associated uncertainties in ocean forecasting and in climate prediction. Going further, we also need to represent scales of O(1 km) over large domains (i.e. at a basin or global scale) as process studies show that features of the scale have significant impacts on larger scales. Another challenge concerns developing data assimilation in eddying ocean models: new computationally efficient methodologies are needed to adequately constrain such models with a combination of a large variety of very high-‐resolution satellite observations (wide-‐swathe altimetry, high-‐resolution SST, ocean colour, SAR images, etc.) and the much more dispersed and under-‐sampled in-‐situ data (e.g. ARGO drifters).
Challenge #2: Carbon fluxes
A key challenge concerns the strong control of ocean physics on ocean biogeochemistry. Ocean biogeochemistry affects air–sea CO2 fluxes, which in turn affect atmospheric CO2 and thus climate. To project future changes in air–sea CO2 fluxes and climate change accurately, a key prerequisite is to be able to simulate adequately decadal and inter-‐decadal variability of the oceanic carbon cycle over the recent past, while separating natural from anthropogenic components. Major improvements are needed to assess adequately large-‐scale carbon transport, regional variations of ocean carbon sources and sinks, and exchanges between the open and coastal oceans. In addition, because ocean circulation in general and small-‐scale physical processes in particular greatly affect ocean carbon, it is critical to pursue these studies with coupled physical/biogeochemical global ocean models at the highest resolution available.
Challenge #3: Understanding and monitoring marine ecosystems
Another great challenge in ocean sciences is to understand the evolution of regional marine ecosystems over seasonal to decadal scales and their sensitivity to a changing environment. For instance, increasing atmospheric CO2 also affects the ocean by reducing ocean pH (ocean acidification), which threatens some marine organisms, especially corals and shell builders. Human influence is also causing a general warming of surface waters (through climate change) as well as reductions of oxygen in subsurface waters. How these multiple stressors interact and will change in the future is a burning question. Accurate models of these interactions would greatly improve the understanding, monitoring and forecasting of marine resources and inform the preservation of our coastal zones. These models will need to simulate biogeochemical cycles and 'blooms' accurately and to build stronger links between fisheries (halieutic) resources and fine-scale ocean circulation features. In the mid-latitudes, ecosystems are strongly affected by vertical eddy velocities that can enhance or impede the supply of nutrient-rich deep water. High resolution is essential, possibly using nested regional models (grid refinement up to 1/100°) embedded within larger-scale models. Explicit resolution of fine-scale processes avoids the use of subgrid-scale parameterisations, which remain inaccurate in critical regions that have remote impacts on basin-scale or global oceanic circulation.
Roadmap
Running ocean/sea-ice circulation models that can resolve the spectrum of ocean dynamics down to the sub-mesoscale (e.g. O(1/100°)) is at present beyond any computing capability, except in very local areas and over short periods. Since a twofold increase in grid resolution requires a tenfold increase in computer power, reaching kilometric resolution at global scale demands a thousandfold increase in computer power. Adding a full carbon cycle model increases the computational cost and storage capacity by a factor of O(5). The highest-resolution global ocean/sea-ice circulation model presently used in Europe for research and operational forecasting (e.g. ORCA12 used in MCS MyOcean) has a grid resolution of 1/12° (i.e. 5 to 10 km). A single 50-year run of this model requires a peak computational power of 25 Tflop/s for a period of 2 months, which represents ~2.5% of the annual peak power available on a Pflop/s computer (or ~10% of a Tier-1 computer, where this model is presently run). The most commonly used eddying ocean models (for operational forecasting, seasonal and climate prediction, and ocean climate variability studies) have a grid resolution of about 1/4° (10 to 25 km) and require, for a 50-year run, a peak computational power of 10 Tflop/s for a period of 20 days.

In the near future, the scientific objectives of the oceanographic community will require, on the one hand, performing series of multi-decadal experiments with O(1/12°) models, ensemble runs of O(50) members with O(1/4°) models, or eddying ocean models of increased complexity including a full carbon cycle. Operational oceanography, on the other hand, urgently needs to develop higher-resolution products for its marine core services and to develop global models with a grid resolution of, for example, 1/24° (a resolution of 5 km at low latitudes and up to 2 km at high latitudes), requiring an increase in computational power by a factor of 10 or more. At the same time, significant progress is needed in data assimilation: accurate ocean simulations require accurate initial conditions, accurate forcing fields and accurate calibration of model parameters. These requirements are even more demanding than modelling in terms of computational power. Moreover, it will be important to include grid refinements, down to 1 km scales, to resolve explicitly the dynamics of specific oceanic regions that are critical for the global circulation, thereby avoiding the need for parameterisations in key regions.

Altogether, the above requirements call for peak computational resources of 500 to 1,000 Tflop/s available for periods of months. This can be obtained only with O(10 to 100) Pflop/s computers coupled to very large storage facilities of O(10 to 100) PBytes that can store the simulation outputs over long periods (at least O(5 years)) for subsequent studies. Finally, support should be made available to train young interdisciplinary scientists to become specialists not only in climate science or HPC but in both. Training should be provided via summer schools and international training networks (e.g. the International Training Network SLOOP (SheLf to deep Ocean mOdelling of Processes) recently submitted to FP7).
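To make the 'power for a period' figures comparable, they can be converted into total operation counts (a rough sketch using only the numbers quoted above, assuming the stated rates are sustained throughout the period):

```python
# Convert "X Tflop/s for a period of Y days" into total operations,
# using the two model configurations quoted above.

SECONDS_PER_DAY = 86_400

def total_flops(tflops: float, days: float) -> float:
    """Total floating-point operations for a sustained rate over a period."""
    return tflops * 1e12 * days * SECONDS_PER_DAY

orca12 = total_flops(25.0, 60)    # 50-year run at 1/12 deg: 2 months at 25 Tflop/s
quarter = total_flops(10.0, 20)   # 50-year run at 1/4 deg: 20 days at 10 Tflop/s

print(f"1/12 deg: {orca12:.1e} flops")   # -> 1.3e+20
print(f"1/4 deg:  {quarter:.1e} flops")  # -> 1.7e+19

# The 1/12 deg run needs ~7.5x more total operations than the 1/4 deg run
# for the same simulated 50 years.
print(f"ratio: {orca12 / quarter:.1f}")
```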
2.2.3 Weather and Air Quality

Motivation
Mitigating high-impact weather through a more accurate and timely nowcasting system requires that the comprehensiveness of the various models used to describe the near-future development of the atmosphere be enhanced dramatically. The resolution of numerical weather prediction models will have to be increased to about 1 km horizontally to resolve convection explicitly. In addition, a probabilistic approach will have to be taken to arrive at meaningful warning scenarios. Similarly, one of the most far-reaching developments results from enhanced capabilities in air quality forecasting. Chemical transport models (CTM), which aim to simulate the physical and chemical processes in the atmosphere, have been used for urban pollution problems. However, operational air quality forecast systems require high spatial resolution, significant computational effort and a large volume of input data. This is even more relevant for new online codes, in which weather and air quality are solved with an integrated approach. The ability to forecast local and regional air pollution events is challenging, since the processes governing the production and sustenance of atmospheric pollutants are complex and non-linear. The availability of increased computational power and the possibility of accessing scattered data online through a cloud infrastructure, coupled with advances in the computational structure of the models, now enable their use in real-time air quality forecasting. Furthermore, this may contribute to the GMES (Global Monitoring for Environment and Security) European initiative, which has stated as a priority objective the deployment of environmental forecasting services. These challenges are currently being tackled, albeit on a much smaller scale, because the available computing power is about three orders of magnitude smaller than that required for the complete solutions. Today it is only possible to solve the problems for a limited number of variables in a limited area. It is known in principle what needs to be done in the future, but the resources are not yet available.
Challenges: description and state of the art
Fundamental questions facing weather and air quality research can be summarised in three key challenges.
Challenge #1: The need for very high-‐resolution atmospheric models and associated forecasting system
The resolution of the models will have to be increased in both time and space to resolve explicitly physical processes that today are still parameterised. To arrive at meaningful results, this work also entails evaluating the error growth due to the uncertainty of the initial and boundary conditions. One option is the computation of multiple scenarios with initial conditions varying within the error space. Furthermore, the data gathered by new, high-resolution observing systems, either space- or ground-based, need to be assimilated using new techniques such as 4DVAR. In addition, the I/O rates of the applications will be of the order of 5 GB/s for the duration of the runs, resulting in files of more than 6 TBytes. In order to test these ideas, a preoperational trial of a complete end-to-end system needs to be carried out, checking whether the integrated system consisting of data acquisition, data assimilation, forecast run and product generation can be handled in a sufficiently short time to allow future operational deployment. Similarly, sets of ensemble forecasts will have to be run as a preoperational test case.
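A quick consistency check on these I/O figures (simple arithmetic on the numbers quoted above):

```python
# Writing more than 6 TBytes at a sustained 5 GB/s implies runs lasting
# on the order of twenty minutes or more.
io_rate_gb_s = 5
file_size_tb = 6
seconds = file_size_tb * 1000 / io_rate_gb_s   # TB -> GB, divided by rate
print(f"{seconds:.0f} s (~{seconds / 60:.0f} min)")  # -> 1200 s (~20 min)
```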
Challenge #2: High-‐resolution assimilated air quality forecast and cloud–aerosol–radiation interaction models
A reliable pan-‐European capability of air pollutant forecasting in Europe with high resolution (1 km) becomes essential in informing and alerting the population, and in the understanding of when and why episodes of air pollution arise and how they can be abated. It is well known that the accuracy of an air quality forecasting system depends dramatically on the accuracy of emission data. It is thus timely to develop a system using the four-‐dimensional variational analysis (4DVAR) to improve simulations of air quality and its interactions with climate change. 4DVAR is an improved way of
combining observations valid at different times (satellite data, radiosondes, ground observations, aircraft measurements, photometer data, model data) with background fields from a previous forecast to create a starting point for a new forecast.
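Formally, in its standard textbook form (included here for orientation; the notation is generic and not taken from this report), 4DVAR seeks the initial state $x_0$ that minimises

$$
J(x_0) = \frac{1}{2}\,(x_0 - x_b)^{\mathrm T}\,\mathbf{B}^{-1}\,(x_0 - x_b)
\;+\; \frac{1}{2}\sum_{k=0}^{K} \bigl(H_k(x_k) - y_k\bigr)^{\mathrm T}\,\mathbf{R}_k^{-1}\,\bigl(H_k(x_k) - y_k\bigr),
\qquad x_k = M_{0 \to k}(x_0),
$$

where $x_b$ is the background field from the previous forecast, $y_k$ the observations within the assimilation window, $H_k$ the observation operators, $M_{0 \to k}$ the forecast model, and $\mathbf{B}$ and $\mathbf{R}_k$ the background and observation error covariance matrices.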
4DVAR is computationally very expensive (considerably more expensive than the forecast itself) and also requires a great deal of memory. In order to provide high-resolution air quality forecasts and improved weather forecasts, an important unresolved question is the role of aerosols in modifying clouds, precipitation and the thermal structure of the atmosphere. Under natural conditions, dust generates complex feedbacks within atmospheric processes: an increased dust load modifies the thermal and dynamic structure of the air; the modified atmosphere in turn changes the conditions for dust uptake from deserts, and so on. In a similar way, other aerosols may act as cloud condensation nuclei and affect precipitation processes. In current-generation models, these processes are still highly simplified. Several efforts in this area must be advanced with the help of supercomputers.
Challenge #3: Develop pan-‐European short-‐range weather and air quality modelling systems
Weather and air quality models are commonly used in forecast mode. Further improvements are still needed, and these require more computing power. Such improvements include: the representation of clouds and small-scale processes; the coupling of the biosphere, atmosphere, aerosols, clouds and chemical reactions in the environment; data assimilation; and the estimation and computation of different sources of emissions. In addition to the requirements of the previous sections, this requires including more physics and more chemical species in the models. Short-range and very high-resolution models need observational data and subsequent data assimilation, and this assimilation of new data must also work as a real-time application. This increases the computing requirements by a factor of 10.
Roadmap
The national computing resources for solving the challenges outlined above are expected to grow over the coming years. This growth will allow a gradual increase in complexity and resolution of the various models used and will result in narrowing the gap between what is possible at any given time and what is required. The work will be carried out by collaboration between established scientific communities consisting of the European National Meteorological Services and the European Centre for Medium-‐range Weather Forecasts (organised in the European Meteorological Infrastructure), collaborating universities, and scientific research centres. The European partners have collaborated over many years, often supported by EU projects.
A complete pan-European weather and air quality forecast requires high resolution over a wide area, a complete description of the meteorology, gas-phase and aerosol chemistry, and their transport and coupling with the other components of the Earth system, such as the emission of mineral dust. Integrated air quality forecasts for the area of all EU Member States at the highest feasible resolution (1 km) require extensive computing resources, as each of the applications proposed will require 100–300 Tflop/s sustained performance. The work will be carried out by established scientific communities, alongside other initiatives existing in the Member States at local or national level. The most relevant initiative is the GMES project, a joint initiative between the EU and ESA to strengthen the acquisition and integration of high-quality EU environmental, geographical and socio-economic data, which will help improve policymaking from the local to the global level.
2.2.4 Solid Earth Sciences

Motivation
Because solid Earth processes occur on many different spatial and temporal scales, it is often convenient to use different models. A key issue is to better identify and quantify uncertainties, and estimate the probability of extreme events through simulation of scenarios and exploration of parameter spaces. For some problems, the underlying physics is today adequately understood and the main limitation is the amount of computing and data capabilities available. For other problems, a
new level of computing and data capabilities is required to advance our understanding of underlying physics where laboratory experiments can hardly address the wide range of scales involved in these systems, for example modelling and simulating earthquake dynamics rupturing processes together with high-‐frequency radiation in heterogeneous Earth.
The solid Earth community is preparing itself for massive use of supercomputers through the current (re-)organisation of some of its communities in large-scale EU projects, including the ESFRI project EPOS (http://www.epos-eu.org), the FP7-Infrastructure project VERCE (http://www.verce.eu), the Marie Curie ITN initiative QUEST (http://www.quest-itn.org), and other initiatives such as SHARE, TOPOEurope, TOPOMod and MEMoVOLC. The need for leadership-class data-intensive computing is illustrated by the four major challenges outlined below.
Challenges: description and state of the art

Fundamental questions facing solid Earth sciences research can be summarised in four key challenges.

Challenge #1: Earthquake ground motion simulation and seismic hazard
To understand the basic science of earthquakes and to help engineers better prepare for such events, scientists need to identify which regions are likely to experience the most intense ground shaking, particularly in populated sediment-‐filled basins. This understanding can be used to improve building codes in high-‐risk areas and to help engineers design safer structures, potentially saving lives and property. In the absence of deterministic earthquake prediction, the forecasting of earthquake ground motion based on simulation of scenarios is one of the most promising tools to mitigate earthquake-‐related hazard. This requires intense modelling that meets the actual spatio-‐temporal
resolution scales of the continuously increasing density and resolution of the seismic instrumentation, which records dynamic shaking at the surface, as well as of the basin models. Another important issue is to improve our physical understanding of earthquake rupture processes and seismicity. Large-scale simulations of earthquake rupture dynamics and of fault interactions are currently the only means to investigate these multiscale physics, together with data assimilation and inversion.

Figure 2.2. Top left: Simulation of the seismic wavefield generated by the L'Aquila earthquake (Italy) on 6 April 2009; snapshots at 6 s, 11 s, 16 s and 21 s after the event (vertical displacement, up/down as red/blue). Bottom left: Mesh discretisation of the L'Aquila region for high-frequency wave simulations. (Courtesy E. Casarotti and F. Magnoni, see Peter et al., 2011.) Right: Visualisation of the magnetic state in the Earth's liquid core during an inversion of the magnetic polarity using the Dynamical Magnetic Field Line Imaging method. (Courtesy J. Aubert, see Aubert et al., 2008.)
High-‐resolution models are also required to develop and assess fast operational analysis tools for real-‐time seismology and early warning systems. Earthquakes are a fact of life in Europe and all around the world. Accurate simulations must span an enormous range of scales, from metres near the earthquake source to hundreds of kilometres across the entire region, and timescales from hundredths of a second – to capture the higher frequencies, which have the greatest impact on buildings – to hundreds of seconds for the full event. Adding to the challenge, ground motion depends strongly on subsurface soil behaviour. While providing much useful information, today’s most advanced earthquake simulations are generally not capable of adequately reproducing the observed seismograms. The likely reason is that these models are based on a number of assumptions made largely to reduce the computational effort and on the often poor knowledge of the medium at the scale-‐length target of the waveform modelling. There is an urgent need to enhance these simulations and to improve model realism by incorporating more fundamental physics into earthquake simulations. The goal is to:
• Extend the spatial dimensions of the models by a factor of 10
• Increase the highest resolved frequency to above 5 Hz (for structural engineering purposes), implying a 64-fold increase in computational size; size scales roughly as the cube of the resolved frequency (see the note after this list)
• Move to more realistic soil behaviours, implying at least a two-orders-of-magnitude increase in computational complexity
• Incorporate a new physics-based dynamic rupture component at 100 m resolution for realistic wave radiation and near-field risk assessment, implying at least an order-of-magnitude increase in computation
• Invert for both the earthquake source and the geological parameters, which necessitates repeated solutions of the forward problem, leading to an increase of one to two orders of magnitude in computation
• Perform stochastic modelling of seismic events and wave propagation to quantify uncertainties and explore earthquake scenarios, implying a 10–50 times increase in computation
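The cubic scaling quoted in the second item follows from a wavelength argument (a standard estimate, assuming a fixed number of grid points per wavelength):

$$
N_{\text{grid}} \;\propto\; \left(\frac{f\,L}{c_{\min}}\right)^{3},
$$

where $f$ is the highest resolved frequency, $L$ the model dimension and $c_{\min}$ the slowest wave speed; quadrupling $f$ therefore multiplies the grid size by $4^{3} = 64$. The total operation count grows faster still, since the stable time step shrinks roughly in proportion to $1/f$, adding approximately another factor of $f$.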
These improved simulations will give scientists new insights into where strong ground motions may occur in the event of such an earthquake, which can be especially intense and long-‐lasting in sediment-‐filled basins.
The state of the art can be split into two categories. First, problems for which the underlying physics is adequately understood and the challenges arise mainly from computational limitations (e.g. the simulation of elastic wave propagation in strongly heterogeneous geological media remains a computational challenge even for the new Pflop/s technology). Second, problems for which high-performance computing resources are required to advance scientific understanding – modelling earthquake dynamic rupture processes together with high-frequency radiation in heterogeneous media is an example of this type of problem. Fully coupled extended earthquake dynamics and wave propagation will remain a grand challenge problem even with the next generation of computers.
Challenge #2: High-‐resolution imaging techniques
The capacity for imaging accurately the Earth’s subsurface, on land and below the sea floor, is one of the challenging problems that have important economic applications in terms of resource management, identification of new energy reservoirs and storage sites as well as their monitoring
through time. As recoverable deposits of petroleum become harder to find, the costs of drilling and extraction increase; the need for more detailed imaging of underground geological structures has therefore become obvious. Recent progress in seismic acquisition, related to dense networks of sensors and data analysis, now makes it possible to extract new information from the fine structure of the large volumes of recorded signals associated with strongly diffracted waves. This increase in data acquisition is also important for risk mitigation. Accurately imaging seismic rupture evolution on complex faulting systems embedded in a heterogeneous medium, or time-lapse monitoring of volcanoes, proceeds in a similar fashion.
Seismic imaging of the Earth's subsurface has important implications in terms of energy resources and environmental management. While deep-ocean (1,000–2,000 m) fossil energy resources are coming under extraction, investigations in complex tectonic zones such as foothills structures are crucial because these zones are expected to host reservoirs of future economic interest. With the advent of high-resolution and large-dynamic-range instrumentation, the challenge is now to exploit fully the fine details of the recorded signals, going beyond the first-arrival waves and exploring late-arriving signals associated with strongly and possibly multiply diffracted waves. This will open new perspectives in very complex geological settings, as well as the capacity to monitor through time waste disposal sites or reservoirs during their exploitation and, for example, possible ascents and descents of magmas within volcanoes.
Differential and time-lapse seismology, migration and correlation methods are being explored today to extract this detailed information. In these imaging techniques, only adjoint methods related to linearised techniques are tractable to date, and back-projection is the mathematical tool for the image reconstruction. Adjoint methods allow only a local analysis of resolution and uncertainties. Semi-local analysis will require simulated annealing or genetic algorithms, leading to a drastic increase in computer resources that we cannot yet foresee, not to mention exhaustive inspection of the model space with importance sampling strategies. Because thousands of forward problems must be solved in an iterative optimisation scheme, in proportion to the number of sources and receivers, techniques must be investigated for solving these forward problems efficiently in a combined way. Moreover, in the forward model, new models must accurately simulate complex wave propagation phenomena such as reflection and diffraction in heterogeneous media with high impedance contrasts, or diffraction by rough topographies at the surface of the Earth or at the bottom of the sea, at very high frequencies (10–40 Hz) where complex attenuation is expected. The bridge between deterministic estimations and probabilistic approaches should be clearly identified and will justify the demanding task of performing wave propagation modelling.
Challenge #3: Structure and dynamics of the Earth’s interior
One of the major problems facing Earth scientists is to improve the resolution and the understanding of the Earth’s interior structure and dynamics. Broadband seismological data volumes are increasing at a faster rate than computational power, challenging both the analysis and the modelling of these observations. This progress is thanks to the federation of digital seismic networks and to the notable presence of the European Integrated Data Archive(s) (i.e. ORFEUS and EIDA) which, irrespective of the specific archive to which the data request is submitted, provides data contained in all the federated archives. So far, only a small fraction of the information contained in broadband seismograms is actually used to infer the structure of the Earth’s interior. Recent advances in high-‐performance computing and numerical techniques have facilitated three-‐dimensional simulations of seismic wave propagation at unprecedented resolution and accuracy at regional and global scales. The realm of Pflop/s computing opens the door to full waveform tomographic inversions that make use of these new tools to enhance considerably the resolution of the Earth’s interior image. This is a grand challenge problem due to the large number of mesh-‐dependent model parameters and of wave propagation simulations required during the inversion procedure.
Convection of the solid Earth's mantle drives plate tectonics and the Earth's thermal evolution. Mantle convection is dominated by slow viscous creep, involving timescales of hundreds of millions of years. Despite the low velocities, the Rayleigh number is of the order of 10^7, inducing highly time-dependent dynamics and convective scales small compared to the size of the domain. One computational challenge is thus the resolution of convective features of less than 100 km over a spherical domain of depth 2,900 km and circumference 40,000 km. Another challenging issue is resolving the rapid spatial variations of the physical properties: viscosity depends strongly on temperature, pressure and stress (e.g. six orders of magnitude with temperature and two orders of magnitude with depth). Incorporating melt-induced compositional differentiation, self-consistent plate-like behaviour (elastic-brittle) and compositional solid–solid phase changes is extremely difficult and computationally demanding. How plate tectonics arises from mantle convection remains an outstanding issue.
Seismology is the only method that can probe the Earth's interior from the surface to the inner core, as well as its external coupling with the atmosphere and the oceans. Improving the capability to enhance the quality of 3D tomographic images of the Earth's interior, with a resolution of the thermal and chemical heterogeneities finer than tens of kilometres, using the continuously increasing data sets of broadband seismological records, is today essential to improve core–mantle dynamical models and our knowledge of the Earth's physics. This is also an essential step towards improving the imaging of earthquake rupture processes using both regional and tele-seismic observations. Solid Earth internal dynamical processes often take place on scales of tens to millions of years. Even with the most advanced observational systems, the temporal sampling of these phenomena is poor; to understand these systems, simulations must be carried out concurrently with observations. Mantle convection provides the driving force behind plate tectonics and the geological processes that shape our planet and control sea level. Realistic models of thermo-mechanical mantle convection in 3D spherical geometry are required to better assimilate mineral physics and seismology information into deep-Earth dynamics. The short-timescale dynamic behaviour will serve as the monitor for the stress build-up that loads seismically active regions.
Numerical 3D simulation of wave propagation at regional and global scales has been achieved recently at unprecedented resolution and will continue to improve in the next few years. Using these new developments for non-‐linear arrival time and waveform inversions will lead to a revolution in global and regional tomography in the next decade. Even in the realm of Pflop/s computing, this seems an extraordinary computational challenge when facing the hundreds or thousands of model parameters involved here. Taking advantage of the fact that adjoint calculations and time-‐reversal imaging are quite straightforward in seismic inverse problems opens new doors for efficiently computing the gradient of the misfit function and to developing new scalable algorithms for seismic inversions. Full waveform inversion is today’s very challenging data-‐intensive application, requiring well-‐balanced HPC architectures.
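In outline (a generic formulation of adjoint-based waveform inversion, included for orientation; the notation is ours, not the report's), the tomographic problem minimises a misfit of the form

$$
\chi(\mathbf m) \;=\; \frac{1}{2} \sum_{s,r} \int_0^T \bigl\| \mathbf u_s(\mathbf x_r, t; \mathbf m) - \mathbf d_{s,r}(t) \bigr\|^{2} \, dt,
$$

where $\mathbf m$ is the Earth model, $\mathbf u_s$ the simulated wavefield for source $s$ and $\mathbf d_{s,r}$ the seismogram recorded at receiver $r$. The property exploited by adjoint methods is that the gradient $\nabla_{\mathbf m}\chi$ can be computed from just two simulations per source – one forward, one adjoint with the time-reversed residuals injected at the receivers – rather than one simulation per model parameter.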
Three-‐dimensional numerical simulations of mantle convection with both chemical and thermal buoyancies are today performed both in Cartesian and spherical shell geometries. Numerical simulations that include both melt-‐induced compositional differentiation and self-‐consistent plate tectonics-‐like behaviour have been performed only in two dimensions and in small 3D Cartesian geometries. Incorporating melt-‐induced compositional differentiation, self-‐consistent plate-‐like behaviour (elastic brittle) and composition solid–solid phase changes in high-‐resolution spherical shell models is today a challenging problem that can be addressed only in the realm of Pflop/s computing.
Challenge #4: Generation of the Earth’s magnetic field
The generation of the Earth's magnetic field has been named one of the enigmas of the natural sciences, and the extremely involved magneto-hydrodynamic simulations of the core dynamics and the associated external magnetic field are essential for progress in this field. The past seven years have seen significant advances in computational simulations of
convection and magnetic-‐field generation in the Earth's core. Although dynamically self-‐consistent models of the geodynamo have simulated magnetic fields that appear in some ways quite similar to the geomagnetic field, none is able to run in an Earth-‐like parameter regime because of the considerable spatial resolution that is required.
The history of the Earth's magnetic field variations is engraved in the frozen-in field directions found in most volcanic rocks on Earth (e.g. oceanic crust generated at the spreading ridges). Many of these observable directions are used to derive plate motions in recent times, and it is important to understand the constraints on these estimates, particularly in times with frequent reversals. On a shorter timescale, it is important to understand the phenomenology of magnetic-field reversals, not least because the field strength is currently decreasing steadily, with some likelihood of a reversal over the next few thousand years. Understanding the generation of the Earth's magnetic field is not only crucial for geophysics; it has strong implications in astrophysics for understanding the magnetism of planets and stars. Moreover, the geodynamo is one of the challenges of non-linear physics.
While the relevant programs are implemented in parallel, much higher resolution is required to approach natural conditions. In addition, many realisations are necessary to obtain stable results for highly non-linear processes with strong dependence on initial and boundary conditions. No global convective dynamo simulation has yet been able to afford the spatial resolution required to simulate turbulent convection, which surely must exist in the Earth's low-viscosity liquid core. They have all employed greatly enhanced eddy diffusivities to stabilise the low-resolution numerical solutions and to account crudely for the transport and mixing by the unresolved turbulence. A grand challenge for the next generation of geodynamo models is to produce simulations with the thermal and viscous (eddy) diffusivities set no larger than the actual magnetic diffusivity of the Earth's fluid core, while using the core's dimensions, mass, rotation rate and heat flow. Another challenge is to develop new, highly parallel, adjoint-based assimilation methods for understanding and predicting the evolution and fluctuations of the Earth's magnetic field. Ensemble-based methods that exploit massively parallel architectures are the next challenging data-intensive HPC applications.
Roadmap
Nowadays, simulations of seismic wave propagation in small geological volumes have reached sustained Tflop/s production, and state-of-the-art cases are rising into the realm of Pflop/s computing. Typically, for heterogeneous geological basins of dimensions 300 km x 300 km x 80 km at 200 m wavelength resolution, a high-frequency simulation requires O(1K–10K) processors, O(10–100) TBytes, and O(1–10) hours to complete, with complex, intensive data movement between compute nodes, disk and archival storage.
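For orientation, the storage scale of such a basin model can be estimated directly from the quoted dimensions (a minimal sketch; the bytes-per-grid-point figure is a hypothetical placeholder covering fields, material properties and time-history buffers):

```python
# Rough size estimate for the basin-scale simulation quoted above:
# 300 km x 300 km x 80 km discretised at 200 m resolution.

def grid_points(dims_km: tuple, dx_m: float) -> int:
    """Number of grid points for a box of given dimensions and spacing."""
    nx, ny, nz = (int(d * 1000 / dx_m) for d in dims_km)
    return nx * ny * nz

n = grid_points((300, 300, 80), 200)     # 1500 x 1500 x 400 = 9.0e8 points
print(f"{n:.1e} grid points")

# At an assumed ~1 kB per point, one model snapshot lands near 1 TByte;
# the O(10-100) TBytes quoted above then corresponds to keeping many
# wavefield snapshots and runs.
print(f"~{n * 1000 / 1e12:.1f} TBytes per snapshot")
```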
However, these simulations remain limited: the resolved frequencies are still too low for the seismic engineering applications (> 5 Hz); important non-‐linear effects and complex soil behaviours are not yet taken into account; earthquake sources remain simplistic; and uncertainty quantification and data assimilation are not yet reachable. Enhancing the resolution and the physics, making inversion of extended earthquake sources and seismic parameters, and quantifying the uncertainties through strong motion scenarios will push these simulations into the realm of Pflop/s computing. A key requirement will be the provision of large-‐scale end-‐to-‐end data cyber infrastructures to handle, analyse and visualise PBytes of simulated data for storage. Visualisation of very large data sets will be a related important challenging problem.
The time formulation (e.g. wave propagation in time) allows handling 3D imaging problems at the expense of computer time using present-‐day algorithm technology. For simulations on boxes of 100 km x 100 km x 25 km, with 10 Hz content, sustained performances range between 10 Tflop/s and 100 Tflop/s. Moving to more powerful resources will increase the size of the box and/or the maximum frequency. Pflop/s computing will improve seismic imaging resolution by using thousands of recorded seismograms.
Present-day numerical algorithm know-how must be improved to handle the biases expected from our initial guess of the Earth's image. Pflop/s computing will also allow us to tackle the important problem of uncertainties by making use of repeated forward modelling. The frequency formulation (e.g. the Helmholtz equation) allows efficient image processing, using a parallel direct algebraic solver, by choosing only a few frequencies and by speeding up multi-source and multi-receiver computations. The 3D imaging problem in the frequency domain is today a challenge both for computer resources and for numerical algorithms, owing to the lack of an efficient large-scale parallel direct algebraic solver. The realm of Pflop/s computing will make such seismic imaging in the frequency domain possible. This will require access to a large memory-to-processor ratio, efficient algorithms for the direct decomposition of very large matrices, and optimised parallel and sequential I/O.
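For reference, the frequency-domain formulation alluded to above solves, for each selected angular frequency $\omega$, a Helmholtz problem of the form (generic notation, not taken from this report):

$$
\nabla^{2}\hat u(\mathbf x, \omega) \;+\; \frac{\omega^{2}}{c(\mathbf x)^{2}}\,\hat u(\mathbf x, \omega) \;=\; -\hat s(\mathbf x, \omega),
$$

where $c(\mathbf x)$ is the local wave speed and $\hat s$ the source term. Once the large sparse matrix arising from its discretisation has been factorised by a direct solver, solutions for many sources and receivers come at the marginal cost of triangular solves, which is what makes the frequency formulation attractive for multi-source imaging despite the memory demands of the factorisation.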
Achieving load balancing between processors in the frequency-domain approach will be a challenge. Unfortunately, resorting to iterative methods diminishes the appeal of a frequency formulation compared to the time-domain formulation. Seismic Data Processing (SDP) is of paramount importance for imaging underground geological structures and is used all over the world to search for petroleum deposits and to probe the deeper portions of the Earth. Current advances in data acquisition and in multi-component and multi-attribute analysis have increased data volumes severalfold. Processing methods have also changed to achieve higher resolution, leading to an increase in the computational effort that is beyond the reach of current computer resources. Large data volumes and complex mathematical algorithms make seismic data processing an extremely compute- and I/O-intensive activity that requires high-performance computers (1–10 Tflop/s sustained) with large memory.
Global wave simulation of body-wave phases that explore the Earth's core is today a Pflop/s challenge problem. It requires global wave simulations at periods of 1 second or less and spatial resolutions at wavelengths of tens of kilometres, in 3D anelastic Earth models including high-resolution crustal models, topography and bathymetry together with rotation and ellipticity. Today, front-end global seismology simulations run at wavelengths of tens of kilometres and typical periods down to 1–2 seconds in 3D Earth models, on hundreds of thousands of cores with a sustained performance of ~200 Tflop/s. The next generation of forward global wave simulations will target periods below 1 second and will require hundreds of TBytes of memory and 1 Pflop/s sustained performance. Another great challenge will be the adjoint-based inversion of complete waveforms using these 3D wave propagation simulation models. This will lead to at least an order-of-magnitude increase in the computational requirements.
The first models able to simulate magnetic fields quite similar to the geomagnetic field ran on Gflop/s technology, at the price of subtle compromises: modified equations and parameter sets far from Earth conditions. In 2005, results for the same set of parameters were obtained without modifying the physics, using 512 processors of the 38.6 Tflop/s Earth Simulator for 6,500 hours. For the first time, a dynamo was obtained with a viscous moment that is small compared with the magnetic one. The challenge is now to achieve the balance relevant for the dynamics of the Earth's core, in which both moments are vanishing. Massive access to Pflop/s computing will allow European researchers to investigate the mechanism of these dynamos (only obtained in 2005) and understand their physical principle. Yet the parameters attainable on such resources are still a factor of 1,000,000 away from the actual geophysical values. Progress achieved over the last few years clearly indicates that an Earth-like solution (for which both moments vanish) could be reached by decreasing the relevant parameter (controlling viscous effects) by a factor of only 1,000. Constructing such an Earth-like numerical dynamo model is therefore realistic only in the realm of Pflop/s computing. When such simulations become available, the critical scientific issue will be to interpret the dynamical models within the framework of dynamo theory. This will require PBytes of storage to describe the 4D (time and space) magneto-hydrodynamic solution. Another great challenge is to develop highly parallel ensemble-based assimilation methods to predict the evolution and changes of the Earth's magnetic field, which will require new PByte-scale capabilities combining data- and CPU-intensive architectures.
2.3 A Roadmap for Capability and Capacity Requirements
The computational requirements for the weather, climatology and solid Earth sciences applications discussed in this document have one common feature: the urgent need for access to very large computational resources does not stem from a single aspect, such as the need to model a larger number of objects or to model at a higher resolution. Currently available compute power restricts these applications in several ways simultaneously. For example, the envisaged advanced climate studies require, all at once, higher resolutions, a more sophisticated representation of processes, and ensemble methods to quantify uncertainty.
Very similarly, earthquake and ground-motion modelling requires higher resolutions, a more sophisticated representation of the physical processes of earthquake source dynamics, and quantification of uncertainties in strong-motion scenarios. The need to improve multiple aspects of an application at once implies very high computational requirements, typically a factor of 1,000 above what can be run today on the top computational facilities installed in Europe. In absolute terms, the performance requirements of these applications range from 100+ Tflop/s sustained to 1 Pflop/s sustained, with some applications having even higher longer-term requirements. Since a large number of such applications are concerned (see the many challenges described above), the total sustained performance required is in the range of 10+ to 100 Pflop/s.
The ratio between sustained and peak performance varies from application to application; in the past, typical factors of 1:10 for scalar architectures and 1:3 for vector-processor-based systems were quoted for weather, climatology and solid Earth sciences. However, vector systems have become less efficient and are effectively absent from the technology landscape. Moreover, given the extreme parallelism of the envisaged Eflop/s systems with millions of cores, such performance ratios can be sustained only if the application programs are modified to exploit that parallelism.
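For concreteness, the conversion from the sustained requirement above to peak capacity, using the historical 1:10 scalar ratio quoted in the text (a sketch; real sustained fractions vary widely by application):

```python
# Upper end of the 10-100 Pflop/s sustained requirement, converted to peak
# with the historical 1:10 sustained-to-peak ratio for scalar architectures.
sustained = 100e15                  # flop/s, sustained (from the text)
peak = sustained * 10               # 1:10 sustained:peak ratio (from the text)
print(f"required peak: {peak/1e18:.0f} Eflop/s")   # -> 1 Eflop/s, i.e. exascale
```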
The peak performance requirement is therefore in the exascale range when considering a scalar architecture. It should be stressed, however, that for many applications, such as the medium-resolution climate models used for paleo simulations, strong scaling would be required to an extent that does not seem feasible; these applications would only benefit from increased per-core performance (e.g. powerful new vector-type architectures), and even new mathematical algorithms might be required.

Due to the high internal communication requirements of the applications and the continuous need to modify and enhance the model codes, a general-purpose computing system offering excellent communication bandwidth and low latency between all processors is required. For most WCES applications, the amount of computer memory required is not higher than in other scientific disciplines; however, studies of the structure and dynamics of the Earth's deep interior, and high-resolution seismic inversion, will require memory sizes approaching 100 TBytes. To ensure efficient utilisation of the system, an I/O subsystem that supports high transfer rates and provides substantial amounts of online disk storage (at least 1+ PBytes today and 10+ PBytes in the 2016 timeframe) is essential. This online storage needs to be complemented by local offline storage (at least 10+ PBytes) so that inputs and outputs can be stored for up to 12 months.

A possible long-term storage strategy would be for each community to develop its own distributed but shared database system based on data-grid technology. The long-term archive could then be held at national facilities. Most of this archive would be communal data available to other researchers rather than private data. Depending on the community, these archives would hold between 20 and 100 PBytes of data. To implement a grid-based distributed archive system, high-speed network links between the European resources and the larger national facilities would be a fundamental requirement. Such strong links with national facilities would enable the bulk of the pre- and post-processing, as well as visualisation and analysis of model outputs, to be carried out at these facilities.
2.4 Expected Status in 2020

Today, all successful development projects for sustained Pflop/s applications have in common that major portions of the application codes had to be rewritten (software refactoring) and algorithms had to be re-engineered for the applications to run efficiently and productively on novel architectures.
Software must be redesigned to meet these requirements and adapted to exploit coming supercomputing architectures effectively, in particular their extreme parallelism. Furthermore, ways must be found to cope with the expected high failure rate of components. Reaching such an ambitious target demands a deep synergy between HPC experts and application developers from the communities, and a strong commitment from the scientific side.
This principle is not limited to the high end of supercomputing but applies to all tiers of the HPC ecosystem. Compute-node architecture follows current technology trends towards more parallelism and more customisation: compute nodes will integrate thousands of cores, some of which will serve as computational accelerators (e.g. GP-GPU, Intel MIC, etc.). These trends pose major challenges to which software needs to respond swiftly and effectively. However, the effort involved in software refactoring can be substantial and often surpasses the abilities of individual research groups. Such effort also requires deep knowledge of the algorithms and the codes and, in many cases, even an understanding of the physics behind the numerical algorithms. A successful practice is therefore to place these activities in community projects, where substantial code development is common and long-term software development for high-end computing can be sustained.
Supercomputer centres, with their profound expertise in computing architectures and programming models, have to recast their service activities in order to support, guide and enable scientific program developers and researchers in refactoring codes and re-engineering algorithms, influencing the development process at its root. The resulting codes will fit both Tier-1 and Tier-0 allocation schemes, and the particular requirements of the users of these community codes will determine whether to apply for Tier-1 or Tier-0 resources. To be effective, the computer services should be provided over longer periods without requiring users to change their codes.
Due to the complexity of these novel HPC environments and the relevance of the scientific challenges, interdisciplinary teams and training programmes will be strongly required. Training programmes will allow WCES scientists to improve their HPC background and to establish stronger links between the HPC community and their own domain. In this respect, funding specific actions to support training activities, summer/winter schools, intra-European fellowships, and international incoming and outgoing fellowships will play a strategic role in preparing new scientists with a stronger, more interdisciplinary background. Given the expected increase in the complexity of the component models and of future exascale computing platforms, substantial resources should be devoted to the technical aspects of coupled climate modelling; the coupler development teams should be reinforced with computing-science experts who at the same time remain very close to the climate modelling scientists.
Appendix A: WCES Challenges and Expected Status in the Time Frame 2012–2020
Time frame: 2012 / 2016 / 2020

HPC system peak performance: 2012 baseline / x10 / x100

Climate

Climate Extreme Events, Impacts, Quantifying Uncertainties
– 2012: High-resolution experiments (~20 km, few decades), atmosphere-only ensembles; small decadal ensembles coupled to a 1/4° ocean; ensembles of multi-model, multi-experiment runs at medium resolution (~100 km) with coupled climate models.
– 2016: Ensembles of multi-model, multi-experiment high-resolution (~20 km) centennial experiments with coupled climate models (1/4° ocean); small decadal ensembles coupled to a 1/12° ocean; atmosphere-only at higher resolution (~7 km).
– 2020: Ensembles of multi-model, multi-experiment very high-resolution (~7 km) experiments with (eddy-resolving) coupled climate models; atmosphere-only at ~1 km.

Climate Prediction
– 2012: Seasonal multi-model (~3), multi-member (~50) global predictions made on different HPC systems at ~125 km resolution; decadal multi-model (~4), multi-member (~10) hindcasts/predictions at ~125–250 km coupled resolution.
– 2016: Equivalent seasonal-to-decadal predictions routinely at ~50 km coupled resolution, with ~50 members (seasonal) and 20 (decadal); improved physics and increased vertical resolution (especially stratosphere and upper ocean); testing of ~15–25 km seasonal-to-decadal predictions in multi-model/multi-member configurations.
– 2020: Seasonal-to-decadal predictions at ~10 km resolution (ensemble size 25–50 members per model); increased vertical resolution and advanced physics.

Regional Climate Modelling
– 2012: ~25 km standard multi-GCM, multi-RCP transient downscaling of centennial projections; limited number of ~10 km multi-GCM downscalings of centennial projections over Europe.
– 2016: ~10 km standard for multi-GCM centennial dynamical downscaling at the European scale; limited downscaling at ~5 km resolution and some smaller-domain downscaled projections at 1–2 km resolution (cloud resolving).
– 2020: ~2 km downscaling at the European scale.

Climate Earth System Modelling
– 2012 / 2016 / 2020: see Appendix B, 'Key Numbers for Climate Earth System Modelling'.

Paleo Climate (e.g. Holocene Simulations or Glacial Cycles) and Climate Surprises³²
– 2012: Coupled ocean–atmosphere model at 200–500 km resolution; possible to simulate O(1,000) years per (calendar) year.
– 2016: Additional components need to be added; increasing resolution by a factor of 2 is desirable, but the priority is decreasing turnaround time at the given resolution.
– 2020: Simulate several 10,000 years per year with resolution O(100 km) and a full-blown ESM.

³² Thermohaline circulation slow-down in the North Atlantic; rainforest changes and/or boreal forest changes and carbon uptake changes; ocean stability and ocean C uptake changes; sudden ice-sheet loss and sea level; permafrost melt and methane release.
Oceanography

HPC system peak performance: 2012 CINES, 300 Tflop/s / x10 / x100

Ocean Climate Variability
(Colour key: blue = physical, white = sea-ice, green = biological/biogeochemical. Grid resolutions: 1/4° = 27 km to 9 km; 1/12° = 9 km to 3 km; 1/24° = 4.5 km to 1.5 km.)
– 2012: Eddy-resolving (1/12°) multi-decade simulations of the blue&white global oceans; eddy-permitting (1/4°) centennial simulations of the blue&white global oceans; multi-decade O(1 km) simulations of the coastal and regional blue oceans.
– 2016–2020: Eddy-resolved (1/24°) multi-decade simulations of the blue&white global oceans; eddy-resolving (1/12°) multi-decade ensembles or multi-centennial simulations of the blue&white global oceans; eddy-resolving (1/12°) pluri-annual simulations of the blue&white&green global oceans; eddy-permitting (1/4°) multi-centennial simulations; pluri-annual O(100 m) simulations of the coastal and regional blue oceans.

Ocean Monitoring and Forecasting
– 2012: Eddy-resolving (1/12°) analyses and forecasts of the blue&white global oceans; eddy-permitting (1/4°) reanalyses of the global blue&white oceans; sub-mesoscale eddy-permitting (1/36°) analyses and forecasts of the blue&white coastal/regional oceans.
– 2016: Eddy-resolved (1/24°) analyses and forecasts of the blue&white oceans; eddy-resolving (1/12°) reanalyses of the global blue&white oceans; eddy-permitting (1/4°) reanalyses of the global blue&white&green oceans; sub-mesoscale eddy-permitting (1/36°) analyses and forecasts of the blue&white&green coastal/regional oceans; sub-mesoscale eddy-resolving O(1 km) analyses and forecasts of the blue&white coastal/regional oceans.
– 2020: Eddy-resolved (1/24°) analyses and forecasts of the blue&white oceans; sub-mesoscale eddy-resolving O(1 km) analyses and forecasts of the blue&white&green coastal/regional oceans; eddy-resolving (1/12°) reanalyses of the global blue&white&green oceans.
Solid Earth Sciences

Earthquake Ground Motion Simulation and Seismic Hazard
– 2012: High-resolution earthquake dynamic rupture and radiated-wave simulation models (f ~4 Hz, l ~100 m); ground motion simulation up to 4 Hz in complex geological basins with strong impedance contrasts and non-linear surface soil behaviour; kinematic finite sources; tens of probabilistic earthquake scenarios.
– 2016: Ground motion simulation at 4 Hz in large, complex geological basins with a dynamic earthquake source and non-linear surface soil behaviour; hundreds of earthquake scenarios and a stochastic approach; dynamic source and velocity inversion; PBytes of output.
– 2020: Stochastic source and wave propagation simulation with quantification of the forward uncertainties for ground motion prediction.

Global Wave Simulation in 3D Earth Models and 3D Global Tomography
– 2012: Global surface and body wave simulations at 3 seconds for large earthquakes; 3D long-period full-waveform tomography using elastic waves: mantle heterogeneities and anisotropy; point-source (CMT) and extended-source inversion.
– 2016: Global surface and body wave simulations below 1 second for large earthquakes; exploration of the quality of Earth models in the data space from comparison between predicted and observed waveforms at the stations of global dense seismic arrays (f < 1 Hz); full short-period waveform tomography using adjoint-based inversion methods for high resolution of the Earth's structure and seismic sources: mantle and core heterogeneities and anisotropy.
– 2020: Bayesian full-waveform tomography using global inversion methods for high resolution of the Earth's structure and seismic sources, and forward and inverse error quantification.

Mantle Convection, Earth Magnetic Field and Geodynamo Modelling
– 2012: 3D spherical thermo-chemical mantle convection at global scale including self-consistent plate-like behaviour; seismological signature of dynamic mantle convection; geodynamo simulation: investigation of the different regimes and scaling.
– 2016: 3D high-resolution spherical thermo-chemical mantle convection at global scale including self-consistent plate-like behaviour plus solid–solid phase changes and resolution of fine-scale structures (plumes, swells); coupling core and mantle dynamical convection models; ensemble-based data assimilation for the prediction of the geomagnetic field evolution and temporal changes.
– 2020: Earth-like dynamo model with both viscous and magnetic moments vanishing; several realisations.
Appendix B: Some Key Numbers for Ocean and Climate Earth System Modelling

Key Numbers for Climate Earth System Modelling (values for 2012 / 2016 / 2020)
– Horizontal resolution of each coupled model component (km): 125 / 50 / 10
– Increase in horizontal parallelisation wrt 2012 (assuming weak scaling in 2 directions): 1 / 6.25 / 156.25
– Horizontal parallelisation of each coupled model component (no. of cores): 1.00E+03 / 6.25E+03 / 1.56E+05
– Vertical resolution of each coupled model component (no. of levels): 30 / 50 / 100
– Vertical parallelisation of each coupled model component: 1 / 1 / 10
– No. of components in the coupled model: 2 / 2 / 5
– No. of members in the ensemble simulation: 10 / 20 / 50
– No. of models/groups in the ensemble experiments: 4 / 4 / 4
– Total number of cores (horizontal cores × vertical parallelisation × components × members × models): 8.00E+04 / 1.00E+06 / 1.56E+09
– Increase in cores wrt 2012: 1 / 13 / 19,531
– Data produced (for one component, in GBytes/month-of-simulation): 2.5 / 26 / 1,302
– Data produced in total (per-component data × components × members × models, in GBytes/month-of-simulation): 200 / 4,167 / 1,302,083
– Increase in data wrt 2012: 1 / 21 / 6,510
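The totals above follow from multiplying the per-component horizontal parallelisation by the vertical parallelisation, the number of coupled components, the ensemble members and the participating models. A short sketch reproducing the three columns (matching the table to within rounding of the per-component values):

```python
# Reproduce the core-count and data-volume totals in the table above.
years       = ["2012", "2016", "2020"]
horiz_cores = [1.00e3, 6.25e3, 1.56e5]   # horizontal parallelisation per component
vert_par    = [1, 1, 10]                 # vertical parallelisation
components  = [2, 2, 5]                  # components in the coupled model
members     = [10, 20, 50]               # ensemble members
models      = [4, 4, 4]                  # models/groups
data_comp   = [2.5, 26, 1302]            # GBytes/month-of-simulation per component

for i, y in enumerate(years):
    cores = horiz_cores[i] * vert_par[i] * components[i] * members[i] * models[i]
    data = data_comp[i] * components[i] * members[i] * models[i]
    print(y, f"{cores:.2e} cores, ~{data:.0f} GBytes/month total")
    # -> 8.00e+04 / 1.00e+06 / 1.56e+09 cores, as in the table
```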
Key Numbers for Ocean Modelling

The reference Tier-1 computer is JADE at CINES (23,040 cores, 267 Tflop/s). The unit simulation for research applications is a 50-year run (multi-decade).

Requirements (computational power; storage capacity):
– Effective power for eddy-resolving model ORCA12³³ (1/12°): 25 Tflop/s for 2 months (i.e. 50 Tflop/s·months); 105 TBytes.
– Projection for a reanalysis with eddy-resolving model ORCA12 (1/12°): 250 Tflop/s for 2 months (i.e. 500 Tflop/s·months); 105 TBytes.
– Projection for eddy-resolving model ORCA12 (1/12°) with biogeochemistry: 125 Tflop/s for 2 months (i.e. 250 Tflop/s·months); 840 TBytes.
– Projection for eddy-resolved model ORCA24 (1/24°): 100 Tflop/s for 5 months (i.e. 500 Tflop/s·months); 525 TBytes.
– Projection for a 50-member ensemble run of eddy-permitting model ORCA025 (1/4°): 500 Tflop/s for 1 month (i.e. 500 Tflop/s·months); 60 TBytes.

³³ ORCA12 is presently the 'biggest' ocean circulation model in Europe for research and operational use.
3 ASTROPHYSICS, HIGH-ENERGY PHYSICS AND PLASMA PHYSICS
3.1 Summary
In recent years, astrophysics, high-energy physics and plasma physics have shared a dramatic change in the role of theory for scientific discovery. In all of these fields, new experiments have become ever more costly, require increasingly long timescales, and aim at the investigation of ever more subtle effects. Consequently, theory faces two types of demand. First, the precision of theoretical predictions has to be increased to the point that it is better than the experimental one; since experimental precision can be expected to improve by further orders of magnitude by 2020, this is a most demanding requirement. In all of these research fields, well-established theoretical methods have existed for decades; achieving dramatic progress therefore requires a dramatic increase in theoretical resources, including computer resources for numerical studies.
Second, the need to explore model spaces of much larger extent than previously investigated has also become apparent. For example, determining the nature of dark energy and dark matter requires a detailed comparison of predictions from large classes of cosmological models with data from the new satellites and ground-based detectors to be deployed by 2020. These predictions can only be generated by massive numerical simulations. In high-energy physics, one task is to explore many possible extensions of the Standard Model to such a degree that even minute deviations between experimental data and Standard Model predictions can serve as smoking guns for a specific realisation of New Physics. In plasma physics, one task is to understand the physics observed at ITER at such a high level that substantially more efficient fusion reactors can be reliably designed on the basis of theoretical simulations exploring a large range of options.
While the three fields covered in this section are distinctly different, they also overlap substantially. For example, the Big Bang is as much a topic of astrophysics as of high-energy physics, while nucleosynthesis depends on nuclear physics as well as on the modelling of supernova explosions. Plasma physics is crucial for many aspects of astrophysics as well as, for example, for a better understanding of high-energy heavy-ion collisions at CERN.
As the experimental roadmap to 2020 is already fixed in all three research fields, it is possible to quantify with some reliability what these demands imply for HPC in Europe. If one requires that theory keeps up with experimental progress, which is crucial to maximise the scientific output of the latter, these three fields together will require at least one sustained Eflop/s-year of integrated computing, implying dedicated compute power of roughly 1 Eflop/s peak for about a decade.
3.2 Computational Grand Challenges and Expected Outcomes
3.2.1 Astrophysics

In astrophysics, perhaps even more than in other scientific disciplines, there is an intimate interdependency between theoretical research, which is overwhelmingly reliant on simulations and modelling, and the exploitation of data from large observational facilities. Controlled experiments are not possible in astrophysics. This important methodological gap is filled by simulations and modelling which, due to the intrinsic complexity of astrophysical phenomena, require the most advanced computational infrastructure. It follows therefore that state-of-the-art HPC is an essential precondition for the advancement of this scientific discipline and for the proper exploitation of the very large funding that society has decided to invest in astronomical facilities.
Astrophysics and cosmology have a unique public appeal and capture the public imagination in a way that few other sciences do. The public are genuinely interested in many of the questions that professional astrophysicists address, such as the nature of dark matter or the prevalence of Earth-like planets outside our solar system. A vibrant research presence in astrophysics is a cornerstone of the scientific literacy of the population as a whole. Such elevated status also has practical consequences of immeasurable impact on society: a recent survey conducted by the Institute of Physics in the UK showed that a large fraction of students who enrol in physics degrees at university are attracted to science by the excitement of fundamental physics and astrophysics.
Europe has traditionally been a world leader in HPC-based theoretical astrophysics. European scientists lead the world in the development and release of algorithms and codes, and this has led to many of the most important breakthroughs in, for example, cosmology and stellar evolution. Europe is already committed to leading or participating in many of the major ground- and space-based astronomical facilities for the next decade and beyond. By contrast, there is as yet only an incipient coordinated effort, through PRACE, to ensure that the computational resources that are an essential counterpart to these facilities are developed. An increase in the size of the largest supercomputers available for astrophysical research by a factor of 10 by 2015 and by a factor of 100 by 2020 is the minimum requirement.
Challenges
The following are 12 fundamental questions in astrophysics in which significant progress is likely in the next decade if an appropriate mix of computing infrastructure, software development and observational facilities can be achieved.
1. What is the identity of the cosmic dark matter and dark energy?
2. How did the universe emerge from the dark ages immediately following the Big Bang?
3. How did galaxies form?
4. How do galaxies and quasars evolve chemically and dynamically and what is the cause of their diverse phenomenology?
5. How does the chemical enrichment of the universe take place?
6. How do stars form?
7. How do stars die?
8. How do planets form?
9. Where is life outside the Earth?
10. How are magnetic fields in the universe generated and what role do they play in particle acceleration and other plasma processes?
11. How can we unravel the secrets of the sources of strongest gravity?
12. What will as yet unexplored windows into the universe such as neutrinos and gravitational waves reveal?
Answering these questions requires accurate numerical treatment of a range of coupled, complex, non-linear physical processes including gravitation, hydrodynamics, non-equilibrium gas chemistry, magnetic fields, radiative transfer and relativistic effects. The set of partial differential equations describing this blend of physics is well known but largely inaccessible to analytic techniques. The astrophysics community has addressed these needs by developing a large number of sophisticated algorithms and codes. There are ongoing efforts to enable these codes to scale to 100,000 cores and beyond, to include 'on the fly' analysis and to use accelerators such as GPUs; but more manpower support is essential. There is also an urgent need to address the data challenge, including storage and integration of observational data and simulation/modelling data. The hardware requirements of the community are a mixture of very large calculations that use a large fraction of the machine alongside a large number of smaller calculations for exploring different models, parameters and physical processes. These questions fall into three broad categories: cosmology and the large-scale structure of the universe; planets and stars; and strong gravity and physical processes.
Cosmology and the large-scale structure of the universe: Questions 1–5
The next decade will see the advent of large-scale cosmological experiments, both from the ground (LSST) and from space (Euclid). Radio telescopes like SKA will revolutionise our understanding of the high-redshift Universe, providing 21 cm tomography of the epoch of cosmic reionisation. The successor of the Hubble Space Telescope, JWST, as well as extremely large optical telescopes of the 30 m class here on Earth, will peer back in time to observe the infancy of galaxy formation in unprecedented detail, yielding insights into the formation of the first generation of objects. Large galaxy surveys that are currently underway or are to commence this decade (such as Pan-STARRS and Big-BOSS) will drastically improve the statistical constraints on dark matter, the nature of dark energy and galaxy formation.
It is imperative to advance our simulations dramatically, to allow proper interpretation of upcoming observational data and to provide tight constraints on cosmological models. For example, one key step is to compute reliably the clustering of dark matter for a given set of cosmological parameters. Since gravitational dynamics is a complex non-linear problem, we need to use large N-body simulations covering the same volume and the same galaxy mass range detected by these surveys. Indeed, interpreting data from Euclid requires simulations of the cosmological horizon, i.e. cubes of size 12,000 Mpc/h, with at least 100 particles per halo of size L*/10, where L* is the luminosity of the Milky Way. This translates into a prodigious number of particles, namely N = 32768³, or 35 trillion bodies. State-of-the-art N-body solvers (GADGET, PKDGRAV or RAMSES) typically require 200 bytes per particle. At 4 GB per core, and taking into account memory overheads (×2), these calculations require a 10⁶-core machine or larger. Exascale resources will allow simulations of the large-scale universe with sufficient particles to resolve all dark matter haloes that could host stars. They will allow multiple realisations of the Hubble volume to test and constrain models for the dark sector. Similar requirements apply to studying baryonic processes on large scales, such as the mechanism of reionisation, or chemical enrichment and feedback from stars and AGN.
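The memory arithmetic behind the 10⁶-core estimate can be reproduced directly (only the binary GiB/PiB unit conventions are added here):

```python
# Reproduce the Euclid-scale N-body memory estimate from the text:
# 35 trillion particles at 200 bytes each, a factor-2 overhead, 4 GB/core.
n_particles = 32768**3               # ~3.5e13 bodies (from the text)
bytes_per_particle = 200             # typical for GADGET/PKDGRAV/RAMSES (from the text)
overhead = 2                         # memory overhead factor (from the text)
gb_per_core = 4                      # memory per core (from the text)

total_bytes = n_particles * bytes_per_particle * overhead
cores = total_bytes / (gb_per_core * 2**30)
print(f"{n_particles:.1e} particles -> {total_bytes/2**50:.1f} PiB -> ~{cores:.1e} cores")
# -> ~3e6 cores, i.e. a 10^6-core machine or larger, as stated
```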
Our understanding of galaxy formation and the origin of the Hubble sequence (from dwarf galaxies to grand-design spirals and ellipticals) is still extremely sketchy at best, even though a basic formation paradigm exists: 'hierarchical galaxy formation'. The fundamental problem is that galaxy formation involves the aforementioned blend of different physical processes, non-linearly coupled over a wide range of scales, leading to extremely complex dynamics. For this reason, HPC simulation techniques have become the primary avenue for theoretical research in galaxy formation. This is also helped by the fact that the current standard model of cosmology precisely specifies the initial conditions of cosmic structure formation at a time briefly after the Big Bang. It is a computational problem par excellence to try to evolve this initial state forward in time, staying as faithful to the physics as possible. Current state-of-the-art hydrodynamic simulations of galaxy formation reach a resolution length of about 100 parsecs in the hydrodynamic and gravitational components, while individual stars are modelled with super-particles that represent 1,000 stars. Exascale resources would enable an increase in mass resolution by a factor of 1,000 over the existing state of the art. This would allow simulations of Milky Way models in which the sites of star formation are accurately followed and each star is represented by a single particle.
Planets and Stars: Questions 6–9
Understanding planet formation is one of the key questions of modern astronomy. It will help us to learn about the history of our own planet, which set the conditions for human existence, and at the same time give us an idea of how rare or frequent the conditions for life in the universe are: is life a cosmic phenomenon? Today, the Kepler satellite is revolutionising our understanding of planetary systems around other stars, while missions in the next decade will characterise their atmospheres and search for signatures of life. Simulations are essential for understanding how these systems arise and for quantifying their habitability and stability. Understanding planetary architecture requires detailed 3D simulations of the unobservable processes in disks that involve multiple physical processes acting over a large range of scales. This sets the nature of turbulence in disks and the resulting planetary building bricks (planetesimals) that are produced. The initial size distribution of planetesimals is vital to understand the further growth to planetary cores. At the same time, the growth and migration of gas giants and the reshuffling of terrestrial planets in the habitable zones around the host star also depend crucially on the properties of turbulence in the gas disk. This programme already uses significant resources of the available Pflop/s machines within PRACE. To take the next step in improving our simulations, we have to gain an order of magnitude in resolution, which will catapult our applications from running on about 10⁴ cores for 10⁷ CPU hours to new architectures providing 10¹¹ CPU hours per year – a clear scientific application case and a strong argument for the envisioned exascale supercomputers.
Figure 3.1. A typical result of the Millennium Run, which investigated the evolution of the matter distribution in a cubic region of the Universe 2 billion light-years in extent by tracing 10 billion particles. Courtesy of Volker Springel (Heidelberg Institute for Theoretical Studies) and Thomas Janka (MPA Garching)
The birth of stars and their planetary systems is intimately coupled to the dynamical state of the gas they are forming from – cold, dense clouds of molecular hydrogen and dust embedded in a turbulent, multi-phase environment. Understanding the life cycle of molecular clouds and the local onset and termination of stellar birth in galaxies at different redshifts is a key problem of modern astronomy and lies at the very forefront of computational astrophysics. Progress requires combining very high-resolution multi-species magneto-hydrodynamic simulations of dust and gas with time-dependent non-equilibrium chemistry in order to describe correctly the different phases of the turbulent, self-gravitating ISM and their heating and cooling behaviour (including high-precision multi-frequency radiative transfer). In addition, we need to account for the internal stellar evolution of the (proto)stars that form within the clouds, to be able to model correctly stellar feedback processes such as winds, outflows and radiation. Such feedback may be able to destroy the clouds
from the inside by heating and stirring them, as well as by driving outflows and large-scale winds, thus transporting gas (and metals) from the disk into the galactic halo. This global matter cycle plays a major role in controlling the long-term evolution of star-forming galaxies in the universe. With current computational power, answering these questions is simply not possible. The proposed multi-scale and multi-physics approach to model the complete life cycle of molecular clouds in the disk of the Milky Way requires a concerted collaborative effort of several research groups with complementary expertise and experience. It will also require an increase in computational power by a factor of at least 100, bringing us into the exascale regime.

Figure 3.2. 3D rendering of a core-collapse supernova explosion 0.15 seconds after the bounce. Courtesy of Volker Springel (Heidelberg Institute for Theoretical Studies) and Thomas Janka (MPA Garching)
Strong gravity and physical processes: Questions 10–12
One of the primary goals of relativistic astrophysics to 2020 and beyond will be the first direct measurement of the gravitational waves predicted by Einstein's theory, using huge laser-interferometer facilities like VIRGO and GEO600 in Europe or LIGO in the US. The strongest sources expected to radiate in this exciting upcoming observational window on the universe are orbiting and merging binary black holes, colliding compact stars, and collapsing and exploding massive stars. Elaborate numerical models of these astrophysical phenomena are needed for accurate signal predictions that will allow us to extract the gravitational waves from a noisy background and to realise their promise of unravelling some of the mysteries of neutron stars and black holes, the most exotic objects in the universe. Exploring phenomena in the strong-gravity environment of such extreme objects requires a treatment of general relativistic effects. The numerical complications of the corresponding highly non-linear hyperbolic metric equations translate into extremely high computational demands.
Magnetic fields are omnipresent, from the largest dimensions of galaxy clusters and intergalactic space, to the intermediate scale of interstellar gas and dust clouds, down to small bodies like planets and moons. While the origin of the initial seed fields shortly after the Big Bang is still speculative, the growth of these seed fields is understood as a consequence of plasma flows on large scales and of a cascade of highly turbulent magneto-hydrodynamic (MHD) dynamo effects. Computer simulations of these processes are possible, and they would encompass all scales between the global size of objects at one end and small vortex flows and granulation on the scale of wave structures at the other. However, this requires resolutions reaching orders of magnitude beyond the capabilities of present HPC systems.

Understanding how stars end their lives as supernovae (SNe), or what happens when compact stars collide, requires following extreme conditions and physical processes acting on very short timescales. It is essential to include neutrino physics and radiative transport processes in these calculations. This will ultimately require highly parallelised Monte Carlo methods, whose application can most easily be adapted to arbitrary source geometries in three dimensions. Neutrino- and photon-radiation hydrodynamics simulations in 3D are among the computationally most challenging and demanding tasks to be performed on forthcoming generations of supercomputers, with requirements easily reaching into the Eflop/s regime. Results of the most sophisticated models are indispensable for understanding the messages carried by the radiation from such highest-energy phenomena, and for harvesting the cosmic neutrino-burst measurements that are possible with big underground facilities like IceCube at the South Pole, SuperK in Japan and new instruments planned in Europe, the US and Japan.
For example, a successful simulation of the first second of an SN explosion (corresponding to 10⁶ time steps) is necessary to predict neutrino and gravitational-wave signals and to continue through the nucleosynthesis phase into the late, astronomically observable stages of the explosion. For a modest spatial resolution of 10 million cells and 12 neutrino-energy bins, this requires about 50 million core hours on 16,000 of the strongest cores of currently available PRACE Tier-0 systems. Improving the spatial and energy resolutions by only a factor of two will increase the computer-time demand to about 1 Pflop/s-year per simulation, and there are many stellar parameter sets for which such simulations are needed. A fully relativistic treatment, convergent turbulent MHD flows and highly accurate neutrino spectra will push SN modelling to even higher, exascale resources.
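A rough reconstruction of the resolution-doubling argument, under stated assumptions. Doubling the spatial resolution multiplies the cell count by 8 and (if the time step shrinks with the grid spacing) the number of time steps by 2; doubling the energy resolution doubles the neutrino-energy bins, giving a factor of ~32 overall. The sustained per-core rate below is an assumption used only to convert core-hours into flops:

```python
# Scale the quoted 50 million core-hours to the doubled-resolution case.
base_core_hours = 50e6               # baseline cost (from the text)
factor = 2**3 * 2 * 2                # cells x time steps x energy bins = 32
per_core_flops = 5e9                 # ~5 Gflop/s sustained per core (assumed)

flops = base_core_hours * factor * 3600 * per_core_flops
pflops_years = flops / (1e15 * 3.15e7)
print(f"~{pflops_years:.1f} Pflop/s-years per simulation")   # order 1, as quoted
```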
3.2.2 High-Energy Physics
Quantum Field Theory (QFT) is the fundamental theory of our world, describing all particles and interactions with extremely high precision. However, it is mathematically consistent only if additional interactions and particles exist which contribute significantly only at high energies and cure potential divergences. If this were not the case, the correct description of thousands of quantities by the Standard Model would be purely accidental, which is statistically close to impossible. It is therefore universally believed that Physics Beyond the Standard Model (BSM) must exist.
The search for this physics is the driving force behind high-energy particle physics. Many suggestions exist as to its nature. Some fall within the scope of QFT; some are of a fundamentally different nature, like string theory. Supersymmetry, a fundamental symmetry between bosonic and fermionic degrees of freedom, is usually assumed at some level. Many high-precision experiments at high and low energies search for this new physics, and the Large Hadron Collider (LHC) at CERN in particular was built to find it (besides the Higgs particle itself). So far, no clear signal has been observed, and it has become unlikely that BSM physics will reveal itself soon through large measurable effects. This expectation has been strengthened by the fact that the recently established³⁴ Higgs particle is so far compatible with Standard Model expectations and thus does not yet give any hints of the physics beyond the Standard Model. Future discoveries will therefore require high precision, both experimental and theoretical. As the systematic theoretical uncertainties are primarily caused by effects of the quark–gluon interaction, described by Quantum Chromodynamics (QCD), a large fraction of present-day theoretical work focuses on that theory. QCD is a very complicated theory, combining extremely strong non-linearities with all the complexity of relativistic quantum field theories, and it exhibits an extremely rich phenomenology, much of which is still not fully understood. QCD is therefore not only crucial to improving the experimental sensitivity to BSM physics, but is also a fascinating field in its own right. It allows, for example, study of the fundamental connection between quantum physics and thermodynamics in well-controlled settings, and of how effective degrees of freedom (the hadrons, i.e. quark–gluon bound states) emerge from fundamental ones (quarks and gluons). Over the years, highly sophisticated techniques have been developed which link all of QCD dynamics to a large number of precisely defined non-perturbative quantities. These can be calculated numerically by Lattice QCD (LQCD), and in many cases LQCD is the only known method to determine them. Such calculations, which are indispensable for interpreting high-energy experiments reliably, constitute the largest fraction of the HPC demand of particle physics.

³⁴ http://www.bbc.co.uk/news/world-18702455
The basic idea of lattice QFT (LQFT) is the following: most of the non-perturbative quantities of interest can be expressed within functional-integral quantisation but not in canonical quantisation. This implies that functional integrals, which are infinitely dimensional, have to be calculated numerically, which is only possible if a sufficiently large space–time volume is approximated by an ensemble of discrete points, turning the functional integral into an ordinary integral of very large dimension. The limit of vanishing lattice spacing then has to be brought under complete control, which is usually the hardest challenge.
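Schematically, in standard lattice notation (generic, not tied to any particular collaboration's formulation), the quantities in question are expectation values of the form

```latex
\langle O \rangle \;=\; \frac{1}{Z} \int \mathcal{D}U \, O[U] \, e^{-S[U]},
\qquad
Z \;=\; \int \mathcal{D}U \, e^{-S[U]} .
```

On a finite lattice with spacing $a$, the functional integral becomes an ordinary, very high-dimensional integral over the field variables on the lattice sites and links, estimated by Monte Carlo importance sampling,

```latex
\langle O \rangle \;\approx\; \frac{1}{N_{\mathrm{cfg}}} \sum_{i=1}^{N_{\mathrm{cfg}}} O[U_i],
\qquad
U_i \ \text{drawn with probability} \ \propto \ e^{-S[U_i]} ,
```

followed by the extrapolation $a \to 0$ mentioned above.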
Another crucial task for theory is to explore much larger theoretical model spaces. The absence of signals for BSM physics at the LHC calls for the development of alternative schemes for unifying theories. For some of the most attractive candidates, conformal symmetry plays a fundamental role. While supersymmetry remains among the most interesting theoretical candidates, its lattice implementation is still plagued by conceptual problems. As it is unclear whether this situation will change in the next few years, it is not possible to predict what computer resources such investigations might need in future. However, for alternative scenarios inspired by the assumption of conformal symmetry at high-energy scales, there are questions that can already be addressed very effectively by existing means, like the search for mechanisms of electroweak symmetry breaking based on strong dynamics. These Technicolor-like models require a slow running of the QCD coupling, which can be realised in the proximity of the 'conformal window' of gauge theories. This is just one example of the need to study QFTs other than QCD, which can be done using the same numerical lattice techniques.
Challenges
Of the Grand Challenge problems listed in the last report, very substantial progress was made on QCD thermodynamics and on making lattice QCD simulations at physical quark mass possible (see Figure 3.3), while it turned out that the demands of hadron-structure calculations could not be met with the resources available so far. The reason is technical: the extrapolation of simulation results to the real physical situation turned out not to be well under control, so that simulations with much higher statistics in a larger parameter space (for example in quark mass and lattice spacing) are needed. Only these can provide the urgently needed information. As stated above, QCD is an extremely rich theory, and different lattice collaborations have made significant progress in exploring a large range of aspects of hadron physics. They would all benefit greatly from moving simulations even closer to the real physical world.
Figure 3.3. The hadron mass spectrum from simulations at physical quark masses by the BMW Collaboration, compared to the experimental values. Courtesy of the Budapest-Marseille-Wuppertal Collaboration
To summarise all of this under the title of hadron physics does not do justice to the richness of QCD, but it is practical in that most simulations serve the analysis of a number of physics questions in parallel, such that it is hardly possible to specify the computational need for each of them separately. The keywords cited above refer to such diverse tasks as pinning down the fundamental constants of the Standard Model, in particular the quark masses; providing the theory input needed to interpret the decay rates of hadrons containing bottom quarks, a sector of the Standard Model especially sensitive to BSM physics; determining the transverse quark–gluon structure of protons, which is needed to interpret certain aspects of proton–proton collisions at CERN; determining the strange-quark content of the nucleon, which is surprisingly poorly known; and many, many more.
High-energy physics is intimately related to astrophysics and plasma physics. For example, the cosmological Big Bang can be seen as a high-energy physics experiment far exceeding the power of any man-made accelerator. It should therefore be possible to extract elements of BSM physics from the analysis of its precise properties, which is one of the joint big tasks of all three fields. Particle theory has to contribute a precise understanding of QFT thermodynamics and of likely BSM candidates. The former is primarily studied for QCD thermodynamics, which gets direct experimental input from high-energy heavy-ion experiments, in particular at the LHC. In this field, very substantial progress has been made in recent years. Many bulk properties, which were still hotly debated a few years ago, are now settled, and theory has progressed to address, for example, subtle charge correlations observed in experiment. However, much more information is still needed. One fundamental piece of information is the equation of state of QCD matter, but there is much more. LQCD does not allow studying QCD dynamics directly, which is, however, needed for Early Universe physics. Instead, one can investigate equilibrium properties for many different situations (different baryon densities, background magnetic fields, varying masses and interaction parameters, different gauge groups, different numbers of fermions, etc.) and thus constrain analytic calculations of QFT dynamics. Much of this field is still unexplored; consider, as an example, simulations of the inflationary phase of the Early Universe and the process of reheating, leading to the hot Big Bang initial conditions needed for astrophysics. Finally, it could prove necessary to repeat these studies with a more costly fermion action. At present, formulations are mostly used which at finite lattice spacing violate chirality, a fundamental symmetry relevant for QCD thermodynamics; this might imply that the continuum limit, i.e. the extrapolation to vanishing lattice spacing, is less under control than usually assumed. If such simulations turned out to be necessary, they would require hundreds of Pflop/s-years.
The header 'high-energy physics' is somewhat misleading, as QCD at medium and low energies is equally fascinating. Research in these fields does not aim primarily at the discovery of BSM physics (although this is also part of the agenda, e.g. in neutrino-less double beta decay) but rather at a better understanding of ordinary matter, primarily atomic nuclei and their interactions. Again, the ties to astrophysics are very close. A typical application is the core collapse leading to a supernova explosion and the resulting nucleosynthesis of heavy nuclei.
This field has been revolutionised in recent years as it became possible to extract nuclear forces from ab-initio LQCD calculations. These forces are in the process of superseding the schematic phenomenological forces used so far. One way to reach this goal uses effective field theory (EFT) as an intermediary. (Only very small nuclei can be simulated in toto on the lattice.) With the expected development of computer resources, EFT will allow medium-heavy nuclei to be treated directly within the next few years.
These developments are perfectly matched by important theoretical progress in ab-initio calculations of nuclear structure and reactions based on the nucleon–nucleon forces obtained from QCD and EFT. We are approaching the point where nuclear physics calculations for heavy nuclei can be done with the same rigour as QFT ones. Particularly demanding are simulations of explosive events such as supernovae, novae, and sources of X-ray or gamma-ray bursts; the associated nucleosynthesis typically involves several thousand isotopes and several hundred thousand reaction channels. Needless to say, all of this is only possible with numerical techniques.
Other computational tasks in high-energy physics are closely related to hydrodynamics and plasma physics. For example, to extract information on the QCD phase diagram one also has to describe the hydrodynamic phase of heavy-ion collisions correctly. This requires the simulation of finite systems with relativistic viscous hydrodynamics to very high accuracy, which is another formidable numerical task.
Table 3.1 cannot give more than a rough impression of the development to be expected in the period to 2020. The largest demand is expected in the field of hadron physics, for which it can also be quantified rather precisely based on existing experience. This includes the physics of light hadrons and their weak matrix elements, as well as the physics of heavy flavours – charm and beauty – the main goal of LHCb and an essential probe of BSM effects. Phenomenology needs a reduction of present-day statistical errors for matrix elements from LQCD by at least an order of magnitude. To obtain corresponding control of the extrapolation to the real physical world, as well as of all other systematic uncertainties, one needs simulations for a large number of lattice parameters. For the most relevant hadron-structure observables of light baryons, for example, this would require about 100 Pflop/s-years. One example of such systematic uncertainties is possible artefacts from admixtures of hadrons with the same quantum numbers but larger mass. One would also need independent simulations by collaborations using different lattice formulations, again to control other systematic uncertainties. Assuming the existence of at least two large collaborations plus several smaller and less ambitious ones leads to the 300 sustained Pflop/s-years given in the table below.
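The order-of-magnitude error reduction maps onto compute through standard Monte Carlo scaling: the statistical error falls only as the square root of the number of configurations, so a tenfold error reduction costs roughly a hundredfold increase in statistics at fixed lattice parameters,

```latex
\sigma \;\propto\; \frac{1}{\sqrt{N_{\mathrm{cfg}}}}
\quad\Longrightarrow\quad
\frac{\sigma}{10} \ \text{requires} \ \sim 100 \, N_{\mathrm{cfg}} .
```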
Table 3.1. Summary of some key high-energy physics developments to be expected in the period to 2020 (physics objective; required sustained performance; relevant experiments).

– LQCD at zero temperature: hadron matrix elements at the physical point (moments of Generalised Parton Distributions (GPDs) and Distribution Amplitudes (DAs), g_A, the hadronic contribution to the muon anomalous magnetic moment, singlet contributions, bag parameters, input for CKM physics, transition form factors, TMDs, etc.), by several independent collaborations and fermion actions. Required: > 300 Pflop/s-years. Experiments: e.g. ALICE, ATLAS, BES, CMS, JLab, J-PARC, LHCb, PANDA, PHENIX, STAR.

– LQCD in a box: decay characteristics of hadron resonances. Required: > 20 Pflop/s-years. Experiments: BES, PANDA, LHCb.

– LQCD at finite temperature: equation of state for physical masses (at least two independent collaborations/fermion actions), 50 Pflop/s-years; localising the critical point (at least two independent collaborations/fermion actions), 50 Pflop/s-years. Experiments: ALICE, ATLAS, CBM, CMS, PHENIX, STAR.

– Non-QCD applications of LQFT: investigating 'walking technicolor' models, 5 Pflop/s-years; investigating supersymmetric models, unclear (conceptual problems); investigating inflation scenarios, unclear for lack of benchmarks. Experiments: LHC, ILC.

– EFT: calculations for medium-heavy nuclei. Required: 10 Pflop/s-years. Experiments: FAIR.

– Nuclear physics input for nucleosynthesis in supernovae. Required: 2 Pflop/s-years. Experiments: FAIR.
Figure 3.4. Electrostatic potential fluctuations from a gyrokinetic simulation of a tokamak plasma (TCV machine), obtained with the GENE code (PRACE early access call). Courtesy of IPP Garching
3.2.3 Plasma Physics

Plasmas are pervasive in nature, comprising more than 99% of the visible universe, and permeate the solar system and interstellar and intergalactic environments, occurring over huge ranges of scale in space, energy and density. Plasma-based technology has been at the forefront of scientific research for more than half a century, with applications in fundamental science and a wide range of topics from nuclear fusion to medicine. These are extremely challenging scientific problems, requiring state-of-the-art numerical tools and computational resources.
Magnetic fusion research seeks to reach thermonuclear conditions by containing plasmas with strong magnetic fields in suitably designed devices. Attaining burning-plasma conditions at the density achievable in present-day devices requires heating the plasma to very high temperatures (of the order of 100 × 10⁶ °C) with correspondingly high gradients (~50 × 10⁶ °C/m). In these conditions, energy and particles are lost from the device through turbulent convection. Understanding the transport of energy and matter is one of the key questions in fusion plasma physics, and it is of great practical interest, since the efficiency of a power station depends on the ratio of the fusion power output to the input power required to operate the device. The next-step international fusion experiment, ITER, is a €10 billion tokamak being constructed in France, with the objective of achieving a ratio of fusion power to heating power exceeding 10.
Concurrently, some of the most demanding scientific and computational grand plasma challenges are closely tied to recent developments in ultra-intense laser technology and to the possibility of exploring astrophysical scenarios with a fidelity that was previously inaccessible due to limitations in computing power. The main scientific challenges are in (i) plasma accelerators (either laser- or beam-driven) and possible advanced radiation sources based on these, which have promising applications in bio-imaging and medical therapy; (ii) inertial fusion energy and advanced concepts with ultra-intense lasers, which aim to demonstrate nuclear fusion ignition in the laboratory; and (iii) collisionless shocks in plasma astrophysics, associated with extreme events such as gamma-ray bursters, pulsars and AGNs.
These are topics of relevance not only from a fundamental point of view but also in terms of potential direct economic benefits. For instance, research in plasma accelerators is exploring the route to a new generation of more compact and cheaper particle and light sources, a topic in which Europe is clearly leading thanks to the large-scale pan-European laser projects (e.g. the Extreme Light Infrastructure (ELI) and the High Power Laser for Energy Research (HiPER)) and to national efforts on the development of laser-based secondary sources (e.g. in Germany, France and the UK). The exploration of an alternative path to the Magnetic Confinement Fusion (MCF) approach to nuclear fusion is critical for sustainable energy production, which is the driving force for economic growth. Solar physics is also a very active field of research, with new satellites becoming operational and increased interest in numerical simulations of the complex solar dynamics and the impact of solar phenomena on the terrestrial environment.
Specifically, the present-‐day theoretical challenge lies in the need to resolve the equations of magneto-‐hydrodynamics in the turbulent regime across a large portion of the Sun, from the thin (in relative terms) boundary layer, called the tachocline, which marks the transition between the inner region dominated by radiation and the outer convective zone ruled by hydrodynamics, up to the solar surface with its rich phenomenology.
Even more ambitiously, future simulations aim to target the full 11-year solar cycle, including a treatment of the coronal phenomena and the generation and time-variation of the solar wind. In parallel, the interaction of the solar wind with the geomagnetic environment is also a subject of great interest. The study of the combined Sun–Earth magneto-plasma system by means of observations and simulations is often referred to as the field of Space Weather.
Challenges
In all of these fields, huge computational challenges stem from the very large ranges of scales in space and in time that must be resolved to model any significant portion of the system over the integration times of interest. Advances in HPC are facilitating higher-fidelity simulations with better approximations, and these are greatly improving our understanding of plasmas.
Magnetic fusion devices are several metres across, whereas turbulent structures occur at the millimetre scale and significant magnetic disturbances at the scale of at least several centimetres. Computational modelling will play a crucial role in maximising the success of the unique ITER experiment. In this machine, encompassing all the important phenomena underlying energy losses will require simulations with several thousand grid points in each of the two directions perpendicular to the magnetic field (fewer in the parallel direction).
Furthermore, in fusion devices, plasma collisions are so rare that the mean free path along the field lines can be longer than the characteristic macroscopic scale. Modelling transport along the field lines by a fluid closure is problematic. The modern trend is to use a reduced form of the kinetic equation for each plasma component (electrons and ion species). At best, this requires hundreds of grid points in each of two (parallel and perpendicular) velocity directions.
In the case of ITER, the smallest timescale to be resolved is the one associated with millimetre-‐size vortices, which is typically of the order of a microsecond. The energy confinement time can be of the order of seconds. Thus, depending on the time-‐advancing algorithm, a significant simulation to steady-‐state energy balance would require tens of millions of time steps.
As an example, a recent top simulation using a kinetic plasma model was carried out with the GYSELA code with about 10¹¹ variables in phase space, corresponding to a few TBytes of storage. Integrating the model for a duration of about 1 ms of physical time required about a month of CPU time at an effective speed of around 5 Tflop/s.
However, a significant ITER simulation with a more complete physical model may require 10¹⁴ variables and shorter time steps, in order to simulate electron scales, or an integration time a factor of 10³ longer for simulations over the energy confinement timescale.
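To make these orders of magnitude concrete, the arithmetic behind the figures quoted above can be reproduced in a few lines. The sketch below assumes double-precision variables and a time step of one microsecond, set by the millimetre-scale vortices; these are illustrative assumptions, not the parameters of any specific GYSELA or GENE run.

    # Back-of-envelope estimates for gyrokinetic simulations, using the
    # figures quoted in the text (illustrative values only).
    BYTES_PER_VARIABLE = 8                   # double precision (assumption)

    # GYSELA-class run: ~1e11 phase-space variables
    print(1e11 * BYTES_PER_VARIABLE / 1e12)  # ~0.8 TB of raw state; "a few TB"
                                             # once work arrays are included

    # Time stepping: millimetre-scale vortices evolve on ~1 microsecond,
    # while the energy confinement time is of the order of seconds.
    dt = 1e-6                                # s (assumption)
    print(1.0 / dt)                          # ~1e6 steps per confinement time;
                                             # tens of millions with smaller dt

    # A more complete ITER model: ~1e14 variables
    print(1e14 * BYTES_PER_VARIABLE / 1e15)  # ~0.8 PB of state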
Figure 3.4 shows a picture of the electric potential in a turbulent plasma obtained with the GENE code. Thanks also to PRACE resources, plasma turbulence simulations are advancing rapidly and are being compared with measurements to yield valuable insights into the mechanisms that determine losses from the confinement system.
In laser plasma interaction scenarios, the numerical tools of choice are usually fully relativistic particle-‐in-‐cell (PIC) codes such as the OSIRIS framework. PIC models work at the most fundamental, microscopic level and are therefore the most computationally intensive models in plasma physics.
Figure 3.5. Laser Wakefield Accelerator simulation showing laser (red/white), wakefield structure (green/blue) and accelerated particles (spheres). Code OSIRIS, PRACE first call.
For example, recent scalings for a laser-plasma electron accelerator indicate that to reach an energy of 10 GeV the accelerating length must be of the order of ~0.5 m, with a plasma density of ~10¹⁷ cm⁻³. Since the laser wavelength (~1 micron) needs to be resolved, the simulation cell size will be ~10⁻⁷ m, and the total number of iterations will be of the order of ~10⁷. The total number of simulation particles required will be over ~10¹¹, and the total computer memory requirements are in excess of ~10 TBytes.
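These numbers can be checked for consistency with a short calculation. The sketch below assumes a moving simulation window of roughly 100 microns on a side, around 100 particles per cell and around 100 bytes per particle; these are illustrative assumptions of the kind commonly made in such estimates, not OSIRIS defaults.

    # Consistency check of the laser-wakefield scaling quoted above.
    cell_size = 1e-7                   # m, ~10 cells per micron of wavelength
    length = 0.5                       # m, accelerating distance for ~10 GeV

    # Moving window: only ~(100 micron)^3 around the driver is simulated.
    cells_in_window = (100e-6 / cell_size) ** 3
    print(cells_in_window)             # ~1e9 cells

    particles = cells_in_window * 100  # ~100 particles per cell (assumption)
    print(particles)                   # ~1e11 particles, as quoted

    bytes_per_particle = 100           # positions, momenta, weight (assumption)
    print(particles * bytes_per_particle / 1e12)  # ~10 TBytes, as quoted

    # The window must cross the full accelerating length in cell-size steps:
    print(length / cell_size)          # ~5e6, i.e. of order 1e7 iterations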
Solar physics simulations are no less challenging. Solar plasmas are characterised by a variety of interplaying phenomena, with disparate plasma parameters. Magnetism originates as a result of an as yet ill-understood dynamo mechanism in the convective region. This couples to the Sun's surface, where events such as protuberances, flux emergences and solar flares occur, and where the solar wind is generated.
Turbulence is generally at high Reynolds number, and the spatio-temporal scale separation is extreme (8–10 orders of magnitude). Present research aims at MHD simulations with at least 10,000 grid points in each spatial direction for very long integration times.
Computer resources awarded with the first PRACE calls have already permitted ground-‐breaking results in plasma physics. Ab-‐initio simulations of turbulence in magnetic fusion plasmas with the GENE code have clarified the limit of validity of energy confinement scaling laws and are exploring the physical mechanisms that can suppress turbulence in transport barriers, which significantly enhance fusion performance.
Studies of inertial confinement fusion with the OSIRIS code have explored various scenarios for fast ignition with realistic simulations as well as the dynamics of particle acceleration in shocks produced by intense laser beams (see Figure 3.5).
Table 3.2 lists a few grand challenge plasma simulations that should become accessible with the computer power that is likely to emerge over the time frame 2012–2020.
Table 3.2. Summary of some key computational challenges in plasma physics.
Grand Challenge: Magnetic Confinement Plasmas – Tokamaks
Global gyrokinetic (GK) simulations of ion-scale electrostatic plasma turbulence, to well beyond the turbulence saturation time (say ~1 ms); natural extensions to longer times (e.g. the confinement time, ~5 s), to electron scales and to magnetic fluctuations; multiscale simulations that separate the slow transport and fast turbulent timescales, coupling a local gyrokinetics code to a transport solver.
Benefits: Theoretical understanding of the processes determining fusion power in ITER plasmas, to be exploited to optimise ITER and to design smaller devices; improved general understanding of plasma turbulence and its impact on confinement; more sophisticated comparisons of turbulence simulations with measurements from existing tokamaks; maximising the scientific benefit from the unique €10 billion ITER experiment (with the EU the largest contributor) – the HPC simulations above are needed before ITER first plasma (2019).
Compute Requirement: ~O(200) Pflop/s-hours and of the order of tens of TB of memory; the extensions improve the model but are much more demanding computationally; multiscale coupling is a less demanding route to the transport timescale; many such simulations are needed to optimise ITER scenarios.
Challenges: GK codes currently scale to 10⁴–10⁵ cores; can algorithmic bottlenecks be overcome to reach 10⁸ cores (a serious issue)? Can clever ways be found to parallelise in time?

Grand Challenge: Laser Plasmas and ICF
Can plasma-based acceleration lead to the development of compact accelerators for use at the energy frontier, in medicine, in the probing of materials and in novel light sources? Can we achieve fusion ignition and, eventually, useful fusion energy from compressed and heated HED fusion plasma?
Benefits: The holistic comprehension of plasmas, in particular under highly non-linear conditions such as those associated with intense lasers, is highly complex and requires ab-initio fully kinetic simulations; this understanding is fundamental for the design and development of a new generation of plasma accelerators, with the possibility of tremendous scientific and societal impact.
Compute Requirement: One-to-one modelling of plasma accelerators requires petascale systems, and parametric scans are required for system design optimisation; ICF simulations are even more demanding, and full 3D modelling of implosion, ignition and explosion will push requirements into the exascale realm.
Challenges: Can EM-PIC simulations strong-scale to ~10⁷–10⁸ cores? Can the models be extended to work effectively in overdense regimes? Can multi-physics models be included efficiently in the algorithms?
3.3 A Roadmap for Capability and Capacity Requirements
3.3.1 Astrophysics

Astrophysics is a large and diverse research area in which numerical simulations play a central role. The community has always taken advantage of the latest HPC hardware. As an example, the first N-body simulations of the 1960s followed the trajectories of about 100 particles to study star cluster evolution and galaxy dynamics. Today we are simulating the Hubble volume with almost a trillion particles and with high dynamic resolution. This increase of 10¹⁰ in particle number alone in some 50 years is faster than Moore's law, thanks to the continued software development that takes place within the field. Astrophysics has always kept up with HPC trends – from serial codes to vector machines to the current MPI-based clusters that have dominated research for over a decade. Novel special-purpose hardware, such as the GRAPE board, was developed by the groups studying dense stellar systems. Today many groups are starting to utilise GPUs as hardware accelerators. The community is well poised to follow the current trends towards massively parallel, large multi-core nodes, each with multiple GPUs. As in other areas, there is also the need for appropriately matched growth in storage capacity. However, following these trends requires ever more complex algorithms, and investment in manpower for software development is essential.
3.3.2 High-Energy Physics

The computational needs of LQCD have changed substantially in recent years. While in the past the main cost factor was the generation of ensembles of quantum field configurations, lately analysis has become comparable in computational cost. While ensemble generation requires very high scalability of the architecture to obtain sufficiently long Monte-Carlo histories, analysis can also be done efficiently on cluster architectures (CPU, GPU or hybrid) but requires easy programmability due to the very broad spectrum of quantities to be analysed. Due to this shift of requirements, many LQCD collaborations currently face a certain shortage of computer resources for analysis. In the near future, the situation is likely to change again, as the limited memory bandwidth of scalable architectures will reduce the efficiency with which ensemble generation can be performed. Efficiencies of order 10% are currently reached routinely, and sometimes substantially more, the maximum efficiency reached for compute kernels being close to 40%. Thus the roughly 300 sustained Pflop/s-years needed for hadron structure physics might already correspond to 3 Eflop/s-years of peak performance, depending on the computer architecture.
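The sustained-to-peak conversion quoted above is simply a division by the achieved efficiency; a minimal illustration, assuming the 10% figure given in the text:

    sustained_need = 300      # Pflop/s-years sustained, hadron structure physics
    efficiency = 0.10         # typical fraction of peak for LQCD kernels
    peak_need = sustained_need / efficiency
    print(peak_need / 1000)   # 3.0, i.e. ~3 Eflop/s-years of peak, as stated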
Another problem, which will become more relevant, is the growing mismatch between compute power and I/O bandwidth.
While ensemble generation is clearly part of capability computing, analysis is halfway between capability and capacity computing. A typical configuration to be analysed five years from now might consist of several GBytes of data, requiring a moderately scalable architecture for analysis.
LQCD has a strong record in making efficient use of a large range of computer architectures. The field will therefore be able to drive the usage of future computer architectures, whether it is based on more traditional CPU architectures, GPU architectures or emerging new technologies.
3.3.3 Plasma Physics

In magnetic fusion, the big computational challenge for the next decade will be to acquire the capability to simulate the totality of low-frequency dynamics in general toroidal geometry. Such a code would treat MHD and micro-turbulence in a unified way, and would need mesh sizes nearing 10⁵ points in each transverse direction to encompass electron-scale lengths at physical ion/electron mass ratios. This is a daunting challenge if a straightforward scaling from present simulations is employed.
For codes evolving the kinetic equations, a crucial problem is the efficient and accurate interpolation of the value of the distribution function to neighbouring grid points. For PIC codes, noise reduction is a key element. Both methods need efficient integration in velocity space and global solvers for the electromagnetic potentials in complex geometry. This is perhaps the key limiting factor for better performance. Likewise, the main challenge in solar physics will come from comprehensive simulations of the solar interior, the solar surface dynamics and space weather.
For plasma-based accelerator modelling, the main computational challenge is to perform 3D full-scale one-to-one modelling of metre-class wakefield accelerator scenarios, providing insight and direction into the next generations of plasma-based accelerators. Inertial Confinement Fusion, and Fast/Shock Ignition in particular, will also require full-scale 3D modelling to analyse energy transport and deposition in the compressed core leading to ignition.
3.4 Expected Status in 2020
3.4.1 Astrophysics Astrophysics and cosmology will continue to be driven by observations of our cosmos. Space-‐ and ground-‐based missions are planned beyond the 2020 timescale that will provide the data to answer our dozen questions. Simulations play an essential role in this grand quest to understand our origins and it is essential that growth in HPC resources continues at its current rate in order to meet the requirements of these missions. If HPC trends continue, and these resources are made available, then by 2020 we expect many breakthroughs and a huge amount of progress towards answering those fundamental questions.
3.4.2 High-Energy Physics

The development of computational particle physics was so rapid in the past that predictions so far into the future are hardly possible. In any week, for example, the LHC could make a ground-shaking discovery which might fundamentally change our understanding of particle physics and could change the research agenda of all groups working in that field. For instance, it could be observed that the decay probability for one of the Higgs decay channels does not fit the Standard Model. Checking as many of them as possible is presently the focus of experimental research at the LHC. A few developments can, however, be predicted with a reasonable level of certainty.
• QCD thermodynamics should be more or less settled by 2020. This does not mean that there will not be any open questions left – quite to the contrary – but things like the equation of state or the existence/non-‐existence of a critical point should be decided and should have turned into textbook physics.
• The status of hadron structure calculations is less clear. In the past, all predictions have turned out in the end to be over-‐optimistic. Nevertheless, based on present knowledge, it is (again) a rational expectation that LQCD will have reached the point at which it can provide information comparable in reliability to direct experimental data, but of a highly complementary and broader nature.
• Non-QCD LQFT will experience rapid growth over the coming years. If BSM physics is found, then this new physics will have to be analysed; if it is not found, the question will become ever more urgent: how can it be that it is not yet within reach, and what are more promising avenues to find it?
• The non-‐LQFT applications which are still at an early stage of development will certainly catch up and play a much more prominent role by 2020.
3.4.3 Plasma Physics

With ITER expected to be operational at the end of this decade, there is a strong community effort to develop the numerical tools to carry out the necessary science.
As far as magnetic fusion is concerned, one expects to be able to carry out a full-torus simulation with a detailed plasma physics model for the duration of an energy confinement time. This is a grand HPC challenge requiring exascale resources. In parallel, capacity HPC computing will be necessary to carry out parametric studies with somewhat simpler plasma physics models. Alongside this activity, the community will exploit a suite of codes, combined into a single framework for plasma and machine-integrated modelling, ranging from data analysis and reconstruction of the plasma state to interpretative and predictive simulations. This suite will include some first-principles codes that require HPC.
By the end of the decade laser plasma accelerators will have matured, and detailed full-scale simulations of problems involving a very large range of spatial and temporal scales will be required. This high-fidelity modelling will also require exascale resources, and will provide predictive capability able to sustain engineering advances. In inertial confinement fusion, and in advanced ignition schemes in particular, the community should be in a position to perform complete modelling of inertial fusion, using multi-physics/multiscale codes to analyse ICF scenarios in detail and to optimise target designs.
Similarly, the solar and space physics community is likely to be in a position to carry out the grand challenge of simulating the dynamics of the Sun over a long timescale, comparable to the 11-year cycle. Space weather forecasting will require a full chain of codes, from measurement and reconstruction of the solar MHD state, through stability analysis, simulation of the solar surface dynamics, prediction of events and their impacts on the solar wind, down to simulations of the interaction of the solar wind with the magnetosphere. Again, certain codes belonging to this integrated modelling suite will need access to HPC resources.
4 MATERIALS SCIENCE, CHEMISTRY AND NANOSCIENCE
4.1 Summary

The advance from petascale to exascale computing will change the paradigm of computational materials science and chemistry. Today the discipline acts as the third scientific method in a tight loop between experiment and theory. The move to petascale will broaden this paradigm – to an integrated engine that determines the pace in a design continuum from the discovery of a fundamental physical effect, a process, a molecule or a material, to materials design, systems engineering, processing and manufacturing activities, and finally to the deployment in technology, where multiple scientific disciplines converge. Exascale computing will significantly accelerate the innovation, availability and deployment of advanced materials and chemical agents and foster the development of new devices. These developments will profoundly influence society and the quality of life, through new capabilities in dealing with the great challenges of knowledge and information, increased productivity, sustained welfare, clean energy, health, etc.
Computational materials science, chemistry and nanoscience is concerned with the complex interplay of the myriads of atoms in a solid or a liquid, which produces a continuous stream of new and unexpected phenomena and forms of matter. An extreme range of length, time, energy, entropy and entanglement scales gives rise to the complexity of an extremely broad range of materials and associated properties.35 The target of this science is to design materials ranging from the level of a single atom up to the macroscopic scale, and phenomena from electronic reaction times in the femtosecond range up to geological periods. Computational materials science, chemistry and nanoscience stands in close interaction with the neighbouring disciplines of biology and medicine, as well as the geosciences, and it impacts extensive fields of endeavour within the engineering sciences. The above design goal will be achieved by a large and diverse computational community that views as critical assets the conceptualisation, development and implementation of algorithms and tools for cutting-edge HPC. These tools are used to great benefit in other communities, such as medicine and the life sciences, and in engineering sciences and industrial applications. Furthermore, this domain serves another major goal in educating human resources for future advances in computational materials science.
During the past 5–10 years, many of the goals laid out in the first ‘Scientific Case for European Petascale Computing’ have been successfully accomplished. In the field of nanoscience, for example, robust tools for the quantitative understanding of structure and dynamics at the nanoscale have been developed, matching the extraordinary developments in experimentation. Today four major thematic challenges are noted:
1. In response to new petascale computer architectures, powerful algorithms have been developed that make use of thousands of computer cores, setting a direction towards exascale computing.
2. Unprecedented progress has been made in the precision and robustness of computational predictions on much finer energy scales, e.g. for many-body quantum systems, requiring orders of magnitude more processor time.
3. A considerable effort is undertaken to bridge seamlessly the gap between the different length and timescales inherent to this field, in order to reach the complete simulation of entire devices or systems integrated in a technology.
4. Long-time trajectory simulations are urgently needed.

35 Inorganic and organic solids, molecules and polymers, smart materials that self-repair, actuate and transduce, bio-compatible and programmable materials and soft matter, self-assembly and biomimetic synthesis, quantum dots and strongly correlated quantum materials, photosynthesis and ultrafast magnetisation dynamics, to name but a few examples.
With a sustained Eflop/s, many of the leadership-class calculations of the materials science, chemistry, nanoscience, condensed matter physics, mineralogy and geoscience communities that may take 1–2 years on a petascale computer today will become state-of-the-art high-throughput calculations. This allows the application of these methods to a multitude of systems under many different external parameters, and enables the validation and quantification of the robustness of the methods and the reliability of the associated models. High throughput facilitates the link between materials science and materials informatics, suggesting new discoveries through combinatorial searches among a vast number of alternatives in a feedback loop of processing and manufacturing.
4.2 Computational Grand Challenges and Expected Outcomes
Many upcoming challenges in computational materials science, chemistry and nanoscience have been outlined in a recent Science Position Paper37 of the ESF Materials Science and Engineering Expert Committee (MatSEEC),36 on which this section draws. The section is divided into separate overviews of materials science, chemistry and nanoscience. Each field is vast. This division is in part historic and stems from the different communities and the characteristic questions that are addressed, although, in practice, the boundaries between the fields are somewhat fluid. A wide variety of societal challenges38 translate in part into research challenges, and these challenges in turn inspire computational challenges.
This translates into demand for dealing with ever-increasing complexity, larger system sizes, increasing robustness, ever-longer simulation sequences, seamless changes in length and time scale, and a diversity of underlying algorithms on a necessarily diverse set of HPC platforms. The research challenges and the computational challenges span all three fields but are listed below under the particular subfield where they most naturally fit.

36 www.esf.org/matseec
37 Computational Techniques, Methods and Materials Design, European Science Foundation, March 2011, ISBN 978-2-918428-38-1
38 Energy harvesting, storage, conversion and saving, environmental protection and toxicity management, decontamination, air cleaning, integration of data and information technology for higher information availability and connectivity, critical materials substitution, biotechnology, topography, health care, or mobility
There are four major themes of computational challenge common to all sub-‐fields:
1. Exploiting exascale architectures. Highly parallel computing platforms, with tens or hundreds of thousands of cores as well as special-purpose processors, such as powerful graphics cards or accelerators, will be essential for the future of this field. In response to these new computer architectures, a critical investigation and exploration of algorithms, and of their computational implementation, is essential to enable such platforms to be utilised. The stochastic methods, linear-scaling methods and divide-and-conquer algorithms developed during the Pflop/s era need to be adapted to the exascale (the locality idea behind such methods is sketched after this list). Truly massively parallel computing is a major challenge for the future.
2. An increase in the precision and robustness of the computational models is required to improve their predictive capability for larger systems over longer timescales. The computational cost often increases as a power law with a large exponent as a function of system size. The underlying algorithms are frequently more complex but potentially benefit most from highly parallel computing platforms. Redeveloping these algorithms for new computer architectures is a major challenge.
3. Bridging seamlessly the gap between the different length and timescales inherent to this field, in order to transition between the composition of a material, its processing and its conditions of use. One would like to have a single model that bridges atomistic and continuum descriptions seamlessly, i.e. that contains the atomistic and continuum limits as special cases. This problem has yet to be solved, but the mathematical methods required for multiscale modelling are developing rapidly. These include such topics as multi-resolution analysis, high-dimensional computation, domain decomposition, turbulence, level sets, and discrete mathematics. These topics need to be explored from the point of view of their application to various materials-science problems, ranging from differential equations to stochastic simulation. A further dimension is the increased use of materials informatics, and the development of databases and their curation and mining.
4. Long-time trajectory simulations are needed. The simulation of nucleation, growth, self-assembly and polymerisation is central to the design and performance of many diverse materials such as rubbers, paints, fuels, detergents, functional organic materials, cosmetics and food.
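As a minimal illustration of the divide-and-conquer idea referred to in theme 1, the sketch below evaluates a short-ranged pair energy with a cell list, so that the cost grows linearly with the number of particles rather than quadratically. This is a classical analogue chosen for brevity; linear-scaling electronic-structure methods exploit the same spatial locality in far more sophisticated ways, and all parameters here are illustrative.

    # O(N) pair energy via a cell list: each cell is at least one cutoff
    # wide, so interacting pairs can only sit in the same or adjacent cells.
    import numpy as np

    def pair_energy_cell_list(positions, cutoff, box):
        n_cells = int(box // cutoff)
        cell_size = box / n_cells
        cells = {}
        for i, p in enumerate(positions):
            key = tuple((p // cell_size).astype(int))
            cells.setdefault(key, []).append(i)
        energy = 0.0
        for (cx, cy, cz), members in cells.items():
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    for dz in (-1, 0, 1):
                        neigh = cells.get((cx + dx, cy + dy, cz + dz), [])
                        for i in members:
                            for j in neigh:
                                if j <= i:
                                    continue  # count each pair once
                                r = np.linalg.norm(positions[i] - positions[j])
                                if r < cutoff:
                                    energy += 4.0 * (r**-12 - r**-6)  # LJ pair
        return energy

    rng = np.random.default_rng(0)
    positions = rng.uniform(0.0, 20.0, size=(1000, 3))  # 1,000 particles
    print(pair_energy_cell_list(positions, cutoff=2.5, box=20.0))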
4.2.1 Materials Science
4.2.1.1 Hard Matter
State of research

Hard condensed matter encompasses an extremely rich variety of materials and systems.39 These materials find their application in solar cells, fuel cells, SQUIDs, sensors, responsive actuators, headphones, non-volatile resistive and magnetic memories, processors, cameras, smart phones, batteries, cars, rockets, satellites, etc. Non-equilibrium phenomena are of great practical importance in such diverse areas as optimising self-assembled and biomimetic techniques to produce and process materials, manufacturing technologies, designing energy-efficient transportation, processing structural materials, or mitigating the damage caused by earthquakes.
Challenges
Challenge: Strongly correlated and quantum materials
• Multiferroicity, colossal magneto-resistance, exotic superconductivities, Mott insulation, Coulomb blockade, the Kondo effect, heavy fermions and orbital ordering are variants of strongly correlated electron phenomena. Our understanding of the underlying physics and chemistry of the emerging class of strongly correlated materials is still severely limited and is hampering technological applications. The current development of the LDA+DMFT method (see section 4.3) provides a new approach that addresses the complexity of the quantum properties of these materials and genuinely requires exascale-level simulations.
• Topological and Chern Insulators, the (anomalous-‐, spin-‐, quantum spin, quantum anomalous) Hall effect, the orbital moment and the magneto-‐electric coupling are examples of a new class of quantum materials that is classified by the topological nature of the electrons at the Fermi surface. The spin-‐orbit energy is the relevant energy scale. A reliable integration of the Fermi surface requires an extremely high resolution and is a field that benefits from massively parallel computing.
39 To give an impression of the vastness of the field, we mention metals, semiconductors, insulators, alloys, glasses, amorphous materials, quasi-‐crystals, heterostructures, nanoclusters, quantum dots, graphene, nanotubes, buckyballs and related structures, zeolites, wires, composite materials, phase change materials, smart materials, steel, shape memory materials, magnets, giant magnetoresistance materials, colossal-‐magnetoresistive oxides, spin-‐chain and spin-‐ladder compounds, magneto-‐optical recording materials, piezo-‐ and ferro-‐electric materials, electro ceramics, tunnel junctions, multiferroics, artificially engineered metamaterials, topological insulators, high-‐temperature superconductors, organic superconductors, organic electronic materials, porous silicon, Bose–Einstein condensates, transistors.
Figure 4.1. Visualisation of the near-‐tip deformation in brittle fracture (Alessandro De Vita, King’s College, London).
Challenge: Materials informatics
Materials informatics offers the potential to develop new materials technologies at a faster rate, and in a more cost-effective way, than previously possible. The challenge is to reduce development time both for the discovery of new materials and for the prediction of their properties and processing. This accelerated approach provides closer alignment with the product development cycle while contributing to increased product performance. Some of the techniques of interest in materials informatics are standard – for example, quantum methods for computing the stability of materials, or information techniques such as data mining or data analytics (in statistics). However, combining these techniques to exploit materials informatics is not standard practice and offers a novel approach to materials design. Due to the increased availability of computational resources, our ability to address complex materials issues is improving dramatically: for example, it is possible to run many thousands of potential material calculations and generate notable ‘theoretical databases’. Databases of derived materials, with calculated physical and engineering properties, will no doubt become an increasingly important tool for researchers and engineers working in fields related to materials development. The capacity computing capabilities of an exascale facility will drive the establishment and management of such databases. This effort is closely related to the Materials Genome Initiative for Global Competitiveness,40 initiated by the Office of Science and Technology Policy of the Executive Office of the President of the United States.

40 http://www.whitehouse.gov/sites/default/files/microsites/ostp/materials_genome_initiative-final.pdf
Challenge: Multiscale modelling
A pressing research challenge involves the integration of the various length and time scales relevant for materials science, briefly outlined above. Multiscale materials simulation is currently a high-‐impact field of research, where much effort is focused towards more seamless integration of the length and time scales, from the electronic structure calculations, atomistic and molecular dynamics, kinetic and statistical modelling to the continuum. Together with new and emerging techniques, the provision of increased computational power can yield answers to versatile and complex questions central to materials manufacture, properties, performance and technological applications. Typical examples of multiscale materials problems include the following.
• Modelling related to materials growth, processing and modification using electron and ion beams or plasma techniques. Examples are chemical vapour deposition (CVD) or atomic layer deposition (ALD) growth of thin films and coatings, where the scales vary from the sub-‐nanometre surface region to the metre-‐scale reactor.
• Friction and sealing. These topics are extremely important for mechanical engineering and physics involving, for example, functionality, energy efficiency, environmental protection, safety and miniaturisation of technological devices. Orders of magnitude in length scale have to be covered.
• Ageing of engineering materials. For engineering materials in our daily environment (e.g. from cement and concrete to clay sediments or materials for waste storage), understanding the relationship between their complex, hierarchical microstructure and the long-term evolution of transport or mechanical properties is at the basis of improving durability and sustainability. Nano- and micro-scale processes – from ion and water transport to local evolution of composition and morphology – drive the evolution and ageing of their mechanical properties and may lead to functional failure and structural deterioration. Computational physical chemistry, statistical physics and experimental approaches designed for glasses and amorphous materials can give insight into this critical range of length scales, from nanometres to microns.
• Brittle fracture. Successive attempts have gradually made it clear that fracture in pure phases such as glasses, crystalline semiconductors and minerals – as well as in complex systems such as advanced ceramic fuel cells, thermal barriers and biomimetic coating films – presents challenging problems for theory. This is mainly because high accuracy and large system sizes are both necessary ingredients for modelling failure in brittle materials. On the one hand, the ionic or covalent bond-breaking and formation associated with the advance of a brittle fracture (accompanied, for example, by surface reconstruction, chemical attack by inflow of corrosive species, or reactions with pre-existing impurities) require interatomic potentials truly capable of quantum-chemical accuracy. On the other hand, the need to capture faithfully the stress concentration phenomena requires large-scale (~10⁶ atoms) model systems. All of the above has made brittle fracture an extremely hard problem to tackle. These simulations contribute decisively to the prediction of the lifetime of high-performance materials in energy technologies such as high-efficiency gas turbines.
Many of the above problems have been studied on the continuum level. A sufficiently detailed and realistic computational modelling and understanding of these highly complex technological processes can only be achieved by large-‐scale computer simulations combining a large number of particles and long timescale simulations at different length and time scales.
4.2.1.2 Soft Matter
State of research

Soft condensed matter (see Figure 4.2) encompasses polymers, colloids, membranes, amphiphiles and surfactants, synthetic and biological polymers, lipids and proteins, as well as organic–inorganic hybrid systems. Classical chain molecules, i.e. polymers in the narrower sense, form only a subgroup of soft matter and serve as a reference for model building. These macromolecules find their applications in many different kinds of materials such as rubbers, paints, fuels, detergents, functional organic materials, cosmetics, food, bio-membranes, the cytoskeleton and the cytoplasm of living cells.
Figure 4.2. The ‘classical’ soft matter fields of (synthetic) polymers, hard-‐sphere colloids and amphiphilic systems have merged into a single research area during the last decade, because many macromolecules are studied today which display polymeric, colloidal or amphiphilic character simultaneously. (Copyright: Forschungszentrum Jülich.)
PRACE – The Scientific Case for HPC in Europe Materials Science, Chemistry and Nanoscience
91
The unifying principles of soft matter systems are structures on mesoscopic length scales from nanometres to micrometres, and typical energy scales of the order of the thermal energy. Thermal fluctuations play an important role because of the relatively low energy density in these materials – they are ‘soft’. Self-assembly is a dominant feature of soft materials and the essential reason for their complexity. The many interacting degrees of freedom imply that entropy plays an important role, or even dominates in many cases, leading to universal behaviour. The spatially large molecules are able to fluctuate strongly in their shape (conformation). Consequently, processes on the microscopic-atomistic level and those on the mesoscopic level contribute to the materials’ properties in equal measure. The large range of relevant length scales implies a large range of timescales. Indeed, there are often ten or more orders of magnitude between the typical timescales of local atomic movements and those of the mesoscopic and macroscopic phenomena. Unlike ‘hard matter’, the solid phases of soft matter are at best partially crystalline; more typically they are in an amorphous, glass-like state, and often heterogeneous on mesoscopic scales.
Challenges

A sufficiently detailed theoretical modelling and understanding of highly complex soft matter systems can be achieved only by large-scale computer simulations. This is a huge challenge, as the relevant structures span many orders of magnitude in length scale. As far as the timescales are concerned, the problem is more challenging still. Already in the simplest polymer system, a melt of linear neutral homo-polymers, structures are encountered on lengths from the 0.1 nanometre scale of covalent bonds up to clusters 10 nanometres in size. The length scale of collective phenomena is larger still by an order of magnitude. Relaxation times range from the period of binding-angle vibrations of about 0.1 picoseconds up to 10 microseconds for reptation movement, i.e. eight orders of magnitude. (The movement of a chain in a dense polymeric system is highly constrained: due to entanglements with other chains, lateral motion of the chain at many points is highly improbable.) The most important questions concern the dependence on polymer chain length, on concentration for multi-component systems, and on temperature, and therefore require many simulations of this kind. Thus, huge amounts of computer time are needed to simulate soft matter and soft materials in thermal equilibrium.
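The scale gap quoted above translates directly into the number of integration steps a brute-force simulation would need. A minimal sketch, assuming an atomistic time step of 1 femtosecond (an assumption; the step must in any case lie well below the ~0.1 ps vibration period):

    t_fast = 0.1e-12        # s, binding-angle vibration period (from the text)
    t_slow = 10e-6          # s, reptation time (from the text)
    print(t_slow / t_fast)  # 1e8 -> the eight orders of magnitude cited

    dt = 1e-15              # s, assumed atomistic integration step
    print(t_slow / dt)      # 1e10 steps for a single equilibrium trajectory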
Similar considerations apply to the simulation of charged systems, such as poly-electrolytes in polar solvents. Poly-electrolytes, charged colloids and charged amphiphilic molecules are present in a wide range of systems and applications, from biological systems (where charged bio-macromolecules, like DNA and proteins, are ubiquitous) to waste-water treatment. Here, the main challenge is the long-range nature of the electrostatic interaction between the charged macromolecules, the counter-ions and the salt ions, and the multi-component nature of the relevant systems.
Questions concerning the behaviour of soft matter under flow are even more challenging. Examples include the extrusion process during the fabrication of polymer materials by injection moulding; the directed self-‐assembly of nano-‐colloids to obtain formulations with desired properties for either processing or function of nanoparticle-‐based materials, and the flow properties of blood cells. The main issue here is the incorporation of the long-‐range hydrodynamic interactions, and the description of the interplay of hydrodynamic flows, the deformation of macromolecules, membranes, droplets, vesicles and cells, the effect of walls and confinement, and the effect of thermal fluctuations. To address these challenges, mesoscale hydrodynamics simulation techniques, such as Lattice-‐Boltzmann, Dissipative Particle Dynamics, and Multi-‐Particle Collision Dynamics, have been developed in recent years. Although these approaches need further development, they already allow the investigation of many interesting and important issues.
Another important issue is the desirable continuous accompaniment of experimental studies by simulation, which will only become possible if the computer performance available today is increased dramatically. Further parallelisation can be the solution only in special cases, since in many cases the system's temporal development has to be followed. Therefore there are, apart from the need for considerably more
powerful supercomputers, intensive efforts to develop simulation methods with which several length and time scales can be systematically linked to each other (multiscale simulations). Real progress is only possible if both developments go hand in hand. Examples are the consideration of local ion interactions and explicit solvents (e.g. the molecular structure of water) for poly-electrolytes, to which almost all biopolymers belong; the dynamics of realistic polymer melts with branched polymers; the phase behaviour of multi-component systems; and scale-spanning calculations with realistic dynamics and with conformation changes of smaller and, later, larger biopolymers.
Finally, downsizing experiments to the level of control of single molecules, or to very small dimensions as in micro- and the upcoming nano-fluidics, will open the pathway to performing experiments in parallel on the computer, giving unprecedented insight into molecular processes. This requires the parallel development of hardware as well as of new simulation methods. From a computational point of view, ‘very large’ systems remain to be studied, and these come within range with the new powerful hardware. Then non-equilibrium behaviour, the basis for almost all natural and technological processes, can be investigated. This should help us to understand these processes, but it will also help to improve force fields. These are typically parameterised on the basis of macroscopic experiments, while quantum chemistry can be used for bonded interactions; however, both suffer from a naturally rather low level of accuracy compared to the needs in this field, so the combination of experiment and simulation is also promising here.
4.2.2 Chemistry
State of research

Computational chemistry is currently concerned with:
• The design and production of new substances, materials, and molecular devices with properties that can be predicted, tailored, and tuned before production
• The simulation of technologically relevant chemical reactions and processes, which has a huge potential in a variety of fields
• The control of how molecules react, over all timescales and the full range of molecular sizes, including catalysis, which remains a major challenge in the chemistry of complex materials, with many applications in industrial chemistry – e.g. a combinatorial materials search with a realistic treatment of supported catalytic nanoparticles involving several hundred transition-metal atoms would require resources of at least 10 Pflop/s
• The knowledge of atmospheric chemistry, which is crucial for environmental prediction and protection (clean air).
Challenges
Challenge: Quantum chemistry
The key goal of quantum chemistry is the accurate calculation of the geometrical and electronic ground-state properties of molecules as well as of their excited states. The requested chemical accuracy of 1 kcal/mol is difficult to achieve with the functionals available in density functional theory. On the other hand, quantum chemical methods are predominantly applied to small isolated molecules, which correspond to the ideal-gas state. Most chemical processes, however, take place in the condensed phase, and the interaction of a molecule with its environment can generally not be neglected.
• Solvent effects. Solvent molecules can directly interact with the reacting species, for example by coordination to a metal centre or by formation of hydrogen bonds. In such cases, it is necessary to include explicitly solvent molecules in the calculation. Depending on the size of the solvent molecules and the number needed for convergence of the calculated properties, the overall size of the molecular system and the resulting computational effort can increase significantly. Currently, only DFT methods are able to handle several hundred
atoms, but the developments towards linear-scaling approaches in quantum chemical wavefunction-based methods are very promising (see below). An alternative would be a mixed quantum mechanical (QM) and molecular mechanical (MM) treatment (the QM/MM method; the energy partitioning behind one common variant is sketched after this list). If there are no specific solute–solvent interactions, the main effect of the solvent is electrostatic screening, depending on its dielectric constant. This can be described very efficiently by continuum solvation models (CSM).
• Spectroscopy for large molecules. Calculated molecular spectroscopic properties are very helpful in the assignment and interpretation of measured spectra, provided that the accuracy is sufficiently high. In many cases, spectra can be obtained with reasonable accuracy at the DFT or MP2 level, but ultraviolet and visible spectra normally require more elaborate theories like configuration interaction (CI) which are only applicable to very small molecules. To improve on the currently applied semi-‐empirical approaches, it would be necessary to calculate accurate excitation energies also for large molecules such as organic dyes with 50 or more atoms. One option is to implement methods for division of large systems into separate fragments to be calculated with quantum chemical methods on separate nodes in parallel (denoted by different authors as divide-‐and-‐conquer or fragment molecular orbitals).
• Accurate thermochemistry. The highly accurate calculation of thermochemical data with quantum chemical methods is currently possible only for small molecules of up to about 10 atoms. However, much of the data for molecules of this size is already known, whereas accurate experiments for larger compounds are quite rare. Therefore, efficient quantum chemical methods are needed which are able to treat molecules with 30–50 atoms at the same level of accuracy. Another problem arises for large molecules: they often have a high torsional flexibility, and the calculation of partition functions based on a single conformer is therefore not correct. Quantum molecular dynamics could probably give better answers, but in many cases it is too expensive. This will clearly change in the coming decade.
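The QM/MM idea mentioned in the solvent-effects item above can be made concrete with the subtractive (ONIOM-style) energy expression, in which the whole system is treated with a cheap force field and the chemically active region is corrected at the quantum level. The sketch below is a minimal illustration with placeholder energy functions; the function names and dummy values are hypothetical, standing in for calls to a quantum chemistry code and a force field.

    def e_qm(atoms):
        """Placeholder for a quantum-chemical energy (e.g. DFT) of a fragment."""
        return -10.0 * len(atoms)        # dummy value for illustration

    def e_mm(atoms):
        """Placeholder for a force-field energy of a fragment."""
        return -1.0 * len(atoms)         # dummy value for illustration

    def qmmm_energy(full_system, qm_region):
        """Subtractive QM/MM: MM everywhere, QM correction on the active region."""
        return e_mm(full_system) - e_mm(qm_region) + e_qm(qm_region)

    solute = ["C", "O", "H", "H"]        # chemically active region
    solvent = ["O", "H", "H"] * 100      # explicit solvent shell
    print(qmmm_energy(solute + solvent, solute))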
Challenge: Photochemistry
Sunlight is the predominant energy source on Earth and a key factor in photosynthesis; it is intimately related to life. The in-depth understanding of the nature of electronic excited states in biological and other complex systems is unquestionably one of the key subjects in present-day chemical and physical sciences. It is well known that the interaction between light and matter has many important consequences in biological processes and in the elaboration of advanced materials, ranging from the comprehension of physiological processes (e.g. vision) to the development of phototherapeutic drugs. There are wide-ranging technological applications of these processes: from the elaboration of molecular photoelectronic devices to the design of efficient solar cells, excited-state chemical synthesis and the quantum control of reactions, to possible applications in quantum computing and information processing using excited molecules.
It is a challenge to simulate realistic photo-activated processes of interest in biology and materials science. These phenomena usually involve non-adiabatic transitions among the electronic states of the system, induced by the coupled motion of electronic and nuclear degrees of freedom. Consequently, their simulation requires both accurate ab-initio calculations of the (many) electronic states of the system and of the couplings among them, and the non-adiabatic time evolution of its components. Several approaches have been developed recently to tackle these problems, but the techniques currently available are generally either not efficient or not accurate enough to provide a reliable tool for studying photo-physical processes in complex systems. The importance of the field will drive progress in the future.
4.2.3 Nanoscience
State of research

Nanoscience and nanotechnology are typically understood as research and technology development at the atomic, molecular or macromolecular levels, on length scales in the approximately 1–100 nanometre range, i.e. from single atoms up to around one billion atoms. The field creates and uses structures, devices and systems that have novel properties and functions because of their small or intermediate size, and the ability to control or manipulate matter on the atomic scale is an essential part of it. Atomic details are still important: surface charge, impurities, dopants, vacancies, clusters, symmetry, step edges and corners, and passivation. A large number of simulation challenges and opportunities can be found in the broad topical areas of (i) nano building blocks (nanotubes and graphene, quantum dots, clusters and nanoparticles, organic materials, DNA), (ii) complex nanostructures and nano-interfaces, (iii) transport mechanisms at the nanoscale, and (iv) the assembly and growth of nanostructures.
Over the past 10 years, the focus in theory, modelling and simulation research has been on elucidating fundamental concepts that relate the structure of matter at the nanoscale to the properties of materials and devices. As a result, theory, modelling and simulation have played an important role in developing a fundamental understanding of nanoscale building blocks. Computational capability has increased by more than a factor of 1,000 over the past 10 years, leading to more ambitious simulations and wider use of simulation. For example, the number of atoms simulated by classical molecular dynamics for 10 ns durations has increased from fewer than 10 million in 2000 to nearly 1 billion in 2010. Over the past decade, new theoretical approaches and computational methods have also been developed and are maturing. On the K-computer in Japan, a simulation41 recognised by the Gordon Bell Prize at the end of 2011 achieved an execution performance of more than 3 Pflop/s. This simulation of the electron states of silicon nanowires with approximately 100,000 atoms (20 nanometres in diameter and 6 nanometres long) – close to the actual size of the materials – showed that the electron transport characteristics change depending on the cross-sectional shape of the nanowire.

41 Y. Hasegawa et al., Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (ACM New York, 2011)
Opportunities

Taking into account the establishment of an exascale infrastructure and the ongoing developments of the computational methods, for the first time in history there will be a direct overlap of experimentally and computationally accessible length scales. This creates unprecedented opportunities for science and technology to gain detailed knowledge of dynamic and transport processes at the nanoscale, moving in the direction of designer functionality and designer materials.
Challenges

Our understanding of self-assembly, programmed materials, and complex nanosystems and their corresponding architectures is still in its infancy, as is our ability to support design and nanomanufacturing. Two challenges are outlined below.
Challenge: Ab-initio quantum device simulation
The advance of faster and less energy-‐consuming information processing or the development of new generations of processors requires the shrinking of devices, which demands a more detailed understanding of nanoelectronics. As semiconductor devices become smaller, so it becomes more difficult to design or predict their operation using existing techniques. On the other hand, given this reduction in size, the next generation of supercomputers will enable us to perform simulations for whole practical nanoscale devices, based on electronic theory and transport theory, and to develop
guidelines for designing new devices that incorporate the quantum effects controlling nano-level phenomena. This requires the description of the temporal evolution of a switching quantum device with defects and leads. Envisaged are pico-second simulations – for example, based on time-dependent density functional theory, of 1,000,000 atoms of a spin-torque magnetic random access memory (MRAM), of nano-ionics-based resistive switching memories, or of organic electronics – all possible for the first time with the advent of exascale computing.

Figure 4.3. Spatial distribution of the local density of states for the phase-change material Ge₅₁₂Vac₅₁₂Sb₁₀₂₄Te₂₀₄₈ in a supercell of 4,096 sites. The lower left part of the cube displays the chemical information and the upper right part the value of the local density of states (DOS); large (small) sphere radii correspond to high (low) DOS values, as specified in the right panel. In both parts of the plot Ge, Vac, Sb and Te are shown in white, transparent, light blue and dark blue, respectively. (A. Thiess, R. Zeller, P. H. Dederichs & S. Blügel, 2012.)
Challenge: Design of nanostructures
Self-assembly is a central feature of nanoscience. Understanding, predicting and controlling this process is crucial for the design and manufacture of viable nanosystems. Clearly, the subsystems involved in this process assemble themselves according to some minimum-energy principle. Once an understanding of the underlying physics is available, optimisation problems can be formulated to predict the final configurations. Since these systems are also huge and likely to have many local minima, a careful development of the models, the constraints and the algorithms will also be required here. The description is pursued within a QM/MM context.
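The optimisation view taken above can be illustrated on a toy energy landscape. The sketch below applies basin hopping – one standard global-optimisation strategy for landscapes with many local minima – to a small Lennard-Jones cluster; the cluster size, iteration count and random start are illustrative choices, not a recipe for a production study.

    import numpy as np
    from scipy.optimize import basinhopping

    N = 7  # particles in the toy cluster

    def lj_energy(x):
        """Total Lennard-Jones energy of N particles (reduced units)."""
        pos = x.reshape(N, 3)
        e = 0.0
        for i in range(N):
            for j in range(i + 1, N):
                r = np.linalg.norm(pos[i] - pos[j])
                e += 4.0 * (r**-12 - r**-6)
        return e

    x0 = np.random.uniform(-1.0, 1.0, 3 * N)          # random start
    result = basinhopping(lj_energy, x0, niter=100, seed=1)
    print(result.fun)  # should approach the known LJ7 minimum near -16.5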
4.3 A Roadmap for Capability and Capacity Requirements
In the past decade, the fundamental techniques of theory, modelling and simulation that are relevant to materials science have undergone a stunning revolution. This has made the community intense users of computing at all levels, driven by the change from the Tflop/s to the Pflop/s level. The field has produced several Gordon Bell Prize winners for the ‘fastest application code’ – in 1988 (1 Gflop/s), in 2001 (11 Tflop/s) and most recently in 2011 (3.08 Pflop/s, on the K-computer in Japan, for the electronic structure of silicon nanowires). But, impressive as the increase in computing power has been, it is only part of the overall advance in simulation that has occurred over the same period. Advances in the period include the following.
• New mesoscale methods (including dissipative particle dynamics and field theoretic polymer simulation) have been developed for describing systems with long relaxation times and large
spatial scales, and these are proving useful for the rapid prototyping of nanostructures in multicomponent polymer blends. Here the requirement is for larger simulations times and larger systems.
• Molecular dynamics with fast multipole methods for computing long-range inter-atomic forces has made possible accurate calculations of the dynamics of millions and sometimes billions of atoms. The requirement here is also for longer simulation times and larger systems. Molecular dynamics simulations are important in soft matter but also in the life sciences.
• Monte Carlo methods for classical simulations have undergone a revolution, with the development of a range of techniques (e.g. parallel tempering, continuum configurational bias and extended ensembles) that permit extraordinarily fast equilibration of systems with long relaxation times. Recently, Monte Carlo methods have been combined with DFT methods to determine materials-specific thermodynamic properties.
• Density functional theory (DFT) and extensions such as ab-initio ‘Car–Parrinello’ molecular dynamics (ab-initio MD) and time-dependent DFT (TDDFT) have transformed materials physics and likewise computational chemistry, surface science and nanoscience. These methods provide the capability to describe the electronic structure, interatomic forces and, in part, the electronic excitations of molecules and condensed media containing hundreds or thousands of atoms in a computational volume that might be periodically repeated (see Figure 4.3), together with their static and dynamical structural properties. A large variety of DFT methods42,43 have been developed that are able to cope with the chemical challenge of the periodic table, the heterogeneity of systems and the structural and compositional diversity, i.e. large classes of molecules and materials can now be described with reliable predictability. Applying the popular local density approximation (LDA), the accuracy of calculated energy differences between equilibrium states is estimated at about 3 meV/atom (~0.1 kcal/mol; a worked unit conversion is given at the end of this list), 0.2% for a charge density difference, and an atomic force difference of 10^-5 atomic units. Ab-initio MD has extended its field of applications through the development of algorithms to capture rare events. In particular, a number of linear-scaling methods have been developed or are under development, some of which are becoming efficient for system sizes larger than 10,000 atoms. These are geared towards efficient use of massively parallel computers, are likely to be extendable to exascale computing and can bring the computable system sizes to new horizons. In the past 10 years we have witnessed a drive to extend the applicability of DFT to wider classes of systems exhibiting strong electron correlations (oxide materials, defects, partially filled d- and f-electron systems) or long-range correlation, as exemplified by the van der Waals interaction. Modern approaches to improve the description by better exchange-correlation functionals are based on orbital-dependent quantities, such as hybrid functionals, range-separated functionals or a separated treatment of exchange and correlation (e.g. exact exchange plus the random-phase approximation). These functionals improve the predictability of properties (e.g. the enthalpy of formation of molecules is on average better than 3 kcal/mol for the B3LYP functional), demanding some 10–100 times more CPU time. Parallelisation on massively parallel computers seems non-trivial, while local accelerators attached to a CPU would appear beneficial.
42 Yousef Saad, James R. Chelikowsky & Suzanne M. Shontz, SIAM Review, vol. 52, pp. 3–54 (2010)
43 http://www.psi-k.org/codes.shtml
• Beyond DFT, Hedin’s GW approximation, based on many-body perturbation theory, has been implemented in many electronic structure codes and has found widespread use in calculating spectroscopic properties. Originally used primarily to calculate the band gaps of semiconductors, GW has recently been applied – thanks to increased computational resources and a diversity of methods – to study surfaces, nanostructures and molecules. A detailed comparison of the self-energy in the GW context with the exchange-correlation functional in DFT is expected to pave the way for further improvement of functionals. This interplay of different approaches to correlated systems has already been exploited in the realm of TDDFT.
• Wavefunction-based schemes are the norm for the quantum chemistry community. They allow a systematic approach to the exact solution of the electronic Schrödinger equation and in this way offer a hierarchy of methods useful for estimating the error bars of simpler approximations. Also, traditional DFT methods are not capable of correctly predicting the breaking of chemical bonds or describing the dark excited states that control photochemistry. However, standard wavefunction-based methods have computational costs that rise steeply with the number of atoms in the molecule. Much effort is focused on numerical efficiency and parallelisation. The treatment of ‘static’ correlation is often centred on the (usually unattainable) full-configuration interaction (CI) method. In practice, CASSCF and CASPT2 are among the most successful methodologies, with multi-configurational perturbation theory (PT2) adding the necessary dynamical correlation. For ‘dynamical correlation’ (or single-reference) problems, quantum chemists have developed Coupled Cluster (CC) methodologies such as CCSD(T), and perturbation theories such as MP2. In recent years, the range of systems amenable to highly accurate CC calculations has increased dramatically. Although parallelisation on massively parallel computers is difficult, a breakthrough can be expected in the next decade with the application of tensor-network theory and tensor approximation to quantum chemistry. The introduction of linear-scaling methods for many quantum-chemical methods and for the computation of various molecular properties has circumvented the steep increase of computational effort with molecular size. These methods exploit the local electronic structure and open the way to treating large molecular systems of 1,000 atoms and more at the Hartree–Fock, DFT and MP2 levels. This community would appear to be best served by a diverse mix of architectures, including those with computationally powerful fat nodes.
• The interest in Quantum Molecular Dynamics continues to grow. The standard method of solving the Schrödinger equation uses a representation of the wavepacket and Hamiltonian in an appropriate product basis. The method is restricted by the computational resources required, which grow exponentially with the number of degrees of freedom (made explicit at the end of this list). The treatment of tetra-atomic systems is now becoming the state of the art, but studies of systems with more than six degrees of freedom are in general still impossible. The multi-configuration time-dependent Hartree (MCTDH) algorithm, which corresponds to a multi-configurational mean-field method, does not overcome the exponential scaling but significantly alleviates the problem through the construction of a variationally optimised moving basis. MCTDH is arguably today’s most powerful wavepacket propagation method, and it can be applied to systems typically involving 20–50 degrees of freedom. The exponential scaling can be avoided by turning to more approximate, in particular semi-classical, methods, where the wavepacket is approximated by an ensemble of particles that follow classical trajectories (e.g. ab-initio path integral molecular dynamics). A considerable range of theoretical methods is applied to tackle these systems.
• Quantum Monte Carlo (QMC) methods now promise to provide nearly exact descriptions of the electronic structures of molecules. Traditionally, these methods have been based on the variational MC method, or on diffusion MC and Green’s function MC. The latter two are projection approaches, which dispense with quantum-chemical basis sets but have to deal with the Fermion sign problem and the related fixed-node approximation. In general, all QMC methods exhibit good scaling with the number of electrons, enabling the treatment of relatively large systems, but at a computational cost much larger than that of traditional ab-initio methods based on DFT. Unlike wavefunction-based quantum-chemical methods, they are essentially stochastic in the way they seek to solve the electron correlation problem exactly, and thus benefit significantly from the massively parallel computer architectures that become effective at Pflop/s and Eflop/s scales (a minimal illustration is given at the end of this list). This is one reason why their importance will increase during the next decade.
• During the last few years, conventional electronic structure calculations based on DFT in the local density approximation (LDA) have been merged with a modern many-body approach – dynamical mean-field theory (DMFT) – into a novel computational method, referred to as LDA+DMFT, to address strongly correlated electron systems. The core of the method is the solution of the effective multiband impurity problem, achieved by a quantum impurity solver – typically the Hirsch–Fye QMC algorithm based on a time-discretisation approach. A new generation of continuous-time Quantum Monte Carlo (CT-QMC) methods for the numerically exact calculation of complicated fermionic path integrals has recently been proposed for interacting electrons, based on the weak-coupling and strong-coupling perturbation expansions. This methodological breakthrough in quantum many-body theory will stimulate the realistic modelling of the electronic, magnetic, orbital and structural properties of materials such as transition metals and their oxides. It still needs considerable development to be able to treat increasingly complex systems.
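As a consistency check on the LDA accuracy figures quoted in the list above (our worked conversion, using 1 eV ≈ 23.06 kcal/mol):

\[
3\ \mathrm{meV/atom}\times 23.06\ \frac{\mathrm{kcal/mol}}{\mathrm{eV}}\approx 0.07\ \mathrm{kcal/mol}\ \sim\ 0.1\ \mathrm{kcal/mol}.
\]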
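The exponential scaling of quantum molecular dynamics noted in the list above can be made explicit (our illustration of the standard counting argument). In a direct product basis with n primitive functions per degree of freedom and f degrees of freedom, the wavepacket expansion

\[
\Psi(q_1,\dots,q_f,t)=\sum_{j_1=1}^{n}\cdots\sum_{j_f=1}^{n} C_{j_1\cdots j_f}(t)\,\chi_{j_1}(q_1)\cdots\chi_{j_f}(q_f)
\]

requires \(N=n^{f}\) coefficients: a modest n = 10 with f = 6 already gives 10^6 coefficients, and f = 50 would give 10^50. MCTDH reduces the effective n by optimising the single-particle functions variationally, but the scaling remains exponential in f, which is why semi-classical approximations become attractive for larger systems.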
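To illustrate why the stochastic methods in the list above parallelise so naturally, the following minimal sketch (an illustration only, not any production QMC code) runs independent variational Monte Carlo walkers for a 1D harmonic oscillator with a Gaussian trial wavefunction exp(-αx²). The statistical error falls as the inverse square root of the total sample count, so adding processors reduces the error with essentially no inter-process communication.

```python
import numpy as np
from multiprocessing import Pool

ALPHA = 0.4  # trial-wavefunction parameter (exact ground state at 0.5)

def local_energy(x, a=ALPHA):
    # E_L = a + x^2 (1/2 - 2 a^2) for H = -(1/2) d^2/dx^2 + (1/2) x^2
    return a + x * x * (0.5 - 2.0 * a * a)

def walker(seed, n_steps=200_000, step=1.0, a=ALPHA):
    """One independent Metropolis walker sampling |psi|^2 = exp(-2 a x^2)."""
    rng = np.random.default_rng(seed)
    x, e_sum = 0.0, 0.0
    for _ in range(n_steps):
        x_new = x + step * rng.uniform(-1.0, 1.0)
        if rng.random() < np.exp(-2.0 * a * (x_new**2 - x**2)):
            x = x_new
        e_sum += local_energy(x, a)
    return e_sum / n_steps

if __name__ == "__main__":
    with Pool() as pool:                      # walkers never communicate
        energies = pool.map(walker, range(8))
    err = np.std(energies) / np.sqrt(len(energies))
    print(f"E = {np.mean(energies):.4f} +/- {err:.4f}  (exact: 0.5)")
```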
4.3.1 Materials Science
In materials science all these methods come into use as elements of multiscale materials design, where the design of a material includes all aspects from functionality to manufacturing.
Steel is an example of a seven-component alloy: its macroscopic properties depend on the microscopic properties of the seven chemical constituents, which in turn determine the mesoscopic structure. Catalysts, device elements, smart materials and composite materials are other examples where materials screening will develop into a materials genome project, in which some 250,000 different materials systems have to be calculated to generate a database. This requires massive capacity computing with high throughput requirements – a factor of 1,000 improvement in throughput is necessary to make materials informatics a powerful and widely used tool.
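The capacity-computing pattern described here is a task farm: a very large number of small, independent calculations feeding a database. The sketch below shows the shape of such a driver in Python; it is an illustration only, and compute_properties and candidate_systems are hypothetical placeholders for a real first-principles calculation and a real composition generator.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import product

def compute_properties(system):
    """Hypothetical stand-in for one independent first-principles run.
    A real materials-genome workflow would launch a DFT calculation
    here and parse its output."""
    return {"system": system,
            "formation_energy": (hash(system) % 1000) / 100.0}  # dummy value

def candidate_systems():
    """Enumerate a toy composition space; a production screen would
    generate the ~250,000 materials systems mentioned above."""
    metals = ["Ti", "Zr", "Hf", "V", "Nb", "Ta"]
    return [(a, b, x) for a, b in product(metals, repeat=2) if a != b
            for x in ("O", "N", "C")]

if __name__ == "__main__":
    # Embarrassingly parallel: throughput, not single-job capability.
    with ProcessPoolExecutor() as pool:
        database = list(pool.map(compute_properties, candidate_systems()))
    print(len(database), "entries in the screening database")
```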
4.3.1.1 Strongly Correlated Electron Materials
LDA+DMFT represents a novel and extremely promising approach to treating strongly correlated electron materials realistically and computing their properties. There is strong motivation to develop this tool to predictive power; this is a major effort for the decade ahead. Today, an in-depth analysis of one system with three to five orbitals may take two years on a rack (4,096 cores) of a Pflop/s high-performance computer. The CT-QMC algorithm has the potential to push back the sign problem beyond seven orbitals and to reach lower temperatures that cannot be studied today for lack of CPU time. Switching to an exascale computer with 500 MByte per core will allow a much larger throughput of systems, which is absolutely necessary to scan the properties of these materials as a function of temperature, pressure and other external stimuli. Higher throughput is also necessary to validate the accuracy of the underlying model and to engage more people in this research. Only under these conditions can we use these methods in a materials science approach and unravel the secrets of this materials class for use in the context of functional design. Since the time-consuming algorithm is stochastic in nature, the method can make use of massively parallel computers. It is expected to scale to exascale computing and is certainly an application for a Tier-0 infrastructure. If more memory per core is available, different impurity solvers that are more robust against the fermionic sign problem can be employed. Considering the many materials that exhibit magnetism, superconductivity, multiferroicity, orbital ordering, the Mott transition or the Kondo effect – in bulk, at surfaces and interfaces, and in molecular crystals – this becomes a major activity once CPU time is available. The modest estimate of the required CPU time is 30 Pflop/s of Tier-0 access throughout the year.
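Taking the figures quoted above at face value, the cost of a single system can be expressed in core-hours (our arithmetic, not from the report):

\[
4{,}096\ \text{cores}\times 2\ \text{years}\times 8{,}760\ \tfrac{\text{h}}{\text{year}}\approx 7\times 10^{7}\ \text{core-hours per material system},
\]

which makes clear why scanning whole families of correlated materials as a function of temperature and pressure is a throughput problem requiring sustained, year-round Tier-0 access rather than occasional heroic runs.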
4.3.1.2 Soft Matter
Progress in computational studies of soft-matter systems requires the ability to investigate systems of greater complexity. Increasing complexity typically requires much larger system sizes. In many cases, due to increased relaxation times, it also requires much longer runs. In the past, progress has been possible on the one hand due to improved simulation codes, and on the other hand due to the enormous increase in computer power. Since many codes are already very efficient, the field relies heavily on the future increase of computer capability.
Today’s available computer resources are nowhere near adequate to simulate sufficiently realistic chemical models of long polymer chains in a melt for long enough to achieve predictions for real systems: this would require an increase in the power of present-day computers by two to three orders of magnitude.
An exascale infrastructure would allow access to important new areas of investigation. These include the following:
• The suppression of turbulence in liquids by addition of polymers (‘turbulent drag reduction’)
• The prediction of the flow behaviour of blood, from the squeezing of a single red blood cell through narrow capillaries to the streaming through arteries; membrane fluctuations and membrane functions, including their interaction with membrane proteins and the realistic consideration of the surrounding water
• The structure formation and function of molecular aggregates, as well as the connection between the variable conformation of macromolecules with functional groups (e.g. chromophores in fluorescent polymers) and their electronic properties
The latter examples are closely related to other fields of computational science, including catalysis research, quantum chemistry and fluid mechanics. To stay competitive in these fields and take part in the developments described, the available computational power has to increase by orders of magnitude, and access to it at national and international levels is indispensable. Serious consideration must also be given to the provision of a special-purpose computer for long MD runs.
A crucial aspect is the optimal and efficient use of massively parallel supercomputers for this very broad range of complex problems in soft matter. The ‘know-how’ on suitable parallelisation strategies must be strengthened and expanded.
4.3.2 Chemistry
Considering the great importance of non-adiabatic processes, from photoinduced charge transfer in energy harvesting and solar cells to femto- and atto-chemistry, time-dependent density functional theory in combination with Ehrenfest dynamics44 and quantum molecular dynamics will be used more extensively in the future. Since these methods require considerable computational resources, considerable benefit will arise from an exascale environment, expanding usage to a wider community. Estimated CPU time: ~10 Pflop/s.
44 10 times higher numerical effort than ab-initio molecular dynamics, typically large systems of a few hundred atoms
4.3.3 Nanoscience
Nanoscience has benefited considerably from DFT, Car–Parrinello MD and DFT tool development. Nanoscience at ‘1 nm’ length scales (i.e. a few thousand atoms) is achievable today on a few thousand processors, with a cubic scaling of the CPU time with system size. As a rule of thumb (with many exceptions), 1 processor and 2 GB of memory per atom shows good scalability, and the scalability limit is a few processors per atom. Density functional investigations at the nanoscale, using existing methods, have large aggregate CPU demands (billions of CPU hours per year). Nanostructure calculations require computation of the structures, energetics and dynamics of highly inhomogeneous systems. As is typical for this field, many runs of a similar kind are often required to reach a conclusion.
As an example, one ab-initio MD simulation of 1,000 atoms run for 20,000 molecular dynamics steps (1 time step is equivalent to 0.5 fs, total simulation time 10 ps) requires about 2 weeks of simulation time on 4,096 processors. Section 7.7 reveals about 6,000 DFT publications per year in Europe, about 750 of which result from ab-initio MD. Estimating crudely that five simulations of this type lead to one publication, ab-initio MD currently requires around 1.875 Pflop/s of sustained computer power throughout the year. The aim of computational nanoscience is to engage more realistically with the stunning experimental advances of the field, which produce experimental details that require quantitative analysis.
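The sustained-power figure can be checked with simple arithmetic. In the sketch below, every input except the per-core rate is taken from the text; the ~3.2 Gflop/s sustained per processor is our assumption, chosen because it reproduces the quoted 1.875 Pflop/s.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

runs_per_year = 750 * 5        # 750 ab-initio MD papers/year, ~5 runs each
cores_per_run = 4096           # "4,096 processors"
run_seconds = 14 * 24 * 3600   # "about 2 weeks of simulation time"
flops_per_core = 3.2e9         # ASSUMED sustained rate per processor

core_seconds = runs_per_year * cores_per_run * run_seconds
avg_cores_busy = core_seconds / SECONDS_PER_YEAR   # ~5.9e5 cores year-round
sustained = avg_cores_busy * flops_per_core

print(f"{avg_cores_busy:.3g} cores busy year-round")
print(f"{sustained / 1e15:.2f} Pflop/s sustained")  # ~1.9 Pflop/s
```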
This means we have to deal with larger length scales and longer simulation times (estimated factor of increase in floating-point operations: 1,000), more configurations (different isomers, compositions, atomic distributions; estimated factor: 10) and greater precision in approximations to DFT, in particular for the excited state (estimated factor: 100). One then arrives at the rather shocking estimate that, even after the establishment of an exascale infrastructure, the available computer time will be insufficient and only the most excellent research proposals will be fundable at a European exascale facility.
Conventional DFT codes will not generally run on 10^5 or more processors, and this becomes even more difficult with the addition of orbital-based functionals to obtain higher predictability. The order-N methods developed over the past years are designed to scale on massively parallel computers and are expected to have the potential to scale up to exascale. Adapting or redesigning these codes for such architectures is not only time-intensive but a challenge in itself.
In summary, there is no doubt that the materials science, chemistry and nanoscience community in Europe requires a large allocation of computing time, exceeding 1 Eflop/s in aggregate. There is a large demand for Tier-0 capability computing for dynamical mean-field theory, (ab-initio) molecular dynamics, multiscale and device simulation.
Obviously, a European exascale environment must take into account that this field also requires capacity computing to search the immense phase space of opportunities. Therefore, a heterogeneous infrastructure best serves this field.
We re-emphasise that a critical requirement of this community is the optimal and efficient use of massively parallel supercomputers for this very broad range of complex problems in soft matter. The ‘know-how’ surrounding suitable parallelisation strategies needs to be strengthened and expanded.
4.4 Expected Status in 2020
Implementing existing codes or developing new codes for an exascale facility is an enormous challenge that will be addressed. An analysis of the number of papers published over time by the simulation- and computation-oriented communities in Europe reveals a linear or faster growth in the number of active scientists and published papers. Analysis of the increasing number of faculties in Europe in the fields of materials science, physics, applied physics, chemistry and engineering sciences clearly shows that the growth of computer power – a factor of 1,000 every 10 years – accelerates progress to the point that simulation becomes a driving factor for materials discovery and innovation. Indeed, it becomes the driving force in a design continuum from fundamental discovery through systems engineering and processing to technology. By 2020, full device simulations from first principles will become possible. Across Europe we will have more graduate schools of simulation sciences. In 2020 we can expect to reach the ‘simulation laboratory’ paradigm, in which the core developers of community codes are in contact with a European exascale facility and at the same time educate and train the community in codes for capability computing, enabling it to perform simulations using these codes. The innovation and design of new materials, processes, devices and technology will speed up dramatically. Labour-intensive experimental trial and error will become much more efficient thanks to computational pre-screening. Materials informatics will gain traction. Designing complex materials systems based on knowledge of structural, mechanical, chemical, optical, dielectric, electric and magnetic properties will significantly influence the integration of knowledge and communication, the progress of medical analysis capabilities, solutions to energy and environmental quests, and the way our society develops into the future.
5 LIFE SCIENCES AND MEDICINE
5.1 Summary
The life sciences community is very diverse, and there is an important imbalance between the large community of experimental biologists (who strongly depend on computational results) and the small community of computational biologists (who rely heavily on HPC resources). For this reason, the work of computational biologists has a ‘multiplicative’ effect on the life sciences. As an example, it has been determined that a discontinuation of access to biological databases would block most research pipelines in biomedicine within a couple of days. Despite the relatively small size of the ‘in-silico’ biology community, its impact on the life sciences is enormous. The primary goal of computational biology and bioinformatics is to understand the mechanisms of living systems. With recent experimental advances in this area (e.g. the next generation of DNA sequencing instruments), the data generated are becoming larger and more complex. In contrast to other communities, there are no universal computer packages, and software evolves very quickly to adapt to new instruments. The problems faced by scientists working in molecular simulation and in genomics are also very different, as are the computer algorithms used. Fast and flexible access to very large computational resources is crucial in many fields of the life sciences, and the lack of suitable computers can block entire projects, with important consequences for science and for society.
Opinions with respect to extreme computing (exascale computing) are unanimously favourable, but opinions about single-machine Eflop/s computing are less enthusiastic. While Eflop/s machines are a major requirement for specific problems (e.g. brain simulation), and higher computational power will enable significantly increased accuracy in current modelling studies, some highly important fields in the life sciences will be limited mainly by throughput and data management. A single-minded focus on achieving high flop rates in individual runs, rather than on application results, could therefore seriously damage European research in these areas.
Four main areas in the life sciences and health that require HPC are described below: genomics, systems biology, molecular simulation and biomedical simulation. These four fields are strongly related to the pharmaceutical and biotechnology sectors, but also to other economically important areas such as food (agriculture), environment (biotoxicity) and energy (biofuels).
5.2 Computational Grand Challenges and Expected Outcomes
5.2.1 Genomics
In genomics research we face problems involving the management of massive amounts of data (e.g. the sequencing of 2,500 genomes of cancer patients) in programs that can require hundreds of thousands of processors but little inter-processor communication. However, the vast amount of data to be managed (often combined with confidentiality and privacy requirements) hampers the use of cloud or grid-computing initiatives as a general solution. Suitable and flexible access to computer resources is crucial in this area. The currently known cornerstones of an exascale system (number of compute nodes, I/O and memory capacities) are clearly driving the race for Eflop/s peak performance. For most genomics challenges, such an Eflop/s computer could be even less ‘balanced’ than today’s HPC systems, and this would constitute a substantial barrier to using it efficiently.
The fast evolution of genomics is fuelling the future of personalised medicine (see Figure 5.1).
Genetic variability affects how drugs interact with each patient, sometimes positively (increasing the healing effect), sometimes negatively (increasing toxic side effects), or simply by reducing drug response. Personalised medicine is a concept that will replace the outdated idea that a single drug is the solution for an entire population. It will develop specific solutions for segments of the population characterised by given genetic factors, or even for individuals. Thanks to recent advances in high-throughput genome sequencing, we can already obtain the full genomic profile of a patient in a single day, and the throughput of next-generation sequencing techniques is increasing much faster than Moore’s law. Currently, sequencing centres require multi-PByte systems to store patient data, and data processing is carried out on supercomputers in the 100 Tflop/s to 1 Pflop/s range. Requirements are expected to increase dramatically as sequencing projects are extended to entire populations, making linkage studies possible.
Figure 5.1. Pharmacogenomics aims to identify patients at risk for toxicity or reduced response to therapy prior to medication selection. (S. Marsh & H. L. McLeod, Hum. Mol. Genet. 15, R89–R93 (2006).)
5.2.2 Systems Biology
Some diseases cannot be understood at the gene level (genomics) but only in a more complex, pathway context. Drug effects are similarly studied at the systems biology level. Disease-associated networks containing several proteins have been reported as possible causes of disease.
Furthermore, perturbation of biological networks is a major underlying cause of adverse drug effects. Detailed knowledge of the structure and dynamics of biological networks will undoubtedly uncover new pharmacological targets. Intense research is being carried out today to develop models for identifying protein network pathways that will help to understand the undesired effects of drugs and to explore how they are related to network connectivity (see Figure 5.2). The use of complex network medicine is expected to have a dramatic impact on therapy in several areas: the discovery of alternative targets; reducing the toxicity risks associated with drugs; opening new therapeutic strategies based on the use of ‘dirty’ drugs targeting different proteins; and helping to discover new uses for existing drugs. Systems biology is now at the stage of collecting data to build models for complex simulations that will, in the near future, describe the presently unknown dynamics of cells and organs. The models developed today are stored in databases. Progress is rapid, and systems biology will allow us to couple simulations of the models with a biomedical problem (e.g. monitoring mutations in a specific genome that can change the activity of a protein). This will require large computational resources, and systems biology will benefit from Eflop/s capabilities, but aspects related to data management are going to be as important as pure processing capability.
Figure 5.2. Development of models that can be used for drug re-profiling and to simulate in-silico drug toxicity. (Patrick Aloy et al., IRBB.)
5.2.3 Molecular Simulation
Eflop/s capabilities will allow the use of more accurate formalisms (more accurate energy calculations, for example) and enable molecular simulation for high-throughput applications (e.g. the study of larger numbers of systems). Unfortunately, if Eflop/s capabilities are achieved simply by aggregating a vast number of slow processors, this will not favour studies of longer timescales, since it will not be possible to scale individual runs up to hundreds of thousands of cores (as the simulated systems typically have fewer than 1 million atoms).
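The limitation can be seen from a standard surface-to-volume argument for spatial domain decomposition (our illustration, not taken from the report). If each of \(N_{\mathrm{cores}}\) cores holds \(n = N_{\mathrm{atoms}}/N_{\mathrm{cores}}\) atoms, local force work scales with \(n\) while halo exchange scales with the domain surface, roughly \(n^{2/3}\), so

\[
\frac{t_{\mathrm{comm}}}{t_{\mathrm{comp}}}\ \propto\ \frac{n^{2/3}}{n}=n^{-1/3}.
\]

With \(10^{6}\) atoms on \(10^{5}\) cores, \(n = 10\) and the ratio is of order one: communication, not computation, dominates, and adding further slow cores no longer shortens the time per MD step.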
The needs of the molecular simulation field will probably be better served by a heterogeneous machine, with hierarchical capabilities in terms of the number of cores, the amount of memory, memory access bandwidth and inter-core communication. This should be contrasted with current ideas regarding a ‘flat’ machine with peak Eflop/s power. Exascale capability will, however, facilitate biased-sampling techniques, which require parallel computing, enabling in-silico experiments unreachable today. Examples include the proteome-scale screening of chemical libraries to find new drugs and the study of entire organelles, or even cells, at the molecular level. In most of these cases, parallelisation is expected to be hierarchical (e.g. ensemble simulations, multiscale modelling, or a mix of parallelisation and high throughput). Molecular simulation is a key tool for computer-aided drug design. The lack of high-performance computers appropriate for this research will displace R&D activities to the USA, China or Japan, putting European leadership in this field at risk. Computational simulation of biomolecules provides a unique tool to link our knowledge of the fundamental principles of chemistry and physics with the behaviour of biological systems (see Figure 5.3).
Figure 5.3. Multiscale molecular simulation in life sciences.
Appropriate exascale resources could revolutionise this area, allowing molecular simulators to decipher the atomistic clues to the functioning of living organisms. Certain grand challenge problems in this area fit well with the conventional, general-purpose exascale development roadmap. However, other vital problems in the field will be addressable only through the development of novel architectures, not by huge machines with very large theoretical peak power but limited efficiency for the applications of interest. Such development is already at an advanced stage in the USA and Japan, and there is an extreme danger that Europe will be left behind.
5.2.4 Biomedical Simulation
We envision projects such as the simulation of the brain (see Figure 5.4), organ and tissue modelling, and in-silico toxicity prediction.
In these areas, Eflop/s capabilities will be a necessary, but not sufficient, requirement, since the integration of experimental information, human interaction with calculations and the refinement of the underlying physical models will also be instrumental for success. As in the case of molecular simulation, multiscale modelling is one of the major challenges of this area and represents one of the major cross-cutting issues of exascale systems for the life sciences. The extensive use of simulation will allow significant improvements in the quality and quantity of research in this area. Simulation will help to integrate knowledge and data on the body, tissues, cells, organelles and biomacromolecules into a common framework that will facilitate the simulation of the impact of factors that perturb the basal situation (drugs, pathology, etc.).
Figure 5.4. Human brain simulation timeline. (Felix Schürmann & Henry Markram, Blue Brain Project, EPFL.)
Biomedical simulation will reduce costs, time to market and animal experimentation. In the medium to long term, simulation will have a major impact on public health, providing insights into the causes of diseases and allowing the development of new diagnostic tools and treatments. In parallel, simulations will have a major impact on information technology. It is expected that understanding the basic mechanisms of cognition, memory, perception, etc., will allow the development of completely new forms of energy-efficient computation and robotics. The potential long-term social and economic impact is immense.
5.3 A Roadmap for Computational Requirements
The priorities set out by the experts include new techniques for (i) data management and large storage systems (increased shared memory capacity), (ii) interactive supercomputing, (iii) data analysis and visualisation, (iv) multi-level simulation, and (v) training. As life sciences and health is such a heterogeneous field, it will be necessary to develop several application-oriented initiatives in parallel, although they can share similar agendas. A flexible protocol for access to Tier-0 resources will be as important as absolute computer power for this community. The following specific points were highlighted by the panel.
Competence Centre. The life sciences panel is eager to apply the model of the USA co-design centres focused on exascale physics applications, such as the Centre for Exascale Simulation of Advanced Reactors (CESAR), the Co-Design for Exascale Research in Fusion (CERF), the Flash High-Energy Density Physics Co-Design Centre and the Combustion Exascale Co-Design Centre (CECDC). A centre with academic and industrial participation focused on life sciences and health will be instrumental in facilitating the efficient use of PRACE Tier-0 resources in areas such as tissue and organ simulation, molecular dynamics, cell simulation, genome sequencing and personalised medicine. Considering the complex nature of the bio-computational field, only a powerful competence centre will guarantee compatibility between the research needs of the area and the new generation of exascale computers.
Capability and Capacity. The panel fails to recognise the real relevance of the debate between ‘capability’ and ‘capacity’. Some bio-problems will require access to multi-Pflop/s computers, others will benefit greatly from special-purpose or hybrid machines, while others will necessarily require exascale resources. Doubts exist, however, as to whether a massive computer created by aggregating a vast number of slow cores would be beneficial for most of the life science community. We believe that obtaining a large peak performance at the expense of a loss of balance would be an error. It should be stressed that bio-problems will require not only Eflop/s calculation power but also exadata management capabilities.
Software for Extreme Computing. While most of the software in use today could be used at the exascale, most of the software that should be used has not yet been developed. On the other hand, there are software packages available today whose ‘functionality’ (but not necessarily the code itself) needs to be ported to exascale platforms. These applications will not run efficiently on exascale computers without enormous efforts in method development. Concerns exist in the panel that most current algorithms cannot scale up to 10^5 or 10^6 slow processors; rather, there is a need to completely reconsider which parallelisation approaches should be used. A brief analysis of the software universe in the field is given below.
Quantum Chemistry. The current capability of first-principles quantum chemistry is used to study neurotransmitters, helical peptides and DNA complexes. Quantum chemistry calculations are precise but expensive. Exascale should make feasible calculations that are unthinkable today. Important applications in this field used for bio-simulations include Dalton,45 GAMESS,46 Gaussian47 and CPMD.48
Chemical Informatics. It is becoming unfeasible to fully explore and predict the 1D, 2D and 3D chemical properties of small molecules in databases of tens of millions of compounds. Drug discovery based on small molecules will need to deal with the increasing size of these databases (up to 1 billion entries today). Several types of open-source and proprietary software need to be made ready for exascale systems.
Stochastic Models and Biostatistics. Stochastic methods will be applied to model complex biological systems, to simulate large coarse-grained systems, to sample conformational space for molecular docking, and to predict the secondary structure of RNA. Personalised medicine is based on so-called Single Nucleotide Polymorphism (SNP) association studies to identify mutations as bio-markers for genes that predispose individuals to diseases. Existing multi-SNP methods are only capable of handling 10 to 100 SNPs, a very small fraction of the variants of interest; exascale systems should provide methods that can handle much higher dimensionality.
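The dimensionality gap is easy to quantify. Assuming a genome-wide panel of one million SNPs (a typical figure, assumed here for illustration), the number of interaction tests explodes combinatorially:

```python
from math import comb

n_snps = 1_000_000  # assumed size of a genome-wide SNP panel
print(f"pairwise tests:  {comb(n_snps, 2):.3e}")   # ~5.0e11
print(f"three-way tests: {comb(n_snps, 3):.3e}")   # ~1.7e17
```

Methods that handle only 10–100 SNPs therefore probe a vanishing fraction of even the pairwise interaction space, which is the sense in which exascale-ready, higher-dimensional methods are needed.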
Sequence Analysis. With the increased amount of data generated in laboratories around the world, basic protein and DNA sequence-based calculations are becoming a significant bottleneck in research. For example, in phylogeny (reconstruction of ancient proteins), present-day Bayesian approaches cannot be applied to more than 200 sequences (200 base pairs long), and new methods will increase the complexity. Vital applications in this area of research include BWA,49 BLAST/BLASTMPI,50 CLUSTALW,51 HMMER52 and MrBayes.53
45 http://dirac.chem.sdu.dk/daltonprogram.org/
46 http://www.msg.ameslab.gov/gamess/
47 http://www.gaussian.com/
48 http://www.cpmd.org/
49 BWA, Burrows-Wheeler Aligner, http://bio-bwa.sourceforge.net/
50 blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download
Molecular Modelling. Molecular modelling is a key discipline for rational drug design. Computational tools in this area allow scientists to model pharmaceutical target structures and calculate protein–drug docking energy. Molecular modelling represents one of the main exascale challenges. Vital applications include Gromacs,54 AMBER,55 NAMD,56 Autodock,57 Glide,58 Dock,59 Flexx,60 FTDock,61 LigandFit62 and ROSETTA.63
Network Medicine. In recent years it has become apparent that many common disorders, such as cancer, cardiovascular diseases and mental diseases, are often caused by multiple molecular abnormalities. As mathematical systems theory shows, the scale and complexity of the solution should match those of the problem. Network medicine has multiple potential biological and clinical applications. For example, understanding the effects of cellular interconnectedness may offer better targets for drug development, more accurate biomarkers to monitor diseases and better disease classification. Exascale computing will be the necessary infrastructure for moving from a static to a dynamic understanding of biological pathways and protein interaction networks (as a reference, the human interactome connects 25,000 protein-coding genes and ~1,000 metabolites). In the panel’s opinion, the software to be used in this area on exascale computers still needs to be developed.
Cell Simulation. It is estimated that eukaryotic cells contain about 10,000 different proteins (with close to 1,000,000 copies of some of them). Whole-cell and sub-cellular simulations (e.g. of membranes) will require huge computational resources and efficient coupled multiscale simulation applications. In the panel’s opinion, the software to be used in this area on exascale computers still needs to be developed.
Tissue Modelling. As described in previous sections, tissue simulations (e.g. of the heart, the respiratory system and the brain) are going to be key to replacing animal experiments in drug testing. Future medicine will be based on virtual patient models, and this should increase both the safety and the efficacy of drugs. Again, it is clear that the software to be used in this field on exascale computers still needs to be developed.
Finally, the panel identified several additional technical aspects that will require special attention from computer scientists. These include: (i) software quality control, (ii) development tools, (iii) software optimisation, (iv) hardware optimisation, and (v) exabyte data management. To implement the life sciences and health exascale computing applications, the experts propose a timeline to build an exascale centre for life sciences. The centre will require the combined expertise of vendors, hardware architects, system software developers, life science researchers and computer scientists working together to make informed decisions about features in the design of the hardware, software and underlying algorithms.
51 http://www.ebi.ac.uk/Tools/msa/clustalw2/
52 http://hmmer.janelia.org/
53 mrbayes.sourceforge.net/
54 www.gromacs.org/
55 http://ambermd.org
56 www.ks.uiuc.edu/Research/namd/
57 http://autodock.scripps.edu/
58 www.schrodinger.com/
59 dock.compbio.ucsf.edu/
60 www.biosolveit.de/FlexX/
61 www.sbg.bio.ic.ac.uk/docking/ftdock.html
62 www.accelrys.com
63 http://www.rosettacommons.org/
5.4 Expected Status in 2020
5.4.1 Genomics
Advances in the technologies for data generation, which both increase the output and decrease the cost, will mean that, over the next decade, the quantity of data being produced will increase by at least a thousandfold and perhaps by as much as a millionfold. There are three key aspects that HPC centres will have to deal with: (i) data storage, (ii) data transport, and (iii) data confidentiality. On the other hand, while the most popular genomics software is regularly reviewed and optimised for new systems, a large part of the available genomics libraries were started in the 1990s and use inefficient scripting or high-level languages (e.g. Perl, Java or Python packages). These codes still perform well for current data loads, but may not be ready for the data challenges of the next decade. To avoid a simplistic view of the problem, it is important to stress that bioinformatics software has been developed under strong time pressure, given the rapid changes in technology, and that several codes are not open-source and can only be optimised by the code owners. Furthermore, they evolve very rapidly, generating serious problems for program optimisation following the standard working procedures of the computer science community. Given the large amount of code available, an interesting alternative could be the development of more efficient compilers for scripting and high-level languages.
5.4.2 Systems Biology
One of the main challenges here is the reverse engineering of the operation of biological networks in normal cells (of all types) and the identification of the intercellular communication networks responsible for the functioning of multicellular organisms. This is a first step towards a full understanding of the impact that external perturbations can have on biological systems and, in turn, towards explaining complex human diseases. Current applications dealing with the ‘omics’ (proteomics, metabolomics, etc.) generally require large system memory rather than intense CPU usage. Extensive information retrieval and database operations constitute the layer underlying systems biology. Problems related to data handling, data integrity and confidentiality are all important in this area. Model reconstruction and engineering will require the integration of different levels of granularity, from coarse-grained models to detailed ones. Each specific application will have its own requirements, ranging from easily parallelisable code to highly integrated algorithms. Considerations related to temporal modelling and the simulation of fluctuations will add further levels of complexity. A central repository of data with distributed hubs across Europe will be a major requirement of systems biology. The participation of major European bioinformatics initiatives (such as ELIXIR) in the definition of exascale requirements is judged very important by this panel.
5.4.3 Molecular Simulation
Examples of grand challenges we will face in the future are: (i) simulations of biological systems that are thousands of times larger than those possible today (e.g. realistic cell membrane models, including drug permeation and binding); (ii) simulations that are thousands of times more computationally complex than those possible today (e.g. quantum simulations of biomolecules); and (iii) simulations that cover timescales thousands of times longer than those possible today. However, in the long term, the real challenge in the field will be multiscale simulation. Structural genomics initiatives are beginning to encompass many of the important organisms, while proteomics initiatives increase our knowledge of the structural space of drug targets. Massive sequencing projects, transcriptomics and functional genomics are deciphering the molecular mechanisms of cellular action, and a variety of spectroscopic techniques are providing a picture of how tissues and organisms work (see Figure 5.4). Multiscale simulation will integrate multiple simulation layers at different scales to reach a unified vision of living systems (from atoms to tissues). There is an obvious complexity in merging techniques that will not necessarily have the same hardware requirements (memory, disk space, processor, etc.). Exascale systems will need to integrate this multiscale scenario while providing a simple user interface.
5.4.4 Biomedical Simulation
The simulation of complete organs is a frontier of biocomputation. These simulations are characterised by: (i) a very large, highly heterogeneous state space; (ii) multi-level modelling at the molecular, sub-cellular, cellular, tissue and organ levels; (iii) multiple timescales (from picoseconds to years); and (iv) structural plasticity. Handling very large volumes of state data will require new techniques for: (i) data management; (ii) collaborative interactive visualisation; (iii) computational steering of simulations; (iv) real-time monitoring of performance; (v) run-time switching and load balancing between models at different levels of abstraction; and (vi) coding of parallel tasks and processes. Bandwidth and memory capabilities are growing more slowly than flop rates, and they are constrained by energy consumption. It is currently expected that early Eflop/s machines will provide no more than 0.1 EByte of memory, which may be insufficient for flagship simulations (e.g. whole-brain simulation). Tissue simulation will benefit from heterogeneous CPUs, i.e. CPUs that combine complex cores (useful for subcellular simulation) with larger numbers of smaller cores (ideal for the cellular level). In current supercomputing, some compute-intensive processes (e.g. visualisation, data analysis) are run offline on specialised machines, a situation that will be impractical in exascale environments. It is therefore important that these processes be executed in situ. More generally, reducing data flow will require new approaches to I/O that avoid large movements of data from system memory to disk.
6 ENGINEERING SCIENCES AND INDUSTRIAL APPLICATIONS
6.1 Introduction64
The engineering sciences represent a major source of technological innovation within the European Community and contribute substantially to its economic success. Applications in industry are underpinned by research in the field of engineering science. Consequently, industries of all kinds face opportunities and challenges driven by the application of high-performance computing (HPC). The efficient use and successful exploitation of modern HPC will therefore play a significant role in delivering increased understanding of realistic engineering problems through high-fidelity modelling, simulation and optimisation. However, although European engineering companies have achieved remarkable success, the computational community remains fragmented.
The topics covered by computational engineering are extremely diverse and include, for example, aeronautical engineering, automotive engineering, civil engineering, oil and gas exploration (seismic) and engineering (multi-fluid flows, etc.), chemical processing, nuclear engineering, and biological and medical engineering. This includes research to understand and predict turbulent fluid flow, multiphase flow, turbulent combustion, fluid–structure and fluid–material interactions and structural failure, and the integration of these tools into robust optimisation schemes for product design. Other industrial applications, for example in the fields of chemistry and biology, are covered elsewhere in this report. Many of these fields have interlinked challenges, such as energy: we need to develop a better understanding of renewable energy sources. Simulation is especially critical for analysing the socio-economic impact and any potential environmental consequences.
Examples of the cost and efficiency savings made possible by the use of PRACE large-scale resources in industry in the period 2012–2020 include, among many others:
• Improved efficiency of energy conversion in gas turbines
• Reduction in the number and cost of wind tunnel or crash tests associated with aircraft and automotive design
• Development of new energy-efficient designs of cars and traffic systems
• Reduction in the number of unsuccessful wells drilled by use of more accurate seismic analysis
• Reduction of environmental pollution and noise
The engineering community is not as organised as other scientific communities. In contrast to the scientific disciplines, there are few ‘community’ codes, and institutions make use of both in-house and commercial software. Advances in engineering science that underpin further industrial development are often achieved through collaborations between industry, research institutes and universities.
The broad objective of the PRACE engineering working group is to identify challenges and bottlenecks and to develop high-fidelity software for informing critical design and operational decisions. This will require an understanding of hardware and software HPC trends, such as Intel’s emerging multi-core technology or the use of GP-GPU architectures. The panel has identified a number of common issues in exploiting current petascale resources. While many codes are known to scale well to hundreds of thousands of cores, subsequent analysis requires the associated data storage and network bandwidths to scale at approximately the same rate. There are concerns that tools for post-processing, remote visualisation and valorisation are not suitable for the largest data sets that will occur.
64 Uli Rüde, Neil Sandham, Dave Emerson
For many engineering applications, there are generally three distinct stages:
1. Pre-processing (creating the computational mesh)
2. Solution (discretising the equations and implementing the numerical algorithm)
3. Post-processing (analysing and displaying numerical results)
A lot of time and effort has been invested in developing efficient numerical solver algorithms. What now has to be considered is the challenge of getting these developments to scale up to many thousands of processors, something that has received only limited attention.
The pre-processing stage often requires the creation of a good-quality computational mesh, which is crucial to the success of any grid-based numerical algorithm. However, none of the available tools is capable of generating meshes of the size necessary to exploit hardware using 100,000 cores and beyond. This challenge can be tackled in two main ways: (i) the first is to investigate parallel grid generation; (ii) the second consists of adaptive resolution techniques, including mesh and/or order adaptation. In both cases, load balancing becomes an important issue.
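As a minimal illustration of the load-balancing problem (a sketch under simplifying assumptions, not a production partitioner), the function below splits a 1D sequence of per-cell work estimates into contiguous chunks of roughly equal total weight; real adaptive-mesh codes use graph partitioners or space-filling curves for the same purpose in 3D.

```python
def partition(weights, n_parts):
    """Greedy 1D partition: split per-cell work estimates into
    contiguous chunks of roughly equal total weight."""
    target = sum(weights) / n_parts
    parts, current, acc = [], [], 0.0
    for w in weights:
        current.append(w)
        acc += w
        if acc >= target and len(parts) < n_parts - 1:
            parts.append(current)
            current, acc = [], 0.0
    parts.append(current)
    return parts

# Example: cells refined near a shock carry ~8x the work of coarse cells.
cells = [1.0] * 40 + [8.0] * 10 + [1.0] * 40
for rank, chunk in enumerate(partition(cells, 4)):
    print(f"rank {rank}: {len(chunk)} cells, load {sum(chunk):.0f}")
```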
The use of automated optimisation chains for large-scale problems poses additional practical issues, concerning both job scheduling and the external connectivity of the HPC facility. An optimisation chain is driven by a central optimiser that autonomously launches a large number of computations with a certain need for synchronisation. Moreover, such optimisation chains are based on interaction with a parameterised CAD description of the geometry, manipulated with industry-standard tools that are simply not available on HPC architectures. This means that either these tools need porting, or a connection to a standard (mainly Windows-based) computer is required.
Data analysis of results obtained from a current Pflop/s or an upcoming Eflop/s computer presents some formidable challenges. Like pre-processing, it may not have received the same attention, but it is clearly going to play an important role in interpreting the data produced. Visualisation (perhaps remote) will be key to understanding the large amounts of data generated, and more research is needed to develop intelligent feature-extraction algorithms; again, currently available tools are not suitable for the challenges of exascale. More generally, future massive multiscale and multi-physics simulations will generate a deluge of data (raw data and its associated metadata), and there is a need to redevelop (or re-invent) the post-processing toolchain in order to facilitate data mining on huge and heterogeneous data sets. Convergence with the ‘big data’ methodologies already used in web data mining is to be expected.
A further challenge facing engineering is code coupling. In multi-physics applications, where we need to couple continuum-based software for structural mechanics, acoustics, fluid dynamics and thermal heat transfer, this is required in a horizontal fashion. For large numbers of cores, with a complex memory and accelerator hierarchy, much work needs to be done. In addition, there is growing interest in coupling codes in the vertical direction (multiscale models), i.e. from continuum to mesoscale to molecular dynamics to quantum chemistry. This requires bridging length and time scales that span many orders of magnitude.
Further developments with potentially high impact on computational engineering include the use of HPC systems for interactive computational steering, which requires interactive behaviour and correspondingly fast response times from the simulation. Beyond this lie real-time and embedded simulations and immersive virtual-reality techniques. For example, the control systems for a large-scale power plant can be designed and developed before the plant itself is operational by using a real-time HPC simulator. Similarly, the real-time simulator can be used to train plant operators for dangerous operating modes and emergencies. As another example, real-time HPC simulators are being developed into new types of diagnostic tools in medical engineering: blood flow simulators, for instance, can be used for operation and therapy planning.
In common with many of today’s scientific disciplines, the majority of the numerical algorithms used to solve these problems have been successfully parallelised using MPI. However, the new generation of heterogeneous many-core systems presents formidable challenges to engineering software that has been developed and validated over many years. Here the software engineering methodology for high-performance scientific and engineering codes is critically underdeveloped, and standards are necessary in order to secure future software developments.
Community activities are also needed to routinely equip engineering students with the knowledge and confidence to apply HPC in their industrial careers, together with outreach activity to spread the use of HPC from the core areas into new areas such as biomedical engineering, where there may be significant potential for HPC-inspired research and development.
6.2 Computational Grand Challenges & Expected Outcomes in Engineering
6.2.1 Turbulence65
Turbulence is one of the most important unsolved problems in classical mechanics. Virtually all flows faster than a few metres per second, or larger than a few centimetres, are turbulent, including most cases of interest in industry and practically all atmospheric, oceanic and astrophysical flows. Even the flow in the largest human arteries can become turbulent. When doctors detect a ‘heart murmur’, they are listening to the noise of turbulence. From the engineering point of view, turbulence can be favourable or deleterious. Turbulent mixing allows combustion to proceed efficiently in power plants and aircraft engines, but turbulent drag is responsible for much of the energy spent in transportation. Most of the pressure drop in large water mains or in oil and gas pipelines is turbulent dissipation, and roughly half of the drag of aircraft is turbulent skin friction. About 10% of the energy use in the world is spent overcoming turbulent friction.
Turbulence has long been a theme of engineering research, pursued mostly through theory and experiments, but it has made large strides in recent years because of the influence of supercomputing. Simulations are basically experiments by different means, but they offer at least two key advantages: they provide better control of the experimental conditions, including some that cannot be created otherwise, and they result in essentially complete databases. On the other hand, they are expensive. Turbulence is characterised by many degrees of freedom, measured by the Reynolds number, which imply large computational grids. Present research simulations routinely have Reynolds numbers of a few thousand, involve 10¹⁰ grid points, and run over hundreds of millions of CPU hours on O(10⁵) processors. Turbulence was explicitly mentioned in the first PRACE Scientific Case as one of the necessary underpinnings of engineering research, and it continues to be so today. Likely future trends were predicted to be the simulation of more complex and realistic flows, and the increase in the Reynolds numbers of canonical ones. Both have taken place. Direct Numerical Simulations (DNS, using no models), which centred on simple turbulent channels five years ago, have turned to jets and boundary layers, which are much closer to real-life applications, and the trend towards ‘useful’ flows is likely to continue. The Reynolds numbers have increased by a factor of roughly five, implying a work increase of three orders of magnitude. It is interesting that this increase has taken place with relatively little degradation of computational efficiency, and that many landmark simulations have been performed by European researchers. Even if the European turbulence community has traditionally been strong, its current prominence in computation was far from assured five years ago.
65 Javier Jiménez, Philipp Schlatter, Roel W. C. P. Verstappen
Another development has been the improvement of large-eddy simulation (LES) models, which offer an intermediate level of detail between full modelling and direct simulation of turbulence. They hold the best promise for practical turbulence simulations in the future, although boundary conditions continue to be a problem. Many of the theoretical developments have also taken place in Europe, and owe a lot to the new higher-Reynolds-number numerical data sets against which they can be tested.
On the more applied side, the most impressive results have probably originated from the US, such as the computation of an entire jet engine at Stanford. That simulation used a combination of modelling, LES and DNS, and centred on interfacing the various simulation levels rather than on physical accuracy. There have been few comparable European programmes, which are nevertheless important if HPC is to fulfil its promise in the application of turbulence research to the real world, particularly in the extension of LES to real industrial cases.
On the other hand, even the favourable situation just described can only be considered an intermediate stage in turbulence research. There is a tentative consensus that a ‘breakthrough’ boundary layer free of viscous effects requires Reynolds numbers of the order of Reτ = 10,000, which are lower than in many industrial applications but five times higher than in present simulations. That implies computer times 1,000 times longer than at present (scaling as Re⁴) and storage capacities 150 times larger (scaling as Re³). Keeping wall times constant implies increasing processor counts from the present O(32 Kproc) to O(32 Mproc), which will require rewriting present codes but is probably not insurmountable: turbulent simulations have scaled correctly for 25 years, from single processors to O(10⁵). Storage might be a tougher problem. Turbulence research requires storing and sharing large data sets, presently O(100 TBytes) per case and becoming O(20 PBytes) within the next 5–10 years. Archiving, transmitting and post-processing those data will require work, but the rewards in the form of more accurate models, increased physical understanding and better design strategies will grow apace.
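The quoted factors follow directly from these classical cost scalings; the short Python sketch below is added purely as an illustrative back-of-envelope check and reproduces them.

```python
# Back-of-envelope check of the scaling quoted above, assuming
# DNS cost ~ Re^4 (time) and ~ Re^3 (storage) for wall-bounded flows.
re_factor = 5                        # Re_tau: present simulations -> 10,000

time_factor = re_factor ** 4         # 625, i.e. roughly the quoted 1,000x
storage_factor = re_factor ** 3      # 125, i.e. roughly the quoted 150x

cores_now = 32_000                   # present O(32 Kproc)
cores_const_walltime = cores_now * time_factor
print(time_factor, storage_factor)                                   # 625 125
print(f"cores for constant wall time: {cores_const_walltime:.1e}")   # 2.0e+07
```

The result, some 2×10⁷ cores, is indeed of the same order as the O(32 Mproc) quoted above.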
6.2.2 Combustion66

Combustion has a strong impact on the environment (greenhouse gases, pollutant emissions, noise), but it represents more than 80% of energy conversion worldwide, is essential for ground and air transportation, electricity production and industrial processes, and is involved in safety (fires and explosions). The central position of combustion in our world will not diminish in the near future. Science, and especially numerical simulation, is mandatory to promote its highest-efficiency use with the lowest impact on climate. The objective of combustion studies is to better understand and model the physical phenomena in order to optimise, for example, gas turbines (aero-engines or power generation), internal combustion engines or industrial processes in terms of costs, stability, higher efficiency, reduced fuel consumption, near-zero pollutant emissions and low noise, or to help in fire prevention and fighting. Computational Fluid Dynamics (CFD) offers design engineers the unique opportunity to develop new technical concepts, reducing development costs by avoiding extensive and very expensive experimental campaigns. From an economic point of view, industrial companies involved in propulsion and energy systems are among the biggest employers in the European Union; giving them more efficient and cost-effective system designs is crucial support for their competitiveness on the worldwide market.
Scientific challenges in combustion are numerous. First, a large range of physical scales must be considered, from fast chemical reaction characteristics (reaction zone thicknesses of tenths of millimetres, 10⁻⁶ s) and pressure wave propagation (sound speed) up to burner scales (tens of centimetres, 10⁻² s residence times) or system scales (metres for gas turbines, kilometres for forest fires). Turbulent flows are, by nature, strongly unsteady. Chemistry and pollutant emissions involve hundreds of chemical species and thousands of chemical reactions, and cannot be handled in numerical simulations without adapted models. Usual fuels are liquid, storing a large amount of
energy in small volumes (about 50 MJ/kg). Accordingly, two-phase flows must be taken into account (fuel pulverisation, spray evolution, vaporisation, mixing and combustion). Solid particles, such as soot, may also be encountered. Interactions between flow hydrodynamics, acoustics and combustion may induce strong combustion instabilities (gas turbines, furnaces) or cycle-to-cycle variations (piston engines), decreasing burner performance and, in extreme cases, leading to destruction of the system in a short time. Control devices, based on either passive (geometry changes, Helmholtz resonators) or active (actuators) techniques, may help to avoid these instabilities. The design of cooling systems requires knowledge of the heat transfer to walls due to conduction, convection and radiation, as well as of flame/material interactions.

66 Denis Veynante and Stewart Cant
Fire simulations are today probably less mature than gas turbine or internal combustion engine computations, but predictions in terms of safety, prevention and firefighting are challenging. Forest fires regularly and severely affect southern European countries and, because of climate change, may concern northern regions in the future. Their social impact is very important (land, buildings, human and animal life, agriculture, tourism, the economy). Forest fires involve a very large range of spatial and temporal scales. Chemical mechanisms are especially complex (wood pyrolysis depends on the nature and moisture of the wood and involves numerous chemical species). Forest fires are strongly controlled by long-distance radiative heat transfer, generally neglected in ordinary combustion computations. Buoyancy effects (large-scale flames) as well as interactions with the local meteorology (winds, moisture) and the local topography (hills, valleys, etc.) must also be taken into account, and need adapted models since these features are not relevant in burners. The simulation of firefighting, for example by dropping fluids with or without retardant, is also a challenging research area of crucial importance.
A related area concerns accidental explosions in industrial process plants caused by leaked clouds of flammable gas or vapour. Simulation technology in this area is widely used for safety-case assessment, but its accuracy suffers from the very large range of scales that must be represented. The chemistry need not be represented in full detail, but the coupling of flow, turbulence and combustion is strong and complex. Such explosions can have devastating consequences, and reliable suppression methods are required.
High-end high-performance computing systems give the opportunity for aggressive research that will allow the use of combustion with the highest efficiency and the lowest impact on climate. Combustion simulations will combine three methodologies:
• Direct numerical simulations (DNS) are very high-fidelity computations without modelling of turbulence (all the relevant flow scales are explicitly computed). Because of their computational cost, they are limited to small cubic domains and low turbulence Reynolds numbers, but they are the best workhorse today to reveal the internal structure of turbulent flames; to understand propagation, extinction, ignition, pollutant formation and new combustion regimes (homogeneous, or ‘flameless’, combustion); and to devise combustion models. Exascale machines will give access to configurations and operating conditions close to realistic laboratory turbulent burners.67

67 DNS is the topic of the Combustion Co-design Center initiated in 2011 by J. Chen and J. Bell at Sandia National Laboratories (USA). Multiple groups in Europe have the capacity to develop such competitive research programmes.
• Large eddy simulations (LES), where the largest flow motions are explicitly computed while only the effects of the small ones are modelled, are more relevant for computing and analysing unsteady flows in larger domains of realistic shape under practical operating conditions, as encountered in gas turbine chambers, piston engines and industrial furnaces. LES has revolutionised the field of numerical combustion in the last 20 years by bringing almost DNS-like capabilities to actual industrial systems. Examples include the ignition of a helicopter combustion chamber, unstable modes of an industrial gas turbine, and cycle-to-cycle piston engine variations – all have been simulated on national (Tier-1) or European (Tier-0 of PRACE) machines.68 Today, industry relies on and invests in LES to compute multiple phenomena that are beyond the capacities of the classical RANS codes available in companies (see below). European groups are leaders in this field, and their LES combustion codes are recognised as the most advanced.
• Reynolds-Averaged Navier–Stokes (RANS) remains the standard approach used within the energy industry. It allows for the inclusion of complex industrial geometries together with complex physics at a level of computational cost that can be tolerated within the engineering design cycle. Larger, more advanced and more frequent RANS calculations are being carried out in order to explore the design space for novel clean combustion systems. A strong coupling exists between DNS, LES and RANS, whereby data and modelling insight move from DNS through LES to RANS, while questions and new challenges move the other way.
Computing only the reacting flow within the combustion chamber is not sufficient: multi-physics/chemical phenomena must be coupled in the simulations. For example, in a gas turbine, simultaneous simulations of the combustion chamber, the compressor (feeding the chamber) and the turbine (fed by the chamber) are needed. Flame/wall interactions must be taken into account in terms of heat transfer, flow/structure and flame/material interactions in order to design cooling systems and control system lifetime. The noise emitted by the combustor, as well as its perception at long distances, must also be computed. These various phenomena are generally described by different codes69 that must run together on a massively parallel machine and exchange data. They lead to new challenges in terms of load balancing and simulation control, but also in terms of the coupling of physical phenomena and model compatibility.
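The point made in footnote 69, that differently structured codes must nevertheless run side by side and exchange data, can be made concrete with a small MPI sketch. This is a hedged illustration only: the group split, the exchanged quantity and both solver stubs are invented for the example (written with mpi4py, assuming at least two ranks), and are not taken from any production coupling framework.

```python
# Sketch: two different solvers sharing one MPI job and exchanging data.
from mpi4py import MPI

def run_flow_step(comm):
    # stand-in for a combustion/flow solver step on communicator `comm`
    return 42.0  # pretend wall heat flux

def run_radiation_step(comm, heat_flux):
    # stand-in for a radiation solver step using the received flux
    if comm.Get_rank() == 0:
        print(f"radiation group received heat flux {heat_flux}")

world = MPI.COMM_WORLD
rank, size = world.Get_rank(), world.Get_size()
color = 0 if rank < size // 2 else 1      # 0: flow group, 1: radiation group
local = world.Split(color, rank)          # one private communicator per code

if color == 0:
    flux = run_flow_step(local)
    if local.Get_rank() == 0:             # group leaders exchange data
        world.send(flux, dest=size // 2, tag=1)
else:
    flux = world.recv(source=0, tag=1) if local.Get_rank() == 0 else None
    flux = local.bcast(flux, root=0)      # distribute within the group
    run_radiation_step(local, flux)
```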
6.2.3 Aeroacoustics70

In the development of new aircraft, engines, high-speed trains, wind turbines and so forth, the prediction of the flow-generated acoustic field is becoming more and more important, since society expects a quieter environment and noise regulations – not only near airports – become stricter every year. Considering what has been achieved in computational aeroacoustics over the last 10 years, it is evident that the future of noise prediction, and one day even noise-oriented design, belongs to unsteady three-dimensional numerical simulation from first principles. However, the contribution of such methods to industrial activities in aerospace seems to be years away, i.e. it lags behind the contribution of computational fluid dynamics to the design of, for example, airframes and gas turbines. Certification often depends on a fraction of a dB, whereas at present predicting noise to within, say, 2 dB without adjustable parameters is already impressive. Generally, industry uses database methods, which chronically leave significant uncertainties leading up to flight tests, with serious business consequences; and model tests at small scales – of the order of 1/10 – are not reliable to a fraction of a dB. The extra difficulty in simulations aimed at engine, airframe or combustion noise is due to the very wide range of chemical, turbulent, acoustic and geometric scales, which are set by the configuration, the thin wall-bounded and free shear layers, the chemical layers and the audible range of sound.
68 Note that, because of their very high computational costs, these simulations are still limited (a few tens of cycles in internal combustion engines for a given regime) and sometime unique (ignition of a helicopter combustion chamber). Exascale machines will give access to longer physical times and repeated simulations with different designs, operating conditions or model constants to quantify the overall sensitivity to these parameters and optimise practical systems.
69 The structure of the code may strongly depend on the related physical phenomena. For example, combustion involves balances over small volumes and domain splitting is retained for parallelisation. On the other hand, radiative heat transfers are controlled by long-‐distance interactions and are more likely parallelised by wavelengths and/or radiation direction.
70 Wolfgang Schröder
The state of the art is limited to simplified components or geometries that can be tackled using manually generated structured meshes, in contrast to the actually installed systems which need to be simulated, most probably using adaptive unstructured body-fitted or Cartesian grids. The latter can be decomposed into an arbitrary number of blocks, such that the computations can be performed on massively parallel machines in the Eflop/s range and higher.
Such machines are essential for solving aeroacoustics problems not only at a generic but at an industrial scale – i.e. a complete wing in high-lift configuration, a full landing gear, a combustion-chamber–turbine–jet configuration – at such a level of efficiency, reliability and accuracy that a low-noise design can be achieved.
Consider the acoustic analysis of the noise generated by a full landing gear at a Reynolds number that is still two orders of magnitude below the real flow condition. To determine the noise source, the turbulent flow field has to be simulated. This requires a mesh in the tera-cell range and storage in the PByte range. Economically, such an analysis is completely out of range today; multi-petascale and then exascale computers are needed in the next three to five years to make such computations feasible. To tackle problems in the real Reynolds number range, the following generation of computers will be necessary.
6.2.4 Biomedical Flows71

Surgical treatment in human medicine can be optimised using virtual environments, in which surgeons perform pre-surgical interventions to explore best-practice methods for the individual patient. The treatment of the pathology is supported by analysing the flow field, for example optimising nasal cavity flows or understanding the development of aneurysms. The computational requirements for such flow problems have increased constantly over recent years and have reached the limits of petascale computing, not only in terms of computational effort but also of required storage. It is vital to understand fully the details of the flow physics in order to finalise the derivation of medical pathologies and to propose, for instance, shape optimisations for surgical interventions. Such an in-depth analysis can be obtained only by a higher resolution of the flow field, which in turn increases the overall problem size.
It goes without saying that it is very important in biomedical flows to resolve the wall-bounded shear layers in order to understand fully the influence of the flow on the tissue, which can cause irritation. This is done by an accurate computation of the wall-shear stresses. In this context, the wall heat flux also needs to be considered, requiring not only a high resolution close to the highly intricate geometry of the wall but also a highly resolved computational mesh representing the deformable tissue. A coupled solution approach is required to compute such fluid–structure interaction problems, which again increases the problem size and the computational effort, i.e. it necessitates exascale computing. Moreover, to determine the transitional flow, direct numerical simulations have to be performed to capture correctly time-dependent spatial flow structures such as evolving vortices, recirculation zones, separated flow and mixing layers as they appear (e.g. during the respiration phase in human lungs). The moisturisation and heating of the flow is strongly coupled to the formation of droplets caused by condensation at inhaled particle surfaces. In this context, understanding the transport, coagulation and collision of millions of particles from micro- to nanometre scale is extremely important. The aspect of particle transport is also essential to understand particle deposition in nasal drug delivery with sprays, and the deposition of diesel aerosols in the human lung, which can cause cancer.
The predicted growth in computational cost can only be handled by splitting the problem into subproblems distributed over more computational resources. Such resources could be provided by exascale computers. The current trend of reduced distributed memory combined with a massive increase in the number of computational units will shape future HPC systems, in which highly reliable fast interconnects need to be implemented to deal with the increased communication effort, guaranteeing good scaling speed-ups for exascale applications. Furthermore, the additional overhead of performing particle simulations cannot be handled by current Pflop/s computers but could be accommodated on exascale systems. All of these simulations need to be performed under unsteady conditions, leading to very high storage requirements that will reach the exascale range and cannot be met by today's HPC systems.

71 Wolfgang Schröder
Computations that have to be performed for the nasal cavity problem under high-frequency conditions involve Reynolds numbers in the range of Re ≈ 15,000. The lung problem – not just for the upper bifurcations but for approximately 20 generations – leads to cell counts in the tera-cell range, a total of about a billion time steps, and storage requirements in the range of a PByte. Currently, such a computation would take a few hundred days on a multi-Pflop/s IBM BlueGene/Q system (JUQUEEN). To perform such an analysis in the next couple of years definitely requires exascale computing power. Furthermore, tackling problems in which the entire fluid and structural mechanics of the respiratory system is simulated demands even the next generation of exascale computers, expected to be available in 2020.
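A rough, purely illustrative calculation shows how quickly such unsteady analyses become storage-bound. Only the tera-cell mesh, the billion time steps and the PByte target are quoted above; the five double-precision variables per cell are an assumption made for the sketch.

```python
# Rough storage arithmetic for the lung case quoted above.
cells = 1e12                   # tera-cell mesh (quoted above)
bytes_per_cell = 5 * 8         # assumed: 5 field variables in float64

snapshot = cells * bytes_per_cell                      # one full flow field
print(f"one flow snapshot: {snapshot / 1e12:.0f} TByte")        # 40 TByte
print(f"snapshots per PByte of archive: {1e15 / snapshot:.0f}")  # ~25
```

Under these assumptions a PByte of archive holds only about 25 full snapshots out of a billion computed time steps, which is why output selection and in-situ post-processing become unavoidable.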
6.2.5 Solid Body, Mechanical and Electrical Engineering72
The design of new structures with composite materials – with or without elastomeric behaviour – and of mechanical structures performing well at both very low and very high temperatures has shown very impressive improvement thanks to HPC. In practice, the equations of mechanical deformation are very non-linear and not easily solvable on many-core computers. Major progress is expected in the next few years with exascale applications.
Solution of auxiliary problems (linear equations, eigenvalue problems)
Hardware enhancements leading to exascale by 2020 – increasing both speed and memory roughly a thousandfold – should enable the solution of fundamental auxiliary problems for which algorithms with asymptotically linear complexity exist. This includes solving systems of linear equations or eigenvalue problems some 10–100 times faster, or treating problems that are some 10–1,000 times larger. Similar improvements are expected in the solution of basic large-scale state problems of mechanics and of electromagnetic fields, which should yield a similar impact in advanced computational engineering (see below). However, these goals will not be accomplished without research: current approaches must be adjusted to new architectures, in which the cost of an operation depends on the placement of its arguments in memory and on the structure of the communication costs.
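The arithmetic behind these factors is worth making explicit. Under the illustrative assumption that the solve cost scales as nᵅ, a thousandfold hardware gain stretches the solvable problem size by 1000^(1/α), which is where the quoted 10–1,000x range comes from:

```python
# Illustrative arithmetic only: how a ~1,000x faster machine translates
# into larger solvable problems, depending on solver complexity.
hardware_gain = 1_000.0

def problem_growth(alpha):
    """If solve cost ~ n**alpha, constant run time allows the problem
    size n to grow by hardware_gain**(1/alpha)."""
    return hardware_gain ** (1.0 / alpha)

print(problem_growth(1.0))   # ~1000x larger problems (optimal linear solver)
print(problem_growth(1.5))   # ~100x (e.g. sparse direct methods on 2D grids)
print(problem_growth(2.0))   # ~32x  (a quadratic-cost algorithm)
```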
Complex structures (larger problems)
Emerging computers will enable more realistic modelling of complex structures. Examples include:
• The transient analysis of a complete engine with evaluation of the stress and temperature fields
• Vibration analysis of relevant parts of power stations, taking into account the effect of damping or non-linear effects
• Multiscale problems, such as more reliable analysis of constructions with fibre composites, or modelling of crash tests with more realistic interaction of passengers
Optimal design (improved speed)
The typical goal of computation is to improve performance of the product. An increase in computer speed by three orders of magnitude enables a thousandfold enhancement in the resolution of
problems on the current edge of complexity – enough to switch from the analysis of such problems to optimal design. This should markedly increase the applicability of the optimal design methodology.

72 Zdenek Dostal
Reliability of computations (improved speed)
Most analyses carried out today do not take into account the uncertainty of the input data. For example, the result of a stress test is typically not a single stress limit but its distribution function. The reason is that such computations are considerably more time-consuming. Switching from common deterministic analysis to analysis that takes the uncertainty of the data into account would considerably increase the reliability of the analysis and of the resulting decisions. There are engineering problems where the explicit analysis of uncertainty is critical, such as the analysis of radioactive waste deposits. Improved performance would also improve the reliability of computations by enabling methods that incur additional cost, such as a posteriori error estimates.
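The cost argument can be made concrete with a toy Monte Carlo sketch. This is hedged: the ‘stress model’ below is a placeholder formula, whereas in practice each sample wraps a full finite-element solve, which is exactly why uncertainty analysis multiplies the computational cost by the number of samples.

```python
# Minimal sketch of non-deterministic analysis: propagating input
# uncertainty through a model by Monte Carlo sampling.
import numpy as np

rng = np.random.default_rng(seed=0)

def stress_model(load, yield_strength):
    # placeholder for an expensive finite-element computation:
    # returns a utilisation ratio (stress / strength)
    return load / yield_strength

n = 10_000                                    # cost grows with sample count
loads = rng.normal(100.0, 10.0, n)            # uncertain load
strengths = rng.normal(250.0, 20.0, n)        # uncertain material strength
ratios = stress_model(loads, strengths)

# instead of one deterministic number we obtain a distribution, from
# which e.g. an exceedance probability can be estimated
print("mean utilisation:", ratios.mean())
print("P(utilisation > 0.6):", (ratios > 0.6).mean())
```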
6.2.6 General Process Technologies, Chemical Engineering73

Chemical engineering and process technology are traditional users of HPC for dimensioning and optimising reactors in the design stage. Computational techniques are also used for improving the operation of processes – for example, through model predictive optimal control or through inverse modelling for estimating system parameters. The computational models used in chemical engineering span a wide range of scales. On the microscopic level, chemical reactions may be represented by molecular dynamics techniques, while on the mesoscopic level, flows through pores or around an individual particle may be of interest. The macroscopic scale eventually considers the operation, including heat and mass transfer, of a full industrial-scale reactor or even of a full facility.
Usually, laboratory-scale reactors do not scale trivially to full industrial process size. Therefore, simulation tools are essential to avoid time-consuming and expensive prototype systems when designing new processes. Typical computational problems here involve complex reactive multiphase flows. On the process scale, these systems can currently be represented only by averaging techniques and with macroscopic models that cannot capture the physics on the microscopic or mesoscopic scale from first principles. Such reactors typically involve bubbles, droplets and flow through pores. Additionally, the nucleation, transport and agglomeration of particles may be of interest. Modelling such interactions individually is already difficult for single microscopic objects. Currently, such models are computationally feasible only on a small scale, since the computational power is insufficient to simulate larger ensembles. Future systems will be essential to bridge the scales better and permit more detailed models.
Exascale systems will permit a better understanding of highly dispersed phenomena and of very large up- (or down-) scaling problems, such as aggregate formation and growth, through the development of much-improved particle simulation technologies (LBM, IBM, DEM, SPH, etc.),74 for example for describing multiscale interactions between fluid and structure, fluid–solid suspensions, interfaces and multi-physics coupling.
For process design and its optimisation, both now and in the future, macroscopic models based on continuum descriptions will be used. However, macroscopic models require a closure of the model equations. Correct closure laws are essential for the fidelity of the simulation, but currently they can often be derived only from empirical arguments. The predictive power of such macroscopic models is therefore limited, and thus their industrial use is not yet satisfactory in many cases. With exascale computers it will become possible to model and simulate such systems with much finer resolution, and it will become increasingly feasible to use more refined and detailed models. For
example, given exascale computational power, new methods in particle dynamics and discrete element methods can be coupled with continuum-based CFD models to simulate particulate flows directly, with full resolution of each particle.

73 Uli Rüde
74 LBM – Lattice-Boltzmann Method; IBM – Immersed Boundary Method; DEM – Discrete Element Method; SPH – Smoothed Particle Hydrodynamics
We expect, for example, that such multiscale and multi-physics models will present many new opportunities for the process industry, but they depend critically on the interdisciplinary development of models, algorithms and simulation software, and of course on the availability of sufficient computational power in the form of future exascale systems. Eflop/s systems will help to reduce the dependence on surrogate models and hence lead to ever more accurate process models. In some cases, exascale systems may allow the simulation of complex multiscale phenomena at full industrial scale.
6.3 Computational Grand Challenges and Expected Outcomes in Industry
Industrial applications involved in the field of numerical simulation that need next-generation exascale systems are mainly:

• Aeronautics: full Multidisciplinary Design and Optimisation (MDO), CFD-based noise and in-flight simulation: the digital aircraft
• Turbo machines, propulsion: aircraft engines, helicopters, etc.
• Structure calculation: design of new composite compounds, deformation, etc.
• Energy: turbulent combustion in closed engines and open furnaces, explosions in confined areas, power generation, hydraulics, nuclear plants, etc.
• Automotive: combustion, crash, external aerodynamics, thermal exchanges, etc.
• Oil and gas industries: full 3D inverse waveform problem (seismic), reservoir modelling, multiphase flows in porous media at different scales, process plant design and optimisation, CO2 storage, etc.
• Engineering (in general): multiscale CFD, multi-fluid flows, multi-physics modelling, computer-aided engineering, stochastic optimisation, etc.
• Special chemistry: molecular dynamics (catalysts, surfactants, tribology, interfaces), nano-systems, etc.
• Others (banking/finance, medical industry, pharmaceutical industry, etc.): ‘big data’, data mining, image processing, etc.
• Common issues for all of the above: data assimilation, uncertainty quantification, etc.
6.3.1 Turbo Machines, Propulsion75

Motivation. Numerical simulation and optimisation are pervasive in the aeronautics industry, and in particular in the design of propulsion engines. The main driving force of technological evolution is the substantial targeted reduction of specific fuel consumption and environmental nuisance – in particular greenhouse gases, pollutant emissions and noise – as put forward by bodies such as ACARE and IATA. On the engine side, these ambitious goals are pursued by increasing propulsive and thermodynamic efficiency, reducing weight and, finally, controlling sources of noise. The targets can probably not be achieved simply through gradual improvement of current concepts. The development of disruptive propulsive technology is needed, relying even more heavily on numerical tools to overcome the lack of design experience. We can foresee two major challenges related to
HPC: the use of high-fidelity numerical tools towards a more direct representation of turbulence, and the evolution of optimisation strategies.

75 Koen Hillewaert
High-fidelity aerodynamic simulation. Although jet engine design is inherently multidisciplinary, predicting the aerodynamics is both the most critical and the most costly issue. A recent review of the state of the art is given by Tucker.76,77

To date, the Reynolds-Averaged Navier–Stokes (RANS) approach, which models the ensemble-averaged impact of turbulence on the main flow, is the main workhorse, mainly owing to its combination of acceptable prediction accuracy and low computational cost. RANS is in principle capable of simulating flow unsteadiness, such as rotor–stator interactions, provided there is a clear scale separation between the turbulence and the computed flow features. The problem is that this scale separation is usually not guaranteed, which is a possible explanation for the discrepancy between computed and measured performance. RANS also clearly fails to predict transitional flows, flow instabilities, broadband (and to a much lesser extent tonal) noise generation, combustion efficiency, etc.

At the other end of the spectrum, large eddy simulation (LES) approaches represent the energy-carrying scales of the turbulence directly, while using relatively simple models for the more isotropic and universal non-resolved scales. However, due to accuracy and resolution requirements, the computational effort for this approach is prohibitive in practice and is likely to remain so; although the approach has already been applied to realistic geometries, it is generally accepted that these computations were not sufficiently resolved.69

Hybrid approaches, using either RANS in the proximity of the wall70 or wall models, result in a further significant reduction of the computational effort, but lead to increased modelling error, in particular concerning flow transition to turbulence. To date, no consensus has been reached on whether these approaches are adequate or on their optimal parameterisation.

We can expect that the focus for the next five years will be on the further development of the LES and hybrid approaches, relying on direct numerical solutions both for a better comprehension of the flow phenomena and as reference data. A second axis concerns the reduction of computational effort using more accurate and adaptive discretisation techniques. Adaptation techniques may also prove a solution for mesh generation, a process that is inherently difficult to parallelise. During this period, access to large-scale resources is of paramount importance for the development of these methods. The recurrent industrial application of these accurate simulation strategies to fans, isolated blade rows and stages is expected near the end of the decade, initially for final verification to complement the optimisation chain, and later on integrated in the optimisation loop. Given the complexity of the flow and the wish to reduce the modelling hypotheses as much as possible, it is expected that the use of computational resources will follow their availability. In any case, by this time industry hopes to use Eflop/s-scale machines and beyond, at least for a number of urgent computations.
Optimisation. Given the level of expected performance, it is clear that the design challenges are quite daunting and therefore require the use of automated optimisation chains, which autonomously launch simulations to assess the merits of a number of design choices. A very popular class of approaches, in particular for complex design spaces, constructs a surrogate model from these computations, on which the actual optimisation is then performed.
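The sketch below illustrates that class of approaches in miniature. It is hedged: the ‘expensive simulation’ is a toy one-dimensional function, and the surrogate is a simple quadratic fit rather than the kriging or radial-basis models typically used in industrial chains.

```python
# Hedged sketch of surrogate-based optimisation: fit a cheap model to
# a handful of expensive simulations, optimise the surrogate, then
# verify the candidate with one more true evaluation.
import numpy as np

def expensive_simulation(x):
    # toy stand-in for a CFD evaluation of, say, an efficiency loss
    return (x - 0.3) ** 2 + 0.05 * np.sin(12 * x)

# 1. design of experiments: a few expensive samples
xs = np.linspace(0.0, 1.0, 7)
ys = np.array([expensive_simulation(x) for x in xs])

# 2. build a cheap surrogate (quadratic least-squares fit)
coeffs = np.polyfit(xs, ys, deg=2)

# 3. optimise the surrogate on a dense grid (cheap)
grid = np.linspace(0.0, 1.0, 10_001)
x_best = grid[np.argmin(np.polyval(coeffs, grid))]

# 4. verify the candidate with one more expensive run
print(f"surrogate optimum x={x_best:.4f}, "
      f"true objective {expensive_simulation(x_best):.5f}")
```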
Aerodynamic simulation currently relies largely on steady RANS computations. Over the next few years, more expensive unsteady periodic computations will be used more extensively to include unsteady effects, blade excitation and tonal noise in the optimisation loop. However, since significant effort is devoted to the development of more economical approaches, optimisation will continue to
rely on a large number of relatively cheap computations. We can therefore assume that, in the next few years, the main evolution with respect to HPC will be a significant increase in the number of computations as a function of the available resources, driven by the need for robust optimisation and uncertainty quantification, as well as by the exploration of ever-greater design spaces.

76 ‘Computation of unsteady turbomachinery flows: part 1 – progress and challenges’, P. G. Tucker, Progress in Aerospace Sciences, vol. 47, pp. 522–545, 2011
77 ‘Computation of unsteady turbomachinery flows: part 2 – LES and hybrids’, P. G. Tucker, Progress in Aerospace Sciences, vol. 47, pp. 546–569, 2011
A very important issue specific to optimisation is the heterogeneity of the application, due not only to the involvement of multiple physics, each with different timescales and computational requirements, but also to the need for automatic geometry modification coupled to mesh generation, and to the global steering by the optimisation tool. This requires even more flexible scheduling and the development or adaptation of heterogeneous communication protocols. Moreover, standard non-HPC numerical technology will need to be ported, in particular CAD manipulation and mesh generation tools; alternatively, a connection to standard workstations will be required.
In the longer run, high-fidelity large-scale simulations will also be integrated into the optimisation chain. Given the large computational requirements, this is likely to be possible only within the framework of a multi-fidelity approach, combining different levels of resolution for the construction of the surrogate model. The time frame for this inclusion therefore depends on the development of LES and hybrid technology, but also on the development of the mathematical framework underpinning multi-fidelity surrogate models. We can probably expect to see the first demonstrators towards the end of this decade.
6.3.2 Aeronautics78

The impact of computer simulation on aircraft design has been significant and continues to grow. Numerical simulation allows the development of highly optimised designs and reduces development risks and costs. Boeing, for example, exploited HPC to reduce drastically the number of real prototypes, from 77 physical prototype wings for the 757 aircraft to only 11 prototype wings for the 787 ‘Dreamliner’. HPC usage saved the company billions of euros.

Aircraft companies are now heavily engaged in trying to solve problems such as calculating maximum lift using HPC resources. This problem has an insatiable appetite for computing power and, if solved, would enable companies designing civilian and military aircraft to produce lighter, more fuel-efficient and environmentally friendlier planes.
To meet the challenges of future aircraft transportation (‘greening the aircraft’), it is vital to be able to flight-test a virtual aircraft, with all its multidisciplinary interactions, in a computer environment, and to compile all of the data required for development and certification with guaranteed accuracy in a reduced time frame.

For these challenges, exascale is not the final goal: a complete digital aircraft will require more than Zflop/s systems.

In parallel, future aircraft concepts require a deeper basic understanding in areas such as turbulence, transition and flow control, to be achieved by dedicated scientific investigations (see the engineering science topics above).
The roadmap for approaching the digital aircraft vision includes the following major simulation and optimisation challenges:
• Improved physical modelling for highly separated flows
• Real-time simulation of aircraft in flight, coupling the aerodynamic, structural mechanics, aeroelastic and flight-mechanics disciplines based on high-fidelity methods within a multidisciplinary massively parallel simulation environment
• Aerodynamic and aeroelastic data production
• Noise source and impact: full development of noise source mechanisms, acoustic radiation and noise impact simulation tools which compute acoustic disturbances on top of the aircraft flow
• Multidisciplinary aircraft design: fully coupled simulation of the flow around a parameterised aircraft configuration and surface shapes covering a reactive structural model within a sophisticated optimisation process. The coupled large-scale simulations will run multiple times on exascale systems, allowing a mix of capacity and capability applications

78 Philippe Ricoux, Stephane Requena
In terms of timing, the aeronautics industry has already produced a roadmap linking capacity and methods (see Figure 6.1).
Figure 6.1. Aeronautics industry roadmap linking capacity and methods (courtesy of Airbus)
6.3.3 Seismic, Oil and Gas79

The petroleum industry is strongly motivated to increase the efficiency of its processes, especially in exploration and production, and to reduce risks by the deployment of high-performance computing. Typical steps in the business process are: geoscience for the identification of oil and gas underground; development of reservoir models; design of facilities for the recovery of hydrocarbons; drilling of wells and construction of plant facilities; operations during the life of the fields; and eventually decommissioning of facilities at the end of production.
Geoscience analyses seismic data with numerical techniques for inverse problems. The economic impact of HPC here is definitely high, and the best possible tools are deployed.

Again, Eflop/s is not the ultimate goal: the complete inverse-problem resolution of the wave equation needs even more computational resources.
The objective of this application is to produce, from a seismic campaign, the best estimate of the underground topography in order to optimise reservoir delineation and production by solving the full inverse wave equation. This application is largely embarrassingly parallel: the higher-performing the HPC system, the better the approximation of the underground topography.
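To make the ‘embarrassingly parallel’ structure concrete, the sketch below (illustrative Python, with a toy linear operator standing in for a real wave-equation solver) shows why this kind of inversion parallelises over shots: each shot contributes an independent term to the misfit gradient, and the terms are simply summed.

```python
# Schematic sketch of shot-parallel inversion. The forward model per
# shot is a toy linear operator, not a real wave-equation solver.
import numpy as np

rng = np.random.default_rng(1)
n_params, n_shots = 50, 8

# one toy linear 'forward modelling' operator per seismic shot
forwards = [rng.normal(size=(30, n_params)) for _ in range(n_shots)]
m_true = rng.normal(size=n_params)          # unknown earth model
data = [F @ m_true for F in forwards]       # 'recorded' seismograms

def misfit_gradient(m):
    # each term is independent -> shots can be farmed across nodes,
    # with one global sum (reduction) at the end
    return sum(F.T @ (F @ m - d) for F, d in zip(forwards, data))

m = np.zeros(n_params)                      # starting model
step = 1e-4
for _ in range(500):                        # plain gradient descent
    m -= step * misfit_gradient(m)
print("relative model error:",
      np.linalg.norm(m - m_true) / np.linalg.norm(m_true))
```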
79 Philippe Ricoux
Alongside seismic, other physics must be coupled and solved, such as electromagnetism (antenna, radar, sonar, etc., and more generally wave propagation). Maxwell’s equations could be solved without modelling in 3D academic configurations to build databases that are post-processed to analyse the physics and develop models devoted to practical systems.
A roadmap of the steps of this kind of approach, showing the different necessary and increasingly complex methods of approximation of the physical reality (e.g. elastic, visco-elastic, etc.), is shown in Figure 6.2, courtesy of Total; this roadmap is now accepted by all international oil companies. For this application, it is essential both to define and implement new algorithms representing more accurately the physics of the problems to be solved, and to deploy ever more powerful hardware.
Moving beyond geoscience, the other activities in the petroleum industry have aspects that are generally classified as system-of-systems design and multiphase fluid dynamics. Of these topics, fluid dynamics requires a significant computing effort.

Similar criteria apply to multi-fluid problems as to geoscience. Enhanced quality of simulations depends both on more appropriate physical models and on numerical methods and techniques (e.g. for bifurcation analysis). Physical scales are disparate: in pipeline modelling, for instance, the diameter is measured in fractions of a metre while the length of the pipeline is normally measured in kilometres.

For all oil and gas applications, future codes must merge and couple multiscale techniques and multi-physics models, owing to the complexity of the non-linear and stochastic equations. These domains will therefore require one or more breakthroughs for efficient use of exascale systems.
In summary, the major problems and issues are:
• Use of standard programming models (MPI, OpenMP, etc.) and cross-compiling – a portable software stack for development on PCs and deployment on large HPC systems
• Maintenance of legacy codes
• Tools for testing, verification and validation of (parallel) codes
• Memory access
• Data management
• Task management and distribution
• Efficient solvers
• Numerical multiscale techniques and methods
• Efficient massively parallel coupling
• Uncertainty quantification
• Data assimilation

Figure 6.2. Seismic depth imaging methods evolution and HPC (courtesy of TOTAL)
6.3.4 Power Generation, Nuclear Plant80

In this industrial domain,81 the objectives are multiple: (i) improvement of the safety and efficiency of the facilities (especially nuclear plants), and (ii) optimisation of maintenance operations and lifespan. This is one field in which physical experimentation, for example with nuclear plants, can be both impractical and unsafe. Computer simulation, in both the design and operational stages, is therefore indispensable.
Thermal Hydraulic CFD Application Field
Improvement of efficiency typically involves mainly steady CFD calculations on complex geometries, while improvement and verification of safety may involve long transient calculations on slightly less complex geometries.
• The study of flow-induced loads to minimise vibration and wear through fretting in fuel assemblies may require from 200 million to 2 billion cells per fuel assembly; to account correctly for both cross-flows in the core and walls around the core, at least one quarter of a core (over 100 assemblies) may need to be modelled.
• To study flow-induced deformation in PWR (pressurised water reactor) cores, a full core may need to be represented, at a slightly lower resolution, for an estimated grid size of at least 5 billion cells, which leads to runs on 100 Pflop/s systems over several weeks.
• Detailed simulations designed to verify and increase safety may require full core simulations, and mesh sensitivity studies for these transient calculations may require unsteady calculations on meshes of 5 to 20 billion cells before 2020, corresponding to runs on 400 Pflop/s systems over several weeks.
• To validate the models used for calculations such as the ones described above, as well as many others, running quasi-‐DNS type calculations on subsets of the calculation domain may be necessary.
This will require meshes in the 20-billion-cell range by 2012 (to study cross-flow in a tube bundle, in a simplified steam-generator-type configuration); running similar calculations for more complete calculation domains may require meshes well above 100 billion cells by 2020.
Note that, as safety studies increasingly require assessment of CFD code uncertainty, sensitivity to boundary conditions and resolution options must be studied, but turbulence models may still induce a bias in the solution.
80 Philippe Ricoux, EDF, EESI
81 Note that we do not consider HPC applications linked to nuclear weapons in this report, but restrict our attention to civil nuclear applications.
Doing away with turbulence models and running DNS-type calculations, at least for a set of reference cases, would be a desirable way of removing this bias. Such studies will require access to multi-Eflop/s capacities over several weeks.
Neutronics Application Field
This includes the capability to model very complex, possibly coupled phenomena over extended spatial and time scales. In addition, uncertainty quantification and data assimilation are considered key to industrial acceptance, so their associated computational needs, which depend on the complexity of the model considered, also have to be met.
In terms of computing resources, projections are difficult to make because of the non-linear behaviour of iterative algorithms with respect to the number of degrees of freedom – and the number of processors. Additionally, new algorithms may have to be implemented to address new types of numerical/physical problems within an evolving architecture.

Figure 6.3. A possible roadmap for Eflop/s neutronics computation, courtesy of EDF

Electric Power Generation Overview
Many other applications exist beyond those mentioned: new generations of power plants, innovation in renewable energies and storage, protection against specific environmental threats (earthquakes, floods, heatwaves, etc.), customers’ energy efficiency, development of technologies and services for energy efficiency in homes and buildings, etc.
Several problems must be addressed to reach these goals: CFD, heat and multi-fluid flows, thermal-hydraulic CFD, LES simulations, etc., modelling very complex systems, with possibly coupled phenomena
over extended spatial and time scales, mixed with capabilities such as uncertainty quantification or data assimilation.
The challenge is particularly severe for multi-physics, multiscale simulation platforms that will have to combine massively parallel software components developed independently of each other. Another difficult issue is dealing with legacy codes, which are constantly evolving and have to stay at the forefront of their disciplines.
This will require new compilers, libraries, middleware, programming environments and languages, as well as new numerical methods, code architectures, mesh generation tools, visualisation tools, etc.
6.3.5 Transportation, Automotive82

The automotive industry is actively pursuing important goals that need Eflop/s computing capability or greater, including the following examples:
• Vehicles that will operate for 250,000 kilometres (150,000 miles) on average without the need for repairs – this would provide considerable savings for automotive companies by enabling the vehicles to operate through the end of the typical warranty period at minimal cost to the automakers
• Full-body crash analysis that includes simulation of soft tissue damage (today’s ‘crash dummies’ are inadequate for this purpose) – insurance companies in particular require this
• Longer-lasting batteries for electrically powered and hybrid vehicles
For both aerodynamics and combustion, at least LES, and if possible DNS, simulations are required at an industrial scale, and Eflop/s applications must be developed at the right scale, exploiting weak scalability; these simulations must be coupled to all the physics (flow, thermal, thermodynamic, chemistry, etc.) involved in the global transportation system. This leads to a requirement for coupled simulations involving at least one legacy code with:
• Full-scale, multi-physics configurations
• Multiple runs for optimisation and parametric/statistical analysis
The global roadmap for this sector could thus be as follows:
• Individual performance and scalability of component codes
• Eflop/s systems will mainly allow multiple runs, by ‘farming’ applications, for ‘optimised’ resolutions
• Overall performance of the multi-physics coupled system; once more, this leads to farming applications
• Data management
For combustion and external aerodynamics, see the specific scientific descriptions above (section 6.2.2).
Crash
Crash simulation must evolve from the present, where most computations are run in parallel on 8–64 cores and scalability tests have shown that up to 1,024 cores may be reasonable for 10 million finite elements, to a future where model sizes for a full car will range between 1.5 and 10 billion elements.

New codes (mainly open-source) must be developed for Eflop/s systems with the following attributes:
82 Philippe Ricoux, Stephane Requena
• Coupling to perform a standardised mapping between manufacturing simulation and crash simulation
• Optimisation and stochastic analysis
• First multi-level computations are being tested in research and in industrial applications, where so-called sub-cycling is used – more detailed parts of the problem are treated on a dedicated group of cores.
• In general, crash simulation is already well embedded in simulation data management systems with automated pre- and post-processing, including monitoring and coupling to other fields and functionalities.
For the 10-year perspective, the following main challenges must be addressed:

• True virtual testing replacing some physical tests, which requires reliable computations – for example, it will be necessary to replace parts meshed with shell elements by 3D meshes (a middle pillar meshed with 30 million finite elements at Audi).
• Handling much higher complexity in finite element models (new materials, human models instead of dummies, etc.); these new materials require better and more efficient/stable algorithms. The human models have to be improved, with stochasticity included. It will be necessary to consider not only meshed driver models but also fully 3D-meshed passenger models, to increase the number of test cases to be more representative of typical real car accidents, and to model more precisely the behaviour of all the airbags with a good acceleration model (including a law to model airbag release).
• Ensuring that the overall computational wall-clock time remains constant (ideally ca. 8 hours for an overnight production run).
• Addressing true multidisciplinary and multi-physics simulations, including optimisation and stochastic analysis; this will lead to a factor of > 1,000 in the required number of computations compared with today, and the necessity to embed all simulations in an overall simulation data management system. As an example, optimising by hand is possible for three to five parameters but not for > 100 parameters at the same time. A big challenge is to lower the weight of cars (in order to reduce their consumption); initial R&D studies have shown the need to change materials (from steel to aluminium) and, as a result, to reconsider the weight/cost/performance ratio. This re-conception process with such new materials will require massive shape-optimisation studies in order to maintain performance and safety while reducing weight and cost.
• Establishing robust topology and shape optimisation for crash, including meta-modelling techniques for fast coupled multidisciplinary analysis (especially for fluid–structure coupling) or for crash/stamping coupling with a very accurate representation of materials.
• Multi-level simulations where some local effects (e.g. failure) are studied at the meso-level in parallel to the overall macro-computation, which might be realised with hybrid parallelisation schemes. Representation of tear sheets and the fracture of spot welds is important because it changes the crash shock scenario. The models currently used in industry do not represent the behaviour at the meso level, even in a simplified way; addressing this is expected to increase the size of the models by a factor of two.
Addressing these challenges requires projects in the following areas:
• Due to the complexity and high non-linearity of crash simulation, it will be difficult to progress through strong scalability alone. A limiting estimate could be between 64 (the current standard) and 2,000 cores for next-generation crash simulations. Farming applications should run on Eflop/s machines able to address both capability and capacity simulations simultaneously.
• Memory is currently not an issue because of the explicit FE (Finite Element) method, but it will become more important, especially given the trend towards coupled simulations in which a large amount of data needs to be mapped from manufacturing simulation to crash simulation. In the future, some companies may use implicit methods, which are more memory-intensive. As an example, BMW is using implicit methods for its crash simulations: the computational cost is greater but the results are much more accurate. The ESI Group is working on merging explicit and implicit methods to make it possible to perform a crash and an NVH (Noise, Vibration and Harshness) simulation with the same mesh.
• Automated pre-processing should be improved (e.g. meshing of 3D objects, coupling between CAD and CAE, unified geometrical modelling by isogeometric analysis, parameterisations for sensitivity and optimisation studies).
As for the numerous other industrial applications in the energy domain, the following methods must be addressed for an Eflop/s crash test application:
• More efficient algorithms for stochastic modelling
• More efficient algorithms for shape and topology optimisation; current single crash simulations take around 8 to 15 hours on up to 64 cores. The target is to perform, in an overnight run, a whole shape-optimisation study on a full body consisting of 10x or 100x single crash simulations, for analysis the next day.
• Establishment of a uniform approach for CAD and CAE (and other CAx), already demonstrated on subsystems with up to tens of parameters; the target would be to use it on a full system
• Improved material models for soft tissues (human model), composites, honeycomb structures and multi-material lightweighting
• Algorithms for multi-level analysis for composites and other new lightweight materials, where a coupling between manufacturing and crash simulation is realised
• Algorithms for multi-physics (especially for electric cars) and multidisciplinary simulations
• New techniques for parallelisation to improve scalability (based on sub-cycling or other approaches)
• Robust meshing techniques for 3D modelling, which can be used during simulation to enable shape optimisation and adaptive multi-level computation (e.g. for failure analysis). Adaptive meshing is still not possible, and the trend is towards standardising the functions used in CAD and meshing.
• Fluid-structure interaction (simulation and optimisation), in combination with the simulation of combustion and pollutant emissions.
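To give a feel for the farming workload implied by the overnight-optimisation target above, the following minimal sketch (ours, not from the panel; the 12-hour run time and 12-hour overnight window are illustrative assumptions) estimates the core count needed to finish a campaign of independent 64-core crash runs overnight.

```python
# Back-of-envelope sizing for the overnight shape-optimisation campaign
# described above. Known figures from the text: one crash run takes 8-15 h
# on up to 64 cores; a full-body study needs 10x-100x runs. The 12 h run
# time and 12 h window are illustrative assumptions.
import math

def farming_cores(n_runs, cores_per_run=64, hours_per_run=12.0, window_h=12.0):
    """Cores needed to finish n_runs independent simulations in the window,
    assuming pure capacity ('farming') mode with no inter-run communication."""
    runs_per_slot = max(1, int(window_h // hours_per_run))  # sequential runs per 64-core slot
    slots = math.ceil(n_runs / runs_per_slot)
    return slots * cores_per_run

for n in (10, 100, 1000):  # 1,000 ~ adding a stochastic robustness loop
    print(f"{n:5d} runs -> {farming_cores(n):7,d} cores overnight")
```

Even the 100-run case needs only a few thousand cores; it is the stochastic and multidisciplinary loops wrapped around it that multiply the demand towards Eflop/s-class capacity.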
6.3.6 Other Important Industrial Applications83
Industrial Medical Applications. This is a large market, although the companies involved are mainly SMEs. HPC is used for cardiovascular flows, modelling of the brain (not yet industrialised), tumour growth and medical imaging (e.g. the combination of MRIs). This market will grow by exploiting increasing performance in viscous-flow simulation, image processing, 2D/3D reconstruction and 'big data' management.
Industrial Pharma Applications. All these industries, firmly established in Europe, already use ab-‐initio and molecular simulation applied to their domains, and they will increase R&D efforts in this
83 Philippe Ricoux, Olivier Pironeau
field (see sections 4 and 5 of this report) for drug design (GSK, Sanofi) and biomedical applications (L'Oréal). The main issues for these industries include:
• 'Big data' management, generation, transport and storage, driven by screening simulations
• Exascale-efficient MD software
• New data mining for massively parallel QSAR (Quantitative Structure-Activity Relationships)
§ (cf. Bio and medical sciences Scientific Case for academic developments)
Banks and Insurance Companies are increasingly using HPC, mostly for embarrassingly parallel Monte-Carlo solutions of stochastic differential equations; but high-frequency trading will inevitably require better models and faster calculation (see the sketch below). They also face the challenge of interconnecting supercomputers and several private clouds. Finally, in common with many other industries mentioned in this report, they are confronted with the 'big data' problem, in the sense that massive market data are available (Reuters) and current calibration algorithms cannot exploit such large inputs. Note that 41 machines are characterised as 'finance' in the Top 500 list (November 2011).84
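As a minimal illustration of why this workload farms so well, the sketch below (ours, with illustrative parameter values throughout) prices a European call by averaging independent Euler-Maruyama paths of a geometric Brownian motion; each worker runs with no communication at all.

```python
# A minimal sketch (not from the report) of the embarrassingly parallel
# Monte-Carlo workload described above: independent Euler-Maruyama paths of
# a geometric Brownian motion dS = r*S dt + sigma*S dW, averaged into a
# discounted European call price. All parameter values are illustrative.
import numpy as np
from multiprocessing import Pool

R, SIGMA, S0, K, T, STEPS = 0.03, 0.2, 100.0, 105.0, 1.0, 250

def price_batch(args):
    seed, n_paths = args
    rng = np.random.default_rng(seed)
    dt = T / STEPS
    s = np.full(n_paths, S0)
    for _ in range(STEPS):
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)
        s += R * s * dt + SIGMA * s * dw          # Euler-Maruyama step
    return np.exp(-R * T) * np.maximum(s - K, 0.0).mean()

if __name__ == "__main__":
    # Each worker prices an independent batch; no communication is needed,
    # which is what makes this a trivially parallel ('farming') workload.
    with Pool(4) as pool:
        estimates = pool.map(price_batch, [(seed, 100_000) for seed in range(4)])
    print(f"MC call price ~ {np.mean(estimates):.3f}")
```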
Emerging Technologies. New types of industry are evolving around computer networks, data mining, social networks, etc. The main issues here include:
• 'Big data' management, generation, transport, storage
• New data mining for massively parallel analysis (K Tables, etc.)
One of the major issues will be to allow major companies to take advantage of HPC to increase their competitiveness, but also to help their entire supply chains (including SMEs) to engage in the use of HPC. This means, on the one hand, that large exascale system designs should be easily downscaled to Pflop/s 'in-a-box' systems for SMEs and, on the other, that software must be made available, known and affordable for such small companies. This issue is crucial for ensuring global European industrial competitiveness.
84 http://www.top500.org/lists/2011/11
6.4 Engineering and Industrial Exascale Issues
The major issues, from both an academic and an industrial perspective, that must be addressed in order to enable efficient exascale applications are shown in Table 6.1 below.
Table 6.1. Enabling exascale applications – an academic and industrial perspective.
1. The Simulation Environment
• Unified Simulation Framework and associated services: CAD, mesh generation, data-‐setting tools, computational scheme editing aids, visualisation, etc.
• Multi-‐physics simulations: establishment of standard coupling interfaces and software tools, mixing legacy and new generation codes
• Common (jointly developed) mesh-‐generation tool, automatic and adaptive meshing, highly parallel
• Standardised efficient parallel I/O and data management (sorting memory for fast access, allocating new memory as needed in smaller chunks, identifying memory that is rarely or never needed based on heuristic algorithms, etc.); see the sketch after this table
2. Codes/Applications
• New numerical methods, algorithms, solvers/libraries, improved efficiency
• Optimisation, data assimilation
• Coupling between stochastic and deterministic methods, uncertainty quantification
• Numerical scheme involving stochastic HPC computing for uncertainty and risk quantification
• Meshless methods and particle simulation
• Large databases, 'big data', new methods for data mining and valorisation
• Scalable programs, strong and weak scalability, load balancing, fault-‐tolerance techniques, multi-‐level parallelism (issues identified with multi-‐core with reduced memory bandwidth per core, collective communications, efficient parallel I/O)
• Development of standard programming models (MPI, OpenMP, C++, Fortran, etc.) handling multi-level parallelism and heterogeneous architectures (GPUs)
3. Archival Storage and Data Transfer
• Certainly one of the hardest challenges will be in archival storage, network capacity to transfer multi-‐petabyte data sets, and post-‐processing tools, such as graphics software capable of managing individual files in the multi-‐TBytes range.
• This may require the establishment of one or several dedicated service centres across Europe, linked to mass-storage facilities. Basic engineering research, as opposed to proprietary development, is cooperative, and it is important that access to such data and centres remains open to groups beyond the data originators for several years after the simulations are run.
4. Human Resources
• Training and education of HPC developers and engineers
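As an illustration of the standardised parallel I/O called for in part 1 of the table, the following minimal sketch (ours, assuming the mpi4py bindings; the file name and sizes are illustrative) has every rank write its slice of a distributed field into one shared file through a single collective MPI-IO call.

```python
# A minimal sketch (assumed mpi4py, not from the report) of standardised
# parallel I/O: each rank writes its own slice of a distributed array into
# one shared file with a collective MPI-IO call.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

local_n = 1_000_000                        # cells owned by this rank (illustrative)
data = np.full(local_n, rank, dtype=np.float64)

fh = MPI.File.Open(comm, "field.dat",
                   MPI.MODE_CREATE | MPI.MODE_WRONLY)
offset = rank * local_n * data.itemsize    # byte offset of this rank's slice
fh.Write_at_all(offset, data)              # collective write into one shared file
fh.Close()
```

Collective calls let the MPI layer aggregate and reorder requests across ranks, which is where the efficiency gain over one-file-per-rank approaches comes from.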
6.5 A Roadmap for Computational Requirements
For many academic and industrial applications, the computational requirements are very similar. The timing is also similar, with a step around 2015-2017 for 100 Pflop/s systems and 2020-2022 for exascale. For turbulence, combustion, aeronautics, etc., these steps correspond to the development of new resolution methods (LES, DNS-like) and increases in the number of grid points (as described in previous sections). The table below could represent a good compromise for all these domains involving automatic mesh generation.
Table 6.2. Computational requirements for domains involving automatic mesh generation.

Case 1: Adverse-pressure-gradient boundary layer, Reθ = 20,000
• Current state of the art: Reθ = 2,000
• Likely date: 2015
• Grid points: 300 Gpoints
• CPU hours (IBM BG-P): 5 Gh
• Cores (BG-P): 4 M
• Central storage: 80 TBytes
• Archival disk storage: 7 PBytes
• Notes: extension of current software (several cases to be run)

Case 2: Compressible jet, with nozzle and acoustics, ReD = 50,000
• Current state of the art: ReD = 8,000, developing from pipe
• Likely date: 2015
• Grid points: 32 Gpoints
• CPU hours (IBM BG-P): 2.5 Gh
• Cores (BG-P): 1.5 M
• Central storage: 80 TBytes
• Archival disk storage: 400 TBytes
• Notes: extension of current software

Case 3: LES of multi-stage low-pressure turbine (50 blades)
• Current state of the art: RANS modelling
• Likely date: 2020
• Grid points: 5,000 Gpoints
• CPU hours (IBM BG-P): 200 Gh
• Cores (BG-P): 60 M
• Central storage: 5 PBytes
• Archival disk storage: 10 PBytes
• Notes: requires new integration, gridding and other software
As one illustrative example, we project the progress expected in direct numerical simulation in Table 6.3.
Table 6.3. Direct numerical simulation challenges and expected status in the 2012-2020 timeframe.

2012:
• Computational power: 5 Pflop/s
• Main memory: 100 TBytes
• Cores: 100 K
• Particles simulated: 100 K - 10 M
• Particle model: mass point or spherical
• CFD grid cells: 10 G - 100 G
• Core hours: > 20 M
• Notes: simple particle models, explicit coupling for about 10 M time steps

2017:
• Computational power: 100 Pflop/s
• Main memory: 1 PBytes
• Cores: 2 M
• Particles simulated: 1 M - 100 M
• Particle model: geometrically resolved, rigid
• CFD grid cells: 100 G - 1 T
• Core hours: > 200 M
• Notes: using fluid-structure interaction techniques and immersed/embedded boundary techniques

2022:
• Computational power: > 1 Eflop/s
• Main memory: 10 PBytes
• Cores: 100 M
• Particles simulated: > 1 G
• Particle model: elastic deformable
• CFD grid cells: > 10 T
• Core hours: > 2 G
• Notes: using models of deformable particles, each with e.g. 100 degrees of freedom
For industrial applications such as oil and gas, aeronautics and nuclear plants, exascale is not the ultimate goal but just a stepping stone towards zettascale. Solving the seismic inverse problem, or designing a digital aircraft, or space applications, will require much more than exascale computers.
6.6 Expected Status in 2020
Eflop/s computers are expected from HPC vendors around 2020, and one of the key issues will be to keep the overall power consumption at an acceptable level of around 20 MW. These systems will be a central ingredient in the further development of engineering research in Europe. Several basic-flow studies are currently waiting on the availability of such machines, for example some turbulence codes and several DNS simulations.
For the majority of applications, however, evidence suggests that the real need, namely large-scale cooperative simulation projects, is not currently contemplated in EU funding schemes. Such projects should include technological and basic research into areas such as flow physics, code integration and interfacing, verification and validation, gridding, numerics, parallelisation, and the interaction of all those aspects with new computer and accelerator architectures.
What is particularly required by 2020 is software offering load balancing and fault tolerance, coupled with responsiveness to user needs. What is clearly expected by 2020 includes:
• Standard coupling interfaces and software tools
• Mesh-generation tools, automatic and adaptive meshing, highly parallel:
§ from meshes of ca. 100 million tetras, 16k cores, one second physical time
§ expected 2020: 10 billion tetras, 1.5 million cores, one second physical time (see the check after this list)
• Multi-physics, refined chemistry
• Billion-particle simulations
• New numerical methods, algorithms, solvers/libraries
• Uncertainty quantification
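A quick check (ours, not from the report) of the meshing targets above: the jump from 100 million tetrahedra on 16k cores to 10 billion on 1.5 million cores keeps the per-core load nearly constant, i.e. it is a weak-scaling target.

```python
# Per-core load implied by the two meshing targets quoted above.
for label, tetras, cores in [("today", 100e6, 16_000),
                             ("2020 ", 10e9, 1_500_000)]:
    print(f"{label}: {tetras / cores:,.0f} tetras per core")
# ~6,250 vs ~6,667 tetras per core: the work per core stays flat, so the
# challenge is scaling the mesh generator out, not making each core do more.
```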
Finally, we reference again the three prospective industry roadmaps shown earlier in this section. These capture, in turn, the expectations of the aeronautics industry (Figure 6.1), the evolution of seismic depth-imaging methods (Figure 6.2) and a possible roadmap for Eflop/s neutronics computation.
7 REQUIREMENTS FOR THE EFFECTIVE EXPLOITATION OF HPC BY SCIENCE AND INDUSTRY
7.1 Introduction
All of the panels contributing to this report are convinced that the competitiveness of European science and industry will be jeopardised if sufficiently capable computers are not made available, together with the associated infrastructure necessary to maximise their exploitation. In reviewing the scientific impact and societal benefits in the preceding sections of this report, the panels have identified multiple areas at risk, concluding that access to high-performance computers in the exascale range is of the utmost importance.
Such resources are likely to remain extremely expensive and require significant expertise to procure, deploy and utilise efficiently; some fields even require research for specialised and optimised hardware. The panel stresses that these resources should continue to be reserved for the most exigent computational tasks of high potential value. It is clear that the computational resource pyramid must remain persistent and compelling at all levels, including national centres, access and data grids. The active involvement of the European Community along with appropriate Member States remains critical in establishing a world-‐leading supercomputer infrastructure in the European ecosystem. Europe must foster excellence and cooperation in order to gain the full benefits of exascale computing for science, engineering and industry in the European Research Area.
In pointing to the compelling need for a continued European commitment to exploit leadership class computers, the panels have considered the infrastructure requirements that must underpin this commitment, and present their considerations below as part of the review of computational needs. This considers both the vital components of the computational infrastructure, and the user support functions that must be provided to realise the full benefit of that infrastructure. This review has led to a set of key recommendations deemed vital in shaping the future provision of resources, recommendations that are justified below and presented as sidebars in the following text, and captured as part of the Executive Summary to this report.
7.2 An Effective and Persistent Infrastructure
The resources required to support computational science through a number of large and often diverse computational projects span a hierarchy of levels – desktop, departmental or laboratory level machines, regional centres and supercomputer centres. These resources need to be organised in a hierarchical multi-tier pyramid and connected by adequate high-speed links and protocols. Furthermore, there are several complementary functions that must be provided by a computational infrastructure if it is to prove both effective and persistent (see Table 7.1).
Usually, 'capacity computing' is deployed against tasks (b) and (c), while 'capability computing' provides the only solution for task (d). In this Scientific Case, we advocate that this essential component of capability computing should be delivered through shared European services that complement national facilities. This will add value at all levels, in particular by making Europe more competitive in the innovative work permitted by type (d) tasks.
PRACE – The Scientific Case for HPC in Europe Requirements – The Effective Exploitation of HPC
134
Recommendation: Need for HPC Infrastructure at the Europe Level
The scientific progress that has been achieved using HPC since the ‘Scientific Case for Advanced Computing in Europe’ was published in 2007, the growing range of disciplines that now depend on HPC, and the technical challenges of exascale architectures make a compelling case for continued investment in HPC at the European level. Europe should continue to provide a world-‐leading HPC infrastructure to scientists in academia and industry, for research that cannot be done any other way, through peer review based solely on excellence.
We also show that the infrastructure needs to embrace a pyramid of resources to deliver effectively against all of the above, and we consider from a scientific perspective how this infrastructure might best be balanced.
In order to integrate the variety of resource levels, facilitate access for users and simplify the management of the extreme volumes of data required, an appropriate electronic data communication infrastructure is key. Typically referred to as a ‘Grid’, this infrastructure needs to be highly tuned for HPC usage, and connected to the various tiers of HPC facilities.
Table 7.1. Complementary functions of an effective and persistent infrastructure.
(a) The development and evolution of innovative application programs, models and methods (we return to this function in 7.5.1 below).
(b) Preparatory and post-‐processing work, permitting the design and validation of particular models; this may require both data preparation plus the analysis and exploitation of the data generated by the computations.
(c) Large-scale systematic studies, where each case requires true supercomputer power. This enables exploration of the parameter space of devices and phenomena, with the ability to deal with multiple combinations of parameter values, thereby enabling the investigation of the statistical behaviour of phenomena, i.e. uncertainty quantification.
(d) Extremely large, so-‐called 'hero' computations, where the sheer power of the entire computational resource is used to study more detailed models than previously possible. The objective may be scientific insight, where the model would include scientific aspects not previously understood, or an attempt to deal with more detailed data than usually feasible. In industry, it may be necessary to validate models extensively before they are used more routinely in design processes. Extremely large computations may also be required to deal with unexpected situations and incidents, in order to mitigate the consequences or rapidly prepare design changes.
(e) Efficient algorithms are an essential ingredient of any HPC project. As larger and larger problems are solved on larger and larger computers, it becomes increasingly important to select optimal, or near-optimal, algorithms and solvers. As most problems have superlinear computational complexity, simply relying on hardware advances to solve these larger problems is ultimately doomed (see the sketch below). Moreover, some of the more critical tasks are generic, in the sense that they are not tied to one particular application – or even one particular field – but will occur in most of the challenges listed in this report, e.g. the solution of (large and sparse) linear and non-linear systems, computation of the Fast Fourier Transform (FFT) and the integration of time-dependent differential equations.
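To make point (e) concrete, the following minimal sketch (ours; the baseline problem size of 10^9 is illustrative) computes how much larger a problem a 1,000-times-faster machine can solve under two cost models. The algorithm, not the hardware, determines almost everything.

```python
# How far does a 1000x hardware speedup stretch the solvable problem size N,
# for two cost models? Found by bisection on the growth factor g such that
# cost(g*N)/cost(N) equals the speedup. Baseline N = 1e9 is illustrative.
import math

def max_growth(speedup, cost, n=1e9):
    lo, hi = 1.0, speedup * 10
    for _ in range(100):                      # bisection: cost is monotone in g
        g = 0.5 * (lo + hi)
        if cost(g * n) / cost(n) < speedup:
            lo = g
        else:
            hi = g
    return 0.5 * (lo + hi)

for name, cost in [("O(N log N), e.g. FFT ", lambda n: n * math.log(n)),
                   ("O(N^2), direct method", lambda n: n * n)]:
    print(f"{name}: 1000x hardware -> ~{max_growth(1000.0, cost):.0f}x larger N")
```

With an O(N^2) method the thousandfold speedup buys only a ~32x larger problem, while the O(N log N) alternative buys roughly 750x, which is why algorithmic improvement is the larger lever.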
There was widespread consensus among the panels that the development of the infrastructure, its operation and access mechanisms must be driven by the needs of science, industry and society to conduct world-‐leading research. PRACE should work more closely with its users, with the leadership and management involving both researchers and providers.
While this report targets specifically the infrastructure required to handle capability jobs, it also acknowledges the importance of the remainder of the pyramid.
Recommendation: Integrated Environment for Compute and Data
Most application areas foresee the need to run long jobs (for months or years) at sustained performances around 100 Pflop/s to generate core data sets, and very many shorter jobs (for hours or days) at lower performances for pre- and post-processing, model searches and uncertainty quantification. A major challenge is the end-to-end management of, and fast access to, large and diverse datasets, vertically through the infrastructure hierarchy. Most researchers seek more flexibility and control over operating modes than they have today, to meet the growing need for on-demand use with guaranteed turnaround times, for computational steering and to protect sensitive codes and data. Europe-level HPC infrastructure should attach equal importance to compute and data, provide an integrated environment across Tiers 0 and 1, and support efficient end-to-end data movement between all levels. Its operation must be increasingly responsive to user needs and data security issues.
Recommendation: Leadership and Management
The development of Europe's HPC infrastructure, its operation and access mechanisms must be driven by the needs of science and industry to conduct world-leading research. This public-sector investment must be a source of innovation at the leading edge of technology development, and this requires user-centric governance. Leadership and management of HPC infrastructure at the Europe level should be a partnership between users and providers.
This comprises the grid or network infrastructure, which can be based on state-of-the-art developments within existing projects. The national computational centres are important resources; our view is that the European dimension should treat them as an integral part of the European resource pyramid, whose apex should be an exceptional resource, at the exascale level of performance, permitting very large capability-class computations. Such an approach will position European capability resources at a level comparable to the best in the world, resources that to date are predominantly available in Japan and the USA.
The effect of a European collaboration to advance the apex of the resource pyramid amounts to positioning it competitively with respect to similar systems in other major countries, notably Japan and the USA, and the emerging HPC nations undertaking ambitious HPC programs, including India, Russia and China. The key driver is to promote scientific competitiveness; these systems should be targeted strategically at scientific challenges with the full support and agreement of the relevant scientific communities. In this report, we show that a wide spectrum of scientific challenges demand exascale resources best achieved at the European level. The justification of such an endeavour has been given on scientific grounds.
Most application areas foresee the need to run some long jobs (for months or years) at sustained performances around 100 Pflop/s, typically to generate core data sets, and very many shorter jobs (for hours or days) at lower performances for pre-‐ and post-‐processing, model searches and uncertainty quantification. This requires a small number of Eflop/s machines at Tier-‐0, integrated with a much larger number of multi-‐Pflop/s machines at Tier-‐1. The main impediments to realising these performances are the management of, and fast access to, multi-‐PByte datasets and the algorithm/software challenges of strong scaling to exploit Eflop/s architectures efficiently.
The computational materials science, chemistry and nanoscience community in Europe comprises more than 10,000 scientists – probably some tens of thousands – working in fields as diverse as nanoelectronics, steel, blood flow, poly-electrolytes and bio-compatible materials. Such an active and diverse community has applications in capability and capacity computing that are best served by a heterogeneous computational science infrastructure and flexible policies for PRACE access and project duration. Many applications require capacity computing. Examples include the investigation of the properties of quantum materials with strongly correlated electrons exhibiting exotic properties, multiscale simulations of complex fluids, soft and biomaterials
Recommendation: Thematic Centres
Organisational structure is needed to support large long-term research programmes, bringing together competences to share expertise. This could take the form of virtual or physical thematic centres, which might support community codes and data, operate dedicated facilities, focus on co-design, or play a cross-cutting role in the development and support of algorithms, software or tools. While some existing application areas have self-organised in this way, new areas such as medicine might achieve more rapid impact if encouraged to follow this path. Thematic centres should be established to support large long-term research programmes and cross-cutting technologies, to preserve and share expertise, to support training, and to maintain software and data.
and heterogeneous materials, or the complete simulation of a nanoelectronic device. Dealing with all elements of the periodic table, introducing myriads of atoms, scanning temperature, pressure and chemical-potential ranges, simulating non-equilibrium processes and including external stimuli together reveal a large phase space of opportunities, encouraging further progress through combinatorial materials optimisation towards a treasure map for technological applications. This type of science is not possible without powerful capacity-computing capabilities. On the other hand, capability computing should serve dynamical mean-field, Quantum Monte Carlo, molecular dynamics and order-N density functional theory software. In these circumstances, the importance and adoption of these methods would undoubtedly increase, with different computer platforms required depending on the algorithm.
Another consequence of this discussion is that it is highly unlikely that there will be a single design or architecture that best addresses the exascale requirements of all disciplines. Indeed, some application areas require intensive use of specific system architectures and/or particular modes of access and operation, such as on-demand access and guaranteed turnaround, data and code security, or access to massive data repositories and instruments, arguing for the introduction of dedicated, thematic facilities. Thus, the computational materials science, chemistry and nanoscience community is giving serious consideration to the provision of a special-purpose computer for long molecular dynamics runs. Vital problems in the field of life sciences will only be addressable through the development of novel architectures, not by huge machines with very large theoretical peak power but limited efficiency for the applications of interest. Such development is already at an advanced stage in the USA and Japan, and there is an extreme danger that Europe will be left behind.
What is clear is that most researchers seek more flexibility and control over operating modes than they have today, largely to manage data efficiently, but also to meet the growing need for on-demand use with guaranteed turnaround times. A minority would like support for computational steering and co-scheduling. Thus, in biomolecular simulation (see section 5.4.4), the handling of very large volumes of state data will require new techniques for data management, collaborative interactive visualisation and the computational steering of simulations. Further developments with a potentially high impact on computational engineering include the use of HPC systems for interactive computational steering that requires interactive behaviour and correspondingly fast response times for the simulation. Even beyond this are real-time and embedded simulations, and immersive virtual reality techniques (see section 6.1).
7.3 Computational Science Infrastructure in Europe
Important considerations in the provision of high-‐performance computing include the associated development infrastructure in place around the machines, plus the level of expertise required within the scientific community to ensure effective exploitation of the resources provided. We focus here on the human aspect of this infrastructure and what is needed to keep Europe as a leading area in the world.
Development of Adequate Models
The development of adequate models, and their evolution according to scientific progress.
Development of Mathematical Methods, Numerical and Statistical Methods
The development of, or improvements to, the hierarchy of mathematical methods, numerical and statistical methods, and other resolution techniques required to fully exploit the developed models. It is important to recognise that, while the continuing growth in computer power certainly has a major impact on computational science, by far the greater advances are due to algorithmic and method developments.
Associated Computer Codes
The development of the associated computer codes, together with associated algorithms and their efficient implementation on the available resources. Here code development is taken to include the following components:
• The whole process from a researcher (often a doctoral student or young postdoc) initiating an algorithm for a new type of simulation (for example) to its incorporation in a generally applicable form in widely disseminated codes
• Maintenance of codes that may contain a million lines of Fortran as new advances have to be incorporated from diverse directions
• Code portability and optimisation for new machines, particularly with novel architectures
• Interfacing with other codes
• Incorporating new computational developments such as GRID, middleware, sophisticated databases, metadata, visualisation and the use of different types of architecture for different purposes
• All types of code, from the large community codes, to a toolbox of simple basic codes for researchers to access as platforms for developing new directions.
Researcher Training and Support
The need for training is an inherent consequence of the rapid development of the field and the very sophisticated nature of much of the methodology, including numerous approximations, tricks and short-cuts to make the simulations feasible. In many areas, it is only in the simplest routine applications that one can use the code as a 'black box' without expert steering. Young researchers, once trained, and code users generally need continuing expert support and personal contact as research priorities change and codes evolve. We return to this area of support in section 7.5.
Access to Expertise
Access to expertise, code libraries and other information across this interdisciplinary field. Better code libraries, databases, input and output standardisation, etc., are needed, with the means of access to all sorts of information and personal expertise through websites, newsletters and email lists.
Figure 7.1. Key components of the computational science infrastructure.
We are concerned here specifically with the form of the network of expertise that is needed. More specific descriptions have been provided in the preceding thematic chapters, but from a very general perspective this infrastructure should provide for:
• The development of adequate models, and their evolution according to scientific progress
• The development of, or improvements to, the hierarchy of mathematical methods, numerical and statistical methods, and other resolution techniques required to exploit fully the developed models
• The development of the associated computer codes, together with associated algorithms and their efficient implementation on the available resources
• Researcher training and support. The need for training is an inherent consequence of the rapid development of the field and the very sophisticated nature of much of the methodology, including numerous approximations, tricks and short-‐cuts to make the simulations feasible. (We return to this area of support in section 7.6)
• Access to expertise, code libraries and other information across this interdisciplinary field
We expand on each of the above points in Figure 7.1. What is clear is that addressing these requirements requires significant planning and human investment. For example, the development of a large code may involve a collaborative team effort lasting some five years, culminating in a code that may be used for 10 to 20 years.
Therefore, a visible, long-‐term commitment of the European Community and of the research organisations is crucial. Such a commitment would convince the scientific community to commit their own expertise and resources; indeed, commitment to a European exascale-‐level supercomputing infrastructure would be a clear signal of intent, confirming to leading scientists that computational science is, indeed, perceived to be one of the major pillars of scientific progress (see section 7.4).
This argument also suggests that a European exascale-level infrastructure would increase the role and impact of the overall computational resource pyramid: beneficiaries would include national centres, application-code repositories, access and data grids, and so on. We have already shown that computational infrastructure is an enabler for scientific and technological development; a European leadership-class infrastructure will prove to be an enabler for many scientific and engineering programmes.
The organisational structure needed to implement the activities described above should operate at a European level, because one country is too small a unit for efficiency and effectiveness. We envisage that the cyber-infrastructure will have to be managed largely by the research community itself in each particular field, because the circumstances vary so widely across the sciences, though naturally with some help from major computer centres and/or European organisations such as CECAM, EMBL, etc. These can provide a permanent hub and a home with scientific and organisational support for a particular research community, as well as some technical help.
7.3.1 Panel Perspectives
Experience in the astrophysics, high-energy physics and plasma physics communities shows that new code development targeted at capability computing is very much an individual's initiative, and resources are initially scarce. This seems an unavoidable feature of frontier research. Conversion and optimisation of mature codes can be dealt with by a more project-oriented organisation involving a team. It should be clear that code development, as well as optimisation, is an integral part of the research process, with all the hurdles and perhaps dead ends characteristic of exploring the unknown.
The scientific challenges faced, for example in plasma physics, require dedicated effort over timescales measured in decades. Thus, the sustained availability of state-‐of-‐the-‐art computer
resources such as those provided by PRACE, as well as of adequate technical support to code developers and users, is essential to meet the challenges faced by the field.
Less demanding of the highest levels of HPC resources, the development of integrated modelling frameworks requires a strong community effort – probably the development of specific software to interface the codes – substantial optimisation work, and perhaps dedicated, although not necessarily field-‐specific, hardware. The fusion community recognises the crucial importance of HPC infrastructure for plasma simulations. A Pflop/s machine was recently acquired by IFERC (International Fusion Energy Research Center), under an EU-‐Japan bilateral agreement that accompanied approval of the ITER device. This machine provides both capability computing for grand challenges in fundamental plasma physics dynamics and capacity computing for demanding parametric studies of fusion devices. The fusion plasma simulation challenge and potential societal benefit are enormous. Expanding the HPC resources, and the available manpower with appropriate IT and algorithm development skills, could be an essential and worthwhile investment for Europe.
Many of the problems in the life sciences and medicine cannot be addressed with present-day simulation methodologies. This goes beyond the adaptation of existing software to new computational platforms and involves a general lack of scalability, as well as missing concepts of multiscale, multi-model interactions that are required to exploit exascale computing platforms efficiently. Such hurdles can best be overcome by nucleating communities of scientists, from life science research, bioinformatics and computer science, who will work together to address the problems and develop innovative solutions. Such communities can be fostered by programmes such as the E-science/E-infrastructure schemes implemented in the present ICT programme of FP7. A vigorous expansion of such activities is required in order to generate methods and implementations capable of correctly exploiting the new computational resources for life science applications. It must be acknowledged that life science research rewards applications, and rewards method development only in the context of successful applications. In order to generate a sustainable and effective set of codes for life science applications, it is important to nucleate and consolidate the scientific community at the European scale. The formation of broad communities targeting exascale method development would be a tremendous benefit for R&D efforts in Europe, because it would enable the transfer of such technologies to European end-users, generating a competitive advantage over other regions.
7.4 The Challenges of Exascale-Class Computing
HPC is currently undergoing a major change as the next generation of computing systems ('exascale systems'4) is being developed for 2020. These new systems pose numerous challenges, from a hundredfold reduction of energy consumption85 to the development of programming models for computers that host millions of computing elements. These challenges are common to all and cannot be met by mere extrapolation: they require radical innovation in many computing technologies. This offers opportunities for industrial and academic players in the EU to reposition themselves in the field. Europe has all the technical capabilities and human skills needed to tackle the exascale challenge, i.e. to develop native capabilities that cover the whole technology spectrum from processor architectures to applications86. Even though the EU is currently weak compared to the US in terms of HPC system vendors, there are particular strengths in applications, low-power computing, systems and integration that can be leveraged to engage successfully in this global race, getting the EU back on the world scene as a leading-edge technology supplier. Progress within Europe has to date been channelled through EESI – the European Exascale Software Initiative11 – an initiative co-funded by the European Commission.
85 In line with Europe's green economy targets, ec.europa.eu/europe2020/targets/eu-‐targets/index_en.htm; COM(2009) 111, Mobilising Information and Communication Technologies to facilitate the transition to an energy-‐efficient, low-‐carbon economy
86 http://www.prace-‐ri.eu/IMG/png/fecafedc.png
EESI's goal is to build a European vision and roadmap to address the challenge of the new generation of massively parallel systems that will provide Pflop/s performance in 2010 and Eflop/s performance in 2020. EESI is investigating the strengths and weaknesses of Europe in the overall international HPC landscape and competition. In identifying priority actions and the sources of competitiveness for Europe induced by the development of peta/exascale solutions and usages, EESI is investigating and proposing programmes in education and training for the next generation of computational scientists. The Initiative is also seeking to identify and stimulate opportunities for worldwide collaborations.
Figure 7.2. The work package structure of EESI11.

WP2: International networking (Europe, US and Asia). Acting as an interface between Europe, the US and Asia, WP2 communicates progress and opportunities to European software communities involved in scientific software development, and signals the needs and challenges faced by European scientific software developers at the global level. It is also expected to identify US, Asian and European cross-actions, providing coordination with the International Exascale Software Project (IESP12).

WP3 and WP4: Working groups charged with creating a common vision and deriving coherent roadmaps for each of eight specified topics, including the identification of sources of competitiveness for Europe and of needs for education and training. The working groups cover:
• Industrial and Engineering Applications
• Weather, Climatology and Earth Sciences
• Fundamental Sciences (including Physics and Chemistry)
• Life Science and Health
• Hardware Roadmaps and Links with Vendors
• Software Ecosystem
• Numerical Libraries, Solvers and Algorithms
• Scientific Software Engineering

WP3: Investigating the application drivers for peta- and exascale computing, seeking to identify the needs and expectations of scientific applications in the exascale time frame in terms of scientific challenges, levels of physics involved, coupling of codes, numerics, algorithms, programming models and languages, size of data sets, simulation steering, pre-/post-processing, and the expected level of performance on exascale-class resources.

WP4: Identifying the technology needed to enable exascale computing, taking up the application requirements and needs identified by WP3. This WP uses a cross-disciplinary approach to assess novel hardware and software technologies addressing the exascale challenge. Other areas of interest include highly scalable system software and program-tracing tools, fault tolerance on both the system and the application side, novel programming paradigms, and novel, highly scalable numerical algorithms.

WP5: Dissemination. This WP is dedicated to communication and dissemination actions at large.
The project is divided into the five work packages outlined in Figure 7.2. Leveraging the results from the EESI deliberations as part of the current exercise has been ensured through including many of the EESI project leads in the Scientific Case panel membership.
7.4.1 Addressing the Data Challenge
The management of data faces many challenges, as rapidly increasing computational power and similarly fast progress in a variety of sensor technologies create floods of data. Science is not alone in facing an explosion of data, with the consequence that a range of technological solutions is emerging as industry responds to sector-wide demand for storage devices at lower cost and lower power use.
Scientific data centres need to be able to exploit these new technologies. Just as the management of data poses a number of significant challenges, so exascale supercomputing faces a number of data challenges, perhaps none more so than that of the storage system, and particularly the software it entails. I/O capabilities in high-performance computing have typically lagged behind the computing capabilities of such systems, especially at the high end. If not addressed, these exascale storage issues promise to become even more intractable by the time the first such machines start to appear towards the end of the decade. By way of example, we provide below just two instances of the so-called 'data deluge' from the scientific panels central to this report: from life sciences and medicine, and from the climate-modelling community.
The benefits of the continuous development of more powerful computation systems are visible in many areas of life sciences. For example, at the beginning of 2000, the Human Genome Project87 was an international flagship project that took several months of CPU time using a 100 Gflop/s computer with 1 terabyte of secondary data storage. Today, genomic sequencing has changed from being a scientific milestone to a powerful tool for the treatment of diseases, in particular because it is able to deliver results in days, while the patients are still under treatment. The Beijing Genomics Institute is capable of sequencing more than 100 human genomes a week using Next Generation Sequencing instruments and a 100 Tflop/s computer that will migrate in the near future to a 1 Pflop/s capability.88 Today, genome sequencing technology is ineffective if the data analysis needs to be carried out on a grid or cloud-like distributed computing platform. First, such systems cannot achieve the necessary dataflow, of the order of 20 PBytes/year; second, research involving living patients requires both speed and high security, which are lacking in such environments. Lastly, ethical and confidentiality issues handicap distributing patient data across the cloud. In coming years, sequencing instrument vendors expect to decrease costs by one to two orders of magnitude, with the objective of sequencing a human genome for $1,000. This will make it possible to integrate genomic data into clinical trials (which typically involve thousands of human tests) and into the health systems of European countries. Drug development will become easier and faster, and it will have a dramatic impact on therapy. It is worth noting again here that Europe's pharmaceutical industry contributes significantly more to the region's GDP than is true of the pharmaceutical industries in the US and other nations. We should not forget, however, that all these possibilities can only develop if computer resources can deal with the complexity of the large interconnected data sets that serve the large community of life science. For example, the EBI (which hosts the major core bio-resources of Europe) almost doubled its storage from 6,000 TBytes (in 2009) to 11,000 TBytes (in 2010), and receives an average of 4.6 million requests per day (see Figure 1.1).
Genomics research faces problems (e.g. the sequencing of 2,500 genomes of cancer patients) involving the management of massive amounts of data in programs that can require hundreds of thousands of processors but little inter-processor communication. However, the vast amount of data to be managed (and often confidentiality and privacy aspects) hampers the use of cloud or grid-computing initiatives as a general solution. Suitable and flexible access to computer resources is crucial in this area. The genomics subpanel asserts that the currently known cornerstones of an exascale system (number of compute nodes, I/O and memory capacities) are clearly driving the focus solely towards reaching Eflop/s peak performance.
87 International Human Genome Sequencing Consortium, Nature, 2001.
88 http://www.genomics.cn/en/platform.php?id=248
For most of the genomics challenges, an Eflop/s computer even less 'balanced' than today's HPC systems would present a substantial barrier to efficient use. The genomics subpanel, and by extension the entire life sciences panel, wishes to stress its major concern that its exascale problems will be difficult to treat on the unbalanced architectures of anticipated Eflop/s computers.
Biological data is growing at an incredible rate and, with it, the computational needs of the field are increasing. The panel wishes to stress that, in this field, computing 'capability' does not simply translate into the number of flops that can be brought to bear on a single project. It requires instead computer systems that can solve biological problems on an appropriate timescale. This point, considered crucial by the panel, becomes very clear when considering studies that can have a direct impact on the health of living patients. The panel considers that flops should not be the only parameter defining HPC capabilities. This, in turn, requires defining exactly what 'exascale resources' means. Efficient data management and fast, flexible interaction with computer resources are, in many fields of life sciences, at least as important as theoretical peak power.
We must remember that biological data is expected to grow by a factor of 10,000 before the end of the present decade, surpassing Moore’s law (see, for example, the growth of storage in the EBI, Figure 1.1). Biological data is very heterogeneous, is difficult to organise and, in some cases, is subjected to ethical restrictions on its use. Efficient management of biological data to obtain relevant information will require optimised I/O capabilities, efficient structures, post-‐processing pipelines (quality and validation), multi-‐PByte data sharing systems and, in some cases, significant main memory requirements. The standard protocols for the access to HPC resources are not presently compatible with the needs of research in several areas of life sciences, especially concerning human health, where fast data processing has a real impact on patients under clinical treatment. With these technical hurdles in mind, the life sciences community is already preparing for the next bio-‐supercomputing challenges.
Advances in the technologies for data generation that both increase the output and decrease the cost will mean that, over the next decade, the quantity of data being produced will increase by at least a thousandfold and perhaps as much as a millionfold. Three challenges face HPC centres: data storage, data transportation and data confidentiality. Moreover, while the most popular genomics software is regularly reviewed and optimised for new systems (e.g. BLAST), a large part of the available genomics libraries has been built since the 1990s using scripting and high-level languages (e.g. Perl, Java or Python packages) that were not designed for efficiency. These codes still perform well for current data loads, but they may not be ready for the data challenges of the next decade.
Within the climate modelling community, the CMIP5 (Coupled Model Intercomparison Project Phase 5)89 archive is pushing the boundaries of data management, with an expected volume of around 10 PBytes. Rapidly increasing HPC performance will be reflected in increased data volumes, pushing towards a 1 EByte archive within a decade. Dealing with such volumes of data will require fundamental shifts in data management and analysis methodologies. The underlying technology drivers behind such shifts have been outlined in the EESI Working Group Report on Weather, Climate and solid Earth Sciences.90 Three of these are summarised below:
89 At a September 2008 meeting involving 20 climate modelling groups from around the world, the WCRP's Working Group on Coupled Modelling (WGCM), with input from the IGBP AIMES project, agreed to promote a new set of coordinated climate model experiments. These experiments comprise the fifth phase of the Coupled Model Intercomparison Project (CMIP5).
90 Working Group Report on Weather, Climate and solid Earth Sciences, ESI_D3.4_WG3.2-‐REPORT_R2.0.DOCX, CSA-‐2010-‐261513, 08/11/2011. We refer the reader to the Working Group report for details of the other technology drivers, and the associated R&D Strategies. The latter include (i) Taking the computation to the data, (ii) Grid vs cloud, (iii) Scientific Workflow Tools, (iv) Scalable data formats, (v) Search and Query tools, (vi) A range of Storage media, backup and curation tools, (vii) Optimised hardware deployment within the archive, (viii) Archive locations and the optimal location of data centres, and (ix) Options for collecting robust meta-‐data.
1. Increasing rate of data supply. The rapid increase of HPC centre productivity and parallel increases in sensor technologies can be expected to increase data flow by a factor of 1,000 in the coming decade, taking us from peta-‐ to exa-‐scale archives.
2. Power supply constraints. The rapidly falling cost of computation is expected to lead to a correspondingly rapid increase in data generation. Data volumes can be expected to increase by a factor of 100-1,000 over the next 10 years. Over the same period, the fall in energy usage per byte of stored data held on disk may only be a factor of 10, compared to a projected fall in procurement costs by a factor of 200. We are therefore likely to move from a regime in which procurement costs dominate to one in which power costs dominate (see the sketch after this list). This will greatly increase the cost of holding data on disk, demanding more power-aware data management strategies. Major commercial data centres are responding by placing major data archives near sources of cheap and sustainable cooling and power.
3. Heterogeneity of storage media. The growing complexity of storage systems will challenge management strategies. A storage centre might contain a selection of storage technologies: traditional disk, micro-‐servers, tape, solid state, WORM1. In addition, the disks will have multiple modes of operation: full speed, slow, idle, rest. A multi-‐state cache algorithm will be needed to optimise usage where more than two technologies are deployed and disks are in multiple states.
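The crossover claimed in point 2 follows directly from the quoted factors, as this minimal sketch (ours; the 90/10 starting cost split is an illustrative assumption) shows.

```python
# Disk-cost arithmetic for point 2 above: data volume, procurement cost per
# byte and power per byte evolve at different rates over the decade. The
# growth/fall factors are the ones quoted in the text; the starting 90/10
# cost split is an assumption, in arbitrary units.
GROWTH_DATA  = 1000.0   # data volume over the decade (upper estimate)
FALL_PROCURE = 200.0    # fall in procurement cost per stored byte
FALL_POWER   = 10.0     # fall in energy use per stored byte on disk

procure_now, power_now = 90.0, 10.0
procure_later = procure_now * GROWTH_DATA / FALL_PROCURE   # -> 450
power_later   = power_now   * GROWTH_DATA / FALL_POWER     # -> 1000

print(f"procurement cost: {procure_now:6.0f} -> {procure_later:6.0f}")
print(f"power cost:       {power_now:6.0f} -> {power_later:6.0f}")
# Power overtakes procurement even from a 10% starting share, which is why
# archives become power-dominated and need power-aware data management.
```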
7.4.2 Software Development and Tools for Exascale-Class Computers
Exascale HPC systems will be very different from today's HPC systems, and building, operating and using them will face severe technological challenges. There is wide agreement in the HPC community that these challenges cannot be dealt with at the hardware level alone. HPC middleware and application developers therefore have to address them too, the most important being scalability, resilience, energy and performance (see Table 7.2).
Table 7.2. Software challenges faced by the HPC middleware and application developers – findings of the EESI Software Ecosystem working group.
Challenges Faced by the HPC Middleware and the Application Developers
Scalability: The number of cores in an exascale system will be of the order of 10^8. Not only the applications and the algorithms but the whole software ecosystem will have to support this unprecedented level of parallelism.
Resilience: For statistical reasons, the mean time between critical systems failures will become shorter. The general expectation is that resiliency cannot rely only on hardware features. The whole software ecosystem has to be aware of the resilience issue.
Energy: Exascale systems will be based on low-‐power components (cores, memory, interconnect, etc.). Energy consumption will be crucial and needs to be dynamically managed through software control. In particular, developer tools have to adopt the notion of ‘energy optimisation’ in addition to the standard ‘performance optimisation’.
Performance: Achieving the necessary highest levels of performance on a complex, integrated exascale hardware and software stack, in the presence of dynamic system adaptation triggered by power and fault-tolerance events, will be very challenging. Performance-aware design of system, runtime and application software will be essential.
Recommendation: Algorithms, Software and Tools
Most applications targeting Tier-0 machines require some degree of rewriting to expose more parallelism, and many face severe strong-scaling challenges if they are to progress effectively to exascale, as their science goals demand. There is an ongoing need for support for software maintenance, tools to manage and optimise workflows across the infrastructure, and visualisation. Support for the development and maintenance of community code bases is recognised as enhancing research productivity and the take-up of HPC. There is an urgent need for algorithm and software development to be able to continue to exploit high-end architectures efficiently to meet the needs of science, industry and society.
Findings of the EESI Software Ecosystem group
For exascale computing, most of the HPC software components will have to be newly developed to address the exascale challenges of scalability, resilience, energy management, etc. Current HPC software does not address these challenges, so an evolutionary approach (developing current software further) will not be sufficient. A substantial investment in new HPC software development is necessary.
While the exascale hardware and the system software will be substantially different from today’s HPC systems, the applications – in particular the industrial applications – require a continuous path to exascale. The lifetime of HPC application codes is very long and the application developers (in research and in industry) are typically not ready to tune their applications to very specific ‘exotic’ new programming models.
The software ecosystem must be portable in order to support various exascale hardware architectures based on many-core designs. The different exascale system architectures should not be visible at the level of the programming model.
It is expected that most of the new exascale software ecosystem will be developed by the R&D community under open-‐source licences. This developer community should work closely – more than in the past – with the vendors, to adapt the new software ecosystem on the vendors’ exascale platforms. Co-‐design processes must be established, ensuring that international HPC hardware and system vendors collaborate with European R&D laboratories.
Considerable and adequate funding is essential for research and development in areas where Europe has technology leadership (e.g. programming models, performance tools, validation and correctness tools) to maintain and extend this leadership. The key players should form alliances and work more closely with the hardware vendors to define (de facto) standards.
The panel acknowledges the work undertaken by the EESI software ecosystem group91 which, in focusing on the HPC software between the hardware and the application (system software and development tools), addressed the aspects of European competitiveness, potential collaborations and the need for future investment (and funding). The panel further supports the findings of this report, summarised in Table 7.2; their conclusions are reinforced by all the scientific panels contributing to this report. The materials science, chemistry and nanoscience community is scientifically very diverse and deals with a large spectrum of simulation methodologies, realised in many different computer codes used by large communities – including those in life science, medicine, engineering and industry. The computer codes are very complex, and their lifespan is in general much longer than the lifespan of a given hardware architecture. These codes are typically developed by small expert groups that are part of their respective communities. The adaptation of existing software to new computing platforms is a major challenge that accompanies the advent of petascale computing – whether in the form of massively parallel computing platforms with tens of thousands of cores or through the use of accelerators (e.g. GPUs). This challenge will become more acute with the arrival of exascale computing.
91 Working Group Report on the Software Eco-system, CSA-2010-261513, EESI-D4.4-WG4.2-REPORT-R2.1.DOCX
PRACE – The Scientific Case for HPC in Europe Requirements – The Effective Exploitation of HPC
145
Recommendation: A Long Term Commitment to Europe-Level HPC Major experiments depend on HPC for analysis and interpretation of data, including simulation of models to try to match observation to theory, and support research programmes extending over 10–20 year time frames. Some applications require access to stable hardware and system software for 3–5 years. Data typically need to be accessed over long periods and require a persistent infrastructure. Investment in new software must realise benefits over at least 10 years, with the lifetime of major software packages being substantially longer. A commitment to Europe-‐level HPC infrastructure over several decades is required to provide researchers with a planning horizon of 10–20 years and a rolling 5-‐year specific technology upgrade roadmap.
manifests itself as a lack of scalability as well as missing concepts (e.g. in parallelising long-‐time simulations of particle trajectories), or multiscale and multi-‐model interactions that are required to use an exascale computing infrastructure at its best. It is worth noting that many national funding agencies across Europe recognise and reward applications but view method development in a very different light. Clearly, the development of an exascale infrastructure must overcome such obstacles and take into account the nature of the networking and organisational structure of the community at the European scale and nucleate, support and nurture the vigorous and grass-‐root-‐like communities of scientists. Such measures should be addressed within the European funding scheme FP7, with these communities working coherently to phase out computer codes that are not scalable to the new architectures, while developing new codes using in part software from previous instantiations. There is a clear requirement to establish simulation laboratories to provide training and workshops for a wider community on these codes, and in the context of the exascale applications, further develop such codes in response to community requirements.
All these groups together form the broad community targeting exascale method development in Europe with significant cross synergies. PRACE itself should have a small expert group advising the developers and assisting the deployment of the codes. A network of these people is essential for promoting computational science on an exascale infrastructure.
7.5 A Support Infrastructure for the European HPC Community
7.5.1 Long-term Continuity of Reliable HPC Provision
The central requirement for all fields is long-term continuity of reliable HPC provision and support. The typical time scale for any real development in astrophysics, particle physics or plasma physics is one or several decades. Each large experiment needs continuous theory support, starting from the early planning phase and ending only when the analysis is finished, often years after a high-energy physics experiment has been shut down or a satellite has run out of power. One must also recognise that the development and optimisation of application codes takes at least several years. Finally, the demands faced by theory, and thus the needs for HPC infrastructure, will increase steadily for many years to come.
Most major experimental and observational facilities depend on large-scale HPC for analysis and interpretation of data, including simulation of a range of models and/or parameter values to try to match observation to theory. These facilities often support research programmes extending over 10–20 year time frames, and the supporting HPC infrastructure needs to exist over a comparable period. Some applications (e.g. climate modelling) require bit-reproducibility over multi-year programmes, necessitating guaranteed access to stable hardware and system software for periods of 3–5 years. Data typically need to be accessed over long periods and require a persistent infrastructure. Investment in new software must realise benefits over at least a 10-year period, with the lifetime of major software packages being substantially longer. A commitment to Europe-level HPC infrastructure over several decades is required to provide researchers with a planning horizon of 10–20 years and a rolling 5-year technology upgrade roadmap. However, the present PRACE funding scheme, based on 5-year periods, does not fit these facts. There is thus an urgent need to develop PRACE into a long-term institution that guarantees adequate and reliable HPC support in Europe, including for high-level research groups situated in countries that cannot by themselves meet those groups' computational needs.
Many communities – computational materials science, chemistry and nanoscience being a case in point – are best served if the small expert group of code developers behind a community simulation code is supported within a scheme of the European Commission's FP7 framework. Indeed, it is this set of expert groups that constitutes the community of exascale method developers. They may be members of small simulation laboratories providing education and service to the community that applies this software on the exascale infrastructure. PRACE should maintain a small expert group advising and assisting these code-developer groups and the simulation laboratories.
7.5.2 User Support
All across Europe, world-class research teams are using HPC resources to make new discoveries. The breadth of research applications is staggering, encompassing virtually all areas of the sciences, engineering and medicine, with growing applications in the social sciences and humanities. Although it is the lead researchers who assume the high-profile roles in the research process, a large supporting team must work behind the scenes on HPC-related activities.
An effective HPC facility is much more than hardware; the smooth operation of the facility requires a second tier of highly qualified personnel (HQP) to manage, operate and maintain the facility. It is equally important for these highly trained technical support staff to train and assist researchers in making the best use of this expensive infrastructure.
An investment in people for today and for the future is a critical component of this proposal. In many respects, computing infrastructure can be more readily acquired than human infrastructure. Given adequate funding, the upgrading of the capital equipment is straightforward: one can simply buy whatever is needed.
However, human infrastructure is much more challenging to obtain. It can take years to train people with the necessary skill sets, and then they can be easily enticed away from Europe by the lure of better opportunities coupled with higher salaries. If Europe is to invest in people and skills, then it must also invest in creating the environment to attract and retain them.
A variety of skilled personnel and support roles is therefore essential to the effective operation and maximum exploitation of any HPC facility. The skills and experience needed are extensive, including: (i) managing, operating and maintaining the facility; (ii) training and assisting researchers to make the best use of its resources and capabilities; (iii) ensuring maximal productivity of the HPC sites by, for example, checking that software is run on the most suitable computing platform and reworking code to achieve significant performance gains; and (iv) helping to create new applications in support of innovative research initiatives. This variety of skilled personnel is summarised in Figure 7.3.
Recommendation: People and Training
There is grave concern about HPC skills shortages across all research areas, and particularly in industry. The need is for people with both domain and computing expertise. The problems are both insufficient supply and low retention, because of poor career development opportunities for those supporting academic research. Europe's long-term competitiveness depends on people with the skills to exploit its HPC infrastructure. It must provide ongoing training programmes to keep pace with the rapid evolution of the science, methods and technologies, and must put in place more attractive career structures for software developers to retain their skills in universities and associated institutions.
Figure 7.3. The variety of skilled personnel required in the effective operation and maximum exploitation of any HPC facility.
System Administration and Operations
Systems administration and operations are primarily concerned with the day-‐to-‐day care of the HPC hardware and software infrastructure. The supporting personnel ensure the proper functioning of HPC facilities, providing systems management and operations support. Specific tasks include installing and maintaining operating system(s), performing updates and patches, managing file systems and backups, and ensuring the integrity and security of the user data. These activities are crucial to ensuring that the system is fully functional and available to the community.
Programmer / Analysts
The role of programmer/analysts is to provide specialised technical assistance to researchers, to conduct workshops and training, and to evaluate and implement software tools that make effective use of available resources. HPC hardware typically operates at a sustained rate well below the theoretical peak performance of the system, usually because of a lack of parallelism in parts of an application. A creative team of programmer/analysts can double that rate through code optimisation, algorithm re-design, enhanced cache utilisation and improved data locality; the added value can be huge, corresponding to twice the science delivered for the same hardware. These skills can thus dramatically increase the scientific productivity of the research community: by allowing researchers to run their applications faster, analysts enable them and their students to do better science.
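To make the data-locality point concrete, the sketch below (our illustration, not drawn from the report) shows the kind of improvement a programmer/analyst might identify: on a C-contiguous array, traversing memory in storage order is markedly faster than strided access, for purely cache-related reasons. It assumes Python with NumPy.

```python
import time
import numpy as np

a = np.random.rand(4096, 4096)  # C-contiguous: row elements are adjacent in memory

def sum_by_rows(mat):
    # Traverse in storage order: contiguous reads, good cache utilisation
    total = 0.0
    for i in range(mat.shape[0]):
        total += mat[i, :].sum()
    return total

def sum_by_cols(mat):
    # Strided traversal: each column gathers widely separated elements
    total = 0.0
    for j in range(mat.shape[1]):
        total += mat[:, j].sum()
    return total

for fn in (sum_by_rows, sum_by_cols):
    t0 = time.perf_counter()
    fn(a)
    print(f"{fn.__name__}: {time.perf_counter() - t0:.3f} s")
```

Both functions compute the same result; only the memory-access pattern differs, which is exactly the class of optimisation described above.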
Applications Programmers
Frontier science requires world-class software applications. While much of the development of new scientific functionality is traditionally carried out in a researcher's own laboratory, HPC applications programmers often make valuable contributions to this work by virtue of their own scientific, numerical or visualisation experience. The additional skills of the support staff enable ideas, concepts and advice to flow more easily within the scientist's subject domain. This support has the further benefit of greatly reducing what is normally a challenging start-up period for researchers learning to work with HPC. The same skill set is imparted to students and postdoctoral fellows, giving them both the scientific knowledge and the programming experience necessary to create new computational methods and applications in their various fields, eventually leading to dramatic new insights.
Data Management and Visualisation Personnel
The importance of versatile analysis and visualisation techniques for simulation work is self-‐evident, and both computation and visualisation activities are increasingly being driven by 'science pull' rather than 'technology push'. The most challenging aspect of data management and visualisation is coping with the massive datasets that are being produced. Simulations in climatology, bioinformatics and astrophysics, for example, regularly produce data sets that are hundreds of TBytes or even PBytes in size. Entirely new techniques and computing resources will be necessary to cope with them: in most cases, interactive visualisation is the only practical way to glean insights into these data sets. In addition, the effective exploitation of such volumes of data will require a major development effort in distributed computing across high-‐speed networks. This requires the training and the retention of personnel able to manage the data resources and to develop the new tools and techniques required to visualise them.
7.6 Education and Training of Researchers
The overall goal of HPC support is to proactively meet the needs of a wide variety of researchers by engaging them throughout their HPC lifecycle. In addition to maintaining the HPC facility, support staff must:
• Create awareness of the resources available and the potential for these resources to accelerate research productivity
• Provide usage instructions and courses (on topics such as parallelism, programming tools and performance analysis)
• Help users find the right match between their application and the available technologies
• Develop new tools or tune existing ones (hardware and software) to optimise the use of HPC resources and the applications running on them; this enables researchers to obtain more results in the same time, or the same results in less time, freeing up facility time for others
HPC support staff are essential for training Europe’s next generation of scientists and engineers in the use and improvement of HPC resources. Interactions of HPC support staff with graduate students and postdoctoral fellows will provide a fertile training ground to develop the next generation of researchers, giving the new scientists grounding in HPC as part of their disciplinary training. Much like today’s researchers use personal computers to support their work, the next generation of researchers will rely on and be able to take effective advantage of HPC resources, thus accelerating their research outputs. To do so effectively, researchers will need appropriate training.
PRACE has an extensive education and training effort for effective use of the RI, delivered through seasonal schools, workshops and scientific and industrial seminars throughout Europe. Seasonal schools target broad HPC audiences, whereas workshops focus on particular technologies, tools, disciplines or research areas. Education and training material and documents related to the RI are available on the PRACE website, as is the schedule of events.92 We also note the detailed section on the needs of education and training in the EESI Working Group report on Scientific Software Engineering.93
Below, we illustrate the specific training requirements identified by the panels through specific reference to the areas of weather, climatology and solid Earth sciences (WCES); astrophysics, high-‐energy physics and plasma physics; engineering sciences and industrial applications; and life sciences and medicine.
92 http://www.training.prace-ri.eu/
93 Where attention is focused on (i) the programming model, (ii) the runtime environment, (iii) debugging, (iv) validation and correctness, (v) performance tools, (vi) performance modelling and simulation, (vii) batch systems and resource managers, (viii) I/O and file systems, (ix) resilience, and (x) energy efficiency. The report stresses the absence of such courses in Europe (e.g. on storage, or on fault tolerance in HPC) and the need to develop these kinds of courses in Europe.
Table 7.3. Training perspectives from weather, climatology and solid Earth sciences, astrophysics, high-‐energy physics and plasma physics, and engineering sciences and industrial applications.
Weather, Climatology and solid Earth Sciences (WCES)
Training programmes will allow WCES scientists to improve their HPC background and to establish stronger links between the HPC community and their own domain. In this respect, funding specific actions to support training activities, summer/winter schools, intra-European fellowships and international incoming and outgoing fellowships will play a strategic role in preparing new scientists with a stronger, more interdisciplinary background. Given the expected increase in the complexity of the component models and of future exascale computing platforms, considerable resources should be devoted to the technical aspects of coupled climate modelling; the coupler development teams should be reinforced with computing-science experts who at the same time remain very close to the climate modelling scientists.
In oceanography and marine forecasting, support should be made available to train young interdisciplinary scientists so that they become specialists in both climate science and HPC, not just one of the two. Training should be provided via summer schools and international training networks (e.g. the International Training Network SLOOP (SheLf to deep Ocean mOdeling of Processes), recently submitted to FP7).
In the solid Earth sciences, the community is preparing itself for the extensive use of supercomputers through the ongoing (re-)organisation of some of its communities within large-scale EU projects, e.g. the Marie Curie Research Training Network SPICE (http://www.spice.rtn.org) and the EC projects NERIES (http://www.neries-eu.org/), NERA (http://www.nera-eu.org) and VERCE (http://www.verce.eu). Similar developments are occurring in geodynamics (e.g. TOPOEurope) and in geodynamo research.
Astrophysics, High-‐Energy Physics and Plasma Physics
The steady advance in computer performance usually demands deep code modifications, accompanied by steep learning curves in sophisticated coding techniques. The resources involved in the development of computer hardware are usually much larger than those devoted to scientific software development. Moreover, this problem is exacerbated by the decline in the number of students interested in pursuing a scientific career. One must avoid the situation in which expensive improvements in computer facilities are not followed by comparable scientific advances because of inadequate manpower to develop and exploit efficient codes. The time has come to stress the importance of a pan-European training and tenure-track programme in HPC.94
Engineering Sciences and Industrial Applications
The systematic training of code developers is vital, with the primary task being to educate developers of technical simulation codes in the design of hardware-‐aware algorithms and in the systematic analysis of the computational performance of their programs. Failure to address these requirements will be detrimental to the practical industrial use of exascale systems.
Industry obviously needs skilled personnel to build, maintain and program exascale hardware. Human resources are the key element in grasping the competitive edge and benefits of HPC, and education and training are mandatory for realising the potential of the technology.
Education and training may be viewed in two distinct ways. The first is classical specialisation on specific topics: university courses on themes such as hardware design, compiler technology, numerical algebra, etc. Courses at universities and research institutions are commonly organised in this way and are appropriate for analysis or in-depth activities.
The second way to view education and training is from a systems perspective: the capability to build up systems by joining components or knowledge from different areas. This holistic view is typically necessary when solving real-world problems such as those found in industry. The implementation and deployment of exascale computing have been described in terms of co-design and ecosystems; these terms indicate the need for personnel with skills for systems 'thinking and building'.
Classical education and training, with their focus on analysis and specific in-depth study, are certainly both needed and appropriate. We suggest that it is highly desirable to extend this education with the complementary aspects of system design and engineering.
94 M. Feldman, http://www.hpcwire.com/hpcwire/2012-‐01-‐03/wanted:_supercomputer_programmers.html
7.6.1 Life Sciences and Medicine
There is a clear asymmetry in education and training needs. Life scientists and clinicians must learn how to use best-of-breed in-silico science. Programmers, on the other hand, need to understand end users' needs in more detail. This requires significant efforts to develop methods that can address pressing life science problems with the help of exascale computing.
Figure 7.4. Training priorities in the life sciences and medicine.
Method development. Many present-day techniques are, for fundamental reasons, not scalable towards the envisaged exascale architectures. For many challenging problems, codes addressing various aspects of the problem must be combined in an interdependent set of simulations/calculations. Method development in this direction is still in its infancy and will probably become the most significant obstacle to the exploitation of exascale computing for life science applications.
Memory management. Memory can become a major bottleneck for life sciences exascale computing challenges. A better understanding of how to optimise this resource will reduce future costs in updating codes.
Integration of optimised libraries in scripts. Genomics and systems biology are fast-evolving disciplines, and the driving force for code development will be non-expert programmers. It is important that bio-programmers understand which (exascale-demanding) regions of the code should be developed, eventually by specialists, using efficient programming languages. They also need to understand how to link these libraries into high-level/scripting languages such as Java, Perl or Python.
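A minimal sketch of this pattern, under our own assumptions (Python with NumPy): the performance-critical kernel lives in a compiled, optimised library, and the high-level script merely orchestrates it. The shared-library name and C signature in the comments are hypothetical.

```python
import ctypes
import numpy as np

x = np.random.rand(1_000_000)
y = np.random.rand(1_000_000)

# Calling an optimised compiled library from a script: NumPy's dot()
# dispatches to a tuned BLAS kernel rather than a Python loop.
print(np.dot(x, y))

# The same linkage pattern works for a project-specific kernel compiled
# to a shared library (hypothetical 'libkernel.so' exporting
# double dot(const double *x, const double *y, int n)):
#   lib = ctypes.CDLL("./libkernel.so")
#   lib.dot.restype = ctypes.c_double
#   lib.dot.argtypes = [ctypes.POINTER(ctypes.c_double),
#                       ctypes.POINTER(ctypes.c_double), ctypes.c_int]
```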
Data storage. As data access (memory/disk) is much more costly than flops, training should focus on how to store information efficiently, including compression methods and database storage models.
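By way of illustration (not prescribed by the report), the following sketch stores output in chunked, compressed form using HDF5 via the h5py package, and reads back only the slice required; the file and dataset names are hypothetical.

```python
import numpy as np
import h5py  # assumes the h5py package and an HDF5 installation

data = np.random.rand(1000, 1000)  # stand-in for simulation output

with h5py.File("results.h5", "w") as f:
    # Chunked, gzip-compressed storage trades a little CPU for much less disk I/O
    f.create_dataset("trajectory", data=data, chunks=(100, 1000),
                     compression="gzip", compression_opts=4)

with h5py.File("results.h5", "r") as f:
    subset = f["trajectory"][:10, :]  # read back only the rows needed
    print(subset.shape)
```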
Data integration and analysis. Tools such as Hadoop and MapReduce can expedite searches through the large, irregular data sets that characterise some life sciences problems. These tools can be effective for retrieving and moving through huge volumes of complex data, but they do not allow researchers to take the next step and pose intelligent questions. A related issue is that these tools may be fine for working with a few TBytes of scientific data, but become cumbersome to use when data sets cross the 100-TByte threshold. Effective tools for scientific data integration and analysis on this scale are largely lacking today.
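The MapReduce pattern itself is simple, which is part of its appeal. The toy sketch below (ours, not the report's) counts token frequencies across partitioned records, using Python's multiprocessing in place of a Hadoop cluster; the record strings are invented for illustration.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool

def mapper(chunk):
    # Map step: emit a partial (token -> count) tally for one partition
    return Counter(token for record in chunk for token in record.split())

def reducer(left, right):
    # Reduce step: merge two partial tallies
    left.update(right)
    return left

if __name__ == "__main__":
    records = ["gene A expressed", "gene B expressed", "gene A silenced"] * 1000
    partitions = [records[i::4] for i in range(4)]  # split work across 4 workers
    with Pool(4) as pool:
        partials = pool.map(mapper, partitions)
    totals = reduce(reducer, partials)
    print(totals.most_common(3))
```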
Code parallelisation. The trend in the hardware industry is to increase flop rates by multiplying the number of cores; this has, however, produced unbalanced architectures and implies enormous efforts in code parallelisation. In the coming years, it will be important to train researchers in, at least, the standard parallel programming models (MPI, OpenMP).
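As a flavour of what such training covers, here is a minimal MPI example. We write it in Python via the mpi4py package (our choice; the report names only the programming models): each rank computes a partial sum over its slice of the problem, and a reduction combines the results.

```python
# Launch with, e.g.: mpirun -n 4 python partial_sums.py
from mpi4py import MPI  # assumes mpi4py and an MPI library are installed
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each rank sums its own strided slice of the global index range
n = 10_000_000
local_sum = np.arange(rank, n, size, dtype=np.float64).sum()

# Combine the partial results on rank 0
total = comm.reduce(local_sum, op=MPI.SUM, root=0)
if rank == 0:
    print(f"global sum over {size} ranks: {total}")
```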
Benchmarking support. Benchmarking is needed to evaluate hardware and software alternatives. Software tools such as BSC-Tools can identify inefficient blocks of code, or bottlenecks, in applications; such performance tools give developers powerful assistance in meeting exascale challenges. Another vital aspect of benchmarking is verifying the scientific quality of the results, by setting standard protocols for comparison and by developing meta-servers that can combine multiple approaches. The diversity of areas makes this issue computationally complex.
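A full performance-tool suite is beyond a sketch, but the habit such training instils can be shown in miniature (our illustration, assuming Python with NumPy): time candidate implementations under identical conditions, and verify that they agree before comparing them.

```python
import timeit
import numpy as np

def python_loop(x):
    # Naive reference implementation
    s = 0.0
    for v in x:
        s += v * v
    return s

def blas_dot(x):
    # Optimised alternative: dispatches to a compiled BLAS kernel
    return float(np.dot(x, x))

x = np.random.rand(100_000)
assert abs(python_loop(x) - blas_dot(x)) < 1e-6  # verify results agree first

for fn in (python_loop, blas_dot):
    t = timeit.timeit(lambda: fn(x), number=20) / 20  # mean of 20 runs
    print(f"{fn.__name__}: {t * 1e3:.2f} ms per call")
```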
Computational methods training. Overall, the groups involved in exascale challenges for life science will require expertise in code parallelisation, applied mathematics, mathematical modelling, statistics, biology, biochemistry, biophysics, data analysis, data visualisation and biological simulation. Therefore, we will need to focus on training in computational methods for those coming from a biological, as opposed to a physical sciences, background. In parallel, computational biologists need to learn how to design software and build friendly interfaces for experimentalists.
There is a critical need to train the computing life science community in the special demands of parallel computing (programming, performance optimisation, etc.), and to prepare them for using HPC in combination with systems and integrative biology. Unfortunately, there are very few places in Europe providing education in bioinformatics and computational biology. The panel identified the priorities in training shown in Figure 7.4.
7.7 Community Building and Centres of Competence
In the preceding section, we considered the generic class of HPC support staff required in developing, sustaining and supporting a petascale-‐class computing infrastructure. We consider below the broader requirements of community building and the software facilities crucial to the successful exploitation of leadership-‐class systems.
We illustrate these requirements through specific reference to the areas of life sciences, and materials science, chemistry and nanoscience.
The life science panel is eager to apply the model of the USA's co-design centres focused on exascale applications, such as the Centre for Exascale Simulation of Advanced Reactors (CESAR), the Co-Design for Exascale Research in Fusion (CERF), the Flash High-Energy Density Physics Co-Design Centre, and the Combustion Exascale Co-Design Centre.
A centre with academic and industrial participation, focused on life sciences and health, will be instrumental in facilitating the efficient use of PRACE Tier-0 resources in areas such as tissue and organ simulation, molecular dynamics, cell simulation, genome sequencing and personalised medicine. Considering the complex nature of the bio-computational field, only a powerful competence centre will guarantee compatibility between research needs in the area and the new generation of exascale computers.
European activity in materials science has been considerably strengthened by the establishment of CECAM (Centre Européen de Calcul Atomique et Moléculaire)95 as the focal organisation, with a nodal structure and funding links to national organisations. It provides an intersection for many computational disciplines, with activities ranging from the organisation of scientific workshops to graduate-level tutorials on the use of especially relevant software and the sponsorship of specialised masters-level courses in the computational sciences. In particular, it interlinks the first-principles community (Psi-k) and the molecular dynamics community.
First-principles simulations, based primarily on quantum-mechanical density functional theory (DFT), provide the most important simulation framework across physics, surface science, materials science, chemistry, nanoscience, computational biology, mineralogy, Earth science and engineering. Such simulations are capable of deriving properties – frequently with predictive capability – from the atomic level, with no input other than the atomic numbers of the constituent chemical elements. Also included are the areas of ab-initio molecular dynamics, time-dependent DFT, quantum many-body perturbation theory, and many-body approaches to electron correlation and strongly correlated electron systems.
Research is pursued worldwide by a large and diverse community. Since the calculations are computationally very demanding, progress in this field scales with the availability of a powerful supercomputing infrastructure for capability and capacity computing. This community has gained vital experience on HPC infrastructures by developing cutting-edge algorithms, and Europe is a recognised leader in the field. In 2011, more than 17,000 papers containing DFT calculations were published worldwide; as shown in Figure 7.5, Europe contributes more than one third of this output.
A large majority of the computer codes in use by this community originate from, and are being developed in, Europe (e.g. see http://www.psi-k.org/codes.shtml, or footnote 42). The European scientific value and strength in computational materials science was recently recognised by the European Science Foundation through the launch of a new Research Networking Programme in Advanced Concepts in ab-initio Simulations of Materials (Psi-k), continuing the highly successful work of the previous Psi-k programmes, in which more than 1,000 scientists are organised.
95 www.cecam.org
Soft matter
Europe is a stronghold of soft matter science worldwide. The importance of this field is, of course, also recognised in the USA, Japan and China, as demonstrated by the steady increase in the number of groups and researchers working in it. However, Europe has the longest tradition, the largest number of groups and many of the leading scientists. Soft matter science is characterised by a very fruitful interplay of experiment, theory and computer simulation and, as explained above, computational science is essential to the past, current and future progress of the field. The European soft matter community is integrated through a Network of Excellence (NoE).
Figure 7.5. Number of publications per year containing results of ab-initio calculations based on density functional theory, over a timespan of 20 years, originating from Europe, the USA and East Asia, as listed in the Web of Knowledge of the Institute for Scientific Information (ISI). The search criterion was that the topic contains the keywords 'ab-initio', 'first principles' or 'density functional'. In 2011, more than 17,000 publications were counted worldwide; Europe contributes more than one third of the total. The rapid growth in East Asia is due to China. Courtesy of the Institute for Scientific Information (ISI).
8 MEMBERSHIP OF INTERNATIONAL SCIENTIFIC PANEL
Country | Title and Name | Institution | E-Mail Address | Workshop

Weather, Climatology and Solid Earth Sciences
DE | Prof. Geerd-Rüdiger Hoffmann | Deutscher Wetterdienst | geerd-ruediger.hoffmann@dwd.de | WCES
DE | Prof. Dr Heiner Igel | LMU Munich | igel@geophysik.uni-muenchen.de | WCES
DE | Dr Joachim Biercamp | Deutsches Klimarechenzentrum (DKRZ) | biercamp@dkrz.de | WCES
DE | Dr Reinhard Budich | MPI-M, Hamburg | reinhard.budich@zmaw.de | WCES
ES | Prof. José María Baldasano | UPC / BSC | jose.baldasano@bsc.es | WCES
IT | Prof. Giovanni Aloisio† | IS-ENES & Univ. Salento | giovanni.aloisio@unisalento.it | WCES
IT | Dr Massimo Cocco | Istituto Nazionale di Geofisica e Vulcanologia | massimo.cocco@ingv.it | WCES
IT | Dr Alberto Michelini | Istituto Nazionale di Geofisica e Vulcanologia | alberto.michelini@ingv.it | WCES
FI | Dr Johan Silen | Finnish Meteorological Institute (FMI) | johan.silen@fmi.fi | WCES
FR | Dr Jean Claude Andre | CERFACS | jean-claude.andre@cerfacs.fr | WCES
FR | Dr Fabien Dubuffet | CNRS, Lyon | fabien.dubuffet@univ-lyon1.fr | WCES
FR | Dr Marie Alice Foujols | Institut Pierre-Simon Laplace (IPSL) | marie-alice.foujols@ipsl.jussieu.fr | WCES
FR | Dr Sylvie Joussaume | Institut Pierre-Simon Laplace (IPSL) | sylvie.joussaume@lsce.ipsl.fr | WCES
FR | Prof. Jean Pierre Vilotte | Institut de Physique du Globe | vilotte@ipgp.jussieu.fr | WCES
FR | Prof. Bernard Barnier | LEGI/Univ. Grenoble | bernard.barnier@legi.grenoble-inp.fr | WCES
FR | Dr Sophie Valcke | CERFACS | valcke@cerfacs.fr | WCES
SE | Dr Colin Jones | Swedish Meteorological & Hydrological Institute (SMHI) | colin.jones@smhi.se | WCES
UK | Dr Mike Ashworth | STFC Daresbury | mike.ashworth@stfc.ac.uk | WCES
UK | Dr Chris Gordon | Meteorological Office | chris.gordon@metoffice.gov.uk | WCES
UK | Prof. Bryan Lawrence | NCAS, Reading University | bryan.lawrence@stfc.ac.uk | WCES
UK | Dr Graham Riley | Manchester University | graham.riley@manchester.ac.uk | WCES
UK | Dr John Brodholt | University College London | j.brodholt@ucl.ac.uk | WCES
Astrophysics, High-Energy Physics and Plasma Physics
DE | Prof. Dr Wolfgang Hillebrandt | MPI für Astrophysik | wfh@mpa-garching.mpg.de | HEPPA
DE | Prof. Dr Andreas Schäfer† | Fakultät Physik, Universität Regensburg | andreas.schaefer@physik.uni-regensburg.de | HEPPA
ES | Prof. Jesús Marco | Universidad de Cantabria | marco@ifca.unican.es | HEPPA
ES | Prof. Gustavo Yepes | UAM Madrid | gustavo.yepes@uam.es | HEPPA
ES | Prof. José María Ibañez | Universitat de València | jose.m.ibanez@uv.es | HEPPA
ES | Dr Victor Tribaldos | CIEMAT | v.tribaldos@ciemat.es | HEPPA
FR | Dr Edouard Audit | CEA, SAP | eaudit@cea.fr | HEPPA
UK | Prof. Carlos Frenk | University of Durham | c.s.frenk@durham.ac.uk | HEPPA
UK | Prof. Richard Kenway | University of Edinburgh | r.d.kenway@ed.ac.uk | HEPPA
UK | Dr Colin Roach | UKAEA | colin.m.roach@ukaea.org.uk | HEPPA
Panel Mailing List
CH | Prof. Ben Moore | University of Zurich | moore@physik.uzh.ch | HEPPA
CH | Prof. Dr Romain Teyssier | University of Zurich | teyssier@physik.uzh.ch | HEPPA
DE | Prof. Dr Gernot Münster | Westfälische Wilhelms-Universität Münster | munsteg@uni-muenster.de | HEPPA
FR | Prof. Maurizio Ottaviani | CEA Cadarache | maurizio.ottaviani@cea.fr | HEPPA
FR | Prof. Laurent Lellouch | CNRS Luminy | lellouch@cpt.univ-mrs.fr | HEPPA
PT | Ricardo Fonseca | DCTI – ISCTE, Lisbon | ricardo.fonseca@ist.utl.pt | HEPPA
Life Sciences and Medicine
DE | Dr Wolfgang Wenzel | Institut für Nanotechnologie | wenzel@int.fzk.de | LIFE
ES | Prof. Modesto Orozco† | UB / BSC | modesto.orozco@bsc.es | LIFE
ES | Prof. Roderic Guigó | UPF | rguigo@imim.es | LIFE
FR | Dr Thomas Simonson | CNRS | thomas.simonson@polytechnique.fr | LIFE
FR | Dr Olivier Poch | CNRS | poch@igbmc.u-strasbg.fr | LIFE
UK | Dr Charles Laughton | University of Nottingham | charles.laughton@nottingham.ac.uk | LIFE
Panel Mailing List
CH | Dr Manuel Peitsch | Swiss Institute of Bioinformatics | manuel.peitsch@isb-sib.ch | LIFE
DE | Dr Paolo Carloni | German Research School for Simulation Sciences, Jülich | p.carloni@grs-sim.de | LIFE
DE | Prof. Helmut Grubmüller | Max Planck Institute for Biophysical Chemistry | hgrubmu@gwdg.de | LIFE
IT | Prof. Anna Tramontano | University of Rome 'La Sapienza' | anna.tramontano@uniroma1.it | LIFE
LU | Dr Reinhard Schneider | University of Luxembourg | reinhard.schneider@uni.lu | LIFE
SE | Prof. Erik Lindahl | Stockholm University | lindahl@cbr.su.se | LIFE
UK | Prof. Peter Coveney | University College London | p.v.coveney@ucl.ac.uk | LIFE
Materials Science, Chemistry and Nanoscience
BG | Prof. Georgi Vayssilov | University of Sofia | gnv@chem.uni-sofia.bg | CMSN
DE | Prof. Dr Gerhard Gompper | Forschungszentrum Jülich | g.gompper@fz-juelich.de | CMSN
DE | Prof. Dr Stefan Blügel† | Forschungszentrum Jülich GmbH | S.Bluegel@fz-juelich.de | CMSN
DE | Prof. Dr Wolfgang Wenzel | Karlsruhe Institute of Technology | wolfgang.wenzel@kit.de | CMSN
ES | Prof. Agustí Lledós | UAB | agusti.lledos@uab.es | CMSN
FI | Prof. Risto Nieminen | Helsinki University of Technology | Risto.Nieminen@hut.fi | CMSN
FR | Dr Thierry Deutsch | CEA-Grenoble | thierry.deutsch@cea.fr | CMSN
IE | Prof. Jim Greer | Tyndall National Institute, Cork | jim.greer@tyndall.ie | CMSN
IT | Prof. Giovanni Ciccotti | University of Rome 'La Sapienza' | giovanni.ciccotti@roma1.infn.it | CMSN
NO | Prof. Kenneth Ruud | University of Tromsø | ruud@chem.uit.no | CMSN
UK | Prof. Martyn Guest | Cardiff University | guestmf@cardiff.ac.uk | CMSN
UK | Prof. Mike Payne | Cambridge University | mcp1@cam.ac.uk | CMSN
Panel Mailing List
AT | Prof. Dr Christoph Dellago | University of Vienna | christoph.dellago@univie.ac.at | CMSN
CH | Prof. Alessandro Curioni | IBM Research – Zurich | cur@zurich.ibm.com | CMSN
DE | Prof. Dr Kurt Binder | Johannes Gutenberg-Universität Mainz | kurt.binder@uni-mainz.de | CMSN
ES | Prof. Manuel Yanez | Universidad Autonoma de Madrid | manuel.yanez@uam.es | CMSN
FI | Prof. Kai Nordlund | University of Helsinki | kai.nordlund@helsinki.fi | CMSN
FR | Prof. Gilles Zerah | CEA – CECAM | gilles.zerah@cea.fr | CMSN
FR | Prof. Philippe Sautet | CNRS and Ecole Normale Supérieure of Lyon | philippe.sautet@ens-lyon.fr | CMSN
RU | Prof. Alexei Khokhlov | Moscow State University | khokhlov@polly.phys.msu.ru | CMSN
SE | Dr Kersti Hermansson | Uppsala University | kersti.hermansson@kemi.uu.se | CMSN
UK | Prof. Steve Parker | Bath University | s.c.parker@bath.ac.uk | CMSN
UK | Prof. Jonathan Tennyson | University College London | j.tennyson@ucl.ac.uk | CMSN
UK | Dr Adrian Wander | STFC Daresbury Laboratory | adrian.wander@stfc.ac.uk | CMSN
Engineering Sciences and Industrial Applications
BE | Dr Koen Hillewaert | CENAERO | koen.hillewaert@cenaero.be | Eng
CZ | Prof. Zdenek Dostal | VŠB-Technical University of Ostrava | zdenek.dostal@vsb.cz | Eng
DE | Prof. Dr Uli Rüde | University Erlangen-Nuremberg | ruede@cs.fau.de | Eng
DE | Prof. Dr-Ing Wolfgang Schröder | RWTH Aachen, Lehrstuhl für Strömungslehre und Aerodynamisches Institut | office@aia.rwth-aachen.de | Eng
ES | Prof. Javier Jimenez Sendín | UPM | jimenez@torroja.dmt.upm.es | Eng
FR | Dr Stephane Requena | GENCI | stephane.requena@genci.fr | Eng
FR | Dr Denis Veynante | CNRS | denis.veynante@em2c.ecp.fr | Eng
FR | Dr Philippe Ricoux† | TOTAL | philippe.ricoux@total.com | Eng
SE | Dr Philipp Schlatter | KTH Mechanics | pschlatt@mech.kth.se | Eng
UK | Prof. Neil D Sandham | School of Engineering Sciences (Aero), University of Southampton | n.sandham@soton.ac.uk | Eng
UK | Prof. Stewart Cant | University of Cambridge | rsc10@cam.ac.uk | Eng
UK | Dr David Emerson | STFC Daresbury Laboratory | david.emerson@stfc.ac.uk | Eng
NL | Dr Roel Verstappen | University of Groningen | verstappen@math.rug.nl | Eng
Panel Mailing List
DE | Dr H. Pitsch | RWTH Aachen | h.pitsch@itv.rwth-aachen.de | Eng
DE | Dr Norbert Kroll | DLR Institute of Aerodynamics & Flow Technology | norbert.kroll@dlr.de | Eng
FR | Dr Jean Yves Berthou | French National Research Agency | jean-yves.berthou@agencerecherche.fr | Eng
FR | Dr Henri Calandra | TOTAL | henri.calandra@total.com | Eng
FR | Prof. Olivier Pironneau | University of Paris VI (Pierre et Marie Curie) | pironneau@ann.jussieu.fr | Eng
FR | Dr Thierry Poinsot | CERFACS | thierry.poinsot@cerfacs.fr | Eng
UK | Prof. Demetrios Papageorgiou | Imperial College | d.papageorgiou@imperial.ac.uk | Eng
UK | Dr C. Y. Wu | Birmingham University | c.y.wu@bham.ac.uk | Eng

† Moderator

Editorial Group
Country | Title and Name | Institution and Role | E-Mail Address
DE | Prof. Dr Dr Thomas Lippert | Forschungszentrum Jülich GmbH, Central Institute for Applied Mathematics; Chairman, PRACE-1IP | Th.Lippert@fz-juelich.de
UK | Prof. Richard Kenway | University of Edinburgh; Chairman, Scientific Steering Committee | r.d.kenway@ed.ac.uk
UK | Prof. Martyn Guest | University of Cardiff; Lead Editor | guestmf@cardiff.ac.uk
PT | Dr Maria Ramalho | PRACE; Acting Managing Director of PRACE aisbl | M.Ramalho@staff.prace-ri.eu
IE | Dr Turlough Downes | Dublin City University; Chairman, User Forum Programme Committee | turlough.downes@dcu.ie

PRACE Project
IT | Dr Giovanni Erbacci | CINECA; WP4-1IP | g.erbacci@cineca.it