Supercomputing Programme
A seven-year programme to enhance the computational and numerical prediction capabilities of the Bureau’s forecast and warning services.
Tim Pugh
Supercomputer Programme Director
Australian Bureau of Meteorology Tuesday, December 13, 2016
• National/Global observing system: atmosphere, marine, water, land, space
• 24/7 Operational forecasting systems for weather, climate, oceans and flooding
• Supercomputing and massive data storage
• High uptime internet communications and disaster recovery
• Professional forecasting capability across multiple disciplines
• Experts outposted to the Australian Defence Force, State Emergency Centres and Aviation Operations Centres
Reliable, resilient, national capability
New funding announced by the Australian Government in May 2014
Seven-year programme from July 2014 to July 2021
• Funding for the supercomputer system, supporting data processing and storage systems, data centre and networks, and the Numerical Prediction Project (transitions to operations)
Programme Investment Areas across People, Processes, Science and Technology
» Benefit Planning and Realisation (Supercomputer and Services Board)
  – Investments, priorities, delivery and schedules, socio-economic value, return on investment
» Infrastructure (Information Systems and Services)
  – Data centre, networks, HPC and data-intensive computing, software services, suite and job scheduling, UM modelling infrastructure, system and application monitoring
» Delivery (Science to Services)
  – Scientific Computing Service, model build team, numerical prediction, guidance post-processing, model data services, software lifecycles, verification frameworks, software services
» Scalability (Research and Development)
  – Future architectures, growth in compute and data, software engineering, skills
Forecast Production Value Chain
(Continuous improvement through research and verification.)
• More accurate - particularly for the location, timing and direction of rainfall, storms and wind changes
• More up-to-date - more frequent forecasts available
• More valuable - for decision makers, by quantifying forecast outcome probabilities using ensembles
• More responsive - through capability to produce additional, on-demand, detailed forecasts for multiple extreme weather and hazard events across Australia
Investments and Outcomes
[Figure: services span weather, climate variability and climate change, across lead times from minutes, hours and days through weeks, months and seasons to years, decades and centuries. Products range from alerts, watches, warnings and forecasts to outlooks, predictions, guidance and scenarios, supporting emergency response, sectoral preparedness planning, strategic planning and international policy negotiation. Forecast uncertainty increases with lead time.]
Environmental Modelling in the Bureau
Australis HPC system: numerical prediction for weather, climate, marine, hydrology and space weather
Supercomputer details
• Cray Inc. will supply the new supercomputer
• $59 million (USD) has been allocated for the project
Numerical Weather Prediction Roadmap
Projection of nominal modelling resolutions for future computing systems, illustrated with model topography of Sydney, NSW (research at 1.5 km topography).

2013 systems:
• 40 km global model: 2x daily 10-day & 3-day forecasts
• 12 km regional model: 4x daily 3-day forecasts
• 4 km city/state model: 4x daily 36-hour forecasts

2020 systems:
• 12 km global model: 2x daily 10-day & 3-day forecasts
• 4.5 km regional model: 8x daily 3-day forecasts
• 1.5 km city/state model: 24x daily 18-hour or 36-hour forecasts

Increasing model resolution for improved local information; future model ensembles for likelihood of significant weather.
Modelling Outcomes to Achieve
| Capability | 2014 HPC system | New HPC systems (2016 to 2021) |
| --- | --- | --- |
| Model grid resolution (horizontal only) | ACCESS-G (global): 40 km; ACCESS-R (regional): 12 km; ACCESS-C (city): 4 km | ACCESS-G: 25 km > 10 km; ACCESS-R: 12 km > 4.5 km; ACCESS-C: 1.5 km |
| Regular forecast updates (times per day) | Global: 4; Regional: 4; City: 4 | Global: 4; Regional: 8; City and on-demand: up to 24 |
| Tropical cyclone forecasts (horizontal grid resolution; forecast length) | 12 km; out to 3 days | 12 km > 4.5 km; out to 5 days; up to 3 concurrent events |
| Ensemble forecasts (certainty for decision makers) | None | Yes (global, city, TC, relocatable) |
| Capability to produce additional, on-demand, high-resolution forecasts for extreme weather | None | 1.5 km; up to 4 concurrent events; up to 24 times per day |
What is the Decoupler Strategy?
Products Generation (the decoupler):
• Best gridded data
• Standard methods
• Common data services
• API management

HPC Apps:
• 1-2 updates per annum
• Grid enhancements
• Modelling enhancements
• Initial state enhancements

Service Apps:
• Agile application development
• Product consistency (5-10 yrs)
• Data access consistency
• Fit-for-purpose quality improvements over time

A key aim is to break the coupling between numerical prediction models and customer-specific forecast products. The products-generation layer acts as an interface between them, absorbing requirements from both sides to ensure that a change to one does not affect the other.
What is the Best Gridded Data?
Data processing levels define how much processing has been applied to model output; Level 3 (best data) is used by default. Quality increases up the levels, moving from output strongly coupled to the producing model towards weakly coupled or uncoupled data products.
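As a sketch of how a consumer might select a processing level, with Level 3 best data as the default (the names for levels 0-2 below are assumptions for illustration, not a published specification):

```python
# Illustrative only: processing levels with Level 3 ("best data") as default.
# Names for levels 0-2 are assumed; the slide defines only Level 3 explicitly.
from enum import IntEnum

class ProcessingLevel(IntEnum):
    RAW = 0          # strongly coupled to the producing model
    CALIBRATED = 1   # assumed intermediate level
    DERIVED = 2      # assumed intermediate level
    BEST_DATA = 3    # weak/no coupling; used by default

def request_grid(variable: str,
                 level: ProcessingLevel = ProcessingLevel.BEST_DATA) -> str:
    return f"{variable} @ level {level.value} ({level.name})"

print(request_grid("precipitation_rate"))  # defaults to Level 3 best data
```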
Operational suite infrastructure:
• Aurora midrange system: CS400 North and CS400 South clusters with GPFS storage, running data-intensive workloads under the PBSpro production scheduler.
• Australis supercomputer: XC40 East and XC40 West clusters with Lustre storage, running compute-intensive workloads.
• Terra: XC40 development system with Lustre storage and a VM development cloud, under separate PBSpro development and staging schedulers.
• An IT Operations dashboard spans the development, staging and production environments.
Achieving Automation in Modelling
New approaches and improving standards in software development.

• SCS-Workflow runs in three environments: Dev on Terra, and Stage and Prod on Australis, spanning suite schedulers, computational platforms and software services.
• Code is managed in Git (scs-repos-dev) with feature branches merging to a dev branch and then to the master branch; built binaries are held in Artifactory, and production deploys come from master.
• Development practice: user-space development, some automated testing, automated deployments, service account model.
• Staging and production practice: automated testing, versioned deployments, "one-step" installation, service account model.
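A minimal sketch of what "one-step", versioned installation can look like: stage a tagged release, then switch a current symlink atomically so rollback is a single re-point. The paths and release tag are invented for illustration; this is not the Bureau's actual tooling.

```python
# Illustrative "one-step" versioned deploy: stage the release directory, then
# atomically repoint a `current` symlink so rollback is a single re-point.
# Paths and tags are assumptions, not the operational deployment scripts.
import os

RELEASES = "/opt/suites/releases"

def deploy(version: str) -> None:
    target = os.path.join(RELEASES, version)
    os.makedirs(target, exist_ok=True)   # (1) binaries for `version` land here
    link = os.path.join(RELEASES, "current")
    tmp = link + ".new"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)              # (2) build the new pointer
    os.replace(tmp, link)                # (3) atomic switch on POSIX systems

deploy("v2.4.1")  # hypothetical release tag
```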
DevOps to Production, Simulation to Services
[Diagram: the same Aurora (CS400/GPFS), Australis (XC40/Lustre) and Terra platforms as above, with PBSpro schedulers across development, staging and production. Simulation products are copied out from production to dedicated service clouds, such as an Emergency Services cloud, an Aviation Services cloud and other service clouds.]
BoM Production & Staging Platforms (Australis)
38x performance for 8x electrical power.

| Parameter | 2015 Australis (delivered) | 2018 Addition (projected) | 2018 Australis (projected) | Ngamai HPC System (retired Oct ’16) | Increase relative to Ngamai |
| --- | --- | --- | --- | --- | --- |
| Processor | Intel Xeon Haswell, 12-core, 2.6 GHz | Intel Xeon Skylake | Intel Xeon Haswell + Skylake | Intel Xeon Sandy Bridge, 6-core, 2.5 GHz | |
| Nodes | 2,160 | 1,952 | 4,112 | 576 | 2015: 3.8x; 2018: 7.1x |
| Cores | 51,840 | 78,080 | 129,920 | 6,912 | 2015: 7.5x; 2018: 18.8x |
| Aggregate memory | 276 TB | 375 TB | 651 TB | 36.9 TB | 2015: 7.5x; 2018: 17.7x |
| Usable storage | 4,320 TB | 4,320 TB | 8,640 TB | 214 TB | 2015: 20.2x; 2018: 40.4x |
| Storage bandwidth | 135 GB/s | 171 GB/s | 306 GB/s | 16 GB/s | 2015: 8.4x; 2018: 19.1x |
| Sustained system performance (SSP) | 253 | 365 | 618 | 16 | 2015: 15.6x; 2018: 38.1x |
| Typical power use | 865 kW | 783 kW | 1,648 kW | 200 kW | 2015: 4.3x; 2018: 8.2x |

(2015 system + 2018 addition = 2018 system.)
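The relative-increase column follows directly from the raw figures; for example, checking the core counts:

```python
# Cross-check of the "increase relative to Ngamai" column using core counts.
ngamai, australis_2015, australis_2018 = 6_912, 51_840, 129_920
print(round(australis_2015 / ngamai, 1))  # 7.5  -> "2015: 7.5x"
print(round(australis_2018 / ngamai, 1))  # 18.8 -> "2018: 18.8x"
```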
Computational capacity and performance of Aurora
Specification: 2 clusters, each with:
Number of Nodes: 20 (16 Compute, 4 GPU Compute)
Service Nodes: 7 (1 NFS, 2 General Purpose, 1 Jump Server, 1 LNET Router, 2 Management Nodes)
Processors per Node: 2
Processor Type: Intel Xeon Broadwell E5-2695v4 18-core 2.1GHz 120W
Memory: 256 GiB (8x 32GB DDR4 2400MHz)
Internal Storage: 1x Intel DC P3608 Series 1.6TB
Network Port:
1x onboard Mellanox Connect-IB with Virtual Protocol Interconnect (VPI), providing 10/20/40/56 Gb/s InfiniBand through a single-port QSFP+
Accelerator Card: 1x NVIDIA K80 GPU with 24GB RAM (4 nodes)
BoM Midrange Production & Staging Platforms (Aurora)
Integrated CS-400 data processing system with two clusters of nodes
• Dual-socket nodes with Intel 18-core Broadwell, 256 GB DDR4, FDR InfiniBand interconnects (2 x 720 cores)
• Addition of 1.6 TB Intel NVMe flash on all compute nodes and a handful of NVIDIA K80 GPUs for visualisation and data processing
• GPFS data storage system based on a DDN GS14K system with 150 TB SSD and 2 PB HDD storage pools
CS-400 (Aurora) running data intensive workloads
• Single-node workloads: pre/post-processing
• Small-file workloads (GPP/OCF)
• Product generation: data quality verification
• Replacing components previously running on Ngamai and RTDS4 (midrange system)
• Will host new capability, for example:
  – Master Data Management
  – Operational Data Store
  – Data Management/Portal Services
  – GPGPU data processing and visualisation
  – Capacity to cope with the ACCESS NWP v3 modelling system
“TERRA”, a new Cray XC40 System
TERRA is one-sixth the size of the AUSTRALIS production system. It was delivered in May 2016, accepted on 30 June and commissioned on 1 September 2016.

Two-phase delivery:
• 2016 system: 117 teraflops, 144 nodes, 3,456 cores, 18.4 TB memory, 1,440 TB usable data storage, 45 GB/s I/O bandwidth
  – includes I/O accelerators: 48 TB of NVMe flash (DataWarp I/O) for computational research and workflow optimisation (reduced elapsed times)
  – includes compute accelerators: NVIDIA GPUs and Intel Xeon Phi for computational research and application optimisation
• 2018 upgrade: 473 teraflops, 321 nodes, 10,536 cores, 52.4 TB memory, 2,880 TB usable data storage, 90+ GB/s I/O bandwidth

Purpose:
• to support our Development to Operations (DevOps) methodology and the pathway from the NCI computing facility to the AUSTRALIS production system
• to facilitate porting, testing and preparation of scientific code in development for operations on AUSTRALIS.
BoM Development System (Terra)
For application porting and scientific computing development.

| Parameter | 2015 Dev System | 2018 Addition | 2018 Dev System | 2013 HPC System |
| --- | --- | --- | --- | --- |
| Processor | Intel Xeon Haswell, 12-core, 2.6 GHz | Intel Xeon Skylake, 20-core | Intel Xeon Haswell + Skylake | Intel Xeon Sandy Bridge, 6-core, 2.5 GHz |
| Nodes | 144 | 177 | 321 | 576 |
| Cores | 3,456 | 7,080 | 10,536 | 6,912 |
| Aggregate memory | 18.4 TB | 34.0 TB | 52.4 TB | 36.9 TB |
| Global filesystem technology | Cray/Seagate Sonexion Lustre 2.5.1+ | Cray/Seagate Sonexion Lustre | Cray/Seagate Sonexion Lustre | Oracle Lustre 1.8.8 |
| Usable storage | 1,440 TB | 1,440+ TB | 2,880+ TB | 214 TB |
| Storage bandwidth | 45 GB/s | 45+ GB/s | 90+ GB/s | 16 GB/s |
| Data storage acceleration | DataWarp I/O: 48 TB SSD, 33.6 GB/s bandwidth | NVDIMMs | DataWarp I/O: 48 TB SSD, 33.6 GB/s bandwidth | N/A |
| Compute interconnect | Cray Aries, 93-157 Gb/s | Cray Aries, 93-157 Gb/s | Cray Aries, 93-157 Gb/s | InfiniBand QDR, 40 Gb/s |
| Typical power use | 71 kW | 91 kW | 162 kW | 200 kW |
| Top500 Rmax (Linpack) | 110 TF | 362 TF | 473 TF | 104 TF |

(2015 system + 2018 addition = 2018 system.)
Computing Memory and Storage Trends (2016)
Current model:
• On node: CPU; memory (DRAM)
• Off node (external): parallel storage (HDD); archive storage (HDD & tape)

Future model:
• On node: CPU; near memory (HBM/HMC); far memory (DRAM/NVDIMM)
• Off node (internal/HSN): near storage (flash)
• Off node (external): parallel storage (HDD); archive storage (HDD & tape)

Sourced from Cray (2015).

• HBM = High Bandwidth Memory
• HMC = Hybrid Memory Cube, i.e. 3D-stacked DRAM
• MCDRAM = Multi-Channel DRAM, a 3D-stacked DRAM
• Flash = (solid-state) non-volatile data storage chip
• SSD = Solid State Disk (flash-based storage device)
• HDD = Hard Disk Drive (spinning disk)
NWP Model Data Production
Annual data volume (PB) for deterministic and ensemble production (Australis data production, not storage):

| System | Deterministic (PB/yr) | Ensemble (PB/yr) | Total (PB/yr) | Daily production |
| --- | --- | --- | --- | --- |
| APS1 | 1.6 | – | 1.6 | 4 TB |
| APS2 | 2.8 | 2.8 | 5.6 | 15 TB |
| APS3 | 23.0 | 52.8 | 75.8 | 208 TB |
| APS4 | 41.2 | 87.6 | 128.8 | 353 TB |
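The annual volumes are consistent with the daily production figures once scaled to a year; a quick check:

```python
# Daily production (TB/day) scaled to annual volume (PB/yr, 1 PB = 1000 TB).
daily_tb = {"APS1": 4, "APS2": 15, "APS3": 208, "APS4": 353}
for aps, tb in daily_tb.items():
    print(aps, round(tb * 365 / 1000, 1), "PB/yr")
# APS1 1.5, APS2 5.5, APS3 75.9, APS4 128.8 PB/yr -- matching the chart
# totals to within rounding of the daily figures.
```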
Application Computational Efficiency and Scalability
Objective:
A proposed collaboration to improve the performance of ACCESS climate and weather modelling on the next generation of HPC systems.

Proposed goals:
• To computationally meet the operational time windows and throughput needs of weather, climate, and earth system modelling.
• To utilise new computational architectures, programming models, and algorithms to improve model performance and scalability from hundreds to thousands of processor cores.
• To achieve the capacity computing and I/O storage throughput needed to support ensemble modelling systems for high-resolution weather and coupled climate modelling.
• To elevate the collaboration and contributions to the development of the Australian Community Climate and Earth System Simulator (ACCESS) and numerical weather prediction.

Collaboration partners include the Met Office.
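One standard way to quantify the "hundreds to thousands of cores" goal is strong-scaling parallel efficiency, E(N) = T(1) / (N x T(N)). The sketch below uses illustrative placeholder timings, not ACCESS benchmark results:

```python
# Strong-scaling efficiency E(N) = T1 / (N * TN).
# Timings are illustrative placeholders, not measured ACCESS results.
def efficiency(t1: float, tn: float, n: int) -> float:
    return t1 / (n * tn)

t1 = 3600.0  # hypothetical single-core runtime (seconds)
for n, tn in [(100, 40.0), (1000, 6.0)]:
    print(f"{n:>5} cores: {efficiency(t1, tn, n):.0%} efficient")
# 100 cores: 90% efficient; 1000 cores: 60% efficient
```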
Tsunami Events – Modelling Real-time Events with GPUs
• Currently based on pre-computed scenarios using the NOAA MOST tsunami model
• A runtime of more than 60 minutes made it impossible to run a real-time simulation during an event
• Performance improved in 6 weeks by two HPC programmers
• Initial results for a 24-hour simulation of tsunami wave propagation:

| Version | Hardware | Runtime |
| --- | --- | --- |
| Serial code | Intel Xeon Haswell | >3,600 sec (>60 min) |
| OpenMP, 24 cores | Intel Xeon Haswell | 262 sec (~4.4 min) |
| CUDA, 1 GPU | NVIDIA Tesla K80 | 134 sec (~2.3 min) |
| CUDA, 8 GPUs | NVIDIA Tesla K80 | 22 sec (~0.4 min) |

The parallel and GPU versions allow on-demand simulation of a tsunami event:
– More accurate forecasts of effects
– Ensemble modelling
– Better uncertainty estimation and improved risk maps
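For context, tsunami propagation models such as MOST solve shallow-water-type equations with stencil updates that map naturally onto OpenMP threads and GPUs. Below is a minimal NumPy sketch of one linear shallow-water time step, an illustration of the kernel pattern only, not the MOST code; swapping numpy for cupy would run the same arrays on an NVIDIA GPU.

```python
# Minimal linear shallow-water step: the stencil pattern behind tsunami
# propagation codes. Illustrative only -- not the NOAA MOST model.
import numpy as np

g = 9.81      # gravity (m/s^2)
dx = 1000.0   # grid spacing (m)
dt = 1.0      # time step (s), chosen to satisfy the CFL condition

def step(h, u, v, depth):
    """Advance sea-surface height h and velocities u, v by one time step."""
    # Height change from the divergence of the depth-weighted flow.
    h[1:-1, 1:-1] -= dt * depth[1:-1, 1:-1] * (
        (u[1:-1, 2:] - u[1:-1, :-2]) + (v[2:, 1:-1] - v[:-2, 1:-1])
    ) / (2 * dx)
    # Velocity change from the sea-surface height gradient.
    u[1:-1, 1:-1] -= dt * g * (h[1:-1, 2:] - h[1:-1, :-2]) / (2 * dx)
    v[1:-1, 1:-1] -= dt * g * (h[2:, 1:-1] - h[:-2, 1:-1]) / (2 * dx)
    return h, u, v

# Usage: a small initial bump propagating over 4,000 m deep ocean.
h = np.zeros((256, 256)); h[128, 128] = 1.0
u, v = np.zeros_like(h), np.zeros_like(h)
depth = np.full_like(h, 4000.0)
for _ in range(100):
    h, u, v = step(h, u, v, depth)
```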
Topics of Interest
• Improved data analytics and workflows
– New data storage technology
– Architectural configuration
– New software tools and pipelines
– Containers in HPC & data processing
• Meteorological Archival and Retrieval Systems
• Data services and data management
– Data access services
– Data integrity management
– Data and metadata management
– Data virtualisation and aggregation
• System Monitoring and Analytics to improve system robustness and resilience
• Machine Learning in observation and forecast sciences
– Radar image recognition and tracking
– Observation quality control
– Probabilistic forecasting and ensemble member design and assessment (quality)
– Nowcasting applications
– Predictive analytics for system faults
• Computational Sciences
– Software engineering for next generation applications
– New processor architectures (GPU, Xeon Phi, FPGA)
– Software tools and domain interpretations
– New algorithms in next generation numerical weather and climate prediction