PRACE PATC Course:
Intel MIC & GPU Programming Workshop
LRZ, 27 - 29 April 2015
Information
● Course site:
LRZ, Boltzmannstr. 1, 85748 Garching b. München, Kursraum II
● Tutorials:
Every day, interleaved with lectures
● Course material by LRZ, RRZE and Intel
● Workshop Webpage:
https://www.lrz.de/services/compute/courses/x_lecturenotes/MIC_GPU_Workshop/
http://goo.gl/xmbu5s
● WIFI: eduroam (https://www.lrz.de/services/netz/mobil/eduroam/)
● Interest in guided SuperMUC tour?
27/04/2015 Intel MIC & GPU Programming Workshop, LRZ 2015
LRZ in the HPC environment
Gauss@Stuttgart, Gauss@Jülich, Gauss@Garching
PRACE has 25 members, representing European Union Member
States and Associated Countries.
Hosting partners: D, F, I, SP (EUR 100 million per partner, 5 years)
Bavarian Contribution to National Infrastructure
German Contribution to European Infrastructure
PRACE PATC
● PRACE: Partnership for Advanced Computing in Europe:
http://www.prace-ri.eu
● PATC (PRACE Advanced Training Centre):
LRZ ∈ GCS is one of 6 European PATCs
https://events.prace-ri.eu
● Evaluation:
https://events.prace-ri.eu/event/375/evaluation/evaluate
Presenters
● Presenters:
Dr. Momme Allalen, LRZ
Dr. David Brayford, LRZ
Dr. Ferdinand Jamitzky, LRZ
Dr.-Ing. Michael Klemm, Intel
Dr.-Ing. Jan Treibig, RRZE
Dr. Volker Weinberg, LRZ
Tentative Schedule
Intel Xeon Phi @ LRZ and EU
Evaluating Accelerators at LRZ
Research at LRZ within PRACE & KONWIHR:
● CELL programming
2008-2009 Evaluation of CELL programming.
IBM announced the discontinuation of CELL in Nov. 2009.
● GPGPU programming
Regular GPGPU computing courses at LRZ since 2009.
Evaluation of GPGPU programming languages:
CAPS HMPP, PGI accelerator compiler (both → OpenACC)
CUDA, cuBLAS, cuFFT
PyCUDA/R
● RapidMind → ArBB (Intel) → discontinued
● Knights Ferry (2010) → Knights Corner → Intel Xeon Phi
IPCC (Intel Parallel Computing Centre)
● New Intel Parallel Computing Centre (IPCC) since July 2014:
Extreme Scaling on MIC/x86
● Chair of Scientific Computing at the Department of Informatics in
the Technische Universität München (TUM) & LRZ
● https://software.intel.com/de-de/ipcc#centers
● https://software.intel.com/de-de/articles/intel-parallel-computing-center-at-leibniz-supercomputing-centre-and-technische-universit-t
● Codes:
Simulation of Dynamic Ruptures and Seismic Motion in Complex
Domains: SeisSol
Numerical Simulation of Cosmological Structure Formation: GADGET
Molecular Dynamics Simulation for Chemical Engineering: ls1 mardyn
Data Mining in High Dimensional Domains Using Sparse Grids: SG++
PRACE: Best Practice Guides
● http://www.prace-ri.eu/best-practice-guides/
● Best Practice Guide – Hydra, March 2013 PDF HTML
● Best Practice Guide – JUROPA, March 2013 PDF HTML
● Best Practice Guide – Anselm, June 2013 PDF HTML
● Best Practice Guide – Curie, November 2013 PDF HTML
● Best Practice Guide – Blue Gene/Q, January 2014 PDF HTML
● Best Practice Guide – Intel Xeon Phi, February 2014 PDF HTML
● Best Practice Guide - JUGENE, June 2012 PDF HTML
● Best Practice Guide - Cray XE-XC, December 2013 PDF HTML
● Best Practice Guide - IBM Power, June 2012 PDF HTML
● Best Practice Guide - IBM Power 775, November 2013 PDF HTML
● Best Practice Guide - Chimera, April 2013 PDF HTML
● Best Practice Guide - GPGPU, May 2013 PDF HTML
● Best Practice Guide - Jade, February 2013 PDF HTML
● Best Practice Guide - Stokes, February 2013 PDF HTML
● Best Practice Guide - SuperMUC, May 2013 PDF HTML
● Best Practice Guide - Generic x86, May 2013 PDF HTML
Intel MIC within PRACE: Best Practice Guide
● Best Practice Guide – Intel Xeon Phi
Created within PRACE-3IP.
Written in Docbook XML.
Michaela Barth (KTH Sweden), Mikko Byckling (CSC Finland),
Nevena Ilieva (NCSA Bulgaria), Sami Saarinen (CSC Finland),
Michael Schliephake (KTH Sweden), Volker Weinberg (LRZ, Editor).
http://www.prace-ri.eu/Best-Practice-Guide-Intel-Xeon-Phi-HTML
http://www.prace-ri.eu/IMG/pdf/Best-Practice-Guide-Intel-Xeon-Phi.pdf
Intel MIC within PRACE: Preparatory Access
● Applications Enabling for Capability Science
27 enabling projects from 17 PRACE partners from 14 countries
Jul-Dec 2013
Computations on Eurora (EURopean many integrated cORe
Architecture) Prototype at CINECA, Italy with 64 Xeon Phi
coprocessors and 64 NVIDIA GPUs
X. Guo, Report on Application Enabling for Capability Science in
the MIC Architecture, PRACE Deliverable D7.1.3,
http://www.prace-ri.eu/IMG/pdf/d7.1.3_1ip.pdf
16 Whitepapers available online:
http://www.prace-project.eu/Evaluation-Intel-MIC
Intel MIC within PRACE: Preparatory Access
● Performance Analysis and Enabling of the RayBen Code for the Intel® MIC Architecture
● Enabling the UCD-SPH code on the Xeon Phi
● Xeon Phi Meets Astrophysical Fluid Dynamics
● Multi-Kepler GPU vs. Multi-Intel MIC for spin systems simulations
● Enabling Smeagol on Xeon Phi: Lessons Learned
● Code Optimization and Scaling of the Astrophysics Software Gadget on Intel Xeon Phi
● Code Optimization and Scalability Testing of an Artificial Bee Colony Based Software for
Massively Parallel Multiple Sequence Alignment on the Intel MIC Architecture
● Optimization and Scaling of Multiple Sequence Alignment Software ClustalW on Intel Xeon
Phi
● Porting FEASTFLOW to the Intel Xeon Phi: Lessons Learned
● Optimising CP2K for the Intel Xeon Phi
● Towards Porting a Real-World Seismological Application to the Intel MIC Architecture
● FMPS on MIC
● Massively parallel Poisson Equation Solver for hybrid Intel Xeon – Xeon Phi HPC Systems
● Exploiting Locality in Sparse Matrix-Matrix Multiplication on the Many Integrated Core
Architecture
● Porting and Verification of ExaFMM Library in MIC Architecture
● AGBNP2 Implicit Solvent Library for Intel® MIC Architecture
Towards Exascale: DEEP & DEEP-ER
DEEP Project
● Design of an architecture leading to exascale.
● Development of hardware:
Implementation of a Booster based on MIC processors and EXTOLL
interconnect.
● Energy-aware integration of components:
Hot-water cooling.
● Cluster management system.
● Programming environment, programming models.
● Libraries and performance analysis tools.
● Porting applications.
DEEP Cluster-Booster Architecture
Xeon Phi References
● Books:
James Reinders, James Jeffers, Intel Xeon Phi Coprocessor High
Performance Programming, Morgan Kaufmann Publ. Inc., 2013
http://lotsofcores.com
Rezaur Rahman: Intel Xeon Phi Coprocessor Architecture and
Tools: The Guide for Application Developers, Apress, 2013.
Parallel Programming and Optimization with Intel Xeon Phi
Coprocessors, Colfax 2013
http://www.colfaxintl.com/nd/xeonphi/book.aspx
● Intel Xeon Phi Programming, Training material, CAPS
● Intel Training Material and Webinars
● V. Weinberg (Editor) et al., Best Practice Guide - Intel Xeon Phi,
http://www.prace-project.eu/Best-Practice-Guide-Intel-Xeon-Phi-HTML
and references therein
SuperMIC ∈ SuperMUC @ LRZ
SuperMIC ∈ SuperMUC
● Conceptual talks with vendors: Dec 2008 to Dec 2009
● Start of building construction works: H2 2009
● Installation of PRACE prototypes at LRZ: Q2/Q3 2009
● Finalization of concept: Dec 2009
● Benchmarks ready: Dec 2009
● Testing phase for benchmarks: Jan/Feb 2010
● Competitive dialogue with vendors: Feb 2010 – Nov 2010
● Contract conclusion: Dec 2010
● Test and porting system: Apr 2011
● User operation on SuperMIG: Aug 2011
● Building ready: Oct 2011
● End of HLRB-II †: 21 Oct 2011
● First IBM racks delivered: Mar 2012
● Ranked No. 1 in Europe at ISC’12: 18 Jun 2012
● Inauguration ceremony: 20 Jul 2012
● General user operation on SuperMUC Phase 1: 3 Sep 2012
● Installation of SuperMIC: Q1 2014
● SuperMUC Phase 2 friendly user phase: May 2015
SuperMUC System Overview
SuperMUC Phase 2: Moving to Haswell
[Figure: SuperMUC Phase 2 system diagram]
● Phase 2: 6 Haswell islands, 512 nodes per island, warm water cooling
Haswell-EP, 24 cores/node, 2.67 GB/core
Mellanox FDR14 island switch (non-blocking)
● Phase 1: Thin + Fat islands of SuperMUC
Mellanox FDR10 island switch (non-blocking)
● Spine InfiniBand switches (pruned tree)
● I/O servers: GPFS for $WORK and $SCRATCH
(weak coupling of phases 1+2)
● LRZ infrastructure (NAS, Archive, Visualization)
● Internet / Grid Services
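As a rough cross-check, the Phase 2 figures in the diagram above can be multiplied out. This is a minimal sketch that uses only the values shown on the slide (6 islands, 512 nodes per island, 24 cores per node, 2.67 GB per core). The variable names are illustrative, and the totals are derived arithmetic, not official LRZ specifications.

```python
# Values read from the SuperMUC Phase 2 slide.
islands = 6
nodes_per_island = 512
cores_per_node = 24
gb_per_core = 2.67  # rounded value as printed on the slide

# Derived totals (plain arithmetic, not quoted from the slide).
total_nodes = islands * nodes_per_island
total_cores = total_nodes * cores_per_node
gb_per_node = cores_per_node * gb_per_core
total_memory_tb = total_nodes * gb_per_node / 1000  # decimal TB, approximate

print(f"nodes: {total_nodes}")
print(f"cores: {total_cores}")
print(f"memory per node: {gb_per_node:.1f} GB")
print(f"total memory: {total_memory_tb:.0f} TB (approx.)")
```

The per-core memory of 2.67 GB is consistent with roughly 64 GB per 24-core node.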
SuperMIC: Intel Xeon Phi Cluster
SuperMIC ∈ SuperMUC @ LRZ
● 32 compute nodes (diskless)
SLES11 SP3
2 Ivy-Bridge host processors [email protected] GHz with 16 cores
2 Intel Xeon Phi 5110P coprocessors per node with 60 cores
64 GB (Host) + 2 * 8 GB (Xeon Phi) memory
2 MLNX CX3 FDR PCIe cards attached to each CPU socket
● Interconnect
Mellanox InfiniBand FDR14
All nodes and MICs are directly accessible through a bridge interface
● 1 login server and 1 management server (batch system, xCAT, …)
● Air-cooled
● Supports both native and offload mode
● Batch system: LoadLeveler
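The node specification above can be summed into cluster-wide totals. This is a minimal sketch under two stated assumptions: that "16 cores" refers to the host cores per node (both sockets together), and that "60 cores" and "8 GB" refer to each Xeon Phi coprocessor. The variable names are illustrative only.

```python
# Values read from the SuperMIC slide.
nodes = 32
host_cores_per_node = 16   # assumption: total over both host sockets
phis_per_node = 2
cores_per_phi = 60         # assumption: per coprocessor
host_mem_gb_per_node = 64
mem_gb_per_phi = 8

# Derived cluster-wide totals (plain arithmetic, not quoted from the slide).
total_phis = nodes * phis_per_node
total_host_cores = nodes * host_cores_per_node
total_phi_cores = total_phis * cores_per_phi
total_host_mem_gb = nodes * host_mem_gb_per_node
total_phi_mem_gb = total_phis * mem_gb_per_phi

print(f"coprocessors: {total_phis}")
print(f"host cores: {total_host_cores}, Xeon Phi cores: {total_phi_cores}")
print(f"host memory: {total_host_mem_gb} GB, Xeon Phi memory: {total_phi_mem_gb} GB")
```

Under these assumptions the coprocessors contribute far more cores than the hosts but only a quarter of the host memory, which matters when choosing between native and offload mode.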
SuperMIC Network Access
And now …
Enjoy the course!
29/04/2015 Intel MIC & GPU Programming Workshop, LRZ 2015
Final remarks
Evaluation
Please fill out the PRACE PATC evaluation form:
https://events.prace-ri.eu/event/375/evaluation/evaluate
http://goo.gl/SdHo3c
Also linked on Workshop page.
Thank you!
Future Courses @ LRZ ∈ GCS ∈ PRACE
● Training LRZ Compute Cloud
Tuesday, May 12, 2015, 9:00 - 18:00
● Recent Advances in Parallel Programming Languages
Monday, June 8, 2015, 9:00 - 16:00
● Introduction to OpenFOAM
Monday, June 9 - Wednesday, June 11, 2015 9:00-17:00
● PRACE PATC Course: Advanced Fortran Topics
Monday, September 14, 2015 - Friday, September 18, 2015, 8:30 - 18:00
● Compact Course: Iterative Linear Solvers and Parallelization 2015
Monday, September 7, 08:30 - Friday, September 11, 2015, 15:30
● PRACE PATC Course: Node-Level Performance Engineering
Thursday, December 10 - Friday, December 11, 2015, 9:00 - 17:00
● Programming with Fortran
Monday, February 8 - Friday, February 12, 2016, 9:00-18:00
Future Courses @ LRZ ∈ GCS ∈ PRACE
● LRZ is part of the Gauss Centre for Supercomputing
(GCS), which is one of the six PRACE Advanced Training
Centres (PATCs) that started in 2012.
● Information on further HPC courses:
by LRZ:
http://www.lrz.de/services/compute/courses/
by the Gauss Centre for Supercomputing (GCS):
http://www.gauss-centre.eu/training
by the PRACE Advanced Training Centres (PATCs):
http://www.training.prace-ri.eu/
Thank you for your participation!