EURORA - Prace Training Portal: Events SANAM-King Abdulaziz City for ... Laboratory #7 CADMOS...
#1 Eurora-CINECA
#2 Aurora Tigon-Selex ES Chieti
#3 Beacon-National Institute for Computational Sciences/University of Tennessee
#4 SANAM-King Abdulaziz City for Science and Technology
#5 IBM Thomas J. Watson Research Center
#6 Cetus-DOE/SC/Argonne National Laboratory
#7 CADMOS BG/Q-Ecole Polytechnique Federale de Lausanne
#8 Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw
#9 Vesta-DOE/SC/Argonne National Laboratory
#10 University of Rochester
The Green500 List - June 2013
EURORA (EURopean many integrated cORe Architecture)
Prototype Project
Funded by the PRACE 2IP EU project (grant agreement number: RI-283493)
Goal: evaluate new architectures for next-generation Tier-0 systems
PRACE partners involved: CINECA (Italy), GRNET (Greece), IPB (Serbia), NCSA (Bulgaria)
Vendor: Eurotech
EURORA
project objectives
Address today’s HPC constraints: Flops/Watt, Flops/m2, Flops/Dollar.
Efficient cooling technology: hot water cooling (free cooling); measure power efficiency; evaluate PUE and TCO.
Improve application performance: at the same rate as in the past (~Moore’s Law); new programming models.
Evaluate hybrid (accelerated) technology: Intel Xeon Phi; NVIDIA Kepler.
Custom interconnection technology: 3D torus network (FPGA); evaluation of accelerator-to-accelerator communications.
EURORA chassis
1 rack, 16 chassis
16 node cards, or 8 node cards + 16 accelerators
Eurora Rack
Physical dimensions: 2133 mm (48U) h, 1095 mm w, 1500 mm d
Weight (full rack, cooling fully loaded with water): 2000 kg
Typical power/cooling requirements: 120-130 kW @ 48 Vdc
cooling
Hot water 50-80 °C
Temperature gap 3-5 °C
No rotating fans
Cold plates – direct on-component liquid cooling
Dry chillers
Free cooling
Temperature sensors – downgrade performance if required
System isolation
Quick disconnect
EURORA Network
3D Torus custom network
FPGA (Altera Stratix V)
APENET
Ad-hoc MPI subset/API
InfiniBand FDR
Mellanox ConnectX3
MPI + Filesystem
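To illustrate what a 3D torus topology implies for nearest-neighbour communication, here is a minimal Python sketch. The 4x4x4 dimensions and the function names are assumptions made for illustration only; the slides do not state the actual torus dimensions, and this is not the APENET API.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbours of `node` in a 3D torus of size `dims`."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            # The modulo implements the wrap-around link that closes each ring.
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimum hop count between a and b, going the shorter way around each ring."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical torus size, not taken from the slides
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (3, 3, 3), dims))
```

Because of the wrap-around links, node (3, 3, 3) is only three hops from (0, 0, 0) in this hypothetical 4x4x4 torus, whereas it would be nine hops away in a plain mesh.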
64 compute cards
128 Xeon SandyBridge (2.1GHz, 95W and 3.1GHz, 150W)
16GByte DDR3 1600MHz per node
160GByte SSD per node
1 FPGA (Altera Stratix V) per node
IB QDR interconnect
3D Torus interconnect
128 accelerator cards (NVIDIA K20 and Intel Xeon Phi)
EURORA
prototype configuration
Definition of System Metrics and Benchmarks.
Performance Measurements (Flops/Watt using Linpack).
Connectivity Benchmarks (Bandwidth and Latency).
Application Porting.
Application Benchmarks (time, scalability, watt to solution).
Different Operational Conditions (e.g. change water temp.).
EURORA
planned experiments
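The metrics named in these experiments can be sketched as simple ratios. The snippet below is an illustrative Python sketch, not the project's measurement tooling: the function names are hypothetical, "watt to solution" is interpreted here as energy to solution (average power times runtime), and the input numbers are made-up placeholders rather than measured EURORA values.

```python
def flops_per_watt(sustained_flops: float, avg_power_watts: float) -> float:
    """Energy efficiency: sustained floating-point rate divided by average power."""
    return sustained_flops / avg_power_watts


def energy_to_solution_kwh(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy consumed by one run, in kWh (1 kWh = 3.6e6 joules)."""
    return avg_power_watts * runtime_seconds / 3.6e6


def pue(total_facility_watts: float, it_equipment_watts: float) -> float:
    """Power Usage Effectiveness: total facility power over IT equipment power."""
    return total_facility_watts / it_equipment_watts


if __name__ == "__main__":
    # Placeholder inputs chosen only to exercise the formulas.
    print(flops_per_watt(100e12, 50e3) / 1e9, "GFlops/W")
    print(energy_to_solution_kwh(1000.0, 3600.0), "kWh")
    print(pue(110e3, 100e3))
```

A PUE close to 1.0 means almost all facility power reaches the IT equipment, which is what hot water cooling with free cooling aims for.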
• Material Science (Quantum-ESPRESSO)
• Life Science (GROMACS)
• Fundamental Physics (QCD)
• Earth Science / Weather Forecast
• High Throughput Virtual Screening (Pharma industry - DOMPE’)
EURORA
applications
• Message Passing (MPI)
• Shared Memory (OpenMP)
• Kernel offload (pragmas / native)
• Hybrid: MPI + OpenMP + extensions/OpenCL
EURORA
programming models
First results
DATASET: Ta2O5-2x1xz-552, 20 iterations
EURORA (5 nodes, 10 K20 GPUs, 10 MPI tasks, 8 OpenMP threads per core): 789.2 secs
PLX (5 nodes, 10 M2070 GPUs, 10 MPI tasks, 6 OpenMP threads per core): 2180.4 secs
BGQ (64 nodes, 256 MPI tasks, 8 OpenMP threads per core): 920.4 secs
[Figure: bar chart, Quantum-ESPRESSO benchmark with GPU, dataset Ta2O5-2x1xz-552. Time in seconds: EURORA (5 nodes) 789, PLX (5 nodes) 2180, FERMI (64 nodes) 920.]
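The relative timings for the Ta2O5 dataset can be worked out with a few lines of Python. This is only an arithmetic sketch over the three wall-clock times reported on the slide; the helper name `relative_runtime` is hypothetical, and the ratios compare whole configurations (different node counts and hardware), not per-node performance.

```python
# Wall-clock times in seconds, as reported for the Ta2O5-2x1xz-552 dataset.
runtimes = {"EURORA": 789.2, "PLX": 2180.4, "BGQ": 920.4}


def relative_runtime(times: dict, reference: str) -> dict:
    """Express every runtime as a multiple of the reference system's runtime."""
    ref = times[reference]
    return {name: t / ref for name, t in times.items()}


ratios = relative_runtime(runtimes, "EURORA")

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x the EURORA runtime")
```

On these figures, the 5-node PLX run takes roughly 2.8 times as long as the 5-node EURORA run, and the 64-node BGQ run roughly 1.2 times as long.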
DATASET: 256 H2O molecules
BGQ (1024 cores): 9.0 seconds/iteration
BGQ (2048 cores): 5.0 seconds/iteration
BGQ (4096 cores): 3.7 seconds/iteration
EURORA (32 cores, 2.1 GHz): 71 seconds/iteration
EURORA (32 cores, 3.1 GHz): 61 seconds/iteration
EURORA (64 cores, 3.1 GHz): 35 seconds/iteration
EURORA (128 cores, 3.1 GHz): 19 seconds/iteration
EURORA (256 cores, 3.1 GHz): 12 seconds/iteration
[Figure: bar chart, Quantum-ESPRESSO benchmark without GPU, dataset 256 H2O. Vertical axis: seconds/iteration, 0 to 80.]
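The strong-scaling behaviour behind the 256 H2O timings can be estimated from the reported seconds/iteration. The Python sketch below takes the smallest core count of each series as the baseline; the function name `scaling_table` is hypothetical, and the result is a rough reading of the slide's numbers rather than a rigorous scaling study.

```python
from typing import Dict, List, Tuple


def scaling_table(timings: Dict[int, float]) -> List[Tuple[int, float, float]]:
    """Return (cores, speedup, parallel efficiency) relative to the smallest run."""
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        # Ideal speedup equals the factor by which the core count grew.
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows


# Seconds per iteration, as reported on the slide.
eurora_3_1ghz = {32: 61.0, 64: 35.0, 128: 19.0, 256: 12.0}
bgq = {1024: 9.0, 2048: 5.0, 4096: 3.7}

if __name__ == "__main__":
    for label, series in (("EURORA 3.1 GHz", eurora_3_1ghz), ("BGQ", bgq)):
        print(label)
        for cores, speedup, efficiency in scaling_table(series):
            print(f"  {cores:5d} cores: speedup {speedup:.2f}, efficiency {efficiency:.0%}")
```

On these numbers, EURORA at 3.1 GHz keeps about 64% parallel efficiency going from 32 to 256 cores, and BGQ about 61% going from 1024 to 4096 cores.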