Gaurav slides
Date posted: 14-Apr-2017
Category: Devices & Hardware
Transcript of Gaurav slides
The Path to Exascale Computing: Challenges and Opportunities
HPC Meet-up, 21st May
Gaurav Kaul, Solutions Architect
Intel
Outline
Why Exascale?
Existing Trends: The End of Moore's Law?
Major Technology Challenges (aka Walls)
Technologies On the Horizon
Scaling Applications for Peta/Exa-Scale Era
Summary
Performance Roadmap
[Figure: log-scale plot of performance in GFLOPS (1.E-04 to 1.E+08) against year (1960 to 2020). Milestone labels: MFLOP, GFLOP, TFLOP, PFLOP, EFLOP. Intervals marked: 12 Years, 11 Years, 10 Years. Series: Client, Hand-held.]
A bit of History
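The interval labels on the roadmap ("12 Years 11 Years 10 Years") can be turned into an implied yearly growth rate. The sketch below assumes, and the transcript does not state this explicitly, that each label is the gap between two successive 1000x milestones (GFLOP to TFLOP, TFLOP to PFLOP, PFLOP to EFLOP). The function names are illustrative only.

```python
import math

# Assumption: each interval is the time taken for one 1000x step
# between the milestone labels on the roadmap chart.
intervals_years = {
    "GFLOP -> TFLOP": 12,
    "TFLOP -> PFLOP": 11,
    "PFLOP -> EFLOP": 10,
}


def annual_growth(factor, years):
    """Constant yearly multiplier that compounds to `factor` over `years`."""
    return factor ** (1.0 / years)


def doubling_time_years(factor, years):
    """Years needed to double at the growth rate implied by `factor` in `years`."""
    return years * math.log(2) / math.log(factor)


for step, years in intervals_years.items():
    growth = annual_growth(1000.0, years)
    doubling = doubling_time_years(1000.0, years)
    print(f"{step}: x{growth:.2f} per year, doubling every {doubling:.2f} years")
```

Under that assumption, a 1000x step in 10 years corresponds to performance roughly doubling every year, which is faster than transistor-count doubling alone would give.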
The Top 500 Waterfall
50 Years of Moore's Law
Moore and Dennard Scaling
Current Processor Performance Trends
Technology Scaling Outlook
The Power & Energy Challenge
Power per component: TFLOP Machine today vs. TFLOP Machine then, with Exa Technology

Component      Today    With Exa Technology
Compute        200W     5W
Memory         150W     2W
Com            100W     ~5W
Disk           100W     ~3W
(unlabeled)    4550W    5W
Total          5KW      ~20W
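The wattage figures on this slide can be checked with simple arithmetic. The sketch below assumes the values pair with the labels in the order they appear (Compute, Memory, Com, Disk), and it calls the fifth, unlabeled value "other"; both are assumptions about the original figure. The extrapolation to a full exaflop machine is plain scaling (1 EFLOP = 1,000,000 TFLOP) and is not a number taken from the slides.

```python
# Per-component power for a 1 TFLOP machine, in watts.
# Assumption: values are matched to labels in slide order; "other" is a
# hypothetical name for the fifth value, which has no label in the transcript.
today_w = {"compute": 200, "memory": 150, "com": 100, "disk": 100, "other": 4550}
exa_tech_w = {"compute": 5, "memory": 2, "com": 5, "disk": 3, "other": 5}

total_today = sum(today_w.values())        # the slide rounds this to 5KW
total_exa_tech = sum(exa_tech_w.values())  # the slide states ~20W

# How much less power the same TFLOP would need with exascale-era technology.
reduction = total_today / total_exa_tech

# Scale the per-TFLOP budget up to one exaflop and convert watts to megawatts.
TFLOPS_PER_EFLOP = 1_000_000
exaflop_power_mw = total_exa_tech * TFLOPS_PER_EFLOP / 1e6

print(f"TFLOP machine today:        {total_today} W")
print(f"TFLOP machine, exa tech:    {total_exa_tech} W")
print(f"Reduction factor:           {reduction:.0f}x")
print(f"Implied power for 1 EFLOP:  {exaflop_power_mw:.0f} MW")
```

Most of today's budget sits in the unlabeled 4550W term rather than in compute itself, which is why the slide frames the problem as a system-level one.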
Promising Technologies
Rethink System Level Architecture
DRAM Scaling Using 3D Memory
Innovative Packaging and I/O
Needs a Paradigm Shift
Evaluate each (old) architecture feature with new priorities
Past and present priorities
- Single thread performance: Frequency
- Programming productivity: Legacy, compatibility; architecture features for productivity
- Constraints: (1) Cost, (2) Reasonable Power/Energy
Future priorities
- Throughput performance: Parallelism
- Power/Energy: Architecture features for energy; simplicity
- Constraints: (1) Programming productivity, (2) Cost
Intel: Investing to Remove 6 Bottlenecks
- Interconnect
- Memory & Storage
- Processor Performance
- Reliability and Resiliency
- Standard Programming Model for Parallelism
- Power Efficiency
Impact on Applications
The Many Ways to Parallelism
And New Workloads Will Emerge
Code Modernization: The 4D Approach
New for Knights Landing (Next Generation Intel Xeon Phi Products)
2nd half '15: 1st commercial systems
3+ TFLOPS in One Package: Parallel Performance & Density
On-Package Memory: High Performance
up to 16GB at launch
5X Bandwidth vs DDR4
Compute: Intel Silvermont Arch. (Intel Atom)
Low-Power Cores with HPC Enhancements
3X Single Thread Performance vs Prior Gen.
Intel Xeon Processor Binary Compatible
1/3X the Space
5X Power Efficiency
Integrated Fabric
Intel Silvermont Arch. Enhanced for HPC
Processor Package
Conceptual: Not Actual Package Layout
Platform Memory: DDR4 Bandwidth and Capacity Comparable to Intel Xeon Processors
LEARN MORE: Knights Landing Webcast (Tuesday June 24th): https://www.brighttalk.com/webcast/10773/116329
Jointly Developed with Micron Technology
What is an FPGA?
FPGAs (Field Programmable Gate Arrays) are semiconductor devices that can be programmed
- Desired functionality of the FPGA can be (re-)programmed by downloading a configuration into the device
FPGAs offer several advantages over potential alternatives:
- Lower one-time development cost and faster time to market compared to custom-designed chips (ASICs)
- Ability to implement customer-specific functionality beyond what is available from standard products (ASSPs)
- Customizable and reprogrammable after the device has been deployed to the field, unlike both ASICs and ASSPs
http://commons.wikimedia.org/wiki/File:Fpga1a.gif
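To make "(re-)programmed by downloading a configuration" concrete, here is a toy software model of one lookup table (LUT), the basic logic element FPGAs are commonly built from. It is only an illustration: the class and method names are hypothetical, and a real device is configured through vendor tools and a bitstream, not through Python.

```python
class LookupTable:
    """Toy model of a k-input LUT: the configuration is its truth table."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.bits = [0] * (2 ** num_inputs)  # unconfigured: always outputs 0

    def load_configuration(self, bits):
        """Replace the truth table, which changes the logic function."""
        if len(bits) != 2 ** self.num_inputs:
            raise ValueError("configuration must have 2**num_inputs entries")
        self.bits = list(bits)

    def evaluate(self, *inputs):
        """Use the input bits as an index into the stored truth table."""
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        index = 0
        for bit in inputs:
            index = (index << 1) | (1 if bit else 0)
        return self.bits[index]


all_inputs = [(a, b) for a in (0, 1) for b in (0, 1)]

lut = LookupTable(2)
lut.load_configuration([0, 0, 0, 1])  # behaves as a 2-input AND gate
and_out = [lut.evaluate(a, b) for a, b in all_inputs]

lut.load_configuration([0, 1, 1, 0])  # same element, now a 2-input XOR gate
xor_out = [lut.evaluate(a, b) for a, b in all_inputs]

print("AND:", and_out)
print("XOR:", xor_out)
```

The same element computes a different function after each new configuration is loaded, which is the property that lets a deployed FPGA be repurposed in the field.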
Acceleration Architectural Landscape
[Figure: energy efficiency (MOPS/mW, log scale from 0.01 to 1000) against processor number (1 to 20, sorted by efficiency). Groups: Microprocessors, Reconfigurable, Dedicated HW. Annotations: More programmable, More efficient, 10X, 100X. Source: ISSCC Proceedings]
Potential for 10-100X higher performance/watt vs. general purpose cores
FPGAs as Reconfigurable Accelerators
Intel Confidential Do Not Forward
Example Use Case: HFT
What will matter in 10 years
What Next?
Summary