High Bandwidth Memory Presentation
HIGH BANDWIDTH MEMORY (HBM) MAY 2015
UNDER EMBARGO UNTIL 19 MAY, 2015 @ 8:00 AM EASTERN
2 AMD RADEON™ GRAPHICS UPDATE | NDA ONLY – UNDER EMBARGO UNTIL 19 MAY, 2015 @ 8:00 AM EASTERN
Platforms & devices must balance power usage between DRAM and logic chips
GDDR5 is entering the inefficient region of the power/performance curve
AMD anticipated this challenge seven years ago and began work on a solution
GDDR5 WILL SOON STALL GPU PERFORMANCE GROWTH
AMD internal estimates, chart for illustrative purposes only.
GDDR5 chips aren’t getting any smaller
A large number of devices are needed to reach high bandwidth
Significant board real estate is consumed by wide GDDR5 interfaces
GDDR5’s power demands require larger voltage regulators
All of this determines the size of a high-performance product
GDDR5 ALSO LIMITS FORM FACTORS
PCB area occupied by ASIC + Memory (R9 290X): 110 mm x 90 mm
HISTORICALLY: WE SOLVE IT BY SHRINKING AND INTEGRATING FUNCTIONS
1971: Intel 4004
1989: Intel 486 (Cache & FPU)
1993: Intel P5 (Multi-Media)
2003: AMD K8 (North Bridge)
2010: AMD “Ontario” (Graphics Processor)
2013: AMD “Kabini” (South Bridge)
2013: Intel “Haswell” (IVR)
2015: AMD HBM (Die Stacking)
Photo sources: gecko54000.free.fr, proyectoyautja.proboards.com, bytesandbits.it, Extremetech.com
Photographs used for informational purposes, no endorsement is expressed or implied.
ON-CHIP INTEGRATION NOT IDEAL FOR DRAM
DRAM is not size or cost effective for integration in a logic-optimized process (e.g. SoC or GPU)…
…but there is still a need and desire to integrate DRAM for performance/power/form factor reasons
Another way to integrate DRAM must be explored
Why not scale GDDR5 to be faster?
More bandwidth requires more power
Faster CPUs/GPUs require more bandwidth
DRAM power consumption is non-linear: power rises disproportionately as bandwidth increases
COMMUNICATION IS OVERHEAD CONSUMING POWER, LATENCY, AND FOOTPRINT
OFF-CHIP INTERFACES SCALE POORLY
Brings DRAM as close as possible to the logic die
Improving proximity enables extremely wide bus widths
Improving proximity simplifies communication and clocking
Improving proximity greatly improves bandwidth per watt
Allows for integration of disparate technologies such as DRAM
AMD developed industry partnerships with ASE, Amkor & UMC to develop the first high-volume manufacturable interposer solution
THE NEXT STEP IN INTEGRATION
THE INTERPOSER
A new type of memory chip with low power consumption and an ultra-wide bus width
Many of those chips stacked vertically like floors in a skyscraper
New interconnects, called “through-silicon vias” (TSVs) and “µbumps”, connect one DRAM chip to the next
TSVs and µbumps are also used to connect the SoC/GPU to the interposer
AMD and SK Hynix partnered to define and develop the first complete specification and prototype for HBM
HIGH-BANDWIDTH MEMORY DRAM BUILT FOR AN INTERPOSER
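The "ultra-wide bus width" point above can be illustrated with the usual peak-bandwidth arithmetic: number of data pins times per-pin data rate, divided by 8 bits per byte. The sketch below is illustrative only. The function name and both configurations (a 32-bit device at 7 Gbit/s per pin, a 1024-bit stack at 1 Gbit/s per pin) are assumptions chosen for the example and are not taken from these slides.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, gbit_per_s_per_pin: float) -> float:
    """Peak bandwidth in GB/s: data pins * per-pin rate (Gbit/s) / 8 bits per byte."""
    return bus_width_bits * gbit_per_s_per_pin / 8


# Assumed, illustrative configurations (NOT figures from the presentation).
narrow_fast = peak_bandwidth_gb_s(bus_width_bits=32, gbit_per_s_per_pin=7.0)
wide_slow = peak_bandwidth_gb_s(bus_width_bits=1024, gbit_per_s_per_pin=1.0)

print(f"32-bit device at 7 Gbit/s per pin:  {narrow_fast:.0f} GB/s")
print(f"1024-bit stack at 1 Gbit/s per pin: {wide_slow:.0f} GB/s")
```

Under these assumed numbers the wide, slow interface delivers 128 GB/s against 28 GB/s for the narrow, fast one, which is the trade the slides describe: proximity makes a very wide bus practical, so each pin can run at a much lower rate.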
HBM: A DIFFERENT MEMORY FROM GDDR5
HBM & interposer give much more bandwidth than GDDR5 for >50% less power [3]
HBM rebalances DRAM vs. logic power consumption to protect future GPU performance growth
IMPROVING POWER EFFICIENCY WITH STACKED HBM [2]
Source: AMD
MASSIVE SPACE SAVINGS OVER GDDR5
94% LESS SURFACE AREA [1]
1GB GDDR5 (4x256MB): 28x24mm = 672mm²
1GB HBM Stack: 5x7mm = 35mm²
‒ 19x less surface area for same amount of DRAM
9900mm² PCB footprint for AMD Radeon™ R9 290X GPU & RAM
<4900mm² PCB footprint for HBM-based ASIC
‒ >50% smaller PCB Footprint
THINK SMALLER WITH THE INTERPOSER & HBM
PCB area occupied by ASIC + Memory (R9 290X): 110 mm x 90 mm
PCB area occupied by ASIC with HBM: <70 mm x <70 mm
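The area claims on this slide follow directly from the dimensions quoted. A minimal sketch of that arithmetic is below; the variable names are illustrative, and the HBM-based footprint uses 70 mm x 70 mm as an upper bound because the slide only states "<70 mm" per side.

```python
# Dimensions in millimetres, as quoted on the slide.
gddr5_area_mm2 = 28 * 24      # 1GB GDDR5 (4x256MB)
hbm_area_mm2 = 5 * 7          # 1GB HBM stack

area_ratio = gddr5_area_mm2 / hbm_area_mm2
area_saving = 1 - hbm_area_mm2 / gddr5_area_mm2

r9_290x_pcb_mm2 = 110 * 90    # ASIC + memory, R9 290X
hbm_pcb_max_mm2 = 70 * 70     # upper bound: the slide says "<70 mm" per side

# Because the HBM footprint is an upper bound, this saving is a lower bound.
pcb_saving_min = 1 - hbm_pcb_max_mm2 / r9_290x_pcb_mm2

print(f"DRAM area: {gddr5_area_mm2} mm² vs {hbm_area_mm2} mm² "
      f"({area_ratio:.1f}x, {area_saving:.1%} less)")
print(f"PCB footprint: {r9_290x_pcb_mm2} mm² vs <{hbm_pcb_max_mm2} mm² "
      f"(more than {pcb_saving_min:.1%} smaller)")
```

The results line up with the slide: roughly 19x (about 94% less) DRAM surface area, and a PCB footprint more than 50% smaller.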
HBM WITH INTERPOSER: SPEED, POWER & SMALL FORM FACTORS
A REVOLUTION IN CHIP DESIGN
HIGH BANDWIDTH: Performance well beyond DDR4/GDDR5/LPDDR4
POWER EFFICIENCY: >3X the performance per watt of GDDR5 [2]
SMALL FORM FACTORS: 94% less PCB surface area than GDDR5 [1]
INNOVATION: New interconnects, interposer & DRAM type designed by AMD
FOOTNOTES
1. Measurements conducted by AMD Engineering on 1GB GDDR5 (4x256MB ICs) @ 672mm² vs. 1GB HBM (1x4-Hi) @ 35mm². HBM-2
2. Testing conducted by AMD engineering on the AMD Radeon™ R9 290X GPU vs. an HBM-based device. Data obtained through isolated direct measurement of GDDR5 and HBM power delivery rails at full memory utilization. Power efficiency calculated as GB/s of bandwidth delivered per watt of power consumed. AMD Radeon™ R9 290X (10.66 GB/s bandwidth per watt) and HBM-based device (35+ GB/s bandwidth per watt), AMD FX-8350, Gigabyte GA-990FX-UD5, 8GB DDR3-1866, Windows 8.1 x64 Professional, AMD Catalyst™ 15.20 Beta. HBM-1
3. Testing conducted by AMD engineering on the AMD Radeon™ R9 290X GPU vs. an HBM-based device. Data obtained through isolated direct measurement of GDDR5 and HBM power delivery rails at full memory utilization. AMD Radeon™ R9 290X and HBM-based device, AMD FX-8350, Gigabyte GA-990FX-UD5, 8GB DDR3-1866, Windows 8.1 x64 Professional, AMD Catalyst™ 15.20 Beta. HBM-3
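Footnote 2 defines power efficiency as GB/s of bandwidth delivered per watt consumed. The sketch below works through that definition using the two figures the footnote quotes (10.66 GB/s per watt for GDDR5 on the R9 290X, and 35 GB/s per watt as the lower bound of "35+" for the HBM-based device). The 100 GB/s target bandwidth and the equal-bandwidth comparison are assumptions made for illustration; the slides do not state a power figure at a fixed bandwidth.

```python
# Efficiency figures quoted in footnote 2 (GB/s delivered per watt).
gddr5_gb_s_per_w = 10.66
hbm_gb_s_per_w = 35.0         # lower bound of the quoted "35+"

efficiency_ratio = hbm_gb_s_per_w / gddr5_gb_s_per_w


def memory_power_w(bandwidth_gb_s: float, gb_s_per_w: float) -> float:
    """Watts needed to deliver a bandwidth at a given efficiency."""
    return bandwidth_gb_s / gb_s_per_w


# Hypothetical target bandwidth, chosen only to make the comparison concrete.
target_gb_s = 100.0
gddr5_w = memory_power_w(target_gb_s, gddr5_gb_s_per_w)
hbm_w = memory_power_w(target_gb_s, hbm_gb_s_per_w)
power_saving = 1 - hbm_w / gddr5_w

print(f"Efficiency ratio: {efficiency_ratio:.2f}x")
print(f"At {target_gb_s:.0f} GB/s: GDDR5 {gddr5_w:.2f} W, HBM {hbm_w:.2f} W "
      f"({power_saving:.1%} less)")
```

The ratio comes out above 3x, matching the ">3X the performance per watt" claim. At equal bandwidth the assumed comparison gives roughly 70% less memory power, which leaves headroom for the slide's combined claim of more bandwidth at more than 50% less power.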
ATTRIBUTION
The information presented in this document is for information purposes only. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including, but not limited to, product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes. AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
ATTRIBUTION
©2015 Advanced Micro Devices, Inc. All rights reserved. AMD, Radeon, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other names are for informational purposes only and may be trademarks of their respective owners.