Impact of Memory - Ansys...Can be used as (In-package) direct attach memory ― Mitigates...

Victor Cai Marketing Manager 8/20/2011 Impact of Memory

Page 1

Victor Cai, Marketing Manager, 8/20/2011

Impact of Memory

Page 2

Application Requirements

What do users want?
― Zero latency
― Infinite bandwidth
― Zero power
― Zero cost
― Infinite capacity
― Backward compatibility, future extensibility
― Programmable access granularity
― Absolute reliability

More realistic user expectations
― Better performance at lower cost

Page 3

How to get “Better Performance”

Higher Capacity
― In-memory databases
  ● More capability
― OLTP databases
  ● Less disk access, lower average latency
― Consolidated (virtual) servers
  ● More emulated servers in single system = lower response latency

Higher Bandwidth
― “Typical” scientific code
  ● Higher throughput

Lower Latency
― single threaded code

Low Power
― Enables large capacity applications
― Enables mobile/battery based applications

Page 4

Memory System Selection Criteria

No single answer defines “better performance”
― Sometimes lower power, lower data rate, longer latency is “better performance”
― Sometimes lower datarate, longer latency, higher capacity is “better performance”
― Sometimes higher datarate, same latency, same capacity is “better performance”

JEDEC member companies drive future specification requirements based on their own respective application sets

Specifications “frame” scaling decisions for near-future designs

Page 5

Synchronous DRAM (SDRAM)

Moved from signal interface to command interface
― Enables programmable, pipelined operations
― Consecutive data phases in single burst designed for CPU cacheline fill

Multiple banks per device
― Facilitates pipelining, different banks may be in different phases of row-cycle operation

{DDR3, GDDR5, LPDDR2} - all direct evolutionary descendants of SDRAM

Page 6

From SDRAM to DDR3

                 SDRAM       DDR SDRAM   DDR2 SDRAM  DDR3 SDRAM  Unit
Data rate        66~133      200~400     400~1066    800~2133    MT/s
Voltage          3.3         2.5         1.8         1.5 (1.35)  V
Capacity         16~512      256~1024    256~2048    1024~4096   Mb
Burst length     1, 2, 4, 8  2, 4, 8     4, 8        4*, 8
I/O Type         LVTTL       SSTL-2      SSTL-18     SSTL-15
Termination      Off chip    Off chip    On die      On die
Dynamic ODT      N/A         N/A         No          RttWR
Bank count       2*, 4       4           4, 8*       8
I/O Calibration  None        None        OCD*        ZQ

• Small evolutionary steps to increase data rate, lower voltage and add features
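To put the data-rate row in more familiar terms, the following minimal sketch converts each generation's top data rate into peak channel bandwidth. It assumes a 64-bit (8-byte) data bus per channel, which is not stated on the slide; the names `BUS_WIDTH_BYTES` and `peak_bandwidth_mb_per_s` are illustrative only.

```python
# Peak theoretical bandwidth of one memory channel.
# Assumption (not from the slides): a 64-bit (8-byte) data bus per channel.
BUS_WIDTH_BYTES = 8

# Top data rate of each generation in MT/s, taken from the table.
TOP_DATA_RATE_MTS = {
    "SDRAM": 133,
    "DDR SDRAM": 400,
    "DDR2 SDRAM": 1066,
    "DDR3 SDRAM": 2133,
}

def peak_bandwidth_mb_per_s(data_rate_mts, bus_width_bytes=BUS_WIDTH_BYTES):
    """Return peak bandwidth in MB/s: mega-transfers per second times bytes per transfer."""
    return data_rate_mts * bus_width_bytes

peak = {name: peak_bandwidth_mb_per_s(rate) for name, rate in TOP_DATA_RATE_MTS.items()}

for name, mb_s in peak.items():
    print(f"{name:<11} {mb_s:>6} MB/s")
```

These are upper bounds only; sustained bandwidth depends on access pattern, bank conflicts, and refresh, none of which this sketch models.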

Page 7

Memory (System) Access Latency

[Figure: bar chart of access latency (ns), axis 0 to 120, for L1 Cache, L2 Cache, L3 Cache, and DRAM]

Component Access Latency
― Commodity DRAM (DDR3)
  ● Open page ~13 ns
  ● Close page ~26 ns
― NAND Flash
  ● ~25,000 ns

System Access Latency Example
― "DRAM" only responsible for ~30% of ~100 ns of "memory access latency"
― TLB misses, buffering/re-drive, FIFO, serial-to-parallel, parallel-to-serial conversions, etc. all increase latency
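The arithmetic behind these figures can be sketched as follows. The latency numbers are the ones quoted on the slide; the page-hit rate is a hypothetical workload parameter introduced here only to show how the open-page and close-page figures would blend into an average.

```python
# Figures quoted on the slide.
OPEN_PAGE_NS = 13.0        # DDR3 open-page access
CLOSE_PAGE_NS = 26.0       # DDR3 close-page access
SYSTEM_LATENCY_NS = 100.0  # total "memory access latency"
DRAM_SHARE = 0.30          # portion attributed to the DRAM itself

def average_dram_latency_ns(page_hit_rate):
    """Weighted average of open-page and close-page latency.

    page_hit_rate is a hypothetical workload parameter in [0, 1].
    """
    if not 0.0 <= page_hit_rate <= 1.0:
        raise ValueError("page_hit_rate must be between 0 and 1")
    return page_hit_rate * OPEN_PAGE_NS + (1.0 - page_hit_rate) * CLOSE_PAGE_NS

dram_part_ns = SYSTEM_LATENCY_NS * DRAM_SHARE
other_part_ns = SYSTEM_LATENCY_NS - dram_part_ns

print(f"DRAM portion: ~{dram_part_ns:.0f} ns, everything else: ~{other_part_ns:.0f} ns")
for hit_rate in (0.0, 0.5, 1.0):
    print(f"page hit rate {hit_rate:.1f}: {average_dram_latency_ns(hit_rate):.1f} ns")
```

The point of the split is that even a much faster DRAM device would leave roughly 70 ns of the path untouched.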

Page 8

Commodity DRAM Latency Scaling Trend

[Figure: chart of commodity DRAM latency (nanoseconds, axis 0 to 100) versus year of introduction, with two series, RC and CAS]

Year of Introduction: 1996, 1998, 1999, 2001, 2002, 2003, 2004, 2006, 2007, 2008, 2010, 2011
RC (ns): 90, 70, 60, 60, 60, 55, 56.25, 54, 55, 52.533, 50.975, 50
CAS (ns): 30, 20, 15, 15, 18, 15, 15, 15, 15, 13.133, 13.493, 12.5

Page 9

Commodity DRAM Datarate Scaling Trend

[Figure: chart of commodity DRAM datarate (MT/s, axis 0 to 1800) versus year of introduction]

Year of Introduction: 1996, 1998, 1999, 2001, 2002, 2003, 2004, 2006, 2007, 2008, 2010, 2011
Datarate (MT/s): 66.667, 100, 133.333, 266.667, 333.333, 400, 533.333, 666.667, 800, 1066, 1333, 1600
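Comparing the endpoints of the two trend charts makes the mismatch between datarate scaling and latency scaling concrete. This is a rough sketch that assumes the first data label on each chart belongs to 1996 and the last to 2011; the helper name `improvement_factor` is illustrative.

```python
# First and last data labels from the latency and datarate trend charts.
# Assumption: the first label corresponds to 1996 and the last to 2011.
FIRST = {"data rate (MT/s)": 66.667, "RC (ns)": 90.0, "CAS (ns)": 30.0}
LAST = {"data rate (MT/s)": 1600.0, "RC (ns)": 50.0, "CAS (ns)": 12.5}

def improvement_factor(name):
    """How many times better the last value is than the first.

    Data rate improves by going up; latencies improve by going down.
    """
    if name.startswith("data rate"):
        return LAST[name] / FIRST[name]
    return FIRST[name] / LAST[name]

factors = {name: improvement_factor(name) for name in FIRST}

for name, factor in factors.items():
    print(f"{name:<17} improved {factor:.1f}x")
```

Under that assumption, datarate grew by roughly 24x over the period while RC and CAS latency improved by only about 1.8x and 2.4x.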

Page 10

Key Challenges

Page 11

Signaling System

[Figure: signaling system diagram with labeled elements Data In, Transmitter, Transmission Line, Terminator, Receiver (with Vref input), Data Out, and Clocking System]

1. Transmitter
2. Transmission Line
3. Receiver
4. Terminator
5. Clocks

Taken from “Signaling in High-Performance Memory Systems,” John Poulton, ISSCC Tutorial, 1999.

Page 12

Channel Topology

Multiple DIMMs per channel
2 ranks (loads) per DIMM
Dynamic ODT
Asymmetric R/W topology

[Figure: channel topology from memory controller to DIMM 0, DIMM 1, and DIMM 2. Labels: Memory Controller Die; Memory Controller Package; MC; Trace; Socket; Open Field Trace; DIMM; Field Trace; Socket-to-socket trace; PCB Trace (Breakout on system board); DIMM Connector; PCB Trace (Connector to RS); DQ RStub; PCB Trace (RS to DRAM PKG); DRAM PKG Wirebond; DRAM Die (DQ Interface)]

Page 13

Current Typical System Memory Configuration

Supply voltage / Loading / Speed: 800-1066, 1333, 1600, 1866, 2133 (DR and QR columns for each speed)

1.5V
  1DPC  - - -
  2DPC  - - - - - -
  3DPC  - - - - - - - - -

1.35V
  1DPC  - - - -
  2DPC  - - - - - -
  3DPC  - - - - - - - - -

1.25V
  1DPC  - - - -
  2DPC  - - - - - -
  3DPC  - - - - - - - - -

Page 14

Looking Forward

Page 15

LRDIMM 4Rx4

Density: 16G

Target platform:
― Westmere
― Romley
― Interlagos

Compatible with DDR3 DIMMs

20% cost advantage over equivalent LRDIMM

Page 16

Back to the Future

Best candidate to replace/augment DRAM?
― Flash

Why?
― It’s cheaper than DRAM

Both DRAM and Flash running into process scaling challenges
― Alternative technologies making claims and trying to win mindshare
― DRAM and Flash are multi-billion dollar industries with continuous re-investment

Revolution postponed until
― Effectiveness of billion dollar investments in DRAM/Flash loses to million dollar investments in alternatives.

Page 17

High Level Goals for DDR4

Latency
― Comparable or slightly lower than DDR3

Bandwidth
― Data rate to 3.2 Gb/s and beyond

Power
― Lower power by lowering voltage AND new features to suppress idle and active power

Cost
― Minimize additional die size penalty relative to DDR3

Capacity
― Leverage TSV to create extreme-capacity systems

Compatibility
― Leverage as much infrastructure from DDR3 as practicable
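To illustrate why lowering voltage is the first lever named under Power, here is a minimal sketch using the common first-order approximation that switching power scales with C * V^2 * f. That model is an assumption brought in here, not something the slides state, and it ignores leakage, termination, and refresh power. The voltages are the ones listed in the "From SDRAM to DDR3" table.

```python
# Supply voltages in volts, from the "From SDRAM to DDR3" table.
VOLTAGES = [
    ("SDRAM", 3.3),
    ("DDR SDRAM", 2.5),
    ("DDR2 SDRAM", 1.8),
    ("DDR3 SDRAM", 1.5),
    ("DDR3 SDRAM at 1.35 V", 1.35),
]

def relative_switching_power(v_new, v_old):
    """Ratio of switching power at v_new versus v_old.

    Assumes P ~ C * V^2 * f with capacitance C and frequency f held equal,
    which is a deliberate simplification.
    """
    return (v_new / v_old) ** 2

ratios = {}
for (old_name, v_old), (new_name, v_new) in zip(VOLTAGES, VOLTAGES[1:]):
    ratios[new_name] = relative_switching_power(v_new, v_old)
    print(f"{old_name} -> {new_name}: {ratios[new_name]:.2f}x switching power")
```

Under this model each past voltage step cut switching power by roughly 20 to 50 percent at equal frequency, which is the effect the Power goal relies on alongside the new idle and active power features.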

Page 18

DDR4 Features (Subject to Change)

Pseudo Open Drain, faster I/O
― Leveraged from GDDR5

Single-ended signaling
― Continues on DDRx legacy, may be end of line

1:1 signal/ground ratio
― Better noise shielding on connector for higher data rate

Higher data rate, lower voltages

Smaller rows for x4 device
― X4 devices to have lower activation power and better performance

Same prefetch length as DDR3
― Supports existing CPU cache infrastructure

GDDR5-style bank groups
― High data rate, low DRAM core cycle rate, same prefetch length
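The bank-group point can be made concrete with a rough sketch of how fast the DRAM core must cycle at a given interface data rate. The prefetch depths below (1, 2, 4, 8) are general DRAM background rather than values from the slides, and the model of bank groups as a simple divisor is a simplification; `core_rate_mhz` and `interleaved_groups` are hypothetical names.

```python
# Prefetch depth per generation: bits fetched from the DRAM core per data pin
# on each column access. General DRAM background, not stated on the slides.
PREFETCH = {"SDRAM": 1, "DDR SDRAM": 2, "DDR2 SDRAM": 4, "DDR3 SDRAM": 8}

# Top data rate per generation in MT/s, from the "From SDRAM to DDR3" table.
TOP_DATA_RATE_MTS = {"SDRAM": 133, "DDR SDRAM": 400, "DDR2 SDRAM": 1066, "DDR3 SDRAM": 2133}

def core_rate_mhz(data_rate_mts, prefetch, interleaved_groups=1):
    """Column access rate each core array must sustain, in MHz.

    Simplified model: the interface rate is divided by the prefetch depth,
    and again by the number of bank groups that back-to-back accesses
    alternate between (1 means no bank groups).
    """
    return data_rate_mts / prefetch / interleaved_groups

for name, rate in TOP_DATA_RATE_MTS.items():
    print(f"{name:<11} {core_rate_mhz(rate, PREFETCH[name]):7.1f} MHz core rate")

# DDR4 goal from the slides: 3.2 Gb/s per pin (3200 MT/s), same prefetch as DDR3.
DDR4_RATE_MTS = 3200
no_groups = core_rate_mhz(DDR4_RATE_MTS, PREFETCH["DDR3 SDRAM"])
two_groups = core_rate_mhz(DDR4_RATE_MTS, PREFETCH["DDR3 SDRAM"], interleaved_groups=2)
print(f"DDR4 at {DDR4_RATE_MTS} MT/s: {no_groups:.0f} MHz without bank groups, "
      f"{two_groups:.0f} MHz per group when alternating between 2 groups")
```

In this model, earlier generations kept the core near 133 to 267 MHz by doubling prefetch each time. Holding prefetch at 8 for a 3200 MT/s interface would need a 400 MHz core unless accesses alternate between bank groups.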

Page 19

DDR4 Solution Development

Basic idea: Same register device for both RDIMM and LRDIMM
― DDR4 Register + DRAM + no data buffers = RDIMM
― DDR4 Register + DRAM + data buffers = LRDIMM

[Figure: DDR4 DIMM outline, 133.35 mm by 31.25 mm (0.9 mm taller than DDR3). Labels: Same Register Chip for DDR4 RDIMM or DDR4 LRDIMM; Data Buffer Devices; 0.85 mm pitch fingers; 1:1 S:G ratio]

Page 20

Through Silicon Via (TSV)

(Potentially) a real revolution in DRAM technology
― Within DRAM generation event horizon, 3~5 years

Can enable direct attach memory to processor
― Low power, low latency
― May have slightly higher per bit cost structure

Mitigate the bandwidth-vs-capacity issue
― Multiple DRAM dies in single stack presents single electrical {address, data} load to system

Page 21

DRAM Stacks with TSV

Traditional Dual Die Package (DDP)
― Same DRAM die as used in single die packages
― System sees all loads on all DRAM dies

DRAM with TSV
― Special die for TSV use
― System sees single load of separate I/O re-drive chip

[Figure: package cross-sections built from DRAM dies. Single Die Package; Low Cost Dual Die Package (DDP) (with long wirebond on top die); Dual Die Package (DDP) (with RDL on both dies); 4 high TSV Stack with interface die]
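The load-versus-capacity distinction between these packages can be sketched as a small model. The per-die density of 4 Gb is a hypothetical value chosen for illustration, and the `Package` class and its methods are not from the slides; only the rule that a DDP exposes every die as a load while a TSV stack exposes a single re-drive load comes from the text.

```python
from dataclasses import dataclass

@dataclass
class Package:
    name: str
    dram_dies: int
    has_interface_die: bool  # True for a TSV stack with an I/O re-drive die

    def loads_seen_by_system(self):
        # With an interface die, only that die loads the bus;
        # otherwise every DRAM die in the package is a separate load.
        return 1 if self.has_interface_die else self.dram_dies

    def capacity_gbit(self, die_gbit):
        return self.dram_dies * die_gbit

DIE_GBIT = 4  # hypothetical per-die density, for illustration only

packages = [
    Package("Single die package", 1, False),
    Package("Dual die package (DDP)", 2, False),
    Package("4-high TSV stack", 4, True),
]

for p in packages:
    print(f"{p.name:<24} {p.capacity_gbit(DIE_GBIT):>2} Gb, "
          f"{p.loads_seen_by_system()} electrical load(s)")
```

In this model the TSV stack holds four times the capacity of a single die package at the same electrical load, which is the sense in which stacking eases the capacity-vs-bandwidth trade-off.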

Page 22

TSV Benefits and Drawbacks

Benefits
― Can be used as (In-package) direct attach memory
― Mitigates capacity-vs-bandwidth trade-off
― Enables ultra-high capacity memory systems not possible today

Drawbacks
― Increased cost basis for DRAM manufacturers
― Different dies used for SDP and stacked devices
― Require new manufacturing/testing models for direct attach memory function between different devices from different companies
