Page 1: Peta-Cache: Electronics Discussion II

Presentation: Ryan Herbst, Mike Huffer, Leonid Saphoznikov

Gunther Haller, haller@slac.stanford.edu, (650) 926-4257

Peta-Cache, Mar 30, 2006, V1

Page 2: Flash Storage Option

• Skim builder as in the option discussed earlier
• Event server (cache box) as in the other option
• Shown as two boxes for simplicity; could be in one box (there are pros and cons)
• Issue is again interconnect speed
• Up to 16 1-TByte Flash boxes for each event server
  – Each PCI-E lane: 256 MByte/sec
  – 16 lanes give a total of 4 GByte/sec bandwidth
• Each Flash box holds only a fraction of the total event store
• Flash has limited write cycles, so it can't be rewritten frequently (need to enforce with some policy which is most important)
• But don't really want to "burn" results of skim in flash, since the goal is to make own lists (and flash can't be reburned at will anyway)
• Flexibility:
  – One event server can have a subset of the list, and events go to the client
  – Or, better, treat the total of event servers as "one cache", with the event store managed so that parts of the list which are in other pizza boxes are kept in that cache as opposed to discarded
• Question is again how to populate the Flash most effectively
• Decompression in event server
• Flash bad-block management in event server
• Reed-Solomon EDAC in event server
• Can consider without cache box: with 4000 clients going after the same block, the last one to get data is ~300 msec later
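The interconnect figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the quoted numbers (256 MByte/sec per lane, 16 lanes, up to 16 Flash boxes of 1 TByte) and binary units. The `last_client_delay_s` helper is an assumption about where a figure like ~300 msec could come from: the slide states neither the block size nor the link, so the 320 KiB block and the fully serialized delivery over a single 4 GByte/sec link are purely illustrative.

```python
# Back-of-the-envelope interconnect numbers from the slide.
LANE_MBYTE_PER_S = 256   # per-lane PCI-E rate quoted on the slide
LANES = 16               # lanes per event server
FLASH_BOXES = 16         # upper bound of Flash boxes per event server
BOX_TBYTE = 1            # capacity of one Flash box


def aggregate_gbyte_per_s(lanes=LANES, lane_mbyte=LANE_MBYTE_PER_S):
    """Total link bandwidth in GByte/s (1 GByte = 1024 MByte)."""
    return lanes * lane_mbyte / 1024


def full_scan_seconds(tbyte, gbyte_per_s):
    """Time to stream `tbyte` TByte once over the link (1 TByte = 1024 GByte)."""
    return tbyte * 1024 / gbyte_per_s


def last_client_delay_s(clients, block_bytes, gbyte_per_s):
    """Delay until the last client has the block, ASSUMING the copies are
    sent one after another over a single link (hypothetical model)."""
    return clients * block_bytes / (gbyte_per_s * 1024**3)


bw = aggregate_gbyte_per_s()
print(bw)                                              # 4.0 GByte/s
print(full_scan_seconds(FLASH_BOXES * BOX_TBYTE, bw))  # 4096.0 s for 16 TByte
# Illustrative only: 320 KiB is an assumed block size, not from the slide.
print(round(last_client_delay_s(4000, 320 * 1024, bw), 3))
```

Under these assumptions a full pass over 16 TByte takes a little over an hour, which is one way to see why interconnect speed is called out as the issue.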

[Diagram labels: Tape; Disk Storage; Skim builder(s); Event Server; Flash Storage 1 to 16, attached over PCI-E; Ethernet/PCI-E/etc; Client (1, 2, or 4 core); up to 1,500 cores in ~800 units?; optionally direct IO from Disk Storage]
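The "one cache" idea from the Flexibility bullets can be sketched as a lookup that checks the local event server first and then its peers, so an event held in another pizza box counts as a hit rather than a miss. This is only an illustration: the `EventServer` class, its `peers` list, and the dictionary standing in for the attached Flash are all hypothetical and not part of the presented design.

```python
class EventServer:
    """Toy event server; a dict stands in for its attached Flash boxes."""

    def __init__(self, name):
        self.name = name
        self.flash = {}   # event_id -> payload held locally (hypothetical)
        self.peers = []   # other event servers sharing the "one cache"

    def lookup(self, event_id):
        """Return (payload, source name), or (None, "miss") if no server has it."""
        if event_id in self.flash:
            return self.flash[event_id], self.name
        for peer in self.peers:
            if event_id in peer.flash:
                return peer.flash[event_id], peer.name
        return None, "miss"


a, b = EventServer("A"), EventServer("B")
a.peers, b.peers = [b], [a]
a.flash[1] = b"event-1"
b.flash[2] = b"event-2"

print(a.lookup(1))  # served locally by A
print(a.lookup(2))  # served from peer B instead of being treated as a miss
print(a.lookup(3))  # held nowhere: would fall back to disk or tape
```

A real design would also have to decide how the event list is partitioned across servers and how peer traffic shares the interconnect; the sketch leaves both open.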

Page 3: Flash-Box, Event Box

• Flash Memory Box
  – 8-, 16-, or 32-Gbit NAND devices
  – For 1 TByte need 250 each 32-Gbit devices
    » All on board, or
    » 32-GByte memory cards (DIMM): need > 30 DIMMs
  – Preliminary placement on a 19-inch rack PCB shows that we can fit 1 TByte on a single board
  – PCI-E to PCI-X bridge (to get 64-bit addressing space)
  – No smarts in here
• Event (Pizza) Box
  – 8 F40 Xilinx (each has 2 450-MHz PPCs)
  – 16 GBytes of RLDRAM2
  – 8 PLX8508 PCI-E switches, 5 ports
  – 2 PLX8532 8-port switches (32 lanes)
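The device counts on this slide follow from simple unit conversion, sketched below. It uses the slide's own figures (250 devices of 32 Gbit, 32-GByte DIMMs, 16 GBytes of RLDRAM2 across 8 Xilinx devices with 2 PPCs each) and treats 1 TByte as 1000 GByte, which is what makes 250 devices come out exactly. The even split of RLDRAM2 across PPCs is an assumption; the slide does not say how the memory is divided.

```python
import math

GBIT_PER_DEVICE = 32   # largest NAND device listed on the slide
DEVICES = 250          # device count quoted for 1 TByte
DIMM_GBYTE = 32        # capacity of one memory card (DIMM)


def flash_gbyte(devices=DEVICES, gbit=GBIT_PER_DEVICE):
    """Total capacity in GByte: 8 Gbit per GByte."""
    return devices * gbit // 8


def dimms_needed(target_gbyte, dimm_gbyte=DIMM_GBYTE):
    """Whole DIMMs required to reach the target capacity."""
    return math.ceil(target_gbyte / dimm_gbyte)


def rldram_gbyte_per_ppc(total_gbyte=16, fpgas=8, ppc_per_fpga=2):
    """RLDRAM2 per PPC, ASSUMING an even split (not stated on the slide)."""
    return total_gbyte / (fpgas * ppc_per_fpga)


total = flash_gbyte()
print(total)                   # 1000 GByte, i.e. 1 TByte
print(dimms_needed(total))     # 32, consistent with "> 30 DIMMs"
print(rldram_gbyte_per_ppc())  # 1.0
```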

Page 4: Flash & Event Server Boxes

• 4-Gbit chips: $30; 8-Gbit: $60
• 4-GByte device quote: $110, min qty 1000 (is 4-1GB die stack)
• 1 PByte: 1,000 boxes, total $27 Mill

• Flash Box
  – 1 TByte Flash (250 x (4-GByte ~ $110)): $27,000
  – Bridge chips (total of 16): $500
  – Misc (box, board, loading, regulators, etc): $400
• Event Server
  – Xilinx's (8 each): $500
  – Local RLDRAM2 (16 GBytes): $3,200
  – Misc switches: $500
  – Misc (box, board, loading, regulators, etc): $400
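The cost figures can be rolled up per box and per PByte as a rough check. The sketch below assumes the line items group as Flash Box (flash, bridge chips, misc) and Event Server (Xilinx, RLDRAM2, switches, misc), treats each quoted dollar figure as the total for that line, and uses the upper bound of 16 Flash boxes per event server from the Flash Storage Option slide. All three are assumptions about how to read the slide, not stated totals.

```python
import math

# Line items as quoted on the slide (US dollars); grouping is an assumption.
FLASH_BOX = {
    "1 TByte Flash": 27_000,
    "Bridge chips (16)": 500,
    "Misc": 400,
}
EVENT_SERVER = {
    "Xilinx (8)": 500,
    "RLDRAM2 (16 GBytes)": 3_200,
    "Misc switches": 500,
    "Misc": 400,
}
FLASH_BOXES_PER_SERVER = 16   # upper bound quoted earlier in the deck


def box_cost(items):
    """Sum of all line items for one box."""
    return sum(items.values())


def system_cost(flash_boxes):
    """Return (total cost, event servers needed) for `flash_boxes` Flash boxes."""
    servers = math.ceil(flash_boxes / FLASH_BOXES_PER_SERVER)
    total = flash_boxes * box_cost(FLASH_BOX) + servers * box_cost(EVENT_SERVER)
    return total, servers


print(box_cost(FLASH_BOX))     # 27900
print(box_cost(EVENT_SERVER))  # 4600
print(system_cost(1000))       # (28189800, 63)
```

On this reading the flash devices dominate: the quoted $27 Mill for 1,000 boxes matches the flash line alone, and the event servers add roughly 1 percent.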

Page 5: Pizza box block diagram (needs some modification)

[Block diagram labels: (In) PCI Express x16; PLX8532; x4; PLX8508; Xilinx XC4VFX40 with PPC 405; RLDRAM II 1 GByte; PLX8508; x4; PLX8532; x16 (Out) PCI Express]

Page 6: Event Processing Center

[Diagram labels: HPSS; disks; file system fabric; switch; switch(s); pizza box as skim builder; pizza box; in protocol conversion; out protocol conversion; sea of cores fabric; Event processing node]