Shimin Chen Big Data Reading Group
description
Transcript of Shimin Chen Big Data Reading Group
![Page 1: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/1.jpg)
Gordon: Using Flash Memory to Build Fast, Power-efficient Clusters for Data-intensive ApplicationsA. Caulfield, L. Grupp, S. Swanson, UCSD, ASPLOS’09
Shimin ChenBig Data Reading Group
![Page 2: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/2.jpg)
Introduction Gordon: a flash-based system architecture for
massively parallel, data-centric computing. Solid-state disks Low-power processors Data-centric programming paradigms
Can deliver: Up to 2.5X the computation per energy of a
conventional cluster based system Increasing performance by up to 1.5X
![Page 3: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/3.jpg)
Outline Gordon’s system architecture Gordon’s storage system Configuring Gordon Discussion Summary
![Page 4: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/4.jpg)
Gordon’s System Architecture
Gordon node
16 nodes in an enclosure
![Page 5: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/5.jpg)
Prototype Photo:
http://www-cse.ucsd.edu/users/swanson/projects/gordon.html
ASPLOS’09 paper uses simulation
![Page 6: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/6.jpg)
Gordon node Configuration:
256GB of flash storage A flash storage controller (w/ 512MB dedicated DRAM) 2GB ECC DDR2 SDRAM 1.9Ghz Intel Atom processor Running a minimal linux installation
Power: no more than 19w Compared to 81w of a server
900MB/s read and write bandwidth to 256GB disk
![Page 7: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/7.jpg)
Enclosures Within an enclosure, 16 nodes plug into a
backplane that provides 1Gb Ethernet-style network
A rack holds 16 enclosures (16x16=256 nodes) 64 TB of storage 230 GB/s of aggregate IO bandwidth
![Page 8: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/8.jpg)
Programming From the SW and users’ perspectives, a Gordon
system appears to be a conventional computing cluster
Benchmarks: Hadoop
![Page 9: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/9.jpg)
Outline Gordon’s system architecture Gordon’s storage system Configuring Gordon Discussion Summary
![Page 10: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/10.jpg)
Flash Array Hardware
Flash controller
Flash memoryFlash
memoryFlash memoryFlash
memory
4 buses, each support 4 flash packages
CPU2GB
System DRAM
512MB FTL
DRAM
The flash controller has over 300 pins
![Page 11: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/11.jpg)
![Page 12: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/12.jpg)
Super-Page Three ways to
stripe data across flash memory
Horizontal: across buses
Vertical: across packages on the same bus
2-D: combined
![Page 13: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/13.jpg)
Bypassing and Write Combining Read bypassing: merging read requests to the
same page Write combining: merging write requests to the
same page
![Page 14: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/14.jpg)
Trace-Based Study
64KB super-page achieves the best performance
![Page 15: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/15.jpg)
Outline Gordon’s system architecture Gordon’s storage system Configuring Gordon Discussion Summary
![Page 16: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/16.jpg)
Workloads Map-Reduce workloads using Hadoop
Each workload is run on 8 2.4GHz Core 2 Quad machines with 8GB RAM and a single SATA hard disk with 1Gb Ethernet to collect traces.
![Page 17: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/17.jpg)
Power Model
![Page 18: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/18.jpg)
Simulation High-level trace-driven cluster simulator
Trace: second-by-second utilization (number of instructions, number of L2 cache misses, number of bytes sent and received over network, number of hard drive accesses)
Replay the trace for a conventional cluster model and a Gordon model
Detailed storage system simulators Disksim for simulation 1TB, 7200rpm SATAII hard
drive Gordon simulator
![Page 19: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/19.jpg)
![Page 20: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/20.jpg)
88 different node configurations
![Page 21: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/21.jpg)
Optimal Designs
MinT: Min Time MaxE: Max performance/watt MinP: Min Power
![Page 22: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/22.jpg)
![Page 23: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/23.jpg)
Outline Gordon’s system architecture Gordon’s storage system Configuring Gordon Discussion Summary
![Page 24: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/24.jpg)
Exploit Disks Cheap redundancy Virtualize Gorgon:
Gordon + a large disk-based data warehouse Load data from disk storage into Gordon for a job May overlap data transfer time for preparing a job with
computation
![Page 25: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/25.jpg)
System Costs
Actual price is $7200
![Page 26: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/26.jpg)
System Costs for Storing 1PB Data
![Page 27: Shimin Chen Big Data Reading Group](https://reader036.fdocuments.in/reader036/viewer/2022081603/56815ccd550346895dcade6a/html5/thumbnails/27.jpg)
Summary Use flash memory + low-power processor
(Atom) Support data intensive computing: such as Map-
Reduce operations The design choice is attractive because of
higher power/performance efficiency