Hadoop Hardware @Twitter: Size does matter!
Hadoop Hardware @Twitter: Size does matter.
@joep and @eecraft, Hadoop Summit 2013
v2.3
@Twitter #HadoopSummit2013
About us
- Joep Rottinghuis: Software Engineer @ Twitter; Engineering Manager, Hadoop/HBase team @ Twitter. Follow me @joep
- Jay Shenoy: Hardware Engineer @ Twitter; Engineering Manager, HW @ Twitter. Follow me @eecraft
- HW & Hadoop teams @ Twitter, and many others
Agenda
- Scale of Hadoop clusters
- Single versus multiple clusters
- Twitter Hadoop architecture
- Hardware investigations
- Results
Scale

Scaling limits:
- JobTracker: tens of thousands of jobs per day; tens of thousands of concurrent slots
- Namenode: 250-300 M objects in a single namespace
- Namenode at ~100 GB heap -> full GC pauses
- Shipping job jars to 1,000s of nodes
- JobHistory server at a few hundred thousand job history/conf files

(chart: where these limits hit as the number of nodes grows)
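The NameNode holds every file, directory, and block object in a single JVM heap, which is why object count is the binding limit. A back-of-the-envelope sketch of that relationship; the ~300 bytes-per-object figure is an illustrative assumption, not a number from the talk:

```python
# Rough NameNode heap estimate. Each namespace object (file, directory,
# block) costs a few hundred bytes of heap; ~300 bytes is an assumed
# ballpark used here for illustration only.
BYTES_PER_OBJECT = 300

def namenode_heap_gb(num_objects: int) -> float:
    """Estimated NameNode heap, in GB, for a given object count."""
    return num_objects * BYTES_PER_OBJECT / 1e9

# 250-300 M objects, as on the slide:
for n in (250_000_000, 300_000_000):
    print(f"{n:>12,} objects -> ~{namenode_heap_gb(n):.0f} GB heap")
```

At 300 M objects this lands around 90 GB, consistent with the ~100 GB heaps, and the resulting full-GC pauses, described above.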
When / why to split clusters?
- In principle, a preference for a single cluster: common logs, shared free space, reduced admin burden, more rack diversity
- Varying SLAs
- Workload diversity:
  - Storage intensive
  - Processing (CPU / disk IO) intensive
  - Network intensive
- Data access: hot, warm, cold
Cluster Architecture

(diagram: Twitter Hadoop cluster architecture)

Hardware investigations
Service criteria for hardware
- Hadoop does not need live HDD swap
- Twitter DC: no SLA on data nodes
- Rack SLA: only 1 rack down at any time in a cluster
Baseline Hadoop Server (~ early 2012)

(block diagram: dual E56xx sockets with 3 DIMMs each, PCH, GbE NIC, HBA, SAS expander)

Characteristics:
- Standard 2U server
- 20 servers / rack
- E5645 CPU, dual 6-core
- 72 GB memory
- 12 x 2 TB HDD
- 2 x 1 GbE

Works for the general cluster, but...
- Need more density for storage
- Potential IO bottlenecks
Hadoop Server: Possible evolution

(block diagram: dual E5-26xx or E5-24xx sockets with 4 DIMMs each, HBA, expander with 16 x 2 TB? / 16 x 3 TB? / 24 x 3 TB?, GbE NIC, 10 GbE?)

Characteristics:
- + CPU performance?
- 20 servers / rack
- Candidate for DW

Can deploy into the general DW cluster, but...
- Too much CPU for storage-intensive apps
- Server failure domain too large if we scale up disks
Rethinking hardware evolution
- Debunking myths:
  - Bigger is always better
  - One size fits all
- Back to Hadoop hardware roots: scale horizontally, not vertically
- Twitter Hadoop Server, "THS"
THS for backups

(block diagram: single E3-12xx socket with 2 DIMMs, PCH, SAS HBA, GbE NIC)

Characteristics:
- + IO performance
- Few fast cores
- E3-1230 v2 CPU
- 16 GB memory
- 12 x 3 TB HDD
- SSD boot
- 2 x 1 GbE

Storage focus:
- Cost efficient (single socket, 3 TB drives)
- Less memory needed
THS variant for Hadoop-Proc and HBase

(block diagram: single E3-12xx socket with 2 DIMMs, PCH, SAS HBA, 10 GbE NIC)

Characteristics:
- + IO performance
- Few fast cores
- E3-1230 v2 CPU
- 32 GB memory
- 12 x 1 TB HDD
- SSD boot
- 1 x 10 GbE

Processing / throughput focus:
- Cost efficient (single socket, 1 TB drives)
- More disk and network IO per socket
THS for cold cluster

(block diagram: single E3-12xx socket with 2 DIMMs, PCH, SAS HBA, GbE NIC)

Characteristics:
- Disk efficiency
- Some compute
- E3-1230 v2 CPU
- 32 GB memory
- 12 x 3 TB HDD
- 2 x 1 GbE

Combination of the previous 2 use cases:
- Space & power efficient
- Storage dense, with some processing capability
Rack-level view

                      Baseline       THS Backups     THS Proc       THS Cold
Power                 ~8 kW          ~8 kW           ~8 kW          ~8 kW
CPU sockets; DRAM     40; 1440 GB    40; 640 GB      40; 1280 GB    40; 1280 GB
Spindles; TB raw      240; 480 TB    480; 1,440 TB   480; 480 TB    480; 1,440 TB
Uplink; internal BW   20; 40 Gbps    20; 80 Gbps     40; 400 Gbps   20; 80 Gbps
TOR                   1G             1G              10G            1G
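The rack totals follow directly from the per-server specs on the earlier slides (20 dual-socket baseline servers per rack vs. 40 single-socket THS servers per rack); a quick sketch to reproduce them:

```python
# Derive rack-level totals from per-server specs.
def rack_totals(servers, sockets, dram_gb, disks, disk_tb):
    return {
        "sockets": servers * sockets,
        "dram_gb": servers * dram_gb,
        "spindles": servers * disks,
        "raw_tb": servers * disks * disk_tb,
    }

baseline = rack_totals(servers=20, sockets=2, dram_gb=72, disks=12, disk_tb=2)
backups  = rack_totals(servers=40, sockets=1, dram_gb=16, disks=12, disk_tb=3)
proc     = rack_totals(servers=40, sockets=1, dram_gb=32, disks=12, disk_tb=1)
cold     = rack_totals(servers=40, sockets=1, dram_gb=32, disks=12, disk_tb=3)

print(baseline)  # {'sockets': 40, 'dram_gb': 1440, 'spindles': 240, 'raw_tb': 480}
print(backups)   # {'sockets': 40, 'dram_gb': 640, 'spindles': 480, 'raw_tb': 1440}
```

Halving the sockets per server while doubling servers per rack keeps sockets per rack constant at 40 but doubles spindles: exactly the horizontal-scaling trade the THS design makes.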
Processing performance comparison

Benchmark                                             Baseline Server    THS (-Cold)
TestDFSIO (write, replication = 1)                    360 MB/s / node    780 MB/s / node
TeraGen (30 TB, replication = 3)                      1:36 hrs           1:35 hrs
TeraSort (30 TB, replication = 3)                     6:11 hrs           4:22 hrs
2 parallel TeraSorts (30 TB each, replication = 3)    10:36 hrs          6:21 hrs
Application #1                                        4:37 min           3:09 min
Application set #2                                    13:3 hrs           10:57 hrs

Performance benchmark setup:
- Each cluster: 102 nodes of the respective type
- Efficient server = 3 racks; Baseline = 5+ racks
- "Dated" stack: CentOS 5.5, Sun 1.6 JRE, Hadoop 2.0.3
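The TeraGen/TeraSort rows can be reproduced with the stock Hadoop examples jar; a sketch, where the jar name and HDFS paths are placeholders to adapt to your installation (the talk's exact invocations are not given):

```shell
# TeraGen writes fixed 100-byte rows, so a 30 TB input is 300 billion rows.
ROWS=$((30 * 1000 * 1000 * 1000 * 1000 / 100))   # 300,000,000,000

# Generate the 30 TB input at replication 3, then sort it; wall-clock
# time of the terasort job is the figure reported in the table above.
hadoop jar hadoop-mapreduce-examples.jar teragen \
    -Ddfs.replication=3 "$ROWS" /benchmarks/teragen-30tb
hadoop jar hadoop-mapreduce-examples.jar terasort \
    /benchmarks/teragen-30tb /benchmarks/terasort-30tb
```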
Results

LZO performance comparison

(chart: LZO performance comparison)
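For context, Twitter's LZO support comes from the open-source hadoop-lzo package; a minimal core-site.xml sketch for registering the codecs (this assumes hadoop-lzo and the native LZO libraries are installed, and is not configuration shown in the talk):

```xml
<!-- core-site.xml: register the LZO codecs provided by hadoop-lzo -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```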
Recap
- At a certain scale it makes sense to split into multiple clusters. For us: RT, PROC, DW, COLD, BACKUPS, TST, EXP
- For large enough clusters, depending on the use case, it may be worth choosing different HW configurations
Conclusion

@Twitter our "Twitter Hadoop Server" not only saves many $$$, it is also faster!
#ThankYou
@joep and @eecraft
Come talk to us at booth 26