HBase Operations at Facebook
Paul Tuckfield
January 2012
HBase Operations
The HBase cells
▪ Many HBase cells
▪ 3 versions, several minor branches/revs
▪ Mostly uniform host types
▪ Varying network topologies/rack topologies
▪ Varying sizes
▪ We (Ryan, Alex, and I) are the “DBAs” or “SREs” of HBase at Facebook
▪ Moving towards slightly more differentiation of roles for teams at Facebook as the HBase effort matures
The use cases: some live, some not
▪ Titan (user-facing messaging)
▪ Facebook-specific time series
  ▪ Puma (user-facing stats)
  ▪ ODS (system metrics)
▪ Hashout
▪ Eris: “multi-tenant” “dormitory” for incubation of new projects
▪ CDB: a few use cases replacing what would have been smallish sharded MySQL setups
▪ ODS-HBase: Facebook instrumentation and alerting system, currently on MySQL
▪ Prototype/testing of general user data on HBase
We have some important use cases running on HBase, but they are small compared to what runs on MySQL and Hadoop. That said, some of these use cases are critical, and even a small fraction of the very large Facebook environment is still pretty large.
SMC / HSH: basic Facebook “cloud” tools used for HBase
• SMC:
• User defined sets of host:port “services”
• Arbitrary metadata
• Machine states (enabled, disabled)
• HSH
• Better version of dsh
• Integration with SMC
Other examples besides deploy:
- Cluster start/stop
- Autostart
- Scan ports
- Scan logs
Deploy: push slave info to SMC, then use SMC/HSH to push code to the hosts that make up the cell (see the sketch after the diagram below)
[Diagram: a deploy tool/utility pulls code from SVN/Git, reads host membership from SMC, and pushes to the HBase cell]
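SMC and HSH are internal tools, so the sketch below is only illustrative: the smcc and hsh invocations are invented stand-ins for the real interfaces, and the target path is made up. It shows the shape of the flow: read the cell’s host list from the service registry, rsync the build, then fan out a restart.

```python
import subprocess

def smc_hosts(cell):
    """Hypothetical: ask SMC for the hosts in a cell ('smcc' is an invented CLI)."""
    out = subprocess.check_output(["smcc", "list-hosts", cell])
    return [line.strip() for line in out.decode().splitlines() if line.strip()]

def deploy(cell, build_path):
    """Push a build to every host in the cell, then restart regionservers."""
    for host in smc_hosts(cell):
        # rsync the HBase build to each host (install path is an assumption)
        subprocess.check_call(["rsync", "-a", build_path, f"{host}:/usr/local/hbase/"])
    # HSH-style fanout: run one command across the whole cell (flags invented)
    subprocess.check_call(
        ["hsh", "--cell", cell, "--", "hbase-daemon.sh", "restart", "regionserver"]
    )
```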
HBase Maintenance: “It’s self-healing”
▪ Backups
  ▪ Stage 1, 2, 3
▪ Repairs
  ▪ FBAR
▪ Upgrades
  ▪ Rolling, cold
▪ Rack concerns
Attempt to standardize bandwidth/rack-dispersion tradeoffs
Running on several different generations of network core/rack-switch combos, some slow, some fast
Rack-oriented placement would have better intra-cell performance in worst-case situations (which are not uncommon)
“Horizontally” organized cells can hopefully survive single-rack issues
I’m not so sure it’s a good thing: the network is pretty reliable, so why emphasize uplink-failure tolerance? Maybe we should have sharded HBase setups
Two cells of 40 hosts each, spread across 5 racks rather than “vertically”
[Diagram: Cell 1, Cell 2, and spares, each spread horizontally across the racks]
Things we monitor/alert on
Monitor hundreds of variables in ODS, the Facebook time-series database
Alert /SMS on:
• Hbck failures
• DFS fsck failures
• Probe / scan a table from client
• Throughput rates in some cases
• Most application alarms are left to other teams, in an attempt to remain a relatively generic service to the rest of Facebook
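The “probe/scan a table from client” check can be illustrated with a minimal canary, here using the open-source happybase Thrift client rather than anything Facebook-internal; the table name, row key, and Thrift endpoint are assumptions:

```python
import time
import happybase  # open-source HBase Thrift client; stands in for the internal prober

def probe(thrift_host, table=b"canary", row=b"probe", timeout_s=5.0):
    """Read one row end to end; return latency in seconds, raise on failure."""
    start = time.time()
    conn = happybase.Connection(thrift_host, timeout=int(timeout_s * 1000))
    try:
        data = conn.table(table).row(row)  # single-row get through the Thrift gateway
        if not data:
            raise RuntimeError("canary row missing")
    finally:
        conn.close()
    return time.time() - start

if __name__ == "__main__":
    print(f"probe ok in {probe('hbase-thrift.example.com'):.3f}s")  # alert/SMS hook here
```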
Troubleshooting: typical problems
▪ Regionserver/slave apocalypse
▪ fsck inconsistencies
▪ hbck inconsistencies
▪ Long recoveries/timeouts after failures
▪ Wedged regions/meta info
▪ Log splitting during recovery
▪ Memory/thread exhaustion -> regionserver deaths
▪ GC pauses, tuning-related deaths
▪ Rack-switch bandwidth issues
Setting up HBase Clusters: doing all the things
▪ HBase versions, 0.89 vs 0.92
▪ Rack and host selection
▪ Imaging and partitioning
▪ Populating SMC tiers
▪ Building from templates
▪ Pushing
▪ Starting up everything!
Tools use the $CELLNAME env var
Typical session:
Run “setcell” to set the environment; all subsequent commands are “pointed at” the given HBase cell
- hbscan to see the status of hosts in that cell
- hblog to look at logs
- hbprocess (like showprocess)
- Etc.
Typical operations: setcell/hbhost
Typically start with “setcell”
hbhost just shows what is in SMC for this cell
“hbhost nn” or “hbhost master” to ssh to the given host without caring about hostnames.
hbscan: a Python “nmap”-like scan
Run hbscan to get a quick impression of the state of the cell
Queries SMC for topology
Scans all hosts for all known ports (TCP connect)
Takes a few seconds
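hbscan itself is internal; a minimal sketch of the same idea follows, doing a TCP connect scan of every host for the well-known daemon ports of that era (the port map and host list are assumptions):

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Well-known daemon ports for 0.89/0.92-era HBase/HDFS; adjust per deployment
PORTS = {"master": 60000, "regionserver": 60020, "namenode": 8020, "datanode": 50010}

def listening(host, port, timeout=1.0):
    """TCP connect probe: True if something accepts on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def scan(hosts):
    """One status line per host; the pool keeps the whole cell to a few seconds."""
    with ThreadPoolExecutor(max_workers=64) as pool:
        for host in hosts:
            futures = {n: pool.submit(listening, host, p) for n, p in PORTS.items()}
            print(host, " ".join(f"{n}={'up' if f.result() else 'DOWN'}"
                                 for n, f in futures.items()))
```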
hblog: “normalize” and summarize log lines
Attempt to remove entropy to get to the “core” message
Fingerprint with md5
Summarize by md5/host
Columns -> cluster-wide errors
Rows -> this particular node is jacked
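A minimal sketch of the hblog idea: strip high-entropy fields (timestamps, numbers, hex IDs, paths) so lines collapse to their core message, fingerprint with md5, then count by (host, fingerprint). The normalization patterns here are assumptions, not hblog’s actual rules:

```python
import hashlib
import re
from collections import Counter

# Illustrative entropy-removal rules; order matters (timestamps before bare numbers)
PATTERNS = [
    (re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}[.,]?\d*"), "<TS>"),
    (re.compile(r"0x[0-9a-fA-F]+"), "<HEX>"),
    (re.compile(r"/[\w./-]+"), "<PATH>"),
    (re.compile(r"\d+"), "<N>"),
]

def fingerprint(line):
    """Normalize a log line to its 'core' message and return a short md5."""
    for pattern, token in PATTERNS:
        line = pattern.sub(token, line)
    return hashlib.md5(line.encode()).hexdigest()[:8]

def summarize(lines_by_host):
    """Counts keyed by (host, fingerprint): a fingerprint hot on every host is a
    cluster-wide error; one host hot on many fingerprints means that node is jacked."""
    counts = Counter()
    for host, lines in lines_by_host.items():
        for line in lines:
            counts[(host, fingerprint(line))] += 1
    return counts
```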
Observation: the cluster is as slow as the slowest regionserver
A common pattern is to ingest data and multiput to HBase from many frontends
The larger the multiput, the more likely clients will serialize/collide on a hot regionserver
Don’t look at just the average. Look at the average *and* the outliers
But which metric?
[Diagram: lines drawn from every frontend to every regionserver]
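One concrete way to look at “the average *and* the outliers”: whatever metric is chosen (per-regionserver put latency here, purely as an assumption), compare the mean against the worst server, because the slowest regionserver gates every multiput that touches it:

```python
import statistics

def latency_summary(latency_by_rs):
    """latency_by_rs: hypothetical map of regionserver -> put latency in ms."""
    values = sorted(latency_by_rs.values())
    worst = max(latency_by_rs, key=latency_by_rs.get)
    return {
        "mean": statistics.mean(values),
        "p99": values[int(0.99 * (len(values) - 1))],  # crude p99 over the fleet
        # the gap between mean and worst is the signal: one hot regionserver
        # serializes every client whose multiput includes its regions
        "worst": (worst, latency_by_rs[worst]),
    }
```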
Observation: evolution/selection of balance
In a few cases, performance issues or load-related bugs cause hosts to crash
When a crash happens, regions move around
A new “hand is drawn” with different combinations of regions
When the combination of regions is such that there are no more deaths… balanced!
Observation: balancing could be much better
• In cases where skew seems to dominate, we’ve experimented with manual region placement/splitting
• Developed basic JRuby/Groovy scripts using HBaseAdmin (see the sketch below)
• Maybe support ‘user space’ balancers
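Our actual scripts were JRuby/Groovy against HBaseAdmin; as an illustration of the same manual-placement idea, here is a sketch that emits `move` commands for the hbase shell, shedding regions from the hottest servers onto the coolest one (the load numbers and encoded region names are made-up inputs):

```python
def plan_moves(load_by_server, regions_by_server, target_load):
    """Greedy sketch: one hbase-shell 'move' per overloaded server.
    Server names must be the full 'host,port,startcode' form the shell expects."""
    coolest = min(load_by_server, key=load_by_server.get)
    commands = []
    for server, load in sorted(load_by_server.items(), key=lambda kv: -kv[1]):
        if server == coolest or load <= target_load:
            continue
        region = regions_by_server[server][0]  # shed one encoded region name
        commands.append(f"move '{region}', '{coolest}'")
    return commands

# Usage: pipe the emitted commands into the shell
#   $ python plan_moves.py | hbase shell
```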