Zero-downtime Hadoop/HBase Cross-datacenter Migration

Transcript of Zero-downtime Hadoop/HBase Cross-datacenter Migration

Page 1: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Zero-downtime Hadoop/HBase Cross-datacenter Migration

SPN, Trend Micro
Scott Miao & Dumbo team members
Sep. 19, 2015

Page 2: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Who am I
• Scott Miao
• RD, SPN, Trend Micro
• Worked on the Hadoop ecosystem since 2011
• Expertise in HDFS/MR/HBase
• Contributor to HBase/HDFS
• Speaker at HBaseCon 2014
• @takeshi.miao

Our blog ‘Dumbo in TW’: http://dumbointaiwan.blogspot.tw/
HBaseCon 2014 sharing:
http://www.slideshare.net/HBaseCon/case-studies-session-6
https://vimeo.com/99679688

Page 3: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Agenda
• What problem we suffered
• IDC migration
• Zero downtime migration
• Wrap up

Page 4: Zero-downtime Hadoop/HBase Cross-datacenter Migration

What problem we suffered?

Page 5: Zero-downtime Hadoop/HBase Cross-datacenter Migration

#1 Network bandwidth insufficient

Page 6: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Old IDC Layout (diagram: devices view and servers view)
• In service since 2008; no physical space left
• 41U racks per POD (x n PODs) behind the core switches
• Network: 1 Gb links from servers to the TOR switch, 20 Gb from POD to core switch
• Hadoop + services network traffic: ~12 Gb usage of the 20 Gb uplink
• HD NN (x 2): cpu 8 cores, mem 72 GB, disk 4 TB
• HD DN (x n): cpu 12 cores, mem 128 GB, disk 6 TB
• Other services share the same upstream devices

Page 7: Zero-downtime Hadoop/HBase Cross-datacenter Migration

#2 Data storage capacity insufficient

Page 8: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Est. Data Growth

• ~2x data growth

Page 11: Zero-downtime Hadoop/HBase Cross-datacenter Migration

What options
• Enhance old IDC
  – Replace the 1 Gb with a 10 Gb network topology
  – Adjust server locations
  – Any chance for more physical space?
• Migrate to new IDC
  – 10 Gb network topology
  – Server locations well defined
  – More physical space

Page 12: Zero-downtime Hadoop/HBase Cross-datacenter Migration

What options
• Migrate to public cloud
  – Provision on demand
    • Instance type (NIC/CPU/Mem/Disk) and amount
  – Pay as you go
  – Need to optimize our existing services

Page 13: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Migrate to new IDC!

http://gdimitriou.eu/?m=200912

Page 14: Zero-downtime Hadoop/HBase Cross-datacenter Migration

IDC Migration

Page 15: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Recap…

Network bandwidth and data storage capacity: insufficient

Page 16: Zero-downtime Hadoop/HBase Cross-datacenter Migration

New IDC Layout (diagram: devices view and servers view)
• SPN Hadoop POD with 41U racks; room to grow up to 14 racks
• Network: 10 Gb links from servers to the TOR switch, 40 Gb from POD to core, 160 Gb at the core
• Network traffic becomes far less of a concern
• HD NN (x 2): cpu 16 cores, mem 128 GB, disk 10 TB
• HD DN (x n): cpu 24 cores, mem 196 GB, disk 72 TB
• In total, 2~3x data storage capacity in terms of our data growth
• Other services share the same upstream devices

Page 17: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Now what? Don’t forget our beloved elephant~

YARN
https://gigaom.com/2013/10/25/cloudera-ceo-were-taking-the-high-profit-road-in-hadoop/
http://www.pragsis.com/blog/how_install_hadoop_3_commands

Page 18: Zero-downtime Hadoop/HBase Cross-datacenter Migration

YARN abstracts the computing frameworks from Hadoop

http://hortonworks.com/hadoop/yarn/

Page 19: Zero-downtime Hadoop/HBase Cross-datacenter Migration

So we are not only doing a migration,

but an upgrade as well

Page 20: Zero-downtime Hadoop/HBase Cross-datacenter Migration

TMH6 vs. TMH7

Project    | TMH6         | TMH7   | Highlights
-----------|--------------|--------|--------------------------
Hadoop     | 2.0.0 (MRv1) | 2.6.0  | YARN + MRv2, YARN + ???
HBase      | 0.94.2       | 0.98.5 | MTTR impr., Stripe Comp.
Zookeeper  | 3.4.5        | 3.4.6  |
Pig        | 0.10.0       | 0.14.0 | Pig on Tez
Sqoop1     | 1.4.2        | 1.4.5  |
Oozie      | 4.0.1        | 4.0.1  |
JVM        | Java6        | Java7  | G1GC support

Page 21: Zero-downtime Hadoop/HBase Cross-datacenter Migration

How do we test our TMH7?
How do our services port to and test with TMH7?

Apache Bigtop PMC Evans Ye comes to the rescue in the next session

Page 22: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Something about HW
• CPU
  – More cores
• Memory
  – More memory
• Disk
  – Storage capacity
• Network
  – 10 Gb
  – Topology
• # of nodes per rack
  – Do PoC

http://www.desktopwallpapers4.me/computers/hardware-28528/

Page 23: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Migration + Upgrade
• Span two IDCs -> upgrade -> phase out old one

Old IDC

20 Gb

New IDC

Page 24: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Migration + Upgrade
• Build new one -> migrate -> phase out old one

Old IDC

20 Gb

New IDC

Page 25: Zero-downtime Hadoop/HBase Cross-datacenter Migration

1. Build new one
2. Migrate
3. Phase out old one

Page 26: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Are we done? We are not even in the game!

Page 27: Zero-downtime Hadoop/HBase Cross-datacenter Migration

SLA for PROD Services

Various data access patterns

Page 28: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Zero downtime migration

Page 30: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Data Access Pattern Analysis

Hadoop/HDFS/MR

Page 31: Zero-downtime Hadoop/HBase Cross-datacenter Migration

IDC data flow (diagram): Internet -> Log collectors -> Message queues -> File compactors -> Hadoop cluster, with data sourcing services feeding in and application services consuming the results.

1. Data in: new files put (mins) to HDFS
2. Data proc: proc files with Pig/MR (hourly/daily) to HDFS
3. Data out: get result files from HDFS, do further proc
4. Service: serve user requests

Page 32: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Data access patterns for Hadoop/HDFS/MR
• Data in
  – New file put in a couple of mins
• Computation
  – Process data hourly or daily
• Data out
  – Result files fetched by services for further process

Page 33: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Categorize Data
• Hot data
  – Ingest files in mins: new data files put into Hadoop continuously
  – Digest by Pig/MR for services hourly or daily
  – Needed history data files: usually within a couple of months
  – Sync data by
    • Replicating the data-streaming ingestion (message queues + file compactors)
    • distcp – every few mins
• Cold data
  – All data except hot
  – Time spans a couple of years
  – For monthly/quarterly/yearly report purposes
  – Ad-hoc query
  – Copy data by: distcp – run it and leave it alone
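A cron'd distcp for hot data can be sketched as below. Paths and NameNode addresses are hypothetical placeholders; `-update` skips files already present on the target with the same size and checksum, so running the job every few minutes only moves the delta:

```
# Hypothetical source/target paths and NameNode hosts.
# -update copies only new or changed files since the last run.
hadoop distcp -update \
  hdfs://old-nn:8020/data/hot \
  hdfs://new-nn:8020/data/hot
```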

Page 34: Zero-downtime Hadoop/HBase Cross-datacenter Migration

34

Kerberos federation among our clusters
• Please wait for our next session
  – “Multi-Cluster Live Synchronization with Kerberos Federated Hadoop” by Mammi Chang, Dumbo team

Old IDC: TMH6 stg, TMH6 prod
New IDC: TMH7 stg, TMH7 prod

Page 35: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Zero downtime migration for Hadoop/HDFS/MR (diagram)
• Old IDC: Hadoop (tmh6), Old Service 1, Old Service 2, log collectors, message queues, file compactors
• New IDC: Hadoop (tmh7), Old Service 1’, New Service 1, log collectors, message queues, file compactors
• 20 Gb link between the two IDCs
• Sync hot data both through the replicated streaming ingestion and through distcp
• Copy cold data once with distcp

Page 36: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Need services’ cooperation
• From the services’ point of view there is no downtime
• Latency for hot-data sync
  – May add latency of a few minutes
  – Due to the distcp cron job running every couple of minutes
• Need services to
  – Adjust their jobs to delay a couple of minutes before running

Page 37: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Seems pretty! So are we done?

Don’t forget our HBase XD

Page 38: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Data Access Pattern Analysis
HBase

Page 39: Zero-downtime Hadoop/HBase Cross-datacenter Migration

IDC data flow (diagram): Internet -> Log collectors -> Message queues -> File compactors -> Hadoop cluster, with data sourcing services feeding in and application services consuming the results.

1. Data in: new files put (mins) to HDFS
2. Data proc: proc files with Pig/MR (hourly/daily) to HBase
3. Data out: random read from HBase
4. Service: serve user requests
5. Data in: random writes to HBase

Page 40: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Data access patterns for HBase
• Data in
  – Random write to HBase
  – Process/write data hourly or daily
• Data out
  – Random read from HBase

Page 41: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Considerations for HBase data sync
• What do we want?
  – All HBase data synced between old and new
• Arrange useless regions (region merge)
  – Rowkey: ‘<key>-<timestamp>’
  – hbase.hregion.max.filesize: raised from 1 GB to 4 GB
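The cluster-wide default for the region size lives in hbase-site.xml, but the limit can also be set per table from the hbase shell; the table name below is a placeholder:

```
hbase shell
alter '<table-name>', MAX_FILESIZE => '4294967296'   # 4 GB
```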

Page 42: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Considerations for HBase data sync
• Incompatible changes between the old & new HBases
  – API binary incompatible
  – HDFS-level folder structure changed
  – HDFS-level metadata file format changed
    • Does not include HFileV2

Page 43: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Tools for HBase data sync

Tool                | Impl. tech.                  | API compatible | Service impact                      | Data chunk boundary
--------------------|------------------------------|----------------|-------------------------------------|---------------------------------
CopyTable           | API client call              |                |                                     |
Cluster Replication | API client call              |                |                                     |
Completebulkload    | HFile                        |                | Need to pend writes and flush table | Based on when writes are pended
Export/Import       | SequenceFile + KeyValue + MR |                |                                     | Set start/end timestamp; based on the previous chunk

http://hbase.apache.org/book.html#tools
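The Export/Import route with a time range maps to the stock MapReduce tools that ship with HBase; table name, output dir, and timestamps below are placeholders:

```
# Export rows whose KeyValue timestamps fall within [starttime, endtime)
hbase org.apache.hadoop.hbase.mapreduce.Export \
  '<table-name>' /tmp/<table-name>-export 1 <start-timestamp> <end-timestamp>

# On the target cluster, load the exported SequenceFiles back
hbase org.apache.hadoop.hbase.mapreduce.Import \
  '<table-name>' /tmp/<table-name>-export
```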

Page 44: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Support tools for HBase sync
• Pre-splits generator
  – Run on TMH6
  – Deals with the region-merge issue
  – Generates a pre-splits rowkey file
  – Create the new HTable on TMH7 with this file

gen-htable-presplits.sh /user/SPN-hbase/<table-name>/ <region-size-bytes> <threshold> > /tmp/<table-name>-splits.txt

hbase shell
create '<table-name>', '<column-family-1>', SPLITS_FILE => '/tmp/<table-name>-splits.txt'

Page 45: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Support tools for HBase sync
• RowCount with timerange
  – Supported on both TMH6 & TMH7
  – Checks the imported data
  – Not officially supported; we enhanced the stock one to make our own

rowCounter.sh <table-name> --time-range=<start-timestamp>,<end-timestamp>
# ...
# com.trendmicro.spn.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
#   ROWS=10892133
# File Input Format Counters
#   Bytes Read=0
# File Output Format Counters
#   Bytes Written=0

Page 46: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Support tools for HBase sync
• Snapshot
  – On TMH7
  – Taken after every pass of the imported-data check
  – Roll back to the previous snapshot if the data check fails

hbase shell
snapshot '<table-name>', '<table-name>-<start-timestamp>-<end-timestamp>'
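Rolling back to a previous snapshot when a check fails uses the standard hbase shell sequence (the table must be disabled first; names are placeholders):

```
hbase shell
disable '<table-name>'
restore_snapshot '<table-name>-<start-timestamp>-<end-timestamp>'
enable '<table-name>'
```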

Page 47: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Support tools for HBase sync
• DateTime <-> Timestamp

# get current java timestamp (long)
date +%s%N | cut -b1-13
# get current hour java timestamp (long)
date --date="$(date +'%Y%m%d %H:00:00')" +%s%N | cut -b1-13
# get current hour -1 java timestamp (long)
date --date="$(date --date='1 hour ago' +'%Y%m%d %H:00:00')" +%s%N | cut -b1-13
# timestamp to date
date -d '@1436336202' # must be 10 digits, from left to right

Page 48: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Zero downtime migration for HBase (diagram)
• Old IDC: hbase-tmh6 / Hadoop-tmh6, staging, ServiceA, ServiceB
• New IDC: hbase-tmh7 / Hadoop-tmh7, staging, ServiceB

1. Confirm KV timestamp with ServiceB
2. Export data to HDFS with timestamp
3. Gen splits file
4. distcp data to TMH7
5. Create HTable with splits
6. Import data to HTable
7. Verify data by rowcount w/ timestamp
8. Create snapshot
9, 11. Sync data through #2~8 (skip 3, 5)
10. ServiceB staging test starts
12. Grant ‘RW’ on HTable to ServiceB
13. Install ServiceB in new IDC
14. Start ServiceB in new IDC
15. Done
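Step 12’s grant maps to the hbase shell ACL command; the user and table names below are placeholders:

```
hbase shell
grant '<serviceB-user>', 'RW', '<table-name>'
```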

Page 49: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Need services’ cooperation
• There will still be a small data gap
  – It may be minutes
  – Data sync to HTable -> service starts up and runs -> final data sync to HTable; the writes in between are the gap
• Is the gap sensitive to services?
  – If it is not: wait for our final data sync
  – If it is: services need to direct their writes to both clusters

Page 50: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Wrap up

Page 51: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Wrap up
• Analyze access patterns
  – Batch? Real time? Streaming?
  – Cold data? Hot data?
• Keep it simple!
  – Use native utils as far as you can
• Rehearsal! Rehearsal! Rehearsal!
• Communicate with your users closely

Page 52: Zero-downtime Hadoop/HBase Cross-datacenter Migration

One day… “How did your migration go?” “My migration is done!”

“My migration… has done me in.” (a Chinese pun: “完了” means both “finished” and “doomed”)

Listen to this talk and be blessed!

Page 53: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Q & A

Page 54: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Thank You

Page 55: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Backups

Page 56: Zero-downtime Hadoop/HBase Cross-datacenter Migration

What items need to take care of
• CPU
  – Use more cores
    • One MR task process uses 1 CPU core
    • Single-core clock rates do not increase much anymore
  – Do the math to compare CPU cores for old and new:

(cores-per-old-machine * amount-of-machines * increase-percent) / cores-per-new-machine = amount-of-new-machines

e.g. from 8-core machines to 24-core machines, with 1.5x higher capacity:
(8 * 10 * 150%) / 24 = 120 / 24 = 5

P.S. Consider enabling hyper-threading [1]; the # of cores then doubles, but 1/3 of the doubled cores should be kept for the OS

1. Hortonworks, Corp., Apache Hadoop Cluster Configuration Guide, 2013 Apr., p. 15.
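The core-count formula above can be checked with plain shell arithmetic (values taken from the example; the final division rounds up):

```shell
# Core-based node count: (old_cores * old_machines * growth%) / new_cores,
# rounded up. Values are the example from the slide.
old_cores=8; old_machines=10; growth_pct=150; new_cores=24
needed_cores=$(( old_cores * old_machines * growth_pct / 100 ))
new_machines=$(( (needed_cores + new_cores - 1) / new_cores ))
echo "$new_machines"   # 5
```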

Page 57: Zero-downtime Hadoop/HBase Cross-datacenter Migration

What items need to take care of
• Memory
  – Total memory much higher than our old cluster
  – Consider next-gen computing frameworks

((per-slot-gigabytes * total-slots + hbase-heap-gigabytes) * 120%-os-mem) * increase-percent / mem-per-new-machine = amount-of-new-machines

e.g. 8 slots with 2 GB each per old machine:
(((2GB * 80 + 8GB) * 120%) * 300%) / 192GB = (168GB * 120%) * 300% / 192GB =~ 4
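The same memory check in shell arithmetic (example values from the slide; rounding up at the end):

```shell
# Memory-based node count, per the formula above (slide's example values).
per_slot_gb=2; total_slots=80; hbase_heap_gb=8; mem_per_new=192
needed_gb=$(( (per_slot_gb * total_slots + hbase_heap_gb) * 120 / 100 ))  # +20% for OS
needed_gb=$(( needed_gb * 300 / 100 ))                                    # 3x growth
new_machines=$(( (needed_gb + mem_per_new - 1) / mem_per_new ))
echo "$new_machines"   # 4
```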

Page 58: Zero-downtime Hadoop/HBase Cross-datacenter Migration

What items need to take care of
• Disk
  – 2~3x storage capacity to fulfill our BIG data size
  – Hot-swapping support
  – One disk/partition serves 2~3 processes (MR tasks)
• Network
  – Network topology changed (as previous)
  – 10 Gb NIC for Hadoop nodes

total-cores / (disks-per-new-machine * amount-of-new-machines) = amount-of-processes-per-disk
e.g. with total cores of 120: 120 / (12 * 5) = 2
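The processes-per-disk ratio above, again as shell arithmetic with the slide's example values:

```shell
# Processes per disk, per the formula above (slide's example values).
total_cores=120; disks_per_machine=12; machines=5
procs_per_disk=$(( total_cores / (disks_per_machine * machines) ))
echo "$procs_per_disk"   # 2
```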

Page 59: Zero-downtime Hadoop/HBase Cross-datacenter Migration

What items need to take care of
• Rack
  – Power consumption & cooling
  – One rack can support only 15 of our Hadoop nodes, instead of 20
  – Ask your HW vendor for a PoC !!
    • Transactional workload (heavy IO load)
    • Computation workload (100% CPU workload)
    • Memory-intensive workload (full memory usage)
• New Hadoop TMH7
  – Build the new one first -> migrate -> phase out the old one

Page 60: Zero-downtime Hadoop/HBase Cross-datacenter Migration

Need services’ cooperation
• Services need to port their code to TMH7
• We released a dev env. (all-in-one Hadoop) for services to test in advance
  – VMWare image (OVF)
  – VagrantBox
  – Docker image
• A Jira project for users to submit issues, if any