HDFS NameNode HA in CDH4

2012/07/06 NameNode HA in CDH4

Transcript of HDFS NameNode HA in CDH4

Page 1: HDFS NameNode HA in CDH4

2012/07/06

NameNode HA in CDH4

Page 2: HDFS NameNode HA in CDH4

Background

• Prior to Hadoop 2.0.0, the NameNode was a single point of failure (SPOF) in an HDFS cluster.

Page 3: HDFS NameNode HA in CDH4

Approach and Terminology

• Initial goal is Active-Standby

• Terminology

– Active NN: Actively serves the read/write operations from the clients

– Standby NN: Waits, becomes active when the Active dies or is unhealthy

– Hot Standby: Standby has all or most of the Active’s state and can start immediately
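The Active/Standby roles above can be illustrated with a toy model. This is a minimal sketch, not NameNode code: the class `NameNodeModel`, the function `fail_over`, and the names `nn1`/`nn2` are hypothetical, and a real cluster involves considerably more than flipping a state flag.

```python
from enum import Enum


class HAState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


class NameNodeModel:
    """Toy stand-in for a NameNode; tracks only its HA role and health."""

    def __init__(self, name, state):
        self.name = name
        self.state = state
        self.healthy = True


def fail_over(active, standby):
    """Promote the standby if the active NN is dead or unhealthy.

    Returns the NameNodeModel that is active after the check.
    """
    if active.healthy:
        return active  # Active is fine, nothing to do
    active.state = HAState.STANDBY  # demote the failed node (illustrative only)
    standby.state = HAState.ACTIVE  # the standby takes over
    return standby


nn1 = NameNodeModel("nn1", HAState.ACTIVE)
nn2 = NameNodeModel("nn2", HAState.STANDBY)
nn1.healthy = False          # simulate the Active becoming unhealthy
current = fail_over(nn1, nn2)
print(current.name, current.state.value)  # prints "nn2 active"
```

Because the Standby is "hot" in this model (it already exists and only changes role), the switch is a single state transition rather than a cold start.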

Page 4: HDFS NameNode HA in CDH4

High-level Architecture

Page 5: HDFS NameNode HA in CDH4

Hardware resources

• NameNode machines

– Should have equivalent hardware to each other

• Shared storage

– Both NameNodes can have read/write access

– Only a single shared directory is supported

– High-quality dedicated NAS appliance is recommended

• Secondary NameNode is not necessary
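As a rough illustration of how the single shared directory might be declared, the sketch below renders a Hadoop-style `<configuration>` fragment. The property name `dfs.namenode.shared.edits.dir` is the HDFS setting for the shared edits directory; the mount path is a hypothetical NFS location, and the helper `render_site_xml` is invented for this example. Check the exact property set against the documentation for the release in use.

```python
import xml.etree.ElementTree as ET


def render_site_xml(props):
    """Render a dict of property names/values as a Hadoop-style configuration document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical value: an NFS mount that both NameNode machines can read and write.
ha_props = {
    "dfs.namenode.shared.edits.dir": "file:///mnt/filer1/dfs/ha-name-dir-shared",
}

xml_text = render_site_xml(ha_props)
print(xml_text)
```

The same fragment would need to be present on both NameNode machines, since both must point at the one shared directory.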

Page 6: HDFS NameNode HA in CDH4

Automatic Failover

• Introduces two new components

– ZooKeeper

– ZKFailoverController (abbreviated as ZKFC)

• ZKFC is a ZooKeeper client

• Each of the machines which runs a NameNode also runs a ZKFC

• ZKFC is responsible for:

– Health monitoring

– ZooKeeper session management

– ZooKeeper-based election
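The three ZKFC responsibilities listed above can be sketched with an in-memory simulation. This is an assumption-laden toy, not the ZKFC implementation and not the ZooKeeper client API: `ToyZooKeeper`, `ToyZKFC`, and `tick` are hypothetical names, and the "lock" only mimics the idea of an ephemeral node that vanishes when its owner's session ends.

```python
class ToyZooKeeper:
    """In-memory stand-in for a ZooKeeper ensemble holding one ephemeral lock."""

    def __init__(self):
        self.lock_owner = None  # session id that currently holds the lock

    def try_acquire(self, session_id):
        # Succeeds only if nobody holds the lock (or the caller already does).
        if self.lock_owner in (None, session_id):
            self.lock_owner = session_id
            return True
        return False

    def close_session(self, session_id):
        # An ephemeral lock disappears when its owner's session ends.
        if self.lock_owner == session_id:
            self.lock_owner = None


class ToyZKFC:
    """Toy failover controller paired with one (simulated) NameNode."""

    def __init__(self, name, zk):
        self.name = name
        self.zk = zk
        self.nn_healthy = True   # result of health monitoring
        self.nn_active = False   # whether the local NN should be active

    def tick(self):
        """One monitoring round: check health, then keep or seek the lock."""
        if not self.nn_healthy:
            # Unhealthy NN: end the session so the peer can win the election.
            self.zk.close_session(self.name)
            self.nn_active = False
        else:
            self.nn_active = self.zk.try_acquire(self.name)
        return self.nn_active


zk = ToyZooKeeper()
fc1, fc2 = ToyZKFC("zkfc1", zk), ToyZKFC("zkfc2", zk)
first = (fc1.tick(), fc2.tick())    # fc1 wins the lock, fc2 stays standby
fc1.nn_healthy = False              # simulate the Active NN becoming unhealthy
second = (fc1.tick(), fc2.tick())   # fc1 releases the lock, fc2 takes over
print(first, second)  # prints "(True, False) (False, True)"
```

Running one controller per NameNode machine, as the slide describes, means each NameNode's health is judged locally while the decision about who is active is made through the shared lock.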

Page 7: HDFS NameNode HA in CDH4

Automatic Failover

[Diagram: Active NN and Hot Standby NN with a shared dir on NFS; one ZKFC beside each NN, each holding a session with the ZK ensemble (three ZK nodes); three DNs; arrows labeled Heartbeat and Block Reports.]

Page 8: HDFS NameNode HA in CDH4

Appendix

• High Availability Framework for HDFS NN

– HDFS-1623

• HDFS portion of ZK-based FailoverController

– HDFS-2185

Page 9: HDFS NameNode HA in CDH4

Questions?