An Overview. Credits Author: Michael Guenther Editor: Aaron Loucks Dancing Elephants: Michael V....

HADOOP An Overview


Page 1: An Overview. Credits  Author: Michael Guenther  Editor: Aaron Loucks  Dancing Elephants: Michael V. Shuman.

HADOOP: An Overview


Credits

Author: Michael Guenther

Editor: Aaron Loucks

Dancing Elephants: Michael V. Shuman


What developers and architects see


What capacity planning folks see


What network folks see


What operations folks see


WHAT SYS ADMINS SEE


ADMINISTRATING HADOOP

Learning the Hard Way


Introductions

Aaron Loucks: Senior Technical Operations Engineer, CCHA. ~11 months of active Hadoop admin experience.

Michael Guenther: Technical Operations Team Lead, CCHA. ~16 months of active Hadoop admin experience.


Learning the Hard Way

Early adoption of Hadoop has some of its own issues.

The knowledge base is growing, but it is still pretty thin.

Manning (finally) released their book, so now we have 3 Hadoop books.

HBase has even less documentation available and no books. (Lars George’s book is due in July. Probably, hopefully.)

Cloudera didn’t officially support HBase until CDH3.


Playing Catch Up

IS and Ops came to the game a bit later than development so we had to play catch up early on in the project.

We had to write a lot of our own tools and implement our own processes (rack awareness, log cleanup, metadata backups, deploy configs, etc.)

Additionally, we needed to learn a lot about Linux system details and network setup and configuration.


New Admin Blues

Tech Ops (Aaron and I) aren’t part of the IS department.

This might be different at your company. In some places, Ops is part of IS. The correct model depends on staffing and which group fulfills various enterprise roles.

Administrating Hadoop/HBase created a problem for our traditional support model and for non-SA activity on the machines.

It took some time to get used to the new system and what was needed for us to run and maintain it, most of which changed with CDH3.


Enterprise Wide Admins?

Since we have no centralized team administrating all clouds, configuration and setup vary across the enterprise, creating additional challenges.

Staffing Hadoop administrators is difficult, especially since we aren’t in the Bay Area.


Configuration File Management

Configuration file management can be a challenge.

We settled on a central folder on a common mount and an ssh script to push configs.

Cloudera recommended using Puppet or Chef. We haven’t made that jump yet. When the cluster goes heterogeneous, we will investigate further.
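A push script of the kind described above can be quite small. The following is a minimal sketch in Python, not the script used on this cluster: the host names, the source folder, the destination path, and the choice of rsync over ssh are all assumptions made for illustration.

```python
import subprocess

def build_push_commands(hosts, src_dir, dest_dir):
    """Return one rsync-over-ssh argv list per host (nothing is executed here)."""
    src = src_dir.rstrip("/") + "/"  # trailing slash: copy the folder's contents
    return [["rsync", "-a", "-e", "ssh", src, f"{host}:{dest_dir}"] for host in hosts]

def push_configs(hosts, src_dir, dest_dir, dry_run=True):
    """Push configs to every host and return the hosts whose push failed."""
    failed = []
    for host, cmd in zip(hosts, build_push_commands(hosts, src_dir, dest_dir)):
        if dry_run:
            print(" ".join(cmd))  # show what would run, change nothing
            continue
        if subprocess.run(cmd).returncode != 0:
            failed.append(host)
    return failed

if __name__ == "__main__":
    # Hypothetical hosts and paths, for illustration only.
    push_configs(["dn01", "dn02"], "/mnt/common/hadoop-conf", "/etc/hadoop/conf")
```

Keeping the command construction separate from execution makes a dry run cheap, which is useful before touching every node in a cluster.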


A Look At Our Cluster


Initial Cluster Setup

Prod started with 3 Masters and 20 DNs across 2 racks (10 and 10)

UAT started with 3 Masters and 15 DNs across 2 racks (5 and 10)


Current Hardware Breakdown

Name Node (HMaster), Secondary Name Node, and Job Tracker

○ Dell R710s
○ Dual Intel quad-cores (spec)
○ 72GB of RAM
○ SCSI drives in RAID configuration (~70GB)

Data Nodes (Task Tracker, Data Node, Region Server): 30 nodes

○ Dell R410s
○ Dual Intel quad-cores
○ 64GB of RAM
○ 4x2TB 5400RPM SATA in JBOD configuration

Zookeeper Servers (standalone mode)

○ Dell R610s
○ Dual Intel quad-cores (specs)
○ 32GB of RAM
○ SCSI drives in RAID configuration (~70GB)


Cluster Network Information

Rack Details

○ TOR switches: Cisco 4948s
○ 1GbE links to TOR
○ 42U rack, ~32U usable for servers

Network

○ TORs are 1GbE to core (Cisco 6509s).
○ Channel bonding is possible if needed.
○ 10GbE is being investigated if needed.
○ 192GB backbone


Growing Our Cluster

Early on, we were unsure of how many servers were needed for launch.

Capacity planning was a total unknown:

○ Reserving data center space was very difficult.
○ Budgeting for future growth was also difficult.


Ideal Growth Versus Reality

When we did add new servers, we ran into rack space issues.

Our rack breakdown for UAT datanodes is 5, 10, and 15 servers.

Uneven datanode distribution isn’t handled well by HBase and Hadoop.

Re-racking was not an option.

Options: turn off rack-awareness, go with the uneven rack arrangement, or lie to Hadoop?
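For context, Hadoop learns rack placement from an admin-supplied executable (configured through topology.script.file.name in Hadoop 0.20-era releases). It receives host names or IP addresses as arguments and prints one rack path per argument. "Lying to Hadoop" then amounts to printing logical racks that differ from the physical ones. Below is a minimal sketch, assuming Python is available on the Name Node; the addresses, rack names, and round-robin assignment are hypothetical and not necessarily what was done on this cluster.

```python
#!/usr/bin/env python3
import sys

DEFAULT_RACK = "/default-rack"  # the rack Hadoop itself assigns to unmapped hosts

def spread_evenly(hosts, rack_names):
    """Assign hosts to logical racks round-robin, ignoring physical location."""
    return {h: rack_names[i % len(rack_names)] for i, h in enumerate(sorted(hosts))}

def resolve(hosts, table):
    """Return one rack path per requested host, in the same order."""
    return [table.get(h, DEFAULT_RACK) for h in hosts]

# Hypothetical data node addresses and logical rack names.
DATANODES = ["10.0.1.11", "10.0.1.12", "10.0.2.11", "10.0.2.12"]
TABLE = spread_evenly(DATANODES, ["/rack1", "/rack2"])

if __name__ == "__main__":
    # Hadoop passes the hosts to resolve as command-line arguments.
    print(" ".join(resolve(sys.argv[1:], TABLE)))
```

Reporting racks that do not match the physical layout weakens the guarantee that replicas survive the loss of a physical rack, so this is a trade-off rather than a free fix.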


Server Build Out

Initially, we received new machines from the Sys Admins and we had to install Hadoop and HBase.

We worked with the SAs to create a Cobbler image for new types of Hadoop servers.

Now, new machines only need configuration files and are ready for use.


First Cluster Growth Issue

Since we had to spoof rack-awareness, mis-replicated blocks started showing up.

Run the balancer to fix it, right? Not quite. The Hadoop balancer doesn’t fix mis-replicated blocks.

You have to modify your replication factor on the folders with mis-replicated blocks.
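One way to "modify the replication factor" is to raise it by one, wait for the extra replicas to be placed, and then restore it. That procedure is an assumption here; the slides do not spell out the exact steps. The sketch below builds the corresponding `hadoop fs -setrep` commands in Python. The path and the replication values (3 is the HDFS default) are hypothetical.

```python
import subprocess

def setrep_commands(path, normal=3, bumped=4):
    """Commands that raise, then restore, the replication factor under `path`."""
    return [
        # -R recurses into the folder; -w waits until replication completes.
        ["hadoop", "fs", "-setrep", "-R", "-w", str(bumped), path],
        ["hadoop", "fs", "-setrep", "-R", str(normal), path],
    ]

def fix_misreplicated(path, dry_run=True):
    """Print the commands, or run them in order and stop on the first failure."""
    for cmd in setrep_commands(path):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)

if __name__ == "__main__":
    fix_misreplicated("/hypothetical/folder")  # dry run only
```

Waiting on a large folder can take a long time, so it is worth trying this on a small directory first.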


Be Paranoid.


Paranoia – It’s Not So Bad

Be paranoid. Hadoop punishes the unwary (trust us).

Two dfs.name.dir folders are a must.

Back up your Name Node images and edits on a regular basis (hourly).

Run fsck -blocks / once a day.

Run your dfsadmin -report once a week.
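The daily fsck check lends itself to automation. Below is a minimal sketch in Python, assuming the `hadoop` command is on the PATH and that fsck ends its report with a summary line of the form "The filesystem under path '/' is HEALTHY" (or CORRUPT). The function names are hypothetical, and this is not the script used on the cluster described here.

```python
import subprocess

def fsck_is_healthy(fsck_output):
    """True only if the fsck summary line reports the filesystem as HEALTHY."""
    for line in reversed(fsck_output.strip().splitlines()):
        if line.startswith("The filesystem under path"):
            return line.rstrip().endswith("is HEALTHY")
    return False  # no summary line found: treat as unhealthy

def run_daily_fsck(path="/"):
    """Run fsck on `path` and report whether the summary says HEALTHY."""
    result = subprocess.run(["hadoop", "fsck", path, "-blocks"],
                            capture_output=True, text=True)
    return fsck_is_healthy(result.stdout)
```

Run from cron, a False result is the cue to alert someone; treating a missing summary line as a failure errs on the paranoid side.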


Paranoia - You’ll Get Used To It

Check your various web pages once a day: Name Node, Job Tracker, and HMaster.

Set up monitoring and alerting.

Set up your trash option in HDFS to greater than the 10-minute default.

Lock down your cluster.

○ Keep everyone off of your cluster.
○ Provide a client server for user interaction.
○ Fuse is a good addition to this server.


Backing Up Your Cluster

Again, multiple dfs.name.dirs.

Run wget regularly on the namenode image and edits URL to create a backup.

Back up your config files prior to any major change (or even minor).

Save your job statistics to HDFS.

○ mapred.job.tracker.persist.jobstatus.dir

Data Node metadata

Zookeeper data directory
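The image-and-edits download mentioned above can also be scripted without wget. The sketch below uses Python's standard library and assumes the Hadoop 0.20-era Name Node HTTP interface: port 50070 and the getimage servlet with `getimage=1` and `getedit=1` parameters. Verify both against your own version before relying on it. The host name and destination folder are hypothetical.

```python
import time
import urllib.request

def backup_urls(namenode, port=50070):
    """URLs for the current fsimage and edits files on the Name Node."""
    base = f"http://{namenode}:{port}/getimage"
    return {"fsimage": base + "?getimage=1", "edits": base + "?getedit=1"}

def backup_namenode(namenode, dest_dir, port=50070):
    """Download both files into dest_dir under timestamped names."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    saved = []
    for name, url in backup_urls(namenode, port).items():
        target = f"{dest_dir}/{name}.{stamp}"
        urllib.request.urlretrieve(url, target)  # raises on HTTP or network errors
        saved.append(target)
    return saved
```

Timestamped file names keep older copies around, so a bad download never overwrites the last good backup.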


Learning by Experience (Sometimes Painful Experience)


Issues and Epiphanies

Pinning your yum repository

We had this for our Cloudera repo mirror list initially:

○ mirrorlist=http://archive.cloudera.com/redhat/cdh/3/mirrors

That’s the latest and greatest CDH3 build repo (B2, B3, B4, etc.).

We are on CDH3B3, so we needed to set our repo mirror list to this:

○ mirrorlist=http://archive.cloudera.com/redhat/cdh/3b3/mirrors


Issues and Epiphanies

FSCK returned CORRUPT!

Initially, we thought this was much, much worse than it turned out to be when it happened.

It’s still bad, but only the files listed as corrupt are lost. It wasn’t the swath of destruction we thought it would be.

Cloudera might be able to work some magic, but you’ve almost certainly lost your file(s).


Issues and Epiphanies

Sudo permissions are key

We avoid using root whenever possible.

All config files and folders are owned by our generic account.

Our generic account has some nice permissions though:

○ sudo -u hdfs/hbase/zookeeper/mapred *
○ sudo /etc/init.d/hbase-regionserver *
○ sudo /etc/init.d/hadoop *

These cover 95% of our day-to-day activity on the cloud.

Root access might be extremely difficult to come by. It depends heavily on your business and IS policies.


Issues and Epiphanies

Document EVERYTHING

It’s a bit tiresome at first, but months can pass between recurrences of an issue.

Write it down now and save yourself having to research again.

This is especially true when you are setting up your first cluster. There’s a lot to learn, and it’s really easy to forget.

Pay special attention to the error message that goes along with the problem. HBase tends to have extremely vague exceptions and error logging.


Issues and Epiphanies

Fair Scheduler Woes

While nice, the fair scheduler page has caused some serious problems.

Users grow frustrated when their jobs aren’t running, so they increase the priority.

Now their job is running, but others are being starved.

We ended up restricting page access to a very small subset of users.


Issues and Epiphanies

Do NOT let dfs.name.dir run out of space.

This is extremely bad news if you only have one dfs.name.dir.

We have two:

○ One Name Node local mount directory
○ One SAN mount (also our common mount)

You absolutely need monitoring in place to keep this from happening.
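A free-space check on the dfs.name.dir mounts is simple to script and hook into whatever alerting is already in place. Below is a minimal sketch in Python; the 20% threshold and the directory paths are hypothetical choices, not values from this cluster.

```python
import shutil

def low_space_dirs(dirs, min_free_fraction=0.20, usage=shutil.disk_usage):
    """Return the directories whose filesystem has less free space than allowed."""
    low = []
    for d in dirs:
        total, _used, free = usage(d)  # disk_usage yields (total, used, free) in bytes
        if total > 0 and free / total < min_free_fraction:
            low.append(d)
    return low

if __name__ == "__main__":
    # Hypothetical dfs.name.dir locations: one local mount, one SAN mount.
    name_dirs = ["/data/namenode", "/mnt/common/namenode"]
    try:
        for d in low_space_dirs(name_dirs):
            print(f"WARNING: low free space on {d}")
    except FileNotFoundError as err:
        print(f"cannot check {err.filename}: directory not found")
```

Passing the usage function as a parameter keeps the check easy to test without real mounts.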


Issues and Epiphanies

Smaller Issues

Missing pid files?

Users receive a zip file exception when running a job.

CDH3 install/upgrade requires a local hadoop user.

The Job Tracker complains about port already in use. Check your mapred-site.xml.


Issues and Epiphanies

Memory Settings – Hadoop

Set your Secondary Name Node (SN) and Name Node (NN) heaps to be the same size.

Set your JVM starting size to be equal to your max.

Set your memory explicitly per process, not using HADOOP_HEAPSIZE.

Set your map and reduce heap size as final in your mapred-site.xml.
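To make these settings concrete, the sketch below generates the relevant configuration text in Python. It assumes the per-daemon variables from the stock hadoop-env.sh (HADOOP_NAMENODE_OPTS and friends) and the mapred.child.java.opts property for task heap. All heap sizes are hypothetical placeholders, not recommendations from the slides.

```python
DAEMON_OPTS_VARS = {
    "namenode": "HADOOP_NAMENODE_OPTS",
    "secondarynamenode": "HADOOP_SECONDARYNAMENODE_OPTS",
    "datanode": "HADOOP_DATANODE_OPTS",
    "jobtracker": "HADOOP_JOBTRACKER_OPTS",
    "tasktracker": "HADOOP_TASKTRACKER_OPTS",
}

def heap_opts(heap_mb):
    """JVM flags with the starting heap equal to the maximum heap."""
    return f"-Xms{heap_mb}m -Xmx{heap_mb}m"

def hadoop_env_lines(heap_mb_by_daemon):
    """One hadoop-env.sh export line per daemon, each with an explicit heap."""
    lines = []
    for daemon, mb in heap_mb_by_daemon.items():
        var = DAEMON_OPTS_VARS[daemon]
        lines.append(f'export {var}="{heap_opts(mb)} ${var}"')
    return lines

def final_property_xml(name, value):
    """A *-site.xml property marked final so that jobs cannot override it."""
    return (f"<property>\n  <name>{name}</name>\n  <value>{value}</value>\n"
            "  <final>true</final>\n</property>")

if __name__ == "__main__":
    # Hypothetical sizes: NN and SN deliberately get the same heap.
    print("\n".join(hadoop_env_lines({"namenode": 4096, "secondarynamenode": 4096})))
    print(final_property_xml("mapred.child.java.opts", "-Xmx512m"))
```

Marking the task heap property as final matters because job submitters can otherwise override it in their own job configuration.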


HBase Issues and Epiphanies

Set your hbase user’s ulimits high – 64k is good.

Sometimes HBase takes a really long time to start back up (2 hours one Saturday).

0.89 WAL file corruption problem.

Keep your quorum off of your data nodes (off that rack, really).

HBase is extremely sensitive to network events, maintenance, connectivity issues, etc.
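Whether the ulimit change actually took effect is easy to verify from the account that runs HBase. Below is a minimal sketch using Python's standard `resource` module (Unix only). It reads "64k" as 65536 open files, which is an interpretation, and the function names are hypothetical.

```python
import resource

def meets_minimum(soft_limit, minimum=65536):
    """True if a soft open-file limit is unlimited or at least `minimum`."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= minimum

def check_nofile(minimum=65536):
    """Inspect the open-file limit of the current process (run as the hbase user)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return meets_minimum(soft, minimum), soft, hard

if __name__ == "__main__":
    ok, soft, hard = check_nofile()
    print(f"open files: soft={soft} hard={hard} ok={ok}")
```

The check reports the limits of the process it runs in, so it needs to be started the same way the HBase daemons are started for the answer to be meaningful.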


HBase Issues and Epiphanies

Memory Settings – HBase

Region Servers need a lot more memory than your HMaster.

Region Servers can, and will, run out of memory and crash.

Rowcounter is your friend for non-responsive region servers.

Zookeeper should be set to 1 GB of JVM heap.

Talk to Cloudera about special JVM settings for your HBase daemons.


Questions? We Might Have Answers.