Administrating Hadoop

Transcript of Administrating Hadoop

  • 8/6/2019 Administrating Hadoop

    1/38

    An Overview

  • 2/38

    Credits

    Author: Michael Guenther

    Editor: Aaron Loucks

    Dancing Elephants: Michael V. Shuman

  • 3/38

    What developers and architects see

  • 4/38

    What capacity planning folks see

  • 5/38

    What network folks see

  • 6/38

    What operations folks see

  • 7/38

  • 8/38

    Learning the Hard Way

  • 9/38

    Introductions

    Aaron Loucks
    - Senior Technical Operations Engineer, CCHA
    - ~11 months active Hadoop Admin experience

    Michael Guenther
    - Technical Operations Team Lead, CCHA
    - ~16 months active Hadoop Admin experience

  • 10/38

    Learning the Hard Way

    Early adoption of Hadoop has some of its own issues.
    - The knowledge base is growing, but still pretty thin.
    - Manning (finally) released their book, so now we have 3 Hadoop books.
    - HBase has even less documentation available and no books. (July for Lars George's book. Probably, hopefully.)
    - Cloudera didn't officially support HBase until CDH3.

  • 11/38

    Playing Catch-Up

    IS and Ops came to the game a bit later than development, so we had to play catch-up early on in the project.

    We had to write a lot of our own tools and implement our own processes (rack awareness, log cleanup, metadata backups, deploying configs, etc.).

    Additionally, we needed to learn a lot about Linux system details and network setup and configuration.

  • 12/38

    New Admin Blues

    Tech Ops (Aaron and I) aren't part of the IS department.
    - This might be different at your company. In some places, Ops is part of IS. The correct model depends on staffing and which group fulfills various enterprise roles.

    Administrating Hadoop/HBase created a problem for our traditional support model and for non-SA activity on the machines.

    It took some time to get used to the new system and what was needed for us to run and maintain it, most of which changed with CDH3.

  • 13/38

    Enterprise Wide Admins?

    Since we have no centralized team administrating all clouds, configuration and setup varies across the enterprise, creating additional challenges.

    Staffing Hadoop administrators is difficult, especially since we aren't in the Bay Area.

  • 14/38

    Configuration File Management

    Configuration file management can be a challenge.
    - We settled on a central folder on a common mount and an ssh script to push configs.
    - Cloudera recommended using Puppet or Chef.
    - We haven't made that jump yet. When the cluster goes heterogeneous, we will investigate further.
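The "central folder plus ssh script" approach can be sketched roughly like this. The paths, hostnames, and slaves-file location are examples, not our actual setup:

```shell
#!/bin/sh
# Sketch of a config-push script: copy configs from a central folder on the
# common mount out to every node listed in a hosts file. All paths are examples.
CONF_DIR=${CONF_DIR:-/mnt/common/hadoop-conf}
HOSTS_FILE=${HOSTS_FILE:-/mnt/common/hadoop-conf/slaves}
DRY_RUN=${DRY_RUN:-0}   # set to 1 to print the scp commands instead of running them

push_configs() {
  while read -r host; do
    [ -z "$host" ] && continue
    cmd="scp $CONF_DIR/*.xml $host:/etc/hadoop/conf/"
    if [ "$DRY_RUN" = "1" ]; then
      echo "$cmd"
    else
      eval "$cmd"
    fi
  done < "$HOSTS_FILE"
}
```

Puppet or Chef would replace this with managed file resources; until then, a loop like this covers the same ground.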

  • 15/38

    A Look At Our Cluster

  • 16/38

    Initial Cluster Setup

    Prod started with 3 Masters and 20 DNs across 2 racks (10 and 10).

    UAT started with 3 Masters and 15 DNs across 2 racks (5 and 10).

  • 17/38

    Current Hardware Breakdown

    Name Node (HMaster), Secondary Name Node, and Job Tracker
    - Dell R710s
    - Dual Intel quad-cores
    - 72GB of RAM
    - SCSI drives in RAID configuration (~70GB)

    Data Nodes (Task Tracker, Data Node, Region Server) - 30 nodes
    - Dell R410s
    - Dual Intel quad-cores
    - 64GB of RAM
    - 4x2TB 5400RPM SATA in JBOD configuration

    Zookeeper Servers (standalone mode)
    - Dell R610s
    - Dual Intel quad-cores
    - 32GB of RAM
    - SCSI drives in RAID configuration (~70GB)

  • 18/38

    Cluster Network Information

    Rack Details
    - TOR switches: Cisco 4948s
    - 1GbE links to TOR
    - 42U rack, ~32U usable for servers

    Network
    - TORs are 1GbE to the core (Cisco 6509s). Channel bonding is possible if needed.
    - 10GbE is being investigated if needed.
    - 192GB backbone

  • 19/38

    Growing Our Cluster

    Early on, we were unsure of how many servers were needed for launch.

    Capacity planning was a total unknown:
    - Reserving data center space was very difficult.
    - Budgeting for future growth was also difficult.

  • 20/38

    Ideal Growth Versus Reality

    When we did add new servers, we ran into rack space issues.

    Our rack breakdown for UAT datanodes is 5, 10, and 15 servers.

    Uneven datanode distribution isn't handled well by HBase and Hadoop.

    Re-racking was not an option.

    Options: turn off rack-awareness, go with the uneven rack arrangement, or lie to Hadoop?
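"Lying to Hadoop" means handing it a topology script (set via topology.script.file.name) that reports whatever rack layout you want it to believe. A minimal sketch, with hypothetical hostnames and rack paths:

```shell
#!/bin/sh
# Hypothetical topology script: Hadoop invokes it with one or more node names
# or IPs and expects one rack path per argument on stdout.
rack_of() {
  case "$1" in
    dn0[1-5]|dn0[1-5].*)   echo "/dc1/rack1" ;;   # first 5 datanodes
    dn0[6-9]*|dn1[0-5]*)   echo "/dc1/rack2" ;;   # next 10
    *)                     echo "/default-rack" ;;
  esac
}

for node in "$@"; do
  rack_of "$node"
done
```

The mapping is whatever you say it is, which is exactly how you smooth over an uneven physical layout.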

  • 21/38

    Server Build Out

    Initially, we received new machines from the Sys Admins, and we had to install Hadoop and HBase.

    We worked with the SAs to create a Cobbler image for new types of Hadoop servers.

    Now, new machines only need configuration files and are ready for use.

  • 22/38

    First Cluster Growth Issue

    Since we had to spoof rack-awareness, mis-replicated blocks started showing up.

    Run the balancer to fix it, right?

    Not quite. The Hadoop balancer doesn't fix mis-replicated blocks.

    You have to modify your replication factor on the folders with mis-replicated blocks.
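The fix is to raise and then restore the replication factor so the Name Node re-places the replicas. A sketch to run against the cluster; the path and factors are examples:

```shell
# Not runnable outside the cluster; /data/affected and the factors are examples.
hadoop fs -setrep -R 4 /data/affected   # raise replication: new replicas follow the placement policy
hadoop fs -setrep -R 3 /data/affected   # drop back to normal; over-replicated copies get pruned
```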

  • 23/38

    Be Paranoid.

  • 24/38

    Paranoia - It's Not So Bad

    Be paranoid. Hadoop punishes the unwary (trust us).

    Two dfs.name.dir folders are a must.

    Back up your Name Node images and edits on a regular basis (hourly).

    Run hadoop fsck / -blocks once a day.

    Run your hadoop dfsadmin -report once a week.
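As cron entries on the admin account, the daily and weekly checks might look like this; the schedules and log paths are illustrative:

```
# Example crontab: daily fsck, weekly dfsadmin report, logged for later review.
0 6 * * *  hadoop fsck / -blocks > /var/log/hadoop/fsck-$(date +\%F).log 2>&1
0 7 * * 0  hadoop dfsadmin -report > /var/log/hadoop/dfsadmin-$(date +\%F).log 2>&1
```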

  • 25/38

    Paranoia - You'll Get Used To It

    Check your various web pages once a day.
    - Name Node, Job Tracker, and HMaster

    Set up monitoring and alerting.

    Set your trash option in HDFS to greater than the 10-minute default.

    Lock down your cluster.
    - Keep everyone off of your cluster.
    - Provide a client server for user interaction. Fuse is a good addition to this server.
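The trash window is fs.trash.interval in core-site.xml, specified in minutes; the one-day value below is illustrative:

```xml
<!-- core-site.xml: keep deleted files in .Trash for a day before purging. -->
<property>
  <name>fs.trash.interval</name>
  <value>1440</value> <!-- minutes; pick something much larger than 10 -->
</property>
```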

  • 26/38

    Backing Up Your Cluster

    Again, multiple dfs.name.dirs.

    Run wget regularly on the namenode image and edits URLs to create a backup.

    Back up your config files prior to any major change (or even a minor one).

    Save your job statistics to HDFS.
    - mapred.job.tracker.persist.jobstatus.dir

    Also back up:
    - Data Node metadata
    - Zookeeper data directory
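The wget backup can be sketched like this. The NameNode host/port and backup directory are examples, and the getimage servlet URLs are the CDH3-era ones:

```shell
#!/bin/sh
# Pull the current fsimage and edits from the NameNode's getimage servlet
# into dated files on the common mount. Host, port, and paths are examples.
NN_HTTP=${NN_HTTP:-namenode01:50070}
BACKUP_DIR=${BACKUP_DIR:-/mnt/common/nn-backups}

backup_path() {   # $1 = base name (fsimage or edits); appends an hourly stamp
  echo "$BACKUP_DIR/$1.$(date +%Y%m%d%H)"
}

fsimage_url="http://$NN_HTTP/getimage?getimage=1"
edits_url="http://$NN_HTTP/getimage?getedit=1"

# Uncomment to actually fetch (cron this hourly):
# wget -q "$fsimage_url" -O "$(backup_path fsimage)"
# wget -q "$edits_url"   -O "$(backup_path edits)"
```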

  • 27/38

    Learning by Experience (Sometimes Painful Experience)

  • 28/38

    Issues and Epiphanies

    Pin your yum repository.
    - We had this for our Cloudera repo mirror list initially:

      mirrorlist=http://archive.cloudera.com/redhat/cdh/3/mirrors

    - That's the latest-and-greatest CDH3 build repo (B2, B3, B4, etc.).
    - We are on CDH3B3, so we needed to set our repo mirror list to this:

      mirrorlist=http://archive.cloudera.com/redhat/cdh/3b3/mirrors

  • 29/38

    Issues and Epiphanies

    FSCK returned CORRUPT!
    - Initially, we thought this was much, much worse than it turned out to be when it happened.
    - It's still bad, but only the files listed as corrupt are lost. It wasn't the swath of destruction we thought it would be.
    - Cloudera might be able to work some magic, but you've almost certainly lost your file(s).
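A typical triage sequence when fsck reports CORRUPT, run against the cluster (-move and -delete are standard fsck options):

```shell
# Identify which files are affected, then decide their fate.
hadoop fsck / -files | grep -i CORRUPT       # list the damaged files
hadoop fsck / -move      # relocate corrupt files to /lost+found, or...
hadoop fsck / -delete    # ...delete them once recovery attempts are exhausted
```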

  • 30/38

    Issues and Epiphanies

    Sudo permissions are key.
    - We avoid using root whenever possible.
    - All config files and folders are owned by our generic account.
    - Our generic account has some nice permissions, though:

      sudo -u hdfs/hbase/zookeeper/mapred *
      sudo /etc/init.d/hbase-regionserver *
      sudo /etc/init.d/hadoop *

    - root access might be extremely difficult to come by. It depends heavily on your business and IS policies.
    - These cover 95% of our day-to-day activity on the cloud.
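Expressed as a sudoers fragment, grants like the above might look as follows; "hadoopops" is a hypothetical name for the generic account, and the init-script paths are examples:

```
# /etc/sudoers fragment: let the generic ops account act as the service users
# and drive the Hadoop/HBase init scripts, without full root.
hadoopops ALL = (hdfs, hbase, zookeeper, mapred) NOPASSWD: ALL
hadoopops ALL = (root) NOPASSWD: /etc/init.d/hadoop-*, /etc/init.d/hbase-regionserver
```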

  • 31/38

    Issues and Epiphanies

    Document EVERYTHING.
    - It's a bit tiresome at first, but issues can sometimes go months between recurrences.
    - Write it down now and save yourself having to research it again.
    - This is especially true when you are setting up your first cluster. There's a lot to learn, and it's really easy to forget.
    - Pay special attention to the error message that goes along with the problem. HBase tends to have extremely vague exceptions and error logging.

  • 32/38

  • 33/38

    Issues and Epiphanies

    Do NOT let dfs.name.dir run out of space.
    - This is extremely bad news if you only have one dfs.name.dir.
    - We have two:
      - One Name Node local mount directory
      - One SAN mount (also our common mount)

    You absolutely need monitoring in place to keep this from happening.

  • 34/38

    Issues and Epiphanies

    Smaller Issues
    - Missing pid files?
    - Users receive a zip file exception when running a job.
    - CDH3 install/upgrade requires a local hadoop user.
    - The Job Tracker complains about a port already in use. Check your mapred-site.xml.

  • 35/38

    Issues and Epiphanies

    Memory Settings - Hadoop
    - Set your Secondary Name Node and Name Node heaps to be the same size.
    - Set your starting JVM size to be equal to your max.
    - Set your memory explicitly per process, not using HADOOP_HEAPSIZE.
    - Set your map and reduce heap size as final in your mapred-site.xml.
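In hadoop-env.sh, per-daemon settings replace the single HADOOP_HEAPSIZE knob; the variable names below are the standard per-daemon _OPTS hooks, and the sizes are purely illustrative:

```shell
# hadoop-env.sh sketch: explicit, equal -Xms/-Xmx per daemon.
# Sizes are illustrative; tune them to your hardware.
export HADOOP_NAMENODE_OPTS="-Xms8g -Xmx8g ${HADOOP_NAMENODE_OPTS}"
export HADOOP_SECONDARYNAMENODE_OPTS="-Xms8g -Xmx8g ${HADOOP_SECONDARYNAMENODE_OPTS}"
export HADOOP_JOBTRACKER_OPTS="-Xms4g -Xmx4g ${HADOOP_JOBTRACKER_OPTS}"
export HADOOP_DATANODE_OPTS="-Xms1g -Xmx1g ${HADOOP_DATANODE_OPTS}"
```

Task heaps go in mapred-site.xml via mapred.child.java.opts, wrapped with final=true so jobs can't override them.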

  • 36/38

    HBase Issues and Epiphanies

    Set your hbase user's ulimits high; 64k is good.

    Sometimes HBase takes a really long time to start back up (2 hours one Saturday).

    0.89 has a WAL file corruption problem.

    Keep your quorum off of your data nodes (off that rack, really).

    HBase is extremely sensitive to network events, maintenance, connectivity issues, etc.
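The ulimit bump is usually done in /etc/security/limits.conf; the "hbase" account name here assumes the CDH service user:

```
# /etc/security/limits.conf fragment: raise open-file limits for the hbase user.
hbase  soft  nofile  65536
hbase  hard  nofile  65536
```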

  • 37/38

    HBase Issues and Epiphanies

    Memory Settings - HBase
    - Region Servers need a lot more memory than your HMaster.
    - Region Servers can, and will, run out of memory and crash.
    - Rowcounter is your friend for non-responsive region servers.
    - Zookeeper should be set to 1 GB of JVM heap.
    - Talk to Cloudera about special JVM settings for your HBase daemons.

  • 38/38

    Questions? We Might Have Answers.