Running Hadoop as Service in AltiScale Platform
Experiences in running Hadoop as a Service • #HadoopSherpa
DAVID CHAIKEN • 21 NOVEMBER 2014
Talk Outline
Altiscale Company Introduction and Perspective
Altiscale Architecture
Use Cases: Performance, Job Analysis, Scheduling
Infinite Hadoop
Challenges to the Hadoop Community
Copyright © 2014 Altiscale, Inc.
Corporate Background
Hadoop-as-a-Service (HaaS) innovator
Company founded in 2012 (Palo Alto & Chennai)
Founding team from Yahoo
• Raymie Stata, CEO, Former CTO
• David Chaiken, CTO, Former Chief Architect
• Charles Wimmer, Head of Operations, Former SRE
Employees from Yahoo, Google, Netflix, LinkedIn, VMware and others
Top-tier investors
Altiscale Chennai
Long-term colleagues from Yahoo and before
IIT Madras Research Park (back gate of IIT-M)
Architecture, Core Development, Test (Apache Bigtop)
Control Plane agile development, 2-week sprints
Next: Test++, Customer Support, Operations
Everybody Loves Hadoop But…
Significant capital expenditure on infrastructure
Complex to manage and maintain
Time to get cluster up and running is long
Capacity planning is difficult
Skillset is difficult to recruit, train and retain
What about the cloud?
True Hadoop-as-a-Service
Altiscale is the industry’s first purpose-built, petabyte-scale Hadoop cloud
• Altiscale operates Hadoop for you
• Infrastructure optimized to run Hadoop fast and reliably
• Pay for Hadoop service, not infrastructure
We Team With You To Help Deliver Insights
Potential insights from a flood of data generated by the connected world
Our Operations Team and Hadoop Cloud help realize those insights
Customer + Altiscale
Customers
How We Do It
Diagram labels: Virtual Hadoop Cluster, YARN Service, HDFS Service, More Apps, File Transfer, Kafka, Flume, Data Connect, Hive, Pig, Oozie, Pre-configured Apps
• We optimize the job to complete fast and cost-effectively
• Your data is migrated to HDFS and a virtual Hadoop cluster in our cloud
• Our Hadoop Helpdesk gives you access to Hadoop experts
• Our Hadoop Operations Team maintains the cluster and plans the job
• Our team monitors and manages the job through to completion
• We provide an uptime SLA so our Hadoop cloud is always available
Altiscale Architecture: Data and Control Planes
Altiscale Architecture: Customer Environments
Altiscale Architecture: O&O Hadoop Cluster
Altiscale Architecture: Host Components
Altiscale Architecture: Workbenches
Altiscale Architecture: Data Transfer
Altiscale Architecture: Portal and REST API
Altiscale Architecture: Control Plane Databases
Altiscale Architecture: Control Plane Services
Altiscale Architecture: Hadoop-Based Analysis
Hadoop as a Service Offering
Data is migrated to our HDFS service
Diagram labels: HDFS Service, Data Connectors
Foundry Apps: Apache Mahout, Cascading, Revolution R, Kafka/Camus, Avro, Pentaho Kettle, Matlab, Spark, Sqoop, H2O
Core Apps: Apache Hive, Apache Pig, Apache Oozie, Apache HCatalog, Apache Flume, R, JDK/JRE, Python, HttpFS, FUSE, LZOP, Snappy, gzip
Terminal access to Hadoop cluster and associated apps
Portal provides job status, billing and support information
Challenges…
Performance Challenges…
Disks: Configuration, Controllers, Density, Cost
Network: Jumbo Packet MTU
Memory: disable transparent huge pages with echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
Network: When does locality matter?
Flash: When to use SSD?
Customer Case Study: Analyze Query
• Customer provided Hive query + data sets (100 GB to ~5 TB)
• Needed help optimizing the query
• Didn’t rewrite query immediately
• Wanted to characterize query performance and isolate bottlenecks first
Analyze and Tune Execution
Ran original query on the datasets in our environment:
• Two M/R Stages: Stage-1, Stage-2
Long-running reducers run out of memory
• set mapreduce.reduce.memory.mb=5120
• Reduces slots and extends reduce time
Query fails to launch Stage-2 with out of memory
• set HADOOP_HEAPSIZE=1024 on client machine
Query has 250,000 Mappers in Stage-2, which causes failure
• set mapred.max.split.size=5368709120 (5 GiB) to reduce Mappers
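A rough way to see why raising the split size cuts the mapper count. This is only a sketch under a simplifying assumption that is not on the slide: one mapper per split over a single large splittable input. The function name and the 5 TiB input size are illustrative; real counts also depend on file layout and the input format in use.

```python
def estimate_mappers(input_bytes: int, max_split_bytes: int) -> int:
    """Estimate mapper count as ceil(input_bytes / max_split_bytes)."""
    if input_bytes < 0 or max_split_bytes <= 0:
        raise ValueError("input size must be >= 0 and split size must be > 0")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return -(-input_bytes // max_split_bytes)


FIVE_GIB = 5368709120     # value from the slide: 5 * 1024**3 bytes
FIVE_TIB = 5 * 1024**4    # assumption: upper end of the ~5 TB data sets

print(estimate_mappers(FIVE_TIB, FIVE_GIB))  # 1024
```

Under these assumptions a 5 GiB split size yields about a thousand mappers for the largest data set, far below the 250,000 that caused the failure.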
Analysis: Job Execution Characteristics
Next challenge: how to visualize job execution?
Existing Hadoop/Hive logs not sufficient for this task
Wrote internal tools
• parse job history files
• plot mapper and reducer execution
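The internal tools are not shown in the talk. As a minimal sketch of the plotting step, assume task start and finish times have already been extracted from the job history files into simple records. The TaskRun type and its fields are hypothetical and are not the job history format. The number of concurrently running tasks over time could then be computed like this:

```python
from typing import List, NamedTuple, Tuple


class TaskRun(NamedTuple):
    """Hypothetical record for one task attempt (not the job history format)."""
    task_id: str
    kind: str      # "MAP" or "REDUCE"
    start: int     # seconds since job start
    finish: int    # seconds since job start


def concurrency_timeline(tasks: List[TaskRun], kind: str) -> List[Tuple[int, int]]:
    """Return (time, running_count) points where the running count changes."""
    events = []
    for task in tasks:
        if task.kind == kind:
            events.append((task.start, 1))    # task begins running
            events.append((task.finish, -1))  # task stops running
    events.sort()

    timeline: List[Tuple[int, int]] = []
    running = 0
    for time, delta in events:
        running += delta
        if timeline and timeline[-1][0] == time:
            # Collapse simultaneous events into a single point.
            timeline[-1] = (time, running)
        else:
            timeline.append((time, running))
    return timeline


tasks = [
    TaskRun("m1", "MAP", 0, 10),
    TaskRun("m2", "MAP", 0, 12),
    TaskRun("r1", "REDUCE", 12, 100),
]
print(concurrency_timeline(tasks, "MAP"))     # [(0, 2), (10, 1), (12, 0)]
print(concurrency_timeline(tasks, "REDUCE"))  # [(12, 1), (100, 0)]
```

The resulting points can be handed to any plotting tool as a step chart; a long tail shows up as a low running count that persists long after the other tasks finish.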
Analysis: Map (Stage-1)
Analysis: Reduce (Stage-1) Long Tail (single reduce task)
Analysis: Map (Stage-2)
Analysis: Reduce (Stage-2)
Analysis Execution: Findings
Lone, long-running reducer in first stage of query
Analyzed input data:
• Query split input data by userId
• Bucketizing input data by userId
• One very large bucket: “invalid” userId
• Discussed “invalid” userId with customer
An error value is a common pattern!
• Need to differentiate between “don’t know and don’t care” and “don’t know and do care.”
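The bucketing check described above can be sketched as follows. This is not the analysis Altiscale ran; the function name, the 50% threshold, and the sample keys are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def find_skewed_keys(keys: Iterable[str], threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return (key, share) for keys holding at least `threshold` of all records."""
    counts = Counter(keys)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (key, count / total)
        for key, count in counts.most_common()
        if count / total >= threshold
    ]


# Hypothetical userId column: most records carry the "invalid" error value.
user_ids = ["invalid"] * 8 + ["u1", "u2"]
print(find_skewed_keys(user_ids))  # [('invalid', 0.8)]
```

A key that dominates the distribution this way sends most records to a single reducer, which matches the lone long-running reducer seen in Stage-1.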
Interactive (DRAM-centric) Processing Systems
Loading data into DRAM makes processing fast!
Examples: Spark, Impala, 0xdata, …, [SAP HANA], …
Streaming systems (Storm, DataTorrent) may be similar
Need to increase YARN container memory size
Hive + Interactive: Watch Out for Container Size
Caution: larger YARN container settings for interactive jobs may not be right for batch systems like Hive
Container size needs to combine vcores and memory:
• yarn.scheduler.maximum-allocation-vcores
• yarn.nodemanager.resource.cpu-vcores
• ...
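To illustrate why container size has to account for both resources, here is a small sketch. It assumes the scheduler enforces memory and vcores together; the node and container sizes are made-up numbers, not Altiscale settings.

```python
def containers_per_node(node_mem_mb: int, node_vcores: int,
                        container_mem_mb: int, container_vcores: int) -> int:
    """Containers that fit on one node when both memory and vcores are limits."""
    if container_mem_mb <= 0 or container_vcores <= 0:
        raise ValueError("container resources must be positive")
    by_memory = node_mem_mb // container_mem_mb
    by_vcores = node_vcores // container_vcores
    # The scarcer resource decides how many containers fit.
    return min(by_memory, by_vcores)


# Hypothetical node: 64 GB of memory and 16 vcores for containers.
print(containers_per_node(65536, 16, 2048, 1))   # small batch containers: 16
print(containers_per_node(65536, 16, 16384, 4))  # large interactive containers: 4
```

In the first case vcores are the limit even though memory would allow 32 containers, which is the kind of mismatch the caution above is about.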
Hive + Interactive: Watch Out for Fragmentation
Scheduling interactive systems and batch systems like Hive on the same cluster may result in fragmentation
Interactive systems may require all-or-nothing scheduling
Batch jobs with many small tasks may starve interactive jobs
Hive + Interactive: Watch Out for Fragmentation
Solutions for fragmentation…
• Reserve interactive nodes before starting batch jobs
• Reduce interactive container size (if the algorithm permits)
• Node labels (YARN-796) and gang scheduling (YARN-624)
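Fragmentation and all-or-nothing scheduling can be shown with a toy model. This is not YARN's scheduler logic; it assumes memory is the only resource, and the function name and cluster numbers are hypothetical.

```python
from typing import Sequence


def can_gang_schedule(free_mb_per_node: Sequence[int],
                      container_mb: int, count: int) -> bool:
    """True if `count` containers of `container_mb` fit at the same time."""
    if container_mb <= 0:
        raise ValueError("container size must be positive")
    # A container cannot span nodes, so only whole slots per node count.
    slots = sum(free // container_mb for free in free_mb_per_node)
    return slots >= count


# Eight nodes, each with 3 GB left over after small batch tasks were placed.
fragmented = [3072] * 8
print(sum(fragmented))                            # 24576 MB free in total
print(can_gang_schedule(fragmented, 8192, 2))     # False: no node fits 8 GB

# Reserving two whole 16 GB nodes before the batch job starts.
reserved = [16384, 16384] + [3072] * 6
print(can_gang_schedule(reserved, 8192, 2))       # True
```

The first cluster has more than enough free memory in aggregate, yet the interactive job cannot start, which is why reserving nodes or shrinking the interactive container helps.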
Altiscale: Hadoop Storage and Compute
Altiscale’s point of view on Hadoop as a Service:
• sell HDFS in increments of 10 TB
• sell compute in increments of 10K TaskHours/Month
We market Infinite Hadoop, and provide services so that customers need not worry about cluster nodes.
But Apache Hadoop user interfaces provide a node-oriented view of clusters…
ResourceManager User Interface
NameNode User Interface
Feedback from Customers
Storage plan normally easy to estimate
Compute plan is hard to estimate
• Customer pain point: meeting computation needs sometimes requires more peak compute capacity than the nodes required for storage can provide
• Opportunity: average compute often requires fewer nodes than storage does
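The pain point and the opportunity are two sides of the same comparison. A small sketch, with a hypothetical function name and made-up demand figures, assuming compute demand is expressed as nodes needed per hour:

```python
from typing import Sequence, Tuple


def burst_and_idle(nodes_needed_per_hour: Sequence[int],
                   storage_nodes: int) -> Tuple[int, float]:
    """Return (peak shortfall in nodes, average idle nodes) for a storage-sized cluster."""
    if not nodes_needed_per_hour:
        raise ValueError("need at least one demand sample")
    peak = max(nodes_needed_per_hour)
    average = sum(nodes_needed_per_hour) / len(nodes_needed_per_hour)
    shortfall = max(0, peak - storage_nodes)   # pain point: peak exceeds cluster
    idle = max(0.0, storage_nodes - average)   # opportunity: average is below it
    return shortfall, idle


# Hypothetical customer: 10 nodes cover storage, compute is mostly quiet
# with one large burst.
print(burst_and_idle([2, 2, 2, 30], storage_nodes=10))  # (20, 1.0)
```

Here the same customer is 20 nodes short at peak and still leaves a node idle on average, so sizing the cluster by storage alone fits neither case.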
Solution: Change Altiscale’s Product!
Make “Infinite” computation available to customers
Multitenancy implementation phases, each of which includes a milestone with production deliverables
0. Automation for burn/add/remove nodes
1. Deploy Linux containers using Docker
2. Decouple compute/storage + manual bursting
3. Automation: orchestrate add/remove nodes according to allocation plan from the capacity team.
4. Optimized: predictive allocation, economic incentives
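Phase 3 amounts to reconciling the current node allocation with a plan. The talk does not describe how this is implemented; the sketch below only shows the general idea, and the data shapes, function name, and customer names are hypothetical.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str, int]  # (operation, customer, node count)


def plan_actions(current: Dict[str, int], planned: Dict[str, int]) -> List[Action]:
    """Diff current vs. planned compute nodes per customer into add/remove steps."""
    actions: List[Action] = []
    for customer in sorted(set(current) | set(planned)):
        delta = planned.get(customer, 0) - current.get(customer, 0)
        if delta > 0:
            actions.append(("add", customer, delta))
        elif delta < 0:
            actions.append(("remove", customer, -delta))
    # Removals first, so freed nodes are available for the additions.
    # sorted() is stable, so customers stay in alphabetical order per group.
    return sorted(actions, key=lambda action: action[0] != "remove")


current = {"acme": 10, "globex": 5}
planned = {"acme": 8, "globex": 9, "initech": 2}
print(plan_actions(current, planned))
# [('remove', 'acme', 2), ('add', 'globex', 4), ('add', 'initech', 2)]
```

Decoupling compute from storage (phase 2) is what makes such moves cheap, since shifting compute nodes between customers no longer requires moving HDFS data.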
Physical Cluster per Customer
NM and DN in Docker Containers
Decouple Compute/Storage
What Customers Get
On-demand access to “Infinite” Computation
Ability to handle unexpected needs without contacting Altiscale
“Access to a $10M cluster for just $1M”
Future…
Ability to package Hadoop job environment using Docker (YARN-1964)
Challenges to the Hadoop Community
Hive + Hadoop debugging can get very complex
• Sifting through many logs and screens
• Automatic transmission versus manual transmission
Static partitioning induced by Java Virtual Machine has benefits but also induces challenges.
Where there are difficulties, there’s opportunity:
• Better tooling, instrumentation, integration of logs/metrics
YARN still evolving into an operating system
Just starting to build real multitenancy into Hadoop.
Hadoop as a Service: aggregate and share expertise