MapReduce Container ReUse

© Hortonworks Inc. 2012 MR Container Re-Use (Not YARN specific) Siddarth Seth Member of Technical Staff


Page 1: MapReduce Container ReUse

© Hortonworks Inc. 2012

MR Container Re-Use (Not YARN specific)

Siddarth Seth

Member of Technical Staff

Page 2: MapReduce Container ReUse

Current AM Components

[Diagram components: Job, Task, TaskAttempt, RMAllocator / Scheduler, Container Launcher, TA Listener, Running Task, RM, NM]

Page 3: MapReduce Container ReUse

• TaskAttempt and Container operations are tightly coupled
  – CLC (ContainerLaunchContext) construction and Container Launch invocation are handled by the TaskAttempt
  – Container Launch is tied to the TaskAttempt (instead of container size, LocalResources)
  – Container shutdown is handled by the TaskAttempt as well
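One way to picture the alternative hinted at above, where a launch depends on container size and LocalResources rather than on a particular TaskAttempt, is the minimal sketch below. It is not the MR AM code: `LaunchSpec`, `TaskAttempt.launch_spec`, and `can_run_in` are hypothetical names, and the "fits" rule (enough memory, and every needed resource already localized) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class LaunchSpec:
    """Hypothetical launch description that does not mention any TaskAttempt."""
    memory_mb: int
    local_resources: FrozenSet[str]


@dataclass(frozen=True)
class TaskAttempt:
    """Hypothetical task attempt that only states what it needs."""
    attempt_id: str
    memory_mb: int
    local_resources: FrozenSet[str]

    def launch_spec(self) -> LaunchSpec:
        # The spec is derived from requirements only, so two attempts with
        # the same requirements produce equal specs.
        return LaunchSpec(self.memory_mb, self.local_resources)


def can_run_in(container_spec: LaunchSpec, attempt: TaskAttempt) -> bool:
    # Assumed rule: the container is large enough and already has every
    # resource the attempt needs.
    return (attempt.memory_mb <= container_spec.memory_mb
            and attempt.local_resources <= container_spec.local_resources)


job_files = frozenset({"job.jar", "job.xml"})
first_map = TaskAttempt("m_000000_0", 1024, job_files)
second_map = TaskAttempt("m_000001_0", 1024, job_files)
big_reduce = TaskAttempt("r_000000_0", 2048, job_files)

container_spec = first_map.launch_spec()
print(can_run_in(container_spec, second_map))  # same requirements
print(can_run_in(container_spec, big_reduce))  # needs more memory
```

Because the spec carries no attempt identity, a container launched for one map could in principle be handed a second map with the same requirements, which is the precondition for re-use.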

Page 4: MapReduce Container ReUse

AM post 3902

[Diagram components: Job, Task, TaskAttempt, AMScheduler, NM Communicator, TA Listener, RM, NM, Container Listener, Running Task, Running JVM, Container, Node, Rack ?]

Page 5: MapReduce Container ReUse

• Container and Node have their own states.
• Containers interact with the NodeManager.
• Tasks interact with the scheduler, which matches containers to task attempts.
• Nodes take care of blacklisting, which simplifies the scheduler.
• Easier to write a custom scheduler.
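The division of labour described on this slide can be sketched roughly as follows. This is an illustrative model only, not the AM implementation: `Node`, `Container`, `ReuseScheduler`, the `max_failures` threshold, and the first-idle-container matching policy are all assumptions introduced here.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum, auto
from typing import Deque, Dict, List, Optional, Tuple


class ContainerState(Enum):
    IDLE = auto()
    RUNNING = auto()


@dataclass
class Node:
    """Hypothetical node that tracks its own failures and blacklisting."""
    host: str
    max_failures: int = 3  # assumed threshold, not taken from the slides
    failures: int = 0

    @property
    def blacklisted(self) -> bool:
        return self.failures >= self.max_failures

    def record_failure(self) -> None:
        self.failures += 1


@dataclass
class Container:
    """Hypothetical container with its own state, independent of any attempt."""
    container_id: str
    node: Node
    state: ContainerState = ContainerState.IDLE
    attempt_id: Optional[str] = None


class ReuseScheduler:
    """Matches pending task attempts to idle containers on usable nodes."""

    def __init__(self) -> None:
        self.containers: Dict[str, Container] = {}
        self.pending: Deque[str] = deque()

    def add_container(self, container: Container) -> None:
        self.containers[container.container_id] = container

    def submit(self, attempt_id: str) -> None:
        self.pending.append(attempt_id)

    def schedule(self) -> List[Tuple[str, str]]:
        assignments = []
        for container in self.containers.values():
            if not self.pending:
                break
            # The scheduler only asks the node whether it is usable; the
            # failure bookkeeping lives in Node.
            if container.state is ContainerState.IDLE and not container.node.blacklisted:
                container.attempt_id = self.pending.popleft()
                container.state = ContainerState.RUNNING
                assignments.append((container.attempt_id, container.container_id))
        return assignments

    def attempt_finished(self, container_id: str, succeeded: bool) -> None:
        container = self.containers[container_id]
        if not succeeded:
            container.node.record_failure()
        # The container outlives the attempt and becomes available again.
        container.attempt_id = None
        container.state = ContainerState.IDLE
```

Since blacklisting is a property of the node, a custom scheduler written against this shape only has to decide which idle container gets which attempt.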

Page 6: MapReduce Container ReUse

Current State

• Most of the AM functional changes are done. (Cleanup pending)

• Task side changes are required.

• A re-use scheduler needs to be implemented.

Page 7: MapReduce Container ReUse

Facilitates

• Common MapOutputBuffer for maps assigned to the same container.

• Merging per-node or per-rack map output

• Custom Task Types
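To illustrate the first item, the sketch below shows several map tasks that run in the same container writing into one shared buffer, which is then sorted and flushed once. It is a toy model, not Hadoop's MapOutputBuffer: `SharedMapOutputBuffer`, `run_map`, the word-count map function, and the CRC32-based partitioning are hypothetical choices made for this example.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


class SharedMapOutputBuffer:
    """Hypothetical buffer shared by all maps that run in one container."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        self._partitions: Dict[int, List[Tuple[str, int]]] = defaultdict(list)

    def collect(self, key: str, value: int) -> None:
        # Deterministic partitioning (assumed here; a real job would use its
        # configured partitioner).
        partition = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        self._partitions[partition].append((key, value))

    def flush(self) -> Dict[int, List[Tuple[str, int]]]:
        # One sorted run per partition for all maps that shared the buffer.
        output = {p: sorted(records) for p, records in self._partitions.items()}
        self._partitions.clear()
        return output


def run_map(lines: Iterable[str], buffer: SharedMapOutputBuffer) -> None:
    # Toy word-count map function, used only to produce some records.
    for line in lines:
        for word in line.split():
            buffer.collect(word, 1)


buffer = SharedMapOutputBuffer(num_partitions=1)
run_map(["a b"], buffer)   # first map assigned to the container
run_map(["b c"], buffer)   # second map re-uses the container and the buffer
merged = buffer.flush()
```

With a per-task buffer each map would produce its own sorted output; sharing the buffer means the two maps above yield a single sorted run per partition.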
