Datacenter Management with Apache Mesos
![Page 1: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/1.jpg)
Benjamin Hindman – @benh
Datacenter Management with Apache Mesos
mesos.apache.org
@ApacheMesos
![Page 2: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/2.jpg)
I’ve got tons of data ...
![Page 3: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/3.jpg)
… more every day!
![Page 4: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/4.jpg)
That must be why they call it a datacenter.
![Page 5: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/5.jpg)
I’d love to answer some questions with the help
of my data!
![Page 6: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/6.jpg)
I think I’ll try Hadoop.
![Page 7: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/7.jpg)
your datacenter
![Page 8: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/8.jpg)
+ Hadoop
![Page 9: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/9.jpg)
happy?
![Page 10: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/10.jpg)
Not exactly …
![Page 11: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/11.jpg)
… Hadoop is a big hammer, but not
everything is a nail!
![Page 12: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/12.jpg)
I’ve got some iterative algorithms, I want to try
Spark!
![Page 13: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/13.jpg)
datacenter management
![Page 14: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/14.jpg)
datacenter management
![Page 15: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/15.jpg)
datacenter management
![Page 16: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/16.jpg)
static partitioning
![Page 17: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/17.jpg)
Oh noes! Spark wants to read and write data to
HDFS!
![Page 18: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/18.jpg)
Hadoop …
(map/reduce)
(distributed file system)
![Page 19: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/19.jpg)
HDFS
![Page 20: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/20.jpg)
HDFS
![Page 21: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/21.jpg)
Could we just give Spark its own HDFS cluster
too?
![Page 22: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/22.jpg)
HDFS
![Page 23: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/23.jpg)
HDFS
![Page 24: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/24.jpg)
HDFS
![Page 25: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/25.jpg)
HDFS
tee incoming data (2 copies)
![Page 26: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/26.jpg)
HDFS
tee incoming data (2 copies)
periodic copy/sync
![Page 27: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/27.jpg)
That sounds annoying … let’s not do that. Can we do any better though?
![Page 28: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/28.jpg)
HDFS
![Page 29: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/29.jpg)
HDFS
![Page 30: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/30.jpg)
HDFS
![Page 31: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/31.jpg)
happy now?
![Page 32: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/32.jpg)
No! We’ve decided to start doing real-time
computation with Storm …
![Page 33: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/33.jpg)
datacenter management
![Page 34: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/34.jpg)
datacenter management
![Page 35: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/35.jpg)
happy now!?
![Page 36: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/36.jpg)
Not really … during the day I’d rather give more machines to Spark but at
night I’d rather give more machines to
Hadoop!
![Page 37: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/37.jpg)
datacenter management
![Page 38: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/38.jpg)
datacenter management
![Page 39: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/39.jpg)
datacenter management
![Page 40: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/40.jpg)
datacenter management
![Page 41: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/41.jpg)
![Page 42: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/42.jpg)
And failures require more datacenter management!
![Page 43: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/43.jpg)
datacenter management
![Page 44: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/44.jpg)
datacenter management
![Page 45: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/45.jpg)
datacenter management
![Page 46: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/46.jpg)
![Page 47: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/47.jpg)
![Page 48: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/48.jpg)
![Page 49: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/49.jpg)
![Page 50: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/50.jpg)
I don’t want to deal with this!
![Page 51: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/51.jpg)
the datacenter …
rather than think about the datacenter like this …
![Page 52: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/52.jpg)
… is a computer
think about it like this …
![Page 53: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/53.jpg)
datacenter computer
applications
resources
filesystem
![Page 54: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/54.jpg)
mesos
applications
resources
filesystem
kernel
![Page 55: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/55.jpg)
Okay, so how does it work?
![Page 56: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/56.jpg)
Step 1: HDFS
![Page 57: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/57.jpg)
Step 2: Mesos
run a “master” (or multiple for high availability)
![Page 58: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/58.jpg)
Step 2: Mesos
run “slaves” on the rest of the machines
![Page 59: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/59.jpg)
Step 3: Frameworks
![Page 60: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/60.jpg)
Step 3: Frameworks
![Page 61: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/61.jpg)
Step 3: Frameworks
![Page 62: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/62.jpg)
Step 3: Frameworks
![Page 63: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/63.jpg)
Step 3: Frameworks
![Page 64: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/64.jpg)
Step 3: Frameworks
![Page 65: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/65.jpg)
Step 3: Frameworks
![Page 66: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/66.jpg)
Step 3: Frameworks
![Page 67: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/67.jpg)
Step 3: Frameworks
![Page 68: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/68.jpg)
Step 3: Frameworks
![Page 69: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/69.jpg)
Step 3: Frameworks
![Page 70: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/70.jpg)
Step 3: Frameworks
![Page 71: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/71.jpg)
Step 3: Frameworks
![Page 72: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/72.jpg)
$tep 4: Profit
![Page 73: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/73.jpg)
$tep 4: Profit (utilize)
just one big pool of resources, utilize single machines more fully!
![Page 74: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/74.jpg)
$tep 4: Profit (utilize)
![Page 75: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/75.jpg)
$tep 4: Profit (utilize)
![Page 76: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/76.jpg)
$tep 4: Profit (utilize)
![Page 77: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/77.jpg)
$tep 4: Profit (utilize)
![Page 78: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/78.jpg)
$tep 4: Profit (utilize)
![Page 79: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/79.jpg)
$tep 4: Profit (statistical multiplexing)
![Page 80: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/80.jpg)
$tep 4: Profit (statistical multiplexing)
![Page 81: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/81.jpg)
$tep 4: Profit (statistical multiplexing)
![Page 82: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/82.jpg)
$tep 4: Profit (statistical multiplexing)
![Page 83: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/83.jpg)
$tep 4: Profit (statistical multiplexing)
![Page 84: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/84.jpg)
$tep 4: Profit (statistical multiplexing)
reduces CapEx and OpEx!
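The savings claim can be illustrated with a toy calculation. The sketch below assumes made-up demand figures that follow the day/night pattern described earlier in the talk (Spark busy by day, Hadoop busy at night). Statically partitioned clusters must each be sized for their own peak, while a shared pool only needs to cover the peak of the combined demand. The function names and numbers are hypothetical.

```python
# Toy illustration of statistical multiplexing. All demand figures are
# made-up assumptions, not measurements from the talk.

# Machines needed per 6-hour block: [night, morning, afternoon, evening]
demand = {
    "hadoop": [80, 30, 20, 60],   # batch jobs peak at night
    "spark":  [10, 60, 70, 30],   # interactive analysis peaks by day
    "storm":  [20, 25, 25, 20],   # streaming load is fairly flat
}

def static_partition_size(demand):
    """Each framework gets its own cluster, sized for its own peak."""
    return sum(max(series) for series in demand.values())

def shared_pool_size(demand):
    """One pool, sized for the peak of the combined demand."""
    combined = [sum(block) for block in zip(*demand.values())]
    return max(combined)

static = static_partition_size(demand)
shared = shared_pool_size(demand)
print(f"static partitions: {static} machines")
print(f"shared pool:       {shared} machines")
print(f"machines saved:    {static - shared}")
```

The gap between the two numbers grows as the frameworks' peaks become less correlated, which is why mixing batch, interactive, and streaming workloads in one pool tends to pay off.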
![Page 85: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/85.jpg)
$tep 4: Profit (statistical multiplexing)
reduces latency!
![Page 86: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/86.jpg)
$tep 4: Profit (statistical multiplexing)
![Page 87: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/87.jpg)
$tep 4: Profit (failures)
![Page 88: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/88.jpg)
$tep 4: Profit (failures)
![Page 89: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/89.jpg)
$tep 4: Profit (failures)
![Page 90: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/90.jpg)
This sounds pretty good!
![Page 91: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/91.jpg)
Other than Hadoop, Spark, and Storm, what
else can I run on Mesos?
![Page 92: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/92.jpg)
frameworks
• Hadoop (github.com/mesos/hadoop)
• Spark (github.com/mesos/spark)
• DPark (github.com/douban/dpark)
• Storm (github.com/nathanmarz/storm)
• Chronos (github.com/airbnb/chronos)
• MPICH2 (in mesos git repository)
• Aurora (proposed for Apache incubator)
![Page 93: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/93.jpg)
What about XYZ?
![Page 94: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/94.jpg)
port an existing framework
strategy: write a “wrapper” which launches existing components on mesos
~100 lines of code to write a wrapper (the more lines, the more you can take advantage of elasticity or other mesos features)
see src/examples/ in mesos repository
![Page 95: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/95.jpg)
write a new framework!
as a “kernel”, mesos provides a lot of primitives that make writing a new framework relatively easy
primitives: extracted commonality across existing distributed systems/frameworks (launching tasks, doing failure detection, etc) … why re-implement them each time!?
![Page 96: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/96.jpg)
case study: chronos
distributed cron with dependencies
developed at airbnb
~3k lines of Scala!
distributed, highly available, and fault tolerant without any network programming!
http://github.com/airbnb/chronos
![Page 97: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/97.jpg)
Hmm … if Mesos gives me a datacenter
computer … can I run stuff other than
analytics?
![Page 98: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/98.jpg)
case study: aurora
run N instances of my server, somewhere, forever
(where server == arbitrary command line)
developed at Twitter
runs hundreds of production services, including ads!
recently proposed for Apache Incubator!
![Page 99: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/99.jpg)
aurora
![Page 100: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/100.jpg)
aurora
![Page 101: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/101.jpg)
aurora
![Page 102: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/102.jpg)
aurora
![Page 103: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/103.jpg)
aurora
![Page 104: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/104.jpg)
But what about resource isolation!? I don’t want
my end users to have to wait for our website to
load because of resource contention!
![Page 105: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/105.jpg)
resource isolation
Linux control groups (cgroups)
CPU (upper and lower bounds)
memory
network I/O (traffic controller)
filesystem (lvm, in progress)
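To make the CPU and memory bounds concrete, here is a minimal sketch of how a task's resources might be translated into cgroup v1 control-file values. The control-file names (`cpu.shares`, `cpu.cfs_quota_us`, `cpu.cfs_period_us`, `memory.limit_in_bytes`) are standard cgroup v1 controls. The constants and the `cgroup_settings` helper are assumptions for illustration and are not taken from the Mesos source. The code only computes the values and does not touch the cgroup filesystem.

```python
# Sketch: map a task's resources onto cgroup v1 control-file values.
# The file names are real cgroup v1 controls; the constants and this
# helper are illustrative assumptions, not Mesos source code.

SHARES_PER_CPU = 1024        # assumed weight per CPU
CFS_PERIOD_US = 100_000      # assumed 100 ms scheduling period

def cgroup_settings(cpus: float, mem_mb: int) -> dict:
    """Return control-file name -> value for one task or executor."""
    return {
        # Lower bound: relative share of CPU time under contention.
        "cpu.shares": int(cpus * SHARES_PER_CPU),
        # Upper bound: hard cap of `cpus` worth of CPU time per period.
        "cpu.cfs_period_us": CFS_PERIOD_US,
        "cpu.cfs_quota_us": int(cpus * CFS_PERIOD_US),
        # Memory limit in bytes.
        "memory.limit_in_bytes": mem_mb * 1024 * 1024,
    }

settings = cgroup_settings(cpus=2, mem_mb=1024)
for name, value in settings.items():
    print(f"{name} = {value}")
```

Shares act as the lower bound because they only matter when CPUs are contended, while the quota is enforced even on an idle machine.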
![Page 106: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/106.jpg)
conclusions
datacenter management is a pain
![Page 107: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/107.jpg)
conclusions
mesos makes running frameworks on your datacenter easier, and it increases utilization and performance while reducing CapEx and OpEx!
![Page 108: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/108.jpg)
conclusions
rather than build your next distributed system from scratch, consider using mesos
![Page 109: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/109.jpg)
conclusions
you can share your datacenter between analytics and online services!
![Page 110: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/110.jpg)
Questions?
mesos.apache.org
@ApacheMesos
![Page 111: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/111.jpg)
framework commonality
run processes simultaneously (distributed)
handle process failures (fault-tolerance)
optimize execution (elasticity, scheduling)
![Page 112: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/112.jpg)
primitives
scheduler – distributed system “master” or “coordinator”
(executor – lower-level control of task execution, optional)
requests/offers – resource allocations
tasks – “threads” of the distributed system
…
![Page 113: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/113.jpg)
scheduler
Apache Hadoop
Chronos
![Page 114: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/114.jpg)
scheduler
(1) brokers for resources
(2) launches tasks
(3) handles task termination
![Page 115: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/115.jpg)
brokering for resources
(1) make resource requests
    2 CPUs, 1 GB RAM, slave *
(2) respond to resource offers
    4 CPUs, 4 GB RAM, slave foo.bar.com
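The request and offer on this slide can be turned into a small worked example. The sketch below uses hypothetical `Offer` and `TaskSpec` types that do not mirror the real Mesos API. It shows a scheduler that wants tasks of 2 CPUs and 1 GB each, working out how many of them fit in an offer of 4 CPUs and 4 GB on `foo.bar.com`.

```python
# Sketch of a scheduler responding to a resource offer. The Offer and
# TaskSpec types are hypothetical and do not mirror the Mesos API.
from dataclasses import dataclass

@dataclass
class Offer:
    slave: str
    cpus: int
    mem_gb: int

@dataclass
class TaskSpec:
    cpus: int
    mem_gb: int

def tasks_that_fit(offer: Offer, task: TaskSpec) -> int:
    """How many copies of `task` fit inside one offer."""
    return min(offer.cpus // task.cpus, offer.mem_gb // task.mem_gb)

def respond(offer: Offer, task: TaskSpec, pending: int) -> dict:
    """Launch as many pending tasks as fit; the remainder goes unused."""
    n = min(pending, tasks_that_fit(offer, task))
    return {
        "slave": offer.slave,
        "launch": n,
        "unused_cpus": offer.cpus - n * task.cpus,
        "unused_mem_gb": offer.mem_gb - n * task.mem_gb,
    }

offer = Offer(slave="foo.bar.com", cpus=4, mem_gb=4)
task = TaskSpec(cpus=2, mem_gb=1)
decision = respond(offer, task, pending=5)
print(decision)
```

Here CPU is the limiting resource: two tasks use all 4 CPUs and leave 2 GB of memory unused, which the scheduler would hand back rather than hold.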
![Page 116: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/116.jpg)
offers: non-blocking resource allocation
exist to answer the question:
“what should mesos do if it can’t satisfy a request?”
(1) wait until it can
(2) offer the best allocation it can immediately
![Page 117: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/117.jpg)
offers: non-blocking resource allocation
exist to answer the question:
“what should mesos do if it can’t satisfy a request?”
(1) wait until it can
(2) offer the best allocation it can immediately
![Page 118: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/118.jpg)
resource allocation
Apache Hadoop
Chronos
request
![Page 119: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/119.jpg)
resource allocation
Apache Hadoop
Chronos
request
allocator
dominant resource fairness
resource reservations
![Page 120: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/120.jpg)
resource allocation
Apache Hadoop
Chronos
request
allocator
dominant resource fairness
resource reservations
optimistic
pessimistic
![Page 121: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/121.jpg)
resource allocation
Apache Hadoop
Chronos
request
allocator
dominant resource fairness
resource reservations
optimistic
pessimistic
no overlapping offers
all overlapping offers
![Page 122: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/122.jpg)
resource allocation
Apache Hadoop
Chronos
offer
allocator
dominant resource fairness
resource reservations
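Dominant resource fairness can be sketched in a few lines. The version below is a simplified progressive-filling loop, not the Mesos allocator itself. It repeatedly grants one task to the framework whose dominant share (its largest fractional use of any single resource) is currently lowest. The cluster size and the per-task demands are assumptions chosen for illustration.

```python
# Sketch of dominant resource fairness (DRF) via progressive filling.
# Cluster size and per-task demands are illustrative assumptions.

def drf_allocate(capacity, demands):
    """Give one task at a time to the framework with the lowest dominant share.

    capacity: total resources, e.g. {"cpus": 9, "mem_gb": 18}
    demands:  {framework: per-task resource dict}
    Returns {framework: number of tasks granted}.
    """
    used = {r: 0 for r in capacity}
    tasks = {f: 0 for f in demands}
    active = set(demands)

    def dominant_share(f):
        return max(tasks[f] * demands[f][r] / capacity[r] for r in capacity)

    while active:
        # sorted() makes tie-breaking deterministic (alphabetical).
        f = min(sorted(active), key=dominant_share)
        need = demands[f]
        if all(used[r] + need[r] <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += need[r]
            tasks[f] += 1
        else:
            active.remove(f)   # this framework's next task no longer fits
    return tasks

allocation = drf_allocate(
    capacity={"cpus": 9, "mem_gb": 18},
    demands={
        "hadoop":  {"cpus": 1, "mem_gb": 4},   # memory-heavy tasks
        "chronos": {"cpus": 3, "mem_gb": 1},   # CPU-heavy tasks
    },
)
print(allocation)
```

With these numbers both frameworks end up with a dominant share of two thirds: three memory-heavy tasks use 12 of 18 GB, and two CPU-heavy tasks use 6 of 9 CPUs.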
![Page 123: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/123.jpg)
“two-level scheduling”
mesos: controls resource allocations to framework schedulers
schedulers: make decisions about what to run given allocated resources
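The split between the two levels can be sketched as two small classes. All names here are hypothetical and are not the Mesos API, and the round-robin allocator is a stand-in for a real policy such as dominant resource fairness. The point is only that the allocator decides who is offered resources, while each framework's scheduler decides what to run with them.

```python
# Sketch of two-level scheduling. Class and method names are hypothetical,
# not the Mesos API; the round-robin allocator stands in for a real policy.
from itertools import cycle

class FrameworkScheduler:
    """Level 2: decides what to run with the resources it is offered."""
    def __init__(self, name, cpus_per_task, pending):
        self.name = name
        self.cpus_per_task = cpus_per_task
        self.pending = pending

    def resource_offer(self, cpus):
        """Return the CPUs accepted; the rest stays in the pool."""
        n = min(self.pending, cpus // self.cpus_per_task)
        self.pending -= n
        return n * self.cpus_per_task

class Allocator:
    """Level 1: decides which framework is offered the free resources."""
    def __init__(self, total_cpus, frameworks):
        self.free = total_cpus
        self.frameworks = frameworks

    def run(self, rounds):
        log = []
        turn = cycle(self.frameworks)
        for _ in range(rounds):
            fw = next(turn)
            accepted = fw.resource_offer(self.free)
            self.free -= accepted
            log.append((fw.name, accepted))
        return log

hadoop = FrameworkScheduler("hadoop", cpus_per_task=2, pending=3)
chronos = FrameworkScheduler("chronos", cpus_per_task=1, pending=1)
allocator = Allocator(total_cpus=8, frameworks=[hadoop, chronos])
log = allocator.run(rounds=2)
print(log, "free cpus:", allocator.free)
```

Neither class needs to know the other's internals: the allocator never sees task definitions, and the schedulers never see the rest of the cluster.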
![Page 124: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/124.jpg)
end-to-end principle
“application-specific functions ought to reside in the end hosts of a network rather than intermediary nodes”
![Page 125: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/125.jpg)
tasks
either a concrete command line or an opaque description (which requires a framework executor to execute)
a consumer of resources
![Page 126: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/126.jpg)
task operations
launching/killing
health monitoring/reporting (failure detection)
resource usage monitoring (statistics)
![Page 127: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/127.jpg)
resource isolation
cgroup per executor or task (if no executor)
resource controls adjusted dynamically as tasks come and go!
![Page 128: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/128.jpg)
case study: chronos
distributed cron with dependencies
built at airbnb by @flo
![Page 129: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/129.jpg)
before chronos
![Page 130: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/130.jpg)
before chronos
single point of failure (and AWS was unreliable)
resource starved (not scalable)
![Page 131: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/131.jpg)
chronos requirements
fault tolerance
distributed (elastically take advantage of resources)
retries (make sure a command eventually finishes)
dependencies
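Two of these requirements, dependencies and retries, can be illustrated with a short standalone sketch. This is not how Chronos is implemented. The job names, the dependency graph, and the deliberately flaky runner are assumptions. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to order the jobs.

```python
# Sketch of "cron with dependencies" plus retries. Job names, the
# dependency graph, and the flaky runner are illustrative assumptions;
# this is not how Chronos itself is implemented.
from graphlib import TopologicalSorter

def run_with_retries(run, job, max_attempts=3):
    """Retry a job until it succeeds; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if run(job, attempt):
            return attempt
    raise RuntimeError(f"{job} failed after {max_attempts} attempts")

def run_all(dependencies, run):
    """Run each job after the jobs it depends on; return (job, attempts) pairs."""
    order = TopologicalSorter(dependencies).static_order()
    return [(job, run_with_retries(run, job)) for job in order]

# Each job maps to the set of jobs that must finish first.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "report": {"transform"},
}

def flaky_run(job, attempt):
    """Pretend 'transform' fails on its first attempt."""
    return not (job == "transform" and attempt == 1)

results = run_all(dependencies, flaky_run)
print(results)
```

A real scheduler would launch each job as a task on whatever machine has free resources and would learn about failures from task status reports, rather than calling a function in-process.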
![Page 132: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/132.jpg)
chronos
leverages the primitives of mesos
~3k lines of scala
highly available (uses Mesos state)
distributed / elastic
no actual network programming!
![Page 133: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/133.jpg)
after chronos
![Page 134: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/134.jpg)
after chronos + hadoop
![Page 135: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/135.jpg)
case study: aurora
“run 200 of these, somewhere, forever”
built at Twitter
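The “somewhere, forever” part amounts to a reconcile loop: compare the desired instance count with what is actually running, and relaunch whatever is missing. The sketch below uses a made-up in-memory cluster model and is not Aurora's implementation. The `Service` class, the machine names, and the placement rule are all assumptions.

```python
# Sketch of "run N of these, somewhere, forever" as a reconcile loop.
# The in-memory cluster model is an assumption for illustration only;
# it is not Aurora's implementation.
from itertools import count

class Service:
    def __init__(self, desired):
        self.desired = desired
        self.running = {}          # instance id -> machine
        self._ids = count()

    def reconcile(self, healthy_machines):
        """Drop instances on dead machines, then relaunch until N are running."""
        self.running = {i: m for i, m in self.running.items()
                        if m in healthy_machines}
        machines = sorted(healthy_machines)
        launched = 0
        while len(self.running) < self.desired and machines:
            instance = next(self._ids)
            # Naive placement: spread instances across machines by id.
            self.running[instance] = machines[instance % len(machines)]
            launched += 1
        return launched

service = Service(desired=4)
first = service.reconcile({"m1", "m2", "m3"})      # initial launch
second = service.reconcile({"m1", "m3"})           # m2 has failed
print(first, second, len(service.running))
```

Because the caller never names a machine, losing `m2` only triggers a relaunch elsewhere, which is the property that lets hardware outages stop being site outages.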
![Page 136: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/136.jpg)
before aurora
static partitioning of machines to services
hardware outages caused site outages
puppet + monit
ops couldn’t scale as fast as engineers
![Page 137: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/137.jpg)
aurora
highly available (uses mesos replicated log)
uses a python DSL to describe services
leverages service discovery and proxying (see Twitter commons)
![Page 138: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/138.jpg)
after aurora
power loss to 19 racks, no lost services!
more than 400 engineers running services
largest cluster has >2500 machines
![Page 139: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/139.jpg)
Mesos
[diagram: nodes running Mesos, shared by Hadoop, Spark, MPI, Storm, and Chronos]
![Page 140: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/140.jpg)
Mesos
[diagram: nodes running Mesos, shared by Hadoop, Spark, MPI, …]
![Page 141: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/141.jpg)
Mesos
[diagram: nodes running Mesos, shared by Hadoop, Spark, MPI, Storm, …]
![Page 142: Datacenter Management with Apache Mesos](https://reader038.fdocuments.in/reader038/viewer/2022110215/56816920550346895de04f03/html5/thumbnails/142.jpg)
Mesos
[diagram: nodes running Mesos, shared by Hadoop, Spark, MPI, Storm, Chronos, …]