Mesos: The Operating System for your Datacenter
Transcript of Mesos: The Operating System for your Datacenter
Mesos: The Datacenter Operating System
David Greenberg Two Sigma
Who am I?
• Architected project to build a massive Mesos cluster
• Building custom framework and leveraging open source
The Plan
What is Mesos?
How can I use Mesos?
How can I build on Mesos?
What is Mesos?
A long time ago…
Are you done with the
machine? I need to load my cards.
Lol no; maybe tomorrow.
1957
Oh man! Let’s all share the
computer, AT THE SAME TIME!
John McCarthy Popularized Timesharing
A long time ago…
Are you done with the Hadoop cluster? I need to run my analytics job.
Lol no; maybe tomorrow.
2010
Oh man! Let’s all share the cluster,
AT THE SAME TIME!
Ben Hindman Popularized Mesos
Good ideas today mirror good ideas of yesteryear
Mesos: an Operating System
Isolation
Resource Sharing
Common Infrastructure
• read(), write(), open() • bind(), connect() • apt-get, yum
• launchTask(), killTask(), statusUpdate()
• Docker
Distributed System* Anatomy
Workers
Coordinator
* Excluding peer-to-peer systems
Static Partitioning
Coordinator (Hadoop) Coordinator (Storm)
Mesos (slaves)
Mesos: a Level of Indirection
Coordinator
Mesos (master)
Coordinator
Mesos (slaves)
Mesos: a Level of Indirection
Coordinator
Mesos (master)
Coordinator
Coordinating Execution
≈ Scheduling
Mesos (slaves)
Coordinator
Mesos (master)
s/Coordinator/Scheduler/
Mesos (slaves)
Scheduler
Mesos (master)
s/Coordinator/Scheduler/
Mesos (slaves)
JobTracker (Scheduler)
Mesos (master)
Apache Hadoop
Distributed System
≈ (Mesos) framework
a Mesos framework is a distributed system that has a coordinator
a Mesos framework is a distributed system that has a scheduler
a Mesos framework is an app for your cluster
How can I use Mesos?
Tons of Flexibility!
Jenkins
• Continuous build server
• Just install a plugin!
Hadoop
• Multi-cluster isolation • Fast startup
• Just run the repacked Cloudera CDH 4.2.1 MR1 distribution for Mesos
Marathon
• PaaS on Mesos • init.d for the cluster • Docker support • Scales at the click of a button
• Manages edge routers - HAProxy
Chronos
• Distributed cron • Supports job dependencies
• REST API
Aurora
• Advanced PaaS on Mesos • Powers Twitter • Supports phased rollouts • Supports complex deployments
Spark
• In-memory MapReduce, built for “Medium Data”
• Supports SQL as well as Java, Python, and Scala
• Designed for interactive analysis via REPL
How do I use these?
• Free online interactive tutorials! – http://mesosphere.io/learn
• Covers all of the previously mentioned and many more
How can I build on Mesos?
Cluster Manager Status Quo
Cluster Manager
Application/Human
Specification
The specification includes as much information as possible to assist the cluster manager in scheduling and execution
Cluster Manager Status Quo
Cluster Manager
Application/Human
Wait for task to be executed
Cluster Manager Status Quo
Cluster Manager
Application/Human
Result
Problems with Specifications
① Hard to specify certain desires or constraints
② Hard to update specifications dynamically as tasks execute and finish/fail
An Alternative Model
Mesos
Scheduler
request 3 CPUs 2 GB RAM
• A request is a purposely simplified subset of a specification
• It is just the required resources at that point in time
What should you do if you can’t satisfy a request?
① Wait until you can …
② Offer the best you can immediately
Mesos Model
Mesos
Scheduler
offer hostname 4 CPUs 4 GB RAM
• Resources are allocated via resource offers
• A resource offer represents a snapshot of available resources that a scheduler can use to run tasks
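The request and offer figures on these slides (a request for 3 CPUs and 2 GB RAM, an offer of 4 CPUs and 4 GB RAM on one host) can be modelled as plain data. The sketch below is only an illustration: `Resources`, `Offer`, and `respond` are hypothetical names made up for this example and are not part of the Mesos API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource bundle: CPUs and memory in GB."""
    cpus: float
    mem_gb: float

    def fits_in(self, other):
        # True when every dimension of self is covered by other.
        return self.cpus <= other.cpus and self.mem_gb <= other.mem_gb

    def minus(self, other):
        return Resources(self.cpus - other.cpus, self.mem_gb - other.mem_gb)


@dataclass(frozen=True)
class Offer:
    """Hypothetical snapshot of what one host has available right now."""
    hostname: str
    resources: Resources


def respond(offer, needed):
    """Return (accepted, leftover): accept only if the need fits the offer."""
    if needed.fits_in(offer.resources):
        return True, offer.resources.minus(needed)
    # Declining leaves the offered resources untouched.
    return False, offer.resources


offer = Offer("hostname", Resources(cpus=4, mem_gb=4))
accepted, leftover = respond(offer, Resources(cpus=3, mem_gb=2))
```

With the slide's numbers the offer is accepted and 1 CPU and 2 GB RAM are left over, which could back another offer.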
An Analogue: non-blocking sockets
Kernel
Application
write(s, buffer, size);
An Analogue: non-blocking sockets
Kernel
Application
42 of 100 bytes written!
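The socket analogy can be tried directly. The sketch below assumes a Unix-like system where `socket.socketpair()` is available; `try_write` and `demo_partial_write` are names invented for this illustration. A non-blocking send accepts whatever fits in the kernel buffer right now and reports how much that was, rather than waiting until the whole buffer can be written.

```python
import socket


def try_write(sock, data):
    """Attempt one non-blocking write; return the number of bytes accepted."""
    try:
        return sock.send(data)
    except BlockingIOError:
        # The kernel buffer is completely full: nothing was accepted.
        return 0


def demo_partial_write(size=8 * 1024 * 1024):
    """Write a large payload to a socket nobody reads from."""
    writer, reader = socket.socketpair()
    try:
        writer.setblocking(False)
        payload = b"x" * size
        sent = try_write(writer, payload)
        return sent, len(payload)
    finally:
        writer.close()
        reader.close()


sent, total = demo_partial_write()
print(f"{sent} of {total} bytes written")
```

The exact count depends on the platform's socket buffer sizes. The caller decides what to do with the partial result, much as a scheduler decides what to do with an offer that is smaller than it hoped for.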
offer hostname 4 CPUs 4 GB RAM
offer hostname 4 CPUs 4 GB RAM
offer hostname 4 CPUs 4 GB RAM
Mesos Model
Mesos
Scheduler
offer hostname 4 CPUs 4 GB RAM
Scheduler uses the offers to decide what tasks to run
Mesos Model
Mesos
Scheduler
Scheduler uses the offers to decide what tasks to run
“Two-level scheduling”
task 3 CPUs 2 GB RAM
Two-level Scheduling
• Mesos: controls resource allocations to schedulers
• Schedulers: make decisions about what tasks to run given allocated resources
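The two levels can be sketched as a toy simulation. Everything here is hypothetical: `QueueScheduler`, `resource_offer`, and `allocate` are illustrative names, and the round-robin hand-out is a simple stand-in rather than the allocation policy Mesos itself uses.

```python
class QueueScheduler:
    """Level 2: a framework scheduler that picks tasks from its own queue."""

    def __init__(self, name, pending):
        self.name = name
        self.pending = list(pending)  # each task is (task_name, cpus, mem_gb)

    def resource_offer(self, hostname, cpus, mem_gb):
        """First-fit: launch every pending task that still fits the offer."""
        launched, remaining = [], []
        for task in self.pending:
            task_name, need_cpus, need_mem = task
            if need_cpus <= cpus and need_mem <= mem_gb:
                cpus -= need_cpus
                mem_gb -= need_mem
                launched.append((hostname, task_name))
            else:
                remaining.append(task)
        self.pending = remaining
        return launched


def allocate(offers, schedulers):
    """Level 1: decide which scheduler sees which offer (round-robin here)."""
    placements = []
    for i, (hostname, cpus, mem_gb) in enumerate(offers):
        scheduler = schedulers[i % len(schedulers)]
        for host, task_name in scheduler.resource_offer(hostname, cpus, mem_gb):
            placements.append((scheduler.name, host, task_name))
    return placements


a = QueueScheduler("a", [("t1", 3, 2), ("t2", 2, 1)])
b = QueueScheduler("b", [("u1", 1, 1)])
placements = allocate([("host1", 4, 4), ("host2", 4, 4)], [a, b])
```

The allocator never inspects the task queues, and the schedulers never see resources they were not offered. That separation is the point of the two levels.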
Two-level Scheduling Elsewhere
• Mesos is influenced by operating-system-supported user-space scheduling – e.g. green threads, goroutines
• Mesos is designed less like a “cluster manager” and more like an operating system (or kernel)
Language Bindings
Should I build it on Mesos?
• Theme of MesosCon: it’s easy to build frameworks
• Open source and proprietary frameworks are being created all the time – Two Sigma – Netflix – Twitter – Hubspot
But should I really build it on Mesos?
• Most users just use Marathon, Hadoop, Spark, and Chronos
• Why did we build our own? – Exotic workload
The Plan, redux
What is Mesos?
How can I use Mesos?
How can I build on Mesos?
Questions?
Thank you