Mesos: The Operating System for your Datacenter

Mesos: The Datacenter Operating System, David Greenberg, Two Sigma

description

Maybe you’ve heard of Mesos—that thing that you can run Hadoop on. I think it powers Twitter? Isn’t it an Apache project, or something? In this talk, we’ll learn all about Mesos—what it is, how you can leverage it to simplify your infrastructure and reduce AWS/cloud computing costs, and why you should develop your next application on top of it. This talk will give you the tools you need to understand whether Mesos is the right fit for your infrastructure, and several starting points for learning more about Mesos.

Transcript of Mesos: The Operating System for your Datacenter

Page 1: Mesos: The Operating System for your Datacenter

Mesos: The Datacenter Operating System

David Greenberg, Two Sigma

Page 2: Mesos: The Operating System for your Datacenter

Who  am  I?  

•  Architected  project  to  build  a  massive  Mesos  cluster  

•  Building  custom  framework  and  leveraging  open  source  

Page 3: Mesos: The Operating System for your Datacenter

The  Plan  

What  is  Mesos?  

How  can  I  use  Mesos?  

How  can  I  build  on  Mesos?  

Page 4: Mesos: The Operating System for your Datacenter

What  is  Mesos?  

Page 5: Mesos: The Operating System for your Datacenter

A long time ago…

Are you done with the machine? I need to load my cards.

Lol no; maybe tomorrow.

Page 6: Mesos: The Operating System for your Datacenter

1957  

Oh man! Let’s all share the computer, AT THE SAME TIME!

John  McCarthy  Popularized  Timesharing  

Page 7: Mesos: The Operating System for your Datacenter

A long time ago…

Are you done with the Hadoop cluster? I need to run my analytics job.

Lol  no;  maybe  tomorrow.  

Page 8: Mesos: The Operating System for your Datacenter

2010  

Oh man! Let’s all share the cluster, AT THE SAME TIME!

Ben  Hindman  Popularized  Mesos  

Page 9: Mesos: The Operating System for your Datacenter

Good  ideas  today  mirror  good  ideas  of  yesteryear  

Page 10: Mesos: The Operating System for your Datacenter

Mesos: an Operating System

Page 11: Mesos: The Operating System for your Datacenter

Isolation

Page 12: Mesos: The Operating System for your Datacenter

Resource  Sharing  

Page 13: Mesos: The Operating System for your Datacenter

Common  Infrastructure  

• read(), write(), open()
• bind(), connect()
• apt-get, yum

•  launchTask(),  killTask(),  statusUpdate()  

•  Docker  

Page 14: Mesos: The Operating System for your Datacenter

Distributed  System*  Anatomy  

Workers  

Coordinator  

* Excluding peer-to-peer systems

Page 15: Mesos: The Operating System for your Datacenter

Static Partitioning

Coordinator  (Hadoop)   Coordinator  (Storm)  

Page 16: Mesos: The Operating System for your Datacenter

Mesos  (slaves)  

Mesos: a Level of Indirection

Coordinator  

Mesos  (master)  

Coordinator  

Page 21: Mesos: The Operating System for your Datacenter

Coordinating Execution

≈ Scheduling

Page 22: Mesos: The Operating System for your Datacenter

Mesos  (slaves)  

Coordinator  

Mesos  (master)  

s/Coordinator/Scheduler/  

Page 23: Mesos: The Operating System for your Datacenter

Mesos  (slaves)  

Scheduler  

Mesos  (master)  

s/Coordinator/Scheduler/  

Page 24: Mesos: The Operating System for your Datacenter

Mesos  (slaves)  

JobTracker  (Scheduler)  

Mesos  (master)  

Apache  Hadoop  

Page 25: Mesos: The Operating System for your Datacenter

Distributed System

≈ (Mesos) framework

Page 26: Mesos: The Operating System for your Datacenter

a  Mesos  framework  is  a  distributed  system  that  has  a  coordinator  

Page 28: Mesos: The Operating System for your Datacenter

a Mesos framework is a distributed system that has a scheduler

Page 29: Mesos: The Operating System for your Datacenter

a  Mesos  framework  is  an  app  for  your  cluster  

Page 30: Mesos: The Operating System for your Datacenter

How  can  I  use  Mesos?  

Page 31: Mesos: The Operating System for your Datacenter

Tons  of  Flexibility!  

Page 32: Mesos: The Operating System for your Datacenter

Jenkins  

• Continuous build server

•  Just  install  a  plugin!  

Page 33: Mesos: The Operating System for your Datacenter

Hadoop  

• Multi-cluster isolation
• Fast startup

• Just run the repacked Cloudera CDH 4.2.1 MR1 distribution for Mesos

Page 34: Mesos: The Operating System for your Datacenter

Marathon  

• PaaS on Mesos
• init.d for the cluster
• Docker support
• Scales at the click of a button

• Manages edge routers (HAProxy)

Page 35: Mesos: The Operating System for your Datacenter

Chronos  

• Distributed cron
• Supports job dependencies

•  REST  API  
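
The "job dependencies" bullet can be illustrated with a small sketch. This is not Chronos's API or its implementation; it only shows, under the assumption that dependencies form a directed acyclic graph, how a dependency-aware cron could derive a valid run order. The job names and the `run_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_order(dependencies):
    """Return job names in an order that respects their dependencies.

    `dependencies` maps each job name to the set of parent jobs that
    must finish before it may start (an assumed, illustrative format).
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical nightly pipeline: extract -> aggregate -> report.
jobs = {
    "report": {"aggregate"},
    "aggregate": {"extract"},
    "extract": set(),
}
order = run_order(jobs)
```

A cycle in the dependency graph makes `static_order()` raise `graphlib.CycleError`, which is the point at which a scheduler would reject the job definitions.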

Page 36: Mesos: The Operating System for your Datacenter

Aurora  

• Advanced PaaS on Mesos
• Powers Twitter
• Supports phased rollouts
• Supports complex deployments

Page 37: Mesos: The Operating System for your Datacenter

Spark  

• In-memory MapReduce, built for “Medium Data”

• Supports SQL as well as Java, Python, and Scala

• Designed for interactive analysis via REPL

Page 38: Mesos: The Operating System for your Datacenter

How  do  I  use  these?  

• Free online interactive tutorials!
– http://mesosphere.io/learn

• Covers all of the previously mentioned and many more

Page 39: Mesos: The Operating System for your Datacenter

How  can  I  build  on  Mesos?  

Page 40: Mesos: The Operating System for your Datacenter

Cluster  Manager  Status  Quo  

Cluster  Manager  

Application/Human

Specification

The specification includes as much information as possible to assist the cluster manager in scheduling and execution

Page 41: Mesos: The Operating System for your Datacenter

Cluster  Manager  Status  Quo  

Cluster  Manager  

Application/Human

Wait for task to be executed

Page 42: Mesos: The Operating System for your Datacenter

Cluster  Manager  Status  Quo  

Cluster  Manager  

Application/Human

Result  

Page 43: Mesos: The Operating System for your Datacenter

Problems with Specifications

① Hard to specify certain desires or constraints
② Hard to update specifications dynamically as tasks execute and finish/fail

Page 44: Mesos: The Operating System for your Datacenter

An Alternative Model

Mesos  

Scheduler  

request  3  CPUs  2  GB  RAM  

• A request is a purposely simplified subset of a specification

• It is just the required resources at that point in time

Page 45: Mesos: The Operating System for your Datacenter

What should you do if you can’t satisfy a request?

Page 46: Mesos: The Operating System for your Datacenter

What should you do if you can’t satisfy a request?

① Wait until you can …

Page 47: Mesos: The Operating System for your Datacenter

What should you do if you can’t satisfy a request?

① Wait until you can …

② Offer best you can immediately

Page 49: Mesos: The Operating System for your Datacenter

Mesos  Model  

Mesos  

Scheduler  

offer  hostname  4  CPUs  4  GB  RAM  

•  Resources  are  allocated  via  resource  offers  

•  A  resource  offer  represents  a  snapshot  of  available  resources  that  a  scheduler  can  use  to  run  tasks  

Page 50: Mesos: The Operating System for your Datacenter

An Analogue: non-blocking sockets

Kernel

Application

write(s, buffer, size);

Page 51: Mesos: The Operating System for your Datacenter

An Analogue: non-blocking sockets

Kernel

Application

42 of 100 bytes written!
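
The analogy on these two slides can be tried directly. The sketch below is an illustration rather than anything from the talk: it opens a local socket pair, switches the writing end to non-blocking mode, and attempts one large write that nobody is reading. The kernel accepts what fits in its buffer and reports the count, much as Mesos offers what it has right now instead of making the scheduler wait. The helper names and the 8 MiB payload size are assumptions chosen for the demonstration.

```python
import socket


def try_send(sock, data):
    """Attempt one non-blocking send; return how many bytes were accepted."""
    try:
        return sock.send(data)
    except BlockingIOError:
        # The kernel buffer is already full: nothing was accepted this time.
        return 0


def demo_partial_write(payload_size=8 * 1024 * 1024):
    """Write a large payload to an unread socket and report the outcome."""
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    try:
        payload = b"x" * payload_size
        sent = try_send(writer, payload)
        return sent, len(payload)
    finally:
        writer.close()
        reader.close()


sent, total = demo_partial_write()
print(f"{sent} of {total} bytes written")
```

The exact count depends on the platform's socket buffer size, so it is normally well below the payload size. The application then decides what to do with the remainder, which mirrors how a scheduler decides what to do with a partial offer.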

Page 52: Mesos: The Operating System for your Datacenter

offer  hostname  4  CPUs  4  GB  RAM  

offer  hostname  4  CPUs  4  GB  RAM  

offer  hostname  4  CPUs  4  GB  RAM  

Mesos  Model  

Mesos  

Scheduler  

offer  hostname  4  CPUs  4  GB  RAM  

Scheduler  uses  the  offers  to  decide  what  tasks  to  run  

Page 53: Mesos: The Operating System for your Datacenter

Mesos  Model  

Mesos  

Scheduler  

Scheduler uses the offers to decide what tasks to run

“Two-level scheduling”

task  3  CPUs  2  GB  RAM  

Page 54: Mesos: The Operating System for your Datacenter

Two-level Scheduling

• Mesos: controls resource allocations to schedulers

•  Schedulers:  make  decisions  about  what  tasks  to  run  given  allocated  resources  
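
The split between the two levels can be sketched in a few lines. This is a toy model under stated assumptions, not the Mesos scheduler API: `Offer`, `Task`, `FirstFitScheduler`, and `resource_offers` are hypothetical names, and first-fit is just one policy a framework might choose. The first level (Mesos) is represented only by the list of offers handed in; the second level (the framework scheduler) decides which pending tasks to place on each offer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Snapshot of free resources on one host (illustrative fields only)."""
    hostname: str
    cpus: float
    mem_gb: float


@dataclass(frozen=True)
class Task:
    """A unit of work and the resources it needs."""
    name: str
    cpus: float
    mem_gb: float


class FirstFitScheduler:
    """Second-level scheduler: places pending tasks onto offered resources."""

    def __init__(self, pending):
        self.pending = list(pending)

    def resource_offers(self, offers):
        """Return {hostname: [tasks to launch]} for the given offers."""
        launches = {}
        for offer in offers:
            cpus, mem = offer.cpus, offer.mem_gb
            placed = []
            # Iterate over a copy because placed tasks leave the queue.
            for task in list(self.pending):
                if task.cpus <= cpus and task.mem_gb <= mem:
                    cpus -= task.cpus
                    mem -= task.mem_gb
                    placed.append(task)
                    self.pending.remove(task)
            launches[offer.hostname] = placed
        return launches


scheduler = FirstFitScheduler(
    [Task("a", 3, 2), Task("b", 3, 2), Task("c", 1, 1)]
)
launches = scheduler.resource_offers([Offer("host1", 4, 4)])
```

With the single 4 CPU / 4 GB offer from the slides, tasks "a" and "c" fit and "b" stays pending until a later offer arrives. Resources left unused in an offer would go back to the allocator in a real system; the sketch simply drops them.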

Page 55: Mesos: The Operating System for your Datacenter

Two-level Scheduling Elsewhere

• Mesos is influenced by operating-system-supported user-space scheduling
– E.g. green threads, goroutines

• Mesos is designed less like a “cluster manager” and more like an operating system (or kernel)

Page 56: Mesos: The Operating System for your Datacenter

Language  Bindings  

Page 57: Mesos: The Operating System for your Datacenter

Should  I  build  it  on  Mesos?  

• Theme of MesosCon: it’s easy to build frameworks

• Open source and proprietary frameworks are being created all the time
– Two Sigma
– Netflix
– Twitter
– HubSpot

Page 58: Mesos: The Operating System for your Datacenter

But  should  I  really  build  it  on  Mesos?  

•  Most  users  just  use  Marathon,  Hadoop,  Spark,  and  Chronos  

• Why did we build our own?
– Exotic workload

Page 59: Mesos: The Operating System for your Datacenter

The  Plan,  redux  

What  is  Mesos?  

How  can  I  use  Mesos?  

How  can  I  build  on  Mesos?  

Page 60: Mesos: The Operating System for your Datacenter

Questions?

Thank  you