Mesos: The Operating System for your Datacenter

Post on 26-Jan-2015

2.718 views 0 download

description

Maybe you’ve heard of Mesos—that thing that you can run Hadoop on. I think it powers Twitter? Isn’t it an Apache project, or something? In this talk, we’ll learn all about Mesos—what it is, how you can leverage it to simplify your infrastructure and reduce AWS/cloud computing costs, and why you should develop your next application on top of it. This talk will give you the tools you need to understand whether Mesos is the right fit for your infrastructure, and several starting points for learning more about Mesos.

Transcript of Mesos: The Operating System for your Datacenter

Mesos:  The  Datacenter  Opera1ng  System  

David  Greenberg  Two  Sigma  

Who  am  I?  

•  Architected  project  to  build  a  massive  Mesos  cluster  

•  Building  custom  framework  and  leveraging  open  source  

The  Plan  

What  is  Mesos?  

How  can  I  use  Mesos?  

How  can  I  build  on  Mesos?  

What  is  Mesos?  

A  long  1me  ago…  

Are  you  done  with  the  

machine?  I  need  to  load  my  cards.  

Lol  no;  maybe  tomorrow.  

1957  

Oh  man!  Let’s  all  share  the  

computer,  AT  THE  SAME  TIME!  

John  McCarthy  Popularized  Timesharing  

A  long  1me  ago…  

Are  you  done  with  the  Hadoop  cluster?  I  need  to  run  my  analy1cs  job.  

Lol  no;  maybe  tomorrow.  

2010  

Oh  man!  Let’s  all  share  the  cluster,  

AT  THE  SAME  TIME!  

Ben  Hindman  Popularized  Mesos  

Good  ideas  today  mirror  good  ideas  of  yesteryear  

Mesos:  an  Opera1ng  System  

Isola1on  

Resource  Sharing  

Common  Infrastructure  

•  read(),  write(),  open()  •  bind(),  connect()  •  apt-­‐get,  yum  

•  launchTask(),  killTask(),  statusUpdate()  

•  Docker  

Distributed  System*  Anatomy  

Workers  

Coordinator  

*  Excluding  peer-­‐to-­‐peer  systems  

Sta1c  Par11oning  

Coordinator  (Hadoop)   Coordinator  (Storm)  

Mesos  (slaves)  

Mesos:  a  Level  of  Indirec1on  

Coordinator  

Mesos  (master)  

Coordinator  

Mesos  (slaves)  

Mesos:  a  Level  of  Indirec1on  

Coordinator  

Mesos  (master)  

Coordinator  

Mesos  (slaves)  

Mesos:  a  Level  of  Indirec1on  

Coordinator  

Mesos  (master)  

Coordinator  

Mesos  (slaves)  

Mesos:  a  Level  of  Indirec1on  

Coordinator  

Mesos  (master)  

Coordinator  

Mesos  (slaves)  

Mesos:  a  Level  of  Indirec1on  

Coordinator  

Mesos  (master)  

Coordinator  

 Coordina1ng  Execu1on  

≈  Scheduling  

 

Mesos  (slaves)  

Coordinator  

Mesos  (master)  

s/Coordinator/Scheduler/  

Mesos  (slaves)  

Scheduler  

Mesos  (master)  

s/Coordinator/Scheduler/  

Mesos  (slaves)  

JobTracker  (Scheduler)  

Mesos  (master)  

Apache  Hadoop  

 Distributed  System  

≈  (Mesos)  framework  

a  Mesos  framework  is  a  distributed  system  that  has  a  coordinator  

a  Mesos  framework  is  a  distributed  system  that  has  a  coordinator  

a  Mesos  framework  is  a  distributed  system  that  has  a  scheduler  a  

a  Mesos  framework  is  an  app  for  your  cluster  

How  can  I  use  Mesos?  

Tons  of  Flexibility!  

Jenkins  

•  Con1nuous  build  server  

•  Just  install  a  plugin!  

Hadoop  

•  Mul1-­‐cluster  isola1on  •  Fast  startup  

•  Just  run  the  repacked  Cloudera  CDH  4.2.1  MR1  distribu1on  for  Mesos  

Marathon  

•  PaaS  on  Mesos  •  init.d  for  the  cluster  •  Docker  support  •  Scales  at  the  click  of  a  budon  

•  Manages  edge  routers  -­‐  HAProxy  

Chronos  

•  Distributed  cron  •  Supports  job  dependencies  

•  REST  API  

Aurora  

•  Advanced  PaaS  on  Mesos  •  Powers  Twider  •  Supports  phased  rollouts  •  Supports  complex  deployments  

Spark  

•  In  memory  Map  Reduce,  built  for  “Medium  Data”  

•  Supports  SQL  as  well  as  Java,  Python,  and  Scala  

•  Designed  for  interac1ve  analysis  via  REPL  

How  do  I  use  these?  

•  Free  online  interac1ve  tutorials!  – hdp://mesosphere.io/learn  

•  Covers  all  of  the  previously  men1oned  and  many  more  

How  can  I  build  on  Mesos?  

Cluster  Manager  Status  Quo  

Cluster  Manager  

Applica?on/Human  

Specifica1on  

The  specifica1on  includes  as  much  informa1on  as  possible  to  assist  the  cluster  manager  in  scheduling  and  execu1on  

Cluster  Manager  Status  Quo  

Cluster  Manager  

Applica?on/Human  Wait  for  task  to  be  executed  

Cluster  Manager  Status  Quo  

Cluster  Manager  

Applica?on/Human  

Result  

Problems  with  Specifica1ons  ① Hard  to  specify  certain  desires  or  constraints  ② Hard  to  update  specifica1ons  dynamically  as  

tasks  execute  and  finish/fail  

An  Alterna1ve  Model  

Mesos  

Scheduler  

request  3  CPUs  2  GB  RAM  

•  A  request  is  purposely  simplified  subset  of  a  specifica1on  

•  It  is  just  the  required  resources  at  that  point  in  )me  

What  should  you  do  if  you  can’t  sa1sfy  a  request?  

What  should  you  do  if  you  can’t  sa1sfy  a  request?  

①      Wait  un?l  you  can  …  

What  should  you  do  if  you  can’t  sa1sfy  a  request?  

①      Wait  un?l  you  can  …  

②      Offer  best  you  can  immediately  

What  should  you  do  if  you  can’t  sa1sfy  a  request?  

①      Wait  un?l  you  can  …  

②      Offer  best  you  can  immediately  

Mesos  Model  

Mesos  

Scheduler  

offer  hostname  4  CPUs  4  GB  RAM  

•  Resources  are  allocated  via  resource  offers  

•  A  resource  offer  represents  a  snapshot  of  available  resources  that  a  scheduler  can  use  to  run  tasks  

An  Analogue:  non-­‐blocking  sockets  

Kernel  

Applica?on  

write(s, buffer, size);!

An  Analogue:  non-­‐blocking  sockets  

Kernel  

Applica?on  

42 of 100 bytes written!!

offer  hostname  4  CPUs  4  GB  RAM  

offer  hostname  4  CPUs  4  GB  RAM  

offer  hostname  4  CPUs  4  GB  RAM  

Mesos  Model  

Mesos  

Scheduler  

offer  hostname  4  CPUs  4  GB  RAM  

Scheduler  uses  the  offers  to  decide  what  tasks  to  run  

Mesos  Model  

Mesos  

Scheduler  

Scheduler  uses  the  offers  to  decide  what  tasks  to  run    “Two-­‐level  scheduling”  

task  3  CPUs  2  GB  RAM  

Two-­‐level  Scheduling  

•  Mesos:  controls  resource  alloca+ons  to  schedulers  

•  Schedulers:  make  decisions  about  what  tasks  to  run  given  allocated  resources  

Two-­‐level  Scheduling  Elsewhere  

•  Mesos  influenced  by  opera1ng  system  supported  user-­‐space  scheduling  – E.g.  green  threads,  gorou1nes  

•  Mesos  is  designed  less  like  a  “cluster  manager”  and  more  like  an  opera1ng  system  (or  kernel)  

Language  Bindings  

Should  I  build  it  on  Mesos?  

•  Theme  of  MesosCon:  it’s  easy  to  build  frameworks  

•  Open  source  and  proprietary  frameworks  are  being  created  all  the  1me  – Two  Sigma  – Neplix  – Twider  – Hubspot  

But  should  I  really  build  it  on  Mesos?  

•  Most  users  just  use  Marathon,  Hadoop,  Spark,  and  Chronos  

•  Why  did  we  build  our  own?  – Exo1c  workload  

The  Plan,  redux  

What  is  Mesos?  

How  can  I  use  Mesos?  

How  can  I  build  on  Mesos?  

Ques1ons?  

Thank  you