Apache Mesos - Meetup · files.meetup.com/15980712/mesos-london-09-25-14.pdf

Benjamin Hindman – @benh Apache Mesos (at Twitter) mesos.apache.org @ApacheMesos


Page 1:

Benjamin  Hindman    –  @benh  

Apache  Mesos  (at  Twitter)  mesos.apache.org  

@ApacheMesos  

 

Page 2:

agenda  ① Cluster  Management  at  Twitter  

② Mesos  

③ Mesos  at  Twitter  

④ VMs,  IaaS,  and  Mesos  

⑤ …  

 

Page 3:
Page 4:

Monorail  

Page 5:

cluster management

(configuration/package  management)   (deployment)  

circa  2010  

Page 6:
Page 7:

Woodstar   Monorail   Macaw   TweetyPie   memcached  

Page 8:

challenges

Page 9:

challenges ①  failures  

Page 10:

failures  

Page 11:

Woodstar   Monorail   Macaw   TweetyPie   memcached  

Page 12:

Woodstar   Monorail   Macaw   TweetyPie   memcached  

Page 13:

challenges ②   maintenance  

(aka  “planned  failures”)    

Page 14:

Woodstar   Monorail   Macaw   TweetyPie   memcached  

Page 15:

Woodstar   Monorail   Macaw   TweetyPie   memcached  

Page 16:

planning  for  failure/maintenance  

Page 17:

challenges ③   utilization  

Page 18:

Rails  

Hadoop  

memcached  

utilization  

Page 19:

utilization  

Rails  

Hadoop  

memcached  

Page 20:

utilization  

Rails  

Hadoop  

memcached

buy fewer machines or run more applications!

Page 21:

planning for utilization

intra-machine resource sharing: share a single machine's resources between multiple applications (multi-tenancy)

intra-datacenter resource sharing: share multiple machines' resources between multiple applications
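
A rough way to see why sharing helps: the sketch below compares how many machines a statically partitioned cluster needs with how many a shared one needs. Only the application names come from the slides; the hourly demand figures and the helper functions are invented for illustration.

```python
# Hypothetical hourly demand (in machines) for three applications.
# The numbers are invented; only the application names come from the slides.
demand = {
    "Rails":     [8, 8, 2, 2],
    "Hadoop":    [1, 1, 9, 9],
    "memcached": [4, 4, 4, 4],
}

def machines_needed_static(demand):
    """Static partitioning: each application owns enough machines for its own peak."""
    return sum(max(series) for series in demand.values())

def machines_needed_shared(demand):
    """Shared cluster: capacity only has to cover the peak of the combined demand."""
    combined = [sum(hour) for hour in zip(*demand.values())]
    return max(combined)

def average_utilization(demand, capacity):
    """Fraction of the capacity in use, averaged over all hours."""
    combined = [sum(hour) for hour in zip(*demand.values())]
    return sum(combined) / (capacity * len(combined))

static = machines_needed_static(demand)
shared = machines_needed_shared(demand)
print("machines needed:", static, "static vs", shared, "shared")
print("average utilization:",
      round(average_utilization(demand, static), 2), "static vs",
      round(average_utilization(demand, shared), 2), "shared")
```

With these made-up numbers the shared cluster needs fewer machines because the Rails and Hadoop peaks do not coincide; real workloads may overlap more or less than this.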

Page 22:
Page 23:

agenda  ① Cluster  Management  at  Twitter  

② Mesos  

③ Mesos  at  Twitter  

④ VMs,  IaaS,  and  Mesos  

⑤ …  

 

Page 24:

origins

Mesos started as a research project at Berkeley in early 2009 by Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy Katz, Scott Shenker, and Ion Stoica

Page 25:

our  motivation  

increase performance and utilization of clusters

Page 26:

our  intuition  

① static  partitioning  considered  harmful  

Page 27:

our  intuition  

② build  new  applications  

Page 28:

“Map/Reduce  is  a  big  hammer,  but  not  everything  is  a  nail!”  

Page 29:

workers  

distributed  system*  anatomy  

coordinator  

* overlooking peer-to-peer distributed systems

Page 30:

static  partitioning  

coordinator   coordinator  

Page 31:

Mesos  (slaves)  

Mesos:  level  of  indirection  

coordinator  

Mesos  (master)  

coordinator  

Page 36:

Mesos: a level of indirection

① enable running multiple distributed systems on the same cluster of machines and dynamically share the resources more efficiently!

static partitioning considered harmful

Page 37:

Mesos: a level of indirection

② provide common functionality every new distributed system re-implements, like failure detection, task distribution, task starting, task monitoring, task killing, task cleanup!

build new applications

Page 38:

Mesos  ≈  cluster  manager  

Page 39:

cluster  management  

• PBS (Portable Batch System)
• TORQUE
• SGE (Sun Grid Engine)

Page 40:

cluster  management  

• PBS (Portable Batch System)
• TORQUE
• SGE (Sun Grid Engine)

batch  computation!  

Page 41:

Mesos  is  an  evolution  of  the  cluster  manager,  designed  to  run  general  purpose  distributed  systems  (i.e.,  not  just  focused  on  batch)  

Page 42:

Mesos  

batch   service   storage   …  

…  

streaming  

support  many  different  types  of  distributed  systems  

Page 43:

Mesos  

batch   service   storage   …  streaming  

(1)  coordinate  for  resources  (aka  resource  allocation)  

Page 44:

Mesos  

batch   service   storage   …  streaming  

(2)  launch  tasks  

Page 45:

Mesos  

batch   service   storage   …  streaming  

(3)  launch  tasks  

Page 46:

Mesos  

batch   service   storage   …  streaming  

Page 47:

Mesos  

batch   service   storage   …  streaming  

Page 48:

Mesos  

batch   service   storage   …  streaming  

(4)  task  termination  

Page 49:

Mesos  

batch   service   storage   …  streaming  

(5)  task  status  update  

Page 50:

Mesos  

batch   service   storage   …  streaming  

(1)  coordinate  for  resources  (aka  resource  allocation)  
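
The interaction sketched across the preceding slides (resource allocation, task launch, task termination, status updates) can be mimicked with a toy model. This is not the Mesos API: `ToyMesos` and its methods are hypothetical, and only the state names are borrowed from Mesos task states.

```python
from dataclasses import dataclass, field

@dataclass
class ToyMesos:
    """A toy stand-in for the Mesos master and slaves; not the real Mesos API."""
    free_cpus: float
    tasks: dict = field(default_factory=dict)     # task_id -> CPUs held by the task
    updates: list = field(default_factory=list)   # (task_id, state) sent to the scheduler

    def allocate(self, cpus):
        """Resource allocation: grant the CPUs only if they are currently free."""
        if cpus > self.free_cpus:
            return False
        self.free_cpus -= cpus
        return True

    def launch(self, task_id, cpus):
        """Launch a task on allocated resources and report it as running."""
        if not self.allocate(cpus):
            return False
        self.tasks[task_id] = cpus
        self.updates.append((task_id, "TASK_RUNNING"))
        return True

    def terminate(self, task_id, state="TASK_FINISHED"):
        """Termination frees the task's resources; a status update tells the scheduler."""
        self.free_cpus += self.tasks.pop(task_id)
        self.updates.append((task_id, state))

cluster = ToyMesos(free_cpus=4)
cluster.launch("web-1", cpus=3)
rejected = not cluster.launch("batch-1", cpus=2)   # only 1 CPU is left
cluster.terminate("web-1", state="TASK_KILLED")
accepted = cluster.launch("batch-1", cpus=2)       # the freed CPUs are available again
print(cluster.updates)
```

The point of the sketch is the last step: once a task terminates, its resources return to the pool and the cycle starts again with allocation.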

Page 51:

agenda  ① Cluster  Management  at  Twitter  

② Mesos  

③ Mesos  at  Twitter  

④ VMs,  IaaS,  and  Mesos  

⑤ …  

 

Page 52:
Page 53:
Page 54:

stateless  services!  

Page 55:

Mesos  

batch   service   storage   …  

…  

streaming  

Page 56:

Apache  Aurora  (incubating)  

Apache  Aurora  (incubating),  a  scheduler  for  running  stateless  services  written  in  any  language  (but  primarily  used  at  Twitter  for  JVM  services)  

Page 57:

developer workflow

configuration/package management

(1) bundle services as jar, tar/gzip

(2) upload to HDFS

deployment

(1) describe service using Python-based DSL

(2) submit service to Aurora using CLI

service.aurora

Page 58:

developer workflow

configuration/package management

(1) bundle services as jar, tar/gzip

(2) upload to HDFS

deployment

(1) describe service using JSON

(2) submit service to Marathon via REST

service.json
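
For a feel of what such a JSON description might contain, here is a small sketch that builds one in Python. The field names (`id`, `cmd`, `cpus`, `mem`, `instances`, `uris`) follow Marathon's app definition as commonly documented, but check the documentation for the Marathon version in use; the service name, command, and HDFS path are made up.

```python
import json

# Hypothetical service description: the id, command, and HDFS path are invented.
service = {
    "id": "hello-service",
    "cmd": "java -jar hello-service.jar",
    "cpus": 0.5,        # CPUs per instance
    "mem": 256,         # MB of RAM per instance
    "instances": 3,     # how many copies to keep running
    "uris": ["hdfs://namenode.example.com/packages/hello-service.jar"],
}

# This string is what would be saved as service.json and sent to Marathon over REST.
service_json = json.dumps(service, indent=2)
print(service_json)
```

Submitting it would then be an HTTP POST of this document to Marathon's apps endpoint (`/v2/apps` in Marathon's REST API), which is left out here so the sketch runs without a cluster.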

Page 59:

service  discovery  

Apache  ZooKeeper  

using  Apache  ZooKeeper  and  server  sets  (github.com/twitter/commons)  

Page 60:

service  discovery  

Apache  ZooKeeper  

using  Apache  ZooKeeper  and  server  sets  (github.com/twitter/commons)  

(1)  service  gets  launched  on  machine  

Page 61:

service discovery

Apache ZooKeeper

using Apache ZooKeeper and server sets (github.com/twitter/commons)

(1) service gets launched on machine

(2) service gets registered in a server set in ZooKeeper

Page 62:

service discovery

Apache ZooKeeper

using Apache ZooKeeper and server sets (github.com/twitter/commons)

(1) service gets launched on machine

(2) service gets registered in a server set in ZooKeeper

(3) other services use ZooKeeper to find services they need

Page 63:

service discovery

Apache ZooKeeper

using Apache ZooKeeper and server sets (github.com/twitter/commons)

(1) service gets launched on machine

(2) service gets registered in a server set in ZooKeeper

(3) other services use ZooKeeper to find services they need

(4) services connect directly with one another
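
The four steps on this slide can be sketched with an in-memory registry. `ToyServerSet` is a hypothetical stand-in, not the twitter/commons server set API and not a ZooKeeper client; the host names and ports are invented.

```python
from collections import defaultdict

class ToyServerSet:
    """In-memory stand-in for a ZooKeeper-backed server set (not the twitter/commons API)."""

    def __init__(self):
        self._members = defaultdict(set)   # service name -> {(host, port), ...}

    def join(self, service, host, port):
        """Step (2): a freshly launched instance registers its endpoint."""
        self._members[service].add((host, port))

    def leave(self, service, host, port):
        """An instance goes away; with ZooKeeper, ephemeral nodes would handle this."""
        self._members[service].discard((host, port))

    def lookup(self, service):
        """Step (3): clients ask the registry where a service currently lives."""
        return sorted(self._members[service])

registry = ToyServerSet()
# Step (1): two instances were launched on whichever machines had resources.
registry.join("tweetypie", "host-17", 31002)
registry.join("tweetypie", "host-04", 31950)
# Step (3): a client resolves the service; step (4) would be a direct connection.
endpoints = registry.lookup("tweetypie")
print(endpoints)
registry.leave("tweetypie", "host-17", 31002)   # the instance was lost or rescheduled
print(registry.lookup("tweetypie"))
```

Because tasks can land on any machine and move after a failure, clients look endpoints up at run time instead of relying on fixed host names.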

Page 64:

service discovery alternative

(1) service gets launched on machine

(2) update HAProxy with new service location

(3) other services send traffic through HAProxy

ZooKeeper/server sets require injecting code into your clients!

Page 65:

where  are  we  today?  

ops  developers  

deploys  decoupled  from  ops  (many  deploys  per  day,  per  service)  

maintenance  consists  of  “draining”  hosts,  getting  tasks  rescheduled,  then  pulling  the  cord  
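
A minimal sketch of what "draining" a host could look like, assuming a toy placement table in which each host has a fixed number of task slots. The `drain` function and the data layout are hypothetical and do not reflect how Mesos or Aurora implement maintenance.

```python
def drain(host, placements, capacity):
    """Move every task off `host`, placing each on the other host with the most free slots.

    placements: dict mapping host -> list of task names
    capacity:   dict mapping host -> max number of tasks (a crude stand-in for resources)
    Returns the tasks that could not be rescheduled; they stay on `host`.
    """
    stranded = []
    for task in list(placements[host]):
        candidates = [h for h in placements
                      if h != host and len(placements[h]) < capacity[h]]
        if not candidates:
            stranded.append(task)
            continue
        target = max(candidates, key=lambda h: capacity[h] - len(placements[h]))
        placements[host].remove(task)
        placements[target].append(task)
    return stranded

placements = {"host-a": ["web-1", "web-2"], "host-b": ["web-3"], "host-c": []}
capacity = {"host-a": 2, "host-b": 2, "host-c": 2}
stranded = drain("host-a", placements, capacity)
print(placements, stranded)
```

Only when nothing is stranded is the host empty and safe to take offline.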

Page 66:

challenges revisited ①  failures  

② maintenance  

③  utilization  

Page 67:

challenges revisited ①  failures  

② maintenance  

③  utilization  

Page 68:

Mesos  

batch   service   storage   …  streaming  

Page 69:

Mesos  

batch   service   storage   …  streaming  

Page 70:

Mesos  

batch   service   storage   …  streaming  

Page 71:

Mesos  

batch   service   storage   …  streaming  

(5)  task  status  update  

Page 72:

challenges revisited ①  failures  

② maintenance  

③  utilization  

Page 73:

Mesos  

batch   service   storage   …  streaming  

(1) when resources become idle, they can be scheduled and reused by other schedulers

Page 78:

Mesos  

batch   service   storage   …  streaming  

(1) when resources become idle, they can be scheduled and reused by other schedulers

(2) multi-tenancy on individual machines

Page 80:

multi-tenancy

task

task

containers

task

Page 81:

containerization

started leveraging containerization technology in ~2010

[timeline, in slide order: 2010, LXC, 2012, cgroups, 2013, namespaces (preliminary), 2014]

Page 82:

agenda  ① Cluster  Management  at  Twitter  

② Mesos  

③ Mesos  at  Twitter  

④ VMs,  IaaS,  and  Mesos  

⑤ …  

 

Page 83:

wait … don’t virtual machines solve my cluster management challenges?

Page 84:

wait … don’t virtual machines solve my cluster management challenges?

No. VMs are neither sufficient nor necessary!

Page 85:
Page 86:
Page 87:

challenges revisited ①  failures  

② maintenance  

③  utilization  

Page 88:

challenges revisited ①  failures  

② maintenance  

③  utilization  

with public or private IaaS, failures still occur (on EC2 you have availability zones instead of racks, and regions instead of datacenters)

Page 89:

challenges revisited ①  failures  

② maintenance  

③ utilization

provider wins with public IaaS, better resource sharing with private IaaS, but a static partition of VMs is still a static partition!

Page 90:

physical machines virtual machines

aggregation  not  virtualization  

Page 91:

physical machines “datacenter computer”

aggregation  not  virtualization  

Page 92:

coordinator  

Mesos  (master)  

Mesos:  level  of  abstraction  

resources  

machines  

Page 93:

Mesos: level of abstraction

Mesos: build and run distributed systems using resources

Page 94:

Mesos: level of abstraction

IaaS: provision and manage machines

Mesos: build and run distributed systems using resources

Page 95:

Mesos: level of abstraction

PaaS: deploy and manage applications/services

IaaS: provision and manage machines

Mesos: build and run distributed systems using resources

Page 96:

PaaS  on  Mesos  

PaaS  

Mesos  

build and run a PaaS on top of Mesos: Apache Aurora and Marathon

Page 97:

Mesos  on  IaaS  

IaaS  

Mesos  

use  OpenStack  or  EC2  to  run  Mesos  

Page 98:

Mesos  on  IaaS/bare  metal  

IaaS  

Mesos  

hardware

use OpenStack or EC2 or physical machines to run Mesos

Page 99:

agenda  ① Cluster  Management  at  Twitter  

② Mesos  

③ Mesos  at  Twitter  

④ VMs,  IaaS,  and  Mesos  

⑤ Mesos  Deeper  Dive  

Page 100:

cluster  manager  status  quo  before  Mesos  was  batch  scheduling    

Page 101:

cluster  manager  status  quo  

cluster  manager  

application/human  

specification  

the  specification  includes  as  much  information  as  possible  to  assist  the  cluster  manager  in  scheduling  and  execution  

Page 102:

cluster  manager  status  quo  

cluster  manager  

application/human

wait for task to be executed

Page 103:

cluster  manager  status  quo  

cluster  manager  

application/human  

result  

Page 104:

problems  with  specifications  

①  hard  to  specify  certain  desires  or  constraints  

②  hard  to  update  specifications  dynamically  as  tasks  execute  and  finish/fail  

Page 105:

MapReduce  specification  

① what would it look like?

② who submits it?

Page 106:

an  alternative  model  

masters  

scheduler  

request: 3 CPUs, 2 GB RAM

a request is a purposely simplified subset of a specification, mainly including the required resources at that point in time

Page 108:

what  should  you  do  if  you  can’t  satisfy  a  request?  

Page 109:

what  should  you  do  if  you  can’t  satisfy  a  request?  

①      wait  until  you  can  …  

Page 110:

what  should  you  do  if  you  can’t  satisfy  a  request?  

①      wait  until  you  can  …  

②      offer  best  you  can  immediately  

Page 112:

Mesos  model  

masters  

scheduler  

offer  hostname  4  CPUs  4  GB  RAM  

resources  are  allocated  via  resource  offers    a  resource  offer  represents  a  snapshot  of  available  resources  that  a  scheduler  can  use  to  run  tasks  

Page 113:

an analogue: non-blocking sockets

[Diagram: the application calls into the kernel.]

write(s, buffer, size);

Page 114:

an analogue: non-blocking sockets

[Diagram: the kernel answers the application.]

42 of 100 bytes written!
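The analogy can be tried directly. The following is a minimal Python sketch, not something from the talk: it assumes a local `socket.socketpair()` as a stand-in for a real connection, and the helper name `try_write` and the 16 MiB payload size are illustrative choices. A non-blocking write returns however many bytes the kernel could accept right away, just as an offer hands over whatever resources are available right away.

```python
import socket


def try_write(sock: socket.socket, data: bytes) -> int:
    """Attempt a non-blocking write; return how many bytes the kernel accepted."""
    try:
        return sock.send(data)
    except BlockingIOError:
        # The kernel buffer is completely full, so nothing was accepted.
        return 0


# A connected pair of sockets stands in for a real network connection.
writer, reader = socket.socketpair()
writer.setblocking(False)

# Deliberately larger than a typical kernel socket buffer.
payload = b"x" * (16 * 1024 * 1024)
written = try_write(writer, payload)
print(f"{written} of {len(payload)} bytes written")

# The application keeps the unwritten remainder and retries later.
remaining = payload[written:]

writer.close()
reader.close()
```

The caller is never blocked waiting for the full request to be satisfiable; it gets a partial answer immediately and decides for itself what to do next.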

Pages 115-119:

Mesos model

[Diagram: the masters send a scheduler several offers, each listing a hostname, 4 CPUs, and 4 GB RAM.]

scheduler uses the offers to decide what tasks to run

[Diagram: the scheduler answers the masters with a task that needs 3 CPUs and 2 GB RAM.]

“two-level scheduling”

Page 120:

“two-level scheduling”

Mesos: controls resource allocations to schedulers

schedulers: make decisions about what tasks to run given allocated resources
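The split of responsibilities can be sketched in a few lines of Python. This is an illustration only and not the Mesos API: the names `Offer`, `Task`, `make_offers`, and `choose_tasks` are hypothetical, and the first-fit placement policy is an assumption made for the example. One function plays the Mesos side (deciding what to offer), the other plays the scheduler side (deciding what to run).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    hostname: str
    cpus: float
    mem_gb: float


@dataclass(frozen=True)
class Task:
    name: str
    cpus: float
    mem_gb: float


def make_offers(free: dict[str, tuple[float, float]]) -> list[Offer]:
    """Level one (the Mesos side): turn free resources into offers."""
    return [Offer(host, cpus, mem) for host, (cpus, mem) in free.items()]


def choose_tasks(offers: list[Offer], pending: list[Task]) -> dict[str, list[Task]]:
    """Level two (the scheduler side): first-fit pending tasks into the offers."""
    placements: dict[str, list[Task]] = {o.hostname: [] for o in offers}
    left = {o.hostname: (o.cpus, o.mem_gb) for o in offers}
    for task in pending:
        for host, (cpus, mem) in left.items():
            if task.cpus <= cpus and task.mem_gb <= mem:
                placements[host].append(task)
                # Shrink what remains of this offer and move to the next task.
                left[host] = (cpus - task.cpus, mem - task.mem_gb)
                break
    return placements


offers = make_offers({"host1": (4, 4), "host2": (4, 4)})
placements = choose_tasks(offers, [Task("web", 3, 2), Task("cache", 3, 2)])
for host, tasks in placements.items():
    print(host, [t.name for t in tasks])
```

Neither level needs to understand the other's policy: the offer side never sees the task queue, and the scheduler side never sees resources it was not offered.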

Page 121:

two-level scheduling

Mesos was influenced by operating-system-supported user-space scheduling and the ideas behind scheduler activations

Mesos is designed less like a “cluster manager” and more like an operating system (or kernel)

Page 122:

design comparison: Google’s Omega

Page 123:

Omega

[Diagram: the database sends a scheduler a snapshot.]

scheduler receives snapshot of all available resources

Page 124:

Omega

[Diagram: the scheduler sends the database a transaction.]

scheduler submits transaction to “acquire” resources

Page 125:

proposal: Mesos is isomorphic to Omega if it makes offers for everything available

Page 126:

Omega and Mesos

[Diagram: Omega's snapshot (database to scheduler) shown next to a Mesos offer (masters to scheduler): hostname, 4 CPUs, 4 GB RAM.]

Page 127:

Omega and Mesos

[Diagram: Omega's transaction (scheduler to database) shown next to a Mesos task (scheduler to masters): 3 CPUs, 2 GB RAM.]

Page 128:

offers represent the current snapshot of available resources a framework can use

Page 129:

concurrency control

optimistic: all offers overlap with one another, thus causing frameworks to “compete” first-come-first-served

Page 130:

concurrency control

pessimistic: offers made to different frameworks are disjoint
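The two styles can be contrasted with a small Python sketch. Everything in it is hypothetical and simplified for illustration: whole hosts are the unit of allocation, disjoint offers are handed out round-robin, and the function names `pessimistic_offers`, `optimistic_offers`, and `commit` are not taken from Mesos or Omega.

```python
def pessimistic_offers(hosts, frameworks):
    """Disjoint offers: each host is offered to exactly one framework (round-robin)."""
    offers = {fw: [] for fw in frameworks}
    for i, host in enumerate(hosts):
        offers[frameworks[i % len(frameworks)]].append(host)
    return offers


def optimistic_offers(hosts, frameworks):
    """Overlapping offers: every framework is shown every host."""
    return {fw: list(hosts) for fw in frameworks}


def commit(claims):
    """Resolve claims in arrival order; the first claim on a host wins."""
    owner, rejected = {}, []
    for framework, host in claims:
        if host in owner:
            rejected.append((framework, host))  # lost the race, must retry elsewhere
        else:
            owner[host] = framework
    return owner, rejected


hosts, frameworks = ["h1", "h2"], ["A", "B"]
disjoint = pessimistic_offers(hosts, frameworks)
overlapping = optimistic_offers(hosts, frameworks)

# With overlapping offers, A and B can both try to claim h1.
owner, rejected = commit([("A", "h1"), ("B", "h1"), ("B", "h2")])
print(disjoint, overlapping, owner, rejected)
```

With disjoint offers a conflict cannot happen, at the cost of each framework seeing only part of the cluster. With overlapping offers every framework sees everything, and conflicts are detected and rejected at commit time.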

Page 131:

Omega: requests are complementary, but not necessary

Page 132:

agenda

① Cluster Management at Twitter

② Mesos

③ Mesos at Twitter

④ VMs, IaaS, and Mesos

⑤ Mesos Deeper Dive

⑥ Mesos Ecosystem

Page 133:

built on Mesos:

[Timeline: 2009, 2010, 2013, 2014]

Page 134:

ported to Mesos:

[Timeline: 2011, 2012, 2013, 2014]

Page 135:

some of our adopters …

[Timeline: 2010, 2013, 2014 …]

Page 136:

releases

0.18.0 (2013-04-01)

0.18.1 (2014-04-29)

0.18.2 (2014-05-13)

0.19.0 (2014-06-04)

0.19.1 (2014-07-14)

0.20.0 (2014-08-21)

Page 137:

contributors

38 contributors in the past 6 months

23 committers (also PMC members)

(Storm: 13; Kafka: 15; ZooKeeper: 16; Cassandra: 25; Thrift: 26; Spark: 32; Hadoop: 82)

3 new committers in past 3 months!

Page 138:

what have they been up to?

Page 139:

① containerization and isolation

Pages 140-141:

containerization

started leveraging containerization technology in ~2010

2010: LXC

2012: cgroups

2013: namespaces (network)

2014

Page 142:

Docker (0.20.0)

first-class Docker support, i.e., use Docker images to run containers with Docker primitives like volumes, entrypoints, etc. (0.20.0)

Learn More:

https://github.com/apache/mesos/blob/master/docs/docker-containerizer.md

Page 145:

② container statistics

Page 146:

monitor all the things

CPU, memory, network (0.20.0)

mesos-slave

GET /monitor/statistics.json HTTP/1.1

{ "source": "sample_executor",
  "statistics": { "cpus_system_time_secs": 154.42,
                  "cpus_user_time_secs": 258.74,
                  "mem_file_bytes": 30613504,
                  "mem_rss_bytes": 140341248,
                  "net_rx_bytes": 2402099,
                  "net_rx_dropped": 0,
                  "net_tx_bytes": 1507798,
                  "net_tx_dropped": 0 } }
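As a rough illustration of how such a payload might be consumed, here is a small Python sketch. It assumes the single-object shape shown on the slide and embeds that sample as a string instead of querying a live mesos-slave; the function name `summarize` and the derived field names are made up for the example.

```python
import json

# Sample payload copied from the slide (not fetched from a live mesos-slave).
SAMPLE = """
{ "source": "sample_executor",
  "statistics": { "cpus_system_time_secs": 154.42,
                  "cpus_user_time_secs": 258.74,
                  "mem_file_bytes": 30613504,
                  "mem_rss_bytes": 140341248,
                  "net_rx_bytes": 2402099,
                  "net_rx_dropped": 0,
                  "net_tx_bytes": 1507798,
                  "net_tx_dropped": 0 } }
"""


def summarize(payload: str) -> dict:
    """Collapse the raw counters into a few per-container totals."""
    stats = json.loads(payload)["statistics"]
    return {
        "cpu_secs": stats["cpus_system_time_secs"] + stats["cpus_user_time_secs"],
        "rss_mib": stats["mem_rss_bytes"] / 2**20,
        "net_bytes": stats["net_rx_bytes"] + stats["net_tx_bytes"],
        "net_dropped": stats["net_rx_dropped"] + stats["net_tx_dropped"],
    }


summary = summarize(SAMPLE)
print(summary)
```

The CPU and network fields are cumulative counters, so a monitoring system would typically take two samples and divide the difference by the elapsed time to get a rate.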

Page 147:

monitor all the things

CPU, memory, network (0.20.0)

Learn More:

https://github.com/apache/mesos/blob/master/docs/network-monitoring.md

Page 148:

③ authentication and authorization

Page 149:

authentication

credentials: principals and secrets

protocol built using SASL, designed to swap in/out other mechanisms, e.g., Kerberos

Page 150:

authorization

introduced access control lists (ACLs)

"run_tasks": [
  { "principals": { "type": "NONE" },
    "users": { "values": ["root"] } }]

action: "run_tasks"; subjects: "principals"; objects: "users"

“action performed by subjects on objects”
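To make the "action performed by subjects on objects" reading concrete, here is a deliberately simplified Python sketch. It is not the Mesos authorizer: the helper names `covers` and `authorized`, the first-matching-rule policy, and the `permissive` fallback parameter are assumptions for illustration, as is the `"ANY"` entity type alongside the `"NONE"` type from the slide.

```python
def covers(entity: dict, value: str) -> bool:
    """True if an ACL entity ({"type": "ANY" | "NONE"} or {"values": [...]}) includes value."""
    kind = entity.get("type")
    if kind == "ANY":
        return True
    if kind == "NONE":
        return False
    return value in entity.get("values", [])


def authorized(acls: dict, action: str, subject: str, obj: str,
               permissive: bool = True) -> bool:
    """Simplified check: the first rule that mentions the object decides."""
    for acl in acls.get(action, []):
        if covers(acl["users"], obj):
            return covers(acl["principals"], subject)
    # No rule mentions this object: fall back to the default policy.
    return permissive


# The ACL from the slide: no principal may run tasks as the "root" user.
ACLS = {
    "run_tasks": [
        {"principals": {"type": "NONE"}, "users": {"values": ["root"]}},
    ]
}

as_root = authorized(ACLS, "run_tasks", "framework-a", "root")
as_nobody = authorized(ACLS, "run_tasks", "framework-a", "nobody")
print(as_root, as_nobody)
```

Under these assumed rules the slide's ACL denies every principal the right to run tasks as `root`, while requests for other users fall through to the default.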

Page 151:

authorization

Learn More:

https://github.com/apache/mesos/blob/master/docs/authorization.md

Page 152:

④ fault tolerance and high availability

Pages 153-157:

mesos-slave recovery

[Diagram sequence: a slave machine runs mesos-slave plus an executor and two tasks in containers (pages 153-154); mesos-slave is absent while the executor and tasks remain (page 155); mesos-slave is present again alongside the executor and tasks (pages 156-157).]

Page 158:

mesos-slave recovery

Since 0.14.0!

Running in production at Twitter for ~1 year (enabled by default since 0.15.0)

Learn More:

https://github.com/apache/mesos/blob/master/docs/slave-recovery.md

Page 159:

what’s being built?

Page 160:

① primitives for stateful applications

Page 161:

stateful applications

better support for running frameworks like HDFS, Cassandra, directly on Mesos!

Learn More:

https://issues.apache.org/jira/browse/MESOS-1554

Page 162:

② primitives for maintenance

Pages 163-169:

maintenance

aka “planned failures”

[Diagram sequence: two mesos-slave machines, one of them with a task (pages 163-166); a second task appears in the staging state (page 167); a single task remains, in the running state (page 168).]

Learn More:

https://issues.apache.org/jira/browse/MESOS-1592

Page 170:

③ primitives for smarter resource allocation and scheduling

Page 171:

resource allocation

original implementation of resource allocation in Mesos was “pessimistic”

Google’s Omega introduced the concept of “optimistic” resource allocation, which Mesos is a natural fit for!

Learn More:

https://issues.apache.org/jira/browse/MESOS-1607

Page 172:

④ improvements in resource isolation

Page 173:

resource isolation

① network bandwidth

② disk block I/O

③ ...

Learn More:

https://issues.apache.org/jira/browse/MESOS-1585

Page 174:

⑤ revised scheduler/executor API

Page 175:

API

2009 relic: HTTP (instead of Thrift) but not REST

(still evaluating JSON-RPC vs REST)

Last major change before 1.0!

Existing API and libraries will remain backwards compatible, but will be deprecated until 2.0 and then removed

Page 176:

agenda

① Cluster Management at Twitter

② Mesos

③ Mesos at Twitter

④ VMs, IaaS, and Mesos

⑤ Mesos Deeper Dive

⑥ Mesos Ecosystem

⑦ Conclusion

Pages 177-179:

my other computer is a datacenter*

* collection of physical and/or virtual machines

Page 181:

the ops perspective

Pages 182-183:

the datacenter is just another form factor

Page 184:

why can’t we run apps on our datacenters just like we run applications on our mobile phones?

Page 186:

Hadoop, Cassandra, Rails, Jenkins, memcached

Page 187:

the dev perspective

Page 188:

applications don’t fit on a single computer anymore

Page 189:

"BIG"

Page 191:

(1) lots of data … (2) lots of users … growing every day

Page 192:

today’s applications need lots of resources (CPUs, memory, disk)

Page 193:

we’re all building distributed systems

Page 194:

but everybody keeps reinventing the wheel

Page 195:

how many more buggy failure detectors will we build until we stop?

Page 196:

[Diagram labels: desktop computer, server, datacenter; each shown with an OS.]

the datacenter computer needs an operating system

Page 197:

operating system

“a collection of software that manages the computer hardware resources and provides common services for computer programs”

- Wikipedia

Pages 198-199:

datacenter operating system

“a collection of software that manages the datacenter computer hardware resources and provides common services for computer programs”

- Wikipedia

Page 200:

Apache Mesos: datacenter kernel

Page 201:

Apache Mesos: distributed systems kernel

Page 202:

Thank You!

mesos.apache.org

mesos.apache.org/blog

@ApacheMesos