Mesos at OpenTable



Transcript of Mesos at OpenTable

  • Mesos at OpenTable

    Pablo Delgado Senior Data Engineer OpenTable @pablete

    MesosCon 2015, Seattle, WA

  • Over 32,000 restaurants worldwide

    more than 760 million diners seated since 1998, representing more than $30 billion spent at partner restaurants

    Over 16 million diners seated every month

    OpenTable has seated over 190 million diners via a mobile device. Almost 50% of our reservations are made via a mobile device

    OpenTable currently has a presence in the US, Canada, Mexico, the UK, Germany, and Japan

    OpenTable has nearly 600 partners including Facebook, Google, TripAdvisor, Urbanspoon, Yahoo and Zagat.

    2

    OpenTable, the world's leading provider of online restaurant reservations

  • At OpenTable

    we aim to power

    the best dining experiences!

  • Service Oriented Architecture

  • 5

    From monolith to microservices

  • 6

    Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. Paper: http://mesos.berkeley.edu/mesos_tech_report.pdf

    Omega: Flexible, Scalable Schedulers for Large Compute Clusters. Paper: http://research.google.com/pubs/pub41684.html

    Apache Mesos


  • 7

    Apache Mesos

    Mesos slaves connect to masters and offer resources such as CPU, disk, and memory.

    Masters pass those offers to frameworks like Singularity, which make the resource-allocation decisions.

    Frameworks in turn choose which resource offers to accept and run tasks on the slaves (a sketch of this offer cycle follows).
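    Below is a minimal, conceptual Scala sketch of that offer cycle. The Offer and Task types and the matching logic are simplifications for illustration, not the real Mesos scheduler API.

        // Conceptual model of the Mesos offer cycle -- not the real Mesos API.
        case class Offer(slaveId: String, cpus: Double, memMb: Int)
        case class Task(name: String, cpus: Double, memMb: Int)

        // A framework inspects each offer and either launches a pending task on it or declines it;
        // declined resources are re-offered by the master later.
        def schedule(offers: Seq[Offer], pending: List[Task]): (List[(Offer, Task)], List[Offer]) = {
          var remaining = pending
          val launched  = List.newBuilder[(Offer, Task)]
          val declined  = List.newBuilder[Offer]
          for (offer <- offers) {
            remaining.find(t => t.cpus <= offer.cpus && t.memMb <= offer.memMb) match {
              case Some(task) =>
                launched += ((offer, task))            // accept: run the task on that slave
                remaining = remaining.filterNot(_ == task)
              case None =>
                declined += offer                      // decline the offer
            }
          }
          (launched.result(), declined.result())
        }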

  • 8

    Apache Mesos

    Diagram: the Mesos cluster spans three availability zones (2a, 2b, 2c). Each zone runs ZooKeeper (managed with Netflix's Exhibitor) and a Mesos master (one active, two standby), plus Mesos slaves running Docker.

  • HubSpot's Singularity Scheduler

  • 10

    Native Docker Support

    JSON REST API and Java Client

    Fully featured web application (replaces and improves on the Mesos master UI)

    Deployments, automatic rollbacks, and health checks

    Configurable email alerts to service owners

    Singularity Features

  • 11

    HubSpot's Singularity

    Process types: Web Services, Workers, Scheduled (cron-type) Jobs, On-Demand Processes

    Slave placement: GREEDY, SEPARATE_BY_DEPLOY, SEPARATE_BY_REQUEST, OPTIMISTIC

    Executors: Mesos executor, Singularity executor, Docker executor

  • Linux Containers

  • 13

    Docker

    Immutability

    Portability

    Isolation

  • Service Discovery

  • 15

    Services no longer live at a well-known address and port, so we needed a registry, a dynamic way to find them. It also had to be Mesos-agnostic.

    Services announce their presence to the Discovery Server

    Services subscribe to changes in their dependencies' announcements

    Services un-announce on termination, or time out on a crash

    Service Discovery

  • 16

    Service Discovery

    Diagram: a Discovery Server backed by ZooKeeper runs in each of the three availability zones (2a, 2b, 2c). Service instances (A, B) announce themselves to a Discovery Server, and other services discover them and subscribe to changes.

  • 17

    Service Discovery API
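    The transcript does not include the API itself, so the sketch below is hypothetical: it only illustrates the announce / subscribe / un-announce pattern described above, assuming a made-up DiscoveryClient interface and service names, not OpenTable's actual discovery API.

        // Hypothetical client interface for the announce/subscribe pattern.
        case class Announcement(serviceType: String, serviceUri: String)

        trait DiscoveryClient {
          def announce(a: Announcement): Unit                        // register this instance
          def unannounce(a: Announcement): Unit                      // remove it on clean shutdown
          def subscribe(serviceType: String)(onChange: Seq[Announcement] => Unit): Unit  // watch dependencies
          def instancesOf(serviceType: String): Seq[Announcement]    // current healthy instances
        }

        // Typical lifecycle of a service using such a client (names are illustrative only):
        def run(discovery: DiscoveryClient): Unit = {
          val self = Announcement("timezone", "http://10.0.1.12:31005")
          discovery.announce(self)
          discovery.subscribe("user-profile") { instances =>
            println(s"user-profile now has ${instances.size} instances")
          }
          sys.addShutdownHook(discovery.unannounce(self))
        }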

  • FrontDoor

  • 19

    FrontDoor

    Route external traffic to internal services

    Simple Discovery-aware proxy

    Dynamic configuration

    Developer-friendly configuration via a Git repo

    Example rule: REQUEST_URI=/api/timezone* passthru timezone (requests whose URI matches /api/timezone* are passed through to the timezone service, located via Discovery; see the sketch below)
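    As a rough illustration of how such a rule could be interpreted, here is a small Scala sketch. It assumes a simplified rule grammar of the form REQUEST_URI=<pattern> passthru <service> with a trailing * as a prefix wildcard; the real FrontDoor implementation and syntax are not shown in the talk.

        // Sketch: interpret rules like "REQUEST_URI=/api/timezone* passthru timezone".
        // The rule grammar here is an assumption based on the single example in the slides.
        case class Rule(pattern: String, service: String) {
          def matches(uri: String): Boolean =
            if (pattern.endsWith("*")) uri.startsWith(pattern.dropRight(1))
            else uri == pattern
        }

        def parseRule(line: String): Option[Rule] = line.trim.split("\\s+") match {
          case Array(lhs, "passthru", service) if lhs.startsWith("REQUEST_URI=") =>
            Some(Rule(lhs.stripPrefix("REQUEST_URI="), service))
          case _ => None
        }

        // Pick the service a request should be proxied to, if any rule matches.
        def route(rules: Seq[Rule], uri: String): Option[String] =
          rules.find(_.matches(uri)).map(_.service)

        // route(parseRule("REQUEST_URI=/api/timezone* passthru timezone").toSeq, "/api/timezone/utc")
        //   == Some("timezone")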

  • Monitoring

  • 21

    Monitoring

    https://github.com/opentable/mesos_stats

    Finds your service name by parsing the task names.

    Includes a Grafana dashboard

    Runs inside Mesos
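    For context on the "parsing the task names" point: Singularity-launched Mesos tasks embed the request (service) name at the front of the task ID, so a stats collector can recover the service name from the task name alone. The naming convention below is an assumption for illustration only, not taken from mesos_stats itself.

        // Assumed task-name convention: "<service>-<deployId>-<launchTime>-<instance>-..."
        // (illustrative only; services whose names themselves contain '-' would need the
        // real convention that mesos_stats parses).
        def serviceName(taskName: String): String =
          taskName.takeWhile(_ != '-')

        // serviceName("timezone-deploy42-1438300800000-1-host-a") == "timezone"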

  • All together

  • 23

    Overview

    Diagram: code lives in GitHub and is built by Continuous Integration into images pushed to a Docker Registry. Singularity deploys those images onto the Mesos cluster (three master/ZooKeeper nodes and many slave nodes running Docker). Discovery servers track running services, and FrontDoor routes external traffic to them.

  • 24

    Diagram: the developer-facing slice of the overview, GitHub, Continuous Integration, the Docker Registry, and Singularity.

    Developers' Concerns

    Initialize projects with a continuous-integration template

    Enable monitoring/logging of application-level errors

    Build the project as an immutable Docker image

    Deploy to Mesos through Singularity using its REST API (see the sketch after this list)
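    A rough sketch of that last step, assuming Singularity's JSON REST API as commonly documented (POST /api/requests to register the request, then POST /api/deploys with the Docker image); treat the base URL, paths, field names, and values as assumptions to verify against the Singularity docs for your version.

        import java.net.URI
        import java.net.http.{HttpClient, HttpRequest, HttpResponse}

        val singularity = "http://singularity.example.com/singularity"   // hypothetical base URL
        val client = HttpClient.newHttpClient()

        def postJson(path: String, json: String): Int = {
          val req = HttpRequest.newBuilder(URI.create(singularity + path))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build()
          client.send(req, HttpResponse.BodyHandlers.ofString()).statusCode()
        }

        // 1. Register (or update) the request -- the long-lived definition of the service.
        postJson("/api/requests",
          """{"id": "timezone", "requestType": "SERVICE", "instances": 2}""")

        // 2. Deploy a specific immutable Docker image for that request.
        postJson("/api/deploys",
          """{"deploy": {
                "requestId": "timezone",
                "id": "build-123",
                "containerInfo": {"type": "DOCKER", "docker": {"image": "registry.example.com/timezone:123"}},
                "resources": {"cpus": 0.5, "memoryMb": 512, "numPorts": 1}
              }}""")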

  • 25

    Diagram: the operations-facing slice of the overview, the Mesos masters with ZooKeeper, the Docker slaves, Singularity, Discovery, FrontDoor, and the Docker Registry.

    Operational Concerns

    Provide Mesos with resources

    Monitor and maintain external traffic routing

    Monitor and replace failing resources

  • 26

    Stateless Simplicity

    The Mesos cluster itself is stateless; stateful systems live outside it:

    Datastores: MySQL, PostgreSQL, MongoDB

    Caches: Redis, Memcached

    Other: ZooKeeper, Amazon S3

  • 27

    Diagram: two data centers, a US data center on AWS us-west-2 and an EU data center on AWS eu-west-1, running PROD environments, with QA and DATA PROCESSING clusters also in us-west-2.

  • 28

    Diagram: the same US (AWS us-west-2) and EU (AWS eu-west-1) data centers, now with a Kafka cluster in each environment.

  • Data Processing

  • 30

    Distributed Multitenant Data Processing

  • 31

    Spark's Approach

    Generalize MapReduce in order to support new applications within the same engine

    General DAGs and data sharing

    Unification benefits both the engine (more efficient) and the user (simpler)

    Handles batch, interactive, and online processing

    APIs available for Java, Scala, Python, SQL, and R

  • 32

    Spark RDDs

    Resilient Distributed Datasets (RDDs) are fault-tolerant distributed collections

    They exist in two forms (sketched below):

    Parallelized collections

    External datasets: distributed datasets built from any storage source supported by Hadoop, including the local file system, HDFS, Cassandra, HBase, Amazon S3, etc.
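    A minimal Scala sketch of both forms, using a placeholder HDFS path and the file/errors example that the next slide's RDD graph is drawn from:

        import org.apache.spark.{SparkConf, SparkContext}

        val sc = new SparkContext(new SparkConf().setAppName("rdd-example").setMaster("local[*]"))

        // 1. Parallelized collection: distribute an in-memory Scala collection.
        val numbers = sc.parallelize(1 to 1000)

        // 2. External dataset: any Hadoop-supported source (the path below is a placeholder).
        val file = sc.textFile("hdfs://namenode:8020/logs/app.log")

        // Derived RDD, analogous to the "errors" FilteredRDD in the next slide's graph.
        val errors = file.filter(_.contains("ERROR")).cache()

        println(errors.count())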

  • 33

    RDD Graph (dataset-level view vs. partition-level view)

    Diagram: a HadoopRDD loaded from hdfs://... ("file") feeds a FilteredRDD ("errors", func = _.contains(...), shouldCache = true). At the dataset level this is simply file -> errors; at the partition level, each partition becomes its own task (Task 1, Task 2, ..., Task n).

  • 34

    Scheduling Process (lifetime of a job)

    rdd1.join(rdd2).groupBy().filter()

    RDD objects: build the operator DAG.

    DAGScheduler: splits the graph into stages of tasks and submits each stage as it becomes ready; it is agnostic to individual operators.

    TaskScheduler: given a TaskSet, launches tasks via the cluster manager and retries failed or straggling tasks; it doesn't know about stages. A failed stage is reported back to the DAGScheduler and resubmitted.

    Worker: executes tasks in threads and stores and serves blocks through its block manager.
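    To make the flow concrete, here is a small runnable Scala sketch (with made-up data) of the kind of job the slide traces: the join and groupByKey introduce shuffle boundaries, so the DAGScheduler splits the work into multiple stages, and calling an action such as count() is what actually submits the job.

        import org.apache.spark.{SparkConf, SparkContext}

        val sc = new SparkContext(new SparkConf().setAppName("scheduling-example").setMaster("local[*]"))

        val rdd1 = sc.parallelize(Seq(1 -> "a", 2 -> "b", 3 -> "c"))
        val rdd2 = sc.parallelize(Seq(1 -> "x", 2 -> "y"))

        // Build the operator DAG lazily: join -> groupByKey -> filter.
        val result = rdd1.join(rdd2)        // shuffle boundary: new stage
          .groupByKey()                     // another shuffle boundary
          .filter { case (_, values) => values.nonEmpty }

        // Nothing runs until an action triggers the DAGScheduler / TaskScheduler pipeline.
        println(result.count())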


  • 38

    Alternating Least Squares (ALS) in MLlib
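    As a pointer to what that slide covers, here is a minimal sketch of training ALS with Spark's RDD-based MLlib API (the 2015-era API). It assumes an existing SparkContext sc (e.g. from spark-shell); the input path and hyperparameter values are placeholders.

        import org.apache.spark.mllib.recommendation.{ALS, Rating}

        // Ratings file assumed to contain "userId,itemId,rating" lines (placeholder path).
        val ratings = sc.textFile("hdfs://namenode:8020/data/ratings.csv").map { line =>
          val Array(user, item, score) = line.split(',')
          Rating(user.toInt, item.toInt, score.toDouble)
        }

        // Train a matrix-factorization model: rank, iterations, and lambda are illustrative.
        val model = ALS.train(ratings, 10, 10, 0.01)

        // Recommend 5 items for user 42.
        model.recommendProducts(42, 5).foreach(println)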

  • 39

    Running Spark

    Diagram: a Driver Program holding the SparkContext talks to a Cluster Manager, which starts an Executor on each Worker Node; each Executor runs Tasks and keeps a Cache.

  • 40

    Diagram: the same picture with Mesos as the cluster manager: the SparkContext talks to the Mesos Master, and on each Worker Node a Mesos Executor runs the Spark tasks and holds the cache.
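    To close the loop with the Mesos setup from earlier, a minimal sketch of pointing Spark at a Mesos cluster, assuming the same ZooKeeper-elected masters as in the architecture diagrams; the hostnames and the executor URI are placeholders.

        import org.apache.spark.{SparkConf, SparkContext}

        val conf = new SparkConf()
          .setAppName("spark-on-mesos-example")
          // Discover the leading Mesos master through ZooKeeper (hosts are placeholders).
          .setMaster("mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos")
          // Where Mesos slaves can fetch a Spark distribution to run executors (placeholder URI).
          .set("spark.executor.uri", "hdfs://namenode:8020/dist/spark-1.4.1-bin-hadoop2.6.tgz")

        val sc = new SparkContext(conf)
        println(sc.parallelize(1 to 100).sum())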