Building a Data-Friendly Platform for a Data-Driven Future

© 2016 Mesosphere, Inc. All Rights Reserved. Building a Data-Friendly Platform for a Data-Driven Future. Benjamin Hindman - @benh

Transcript of Building a Data-Friendly Platform for a Data-Driven Future

Page 1: Building a Data-Friendly Platform for a Data-Driven Future

© 2016 Mesosphere, Inc. All Rights Reserved.

Building a Data-Friendly Platform for a Data-Driven Future

Benjamin Hindman - @benh

Page 2: Building a Data-Friendly Platform for a Data-Driven Future

INTRO

$ whoami

BENJAMIN HINDMAN

Co-founder and Chief Architect of Mesosphere, Inc.

Formerly Twitter, UC Berkeley

Co-creator of Apache Mesos

@benh

Page 3: Building a Data-Friendly Platform for a Data-Driven Future


REINFORCING TRENDS

microservices containerization

container/cluster/resource management

Page 5: Building a Data-Friendly Platform for a Data-Driven Future


REINFORCING TRENDS

microservices containerization

container/cluster/resource management

big data & analytics

Page 6: Building a Data-Friendly Platform for a Data-Driven Future

TWITTER MONORAIL

Page 7: Building a Data-Friendly Platform for a Data-Driven Future

TWITTER SERVICES

Page 8: Building a Data-Friendly Platform for a Data-Driven Future


PAINS FROM THE MONOLITHIC ARCHITECTURE

Page 9: Building a Data-Friendly Platform for a Data-Driven Future


MONOLITHIC TO MICROSERVICES

Page 10: Building a Data-Friendly Platform for a Data-Driven Future

TREND TOWARDS MICROSERVICES

1. Do one thing and do it well (UNIX).

2. Compose!

3. Test and debug in isolation.

4. Captures organizational structure (many teams working in parallel).

Page 11: Building a Data-Friendly Platform for a Data-Driven Future

MICROSERVICES

Traditional Application Architecture

• Many functions in a single process

• Hard to scale, wasting resources

• Siloed teams

Today’s Microservices Application Architecture

• Each element of functionality defined as “microservices”

• Scalable, efficient and fully dynamic

• Cross-functional teams organized around capabilities

REST APIs

Page 12: Building a Data-Friendly Platform for a Data-Driven Future


CONTAINERIZATION

then now

Page 13: Building a Data-Friendly Platform for a Data-Driven Future

CONTAINERIZATION

then now

more moving parts less moving parts

Page 15: Building a Data-Friendly Platform for a Data-Driven Future

ANATOMY OF MODERN APPLICATIONS

[Diagram. Functions & Logic: microservices in containers. Big Data & Analytics: distributed systems. Component labels: Data Storage, Big Data Processing, Message Queue, Anything else.]

Page 16: Building a Data-Friendly Platform for a Data-Driven Future

NEED FOR NEW INFRASTRUCTURE

Microservices, containers, and big data lead to more difficult operations without proper infrastructure.

Challenges:

1) Deployment and scheduling for fault-tolerance.

2) Scheduling for elasticity and efficiency.

3) Service discovery (“naming”) and networking.

4) Operations: maintenance and upgrades.

5) …

Page 17: Building a Data-Friendly Platform for a Data-Driven Future


CHALLENGES

FAILURES

Page 18: Building a Data-Friendly Platform for a Data-Driven Future

CHALLENGES

MAINTENANCE

Page 19: Building a Data-Friendly Platform for a Data-Driven Future

BREAK OUT OF TRADITIONAL INFRASTRUCTURE SILOS

Apache Mesos and the DC/OS

TRADITIONAL APPROACH

PaaS 1, PaaS 2, Container App 1, Container App 2, Big Data Analytics 1, Big Data Analytics 2, Stateful Service 1, Stateful Service 2

• Many silos.

• Management nightmare.

• Lengthy cycles to deploy code.

• Low utilization.

UNIFIED APPROACH

PaaS (All), Container Apps (All), Big Data Analytics (All), Stateful Service (All)

• High performance and resource isolation.

• Easy scalability and multi-tenancy.

• Fault tolerant and highly available.

• Highly efficient with highest utilization.

Deploy on-prem or in cloud

Page 20: Building a Data-Friendly Platform for a Data-Driven Future

UTILIZATION

RUN EVERYTHING ON THE SAME SHARED INFRASTRUCTURE

Siloed, over-provisioned servers, low utilization

Industry average: 12-15% utilization

Page 21: Building a Data-Friendly Platform for a Data-Driven Future

UTILIZATION

RUN EVERYTHING ON THE SAME SHARED INFRASTRUCTURE

Siloed, over-provisioned servers, low utilization: industry average 12-15% utilization

Automated schedulers, workload multiplexing, fewer machines or more applications: 30-40% utilization, up to 96% for some users
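As a rough illustration of why workload multiplexing raises utilization, the sketch below compares silos that are each provisioned for their own peak with a single shared pool sized for the combined peak. The workload names and demand figures are made up for illustration; they are not measurements from the talk.

```python
# Hypothetical hourly CPU demand for three workloads (made-up numbers).
demand = {
    "web":       [20, 60, 80, 30],
    "analytics": [70, 20, 10, 60],
    "batch":     [10, 10, 40, 50],
}


def utilization(used_per_hour, capacity):
    """Fraction of provisioned CPU-hours that were actually used."""
    return sum(used_per_hour) / (capacity * len(used_per_hour))


# Combined demand across all workloads, hour by hour.
total_per_hour = [sum(hour) for hour in zip(*demand.values())]

# Siloed: every workload gets its own cluster sized for its own peak.
silo_capacity = sum(max(series) for series in demand.values())
siloed = utilization(total_per_hour, silo_capacity)

# Shared: one cluster sized for the peak of the combined demand.
shared_capacity = max(total_per_hour)
shared = utilization(total_per_hour, shared_capacity)

print(f"siloed: {siloed:.1%} of {silo_capacity} CPUs")
print(f"shared: {shared:.1%} of {shared_capacity} CPUs")
```

Because the peaks of different workloads rarely coincide, the shared pool needs fewer machines for the same work, which is the effect the slide attributes to automated schedulers.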

Page 22: Building a Data-Friendly Platform for a Data-Driven Future

EVOLUTION: FROM STATIC TO DYNAMIC INFRASTRUCTURE

Existing Computing Infrastructure Is Inefficient And Not Suitable for Modern Workloads

Computing Infrastructure Evolution (over time t)

MAINFRAME: Data / Transaction Processing

PHYSICAL (x86): Client-Server Apps (ERP, CRM, Productivity)

VIRTUAL: Web-Based Applications (Enterprise & Consumer)

UNIFIED: Stateless Microservices, Stateful Distributed Systems, Analytics

From Siloed, Static, Monolithic & Manual to Efficient, Dynamic, Agile & Automated

Page 23: Building a Data-Friendly Platform for a Data-Driven Future


THE DATACENTER COMPUTER

Page 24: Building a Data-Friendly Platform for a Data-Driven Future

DATACENTER COMPUTER PRINCIPLES

1. TREAT MACHINES AS CATTLE NOT PETS

Keep the base operating system small and simple, run “containerized” applications.

2. AUTOMATE WITH SOFTWARE NOT HUMANS

Let software schedule software, i.e., handle failures, improve utilization, and manage maintenance.
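One way to picture software scheduling software is a reconciliation step that compares what should be running with what is running on healthy machines and re-places anything that was lost. The sketch below is a minimal illustration under assumed names (`reconcile`, agents `m1` to `m3`, tasks such as `web-1`); it is not how Mesos or DC/OS implement failure handling.

```python
def reconcile(desired, running, healthy_agents):
    """Return a placement where every desired task runs on a healthy agent.

    desired: set of task names that should be running.
    running: dict mapping task name -> agent it was last seen on.
    healthy_agents: set of agents currently considered alive.
    """
    if not healthy_agents:
        raise RuntimeError("no healthy agents to place tasks on")

    # Keep tasks that are still running on healthy agents.
    placement = {
        task: agent
        for task, agent in running.items()
        if task in desired and agent in healthy_agents
    }

    # Re-place everything else on the least-loaded healthy agent.
    for task in sorted(desired):
        if task in placement:
            continue
        load = {agent: 0 for agent in healthy_agents}
        for agent in placement.values():
            load[agent] += 1
        placement[task] = min(sorted(healthy_agents), key=lambda a: load[a])
    return placement


desired = {"web-1", "web-2", "db-1"}
running = {"web-1": "m1", "web-2": "m2", "db-1": "m2"}
healthy = {"m1", "m3"}  # hypothetical scenario: m2 has failed

new_placement = reconcile(desired, running, healthy)
print(new_placement)
```

No machine is special here: when `m2` disappears, its tasks simply move, which is the "cattle, not pets" principle applied to scheduling.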

Page 25: Building a Data-Friendly Platform for a Data-Driven Future

Datacenter Computer Needs an Operating System

[Diagram: desktop computer, server, and datacenter, each shown with an OS]

Page 26: Building a Data-Friendly Platform for a Data-Driven Future

Distributed Systems Kernel (Mesos)

DATACENTER OPERATING SYSTEM (DC/OS)

Distributed systems kernel to abstract resources


Page 27: Building a Data-Friendly Platform for a Data-Driven Future

Distributed Systems Kernel (Mesos)

DATACENTER OPERATING SYSTEM (DC/OS)

Distributed systems kernel to abstract resources

User Interface (GUI & CLI)

Core system services (e.g., distributed init, cron, service discovery, package mgt & installer, storage)


Datacenter Operating System (DC/OS)

Page 28: Building a Data-Friendly Platform for a Data-Driven Future

Distributed Systems Kernel (Mesos)

DATACENTER OPERATING SYSTEM (DC/OS)

Distributed systems kernel to abstract resources

User Interface (GUI & CLI)

Core system services (e.g., distributed init, cron, service discovery, package mgt & installer, storage)


Datacenter Operating System (DC/OS)

On Premise AWS Azure

Page 29: Building a Data-Friendly Platform for a Data-Driven Future


Page 30: Building a Data-Friendly Platform for a Data-Driven Future

MESOS

WHAT IS APACHE MESOS?

Apache Mesos is a general-purpose cluster manager (i.e., not just focused on batch computation).

Page 31: Building a Data-Friendly Platform for a Data-Driven Future

SOLUTIONS | ACADEMIA VS INDUSTRY

CLUSTER MANAGEMENT

Academia Industry

Page 32: Building a Data-Friendly Platform for a Data-Driven Future

SOLUTIONS | ACADEMIA VS INDUSTRY

DIFFERENT SOFTWARE

Academia

● MPI (Message Passing Interface)

Industry

● Apache (mod_perl, mod_php)

● Web Service (Java, Ruby, …)

Page 33: Building a Data-Friendly Platform for a Data-Driven Future

SOLUTIONS | ACADEMIA VS INDUSTRY

DIFFERENT SCALE (AT FIRST)

Academia

● 100’s of machines

Industry

● 10’s of machines

Page 34: Building a Data-Friendly Platform for a Data-Driven Future

SOLUTIONS | ACADEMIA VS INDUSTRY

CLUSTER MANAGEMENT

Academia

● PBS (Portable Batch System)

● TORQUE

● SGE (Sun Grid Engine)

Industry

● SSH

● Puppet / Chef

● Capistrano / Ansible

Cluster Managers

Page 35: Building a Data-Friendly Platform for a Data-Driven Future

SOLUTIONS | ACADEMIA VS INDUSTRY

DIFFERENT SCALE (CONVERGING)

Academia

● 100’s of machines

Industry

● 10’s of machines

1,000s of machines

Page 36: Building a Data-Friendly Platform for a Data-Driven Future

SOLUTIONS | ACADEMIA VS INDUSTRY

CLUSTER MANAGEMENT

Academia

● PBS (Portable Batch System)

● TORQUE

● SGE (Sun Grid Engine)

Industry

● SSH

● Puppet / Chef

● Capistrano / Ansible

Batch Computation!

Page 37: Building a Data-Friendly Platform for a Data-Driven Future

Mesos is a cluster manager with a master/agent architecture

masters

agents

Page 38: Building a Data-Friendly Platform for a Data-Driven Future

MESOS

2-LEVEL SCHEDULING

master

agents

coordinator coordinator coordinator

Page 39: Building a Data-Friendly Platform for a Data-Driven Future

Mesos challenged the status quo of cluster managers

Page 40: Building a Data-Friendly Platform for a Data-Driven Future

cluster manager status quo

cluster manager

application

specification

the specification includes as much information as possible to assist the cluster manager in scheduling and execution

Page 41: Building a Data-Friendly Platform for a Data-Driven Future

cluster manager status quo

cluster manager

application

wait for task to be executed

Page 42: Building a Data-Friendly Platform for a Data-Driven Future

cluster manager status quo

cluster manager

application

result

Page 43: Building a Data-Friendly Platform for a Data-Driven Future

problems with specifications

● hard to specify certain desires or constraints

● hard to update specifications dynamically as tasks execute and finish/fail

Page 44: Building a Data-Friendly Platform for a Data-Driven Future

the bigger picture

cluster manager has inadequate knowledge of distributed system’s execution needs/semantics to make optimal decisions

distributed system’s execution needs/semantics can’t easily or efficiently be expressed to cluster manager

Page 45: Building a Data-Friendly Platform for a Data-Driven Future

MapReduce specification?

Page 46: Building a Data-Friendly Platform for a Data-Driven Future

MapReduce specification?

fine-grained coarse-grained

Page 47: Building a Data-Friendly Platform for a Data-Driven Future

MapReduce specification?

fine-grained: best resource utilization, but hard if not impossible to specify

coarse-grained

Page 48: Building a Data-Friendly Platform for a Data-Driven Future

MapReduce specification?

fine-grained

coarse-grained: worst resource utilization, but easy to express (how most cluster managers run something like Hadoop)
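The trade-off between the two granularities can be made concrete with a little arithmetic. The sketch below assumes a hypothetical job with three phases whose CPU needs differ; the phase sizes and durations are invented for illustration.

```python
# Hypothetical job: (phase name, CPUs needed, duration in minutes).
phases = [("map", 100, 10), ("shuffle", 20, 5), ("reduce", 40, 10)]

total_minutes = sum(duration for _, _, duration in phases)
cpu_minutes_used = sum(cpus * duration for _, cpus, duration in phases)

# Coarse-grained: hold the peak CPU count for the whole job.
# Easy to specify ("give me 100 CPUs"), but idle CPUs stay reserved.
coarse_reserved = max(cpus for _, cpus, _ in phases) * total_minutes

# Fine-grained: hold only what each phase needs while it runs.
# Best utilization, but the needs change as the job progresses.
fine_reserved = cpu_minutes_used

coarse_utilization = cpu_minutes_used / coarse_reserved
fine_utilization = cpu_minutes_used / fine_reserved

print(f"coarse-grained: {coarse_utilization:.0%}")
print(f"fine-grained:   {fine_utilization:.0%}")
```

The fine-grained figure is an upper bound: reaching it requires expressing per-phase needs up front, which is exactly what the slides describe as hard to specify.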

Page 49: Building a Data-Friendly Platform for a Data-Driven Future

distributed systems register with the Mesos master(s) in order to run computations

masters

agents

frameworks

Page 50: Building a Data-Friendly Platform for a Data-Driven Future

Mesos model

masters

coordinator

request

3 CPUs

2 GB RAM

a request is a purposely simplified subset of a specification, including just the required resources at that point in time

Page 51: Building a Data-Friendly Platform for a Data-Driven Future

question: what should you do if you can’t satisfy a request?

Page 52: Building a Data-Friendly Platform for a Data-Driven Future

question: what should you do if you can’t satisfy a request?

● wait until you can …

Page 53: Building a Data-Friendly Platform for a Data-Driven Future

question: what should you do if you can’t satisfy a request?

● wait until you can …

● offer best you can immediately

Page 55: Building a Data-Friendly Platform for a Data-Driven Future

Mesos model

masters

coordinator

offer

hostname

4 CPUs

4 GB RAM
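The two possible answers to an unsatisfiable request can be contrasted in a few lines. This is a sketch with hypothetical function names and resource keys (`wait_strategy`, `offer_strategy`, `cpus`, `mem_gb`), not Mesos code; it only shows that an offer hands the decision back to the requester instead of blocking.

```python
def wait_strategy(request, free):
    """Grant the request only if all of it fits; otherwise return None (keep waiting)."""
    if all(free.get(name, 0) >= amount for name, amount in request.items()):
        return dict(request)
    return None


def offer_strategy(request, free):
    """Immediately offer whatever is free; the requester decides what to do with it.

    The request is deliberately ignored in this sketch: the offer describes
    what is available, not what was asked for.
    """
    return dict(free)


request = {"cpus": 3, "mem_gb": 2}
free = {"cpus": 2, "mem_gb": 4}  # hypothetical agent with too few CPUs

waited = wait_strategy(request, free)
offered = offer_strategy(request, free)
print(waited)
print(offered)
```

With the offer in hand, the requester can choose to run a smaller task now, or decline and wait for a better offer.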

Page 56: Building a Data-Friendly Platform for a Data-Driven Future

offer

hostname

4 CPUs

4 GB RAM

offer

hostname

4 CPUs

4 GB RAM

offer

hostname

4 CPUs

4 GB RAM

Mesos model

masters

coordinator

offer

hostname

4 CPUs

4 GB RAM

Page 57: Building a Data-Friendly Platform for a Data-Driven Future

offer

hostname

4 CPUs

4 GB RAM

offer

hostname

4 CPUs

4 GB RAM

offer

hostname

4 CPUs

4 GB RAM

Mesos model

masters

coordinator

offer

hostname

4 CPUs

4 GB RAM

distributed system uses the offers to perform its own scheduling

Page 58: Building a Data-Friendly Platform for a Data-Driven Future

Mesos model

masters

coordinator

distributed system uses the offers to perform its own scheduling

task

3 CPUs

2 GB RAM

Page 59: Building a Data-Friendly Platform for a Data-Driven Future

Mesos model

masters

coordinator

distributed system uses the offers to perform its own scheduling

multi-level scheduling

task

3 CPUs

2 GB RAM

Page 60: Building a Data-Friendly Platform for a Data-Driven Future

1st level: master makes allocations via offers

2nd level: distributed systems schedule tasks using offers
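The two levels can be sketched as a small simulation. This is an illustration, not the Mesos API: the `Master`, `Coordinator`, `Offer`, and `Task` names, the first-fit policy, and the agent names are assumptions, with sizes chosen to mirror the slides (offers of 4 CPUs / 4 GB RAM, tasks of 3 CPUs / 2 GB RAM).

```python
from dataclasses import dataclass


@dataclass
class Offer:
    hostname: str
    cpus: float
    mem_gb: float


@dataclass
class Task:
    name: str
    cpus: float
    mem_gb: float


class Master:
    """1st level: allocates resources by making offers."""

    def __init__(self, agents):
        self.free = dict(agents)  # hostname -> (cpus, mem_gb)

    def make_offers(self):
        return [Offer(host, cpus, mem)
                for host, (cpus, mem) in self.free.items()
                if cpus > 0 and mem > 0]

    def launch(self, offer, tasks):
        cpus = sum(t.cpus for t in tasks)
        mem = sum(t.mem_gb for t in tasks)
        free_cpus, free_mem = self.free[offer.hostname]
        if cpus > free_cpus or mem > free_mem:
            raise ValueError("tasks exceed offered resources")
        self.free[offer.hostname] = (free_cpus - cpus, free_mem - mem)


class Coordinator:
    """2nd level: the distributed system's own scheduler."""

    def __init__(self, pending):
        self.pending = list(pending)

    def on_offer(self, offer):
        """Pick pending tasks that fit in this offer (first fit)."""
        chosen, cpus, mem = [], offer.cpus, offer.mem_gb
        for task in list(self.pending):
            if task.cpus <= cpus and task.mem_gb <= mem:
                chosen.append(task)
                self.pending.remove(task)
                cpus -= task.cpus
                mem -= task.mem_gb
        return chosen


master = Master({"agent-1": (4, 4), "agent-2": (4, 4)})
coordinator = Coordinator([Task("a", 3, 2), Task("b", 3, 2), Task("c", 1, 1)])

placements = {}
for offer in master.make_offers():
    tasks = coordinator.on_offer(offer)
    master.launch(offer, tasks)
    placements[offer.hostname] = [t.name for t in tasks]

print(placements)
```

The master never needs to know why the coordinator picked those tasks; it only tracks which resources were offered and which were used.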

Page 61: Building a Data-Friendly Platform for a Data-Driven Future


MESOS

LEVEL OF INDIRECTION

Mesos (agents)

coordinator

Mesos (master)

coordinator

responsible for allocation (and reallocation) of resources

Page 62: Building a Data-Friendly Platform for a Data-Driven Future

MESOS

DATACENTER KERNEL

provides common functionality every new distributed system re-implements:

• resource allocation

• resource deallocation

• resource reservations

• resource isolation

• resource monitoring

• failure detection

• package distribution

• task starting, killing, cleanup

• volume management

• …

[Figure labels: today, tomorrow, YOU]

don’t reinvent the wheel!

Page 63: Building a Data-Friendly Platform for a Data-Driven Future

distributed systems are hard to build

Page 64: Building a Data-Friendly Platform for a Data-Driven Future

distributed systems are hard to operate: deploy, maintain, update

Page 65: Building a Data-Friendly Platform for a Data-Driven Future

operating distributed systems

Page 66: Building a Data-Friendly Platform for a Data-Driven Future

operating distributed systems

● download

● deploy (read book; HA? arrangement?)

● monitor (read book; logs, metrics, alerting)

● maintain

● debug (read book, ask internet, IRC)

● “fix” problem (update runbooks, write scripts, build ancillary system!)

● upgrade (and/or redeploy)

Page 67: Building a Data-Friendly Platform for a Data-Driven Future

perspective

distributed systems should be able to operate themselves: deploy, monitor, update, upgrade, etc.

need an interface that enables distributed systems to communicate with the underlying infrastructure, and vice versa

Page 68: Building a Data-Friendly Platform for a Data-Driven Future

perspective

distributed systems should be able to operate themselves: deploy, monitor, update, upgrade, etc.

need an interface that enables distributed systems to communicate with the underlying infrastructure, and vice versa

Apache Mesos

Page 69: Building a Data-Friendly Platform for a Data-Driven Future

why: the bigger picture

humans have inadequate knowledge of distributed system needs/semantics to make optimal decisions (even after reading the book)

distributed system execution needs/semantics can’t easily or efficiently be expressed to humans and/or underlying infrastructure

Page 70: Building a Data-Friendly Platform for a Data-Driven Future

Linux

applications “operate” themselves on Linux: when they need more CPU, they ask the Linux kernel for more CPUs (processes, threads); when they need certain operations to be performed (e.g., write/read to file), they ask the Linux kernel; …

Page 71: Building a Data-Friendly Platform for a Data-Driven Future

Mesos

distributed systems “operate” themselves on Mesos: when they need more CPU, they ask Mesos for more CPUs (tasks, containers); when they need certain operations to be performed (e.g., create a persistent volume), they ask Mesos; …

Page 72: Building a Data-Friendly Platform for a Data-Driven Future

AUTOMATED OPERATIONS OF DISTRIBUTED SYSTEMS

Software will manage itself, using Mesos and the DC/OS API

● Most distributed systems are difficult to manage but they don’t need to be.

Kafka: Messaging backbone

Spark: Data processing engine

Cassandra: Distributed database

HDFS: Distributed file system

Page 73: Building a Data-Friendly Platform for a Data-Driven Future

THE UNIVERSE

Page 74: Building a Data-Friendly Platform for a Data-Driven Future


TWEETER ARCHITECTURE ON DC/OS

Page 75: Building a Data-Friendly Platform for a Data-Driven Future


CASE STUDY

Forging Ahead with Mesos, Containers and DC/OS

Having now run our event streaming and big data ingestion pipeline services in production on DC/OS, across 3 regions, over the last year, we've achieved the following results:

● A 66% reduction in AWS Instances

● Cost Improvements up to 57%

● An impressive 40 sec time to deploy a new build with zero downtime

● A 3 min time to stand up a new region

● 100% Uptime

● Total Resources needed: 1 DevOps Engineer

Page 76: Building a Data-Friendly Platform for a Data-Driven Future


Page 77: Building a Data-Friendly Platform for a Data-Driven Future

Check out dcos.io

THANK YOU