Cracking the Container Scale Problem with Apache Mesos

54
Cracking the Container Scale Problem with Apache Mesos Connor Doyle [email protected] Sunil Shah [email protected]

Transcript of Cracking the Container Scale Problem with Apache Mesos

Page 1: Cracking the Container Scale Problem with Apache Mesos

Cracking the Container ScaleProblem with Apache Mesos

Connor [email protected]

Sunil [email protected]

Page 2: Cracking the Container Scale Problem with Apache Mesos

We are Mesosphere

Page 3: Cracking the Container Scale Problem with Apache Mesos

240 million monthly activeusers

500 million tweets per dayUp to 150k tweets per secondMore than 100TB per day of

compressed data

Page 4: Cracking the Container Scale Problem with Apache Mesos

“Mesos is the cornerstone of our elasticcompute infrastructure — it's how we build allour new services and is critical for Twitter'scontinued success at scale. It's one of the

primary keys to our data center efficiency.” —Chris Fry, SVP of Engineering at Twitter

Page 5: Cracking the Container Scale Problem with Apache Mesos
Page 6: Cracking the Container Scale Problem with Apache Mesos

Mesoswhat?  Marathon  Chronos

Demo`s!

Page 7: Cracking the Container Scale Problem with Apache Mesos

Status quo is static partitioning and use of virtual machines

Page 8: Cracking the Container Scale Problem with Apache Mesos

Add some virtual machines

Page 9: Cracking the Container Scale Problem with Apache Mesos

Provision Hadoop

Page 10: Cracking the Container Scale Problem with Apache Mesos

Provision a web service

Page 11: Cracking the Container Scale Problem with Apache Mesos

Moar data, moar Hadoop

Page 12: Cracking the Container Scale Problem with Apache Mesos

Mesos let us treat a cluster ofnodes...

Page 13: Cracking the Container Scale Problem with Apache Mesos

As one big computer

Page 14: Cracking the Container Scale Problem with Apache Mesos

Not as individual machines

Not as VMs

Page 15: Cracking the Container Scale Problem with Apache Mesos

But as computational resourceslike cores, memory, disks, etc.

Page 16: Cracking the Container Scale Problem with Apache Mesos
Page 17: Cracking the Container Scale Problem with Apache Mesos
Page 18: Cracking the Container Scale Problem with Apache Mesos
Page 19: Cracking the Container Scale Problem with Apache Mesos
Page 20: Cracking the Container Scale Problem with Apache Mesos

Mesos for all the things

Page 21: Cracking the Container Scale Problem with Apache Mesos

Mesos is...Open Source Apache project Cluster Resource Manager Scalable to 10,000s of nodes Fault-tolerant, battle-tested An SDK for distributed apps

Page 22: Cracking the Container Scale Problem with Apache Mesos

The Mesos ecosystem isgrowing

Page 23: Cracking the Container Scale Problem with Apache Mesos

 Marathon  Mesoswhat?

Chronos

Demo!

Page 24: Cracking the Container Scale Problem with Apache Mesos

Say hi to Marathon

Page 25: Cracking the Container Scale Problem with Apache Mesos

a self-serve interface to your cluster

Page 26: Cracking the Container Scale Problem with Apache Mesos

distributed "init" for long-running services

Page 27: Cracking the Container Scale Problem with Apache Mesos

a private fault-tolerant PaaS

Page 28: Cracking the Container Scale Problem with Apache Mesos

a container orchestration platform

Page 29: Cracking the Container Scale Problem with Apache Mesos

Marathon does it!Start, stop, scale, update appsNice web interface, APIHighly available, no SPoFNative Docker supportPluggable event busRolling deploy / restartApplication health checksArtifact staging

Page 30: Cracking the Container Scale Problem with Apache Mesos

Service Discovery

Set environment variablesRead config from device (rsync'ed to fs)

Read from K-V storeUse DNS

HAProxy works pretty well

Page 31: Cracking the Container Scale Problem with Apache Mesos

Marathon  REST

POST /v2/appsGET /v2/apps

PUT /v2/apps/{appId}GET /v2/apps/{appId}/tasks

DELETE /v2/apps/{appId}/tasks/{taskId}...

Page 32: Cracking the Container Scale Problem with Apache Mesos

   Chronos

Mesoswhat?

Marathon

Demo!

Page 33: Cracking the Container Scale Problem with Apache Mesos

Introducing Chronos

a scheduler for batch and one-off jobs

Page 34: Cracking the Container Scale Problem with Apache Mesos

Distribute a graph of jobs

Dependency graph for execution

Page 35: Cracking the Container Scale Problem with Apache Mesos

FeaturesDistributed job schedulerWeb interface, APIHighly available, no SPoFNative Docker supportEasy scheduling with repeating intervals

Page 36: Cracking the Container Scale Problem with Apache Mesos

Chronos  REST

PUT chronos-node:8080/scheduler/job/job1

GET chronos-node:8080/scheduler/jobs

DELETE chronos-node:8080/scheduler/task/kill/job2

Page 37: Cracking the Container Scale Problem with Apache Mesos

   

Demo!

Mesoswhat?

Marathon

Chronos

Page 38: Cracking the Container Scale Problem with Apache Mesos

Continuous Delivery Pipeline

Page 39: Cracking the Container Scale Problem with Apache Mesos

Code Base

Page 40: Cracking the Container Scale Problem with Apache Mesos

Docker BuildSteps

1.  docker login2.  docker build3.  docker push4.  report tag

Artifacts1.  docker-tag2.  target/marathon.json

Page 41: Cracking the Container Scale Problem with Apache Mesos

DockerfileFROM nginxMAINTAINER Mesosphere [email protected]

EXPOSE 80

ADD app/ /usr/share/nginx/html

Page 42: Cracking the Container Scale Problem with Apache Mesos

marathon.json{ "id": "/mesosphere/cd-demo-app", "instances": 1, "cpus": 1, "mem": 512, "container": { "type": "DOCKER", "docker": { "image": "mesosphere/cd-demo-app:$tag", "network": "BRIDGE", "portMappings": [ { "servicePort": 28080, "containerPort": 80, "hostPort": 0, "protocol": "tcp" }, { "servicePort": 28443, "containerPort": 443, "hostPort": 0, "protocol": "tcp" } ] } }, "healthChecks": [ { "gracePeriodSeconds": 120, "intervalSeconds": 30, "maxConsecutiveFailures": 3, "path": "/"

Page 43: Cracking the Container Scale Problem with Apache Mesos

Report Tagmkdir -p target

echo %TAG% > target/docker-tag

cat marathon.json | \ jq '.container.docker.image |= "%DOCKER_IMAGE%:%TAG%"' > target/marathon.json

Page 44: Cracking the Container Scale Problem with Apache Mesos

Artifacts

Page 45: Cracking the Container Scale Problem with Apache Mesos

Deploy1.  http PUT http://marathon/v2/apps < marathon.json2.  Send slack message

Page 46: Cracking the Container Scale Problem with Apache Mesos

Slack Messageecho '{ "username" :"%SLACK_USERNAME%", "channel" :"%SLACK_CHANNEL%", "text" :"%SLACK_MESSAGE%", "icon_emoji" :"%SLACK_EMOJI%", "mrkdwn" :%SLACK_MARKDOWN%}' | http --print=HhBb --json POST %SLACK_WEBHOOK_URL%

Page 47: Cracking the Container Scale Problem with Apache Mesos

Build Parameters

Page 48: Cracking the Container Scale Problem with Apache Mesos

Roll OutMarathon will now begin rolling out1 the updated version of the

application and your team has just been notified of thedeployment.

[1] https://mesosphere.github.io/marathon/docs/deployments.html#rolling-restarts

Page 49: Cracking the Container Scale Problem with Apache Mesos

Run a Docker containerhttp -v POST http://10.53.20.188:8080/v2/apps @app-ruby.json

{ "container": { "type": "DOCKER", "docker": { "image": "superguenter/demo-app" } }, "cmd": "rails server -p $PORT", "id": "rails-demo", "instances": 1, "cpus": 0.01, "mem": 256, "ports": [3000]}

Page 50: Cracking the Container Scale Problem with Apache Mesos

Scale!http -v PUT http://10.53.20.188:8080/v2/apps/demo @scale-app.json

{ "instances": 3}

Page 51: Cracking the Container Scale Problem with Apache Mesos

Deploy a new Python versionhttp -v PUT http://10.53.20.188:8080/v2/apps/demo @app-python.json

{ "container": { "type": "DOCKER", "docker": { "image": "superguenter/demo-app" } }, "cmd": "python -m SimpleHTTPServer $PORT", "id": "demo", "instances": 3, "cpus": 0.01, "mem": 256, "ports": [3000]}

Page 52: Cracking the Container Scale Problem with Apache Mesos

Use HTTP healthcheckshttp -v POST http://10.53.20.188:8080/v2/apps @app-with-healthcheck.json

{ "container": { "type": "DOCKER", "docker": { "image": "superguenter/demo-app" } }, "cmd": "rails server -p $PORT", "id": "healthcheck-demo", "instances": 1, "cpus": 0.01, "mem": 256, "ports": [3000], "healthChecks": [ { "path": "/", "portIndex": 0, "protocol": "HTTP", "gracePeriodSeconds": 30, "intervalSeconds": 30, "timeoutSeconds": 30,

"maxConsecutiveFailures": 0

Page 53: Cracking the Container Scale Problem with Apache Mesos

Add a UNIQUE hostname constrainthttp -v POST http://10.53.20.188:8080/v2/apps @app-with-constraint.json

{ "container": { "type": "DOCKER", "docker": { "image": "superguenter/demo-app" } }, "cmd": "python -m SimpleHTTPServer $PORT", "id": "constraint-demo", "instances": 6, "cpus": 0.01, "mem": 256, "ports": [3000], "healthChecks": [ { "path": "/", "portIndex": 0, "protocol": "HTTP" } ], "constraints": [ ["hostname" "UNIQUE"]

Page 54: Cracking the Container Scale Problem with Apache Mesos

Thanks!Come and talk to us

P.S., we're hiring!

Get Mesosphere packages: 

Read about Marathon: 

Read about Chronos: 

Try out Mesosphere on GCE: 

Come work with us: 

mesosphere.io/downloads

github.com/mesosphere/marathon

github.com/mesos/chronos

google.mesosphere.io

mesosphere.io/jobs