GOTO Chicago - Speed and scale, how to get there

Speed and Scale: How to get there. Adrian Cockcroft @adrianco May 2014

description

Delivering software products at high velocity requires four things. First, a culture of innovation that can see and respond to opportunities. Second, the data and analytics to evaluate alternatives. Third, a culture that can make decisions and assign resources quickly. Fourth, agile development and self-service deployment. A fine-grained, loosely coupled architecture scales as the team size grows; a freedom and responsibility culture provides autonomy for innovation and fast decision making; unstructured "Big Data" analytics gets answers quickly; cloud removes the latency of resource allocation; and DevOps removes the coordination latency that slows down deployment. Traditional enterprise architectures are based on monolithic applications and relational databases. Cloud native architectures are based on building single-function REST-based microservices that support integration across denormalized NoSQL data stores and a wide range of web services. This talk will also discuss strategies, patterns and pathways for a gradual migration towards cloud native.

Transcript of GOTO Chicago - Speed and scale, how to get there

Page 1: GOTO Chicago - Speed and scale, how to get there

Speed and Scale: How to get there.

Adrian Cockcroft @adrianco May 2014

Page 2: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Page 8: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Typical reactions to my Netflix talks…

“You guys are crazy! Can’t believe it” – 2009

“What Netflix is doing won’t work” – 2010

“It only works for ‘Unicorns’ like Netflix” – 2011

“We’d like to do that but can’t” – 2012

“We’re on our way using Netflix OSS code” – 2013

Page 16: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

What I learned from my time at Netflix

● Speed wins in the marketplace

● Remove friction from product development

● High trust, low process, no hand-offs between teams

● Freedom and responsibility culture

● Don’t do your own undifferentiated heavy lifting

● Use simple patterns automated by tooling

● Self service cloud makes impossible things instant

Page 17: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Enterprise IT Adoption of Cloud

By Simon Wardley http://enterpriseitadoption.com/

Page 18: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Speed

Page 19: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Innovation

Page 20: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

New ideas

Page 21: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

New products

Page 22: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

What separates incumbents from disruptors?

Page 23: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Assumptions

Page 24: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Optimizations

Page 25: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

“It isn't what we don't know that gives us trouble, it's what we know that ain't so.”

Will Rogers

http://www.brainyquote.com/quotes/quotes/w/willrogers385286.html

Page 26: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Incumbents follow the $$$

Market size lags disruption because high price products are replaced by low priced products

Page 27: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Disruptors find what used to be expensive

Page 28: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Learn to waste them to save money elsewhere

Page 29: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Examples

Page 30: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Solid State Disk

Example

Page 31: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Storage systems assume random reads are expensive

Decades of filesystems and storage array development based on spinning rust

Page 32: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Random reads (RR) are free. Immutable writes. Log-merge.

SSD works best for random reads and sequential writes. Bad for updates.
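The log-merge idea on this slide can be sketched in a few lines. This is a toy illustration under stated assumptions, not Cassandra's storage engine: the class name `TinyLogMergeStore` and the `flush_threshold` parameter are hypothetical. Writes land in memory and are flushed as immutable sorted segments (a sequential write), reads check the newest data first, and compaction merges segments into a new one instead of updating anything in place.

```python
class TinyLogMergeStore:
    """Toy log-structured merge store: segments are never updated in place."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.memtable = {}   # recent writes, held in memory
        self.segments = []   # immutable sorted segments, oldest first

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # One sequential write of a sorted, immutable segment.
        if self.memtable:
            self.segments.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then segments from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            for k, v in segment:
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all segments into one new segment; later segments override earlier ones.
        merged = {}
        for segment in self.segments:
            merged.update(segment)
        self.segments = [tuple(sorted(merged.items()))] if merged else []


store = TinyLogMergeStore()
store.put("a", 1)
store.put("b", 2)   # reaching the threshold triggers a flush
store.put("a", 3)   # the newer value for "a" goes to a later segment
store.flush()
store.compact()
print(store.get("a"), store.get("b"))
```

An "update" here is just a newer write that shadows the old one until compaction, which is why this pattern suits media that prefer sequential writes and cheap random reads.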

Page 33: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

SSD packaging as disk, as PCI card, now as memory DIMM

Each generation reduces overhead and improves price/performance

Page 34: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Disclosure: Diablo Technologies is a Battery Ventures Portfolio Company. See www.battery.com for a list of portfolio investments

Page 38: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Traditional vs. Cloud Native Storage Architectures

Traditional: Business Logic; Database Master, Fabric, Storage Arrays; Database Slave, Fabric, Storage Arrays

SSDs inside arrays disrupt incumbent suppliers

Cloud native: Business Logic; Cassandra Zone A nodes, Cassandra Zone B nodes, Cassandra Zone C nodes; Cloud Object Store Backups

SSDs inside ephemeral instances disrupt an entire industry

Page 48: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

How to Scale Storage Beyond Ludicrous

● Cassandra scalability

● Linear scale up benchmarked and seen in production

● Hundreds of nodes per cluster in common use today

● Thousands of nodes per cluster actively being tested and used

● Cassandra scale using high end AWS storage instances

● EC2 i2.8xlarge - over 300,000 iops read or write, 6.4TB of SSD

● 100 nodes = 30 million iops and 640 TB - Ludicrous

● 1000 nodes = 300 million iops and 6.4 PB - Plaid!

http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
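The cluster totals above are simple multiplication of the per-node figures on the slide. A minimal sketch of that arithmetic, assuming perfectly linear scaling; the function name `cluster_capacity` is hypothetical, and the totals are raw figures that ignore replication:

```python
IOPS_PER_NODE = 300_000   # "over 300,000 iops" per i2.8xlarge, from the slide
SSD_TB_PER_NODE = 6.4     # SSD per node, from the slide


def cluster_capacity(nodes):
    """Return (total IOPS, total raw SSD in TB), assuming perfectly linear scaling."""
    return nodes * IOPS_PER_NODE, nodes * SSD_TB_PER_NODE


for nodes in (100, 1000):
    iops, tb = cluster_capacity(nodes)
    print(f"{nodes} nodes: {iops / 1e6:.0f} million IOPS, {tb:,.0f} TB")
```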

Page 49: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Disruptor Cassandra

Perfect match for SSD, no write amplification, no updates, scales to plaid

Page 50: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Product Development

Another disruptive example

Page 51: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Assumption: Process prevents problems

Another disruptive example

Page 54: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Non-Cloud Product Development

Months before you find out whether the product meets the need

Hardware provisioning is undifferentiated heavy lifting – replace it with IaaS

Business Need • Documents • Weeks

Approval Process • Meetings • Weeks

Hardware Purchase • Negotiations • Weeks

Software Development • Specifications • Weeks

Deployment and Testing • Reports • Weeks

Customer Feedback • It sucks! • Weeks

IaaS Cloud

Page 55: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Non-Cloud Product Development

Months before you find out whether the product meets the need

Hardware provisioning is undifferentiated heavy lifting – replace it with IaaS

Business Need • Documents • Weeks

Software Development • Specifications • Weeks

Deployment and Testing • Reports • Weeks

Customer Feedback • It sucks! • Weeks

Page 56: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Process Hand-Off Steps for Product Development on IaaS

Product Manager

Development Team

QA Integration Team

Operations Deploy Team

BI Analytics Team

Page 60: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

IaaS Based Product Development

Weeks before you find out whether the product meets the need

Software provisioning is undifferentiated heavy lifting – replace it with PaaS

Business Need • Documents • Weeks

Software Development • Specifications • Weeks

Deployment and Testing • Reports • Days

Customer Feedback • It sucks! • Days

PaaS Cloud

etc…

Page 61: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

IaaS Based Product Development

Weeks before you find out whether the product meets the need

Software provisioning is undifferentiated heavy lifting – replace it with PaaS

Business Need • Documents • Weeks

Software Development • Specifications • Weeks

Customer Feedback • It sucks! • Days

etc…

Page 62: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Process Hand-Off Steps for Feature Development on PaaS

Product Manager

Developer

BI Analytics Team

Page 65: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

PaaS Based Product Feature Development

Days before you find out whether the feature meets the need

Building your own business apps is undifferentiated heavy lifting – use SaaS

Business Need • Discussions • Days

Software Development • Code • Days

Customer Feedback • Fix this Bit! • Hours

SaaS/ BPaaS Cloud

etc…

Page 66: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

PaaS Based Product Feature Development

Days before you find out whether the feature meets the need

Building your own business apps is undifferentiated heavy lifting – use SaaS

Business Need • Discussions • Days

Customer Feedback • Fix this Bit! • Hours

etc…

Page 68: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

SaaS Based Business App Development

Hours before you find out whether the feature meets the need

Business Need • GUI Builder • Hours

Customer Feedback • Fix this bit! • Seconds

and thousands more…

Page 69: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

What Happened?

Rate of change increased

Cost and size and risk of change reduced

Page 70: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Continuous Delivery on Cloud

Page 79: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Continuous Delivery on Cloud

Observe (INNOVATION): Land grab opportunity, Competitive Move, Customer Pain Point, Measure Customers

Orient (BIG DATA): Analysis, Model Hypotheses

Decide (CULTURE): JFDI, Plan Response, Share Plans

Act (CLOUD): Incremental Features, Automatic Deploy, Launch AB Test

Page 80: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Note: Non-Destructive Production Updates

● “Immutable Code” Service Pattern

● Existing services are unchanged, old code remains in service

● New code deploys as a new service group

● No impact to production until traffic routing changes

● A|B Tests, Feature Flags and Version Routing control traffic

● First users in the test cell are the developer and test engineers

● A cohort of users is added looking for measurable improvement

● Finally make default for everyone, keeping old code for a while
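The rollout sequence above can be sketched as a routing table that sits in front of two service groups. This is a minimal sketch under stated assumptions, not Netflix's tooling: `VersionRouter`, its method names and the user ids are all hypothetical. The point it illustrates is that deploying new code changes nothing until routing changes, and rollback is only a routing change because the old group is still running.

```python
class VersionRouter:
    """Toy router: old and new service groups run side by side; only routing changes."""

    def __init__(self, default_version):
        self.default_version = default_version
        self.previous_default = None
        self.test_cells = {}  # user id -> version override

    def add_to_test_cell(self, users, version):
        for user in users:
            self.test_cells[user] = version

    def route(self, user):
        return self.test_cells.get(user, self.default_version)

    def make_default(self, version):
        # Old code keeps running; remember it so rollback is a routing change only.
        self.previous_default = self.default_version
        self.default_version = version
        self.test_cells.clear()

    def rollback(self):
        if self.previous_default is not None:
            self.default_version, self.previous_default = self.previous_default, None


router = VersionRouter("v1")
router.add_to_test_cell(["dev-alice", "qa-bob"], "v2")   # developers and test engineers first
router.add_to_test_cell(["user-17", "user-42"], "v2")    # then a measured cohort
print(router.route("dev-alice"), router.route("user-99"))
router.make_default("v2")                                 # finally the default for everyone
print(router.route("user-99"))
router.rollback()                                         # the old group is still deployed
print(router.route("user-99"))
```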

Page 81: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Disruptor Continuous Delivery

Compute capacity is an ephemeral commodity, learn to waste it to save time and get agility

Page 82: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Development and Operations

Another disruptive example, if you assume they don’t mix…

Page 83: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Developers make code

Page 84: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Operations run code

Page 85: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

It can take weeks to get a VM after a developer files a ticket…

Page 86: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

But if operations is a self service API…

Page 87: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Developers run their own code

Page 88: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Developers are on call

Page 89: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Developers have freedom

Page 90: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Developers have incentives to be responsible

Avoids the externalities of over-dependence on operations to fix everything

Page 91: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Less down time

With the right incentives and tooling developers write code that scales and doesn't break

Page 92: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

No meetings

Developers end up spending more time developing than when they had to keep explaining their code to ops

Page 93: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

DevOps is a re-org, not a new team to hire

For most companies, the cultural transformation needed to do DevOps is the blocker

Page 94: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Disruptor High Trust Culture

DevOps

Give up central coordination and control, to get speed and align incentives

Page 101: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

It’s what you know that isn’t so…

● Make your assumptions explicit

● Extrapolate trends to the limit

● Listen to non-customers

● Follow developer adoption, not IT spend

● Map evolution of products to services to utilities

● Re-organize your teams for speed of execution

Page 102: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

How do we get there?

Page 103: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

“This is the IT swamp draining manual for anyone who is neck deep in alligators.”

Page 104: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Once you’re out of the swamp, read this…

Page 105: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Open Source Ecosystems

● The most advanced, scalable and stable code you can get is OSS

● No procurement cycle, fix and extend it yourself

● Github is a developer’s online resume

● Github is also your company’s online resume!

● Extensible platforms create ecosystems

● Give up control to get ubiquity – Apache license

Innovate, Leverage and Commoditize

Page 106: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Cloud Native for High Availability

● Business logic isolation in stateless micro-services

● Immutable code with instant rollback

● Auto-scaled capacity and deployment updates

● Distributed across availability zones and regions

● De-normalized single function NoSQL data stores

● See over 40 NetflixOSS projects at netflix.github.com

● Get “Technical Indigestion” trying to keep up with techblog.netflix.com

Page 107: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

A Microservice Definition

Loosely coupled service oriented architecture with bounded contexts

See http://en.wikipedia.org/wiki/Domain-driven_design for discussion of bounded contexts

Page 108: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Scaling Continuous Delivery Models

Monolithic

● Devs book a train ticket

● Everyone runs the monolith

● Queue for the next train

● Coordination chat session

● Need to learn deploy process

● Copy code to existing servers

● Few concurrent versions

● Tens of monolithic updates/day maximum

● Roll-forward only

● “Done” is released to prod

Microservices

● Everyone has their own build

● Dev runs their own microservice

● No waiting, no meetings

● API call to update prod timeline

● Automated hands-off deploy

● Immutable code on new servers

● Unlimited concurrent versions

● 100s of independent updates

● Roll-back in seconds

● “Done” is retired from prod

Page 109: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Separate Concerns Using Micro-services

● Invert Conway’s Law – teams own service groups and backend stores

● One “verb” per single function micro-service, size doesn’t matter

● One developer independently produces a micro-service

● Each micro-service is its own build, avoids trunk conflicts

● Deploy in a container: Tomcat, AMI or Docker, whatever…

● Stateless business logic. Cattle, not pets.

● Stateful cached data access layer can use ephemeral instances

http://en.wikipedia.org/wiki/Conway's_law

Page 110: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Microservices Development Architecture

● Client libraries
  Even if you start with a raw protocol, a client side driver is the end-state
  Best strategy is to own your own client libraries from the start

● Multithreading and Non-blocking Calls
  Reactive model RxJava uses Observable to hide concurrency cleanly
  Netty can be used to get non-blocking I/O speedup over Tomcat container

● Circuit Breakers – See Fluxcapacitor.com for code
  NetflixOSS Hystrix, Turbine, Latency Monkey, Ribbon/Karyon
  Also look at Finagle/Zipkin from Twitter
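The circuit breaker pattern named above can be sketched generically. This is a toy illustration of the pattern, not the Hystrix API: the class `CircuitBreaker` and the parameters `max_failures`, `reset_after` and `clock` are hypothetical. After repeated failures the breaker opens and returns a fallback without calling the dependency; after a cool-down it lets one trial call through.

```python
import time


class CircuitBreaker:
    """Toy circuit breaker: fail fast after repeated errors, retry after a cool-down."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()          # open: skip the remote call entirely
            self.opened_at = None          # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0                  # a success closes the circuit
        return result


def flaky():
    raise ConnectionError("dependency down")


breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
results = [breaker.call(flaky, lambda: "cached default") for _ in range(3)]
print(results, breaker.opened_at is not None)
```

The third call in the usage lines never reaches `flaky`, which is the property that stops one slow dependency from tying up every caller.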

Page 111: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Microservice Datastores

● Book: Refactoring Databases
  SchemaSpy to examine schema structure
  Denormalization into one datasource per table or materialized view

● Polyglot Persistence
  Use a mixture of database technologies, behind REST data access layers
  See NetflixOSS Storage Tier as a Service HTTP (staash.com) for MySQL and C*

● CAP – Consistent or Available when Partitioned
  Look at Jepsen torture tests for common systems aphyr.com/tags/jepsen
  There is no such thing as a consistent distributed system, get over it…

Page 112: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Strategies for impatient product managers

● Carrot: “This new feature you want will be ready faster as a microservice”

● Stick: “This new feature you want will only be implemented in the new microservice based system”

● Shiny Object: “Why don’t you concentrate on some other part of the system while we get the transition done?”

Page 113: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Monitoring and Microservices

Page 114: GOTO Chicago - Speed and scale, how to get there

‹#› | Battery Ventures

Issues with Continuous Delivery and Microservices

● High rate of change
  Code pushes can cause floods of new instances and metrics
  Short baseline for alert threshold analysis – everything looks unusual

● Ephemeral Configurations
  Short lifetimes make it hard to aggregate historical views
  Hand tweaked monitoring tools take too much work to keep running

● Microservices with complex calling patterns
  End-to-end request flow measurements are very important
  Request flow visualizations get overwhelmed
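
One way to soften the ephemeral-instance problem is to aggregate metrics under a stable service name rather than an instance id, so the history survives each code push. The sketch below assumes a hypothetical tuple layout of `(timestamp, service, instance_id, value)`; it is not taken from any particular monitoring tool.

```python
from collections import defaultdict


def rollup_by_service(points):
    """Average metric values per service and timestamp.

    points: iterable of (timestamp, service, instance_id, value).
    The instance id is deliberately ignored, so replacing instances
    does not break the historical series.
    """
    sums = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for ts, service, _instance, value in points:
        cell = sums[service][ts]
        cell[0] += value  # running total
        cell[1] += 1      # number of instances reporting
    return {
        service: {ts: total / n for ts, (total, n) in sorted(by_ts.items())}
        for service, by_ts in sums.items()
    }
```

The per-service series can then carry a longer baseline for alert thresholds than any single short-lived instance could.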

Page 115: GOTO Chicago - Speed and scale, how to get there

Microservice Based Architectures

See http://www.slideshare.net/LappleApple/gilt-from-monolith-ruby-app-to-micro-service-scala-service-architecture

From a Gilt Groupe Presentation

Page 117: GOTO Chicago - Speed and scale, how to get there

“Death Star” Architecture Diagrams

As visualized by Appdynamics, Boundary.com and Twitter internal tools

[Three diagrams: Netflix, Gilt Groupe (12 of 450), Twitter]

Page 118: GOTO Chicago - Speed and scale, how to get there

Monitoring Micro-services

● Appdynamics
  Instrument the JVM to capture everything including traffic flows
  Insert tag for every http request with a header annotation guid
  Visualize the over-all flow or the business transaction flow

● Boundary.com and Lyatiss CloudWeaver
  Instrument the packet flows across the network
  Capture the zone and region config from cloud APIs and tags
  Correlate, aggregate and visualize the traffic flows

● Instrumented PaaS Communication Mechanisms
  CloudFoundry and Apcera route all traffic through NATS
  NetflixOSS ribbon client and karyon server http annotation guid
  In-band mechanisms can scale beyond capabilities of centralized tools

Visualizing the request flow
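
The header annotation GUID approach can be sketched as follows. The header name `X-Request-Guid` and the call-log layout are assumptions for illustration; the tools named on this slide each have their own formats. Every service reuses the GUID it received on its outbound calls, and the flow for one request is rebuilt by filtering logged calls on that GUID.

```python
import uuid
from collections import Counter

TRACE_HEADER = "X-Request-Guid"  # hypothetical header name


def outbound_headers(inbound_headers):
    """Reuse the caller's GUID if present, otherwise start a new one."""
    guid = inbound_headers.get(TRACE_HEADER) or str(uuid.uuid4())
    return {TRACE_HEADER: guid}


def flow_edges(call_log, guid):
    """Count caller -> callee edges seen for a single request GUID.

    call_log: iterable of (guid, caller, callee) records.
    """
    return Counter(
        (caller, callee) for g, caller, callee in call_log if g == guid
    )
```

The edge counts are the raw material for a request flow diagram; aggregating them across many GUIDs gives the overall traffic picture.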

Page 119: GOTO Chicago - Speed and scale, how to get there

Continuous Delivery and DevOps Implications

● Changes are smaller but more frequent

● Individual changes are more likely to be broken

● Changes are normally deployed by developers

● Feature flags are used to enable new code

● Instant detection and rollback matters much more
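
A feature flag, in its simplest form, is a runtime switch around the new code path. The sketch below is a toy version under stated assumptions: the `FeatureFlags` class, the `new_ranker` flag name and the `recommend` function are all invented for the example. New code ships dark, is enabled by flipping the flag, and is rolled back the same way without a redeploy.

```python
class FeatureFlags:
    """Hypothetical in-process flag store; real systems usually back
    this with a dynamic configuration service."""

    def __init__(self):
        self._enabled = set()

    def enable(self, name):
        self._enabled.add(name)

    def disable(self, name):
        self._enabled.discard(name)

    def is_enabled(self, name):
        return name in self._enabled


def recommend(user_id, flags):
    # The new code path is deployed but only runs when the flag is on.
    if flags.is_enabled("new_ranker"):
        return f"new-ranking-for-{user_id}"
    return f"old-ranking-for-{user_id}"
```

Because rollback is only a flag flip, how quickly a bad change is undone depends mostly on how fast the problem is detected.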

Page 127: GOTO Chicago - Speed and scale, how to get there

What’s wrong with measuring in minutes?

Takes too long to see a problem

[Chart: metric against threshold, y-axis 0 to 5, x-axis Minute 1 to Minute 7]

Something broke at 2m20

40s of failure didn’t trigger

1st high metric seen at agent on instance

1st high metric arrives at monitoring system

1st high metric processed (maybe)

1st high metric seen on graph

Three datapoints on user graph so looks bad at 8m00.

Page 128: GOTO Chicago - Speed and scale, how to get there

Whoops! I didn’t mean that! Reverting…

Not cool if it takes 5 minutes to see it failed and 5 more to see a fix

No-one notices if it only takes 5 seconds to detect and 5 to see a fix

Page 136: GOTO Chicago - Speed and scale, how to get there

Try that again by the second

More confidence more quickly

[Chart: metric against threshold, y-axis 0 to 5, x-axis Minute 1 to Minute 7]

Something broke at 2m20

Measurable in 1s

1st high metric seen at agent on instance

1st high metric arrives at monitoring system

1st high metric processed

1st high metric seen on graph

Three datapoints on user graph so looks bad at 2m25.
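
The 8m00 and 2m25 figures can be reproduced with a small model. The model is an assumption, one plausible reading of the two timelines rather than something the slides spell out: the metric steps from 1 to 4 when the failure starts, the threshold is 3.5, each collection interval reports its average, two pipeline hops each add one interval, and the problem "looks bad" once three high datapoints are on the graph.

```python
def window_average(start, end, break_at, low, high):
    """Average of a step metric (low before break_at, high after) over [start, end)."""
    bad = max(0.0, end - max(start, break_at))  # seconds spent in failure
    good = (end - start) - bad
    return (good * low + bad * high) / (end - start)


def time_to_notice(interval, break_at=140, low=1.0, high=4.0,
                   threshold=3.5, hops=2, points_needed=3):
    """Seconds from t=0 until enough high datapoints are visible on a graph.

    All defaults are illustrative assumptions, not values from the talk
    (except break_at=140, i.e. "something broke at 2m20").
    """
    k = 0
    while True:
        start, end = k * interval, (k + 1) * interval
        if window_average(start, end, break_at, low, high) > threshold:
            break  # first interval whose average crosses the threshold
        k += 1
    first_on_graph = end + hops * interval  # agent -> monitoring -> graph
    return first_on_graph + (points_needed - 1) * interval
```

Under these assumptions `time_to_notice(60)` returns 480 seconds (8m00), because the minute containing only 40s of failure averages 3.0 and stays under the threshold, while `time_to_notice(1)` returns 145 seconds (2m25).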

Page 137: GOTO Chicago - Speed and scale, how to get there

NetflixOSS Hystrix / Turbine Circuit Breaker Monitoring

http://techblog.netflix.com/2012/12/hystrix-dashboard-and-turbine.html

Streaming metrics directly from services to a web browser each second
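
Per-second streaming needs counters that can be read every second without rescanning raw events. The sketch below is not Hystrix or Turbine code: it shows one common way to do this, a sliding window of one-second buckets, with the class name and window length chosen for illustration. It assumes timestamps are passed in non-decreasing order.

```python
from collections import deque


class RollingCounter:
    """Counts events in one-second buckets over a sliding window."""

    def __init__(self, window_seconds=10):
        self.window = window_seconds
        self.buckets = deque()  # (second, count) pairs, oldest first

    def record(self, now, n=1):
        second = int(now)
        if self.buckets and self.buckets[-1][0] == second:
            # Same second as the newest bucket: add to it.
            self.buckets[-1] = (second, self.buckets[-1][1] + n)
        else:
            self.buckets.append((second, n))
        self._expire(second)

    def _expire(self, second):
        # Drop buckets that have fallen out of the window.
        while self.buckets and self.buckets[0][0] <= second - self.window:
            self.buckets.popleft()

    def total(self, now):
        self._expire(int(now))
        return sum(count for _, count in self.buckets)
```

A dashboard would poll `total()` once a second for each metric (for example successes and failures per dependency) and push the values to the browser.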

Page 139: GOTO Chicago - Speed and scale, how to get there

Latest SaaS Based Monitoring Products

www.vividcortex.com and www.boundary.com

Seeing Problems In Seconds

Page 140: GOTO Chicago - Speed and scale, how to get there

Metric-to-display latency needs to be less than human attention span (~10s)

Page 141: GOTO Chicago - Speed and scale, how to get there

Summary

● Speed wins in the marketplace

● Remove friction from product development

● High trust, low process

● Freedom and responsibility culture

● Don’t do your own undifferentiated heavy lifting

● Simple patterns automated by tooling

● Microservices for speed and availability

Page 142: GOTO Chicago - Speed and scale, how to get there

Separation of Concerns

Bounded Contexts

Page 143: GOTO Chicago - Speed and scale, how to get there

Any Questions?

● Battery Ventures http://www.battery.com
● Adrian’s Blog http://perfcap.blogspot.com
● Slideshare http://slideshare.com/adriancockcroft

● Migrating to Microservices – Qcon London - March 6th, 2014
● Monitorama Opening Keynote Portland OR - May 7th, 2014
● GOTO Chicago Opening Keynote May 20th, 2014
● DevOps Summit at Cloud Expo New York – June 10th, 2014
● Qcon New York – June 11th, 2014
● GOTO Copenhagen/Aarhus – Denmark – Oct 25th, 2014

Disclosure: some of the companies mentioned are Battery Ventures Portfolio Companies

See www.battery.com for a list of portfolio investments