Progressive Delivery Patterns & Benefits · 2020-02-27 · How You Roll Matters @davekarow Approach...

Progressive Delivery Patterns & Benefits

Dave Karow Continuous Delivery Evangelist

Split.io

@davekarow

The future is already here — it's just not very evenly distributed.

William Gibson

@davekarow

What is Progressive Delivery?

Patterns In The Wild

Summing It Up

QR Code

@davekarow

What is Progressive Delivery and what are the potential benefits?

@davekarow @davekarow

Carlos Sanchez (Sr. Cloud Software Engineer @ Adobe)

https://blog.csanchez.org/2019/01/22/progressive-delivery-in-kubernetes-blue-green-and-canary-deployments/

Progressive Delivery is the next step after Continuous Delivery, where new versions are deployed to a subset of users and are evaluated in terms of correctness and performance before rolling them to the totality of the users and rolled back if not matching some key metrics.

@davekarow

Potential Benefits of Progressive Delivery

Avoid Downtime

Limit the Blast Radius

Limit WIP / Achieve Flow

Learn During the Process

@davekarow

How You Roll Matters Approach

Benefits Blue/Green

Deployment

Canary Release Feature Flag

Rollout

Feature Delivery

Platform

Avoid

Downtime

Limit The

Blast Radius

Limit WIP /

Achieve Flow

Learn During

The Process

https://www.split.io/blog/learn-the-four-shades-of-progressive-delivery/

Harvey Balls by Sschulte at English Wikipedia [CC BY-SA (https://creativecommons.org/licenses/by-sa/3.0)] @davekarow

How You Roll Matters

@davekarow

Approach

Benefits Blue/Green

Deployment

Canary Release Feature Flag

Rollout

Feature Delivery

Platform

Avoid

Downtime

Limit The

Blast Radius

Limit WIP /

Achieve Flow

Learn During

The Process

@davekarow

https://www.split.io/blog/learn-the-four-shades-of-progressive-delivery/

Harvey Balls by Sschulte at English Wikipedia [CC BY-SA (https://creativecommons.org/licenses/by-sa/3.0)]

Feature Delivery Platform Capabilities

@davekarow

Let’s Venture Into the Wild!

Bruce Turner from AustinTX

https://www.flickr.com/people/66994844@N00

[CC BY (https://creativecommons.org/licenses/by/2.0)]

@davekarow

Booking.com

@davekarow

Booking.com: a well-documented example of the pattern

@davekarow

https://medium.com/booking-com-development/moving-fast-breaking-things-and-fixing-them-as-quickly-as-possible-a6c16c5a1185

@davekarow

@davekarow @davekarow

@davekarow

Feature Flag Managing exposure like a dimmer or light board

0%

10%

20%

50%

100%

@davekarow

Booking.com’s experience with Manage: “asynchronous feature release” ● Deploying has no impact on user experience ● Deploy more frequently with less risk to business and users ● The big win is Agility

@davekarow

@davekarow

Monitoring the needle in the haystack If you roll out a change to just 5% of your population

...and 20% (1 in 5) of the exposed users get an error,

that’s a HUGE problem!

But, what % of your total user population is getting that error?

1%

@davekarow

@davekarow

Booking.com’s experience with Monitor: “Experimentation as a safety net” ● Each new feature is wrapped in its own experiment ● Allows: monitoring and stopping of individual changes ● The developer or team responsible for the feature can enable

and disable it... ● ...regardless of who deployed the new code that contained it.

@davekarow

Booking.com safety net automated: “circuit breaker” ● Active for the first three minutes of feature release ● Severe degradation → automatic abort of that feature ● Acceptable divergence from core value of local ownership

and responsibility where it’s a “no brainer” that users are being negatively impacted

@davekarow

Booking.com Circuit Breaker (Automatic Stopping)

@davekarow

Guardrail metrics

@davekarow

@davekarow

Booking.com’s experience with Experimentation: A way to validate ideas ● Measure (in a controlled manner) the impact changes have

on user behaviour ● Every change has a clear objective (explicitly stated

hypothesis on how it will improve user experience) ● Measuring allows validation that desired outcome is achieved

@davekarow

Feature Flag Experimentation example

@davekarow

50% 50%

The quicker we manage to validate new ideas the less time is wasted on things that don’t work and the more time is left to work on things that make a difference.

Booking’s Big Takeaway

@davekarow

LinkedIn XLNT/LIX

@davekarow

● Built a targeting engine that could “split” traffic between existing and new code

● Impact analysis was by hand only (and took ~2 weeks), so nobody did it :-(

Essentially just feature flags without automated feedback

LinkedIn early days: a modest start for XLNT

@davekarow

LinkedIn XLNT Today A controlled release with standardized KPI calculation launched very 5 minutes

100 releases per day

6000 metrics that can be “followed” by any stakeholder: “What releases are moving the numbers I care about?”

@davekarow

Lessons learned at LinkedIn ● Build for scale: no more coordinating over email ● Make it trustworthy: targeting and analysis must be rock solid ● Design for diverse teams, not just data scientists

Ya Xu Head of Data Science, LinkedIn Decisions Conference 10/2/2018

@davekarow

Step 1 Feature flags

Step 2 Sensors Correlation

Step 3 Stats Engine Causation

“Holy Grail” Mgmt console System of record Alerting

Increasing functionality & company adoption

Co

st t

o b

uild

an

d m

ain

tain

Summing it up: The patterns are proven

@davekarow

Maturity hasn’t come easily or fast for the pioneers

Split implements these patterns as a service: split.io

@davekarow

@davekarow

Checklists to DIY or Buy ● Foundational Capabilities You’ll Need ● How-To’s: Monitor & Experiment

@davekarow

Decouple deploy from release

❏ Allow changes of exposure w/o new deploy or rollback

❏ Support targeting by UserID, attribute (population), random hash

Foundational Capability #1

@davekarow

Automate a reliable and consistent way to answer,

“Who have we exposed this to so far?”

❏ Record who hit a flag, which way they were sent, and why

❏ Confirm that targeting is working as intended

❏ Confirm that expected traffic levels are reached


@davekarow

Automate a reliable and consistent way to answer,

“How is it going for them (and us)?”

❏ Automate comparison of system health (errors, latency, etc…)

❏ Automate comparison of user behavior (business outcomes)

❏ Make it easy to include “Guardrail Metrics” in comparisons to

avoid the local optimization trap


@davekarow

Limit the blast radius of unexpected consequences so you can replace the

“big bang” release night with more frequent, less stressful rollouts.

Build on the foundational capabilities to:

❏ Ramp in stages, starting with dev team, then dogfooding, then % of public

❏ Monitor at feature rollout level, not just globally (vivid facts vs faint signals)

❏ Alert at the team level (build it/own it)

❏ Kill if severe degradation detected (stop the pain now, triage later)

❏ Continue to ramp up healthy features while “sick” are ramped down or killed

How-To: Release Faster With Less Risk

@davekarow

Focus precious engineering cycles on “what works” with experimentation,

making statistically rigorous observations about what moves KPIs (and what

doesn’t).

Build on the foundational capabilities to:

❏ Target an experiment to a specific segment of users

❏ Ensure random, deterministic, persistent allocation to A/B/n variants

❏ Ingest metrics chosen before the experiment starts (not cherry-picked after)

❏ Compute statistical significance before proclaiming winners

❏ Design for diverse audiences, not just data scientists (buy-in needed to stick)

How-To: Engineer for Impact (Not Output)

@davekarow

Whatever you are, try to be a good one.

William Makepeace Thackeray

@davekarow

Progressive Delivery Resources tinyurl.com/pd4u2020 (just posted to twitter as well)

@davekarow

Progressive Delivery Patterns & Benefits · 2020-02-27 · How You Roll Matters @davekarow Approach...

Documents

Transcript of Progressive Delivery Patterns & Benefits · 2020-02-27 · How You Roll Matters @davekarow Approach...