5 must-have patterns for your microservice - buildstuff

>>> 5 must-have Patterns for your web-scale Microservices @aliostad Ali Kheyrollahi, ASOS

Transcript of 5 must-have patterns for your microservice - buildstuff

Page 1: 5 must-have patterns for your microservice - buildstuff

>>> 5 must-have Patterns for your web-scale Microservices

@aliostad

Ali Kheyrollahi, ASOS

Page 2: 5 must-have patterns for your microservice - buildstuff

@aliostad

> stackoverflow

> £1.5 bln global fashion destination

> 35% every year

Page 3: 5 must-have patterns for your microservice - buildstuff

/// ASOS in numbers

2016 Turnover → £1.5 bln

Active Customers → 12 M

New Products / wk → 4 k

Unique Visits / mo → 123 M

Page Views / day → 95 M

Platform Teams → 40

Azure Data Centres → 5

Page 4: 5 must-have patterns for your microservice - buildstuff

/// Microservices Architecture

Page 5: 5 must-have patterns for your microservice - buildstuff

/// why microservices

> Scaling people not the solution

> Decentralising decision centres => Agility

> Frequent deployment => Agility

> Reduced complexity of each microservice (Divide and Conquer) => Agility

> Overall solution complex but ...

Page 6: 5 must-have patterns for your microservice - buildstuff

/// anecdote

Often you can measure your success in implementing Microservice Architecture not by the number of services you build, but by the number you decommission.

Page 7: 5 must-have patterns for your microservice - buildstuff

/// microservices vs soa

| | SOA | Microservices |
| Main Goal | Architectural Decoupling | Agility |
| Audience | Mainly Architecture | Business (Everyone) |
| Set out to solve | Architectural Coupling | Scaling People, Frequent Deployment |
| Impact on Structure of Organisation | Minimal | Huge |
| Service Cardinality | Usually up to a dozen | >40 (Commonly >100) |
| When to do | Always | teams > ~5** |
| Law | Conway’s | Reverse Conway’s |

** Debatable. There are articles and discussions on this very topic

Page 8: 5 must-have patterns for your microservice - buildstuff

/// microservice challenges

> Very difficult to build a complete mental picture of solution

> When things go wrong, need to know where before why

> Potentially increased latency

> Performance outliers intractable to solve

> A complete mind-shift requiring a new operating model

Page 9: 5 must-have patterns for your microservice - buildstuff

/// probability distribution

[Figure: probability distribution of response time. x-axis: Response Time, y-axis: Probability]

Page 10: 5 must-have patterns for your microservice - buildstuff

/// performance outliers

Microservice A: 99th Percentile = 500ms

Microservice B: 99th Percentile = 500ms

| Latency | A | B | Total |
| <1s | 99% | 99% | 98.01% |
| >500ms | 1% | 99% | 0.99% |
| >500ms | 99% | 1% | 0.99% |
| >1s | 1% | 1% | 0.01% |
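
The arithmetic behind these figures can be reproduced in a few lines. This is a minimal sketch, assuming the services' latencies are independent and that each one exceeds its 99th percentile on 1% of calls; the function name `prob_at_least_one_slow` is illustrative and not from the talk.

```python
def prob_at_least_one_slow(p_slow: float, n_services: int) -> float:
    """Probability that at least one of n independent calls is an outlier."""
    # All n calls are fast with probability (1 - p_slow) ** n.
    return 1.0 - (1.0 - p_slow) ** n_services


two = prob_at_least_one_slow(0.01, 2)    # the A + B case from the slide
ten = prob_at_least_one_slow(0.01, 10)   # assumed: a request touching ten services

print(f"2 services:  {two:.2%}")
print(f"10 services: {ten:.2%}")
```

For two services the result is 1.99% (0.99% + 0.99% + 0.01%), so the combined call no longer meets a 99th percentile target even though each service does. With ten services under the same assumptions it grows to roughly 9.6%.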

Page 11: 5 must-have patterns for your microservice - buildstuff

/// ActivityId Propagator

Page 12: 5 must-have patterns for your microservice - buildstuff

/// ActivityId

> Every customer request matters

> Every request is unique

> Every request creates a chain (or tree) of calls/events

> Activities are correlated

> You need an ActivityId (or CorrelationId) to link calls/events

Page 13: 5 must-have patterns for your microservice - buildstuff

/// ActivityId

[Diagram: a Microservice receives a request carrying an Id, holds the Id in Thread Local Storage, and attaches it to calls To Other APIs and to each Event]

Page 14: 5 must-have patterns for your microservice - buildstuff

/// ActivityId - HTTP

Request

GET /api/v2/foo HTTP/1.1
host: foo.com
activity-id: 96c5a1f106ce468ebcca8303ed7464bd

Response

200 OK
activity-id: 96c5a1f106ce468ebcca8303ed7464bd
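
The propagation step can be sketched as follows. This is an illustration only, not the code from the demo: it assumes the `activity-id` header name shown above, uses Python's `contextvars` in the role of thread-local storage, and the function names `begin_request` and `outgoing_headers` are hypothetical.

```python
import contextvars
import uuid
from typing import Optional

ACTIVITY_HEADER = "activity-id"  # header name as shown on the slide

# Plays the role of thread-local storage; also isolates concurrent async tasks.
_activity_id = contextvars.ContextVar("activity_id", default=None)


def begin_request(headers: dict) -> str:
    """Reuse the caller's ActivityId if present, otherwise start a new one."""
    incoming = {key.lower(): value for key, value in headers.items()}
    activity_id = incoming.get(ACTIVITY_HEADER) or uuid.uuid4().hex
    _activity_id.set(activity_id)
    return activity_id


def outgoing_headers(extra: Optional[dict] = None) -> dict:
    """Headers for a downstream call or event, carrying the current ActivityId."""
    headers = dict(extra or {})
    current = _activity_id.get()
    if current is not None:
        headers[ACTIVITY_HEADER] = current
    return headers
```

A service would call `begin_request` once per incoming request and build every outbound call, log entry, and event from `outgoing_headers`, so the same Id appears at each hop.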

Page 15: 5 must-have patterns for your microservice - buildstuff

/// ActivityId Demo

Page 16: 5 must-have patterns for your microservice - buildstuff

/// Retry and Timeout Policy

Page 17: 5 must-have patterns for your microservice - buildstuff

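/// Failure

Microservice A: 1% chance of failure

Microservice B: 1% chance of failure

[Diagram: a call from Microservice A to Microservice B fails (X); A waits (back-off) and retries; the retry fails (X); A waits (back-off longer) and retries again]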
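
A retry with increasing back-off can be sketched as below. This is a minimal illustration under assumptions: the doubling delay, the default of three attempts, and the name `call_with_retry` are choices made here, not values from the talk. The `sleep` parameter is injectable so the waits can be observed without actually pausing.

```python
import time


def call_with_retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on failure wait base_delay, then 2x, 4x ... and retry."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * (2 ** attempt))  # back off longer each time
```

In practice the caught exception type would be narrowed to errors known to be transient, and only idempotent operations should be retried this way.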

Page 18: 5 must-have patterns for your microservice - buildstuff

/// Preemptive Timeout

[Diagram: Microservice A calls Microservice B with a short timeout; when the timeout expires (X), A retries, again with a short timeout]

Page 19: 5 must-have patterns for your microservice - buildstuff

/// Timeout

[Diagram: services A, B and C]

A > B > C

A > B + C

Page 20: 5 must-have patterns for your microservice - buildstuff

/// Choosing a timeout?

Static => Based on Server SLO

Dynamic => 95th percentile
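
One way to read the two options is sketched below: derive the timeout from the 95th percentile of recently observed latencies, and fall back to the static SLO value when no observations exist yet. The nearest-rank percentile method and the names `percentile` and `choose_timeout` are assumptions made for this illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def choose_timeout(recent_latencies_ms, static_slo_ms, pct=95):
    """Dynamic timeout from observed latencies; static SLO when there is no data."""
    if not recent_latencies_ms:
        return static_slo_ms
    return percentile(recent_latencies_ms, pct)
```

For example, with latencies of 1 to 100 ms observed once each, `choose_timeout` returns 95, so about 5% of calls would be cut short and retried.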

Page 21: 5 must-have patterns for your microservice - buildstuff

/// IO Monitor

Page 22: 5 must-have patterns for your microservice - buildstuff

/// Blame Game

“If there is a single place where you can play blame game, instead of collective responsibility, it is in Microservices troubleshooting”

Page 23: 5 must-have patterns for your microservice - buildstuff

/// Did you say IO??

[Diagram: a Microservice calling a DB, an API and a Cache]

Measure... every time your code goes out of your process

Page 24: 5 must-have patterns for your microservice - buildstuff

/// Recording Methods

> Explicitly by calling record()

> Asking the library to record a closure

> Aspect-oriented

Java (spf4j)

private static final MeasurementRecorder recorder =
    RecorderFactory.createScalableCountingRecorder(forWhat, unitOfMeasurement, sampleTimeMillis);

…
recorder.record(measurement);

.NET (PerfIt)

var ins = new SimpleInstrumentor(new InstrumentationInfo()
{
    Counters = CounterTypes.StandardCounters,
    Description = "test",
    InstanceName = "Test instance",
    CategoryName = TestCategory
});

ins.Instrument(() => Thread.Sleep(100), "test...");

Java and .NET

@PerformanceMonitor(warnThresholdMillis=1, errorThresholdMillis=100, recorderSource = RecorderSourceInstance.Rs5m.class)

[PerfItFilter("PerfItTests", InstanceName = "Test")]
public string Get()
{
    return Guid.NewGuid().ToString();
}
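
The three recording styles can be compared side by side in a library-neutral sketch. The `Recorder` class below is hypothetical and keeps measurements in memory; it does not reproduce the spf4j or PerfIt APIs, it only mirrors the explicit, closure, and aspect-oriented shapes.

```python
import functools
import time


class Recorder:
    """Collects (name, elapsed_ms) measurements in memory."""

    def __init__(self):
        self.measurements = []

    def record(self, name, elapsed_ms):
        """1. Explicit: the caller measures and reports."""
        self.measurements.append((name, elapsed_ms))

    def instrument(self, func, name):
        """2. Closure: the recorder times the callable it is given."""
        start = time.perf_counter()
        try:
            return func()
        finally:
            self.record(name, (time.perf_counter() - start) * 1000)

    def monitored(self, name):
        """3. Aspect-oriented: a decorator wraps every call to the function."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                return self.instrument(lambda: func(*args, **kwargs), name)
            return wrapper
        return decorator


recorder = Recorder()
recorder.record("db.query", 12.5)
recorder.instrument(lambda: time.sleep(0.01), "cache.get")


@recorder.monitored("api.call")
def call_api():
    return "ok"


call_api()
```

The decorator form keeps timing code out of the business logic, at the cost of measuring the whole function rather than a single outbound call.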

Page 25: 5 must-have patterns for your microservice - buildstuff

/// Publishing Methods

> Local file (various to logstash)

> TCP and HTTP (many, to zipkin, influxdb)

> UDP (statsd, collectd to graphite, logstash)

> Raising Kernel-level event (Windows ETW)

> Local communication (statsd)
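
As an illustration of the UDP option, the sketch below sends one timing in the statsd line format (`name:value|ms`) to a local agent. The host and port defaults assume a statsd daemon on its conventional port 8125, and the function names are hypothetical.

```python
import socket


def format_timing(name: str, elapsed_ms: int) -> bytes:
    """Build a statsd timer line: <name>:<value>|ms"""
    return f"{name}:{elapsed_ms}|ms".encode("ascii")


def publish_timing(name, elapsed_ms, host="127.0.0.1", port=8125):
    """Fire-and-forget send; UDP gives no delivery guarantee but never blocks the caller on a slow collector."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_timing(name, elapsed_ms), (host, port))
```

Losing an occasional datagram is usually an acceptable trade for keeping measurement off the request's critical path.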

Page 26: 5 must-have patterns for your microservice - buildstuff

/// Dapper

Dapper Paper, Google, 2010

Scalable

Transparent

Low-overhead

-> Span Id

-> Parent Id

-> Trace Id
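
The relationship between the three identifiers can be sketched as a small data structure. This is an illustration with assumed names (`Span.root`, `Span.child`) and an assumed 16-character hex id format, not the representation used by Dapper or zipkin.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


def _new_id() -> str:
    return uuid.uuid4().hex[:16]


@dataclass(frozen=True)
class Span:
    trace_id: str              # shared by every span in one request tree
    span_id: str               # unique to this call
    parent_id: Optional[str]   # span that caused this one; None for the root

    @staticmethod
    def root() -> "Span":
        return Span(trace_id=_new_id(), span_id=_new_id(), parent_id=None)

    def child(self) -> "Span":
        """Span for a downstream call made while handling this span."""
        return Span(trace_id=self.trace_id, span_id=_new_id(), parent_id=self.span_id)
```

Grouping spans by `trace_id` and following `parent_id` links is enough to rebuild the call tree for one request.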

Page 27: 5 must-have patterns for your microservice - buildstuff

/// zipkin

zipkin by Twitter

[Diagram: zipkin components: Collector, Storage, Query, Web]

Page 28: 5 must-have patterns for your microservice - buildstuff

/// Sampling

“The first production version of Dapper used a uniform sampling probability for all processes at Google, averaging one sampled trace for every 1024 candidates… [however] we are in the process of deploying an adaptive sampling scheme that is parameterized not by a uniform sampling probability, but by a desired rate of sampled traces per unit time.”

Dapper Paper

Zipkin samples in the collector using a strategy pattern: an implementation of CollectorSampler abstract class.
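
The two sampling ideas in the quote can be contrasted in a simplified sketch. Neither class is Dapper's implementation or zipkin's `CollectorSampler`; the class names, the one-second window, and the injectable `rng` and `clock` parameters are assumptions made for illustration.

```python
import random
import time


class UniformSampler:
    """Keep each trace with a fixed probability, e.g. 1/1024."""

    def __init__(self, probability, rng=random.random):
        self.probability = probability
        self._rng = rng

    def should_sample(self) -> bool:
        return self._rng() < self.probability


class RateSampler:
    """Keep at most max_per_second traces in each one-second window."""

    def __init__(self, max_per_second, clock=time.monotonic):
        self.max_per_second = max_per_second
        self._clock = clock
        self._window = None
        self._count = 0

    def should_sample(self) -> bool:
        window = int(self._clock())
        if window != self._window:        # a new second has started: reset the budget
            self._window, self._count = window, 0
        if self._count < self.max_per_second:
            self._count += 1
            return True
        return False
```

A fixed probability under-samples quiet services and over-samples busy ones, whereas a rate target keeps trace volume roughly constant regardless of traffic.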

Page 29: 5 must-have patterns for your microservice - buildstuff

/// IO Monitor Demo

Page 30: 5 must-have patterns for your microservice - buildstuff

/// Circuit-Breaker

Page 31: 5 must-have patterns for your microservice - buildstuff

/// tri-state

> Closed: traffic can flow normally

> Open: traffic does not flow

> Half-open: circuit breaker tests the waters again

[Diagram: state transitions between Closed, Open and Half-open, labelled Test, Failure and Wait timeout]
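
The three states can be sketched as a small state machine. This is a simplified illustration: it opens after a count of consecutive failures rather than an error percentage over a rolling window, and the class name, parameter names, and defaults are assumptions rather than any library's API.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, wait_timeout=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.wait_timeout = wait_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None
        self.state = "closed"

    def call(self, operation):
        if self.state == "open":
            if self._clock() - self._opened_at < self.wait_timeout:
                raise CircuitOpenError("circuit is open")
            self.state = "half-open"   # wait timeout elapsed: let one test call through
        try:
            result = operation()
        except Exception:
            self._failures += 1
            # A failed test call, or too many failures while closed, opens the circuit.
            if self.state == "half-open" or self._failures >= self.failure_threshold:
                self.state = "open"
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self.state = "closed"          # success closes the circuit again
        return result
```

While the circuit is open the dependency receives no traffic at all, which gives it room to recover and lets the caller fail quickly instead of waiting on timeouts.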

Page 32: 5 must-have patterns for your microservice - buildstuff

/// Netflix Hystrix

RequestVolumeThreshold

ErrorThresholdPercentage

SleepWindowInMilliseconds

TimeInMilliseconds

NumBuckets

Page 33: 5 must-have patterns for your microservice - buildstuff

/// Fallback

> Custom: e.g. serve content from a local cache (status 206)

> Silent: return null/no-data/empty (status 200/204)

> Fail-fast: Customer experience is important (status 5xx)
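
The three fallback styles can be sketched as interchangeable handlers. The status codes follow the slide (206, 204, and a 5xx, here 503 as an assumed choice); the function names and the `(status, body)` return shape are hypothetical.

```python
def with_fallback(operation, fallback):
    """Run operation(); if it fails, let the chosen fallback build the response."""
    try:
        return 200, operation()
    except Exception as error:
        return fallback(error)


def custom_from_cache(cache, key):
    """Custom: serve possibly stale content from a local cache."""
    return lambda error: (206, cache[key])


def silent(error):
    """Silent: return no data."""
    return 204, None


def fail_fast(error):
    """Fail-fast: surface the failure immediately."""
    return 503, str(error)
```

Which handler fits depends on the feature: a recommendations panel can go silent, whereas a checkout step is usually better off failing fast.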

Page 34: 5 must-have patterns for your microservice - buildstuff

/// Canary and Health Endpoint

Page 35: 5 must-have patterns for your microservice - buildstuff

/// Health Endpoints

Ping returns a success code when invoked

Canary returns a connectivity status and latency on the service and dependencies

“… none of them invoke any application code”

Page 36: 5 must-have patterns for your microservice - buildstuff

/// Ping

Request

GET /api/health HTTP/1.1
host: foo.com

Response

200 OK

Response

500 Server Error

Page 37: 5 must-have patterns for your microservice - buildstuff

/// Canary

Request

GET /api/canary HTTP/1.1
host: foo.com

Response

200 OK

{

[Nested Structure]

}

Page 38: 5 must-have patterns for your microservice - buildstuff

/// ChirpResult

{ "serviceName": "foo", "latency": "00:00:00.0542172", "statusCode": 200, "isCritical": true }

Page 39: 5 must-have patterns for your microservice - buildstuff

/// ChirpResult

Page 40: 5 must-have patterns for your microservice - buildstuff

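/// ChirpResult - critical failure

[Diagram: an API with three dependencies: two non-critical (NC), both returning 200, and one critical (C) returning 500; the API's canary returns 500]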

Page 41: 5 must-have patterns for your microservice - buildstuff

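/// ChirpResult - non-critical failure

[Diagram: an API with three dependencies: one non-critical (NC) returning 500, one non-critical (NC) returning 200, and one critical (C) returning 200; the API's canary returns 200]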
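
The rule the two diagrams suggest can be sketched as follows: the canary reports failure only when a critical dependency fails. The field names `statusCode` and `isCritical` come from the ChirpResult example; the function name and the choice to treat any status of 400 or above as a failure are assumptions.

```python
def overall_status(chirps):
    """Return 500 if any critical dependency failed, otherwise 200."""
    for chirp in chirps:
        if chirp["isCritical"] and chirp["statusCode"] >= 400:
            return 500
    return 200


critical_failure = [
    {"serviceName": "nc-1", "statusCode": 200, "isCritical": False},
    {"serviceName": "nc-2", "statusCode": 200, "isCritical": False},
    {"serviceName": "c-1", "statusCode": 500, "isCritical": True},
]
non_critical_failure = [
    {"serviceName": "nc-1", "statusCode": 500, "isCritical": False},
    {"serviceName": "nc-2", "statusCode": 200, "isCritical": False},
    {"serviceName": "c-1", "statusCode": 200, "isCritical": True},
]

print(overall_status(critical_failure))
print(overall_status(non_critical_failure))
```

The individual chirp results would still be returned in the response body, so a failing non-critical dependency remains visible even though the overall status stays at 200.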

Page 42: 5 must-have patterns for your microservice - buildstuff

/// AOP / Declarative (c#)

[AzureStorageCanary("Foo-AzureStorage-BarDatabaseServer", "config-key-for-cn")]
[SqlCanary("SQL-BazActiveDatabase", null, typeof(SqlConnectionFactory))]
[CanaryEndpointCanary("Dependency-Api", "config-key-for-endpoint")]
public class CanaryController : CanaryBaseController
{
    … // some boilerplate code
}

Page 43: 5 must-have patterns for your microservice - buildstuff

/// Deep vs Shallow

[Diagram: an API calling another API, with “Deep” and “Shallow” canary calls]

/api/canary?deep=false

Page 44: 5 must-have patterns for your microservice - buildstuff

/// Canary Demo

Page 45: 5 must-have patterns for your microservice - buildstuff

/// Wrap-up

> If you have more than ~5 teams, consider Microservices

> Logging/Monitoring/Alerting: single most important asset

> Use ActivityId Propagator to correlate (consider zipkin)

> Cloud is a jungle™. Without retry/timeout you won’t survive

> Monitor and measure all calls to external services (blame game)

> Protect your systems with circuit-breakers (and isolation)

> Canary helps you detect connectivity from customer view

Page 46: 5 must-have patterns for your microservice - buildstuff

Thomas Wood: Daisy Picture

Thomas Au: Thermometer Picture

Torbakhopper: Cables Picture

Dam Picture - Japan

Hsiung: Lights Picture

Health Endpoint in API Design