What is exactly anti fragile in dev ops - v3

Post on 27-Jan-2015


Cisco Confidential © 2013 Cisco and/or its affiliates. All rights reserved.

What is Exactly Anti-Fragile in DevOps?

Asher Sterkin

Distinguished Engineer, SPVSS, Cisco Video Systems, Israel

September 30, 2013

Antifragile

“Some things benefit from shocks… volatility, randomness, disorder, and stressors and love adventure, risk, and uncertainty… there is no word for the exact opposite of fragile. Let’s call it antifragile.”

Nassim N. Taleb, “Antifragile: Things That Gain from Disorder”

The Book and Reactions

“The Netflix cloud architecture is anti-fragile… The Netflix culture is anti-fragile… Getting stronger through failure is the basis of anti-fragility. Avoiding failure at all costs… makes you brittle and vulnerable...”

Adrian Cockcroft, “Looking back at 2012 with pointers to 2013”

The Book and Reactions

“If the idea is nice and neat, however, the book that houses it is just the opposite. It is a big, baggy, sprawling mess.”

David Runciman, review of the book in The Guardian, November 21, 2012

Larger Body of Knowledge

• Complex Adaptive Systems

• Highly-Optimized Tolerance

• Technology Development Cycle

• Disruptive Innovations

• Product Development Flow

• Lean Startup

For Today

• De-fragilization

• Skin in the Game

• Barbell

• Asymmetric Pay-off

DevOps Areas

OPS DEV

Area 1: extend delivery to production

Area 2: extend operations feedback to project

Area 3: embed project knowledge into operations

Area 4: embed operations knowledge into project

Patrick Debois: “Codifying devops practices”

De-Fragilization

“Beauty plus pity - that is the closest we can get to a definition of art. Where there is beauty there is pity for the simple reason that beauty must die: beauty always dies, the manner dies with the matter, the world dies with the individual.”

V. Nabokov, “Lecture on Metamorphosis”

Large batches increase cycle time

Large batches increase variability in flow

John Allspaw: “Ops Meta-Metrics”, slides 103-109

Reducing batch size accelerates feedback

Reducing batch size reduces overhead

Reducing batch size reduces risk
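The batch-size claims above can be illustrated with a back-of-the-envelope model. This is only a sketch under stated assumptions: changes are finished at a steady pace, a batch ships as soon as its last change is done, and every change in a failed batch is a suspect. The function names and the one-change-per-day pace are hypothetical, not figures from the talk.

```python
def average_wait_for_release(batch_size: int, days_per_change: float = 1.0) -> float:
    """Average time a finished change waits before its batch ships.

    Assumption: change k of the batch (k = 1..batch_size) is finished at
    time k * days_per_change, and the batch ships when the last one is done.
    """
    release_time = batch_size * days_per_change
    waits = [release_time - k * days_per_change for k in range(1, batch_size + 1)]
    return sum(waits) / batch_size


def suspects_after_failed_release(batch_size: int) -> int:
    """Assumption: if a release breaks, every change in the batch must be examined."""
    return batch_size


for size in (1, 5, 20):
    print(size, average_wait_for_release(size), suspects_after_failed_release(size))
```

In this model the average feedback delay grows as (batch size - 1) / 2 times the pace of work, and the number of suspects after a failed release grows linearly with the batch.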

Batch size and Bottlenecks

Reduce batch size before you attack bottlenecks

“Resilience through Failure”

“Resilience to failure is a lofty goal. It enables a system to survive and withstand failure. There's an even higher peak to strive for, however: making the system stronger and better with each failure.”

A. Tseitlin, “The Antifragile Organization”

Culture of Continuous Learning

“Confusing common causes with special causes will only make things worse.”

W. E. Deming, “Out of the Crisis”
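One common way to tell the two kinds of cause apart is a Shewhart-style control chart: points inside limits derived from a stable baseline are treated as common-cause noise, points outside as candidates for a special cause. The sketch below assumes the usual mean plus or minus three standard deviations rule; the latency numbers and function names are made up for illustration and do not come from the talk.

```python
from statistics import mean, pstdev


def control_limits(baseline):
    """Lower and upper control limits: baseline mean +/- 3 standard deviations."""
    center = mean(baseline)
    sigma = pstdev(baseline)
    return center - 3 * sigma, center + 3 * sigma


def special_cause_points(baseline, observations):
    """Observations outside the limits; everything inside is common-cause variation."""
    low, high = control_limits(baseline)
    return [x for x in observations if x < low or x > high]


# Hypothetical response times in milliseconds.
baseline_latency_ms = [101, 99, 100, 102, 98, 100, 101, 99]
new_latency_ms = [100, 103, 97, 140, 101]

print(special_cause_points(baseline_latency_ms, new_latency_ms))  # [140]
```

Reacting to the 103 or the 97 as if they were incidents would be tampering with a stable process; only the 140 falls outside the limits and justifies an investigation.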

Resilience, Reliability, Robustness

[Figure: timeline labeled Normal Operation (MTBF), Failure!, MTTD, MTTR, Max Downtime, Normal Operation (MTBF), Max Data Lost]

“Obsessive protection of the system against extremely rare events makes it more fragile. Resilience comes before the last percentiles of reliability.”

It’s me
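The trade-off between reliability and resilience can be made concrete with the standard steady-state availability formula. The sketch below rests on assumptions that are not in the slides: MTBF is read as the mean uptime between failures, downtime per failure is taken as MTTD + MTTR (recovery counted from the moment of detection), and all numbers are hypothetical.

```python
import math


def availability(mtbf_hours: float, mttd_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up: uptime / (uptime + downtime per failure)."""
    downtime = mttd_hours + mttr_hours
    return mtbf_hours / (mtbf_hours + downtime)


baseline = availability(mtbf_hours=1000, mttd_hours=1.0, mttr_hours=3.0)
# Option A: fail half as often (more reliability).
rarer_failures = availability(mtbf_hours=2000, mttd_hours=1.0, mttr_hours=3.0)
# Option B: detect and recover twice as fast (more resilience).
faster_recovery = availability(mtbf_hours=1000, mttd_hours=0.5, mttr_hours=1.5)

print(baseline, rarer_failures, faster_recovery)
print(math.isclose(rarer_failures, faster_recovery))
```

Under these assumptions, halving detection and recovery time buys the same availability as doubling the time between failures, and it also shortens the worst single outage, which the MTBF improvement does not.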

Give ’em pagers!

“DevOps: architects, developers, QA/QC, system engineers, and IT cooperate to maximize the company value.”

It’s me again

Barbell

Developers

• Responding to outages, takes on-call

• Alerting systems thresholding, design

• Architecture design and review

• Building metrics collection

• Application configuration

• Shipping public-facing code

Operations

• Responding to outages, takes on-call

• Alerting systems thresholding, design

• Architecture design and review

• Building metrics collection

• Application configuration

• Infrastructure buildout/management

John Allspaw, “Reply to NoOps @ Netflix”

DevOps Areas

OPS DEV

Area 1: extend delivery to production

Area 2: extend operations feedback to project

Area 3: embed project knowledge into operations

Area 4: embed operations knowledge into project

No, not really

Every analogy has its limit. One just needs to learn where to stop.

“The first step to anti-fragility consists in decreasing downside… This brings us to the solution in the form of barbell… Away from Golden Middle.”

N. Taleb, “Antifragile”
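The barbell idea can be reduced to simple arithmetic: put most of the stake where the loss is known to be zero, a small part where it can be lost entirely, and the total downside is capped by construction. The sketch below uses a 90/10 split; the split, the "medium-risk" comparison and all names are illustrative assumptions.

```python
def worst_case_loss(allocation):
    """Largest possible loss as a fraction of the total stake.

    allocation: list of (fraction_of_stake, max_loss_fraction) pairs.
    """
    total = sum(fraction for fraction, _ in allocation)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must add up to 1")
    return sum(fraction * max_loss for fraction, max_loss in allocation)


# Barbell: 90% in something that cannot lose, 10% in something that can lose everything.
barbell = [(0.9, 0.0), (0.1, 1.0)]

# "Golden middle": everything in medium-risk bets. Their true maximum loss is
# not known in advance, so the only safe bound (an assumption here) is a total loss.
golden_middle = [(1.0, 1.0)]

print(worst_case_loss(barbell))
print(worst_case_loss(golden_middle))
```

The barbell does not claim the risky part is safe; it only guarantees that whatever happens to it, the loss stays at the size of that part.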

Away from the Golden Middle

[Figure labels: Risk Aversion, Risk Loving]

“The downside/loss should be known and protected, not the probability.”

N. Taleb, “Antifragile”

“You never know where your next outage or cyber attack will come from, or when. The maximal downtime and maximal data loss should be known and guaranteed regardless of probabilities.”

It’s me
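One way to read this is that the bounds should follow from the design itself rather than from an estimate of how likely a failure is. The sketch below derives worst-case figures from three design parameters and checks them against a promised guarantee. The class, its fields and the numbers are hypothetical, and the model assumes snapshots are the only copy of the data and that a restore always succeeds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryDesign:
    snapshot_interval_min: float   # time between consecutive data snapshots
    detection_timeout_min: float   # longest time a failure can go unnoticed
    restore_time_min: float        # time to restore service from the last snapshot

    def worst_case_data_loss_min(self) -> float:
        # Worst case: the failure hits just before the next snapshot is taken.
        return self.snapshot_interval_min

    def worst_case_downtime_min(self) -> float:
        # Worst case: the full detection timeout elapses, then a full restore.
        return self.detection_timeout_min + self.restore_time_min


def meets_guarantee(design: RecoveryDesign,
                    max_data_loss_min: float,
                    max_downtime_min: float) -> bool:
    """True if both worst-case bounds fit inside the promised maximums."""
    return (design.worst_case_data_loss_min() <= max_data_loss_min
            and design.worst_case_downtime_min() <= max_downtime_min)


design = RecoveryDesign(snapshot_interval_min=15,
                        detection_timeout_min=5,
                        restore_time_min=20)
print(meets_guarantee(design, max_data_loss_min=15, max_downtime_min=30))
print(meets_guarantee(design, max_data_loss_min=5, max_downtime_min=30))
```

No failure probability appears anywhere in the check: the answer depends only on what the design can guarantee in the worst case.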

Asymmetric Pay-Offs

[Figure: pay-off curve f(x) plotted against x; axis labeled Gain/Loss, regions labeled Pain and Gain]
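The asymmetry can be illustrated numerically. The curve from the slide is not recoverable here, so the sketch assumes the simplest convex shape, a pay-off whose pain is capped while the gain is not, and compares it with its mirror image. The function names, the cap of 1 and the shock sizes are illustrative assumptions.

```python
from statistics import mean


def antifragile_payoff(x: float) -> float:
    """Convex pay-off: pain is capped at -1, gain is unlimited."""
    return max(x, -1.0)


def fragile_payoff(x: float) -> float:
    """Concave mirror image: gain is capped at +1, pain is unlimited."""
    return min(x, 1.0)


def average_payoff(payoff, shocks):
    """Mean pay-off over equally likely shocks."""
    return mean([payoff(x) for x in shocks])


calm = [-0.5, 0.5]      # small shocks, averaging zero
volatile = [-5.0, 5.0]  # large shocks, still averaging zero

print(average_payoff(antifragile_payoff, calm), average_payoff(antifragile_payoff, volatile))
print(average_payoff(fragile_payoff, calm), average_payoff(fragile_payoff, volatile))
```

The shocks average zero in both scenarios, yet the convex pay-off rises from 0.0 to 2.0 as volatility grows while the concave one falls from 0.0 to -2.0. This is Jensen's inequality at work: for a convex f, the average of f(x) is at least f of the average.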

“Never be a sucker. Period!”

N. Taleb

Thanks