Kim itSMF New England: ITIL at Ludicrous Speeds - Rugged DevOps 6a

Post on 13-May-2015

697 views 1 download

Tags:

Transcript of Kim itSMF New England: ITIL at Ludicrous Speeds - Rugged DevOps 6a

@RealGeneKim, genek@realgenekim.me

Session ID:

Gene Kim

Author, Visible Ops Handbook

New England LIG 5th Annual itSMF Conference

June 7, 2012

ITIL At Ludicrous Speeds: Rugged DevOps and More……

@RealGeneKim, genek@realgenekim.me

Where Did The High Performers Come From?

@RealGeneKim, genek@realgenekim.me

Visible Ops: Playbook of High Performers

The IT Process Institute has been studying high-performing organizations since 1999 What is common to all the high

performers? What is different between them

and average and low performers?

How did they become great? Answers have been codified in

the Visible Ops Methodology

www.ITPI.org

@RealGeneKim, genek@realgenekim.me

DevOps:Engage Ludicrous Speed!

@RealGeneKim, genek@realgenekim.meSource: John Allspaw

@RealGeneKim, genek@realgenekim.meSource: John Allspaw

@RealGeneKim, genek@realgenekim.me

@RealGeneKim, genek@realgenekim.meSource: John Allspaw

@RealGeneKim, genek@realgenekim.meSource: John Allspaw

@RealGeneKim, genek@realgenekim.meSource: Theo Schlossnagle

@RealGeneKim, genek@realgenekim.meSource: Theo Schlossnagle

@RealGeneKim, genek@realgenekim.meSource: Theo Schlossnagle

@RealGeneKim, genek@realgenekim.meSource: John Jenkins, Amazon.com

@RealGeneKim, genek@realgenekim.me

Ludicrous Speed!

16

@RealGeneKim, genek@realgenekim.me

Ludicrous Fail?!

17

@RealGeneKim, genek@realgenekim.me

@RealGeneKim, genek@realgenekim.meSource: James Wickett

@RealGeneKim, genek@realgenekim.me

Why DevOps Is So Important To Me

@RealGeneKim, genek@realgenekim.me

Since 1999, We’ve Benchmarked 1500+ IT Organizations

Source: IT Process Institute (2008)

Source: EMA (2009)

@RealGeneKim, genek@realgenekim.me

High Performing IT Organizations

High performers maintain a posture of compliance Fewest number of repeat audit findings One-third amount of audit preparation effort

High performers find and fix security breaches faster 5 times more likely to detect breaches by automated control 5 times less likely to have breaches result in a loss event

When high performers implement changes… 14 times more changes One-half the change failure rate One-quarter the first fix failure rate 10x faster MTTR for Sev 1 outages

When high performers manage IT resources… One-third the amount of unplanned work 8 times more projects and IT services 6 times more applications

Source: IT Process Institute, 2008

@RealGeneKim, genek@realgenekim.me

2007: Three Controls Predict 60% Of Performance

To what extent does an organization define, monitor and enforce the following? Standardized configuration strategy Process discipline Controlled access to production systems

Source: IT Process Institute, 2008

@RealGeneKim, genek@realgenekim.me

Tough Love From Ari Balogh

@RealGeneKim, genek@realgenekim.me

The Downward SpiralOperations Sees… Too many fragile and insecure

applications in production Too much time required to restore

service Too much firefighting and unplanned

work Planned project work cannot

complete Frustrated customers leave Market share goes down Business misses Wall Street

commitments Business makes even larger

promises to Wall Street

Dev Sees… More urgent, date-driven

projects put into the queue Even more fragile code (less

secure) put into production More releases have

increasingly “turbulent installs” Release cycles lengthen to

amortize “cost of deployments” Bigger deployment failures More time spent on firefighting Ever increasing backlog of work

that cold help the business win Ever increasing amount of

tension between IT Ops, Development, Design…

These aren’t ITSM or IT Operations problems…These are business problems!

@RealGeneKim, genek@realgenekim.me26

My Mission: Figure Out How Break The IT Core Chronic Conflict

Every IT organization is pressured to simultaneously: Respond more quickly to urgent business needs Provide stable, secure and predictable IT service

Source: The authors acknowledge Dr. Eliyahu Goldratt, creator of the Theory of Constraints and author of The Goal, has written extensively on the theory and practice of identifying and resolving core, chronic conflicts.

Words often used to describe process improvement:“hysterical, irrelevant, bureaucratic, bottleneck, difficult to understand, not

aligned with the business, immature, shrill, perpetually focused on irrelevant technical minutiae…”

@RealGeneKim, genek@realgenekim.me

Good News: It Can Be Done

Bad News: You Can’t Do It Alone

@RealGeneKim, genek@realgenekim.me

Ops

@RealGeneKim, genek@realgenekim.me

QA And Test

Source: Flickr: vandyll

@RealGeneKim, genek@realgenekim.me

Development

@RealGeneKim, genek@realgenekim.me

Process And Controls

@RealGeneKim, genek@realgenekim.meSource: Flickr: birdsandanchors

Product Management And Design

@RealGeneKim, genek@realgenekim.me

DevOps: It’s A Real Movement

I would never do another startup that didn’t employ DevOps like principles

It’s not just startups – it’s happening in the enterprise and in public sector, too

I believe working in DevOps environments will be a necessary skillset 5 years from now

Agile helped Dev regain trust with the business; DevOps will help all of IT

IT becoming more automated relies on DevOps practices (especially PaaS)

@RealGeneKim, genek@realgenekim.me

If I Could Wave A Magic Wand, Everyone Will…

Become conversant with DevOps and recognize the practices when you see them

Be energized about how ITSM practitioners can contribute in this organizational journey

Leave with some concrete steps to get some great outcomes

Become a part of a team that starts putting DevOps practices into place

34

@RealGeneKim, genek@realgenekim.me

How Do You Do DevOps?

35

@RealGeneKim, genek@realgenekim.me

The Prescriptive DevOps Cookbook

“DevOps Cookbook” Authors

Patrick DeBois, Mike Orzen, John Willis

Goals

Codify how to start and finish DevOps transformations

How does Development, IT Operations and Infosec become dependable partners

Describe in detail how to replicate the transformations describe in “When IT Fails: The Novel”

@RealGeneKim, genek@realgenekim.me

“The Goal” by Dr. Eliyahu Goldratt

@RealGeneKim, genek@realgenekim.me38

@RealGeneKim, genek@realgenekim.me39

@RealGeneKim, genek@realgenekim.me

The First Way:Systems Thinking

@RealGeneKim, genek@realgenekim.me

The First Way:Systems Thinking

(Business) (Customer)

@RealGeneKim, genek@realgenekim.me

The First Way:Systems Thinking (Left To Right)

Don’t pass defects downstream Don’t optimize locally Always increase flow: elevate bottlenecks,

reduce WIP, throttle release of work, reduce batch sizes

Understanding where reliance is placed

@RealGeneKim, genek@realgenekim.me

Phase 1: Extend the Agile CI/CR Processes

Create one-step Dev, Test and Production environment creation procedure in Sprint 0

Create the one-step automated code deployment procedure

Properly integrate release, configuration and change into the value stream (as well as QA and infosec)

Ensure developers don’t leave until production change is successful

Assign Ops person into Dev team

@RealGeneKim, genek@realgenekim.me

Definition: Kanban Board

Signaling tool to reduce WIP and increase flow

44

@RealGeneKim, genek@realgenekim.me

The First Way:Systems Thinking: ITSM Insurgency

Have someone attend the daily Agile standups Gain awareness of what the team is working on

Find the automated infrastructure project team (e.g., puppet, chef) Release managers can provide hardening guidance Integrate and extend their production configuration monitoring

Find where code packaging is performed Integrate security testing pre- and post-deployment

Integrate testing into continuous integration and release process Add security test scripts to automated test library

Define what changes/deploys cannot be made without triggering full retest

@RealGeneKim, genek@realgenekim.me

The First Way:Outcomes

Determinism in the release process

Creating single repository for code and environments

Consistent Dev, QA, Int, and Staging environments, all properly built before deployment begins

Decreased cycle time

Reduce deployment times from 6 hours to 45 minutes Refactor deployment process that had 1300+ steps

spanning 4 weeks Faster release cadence

@RealGeneKim, genek@realgenekim.me

The Second Way:Amplify Feedback Loops

@RealGeneKim, genek@realgenekim.me

The Second Way:Amplify Feedback Loops (Right to Left)

Expose visual data so everyone can see how their decisions affect the entire system

Get Development closer to Operations and customers

Create a reliable system system of work that improves itself

@RealGeneKim, genek@realgenekim.me

Phase 2: Extend Release Process And Create Right -> Left Feedback Loops

Embed Dev into Ops escalation process Invite Dev to post-mortems/root cause analysis

meeting Have Dev cross-train IT Operations Ensure application monitoring/metrics to aid in

Ops and Infosec work (e.g., incident/problem management

@RealGeneKim, genek@realgenekim.me

The Second Way:Amplify Feedback Loops: ITSM Insurgency

Find areas in the incident and problem management processes where Development knowledge could help

Ensure that countermeasures are captured in the Agile backlog

Find that developer who really cares about the production environment

@RealGeneKim, genek@realgenekim.me

The Second Way:Outcomes

Defects and security issues getting fixed faster than ever

Reusable Ops and Infosec user stories now part of the Agile process

All groups communicating and coordinating better

Everybody is getting more work done

@RealGeneKim, genek@realgenekim.me

The Third Way:Culture Of Continual Experimentation And Learning

@RealGeneKim, genek@realgenekim.me

The Third Way:Culture Of Continual Experimentation And Learning

Foster a culture that rewards: Experimentation (taking risks) and learning from

failure Repetition is the prerequisite to mastery

Why? You need a culture that keeps pushing into the danger

zone And have the habits that enable you to survive in the

danger zone

@RealGeneKim, genek@realgenekim.me

You Don’t Choose Chaos Monkey…Chaos Monkey Chooses You

@RealGeneKim, genek@realgenekim.me

Phase 3: Organize Dev and Ops To Achieve Organizational Goals

Allocate 20% of Dev cycles to non-functional requirements

Integrate fault injection and resilience into design, development and production (e.g., Chaos Monkey)

@RealGeneKim, genek@realgenekim.me

The Third Way:Culture Of Continual Experimentation And Learning: ITSM

Ensure that process improvement projects are in the Agile backlog Make technical debt visible Help prioritize work against features and other non-functional

requirements

Release your Chaos Monkey Rehearse cleaning up after the Chaos Monkey Find processes that waste everyone’s time

@RealGeneKim, genek@realgenekim.me

@RealGeneKim, genek@realgenekim.me

The Third Way:Outcomes

Technical debt is being paid off

Exploitable attack surface area decreases

Continual reduction of unplanned work

More cycles for planned work

More resilient code and environments

Balancing nimbleness and practiced repetition

Enabling wider range of risk/reward balance

@RealGeneKim, genek@realgenekim.me

What Does Transformation Feel Like?

61

@RealGeneKim, genek@realgenekim.me

Find What’s Most Important First

@RealGeneKim, genek@realgenekim.me

Quickly Find What Is Different…

@RealGeneKim, genek@realgenekim.me

Before Something Bad Happens…

@RealGeneKim, genek@realgenekim.me

Find Risk Early…

@RealGeneKim, genek@realgenekim.me

Communicate It Effectively To Peers…

@RealGeneKim, genek@realgenekim.me

Hold People Accountable…

@RealGeneKim, genek@realgenekim.me

Based On Objective Evidence…

@RealGeneKim, genek@realgenekim.me

Answer Important Questions…

@RealGeneKim, genek@realgenekim.me

Recognize Compounding Technical Debt…

@RealGeneKim, genek@realgenekim.me

That Gets Worse…

@RealGeneKim, genek@realgenekim.me

And Fixing It…

Source: Pingdom

@RealGeneKim, genek@realgenekim.me

Have What We Need, When When We Need It…

@RealGeneKim, genek@realgenekim.me

Big Things Get Done Quickly…

@RealGeneKim, genek@realgenekim.me

Ever Increasing Situational Mastery…

@RealGeneKim, genek@realgenekim.me

Help The Business Win…

@RealGeneKim, genek@realgenekim.me

With Support From Your Peers…

@RealGeneKim, genek@realgenekim.me

And Do More With Less Effort…

@RealGeneKim, genek@realgenekim.me

This Is An Important ProblemOperations Sees… Fragile applications are prone to failure Long time required to figure out “which

bit got flipped” Detective control is a salesperson Too much time required to restore

service Too much firefighting and unplanned

work Urgent security rework and

remediation Planned project work cannot complete Frustrated customers leave Market share goes down Business misses Wall Street

commitments Business makes even larger promises

to Wall Street

Dev Sees… More urgent, date-driven projects

put into the queue Even more fragile code (less

secure) put into production More releases have increasingly

“turbulent installs” Release cycles lengthen to

amortize “cost of deployments” Failing bigger deployments more

difficult to diagnose Most senior and constrained IT ops

resources have less time to fix underlying process problems

Ever increasing backlog of work that cold help the business win

Ever increasing amount of tension between IT Ops, Development, Design…

@RealGeneKim, genek@realgenekim.me80

@RealGeneKim, genek@realgenekim.me

@RealGeneKim, genek@realgenekim.me

If I Could Wave A Magic Wand, Everyone Will…

Become conversant with DevOps and recognize the practices when you see them

Be energized about how ITSM practitioners can contribute in this organizational journey

Leave with some concrete steps to get some great outcomes

Become a part of a team that starts putting DevOps practices into place

82

…And fill out the survey forms!

@RealGeneKim, genek@realgenekim.me

When IT Fails: The Novel and The DevOps Cookbook

Coming in July 2012

“In the tradition of the best MBA case studies, this book should be mandatory reading for business and IT graduates alike.”Paul Muller, VP Software Marketing, Hewlett-Packard

“The greatest IT management book of our generation.”Branden Williams, CTO Marketing, RSA

Gene Kim, Tripwire founder, Visible Ops co-author

@RealGeneKim, genek@realgenekim.me

When IT Fails: The Novel and The DevOps Cookbook

Our mission is to positively affect the lives of 1 million IT workers by 2017

If you would like the “Top 10 Things Infosec Needs To Know About DevOps,” sample chapters and updates on the book:

Sign up at http://itrevolution.com Email genek@realgenekim.me Hand me a business card

Gene Kim, Tripwire founder, Visible Ops co-author

@RealGeneKim, genek@realgenekim.me

If you’d like the slides from today’s presentation…

Text your name, email, website and the number 61761 to +1 (858) 598-3980

Visit: http://www.instantcustomer.com/go/61761 Or scan this QR Code:

85