Planning Application Resilience

Jennifer Davis 2/26/15

Goal: Communication

Jennifer Davis Solutions Engineer Twitter: @sigje Hashtag: #getchef Email: [email protected]

What is Resilience?

Resilience

•  Elasticity – Spring back into shape •  Recoverability – Quick to recover/rebuild

Enduring Resilience

•  State of Elasticity •  State of Recoverability

Resilience Minimum Goal!

Fear

Depression

Fear

Event

Degradation

Failure

Event

Resilience

Recognition

Event Recovery

Post Traumatic Growth

Recognition

Event

Growth

Applications reflect Quality of Organizations

Conway’s law

Organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.

Applications reflect Quality of Organizations

The resilience of an organization will be reflected in the resilience of the applications and services built by the organization.

5 Critical Metrics of Resilient Organizations

• Willingness to tackle challenges. • Sense of Agency. • Adaptability. • Diversity. • Rope Factor

Red Shirt Syndrome

• Tackling challenges. •  Learned Helplessness.

Stormtrooper Syndrome

• Agency. • Adaptability: Role Adherence.

Borg Syndrome

• Embrace Diversity. • Recognize differences in perspectives. • Eliminate system blindness.

Rope Factor

Enough rope to get things done, not enough for cowboys.

Rope too short: excessive time in meetings, death march for each sprint.

Rope too long: cowboy behavior.

Resilience is ordinary.

•  Intentional behaviors, thoughts, and actions. •  Reflection of the organization.

Don’t build organizational systems that encourage the wrong behaviors.

Resilience isn’t managed through limiting change.

•  Security Patches? •  Over Engineering – Delays in schedule •  Under Engineering – Rewrite required to scale

Stability is a myth.

Qualities of Resilient Software

•  Elasticity •  Recoverability

Automation Resilience

Resilient Automation Platform

•  Complex dependency handling between nodes. •  Fault tolerance. •  Security. •  Multi-Platform. •  Flexibility.

Chef is a Resilient Automation Platform.

Chef is a language.

• Describe infrastructure as code. •  Programmatically provision and configure servers. •  Versioning, artifacts

Chef is a toolset

•  Collection of tools that allow you to model, measure, and improve workflows.

chef is a command line utility

•  Generate skeleton for application, cookbook, recipes, attributes, files, templates, and custom resources.

•  Prepare the environment with the correct Ruby gems. •  Verify that the environment is installed and configured correctly.

Chef is a community.

•  Mailing lists •  https://supermarket.chef.io/ •  Chef Conf 3/31 – 4/2 Santa Clara •  Chef Summit •  IRC #chef •  Twitter @chef

CODE: BUILDITBETTER

Configuration as Code

Elastic Configurable

Responsive

Elasticity Recoverability

Infrastructure Automation is creating control systems that reduce the burden on people to manage services and increase the quality, accuracy and precision of a service to the consumers of the service.

Infrastructure Elements to Resources

•  File: file •  Package: package •  Cron Job: cron •  User: user

Resources

•  Fundamental building blocks •  Each describes a piece of the system and its desired state •  The Chef DSL is Ruby.

Example of describing a resource

sudo chef-apply -e "package 'nano'"

Recipe: (chef-apply cookbook)::(chef-apply recipe)
  * package[nano] action install
    - install version 2.0.9-7.el6 of package nano

Test and Repair

Resources follow a test and repair model.

•  package "nano"

Is nano installed? Yes: done. No: install it.
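The test and repair flow above can be sketched in plain Ruby. This is a minimal model, not Chef's implementation: `PackageResource`, `converge`, and the in-memory `INSTALLED` set are hypothetical names standing in for a real resource and a real package database.

```ruby
require 'set'

# Hypothetical stand-in for the node's package database.
INSTALLED = Set.new

# Hypothetical resource: test the current state, repair only when needed.
class PackageResource
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Test: is the desired state already met?
  def installed?
    INSTALLED.include?(name)
  end

  # Repair: install only if the test fails, and report what happened.
  def converge
    return :up_to_date if installed?

    INSTALLED.add(name)
    :installed
  end
end

nano = PackageResource.new('nano')
first_run = nano.converge
second_run = nano.converge
puts "first run: #{first_run}, second run: #{second_run}"
```

The second run changes nothing, which is what makes it safe to apply the same description repeatedly.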

Recipe

•  A recipe is an ordered list of resources.

Recipe

package "httpd"

template "/var/www/html/index.html" do
  source "index.html.erb"
end

service "httpd" do
  action [:enable, :start]
end

Cookbook

•  A collection of recipes (and other elements like files and templates). •  Map 1-1 to a piece of software or functionality. •  Distribution unit •  Versioned •  Modular and re-usable.

Chef Provisioning – Part of Chef DK

https://flic.kr/p/knDPjc

•  Describe multiple tier applications. •  Deploy many copies of your application cluster. •  Spread cluster across different clouds/machines. •  Orchestrate deployment. •  Parallelize machine deployment.

Configuration as Code

Elastic Configurable

Responsive

Elasticity Recoverability

Chef Provisioning

machine 'web1' do
  recipe 'webserver'
end

Multi-platform

•  AWS •  Azure •  Fog •  Vagrant •  Docker •  LXC •  .. more

.. We’ll use AWS in this example https://github.com/chef/chef-provisioning-aws

http://aws.amazon.com/start-ups/loft/

AWS

•  SQS Queues •  SNS Topics •  Elastic Load Balancers •  VPCs •  Security Groups •  Instances •  Images •  Autoscaling Groups •  SSH Key pairs •  Launch configs

AWS Config: ~/.aws/config

[default]
region = us-west-2
aws_access_key_id =
aws_secret_access_key =

Cookbook Setup

$ chef generate cookbook webserver

Provision Recipe

$ cd webserver
$ chef generate recipe provision

Edit Provision Recipe

$ vi recipes/provision.rb

Edit Provision Recipe

require "chef/provisioning/aws_driver"
with_driver "aws"

machine 'web1' do
  recipe 'webserver'
  converge true
end

..but I need multiple webservers

require "chef/provisioning/aws_driver"
with_driver "aws"

num_webservers = 3

(0...num_webservers).each do |i|
  machine "web_0#{i}" do
    recipe 'apache'
  end
end
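A three-dot Ruby range excludes its upper bound, so the loop above creates exactly `num_webservers` machines, numbered from zero. A minimal sketch of just the naming, with no `machine` resource involved (the `names` variable is illustrative only):

```ruby
num_webservers = 3

# Same naming scheme as the recipe; 0...3 yields 0, 1, 2.
names = (0...num_webservers).map { |i| "web_0#{i}" }
puts names.inspect
```

This prints `["web_00", "web_01", "web_02"]`.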

…add security

aws_security_group "#{name}-http" do
  inbound_rules [{:ports => 80, :protocol => :tcp, :sources => ['0.0.0.0/0']}]
end

…add security

with_machine_options({
  :bootstrap_options => {
    :security_groups => ["#{name}-http"]
  }
})

..add load balancing

load_balancer "#{name}-webserver-lb" do
  load_balancer_options({
    :availability_zones => ["us-west-2a", "us-west-2b", "us-west-2c"],
    :listeners => [{:port => 80, :protocol => :http, :instance_port => 80, :instance_protocol => :http}],
    :security_group_name => "#{name}-http"
  })
  machines elb_instances
end

Bulkhead Pattern

•  Compartmentalization to limit failure. •  Repeatable Clusters •  … across platforms.
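The compartmentalization idea can be modeled in a few lines of plain Ruby. This is a sketch under stated assumptions, not Chef Provisioning code: `Cluster`, `build_cluster`, and the cluster and platform names are all hypothetical.

```ruby
# Hypothetical model: each cluster is an isolated compartment of web servers.
Cluster = Struct.new(:name, :platform, :machines) do
  # A cluster can serve traffic while at least one machine is up.
  def serving?
    machines.values.any?
  end

  # Simulate a failure confined to this cluster.
  def fail_all!
    machines.transform_values! { false }
  end
end

# Build the same cluster shape on any platform (repeatable clusters).
def build_cluster(name, platform, size)
  machines = (0...size).map { |i| ["#{name}_web_0#{i}", true] }.to_h
  Cluster.new(name, platform, machines)
end

clusters = [
  build_cluster('east', 'aws', 3),
  build_cluster('west', 'vagrant', 3)
]

# Lose one whole compartment; the other keeps serving.
clusters.first.fail_all!
still_serving = clusters.select(&:serving?).map(&:name)
puts still_serving.inspect
```

Because no cluster shares machines with another, the failure in `east` cannot spread to `west`.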

Configuration as Code

Elastic Configurable Responsive

Elasticity Recoverability

Responsive

•  Chef-Client •  Chef Handlers •  Jenkins with Test Kitchen •  Collaborate with Source Control •  Share your stories

Responsive – chef-client

•  Agent that runs on the node and applies policy

Responsive – Jenkins with Test Kitchen

•  Write tests to minimize risk •  Push changes regularly

Responsive – Collaborate with Source Control

•  Don’t let role adherence get in the way of collaboration. •  Pull requests

Responsive – Chef Handlers

•  Start •  Exception •  Report
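The three handler types correspond to points in a run: start handlers fire when the run begins, report handlers when it succeeds, and exception handlers when it fails. A minimal plain-Ruby sketch of that lifecycle follows. The `Run` class and its `on`, `call`, and `fire` methods are hypothetical and are not the `Chef::Handler` API.

```ruby
# Hypothetical runner modeled on the three handler types.
class Run
  def initialize
    @handlers = { start: [], report: [], exception: [] }
  end

  # Register a block for one of :start, :report, :exception.
  def on(type, &block)
    @handlers.fetch(type) << block
  end

  # Execute the given work and fire the matching handlers.
  def call
    fire(:start)
    yield
    fire(:report)        # reached only when the work succeeds
    :success
  rescue StandardError => e
    fire(:exception, e)  # runs only when the work fails
    :failed
  end

  private

  def fire(type, *args)
    @handlers[type].each { |handler| handler.call(*args) }
  end
end

events = []
run = Run.new
run.on(:start)     { events << :start }
run.on(:report)    { events << :report }
run.on(:exception) { |e| events << "exception: #{e.message}" }

ok = run.call { :converged }
bad = run.call { raise 'package install failed' }
puts events.inspect
```

The first call records a start and a report; the second records a start and the exception, so a failed run is noticed rather than silently lost.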

Share your stories

•  Blameless Postmortems are really useful. •  Knowledge sharing across teams. •  Share across companies – DevOpsDays

Don’t build tools that create systems that encourage the wrong behaviors.

Jumpstart Learning

•  The LearnChef Site (http://learnchef.com): Guided Tutorials, Chef Fundamentals intro
•  How-To’s, Conference Talks, Webinars, more: http://youtube.com/user/getchef
•  Attend a Chef Fundamentals Class (HELLO-CHEF code)

Further Resources

•  http://chef.io •  http://docs.chef.io •  http://supermarket.chef.io •  http://lists.opscode.com •  irc.freenode.net #chef •  Twitter @chef #getchef, @learnchef #learnchef

Thank you!

Jennifer Davis Twitter: @sigje Hashtag: #getchef Email: [email protected]