Shift Happens - Rapidly Rolling Forward During Production Failure
ibm-urbancode-products -
Shift Happens
Continually moving forward when the outcome looks bleak
@Al_Wagner
September 16, 2015
Avoiding Deployment Failures
Avoiding deployment failures, especially those that could cause a production outage, is top of mind for many IT professionals. However, failures will sometimes occur in production, which means that planning for recovery is essential. Preventative measures like canary, blue/green or rolling deployments can help, but the ability to roll forward instead of rolling back, also known as shifting right, means you can push through a failure while learning from deployment process mistakes and shortening mean time to recovery (MTTR).
• Deployment models like canary, blue/green and rolling that can help prevent major production outages
• How to pinpoint deployment failures in your process and correct them
• Pulling together a basic failure response plan
• How you can roll forward while improving your deployment process
DevOps is all about executing with speed!
Line-of-business
Customer
Getting ideas into production fast – Getting people to use it – Analyzing their feedback
Continuous Delivery
Continuous Feedback
Continuous Innovation
Managing the Iron Triangle by…
• Reducing Scope
  ✓ Small batches of incremental changes
• Empowering Resources
  ✓ Co-located, autonomous teams, collaboration
• Accelerating Schedules
  ✓ Automate, automate, automate
  ✓ Continuous release & deployment
• Increasing Quality
  ✓ Everyone contributes
(Diagram: the Iron Triangle of Scope, Schedule, Resources, and Quality)
Traditional software deployments
(Diagram: users reach two environments, Environment #1 and Environment #2, through a load balancer. Each environment has a web server, an app server, and a database server.)
1. Servers taken off-line
2. New release is deployed & tested
3. Servers brought back on-line with the new version of the application
The clock is ticking!
And when disaster strikes! You need to know…
• What failed?
• Where did it fail?
• Why did it fail?
• What apps were impacted?
• Should I move traffic to another server?
• Do we go forward or roll back?
If you fail to plan, you plan to fail!
During the post mortem, you need to uncover…
• Did anything trigger the deployment failure?
• What was the root cause of the failure?
• What could we have done differently to avoid this situation?
• How can we improve so it doesn’t happen again?
Accelerate delivery of incremental software change
• Failures due to inconsistent dev and production environments
• Bottlenecks trying to deliver more frequent releases to meet market demands
• Complex, manual processes for release lack repeatability and speed
• Poor visibility into dependencies across releases, resources, and teams
The Four Pillars of Gold-Standard Deployment
• Use the same process
  ✓ Reduces deployment errors
• Automate, automate, automate
  ✓ Delivers repeatability, reliability, and traceability
• Deliver incremental changes
  ✓ Reduces risk to business
• Release what you test
  ✓ Increases confidence
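One way to picture the "release what you test" pillar is to compare fingerprints of the tested build and the release candidate. The following is a minimal Python sketch under that assumption; the `digest` and `safe_to_release` helpers are hypothetical names for illustration and are not part of any IBM UrbanCode API.

```python
import hashlib


def digest(artifact: bytes) -> str:
    """Return the SHA-256 fingerprint of a build artifact."""
    return hashlib.sha256(artifact).hexdigest()


def safe_to_release(tested_digest: str, candidate: bytes) -> bool:
    """Allow a release only if the candidate is byte-for-byte the tested build."""
    return digest(candidate) == tested_digest


if __name__ == "__main__":
    # Hypothetical artifacts: the build that passed testing and a later rebuild.
    tested = digest(b"app-build-42")
    print(safe_to_release(tested, b"app-build-42"))          # same bytes as tested
    print(safe_to_release(tested, b"app-build-42-rebuilt"))  # different bytes
```

Recording the digest when tests pass and checking it again at deployment time means a rebuilt or modified artifact cannot reach production unnoticed.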
Automate provisioning and deployments
(Diagram labels)
• Pipeline: SCM, Build Automation (pull changes, publish build), IBM UrbanCode Deploy
• Provision environment with open patterns: IBM Cloud Orchestrator, IBM PureApplication System, IBM Cloud Manager with OpenStack, IBM Bluemix, VMware vCenter
• Automate deployment to hybrid environments:
  – Public: shared off-premises cloud
  – Dedicated: off-premises cloud
  – Local: dedicated on-premises cloud
  – Traditional IT
✓ Traceable
✓ Repeatable
✓ Reliable
IBM Cloud UrbanCode Deploy as a Service (NEW!)
Features of the new SaaS offering
• Full automated application delivery capabilities
• Hosted on IBM infrastructure, managed by IBM
• Monthly subscription, license managed by IBM
• Full product support
(Diagram: apps are developed, built, and deployed with IBM Cloud UrbanCode Deploy to mobile device, mainframe, and traditional targets, and to SoftLayer, AWS, and Azure.)
© 2016 IBM Corporation
IBM UrbanCode Release for release management
✓ No more release weekend parties: Coordinate stakeholders, orchestrate deployment activities, enforce the qualification process with relevant workflows and quality gates, and get the necessary approvals before going to production.
✓ Reduced downtime: Eliminate wasted time and orchestrate large, complex releases involving several hundred applications and hundreds of stakeholders.
✓ Reduced time to market with continuous delivery releases: Accelerate release frequency with distributed release management for small-scope, frequent releases delivered by application teams.
Make releases predictable and boring!
IBM UrbanCode Release & Deploy iOS mobile app
✓ Monitor progress: Understand the overall progress of your releases and the remaining work. Get real-time calculations of the projected completion time.
✓ Alert for critical issues: See critical data on late and idling tasks so you can counter problems and mitigate business risks.
✓ Understand team status: Learn from teams what is blocking them so you can take the right corrective actions.
https://itunes.apple.com/ca/app/ibm-urbancode-release-deploy/id1084753666?mt=8
Shift right and continuously move forward
Accelerate releases by making a conscious decision to carry an acceptable level of …
…into PRODUCTION!
Dark Launches & Toggles
• Feature toggle: restricts access to code that is still in development until it is ready for release to end users
    if (work_in_progress) {
        // develop new functionality here
    } else {
        // already deployed as production code
    }
• Business toggle: controls which users or groups of users have access to new functionality
    if (beta_usergroup) {
        // provide access to new experiment
    } else {
        // route user to existing production code
    }
✓ Pros: New experiments can quickly be made available to groups of trusted users
✗ Cons: Increase in technical debt, as "toggle" code needs to be managed
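The two toggle types on this slide can be sketched in runnable form. This is a minimal Python illustration that assumes an in-memory toggle store. The `Toggles` class, the `new_search` feature name, and the `search_page` and `checkout` functions are all hypothetical and stand in for whatever configuration system a team actually uses.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Toggles:
    """Hypothetical in-memory toggle store; real systems load this from config."""
    enabled_features: Set[str] = field(default_factory=set)
    beta_users: Set[str] = field(default_factory=set)


def search_page(toggles: Toggles) -> str:
    # Feature toggle: work in progress stays dark until the flag is switched on.
    if "new_search" in toggles.enabled_features:
        return "new search"
    return "existing search"


def checkout(user: str, toggles: Toggles) -> str:
    # Business toggle: only a chosen group of users sees the experiment.
    if user in toggles.beta_users:
        return "new checkout"
    return "existing checkout"


if __name__ == "__main__":
    toggles = Toggles(beta_users={"alice"})
    print(search_page(toggles))        # flag is off, so the existing code path runs
    toggles.enabled_features.add("new_search")
    print(search_page(toggles))        # flag is on, so the new code path runs
    print(checkout("alice", toggles))  # alice is in the beta group
    print(checkout("bob", toggles))    # bob is routed to production code
```

Every `if` added this way is toggle code that has to be removed once the feature is fully released, which is the technical debt listed under Cons.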
Zero downtime deployment strategies
• Canary Release: a technique to reduce the risk of introducing a new software version in production by slowly rolling out the change to a small subset of users before rolling it out to the entire infrastructure and making it available to everybody.
• Blue/Green Deployments: a release technique that reduces downtime and risk by running two identical production environments called Blue and Green. At any time, only one of the environments is live, with the live environment serving all production traffic.
• Rolling Deployments: a software release strategy that staggers deployment across multiple phases, which usually include one or more servers performing one or more functions within a server cluster, to reduce application downtime.
Canary Releases (example flow)
(Diagrams: users reach two environments through a load balancer. Each environment has a web server, an app server, and a database server. Deployment automation with inventory performs the deployments.)
1. Both environments run the old version, and the load balancer sends 50% of users to each.
2. All users are routed to one environment while the new version is deployed to the other.
3. Most users (95%) stay on the old version while some users (5%) are routed to the new version. As confidence in the new release increases, the percentage of users who have access is increased.
4. All users are routed to the new version. Eventually the new version is deployed to the second environment.
5. Both environments run the new version, and the user load is split across the two environments (50% of users each).
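The percentage split in the canary flow can be sketched as a routing rule. The following minimal Python example assumes the load balancer can route by user ID. The `bucket` and `route` functions and the hashing scheme are illustrative assumptions, not a description of any particular load balancer or IBM UrbanCode feature.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def route(user_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of users to the new version."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return "new" if bucket(user_id) < canary_percent else "old"


if __name__ == "__main__":
    users = [f"user-{n}" for n in range(1000)]
    # Raise the percentage step by step as confidence in the release grows.
    for percent in (0, 5, 50, 100):
        on_new = sum(route(u, percent) == "new" for u in users)
        print(f"{percent:3d}% canary: {on_new} of {len(users)} users on the new version")
```

Hashing the user ID keeps each user on the same side of the split between requests, and a user who reaches the new version at 5% stays on it as the percentage rises.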
Blue / Green Deployments (example flow)
(Diagrams: a router or load balancer sends all users to one of two environments, Environment #1 and Environment #2. Each environment has a web server, an app server, and a database server.)
1. Two environments, each with sufficient resources to serve the application in production. One serves all users while the other holds the previous release as a hot stand-by.
2. The new release is deployed to the idle environment by deployment automation with inventory.
3. When the new deployment is working as expected, users are routed to the new version. The previous release remains available as a hot stand-by.
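The blue/green cut-over can be sketched as a router that swaps its live and idle environments. This is a minimal Python illustration. The `Router` class, the environment names, and the `healthy` callback are hypothetical and only model the flow described on these slides.

```python
from typing import Callable, Dict


class Router:
    """Sends all traffic to exactly one of two environments."""

    def __init__(self, live: str, idle: str, version: str) -> None:
        self.live = live
        self.idle = idle
        # Both environments start on the same release.
        self.versions: Dict[str, str] = {live: version, idle: version}

    def deploy_to_idle(self, version: str) -> None:
        # The live environment is untouched while the idle one is updated.
        self.versions[self.idle] = version

    def cut_over(self, healthy: Callable[[str], bool]) -> bool:
        # Switch traffic only if the idle environment passes its check.
        if not healthy(self.idle):
            return False
        self.live, self.idle = self.idle, self.live
        return True


if __name__ == "__main__":
    router = Router(live="env1", idle="env2", version="1.0")
    router.deploy_to_idle("2.0")
    if router.cut_over(lambda env: True):  # stand-in for real smoke tests
        print("live:", router.live, router.versions[router.live])
        print("hot stand-by:", router.idle, router.versions[router.idle])
```

Because the previous release is still running in the idle environment, calling `cut_over` again routes users back to it without a redeployment.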
Rolling Deployments (example flow)
(Diagrams: users reach four server clusters, #1 to #4, through a load balancer. Each cluster has a web server, an app server, and a database server. Deployment automation with inventory performs the deployments.)
Phase 1
1. Cluster #1 taken off-line
2. Application change deployed
3. Deployment tested
Phase 2
1. Cluster #1 brought back on-line
2. Cluster #2 taken off-line
3. Application change deployed
4. Deployment tested
Phase 3
1. Cluster #2 brought back on-line
2. Clusters #3 & #4 taken off-line
3. Application change deployed
4. Deployment tested
Phase 4
All environments are presenting the latest version of the application.
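The phases above can be sketched as a loop over batches of clusters. The following minimal Python example makes several assumptions: the `rolling_deploy` function, its parameters, and the rule of halting at the first failed batch are illustrative choices, not behavior taken from the slides or from IBM UrbanCode Deploy.

```python
from typing import Callable, Dict, List


def rolling_deploy(
    clusters: Dict[str, str],
    batches: List[List[str]],
    version: str,
    test: Callable[[str], bool],
) -> List[str]:
    """Update clusters batch by batch and return the clusters back on-line.

    Stops at the first batch that fails its test. That batch stays off-line
    and later batches keep the old version.
    """
    online = set(clusters)
    updated: List[str] = []
    for batch in batches:
        online -= set(batch)              # take the batch off-line
        if not online:
            raise RuntimeError("no cluster left to serve users")
        for name in batch:
            clusters[name] = version      # deploy the application change
        if not all(test(name) for name in batch):
            break                         # halt the rollout for investigation
        online |= set(batch)              # bring the batch back on-line
        updated.extend(batch)
    return updated


if __name__ == "__main__":
    clusters = {"c1": "1.0", "c2": "1.0", "c3": "1.0", "c4": "1.0"}
    batches = [["c1"], ["c2"], ["c3", "c4"]]  # same grouping as the phases above
    done = rolling_deploy(clusters, batches, "2.0", test=lambda name: True)
    print(done, clusters)
```

Because only one batch is off-line at a time, a failed test affects a fraction of capacity, and the remaining clusters keep serving users on the old version while the team decides whether to fix forward.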
Pros and Cons…
Canary Release
Pros
• No downtime of production environment
• Quick access to a backup environment
• A/B testing of new features and functionality
• Capture performance metrics of new release during early adoption
Cons
• Management and maintenance of multiple versions of the software
• Maintain persistent sessions during deployment
• Database must support two versions of the application (until cut-over is complete)
Blue/Green Deployments
Pros
• No downtime of production environment
• Quick access to a backup environment (hot stand-by)
• Ability to test application in a production environment
Cons
• Requires two similar environments
• Maintain persistent sessions during deployment
• Database must support two versions of the application (until cut-over is complete)
Rolling Deployments
Pros
• No downtime of production environment
• Incrementally validate deployments and reduce risk
• Reduce visibility of performance degradation
• Seamless user experience
Cons
• Maintain persistent sessions during deployment
• Database must support two versions of the application (until deployment is complete)
Your mission if you choose to accept it… Measure your DevOps progress
• Deployment / Change Frequency
  – Measures delivery team responsiveness, cohesiveness, capabilities, efficiency, & tooling effectiveness
• Change Lead Time
  – Measures efficiency of the end-to-end development process, from first code change to deployment
  – Measures cycle time of the individual activities
• Change Failure Rate
  – Number of failed deployments / total number of deployments
• Mean Time To Recover (MTTR)
  – How long does it take to recover from a failure?
  – Understand the contributors to failure: code complexity, number of app changes, number of operating environment changes
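The four measures can be computed from a simple deployment log. This is a minimal Python sketch under stated assumptions: the `Deployment` record and its fields are hypothetical, and recovery time is measured from the moment of deployment, which assumes the failure began when the change went live.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    first_commit: datetime                # first code change in the release
    deployed: datetime                    # when it reached production
    failed: bool = False
    recovered: Optional[datetime] = None  # when service was restored, if it failed


def deployment_frequency(deploys: List[Deployment], period: timedelta) -> float:
    """Deployments per day over the observation period."""
    return len(deploys) / (period / timedelta(days=1))


def mean_lead_time(deploys: List[Deployment]) -> timedelta:
    leads = [d.deployed - d.first_commit for d in deploys]
    return sum(leads, timedelta()) / len(leads)


def change_failure_rate(deploys: List[Deployment]) -> float:
    return sum(d.failed for d in deploys) / len(deploys)


def mean_time_to_recover(deploys: List[Deployment]) -> timedelta:
    outages = [d.recovered - d.deployed for d in deploys if d.failed and d.recovered]
    return sum(outages, timedelta()) / len(outages)


if __name__ == "__main__":
    t0 = datetime(2015, 9, 1, 12, 0)  # made-up sample data for illustration
    log = [
        Deployment(t0 - timedelta(days=2), t0),
        Deployment(t0 - timedelta(days=4), t0 + timedelta(days=1),
                   failed=True, recovered=t0 + timedelta(days=1, minutes=30)),
    ]
    print(deployment_frequency(log, timedelta(days=2)))
    print(mean_lead_time(log))
    print(change_failure_rate(log))
    print(mean_time_to_recover(log))
```

Tracking these numbers release over release shows whether rolling forward is actually shortening recovery time or only moving the failures around.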