StarWest 2013 Performance is not an afterthought – make it a part of your Agile Delivery

Post on 10-May-2015

description

This presentation was given at StarWest 2013 in Anaheim, CA and was also broadcast through the Virtual Conference. It shows how important it is to focus on performance throughout continuous delivery in order to avoid the most common performance problem patterns, which still cause applications to crash and force engineers to spend their nights and weekends in firefighting and war room situations.

Transcript of StarWest 2013 Performance is not an afterthought – make it a part of your Agile Delivery

1

PERFORMANCE IS NOT AN AFTERTHOUGHT

Make it part of your Agile Delivery

2

3

4

5

Testing is Important – and gives Confidence

6

But are we ready for “The Real” world?

7

Measure Performance during the game

Ball Possession: 40 : 60
Fouls: 0 : 0
Score: 0 : 0

Minute 1 - 5

8

Measure Performance during the game

Minute 6 - 35

Ball Possession: 80 : 20
Fouls: 2 : 12
Score: 0 : 0

9

Deep Dive Analysis

10

Options “To Fix” the situation

11

Not always a happy ending

Minute 90

Ball Possession: 80 : 20
Fouls: 4 : 25
Score: 3 : 0

12

FRUSTRATED FANS!!

13

How does that relate to Software?

14

From Deploy to …

Timeline: Deploy -> Promotion/Event -> Problems -> Ops Playbook -> War Room

15

The “War Room” – back then

“Houston, we have a problem”

NASA Mission Control Center, Apollo 13, 1970

16

The “War Room” – NOW

Facebook – December 2012

17

Problem: Unclear End User Problem Descriptions

20

Status Quo: Ops Runbook – High Resource Usage

21

Lack of data?

22

23

Answers to the right questions

24

What are the real questions?

Individual Users? ALL users?

Is it the APP? Or Delivery Chain?

Code problem? Infrastructure?

One transaction? ALL transactions?

In AppServer? In Virtual Machine?

25

Problem: What Devs would like to have

26

Problem: What Devs would like to have

Top Contributor is related to String handling

99% of that time comes from RegEx Pattern Matching

Page Rendering is the main component

27

It's like getting this …

28

… when you need to see this!

29

Problem: Attitudes like this don’t help either

Image taken from https://www.scriptrock.com/blog/devops-whats-hype-about/

Shopzilla CIO (in 2010): “… when they get in the war room - the developers and ops teams describe the problem as the enemy, not each other”

30

Problem: Very “expensive” to work on these issues

~80% of problems are caused by ~20% of the patterns

YES we know this

80% Dev Time in Bug Fixing

$60B Defect Costs

BUT

31

TOP PROBLEM PATTERNS

• Taken From Production Environments

32

Top Problem Patterns: Resource Pools

33

Top Problem Patterns: Resource Pools

34

Deployment Mistakes lead to internal Exceptions

35

Deployment Mistakes lead to high logging overhead

36

Production Deployment leads to Log SYNC Issues

37

Long running SQL with Production Data

38

N+1 Query Problem
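
The N+1 pattern named on this slide can be illustrated with a minimal sketch. The slide itself shows no code, so everything here is an assumption made for illustration: the `orders`/`items` schema, the `CountingDb` wrapper, and the function names are hypothetical. The sketch loads the same data twice, once with one query per parent row and once with a single join, and counts the statements each approach sends.

```python
import sqlite3


class CountingDb:
    """Wraps a sqlite3 connection and counts the queries sent through it."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def make_db(num_orders=5, items_per_order=2):
    """Build a small in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
        CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, name TEXT);
        """
    )
    for order_id in range(1, num_orders + 1):
        conn.execute(
            "INSERT INTO orders VALUES (?, ?)", (order_id, f"customer-{order_id}")
        )
        for n in range(items_per_order):
            conn.execute(
                "INSERT INTO items (order_id, name) VALUES (?, ?)",
                (order_id, f"item-{order_id}-{n}"),
            )
    return CountingDb(conn)


def load_n_plus_one(db):
    """One query for the orders, then one more query per order (N+1)."""
    result = {}
    for order_id, _customer in db.query("SELECT id, customer FROM orders"):
        rows = db.query(
            "SELECT name FROM items WHERE order_id = ? ORDER BY id", (order_id,)
        )
        result[order_id] = [name for (name,) in rows]
    return result


def load_with_join(db):
    """The same data fetched with a single joined query.

    Note: an inner join skips orders without items; every order in this
    sample data has items, so both loaders return the same result.
    """
    result = {}
    rows = db.query(
        "SELECT o.id, i.name FROM orders o "
        "JOIN items i ON i.order_id = o.id ORDER BY o.id, i.id"
    )
    for order_id, name in rows:
        result.setdefault(order_id, []).append(name)
    return result


slow_db, fast_db = make_db(), make_db()
slow_result = load_n_plus_one(slow_db)
fast_result = load_with_join(fast_db)
print(slow_db.queries, fast_db.queries)
```

With five orders the first loader issues six statements and the second issues one. The gap grows linearly with the number of parent rows, which is why the pattern tends to stay hidden with small test data and surface with production data.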

40

Memory Leaks in Cache Layer with Production Data

Still crashes

Problem fixed!

Fixed Version Deployed

42

BLOATED Web Sites

17! JS Files – 1.7MB in Size

Useless information! Might even be a security risk!

43

Missing or incorrectly configured browser caches

62! Resources not cached

49! Resources with short expiration
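
Counts like the two above can be produced automatically from response headers. The following is a rough sketch, not the tooling used in the talk: it looks only at the `Cache-Control` header (ignoring `Expires`, `ETag`, and heuristic caching), and the one-day threshold for "short expiration" as well as the function names are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Assumption: anything cached for less than one day counts as "short".
SHORT_EXPIRY_SECONDS = 24 * 60 * 60


def classify_caching(headers):
    """Classify one response as 'not cached', 'short expiration' or 'cached'.

    Simplification: only the Cache-Control header is inspected.
    """
    cache_control = ""
    for name, value in headers.items():
        if name.lower() == "cache-control":
            cache_control = value.lower()
    if "no-store" in cache_control or "no-cache" in cache_control:
        return "not cached"
    match = re.search(r"max-age=(\d+)", cache_control)
    if match is None or int(match.group(1)) == 0:
        return "not cached"
    if int(match.group(1)) < SHORT_EXPIRY_SECONDS:
        return "short expiration"
    return "cached"


def summarize(responses):
    """Count resources per category; `responses` maps URL -> header dict."""
    return Counter(classify_caching(headers) for headers in responses.values())


# Hypothetical sample data standing in for a recorded page load.
sample = {
    "/app.js": {"Cache-Control": "public, max-age=31536000"},
    "/logo.png": {},
    "/style.css": {"cache-control": "max-age=300"},
    "/api/cart": {"Cache-Control": "no-store"},
}
summary = summarize(sample)
print(dict(summary))
```

Running a check like this against every build makes a missing or shortened cache header visible as a changed number rather than as a slow page in production.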

44

SLOW or Failing 3rd Party Content

45

Want MORE of these and more details?

http://apmblog.compuware.com

46

Lots of Problems that could have been avoided

• BUT WHY are they still making it to Production?

47

Missing Focus on Performance

48

Different Goals for Dev and Ops

49

Disconnected Teams despite “Shared Responsibility”

50

Solution: DevOps + Performance Focus

51

BEST PRACTICES

52

Culture: Become ONE Team

53

Culture: Testability

54

Automate & Measure … Performance

55

Automate & Measure … Scalability

56

Automate Deployment

57

How? Performance Focus in Test Automation

Test Framework Results              Architectural Data
Build #   Test Case      Status     # SQL   # Excep   CPU
17        testPurchase   OK         12      0         120ms
17        testSearch     OK         3       1         68ms
18        testPurchase   FAILED     12      5         60ms
18        testSearch     OK         3       1         68ms
19        testPurchase   OK         75      0         230ms
19        testSearch     OK         3       1         68ms
20        testPurchase   OK         12      0         120ms
20        testSearch     OK         3       1         68ms

We identified a regression

Problem solved

Let's look behind the scenes

Exceptions are probably the reason for the failed tests

Problem fixed, but now we have an architectural regression

Now we have functional and architectural confidence
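
The idea on this slide, checking architectural metrics next to the pass/fail status, can be sketched in a few lines. The numbers are the testPurchase figures for builds 17 to 20 as read from this slide; the 20% tolerance, the choice of build 17 as a fixed baseline, and all names are assumptions for illustration and not the tooling shown in the talk.

```python
# testPurchase figures per build, as read from the slide.
BUILDS = {
    17: {"status": "OK", "sql": 12, "exceptions": 0, "cpu_ms": 120},
    18: {"status": "FAILED", "sql": 12, "exceptions": 5, "cpu_ms": 60},
    19: {"status": "OK", "sql": 75, "exceptions": 0, "cpu_ms": 230},
    20: {"status": "OK", "sql": 12, "exceptions": 0, "cpu_ms": 120},
}

METRICS = ("sql", "exceptions", "cpu_ms")
TOLERANCE = 0.20  # assumption: allow 20% growth before flagging a metric


def architectural_regressions(baseline, current, tolerance=TOLERANCE):
    """Return the metrics that grew beyond the tolerance versus the baseline."""
    return [m for m in METRICS if current[m] > baseline[m] * (1 + tolerance)]


def report(builds, baseline_build):
    """Compare every later build against one fixed baseline build."""
    baseline = builds[baseline_build]
    return {
        number: architectural_regressions(baseline, data)
        for number, data in sorted(builds.items())
        if number > baseline_build
    }


regressions = report(BUILDS, baseline_build=17)
for number, metrics in regressions.items():
    status = BUILDS[number]["status"]
    print(f"Build {number}: status={status}, regressed metrics={metrics}")
```

Build 19 is the interesting case: its functional status is OK, yet the SQL count and CPU time are flagged, which is the kind of architectural regression a status-only test report would miss.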

58

How? Performance Focus in Test Automation

Analyzing All Unit / Performance Tests

Analyze Perf Metrics

Identify Regressions

59

How? Performance Focus in Test Automation

Cross Impact of KPIs

60

How? Performance Focus in Test Automation

Embed your Architectural Results in Jenkins

61

Share Tools

62

Share Results

63

Getting control over your weekend again …

Enjoy a beer with friends?

Instead of pizza and soda with your colleagues?

64

1+1 = 3

65

YOU HAVE TIME FOR THE REAL …

66

DevOps Automation In-Action

• Automate Load Test Analysis and Regression Detection

67

DevOps Automation In-Action

• Automate Load Test Analysis and Regression Detection

68

DevOps: Actionable Data to Ops

• Input for Capacity and Deployment Planning
• Number of Requests on the App Server we will need to handle
• Might need to tune GC Settings to reduce GC Overhead
• CPU is going to be tight with these machines – also impacted by GC Activity
• Input on Thread Pool Configuration
• Memory Usage for expected load still provides enough “headroom”

69

IF WE DO ALL THAT

80% Dev Time for Bug Fixing

$60B Costs by Defects

70

Want MORE of these and more details?

http://apmblog.compuware.com

71

© 2011 Compuware Corporation. All Rights Reserved

Simply Smarter