How Are My VM’s Doing? Managing for Performance

Akorri Copyright © 2008 How Are My VM’s Doing? Managing for Performance Mike Matchett Dir. Product Management [email protected]


Transcript of How Are My VM’s Doing? Managing for Performance

Page 1: How Are My VM’s Doing? Managing for Performance

Akorri Copyright © 2008

How Are My VM’s Doing? Managing for Performance

Mike Matchett, Dir. Product Management [email protected]

Page 2: How Are My VM’s Doing? Managing for Performance

Akorri © 2008 www.akorri.com

Akorri Customers & Awards
Any size enterprise across multiple industries

• Financial / Insurance
• Healthcare, Education, Legal
• Manufacturing, High Tech, Pharma
• Online Services

Page 3: How Are My VM’s Doing? Managing for Performance

Virtualization Decouples Apps & Resources

[Figure: Physical Infrastructure Model vs. Virtual Infrastructure Model. Labels: Network, SAN, Server Pool, SAN Layer, Storage Pool (Tier 1, Tier 2, Archive).]

Page 4: How Are My VM’s Doing? Managing for Performance

Management of IT Virtualization

• Good
– Sharing resource “pools” means less dedicated waste
– Normalized resource units lower administrative costs
– Explicit “entitlements”, with “unused” capacity available at peaks

• Bad
– Hard to see deep physical resource sharing by application
– Hard to tell if the whole pool is shared efficiently
– When contention happens it’s bad for everyone at once

• Ugly
– Whose 100% is really 100%?
– ESX knobs and switches control capacity, not performance

Page 5: How Are My VM’s Doing? Managing for Performance

Akorri
Cross-domain Management for Virtual Infrastructure

• Agent-less Collection Across Databases, Servers, Storage, VMware & Storage Virtualization

• Advanced Analytics & Modeling

• Performance and Utilization

• Troubleshooting & Root Cause

• Optimization and Planning

• Rapidly Delivers ROI:

– Faster Problem Resolution
– Avoid Performance Problems
– Planning and Optimization

V2.0

Page 6: How Are My VM’s Doing? Managing for Performance

Akorri BalancePoint’s Model
Includes Server and Storage Virtualization

[Figure: model layers labeled Server, Server Virtualization, Storage, Storage Virtualization.]

Page 7: How Are My VM’s Doing? Managing for Performance

Troubleshooting Performance Issues is Difficult in Virtualized Data Centers

[Figure: troubleshooting workflow. Step labels: Recognize Problem, Track Down Dependencies, Interrogate Components, Isolate Faults, Find “Root Cause”, Resolve Problem. BalancePoint labels: Recognize Problem, X-Domain Analytics, Resolve Problem, No Problem, Proactive Analysis.]

• Map virtual topology
• Identify faults
• Identify bottlenecks
• Identify contention
• Make recommendations

• IRT™
• Performance Index™
• Utilization Analysis
• Management Reporting

Page 8: How Are My VM’s Doing? Managing for Performance

Example: Topology

• VMware ESX
• Netapp iSCSI
• CPU problem

Page 9: How Are My VM’s Doing? Managing for Performance


Same Example – Storage View

Page 10: How Are My VM’s Doing? Managing for Performance

KPIs and Metrics - Example

[Figure: example charts of Infrastructure Response Time, Usage Index, IOPS, and Capacity.]

Page 11: How Are My VM’s Doing? Managing for Performance


Understand Resource Contention

Application Contention for a RAID group

VMware ESX Server CPU Contention

Page 12: How Are My VM’s Doing? Managing for Performance

Dynamic Thresholds and Prediction

• Thresholds can be dynamically set based on historical behavior
• Predicts performance for the next 48 hours
• Helps to manage seasonality and identify spikes in future activity

Identify Problems Before They Happen
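One common way to set a threshold from historical behavior is mean plus a few standard deviations of past samples. The sketch below assumes that rule; it is not BalancePoint's actual method, and the function names, the factor k, and the sample values are all hypothetical.

```python
from statistics import mean, stdev


def dynamic_threshold(history, k=3.0):
    """Return mean + k sample standard deviations of the historical samples."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    return mean(history) + k * stdev(history)


def find_spikes(history, recent, k=3.0):
    """Return (index, value) pairs in `recent` that exceed the learned threshold."""
    limit = dynamic_threshold(history, k)
    return [(i, v) for i, v in enumerate(recent) if v > limit]


# Hypothetical response-time samples in milliseconds.
history = [10.0, 12.0, 11.0, 9.0, 13.0, 10.0, 12.0, 11.0]
recent = [11.0, 12.5, 25.0, 10.0]
print(find_spikes(history, recent))  # [(2, 25.0)]
```

A fixed threshold would have to be retuned per workload; a threshold derived from each workload's own history adapts automatically, at the cost of needing enough history to be meaningful.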

Page 13: How Are My VM’s Doing? Managing for Performance

IT Service Management

For Effectiveness (Performance Analysis) -
• Load/Throughput - Number of Transactions
• Response Time – Time it takes a Transaction to complete

And for Efficiency (Capacity Management) -
• Utilization – How Busy is the service?
– How much of the available service capacity is being used?
– How many transactions can it handle at good performance levels?

Page 14: How Are My VM’s Doing? Managing for Performance

Response Time is Non-Linear

• Max Capacity happens when system is 100% utilized
• Service Level is set to a performance threshold
• Optimal Capacity happens at less than 100% utilization

[Figure: Response Time (sec/tran) vs. Arrival Rate (trans/sec), with Utilization running from 0% to 100%. Marked: Service Time, Service Level, Optimal Throughput, MAX Throughput.]
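The shape of this curve can be sketched with the textbook single-server (M/M/1) queue, where mean response time is R = S / (1 - U) and U = arrival rate × service time. That model is an assumption made here for illustration; the slide does not say which model produced its curve, and the function names and numbers are hypothetical.

```python
def response_time(arrival_rate, service_time):
    """M/M/1 mean response time R = S / (1 - U), with U = arrival_rate * service_time."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        return float("inf")  # at or past 100% utilization the queue grows without bound
    return service_time / (1.0 - utilization)


def max_throughput(service_time):
    """Arrival rate at which the server reaches 100% utilization."""
    return 1.0 / service_time


def optimal_throughput(service_time, service_level):
    """Largest arrival rate whose mean response time still meets the service level."""
    if service_level <= service_time:
        return 0.0  # the service level cannot be met even with an empty queue
    return (1.0 - service_time / service_level) / service_time


# Hypothetical numbers: 10 ms service time, 50 ms service level.
S, SL = 0.010, 0.050
print(round(max_throughput(S)))          # 100 trans/sec at 100% utilization
print(round(optimal_throughput(S, SL)))  # 80 trans/sec, i.e. about 80% utilization
```

With these numbers the service level is already reached at about 80% utilization, which is the slide's point: optimal capacity sits below 100%.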

Page 15: How Are My VM’s Doing? Managing for Performance

Queuing Theory to The Rescue…

• Queuing Models create Response Time curves
– Based on established mathematics
– Useful analytically (historically) as well as predictively
– A simple queuing model can represent a check-out line at the grocery store

• Complex Queuing Network Models can represent nested IT domains
– Advanced cross-domain solutions model IT virtualization
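The check-out line example can be sketched as a single first-come, first-served cashier. This is a minimal illustration with hypothetical arrival and service times, not a model taken from the product.

```python
def checkout_line(arrival_times, service_times):
    """Return each customer's response time (waiting + service) for one FIFO cashier."""
    responses = []
    cashier_free_at = 0.0
    for arrive, service in zip(arrival_times, service_times):
        start = max(arrive, cashier_free_at)  # wait if the cashier is still busy
        cashier_free_at = start + service
        responses.append(cashier_free_at - arrive)
    return responses


# Three customers arrive close together, a fourth arrives after the line has cleared.
arrivals = [0.0, 1.0, 2.0, 10.0]
services = [3.0, 3.0, 3.0, 3.0]
print(checkout_line(arrivals, services))  # [3.0, 5.0, 7.0, 3.0]
```

Service time is the same for everyone, yet response time grows for customers who arrive while the cashier is busy. That queuing delay is what bends the response time curve upward as load rises.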

Page 16: How Are My VM’s Doing? Managing for Performance

Infrastructure Response Time
Are we giving good performance?

• Infrastructure Efficiency - How long to service each transaction?
• Can be scored for how much of the time good service is provided…
– But requires a known Service Level

[Figure: Performance (Response Time) curve with Service Level marked.]
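Scoring "how much of the time good service is provided" could look like the sketch below: the percentage of response-time samples at or under a known service level. The function name and sample data are hypothetical; the slide does not give Akorri's scoring formula.

```python
def service_level_score(response_times, service_level):
    """Percentage of samples whose response time is at or below the service level."""
    if not response_times:
        raise ValueError("no samples to score")
    met = sum(1 for r in response_times if r <= service_level)
    return 100.0 * met / len(response_times)


# Hypothetical response times in milliseconds, scored against a 15 ms service level.
samples_ms = [8, 12, 9, 30, 11, 10, 45, 9, 10, 12]
print(service_level_score(samples_ms, 15))  # 80.0
```

The same samples score differently against a different service level, which is why the score requires a known service level.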

Page 17: How Are My VM’s Doing? Managing for Performance

Akorri Performance Index
A better 100%...

Is Infrastructure Over- or Under-Utilized?

= 100 Optimal Utilization
– Optimal Point is based on modeling for performance

> 100 OVER
– Performance is in jeopardy
– Infrastructure over-utilized

< 100 UNDER
– Performance is stable
– Infrastructure has headroom

[Figure: Performance curve marked PI = 0, PI = 100 (Optimal Point), and PI > 100.]

Page 18: How Are My VM’s Doing? Managing for Performance


Practical Examples with BalancePoint

• Operations Management and P2V Planning

• Justifying Additional Physical Servers for Virtualized Server Clusters

• Trigger/measure IT optimization projects

• CIO Investment Planning

Page 19: How Are My VM’s Doing? Managing for Performance

Scorecard Reporting
Key Performance and Utilization Information for ESX and VMs, Physical Servers, Application Service, Storage Usage

Page 20: How Are My VM’s Doing? Managing for Performance

Do I Need More ESX Hosts?
Can My Current Servers Support More Virtual Machines?

• E.g. VMware ESX Servers
• Model for PI factors in:
– Server capability
– Storage capability
– Other apps (contention)
• Easily rolls up to cluster, domain, and datacenter scores

[Figure: Performance (Response Time per Transaction) vs. Workload (Transactions per Sec), with the 100 point marked. Regions: "Yes: More VMs" and "No: Over Utilized".]

Page 21: How Are My VM’s Doing? Managing for Performance

Example: VMware Status Report
Key Performance and Utilization Information for ESX and VMs

Page 22: How Are My VM’s Doing? Managing for Performance

The Business of IT
Trigger and Measure IT Optimization Projects

For Example -
• If PI is always low (<20%)
– Server Consolidation
– Storage Tiering
• If PI is often high (>120%)
– Infrastructure Upgrades
– Application Tuning
• If PI varies high and low
– Load Re-balancing
– Server and Storage Virtualization

[Figure: two "PI over Time" charts.]
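The three triggers above can be sketched as a small rule set over a series of PI samples. The 20 and 120 cut-offs come from the slide. The meanings given to "always" (every sample), "often" (at least a quarter of samples), and "varies" (both low and high samples present) are assumptions made for illustration, as are the function and parameter names.

```python
LOW, HIGH = 20, 120  # cut-offs quoted on the slide


def suggest_projects(pi_samples, often_fraction=0.25):
    """Map a PI history to the optimization projects listed on the slide."""
    if not pi_samples:
        raise ValueError("no PI samples")
    low = [p for p in pi_samples if p < LOW]
    high = [p for p in pi_samples if p > HIGH]
    if low and high:  # varies high and low
        return ["Load Re-balancing", "Server and Storage Virtualization"]
    if len(high) / len(pi_samples) >= often_fraction:  # often high
        return ["Infrastructure Upgrades", "Application Tuning"]
    if len(low) == len(pi_samples):  # always low
        return ["Server Consolidation", "Storage Tiering"]
    return []  # no trigger fires


print(suggest_projects([5, 10, 15]))         # ['Server Consolidation', 'Storage Tiering']
print(suggest_projects([90, 130, 140, 100])) # ['Infrastructure Upgrades', 'Application Tuning']
print(suggest_projects([10, 150, 15, 130]))  # ['Load Re-balancing', 'Server and Storage Virtualization']
```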

Page 23: How Are My VM’s Doing? Managing for Performance

CIO Reviews IT KPI’s For Every Application/VM each Quarter…

The PI Scale

[Figure: PI scale with ticks at 20, 40, 60, 80, 100, 125, 150, 190, 240, 325, 405, 480, 570, 690, 900, 1000.]

0-100: SHOWS HEADROOM
• PI is “linear” up to 100
• A score of 100 = “Optimal”
• Example: an ESX server with 5 VMs and a PI score of “50” could handle 5 more similar VMs

100+: INDICATES RISK
• PI is “non-linear” over 100
• Escalates rapidly with poor performance
• High penalty for poor service levels
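Because PI is described as linear up to 100, the headroom example can be worked as simple proportion: each similar VM contributes PI / (number of VMs) points, and the remaining points up to 100 are headroom. The sketch below assumes all VMs are similar, as the slide's example does; the function name is hypothetical.

```python
def additional_vms(current_vms, pi_score):
    """Estimate how many more similar VMs fit before PI reaches 100 (linear region only)."""
    if current_vms <= 0 or pi_score <= 0:
        raise ValueError("current_vms and pi_score must be positive")
    if pi_score >= 100:
        return 0  # already at or past the optimal point; no headroom
    # Remaining points (100 - PI) divided by points per VM (PI / current_vms), rounded down.
    return int(current_vms * (100 - pi_score) // pi_score)


print(additional_vms(5, 50))   # 5, matching the slide's example
print(additional_vms(4, 80))   # 1
print(additional_vms(3, 120))  # 0
```

The estimate deliberately stops at 100: above that the scale is non-linear, so simple proportion no longer applies.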

Page 24: How Are My VM’s Doing? Managing for Performance

For Akorri VMware Customers
What We Do:

• Provide single view of VMware infrastructure

• Alert on current and future performance problems, and identify the source of the problem

• Help troubleshoot performance problems through advanced analytics and predictive modeling

• Optimize server/storage utilization

• Drive IT alignment across virtual infrastructure

BalancePoint helps ensure the success of virtualization projects in production environments.

Page 25: How Are My VM’s Doing? Managing for Performance


Thanks!

Mike Matchett

Director Product Management

Akorri

[email protected]

http://www.akorri.com

Live BalancePoint webex demos every Wed – check website for details…

Page 26: How Are My VM’s Doing? Managing for Performance


Additional Slides

Page 27: How Are My VM’s Doing? Managing for Performance

Availability v. Performance

• Availability
– Relatively easy to monitor and measure inside and out
– ROI is limited to minimizing amount of downtime
  • 100% uptime is the best you can do

• Performance
– Hard to measure internally, calibrate externally
– ROI is theoretically unbounded
  • Can always try to improve performance another 10%...

Improving Availability from 99.99 to 99.999% cuts downtime from about 53 to about 5 minutes/yr, buying roughly 47 minutes of uptime/yr - (100% * 47 min).

Improving Performance by 10% can buy continuing productivity - (+10% * 24*365*60 min)
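A quick check of the arithmetic behind this comparison, assuming a 365-day year and reading a 10% performance gain as 10% of every minute in the year (the slide's framing). Names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes(availability_pct):
    """Minutes of downtime per year implied by an availability percentage."""
    return MINUTES_PER_YEAR * (100.0 - availability_pct) / 100.0


four_nines = downtime_minutes(99.99)    # about 52.6 minutes of downtime per year
five_nines = downtime_minutes(99.999)   # about 5.3 minutes of downtime per year
uptime_gained = four_nines - five_nines  # about 47.3 minutes per year

# A 10% performance gain applied to every minute of the year.
productivity_gained = 0.10 * MINUTES_PER_YEAR  # about 52,560 minute-equivalents

print(round(uptime_gained, 1), round(productivity_gained))
```

On these assumptions the performance gain is roughly a thousand times larger than the availability gain, which is the contrast the slide draws.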

Page 28: How Are My VM’s Doing? Managing for Performance

Manage Availability or Performance?

• Availability
– Under-performing systems don’t meet service levels, and are therefore not considered available…

• Performance
– Un-available systems are just performing very, very badly…

At a service level the all-or-nothing Availability definition works. However, IT must use performance to manage, optimize, and plan.

Page 29: How Are My VM’s Doing? Managing for Performance

Infrastructure Agility
Prove Virtualization Works…

• An analysis of variance of infrastructure efficiency over time
– Lower variance means higher agility

• Resources dedicated to single applications will usually show low agility

• Shared resource pools are dynamically assigned to applications, demonstrating high agility

[Figure: two "PI over Time" charts. High Variability = Low Agility.]

Agile Datacenters automatically handle large changes in application usage while also optimizing IT investment!
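The agility idea can be sketched by comparing the variance of PI over time for two resources. Plain population variance is used here as a simple stand-in for the slide's "analysis of variance"; the PI series are hypothetical and the function name is illustrative.

```python
from statistics import pvariance


def pi_variance(pi_samples):
    """Population variance of a PI time series; lower variance suggests higher agility."""
    return pvariance(pi_samples)


# Hypothetical PI histories.
dedicated = [20, 150, 30, 170, 25, 160]  # swings between idle and overloaded
pooled = [85, 95, 90, 100, 88, 92]       # stays near the optimal point

if pi_variance(pooled) < pi_variance(dedicated):
    print("pooled resources show higher agility")
```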

Page 30: How Are My VM’s Doing? Managing for Performance

How?

• BalancePoint discovers and collects performance and utilization data directly from VirtualCenter and also from:
– VM OS
– ESX Server OS
– Database
– Server components
– Storage systems

• Collection is done without any software agents

• BalancePoint uses advanced analytical techniques to correlate across the I/O stack:
– Queue depth analysis
– Infrastructure response time and throughput
– Historical / time series analysis
– Storage and server capacity utilization analysis and trending

Page 31: How Are My VM’s Doing? Managing for Performance

BalancePoint VMware Advantages

• Multiple points of deep data collection and analysis across domains
– DB, VM, CPU, memory, HBA, array
– Not simply collecting and presenting VirtualCenter stats

• Heterogeneous storage array support and drill down
– Other VMware management tools have little/no storage insight

• Akorri performance analytics & metrics (IRT, UI, PI)
– Not simply reporting raw stats

• Rapid installation due to agent-less design
– No heavy agent infrastructure

BalancePoint shows exactly what is happening across the server and storage infrastructure.

Page 32: How Are My VM’s Doing? Managing for Performance

What Else?

• Akorri is a VMware Technology Alliance partner

• BalancePoint is VMotion/DRS “aware”
– Identifies when a VM has moved & tracks performance changes

• BalancePoint supports all major storage array vendors
– EMC, IBM, HP, HDS, Netapp, Dell, Engenio, Dot Hill, etc.

• BalancePoint supports all major server OS’s
– VMware, Linux, Windows, Solaris, HPUX, AIX, etc.

Managing Virtual Infrastructure

Page 33: How Are My VM’s Doing? Managing for Performance

BalancePoint Produces Results

• An internet business avoided purchasing unnecessary storage hardware worth $350K.

• A financial firm found a bottleneck in HBA settings that was slowing down millions of dollars worth of storage.

• An insurance company realized $271K in year-one ROI.

• A healthcare company cut troubleshooting time for application performance events in half.

• A financial company avoided buying more software that could only manage vendor-specific platforms.

• A service provider used BalancePoint to ensure the success of a business-critical VMware project.

Page 34: How Are My VM’s Doing? Managing for Performance

Oracle and SQL
Deep Storage Insight for Database Applications

• Automatically maps database to storage infrastructure
– Oracle instances
– Oracle schema elements

• Creates ViewPoint Topology

• Provides visibility into complex Oracle configurations

• Improves troubleshooting of Oracle issues, performance and capacity problems