Capacity Management and the Cloud
Danny Quilton, COO, Capacitas
itSMF UK Conference 2012
Capacity Management and the Cloud
With the advent of cloud computing, is capacity management still required?
© Capacitas 2002-2012
Capacity planning no longer required?
When the Associated Press (AP) wanted flexibility for application hosting and data storage in the cloud, they turned to the Windows Azure platform from Microsoft.
"Capacity planning is the thing that stands out as the biggest advantage of the Microsoft cloud model. The Windows Azure platform takes that out of the equation for us, unlike the other cloud providers." - Jonathan Malek, Chief Architect and Director of Research, Associated Press
See how Windows Azure helped the AP develop a new global API through easy scalability that removed the need for costly and time-consuming capacity planning.
Microsoft.com
Agenda
• Capacity management defined
• Flawed assumptions
• Case studies
• Summary
What is Capacity and Performance Management?
[Diagram: supply balanced against demand]
Supply: ICT capacity
Demand: Business demand
What is Capacity and Performance Management?
[Diagram: cost balanced against level of service]
Cost
• Third-party provider operational costs
• Capital cost of ICT
• Operational cost of ICT
Level of Service
• Service availability
• Business throughput
• Service response time
A Risk-based Approach
Capacity Management is required:
• Business-critical service
• Significant business growth
• Extraordinary business peaks
• High level of service demanded by the business
• Long lead times associated with capacity upgrades
• Highly competitive market
• Requirement to manage ICT costs
• High likelihood of a merger or acquisition
Flawed Assumptions
• Capacity can be turned up instantaneously
• Capacity planning discipline is no longer required
• Cloud capacity is cheap
• Bottlenecks can be alleviated by expanding cloud capacity
• Capacity management can be delegated to the cloud provider
• Performance is guaranteed by the cloud provider
Capacity can be turned up instantly?
Public Cloud
• A cloud instance may be brought up rapidly
Private Cloud
• In our experience, bringing up capacity takes of the order of weeks
Capacity planning discipline is no longer required?
[Chart: Physical. How many servers? Demand, capacity and waste over time]
[Chart: Cloud. How much will it cost? Demand, capacity and waste over time]
The key questions
• Understand the application
• Model future user demand
• Model utilisation
• Understand acceptable utilisation thresholds
• Plan how many servers to buy and when
How many servers?
[Chart: Physical. Demand, capacity and waste over time]
The key questions
• Understand the application
• Model future user demand
• Model utilisation
• Understand acceptable utilisation thresholds
• Plan how much to spend and when
How much will it cost?
[Chart: Cloud. Demand, capacity and waste over time]
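The planning steps above answer both questions: how many instances, and how much spend. A minimal sketch in Python, assuming a hypothetical monthly peak-demand forecast, a hypothetical per-instance capacity, an acceptable utilisation threshold and a hypothetical price per instance-hour (none of these figures come from the slides):

```python
import math

def instances_needed(peak_demand, capacity_per_instance, max_utilisation):
    """Instances required to serve peak_demand while each instance stays
    at or below the acceptable utilisation threshold (0 < threshold <= 1)."""
    usable_capacity = capacity_per_instance * max_utilisation
    return math.ceil(peak_demand / usable_capacity)

def plan(forecast, capacity_per_instance, max_utilisation,
         price_per_instance_hour, hours_per_month=730):
    """Return (month, instances, monthly_cost) for each forecast entry."""
    rows = []
    for month, peak_demand in forecast:
        n = instances_needed(peak_demand, capacity_per_instance, max_utilisation)
        rows.append((month, n, n * price_per_instance_hour * hours_per_month))
    return rows

# Hypothetical forecast: (month, peak transactions per second)
forecast = [("Jan", 100), ("Feb", 130), ("Mar", 250)]
capacity_plan = plan(forecast, capacity_per_instance=50,
                     max_utilisation=0.7, price_per_instance_hour=0.10)
for month, n, cost in capacity_plan:
    print(f"{month}: {n} instances, approx. {cost:.0f} per month")
```

On physical infrastructure the instance column drives what to buy and when; in the cloud the cost column drives how much to spend and when.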
Capacity planning discipline is no longer required?
• Consider an ecommerce service
• Variable service demand:
• Seasonality
• Promotions
• What will future demand look like?
[Chart: Historical Service Demand for an e-commerce Service. Y axis: Daily Purchases. X axis: monthly dates from 26/03/2004 to 26/06/2009. Series: Actual Daily Purchases; Trend 180day; Linear (Trend 180day), y = 12.088x - 436266, R² = 0.9436]
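The chart's linear trend (y = 12.088x - 436266, R² = 0.9436) is the kind of line an ordinary least-squares fit produces. A minimal sketch of such a fit and a simple extrapolation, assuming hypothetical sample data; the slide does not state what x represents or how the trend was computed, so this is illustrative only:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.
    Returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical history: day number and daily purchases
days = [0, 30, 60, 90, 120]
purchases = [1000, 1400, 1700, 2150, 2500]

slope, intercept, r_squared = fit_line(days, purchases)
forecast_day_180 = slope * 180 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.0f}, R²={r_squared:.4f}")
print(f"extrapolated daily purchases at day 180: {forecast_day_180:.0f}")
```

A straight line captures growth only; seasonality and promotions still have to be layered on top of it when planning demand.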
Capacity planning discipline is no longer required?
• Demand planning must still be undertaken
[Chart: Forecast Service Demand for an e-commerce Service. X axis: quarterly dates from 01/01/2006 to 01/01/2012. Series: Actual Daily Purchases; Forecast Daily Purchases]
Capacity planning discipline is no longer required?
• Performance testing and performance tuning are still required
• Otherwise capacity proliferates in the cloud
• In this example the non-tuned application needs 6 server instances where the tuned application needs 2: a 3-fold increase in opex
[Diagram: non-tuned application on server instances 1 to 6; tuned application on server instances 1 and 2]
Case Study 1: Cloud Capacity is Cheap?
• E-commerce service on owned, physical infrastructure
• Proof of concept to assess capacity required on a private cloud
Infrastructure | Processing Capacity per Instance | Number of Instances
Owned infrastructure | 8-core | 32
Private Cloud Proposal A | 4-core | ?
Private Cloud Proposal B | 8-core | ?
Case Study 1: Cloud Capacity is Cheap?
• Testing was carried out to establish the relative capacity of the current and proposed architectures
• Tests were run against key transactions in the e-commerce application
• Profound implications for the business case
Infrastructure | CPU processing time per key transaction (relative to owned infrastructure; per core) | Number of Instances Required
Owned infrastructure | 1.0 | 32
Private Cloud Proposal A | 1.4 | 90
Private Cloud Proposal B | 2.0 | 64
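The instance counts above are consistent with scaling the total core count by the relative CPU time per transaction. A sketch of that arithmetic, assuming the workload spreads evenly across cores and scales linearly with CPU time per transaction; the slides do not state the method used, so this is one reading that reproduces the figures:

```python
import math

def instances_required(baseline_instances, baseline_cores_per_instance,
                       relative_cpu_time, proposed_cores_per_instance):
    """Instances needed on the proposed platform to match the baseline.

    relative_cpu_time is the CPU time per key transaction, per core,
    relative to the baseline (1.0 means identical).
    """
    baseline_cores = baseline_instances * baseline_cores_per_instance
    cores_needed = baseline_cores * relative_cpu_time
    return math.ceil(cores_needed / proposed_cores_per_instance)

# Figures from the case study: 32 owned instances with 8 cores each
proposal_a = instances_required(32, 8, relative_cpu_time=1.4,
                                proposed_cores_per_instance=4)
proposal_b = instances_required(32, 8, relative_cpu_time=2.0,
                                proposed_cores_per_instance=8)
print(proposal_a, proposal_b)
```

Under these assumptions Proposal A comes out at 90 instances and Proposal B at 64, matching the table.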
Case Study 2: Capacity Planning in the Cloud
• A travel e-commerce service
• A mobile site to provide better travel information during periods of disruption
• This site was to be hosted as a cloud-based service
• The demand on the mobile site would start off low and grow to unknown levels
Performance Testing
To ensure that:
• User response times will be within performance SLAs
• System will provide value for money within budgets
Steps:
1. Find and fix code defects
2. Determine and optimise response times
3. Determine and optimise costs
Step 1: Finding Code Defects
• Before response times and costs can be measured, defects introducing non-linearity must be found and fixed
• Examples: memory leaks, logical bottlenecks, locking
Step 2: Determining Response Time
• Test response times as experienced by the user
• Are times within SLAs?
[Chart: average response time before fix, after fix 1 and after fix 2, compared with the SLA]
Step 3: Cost Optimisation
• We have resolved code defects
• The service now meets its response time SLA
• We can go live, right?
• No!
Service performance should be tuned to achieve cost optimality
Step 3: Cost Optimisation
Stage | Capacity of the service (user visits per second per instance) | Number of instances required (to support 40 visits per second)
Pre-optimisation | 2 | 20
Post-optimisation | 50 | 1
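The instance counts follow directly from dividing the target load by the per-instance capacity and rounding up. A short sketch of that calculation, using the figures from the slide (the function name is illustrative):

```python
import math

def instances_for(target_visits_per_second, visits_per_second_per_instance):
    """Smallest whole number of instances that can carry the target load."""
    return math.ceil(target_visits_per_second / visits_per_second_per_instance)

TARGET = 40  # visits per second, from the slide

pre_optimisation = instances_for(TARGET, 2)    # 2 visits/s per instance
post_optimisation = instances_for(TARGET, 50)  # 50 visits/s per instance
print(pre_optimisation, post_optimisation)
```

Because cloud instances are paid for while they run, the 20-to-1 reduction in instances translates into a corresponding reduction in running cost.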
Step 3: Cost Optimisation
[Figure: pre-optimisation compared with post-optimisation]
Bottlenecks can be alleviated by expanding cloud capacity?
• Our experience is that most bottlenecks relate to logical rather than physical capacity constraints
Physical Capacity
• CPU
• Memory
• Disk space
• Disk I/O
• Network bandwidth
Logical Capacity
• Allocated size of a database table
• Capacity of a third-party’s web service
• The number of threads
• The number of database locks
• Free connections in a connection pool
Bottlenecks can be alleviated by expanding cloud capacity?
• Here there is a logical capacity constraint with regard to a database table’s allocated space
• Increasing web or database instance capacity will not address the root cause
[Diagram: three web server instances in front of one database instance; the constraint is the space allocated to a database table]
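A toy model of the point above, not taken from the slides: if the service's effective capacity is the minimum of its physical capacity and its tightest logical limit, then adding instances stops helping once the logical limit binds. All names and numbers here are hypothetical:

```python
def effective_capacity(instances, capacity_per_instance, logical_limit):
    """Service capacity is capped by whichever constraint is tighter:
    the physical capacity of the instances or the logical limit
    (for example a table's allocated space or a connection pool size)."""
    physical_capacity = instances * capacity_per_instance
    return min(physical_capacity, logical_limit)

# Hypothetical units: requests per second
LOGICAL_LIMIT = 120
for instances in (1, 2, 3, 6):
    capacity = effective_capacity(instances, capacity_per_instance=50,
                                  logical_limit=LOGICAL_LIMIT)
    print(f"{instances} instances -> capacity {capacity}")
```

In this model, going from 3 to 6 instances doubles the cost without raising capacity at all; only lifting the logical limit does.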
Capacity management can be delegated to the cloud provider?
• Planning must be undertaken to forecast peak demand and size accordingly
• Requires business knowledge
• Requires specific skills
• Potential conflict of interest!
[Chart: Forecast Service Demand for an e-commerce Service. X axis: quarterly dates from 01/01/2006 to 01/01/2012. Series: Actual Daily Purchases; Forecast Daily Purchases]
Case Study 3: Capacity management can be delegated to the cloud provider?
• A retailer
• E-commerce service
• Migrated the service from own infrastructure to a cloud service
• The number of processor cores in the cloud was the same as the number of processor cores on the previous infrastructure
• Service performance degraded post migration
Case Study 3: Capacity management can be delegated to the cloud provider?
• No issues reported by service provider
• CPU loading measured at the guest
• Other than on one server, CPU loading was within acceptable bounds!
Case Study 3: Capacity management can be delegated to the cloud provider?
• However, there was evidence of high CPU queue lengths and service performance degradation
• Insufficient processor capacity was configured on the host machines
Case Study 4: Capacity management can be delegated to the cloud provider?
• A travel e-commerce service
• Cloud database on Microsoft Azure
• Service went live with a 5 GB database
• Low growth expected
• Post go-live, the database was growing at 2.1 GB per day
• Forecast growth of 767 GB over the first year
• SQL Azure database instances were limited to 150 GB
• So the forecast capacity requirement was 6 databases, each 130 GB in size
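The forecast can be reproduced from the figures on the slide. A short sketch of the arithmetic, assuming a 365-day year and growth spread evenly across the databases (the variable and function names are illustrative):

```python
import math

GROWTH_GB_PER_DAY = 2.1   # observed after go-live (from the slide)
MAX_DATABASE_GB = 150     # per-database limit quoted on the slide
DAYS_IN_YEAR = 365

def forecast_growth_gb(growth_gb_per_day, days):
    """Total growth if the observed daily rate continues unchanged."""
    return growth_gb_per_day * days

def databases_required(total_gb, max_database_gb):
    """Smallest number of databases that keeps each one under the limit."""
    return math.ceil(total_gb / max_database_gb)

growth = forecast_growth_gb(GROWTH_GB_PER_DAY, DAYS_IN_YEAR)
count = databases_required(growth, MAX_DATABASE_GB)
size_each = growth / count
print(f"{growth:.1f} GB growth, {count} databases, "
      f"about {size_each:.0f} GB each")
```

This gives roughly 767 GB of growth and 6 databases of about 128 GB each, in line with the rounded figures on the slide.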
Case Study 4: Cost Implications
[Figure: cost of current capacity compared with cost of forecast capacity]
Case Study 4: Capacity management can be delegated to the cloud provider?
• Investigation demonstrated that the growth of 2.1 GB per day was due to:
1. Poor archiving
2. ‘Scrapers’ searching for invalid route combinations, resulting in large numbers of database inserts
Case Study 5: Remediation of Performance Issues
• Application hosted in the Microsoft Azure cloud (web app and database)
• Performance testing revealed a response time degradation
• Web tier OK
• It was not possible to launch any performance tools/diagnostics against the database service
• Extremely difficult to establish root cause!
Performance is Guaranteed by the Cloud Provider?
• Extract from the Amazon EC2 SLA
• End-to-end performance is not guaranteed
Performance is Guaranteed?
• Counter-argument is to ‘design for failure’
• Automatically detect capacity constraints
• Automatically detect unhealthy instances
• Then automatically bring up new instances
• Risk that instance proliferation can adversely impact system-wide performance
• However, end-to-end performance is still not guaranteed
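The 'design for failure' loop above can be sketched as a single decision function. This is a hypothetical illustration, not any cloud provider's API: the Instance type, the utilisation threshold and the instance cap are all assumptions, and the cap is one simple way to limit instance proliferation.

```python
from dataclasses import dataclass

@dataclass
class Instance:
    name: str
    healthy: bool
    utilisation: float  # fraction of capacity in use, 0.0 to 1.0

def scaling_decision(instances, scale_out_threshold=0.8, max_instances=10):
    """Return (names_to_replace, instances_to_add).

    Unhealthy instances are replaced one-for-one. One instance is added
    when the healthy instances are, on average, above the utilisation
    threshold, unless that would exceed the cap on total instances.
    """
    to_replace = [i.name for i in instances if not i.healthy]
    healthy = [i for i in instances if i.healthy]

    to_add = 0
    if healthy:
        average = sum(i.utilisation for i in healthy) / len(healthy)
        if average > scale_out_threshold:
            to_add = 1

    # Cap the fleet size so automatic scaling cannot proliferate unchecked.
    if len(instances) + to_add > max_instances:
        to_add = max(0, max_instances - len(instances))
    return to_replace, to_add

fleet = [
    Instance("web-1", healthy=True, utilisation=0.90),
    Instance("web-2", healthy=False, utilisation=0.00),
    Instance("web-3", healthy=True, utilisation=0.85),
]
print(scaling_decision(fleet))
```

A loop like this reacts to a capacity constraint; it does not by itself guarantee the response time a user sees.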
Comparing Performance of Cloud Service Providers
Summary
• Clear benefits of cloud computing
• However, capacity management is still required
• Capacity management is key to managing the cost of cloud services
Questions?
www.capacitas.co.uk