Capacity Management Process Handbook

Capacity Management Process Handbook By Mark S. Mahre C Michael Dalton March 2009

Page 2: Capacity Management Process Handbook

Capacity Management Process Handbook

2009 authored by: C Michael Dalton Page 2

Table of Contents

People, Process and Technology
    Identify, Map and Model Existing Processes
ITIL Process Framework for the IT Environment
    ITIL Service Support Processes:
        Incident Management
        Problem Management
        Change Management
        Release Management
        Configuration Management
    ITIL Service Delivery Processes:
        Service Level Management
        Capacity Management
        Availability Management
        Service Continuity Management
        Financial Management
Capacity Management
    Business Capacity Management
    Service Capacity Management
    Component Capacity Management
    Key Activities of Capacity Management
    Capacity Management Information System (CMIS)
    Capacity Management Roles
    Capacity Management Processes
    Capacity Management Artifacts
        CPM-1.0 Capacity Management (Overview)
        CPM-1.1 Forecast Future Demand for IT Capacity
        CPM-1.2 Compile Capacity Plan
        CPM-1.3 Monitor IT Capacities
        CPM-1.4 Carry Out Capacity Management Reporting
        CPM-1.5 Commission Optimization Measures for IT Capacities

Capacity Planning
    Determine service level requirements
    Analyze current capacity
    Plan for and align with future business requirements.
Storage Capacity Planning
    Five Steps for Improving Storage Capacity Planning
    Storage Capacity Planning Capacity Model (SCPCM)
        Overview of the Storage Capacity Planning Capability Model
        Estimation Planning
        Resource Side Planning
        Integrated Application Side / Resource Side Planning
        Business Plan / Resource Side Planning

Table of Figures

Figure 1: People, Process and Technology
Figure 2: CPM-1.0 Capacity Management (see Appendix A for enlarged view)
Figure 3: CPM-1.1 Forecast Future Demand for IT Capacity (see Appendix A for enlarged view)
Figure 4: CPM-1.2 Compile Capacity Plan (see Appendix A for enlarged view)
Figure 5: CPM-1.3 Monitor IT Capacities (see Appendix A for enlarged view)
Figure 6: CPM-1.4 Carry Out Capacity Management Reporting (see Appendix A for enlarged view)
Figure 7: CPM-1.5 Commission Optimization Measures for IT Capacities (see Appendix A for enlarged view)

People, Process and Technology

Any technology-based project involves three elements: People, Process, and the Technology itself. Getting all three aligned is essential to ensuring that a change will work within an organization.

The order in which these elements are addressed also makes an essential difference.

1. People - What are the key issues? Who owns the process, who is involved, and what are their roles? Are they committed to improving it and working together and, importantly, are they prepared to do the work to fix the problem?

2. Process - A process can be defined as starting with a trigger event that creates a chain of actions, which results in something being prepared for a customer of that process. Start at a high level and identify the key big steps in order to see the process from end to end. Only then move into more detail to capture the various layers and exceptions involved. Focusing on the high-frequency transactions can bring significant benefit in standardizing the process. Remember, however, that it can be the non-standard transactions where service slips the most or where there is the greatest potential for significant failure in the process.

3. Technology - Now that people are aligned and the process has been developed and clarified, technology can be applied to ensure consistency in the application of the process and to provide the thin guiding rails that keep the process on track. This makes it easier to follow the process than not to do so.

Of course, there is much more to getting a technology project right, but get these three elements sorted out and you will be a long way down the path to project success.

Figure 1 : People, Process and Technology

Identify, Map and Model Existing Processes

Identifying, mapping and modeling processes is the first step in defining meaningful policies and procedures. Understanding existing processes gives clarity to how an enterprise actually works, which in turn dictates the business rules that are encapsulated into its policies. Knowledge of these processes also drives all procedural efforts within those processes defining the sequence, transition points, and decision points of the workflow that underlie all detailed procedures.

The goal of the Process Management program is to help identify, assess, document and manage processes within IT and the business as a whole. This approach includes four key steps:

o Understanding Process Frameworks
o Capturing and Communicating Processes
o Defining Detailed Processes and/or Procedures
o Managing a Process Portfolio

ITIL Process Framework for the IT Environment

Operational processes within the IT environment are those used jointly by the solution staff and its

customers to design, test, deploy, operate, and maintain a solution. These processes are generally

labeled in broad terms such as Problem, Change, Configuration, etc. The operational processes in this

section are based on ITIL process best practices and lessons learned.

ITIL Service Support Processes:

Incident Management - Receiving, recording, and classifying user

reports of malfunctions, primarily received through the help desk. ITIL

defines an incident as “any event which is not part of the standard

operation of a service and which causes, or may cause, an interruption to,

or a reduction in the quality of that service.” The objective of Incident

Management is to restore normal service operation as quickly as possible

with minimum disruption to the business, thus ensuring that the best

achievable levels of availability and service are maintained.

The following recommendations should be considered for incident management:

o Review the previous day’s incidents during a periodic solution status meeting. The status of each open incident should be reviewed quickly, covering:
    o date incident was reported
    o impact of incident
    o priority of incident resolution
    o who the incident has been assigned to
    o status on determining the cause of the incident
    o status on determining resolution of incident

BMC Remedy 7 contains an Incident Management module to track all incidents. Each record contains

the above information on each incident. It can be used as both a tracking mechanism and a reference

for future incident resolution.
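The review fields listed above map naturally onto a small record type. The following is a minimal Python sketch for illustration only; it is not BMC Remedy's data model or API. The class name `Incident`, its field names, the "1 = most urgent" priority convention, and the helper `open_incidents_for_review` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Incident:
    """One incident as reviewed in the status meeting (hypothetical fields)."""
    reported_on: date                 # date incident was reported
    impact: str                       # impact of incident
    priority: int                     # assumption: 1 = most urgent
    assigned_to: str                  # who the incident has been assigned to
    cause_status: str                 # status on determining the cause
    resolution_status: str            # status on determining the resolution
    closed_on: Optional[date] = None  # None while the incident is still open


def open_incidents_for_review(incidents: List[Incident]) -> List[Incident]:
    """Return open incidents, most urgent first, then oldest first."""
    still_open = [i for i in incidents if i.closed_on is None]
    return sorted(still_open, key=lambda i: (i.priority, i.reported_on))
```

Sorting by priority and then by report date keeps the meeting focused on the most urgent and longest-standing open items; a team could equally choose a different ordering.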

Problem Management - Analysis of incidents to uncover patterns of repetition that might indicate a common root cause. A positive conclusion results in a Request for Change (RFC), and the cycle repeats. The objective is to determine the root cause of an incident or a group of related incidents, and to develop and document a resolution that ensures the incident does not occur again or, if it does, that there is a known fix for the problem.

The following recommendations should be considered for problem management.

o Tactically, use the daily status meeting and the incident tracking system to record and track problems. This meets current requirements, and the system's updating and reporting capabilities are under the immediate control of the team.

BMC Remedy 7 has a Problem Management module that can be used to strategically record and track all

problems.
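Uncovering "patterns of repetition" can start with a simple count of how often the same kind of incident recurs. The sketch below is one hedged way to do that in Python; the representation of an incident as a `(component, symptom)` pair, the function name `recurring_patterns`, and the default threshold of three occurrences are assumptions made for illustration, not part of ITIL or BMC Remedy.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Pattern = Tuple[str, str]  # hypothetical key: (component, symptom)


def recurring_patterns(
    incidents: Iterable[Pattern], min_count: int = 3
) -> List[Tuple[Pattern, int]]:
    """Count identical (component, symptom) pairs and return those seen
    at least min_count times, most frequent first."""
    counts = Counter(incidents)
    return [(key, n) for key, n in counts.most_common() if n >= min_count]
```

Each pattern returned is a candidate for root cause analysis; whether it becomes a recorded problem remains a judgment for the team.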

Change Management - Response to and action upon requests for change. The process includes solution evaluation and design, risk analysis, prioritization, approvals, and feasibility testing. The objective is to ensure that standardized methods and procedures are used for efficient and prompt handling of all Changes, in order to minimize the impact of any related Incidents upon service.

The following recommendations should be considered for change management:

o Tactically, introduce a change management discipline through the proposed daily status

meetings. All changes to the solution should be identified and discussed during this meeting.

This would include changes to resolve incidents or problems as well as those needed to enhance

the solution.

BMC Remedy 7 includes a Change Management module that allows the Team Lead, as part of his or her project management responsibilities, to use a formal change management process to track and implement changes to the solution.

Release Management - Sequence of events for rolling out a change to the user environment in order to minimize disruption, prevent errors and loss of data, and maintain proper documentation. The objective is to ensure the success of large-scale software or hardware changes in a distributed IT environment through the stages of planning, development, testing, communication, and distribution into production. Release Management is tightly coupled with the Change and Configuration processes.

Configuration Management - Creation and maintenance of a database of all IT configuration items, their relationships with other items, and their proper state. The objective is to provide a logical model of the IT infrastructure by identifying, controlling, maintaining, and verifying the versions of all Configuration Items (CIs) in existence.

The largest numbers of configuration changes normally come from:

o New servers to be backed up
o New databases to be backed up
o New requirements for e-mail back-up and/or archiving
o Increased storage requirements for resources currently being backed up

The following recommendations should be considered for configuration management:

o Institute a weekly review of new storage requirements by the team and its customers. This

would allow the team time to adjust the configuration for new requirements in “bulk mode”

versus addressing each one individually and perhaps making contradictory and/or duplicate

adjustments. Recognizing that in today’s environment a week may seem a long time in

addressing customer requirements, there could be an emergency path for implementation, but

the use of it should be the exception with a justification, not the norm.

Changes to the configuration would be made in connection with established Change Management processes to ensure that proper documentation and tracking are done. BMC Remedy 7 has a Configuration Management Database component and can interface with several standard CMDBs in the current marketplace.

ITIL Service Delivery Processes:

Service Level Management – To maintain and gradually improve business-aligned IT service quality through a constant cycle of agreeing, monitoring, reporting and reviewing IT service achievements, and through instigating actions to eradicate unacceptable levels of service.

The backup requirements submission process is different from

organization to organization, but the requirements are relatively infrequent and are sometimes handled

on a “best we can do” basis. A team needs to be regulated and guided by more than just “informal”

service guidelines.

The following recommendations should be considered for service level management:

o There should be “standard” back-up service levels developed by the team.

For example, a “standard” back-up service level could be defined for each server type, e.g. Notes, Database, Standard, etc. It may be useful to publish these on the backup request submission form to educate the customers on the standard services and set their expectations accordingly. This education section can also be used to describe the team's expectations of the customer: what roles and responsibilities do customers have in the successful operation of the solution? There may be tasks or activities that they should or should not perform to help the team meet service levels effectively. These “standard” backup service levels can be viewed as an informal implementation that can then be incorporated into a Service Catalog as defined in ITIL.

“One-off” or exception backup policies should be avoided if at all possible. If a large number of these occur, they should be examined to see whether a single “umbrella” policy can be developed that meets all, or the great majority, of the exception requirements. This single “umbrella” policy can be viewed as a Service Level Agreement (SLA), as defined in ITIL, between the customer and the service provider (the team).

Capacity Management – To understand the future business requirements (the required

service delivery), the organization’s operation (the current service delivery), the IT infrastructure (the

means of service delivery), and ensure that all current and future capacity and performance aspects of

the business requirements are provided cost effectively.

Capacity management is a key process that has a significant impact on the continued success of a

solution. The overall backup solution can be under strain due to the lack of capacity.

The following factors are typically the cause of a lack of capacity:

o Growth in the amount of data backed-up from resources already in the solution

o New requirements for the back-up of servers, databases and e-mail and the length of retention

o Consistent incremental growth of new resources being added to the solution

o Additional capacity may not have been planned for sufficiently based on the new growth rate

o Hardware was purchased in anticipation of the growth but its implementation was delayed

o New large projects requiring storage services have not been identified early enough.

The following recommendations should be considered for capacity management:

o A team member should be assigned the responsibility to be the on-going capacity planner for

the solution. It would be reasonable to give this individual the solution architecture and design

position as well. In this way he or she would develop both the requirements and the

corresponding changes in the solution to meet those requirements.

o A storage resource management solution should be considered. Various tools are available to implement this solution. It would provide data on growth trends and the amounts of data being managed, and the ability to review patterns of data size and use.

o The capacity planner should provide monthly reports on current and future capacity needs as predicted by trend lines and newly identified projects. Each trend line should show the point in time at which additional capacity will be required.
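To attach a time to a trend line, the planner can fit a straight line to recent usage and project when it crosses the installed capacity. The following is a minimal Python sketch under stated assumptions: usage is sampled once per month, growth is roughly linear, and the function name `months_until_full` and the terabyte units are hypothetical. Real storage growth is often not linear, so the result is a planning estimate, not a prediction.

```python
from typing import Optional, Sequence


def months_until_full(usage_tb: Sequence[float], capacity_tb: float) -> Optional[float]:
    """Fit a least-squares line to monthly usage samples and return how many
    months after the last sample the line reaches capacity_tb.

    Returns None when the trend is flat or decreasing, and 0.0 when the
    fitted line has already reached capacity at the last sample.
    """
    n = len(usage_tb)
    if n < 2:
        raise ValueError("need at least two monthly samples")
    xs = range(n)  # month index: 0, 1, ..., n - 1
    mean_x = sum(xs) / n
    mean_y = sum(usage_tb) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, usage_tb))
    slope = sxy / sxx  # growth per month
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    months_from_start = (capacity_tb - intercept) / slope
    return max(0.0, months_from_start - (n - 1))
```

For example, samples of 10, 12, 14 and 16 TB against a 20 TB capacity give a growth rate of 2 TB per month, so the line reaches capacity two months after the last sample. Newly identified projects would still need to be added on top of the trend.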

Availability Management – To optimize the capability of the IT infrastructure and supporting organization to deliver a cost effective and sustained level of availability that enables the business to satisfy its objectives.

In general, the availability of the storage solution is dictated by the expectations of its customers. For example, the solution will be available during the daily backup period of 5:00 PM to 4:00 AM. In addition to the backup period, some customers feel that the solution should be available during normal business hours. This is normally not a problem, but the team debugs the previous night’s problems during the day and occasionally has to make the solution unavailable. If there are no formal availability agreements, the customer teams may not be notified of pending solution outages and can be caught unprepared when the solution goes down.

The following recommendations should be considered for availability management.

o The team should publish availability expectations for the solution. This would include hours

where the solution should be available as well as pre-planned outage windows for problem

resolution and/or maintenance.

o A “key contact” list should be established to be used to notify the customer groups of a pending

outage and/or expected up times. Communication could be done via e-mail, pagers, or phone

messages. It would be the responsibility of the “key contact” in the customer group to notify his

or her peers of the pending change in status.
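Published availability hours are easier to apply consistently when they can be checked mechanically. The sketch below, in Python, tests whether a given time falls inside the 5:00 PM to 4:00 AM backup period used as the example in this section. The names `BACKUP_START`, `BACKUP_END` and `in_window` are hypothetical, and treating the end time as exclusive is an assumption.

```python
from datetime import time

BACKUP_START = time(17, 0)  # 5:00 PM, from the example in this section
BACKUP_END = time(4, 0)     # 4:00 AM the following morning


def in_window(t: time, start: time = BACKUP_START, end: time = BACKUP_END) -> bool:
    """True if t falls inside [start, end), including windows that wrap past midnight."""
    if start <= end:
        # Ordinary same-day window, e.g. 09:00 to 17:00.
        return start <= t < end
    # Window wraps past midnight, e.g. 17:00 to 04:00.
    return t >= start or t < end
```

The same function can describe a pre-planned outage window, so a maintenance request can be checked against the published hours before the key contacts are notified.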

Service Continuity Management – To support the overall Business Continuity

Management process through ensuring that the required IT technical and services facilities can be

recovered within required and agreed business time-scales.

For many organizations, the storage solution plays a key role in the disaster recovery plan. It is

recommended that an analysis should be done of the impact of losing the servers and their supporting

infrastructure. Based on this analysis, appropriate plans should be put in place to ensure the correct

level of service continuity for the storage solution.

Financial Management – To provide cost effective stewardship of the IT assets and the

financial resources used in providing IT services.

For many organizations, the need for backup is fully justified, but the lack of backup policy and accountability makes it difficult to justify the overall cost of the system. Cost can be controlled and managed if it is considered when defining the backup requirements and the retention policies.

Capacity Management

Capacity management is the practice of considering future capacity requirements for current and yet-to-be-implemented IT services. The purpose of capacity management is to ensure that sufficient capacity exists to support new services and solutions, taking advances in technology into account. Employing a capacity management process helps to ensure there is adequate funding so that new capacity can be implemented when needed.

Business Capacity Management

This sub-process translates business needs and plans into requirements for Service and IT Infrastructure, ensuring that future business requirements for IT Services are quantified, designed, planned and implemented in a timely fashion. The main objective is to ensure that the customer outcomes for IT Services are considered and understood, and that sufficient IT Capacity to support any new or changed Services is planned and implemented within an appropriate timescale.

The activities associated with Business Capacity Management are:

Modeling – Modeling is a tool which helps to predict the current and future behavior of the infrastructure and to determine capacity requirements based on a given volume and variety of work.

Application sizing – Application sizing evaluates the necessary resources to run new or changed applications. The resulting predictions include information about expected performance levels, necessary hardware, and costs.

Capacity planning - Capacity planning analyzes future needs.

Service Capacity Management

The focus of this sub-process is the management, control and prediction of the end-to-end performance and capacity of the live, operational IT Services usage and workloads. It ensures that the performance of all Services, as detailed in Service targets within SLAs and SLRs, is monitored and measured, and that the collected data is recorded, analyzed and reported. The main objective is to identify and understand the IT Services, their use of components, working patterns, peaks and troughs, and to ensure that the Services meet their SLA targets.

Component Capacity Management

The focus of this sub-process is the management, control and prediction of the performance, utilization and capacity of individual IT technology components (CPU, Disk, etc). It ensures that all components within the IT Infrastructure that have finite resource are monitored and measured, and that the collected data is recorded, analyzed and reported.

Key Activities of Capacity Management

The activities of Service Capacity Management and Component Capacity Management are:

Demand management – Demand management is the practice of transferring and transitioning demand in order to prevent an infrastructure component from becoming overloaded. Demand management reviews capacity requirements from both a long-term and a short-term perspective.

Monitoring - Monitoring the infrastructure components is a way to measure and assure that the agreed-upon service levels can be achieved. Examples of resources that should be monitored are central processing unit (CPU) utilization, disk utilization, network utilization, and the number of concurrent licenses in use.

Analysis - Trend analysis is utilized as a measure to forecast future utilization needs. An analysis of current systems and components may initiate efficiency improvements or the acquisition of additional IT components.

Tuning - Tuning helps better utilize system resources or improve the performance of a particular service by identifying steps to take which will optimize the system for current or anticipated workload.

Implementation - Implementation is the process of implementing the changed or new capacity which was identified during the monitoring, analysis, and tuning activities.
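The monitoring activity above can be illustrated with a simple threshold check over the example resources it names (CPU, disk, network, concurrent licenses). This is a minimal Python sketch; the threshold values, the resource keys, and the function name `breaches` are hypothetical and would in practice be derived from the agreed service levels.

```python
from typing import Dict, List

# Hypothetical warning thresholds, expressed as fractions of total capacity.
THRESHOLDS: Dict[str, float] = {
    "cpu": 0.80,
    "disk": 0.85,
    "network": 0.70,
    "licenses": 0.90,
}


def breaches(
    samples: Dict[str, float], thresholds: Dict[str, float] = THRESHOLDS
) -> List[str]:
    """Return the names of monitored resources whose utilization is at or
    above its threshold, sorted by name. Resources without a threshold are ignored."""
    return sorted(
        name
        for name, used in samples.items()
        if name in thresholds and used >= thresholds[name]
    )
```

A breach reported here would feed the analysis and tuning activities, and could be raised as an Incident or Problem where appropriate.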

Capacity Management Information System (CMIS)

The CMIS holds the information needed by all sub-processes within Capacity Management. For example, the data monitored and collected as part of Component and Service Capacity Management is used in Business Capacity Management to determine what infrastructure components or upgrades to components are needed, and when.

[Figure: diagram with the following labels: Business Capacity Management; Service Capacity Management; Component Capacity Management; SLA/SLR and IT Service Design; Capacity Management Tool; Review current capacity & performance; Improve current resource & performance; Assess, agree & document new requirements & capacity; Plan new capacity; Capacity Plan; Forecast; Capacity & Performance Reports; CMIS]

Building a capacity plan is probably the most important, and most complex, piece of work that the capacity management team produces. In most cases the Capacity Plan is produced on an annual basis and mid-course corrections are made at predetermined intervals, usually quarterly or semiannually. The plan takes into account current business growth by business unit or process, budgetary considerations from the financial organization, planned projects from applications teams and the technical support organizations, service restoration requirements from availability management teams, business contingency requirements from the disaster recovery teams, and service level agreements in place and planned.

Capacity Management Roles

The Capacity Manager role has the responsibility for ensuring that the aims of Capacity Management are met. This includes responsibilities such as:

o Ensuring that there is adequate IT Capacity to meet required levels of Service, that senior IT management is correctly advised on how to match Capacity and demand, and that use of existing Capacity is optimized
o Identifying, along with the Service Level Manager, capacity requirements through discussions with the business users, remembering that the Service Level Manager is responsible for the negotiations with the customer
o Understanding the current usage of the infrastructure and IT Services, and the maximum capacity of each component
o Performing sizing on all proposed new Services and systems, possibly using modeling techniques, to ascertain capacity requirements
o Forecasting future capacity requirements based on business plans, usage trends, sizing of new Services, etc.
o Producing, regularly reviewing and revising the Capacity Plan, in line with the organization’s business planning cycle, identifying current usage and forecast requirements during the period covered by the plan
o Ensuring that appropriate levels of monitoring of resources and system performance are set
o Analyzing usage and performance data, and reporting on performance against targets contained in SLAs
o Raising Incidents and Problems when breaches of capacity or performance thresholds are detected, and assisting with the investigation and diagnosis of capacity-related Incidents and Problems
o Ensuring that all changes are assessed for their impact on capacity and performance
o Carrying out performance testing of new Services and systems
o Determining performance Service levels that are maintainable and cost justifiable
o Acting as a focal point for all capacity and performance issues

Capacity Management Processes

Capacity Management (CPM) processes and sub-processes include: (see below)

CPM-1.0 Capacity Management (Overview)

CPM-1.1 Forecast Future Demand for IT Capacity

CPM-1.2 Compile Capacity Plan

CPM-1.3 Monitor IT Capacities

CPM-1.4 Carry Out Capacity Management Reporting

CPM-1.5 Commission Optimization Measures for IT Capacities

Capacity Management Artifacts

Capacity Management (CPM) artifacts include: (see Appendix B for samples)

CHM-R1 Change Record

CHM-R2 Forward Schedule of Changes (FSC)

CHM-R3 Request for Change (RFC)

CPM-R1 Capacity Forecast

CPM-R2 Capacity Plan

CPM-R3 Capacity Report

CPM-1.0 Capacity Management (Overview)

Capacity Management supports the optimum and cost effective provision of IT Services by helping IT organizations match their IT resources (software, hardware, human resources) to the business needs. This process involves estimations of future demand, which are the basis for planning future capacity needs, resulting in the Capacity Plan.

Figure 2: CPM-1.0 Capacity Management (see Appendix A for enlarged view)

CPM-1.1 Forecast Future Demand for IT Capacity

The future demand for IT capacities is estimated on the basis of performance and capacity utilization measurements, together with information from the client side.

Figure 3: CPM-1.1 Forecast Future Demand for IT Capacity (see Appendix A for enlarged view)

CPM-1.2 Compile Capacity Plan

This sub-process designs concrete measures for adjusting IT capacities in response to infringements of performance and capacity agreements that have already occurred or are likely to occur in the future, and compiles the corresponding RFCs.

Figure 4: CPM-1.2 Compile Capacity Plan (see Appendix A for enlarged view)

CPM-1.3 Monitor IT Capacities

This sub-process monitors the performance and IT capacities agreed upon in the SLAs and, where necessary, identifies measures for the continued assurance of the agreed Service Levels.

Figure 5: CPM-1.3 Monitor IT Capacities (see Appendix A for enlarged view)

CPM-1.4 Carry Out Capacity Management Reporting

This sub-process reports on adherence to performance and capacity agreements and on the countermeasures for correcting infringements or projected shortages related to the agreed Service Levels.

Figure 6: CPM-1.4 Carry Out Capacity Management Reporting (see Appendix A for enlarged view)

CPM-1.5 Commission Optimization Measures for IT Capacities

After the change has been successfully cleared, the implementation planning is detailed; the implementation is then assigned to suitable technical experts within Application and/or Infrastructure Management.

Figure 7: CPM-1.5 Commission Optimization Measures for IT Capacities (see Appendix A for enlarged view)

Capacity Planning

Capacity Planning, in the most general sense, can be accomplished in a three-step process:

Determine Service Level Requirements

Analyze Current Capacity

Plan for and Align with Future Business Requirements

Determine service level requirements.

The first step in the capacity planning process is to categorize work performed by the system and to quantify user expectations for how a unit of that work gets done. The overall process of establishing service level requirements first demands an understanding of the current workloads and secondly, the knowledge to forecast future processing requirements.

Workloads – from a capacity planning perspective, a computer system processes workloads (which supply the demand) and delivers service to users. These workloads must be identified with a definition of satisfactory service. It is useful to analyze the work being done from a business perspective, rather than in technical terms, using business-relevant workload definitions. It is much easier to project future work when it is expressed in terms that make business sense.

For capacity planning purposes it is useful to associate a “unit of work” with a workload. This is a measurable quantity of work done, as opposed to the amount of system resources required to accomplish that work. For example, measure the number of transactions, backup operations, etc. rather than the CPUs, network connections, I/O channels, or disk requirements.

The next step is to establish a service level agreement between the service provider and the service customer that defines acceptable service (from the user’s perspective). Response time and/or throughput are typical performance indicators used. Other possible metrics may include the number of requests processed over a given timeframe or the number of requests processed under a given threshold.

At a minimum, the following information should be documented for the accurate provisioning of each service:

The business processes supported by the service.

The business priority for these processes.

Expected demand for this service and its seasonality (if any).

Anticipated growth in demand for this service over the next three years.

The worst response time or throughput acceptable for the service.
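
The minimum information listed above could be held in a simple record and checked against measurements. The following is a minimal sketch, not part of the handbook's method: the class name, field names, priority scale and the compound-growth assumption are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceLevelRequirement:
    """Hypothetical record of the minimum information per service."""
    service: str
    business_processes: List[str]
    business_priority: int        # 1 = highest (the scale is an assumption)
    expected_daily_demand: float  # units of work per day
    annual_growth_rate: float     # e.g. 0.10 for 10% per year
    worst_response_time_s: float  # worst acceptable response time, seconds


def fraction_within_threshold(response_times_s, requirement):
    """Share of measured requests at or under the worst acceptable time."""
    if not response_times_s:
        raise ValueError("no measurements supplied")
    within = sum(1 for t in response_times_s
                 if t <= requirement.worst_response_time_s)
    return within / len(response_times_s)


def projected_demand(requirement, years):
    """Expected daily demand after `years`, assuming compound growth."""
    return requirement.expected_daily_demand * (
        1 + requirement.annual_growth_rate) ** years
```

For example, a service with an expected demand of 1,000 units of work per day and 10% anticipated annual growth would be sized for roughly 1,331 units per day at the end of the three-year horizon.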

Analyze current capacity

Current capacity is analyzed to qualitatively determine if user needs are being met and for the establishment of the baseline for future planning.

The first step is to compare the measurements of any items referenced in service level agreements with

their objectives. This is the most basic indication of whether the system has adequate capacity.

Next, check the usage of the various resources of the system (CPU, I/O devices, memory, or storage) for

over/under utilization and/or threshold infringements that may be currently problematic or pose issues

in the future. Quantify the utilization statistics by workload; this identifies the most demanding users of

system resources.

Lastly, determine where each workload spends its time, by analyzing the components of the response

time. This helps to identify bottlenecks and which system resources are responsible for the greatest

elapsed time.

Analysis of capacity measurement data is accomplished by:

Comparing the measurements of any items referenced in Service Level Agreements (SLA) with their processing objectives.

Reviewing the usage of the data.

Recording the resource utilization for each unit of work and determining the major process consuming each server resource.
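
The three analysis steps above can be sketched in a few small functions. This is an illustrative sketch only: the function names and the dictionary-based inputs (item name to measured value) are assumptions, not something the handbook prescribes.

```python
def sla_gaps(measured, objectives):
    """Items whose measured response time exceeds the SLA objective.

    Both arguments map an item name to a response time in seconds.
    Returns a dict of item name -> amount by which the objective is missed.
    """
    return {
        name: measured[name] - limit
        for name, limit in objectives.items()
        if name in measured and measured[name] > limit
    }


def utilization_by_workload(resource_seconds):
    """Rank workloads by their share of one resource (for example CPU).

    `resource_seconds` maps a workload name to the resource time it used.
    Returns (workload, share) pairs, most demanding workload first.
    """
    total = sum(resource_seconds.values())
    if total <= 0:
        raise ValueError("no resource usage recorded")
    shares = {name: used / total for name, used in resource_seconds.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


def dominant_component(response_time_components):
    """Name of the component that accounts for the most elapsed time."""
    return max(response_time_components, key=response_time_components.get)
```

Calling `dominant_component({"cpu": 0.2, "disk_io": 0.7, "network": 0.1})` would point at disk I/O as the likely bottleneck for that workload.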

Plan for and align with future business requirements.

Forecasting future business activity, combined with trends in historical measurements of incoming units of work such as orders or transactions, will determine future resource requirements and ensure IT effectiveness.

First, forecast what your organization will require of its IT systems going forward. Once you know what to expect in terms of incoming work, you can determine the optimal system configuration for meeting defined service levels into the future.

Activities contributing to future processing requirements include:

Expected growth in the business

Requirements for implementing new applications

Planned acquisitions or divestitures

IT budget limitations

Requests for consolidation of IT resources

Plans to implement new business process applications

Demand within new IT projects
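
Once incoming work has been forecast in units of work, it can be translated into a resource requirement. The sketch below is one simple way to do that and rests on stated assumptions: a known average CPU cost per unit of work, identical servers, and a hypothetical target utilization of 70% that leaves headroom for peaks.

```python
import math


def required_servers(forecast_units_per_hour, cpu_seconds_per_unit,
                     cores_per_server, target_utilization=0.7):
    """Estimate the number of servers needed for a forecast workload.

    All parameters are illustrative assumptions; real sizing would also
    consider memory, I/O and the response-time targets in the SLA.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    busy_core_seconds = forecast_units_per_hour * cpu_seconds_per_unit
    # Cores kept busy on average, scaled up to leave the chosen headroom.
    cores_needed = busy_core_seconds / 3600 / target_utilization
    return math.ceil(cores_needed / cores_per_server)
```

With 100,000 transactions per hour at 0.5 CPU seconds each, 8-core servers and a 70% target, the estimate comes to 3 servers.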

Storage Capacity Planning

Start with an internal SLA (Service Level Agreement) between the users of the system and the

operations group. The SLA should be clear about the expectations for growth and the expected

availability, including acceptable scheduled and unscheduled downtime.

As part of the capacity management process, you need to plan, size, and control the system so that it

always meets the minimum performance expectations in the SLA. Once the system is up and running,

incorporate regular monitoring, of both capacity and availability, into the process.

The first step to effective capacity planning in an existing system is to do an audit of your data center, to

determine just what resources you already have. It’s important to find out exactly what hardware and

how much storage capacity is actually in play.

Next, you need to determine actual utilization, which should begin with summary reports, then drill

down to more detail, in order to identify over-provisioned storage, recover wasted storage and increase

overall storage utilization. This kind of reporting is critical in order to provide the right data on historical

storage utilization, allowing precise capacity trending reports to give you an idea of future requirements.

It’s also important to categorize your data and determine where it all sits. All too often, mission-critical

data is sharing space with outdated, non-critical files. Once you locate and categorize files, you should

purge or archive unimportant files, migrating your most critical data to the most accessible, efficiently

managed storage resources. There are several tools for quickly identifying and moving files between

your various tiers of storage based on the importance of each file. The best file location policy tools are

usually those that reside in the file system, where the most detailed information about a file is located.

At this point, it often makes sense to develop tiers of storage, segmenting different levels of data based

on type and content. This goes a long way toward helping you maximize your storage capacity.

Obviously, one important benefit of storage capacity planning is cost savings. By making better use of

the storage you already possess, you can put off hardware purchases—often for a long time to come.

But, just as importantly, a good capacity planning process should result in better data management;

critical data will now be grouped and housed together in the most accessible, best-managed storage

facilities, while non-critical data will be either purged or packed away in archives. This should make

mission-critical data more available across the organization.

Five Steps for Improving Storage Capacity Planning

Getting started with capacity planning is like saving for retirement: you can't start too early. The five key steps below lay the foundation for good storage capacity planning.

1. Collect the data. Initially you need to determine a baseline with history. The fastest way to start collecting data is to buy a tool designed to do so.

The two options are to purchase a shrink-wrapped tool or create your own. At its simplest, many shops use spreadsheets to collect and store the data. However, this method can be labor-intensive when you grow beyond a couple of storage arrays and 10 or so hosts. If you are so inclined, you could also write your own collection scripts and store the data in a database. This should scale pretty well and will allow daily data points without much more effort.

Whether you buy a product or not, be sure to collect the following key elements:

On the storage infrastructure side, collect the raw, usable and allocated capacity for each tier.

For storage networking ports, count the total number of ports and the number in use.

For each server that uses non-local storage, grab the total capacity presented, the file system usable total, the file system utilized total and the database utilized by tier.

If you create your own spreadsheet, use Microsoft Excel's SLOPE function or a linear regression algorithm to determine the growth rate of each data element over time.
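
For shops that write their own collection scripts, the growth rate can be computed with an ordinary least-squares slope, which is the same quantity Excel's SLOPE function returns. A minimal sketch, assuming daily samples expressed as (day number, terabytes used); the function name is hypothetical.

```python
def growth_rate(days, used_tb):
    """Least-squares slope of used capacity over time, in TB per day."""
    n = len(days)
    if n < 2 or n != len(used_tb):
        raise ValueError("need at least two matching samples")
    mean_x = sum(days) / n
    mean_y = sum(used_tb) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("all samples share the same day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, used_tb))
    return sxy / sxx
```

Four daily samples of 10.0, 10.5, 11.0 and 11.5 TB give a growth rate of 0.5 TB per day.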

2. Define standardized metrics. As the data starts to build, it can be overwhelming. Without a plan, the new information actually creates more questions than answers. For this reason, it's important to have a few key metrics that will be used to measure the storage service as a whole, and each internal customer's usage data.

The storage service should be measured to make sure it's performing as desired. Key metrics include the total raw, usable and allocated capacity by tier over time. Using the growth rate of each tier and the total capacity available, determine the date when the current storage will be full. This helps plan storage purchases so you can take action when the growth doesn't match the budget. Also, calculate the cost of the storage service in cost per terabyte by dividing the total storage budget by the number of usable terabytes on the floor. An interesting way to combine cost and utilization into one simple number is a metric called the "effective cost of storage." This is the total storage budget divided by the utilized storage. Storage managers can deliver more value to the business by lowering the storage budget or increasing the utilization.

It’s a good idea to develop application by application utilization, growth rate and cost metrics similar to the overall infrastructure number, but specific to an application.
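
The service-level metrics in step 2 reduce to a few lines of arithmetic. The sketch below is illustrative: the function names are hypothetical, and it assumes a tier is "full" when allocated capacity reaches usable capacity under a constant daily growth rate.

```python
from datetime import date, timedelta


def date_storage_full(today, usable_tb, allocated_tb, growth_tb_per_day):
    """Projected date when a tier fills up, or None if it is not growing."""
    if growth_tb_per_day <= 0:
        return None
    days_left = max(0.0, (usable_tb - allocated_tb) / growth_tb_per_day)
    return today + timedelta(days=int(days_left))


def cost_per_usable_tb(total_budget, usable_tb):
    """Cost of the storage service per usable terabyte."""
    return total_budget / usable_tb


def effective_cost_per_tb(total_budget, utilized_tb):
    """Effective cost of storage: budget divided by utilized terabytes."""
    return total_budget / utilized_tb
```

With a 500,000 budget, 100 TB usable and 40 TB utilized, the cost per usable terabyte is 5,000 while the effective cost is 12,500, which shows how low utilization inflates the real price of the service.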

3. Meet with the stakeholders. Take the standardized metrics collected and meet with the internal application owners at least quarterly to review the metrics. Help them to see the utilization, growth rates and costs of their application. Use this as an opportunity to capture the business drivers that may affect storage growth, manually overriding the calculated growth rates when applicable.

Meet with IT management on a monthly basis to review the storage service numbers. Discuss the trends, costs and storage budget. When senior management feels informed about the business drivers affecting storage and less surprised about imminent purchases, life will be easier for everyone.

4. Set some goals. After meeting with the stakeholders, it's time to set some improvement goals. Decide how many days of storage to keep on hand. Will you buy storage once or four times per year? Set a goal for reducing the effective cost of storage; 15% per year should be attainable. Getting to that new number will be driven by whether the applications are tiered correctly and whether or not you need to increase utilization or allocation rates.

5. Make some changes and measure again. In order to meet the goals we set in step 4, some changes will be required. Reclaim the underutilized storage from applications, or maybe it's time to brush up the storage request process to ensure too much storage isn't being allocated to applications.

Storage Capacity Planning Capability Model (SCPCM)

A Storage Capacity Planning Capability Model (SCPCM) is a very helpful tool for organizations to develop a storage capacity planning process that meets their business needs. An SCPCM provides the industry a consistent model to define and communicate:

Best industry practices for lowering "utilization threshold" for earlier out-of-storage alarms

Where an IT organization's processes are now

How an IT organization can modify its existing processes

A roadmap of how to get to a desired state from an existing state

A way to measure progress

Overview of the Storage Capacity Planning Capability Model

The SCPCM provides a framework for organizing capacity planning processes into four stages that lay successive foundations for continuous process improvement. These four stages define a simple scale for evaluating a storage capacity planning process capability. These stages also help an organization prioritize its improvement efforts. The four stages are: Estimation Planning, Resource Side Planning, Integrated Application Side / Resource Side Planning, and Business Plan / Resource Side Planning.

Estimation Planning

In the Estimation Planning stage, IT concentrates its efforts on managing its storage subsystem hardware. The IT organization determines what type of storage to provide; provisions and allocates LUNs to application servers; and determines and implements data protection, backup/restore, and disaster recovery methodologies. Based on an occasional inquiry using an SRM or homegrown tool, IT has an estimate of the amount of storage currently deployed and what is currently allocated.

In this stage, the IT organization employs no formal storage capacity planning process.

Based on requests for more storage from application owners, IT administrators allocate LUNs from existing inventory. Somewhat more proactive IT organizations purchase additional inventory in advance of their educated guess on when existing inventory will run out. How far in advance is based on their "storage lead time", which is the amount of time it takes to get storage purchase requisitions developed and approved; ordered and delivered from the suppliers; and then installed, configured, and provisioned.

To determine how much storage to purchase, IT relies on one of the following mechanisms:

The business unit or application owners tell them how much additional storage they will need.

They trust their "gut instinct".

They make purchase decisions based on what was bought last year, last quarter, or last month.

Since current storage consumption and consumption growth rates are not accurately known, this stage typically exhibits very high over-provisioning as storage administrators look to avoid out-of-storage situations, thereby wasting valuable CapEx budget.

Despite overall over-provisioning, incidences of out-of-storage situations frequently occur when

individual application server volumes or storage subsystems run out of storage without the "storage

lead time" advanced notice. This over-provisioning typically results in a 30% - 40% utilization rate (the

amount of used storage divided by the amount of allocated storage). Out-of-storage emergencies are

very expensive to the IT budget as well as to the business overall.
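
The utilization rate defined above (used storage divided by allocated storage) is straightforward to compute once both figures are known. A minimal sketch with hypothetical function names and illustrative values:

```python
def utilization_rate(used_tb, allocated_tb):
    """Used storage divided by allocated storage."""
    if allocated_tb <= 0:
        raise ValueError("allocated_tb must be positive")
    return used_tb / allocated_tb


def idle_allocated_tb(used_tb, allocated_tb):
    """Allocated storage that holds no data (the over-provisioned part)."""
    return allocated_tb - used_tb
```

An environment with 100 TB allocated and 35 TB used has a 35% utilization rate and 65 TB of allocated storage sitting idle.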

Resource Side Planning

In the Resource Side Planning stage, IT semi-automates the collection of storage subsystem data and forecasts future purchases based on current subsystem inventory trends. This stage does not typically take into account trends in the demand for storage, but carefully manages resource side inventory levels to ensure that overall inventory never runs out.

At this stage, IT employs semi-automated collection of storage subsystem information. The information

collected includes:

The storage subsystem's maximum capacity.

The amount of raw physical storage that is currently installed on each subsystem.

Of the raw physical storage deployed:

o The amount usable by applications.

o The amount used for data protection (RAID, snapshots) and accounting.

o The amount used as a shared pool for thin provisioning.

Of the storage that is usable to applications, how much is actually currently allocated to applications.

The growth rate of physical storage.

As in the Estimation Planning stage, at this stage IT allocates LUNs to application servers based on requests from application owners. By trending out the inventory of storage that they allocate to applications, IT will develop an estimated date when usable storage will run out. From this date the "storage lead time" will be subtracted, providing a timeframe when additional storage will be purchased. Often storage is purchased quarterly and the amount purchased is based on the historical growth rates of allocated storage.
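
The purchase timing described above is the projected run-out date minus the storage lead time. The sketch below assumes a constant allocation growth rate taken from the inventory trend; the function and parameter names are hypothetical.

```python
from datetime import date, timedelta


def purchase_by(today, usable_tb, allocated_tb,
                allocation_growth_tb_per_day, lead_time_days):
    """Latest date to start a purchase so storage arrives before run-out.

    Returns None when allocations are not growing.
    """
    if allocation_growth_tb_per_day <= 0:
        return None
    days_until_empty = max(
        0, int((usable_tb - allocated_tb) / allocation_growth_tb_per_day))
    run_out = today + timedelta(days=days_until_empty)
    return run_out - timedelta(days=lead_time_days)
```

If the result falls before today's date, the lead time can no longer be met and an emergency purchase is likely.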

The advantage of the Resource Side Planning stage over the Estimation Planning stage is that the administrator manually monitors the storage inventory so they are less likely to encounter out-of-storage situations where no additional storage exists and emergency purchases must be made.

However, since resource monitoring is not fully automated and application consumption is not

monitored, out-of-storage situations on individual application server volumes still occur. More

importantly, IT has no idea how much of the allocated LUNs are actually being used by applications and

at what rate they are being consumed. Therefore IT has no way of assessing if the incoming storage

requests are too early, too late, or right on time. As a result, significant over-provisioning still occurs to

the tune of a 35% - 45% utilization rate.

Integrated Application Side / Resource Side Planning

At this stage, IT systematically collects, stores, and analyzes information about applications' storage consumption and growth in addition to the storage subsystem inventory described in the Resource Side Planning stage. IT can then produce storage forecasts based on application consumption and then correlate that with the storage subsystem inventory.

The Integrated Application Side / Resource Side Planning stage first analyzes the application side

consumption to understand application host volume projections to determine when, where, and what

type of storage is needed. The process then correlates application consumption analysis with the

storage resource inventory, which allows storage administrators to develop a storage capacity plan

based on how the current subsystems' inventories need to be allocated to efficiently meet the

application consumption demand.

The strength of the Integrated Application Side / Resource Side Planning stage is that it can successfully

answer the five fundamental questions that must be answered in order to run an efficient storage

capacity planning process:

1. When additional storage is needed - deploying too soon causes unused storage to depreciate on the shelf, and it does not take advantage of the 25% per year storage hardware price declines.

2. How much storage is required - how much storage is required to stay within the target utilization levels.

3. Where storage is needed - specifically which volumes are projected to approach utilization thresholds and require additional allocated storage.

4. What type of storage is appropriate - whether expensive high end storage is required, or whether lower priced tier 2 or 3 storage can fulfil the applications' requirements.

5. Where will the additional storage come from - the required storage can come from existing unallocated storage, from the purchase of additional disk drives in existing subsystems, or from the purchase of new subsystems.

The Integrated Application Side / Resource Side Planning process is outlined in the following steps:

1. Application side consumption data should be collected at least daily from all volumes on all servers in the IT environment. This data should include each volume's total storage allocation, total storage used, and total free storage. The data should be stored in a historical database for analysis.

2. The IT organization must define the following storage capacity policies:

- "Utilization threshold": balances the avoidance of out-of-storage situations, hardware cost, and performance considerations. Typical target thresholds are 70% - 80% for most applications.

- "Storage lead time": the time it takes to purchase, deliver, install, configure, and allocate new physical storage.

- "Storage increment": the amount of storage that should be added to a volume when it approaches its utilization threshold. This amount should avoid allocating/purchasing storage too often and allocating/purchasing storage too early.

These policies can be unique for different groups or classes of volumes, or standard policies that apply to all volumes in the entire IT storage infrastructure.

3. The application side consumption data gathered in the first step should now be used to determine the forecasted growth of each and every volume using advanced statistical analysis. Once growth for every volume is forecasted, it must be compared to the utilization threshold to determine when in the future the threshold will be violated. This will result in a list of volumes and the dates when each volume's growth will violate the utilization threshold. From these dates, the storage lead time should be subtracted to determine when storage needs to be ordered so that it is allocated to the volume by the time its utilization hits the threshold. The storage increment then determines the amount of storage to provide to the volume.

4. Each volume that requires additional storage supports an application. Each application within the company has different availability and performance requirements which dictate the type of storage required for each application. By correlating the applications' requirements to the volumes that support them, the type of storage (i.e. expensive tier 1 or less expensive tier 2 or tier 3) that each volume requires can then be determined.

5. With the information determined in the first four steps, the application demand for additional storage is well understood. Now, the application demand must be correlated with the storage supply to understand where the additional storage will come from. The following analysis should now be used:

- Compare the additional storage requirements to the inventory of unallocated storage in existing subsystems. At the appropriate time, allocate the unused inventory to the appropriate volumes. Since there are no purchase and delivery steps, the storage lead time for this case can be shortened.

- If unallocated storage doesn't cover growing application demand, then the next step is to look at the storage subsystem maximum capacities and unused slots to determine if additional disk drives can be added to existing frames to provide the required storage. This may be the most cost effective purchase option.

- If the above two steps still do not provide the necessary storage, then new storage subsystems will have to be ordered. Often a portion of the application data must be migrated from the existing subsystem to the new one.

6. With an ongoing capacity planning process in place, a storage infrastructure budget can be accurately determined and refined as conditions change. Work force plans to purchase, install, configure, and allocate storage can now be defined and adjusted to form a living "To Do" list for the coming quarters.
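
Steps 2, 3 and 5 of the process above can be sketched as follows. This is a simplified illustration rather than the handbook's implementation: it assumes each volume grows linearly at a rate already derived from its history (the text calls for more advanced statistical analysis), and every class, field and function name is hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Policy:
    utilization_threshold: float  # e.g. 0.75 for a 75% threshold
    lead_time_days: int           # purchase-to-allocation time
    increment_gb: float           # storage added per threshold event


@dataclass
class Volume:
    name: str
    allocated_gb: float
    used_gb: float
    growth_gb_per_day: float      # taken from the historical trend


def plan_volume(volume, policy, today):
    """Return (threshold_date, order_date, add_gb), or None if not growing."""
    if volume.growth_gb_per_day <= 0:
        return None
    limit_gb = volume.allocated_gb * policy.utilization_threshold
    days = max(0, math.ceil(
        (limit_gb - volume.used_gb) / volume.growth_gb_per_day))
    threshold_date = today + timedelta(days=days)
    order_date = threshold_date - timedelta(days=policy.lead_time_days)
    return threshold_date, order_date, policy.increment_gb


def choose_source(required_gb, unallocated_gb, free_slot_capacity_gb):
    """Pick the supply option in the order described in step 5."""
    if required_gb <= unallocated_gb:
        return "unallocated"      # allocate from existing inventory
    if required_gb <= unallocated_gb + free_slot_capacity_gb:
        return "add_drives"       # add disk drives to existing frames
    return "new_subsystem"        # order a new storage subsystem
```

Running `plan_volume` over every volume yields the list of volumes and dates that step 3 describes, and the summed increments can then be passed to `choose_source` to decide where the storage comes from.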

At this stage the IT organization can start looking at this complete process as the storage

supply chain, a systematic inventory management process for its most consumable asset - storage. This

supply chain model incorporates application demand consumption analysis; physical storage supply and

unused and unallocated inventory costs; the cost of data protection, backup and restore, and disaster

recovery methodologies; storage tiering strategies; cost of capital; human resource management; and

timetables for purchasing and deployment.

Business Plan / Resource Side Planning

The Business Plan / Resource Side Planning stage is exactly like the Integrated Application Side / Resource Side Planning stage with one exception. Instead of performing application demand forecasting from historical storage application trends, storage demand forecasting is based on Key Business Metrics (KBMs) from the company's business plan.

Using historical data, key business metrics are correlated against storage usage. Once correlation

algorithms are determined, storage forecasts can be developed by using KBM values from next year's

business plan.

For example, a retail chain correlates the number of new stores it opened last year and last year's

revenue to storage usage. From that data, storage forecasts can be generated based on next year's

planned new store openings and revenue projections.
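
The correlation step can be illustrated with a simple linear fit of storage usage against a single KBM. This is a sketch under stated assumptions: the relationship is taken to be linear, only one metric is used (combining several, such as store openings and revenue together, would need multiple regression), and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys against xs; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the KBM values do not vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast_storage(kbm_history, storage_history_tb, planned_kbm):
    """Forecast storage (TB) for a planned KBM value from past pairs."""
    slope, intercept = fit_line(kbm_history, storage_history_tb)
    return intercept + slope * planned_kbm
```

If 10, 20 and 30 stores historically corresponded to 50, 70 and 90 TB, a plan for 40 stores would forecast 110 TB.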

End of Document