Addressing UNIX and NT server performance - IBM


Addressing UNIX and NT server performance

IBM Global Services

“Despite planning and testing before big online events, Web site managers are being blindsided by traffic beyond their wildest expectations.”

Stacey Collett, “The Glitch that Stole Christmas?,” Computerworld, November 15, 1999.

Key Topics

Evaluating server performance

Determining responsibilities and skills

Resolving existing performance problems

Assessing data for UNIX and NT platforms

Planning for growth

The use of Web technologies for commerce and communications has caused an economic revolution, impacting even the most sophisticated information technology (IT) professionals. This dramatic change challenges IT organizations to achieve a new level of server performance, one that enables the business to grow and respond to new technology challenges. Implementing server performance management can help an IT organization effectively gain control of exponential growth and meet performance expectations for end users.

Business results and server performance

In the UNIX® and Microsoft® Windows NT® environment, purchasing decisions and continued customer satisfaction depend heavily on server performance. If a Web application doesn’t respond quickly enough, your customer may well abandon your site and go elsewhere. Reduced productivity and loss of business are just two obvious consequences of poor server performance. Another is that systems that fall short cannot provide the necessary flexibility to respond rapidly to the changing demands of the business.

As you evaluate your current performance management capabilities, consider the following:

• What are my commitments to business units and what service level agreements (SLAs) are in place?

• Am I, as the service provider, legally bound to provide a response within a predefined time frame?


Adding hardware to address performance problems may provide relief for a short time. However, this action does not constitute a cost-effective approach to resource management.

The performance analyst’s responsibility is to understand system behavior and identify performance and availability exposures. The objective is to match system resources to application and user requirements. An ongoing challenge is dealing with requirements that continue to change during and after tuning. For example, a server may be called upon to support more users than was originally planned, or storage requirements may have grown because of changes in applications. Quickly identifying the need to add system resources or reprioritize workloads is an important skill of the analyst, whose work is an iterative process of measurement, resource allocation and resource utilization.

Assessment: The starting point

When you experience server performance problems, a performance assessment may be necessary to identify probable causes. Certain indicators highlight the need for an assessment. For example, your users may be taking extended coffee breaks throughout the day because their tasks are not completing within the desired amount of time. Nightly backup or reporting jobs may still be executing when users arrive in the morning. Queries that had previously completed in less than one second are now taking minutes to execute. Clearly, assessing the root causes of these problems is essential.

You also need to perform capacity planning assessments to avoid performance problems. Adding new technology or applications, mergers with another department or company, adding additional users or consolidating many servers into fewer systems all require an understanding of existing server performance and utilization.

Whether you are addressing existing performance problems or planning for future changes, you need to have a thorough knowledge of your IT environment. A clear understanding of how to effectively utilize servers through system tuning helps you achieve user performance expectations.

Defining responsibilities and skills

When beginning the task of server performance analysis, you should first determine who will be responsible for the overall process, then confirm what technical skills will be required for the job. Do those skills currently exist in-house? If not, is there sufficient ongoing demand to invest in the education of one or more of your employees? Does the current performance situation allow adequate training time? Many companies have found that their time and dollars are best spent by outsourcing systems management. For some, that may involve monthly or quarterly assessments. Others may turn over all system-management tasks, including performance analysis and tuning, to an outside resource.


Effective performance assessments follow a process of general guidelines for systems analysis and tuning. Although there is no book or guide that fits every situation, there are certain fundamental steps that should be taken to analyze and tune a system. Not all of these apply to every situation, and some suggestions may not be realistic for your environment. For instance, you may have diverse workloads that conflict with each other. Web servers present different tuning challenges than data servers or application servers. You need to balance resources based on the business priority. This includes users who want consistent acceptable response time, realtime business applications, and batch processes which run at predetermined times.

As you begin to form your performance-management strategy, determine your major objective. Below are some common goals:

• Instantaneous response time for e-commerce or interactive users
• Optimized performance without the purchase of additional hardware
• Batch jobs completed within a specific time frame.

Components of server performance management

System managers with experience in keeping a computer system efficiently tuned recognize the following areas as essential for success:

• Resource monitoring: Resource utilization must be monitored so performance problems can be easily detected, either before they occur or immediately thereafter.

• Analysis and control: Once a performance problem is suspected, the proper tools must be selected and applied so that the nature of the problem can be understood and the appropriate corrective action taken.

• Capacity planning: Long-term capacity requirements must be analyzed so that sufficient resources can be acquired on a just-in-time (JIT) basis for maximum cost and utilization efficiency (for example, will your system be ready for the next wave of Internet activity?).

“Forrester Research estimates that training will cost $6,000 to $8,000 per person annually and that a well-trained IT staff will spend about four weeks a year in training.”

Patricia Schnaidt, “New Skills Prevent IT Brain Drain,” Network Computing, October 30, 1999.


Since performance management is an ongoing process, the initial goal should be both simple and quantifiable. A goal such as “good performance” or “making the system run faster” is neither simple nor quantifiable. Making a batch job run in five minutes, or decreasing the response time of a Web page from two seconds to less than one second, is a quantifiable goal. By following a tuning methodology, you will learn whether or not your stated goals are attainable with current, available resources. You may find that you have conflicting or unachievable performance goals.

That’s why it’s important to manage expectations during this goal-setting period. What you uncover can change the project’s outcome, as well as your goals.
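One way to make a goal such as “response time under one second” checkable is to compare a percentile of measured response times against the stated limit. The following is a minimal Python sketch and not part of the original paper; the `Goal` class, the `goal_met` function and the choice of the 95th percentile are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Goal:
    """A quantifiable goal: a measured time must stay at or below a limit."""
    name: str
    limit_seconds: float


def goal_met(goal: Goal, samples: Sequence[float], percentile: float = 95.0) -> bool:
    """Return True if the given percentile of the samples is within the limit.

    The 95th-percentile default is an illustrative assumption, not a
    recommendation taken from the paper.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    # Nearest-rank percentile: the smallest sample that has at least
    # `percentile` percent of all samples at or below it.
    rank = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    return ordered[rank - 1] <= goal.limit_seconds


# Hypothetical measurements, in seconds, for a Web page.
web_goal = Goal("Web page response time", limit_seconds=1.0)
print(goal_met(web_goal, [0.4, 0.5, 0.6, 0.7, 0.9]))
```

Stating the goal as data (a name and a limit) keeps the same check usable for a batch window, for example a limit of 300 seconds for a job that must finish in five minutes.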

Defined goals and objectives help you more effectively use your skills and tools. The monitoring method used should fit the server’s place in your organization’s overall IT strategy.

• A non-critical server may only need occasional monitoring using tools that come with the operating system or tools which are available in the public domain.

• Servers critical to your enterprise (those which support key business functions) may require constant monitoring with a suite of sophisticated tools that generate realtime statistical data.

Factors that can impact your monitoring method decision are:
• How quickly does performance degradation need to be addressed?
• Who on your staff is experienced and available?
• What is your budget?

Resolving existing performance problems

The following flowchart and discussion provide a systematic approach to performance analysis and tuning for both UNIX and NT systems.

• System CPU

If you determine that the system is CPU-bound, you have several options. Although a faster processor provides some relief for an under-powered system, it may not be economically feasible.

It may be advantageous to spread the workload across two or more CPUs to obtain the required throughput and response times. Whenever possible, the resources should be used in parallel, rather than serial. This configuration can range from separate, standalone systems to tightly coupled parallel-processor systems. Spreading your databases over a number of servers on a network may also provide relief.


[Figure: Performance tuning methodology. Flowchart of four bottleneck checks and their corrective actions. CPU bottleneck: add CPU, reduce load, tune application. Memory bottleneck: add/tune memory, reduce load, tune application. Disk bottleneck: add disk, reorganize data, delete data. Network bottleneck: add capacity, reduce load. If no bottleneck is found: evaluate other factors.]

When upgrading from a single processor system to a multiprocessor (MP) system, make sure you either benchmark the application on the proposed hardware or obtain a reference account that is actually running the application in the environment under consideration. Some applications may actually run slower in an MP environment.

Another way to reduce the instantaneous CPU resource requirement is to schedule non-critical applications to run when the system is more lightly loaded. Running less important work in the background allows the more important foreground processes to have better access to the CPU.
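Choosing when the system is “more lightly loaded” can be based on measured data rather than guesswork. The sketch below is illustrative only: it assumes you already have 24 hourly averages of CPU busy percentage (for example, collected with a tool such as sar), and the function name `quietest_window` and the sample profile are hypothetical.

```python
from typing import Sequence


def quietest_window(hourly_cpu_busy: Sequence[float], window_hours: int) -> int:
    """Return the start hour (0-23) of the quietest run of consecutive hours.

    The window may wrap around midnight, so a 2-hour window can start at 23.
    """
    if len(hourly_cpu_busy) != 24:
        raise ValueError("expected 24 hourly samples")
    if not 1 <= window_hours <= 24:
        raise ValueError("window_hours must be between 1 and 24")
    best_start, best_total = 0, float("inf")
    for start in range(24):
        total = sum(hourly_cpu_busy[(start + i) % 24] for i in range(window_hours))
        if total < best_total:
            best_start, best_total = start, total
    return best_start


# Made-up profile: busy during office hours, quiet between 01:00 and 04:00.
profile = [20, 5, 5, 5, 20, 30, 40, 60, 80, 90, 90, 85,
           80, 85, 90, 85, 80, 70, 50, 40, 30, 30, 25, 20]
print(quietest_window(profile, 3))
```

The returned hour is only a candidate start time for non-critical jobs; it still has to be checked against backup windows and other scheduled work.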


Applications can also have a substantial impact on performance. If a user has access to source code and has a good working knowledge of the operating system and application, that user may be able to improve system performance dramatically by finding hot spots or inefficient code and making the necessary corrections.

• System memory

If your problem is not related to CPU overload, the next step is to check available memory. Lack of sufficient real memory causes excessive paging or “thrashing.” This additional paging activity can cause the disk devices to be “busy,” resulting in more time spent waiting for I/O. If you cannot eliminate the thrashing through tuning, then consider installing more real memory.

Even though most UNIX systems are known to consume most of real memory, this is not necessarily an indication of a memory constraint. If your system is constrained by memory, either tune the virtual memory management subsystem or add additional memory. Remember to take into account that some application designs have a significant impact on memory utilization.

• System disk utilization

When you determine that memory is adequate, the next area to analyze is the use of disk resources. An unbalanced load on disk drives can cause one or more drives to be “busy” a large percentage of the time while others sit idle. This disparity often causes a bottleneck that throttles the I/O throughput and slows response times. The user should organize the data so that the normal access patterns equally utilize all available drives. If the logical volumes, file systems or files are fragmented, the time required for the disk to access the data may be much longer than would otherwise be necessary.

For UNIX systems, one common gauge of disk utilization is the time the drive is active (%tm_act field of iostat). On a UNIX system, if time active exceeds 35 percent, the system may be I/O bound. On an NT system, if the drive activity exceeds 80 percent and the Current Disk Queue Length is greater than two, then the system may be I/O bound.
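The UNIX rule of thumb above can be applied mechanically to per-disk %tm_act values. This is a minimal sketch, assuming the values have already been read from iostat output into a dictionary; the function name `busy_disks` and the disk names are hypothetical, and the 35 percent limit is the figure quoted in the text.

```python
from typing import Dict, List

UNIX_TM_ACT_LIMIT = 35.0  # percent; the rule-of-thumb figure from the text


def busy_disks(tm_act_by_disk: Dict[str, float],
               limit: float = UNIX_TM_ACT_LIMIT) -> List[str]:
    """Return the names of disks whose %tm_act exceeds the limit, sorted by name."""
    return sorted(disk for disk, pct in tm_act_by_disk.items() if pct > limit)


# Hypothetical %tm_act readings for three disks.
print(busy_disks({"hdisk0": 62.0, "hdisk1": 4.5, "hdisk2": 35.0}))
```

A disk on this list is a candidate for closer inspection, not proof of a bottleneck; the text says only that the system may be I/O bound.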

• Network utilization

If you determine that system resources are not overutilized, your next step is to enlist the help of a network analyst. A detailed analysis of your network can identify design and traffic pattern problems that are causing poor performance. Your server performance may even be affected by problems that exist in unrelated areas of the network.
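The order of checks discussed above (CPU, then memory, then disk, then network) and the corrective actions named in the tuning methodology can be written down as a small decision function. This is a sketch for illustration; the names `ACTIONS`, `CHECK_ORDER` and `next_actions` are hypothetical, and deciding whether each bottleneck exists is left to the measurements described elsewhere in this paper.

```python
from typing import Dict, List, Tuple

# Corrective actions per resource, as listed in the tuning methodology.
ACTIONS: Dict[str, List[str]] = {
    "cpu": ["add CPU", "reduce load", "tune application"],
    "memory": ["add/tune memory", "reduce load", "tune application"],
    "disk": ["add disk", "reorganize data", "delete data"],
    "network": ["add capacity", "reduce load"],
}

# Order in which the discussion examines the resources.
CHECK_ORDER = ["cpu", "memory", "disk", "network"]


def next_actions(bottlenecks: Dict[str, bool]) -> Tuple[str, List[str]]:
    """Return the first bottlenecked resource and its candidate actions."""
    for resource in CHECK_ORDER:
        if bottlenecks.get(resource, False):
            return resource, ACTIONS[resource]
    return "other", ["evaluate other factors"]


print(next_actions({"cpu": False, "memory": True, "disk": True}))
```

Returning only the first bottleneck reflects the iterative nature of tuning: fix one constraint, measure again, and then re-evaluate the rest.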


Assessment data for UNIX platforms

When approaching a UNIX system performance problem, make use of these standard tools, which are available for performance monitoring:

vmstat - Reports virtual memory, CPU and process scheduling statistics
iostat - Reports CPU statistics and input/output statistics for TTY, disks and CD-ROMs
ps - Shows current status of threads and processes
sar - Provides CPU statistics on a per-processor basis, as well as numerous other statistics regarding system activity

Public domain tools available from ftp sites include top, monitor and several others.

The vmstat command reports statistics about processes, virtual memory, disks, faults and CPU activity. The command line syntax is similar to that of the iostat command. The most important columns in the vmstat reports are:

r - Size of the run queue
pi - Number of pages paged in from paging space
po - Number of pages paged out to paging space
us - Percent of user CPU time
sy - Percent of system CPU time
id - Percent of CPU idle time
wa - Percent of CPU idle time while there is pending local disk I/O


Vmstat output will differ, depending on which UNIX platform you are using. These examples are from an AIX® operating system. Even though the command line options may differ and the columns may have different headings, the same basic information is available from the various platforms.
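Because column headings vary by platform, a script that reads vmstat output is more robust if it maps values to whatever header line it finds rather than relying on fixed positions. The following sketch does that and then summarizes the columns described above. The sample text is made up for illustration in an AIX-style layout, and the function names are hypothetical; check the layout against real output from your own platform before relying on it.

```python
from typing import Dict, List

# Made-up sample in an AIX-style layout, for illustration only.
SAMPLE = """\
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  0 22478  1677   0   0   0   0    0   0 188 1380 157 57 32  0 10
 2  1 22506  1609   0   2   4   0    0   0 214 1476 186 48 37  0 16
"""


def _is_number(text: str) -> bool:
    try:
        float(text)
        return True
    except ValueError:
        return False


def parse_vmstat(text: str) -> List[Dict[str, float]]:
    """Parse vmstat-style text into one {column: value} dict per data row."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if header is None:
            # The column header is the first line naming the CPU columns.
            if {"us", "sy", "id"} <= set(fields):
                header = fields
            continue
        if len(fields) == len(header) and all(_is_number(f) for f in fields):
            # If a heading repeats (here "sy" appears under both faults and
            # cpu), the later column wins, which is the CPU percentage.
            rows.append(dict(zip(header, (float(f) for f in fields))))
    return rows


def summarize(rows: List[Dict[str, float]]) -> Dict[str, float]:
    """Average run queue and CPU busy (us + sy), plus total pages in and out."""
    n = len(rows)
    return {
        "avg_run_queue": sum(r["r"] for r in rows) / n,
        "avg_cpu_busy": sum(r["us"] + r["sy"] for r in rows) / n,
        "pages_in_out": sum(r["pi"] + r["po"] for r in rows),
    }


print(summarize(parse_vmstat(SAMPLE)))
```

The summary deliberately applies no thresholds, since the text gives none for vmstat; it only condenses the columns an analyst would look at first.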

For iostat output, the most important fields to monitor are described as follows:

%tm_act - Percent of time each physical volume is busy
Kbps - Kbytes per second transfer rate
Kb_read - Kbytes read during the interval
Kb_wrtn - Kbytes written during the interval

The %tm_act, Kb_read, and Kb_wrtn fields of the first report will give an indication of the overall disk load balancing. Iostat output can differ, depending on which UNIX platform you are using. Even though the command line options may vary and the columns may have different headings, the same basic information is available from the various platforms.

Since iostat is relatively easy to use and interpret, it provides a sound basic analysis tool for the I/O subsystem. The utility is used for monitoring system I/O device loading. Reports generated provide valuable information to help in modifying system configurations. Making these modifications improves performance by balancing the I/O load between physical disks.

Iostat is normally run for a predetermined number of iterations at a user-defined interval. The first report generated by the iostat command provides cumulative statistics from the time when the system was booted to the present. Each subsequent report covers the time since the previous report. This information is updated at regular intervals by the kernel.
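Because the first report covers the time since boot and later reports cover only their own interval, it helps to keep the two apart when judging load balance. The sketch below assumes each report has already been reduced to a dictionary of %tm_act values per disk; the function names and the disk names are hypothetical.

```python
from typing import Dict, List, Tuple

Report = Dict[str, float]  # disk name -> %tm_act


def split_reports(reports: List[Report]) -> Tuple[Report, List[Report]]:
    """Separate the first (since-boot) report from the interval reports."""
    if not reports:
        raise ValueError("no reports supplied")
    return reports[0], reports[1:]


def average_tm_act(interval_reports: List[Report]) -> Report:
    """Average %tm_act per disk across the interval reports."""
    samples: Dict[str, List[float]] = {}
    for report in interval_reports:
        for disk, tm_act in report.items():
            samples.setdefault(disk, []).append(tm_act)
    return {disk: sum(v) / len(v) for disk, v in samples.items()}


def imbalance(tm_act_by_disk: Report) -> float:
    """Spread, in percentage points, between the busiest and idlest disk."""
    values = list(tm_act_by_disk.values())
    return max(values) - min(values)


# Made-up data: one since-boot report followed by two interval reports.
reports = [
    {"hdisk0": 2.0, "hdisk1": 1.0},
    {"hdisk0": 60.0, "hdisk1": 4.0},
    {"hdisk0": 50.0, "hdisk1": 6.0},
]
since_boot, intervals = split_reports(reports)
print(imbalance(since_boot), imbalance(average_tm_act(intervals)))
```

In this made-up data the long-run picture looks balanced while the current intervals do not, which is the kind of difference that mixing the two report types would hide.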


Assessment data for NT platforms

In the NT environment, make use of the Microsoft Windows NT Performance Monitor. This tool is included as part of the operating system, and contains enough functionality to perform a mid-level performance analysis of the operating system.

The Performance Monitor provides a common graphical user interface to perform analysis on all major subsystems. This tool has three different views: chart, log, and report. The chart view provides live performance monitoring or the playback of a performance log file. The log view is the most important tool because it allows data to be saved and analyzed later. The report view is useful for obtaining averages of performance statistics over a monitoring period.

There are three parts to NT server performance metrics:

Objects - Memory, physical disk and processor are a few of the objects on NT.
Counters - This refers to a statistic of an object, such as Pages/Sec, which is a memory-object statistic that refers to the number of pages read from or written to disk per second.
Instances - Processors and disks are physical parts of the operating system that have multiple instances. If there are two processors on the system, then the processor object will have two instances.
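The relationship between objects, counters and instances can be modeled as a small data structure. The sketch below is illustrative only: the `CounterRef` class and `per_instance` helper are hypothetical, the backslash-separated layout with the instance in parentheses follows the general form of Windows performance counter paths, and the counter names are written as they appear in this paper, so the exact spelling on a given system is an assumption to verify.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CounterRef:
    """One statistic: an object, a counter on it, and an optional instance."""
    obj: str
    counter: str
    instance: Optional[str] = None

    def path(self) -> str:
        """Render as \\Object(Instance)\\Counter, omitting an absent instance."""
        inst = "(" + self.instance + ")" if self.instance is not None else ""
        return "\\" + self.obj + inst + "\\" + self.counter


def per_instance(obj: str, counter: str, instances: List[str]) -> List[CounterRef]:
    """Build one reference per instance, e.g. one per processor."""
    return [CounterRef(obj, counter, name) for name in instances]


# Memory has no instances; a two-processor system has two processor instances.
print(CounterRef("Memory", "Pages/Sec").path())
for ref in per_instance("Processor", "Processor Time%", ["0", "1"]):
    print(ref.path())
```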

• CPU bottleneck

Several important objects and counters should be observed when analyzing system CPU resources. The system object contains many important systemwide counters, such as Total Processor Time%. Microsoft suggests a CPU bottleneck may exist if the Total Processor Time% consistently is greater than 80 percent, and the Processor Queue Length Counter is consistently greater than two. To determine if the workload is being spread evenly across all of the processors of a multiple processor system, use the Processor Object’s Processor Time%. The following graphic gives an example of this statistic.


• Memory bottleneck

To determine if a memory bottleneck exists, several key statistics should be measured. For example, Microsoft suggests that if the memory object’s Pages/Sec is consistently over 20 Pages/Sec, then a memory bottleneck may exist.

• Disk bottleneck

A disk bottleneck may exist if the Physical Disk Object’s Disk Time% exceeds 80 percent and the Current Disk Queue Length consistently exceeds two.
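The three NT guidelines above all depend on a counter being “consistently” over a threshold, which only makes sense across a series of samples. The sketch below applies them to logged counter values. The thresholds (80 percent, queue length of two, 20 Pages/Sec) come from the text; the dictionary keys, the function names, and the reading of “consistently” as at least 90 percent of samples are assumptions made for illustration.

```python
from typing import Dict, Sequence


def consistently_above(samples: Sequence[float], threshold: float,
                       fraction: float = 0.9) -> bool:
    """True if at least `fraction` of the samples exceed the threshold.

    Treating "consistently" as 90 percent of samples is an assumption.
    """
    if not samples:
        return False
    above = sum(1 for value in samples if value > threshold)
    return above / len(samples) >= fraction


def nt_bottlenecks(m: Dict[str, Sequence[float]]) -> Dict[str, bool]:
    """Apply the CPU, memory and disk guidelines to logged counter samples."""
    return {
        "cpu": consistently_above(m["total_processor_time_pct"], 80)
        and consistently_above(m["processor_queue_length"], 2),
        "memory": consistently_above(m["pages_per_sec"], 20),
        "disk": consistently_above(m["disk_time_pct"], 80)
        and consistently_above(m["current_disk_queue_length"], 2),
    }


# Made-up samples: busy CPU with a queue, light paging, busy disk with no queue.
metrics = {
    "total_processor_time_pct": [85] * 10,
    "processor_queue_length": [3] * 10,
    "pages_per_sec": [5] * 10,
    "disk_time_pct": [90] * 10,
    "current_disk_queue_length": [1] * 10,
}
print(nt_bottlenecks(metrics))
```

A True result means only that a bottleneck may exist, in line with the wording of the guidelines; it is a prompt for further analysis rather than a diagnosis.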

Planning for growth

When approaching system performance from a proactive viewpoint, you should implement a method for viewing all of your server resources. Ideally, you want some automated indicators to show when resources are overutilized, well-utilized or underutilized. This will assist with your day-to-day and long-term planning. You may implement an off-the-shelf product that provides problem alerting as well as historical trending information. If your requirements are unique, you may choose to develop custom reporting using output from available system monitoring tools. If you are looking for assistance in interpreting system information and planning, one option is to outsource server trending and reporting. Whatever method you choose, having this information available can help you better utilize your existing resources and more effectively plan for growth.


Features to consider:
• Realtime alerts
• Diagnostic analysis
• Recommendations for improvement
• Graphical views
• Trending information
• Multiplatform support
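For the custom-reporting route described above, two simple building blocks are an indicator that labels a resource as overutilized, well-utilized or underutilized, and a trend line that estimates when utilization will reach a limit. The sketch below shows both. The 30 and 80 percent boundaries, the function names and the sample history are hypothetical values chosen for illustration, and a straight-line fit is only one of several possible trending methods.

```python
from typing import Optional, Sequence


def classify(utilization_pct: float, low: float = 30.0, high: float = 80.0) -> str:
    """Label a utilization figure; the boundaries are illustrative assumptions."""
    if utilization_pct > high:
        return "overutilized"
    if utilization_pct < low:
        return "underutilized"
    return "well-utilized"


def periods_until_limit(history: Sequence[float], limit_pct: float) -> Optional[float]:
    """Estimate how many more periods until a least-squares trend hits the limit.

    `history` holds one utilization figure per period (for example, monthly
    averages), oldest first. Returns None if utilization is flat or falling.
    """
    n = len(history)
    if n < 2:
        raise ValueError("at least two data points are required")
    mean_x = (n - 1) / 2.0
    mean_y = sum(history) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    slope = covariance / variance
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing = (limit_pct - intercept) / slope  # period index where the trend meets the limit
    return max(0.0, crossing - (n - 1))         # measured from the latest period


# Made-up monthly CPU utilization averages, oldest first.
monthly = [40, 45, 50, 55, 60]
print(classify(monthly[-1]), periods_until_limit(monthly, 80))
```

The estimate is a planning aid: it shows roughly how much lead time remains for acquiring resources, assuming recent growth continues unchanged.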

Summary

As your server environment evolves, it is important to continually assess system resources. This paper has described how to effectively access the information you need to identify the causes of performance problems. Consolidating this information across the enterprise can help you plan for growth and change, and better leverage skills and resources. Although there is a cost of implementing server performance tuning, whether you choose to train internal staff or outsource, the cost of lost productivity, user dissatisfaction and the possibility of lost customers will far outweigh the expense.

For more information

For more information on IBM Performance Management and Capacity Planning Services, call 1-800-426-4682 (in the U.S.), 919-301-4141 (from outside the U.S.), or e-mail us at: [email protected].


© Copyright IBM Corporation 2000

IBM Global Services
Route 100
Somers, NY 10589
U.S.A.

Produced in the United States of America
06-00
All Rights Reserved

IBM, AIX and the e-business logo are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both.

Microsoft, Windows and Windows NT are trademarks of Microsoft Corporation in the United States, other countries, or both.

UNIX is a registered trademark in the United States and other countries, licensed exclusively through The Open Group.

Other company, product and service names may be trademarks or service marks of others.

References in this publication to IBM products and services do not imply that IBM intends to make them available in all countries in which IBM operates.

IBM Integrated Technology Services organization in the United States, part of IBM Global Services, design and development of services offerings, has successfully achieved registration to the ISO 9001, 1994 international quality standard.

G563-0338-00