Avoiding Chaos: Methodology for Managing Performance in a Shared Storage Area Network Environment
IBM Global Services – MSS
© 2005 IBM Corporation
Avoiding Chaos: Methodology for Managing Performance in a Shared Storage Area Network Environment
Brett Allison
July 25-29, 2005 New Orleans, LA
P10
MSS Performance
© 2005 IBM Corporation
Trademarks
The following terms are trademarks of the IBM Corporation:
Enterprise Storage Server® – Abbreviated: ESS
TotalStorage® Expert – Abbreviated: TSE
FAStT/DS4000/DS8000
AIX®
Other trademarks appearing in this report may be considered trademarks of their respective companies:
SANavigator and EFCM are trademarks of McDATA Corporation.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
EMC is a registered trademark of EMC Inc.
HP-UX is a registered trademark of HP Inc.
Solaris is a registered trademark of Sun Microsystems, Inc.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Table of Contents
Scope of Presentation
Chaos Definition – How do we avoid it?
What is Shared Storage?
Why Deploy Shared Storage?
What is Performance Management’s Role?
Planning Considerations
Developing a Reactive Methodology
Developing a Proactive Methodology
Developing a Predictive Methodology
Chaos and How to Avoid It?
Chaos
– A state of things in which chance is supreme
– A state of utter confusion
– A confused mass or mixture
– The result of uncoordinated actions by independent actors
How do we avoid it?
– Understand the causes of the confusion
– Plan and implement processes and tools for reducing causes
What Are the Major Benefits of SAN and Shared Storage?
Performance
Availability
Reduce Cost
What is a SAN?
[Diagram: Servers connect through Edge Switch A/B, over ISLs, to Core Switch A/B, then via links to Storage Switch A/B and the Storage Servers. What can be measured? The links at each hop.]
What is Shared on the Enterprise Storage Server?
[Diagram: ESS internals. Front end: ESCON/FICON/SCSI host adapters. Central: clusters with CPUs, cache, and NVS. Back end: SSA adapters driving RAID-5 ranks (Rank1–Rank9) built from eight-packs of disks on Loop A and Loop B. Legend: D = Data, P = Parity, S = Spare.]
How is Data Shared on the Disks?
[Diagram: Loop A disks (Rank1) across Eight Pack 1 and Eight Pack 2, showing stripes of volumes 1–6 interleaved on every disk, with the parity stripe (P) rotating across disks and a spare (S) in the rank.]
Legend
Volume 1 – Staging Server Test DB
Volume 2 – Production DB
Volume 3 – TSM Disk Pool
Volume 4 – Data Warehouse Load
Volume 5 – Production DB Log Files
Volume 6 – Production DB Index
What Role Does Performance Management Play in Shared Storage?
[Diagram: Performance Management spans four disciplines – Planning, Reactive, Proactive, and Predictive.]
Assessment and Design Considerations
[Diagram: A spectrum from Shared to Dedicated storage. The deciding factors – workload variance, response-time sensitivity, bandwidth, and budget – push a workload from shared toward dedicated storage as they move from low/small to high/large.]
A Reactive Methodology – Online Focus
[Flowchart: Host resource issue? If yes (Y), fix it on the host. If no (N), identify hot host disks, then work from host data through the SAN config and SAN performance data to the storage server performance data, and fix the constrained component.]
Identify Host Disks with High I/O Response Time – Example of AIX Server with SDD installed
------------------------------------------------------------------------
Detailed Physical Volume Stats (512 byte blocks)
------------------------------------------------------------------------
VOLUME: /dev/hdisk23 description: IBM FC 2105800
reads:             1659 (0 errs)
read sizes (blks): avg 8.0   min 8      max 8      sdev 0.0
read times (msec): avg 30.25 min 13.335 max 36.228 sdev 6.082
read sequences:    1659
read seq. lengths: avg 8.0   min 8      max 8      sdev 0.0
Gather Response Time Data ‘filemon’ (See Appendix C)
Gather LUN -> hdisk information ('lsvp -a' See Appendix D)
Hostname VG  vpath   hdisk   Location LUN SN   S Connection  Size
-------- --  ------- ------- -------- -------- - ----------- ----
server1  vg1 vpath96 hdisk23 2Y-08-02 71012345 Y R1-B4-H1-ZA 8.0
Format the data (Script - See Appendix H)
DATE     TIME SERVER NAME LUN      ESS   HDISK   # READS READ TIMES (ms) AVG READ SIZE (KB) # WRITES WRITE TIMES (ms) AVG WRITE SIZE (KB)
12/17/05 9:00 server1     71012345 12345 hdisk23 1659    30.25           4                  469      5.88             4
Drilling Down
Percentile  Read Time (ms)
85th        10.32
90th        13.72
95th        15.91
99th        29.80
Does the I/O Response time warrant further investigation?
Summarize the Read I/O Response Time by percentiles using Excel’s percentile function: =PERCENTILE(RANGE,PCT).
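Outside Excel, the same summary can be scripted; a minimal Python sketch using Excel's linear-interpolation percentile rule (the sample read times below are illustrative, not the measured filemon data):

```python
# Summarize read I/O response times by percentile, mirroring
# Excel's =PERCENTILE(RANGE, PCT) linear-interpolation behavior.
def percentile(values, pct):
    """Return the pct-th percentile (pct in 0.0-1.0) of values."""
    data = sorted(values)
    k = (len(data) - 1) * pct          # fractional rank
    lo = int(k)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (k - lo)

# Illustrative read times in ms (not the measured data)
read_times = [8.1, 9.4, 10.3, 11.0, 12.5, 13.7, 14.2, 15.9, 18.4, 29.8]
for pct in (0.85, 0.90, 0.95, 0.99):
    print(f"{int(pct * 100)}th: {percentile(read_times, pct):.2f} ms")
```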
Time     LUN      Array  Reads Avg Read RT (ms) Writes Avg Write RT (ms) Total RT (ms)
12170900 71F12345 rank40 2299  31.56            458    4.79              74742
12170900 71A12345 rank40 1924  27.65            1734   11.22             72660
12170900 71012345 rank40 1659  30.25            469    5.88              52950
Filemon sample time was at 9:00 AM. What was happening on ESS 12345 and Array rank40 at that time?
If yes, then correlate the normalized data with ESS Arrays
Correlate LUN with Array. Create a Top LUN List.
Did Contention Exist on the Storage Server for the Time Periods When the Attached Server had Contention?
Array rank40 had a large spike in activity, causing disk utilization to rise to 68% on average for the period starting at 8:45 AM and ending at 9:00 AM.
Gather ESS Physical Array Data – Appendix E
[Chart: Array rank40 average disk utilization (%) by 15-minute interval from 08:00 to 11:15, showing a spike to roughly 68% in the interval beginning at 08:45.]
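Spotting such spikes can be scripted rather than eyeballed; a minimal Python sketch, with illustrative sample values shaped like the rank40 chart (the 68% figure is from the text above, the rest is made up):

```python
# Flag 15-minute intervals whose average disk utilization exceeds
# a threshold (e.g. the 68% spike on array rank40).
def find_spikes(samples, threshold=50.0):
    """samples: list of (interval_start, pct_util) tuples.
    Return the intervals exceeding the threshold."""
    return [(t, u) for t, u in samples if u > threshold]

# Illustrative data (interval start time, % utilization)
samples = [("08.15.00", 12.0), ("08.30.00", 15.0),
           ("08.45.00", 68.0), ("09.00.00", 22.0)]
print(find_spikes(samples))  # → [('08.45.00', 68.0)]
```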
MSS Performance
© 2005 IBM Corporation
What Caused the Spike in Disk Utilization on Array rank40?
Gather LUN-level data – Appendix F
[Chart: Cache-to-disk (C2D) track transfers by 15-minute interval from 08:00 to 11:15 for each LUN on ESS 12345 (70112345 through 73A12345), with a visible spike for LUN 73912345 in the 08:45 interval.]
During the 8:45 – 9:00 AM interval there was a significant spike in cache-to-disk track transfers to LUN 73912345. The owner of the LUN was server2, and from working with the SA we found that this LUN is a TSM storage pool volume.
Fixing the Problem
1. Identify hot array
2. Quantify LUN I/O rate on array – ArrayH: LUN IOR = (R + W – CH) / Interval
3. Quantify array I/O rate delta – ArrayH: IOR – Threshold IOR = Delta IOR
4. Identify target array – (Delta IOR + ArrayT: IOR) < IOR Threshold
5. Migrate LUNs to target
Legend: ArrayH = Hot Array; ArrayT = Target Array; IOR = I/O Rate
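The arithmetic above can be sketched in a few lines; a hypothetical Python illustration (array names, rates, and the threshold are made up, not measured values):

```python
# Sketch of the hot-array fix: quantify the excess I/O rate on the
# hot array (ArrayH) and pick a target array (ArrayT) that can absorb
# it without itself crossing the site's I/O-rate threshold.
def lun_io_rate(reads, writes, cache_hits, interval_secs):
    # LUN IOR = (R + W - CH) / Interval  (back-end I/O per second)
    return (reads + writes - cache_hits) / interval_secs

def pick_target(delta_ior, candidates, threshold):
    """candidates: dict of array name -> current IOR. Return the
    arrays that stay under the threshold after absorbing delta_ior."""
    return [name for name, ior in candidates.items()
            if ior + delta_ior < threshold]

hot_ior = 900.0              # measured on ArrayH (illustrative)
threshold = 600.0            # site-specific IOR threshold
delta = hot_ior - threshold  # excess I/O rate to migrate: 300.0
print(pick_target(delta, {"rank41": 200.0, "rank42": 450.0}, threshold))
```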
ESS Analysis Considerations
Did any ESS arrays have: – High disk utilization?
– High I/O rates?
Did the ESS have SSA adapter/loop level constraints?
Did the ESS clusters have: – Poor cache hit rates?
– Low cache hold time?
– High % of NVS delays?
ESS Analysis Gotchas
Variance
Time Stamps
Expectations
Availability of Data
Lack of Config. Info.
Measurability
Getting Proactive/Predictive
Create Service Level Agreements
Educate customers
Automate collection & correlation
Analyze regularly
Create views for
– Healthcheck (ESS, Array: Appendix M)
– Trending
– Provisioning
– Capacity Planning
Appendix A - Best Practices for Performance in a Shared ESS Environment
Technology   Description
General      Spread I/O evenly across adapters and disk groups
General      Avoid placing LUNs on heavily utilized disk groups
General      Use small LUN sizes (8-16) for more granular tuning
General      Isolate source and backup volumes on separate disk groups
General      Isolate/dedicate high-bandwidth workloads (e.g. Data Warehouse)
AIX SDD/HBA  Utilize at least 4 paths for heavy workloads
AIX LV       Understand the AIX LV intra-policy of Max and how it affects placement – it spreads LV partitions across all LUNs in the VG
FS Striping  Understand implications of filesystem striping
Database(s)  If write activity is heavy (logs), segregate at the array level from other workloads
Flash Copy   Isolate disk groups/adapters for FlashCopy source and target
Appendix B: Resources
AIX Documentation – http://www-1.ibm.com/servers/aix/library/index.html
Linux iostat – http://linux.inet.hr/
HP-UX Documentation – http://docs.hp.com/
Solaris Documentation – http://docs.sun.com/app/docs
ESS Documentation:
– ESS Model 800 Performance
– IBM TotalStorage Expert Reporting: How to Produce Built-In and Customized Reports
– IBM TotalStorage Expert Hands-On Usage Guide
Appendix C - Measure End-to-End Host Disk I/O Response Time
OS        Native Tool Command/Object                     Metric(s)
AIX       filemon     filemon -o /tmp/filemon.log -O all read time (ms), write time (ms)
HP-UX     sar         sar -d                             avserv (ms)
Linux     iostat*     iostat -d 2 5                      svctm (ms)
NT/Wintel perfmon     Physical Disk                      Avg. Disk sec/Read
Solaris   iostat      iostat -xcn 2 5                    svc_t (ms)

* The iostat package for Linux is only valid with a 2.4 or 2.6 kernel. See Appendix B for links to more information.
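The table above is essentially a dispatch table; a small Python sketch that encodes it (commands and metric names copied from the table, invocation and parsing left to the reader):

```python
# Map each OS to its native disk-response-time command and the
# metric to watch in the output (taken from the Appendix C table).
DISK_RT_TOOLS = {
    "AIX":     ("filemon -o /tmp/filemon.log -O all",
                "read/write time (ms)"),
    "HP-UX":   ("sar -d",          "avserv (ms)"),
    "Linux":   ("iostat -d 2 5",   "svctm (ms)"),
    "Windows": ("perfmon",         "Physical Disk Avg. Disk sec/Read"),
    "Solaris": ("iostat -xcn 2 5", "svc_t (ms)"),
}

def tool_for(os_name):
    """Return a one-line reminder of the command and metric for an OS."""
    cmd, metric = DISK_RT_TOOLS[os_name]
    return f"{os_name}: run '{cmd}', watch {metric}"

print(tool_for("Solaris"))
```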
Appendix D: Getting LUN Serial Numbers for ESS Devices
OS                  Tool     Command               Key    Other Metrics
AIX, HP-UX, Solaris ESS Util lsvp -a               LUN SN VG, hostname, Connection, hdisk
Linux               SDD      lsvpcfg               LUN SN Device Name
Wintel              SDD      datapath query device SERIAL Device Name

Note: ESS Utilities for AIX/HP-UX/Solaris are available at: http://www-1.ibm.com/servers/storage/support/disk/2105/downloading.html
Host config. – http://www.redbooks.ibm.com/abstracts/tips0553.html
Appendix E: DB2 Query for Array Performance Data
Note: This information is relevant only if you have the TotalStorage Expert installed and access to the DB2 command line on the TSE server.
SELECT DISTINCT
A.*,
B.M_CARD_NUM,
B.M_LOOP_ID,
B.M_GRP_NUM
FROM
DB2ADMIN.VPCRK A,
DB2ADMIN.VPCFG B
WHERE (
(
A.PC_DATE_B >= '%STARTDATE' AND
A.PC_DATE_E <= '%ENDDATE' AND
A.PC_TIME_B >= '%STARTTIME' AND
A.PC_TIME_E <= '%ENDTIME' AND
A.M_MACH_SN = '%ESSID' AND
A.M_MACH_SN = B.M_MACH_SN AND
A.M_ARRAY_ID = B.M_ARRAY_ID AND
A.P_TASK = B.P_TASK
)
)
ORDER BY
A.M_ARRAY_ID, A.PC_DATE_B, A.PC_DATE_E with ur;
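The %-style placeholders in the query (%STARTDATE, %ENDDATE, %ESSID, and friends) must be filled in before the query is handed to the DB2 command line; a minimal Python substitution sketch (the parameter values are illustrative):

```python
# Fill the %PLACEHOLDER tokens in a TSE query template before
# passing it to the DB2 command line.
def fill_query(template, params):
    """Replace each %NAME token in template with params[NAME]."""
    for name, value in params.items():
        template = template.replace("%" + name, value)
    return template

template = "A.PC_DATE_B >= '%STARTDATE' AND A.M_MACH_SN = '%ESSID'"
print(fill_query(template, {"STARTDATE": "2005-07-25", "ESSID": "12345"}))
# → A.PC_DATE_B >= '2005-07-25' AND A.M_MACH_SN = '12345'
```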
Appendix F: DB2 Query for LUN Performance Data
Note: This query requires SQL access to the TotalStorage Expert for ESS.
SELECT DISTINCT
  A.M_VOL_ADDR,
  B.*
FROM
  VPVOL A,
  VPCCH B
WHERE (
  A.M_MACH_SN = '%ESSID' AND
  A.M_MACH_SN = B.M_MACH_SN AND
  A.M_LSS_LA = B.M_LSS_LA AND
  A.M_VOL_NUM = B.M_VOL_NUM AND
  B.PC_DATE_B >= '%STARTDATE' AND
  B.PC_DATE_E <= '%ENDDATE' AND
  B.PC_TIME_B >= '%STARTTIME' AND
  B.PC_TIME_E <= '%ENDTIME'
);
Appendix G: Reactive Methodology High Level Workflow
1. Customer Calls Help Desk
2. Help Desk asks customer for I/O Response Time Data
3. Analyst analyzes data
4. If a statistically significant number of I/O response times exceeds a reasonable response-time threshold, determine the shared resource constraint
-Or-
5. If there is no evidence of elevated I/O response times and no indication of shared resource issues, the problem is likely at the server infrastructure or application level
Appendix H: Format ‘lsvp –a’ and ‘filemon’ (Logic)
1. Process ‘lsvp –a’ file
Build hdisk hash with key = hdisk and value = LUN SN
Build ess hash with key = hdisk and value = last 5 chars of LUN SN
2. Process filemon file
Create hashes for each of the following values with hdisk as the key: Date, Start time, Physical Volume, Reads, Avg Read Time, Avg Read Size, Writes, Avg Write Time, Avg Write Size
3. Print data to file with headers and commas to separate fields
Iterate through hdisk hash and use the common hdisk key to index into the other hashes and printing out those that have values
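The three steps above map naturally onto dictionaries; a simplified Python illustration of the join step, assuming the raw 'lsvp -a' and filemon output have already been parsed into the shapes shown (a sketch, not the actual script):

```python
# Join 'lsvp -a' data (hdisk -> LUN SN) with filemon stats
# (hdisk -> metric values) on the common hdisk key and emit
# comma-separated rows, as in steps 1-3 of Appendix H.
def merge_rows(lun_by_hdisk, filemon_by_hdisk):
    rows = []
    for hdisk, lun_sn in lun_by_hdisk.items():
        stats = filemon_by_hdisk.get(hdisk)
        if stats:                 # print only hdisks that have values
            ess = lun_sn[-5:]     # last 5 chars of LUN SN = ESS serial
            rows.append(",".join([hdisk, lun_sn, ess] + stats))
    return rows

lun_by_hdisk = {"hdisk23": "71012345"}
filemon_by_hdisk = {"hdisk23": ["1659", "30.25", "4"]}  # reads, RT, size
print(merge_rows(lun_by_hdisk, filemon_by_hdisk))
# → ['hdisk23,71012345,12345,1659,30.25,4']
```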
Appendix I – Sample Wintel Datapath Query Output
DEV#: 0 DEVICE NAME: Disk0 Part0 TYPE: 2105F20 POLICY: RESERVE
SERIAL: 02612345
============================================================================
Path# Adapter/Hard Disk State Mode Select Errors
0 Scsi Port5 Bus0/Disk0 Part0 OPEN NORMAL 3212602 1
1 Scsi Port5 Bus0/Disk0 Part0 OPEN NORMAL 865 1
Note: The SERIAL number indicates the LUN information. The first 3 digits are the LUN number and the last 5 are the ESS serial number.
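The decoding rule in the note can be scripted in one line; a minimal Python sketch:

```python
# Decode an SDD datapath SERIAL: the first 3 digits are the LUN
# number and the last 5 are the ESS serial number (per the note).
def split_serial(serial):
    return {"lun": serial[:3], "ess": serial[-5:]}

print(split_serial("02612345"))  # → {'lun': '026', 'ess': '12345'}
```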
Appendix J: Array Level Information – VPCRK - Gotchas
Q_SAMP_DEV_UTIL – This metric is calculated and sometimes overstated
PC_RBT_AVG & PC_WBT_AVG – The documentation for both of these fields indicates Bytes. These fields actually represent Kbytes.
Q_IO_SEQ – Number of Sequential I/Os. This number is reported incorrectly. There is a patch for this issue.
Q_CL_NVS_FULL_PRCT - Cluster-level percent of total I/O requests delayed due to NVS space constraints in this time period, for the cluster with affinity to this array. This field is reported incorrectly. A patch is available, and I have verified that it no longer reports percentages greater than 100.
Appendix K: ESS Components

Component     Sub-component  Metrics
Cluster Level Cache          PCT cache hits / cache hold time
              NVS            Percent of delays caused by limitations in NVS
Front-End     FC HBA Adapter Throughput/RT – available via CLI but not feasible for continuous measurement
              CPU            No statistics
              I/O planar     No statistics
Backend       SSA Adapters   No TSE statistics; possible to roll up from array level or use CLI to get stats
              Arrays         KB Read/sec, KB Written/sec, I/O Rates, Sequential PCT, Read PCT
              Disk Drive     Calculated Response Time, Disk
LUN Level                    Logical statistics (Cache/Tracks/etc.)
Appendix L: A Process for New LUN Allocations with Performance Input

[Flow: Allocation request → identify arrays with free space → identify healthy target arrays → assign LUNs on target arrays]
Appendix M: ESS Array HealthCheck and Drill Down
Appendix N: Glossary
ISL – Inter-switch links
Edge switch – A switch that connects the end points (servers, storage servers) to the network backbone.
Core switch – A switch that is located on the network backbone
Biography
Brett Allison has been doing distributed systems performance-related work since 1997, including J2EE application analysis, UNIX/NT, and storage technologies. His current role is performance analyst for the IGS Managed Storage Services offering. MSS currently manages over 1 petabyte of data. He has developed a number of internally used performance analysis tools used by ITS/IGS. He has spoken at a previous Storage Symposium and is the author of several white papers on performance.