Hp Connect 10 06 08 V5
1
Capacity Planning for
Itanium
Paul O’Sullivan and Prem S. Sinha, PhD
PerfCap Corporation, 76-39A Northeastern Blvd., Nashua, NH 03062
www.PerfCap.com; [email protected]; 603-594-0222
2
PerfCap Corporation
• Group started within Digital/Compaq (now HP) over 21 years ago
• Operating as independent corporation since 2001
• Privately owned, zero debt
• Currently focused on Performance Monitoring, Capacity Planning and Asset Management
• 20+ years of solid engineering & development
• Worldwide presence
• HP and other resellers continue to sell it worldwide
• Partnership
– HP, IBM, SUN
– Microsoft Certified Partner
3
Some Current Customers
• Barclays UK
• Commerzbank
• Deutsche Bank UK
• SIAC
• Mary Kay
• Certegy
• Analog Devices
• Royal Bank of Scotland
• BNP Paribas (3rd largest retail bank in Europe)
– Enterprise license, unlimited use (3000+ deployed)
• ISE (largest options stock exchange)
– Enterprise license, unlimited use
• US Postal Services
– Monitoring 450 nodes
• Thomson Reuters
– Up to 45,000+
• International Papers
• Vodafone
• British Telecom
• MDS Pharmacy
• Pfizer
• Qwest
• Lockheed Martin
• Caremark
• Swedish Customs
• Netherlands Army
• CNS Dubai
• UPMC Medical Center
• UIC Medical Center
• University Hospital, Zurich
• US Dept. of Education
• SUNY Buffalo Univ.
4
Capacity Planning Endorsement
Adrian Cockcroft, winner of the A.A. Michelson lifetime achievement award at CMG 2007, wrote in his personal blog:
“The most interesting commercial tool I saw at CMG earlier this month is a capacity monitoring tool called PAWZ from PerfCap Corporation. The key thing they have worked on is taking the human out of the loop as much as possible with sophisticated capacity modeling algorithms and a simple and scalable operational model. ... The core idea is that you care about "headroom" in a service, and anything that limits that headroom is taken into account. Running out of CPU power, network bandwidth, memory, threads etc. will increase response time of the service, so monitor them all, track trends in headroom and calculate the point in time where lack of headroom will impact service response time. At eBay we used to call this the "time to live" for a service. You can easily focus on the services that have the shortest time to live, and proactively make sure that you have a low probability of poor response time.”
5
Challenges
Do More With Less
• Large number of geographically dispersed resources
• Multi-platform
• Automate the process – On a daily basis
– Collect Data
– Consolidate/Analyze Data
– Generate Performance and Capacity Reports
– Send “Need-to-Know” Exception Notification
• Information availability: anytime anywhere
– web access
6
Data Management: Hierarchical Approach
[Figure: data hierarchy with three levels (Raw Data, Key Performance Data, Risk Data) and two audience labels (Performance Analysts, Capacity Planners)]
7
[Figure: PAWZ architecture. Desktop browsers connect over the intranet to the PAWZ/FindIT server (NT/W2K).]
• Report areas shown: Networks, Storage, Events, Trending, Clusters, Real Time, Applications, Performance Reports, Daily and Weekly Health Reports, Critical Systems, Asset Location, Change Report, Configuration, Asset Reports
• Monitored platforms shown: Windows NT/2000/XP, SUN Solaris, HP-UX, IBM-AIX, OpenVMS Cluster, LINUX, Tru64 UNIX
8
PAWZ Components
• PAWZ Agent/Monitor: Resides on each node to be monitored
– Collects performance data 24x7
– Sends collected data to PAWZ Server in real time and/or once a day
• PAWZ Server: Resides on a Windows-based server and communicates with hundreds of PAWZ Agents
– Receives data from PAWZ Agents
– Processes and produces real time, daily and historical charts and reports
– Produces trend graphs for simple projections
– Runs a queuing network modeler for capacity planning
• PAWZ Browser: Resides on any corporate desktop
– Shows all reports and charts within an Internet browser
– Manages most PAWZ functions
9
PAWZ Key Functionality
• Collects performance data 24x7
• Provides real time and daily alerts based on performance thresholds
• Provides performance reports:
– Real Time
– Daily
– Historical (for trending)
• Performs Saturation Analysis every day for each node for capacity planning
• Performs Risk Analysis to detect systems that could be at risk
• Provides a consolidated data center configuration report
10
Capacity Planning
Definition: A process to determine how much computing resource is required to meet business growth
Or
How much business can grow before some device will run out of capacity
To answer “What if” questions like:
– Can my current configuration handle three times the current workload? When will my current configuration saturate?
– What will be the impact of a new application on current system performance?
– What will be the impact of upgrading a current server or adding a new server?
– Can I reduce the number of servers without violating my “Service Level Agreement” (a.k.a. server consolidation)?
11
Sizing Methods
[Figure: sizing methods plotted on Cost vs. Accuracy axes, in the order shown: Rules of Thumb, Linear Projections, Analytic Models, Simulation Models, Benchmarks, Real System]
12
Capacity Planning via Trending
[Figure: performance metric (average or peak CPU utilization) plotted against time (months January through December), with markers for Today, Capacity Limit and Remaining Capacity]
• Simple to produce and follow
• Issues
– defining the right capacity limit
– single vs. composite metric
– end user satisfaction
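The trending approach on this slide can be sketched as a least-squares line fitted to a utilization history and extended until it meets the capacity limit. This is only an illustration: the monthly figures, the 80% limit and the function names are assumptions, not PAWZ data or PAWZ code.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def months_until_limit(history, capacity_limit):
    """Months from the latest sample until the trend line reaches capacity_limit.

    Returns None when the trend is flat or falling, i.e. the limit is never reached.
    """
    xs = list(range(len(history)))
    slope, intercept = fit_line(xs, history)
    if slope <= 0:
        return None
    crossing = (capacity_limit - intercept) / slope
    return max(0.0, crossing - xs[-1])


# Hypothetical peak CPU utilization (%) for six consecutive months.
peak_cpu = [40.0, 44.0, 48.0, 52.0, 56.0, 60.0]
remaining = months_until_limit(peak_cpu, capacity_limit=80.0)
print(remaining)
```

The result depends heavily on the chosen capacity limit and on whether a single metric such as CPU utilization is representative, which are the issues the slide lists.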
13
PAWZ Planner
[Figure: response time plotted against workload, marking Current Workload, Headroom and the Saturation Point]
Where do you want to operate?
Response Time = Service Time + Queuing Time
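The relation Response Time = Service Time + Queuing Time can be illustrated with the textbook single-server M/M/1 queue, where response time is S / (1 - U) for service time S and utilization U. This is a minimal sketch under that assumption; the slides do not say which queuing formulas PAWZ Planner uses, and the service time, arrival rates and response-time target below are invented.

```python
def response_time(service_time, arrival_rate):
    """M/M/1 response time: service time plus queuing time.

    Utilization U = arrival_rate * service_time and R = S / (1 - U).
    Returns infinity at or beyond saturation (U >= 1).
    """
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        return float("inf")
    return service_time / (1.0 - utilization)


def queuing_time(service_time, arrival_rate):
    """Queuing component alone: R - S."""
    return response_time(service_time, arrival_rate) - service_time


def headroom_percent(service_time, arrival_rate, max_response_time):
    """Percent workload growth possible before R exceeds max_response_time.

    Solving S / (1 - rate * S) = R_max for rate gives
    rate_max = (1 - S / R_max) / S.
    """
    rate_max = (1.0 - service_time / max_response_time) / service_time
    return 100.0 * (rate_max - arrival_rate) / arrival_rate


# Hypothetical device: 10 ms per request, currently 50 requests per second.
for rate in (10, 50, 80, 95):
    print(rate, round(response_time(0.01, rate), 4))
print(headroom_percent(0.01, 50, max_response_time=0.05))
```

The printed points show the shape of the curve on the slide: response time stays close to the service time at low load and rises steeply as the workload approaches saturation, so the chosen operating point determines how much headroom is left.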
14
Capacity Planning via Modeling
Steps:
• Data Collection
• Identifying Peak Interval(s)
• Workload Characterization
• Model Validation
• Saturation Analysis
• “What If” Analysis
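As an illustration of the "Identifying Peak Interval(s)" step, the sketch below slides a fixed-size window over a series of utilization samples and reports the busiest window. The sample values, the window size and the function name are assumptions made for the example; the slides do not describe how PAWZ selects peak intervals.

```python
def find_peak_interval(samples, window):
    """Return (start_index, average) of the busiest run of `window` samples."""
    if window <= 0 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    window_sum = sum(samples[:window])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(samples) - window + 1):
        # Slide the window one sample to the right.
        window_sum += samples[start + window - 1] - samples[start - 1]
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / window


# Hypothetical CPU utilization samples (%), one per collection interval.
cpu = [20, 35, 50, 80, 90, 85, 40, 30]
start, average = find_peak_interval(cpu, window=3)
print(start, average)
```

The workload measured during the selected interval would then feed the later steps (workload characterization and model validation).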
15
PAWZ Planner
16
Remaining Headroom (Capacity) Trend
17
Headroom Risk Analysis
[Figure: headroom plotted against time from the current state, showing the headroom threshold, the point where headroom crosses the threshold, the point where headroom reaches 0, and the lead time before each]
• Amber status – system within lead time of dropping below headroom threshold.
• Red status – system within lead time of exhausting capacity.
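The amber and red rules can be sketched as follows, assuming headroom declines at a constant daily rate. The linear decline, the "green" label for systems that are not at risk, and all parameter names and values are assumptions for illustration; the slides do not specify how PAWZ projects headroom.

```python
def days_until(level, current_headroom, decline_per_day):
    """Days until headroom falls to `level` under a constant daily decline."""
    if current_headroom <= level:
        return 0.0
    if decline_per_day <= 0:
        return float("inf")  # headroom is flat or growing
    return (current_headroom - level) / decline_per_day


def risk_status(current_headroom, decline_per_day, threshold, lead_time_days):
    """Classify a system as 'red', 'amber' or 'green' (the last is assumed)."""
    # Red: capacity is exhausted (headroom reaches 0) within the lead time.
    if days_until(0.0, current_headroom, decline_per_day) <= lead_time_days:
        return "red"
    # Amber: headroom drops below the threshold within the lead time.
    if days_until(threshold, current_headroom, decline_per_day) <= lead_time_days:
        return "amber"
    return "green"


# Hypothetical system: 30% headroom, 20% threshold, 90-day lead time.
for decline in (0.1, 0.2, 0.5):
    print(decline, risk_status(30.0, decline, threshold=20.0, lead_time_days=90))
```

The lead time would normally reflect how long it takes to procure and deploy additional capacity, so that an amber or red status arrives early enough to act on.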
18
Risk Analysis
19
Risk Analysis
20
Risk Analysis
21
22
“What if”
PAWZ Planner has a “what-if” Capacity Planning module to forecast:
• How much can business grow before some device runs out of capacity?
• Can my current configuration handle three times the current workload? When will my current configuration saturate?
• What will be the impact of a new application on current system performance?
• What will be the impact of upgrading a current server or adding a new server?
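These what-if questions can be illustrated with a small open queueing network in which each device (CPU, disk) is treated as an independent M/M/1 station. This is a simplified sketch, not the PAWZ modeler: the per-transaction service demands, the throughput and the speed-up factor are invented values.

```python
def predict_response_time(demands, throughput, growth=1.0, speedups=None):
    """Response time of an open network of independent M/M/1 stations.

    demands    -- seconds of service per transaction at each device
    throughput -- current transactions per second
    growth     -- workload multiplier (3.0 means three times current load)
    speedups   -- optional {device: factor}; a factor of 2.0 halves the demand
    """
    speedups = speedups or {}
    rate = throughput * growth
    total = 0.0
    for device, demand in demands.items():
        demand = demand / speedups.get(device, 1.0)
        utilization = rate * demand
        if utilization >= 1.0:
            return float("inf")  # this device is saturated
        total += demand / (1.0 - utilization)
    return total


def saturation_growth(demands, throughput):
    """Workload multiplier at which the busiest device reaches 100% utilization."""
    return 1.0 / (throughput * max(demands.values()))


# Hypothetical workload: 8 ms CPU and 4 ms disk per transaction at 50 tx/s.
demands = {"cpu": 0.008, "disk": 0.004}
print(predict_response_time(demands, 50))                 # current load
print(predict_response_time(demands, 50, growth=3.0))     # three times the load
print(predict_response_time(demands, 50, growth=3.0,
                            speedups={"cpu": 2.0}))       # after a CPU upgrade
print(saturation_growth(demands, 50))
```

In this invented case the current configuration saturates before the workload triples, while a CPU twice as fast keeps the response time finite at three times the load.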
23
“What if”: CPU & Disk Upgrade
[Figure: Before and After panels]
24
Itanium Capacity Study
• Typical Study
– Capability to model a move from any platform to any other platform (e.g. Alpha to Integrity)
– Hardware:
  • Customer on Integrity Server cluster with HP-UX and Oracle
  • rx8620 (4/4/16), 64 GB memory
  • SAN
25
Itanium Capacity Study
• Alternate models considered:
– rx8640, 32 core
– p570, 32 core
– M8000, 32 core
• 3 or 4 node cluster considered
26
Itanium Capacity Study
• Reason for Study
– Expected substantial application growth
– System already peaking at CPU
– What alternate configurations would provide adequate growth of at least 200% of current workload?
• HP and non-HP configurations considered
27
Itanium Capacity Study
28
CPU by Image / Disk I/O Rate
29
CPU by Core
30
Memory vs Process Count
31
Total IO Counts
32
IO Rates
33
Disk Response Time
34
Performance Data from Benchmark
CPU Utilization 86.3%
Disk I/O Rate 1514/s
Hard Page Fault Rate 1.2/s
Memory Utilization 73%
35
Current Response Time Curve
36
Where should your system live?
37
Headroom - Current System
At peak sustained load, 9% headroom
CPU is primary resource bottleneck
Possible solutions:
• Horizontal scaling
• Integrity upgrade
• Alternate hardware platform
38
Configuration Alternatives (3 or 4 nodes)
• HP rx8620 (1.1 GHz, Itanium 2) – current configuration
• HP rx8640 (1.6 GHz, 24MB L3 cache), 16 core
• HP rx8640 (1.6 GHz, 24MB L3 cache), 32 core
• IBM p 570 (2.2 GHz, Power 5), 16 core
• IBM p 570 (2.2 GHz, Power 5), 32 core
• IBM p 570 (4.7 GHz, Power 6), 16 core
• Sun SPARC Enterprise M8000 (2.4 GHz), 16 core
• Sun SPARC Enterprise M8000 (2.4 GHz), 32 core
Configuration must support 200% workload growth
39
Response Time vs Workload Growth: 3-node RAC
[Figure: relative response time (0.0 to 2.0) against % workload growth from benchmark (-100 to 400); one curve per configuration listed below]
HP rx8620 (1.1 GHz Itanium 2), 16-core
HP rx8640 (1.6 GHz, 24MB, Itanium 2), 16-core
HP rx8640 (1.6 GHz, 24MB, Itanium 2), 32-core
IBM p570 (2.2 GHz, Power 5), 16-core
IBM p570 (2.2 GHz, Power 5), 32-core
IBM p570 (4.7 GHz, Power 6), 16-core
Sun SPARC Enterprise M8000 (2.4 GHz), 16-core
Sun SPARC Enterprise M8000 (2.4 GHz), 32-core
40
Response Time vs Workload Growth: 4-node RAC
[Figure: relative response time (0.0 to 2.0) against % workload growth from benchmark (-100 to 400); one curve per configuration listed below]
HP rx8620 (1.1 GHz, Itanium 2), 16-core
HP rx8640 (1.6 GHz, 24 MB L3 cache), 16-core
HP rx8640 (1.6 GHz, 24 MB L3 cache), 32-core
IBM p570 (2.2 GHz, Power 5), 16-core
IBM p570 (2.2 GHz, Power 5), 32-core
IBM p570 (4.7 GHz, Power 6), 16-core
Sun SPARC Enterprise M8000 (2.4 GHz), 16-core
Sun SPARC Enterprise M8000 (2.4 GHz), 32-core
41
Projection Conclusions
• CPU is the constraining resource
• Memory and disk will support 200% growth
• 4 configurations support growth:
– HP rx8640 (1.6 GHz, 24MB L3 cache), 32 core
– IBM p 570 (2.2 GHz, Power 5), 32 core
– IBM p 570 (4.7 GHz, Power 6), 16 core
– Sun SPARC Enterprise M8000 (2.4 GHz), 32 core
• Horizontal scaling to 4 nodes will not change the qualifying platforms. However, the number of cores may be adjusted.
42
Minimal Cores, 3-node RAC
[Figure: relative response time (0.0 to 1.2) against % workload growth from benchmark (-100 to 300); one curve per configuration listed below]
Sun SPARC Enterprise M8000 (2.4 GHz), 32 cores
HP rx8640 (1.6 GHz, 24MB L3 cache), 30 cores
IBM p 570 (2.2 GHz, Power 5), 26 cores
IBM p 570 (4.7 GHz, Power 6), 12 cores
43
Minimal Cores, 4-node RAC
[Figure: relative response time (0.0 to 1.2) against % workload growth from benchmark (-100 to 300); one curve per configuration listed below]
Sun SPARC Enterprise M8000 (2.4 GHz), 24 cores
HP rx8640 (1.6 GHz, 24MB L3 cache), 24 cores
IBM p 570 (2.2 GHz, Power 5), 20 cores
IBM p 570 (4.7 GHz, Power 6), 10 cores
44
Mixing 1.1 GHz and 1.6 GHz Itanium Cores
[Figure: relative response time (0.0 to 2.0) against % workload growth from benchmark (-100 to 300); one curve per configuration listed below]
rx8620 (1.1 GHz, 16 cores)
rx8620 (1.1GHz, 16 cores + 1.6 GHz, 16 cores)
rx8640 (1.6 GHz, 32 cores)
45
Minimal Number of Cores per Node Supporting 200% Growth
Platform                                  3-node   4-node
Sun SPARC Enterprise M8000 (2.4 GHz)        32       24
HP rx8640 (1.6 GHz, 24MB L3 cache)          30       24
IBM p 570 (2.2 GHz, Power 5)                26       20
IBM p 570 (4.7 GHz, Power 6)                12       10
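Because the table gives cores per node, the total number of cores in each cluster follows by multiplying by the node count. The short sketch below does that arithmetic using the per-node figures from the table; the dictionary layout and function name are illustrative, and nothing is assumed about actual licence pricing.

```python
# Minimal cores per node supporting 200% growth, keyed by cluster size (from the table).
min_cores_per_node = {
    "Sun SPARC Enterprise M8000 (2.4 GHz)": {3: 32, 4: 24},
    "HP rx8640 (1.6 GHz)": {3: 30, 4: 24},
    "IBM p 570 (2.2 GHz, Power 5)": {3: 26, 4: 20},
    "IBM p 570 (4.7 GHz, Power 6)": {3: 12, 4: 10},
}


def total_cores(per_node_by_cluster_size):
    """Total cores in the cluster for each cluster size (nodes * cores per node)."""
    return {nodes: nodes * cores for nodes, cores in per_node_by_cluster_size.items()}


for platform, per_node in min_cores_per_node.items():
    totals = total_cores(per_node)
    print(f"{platform}: 3-node total {totals[3]}, 4-node total {totals[4]}")
```

On these figures the 4-node cluster needs fewer cores per node but never fewer cores in total than the 3-node cluster, which matters wherever software cost scales with the total core count.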
46
Itanium Capacity Study
• Customer satisfied
– Had options
  • Reduce Oracle cost by reducing number of cores
  • Forecast from real data
  • Could approach vendors with confidence
• Today
– 90% of this study automated via PAWZ
  • Same graphs
  • Same results
47
Modelling Capability
• Hardware
– Alpha to Integrity
– Integrity to new models and beyond
– Other vendors to Integrity
• Software
– Increases in workload
– Optimization
– Decreases in workload
– Virtualization
48
Summary
• PerfCap offers integrated Performance Management and Capacity Planning software that is:
– Out-of-the-box (no scripting required)
– Fully automated
– Multi-Platform
– Web based
– Highly scalable
• Pricing – Independent of number and class of CPUs in a server