vSphere Management and Automation

© 2010 VMware Inc. All rights reserved vSphere Management and Automation Al Grandville VMware - Sr. Systems Engineer South Florida - Enterprise Accounts


Page 1: vSphere Management and Automation

© 2010 VMware Inc. All rights reserved

vSphere Management and Automation

Al Grandville
VMware - Sr. Systems Engineer
South Florida - Enterprise Accounts

Page 2: vSphere Management and Automation

Customer Value Journey

[Figure: Customer Value Journey across three stages (IT Production, Business Production, IT as a Service), labeled Cost Efficiency, Quality of Service, and Business Agility, with markers at 15%, 30%, 70%, and 85%.]

How?

Page 3: vSphere Management and Automation

VMware’s Current Focus: Private Cloud

Enable Self-Service
• Infrastructure as a Service
• Orchestration / workflow
• Chargeback

Automate Infrastructure & Operations Management
• Performance
• Capacity
• Configuration

Ensure Security & Compliance
• Operational best practices
• Regulatory

Streamline IT Service Management
• Problem, incident, change and configuration

Page 4: vSphere Management and Automation

Performance

Page 5: vSphere Management and Automation

1st and 2nd Generation Monitoring

Monitoring Solutions Insufficient for Performance Management

1st generation – good data collection, but alert storms

2nd generation – rules can’t adapt to change

Result:
• Performance problems often occur with no real warning
• Performance problems require time-consuming manual effort to resolve
• Virtual infrastructure is blamed for application performance problems that originate elsewhere

Example alert console output (excerpt):
3/4/08 16:45 Host 1 processingTimeServ The Processing Time Service Level on process… n/a n/a n/a
3/4/08 16:45 Host 1 Processor_Table 0 Processor 0 is at 87.0%. A CPU Bottleneck is….. n/a 0 Windows_System
3/4/08 16:44 Host 2 System_Table The number of hardware interrupts per second… n/a 0 Windows_System
3/4/08 16:30 Host 2 Processor_Table 1 Processor 1 is at 84.0%. A CPU Bottleneck is …. n/a 0 Windows_System
3/4/08 16:25 n/a responseTimeServ… The Response Time Service Level on Toadwor.. n/a n/a n/a
3/4/08 16:20 n/a processingTimeServ.. The Processing Time Service Level on Prospec.. n/a n/a n/a
3/4/08 16:08 Host 1 Ora_Sql_Hogs_Alert Oracle: SFPRD A CPU Hog has been detected n/a OraSF Oracle
3/4/08 16:08 Host 1 Ora_Sql_Hogs_Alert Oracle: SFPRD SQL with high

* New Approach – uses analytics to turn a sea of data into information

Page 6: vSphere Management and Automation

Performance Visibility Across the Virtualized Datacenter

• Full visibility up and down the datacenter stack
• Auto-detects deviations from learned baselines
• Drill into ESX server for further details
• Automatically aggregates 100s of metrics into health scores
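To make the idea of rolling many metrics into one health score concrete, here is a minimal sketch. It is not the product's actual scoring method, which the slide does not describe. It assumes a hypothetical `baselines` mapping of metric name to a learned (low, high) range, and scores health as the percentage of metrics currently inside their range.

```python
def health_score(readings, baselines):
    """Return a 0-100 score: the share of baselined metrics inside their range.

    readings:  {metric_name: current_value}
    baselines: {metric_name: (low, high)}  # hypothetical learned ranges
    Returns None when no reading has a baseline to compare against.
    """
    scored = [name for name in readings if name in baselines]
    if not scored:
        return None
    normal = sum(
        1
        for name in scored
        if baselines[name][0] <= readings[name] <= baselines[name][1]
    )
    return round(100 * normal / len(scored))


# Illustrative values only; metric names and ranges are made up.
baselines = {"cpu": (10, 70), "mem": (20, 80), "io": (0, 50), "net": (0, 40)}
readings = {"cpu": 90, "mem": 50, "io": 10, "net": 5}
score = health_score(readings, baselines)
```

A real scoring scheme would likely weight metrics differently; equal weighting is used here only to keep the sketch short.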

Page 7: vSphere Management and Automation

Anticipate Application Issues Before They Happen

• Proactive warning related to Oracle workload
• Correlated workload metrics forecast a potential breach
• Project forward future issues hours or days in advance

Page 8: vSphere Management and Automation

Where does Alive’s Analytics begin?

• Learns your dynamic ranges of “Normal” without Templates
• Doesn’t assume IT data has a normal “bell-shaped” distribution
• Accepts any time series data
• Learns patterns of behavior – hour-by-hour, day-by-day, etc.

Sophisticated, Metric-level Dynamic Thresholding
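The points above can be illustrated with a small sketch of dynamic thresholding. This is an assumption-laden illustration, not Alive's algorithm: it learns a separate "normal" range per hour of day from history and uses percentiles instead of a mean and standard deviation, so it does not assume a bell-shaped distribution. The function names and the 5th/95th percentile cut-offs are hypothetical choices.

```python
from collections import defaultdict
from statistics import quantiles


def learn_baselines(samples, low_pct=5, high_pct=95):
    """Learn a (low, high) range per hour of day from (hour, value) samples."""
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)

    baselines = {}
    for hour, values in by_hour.items():
        if len(values) < 2:
            continue  # not enough history to learn a range for this hour
        # 99 cut points; index p-1 is the p-th percentile.
        cuts = quantiles(values, n=100, method="inclusive")
        baselines[hour] = (cuts[low_pct - 1], cuts[high_pct - 1])
    return baselines


def is_abnormal(baselines, hour, value):
    """True when value falls outside the learned range for that hour."""
    if hour not in baselines:
        return False  # no learned range yet, so nothing to compare against
    low, high = baselines[hour]
    return value < low or value > high


# Illustrative history: busy at 09:00, quiet at 03:00.
history = [(9, v) for v in range(40, 61)] + [(3, v) for v in range(5, 16)]
ranges = learn_baselines(history)
```

With this history, a reading of 50 is normal at 09:00 but abnormal at 03:00, which a single static threshold could not express.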

Page 9: vSphere Management and Automation

Smart Alert™ - Alive Correlates Abnormalities Across the Application

Data sources:
• User Experience (e.g., RUM)
• App Data (e.g., Wily)
• Servers (e.g., vCenter, ITM)
• Network Data (e.g., SMARTS)
• Business Data (e.g., Finance)

Application-level Analysis: Smart Alert Generation (“When”)

Root Cause Analysis: Smart Alert Summary (“What”)

! SMART ALERT
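One way to picture cross-tier correlation is the sketch below. It is a simplified stand-in, not the Smart Alert implementation: it groups abnormalities that occur within a short time window and raises a single alert only when the group spans more than one tier. The `Anomaly` type, the 300-second window, and the two-tier minimum are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anomaly:
    timestamp: float  # seconds since some reference point
    tier: str         # e.g. "servers", "network", "app"
    metric: str


def smart_alerts(anomalies, window=300.0, min_tiers=2):
    """Group anomalies within `window` seconds of the first one in the group.

    A group becomes an alert only if it spans at least `min_tiers` tiers.
    """
    alerts = []
    group = []
    for anomaly in sorted(anomalies, key=lambda a: a.timestamp):
        if group and anomaly.timestamp - group[0].timestamp > window:
            if len({a.tier for a in group}) >= min_tiers:
                alerts.append(group)
            group = []
        group.append(anomaly)
    if group and len({a.tier for a in group}) >= min_tiers:
        alerts.append(group)
    return alerts


# Illustrative events: two related abnormalities, then an isolated one.
events = [
    Anomaly(0.0, "servers", "cpu_ready"),
    Anomaly(60.0, "network", "packet_loss"),
    Anomaly(1000.0, "app", "response_time"),
]
alerts = smart_alerts(events)
```

The isolated abnormality at the end does not raise an alert on its own, which is the intended effect: fewer, more meaningful alerts.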

Page 10: vSphere Management and Automation

What Does Alive Do With This Foundational Analysis?

Alive Smart Alert™
Alive generates preemptive “Performance Alerts” with automatic RCA
-- Houston We Have a Problem!

Alive Auto-Pilot Dashboards
Alive shows role-based information about ongoing performance
-- A Single View of the Truth!

Alive On-Demand Analytics
Alive’s behavioral and trend analysis gives powerful information for infrastructure optimization
-- Capacity Management, VM Workload Optimization

Page 11: vSphere Management and Automation

VMware Performance Management OPEX Savings

Incident Management Lifecycle Savings
• Manage/resolve incidents
• Proactive alerts reduce costs 30-40%

Change Lifecycle Savings
• Manage changes to apps/infrastructure
• “Before/after” analysis reduces change-related incidents 30-40%

Incident Management Savings
• Managing Service Desk issues (incidents)
• Manual threshold elimination reduces erroneous tickets by 50-60%

Problem Management Savings
• Closing problems after systems restored, includes root cause analysis
• Root cause analysis reduces problem closure by 30%

Page 12: vSphere Management and Automation

Customer Success: IT Operations

Solve performance issues before end users are affected and reduce total alerts

Before
• 400 critical alerts/hour
• End user complaints alerted IT to the problem
• End users impacted (avg. 2 hours/outage)
• 12 Level-2 engineers on bridge call to address problem

After
• 20 alerts/MONTH
• 3 hours advance warning of slowdown w/root cause
• NO end user impact
• 1 Level-2 Engineer and 1 DBA to address problems

Learn Normal, Smart Alerting, Root Cause

Page 13: vSphere Management and Automation

Capacity

Page 14: vSphere Management and Automation

Why do you need to manage virtual infrastructure capacity?

Spreadsheets and simple tools are no longer sufficient for managing virtual resources.

90% of VMs are over-provisioned!

Page 15: vSphere Management and Automation

Capacity Management of Virtual Infrastructure is hard!

As the creator of vSphere, VMware can give you a complete and accurate view of your available resources

• CPU Optimizations: vSMP, Shares, Reservations, Limits
• Memory Optimizations: Transparent Page Sharing, Memory Ballooning, Memory Compression
• Storage Optimizations: Thin Provisioning, Linked-Clones
• Clusters: DRS, HA, FT, vMotion, Storage vMotion
• Workload Flux: VMs growing/shrinking, added/removed

[Figure: vSphere capacity gauge labeled Reserved Capacity, Usable Capacity (marked “?”), Used Capacity, and Remaining Capacity, with “36 days remaining”.]

Page 16: vSphere Management and Automation

How VMware Simplifies Capacity Management

Deliver the right capacity at the right time!

Forecast
• When will I run out of capacity?
• What if I add, remove, or reconfigure capacity?
• Can I defer infrastructure investments?

Optimize
• How can I use my resources more efficiently?
• Which VMs should be right-sized?
• Can I reclaim over-provisioned or unused capacity?

Analyze
• What are my historical utilization trends?
• What resources have been requested vs. needed?
• How many more VMs will fit in my current VI?
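The Forecast question "When will I run out of capacity?" can be sketched with a simple trend projection. This is an illustrative simplification, not how CapacityIQ computes its estimates: it fits a least-squares line to daily usage and divides the remaining headroom by the daily growth. The function name and the sample figures are hypothetical.

```python
def days_until_full(daily_usage, capacity):
    """Estimate days until usage reaches capacity from a linear trend.

    daily_usage: one usage reading per day, oldest first.
    Returns None when there is too little data or usage is not growing.
    """
    n = len(daily_usage)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx  # growth per day
    if slope <= 0:
        return None  # flat or shrinking usage never hits the limit
    headroom = capacity - daily_usage[-1]
    return max(0.0, headroom / slope)


# Illustrative figures: usage in GB over five days, against 500 GB usable.
usage_gb = [100, 110, 120, 130, 140]
remaining_days = days_until_full(usage_gb, capacity=500)
```

A straight line ignores seasonality and step changes such as a batch of new VMs, so a forecast like this is a rough planning input at best.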

Page 17: vSphere Management and Automation

Effective Capacity Management Increases Consolidation Ratios

You could fit 3-6x more VMs in your environment and plan for the future.

Consolidation ratio: 5:1 → 30:1

Microsoft License Savings: $750,000

Source: Leading Healthcare Provider in Southern US
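As a worked example of what the 5:1 and 30:1 ratios on this slide mean for host count, the sketch below computes the hosts needed for a given number of VMs. The fleet size of 300 VMs is a made-up figure, and the calculation ignores HA headroom and other sizing constraints.

```python
import math


def hosts_needed(vm_count, vms_per_host):
    """Hosts required to run vm_count VMs at a given consolidation ratio."""
    return math.ceil(vm_count / vms_per_host)


# Hypothetical fleet of 300 VMs at the two ratios shown on the slide.
hosts_before = hosts_needed(300, 5)
hosts_after = hosts_needed(300, 30)
```

For this hypothetical fleet the host count drops from 60 to 10, a sixfold reduction that matches the upper end of the slide's "3-6x" claim.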

Page 18: vSphere Management and Automation

Change & Configuration

Page 19: vSphere Management and Automation

Change Management Must be Comprehensive to be Effective

More than 80,000 configuration variables!

VMware now provides change and remediation for the entire datacenter

Operations: OS Data, HW Data, Cron Jobs, Device Drivers, Storage (Quota, Space, File systems), Event Log Settings, File System, Networking, Processes, Registry, Services/Exported Svcs, Software Inventory, System Startup, User Services, WMI

Active Directory & Security: Accounts, Groups, Account Policy, Audit Policy, Directory Permissions, Directory Audit Settings, Event Log & (ng)Syslog config and Events, Patches, Registry Key Permissions, Service Accounts, Shares and Permissions, User Rights

Applications: Active Directory, IIS, SQL Server, Exchange, Oracle, Apache, Sendmail

VMware Infrastructure: OS & HW, DNS & Routing, File level details, Physical Network, Resource Pools, Virtual Network, Snapshot details, Storage (SAN, NFS, …), VI Capability (vMotion, DRS, …), Advanced settings, Security Profile, Logs

Page 20: vSphere Management and Automation

VMware delivers Rapid Response and Remediation

Change is #1 Reason for Downtime

1. Real-time analytics alert of impending performance degradation

2. Comprehensive change tracking isolates root cause

3. Single-click rollback to remediate and return to normal

Great! No more conference calls and fire drills!
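Steps 2 and 3 above (change tracking and rollback) can be sketched as a comparison of two configuration snapshots. This is a generic illustration, not the vCenter Configuration Manager implementation or API. The snapshot format (a flat dictionary of setting name to value) and the setting names are assumptions.

```python
def diff_config(before, after):
    """Return {setting: (old, new)} for every added, removed, or changed key.

    A missing setting is reported as None, so this sketch cannot tell a
    missing key apart from a key whose value is literally None.
    """
    changes = {}
    for key in before.keys() | after.keys():
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


def rollback(current, changes):
    """Undo the recorded changes and return the restored configuration."""
    restored = dict(current)
    for key, (old, _new) in changes.items():
        if old is None:
            restored.pop(key, None)  # setting was added, so remove it
        else:
            restored[key] = old      # setting was changed or removed
    return restored


# Hypothetical snapshots taken before and after a change.
known_good = {"MaxConnections": "200", "LogLevel": "info"}
current = {"MaxConnections": "50", "LogLevel": "info", "Debug": "on"}
changes = diff_config(known_good, current)
restored = rollback(current, changes)
```

Keeping the recorded changes alongside the snapshots is what makes the rollback a single step: the diff already lists exactly what to revert.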

Page 21: vSphere Management and Automation

Comprehensive Impact Analysis Drives Change Control

• A change to this VM
• Directly affects this App Server
• Could impact these Web Servers
• Potentially impacted clients

Get better visibility into physical, virtual infrastructure & applications
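The chain on this slide (VM, then App Server, then Web Servers, then clients) is a walk over a dependency graph. The sketch below shows that idea with a breadth-first search. It is not how Application Discovery Manager works internally, and the component names and the `dependents` map are invented for illustration.

```python
from collections import deque


def impacted(dependents, changed):
    """Return {component: hops} for everything downstream of `changed`.

    dependents maps a component to the components that depend on it.
    One hop means directly affected; more hops mean indirect impact.
    """
    hops = {}
    seen = {changed}
    queue = deque([(changed, 0)])
    while queue:
        node, distance = queue.popleft()
        for downstream in dependents.get(node, []):
            if downstream not in seen:
                seen.add(downstream)
                hops[downstream] = distance + 1
                queue.append((downstream, distance + 1))
    return hops


# Hypothetical dependency map mirroring the slide's example.
dependents = {
    "vm-01": ["app-server"],
    "app-server": ["web-1", "web-2"],
    "web-1": ["clients"],
    "web-2": ["clients"],
}
impact = impacted(dependents, "vm-01")
```

The hop count gives a simple way to separate "directly affects" from "could impact" when reviewing a proposed change.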

Page 22: vSphere Management and Automation

Save Time with Automated Patching and Provisioning

Common provisioning platform for both physical and virtual environments

Provision SW: Software Provisioning (Windows)
• Create software packages
• Push packages to systems & guests
• Tied to compliance
• Push software to systems out of compliance (e.g. Anti-virus)

Patch: Patching to Mitigate Vulnerabilities
• Pull down patch bulletins for the OS vendors
• Assess the infrastructure for vulnerabilities
• Remediate - Push patches out to the guests and systems that need them

Provision OS: Provision Standard Images (vSphere, Windows and Linux)
• Install ESX to Bare Metal
• Install OS in a VM Container in ESX/ESXi
• Install OS to Bare Metal
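The assess step of the patching workflow on this slide amounts to comparing what each system has installed against what the bulletins require. The sketch below shows that comparison only. The patch identifiers, host names, and inventory format are hypothetical, and nothing here reflects a real product interface.

```python
def assess(required_patches, inventory):
    """Return {host: sorted missing patch ids} for hosts that need patches.

    required_patches: patch ids taken from vendor bulletins.
    inventory: {host: set of installed patch ids}.
    Hosts that already have every required patch are left out.
    """
    required = set(required_patches)
    report = {}
    for host, installed in inventory.items():
        missing = required - installed
        if missing:
            report[host] = sorted(missing)
    return report


# Hypothetical bulletin list and inventory.
required = ["PATCH-001", "PATCH-002"]
inventory = {
    "web-1": {"PATCH-001", "PATCH-002"},
    "db-1": {"PATCH-001"},
    "app-1": set(),
}
to_remediate = assess(required, inventory)
```

The resulting report is the input to the remediate step: only the listed hosts receive a push, and only for the patches they lack.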

Page 23: vSphere Management and Automation

Customer Success: Operations Management

Manage 3X more VMs and Physical Servers with the same staff

[Figure: 80,000+ variables, including Account Policy, Software Inventory, File System, Services, Anti Virus Posture, Change Rollback, Passwords, Access, Registration keys, Patches, Vulnerabilities, Active Directory]

Before
• 35 Windows servers managed / Admin
• 12 Web servers / Admin
• 11 UNIX/Linux servers / Admin
• 24 hours to change local admin passwords on 2000 systems
• 2 administrators covering all systems including DMZ
• Verification not possible due to time and resource constraints

After
• 72 Windows servers managed / Admin
• 32 Web servers / Admin
• 24 UNIX/Linux servers / Admin
• 1 hour to change local admin passwords on 2000 systems
• 1 administrator scheduling the job via VCM
• Verification is automated with VCM Reporting

Page 24: vSphere Management and Automation

VMware Solution for Infrastructure and Operations Mgmt

vSphere Management Console

• Alive Enterprise: Performance Analytics
• vCenter Configuration Mgr: Change & Configuration Mgmt
• Application Discovery Mgr: Discovery & Impact Analysis
• vCenter CapacityIQ: Capacity Optimization
• vCenter Server

Adapters

vSphere

VMware Cloud environments

Non-VMware (incl. physical) environments

Page 25: vSphere Management and Automation

Questions?