VMware Advance Troubleshooting Workshop - Day 6

109
Host Maintenance Day 6 VMware vSphere: Install, Configure, Manage

Transcript of VMware Advance Troubleshooting Workshop - Day 6

Page 1: VMware Advance Troubleshooting Workshop - Day 6

Host Maintenance

Day 6

VMware vSphere:

Install, Configure, Manage

Page 2: VMware Advance Troubleshooting Workshop - Day 6

Content

• VMware Update Manager

• Host Profile

• Resource Pool

• Resource Monitoring

• Using Alarms

Page 3: VMware Advance Troubleshooting Workshop - Day 6

Introducing vSphere Update Manager

Page 4: VMware Advance Troubleshooting Workshop - Day 6

Learner Objectives

By the end of this lesson, you should be able to meet the following objectives:

• Describe VMware vSphere® Update Manager™ functionality

• List the steps to install vSphere Update Manager

• Use vSphere Update Manager to create and attach a baseline

Page 5: VMware Advance Troubleshooting Workshop - Day 6

About vSphere Update Manager

vSphere Update Manager enables centralized, automated patch andversion management for VMware ESXi™ hosts, virtual machinehardware, VMware Tools™, and virtual appliances.

vSphere Update Manager reduces security risks:

• Reduces the number of vulnerabilities.

• Eliminates many security breaches that exploit older vulnerabilities.

vSphere Update Manager reduces the diversity of systems in anenvironment:

• Makes management easier.

• Reduces security risks.

vSphere Update Manager keeps machines running more smoothly:

• Patches include bug fixes.

• Makes troubleshooting easier.

Page 6: VMware Advance Troubleshooting Workshop - Day 6

vSphere Update Manager Capabilities

Automated patch downloading:

• Begins with information-only downloading.

• Is scheduled at regular configurable intervals.

Creation of baselines and baseline groups

Scanning:

• Inventory systems are scanned for baseline compliance.

Remediation:

• Inventory systems that are not compliant can be automatically patched.

Reduces the number of reboots required after VMware Tools updates

Page 7: VMware Advance Troubleshooting Workshop - Day 6

vSphere Update Manager Components

vSphere Update Manager includes several components and requires

network connectivity with VMware vCenter Server™.

vSphere Update Manager server component:

• Install on the same computer as Windows vCenter Server or on a different

computer.

Client components:

• VMware vSphere® Update Manager Client™ runs on the desktop:

– Use the vSphere Update Manager Client to perform patch and version management

of the vSphere inventory.

• Update Manager tab in the VMware vSphere® Web Client plug-in:

– Use to view scan results and compliance states for vSphere inventory objects.

Database:

• Use to store and organize server data.

Page 8: VMware Advance Troubleshooting Workshop - Day 6

Requirements for Installing vSphere Update Manager

vSphere Update Manager has the following installation requirements:

• vCenter Server must be installed before installing vSphere Update Manager.

• Installation of vSphere Update Manager requires network connectivity with anexisting vCenter Server system.

• Each vSphere Update Manager installation must be associated with a singlevCenter Server instance.

• The vSphere Update Manager server can be installed on the same system asvCenter Server or on a different system.

• vSphere Update Manager 6.0 can be installed only on 64-bit Windows operatingsystems. vSphere Update Manager 6 is compatible only with vCenter Server 6.

• vSphere Update Manager provides the following client components:

– vSphere Update Manager Client

– vSphere Update Manager Web Client

• The vSphere Update Manager server requires an SQL Server or an Oracledatabase.

• Install the vSphere Update Manager components from the vCenter Server installerfor Windows.

Page 9: VMware Advance Troubleshooting Workshop - Day 6

Installing vSphere Update Manager

To use vSphere Update Manager, you must ensure that your vCenterServer 6 is already installed, and complete the following tasks:

1. Create and prepare a database.

2. Install the vSphere Update Manager server.

3. Install the vSphere Update Manager Client.

4. Enable the vSphere Update Manager plug-in for the vSphere WebClient.

Page 10: VMware Advance Troubleshooting Workshop - Day 6

Configuring vSphere Update Manager Settings

You can modify the vSphere Update Manager configuration only if you

have the correct privileges:

• Network Connectivity Settings

• Download Settings

• Proxy Settings

• Checking for Updates (Download Schedule) Settings

• Notification Check Schedule Settings

• Virtual Machine Settings

• Host and Cluster Settings

Page 11: VMware Advance Troubleshooting Workshop - Day 6

Baseline and Baseline Groups

A baseline includes one or more patches, extensions, or upgrades:

• vSphere Update Manager includes two default dynamic patch baselines andthree upgrade baselines.

A baseline group includes multiple baselines:

• Can contain one upgrade baseline and one or more patch and extensionbaselines.

Page 12: VMware Advance Troubleshooting Workshop - Day 6

Creating and Editing Patch or Extension Baselines

You can create custom patch, extension, and upgrade baselines to meet

the needs of your specific deployment by using the New Baseline wizard:

• Create a fixed patch baseline:

– Fixed baselines consist of a set of patches that do not change as patch availability

changes.

• Create a dynamic patch baseline:

– Dynamic baselines consist of a set of patches that meet certain criteria.

• Create a host extension baseline:

– Extension baselines contain additional software for ESXi hosts. This additional

software might be VMware software or third-party software.

• Filter patches or extensions in the New Baseline wizard:

– When you create a patch or extension baseline, you can filter the patches and

extensions available in the vSphere Update Manager repository to find specific

patches and extensions to exclude or include in the baseline.

Page 13: VMware Advance Troubleshooting Workshop - Day 6

Attaching a Baseline

To view compliance information and scan objects in the inventory against baselines or baseline groups, you must first attach baselines or baseline groups to these objects.

You can attach baselines or baseline groups to objects from the Update Manager Web Client in the vSphere Web Client.

• Click the Monitor > Update Manager > Attach tabs.

Page 14: VMware Advance Troubleshooting Workshop - Day 6

Scanning for Updates

Scanning evaluates the inventory object against the baseline or baseline group.

Page 15: VMware Advance Troubleshooting Workshop - Day 6

Viewing Compliance for vSphere Objects

You can review compliance information for the virtual machines, virtual appliances, and hosts against baselines and baseline groups that you attach.

Page 16: VMware Advance Troubleshooting Workshop - Day 6

Remediating Objects

You can remediate virtual machines, templates, virtual appliances, and

hosts:

• You can perform the remediation immediately or schedule it for a later date.

• Host remediation runs in different ways, depending on the types of baselines

that you attach and whether the host is in a cluster or not.

• For ESXi hosts in a cluster, the remediation process is sequential by default.

• Remediation of hosts in a cluster requires that you temporarily disable cluster

features such as VMware vSphere® Distributed Power Management™ and

VMware vSphere® High Availability admission control.

Page 17: VMware Advance Troubleshooting Workshop - Day 6

Patch Recall Notification

At regular intervals, vSphere Update Manager contacts VMware to

download notifications about patch recalls, new fixes, and alerts:

• Notification Check Schedule is selected by default.

On receiving patch recall notifications, vSphere Update Manager takes

the following actions:

• Generates a notification in the notification tab

• No longer applies the recalled patch to any host:

– Patch is flagged as recalled in the database.

• Deletes the patch binaries from its patch repository

vSphere Update Manager does not uninstall recalled patches from ESXi

hosts. It waits for a newer patch and applies that patch to make a host

compliant.

Page 18: VMware Advance Troubleshooting Workshop - Day 6

Review of Learner Objectives

You should be able to meet the following objectives:

• Describe vSphere Update Manager functionality

• List the steps to install vSphere Update Manager

• Use vSphere Update Manager to create and attach a baseline

Page 19: VMware Advance Troubleshooting Workshop - Day 6

Host Profiles

Page 20: VMware Advance Troubleshooting Workshop - Day 6

Learner Objectives

By the end of this lesson, you should be able to meet the following objectives:

• Describe the host profiles workflow

• Identify how to create a host profile

• Recognize how to apply a host profile to an ESXi host or cluster

• Use host profiles to perform remediation on an ESXi host

Page 21: VMware Advance Troubleshooting Workshop - Day 6

About Host Profiles

Host profiles provide an automated and centrally managed mechanism for host configuration and configuration compliance.

Page 22: VMware Advance Troubleshooting Workshop - Day 6

Host Profiles Workflow

The host profile workflow starts with the concept of a reference host. The reference host serves as the template from which the host profile is extracted:

1. Set up and configure the reference host.

2. Create a host profile from the reference host.

3. Attach other hosts or clusters to the host profile.

4. Check the compliance of the added hosts to the host profile. If all hosts are compliant with the reference host, they are correctly configured.

5. Apply the resulting recommendations to the hosts.

Page 23: VMware Advance Troubleshooting Workshop - Day 6

Creating a Host Profile

You create a host profile by extracting the designated reference host’s configuration.

Page 24: VMware Advance Troubleshooting Workshop - Day 6

Attaching a Host Profile to a Host or Cluster

After creating a host profile from a reference host, you attach the host or cluster to the host profile.

Page 25: VMware Advance Troubleshooting Workshop - Day 6

Checking Compliance

You can confirm the compliance of a host or cluster to its attached hostprofile and determine which configuration parameters on a host aredifferent from those specified in the host profile.

Page 26: VMware Advance Troubleshooting Workshop - Day 6

Remediating an ESXi Host

In the event of a compliance failure, use the remediate function to applythe host profile settings onto the host.

This action changes all host profile-managed parameters to the valuescontained in the host profile attached to the host.

Page 27: VMware Advance Troubleshooting Workshop - Day 6

Review of Learner Objectives

You should be able to meet the following objectives:

• Describe the host profiles workflow

• Identify how to create a host profile

• Recognize how to apply a host profile to an ESXi host or cluster

• Use host profiles to perform remediation on an ESXi host

Page 28: VMware Advance Troubleshooting Workshop - Day 6

Key Points

• vSphere Update Manager reduces security vulnerabilities by keeping systemsup to date and by reducing the diversity of systems in an environment.

• Host profiles encapsulate the host configuration and help you manage the hostconfiguration.

Questions?

Page 29: VMware Advance Troubleshooting Workshop - Day 6

Resource Pools

Page 30: VMware Advance Troubleshooting Workshop - Day 6

Learner Objectives

By the end of this lesson, you should be able to meet the following objectives:

• Assign share values for CPU, memory, and disk resources

• Describe how virtual machines compete for resources

• Create a resource pool

• Set resource pool attributes

• Establish CPU and memory reservations and limits

• Describe expandable reservations

Page 31: VMware Advance Troubleshooting Workshop - Day 6

Shares, Limits, and Reservations

A virtual machine powers on only if its reservation can be guaranteed.

Available Capacity

0 MHz/MB

Limit

Shares are used to

compete in this range.

Reservation

Page 32: VMware Advance Troubleshooting Workshop - Day 6

How Virtual Machines Compete for Resources

Virtual machines are resource consumers. The default resource settingsthat you assign during creation work well for most machines.

VM A

1000

VM B

1000 1000

Change number of shares. VM A

1000

VM B

3000

VM C

1000

Power on virtual machine. VM A

1000

VM B

3000

VM C

1000

VM D

1000

Power off virtual machine. VM A

1000

VM B

3000

VM D

1000

Number of shares. VM C

Page 33: VMware Advance Troubleshooting Workshop - Day 6

About Resource Pools

A resource pool is a logical abstraction of hierarchically managed CPU and memory resources.

Root Resource

Pool

Parent Resource

Pool

Child Resource

Pool

Sibling Resource

Pools

Page 34: VMware Advance Troubleshooting Workshop - Day 6

Resource Pool Attributes

You can create a child resource pool

of any VMware ESXi™ host,

resource pool, or VMware vSphere®

Distributed Resource Scheduler™

cluster.

• Shares: Low, Normal, High, Custom

• Reservations: In MHz or GHz, MB or GB

• Limits:

– In MHz or GHz, MB or GB.

– Unlimited access, by default, up to

maximum amount of resource accessible.

• Reservation type:

– Expandable selected: Virtual machines and

subpools can draw from this pool’s parent.

– Expandable deselected: Virtual machines

and subpools can draw only from this pool,

even if its parent has free resources.

Page 35: VMware Advance Troubleshooting Workshop - Day 6

Reasons to Use Resource Pools

Use of resource pools can provide these benefits:

• Flexible hierarchical organization

• Isolation between pools and sharing in pools

• Access control and delegation

• Separation of resources from hardware

• Management of sets of virtual machines running a multitier service

• Ability to prioritize virtual machine workloads

Page 36: VMware Advance Troubleshooting Workshop - Day 6

Resource Pool Case Study

Company X’s IT department has two internal customers:

• The Finance Department supplies two-thirds of the budget.

• The Engineering Department supplies one-third of the budget.

Each internal customer has both production and test/dev virtual machines.

You must control the resource consumption of the test/dev virtual machines.

You must also ensure the entitled resources of the Finance Department.

Page 37: VMware Advance Troubleshooting Workshop - Day 6

Resource Pool Example

This example shows where resource attributes are set on a resource pool.

Engineering Pool

CPU Shares: 1000Reservation: 1,000 MHzLimit: 4,000 MHzExpandable Reservation: Yes

Eng-Test VM

CPU Shares: 1000

Reservation: 0 MHz

Limit: 4000 MHz

Eng-Prod VM

CPU Shares: 2000

Reservation: 250 MHz

Limit: 4000 MHz

Page 38: VMware Advance Troubleshooting Workshop - Day 6

Resource Pools Example: CPU Shares

In this example, the Finance resource pool has twice as many CPUshares as the Engineering resource pool. It is entitled to twice as manyCPU resources as the Engineering resource pool.

CPU Shares: 1000

Eng-Test VM

Engineering pool

CPU Shares: 1000

Engineering Pool Engineering pool

CPU Shares: 2000

Finance Pool

CPU Shares: 2000

Eng-Prod VM

CPU Shares: 1000

Fin-Test VM

CPU Shares: 2000

Fin-Prod VM

Standalone Host: Srv001

(Root Resource Pool)

Page 39: VMware Advance Troubleshooting Workshop - Day 6

22%

22%45%

11%

Resource Pools Example: CPU Contention

CPU Shares: 1000

Eng-Test VM

Engineering poolCPU Shares: 1000

-%33 of Physical CPU

Engineering Pool Engineering poolCPU Shares: 2000

-67% of Physical CPU

Finance Pool

CPU Shares: 2000

Eng-Prod VM

CPU Shares: 1000

Fin-Test VM

CPU Shares: 2000

Fin-Prod VM

Eng-test gets ~33% of Engineering’s CPU allocation: approximately 11% of the physical CPU.

Finance ~67%

Engineering ~33%

Srv01 All VMs below are running on the same physical CPU.

Page 40: VMware Advance Troubleshooting Workshop - Day 6

Expandable Reservation

Borrowing resources occurs

recursively from the ancestors of

the current resource pool:

• The Expandable Reservation option

must be enabled.

• This option offers more flexibility but

less protection.

Expanded reservations are not

released until the virtual machine

that caused the expansion is shut

down or its reservation is reduced.

Retail Pool

Reservation: 3,000 MHzExpandable Reservation: Yes

Root Resource Pool

Total CPU: 10,200 MHz

Total Memory: 3,000 MB

eCommerce Apps

Pool

eCommerce Web

Pool

Reservation:

1,000 MHz

Expandable? No

Reservation:

1,200 MHz

Expandable? Yes

A mismanaged or mis-sized expandable reservation

might claim all unreserved capacity.

Page 41: VMware Advance Troubleshooting Workshop - Day 6

Example of Expandable Reservation (1)

eCommerce resource pools

reserve 2,200 MHz of the 3,000

MHz that the Retail pool has

reserved.

Power on virtual machines in the

eCommerce Web pool.

With Expandable Reservation

disabled on the eCommerce Web

pool, VM3 cannot be started with

a reservation of 500 MHz:

• Lower the virtual machine

reservation.

• Enable Expandable Reservation.

• Increase the eCommerce Web

pool’s reservation.

Retail Pool

Reservation: 3,000 MHzExpandable Reservation: No

VM1R=400

VM2R=300

VM3R=500

Root Resource Pool

Total CPU: 10,200 MHz

Total Memory: 3,000 MB

eCommerce Apps

Pool

eCommerce Web

Pool

Reservation:

1,000 MHz

Expandable? No

Reservation:

1,200 MHz

Expandable? Yes

Page 42: VMware Advance Troubleshooting Workshop - Day 6

Example of Expandable Reservation (2)

Enable Expandable Reservation

on the eCommerce Web pool.

The system considers the

resources available in the child

resource pool and its direct

parent resource pool.

The virtual machine’s reservation

is charged against the reservation

for eCommerce Web.

eCommerce Web’s reservation is

charged against the reservation

for Retail.

Retail Pool

Reservation: 3,000 MHzExpandable Reservation: Yes

Root Resource Pool

Total CPU: 10,200 MHz

Total Memory: 3,000 MB

VM4R=500

VM5R=500

VM6R=500

VM1R=400

VM2R=300

VM7R=500

VM3R=500

**200 MHz Used by Retail**

**Full Reservation Used**

eCommerce Apps

Pool

eCommerce Web

Pool

Reservation:

1,000 MHz

Expandable? Yes

Reservation:

1,200 MHz

Expandable? Yes

Page 43: VMware Advance Troubleshooting Workshop - Day 6

Admission Control for CPU and Memory Reservations

Admission control is used to ensure that you do not allocate resources that are not available.

Power on a virtual machine.Create a subpool

with its own reservation.Increase a pool’s

reservation.

Expandablereservation?

Can this poolsatisfy reservation?

No

Yes. Go to parent pool.

Succeed

FailNo

Yes

Page 44: VMware Advance Troubleshooting Workshop - Day 6

Resource Pool Summary Tab

The resource pool Summary tab displays information that applies to the host machine and its resources.

Page 45: VMware Advance Troubleshooting Workshop - Day 6

Resource Reservation Tab

On the Resource Reservation tab, you can view information about a resource pool’s CPU, memory, and storage resources.

Page 46: VMware Advance Troubleshooting Workshop - Day 6

Scheduling Changes to Resource Settings

You can schedule a task to change the resource settings of a resource pool or virtual machine.

Page 47: VMware Advance Troubleshooting Workshop - Day 6

Review of Learner Objectives

You should be able to meet the following objectives:

• Assign share values for CPU, memory, and disk resources

• Describe how virtual machines compete for resources

• Create a resource pool

• Set resource pool attributes

• Establish CPU and memory reservations and limits

• Describe expandable reservations

Page 48: VMware Advance Troubleshooting Workshop - Day 6

Monitoring Resource Use

Page 49: VMware Advance Troubleshooting Workshop - Day 6

Learner Objectives

By the end of this lesson, you should be able to meet the followingobjectives:

• Use the performance-tuning methodology and resource monitoring tools

• Use performance charts to view and improve performance

• Monitor the key factors that can affect the virtual machine’s performance: CPU,memory, disk, and network bandwidth use

Page 50: VMware Advance Troubleshooting Workshop - Day 6

Performance-Tuning Methodology

Follow these best practices for performance-tuning your vSphere infrastructure:

• Assess performance:

– Use appropriate monitoring tools.

– Record a numerical benchmark before changes.

• Identify the limiting resource.

• Make more resources available:

– Allocate more.

– Reduce competition.

– Log your changes.

• Benchmark again.Do not make casual changes

to production systems.

Page 51: VMware Advance Troubleshooting Workshop - Day 6

Resource-Monitoring Tools

Many resource and performance monitoring tools are available toadministrators to use with vSphere.

Inside the Guest OS

Perfmon DLL

Task Manager

Outside the Guest OS

VMware vCenter Server™ performance charts

VMware vRealize™ Operations™

VMware vRealize™ Hyperic™

VMware vSphere®/ESXi system logs

resxtop and esxtop

Page 52: VMware Advance Troubleshooting Workshop - Day 6

Resource-Monitoring Tools

Many resource and performance monitoring tools are available toadministrators to use with vSphere.

Inside the Guest OS

Perfmon DLL

Task Manager

Outside the Guest OS

VMware vCenter Server™ performance charts

VMware vRealize™ Operations™

VMware vRealize™ Hyperic™

VMware vSphere®/ESXi system logs

resxtop and esxtop

Page 53: VMware Advance Troubleshooting Workshop - Day 6

Guest Operating System Monitoring Tools

To monitor performance in the guest operating system, use tools that youare familiar with, such as Windows Task Manager.

Windows Task Manager

Page 54: VMware Advance Troubleshooting Workshop - Day 6

Using Perfmon to Monitor Virtual Machine Resources

The Perfmon DLL in VMware Tools™ provides virtual machine processorand memory objects to access host statistics inside a virtual machine.

Page 55: VMware Advance Troubleshooting Workshop - Day 6

About Monitoring Inventory Objects with Performance Charts

The vSphere statistics subsystem collects data on the resource usage of inventory objects:

• Counters and Metric Groups

• Collection Levels and Collection Intervals

• Data Availability

Page 56: VMware Advance Troubleshooting Workshop - Day 6

Working with Overview Performance Charts

The overview performance charts display the most common metrics for an object in the inventory.

Host’s Performance Charts

Partial Overview Panel

Page 57: VMware Advance Troubleshooting Workshop - Day 6

Working with Advanced Performance Charts

Advanced charts support data counters that are not supported in otherperformance charts.

Chart Options Objects

Chart Type

Counters Rollups Statistics Type

Chart Metrics Timespan

Page 58: VMware Advance Troubleshooting Workshop - Day 6

Chart Options: Real-Time and Historical

vCenter Server stores statistics at different specificities.

Time Interval Data Frequency Number of Samples

Real-Time (past hour) 20 seconds 180

Past Day 5 minutes 288

Past Week 30 minutes 336

Past Month 2 hours 360

Past Year 1 day 365

Page 59: VMware Advance Troubleshooting Workshop - Day 6

Chart Types

Depending on the metric type and object, performance metrics aredisplayed in different types of charts.

Line chart:

• Each instance is shown separately.

Bar chart:

• Each instance is a bar in the chart.

Pie chart:

• Each instance is a slice in a circular pie.

Stacked chart:

• Graphs are stacked on top of one another.

Page 60: VMware Advance Troubleshooting Workshop - Day 6

Saving Charts

You click the Save Chart icon above the graph to save performancechart information.

You can save information in these formats:

• PNG

• JPEG

• CSV

Page 61: VMware Advance Troubleshooting Workshop - Day 6

Objects and Counters

The performance charts graphically display CPU, memory, disk, network,

and storage metrics for devices and entities managed by vCenter Server.

Objects are instances or aggregations of devices:

• Examples: vCPU0, vCPU1, vmhba1:1:2, aggregate over all NICs

Counters identify which statistics to collect:

• Examples:

– CPU: Used time, ready time, usage (%)

– NIC: Network packets received

– Memory: Memory swapped

Page 62: VMware Advance Troubleshooting Workshop - Day 6

Statistics Type

The statistics type is the unit of measurement used during the statistics interval.

Statistics Type Description Example

Rate Value over the current interval CPU use (MHz)

Delta Change from previous interval CPU ready time

Absolute Absolute value, independent of interval Memory active

Page 63: VMware Advance Troubleshooting Workshop - Day 6

Rollup

Rollup is the conversion function between statistics intervals:

• 5 minutes of past-hour statistics are converted to 1 past-day value:

– Fifteen 20-second statistics are rolled up into a single value.

• 30 minutes of past-day statistics are converted to 1 past-week value:

– Six 5-minute statistics are rolled up into a single value.

• Other rollup types: Minimum, Maximum

Rollup Type Conversion Function Sample Statistic

Average Average of data points CPU use (average)

Summation Sum of data points CPU ready time (milliseconds)

Latest Last data point Uptime (days)

Page 64: VMware Advance Troubleshooting Workshop - Day 6

Setting Log Levels

Setting log levels enables the user to control the quantity and type ofinformation logged.

Examples of when to set log levels:

• When troubleshooting complex issues, set the log level to verbose or trivia.Troubleshoot and set it back to info.

• To control the amount of information being stored in the log files.

Option Description

None Turns off logging

Error (errors only) Displays only error log entries

Warning (errors and warnings) Displays warning and error log entries

Info (normal logging) Displays information, error, and warning log entries

VerboseDisplays information, error, warning, and verbose log

entries

Trivia (extended verbose)Displays information, error, warning, verbose, and

trivia log entries

Page 65: VMware Advance Troubleshooting Workshop - Day 6

Interpreting Data from the Tools

vCenter Server monitoring toolsand guest operating systemmonitoring tools provide differentpoints of view.

Task Manager in

Guest Operating System

CPU Usage

Chart for Host

Page 66: VMware Advance Troubleshooting Workshop - Day 6

CPU-Constrained Virtual Machine

If CPU usage is continuously high, the virtual machine is constrained byCPU. However, the host might have enough CPU for other virtualmachines to run.

Multiple virtual machines are constrained by CPU if the followingconditions are present:

• High CPU usage in the guest operating system

• Relatively high CPU ready values for the virtual machines

Single Virtual Machine CPU Usage

Page 67: VMware Advance Troubleshooting Workshop - Day 6

Memory-Constrained Virtual Machine

Check the virtual machine’s ballooning activity to determine if the virtual

machine is constrained for memory:

• If ballooning activity is high, this state might not be a problem if all virtual

machines have sufficient memory.

• If ballooning activity is high and the guest operating system is swapping, then

the virtual machine is constrained for memory.

Page 68: VMware Advance Troubleshooting Workshop - Day 6

Memory-Constrained Host

If active host-level swapping is occurring, then host memory isovercommitted.

Page 69: VMware Advance Troubleshooting Workshop - Day 6

Monitoring Active Memory of a Virtual Machine

Monitor for increases in active memory on the host:

• Host active memory refers to active physical memory used by virtual machines

and the VMkernel.

• If amount of active memory is high, this situation might lead to virtual machines

that are memory-constrained.

Page 70: VMware Advance Troubleshooting Workshop - Day 6

Disk-Constrained Virtual Machines

Disk-intensive applications can saturate the storage or the path.

If you suspect that a virtual machine is constrained by disk access:

• Measure the throughput and latency between the virtual machine and storage.

• Use the advanced performance charts to monitor:

– Read rate and write rate

– Read latency and write latency

Page 71: VMware Advance Troubleshooting Workshop - Day 6

Monitoring Disk Latency

To determine disk performance problems, monitor two disk latency data

counters:

• Kernel command latency:

– The average time spent in the VMkernel per SCSI command.

– High numbers (greater than 2 or 3 ms) represent either an overworked array or an

overworked host.

• Physical device command latency:

– The average time the physical device takes to complete a SCSI command.

– High numbers (greater than 15 or 20 ms) represent a slow or overworked array.

Page 72: VMware Advance Troubleshooting Workshop - Day 6

Network-Constrained Virtual Machines

Network-intensive applications often bottleneck on path segmentsoutside the ESXi host:

• Example: WAN links between server and client

If you suspect that a virtual machine is constrained by the network:

• Confirm that VMware Tools is installed.

– Enhanced network drivers are available.

• Measure the effective bandwidth between the virtual machine and its peersystem.

• Check for dropped receive packets and dropped transmit packets.

Page 73: VMware Advance Troubleshooting Workshop - Day 6

Review of Learner Objectives

You should be able to meet the following objectives:

• Use the performance-tuning methodology and resource monitoring tools

• Use performance charts to view and improve performance

• Monitor the key factors that can affect the virtual machine’s performance: CPU,memory, disk, and network bandwidth use

Page 74: VMware Advance Troubleshooting Workshop - Day 6

Using Alarms

Page 75: VMware Advance Troubleshooting Workshop - Day 6

Learner Objectives

By the end of this lesson, you should be able to meet the followingobjectives:

• Create alarms with condition-based triggers

• Create alarms with event-based triggers

• View and acknowledge triggered alarms

Page 76: VMware Advance Troubleshooting Workshop - Day 6

About Alarms

An alarm is a notification that occurs in response to selected events or

conditions that occur with an object in the inventory.

Default alarms exist for various inventory objects:

• Many default alarms for hosts and virtual machines

You can create custom alarms for a wide range of inventory objects:

• Virtual machines, hosts, clusters, data centers, datastores, datastore clusters,

networks, distributed switches, and distributed port groups

Page 77: VMware Advance Troubleshooting Workshop - Day 6

Alarm Settings

To monitor your environment, you can create and modify alarmdefinitions in the VMware vSphere® Web Client.

Page 78: VMware Advance Troubleshooting Workshop - Day 6

Alarm Triggers

An alarm requires a trigger.

Types of triggers:

• Condition or state trigger: Monitors the current condition or state. Examples:

– A virtual machine’s current snapshot is above 2 GB in size.

– A host is using 90 percent of its total memory.

– A datastore has been disconnected from all hosts.

• Event: Monitors events. Examples:

– The health of a host’s hardware has changed.

– A license has expired in the data center.

– A host has left the distributed switch.

Page 79: VMware Advance Troubleshooting Workshop - Day 6

Configuring Condition Triggers

Condition or state triggers monitor metrics for a host, virtual machine, or datastore.

Page 80: VMware Advance Troubleshooting Workshop - Day 6

Configuring Event Triggers

Event triggers monitor the current state of a host, virtual machine, ordatastore.

Page 81: VMware Advance Troubleshooting Workshop - Day 6

Configuring Actions

You can define actions that the system performs when the alarm istriggered or changes status.

Page 82: VMware Advance Troubleshooting Workshop - Day 6

Configuring vCenter Server Notifications

You must configure the email address of the sender account to enablevCenter Server operations such as sending email notifications as alarmactions.

Select SNMP receivers to

specify trap destinations.

Select Mail to set

SMTP parameters.

Page 83: VMware Advance Troubleshooting Workshop - Day 6

Viewing and Acknowledging Triggered Alarms

The Acknowledge Alarm feature is used to track when triggered alarms

are addressed.

Page 84: VMware Advance Troubleshooting Workshop - Day 6

Review of Learner Objectives

You should be able to meet the following objectives:

• Create alarms with condition-based triggers

• Create alarms with event-based triggers

• View and acknowledge triggered alarms

Page 85: VMware Advance Troubleshooting Workshop - Day 6

Troubleshooting Virtual Machines

Page 86: VMware Advance Troubleshooting Workshop - Day 6

Disk Content IDs

A content ID (CID) resides in each virtual machine’s disk descriptor file for integrity and state tracking.

Win01-A.vmdk is the parent of

Win01-A-000001.vmdk.

Win01-A-000001.vmdk is the parent of

Win01-A-000002.vmdk.

Win01-A.vmdk

Win01-A-000001.vmdk Win01-A-000002.vmdk

Page 87: VMware Advance Troubleshooting Workshop - Day 6

Virtual Machine Problem 1

CID mismatch conditions are triggered by several factors:

• Interruptions to VMware vSphere® vMotion® migrations

• VMware software errors

As an initial check in this situation, view the virtual machine’s vmware.log file to identify the specific disk chain affected.

Performing a snapshot operation returns a content ID mismatch error.

Page 88: VMware Advance Troubleshooting Workshop - Day 6

Content ID Mismatch Example

In this example, the parentCID descriptor in Win01-A-000002.vmdkis incorrect.

Win01-A.vmdk

Win01-A-000001.vmdk Win01-A-000002.vmdk

Page 89: VMware Advance Troubleshooting Workshop - Day 6

Resolving a Content ID Mismatch

To resolve the problem, make a backup of the disk descriptor files that require editing and use a text editor to correct the mismatch.

Verify that the CID mismatch was corrected by running the following command against the highest-level snapshot:

vmkfstools -e Win01-A-000002.vmdk

If you see failure messages, the CID mismatch was not corrected.

Win01-A-000001.vmdk Win01-A-000002.vmdk

Change

this value.

Page 90: VMware Advance Troubleshooting Workshop - Day 6

Virtual Machine Problem 2

A virtual machine generating heavy I/O workload might encounter issues

when quiescing before a snapshot operation.

Quiescing can be done by using the following technologies:

• Microsoft Volume Shadow Copy Service (VSS)

• The VMware Tools™ SYNC driver

As an initial check, verify that you can create a manual, nonquiesced

snapshot by using Snapshot Manager.

Taking a quiesced VMware snapshot of a virtual machine fails.

Page 91: VMware Advance Troubleshooting Workshop - Day 6

Resolving Quiesced Snapshot Failure

When quiescing with VSS, the following items are required:

• VSS prerequisites are met.

• Appropriate services are running and startup types are correct.

• The VSS provider is used.

• All the VSS writers are stable and not reporting errors.

If you are quiescing with the SYNC driver, you have the following options:

• Disable the SYNC driver.

• Stop the service generating heavy I/O before taking a snapshot.

The SYNC driver is required only for legacy versions of Windows that donot include VSS.

Page 92: VMware Advance Troubleshooting Workshop - Day 6

Virtual Machine Problem 3

If you cannot create or commit a snapshot, multiple initial checks are

available:

• Verify that the virtual disk type is supported for snapshots:

– RDM in physical mode, independent disks, or virtual machines configured with bus-

sharing are not supported.

• If you have reached 32 levels of snapshots (which includes the base disk), you

cannot create more snapshots:

– In VMware vSphere® Web Client, verify this limit with Manage Snapshots.

The user cannot create or commit a snapshot.

Page 93: VMware Advance Troubleshooting Workshop - Day 6

Identifying Possible Causes

When you cannot create or commit snapshots, take a top-downapproach to troubleshooting, starting with VMware vCenter Server™permissions, checking the virtual machine’s files, and then checking theESXi host’s datastores.

ESXi

Host

Possible Causes

The user does not have permission to create or commit

snapshots.

Virtual

Machine

vCenter Server

The –delta.vmdk file does not have an associated

descriptor file.

The snapshot file size reached a maximum not supported by

the datastore.

Space on the datastore is insufficient to commit all snapshots.

Page 94: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: No Permissions to Create Snapshots

The user might not have permission to create or commit snapshots.

To verify whether this is the case, check the user’s permissions in vCenter Server.

To resolve this issue, perform one of the following tasks:

• Give the user the virtual machine power-user role.

• Create a custom role that includes snapshot management permissions and assign that role to the user.

Page 95: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: Missing Delta Descriptor File

You cannot create or commit a snapshot if a snapshot (delta) file does

not have an associated descriptor file:

• For example, examplevm-000001-delta.vmdk is missing its

corresponding descriptor file, examplevm-000001.vmdk.

If the –delta.vmdk has no descriptor file, then you must create one:

1. Copy the base disk descriptor file, giving it the name of the missing descriptor

file.

2. Edit the new descriptor file to change its format from a base disk to a

snapshot delta disk descriptor.

Page 96: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: Insufficient Space on Datastore

You cannot create or commit a snapshot if the space on the datastore isinsufficient to commit all snapshots.

To verify that enough space exists, identify the ESXi host and datastoreon which the snapshot files reside and run df –h.

Multiple ways to resolve the issue of insufficient space are available:

• Increase the size of the datastore.

• Move virtual machines to datastores with sufficient space.

Page 97: VMware Advance Troubleshooting Workshop - Day 6

Virtual Machine Problem 4

If a virtual machine fails to power on, begin by viewing the error

message, either displayed in vSphere Web Client or found in the virtualmachine log file, vmware.log:

• Look for hints for why the power-on operation is failing.

Powering on a virtual machine fails.

Page 98: VMware Advance Troubleshooting Workshop - Day 6

Identifying Possible Causes

When a virtual machine does not power on, take a top-down approach totroubleshooting. Start with the virtual machine’s files and then check theESXi host.

ESXi

Host

One or more virtual machine files are missing.

One of the virtual machine’s files is locked.

Possible Causes

Insufficient resources exist on the ESXi host.

The ESXi host is not responsive.

Virtual

Machine

Page 99: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: Virtual Machine Files Missing

One or more of the virtual machine’s files might be missing.

Determine whether virtual machine files are missing:

ls /vmfs/volumes/Shared/Win01-B

To resolve this problem, restore the missing file or files from your last backup. If a descriptor file is missing, recreate the descriptor file manually.

Page 100: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: Virtual Machine File Locked

A virtual machine will not power on if one of the virtual machine’s files, for

example, the virtual machine’s disk file, is locked.

Perform these steps to find a locked file:

1. Power on a virtual machine.

– If the power-on fails, the error message identifies the affected file.

2. Determine whether the file can be locked.

– touch filename

3. Determine which ESXi host has locked the file.

– vmkfstools -D /vmfs/volumes/Shared/Win01-B/Win01-B-flat.vmdk

Page 101: VMware Advance Troubleshooting Workshop - Day 6

Resolving a Locked Virtual Machine File

The following steps are an alternative way to resolve a locked virtual

machine file:

1. Determine which ESXi host has a network adapter with the MAC addressfrom the vmkfstools output.

2. Identify the process that is holding the lock.

lsof | grep name_of_locked_file

3. Kill the process that is locking the file.

– If the file is being accessed by a running virtual machine, the lock cannot be taken

away or removed.

If you still cannot determine which process or entity has the virtual machine file

locked, perform the following tasks:

1. Migrate all virtual machines from the ESXi host that created the lock to

another ESXi host.

2. Reboot the ESXi host that created the lock.

Page 102: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: Insufficient Resources on ESXi Host

A virtual machine will not power on if insufficient resources are on theESXi host.

To determine if sufficient resources exist, check CPU, memory, networkresource, and storage resource availability on the ESXi host.

To resolve this issue, perform one of the following tasks:

• Decrease the CPU and memory reservations on the virtual machine.

• Add more resources to the ESXi host, cluster, or resource pool.

Page 103: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: ESXi Host Unresponsive

If the ESXi host is unresponsive, a virtual machine will not power on.

Determine whether the host has crashed or is hanging:

• If the host is hanging, you cannot perform the following tasks:

– Ping the VMkernel network interface.

– Determine whether VMware vSphere® Client™ responds to queries.

– Monitor network traffic from the ESXi host and its virtual machine.

• If the host has crashed, you will see the purple crash screen on the ESXiconsole.

Page 104: VMware Advance Troubleshooting Workshop - Day 6

Virtual Machine Problem 6

If VMware Tools fails to install, verify that the guest operating system issupported by VMware.

Check the release notes or the knowledge base for any known issueswith installing VMware Tools in certain guest operating systems.

The VMware Tools installation fails to complete.

Page 105: VMware Advance Troubleshooting Workshop - Day 6

Identifying Possible Causes

When VMware Tools installation fails, take a top-down approach to

troubleshooting, starting with your virtual machine configuration and then

checking the ISO image configuration on the ESXi host.

ESXi

Host

An incorrect guest OS is selected for the virtual machine.

Possible Causes

The correct VMware Tools ISO image is not being loaded.

The VMware Tools ISO image cannot be found.

The VMware Tools ISO image is corrupt.

Virtual

Machine

Page 106: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: Wrong Guest Operating System

If VMware Tools installation fails, the wrong guest operating systemmight have been configured for the virtual machine.

In vSphere Web Client, view the guest operating system settings underVM Options. Ensure that the correct guest operating system is selected.

Page 107: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: ISO Image Not Being Loaded

If an incorrect VMware Tools ISO image is being loaded, VMware Tools installation will fail.

Verify that the ISO image is connected to CD/DVD drive 1.

To resolve the problem:

• Manually start the VMware Tools installer.

• Manually connectthe correct ISO imageto the virtual machine.

Page 108: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: ISO Image Cannot Be Found

If the VMware Tools ISO image cannot be found, VMware Tools installation will fail.

Verify that ISO images exist on the ESXi host:

• ISO Images are in /usr/lib/vmware/isoimages.

• This directory is a symbolic link to the /productLocker/vmtools directory.

To resolve this problem, ensure that the /usr/lib/vmware/isoimages directory is correctly linked to the /productLocker/vmtools directory.

Page 109: VMware Advance Troubleshooting Workshop - Day 6

Possible Cause: VMware Tools ISO Image Corrupt

A VMware Tools installation will fail if the VMware Tools ISO image iscorrupt.

To verify whether corruption has occurred, compare the checksum of thecorrupt ISO image with a known good ISO image.

To resolve this issue of different checksums, copy a known, stable ISOimage from an ESXi host to the /productLocker/vmtools directoryon the ESXi host with the corrupt image.