SER2980BU: vCenter Server 6.5 Deep Dive and Troubleshooting

Pawan Naidu, Prateek Jain
#VMworld #SER2980BU
VMworld 2017 Content: Not for publication or distribution


Disclaimer

• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Agenda

1 Why High Availability for vCenter Server

2 Platform Services Controller High Availability

3 vCenter High Availability aka VCHA

4 Troubleshooting vCenter High Availability Issues

5 Troubleshooting Certificate Issues

Why High Availability is Important for vCenter Server

[Diagram: products integrating with vCenter Server: Log Insight, Horizon View, SRM, NSX, vROps, Orchestrator, vRA, vCD, Backup, VASA]

Platform Services Controller High Availability


VMware Platform Services Controller (PSC)

Apart from Single Sign-On, the PSC gained several components in vCenter Server 6.0 that continue in version 6.5:

• Component Manager (CM)
• Service Control Agent (SCA)
• VMware Certificate Authority (VMCA)
• VMware Endpoint Certificate Store (VECS)
• VMware Authentication Framework Daemon (VMAFD)
• VMware HTTP Reverse Proxy (RHTTP)
• VMware License Service
• Global Permissions

[Diagram: six Platform Services Controller nodes]

Loadbalancer Overview

• Deployed in a Single Site
• Traffic Flow
• Functionality

[Diagram: two External Platform Services Controller nodes, SSO – Site A]

PSC HA steps

01 Installation: PSC nodes
02 MachineSSL: Generate and replace MachineSSL
03 Validation: MachineSSL has to be validated on all the PSC nodes
04 Loadbalancer: Prerequisites & Configuration
05 Certificate Validation: Confirm that we get the same certificate on the LB and on the PSC nodes
06 Update Config and Endpoint: Run SSOConfig and LsEndpoint

01 Installation

• Install and configure PSC nodes in a single SSO domain and site

[Diagram: PSC HA, Single Site, Node 1 and Node 2]

02 MachineSSL

• Generate and replace the Machine SSL certificate on all PSC nodes behind the load balancer
  KB: https://kb.vmware.com/kb/2147627
• FQDN and Common Name
• MachineSSL configured in SSL Passthrough Mode only

03 Validation

• At this stage, validate the properties of the Machine SSL certificate on both PSC nodes

/var/log/vmware/vmcad/certificate-manager.log

2017-07-25T06:57:17.195Z INFO certificate-manager [u'__MACHINE_CERT']
2017-07-25T06:57:17.195Z INFO certificate-manager Running command : ['/usr/lib/vmware-vmafd/bin/vecs-cli', 'entry', 'getcert', '--text', '--store', 'MACHINE_SSL_CERT', '--alias', '__MACHINE_CERT']
2017-07-25T06:57:17.210Z INFO certificate-manager Output :
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            79:00:00:00:03:7a:4b:03:66:55:32:c2:e4:00:00:00:00:00:03
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: DC=local, DC=virtualworld, CN=virtualworld-AD-CA
        Validity
            Not Before: Jul 25 06:18:04 2017 GMT
            Not After : Jul 25 06:18:04 2019 GMT
        Subject: C=IN, ST=KA, L=Bangalore, O=VMware, OU=GSS, CN=psc-lb.virtualworld.local
        X509v3 Subject Alternative Name:
            DNS:psc1.virtualworld.local, DNS:psc2.virtualworld.local, DNS:psc-lb.virtualworld.local
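The validation in step 03 can be partly scripted. The following is a minimal Python sketch, not a VMware tool: it reads the text form of a certificate (as printed by `vecs-cli entry getcert --text`) and checks that the subject CN is the load balancer FQDN and that the SAN lists the load balancer and every PSC node. The function names `parse_cert_text` and `check_psc_ha_cert` are hypothetical, and the rule that the CN must equal the load balancer FQDN is an assumption taken from the sample certificate on the slide.

```python
import re


def parse_cert_text(cert_text):
    """Extract the subject CN and the DNS SAN entries from openssl-style text."""
    # "Subject:" (with the colon) does not match "Subject Alternative Name".
    cn_match = re.search(r"Subject:.*?CN\s*=\s*([^,\n/]+)", cert_text)
    common_name = cn_match.group(1).strip() if cn_match else None
    san_names = re.findall(r"DNS:([^,\s]+)", cert_text)
    return common_name, san_names


def check_psc_ha_cert(cert_text, lb_fqdn, node_fqdns):
    """Return a list of problems; an empty list means all checks passed."""
    common_name, san_names = parse_cert_text(cert_text)
    problems = []
    # Assumption: the CN should be the load balancer FQDN, as in the sample.
    if common_name is None or common_name.lower() != lb_fqdn.lower():
        problems.append("CN %r does not match load balancer FQDN %r"
                        % (common_name, lb_fqdn))
    san_lower = {name.lower() for name in san_names}
    for fqdn in [lb_fqdn] + list(node_fqdns):
        if fqdn.lower() not in san_lower:
            problems.append("SAN is missing %s" % fqdn)
    return problems


# Sample lines copied from the certificate-manager.log output above.
SAMPLE = """\
Issuer: DC=local, DC=virtualworld, CN=virtualworld-AD-CA
Subject: C=IN, ST=KA, L=Bangalore, O=VMware, OU=GSS, CN=psc-lb.virtualworld.local
X509v3 Subject Alternative Name:
DNS:psc1.virtualworld.local, DNS:psc2.virtualworld.local, DNS:psc-lb.virtualworld.local
"""
```

Calling `check_psc_ha_cert(SAMPLE, "psc-lb.virtualworld.local", ["psc1.virtualworld.local", "psc2.virtualworld.local"])` on each node's output gives a quick pass/fail list before moving on to the load balancer step.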

04 Loadbalancer

Prerequisites
• NSX Manager configured and registered to an existing vCenter Server
• At least one Edge deployed to configure Edge Services Gateway

Follow KB: https://kb.vmware.com/kb/2147046

05 Certificate Validation

Validating MachineSSL
vCenter Server Appliance: /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store MACHINE_SSL_CERT --text

Validate that the load balancer presents the same certificate:
openssl s_client -connect psc-lb.virtualworld.local:443

root@psc1 [ /usr/lib/vmware-vmafd/bin ]# ./vecs-cli entry list --store MACHINE_SSL_CERT

Number of entries in store : 1

Alias : __MACHINE_CERT

Entry type : Private Key

Certificate : -----BEGIN CERTIFICATE-----

MIIF2TCCBMGgAwIBAgITeQAAAAN6SwNmVTLC5AAAAA

AAAzANBgkqhkiG9w0BAQsFADBSMRUwEwYKCZImiZPy

LGQBGRYFbG9jYWwxHDAaBgoJkiaJk/IsZAEZFgx2aXJ0d

WFsd29ybGQxGzAZBgNVBAMTEnZpcnR1YWx3b3JsZC1

BRC1DQTAeFw0

8MxTExreHvNBzeja2hRK+Rj7JBvuSGn9tLYFF8Mi9sDu3sb

0pL0hCxH4ljSgDFNM4LTuv3PqeWBIw9Z53171qYMM/V+7

VyVOenKUQqTgaIBsGC9D1pFluufpq6yQyhcn+o5ZMkV6Yj

CRtpS9a9XK271vUycv/hR3GJ+lrjehGJaKINcZqLqHKwxGn

5VMaIoWQ==

-----END CERTIFICATE-----

root@psc1 [ ~ ]# openssl s_client -connect psc-lb.virtualworld.local:443

Number of entries in store : 1

Alias : __MACHINE_CERT

Entry type : Private Key

Certificate : -----BEGIN CERTIFICATE-----

MIIF2TCCBMGgAwIBAgITeQAAAAN6SwNmVTLC5AAAAA

AAAzANBgkqhkiG9w0BAQsFADBSMRUwEwYKCZImiZPy

LGQBGRYFbG9jYWwxHDAaBgoJkiaJk/IsZAEZFgx2aXJ0d

WFsd29ybGQxGzAZBgNVBAMTEnZpcnR1YWx3b3JsZC1

BRC1DQTAeFw0

8MxTExreHvNBzeja2hRK+Rj7JBvuSGn9tLYFF8Mi9sDu3sb

0pL0hCxH4ljSgDFNM4LTuv3PqeWBIw9Z53171qYMM/V+7

VyVOenKUQqTgaIBsGC9D1pFluufpq6yQyhcn+o5ZMkV6Yj

CRtpS9a9XK271vUycv/hR3GJ+lrjehGJaKINcZqLqHKwxGn

5VMaIoWQ==

-----END CERTIFICATE-----
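Comparing two long Base64 blocks by eye is error-prone. As a hedged sketch (not part of the session material), the Python below pulls the first PEM block out of any captured command output, decodes it, and compares SHA-256 fingerprints. The function names `pem_fingerprint` and `same_certificate` are hypothetical. It needs the complete certificate text: the blocks on the slide are truncated in the middle and would not decode.

```python
import base64
import hashlib
import re

# Matches one PEM certificate, wherever it appears in the captured output.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def pem_fingerprint(text):
    """Return the SHA-256 hex digest of the first PEM certificate in text."""
    match = PEM_BLOCK.search(text)
    if match is None:
        raise ValueError("no PEM certificate block found")
    # Remove all whitespace so different line wrapping does not matter.
    body = "".join(match.group(1).split())
    der = base64.b64decode(body, validate=True)
    return hashlib.sha256(der).hexdigest()


def same_certificate(node_output, lb_output):
    """True when both outputs carry byte-identical certificates."""
    return pem_fingerprint(node_output) == pem_fingerprint(lb_output)
```

A typical use would be to save the `vecs-cli entry list` output from a PSC node and the `openssl s_client` output from the load balancer to files, read both, and pass the two strings to `same_certificate`.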


06 Update Config and Endpoint

• updateSSOConfig.py
• UpdateLsEndpoint.py

root@psc1 [ /usr/lib/vmware-sso/bin ]# python updateSSOConfig.py --lb-fqdn=psc-lb.virtualworld.local
root@psc1 [ /usr/lib/vmware-sso/bin ]# python UpdateLsEndpoint.py --lb-fqdn=psc-lb.virtualworld.local --user=<SSO administrator> --password=*******


vCenter High Availability


Comparing vCenter High Availability

Feature                                    vCenter High Availability  vSphere HA  Windows Server Failover Clustering
Easy to deploy                             ✓                          ✓           ✘
No shared storage required                 ✓                          ✘           ✘
No reliance on physical mode RDMs          ✓                          ✓           ✘
Managed via the vSphere Web Client         ✓                          ✓           ✘
Failover across vSphere Clusters           ✓                          ✘           ✘
Cross DC failover w/o storage replication  ✓                          ✘           ✘

vCenter Server High Availability

• Available only with vCenter Server Appliance
• Consists of 3 nodes: Active, Passive, and Witness
• Supports both Embedded and External PSC deployments
• Basic and Advanced mode deployment options available
• Enhanced Linked Mode not available with an Embedded PSC vCenter HA deployment

vCenter High Availability Deployment Modes

1 Basic (Wizard-driven)
• vCenter Server Appliance virtual machine running in the same vCenter
• vCenter in same SSO domain
• Automated deployment

2 Advanced (Manual Configuration)
• Active, Passive, and Witness nodes run on vCenter Servers in a different SSO domain
• Manual cloning and manual configuration of the 2nd network interface
• Manual DRS rules configuration

Cluster Modes of Operation

Validating VCHA Configuration

Troubleshooting vCenter High Availability Issues

vCenter HA Failure Scenarios #1

• Active node failure triggers a failover; the cluster runs in a degraded state
• The Passive node is promoted to Active and serves client requests
• Users see “Failover in Progress” in the vSphere UI during failover

[Diagram: Active, Passive, and Witness nodes]

vCenter HA Failure Scenarios #2

• No failover on Passive node failure; the cluster runs in a degraded state
• The Active node continues to serve client requests
• Cluster quorum is intact because the Witness and Active nodes are available

[Diagram: Active, Passive, and Witness nodes]

vCenter HA Failure Scenarios #3

• No failover on Witness node failure; the cluster runs in a degraded state
• The Active node continues to serve client requests
• Cluster quorum is intact because the Passive and Active nodes are available

[Diagram: Active, Passive, and Witness nodes]

vCenter HA Failure Scenarios #4

• VCHA cannot tolerate multiple node failures
• The Active node stops serving client requests
• Resolve the node issues to regain quorum and ensure cluster stability

[Diagram: Active, Passive, and Witness nodes]
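The four failure scenarios reduce to a small decision table. The Python sketch below only models what the slides state and is not how VCHA is implemented internally. The function name `vcha_outcome`, the dictionary keys, and the "healthy" label for the all-nodes-up case are assumptions added for illustration.

```python
def vcha_outcome(active_up, passive_up, witness_up):
    """Model the slide's failure scenarios for a 3-node VCHA cluster.

    Each argument is True when that node is available. Returns the cluster
    state, whether a failover is triggered, and whether client requests
    are still served.
    """
    nodes_up = sum([active_up, passive_up, witness_up])
    if nodes_up == 3:
        # Not on the slides: assumed baseline with every node available.
        return {"state": "healthy", "failover": False, "serving": True}
    if nodes_up <= 1:
        # Scenario #4: multiple node failures, quorum is lost.
        return {"state": "no quorum", "failover": False, "serving": False}
    if not active_up:
        # Scenario #1: Passive is promoted to Active.
        return {"state": "degraded", "failover": True, "serving": True}
    # Scenarios #2 and #3: Passive or Witness lost, Active keeps serving.
    return {"state": "degraded", "failover": False, "serving": True}
```

For example, `vcha_outcome(False, True, True)` describes scenario #1, and `vcha_outcome(True, False, False)` describes scenario #4, where the surviving Active node stops serving requests.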

Upgrading Nodes in a VCHA Cluster

1 Maintenance: Place the VCHA cluster in maintenance mode
2 Patch ISO: Mount the patch ISO on all 3 nodes
3 Identify Nodes: Identify the current roles of the deployed VCHA nodes
4 Patching Passive Node: Patch the Passive node
  Command: software-packages install --iso --acceptEulas
5 Initiate Failover: Initiate a failover to change node roles
6 Patching Witness Node: Patch the Witness node
  Command: software-packages install --iso --acceptEulas
7 Patch the last node
  Command: software-packages install --iso --acceptEulas

Troubleshooting Certificate Issues

Replacing SSL Certificates on a VCHA Cluster

1 Best Practices for Replacing Certificates
• Replace the Machine SSL certificate on the vCenter Server Appliance node before configuring VCHA

2 Running VCHA Setup
• VCHA must be removed before replacing the Machine SSL certificate on vCenter Server

Replacing Machine SSL Certificates on PSC HA

[2017-07-25T18:37:52.924Z] [INFO ] http-bio-9090-exec-9 com.vmware.vise.vim.security.sso.impl.SsoCmLocatorImpl Fetching SSO endpoints from CM
[2017-07-25T18:37:52.995Z] [ERROR] http-bio-9090-exec-9 com.vmware.vise.vim.security.sso.impl.SsoServiceImpl Error when fetching sts root certificates java.util.concurrent.ExecutionException: java.security.cert.CertificateException: Could not parse certificate: java.io.IOException: java.lang.IllegalArgumentException: Input byte array has incorrect ending byte at 2004
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:206)
    at com.vmware.vise.vim.security.sso.impl.SsoServiceImpl.getStsConfig(SsoServiceImpl.java:351)
    at com.vmware.vise.vim.security.sso.impl.SsoServiceImpl.getStsCertificates(SsoServiceImpl.java:422)
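The "incorrect ending byte" message in the trace reads like a strict Base64 decoder rejecting certificate data, for example stray characters after the padding. As a rough aid, and only under that assumption, the Python sketch below scans a text dump for PEM certificate blocks and reports which ones fail strict Base64 decoding. The function names `is_strict_base64` and `malformed_pem_blocks` are hypothetical, and the sketch does not show where the affected certificate is stored or how to repair it.

```python
import base64
import binascii
import re

# Matches each PEM certificate block in a text dump.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def is_strict_base64(body):
    """True when body decodes as Base64 once line breaks are removed."""
    compact = "".join(body.split())
    try:
        base64.b64decode(compact, validate=True)
    except binascii.Error:
        return False
    return True


def malformed_pem_blocks(text):
    """Return the zero-based indexes of PEM blocks that fail strict decoding."""
    return [
        index
        for index, match in enumerate(PEM_BLOCK.finditer(text))
        if not is_strict_base64(match.group(1))
    ]
```

Running `malformed_pem_blocks` over an exported certificate chain narrows the search to the block that needs a closer look.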
