VMworld 2013: Deep Dive into vSphere Log Management with vCenter Log Insight


Deep Dive into vSphere Log Management with vCenter Log Insight

Steve Flanders, VMware

Chengdu Huang, VMware

VCM4445

#VCM4445


Agenda

Introduction

Query Building Deep Dive

Performance Deep Dive

Mini Deep Dives

Wrap Up


Introduction


Presenters

Steve Flanders

• Senior Solutions Architect, VMware

• VCAP-DCA

• @smflanders

• sflanders.net

Chengdu Huang

• Chief Architect of Log Insight, VMware

• PhD, University of Illinois at Urbana-Champaign

• @chengduh


Problem Statement

VMware Logs

OS and App Logs

Physical Infrastructure Logs

200 ESXi hosts + VMs = 200 GB or 2 billion log events per day


Full Stack Aggregation + Analytics

[Diagram: the sources below send logs to Log Insight over syslog]

• vCloud® Suite

• Operating System

• 3rd party infrastructure, e.g. Cisco, Dell, EMC, HP, NetApp

• Custom and 3rd party apps, e.g. MS, Oracle, SAP

Log Insight: Operational Log Management & Analytics

• Search

• Analyze

• Discover

• Visualize


Query Building Deep Dive


Objectives

Understand what comprises a query

Learn how to query using matches and regular expressions

Learn best practices for query construction


Interactive Analytics – Overview


Interactive Analytics – Overview Detailed

• Search Box and Query Builder: full-text and regular expressions

• Time Range for the query

• Overview Chart: visual representation of data

• Adjust Scale

• Save Chart

• Aggregation functions / analytics: manipulation of visual data

• Results List: textual representation of data

• Breakdown Charts for each of the fields

• Other Options: Save/Load/Export Query, Add/Manage Alerts, Manage Extracted Fields, Export Query Results



Interactive Analytics – Search/Query

• Search Box and Query Builder: full-text and regular expressions


Interactive Analytics – Search/Query

• Search Box and Query Builder: full-text and regular expressions

• Time Range for the query

• Aggregation functions / analytics: manipulation of visual data

• Breakdown Charts for each of the fields

• Other Options: Save/Load/Export Query, Add/Manage Alerts, Manage Extracted Fields, Export Query Results


Demo!


Interactive Analytics – Query Building 1/2

• The search terms support globbing, i.e. ‘*’ and ‘?’

• Queries with a leading glob (prefix queries) are not supported: *rror and ?error are invalid

• Auto completion is available for both keywords and constraints

• The number of matches shown for autocompleted terms is an approximation

• Auto completion applies only to the first word in a phrase

• The incoming messages are

Auto completion

Highlighting of matches


Interactive Analytics – Query Building 2/2

• ‘equals’ and ‘does not equal’ support * (glob) and ?

• starts with(err) and matches(err*) are the same query

• Comma separated values form an OR constraint

• hostname matches hostA, hostB means hostname is either hostA OR hostB

• Clicking on a field in the message list or a bar in the overview chart creates a constraint

• The constraints can form a logical AND (match all) or logical OR (match any)

Query builder callouts:

• all (logical AND) or any (logical OR)

• Comparison operators are different for string and numeric fields

• Alphanumeric fields can have a regex constraint

• ‘exists’ does not require a constraint value


Recap – Query Building

General

• Case insensitive queries

• Complete keyword matching

• Special character queries via regular expressions only

• Globs (* and ?) can be used to enhance keyword queries

Search bar

• Space separated keywords are logical AND queries

• Phrases are entered using double quotations

• No regular expressions

Constraints

• Field operations

• Values separated by comma are logical OR queries

• Multiple constraints can be logical AND or logical OR queries

• Regular expressions available
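
The matching rules in this recap can be sketched in a few lines of Python. This is only an illustration of the semantics listed above (case-insensitive, whole-keyword matching, globs, AND for space-separated keywords, OR for comma-separated values). The tokenizer and all function names are assumptions made for the example, not Log Insight internals, and quoted phrases are not handled.

```python
import re
from fnmatch import fnmatchcase


def tokenize(message):
    """Split a message into lowercase keywords (assumed tokenization)."""
    return re.findall(r"[a-z0-9_]+", message.lower())


def keyword_matches(term, message):
    """True if any whole keyword in the message matches the glob term."""
    pattern = term.lower()
    return any(fnmatchcase(token, pattern) for token in tokenize(message))


def search_bar(query, message):
    """Search bar: space-separated keywords form a logical AND."""
    return all(keyword_matches(term, message) for term in query.split())


def constraint_matches(values, message):
    """Constraint: comma-separated values form a logical OR."""
    return any(keyword_matches(v.strip(), message) for v in values.split(","))
```

For example, against the message "Error: failed to power on VM web01", the query err finds nothing because keywords must match completely, while err* and web0? both match.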


Performance Deep Dive


Objectives

Understand the system architecture

Understand the considerations for ingestion versus queries

Recognize common performance problems

• “I have X hosts sending logs to Log Insight, and it can’t keep up”

• “I ran this query and it took a long time to finish”

• “My dashboard is really slow to load”


System Architecture

[Diagram components: Clients; Syslog (TCP, UDP); Ingestion Pipeline; Indexes and Compressed Logs; Query Processing Pipeline; Web Server]


Ingestion Pipeline

Multi-staged pipeline

• Connected with bounded queues

• Message dropping happens when all queues are full

Very resource efficient

Resource Usage

• CPU: Heavy

• Memory: Light

• Disk IO: Neutral

• Network: Light
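
A multi-staged pipeline connected by bounded queues can be modeled with a small simulation. This is a minimal sketch, assuming three stages, a queue capacity of two, and a final storage step that is the slow part. The class and method names are hypothetical and say nothing about how Log Insight is actually implemented. The point is only that a message is dropped once every queue is full.

```python
from collections import deque


class IngestionPipeline:
    """Toy model: stages connected by bounded queues (assumed sizes)."""

    def __init__(self, stages=3, capacity=2):
        self.capacity = capacity
        self.queues = [deque() for _ in range(stages)]
        self.stored = []   # messages that reached storage
        self.dropped = 0   # messages rejected at the front door

    def _advance(self):
        # Walk from the last queue backwards, moving messages forward
        # wherever the next queue still has room.
        for i in range(len(self.queues) - 2, -1, -1):
            src, dst = self.queues[i], self.queues[i + 1]
            while src and len(dst) < self.capacity:
                dst.append(src.popleft())

    def receive(self, message):
        """Accept a message, or drop it when all queues are full."""
        self._advance()
        if len(self.queues[0]) >= self.capacity:
            self.dropped += 1
            return False
        self.queues[0].append(message)
        return True

    def drain_one(self):
        """Simulate the slow final step writing one message to storage."""
        self._advance()
        if self.queues[-1]:
            self.stored.append(self.queues[-1].popleft())
            self._advance()
```

With the default sizes, a burst of eight messages fills all six slots and the last two are dropped. Draining one message frees a slot so the next message is accepted again, which is why extra buffering helps with short spikes but not with a sustained overload.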


Performance Consideration – Ingestion Rate Not High Enough

CPU

• If CPU utilization hovers at 100%, add more CPU cores

• Ingestion generally does not utilize more than 6 CPU cores

Memory

• More memory can help absorb spikes in the incoming rate

Disk IO

• “Effective” IOPS

Network

• Reliability

• Consider syslog aggregator when the number of hosts is very large


Query Engine

Complex processing pipeline

• High performance

• Admission control to avoid thrashing

A lot more resource intensive

Resource Usage

• CPU: Heavy

• Memory: Heavy

• Disk IO: Heavy

• Network: Light


Performance Consideration – Time Range

Very big impact on performance

• Affects the amount of data to process

• Affects IO and memory locality

Use a short, specific time range


Performance Consideration – Keyword vs Regex

Keyword is much faster

Convert regex to keyword if possible

• error.* => error*

• (start|stop|power off) => start,stop,"power off"

Huge performance gain

• Sometimes 10x faster
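
One generic way to see why keywords can beat regular expressions is to compare an index lookup with a full scan. The sketch below is an assumption-laden illustration, not a description of Log Insight's index. The sample messages, the tokenizer, and the function names are all made up for the example, and quoted phrases such as "power off" are left out.

```python
import re
from collections import defaultdict

# Hypothetical sample messages.
messages = [
    "vm-01 power on requested",
    "vm-02 start completed",
    "vm-03 stop requested by user",
    "host-01 error: disk latency high",
]


def tokenize(text):
    """Split text into lowercase keywords (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Inverted index built once at ingestion time: keyword -> message ids.
index = defaultdict(set)
for msg_id, text in enumerate(messages):
    for token in tokenize(text):
        index[token].add(msg_id)


def keyword_any(*keywords):
    """Keyword query: one dictionary lookup per keyword, no scan."""
    ids = set()
    for kw in keywords:
        ids |= index.get(kw.lower(), set())
    return ids


def regex_scan(pattern):
    """Regex query: every message has to be read and tested."""
    rx = re.compile(pattern, re.IGNORECASE)
    return {i for i, text in enumerate(messages) if rx.search(text)}
```

Here keyword_any("start", "stop") and regex_scan(r"\b(start|stop)\b") return the same message ids, but only the regex version touches every message, so its cost grows with the amount of data in the time range.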


Performance Consideration – Field Extraction

Extracting dynamic fields

• Provide sufficient and specific context
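
To illustrate why context matters when extracting a field, compare a loose pattern with one anchored by the text around the value. This is a generic regular-expression sketch. The message format and the field name latency are hypothetical, and it does not reproduce the Log Insight extraction dialog.

```python
import re

# Loose: "any run of digits" gives the engine nothing to anchor on.
LOOSE = re.compile(r"(\d+)")

# Specific: the text before and after the value pins down which digits we want.
SPECIFIC = re.compile(r"latency=(\d+)ms")


def extract(pattern, message):
    """Return the first captured number, or None if the pattern does not match."""
    match = pattern.search(message)
    return int(match.group(1)) if match else None
```

For a message such as "2013-08-26T10:15:00 host12 disk latency=250ms", the loose pattern picks up the year from the timestamp, while the specific one returns 250 and simply does not match messages that lack the field.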


Performance Consideration – Run-away Queries

Monitor run-away queries

• Count all messages in the past 3 years that match:

((((((0?[1-9])|([1-2][0-9])|(3[0-1]))-(([jJ][aA][nN])|([mM][aA][rR])|([mM][aA][yY])|([jJ][uU][lL])|([aA][uU][gG])|([oO][cC][tT])|([dD][eE][cC])))|(((0?[1-9])|([1-2][0-9])|(30))-(([aA][pP][rR])|([jJ][uU][nN])|([sS][eE][pP])|([nN][oO][vV])))|(((0?[1-9])|(1[0-9])|(2[0-8]))-([fF][eE][bB])))-(20(([13579][01345789])|([2468][1235679]))))|(((((0?[1-9])|([1-2][0-9])|(3[0-1]))-(([jJ][aA][nN])|([mM][aA][rR])|([mM][aA][yY])|([jJ][uU][lL])|([aA][uU][gG])|([oO][cC][tT])|([dD][eE][cC])))|(((0?[1-9])|([1-2][0-9])|(30))-(([aA][pP][rR])|([jJ][uU][nN])|([sS][eE][pP])|([nN][oO][vV])))|(((0?[1-9])|(1[0-9])|(2[0-9]))-([fF][eE][bB])))-(20(([13579][26])|([2468][048])))))


Performance Considerations – Run-away Queries

Cancel run-away queries

• Time elapsed since the query was issued (including queuing time)

• Whether the query is still waiting to be executed

• Cancel the execution


Recap – Resource and Performance

More CPU helps

• Many steps are CPU-bound

• Allows more queries to run in parallel

More memory helps

• More memory for VA helps OS IO buffer cache

• Bigger heap size gives more room for application cache

Faster IO helps

• Exclusively read; a lot of random accesses

• IO demand can be very high

Network is not a concern

Heavily depends on the queries


Mini Deep Dives


Retention and Archiving


Retention

[Diagram: buckets laid out over time. Data fills Bucket 0, then Bucket 1, and so on up to Bucket n. When Bucket n+1 is created, Bucket 0 no longer appears.]
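
The rotation shown for retention behaves like a fixed-size window over a sequence of buckets. A minimal sketch, assuming a fixed maximum number of buckets (the function name and the limit are hypothetical; the slides do not say how Log Insight decides when a bucket is retired):

```python
from collections import deque


def rotate(buckets, new_bucket, max_buckets):
    """Add a new bucket; retire and return the oldest one if the store is full."""
    retired = None
    if len(buckets) >= max_buckets:
        retired = buckets.popleft()  # oldest data is retired first
    buckets.append(new_bucket)
    return retired
```

With room for three buckets, adding buckets 0 through 4 retires bucket 0 and then bucket 1, leaving buckets 2, 3 and 4 online.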


Archiving

[Diagram: the same bucket sequence over time, with buckets also written to Archive (NFS); the final row is labeled "Full" and "Drop".]


Ingestion


Ingestion – Syslog

Ingestion is allowed only over the syslog protocol today

• This means you need a syslog agent on every device

• Exception – vCenter Server events, tasks, and alarms (API)

Syslog agents are flexible

• Can monitor files (e.g. logs in non-standard locations, configuration, etc.)

• Can tag messages (makes querying easier)

• Can convert SNMP to syslog
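
To make the tagging idea concrete, here is a rough sketch of what a syslog agent puts on the wire. It builds a stripped-down message containing only the priority value and a tag (real agents normally add a timestamp and hostname as well) and sends it over UDP. The function names are hypothetical, and the host passed to send_udp is whatever address your Log Insight instance uses.

```python
import socket


def format_syslog(facility, severity, tag, message):
    """Build a minimal syslog payload: <PRI>TAG: MESSAGE.

    PRI is facility * 8 + severity, as defined by the syslog protocol.
    """
    pri = facility * 8 + severity
    return f"<{pri}>{tag}: {message}".encode("utf-8")


def send_udp(payload, host, port=514):
    """Send one datagram to a syslog receiver (no delivery guarantee)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

For example, format_syslog(23, 3, "APP", "disk full") uses facility local7 (23) and severity error (3). The tag then gives queries a keyword to filter on.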


Client Configuration – Syslog-NG

Forward logs

• Uncomment/Add the following section and edit as needed

#

# Enable this and adopt IP to send log messages to a log server.

#

#destination logserver { udp("10.10.10.10" port(514)); };

#log { source(src); destination(logserver); };

Monitor a file

• For each file to monitor, add a line like:

source s_file { file("/path/to/app.log" flags(no-parse)); };

• Then modify the forward logs line above to include the new source:

log { source(src); source(s_file); destination(logserver); };

Source

• http://www.syslog.org/logged/reading-logs-from-a-file-in-syslog-ng/


Client Configuration – Syslog-NG (Cont.)

Tag logs

• Using tags

source s_file { file("/path/to/app.log" flags(no-parse) log_prefix("APP: ")); };

source s_file { file("/path/to/app.log" flags(no-parse) program_override("APP: ")); };

• Using templates

destination my_file {

file("/path/to/app.log" template("$ISODATE $FULLHOST $TAG $MESSAGE"));

};

SNMP to syslog

• If running syslog-ng v3 or newer with snmptrapd configured

filter f_snmptrapd { program("snmptrapd"); };

rewrite r_snmptrapd { subst("^([^ ]+) (.*)$", "${2}"); set("${1}" value("HOST")); };

Source

• http://bazsi.blogs.balabit.com/2008/11/syslog-ng-3-0-and-snmp-traps/


Client Configuration – Rsyslog

Forward logs (http://www.rsyslog.com/sending-messages-to-a-remote-syslog-server/)

• UDP

<what>;<to>;<forward> @server.example.com:514

• TCP

<what>;<to>;<forward> @@server.example.com:514

• Example

*.* @@server.example.com:514

Monitor a file (http://www.rsyslog.com/doc/imfile.html)

module(load="imfile" PollingInterval="10") #needs to be done just once

input(type="imfile" File="/path/to/file1"

Tag="tag1"

StateFile="/var/spool/rsyslog/statefile1"

Severity="error"

Facility="local7")


Client Configuration – Rsyslog (Cont.)

Tag logs

template(name="FileFormat" type="string"

string= "%TIMESTAMP% %HOSTNAME% %syslogtag%%msg%\n"

)

SNMP to syslog

$template mkeventd,"<%PRI%>%TIMESTAMP% %HOSTNAME% %syslogtag%%msg%\n"

$template mkeventdsnmp,"<%PRI%>%TIMESTAMP% %msg:F,58:1$% %syslogtag%%msg%\n"

:programname,isequal,"snmptrapd" ^/omd/sites/mysite/bin/mkevent;mkeventdsnmp

:programname,!isequal,"snmptrapd" ^/omd/sites/mysite/bin/mkevent;mkeventd

Sources


Alerts


Alerts – Types

Query-based alerts

• Email

• vCenter Operations Manager

System alerts

• Dropped messages

• Failed to archive

• About to retire, or delete, old data


Alerts – Enable/Disable

Query-based alerts

• Content Pack alerts – always disabled

• Custom alerts – always user-specific

• If neither email nor vCenter Operations Manager is selected then disabled

• Otherwise, enabled

• NOTE: If previously enabled and then disabled, settings are preserved

System alerts

• Cannot be individually disabled

• Cannot be modified

Disable ALL alerts

• Administration > General > Suspend All Alerts

• Applies to query-based alerts and system alerts

• Avoid if possible!


Alerts – SNMP

[Diagram: Email and SNMP, shown as a two-step flow]


Time


Interactive Analytics – Timestamp

• The displayed timezone is that of the browser

• The Time Range follows the browser time

• If the current time is 9pm PDT but the browser time is 8pm PDT, “Latest 5 minutes of data” means [7:55pm PDT, 8pm PDT]

• The incoming messages are timestamped at arrival with the time of the Log Insight VA

This can cause a small discrepancy between the timestamp in the message and the timestamp that Log Insight uses
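
The time range example above can be written out as a small calculation. This sketch only restates the arithmetic: the window is anchored to whatever the browser clock says, not to the actual current time. The function name and the date are hypothetical.

```python
from datetime import datetime, timedelta


def latest_window(browser_now, minutes):
    """Return (start, end) of "Latest N minutes" relative to the browser clock."""
    return browser_now - timedelta(minutes=minutes), browser_now
```

With a browser clock reading 8:00pm, latest_window(datetime(2013, 8, 26, 20, 0), 5) spans 7:55pm to 8:00pm even if the real time is 9:00pm, so messages stamped by the Log Insight VA during the last hour fall outside the window.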


Wrapping Up


Summary

Size properly – ingestion and queries set resource requirements

• CPU is a common bottleneck for ingestion and queries

• Memory can help, but typically not as much as other resources

• IOPS is a common bottleneck especially for queries

• Network should not be the bottleneck, but connectivity can impact ingestion

Queries – be as specific as possible

• Limit the time range

• Provide as much textual context as possible

• Use globs when needed

• Avoid regular expressions whenever possible

Management – other considerations

• Monitor NFS archive – a full archive can lead to dropped events

• Disable all alerts – also disables system alerts


Log Insight Resources

General Log Insight Resources

• Product

http://www.vmware.com/products/datacenter-virtualization/vcenter-log-insight

• Communities

http://communities.vmware.com/community/vmtn/vcenter/vcenter-log-insight

• Marketplace (content packs)

http://loginsight.vmware.com/

• Twitter

@VMLogInsight (follow and get 5 free licenses!)

VMworld Log Insight Resources

• General Session: VCM4528 – Tips and Tricks with vCenter Log Insight

• General Session: VCM5034 – Troubleshooting at Cox Communications

• Group Discussion: VCM1005-GD – Log Insight with Steve Flanders

• Solutions Exchange: VMware booth – Log Analytics

• Hands-on Labs: HOL-SDC-1301 – VMware vCenter Log Insight


THANK YOU
