Monitoring NGINX (plus): key metrics and how-to

Monitoring nginx
Alexis Lê-Quôc, Datadog

Agenda
• Dramatis personae
• Observations
• Monitoring 1 nginx (plus) with logs
• Monitoring 1 nginx (plus) with metrics
• Monitoring N nginx effectively

@alq CTO at Datadog

Datadog == monitoring
• Monitoring as a service
• Works really well with large, dynamic environments (e.g. clouds)
• Aggregate performance metrics
• Correlate nginx performance with the rest of your infrastructure

Observations
From the field

Some stats
• Across all monitored servers
  • nginx ~10%
  • Apache ~5%
• CPU (and CPU/$) is the dominant resource

[Chart: % of instances per core count. x-axis: core count (1, 2, 4, 8, 12, 16, 24, 32)]

[Chart: % of instances per type (AWS only). EC2 types shown: c3.l, c3.2xl, c1.xl, c3.8xl, m3.l, c3.xl, m3.m, cc2.8xl, t2.m, c3.4xl, rest. Value labels shown: 3.1%, 4.4%, 4.5%, 4.7%, 5%, 5.3%, 13%, 14%]

Monitoring nginx
1. Monitoring with logs
2. Monitoring with status
3. Monitoring with statsd

Monitoring with logs

• Canonical example of log indexers
• Your choice of:
  • logstash
  • splunk
  • logentries, sumologic, loggly, etc.

nginx log → forwarder → indexer → UI

Monitoring with logs

nginx log → forwarder → indexer → UI

Strengths | Weaknesses
forensics & anomalies | low signal-to-noise ratio
content-driven analysis | “black box”

Monitoring with metrics

• open-source: ngx_http_stub_status_module
  • bare-bones metrics
  • human-readable text presentation
• plus: ngx_http_status_module
  • a lot more metrics for each function
  • JSON format
• Your choice of…
  • Datadog, Nagios, Zabbix, etc. for open-source
  • Datadog for nginx plus

nginx status → collector → aggregator → UI/alerts
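A minimal sketch of the collector step for open-source nginx, assuming the status page returns the usual four-line stub_status text. The sample text and the `parse_stub_status` name are illustrative and not taken from the talk.

```python
import re

# Illustrative sample in the text format produced by ngx_http_stub_status_module.
SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Parse stub_status text into a flat dict of integers."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    # The three counters sit alone on one line: accepts, handled, requests.
    counters = re.search(
        r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE
    )
    states = re.search(
        r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text
    )
    if not (active and counters and states):
        raise ValueError("unrecognized stub_status output")
    accepts, handled, requests = (int(n) for n in counters.groups())
    reading, writing, waiting = (int(n) for n in states.groups())
    return {
        "active": int(active.group(1)),  # gauge
        "accepts": accepts,              # counter
        "handled": handled,              # counter
        "requests": requests,            # counter
        "reading": reading,              # gauge
        "writing": writing,              # gauge
        "waiting": waiting,              # gauge (idle keep-alive connections)
    }


metrics = parse_stub_status(SAMPLE)
```

In a real collector the text would be fetched over HTTP from whatever location has `stub_status` enabled, then forwarded to the aggregator.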

Monitoring with metrics

nginx status → collector → aggregator → UI/alerts

Strengths | Weaknesses
lightweight & real-time | no insight into content
“white box” |

Simple metrics taxonomy
1. What it measures
• Work or resource
• Focus on work because work == value
• Resource analysis is useful to understand performance
• Use Brendan Gregg’s USE method
  • Utilization (% over time)
  • Saturation (queue length)
  • Errors (count over time)
2. Type
• Gauge: point-in-time sample
• Counter: accumulated sample, needs to be derived (turned into a rate over time) to be meaningful

http://www.brendangregg.com/usemethod.html
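To make the gauge/counter distinction concrete, here is a small sketch of deriving a counter into a per-second rate from two samples. The function name and the choice to return `None` on a counter reset are assumptions for illustration, not something the talk prescribes.

```python
def counter_rate(prev_value, prev_time, curr_value, curr_time):
    """Return the per-second rate between two counter samples.

    Returns None when no rate can be computed: the samples are not
    ordered in time, or the counter went backwards (for example after
    an nginx restart reset it to zero).
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        return None
    delta = curr_value - prev_value
    if delta < 0:
        return None
    return delta / elapsed


# 600 new requests over 60 seconds.
rate = counter_rate(1000, 0.0, 1600, 60.0)
```

A gauge needs no such step: each sample is already meaningful on its own.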

Open-source metrics

Class | Type | Resource/Work | Notes
Current connections | Gauge | Resource | reading, writing, idle
Accepted connections | Counter | Resource |
Handled connections | Counter | Resource | <= accepted if resource limit
Requests | Counter | Work | True purpose of the server

• Latency must be measured using logs or statsd.
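The open-source counters become useful once two samples are compared. The sketch below assumes each sample is a dict with `accepts`, `handled` and `requests` keys (hypothetical names chosen for this example) and derives a work metric (request rate) and a saturation signal (connections accepted but not handled).

```python
def derive_work_and_saturation(prev, curr, interval_s):
    """Derive interval metrics from two samples of the open-source counters."""
    accepted = curr["accepts"] - prev["accepts"]
    handled = curr["handled"] - prev["handled"]
    requests = curr["requests"] - prev["requests"]
    return {
        # Work: how many requests the server served per second.
        "requests_per_s": requests / interval_s,
        # Saturation: handled < accepted means connections were dropped.
        "dropped_connections": accepted - handled,
        # Connection reuse: average requests served per handled connection.
        "requests_per_connection": requests / handled if handled else 0.0,
    }


prev = {"accepts": 100, "handled": 100, "requests": 250}
curr = {"accepts": 160, "handled": 157, "requests": 550}
derived = derive_work_and_saturation(prev, curr, interval_s=60)
```

A non-zero `dropped_connections` value is the case the table notes as handled being below accepted.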

Key “plus” metrics

Class | Type | Resource/Work | Notes
5xx Errors | Counter | Work | without log analysis
5xx/sum(Nxx) | Gauge | Work | error rate %
idle/dropped connections | Gauge | Resource | saturation
active/total connections | Gauge | Resource | upstream capacity
Requests | Counter | Work | true purpose of the server

• Latency must be measured using logs or statsd.
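The 5xx/sum(Nxx) error rate can be sketched as follows. This assumes the status JSON exposes per-class response counters under the keys "1xx" through "5xx"; that payload shape and the function name are assumptions for illustration and should be checked against the actual status output.

```python
RESPONSE_CLASSES = ("1xx", "2xx", "3xx", "4xx", "5xx")


def error_rate_5xx(prev_responses, curr_responses):
    """Return the percentage of responses in the interval that were 5xx.

    Both arguments are dicts of cumulative response counters keyed by
    status class. The counters are differenced first so the result
    reflects the interval, not the whole server lifetime.
    """
    deltas = {
        c: curr_responses[c] - prev_responses[c] for c in RESPONSE_CLASSES
    }
    total = sum(deltas.values())
    if total <= 0:
        return 0.0  # no traffic in the interval, or counters were reset
    return 100.0 * deltas["5xx"] / total


prev = {"1xx": 0, "2xx": 1000, "3xx": 100, "4xx": 50, "5xx": 5}
curr = {"1xx": 0, "2xx": 1940, "3xx": 130, "4xx": 70, "5xx": 15}
rate_pct = error_rate_5xx(prev, curr)
```

Computed this way, the error rate behaves as a gauge even though its inputs are counters.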

Monitoring with statsd

nginx → statsd → UI/alerts

Strengths | Weaknesses
lightweight, real-time, standard | not comprehensive
custom metrics, content-aware |

https://github.com/zebrafishlabs/nginx-statsd
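For readers unfamiliar with statsd, the sketch below shows the plain-text line protocol a statsd server accepts over UDP: `<name>:<value>|<type>`, with `c` for counters, `g` for gauges and `ms` for timers. The metric names are hypothetical, and this is a standalone Python illustration of the wire format, not how the nginx-statsd module is configured.

```python
import socket

STATSD_TYPES = ("c", "g", "ms")  # counter, gauge, timer (milliseconds)


def format_statsd(name, value, metric_type):
    """Format one metric in the statsd line protocol."""
    if metric_type not in STATSD_TYPES:
        raise ValueError(f"unsupported statsd type: {metric_type!r}")
    return f"{name}:{value}|{metric_type}"


def send_statsd(line, host="127.0.0.1", port=8125):
    """Send one statsd line over UDP (8125 is the conventional statsd port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric names for a request counter and a latency timer.
count_line = format_statsd("nginx.requests", 1, "c")
timer_line = format_statsd("nginx.request_time", 42, "ms")
```

Because delivery is fire-and-forget UDP, emitting metrics this way adds very little overhead to request handling, which is where the "lightweight" strength comes from.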

Example

Monitoring nginx
1. Logs for content-analysis (forensics, anomalies, marketing)
2. Status for (white box) performance monitoring
3. statsd for custom metrics

No single method gives you everything you need.

Monitoring a lot of nginx
1. Requires aggregation
2. It’s all about metadata (“pet-to-cattle” mindset)
3. Correlation

Aggregation
• By default for log-based monitoring
• Not by default for metric-based monitoring

Metadata
• Analyze by properties that are not the host identity
• Find anomalies that are not obvious
• Pet-to-cattle evolution: hosts don’t matter, services do
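One way to picture aggregation by metadata: each sample carries tags, and metrics are summed per tag value instead of per host. The sample layout (`host`, `tags`, `value`) and the tag names are assumptions made for this sketch.

```python
from collections import defaultdict


def aggregate_by_tag(samples, tag):
    """Sum sample values grouped by one tag, ignoring host identity."""
    totals = defaultdict(float)
    for sample in samples:
        key = sample["tags"].get(tag, "untagged")
        totals[key] += sample["value"]
    return dict(totals)


# Requests per second reported by three hypothetical nginx hosts.
samples = [
    {"host": "web-1", "tags": {"role": "edge", "az": "a"}, "value": 120.0},
    {"host": "web-2", "tags": {"role": "edge", "az": "b"}, "value": 180.0},
    {"host": "web-3", "tags": {"role": "api", "az": "a"}, "value": 50.0},
]
by_role = aggregate_by_tag(samples, "role")
by_az = aggregate_by_tag(samples, "az")
```

The same samples can be sliced by role, availability zone, or any other property, which is the point of treating hosts as cattle rather than pets.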

Correlation
• nginx is only one piece of the infrastructure

#plug
www.datadog.com

Thank you!
Questions/Comments? @alq
