Monitoring Containers with Weave Scope

Monitoring Containers David Kaltschmidt @davkals


Page 1: Monitoring Containers with Weave Scope

Monitoring Containers

David Kaltschmidt

@davkals

Page 2: Monitoring Containers with Weave Scope
Page 3: Monitoring Containers with Weave Scope

Rogue waves present considerable danger for several reasons:

• unpredictable

• may appear suddenly or without warning

• and can impact with tremendous force.

Page 4: Monitoring Containers with Weave Scope

Performance Methodologies

• For system engineers: ways to analyse unfamiliar systems

• For app developers: guidance for metric and dashboard design

- Brendan Gregg’s Systems Methodology

Page 5: Monitoring Containers with Weave Scope

Traffic Light Anti-Method

1. Turn all metrics into traffic lights

2. Everything green? No worries, mate.

- Brendan Gregg’s Systems Methodology

🚦

Page 6: Monitoring Containers with Weave Scope

https://github.com/weaveworks/scope

Intuition Engineering:

we need a tool that gives us an intuitive understanding of the entire system at a glance.

Page 7: Monitoring Containers with Weave Scope

http://cloud.weave.works/

Weave Cloud

Hosted version of Weave Scope

Runs on K8s

Page 8: Monitoring Containers with Weave Scope

DEMO

Page 9: Monitoring Containers with Weave Scope

Great, but…

• Short-lived connections

• Hairballs, and shifting layouts

Page 10: Monitoring Containers with Weave Scope

Scope OSS Architecture

[Diagram labels: scope-probe on each of Host 1, Host 2 and Host 3; scope-app; Browser]

Page 11: Monitoring Containers with Weave Scope

Connection Tracking

/home/weave # conntrack -E
[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41066 dport=80 src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41066 [ASSURED]
[DESTROY] tcp 6 src=192.168.99.100 dst=192.168.99.100 sport=36236 dport=32778 src=172.17.0.8 dst=192.168.99.100 sport=80 dport=36236 [ASSURED]
[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41068 dport=80 src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41068 [ASSURED]
[DESTROY] tcp 6 src=192.168.99.100 dst=192.168.99.100 sport=52996 dport=32776 src=172.17.0.6 dst=192.168.99.100 sport=80 dport=52996 [ASSURED]
[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41070 dport=80 src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41070 [ASSURED]
[DESTROY] tcp 6 src=192.168.99.100 dst=192.168.99.100 sport=52998 dport=32776 src=172.17.0.6 dst=192.168.99.100 sport=80 dport=52998 [ASSURED]
[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41072 dport=80 src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41072 [ASSURED]
[DESTROY] tcp 6 src=192.168.99.100 dst=192.168.99.100 sport=57975 dport=32777 src=172.17.0.7 dst=192.168.99.100 sport=80 dport=57975 [ASSURED]
[DESTROY] tcp 6 src=172.17.0.10 dst=10.128.0.1 sport=41074 dport=80 src=172.17.0.1 dst=172.17.0.10 sport=42525 dport=41074 [ASSURED]
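To make the conntrack output above easier to read, here is a minimal Python sketch that splits one event line into its two address tuples. It is an illustration only, not Scope's implementation; the names `Flow`, `ConntrackEvent` and `parse_event` are made up for this example, and it assumes the line layout shown on the slide (event type in brackets, protocol name, then four `key=value` fields for the original direction followed by four for the reply direction).

```python
import re
from typing import NamedTuple


class Flow(NamedTuple):
    src: str
    dst: str
    sport: int
    dport: int


class ConntrackEvent(NamedTuple):
    kind: str       # event type, e.g. "DESTROY"
    proto: str      # protocol name, e.g. "tcp"
    original: Flow  # tuple as sent by the connection initiator
    reply: Flow     # tuple expected in the reply direction


# Hypothetical helper patterns for the slide's line layout.
_HEAD_RE = re.compile(r"\[(\w+)\]\s+(\w+)")
_PAIR_RE = re.compile(r"(src|dst|sport|dport)=(\S+)")


def parse_event(line: str) -> ConntrackEvent:
    """Parse one `conntrack -E` line of the shape shown on the slide."""
    head = _HEAD_RE.search(line)
    pairs = _PAIR_RE.findall(line)
    if head is None or len(pairs) != 8:
        raise ValueError(f"unexpected conntrack line: {line!r}")

    def flow(chunk):
        fields = dict(chunk)
        return Flow(fields["src"], fields["dst"],
                    int(fields["sport"]), int(fields["dport"]))

    # The first four fields are the original direction, the last four the reply.
    return ConntrackEvent(head.group(1), head.group(2),
                          flow(pairs[:4]), flow(pairs[4:]))
```

Note that nothing in the parsed event identifies the process that owned the connection.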

/home/weave # cat /proc/net/tcp
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 16810 1 ffff8800d79c1800 100 0 0 10 0
1: 0100007F:EB74 0100007F:0FC8 06 00000000:00000000 03:0000016D 00000000 0 0 0 3 ffff8800ae3f6e80
2: 0100007F:EB69 0100007F:0FC8 01 00000000:00000000 00:00000000 00000000 0 0 307011 1 ffff8800cf467040 21 4 30 10 -1
3: 0100007F:EB7B 0100007F:0FC8 06 00000000:00000000 03:00000D27 00000000 0 0 0 3 ffff8800d7a47538
4: 0100007F:EB7C 0100007F:0FC8 06 00000000:00000000 03:0000110E 00000000 0 0 0 3 ffff8800cf656c70
5: 0100007F:EB67 0100007F:0FC8 01 00000000:00000000 00:00000000 00000000 0 0 306868 1 ffff8800d79c1040 21 4 27 10 -1
6: 0100007F:EB76 0100007F:0FC8 06 00000000:00000000 03:00000556 00000000 0 0 0 3 ffff8800d37ac748
7: 0100007F:EB7F 0100007F:0FC8 06 00000000:00000000 03:000014F7 00000000 0 0 0 3 ffff8800d87f0c70
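The hex fields in /proc/net/tcp are hard to read by eye, so here is a small Python sketch that decodes one row. The function names and the returned dictionary are invented for this example. It assumes IPv4 and a little-endian host (as on x86), where the address is printed as a 32-bit hex value in host byte order, and it maps only the three state codes that appear in the output above.

```python
import socket
import struct
from typing import Tuple

# Only the state codes visible in the slide's output; others fall back to hex.
TCP_STATES = {0x01: "ESTABLISHED", 0x06: "TIME_WAIT", 0x0A: "LISTEN"}


def decode_endpoint(field: str) -> Tuple[str, int]:
    """Turn '0100007F:0FC8' into ('127.0.0.1', 4040)."""
    hex_ip, hex_port = field.split(":")
    # Assumption: little-endian host, so repack the 32-bit value little-endian
    # to get the address bytes in network order.
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def parse_row(row: str) -> dict:
    """Decode the address, state and inode columns of one /proc/net/tcp row."""
    cols = row.split()
    state = int(cols[3], 16)
    return {
        "local": decode_endpoint(cols[1]),
        "remote": decode_endpoint(cols[2]),
        "state": TCP_STATES.get(state, hex(state)),
        "inode": int(cols[9]),  # socket inode; 0 for TIME_WAIT rows above
    }
```

A file like this is a snapshot: a reader only sees the connections that exist at the moment it is read.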

Page 12: Monitoring Containers with Weave Scope

Stop sampling, start listening

eBPF

• user-defined sandboxed kernel programs

• live-instrumentation on vanilla kernel

• listen to connection events the same way conntrack does, but with the PID (!)
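The difference between sampling and listening can be shown with a toy simulation. This is not eBPF code; it is a plain Python sketch with made-up connection lifetimes, and the names `Conn`, `seen_by_sampling` and `seen_by_events` are hypothetical. It only illustrates why periodic polling can miss connections that open and close between two polls, while an event stream reports every one.

```python
from typing import Iterable, NamedTuple, Set


class Conn(NamedTuple):
    name: str
    opened: float  # seconds since start
    closed: float


def seen_by_sampling(conns: Iterable[Conn], interval: float,
                     horizon: float) -> Set[str]:
    """Poll at t = 0, interval, 2*interval, ... up to horizon."""
    conns = list(conns)
    seen: Set[str] = set()
    for i in range(int(horizon / interval) + 1):
        t = i * interval
        # A poll only sees connections that are open at that instant.
        seen.update(c.name for c in conns if c.opened <= t < c.closed)
    return seen


def seen_by_events(conns: Iterable[Conn]) -> Set[str]:
    """An event listener is told about every open, however short-lived."""
    return {c.name for c in conns}


if __name__ == "__main__":
    conns = [Conn("long", 0.5, 7.0), Conn("short", 1.2, 1.4)]
    print(sorted(seen_by_sampling(conns, interval=1.0, horizon=10.0)))
    print(sorted(seen_by_events(conns)))
```

With a one-second polling interval, the connection that lives from 1.2 s to 1.4 s never appears in any sample.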

Page 13: Monitoring Containers with Weave Scope
Page 14: Monitoring Containers with Weave Scope

Have you used D3.js?

Page 15: Monitoring Containers with Weave Scope

Heuristic I: Alignment

Page 16: Monitoring Containers with Weave Scope

Heuristic II: Edge crossings
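The slide only names this heuristic, so the following is a generic illustration rather than Scope's layout code. It is a Python sketch that counts edge crossings between two rows of nodes, assuming each edge runs from a node in the top row to a node in the bottom row; the function name and arguments are hypothetical. A layout with a lower count for the same edges is usually easier to read.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def count_crossings(edges: List[Tuple[str, str]],
                    top_order: Sequence[str],
                    bottom_order: Sequence[str]) -> int:
    """Count pairs of edges that cross between two horizontal rows."""
    top = {node: i for i, node in enumerate(top_order)}
    bottom = {node: i for i, node in enumerate(bottom_order)}
    crossings = 0
    for (a, b), (c, d) in combinations(edges, 2):
        # Two edges cross when their endpoints appear in opposite
        # left-to-right order in the two rows.
        if (top[a] - top[c]) * (bottom[b] - bottom[d]) < 0:
            crossings += 1
    return crossings
```

For example, with edges a→y and b→x, the bottom ordering x, y gives one crossing, while the ordering y, x gives none.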

Page 17: Monitoring Containers with Weave Scope

Heuristic III: Commensurate layout changes

Page 18: Monitoring Containers with Weave Scope

Back to Monitoring Containers

Page 19: Monitoring Containers with Weave Scope

USE vs RED

USE: For every resource, check:

• Utilisation

• Saturation

• Errors

RED: For every service, check:

• Request Rate

• Error rate

• Duration (latency distribution)

*http://www.brendangregg.com/usemethod.html
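As a rough sketch of the RED checks above, the Python below computes request rate, error rate and two duration quantiles from a list of request records for one service. The `Request` record, the `red_summary` function and the choice of p50/p99 with nearest-rank quantiles are assumptions made for illustration; a real setup would normally derive these from a metrics system rather than raw records.

```python
import math
from typing import List, NamedTuple, Optional


class Request(NamedTuple):
    duration_s: float
    ok: bool


def red_summary(requests: List[Request], window_s: float) -> dict:
    """Summarise one service's requests observed over window_s seconds."""
    durations = sorted(r.duration_s for r in requests)
    errors = sum(1 for r in requests if not r.ok)

    def quantile(q: float) -> Optional[float]:
        # Nearest-rank quantile; None when there were no requests.
        if not durations:
            return None
        rank = max(1, math.ceil(q * len(durations)))
        return durations[rank - 1]

    return {
        "request_rate": len(requests) / window_s,  # requests per second
        "error_rate": errors / window_s,           # failed requests per second
        "p50": quantile(0.50),                     # duration distribution
        "p99": quantile(0.99),
    }
```

Reporting quantiles rather than a single average keeps slow outliers visible, which is the point of looking at the latency distribution.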

Page 20: Monitoring Containers with Weave Scope

New data sources

• plugins can add metadata and metrics

• eBPF (ongoing)

• custom instrumentation

Page 21: Monitoring Containers with Weave Scope
Page 22: Monitoring Containers with Weave Scope

Prometheus & K8s

• Kubernetes is already instrumented for Prometheus

• Application-level metrics from instrumentation
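To give a feel for application-level instrumentation, here is a dependency-free Python sketch of a labelled counter rendered in the Prometheus text exposition format. The `Counter` class here is a hypothetical stand-in written for this example, not the official client library's API; a real application would more likely use an official Prometheus client library, and this sketch skips details such as escaping label values.

```python
from collections import defaultdict


class Counter:
    """Toy labelled counter (hypothetical; not the official client API)."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self._values = defaultdict(float)

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        # One time series per distinct label set.
        self._values[tuple(sorted(labels.items()))] += amount

    def render(self) -> str:
        """Render all series in the text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            series = f"{self.name}{{{label_str}}}" if label_str else self.name
            lines.append(f"{series} {value:g}")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    requests_total = Counter("http_requests_total", "Total HTTP requests.")
    requests_total.inc(method="GET", status="200")
    requests_total.inc(method="GET", status="500")
    print(requests_total.render(), end="")
```

A counter labelled by status code like this one is enough to derive the request rate and error rate of the RED method.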

Page 23: Monitoring Containers with Weave Scope

Hosted Prometheus, multi-tenant (OSS)

Got a version running in Weave Cloud

Run local Prometheus with a remote destination

https://github.com/weaveworks/prism

Page 24: Monitoring Containers with Weave Scope

We’re hiring!

London, Berlin, San Francisco

Page 25: Monitoring Containers with Weave Scope

Questions?

David Kaltschmidt

@davkals https://weave.works