Icinga @ VIX (+ ACOnet)

RIPE63 EIX WG - Vienna - 3rd November 2011
http://www.vix.at
Presentation by [email protected]



Icinga facts

http://www.icinga.org

- Fork of Nagios (version 3.1.something)
- core developer @ VIX
- easy transition
- much better performance (~10^3 to 10^4)
- compound commands

- checks
  - ~100 hosts
  - ~6000 services


Icinga check types

- active
  - Icinga process starts check plugin
  - Icinga process parses result
  - Icinga process triggers actions

- passive
  - check runs autonomously
  - check submits results, if they are relevant
  - Icinga "listens" for results and processes them
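
To make the passive case concrete: one common way for an autonomous check to hand a result to Icinga is a PROCESS_SERVICE_CHECK_RESULT line written to the external command file. The sketch below only illustrates that line format; the helper names, the host and service names in the usage note, and the command file path are assumptions and do not come from the VIX setup.

```python
import time


def format_passive_result(host, service, return_code, plugin_output, timestamp=None):
    """Build one external-command line for a passive service check result.

    return_code follows the plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    if timestamp is None:
        timestamp = int(time.time())
    # The command is line based, so keep the plugin output on a single line.
    output = plugin_output.replace("\n", " ")
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s" % (
        timestamp, host, service, return_code, output)


def submit(command_file, line):
    """Write one command line to the external command file (a named pipe).

    The path is site specific and must be supplied by the caller.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line + "\n")
```

For example, `format_passive_result("rs1", "BGP R1853x1", 0, "OK", timestamp=1320300000)` yields `[1320300000] PROCESS_SERVICE_CHECK_RESULT;rs1;BGP R1853x1;0;OK`, where `rs1` and the service name are made-up values.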


Icinga and check_mk

- only one active check per host
  - checks whether check_mk runs
  - small CPU load

- many passive checks per check_mk
  - report of check_mk results per passive check
  - fetch data via a single SNMP walk
  - also performance data (for graphs)
  - massive reduction of SNMP queries


Icinga, pnp4nagios and rrdcached

- pnp4nagios
  - interprets performance data
  - feeds data into rrd
  - graphs are integrated in web-portal
  - custom templates for graphs

- rrdcached
  - speeds up rrd access
  - increases IO performance
  - less R/W access necessary
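
As an illustration of what "interprets performance data" involves: plugin performance data is a space-separated list of `label=value[unit];warn;crit;min;max` items. The following is a minimal parsing sketch under that assumption. It is not pnp4nagios code, and the function name `parse_perfdata` is made up for this example.

```python
import re
import shlex

# A numeric value, optionally followed by a unit such as "s", "%", "B" or "MB".
_VALUE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def parse_perfdata(text):
    """Parse a perfdata string into a dict keyed by metric label."""
    metrics = {}
    # shlex keeps quoted labels such as 'in traffic' together as one item.
    for item in shlex.split(text):
        label, sep, rest = item.rpartition("=")
        if not sep or not label:
            continue  # not a label=value item
        fields = rest.split(";")
        fields += [""] * (5 - len(fields))  # trailing fields may be omitted
        match = _VALUE.match(fields[0])
        if match is None:
            continue  # e.g. "U" for an unknown value
        metrics[label] = {
            "value": float(match.group(1)),
            "unit": match.group(2),
            # Thresholds may be ranges like "10:20", so keep them as strings.
            "warn": fields[1] or None,
            "crit": fields[2] or None,
            "min": float(fields[3]) if fields[3] else None,
            "max": float(fields[4]) if fields[4] else None,
        }
    return metrics
```

Given an item such as `prefixes=260;;;;`, this returns a single metric `prefixes` with value 260.0 and no thresholds set.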


What do we check?

- Interfaces
  - up/down
  - errors
  - dropped packets

- BGP
  - status of sessions on routers and BIRD

- System health
  - CPU load
  - memory
  - module status
  - fan status / power supplies
  - module uptime (not yet)


What do we graph?

- Interfaces
  - traffic pps/bps
  - errors
  - dropped packets
  - DOM (digital optical monitoring)

- BGP
  - accepted prefixes
  - non-accepted prefixes (counter value)
  - update rate

- System health
  - CPU load
  - memory usage and largest contiguous chunk
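
Several of the graphed values (pps/bps, update rate) are rates, while interface counters read over SNMP are cumulative. As a hedged sketch of the arithmetic involved, the function below derives a per-second rate from two counter samples and allows for a single counter wrap. The function names and the counter width parameter are assumptions for illustration; the slides do not say where in the VIX toolchain this conversion happens.

```python
def counter_rate(prev_value, prev_time, cur_value, cur_time, bits=64):
    """Per-second rate between two samples of an increasing counter.

    bits is the counter width (32 or 64 for typical SNMP counters).
    """
    elapsed = cur_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = cur_value - prev_value
    if delta < 0:
        # The counter wrapped; assumes at most one wrap between samples.
        delta += 1 << bits
    return delta / elapsed


def octets_to_bps(octets_per_second):
    """Convert an octet rate into bits per second."""
    return octets_per_second * 8
```

With samples of 1000 and 4000 octets taken 60 seconds apart, `counter_rate(1000, 0, 4000, 60)` gives 50.0 octets per second, which `octets_to_bps` turns into 400.0 bps.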


Checking BGP sessions of BIRD route-server

- snmpd.conf
    extend .1.3.6.1.4.1.2021.77 bgpstatus /usr/bin/perl birdwatch.pl

- custom birdwatch.pl
    send SOCK,"show protocols all \"R*\"\n",0;

- snmpwalk -v2c -c <com> <rs> -On .1.3.6.1.4.1.2021.77
    .1.3.6.1.4.1.2021.77.4.1.2.9.98.103.112.115.116.97.116.117.115.34 = STRING: "R1853x1 1853 ipv4 6 260"

- check_mk Plugin
    name, asnum, proto, state, prefixes = line[0].split(" ")

- check_mk Output state+perfdata
    Peer R1853x1 ipv4    OK   (prefixes=260;;;;)
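
The plugin step can be sketched as a small function that turns one line of birdwatch.pl output into a state, a text and performance data. This is not the actual VIX plugin. It assumes that the numeric state field follows the BGP4-MIB bgpPeerState numbering (6 = established), which matches the sample line above reporting OK but is not stated on the slide. The function name and the choice to report every other state as CRITICAL are also assumptions.

```python
# Assumed mapping, taken from the BGP4-MIB bgpPeerState numbering.
BGP_STATES = {
    1: "idle",
    2: "connect",
    3: "active",
    4: "opensent",
    5: "openconfirm",
    6: "established",
}


def check_bird_peer(line):
    """Evaluate one 'name asnum proto state prefixes' line.

    Returns (status, infotext, perfdata) with status 0 = OK, 2 = CRITICAL.
    """
    name, asnum, proto, state, prefixes = line.split(" ")
    state = int(state)
    # Anything other than an established session is treated as critical here.
    status = 0 if state == 6 else 2
    state_name = BGP_STATES.get(state, "unknown(%d)" % state)
    infotext = "Peer %s %s %s" % (name, proto, state_name)
    perfdata = [("prefixes", int(prefixes))]
    return status, infotext, perfdata
```

For the sample line `R1853x1 1853 ipv4 6 260` this returns status 0, the text `Peer R1853x1 ipv4 established` and a single `prefixes` metric of 260.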


Icinga config

- database is authoritative source
  - description
  - what to check/ignore
  - alarm trigger values

- crawler checks the checks
  - check_mk inventory
  - daily
  - alarms if mismatch
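
The daily comparison between the database and the check_mk inventory amounts to a set difference. The sketch below shows that idea only. The data model (pairs of host and service name), the function name and the sample values in the usage note are hypothetical, since the slides do not describe the crawler's implementation.

```python
def compare_inventory(expected, discovered, ignored=()):
    """Compare services the database expects with what the inventory found.

    All arguments are iterables of (host, service) pairs. Entries listed in
    ignored are left out of the comparison on both sides.
    """
    ignored = set(ignored)
    expected = set(expected) - ignored
    discovered = set(discovered) - ignored
    return {
        # In the database but not found by the inventory.
        "missing": sorted(expected - discovered),
        # Found by the inventory but not known to the database.
        "unexpected": sorted(discovered - expected),
    }


def has_mismatch(result):
    """True if the comparison should raise an alarm."""
    return bool(result["missing"] or result["unexpected"])
```

For instance, if the database expects `("sw1", "Interface 1")` and `("sw1", "Interface 2")` but the inventory only finds the first plus `("sw1", "Fan 1")`, the result lists `Interface 2` as missing and `Fan 1` as unexpected, and `has_mismatch` returns True.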


dedicated hardware

- 2 x HP DL380 G7
  - 12 cores
  - 12 GB RAM
  - 15k hard disks, RAID 1

- 100 MB RAM disk for status info

- disk IO is the limiting factor

- no virtualization
  - fewer dependencies
  - the virtualization environment is also monitored
  - wouldn't make sense to virtualize the monitoring system