NSClient++: Monitoring Simplified at OSMC 2013

A presentation about the upcoming 0.4.2 version of NSClient++ and how it makes monitoring drastically simpler!

Transcript of NSClient++: Monitoring Simplified at OSMC 2013

Page 1: NSClient++: Monitoring Simplified at OSMC 2013

Simplified

Page 2: NSClient++: Monitoring Simplified at OSMC 2013

How many use NSClient++?

Page 3: NSClient++: Monitoring Simplified at OSMC 2013

How many like NSClient++?

..pdh collection thread not running…

ERROR: Missing argument exception

PdhCollectQueryData? failed: : -2147481643: No data to return.

Failed to query performance counters:

Page 4: NSClient++: Monitoring Simplified at OSMC 2013

How many think it’s simple?

CheckEventLog file=application file=system MaxWarn=1 MaxCrit=1
  "filter=generated gt -2d AND severity NOT IN ('success', 'informational') AND source != 'SideBySide'"
  truncate=800 unique descriptions
  "syntax=%severity%: %source%: %message% (%count%)"

Page 5: NSClient++: Monitoring Simplified at OSMC 2013

What’s 3+8?

Page 6: NSClient++: Monitoring Simplified at OSMC 2013

How many saw me last year?

Boring…Get started already!

Page 7: NSClient++: Monitoring Simplified at OSMC 2013

dev not ops

worked in ops a long time ago

work with “soa”, not C/C++, nagios, …

Page 8: NSClient++: Monitoring Simplified at OSMC 2013
Page 9: NSClient++: Monitoring Simplified at OSMC 2013
Page 10: NSClient++: Monitoring Simplified at OSMC 2013

0.4.1: 2012-10-xx

0.4.2: 2013-10-xx?

0.4.3: 2014-02-xx?

Page 11: NSClient++: Monitoring Simplified at OSMC 2013
Page 12: NSClient++: Monitoring Simplified at OSMC 2013

one-man-band

no company

no commercial version

no paid time

Page 13: NSClient++: Monitoring Simplified at OSMC 2013

Please don’t be angry!

Sometimes I am busy

Page 14: NSClient++: Monitoring Simplified at OSMC 2013

sponsoring!

donations!

support!

but…

Page 15: NSClient++: Monitoring Simplified at OSMC 2013
Page 16: NSClient++: Monitoring Simplified at OSMC 2013
Page 17: NSClient++: Monitoring Simplified at OSMC 2013

Sockets:

New protocols:

Real-time checks:

Simplified:

Modernized:

Page 18: NSClient++: Monitoring Simplified at OSMC 2013

Secure monitoring

Page 19: NSClient++: Monitoring Simplified at OSMC 2013

Build 90 (2013-02-xx)
◦ nsclient-full.ini
◦ Reload from script
◦ (re)added check_filesize (i.e. check_nt -v FILESIZE)
◦ Encoding support for NRPE
◦ New option: scan-range for CheckEventLog
◦ Various minor bug fixes

Build 96 (2013-04-xx)
◦ Reverted external script quoting issues
◦ (re)added check_fileage (i.e. check_nt -v FILEAGE)
◦ Added support for binding to both ipv6 and ipv4
◦ Various minor bug fixes

Build 102 (2013-08-xx)
◦ PDH improvements
◦ Performance data: pass through
◦ Encoding support throughout
◦ Various minor bug fixes and enhancements

Page 20: NSClient++: Monitoring Simplified at OSMC 2013
Page 21: NSClient++: Monitoring Simplified at OSMC 2013
Page 22: NSClient++: Monitoring Simplified at OSMC 2013
Page 23: NSClient++: Monitoring Simplified at OSMC 2013
Page 24: NSClient++: Monitoring Simplified at OSMC 2013
Page 25: NSClient++: Monitoring Simplified at OSMC 2013

Level Source … …

Error Word … …

Error Excel … …

Info Word … …

Warning Excel … …

Error App1 … …

Warning App1 … …

Error App3 … …

Page 26: NSClient++: Monitoring Simplified at OSMC 2013

Level Source … …

Error Word … …

Error Excel … …

Info Word … …

Warning Excel … …

Error App1 … …

Warning App1 … …

Error App3 … …

filter="level = 'error'"

Page 27: NSClient++: Monitoring Simplified at OSMC 2013

Level Source … …

Error Word … …

Error Excel … …

Info Word … …

Warning Excel … …

Error App1 … …

Warning App1 … …

Error App3 … …

filter="source = 'App1'"

Page 28: NSClient++: Monitoring Simplified at OSMC 2013

Level Source … …

Error Word … …

Error Excel … …

Info Word … …

Warning Excel … …

Error App1 … …

Warning App1 … …

Error App3 … …

filter="source = 'App1'"

Page 29: NSClient++: Monitoring Simplified at OSMC 2013

Level Source … …

Error Word … …

Error Excel … …

Info Word … …

Warning Excel … …

Error App1 … …

Warning App1 … …

Error App3 … …

filter="source = 'App1' or source = 'App3'"

Page 30: NSClient++: Monitoring Simplified at OSMC 2013

Level Source … …

Error Word … …

Error Excel … …

Info Word … …

Warning Excel … …

Error App1 … …

Warning App1 … …

Error App3 … …

filter="source = 'App1' or source = 'App3' or level = 'error'"

Page 31: NSClient++: Monitoring Simplified at OSMC 2013

Level Source … …

Error Word … …

Error Excel … …

Info Word … …

Warning Excel … …

Error App1 … …

Warning App1 … …

Error App3 … …

filter="source = 'App1' or source = 'App3' or level = 'error' or level = 'warning'"

Page 32: NSClient++: Monitoring Simplified at OSMC 2013

Level Source … …

Error Word … …

Error Excel … …

Info Word … …

Warning Excel … …

Error App1 … …

Warning App1 … …

Error App3 … …

filter="(source = 'App1' or source = 'App3' or level = 'error' or level = 'warning') and source != 'Excel'"

Page 33: NSClient++: Monitoring Simplified at OSMC 2013

Level Source … …

Error Word … …

Error Excel … …

Info Word … …

Warning Excel … …

Error App1 … …

Warning App1 … …

Error App3 … …

filter="(source = 'App1' or source = 'App3' or level = 'error' or level = 'warning') and source != 'Excel'"

filter="(source in ('App1', 'App3') or level in ('error', 'warning')) and source != 'Excel'"
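The two filters above are meant to select the same rows. A minimal Python sketch of that equivalence, run against the example table from the slides, might look like this. It is not how NSClient++ evaluates filters; the function names are made up for illustration, and comparing level names case-insensitively is an assumption of the sketch.

```python
# Rows from the example table on the slides: (level, source).
ROWS = [
    ("Error", "Word"),
    ("Error", "Excel"),
    ("Info", "Word"),
    ("Warning", "Excel"),
    ("Error", "App1"),
    ("Warning", "App1"),
    ("Error", "App3"),
]


def long_form(level, source):
    """Chained 'or' comparisons followed by the Excel exclusion."""
    level = level.lower()  # assumption: level names compare case-insensitively
    return (source == "App1" or source == "App3"
            or level == "error" or level == "warning") and source != "Excel"


def short_form(level, source):
    """The same condition written with 'in' lists."""
    level = level.lower()  # same assumption as above
    return (source in ("App1", "App3")
            or level in ("error", "warning")) and source != "Excel"


matches_long = [row for row in ROWS if long_form(*row)]
matches_short = [row for row in ROWS if short_form(*row)]
print(matches_short)
```

Both lists contain the same four rows: the Word error, the two App1 rows and the App3 error. The Excel rows are dropped by the final `source != 'Excel'` clause.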

Page 34: NSClient++: Monitoring Simplified at OSMC 2013

Core Load … …

core1 5 … …

core2 0 … …

core3 0 … …

core4 5 … …

core5 0 … …

core6 0 … …

Total 2 … …

filter="load > 10"

Page 35: NSClient++: Monitoring Simplified at OSMC 2013

Name Size … …

Foo.txt 5k … …

Bar.txt 12k … …

Log.txt 0 … …

Test.txt 123 … …

Foobar.txt 1k … …

Testing.txt 2k … …

Barfoo.txt 24k … …

filter="size > 10k"
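To make the size filter concrete, here is a small Python sketch that applies `size > 10k` to the file list above. The `parse_size` helper is hypothetical, and treating `k` as 1024 bytes is an assumption; the slides do not say which multiple NSClient++ uses.

```python
# Assumption: unit suffixes are binary multiples (k = 1024 bytes).
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Turn a size such as '12k' or '123' into a number of bytes."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


# File names and sizes from the slide.
FILES = {
    "Foo.txt": "5k",
    "Bar.txt": "12k",
    "Log.txt": "0",
    "Test.txt": "123",
    "Foobar.txt": "1k",
    "Testing.txt": "2k",
    "Barfoo.txt": "24k",
}

threshold = parse_size("10k")
big_files = [name for name, size in FILES.items() if parse_size(size) > threshold]
print(big_files)
```

Only Bar.txt (12k) and Barfoo.txt (24k) exceed the threshold.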

Page 36: NSClient++: Monitoring Simplified at OSMC 2013

Name Size … …

physical 8g … …

committed 12g … …

… … … …

… … … …

… … … …

… … … …

… … … …

filter="used > 80%"

Page 37: NSClient++: Monitoring Simplified at OSMC 2013

filter = (id NOT IN ('3', '4', '6', '11', '16', '23', '24', '27', '29', '36', '46', '47', '50', '56', '134', '142', '219', '267', '270', '1006', '1009', '1014', '1030', '1035', '1036', '1055', '1058', '1071', '1073', '1085', '1102', '1110', '1111', '1112', '1131', '1291', '1500', '3095', '5719', '5722', '5783', '5788', '5789', '6008', '7000', '7001', '7003', '7005', '7009', '7011', '7022', '7023', '7024', '7026',

'7030', '7031', '7034', '7038', '7041', '9015', '9018', '9026', '9028', '10009', '10010', '10016', '10149', '12294', '15300', '15301', '24679', '36887', '36888', '40960', '40961', '45056') AND level IN ('error', 'warning'))

OR (id IN ('3') AND source NOT IN ('FilterManager') AND level IN ('error', 'warning'))

OR (id IN ('4') AND source NOT IN ('q57','L2ND') AND level IN ('error', 'warning')) OR (id IN ('6') AND source NOT IN ('Security-Kerberos') AND level IN ('error', 'warning')) OR (id IN ('11') AND source NOT IN ('Kerberos-Key-Distribution-Center') AND level IN ('error', 'warning')) OR (id IN ('16') AND source NOT IN ('WindowsUpdateClient') AND level IN ('error',

'warning')) OR (id IN ('23') AND source NOT IN ('Eventlog') AND level IN ('error', 'warning')) OR (id IN ('24') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('27') AND source NOT IN ('Eventlog') AND level IN ('error', 'warning')) OR (id IN ('29') AND source NOT IN ('Kerberos-Key-Distribution-Center') AND level IN ('error', 'warning')) OR (id IN ('36') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('46') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('47') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('50') AND source NOT IN ('TermDD','Time-Service') AND level IN ('error', 'warning')) OR (id IN ('56') AND source NOT IN ('TermDD') AND level IN ('error', 'warning')) OR (id IN ('134') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('142') AND

source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('219') AND source NOT IN ('Kernel-pnp') AND level IN ('error', 'warning')) OR (id IN ('267') AND source NOT IN ('Storage-agents') AND level IN ('error', 'warning')) OR (id IN ('270') AND source NOT IN ('Storage-agents') AND level IN ('error', 'warning')) OR (id IN ('1006') AND source

NOT IN ('DNS Client Events','GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1009') AND source NOT IN ('picadm') AND level IN ('error', 'warning')) OR (id IN ('1014') AND source NOT IN ('DNS Client Events') AND level IN ('error', 'warning')) OR (id IN ('1030') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1035') AND

source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1036') AND source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1055') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1058') AND source NOT IN ('GroupPolicy') AND

level IN ('error', 'warning')) OR (id IN ('1071') AND source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1073') AND source NOT IN ('USER32') AND level IN ('error', 'warning')) OR (id IN ('1085') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1102') AND source NOT IN

('SNMP') AND level IN ('error', 'warning')) OR (id IN ('1110') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1111') AND source NOT IN ('Server Agents') AND level IN ('error', 'warning')) OR (id IN ('1112') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1131') AND source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1291') AND source NOT IN ('NIC-agents') AND level IN ('error', 'warning')) OR (id IN ('1500') AND source NOT IN ('SNMP') AND level IN ('error', 'warning')) OR (id IN ('3095') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5719') AND source NOT IN

('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5722') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5783') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5788') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5789') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('6008') AND source NOT IN ('Eventlog') AND level IN ('error', 'warning')) OR (id IN ('7000') AND source NOT IN ('service control manager') AND

level IN ('error', 'warning')) OR (id IN ('7001') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7003') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7005') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7009') AND source NOT IN

('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7011') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7022') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7023') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN

('7024') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7026') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7030') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7031') AND source NOT IN ('service control manager') AND

strings not like 'citrix' AND level IN ('error', 'warning')) OR (id IN ('7034') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7038') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7041') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN

('9015') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('9018') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('9026') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('9028') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('10009') AND source NOT IN ('DistributedCOM') AND level IN ('error', 'warning')) OR (id IN ('10010') AND source NOT IN ('DistributedCOM') AND level IN ('error', 'warning')) OR (id IN ('10016')

AND source NOT IN ('DistributedCOM') AND level IN ('error', 'warning')) OR (id IN ('10149') AND source NOT IN ('WindowsRemoteManagement') AND level IN ('error', 'warning')) OR (id IN ('12294') AND source NOT IN ('Directory-Services-SAM') AND level IN ('error', 'warning')) OR (id IN ('15300') AND source NOT IN ('HTTPEVENT') AND level IN ('error', 'warning')) OR (id IN ('15301') AND source NOT IN ('HTTPEVENT') AND level IN ('error', 'warning')) OR (id IN ('24679') AND source NOT IN ('Cissesrv') AND level IN ('error',

'warning')) OR (id IN ('36887') AND source NOT IN ('Schannel') AND level IN ('error', 'warning')) OR (id IN ('36888') AND source NOT IN ('Schannel') AND level IN ('error', 'warning')) OR (id IN ('40960') AND source NOT IN ('LSASRV') AND level IN ('error', 'warning')) OR (id IN ('40961') AND source NOT IN ('LSASRV') AND level IN ('error',

'warning')) OR (id IN ('45056') AND source NOT IN ('LSASRV') AND level IN ('error', 'warning'))

Page 38: NSClient++: Monitoring Simplified at OSMC 2013

Key                     Safe Key  Description
=                       eq        Equals
!=                      ne        Not equals
>                       gt        Greater than
<                       lt        Less than
>=                      ge        Greater than or equal to
<=                      le        Less than or equal to
in (<LIST OF VALUES>)             In a given list
not in (…)                        Not in a given list

Page 39: NSClient++: Monitoring Simplified at OSMC 2013

Key                     Safe Key  Description
=                       eq        Equals
!=                      ne        Not equals
>                       gt        Greater than
<                       lt        Less than
>=                      ge        Greater than or equal to
<=                      le        Less than or equal to
in (<LIST OF VALUES>)             In a given list
not in (…)                        Not in a given list
like                              Substring matching
regexp                            Regular expression
not like                          Opposite of like
not regexp                        Opposite of regexp
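One way to read the operator table is as a lookup from each key (or its safe key) to a comparison function. The Python sketch below does exactly that; it is an illustration, not NSClient++ code. The `compare` helper is hypothetical, and two details are assumptions: `like` is treated as a case-sensitive substring test, and `regexp` matches anywhere in the value.

```python
import operator
import re

# Each key and its safe key map to the same comparison function.
OPERATORS = {
    "=": operator.eq, "eq": operator.eq,
    "!=": operator.ne, "ne": operator.ne,
    ">": operator.gt, "gt": operator.gt,
    "<": operator.lt, "lt": operator.lt,
    ">=": operator.ge, "ge": operator.ge,
    "<=": operator.le, "le": operator.le,
    "in": lambda value, items: value in items,
    "not in": lambda value, items: value not in items,
    # Assumption: 'like' is a case-sensitive substring test.
    "like": lambda value, text: text in value,
    "not like": lambda value, text: text not in value,
    # Assumption: 'regexp' may match anywhere in the value.
    "regexp": lambda value, pattern: re.search(pattern, value) is not None,
    "not regexp": lambda value, pattern: re.search(pattern, value) is None,
}


def compare(left, op, right):
    """Apply the operator named by op to the two operands."""
    return OPERATORS[op](left, right)


print(compare(12, "gt", 10))
print(compare("service control manager", "like", "control"))
```

Because `>` and `gt` resolve to the same function, a filter written with safe keys behaves the same while avoiding characters such as `<` and `>` in the command line.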

Page 40: NSClient++: Monitoring Simplified at OSMC 2013

filter

warning

critical

Page 41: NSClient++: Monitoring Simplified at OSMC 2013

Level Source … …

Error Word … …

Error Excel … …

Info Word … …

Warning Excel … …

Error App1 … …

Warning App1 … …

Error App3 … …

filter="source = 'App1'"

warn="level = 'Warning'"
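The split between filter and warn can be sketched in Python as two steps: the filter first selects the rows of interest, then the warn expression is applied to the selected rows only. The rule used below, WARNING as soon as one selected row matches, is an assumption for illustration; the slides do not spell out the exact rule, and the function names are made up.

```python
# Rows from the example table on the slide: (level, source).
ROWS = [
    ("Error", "Word"),
    ("Error", "Excel"),
    ("Info", "Word"),
    ("Warning", "Excel"),
    ("Error", "App1"),
    ("Warning", "App1"),
    ("Error", "App3"),
]


def is_selected(level, source):
    """filter: source = 'App1'"""
    return source == "App1"


def is_warning(level, source):
    """warn: level = 'Warning'"""
    return level == "Warning"


# Step 1: the filter decides which rows are considered at all.
selected = [row for row in ROWS if is_selected(*row)]

# Step 2 (assumed rule): any selected row matching warn yields WARNING.
status = "WARNING" if any(is_warning(*row) for row in selected) else "OK"
print(selected, status)
```

Here the filter keeps the two App1 rows, and the App1 warning row triggers the WARNING status. The Excel warning is never considered because the filter already discarded it.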

Page 42: NSClient++: Monitoring Simplified at OSMC 2013

Custom strings

Supports substitutions ${…}

top- and detail-syntax

Page 43: NSClient++: Monitoring Simplified at OSMC 2013

detail-syntax="s: ${source}"

top-syntax="Hello: ${list}"

Hello: s: App1, s: App1, s: App3
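The substitution can be imitated with Python's `string.Template`, which happens to use the same `${…}` placeholder style. This is only a sketch of the idea: the detail template is rendered once per matching row, and the joined result is passed to the top template as `${list}`. Joining the entries with a comma and a space is inferred from the output line shown on the slide.

```python
from string import Template

detail_syntax = Template("s: ${source}")
top_syntax = Template("Hello: ${list}")

# Sources of the three matching rows in the slide's example output.
sources = ["App1", "App1", "App3"]

# Render one detail entry per row, then hand the joined entries to the top template.
details = [detail_syntax.substitute(source=source) for source in sources]
message = top_syntax.substitute(list=", ".join(details))
print(message)  # Hello: s: App1, s: App1, s: App3
```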

Page 44: NSClient++: Monitoring Simplified at OSMC 2013
Page 45: NSClient++: Monitoring Simplified at OSMC 2013
Page 46: NSClient++: Monitoring Simplified at OSMC 2013
Page 47: NSClient++: Monitoring Simplified at OSMC 2013

defaults!

Page 48: NSClient++: Monitoring Simplified at OSMC 2013

check_cpu

Just works!

Page 49: NSClient++: Monitoring Simplified at OSMC 2013
Page 50: NSClient++: Monitoring Simplified at OSMC 2013

Monitored Server (Windows)

Monitoring Server (Nagios)

check_cpu

check_uptime

check_mem

check_eventlog

check_updates

...

Page 51: NSClient++: Monitoring Simplified at OSMC 2013

Monitored Server (Windows)

Monitoring Server (Nagios)

check_cpu

check_uptime

check_mem

check_eventlog

check_updates

...

Page 52: NSClient++: Monitoring Simplified at OSMC 2013

Monitored Server (Windows)

Monitoring Server (Nagios)

Error detected in eventlog

Everything is ok

Page 53: NSClient++: Monitoring Simplified at OSMC 2013

Zero overhead log-file checks

Stateful monitoring

Adaptive thresholds?

Correlation CEP

Composite checks

Page 54: NSClient++: Monitoring Simplified at OSMC 2013
Page 55: NSClient++: Monitoring Simplified at OSMC 2013

Two options:

Page 56: NSClient++: Monitoring Simplified at OSMC 2013
Page 57: NSClient++: Monitoring Simplified at OSMC 2013
Page 58: NSClient++: Monitoring Simplified at OSMC 2013

check_service computer=192.168.0.1

check_disk drive=\\192.168.0.1\c$

check_task_sched computer=192.168.0.1

check_wmi computer=192.168.0.1

Page 59: NSClient++: Monitoring Simplified at OSMC 2013

Lightweight, remotely deployable agent

Similar to psexec

check_cpu

check_memory

check_process

External scripts!

Page 60: NSClient++: Monitoring Simplified at OSMC 2013
Page 61: NSClient++: Monitoring Simplified at OSMC 2013

How many think it’s simple?

CheckEventLog file=application file=system MaxWarn=1 MaxCrit=1
  "filter=generated gt -2d AND severity NOT IN ('success', 'informational') AND source != 'SideBySide'"
  truncate=800 unique descriptions
  "syntax=%severity%: %source%: %message% (%count%)"

Page 62: NSClient++: Monitoring Simplified at OSMC 2013

How many think it’s simple?

check_eventlog

Page 63: NSClient++: Monitoring Simplified at OSMC 2013
Page 64: NSClient++: Monitoring Simplified at OSMC 2013

Photo by Olga Berrios

Page 65: NSClient++: Monitoring Simplified at OSMC 2013

Information about NSClient++: http://nsclient.org

facebook.com/nsclient

Slides and examples: http://nsclient.org/nscp/conferances/nwc/2013/

My blog: http://blog.medin.name

Most images taken by me whilst visiting the INTREPID