Nagios Conference 2013 - Michael Medin - NSClient++ Whats New
Transcript of Nagios Conference 2013 - Michael Medin - NSClient++ Whats New
NSClient++: What's new?
http://nsclient.org
Monitoring Simplified
How many use NSClient++?
NS-what did he say??#@*&%!
wrong room!
How many like NSClient++?
..pdh collection thread not running…
ERROR: Missing argument exception
PdhCollectQueryData? failed: : -2147481643: No data to return.
Failed to query performance counters:
simple?
CheckEventLog file=application file=system MaxWarn=1
MaxCrit=1 "filter=generated gt -2d AND severity NOT IN
('success', 'informational') AND source != 'SideBySide'"
truncate=800 unique descriptions
"syntax=%severity%: %source%: %message% (%count%)"
Michael Medin
Dev, not ops
Worked in ops a long time ago
Work with SOA, not C/C++ or Nagios
NSClient++
Agent
Since 2003?
Windows and Linux
Modular by design
Highly extensible
Open source, not open core
<0.4.0
0.4.1: 2012-10-xx
0.4.2: 2013-10-xx?
0.4.3: 2014-02-xx?
is stable
One-man band: no company, no commercial version, no paid time
Please
Sometimes I am busy
Get your a** over here and play
NOW!
Sponsoring! Donations! Support!
Thank you!
0.4.1
Sockets: ipv6, ssl (true)
New protocols: NRDP, check_mk, Graphite, syslog, smtp
Real-time checks: eventlog, logfiles
Simplified: Command line syntax
Modernized: NRPE, NSCA, check_nt
0.4.1
Build 90 (2013-02-xx)
◦ nsclient-full.ini
◦ Reload from script
◦ (re)added check_filesize (i.e. check_nt -v FILESIZE)
◦ Encoding support for NRPE
◦ New option: scan-range for CheckEventLog
◦ Various minor bug fixes
Build 96 (2013-04-xx)
◦ Reverted external script quoting issues
◦ (re)added check_fileage (i.e. check_nt -v FILEAGE)
◦ Added support for binding to both ipv6 and ipv4
◦ Various minor bug fixes
Build 102 (2013-08-xx)
◦ PDH improvements
◦ Performance data: pass through
◦ Encoding support throughout
◦ Various minor bug fixes and enhancements
0.4.2: The goals
Modern Windows support
Simplified monitoring
Real-time monitoring
Linux checks
0.4.2: The status
Modern Windows support
Simplified monitoring
Real-time monitoring
Linux checks
NSCP protocol
check_xxx clients
0.4.2: Some examples
check_os_version
check_pagefile
check_process
check_service
nrpe_client
NO MORE PDH
Filters
Level    Source  …  …
Error    Word    …  …
Error    Excel   …  …
Info     Word    …  …
Warning  Excel   …  …
Error    App1    …  …
Warning  App1    …  …
Error    App3    …  …
(source or level)
filter = (id NOT IN ('3', '4', '6', '11', '16', '23', '24', '27', '29', '36', '46', '47', '50', '56', '134', '142', '219', '267', '270', '1006', '1009', '1014', '1030', '1035', '1036', '1055', '1058', '1071', '1073', '1085', '1102', '1110', '1111', '1112', '1131', '1291', '1500', '3095', '5719', '5722', '5783', '5788', '5789', '6008', '7000', '7001', '7003', '7005', '7009', '7011', '7022', '7023', '7024', '7026',
'7030', '7031', '7034', '7038', '7041', '9015', '9018', '9026', '9028', '10009', '10010', '10016', '10149', '12294', '15300', '15301', '24679', '36887', '36888', '40960', '40961', '45056') AND level IN ('error', 'warning'))
OR (id IN ('3') AND source NOT IN ('FilterManager') AND level IN ('error', 'warning'))
OR (id IN ('4') AND source NOT IN ('q57','L2ND') AND level IN ('error', 'warning')) OR (id IN ('6') AND source NOT IN ('Security-Kerberos') AND level IN ('error', 'warning')) OR (id IN ('11') AND source NOT IN ('Kerberos-Key-Distribution-Center') AND level IN ('error', 'warning')) OR (id IN ('16') AND source NOT IN ('WindowsUpdateClient') AND level IN ('error',
'warning')) OR (id IN ('23') AND source NOT IN ('Eventlog') AND level IN ('error', 'warning')) OR (id IN ('24') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('27') AND source NOT IN ('Eventlog') AND level IN ('error', 'warning')) OR (id IN ('29') AND source NOT IN ('Kerberos-Key-Distribution-Center') AND level IN ('error', 'warning')) OR (id IN ('36') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('46') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('47') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('50') AND source NOT IN ('TermDD','Time-Service') AND level IN ('error', 'warning')) OR (id IN ('56') AND source NOT IN ('TermDD') AND level IN ('error', 'warning')) OR (id IN ('134') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('142') AND
source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('219') AND source NOT IN ('Kernel-pnp') AND level IN ('error', 'warning')) OR (id IN ('267') AND source NOT IN ('Storage-agents') AND level IN ('error', 'warning')) OR (id IN ('270') AND source NOT IN ('Storage-agents') AND level IN ('error', 'warning')) OR (id IN ('1006') AND source
NOT IN ('DNS Client Events','GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1009') AND source NOT IN ('picadm') AND level IN ('error', 'warning')) OR (id IN ('1014') AND source NOT IN ('DNS Client Events') AND level IN ('error', 'warning')) OR (id IN ('1030') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1035') AND
source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1036') AND source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1055') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1058') AND source NOT IN ('GroupPolicy') AND
level IN ('error', 'warning')) OR (id IN ('1071') AND source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1073') AND source NOT IN ('USER32') AND level IN ('error', 'warning')) OR (id IN ('1085') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1102') AND source NOT IN
('SNMP') AND level IN ('error', 'warning')) OR (id IN ('1110') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1111') AND source NOT IN ('Server Agents') AND level IN ('error', 'warning')) OR (id IN ('1112') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1131') AND source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1291') AND source NOT IN ('NIC-agents') AND level IN ('error', 'warning')) OR (id IN ('1500') AND source NOT IN ('SNMP') AND level IN ('error', 'warning')) OR (id IN ('3095') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5719') AND source NOT IN
('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5722') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5783') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5788') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5789') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('6008') AND source NOT IN ('Eventlog') AND level IN ('error', 'warning')) OR (id IN ('7000') AND source NOT IN ('service control manager') AND
level IN ('error', 'warning')) OR (id IN ('7001') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7003') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7005') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7009') AND source NOT IN
('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7011') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7022') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7023') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN
('7024') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7026') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7030') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7031') AND source NOT IN ('service control manager') AND
strings not like 'citrix' AND level IN ('error', 'warning')) OR (id IN ('7034') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7038') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7041') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN
('9015') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('9018') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('9026') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('9028') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('10009') AND source NOT IN ('DistributedCOM') AND level IN ('error', 'warning')) OR (id IN ('10010') AND source NOT IN ('DistributedCOM') AND level IN ('error', 'warning')) OR (id IN ('10016')
AND source NOT IN ('DistributedCOM') AND level IN ('error', 'warning')) OR (id IN ('10149') AND source NOT IN ('WindowsRemoteManagement') AND level IN ('error', 'warning')) OR (id IN ('12294') AND source NOT IN ('Directory-Services-SAM') AND level IN ('error', 'warning')) OR (id IN ('15300') AND source NOT IN ('HTTPEVENT') AND level IN ('error', 'warning')) OR (id IN ('15301') AND source NOT IN ('HTTPEVENT') AND level IN ('error', 'warning')) OR (id IN ('24679') AND source NOT IN ('Cissesrv') AND level IN ('error',
'warning')) OR (id IN ('36887') AND source NOT IN ('Schannel') AND level IN ('error', 'warning')) OR (id IN ('36888') AND source NOT IN ('Schannel') AND level IN ('error', 'warning')) OR (id IN ('40960') AND source NOT IN ('LSASRV') AND level IN ('error', 'warning')) OR (id IN ('40961') AND source NOT IN ('LSASRV') AND level IN ('error',
'warning')) OR (id IN ('45056') AND source NOT IN ('LSASRV') AND level IN ('error', 'warning'))
Numbers, constants etc
Key                     Safe Key  Description
=                       eq        Equals
!=                      ne        Not equals
>                       gt        Greater than
<                       lt        Less than
>=                      ge        Greater than or equal to
<=                      le        Less than or equal to
in (<LIST OF VALUES>)             In a given list
not in (…)                        Not in a given list
Strings
Key                     Safe Key  Description
=                       eq        Equals
!=                      ne        Not equals
>                       gt        Greater than
<                       lt        Less than
>=                      ge        Greater than or equal to
<=                      le        Less than or equal to
in (<LIST OF VALUES>)             In a given list
not in (…)                        Not in a given list
like                              Substring matching
regexp                            Regular expression
not like                          Opposite of like
not regexp                        Opposite of regexp
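To make the operator tables concrete, here is a minimal Python sketch of how such keys could be evaluated. It is an illustration only, not NSClient++ code: the names SAFE_KEYS, OPERATORS and evaluate are hypothetical, and treating "like" as a case-sensitive substring test is an assumption.

```python
import operator
import re

# Safe keys from the tables above, mapped to their symbolic form.
SAFE_KEYS = {"eq": "=", "ne": "!=", "gt": ">", "lt": "<", "ge": ">=", "le": "<="}

# Hypothetical evaluation table (an assumption, not the real implementation).
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "in": lambda value, items: value in items,
    "not in": lambda value, items: value not in items,
    "like": lambda value, text: text in value,          # substring matching
    "not like": lambda value, text: text not in value,
    "regexp": lambda value, pattern: re.search(pattern, value) is not None,
    "not regexp": lambda value, pattern: re.search(pattern, value) is None,
}


def evaluate(left, key, right):
    """Apply one operator, accepting either the symbolic key or the safe key."""
    symbol = SAFE_KEYS.get(key, key)
    return OPERATORS[symbol](left, right)
```

With this sketch, evaluate(5, "gt", 3) and evaluate(5, ">", 3) give the same answer, which is the point of the safe keys: they avoid characters such as < and > that are awkward to pass on a command line.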
All good things are three!
Filter
Warning
Critical
Ok
Level    Source  …  …
Error    Word    …  …
Error    Excel   …  …
Info     Word    …  …
Warning  Excel   …  …
Error    App1    …  …
Warning  App1    …  …
Error    App3    …  …
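The interplay of filter, warning and critical can be sketched in Python roughly as follows. This is a simplified model under stated assumptions: the function check and its three callables are hypothetical names, rows are evaluated one by one, and "Ok" is simply what remains when neither threshold matches.

```python
# The example rows from the table above.
ROWS = [
    {"level": "Error", "source": "Word"},
    {"level": "Error", "source": "Excel"},
    {"level": "Info", "source": "Word"},
    {"level": "Warning", "source": "Excel"},
    {"level": "Error", "source": "App1"},
    {"level": "Warning", "source": "App1"},
    {"level": "Error", "source": "App3"},
]


def check(rows, filter_fn, warning_fn, critical_fn):
    """Return 'CRITICAL', 'WARNING' or 'OK' for the rows kept by filter_fn."""
    kept = [row for row in rows if filter_fn(row)]
    if any(critical_fn(row) for row in kept):
        return "CRITICAL"
    if any(warning_fn(row) for row in kept):
        return "WARNING"
    return "OK"


# Only look at App1 and App3; warnings warn, errors are critical.
status = check(
    ROWS,
    filter_fn=lambda row: row["source"] in ("App1", "App3"),
    warning_fn=lambda row: row["level"] == "Warning",
    critical_fn=lambda row: row["level"] == "Error",
)
print(status)
```

The filter decides which rows are considered at all, and the two threshold expressions then only decide how bad the surviving rows are.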
Display
Custom strings
Supports top- and detail-syntax
detail-syntax: ${source}
top-syntax: ${list}
Hello: s: App1, s: App1, s: App3
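How a detail-syntax and a top-syntax combine can be illustrated with a short Python sketch. The function render is hypothetical, and the literal templates "s: ${source}" and "Hello: ${list}" are assumptions inferred from the output line on the slide; only the ${source} and ${list} placeholders appear in the original.

```python
from string import Template


def render(rows, detail_syntax, top_syntax):
    """Format each row with detail_syntax, then place the joined result in top_syntax."""
    detail = Template(detail_syntax)
    rendered = [detail.substitute(row) for row in rows]
    # ${list} in the top syntax stands for all rendered rows, comma separated.
    return Template(top_syntax).substitute(list=", ".join(rendered))


rows = [{"source": "App1"}, {"source": "App1"}, {"source": "App3"}]
message = render(rows, "s: ${source}", "Hello: ${list}")
print(message)
```

The detail syntax is applied once per matching row, while the top syntax is applied once per check result.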
check_pagefile
"filter=name = 'total
check_uptime
"warn=uptime < -
"crit=uptime < -
check_process process=explorer.exe
"warn=working_set > 70m"
"detail-syntax=${exe} ws:${working_set}, handles: ${handles}, user time:${user
Simple?
Let me guess
This all seems like a lot of typing!
Sensible defaults!
check_cpu
Just works!
Real-time monitoring
Active monitoring!
Monitored Server (Windows)
Monitoring Server (Nagios)
check_cpu
check_uptime
check_mem
check_eventlog
check_updates
...
Passive monitoring!
Real-time monitoring!
Monitored Server (Windows)
Monitoring Server (Nagios)
Error detected in eventlog
Everything is ok
[Diagram: Linux Kernel, CheckLogFile, NSClient++ Core, channels FILE and NSCA, NSCAClient, SimpleFileWriter, File]
No CPU overhead
Notified instantly
Powerful filtering
[/modules]
CheckLogFile = enabled
NSCAClient = enabled
SimpleFileWriter = enabled

[/settings/logfile/real-time/checks/my_check]
destination = FILE,NSCA
file = test.txt
warning = column1 like 'warn'
critical = column2 like 'crit'

[/settings/NSCA/client/targets/default]
address = 10.11.12.13
encryption = aes
password = secreter
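The behaviour this configuration describes can be approximated in a few lines of Python. This is a sketch under assumptions, not how CheckLogFile is implemented: classify and dispatch are hypothetical names, columns are assumed to be tab separated, and lines that match neither rule are assumed not to be forwarded.

```python
def classify(line, separator="\t"):
    """Map one log line to a status using the two rules from the config above."""
    columns = line.rstrip("\n").split(separator)
    column1 = columns[0] if len(columns) > 0 else ""
    column2 = columns[1] if len(columns) > 1 else ""
    if "crit" in column2:      # critical = column2 like 'crit'
        return "CRITICAL"
    if "warn" in column1:      # warning = column1 like 'warn'
        return "WARNING"
    return "OK"


def dispatch(lines, destinations):
    """Send every non-OK line to each destination as soon as it is seen."""
    for line in lines:
        status = classify(line)
        if status != "OK":
            for send in destinations:
                send(status, line.rstrip("\n"))


# Two in-memory lists stand in for the FILE and NSCA destinations.
file_channel, nsca_channel = [], []
dispatch(
    ["warning\tdisk almost full\n", "info\tall fine\n", "error\tcritical failure\n"],
    [
        lambda status, message: file_channel.append((status, message)),
        lambda status, message: nsca_channel.append((status, message)),
    ],
)
print(file_channel)
```

Each matching line is pushed to every listed destination the moment it is read, which mirrors the destination = FILE,NSCA setting.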
But I use
[Diagram: Linux Kernel, CheckLogFile, NSClient++ Core, channels FILE, NSCA and CACHE, NSCAClient, SimpleFileWriter, SimpleCache, NRPEServer]
No CPU overhead
Powerful filtering
Stored in cache
Check latest result
Fetched instantly
[/modules]
CheckLogFile = enabled
SimpleCache = enabled
NRPEServer = enabled

[/settings/logfile/real-time/checks/my_check]
destination = CACHE
file = test.txt
warning = column1 like 'warn'
critical = column2 like 'crit'

[/settings/NRPE/server]
allowed hosts = 10.11.12.13
allow arguments = true
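The cache variant can be pictured as a small store that real-time checks write to and that an active NRPE query reads from. The Python class below is a hypothetical stand-in for that idea, not the SimpleCache module itself; the class name, its methods and the UNKNOWN default are all assumptions.

```python
class SimpleResultCache:
    """Hypothetical stand-in for the CACHE channel: keeps the latest result per check."""

    def __init__(self):
        self._latest = {}

    def submit(self, check_name, status, message):
        # Called by the real-time check whenever a line matches.
        self._latest[check_name] = (status, message)

    def query(self, check_name):
        # Called when the monitoring server actively asks for the latest result.
        return self._latest.get(check_name, ("UNKNOWN", "no result cached yet"))


cache = SimpleResultCache()
cache.submit("my_check", "WARNING", "warning: disk almost full")
cache.submit("my_check", "CRITICAL", "error: critical failure")
print(cache.query("my_check"))
```

The expensive work happens when the log line arrives, so answering the active check is just a lookup.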
But how about graphing?
Two options:
1. Store/fetch from cache
2. Submit passively, but not to Nagios!
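Graphite is listed among the new protocols earlier in the talk, so one plausible target for passive submission is a Graphite server. The Python sketch below only formats metrics in Graphite's plaintext line format (metric path, value, Unix timestamp); the function name graphite_lines and the "nscp." path prefix are assumptions, and nothing here reflects how NSClient++ itself submits data.

```python
import time


def graphite_lines(host, metrics, timestamp=None):
    """Format performance data as Graphite plaintext lines: '<path> <value> <timestamp>'."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return [
        f"nscp.{host}.{name} {value} {ts}\n"   # "nscp." prefix is a made-up convention
        for name, value in sorted(metrics.items())
    ]


lines = graphite_lines("web01", {"cpu": 12, "mem": 34}, timestamp=1380000000)
print("".join(lines), end="")
```

Lines in this format would typically be written to the Graphite server's plaintext listener over a TCP socket.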
apt-get
git clone git://github.com/mickem/nscp.git
mkdir build ; cd build
cmake ../nscp
make
Manually install visual studio, python and cmake
Download and unpack nscp source
python nscp\build\python\fetchdeps.py --target x64 --cmake-config dist
cmake ../nscp
msbuild /p:Configuration=RelWithDebInfo NSCP.sln
Please help with packages!
I will give you free* beer!
*Free as in you're free to buy it yourself!
Native
Secure
Simple
Fast
Lightweight
A work in progress
check_service computer=192.168.0.1
check_disk drive=\\192.168.0.1\c$
check_task_sched computer=192.168.0.1
check_wmi computer=192.168.0.1
Lightweight remote deployable agent
Same as psexec
check_cpu
check_memory
check_process
External scripts!
http://nsclient.org
Monitoring Simplified
simple?
CheckEventLog file=application file=system MaxWarn=1
MaxCrit=1 "filter=generated gt -2d AND severity NOT IN
('success', 'informational') AND source != 'SideBySide'"
truncate=800 unique descriptions
"syntax=%severity%: %source%: %message% (%count%)"
simple?
check_eventlog
Photo by Olga Berrios
THANK YOU!
Information about NSClient++
http://nsclient.org
facebook.com/nsclient
Slides and examples
http://nsclient.org/nscp/conferances/nwc/2013/
My Blog
http://blog.medin.name