Post on 21-Jun-2015
Proactive Monitoring
Open ICT infrastructure monitoring solutions
Gentbrugge, 22 October 2014
Jan Guldentops ( j@ba.be )
BA N.V. ( http://www.ba.be )
Who am I?
● Jan Guldentops (° 1973)
● Historian by education, ICT infrastructure builder by vocation, security guy by accident
● Open Source fundamentalist after hours (LPI, RHCE, RHCSA, VSP, VTSP, ...)
● Focus on planning, building, securing and maintaining network, storage, server and cloud infrastructures
● Hands-on guy with 20 years of practical experience
– Testlab
– MacGyver projects
● Founding partner of Better access (° 1996) and BA (° 2003)
Brave new world
Why monitor ?
● Permanently keeping an eye on all aspects of your infrastructure
– Network, storage, security, servers, applications, power, etc.
● Seeing the status quaestionis at a glance
● Being able to alert the right people in case of problems
● Work proactively
– Detecting problems before they become critical
– Knowing something's wrong before the phone rings...
● Historical reporting
– Knowing when, where and what problems occur helps you locate typical problems and resolve them
– Did we keep our SLA ? Did our supplier keep theirs ?
● The numbers tell the tale ! ( Meten is weten, "to measure is to know" / Le mètre à ruban )
Lots of monitoring solutions
Open Source
NetSaint, Big Brother, OpenNMS
Nagios / Icinga
Commercial Open Source
Zabbix, Centreon, GroundWork
Closed source solutions
PRTG, WhatsUp Gold, InterMapper
SCOM ( Yes, even Microsoft enters this space ! )
“Enterprise”
HP, Tivoli, BMC, NetIQ, etc.
Cloud solutions aka Monitoring as a Service
Cloud provider, telco, etc.
Gartner
Funny but true
Why nagios ?
● OPEN !!!!!!!!
● Mature: almost 12 years of development went into it
● It has a big, living community and ecosystem
– Nagios Exchange
– Nagios plugin community
● Easy to adapt to specialized needs and monitoring possibilities
– e.g. I have a customer who uses it to monitor all aspects of his automated carwash setup
● It scales to pretty big infrastructures
– Multiple monitoring nodes
– Failover, etc.
● Last but not least: it works !
Problem with nagios ?
● Nagios is like Linux: everybody can build their own version of it
– Core Nagios ( version 2.0 or 3.0 )
– Enterprise version: Nagios XI
– Open Monitoring Distribution / Check_MK
– GroundWork
– op5
– Centreon
– Forking: Icinga
● Big collection of loose developments / packages
● Steep learning curve
BA Monitoring Distro
● BA decided to standardise on its own distro
● 100% open source
● Delivered as a physical or virtual appliance (Ready2run)
● Treasure chest of all the available tools, checks, templates and example configs
● Based on Check_MK / OMD
● Updatable / supportable
How does it work ?
Checks: server, storage and virtual infrastructure
● SNMP ( e.g. for HP servers )
● Agents:
– Check_MK
– NRPE ( Nagios Remote Plugin Executor )
– NSCA ( Nagios Service Check Acceptor )
– NSClient++
– Remote ssh commands
● Specific custom-built scripts
– Blacklisting check
– Backup check
– Etc.
Check_MK
Checks: network infrastructure
● ICMP / UDP
– Latency
– Packet loss
– Bandwidth
● Active monitoring by SNMP
– Pulls
– Traps
● RMON / Nflows / Rflows
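As an illustration of the latency and packet-loss checks above, here is a minimal sketch of how raw ping results could be turned into a monitoring status. The function name `evaluate_ping` and the threshold values are assumptions made for this example; only the 0/1/2 status codes follow the usual Nagios plugin convention.

```python
from statistics import mean

# Status codes as used by the Nagios plugin convention.
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate_ping(rtts_ms, sent, warn=(100.0, 20.0), crit=(500.0, 60.0)):
    """Classify one ping run.

    rtts_ms:   round-trip times (ms) of the replies that came back
    sent:      number of probes sent (assumed to be > 0)
    warn/crit: hypothetical (average latency in ms, packet loss in %) thresholds
    """
    loss_pct = 100.0 * (sent - len(rtts_ms)) / sent
    # No replies at all means the latency is effectively infinite.
    avg_ms = mean(rtts_ms) if rtts_ms else float("inf")
    if avg_ms >= crit[0] or loss_pct >= crit[1]:
        status = CRITICAL
    elif avg_ms >= warn[0] or loss_pct >= warn[1]:
        status = WARNING
    else:
        status = OK
    return status, avg_ms, loss_pct
```

Collecting the round-trip times themselves (for example by running `ping`) is left out on purpose; the sketch only covers the evaluation step.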
Checks: analysing central logfiles
● Logging all system messages to a central syslog
● Analyse the logfiles and create alerts
– Custom scripts
– Look for anomalies
– Splunk...
– Check backup logs
– Etc.
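A custom log-analysis script of the kind mentioned above could look roughly like this. It is a sketch only: the pattern names, the regular expressions and the threshold are invented for the example and would have to be tuned to the messages in your own central syslog.

```python
import re
from collections import Counter

# Hypothetical patterns; adapt them to your own environment.
PATTERNS = {
    "auth_failure": re.compile(r"Failed password|authentication failure"),
    "disk_error": re.compile(r"I/O error"),
    "oom": re.compile(r"Out of memory"),
}


def scan_log(lines):
    """Count how many lines match each named pattern."""
    hits = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name] += 1
    return hits


def alerts(hits, threshold=3):
    """Return the names of the patterns seen at least `threshold` times."""
    return sorted(name for name, count in hits.items() if count >= threshold)
```

In practice `lines` would be an open file handle on the central logfile, and the result of `alerts()` would be handed to the monitoring system as a check result.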
Checks: virtual machines
● Compatible with most virtualisation products
● Can monitor the VMs through
– Hypervisor / vCenter
– SNMP
● Same way as you check bare-metal servers:
– Check_MK
– NRPE / NSCA
– Remote ssh
Checks: services
● Check if a service is still running
● Through check plugins
● There are quite a number of checks available
● Use the community: http://nagiosplugins.org
● Write them yourself in Perl, Python or another interpreted language
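To give an idea of what writing a check yourself involves, here is a minimal Python sketch of a plugin that tests whether a TCP service still accepts connections. A Nagios-style plugin prints one status line and reports its result through the exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). The script name and the functions `check_tcp` and `main` are assumptions made for this example, not part of any existing plugin.

```python
import socket

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_tcp(host, port, timeout=5.0):
    """Return (status, message) for a simple TCP connect test."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"port {port} on {host} accepts connections"
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution errors.
        return CRITICAL, f"cannot connect to {host}:{port} ({exc})"


def main(argv):
    """Print one status line and return the exit code for the scheduler."""
    if len(argv) != 3 or not argv[2].isdigit():
        print("UNKNOWN - usage: check_tcp_sketch.py HOST PORT")
        return UNKNOWN
    status, message = check_tcp(argv[1], int(argv[2]))
    print(f"{LABELS[status]} - {message}")
    return status
```

When saved as a script, ending the file with `sys.exit(main(sys.argv))` (after `import sys`) passes the status back to the monitoring core.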
Checks: infrastructure
● Power / UPS
– SNMP
– Serial cable
– Custom software from the supplier
● Environmental: temperature / humidity sensors
– Lots of sensors available
– Usually work by SNMP
● Video security
● Access control systems
Checks: special projects
● Security: host-based IDS
● Scans of your network and the connected hosts ( NMAP / MAC table / etc. )
– What's new in the network ?
● Spam blacklist check
● Check certificates
● Rogue snapshot check
● License management
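The certificate check in the list above usually boils down to: how many days until the certificate expires? The sketch below shows only that evaluation step. It assumes the expiry date arrives in the text form Python's `ssl` module uses for `notAfter`; the function names and the 30/7-day thresholds are illustrative choices, not fixed values.

```python
import ssl

OK, WARNING, CRITICAL = 0, 1, 2


def days_until_expiry(not_after, now_epoch):
    """Days between `now_epoch` (seconds since the epoch) and the 'notAfter'
    timestamp of a certificate, e.g. 'Jun 15 12:00:00 2026 GMT'."""
    return (ssl.cert_time_to_seconds(not_after) - now_epoch) / 86400.0


def certificate_status(not_after, now_epoch, warn_days=30, crit_days=7):
    """Map the remaining lifetime to a status; expired certificates are CRITICAL."""
    days = days_until_expiry(not_after, now_epoch)
    if days < crit_days:
        return CRITICAL
    if days < warn_days:
        return WARNING
    return OK
```

Fetching the certificate from a live server (for instance with `ssl.SSLSocket.getpeercert()`) is deliberately left out, so the logic can be tried without network access.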
Visualisation
● Dashboard
● For techies and non-techies
● Web interface
● Mobile
– Mobile HTML5
– jNag mobile apps for Android, iPhone, iPad
● NagVis
– Allows you to project status on custom images
– Full customization possible !
Cool nagvis examples
Mobile App
Visualisation
Alarming
● Once something goes wrong you want to alert the right people
● Alert groups
● Alert groups can be combined with the right timings
● Alerts can be given by:
– SMS
– Semadigit
– Social media ( Twitter )
– Jabber ( instant messaging )
– RSS
● Special stuff:
– Integration with a ticketing system
– Web services
– Hardware ( I/O, lights, etc. )
– Automated stuff ( run scripts )
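Combining alert groups with timings can be pictured as a small routing table: each group has a channel and a time window, and an alert goes to every group whose window contains the current time. The groups, channels and hours below are hypothetical, chosen only to show a window that wraps past midnight.

```python
from datetime import time

# Hypothetical alert groups: who is notified, through which channel, and when.
ALERT_GROUPS = [
    {"name": "office", "channel": "jabber", "start": time(8, 0), "end": time(18, 0)},
    {"name": "on-call", "channel": "sms", "start": time(18, 0), "end": time(8, 0)},
]


def in_period(now, start, end):
    """True if `now` lies in [start, end); handles windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def recipients(now):
    """Return (group, channel) pairs that should receive an alert at time `now`."""
    return [
        (group["name"], group["channel"])
        for group in ALERT_GROUPS
        if in_period(now, group["start"], group["end"])
    ]
```

A real setup would also distinguish weekdays and holidays; that is left out to keep the idea visible.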
Reporting
● Historical data of what happened
● Every check has a status that can be kept for later analysis
● Can be used for:
– SLA
– Resource planning
– Troubleshooting
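For the SLA use case, the stored check states can be reduced to an availability percentage over a reporting period. The sketch below assumes the history is available as non-overlapping `(start, end, state)` intervals in seconds; that data layout, the function names and the 99.9% target are assumptions made for the example.

```python
def availability(intervals, period_start, period_end):
    """Percentage of the reporting period spent in state 'OK'.

    intervals: non-overlapping (start, end, state) tuples in seconds;
    time not covered by any interval counts as not OK.
    """
    total = period_end - period_start
    ok_seconds = 0.0
    for start, end, state in intervals:
        if state != "OK":
            continue
        # Clip each interval to the reporting period.
        overlap = min(end, period_end) - max(start, period_start)
        if overlap > 0:
            ok_seconds += overlap
    return 100.0 * ok_seconds / total


def sla_met(intervals, period_start, period_end, target_pct=99.9):
    """True if the measured availability reaches the agreed target."""
    return availability(intervals, period_start, period_end) >= target_pct
```

For example, 864 seconds of downtime in a 24-hour day gives 99.0% availability, which misses a 99.9% target.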
Use it as a tool
● Modus operandi:
– Acknowledge problems
– Schedule downtime
– Put the right relations between monitored entities
– Don't alert for everything and all the time !
● Integrate with other tools:
– Ticketing systems: OTRS, Omnitracker, TOPdesk ( work in progress )
– Dispatch
– Documentation systems
– Inventory
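The modus operandi above (acknowledge, schedule downtime, model relations, don't alert for everything) amounts to a filter in front of the notification step. The following is a simplified sketch of that filter, not how Nagios implements it internally; the data structures and the function name are assumptions.

```python
def should_notify(host, states, parents, acknowledged, in_downtime):
    """Decide whether a problem on `host` deserves a notification.

    states:       {host: "UP" or "DOWN"}
    parents:      {host: parent host}, a hypothetical network topology
    acknowledged: set of hosts whose problem somebody has already picked up
    in_downtime:  set of hosts inside a scheduled maintenance window
    """
    if states.get(host) != "DOWN":
        return False
    if host in acknowledged or host in in_downtime:
        return False
    # If something upstream is down, this host is merely unreachable:
    # alert on the root cause only.
    parent = parents.get(host)
    seen = set()
    while parent is not None and parent not in seen:
        if states.get(parent) == "DOWN":
            return False
        seen.add(parent)
        parent = parents.get(parent)
    return True
```

With a switch as parent of two servers, a failing switch then produces one alert instead of three.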
Monitoring isn't always right
● A check is only as intelligent as you make it !
● False positives or negatives
● Problems with:
– Network latency
– Load on the monitoring server
– Load on the monitored appliance
● Monitoring infrastructure is a great target for hackers !
New possibilities
● SIEM: Security Information and Event Management
– Aanval integration with Nagios is under development
● DevOps
– DevOps is the practice of operations and development engineers participating together in the entire service lifecycle, from design through the development process to production support.
– DevOps is also characterized by operations staff making use of many of the same techniques as developers for their systems work.
● Application Performance Monitoring
● Automonitoring
– Automagically provisioning monitoring in your systems
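One way to picture automonitoring is to generate the monitoring configuration from an inventory instead of writing it by hand. The sketch below renders Nagios-style host object definitions from a list of `(name, address)` pairs. The inventory format and function names are assumptions, and `generic-host` is just the template name used in the Nagios sample configuration; substitute your own.

```python
def render_host(name, address, template="generic-host"):
    """Render one Nagios-style host object definition."""
    return (
        "define host {\n"
        f"    use        {template}\n"
        f"    host_name  {name}\n"
        f"    address    {address}\n"
        "}\n"
    )


def render_inventory(inventory):
    """Render all hosts; inventory is an iterable of (name, address) pairs,
    for example exported from a CMDB or a provisioning tool."""
    return "\n".join(render_host(name, address) for name, address in sorted(inventory))
```

The generated text would be written to a configuration file and picked up by a reload of the monitoring core; that step depends on the distribution in use and is not shown.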
Demotime!
Thank You
Contact us
016/29.80.45
016/29.80.46
www.ba.be / Twitter: batweets
Remy Toren
Vaartdijk 3/501
B-3018 Wijgmaal
info@ba.be
Twitter: JanGuldentops
http://be.linkedin.com/in/janguldentops/