Osmius: Monitoring Made Easy
Osmius: The Open Source Monitoring Tool
Monitoring Made Easy
Osmius Peopleware
Schedule
● What is Osmius? Main capabilities and concepts
● Monitoring with Osmius: Instances and Services, Reports
● Osmius infrastructure: deploying agents and centralized management
● Osmius Framework: let's make a new agent
What is Osmius?
Osmius is open source software that lets us monitor and supervise anything connected to a network.
What is “anything”?
● Systems: host servers, applications, databases. Examples: service is down, response time, CPU % load.
● Applications: web applications, services, end-user experience. Examples: response time, transactions.
● And what about... social networks? Clusters? News about a new protein?
Why monitoring?
● Know before your users that a problem has occurred... before they call you.
● Foresee problems before they arise... so you can prevent them from occurring.
● Capacity planning... review historical data to analyze trends.
● Improve quality...
● Monitoring is a growing market... there are more and more systems connected, right?
But... why monitoring?
Save costs!
● Reduce unavailability in your business processes.
● Use resources where they are supposed to be. Prioritize.
● Foresee problems and save the hours spent dealing with them.
● Use those hours to improve or develop new areas.
● Avoid false alarms and the “always running” symptom.
● Learn your users' behavior from your systems.
Why Osmius?
● Easy: to understand... which means easy to implement.
● Business oriented: from the technical view... to service and business process targets.
● Fast: near-real-time application... C++ and C core, not only scripts. 1000 events/sec on this laptop.
● Extendable: Osmius Development Framework... build your own agents. Choose intrusive or non-intrusive ones.
● Multiplatform... don't get tied to specific vendors or markets.
● Open software, open architecture, open research... open business model, commercial support, universities.
Distributed Architecture
● Your business, your “things”: routers, servers, web servers, databases, applications... stock shares...
● Agents (through an API): monitor, events.
● Master Agents (MA): config tasks, control, deployment.
● Central Server (CS): reception, correlations, notifications.
● SQL Database.
● Java Console (Tomcat server): operation and admin, business view.
● SSL.
● More than 100,000 events/sec.
● Round-robin storage policy.
● ACE Framework: fast, multiplatform.
Business View
● Operation (technical): events, instances, states. Routers, servers, web servers, databases.
● Services (managers): services, availability, SLA. CRM, Intranet, ..., Web.
● Business (staff): SLAs, processes, control panel, notifications, subscriptions. Gold, Silver, Bronze.
● More views...: admin, security, data mining. Broken agreements, predictions.
● Processes: Billing, P2, ..., Pn.
● Business-oriented notifications: process, services, SLA, instances.
Osmius Features
Easy to understand...
● Instance: everything you want to monitor.
● Instance Type: defines the class of instance.
● Event Types: variables you poll from instance types.
● Event: the value returned when a variable is polled.
● Criticity: the event “color”.
Instance Types / Instances: Intranet DB, Customers DB, Firewall Host, Other Server
Event Types: # Sessions? CPU Load %? Free disk MB? # Users? Uptime?
Events:
● 13 sessions in Intranet
● 99 sessions in CustDB
● 10 seconds uptime CustDB
● 80% CPU Load in firewall
● 100 users in Other!!
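The concepts on this slide can be sketched as a small data model. This is only an illustration, not Osmius code: the class layout, the “Database” instance type, and the color names attached to each criticity level are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Criticity(Enum):
    """Event "color". The colors chosen here are assumptions."""
    OK = "green"
    WARNING = "yellow"
    CRITICAL = "red"


@dataclass(frozen=True)
class EventType:
    name: str  # variable polled from an instance type, e.g. "# Sessions"


@dataclass(frozen=True)
class InstanceType:
    name: str           # the class of instance
    event_types: tuple  # EventType objects that can be polled on this class


@dataclass(frozen=True)
class Instance:
    name: str           # the thing being monitored
    instance_type: InstanceType


@dataclass(frozen=True)
class Event:
    instance: Instance
    event_type: EventType
    value: float        # the answer obtained when the variable was polled
    criticity: Criticity


# Hypothetical wiring of the slide's first example event.
sessions = EventType("# Sessions")
database = InstanceType("Database", (sessions,))
intranet_db = Instance("Intranet DB", database)
event = Event(intranet_db, sessions, 13, Criticity.OK)
print(f"{event.value} sessions in {event.instance.name}")
```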
Osmius Features
Easy to understand...
[Console screenshot: events and state for the instance “Router” (Instance Type: SNMP Device; Event Type: System Desc)]
Osmius Features
Easy to integrate with the business...
● Instance: everything you want to monitor.
● Service: a group of Instances.
● SLA: Service Level Agreement.
Services should meet their SLAs.
Intranet Service: “Alpha” Host, “Beta” Router, “Gamma” Host, Intranet DB, Exchange Srv, ...
SLA Gold: each month these services must have:
● Availability: 99.999%
● OK state: 95.999%
● Down: less than 10 minutes per week
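As a rough illustration of how an availability target like the one above can be checked, the sketch below computes availability from downtime over a period. The function names, the 30-day month, and the 5 minutes of downtime are assumptions for the example; the slides do not say how Osmius computes the figure.

```python
def availability_percent(period_minutes: float, down_minutes: float) -> float:
    """Share of the period during which the service was up, as a percentage."""
    return 100.0 * (period_minutes - down_minutes) / period_minutes


def meets_target(period_minutes: float, down_minutes: float, target_percent: float) -> bool:
    """True when the measured availability reaches the SLA target."""
    return availability_percent(period_minutes, down_minutes) >= target_percent


MONTH_MINUTES = 30 * 24 * 60  # assumption: a 30-day month

# Hypothetical month with 5 minutes of downtime, checked against 99.999%.
print(round(availability_percent(MONTH_MINUTES, 5), 3))
print(meets_target(MONTH_MINUTES, 5, 99.999))
```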
Osmius Features
Service Oriented
[Console screenshot: service view showing mean time between failures and recovery, availability over 30 days, and service events]
Osmius Features
Defining SLAs
SLA Targets
Osmius Features
Tracking SLAs: Control Panel
Osmius Features
Easy to be informed... notifications
Subscriptions: everything you want to be informed about, even when you are out of the office.
Subscription channels: you can be notified in several ways:
● By email.
● By SMS.
● By Jabber.
● Using Asterisk.
● By a new trouble ticket in the Help Desk.
Diagram:
● Subscription (“notify me when”), by notification type: Service Availability Changes, Instance State Change, SLA. Example: Intranet Service, for users X and Y.
● Channel (“by”): email, SMS, Jabber, ...
● Time shifts (“if I am in”): working time, not working time, out of office.
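The “notify me when / by / if I am in” idea can be sketched as a lookup from a subscription and the current time shift to a channel. Everything below is hypothetical: the `Subscription` class, the `channels_for` function, and the mapping of shifts to channels are illustrative assumptions (the slide only pairs “out of office” with Jabber).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    user: str
    notification_type: str  # "notify me when"
    target: str             # what the subscription watches
    channel_by_shift: dict  # "if I am in" -> "by"


def channels_for(subscriptions, notification_type, target, current_shift):
    """Return (user, channel) pairs that should be notified right now."""
    matches = []
    for sub in subscriptions:
        if sub.notification_type != notification_type or sub.target != target:
            continue
        channel = sub.channel_by_shift.get(current_shift)
        if channel is not None:  # no channel configured for this shift: stay quiet
            matches.append((sub.user, channel))
    return matches


subs = [
    Subscription("X", "Service Availability Changes", "Intranet Service",
                 {"working time": "email",
                  "not working time": "SMS",
                  "out of office": "Jabber"}),
    Subscription("Y", "Instance State Change", "Router",
                 {"working time": "email"}),
]

print(channels_for(subs, "Service Availability Changes",
                   "Intranet Service", "out of office"))
```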
Osmius Features
Easy to be informed... global state
Global State: overall system mark between 0 and 100.
● Based on service state and availability.
● Each service is weighted based on its SLA targets.
● Can be used to track system evolution.
● Be notified when it is below 80.
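One plausible way to obtain a single 0 to 100 mark is a weighted mean of per-service scores. The slides do not give the actual formula, so the weighted mean, the scores, and the weights below are all assumptions for illustration.

```python
def global_state(services):
    """Weighted mean of per-service scores, each score in the range 0..100.

    `services` is a list of (score, weight) pairs. The weight stands in for
    the importance derived from the service's SLA targets.
    """
    total_weight = sum(weight for _, weight in services)
    if total_weight == 0:
        raise ValueError("at least one service needs a positive weight")
    return sum(score * weight for score, weight in services) / total_weight


# Hypothetical services: (score, weight).
services = [(100.0, 3), (90.0, 2), (40.0, 1)]
mark = global_state(services)
print(round(mark, 1))
if mark < 80:
    print("notify: global state below 80")
```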
Osmius Features
Easy to ... maintain
Concept: you don't need to know the exact CPU load on February 16th, 2007 at 03:00 a.m.
The older the data, the less detail you need.
● 5 minutes ago: exact CPU load
● Last week: hourly average
● Last year: daily average
● More than two years: doesn't matter!
Osmius parameters:
● Number of days to delete data.
● Number of days to group events: one average per day.
● Number of days to group events: one average per hour.
Osmius Features
Easy to ... maintain
Osmius automatically takes care of these parameters:
● E: number of days to erase data.
● D: group events, one per day.
● H: group events, one per hour.
[Timeline for CPU LOAD on TUX host, oldest to newest: no data | E | 1/day | D | 1/hour | H | today: maximum detail]
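The retention idea can be sketched as a function that maps the age of a data point to the detail kept for it. Treating E, D and H as age thresholds in days, and the values 730, 7 and 1, are assumptions chosen to match the examples on the previous slide, not Osmius defaults.

```python
def detail_level(age_days: float, erase_after: int, daily_after: int,
                 hourly_after: int) -> str:
    """Detail kept for data of a given age, under thresholds E, D and H."""
    if not (hourly_after <= daily_after <= erase_after):
        raise ValueError("expected H <= D <= E")
    if age_days > erase_after:
        return "erased"
    if age_days > daily_after:
        return "one average per day"
    if age_days > hourly_after:
        return "one average per hour"
    return "full detail"


# Assumed values: erase after two years, daily after a week, hourly after a day.
E, D, H = 730, 7, 1
for age in (0.01, 3, 200, 800):
    print(age, detail_level(age, E, D, H))
```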
Osmius Features
Easy to ... maintain. Correlation
Correlation: if the last event from the HOST01 instance was CPULOAD with state critical, and a new event of the same type arrives, I only want to see one row if the state is also critical.
If a new event reports that ROUTER is up and OK, please remove both events from the “active view”.
Instance State and Service State are calculated from the state of the active events.
The Active Events view should be clean.
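The correlation rules above can be sketched with a dictionary that acts as the active view. This is an illustration under assumptions, not the Central Server's implementation: keying rows by (instance, event type), taking the worst active state as the instance state, and the event name "STATUS" are all made up for the example.

```python
SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


class ActiveView:
    def __init__(self):
        # (instance, event_type) -> {"state": ..., "count": ...}
        self.rows = {}

    def receive(self, instance, event_type, state):
        key = (instance, event_type)
        if state == "OK":
            # Recovery: the problem row leaves the active view and the
            # OK event itself is not kept there either.
            self.rows.pop(key, None)
            return
        row = self.rows.get(key)
        if row is not None and row["state"] == state:
            row["count"] += 1  # repeated event: still a single row
        else:
            self.rows[key] = {"state": state, "count": 1}

    def instance_state(self, instance):
        """Worst state among the instance's active events (assumed rule)."""
        states = [row["state"] for (inst, _), row in self.rows.items()
                  if inst == instance]
        return max(states, key=SEVERITY.get, default="OK")


view = ActiveView()
for _ in range(3):
    view.receive("HOST01", "CPULOAD", "CRITICAL")
view.receive("ROUTER", "STATUS", "CRITICAL")
view.receive("ROUTER", "STATUS", "OK")

print(len(view.rows), view.rows[("HOST01", "CPULOAD")]["count"])
print(view.instance_state("HOST01"), view.instance_state("ROUTER"))
```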
Osmius Features
Easy to ... maintain. Correlation
[Console screenshot: Historic View with repeated events, reduced to “only” 10 rows]
Osmius Features
Easy to ... configure.
Templates: group events and parameters to apply in batch mode to one or several instances.
Default: Osmius provides a default template with the main events and parameters for monitoring a typical Instance.
Template “Default”
● % CPU Load: look every 5 mins || Warning: > 90 || Critical: > 95
● # Users: inactive
● Net KBytes Out: look every 5 mins || Warning: > 100 || Critical: > 150 | Silent
Template “Secure”
● % CPU Load: look every 30 secs || Warning: > 80 || Critical: > 85
● # Users: look every 30 secs || Warning: > 10 || Critical: > 15
● Net KBytes Out: look every 60 secs || Warning: > 30 || Critical: > 40
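The two templates can be written down as data and used to classify a polled value. The intervals and thresholds come from the slide; the `EventConfig` class, the `classify` function, and the state names are hypothetical and only sketch the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventConfig:
    interval_seconds: Optional[int]  # None means the event is inactive
    warning_above: Optional[float] = None
    critical_above: Optional[float] = None
    silent: bool = False


TEMPLATES = {
    "Default": {
        "% CPU Load": EventConfig(300, 90, 95),
        "# Users": EventConfig(None),
        "Net KBytes Out": EventConfig(300, 100, 150, silent=True),
    },
    "Secure": {
        "% CPU Load": EventConfig(30, 80, 85),
        "# Users": EventConfig(30, 10, 15),
        "Net KBytes Out": EventConfig(60, 30, 40),
    },
}


def classify(template, event_type, value):
    """Return the state for a polled value, or None if the event is inactive."""
    config = TEMPLATES[template][event_type]
    if config.interval_seconds is None:
        return None
    if value > config.critical_above:
        return "CRITICAL"
    if value > config.warning_above:
        return "WARNING"
    return "OK"


# The same reading is judged more strictly by the "Secure" template.
print(classify("Default", "% CPU Load", 92), classify("Secure", "% CPU Load", 92))
```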
Osmius Features
Easy to ... configure. Silent Mode
Silent mode: you can configure each defined event to work in “silent mode”.
“Don't send me events unless there is a change of state”
● From OK to WARNING: YES
● From CRITICAL to OK: YES
● From OK to OK: NO
Saves network resources and prevents resource starvation.
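Silent mode amounts to remembering the last state per event and sending only on a change. The sketch below is a minimal illustration; the `SilentFilter` class is hypothetical, and sending the very first event (when no previous state is known) is an assumption.

```python
class SilentFilter:
    def __init__(self):
        self.last_state = {}  # (instance, event_type) -> last state seen

    def should_send(self, instance, event_type, state):
        """True only when the state differs from the previous one."""
        key = (instance, event_type)
        changed = self.last_state.get(key) != state
        self.last_state[key] = state
        return changed


silent = SilentFilter()
sequence = ["OK", "OK", "WARNING", "CRITICAL", "OK"]
decisions = [silent.should_send("firewall", "% CPU Load", s) for s in sequence]
print(decisions)
```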
Osmius Features
Easy to ... manage
Agents: they are responsible for retrieving events.
Master Agents: they allow us to manage their agents.
● Configuration changes.
● Deployment of new agents and files.
● Run on several platforms (Unix, Windows).
[Diagram: the Server and two Masters, each Master with its agents (HTTP Agent, Linux Agent, MySql Agent, ...). Labels: Events, Tasks, Config., Instances; Reload, Start | Stop, Deploy; SSL-secured communications]
Osmius Features
Easy to ... manage
[Console screenshot: a Master Agent with its agents, remote tasks, agent parameters, and the instances monitored by this Master Agent's agent]
Osmius Features
Easy to extract information from data: Reports
Goals: provide good reports within the Console.
A few selected reports:
● First, think about what you want.
● Design it.
● Try it.
● Add parameters (week, month, top 10, top 20, ...).
● Is it OK?
You can always do it yourself (open source again).
Users don't need to install a new product, “Osmius Reporter”.
Osmius Features
Reports
● Top N events: identify the most problematic events by occurrence or criticity.
● Event evolution per day: identify event storms and evolution.
● Top N active Instances: which items are generating the most events and alarms.
● Top N unavailable Instances or Services: which items are those “always down”. :(
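A “Top N events by occurrence” report boils down to counting and sorting. The sketch below uses Python's `collections.Counter`; the `top_n_events` function and the sample events are hypothetical, and the real reports are presumably built from the Osmius database rather than an in-memory list.

```python
from collections import Counter


def top_n_events(events, n):
    """Most frequent (instance, event type) pairs, most frequent first."""
    counts = Counter((instance, event_type) for instance, event_type, _ in events)
    return counts.most_common(n)


# Hypothetical history: (instance, event type, state).
history = [
    ("firewall", "CPU Load", "CRITICAL"),
    ("firewall", "CPU Load", "CRITICAL"),
    ("CustDB", "Sessions", "WARNING"),
    ("firewall", "CPU Load", "WARNING"),
    ("CustDB", "Sessions", "WARNING"),
    ("Intranet", "Sessions", "WARNING"),
]

for (instance, event_type), count in top_n_events(history, 2):
    print(instance, event_type, count)
```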
Osmius Features
Reports
● Top N least healthy Services: identify the most problematic Services.
● Inventory: elements, Services, configured events...
● Infrastructure: how many agents, where, type, ...
Osmius Features
Reports
Osmius Features
Easy to install
Downloads:
● Source code tarballs on SourceForge.
● Get the latest code from Subversion on SourceForge.
● Binary distributions for the server and master agents, one per platform.
Next – Next – Next: we're working with BitRock to make installers:
● Multiplatform.
● Graphical and text mode.
● Very, very easy.
www.bitrock.com
BitRock's mission is to make software easier to use and deploy.
Osmius Features
Easy to expand
Agent Framework: developing new agents with the Osmius Framework is easy.
● Training courses and documentation are available.
● How about one week to have a new agent? (our average)
● Integrated with Osmius and remote management. Robust and tested. Fast.
● Define your own events: “Notify me when unprocessed orders > 31”.
Open Source (GPLv2 License):
● Users don't have to be tied to a specific provider.
● What kind of monitoring software are you relying on?
Development Model
Methodology
Scrum agile methodology.
● Organize features in a product backlog.
● Prioritize the features:
  ● Customer needs.
  ● Product goals and research lines.
● Prepare a Sprint: a set of features that fits in one month.
● Release a new internal or customer version every month.
  ● Unit and integration tests
  ● Documentation
● High visibility. Publish a “burn down chart” (next slide).
Development Model
● Results and demo every month.
● Stable release: twice a year.
● Task: lasts two days max. Updated every day.
● Visibility.
● Enables work at home.
(c) Softhouse
Development Model
Releases: Osmius 8.01-1, Osmius 8.04-1, Osmius 8.05-1, Osmius 8.07 (Production)
Documentation
● What is Osmius? http://www.Osmius.net
● Osmius Manual Wiki: http://www.Osmius.net/osmwiki
● Osmius Professional Services: http://www.Osmius.com
Osmius: The Open Source Monitoring Tool
Osmius is supported by the Ministry of Science and Education, the Ministry of Industry, Tourism and Trade, and the Centro de Desarrollo Tecnológico e Industrial of Spain.
Osmius & Peopleware