Configuration of Metrics and Policies in Oracle Enterprise Manager


Configuration of Metrics and Policies in Oracle Enterprise Manager, 11 July 2007

Configuration of Metrics in Oracle Enterprise Manager

Oracle Enterprise Manager 10gR2

Monitoring targets with OEM leans heavily on metrics and policies. As you will have noticed, installing Oracle Enterprise Manager also installs hundreds of metrics and policies. These will most probably start generating large numbers of alerts, and at the end of the day you might end up with an EM console showing many target alerts and policy violations.

This paper focuses on the configuration of metrics by using monitoring templates.

As seen in many organisations, it proves to be pretty difficult to get things organized. Which metrics are important to monitor, and what values make sense as warning and critical thresholds?

When trying to answer these questions, start by asking yourself for each metric:

• What risk is there for the continuity of services when this metric’s warning or critical threshold is met and an alert is generated?
• What should the reaction of the DBA staff be if a warning or critical alert is generated? If we are unable to specify such a reaction, it does not make sense to monitor this metric, because there would be nothing to do when the alert fires.
• More specifically, you would also want to know: what are the right values for the warning and critical thresholds?

To determine which metrics to ask these questions for, start by figuring out which types of targets you are monitoring: Hosts, Database Instances, Clusters, Cluster Databases, Listeners, Agents, OC4J, BPEL_Process_Manager, etc.

For each of these target types we will create monitoring templates.

Putting it all in action

First we create a spreadsheet to put all the information together.

The spreadsheet could contain the following columns:

• Metric
• Business risk
• Business impact
• User action
• Consecutive Number of Occurrences Preceding Notification
• Warning threshold
• Critical threshold
• Collection schedule

Of course you can add other columns if that makes sense for your specific needs.
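As a minimal sketch, one spreadsheet row could be modelled as follows. The field names simply mirror the columns listed above, the first sample row uses the example values this paper gives for the corruption metric, and the second row is hypothetical. The helper `worth_monitoring` is an illustrative name, not part of OEM; it encodes the rule stated earlier that a metric without a defined DBA reaction is not worth monitoring. Thresholds and schedule are left blank because the paper does not give values for them.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class MetricRow:
    """One spreadsheet row; fields mirror the columns listed in the paper."""
    metric: str
    business_risk: str
    business_impact: str
    user_action: str
    occurrences_before_notification: int
    warning_threshold: str
    critical_threshold: str
    collection_schedule: str


def worth_monitoring(row: MetricRow) -> bool:
    """Keep a metric only if a DBA reaction has been written down for it."""
    return bool(row.user_action.strip())


def to_csv(rows) -> str:
    """Render rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(MetricRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


rows = [
    MetricRow(
        metric="Data Block Corruption Error Stack",
        business_risk="Loss of business data",
        business_impact="High",
        user_action="Consider database or tablespace recovery.",
        occurrences_before_notification=1,  # assumed value, not from the paper
        warning_threshold="",               # not given in the paper
        critical_threshold="",              # not given in the paper
        collection_schedule="",             # not given in the paper
    ),
    MetricRow(
        metric="Hypothetical Metric Without Follow-up",  # made-up row
        business_risk="Unknown",
        business_impact="Low",
        user_action="",                     # nobody knows what to do on alert
        occurrences_before_notification=1,
        warning_threshold="",
        critical_threshold="",
        collection_schedule="",
    ),
]

# Drop every metric for which no reaction could be specified.
keep = [row for row in rows if worth_monitoring(row)]
print(to_csv(keep))
```

Filtering on the user action column first keeps the later template work limited to metrics that someone will actually act on.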

A closer look at the spreadsheet

Let’s see how we are going to use the spreadsheet.

Metric

The name of the metric.

Example: “Data Block Corruption Error Stack”

Business risk

What risk is there for the continuity of services when this metric’s warning or critical threshold is met and an alert is generated?

Example: “Loss of business data”

Business impact

What impact does this risk have on the business?

Example: “High”


User action

What should the DBA do next?

Example: “Consider database or tablespace recovery.”

Consecutive Number of Occurrences Preceding Notification

After how many consecutive violations should an alert be triggered?

Example: If you want to be alerted when CPU usage exceeds 80%, it might be wise to raise the alert only after CPU usage has exceeded 80% for 3 consecutive occurrences. High CPU usage can occur briefly on a busy production system, but when it persists for 15 minutes it might indicate a structural problem.
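The effect of this setting can be sketched in a few lines. This is only an illustration of the counting rule, not OEM code: `first_alert_index` is a hypothetical helper, the sample values are made up, and the 5-minute collection interval is an assumption chosen so that 3 occurrences correspond to the 15 minutes mentioned above.

```python
COLLECTION_INTERVAL_MINUTES = 5  # assumption: one CPU sample every 5 minutes


def first_alert_index(samples, threshold, required_occurrences):
    """Return the index of the sample at which an alert would be raised.

    An alert is raised once `required_occurrences` samples in a row exceed
    `threshold`. Any sample at or below the threshold resets the count.
    Returns None if no alert would be raised.
    """
    consecutive = 0
    for index, value in enumerate(samples):
        if value > threshold:
            consecutive += 1
            if consecutive >= required_occurrences:
                return index
        else:
            consecutive = 0
    return None


# Made-up CPU usage samples in percent.
cpu_usage = [85, 70, 82, 88, 91, 60]

# A single spike (85) does not alert; three violations in a row do.
print(first_alert_index(cpu_usage, threshold=80, required_occurrences=3))

# Time spent above the threshold before the alert, under the assumed interval.
print(3 * COLLECTION_INTERVAL_MINUTES, "minutes")
```

With the count set to 1 the same data would alert on the very first sample, which is the kind of short-lived spike this setting is meant to filter out.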

Warning and critical thresholds

Here you specify the threshold values for the warning and critical alert levels.

Creating templates

After finishing the analysis of the metrics in your spreadsheet, the next step is to create the monitoring templates.

Create standard templates for each target type that needs to be monitored. Create customized templates for customer- or project-specific situations. These customized templates should contain only the metric threshold settings that differ from the threshold settings in the standard templates.
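The relationship between a standard and a customized template can be pictured as a base set of thresholds plus a small set of overrides. The sketch below models only the resulting thresholds; it does not call any OEM API. The template contents, the metric names, and all threshold values are illustrative assumptions, and `effective_thresholds` is a hypothetical helper.

```python
# Standard template for hosts: thresholds for every monitored metric
# (illustrative metric names and values).
ST_HOST = {
    "CPU Utilization (%)": {"warning": 80, "critical": 95},
    "Filesystem Space Available (%)": {"warning": 20, "critical": 5},
}

# Customized template for one project: only the settings that differ.
CT_HOST_P0022 = {
    "CPU Utilization (%)": {"warning": 90, "critical": 98},
}


def effective_thresholds(standard, customized):
    """Return the thresholds in force after the customized template
    is layered on top of the standard one. Inputs are not modified."""
    merged = {metric: dict(levels) for metric, levels in standard.items()}
    for metric, levels in customized.items():
        merged.setdefault(metric, {}).update(levels)
    return merged


effective = effective_thresholds(ST_HOST, CT_HOST_P0022)
for metric, levels in effective.items():
    print(metric, levels)
```

Keeping the customized template this small means a later change to a standard threshold reaches every target that does not explicitly deviate from it.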

So you might end up with something like:

• CT-Cluster-P0022
• CT-Cluster-P0045
• CT-Single-Database-Instance-CUST32
• ST-Agent
• ST-Cluster
• ST-Cluster-Database
• ST-Cluster-Instance
• ST-Host
• ST-Listener
• ST-Single-Database-Instance

This example shows 3 customized templates, of which 2 are project specific and 1 is customer specific, and 7 standard templates.
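The naming convention makes it easy to tell the two kinds of template apart mechanically. The following sketch assumes, based on the list above, that the prefix "ST" marks a standard template and "CT" a customized one; `template_kind` is a hypothetical helper used only for this illustration.

```python
from collections import Counter

TEMPLATE_NAMES = [
    "CT-Cluster-P0022",
    "CT-Cluster-P0045",
    "CT-Single-Database-Instance-CUST32",
    "ST-Agent",
    "ST-Cluster",
    "ST-Cluster-Database",
    "ST-Cluster-Instance",
    "ST-Host",
    "ST-Listener",
    "ST-Single-Database-Instance",
]

# Assumed meaning of the prefixes, inferred from the example list.
PREFIX_KINDS = {"ST": "standard", "CT": "customized"}


def template_kind(name):
    """Classify a template by the part of its name before the first hyphen."""
    prefix = name.split("-", 1)[0]
    return PREFIX_KINDS.get(prefix, "unknown")


counts = Counter(template_kind(name) for name in TEMPLATE_NAMES)
print(counts["customized"], "customized,", counts["standard"], "standard")
```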

Applying templates to targets

To apply all metric thresholds to your targets, apply the monitoring templates to these targets.

Copyright © 2007 Rob Zoeteweij

E-mail: [email protected]

Blog: http://mcbobsstruggle.typepad.com/oracle_enterprise_manager/