Deploying NNMi 9.0

Post on 22-Nov-2014


description

This session is a must for anyone who is planning to deploy NNMi 9.0. This is a technical presentation focused on tips and tricks for successfully deploying and configuring NNMi software. Attendees will learn field tested best practices for deploying NNMi and NNMi Advanced software.

Transcript of Deploying NNMi 9.0

1 ©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice

Deploying NNMi 9.0

Larry Besaw

Chief Architect, NNMi

2

Introduction

– Sample “deployment” on a small test lab

– All using NNMi 9.00

– Will not address NNM 6.x/7.x to NNMi upgrades. This will be a fresh installation of NNMi 9.00

– Goal is to give you a feel for what is required and for you to see how straightforward the tasks are

– This is abbreviated but the steps are similar for even our largest deployments

– HP has written various white papers on deploying NNMi 9:

http://support.openview.hp.com/selfsolve/manuals

3

Steps We'll Take

– Initial Login and User Creation

– Apply license

– Configure Communication

– Configure Discovery

– Configure Monitoring

– Configure Incidents, Traps and Automatic Actions

– Configure the Graphical User Interface

– Maintenance

– Health Checks

– Possible Use Scenarios

4

Other Steps We Will Not Cover Include:

– Machine sizing

– NNMi support for HTTPS and LDAP

– Integration with other products such as HP NA, HP OM, HP UCMDB, and other 3rd party products

– Configuring HA or Application Failover

– Configuring an Oracle database

– Configuring Global Network Management (GNM)

– Configuring IPv6

– iSPIs

– See the NNMi Deployment Guide for more information on these topics

5

Assumptions

– Installation has already been done

• Installation Hints: Check all prerequisites, especially kernel parameters, shared memory, semaphores, RAM, etc.

– This example is done on a Unix machine. Paths need to be converted for Windows.

6

After the Install, Validate Processes Are Running

– At the command line, run “ovstatus -c” for a basic check.

– Most processes now run inside ovjboss, so you must also run “ovstatus -v” for the details of the jboss services.

7

jboss Processes

# ovstatus -v ovjboss

object manager name: ovjboss

state: RUNNING

PID: 20413

last message: Initialization complete.

exit status: -

additional info:

SERVICE STATUS

CPListener Service is started

CommunicationModelService Service is started

CommunicationParametersStatsService Service is started

CustomPoller Service is started

EventsCustomExportService Service is started

ExtensionDeployer Service is started

InstanceDiscoveryService Service is started

IslandSpotterService Service is started

KeyManager Service is started

StagedSnmp Service is started

StatePoller Service is started

TrustManager Service is started
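
A check like this is easy to automate. The following is a minimal sketch, not an HP-supplied tool: it assumes the output of `ovstatus -v ovjboss` has been captured as text and that each service line has the form "<name> Service is started", as in the listing above. The function name and the sample text are illustrative only.

```python
# Sample text modeled on the listing above (shortened).
SAMPLE = """\
object manager name: ovjboss
state: RUNNING
additional info:
SERVICE STATUS
CPListener Service is started
StatePoller Service is started
TrustManager Service is started
"""


def services_not_started(ovstatus_text):
    """Return names of services whose status is not 'Service is started'."""
    problems = []
    in_services = False
    for line in ovstatus_text.splitlines():
        line = line.strip()
        # The service table begins after the "SERVICE STATUS" header line.
        if line.startswith("SERVICE") and "STATUS" in line:
            in_services = True
            continue
        if not in_services or not line:
            continue
        # First token is the service name; the remainder is its status.
        name, _, status = line.partition(" ")
        if status.strip() != "Service is started":
            problems.append(name)
    return problems


print(services_not_started(SAMPLE))
```

An empty list means every listed service reported as started.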

8

Initial Login

– Log in with a browser (Java plug-ins are no longer required)

– http://myserver.example.com/nnm

9

– Initially log in with the system password created during installation

10

Create a new user

– It's best to create an administrator account rather than using the system login

11

– Click on the Configuration Workspace and select User Interface configuration

12

– Click on the New Icon to launch the Account Mapping form

– Select the pull down menu to the right of the Account entry and select New. (It is a common mistake to try to simply type in the Account rather than creating a New one.)

13

– Type in the User Name and Password (we'll call ours “admin” and the password will be “adminpw”)

– Select the role, then click Save and Close

14

– We now have an admin account

– Try logging out and back in with the new account. The user currently logged in to this session is shown in the upper right.

15

Apply a License

– The product comes with a 250-node Instant-On license. You don't need to do anything to use this license, but once you hit 250 nodes, no further nodes will be discovered or monitored.

– You can also obtain a temporary license from HP for initial

trial.

– You can apply the license via a GUI using

nnmlicense.ovpl NNM -g

– Or via the command line using

nnmlicense.ovpl NNM -f ./mylicense.key

16

Configure Communication

– Go to the Configuration Workspace and select Communication Configuration

17

– Select the Default Community Strings tab and click on the New icon. Enter all of your SNMP Read Community Strings here.

– Order does not matter. NNMi tries all community strings simultaneously and uses the first one that succeeds.

– You can also modify the default timeout and retry attempts here. After making changes, save and close the form.

– SNMP 'GetBulk' is enabled by default; unchecking it is recommended for end nodes

18

SNMP Management Address Discovery

– Leave “Enable SNMP Address Discovery” unchecked if you want to keep the same loopback address as the only address that will ever be tried for SNMP communication.

19

Discovery Configuration

– List based discovery (similar to loadhosts of legacy NNM)

• More control

• No surprises

• Requires high level of knowledge of nodes

• Only name or IP is required. No subnet masks required

• Static data

– Auto-discovery (we'll use this for our example)

• Always up to date

• Requires good “rules” to control breadth of discovery

• By default, NNMi discovers only switches and routers (this is easily expanded)

20

– Go to the Configuration Workspace and select Discovery Configuration.

21

– Select the Auto-Discovery Rules tab and click on the New icon.

22

– Fill out the Basics section and click on the New icon for the IP Address Ranges in this rule. The value for Ordering doesn't matter in this case since we'll only have one auto-discovery rule.

23

– Type in a range. A rule can contain multiple ranges.

– Choose the Range Type (Inclusive for our example)

– Save and close the forms.

– We now have a discovery rule for the 15.2.*.* subnet.
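
To make the meaning of a range such as 15.2.*.* concrete, here is a small illustrative sketch. It is not NNMi's matching code; it only assumes that "*" stands for any value in that octet, and the function name is hypothetical.

```python
def matches_range(ip, pattern):
    """True if dotted-quad `ip` matches `pattern`, where '*' matches any octet."""
    ip_octets = ip.split(".")
    pattern_octets = pattern.split(".")
    if len(ip_octets) != 4 or len(pattern_octets) != 4:
        return False
    # Every octet must either be wildcarded or be textually equal.
    return all(p == "*" or p == o for o, p in zip(ip_octets, pattern_octets))


for address in ("15.2.10.7", "15.3.10.7"):
    print(address, matches_range(address, "15.2.*.*"))
```

With an Inclusive range type, only addresses for which the check is true would fall under the rule.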

24

– Since we didn't enable “ping sweep”, we must provide a seed router to get the discovery started.

– Add the name or IP address of a router in this subnet to begin the discovery

25

– NNMi uses the following sources of "hints":

• ARP cache

• DNS

• Ping Sweep if configured

• BGP - Border Gateway Protocol

• CDP - Cisco Discovery Protocol

• EIGRP - Cisco Enhanced Interior Gateway Routing Protocol

• ENDP - Enterasys Discovery Protocol (also known as CDP - Cabletron Discovery Protocol)

• FDP - Foundry Discovery Protocol

• OSPF - Open Shortest Path First

26

– You'll now begin to see nodes get discovered. You can view the list of discovered nodes in many places in the GUI. Try the Network Overview under the Topology Maps workspace.

– Note that this map usually shows only a subset of the entire set of nodes.

– You can also see discovery progress using the following graphs:

• Tools -> Self-Monitoring Graphs -> Discovery Progress

• Tools -> Status Distribution Graphs -> Node Status

27

Manual Seed Discovery

– Recommended if you know your network well

– Create a seed file

• List of management IPs for all the nodes to be managed

• One IP per line; comments start with '#'

• Example: <IP address> # <DNS name>

• The IP goes into the 'Hostname/IP' field and the DNS name goes into the 'Notes' section

• 'Notes' can later be filtered by hostname for troubleshooting

– nnmloadseeds.ovpl -f <seedfile>
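
If the node list already lives in an inventory, the seed file can be generated rather than typed. The sketch below only produces text in the "<IP address> # <DNS name>" form described above; the addresses and names are placeholders, and the function name is hypothetical.

```python
def format_seed_lines(nodes):
    """Build seed-file lines in the '<IP address> # <DNS name>' form."""
    return [f"{ip} # {dns_name}" for ip, dns_name in nodes]


# Placeholder inventory: (management IP, DNS name) pairs.
nodes = [
    ("192.0.2.1", "core-rtr1.example.com"),
    ("192.0.2.2", "core-rtr2.example.com"),
]
seed_text = "\n".join(format_seed_lines(nodes)) + "\n"
print(seed_text, end="")
```

The resulting text can be saved to a file and passed to nnmloadseeds.ovpl with the -f option shown above.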

28

Manual Seed Discovery (cont.)

– Rediscovery interval is 24 hrs by default

– Load the nodes across 24 hrs so that discovery is spread across the cycle

• For example, for loading 3000 nodes manually using a seed file:

− Load 300 nodes at a time every 3 hrs overnight (evening to next day morning)

– This ensures that CPU spikes on the NNMi server do not occur daily at the same time
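
The staggered loading described above amounts to splitting the seed list into batches and assigning each batch a start offset. A minimal sketch, assuming the batch size and interval from the example; the node names are placeholders and the function is hypothetical:

```python
def plan_batches(seeds, batch_size, interval_hours):
    """Split `seeds` into batches and pair each with its start offset in hours."""
    plan = []
    for start in range(0, len(seeds), batch_size):
        offset_hours = (start // batch_size) * interval_hours
        plan.append((offset_hours, seeds[start:start + batch_size]))
    return plan


seeds = [f"node-{n}" for n in range(3000)]  # placeholder seed entries
plan = plan_batches(seeds, batch_size=300, interval_hours=3)
for offset_hours, batch in plan:
    print(f"+{offset_hours:2d}h: load {len(batch)} seeds")
```

With these numbers the plan contains ten batches, the last one starting 27 hours after the first, so a shorter interval or larger batches would be needed to fit everything inside a single 24-hour cycle.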

29

Interface Discovery Filter

– Now supports interface filters for discovery

– Based on the interface groups

– Interfaces matching this filter are not added during discovery of new nodes

– For existing nodes, the next scheduled or manual configuration poll removes the filtered interfaces from the topology

30

Monitoring Configuration

– Default Behavior

• NNMi monitors “connected” interfaces, where connected means the interface has a discovered connection to another interface in NNMi. Most access switch ports would not be considered connected if you don't discover end nodes. Instead, typically the uplink would be monitored.

• NNMi does ICMP monitoring of management addresses

• You may not need to make any additional changes

31

– Example of monitoring the uplink

32

Steps to Modify Monitoring

– The basic steps to modify the monitoring in NNMi include:

• Create a node group and/or interface group

• Associate a monitoring setting (polling policy) with the group

• Prioritize the monitoring setting (nodes and interfaces can match multiple groups)

– By default, “connected” interfaces and management addresses are monitored

33

– Suppose that we have some nodes with an IfAlias that begins with “tunnel to”.

– We have been instructed that these interfaces need to be monitored if their speed is also 9 Kbps.

– We'll need to create a filter to identify any interfaces that match these criteria.

– Then we'll apply a polling policy to these interfaces.
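
The filter logic in this example is a simple AND of two conditions. The sketch below expresses it outside NNMi for clarity; the `Interface` class, its field names, and the sample data are hypothetical, and it assumes the stated speed means 9,000 bits per second.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    name: str
    if_alias: str
    if_speed_bps: int


def needs_monitoring(iface, alias_prefix="tunnel to", speed_bps=9_000):
    """True when the alias starts with the prefix AND the speed matches."""
    return iface.if_alias.startswith(alias_prefix) and iface.if_speed_bps == speed_bps


# Hypothetical inventory used only to exercise the predicate.
interfaces = [
    Interface("Tu0", "tunnel to NYC", 9_000),
    Interface("Tu1", "tunnel to LAX", 1_544_000),
    Interface("Gi0/1", "uplink", 9_000),
]
selected = [iface.name for iface in interfaces if needs_monitoring(iface)]
print(selected)
```

Only the interface that satisfies both conditions is selected, which is the behavior the Interface Group filter needs to reproduce.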

34

Creating an Interface Group

– Under the Configuration Workspace, select Interface Groups

35

– Click on the New icon

36

– Name the new Interface Group

– Create the filter expression using the logical operators

– Save the Interface Group

– Test the membership with Actions -> Show Members

37

New Filter Operators

– “Not” operator

– “Exists” and “Not Exists” allow filters to be created with expected results when combining multiple custom attributes or capabilities

• (hostname like router*.hp.com OR EXISTS((customAttrName=edge AND customAttrValue=true)))

38

– Results of the membership test

– Watch out for any “stale” filters on this view that might be inadvertently

applied

39

Apply a Polling Policy to the Interface Group

– In order to poll the interfaces defined by this filter, we must apply a

polling policy to this group.

– Open the Monitoring Configuration view

40

– Since we defined an Interface Filter, select the Interface Settings tab

– Note the current “ordering” values

– Click New

41

– Select the Interface Group and enter in an Ordering value

– We want it to be “higher” than the other policies (lower number)

– Extend the polling beyond connected interfaces

– Save and Close the form
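
Because an interface can belong to several groups, the Ordering value decides which setting wins. The following sketch illustrates the "lowest number wins" rule described above; the group names, ordering numbers, and function are hypothetical and do not reflect NNMi internals.

```python
def select_policy(policies, groups_of_object):
    """Return the matching policy with the lowest ordering number, or None."""
    matching = [p for p in policies if p["group"] in groups_of_object]
    return min(matching, key=lambda p: p["ordering"], default=None)


# Hypothetical interface settings: lower ordering means higher priority.
policies = [
    {"group": "Point to Point Interfaces", "ordering": 300},
    {"group": "Tunnel Interfaces", "ordering": 50},
]
chosen = select_policy(policies, {"Point to Point Interfaces", "Tunnel Interfaces"})
print(chosen["group"])
```

This is why the new setting is given a lower Ordering number than the existing ones: it must be evaluated before them for interfaces that match more than one group.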

42

Testing the Polling Policy

– Identify the selected interfaces (We'll select Inventory->Interfaces and choose our filter in the pull-down menu.)

– Open one of the interfaces

– Select Actions -> Monitoring Settings

43

– Validate the Interface Group policy that is applied

– Validate that the interface is being polled

44

Making Exceptions to Polling

– Most polled objects can be Unmanaged or set to

Out of Service

45

Custom Poller

– Can configure polling to monitor MIB information not monitored by default

– Enhanced in 9.0 to support MIB Expressions

– Setting Up Your MIB

• Step 1: Identify the MIB Variable You Want to Poll

• Step 2: Ensure the MIB Includes Supported Types

• Step 3: Load the Required MIB

• Step 4: Use an SNMP Query to View Current MIB Variable Values

– Setting Up a Custom Poll

• Step 1: Enable Custom Poller

• Step 2: Create a Custom Poller Collection

• Step 3: Create a Policy for a Custom Poller Collection

– Examine the Results of Your Custom Poll

• Step 1: View the Custom Node Collections Associated with Custom Poller Policies

• Step 2: View Details of a Custom Node Collection

• Step 3: View Details of a Polled Instance

• Step 4: Evaluate the Results of the Custom Poll

46

MIB Browser

– Do an SNMP query to get current values

47

Enable Custom Poller

– Click on “Enable Custom Poller” to enable

48

Create a Custom Poller Collection

49

Create a MIB Expression

– See the animation in the online help for a demo on creating a MIB expression

50

Select a MIB Variable

51

Finish Creating the Custom Poller Collection

– Can configure thresholds

– Can affect node status and generate an incident

– Can export custom poller data to Metrics SPI for reporting or to CSV files

52

Create a Custom Poller Policy

– Specify the Custom Poller Collection we just created, the node group, any filter, and the polling interval

53

Examine the Results of the Custom Poll

– Custom Polled Instances are also available on the Node form

54

Incident Configuration

– With NNMi, you can change various aspects of an incident. Some examples include enabling an incident, formatting a message, enabling de-duplication and enabling rate correlation.

– Newly available features are Suppression, Dampening, and Enrichment

• Each of these can be applied as a global setting or per node group or interface group

– Any of these can be applied based on payload filtering of CIA name/value pairs

• CIA = Custom Incident Attribute (or varbinds)

– Suppose we wish to enhance the Interface Down incident to include the Interface Alias in the message.

– Open the Incident Configuration view

55

– Choose the Management Event Configuration tab and open the Interface Down incident

– Filters can be created on the basis of any field, and multiple filters can be selected at once

56

– We add the argument $ifAlias to our message

– See “Valid Parameters for Configuring Incident Messages” in the help

57

– Now new incidents that arrive in the browser will have the new message format

– If there is no Alias defined, it is shown as null

– Filter on 'Last Occurrence time' (new filter capability)

58

Understanding Incident Configuration

– When a trap arrives, NNMi checks whether it matches the Interface Settings. If yes, it applies the Suppression, Enrichment, Dampening, and Actions specific to the Interface Group.

– If no, NNMi checks whether it matches the Node Settings. If yes, it applies the Suppression, Enrichment, Dampening, and Actions specific to the Node Group.

– If no, NNMi applies the default Suppression, Enrichment, Dampening, and Actions.
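
The decision flow on this slide (interface settings first, then node settings, then the defaults) can be sketched as a short lookup. This is an illustration of the precedence only, not NNMi code; the group names, setting labels, and function are hypothetical.

```python
def resolve_settings(source_iface_groups, source_node_groups,
                     iface_settings, node_settings, default_settings):
    """Walk the decision flow: interface group, then node group, then default."""
    for group, settings in iface_settings.items():
        if group in source_iface_groups:
            return settings
    for group, settings in node_settings.items():
        if group in source_node_groups:
            return settings
    return default_settings


# Hypothetical configuration: each value stands for a bundle of
# suppression, enrichment, dampening, and action settings.
iface_settings = {"Tunnel Interfaces": "interface-group settings"}
node_settings = {"Routers": "node-group settings"}

print(resolve_settings({"Tunnel Interfaces"}, {"Routers"},
                       iface_settings, node_settings, "default settings"))
print(resolve_settings(set(), {"Routers"},
                       iface_settings, node_settings, "default settings"))
print(resolve_settings(set(), set(),
                       iface_settings, node_settings, "default settings"))
```

The three calls exercise the three exits of the flow: an interface-group match, a node-group match, and the fall-through to the default settings.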

59

Enrichment

– Enrichment allows you to process an incident differently based on node or interface group membership

60

Enrichment (cont.)

– For example, you can set the Priority to “Top” and assign the incident to a particular operator

61

Suppression

– Suppression allows you to discard an incident

– For example, you can suppress a trap if the payload indicates the status is normal or warning

62

Dampening

– Dampening defines a hold period before an incident is released

– If the incident is Closed during this period, it is never released

63

Dampening (cont.)

– Out of the box, APA root cause (down) management events are dampened for a period of 6 minutes.

• Dampening can be configured to a maximum of one hour.

– If you wish to go back to the original undampened behavior, you can uncheck the Enable box on each individual incident. Alternatively, there is a command line tool called nnmsetdampenedinterval.ovpl that can be used to set the dampening period for all incidents. You can disable dampening “across the board” with the command:

• nnmsetdampenedinterval.ovpl -hours 0 -minutes 0 -seconds 0
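
The effect of dampening can be reasoned about with a small model. The sketch below is an assumption-laden illustration, not NNMi's implementation: it treats an incident as released unless it is closed before the hold period ends, and it treats a zero interval as "dampening disabled". How NNMi handles an incident closed at exactly the end of the period is not stated in the slides.

```python
from datetime import timedelta

DAMPEN_INTERVAL = timedelta(minutes=6)  # out-of-the-box value quoted above


def is_released(closed_after, interval=DAMPEN_INTERVAL):
    """Decide whether a dampened incident is ever released.

    `closed_after` is how long after creation the incident was closed,
    or None if it is still open when the hold period ends.
    """
    if interval == timedelta(0):
        return True  # dampening disabled: nothing is held back
    # Boundary handling (closed exactly at the interval) is an assumption.
    return closed_after is None or closed_after >= interval


print(is_released(timedelta(minutes=2)))   # interface bounced briefly
print(is_released(None))                   # still down after the hold period
```

A brief outage that clears within the hold period never reaches the operator, which is the point of the feature; setting the interval to zero restores the undampened behavior.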

64

Custom Correlation

• Create new incident correlations of multiple existing child incidents under an existing incident or into a new parent incident

• Incident filtering and correlation logic can be based on custom attributes or attribute values of the source node or source object.

• NNMi provides a sample Custom Correlation for correlating sub-interface down incidents under the main interface down incident (for Cisco devices)

• Another possible scenario: forward a 'Connection Down' event to the NBI only when both the Primary and Secondary links go down between two sites

65

Trap Configuration

– Traps must be defined by a MIB

– Load MIBs into NNMi using the nnmincidentcfg.ovpl command

• Use the -loadMib or -loadTraps option depending on requirements

– Can alternately use the Tools->Load MIB… GUI to load incidents from traps

# nnmincidentcfg.ovpl -u admin -p adminpw -loadMib ./ruggedcom.mib

Mib file loaded: /var/tmp/mibs/./ruggedcom.mib.

# nnmincidentcfg.ovpl -u admin -p adminpw -loadMib ./rcsysinfo.mib

Mib file loaded: /var/tmp/mibs/./rcsysinfo.mib.

# nnmincidentcfg.ovpl -u admin -p adminpw -loadTraps ./ruggedcomtraps.mib

Mib file loaded: /var/tmp/mibs/./ruggedcomtraps.mib.

Number of traps: 4.

The following traps were added to incident configuration:

cfgChangeTrap - .1.3.6.1.4.1.15004.5.4

swUpgradeTrap - .1.3.6.1.4.1.15004.5.3

powerSupplyTrap - .1.3.6.1.4.1.15004.5.2

genericTrap - .1.3.6.1.4.1.15004.5.1

66

– 'Discard Unresolved traps': a checkbox to discard traps from devices that are not loaded into the NNMi topology

– Block unwanted traps using nnmtrapd.conf and nnmtrapconfig.ovpl

• See reference pages for more details

67

Action Configuration

– You can add automatic actions to incidents.

– Usually done on Management Events rather than SNMP Traps because it's hard to predict the rate and volume of traps.

– NNMi automatic actions can be executable commands, command line scripts, or Python scripts.

– In NNMi, actions are based on Lifecycle State change for incidents.

– Suppose you want to take an action when an interface goes down and another action when the interface comes back up again.

• Both actions should be placed on the InterfaceDown incident

• One should be associated with the Lifecycle State of “Registered” and the other should be associated with the Lifecycle State of “Closed”

• There will not be an associated “Up” incident.

68

– Suppose we have a script called writelog.ovpl that we want to run when a Node Down incident arrives

– As root, copy the writelog.ovpl script into the actions directory

– Windows:

\Documents and Settings\All Users\Application Data\HP\HP BTO Software\shared\nnm\actions

– UNIX:

/var/opt/OV/shared/nnm/actions

– Confirm that the command is executable

– There is now a separate process called 'nnmactions' in 9.0 that runs actions

• ovstatus -c shows the status of this process

69

– Open the Management Event Configuration tab from within the

Incident Configuration Form

– Open the Node Down incident

70

– Select the Action Configuration tab and click on the New button

71

– Select the appropriate Lifecycle state (Registered in our example), Command Type (ScriptOrExecutable in our example) and the name of the command (specify the full path).

– Define a payload filter on the basis of CIA names and values

• Execute the action only if the source node name is like: *NYC*.usa.hp.com

– Save and Close the form

72

– Last, we must enable the action by checking the Enable box.

– Save and Close the form

73

– Now we should test the action. The easiest way to do this is to look for a previous occurrence of this incident and modify the lifecycle state

74

– We can practice running this action by setting the Lifecycle State back to Registered. This will cause our action to execute after you save this form (thus saving the Lifecycle State change).

– After saving it, we verify that our action ran as expected. We can look at the log file that this sample action script writes to. We should then set the Lifecycle State back to Closed and save the incident to return it to its original state.
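
The slides do not show the contents of writelog.ovpl. For readers who want something to test with, the following is a hypothetical stand-in: a standalone script of the ScriptOrExecutable kind that appends one timestamped line to a log file. It assumes a Python 3 interpreter is available on the server; the log location and message are placeholders, and this is not NNMi's own Python action type.

```python
import datetime
import os
import tempfile


def write_log(log_path, message):
    """Append one timestamped line to the log file."""
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{stamp} {message}\n")


# Placeholder location: a fresh temporary directory keeps the demo isolated.
log_path = os.path.join(tempfile.mkdtemp(), "nnmi_action_demo.log")
write_log(log_path, "Node Down action ran")
print("wrote", log_path)
```

After toggling the Lifecycle State as described above, a new line in the log file confirms that the action fired.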

75

GUI Configuration - Node Group Map Configuration (aka containers)

– Container maps can be created that will show nodes that are contained in a Node Group

– Let's suppose that we wish to create some logical containers for a few different subnets and also nodes based on names.

• Subnet A = Management Address of 192.25.*.*

• Data Center = nodes that have a system name beginning with “data_center”

– Let's suppose we wish to create the hierarchy of groups:

My Network My Important Subnets Subnet A

Data Center

76

– Easiest to work from the leaf groups first

– We create a Node Group for Subnet A

– Test it as previously shown

77

– Now create a node group for the Data Center

78

– Next we must create a Node Group Map for each Node Group in the hierarchy

79

– Save the Layout on each Node Group Map

80

– For “branch” node group maps, no filter is necessary. Instead we only need to specify the hierarchy by selecting the Child Node Groups.

– Again, we must create the map for the group.

– Optionally, the child nodes can be expanded in the parent Node Group's map
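
The difference between leaf groups (defined by a filter) and branch groups (defined only by their children) can be modeled in a few lines. This sketch is illustrative only: the node data is invented, the placement of "Data Center" directly under "My Network" is an assumption about the hierarchy, and none of the names are NNMi APIs.

```python
# Leaf groups carry a filter; branch groups only list their child groups.
LEAF_FILTERS = {
    "Subnet A": lambda node: node["mgmt_address"].startswith("192.25."),
    "Data Center": lambda node: node["sys_name"].startswith("data_center"),
}
CHILDREN = {
    "My Network": ["My Important Subnets", "Data Center"],  # assumed layout
    "My Important Subnets": ["Subnet A"],
}


def members(group, nodes):
    """Nodes in a leaf group, or the union of all child groups for a branch."""
    if group in LEAF_FILTERS:
        return [n["sys_name"] for n in nodes if LEAF_FILTERS[group](n)]
    found = []
    for child in CHILDREN.get(group, []):
        for name in members(child, nodes):
            if name not in found:
                found.append(name)
    return found


# Invented nodes used only to exercise the model.
nodes = [
    {"sys_name": "data_center_sw1", "mgmt_address": "10.1.1.1"},
    {"sys_name": "edge-rtr", "mgmt_address": "192.25.4.9"},
    {"sys_name": "lab-sw", "mgmt_address": "192.250.1.1"},
]
print(members("My Network", nodes))
```

Working from the leaves upward, as the slides recommend, mirrors this structure: each branch becomes meaningful only once its child groups exist.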

81

– We now have a map hierarchy that we can drill down into and back out of. In our example, you can open the Node Group Map for the node group “My Network”.

82

– From this map, we can drill down (double click) and back with the

arrows

83

– Background graphics can be easily added

– Change the connectivity type as well – Layer2/Layer3

– Node group connectivity

• Site to Site connections based on End-points

− Show only Gigabit connectivity between sites based on 'Interface Groups'

84

– Sign out and sign back in to see the changes in the Node Group Map settings

– We can also change the status propagation algorithms

85

– Select the initial UI launch

– 'Last Node group' and 'First node group' in the list

• Decided by the 'Topology map ordering' field

86

SNMP Line Graph Configuration

– Can create custom SNMP Line Graphs based on MIB expressions

87

SNMP Line Graph Configuration

– Can configure custom SNMP Line Graphs as menu items

88

Maintenance

– Backup and Restore

• Full backup

− nnmbackup.ovpl / nnmrestore.ovpl

• Embedded database backup

− nnmbackupembdb.ovpl / nnmrestoreembdb.ovpl

– Configuration Export and Import

• Allows for pinpoint configuration snapshots

• Make a snapshot before making any config change

− nnmconfigexport.ovpl / nnmconfigimport.ovpl

89

Maintenance of traps

– Need to regularly clean traps from the NNMi database (not the trap store)

nnmtrimincidents.ovpl -u system -p mypassword -age 1 -incr weeks -origin SnmpTrap -trimOnly -quiet

[Diagram: trap flow. Labels: incoming traps, NNMi Trap Service, Trap Service filtering, (nnmtrapconfig.ovpl), Trap Store, nnmtrapdump.ovpl, Incident Configuration, NNMi Database, NNMi User Interface]

90

Health Checks

– Run ovstatus -v to make sure the jboss processes are running well

– Launch the Help->System Information menu item for a listing of some important data points like Database objects, NNMi system health, Statepoller detailed health

91

NNMi Self-Monitoring

– NNMi monitors itself and generates an incident when its health is not Normal

92

NNMi Status and Health Reports

– Tools -> NNMi Status and Status Health reports show details on NNMi processes and overall status

93

Status Distribution Graphs

– Tools -> Status Distribution Graphs

• Node, Interface, and IP Address status graphs

94

Self-monitoring Graphs

– Tools -> NNMi Self-Monitoring

• Discovery progress, trap rate graphs, SNMP request stats, etc.

95

– MIBs can now be loaded from the UI

– Open the MIB Browser UI for more details on MIB tree

96

Audit the Node for Supported MIBs

97

Possible Usage Scenarios

– Management by Exception

98

– Layer 2 map showing outage

99

– Investigate conclusions, history of status, incidents

– Run actions like ping, trace route, etc.

100

Map-based Management

101

List-based Management

102

Miscellaneous Tips

– Use the embedded database even for large scale.

– Use caution with SNMP timeout configuration. This timeout value is incremented with each timeout and can grow quickly beyond your original intention. Keep your ping timeout and your SNMP timeout fairly equal in time.

– Use the Conclusions tab in the Node Form to understand why the current status is set for the node.

– Reduce the number of connections between node groups via the End Points Filter in the Node Group Map settings form.

– Do not use an “@” in your SNMP community strings. This is a reserved character for Cisco devices and causes NNMi grief if used.
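
The SNMP timeout tip above is easier to appreciate with numbers. The sketch below assumes that "incremented with each timeout" means each attempt waits one configured timeout longer than the previous one (linear growth); that reading is an assumption, and the function is illustrative rather than a description of NNMi internals.

```python
def total_wait_seconds(timeout_s, retries):
    """Total wait if each attempt waits `timeout_s` longer than the previous one."""
    # One initial attempt plus `retries` retries: waits of 1x, 2x, 3x, ...
    waits = [timeout_s * attempt for attempt in range(1, retries + 2)]
    return sum(waits)


for timeout_s, retries in [(5, 2), (5, 4), (10, 4)]:
    print(f"timeout={timeout_s}s retries={retries}: "
          f"{total_wait_seconds(timeout_s, retries)}s before giving up")
```

Under this assumption a 5-second timeout with 2 retries already means 30 seconds per unresponsive node, which is why modest timeout and retry values are advisable.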

103

Outcomes

– NNMi 9 is quick and easy to deploy

– More advanced customization is available if needed

104

Q&A

105

To learn more on this topic, and to connect with your peers after

the conference, visit the HP Software Solutions Community:

www.hp.com/go/swcommunity

106