Best Practices for Deployment Planning


The intent of this paper is for readers to become familiar with the IBM® Tivoli® Application Dependency and Discovery Manager (TADDM) product. This paper describes the value that TADDM brings to IT organizations, the major components of TADDM installations, and how these components work together. This paper also covers large-scale TADDM deployment scenarios with the following emphasis:

Planning large-scale TADDM deployments

Hardware requirements for each TADDM component

Maintenance concerns in large-scale TADDM deployments

1 How TADDM Works

o 1.1 Architecture

o 1.2 TADDM Server (Domain Manager)

o 1.3 Database Server

o 1.4 Sensors

o 1.5 Anchor server

o 1.6 Windows gateway server

o 1.7 Enterprise Domain Manager Server (eCMDB)

o 1.8 Discovery Targets

2 Large-scale Deployment Planning

o 2.1 TADDM Server and Domain Manager

o 2.2 Database server

o 2.3 Anchor Server

o 2.4 Windows gateway server

o 2.5 The eCMDB Server

o 2.6 Discovery Targets

3 Large-scale Deployment Example


TADDM provides an unparalleled level of visibility into how the information technology infrastructure actually delivers the business applications, showing the interdependence of the software and physical components. It assists in evaluating the impact of a Request for Change and captures the changes that happen in the environment. TADDM also provides the capability of comparing components of similar types and determining whether they comply with a golden master configuration.

The key benefits of TADDM:

TADDM provides complete and detailed application topology maps of business applications and their supporting infrastructure, including cross-tier dependencies, runtime configuration values, and complete change history. By leveraging the automated maintenance of these application topology maps and the ability to easily integrate this data with other enterprise information, IT organizations can accomplish the following things:

Ensure cost-effective and successful implementation of their Information Technology Infrastructure Library (ITIL) and Business Service Management (BSM) initiatives

Dramatically lower the business risks of service failures and inconsistencies

Ensure compliance to technology and regulatory standards

Reduce time to problem resolution

The key differentiators of TADDM:

TADDM delivers the most complete and powerful application discovery solution:

Global scale, deep detail

o Fast, automated application discovery: TADDM scales to thousands of servers and works seamlessly across multiple domains without adverse impact on load, bandwidth or security. TADDM uses many Discovery sensors to enable ready to use discovery of virtually all components found in the typical data center, across the application software, host, storage, and network tiers. Discovery sensors are extensible, reside on the TADDM server, and collect configuration attributes and dependencies.

o Deep configuration detail: TADDM provides visibility into the necessary details that are required to plan and manage application configuration change, critical changes that impact service delivery. These include deployed software objects like EJBs and .NET assemblies, application and operating systems patches, logical and physical network and storage settings, and their change history.


Enterprise intelligence

o Using the TADDM Operations Portal, you can answer critical questions about your infrastructure and your business. Leveraging the knowledge of your application infrastructure, TADDM can access and integrate information from your financial, asset, and even your HR databases to answer questions about compliance, governance, and capital efficiency. This aggregated data can be gathered by multiple TADDM implementations, as well as vendor enterprise applications, spread across separate business units or geographies. The TADDM Operations Portal provides a seamless query and reporting framework so you can correlate and analyze the data, and deliver knowledge that can be shared across the enterprise.

Comprehensive and proven technology

o Rapid return on your investment: After installation, the TADDM agent-free discovery starts building your application topology, giving you results within hours. The agent-free approach eliminates the qualification, processor load, network bandwidth, and security costs associated with agent-based approaches. The powerful topology and task-driven user interface means minimal training costs, rapid time-to-value, and fewer support issues. And with the prepackaged analytics and the TADDM Operations Portal, you can produce insightful, change-planning impact analysis and compliance reports.

o Enterprise class security: TADDM uses industry standard secure protocols such as SSH, WMI and SNMP in the discovery process, ensuring that sensitive data is both secure and accessible only by authorized access. TADDM can be configured to provide discovery across firewall zones, without compromising security or requiring changes in firewall policy. Data is stored in a secure database, and all user and API sessions are access controlled.

o Easy Integration: TADDM was built for easy integration with an open architecture, robust and open APIs, and a complete and easy to use Software Development Kit (SDK). Using the APIs and SDK you can rapidly deploy and share application maps across management products and teams and processes, so you can effectively align your IT infrastructure with business services and objectives.


How TADDM Works

The following sections describe how TADDM works.

Architecture

TADDM consists of several main architectural components, some of which are optional, all of which work together in a single TADDM deployment. This section explains how each component works, so that you can successfully plan, deploy, and maintain a TADDM installation.

The following diagram depicts a small-scale installation of TADDM:


This simple TADDM installation consists of the following components:

Microsoft® Windows® based TADDM Server

A desktop system for running the TADDM Graphical User Interface (GUI) console. The GUI can be run on the TADDM Server in small environments.

DB2® or Oracle database server. The database can be co-located on the TADDM Server in small environments.

One or more discovery targets (some of which are Microsoft Windows based).

The TADDM Server is configured as a dual TADDM Server and Windows gateway in this example.

The TADDM Server includes an embedded local anchor server, for running the discovery sensors.

The following figure is an example of a large-scale TADDM installation, using all of the TADDM Server components in a variety of ways.

The additional TADDM components are used to provide the following functions:


Increased performance and scalability

Access across firewalls

Discovery of Microsoft Windows based discovery targets

In this example, multiple TADDM Servers (Domain Managers) are used for increased scalability, as part of a large TADDM Enterprise CMDB (eCMDB) deployment. TADDM anchor servers are deployed into each firewall zone to minimize firewall changes, and Windows gateway systems are deployed into each firewall zone where Microsoft Windows based discovery targets are located.

TADDM Windows gateway systems are required because TADDM uses the Windows Management Instrumentation (WMI) protocol for discovering Microsoft Windows based computers.

TADDM Server (Domain Manager)

The TADDM Server is the cornerstone of all TADDM installations. The TADDM Server provides the following functions:

Graphical User Interface (GUI) server for administration and reporting.

Reporting and query interfaces, made up of Command Line Interfaces (CLI) and Application Program Interfaces (API).

Bulk-Load interface for data from Discovery Library Adapters (DLA).

Native discovery of the IT enterprise environment using built-in TADDM sensors.

Local anchor: All TADDM Servers and Domain Managers contain their own built-in anchor server, called a local anchor, for running sensors.

Data management and reconciliation, which manages configuration items (CIs) that are discovered or loaded from different data sources and feeds.

Naming Convention: The TADDM Server is called a Domain Manager or Domain Server in an Enterprise CMDB (eCMDB) TADDM installation.

During a discovery, the TADDM Server starts a local anchor server, which starts the sensors. TADDM sensors are small components specifically targeted at discovering the configuration items (CIs) from a resource. For example, the TADDM DB2 sensor can extract CIs from IBM DB2 database servers. There are many different sensors included with TADDM to enable discovery of virtually every hardware and software resource found in data centers.


Discoveries are started manually by a user, or automatically based on a schedule, and are specified by one or more discovery scopes and a single discovery profile. A discovery scope is a definition that lists TCP/IP addresses (or host names), TCP/IP address ranges, or TCP/IP subnets that a discovery should be executed on. For address ranges and subnets, exclusion lists can be defined to exclude systems. For example, one could define a single scope to discover all systems in a subnet, except for systems A and B.

Tip: An exclusion list in one scope applies to all scopes in a discovery, even if the target is explicitly included in a different scope. A side effect of this behavior is that you can define a PERMANENT_EXCLUDE scope to ensure that certain targets are never scanned, as long as the PERMANENT_EXCLUDE scope is part of the discovery run.
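The scope and global-exclusion behavior described above can be sketched in a few lines of Python. This is a simplified illustration, not TADDM's actual implementation; the scope names and addresses are hypothetical.

```python
import ipaddress

class Scope:
    """A simplified discovery scope: included networks plus an exclusion list."""
    def __init__(self, name, includes, excludes=()):
        self.name = name
        self.includes = [ipaddress.ip_network(n) for n in includes]
        self.excludes = {ipaddress.ip_address(a) for a in excludes}

def targets_for_discovery(scopes):
    """Collect targets from all scopes; an exclusion in ANY scope wins globally."""
    all_excluded = set()
    for scope in scopes:
        all_excluded |= scope.excludes
    targets = set()
    for scope in scopes:
        for net in scope.includes:
            targets |= set(net.hosts())
    return {t for t in targets if t not in all_excluded}

# A PERMANENT_EXCLUDE scope with no includes keeps systems A and B out of
# every run, as long as it is part of the discovery.
prod = Scope("prod", ["192.0.2.0/29"])
permanent = Scope("PERMANENT_EXCLUDE", [], ["192.0.2.1", "192.0.2.2"])
targets = targets_for_discovery([prod, permanent])
```

Running the sketch leaves four of the six hosts in the /29 subnet as targets, with the two permanently excluded systems dropped even though the prod scope includes them.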

It is typical to have multiple discovery scope definitions in a single TADDM deployment to take the following actions:

Partition a full discovery over multiple discovery windows: For example, to do a full discovery each week, you create multiple discovery scopes so the job is carried out during off hours over 5 days (1/5th of the data center is discovered at a time).

Perform different types of discoveries on different systems: For example, all systems can be discovered using a Level 1 profile, a small subset can be discovered using a Level 2 profile, and an even smaller subset can be discovered using a Level 3 profile. Discovery profiles are described below.

Speed up discoveries by dedicating scopes to targeted anchors and gateways: For example, you can assign anchors and gateway servers to dedicated scopes, so time is not wasted waiting for an anchor or gateway to establish a connection that is not possible because of firewalls.

Provide finer control of Access Lists (credentials) for performance: TADDM tries all users (in order) defined in the Access List to establish connections to remote systems, which can slow discoveries as TADDM searches for a suitable credential. To solve this problem, you can define separate Access Lists on each discovery scope, supplying only valid credentials for the systems located in the scope.
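The first use case above, partitioning a full weekly discovery across five off-hours windows, amounts to splitting the target list into equal scopes. A minimal sketch, with hypothetical host names:

```python
def partition_scopes(targets, windows=5):
    """Split a full discovery into N scopes, one per off-hours window
    (round-robin assignment keeps the scopes evenly sized)."""
    scopes = [[] for _ in range(windows)]
    for i, target in enumerate(targets):
        scopes[i % windows].append(target)
    return scopes

hosts = [f"10.0.0.{i}" for i in range(1, 101)]  # 100 hypothetical targets
weekly = partition_scopes(hosts)                # 5 scopes of 20 hosts each
```

Each of the five resulting lists would be defined as its own discovery scope and run on a different night of the week.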

A discovery profile specifies the sensors that are enabled during a discovery. There are three fixed predefined discovery profiles, which a TADDM user is free to copy and edit to create user-defined profiles.

Profile: Level 1

Discovers IP based systems and routers without any credentials. No attempts to log on to discovered targets are made, therefore Level 1 discoveries are fast.

Profile: Level 2

Includes Level 1 and also attempts to establish SSH, SNMP, or WMI sessions with discovered systems to gather more detailed information. For example, additional information can be gathered about the processor, disk drives, installed applications, OS patch levels, and so on in a Level 2 based discovery. Because TADDM requires a session with the remote system, OS credentials must be configured for TADDM to successfully log into the remote discovery targets. Additionally, Level 2 discovery targets need to be enabled for discovery (see the Discovery Targets section).

Profile: Level 3

Includes Level 2 and also attempts to establish sessions with software applications to gather detailed information about them. An example of a sensor that is enabled in the Level 3 discovery profile is the WebSphere Application Server (WAS) sensor, which gathers the WAS cluster, JVM settings, installed modules, and so on. Because TADDM requires a session with the remote software application, application credentials must be configured for TADDM to successfully log into the remote software application. In the WAS example, a WebSphere user ID and password are required.

As a discovery progresses, the TADDM Server stores the discovered CIs into its own database (databases are not shared, even in eCMDB environments). After all sensors have completed, the TADDM Server post-processes the newly discovered CIs, along with the CIs that are already in its database. During post-processing, the following actions are performed by the TADDM Server:

Topology Builder: Invoked to compute additional implied connections between discovered objects to complete the application topology. The Topology Builder also performs additional data processing to reconcile data obtained from different data sources.

View Manager: Builds the in-memory data structures required for the GUI to render the topology efficiently.

Change Manager: Generates Change Events and updates the change history records.

State Manager: Builds the in-memory topology cache used for propagating changes in the topology graph.

Database Server

The TADDM database server hosts the databases that are needed in a TADDM deployment. Currently only IBM DB2 and Oracle database servers are supported. Each TADDM Server, Domain Manager, and eCMDB Server has its own separate and dedicated database.

Most TADDM customers choose to deploy all TADDM databases onto a single centrally managed database server, to reduce hardware and maintenance costs.

Important: Tivoli support highly recommends that a separate dedicated physical system be used to host the database (or multiple databases, if deploying multiple TADDM domains) found in a TADDM deployment.


During the initial startup of a TADDM Server, Domain Manager, or eCMDB server, the database schema (tables, indexes, and so on) is created automatically. Because of this, it takes longer to start these components the first time, usually an extra 20 minutes, as the schema is created.

Database activity mostly occurs during discoveries, in response to GUI users, and during report generation, as all TADDM Configuration Items are stored in the database.

The databases are accessed through the Java™ Database Connectivity (JDBC) interface, usually by a Type-4 JDBC driver provided by the database vendor and located in Java Archive (JAR) files. For example, the IBM DB2 Type-4 JDBC driver is found in the db2jcc.jar file and is located in the following directory of a TADDM Server.

..../cmdb/dist/lib/jdbc/

The JDBC database connections are specified by a JDBC URL, database user name (TADDM requires two users), and database passwords. All JDBC related configuration is stored at the following location:

..../cmdb/dist/etc/collation.properties
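The JDBC settings stored in collation.properties look roughly like the following sketch. The property keys and values shown here are representative examples only; check the collation.properties file shipped with your TADDM version for the exact key names.

```properties
# Illustrative JDBC settings (keys and values are examples, not authoritative)
com.collation.db.url=jdbc:db2://dbserver.example.com:50000/cmdb
com.collation.db.driver=com.ibm.db2.jcc.DB2Driver
com.collation.db.user=taddmuser
com.collation.db.archive.user=archuser
```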

Pay special attention to the planning, tuning, and ongoing maintenance that your database servers require.

Important: It is highly recommended that the TADDM database server be located on the same TCP/IP subnet as the TADDM Server (Domain Manager) and that the network be capable of full duplex at 100Mbps.

Sensors

TADDM sensors are components used to discover the Configuration Items (CIs) from a resource. For example, the native DB2Sensor sensor can extract CIs from IBM DB2 database servers. There are many different sensors included with TADDM to enable discovery of virtually every hardware and software resource found in your data center.

Some TADDM sensors, in addition to discovering CIs from a resource, also enable other sensors (this is called seeding); for example, the DB2Sensor sensor cannot run unless DB2 processes have been detected by the GenericServerSensor sensor. In this case, the GenericServerSensor sensor enables other sensors by acting as a "seed" into downstream sensors. Likewise, the GenericServerSensor sensor is dependent on the SessionSensor sensor, which is responsible for establishing SSH connections to remote hosts.

These seeding relationships between sensors are depicted in the following sensor hierarchy figure.
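The seeding behavior described above, where one sensor's results enable downstream sensors, can be sketched as a small dependency walk. The sensor names come from the text; the `SEEDS` map and `run` function are a simplified illustration, not TADDM internals.

```python
# Partial seeding hierarchy, as described in the text above.
SEEDS = {
    "SessionSensor": ["GenericComputerSystemSensor", "GenericServerSensor"],
    "GenericServerSensor": ["DB2Sensor", "MQSensor"],
}

def run(sensor, results, enabled):
    """Run a sensor, then launch any downstream sensors it seeds.
    A sensor disabled in the discovery profile blocks everything below it."""
    if sensor not in enabled:
        return
    results.append(sensor)
    for child in SEEDS.get(sensor, []):
        run(child, results, enabled)

results = []
run("SessionSensor", results,
    enabled={"SessionSensor", "GenericServerSensor", "DB2Sensor"})
# Only the enabled chain runs: SessionSensor -> GenericServerSensor -> DB2Sensor
```

This also illustrates the troubleshooting point that follows: if DB2Sensor never starts, the cause may be a disabled or failed seeding sensor above it, such as GenericServerSensor.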


If during a discovery an expected sensor does not start, the possible cause of the problem is an issue with a seeding sensor above it in the hierarchy. Check the status of all sensors issued against the same discovery target for an explanation.

Important: All sensors are started in an anchor server, either a dedicated anchor server or one that is embedded in the TADDM Server, called a local anchor. Most sensors spawn commands or in someway interact with discovery targets to perform their discoveries. Due to the running of commands on the discovery targets, the impact of the sensor on the remote discovery target is not zero. For this reason, some customers choose to only perform discoveries during scheduled maintenance periods, to avoid impacting the business processes the servers are intended for.

The impact of running discoveries during maintenance periods is that applications might not be communicating, and therefore some relationships might not be detected.The following list shows the usual flow of the discovery against each remote host:


1. A TADDM Server initiates the PingSensor or IPRangeSensor sensor (this depends on how the discovery targets are defined in the discovery scope).

2. During the detection of a running system by one of the initial sensors, the PortSensor sensor is started. The job of the PortSensor is to determine if there is a way to remotely access the discovery target, such as with SSH, WMI, and so on.

3. After the PortSensor sensor scan, assuming an SSH or similar server was detected on the remote host, the SessionSensor sensor is started. The SessionSensor sensor is seeded by the results of the PortSensor sensor, primarily with the IP address of the remote host, and the open ports that are found (SSH, WMI, and so on). The SessionSensor sensor uses the applicable, user-defined credential lists to try to establish a connection. All sensors requiring a session to the remote host are started in an anchor server, so the SessionSensor sensor runs on an anchor server. Note that all TADDM Servers (Domain Managers) embed a local anchor server.

4. When successfully establishing a session to the remote host, the SessionSensor sensor seeds both the GenericComputerSystemSensor and GenericServerSensor sensors:

o The GenericComputerSystemSensor sensor determines which operating system is executing on the remote host (through the cached session created by the SessionSensor sensor), and launches a tailored computer system sensor, such as the AIXComputerSystemSensor sensor for AIX systems. The AIXComputerSystemSensor sensor retrieves operating-system-specific information from the resource, such as MAC addresses, file systems, and processor types and speeds for AIX systems. In the same way, the WindowsComputerSystem sensor does the same thing, but for Microsoft Windows based systems.

o The GenericServerSensor sensor discovers running software application server processes. On Microsoft Windows based systems, this is done through the Windows Management Interface (WMI) APIs. On UNIX® and Linux® systems, this is done through the open source List Open Files (lsof) command. If a running software application is discovered that TADDM has a built-in sensor for, then additional sensors are seeded and launched; for example, the DB2Sensor or MQSensor sensors. The TADDM built-in, and user-defined Custom Server definitions are also analyzed for matches and for further detailed discoveries of running processes.
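The session-routing decision in steps 2 and 3 above can be sketched as follows. The port numbers are the conventional defaults for SSH and the Windows RPC endpoint mapper, used purely for illustration; this is not TADDM's actual logic.

```python
def route_session(open_ports):
    """Choose a session type from the open ports found by the port scan
    (simplified sketch of the PortSensor -> SessionSensor handoff)."""
    if 22 in open_ports:
        return "SSH"   # UNIX/Linux target: the session runs on an anchor server
    if 135 in open_ports:
        return "WMI"   # Windows target: the session runs on a Windows gateway
    return None        # no session possible: only unauthenticated data available
```

For example, a host exposing only port 80 yields no session, so only Level 1 style network data would be collected for it.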

If a sensor is not enabled in the discovery profile, then it is not used during the discovery process. It might be beneficial to disable sensors that you do not want to collect CIs for. Those CIs will then not be part of your change and dependency processes, reducing the time spent in discovery and post-processing and the overhead of storing data that is not needed.

As sensors run in the anchor server to discover the IT environment, commands are run on the discovery targets over the session. The output of the commands (STDOUT) is returned back over the network for analysis by the sensor. Sensors usually issue many commands to populate all of the CIs for the resource. The CIs are processed at the TADDM Server (or Domain Manager in eCMDB environments), and are eventually stored into the Change and Configuration Management Database (CMDB) located at the database server.

During the discovery, the status of all currently running and completed sensors is displayed in the GUI. There are four possible statuses for a sensor:

Done: The sensor completed successfully.

In Progress: The sensor is still running (and can potentially start additional sensors).

Warning: The sensor completed successfully, but problems were detected.

Error: The sensor failed, and possibly no CIs were generated.

For sensors showing the Warning and Error status, the sensor log should be analyzed to determine the cause of the problem and to make a correction for the next discovery. If the logs are set to DEBUG level tracing (through the collation.properties file), then a detailed list of the actions of the sensor, commands issued, and so on are logged.

Sensors run in parallel, and the number of threads dedicated for sensors is controlled by the com.collation.discover.dwcount setting in the collation.properties configuration file.
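The thread-count property named above lives alongside the other discovery settings in collation.properties. The value below is an illustrative example, not a recommendation for your environment:

```properties
# Number of parallel discovery worker threads (example value only)
com.collation.discover.dwcount=16
```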

Each sensor, whether a seeding sensor or not, discovers and stores CIs. A discovery is complete when all sensors and seeded sensors can discover no additional CIs, either because of incorrectly configured discovery targets or access lists, or because the exhaustive search of the IT environment completed successfully. The status of the last ten completed discoveries is saved, and can be found in the History section of the TADDM console. There are a variety of built-in sensors and predefined custom server definitions available, and each sensor can have independent prerequisites. These are the typical prerequisites:

Software components, such as the lsof command on UNIX and Linux systems

Credentials for software, such as WebSphere Application Server user names and passwords to run the WASSensor sensor

User access to required commands; for example, the HMCSensor sensor requires the TADDM service account to have the hmcoperator task role


For a list of the prerequisites and troubleshooting of sensors, see the Best Practices for Using TADDM Sensors document found in the References section of this paper.

Anchor server

A TADDM anchor server is the component that runs sensors. Every TADDM Server and Domain Manager contains a local anchor server. You must deploy additional dedicated anchor servers into firewall zones to enable TADDM to discover systems that are located in those zones.

The purpose of deploying additional anchor servers is to minimize changes to the firewalls that separate a TADDM Server from a cluster of discovery targets. Instead of enabling a TADDM Server to communicate with all systems in a firewall zone, you only need to allow communication to an anchor server in the firewall zone. Because the anchor server is located in the firewall zone, it has unfettered network access to all the systems located in the zone, so that discoveries can be performed.

Anchor servers can be nested below other anchor servers when there are multiple firewall zones between discovery targets and TADDM Servers. An anchor server is specified in the GUI console. The user is responsible for opening the required ports in the firewall to allow communication with the anchor server.

During a discovery, the user-specified anchor servers are automatically installed. The installation of the anchor server only exists for the period of the discovery, after which the installation files are removed from the anchor server. When a TADDM Server, or higher level anchor server, establishes an SSH session with a designated anchor server, TADDM deploys a Java Virtual Machine Runtime Environment (JVM/JRE) and additional binary files, and starts the JVM that is hosting the anchor server. This installation package is approximately 100 MB in size and is deployed during each discovery.


After the anchor server is started, it takes on the responsibility of running the TADDM sensors against the discovery targets that are located in that firewall zone, according to the user-defined discovery scope definitions that the anchor is limited to.

Important: It is recommended that each anchor server use a "Limited to Selected Scope" setting to increase performance. The discovery targets in the scope should be reachable (not blocked by a firewall) by the anchor server. This prevents the anchor server from waiting for TCP/IP connections to timeout because of a firewall filter.

If the anchor server discovers a system that is also designated as an anchor server (a nested anchor server), then the anchor server deploys the installation package to the newly discovered system, and the process continues. In this case, the firewall separating the two anchor servers must be configured to allow communication between those two systems. Similarly, if an anchor server establishes an SSH session with a user-designated Windows gateway server, its installation package is deployed over the session, and the Windows gateway server is started.
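The nesting described above implies one firewall rule per parent-to-child hop: the TADDM Server must reach the first anchor, each anchor must reach the next, and so on. A minimal sketch of that rule count, with hypothetical host names and the standard SSH port:

```python
# Hypothetical hosts: each parent->child hop needs a rule in the firewall
# that separates the two zones (SSH is the deployment session protocol).
chain = ["taddm-server", "anchor-zone1", "anchor-zone2", "windows-gateway"]

def required_rules(chain, port=22):
    """One firewall rule per hop in a nested anchor/gateway chain
    (simplified model; real deployments may need additional ports)."""
    return [(parent, child, port) for parent, child in zip(chain, chain[1:])]

rules = required_rules(chain)  # three hops, three rules
```

The point of the model is that rules are only needed between adjacent hops; the TADDM Server never needs direct access to the innermost zone.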

Sometimes it can be beneficial to disable the local anchor server, found on all TADDM Servers and Domain Managers, to offload work from the TADDM Server if there are performance problems at the TADDM Server. This can be accomplished by configuring the local anchor to use a "Limited to Selected Scope" setting that only contains the list of remote user-designated anchor servers and Windows gateway servers. In the TADDM console, the local anchor server is the anchor with address set to root server.

Windows gateway server

Windows gateway servers are used to discover Microsoft Windows based systems. Because the discovery mechanism that is used to discover Windows based systems relies on Windows Management Instrumentation (WMI), the sensors must establish sessions from a computer running the Microsoft Windows operating system. This computer is called the Windows gateway server. In contrast, TADDM uses SSH for sessions to UNIX and Linux based discovery targets. A Windows gateway server is set up similarly to TADDM anchor servers:

The user designates which computer is used as a Windows gateway server in the TADDM GUI console.

During a discovery, an installation package containing the binary files that are needed to start a gateway is deployed automatically. This installation package is approximately 5 MB in size.

After the gateway is started, the responsibility for discovering Windows based computers is delegated to the gateway.

For a computer to be used as a Windows gateway server, there are two software prerequisites that must be installed first:


1. Microsoft .NET Framework

2. A compatible SSH server: BitVise WinSSHD server, or as an open-source alternative, Cygwin (with bundled OpenSSH sshd server)

Similar to an anchor server, a Windows gateway server must be designated in each firewall zone that contains Microsoft Windows based discovery targets.

Dual-use anchor and gateway server

It is possible to configure a single Windows based computer as a dual anchor server and gateway server. To do this, you must install the prerequisite software needed for setting up Windows gateway servers. After that is done, use the TADDM Console to configure the computer as both a gateway and an anchor (two entries are needed).

It is also possible to configure a single computer as both a TADDM Server (Domain Manager) and a gateway server, but for performance reasons, this is not recommended except in the smallest of TADDM installations.

Enterprise Domain Manager Server (eCMDB)

See the TADDM Capabilities and Best Practices publication for information on eCMDB servers.

IBM TADDM is designed to modularly scale to very large data centers. A single TADDM Server can support around 10 000 physical servers (dependent on the profiles used). Customers can also tune TADDM operating characteristics (that is, discovery engine thread counts, discovery sensor time-outs, and so on) as well as increase the TADDM Server or database server resources (processors, memory, and so on) to achieve increased support for infrastructure discovery and storage.

Although it is possible to scale TADDM Servers to support large enterprise environments, IBM provides a domain-based, best-practice deployment architecture to elegantly scale the solution to support several tens of thousands of infrastructure elements.

Most large enterprise environments are divided into management domains that represent the span of control of a given IT operations team. Domains can be based on organizational, functional, or geographic boundaries, combinations of these, or other criteria. To support the operational needs of the domain, IBM recommends a standalone TADDM Server (Domain Manager) per management domain. Each Domain Server is responsible for its own domain, discovering and storing all configuration data for its local domain. Users of each management domain use the local TADDM instance to manage the operational aspects of their domain, including the running of analytics such as change history, comparison and inventory reports.

However, IT organizations also need cross-domain views of their IT information. For example, your CIO might want to see an aggregate count of the enterprise wide Oracle deployments to ensure that the enterprise is compliant with its licensing contracts. To address this capability, IBM provides the TADDM Enterprise Domain Manager (eCMDB). The TADDM Enterprise Domain Manager integrates data (a process called federation) from the multiple local TADDM server instances to provide a rolled-up, single enterprise-wide repository of information. This federated architecture ensures that the data is not duplicated in multiple data stores; the Enterprise Domain Manager stores references to the appropriate data in the local TADDM Server and accesses the data on an as-needed basis.

The Enterprise Domain Manager provides a Web-based portal to administer the local domain servers and to view and analyze the rolled-up enterprise data. It also provides a customizable query and reporting interface, which allows data to be easily shared across the enterprise.

The following lists show what an eCMDB does and does not do:

Does maintain change history

Does provide bulkload and import capability

Does provide cross domain query capability

Does provide a common security framework across domains

Does provide the same API for access to data across domains

Does merge attributes when data is retrieved from the eCMDB database, with the eCMDB values taking precedence over attribute values currently on the domain (to allow for bulkloads)


Does not do any discovery or topology build

Does not allow a domain to belong to more than one eCMDB

Does not allow nesting of eCMDBs

Discovery Targets

Discovery targets are the IT resources that TADDM discovers, such as the physical computers, routers and switches, operating systems, databases, and other software packages that make up a data center. TADDM populates the CMDB database with the Configuration Items (CIs) found on the discovery targets.

TADDM accesses the discovery targets in a variety of ways, depending on the discovery profile being used (the sensors enabled), the access lists (credentials) specified, and the nature of the discovered targets (operating system, applications installed, and so on).

Level 1 discoveries are network-based scans. Nothing is needed on the discovery targets because no remote sessions are established with them.

For level 2 and level 3 discovery profiles, information must be gathered about, and possibly changes made to, all discovery targets to enable them to be discovered by TADDM. Minimally, the following must be done on every target scanned with a level 2 (or level 3) discovery profile:

Identify or create an operating system service account on the target that TADDM will use to establish remote sessions with the target. It is highly suggested that the same service account (same user name and password) be created on all targets to ease maintenance of the access lists within the TADDM console.

Ensure that the service account has the correct permissions for running the commands the Discovery sensors rely on. See the Best Practices For Using TADDM Sensors white paper for specific details about the prerequisites for each sensor.

For UNIX and Linux targets:

o The open source 'lsof' command must be installed. If a suitable lsof binary file is not available, the command must be compiled manually from the lsof source, which is located at ftp://lsof.itap.purdue.edu/pub/tools/unix/lsof/ . Currently, the latest and recommended version is 4.80.


o The lsof command must run as the root user on behalf of the TADDM service account: either the lsof binary must be setuid root, or sudo access as root (NOPASSWD) must be granted.

o There must be a functioning SSH server running on the target for remote SSH sessions.
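The UNIX and Linux prerequisites above can be spot-checked with a small script run on each target. This is an illustrative sketch, not an IBM-supplied tool; it only checks the lsof prerequisite, and the exact checks your environment needs may differ:

```python
# Illustrative readiness check for a UNIX/Linux discovery target.
# Not an IBM tool: it only spot-checks the lsof prerequisite described above.
import os
import shutil
import stat

def check_lsof() -> str:
    """Report whether lsof is installed and whether it is setuid root."""
    path = shutil.which("lsof")
    if path is None:
        return "lsof: NOT FOUND - install it or compile it from the lsof source"
    mode = os.stat(path).st_mode
    if (mode & stat.S_ISUID) and os.stat(path).st_uid == 0:
        return f"lsof: {path} is setuid root"
    return f"lsof: {path} found; ensure it is setuid root or grant sudo (NOPASSWD) access"

print(check_lsof())
```

A similar check for a running SSH daemon could be added, but the SSH requirement is usually easiest to verify by simply opening a session from the TADDM Server.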

For Microsoft Windows targets:

o Local firewalls on the target must be configured to allow WMI access from the Windows gateway server.

Large-scale Deployment Planning

When planning a large-scale TADDM deployment, there are three major factors that must be considered to ensure that expectations are met:

1. The number of configuration items (CIs) in the environment. This requires an understanding of which discovery profiles will be used during discoveries, and against which scopes.

2. The time available for discoveries, that is, the length of the discovery and maintenance windows.

3. The frequency of full discoveries, for example, once a week.

A single TADDM server that meets the minimum specifications can discover and post-process CIs at a rate of approximately one configuration item in 120 ms, using built-in sensors for the discovery (90 ms for discovery and 30 ms for post-processing). Another way of stating this rate is 30 000 CIs in one hour. IBM defines a server equivalent as 200 CIs, so this is the same as saying 150 SEs per hour.
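The throughput figures above follow from simple arithmetic; this short sketch just restates the numbers from the text:

```python
# Quick check of the stated throughput: 120 ms per CI
# (90 ms discovery + 30 ms post-processing) works out to
# 30 000 CIs/hour, or 150 server equivalents (SEs) per hour.
MS_PER_HOUR = 3_600_000
MS_PER_CI = 90 + 30            # discovery + post-processing
CIS_PER_SE = 200               # IBM defines one SE as 200 CIs

cis_per_hour = MS_PER_HOUR // MS_PER_CI
ses_per_hour = cis_per_hour // CIS_PER_SE
print(cis_per_hour, ses_per_hour)  # 30000 150
```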

Important: Users should read Chapter 2 (Architecture Overview and Deployment Planning) in the TADDM Capabilities and Best Practices Redbooks publication before deploying TADDM.

The following equations show the interrelations between these variables:

Number of hours to do a single full discovery:

Hours = Total CIs / (sum of the discovery rates of all Domain Managers, in CIs per hour)

Number of CIs that a TADDM deployment can discover in the allotted time:

CIs = (sum of the discovery rates of all Domain Managers) x Allotted hours


Number of Domain Managers needed to support a discovery in the allotted time (round the number up):

Domain Managers = (Total CIs x Full discoveries per week) / (Discovery rate per Domain Manager x Discovery hours per week)

Examples:

You have two Domain Managers, with a discovery rate of 20 000 CIs per hour for the first server and 30 000 CIs per hour for the second server. You have approximately 3 000 server equivalents. How long does it take to do a single discovery?

Because a server equivalent is equal to 200 CIs, you have 600 000 configuration items. In this example, the first equation is used:

600 000 CIs / (20 000 + 30 000 CIs per hour) = 12

The answer is 12 hours.

For this same example, how many SEs can you discover in two hours? At a combined 50 000 CIs per hour, you can discover 100 000 CIs, or 500 SEs.

In this next example, you have approximately 1 500 000 configuration items (7 500 server equivalents) and need a full discovery once a week. You mandate that discoveries can be performed only during maintenance periods, which are three hours a night.

Because a full discovery must be completed once a week, and the number of hours allocated to discovery is 21 hours a week, the number of Domain Managers needed is as follows:

(1 500 000 CIs x 1 discovery per week) / (30 000 CIs per hour x 21 hours per week) = 2.4

Rounding up, you need three Domain Managers.

In this next example, you have 8 000 server equivalents (1 600 000 CIs), and you want to do four full discoveries a week. You can also allot only four hours a night for discoveries, or 28 hours a week. How many Domain Managers will you need?

(1 600 000 CIs x 4 discoveries per week) / (30 000 CIs per hour x 28 hours per week) = 7.6

Rounding up, you need eight Domain Managers. If the hardware cost for eight Domain Managers is too high, it is advised to either increase the number of hours allotted for discovery (lengthen the maintenance period) or reduce the number of full discoveries per week.
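The sizing equations can be sketched in code. This is an illustrative helper, not part of TADDM, and it assumes the nominal built-in sensor rate of 30 000 CIs per hour wherever the worked examples do:

```python
# Sketch of the TADDM sizing formulas (illustrative, not an IBM tool).
import math

CIS_PER_SE = 200  # IBM defines a server equivalent as 200 CIs

def discovery_hours(total_cis, rates):
    """Hours for one full discovery, given each Domain Manager's CIs/hour rate."""
    return total_cis / sum(rates)

def discoverable_cis(rates, hours):
    """CIs a deployment can discover in the allotted hours."""
    return sum(rates) * hours

def domain_managers_needed(total_cis, discoveries_per_week, rate_per_dm, hours_per_week):
    """Domain Managers needed to fit the weekly discovery workload (rounded up)."""
    return math.ceil(total_cis * discoveries_per_week / (rate_per_dm * hours_per_week))

# First example: 3 000 SEs, servers at 20 000 and 30 000 CIs/hour
print(discovery_hours(3_000 * CIS_PER_SE, [20_000, 30_000]))       # 12.0 hours
print(discoverable_cis([20_000, 30_000], 2) / CIS_PER_SE)          # 500.0 SEs in 2 hours
# Second example: 7 500 SEs, one discovery a week, 21 hours/week
print(domain_managers_needed(7_500 * CIS_PER_SE, 1, 30_000, 21))   # 3
# Third example: 8 000 SEs, four discoveries a week, 28 hours/week
print(domain_managers_needed(8_000 * CIS_PER_SE, 4, 30_000, 28))   # 8
```

Measured per-server rates from a pilot discovery should replace the nominal 30 000 CIs/hour figure before the result is used for purchasing decisions.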

TADDM Server and Domain Manager

The functions of the TADDM Server require significant processor, memory, network, and database resources to run properly. A system undersized in any of these physical resources can become a bottleneck and slow down the overall functions of the TADDM Server, resulting in poor GUI performance and longer discovery times.

Similarly, the user-defined discovery scopes and discovery profiles can have a drastic effect on the number of configuration items (CIs) that the TADDM Server must maintain. It is strongly advised that careful planning be done before deployment to ensure that the number of CIs does not exceed the limit for a single TADDM Server.

For the purposes of sizing, the following categories of TADDM servers are used (based on Server Equivalents):

Small: up to 1 000 SEs
Medium: 1 000 - 2 500 SEs
Large: 2 500 - 5 000 SEs
Enterprise: 5 000 - 10 000 SEs

TADDM Server Hardware Specifications

Processor

2 GHz (minimum), 3 GHz (or faster) recommended

Small: 2 processors
Medium: 3 processors
Large: 4 processors
Enterprise: 4 processors

RAM

All sizes are minimum, double the minimum is recommended:

Small: 4 GB
Medium: 4 GB
Large: 6 GB
Enterprise: 8 GB

Disk

5 GB minimum

This includes product installation, and additional space for log files. Additional space might be required for DLA books, additional logging requirements, and so on.

Network

100 Mbit Full Duplex Ethernet minimum


Operating System

See the Tivoli Platform and Database Support Matrix document for the full list.

On certain versions of the AIX operating system, you cannot compile 'nmap', used by the StackScanSensor (Level 1 profile). Plan to have a non-AIX based anchor server if you depend on 'nmap.'

The most commonly used and recommended operating system is RHEL v4.

TADDM Server notes

The TADDM database server should be located on the same subnet with the TADDM Server, each capable of 100 Mbit Full Duplex.

Consider mounting the TADDM logging directory on a separate physical disk drive to increase disk I/O throughput when DEBUG tracing is enabled continuously.

Increase the discovery worker thread count to run more sensors in parallel by setting the com.collation.discover.dwcount parameter in the collation.properties file. A setting of 32 is a good choice (the default is 16).
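As a sketch, the change is a single line in the properties file (the property name is as stated above; 32 is the suggested value, not a hard requirement):

```properties
# collation.properties on the TADDM Server: raise the discovery worker
# thread count from the default of 16 so more sensors run in parallel
com.collation.discover.dwcount=32
```

Restart the TADDM Server for the setting to take effect, and watch memory use after raising the thread count, since more concurrent sensors consume more server resources.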

Database server

The TADDM database server has extremely high disk I/O throughput, processor, memory and network requirements. Take special precautions to increase the hardware requirements if the database server is shared by multiple TADDM Servers (Domain Managers).

For the purposes of sizing, the following categories of TADDM servers are used, based on Server Equivalents (SEs):

Small: up to 1 000 SEs
Medium: 1 000 - 2 500 SEs
Large: 2 500 - 5 000 SEs
Enterprise: 5 000 - 10 000 SEs

Database Server Hardware Specifications

If you intend to use a single shared database server for multiple TADDM domains, then these hardware specifications should be doubled or tripled.

Database

IBM DB2 UDB Enterprise Server Edition v8.2 or later, or Oracle v9 or 10g

Configure database logging onto a separate physical disk drive.


Processor

2 GHz (minimum), 3 GHz (or faster) is recommended

Small: 1 processor
Medium: 2 processors
Large: 2 processors
Enterprise: 4 processors

RAM

All sizes are minimum, double the minimum is recommended:

Small: 1 GB
Medium: 2 GB
Large: 3 GB
Enterprise: 4 GB

Disk

100 GB available: Expect 2 MB of disk space for each server equivalent (SE) that is discovered. Plan for extra space if multiple TADDM discovery versions are to be saved; each version requires an additional 2 MB of space per SE that is discovered.
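The disk estimate reduces to a simple formula; this sketch uses only the 2 MB-per-SE figure from the text (the 10 000 SE, four-version scenario below is an illustration, not a figure from the paper):

```python
# Estimate TADDM database disk usage: 2 MB per discovered server
# equivalent (SE), counted once per saved discovery version.
MB_PER_SE_PER_VERSION = 2

def db_disk_mb(server_equivalents: int, saved_versions: int = 1) -> int:
    """Disk space in MB for CI data, before indexes and database logs."""
    return server_equivalents * MB_PER_SE_PER_VERSION * saved_versions

# An Enterprise-class domain (10 000 SEs) keeping 4 discovery versions:
print(db_disk_mb(10_000, 4))  # 80000 MB, roughly 78 GB
```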

Disk I/O throughput is extremely important. Using multiple small hard disk drives instead of a single large drive is highly recommended (use RAID v6 to support multiple drive failures). Minimum hard disk drives: 2; recommended: 3 or more.

Network

100 Mbit Full Duplex Ethernet minimum

Operating System

See the Tivoli Platform and Database Support Matrix (found in the References of this paper).

Database Server Notes

The TADDM database server should be located on the same subnet with the TADDM Server, each capable of 100 Mbit Full Duplex.

Disk I/O throughput requirements are very high for the TADDM database server. It is suggested that you use RAID v5 or RAID v6 with multiple hard disk drives to increase data integrity and disk I/O throughput.

To reduce hardware and maintenance costs, it is recommended that you use a single high-quality database server for all TADDM Domain Managers.

Anchor Server

The anchor server serves as a proxy for a Domain Manager in a firewalled zone. It should have a fast Ethernet connection to the owning upstream Domain Manager (or higher-level anchor server).

Anchor Server Hardware Specifications

Processor

Two-core 2 GHz minimum; four-core 4 GHz each recommended for large domains

Always choose fewer, more powerful processors. For example, choose two 4 GHz processors instead of eight 1 GHz processors.

RAM

2 GB minimum, 4 GB recommended; 2 GB of disk swap space should be enabled

Disk

2 GB available

Network

100 Mbit Full Duplex Ethernet minimum

Operating System

Any operating system that is capable of running a TADDM Server can also be used as an anchor server. See the Tivoli Platform and Database Support Matrix document for the full list.

On certain versions of the AIX operating system, you cannot compile 'nmap', which is used by the StackScanSensor (Level 1 profile). Plan to have a non-AIX anchor server if you depend on 'nmap'.

The most commonly used and recommended operating system is RHEL v4.

Anchor Server Notes

Use 100 Mbit Full-Duplex Ethernet to eliminate network bottlenecks. Adding anchor servers will not help performance problems on a Domain Manager.

Do not use the AIX operating system if you plan to use the 'nmap' tool for Level 1 discoveries.

On Linux and UNIX systems, access to the local 'sudo' command (with no password) is required.

Windows gateway server

The Windows gateway server is extremely processor intensive, though its disk I/O and RAM requirements are low.

Windows Gateway Server Hardware Specifications

Processor

Two-core 3 GHz minimum; four-core 3 GHz each recommended

RAM

1 GB minimum, 2 GB recommended

Disk

Less than 5 MB is needed

Network

100 Mbit Full Duplex Ethernet minimum

Operating System

Only compatible Microsoft Windows operating systems; Windows 2003 is recommended

Windows Gateway Server Notes


Windows gateway servers can be shared by multiple TADDM Domain Managers. Add additional Windows gateway servers if a single server becomes a scale or performance bottleneck.

A compatible SSH server is required, and the Windows .NET Framework must be installed.

The eCMDB Server

When deploying TADDM in Enterprise mode with multiple Domain Managers, you need a dedicated Enterprise CMDB (eCMDB) server and an accompanying database server.

The eCMDB Server Hardware Specifications

Processor

Four processors, each 2 GHz minimum; 3 GHz (or faster) recommended

Always choose fewer, more powerful processors: for example, choose two 4 GHz processors instead of four 2 GHz processors.

RAM

8 GB; 4 GB of disk swap space should be enabled

Disk

5 GB available

Network

100 Mbit Full Duplex Ethernet minimum

Operating System

See the Tivoli Platform and Database Support Matrix (found in the References of this paper).

The eCMDB Database Server Hardware Specifications

Processor

Four processors, each 2 GHz minimum; 3 GHz (or faster) recommended

RAM

4 GB minimum; 4 GB of disk swap space should be enabled

Disk

100 GB available: Expect 2 MB of hard disk space for each server equivalent (SE) that is discovered. Plan accordingly for extra space if multiple TADDM discovery versions are to be saved; each version requires an additional 2 MB of space per SE that is discovered.

Disk I/O throughput is extremely important. Using multiple small hard disk drives instead of a single large drive is highly recommended (use RAID v6 to support multiple drive failures). Minimum disk drives: 2; recommended: 3 or more.

Network

100 Mbit Full Duplex Ethernet minimum

Operating System

See the Tivoli Platform and Database Support Matrix (found in the References of this paper)

Discovery Targets

The discovery targets do not have any particular hardware requirements, though changes need to be made to those systems, including creating a service account for TADDM access, installing the lsof command, and so on. See the Discovery Targets section in How TADDM Works.

Large-scale Deployment Example

Goal: Deploy TADDM to discover 11 data centers located throughout the United States to aid in security and system compliance, such as for patch levels. These are the requirements:

Monitor approximately 12 000 lab resources spread throughout the U.S. Only operating system details are necessary (Level 2 scans only).

No firewalls block traffic between the TADDM servers and the target systems, so there is no need for anchors.

Fifty percent of lab resources are located in a single data center.

Around 15 users will be connecting to this environment, primarily from the main data center.

The following list shows what the deployment plan for this solution included:

A single eCMDB server: four dual-core 3 GHz processors (8 cores total) with 16 GB RAM and a 100 GB hard disk drive

Three Domain Managers, each with four dual-core 3 GHz processors (8 cores total), 8 GB RAM, and a 100 GB hard disk drive

A single database server (shared by all domains): four dual-core 3 GHz processors (8 cores total) with 20 GB RAM and a 100 GB hard disk drive on an external storage array for high disk I/O throughput

A single Windows gateway server (shared by all domains): four dual-core 3 GHz processors (8 cores total) with 4 GB RAM and a 100 GB hard disk drive

All servers are installed in a single data center (the large data center). Because there are no firewalls between the data centers, there is no need for anchor servers. Each Domain Manager is partitioned to discover a subset of the 12 000 lab resources. A service account is created on all systems to ease administration of the TADDM credential lists.
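As a rough cross-check of this plan against the sizing equations from the planning section, assuming roughly one server equivalent per lab resource and the nominal 30 000 CIs/hour per Domain Manager (both simplifying assumptions, not figures stated in the deployment plan):

```python
# Rough capacity cross-check of the example deployment (illustrative only).
CIS_PER_SE = 200
RATE_PER_DM = 30_000          # assumed nominal CIs/hour per Domain Manager
resources = 12_000            # lab resources, treated here as ~1 SE each
domain_managers = 3

total_cis = resources * CIS_PER_SE
hours = total_cis / (domain_managers * RATE_PER_DM)
print(round(hours, 1))  # ~26.7 hours for one full Level 2 discovery
```

Spread across nightly maintenance windows, this workload fits comfortably within a weekly discovery cycle for the three Domain Managers in the plan.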