Information Life Cycle Management (Data Archiving) in Business


Satyam Computer Services Limited | www.satyam.com

© 2008 Satyam Computers Services Ltd 1

Information Life Cycle Management (Data Archiving) in Business Intelligence

WHITE PAPER

Author(s): Nipun Sharma, SAP-BI Solutions Labs

Company: Satyam Computer Services Limited

Created on: 19th June 2008

WHITE PAPER: INFORMATION LIFE CYCLE MANAGEMENT (DATA ARCHIVING) IN BUSINESS INTELLIGENCE


Table of Contents

Abstract
Data Archiving as part of Information Life Cycle Management
Need for Information Life Cycle Management (Data Archiving) in SAP BI
Motivation for Information Life Cycle Management (Data Archiving)
Benefits of an Information Life Cycle Management Strategy (Data Archiving)
Implementing the Information Life Cycle Management (Data Archiving) in SAP BI
Data Archiving Methods
Challenges for SAP BI Data Archiving
Limitations of Archiving
Important Data Archiving Features
Purpose and Suitability of Data Archiving
Storing Archived Data (ADK based)
Determining which archiving strategy to use
Recommendations:
7 Key Points to Take Home
Related Content
About the Authors
About Satyam
Satyam in SAP Business Intelligence Space
Copyright


Abstract

The purpose of this paper is to introduce the concepts of SAP BI Information Life Cycle Management (Data Archiving). It is aimed at customers who are interested in implementing data archiving and at SAP consultants. Readers who want a better understanding of SAP BI archiving, and of what it can do for customers, will benefit from this paper. It describes the different archiving processes that can be used to archive rarely used data. After reading it you will be familiar with the benefits and potential of Information Life Cycle Management (data archiving) in SAP BI.

This paper relies on an understanding of SAP systems based on release SAP NetWeaver 7.0 and below. It is intended to give the reader a basic awareness of the archiving features available in BW 3.5 and BI 7.0. However, please note the following: (a) no portion of this paper explains the technical steps of archiving; (b) this paper should by no means be considered a sufficient knowledge base for positioning or implementing SAP Data Archiving.

Data Archiving as part of Information Life Cycle Management

Organized data constitutes information. In today's business scenario, large amounts of information are created on the fly. This information stays in the databases and systems of an organization for a long period. It is changed, then archived, and eventually destroyed. Managing this cycle of creation, storage, retrieval, and destruction of information is Information Life Cycle Management (ILM).

In SAP parlance, ILM is defined as a combination of processes and technologies whose goal it is to provide the right information at the right time, and at the right place, with the lowest possible costs - over the entire life time of the data. This entails knowing and categorizing a company's data, defining policies that govern what the company does with the information, setting up the system in such a way that these policies can be applied to the data, and then implementing a customer-specific information management strategy with the help of technology.

Data archiving focuses on effective data management; it helps reduce system load and supports legal compliance. Data archiving means storing data in a secondary location from which it can be retrieved later whenever required.

Source: SAP AG

NetWeaver Service: Data Archiving


Need for Information Life Cycle Management (Data Archiving) in SAP BI

The ever-changing business environment brings new business demands, and high growth in business demand brings similar growth in data volume. The main challenge a company faces today is a rapidly growing SAP BI database. A huge database slows data retrieval at query runtime and increases the maintenance and administration effort. As a result, the cost of processes, personnel, and technology grows, which drives up the TCO. Legal requirements such as those of the SEC, the FDA, and the Sarbanes-Oxley Act (SOA) pose additional challenges for ERP data.

SAP data archiving is the only method supported by SAP to remove application data from the database in a consistent, secure and comprehensive manner. Consistency is ensured through checks performed by the archiving programs. Purely database-integrated archiving is not used, because the database does not know the business context of the data to be archived. Using data archiving you can select significant objects, such as accounting documents, material master records or HR master data, and remove them from the database without having to worry about the underlying table design of the linked data. The archived data is stored in a file system and from there can be moved to other storage media. For security reasons the archived data is not deleted from the database until the archive files have been read and thereby confirmed.

The administrative cost of 1 TB of storage is 5 to 7 times as high as the cost of the storage itself.

With a sound ILM strategy it is possible to handle increasing data volumes, reduce resource consumption, enable greater availability, and speed up both data loading and querying.

Starting with mySAP ERP, data is extracted to BI by the extraction layer. In BI the data model is different: it is multidimensional. The EDW layer consists of, for example, DataStore objects (DSO). Above that is operational reporting. InfoCubes are multidimensional objects for strategic analysis. On top of that, data mining or business planning can be done.

As you go up, the aggregation level increases, and granularity decreases.


Motivation for Information Life Cycle Management (Data Archiving)

Source: SAP AG

According to a recent study by SAP, the growth rate of organizational databases increases year on year. The above image shows that database size grows at a very fast pace each year, which in turn drives up the cost of the hardware required to cope with it.

The picture also shows the benefit of archiving data from the SAP system at regular intervals: both hardware investment and the growth of data on the online servers can be greatly reduced.

As a rule of thumb, the ratio of data that must be online and instantly accessible to old data that could be archived and stored offline is 1:6. For example, if an enterprise has a 2100 GB SAP database, the online data frequently used by SAP users will be about 300 GB, and the rest (1800 GB) will be rarely used and can therefore be archived.
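The 1:6 rule of thumb can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name and its parameters are made up for this example and are not part of any SAP tool.

```python
def split_by_ratio(total_gb, online_parts=1, archivable_parts=6):
    """Split a database size into online and archivable portions.

    The default 1:6 ratio is the rule of thumb quoted in the text;
    a real system should be measured rather than estimated this way.
    """
    unit = total_gb / (online_parts + archivable_parts)
    return online_parts * unit, archivable_parts * unit


online_gb, archivable_gb = split_by_ratio(2100)
print(f"online: {online_gb:.0f} GB, archivable: {archivable_gb:.0f} GB")
# prints "online: 300 GB, archivable: 1800 GB"
```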


Benefits of an Information Life Cycle Management Strategy (Data Archiving)

The main advantage data archiving offers is reduced resource consumption. Its effect can be seen in lower hardware costs for the hard drives on the BI side where the online data is stored. As the data grows each day, the need for additional hard drives becomes unavoidable; with an archiving solution in place, this investment can be delayed for a long time.

Main memory and CPU consumption will decrease, which lowers the cost of administration considerably. Less effort is required for backup and disaster recovery, because data moves into archive storage at regular intervals.

With a data archiving solution in place, the frequency of backups is reduced and data can be recovered faster. Making data available to meet legal requirements becomes easy and fast. Data access also becomes faster, because only the required data is searched.

Performance is another aspect that benefits greatly from an archiving solution: the loading process in SAP NetWeaver BI speeds up, and SAP BI query response time in dialog is optimized.

With a good archiving strategy in place, the management and use of large amounts of information remain effective even as data volume increases. Information stays available for any time frame for ad-hoc analysis and rebuild.

Greater system availability is another major advantage. Large data volumes can cause long runtimes during regular administration tasks on your system, such as data backups. This is especially detrimental during operations that require an application system, or certain parts of it, to be shut down, meaning that the system is not available to the end user during that time. A shutdown may be required during an upgrade to a higher software release or after a system failure, when the data is being restored. Here, data archiving can help minimize the time a system is unavailable by reducing the volume of data in your database. In addition, data archiving can take place while your system is online, which means you do not have to shut down your system during archiving operations.

Implementing the Information Life Cycle Management (Data Archiving) in SAP BI

The first step towards realizing information life cycle management is to analyze and categorize the information layer via technical BI content: query statistics, data usage, access frequencies, and granularity levels.

The second step is to define policies for these categories: decide which business areas, data, InfoCubes, etc. are most important, and define the retention period of the data and the time slicing for data access. Special legal requirements must be considered here. For example, data from one InfoCube may have to be kept for one year, while data from another InfoCube must be kept for ten years because of legal requirements. Data or information is categorized according to importance (see the matrix below).

The third step in ILM is to apply the policies defined above to the different layer types and data categories.


Categorize the information according to importance:

- Online Database: frequently read/changed data

- Near-line Storage: rarely read data

- Classic Archive: very rarely read data

Finally comes the realization, which includes automating the archiving, deletion of data from operative systems, migration, access and reload processes, and so on. The realization part is explained in the following topic, "Data Archiving Methods".

Source: SAP AG

The above picture shows the flowchart view of the Data Archiving Process.
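The categorization step can be pictured as a simple rule that maps each slice of data to a storage tier based on how often it is read or changed. The following is a minimal sketch under stated assumptions: the thresholds, the `DataSlice` fields, and the function name are hypothetical, and in practice the figures would come from BI query statistics and the company's retention policies.

```python
from dataclasses import dataclass

# Hypothetical thresholds (reads per month), chosen only for illustration.
NEARLINE_MAX_READS = 10
ARCHIVE_MAX_READS = 1


@dataclass
class DataSlice:
    info_provider: str       # e.g. the name of an InfoCube
    year: int                # time slice the data belongs to
    reads_per_month: float   # taken from query statistics
    changed_recently: bool   # data that still changes must stay online


def storage_tier(s: DataSlice) -> str:
    """Map a data slice to one of the three tiers from the matrix above."""
    if s.changed_recently or s.reads_per_month > NEARLINE_MAX_READS:
        return "online database"
    if s.reads_per_month > ARCHIVE_MAX_READS:
        return "near-line storage"
    return "classic archive"


slices = [
    DataSlice("SALES", 2008, 40.0, False),
    DataSlice("SALES", 2006, 5.0, False),
    DataSlice("SALES", 2002, 0.2, False),
]
for s in slices:
    print(s.info_provider, s.year, "->", storage_tier(s))
```

Legal retention periods would add a second dimension to such a rule (how long a slice must be kept at all), which this sketch leaves out.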

Data Archiving Methods

1. Archiving Development Kit (ADK) only (classical)

Starting with BW 3.0, SAP introduced the ERP-proven, ADK-based archiving in BW. ADK-based archiving provides archiving capabilities for InfoProviders but no query access to the archived data. Archiving objects can be generated for InfoCubes and DataStore objects. Based on these archiving objects, ADK files can be generated and stored on the file system of the application server. Using the ArchiveLink interface, these ADK files can be passed to any external archiving solution.

ADK files cannot be read by BW queries; for query access, the data has to be reloaded. The files are read sequentially, which makes data retrieval slow. The BW transaction for invoking classical archiving is SARA. This method applies to both BW 3.5 and BI 7.0.

The archiving procedure is divided into three main steps:

Creation of archive files: In the write phase the data to be archived is written sequentially into newly created archive files.

Delete from the database: The delete program reads the data from the archive files and then deletes it from the database.

Storage of archive files: The newly created archive files can then be moved to a storage system or copied to a tape.

The move to an external storage system can be triggered manually or automatically. It is also possible to store the archive files before the delete phase.
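The three steps can be illustrated with a small, self-contained sketch. It is not ADK code: the dictionary standing in for the database, the JSON-lines file format, and all function and file names are assumptions made for this example. What it does mirror is the order of operations described above, in particular that records are deleted only after they have been read back from the archive file.

```python
import json
import shutil
import tempfile
from pathlib import Path


def write_phase(db, selected_keys, archive_file):
    """Write the selected records sequentially into a new archive file."""
    with archive_file.open("w", encoding="utf-8") as f:
        for key in selected_keys:
            f.write(json.dumps({"key": key, "data": db[key]}) + "\n")


def delete_phase(db, archive_file):
    """Read the archive file back, then delete exactly the records that were read."""
    with archive_file.open(encoding="utf-8") as f:
        archived_keys = [json.loads(line)["key"] for line in f]
    # Deletion starts only after the whole file was read without errors.
    for key in archived_keys:
        del db[key]
    return len(archived_keys)


def store_phase(archive_file, storage_dir):
    """Move the archive file to a (hypothetical) storage location."""
    storage_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(archive_file), str(storage_dir / archive_file.name)))


def run_archiving_session(db, selected_keys, work_dir):
    archive_file = work_dir / "session_0001.archive"
    write_phase(db, selected_keys, archive_file)
    deleted = delete_phase(db, archive_file)
    stored = store_phase(archive_file, work_dir / "storage")
    return deleted, stored


with tempfile.TemporaryDirectory() as tmp:
    db = {
        "2005-001": {"amount": 10},
        "2005-002": {"amount": 20},
        "2008-001": {"amount": 30},
    }
    old_keys = [k for k in db if k.startswith("2005")]
    deleted, stored = run_archiving_session(db, old_keys, Path(tmp))
    print(deleted, sorted(db), stored.name)
    # prints "2 ['2008-001'] session_0001.archive"
```

Swapping the last two calls in `run_archiving_session` would correspond to the variant in which the files are stored before the delete phase.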


Source: SAP AG

2. XML Based Archiving

SAP offers XML-based archiving as of SAP Web Application Server 6.40. With this new procedure, XML archiving objects are used to write data in the form of resources either to a file system or directly to a WebDAV system, which takes on the role of the storage system.

In XML-based archiving the delete phase does not exist as a separate step. The data can be written directly to a WebDAV system. The archiving programs are generally scheduled in the background; however, they can also run in online mode.

XML archiving is primarily used for XML archiving objects implemented by ABAP applications with XML interfaces. It is based on the following principles:

Comprehensive use of standards:

- XML: Generally accepted markup language and exchange format for complex objects, especially business objects, which is supported by a wide range of tools; compared to ADK, this format is used to archive data objects as whole objects (not broken down into different records).

- XML schema used to validate and describe the structures of XML documents.

- HTTP(s) used for secure communication between the application system and the XML data archiving service (XML DAS).

- WebDAV used as a model for the hierarchical organization of archived data and as a general (not SAP-specific) protocol for connecting storage systems including archive systems.

- JAVA used as a platform-independent programming language, which is also widely used outside the SAP world.

- J2EE as a standard for the development of JAVA-based enterprise applications.


Data access is flexible and available even in the long term

- Independent from the application system that generates the data (in case a system has a shorter life cycle than the data to be archived).

- Independent from the application system technology (also relevant for JAVA applications).

- Basis for cross-system data accesses and searches.

Central archiving service to minimize administration efforts for system landscapes with several components.

Implementation Considerations:

The XML DAS is a new technology that was designed specifically for XML archiving objects.

Integration:

XML-based archiving involves two logically separate systems, which are generally also physically separate: the application system whose database is to be cleaned up and which contains the data to be archived, and the archiving system in which XML DAS is running.

The archiving programs of an XML archiving object use the XML Archive API instead of ADK. The XML Archive API communicates with XML DAS using HTTP(s). XML DAS is part of the SAP J2EE Engine, which means that the archiving takes place on the JAVA stack of an SAP Web AS.
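To make the principles more concrete, the sketch below serializes a business object as one whole XML document, places it in a hierarchical, WebDAV-style path, and prepares an HTTP PUT request for it. This is an illustration only: it does not use the XML Archive API or the XML DAS protocol, and the URL, the element names, the path layout, and the sample order are all hypothetical. The request is built but deliberately not sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical base URL of an archive store; not a real XML DAS endpoint.
ARCHIVE_BASE_URL = "https://archive.example.com/archive"


def to_xml_resource(order: dict) -> bytes:
    """Serialize a whole business object (header and items) as one XML document."""
    root = ET.Element("SalesOrder", id=order["id"])
    ET.SubElement(root, "Customer").text = order["customer"]
    items = ET.SubElement(root, "Items")
    for item in order["items"]:
        ET.SubElement(
            items, "Item", material=item["material"], quantity=str(item["quantity"])
        )
    return ET.tostring(root, encoding="utf-8")


def resource_path(system_id: str, object_type: str, year: int, object_id: str) -> str:
    """Build a hierarchical collection path for the resource (assumed layout)."""
    return f"/{system_id}/{object_type}/{year}/{object_id}.xml"


def build_put_request(path: str, body: bytes) -> urllib.request.Request:
    """Prepare, but do not send, an HTTP PUT that would store the resource."""
    return urllib.request.Request(
        ARCHIVE_BASE_URL + path,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )


order = {
    "id": "4711",
    "customer": "ACME",
    "items": [{"material": "M-01", "quantity": 3}, {"material": "M-02", "quantity": 1}],
}
body = to_xml_resource(order)
request = build_put_request(resource_path("BW1", "SalesOrder", 2005, order["id"]), body)
print(request.get_method(), request.full_url)
```

Because the object is stored as one self-describing document, it can later be parsed by any XML-aware tool, independently of the system that wrote it.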

3. PBS archive add-on CBW

PBS is a certified SAP partner that provides data archiving solutions. Its add-on, CBW, is a data archiving solution for SAP BW 3.x. CBW is a supplementary software solution that provides convenient, integrated BW query access to archived and database-resident InfoCube and ODS data without restoring the archive files into the BW database. The only prerequisite for CBW is a working SAP BW data archiving process.

Source: SAP AG

This add-on can be applied to all customer-specific InfoCube and ODS archive files (generic character). It also supports compliance with legal and internal reporting requirements for archived data (long retention periods).


CBW and SAP data archiving allow a significant reduction of BW database growth without loss of reporting opportunities (significant cost savings). The architecture of the PBS archive add-on is shown below; it shows how the SAP ADK archive is linked to the PBS add-on. Archive and index data are stored in a file system or on an archive server. The indexes are generated by CBW, and the data can be accessed through the ArchiveLink interface.

Source: SAP AG

Architecture

Once the data is archived, we can still get it in our queries by creating a MultiProvider on top of the virtual InfoProvider with ArchiveLink services, which connects to the PBS ADK interface and fetches the indexed data.

Source: SAP AG

CBW Data Access Concept
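The value of an index over archive files can be shown with a generic sketch. It does not reproduce how CBW builds or stores its indexes; the record format and function names are assumptions. It only contrasts a sequential scan of an archive file with a lookup that jumps straight to a record's position.

```python
import io
import json


def write_archive(records):
    """Write records sequentially; return the archive bytes and a key -> offset index."""
    buf = io.BytesIO()
    index = {}
    for rec in records:
        index[rec["key"]] = buf.tell()  # remember where this record starts
        buf.write((json.dumps(rec) + "\n").encode("utf-8"))
    return buf.getvalue(), index


def read_sequential(archive: bytes, key):
    """Classical access: scan from the start until the key is found."""
    for line in io.BytesIO(archive):
        rec = json.loads(line)
        if rec["key"] == key:
            return rec
    return None


def read_indexed(archive: bytes, index, key):
    """Indexed access: jump directly to the record's offset."""
    offset = index.get(key)
    if offset is None:
        return None
    f = io.BytesIO(archive)
    f.seek(offset)
    return json.loads(f.readline())


records = [{"key": f"DOC{i:04d}", "amount": i * 10} for i in range(1000)]
archive, index = write_archive(records)
print(read_indexed(archive, index, "DOC0999") == read_sequential(archive, "DOC0999"))
# prints "True"
```

Both functions return the same record; the indexed variant simply avoids reading the 999 records that precede it.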

4. Nearline only and Nearline + ADK

Nearline Storage is a new category of data persistency that is similar to archiving. The overall goal is to take read-only data out of the database system and put it on cheaper devices (normally file systems) without losing the ability to directly access the data for analysis and ETL purposes. Therefore, NLS providers typically display NLS data in SQL interfaces, wherever the data resides, whatever format it has, and however high the compression rates are.

For these two cases a Nearline Service has to be installed and configured in the BI system. Only in these two cases is direct access to archived data from a query supported. If the Nearline Service is operated without ADK-based archiving, it has to store the complete archived data and fulfill the legal requirements for archived data (e.g. long-term storage, auditability, immutability). In combination with ADK-based archiving, it can be sufficient to store only index information for the ADK archives within the Nearline Storage. If ADK-based archiving is enabled, an ADK Archiving Object is generated for and assigned to the Data Archiving Process.

Furthermore, if a Nearline Service is assigned to the Data Archiving Object a Nearline Object is generated for the access to the Nearline Service. A Nearline Object plays a similar role for Nearline Storage as the ADK Archiving Object for classical ADK-based data archiving.

Some of the NLS partners also offer to create indexes for these ADK files, making them directly accessible for BI Queries and Data Transfer Processes (DTP) in a NetWeaver 2004s environment. This could be one migration path from ADK to NLS, based on one partner's enhancement for ADK archiving. Another solution is to reload the ADK files from the archive and export them into NLS partitions again; however, this solution requires more effort. This method applies only to BI 7.0.

Source: SAP AG

Basis for Nearline Storage

Near-line storage strategies in SAP NW 2004s are based on the idea that the access frequency for data decreases over time: the older the data gets, the less frequently it is used. NLS therefore aims to fill the gap between online storage and classical offline archives. Unlike classical offline archives, which are not directly accessible, it offers direct SQL access.


Source: SAP AG

We must decide which device categories are appropriate for our environment and what level of performance we are willing to accept.

Performance Characteristics:

Performance and cost-optimized management to keep data in multi-layer, transparent storage systems.

Combined storage systems of disk, tape and optical storage media in different variations.

Direct row-oriented data access - also to compressed archive data using all types of storage media.

Access strategies and aging pattern for logical grouping of data.

Dynamic migration on cost-effective storage media for data files with regressive access frequency.

Automated process for back up, shadowing, mirroring, recovery, etc.

Key Advantages of NLS

NLS fills the gap between online storage and offline storage with its data residing neither in the BI data base nor in a classic archive system.

NLS data can be accessed directly for analyses and data load purposes (BI Queries, DTP’s, and Reload Feature).

NLS handling is provided for InfoCubes and DataStore Objects and processed using Data Archiving Processes (DAP).

NLS storage and Online Storage together consistently reflect the BI data persistency of an InfoProvider.

NLS data is read-only.

NLS partitioned portions of an InfoProvider are write-protected.

Separation of frequently used data and rarely used data via admin cockpit capabilities.

Intelligent data access:

- Analysis/feedback data selection

- High-level index in SAP NetWeaver BI

- Low-level index in near-line storage

Open interface for certified partners


Where are Archiving and Near-line Storage applicable?

The main feature of the NLS interface is that data archiving processes or data transfer processes can be defined to move data from the BI database to the NLS environment or vice versa. The data archiving process splits the data volume of the cube into two parts. When you run a query on the cube, the data manager has to decide whether the data the query requests resides in the database or in the NLS environment.

The high-level index is very important in this context. It works transparently as far as the user is concerned. Overall, we reduce data volumes but we do not lose data accessibility.

When you run a query on the cube, you can set an additional query property, "Nearline Storage Should Be Read As Well". You need to know that data in this cube has been archived, and you must switch this flag on to include it.
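The decision the data manager makes can be sketched as follows. This is a simplified model, not SAP's implementation: the class, the single time boundary standing in for the high-level index, and the `read_nearline` parameter (modelled on the query property named above) are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionedCube:
    """An InfoProvider whose older years were moved to near-line storage."""
    archived_up_to_year: int  # stands in for the high-level index
    online_rows: list = field(default_factory=list)
    nearline_rows: list = field(default_factory=list)


def run_query(cube, year_from, year_to, read_nearline=False):
    """Return the rows in the selected years, reading NLS only when needed."""
    def in_range(row):
        return year_from <= row["year"] <= year_to

    result = [r for r in cube.online_rows if in_range(r)]
    # Near-line storage is touched only if the flag is set and the
    # selection overlaps the archived time range.
    if read_nearline and year_from <= cube.archived_up_to_year:
        result += [r for r in cube.nearline_rows if in_range(r)]
    return result


cube = PartitionedCube(
    archived_up_to_year=2005,
    online_rows=[{"year": 2007, "sales": 70}, {"year": 2008, "sales": 80}],
    nearline_rows=[{"year": 2004, "sales": 40}, {"year": 2005, "sales": 50}],
)
print(len(run_query(cube, 2005, 2008)))                      # flag off: online rows only
print(len(run_query(cube, 2005, 2008, read_nearline=True)))  # flag on: archived 2005 row included
```

A query restricted to 2007 and 2008 never reaches the near-line part, even with the flag on, because the boundary check rules it out first.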


Challenges for SAP BI Data Archiving

One of the main challenges in archiving BI data is keeping the archived data flexibly and completely accessible online in BI end-user queries. For the end user, this means seamless retrieval of BI online data together with BI archived data. With this comes another challenge: when a query accesses archived data, the response time for retrieving it should be as short as possible. We also need to make sure that the archiving process places no restrictions on the normal BI data flow.

Source: SAP AG

The above picture explains how the different applications are linked to each other. As we can see, there are different layers in OLAP. For effective archiving, we need to decide at which levels and horizons archiving is required.

Source: SAP AG

In the BI layer shown above, the EDW is the persistent inbound horizon, where data is highly granular. We want this data to be directly available for ad-hoc analysis, data mining, strategic analysis, etc., yet its volume increases very fast. In the layers above it, direct availability is required as well. With data growing this fast, data access and performance are greatly affected. This is the prime reason why we need archiving. The greater challenge is to provide an archiving solution that still caters to the requirements of ad-hoc analysis, data mining, strategic analysis, etc.


Limitations of Archiving

- At present, the archiving of data is supported only from InfoCubes and ODS objects. The archiving of data from other data stores, such as master data or PSA, is not possible with this release.

- The archived data areas in InfoCube and ODS objects are not protected against future changes. If data is loaded into an InfoCube or an ODS object after an archiving session, inconsistent states can occur, depending on the update method, if new records (as regards their keys) are in an archived data area. In the case of ODS objects, this can lead to an inconsistent delta for data targets.

- A direct reload to the original object is not supported. Reloading is possible using an extraction interface, but this should not take place in the original object, but rather in a copy of the object (both objects can be combined as a MultiProvider, so that all data in an object is available in Reporting).
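Because archived data areas are not protected, one possible safeguard is a check in the load process that separates out records whose keys fall into an already archived period. The sketch below shows such a check under stated assumptions: the `period` field in `YYYYMM` format, the function name, and the idea of tracking a single "archived up to" boundary are hypothetical and not a feature described by SAP.

```python
def split_load(records, archived_up_to_period):
    """Separate incoming records that fall into an already archived period.

    Periods are strings in YYYYMM format, so plain string comparison
    orders them correctly.
    """
    accepted, rejected = [], []
    for rec in records:
        if rec["period"] <= archived_up_to_period:
            rejected.append(rec)   # would land in an archived data area
        else:
            accepted.append(rec)
    return accepted, rejected


incoming = [
    {"period": "200512", "doc": "A"},
    {"period": "200601", "doc": "B"},
    {"period": "200806", "doc": "C"},
]
accepted, rejected = split_load(incoming, archived_up_to_period="200512")
print([r["doc"] for r in accepted], [r["doc"] for r in rejected])
# prints "['B', 'C'] ['A']"
```

Rejected records could then be routed to a copy of the object, in line with the reload recommendation above, instead of being loaded into the original.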

Important Data Archiving Features

1. Data Security

To ensure that no data is lost due to errors during the archiving session, the data archiving process consists of two steps. In the first step the data is written to the archive files. In the second step the data is removed from the database. However, this only takes place after the archive file has been written in its entirety and read successfully. This process helps detect any errors that may have occurred when the data was transferred from the database to the archive file via the network. If an error occurred you can set up a new archiving session, because the data is either still located in the database or in an archive file.

2. Archiving in Online Mode

An archiving session, made up of write and delete jobs, can take place while the system is online, meaning that users can continue to use the system while data is being archived. However, you may encounter performance bottlenecks if tables from which data is being deleted are also accessed in online business processes. Therefore it is recommended that you archive during times of low user activity in your system.

3. Data Compression

During data archiving data is compressed by up to a factor of 5. Data that is saved in cluster tables is not compressed any further. Through the compression the archive files occupy as little space as possible on the storage system.

4. Release- and Platform-Independent

To ensure that the archived data can be interpreted over long periods of time, ADK also stores metadata in the archive file together with the application data. This metadata contains information about the current runtime environment of the data.


Purpose and Suitability of Data Archiving

1. What is the purpose of data archiving?

Archiving application data is an important part of managing mass data. Unlike backup and recovery, data archiving is a process that stands in close relationship with the applications and directly affects the business processes of a company. When you archive data, it is still available to the applications for display purposes, but can no longer be changed.

2. Which data should be archived?

Only data that is no longer needed by open business processes should be archived. For example, if a rebate settlement needs to process all the invoices of a given business year, the corresponding data can be archived only once the rebate settlement has been completed.

3. When should application data be archived?

Application data should only be archived when the following criteria have been fulfilled:

a) It is no longer needed in any transactions or processes, such as for completing the annual balance sheet.

b) It does not have to be changed anymore.

c) The data probably does not have to be displayed very often anymore.

Before you start archiving, make sure you fulfill all documentation requirements by creating all the necessary information that you may need for later audits.
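Criteria a) to c) can be read as a simple check applied to each document. The sketch below is illustrative only: the field names and the threshold for "not displayed very often" are assumptions, and real archiving objects perform their own application-specific checks.

```python
from dataclasses import dataclass

# Hypothetical threshold for criterion c), chosen only for illustration.
MAX_DISPLAYS_PER_YEAR = 5


@dataclass
class Document:
    needed_by_open_process: bool  # a) still needed in a transaction or process
    may_still_change: bool        # b) could still be changed
    displays_last_year: int       # c) how often it was displayed


def is_archivable(doc: Document) -> bool:
    """Return True only if all three criteria from the list above are met."""
    return (
        not doc.needed_by_open_process
        and not doc.may_still_change
        and doc.displays_last_year <= MAX_DISPLAYS_PER_YEAR
    )


print(is_archivable(Document(False, False, 1)))  # prints "True"
print(is_archivable(Document(True, False, 0)))   # prints "False"
```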

4. Is data archiving enough for auditing purposes?

Data archiving was not designed as a tax and audit tool. It can support you in meeting the requirements of tax authorities, by conserving data and making it available over a longer period of time. During an audit it is also possible to access archived data, in case more detailed information is needed that does not appear in the other documents you previously created.

5. When is data archiving beneficial?

Archiving application data is beneficial when the effort spent on maintaining your database is becoming too expensive and when, at the same time, you can store and manage the archived data without spending huge amounts of money.

6. What status does archived data have?

Archived data cannot be modified and can therefore no longer be used in the processes of current business operations. It is inseparable from the system in which it was created and can only be accessed and interpreted from this system. If you need historical data for informational purposes, you can access the archived data in read-only mode.

7. Can archived data be reloaded into the database?

From a technical standpoint, archived data can be reloaded into the database. However, because this is historical data that has not taken part in any of the changes made to the database since it was archived, you run the risk of generating inconsistencies in your database. Therefore, we always discourage reloading data back into the database.

8. When should you start archiving your data?

You should begin data archiving before it is too late, that is, before you have exhausted all the alternative measures for improving the condition of your system. Preparation includes planning how big your system needs to be based on your anticipated data volume (sizing), and determining the residence times of your data. The residence time is the amount of time the data must remain in the database. You should also identify and fulfill any audit requirements before you begin archiving your data. Always keep in mind that data archiving slows down the growth of your database; it cannot stop the growth completely. Therefore, the goal of data archiving is to keep your system under control over a long period of time, not to return your system to a controllable state.
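The residence time mentioned above can be illustrated with a small date calculation. This is a sketch under stated assumptions: the function name `residence_time_elapsed`, the choice of the closing date as the starting point, and the 365-day residence time are hypothetical and are not taken from SAP documentation.

```python
from datetime import date, timedelta


def residence_time_elapsed(closed_on: date, residence_days: int, today: date) -> bool:
    """Return True once the data has stayed in the database for its full residence time."""
    earliest_archive_date = closed_on + timedelta(days=residence_days)
    return today >= earliest_archive_date


# Assumed residence time of 365 days, checked on a fixed reference date.
reference_date = date(2008, 6, 19)
print(residence_time_elapsed(date(2007, 1, 1), 365, reference_date))
print(residence_time_elapsed(date(2008, 3, 1), 365, reference_date))
```

Only data whose residence time has elapsed would be considered for an archiving run; everything else stays in the database.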

Storing Archived Data (ADK based)

After archiving the data, you need to decide where to store it. There are different alternatives for storing and keeping ADK-based archive files. Archived data must be stored safely and over long periods of time, because in principle it needs to be accessible at all times. Because the data has been removed from the database and now exists exclusively in archive files, the storage security requirements for archive files are very high.

You have the following options to store archive files:

1. Storage on an SAP certified storage system or content server, using SAP ArchiveLink.

You can store archive files on a storage system, using the SAP ArchiveLink interface. You should only use storage systems that have been certified by SAP for SAP ArchiveLink. SAP ArchiveLink is a communication interface between SAP applications and external components, such as an external storage system. It provides SAP applications with a group of interfaces, services and scenarios, with which documents and business processes can be integrated as easily as possible. SAP ArchiveLink also contains an interface to storage systems.

2. Storage on a Hierarchical Storage Management System:

HSM systems can be accessed as if they were file systems, meaning that you do not need any special interface to be able to use an HSM system for writing or reading data. The HSM software takes over the automatic conversion of the file and folder paths, according to the predefined rules. As a result an HSM system can be accessed by any system as if it were a hard disk of potentially unlimited size. Through the transparent integration of different storage media and architecture, the HSM system presents itself as a single file system.
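Because an HSM system presents itself as an ordinary file system, a program can write and read archive files with plain file operations. The sketch below illustrates this in Python; it is not how ADK itself writes files. The function `store_archive_file` and the file name are hypothetical, and a temporary directory stands in for the HSM mount point.

```python
import tempfile
from pathlib import Path


def store_archive_file(mount_point: Path, name: str, payload: bytes) -> Path:
    """Write an archive file below the mount point using ordinary file I/O."""
    target = mount_point / name
    target.write_bytes(payload)
    return target


# A temporary directory stands in for the HSM mount point (assumption).
with tempfile.TemporaryDirectory() as tmp:
    hsm_mount = Path(tmp)
    stored = store_archive_file(hsm_mount, "sales_2005.archive", b"archived records")
    # Reading back needs no special interface either.
    print(stored.read_bytes())
```

Which physical medium finally holds the file is decided by the HSM software according to its rules, which is why the calling program needs no storage-specific logic.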

3. Manual or alternative storage:

For the manual or alternative storage of archive files, it makes sense to use or reuse storage systems that already exist in the company. You then do not need to make any additional investments in hardware or software. Usually these are large robot systems or jukeboxes that the company already uses. These storage systems can be integrated into the data archiving process either directly or indirectly.

Determining which archiving strategy to use

Option: ADK (Classic archive)

Pros:

- SAP-standard BW solution
- Comprehensive procedure for writing, deleting, and storing BW data
- Allows the use of a blend of cost-effective data storage media
- Platform-independent
- Handles structural changes in the definition of BW elements
- Easy-to-use front end

Cons:

- Intensive from an administrative standpoint
- May not satisfy quick access requirements
- Reloading of data into InfoCube / DSO
- No query access to the archived data
- No adaptability to future changes in data model (Cube/DSO)

Option: NLS (Nearline storage)

Pros:

- SAP-standard BW solution
- Comprehensive procedure for writing, deleting, and storing BW data
- Allows the use of a blend of cost-effective data storage media
- Platform-independent
- Lower cost and improved performance because non-vital data is dropped from database
- Reloading of data into InfoCube / DSO is not required
- Query access to the archived data
- Designer can build to the lowest level of granularity desired
- Size of the BW system is effectively unlimited

Cons:

- Increases training needs
- Outside expertise needed during initial deployment
- Interface maintenance between NLS and BW not currently automated (possible in future release)
- Only available from version NW BI 7.x

Option: PBS archive add-on CBW (ArchiveLink)

Pros:

- PBS is a certified partner for SAP
- Comprehensive procedure for writing, deleting, and storing BW data
- Lower cost and improved performance because non-vital data is dropped from database
- Reloading of data into InfoCube / DSO is not required
- Query access to the archived data
- Size of the BW system is effectively unlimited

Cons:

- Not necessary for version BI 7.x since NLS will give the same benefits
- Outside expertise needed during initial deployment
- Not suitable for storing data warehouse objects requiring fast or frequent access

Option: Augment your infrastructure

Pros:

- All data stays in primary database
- No new software or administrative processes required

Cons:

- Expensive
- Performance eventually degrades
- Administrative functions become more complex and time-consuming

Recommendations:

If the BW business users do not require significant historical data for reporting, or quick access to historical data, then ADK is sufficient. Otherwise, NLS is preferred.
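The recommendation can be written down as a two-input decision rule. The following Python sketch only restates the rule above; the function name `recommend_strategy` and its parameter names are hypothetical, and the sketch deliberately ignores other factors from the comparison, such as the BI release in use.

```python
def recommend_strategy(needs_historical_reporting: bool, needs_quick_access: bool) -> str:
    """Return 'ADK' when neither need applies, otherwise 'NLS'."""
    if needs_historical_reporting or needs_quick_access:
        return "NLS"
    return "ADK"


# Users who rarely look at old data versus users who report on it regularly.
print(recommend_strategy(needs_historical_reporting=False, needs_quick_access=False))
print(recommend_strategy(needs_historical_reporting=True, needs_quick_access=False))
```

A real selection would also weigh the pros and cons listed for each option, so the rule should be treated as a starting point rather than a complete decision procedure.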

7 Key Points to Take Home

- A successful data aging strategy balances the business’ need for data availability with system TCO.

- Data aging plays a key role in Enterprise Data Warehousing scenarios.

- A successful SAP BI data aging strategy begins with data modeling: Data retention must be factored into the blueprinting phase, sizing, and capacity planning.

- SAP BI’s robust data aging toolkit provides both near-line storage and data archiving for offline storage.

- Utilize SAP BW near-line storage as a cost-effective alternative for accessing less active data via SAP BW query.

- Utilize SAP BW flexible data archiving functionality for infrequently accessed data that can be stored offline.

- Keep in mind that your data aging strategy and selection of archiving, storage, retention, and access tools should enable your SAP BW implementation to run smoothly and continue driving your business forward.

About the Authors

Nipun Sharma is a Team Lead – SAP BI at Satyam Computer Services Limited, Bangalore, India. He has worked in the BI space for the past 3 years and has rich experience in development and performance tuning methodologies, delivering Business Intelligence and Data Warehousing solutions to the Retail, Logistics, Utilities, and Finance industries. He holds a Bachelor of Engineering degree in Computer Science.

Satyam SAP-BI Solutions Labs is a body of experienced and qualified SAP-BI consultants. Its main objective is to create cutting-edge technological solutions in the area of SAP-BI and create multiple Service Offerings for Satyam’s Customer base. The consultants of this group also offer functional and technical consulting services to its customers.

About Satyam

Satyam Computer Services Ltd. (NYSE: "SAY"), established in 1987, is a global consulting and IT services company, offering solutions from strategy consulting to implementing IT solutions across various verticals including Manufacturing, Oil and Gas, Telecom, Retail, Transportation, Commercial and Healthcare services.

Satyam's network spans 55 countries across 6 continents. Over 30,000 dedicated and highly skilled IT professionals work in development centers in India, the USA (including one in Parsippany, NJ), the UK, the UAE, Canada, Hungary, Singapore, Malaysia, China, Japan and Australia, and serve over 489 global companies, including over 156 Fortune 500 corporations.

The following picture gives an overview of Satyam’s global presence:

Satyam in SAP Business Intelligence Space

SatyamSAP started its BW practice in 2001 to provide BW/SEM solutions to its global clientele. Today the BW practice has grown to a sizeable team of over 550 associates working on BW engagements globally. The BW practice offers a wide range of Data integration and BI services and solutions.

A primary focus at SatyamSAP is the delivery of Business Intelligence in the SAP environment. We have developed exceptional core competencies in this area. The highlights of the BW practice at SatyamSAP are:

One of the largest BW practices in India

A team of over 550 spread across the Americas, Europe, APAC and Australia

Team consists of Architects and Designers with techno-functional skill sets

The team understands business user requirements and has the ability to convert them into analytics

Supported by a 2,700-strong data warehousing team from Satyam

Experts in other data warehousing tools such as Informatica, BO, etc.

Dedicated SAP BW practice with over 900 person-years of experience

SatyamSAP has a 100% successful BW implementation track record

Consultants providing solutions in BW from inception through the current BW 3.X/BI 2004s release, with extensive knowledge of APO and EBP integration

Dedicated infrastructure – Satyam Technology Center with latest SAP R/3, BW, SEM and APO installations

Expertise in the onsite/offshore execution model for SAP BW projects, including a single large engagement with a team of over 115 people

Satyam Business Intelligence Solutions:

The Business Intelligence solutions that Satyam provides based on the SAP BW platform are:

Fixed-price implementations, upgrades and maintenance of BW

Blended delivery model involving onsite and offshore options to lower costs

BW training in all areas

Phased SEM implementation methodology with a step-by-step approach to show clients benefits with limited risk exposure

End-to-end BW solutions, from project management and design to implementation and maintenance

BW Consolidation and Enterprise Warehousing for corporations

BW Consulting

Data Mining and Archival Services

Copyright

© 2008 Satyam Computer Services Limited. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of Satyam. The information contained herein may be changed without prior notice.