Data domain and enterprise vault



Using Data Domain Storage with Symantec Enterprise Vault 8


White Paper

Michael McLaughlin – Data Domain – Technical Marketing

Charles Arconi – Cornerstone Technologies - Principal Consultant

Data Domain, Inc.
2421 Mission College Boulevard, Santa Clara, CA 95054
866-WE-DDUPE; 408-980-4800

Version 1.0, Revision A
September 5, 2009


Copyright © 2009 Data Domain, Inc. All rights reserved. Data Domain, the Data Domain logo, SISL, and Global Compression are trademarks or registered trademarks of Data Domain, Inc. All other trademarks used or mentioned herein belong to their respective owners. Data Domain products are protected by one or more of the following patents issued to Data Domain. U.S. Patents 6,928,526; 7,007,141; 7,065,619; 7,143,251; 7,305,532; 7,373,464; 7,424,498; 7,434,015 and other patents and patents pending in USA and other countries. Disclaimer: The information contained in this publication is subject to change without notice. Data Domain, Inc. makes no warranty of any kind with regard to this manual, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Data Domain, Inc. shall not be liable for errors contained herein or for incidental or consequential damages in connection with the furnishing, performance, or use of this manual. The instructions provided in this document by Data Domain are for customer convenience and are not warranted or supported by Data Domain, Inc. Data Domain expects users to integrate third-party software and arrays as needed, but Data Domain is not responsible for the usability of the third-party software or the arrays after installation.


Table of Contents

Introduction
Recommendations
Solution Overview
  Symantec
  Enterprise Vault
  Symantec NetBackup
  Microsoft Exchange
  Data Domain
Test Environment
  Exchange Results
  FSA Results
  Discussion of the Results
Enterprise Vault Setup
  Configuring the Vault Store Partition
Data Protection with Symantec NetBackup
  Backup of MS Exchange Data to the Data Domain System
  Backup of File System Data to Data Domain
Using a Data Domain System as Backup and Archive Vault Storage
Recovering EV8 Archives
  Data Domain Replication
  Backup to Third-Party Solutions
Benefits of Data Domain with Enterprise Vault 8
Appendix A. Enterprise Vault Server Load Testing Results
Appendix B. Enterprise Vault Setup Details
Appendix C. References
  Data Domain Links
  Symantec Links


Introduction

In today’s data centers, data continues to grow in almost every area. One very effective way to manage and protect these growing data sets is to add archiving extensions to the existing environment.

Symantec™ Enterprise Vault™ is one of the leading archive applications for Microsoft® Exchange®, Microsoft SharePoint®, Windows file servers, and Lotus® Domino® and Notes®. When managing these applications, the storage choices for primary application data, archive data, and backup data sets present many options for today’s IT administrators.

This white paper covers the use of the Data Domain deduplication storage solution as a target for Symantec Enterprise Vault 8 (EV8) archiving solutions. Specific testing was performed for Microsoft Exchange and Microsoft file servers. Similar testing was not performed for Microsoft SharePoint or Lotus Domino and Notes.

It is not the intent of this paper to exhaustively cover all possible archiving and storage use cases, configurations, or scenarios. The goal of this paper is to examine the specific use and benefits of Data Domain storage with Enterprise Vault 8 with Exchange and File Server Archiving (FSA). Best practices and configuration guidelines will also be covered where they affect the overall solution.

This paper is intended for solution architects and storage administrators involved in the planning and deployment of Symantec Enterprise Vault archive solutions that will be based on Data Domain storage. Some knowledge regarding storage administration, the Symantec Enterprise Vault archiving software, and Data Domain systems is assumed.


Recommendations

Table 1 lists the basic guidelines for deploying a Data Domain storage system with Symantec Enterprise Vault 8.0.

Table 1: Recommendations

1. When planning EV8 server requirements, use Data Domain storage instead of the local disk to remove the deduplication processing burden from the Enterprise Vault servers.

2. For Data Domain vault store partitions, enable the options that indicate that the device has deduplication and compression capabilities.

3. If only email archiving is planned, size the Exchange archive storage requirements as you would for a local disk. The deduplication results from Data Domain will be similar to those achieved natively by Enterprise Vault. Note that the initial (pre-deduplication) data set presented will appear larger on the Data Domain system, but the actual storage used should be similar to that of EV8.

4. Use the Data Domain storage for backup of the Exchange environment as normal. There are no combined storage efficiency effects of backup and archive for Exchange.

5. If only file server archiving is planned, size the FSA archives as you would for a single first-instance full backup of the source file server.

6. Use the Data Domain storage for backup of the file server environment as normal. There will be some combined effect of backup and archive for FSA, roughly equivalent to adding one more full backup to the storage requirements.

7. Create separate directories for the backup and archive functions to improve manageability and measurability. Use CIFS shares to control access from the application perspective.

8. Protect the archive by using Data Domain replication to a secondary site or system. Because archiving is usually a continuous process throughout the day, the replication bandwidth requirements for the archive data are usually spread over the whole day.

9. Use Enterprise Vault Collection and Migration methods for very large archives to improve the manageability and performance of backing up the archives if replication is not possible or is not sufficient for business needs. Follow the Symantec guidelines for this setup.


Solution Overview

The archiving solution explored in this white paper consisted of several interrelated Windows-based systems that would be commonly found in a typical deployment of Enterprise Vault. The test environment included the following Windows Server 2003 (R2 SP2 x86) systems:

• Windows Active Directory (AD) server

• Microsoft Exchange 2003 server

• Microsoft Outlook client servers (2)

• Windows file server

• Symantec NetBackup 6.5 Master/Media server

• Enterprise Vault server (including SQL2005)

The systems listed above constituted the core non-archived application environment for Microsoft Exchange and the CIFS file server. They were built on a dedicated VMware system, with the ESX server hosted on a dual quad-core Intel-based system. The Enterprise Vault server was isolated on a separate physical server, also a dual quad-core Intel-based system. Each physical server had 8 GB of memory and SAN storage.

VMware was used for the non-archive systems to provide a manageable and measurable platform that supported system-level snapshots. The ability to take snapshots of a working configuration and roll back to a defined reference point to test alternate archive solution configurations was valuable for this type of comparison testing. The Enterprise Vault server could be easily reinstalled as needed to provide a clean starting point for measuring both the performance and capacity aspects of the different storage options explored in this effort and covered in this guide.

Note: Testing absolute performance measures was not the goal of this paper or this testing.

Storage for the test environment was provisioned from a local SAN, both for direct-attached disks and VMFS partitions. The Vault Store partitions for the archives were provisioned from local SAN (3PAR) disk as well as from a Data Domain DD690g gateway system.

Figure 1 shows the basic configuration of the test environment used. The test systems were configured mainly for comparison testing. Actual system deployments might vary significantly, but the relative results observed should be similar.


Figure 1: Testing Architecture

Symantec

Enterprise Vault provides a robust platform for information archiving. Different applications, such as email and file sharing, can be easily targeted with flexible archiving capabilities. The archived data can be sent to a variety of integrated storage solutions. Each integrated storage solution might support different underlying capabilities.

The new features in EV8 allow for the selective use of the capabilities of the underlying storage. The two storage options covered in this paper are:

1. Local disk employing the deduplication and compression capabilities built in to EV8.

2. Data Domain storage, which has its own deduplication and compression capabilities.

Symantec Enterprise Vault was used to archive both Microsoft Exchange data and CIFS file server data.

Note: Lotus Domino and Notes and Microsoft SharePoint are not covered.


Enterprise Vault

Symantec Enterprise Vault 8.0.1 (SP2) server was configured on a Windows 2003 server. MS SQL 2005 was also installed on the same EV8 server. Separate disk drives were configured for each of the following Enterprise Vault and SQL server content:

• System and program files

• MSMQ

• SQL database

• SQL logs

• Enterprise Vault indexes

• Enterprise Vault storage (for local disk tests)

The Enterprise Vault server was run on dedicated hardware and not in the VMware environment. The EV8 server storage locations were on SAN. Microsoft SQL 2005 server was also installed on the same physical server as EV8. This followed Symantec’s guidelines for installation because we were only archiving four mailboxes. This allowed us to more easily capture performance data for the combined Vault and SQL processing workloads. This configuration would closely replicate a real-world setup. These choices were made based on Cornerstone’s vast experience deploying Enterprise Vault in large corporate environments.

The CPU load on the EV8 server was monitored with the standard built-in Microsoft "perfmon" utility. Performance data was captured for the periods of archiving activity, including short lead-in and post-processing periods when there was no activity on the server. The results of the "perfmon" monitoring are covered later in this paper and in “Appendix A. Enterprise Vault Server Load Testing Results”.

During each test, we also stopped or turned off the systems in the VMware configuration that were not being tested. Only the AD server and the Exchange server were running during email archiving, and only the AD server and the Windows file server were left running for FSA tests. This limited any impact from the VMware configuration and other processes that might run during the testing periods.

The Enterprise Vault server was configured to archive the Exchange 2003 system and the Windows file server system. One set of archives was run to the local disk on the Enterprise Vault server using the deduplication and compression capabilities within EV8. The configuration was then reset, and the Exchange 2003 and Windows file server systems were archived to the Data Domain storage using the deduplication and compression capabilities of the Data Domain system.

No date or time limits were set so that we could capture everything on the disk. EV8 is designed to share the resources of the source system, so it typically took several passes to complete all data archiving.


Symantec NetBackup

Symantec NetBackup 6.5.4 was used to back up the systems involved in this test environment. We performed four types of full backups:

1. Local files (a sample set on each system was used to validate basic client backup and restore functionality)

2. Exchange Information Store on the Exchange 2003 server

3. CIFS network share of the file server

4. Enterprise Vault backups on the EV8 server

Note: The backups were written to a separate Data Domain system and then later combined for deduplication results comparisons.

Microsoft Exchange

The Microsoft Exchange Server 2003 environment was set up with one Exchange site, two storage groups, and four mailboxes. There was a separate mailbox for each of the four test users. Users 1 and 3 were in storage group 1; users 2 and 4 were in storage group 2. Figure 2 shows the overall Exchange server and site structure and their relationship to the other systems in the test configuration. Standard practices were used in building the Exchange server, such as separating the drives for data and logs.


Figure 2: Exchange Environment

The Exchange server database was loaded with data from real (retired) users’ PST files. There was no plan to model Exchange performance or change rates because these vary from customer site to site. Because multiple PST files were imported into each test user's mailbox, the individual mailboxes were larger than a typical user mailbox, but the overall Exchange content was complex enough for reasonable testing. After the mailboxes were loaded and confirmed to be working, a baseline VMware snapshot was created to provide a repeatable rollback point for comparative testing. Figure 3 shows the general sizing information of the source PST files that we used. This included about 37 PST files totaling about 17 GB of email data loaded into the four test users' mailboxes.


Figure 3: PST Sample Data Sets

As stated earlier, the Exchange server was running in a VMware virtual machine on an ESX server. So were the AD controller and Outlook clients. This allowed us to perform quick system resets between tests.

Data Domain

For the testing in this white paper, we used a DD690g system as the target storage for the Vault Store Partition. The system was a gateway model attached to the same 3PAR storage array that we used for the local disk test case. The disk I/O throughput of the EV8 server when writing to or reading from this system was relatively low. The observed archive processing rates for Exchange and FSA ranged from 30,000 to 40,000 items per hour, where an item was either an Exchange mailbox item or an individual file during FSA. This item processing rate translated into about 2-4 GB per hour of required archive storage throughput. Other configurations could vary and should be evaluated for proper capacity and throughput sizing.

Note: For this comparative testing, neither the local SAN disk nor the Data Domain NAS (CIFS) storage device presented any apparent bottlenecks during the archiving operations.
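The relationship between the item rate and the storage throughput quoted above is simple arithmetic. As a rough sketch (the function name is illustrative and decimal units are assumed), pairing the extremes of the two quoted ranges brackets the average amount of archive data per item that they imply:

```python
def implied_item_size_kb(gb_per_hour: float, items_per_hour: float) -> float:
    """Average archive data per item in KB, assuming 1 GB = 1,000,000 KB."""
    return gb_per_hour * 1_000_000 / items_per_hour


# Ranges quoted in the text: 2-4 GB per hour at 30,000-40,000 items per hour.
smallest_kb = implied_item_size_kb(2, 40_000)  # lowest throughput, highest item rate
largest_kb = implied_item_size_kb(4, 30_000)   # highest throughput, lowest item rate

print(f"Implied average item size: {smallest_kb:.0f}-{largest_kb:.0f} KB")
```

The same relationship can be turned around to estimate the throughput needed for a different item rate or item size, which is one reason other configurations should be sized individually.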


Test Environment

The testing did not focus on absolute performance, capacity, or deduplication results. Instead, we looked at the relative performance of alternate storage configurations and practices. We used small sample sets (10-20 GB) with sufficient structure (150,000-300,000 items) to exercise the email and file server archiving processes for approximately two hours and observe the average rates and behaviors.

The two storage options for the Vault Store Partition being tested were a locally attached SAN disk and a Data Domain DD690g storage system. Both of these storage resources performed well. Proper storage sizing of throughput and capacity for real world deployments is outside the scope of this project. The focus was more on architectural and behavioral considerations of where to perform archive data deduplication within the various solutions.

The basic testing process was similar for both Exchange and File System Archiving, and for both the local disk and Data Domain storage solutions. After the Vault Store Partition of choice was configured, the test followed these steps:

1. Perform a full backup of the source data set with NetBackup (Exchange or Windows files) to establish a baseline for the content size and composition.

2. Run an EV8 Archive Report (no data movement or shortcuts) to confirm proper connectivity and configuration.

3. Run an EV8 Archive task to begin the actual archive operation of moving data from the source to the target Vault Store Partition. The schedule was open for a full day (24 hours) and required a few restarts to facilitate the expedient archiving of the source data. (Normal operations would proceed over multiple days as needed during specific windows.) During this task, shortcuts were prepared but the original data was not removed until a subsequent step.

4. Perform another full backup of the source data set with NetBackup (Exchange or Windows files) to identify any internal changes to the source data that may have occurred.

5. Set the EV8 trigger file to allow the shortcut process to proceed. This provides a simple way to indicate that the data has been backed up or is secure and can be removed from the source data set.

6. Run an EV8 Shortcut task to process the source data set and replace original content (messages and files) with appropriate application shortcuts. This may require one or more restarts of the task or trigger file to complete the entire shortcut operation. (Under normal operations this activity would synchronize and run over several days.) The load on EV8 for this is the same for either storage type because no Vault Store Partition data is processed.

7. Perform another full backup of the source data set with NetBackup (Exchange or Windows files) to establish the new reduced data set size and backup operation.
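Step 5 lends itself to scripting. The following is a minimal sketch only, assuming that the trigger file is an empty file placed in the root folder of the Vault Store partition. The file name used here is an assumption that should be checked against the Symantec documentation for the installed release, and the function name is hypothetical.

```python
from pathlib import Path

# Assumption: this is the trigger file name expected by Enterprise Vault.
# Verify it against the Symantec documentation before relying on it.
TRIGGER_FILE_NAME = "IgnoreArchiveBitTrigger.txt"


def set_trigger_file(partition_root: str, name: str = TRIGGER_FILE_NAME) -> Path:
    """Create an empty trigger file in the root folder of a Vault Store partition."""
    root = Path(partition_root)
    if not root.is_dir():
        raise FileNotFoundError(f"Partition root not found: {root}")
    trigger = root / name
    trigger.touch(exist_ok=True)  # an existing trigger file is left in place
    return trigger
```

In the sequence above, such a call would run only after the backup in step 4 has completed successfully, because the trigger file signals that the archived data is secure.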

After creating shortcuts, we confirmed that the archiving process had completed by checking the Enterprise Vault reports. We also opened each mailbox to confirm that shortcuts had been created.

The archiving task was run to archive the data and create shortcuts. We chose to create shortcuts because this would closely replicate customer environments and performance in the real world. We set the number of Exchange server mailbox items to archive per pass at the maximum of 10,000 so that we could capture non-stop archiving in perfmon and finish in a timely manner. See Figure 4.

Figure 4: Archiving Task Properties

For the email archiving tests, after we were satisfied that all of the mail data had been archived, we defragmented the Exchange databases by creating new stores and moving the mailboxes to them. The archiving and shortcut processes move data out of the Exchange database and leave much smaller shortcut links behind, but without this administrative step the old Exchange databases would still occupy the same space on disk while being less full than before. This is a common administrative task that would occasionally be required in production deployments to take advantage of the reduction in the data maintained by Exchange.

The file archiving policy was set to capture all file types. No date or time limits were set; this let us capture everything on the disk. For Exchange, the archiving was only run against mailboxes and not public folders or journaling.

Exchange Results

Table 2 shows the size of each mailbox and storage group before archive processing was run.

Table 2: Exchange Mailbox Sizes (Pre Archive)

User   Mailbox Size (KB)   Mailbox Items   Storage Group   STG Size (KB)
1      3,517,485           81,000          STG 1           9,558,216
3      4,102,035           58,519          STG 1
2      2,563,699           70,107          STG 2           7,610,568
4      3,344,764           76,462          STG 2

Table 3 shows the change in database size after archiving the mailbox content. The Exchange databases were reduced from about 17.2 GB to 2.3 GB of data on disk.

Table 3: Exchange Mailbox Size (Post Archive)

User   Mailbox Size (KB)   Mailbox Items   Storage Group   STG Size (KB)
1      382,042             80,993          STG 1           1,375,304
3      442,927             58,510          STG 1
2      171,476             70,100          STG 2           923,720
4      224,395             76,450          STG 2
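The totals quoted for the Exchange databases can be reproduced from the storage group sizes in Table 2 and Table 3. The short check below assumes that the table values are in KB (the reading under which they sum to the 17.2 GB and 2.3 GB quoted in the text) and uses decimal units; the variable names are illustrative.

```python
# Storage group sizes from Table 2 and Table 3, assumed to be in KB.
pre_archive_kb = {"STG 1": 9_558_216, "STG 2": 7_610_568}
post_archive_kb = {"STG 1": 1_375_304, "STG 2": 923_720}


def total_gb(sizes_kb: dict) -> float:
    """Sum storage group sizes and convert KB to GB (1 GB = 1,000,000 KB)."""
    return sum(sizes_kb.values()) / 1_000_000


pre_gb = total_gb(pre_archive_kb)
post_gb = total_gb(post_archive_kb)
reduction_pct = (1 - post_gb / pre_gb) * 100

print(f"{pre_gb:.1f} GB -> {post_gb:.1f} GB ({reduction_pct:.1f}% reduction)")
```

Under that assumption the result is about 17.2 GB before and 2.3 GB after archiving, a reduction of roughly 87%.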


FSA Results

Figure 5 and Figure 6 show the results for the FSA testing. The figures show the file server data set properties before and after the archive run. The archiving operation reduced the original 11.7 GB of files to 670 MB of on-disk storage.

Figure 5: File Server Data Size (Pre Archive)

Figure 6: File Server Data Size (Post Archive)


Discussion of the Results

The file system archiving and Exchange mailbox archiving were performed with both local storage and Data Domain as the target for the Vault. This gave us perfmon data on the difference in processing overhead between using Symantec’s deduplication engine and offloading the deduplication to the Data Domain system. The files used for archiving were gathered from sampled corporate data to provide more realistic examples. The total data size was 10.6 GB across more than 160,000 files.

To correctly repeat the tests so that the results would be based on identical data, we rolled the Exchange environment back by using the VMware snapshot functionality. We followed these steps to roll back the physical hardware running the Enterprise Vault server and SQL:

1. Uninstall Enterprise Vault.

2. Drop the databases from SQL.

3. Provision the desired storage location.

4. Reinstall Enterprise Vault.

5. Choose the storage location for the Vault or archive based on the test we were running: either to local storage or to the Data Domain system.

Enterprise Vault Setup

Setting up a storage partition is outlined in detail in the Enterprise Vault Administration Guide. Some important considerations are as follows:

1. The location of the partition in relation to end users of the archiving system. This is critical for performance. When a user retrieves a placeholder or shortcut, the data is retrieved from the Vault. If the partition is located across a WAN or even a slow network, the time to restore will increase.

2. The I/O performance of the storage location. This will have an impact that is similar to a slow network connection. Also, poor I/O performance can create a situation of diminishing returns as user requests increase.

3. The options chosen during the new partition setup. During the partition setup, you need to choose whether the storage device or EV8 handles deduplication and compression. If you are adding a Data Domain device as the partition target, we recommend that you choose "Device performs data deduplication" and "Device performs data compression" as shown in Figure 9: "Vault Store Partition Options".


4. The Data Domain storage solution does not impose a specific partition size. Keep in mind that backups of the partitions may be required. Examine the Collection and Migration configuration options to manage the structure and amount of data that needs to be backed up with each pass.

5. Design for failover. If you are using Symantec’s Building Blocks failover method, make sure that you use UNC paths to the storage. This is important because the path to the storage is maintained in SQL; if a failover occurs, the new directory server will get storage path information from SQL.

For additional details about specific options and choices in the Enterprise Vault setup, refer to “Appendix B. Enterprise Vault Setup Details”.
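The UNC path requirement in consideration 5 is easy to check before a partition is created. The helper below is a small illustrative sketch using Python's standard pathlib module; the function name and the server, share, and folder names in the usage example are hypothetical.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    """Return True for a UNC path (server and share) rather than a drive letter."""
    # For a UNC path, PureWindowsPath reports the server and share as the drive,
    # which therefore starts with two backslashes.
    return PureWindowsPath(path).drive.startswith("\\\\")
```

For example, `is_unc_path(r"\\dd690\ev_archive\store1")` is true, while a drive-letter path such as `E:\EVStore\store1` fails the check because it is only meaningful on the server where that drive is defined.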

Configuring the Vault Store Partition

For this testing, the main changes in the Enterprise Vault configuration were:

• The location of the Vault Store partition

• The properties of that Vault Store partition

Everything else was set the same to either reflect real-world usage or to meet the specific monitoring and testing requirements outlined in this document.

Figure 7 through Figure 9 show the steps required to choose between the Data Domain storage and the local disk (NTFS) alternative. The first step is to name the new Vault Store Partition and mark it as Open (or active). The name of the Vault Store Partition is the logical name that appears in the Enterprise Vault Management tool. See Figure 7.


Figure 7: Create Vault Store Partition

The next step is to select the appropriate Storage type (Figure 8). Testing all of the possible Enterprise Vault Storage types is outside the scope of this paper. Our interest was mainly in comparing the Data Domain system to a baseline type of local disk (NTFS Volume).

Figure 8: Select Vault Store Partition Type


For either Data Domain or NTFS (local disk), you can specify whether the storage supports deduplication and compression. For Data Domain storage, select these options to let the Data Domain system perform the deduplication and compression. For a local disk, deselect these options so that the EV8 server performs deduplication and compression.

Figure 9: Vault Store Partition Options
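The choice described above can be summarized as a small lookup. The sketch below is purely illustrative: it models the two options from Figure 9 as a data structure and is not an Enterprise Vault API. The class name, function name, and storage type labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionOptions:
    """Illustrative model of the two options in Figure 9 (not an EV8 API)."""
    device_performs_deduplication: bool
    device_performs_compression: bool


def options_for(storage_type: str) -> PartitionOptions:
    """Return the settings described in the text for a given storage type."""
    if storage_type == "data_domain":
        # The Data Domain system deduplicates and compresses the archive data.
        return PartitionOptions(True, True)
    if storage_type == "ntfs_local":
        # Options deselected: the EV8 server performs both functions itself.
        return PartitionOptions(False, False)
    raise ValueError(f"Unknown storage type: {storage_type}")
```

Keeping both options aligned for a given partition avoids the case where data is processed twice or not at all, which is the point of the recommendation.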

Data Protection with Symantec NetBackup

One of the benefits of archiving application data is the opportunity to reduce the backup (and restore) workloads as data volumes increase. For this part of the testing, we used Symantec NetBackup as the backup application. We performed full backups on the MS Exchange server and the Windows file server both before and after the archive tasks had removed the content from the original locations.

Backup of MS Exchange Data to the Data Domain System

For the sample set we used, the Exchange databases were reduced by approximately 87%. About 15 GB of the original 17.2 GB was archived, with about 2.3 GB remaining. The backup times decreased from 20-30 minutes to 2-3 minutes. The Exchange backup was much faster and smaller after the archiving process finished and the Exchange storage groups (STG) were compacted. This is expected because the archiving task removes a significant amount of data and the new "edb" (Exchange database) files are smaller and easier to protect.

Backup of File System Data to Data Domain

Archiving reduced the file server backup set size by almost 95%. The archiving operation moved approximately 11 GB of data from the original 11.7 GB file system, leaving about 900 MB of data as shortcuts. The backup of the file server was much smaller, but not much faster. This is mainly the result of the composition of the file system. There are still over 160,000 files to traverse to complete a full backup. Even though the size dropped significantly as a result of the archiving operation, traversing and opening all of the files consumed most of the backup time in this case.
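The percentages quoted for the Exchange and file server data sets can be checked with simple arithmetic. The following is a minimal sketch that uses only the rounded figures given in the text, so the results are approximate; the function and variable names are illustrative and not part of any tool used in the testing.

```python
def archived_percent(archived_gb: float, original_gb: float) -> float:
    """Return the share of the original data set that was archived, in percent."""
    if original_gb <= 0:
        raise ValueError("original_gb must be positive")
    return 100.0 * archived_gb / original_gb


# Rounded figures quoted in the text (approximate by nature).
exchange_pct = archived_percent(15.0, 17.2)     # Exchange: about 15 GB of 17.2 GB
file_server_pct = archived_percent(11.0, 11.7)  # File server: about 11 GB of 11.7 GB

# Exchange full backup time: 20-30 minutes before, 2-3 minutes after archiving.
# The bounds pair the fastest "before" with the slowest "after" and vice versa.
exchange_speedup_low = 20 / 3
exchange_speedup_high = 30 / 2

print(f"Exchange data archived:    {exchange_pct:.1f}%")
print(f"File server data archived: {file_server_pct:.1f}%")
print(f"Exchange backup speedup:   {exchange_speedup_low:.1f}x to {exchange_speedup_high:.1f}x")
```

The file server shows the larger size reduction, but because the number of files to traverse is unchanged, its backup time does not improve in the same proportion.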

Using a Data Domain System as Backup and Archive Vault Storage

One of the primary objectives of the testing and analysis was to determine if there were any combined effects of using the Data Domain as both a backup and an archive platform.

The Data Domain storage can be used as a backup target by following the normal configuration methods for the backup application and the desired protocol. The testing conducted for this paper used the CIFS protocol and mounted a separate backup folder on the Data Domain system as a NetBackup Disk Storage Unit to the Windows Master/Media server.

For integration with Enterprise Vault, we created another folder to hold the EV8 Vault Store partitions. These were configured by making a few selections during the creation of the Vault Store partition.

When we looked for deduplication within data sets for the Exchange environment, we found that the various formats used for storing Microsoft Exchange data during its life cycles did not yield any significant degree of commonality. Most of the data that goes into Exchange is in the form of messages (text) and attachments (various document types). This data is transformed in subtle or sometimes not so subtle ways as it is managed. As a result, backing up the original Exchange source data to the same Data Domain system had a negligible combined deduplication effect on Exchange archiving.

The format of the mailbox contents and the underlying Exchange database files is sufficiently different from the original content to limit deduplication results. Likewise, the formats of the exported PST files, the NetBackup backup images extracted using the Exchange MAPI interface, and the Enterprise Vault dvs files are also each different enough to reduce the deduplication effect to approximately 2x for the first instance. Most of this deduplication is achieved through local compression capabilities. Many of the above formats (with the exception of backup images) are not repeated, so the amount of potential deduplication is limited.

We also observed that for archive-only scenarios, the final amount of real storage used to hold the Vault Store Partition archive data for either storage solution (the Data Domain system or the local disk) was about the same. These results were relative to the specific scenario of Exchange or file server (FSA).

There were, however, noticeable benefits when using the Data Domain system as a File Server Archive (FSA) storage platform in conjunction with the backups of the original file server content. Because the underlying file content does not change as much between representations (see the Exchange discussion earlier in this paper), there are opportunities for additional deduplication of the two data sets.

What we noticed was that, when the original files were backed up to the Data Domain system before being archived, the archive data sets deduplicated about 3x further than if the data had not been backed up to the Data Domain system. The details for this specific test are shown in Table 4.

Table 4: Deduplication Comparison

| Data Set | Composition |
| Original file composition | 11.7 GB (~170,000 files/folders) |
| Archived file composition | 0.9 GB (~170,000 files/folders) |
| Archive data set on local disk | 5.9 GB (~248,000 files/folders) [EV8 deduplication] |
| Archive data set on Data Domain (logical size) | 12.8 GB (~247,000 files/folders) |
| Deduplicated data set (disk used - Archive only) | 5.6 GB |
| Deduplicated data set (disk used - Backup+Archive) | 1.5 GB |
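The ratios implied by Table 4 can be worked out directly. The sketch below assumes that the two "disk used" rows give the physical space attributable to the archive data set in each scenario; that interpretation, and all names in the code, are assumptions made for illustration.

```python
# Figures from Table 4, in GB.
ARCHIVE_LOGICAL_GB = 12.8                # Archive data set on Data Domain (logical size)
DISK_USED_ARCHIVE_ONLY_GB = 5.6          # Disk used, archive only
DISK_USED_BACKUP_PLUS_ARCHIVE_GB = 1.5   # Disk used, backup taken before archiving


def reduction_ratio(larger_gb: float, smaller_gb: float) -> float:
    """Return how many times smaller the second size is than the first."""
    if smaller_gb <= 0:
        raise ValueError("smaller_gb must be positive")
    return larger_gb / smaller_gb


archive_only_ratio = reduction_ratio(ARCHIVE_LOGICAL_GB, DISK_USED_ARCHIVE_ONLY_GB)
combined_ratio = reduction_ratio(ARCHIVE_LOGICAL_GB, DISK_USED_BACKUP_PLUS_ARCHIVE_GB)
additional_benefit = reduction_ratio(DISK_USED_ARCHIVE_ONLY_GB,
                                     DISK_USED_BACKUP_PLUS_ARCHIVE_GB)

print(f"Archive only:            {archive_only_ratio:.1f}x")
print(f"Backup then archive:     {combined_ratio:.1f}x")
print(f"Additional deduplication: {additional_benefit:.1f}x")
```

Under these assumptions, the additional deduplication works out to somewhat more than the "about 3x" summarized in the text, which is consistent with that figure being a conservative rounding.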

Recovering EV8 Archives

There are multiple ways to protect and recover the Enterprise Vault archives. The typical method is to properly back up the Enterprise Vault configuration with your backup application by following the guidelines in the Symantec Enterprise Vault documentation. These methods are good for protecting both the Enterprise Vault server configuration and the archives themselves.

Data Domain Replication

With Data Domain storage, you can also leverage the storage system replication capabilities and have the folders that are used as Vault Store Partition locations replicated to another Data Domain system. This replication is transmitted over IP. While this does not protect the server or the total primary site archive configuration, it does provide an additional method for securing the archive data sets.

With this approach, you have two additional methods of recovery should the archive data (or main archive storage system) fail:

• Method one is to repurpose the second (replica) Data Domain system in place of the first. This can be accomplished with simple system naming and network changes. The replica system Vault Store Partition will appear on the Enterprise Vault server just as the original CIFS share appeared.

• Method two would require you to reverse the replication process and have the original data sent back to the (repaired) Data Domain system. The advantage of this approach is that the backup copy on your replica does not get put into service and can still function as a recovery system when needed.

Backup to Third-Party Solutions

If you are going to back up the archive to a third-party solution such as NetBackup, it is important to set the Enterprise Vault Collection and Migration options properly. The Data Domain system does not have or enforce specific volume sizes, so a single archive on the Data Domain system could grow very large. This can become problematic for backup applications because there is no easy way to protect a large data set of many, many small files, such as an archive.

The Collection and Migration methods within Enterprise Vault provide another way to help manage archive sizes. You can adjust the sizing options to cause the archive content to get collected in fewer and larger containers that are better suited for backup handling. Details about choosing the best settings are outside the scope of this paper; the best choice in a given situation will depend on data set composition, workloads, retention requirements, and backup performance. The default size is 10 MB and the default age is 10 days for the Collection function.
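To illustrate why collecting content into fewer, larger containers helps a backup application, the following is a deliberately simplified model. It is not Enterprise Vault's actual collection algorithm: the greedy packing rule, the `ArchivedFile` type, and the `plan_collections` function are all hypothetical. Only the 10 MB size and 10 day age defaults come from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ArchivedFile:
    name: str
    size_mb: float
    age_days: int


def plan_collections(files: List[ArchivedFile],
                     max_size_mb: float = 10.0,
                     min_age_days: int = 10) -> List[List[ArchivedFile]]:
    """Greedily pack eligible files, in order, into containers of limited size.

    Hypothetical model: a file is eligible once it is at least min_age_days old.
    A file larger than max_size_mb ends up alone in its own container.
    """
    containers: List[List[ArchivedFile]] = []
    current: List[ArchivedFile] = []
    current_size = 0.0
    for f in files:
        if f.age_days < min_age_days:
            continue  # too recent to be collected in this model
        if current and current_size + f.size_mb > max_size_mb:
            containers.append(current)  # close the full container
            current, current_size = [], 0.0
        current.append(f)
        current_size += f.size_mb
    if current:
        containers.append(current)
    return containers


# Synthetic example: 100 old files of 0.5 MB each, plus 5 recent files.
files = [ArchivedFile(f"old{i}.dvs", 0.5, 30) for i in range(100)]
files += [ArchivedFile(f"new{i}.dvs", 0.5, 2) for i in range(5)]

default_plan = plan_collections(files)                    # 10 MB containers
larger_plan = plan_collections(files, max_size_mb=50.0)   # 50 MB containers

print(len(default_plan), "containers at 10 MB;", len(larger_plan), "container(s) at 50 MB")
```

In this model, raising the container size reduces the number of objects the backup application must traverse, which is the effect the sizing options are meant to achieve. The right values in practice depend on the factors listed above.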

Figure 10 and Figure 11 show the forms where you set the relevant Enterprise Vault properties for Vault Store Partitions.

Figure 10: Enterprise Vault Collections

Figure 11: Migrate to NetBackup

Benefits of Data Domain with Enterprise Vault 8

Based on the testing performed for this paper as well as a review of real-world use cases, we found that Data Domain storage provides an excellent platform for both backup and archive applications. In addition to the data deduplication results that can be obtained in normal backups, the archive scenarios showed that server workloads can be reduced by moving the deduplication function to the storage device.

By using Data Domain storage for the archive location:

• The load on the EV8 server was about half as much as when using local disk. This was achieved by performing deduplication and compression on the Data Domain system instead of using the EV8 built-in functions. See “Appendix A. Enterprise Vault Server Load Testing Results”.

The Data Domain storage system also has unique benefits from a reliability and recoverability perspective that may not be available with more traditional storage alternatives. The Data Invulnerability Architecture brings much more than just RAID6 functionality to maintaining the integrity of data written to the storage. This is critical for long-term archive solutions where the information will be stored unchanged for greater periods of time.

In addition to traditional recovery methods such as storage backups, Data Domain systems provide an optimized replication capability to ensure that you have a working copy of the backups and archives offsite.

With proper sizing for capacity and performance, Data Domain storage can be an effective addition to both the backup and archive tasks in today’s IT enterprises.

See “Appendix C. References” for links to more details and additional Data Domain resources.

Appendix A. Enterprise Vault Server Load Testing Results

This appendix describes the detailed results that we obtained during the Enterprise Vault server perfmon load test analysis. Figure 12 shows the average CPU load of the EV8 server while archiving email from the Exchange server. The EV8 server was responsible for deduplicating and compressing the data as it wrote the final archives to the local disk storage target. As Figure 12 demonstrates, the EV8 server was not fully loaded and ran this process with about a 23% CPU load.

Figure 12: EV8 with Local Disk (Exchange Archiving)
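Average CPU figures of this kind can be derived from a perfmon log once it has been exported to CSV (for example with the Windows `relog` utility). The following is a minimal sketch, assuming a CSV export whose header names the counter with the usual `\% Processor Time` suffix; the host name `EV8SRV` and the sample values are made up for illustration and are not the measurements from this testing.

```python
import csv
import io
from typing import Iterable, List


def load_samples(csv_lines: Iterable[str],
                 counter_suffix: str = r"\% Processor Time") -> List[float]:
    """Return the numeric samples of the first column whose name ends with counter_suffix."""
    reader = csv.reader(csv_lines)
    header = next(reader)
    matches = [i for i, name in enumerate(header) if name.endswith(counter_suffix)]
    if not matches:
        raise ValueError(f"no column ends with {counter_suffix!r}")
    col = matches[0]
    samples: List[float] = []
    for row in reader:
        if col >= len(row):
            continue  # short or empty row
        try:
            samples.append(float(row[col]))
        except ValueError:
            continue  # blank sample, common on the first row of a log
    return samples


def average(samples: List[float]) -> float:
    if not samples:
        raise ValueError("no samples")
    return sum(samples) / len(samples)


# Synthetic example data in the general shape of a perfmon CSV export.
SAMPLE_LOG = r'''"(PDH-CSV 4.0) (Pacific Standard Time)(480)","\\EV8SRV\Processor(_Total)\% Processor Time"
"09/01/2009 10:00:00.000"," "
"09/01/2009 10:00:15.000","22.5"
"09/01/2009 10:00:30.000","23.5"
'''

samples = load_samples(io.StringIO(SAMPLE_LOG))
average_cpu = average(samples)
print(f"Average CPU load: {average_cpu:.1f}% over {len(samples)} samples")
```

When a run is split across several log files, the sample lists from each file can be concatenated before averaging so that the gap between logs does not distort the result.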

Figure 13 shows the performance of the same server during the Exchange archive that used the Data Domain storage (which performed the deduplication and compression). There is a significantly lower CPU workload of around 10%. Allowing for some EV8 task restarts, the archive tasks took about the same amount of time to complete. Our conclusion is that the workloads on the Enterprise Vault server and associated SQL services can be reduced by about 50% if Data Domain storage is used instead of the local disk.

Figure 13: EV8 with Data Domain (Exchange Archiving)

A similar CPU load result was observed during the FSA archiving. Figure 14 and Figure 15 show EV8 server CPU load levels when archiving files to local disk (divided into two parts). Figure 16 shows the CPU load when the archive was written to the Data Domain storage. There is roughly a 16-28% load with the local disk and only about a 10% load when using the Data Domain system as the Vault Store Partition.

Figure 14: EV8 with Local Disk (FSA - Part 1)

Note: Part 1 and Part 2 were required because of restarts of EV8 archive tasks and the time gap in the perfmon log files.

Figure 15: EV8 with Local Disk (FSA - Part 2)

Except for one small peak, the CPU load of the EV8 server when using the Data Domain storage was significantly lower (less than half) than the load when the EV8 server itself performed deduplication and compression of the data stream.

Figure 16: EV8 with Data Domain (FSA)
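The relative reduction in EV8 server CPU load follows from the approximate averages quoted in this appendix. This is a minimal sketch using those rounded values only; the function name is illustrative, and the inputs are approximate readings rather than exact measurements.

```python
def relative_reduction_percent(baseline_pct: float, new_pct: float) -> float:
    """Return how much lower new_pct is than baseline_pct, as a percentage of baseline."""
    if baseline_pct <= 0:
        raise ValueError("baseline_pct must be positive")
    return 100.0 * (baseline_pct - new_pct) / baseline_pct


# Exchange archiving: about 23% CPU with local disk, about 10% with Data Domain.
exchange_reduction = relative_reduction_percent(23.0, 10.0)

# FSA archiving: roughly 16-28% CPU with local disk, about 10% with Data Domain.
fsa_reduction_low = relative_reduction_percent(16.0, 10.0)
fsa_reduction_high = relative_reduction_percent(28.0, 10.0)

print(f"Exchange archiving: {exchange_reduction:.1f}% lower CPU load")
print(f"FSA archiving:      {fsa_reduction_low:.1f}% to {fsa_reduction_high:.1f}% lower CPU load")
```

These values bracket the "about 50%" reduction stated in the conclusions.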

Appendix B. Enterprise Vault Setup Details

This appendix contains screen shots of specific EV8 option settings.

Figure 17: EV8 Archiving Policy Settings

Figure 17: EV8 Archiving Policy Settings

Appendix C. References

Data Domain Links

Data Invulnerability Architecture

• http://www.datadomain.com/products/DIA.html

SISL™

Domain Technology

• http://www.datadomain.com/products/technology.html

Avoiding the Disk Bottleneck in the Data Domain Deduplication File System (FAST08 Technical Paper)

• http://www.usenix.org/events/fast08/tech/full_papers/zhu/zhu.pdf

Symantec Links

• ftp://ftp.support.veritas.com/pub/support/products/Exchange_Mailbox_Archiving_Unit/276547.pdf