Replication with ETERNUS CS800 - Best Practice Guide


With Fujitsu ETERNUS CS800, customers benefit from all the advantages of disk-based backup solutions, including deduplication. Through deduplication, ETERNUS CS800 reduces the required backup capacity by up to 95%.


BEST PRACTICE GUIDE - REPLICATION [MARCH, 2011]


OVERVIEW

Replication is a feature of the ETERNUS CS800 Series de-duplication appliances that uses TCP/IP over Ethernet to efficiently transport a complete copy of user data residing on one ETERNUS CS800 (“the source”) to another ETERNUS CS800 (“the target”). High efficiency is achieved by transporting only the unique data blocks plus metadata from source to target.

SCOPE

Intended Audience: End Users, System Engineers, RTS, Resellers

This document provides best practice guidance for configuring replication between ETERNUS CS800 de-duplication appliances. It is not intended to be a standalone document.

OBJECTIVE

The value of replication is Disaster Recovery (DR).

■The target ETERNUS CS800 can fail back a copy of the data to the same or another ETERNUS CS800.

■The target ETERNUS CS800 may be used to directly access the user data at the DR site.

■The target ETERNUS CS800 may be physically relocated to another server location for access to the user data.

DEFINITION OF TERMS

A variety of replication terminology is used in this document. This document makes every attempt to use the same terminology as introduced in the ETERNUS CS800 User’s Guide.

■Adaptive de-duplication – The mode of de-duplication that allows data de-duplication to run concurrently with the backup being ingested. The de-duplication process adapts to the speed of the ingest.

■Backup Window – In normal use, “backup window” refers to the customer-defined period during which customer data is backed up. It usually has a clearly identifiable start and stop time. When used together with deferred de-duplication in an ETERNUS CS800 context, “backup window” refers to a “reservation window” during which de-duplication is suspended so that all disk I/O can be devoted to data ingest, which shortens the normal user backup window. To avoid confusion about which “backup window” is being discussed, this document refers to the ETERNUS CS800 deferred de-duplication backup window as the “deferred de-duplication window”.

■Deferred de-duplication – The mode of de-duplication which begins only after the deferred de-duplication window. Typically, deferred de-duplication begins after the backup ingest is complete.

■Deferred de-duplication window – A defined window during which no de-duplication will take place. This allows maximum system resources to be devoted to data ingest, thus allowing a faster backup. The deferred de-duplication window applies only to the share/partition for which it is defined. It is possible to define a second share/partition and perform backups that overlap the same time period. The data written to the share without a defined deferred de-duplication window will be subjected to adaptive de-duplication.

■De-duplication pool – The term used to refer to the collection of unique data stored in a CS800 de-duplication appliance. The size of the de-duplication pool is reported as the After Reduction statistic on the ETERNUS CS800 GUI and is a measure of the disk space occupied by all data backed up to ETERNUS CS800 after the data has been de-duplicated and compressed.

■Failback – The ETERNUS CS800 procedure that uses replication to copy a replicated share or partition from a target ETERNUS CS800 to another ETERNUS CS800 system.


■File or cartridge replication – File or cartridge replication (FCR) extends continuous and namespace replication from the share/partition level down to the file-directory/virtual cartridge level. FCR can be used to synchronize the content of a share or partition that is concurrently accessible at both the source and the target ETERNUS CS800.

■Namespace – The term that Fujitsu applies to metadata required to reconstruct de-duplicated data back into its native application format. It is used in phrase combinations such as “namespace replication” or “synchronize the namespace.”

■Partition – An ETERNUS CS800 storage destination for data transferred by FC or iSCSI where the structure is considered to be a virtual tape library (VTL) and the content is written to virtual tape cartridges.

■Recover – The ETERNUS CS800 procedure that makes replicated data and its namespace accessible on the ETERNUS CS800 to which it was replicated. If a share was replicated, then a share is recovered. If a partition was replicated, then a partition is recovered. It is not possible to convert a share to a partition (or vice versa) during the recovery procedure.

■Share – An ETERNUS CS800 storage destination for data transferred by NAS where the content is treated as files and directories.

■Source – The term often applied to the ETERNUS CS800 that is sending a copy of de-duplicated data to a second ETERNUS CS800.

■Synchronize – When used in this document, this term means that two entities are made and/or confirmed to be identical. For example, namespace replication will synchronize the relevant share and/or partition content and metadata between source and target system. When used in the context of “virtual tape cartridge”, “file”, or “directory”, “synchronize” operates at the more granular level of that context (for example, “synchronize cartridges”) between source and target. Consult APPENDIX A – Directory/File or Cartridge Replication for more information about File or Cartridge Replication (FCR) and synchronizing at the more granular level.

■Target – The term often applied to the ETERNUS CS800 that is receiving a copy of de-duplicated data.

REQUIREMENTS FOR REPLICATION

■De-duplicated data – The data must be de-duplicated before it can be replicated. The user can create a NAS share or a VTL partition and specify that data written to that share/partition be de-duplicated.

■Specified data – The user must specify, on the source system, that a particular share/partition is to be replicated.

■Sufficient bandwidth – A circuit of sufficient bandwidth must be available to link the source to the target. Both ends of the circuit require TCP. The user has a variety of circuit options available.

■Specified replication target – Consult the ETERNUS CS800 User’s Guide for procedural details.

a) The user must first tell the target that it should allow replication from the source system. This is done at the target.

b) Next, the user must tell the source ETERNUS CS800 the name or IP address of the target device. The source system will immediately check whether the target is reachable and whether replication to that target has been authorized at the target.

■Schedule – Implement a schedule for routine namespace replication between source and target. This is optimally scheduled to take place after both the backup and de-duplication have completed.

WHAT DATA CAN BE REPLICATED?

Although ETERNUS CS800 can store both de-duplicated and non-de-duplicated data at the same time on the same appliance, only de-duplicated data can be replicated. Data to be replicated must be written to a share/partition that is configured for both de-duplication and replication. Shares/partitions must be configured for de-duplication at the time they are created; de-duplication cannot be added or removed afterwards. Replication can be enabled or disabled per share or per partition even after the share/partition is created, as long as it was created with de-duplication enabled.
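The configuration rules above can be summarized in a small model. This is a minimal sketch only: the `StorageTarget` class and its attribute names are hypothetical and chosen for illustration; they do not correspond to any ETERNUS CS800 interface.

```python
class StorageTarget:
    """Hypothetical model of a share or partition (not a CS800 API)."""

    def __init__(self, name, dedup_enabled):
        self.name = name
        # De-duplication is chosen at creation time and cannot change later.
        self._dedup_enabled = dedup_enabled
        self.replication_enabled = False

    @property
    def dedup_enabled(self):
        # Read-only: there is deliberately no setter.
        return self._dedup_enabled

    def set_replication(self, enabled):
        # Replication can be toggled at any time, but only for
        # shares/partitions created with de-duplication enabled.
        if enabled and not self._dedup_enabled:
            raise ValueError(
                f"{self.name}: only de-duplicated data can be replicated"
            )
        self.replication_enabled = enabled
```

In this sketch, trying to enable replication on a share created without de-duplication raises an error, which mirrors the restriction described in the text.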

HOW DOES ETERNUS CS800 REPLICATION WORK?

ETERNUS CS800 replication has two phases that work together to synchronize copies between the source and target ETERNUS CS800. Both phases, continuous replication and namespace replication, are required to maintain synchronization. Continuous replication moves unique blocks in a background process, while namespace replication synchronizes the metadata between the source and target.

Continuous Replication

■Continuous replication does not have its own enable/disable command. As long as replication is enabled, continuous replication will seek to replicate the unique data blocks between source and target, but only for shares/partitions that have de-duplication and replication enabled.

■As the de-duplication process discovers new unique data (data that isn’t already in the local de-duplication pool), it puts a reference to that data in a queue for continuous replication to process.

■Continuous replication, while processing the queue, asks the target ETERNUS CS800 if it already has a copy of the recently-stored unique data. If the target responds that it already has a copy of that data, continuous replication moves on to the next entry in the queue. If the target responds that it does not have a copy of that unique data, continuous replication is responsible for moving a copy of that unique data to the target.

■Continuous replication is extremely efficient because it only sends inquiries to the target if there is new unique data on the source. That is, there is no need to inquire about data previously replicated between source and target.

■In this way, continuous replication assures that there is a copy of the unique data blocks for a share/partition also on the target. More information (the namespace, also known as metadata) is needed in order to reassemble the data into its original format. Metadata is synchronized by namespace replication.

■Continuous replication is constantly checking to see if there is anything in its queue. If it finds a queue entry, it immediately processes the item.

■Continuous replication is suspended whenever namespace replication is active.

  - If a backup occurs while continuous replication is suspended, any new unique data tags will be added to the continuous replication queue for later processing.

  - The continuous queue will once again be processed whenever namespace replication is not running.
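The queue handling described above can be sketched in a few lines. This is an illustrative model under stated assumptions, not the appliance's implementation: the `Target` class, the `process_queue` function, and the use of a dictionary keyed by a block fingerprint are all hypothetical.

```python
from collections import deque


class Target:
    """Hypothetical target system holding a pool of unique blocks."""

    def __init__(self):
        self.pool = {}  # fingerprint -> block data

    def has_block(self, fingerprint):
        return fingerprint in self.pool

    def store_block(self, fingerprint, data):
        self.pool[fingerprint] = data


def process_queue(queue, source_pool, target, namespace_active=False):
    """Drain the continuous replication queue and return the bytes sent."""
    sent = 0
    # Continuous replication is suspended while namespace replication runs;
    # queued entries simply wait for later processing.
    while queue and not namespace_active:
        fingerprint = queue.popleft()
        if target.has_block(fingerprint):
            continue  # target already has this block: nothing to transfer
        data = source_pool[fingerprint]
        target.store_block(fingerprint, data)
        sent += len(data)
    return sent
```

Only blocks that the target reports as missing contribute to the bytes sent, which is why the network load is normally a small fraction of the ingest rate.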

Namespace Replication

■Namespace replication is responsible for synchronizing the metadata between source and target. The metadata is required in order to reassemble the de-duplicated data back into the format originally written by the backup application. The data cannot be reassembled without the metadata.

■Namespace replication must be enabled on a per-share/partition basis using the GUI.

■Namespace replication can be scheduled to occur routinely as often as once per day. It can also be initiated on demand. Click “Replicate Now” for namespace replication on demand.

■Namespace replication will normally execute immediately when started either by schedule or on demand. If a namespace replication is already active, then subsequent requests are queued, the respective share/partition will show a status of queued, and the queue is processed in FIFO order.
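The queuing behavior in the last bullet can be illustrated with a short sketch. The `NamespaceScheduler` class and the status strings are hypothetical names used for illustration; they assume a single active namespace replication at a time, as described above.

```python
from collections import deque


class NamespaceScheduler:
    """Hypothetical model: one active namespace replication, others wait FIFO."""

    def __init__(self):
        self.active = None
        self.waiting = deque()

    def request(self, name):
        """Start immediately if idle, otherwise queue; return the status."""
        if self.active is None:
            self.active = name
            return "in progress"
        self.waiting.append(name)
        return "queued"

    def finish(self):
        """Complete the active job and start the oldest queued request."""
        done = self.active
        self.active = self.waiting.popleft() if self.waiting else None
        return done
```

Requests made while a replication is active are started in the order in which they arrived.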

Partial Namespace Replication

Partial namespace replication can occur under the following conditions:

■Namespace replication is triggered while a NAS file is open in the share to be replicated.

■Namespace replication is triggered while a virtual cartridge from the partition to be replicated is loaded in a virtual drive.

■Namespace replication is triggered before all data for a share/partition has been de-duplicated.

This means that not all metadata required for reassembling the data into its original application format is available. It also means that not all data blocks are available, because only unique de-duplicated data blocks are replicated to the target. Consequently, only some of the data can be reassembled into the original application format on the target until a complete namespace replication is achieved. The potential ramification of a partial namespace replication is that some files may not be available for a restore. A successful (that is, not partial) namespace replication typically catches up 24 hours later if a daily namespace replication schedule is implemented. Manually clicking the Replicate Now button in the GUI will also allow namespace replication to resynchronize.

In order to avoid a “partial” completion status, it is advisable to schedule namespace replication to occur after all data has been de-duplicated.

■In order to avoid partial namespace replication when issuing the “Replicate Now” command manually, click the Check Readiness button first. Check Readiness will determine if all data destined for the respective share/partition has been de-duplicated and report back.

■See “When should I schedule namespace replication?” later in this document.
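The three conditions that lead to a partial namespace replication can be expressed as a simple pre-flight check. This is a sketch for reasoning about the conditions only; the function name and its parameters are hypothetical, and it is not a description of what the Check Readiness button does internally.

```python
def partial_replication_risks(open_files, loaded_cartridges, pending_dedup_bytes):
    """Return the reasons a namespace replication started now could be partial."""
    risks = []
    if open_files > 0:
        risks.append("NAS file open in the share")
    if loaded_cartridges > 0:
        risks.append("virtual cartridge loaded in a virtual drive")
    if pending_dedup_bytes > 0:
        risks.append("data not yet de-duplicated")
    return risks
```

An empty list means none of the listed conditions applies at the moment of the check.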

WHAT CONTROL DO I HAVE OVER REPLICATION?

There are several commands to control replication for the entire ETERNUS CS800 system. Consult the ETERNUS CS800 User’s Guide for details.

Pause/Resume

■Click Pause to pause all namespace and continuous replication. The pause will take effect as soon as the current data block finishes replicating. That is, on a low-bandwidth replication link, it may take some time before you see the effect of the pause command.

■Click Resume to allow replication to resume.

Enable/Disable

■Click Enable to enable replication for all shares and partitions that have de-duplication configured.

  - CAUTION: If there are shares or partitions that you do not want replicated, then you should enable the shares and/or partitions individually rather than using this GUI command. Refer to the ETERNUS CS800 User’s Guide if you require help with this procedure.

  - Namespace replication will begin at the next scheduled time for each share and partition. If there is no namespace replication schedule, then it will depend on when the user clicks Replicate Now for the respective share or partition.

  - The continuous replication queue will begin building with the next backup that writes into that share or partition. If there is no namespace replication active and the circuit between source and target is up, then continuous replication will begin moving new unique data to the target. Only new unique data ingested during that backup will be replicated to the target.

■Click Disable to disable replication from all shares and partitions that have been configured for de-duplication.

  - All in-process and queued namespace replications will attempt to complete before the Disable toggles off their namespace replication.

Replication for an individual share/partition can be managed by enabling and/or disabling replication for that share/partition. Call up the configuration for that share/partition and edit the replication setting. Refer to the ETERNUS CS800 User’s Guide if you require help with this procedure.

HOW CAN I TELL HOW FAST MY REPLICATION IS PROCEEDING AND HOW MUCH BANDWIDTH MY REPLICATION IS USING?

There is no single number that defines the replication rate. The rate depends jointly on the de-duplication rate, the replication queue processing rate, network loading, and network latency. Replication is broken down into two measurements: (1) Replication Processing Rate, and (2) Replication Ethernet Load Rate.

Data must first be de-duplicated before it can be added to the replication queue. With adaptive de-duplication, data is de-duplicated as it is ingested. In a hypothetical case, if ingest occurs at 100 MB/s and the rate of change in that data is 5%, then the rate at which new data is encountered is 5% of 100 MB/s, or 5 MB/s.

■The rate at which new data is encountered is the same as the rate at which it is placed on the replication queue: 5 MB/s.

■Consequently, the rate at which this new data is available for replication via the Ethernet port is 5 MB/s.

This means there is a replication rate that is based on the rate of ingest and a replication rate that is based on the amount of Ethernet loading. In the above hypothetical example, ETERNUS CS800 is processing the ingest at 100 MB/s and determining what already exists at the replication target. The replication processing rate is 100 MB/s. Unique data blocks are identified and replicated to the target at a replication Ethernet load rate of 5 MB/s (assuming that there are no bandwidth or latency bottlenecks in the replication link).

Recap:

■Ingest (backup) is at 100 MB/s

■De-duplication is at 100 MB/s

■Replication is at 100 MB/s because it verifies which data already exists at the target (in the target's de-duplication pool) and transfers a copy of only the data that is not already there.

■Side effect: Ethernet loading is 5 MB/s

Replicating the namespace happens very quickly, typically finishing within seconds if namespace replication is scheduled to occur after de-duplication has completed and continuous replication has moved all the data blocks.
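The two rates in the hypothetical example can be reproduced with a small calculation. The function below is a sketch; its name and parameters are illustrative, and it assumes, as the example does, that there is no bandwidth or latency bottleneck in the replication link.

```python
def replication_rates(ingest_mb_s, change_rate):
    """Return (processing rate, Ethernet load rate) in MB/s.

    ingest_mb_s: backup ingest rate in MB/s
    change_rate: fraction of ingested data that is new unique data (0..1)
    """
    processing_rate = ingest_mb_s                 # all ingested data is checked
    ethernet_load = ingest_mb_s * change_rate     # only new unique data is sent
    return processing_rate, ethernet_load


# The example from the text: 100 MB/s ingest with a 5% rate of change.
processing, load = replication_rates(100, 0.05)
```

With these inputs the processing rate is 100 MB/s and the Ethernet load is about 5 MB/s, matching the recap above.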

WHY SHOULD I REPLICATE THE NAMESPACE WHEN I FIRST CREATE A NEW SHARE/PARTITION?

Always replicate a share/partition namespace immediately after the share/partition is created with de-duplication and replication enabled, regardless of whether you have specified the adaptive or the deferred de-duplication policy.

Replicate the namespace, via the on-demand Replicate Now button, for each new share/partition as soon as it is created and before any data is written to it.

■The initial namespace replication of the empty share/partition will run very quickly (in a matter of minutes).

■This action establishes the namespace structure for the share/partition on the target so that the first namespace replication after a backup will run quickly.

Failure to replicate the empty namespace is not fatal. However, if the empty namespace was not replicated first, the first-ever namespace replication following a backup may run noticeably slower than if this recommendation had been followed. This will be especially noticeable if a very large amount of data has been backed up.


WHEN SHOULD I SCHEDULE NAMESPACE REPLICATION?

Namespace replication interacts with the de-duplication pool for its duration. Therefore, namespace replication will perform optimally if it does not overlap with other processes (such as de-duplication, space reclamation, restores, Read/Verify, tape creation) that also access the de-duplication pool at the same time. If scheduled properly, namespace replication will complete in a matter of minutes.

Avoid overlap of the following processes with namespace replication:

■De-duplication. You should avoid overlap with de-duplication for two reasons:

  - So that you do not end up with a partial namespace replication (see Partial Namespace Replication for more information).

  - So that you do not inadvertently slow de-duplication. In ETERNUS CS800 systems that are nearly full, this could have the side effect of slowing the backup.

■Space Reclamation. Every ETERNUS CS800 must eventually reclaim the data blocks occupied by expired data. That can be a very I/O intensive process that is best completed as quickly as possible. If replication and space reclamation overlap, both can potentially be slowed by more than a factor of 2.

■Restores, backup application Read/Verify, and tape creation. All of these processes generate additional I/O. If the data being retrieved from ETERNUS CS800 is available from non-truncated space (a cache of native format data), then the impact of the operation is minimal. However, if the data first has to be retrieved from the de-duplication pool and reconstructed into native application format, then there will be a noticeable impact on the performance of all of the overlapping processes.

If you have short discrete backup windows, then it should be relatively easy to determine the optimal schedule for namespace replication.
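One way to find a suitable time is to list when the competing processes run and look for the gaps. The sketch below is illustrative only: the function name is hypothetical, times are hours of a single day, and the example schedule is an assumption made up for the illustration.

```python
def free_windows(busy, day_start=0.0, day_end=24.0):
    """Return the gaps between busy (start, end) intervals, in hours."""
    gaps = []
    cursor = day_start
    for start, end in sorted(busy):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # overlapping intervals are merged
    if cursor < day_end:
        gaps.append((cursor, day_end))
    return gaps


# Hypothetical schedule: backup 00:00-04:00, de-duplication 00:00-06:00,
# space reclamation 12:00-15:00.
busy = [(0, 4), (0, 6), (12, 15)]
windows = free_windows(busy)
```

For this made-up schedule the free windows are 06:00-12:00 and 15:00-24:00, so a daily namespace replication shortly after 06:00 would avoid all three processes.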

HOW MUCH BANDWIDTH DO I NEED FOR MY REPLICATION TO BE SUCCESSFUL?

ETERNUS CS800 transfers only unique data (data that the target does not already have) when replicating from source to target. You therefore need sufficient bandwidth to:

■Replicate the daily load of new unique data from source to target

■Replicate the namespace (typically only a few MB)

■Leave room for data growth

For new ETERNUS CS800 installs: the ETERNUS Pre-Sales Systems Engineer (SE) has a sizing tool that can calculate your expected effective bandwidth requirement.

■Be aware that although you might have “plenty” of bandwidth available, end-to-end latency can significantly impact the ability of ETERNUS CS800 to utilize that bandwidth.

■If a communications link is already present between the source and target locations, you should perform an FTP transfer of 50-100 MB of totally random data between source and target and measure the performance. That will be a measure of how much bandwidth is available for replication.

NOTE: Totally random data is required so that any WAN optimization device (Riverbed, Silver Peak, etc.) in the circuit does not, without the knowledge of the user, inflate the FTP transfer rate. ETERNUS CS800 will be replicating only unique data. In some instances, the user may elect to enable encryption during replication. WAN optimization devices typically do not accelerate replication packets.
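A test file of random data for the FTP measurement can be generated with a few lines of Python. This is a minimal sketch: the function names, the default size of 64 MB, and the file name are assumptions for illustration; the transfer itself is performed and timed with whatever FTP client is in use.

```python
import os


def write_random_file(path, size_mb=64, chunk_mb=1):
    """Write size_mb megabytes of random (incompressible) data to path."""
    chunk = chunk_mb * 1024 * 1024
    remaining = size_mb * 1024 * 1024
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))  # random bytes defeat WAN optimizers
            remaining -= n
    return path


def throughput_mb_s(size_mb, seconds):
    """Effective bandwidth from the file size and the measured transfer time."""
    return size_mb / seconds
```

For example, calling `write_random_file("replication_test.bin")` creates a 64 MB file; if the FTP transfer of that file takes 32 seconds, `throughput_mb_s(64, 32)` gives an effective bandwidth of 2 MB/s.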

If ETERNUS CS800 is already installed and replicating:

■The SE can perform the same FTP test measurement to determine effective bandwidth.

HOW LONG WILL MY FIRST-EVER (NAMESPACE) REPLICATION TAKE?

There are several questions that have to be asked before an answer can be provided:

1. We need to know when this first-ever replication will be activated. For example:

  - Will replication be activated at the time that ETERNUS CS800 is installed? In this case, the first-ever replication will coincide with the first-ever backup.

  - Will replication be activated some time (days/weeks/months) after the first-ever backup to ETERNUS CS800? In this case, there will be a backlog of data waiting to be replicated.

  - Was replication configured but the namespace replication schedule overlooked? If so, anywhere from “none of the data” to “most of the data blocks” may already be at the target, and the amount of time for namespace replication may be very short (minutes).

2. We need an estimate of how much data will be queued for this first-ever replication.

  - As described in question 1 above, the amount of data to replicate will vary depending on when this first-ever replication is performed. It could range from “all data in the share/partition” to “only the metadata.”

  - Although ETERNUS CS800 will only replicate unique data from source to target, it will go through a verification process during replication to make sure that all necessary data is replicated to the target and can be reconstructed into native application format.

  - The more data there is to replicate, the longer it will take.

3. We need to know what effective bandwidth is available. See “How much bandwidth do I need for my replication to be successful?”

4. What other activities will the source and target systems be engaged in (for example, backup, space reclamation, read/verify, tape creation) that could impact the speed of replication?

  - Refer to the section titled “When should I schedule namespace replication?” for a discussion about competing activities.

  - Impact of competing activities will depend on both scheduling and duration of the first-ever replication. If it is short in duration, the likelihood of impact is minimal. However, if duration is long, then overlap with competing activities is inevitable.

A simple example:

1. Install two ETERNUS CS800 systems and configure replication

2. Back up 1 TB of Exchange data in 8 hours.

  - Exchange servers are usually configured for single-instance store (no longer valid for Exchange 2010), so there is only one copy of any e-mail and attachment. That means the first Exchange backup typically has less than 5% de-duplication. Space savings from the initial Exchange backup come mainly from compression and not de-duplication.

  - Exchange data is typically 1.6:1 compressible. We will ignore compression in this simple example.

  - The rate of change in the content from one Exchange backup to the next can range from 1% to over 20%. The typical rate of change is 10%. The lower the rate of change, the more de-duplication is achieved among the backups stored in ETERNUS CS800.

3. The first-ever replication will need to transfer a copy of nearly the entire first backup: 1 TB. The amount of time required for this transfer depends on the effective bandwidth available.

  - Using T1 (1.544 Mbps), that first-ever replication would take about 60 days. In the meantime, subsequent backups will be introducing more unique data that will be added to the continuous replication queue for replication after the current queue is completed.

  - Using OC1 (51.840 Mbps), that would take about 2 days.

  - Using OC3 (155.520 Mbps), that would take 8 hours, because continuous replication is happening during the backup and namespace replication assures that the namespace is synchronized between the source and target system.

  - The more new data that is backed up for the first time and replicated, the proportionately longer the first-ever replication will take.

4. After the first-ever replication, only the new unique data is replicated to the target. If we assume a typical 10% rate of change between these Exchange backups, then the routine full backup’s replication would take:

  - About 6 days with T1. Obviously this disqualifies T1 as a bandwidth to use for this replication link.

  - About 1.5 days with T2 (6.312 Mbps), also disqualifying T2, since routine replication must finish within a 24-hour window or the data will start building an irreconcilable backlog.

  - About 8 hours with T3 (44.736 Mbps) and higher bandwidths. Replication does not occupy the entire bandwidth during this 8-hour period. The duration is estimated at 8 hours because the backup is happening during the same 8 hours and the unique data is being sent to the target as it is encountered. The effective load would be 3.5 MB/s out of an available bandwidth of 5.6 MB/s.
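The T1, OC1, and T2 estimates above follow from dividing the amount of data by the line rate. The sketch below reproduces that arithmetic under simplifying assumptions: decimal units (1 TB = 10^12 bytes), the full nominal line rate available to replication, and no protocol overhead, latency effects, or compression. The function names are illustrative.

```python
TB = 10**12          # decimal terabyte, in bytes
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes, link_mbps):
    """Days needed to move data_bytes over a link of link_mbps (megabits/s)."""
    seconds = data_bytes * 8 / (link_mbps * 1_000_000)
    return seconds / SECONDS_PER_DAY


def load_mb_s(data_bytes, hours):
    """Average load in MB/s when data_bytes is spread evenly over hours."""
    return data_bytes / (hours * 3600) / 1_000_000


first_t1 = transfer_days(1 * TB, 1.544)       # first-ever replication over T1
first_oc1 = transfer_days(1 * TB, 51.840)     # first-ever replication over OC1
daily_t1 = transfer_days(0.1 * TB, 1.544)     # 10% change over T1
daily_t2 = transfer_days(0.1 * TB, 6.312)     # 10% change over T2
daily_load = load_mb_s(0.1 * TB, 8)           # 10% change spread over 8 hours
t3_capacity = 44.736 / 8                      # T3 line rate in MB/s
```

Under these assumptions the values come out at roughly 60 days, 1.8 days, 6 days, and 1.5 days, with a load of about 3.5 MB/s against a T3 capacity of about 5.6 MB/s. Real links will be slower once overhead and latency are taken into account.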

You can see from the complexity of the above list of qualifying questions that there is no simple answer to this question. Your ETERNUS CS800 Pre-Sales Systems Engineer (SE) is your best source of information to answer this question.

HOW CAN I ACCELERATE THE FIRST-EVER REPLICATION?

The first replication of a backup event usually takes significantly more time than later routine replication events. That is because at the time of the first event, virtually everything in the de-duplication pool on the source is typically new and unknown to the target system. The first replication event will be transferring a larger amount of de-duplicated data than any of the following routine replication events.

There can be an exception to this "first replication is significantly longer than the others" statement. For example, if you are replicating four remote ETERNUS CS800 systems to the same target, it is possible that one of the other source systems may have already deposited data into the target system de-duplication pool that duplicates what another wants to send. In that instance, the de-duplication pool content does not have to change. Only the namespace replication needs to occur, and that happens very quickly because the namespace is typically very small. (Typical namespace size is only a few MB.)

A number of initialization options are available that can decrease the amount of time needed for that first replication. The goal in each of these is to seed the de-duplication pool of the destination ETERNUS CS800 so that a minimum number of bytes need to be transferred to maintain synchronization between the two systems.

Option 1: Co-locate the source and target and replicate locally

Attach both the source and target systems on a dedicated GigE network and replicate locally at the highest rate supported by the ETERNUS CS800. This allows the initial replication to proceed at the fastest possible rate.

After replication completes, the target system can be deployed to its intended location, and subsequent replications to maintain synchronization between units will require significantly less time.

Option 2: Co-locate the source and target with the backup server for the first full backup

Depending on the amount of data to back up, this option may be faster than performing co-located replication on a dedicated GigE network. The Fujitsu Pre-Sales Systems Engineer can provide advice. You have three operational options:

■Sequentially perform a full backup to the source ETERNUS CS800 and then to the target ETERNUS CS800.
■Perform an inline full backup to both source and target at the same time.
■Clone the data from source to target after the first full backup completes.

When considering these operational options, keep in mind the following:

■The type of backup (VTL or NAS) must be identical for both ETERNUS CS800 systems.
■VTL backups, depending on your server and ecosystem, can run significantly faster than NAS backups.

Steps:

1. Co-locate and execute one of the operational options mentioned above. This will place the unique data blocks into the de-duplication pool of each ETERNUS CS800.

2. Perform a namespace replication from source to target.

   ■Wait for namespace replication to complete before proceeding.
   ■Very little, if any, additional unique data will be transferred during this namespace replication.
   ■This establishes a recovery point for the source in the target device.
   ■This recovery point will have a copy of the namespace from the source.
   ■This namespace copy will have pointers to all the blocks in the target’s de-duplication pool.
   ■Effectively, each unique block in the target’s de-duplication pool will have 2 subscribers: the original process that put the unique blocks into the target’s de-duplication pool, and the copy of the namespace from the source.

3. Delete the clone/inline copy save set references in the backup application catalog by expiring the save sets and releasing the media.

4. Delete the share/partition on the target ETERNUS CS800.

   ■Wait for this command to complete before proceeding.
   ■This will remove all pointers from that share/partition to the unique blocks in the de-duplication pool. The unique blocks will not disappear or become eligible for space reclamation because the namespace replication that you performed in step 2 (above) is pointing to the same unique blocks. Only unique blocks with zero “subscribers” pointing to them are eligible for space reclamation.
   ■Failure to delete the share/partition will keep the original data on the target indefinitely and can reduce the amount of space available for future replication and data retention.

5. Deploy the target ETERNUS CS800 to its intended location.
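The “subscriber” rule used in steps 2 and 4 can be illustrated with a small reference-counting model. This is only a sketch under stated assumptions: the class `DedupPool`, its methods, and the block and namespace names are hypothetical and do not represent the ETERNUS CS800 implementation or any of its interfaces.

```python
from collections import Counter


class DedupPool:
    """Toy model of a de-duplication pool with per-block subscriber counts."""

    def __init__(self):
        self.subscribers = Counter()  # block id -> number of namespaces pointing at it
        self.namespaces = {}          # namespace name -> set of block ids

    def add_namespace(self, name, blocks):
        """Register a share/partition or replicated namespace that points at blocks."""
        blocks = set(blocks)
        self.namespaces[name] = blocks
        for block in blocks:
            self.subscribers[block] += 1

    def delete_namespace(self, name):
        """Remove a namespace and drop its pointers into the pool."""
        for block in self.namespaces.pop(name):
            self.subscribers[block] -= 1

    def reclaimable(self):
        """Only blocks with zero subscribers are eligible for space reclamation."""
        return {block for block, count in self.subscribers.items() if count == 0}


pool = DedupPool()
# Step 1: the co-located full backup fills a share/partition on the target.
pool.add_namespace("seed_share", {"b1", "b2", "b3"})
# Step 2: namespace replication adds a second subscriber for the same blocks.
pool.add_namespace("replicated_namespace", {"b1", "b2", "b3"})
# Step 4: deleting the seed share still leaves one subscriber per block.
pool.delete_namespace("seed_share")
print(pool.reclaimable())  # set(): nothing is eligible for reclamation
```

In this model the blocks become reclaimable only once the replicated namespace is removed as well, which is why step 4 is safe to perform after step 2 has completed.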

Once deployed, the target may need a day or two to replicate unique data from new backups that may have taken place while the target was in transit.

Option 3: Use physical tape to initialize the target ETERNUS CS800

Depending on the amount of data to back up, this option may be faster than performing co-located replication on a dedicated GigE network. The ETERNUS CS800 Pre-Sales Systems Engineer can provide advice.

It is essential that the type of backup (VTL or NAS) is preserved during this process. Physical tape is only the transport medium, and your process of writing the data to the tape at the source must be precisely reversed when reading data from the tape and writing it to the target. While there may be other methods and/or utilities for accomplishing this, this Best Practices option focuses only on using the customer backup application and VTL to accomplish this “seeding”.

■You must engage the same backup application in this process at both the source and target in order to preserve the formatting and metadata inserts of the original backup application. Failure to do so will result in that same data not being fully recognized when it is replicated later. This means it will be stored a second time with its new application format.
■Failure to follow this procedure correctly may mean that the first remote replication will take a very long time.
■Failure to follow this procedure correctly may mean that your data will consume more disk space on the target and may cause the target to run out of disk capacity sooner than expected.

Steps:

1. Use your backup application to create a clone copy of a recent backup that you did to tape (either virtual or physical tape).

   ■A copy of the most recent backup assures that you have the majority of new unique data.
   ■If you have a fairly recent tape copy, you can use it instead of creating a new tape copy because a recent copy will typically have more than 80% of the data that you will be replicating to the target. The older the tape copy, the less useful it will be.

2. Transport that tape copy to the location of the target.

3. Create a (temporary) VTL partition on the target.

   ■You will clone/copy the tape to this partition in order to initialize the de-duplication pool with a copy of the unique data.

4. Using the same backup application at the target as you used at the source:

   a. Import the cartridge to the backup application. This will make the contents of the cartridge accessible to the backup application. The contents will be identifiable as one or more backup save sets.

   b. Duplicate/clone the contents of that cartridge to a virtual cartridge in the temporary partition of the target. This will create a copy of the de-duplicated data in the target. Later, when you perform your first namespace replication, the data that is already in the target’s de-duplication pool will not need to be transferred, thereby significantly speeding up the namespace replication process.

5. Perform a namespace replication from source to target.

   ■Wait for namespace replication to complete before proceeding.
   ■Very little, if any, additional unique data will be transferred during this process.

   NOTE: The more backups that have occurred at the source since the tape was written and copied to the target, the more new unique data there will be that has to be transferred to the target via replication.

   ■This establishes a recovery point for the source in the target.
   ■This recovery point will have a copy of the namespace from the source.
   ■This namespace copy will have pointers to all the blocks in the target’s de-duplication pool that are in common with the source.
   ■Effectively, each unique block in the target’s de-duplication pool will have 2 subscribers: the original process that put the unique blocks into the target’s de-duplication pool, and the copy of the namespace from the source.

6. Delete the clone/inline copy save set references in the backup application catalog by expiring the save sets and releasing the media.

7. Additional cleanup step: delete/expire the references created from the imported tape.

8. Delete the temporary partition on the target.

   ■Wait for this command to complete before proceeding.
   ■This will remove all pointers from that temporary partition to the unique blocks in the de-duplication pool. The unique blocks will not disappear or become eligible for space reclamation because the namespace replication that you performed in step 5 (above) is pointing to the same unique blocks. Only unique blocks with zero “subscribers” pointing to them are eligible for space reclamation.
   ■Failure to delete this temporary partition will keep the original data on the target indefinitely and can reduce the amount of space available for future replication and data retention.
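The effect of seeding on the first replication can be pictured as a set difference between the unique blocks on the source and the blocks already in the target's de-duplication pool. The following is a minimal sketch only: the function `blocks_to_transfer` and all block identifiers are hypothetical, and the real appliance works on de-duplicated data blocks and metadata rather than Python sets.

```python
def blocks_to_transfer(source_blocks, target_pool):
    """Return the unique blocks that the target does not hold yet."""
    return set(source_blocks) - set(target_pool)


# Hypothetical block ids standing in for de-duplicated unique blocks.
tape_copy = {"b1", "b2", "b3", "b4"}  # backup written to tape and imported at the target
source_now = tape_copy | {"b5"}       # source after one more backup since the tape was written

unseeded_target = set()               # target that received no seeding at all
seeded_target = set(tape_copy)        # temporary partition filled from the tape

first_unseeded = blocks_to_transfer(source_now, unseeded_target)
first_seeded = blocks_to_transfer(source_now, seeded_target)

print(len(first_unseeded))   # 5: every unique block must cross the network
print(sorted(first_seeded))  # ['b5']: only data newer than the tape copy
```

The second result grows with every backup taken at the source after the tape was written, which is why an older tape copy is less useful for seeding.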

WHEN SHOULD I USE ENCRYPTION WITH REPLICATION?

ETERNUS CS800 offers the ability to encrypt data while in transit. That is, the source ETERNUS CS800 encrypts the blocks when sending them, and the target ETERNUS CS800 decrypts the blocks upon receipt and stores them unencrypted. AES-128 encryption is used.

Customers who have VPNs (virtual private networks) or encrypted circuits typically have no need to encrypt replication data with the ETERNUS CS800. Customers who use public networks or have ultra-high security requirements for their data may wish to encrypt replication data that is in transit.

APPENDIX A – DIRECTORY/FILE OR CARTRIDGE REPLICATION

What is File or Cartridge Replication?

File or Cartridge Replication (FCR) extends continuous and namespace replication from the share/partition level down to the file-directory/virtual cartridge level. FCR can be used to synchronize the content of a share or partition that is concurrently accessible at both source and target. FCR applies only to ETERNUS CS800 de-duplication appliances running v1.3.1 or later firmware. Consult the ETERNUS CS800 User’s Guide to learn how to configure FCR.

Using VTL as an example:

■One could configure FCR for each virtual cartridge in the virtual library.
■When an FCR cartridge is written to, continuous replication will transfer de-duplicated unique data to the replication target.
■Once the cartridge is unmounted, FCR will wait for any trailing de-duplication and continuous replication traffic for that cartridge to complete. Then FCR will transfer the namespace for that cartridge to the destination ETERNUS CS800.
■Once completed, this process allows immediate access to that cartridge and the new data on it to servers accessing the destination ETERNUS CS800.

ETERNUS CS800 has reserved “replication threads” just for FCR. While these replication threads may end up competing for replication bandwidth, they significantly shorten the namespace replication wait time for data written to a cartridge. This is almost like clicking Replicate Now on the ETERNUS CS800 GUI after every cartridge eject, but doing so automatically rather than manually.

Advantages:

■FCR means that the target VTL is synchronized down to the cartridge level shortly after a cartridge is unloaded and the data has been de-duplicated.
■This enhances data availability at the target. Data is immediately accessible at the target after FCR completes.

When should I use FCR?

FCR fulfills one or more requirements of various user groups. If you have one or more of those requirements, then you should use FCR.

■Some users want the assurance that data has been replicated as early as possible in order to minimize the chance that a backup does not have a DR copy. FCR fulfills that requirement.

■Some users want to access the DR copy as early as possible after the backup. FCR fulfills that requirement.
■Some users want their backup application to create a physical tape at the target (replicated) site. FCR ensures the unloaded virtual cartridges between the source and target are identical.
■Some users want to perform namespace replication more than once per day. FCR does NOT fulfill this requirement. See “Why should I do namespace replication if I’m using FCR for my entire share/partition?” below.

Why should I do namespace replication if I’m using FCR for my entire share/partition?

FCR assures that specific content at both source and target is identical after synchronization.

■For NAS shares – this is at the file or directory level and occurs via a CLI or GUI command.
■For VTL partitions – this is at the virtual cartridge level and occurs when a virtual cartridge configured for FCR via the GUI is unloaded from a virtual tape drive.

FCR synchronization assures that data from the source is accessible on the target as soon as possible. “Accessible” means that the data from the source is synchronized to an active share or partition. It does not mean that the entire namespace for the share/partition on the source has been replicated to the target. Share/partition-level namespace replication only occurs when a user clicks on Replicate Now in the GUI or if namespace replication for that share/partition has been scheduled to occur routinely.

Users must still perform routine namespace replication for the shares/partitions on the source that they want to replicate. Unless routine namespace replication is performed, any share/partition Recover action would not include the most recent changes synchronized through FCR that have occurred since the most recent namespace replication.

How do I recover data that has been synchronized with FCR but for which no namespace replication has yet occurred?

Data that has been synchronized to a target with FCR can be recovered to the source using the steps described in this section, depending on the circumstance. It is assumed that this would be part of a disaster recovery procedure.

■If the user only wanted to replicate a copy from the target to a third ETERNUS CS800, then this is a simple situation of establishing and executing namespace replication.

■If the user had some disaster at the source location that occurred after a full namespace replication and no other data had been written and synchronized via FCR, then the user would perform a normal replication failback as documented in the ETERNUS CS800 User’s Guide.

■The procedures suggested below would only be followed if a disaster happened on the source after FCR updates had occurred on the replication target and it was vital to retrieve a copy of the most recent library state.

■It is not expected that the procedures suggested below will be used routinely, but only as part of a disaster recovery procedure.

Recovering FCR-updated VTL partitions from the target

If you had been replicating a VTL partition using the combination of namespace replication and FCR and now want to replace everything in the source partition with everything in the active VTL partition on the target, do the following:

1. Replicate the active partition from the target back to the source. (All data already exists on the source, so this namespace replication will complete quickly.)
2. Delete the original partition on the source. (You must do this because the original partition contains barcodes that duplicate those in the partition you replicated to the source in step 1. ETERNUS CS800 will not allow identical barcodes in active partitions.)
3. Recover the replicated partition and give it the same (or different) name as before.
4. Populate the recovered partition with tape drives, as before.
5. Connect the VTL to your backup application.
6. Perform an inventory to identify where the cartridges are located.

NOTE: A backup application Import should not be necessary because the backup application catalog should already reflect any new data that had been backed up.

Recovering FCR-updated NAS shares from the target

If you were replicating NAS shares and only want to retrieve a subset of the data stored in the share on the target:

1. Replicate the active share on the target back to the source. (Call this the “failback share” copy.)
2. Recover the failback share and give it a different name.
3. Mount the original and the failback share and copy the desired files/directories from the failback share to the original share.
4. Unmount and delete the failback share. Failure to do this could result in unique data in the failback share remaining on the system indefinitely, reducing available capacity, and influencing the overall de-duplication ratio reported by ETERNUS CS800.
5. Alternatively, you can turn off FCR at the source and turn it on at the target. Then manually trigger synchronization back from the target to the source for the specific files/directories you require. Don’t forget to reset FCR to its original orientation when you’re done.

If you were replicating NAS shares and want to retrieve everything from the target to replace everything you have in that share on the source:

1. Replicate the active share on the target back to the source.
2. Delete the original share on the source.
3. Recover the replicated share and give it the same (or different) name as before.

4. Mount the recovered share and continue as before.

APPENDIX B – FREQUENTLY ASKED QUESTIONS

If a partition (or share) does not have replication enabled for it, but does have de-duplication enabled, does any of its data get replicated?

Only data from a share/partition that has both de-duplication and replication enabled is replicated to the target. All replication is done on a per-share/partition basis. Shares/partitions that do not have replication enabled will not have their unique content replicated.

Can I replicate only part of a partition? For example, my retention policy is four weeks but I only want to replicate the most-recent two weeks. Can I do that?

Replication is all-or-none for a given share/partition. A solution that meets your requirement might be to create a second partition on the source system and use the application to clone select data from the first partition to this second partition. After it has been cloned, the original copy in the source partition can be expired by the application. This has several advantages:

■Isolates the data that should be replicated from the data that should not be replicated.
■Potentially reduces the number of bytes being replicated, thereby reducing replication bandwidth demand.
■Reduces the amount of data to be stored on the target. Storing only two weeks of unique data should require no more capacity (TB) than storing up to four weeks of unique data.
■Allows separate retention and expiration policies for the two types of data. Archive copies can be retained indefinitely, whereas short-term copies could be expired after mere days or weeks.

If I expire one or more backups in my backup application, does that mean the data for the expired backups will not be replicated?

Simply expiring a save set with the application does not mean that it will not be replicated. The system does not know that a save set has been expired by the application. It is only when the application overwrites the expired save set that the ETERNUS CS800 system releases the blocks containing the data of the expired save set. Released blocks can then be overwritten with new data.
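The difference between expiring and overwriting a save set can be shown with a toy model. It is a sketch under loud assumptions: the class `Appliance`, the media label `TAPE01`, and the block identifiers are all hypothetical, and the model ignores the fact that released blocks may still be referenced elsewhere in a real de-duplication pool.

```python
class Appliance:
    """Toy model: the appliance sees only what is stored, not the catalog state."""

    def __init__(self):
        self.stored = {}  # media label -> set of block ids currently written there

    def write(self, label, blocks):
        """Write (or overwrite) a save set; return the old blocks that were released."""
        released = self.stored.get(label, set()) - set(blocks)
        self.stored[label] = set(blocks)
        return released

    def data_to_replicate(self):
        """Everything currently stored is replicated, whether expired or not."""
        blocks = set()
        for written in self.stored.values():
            blocks |= written
        return blocks


appliance = Appliance()
catalog_expired = set()        # expiry state lives in the backup application only

appliance.write("TAPE01", {"b1", "b2"})
catalog_expired.add("TAPE01")  # expiring changes the catalog and nothing else
still_replicated = appliance.data_to_replicate()

released = appliance.write("TAPE01", {"b9"})  # the overwrite releases the old blocks
print(sorted(still_replicated))  # ['b1', 'b2']
print(sorted(released))          # ['b1', 'b2']
```

The expired save set stays in the replicated data until the overwrite happens, which matches the behavior described in the answer above.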

CONTACT

Fujitsu Technology Solutions GmbH
Mies-van-der-Rohe-Straße 8, Munich, 80807, Germany
E-mail: [email protected]
Website: http://ts.fujitsu.com

All rights reserved, including intellectual property rights. Technical data subject to modifications and delivery subject to availability. Any liability that the data and illustrations are complete, actual or correct is excluded. Designations may be trademarks and/or copyrights of the respective manufacturer, the use of which by third parties for their own purposes may infringe the rights of such owner. For further information see ts.fujitsu.com/terms_of_use.html

Copyright © Fujitsu Technology Solutions GmbH 2011