

Technical Report

Azure Site Recovery Best Practices Guide

Chris Lionetti, NetApp

May 2015 | TR-4413

Abstract

You can protect Microsoft Hyper-V virtual machine assets by using cloud-orchestrated, on-premises–to–on-premises mirroring. This protection orchestration includes directing the NetApp® controllers on each site to initiate and maintain the mirroring operations.


2 Azure Site Recovery Best Practices Guide © 2015 NetApp, Inc. All Rights Reserved

TABLE OF CONTENTS

1 Problem Definition
1.1 A Few Assumptions About Readers
2 NetApp Solution
3 Initial Microsoft Solution
3.1 Current Microsoft Solution
4 Prerequisites
5 What You Need to Enable Azure Site Recovery
6 Differentiation of SnapMirror and ASR
6.1 Integrating SnapManager for Hyper-V with ASR
7 Setting Up Replication Groups and Completing Cloud Settings
8 Azure Portal Settings
8.1 Creating and Modifying the Default Templates
9 Creating a Recovery Plan
10 Failing Over a VM to a Secondary Site
10.1 Reversing Back to the Primary Site
11 Creating Test Failover VMs
12 A Few Final Words

LIST OF FIGURES

Figure 1) Configure VM template for replication group.


1 Problem Definition

If you talk to any IT architect and ask what keeps him or her up at night, the answer often has to do with the lack of a disaster recovery (DR) or business continuity (BC) solution. Despite deploying high-availability (HA) servers connected to HA storage, a site can still be extremely vulnerable.

A complicating factor is that the person who deploys a private cloud of virtual machines (VMs) is responsible for protecting those machines; the storage itself is usually managed by an entirely different team, and backup operations are managed by yet another team. It’s no wonder that so many machines go unprotected.

1.1 A Few Assumptions About Readers

To get the most out of this document, you should already have done the following:

Configured NetApp Data ONTAP® SMI-S Agent 5.2 and connected it to your installation of System Center Virtual Machine Manager (SCVMM).

Enabled NetApp SnapMirror® software on the source and destination NetApp target devices, and learned how to configure SnapMirror between controllers.

2 NetApp Solution

NetApp addresses this lack of DR and BC by offering host-based software that allows application owners to be in charge of their own snapshots and site mirroring operations. These software products (NetApp SnapManager Suite) are optimized to the specific needs of each application (for Microsoft SQL Server, Exchange, SharePoint, and others). They also offer some very compelling advanced features, such as automatically mounting a database at a remote site and completing a database consistency check.

However, these advanced features come at a price: Additional software must be installed and functional on the machines hosting the applications in order for those VMs to be properly protected.

3 Initial Microsoft Solution

With the release of Windows Server 2012, Microsoft addressed this issue in an entirely different way with Hyper-V Replica. Hyper-V Replica relies on the host to mirror its VM configuration and virtual hard drive write operations through the host’s own network adapter. This solution, while sufficient for enabling replication for a few virtual machines, doesn’t scale well for larger deployments. With Azure Site Recovery, Microsoft alleviated much of the cumbersome configuration needed to enable replication across on-premises Hyper-V sites.

3.1 Current Microsoft Solution

Microsoft has invested considerable time in supporting the Storage Management Initiative Specification (SMI-S) and uses it as a method (and common language) for System Center Virtual Machine Manager (SCVMM) to communicate with storage. With the SMI-S protocol, SCVMM can be used to deploy new logical unit numbers (LUNs) or new Server Message Block (SMB) 3.0 shares, to create and change initiator groups (igroups) on a controller, and even to create clones in a way that works with multiple vendors. This capability allows the use of SCVMM to deploy new VMs from templates significantly faster than would be possible by using a network deployment.

Microsoft uses the SMI-S standard to discover the capabilities of an array, to initiate a mirror operation, and to protect a VM without having to ask the VM host to do any of the heavy lifting.

With this in mind, Microsoft wrote an update for SCVMM that includes management of Hyper-V VMs and that allows SCVMM to off-load the actual data movement to the target devices (from vendors such as NetApp, for example). This update is called Azure Site Recovery SAN Replication and is delivered by using the System Center VMM 2012 R2 Update Rollup 5.0, which was announced as being generally available on February 18, 2015.

To support these features on a NetApp storage virtual machine (SVM), you will need Data ONTAP SMI-S Agent 5.2 or later. This process is essentially Azure-orchestrated disaster recovery to fail over VMs (or collections of VMs) from a primary private cloud site to a secondary private cloud site. Azure Site Recovery requires no host-based software (or configuration) to protect a VM, because all the configuration occurs from the SCVMM manager window.

4 Prerequisites

You should already be familiar with the technologies that allow Microsoft Azure Site Recovery to operate. These foundational features are NetApp SMI-S Agent; NetApp SnapMirror data replication technology; and, optionally, the NetApp OnCommand® Plug-In for Microsoft.

You should also have already familiarized yourself with the following TRs, and have configured these products and features according to the following published best practices:

SnapMirror Configuration and Best Practices Guide for Clustered Data ONTAP | TR-4015

Clustered Data ONTAP 8.2 Cluster and Vserver Peering Express Guide

Best Practices and Implementation Guide for NetApp SMI-S Agent 5.2 | TR-4271

Data ONTAP SMI-S Agent 5.2 Installation and Configuration Guide

OnCommand Plug-In 4.1 for Microsoft Best Practices Guide | TR-4354

5 What You Need to Enable Azure Site Recovery

You must have the following prerequisite components to enable Azure Site Recovery (ASR):

An Azure account

An SCVMM 2012 R2 installation with System Center VMM 2012 R2 Update Rollup 5.0

A NetApp controller on the primary site and on the failover site licensed with SnapMirror

An SVM with a NetApp FlexClone® license

An SVM with a block-level storage protocol license (iSCSI, FC, or FCoE)

NetApp Data ONTAP SMI-S Agent v5.2 or later running in your environment

A defined SCVMM cloud on both the primary and secondary sites
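
The checklist above can be recorded as a simple pre-flight check before you start the configuration. The following Python sketch is illustrative only: the `Environment` fields and the `missing_prerequisites` function are hypothetical names invented for this example, the values are entered by hand, and nothing here queries Azure, SCVMM, or a NetApp controller. It also assumes that an Update Rollup later than 5.0 is acceptable, which this document does not state.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# Block-level protocols named in the prerequisite list.
BLOCK_PROTOCOLS = {"iSCSI", "FC", "FCoE"}
# Site labels are an assumption made for this sketch.
BOTH_SITES = {"primary", "secondary"}


@dataclass
class Environment:
    """Hand-entered survey of a deployment; all field names are hypothetical."""
    has_azure_account: bool
    scvmm_update_rollup: int             # Update Rollup installed on SCVMM 2012 R2
    snapmirror_sites: Set[str]           # sites whose controller has a SnapMirror license
    flexclone_licensed: bool
    svm_protocols: Set[str]              # protocols licensed on the SVM
    smis_agent_version: Tuple[int, int]  # for example (5, 2)
    scvmm_cloud_sites: Set[str]          # sites that have a defined SCVMM cloud


def missing_prerequisites(env: Environment) -> List[str]:
    """Return the prerequisites from the list above that are not yet met."""
    missing = []
    if not env.has_azure_account:
        missing.append("Azure account")
    if env.scvmm_update_rollup < 5:  # assumption: later rollups also qualify
        missing.append("SCVMM 2012 R2 Update Rollup 5.0")
    if not (BOTH_SITES <= env.snapmirror_sites):
        missing.append("SnapMirror license on both sites")
    if not env.flexclone_licensed:
        missing.append("FlexClone license")
    if not (env.svm_protocols & BLOCK_PROTOCOLS):
        missing.append("Block protocol license (iSCSI, FC, or FCoE)")
    if env.smis_agent_version < (5, 2):
        missing.append("Data ONTAP SMI-S Agent 5.2 or later")
    if not (BOTH_SITES <= env.scvmm_cloud_sites):
        missing.append("SCVMM cloud on both sites")
    return missing
```

An empty result means that every item in the list has been accounted for; any returned entry names the item that still needs attention.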

6 Differentiation of SnapMirror and ASR

You will find that ASR uses a simple method to protect a VM. The real advantage of ASR is that it is completely manageable from the SCVMM console without requiring any additional privileges from the storage administrators, other than access by using SMI-S. ASR is deployed by creating a protection template that is applied to a server or a collection of servers. This feature allows the administrator to define a protection profile for servers that haven’t yet been created. With this preemptive way of protecting VMs, as a site grows organically, its protection grows with it.

Although the SnapManager suite of products offers deep integration with the applications, it does not offer proactive protection of VMs not yet created. However, the SnapManager suite does offer a mechanism for the application owner to directly initiate a protective NetApp Snapshot® copy operation before making changes to an application, which can be automatically included in remote mirrors.

The ASR and SCVMM integration is limited to block-level storage protocols such as iSCSI or FC. To protect a VM that is created on an SMB 3.0 share, the VM should be live-migrated to a block protocol location to allow the off-loading feature to operate. ASR off-loaded transfers that use SMI-S with NetApp can be used only to move a VM from one NetApp controller to another NetApp controller (site-to-site replication). Replication to Azure or to non-NetApp storage from a NetApp controller falls back to host-based mirroring instead of allowing off-loaded SAN-based operations.
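
The rules in the preceding paragraph amount to a small decision table. The Python sketch below restates them for illustration; the function name `replication_path`, the string labels it returns, and the `target_kind` values are all assumptions made for this example and are not part of ASR, SCVMM, or any NetApp interface.

```python
# Block-level protocols named in this document.
BLOCK_PROTOCOLS = {"iSCSI", "FC", "FCoE"}


def replication_path(vm_protocol: str, target_kind: str) -> str:
    """Classify how a VM on a NetApp controller would be replicated.

    vm_protocol -- protocol of the storage that holds the VM, for example
                   "iSCSI" or "SMB3" (labels are hypothetical)
    target_kind -- "netapp", "azure", or "other-storage" (hypothetical labels)
    """
    # Only NetApp-to-NetApp replication can be off-loaded to the SAN.
    if target_kind != "netapp":
        return "host-based"
    # Off-load needs block storage; SMB-hosted VMs must be live-migrated first.
    if vm_protocol not in BLOCK_PROTOCOLS:
        return "needs-block-migration"
    return "san-offloaded"
```

For example, `replication_path("SMB3", "netapp")` returns `"needs-block-migration"`, which mirrors the advice to live-migrate the VM to a block protocol location before protecting it.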

ASR is another important tool for your toolbox, and it can be used to protect at-risk VMs. ASR protection can coexist on the same NetApp controller and can protect the same VM as SnapManager protection does; therefore, you can still depend on those deeper integration points for a few mission-critical servers. The choice is yours, and the added flexibility can help provide a customizable and powerful DR- and BC-enabled infrastructure.

6.1 Integrating SnapManager for Hyper-V with ASR

Although there is no native integration between SnapManager for Hyper-V (SMHV) and ASR, they can be used together because neither directly interferes with the workings of the other. The strength of SMHV is that it is tied to the VM and knows how to create a hardware-based, VSS-flushed NetApp Snapshot copy. After SMHV creates this Snapshot copy, it lives in the NetApp FlexVol® volume.

When ASR uses NetApp SnapMirror software to copy the FlexVol volume from the primary to the secondary location, the SMHV-created Snapshot copies get pulled along. This means that if you are forced to perform an unplanned failover to the remote site, you can choose to bring up the most recent (unflushed) copy of the VM, or you can choose the most recent SMHV (VSS-flushed) copy of the VM. The only action that you need to take is to revert the VM to the most recent SMHV-created Snapshot copy instead of using the most recent SnapMirror copy. To set up SnapManager for Hyper-V, follow the directions outlined in the technical report titled “NetApp SnapManager 2.1 for Hyper-V on Clustered Data ONTAP 8.3 Best Practice Guide.”
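
The choice between the two recovery points described above can be pictured as a filter over the Snapshot copies that arrive at the secondary site. The following Python sketch is a simplified model, not a NetApp or SMHV interface: the `SnapshotCopy` class, its fields, and `pick_recovery_point` are hypothetical names, and the list of copies is assumed to be supplied by the administrator.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class SnapshotCopy:
    """One Snapshot copy in the mirrored FlexVol volume (hypothetical model)."""
    name: str
    created: datetime
    vss_flushed: bool  # True for copies created by SMHV


def pick_recovery_point(copies: List[SnapshotCopy],
                        prefer_vss_flushed: bool) -> Optional[SnapshotCopy]:
    """Return the newest copy, optionally restricted to VSS-flushed copies."""
    if prefer_vss_flushed:
        candidates = [c for c in copies if c.vss_flushed]
    else:
        candidates = list(copies)
    # default=None covers the case in which no copy qualifies.
    return max(candidates, key=lambda c: c.created, default=None)
```

Choosing the VSS-flushed copy trades some recency for application consistency, so the right setting depends on how much recent data the workload can afford to lose.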

For a deeper conversation about the ASR feature and for SMI-S help in general, visit our Microsoft Cloud and Virtualization Communities site.

7 Setting Up Replication Groups and Completing Cloud Settings

After you have met the requirements outlined in section 5 of this document, you can protect your VMs with ASR and the SAN replication functionality provided by the NetApp SMI-S Agent and SnapMirror. Take the following steps:

1. Open Fabric Resources in SCVMM and, under Storage, click Arrays.

2. When your SVMs appear in the managed array list (you should see the primary site array and the secondary site array), you can right-click the primary array and select Properties.

3. One of the options for Properties is Replication Groups. Click New to launch the Create Replication Group wizard. Make sure to select LUNs attached to Hyper-V hosts that will hold the virtual disks of the protected VM.


4. After you create your Fabric Replication Group, you can either create or modify your cloud under the VMs section. Make sure that on the first page of the Cloud settings you select the Send Configuration Data About This Cloud to the Azure Site Recovery option.


5. Continue to set up the private cloud as you normally would by selecting all the appropriate resources; however, the last step prior to the summary should be to select replication groups. On the Replication Groups page, select the replication group that you created in step 3.

A second cloud must exist at the DR destination site (and also have resources assigned to it). If this second cloud does not exist, create it by following the same steps that you used to create the primary cloud.

8 Azure Portal Settings

When both clouds exist (one on the primary site and one on the failover site) and they are sending configuration data to Azure Site Recovery, launch the Azure portal and continue the configuration from there.

1. Continue to Recovery Services and select the Azure Site Recovery vault.

2. Select Protected Items.


3. Select the appropriate primary cloud to launch the configuration steps shown in the following screenshot. As you select each item from the top down, different options become visible.

Note: Make sure to choose SAN as the Replication Type so that the Arrays tab appears.

After the protection is configured for the primary cloud and the secondary cloud is identified, Azure configures the secondary cloud as its pairing partner.


4. Configure the Server Storage setting under the Resources option for the ASR vault by selecting the source and target sites for the replication pairing.

5. Select the option to map the storage to the vault by selecting the NetApp SVM and Storage Pools.

After you have completed these steps, you have a new designation for the two clouds—the primary cloud is designated as the protected cloud, and the secondary cloud is designated as the recovery cloud.

6. After you complete the Azure Site Recovery Cloud settings, click Protected Cloud under Protected Items and select Virtual Machines. The option to add a replication group appears.


8.1 Creating and Modifying the Default Templates

When you have configured the Azure portal settings, you can create or modify the default VM template for the replication group. Doing so allows all new VMs inside that replication group to automatically inherit the protection settings. Figure 1 shows the Create VM Template Wizard.

Figure 1) Configure VM template for replication group.
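
The inheritance behavior described in this section can be modeled in a few lines. The Python sketch below is a conceptual illustration only: `ReplicationGroup`, `template_settings`, and `add_vm` are hypothetical names, and the setting keys in the usage example are invented rather than taken from the SCVMM template wizard.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReplicationGroup:
    """Conceptual model of a replication group with a default VM template."""
    name: str
    template_settings: Dict[str, str]  # hypothetical protection settings
    vms: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def add_vm(self, vm_name: str) -> Dict[str, str]:
        """Register a new VM, which receives its own copy of the template settings."""
        self.vms[vm_name] = dict(self.template_settings)
        return self.vms[vm_name]
```

For example, after `group = ReplicationGroup("rg1", {"replication": "SAN"})`, calling `group.add_vm("vm1")` returns `{"replication": "SAN"}` without any per-VM configuration, which is the point of defining the template once for the whole group.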

9 Creating a Recovery Plan

Now that you have at least a single VM that is part of a cloud and is protected by using ASR, you’re ready to develop a plan for what to do in case of a failure. To create a recovery plan, complete the following steps:

1. From the ASR menu, select the protected cloud vault, Recovery Plans, and Create Recovery Plan.


2. Provide a name for the recovery plan and select where the recovery plan will operate.

3. Select the SAN replication type and select the individual replication group to be included in this recovery group.

Now that the recovery plan has been created, you can select some custom settings for the recovery group. For example, you can divide the pack of servers into two sets and let them start up in order. This capability is valuable if you have a server, such as a domain controller, that needs to be available before other servers start or if you have a SQL back end for a series of web server front ends. Note that other custom actions can be placed here as well, such as post-VM scripts.
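
The ordered startup described above can be sketched as a list of groups that are processed in sequence. The following Python example is a simplified model for illustration: `StartupGroup` and `plan_steps` are hypothetical names, the VM and script names in the usage note are invented, and the function only produces a description of the steps rather than calling ASR.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StartupGroup:
    """One set of VMs that starts together (hypothetical model)."""
    name: str
    vms: List[str]
    # Names of scripts to run after the group's VMs have started.
    post_actions: List[str] = field(default_factory=list)


def plan_steps(groups: List[StartupGroup]) -> List[str]:
    """Flatten the groups into the order in which the steps would run."""
    steps = []
    for group in groups:
        for vm in group.vms:
            steps.append(f"start {vm}")
        # Post-actions run before the next group begins.
        for action in group.post_actions:
            steps.append(f"run {action}")
    return steps
```

With a first group that contains a domain controller and a second group that contains two web servers, `plan_steps` lists the domain controller and its post-action before either web server, which is the dependency that the grouping is meant to enforce.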

You will also note that the actions listed in the recovery plan assume that you are performing a planned failover and will properly shut down the VMs on the primary site before failing over. These actions still occur even without proper shutdown, but if a proper shutdown is possible, it prevents problems.

10 Failing Over a VM to a Secondary Site

To fail over a VM to a secondary site, complete the following steps:

1. Open the Azure Site Recovery options, and choose Failover for the VM that you want to move.


When you choose to move your VMs from site to site, the order stipulated in the recovery plan is followed. In this case, a set of three VMs is booted in a specific order.


2. Monitor the job through the Job button or go grab a coffee; but be fast, the process won’t take long.

After this process is completed, the VMs are running in your secondary cloud on your secondary site.

10.1 Reversing Back to the Primary Site

To reverse back to the primary site, complete the following steps:

1. From the bottom bar of the same Azure screen that you used to initiate the failover, click Reverse Replication. This action returns the VMs to the primary site in the same way that they were moved to the secondary site.

11 Creating Test Failover VMs

You can perform a test failover without affecting the VMs on the primary site. To avoid having the test VM appear on the same network while the original is still running, test the remote site by using an isolated network. The test VM failover option creates a copy of the production VM on the remote site and brings it up there, isolated from your production network.


12 A Few Final Words

Although SnapManager for Hyper-V and the other products in the SnapManager suite offer DR protection by using SnapMirror, the SnapManager suite also protects VMs in a different way by focusing on application integration and flushed snapshots. The use of Azure Site Recovery does not prevent or preclude the additional protection that the SnapManager suite offers, and these products can coexist in the same infrastructure.

The options to protect your VMs have simply been expanded, enabling you to choose protection methods that are effective and that also fit your unique business requirements.


Refer to the Interoperability Matrix Tool (IMT) on the NetApp Support site to validate that the exact product and feature versions described in this document are supported for your specific environment. The NetApp IMT defines the product components and versions that can be used to construct configurations that are supported by NetApp. Specific results depend on each customer's installation in accordance with published specifications.

Trademark Information

NetApp, the NetApp logo, Go Further, Faster, AltaVault, ASUP, AutoSupport, Campaign Express, Cloud ONTAP, Clustered Data ONTAP, Customer Fitness, Data ONTAP, DataMotion, Fitness, Flash Accel, Flash Cache, Flash Pool, FlashRay, FlexArray, FlexCache, FlexClone, FlexPod, FlexScale, FlexShare, FlexVol, FPolicy, GetSuccessful, LockVault, Manage ONTAP, Mars, MetroCluster, MultiStore, NetApp Insight, OnCommand, ONTAP, ONTAPI, RAID DP, RAID-TEC. SANtricity, SecureShare, Simplicity, Simulate ONTAP, SnapCenter, Snap Creator, SnapCopy, SnapDrive, SnapIntegrator, SnapLock, SnapManager, SnapMirror, SnapMover, SnapProtect, SnapRestore, Snapshot, SnapValidator, SnapVault, StorageGRID, Tech OnTap, Unbound Cloud, WAFL and other names are trademarks or registered trademarks of NetApp Inc., in the United States and/or other countries. All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such. A current list of NetApp trademarks is available on the Web at http://www.netapp.com/us/legal/netapptmlist.aspx. TR-4413-0515

Copyright Information

Copyright © 1994–2015 NetApp, Inc. All rights reserved. Printed in the U.S. No part of this document covered by copyright may be reproduced in any form or by any means—graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an electronic retrieval system—without prior written permission of the copyright owner.

Software derived from copyrighted NetApp material is subject to the following license and disclaimer:

THIS SOFTWARE IS PROVIDED BY NETAPP "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

NetApp reserves the right to change any products described herein at any time, and without notice. NetApp assumes no responsibility or liability arising from the use of products described herein, except as expressly agreed to in writing by NetApp. The use or purchase of this product does not convey a license under any patent rights, trademark rights, or any other intellectual property rights of NetApp.

The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications.

RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.277-7103 (October 1988) and FAR 52-227-19 (June 1987).