Problem Definition

Ask any IT architect what keeps them up at night, and the answer is commonly the lack of a disaster recovery (DR) or business continuance (BC) plan. Despite deploying highly available (HA) servers connected to HA storage, the site as a whole is still extremely vulnerable.


The other problem you commonly encounter is that the person who deploys a Private Cloud of Virtual Machines (VMs) is responsible for protecting those machines, while the storage itself is usually run by an entirely different team and backup operations by yet another team. It’s no wonder so many machines go unprotected.


How NetApp Solves This Problem

NetApp has been addressing this problem by offering host-based software that puts application owners in charge of their own snapshots and site mirroring operations. These products (the NetApp SnapManager Suite) are optimized for the specific needs of each application (SQL, Exchange, SharePoint, and others) and offer some very compelling advanced features, such as automatically mounting a database at a remote site and completing a database consistency check. These advanced features come at a price, however: additional software needs to be installed and kept running on every machine that hosts an application to be protected.


How Microsoft Started to Solve This Problem

With the release of Windows Server 2012, Microsoft addressed this in an entirely different way with a feature called Hyper-V Replica, which relies on the host to mirror its VMs' configuration and Virtual Hard Drive (VHD) write operations through the host's own network adapter. This solution, while sufficient for replicating a few virtual machines, did not scale very well for larger deployments. With Azure Site Recovery, Microsoft alleviated the cumbersome configuration needed to enable replication across on-premises Hyper-V sites.


How Microsoft Now Solves This Problem with SMI-S Enabled Storage

As you may know, Microsoft has invested considerable time in supporting the Storage Management Initiative Specification (SMI-S) and uses it as a method (and common language) for System Center Virtual Machine Manager (SCVMM) to talk to storage. Using SMI-S, SCVMM can deploy new LUNs or new SMB3 shares, change initiator groups on a controller, and even create clones in a way that works across vendors. This allows SCVMM to deploy new VMs from templates significantly faster than would be possible with a network deployment.


It makes sense that Microsoft would use the SMI-S standard to discover the capabilities of an array, initiate a mirror operation, and protect a VM without having to ask the VM's host to do any of the heavy lifting. With this in mind, Microsoft wrote an update for SCVMM that not only manages Hyper-V virtual machines but also allows SCVMM to offload the actual data movement to the target devices (i.e., storage from vendors such as NetApp). This is called Azure Site Recovery SAN Replication and is delivered via the System Center VMM 2012 R2 Update Rollup 5.0, which was announced as “Generally Available” on February 18th, 2015. From NetApp you will need the Data ONTAP SMI-S Agent version 5.2 or newer. This is essentially an Azure-orchestrated disaster recovery process that fails over virtual machines (or collections of virtual machines) from a primary private cloud site to a secondary private cloud site. Azure Site Recovery (ASR) also requires no host-based software (or configuration) to protect a VM, as all the configuration happens from the SCVMM manager window.


What You Need to Enable ASR

    • You will need an Azure Account (it is called Azure Site Recovery after all)
    • You will need an SCVMM 2012 R2 installation with System Center VMM 2012 R2 Update Rollup 5.0
    • You will need a NetApp Controller on the Primary Site and one at the Failover Site
    • You will need the appropriate number of SnapMirror Licenses
    • You will need the NetApp FlexClone License
    • You will need a Block Level Storage Protocol License (iSCSI, FC, or FCoE)
    • You will need the NetApp Data ONTAP SMI-S Agent v5.2 or newer running in your environment
    • You will need to define a SCVMM cloud on each site
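
The checklist above can be turned into a simple pre-flight check before you start configuring anything. The following is only a sketch: the Site record, its field names, and the missing_prerequisites function are hypothetical and do not come from any NetApp or Microsoft tool. It also assumes that every license is required on both controllers and that a later Update Rollup or SMI-S Agent version satisfies the minimum.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical per-site inventory; field names are illustrative only."""
    name: str
    netapp_controller: bool
    snapmirror_license: bool
    flexclone_license: bool
    block_protocol_license: bool  # iSCSI, FC, or FCoE
    vmm_cloud_defined: bool


def missing_prerequisites(azure_account: bool,
                          vmm_update_rollup: int,
                          smis_agent_version: tuple[int, int],
                          sites: list[Site]) -> list[str]:
    """Return the unmet checklist items; an empty list means ready."""
    missing = []
    if not azure_account:
        missing.append("Azure account")
    if vmm_update_rollup < 5:
        missing.append("SCVMM 2012 R2 Update Rollup 5.0")
    # Tuples compare element by element, so (5, 1) < (5, 2) < (6, 0).
    if smis_agent_version < (5, 2):
        missing.append("Data ONTAP SMI-S Agent 5.2 or newer")
    for site in sites:
        checks = {
            "NetApp controller": site.netapp_controller,
            "SnapMirror license": site.snapmirror_license,
            "FlexClone license": site.flexclone_license,
            "block protocol license": site.block_protocol_license,
            "SCVMM cloud": site.vmm_cloud_defined,
        }
        missing += [f"{site.name}: {item}" for item, ok in checks.items() if not ok]
    return missing
```

Calling it with one fully equipped site and one that lacks FlexClone would report only the FlexClone gap for that second site, which gives you a short to-do list to hand to the storage team.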

Differentiation of SnapMirror and ASR

You will find that ASR uses a simple method to protect a VM. Its real advantage is that it is completely manageable from the SCVMM console, without any additional privileges from the Storage Administrators other than access via SMI-S. ASR is deployed by creating a protection template that is applied to an entire type or collection of servers. This allows you to define a protection profile for servers that don’t exist yet. This pre-emptive approach to protecting VMs can ensure that as a site grows organically, its protection grows with it.
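
To make the template idea concrete, here is a minimal conceptual model of protection that is attached to a collection rather than to individual VMs. ProtectionProfile and Cloud are hypothetical stand-ins written for this illustration; they are not ASR or VMM objects, and the profile name is made up.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProtectionProfile:
    """Hypothetical stand-in for a protection template."""
    name: str


@dataclass
class Cloud:
    """Hypothetical stand-in for a collection of VMs sharing one profile."""
    name: str
    profile: ProtectionProfile
    vms: list = field(default_factory=list)

    def add_vm(self, vm_name: str) -> None:
        self.vms.append(vm_name)

    def protection_for(self, vm_name: str):
        # Protection is resolved through membership, never stored per VM.
        return self.profile if vm_name in self.vms else None


cloud = Cloud("Production", ProtectionProfile("san-replication"))
cloud.add_vm("sql01")  # existed when the profile was defined
cloud.add_vm("web07")  # created later, covered with no extra step
```

Because the profile hangs off the collection, the later VM picks it up simply by being placed there, which is the behavior the template approach is meant to give you.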


While the SnapManager suite of products offers deep integration with the applications, it does not offer proactive protection of VMs that have not yet been created. The SnapManager suite does, however, offer a mechanism for the application owner to directly initiate a protective Snapshot before making changes to an application, which can be automatically included in remote mirrors.


The ASR and VMM integration is limited to block-level storage protocols such as iSCSI or FC. This means that a VM created on an SMB 3.0 share should be live-migrated to block storage before the offloading feature can be used. Also, ASR and VMM integration with NetApp can only be used to move a VM from one NetApp controller to another NetApp controller, i.e., site-to-site replication. Replication to Azure from a NetApp controller is not yet supported.
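
These two limitations reduce to a short eligibility rule. The sketch below is illustrative only: san_replication_check and its string labels are hypothetical, and FCoE is included on the assumption that it counts as a block protocol here because it appears in the license list.

```python
# Block protocols named in this article; FCoE inclusion is an assumption.
BLOCK_PROTOCOLS = {"iSCSI", "FC", "FCoE"}


def san_replication_check(vm_protocol: str, source: str, target: str) -> tuple[bool, str]:
    """Return (eligible, reason). All argument values are illustrative labels."""
    if vm_protocol not in BLOCK_PROTOCOLS:
        return False, f"{vm_protocol} is not a block protocol; migrate the VM to block storage first"
    if source != "NetApp" or target != "NetApp":
        return False, "both ends must be NetApp controllers"
    return True, "eligible for SAN replication"
```

For example, san_replication_check("SMB3", "NetApp", "NetApp") is rejected on the protocol test, and san_replication_check("FC", "NetApp", "Azure") is rejected because the target is not a NetApp controller.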


ASR is another important tool for your toolbox, and it can be used to protect at-risk VMs. ASR protection can co-exist on the same NetApp controller with SnapManager protection, so you can still depend on those deeper integration points for a few mission-critical servers. The choice is yours, and the added flexibility should help you build a customizable and powerful DR/BC-enabled infrastructure.


Now read part 2, which focuses narrowly on the steps needed to make Azure Site Recovery operational in your environment.