Introduction
I was recently asked by a client to evaluate the DR replication and recovery features of Veeam Backup and Replication 9.0 (VBR) against the incumbent vSphere Replication 6.1 (VR) / Site Recovery Manager 6.1 (SRM) solution. I thought that the results would be worth sharing.
The tables below and accompanying notes describe the key features of the products in the evaluation. SRM and VR features are described individually, but are treated as a single “solution” for the purposes of this particular evaluation. The backup capabilities of VBR are not part of the evaluation.
I have tried to be as objective and open as possible, keeping opinion to a minimum. However, I would be the first to admit that I have more experience with the vSphere products than with Veeam. When I was researching this myself I found a lot of outdated information, so this is my attempt to provide an up-to-date resource.
If you believe that any of the information below is incorrect, by all means leave a comment, but please provide reference links to back up any assertion.
General
- These describe the core/traditional strengths of each of the products
- The different pricing models make generalised cost comparisons difficult
- Both solutions are agnostic of storage used at replication endpoints. However, VBR cannot orchestrate failover for SAN-replicated configurations as SRM can.
Replication
- VBR has configurable levels of compression. VR compression can only be enabled or disabled.
- vSphere Replication doesn’t natively provide throttling, but this can be achieved with Network I/O Control (http://www.vmguy.com/wordpress/throttle-vsphere-replication-with-network-io-control).
- When the “Application-aware” setting is configured, VBR can exclude swap file blocks from replication
- When the “Application-aware” setting is configured, VBR can exclude deleted file blocks from replication
- Both products can exclude specific vDisks from replication
- Both products support multiple recovery points
- VR replication settings are configured per VM. VBR settings are configured per job, and a job can apply to multiple VMs, potentially making configuration simpler. However, grouping many VMs into the same job may also place limitations on the orchestration of failover processing. This is confirmed for Planned Failover: “If you start planned failover for several VMs that are replicated with one replication job, these VMs will be processed one by one, not in parallel” (https://helpcenter.veeam.com/backup/vsphere/planned_failover.html ). It is unconfirmed for unplanned failover.
- Both products support using existing vDisks for replication seeding. VBR is more flexible regarding compatibility of the seeding vDisk.
- Replica consistency is the subject where most FUD is generated. To clarify:
- If VM quiescing is not used, both products will produce “crash consistent” replicas only.
- If VM quiescing is used, both products will use VSS to produce application consistent replicas for all Windows 2008 R2 or later systems running VSS-aware applications (https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2041909 ).
- VBR provides an additional “Application-aware” option that provides further functionality (such as the ability to truncate SQL log files) implemented by an agent within the VM that is installed on-demand.
- To assist with applications that are not VSS-aware, both products allow for freeze/thaw scripts. VR provides this via VMware Tools within the guest VM. VBR allows them to be configured within the management console GUI if the “Application-aware” configuration is enabled.
- VBR provides for pre/post replication job scripts. This may help with ensuring application consistency across multiple VMs.
- VBR uses CBT (Changed Block Tracking), for which VMware provides a public API that vendor applications may utilise. VR uses the HBR driver built into the hypervisor.
- VR can be configured for a minimum RPO of 15 minutes (5 minutes if vSAN is used at both source and destination). VBR can be configured for shorter schedule intervals, down to “continuous”. In reality, there is a practical lower limit to the replication interval, constrained by the time taken to replicate changed data to the target. Even “continuous” replication can take minutes between each replication cycle. In addition, RPOs of less than 15 minutes in conjunction with VM quiescing are usually impractical due to the impact of quiescing on VM performance.
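To make the point about the practical lower limit more concrete, here is a minimal back-of-the-envelope sketch in Python. It is not taken from either product: the function names, the fixed per-cycle overhead (standing in for snapshot and job start-up time) and all of the figures are assumptions chosen purely for illustration.

```python
def replication_cycle_seconds(change_mb_per_hour, link_mbit_per_s,
                              interval_s, fixed_overhead_s=0.0):
    """Estimate how long one replication cycle takes.

    All parameters are illustrative assumptions:
    change_mb_per_hour -- average data change rate of the VM(s)
    link_mbit_per_s    -- usable replication bandwidth
    interval_s         -- configured replication interval
    fixed_overhead_s   -- per-cycle cost (snapshot, job start-up, etc.)
    """
    # Data that changed since the previous cycle, in megabytes.
    changed_mb = change_mb_per_hour * interval_s / 3600.0
    # Megabytes -> megabits, divided by link speed, gives transfer time.
    transfer_s = changed_mb * 8.0 / link_mbit_per_s
    return fixed_overhead_s + transfer_s


def interval_is_sustainable(change_mb_per_hour, link_mbit_per_s,
                            interval_s, fixed_overhead_s=0.0):
    """An interval only works if a cycle finishes before the next is due."""
    cycle_s = replication_cycle_seconds(
        change_mb_per_hour, link_mbit_per_s, interval_s, fixed_overhead_s)
    return cycle_s <= interval_s


if __name__ == "__main__":
    # Hypothetical workload: 9000 MB/hour of change, 100 Mbit/s link,
    # 60 seconds of fixed overhead per cycle.
    for interval in (300, 60):
        cycle = replication_cycle_seconds(9000, 100, interval, 60)
        ok = interval_is_sustainable(9000, 100, interval, 60)
        print(f"interval={interval}s cycle={cycle:.0f}s sustainable={ok}")
```

With these made-up numbers a 5 minute interval is comfortable (a cycle takes about 2 minutes), whereas a 1 minute interval is not, because the fixed overhead alone already consumes the whole interval. Real figures will depend on the environment and should be measured rather than assumed.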
Orchestration
- VBR and VR can perform recovery on an individual VM. It can be done with SRM, but it would be convoluted and isn’t really the purpose of SRM.
- SRM provides the distinction between testing and executing a recovery plan, providing distinct network and IP address mappings for each scenario. VBR does not provide similar functionality for its failover plans.
- Network mappings are configured per replication job in VBR, which allows for different mappings per job, should the need arise. However, any subsequent mapping changes would require jobs to be modified individually. SRM mappings are managed in a single location and apply to all recovery plans. In addition, separate mappings can be configured for recovery plan testing.
- IP address mappings are configured per replication job in VBR, which allows for different mappings per job, should the need arise. However, any subsequent mapping changes would require jobs to be modified individually. SRM mappings are managed in a single location and apply to all recovery plans. In addition, separate mappings can be configured for recovery plan testing.
- VBR provides for failover plan pre and post scripts. Scripts can be added to SRM recovery plans as custom actions at pertinent points in the workflow.
- SRM provides for multiple pre and post power on scripts to be applied to individual VMs. Per-VM scripts are not available for VBR.
- A VBR failover plan allows VMs to be recovered in a specified order with a configurable delay between each VM power-on. VBR limits the maximum number of VMs that can be started simultaneously when you run a failover plan to 10. An SRM recovery plan provides 5 recovery priority levels, with the highest priority level VMs powered on first. Higher priority levels are considered a dependency for the lower levels: all VMs within a priority level are powered on before the next priority level's VMs are processed. Within each priority level, individual VM dependencies can also be defined; otherwise all VMs in that level are powered on simultaneously. In addition, SRM provides for custom actions to be placed at pertinent points in the workflow to run custom scripts or prompt the operator to perform a non-automated task (which pauses the workflow).
- Both solutions provide failback functionality. However, SRM implements this as part of the recovery plan process, allowing all VMs in the plan to be failed back in an automated manner. VBR failback needs to be performed on each VM individually.
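The priority-level and dependency ordering described for SRM recovery plans can be pictured with a short Python sketch. This is only a model of the behaviour as summarised in the notes above, not SRM's actual implementation or API: the function name `power_on_waves`, the data layout and the VM names are all hypothetical, and it assumes that priority 1 is the highest and that a VM may only depend on VMs in the same or a higher priority level.

```python
from collections import defaultdict


def power_on_waves(vms):
    """Group VMs into power-on "waves".

    vms maps a VM name to (priority, set of VM names it depends on).
    Priority 1 is treated as the highest (an assumption of this sketch).
    VMs in the same wave can be powered on simultaneously.
    """
    by_priority = defaultdict(dict)
    for name, (priority, deps) in vms.items():
        by_priority[priority][name] = set(deps)

    waves = []
    started = set()  # every VM powered on so far, across all levels
    for priority in sorted(by_priority):
        pending = by_priority[priority]
        while pending:
            # A VM is ready once everything it depends on has started.
            ready = sorted(n for n, deps in pending.items()
                           if deps <= started)
            if not ready:
                raise ValueError(
                    f"unsatisfiable dependency in priority {priority}")
            waves.append(ready)
            started.update(ready)
            for n in ready:
                del pending[n]
    return waves


if __name__ == "__main__":
    # Hypothetical three-tier application.
    plan = {
        "db01": (1, set()),
        "app01": (2, set()),
        "app02": (2, {"app01"}),  # waits for app01 within priority 2
        "web01": (3, set()),
        "web02": (3, set()),
    }
    for number, wave in enumerate(power_on_waves(plan), start=1):
        print(number, wave)
```

In this model the two web servers land in the same wave and start together, while `app02` waits for `app01` despite sharing its priority level. A VBR failover plan, as described above, would correspond to a single ordered list with a delay between entries rather than grouped waves.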
Management
- VBR is managed using the Veeam Backup and Replication console, which is logical and performs well. An additional plugin provides limited functionality to the vSphere Web Client. VR and SRM are managed as plugins to the vSphere Web Client, allowing the integration of VR and SRM functions into a single administrative console. However, it is well documented that the vSphere Web Client performs poorly.
- HTML reports for replication jobs can be generated by VBR very quickly using the management console. SRM allows for recovery plan information to be exported to HTML. Graphical VR reports are available for viewing within the vSphere Web Client, but are not readily exportable.
- Both solutions provide PowerShell plugins for access to their APIs.
- VBR provides the capability to back up and restore its configuration, but not in human-readable form. Limited VR configuration can be exported to a CSV file. SRM recovery plan configuration can be exported to HTML.
Conclusion
Both solutions have made significant strides to address their weaknesses. VBR has some orchestration capabilities in the form of Failover Plans, whilst VR has added multiple recovery points and network compression, for example.
However, it is still the case that VBR excels (unsurprisingly) at its primary function of backup and, to a lesser degree, replication, but is still limited on DR orchestration. The VR and SRM combination, by contrast, excels at DR availability and orchestration, but has no further functionality beyond that.
Note: Veeam has recently announced “Veeam Availability Orchestrator” which it claims delivers “Disaster recovery orchestration for the Enterprise”. It will be interesting to see if this plugs the orchestration gaps in the Veeam product suite.
References
Application-Aware Processing
https://helpcenter.veeam.com/backup/vsphere/application_aware_processing.html
https://bp.veeam.expert/job_configuration/application_aware_image_processing.html
How Volume Shadow Copy Service Works
https://technet.microsoft.com/en-us/library/cc785914(v=ws.10).aspx