Monitoring and Managing Hyper-V Replica

In the previous posts, we looked at scenarios for using Hyper-V Replica in a small business environment, and at how to enable and configure replication. Once replication is configured and working, we need to be able to manage the environment and monitor replication to ensure the integrity of the process.
In this post, we will examine the monitoring and management features, and look at testing the replica to ensure that the systems will fail over when required.

Monitoring

The easiest way to check the status of the replication is to view the properties via the console.

Just right-click the VM that has a replica and select Replication, View Replication Health. This can be done on the primary server or on the replica server.

The Replication Health window will give you the following information:

  • Replication State
  • Replication Type (whether this is the primary or replica)
  • The Primary and Replica server names
  • The Replication Health Status (the usual Windows OK tick, warning, or critical notifications)
  • Statistics for the hours since the last 9am processing run, including average size and latency, and any errors encountered
  • The last replication run
  • Status of test failovers

This information can also be exported to a CSV file.

If you are PowerShell inclined, you can view this information by using the Get-VMReplication cmdlet.
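
As a rough sketch, the details shown in the Replication Health window can be pulled from PowerShell on either host. Get-VMReplication is the cmdlet mentioned above; Measure-VMReplication is its companion in the Hyper-V module that returns the replication statistics. The VM name 'SRV01' and the CSV path are hypothetical, and the target folder is assumed to exist.

```powershell
# Assumes the Hyper-V PowerShell module is installed and the session is elevated.
Import-Module Hyper-V

# Replication state, health, mode and partner servers for every replicated VM on this host
Get-VMReplication | Format-Table -AutoSize

# Full detail for a single VM ('SRV01' is a hypothetical name)
Get-VMReplication -VMName 'SRV01' | Format-List *

# Replication statistics (sizes, latency, errors) saved to a CSV file
# ('C:\Reports' is a hypothetical folder that must already exist)
Measure-VMReplication |
    Export-Csv -Path 'C:\Reports\ReplicaHealth.csv' -NoTypeInformation
```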

There are also Performance Monitor counters available for monitoring.
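
The counters can also be read from PowerShell with Get-Counter. The sketch below assumes the replica counter set names start with "Hyper-V Replica"; it discovers the counters on the host rather than relying on exact counter names, which may differ between Windows versions.

```powershell
# Find the Hyper-V Replica counter set(s) on this host.
# The 'Hyper-V Replica*' name pattern is an assumption; adjust it if nothing is returned.
$sets = Get-Counter -ListSet 'Hyper-V Replica*' -ErrorAction SilentlyContinue

# Show which counters are available
$sets | Select-Object -ExpandProperty Counter

# Take one sample of every counter for every replicated VM instance
$paths = $sets.PathsWithInstances
if ($paths) {
    Get-Counter -Counter $paths -MaxSamples 1
}
```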

And of course, there are events to be monitored in the Event Logs.
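
One way to check the logs from a script is Get-WinEvent. This sketch looks for the two event IDs reported by a reader in the comments on this post (32315 and 32022). The log name is an assumption based on where the Virtual Machine Management Service normally writes its events; confirm it in Event Viewer on your own host.

```powershell
# Replication failures from the last 24 hours.
# Log name is an assumption; the IDs come from the reader comment on this post.
$filter = @{
    LogName   = 'Microsoft-Windows-Hyper-V-VMMS-Admin'
    Id        = 32315, 32022
    StartTime = (Get-Date).AddDays(-1)
}

# Get-WinEvent raises an error when nothing matches, so suppress that case
Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, Id, LevelDisplayName, Message |
    Format-List
```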

With all of this information available, it is straightforward to script monitoring and reporting on replication.
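
A minimal sketch of such a check, suitable for a scheduled task, is shown here. It uses the -ReplicationHealth filter of Get-VMReplication to find VMs that are not healthy; how you deliver the result (email, log file, monitoring tool) is left open.

```powershell
# Collect every replicated VM on this host that reports Warning or Critical health
$unhealthy = @(Get-VMReplication -ReplicationHealth Warning) +
             @(Get-VMReplication -ReplicationHealth Critical)

if ($unhealthy.Count -gt 0) {
    # List the affected VMs, then raise a warning that a wrapper script could act on
    $unhealthy | Format-Table -AutoSize
    Write-Warning "$($unhealthy.Count) VM(s) need attention on $env:COMPUTERNAME."
} else {
    Write-Output "All replicated VMs on $env:COMPUTERNAME report normal health."
}
```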

Management

The actions you can take on a replica VM are shown on the Replication menu for the VM on the replica server.

On the primary server, there are only 4 options, and the last 3 are the same on both servers. There is no option to test failover on the primary server, since the replica physically resides on the replica server only. Planned Failover on the primary server has the same end result as Failover on the replica server, but it is started while the primary is still available, so any outstanding changes are sent to the replica first.

  1. Failover

    If a problem occurs with the primary VM, there are some decisions to be made. If the outage is going to be short (e.g. a brief power failure), it may be best to wait it out rather than fail over, because performing a failover and then restoring from it will probably take longer than the outage itself. However, if the power will be cut for 4 hours and the business needs to be operational, then a failover to the replica would be a good option.

    To perform a failover, the primary VM must be offline. Select the recovery point to return the VM state to (or accept the latest one), then click the Fail Over button.

    If the primary VM is still online, you will get an error.

    The replica VM will start immediately. Once it has started up, you will need to reset its IP addresses, as the NIC hardware will be different. In most cases, not much else will need to change. It will look as though the VM is running with a snapshot (in fact, it is).

    The management choices on the Replication menu have now changed. There are 5 options when a VM is in failover mode:

    1. Reverse Replication. This option completes the failover by reversing the replication direction. The Reverse Replication wizard starts and takes you through the same steps as setting up a new replication. The currently running replica becomes the primary VM, while the VM at the original primary is removed and a new replica is created there. Initial replication is then performed in the reverse direction.
    2. Remove Recovery Points. This option completes the failover by making the VM a primary VM. Replication stops and the snapshot is merged into the VM. After this, the only options to continue are to reverse the replication (from the replica) or to remove replication (on both servers).
    3. Cancel Failover. This cancels the failover, discards the changes made on the replica, and returns operation to the original primary VM.
    4. View Replication Health. The replication health status will be displayed here with various errors, since the replication is in failover mode.
    5. Remove Replication. This option removes the replication connection between the two servers. The operation must be performed on both the primary and replica servers.

  2. Test Failover

    The Test Failover option creates a copy of the VM and allows you to turn it on and run it in a test setting. This lets you check that replication is working and that the VM will boot up without issues.

    Selecting Test Failover brings up the options for the test. You can select the latest recovery point, or an earlier point if any exist.

    A test VM is created with its NIC not connected. You could create an internal test network and connect the NIC to it, so that the VM boots in isolation for testing. You can perform any Hyper-V management function on this test VM. Remove the test VM when it is no longer required.

  3. Pause Replication

    This option pauses replication. To resume it, select Resume Replication.

  4. View Replication Health. This option has been discussed in the monitoring section above.
  5. Remove Replication. This option is self-explanatory. Note that the operation must be performed on both host servers.
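
The Failover steps in item 1 can also be driven from PowerShell on the replica server. This is a sketch rather than a complete runbook: the cmdlets are from the Hyper-V module, the VM name 'SRV01' is hypothetical, and the three follow-up actions are commented out because only one of them should be run.

```powershell
$vmName = 'SRV01'   # hypothetical VM name

# List the recovery points held for the replica VM
Get-VMSnapshot -VMName $vmName -SnapshotType Replica

# Fail over to the latest recovery point, then start the replica VM.
# The cmdlet asks for confirmation; the VM is started with a separate call here.
Start-VMFailover -VMName $vmName
Start-VM -Name $vmName

# Afterwards choose ONE of the following:

# a) Reverse Replication: make this VM the primary and replicate back
# Set-VMReplication -VMName $vmName -Reverse

# b) Remove Recovery Points: complete the failover and merge the recovery points
# Complete-VMFailover -VMName $vmName

# c) Cancel Failover: discard the changes made on the replica since failover
# Stop-VMFailover -VMName $vmName
```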

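Test Failover, Pause Replication and Remove Replication have PowerShell equivalents too. The sketch below again uses the hypothetical VM name 'SRV01'. The test failover commands are run on the replica server, and Remove-VMReplication is left commented out because it is destructive and has to be repeated on the other host.

```powershell
$vmName = 'SRV01'   # hypothetical VM name

# Create the test VM from the latest recovery point
Start-VMFailover -VMName $vmName -AsTest

# ... connect the test VM to an isolated switch, boot it and check it ...

# Stop the test failover, which removes the test VM
Stop-VMFailover -VMName $vmName

# Pause replication, and resume it later
Suspend-VMReplication -VMName $vmName
Resume-VMReplication -VMName $vmName

# Remove the replication relationship (run on both hosts)
# Remove-VMReplication -VMName $vmName
```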

3 thoughts on “Monitoring and Managing Hyper-V Replica”

    1. Monitor the Event Log.
      Look for the following event IDs:
      32315 Warning – Hyper-V failed to replicate changes for virtual machine ‘servername’ (Virtual Machine ID …..). Hyper-V will retry replication after 1 minute(s).
      32022 Error – Hyper-V could not replicate changes for virtual machine ‘servername’: The operation timed out (0x00002EE2). (Virtual Machine ID …..)
