Getting Started with Remus
This document was originally designed for use with the xm toolkit circa release 4.0. As such, it needs updating.
In essence, Remus works by continually live-migrating your VM from the primary physical host to the backup. So, the first thing you should do is make sure that you can live-migrate your VM (note that while Remus will synchronise your VM's local disk image to the backup, basic live migration expects a single disk image published through shared storage to both the sender and the receiver):
xm migrate --live myvm mybackuphost
xl migrate myvm mybackuphost
If that works, the next step is to try running Remus in its simplest mode, without disk replication or network protection (note that this can cause disk corruption -- use a scratch image until you turn on disk replication in the next step):
remus --no-net myvm mybackuphost
xl remus myvm mybackuphost
With luck you should see a continual stream of messages on your console every 200 milliseconds, indicating that checkpoints are being transmitted to the backup. You can increase or decrease the checkpoint frequency by passing a -i n option to remus, where n is the desired interval between checkpoints, in milliseconds. For example,
remus -i 100 myvm mybackuphost
xl remus -i 100 myvm mybackuphost
will send checkpoints every 100 ms. Once Remus has started up, the VM will exist on both the primary and the backup. If you run "xm list" or "xl list" on the backup, you should see the VM sitting in the p (paused) state, consuming no CPU unless the primary fails.
Once you have Remus running, you can test failover however you want, e.g., by running "xm destroy myvm" or "xl destroy myvm" on the primary, or even pulling the power plug on the primary. Run "xm list" or "xl list" on the backup now, and you should see that your VM is now in either the r (running) or b (blocked) states, meaning that it is either processing or waiting for work to do. Log in and see that whatever was running before you killed the primary is still running on the backup!
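If you would rather script this check than eyeball the listing, something like the following sketch may help. It assumes the state flags sit in the fifth column of the "xl list" output (for example --p--- for a paused domain). The function names vm_state_flags and describe_state are made up for illustration and are not part of Xen.

```shell
# Hypothetical helpers, not part of the Xen toolstack.

# Read `xl list`-style text on stdin and print the State column
# (assumed to be column 5) for the domain named in $1.
vm_state_flags() {
    awk -v name="$1" '$1 == name { print $5 }'
}

# Map a flags string such as "--p---" to a word.
# r = running, b = blocked, p = paused, as described in the text.
describe_state() {
    case "$1" in
        *r*) echo running ;;
        *b*) echo blocked ;;
        *p*) echo paused ;;
        *)   echo unknown ;;
    esac
}
```

On the backup you could then pipe the listing through the first helper, e.g. xl list | vm_state_flags myvm, and hand the result to describe_state.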
Now it's time to add disk protection, which ensures that the memory checkpoint on the backup is exactly synchronized with its disk state. Before starting your VM, ensure that each of the physical hosts has an identical copy of the disk image available at the same path. For example, if your VM's image is at /dev/vmdisk/myguest on the primary, you should create the same disk on the backup and synchronize it before starting your guest. LVM snapshots are a simple and quick way to go while you're testing: install your guest image on an LVM volume, then use snapshots of that volume as the VM disk path. After failover, instead of resynchronizing the disk you can just drop and recreate the snapshots.
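As a rough sketch of the snapshot approach, the helper below drops and recreates a snapshot using the standard LVM tools lvremove and lvcreate. Everything specific here is an assumption made for illustration: the helper names run and reset_snapshot, an origin volume called myguest-base in the vmdisk volume group, and the 2G snapshot size. You would run the same thing on both hosts while the guest is stopped.

```shell
# Hypothetical helpers for resetting a test disk between failover runs.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: volume group, origin LV, snapshot name, snapshot size.
reset_snapshot() {
    vg=$1 origin=$2 snap=$3 size=$4
    # Remove the old snapshot if it exists (always shown in dry-run mode).
    if [ "${DRY_RUN:-0}" = 1 ] || [ -e "/dev/$vg/$snap" ]; then
        run lvremove -f "/dev/$vg/$snap"
    fi
    # Take a fresh snapshot of the origin volume under the same name.
    run lvcreate --snapshot --name "$snap" --size "$size" "/dev/$vg/$origin"
}
```

With these assumptions, reset_snapshot vmdisk myguest-base myguest 2G would leave a fresh snapshot at /dev/vmdisk/myguest. Setting DRY_RUN=1 first prints the commands without touching any volumes.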
To ask Remus to mirror your guest's disk, simply replace tap2: in the device string with tap2:remus:backuphost:anyfreeport|, e.g.:
disk = [ 'tap2:remus:bkup:9000|aio:/dev/vmdisk/myguest,xvda1,w' ]
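The substitution can also be sketched as a small shell helper, which may be handy if you generate guest configs from a script. The function name remus_disk_spec is hypothetical; it only rewrites the string and does not talk to Xen.

```shell
# Hypothetical helper: turn a tap2: disk spec into its Remus-mirrored form.
# Arguments: backup host, free port on the backup, original disk spec.
remus_disk_spec() {
    backup_host=$1
    port=$2
    spec=$3
    case "$spec" in
        tap2:*)
            # Replace the leading "tap2:" with "tap2:remus:<host>:<port>|".
            printf 'tap2:remus:%s:%s|%s\n' "$backup_host" "$port" "${spec#tap2:}"
            ;;
        *)
            echo "not a tap2: disk spec: $spec" >&2
            return 1
            ;;
    esac
}
```

Calling remus_disk_spec bkup 9000 'tap2:aio:/dev/vmdisk/myguest,xvda1,w' reproduces the device string from the example above.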
When your VM first tries to write to the disk (most likely at mount time), the Remus tapdisk will automatically open a channel to the backup on the port number you supplied. Note that this will block until you start the remus script, which creates the receiver on the backup.
Network buffering
For full protection, remove the --no-net flag from the options you pass to remus. You'll notice that your ping times go up by, on average, half your checkpoint interval, but that if you have a network connection to your VM, it will remain functioning even after failure. Try sshing into your guest and running, say, top -d 0.5, then pulling the plug on the primary. You may see a second or two of network delay, but the session should remain intact.
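To get a feel for the numbers, the rule of thumb above (average added latency of about half the checkpoint interval) can be turned into a one-line helper. The function name avg_ping_overhead_ms is made up for illustration, and the result is only the estimate from the text, not a measurement.

```shell
# Hypothetical helper: estimate the average extra ping latency (ms)
# for a checkpoint interval given in ms, using the interval / 2 rule.
avg_ping_overhead_ms() {
    awk -v interval="$1" 'BEGIN { printf "%.1f\n", interval / 2 }'
}
```

So with the default 200 ms interval you would expect roughly 100 ms of added latency on average, and passing -i 100 should bring that down to around 50 ms.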
Congratulations, your VM is now fully protected!
For best results, pin the VCPUs for dom0 and your guest to separate cores (letting them migrate freely can cause occasional large spikes in suspend latency when the system is under load):
xm vcpu-pin Domain-0 0 0
xm vcpu-pin myguest 0 1
xl vcpu-pin Domain-0 0 0
xl vcpu-pin myguest 0 1
Using DRBD as disk backend
DRBD disks offer one main advantage over traditional tapdisk backends: resynchronization of storage after failover. When the primary machine comes back online after a failover, the VM's disk can be resynchronized with the backup's copy while the VM continues to execute. DRBD's resynchronization algorithms take care of copying only those blocks that have changed since the failure happened. Once the resynchronization completes, you can immediately start Remus from the backup to the primary, without ever needing to stop or shut down the VM! For more information on DRBD and its functionality, please visit the DRBD project webpage.
To use Remus with DRBD, you have to use a modified version of DRBD that supports the Remus-style disk replication protocol (named protocol D). You can find the git repo at git://aramis.nss.cs.ubc.ca/drbd-8.3-remus. This repo is currently forked from the drbd-8.3.9 branch. Just clone the repo to your machine, then build and install the DRBD kernel module and the drbd-utils package in the usual way (refer to the documentation on the DRBD webpages). The drbd-utils package from Linux distribution repositories will not work. To use a DRBD resource as a Remus disk, specify "protocol D" as the disk replication protocol. A couple of sample configuration files are available in the scripts/ directory (global_common.conf.protoD, testvms_protoD.res).
Protocol D is a hybrid between asynchronous replication and checkpoint-based replication. Disk writes are buffered at the backup and released to the disk only at the end of a checkpoint. Protocol D operates in dual-primary mode when running under Remus and in primary-secondary mode otherwise. The block-drbd script in /etc/xen/scripts/ (installed by the drbd-utils package) automatically takes care of promoting the backup's resources to primary state when remus is started.
- This page was originally part of the page located at http://nss.cs.ubc.ca/remus/doc.html which no longer exists.