The contents of this wiki are no longer actively maintained. The most current documentation is available at http://ceph.com/docs.

QEMU-RBD

From Ceph wiki

Revision as of 21:04, 10 June 2012 by Joshd (Talk | contribs)

Introduction

The qemu-rbd module was originally developed by Christian Brunner. It allows striping a VM block device over objects stored in the Ceph distributed object store. This gives you shared block storage, which makes features such as VM migration between hosts possible.

See also rbd.

Building

The prerequisites for building QEMU with rbd support are both librados and librbd (librados-dev and librbd-dev packages in Debian). It goes something like this:

$ git clone git://git.qemu.org/qemu.git
$ cd qemu
$ ./configure --enable-rbd
$ make && make install
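
After installing, it can be worth confirming that the build really picked up rbd support. The following is a minimal sketch, assuming that `qemu-img --help` prints a "Supported formats:" line listing the compiled-in drivers; the function name `supports_rbd` is made up for this example.

```shell
# supports_rbd: read qemu-img help text on stdin and succeed if "rbd"
# appears as a whole word on the "Supported formats:" line.
# (Hypothetical helper; assumes the help text contains such a line.)
supports_rbd() {
    grep '^Supported formats:' | grep -qw 'rbd'
}

# Only attempt the check if qemu-img is actually on the PATH.
if command -v qemu-img >/dev/null 2>&1; then
    if qemu-img --help | supports_rbd; then
        echo "rbd support: yes"
    else
        echo "rbd support: no"
    fi
fi
```

If the answer is "no", the usual suspect is that the librados and librbd development packages were missing when ./configure ran.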

Creating rbd image

One way to create images is with the rbd tool, as described in Rbd.

There are two other options that use the qemu tools:

  • Creating a new image:

This will create a 10 GB image named 'foo' in the RADOS pool 'data'.

$ qemu-img create -f rbd rbd:data/foo 10G

  • Converting an old image:

This will create an image named 'lenny' in the 'data' pool containing a copy of our old qcow2 image:

$ qemu-img convert -f qcow2 -O rbd /data/debian_lenny_amd64_small.qcow2 rbd:data/lenny

Now we can run kvm using this image for booting:

$ qemu-system-x86_64 --drive format=rbd,file=rbd:data/lenny
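
The rbd: strings passed to qemu-img and qemu-system-x86_64 above follow the pattern rbd:pool/image, optionally followed by colon-separated option=value pairs (the Caching section of this page uses rbd_cache=1 in the same way). A small sketch of how such a string is put together; the helper name `rbd_spec` is hypothetical and not part of QEMU or Ceph:

```shell
# rbd_spec POOL IMAGE [OPTION=VALUE ...]
# Print an "rbd:pool/image[:option=value...]" string for use with
# qemu-img or the file= part of a QEMU drive definition.
# (Hypothetical helper, for illustration only.)
rbd_spec() {
    pool=$1
    image=$2
    shift 2
    spec="rbd:${pool}/${image}"
    # Append each remaining argument as a colon-separated option.
    for opt in "$@"; do
        spec="${spec}:${opt}"
    done
    printf '%s\n' "$spec"
}

rbd_spec data lenny               # prints rbd:data/lenny
rbd_spec data lenny rbd_cache=1   # prints rbd:data/lenny:rbd_cache=1
```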

Migration

We can also migrate a running VM between two hosts that can both access the RADOS cluster. On the source host we run the previous command, and on the destination host (called dest_host) we run:

$ qemu-system-x86_64 --drive format=rbd,file=rbd:data/lenny -incoming tcp:0:4444

Now we can migrate. On the source host we enter the QEMU monitor console (Ctrl+Alt+2) and run:

% migrate -d tcp:dest_host:4444

Libvirt

Virtual disks

libvirt is a library for managing virtual machines. With version 0.8.7 or later, you can define network disks, which let you use rbd via libvirt.

An example disk configuration would look like:

    <disk type='network' device='disk'>
      <source protocol='rbd' name='poolname/imagename'>
        <host name='mon.example.org' port='6789'/>
        <host name='mon.example.org' port='6790'/>
        <host name='mon.example.org' port='6791'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>

The hosts in this configuration are the Ceph monitors. If these are set in /etc/ceph/ceph.conf, they may be omitted from the libvirt disk configuration. Note: On Ubuntu, AppArmor blocks access to /etc/ceph/ceph.conf by default, which causes a permission denied error. The quick fix is to change /etc/apparmor.d/abstractions/libvirt-qemu to allow access.
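
As a rough sketch of that quick fix, the function below appends a read rule for /etc/ceph/ceph.conf to a given AppArmor file unless the path is already mentioned there. The function name `allow_ceph_conf` is hypothetical, and the rule text is an assumption based on ordinary AppArmor file-rule syntax (path, then permissions, then a trailing comma); check it against the rules already present in your file before relying on it.

```shell
# allow_ceph_conf PROFILE_FILE
# Append a read-only rule for /etc/ceph/ceph.conf to PROFILE_FILE,
# unless the file already mentions that path.
# (Hypothetical helper; the rule text is an assumed AppArmor file rule.)
allow_ceph_conf() {
    profile=$1
    rule='  /etc/ceph/ceph.conf r,'
    if ! grep -qF '/etc/ceph/ceph.conf' "$profile"; then
        printf '%s\n' "$rule" >> "$profile"
    fi
}

# Typical use, as root, with the path from the note above:
#   allow_ceph_conf /etc/apparmor.d/abstractions/libvirt-qemu
# Afterwards, reload the AppArmor profiles in whatever way your release
# provides and restart the guest.
```

Calling the function twice is harmless, since the second call finds the path already present and leaves the file alone.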

Snapshotting

While QEMU-RBD supports snapshots, you can't simply snapshot a running virtual machine with rbd or qemu-img. If you do, the QEMU process is not notified and so does not know about the snapshot.

Upstream libvirt has supported snapshotting RBD devices since 0.9.12. The version included in Ubuntu 12.04 also includes this capability.

To snapshot a running virtual machine, you can use libvirt's virsh snapshot-create command, which notifies the running QEMU process.

For example:

root@client01:~# virsh snapshot-create alpha
Domain snapshot 1283504027 created

root@client01:~#

Now we want to list the available snapshots:

root@client01:~# virsh snapshot-list alpha
 Name                 Creation Time             State
---------------------------------------------------
 1283504027           2010-09-03 10:53:47 +0200 running

root@client01:~#
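
The snapshot name in the listing above appears to be the creation time expressed as seconds since the Unix epoch, which is what you get when no name is supplied. A small sketch for turning such a name back into a readable date; it assumes a date command that accepts the GNU-style `-d @SECONDS` form, and the function name is made up for this example:

```shell
# snapshot_name_to_utc NAME
# Interpret a numeric snapshot name as seconds since the Unix epoch
# and print the corresponding UTC date and time.
# (Hypothetical helper; relies on GNU-style "date -d @SECONDS".)
snapshot_name_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

snapshot_name_to_utc 1283504027   # prints 2010-09-03 08:53:47
```

The result is in UTC, so it is two hours behind the +0200 creation time shown by virsh snapshot-list.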

Caching

Caching was added in Ceph version 0.46. It can be enabled by setting rbd_cache=1, as in the following example:

    <disk type='network' device='disk'>
      <source protocol='rbd' name='poolname/imagename:rbd_cache=1'/>
      <driver name='qemu' type='rbd'/>
      <target dev='vda' bus='virtio'/>
    </disk>

As of version 0.47, you can change the cache size by setting rbd_cache_size, and switch the cache to write-through mode by setting rbd_cache_max_dirty=0, for example:

    <disk type='network' device='disk'>
      <source protocol='rbd' name='poolname/imagename:rbd_cache=1:rbd_cache_size=67108864:rbd_cache_max_dirty=0'/>
      <driver name='qemu' type='rbd'/>
      <target dev='vda' bus='virtio'/>
    </disk>
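
The value 67108864 in the example is 64 MiB written out in bytes, on the assumption that rbd_cache_size is given in bytes. A minimal sketch of the arithmetic and of assembling the name attribute; both function names are hypothetical helpers, not part of Ceph or libvirt:

```shell
# mib_to_bytes N: print N mebibytes expressed in bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# rbd_cache_name POOL/IMAGE SIZE_MIB
# Print a value for the name attribute of the <source> element with
# caching enabled and the cache size set to SIZE_MIB mebibytes.
# (Assumes rbd_cache_size takes a byte count.)
rbd_cache_name() {
    printf '%s:rbd_cache=1:rbd_cache_size=%s\n' "$1" "$(mib_to_bytes "$2")"
}

mib_to_bytes 64                        # prints 67108864
rbd_cache_name poolname/imagename 64   # prints poolname/imagename:rbd_cache=1:rbd_cache_size=67108864
```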

For more details see http://article.gmane.org/gmane.comp.file-systems.ceph.devel/6402
