Rados Block Device merged for 2.6.37

sage

The Linux kernel merge window is open for v2.6.37, and RBD (the RADOS block device) has finally been merged. RBD lets you create a block device in Linux whose contents are striped over objects stored in a Ceph distributed object store. This basic approach gives you some nice features:

  • “thin provisioning” — space isn’t used in the cluster until you write to it
  • reliable — data objects are replicated by Ceph, so no single node failure (besides the mounting host) will take out the device
  • scalable — the device can be arbitrarily sized (and resized)
  • snapshots — RBD supports read-only named snapshots (and rollback)
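
To make the striping and thin-provisioning ideas concrete, here is a toy Python model. It is only a sketch and not the kernel code: the 4 MiB object size, the split_extent helper, and the ThinImage class are all assumptions made for illustration, and an in-memory dict stands in for the object store.

```python
# Toy model only: names and the object size below are illustrative assumptions,
# not taken from the rbd kernel driver.
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size (4 MiB)


def split_extent(offset, length, object_size=OBJECT_SIZE):
    """Split a device byte range into (object_index, offset_in_object, length) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // object_size
        within = offset % object_size
        # Stop at the end of the request or the end of this object, whichever is first.
        chunk = min(object_size - within, end - offset)
        pieces.append((index, within, chunk))
        offset += chunk
    return pieces


class ThinImage:
    """Toy thin-provisioned image: an object exists only after it is first written."""

    def __init__(self, size, object_size=OBJECT_SIZE):
        self.size = size
        self.object_size = object_size
        self.objects = {}  # object index -> bytearray (stand-in for the object store)

    def _check(self, offset, length):
        if offset < 0 or length < 0 or offset + length > self.size:
            raise ValueError("I/O outside the device")

    def write(self, offset, data):
        self._check(offset, len(data))
        pos = 0
        for index, within, chunk in split_extent(offset, len(data), self.object_size):
            # Allocate the backing object lazily, on first write.
            obj = self.objects.setdefault(index, bytearray(self.object_size))
            obj[within:within + chunk] = data[pos:pos + chunk]
            pos += chunk

    def read(self, offset, length):
        self._check(offset, length)
        out = bytearray()
        for index, within, chunk in split_extent(offset, length, self.object_size):
            obj = self.objects.get(index)
            if obj is None:
                out += bytes(chunk)  # never written: reads back as zeros
            else:
                out += obj[within:within + chunk]
        return bytes(out)
```

In this model, creating a large ThinImage costs nothing until data is written, and a write that crosses an object boundary touches exactly the objects it overlaps. Replication and snapshots are left out entirely; in the real system they are handled by the Ceph cluster rather than by the block device client.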

One of the nice things about the kernel implementation is that there is relatively little new code; it mostly reuses the infrastructure already in place for the Ceph file system. The biggest change is a refactor that moves much of the old ceph module (fs/ceph) into libceph (now in net/ceph), which contains the networking layer and the code that interacts with the cluster monitor and OSDs. The new rbd module (drivers/block/rbd.c) uses only libceph. One consequence of this refactor is that the ceph-client-standalone.git repository (which contains just the backported module source, so you can build ceph and rbd against older kernels) has been reorganized to contain three separate modules: libceph, ceph, and rbd. The new RBD code is currently in the unstable and unstable-backport branches of ceph-client-standalone.git, and in the unstable branch of ceph-client.git.

There is also a new(ish) command-line tool, rbd, for creating and manipulating images (block devices) within the cluster.

For more information about using RBD, see http://ceph.newdream.net/wiki/Rbd.