Christian Brunner sent an initial implementation of ‘rbd’, a librados-based block driver for qemu/KVM, to the ceph-devel list last week. A few minor nits aside, it looks pretty good and works well. The basic idea is to stripe a VM block device over (by default) 4MB objects stored in the Ceph distributed object store. This gives you shared block storage to facilitate VM migration between hosts and fancy things like that. The implementation is super simple: it’s just a few hundred lines wiring the qemu storage abstraction up to librados. (This is very similar to what the Sheepdog folks are doing.)
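To make the striping idea concrete, here is a minimal sketch of how a byte range on the block device could be split across fixed-size objects. It is an illustration only, not the actual rbd code: the function names, the `object_name` naming scheme, and the simple one-object-after-another layout are all assumptions. Only the 4MB default object size comes from the description above.

```python
# Illustrative sketch only; not taken from the rbd driver.
OBJECT_SIZE = 4 * 1024 * 1024  # 4MB default object size


def map_extent(offset, length, object_size=OBJECT_SIZE):
    """Split a device byte range into per-object pieces.

    Returns a list of (object_index, offset_in_object, piece_length)
    tuples, one for each object the range touches.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // object_size      # which object holds this byte
        within = offset % object_size      # position inside that object
        piece = min(object_size - within, end - offset)
        pieces.append((index, within, piece))
        offset += piece
    return pieces


def object_name(image, index):
    # Hypothetical naming scheme: image name plus a hex object index.
    return f"{image}.{index:012x}"
```

A request that crosses an object boundary simply becomes two (or more) object operations, which is why the driver itself can stay so small.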
We’re currently hacking together a proper rbd Linux block device for the kernel, as well, based on the osdblk device (which turns a SCSI T10 OSD object into a block device). The goal is to make the two compatible. At this stage you can create an rbd block device, format (mke2fs) and mount it, and it seems to work.
Both drivers will eventually get snapshot support.
Stay tuned!
Sounds cool! Will these drivers also support I/O barrier requests?
At the moment we don’t support that. It would require much more complicated synchronization logic, which we are currently trying to avoid. The RADOS cluster does not provide such requests out of the box, so supporting them would mean adding some controlling mechanism inside the driver (or using some other cluster entity for that). In the Ceph filesystem, such issues are handled in a much more complicated way, through both the filesystem client and the MDS.
Does this make Sheepdog obsolete?