Today Inktank announced the latest offering for its Enterprise Subscription customers, “Inktank Ceph Enterprise.” This release couples the existing power of Ceph with Inktank’s unparalleled support and a monitoring and analytics GUI. Inktank Ceph Enterprise aims to enhance the Ceph experience for enterprise customers, using the same underlying APIs, tools, and commands that are available to the Open Source community, packaged in a single SKU for easy purchasing.
Ceph is deeply rooted in Open Source ideals, and one of the main goals of the open source project is to see companies build innovative and revolutionary products with it. A deliberately fragmented copyright ensures that no one can exert undue control over the project, and the LGPL 2 license ensures that proprietary software can comfortably plug in to Ceph at will. This positions Ceph to deliver the perfect mix of freedom and enterprise capability. While there are quite a few products being developed with Ceph as a cornerstone, Inktank Ceph Enterprise is a prime example of the power and flexibility that Ceph’s open design can bring to the enterprise world.
- Posted by sage
- October 18th, 2013
This development release includes a significant amount of new code and refactoring, as well as a lot of preliminary functionality that will be needed for erasure coding and tiering support. There are also several significant patch sets improving the MDS.
- The MDS now disallows snapshots by default as they are not considered stable. The command ‘ceph mds set allow_snaps’ will enable them.
- For clusters that were created before v0.44 (pre-argonaut, Spring 2012) and store radosgw data, the auto-upgrade from TMAP to OMAP objects has been disabled. Before upgrading, make sure that any buckets created on pre-argonaut releases have been modified (e.g., by PUTing and then DELETEing an object from each bucket; see the sketch after these notes). Any cluster created with argonaut (v0.48) or a later release, or not using radosgw, never relied on the automatic conversion and is not affected by this change.
- Any direct users of the ‘tmap’ portion of the librados API should be aware that the automatic tmap -> omap conversion functionality has been removed.
- Most output that used K or KB (e.g., for kilobyte) now uses a lower-case k to match the official SI convention. Any scripts that parse output and check for an upper-case K will need to be modified.
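For the TMAP-to-OMAP note above, one way to touch each old bucket is through the S3 API. A minimal sketch, assuming s3cmd is configured against your radosgw endpoint; the bucket name is hypothetical:

```
# Repeat for each bucket created on a pre-argonaut release.
# PUTing and then DELETEing an object rewrites the bucket index,
# converting it from TMAP to OMAP in the process.
echo touch > /tmp/omap-touch
s3cmd put /tmp/omap-touch s3://my-old-bucket/omap-touch
s3cmd del s3://my-old-bucket/omap-touch
```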
Notable changes:
- build: Makefile refactor (Roald J. van Loon)
- ceph-disk: fix journal preallocation
- ceph-fuse: trim deleted inodes from cache (Yan, Zheng)
- ceph-fuse: use newer fuse api (Jianpeng Ma)
- ceph-kvstore-tool: new tool for working with leveldb (copy, crc) (Joao Luis)
- common: bloom_filter improvements, cleanups
- common: correct SI is kB not KB (Dan Mick)
- common: misc portability fixes (Noah Watkins)
- hadoop: removed old version of shim to avoid confusing users (Noah Watkins)
- librados: fix installed header #includes (Dan Mick)
- librbd, ceph-fuse: avoid some sources of ceph-fuse, rbd cache stalls
- mds: fix LOOKUPSNAP bug
- mds: fix standby-replay when we fall behind (Yan, Zheng)
- mds: fix stray directory purging (Yan, Zheng)
- mon, osd: improve osdmap trimming logic (Samuel Just)
- mon: kv properties for pools to support EC (Loic Dachary)
- mon: some auth check cleanups (Joao Luis)
- mon: track per-pool stats (Joao Luis)
- mon: warn about pools with bad pg_num (see the sketch after this list)
- osd: automatically detect proper xattr limits (David Zafman)
- osd: avoid extra copy in erasure coding reference implementation (Loic Dachary)
- osd: basic cache pool redirects (Greg Farnum)
- osd: basic whiteout, dirty flag support (not yet used)
- osd: clean up and generalize copy-from code (Greg Farnum)
- osd: erasure coding doc updates (Loic Dachary)
- osd: erasure coding plugin infrastructure, tests (Loic Dachary)
- osd: fix RWORDER flags
- osd: fix exponential backoff of slow request warnings (Loic Dachary)
- osd: generalized temp object infrastructure
- osd: ghobject_t infrastructure for EC (David Zafman)
- osd: improvements for compatset support and storage (David Zafman)
- osd: misc copy-from improvements
- osd: opportunistic crc checking on stored data (off by default)
- osd: refactor recovery using PGBackend (Samuel Just)
- osd: remove old magical tmap->omap conversion
- pybind: fix blacklisting nonce (Loic Dachary)
- rgw: default log level is now more reasonable (Yehuda Sadeh)
- rgw: fix acl group check (Yehuda Sadeh)
- sysvinit: fix shutdown order (mons last) (Alfredo Deza)
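Regarding the new pg_num warning above: if the monitor flags one of your pools, the placement group counts can be raised with the usual pool-set commands. A minimal sketch, assuming a pool named ‘rbd’ and an illustrative target of 256 PGs:

```
# Raise the number of placement groups for the pool, then raise the
# number used for data placement to match.
ceph osd pool set rbd pg_num 256
ceph osd pool set rbd pgp_num 256
```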
We have now frozen the code for v0.72 Emperor, and the next sprint or two will be focused primarily on stability and testing (particularly the upgrade path). There is also still a lot of ongoing development work in flight for the erasure coding and tiering coming in Firefly, but that code may sit outside of master for a bit longer while we harden things.
You can get v0.71 from the usual places:
- Posted by sage
- October 6th, 2013
Another development release is out. Our timing on these has been slightly erratic, so this one contains a bit less than v0.69 did or v0.71 will. The highlights are some rgw and mon fixes, and the architecture detection for enabling the optimized Intel CRC32c code is now working (which is nice: it’s about 8x faster than the generic code!). There is also one minor librados API change; librados users should check the release notes.
Notable changes include:
- mon: a few ‘ceph mon add’ races fixed (command is now idempotent) (Joao Luis)
- crush: fix name caching
- rgw: fix a few minor memory leaks (Yehuda Sadeh)
- ceph: improve parsing of CEPH_ARGS (Benoit Knecht)
- mon: avoid rewriting full osdmaps on restart (Joao Luis)
- crc32c: fix optimized crc32c code (it now detects arch support properly)
- mon: fix ‘ceph osd crush reweight …’ (Joao Luis)
- osd: revert xattr size limit (fixes large rgw uploads)
- mds: fix heap profiler commands (Joao Luis)
- rgw: fix inefficient use of std::list::size() (Yehuda Sadeh)
For more information, please see the complete release notes.
You can get v0.70 from the usual locations:
- Posted by sage
- October 4th, 2013
This point release fixes an important performance issue with radosgw, keystone authentication token caching, and CORS. All users (especially those of rgw) are encouraged to upgrade.
Notable changes:
- crush: fix invalidation of cached names
- crushtool: do not crash on non-unique bucket ids
- mds: be more careful when decoding LogEvents
- mds: fix heap check debugging commands
- mon: avoid rebuilding old full osdmaps
- mon: fix ‘ceph osd crush move …’ (see the sketch after this list)
- mon: fix ‘ceph osd crush reweight …’
- mon: fix writeout of full osdmaps during trim
- mon: limit size of transactions
- mon: prevent both unmanaged and pool snaps
- osd: disable xattr size limit (the limit was preventing upload of large rgw objects)
- osd: fix recovery op throttling
- osd: fix throttling of log messages for very slow requests
- rgw: drain pending requests before completing write
- rgw: fix CORS
- rgw: fix inefficient list::size() usage
- rgw: fix keystone token expiration
- rgw: fix minor memory leaks
- rgw: fix null termination of buffer
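As a refresher on the two CRUSH commands fixed above, a minimal sketch; the host, rack, and OSD names here are hypothetical:

```
# Move a host bucket to a new position in the CRUSH hierarchy:
ceph osd crush move node1 rack=rack1
# Change the CRUSH weight of a single OSD:
ceph osd crush reweight osd.3 1.0
```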
For more information, please see the complete release notes and changelog.
You can get v0.67.4 from the usual locations:
For the past few months I have been working towards a way to use Ceph for virtual machine images in Apache CloudStack. This integration is important to end users because it allows them to use Ceph’s distributed block device (RBD) to speed up provisioning of virtual machines.
We (my company) have been long-time contributors to Ceph (since version 0.17!), and will be using it in our own cloud product. Support for Ceph didn’t exist in CloudStack… So we built it!
I’m co-owner of a Dutch webhosting company called PCextreme B.V. As CTO, I handle our Research & Development, which enables me to play with Ceph (a lot).
Quite some time ago we became convinced we wanted to use Ceph with RBD in our VPS product, but we weren’t sure how. Were we going to write our own cloud management software? OpenStack seemed like a good choice since it already had RBD integration, but while looking at OpenStack we came across CloudStack. I’m not going to rehash the OpenStack vs. CloudStack debate, but we decided CloudStack suited us better. It did, however, lack RBD support!
To make this integration work, a few pieces had to be built first. That work has been completed and merged, and will all be part of the new CloudStack 4.0 release, which is slated for the end of October. Between now and then, we’d like people to try it!
To get started, take a look at the related documentation. If you encounter any problems, feel free to ask for help on the Ceph or CloudStack mailing lists, or join the #ceph (OFTC) or #cloudstack (Freenode) IRC channels; I’m idling there most of the time.
In my (rather brief) time digging in to Ceph and working with the community, most discussions generally boil down to two questions: “How does Ceph work?” and “What can I do with Ceph?” The first question has garnered a fair amount of attention in our outreach efforts. Ross Turk’s post “More Than an Object Store” does a fantastic job summarizing Ceph’s magic. The second question is what I will address below.
So what can you do with Ceph? For those who like to read the ending first, the answer turns out to be “a blindingly awesome ton.” Thankfully that doesn’t spoil it for the rest of us, because it’s the details that make it fun. In an email discussion of these details, it was Inktank’s chief suit, Bryan Bogensberger, who managed to succinctly summarize many of the available options while still citing examples and supporting data. (How do you like that, a business guy who has a solid handle on the tech. How lucky are we!?) Without immediately overwhelming you with all the supporting details, his list was as follows:
- Posted by sage
- October 16th, 2012
Another development release of Ceph is ready, v0.53. We are getting pretty close to what will be frozen for the next stable release (bobtail), so if you would like a preview, give this one a go. Notable changes include:
- librbd: image locking
- rbd: fix list command when more than 1024 (format 2) images
- osd: backfill reservation framework (to avoid flooding new osds with backfill data)
- osd, mon: honor new “nobackfill” and “norecover” osdmap flags (see the sketch after this list)
- osd: new “deep scrub” will compare object content across replicas (once per week by default)
- osd: crush performance improvements
- osd: some performance improvements related to request queuing
- osd: capability syntax improvements, bug fixes
- osd: misc recovery fixes
- osd: fix memory leak on certain error paths
- osd: default journal size to 1 GB
- crush: default root of the tree is now “root” instead of “pool” (to avoid confusion wrt rados pools)
- ceph-fuse: fix handling for .. in root directory
- librados: some locking fixes
- mon: some election bug fixes
- mon: some additional on-disk metadata to facilitate future mon changes (post-bobtail)
- mon: throttle osd flapping based on osd history (limits osdmap “thrashing” on overloaded or unhappy clusters)
- mon: new “osd crush create-or-move …” command
- radosgw: fix copy-object vs attributes
- radosgw: fix bug in bucket stat updates
- mds: fix ino release on abort session close, relative getattr path, mds shutdown, other misc items
- upstart: stop jobs on shutdown
- common: thread pool sizes can now be adjusted at runtime
- build fixes for Fedora 18, CentOS/RHEL 6
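To make a couple of the items above concrete, here is a minimal sketch of the new osdmap flags and deep scrub in use; the pgid is hypothetical:

```
# Temporarily pause backfill and recovery, e.g. while adding several OSDs:
ceph osd set nobackfill
ceph osd set norecover
# ... perform maintenance ...
ceph osd unset norecover
ceph osd unset nobackfill

# Trigger an immediate deep scrub of one placement group, which compares
# object contents across replicas rather than just metadata:
ceph pg deep-scrub 2.1f
```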
The latest version of OpenStack, Folsom, was recently released. This release makes block devices in general, and Ceph block devices (RBD) in particular, much easier to use. If you’re not familiar with OpenStack terminology, there are a few things you should know before proceeding:
- instance – a virtual machine
- image – a template for a virtual machine
- volume – a block device
- Cinder – OpenStack service for managing block devices (replaces nova-volume from previous versions)
- Glance – OpenStack service for storing images and metadata about them (image type, size, owner, etc.)
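To put the Glance and Cinder pieces in context, here is a minimal sketch of the Folsom-era CLIs; the image file, names, and size are hypothetical:

```
# Upload a base image to Glance:
glance image-create --name precise-server --disk-format qcow2 \
  --container-format bare < precise-server.img
# Create a 10 GB bootable volume from that image with Cinder:
cinder create --image-id <image-uuid> --display-name boot-vol 10
```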
In previous releases, you could create volumes and attach them to virtual machines, and you could even boot from them, but there was no way to put data on them without doing it manually yourself. To boot from a volume, you’d have to: