The last few weeks have been very exciting for Inktank and Ceph. There have been a number of community examples of how people are deploying or using Ceph in the wild, from the ComodIT orchestration example to Synnefo’s unique approach to delivering unified storage with Ceph, and many others that haven’t made it to the blog yet. It is a great time to be doing things with Ceph!
We at Inktank have been just as excited as anyone in the community and have been playing with a number of deployment and orchestration tools. Today I wanted to share an experiment of my own for the general consumption of the community: deploying Ceph with Canonical’s relatively new deployment tool, ‘Juju,’ which is taking cloud deployments by storm. If you follow this guide to the end you should end up with something that looks like this:
- Posted by sage
- February 19th, 2013
We’ve been spending a lot of time working on bobtail-related stabilization and bug fixes, but our next development release v0.57 is finally here! Notable changes include:
- osd: default to libaio for the journal (some performance boost)
- osd: validate snap collections on startup
- osd: ceph-filestore-dump tool for debugging
- osd: deep-scrub omap keys/values
- ceph tool: some CLI interface cleanups
- mon: easy adjustment of crush tunables via ‘ceph osd crush tunables …’
- mon: easy creation of crush rules via ‘ceph osd rule …’
- mon: approximate recovery, IO workload stats
- mon: avoid marking entire CRUSH subtrees out (e.g., if an entire rack goes offline)
- mon: safety check for pool deletion
- mon: new checks for identifying and reporting clock drift
- radosgw: misc fixes
- rbd: wait for udev to settle in strategic places (avoid spurious errors, failures)
- rbd-fuse: new tool, package
- mds, ceph-fuse: manage layouts via xattrs
- mds: misc bug fixes with clustered MDSs and failure recovery
- mds: misc bug fixes with readdir
- libcephfs: many fixes, cleanups with the Java bindings
- auth: ability to require new cephx signatures on messages (still off by default)
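The clock-drift checks mentioned above boil down to comparing the monitors’ clocks against each other and flagging pairs whose skew exceeds an allowed threshold. Here is a minimal sketch of that idea in Python; the function name, threshold, and data shapes are illustrative, not Ceph’s actual monitor code:

```python
# Illustrative sketch of clock-drift detection among monitors.
# Names and the default threshold are hypothetical, not Ceph internals.

def detect_clock_drift(mon_times, allowed_skew=0.05):
    """Return (mon, mon, skew) triples for monitor pairs whose clocks
    differ by more than allowed_skew seconds."""
    drifted = []
    mons = sorted(mon_times)
    for i, a in enumerate(mons):
        for b in mons[i + 1:]:
            skew = abs(mon_times[a] - mon_times[b])
            if skew > allowed_skew:
                drifted.append((a, b, skew))
    return drifted

# Example: mon.b is running roughly 0.2s ahead of its peers.
times = {"mon.a": 100.00, "mon.b": 100.20, "mon.c": 100.01}
for a, b, skew in detect_clock_drift(times):
    print(f"clock skew between {a} and {b}: {skew:.2f}s")
```

In the real cluster the monitors do this continuously and surface warnings through the health checks, but the pairwise-comparison idea is the same.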
The last couple of weeks have also seen steady forward progress on v0.58, which will be out in two weeks. Now is probably as good a time as any to mention that we’ve solidified our release timeline for the next couple of months. Each development release will be out on a regular two-week schedule (feature-frozen and then delayed one sprint, or two weeks, for QA), culminating in the freeze for v0.61 “cuttlefish” at the beginning of April, to be released at the end of the month.
You can get v0.57 from the usual locations:
At this year’s Cloud Expo Europe I had a nice chat with the guys from ComodIT who are making some interesting deployment and orchestration tools. They were kind enough to include their work in a blog post earlier this week and give me permission to replicate it here for your consumption.
As always, if any of you have interesting things that you have done with Ceph we always want to hear about it. Feel free to send a link to @Ceph or email it to our Community alias. Now enjoy this week’s slice of deployment goodness.
- Posted by sage
- February 14th, 2013
We’ve fixed an important bug that a few users were hitting with unresponsive OSDs and internal heartbeat timeouts. This, along with a range of less critical fixes, was sufficient to justify another point release. Any production users should upgrade.
Notable changes include:
- osd: flush peering work queue prior to start
- osd: persist osdmap epoch for idle PGs
- osd: fix and simplify connection handling for heartbeats
- osd: avoid crash on invalid admin command
- mon: fix rare races with monitor elections and commands
- mon: enforce that OSD reweights be between 0 and 1 (NOTE: not CRUSH weights)
- mon: approximate client, recovery bandwidth logging
- radosgw: fixed some XML formatting to conform to Swift API inconsistencies
- radosgw: fix usage accounting bug; add repair tool
- radosgw: make fallback URI configurable (necessary on some web servers)
- librbd: fix handling for interrupted ‘unprotect’ operations
- mds, ceph-fuse: allow file and directory layouts to be modified via virtual xattrs
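The reweight change above is worth a moment of explanation: an OSD reweight is an override factor in [0, 1] (1.0 means fully in, 0.0 means out), and is a separate concept from CRUSH weights, which commonly exceed 1 (e.g., when sized to disk capacity in TB). A hedged sketch of the validation, with an illustrative function name:

```python
def validate_osd_reweight(value):
    """OSD reweight is an override factor in [0, 1]: 1.0 means 'fully in',
    0.0 means 'out'. CRUSH weights are a different concept entirely and
    may legitimately be larger than 1 (e.g., weight in TB)."""
    v = float(value)
    if not 0.0 <= v <= 1.0:
        raise ValueError(f"osd reweight must be between 0 and 1, got {v}")
    return v

print(validate_osd_reweight(0.8))  # accepted: partially weight the OSD down
```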
You can get v0.56.3 from the usual locations:
During my most recent schlep through Europe I met some really great people, and heard some awesome Ceph use cases. One particularly interesting case was the work the guys at Synnefo shared with me at FOSDEM that they have been doing with Ganeti and RADOS. They were nice enough to write up some of the details on their blog and give me permission to repost here.
If any of you have interesting things that you have done with Ceph we always want to hear about it. Feel free to send a link to @Ceph or email it to our Community alias. Now, on to the goods!
- Posted by rturk
- February 8th, 2013
It’s me again, your friendly Ceph community manager. Lately I’ve gone off the deep end collecting data about activity and participation from our mailing lists, IRC channel, and git repository. I think they’re interesting and I’d like to start regularly sharing what I see. Today I’m going to focus on some interesting trends from the mailing list.
Before getting into all that, though, I’m going to make like Fight Club and skip to the conclusion: it’s time to create a mailing list for users and operators that will complement ceph-devel.
Our New ceph-users List
You may have noticed that things are busier in ceph-devel lately, and the metrics confirm it: there are more people, more topics, and more discussion. Some of this new activity can be attributed to our growing core developer team, but most of it is something we haven’t seen until recently: people need help with configuration, deployment, tuning, and administration.
In short, the community has become active enough to need a dedicated venue for user/operator discussion. While ceph-devel is still the right place to discuss the development of Ceph, those who use Ceph now have a dedicated list of their own.
Here are all the vitals for our new list, ceph-users. If you are a user of Ceph (or want to help those who are), I encourage you to subscribe! Information for the other lists (ceph-devel and ceph-commit) can always be found at our List and IRC page.
One of the things that makes Ceph particularly powerful is the number of tunable options it provides. You can control how much data and how many operations are buffered at nearly every stage of the pipeline. You can introduce different flushing behavior, or change how many threads are used for filestore operations. The downside is that it can be a bit daunting to dive into all of this and even to know where to start. Here at Inktank we’ve gotten a lot of questions about how these options affect performance. The answer is often that it depends. Different hardware and software configurations will favor different Ceph options. To give people at least a rough idea of what kinds of things might be worth looking at, we decided to dive in and sweep through some of the options that most likely would have an effect on performance. Specifically, in this article we’ll look at different Ceph parameters when using disks in a JBOD configuration.
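To make “sweeping through the options” concrete, here is a small sketch of the kind of harness one might use to enumerate every combination of candidate settings to benchmark. The option names echo Ceph config keys of this era, but the specific values and the harness itself are illustrative, not what we actually ran:

```python
import itertools

# Hypothetical sweep: a few candidate values per option. The key names
# mirror Ceph config options, but the values are made up for illustration.
sweep = {
    "filestore op threads": [2, 4, 8],
    "osd op threads": [1, 2],
    "filestore flusher": [True, False],
}

def configurations(options):
    """Yield one config dict per combination of candidate values."""
    keys = list(options)
    for values in itertools.product(*(options[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(configurations(sweep))
print(len(configs))  # 3 * 2 * 2 = 12 combinations to benchmark
```

Each resulting dict would then be written into a test ceph.conf and benchmarked in turn; the combinatorial growth here is exactly why these sweeps take so long to run.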
Since Inktank is willing to pay me to draw web comics (Hi Guys!), here’s a picture of approximately what I looked like after getting through all of this:
- Posted by sage
- February 24th, 2012
This release contains a few key fixes for v0.42:
- fixed osdmap encoding for older clients. In particular, v0.41 versions of things like librbd won’t crash talking to v0.42 daemons.
- ceph-dencoder and man page are included in the rpm and deb
- ceph.spec file is fixed
If you are upgrading, upgrade to v0.42.2, not v0.42. Ignore v0.42.1: I forgot to include the encoding fix and didn’t notice until after I’d pushed the tag.
You can get the latest from the usual locations:
- Posted by sage
- February 20th, 2012
v0.42 is ready! This has mostly been a stabilization release, with a few critical bugs fixed. There is also an across-the-board change in data structure encoding that is not backwards-compatible, but is designed to allow future changes to be (both forwards- and backwards-compatible).
Notable changes include:
- osd: new (non-backwards compatible!) encoding for all structures
- osd: fixed bug with transactions being non-atomic (leaking across commit boundaries)
- osd: track in-progress requests, log slow ones
- osd: randomly choose pull target during recovery (better load balance)
- osd: fixed recovery stall
- mon: a few recovery bug fixes
- mon: trim old auth files
- mon: better detection/warning about down pgs
- objecter: expose in-process requests via admin socket
- new infrastructure for testing data structure encoding changes (forward and backward compatibility)
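The “randomly choose pull target” item above is a small change with a real load-balancing effect: instead of always pulling a missing object from the same peer during recovery, the OSD spreads its recovery reads across all replicas that hold the object. A toy sketch of the idea (not actual OSD code; names are illustrative):

```python
import random

def choose_pull_target(replicas, rng=random):
    """Pick a random replica to pull a missing object from, so recovery
    reads are spread across peers instead of always hitting the first one."""
    if not replicas:
        raise ValueError("no replica holds the object")
    return rng.choice(replicas)

# Over many objects, each of the three peers serves roughly a third
# of the recovery pulls.
picks = [choose_pull_target(["osd.1", "osd.4", "osd.7"]) for _ in range(1000)]
```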
Aside from the data structure encoding change, there is relatively little new code since v0.41. This should be a pretty solid release.
For v0.43, we are working on merging a few big changes. The main one is a new key/value interface for objects: each object, instead of storing a blob of bytes, would consist of a (potentially large) set of key/value pairs that can be set/queried efficiently. This is going to make a huge difference for radosgw performance with large buckets, and will help with large directories as well. There is also ongoing stabilization work with the OSD and new interfaces for administrators to query the state of the cluster and diagnose common problems.
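The key/value object interface described above can be pictured as each object carrying a sorted map that supports efficient point lookups and ordered range scans, which is exactly the access pattern that large radosgw buckets and large directories need. A toy Python model of that shape (not the actual RADOS API; class and method names are made up for illustration):

```python
import bisect

class KVObject:
    """Toy model of an object as a sorted set of key/value pairs,
    supporting efficient point lookups and ordered range scans."""

    def __init__(self):
        self._keys = []   # kept sorted for range queries
        self._vals = {}

    def set(self, key, value):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def get(self, key):
        return self._vals[key]

    def range(self, start, end):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for k in self._keys[lo:hi]:
            yield k, self._vals[k]

# A bucket listing becomes a range scan instead of reading the whole object.
obj = KVObject()
for name in ["b/2.jpg", "a/1.jpg", "c/3.jpg"]:
    obj.set(name, {"size": 0})
print([k for k, _ in obj.range("a/", "b0")])  # -> ['a/1.jpg', 'b/2.jpg']
```

The win over a single blob of bytes is that a listing or lookup touches only the keys it needs, rather than rewriting or rereading the entire object.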
v0.42 can be found from the usual locations:
- Posted by Dona
- February 14th, 2012
Last month, we sponsored the Ada Initiative as a Bronze level sponsor. Ada is an organization that encourages and supports women joining open technology and culture, fields typically dominated by men. It’s not just about balancing out the demographic: having more women in technology opens up a vast new source of talent. As Sage said in a recent interview for Ada, “It’s hard to find people who are qualified for these jobs, and frustrating when half the human population is excluded and disadvantaged because of the nature of the community. We want to do whatever we can to set an example and change that perspective.”
As a Bronze level sponsor, Ceph contributed $7,000 to the Ada Initiative and, in addition, matched all donations contributed from Jan 17-19, which totaled $1,583. The combined contributions ($7,000 plus the $1,583 in community donations and our matching $1,583) equaled $10,166. For more information about the Ada Initiative, please visit their site, https://adainitiative.org.
We hope these donations help to increase the number of women in the tech arena. That said, we’re actively searching for good talent for the Ceph project so as always we welcome men, women and “others” to apply for a job today!