The Ceph Blog

February 23, 2016

v0.94.6 Hammer released

This Hammer point release fixes a range of bugs, most notably unbounded growth of the monitor’s leveldb store. It also adds a workaround in the OSD to keep most xattrs small enough to be stored inline in XFS inodes.

We recommend that all Hammer v0.94.x users upgrade.

For more detailed information, see the complete changelog.

NOTABLE CHANGES

  • build/ops: Ceph daemon failed to start, because the service name was already used. (issue#13474, pr#6832, Chuanhong Wang)
  • build/ops: LTTng-UST tracing should be dynamically enabled (issue#13274, pr#6415, Jason Dillaman)
  • build/ops: ceph upstart script rbdmap.conf incorrectly processes parameters (issue#13214, pr#6159, Sage Weil)
  • build/ops: ceph.spec.in License line does not reflect COPYING (issue#12935, pr#6680, Nathan Cutler)
  • build/ops: ceph.spec.in libcephfs_jni1 has no %post and %postun (issue#12927, pr#5789, Owen Synge)
  • build/ops: configure.ac: no use to add “+” before ac_ext=c (issue#14330, pr#6973, Kefu Chai, Robin H. Johnson)
  • build/ops: deb: strip tracepoint libraries from Wheezy/Precise builds (issue#14801, pr#7316, Jason Dillaman)
  • build/ops: init script reload doesn’t work on EL7 (issue#13709, pr#7187, Hervé Rousseau)
  • build/ops: init-rbdmap uses distro-specific functions (issue#12415, pr#6528, Boris Ranto)
  • build/ops: logrotate reload error on Ubuntu 14.04 (issue#11330, pr#5787, Sage Weil)
  • build/ops: miscellaneous spec file fixes (issue#12931, issue#12994, issue#12924, issue#12360, pr#5790, Boris Ranto, Nathan Cutler, Owen Synge, Travis Rhoden, Ken Dreyer)
  • build/ops: pass tcmalloc env through to ceph-os (issue#14802, pr#7365, Sage Weil)
  • build/ops: rbd-replay-* moved from ceph-test-dbg to ceph-common-dbg as well (issue#13785, pr#6580, Loic Dachary)
  • build/ops: unknown argument --quiet in udevadm settle (issue#13560, pr#6530, Jason Dillaman)
  • common: Objecter: pool op callback may hang forever. (issue#13642, pr#6588, xie xingguo)
  • common: Objecter: potential null pointer access when doing pool_snap_list. (issue#13639, pr#6839, xie xingguo)
  • common: ThreadPool add/remove work queue methods not thread safe (issue#12662, pr#5889, Jason Dillaman)
  • common: auth/cephx: large amounts of log are produced by osd (issue#13610, pr#6835, Qiankun Zheng)
  • common: client nonce collision due to unshared pid namespaces (issue#13032, pr#6151, Josh Durgin)
  • common: common/Thread:pthread_attr_destroy(thread_attr) when done with it (issue#12570, pr#6157, Piotr Dałek)
  • common: log: Log.cc: Assign LOG_DEBUG priority to syslog calls (issue#13993, pr#6994, Brad Hubbard)
  • common: objecter: cancellation bugs (issue#13071, pr#6155, Jianpeng Ma)
  • common: pure virtual method called (issue#13636, pr#6587, Jason Dillaman)
  • common: small probability sigabrt when setting rados_osd_op_timeout (issue#13208, pr#6143, Ruifeng Yang)
  • common: wrong conditional for boolean function KeyServer::get_auth() (issue#9756, issue#13424, pr#6213, Nathan Cutler)
  • crush: crash if we see CRUSH_ITEM_NONE in early rule step (issue#13477, pr#6430, Sage Weil)
  • doc: man: document listwatchers cmd in “rados” manpage (issue#14556, pr#7434, Kefu Chai)
  • doc: regenerate man pages, add orphans commands to radosgw-admin(8) (issue#14637, pr#7524, Ken Dreyer)
  • fs: CephFS restriction on removing cache tiers is overly strict (issue#11504, pr#6402, John Spray)
  • fs: fsstress.sh fails (issue#12710, pr#7454, Yan, Zheng)
  • librados: LibRadosWatchNotify.WatchNotify2Timeout (issue#13114, pr#6336, Sage Weil)
  • librbd: ImageWatcher shouldn’t block the notification thread (issue#14373, pr#7407, Jason Dillaman)
  • librbd: diff_iterate needs to handle holes in parent images (issue#12885, pr#6097, Jason Dillaman)
  • librbd: fix merge-diff for >2GB diff-files (issue#14030, pr#6980, Jason Dillaman)
  • librbd: invalidate object map on error even w/o holding lock (issue#13372, pr#6289, Jason Dillaman)
  • librbd: reads larger than cache size hang (issue#13164, pr#6354, Lu Shi)
  • mds: ceph mds add_data_pool check for EC pool is wrong (issue#12426, pr#5766, John Spray)
  • mon: MonitorDBStore: get_next_key() only if prefix matches (issue#11786, pr#5361, Joao Eduardo Luis)
  • mon: OSDMonitor: do not assume a session exists in send_incremental() (issue#14236, pr#7150, Joao Eduardo Luis)
  • mon: check for store writeability before participating in election (issue#13089, pr#6144, Sage Weil)
  • mon: compact full epochs also (issue#14537, pr#7446, Kefu Chai)
  • mon: include min_last_epoch_clean as part of PGMap::print_summary and PGMap::dump (issue#13198, pr#6152, Guang Yang)
  • mon: map_cache can become inaccurate if osd does not receive the osdmaps (issue#10930, pr#5773, Kefu Chai)
  • mon: should not set isvalid = true when cephx_verify_authorizer returns false (issue#13525, pr#6391, Ruifeng Yang)
  • osd: Ceph Pools’ MAX AVAIL is 0 if some OSDs’ weight is 0 (issue#13840, pr#6834, Chengyuan Li)
  • osd: FileStore calls syncfs(2) even if it is not supported (issue#12512, pr#5530, Kefu Chai)
  • osd: FileStore: potential memory leak if getattrs fails. (issue#13597, pr#6420, xie xingguo)
  • osd: IO error on kvm/rbd with an erasure coded pool tier (issue#12012, pr#5897, Kefu Chai)
  • osd: OSD::build_past_intervals_parallel() shall reset primary and up_primary when beginning a new past_interval. (issue#13471, pr#6398, xie xingguo)
  • osd: ReplicatedBackend: populate recovery_info.size for clone (bug symptom is size mismatch on replicated backend on a clone in scrub) (issue#12828, pr#6153, Samuel Just)
  • osd: ReplicatedPG: wrong result code checking logic during sparse_read (issue#14151, pr#7179, xie xingguo)
  • osd: ReplicatedPG::hit_set_trim osd/ReplicatedPG.cc: 11006: FAILED assert(obc) (issue#13192, issue#9732, issue#12968, pr#5825, Kefu Chai, Zhiqiang Wang, Samuel Just, David Zafman)
  • osd: avoid multi set osd_op.outdata in tier pool (issue#12540, pr#6060, Xinze Chi)
  • osd: bug with cache/tiering and snapshot reads (issue#12748, pr#6589, Kefu Chai)
  • osd: ceph osd pool stats broken in hammer (issue#13843, pr#7180, BJ Lougee)
  • osd: ceph-disk prepare fails if device is a symlink (issue#13438, pr#7176, Joe Julian)
  • osd: check for full before changing the cached obc (hammer) (issue#13098, pr#6918, Alexey Sheplyakov)
  • osd: config_opts: increase suicide timeout to 300 to match recovery (issue#14376, pr#7236, Samuel Just)
  • osd: disable filestore_xfs_extsize by default (issue#14397, pr#7411, Ken Dreyer)
  • osd: do not cache unused memory in attrs (issue#12565, pr#6499, Xinze Chi, Ning Yao)
  • osd: dumpling incrementals do not work properly on hammer and newer (issue#13234, pr#6132, Samuel Just)
  • osd: filestore: fix peek_queue for OpSequencer (issue#13209, pr#6145, Xinze Chi)
  • osd: hit set clear repops fired in same epoch as map change – segfault since they fall into the new interval even though the repops are cleared (issue#12809, pr#5890, Samuel Just)
  • osd: object_info_t::decode() has wrong version (issue#13462, pr#6335, David Zafman)
  • osd: osd/OSD.cc: 2469: FAILED assert(pg_stat_queue.empty()) on shutdown (issue#14212, pr#7178, Sage Weil)
  • osd: osd/PG.cc: 288: FAILED assert(info.last_epoch_started >= info.history.last_epoch_started) (issue#14015, pr#7177, David Zafman)
  • osd: osd/PG.cc: 3837: FAILED assert(0 == "Running incompatible OSD") (issue#11661, pr#7206, David Zafman)
  • osd: osd/ReplicatedPG: Recency fix (issue#14320, pr#7207, Sage Weil, Robert LeBlanc)
  • osd: pg stuck in replay (issue#13116, pr#6401, Sage Weil)
  • osd: race condition detected during send_failures (issue#13821, pr#6755, Sage Weil)
  • osd: randomize scrub times (issue#10973, pr#6199, Kefu Chai)
  • osd: requeue_scrub when kick_object_context_blocked (issue#12515, pr#5891, Xinze Chi)
  • osd: revert: use GMT time for hitsets (issue#13812, pr#6644, Loic Dachary)
  • osd: segfault in agent_work (issue#13199, pr#6146, Samuel Just)
  • osd: should recalc the min_last_epoch_clean when decoding PGMap (issue#13112, pr#6154, Kefu Chai)
  • osd: smaller object_info_t xattrs (issue#14803, pr#6544, Sage Weil)
  • osd: we do not ignore notify from down osds (issue#12990, pr#6158, Samuel Just)
  • rbd: QEMU hangs after creating snapshot and stopping VM (issue#13726, pr#6586, Jason Dillaman)
  • rbd: TaskFinisher::cancel should remove event from SafeTimer (issue#14476, pr#7417, Douglas Fuller)
  • rbd: avoid re-writing old-format image header on resize (issue#13674, pr#6585, Jason Dillaman)
  • rbd: fix bench-write (issue#14225, pr#7183, Sage Weil)
  • rbd: rbd-replay does not check for EOF and goes to endless loop (issue#14452, pr#7416, Mykola Golub)
  • rbd: rbd-replay-prep and rbd-replay improvements (issue#13221, issue#13220, issue#13378, pr#6286, Jason Dillaman)
  • rbd: verify self-managed snapshot functionality on image create (issue#13633, pr#7182, Jason Dillaman)
  • rgw: Make RGW_MAX_PUT_SIZE configurable (issue#6999, pr#7441, Vladislav Odintsov, Yuan Zhou)
  • rgw: Setting ACL on Object removes ETag (issue#12955, pr#6620, Brian Felton)
  • rgw: backport content-type casing (issue#12939, pr#5910, Robin H. Johnson)
  • rgw: bucket listing hangs on versioned buckets (issue#12913, pr#6352, Yehuda Sadeh)
  • rgw: fix wrong etag calculation during POST on S3 bucket. (issue#11241, pr#7442, Vladislav Odintsov, Radoslaw Zarzynski)
  • rgw: get bucket location returns region name, not region api name (issue#13458, pr#6349, Yehuda Sadeh)
  • rgw: missing handling of encoding-type=url when listing keys in bucket (issue#12735, pr#6527, Jeff Weber)
  • rgw: orphan tool should be careful about removing head objects (issue#12958, pr#6351, Yehuda Sadeh)
  • rgw: orphans finish segfaults (issue#13824, pr#7186, Igor Fedotov)
  • rgw: rgw-admin: document orphans commands in usage (issue#14516, pr#7526, Yehuda Sadeh)
  • rgw: swift API returns more than real object count and bytes used when retrieving account metadata (issue#13140, pr#6512, Sangdi Xu)
  • rgw: Swift with Civetweb SSL cannot get the right URL (issue#13628, pr#6491, Weijun Duan)
  • rgw: value of Swift API’s X-Object-Manifest header is not url_decoded during segment look up (issue#12728, pr#6353, Radoslaw Zarzynski)
  • tests: fixed broken Makefiles after integration of LTTng into rados (issue#13210, pr#6322, Sebastien Ponce)
  • tests: fsx failed to compile (issue#14384, pr#7501, Greg Farnum)
  • tests: notification slave needs to wait for master (issue#13810, pr#7226, Jason Dillaman)
  • tests: qa: remove legacy OS support from rbd/qemu-iotests (issue#13483, issue#14385, pr#7252, Vasu Kulkarni, Jason Dillaman)
  • tests: testprofile must be removed before it is re-created (issue#13664, pr#6450, Loic Dachary)
  • tools: ceph-monstore-tool must do out_store.close() (issue#10093, pr#7347, huangjun)
  • tools: heavy memory shuffling in rados bench (issue#12946, pr#5810, Piotr Dałek)
  • tools: race condition in rados bench (issue#12947, pr#6791, Piotr Dałek)
  • tools: tool for artificially inflating the leveldb of the mon store for testing purposes (issue#10093, issue#11815, issue#14217, pr#7412, Cilang Zhao, Bo Cai, Kefu Chai, huangjun, Joao Eduardo Luis)
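
The entries above share a regular shape: a component prefix, a summary, and a trailing parenthesized group with one or more issue numbers, one or more PR numbers, and the authors. If you want to filter the list by component before an upgrade, a minimal sketch along the following lines may help. The `Entry` type and `parse_entry` function are hypothetical names chosen for this illustration and are not part of any Ceph tooling. The author field is kept as a raw string because some names contain commas.

```python
import re
from typing import List, NamedTuple


class Entry(NamedTuple):
    """One changelog line, split into its parts (hypothetical helper type)."""
    component: str
    summary: str
    issues: List[int]
    prs: List[int]
    authors: str  # raw string; names such as "Yan, Zheng" contain commas


# Trailing "(issue#N, ..., pr#N, ..., authors)" group at the end of an entry.
_TAIL = re.compile(r"\s*\(((?:issue#\d+, )+)((?:pr#\d+, )+)([^()]+)\)$")


def parse_entry(line: str) -> Entry:
    """Parse a single bullet from the NOTABLE CHANGES list."""
    text = line.strip().lstrip("•").strip()
    match = _TAIL.search(text)
    if match is None:
        raise ValueError("not a changelog entry: %r" % line)
    head = text[:match.start()]
    # The component is everything before the first ": " in the line.
    component, _, summary = head.partition(": ")
    issues = [int(n) for n in re.findall(r"issue#(\d+)", match.group(1))]
    prs = [int(n) for n in re.findall(r"pr#(\d+)", match.group(2))]
    return Entry(component, summary, issues, prs, match.group(3).strip())
```

For example, calling `parse_entry` on each bullet and keeping only the records whose `component` is `mon` gives the monitor-related fixes in this release.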
