The Ceph Blog


July 14, 2017

v10.2.8 Jewel released

This point release brought a number of important bugfixes in all major
components of Ceph. However, it also introduced a regression that could cause
MDS damage, and a new release, v10.2.9, was published to address this.
Therefore, Jewel users should not upgrade to this version. Instead, we
recommend upgrading directly to v10.2.9.

That being said, the v10.2.8 release notes do contain important information,
so please read on.

For more detailed information, see the complete changelog.

OSD Removal Caveat

There was a bug introduced in Jewel (#19119) that broke the mapping behavior
when an “out” OSD that still existed in the CRUSH map was removed with ‘osd rm’.
This could result in ‘misdirected op’ and other errors. The bug is now fixed,
but the fix itself introduces the same risk because the behavior may vary between
clients and OSDs. To avoid problems, please ensure that all OSDs are removed
from the CRUSH map before deleting them. That is, be sure to do:

ceph osd crush rm osd.123

before:

ceph osd rm osd.123
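
As a minimal sketch of that ordering, the two commands can be wrapped in a small shell function so that the CRUSH removal always runs first and the OSD is deleted only if that step succeeded. The function name remove_osd and the CEPH_BIN variable are hypothetical, introduced here only so the sequence can be dry-run; they are not part of Ceph.

```shell
# Hypothetical helper (not part of Ceph): remove an OSD in the recommended order.
# CEPH_BIN is an assumed override so the sequence can be dry-run, e.g. with "echo".
remove_osd() {
    # $1 is the numeric OSD id, e.g. 123
    [ -n "$1" ] || { echo "usage: remove_osd <id>" >&2; return 2; }

    # Step 1: take the OSD out of the CRUSH map; stop here if that fails.
    "${CEPH_BIN:-ceph}" osd crush rm "osd.$1" || return 1

    # Step 2: only now delete the OSD itself.
    "${CEPH_BIN:-ceph}" osd rm "osd.$1"
}
```

Calling remove_osd 123 runs the two commands shown above in order. Setting CEPH_BIN=echo beforehand prints the commands instead of executing them, which may be useful for checking the sequence before touching a live cluster.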

Snap Trimmer Improvements

This release greatly improves control and throttling of the snap trimmer. It
introduces the “osd max trimming pgs” option (defaulting to 2), which limits
how many PGs on an OSD can be trimming snapshots at a time. It also restores
the safe use of the “osd snap trim sleep” option, which defaults to 0; when set
to a nonzero value, it adds that many seconds of delay between every dispatch
of trim operations to the underlying system.
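
A rough sketch of how these two options might be adjusted on a running cluster follows. It assumes the options are passed to the OSDs with "ceph tell ... injectargs" using the underscore spellings osd_max_trimming_pgs and osd_snap_trim_sleep. Whether a running OSD applies the new values without a restart is an assumption here; setting them in the [osd] section of ceph.conf is the persistent alternative. The function name throttle_snap_trim and the CEPH_BIN variable are hypothetical.

```shell
# Hypothetical helper (not part of Ceph): push snap trimmer throttles to all OSDs.
# CEPH_BIN is an assumed override so the command can be dry-run, e.g. with "echo".
throttle_snap_trim() {
    # $1: max PGs per OSD trimming at once (release default: 2)
    # $2: seconds to sleep between trim dispatches (release default: 0)
    [ "$#" -eq 2 ] || { echo "usage: throttle_snap_trim <max_pgs> <sleep_secs>" >&2; return 2; }

    # 'osd.*' is quoted so the shell does not expand it as a glob.
    "${CEPH_BIN:-ceph}" tell 'osd.*' injectargs \
        "--osd_max_trimming_pgs $1 --osd_snap_trim_sleep $2"
}
```

For example, throttle_snap_trim 1 0.1 would ask every OSD to trim at most one PG at a time and to pause 0.1 seconds between trim dispatches. Suitable values depend on the cluster and are not prescribed by the release notes.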

Other Notable Changes

  • build/ops: “osd marked itself down” will not recognised if host runs mon + osd on shutdown/reboot (issue#18516, pr#13492, Boris Ranto)
  • build/ops: ceph-base package missing dependency for psmisc (issue#19129, pr#13786, Nathan Cutler)
  • build/ops: enable build of ceph-resource-agents package on rpm-based os (issue#17613, issue#19546, pr#13606, Nathan Cutler)
  • build/ops: rbdmap.service not included in debian packaging (jewel-only) (issue#19547, pr#14383, Ken Dreyer)
  • cephfs: Journaler may execute on_safe contexts prematurely (issue#20055, pr#15468, “Yan, Zheng”)
  • cephfs: MDS assert failed when shutting down (issue#19204, pr#14683, John Spray)
  • cephfs: MDS goes readonly writing backtrace for a file whose data pool has been removed (issue#19401, pr#14682, John Spray)
  • cephfs: MDS server crashes due to inconsistent metadata (issue#19406, pr#14676, John Spray)
  • cephfs: No output for ceph mds rmfailed 0 --yes-i-really-mean-it command (issue#16709, pr#14674, John Spray)
  • cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeClient) (issue#18914, pr#14685, “Yan, Zheng”)
  • cephfs: Test failure: test_open_inode (issue#18661, pr#14669, John Spray)
  • cephfs: The mount point break off when mds switch happened (issue#19437, pr#14679, Guan yunfei)
  • cephfs: ceph-fuse does not recover after lost connection to MDS (issue#16743, issue#18757, pr#14698, Kefu Chai, Henrik Korkuc, Patrick Donnelly)
  • cephfs: client: fix the cross-quota rename boundary check conditions (issue#18699, pr#14667, Greg Farnum)
  • cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs to a file (issue#19033, pr#14684, Yang Honggang)
  • cephfs: non-local quota changes not visible until some IO is done (issue#17939, pr#15466, John Spray, Nathan Cutler)
  • cephfs: normalize file open flags internally used by cephfs (issue#18872, issue#19890, pr#15000, Jan Fajerski, “Yan, Zheng”)
  • common: monitor creation with IPv6 public network segfaults (issue#19371, pr#14324, Fabian Grünbichler)
  • common: radosstriper: protect aio_write API from calls with 0 bytes (issue#14609, pr#13254, Sebastien Ponce)
  • core: Objecter::epoch_barrier isn’t respected in _op_submit() (issue#19396, pr#14332, Ilya Dryomov)
  • core: clear divergent_priors set off disk (issue#17916, pr#14596, Greg Farnum)
  • core: improve snap trimming, enable restriction of parallelism (issue#19241, pr#14492, Samuel Just, Greg Farnum)
  • core: os/filestore/HashIndex: be loud about splits (issue#18235, pr#13788, Dan van der Ster)
  • core: os/filestore: fix clang static check warn use-after-free (issue#19311, pr#14044, liuchang0812, yaoning)
  • core: transient jerasure unit test failures (issue#18070, issue#17762, issue#18128, issue#17951, pr#14701, Kefu Chai, Pan Liu, Loic Dachary, Jason Dillaman)
  • core: two instances of omap_digest mismatch (issue#18533, pr#14204, Samuel Just, David Zafman)
  • doc: Improvements to crushtool manpage (issue#19649, pr#14635, Loic Dachary, Nathan Cutler)
  • doc: PendingReleaseNotes: note about 19119 (issue#19119, pr#13732, Sage Weil)
  • doc: admin ops: fix the quota section (issue#19397, pr#14654, Chu, Hua-Rong)
  • doc: radosgw-admin: add the ‘object stat’ command to usage (issue#19013, pr#13872, Pavan Rallabhandi)
  • doc: rgw S3 create bucket should not do response in json (issue#18889, pr#13874, Abhishek Lekshmanan)
  • fs: Invalid error code returned by MDS is causing a kernel client WARNING (issue#19205, pr#13831, Jan Fajerski, xie xingguo)
  • librbd: Incomplete declaration for ContextWQ in librbd/Journal.h (issue#18862, pr#14152, Boris Ranto)
  • librbd: Issues with C API image metadata retrieval functions (issue#19588, pr#14666, Mykola Golub)
  • librbd: Possible deadlock performing a synchronous API action while refresh in-progress (issue#18419, pr#13154, Jason Dillaman)
  • librbd: is_exclusive_lock_owner API should ping OSD (issue#19287, pr#14481, Jason Dillaman)
  • librbd: remove image header lock assertions (issue#18244, pr#13809, Jason Dillaman)
  • mds: C_MDSInternalNoop::complete doesn’t free itself (issue#19501, pr#14677, “Yan, Zheng”)
  • mds: Too many stat ops when trying to probe a large file (issue#19955, pr#15472, “Yan, Zheng”)
  • mds: avoid reusing deleted inode in StrayManager::_purge_stray_logged (issue#18877, pr#14670, Zhi Zhang)
  • mds: enable start when session ino info is corrupt (issue#19708, issue#16842, pr#14700, John Spray)
  • mds: fragment space check can cause replayed request fail (issue#18660, pr#14668, “Yan, Zheng”)
  • mds: heartbeat timeout during rejoin, when working with large amount of caps/inodes (issue#19118, pr#14672, John Spray)
  • mds: issue new caps when sending reply to client (issue#19635, pr#15438, “Yan, Zheng”)
  • mon: OSDMonitor: make ‘osd crush move …’ work on osds (issue#18587, pr#13261, Sage Weil)
  • mon: fix ‘sortbitwise’ warning on jewel (issue#20578, pr#15208, huanwen ren, Sage Weil)
  • mon: make get_mon_log_message() atomic (issue#19427, pr#14587, Kefu Chai)
  • mon: remove bad rocksdb option (issue#19392, pr#14236, Sage Weil)
  • msg: IPv6 Heartbeat packets are not marked with DSCP QoS – simple messenger (issue#18887, pr#13450, Yan Jun, Robin H. Johnson)
  • msg: set close on exec flag (issue#16390, pr#13585, Kefu Chai)
  • osd: --flush-journal: sporadic segfaults on exit (issue#18820, pr#13477, Alexey Sheplyakov)
  • osd: Give requested scrubs a higher priority (issue#15789, pr#14686, David Zafman)
  • osd: Implement asynchronous scrub sleep (issue#19986, issue#19497, pr#15529, Brad Hubbard)
  • osd: Object level shard errors are tracked and used if no auth available (issue#20089, pr#15416, David Zafman)
  • osd: ReplicatedPG: try with pool’s use-gmt setting if hitset archive not found (issue#19185, pr#13827, Kefu Chai)
  • osd: allow client throttler to be adjusted on-fly, without restart (issue#18791, pr#13214, Piotr Dałek)
  • osd: bypass readonly ops when osd full (issue#19394, pr#14181, Jianpeng Ma, yaoning)
  • osd: degraded and misplaced status output inaccurate (issue#18619, pr#14325, David Zafman)
  • osd: new added OSD always down when full flag is set (issue#15025, pr#14326, Mingxin Liu)
  • osd: pg_pool_t::encode(): be compatible with Hammer <= 0.94.6 (issue#19508, pr#14392, Alexey Sheplyakov)
  • osd: pre-jewel “osd rm” incrementals are misinterpreted (issue#19119, pr#13884, Ilya Dryomov)
  • osd: preserve allocation hint attribute during recovery (issue#19083, pr#13647, yaoning)
  • osd: promote throttle parameters are reversed (issue#19773, pr#14791, Mark Nelson)
  • osd: reindex properly on pg log split (issue#18975, pr#14047, Alexey Sheplyakov)
  • osd: restrict want_acting to up+acting on recovery completion (issue#18929, pr#13541, Sage Weil)
  • rbd-nbd: check /sys/block/nbdX/size to ensure kernel mapped correctly (issue#18335, pr#13932, Mykola Golub, Alexey Sheplyakov)
  • rbd: [api] temporarily restrict (rbd_)mirror_peer_add from adding multiple peers (issue#19256, pr#14664, Jason Dillaman)
  • rbd: qemu crash triggered by network issues (issue#18436, pr#13244, Jason Dillaman)
  • rbd: rbd --pool=x rename y z does not work (issue#18326, pr#14148, Gaurav Kumar Garg)
  • rbd: systemctl stop rbdmap unmaps all rbds and not just the ones in /etc/ceph/rbdmap (issue#18884, issue#18262, pr#14083, David Disseldorp, Nathan Cutler)
  • rgw: “cluster [WRN] bad locator @X on object @X….” in cluster log (issue#18980, pr#14064, Casey Bodley)
  • rgw: ‘radosgw-admin sync status’ on master zone of non-master zonegroup (issue#18091, pr#13779, Jing Wenjun)
  • rgw: Change loglevel to 20 for ‘System already converted’ message (issue#18919, pr#13834, Vikhyat Umrao)
  • rgw: Use decoded URI when verifying TempURL (issue#18590, pr#13724, Alexey Sheplyakov)
  • rgw: a few cases where rgw_obj is incorrectly initialized (issue#19096, pr#13842, Yehuda Sadeh)
  • rgw: add apis to support ragweed suite (issue#19804, pr#14851, Yehuda Sadeh)
  • rgw: add bucket size limit check to radosgw-admin (issue#17925, pr#14787, Matt Benjamin)
  • rgw: allow system users to read SLO parts (issue#19027, pr#14752, Casey Bodley)
  • rgw: don’t return skew time in pre-signed url (issue#18828, issue#18829, pr#14605, liuchang0812)
  • rgw: failure to create s3 type subuser from admin rest api (issue#16682, pr#14815, snakeAngel2015)
  • rgw: fix break inside of yield in RGWFetchAllMetaCR (issue#17655, pr#14066, Casey Bodley)
  • rgw: fix failed to create bucket if a non-master zonegroup has a single zone (issue#19756, pr#14766, weiqiaomiao)
  • rgw: health check errors out incorrectly (issue#19025, pr#13865, Pavan Rallabhandi)
  • rgw: list_plain_entries() stops before bi_log entries (issue#19876, pr#15383, Casey Bodley)
  • rgw: multisite: fetch_remote_obj() gets wrong version when copying from remote (issue#19599, pr#14607, Zhang Shaowen, Casey Bodley)
  • rgw: multisite: some yields in RGWMetaSyncShardCR::full_sync() resume in incremental_sync() (issue#18076, pr#13837, Casey Bodley, Abhishek Lekshmanan)
  • rgw: only append zonegroups to rest params if not empty (issue#20078, pr#15312, Yehuda Sadeh, Karol Mroz)
  • rgw: pullup civet chunked (issue#19736, pr#14776, Matt Benjamin)
  • rgw: rgw_file: fix event expire check, don’t expire directories being read (issue#19623, issue#19270, issue#19625, issue#19624, issue#19634, issue#19435, pr#14653, Gui Hecheng, Matt Benjamin)
  • rgw: swift: disable revocation thread under certain circumstances (issue#19499, issue#9493, pr#14789, Marcus Watts)
  • rgw: the swift container acl does not support field .ref (issue#18484, pr#13833, Jing Wenjun)
  • rgw: typo in rgw_admin.cc (issue#19026, pr#13863, Ronak Jain)
  • rgw: unsafe access in RGWListBucket_ObjStore_SWIFT::send_response() (issue#19249, pr#14661, Yehuda Sadeh)
  • rgw: upgrade to multisite v2 fails if there is a zone without zone info (issue#19231, pr#14136, Danny Al-Gaaf, Orit Wasserman)
  • rgw: use separate http_manager for read_sync_status (issue#19236, pr#14195, Casey Bodley, Shasha Lu)
  • rgw: when converting region_map we need to use rgw_zone_root_pool (issue#19195, pr#14143, Orit Wasserman)
  • rgw: zonegroupmap set does not work (issue#19498, issue#18725, pr#14660, Orit Wasserman, Casey Bodley)
  • rgw: fix memory leaks in data/md sync (issue#20088, pr#15382, weiqiaomiao)
  • tests: ‘ceph auth import -i’ overwrites caps, should alert user before overwrite (issue#18932, pr#13544, Vikhyat Umrao)
  • tests: New upgrade test for #19508 (issue#19829, issue#19508, pr#14930, Nathan Cutler)
  • tests: [ FAILED ] TestLibRBD.ImagePollIO in upgrade:client-upgrade-kraken-distro-basic-smithi (issue#18617, pr#13107, Jason Dillaman)
  • tests: [librados_test_stub] cls_cxx_map_get_XYZ methods don’t return correct value (issue#19597, pr#14665, Jason Dillaman)
  • tests: additional rbd-mirror test stability improvements (issue#18935, pr#14154, Jason Dillaman)
  • tests: api_misc: [ FAILED ] LibRadosMiscConnectFailure.ConnectFailure (issue#15368, pr#14763, Sage Weil)
  • tests: buffer overflow in test LibCephFS.DirLs (issue#18941, pr#14671, “Yan, Zheng”)
  • tests: clone workunit using the branch specified by task (issue#19429, pr#14371, Kefu Chai, Dan Mick)
  • tests: drop upgrade/hammer-jewel-x (issue#20574, pr#15933, Nathan Cutler)
  • tests: dummy suite fails in OpenStack (issue#18259, pr#14070, Nathan Cutler)
  • tests: eliminate race condition in Thrasher constructor (issue#18799, pr#13608, Nathan Cutler)
  • tests: enable quotas for pre-luminous quota tests (issue#20412, pr#15936, Patrick Donnelly)
  • tests: fix oversight in yaml comment (issue#20581, pr#14449, Nathan Cutler)
  • tests: move swift.py task from teuthology to ceph, phase one (jewel) (issue#20392, pr#15870, Nathan Cutler, Sage Weil, Warren Usui, Greg Farnum, Ali Maredia, Tommi Virtanen, Zack Cerza, Sam Lang, Yehuda Sadeh, Joe Buck, Josh Durgin)
  • tests: qa/Fixed upgrade sequence to 10.2.0 -> 10.2.7 -> latest -x (10.2.8) (issue#20572, pr#16089, Yuri Weinstein)
  • tests: qa/suites/upgrade/hammer-x: set “sortbitwise” for jewel clusters (issue#20342, pr#15842, Nathan Cutler)
  • tests: qa/workunits/rados/test-upgrade-*: whitelist tests for master (part 1) (issue#20577, pr#15360, Sage Weil)
  • tests: qa/workunits/rados/test-upgrade-*: whitelist tests for master (part 2) (issue#20576, pr#15778, Kefu Chai)
  • tests: qa/workunits/rados/test-upgrade-*: whitelist tests the right way (issue#20575, pr#15824, Kefu Chai)
  • tests: rados: sleep before ceph tell osd.0 flush_pg_stats after restart (issue#16239, issue#20489, pr#14710, Kefu Chai, Nathan Cutler)
  • tests: run upgrade/client-upgrade on latest CentOS 7.3 (issue#20573, pr#16088, Nathan Cutler)
  • tests: run-rbd-unit-tests.sh assert in lockdep_will_lock, TestLibRBD.ObjectMapConsistentSnap (issue#17447, pr#14150, Jason Dillaman)
  • tests: systemd test backport to jewel (issue#19717, pr#14694, Vasu Kulkarni)
  • tests: test/librados/tmap_migrate: g_ceph_context->put() upon return (issue#20579, pr#14809, Kefu Chai)
  • tests: test_notify.py: rbd.InvalidArgument: error updating features for image test_notify_clone2 (issue#19692, pr#14680, Jason Dillaman)
  • tests: upgrade/hammer-x failing with OSD has the store locked when Thrasher runs ceph-objectstore-tool on down PG (issue#19556, pr#14416, Nathan Cutler)
  • tests: upgrade:hammer-x/stress-split-erasure-code-x86_64 fails in 10.2.8 integration testing (issue#20413, pr#15904, Nathan Cutler)
  • tools: brag fails to count “in” mds (issue#19192, pr#14112, Oleh Prypin, Peng Zhang)
  • tools: ceph-disk does not support cluster names different than ‘ceph’ (issue#17821, pr#14765, Loic Dachary)
  • tools: ceph-disk: Racing between partition creation and device node creation (issue#19428, pr#14329, Erwan Velu)
  • tools: ceph-disk: bluestore --setgroup incorrectly set with user (issue#18955, pr#13489, craigchi)
  • tools: ceph-disk: ceph-disk list reports mount error for OSD having mount options with SELinux context (issue#17331, pr#14402, Brad Hubbard)
  • tools: ceph-disk: do not setup_statedir on trigger (issue#19941, pr#15504, Loic Dachary)
  • tools: ceph-disk: enable directory backed OSD at boot time (issue#19628, pr#14602, Loic Dachary)
  • tools: rados: RadosImport::import should return an error if Rados::connect fails (issue#19319, pr#14113, Brad Hubbard)
Nathan Cutler
