March 13, 2019

v13.2.5 Mimic released

This is the fifth bugfix release of the Mimic v13.2.x long term
stable release series. We recommend all Mimic users upgrade.

Notable Changes

  • This release fixes the pg log hard limit bug that was introduced in
    13.2.2, https://tracker.ceph.com/issues/36686. A flag called
    pglog_hardlimit has been introduced, which is off by default. Enabling
    this flag will limit the length of the pg log. In order to enable
    that, the flag must be set by running ceph osd set pglog_hardlimit
    after completely upgrading to 13.2.5. Once the cluster has this flag
    set, the length of the pg log will be capped by a hard limit. Once set,
    this flag must not be unset anymore. In luminous, this feature was
    introduced in 12.2.11. Users who are running 12.2.11, and want to
    continue to use this feature, should upgrade to 13.2.5 or later.
  • This release also fixes CVE-2019-3821 in civetweb, where SSL file
    descriptors were not closed if the initial negotiation failed.
  • There have been fixes to RGW dynamic and manual resharding, which no longer
    leave behind stale bucket instances to be removed manually. To find and
    clean up older instances left over from a reshard, the radosgw-admin
    commands reshard stale-instances list and reshard stale-instances rm
    should do the necessary cleanup. These commands should not be used on a
    multisite setup, as the stale instances there are unlikely to be from a
    reshard and removing them can have consequences. In the next version the
    admin CLI will prevent these commands from being run on a multisite
    cluster; for the current release, users are urged not to use the delete
    command on a multisite cluster.
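
The pglog_hardlimit note above says the flag should only be set once the whole cluster has been upgraded. A minimal Python sketch of that precondition follows. It assumes the JSON printed by ceph versions contains an "overall" section keyed by version strings of the form "ceph version X.Y.Z (...)", and that 13.2.5 is the minimum acceptable release. The helper names are hypothetical and not part of Ceph.

```python
import json
import re
import subprocess

# Assumption: 13.2.5 (this release) is the first Mimic version where the flag is safe to set.
MIN_VERSION = (13, 2, 5)

_VERSION_RE = re.compile(r"ceph version (\d+)\.(\d+)\.(\d+)")


def parse_version(version_string):
    """Extract (major, minor, patch) from a 'ceph version X.Y.Z (...)' string."""
    match = _VERSION_RE.search(version_string)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version_string)
    return tuple(int(part) for part in match.groups())


def daemons_below(versions, minimum=MIN_VERSION):
    """Return the version strings under 'overall' that are older than minimum."""
    overall = versions.get("overall", {})
    return sorted(v for v in overall if parse_version(v) < minimum)


def set_pglog_hardlimit():
    """Set the flag only if every daemon reports at least MIN_VERSION."""
    raw = subprocess.check_output(["ceph", "versions"])
    stale = daemons_below(json.loads(raw))
    if stale:
        raise RuntimeError("upgrade incomplete, still running: " + ", ".join(stale))
    # Per the release note, this flag must not be unset once it has been set.
    subprocess.check_call(["ceph", "osd", "set", "pglog_hardlimit"])
```

Calling set_pglog_hardlimit() on an admin node would refuse to proceed while any daemon still reports an older version. The parsing helpers can be exercised on their own with a captured copy of the ceph versions output.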

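The reshard cleanup described in the notes above could be wrapped in a small guard that refuses to run on multisite clusters. The sketch below is illustrative only. The multisite flag is supplied by the caller, since this release does not detect it automatically, and the function name and the injectable run parameter are hypothetical.

```python
import subprocess

# The two radosgw-admin subcommands named in the release notes.
LIST_CMD = ["radosgw-admin", "reshard", "stale-instances", "list"]
RM_CMD = ["radosgw-admin", "reshard", "stale-instances", "rm"]


def cleanup_stale_instances(multisite, remove=False, run=subprocess.check_output):
    """List stale bucket instances and, if remove is True, delete them.

    multisite is a caller-supplied boolean; the operator has to know
    whether the cluster is part of a multisite setup.
    """
    if multisite:
        raise RuntimeError("refusing to touch stale instances on a multisite cluster")
    listing = run(LIST_CMD)  # always list first so the operator can review
    if remove:
        run(RM_CMD)
    return listing
```

Running it with remove=False first gives a chance to inspect the listing before anything is deleted.
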
Changelog

    • build/ops: Destruction of basic_string _GLIBCXX_USE_CXX11_ABI=0 and C++17 mode results in invalid delete (issue#38177, pr#26593, Kefu Chai, Jason Dillaman)
    • build/ops: rpm: require ceph-base instead of ceph-common (issue#37620, pr#25809, Sébastien Han)
    • build/ops: run-make-check.sh ccache tweaks (issue#24817, issue#24777, pr#25153, Nathan Cutler, Jonathan Brielmaier, Erwan Velu)
    • ceph-create-keys: fix octal notation for Python 3 without losing compatibility with Python 2 (issue#37641, pr#25531, James Page)
    • cephfs: MDCache::finish_snaprealm_reconnect() create and drop MClientSnap message (issue#38285, pr#26472, “Yan, Zheng”)
    • cephfs: mgr/status: fix fs status subcommand did not show standby-replay MDS’ perf info (issue#36399, pr#25031, Zhi Zhang)
    • ceph-objectstore-tool: Dump hashinfo (issue#37597, pr#25721, David Zafman)
    • ceph-volume-client: allow setting mode of CephFS volumes (issue#36651, pr#25413, Tom Barron)
    • ceph-volume: enable device discards (issue#36532, pr#25749, Jonas Jelten)
    • ceph-volume: fix JSON output in inventory (issue#37390, pr#25923, Sebastian Wagner)
    • ceph-volume: Fix TypeError: join() takes exactly one argument (2 given) (issue#37595, pr#25771, Sebastian Wagner)
    • ceph-volume normalize comma to dot for string to int conversions (issue#37442, pr#25775, Alfredo Deza)
    • ceph-volume: revert partition as disk (issue#37506, pr#26294, Jan Fajerski)
    • ceph-volume: set permissions right before prime-osd-dir (issue#37486, pr#25777, Andrew Schoen, Alfredo Deza)
    • ceph-volume tests/functional declare ceph-ansible roles instead of importing them (issue#37805, pr#25837, Alfredo Deza)
    • ceph-volume zap: improve zapping to remove all partitions and all LVs, encrypted or not (issue#37449, pr#25351, Alfredo Deza)
    • cli: dump osd-fsid as part of osd find <id> (issue#37966, pr#26035, Noah Watkins)
    • client: do not move f->pos until success write (issue#37546, pr#25683, Junhui Tang)
    • client: fix failure in quota size limitation when using samba (issue#37547, pr#25678, Junhui Tang)
    • client: fix fuse client hang because its pipe to mds is not ok (issue#36079, pr#25903, Guan yunfei)
    • client: retry remount on dcache invalidation failure (issue#27657, pr#24695, Venky Shankar)
    • client: session flush does not cause cap release message flush (issue#38009, pr#26424, Patrick Donnelly)
    • cmake: do not pass -B{symbolic,symbolic-functions} to linker on FreeBSD (issue#36717, pr#25525, Willem Jan Withagen)
    • common: fix memory leaks in WeightedPriorityQueue (issue#36248, pr#25295, Radoslaw Zarzynski)
    • common: fix missing include boost/noncopyable.hpp (issue#38178, pr#26277, Willem Jan Withagen)
    • core: list-inconsistent-obj output truncated, causing osd-scrub-repair.sh failure (issue#37653, pr#25603, David Zafman)
    • core: luminous->(mimic,nautilus): PGMapDigest decode error on luminous end (issue#38295, pr#26451, Sage Weil)
    • core: Objecter::calc_op_budget: Fix invalid access to extent union member (issue#37932, pr#26066, Simon Ruggier)
    • core: scrub warning check incorrectly uses mon scrub interval (issue#37264, pr#26493, David Zafman)
    • deep fsck fails on inspecting very large onodes (issue#38065, pr#26291, Igor Fedotov)
    • doc: pin the version for “breathe” to 4.1.11 (issue#38229, pr#26333, Alfredo Deza)
    • doc: rados/configuration: refresh osdmap section (issue#38051, pr#26373, Ilya Dryomov)
    • doc: updated Ceph documentation links (issue#37793, pr#26180, James McClune)
    • doc/user-management: Remove obsolete reset caps command (issue#37663, pr#25607, Brad Hubbard)
    • journal: max journal order is incorrectly set at 64 (issue#37541, pr#25957, Mykola Golub)
    • librbd: fix missing unblock_writes if shrink is not allowed (issue#36778, pr#25252, runsisi)
    • librbd: reset snaps in rbd_snap_list() (issue#37508, pr#25459, Kefu Chai)
    • mds: broadcast quota message to client when disable quota (issue#38054, pr#26292, Junhui Tang)
    • mds: create separate config for heartbeat timeout (issue#37674, pr#26010, Patrick Donnelly)
    • mds: directories pinned keep being replicated back and forth between exporting mds and importing mds (issue#37368, pr#25521, Xuehan Xu)
    • mds: disallow dumping huge caches to formatter (issue#36703, pr#25642, Venky Shankar)
    • mds: do not call Journaler::_trim twice (issue#37566, pr#25561, Tang Junhui)
    • mds: fix bug filelock stuck at LOCK_XSYN leading client can’t read data (issue#37333, pr#25676, Guan yunfei)
    • mds: fix incorrect l_pq_executing_ops statistics when meet an invalid item in purge queue (issue#37567, pr#25559, Junhui Tang)
    • mds: fix potential re-evaluate stray dentry in _unlink_local_finish (issue#38263, pr#26474, Zhi Zhang)
    • mds: fix races of updating wanted caps (issue#37464, pr#25680, “Yan, Zheng”)
    • mds: handle fragment notify race (issue#36035, pr#26252, “Yan, Zheng”)
    • mds: handle state change race (issue#37594, pr#26051, “Yan, Zheng”)
    • mds: log evicted clients to clog/dbg (issue#37639, pr#25857, Patrick Donnelly)
    • MDSMonitor: allow beacons from stopping MDS that was laggy (issue#37724, pr#25685, Patrick Donnelly)
    • MDSMonitor: missing osdmon writeable check (issue#37929, pr#26069, Patrick Donnelly)
    • mds: purge queue recovery hangs during boot if PQ journal is damaged (issue#37543, pr#26055, Patrick Donnelly)
    • mds: PurgeQueue write error handler does not handle EBLACKLISTED (issue#37394, pr#25523, Patrick Donnelly)
    • mds: remove duplicated l_mdc_num_strays perfcounter set (issue#37516, pr#25681, Zhi Zhang)
    • mds: remove wrong assertion in Locker::snapflush_nudge (issue#37721, pr#25885, “Yan, Zheng”)
    • mds: runs out of file descriptors after several respawns (issue#35850, pr#25822, Patrick Donnelly)
    • mds: severe internal fragment when decoding xattr_map from log event (issue#37399, pr#25519, “Yan, Zheng”)
    • mds: trim cache after journal flush (issue#38010, pr#26214, Patrick Donnelly)
    • mds: wait shorter intervals if beacon not sent (issue#36367, pr#25980, Patrick Donnelly)
    • mgr: add get_latest_counter() to C++ -> Python interface (issue#38138, pr#26074, Jan Fajerski)
    • mgr/balancer: add cmd to list all plans (issue#37418, pr#25293, Yang Honggang)
    • mgr/balancer: add crush_compat_metrics param to change optimization keys (issue#37412, pr#25291, Dan van der Ster)
    • mgr/dashboard: Set mirror_mode to None (issue#37870, pr#26009, Sebastian Wagner)
    • mgr: deadlock: _check_auth_rotating possible clock skew, rotating keys expired way too early (issue#23460, pr#26426, Yan Jun)
    • mgr: prometheus: added bluestore db and wal devices to ceph_disk_occupation metric (issue#36627, pr#25218, Konstantin Shalygin)
    • mgr: race between daemon state and service map in ‘service status’ (issue#36656, pr#25368, Mykola Golub)
    • mgr/restful: fix py got exception when get osd info (issue#38182, pr#26200, Boris Ranto, zouaiguo)
    • mgr: various python3 fixes (issue#37415, pr#25292, Noah Watkins)
    • mgr will refuse connection from the monitor who starts behind it (issue#37753, pr#26235, Xinying Song)
    • mgr/zabbix: Send more PG information to Zabbix (issue#38180, pr#25944, Wido den Hollander)
    • mon: A PG with PG_STATE_REPAIR doesn’t mean damaged data, PG_STATE_IN… (issue#38070, pr#26304, David Zafman)
    • mon: log last command skips latest entry (issue#36679, pr#25526, John Spray)
    • mon: mark REMOVE_SNAPS messages as no_reply (issue#37568, pr#25782, “Yan, Zheng”)
    • mon/OSDMonitor: do not populate void pg_temp into nextmap (issue#37784, pr#25844, Aleksei Zakharov)
    • mon: shutdown messenger early to avoid accessing deleted logger (issue#37780, pr#25846, ningtao)
    • msg/async: backport recent messenger fixes (issue#36497, issue#37778, pr#25958, xie xingguo)
    • msg/async: crashes when authenticator provided by verify_authorizer not implemented (issue#36443, pr#25299, Sage Weil)
    • multisite: es sync null versioned object failed because of olh info (issue#23842, issue#23841, pr#25578, Tianshan Qu, Shang Ding)
    • os/bluestore: fixup access a destroy cond cause deadlock or undefine (issue#37733, pr#26260, linbing)
    • os/bluestore: KernelDevice::read() does the EIO mapping now (issue#36455, pr#25854, Radoslaw Zarzynski)
    • os/bluestore: rename does not old ref to replacement onode at old name (issue#36541, pr#25313, Sage Weil)
    • osd: Add support for osd_delete_sleep configuration value (issue#36474, pr#25507, Jianpeng Ma, David Zafman)
    • osd-backfill-stats.sh fails in rados/standalone/osd.yaml (issue#37393, issue#35982, pr#26329, Sage Weil, David Zafman)
    • osd: backport recent upmap fixes (issue#37940, issue#37881, pr#26128, huangjun, xie xingguo)
    • osdc/Objecter: update op_target_t::paused in _calc_target (issue#37398, pr#25718, Song Shun, runsisi)
    • osd: failed assert when osd_memory_target options mismatch (issue#37507, pr#25605, xie xingguo)
    • osd: force-backfill sets forced_recovery instead of forced_backfill in 13.2.1 (issue#27985, pr#26324, xie xingguo)
    • osd/mon: fix upgrades for pg log hard limit (issue#36686, pr#26206, Neha Ojha)
    • osd/OSDMap: cancel mapping if target osd is out (issue#37501, pr#25699, ningtao, xie xingguo)
    • osd/OSD: OSD::mkfs asserts when reusing disk with existing superblock (issue#37404, pr#25385, Igor Fedotov)
    • osd/PG.cc: account for missing set irrespective of last_complete (issue#37919, pr#26239, Neha Ojha)
    • osd/PrimaryLogPG: fix the extent length error of the sync read (issue#37680, pr#25708, Xiaofei Cui)
    • osd: Prioritize user specified scrubs (issue#37269, pr#25513, David Zafman)
    • os/filestore: ceph_abort() on fsync(2) or fdatasync(2) failure (issue#38258, pr#26438, Sage Weil)
    • pybind/mgr: drop unnecessary iterkeys usage to make py-3 compatible (issue#37581, pr#25759, Mykola Golub)
    • pybind/mgr/status: fix ceph fs status in py3 environments (issue#37573, pr#25694, Jan Fajerski)
    • qa: pjd test appears to require more than 3h timeout for some configurations (issue#36594, pr#25557, Patrick Donnelly)
    • qa/rados/upgrade: align thrashing with upgrade suite, don’t import/export pgs (issue#37665, pr#25856, Sage Weil)
    • qa/tasks/radosbench: default to 64k writes (issue#37797, pr#26354, Sage Weil)
    • qa: test_damage needs to silence MDS_READ_ONLY (issue#37944, pr#26072, Patrick Donnelly)
    • qa: test_damage performs truncate test on same object repeatedly (issue#37836, issue#37837, pr#26047, Patrick Donnelly)
    • qa: teuthology may hang on diagnostic commands for fuse mount (issue#36390, pr#25515, Patrick Donnelly)
    • qa: whitelist cap revoke warning (issue#25188, pr#26496, Patrick Donnelly)
    • qa/workunits/rados/test_health_warnings: prevent out osds (issue#37776, pr#25850, Sage Weil)
    • qa: wrong setting for msgr failures (issue#36676, pr#25517, Patrick Donnelly)
    • rbd: fix delay time calculation for trash move (issue#37861, pr#25954, Mykola Golub)
    • rgw: debug logging for v4 auth does not sanitize encryption keys (issue#37847, pr#26003, Casey Bodley)
    • rgw: Don’t treat colons specially in resource part of ARN (issue#23817, pr#25386, Adam C. Emerson)
    • rgw: fails to start on Fedora 28 from default configuration (issue#24228, pr#26129, Matt Benjamin)
    • rgw: feature – log successful bucket resharding events (issue#37647, pr#25740, J. Eric Ivancich)
    • rgw_file: user info never synced since librgw init (issue#37527, pr#25485, Tao Chen)
    • rgw: fix max-size in radosgw-admin and REST Admin API (issue#37517, pr#25449, Nick Erdmann)
    • rgw: fix version bucket stats (issue#21429, pr#25643, Shasha Lu)
    • rgw: handle S3 version 2 pre-signed urls with meta-data (issue#23470, pr#25899, Matt Benjamin)
    • rgw: master zone deletion without a zonegroup rm would break rgw rados init (issue#37328, pr#25511, Abhishek Lekshmanan)
    • rgw: multisite: sync gets stuck retrying deletes that fail with ERR_PRECONDITION_FAILED (issue#37448, pr#25505, Casey Bodley)
    • rgw: Object can still be deleted even if s3:DeleteObject policy is set (issue#37403, pr#26309, Enming.Zhang)
    • rgw: “radosgw-admin bucket rm … --purge-objects” can hang (issue#38134, pr#26266, J. Eric Ivancich)
    • rgw: radosgw-admin: translate reshard status codes (trivial) (issue#36486, pr#25198, Matt Benjamin)
    • rgw: rgwgc: process coredump in some special case (issue#23199, pr#25624, zhaokun)
    • rpm: Use hardened LDFLAGS (issue#36316, pr#25171, Boris Ranto)
abhishekl
