Planet Ceph

Aggregated news from external sources

April 18, 2008

Delayed capability release

The Ceph MDS server issues “capabilities” to clients to grant them permission to read or write objects for a particular file. I’ve added a delayed release of capabilities after a file is closed, as many workloads will quickly reopen the same file. In that case, we can re-use our existing capabilities (and be assured by the metadata leases that we’re correctly resolved the pathname to the correct inode) without any additional MDS interaction. This is a big win for, surprisingly, find, which has a pretty peculiar access pattern when descending out of directories:

open("..", O_RDONLY|O_NOFOLLOW)         = 4
fchdir(4)                               = 0
close(4)                                = 0
open("..", O_RDONLY|O_NOFOLLOW)         = 4
fchdir(4)                               = 0
close(4)                                = 0
fchdir(3)                               = 0
open(".", O_RDONLY|O_NOFOLLOW)          = 4
fchdir(4)                               = 0
close(4)                                = 0
close(1)                                = 0

The explicit open was almost doubling the number of MDS ops. After adding delayed cap release, the workload (as seen by the MDS), now looks pretty tidy:

mds0 <== client0 ==== client_request(client0.22 open /a) ====
mds0 <== client0 ==== client_request(client0.23 readdir #10000000000) ====
mds0 <== client0 ==== client_request(client0.24 open /a/b) ====
mds0 <== client0 ==== client_request(client0.25 readdir #10000000001) ====
mds0 <== client0 ==== client_request(client0.26 open /a/b/c) ====
mds0 <== client0 ==== client_request(client0.27 readdir #10000000002) ====
mds0 <== client0 ==== client_request(client0.28 open /a/b/d) ====
mds0 <== client0 ==== client_request(client0.29 readdir #10000000003) ====
mds0 <== client0 ==== client_request(client0.30 open /a/e) ====
mds0 <== client0 ==== client_request(client0.31 readdir #10000000004) ====
mds0 <== client0 ==== client_file_caps(ack ino 10000000000 seq 2 caps [ ] wanted[ ] size 0/0) ====
mds0 <== client0 ==== client_file_caps(ack ino 10000000001 seq 2 caps [ ] wanted[ ] size 0/0) ====
mds0 <== client0 ==== client_file_caps(ack ino 10000000002 seq 2 caps [ ] wanted[ ] size 0/0) ====
mds0 <== client0 ==== client_file_caps(ack ino 10000000003 seq 2 caps [ ] wanted[ ] size 0/0) ====
mds0 <== client0 ==== client_file_caps(ack ino 10000000004 seq 2 caps [ ] wanted[ ] size 0/0) ====

..where the last 5 message release the capabilities a few seconds later.

Careers