HOWTO test a Ceph crush rule

loic

The crushtool utility can be used to test Ceph crush rules before applying them to a cluster.

$ crushtool --outfn crushmap --build --num_osds 10 \ host straw 2 rack straw 2 default straw 0

id weight type name reweight

-9 10 default default -6 4 rack rack0 -1 2 host host0 0 1 osd.0 1 1 1 osd.1 1 -2 2 host host1 2 1 osd.2 1 3 1 osd.3 1 -7 4 rack rack1 -3 2 host host2 4 1 osd.4 1 5 1 osd.5 1 -4 2 host host3 6 1 osd.6 1 7 1 osd.7 1 -8 2 rack rack2 -5 2 host host4 8 1 osd.8 1 9 1 osd.9 1

Creates a crushmap from scratch (–build). It assumes there is a total of 10 OSDs available ( –num_osds 10 ). It then places two OSDs in each host ( host straw 2 ). The resulting hosts (five of them) are then placed in racks, at most two per racks ( rack straw 2 ). All racks are placed in the default root (that’s what the zero stands for : all of them) ( default straw 0 ). The last rack only has one host because there is an odd number of hosts available.
The crush rule to be tested can be injected in the crushmap with

crushtool --outfn crushmap --build --num_osds 10 host straw 2 rack straw 2 default straw 0 crushtool -d crushmap -o crushmap.txt cat >> crushmap.txt <<EOF rule myrule { ruleset 1 type replicated min_size 1 max_size 10 step take default step choose firstn 2 type rack step chooseleaf firstn 2 type host step emit } EOF crushtool -c crushmap.txt -o crushmap

This crushmap should be able to provide two OSDs ( for placement groups for instance ) and it can be verified with the –test option.

$ crushtool -i crushmap --test --show-statistics --rule 1 --min-x 1 --max-x 2 --num-rep 2 rule 1 (myrule), x = 1..2, numrep = 2..2 CRUSH rule 1 x 1 [0,2] CRUSH rule 1 x 2 [7,4] rule 1 (myrule) num_rep 2 result size == 2: 2/2

The –rule 1 designates the rule that was injected. The –rule 0 is the default rule that is created by default. The x can be thought of as the unique name of the placement group for which OSDs are reclaimed. The –min-x 1 –max-x 2 varies the value of x from 1 to 2 therefore trying the rule only twice. –min-x 1 –max-x 2048 would create 2048 lines. Each line shows the value of x after the rule number. In rule 1 x 2 the 1 is the rule number and the 2 is the value of x. The last line shows that for all values of x (2/2 i.e. 2 values of x out of 2), when asked to provide 2 OSDs (num_rep 2) the crush rule was able to provide 2 (result size == 2).

If asked for 4 OSDs, the same crush rule may fail because it has barely enough resources to satisfy the requirements.

$ crushtool -i crushmap --test --show-statistics --rule 1 --min-x 1 --max-x 2 --num-rep 4 rule 1 (myrule), x = 1..2, numrep = 4..4 CRUSH rule 1 x 1 [0,2,9] CRUSH rule 1 x 2 [7,4,1,3] rule 1 (myrule) num_rep 4 result size == 3: 1/2 rule 1 (myrule) num_rep 4 result size == 4: 1/2

The statistics at the end shows that one of the two mappings failed: the result size == 3 is lower than the required number num_rep 4. If asked for more OSDs than the rule can provide, the rule will always fail.

crushtool -i crushmap --test --show-statistics --rule 1 --min-x 1 --max-x 2 --num-rep 5 rule 1 (myrule), x = 1..2, numrep = 5..5 CRUSH rule 1 x 1 [0,2,9] CRUSH rule 1 x 2 [7,4,1,3] rule 1 (myrule) num_rep 5 result size == 3: 1/2 rule 1 (myrule) num_rep 5 result size == 4: 1/2

More examples of crushtool usage can be found in the crushtool directory of the Ceph sources.