Planet Ceph

Aggregated news from external sources

  • March 27th, 2015
    Ceph erasure coding overhead in a nutshell

    Calculating the storage overhead of a replicated pool in Ceph is easy.
    You divide the amount of raw space you have by the “size” (number of replicas) parameter of your storage pool.

    Let’s work with some rough numbers: 64 OSDs of 4TB each.

    Raw size: 64 * 4  = 256TB
    Size 2  : 256 / 2 = 128TB
    Size 3  : 256 / 3 = 85.33TB
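
    The same arithmetic as a minimal Python sketch. The function name `replicated_usable_tb` is made up for illustration and is not part of Ceph; it only divides raw capacity by the replica count and ignores things like full ratios or uneven data distribution.

```python
def replicated_usable_tb(num_osds: int, osd_size_tb: float, size: int) -> float:
    """Usable capacity of a replicated pool: raw capacity divided by the replica count."""
    if size < 1:
        raise ValueError("size must be at least 1")
    raw_tb = num_osds * osd_size_tb  # 64 * 4 = 256TB in the example above
    return raw_tb / size


for size in (2, 3):
    print(f"Size {size}: {replicated_usable_tb(64, 4, size):.2f}TB")
```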
    

    Replicated pools are expensive in terms of overhead: Size 2 provides the same resilience and overhead as RAID-1.
    Size 3 provides more resilience than RAID-1, but at the cost of even more overhead.

    Explaining what Erasure coding is about gets complicated quickly.

    I like to compare replicated pools to RAID-1 and Erasure coded pools to RAID-5 (or RAID-6) in the sense that there are data chunks and recovery/parity/coding chunks.

    What’s appealing about erasure coding is that it can provide the same resiliency as replicated pools (or better) with less storage overhead, at the cost of the computing it requires.

    Ceph has had erasure coding support for a good while already and interesting documentation is available:

    The thing with erasure coded pools, though, is that you’ll need a cache tier in front of them to be able to use them in most cases.

    This makes for a perfect synergy of slower/larger/less expensive drives for your erasure coded pool and faster, more expensive drives in front as your cache tier.

    To calculate the overhead of an erasure coded pool, you need to know the ‘k’ and ‘m’ values of your erasure code profile.

    chunk

      When the encoding function is called, it returns chunks of the same size: data chunks, which can be concatenated to reconstruct the original object, and coding chunks, which can be used to rebuild a lost chunk.

    K

      The number of data chunks, i.e. the number of chunks in which the original object is divided. For instance if K = 2 a 10KB object will be divided into K objects of 5KB each.

    M

      The number of coding chunks, i.e. the number of additional chunks computed by the encoding functions. If there are 2 coding chunks, it means 2 OSDs can be out without losing data.
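
    To make the K and M definitions above concrete, here is a small sketch of how an object is laid out. `chunk_layout` is a hypothetical helper, not a Ceph API, and it assumes the object divides evenly into K chunks with no padding or stripe alignment, which a real pool would add.

```python
def chunk_layout(object_size_kb: float, k: int, m: int) -> dict:
    """Describe how one object is split for a given k (data) and m (coding) profile."""
    if k < 1 or m < 0:
        raise ValueError("k must be >= 1 and m must be >= 0")
    chunk_kb = object_size_kb / k  # every chunk, data or coding, has the same size
    return {
        "chunk_kb": chunk_kb,
        "total_chunks": k + m,            # each chunk is stored on a different OSD
        "stored_kb": chunk_kb * (k + m),  # space consumed across all OSDs
        "tolerated_failures": m,          # OSDs that can be out without losing data
    }


# The 10KB object with K = 2 from the definition, plus 2 coding chunks.
print(chunk_layout(10, k=2, m=2))
```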

    The formula to calculate the usable capacity, i.e. what is left once the overhead is accounted for, is:

    nOSD * k / (k+m) * OSD Size
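
    The formula translates directly into code. This is only a sketch: the names `ec_usable_tb` and `storage_overhead_ratio` are invented for this example, and the result is a theoretical figure that assumes identical OSDs filled evenly.

```python
def ec_usable_tb(num_osds: int, osd_size_tb: float, k: int, m: int) -> float:
    """Usable capacity of an erasure coded pool: nOSD * k / (k + m) * OSD size."""
    if k < 1 or m < 0:
        raise ValueError("k must be >= 1 and m must be >= 0")
    return num_osds * k / (k + m) * osd_size_tb


def storage_overhead_ratio(k: int, m: int) -> float:
    """Raw space consumed per unit of user data, e.g. 1.5 for k=2, m=1."""
    return (k + m) / k


print(f"k=2, m=1: {ec_usable_tb(64, 4, 2, 1):.2f}TB usable, "
      f"{storage_overhead_ratio(2, 1):.2f}x raw space per byte stored")
```

    Note that k=1, m=1 gives the same usable capacity as a Size 2 replicated pool (128TB), and k=1, m=2 the same as Size 3 (85.33TB).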
    

    Finally, let’s look at several erasure coding profiles for 64 OSDs of 4TB each, with m ranging from 1 to 4 (columns) and k ranging from 1 to 10 (rows). Values are usable capacity in TB:

    | k\m | m=1    | m=2    | m=3    | m=4    |
    |-----|--------|--------|--------|--------|
    | 1   | 128.00 | 85.33  | 64.00  | 51.20  |
    | 2   | 170.67 | 128.00 | 102.40 | 85.33  |
    | 3   | 192.00 | 153.60 | 128.00 | 109.71 |
    | 4   | 204.80 | 170.67 | 146.29 | 128.00 |
    | 5   | 213.33 | 182.86 | 160.00 | 142.22 |
    | 6   | 219.43 | 192.00 | 170.67 | 153.60 |
    | 7   | 224.00 | 199.11 | 179.20 | 162.91 |
    | 8   | 227.56 | 204.80 | 186.18 | 170.67 |
    | 9   | 230.40 | 209.45 | 192.00 | 177.23 |
    | 10  | 232.73 | 213.33 | 196.92 | 182.86 |
    | Raw | 256    | 256    | 256    | 256    |
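
    The table can be regenerated with a few lines of Python, which is handy if you want to plug in your own OSD count and size. This is a sketch under the same assumptions as the formula; `ec_usable_tb` and `build_table` are illustrative names, not Ceph tooling.

```python
NUM_OSDS = 64
OSD_SIZE_TB = 4


def ec_usable_tb(k: int, m: int, num_osds: int = NUM_OSDS,
                 osd_size_tb: float = OSD_SIZE_TB) -> float:
    """Usable capacity in TB for a k/m profile: nOSD * k / (k + m) * OSD size."""
    return num_osds * osd_size_tb * k / (k + m)


def build_table(k_values=range(1, 11), m_values=range(1, 5)) -> dict:
    """Map each k to the list of usable capacities for every m."""
    return {k: [ec_usable_tb(k, m) for m in m_values] for k in k_values}


table = build_table()
print("k\\m " + "".join(f"{m:>8}" for m in range(1, 5)))
for k, row in table.items():
    print(f"{k:<4}" + "".join(f"{value:>8.2f}" for value in row))
```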
    
  • April 8th, 2014
    Erasure Coding in Ceph

    Erasure Coding: All you have to know. If there is data, there will be failure, and there will also be administrators like us to recover this data. Erasure coding is our shield. Storage systems have technologies for data protection and reco…
