Planet Ceph

Aggregated news from external sources

  • May 12, 2015
    Ceph Jerasure and ISA plugins benchmarks

    In Ceph, a pool can be configured to use erasure coding instead of replication to save space. When used with Intel processors, the default Jerasure plugin that computes erasure code can be replaced by the ISA plugin for better write … Continue reading

  • May 7, 2015
    Improving Ceph python scripts tests

    The Ceph command line and ceph-disk helper are python scripts for which there are integration tests (ceph-disk.sh and test.sh). It would be useful to add unit tests and pep8 checks. It can be done by creating a python module instead … Continue reading

  • April 10, 2015
    Ceph make check in a ram disk

    When running tests from the Ceph sources, the disk is used intensively and a ram disk can be used to reduce the latency. The machine must be rebooted to set the ramdisk maximum size to 16GB. For instance on Ubuntu … Continue reading

  • March 27, 2015
    Update: OpenStack Summit Vancouver Presentation
    The schedule for the upcoming OpenStack Summit 2015 in Vancouver is finally available. Sage and I submitted a presentation about “Storage security in a critical enterprise OpenStack environment”. The submission was accepted and the talk is scheduled for Monday, May 18th, 15:40 – 16:20.

    There are also some other Ceph-related talks available.
    Check out the links or the schedule for dates and times of the talks.

    See you in Vancouver!

  • March 27, 2015
    Ceph erasure coding overhead in a nutshell

    Calculating the storage overhead of a replicated pool in Ceph is easy.
    You divide the amount of space you have by the “size” (number of replicas) parameter of your storage pool.

    Let’s work with some rough numbers: 64 OSDs of 4TB each.

    Raw size: 64 * 4  = 256TB
    Size 2  : 256 / 2 = 128TB
    Size 3  : 256 / 3 = 85.33TB
    

    Replicated pools are expensive in terms of overhead: Size 2 provides the same resilience and overhead as RAID-1.
    Size 3 provides more resilience than RAID-1, but at the cost of even more overhead.
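
    As a quick illustration, here is a minimal Python sketch of the same arithmetic, using the rough numbers from above (64 OSDs of 4TB each):

    def replicated_usable_tb(n_osd, osd_tb, size):
        # Usable capacity of a replicated pool: raw capacity divided by the
        # replica count (the pool's "size" parameter).
        return n_osd * osd_tb / float(size)

    for size in (2, 3):
        print("Size %d: %.2fTB" % (size, replicated_usable_tb(64, 4, size)))
    # Size 2: 128.00TB
    # Size 3: 85.33TB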

    Explaining what Erasure coding is about gets complicated quickly.

    I like to compare replicated pools to RAID-1 and Erasure coded pools to RAID-5 (or RAID-6) in the sense that there are data chunks and recovery/parity/coding chunks.

    What’s appealing with erasure coding is that it can provide the same (or better) resiliency than replicated pools but with less storage overhead – at the cost of the computing it requires.

    Ceph has had erasure coding support for a good while already, and interesting documentation is available.

    The thing with erasure coded pools, though, is that you’ll need a cache tier in front of them to be able to use them in most cases.

    This makes for a perfect synergy of slower/larger/less expensive drives for your erasure coded pool and faster, more expensive drives in front as your cache tier.

    To calculate the overhead of an erasure coded pool, you need to know the ‘k’ and ‘m’ values of your erasure code profile.

    chunk

      When the encoding function is called, it returns chunks of the same size: data chunks, which can be concatenated to reconstruct the original object, and coding chunks, which can be used to rebuild a lost chunk.

    K

      The number of data chunks, i.e. the number of chunks in which the original object is divided. For instance, if K = 2, a 10KB object will be divided into K chunks of 5KB each.

    M

      The number of coding chunks, i.e. the number of additional chunks computed by the encoding functions. If there are 2 coding chunks, it means 2 OSDs can be out without losing data.

    The formula to calculate the usable capacity of an erasure coded pool (i.e. what is left once the coding overhead is subtracted) is:

    nOSD * k / (k + m) * OSD size
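
    As a rough sketch (the helper name is mine, not part of Ceph), with the same 64 OSDs of 4TB as above:

    def ec_usable_tb(n_osd, osd_tb, k, m):
        # Of every k+m chunks written, only the k data chunks hold actual data;
        # the m coding chunks are the storage overhead.
        return n_osd * osd_tb * k / float(k + m)

    # k=4, m=2 tolerates the loss of two OSDs, like a size 3 replicated pool,
    # but yields 170.67TB usable instead of 85.33TB:
    print("%.2fTB" % ec_usable_tb(64, 4, k=4, m=2))   # 170.67TB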
    

    Finally, let’s look at the usable capacity (in TB) for a range of erasure code profiles on 64 OSDs of 4TB each, with k from 1 to 10 (rows) and m from 1 to 4 (columns):

    | k \ m | m=1    | m=2    | m=3    | m=4    |
    |-------|--------|--------|--------|--------|
    | k=1   | 128.00 | 85.33  | 64.00  | 51.20  |
    | k=2   | 170.67 | 128.00 | 102.40 | 85.33  |
    | k=3   | 192.00 | 153.60 | 128.00 | 109.71 |
    | k=4   | 204.80 | 170.67 | 146.29 | 128.00 |
    | k=5   | 213.33 | 182.86 | 160.00 | 142.22 |
    | k=6   | 219.43 | 192.00 | 170.67 | 153.60 |
    | k=7   | 224.00 | 199.11 | 179.20 | 162.91 |
    | k=8   | 227.56 | 204.80 | 186.18 | 170.67 |
    | k=9   | 230.40 | 209.45 | 192.00 | 177.23 |
    | k=10  | 232.73 | 213.33 | 196.92 | 182.86 |
    | Raw   | 256    | 256    | 256    | 256    |
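
    The table can be reproduced with the ec_usable_tb sketch from above, for example the m=2 column:

    for k in range(1, 11):
        print("k=%-2d m=2: %6.2fTB" % (k, ec_usable_tb(64, 4, k, 2)))
    # k=1  m=2:  85.33TB ... k=10 m=2: 213.33TB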
    
  • March 11, 2015
    New release of python-cephclient: 0.1.0.5

    I’ve just drafted a new release of python-cephclient
    on PyPI: v0.1.0.5.

    After learning about the ceph-rest-api I just had
    to do something fun with it.

    In fact, it’s going to become very handy for me as I might start to develop
    with it for things like Nagios monitoring scripts.

    The changelog:

    dmsimard:

    • Add missing dependency on the requests library
    • Some PEP8 and code standardization cleanup
    • Add root “PUT” methods
    • Add mon “PUT” methods
    • Add mds “PUT” methods
    • Add auth “PUT” methods

    Donald Talton:

    • Add osd “PUT” methods

    Please try it out and let me know if you have any feedback!

    Pull requests are welcome 🙂
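
    As an idea of what this enables (not part of the release): a minimal Nagios-style health check built directly on ceph-rest-api with the requests library. The endpoint URL below is an assumption to adapt to your deployment, and python-cephclient wraps the same API if you prefer not to talk to it by hand:

    import sys
    import requests

    # Hypothetical ceph-rest-api endpoint; adjust host and port for your setup.
    ENDPOINT = "http://apiserver:5000/api/v0.1"

    def check_ceph_health():
        # Ask the REST API for cluster health and map it to Nagios exit codes.
        r = requests.get(ENDPOINT + "/health", headers={"Accept": "application/json"})
        r.raise_for_status()
        body = r.text
        if "HEALTH_OK" in body:
            print("OK - cluster reports HEALTH_OK")
            return 0
        if "HEALTH_WARN" in body:
            print("WARNING - cluster reports HEALTH_WARN")
            return 1
        print("CRITICAL - %s" % body)
        return 2

    if __name__ == "__main__":
        sys.exit(check_ceph_health())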

  • March 9, 2015
    Provisioning a teuthology target with a given kernel

    When a teuthology target (i.e. machine) is provisioned with teuthology-lock for the purpose of testing Ceph, there is no way to choose the kernel. But it can be installed afterwards using the following: cat > kernel.yaml <<EOF interactive-on-error: true roles: … Continue reading

  • March 9, 2015
    Ceph OSD uuid conversion to OSD id and vice versa

    When handling a Ceph OSD, it is convenient to assign it a symbolic name that can be chosen even before it is created. That’s what the uuid argument for ceph osd create is for. Without a uuid argument, a random … Continue reading

  • March 3, 2015
    Re-schedule failed teuthology jobs

    The Ceph integration tests may fail because of environmental problems (network not available, packages not built, etc.). If six jobs failed out of seventy, these failed tests can be re-run instead of re-scheduling the whole suite. It can be done … Continue reading

  • February 28, 2015
    HOWTO extract a stack trace from teuthology (take 2)

    When a Ceph teuthology integration test fails (for instance a rados job), it will collect core dumps which can be downloaded from the same directory where the logs and config.yaml files can be found, under the remote/mira076/coredump directory. The binary … Continue reading

  • February 24, 2015
    CDS: Infernalis Call for Blueprints

    The “Ceph Developer Summit” for the Infernalis release is on the way. The summit is planned for March 3rd and 4th. The blueprint submission period started on February 16th and will end on February 27th, 2015. Do you miss something in Ceph or plan to deve…

  • February 17, 2015
    Vote for Presentations: OpenStack Summit Vancouver 2015

    The next OpenStack Summit is planned to take place in Vancouver from May 18th to 22nd, 2015. The “Call for Speakers” ended last week. The voting period for presentations started today and will end on February 23rd at 23:00 UTC (00:00 CEST, 05:00 CST). This …
