Planet Ceph

Aggregated news from external sources

  • June 22, 2014
    A use case of Tengine, a drop-in replacement and fork of nginx

    I’ve always been a fan of nginx; it was love at first sight.

    I tend to use nginx first and foremost as a reverse proxy server for web
    content and applications. This means that nginx sends your request to
    backend servers and forwards you their response.
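
    As a sketch, the reverse proxy setup described above boils down to a
    server block like this (the hostname, port and backend address are
    made up for illustration):

        # nginx as a reverse proxy: requests are forwarded to a backend
        # application server and its response is relayed to the client.
        server {
            listen 80;
            server_name example.com;              # hypothetical hostname

            location / {
                proxy_pass http://127.0.0.1:8000; # hypothetical backend
                proxy_set_header Host $host;
                proxy_set_header X-Real-IP $remote_addr;
            }
        }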

    The cool thing about the backend servers I use is that each one is
    good at what it does: serving code and applications written in a
    specific language.

    Mix an awesome, lightweight proxy with an awesome backend server and
    you’re in for some serious performance.
    This is in contrast to Apache and its module-based approach: it
    tries to do everything itself – jack of all trades, master of none.

    nginx is steadily increasing its market share against the likes of
    Apache, and it’s not hard to understand why.

    Did I tell you that nginx can also do SSL termination and act as
    a load balancer?


    Enough of nginx, let’s talk about Tengine.

    Ever heard of Taobao? I’ll be honest, I hadn’t until fairly recently.
    It turns out they are number 8 on Alexa’s top websites, right ahead
    of Twitter.
    With China making up almost 20% of the world’s population, even a
    small share of that market is huge by any measure.

    Tengine is a fork of nginx created by the team over at Taobao. There
    are a lot of features in Tengine that do not (yet) exist in nginx,
    including some that the upstream maintainers said they would not
    implement.

    Some highlights include:

    • All features of Nginx-1.4.7 are inherited, i.e., it is 100%
      compatible with Nginx.
    • Dynamic module loading support (DSO). No need to recompile Tengine.
    • Send unbuffered uploads directly to backend servers
    • More load balancing methods like consistent hashing and session persistence
    • Input body filter support, for use in things like web application firewalls
    • Logging enhancements: Syslog (local or remote), pipe logging and log sampling
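
    As a rough sketch, a couple of these features map onto configuration
    like the following (directive names taken from Tengine’s
    documentation; the addresses are hypothetical and the snippet is
    illustrative, not a tested configuration):

        # Load a module at runtime (DSO) – no recompilation needed.
        dso {
            load ngx_http_memcached_module.so;
        }

        # Consistent-hashing load balancing across two backends.
        upstream backends {
            consistent_hash $request_uri;
            server 10.0.0.1:8080;
            server 10.0.0.2:8080;
        }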

    The use case: Object storage

    Long story short, object storage is a means of storing data online
    and making it easily accessible with the help of APIs.
    Examples of products using this technology include Dropbox, Google
    Drive, Microsoft OneDrive and Amazon S3.

    ownCloud is also a good open source, self-hosted alternative front
    end to object storage.

    Openstack Swift and Ceph Object Gateway

    Openstack Swift and Ceph Object Gateway (RADOS Gateway) are two
    of the most popular open source object storage solutions out there right now.

    They’re both similar in that you upload files to a proxy server – a
    Swift proxy server or a Ceph RADOS Gateway server. These proxy
    servers relay the files to storage servers, which distribute and
    replicate the data to ensure its high availability and redundancy.

    It looks a bit like this:

                                    +--> |  Storage  |
                                    |    +-----------+
        +-----+  File  +-------+    |    +-----------+
        | You | +----> | Proxy | +-----> |  Storage  |
        +-----+        +-------+    |    +-----------+
                                    |    +-----------+
                                    +--> |  Storage  |

    Now, in a highly available and distributed environment, you might have
    dozens or hundreds of storage and proxy servers. There are a lot of
    options for load balancing them: you might use something like
    haproxy, pound or nginx.

    With a load balancer in front of your proxy servers, your setup now
    looks like this:

                                                 +-------+         +-----------+
                                            +--> | Proxy | +--+--> |  Storage  |
                                            |    +-------+    |    +-----------+
                                            |                 |                 
        +-----+  File  +---------------+    |    +-------+    |    +-----------+
        | You | +----> | Load Balancer | +-----> | Proxy | +-----> |  Storage  |
        +-----+        +---------------+    |    +-------+    |    +-----------+
                                            |                 |                 
                                            |    +-------+    |    +-----------+
                                            +--> | Proxy | +--+--> |  Storage  |
                                                 +-------+         +-----------+
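
    With plain nginx, the load-balancing tier in the diagram above
    amounts to an upstream block (all addresses are hypothetical):

        # nginx load balancing across several Swift proxy / RADOS Gateway
        # servers; requests are distributed round-robin by default.
        upstream object_proxies {
            server 10.0.0.1:8080;
            server 10.0.0.2:8080;
            server 10.0.0.3:8080;
        }

        server {
            listen 80;
            location / {
                proxy_pass http://object_proxies;
            }
        }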

    I noticed a problem when using nginx as a load balancer in front of
    servers that are the target of large and numerous uploads: nginx
    buffers the request body, and this is something that drives a lot of
    discussion on the nginx mailing lists.

    This effectively means that the file is uploaded twice. You upload a
    file to nginx that acts as a reverse proxy/load balancer and nginx waits
    until the file is finished uploading before sending the file to one of
    the available backends. The buffer will happen either in memory or to an
    actual file, depending on configuration.

    Tengine was recently brought up on the Ceph mailing lists as part of
    the solution to this problem, so I decided to give it a try and
    see what kind of impact its unbuffered requests had on performance.

    An unscientific test

    I uploaded a 1GB file to an Object storage cluster with nginx 1.6.0 in
    front. I then swapped it out for Tengine 1.5.2 and tried again. Swapping
    webservers was as simple as uninstalling Nginx and installing Tengine
    from a package I built. The configuration I had was 100% compatible,
    I only had to add configuration to disable request buffering.
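
    For reference, the extra configuration amounted to something like
    this (proxy_request_buffering is the Tengine directive for unbuffered
    uploads; the upstream name is hypothetical):

        location / {
            proxy_pass http://object_proxies;
            proxy_request_buffering off;  # stream uploads straight to the backend
        }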

    The layout looked like this:

        +----+  1GB File   +---------------+       +-------+       +-----------+
        | Me | +---------> | Load Balancer | +---> | Proxy | +---> |  Storage  |
        +----+             +---------------+       +-------+       +-----------+
                               1Gbps           1Gbps

    With nginx, the upload took 1 minute 13 seconds.
    With Tengine, the upload took 41 seconds.

    That’s a difference of more than 30 seconds!
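
    Some back-of-the-envelope math on those two timings (taking 1 GB as
    1024 MB; nothing here comes from the test beyond the two durations):

```python
# Back-of-the-envelope throughput for the 1 GB upload test.
size_mb = 1024                      # 1 GB file, taking 1 GB = 1024 MB
nginx_s = 73                        # 1 min 13 s with nginx 1.6.0
tengine_s = 41                      # 41 s with Tengine 1.5.2

nginx_rate = size_mb / nginx_s      # ~14 MB/s through nginx
tengine_rate = size_mb / tengine_s  # ~25 MB/s through Tengine
speedup = nginx_s / tengine_s       # Tengine finished ~1.8x faster
```

    In other words, disabling the buffering cut the transfer time for
    this file by more than 40%.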


    I was blown away by the difference disabling the buffering made.
    Tengine really was a drop-in replacement for nginx, much like
    MariaDB 5.5 is for MySQL.

    This blog now runs Tengine. Perhaps there is a bright future ahead
    for Taobao’s team?

    It might just start making waves outside of China.
    Let’s wait and see.

  • June 9, 2014
    Locally repairable codes and implied parity

    When a Ceph OSD is lost in an erasure coded pool, it can be recovered using the others. For instance if OSD X3 was lost, OSDs X1, X2, X4 to X10 and P1 to P4 are retrieved by the primary … Continue reading
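
    As a toy illustration of that recovery idea (a single XOR parity
    chunk – far simpler than the Reed-Solomon coding Ceph’s jerasure
    plugin actually uses – with made-up chunk contents):

```python
# Toy parity-based recovery: with one XOR parity chunk, any single lost
# data chunk can be rebuilt from the surviving chunks plus the parity.
from functools import reduce

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

chunks = [b"AAAA", b"BBBB", b"CCCC"]  # data chunks spread over OSDs
parity = reduce(xor_bytes, chunks)    # parity chunk on another OSD

lost = chunks[1]                      # pretend the OSD holding chunk 1 died
survivors = [chunks[0], chunks[2], parity]
recovered = reduce(xor_bytes, survivors)

assert recovered == lost              # the lost chunk is reconstructed
```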

  • May 29, 2014
    Back from the Juno summit Ceph integration into OpenStack

    Six months have passed since Hong Kong, and it is always really exciting to see all the folks from the community gathered together in a (bit chilly) convention center. As far as I saw from the submitted and accepted talks, Ceph continues its road to the top. There is still a huge and growing interest in Ceph. On Tuesday, May 13th, Josh… Read more →

  • May 27, 2014
    Ceph erasure code jerasure plugin benchmarks

    On an Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz processor (and all SIMD-capable Intel processors), the Reed-Solomon Vandermonde technique of the jerasure plugin, which is the default in Ceph Firefly, performs better. The chart is for decoding erasure … Continue reading

  • May 8, 2014
    Create a partition and make it an OSD

    Note: it is similar to Creating a Ceph OSD from a designated disk partition but simpler. In a nutshell, to use the remaining space from /dev/sda and assuming Ceph is already configured in /etc/ceph/ceph.conf it is enough to: $ sgdisk … Continue reading

  • May 5, 2014
    Recovering from a cinder RBD host failure

    OpenStack Havana Cinder volumes associated with an RBD Ceph pool are bound to a host. The output of cinder service-list --host shows the cinder-volume binary for that host in zone ovh with status enabled … Continue reading

  • May 4, 2014
    Inktank & Redhat to open source Calamari, the Ceph web interface

    Sage Weil, creator of Ceph, announced that Calamari – the Ceph web interface – will be open sourced, following Redhat’s acquisition of Inktank.

  • May 3, 2014
    Non profit OpenStack & Ceph cluster distributed over five datacenters

    A few non profit organizations (April, FSF France, …) and volunteers constantly research how to get compute, storage and bandwidth that are: 100% Free Software, content neutral, low maintenance, reliable and cheap. The latest setup, in use since October 2013, is … Continue reading

  • May 3, 2014
    Inktank and Redhat to open source Calamari, the Ceph web interface

    You might have heard this already, but Redhat made an announcement
    last week that they will be acquiring Inktank, the company behind
    Ceph.

    Inktank steered Ceph’s development, offered training and provided
    support through an enterprise package which included Calamari: a web
    interface offering insight into what is going on inside your cluster.
    You can have a peek at what Calamari looks like here – in a session
    from Portland’s OpenStack Summit.

    Now, what’s interesting is that prior to this announcement, Calamari
    was closed source and was only available as part of Inktank’s
    enterprise package. The fact that Calamari was closed source resulted
    in several open source alternatives spawning left and right – some of
    them pretty good looking, too.

    I’m glad that Calamari will be open sourced, hopefully this means the
    community will focus their efforts on one initiative. Having personally
    worked on both python-cephclient and kraken, the programmer in
    me is also curious as to how Inktank developed it. I can’t wait.

  • April 22, 2014
    Sharing hard drives with Ceph

    A group of users give hard drives to the system administrator of the Ceph cluster. In exchange, each of them gets credentials to access a dedicated pool of a given size on the Ceph cluster. The system administrator runs: # … Continue reading

  • April 8, 2014
    Erasure Coding in Ceph

    Erasure Coding: all you have to know. If there is data, there will be failures, and there will also be administrators like us to recover that data; Erasure Coding is our shield. Storage systems have technologies for data protection and reco…