Planet Ceph

Aggregated news from external sources

  • August 10, 2014
    A new Ceph mirror on the east coast

    I’m glad to announce that Ceph is now part of the mirrors iWeb provides.

    It is reachable over both IPv4 and IPv6.

    The mirror provides 4 Gbps of connectivity and is located on the east coast of Canada, more precisely in Montreal, Quebec.

    We feel this complements the principal Ceph mirror at ceph.com on the west coast and the European mirror at eu.ceph.com very well.

    Feel free to give it a try and let me know if you see any problems!

  • August 7, 2014
    Low-Cost Scale-Out NAS for the Office

    The aim is to show that it is possible to build low-cost, efficient, and scalable storage using open-source solutions.
    In the example below, I use Ceph for scalability and reliability, combined with EnhanceIO to ensure very good performance.

    Architecture

    The idea was to create a storage system in two parts: the storage itself (the backing store) and a large cache that keeps performance good for the data actually in use.
    The total volume needed may be large, but in an office context the data really used every day is only a fraction of that storage.
    In my case, I intend to use Ceph deployed on low-cost hardware to provide a comfortable, scalable, and reliable volume, based on small servers with large SATA drives. Data access is via Samba shares on a slightly more powerful machine with an SAS array used as a big cache. (Since Firefly, it would be more interesting to use Ceph cache tiering instead.)
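
    The cache-tiering alternative mentioned above can be sketched as follows, assuming a backing pool named datashare (used later for the RBD images) and a new SSD-backed pool named cachepool (a hypothetical name):

```shell
# Attach an SSD-backed pool as a writeback cache tier in front of the
# backing pool (Firefly-era syntax; pool name "cachepool" is hypothetical).
ceph osd pool create cachepool 128
ceph osd tier add datashare cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay datashare cachepool
```

    In practice the cache tier also needs sizing parameters (for example target_max_bytes) before it flushes and evicts sensibly.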

    Hardware

    Ceph Part (Backing Store)

    The hardware you choose should match both the performance and the cost requirements. To keep prices extremely competitive, the machines chosen here are based on Supermicro hardware without an additional disk controller. Initially, the Ceph storage is composed of five machines: store-b1-01 through store-b1-05. Each node is built with a simple Core 2 CPU, 4 GB of memory, one SSD (Intel 520, 60 GB) for the system and the Ceph journals, and three 3 TB SATA drives (Seagate CS 3TB) with no dedicated drive controller (the onboard controller is used).

    Samba / EnhanceIO Part (File Server and Cache)

    The cache must be on the same server as the file server. In my case, I use a Dell server with ten 10k SAS drives in RAID 10 for the cache.

    Network

    I did not provide a dedicated network: just a gigabit switch with VLANs separating the public and private networks for the Ceph OSDs, and interface bonding for the file server.

    The system has been in everyday use for more than a year now, and it works perfectly.

    Installation

    Here are some details of the installation (from what I remember or noted):

    Ceph Cluster

    The current configuration is running on Ceph Cuttlefish (v0.61) and Debian Wheezy. (Installation made in June 2013.)

    Debian install and partitioning

    The operating system is installed on the SSD, with three other partitions for the OSD journals.
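
    A possible layout for the 60 GB SSD, sketched with parted (the partition sizes, and the assumption that an extended partition already covers the free space, are mine, not from the original post):

```shell
# Hypothetical split: ~25 GB for the system, then three ~10 GB logical
# partitions (sda5-sda7) used as OSD journals in the deployment step below.
parted -s /dev/sda mkpart logical 25GB 35GB
parted -s /dev/sda mkpart logical 35GB 45GB
parted -s /dev/sda mkpart logical 45GB 55GB
```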

    $ apt-get install vim sudo sysstat ntp smartmontools
    $ vim /etc/default/smartmontools
    $ vim /etc/smartd.conf
    

    Kernel upgrade

    Using kernel 3.12, available in backports, seems to improve the memory footprint on the OSD nodes.

    $ echo "deb http://ftp.fr.debian.org/debian wheezy-backports main" >> /etc/apt/sources.list
    $ apt-get install -t wheezy-backports linux-image-3.12-0.bpo.1-amd64
    

    Ceph Installation

    ( Based on the official documentation: http://ceph.com/docs/master/start/ )

    On each server, create a ceph user:

    $ useradd -d /home/ceph -m ceph
    $ passwd ceph
    
    $ vim /etc/hosts
    192.168.0.1       store-b1-01
    192.168.0.2       store-b1-02
    192.168.0.3       store-b1-03
    192.168.0.4       store-b1-04
    192.168.0.5       store-b1-05
    
    $ echo "ceph ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph
    $ chmod 0440 /etc/sudoers.d/ceph
    

    On store-b1-01 (deployment server)

    Create a new key for SSH authentication:

    $ ssh-keygen
    
    $ cluster="store-b1-01 store-b1-02 store-b1-03 store-b1-04 store-b1-05"
    $ for i in $cluster; do
          ssh-copy-id ceph@$i
      done
    
    $ vim /root/.ssh/config
    Host store*
            User ceph
    

    Install ceph-deploy and its dependencies

    $ wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | sudo apt-key add -
    $ echo deb http://eu.ceph.com/debian-cuttlefish/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
    $ apt-get update
    $ apt-get install python-pkg-resources python-setuptools ceph-deploy collectd
    $ curl http://python-distribute.org/distribute_setup.py | python
    $ easy_install pushy
    

    Install ceph on cluster

    $ ceph-deploy install $cluster
    $ ceph-deploy new store-b1-01 store-b1-02 store-b1-03
    $ vim ceph.conf
    $ ceph-deploy mon create store-b1-01 store-b1-02 store-b1-03
    

    Cluster deployment

    $ ceph-deploy gatherkeys store-b1-01
    $ ceph-deploy osd create \
                store-b1-01:/dev/sdb:/dev/sda5 \
                store-b1-01:/dev/sdc:/dev/sda6 \
                store-b1-01:/dev/sdd:/dev/sda7 \
                store-b1-02:/dev/sdb:/dev/sda5 \
                store-b1-02:/dev/sdc:/dev/sda6 \
                store-b1-02:/dev/sdd:/dev/sda7 \
                store-b1-03:/dev/sdb:/dev/sda5 \
                store-b1-03:/dev/sdc:/dev/sda6 \
                store-b1-03:/dev/sdd:/dev/sda7 \
                store-b1-04:/dev/sdb:/dev/sda5 \
                store-b1-04:/dev/sdc:/dev/sda6 \
                store-b1-04:/dev/sdd:/dev/sda7 \
                store-b1-05:/dev/sdb:/dev/sda5 \
                store-b1-05:/dev/sdc:/dev/sda6 \
                store-b1-05:/dev/sdd:/dev/sda7
    

    Add to fstab:

    /dev/sdb1       /var/lib/ceph/osd/ceph-0        xfs     inode64,noatime         0       0
    /dev/sdc1       /var/lib/ceph/osd/ceph-1        xfs     inode64,noatime         0       0
    /dev/sdd1       /var/lib/ceph/osd/ceph-2        xfs     inode64,noatime         0       0
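
    The three entries follow a single pattern (device letters b-d paired with OSD ids 0-2), so they can also be generated with a small loop when more drives are added later; a convenience sketch, not part of the original post:

```shell
# Build the fstab lines for the three OSD data partitions:
# /dev/sdb1 -> ceph-0, /dev/sdc1 -> ceph-1, /dev/sdd1 -> ceph-2.
fstab=$(i=0; for dev in b c d; do
    printf '/dev/sd%s1\t/var/lib/ceph/osd/ceph-%d\txfs\tinode64,noatime\t0\t0\n' "$dev" "$i"
    i=$((i+1))
done)
printf '%s\n' "$fstab"
```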
    

    One can check that all OSDs have been created and verify the cluster status:

    $ ceph osd tree
    
    # id    weight  type name   up/down reweight
    -1  41.23   root default
    -8  41.23       datacenter DC1
    -7  41.23           rack b1
    -2  8.24                host store-b1-01
    0   2.65                    osd.0   up  1   
    1   2.65                    osd.1   up  1   
    2   2.65                    osd.2   up  1   
    -3  8.24                host store-b1-02
    3   2.65                    osd.3   up  1   
    4   2.65                    osd.4   up  1   
    5   2.65                    osd.5   up  1   
    -4  8.24                host store-b1-03
    6   2.65                    osd.6   up  1   
    7   2.65                    osd.7   up  1   
    8   2.65                    osd.8   up  1   
    -5  8.24                host store-b1-04
    9   2.65                    osd.9   up  1   
    10  2.65                    osd.10  up  1   
    11  2.65                    osd.11  up  1   
    -6  8.24                host store-b1-05
    12  2.65                    osd.12  up  1   
    13  2.65                    osd.13  up  1   
    14  2.65                    osd.14  up  1   
    

    File Server

    Debian interface bonding

    $ apt-get install ifenslave
    
    $ vim /etc/network/interfaces
    auto bond0
    iface bond0 inet static
            address 192.168.0.12
            netmask 255.255.0.0
            gateway 192.168.0.1
            slaves eth0 eth1
            bond-mode 802.3ad
    
    $ echo "alias bond0 bonding
    options bonding mode=4 miimon=100 lacp_rate=1" > /etc/modprobe.d/bonding.conf
    
    $ echo "bonding" >> /etc/modules
    

    Kernel Version

    The kernel version is important when using KRBD. I recommend kernel 3.10.26 or later.

    $ apt-get install debconf-utils dpkg-dev debhelper build-essential kernel-package libncurses5-dev
    $ cd /usr/src/
    $ wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.6.11.tar.bz2
    $ tar xjf linux-3.6.11.tar.bz2
    

    RBD

    Create an RBD volume (format 1):

    $ rbd create datashare/share1 --image-format=1 --size=1048576
    
    $ mkdir /share1
    $ echo "/dev/rbd/datashare/share1   /share1 xfs _netdev,barrier=0,nodiratime        0   0" >> /etc/fstab
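
    The _netdev entry above relies on the image being mapped at boot by the rbdmap init script (also referenced in the Samba init headers later), which reads /etc/ceph/rbdmap. A sketch of the corresponding entry; the client id and keyring path are assumptions:

```
# /etc/ceph/rbdmap: one image per line, pool/image followed by map options
datashare/share1    id=admin,keyring=/etc/ceph/ceph.client.admin.keyring
```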
    

    EnhanceIO

    The choice of cache mechanism settled on EnhanceIO because it allows the cache to be enabled or disabled while the source volume is in use. This is particularly useful for resizing a volume without interrupting service.

    Build EnhanceIO:

    $ apt-get install build-essential dkms
    $ git clone https://github.com/stec-inc/EnhanceIO.git
    $ cd EnhanceIO/
    $ wget http://ftp.de.debian.org/debian/pool/main/e/enhanceio/enhanceio_0+git20130620-3.debian.tar.xz
    $ tar xJf enhanceio_0+git20130620-3.debian.tar.xz
    $ dpkg-buildpackage -rfakeroot -b
    $ dpkg -i ../*.deb
    

    Create the cache:

    For example, in write-through mode:
    (/dev/sdb2 is a local partition dedicated to the cache)

    $ eio_cli create -d /dev/rbd1 -s /dev/sdb2 -p lru -m wt -b 4096 -c share1
    

    If you want to use a write-back cache, you can protect the file system from being mounted before the cache is ready by using a symbolic link in the udev script. ( https://github.com/ksperis/EnhanceIO/commit/954e167fdb580d514747512ce2bd1c9c29a77418 )

    Samba

    Packages installation

    $ echo "deb http://ftp.sernet.de/pub/samba/3.6/debian wheezy main" >> /etc/apt/sources.list
    $ apt-get update
    $ apt-get install sernet-samba sernet-winbind xfsprogs krb5-user acl attr
    

    Update startup script

    $ vim /etc/init.d/samba
    # Should-Start:      slapd cups rbdmap
    # Should-Stop:       slapd cups rbdmap
    
    $ insserv -d samba
    

    Configure and join the domain ( https://help.ubuntu.com/community/ActiveDirectoryWinbindHowto )

    $ vi /etc/krb5.conf
    ...
    $ kinit Administrator@AD.MYDOMAIN.COM
    
    $ vim /etc/samba/smb.conf
    [global]
        workgroup = MYDOMAIN
        realm = AD.MYDOMAIN.COM
        netbios name = MYNAS
        wins server = 192.168.0.4
        server string = %h server
        dns proxy = no
        log file = /var/log/samba/log.%m
        log level = 1
        max log size = 1000
        syslog = 0
        panic action = /usr/share/samba/panic-action %d
        security = ADS
        winbind separator = +
        client use spnego = yes
        winbind use default domain = yes
        domain master = no
        local master = no
        preferred master = no
        encrypt passwords = true
        passdb backend = tdbsam
        obey pam restrictions = yes
        unix password sync = yes
        passwd program = /usr/bin/passwd %u
        passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
        pam password change = yes
        idmap uid = 10000-20000
        idmap gid = 10000-20000
        template shell = /bin/bash
        template homedir = /share4/home/%D/%U
        winbind enum groups = yes
        winbind enum users = yes
        map acl inherit = yes
        vfs objects = acl_xattr recycle shadow_copy2
        recycle:repository =.recycle/%u 
        recycle:keeptree = yes
        recycle:exclude = *.tmp
        recycle:touch = yes
        shadow:snapdir = .snapshots
        shadow:sort = desc
        ea support = yes
        map hidden = no
        map system = no
        map archive = no
        map readonly = no
        store dos attributes = yes
        load printers = no
        printing=bsd
        printcap name = /dev/null
        disable spoolss = yes
        guest account = invité
        map to guest = bad user
    
    
    [share0]
        comment = My first share
        path = /share0
        writable = yes
        valid users = @"MYDOMAIN+Domain Admins" "MYDOMAIN+laurent"
    
    [share1]
        comment = Other share
        path = /share1
        writable = yes
        valid users = @"MYDOMAIN+Domain Admins" "MYDOMAIN+laurent"
    
    ....
    
    
    $ /etc/init.d/samba restart
    $ net join -U Administrator
    $ wbinfo -u
    $ wbinfo -g
    
    $ vi /etc/nsswitch.conf
    passwd:         compat winbind
    group:          compat winbind
    

    Virtual Shadow Copy

    Using this script:
    https://github.com/ksperis/autosnap-rbd-shadow-copy

    TimeMachine Backup

    First, create an RBD image for Time Machine and mount it at /mnt/timemachine with noatime and, if you want per-user quotas, uquota. (Do not add it to the autosnap script.)
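
    Those steps might look like this; the image name, size, and mapped device number are assumptions, not from the original post:

```shell
# Hypothetical dedicated Time Machine image (names and size are examples)
rbd create datashare/timemachine --size 2097152     # 2 TB
rbd map datashare/timemachine
mkfs.xfs /dev/rbd2                                  # device number may differ
mkdir -p /mnt/timemachine
mount -o noatime,uquota /dev/rbd2 /mnt/timemachine  # uquota: per-user quotas
```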

    Install avahi

    $ apt-get install avahi-daemon avahi-utils
    
    $ vim /etc/avahi/services/smb.service
    <?xml version="1.0" standalone='no'?>
    <!DOCTYPE service-group SYSTEM "avahi-service.dtd">
    <service-group>
      <name replace-wildcards="yes">%h (File Server)</name>
      <service>
        <type>_smb._tcp</type>
        <port>445</port>
      </service>
      <service>
        <type>_device-info._tcp</type>
        <port>0</port>
        <txt-record>model=RackMac</txt-record>
      </service>
    </service-group>
    
    $ vim /etc/avahi/services/timemachine.service
    <?xml version="1.0" standalone="no"?>
    <!DOCTYPE service-group SYSTEM "avahi-service.dtd">
    <service-group>
      <name replace-wildcards="yes">%h (Time Machine)</name>
      <service>
        <type>_afpovertcp._tcp</type>
        <port>548</port>
      </service>
      <service>
        <type>_device-info._tcp</type>
        <port>0</port>
        <txt-record>model=TimeCapsule</txt-record>
      </service>
      <service>
        <type>_adisk._tcp</type>
        <port>9</port>
        <txt-record>sys=waMA=00:1d:09:63:87:e0,adVF=0x100</txt-record>
        <txt-record>dk0=adVF=0x83,adVN=TimeMachine</txt-record>
      </service>
    </service-group>
    

    Install netatalk and configure it like this:

    $ apt-get install netatalk
    $ vim /etc/netatalk/afpd.conf
    - -tcp -noddp -nozeroconf -uamlist uams_dhx.so,uams_dhx2.so -nosavepassword -setuplog "default log_info /var/log/afpd.log"
    
    
    $ vim /etc/netatalk/AppleVolumes.default   (remove the home directory line and add:)
    /mnt/timemachine/ "TimeMachine" cnidscheme:dbd options:usedots,upriv,tm allow:@"MYDOMAIN+Domain Users"
    
    $ vim /etc/pam.d/netatalk
    auth       required     pam_winbind.so
    account    required     pam_winbind.so
    session    required     pam_unix.so
    

    Set a quota for a user (do not use soft limits, because they will not be recognized by Time Machine):

    $ xfs_quota -x -c 'limit bhard=1024g user1' /mnt/timemachine
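
    To check the limits afterwards, the quota report can be dumped; a hedged example, run on the file server:

```shell
# Show per-user block usage and hard limits on the Time Machine volume
xfs_quota -x -c 'report -h' /mnt/timemachine
```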
    

    Benchmark

    (For sequential I/O, bandwidth is limited by the network card on the client side.)

  • August 5, 2014
    Ceph Submissions Abound for OpenStack Paris

    Twitter || Facebook || Google+ || Lists/IRC Voting for submissions is well underway for the next OpenStack summit, and this one is shaping up to be another great place to talk about Ceph. Almost fifty talks are currently available for voting on the OpenStack site! Ceph has been steadily gaining popularity in the OpenStack world, …Read more

  • August 2, 2014
    OpenStack Summit Paris 2014 – CFS

    The Call for Speakers period for the OpenStack Summit, November 3-7, 2014 in Paris, ended this week. Voting for the submitted talks has now started and ends at 11:59 pm CDT on August 6 (6:59 am CEST on August 7). I’ve submitted a talk to the stor…

  • August 2, 2014
    Temporarily disable Ceph scrubbing to resolve high IO load

    In a Ceph cluster with low bandwidth, the root disk of an OpenStack instance became extremely slow for days. When an OSD is scrubbing a placement group, it has a significant impact on performance, and this is expected, for a … Continue reading

  • July 31, 2014
    Brace yourself, DevStack Ceph is here!

    For more than a year, Ceph has become increasingly popular and has seen several deployments inside and outside OpenStack. For those of you who do not know, Ceph is a unified, distributed and massively scalable open-source storage technology that provides several ways to access and consume your data, such as object, block and filesystem. The community and Ceph itself has greatly… Read more →

  • July 30, 2014
    v0.83 released

    Another Ceph development release! This has been a longer cycle, so there has been quite a bit of bug fixing and stabilization in this round. There is also a bunch of packaging fixes for RPM distros (RHEL/CentOS, Fedora, and SUSE) and for systemd. We’ve also added a new librados-striper library from Sebastien Ponce that provides …Read more

  • July 29, 2014
    v0.80.5 Firefly released

    This release fixes a few important bugs in the radosgw and fixes several packaging and environment issues, including OSD log rotation, systemd environments, and daemon restarts on upgrade. We recommend that all v0.80.x Firefly users upgrade, particularly if they are using upstart, systemd, or radosgw. NOTABLE CHANGES ceph-dencoder: do not needlessly link to librgw, librados, …Read more

  • July 29, 2014
    Lots Going on with Ceph

    While we knew that after the acquisition of Inktank life would accelerate again, it seems like the Ceph community is quickly approaching ludicrous speed, and it shows no sign of slowing down. We have had some amazing participation in the various endeavors, but it would be completely understandable if you had missed something amidst the …Read more

  • July 21, 2014
    Ceph Turns 10 Twitter Photo Contest

    OSCON has arrived (although if you came in for the Ceph tutorial session that’s old news to you)! As a part of our participation in OSCON, and as a way to celebrate the fact that Ceph turned 10 years old this year, we have decided to have our party be a distributed one. We would …Read more

  • July 17, 2014
    Celebrate 10 Years of Ceph at OSCON!

    Ceph is coming back to OSCON next week (July 20-24 in Portland, OR). The difference however, is that this year we need two digits to tell people how old we are. Stop by for some mild festivities at the Ceph booth (P2) as we share cupcakes, and t-shirts that salute the hard work of all …Read more

  • July 16, 2014
    Inktank Ceph Enterprise 1.2 arrives with erasure coding and cache-tiering

    Today, we’re proud to announce the availability of Red Hat’s Inktank Ceph Enterprise 1.2, a solution based on the Firefly release from the Ceph community. This latest version brings some major new features to enterprises – ones we believe will both further cement Ceph’s position as a leading storage solution for OpenStack infrastructure while promoting […]
