Planet Ceph

Aggregated news from external sources

  • September 6, 2013
    OpenStack Havana flush token manually

    It has always been a huge pain to manage tokens in MySQL, especially with PKI tokens, since they are larger than UUID tokens.
    Almost a year ago I wrote an article about purging tokens with a script.
    Now we finally have an easy option to purge all expired tokens.
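
    In Havana this is exposed directly through keystone-manage. A minimal sketch of how to use it, assuming the new subcommand is called token_flush, with a daily /etc/crontab entry given as an example:

    $ keystone-manage token_flush
    $ echo "0 1 * * * keystone keystone-manage token_flush" >> /etc/crontab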

  • September 3, 2013
    Who You Callin’ Wimpy? Storage on ARM-based Servers Is Brawny!

    A Guest Blogpost from John Mao, Product Marketing, Calxeda. Wimpy a good thing? For CPU cores? Maybe so… Now, I’m not exactly a fan of the term wimpy—the voice in this 80s trash bag commercial tends to haunt me—but in his research note last Wednesday called “Are Wimpy Cores Good for Brawny Storage?”, Paul Teich of Moor Insights & […]

  • September 2, 2013
    First glimpse at CoreOS

    CoreOS is an emerging project that aims to address one of the most pressing questions in the server world.
    We at eNovance therefore released eDeploy: a tool that performs bare-metal deployment and manages upgrades with ease.
    Deploying and up…

  • August 29, 2013
    Mon Failed to Start

    Here are some common problems when adding a monitor to an existing cluster, for example when the config section is not found:

     $ service ceph start mon.ceph-03
     /etc/init.d/ceph: mon.ceph-03 not found (/etc/ceph/ceph.conf defines osd.2 , /var/lib/ceph defines osd.2)
    
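    One way to fix this is simply to declare the monitor in ceph.conf. A minimal sketch of such a section; the address is the one that appears in the monmap further down, and host/mon addr are the usual option names for a sysvinit setup, so adapt it to your cluster:

    [mon.ceph-03]
        host = ceph-03
        mon addr = 10.2.4.12:6789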

    If you do not want to specify a mon.ceph-03 section in ceph.conf, you need to have a sysvinit file in /var/lib/ceph/mon/ceph-ceph-03/:

    $   ls -l /var/lib/ceph/mon/ceph-ceph-03/
    total 8
    -rw-r--r-- 1 root root   77 août  29 16:56 keyring
    drwxr-xr-x 2 root root 4096 août  29 17:03 store.db
    

    Just create the file, then it should start:

    $ touch /var/lib/ceph/mon/ceph-ceph-03/sysvinit
    $ service ceph start mon.ceph-03
    === mon.ceph-03 === 
    Starting Ceph mon.ceph-03 on ceph-03...
    failed: 'ulimit -n 32768;  /usr/bin/ceph-mon -i ceph-03 --pid-file /var/run/ceph/mon.ceph-03.pid -c /etc/ceph/ceph.conf '
    Starting ceph-create-keys on ceph-03...
    

    Next error when starting the monitor; if you have a look at the log you can see:

    $ tail -f ceph-mon.ceph-03.log
    mon.ceph-03 does not exist in monmap, will attempt to join an existing cluster
    no public_addr or public_network specified, and mon.ceph-03 not present in monmap or ceph.conf
    

    You should verify that there is no hung ceph-create-keys process; if there is, kill it:

    $ ps aux | grep create-keys
    root      1317  0.1  1.4  36616  7168 pts/0    S    17:13   0:00 /usr/bin/python /usr/sbin/ceph-create-keys -i ceph-03
    
    $ kill 1317
    

    Verify that this mon is defined in the current monmap:

    $  ceph mon dump
    dumped monmap epoch 6
    epoch 6
    fsid e0506c4d-e86a-40a8-8306-4856f9ccb989
    last_changed 2013-08-29 16:58:06.145127
    created 0.000000
    0: 10.2.4.10:6789/0 mon.ceph-01
    1: 10.2.4.11:6789/0 mon.ceph-02
    2: 10.2.4.12:6789/0 mon.ceph-03
    

    You need to retrieve the current monmap and inject it on this node:

    $ ceph mon getmap -o /tmp/monmap
    2013-08-29 17:36:36.204257 7f641a54d700  0 -- :/1005682 >> 10.2.4.12:6789/0 pipe(0x2283400 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x2283660).fault
    got latest monmap
    
    $ ceph-mon -i ceph-03 --inject-monmap /tmp/monmap
    

    Try again:

    $ service ceph start mon.ceph-03
    === mon.ceph-03 === 
    Starting Ceph mon.ceph-03 on ceph-03...
    Starting ceph-create-keys on ceph-03...
    

    It seems to be working fine. You can verify the state of the monitor quorum:

    $ ceph mon stat
    e6: 3 mons at {ceph-01=10.2.4.10:6789/0,ceph-02=10.2.4.11:6789/0,ceph-03=10.2.4.12:6789/0}, election epoch 1466, quorum 0,1,2 ceph-01,ceph-02,ceph-03
    

    For more information, have a look at the documentation:
    http://ceph.com/docs/master/rados/operations/add-or-rm-mons/

  • August 28, 2013
    RBD Image Real Size

    To get the real size used by an RBD image:

    rbd diff $POOL/$IMAGE | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'

    For example:

    $ rbd info myrbd
    rbd image 'myrbd':
    size 2048 MB in 512 objects
    order 22 (4096 KB objects)
    block_na…

  • August 27, 2013
    Deep Scrub Distribution

    To verify the integrity of data, Ceph uses a mechanism called deep scrubbing, which reads all of your data once per week for each placement group.
    This can cause an overload when all OSDs run deep scrubs at the same time.

    You can easily see i…
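
    A rough way to check how the deep scrubs are spread out is to count placement groups by the date of their last deep scrub. A sketch, assuming the last_deep_scrub_stamp date sits in column 20 of 'ceph pg dump' (the column index varies between Ceph versions, so check yours first):

    $ ceph pg dump | grep active | awk '{ print $20 }' | sort | uniq -c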

  • August 22, 2013
    Configure Ceph RBD caching on OpenStack Nova

    By default, OpenStack doesn’t use any caching. However, you might want to enable RBD caching.

    As you may recall, the current implementation of RBD caching is an in-memory caching solution.
    Although, at the last Ceph Developer Summit (l…
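
    For reference, enabling the cache typically comes down to a client-side option in ceph.conf on the compute nodes. A minimal sketch, assuming the usual option names (the exact knobs and the Nova side of the configuration depend on your versions):

    [client]
        rbd cache = true
        rbd cache size = 33554432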

  • August 20, 2013
    Ceph OSD: Where Is My Data?

    The purpose is to verify where my data is stored on the Ceph cluster.

    For this, I have just created a minimal cluster with 3 OSDs:

    $ ceph-deploy osd create ceph-01:/dev/sdb ceph-02:/dev/sdb ceph-03:/dev/sdb

    Where is my OSD directory on ceph-01?

    $ mount | grep ceph
    /dev/sdb1 on /var/lib/ceph/osd/ceph-0 type xfs (rw,noatime,attr2,delaylog,noquota)

    The directory contents:

    $ cd /var/lib/ceph/osd/ceph-0; ls -l
    total 52
    -rw-r--r--   1 root root  487 août  20 12:12 activate.monmap
    -rw-r--r--   1 root root    3 août  20 12:12 active
    -rw-r--r--   1 root root   37 août  20 12:12 ceph_fsid
    drwxr-xr-x 133 root root 8192 août  20 12:18 current
    -rw-r--r--   1 root root   37 août  20 12:12 fsid
    lrwxrwxrwx   1 root root   58 août  20 12:12 journal -> /dev/disk/by-partuuid/37180b7e-fe5d-4b53-8693-12a8c1f52ec9
    -rw-r--r--   1 root root   37 août  20 12:12 journal_uuid
    -rw-------   1 root root   56 août  20 12:12 keyring
    -rw-r--r--   1 root root   21 août  20 12:12 magic
    -rw-r--r--   1 root root    6 août  20 12:12 ready
    -rw-r--r--   1 root root    4 août  20 12:12 store_version
    -rw-r--r--   1 root root    0 août  20 12:12 sysvinit
    -rw-r--r--   1 root root    2 août  20 12:12 whoami
    
    $ du -hs *
    4,0K  activate.monmap → The current monmap
    4,0K  active      → "ok"
    4,0K  ceph_fsid   → cluster fsid (same as returned by 'ceph fsid')
    2,1M  current
    4,0K  fsid        → id for this osd
    0 journal         → symlink to journal partition
    4,0K  journal_uuid
    4,0K  keyring     → the key
    4,0K  magic       → "ceph osd volume v026"
    4,0K  ready       → "ready"
    4,0K  store_version   
    0 sysvinit
    4,0K  whoami      → id of the osd

    The data is stored in the “current” directory.
    It contains a few files and many _head directories:

    $ cd current; ls -l | grep -v head
    total 20
    -rw-r--r-- 1 root root     5 août  20 12:18 commit_op_seq
    drwxr-xr-x 2 root root 12288 août  20 12:18 meta
    -rw-r--r-- 1 root root     0 août  20 12:12 nosnap
    drwxr-xr-x 2 root root   111 août  20 12:12 omap

    In the omap directory:

    $ cd omap; ls -l
    -rw-r--r-- 1 root root     150 août  20 12:12 000007.sst
    -rw-r--r-- 1 root root 2031616 août  20 12:18 000010.log 
    -rw-r--r-- 1 root root      16 août  20 12:12 CURRENT
    -rw-r--r-- 1 root root       0 août  20 12:12 LOCK
    -rw-r--r-- 1 root root     172 août  20 12:12 LOG
    -rw-r--r-- 1 root root     309 août  20 12:12 LOG.old
    -rw-r--r-- 1 root root   65536 août  20 12:12 MANIFEST-000009

    In the meta directory:

    $ cd ../meta; ls -l
    total 940
    -rw-r--r-- 1 root root  710 août  20 12:14 inc\uosdmap.10__0_F4E9C003__none
    -rw-r--r-- 1 root root  958 août  20 12:12 inc\uosdmap.1__0_B65F4306__none
    -rw-r--r-- 1 root root  722 août  20 12:14 inc\uosdmap.11__0_F4E9C1D3__none
    -rw-r--r-- 1 root root  152 août  20 12:14 inc\uosdmap.12__0_F4E9C163__none
    -rw-r--r-- 1 root root  153 août  20 12:12 inc\uosdmap.2__0_B65F40D6__none
    -rw-r--r-- 1 root root  574 août  20 12:12 inc\uosdmap.3__0_B65F4066__none
    -rw-r--r-- 1 root root  153 août  20 12:12 inc\uosdmap.4__0_B65F4136__none
    -rw-r--r-- 1 root root  722 août  20 12:12 inc\uosdmap.5__0_B65F46C6__none
    -rw-r--r-- 1 root root  136 août  20 12:14 inc\uosdmap.6__0_B65F4796__none
    -rw-r--r-- 1 root root  642 août  20 12:14 inc\uosdmap.7__0_B65F4726__none
    -rw-r--r-- 1 root root  153 août  20 12:14 inc\uosdmap.8__0_B65F44F6__none
    -rw-r--r-- 1 root root  722 août  20 12:14 inc\uosdmap.9__0_B65F4586__none
    -rw-r--r-- 1 root root    0 août  20 12:12 infos__head_16EF7597__none
    -rw-r--r-- 1 root root 2870 août  20 12:14 osdmap.10__0_6417091C__none
    -rw-r--r-- 1 root root  830 août  20 12:12 osdmap.1__0_FD6E49B1__none
    -rw-r--r-- 1 root root 2870 août  20 12:14 osdmap.11__0_64170EAC__none
    -rw-r--r-- 1 root root 2870 août  20 12:14 osdmap.12__0_64170E7C__none   → current osdmap
    -rw-r--r-- 1 root root 1442 août  20 12:12 osdmap.2__0_FD6E4941__none
    -rw-r--r-- 1 root root 1510 août  20 12:12 osdmap.3__0_FD6E4E11__none
    -rw-r--r-- 1 root root 2122 août  20 12:12 osdmap.4__0_FD6E4FA1__none
    -rw-r--r-- 1 root root 2122 août  20 12:12 osdmap.5__0_FD6E4F71__none
    -rw-r--r-- 1 root root 2122 août  20 12:14 osdmap.6__0_FD6E4C01__none
    -rw-r--r-- 1 root root 2190 août  20 12:14 osdmap.7__0_FD6E4DD1__none
    -rw-r--r-- 1 root root 2802 août  20 12:14 osdmap.8__0_FD6E4D61__none
    -rw-r--r-- 1 root root 2802 août  20 12:14 osdmap.9__0_FD6E4231__none
    -rw-r--r-- 1 root root  354 août  20 12:14 osd\usuperblock__0_23C2FCDE__none
    -rw-r--r-- 1 root root    0 août  20 12:12 pglog\u0.0__0_103B076E__none     → Log for each pg
    -rw-r--r-- 1 root root    0 août  20 12:12 pglog\u0.1__0_103B043E__none
    -rw-r--r-- 1 root root    0 août  20 12:12 pglog\u0.11__0_5172C9DB__none
    -rw-r--r-- 1 root root    0 août  20 12:12 pglog\u0.13__0_5172CE3B__none
    -rw-r--r-- 1 root root    0 août  20 12:13 pglog\u0.15__0_5172CC9B__none
    -rw-r--r-- 1 root root    0 août  20 12:13 pglog\u0.16__0_5172CC2B__none
    ............
    -rw-r--r-- 1 root root    0 août  20 12:12 snapmapper__0_A468EC03__none

    Try decompiling the CRUSH map from the osdmap:

    $ ceph osd stat
    e12: 3 osds: 3 up, 3 in
    
    $ osdmaptool osdmap.12__0_64170E7C__none --export-crush /tmp/crushmap.bin
    osdmaptool: osdmap file 'osdmap.12__0_64170E7C__none'
    osdmaptool: exported crush map to /tmp/crushmap.bin
    
    $ crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
    
    $ cat /tmp/crushmap.txt
    # begin crush map
    
    # devices
    device 0 osd.0
    device 1 osd.1
    device 2 osd.2
    
    # types
    type 0 osd
    type 1 host
    type 2 rack
    type 3 row
    type 4 room
    type 5 datacenter
    type 6 root
    
    # buckets
    host ceph-01 {
      id -2       # do not change unnecessarily
      # weight 0.050
      alg straw
      hash 0  # rjenkins1
      item osd.0 weight 0.050
    }
    host ceph-02 {
      id -3       # do not change unnecessarily
      # weight 0.050
      alg straw
      hash 0  # rjenkins1
      item osd.1 weight 0.050
    }
    host ceph-03 {
      id -4       # do not change unnecessarily
      # weight 0.050
      alg straw
      hash 0  # rjenkins1
      item osd.2 weight 0.050
    }
    root default {
      id -1       # do not change unnecessarily
      # weight 0.150
      alg straw
      hash 0  # rjenkins1
      item ceph-01 weight 0.050
      item ceph-02 weight 0.050
      item ceph-03 weight 0.050
    }
    
    ...
    
    # end crush map

    OK, it’s what I expect. 🙂

    The cluster is empty:

    $ find *_head -type f | wc -l
    0

    The directory list corresponds to the output of 'ceph pg dump':

    $ for dir in `ceph pg dump | grep '\[0,' | cut -f1`; do if [ -d ${dir}_head ]; then echo exist; else echo nok; fi; done | sort | uniq -c
    dumped all in format plain
         69 exist

    To get all stats for a specific pg:

    $ ceph pg 0.1 query
    { "state": "active+clean",
      "epoch": 12,
      "up": [
            0,
            1],
      "acting": [
            0,
            1],
      "info": { "pgid": "0.1",
          "last_update": "0'0",
          "last_complete": "0'0",
          "log_tail": "0'0",
          "last_backfill": "MAX",
          "purged_snaps": "[]",
          "history": { "epoch_created": 1,
              "last_epoch_started": 12,
              "last_epoch_clean": 12,
              "last_epoch_split": 0,
              "same_up_since": 9,
              "same_interval_since": 9,
              "same_primary_since": 5,
              "last_scrub": "0'0",
              "last_scrub_stamp": "2013-08-20 12:12:37.851559",
              "last_deep_scrub": "0'0",
              "last_deep_scrub_stamp": "2013-08-20 12:12:37.851559",
              "last_clean_scrub_stamp": "0.000000"},
          "stats": { "version": "0'0",
              "reported_seq": "12",
              "reported_epoch": "12",
              "state": "active+clean",
              "last_fresh": "2013-08-20 12:16:22.709534",
              "last_change": "2013-08-20 12:16:22.105099",
              "last_active": "2013-08-20 12:16:22.709534",
              "last_clean": "2013-08-20 12:16:22.709534",
              "last_became_active": "0.000000",
              "last_unstale": "2013-08-20 12:16:22.709534",
              "mapping_epoch": 5,
              "log_start": "0'0",
              "ondisk_log_start": "0'0",
              "created": 1,
              "last_epoch_clean": 12,
              "parent": "0.0",
              "parent_split_bits": 0,
              "last_scrub": "0'0",
              "last_scrub_stamp": "2013-08-20 12:12:37.851559",
              "last_deep_scrub": "0'0",
              "last_deep_scrub_stamp": "2013-08-20 12:12:37.851559",
              "last_clean_scrub_stamp": "0.000000",
              "log_size": 0,
              "ondisk_log_size": 0,
              "stats_invalid": "0",
              "stat_sum": { "num_bytes": 0,
                  "num_objects": 0,
                  "num_object_clones": 0,
                  "num_object_copies": 0,
                  "num_objects_missing_on_primary": 0,
                  "num_objects_degraded": 0,
                  "num_objects_unfound": 0,
                  "num_read": 0,
                  "num_read_kb": 0,
                  "num_write": 0,
                  "num_write_kb": 0,
                  "num_scrub_errors": 0,
                  "num_shallow_scrub_errors": 0,
                  "num_deep_scrub_errors": 0,
                  "num_objects_recovered": 0,
                  "num_bytes_recovered": 0,
                  "num_keys_recovered": 0},
              "stat_cat_sum": {},
              "up": [
                    0,
                    1],
              "acting": [
                    0,
                    1]},
          "empty": 1,
          "dne": 0,
          "incomplete": 0,
          "last_epoch_started": 12},
      "recovery_state": [
            { "name": "Started\/Primary\/Active",
              "enter_time": "2013-08-20 12:15:30.102250",
              "might_have_unfound": [],
              "recovery_progress": { "backfill_target": -1,
                  "waiting_on_backfill": 0,
                  "backfill_pos": "0\/\/0\/\/-1",
                  "backfill_info": { "begin": "0\/\/0\/\/-1",
                      "end": "0\/\/0\/\/-1",
                      "objects": []},
                  "peer_backfill_info": { "begin": "0\/\/0\/\/-1",
                      "end": "0\/\/0\/\/-1",
                      "objects": []},
                  "backfills_in_flight": [],
                  "pull_from_peer": [],
                  "pushing": []},
              "scrub": { "scrubber.epoch_start": "0",
                  "scrubber.active": 0,
                  "scrubber.block_writes": 0,
                  "scrubber.finalizing": 0,
                  "scrubber.waiting_on": 0,
                  "scrubber.waiting_on_whom": []}},
            { "name": "Started",
              "enter_time": "2013-08-20 12:14:51.501628"}]}

    Retrieve an object on the cluster

    In this test we create a standard pool (pg_num=8, replication size 2).
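
    As a side note, 'rados mkpool' just picks up the defaults; if you prefer to set or check those parameters explicitly, something like the following should work (a sketch, using Dumpling-era commands):

    $ ceph osd pool create testpool 8
    $ ceph osd pool set testpool size 2
    $ ceph osd dump | grep testpool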

    $ rados mkpool testpool
    $ wget -q http://ceph.com/docs/master/_static/logo.png
    $ md5sum logo.png
    4c7c15e856737efc0d2d71abde3c6b28  logo.png
    
    $ rados put -p testpool logo.png logo.png
    $ ceph osd map testpool logo.png
    osdmap e14 pool 'testpool' (3) object 'logo.png' -> pg 3.9e17671a (3.2) -> up [2,1] acting [2,1]

    My Ceph logo is in pg 3.2 (primary on osd.2, replica on osd.1).

    $ ceph osd tree
    # id  weight  type name   up/down reweight
    -1    0.15    root default
    -2    0.04999     host ceph-01
    0 0.04999         osd.0   up  1   
    -3    0.04999     host ceph-02
    1 0.04999         osd.1   up  1   
    -4    0.04999     host ceph-03
    2 0.04999         osd.2   up  1

    And osd.2 is on ceph-03:

    $ cd /var/lib/ceph/osd/ceph-2/current/3.2_head/
    $ ls
    logo.png__head_9E17671A__3
    $ md5sum logo.png__head_9E17671A__3
    4c7c15e856737efc0d2d71abde3c6b28  logo.png__head_9E17671A__3
    

    It’s exactly the same 🙂

    Import RBD

    Same thing, but testing as a block device.

    $ rbd import logo.png testpool/logo.png 
    Importing image: 100% complete...done.
    $ rbd info testpool/logo.png
    rbd image 'logo.png':
      size 3898 bytes in 1 objects
      order 22 (4096 KB objects)
      block_name_prefix: rb.0.1048.2ae8944a
      format: 1

    Only one object.

    $ rados ls -p testpool
    logo.png
    rb.0.1048.2ae8944a.000000000000
    rbd_directory
    logo.png.rbd
    $ ceph osd map testpool logo.png.rbd
    osdmap e14 pool 'testpool' (3) object 'logo.png.rbd' -> pg 3.d592352c (3.4) -> up [0,2] acting [0,2]

    Let’s go.

    $ cd /var/lib/ceph/osd/ceph-0/current/3.4_head/
    $ cat logo.png.rbd__head_D592352C__3
    <<< Rados Block Device Image >>>
    rb.0.1048.2ae8944aRBD001.005:

    Here we can retrieve the block name prefix of the RBD, 'rb.0.1048.2ae8944a':

    $ ceph osd map testpool rb.0.1048.2ae8944a.000000000000
    osdmap e14 pool 'testpool' (3) object 'rb.0.1048.2ae8944a.000000000000' -> pg 3.d512078b (3.3) -> up [2,1] acting [2,1]

    On ceph-03:

    $ cd /var/lib/ceph/osd/ceph-2/current/3.3_head
    $ md5sum rb.0.1048.2ae8944a.000000000000__head_D512078B__3
    4c7c15e856737efc0d2d71abde3c6b28  rb.0.1048.2ae8944a.000000000000__head_D512078B__3

    We retrieve the file unchanged because it is not split across objects 🙂

    Try RBD snapshot

    $ rbd snap create testpool/logo.png@snap1
    $ rbd snap ls testpool/logo.png
    SNAPID NAME        SIZE 
         2 snap1 3898 bytes
    $ echo "testpool/logo.png" >> /etc/ceph/rbdmap
    $ service rbdmap reload
    [ ok ] Starting RBD Mapping: testpool/logo.png.
    [ ok ] Mounting all filesystems...done.
    
    $ dd if=/dev/zero of=/dev/rbd/testpool/logo.png 
    dd: écriture vers « /dev/rbd/testpool/logo.png »: Aucun espace disponible sur le périphérique
    8+0 enregistrements lus
    7+0 enregistrements écrits
    3584 octets (3,6 kB) copiés, 0,285823 s, 12,5 kB/s
    
    $ ceph osd map testpool rb.0.1048.2ae8944a.000000000000
    osdmap e15 pool 'testpool' (3) object 'rb.0.1048.2ae8944a.000000000000' -> pg 3.d512078b (3.3) -> up [2,1] acting [2,1]

    It’s the same place on ceph-03 :

    $ cd /var/lib/ceph/osd/ceph-2/current/3.3_head
    $ md5sum *
    4c7c15e856737efc0d2d71abde3c6b28  rb.0.1048.2ae8944a.000000000000__2_D512078B__3
    dd99129a16764a6727d3314b501e9c23  rb.0.1048.2ae8944a.000000000000__head_D512078B__3

    We can notice that the file with "2" in its name (snap id 2) contains the original data,
    and a new file has been created for the current data: head.

    For the next tests, I will try striped files, RBD format 2, and snapshots on pools.

  • August 18, 2013
    Ceph Dumpling

    The Ceph community just finished its latest three-month cycle of development, culminating in a new major release of Ceph called “Dumpling,” or v0.67 for those of a more serious demeanor. Inktank is proud to have contributed two major pieces of functionality to Dumpling. 1. Global Namespace and Region Support – Many service providers and IT […]

  • August 9, 2013
    Samba Shadow_copy and Ceph RBD

    I added a script to create snapshots on RBD for use with Samba shadow_copy2.
    For more details, see https://github.com/ksperis/autosnap-rbd-shadow-copy

    How to use:

    You first need a running Ceph cluster and Samba installed.

    Verify admin access to the Ceph cluster (this should not return an error):

    $ rbd ls

    Get the script:

    $ mkdir -p /etc/ceph/scripts/
    $ cd /etc/ceph/scripts/
    $ wget https://raw.github.com/ksperis/autosnap-rbd-shadow-copy/master/autosnap.conf
    $ wget https://raw.github.com/ksperis/autosnap-rbd-shadow-copy/master/autosnap.sh
    $ chmod +x autosnap.sh

    Create a block device:

    $ rbd create myshare --size=1024
    $ echo "myshare" >> /etc/ceph/rbdmap
    $ /etc/init.d/rbdmap reload
    [ ok ] Starting RBD Mapping: rbd/myshare.
    [ ok ] Mounting all filesystems...done.

    Format the block device:

    $ mkfs.xfs /dev/rbd/rbd/myshare
    log stripe unit (4194304 bytes) is too large (maximum is 256KiB)
    log stripe unit adjusted to 32KiB
    meta-data=/dev/rbd/rbd/myshare   isize=256    agcount=9, agsize=31744 blks
             =                       sectsz=512   attr=2, projid32bit=0
    data     =                       bsize=4096   blocks=262144, imaxpct=25
             =                       sunit=1024   swidth=1024 blks
    naming   =version 2              bsize=4096   ascii-ci=0
    log      =internal log           bsize=4096   blocks=2560, version=2
             =                       sectsz=512   sunit=8 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0

    Mount the share:

    $ mkdir /myshare
    $ echo "/dev/rbd/rbd/myshare /myshare xfs defaults 0 0" >> /etc/fstab
    $ mount /myshare

    Add this section to your /etc/samba/smb.conf:

    [myshare]
        path = /myshare
        writable = yes
        vfs objects = shadow_copy2
        shadow:snapdir = .snapshots
        shadow:sort = desc

    Reload Samba:

    $ /etc/init.d/samba reload

    Create the snapshot directory and run the script:

    $ mkdir -p /myshare/.snapshots
    $ /etc/ceph/scripts/autosnap.sh
    * Create snapshot for myshare: @GMT-2013.08.09-10.16.10-autosnap
    synced, no cache, snapshot created.
    * Shadow Copy to mount for rbd/myshare :
    GMT-2013.08.09-10.14.44

    Verify that the first snapshot is correctly mounted:

    $ mount | grep myshare
    /dev/rbd1 on /myshare type xfs (rw,relatime,attr2,inode64,sunit=8192,swidth=8192,noquota)
    /dev/rbd2 on /myshare/.snapshots/@GMT-2013.08.09-10.14.44 type xfs (ro,relatime,nouuid,norecovery,attr2,inode64,sunit=8192,swidth=8192,noquota)

    You can also add this to the crontab to run the script every day:

    $ echo "00 0    * * *   root    /bin/bash /etc/ceph/scripts/autosnap.sh" >> /etc/crontab
  • August 3, 2013
    Test Ceph Persistent RBD Device

    Create a persistent RBD device

    Create a block device and map it with /etc/ceph/rbdmap:

    $ rbd create rbd/myrbd --size=1024
    $ echo "rbd/myrbd" >> /etc/ceph/rbdmap
    $ service rbdmap reload
    [ ok ] Starting RBD Mapping: rbd/myrbd.
    [ ok ] Mounting all filesystems...done.

    View the mapped RBD devices:

    $ rbd showmapped
    id pool image snap device    
    1  rbd  myrbd -    /dev/rbd1

    Create a filesystem and mount it:

    $ mkfs.xfs /dev/rbd/rbd/myrbd 
    log stripe unit (4194304 bytes) is too large (maximum is 256KiB)
    log stripe unit adjusted to 32KiB
    meta-data=/dev/rbd/rbd/myrbd     isize=256    agcount=9, agsize=31744 blks
             =                       sectsz=512   attr=2, projid32bit=0
    data     =                       bsize=4096   blocks=262144, imaxpct=25
             =                       sunit=1024   swidth=1024 blks
    naming   =version 2              bsize=4096   ascii-ci=0
    log      =internal log           bsize=4096   blocks=2560, version=2
             =                       sectsz=512   sunit=8 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0
    
    $ mkdir -p /mnt/myrbd
    $ blkid | grep rbd1
    /dev/rbd1: UUID="a07e969e-bb1a-4921-9171-82cf7a737a69" TYPE="xfs"
    $ echo "UUID=a07e969e-bb1a-4921-9171-82cf7a737a69 /mnt/myrbd xfs defaults 0 0" >> /etc/fstab
    $ mount -a

    Check:

    $ mount | grep rbd1
    /dev/rbd1 on /mnt/myrbd type xfs (rw,relatime,attr2,inode64,sunit=8192,swidth=8192,noquota)

    Test snapshot

    $ touch /mnt/myrbd/v1

    Make a snapshot:

    $ sync && xfs_freeze -f /mnt/
    $ rbd snap create rbd/myrbd@snap1
    $ xfs_freeze -u /mnt/

    Change a file:

    $ mv /mnt/myrbd/v1 /mnt/myrbd/v2

    Mount the snapshot read-only:

    $ mkdir -p /mnt/myrbd@snap1
    $ rbd map rbd/myrbd@snap1
    $ mount -t xfs -o ro,norecovery,nouuid "/dev/rbd/rbd/myrbd@snap1" "/mnt/myrbd@snap1"
    
    $ ls "/mnt/myrbd"
    total 0
    v2

    OK.

    $ ls "/mnt/myrbd@snap1"
    total 0

    Nothing??? Did something go wrong with the sync?

    Try again:

    $ sync && xfs_freeze -f /mnt/
    $ rbd snap create rbd/myrbd@snap2
    $ xfs_freeze -u /mnt/
    $ mkdir -p /mnt/myrbd@snap2
    $ rbd map rbd/myrbd@snap2
    $ mount -t xfs -o ro,norecovery,nouuid "/dev/rbd/rbd/myrbd@snap2" "/mnt/myrbd@snap2"

    Move the file again.

    $ mv /mnt/myrbd/v2 /mnt/myrbd/v3
    
    $ ls /mnt/myrbd@snap2
    total 0
    v2
    $ ls /mnt/myrbd
    total 0
    v3

    All right.

    Stop rbdmap (this will remove all mapped RBD devices):

    $ service rbdmap remove

    Remove the line added to /etc/ceph/rbdmap (see the sketch below).
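
    This can be scripted; a minimal sketch, assuming the line added earlier was exactly "rbd/myrbd":

    $ sed -i '/^rbd\/myrbd$/d' /etc/ceph/rbdmap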

    Remove myrbd:

    $ rbd snap purge rbd/myrbd
    Removing all snapshots: 100% complete...done.
    $ rbd rm rbd/myrbd
    Removing image: 100% complete...done.
  • August 2, 2013
    Don’t Forget to Unmap Before Removing an RBD

    $ rbd rm rbd/myrbd
    Removing image: 99% complete…failed.2013-08-02 14:07:17.530470 7f3ba2692760 -1 librbd: error removing header: (16) Device or resource busy
    rbd: error: image still has watchers
    This means the image is still open or the clien…
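
    The fix is to unmap the device before removing the image; a minimal sketch, assuming the image is mapped as /dev/rbd1 (check 'rbd showmapped' for the actual device):

    $ rbd showmapped
    $ rbd unmap /dev/rbd1
    $ rbd rm rbd/myrbd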
