The Ceph Blog

Deploying Ceph with Juju

NOTE: This guide is out of date.  Please see the included documentation on the more recent charms in the charmstore.

The last few weeks have been very exciting for Inktank and Ceph. There have been a number of community examples of how people are deploying or using Ceph in the wild, from the ComodIT orchestration example to Synnefo’s unique approach of delivering unified storage with Ceph, plus many others that haven’t made it to the blog yet. It is a great time to be doing things with Ceph!

We at Inktank have been just as excited as anyone in the community and have been playing with a number of deployment and orchestration tools. Today I wanted to share an experiment of my own with the community: deploying Ceph with Juju, Canonical’s relatively new deployment tool that is taking cloud deployments by storm. If you follow this guide to the end you should end up with something that looks like this:


Juju is a “next generation service deployment and orchestration framework”. The cool part about Juju is that you can use just about anything to build your Juju “charms” (recipes), from bash and your favorite scripting language all the way up to Chef and Puppet. A good portion of the knowledge for the Ceph charms developed by Clint Byrum and James Page actually came from both the Chef cookbooks and the work on ceph-deploy, which we’ll cover in later installments.

For the purposes of this experiment I decided to build the environment using Amazon’s EC2, but you can also use an OpenStack deployment or your own bare metal in conjunction with Canonical’s MAAS product. The client machine used to spin up the bootstrap environment and then later spin up all the other servers will be an Ubuntu Quantal (12.10) image, but it could be any Ubuntu box, including your laptop. The rest of the working machines will be spun up using Quantal as well.

Juju is very generous about spinning up new boxes (typically one per service) so I chose to make all of my boxes spin up using the ‘t1.micro’ machine size so anyone playing with this guide wouldn’t incur massive EC2 charges. Now, on to the meat!

Getting Started

As I said, start the process by spinning up an Ubuntu 12.10 image as your client; this way you don’t have to dump a bunch of software/config on your local machine. This will be the client you use to spin everything else up. Once you have your base Ubuntu install, let’s add the PPA and install Juju.

> sudo apt-add-repository ppa:juju/pkgs
> sudo apt-get update && sudo apt-get install juju

Now that we have Juju installed we need to tell it to generate a config file.

> juju bootstrap

This will throw an error, but creates ~/.juju/environments.yaml for you to edit. Since we’re using EC2 we need to tell Juju about our credentials so it can spin up new machines and deploy new services. You’ll notice that I’m using the default-series of ‘quantal’ for all of my node machines. This is important since this tells juju where and how to grab the important bits of each charm.

> vi ~/.juju/environments.yaml

default: cephtest
environments:
  cephtest:
    type: ec2
    access-key: YOUR-ACCESS-KEY-GOES-HERE
    secret-key: YOUR-SECRET-KEY-GOES-HERE
    control-bucket: (generated by juju)
    admin-secret: (generated by juju)
    default-series: quantal
    juju-origin: ppa
    ssl-hostname-verification: true
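
Before bootstrapping, it can be worth a quick sanity check that the edited file no longer contains the placeholder credentials. The following is a minimal sketch, not part of Juju: the function name check_juju_env is made up for illustration, and it assumes the placeholders look exactly like the ones in the sample config above.

```shell
#!/bin/sh
# check_juju_env [FILE]
# Hypothetical helper (not a Juju command): returns non-zero if FILE still
# contains the placeholder credentials or lacks the quantal default-series.
check_juju_env() {
    file=${1:-$HOME/.juju/environments.yaml}

    # The sample config uses "...-GOES-HERE" as placeholder text.
    if grep -q 'GOES-HERE' "$file"; then
        echo "placeholder credentials still present in $file" >&2
        return 1
    fi

    # This guide relies on default-series being set to quantal.
    if ! grep -q '^ *default-series: quantal$' "$file"; then
        echo "default-series is not quantal in $file" >&2
        return 1
    fi
}

# Usage on the client machine:
#   check_juju_env && echo "environments.yaml looks ready"
```

The check is deliberately crude (plain grep rather than a YAML parser), which keeps it dependency-free on a fresh client box.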

Setting up the Bootstrap Environment

Now that Juju can interact with EC2 directly we need to get a bootstrap environment set up that will hold our configs and deploy our services. Since I can’t set the global configs yet, I need to tell it manually that this box needs to be a ‘t1.micro’ instance.

> juju bootstrap --constraints "instance-type=t1.micro"

This will take a few minutes to spin up the machine and get the environment set up. Once this is completed you should be able to see the machine via the ‘juju status’ command.

> juju status

2012-11-07 13:06:30,645 INFO Connecting to environment...
2012-11-07 13:06:42,313 INFO Connected to environment.
machines:
  0:
    agent-state: running
    dns-name: ec2-23-20-70-201.compute-1.amazonaws.com
    instance-id: i-d79492ab
    instance-state: running
services: {}
2012-11-07 13:06:42,408 INFO 'status' command finished successfully

Now we have a bootstrap environment and we can tell it that all boxes should default to ‘t1.micro’ unless otherwise specified. There are a number of settings that you can monkey with; take a look at the constraints doc for more details.

> juju set-constraints instance-type=t1.micro

Make it Pretty!

For those who like to see a visual representation of what’s happening, or just feel like letting someone else watch what’s going on, Juju now has a GUI that you can use. While I wouldn’t recommend using the GUI as a replacement for the command line to deploy the charms below, you can certainly use it to watch what’s happening. For more mature charms (and in the future) this GUI should be more than capable of managing your resources. In any case, it’s neat to have pretty pictures as you tapdance on the CLI.

If you would like to install the GUI feel free to grab my version of the ‘juju-gui’ charm (at the time of this article the main charm wasn’t on quantal yet):

> juju deploy cs:~pmcgarry/quantal/juju-gui

Once that completes (and it could take a while for everything to download and install) you’ll need to ‘expose’ it so you can get to it:

> juju expose juju-gui

This will give you the ability to access the box publicly via a web browser at the ec2 address shown in ‘juju status’. The default user name is ‘admin’ and the password is the ‘admin-secret’ value from your ~/.juju/environments.yaml file. Feel free to leave that up while you do the rest of this work to watch the magic happen.

Prep for Ceph Deployment

Our Juju environment is now ready to start spinning up our Ceph cluster; we just need to do a little legwork so Juju has all the important details up front. First we need to grab a few Ceph tools:

> sudo apt-get install ceph-common && sudo apt-get install uuid

We need to generate a uuid and auth key for Ceph to use.

> uuid

Insert this as the fsid value below.

> ceph-authtool /dev/stdout --name=$NAME --gen-key

Insert this as the monitor-secret value below.

Now we need to drop these (and a few other) values into our yaml file:

> vi ceph.yaml

ceph:
    source: http://ceph.com/debian-bobtail/ quantal main
    fsid: d78ae656-7476-11e2-a532-1231390a9d4b
    monitor-secret: AQDcNRlR6MMZNRAAWw3iAobsJ1MLoFBLJYo4yg==

ceph-osd:
    source: http://ceph.com/debian-bobtail/ quantal main
    osd-devices: /dev/xvdf

ceph-radosgw:
    source: http://ceph.com/debian-bobtail/ quantal main

You’ll notice we’re also passing a ‘source’ item to Juju. This tells the charm where to grab the appropriate code for Ceph, in this case the latest release (Bobtail 0.56.3 when this was written) from Ceph.com.
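
If you would rather not hand-edit the file, the same ceph.yaml can be generated from the two values. This is a minimal sketch: write_ceph_yaml is a hypothetical helper name, and the usage comment assumes the monitor entity name ‘mon.’ for the $NAME placeholder, which is an assumption rather than something this guide states.

```shell
#!/bin/sh
# write_ceph_yaml FSID MONITOR_SECRET [OUTFILE]
# Hypothetical helper: writes the three-section config used in this guide.
write_ceph_yaml() {
    fsid=$1
    secret=$2
    out=${3:-ceph.yaml}
    # Same package source for all three charms (Bobtail on quantal).
    src='http://ceph.com/debian-bobtail/ quantal main'

    cat > "$out" <<EOF
ceph:
    source: $src
    fsid: $fsid
    monitor-secret: $secret

ceph-osd:
    source: $src
    osd-devices: /dev/xvdf

ceph-radosgw:
    source: $src
EOF
}

# Usage on the client machine:
#   uuid                                                  # prints the fsid
#   ceph-authtool /dev/stdout --name=mon. --gen-key       # prints the key ("mon." is assumed)
#   write_ceph_yaml "<fsid from uuid>" "<key from ceph-authtool>"
```

Passing the values as arguments keeps the secret out of the script itself; the generated file still contains it, so treat ceph.yaml accordingly.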

Tail Those Logs!

Since a good portion of this setup is experimental, it’s a good idea to tail the logs. Thankfully, Juju makes this extremely easy to do. Simply open a second terminal window, ssh to your client machine, and type:

> juju debug-log

This will aggregate all of the logs from your cluster into a single output for easy browsing in case something goes wrong.

Deploying Ceph Monitors

Time to start deploying our Ceph cluster! In this case we’re going to deploy the first three machines with ceph-mon (Ceph monitors) since we typically recommend at least three in order to reach a quorum. You’ll want to wait until all three machines are up before moving on.

> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph

You’ll notice that while these charms are in the charm store (cs:) they are off on my own user space. This is because I had to make a few tweaky changes for these charms to deploy happily on ec2 and use bobtail and quantal. These charms are still a bit new so if you have tweaks or changes feel free to give me a shout, or play with the main Ceph charms on jujucharms.com. In the future you’ll be able to deploy using just ‘ceph’ instead of anyone’s user space.

EXAMPLE: > juju deploy -n 3 --config ceph.yaml ceph

This could take a while, so just keep checking ‘juju status’ until you have the machines running AND the agents set to ‘started.’ You should also see the debug-log go through a flurry of activity when it starts getting close to the end.
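
Rather than re-running ‘juju status’ by hand, the wait can be scripted. The sketch below assumes the status layout shown in this guide (unit names such as ceph/0 followed by an agent-state line); the function names are hypothetical and the 30-second poll interval is arbitrary.

```shell
#!/bin/sh
# all_units_started: reads 'juju status' text on stdin and succeeds only if
# at least one unit is listed and every unit reports agent-state: started.
# Machine-level agent-state lines are ignored because no unit precedes them.
all_units_started() {
    awk '
        /^ +[a-z0-9-]+\/[0-9]+:$/ { unit = $1; total++; next }
        /agent-state:/ && unit != "" {
            if ($2 == "started") started++
            unit = ""
        }
        END { exit !(total > 0 && started == total) }
    '
}

# wait_for_units: polls until every deployed unit has started.
wait_for_units() {
    until juju status 2>/dev/null | all_units_started; do
        sleep 30
    done
}

# Usage on the client machine:
#   wait_for_units && echo "all units started"
```

Note that this waits forever if a unit lands in an error state, so keep the debug-log window open while it runs.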

Once we have the monitors up and running you can take a look at what your deployment looks like. If you want to, you can even ssh into one of the machines using Juju’s built-in ssh tool.

> juju status

machines:
  0:
    agent-state: running
    dns-name: ec2-50-16-15-64.compute-1.amazonaws.com
    instance-id: i-2b45f657
    instance-state: running
  1:
    agent-state: running
    dns-name: ec2-50-19-23-167.compute-1.amazonaws.com
    instance-id: i-3b368547
    instance-state: running
  2:
    agent-state: running
    dns-name: ec2-107-22-128-107.compute-1.amazonaws.com
    instance-id: i-1f368563
    instance-state: running
  3:
    agent-state: running
    dns-name: ec2-174-129-51-96.compute-1.amazonaws.com
    instance-id: i-15368569
    instance-state: running
services:
  ceph:
    charm: cs:~pmcgarry/quantal/ceph-0
    relations:
      mon:
      - ceph
    units:
      ceph/0:
        agent-state: started
        machine: 1
        public-address: ec2-50-19-23-167.compute-1.amazonaws.com
      ceph/1:
        agent-state: started
        machine: 2
        public-address: ec2-107-22-128-107.compute-1.amazonaws.com
      ceph/2:
        agent-state: started
        machine: 3
        public-address: ec2-174-129-51-96.compute-1.amazonaws.com

> juju ssh ceph/0 sudo ceph -s

   health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
   monmap e2: 3 mons at {ceph232118103=10.243.121.227:6789/0,ceph501969115=10.245.210.114:6789/0,ceph5423414494=10.245.89.32:6789/0}, election epoch 6, quorum 0,1,2 ceph232118103,ceph501969115,ceph5423414494
   osdmap e1: 0 osds: 0 up, 0 in
    pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
   mdsmap e1: 0/0/1 up

From this status we can see that there are three monitors up (“monmap e2: 3 mons at {…}”) and no OSDs (“osdmap e1: 0 osds: 0 up, 0 in”). Time to spin up some homes for those bits!

Deploying OSDs

Once our monitors look healthy it’s time to spin up some OSDs. Feel free to drop in as many as you please; for the purposes of this experiment I chose to spin up three.

> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph-osd

That will take a little bit to complete so you may want to go grab an infusion of caffeine at this point. One thing to keep in mind is that earlier in our ceph.yaml we defined the physical devices for our OSDs as /dev/xvdf. If you are familiar with EC2 you will know that that device doesn’t exist yet, so our OSD deploy command will spin up and configure boxes, but we’re not quite there yet.

When you get back, if you take a look with juju status you should now see a bunch of new machines and a new section called ceph-osd:

> juju status

…
  ceph-osd:
    charm: cs:~pmcgarry/quantal/ceph-osd-0
    relations: {}
    units:
      ceph-osd/0:
        agent-state: started
        machine: 4
        public-address: ec2-174-129-82-169.compute-1.amazonaws.com
      ceph-osd/1:
        agent-state: started
        machine: 5
        public-address: ec2-50-16-0-95.compute-1.amazonaws.com
      ceph-osd/2:
        agent-state: started
        machine: 6
        public-address: ec2-75-101-175-213.compute-1.amazonaws.com
    

Now we need to actually give it the disks it needs. Via your EC2 console (or using ec2 command line tools) you need to spin up 3 EBS volumes and attach one to each of your OSD machines. If you need help there is a pretty decent, concise walkthrough at:

http://www.webmastersessions.com/how-to-attach-ebs-volume-to-amazon-ec2-instance
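
If you prefer the command line to the console, the create-and-attach step can be sketched with the AWS CLI. Several things here are assumptions: that the ‘aws’ tool is installed and configured (this guide only mentions the EC2 console and command line tools in general), the hypothetical function name, the placeholder instance IDs and zone, and the 10 GB default size. The volume must be created in the same availability zone as the instance it is attached to.

```shell
#!/bin/sh
# attach_osd_volume INSTANCE_ID ZONE [SIZE_GB]
# Hypothetical helper: creates one EBS volume and attaches it to an instance.
attach_osd_volume() {
    instance=$1
    zone=$2
    size=${3:-10}

    # Create the volume and capture its ID.
    volume=$(aws ec2 create-volume --availability-zone "$zone" \
        --size "$size" --query VolumeId --output text) || return 1

    # Wait until the new volume can be attached.
    aws ec2 wait volume-available --volume-ids "$volume" || return 1

    # Attach as /dev/sdf; Ubuntu on EC2 typically exposes this as /dev/xvdf,
    # which is the device named in ceph.yaml.
    aws ec2 attach-volume --volume-id "$volume" \
        --instance-id "$instance" --device /dev/sdf
}

# Usage (placeholder IDs and zone; substitute the instance IDs of your
# ceph-osd machines as reported by 'juju status'):
#   for id in i-aaaa1111 i-bbbb2222 i-cccc3333; do
#       attach_osd_volume "$id" us-east-1a 10
#   done
```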

Once you have the volumes attached we need to tell Juju to go back and use them:

> juju set ceph-osd "osd-devices=/dev/xvdf"

This will trigger a rescan and get your OSDs functioning. All that’s left now is to connect our monitor cluster with the new pool of OSDs.

> juju add-relation ceph-osd ceph

We can ssh into one of the Ceph boxes and take a look at our cluster now:

> juju ssh ceph/0

> sudo ceph -s

   health HEALTH_OK
   monmap e2: 3 mons at {ceph232118103=10.243.121.227:6789/0,ceph501969115=10.245.210.114:6789/0,ceph5423414494=10.245.89.32:6789/0}, election epoch 6, quorum 0,1,2 ceph232118103,ceph501969115,ceph5423414494
   osdmap e10: 3 osds: 3 up, 3 in
    pgmap v115: 208 pgs: 208 active+clean; 0 bytes data, 3102 MB used, 27584 MB / 30686 MB avail
   mdsmap e1: 0/0/1 up
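
The health line is the one to watch, and it is easy to pull out of the status text in a script. A minimal sketch, assuming the ‘ceph -s’ layout shown in this guide (a line whose first word is ‘health’); ceph_health is a hypothetical helper name.

```shell
#!/bin/sh
# ceph_health: reads 'ceph -s' output on stdin and prints only the health
# keyword (for example HEALTH_OK or HEALTH_ERR).
ceph_health() {
    awk '
        { sub(/\r$/, "") }                  # tolerate CRLF from a remote terminal
        $1 == "health" { print $2; exit }
    '
}

# Usage on the client machine:
#   juju ssh ceph/0 sudo ceph -s | ceph_health
#
# Or block until the cluster settles:
#   until [ "$(juju ssh ceph/0 sudo ceph -s | ceph_health)" = "HEALTH_OK" ]; do
#       sleep 30
#   done
```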

Congratulations, you now have a Ceph cluster! Feel free to write a few apps against it, show it to all of your friends, or just nuke it and start refining your chops for a production deployment.

Extra Credit

Since that Juju GUI screen looked so empty I decided I wanted to play a bit more with the tools at my disposal. If you would like to take this exercise a bit further we can also add a few RADOS Gateway machines and load-balance them behind an haproxy machine. This takes only a few more commands with Juju:

> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph-radosgw
> juju expose ceph-radosgw

> juju deploy cs:~pmcgarry/quantal/haproxy
> juju expose haproxy

> juju add-relation ceph-radosgw haproxy

That should be it! You’ll notice that I have my own copy of the haproxy charm. This is simply because it isn’t technically released for quantal yet, but my (unmodified) version seems to run just fine.

Troubleshooting

Juju actually makes troubleshooting and iterative development VERY easy (one of my favorite things about it). If you would like to delve deeper into playing with Juju I highly recommend reading their docs, which are quite good. However, one of the most useful tools (beyond the debug-log I mentioned earlier) is the ability to step through the hooks as juju tries to run them. For example, let’s say we tried to deploy Ceph and ‘juju status’ was telling us there was an ‘install-error.’ We could use our second terminal window to execute the following:

> juju debug-hooks ceph/0

This allows us to debug the execution of the hooks on a specific machine (in this case ceph/0). Now in our main window we can type:

> juju resolved --retry ceph/0

We get a preformatted setup in our ‘debug-hooks’ window with an indication at the bottom that we’re on the “install” hook. From here we can change to the hooks directory and rerun the install hook:

> cd hooks
> ./install

From here we can troubleshoot errors on this box before going back and pushing a patch to Launchpad.net. I won’t try to recreate the expansive documentation on the jujucharms site, but fiddling with Juju has been far less frustrating than some other orchestration frameworks I have poked at recently. Good luck, and happy charming!
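
To find out which units need this treatment in the first place, the status output can be filtered for error states. A minimal sketch, assuming the ‘juju status’ layout shown earlier and that failed hooks show up as an agent-state containing ‘error’ (such as the ‘install-error’ mentioned above); failed_units is a hypothetical helper name.

```shell
#!/bin/sh
# failed_units: reads 'juju status' text on stdin and prints the name of
# every unit whose agent-state contains "error" (e.g. install-error).
failed_units() {
    awk '
        /^ +[a-z0-9-]+\/[0-9]+:$/ { unit = $1; sub(/:$/, "", unit); next }
        /agent-state:/ && unit != "" {
            if ($2 ~ /error/) print unit
            unit = ""
        }
    '
}

# Usage on the client machine:
#   juju status 2>/dev/null | failed_units
#
# Retry every failed unit in one go:
#   for unit in $(juju status 2>/dev/null | failed_units); do
#       juju resolved --retry "$unit"
#   done
```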

Cleaning Up

If you would like to close up shop you can either destroy just the services (if you want to keep the machines running for deploying other Juju tests):

> juju destroy-service ceph
> juju destroy-service ceph-osd
> juju destroy-service ceph-radosgw
> juju destroy-service haproxy

…or just drop some dynamite on the whole thing (this will kill everything but your client machine, including your bootstrap environment):

> juju destroy-environment

Wrap Up

You are now a seasoned veteran of Ceph deployment; what more could you want? If you do have questions, comments, or anything for the good of the cause we would love to hear about it. Currently the best way to get help or give feedback is in our #Ceph irc channel, but our mailing lists are also pretty active. For Juju-specific feedback you can also hit up the #Juju irc channel. If you see any egregious errors in this writeup or would like to know more about Ceph community plans feel free to send email to patrick at inktank dot com.

scuttlemonkey out

Comments: Deploying Ceph with Juju

  1. Hello! Very well written article!

    I have 2 questions:

    1. How can I implement replication? So I can have a failsafe cluster.
    2. How can I expand the storage capacity of my cluster adding only new disks, not new servers.

    Thank you very much!

    Posted by Daniel
    June 7, 2013 at 12:37 pm
    • Thanks!

      1. For multi-cluster replication it’s still a bit of a work-in-progress. For RBD you can use the incremental snapshots feature, and with Dumpling you’ll have a much better option for the RESTful gateway to do something similar. Longer-term we’re planning for a robust geo solution all in one cluster, but for now we’re working on DR-type solutions.

      2. I’m assuming you’re referring to “how can I do this with Juju” rather than just in general. If that’s the case, you can repeat the > juju set ceph-osd "osd-devices=/dev/xvdf" step after adding more disks to a host to trigger a rescan just like you did the first time through.

      Hope that helps! If you have more questions feel free to stop by our #ceph irc channel on irc.oftc.net.

      Posted by scuttlemonkey
      June 7, 2013 at 3:39 pm
  2. hi!

    I am doing openstack installation and planned to use ceph as object storage. I am planning to deploy everything using maas and juju. I am new to ceph, From above steps it looks like we will need 6 physical servers, 3 for monitors and 3 for osd, is that correct?
    Cant we install ceph-osd on same monitor servers?

    Plz help would be very appreciated.

    Posted by Sonali Jadhav
    July 28, 2013 at 1:54 pm
    • At the time of writing this, that wasn’t possible (the targeted machine option wasn’t yet available for juju deploy). However, combining your monitors and OSDs is not going to give you the redundancy or performance that you might wish for. For testing purposes you could certainly do that, but I wouldn’t recommend it for a production environment.

      Posted by scuttlemonkey
      July 28, 2013 at 2:42 pm
  3. Hi, thanks for the quick response, I am planning for production scenario only. I have 3 big config servers standing, which i thought would be enough ceph before reading this guide, But i want to go for ceph storage cluster, So I am thinking to 2 scenarios right now,
    1) 2 ceph OSDs and 1 monitor

    and

    2)ceph monitor plus osd on all 3 servers – i am thinking of this scenario just to start now and later one once we grow more we’ll separate monitor roles from osd server if its possible (Is that possible?)

    Which one of above scenario you would suggest will be best for production to start?

    Posted by Sonali Jadhav
    July 28, 2013 at 4:41 pm
    • Sorry for the lag in responding here, have been traveling.

      I would probably go with the second option here so you can get used to dealing with multiple mons and keys across machines. If you decide to scale up you can always bring up new OSD-only machines and balance things away from your mons for better performance.

      The only caveat here would be to make sure that you have multiple drives in those servers as production setups recommend 1 OSD per disk (rather than the quick-start which shows you how to do OSD per directory). Stop by #ceph on irc.oftc.net if you have any questions. Thanks.

      Posted by scuttlemonkey
      August 5, 2013 at 6:23 pm
  4. whats the value of $NAME? in command ceph-authtool /dev/stdout --name=$NAME --gen-key ?

    Posted by sonali
    September 2, 2013 at 1:58 pm


© 2013, Inktank Storage, Inc. All rights reserved.