Charmed Ceph manual install

This guide shows how to perform a general install of Charmed Ceph. It covers the fundamental concepts and shows how a more customised Ceph cluster can be achieved. Ensure that the base requirements have been met.

What you will need

Cluster specifications

The Ceph cluster will have three Monitors and six OSDs. The OSDs will be provided by three storage nodes, with two OSDs hosted per node (backed by devices /dev/sdb and /dev/sdc).

A Monitor will be containerised on each of the storage nodes. This means that you will require three machines for the Ceph cluster. One additional machine will be needed for the Juju controller. The MAAS cluster must therefore consist of at least four machines.

The MAAS nodes will be running Ubuntu 18.04 LTS (Bionic), as will any LXD containers created during the deployment. Using Bionic gives us the opportunity to include the Ubuntu Cloud Archive (UCA) in our configuration. Ceph Octopus will be deployed.

Set up the environment

Ensure that you have a host with Juju installed and that it has network connectivity to the MAAS cluster. You must also have a user created on the MAAS cluster.

Inform Juju about the remote MAAS cluster and provide credentials for accessing it. Here the MAAS cluster will be added to Juju as a cloud called ‘my-maas’.

juju add-cloud
juju add-credential my-maas

We’ll then create a Juju controller management node as well as a Juju model to house the Ceph cluster. The controller will be called ‘my-controller’ and the model will be called ‘ceph’:

juju bootstrap --bootstrap-series=bionic my-maas my-controller
juju add-model --config default-series=bionic ceph

Please see the relevant Juju documentation if you need help with carrying out the above commands.

Configuration options

In this deployment, charm configuration options are placed in a YAML file rather than stated directly on the command line. The file ceph.yaml contains the options for all the charms we will use:

ceph-mon:
  customize-failure-domain: true
  monitor-count: 3
  expected-osd-count: 3
  source: cloud:bionic-ussuri

ceph-osd:
  customize-failure-domain: true
  osd-devices: /dev/sdb /dev/sdc
  source: cloud:bionic-ussuri

The meanings of the above options are explained below, categorised by charm.

ceph-mon

  • customize-failure-domain
    This option determines how a Ceph CRUSH map is configured. A value of ‘false’ (the default) will lead to a map that will replicate data across hosts (implemented as Ceph bucket type ‘host’). With a value of ‘true’ all MAAS-defined zones will be used to generate a map that will replicate data across Ceph availability zones (implemented as bucket type ‘rack’). This option is also supported by the ceph-osd charm, and its value must be the same for both charms.

  • expected-osd-count
    This option states the number of OSDs expected to be deployed in the cluster. This value can influence the number of placement groups (PGs) to use per pool. The PG calculation is based either on the actual number of OSDs or this option’s value, whichever is greater. The default value is ‘0’, which tells the charm to only consider the actual number of OSDs.

  • monitor-count
    This option gives the number of ceph-mon units in the monitor cluster (where one ceph-mon unit represents one MON). The default value is ‘3’ and is generally a good choice. For other monitor counts, it is good practice to set this explicitly to avoid a possible race condition during the formation of the cluster. The capacity for fault tolerance is based upon the preservation of quorum (majority of the original MON cluster). For example, a count of five can tolerate two failures (quorum: ⅗), a count of four can tolerate one failure (quorum: ¾), and a count of three can also tolerate one failure (quorum: ⅔).

  • source
    This option states the software sources. A common value is an OpenStack UCA release (e.g. ‘cloud:xenial-queens’ or ‘cloud:bionic-ussuri’). See Ceph and the UCA. The underlying host’s existing apt sources will be used if this option is not specified (this behaviour can be explicitly chosen by using the value of ‘distro’). This option is also supported by the ceph-osd charm, and its value must be the same for both charms.
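
The quorum arithmetic described under monitor-count can be checked with a short sketch. The function names below are illustrative only and are not part of Juju or Ceph; the sketch assumes that quorum means a strict majority of the original monitor count.

```shell
# Illustrative helpers (not part of Juju or Ceph).

# Smallest number of monitors that forms a strict majority of N.
mon_quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of monitor failures N monitors can absorb while keeping quorum.
mon_fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare the counts discussed above.
for n in 3 4 5; do
    echo "monitors=$n quorum=$(mon_quorum_size "$n") tolerated_failures=$(mon_fault_tolerance "$n")"
done
```

The loop shows why a count of four offers no more fault tolerance than a count of three: both tolerate a single failure.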

ceph-osd

  • customize-failure-domain
    This option’s description is the same as that of the identically-named ceph-mon option. Each charm’s option must have the same value.

  • osd-devices
    This option lists the block devices that can be used for OSDs across the cluster. This list may affect newly added ceph-osd units as well as existing units (the option may be modified after units have been added). The charm will attempt to activate as Ceph storage any listed device that is visible to the unit’s underlying machine.

  • source
    This option’s description is the same as that of the identically-named ceph-mon option. Each charm’s option must have the same value.

In particular, we see that OpenStack Ussuri on Bionic has been chosen as the software source for Ceph. From Ceph and the UCA it can be deduced that the resulting Ceph version will be Octopus.
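
Because customize-failure-domain and source must have the same value for both charms, it can be worth comparing them before deploying. The following is a minimal sketch: check_shared_option is a hypothetical helper, and the values passed to it are simply copied from the ceph.yaml file shown earlier.

```shell
# Hypothetical helper: succeeds only when both charms carry the same value
# for an option that must match between ceph-mon and ceph-osd.
check_shared_option() {
    option=$1
    mon_value=$2
    osd_value=$3
    if [ "$mon_value" = "$osd_value" ]; then
        echo "OK: $option is '$mon_value' for both charms"
    else
        echo "MISMATCH: $option is '$mon_value' (ceph-mon) but '$osd_value' (ceph-osd)"
        return 1
    fi
}

# Values copied by hand from ceph.yaml
check_shared_option source cloud:bionic-ussuri cloud:bionic-ussuri
check_shared_option customize-failure-domain true true
```

On an existing model the same comparison could be fed with live values, for example "$(juju config ceph-mon source)" and "$(juju config ceph-osd source)".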

Data resiliency

Ceph storage is natively highly available. This means that, by design, data objects are stored in such a way that data resiliency is ensured in the event of an OSD failure. By default the ceph-osd charm will maintain three replicas of each storage object across the cluster.

The storage of these objects is organised on a per pool basis, where each pool is of one of two types:

  • replicated (default)
  • erasure coded

For a detailed explanation of these two pool types see section Pool types in this guide.
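
As a rough worked example of what three replicas mean for capacity, the sketch below divides raw capacity by the replica count. It is an approximation only: it ignores Ceph's own overhead and full ratios, the helper name usable_gib is hypothetical, and the 180 GiB figure is the raw capacity reported in the sample ceph status output at the end of this guide.

```shell
# Hypothetical helper: approximate usable capacity of a replicated pool.
# Integer GiB only; overhead and full ratios are deliberately ignored.
usable_gib() {
    raw_gib=$1
    replicas=$2
    echo $(( raw_gib / replicas ))
}

# 180 GiB raw with three replicas
usable_gib 180 3   # prints 60
```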

Deploy Ceph

The actual deployment of Ceph is straightforward:

juju deploy -n 3 --config ./ceph.yaml ceph-osd
juju deploy -n 3 --to lxd:0,lxd:1,lxd:2 --config ./ceph.yaml ceph-mon
juju add-relation ceph-osd:mon ceph-mon:osd

As planned, a containerised Monitor is placed on each storage node. We’ve assumed that the machines spawned in the first command are assigned the IDs 0, 1, and 2. This is the default behaviour in a new model (i.e. a model’s first machine will have an ID of 0).
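
If the storage machines end up with different IDs (for instance, when the model already contained machines), the --to argument has to be adjusted accordingly. A minimal sketch of building that argument is shown here; lxd_placement is a hypothetical helper, not a Juju feature.

```shell
# Hypothetical helper: turn a list of machine IDs into a --to placement
# string that requests one LXD container on each listed machine.
lxd_placement() {
    placement=""
    for id in "$@"; do
        # Add a comma separator only when the string is already non-empty
        placement="${placement:+$placement,}lxd:$id"
    done
    echo "$placement"
}

lxd_placement 0 1 2   # prints lxd:0,lxd:1,lxd:2

# Possible use once the real IDs are known (e.g. from "juju machines"):
#   juju deploy -n 3 --to "$(lxd_placement 3 4 5)" --config ./ceph.yaml ceph-mon
```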

The output of the juju status command should look similar to this:

Model  Controller     Cloud/Region     Version  SLA          Timestamp
ceph   my-controller  my-maas/default  2.8.1    unsupported  14:31:23Z

App       Version  Status  Scale  Charm     Store       Rev  OS      Notes
ceph-mon  15.2.3   active      3  ceph-mon  jujucharms   48  ubuntu  
ceph-osd  15.2.3   active      3  ceph-osd  jujucharms  303  ubuntu  

Unit         Workload  Agent  Machine  Public address  Ports  Message
ceph-mon/0*  active    idle   0/lxd/0  10.0.0.130             Unit is ready and clustered
ceph-mon/1   active    idle   1/lxd/0  10.0.0.131             Unit is ready and clustered
ceph-mon/2   active    idle   2/lxd/0  10.0.0.132             Unit is ready and clustered
ceph-osd/0*  active    idle   0        10.0.0.127             Unit is ready (2 OSD)
ceph-osd/1   active    idle   1        10.0.0.128             Unit is ready (2 OSD)
ceph-osd/2   active    idle   2        10.0.0.129             Unit is ready (2 OSD)

Machine  State    DNS         Inst id              Series  AZ       Message
0        started  10.0.0.127  node1                bionic  default  Deployed
0/lxd/0  started  10.0.0.130  juju-385085-0-lxd-0  bionic  default  Container started
1        started  10.0.0.128  node2                bionic  default  Deployed
1/lxd/0  started  10.0.0.131  juju-385085-1-lxd-0  bionic  default  Container started
2        started  10.0.0.129  node3                bionic  default  Deployed
2/lxd/0  started  10.0.0.132  juju-385085-2-lxd-0  bionic  default  Container started

Above we see the three Monitors and the six OSDs (two per storage node). It should also be clear that each Monitor is containerised (e.g. 1/lxd/0 indicates “LXD machine 0 on machine 1”).
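
The machine ID notation can be unpacked mechanically. The sketch below is illustrative only (describe_machine is a hypothetical helper) and assumes IDs of the form HOST/TYPE/INDEX for containers and a bare number for physical machines, as seen in the status output.

```shell
# Hypothetical helper: explain a Juju machine ID such as "1/lxd/0".
describe_machine() {
    case $1 in
        */*/*)
            host=${1%%/*}       # text before the first slash
            rest=${1#*/}        # text after the first slash
            kind=${rest%%/*}    # container type, e.g. lxd
            index=${rest#*/}    # container number on that host
            echo "$kind container $index on machine $host"
            ;;
        *)
            echo "machine $1"
            ;;
    esac
}

describe_machine 1/lxd/0
describe_machine 2
```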

Under the ‘Version’ column for either the ceph-mon or ceph-osd application a value of ‘15.2.3’ is shown. This corresponds to Ceph Octopus.

Verification

Verify the state of the Ceph cluster by displaying the output of the traditional ceph status command. Invoke this command on one of the Monitors:

juju ssh ceph-mon/0 sudo ceph status

Sample output is:

  cluster:
    id:     de95f820-c7e1-11ea-a916-b9347f3c399e
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum juju-385085-0-lxd-0,juju-385085-1-lxd-0,juju-385085-2-lxd-0 (age 5d)
    mgr: juju-385085-0-lxd-0(active, since 5d), standbys: juju-385085-2-lxd-0, juju-385085-1-lxd-0
    osd: 6 osds: 6 up (since 5d), 6 in (since 5d)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   6.0 GiB used, 174 GiB / 180 GiB avail
    pgs:     1 active+clean
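
For scripted checks, the health line can be tested rather than read by eye. The following is a minimal sketch under the assumption that the status text keeps the "health: HEALTH_OK" form shown above; cluster_is_healthy is a hypothetical helper.

```shell
# Hypothetical helper: read "ceph status" text on stdin and succeed only
# when the health line reports HEALTH_OK.
cluster_is_healthy() {
    grep -Eq '^[[:space:]]*health:[[:space:]]*HEALTH_OK'
}

# Sample input taken from the output above
printf '  cluster:\n    health: HEALTH_OK\n' | cluster_is_healthy && echo "cluster healthy"

# Possible use against the deployed cluster:
#   juju ssh ceph-mon/0 sudo ceph status | cluster_is_healthy
```

Any other state (such as HEALTH_WARN) makes the function return a non-zero exit code, which a wrapper script can act upon.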

You now have a Ceph cluster up and running.