This guide describes the procedure for removing an OSD from a Ceph cluster.
This method makes use of the ceph-osd charm’s remove-disk
action, which appeared in the charm’s quincy/stable
channel. There is a pre-Quincy version of this page available.
-
Before removing an OSD unit, we first need to ensure that the cluster is healthy:
juju ssh ceph-mon/leader sudo ceph status
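A healthy cluster reports HEALTH_OK. Illustrative output, trimmed to the relevant field (the cluster ID and service counts will differ on your deployment):

  cluster:
    id:     <cluster-uuid>
    health: HEALTH_OK
  ...

If the status shows HEALTH_WARN or HEALTH_ERR, investigate and resolve the issue before removing any OSDs.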
-
Identify the target OSD
Check the OSD tree to map OSDs to their host machines:
juju ssh ceph-mon/leader sudo ceph osd tree
Sample output:
ID  CLASS  WEIGHT   TYPE NAME             STATUS  REWEIGHT  PRI-AFF
-1         0.09357  root default
-5         0.03119      host finer-shrew
 2  hdd    0.03119          osd.2         up       1.00000  1.00000
...
Assume that we want to remove osd.2. As shown in the output, it is hosted on the machine finer-shrew. Check which unit is deployed on this machine:
juju status
Sample output:
...
Unit         Workload  Agent  Machine  Public address  Ports  Message
...
ceph-osd/1*  blocked   idle   1        192.168.122.48         No block devices detected using current configuration
...

Machine  State    DNS             Inst id      Series  AZ       Message
...
1        started  192.168.122.48  finer-shrew  jammy   default  Deployed
...
In this case, ceph-osd/1 is the unit we want to remove. Therefore, the target OSD can be identified by the following properties:

OSD_UNIT=ceph-osd/1
OSD=osd.2
OSD_ID=2
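These are plain shell variable assignments. Run them in the same terminal session used for the remaining steps so that references such as $OSD_UNIT and $OSD expand correctly in the commands below.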
-
Remove the OSD disk
Remove the OSD disk using the remove-disk action:

juju run $OSD_UNIT remove-disk osd-ids=$OSD purge=true --wait=5m
Note: For Juju versions < 3.0, use the juju run-action command instead.

Note: The remove-disk action attempts to safely remove the target OSD from the cluster. The action will fail with a timeout error if the OSD cannot be safely removed within the timeout period (5 minutes by default; a different duration such as 10s, 2m, or 10m can be given), or if removing the OSD would leave too few OSDs in the cluster to satisfy the replication requirements of the existing pools. If you want to remove the disk even though it is considered unsafe, add force=true to the command when running the action.
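For example, a forced removal would look like this, and on Juju 2.9 the action would be invoked with juju run-action (a sketch; flag placement may differ slightly between Juju releases):

juju run $OSD_UNIT remove-disk osd-ids=$OSD purge=true force=true --wait=5m

juju run-action $OSD_UNIT remove-disk osd-ids=$OSD purge=true --wait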
-
(Optional) If the unit hosting the target OSD does not have other active OSDs attached and you would like to delete it, you can do so by running:

juju remove-unit $OSD_UNIT

Caution: If there are active OSDs on the unit, removing it will produce unexpected errors.
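To confirm that the unit has no other active OSDs before removing it, re-check the OSD tree and verify that no OSDs remain under the corresponding host (finer-shrew in this example):

juju ssh ceph-mon/leader sudo ceph osd tree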
-
Ensure the cluster is in a healthy state after being scaled down:
juju ssh ceph-mon/leader sudo ceph status
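The removed OSD (osd.2 in this example) should also no longer appear in the output of ceph osd tree, and the cluster should again report HEALTH_OK.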