Observability

Observability covers gathering, processing, and visualizing the telemetry signals generated by a given workload, and alerting on them.

The Observability stack supported by Charmed Ceph is the Canonical Observability Stack (COS Lite). The instructions below assume a pre-existing COS environment. See the COS documentation for how to get started.

Connect to COS

The connection to COS is made via the ceph-mon charm, which provides two relations for observability purposes:

  1. cos-agent: this relation integrates ceph-mon with a grafana-agent application, which serves as a single point of entry to push metrics, alert rules and dashboards to COS. This is the more featureful and preferred way to integrate Ceph with COS.
  2. Legacy metrics-endpoint: this can be used to have the COS Prometheus scrape ceph-mon for metrics and alert rules only.

Use one or the other. Do not relate both metrics-endpoint and cos-agent, as this will lead to duplicated metrics.
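As a rough way to check for this, the sketch below reads the output of `juju status --relations` on standard input and warns when both endpoints appear. The function name `check_single_integration` is hypothetical, and the check assumes that related endpoints are listed in the form `ceph-mon:<endpoint>`.

```shell
# Hypothetical helper: reads `juju status --relations` output on stdin and
# warns when ceph-mon has both observability endpoints related at once.
# Assumes endpoints are printed as "ceph-mon:<endpoint>".
check_single_integration() {
    status=$(cat)
    has_agent=0
    has_legacy=0
    printf '%s\n' "$status" | grep -q 'ceph-mon:cos-agent' && has_agent=1
    printf '%s\n' "$status" | grep -q 'ceph-mon:metrics-endpoint' && has_legacy=1
    if [ "$has_agent" -eq 1 ] && [ "$has_legacy" -eq 1 ]; then
        echo "both cos-agent and metrics-endpoint are related: expect duplicated metrics"
        return 1
    fi
    echo "ok"
}
```

A possible invocation, assuming the Ceph model is named ceph, is `juju status -m ceph --relations | check_single_integration`.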

As Ceph and COS will usually be deployed to two separate Juju models, cross-model relations must be used in both cases.

The cos-agent integration

The cos-agent relation connects ceph-mon to a grafana-agent application, which takes metrics, alert rules and dashboards from ceph-mon and transmits them to COS. This is the recommended way to integrate Ceph with COS.

The following steps assume a COS model named cos-lite and a Ceph model named ceph.

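First, create offers in the COS model, giving each an explicit offer name so that it can be referenced from the Ceph model (without one, Juju names the offer after the application):

juju offer -m cos-lite prometheus:receive-remote-write prometheus-receive-remote-write
juju offer -m cos-lite grafana:grafana-dashboard grafana-dashboard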

Then, in the Ceph model, deploy a grafana-agent subordinate (revision 167 or newer is required):

juju deploy -m ceph grafana-agent --channel latest/edge

Integrate grafana-agent with the ceph-mon and ceph-osd applications:

juju integrate -m ceph ceph-mon:cos-agent grafana-agent
juju integrate -m ceph grafana-agent ceph-osd

The integration between grafana-agent and ceph-mon will pull metrics, alert rules and dashboards from Ceph, and the integration with the ceph-osd application will install the Grafana Node Exporter on the OSD machines to gather low-level machine metrics.

Integrate grafana-agent with the Grafana and Prometheus offers from the COS model:

juju integrate -m ceph grafana-agent admin/cos-lite.grafana-dashboard
juju integrate -m ceph grafana-agent admin/cos-lite.prometheus-receive-remote-write

The grafana-agent application will push dashboards, metrics and alert rules to COS via those relations.
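To confirm that the integrations are in place, the relations in the Ceph model can be listed. The sketch below wraps `juju status --relations` in a hypothetical helper named `show_ceph_relations`; the `JUJU` variable is an assumption of this example that allows the command to be previewed without running it.

```shell
# JUJU defaults to the real client; set JUJU="echo juju" to preview the
# command without running it.
JUJU="${JUJU:-juju}"

# Hypothetical helper: show applications and relations in the Ceph model.
# $1 is the model name (defaults to "ceph", as assumed in this guide).
show_ceph_relations() {
    model="${1:-ceph}"
    $JUJU status -m "$model" --relations
}
```

Calling `show_ceph_relations ceph` should list the ceph-mon to grafana-agent relation on the cos-agent endpoint, along with the consumed COS offers.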

The legacy metrics-endpoint integration

As an alternative to the grafana-agent integration above, ceph-mon can be related directly to the prometheus application in COS via the metrics-endpoint relation. This transfers alert rules and configures metrics scraping only. Note that this integration should be considered legacy and might be deprecated in the future. Most users will want to use the grafana-agent integration.

The following steps assume a COS model named cos-lite and a Ceph model named ceph.

First, create an offer in the COS model:

juju offer -m cos-lite prometheus:metrics-endpoint

Then integrate in the Ceph model:

juju integrate -m ceph ceph-mon:metrics-endpoint admin/cos-lite.prometheus

Prometheus will then scrape metrics from Ceph and load the Ceph alerting rules.

The default alerting rules are taken from the upstream Ceph project. For convenience, they are provided in the reference section: Metrics and alerts.

Further reading on CMR

The Juju documentation on cross-model relations/integrations (CMR) is recommended reading for more details.

Dashboards

The cos-agent integration offers dashboards through the grafana-dashboard relation to COS.

Two of these dashboards provide statistics on RBD image I/O. However, collecting these statistics must be explicitly enabled due to the potential performance impact when dealing with a large number of pools. For more details, refer to the ceph-mon documentation.
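As an illustration, the sketch below enables RBD statistics for a chosen set of pools through charm configuration. It assumes that the relevant ceph-mon option is named `rbd-stats-pools` and takes a comma-separated list of pool names; confirm this against the output of `juju config ceph-mon` before relying on it. The helper name `enable_rbd_stats` and the `JUJU` variable are hypothetical.

```shell
# JUJU defaults to the real client; set JUJU="echo juju" to preview the
# command without running it.
JUJU="${JUJU:-juju}"

# Hypothetical helper. $1 is a comma-separated list of RBD pool names.
enable_rbd_stats() {
    pools="$1"
    # rbd-stats-pools is assumed to be the ceph-mon option that controls
    # per-image RBD I/O statistics collection.
    $JUJU config -m ceph ceph-mon rbd-stats-pools="$pools"
}
```

For example, `enable_rbd_stats pool1,pool2`, where pool1 and pool2 are placeholder pool names. Restricting collection to the pools of interest limits the performance impact mentioned above.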

Customise alerts

Alerting rules can be customised by means of a Juju charm resource, named alert-rules:

juju attach-resource ceph-mon alert-rules=<path/to/yaml/file>
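The sketch below shows what a custom rules file might look like and how it could be attached. It assumes that the resource accepts the standard Prometheus rule-file layout (groups containing rules) and uses the long-form `juju attach-resource` command name. The alert name `ExampleCephHealthError`, the group name, the helper names and the `JUJU` variable are hypothetical; `ceph_health_status` is the cluster health metric exported by Ceph, where a value of 2 corresponds to HEALTH_ERR.

```shell
# JUJU defaults to the real client; set JUJU="echo juju" to preview the
# command without running it.
JUJU="${JUJU:-juju}"

# Hypothetical helper: write an example rules file to the path given in $1.
# The layout is assumed to follow the Prometheus rule-file format.
write_example_rules() {
    cat > "$1" <<'EOF'
groups:
  - name: example-ceph-rules
    rules:
      - alert: ExampleCephHealthError
        expr: ceph_health_status == 2
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: Ceph cluster is in HEALTH_ERR state
EOF
}

# Hypothetical helper: attach the rules file in $1 to ceph-mon.
attach_rules() {
    $JUJU attach-resource -m ceph ceph-mon alert-rules="$1"
}
```

A possible sequence is `write_example_rules ./alert-rules.yaml` followed by `attach_rules ./alert-rules.yaml`, where the file path is a placeholder.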