Add options for configuring a dedicated Ceph network in MicroCloud

Index LX072
Title Add options for configuring a dedicated Ceph network in MicroCloud
Subteam LXD
Status Braindump
Authors Gabriel Mougard
Stakeholders Maria Seralessandri Thomas Parrott
Type Implementation
Created Mar 28, 2024

Abstract

Introduce new MicroCloud initialization options to allow users to configure dedicated networks to be used by Ceph.

Rationale

When deploying a cluster of machines with Ceph nodes alongside other services, it is important to be able to configure the network interfaces used by Ceph for public and cluster traffic. Some network interfaces might be tailored for high throughput and low latency, like 100GbE+ QSFP links, while others might be better suited for ‘management’ traffic, like 1GbE or 10GbE links.

In Ceph terminology, the public network carries client and monitor traffic (the ‘management’ network, which is not particularly data intensive), while the cluster network carries OSD and replication traffic (this network is data intensive and is better set up on fast links). By default, Ceph uses the same network for both types of traffic in a MicroCloud deployment, but it would be preferable to be able to configure a separate network for each type of traffic if the cluster has enough network interfaces to sustain them.
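For reference, this split maps directly onto Ceph’s own public network and cluster network options; in a ceph.conf it would look something like this (illustrative documentation subnets):

[global]
public network = 192.0.2.0/24
cluster network = 198.51.100.0/24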

Specification

Design: Bootstrapping process

  1. Refactor the MicroCloud Bootstrapping logic to allow for the configuration of the public and cluster networks. MicroCeph is already able to accept custom parameters during its Bootstrapping process: https://github.com/canonical/microceph/pull/266

  2. During the interactive configuration phase, after the storage-related questions and before the OVN networking-related questions, we will ask two new questions:

What subnet (IPv4/IPv6 CIDR notation) would you like your Ceph public traffic on? [default=<microcloud_internal_subnet>]

and

What subnet (IPv4/IPv6 CIDR notation) would you like your Ceph internal traffic on? [default=<microcloud_internal_subnet>]

With these questions, the user can either skip this Ceph networking configuration entirely by leaving both responses empty, or choose to specify the subnets. If a custom subnet is provided, MicroCloud will check that every cluster member has at least one network interface with an IP address falling within this subnet; otherwise, the setup will fail. If a member has more than one such interface, the interface used for the network traffic will be determined by the OS’s IP routing table (a sketch of this check is shown below). The user can still choose to bind an OSD to a specific IP address on the network:

[osd]

osd bind ip = 10.10.10.10

However, this is out of scope, as it would need to be done manually by an administrator rather than by MicroCloud.
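As a rough illustration of the subnet check mentioned above, the following Go sketch verifies that the local machine has at least one interface address inside a given CIDR; the helper name and overall shape are hypothetical and not the actual MicroCloud implementation:

package main

import (
	"fmt"
	"net"
	"net/netip"
)

// hasAddressInSubnet reports whether any local interface address falls
// within the given CIDR (hypothetical helper; not MicroCloud's actual code).
func hasAddressInSubnet(cidr string) (bool, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return false, fmt.Errorf("invalid subnet %q: %w", cidr, err)
	}

	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return false, err
	}

	for _, addr := range addrs {
		ipNet, ok := addr.(*net.IPNet)
		if !ok {
			continue
		}

		ip, ok := netip.AddrFromSlice(ipNet.IP)
		if !ok {
			continue
		}

		// Unmap 4-in-6 addresses so IPv4 prefixes match correctly.
		if prefix.Contains(ip.Unmap()) {
			return true, nil
		}
	}

	return false, nil
}

func main() {
	// During bootstrap, a check like this would run on every cluster member
	// for each custom subnet the user provided.
	ok, err := hasAddressInSubnet("192.0.2.0/24")
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println("member has an interface in the Ceph subnet:", ok)
}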

Here is a picture that describes three scenarios that will hopefully be useful:

Design: Bootstrapping process with a YAML preseed file

We would introduce the ceph section in the preseed.yaml structure with the following parameters:

...
# `ceph` is optional and represents the Ceph network configuration
ceph:
  # Subnet (CIDR notation) for the Ceph public network (IPv4 or IPv6 format)
  public_network: 192.0.2.0/24 # or `2001:db8:1::/64`
  # Subnet (CIDR notation) for the Ceph internal (cluster) network (IPv4 or IPv6 format)
  internal_network: 198.51.100.0/24 # or `2001:db8:2::/64`

The validation rules are the same as explained before.
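To make the shape of this section concrete, here is a minimal sketch of how it could be parsed and validated, assuming a yaml.v3-based preseed parser; the type and field names are illustrative and not the actual MicroCloud preseed types:

package main

import (
	"fmt"
	"net/netip"

	"gopkg.in/yaml.v3"
)

// CephConfig models the optional `ceph` section of the preseed file
// (illustrative type, not the actual MicroCloud preseed structure).
type CephConfig struct {
	PublicNetwork   string `yaml:"public_network"`
	InternalNetwork string `yaml:"internal_network"`
}

// Validate checks that any provided subnet is valid CIDR notation.
// An empty value means "fall back to the MicroCloud internal subnet".
func (c CephConfig) Validate() error {
	for _, subnet := range []string{c.PublicNetwork, c.InternalNetwork} {
		if subnet == "" {
			continue
		}

		if _, err := netip.ParsePrefix(subnet); err != nil {
			return fmt.Errorf("invalid Ceph subnet %q: %w", subnet, err)
		}
	}

	return nil
}

func main() {
	data := []byte(`
ceph:
  public_network: 192.0.2.0/24
  internal_network: 198.51.100.0/24
`)

	var preseed struct {
		Ceph *CephConfig `yaml:"ceph"`
	}

	if err := yaml.Unmarshal(data, &preseed); err != nil {
		fmt.Println("parse error:", err)
		return
	}

	if preseed.Ceph == nil {
		fmt.Println("no ceph section: use the MicroCloud internal subnet for both networks")
		return
	}

	fmt.Printf("ceph section: %+v, valid: %v\n", preseed.Ceph, preseed.Ceph.Validate() == nil)
}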

Design: Adding a new machine

When adding a new machine to the cluster, if dedicated Ceph networks have been configured in the already bootstrapped cluster, we need to validate that the new machine has at least one interface in each dedicated Ceph network CIDR range. Let’s take an example with the above picture:

  • If a partially disaggregated Ceph has been bootstrapped, we only need to perform one check: the presence, on the cluster members to be added, of dedicated interfaces allowing both public and internal Ceph traffic to flow.
  • If a fully disaggregated Ceph has been bootstrapped, we need to perform two checks: the presence of dedicated interfaces for Ceph internal traffic and of dedicated interfaces for Ceph public traffic.

If one check is not successful, we error out and ask the user to reconfigure the network interfaces on the new machines.

Note: even if the added cluster members are configured with no OSDs while dedicated Ceph subnets have been set in the bootstrapped cluster, we still enforce the requirement to have dedicated interfaces on the cluster members to be added. This avoids Ceph cluster data integrity issues if, for example, disks are later added to cluster members that have no dedicated interfaces to communicate with the rest of the Ceph cluster.
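As a rough sketch of this join-time validation (names and signatures are illustrative, not the actual MicroCloud code), assuming the dedicated subnets are read from the bootstrapped cluster’s configuration and the joiner’s addresses have already been collected:

package main

import (
	"fmt"
	"net/netip"
)

// validateJoinerCephNetworks checks that a joining member exposes at least
// one address in every dedicated Ceph subnet configured on the bootstrapped
// cluster (illustrative sketch; an empty subnet means "not dedicated").
func validateJoinerCephNetworks(memberAddrs []netip.Addr, subnets map[string]string) error {
	for traffic, subnet := range subnets {
		if subnet == "" {
			continue // No dedicated subnet configured for this traffic type.
		}

		prefix, err := netip.ParsePrefix(subnet)
		if err != nil {
			return fmt.Errorf("invalid Ceph %s subnet %q: %w", traffic, subnet, err)
		}

		found := false
		for _, addr := range memberAddrs {
			if prefix.Contains(addr.Unmap()) {
				found = true
				break
			}
		}

		if !found {
			return fmt.Errorf("no interface with an address in the Ceph %s subnet %q: please reconfigure the network interfaces on this machine", traffic, subnet)
		}
	}

	return nil
}

func main() {
	addrs := []netip.Addr{netip.MustParseAddr("192.0.2.10"), netip.MustParseAddr("10.0.0.5")}

	// Fully disaggregated example: both public and internal subnets are dedicated.
	err := validateJoinerCephNetworks(addrs, map[string]string{
		"public":   "192.0.2.0/24",
		"internal": "198.51.100.0/24",
	})

	fmt.Println(err) // The internal check fails for this member.
}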

API changes

No API changes.

CLI changes

  • New questions in the interactive bootstrap process (see the Design section above).

Database changes

No database changes.

Discussion

The question about distributed networking ("Configure distributed networking? (yes/no) [default=yes]: ") selects the uplink networks with no IPs assigned or bridges, so by default the interface used by Ceph is not listed.

You’re right. I’ll remove this comment. Thanks!

As discussed, since these options are mutually exclusive, we should just have a single network setting, like we did for the DNS settings: https://github.com/canonical/microcloud/pull/228

As discussed in our 1:1:

  • The new question should default to false (the same as we’re doing for the OVN network traffic question).
  • Please check which word/term is used for cluster member/node and use it consistently in the questions.

Hey, just as a heads-up: separating public and cluster networks for Ceph has been a relatively frequently requested feature for performance and security reasons, so I wonder if it wouldn’t be better to enable separating them.

Hi, thanks for the heads-up. Our concern is to avoid an over-complex MicroCloud init process and to offer a sensible default config. In case the customer wants to further separate the networks, as far as I know, they can easily change the cluster network via (Micro)Ceph, while the public one can only be configured at startup. Is that correct?


@maria-seralessandri that’s right, the cluster network can be changed post-install – although it does entail an OSD restart, so there’s some user impact.