Ubuntu Core MicroCloud cluster: stale data in the OVN southbound database (ovn-sbctl), need to edit/fix

In August 2025 I built an Ubuntu Core 24 MicroCloud cluster on four Raspberry Pi 5s: rp0, rp1, rp2, and rp3.
In October I removed rp3 to use it for a new project, since it was only ever running as a chassis.
Over the Christmas weekend rp0 started misbehaving: long delays in executing every command,
LXD not being available, unreliable pings to api.snapcraft.io (1500 pings, 237 replies), and failures
pinging its own IP because of lookup failures. I tried for days to fix it.
Today, January 5th, I got the idea that since I still had the SD card for rp3, I could reconfigure
the cluster by removing rp0 and re-attaching rp3, once again having the required three
machines working smoothly.

I removed rp0 from the cluster, shut down all machines, and took the two NVMe drives out of rp0. Using a USB cradle,
I formatted each to “no filesystem” with GNOME Disks on my laptop, then put them and the original rp3 SD card
back into what had been rp0.

I restarted all three machines. All of them were responsive, pinging, etc. I updated all snaps on
all three machines (they all look just like the snap list below), let them run for two hours to settle down,
and then ran microcloud add.

What follows is the output from rp1, with selected output from rp2 and rp3.
Lines containing my comments, conclusions, and questions are preceded with #.
In some places I used ( and ) to indicate deletions/modifications of sensitive data.

ubuntu-one-sci@rp1:~$ snap list
Name Version Rev Tracking Publisher Notes
adguard-home v0.107.71 8757 latest/stable ameshkov✓ -
console-conf 24.04.1+git45g5f9fae19+g7598200 80 24/stable canonical✓ -
core22 20251105 2194 latest/stable canonical✓ base
core24 20251026 1244 latest/stable canonical✓ base
lxd 6.6-2dcd56e 37190 latest/stable canonical✓ in-cohort,held
microceph 19.2.1+snap74c0060321 1585 squid/stable canonical✓ in-cohort,held
microcloud 2.1.1-d49bea6 1842 2/stable canonical✓ in-cohort,held
microovn 24.03.6+snap05d4342007 934 24.03/stable canonical✓ in-cohort,held
pi 24-3 151 24/stable canonical✓ gadget
pi-kernel 6.8.0-1043.47 1069 24/stable canonical✓ kernel
snapd 2.72 25585 latest/stable canonical✓ snapd
ubuntu-one-sci@rp1:~$ sudo microcloud add
Waiting for services to start …
! Warning: Discovered non-LTS version "6.6" of LXD
Use the following command on systems that you want to join the cluster:

microcloud join

When requested, enter the passphrase:

passphrase and cert string deleted in this document

Verify the fingerprint (fingerprint deleted in this document) is displayed on joining systems.

Selected rp3 at 192.168.1.60

Gathering system information…
Would you like to set up local storage? (yes/no) [default=yes]:

Using /dev/disk/by-id/nvme-nvme.1e4b- (rest of id deleted in this document) on rp3 for local storage pool

Would you like to set up distributed storage? (yes/no) [default=yes]:

Using 1 disk(s) already setup on rp2 for remote storage pool
Using 1 disk(s) already setup on rp1 for remote storage pool

Using 1 disk(s) on rp3 for remote storage pool

Do you want to encrypt the selected disks? (yes/no) [default=no]:

Using (device id deleted in this document) on rp3 for OVN uplink

Configure dedicated OVN underlay networking? (yes/no) [default=no]:
Initializing new services …
Awaiting cluster formation …
⨯ Error: System "rp3" failed to join the cluster: Failed to update cluster status of services: Failed to join "MicroOVN" cluster: Post "https://192.168.1.30:6443/core/internal/hooks/on-new-member?target=rp0": Unable to connect to "192.168.1.30:6443": dial tcp 192.168.1.30:6443: connect: no route to host

192.168.1.30 is the IP for rp0, which is no longer a member of the cluster.

To find out what was going on, I ran:

ubuntu-one-sci@rp2:~$ sudo microcloud status

Status: ERROR

┃ ⨯ LXD is not found on rp3
┃ ⨯ MicroCloud members not found in LXD: rp3
┃ ⨯ MicroOVN is not available on rp3
┃ ! MicroOVN is not found on rp3

┌──────┬──────────────┬──────┬─────────────────┬────────────────────────┬─────────────┐
│ Name │ Address │ OSDs │ MicroCeph Units │ MicroOVN Units │ Status │
├──────┼──────────────┼──────┼─────────────────┼────────────────────────┼─────────────┤
│ rp1 │ 192.168.1.40 │ 1 │ mds,mgr,mon │ central,chassis,switch │ ONLINE │
│ rp2 │ 192.168.1.50 │ 1 │ mds,mgr,mon │ central,chassis,switch │ ONLINE │
│ rp3 │ 192.168.1.60 │ 1 │ mds,mgr,mon │ - │ UNREACHABLE │
└──────┴──────────────┴──────┴─────────────────┴────────────────────────┴─────────────┘

On rp3, microcloud status shows this:

Status: ERROR

┃ ⨯ LXD is not found on rp3
┃ ! MicroOVN is not found on rp3

┌──────┬──────────────┬──────┬─────────────────┬────────────────────────┬────────┐
│ Name │ Address │ OSDs │ MicroCeph Units │ MicroOVN Units │ Status │
├──────┼──────────────┼──────┼─────────────────┼────────────────────────┼────────┤
│ rp1 │ 192.168.1.40 │ 1 │ mds,mgr,mon │ central,chassis,switch │ ONLINE │
│ rp2 │ 192.168.1.50 │ 1 │ mds,mgr,mon │ central,chassis,switch │ ONLINE │
│ rp3 │ 192.168.1.60 │ 1 │ mds,mgr,mon │ - │ ONLINE │
└──────┴──────────────┴──────┴─────────────────┴────────────────────────┴────────┘

ubuntu-one-sci@rp2:~$ sudo microovn.ovn-sbctl show
Chassis rp1
hostname: rp1
Encap geneve
ip: "192.168.1.40"
options: {csum="true"}
Chassis rp2
hostname: rp2
Encap geneve
ip: "192.168.1.50"
options: {csum="true"}
Port_Binding cr-lxd-net2-lr-lrp-ext
Chassis rp0
hostname: rp0
Encap geneve
ip: "192.168.1.30"
options: {csum="true"}

The problem is that chassis rp0 should read rp3, and its IP should be 192.168.1.60 instead of 192.168.1.30.
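
To make the mismatch easier to spot, here is a minimal sketch that compares the chassis names in the show output with the member names I expect. The function name `stale_chassis` and the expected member list are my own illustration, not part of MicroOVN. It only reads text on stdin and changes nothing in the database.

```shell
#!/bin/sh
# Print chassis names from `ovn-sbctl show` output (read on stdin) that are
# not in the expected member list given as the first argument.
# `stale_chassis` is a hypothetical helper name, not a MicroOVN command.
stale_chassis() {
    expected="$1"
    awk -v expected="$expected" '
        BEGIN {
            # Build a lookup table of the expected member names.
            n = split(expected, names, " ")
            for (i = 1; i <= n; i++) ok[names[i]] = 1
        }
        $1 == "Chassis" {
            name = $2
            gsub(/"/, "", name)          # names may or may not be quoted
            if (!(name in ok)) print name
        }
    '
}

# Sample input shortened from the output above.
stale_chassis "rp1 rp2 rp3" <<'EOF'
Chassis rp1
hostname: rp1
Chassis rp2
hostname: rp2
Port_Binding cr-lxd-net2-lr-lrp-ext
Chassis rp0
hostname: rp0
EOF
# prints: rp0
```

On a live member, the same function could presumably be fed directly with `sudo microovn.ovn-sbctl show | stale_chassis "rp1 rp2 rp3"`.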

How do I edit/fix this in the OVN southbound database? I have been all over the docs and tutorials, nada…

I think that, because of the problems with rp0, either it did not get a clean removal from the cluster, or the SD card is the source of the configuration that includes rp0. As I mentioned above, I formatted the two NVMe drives to no filesystem. But the status reports do not mention rp0 at all, which is what makes me suspect something on the SD card.

How serious is that warning about LXD 6.6 not being LTS?

What you really seem to be saying is:

“Sometimes machines on my network have unexpected IP addresses, and that causes havoc on my cluster.” All the problems you describe seem to cascade from that network issue.

Yeah, that happens sometimes with dynamic IP addresses (DHCP).

Have you looked into using static IP networking? Or mapping MAC addresses to IP addresses in the router settings?

I only use static IPs on my network; even the three IPs that come from my provider are static. DHCP is disabled in all routers. That is why I suspect pre-existing configuration on the SD card is telling it to use the IP that was used by rp0. (Note: to prevent collisions, rp0 is shut down so that I can get this fixed without it interfering with this misconfiguration and presenting red-herring symptoms.)