OVN setup issue on the remaining machines of an LXD cluster


I have set up an LXD cluster with OVN. OVN is working fine on all 5 servers, but containers created on servers 4 and 5 cannot ping other containers. Some configuration steps seem to be missing from the documentation for setting up OVN with an LXD cluster. Can you please help with it?

Hello, can you please provide the version of LXD you’re using, as well as your LXD network configurations.

So that’s lxc network show <your network> for any related networks.

It might be helpful to see lxc network show <your network> --target <your server> as well to see if there’s any discrepancy on server 4 and server 5.

Also lxc network info <your network> and lxc network info <your network> --target <your server>.
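
In case it helps, here is a minimal sketch that gathers all of the requested output in one go. The function name collect_network_state is made up for illustration, and the network and member names are whatever your cluster uses; it only wraps the lxc network show and lxc network info commands mentioned above.

```shell
# Hypothetical helper: print the cluster-wide and per-member view of one network.
# Usage: collect_network_state <network> <member> [<member> ...]
collect_network_state() {
    net="$1"
    shift
    # Cluster-wide view first.
    echo "== lxc network show $net =="
    lxc network show "$net"
    echo "== lxc network info $net =="
    lxc network info "$net"
    # Then the same two commands targeted at each named member.
    for member in "$@"; do
        echo "== lxc network show $net --target $member =="
        lxc network show "$net" --target "$member"
        echo "== lxc network info $net --target $member =="
        lxc network info "$net" --target "$member"
    done
}
```

For this thread that would be something like collect_network_state ovnet server4 server5, run on any cluster member.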

root@server4:~# lxc network show ovnet
config:
  bridge.mtu: "1442"
  ipv4.address: 10.155.54.1/24
  ipv4.nat: "true"
  network: lxdbr0
  volatile.network.ipv4.address: 10.214.134.51
description: ""
name: ovnet
type: ovn
used_by:
- 20 containers
managed: true
status: Created
locations:
- 6 servers

root@server4:~# lxc network show ovnet --target server4
config:
  bridge.mtu: "1442"
  ipv4.address: 10.155.54.1/24
  ipv4.nat: "true"
  network: lxdbr0
  volatile.network.ipv4.address: 10.214.134.51
description: ""
name: ovnet
type: ovn
used_by:
- 20 containers
managed: true
status: Created
locations:
- 6 servers

root@server4:~# lxc network show ovnet --target server5
config:
  bridge.mtu: "1442"
  ipv4.address: 10.155.54.1/24
  ipv4.nat: "true"
  network: lxdbr0
  volatile.network.ipv4.address: 10.214.134.51
description: ""
name: ovnet
type: ovn
used_by:
- 20 containers
managed: true
status: Created
locations:
- 6 servers

root@server4:~# lxc network info ovnet
Name: ovnet
MAC address: 00:16:3e:3a:0f:f6
MTU: 1442
State: up
Type: broadcast

IP addresses:
  inet 10.155.54.1/24 (link)

Network usage:
  Bytes received: 0B
  Bytes sent: 0B
  Packets received: 0
  Packets sent: 0

OVN:
  Chassis: server2

root@server4:~# lxc network info ovnet --target server4
Name: ovnet
MAC address: 00:16:3e:3a:0f:f6
MTU: 1442
State: up
Type: broadcast

IP addresses:
  inet 10.155.54.1/24 (link)

Network usage:
  Bytes received: 0B
  Bytes sent: 0B
  Packets received: 0
  Packets sent: 0

OVN:
  Chassis: server2

root@server4:~# lxc network info ovnet --target server5
Name: ovnet
MAC address: 00:16:3e:3a:0f:f6
MTU: 1442
State: up
Type: broadcast

IP addresses:
  inet 10.155.54.1/24 (link)

Network usage:
  Bytes received: 0B
  Bytes sent: 0B
  Packets received: 0
  Packets sent: 0

OVN:
  Chassis: server2

The strange thing is that ping occasionally works from a container on server5, but it has never worked from a container on server4.

Please can you describe how you set up the cluster, along with host OS version, OVN version and LXD version.

Additionally please can you show the output of sudo ovn-nbctl show and sudo ovs-vsctl show on the problem hosts. Thanks

OS - Ubuntu 22.04 LTS
OVN - 2.17.8
LXD - 5.20-f3dd836

root@62.171.175.198:~# sudo ovn-nbctl show
ovn-nbctl: unix:/var/run/ovn/ovnnb_db.sock: database connection failed (No such file or directory)
root@62.171.175.198:~# sudo ovs-vsctl show
d1c0df42-afca-43c9-a737-8783f1e65b19
    Bridge lxdovn1
        Port lxdovn1
            Interface lxdovn1
                type: internal
        Port lxdovn1b
            Interface lxdovn1b
        Port patch-lxd-net30-ls-ext-lsp-provider-to-br-int
            Interface patch-lxd-net30-ls-ext-lsp-provider-to-br-int
                type: patch
                options: {peer=patch-br-int-to-lxd-net30-ls-ext-lsp-provider}
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port patch-br-int-to-lxd-net30-ls-ext-lsp-provider
            Interface patch-br-int-to-lxd-net30-ls-ext-lsp-provider
                type: patch
                options: {peer=patch-lxd-net30-ls-ext-lsp-provider-to-br-int}
        Port veth6327067e
            Interface veth6327067e
                error: "could not open network device veth6327067e (No such device)"
        Port veth5ea25464
            Interface veth5ea25464
                error: "could not open network device veth5ea25464 (No such device)"
        Port ovn-3ae638-0
            Interface ovn-3ae638-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="84.247.171.75"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="67", forwarding="true", remote_diagnostic="Neighbor Signaled Session Down", remote_state=up, state=up}
        Port br-int
            Interface br-int
                type: internal
        Port ovn-vmd116-1
            Interface ovn-vmd116-1
                type: geneve
                options: {csum="true", key=flow, remote_ip="89.117.61.135"}
                bfd_status: {diagnostic="Neighbor Signaled Session Down", flap_count="5", forwarding="true", remote_diagnostic="Control Detection Time Expired", remote_state=up, state=up}
        Port ovn-vmd116-0
            Interface ovn-vmd116-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="89.117.61.140"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="9", forwarding="true", remote_diagnostic="Neighbor Signaled Session Down", remote_state=up, state=up}
        Port ovn-vmd480-0
            Interface ovn-vmd480-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="144.91.74.36"}
                bfd_status: {diagnostic="No Diagnostic", flap_count="1", forwarding="true", remote_diagnostic="Control Detection Time Expired", remote_state=up, state=up}
        Port ovn-vmd117-0
            Interface ovn-vmd117-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="5.182.33.3"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="41", forwarding="true", remote_diagnostic="Neighbor Signaled Session Down", remote_state=up, state=up}
    ovs_version: "2.17.8"

root@144.91.74.36:~# sudo ovn-nbctl show
ovn-nbctl: unix:/var/run/ovn/ovnnb_db.sock: database connection failed (No such file or directory)
root@144.91.74.36:~# sudo ovs-vsctl show
3931e598-f183-43d0-a506-18ffed6bc8cc
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port patch-br-int-to-lxd-net30-ls-ext-lsp-provider
            Interface patch-br-int-to-lxd-net30-ls-ext-lsp-provider
                type: patch
                options: {peer=patch-lxd-net30-ls-ext-lsp-provider-to-br-int}
        Port ovn-3ae638-0
            Interface ovn-3ae638-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="84.247.171.75"}
                bfd_status: {diagnostic="Neighbor Signaled Session Down", flap_count="111", forwarding="true", remote_diagnostic="Control Detection Time Expired", remote_state=up, state=up}
        Port ovn-vmd509-0
            Interface ovn-vmd509-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="62.171.175.198"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="5", forwarding="true", remote_diagnostic="No Diagnostic", remote_state=up, state=up}
        Port ovn-vmd116-1
            Interface ovn-vmd116-1
                type: geneve
                options: {csum="true", key=flow, remote_ip="89.117.61.135"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="123", forwarding="true", remote_diagnostic="Neighbor Signaled Session Down", remote_state=up, state=up}
        Port br-int
            Interface br-int
                type: internal
        Port ovn-vmd117-0
            Interface ovn-vmd117-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="5.182.33.3"}
                bfd_status: {diagnostic="Neighbor Signaled Session Down", flap_count="59", forwarding="true", remote_diagnostic="Control Detection Time Expired", remote_state=up, state=up}
        Port ovn-vmd116-0
            Interface ovn-vmd116-0
                type: geneve
                options: {csum="true", key=flow, remote_ip="89.117.61.140"}
                bfd_status: {diagnostic="Control Detection Time Expired", flap_count="183", forwarding="true", remote_diagnostic="Neighbor Signaled Session Down", remote_state=up, state=up}
    Bridge lxdovn1
        Port lxdovn1
            Interface lxdovn1
                type: internal
        Port lxdovn1b
            Interface lxdovn1b
        Port patch-lxd-net30-ls-ext-lsp-provider-to-br-int
            Interface patch-lxd-net30-ls-ext-lsp-provider-to-br-int
                type: patch
                options: {peer=patch-br-int-to-lxd-net30-ls-ext-lsp-provider}
    ovs_version: "2.17.8"
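
A note on the ovn-nbctl errors above: by default ovn-nbctl looks for a local northbound database socket, which appears to be absent on these hosts ("No such file or directory"). A rough sketch of pointing it at the database that LXD itself is configured to use is below. The function name nb_show and the OVN_NBCTL override variable are hypothetical, and the sketch assumes the network.ovn.northbound_connection server setting is populated and that the connection needs no extra TLS options.

```shell
# Hypothetical helper: run "ovn-nbctl show" against the northbound database
# named in LXD's network.ovn.northbound_connection server setting.
# OVN_NBCTL is an assumed override for hosts where the binary has another name.
nb_show() {
    nb_conn="$(lxc config get network.ovn.northbound_connection)"
    if [ -z "$nb_conn" ]; then
        echo "network.ovn.northbound_connection is not set" >&2
        return 1
    fi
    # --db selects the database to contact instead of the local unix socket.
    "${OVN_NBCTL:-ovn-nbctl}" --db="$nb_conn" show
}
```

Run it as root (or via sudo) on any cluster member. If the connection string uses ssl:, ovn-nbctl will also need its certificate options, which this sketch does not handle.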

From OVN central:

root@vmd117741:/# sudo ovn-nbctl show
ovn-nbctl: unix:/var/run/ovn/ovnnb_db.sock: database connection failed ()
root@vmd117741:/# sudo ovn-sbctl show
Chassis vmd116805
    hostname: vmd116805.contaboserver.net
    Encap geneve
        ip: "89.117.61.140"
        options: {csum="true"}
    Port_Binding lxd-net30-instance-a1493d57-2c90-4189-97c8-68a489aef371-eth0
    Port_Binding lxd-net30-instance-6938ca4a-a770-4f34-adb2-f5f24bb11f14-eth0
    Port_Binding cr-lxd-net30-lr-lrp-ext
    Port_Binding lxd-net30-instance-f9f49515-1600-4f94-972d-c7f66cc936de-eth0
    Port_Binding lxd-net30-instance-a171329d-767b-4093-9d57-882b21bd1eca-eth0
    Port_Binding lxd-net30-instance-02863ebc-49aa-42d4-a201-eeead38fcc46-eth0
    Port_Binding lxd-net30-instance-2452ec61-642c-4ab6-a650-ef9b8849ab31-eth0
    Port_Binding lxd-net30-instance-b4309adc-9087-4dae-a5ac-7a0c243ff7f9-eth0
    Port_Binding lxd-net30-instance-20c0d8a0-32fc-4372-a670-c45fbeec215b-eth0
Chassis vmd50955
    hostname: vmd50955.contaboserver.net
    Encap geneve
        ip: "62.171.175.198"
        options: {csum="true"}
Chassis vmd116804
    hostname: vmd116804.contaboserver.net
    Encap geneve
        ip: "89.117.61.135"
        options: {csum="true"}
    Port_Binding lxd-net30-instance-976badc0-a4bf-4bd3-9906-5619c4b6615f-eth0
    Port_Binding lxd-net30-instance-85af886f-3ebf-4474-adfe-5da354c0cea1-eth0
    Port_Binding lxd-net30-instance-d3418a4d-2046-4890-8c53-69bc8f539c7d-eth0
    Port_Binding lxd-net30-instance-7f97203a-0bb7-4a5a-a4ba-3b609cbf52d2-eth0
    Port_Binding lxd-net30-instance-f94c1cc5-cd55-4640-8f12-72475cd9c4b4-eth0
    Port_Binding lxd-net30-instance-461eda10-13bb-4cc9-aae0-54b3011ad1e6-eth0
    Port_Binding lxd-net30-instance-455fd820-fca8-4ac9-8391-d8aeaba438e7-eth0
Chassis vmd117741
    hostname: vmd117741.contaboserver.net
    Encap geneve
        ip: "5.182.33.3"
        options: {csum="true"}
    Port_Binding lxd-net30-instance-7e3a33e4-0865-4eb4-b924-dd9b1470ed56-eth0
    Port_Binding lxd-net30-instance-6d67a874-39e9-48c8-a147-2e8a58e78592-eth0
    Port_Binding lxd-net30-instance-ed14f8fb-5efe-4cff-bd16-5a41a76cbc48-eth0
    Port_Binding lxd-net30-instance-e7327020-4ee7-4ab0-8231-c379bb58da89-eth0
    Port_Binding lxd-net30-instance-73096f44-dd21-45ea-93b4-5af94987fb26-eth0
    Port_Binding lxd-net30-instance-bd2647f4-47dd-4df9-b892-e3694eaa12b5-eth0
    Port_Binding lxd-net30-instance-e7e3ceae-38fb-4bd2-b9cd-d0ca99b7459d-eth0
Chassis vmd48074
    hostname: vmd48074.contaboserver.net
    Encap geneve
        ip: "144.91.74.36"
        options: {csum="true"}
Chassis "3ae63804-63f7-4645-8bb1-6a632a0cae86"
    hostname: vmd127803.contaboserver.net
    Encap geneve
        ip: "84.247.171.75"
        options: {csum="true"}
    Port_Binding lxd-net30-instance-dec65987-b49d-418f-9c02-f4310fe2cdc5-eth0
    Port_Binding lxd-net30-instance-ff89afa1-9eca-4593-b534-4d48c2ffaf7e-eth0
    Port_Binding lxd-net30-instance-aaa51dc3-a159-41bf-92c2-97032a7f49e0-eth0
    Port_Binding lxd-net30-instance-64679ce5-29f1-4635-81b1-ca244069e319-eth0
    Port_Binding lxd-net30-instance-df06eaf9-6f5c-47d1-91d0-4f7980aba35a-eth0
    Port_Binding lxd-net30-instance-d4e72f86-18a1-4e25-a2f3-43e83f80620b-eth0
    Port_Binding lxd-net30-instance-02deff27-5d6c-4f01-b602-75dcbf59bfe8-eth0
    Port_Binding lxd-net30-instance-bec3d7e9-a314-4143-bca0-8d81c10f944b-eth0

These downed Geneve tunnels are likely the issue: they are what carries packets between cluster members.
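
To see at a glance which tunnels are the least stable, the bfd_status lines in the ovs-vsctl show output can be boiled down to one line per tunnel. In the captures above the sessions report state=up, but the flap_count values are high, which suggests they keep dropping and recovering. The following is only a sketch: bfd_summary is a made-up name, and it assumes the text layout of ovs-vsctl show seen in this thread (Interface, options and bfd_status lines in that order).

```shell
# Hypothetical helper: read "ovs-vsctl show" output on stdin and print
# "<flap_count> <interface> <remote_ip>" per tunnel, most flaps first.
bfd_summary() {
    awk '
        # A new Interface stanza starts: remember its name, reset the peer IP.
        $1 == "Interface" { iface = $2; ip = "-" }
        # Tunnel options carry the remote endpoint address.
        $1 == "options:" && match($0, /remote_ip="?[0-9a-fA-F.:]+/) {
            ip = substr($0, RSTART, RLENGTH)
            sub(/^remote_ip="?/, "", ip)
        }
        # bfd_status carries the number of BFD session flaps.
        $1 == "bfd_status:" && match($0, /flap_count="?[0-9]+/) {
            flaps = substr($0, RSTART, RLENGTH)
            sub(/^flap_count="?/, "", flaps)
            print flaps, iface, ip
        }
    ' | sort -rn
}
```

Usage would be sudo ovs-vsctl show | bfd_summary on each host. If particular peers stand out, one thing worth checking is whether UDP port 6081 (the Geneve port) is allowed in both directions between those hosts.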

This is the 6th node, which was added later, and all containers on it work fine. This node has only the OVS switch.
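
One more cross-check against the ovn-sbctl show output above: the chassis for 62.171.175.198 (vmd50955) and 144.91.74.36 (vmd48074) list no Port_Binding entries at all, and on 62.171.175.198 ovs-vsctl reports veth ports with "No such device". Whether that is a cause or a symptom is not established here, but it can be tallied quickly. The sketch below uses a made-up function name, chassis_bindings, and assumes the ovn-sbctl show text layout seen in this thread.

```shell
# Hypothetical helper: read "ovn-sbctl show" output on stdin and print
# "<port_binding_count> <chassis> <encap_ip>" for every chassis.
chassis_bindings() {
    awk '
        # Print the tally for the chassis collected so far, if any.
        function emit() { if (name != "") print count, name, ip }
        $1 == "Chassis"      { emit(); name = $2; gsub(/"/, "", name); count = 0; ip = "-" }
        $1 == "ip:"          { ip = $2; gsub(/"/, "", ip) }
        $1 == "Port_Binding" { count++ }
        END                  { emit() }
    '
}
```

Usage would be sudo ovn-sbctl show | chassis_bindings on the OVN central host. A chassis that hosts running containers on the OVN network but shows a count of 0 is worth a closer look.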