I have a deployment making use of the storage_device ams configuration option and the subnet_prefix_length ams-lxd configuration option. I can reconfigure our nodes to use a single storage device, which resolves the first issue, but deploying smaller LXD subnets is a requirement for future deployments. I had not noticed the change initially, since the operator framework migration is listed as an improvement while it simultaneously deprecates many configuration options…
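For reference, this is roughly how those options had been set before the migration; the application names and values below are placeholders from our deployment and may not match every bundle:

# Placeholder application names and values; adjust to match the bundle.
juju config ams storage_device=/dev/nvme1n1
juju config ams-lxd subnet_prefix_length=26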
It’s worth noting the documentation on bare metal deployments still lists configuration options that have since been removed:
Are there plans to reimplement some of the many removed configuration options, and what options do I have for modifying the LXD network size now?
Yes, that was overlooked but is on our list to be reworked. You will see an update shortly.
Can you give us some insight into what exactly you need to modify the subnet size for? By default we use a /24 for the local bridge. In the past we had an option allowing a different size to support a large number of instances per machine (> 255), but that turned out not to be of any use, and actual deployments at that scale are structured differently.
@morphis I would agree that a /24 is good for many deployments; however, I was reducing the subnet to a /26. In combination with this I was using NAT rules to expose the containers directly to the network by dedicating entire networks to an additional interface on the ams nodes.
What this means in turn is that each ams node needs its own dedicated /24 network on our LAN, even though the actual number of containers our bare metal nodes can run is much smaller. With a /26 (62 usable host addresses versus 254 for a /24), the address count is much closer to the number of instances that can actually run on a node.
On this note, a true fix for this issue would be to allow the instances to connect directly to the network without requiring NAT rules at all, but based on this post NAT seems to be the only option.
I see, you want the instances directly exposed to the external network? That is not what we have currently designed things for. The current assumption in the design is that NAT is the only way to expose instances to the external network, as that way there is strict control over what is accessible and what is not.
Can you tell us a bit more about why NAT doesn’t work for you? I guess you want to expose the instances to make their service endpoints accessible from outside?
NAT vs. directly connected containers is not the biggest deal; direct connections would just make things simpler, since we wouldn’t have to deploy NAT rules to the LXD nodes. In addition, for connecting to services on the instances it’s imperative that we can attribute each instance’s outgoing traffic to it by its source address.
Example NAT:
ip daddr != {LANNETWORK} oif "bond1" snat ip prefix to ip saddr map { 192.168.100.0/26 : {LANNETWORK}/26 }
We know that all instances stood up will belong to 192.168.100.0/26; with a /20 the NAT would encompass many more addresses that would likely never be used. Each LXD node needs to be mapped to a different LAN network segment to avoid overlap. In previous versions I was using the subnet_prefix_length configuration option as it was easily available. Perhaps there’s an easier way to limit the scope of deployed instances through an instance count, to ensure instances are load balanced to other nodes instead of relying on the bridge network size?
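Roughly, the deployment on a node looks like the following; the file path, the interface name and the concrete LAN prefixes are placeholders for our environment, not real values:

# Placeholders: bond1 is the LAN-facing interface, 10.0.0.0/8 stands in for
# the LAN as a whole and 10.17.5.0/26 for the LAN segment dedicated to this
# node. Prefix-to-prefix snat maps need a reasonably recent nftables.
sudo tee /etc/nftables-ams-nat.conf >/dev/null <<'EOF'
table ip ams_nat {
    chain postrouting {
        type nat hook postrouting priority srcnat; policy accept;
        # 1:1-map each instance address in the local /26 onto the matching
        # address in the LAN segment dedicated to this node.
        ip daddr != 10.0.0.0/8 oif "bond1" snat ip prefix to ip saddr map { 192.168.100.0/26 : 10.17.5.0/26 }
    }
}
EOF
sudo nft -f /etc/nftables-ams-nat.conf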
By default we use a /24 for the local bridge.
I was doing some testing on a fresh 1.24 instance with no configuration options and it seems like it’s using a /20. Did this change in 1.25? I can’t seem to get that version running on my hardware quite yet to test.
sudo lxc network list
+---------------+----------+---------+------------------+------+--------------------+---------+---------+
| NAME          | TYPE     | MANAGED | IPV4             | IPV6 | DESCRIPTION        | USED BY | STATE   |
+---------------+----------+---------+------------------+------+--------------------+---------+---------+
| amsbr0        | bridge   | YES     | 192.168.100.1/20 | none | AMS network bridge | 0       | CREATED |
+---------------+----------+---------+------------------+------+--------------------+---------+---------+
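In theory the bridge could also be inspected or shrunk out-of-band, though this is only a guess at a workaround and I assume AMS may not tolerate manual changes to amsbr0:

sudo lxc network show amsbr0
# Only a guess at a workaround; AMS manages this bridge and may reject or
# revert an out-of-band change.
sudo lxc network set amsbr0 ipv4.address 192.168.100.1/26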
Hopefully I don’t sound too crazy; I know it’s a bit of a strange use case.
Yes, we reduced the size of the subnet for the local bridge from a /20 to a /24. We used a /20 because in earlier days we were running higher density installations which required more than 255 instances per machine. However, those days are gone and a /24 should be more than enough; if more instances are needed, different setups are required.
I think you have reported Bug #2110311 (“Missing Dependencies during Fresh Baremetal Instal...”) against Anbox Cloud for this. The main problem seems to be how things are configured in the bundles; see the bug for more details.