Setting linux.sysctl.net.core.wmem_max results in No such file or directory - Failed to setup sysctl parameters

Hello
I want to configure a few sysctl parameters, but when I do, the container fails to start with the following log:

lxc juju-dd86b8-0 20240215133341.422 ERROR    conf - ../src/src/lxc/conf.c:setup_sysctl_parameters:3348 - No such file or directory - Failed to setup sysctl parameters net.core.wmem_max to 134217728
lxc juju-dd86b8-0 20240215133341.423 ERROR    conf - ../src/src/lxc/conf.c:lxc_setup:4504 - Failed to setup sysctl parameters
lxc juju-dd86b8-0 20240215133341.423 ERROR    start - ../src/src/lxc/start.c:do_start:1272 - Failed to setup container "juju-dd86b8-0"
lxc juju-dd86b8-0 20240215133341.423 ERROR    sync - ../src/src/lxc/sync.c:sync_wait:34 - An error occurred in another process (expected sequence number 4)
lxc juju-dd86b8-0 20240215133341.511 WARN     network - ../src/src/lxc/network.c:lxc_delete_network_priv:3631 - Failed to rename interface with index 0 from "eth0" to its initial name "vethfea7c946"
lxc juju-dd86b8-0 20240215133341.511 ERROR    lxccontainer - ../src/src/lxc/lxccontainer.c:wait_on_daemonized_start:878 - Received container state "ABORTING" instead of "RUNNING"
lxc juju-dd86b8-0 20240215133341.512 ERROR    start - ../src/src/lxc/start.c:__lxc_start:2107 - Failed to spawn container "juju-dd86b8-0"
lxc juju-dd86b8-0 20240215133341.512 WARN     start - ../src/src/lxc/start.c:lxc_abort:1036 - No such process - Failed to send SIGKILL via pidfd 17 for process 415705
lxc 20240215133341.202 ERROR    af_unix - ../src/src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20240215133341.202 ERROR    commands - ../src/src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_init_pid"

Is it not possible to set these values on an LXC container, or am I missing something?

It seems that net.core.wmem_max is not namespace aware and you can’t even see the file inside the container:

$ lxc shell c1
# sysctl net.core.wmem_max
sysctl: cannot stat /proc/sys/net/core/wmem_max: No such file or directory

@amikhalitsyn does that sound accurate?

Is this intended or something that can be fixed?

I want to run software that needs to read the values of

net.core.rmem_default
net.core.rmem_max
net.core.wmem_default
net.core.wmem_max

It would be optimal to set them for the container but it might be sufficient to be able to read the values set on the host.
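A quick way to see which of these four keys are visible from inside the container is to look for the matching files under /proc/sys. The following is only a minimal sketch: `report_sysctls` is a made-up helper name, and the simple dot-to-slash conversion is an assumption that holds for these keys but not for sysctl names that contain dots in an interface name.

```sh
#!/bin/sh
# report_sysctls is a hypothetical helper written for this example.
# Usage: report_sysctls ROOT KEY...
#   ROOT is the sysctl root directory (normally /proc/sys),
#   KEY  is a dotted sysctl name such as net.core.wmem_max.
report_sysctls() {
    root=$1
    shift
    for key in "$@"; do
        # net.core.wmem_max -> net/core/wmem_max
        path="$root/$(printf '%s' "$key" | tr '.' '/')"
        if [ -f "$path" ] && [ -r "$path" ]; then
            printf '%s = %s\n' "$key" "$(cat "$path")"
        else
            printf '%s: not visible in this namespace\n' "$key"
        fi
    done
}

# Check the four keys from this thread against the live /proc/sys.
report_sysctls /proc/sys \
    net.core.rmem_default net.core.rmem_max \
    net.core.wmem_default net.core.wmem_max
```

Running the same script on the host and inside the container shows the difference directly: the host prints values, while the container reports the keys as not visible.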

Absolutely. This thing is not net namespaced:
https://github.com/torvalds/linux/blob/b401b621758e46812da61fa58a67c3fd8d91de0d/net/core/sysctl_net_core.c#L377

Is this intended or something that can be fixed?

Good question. This is something we can look into, at least to allow reading these values inside the container.

Can you tell us which piece of software you are running inside the container that requires this?

I can explain why I’m asking this. This is not just curiosity (it is, but not only!).
The problem is that we can’t just “namespacify” whatever we want, send a patch to the Linux kernel mailing list, and get it approved and applied. Any change that goes into the Linux kernel should be reasonable and well motivated and have a good, correct use case. So having the name of a piece of software (ideally open source) that needs it is a good and valid argument too.

Can you tell us which piece of software you are running inside the container that requires this?

I’m trying to run a Solana blockchain validator client.
https://docs.solanalabs.com/operations/setup-a-validator#system-tuning
https://github.com/solana-labs/solana

The software tries to read these values at startup and exits when it cannot.

Any change that goes into the Linux kernel should be reasonable and well motivated and have a good, correct use case. So having the name of a piece of software (ideally open source) that needs it is a good and valid argument too.

I understand and it sounds reasonable. It will be interesting to see if this is enough motivation.

Cool! Makes sense. I’ve put it on my TODO list and will check whether we can support this and send it upstream.

From a practical standpoint, I would suggest you report this to the Solana developers too, and ask them what they think about changing Solana itself to emit a warning instead of an error. You can link this topic to give them a bit of extra context.
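To illustrate the warn-instead-of-fail idea, here is a rough shell sketch. It is not Solana's code and makes no claim about how Solana performs its check: `check_sysctl_min` is a hypothetical helper, and the threshold 134217728 is simply the value from the original post. The point is only that "cannot read the file" and "value is too small" can be treated as different outcomes.

```sh
#!/bin/sh
# check_sysctl_min is a hypothetical helper, not part of Solana or LXC.
# Usage: check_sysctl_min ROOT KEY MINIMUM
# Returns 0 when the value is readable and at least MINIMUM.
# Returns 1 when the value is readable but too small.
# Returns 0 with a warning on stderr when the file cannot be read at all,
# e.g. because the key is not visible in this network namespace.
check_sysctl_min() {
    root=$1
    key=$2
    min=$3
    path="$root/$(printf '%s' "$key" | tr '.' '/')"
    if [ ! -r "$path" ]; then
        printf 'warning: cannot read %s, skipping check\n' "$key" >&2
        return 0
    fi
    value=$(cat "$path")
    if [ "$value" -lt "$min" ]; then
        printf 'error: %s is %s, need at least %s\n' "$key" "$value" "$min" >&2
        return 1
    fi
    return 0
}

# Example call against the live /proc/sys.
check_sysctl_min /proc/sys net.core.wmem_max 134217728 || echo "tuning needed"
```

With this split, a container where the key is hidden only produces a warning, while a host with a value that is too low still gets a hard failure.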

The problem is that even if this change to containerize wmem/rmem landed in the upstream Linux kernel today, it would take too long to reach the stable Linux distros (6 months minimum, or even 1 year!).

Thank you, that’s all I can ask for.

I can do that. Although I’ve gotten the feeling that running in any container or vm is not really recommended.
But I can ask if the client really needs to exit when it cannot read the values.

Got it. I will look into other solutions, e.g. using a VM, even if it’s not what we usually use in our environment.

There is a hidden switch that can be used to start Solana even when these specific sysctl values cannot be read. https://github.com/solana-labs/solana/issues/35329

So this will work for me to run Solana in an LXC container.
