Container occasionally cannot resolve 'localhost' on boot

Hi,
I’ve noticed an odd issue with one of my containers. I’m running an Ubuntu 20.04 container with Nginx inside, and the Nginx config has a proxy_pass pointing at localhost:8080. Very occasionally, when the container starts, Nginx fails with an error that it cannot resolve localhost. Restarting the Nginx service fixes it and localhost resolves again. localhost is indeed present in /etc/hosts.
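For context, the relevant part of the Nginx config is essentially this (the listen port and location are illustrative, not my exact config):

```nginx
server {
    listen 80;

    location / {
        # "localhost" here is the name that occasionally fails to resolve at boot
        proxy_pass http://localhost:8080;
    }
}
```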

I have no idea why or how this is happening, and I’d appreciate any ideas for troubleshooting and debugging further. As a stopgap I’ve added automatic restarts to the systemd unit, but I’m curious about the root cause here.

LXD Version: 5.0.2

container /etc/hosts
127.0.0.1 localhost.localdomain localhost

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
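When it’s working, name resolution inside the container can be sanity-checked like this (a diagnostic sketch, assuming the standard glibc NSS setup):

```shell
# Resolve 'localhost' through the same NSS path most programs use
getent hosts localhost

# Confirm that 'files' (/etc/hosts) is consulted before 'dns'
grep '^hosts:' /etc/nsswitch.conf
```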
Container config
architecture: x86_64
config:
  boot.autostart: "true"
  image.architecture: amd64
  image.description: ubuntu 20.04 amd64 (20240107_12:03)
  image.os: ubuntu
  image.release: focal
  image.serial: "20240107_12:03"
  image.type: squashfs
  image.variant: default
  limits.cpu: "6"
  limits.memory: 17180MB
  snapshots.expiry: 1w
  snapshots.schedule: 5 2 * * *
  user.user-data: |
    #cloud-config
    disable_root: false
    system_info:
      apt_get_wrapper:
        command: eatmydata
        enabled: False
    apt:
      preserve_sources_list: true
    package_update: true
    packages:
      - openssh-server
      - python3
      - python3-pip
      - python3-simplejson
      - python3-openssl
  volatile.base_image: 21c1ae3abf5e90048ece1ebe508e197855f57a7874b031e4d39df72cbb214969
  volatile.cloud-init.instance-id: 9e625b08-377f-4288-8420-dc424ccc9941
  volatile.eth0.hwaddr: 00:16:3e:7c:26:b6
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
  volatile.uuid: be8160f5-7697-4535-9461-505db494e661
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br1
    type: nic
  root:
    path: /
    pool: default
    size: 323GB
    type: disk
ephemeral: false
profiles:
- default
- bootstrap-ansible-no-kernel
stateful: false
description: ""
br1 network config
config: {}
description: ""
name: br1
type: bridge
used_by:
...
managed: false
status: ""
locations: []
Nginx error
systemd[1]: Starting A high performance web server and a reverse proxy server...
nginx[226]: nginx: [emerg] host not found in upstream "localhost" in /etc/nginx/sites-enabled/d>
nginx[226]: nginx: configuration file /etc/nginx/nginx.conf test failed
systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
systemd[1]: nginx.service: Failed with result 'exit-code'.
systemd[1]: Failed to start A high performance web server and a reverse proxy server.
systemd[1]: nginx.service: Unit cannot be reloaded because it is inactive.

My first thought would be to check that the nginx service is configured to start only after networking has been brought online in systemd.

Also, have you considered using 127.0.0.1 rather than localhost to avoid name resolution entirely?
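That is, something along these lines (the upstream port is taken from your description):

```nginx
location / {
    # Using the loopback IP skips hostname resolution entirely
    proxy_pass http://127.0.0.1:8080;
}
```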

The systemd unit is configured to start after network (default).

After=network.target nss-lookup.target

I’ve been able to work around this by changing the unit’s After= to network-online.target, which really ensures that networking is fully up.

After=network-online.target
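In case it helps anyone else, expressed as a drop-in override (the path is the standard systemctl edit location; note that After= alone only orders units, so Wants= is generally needed as well to actually pull network-online.target into the boot transaction):

```ini
# /etc/systemd/system/nginx.service.d/override.conf
# created via: systemctl edit nginx
[Unit]
Wants=network-online.target
After=network-online.target
```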

I have considered using 127.0.0.1, but it’s friendlier for developers using the containers if localhost works, and again that would just be a workaround.