How to troubleshoot networking on macOS

Architecture

Multipass uses “hyperkit” to run instances, which utilises macOS’s Hypervisor.framework. This framework manages the networking stack for the instances.

On creation of an instance, Hypervisor.framework on the host uses macOS’s “Internet Sharing” mechanism to:

  1. create a virtual switch and connect each instance to it (subnet 192.168.64.*)
  2. provide DHCP and DNS resolution on this switch at 192.168.64.1 (via the bootpd & mDNSResponder services running on the host). This is configured by an auto-generated file, /etc/bootpd.plist, but editing it is pointless because macOS regenerates it at will.

Note that in “System Preferences” -> “Sharing”, the “Internet Sharing” service can appear disabled. This is OK: in the background, it will still be enabled to support instances.

Tools known to interfere with Multipass

  • VPN software can be aggressive at managing routes, and may route the 192.168.64 subnet through the VPN interface instead of keeping it locally available.
    • Possible culprits: OpenVPN, F5, Dell SonicWall, Cisco AnyConnect, Citrix/Netscaler Gateway, Juniper Junos Pulse / Pulse Secure
    • Tunnelblick doesn’t cause problems
  • Cisco Umbrella Roaming Client binds to localhost:53, which clashes with Internet Sharing and breaks the instance’s DNS (ref: Umbrella Roaming Client OS X and Internet Sharing)
  • dnscrypt-proxy/dnscrypt-wrapper/cloudflared-proxy
    The default configuration binds to localhost port 53, clashing with Internet Sharing.
  • another dnsmasq process bound to localhost port 53
  • a custom DHCP server bound to port 67 (“sudo lsof -iUDP:67 -n -P” should show launchd & bootpd only)
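
The port checks in this list can be partly automated. The following is a minimal sketch, not part of Multipass: `unexpected_listeners` is a hypothetical helper that reads `lsof` output on stdin and prints any process name that is not on an allow-list you pass in. It assumes the default `lsof` column layout (header row first, command name in the first column, truncated to 9 characters, hence `mDNSRespo`).

```shell
#!/bin/sh
# Hypothetical helper: print process names from lsof output that are not
# in the allow-list given as arguments.
unexpected_listeners() {
    # Pad with spaces so each allowed name can be matched as a whole word.
    allowed=" $* "
    # Skip the lsof header row, keep the COMMAND column, de-duplicate.
    awk 'NR > 1 { print $1 }' | sort -u | while read -r cmd; do
        case "$allowed" in
            *" $cmd "*) ;;       # expected process, ignore it
            *) echo "$cmd" ;;    # anything else may clash with Internet Sharing
        esac
    done
}

# Usage on the macOS host (needs root to see other users' sockets):
#   sudo lsof -iTCP:53 -iUDP:53 -n -P | unexpected_listeners mDNSRespo
#   sudo lsof -iUDP:67 -n -P          | unexpected_listeners launchd bootpd
```

Empty output means only the expected processes hold the port. Any name printed is a candidate for the conflicts listed above.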

Problem class

Generic networking problems

An “Unable to determine IP address” error usually implies that some networking configuration is incompatible, or that a firewall or VPN is interfering.

Troubleshooting (section to be expanded)

  1. Firewall
    1. Is the Firewall enabled?
    2. If so, it must not be set to “Block all incoming connections”
      • Blocking all incoming connections prevents a DHCP server from running locally to give an IP to the instance.
      • It’s OK to block incoming connections to “multipassd”, however.
  2. VPN
  3. Little Snitch - the defaults are good; it should permit mDNSResponder and bootpd access to BPF
    If you’re having trouble downloading images and/or see Unknown errors when trying to multipass launch -vvv, Little Snitch may be interfering with multipassd's network access (ref. #1169)
  4. Internet Sharing - doesn’t usually clash
  5. Is the bootpd DHCP server alive? (sudo lsof -iUDP:67 -n -P should mention bootpd)
    • If not, start it by running sudo launchctl load -w /System/Library/LaunchDaemons/bootps.plist

Network routing problems

You could try adding the route manually:

sudo route -nv add -net 192.168.64.0/24 -interface bridge100

If you get a “File exists” error, try deleting the route and adding it again:

sudo route -nv delete -net 192.168.64.0/24
sudo route -nv add -net 192.168.64.0/24 -interface bridge100
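
Before adding or deleting routes, it can help to see which interface the host currently uses for the instance subnet. This is a sketch under stated assumptions: `route_interface` and `check_multipass_route` are hypothetical helpers, and they assume that `route -n get` prints a line of the form `interface: <name>`, as it does on macOS. `bridge100` is the interface name used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: extract the interface name from `route -n get` output.
route_interface() {
    awk '$1 == "interface:" { print $2 }'
}

# Hypothetical helper: report whether the route goes via bridge100.
# Reads `route -n get` output on stdin; returns non-zero on a mismatch.
check_multipass_route() {
    iface=$(route_interface)
    if [ "$iface" = "bridge100" ]; then
        echo "ok: 192.168.64.0/24 is routed via bridge100"
    else
        echo "warning: route uses '${iface:-none}' instead of bridge100"
        return 1
    fi
}

# Usage on the macOS host:
#   route -n get 192.168.64.1 | check_multipass_route
```

If the warning names a `utun` or similar VPN interface, the VPN has probably taken over the subnet, which is the situation the route commands above try to correct.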

Adding the route with the -static flag might also help.

If using “Cisco AnyConnect” - try using “OpenConnect” (brew install openconnect) instead as it messes with routes less (but your company sysadmin/policy may not permit/authorize this).

Does your VPN software provide a “Split connection” option, where the VPN sysadmin can designate a range of IP addresses that are not routed through the VPN?

  • Cisco does
  • Pulse Secure / Juniper Junos Pulse do

Potential workaround for VPN conflicts (ref: #495)

After the nat … line (if there is one, otherwise at the end) in /etc/pf.conf, add this line:

nat on utun1 from bridge100:network to any -> (utun1)

and reload PF with $ sudo pfctl -f /etc/pf.conf.

Possible other option - configure Multipass to use a different subnet?

Edit /Library/Preferences/SystemConfiguration/com.apple.vmnet.plist to change the “Shared_Net_Address” value to something other than 192.168.64.1.

  • This works if you edit the plist file and stay inside the 192.168 range, as Multipass is hardcoded for this

Note on this:
If you change the subnet and launch an instance, it will get an IP from that new subnet. But if you try changing it back, the change is reverted on the next instance start. It appears that the DHCP server reads the last IP in /var/db/dhcpd_leases, decides the subnet from that, and updates Shared_Net_Address to match. So the only way to really revert this change is to edit or delete /var/db/dhcpd_leases.
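
Since the address has to stay inside the 192.168 range, a small check before editing the plist can catch mistakes. This is only a sketch: `in_192_168_range` is a hypothetical helper, and it validates the format of the address rather than anything Multipass itself enforces.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for a dotted-quad address in 192.168.x.y.
in_192_168_range() {
    printf '%s\n' "$1" | awk -F. '
        NF == 4 && $1 == 192 && $2 == 168 &&
        $3 ~ /^[0-9]+$/ && $3 <= 255 &&
        $4 ~ /^[0-9]+$/ && $4 <= 255 { ok = 1 }
        END { exit ok ? 0 : 1 }'
}

# Usage on the macOS host. PlistBuddy ships with macOS; the plist path and
# key name are the ones given in the text above.
#   current=$(/usr/libexec/PlistBuddy -c "Print :Shared_Net_Address" \
#       /Library/Preferences/SystemConfiguration/com.apple.vmnet.plist)
#   in_192_168_range "$current" && echo "current address $current is in range"
```
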

DNS problems

Can you ping IP addresses?

$ ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
^C
--- 1.1.1.1 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2030ms

Note that macOS’s firewall can block the ICMP packets that ping uses, which will interfere with this test. Make sure you disable “Stealth Mode” in “System Preferences” -> “Security & Privacy” -> “Firewall” just for this test.

If you try again and it succeeds:

multipass@x:~$ ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=53 time=7.02 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=53 time=5.91 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=53 time=5.12 ms
^C
--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2143ms
rtt min/avg/max/mdev = 5.124/6.020/7.022/0.781 ms

This means the instance can indeed reach the internet, so if name lookups still fail, it is DNS resolution that is broken. Testing DNS resolution with the dig tool may then show a failure:

multipass@x:~$ dig google.ie
; <<>> DiG 9.10.3-P4-Ubuntu <<>> google.ie
;; global options: +cmd
;; connection timed out; no servers could be reached

But if it shows this, it’s all working:

multipass@x:~$ dig google.ie
; <<>> DiG 9.10.3-P4-Ubuntu <<>> google.ie
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48163
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.ie.   		 IN    A

;; ANSWER SECTION:
google.ie.   	 15    IN    A    74.125.193.94

;; Query time: 0 msec
;; SERVER: 192.168.64.1#53(192.168.64.1)
;; WHEN: Thu Aug 01 15:17:04 IST 2019
;; MSG SIZE  rcvd: 54

To test further, try supplying an explicit DNS server:

multipass@x:~$ dig @1.1.1.1 google.ie
; <<>> DiG 9.10.3-P4-Ubuntu <<>> @1.1.1.1 google.ie
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11472
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1452
;; QUESTION SECTION:
;google.ie.   		 IN    A

;; ANSWER SECTION:
google.ie.   	 39    IN    A    74.125.193.94

;; Query time: 6 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Thu Aug 01 15:16:27 IST 2019
;; MSG SIZE  rcvd: 54

If the explicit server answers but the default query timed out, the problem is with macOS’s “Internet Sharing” feature: for some reason its built-in DNS server is broken.

The built-in DNS server should be “mDNSResponder”, which binds to localhost on port 53.

If using Little Snitch or another per-process firewall, ensure mDNSResponder can establish outgoing connections. macOS’s built-in firewall should not interfere with it.

Check what is bound to that port on the host with

$ sudo lsof -iTCP:53 -iUDP:53 -n -P
COMMAND   PID       	USER   FD   TYPE         	DEVICE SIZE/OFF NODE NAME
mDNSRespo 191 _mdnsresponder   17u  IPv4 0xa89d451b9ea11d87  	0t0  UDP *:53
mDNSRespo 191 _mdnsresponder   25u  IPv6 0xa89d451b9ea1203f  	0t0  UDP *:53
mDNSRespo 191 _mdnsresponder   50u  IPv4 0xa89d451b9ea8b8cf  	0t0  TCP *:53 (LISTEN)
mDNSRespo 191 _mdnsresponder   55u  IPv6 0xa89d451b9e2e200f  	0t0  TCP *:53 (LISTEN)

The above output shows the correct state while an instance is running. If no instance is running (and Internet Sharing is disabled in System Preferences), the command should return nothing.

Any other command appearing in that output means a process is conflicting with Internet Sharing, and thus will break DNS in the instance.
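
The ping and dig checks in this section form a small decision tree, which can be sketched as a script to run inside the instance. `classify_dns` and `run_dns_checks` are hypothetical helpers and the result labels are made up for illustration. The sketch assumes `dig` exits non-zero when no server can be reached, and that the instance's `ping` accepts `-c` and `-W` as on Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: map the exit codes of the three checks to a diagnosis.
#   $1: ping 1.1.1.1   $2: dig via default resolver   $3: dig @1.1.1.1
classify_dns() {
    if [ "$1" -ne 0 ]; then
        echo "no-connectivity"        # cannot reach IP addresses at all
    elif [ "$2" -eq 0 ]; then
        echo "dns-ok"                 # default resolver (192.168.64.1) answers
    elif [ "$3" -eq 0 ]; then
        echo "host-resolver-broken"   # only the explicit server answers
    else
        echo "dns-blocked"            # no DNS server is reachable
    fi
}

# Run the three checks from the text and print the diagnosis.
run_dns_checks() {
    ping -c 3 -W 2 1.1.1.1 >/dev/null 2>&1; p=$?
    dig +time=2 +tries=1 google.ie >/dev/null 2>&1; d=$?
    dig +time=2 +tries=1 @1.1.1.1 google.ie >/dev/null 2>&1; e=$?
    classify_dns "$p" "$d" "$e"
}

# Usage inside the instance:
#   run_dns_checks
```

A result of `host-resolver-broken` corresponds to the Internet Sharing DNS problem described above. `no-connectivity` points back to the generic networking and routing sections.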

Possible workarounds

  1. Configure DNS inside the instance to use an external working DNS server. You can do so by appending this line to /etc/resolv.conf manually:
    nameserver 1.1.1.1
    
    “1.1.1.1” is a free DNS service provided by Cloudflare, but you can use your own.
  2. Use a custom cloud-init to set /etc/resolv.conf for you on first boot.
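
The second workaround could look roughly like this. It is a sketch, not a tested recipe: `write_dns_cloud_init` is a hypothetical helper that writes a cloud-config file whose `runcmd` step appends the nameserver line from workaround 1. On images where /etc/resolv.conf is managed by another service, the appended line may later be overwritten, so treat this as a starting point.

```shell
#!/bin/sh
# Hypothetical helper: write a cloud-config file that appends a nameserver
# to /etc/resolv.conf on first boot.
#   $1: output path   $2: nameserver IP (defaults to 1.1.1.1)
write_dns_cloud_init() {
    cat > "$1" <<EOF
#cloud-config
runcmd:
  - echo "nameserver ${2:-1.1.1.1}" >> /etc/resolv.conf
EOF
}

# Usage on the host:
#   write_dns_cloud_init dns.yaml
#   multipass launch --cloud-init dns.yaml
```
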

ARP problems

The macOS bridge used for hyperkit filters packets so that only the IP address originally assigned to the VM is allowed through. If you add an additional address (e.g. IP alias) to the VM, the ARP broadcast will get through but the ARP response will be filtered out.

This means that applications which rely on additional IP addresses, such as metallb under microk8s, will not work.

ARP problems

macOS filters packets so that only the IP address originally assigned to the VM is allowed through. If you add an additional address (e.g. IP alias) to the VM, the ARP broadcast will get through but the ARP response will be filtered out.

This means that applications which rely on additional IP addresses, such as metallb under microk8s, will not work.

Thanks @candlerb, there’s an “Edit” button below the first post that should’ve let you add your content. It may be that, since you’re a fresh user, it was limited for you.

In any case I’ve now incorporated it into the post.

For Mac - I have added a cloud-init template that edits the /etc/netplan/50-cloud-init.yaml file.
Solutions are available at
https://github.com/rajasoun/multipass

reposting to format the code

Hi, thank you for making this troubleshooting section.
I followed it up to running my instances. That succeeded, but they cannot connect to the internet;
ping to google is failing.

I tried to ping 1.1.1.1:

    ubuntu@great-squeaker:~$ ping 1.1.1.1
    PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
    64 bytes from 1.1.1.1: icmp_seq=1 ttl=56 time=74.8 ms
    64 bytes from 1.1.1.1: icmp_seq=2 ttl=56 time=35.1 ms
    64 bytes from 1.1.1.1: icmp_seq=3 ttl=56 time=179 ms
    64 bytes from 1.1.1.1: icmp_seq=4 ttl=56 time=55.6 ms
    64 bytes from 1.1.1.1: icmp_seq=5 ttl=56 time=39.2 ms
    64 bytes from 1.1.1.1: icmp_seq=6 ttl=56 time=52.1 ms
    ^C
    --- 1.1.1.1 ping statistics ---
    6 packets transmitted, 6 received, 0% packet loss, time 5342ms

I tried dig with an explicit DNS server:

ubuntu@great-squeaker:~$ dig @1.1.1.1 google.com
; <<>> DiG 9.11.3-1ubuntu1.11-Ubuntu <<>> @1.1.1.1 google.com
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

I checked lsof on my host using sudo lsof -iTCP:53 -iUDP:53 -n -P

It returned empty, so I started mDNSResponder via launchctl:

$ sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist
$ sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist
/System/Library/LaunchDaemons/com.apple.mDNSResponder.plist: service already loaded

sudo lsof -iTCP:53 -iUDP:53 -n -P still returned empty

Any help is appreciated, thanks

This just doesn’t work. I’ve tried everything. The whole thing is useless without networking.
I tried to change the subnet in /Library/Preferences/SystemConfiguration/com.apple.vmnet.plist but it doesn’t seem to work. I’m still getting the old subnet.

I’ve spun up an instance on an OS X host and installed nfs-common to it. When I try to access a LAN nfs share, I’m getting an “access denied by server” error (the share can be mounted from another, bare metal instance of Ubuntu 18.04).

The one thing that jumps out is that the multipass instance is on a different subnet (192.168.64.x) from our LAN (192.168.0.x). I set the share to accept connections from 192.168.64.0 mask=255.255.255.0, but no joy.

Anything different I need to be doing?

I’m running Big Sur with the Firewall turned on. I don’t have a way to disable the Firewall because this is a work laptop and everything is locked down.

When I run microk8s install I get this error:
An error occurred when trying to execute 'sudo ping -c 1 snapcraft.io' with 'multipass': returned exit code 1.

The shell command sudo ping -c 1 snapcraft.io works on my local but in the VM the command hangs and never returns. Also from the VM it’s able to reach the Internet.

What are my options on getting the microk8s install to run successfully?

@cwagnello hey, ideally you’d be able to allow outgoing traffic on bridge100 and DNS and DHCP going the other way.

If you absolutely can’t, you could try using VirtualBox, that has a user-space networking implementation that should not be affected by firewalls. Note that you can’t then access the instance other than through multipass shell, unless you also use a bridged network.

@saviq VirtualBox worked as expected. I did find a solution that allowed me to use Hyperkit when installing MicroK8s. The MicroK8s GitHub had the answer I needed after doing more digging.

It would be very useful to have the VM image, config, and cache locations listed in here.
As per https://github.com/canonical/multipass/issues/566, not having this documented has led to lots of frustration.

/var/root/Library/Application\ Support/multipassd
/var/root/Library/Caches/multipassd

I found that the Multipass hyperkit network setup (Intel Macs) did not work with our corporate VPN, which blocks traffic from subnets. I was able to work around it using https://github.com/amine7536/multipass-vpnkit, which adds some parameters to the hyperkit startup to tunnel traffic through the host. It would be nice if this could be incorporated into Multipass as an option.