NOHZ tick-stop error (ZFS and Ubuntu Fan on Noble 24.04)

I just purchased a couple of used Lenovo P510s to experiment with LXD. Unfortunately, once I create a cluster and launch a couple of containers, the workstations become utterly unusable. I was able to capture output from dmesg, and these are (I think) the problem logs (the “cut here” notation helps!). Hoping somebody can point me in the right direction.

Latest BIOS from Lenovo (05/24/2022, IIRC), latest Ubuntu (24.04.1), and latest LXD snap (5.21.2). Kernel is 6.8.0-48. (The workstation locked up at that point, so I had maybe 30 seconds, tops.)

Vanilla Ubuntu ran all day with no issues. The snap install and ZFS didn’t seem to cause problems either, but I didn’t let it sit long. I think it was either the cluster join or launching the containers that triggered the bad behavior.

[26689.256997] NOHZ tick-stop error: local softirq work is pending, handler #200!!!
[26689.277888] NOHZ tick-stop error: local softirq work is pending, handler #200!!!
[26689.325701] NOHZ tick-stop error: local softirq work is pending, handler #200!!!
[26689.353495] ------------[ cut here ]------------
[26689.353499] Voluntary context switch within RCU read-side critical section!
[26689.353504] WARNING: CPU: 11 PID: 6392 at kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x2ce/0x2f0
[26689.353511] Modules linked in: veth nft_masq nft_chain_nat vxlan ip6_udp_tunnel udp_tunnel dummy bridge stp llc nvme_tcp nvme_keyring nvme_fabrics nvme_core nvme_auth ebtable_filter ebtables ip6table_raw ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_raw iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter nf_tables vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock zfs(PO) spl(O) qrtr cfg80211 binfmt_misc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common sb_edac nouveau snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp drm_gpuvm drm_exec snd_hda_intel gpu_sched snd_intel_dspcfg snd_intel_sdw_acpi drm_ttm_helper kvm_intel snd_hda_codec ttm snd_hda_core drm_display_helper snd_hwdep kvm cec snd_pcm ee1004 snd_timer rc_core nls_iso8859_1 irqbypass think_lmi i2c_algo_bit snd i2c_i801 rtsx_usb_ms mei_me rapl memstick firmware_attributes_class
[26689.353560]  intel_wmi_thunderbolt wmi_bmof intel_cstate mxm_wmi video mei intel_pch_thermal i2c_smbus soundcore lpc_ich input_leds mac_hid serio_raw sch_fq_codel dm_multipath msr efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 uas usb_storage rtsx_usb_sdmmc rtsx_usb crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 e1000e ahci xhci_pci libahci pata_acpi xhci_pci_renesas wmi aesni_intel crypto_simd cryptd
[26689.353593] CPU: 11 PID: 6392 Comm: systemd-resolve Tainted: P           O       6.8.0-48-generic #48-Ubuntu
[26689.353595] Hardware name: LENOVO 30B4S16S00/102F, BIOS S00KT73A 05/24/2022
[26689.353596] RIP: 0010:rcu_note_context_switch+0x2ce/0x2f0
[26689.353599] Code: fe ff ff ba 02 00 00 00 be 01 00 00 00 e8 fa d0 fe ff e9 6b fe ff ff 48 c7 c7 18 99 06 ac c6 05 ad 8e 61 02 01 e8 b2 12 f2 ff <0f> 0b e9 96 fd ff ff 0f 0b e9 36 ff ff ff 0f 0b e9 18 ff ff ff 66
[26689.353600] RSP: 0018:ffffa8f342ac7af0 EFLAGS: 00010046
[26689.353602] RAX: 0000000000000000 RBX: ffff8ecebfdb5a00 RCX: 0000000000000000
[26689.353604] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[26689.353605] RBP: ffffa8f342ac7b10 R08: 0000000000000000 R09: 0000000000000000
[26689.353605] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[26689.353606] R13: ffff8ebf80d48000 R14: 0000000000000000 R15: ffff8ebf8f4b1d80
[26689.353607] FS:  00007054bbc85940(0000) GS:ffff8ecebfd80000(0000) knlGS:0000000000000000
[26689.353609] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[26689.353610] CR2: 0000619469f7cd90 CR3: 00000001a8048003 CR4: 00000000003706f0
[26689.353611] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[26689.353612] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[26689.353613] Call Trace:
[26689.353615]  <TASK>
[26689.353618]  ? show_regs+0x6d/0x80
[26689.353621]  ? __warn+0x89/0x160
[26689.353626]  ? rcu_note_context_switch+0x2ce/0x2f0
[26689.353628]  ? report_bug+0x17e/0x1b0
[26689.353632]  ? handle_bug+0x51/0xa0
[26689.353635]  ? exc_invalid_op+0x18/0x80
[26689.353637]  ? asm_exc_invalid_op+0x1b/0x20
[26689.353641]  ? rcu_note_context_switch+0x2ce/0x2f0
[26689.353643]  __schedule+0x81/0x6b0
[26689.353648]  schedule+0x33/0x110
[26689.353650]  schedule_hrtimeout_range_clock+0x13a/0x150
[26689.353654]  schedule_hrtimeout_range+0x13/0x30
[26689.353657]  ep_poll+0x342/0x390
[26689.353663]  ? __pfx_ep_autoremove_wake_function+0x10/0x10
[26689.353666]  do_epoll_wait+0xdb/0x100
[26689.353668]  __x64_sys_epoll_wait+0x6f/0x110
[26689.353670]  x64_sys_call+0x18af/0x25c0
[26689.353673]  do_syscall_64+0x7f/0x180
[26689.353678]  ? __seccomp_filter+0x368/0x570
[26689.353682]  ? __task_pid_nr_ns+0x6c/0xc0
[26689.353686]  ? syscall_exit_to_user_mode+0x86/0x260
[26689.353689]  ? do_syscall_64+0x8c/0x180
[26689.353692]  ? syscall_exit_to_user_mode+0x86/0x260
[26689.353695]  ? do_syscall_64+0x8c/0x180
[26689.353697]  ? irqentry_exit_to_user_mode+0x7b/0x260
[26689.353698]  ? irqentry_exit+0x43/0x50
[26689.353699]  ? exc_page_fault+0x94/0x1b0
[26689.353702]  entry_SYSCALL_64_after_hwframe+0x78/0x80
[26689.353704] RIP: 0033:0x7054bc25a007
[26689.353721] Code: 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb be 0f 1f 44 00 00 f3 0f 1e fa 80 3d 45 10 0e 00 00 41 89 ca 74 10 b8 e8 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 55 48 89 e5 48 83 ec 20 89 55 f8 48 89
[26689.353722] RSP: 002b:00007ffc9b18fba8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e8
[26689.353725] RAX: ffffffffffffffda RBX: 000000000000002a RCX: 00007054bc25a007
[26689.353726] RDX: 000000000000002a RSI: 000061946ad25a50 RDI: 0000000000000004
[26689.353727] RBP: 00007ffc9b18fcc0 R08: 0000000000000000 R09: 0000000000000004
[26689.353728] R10: 00000000ffffffff R11: 0000000000000202 R12: 000061946ad25a50
[26689.353729] R13: 0000000000000015 R14: 000061946ad9c360 R15: ffffffffffffffff
[26689.353730]  </TASK>
[26689.353731] ---[ end trace 0000000000000000 ]---

Hi there

Can you please try without ZFS, and also without using Ubuntu Fan mode?

There are known issues with ZFS and Ubuntu Fan in the Ubuntu 6.8 kernel.
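For reference, a minimal sketch of what “without ZFS and without Fan” could look like when setting up storage and networking by hand. The pool name `local` and bridge name `lxdbr0` are illustrative, not taken from this thread:

```shell
# Use a btrfs-backed storage pool instead of ZFS
# (the pool name "local" is hypothetical).
lxc storage create local btrfs

# Create a plain managed bridge; not setting bridge.mode=fan
# avoids Ubuntu Fan entirely.
lxc network create lxdbr0

# Point the default profile at the new pool and bridge.
lxc profile device add default root disk path=/ pool=local
lxc profile device add default eth0 nic network=lxdbr0 name=eth0
```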

@amikhalitsyn does this look more like the ZFS or Ubuntu Fan issue?

It looks like the Ubuntu Fan one: Bug #2064176 “LXD fan bridge causes blocked tasks” (linux package, Ubuntu).

Update: you need kernel >= 6.8.0-50.50 to have this issue fixed.

But don’t use ZFS in production with Ubuntu Noble. It’s extremely unstable.

See:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/2077926

Thanks for the quick reply. I’ll try with btrfs and no Fan (I have no clue whether I created one or not).

Follow up questions:

  • Would I be better off using Ubuntu 22.04, or even jumping ahead to 24.10? This is a home project (I have admittedly weird hobbies :wink: ), so I’m not tied to anything specific.
  • I wanted to note that the storage drivers documentation seems to strongly suggest ZFS.

Thanks for the tickets to watch as well!
-Rob

I’m pretty sure that 90% of users of this forum (including myself) share your hobby :wink:

Yes, it would be better, but only if you don’t use the HWE kernel (which is the same kernel shipped with Ubuntu Noble). This isn’t an issue with Ubuntu Noble itself, but a kernel+ZFS incompatibility. Since Ubuntu 22.04 uses the older 5.15 kernel, it doesn’t have this problem and works reliably with ZFS.

Thank you. I’ll try that as well!

Yes, Ubuntu 22.04 Fan mode works fine too.

Yes, we do suggest using ZFS, but that is with the expectation of a kernel that has a stable ZFS implementation (which Ubuntu 22.04 has).

Cool. That seems to be the route I’m moving towards. A plain bridge doesn’t communicate between the servers; Fan does. I’ll likely want to confine the default network though… 16 million IP addresses is a bit unwieldy! (240.0.0.0/8)
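If you want to keep Fan but pin down its address space, the relevant knobs should be the `fan.*` options on the managed network. A sketch, assuming the network is named `lxdfan0` and the hosts sit on `192.168.1.0/24` (both names are assumptions; check `lxc network show` for your actual settings, and note that Fan’s address mapping carves a per-host block out of the overlay, so the overlay can’t be made arbitrarily small):

```shell
# Hypothetical: recreate the fan bridge with explicit subnets.
# fan.overlay_subnet defaults to 240.0.0.0/8; fan.underlay_subnet
# defaults to the subnet of the host's default gateway.
lxc network create lxdfan0 bridge.mode=fan \
    fan.overlay_subnet=240.0.0.0/8 \
    fan.underlay_subnet=192.168.1.0/24
```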

I don’t have enough machines/network ports to get MicroOVN to work automagically (nor am I network savvy enough to have the patience to figure it out). :slight_smile:
-Rob

Thought I would offer an update.

I found that ZFS isn’t super stable/reliable on Ubuntu 22.04 either (although it seems pretty good in 24.04). So I switched over to btrfs and turned off quotas (based on the LXD docs), since what I’m working with is VMs. As far as funky disk behavior goes, this seems stable; I haven’t seen anything resembling the “unable to find zvol” errors I was getting.
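In case it helps anyone following along: LXD’s btrfs driver implements quotas with btrfs qgroups, and turning them off happens at the filesystem level. A sketch, assuming a snap install with a pool named `default` (the path is hypothetical; adjust to wherever your pool is actually mounted):

```shell
# Disable btrfs quota (qgroup) accounting on the pool's filesystem.
# Path assumes the LXD snap's default storage-pool mountpoint.
sudo btrfs quota disable /var/snap/lxd/common/lxd/storage-pools/default
```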

What I’m now finding is that the Ubuntu Fan seems to have a bunch of “hiccups”. For instance:

Error: Action Failed get_task: Task e0c8dfc6-a50e-4ad9-474e-f412627cded1 result: Preparing apply spec: Preparing package nats-v2-migrate: Fetching package blob: Getting blob from inner blobstore: Getting blob from inner blobstore: Shelling out to bosh-blobstore-dav cli: Running command: ‘bosh-blobstore-dav -c /var/vcap/bosh/etc/blobstore-dav.json get a630f86c-4678-4f3e-94de-2774ea4ca362 /var/vcap/data/tmp/bosh-blobstore-externalBlobstore-Get3073628143’, stdout: ‘Error running app - read tcp 240.4.0.20:48154->240.4.0.4:25250: read: connection reset by peer’, stderr: ‘’: exit status 1

I get random connection-reset-by-peer messages. Note that on my single-host LXD setup (ZFS, Ubuntu 24.04, just a network bridge), I don’t think I’ve ever seen that occur.

My only assumption is that this is something with the fan. Thoughts?

Current versions (this is a two-node cluster where hydra2 is identical):

$ uname -a
Linux hydra1 5.15.0-126-generic #136-Ubuntu SMP Wed Nov 6 10:38:22 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.5 LTS
Release:	22.04
Codename:	jammy
$ lxc version
Client version: 5.21.2 LTS
Server version: 5.21.2 LTS

Thanks!
-Rob

There have been issues with Ubuntu Fan in Noble, but I understand they have been fixed now.

I suggest making sure you have an updated kernel, or trying Ubuntu 22.04 to see if the issue is resolved there.

I think I figured it out. (This is all automated to some degree, so sometimes it’s a process of discovery…)

If I let the BOSH VM assign its own IP, the routing isn’t quite correct for the Ubuntu Fan. However, since it’s a managed network, I can configure everything to work via DHCP (I “sneakily” set ipv4.address to the expected value). That third route (starting with 240.4.0.1) didn’t exist before being sneaky.

bosh/0:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc mq state UP group default qlen 1000
    link/ether 00:16:3e:38:51:cb brd ff:ff:ff:ff:ff:ff
    altname enp5s0
    inet 240.4.0.4/8 metric 1024 brd 240.255.255.255 scope global dynamic eth0
       valid_lft 3085sec preferred_lft 3085sec
bosh/0:~# ip route
default via 240.4.0.1 dev eth0 proto dhcp src 240.4.0.4 metric 1024 
240.0.0.0/8 dev eth0 proto kernel scope link src 240.4.0.4 metric 1024 
240.4.0.1 dev eth0 proto dhcp scope link src 240.4.0.4 metric 1024 
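The “sneaky” ipv4.address trick above can presumably be done with a device override on the instance’s NIC, so the managed network’s DHCP always hands out the expected lease. A sketch, assuming the instance is named `bosh` and its NIC device is `eth0` (both names are assumptions):

```shell
# Pin the DHCP-assigned address on the fan network to the value
# the BOSH director expects (instance/device names are hypothetical).
lxc config device override bosh eth0 ipv4.address=240.4.0.4
lxc restart bosh
```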

Thanks!
-Rob
