nvidia_drv.so spin-wait on SIGALRM causes Xorg freeze during screen lock with active DXVK/NVAPI client

**Ubuntu Version:** 24.04 LTS

**Desktop Environment:** GNOME (Shell 46.2 / mutter 46.2)

**Problem Description:**

Pressing Ctrl+L to lock the screen while a Wine/DXVK application is running causes the display to permanently freeze. No lock screen appears, the cursor stops, nothing responds. The TTY is still accessible (`Ctrl+Alt+F3`) and `sudo systemctl restart gdm` recovers the session. It does not happen if the Wine application is closed before locking. 100% reproducible on demand.

I’ve traced this down to what looks like a signal-safety bug inside NVIDIA’s closed-source `nvidia_drv.so` Xorg driver. I’ve filed it with NVIDIA (https://forums.developer.nvidia.com/t/xorg-receives-sigalrm-and-freezes-during-screen-lock-with-wine-ge-proton-application-running-nvidia-driver-580-open-ubuntu-24-04/366049) and gotten no response. I’m posting here hoping someone has experience getting traction on closed-source driver bugs or knows if I’m missing diagnostics that would make the report more actionable.

**Reproduction steps:**

1. Start an X11 GNOME session

2. Launch Lutris → start Battle.net via GE-Proton (any Wine/DXVK app should reproduce this)

3. While the app is running, press Ctrl+L

4. Display freezes permanently

**Relevant System Information:**

| Component | Details |
|---|---|
| GPU | NVIDIA GeForce RTX 4070 Ti |
| Driver | nvidia-driver-580-open 580.126.09 |
| Kernel | 6.17.0-19-generic |
| Display server | X11 (not Wayland) |
| nvidia_drm modeset | 1 |
| NVreg_PreserveVideoMemoryAllocations | 1 |
| Application | Lutris 0.5.22 (Flatpak) → umu-launcher 1.4.0 → GE-Proton10-34 → Battle.net |
| Wine env | DXVK enabled, WINEESYNC=1, WINEFSYNC=1, DXVK_ENABLE_NVAPI=1, nvapi64=n |

There is also a second GPU in the system (AMD `1002:13c0`). Xorg loads both `nvidia_drv.so` and `amdgpu_drv.so` in the same process. The freeze is in the NVIDIA path.

**Screenshots or Error Messages:**

The first thing that appears in the journal when Ctrl+L is pressed:

```
gsd-media-keys[1354384]: Couldn't lock screen:
GDBus.Error:...Code24: Timeout was reached
```

`dbus-monitor` at the moment of freeze shows the lock request makes it from `gsd-media-keys` to gnome-shell with no problems, then falls silent:

```
method call sender=:1.60 -> destination=:1.74
path=/org/gnome/ScreenSaver; member=Lock
method call sender=:1.74 -> destination=:1.33
path=/org/gnome/ScreenSaver; member=Lock
```

No reply ever comes back. `strace` on gnome-shell shows it blocked waiting on the X11 socket to Xorg:

```
poll([{fd=10, events=POLLIN}], 1, -1
```

`strace` on Xorg captures the exact moment it stops:

```
17:09:14.886203 writev(11, …) = 32
17:09:14.886213 writev(11, …) = 32
17:09:14.886222 writev(11, …) = 32
17:09:14.900162 --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
```

Xorg was actively writing to client sockets (not hung, not waiting) when SIGALRM arrived, and from that point the process stopped responding entirely. A longer capture from an earlier run shows that Xorg’s `setitimer(ITIMER_REAL)` calls, which it normally makes every ~5 ms for input polling, stop for the full duration of the freeze:

```
13:46:10.988172 setitimer(ITIMER_REAL, …) ← last call before freeze
             [82 seconds of silence]
13:47:33.003987 --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
13:47:33.004002 rt_sigreturn({mask=}) = 8
13:47:33.004197 setitimer(ITIMER_REAL, …) = 0 ← resumes after GDM restart
```
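The length of that silent gap can be checked directly from the two timestamps in the capture. This is only a small sketch, assuming `strace -tt`-style `HH:MM:SS.microseconds` prefixes and that both events fall on the same day; `gap_seconds` is a hypothetical helper name, not part of any tool.

```python
from datetime import datetime

def gap_seconds(start: str, end: str) -> float:
    """Seconds between two strace -tt timestamps (same day assumed)."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()

# Timestamps copied from the capture above.
last_setitimer = "13:46:10.988172"
sigalrm_after_restart = "13:47:33.003987"

gap = gap_seconds(last_setitimer, sigalrm_after_restart)
print(f"{gap:.6f} s without a setitimer() call")
```

With a ~5 ms timer cadence, a gap of this size corresponds to many thousands of missed re-arms, which is why it stands out so clearly in the trace.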

I attached GDB to Xorg as root and caught SIGALRM being delivered during a lock attempt. The backtrace of Thread 1 (Xorg’s main thread) at the moment of the freeze:

```
#0 0x00007f225dd0e80b in __GI_sched_yield () [libc.so.6]
#1 0x00007f225ce9410c in ?? () [nvidia_drv.so]
#2 0x00007f225cea4f87 in ?? () [nvidia_drv.so]
#3 0x00007f225d2d8dbe in ?? () [nvidia_drv.so]
#4 0x0000000000000020 in ?? ()
#5 0x00007f225d2da4b8 in ?? () [nvidia_drv.so]
#6 0x00007ffcaa218048 in ?? ()
#7 0x00007f225d2da618 in ?? () [nvidia_drv.so]
#8 0x0000000000000000 in ?? ()
```

Xorg’s main thread is inside a `sched_yield()` spin-wait loop in `nvidia_drv.so` at the exact moment SIGALRM fires. `nvidia_drv.so` is mapped at `0x00007f225ce42760–0x00007f225d3296d1`, and all five driver frames (#1, #2, #3, #5, #7) land inside that range. Frame #3 is 0x49665E bytes past the start of that range in the 6.7 MiB library, likely where the spin-wait originates.
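Since the driver has no symbols, offsets relative to the mapping are the most useful thing to hand to NVIDIA. The sketch below recomputes them from the addresses quoted above. It assumes the quoted range is the correct base; if the range came from GDB's `info sharedlibrary` it may cover only the code section, so these are offsets from the start of the range and not necessarily file offsets. `frame_offsets` is a hypothetical helper name.

```python
# Addresses copied from the backtrace and the mapping range quoted above.
MAP_START = 0x00007F225CE42760
MAP_END = 0x00007F225D3296D1

DRIVER_FRAMES = {
    1: 0x00007F225CE9410C,
    2: 0x00007F225CEA4F87,
    3: 0x00007F225D2D8DBE,
    5: 0x00007F225D2DA4B8,
    7: 0x00007F225D2DA618,
}

def frame_offsets(frames, start, end):
    """Map frame number -> offset from `start`; reject addresses outside the range."""
    offsets = {}
    for num, addr in frames.items():
        if not (start <= addr <= end):
            raise ValueError(f"frame #{num} is outside the mapped range")
        offsets[num] = addr - start
    return offsets

offsets = frame_offsets(DRIVER_FRAMES, MAP_START, MAP_END)
for num, off in offsets.items():
    print(f"frame #{num}: range start + {off:#x}")
print(f"range size: {MAP_END - MAP_START:#x} bytes")
```

Frames #4, #6 and #8 are left out because their values are not code addresses inside the driver, which usually means the unwinder is guessing past that point.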

A second capture from a different run caught the main thread in `ioctl()` instead:

```
rip = 0x71d0cc324e1d <__GI___ioctl+61>
rdi = 0x15 (fd 21, the NVIDIA device node)
rsi = 0xc0106d00
```
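The request value in `rsi` can be split into its fields using the standard Linux ioctl number layout (the asm-generic encoding used on x86-64). This is a sketch for reading the number only: it cannot tell which kernel module owns the type byte, so it is worth confirming what fd 21 points at with `ls -l /proc/<xorg_pid>/fd/21`. `decode_ioctl` is a hypothetical helper name.

```python
# Field layout from Linux's asm-generic ioctl encoding (x86-64):
# bits 0-7 command number, 8-15 type ("magic"), 16-29 argument size, 30-31 direction.
def decode_ioctl(request: int) -> dict:
    direction = (request >> 30) & 0x3
    return {
        "nr": request & 0xFF,
        "type": chr((request >> 8) & 0xFF),
        "size": (request >> 16) & 0x3FFF,
        "read": bool(direction & 2),   # _IOC_READ: kernel copies data back to userspace
        "write": bool(direction & 1),  # _IOC_WRITE: userspace passes data to the kernel
    }

info = decode_ioctl(0xC0106D00)  # value of rsi from the capture above
print(info)
```

For this value the fields come out as a read/write request with type `'m'`, command number 0 and a 16-byte argument, which may help NVIDIA identify the call without needing the full core dump.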

In both captures the main thread is inside NVIDIA code (the userspace driver in one, an `ioctl()` on the NVIDIA device in the other) when the signal fires. Every other Xorg thread was idle in `pthread_cond_wait` inside `libgallium-25.2.8`. Those are the AMD GPU’s Mesa workers and are completely uninvolved:

```
Thread 12 "Xorg:cs0" → pthread_cond_wait → libgallium
Thread 11 "Xorg:disk$0" → pthread_cond_wait → libgallium
Thread 10 "Xorg:sh0" → pthread_cond_wait → libgallium
Threads 9, 8, 7, 6 → same pattern
Thread 1 "Xorg" → sched_yield → nvidia_drv.so ← frozen here
```

`auditd` tracing every `alarm()` and `setitimer()` syscall across all processes confirmed Wine/DXVK/GE-Proton make zero ITIMER_REAL calls. The signal is Xorg’s own. Wine only contributes by holding active GPU work at the time of the lock.

`nvidia-smi` during the freeze:

```
PID 1375875 /usr/lib/xorg/Xorg 232 MiB
PID 1376134 /usr/bin/gnome-shell 117 MiB
PID 1377186 [Wine/DXVK / Battle.net] 165 MiB ← active at freeze
```

One more thing worth noting: when GDB was attached with `handle SIGALRM pass noprint nostop` (ptrace intercepts the signal before delivery but passes it through), the freeze did **not** occur. The lock screen came up normally. My best explanation is that GDB’s ptrace overhead shifts the signal delivery timing enough to miss the critical window inside the spin-wait. That points to a timing-sensitive race rather than a problem with the lock request itself.

After `systemctl restart gdm`, Xorg exits cleanly — exit code 0, no crash, no segfault:

```
(II) NVIDIA(GPU-0): Deleting GPU-0
(II) Server terminated successfully (0). Closing log file.
```

Kernel journal is completely clean throughout — no GPU errors, no hung task warnings, no DRM timeout messages.

**What I’ve Tried:**

- Confirmed the bug is 100% reproducible with Battle.net open, 0% reproducible with it closed

- Ruled out Wine sending the signal via full `auditd` syscall audit

- Traced the lock request through DBus, gnome-shell, and down to the Xorg X11 socket with dbus-monitor and strace

- Caught the exact freeze moment in GDB — main thread in `sched_yield()` inside `nvidia_drv.so`

- Confirmed the Heisenbug: GDB ptrace overhead prevents the freeze by shifting signal timing

- Filed upstream with NVIDIA developer forums — no response

The only workaround I have is closing the Wine application before locking. Not ideal.

My understanding of what’s happening: `nvidia_drv.so` spin-waits on a GPU fence from Xorg’s main thread while the DXVK process holds the GPU. Xorg’s SIGALRM fires into that spin-wait. The signal handler tries to re-enter `nvidia_drv.so`, which isn’t re-entrant for this path. The main thread deadlocks with itself and the display server halts permanently. The fix needs to be in `nvidia_drv.so` — either masking SIGALRM around that critical section or moving fence polling off the main thread — but it’s a closed-source binary so I can’t do it myself.
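To make the first proposed fix concrete, here is a sketch of the general "block the signal around a critical section" technique. It is not NVIDIA's code and says nothing about how the driver is actually structured; it only shows that a SIGALRM arriving while blocked stays pending and its handler runs after the section completes. In C the same idea would use `pthread_sigmask()` or `sigprocmask()`. The names `run_with_sigalrm_blocked` and `demo` are hypothetical, and the `time.sleep()` is a stand-in for the fence wait.

```python
import signal
import time

def run_with_sigalrm_blocked(critical_section):
    """Run critical_section() with SIGALRM blocked, then restore the old mask."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGALRM})
    try:
        return critical_section()
    finally:
        # Restoring the mask lets any SIGALRM that became pending be delivered.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)

def demo():
    events = []
    previous = signal.signal(
        signal.SIGALRM, lambda signum, frame: events.append("handler")
    )

    def critical_section():
        signal.setitimer(signal.ITIMER_REAL, 0.01)  # one-shot timer, expires mid-section
        time.sleep(0.05)                            # stand-in for the fence wait
        events.append("section done")
        return signal.SIGALRM in signal.sigpending()

    try:
        was_pending = run_with_sigalrm_blocked(critical_section)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)     # make sure no timer is left armed
        signal.signal(signal.SIGALRM, previous)     # restore the original handler
    return was_pending, events

was_pending, events = demo()
print(was_pending, events)
```

The handler only ever runs after "section done" is recorded, so it can never interrupt the protected code. The trade-off is that whatever Xorg uses SIGALRM for is delayed for as long as the section runs, which is presumably why moving the wait off the main thread is the other option.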

**What I’m hoping someone here can help with:**

- Is there a better path to get NVIDIA engineering to actually look at this — a Launchpad bug against the Ubuntu driver packaging, a contact through the Ubuntu kernel team, a mailing list? The developer forum feels like a dead end.

- Are there standard additional diagnostics I should be collecting? I’m thinking `sudo gcore <xorg_pid>` during the freeze (a core dump NVIDIA could load with their internal symbols) and `sudo nvidia-debugdump -d` for GPU state. Missing anything obvious?

- Has anyone seen this pattern before with NVIDIA bugs, or know of similar reports that got resolved?