Ubuntu 24.04.2 and Kernel 6.8.0-51 and Nvidia Drivers

Ubuntu 24.04.2 LTS,

System Details Report


Report details

  • Date generated: 2025-03-01 11:47:25

Hardware Information:

  • Hardware Model: ASRock X570 Steel Legend WiFi ax
  • Memory: 64.0 GiB
  • Processor: AMD Ryzen™ 9 3900X × 24
  • Graphics: Software Rendering
  • Disk Capacity: 2.0 TB

Software Information:

  • Firmware Version: P5.60
  • OS Name: Ubuntu 24.04.2 LTS
  • OS Build: (null)
  • OS Type: 64-bit
  • GNOME Version: 46
  • Windowing System: Wayland
  • Kernel Version: Linux 6.8.0-51-generic

My Nvidia drivers apparently do not load, which leaves my two Nvidia P106-100 cards worthless. If I boot kernel 6.11.0-17, the drivers load. However, on this machine kernels 6.8.0-52 and 6.11.0-17 usually restart the system sometime between 23:00 and 02:00. There are no errors in the journal, just a fresh boot.

So I booted 6.8.0-51 (it runs until I restart it or the power fails) and was surprised to see that I was using linux-modules-nvidia-535-server-6.11.0-17-generic and linux-modules-nvidia-535-server-generic-hwe-24.04. Maybe the latter is okay since I am on Ubuntu 24.04.2, but I thought the other package, being built for kernel 6.11.0-17, was definitely an issue. I removed the Nvidia modules and then reinstalled the Nvidia 535 driver (exact version uncertain). After a reboot I ran "apt list --installed 'linux-modules-nvidia*'" and got:
linux-modules-nvidia-535-server-6.11.0-17-generic/noble-updates,now 6.11.0-17.17~24.04.2+1 amd64 [installed,automatic]
linux-modules-nvidia-535-server-generic-hwe-24.04/noble-updates,now 6.11.0-17.17~24.04.2+1 amd64 [installed]
with no drivers loaded and no functioning Nvidia GPUs. I use the GPUs to run BOINC. I am annoyed that the machine reboots without any errors on the newer kernels, but above all I want my GPUs to be usable. How can I get the proper drivers to install? Thanks
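
The mismatch described above (module packages built for 6.11.0-17 while the machine runs 6.8.0-51) can be checked mechanically. This is a minimal sketch, assuming the naming seen in the apt output above, where a prebuilt module package name ends in the kernel release it was built for. The function name has_modules_for_kernel is hypothetical, and the sample text is copied from the output in this post.

```shell
# Hypothetical helper: read "apt list --installed" lines on stdin and
# succeed if any linux-modules-nvidia package targets the given kernel.
has_modules_for_kernel() {
    hm_kernel="$1"
    hm_found=1
    while IFS= read -r hm_line; do
        hm_pkg="${hm_line%%/*}"   # drop the "/noble-updates,now ..." suffix
        case "$hm_pkg" in
            linux-modules-nvidia-*-"$hm_kernel") hm_found=0 ;;
        esac
    done
    return "$hm_found"
}

# Sample input: the two lines reported above.
sample='linux-modules-nvidia-535-server-6.11.0-17-generic/noble-updates,now 6.11.0-17.17~24.04.2+1 amd64 [installed,automatic]
linux-modules-nvidia-535-server-generic-hwe-24.04/noble-updates,now 6.11.0-17.17~24.04.2+1 amd64 [installed]'

for k in 6.8.0-51-generic 6.11.0-17-generic; do
    if printf '%s\n' "$sample" | has_modules_for_kernel "$k"; then
        echo "$k: prebuilt nvidia modules installed"
    else
        echo "$k: no prebuilt nvidia modules"
    fi
done
```

On a live system the same check could be fed from the real package list, for example: apt list --installed 'linux-modules-nvidia*' 2>/dev/null | has_modules_for_kernel "$(uname -r)". A negative result for the running kernel would be consistent with the driver not loading there, unless a DKMS build covers that kernel instead.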

Why did you install the -server- driver in the first place? Did you plan to train AI models?

Try the ubuntu-drivers command, it should check your hardware and list the available drivers for the card…

I have 2 Nvidia P106-100s that have no video output. I have them installed in the fastest slots I have. I also have a Radeon HD6450 installed in an x1 slot that Ubuntu will use for video output as long as I have the DKMS or server drivers on the Nvidia cards.

I used ubuntu-drivers install --gpgpu. It uses a driver that the Ubuntu GUI app lists as manually installed.

I selected nvidia-driver-535-server when I ran ubuntu-drivers.

When the system boots I am now seeing some Nvidia logs.
Mar 01 12:27:41 kernel: nvidia 0000:03:00.0: enabling device (0000 -> 0002)
NVRM: nvidia.ko because it does not include the required GPU
NVRM: www.nvidia.com.
Mar 01 12:27:41 kernel: nvidia 0000:03:00.0: enabling device (0000 -> 0002)
NVRM: nvidia.ko because it does not include the required GPU
NVRM: www.nvidia.com.
Mar 01 12:27:41 kernel: nvidia: loading out-of-tree module taints kernel.
Mar 01 12:27:41 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 511
Mar 01 12:27:41 kernel: nvidia 0000:03:00.0: enabling device (0000 -> 0002)
NVRM: nvidia.ko because it does not include the required GPU
NVRM: www.nvidia.com.
Mar 01 12:27:41 kernel: nvidia: probe of 0000:03:00.0 failed with error -1
Mar 01 12:27:41 kernel: nvidia 0000:0f:00.0: enabling device (0000 -> 0002)
NVRM: nvidia.ko because it does not include the required GPU
NVRM: www.nvidia.com.
Mar 01 12:27:41 kernel: nvidia: probe of 0000:0f:00.0 failed with error -1
Mar 01 12:27:41 kernel: nvidia-nvlink: Unregistered Nvlink Core, major device number 511
Mar 01 12:27:41 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 511
NVRM: nvidia.ko because it does not include the required GPU
NVRM: www.nvidia.com.
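
The "probe of ... failed" lines identify which PCI devices the nvidia module refused to bind to. As a rough sketch, the addresses can be pulled out of the kernel log with a small filter. The function name failed_nvidia_probes is hypothetical, and the sample lines are copied from the log above.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the PCI
# address of every device for which the nvidia probe failed (deduplicated).
failed_nvidia_probes() {
    sed -n 's/.*nvidia: probe of \([0-9a-fA-F:.]*\) failed with error.*/\1/p' | sort -u
}

# Demo on two lines copied from the log above.
sample_log='Mar 01 12:27:41 kernel: nvidia: probe of 0000:03:00.0 failed with error -1
Mar 01 12:27:41 kernel: nvidia: probe of 0000:0f:00.0 failed with error -1'

printf '%s\n' "$sample_log" | failed_nvidia_probes
```

On a live system the input would typically come from journalctl -k -b, and each printed address can then be looked up with lspci -s to confirm that it is one of the P106-100 cards.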

According to https://www.techpowerup.com/gpu-specs/p106-100.c2980

This device has no display connectivity, as it is not designed to have monitors connected to it.

BOINC is a framework for sharing compute/(gpu) cycles for calculations.

Note the first (top-rated) comment on
https://www.reddit.com/r/MachineLearning/comments/a71hw8/d_can_mining_only_gpus_such_as_the_p106100_be/

The PCIe bus frequencies may be mismatched with your other video card:
"Since the older GPU won’t be able to match the new frequency, it will just drop off the system at the firmware level (no longer OS-visible). In short, you want to have all GPUs in your system be the same or, at least, verify that the GPUs can all run on the same PCIe bus frequency." (claytonkb)


Thanks, but as I think I mentioned, this machine has run in this configuration for several months with the same GPU cards. I also mentioned that the GPUs have no video output. I did get it working with the following steps:

sudo apt purge 'nvidia*'
ubuntu-drivers --gpgpu list
ubuntu-drivers --gpgpu install nvidia-driver-470-server
reboot : no GPU activity
sudo apt update
sudo apt upgrade
sudo apt autoremove
nvidia-smi : command not found, with a hint that it can be installed
sudo apt install nvidia-utils-470-server : got a message that I should install nvidia-driver-470-server
nvidia-smi : could not locate the Nvidia drivers
sudo apt install nvidia-driver-470-server

The install process did some builds, and I saw the messages “Building initial module for 6.8.0-51-generic”
and “Building initial module for 6.11.0-17-generic”.
I don’t recall seeing messages like this when I used ubuntu-drivers or Software & Updates -> Additional Drivers.

reboot
and my GPUs were working again.

Thanks for your help!
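
For anyone repeating these steps, one way to confirm after the reboot that the module really loaded under the running kernel is to look for it in /proc/modules. This is only a sketch: the function name nvidia_module_loaded is hypothetical, and it assumes the kernel module is named "nvidia", as in the log lines earlier in this thread.

```shell
# Hypothetical helper: succeed if a module named exactly "nvidia" appears
# in a /proc/modules-style listing (the file defaults to /proc/modules).
nvidia_module_loaded() {
    awk '$1 == "nvidia" { found = 1 } END { exit !found }' "${1:-/proc/modules}"
}

if nvidia_module_loaded; then
    echo "nvidia module is loaded for kernel $(uname -r)"
else
    echo "nvidia module is NOT loaded for kernel $(uname -r)"
fi
```

If the module is reported as loaded, nvidia-smi should then be able to find the driver; if it is not, the DKMS build for that particular kernel is the first thing worth checking.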