Multiple Ubuntu drives in server - boots into emergency mode

I have been running Ubuntu on my Dell PowerEdge T30 server for many years, but it's got to the point where I need to do a clean install.

The original Ubuntu 24.04 is running on an SSD (1TB). I intended to temporarily keep this drive whilst installing a fresh version of Ubuntu 24.04 on another SSD (2TB).

Once the fresh version of Ubuntu 24.04 and all the other apps have been installed on the new SSD (2TB), I will remove the SSD (1TB) running the older Ubuntu 24.04.

I had planned to switch between the old drive and the new drive by selecting the boot drive at start up by pressing the F12 key.

I fitted the new SSD (2TB) and installed Ubuntu 24.04 on it. I then tried switching to the old SSD (1TB) at boot up, but I now get strange warnings about booting into emergency mode.

I have to press CTRL-D several times to continue until it finally boots into Ubuntu Desktop.

Why has this happened? Is it not possible to have multiple Ubuntu drives in a server at the same time and switch between the drives at boot up?

Below are pictures of the boot up (ignore the warning about not being able to find the drive ‘spare’, as I have temporarily disconnected it).



If you only have one ESP (EFI system partition), you will only have one Ubuntu entry in UEFI, since all installs use the same name/label. That grub can then boot all the other installs on the system, if it is updated to find them.

If each drive has an ESP (which it should), then you may have two Ubuntu entries in UEFI, but with different PARTUUIDs in each entry.

To see the UEFI boot entries: older versions of efibootmgr needed the -v parameter to see details, while newer versions give too much info with -v. You want to see the PARTUUID.

sudo efibootmgr -v
lsblk -e 7 -o name,fstype,size,fsused,label,partlabel,mountpoint,uuid,partuuid
Also, fstab will have a mount entry for the ESP. That should be the ESP on the same drive, but it is referenced by UUID.
cat /etc/fstab
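For reference, the ESP line in fstab normally looks something like this (a sketch only; the UUID below is a made-up placeholder, so compare it against the UUID of the ESP shown by lsblk above):

# hypothetical example of an ESP mount entry in /etc/fstab
UUID=1234-ABCD  /boot/efi  vfat  umask=0077  0  1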

While I normally add boot entries for the other installs into 40_custom and turn off os-prober, I do change the name/label of the entries, so I know which install and which drive I am booting from (a sketch of such a 40_custom entry follows the GRUB_DISTRIBUTOR example below).
You can change this in /etc/default/grub by commenting out the current line and adding your own:
# My default NVMe drive:
GRUB_DISTRIBUTOR="noble"
# My test install on sda:
GRUB_DISTRIBUTOR="noble-a"
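A 40_custom entry for booting another install usually looks something like this (a sketch only; the title and the UUID are made-up placeholders for the root partition of the other install). After editing /etc/grub.d/40_custom or /etc/default/grub, run sudo update-grub to rebuild the menu.

# hypothetical entry in /etc/grub.d/40_custom
menuentry "Ubuntu 24.04 on the old 1TB SSD" {
    search --fs-uuid --set=root 11111111-2222-3333-4444-555555555555
    configfile /boot/grub/grub.cfg
}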

But all installs in one ESP actually use /EFI/ubuntu/grub.cfg to boot. That path is hard-coded, supposedly for Secure Boot reasons.


However you boot the install media, UEFI or old BIOS mode, is how it then installs. If you boot in BIOS mode on a GPT drive, it adds a bios_grub partition, which you now seem to have.
A UEFI install would instead add an ESP (EFI system partition), which is FAT32 with the boot and esp flags.
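A quick way to see which of the two you ended up with is to print the partition table and look at the flags (substitute your own drive for /dev/sda; this is just a generic check, not specific to your hardware):

sudo parted /dev/sda print    # a bios_grub partition means a BIOS-mode install; a FAT32 partition with boot, esp flags means UEFI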

LVM is most often used by servers and is required for full-drive-encrypted installs. It is an advanced volume-management layer (which replaces partitions), and you have to use LVM tools, not partition tools, to manage it. You have one large volume inside one partition. The volume can then be resized, or you can add volumes for /home, data, or other / test installs, or expand the volume to use the entire partition. Servers often have a smaller / and larger volumes for other folders. It is best for new users to stick with standard partitions unless you have to have full-drive encryption.
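If you do end up with LVM, these are the basic commands for looking at and growing volumes (a sketch only; ubuntu-vg/ubuntu-lv are the installer's default names and may differ on your system, and the resize step assumes an ext4 filesystem):

sudo pvs    # physical volumes (the partitions LVM sits on)
sudo vgs    # volume groups
sudo lvs    # logical volumes (what you actually format and mount)
sudo lvextend -l +100%FREE /dev/ubuntu-vg/ubuntu-lv    # grow the root volume into any free space
sudo resize2fs /dev/ubuntu-vg/ubuntu-lv    # then grow the ext4 filesystem to match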


Thanks oldfred

Any idea why I randomly get the error and how to resolve it?

Does the output of "journalctl -xb" give any clues?

This issue only started when swapping between the old drive (1TB SSD) and the new drive (2TB SSD).

At first I did this by pressing F12 at startup to select the boot drive. Then I resorted to physically disconnecting the drives to swap between them, but I still get the error.

I would prefer to avoid doing a fresh install all over again, but I don’t want to have this random error either.

Rather than the entire log file, it may be better to just see the errors/warnings.
Even on my system I get a page or two, and some are just messages that happen to contain the word error, so they are not actual errors.

sudo grep -E -i "error|warning" /var/log/dmesg
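Since you mentioned journalctl -xb: another way to get a similarly filtered view is to limit the journal for the current boot to warnings and above (just an alternative; it reads from the journal rather than the dmesg log):

journalctl -b -p warning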

Also check the boot mode in the UEFI settings.
On a UEFI system, UEFI normally is the default boot mode, and switching to BIOS mode may or may not always take place correctly. You may be able to set the default to old BIOS mode, if that is what you really want. UEFI tries to boot the first option, then the second if it gets an error on the first, but that switching may not always work, especially if the switch is also between a BIOS and a UEFI boot. It is always best to use UEFI boot on UEFI hardware; only some very early UEFI systems, from 2012 or so, may be better with old BIOS mode.
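A quick way to confirm which mode the currently running install was actually booted in (a generic check, nothing system-specific):

[ -d /sys/firmware/efi ] && echo "booted in UEFI mode" || echo "booted in legacy BIOS mode"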


thanks oldfred

I had to reboot the server after performing some package updates and the same error happened again.

It took several attempts to get it to boot up, so I need to get this fixed somehow.

I ran the command you mentioned and got the following…

sudo grep -E -i "error|warning" /var/log/dmesg
[    0.250987] kernel: acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_ERROR)
[    0.344769] kernel: ERST: Error Record Serialization Table (ERST) support is initialized.
[    0.477102] kernel: RAS: Correctable Errors collector initialized.

What does this mean? Can it be fixed, and if so how?

I still have no idea why this issue occurred.

As shown in my screenshots, the drive is using legacy boot mode (not UEFI).

With legacy boot, grub stores the drive to boot from.
To see which drive grub2 uses, look for the grub-pc/install_devices: line in the output of:
sudo debconf-show grub-pc    # for BIOS boot with grub-pc
It will show the drive by model and serial number.
To see similar drive info:
sudo lshw -C Disk -short
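If the device shown there no longer matches the drive you actually boot from (for example after swapping drives around), one possible fix is to reselect the install device and let grub reinstall itself (a suggestion only, in case the stored device turns out to be wrong):

sudo dpkg-reconfigure grub-pc    # reselect the drive(s) grub-pc should install to
sudo grub-install /dev/sdX    # or install directly; sdX is a placeholder for your boot drive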

Your error report looks like messages that just have the word error in their description, not actual errors, and AE_ERROR normally can be ignored.

The output of "sudo debconf-show grub-pc" is shown below (it means nothing to me!)

sudo debconf-show grub-pc
  grub-efi/install_devices_disks_changed:
  grub-pc/install_devices_failed: false
  grub2/kfreebsd_cmdline:
  grub2/unsigned_kernels_title:
  grub-efi/cloud_style_installation: false
  grub-pc/hidden_timeout: true
  grub2/linux_cmdline_default: quiet splash
  grub-pc/install_devices_failed_upgrade: true
* grub-pc/install_devices: /dev/disk/by-id/ata-Samsung_SSD_870_EVO_2TB_S621NG0R104070Z
  grub-pc/partition_description:
  grub-pc/cloud_style_installation: false
  grub2/update_nvram: true
  grub-pc/timeout: 0
  grub-pc/chainload_from_menu.lst: true
  grub-pc/kopt_extracted: false
  grub2/enable_os_prober: false
  grub-efi/install_devices:
  grub2/kfreebsd_cmdline_default: quiet splash
  grub-efi/partition_description:
  grub-pc/mixed_legacy_and_grub2: true
  grub2/linux_cmdline:
  grub-pc/install_devices_disks_changed:
  grub-efi/install_devices_empty: false
  grub2/no_efi_extra_removable: false
  grub-pc/disk_description:
  grub2/unsigned_kernels:
  grub-pc/postrm_purge_boot_grub: false
  grub-efi/install_devices_failed: false
  grub-pc/install_devices_empty: false

The output of "sudo lshw -C Disk -short" is below…

sudo lshw -C Disk -short
H/W path           Device     Class          Description
========================================================
/0/100/17/0        /dev/sda   disk           4TB WDC WD40EFRX-68N
/0/100/1d/0/0      /dev/sdd   disk           4TB WDC WD40EFRX-68N
/0/100/1d/0/1      /dev/sde   disk           2TB Samsung SSD 870

I have 4 x 4TB HDDs, so I don't know why lshw is only showing two of them?

As shown in my boot BIOS setup, I have 4 x 4TB drives, but none of these is the boot drive (P2, P1, P0 and S0).

The 2TB SSD (S1) is the boot drive and this is at the top of the list, so it should boot from the 2TB SSD before trying to boot from the 4TB drives.

Is it just a matter of disabling the 4TB drives in the boot BIOS?

I don't know whether it's relevant to the issues I am having, but I have connected the new OS drive (2TB SSD) to the SATA5 port via a PCI card. It's not a SATA port on the motherboard (see diagram below).

This is just temporary, and I plan to move it back to the SATA3 port on the motherboard once I put the server back together. This is where the old OS drive (1TB SSD) was originally connected.

This line shows the BIOS drive grub is using:
grub-pc/install_devices: /dev/disk/by-id/ata-Samsung_SSD_870_EVO_2TB_S621NG0R104070Z

If you have NVMe drives, you have to be careful with SATA ports. Most motherboards disable a SATA port if an NVMe drive is used, so check your manual. I do not know for sure, but I would expect the same with your PCI Express adapter.

Drives usually come up in SATA port order, but the UEFI/BIOS may bring up a drive in a different order. Grub defaults to hd0 on a new install, which normally is the lowest SATA port or the first NVMe drive.
