Please test autoinstalls for 20.04!

Hitting a new issue now: does Ubuntu Autoinstall actually support the Curtin syntax for passing options to a mount action within the storage config stanza?

I’ve been trying to use the config: option to provide additional settings for the storage: stanza, but I can’t get the options: to take effect in the installed system. A short example of just a mount stanza that I’m trying to use:

      - id: mount_vartmp
        options: 'nodev,noexec,nosuid'
        type: mount
        path: /var/tmp
        device: fs_vartmp

The additional options I provide aren’t used when I run mount on a running system, and they’re also not copied into /etc/fstab like the Curtin docs suggest they should be:

curtin will ensure the target OS uses the provided mount options by updating the target OS (/etc/fstab).

I notice that the Autoinstall Reference specifies:

For full flexibility, the installer allows storage configuration to be done using a syntax which is a superset of that supported by curtin, described at https://curtin.readthedocs.io/en/latest/topics/storage.html.

But the Autoinstall Reference doesn’t describe exactly which of the Curtin options are/aren’t available.

Looking at the Autoinstall Schema doesn’t provide much help either as the schema isn’t very specific about the storage: options that can be used.

Yes, but only in fairly new versions: subiquity 21.04.1 or newer. Ubuntu 21.04 is the only released version that ships it, but if you take a focal daily, or upgrade the snap during the install, it should work. (The upgrade isn’t offered by default to users of 20.04.2 media because it crashes interactively, but if you set refresh-installer with update: yes and channel: stable in an autoinstall it should work.) Or wait a couple of weeks for 20.04.3.

This worked – thanks!

For anyone else curious, I added:

  refresh-installer:
    update: true
    channel: 'stable'

to the autoinstall file.

It’s worth noting that the installer seems to be very picky about the mount options being a quoted string, and refuses to go any further if they aren’t quoted. Just something for people to be aware of.
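For example, this is the form that gets accepted (the same stanza as in my first post, with the options value kept as a quoted string):

      - id: mount_vartmp
        type: mount
        path: /var/tmp
        device: fs_vartmp
        options: 'nodev,noexec,nosuid'  # must stay a quoted string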

I’ve been able to run ansible-playbook successfully. (Installed via python3-pip, ansible 2.9.x.) What error are you seeing? Try a simple:

- curtin in-target --target=/target -- /bin/sh -c -- 'ANSIBLE_LOG_PATH=/var/log/ansible.log ansible -m ping -o -a data=SMOKETEST localhost'

in late-commands.

I don’t want to wait on autoupdates because I don’t have access to anything in the outside world. I’m sitting in a DMZ and can only see my local subnet, so everything I need has to be on the ISO that’s doing the install. So I need the network up but no autoupdates. Security updates will come later as a separate process for all servers.

Update: so yes, it would be nice to have an option to tell autoinstall not to even try doing the autoupdates.

Did you ever figure out if this is problematic? I need the system to either power down or pause on keypress before it reboots.

I have tried read -t, which throws an error (anyone know why?), and I have been hesitant to force a power-down until I was confident there weren’t any side effects.
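My current guess (untested) is that late-commands run under /bin/sh, which is dash on the live system, and dash’s read builtin doesn’t support -t, so wrapping the read in bash and pointing it at the console might behave better. Something like this is what I plan to try; the /dev/tty1 device and the timeout are assumptions on my part:

  late-commands:
    # untested sketch: prompt on the console, wait up to 10 minutes, then power off
    - echo 'Install finished; press Enter (or wait 10 minutes) to power off' > /dev/tty1
    # bash supports read -t; dash (the default /bin/sh) does not
    - bash -c 'read -t 600 answer < /dev/tty1' || true
    - poweroff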

Hi there,

Longtime user of preseed so I’m still getting the hang of the new installation tooling, and I have run into a problem for which I am having no luck finding a solution.

Specifically, right after the ramdisk loads and before cloud-init itself starts, there’s a step to request an address via DHCP, and DHCPDISCOVER packets are sent out on all connected interfaces.

I need some help sorting out how to restrict this to a specific interface (selected by MAC or logical name), as we have scores of servers with multiple interfaces and if the wrong NIC is used to bootstrap, the automated install will fail.

Historically the preseed keyword interface (and the netcfg attribute family in general) could be used to control which NIC was used, but that doesn’t seem to work anymore.

I could use a hand identifying which option(s) I need to use, and suitable values for them.

Here’s my iPXE boot stanza:

imgargs vmlinuz ip=dhcp ipv6.disable=1 interface=eth0 biosdevname=0 net.ifnames=0 root=/dev/ram0 ramdisk_size=1800000 url=http://installserver/images/ubuntu20/ubuntu-20.04.3-live-server-amd64.iso ds=nocloud-net;s=http://installserver/cloud-init/ fsck.mode=skip ro autoinstall initrd=initrd ||
boot ||

and yeah, I’m still forcing the kernel to use legacy interface naming, but due to the sheer quantity and variety of hardware in our data centers, using systemd’s logical naming convention instead is a total non-starter.

Again, any insights would be very welcome!

thanks,
Klaus

Thank you very much! Where’s the documentation on this?

Hi, I am also facing the same issue (systemd-cat curtin python3 exited with error). Can you please help me?

I am also facing this error. Can anyone please help me? @mwhudson

Hi. I am creating some Packer provisioning profiles for Ubuntu and having a hard time getting the autoinstaller to create additional users.

#cloud-config
autoinstall:
  identity:
    hostname: template
    username: username1
    password: $6$HK2Mp6GHbIJqBvrD$.4ClLOTwVO1LEaq4mGJ3FWmJVMJ3uS71UFSqD5ynJVqK9kutaEQLV4YfBYCnXJB3z/lA9X.fOGwFs0SPigp6p.
  user-data:
    users:
      - username: username2
        password: $6$oeVvT2UdIsWfSZ6o$m./hIcEbHN.jaw9D4IqXtPamHcOCvb0uaMuVhlKuNSUwHt4/dzhGIod0hB4xj4EDyqHE8wn06n/ra5.IdJma//
        groups: [adm, cdrom, dip, plugdev, lxd, sudo]

Username1 gets created and works, but username2 doesn’t. What am I doing wrong? Having looked at some more docs, I’ve now also tried “passwd:” instead of password and “name” instead of username, as well as lock-passwd: false, but the user simply isn’t getting created at all.

EDIT: solved by removing the identity section altogether and defining all users under user-data. Mind you, you will want lock-passwd: false, and you’ll want to deliberately set Bash as the shell unless you really do want sh.
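A rough sketch of the shape that ended up working for me, using cloud-init’s documented user keys (note it spells lock_passwd with an underscore). The hashes are placeholders, and putting hostname under user-data is my own addition now that identity is gone:

#cloud-config
autoinstall:
  user-data:
    hostname: template
    users:
      - name: username1
        passwd: '$6$<hash>'   # full hashed password (placeholder)
        lock_passwd: false
        shell: /bin/bash
        groups: [adm, cdrom, dip, plugdev, lxd, sudo]
      - name: username2
        passwd: '$6$<hash>'
        lock_passwd: false
        shell: /bin/bash
        groups: [adm, cdrom, dip, plugdev, lxd, sudo]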

So for anyone else who might have the same initial problem I did, it looks like adding BOOTIF=01-<primary interface MAC address> to the kernel boot args is the secret ingredient for limiting DHCP broadcasts to a specific interface … which sort of figures, I guess the old ways are still sometimes the best ways.
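To make that concrete, here’s the boot stanza from my earlier post with BOOTIF appended (the MAC is a made-up example: 01- followed by the NIC’s MAC address, lower-case and hyphen-separated). If your iPXE build supports the hexhyp formatter, BOOTIF=01-${netX/mac:hexhyp} may fill it in automatically, though I haven’t verified that:

imgargs vmlinuz ip=dhcp ipv6.disable=1 interface=eth0 biosdevname=0 net.ifnames=0 root=/dev/ram0 ramdisk_size=1800000 url=http://installserver/images/ubuntu20/ubuntu-20.04.3-live-server-amd64.iso ds=nocloud-net;s=http://installserver/cloud-init/ fsck.mode=skip ro autoinstall BOOTIF=01-aa-bb-cc-dd-ee-ff initrd=initrd ||
boot ||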


What’s the right way to deal with LVM metadata left behind from a previous install?

I use a VG called “vg0”, and if I recreate the OS disk as an early-command (by destroying and recreating the underlying RAID virtual disk), curtin fails because it still thinks “/dev/vg0” exists.

Obviously it doesn’t – because creating the RAID virtual disk destroys all the LVM metadata – but I’m not sure how to tell curtin to refresh its LVM state info.

If I re-run the install a second time – because there’s no stale LVM metadata when rebooting after this failure – there’s no trouble.

Anyone else run into this problem?
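For what it’s worth, the next thing I plan to try is clearing the stale LVM/device-mapper state in early-commands before curtin probes the disks. This is an untested sketch, and vg0 is just my VG name:

  early-commands:
    # untested: deactivate the VG the kernel still believes exists,
    # drop its device-mapper entries, then rescan physical volumes
    - vgchange -an vg0 || true
    - dmsetup remove_all || true
    - pvscan --cache || true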

I have this exact issue, as well.

All of our existing tooling expects more generic patterns like disk mounting by label (which we use extensively with our Hadoop clusters, among other setups.)

This limitation really puts a huge kink in things.

If anyone was looking for a code snippet to make the new /etc/fstab a bit friendlier-looking, I put this together for my own environment:

perl -e '
    use strict;
    use warnings;
    use Cwd qw(abs_path);

    # map resolved device paths back to their /dev/mapper names (LVM volumes)
    my %lvms;
    opendir(my $dh, "/dev/mapper");
    while (readdir $dh) {
        next if ($_ !~ /^vg/);
        $lvms{abs_path("/dev/mapper/$_")} = $_;
    }
    closedir $dh;

    while (<STDIN>) {
        # entries whose source device is a /dev path
        if ($_ =~ m#^/dev#) {
            my @entry = split;
            # plain block devices: rewrite /dev/disk/by-uuid/... as UUID=...
            if ($entry[0] =~ m#/by-uuid/#) {
                my $blkdev = abs_path($entry[0]);
                my $uuid = qx(/sbin/blkid -o value -s UUID $blkdev);
                chomp($uuid);
                $entry[0] = "UUID=$uuid";
            # logical volumes: rewrite /dev/disk/by-id/... as /dev/mapper/...
            } elsif ($entry[0] =~ m#/by-id/#) {
                my $blkdev = abs_path($entry[0]);
                $entry[0] = "/dev/mapper/$lvms{$blkdev}";
            }
            print join(" ", @entry) . "\n";
            next;
        # everything else passes through untouched
        } else {
            print;
        }
    }
' < /target/etc/fstab > /tmp/fstab.$$
if [ -s /tmp/fstab.$$ ]; then
    mv /target/etc/fstab /target/etc/fstab-$(date +%F)
    mv /tmp/fstab.$$ /target/etc/fstab
fi

My perl’s a little rusty so the syntax etc. isn’t exactly best practice.

I’m pretty impressed so far with the subiquity autoinstaller. Here’s an example of an autoinstall config I’ve been playing with, for a server I plan to further configure using Ansible. (Note: I don’t bother with root-on-ZFS; ZFS is just for my data.)

#cloud-config
autoinstall:
  version: 1
  locale: en_US.UTF-8
  refresh-installer: # Check for updated installer
    update: yes
  storage:
    #   UEFI system partition
    # + boot partition
    # + swap partition (instead of swap file)
    # + root volume (via LVM)
    # + ZFS intent-log placeholder (optional)
    # + ZFS L2ARC placeholder (optional)
    # + ZFS pool (optional)
    config:
      - type: disk
        match:      # select largest ssd
          ssd: yes
          size: largest
        id: ssd0    # ...and call it ssd0
        ptable: gpt # use gpt partitions on ssd0
        path: /dev/nvme0n1 # device's common /dev name
        wipe: superblock
      - type: partition # create partitions on ssd0
        device: ssd0
        number: 1
        id: uefi-partition
        size: 512M
        flag: boot  # efi system partition needs boot flag
        grub_device: true # ...and must be the grub device
      - type: partition
        device: ssd0
        number: 2
        id: boot-partition
        size: 512M
        flag: linux
      - type: partition
        device: ssd0
        number: 3
        id: swap-partition
        size: 128G
        flag: swap
      - type: partition # optional
        device: ssd0
        number: 5
        id: zil-partition
        size: 64G
        flag: linux
      - type: partition # optional
        device: ssd0
        number: 6
        id: l2arc-partition
        size: 256G
        flag: linux
      - type: partition
        device: ssd0
        number: 4
        id: volume-group-partition
        size: -1
        flag: linux
      - type: lvm_volgroup
        id: volgrp0
        name: volgrp0
        devices:
          - volume-group-partition
      - type: lvm_partition
        volgroup: volgrp0
        id: root-volume
        name: root-volume
        size: -1
      - type: format # format partitions on ssd0
        id: uefi-format
        volume: uefi-partition
        fstype: fat32 # ESP gets FAT32
        label: ESP
      - type: format
        id: boot-format
        volume: boot-partition
        fstype: ext2 # /boot gets ext2, ext3, ext4
        label: BOOT
      - type: format
        id: swap-format
        volume: swap-partition
        fstype: swap # swap
        label: SWAP
        flag: swap
      - type: format
        id: root-format
        volume: root-volume
        fstype: xfs # / (root) gets ext4, xfs, btrfs
        label: ROOT
      - type: mount # mount formatted partitions from ssd0
        id: root-mount # / (root) gets mounted first
        device: root-format
        path: /
      - type: mount
        id: boot-mount # /boot gets mounted next
        device: boot-format
        path: /boot
      - type: mount
        id: uefi-mount # /boot/efi gets mounted next
        device: uefi-format
        path: /boot/efi
      - type: mount
        id: swap-mount
        device: swap-format
        fstype: swap
        path: none
      - type: zpool
        id: zfs-pool
        pool: folio
        vdevs: # replace hdd/ssd devices with real device/partition ids
          - 'hdd0' # single disk vdev
          # - 'mirror hdd0 hdd1' # mirror vdev
          # - 'raidz2 hdd0 hdd1 hdd2 hdd3 hdd4 hdd5' # raidz2 vdev
          # - 'log ssd0-part5' # zil vdev (see partition scheme, above)
          # - 'cache ssd0-part6' # l2arc vdev (see partition scheme, above)
        mountpoint: /var/local
        pool_properties:
          # virtually all of these are optional, but this configuration works for my pool
          - ashift: '12' # use 4k sectors (as most modern disks do)
          - autoexpand: 'on' # useful with raid1, raid10, raidz1/2/3
        fs_properties:
          - acltype: 'posixacl' # use posix-style access control lists
          - xattr: 'sa' # better performance of extended attributes (less compatible with older zfs versions)
          - compression: 'zstd' # compress filesystem using zstd (less compatible with older zfs versions)
          - dnodesize: 'auto' # allow zfs to manage dnodesize itself
          - relatime: 'on' # fewer writes updating file access time
          - normalization: 'formD' # normalize filenames to utf-8...
          - utf8only: 'on'         # ...and enforce utf-8 filenames
          - 'com.sun:auto-snapshot': 'true' # enable zfs-auto-snapshot for pool
          - 'com.sun:auto-snapshot:monthly': 'true' # or weekly, daily, hourly, frequent
    swap:
      size: 0
  identity:
    hostname: 'xray'
    username: 'marvin'
    password: '$6$<snip>'
  ssh:
    install-server: true
    allow-pw: false
    authorized-keys:
      - 'ssh-rsa AAA<snip>'
  packages:
    - build-essential
    - git
    - python3-pip # necessary for ansible to work?
    - tasksel
    - zfsutils-linux

Unfortunately, it seems to ignore the type: zpool section entirely. Anyone have any ideas why? I get that ZFS support in curtin is experimental, but where in the logs can I find out what happened? I’d like to test each of the options commented out below the single-disk vdev.
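My plan for debugging is to dig through whatever curtin and subiquity leave under /var/log/installer in the live session before rebooting; that location is an assumption on my part, so the paths may need adjusting:

# assumption: installer logs live under /var/log/installer in the live session
ls -l /var/log/installer/
grep -ri zpool /var/log/installer/ || true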

Other than that, it works pretty well.

I tried to do the same, but without success!

Is there a way to have subiquity check the hash or signature of the user-data? I think this would be more secure, protecting against a tampered-with file server.

I have trouble setting a root password. Even if I set ‘disable_root: false’ in ‘user-data’, ‘password: “whatever”’ in ‘identity’ does not set a root password. So far I do a sed on ‘/etc/shadow’ in ‘late-commands’ to set one.
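For reference, the late-commands workaround I mean looks roughly like this; the hash is a placeholder (a real one can be generated with, e.g., openssl passwd -6):

  late-commands:
    # replace $6$<hash> with the full hashed password
    - sed -i 's|^root:[^:]*:|root:$6$<hash>:|' /target/etc/shadow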

Furthermore, there seems to be no way to select a disk for partitioning by addressing it as ‘/dev/disk/by-path/whatever’. This is a problem if you are setting up an s390x machine with QEMU to reboot it later as an LPAR, or if you want to switch/boot the VM later on z/VM.

Am I missing something?

There is some confusion in the documentation on this. You have to set “grub_device: true” on the partition for EFI, and on the disk for BIOS, because the installer (Subiquity) has this somewhat surprising logic:
https://github.com/canonical/subiquity/blob/a4454c9153f712b4e2eaa889899fe584477a5a9c/subiquity/models/filesystem.py#L1426
This function must return false; otherwise the exception “autoinstall config did not create needed bootloader partition” is raised. See the line below:
https://github.com/canonical/subiquity/blob/a4454c9153f712b4e2eaa889899fe584477a5a9c/subiquity/server/controllers/filesystem.py#L121
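To illustrate with a minimal sketch (not a complete config; device names and sizes are placeholders, and the BIOS variant is shown commented out as the alternative):

  storage:
    config:
      # UEFI: grub_device goes on the ESP partition
      - {type: disk, id: disk0, path: /dev/sda, ptable: gpt, wipe: superblock, preserve: false}
      - {type: partition, id: esp, device: disk0, size: 512M, number: 1, flag: boot, grub_device: true, preserve: false}
      # BIOS instead: grub_device goes on the disk itself, with a bios_grub partition when using GPT
      # - {type: disk, id: disk0, path: /dev/sda, ptable: gpt, wipe: superblock, preserve: false, grub_device: true}
      # - {type: partition, id: bios-grub, device: disk0, size: 1M, number: 1, flag: bios_grub, preserve: false}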


I’ll give an example of what works and what doesn’t.
This will work:

  storage:
    config:
    - {ptable: vtoc, path: /dev/vda, wipe: superblock-recursive, preserve: false,
      name: 'boot', grub_device: false, type: disk, id: disk-foo}
    - {device: disk-foo, size: 512M, wipe: superblock, flag: '', number: 1, preserve: false,
      grub_device: false, type: partition, id: partition-0}

This will not:

  storage:
    config:
    - {ptable: vtoc, path: /dev/disk/by-path/ccw-0.0.XXXX, wipe: superblock-recursive, preserve: false,
      name: 'boot', grub_device: false, type: disk, id: disk-foo}
    - {device: disk-foo, size: 512M, wipe: superblock, flag: '', number: 1, preserve: false,
      grub_device: false, type: partition, id: partition-0}

Edit:
The working (first) config actually produces the wanted fstab:

[...]
# /boot was on /dev/vda1 during curtin installation
/dev/disk/by-path/ccw-0.0.XXXX-part1 /boot xfs defaults 0 1

So I guess working with /dev/vda is alright. Have to check this with multiple DASDs defined while installing…