Hi everyone, below you will find the updates of the Ubuntu Server team members from the last week. If you are interested in discussing a topic please start a thread in the Server area of this Discourse site.
The last few weeks have been busy with the Ubuntu Eoan freeze dates (https://wiki.ubuntu.com/EoanErmine/ReleaseSchedule). Here are some of my accomplishments since the last post.
QEMU memory barrier not enough for sync primitives for ARM64
LP: #1805256 - qemu-img hangs on high core count ARM system (josh asked)
SUMMARY:
We’ve discovered a race condition in the QEMU AIO loop during qemu-img executions. I confirmed it was a primitive atomicity issue by demonstrating that it did not happen when mutexes protected the affected variables. Upstream QEMU, Marvell and Huawei engineers are working on it (likely a memory alignment/cache line issue):
- https://bugs.launchpad.net/qemu/+bug/1805256 (comments 16 to 19)
- Upstream discussions:
- Paolo asked for input from Jan Glauber (Marvell) and they discussed locking primitive concepts.
- Paolo provided a fix, but Dann Frazier tested it and it did not resolve the issue.
- A Marvell engineer provided feedback about the 2 locks having the same size in memory.
- Changing type (of one) seems to resolve the issue.
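As a rough illustration of the diagnostic approach described in the summary (guard the affected variables with a mutex and check whether the problem goes away), here is a minimal Python sketch. It is not QEMU code: `run_guarded_counter` and its parameters are hypothetical names, and a shared counter stands in for the real AIO state.

```python
import threading


def run_guarded_counter(workers: int, increments: int) -> int:
    """Increment a shared counter from several threads, guarding it with a mutex.

    Hypothetical stand-in for "protect the affected variables with a mutex":
    every read-modify-write of the shared state happens under the lock.
    """
    lock = threading.Lock()
    state = {"count": 0}

    def work() -> None:
        for _ in range(increments):
            with lock:  # the whole read-modify-write is one critical section
                state["count"] += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["count"]


if __name__ == "__main__":
    # With the mutex in place no update can be lost.
    print(run_guarded_counter(workers=4, increments=1000))
```

If the mutex-protected variant behaves correctly while the original lock-free code misbehaves, the lock-free primitive in between becomes the main suspect, which is the reasoning applied to this bug.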
Systemd restarts and HA software stack behavior
systemd-networkd:
https://github.com/systemd/systemd/issues/12050
https://github.com/systemd/systemd/pull/12511
https://github.com/ssahani/systemd/commit/b0fa0b4fd5ba
The following 3 bugs:
There are mainly 2 “fixes” for this issue:
- keepalived (> 2.0.x) is able to recognize systemd-networkd changes and change the cluster status in order to reconfigure managed NICs.
- systemd-networkd implements a new option (KeepConfiguration=) in its .network files to fix not only this behavior but also all HA-related software that manages secondary IPs and/or aliases on NICs managed by systemd-networkd.
- Discussed the best approach with Christian
- Fixing systemd-networkd seems more appropriate (1st attempt)
- Changing keepalived might be more appropriate for SRUs (2nd attempt)
BACKPORTED:
The commits below implement support to "keep configuration":
commit 1e498853a39b46155cb89b5c9e74ecb27aaba3ed
Author: Yu Watanabe <watanabe.yu+github@gmail.com>
Date: Mon Jun 3 01:21:13 2019
test-network: add tests for KeepConfiguration=
commit c98d78d32abba6aadbe89eece7acf0742f59047c
Author: Yu Watanabe <watanabe.yu+github@gmail.com>
Date: Mon Jun 3 03:37:25 2019
man: add documentation about KeepConfiguration
commit db51778f85cb076e9ed1fe7f7e29cc740365c245
Author: Yu Watanabe <watanabe.yu+github@gmail.com>
Date: Mon Jun 3 00:33:13 2019
network: make KeepConfiguration=static drop DHCP addresses and routes
Also, KeepConfiguration=dhcp drops static foreign addresses and routes.
commit 95355a281c06c5970b7355c38b066910c3be4958
Author: Yu Watanabe <watanabe.yu+github@gmail.com>
Date: Mon Jun 3 14:05:26 2019
network: add KeepConfiguration=dhcp-on-stop
The option prevents to drop lease address on stop.
By setting this, we can safely restart networkd.
commit 7da377ef16a2112a673247b39041a180b07e973a
Author: Susant Sahani <ssahani@vmware.com>
Date: Mon Jun 3 00:31:13 2019
networkd: add support to keep configuration
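To make the backported option more concrete, the sketch below renders a configuration fragment that sets KeepConfiguration= in the [Network] section. The helper name `render_keep_configuration` and the example value are assumptions for illustration. The accepted values listed are the ones named in the commits above plus the plain boolean forms.

```python
# Values named in the backported commits, plus boolean forms.
# (systemd accepts other boolean spellings too; only yes/no are listed here.)
ALLOWED_KEEP_VALUES = ("yes", "no", "static", "dhcp", "dhcp-on-stop")


def render_keep_configuration(value: str) -> str:
    """Return a [Network] fragment setting KeepConfiguration= to `value`.

    Hypothetical helper: it only builds the text and does not write any file.
    """
    if value not in ALLOWED_KEEP_VALUES:
        raise ValueError(f"unsupported KeepConfiguration value: {value!r}")
    return f"[Network]\nKeepConfiguration={value}\n"


if __name__ == "__main__":
    # "static" is just an example choice, not a recommendation from the post.
    print(render_keep_configuration("static"), end="")
```

Which value fits a given HA setup depends on whether the addresses managed by the cluster software are static or DHCP-provided, so the value above should be treated as a placeholder.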
Provided a PPA and an MR for the Eoan SRU:
https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1815101
https://code.launchpad.net/~rafaeldtinoco/ubuntu/+source/systemd/+git/systemd/+merge/374027
Asked @xnox for his opinion on making the same SRU for Disco and Bionic.
grub-install issues with Eoan installer
LP: #1838525 - installer - LVM setup fails to install grub on virtio storage
SUMMARY of the problem (or a HUGE TL;DR):
- The installer depends on the “grub-mkdevicemap --no-floppy -m -” command to get the ordering of bootable devices.
- grub-mkdevicemap was dropped upstream and is included in grub2 through a quilt patch.
- grub-mkdevicemap orders everything that is in /dev/disk/by-id/* excluding, in this order, everything containing “-part”, “dm-” and “md-”.
- LVM partitions are added to /dev/disk/by-id, but not the entire disk (as the PV is the partition itself).
- udev creates /dev/disk/by-id depending on 60-persistent-storage.rules:
# virtio-blk
KERNEL=="vd*[!0-9]", ATTRS{serial}=="?*", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/virtio-$env{ID_SERIAL}"
KERNEL=="vd*[0-9]", ATTRS{serial}=="?*", ENV{ID_SERIAL}="$attr{serial}", SYMLINK+="disk/by-id/virtio-$env{ID_SERIAL}-part%n"
So LVM puts ID_SERIAL on LVM partitions, they get added to /dev/disk/by-id, and the installer gets lost when trying to order the devices, because the LVM partition lands in the first position instead of the full disk (for the hd0, hd1, … grub setup).
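The two virtio-blk rules quoted above can be emulated with a few lines of Python to see which symlink each kernel name would get. This is only a sketch of the matching logic: `virtio_symlink` is a hypothetical helper, the serial in the example is made up, and udev itself does the real work.

```python
import re
from fnmatch import fnmatchcase
from typing import Optional


def virtio_symlink(kernel_name: str, serial: str) -> Optional[str]:
    """Mimic the two virtio-blk rules for one device.

    ATTRS{serial}=="?*" only matches a non-empty serial, so a device
    without a serial gets no virtio-* symlink at all.
    """
    if not serial:
        return None
    if fnmatchcase(kernel_name, "vd*[!0-9]"):  # whole disk, e.g. vda
        return f"disk/by-id/virtio-{serial}"
    if fnmatchcase(kernel_name, "vd*[0-9]"):  # partition, e.g. vda1
        number = re.search(r"(\d+)$", kernel_name).group(1)  # udev's %n
        return f"disk/by-id/virtio-{serial}-part{number}"
    return None


if __name__ == "__main__":
    for name in ("vda", "vda1", "sda"):
        print(name, "->", virtio_symlink(name, "EXAMPLE-SERIAL"))
```

Note that a disk without a serial produces no symlink from these rules, which is consistent with the whole disk being absent from /dev/disk/by-id in the failing setup.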
There are 3 alternatives to fix this and I have chosen the one I believe has the smallest potential for any type of regression. Comment #30 describes what caused the regression and these 3 alternatives:
1. Revert this change for the current release, since this rule was added to “make navigation a bit easier using PV UUIDs”, as the commit says. We would worry about installer changes in the next release.
2. Change the logic inside “grub-mkdevicemap.c: make_device_map()->grub_util_iterate_devices()” to ignore all symlinks in /dev/disk/by-id/ matching lvm-pv-uuid-*. We would not have to worry about this in the next release if using debian-installer.
3. Change the grub-installer package/logic. Unfortunately, a few days before the full freeze, I don’t think messing with the installer is a good way to avoid regressions (the regression potential would grow significantly).
I’m choosing (2) because Ubuntu Foundations already faced a similar situation when the grub-mkdevicemap.c file was removed from the grub2 code: they re-added it with a quilt patch, judging that to be the easiest and most maintainable approach. I’m doing something similar, patching the patch that re-creates grub-mkdevicemap.c so that it ignores /dev/disk/by-id/lvm-pv-uuid-* files (as it already does for other symlinks).
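The filtering idea behind the chosen alternative can be sketched in Python as follows. This is not the actual grub-mkdevicemap.c code (which is C): the function name, the sample symlink names and the use of plain sorting for the ordering are assumptions made for illustration, and the exclusions follow the description given in this post.

```python
from typing import Iterable, List

# Exclusions as described in this post; the real C code may match differently.
SKIPPED_SUBSTRINGS = ("-part", "dm-", "md-")
LVM_PV_PREFIX = "lvm-pv-uuid-"


def candidate_links(names: Iterable[str], skip_lvm_pv: bool) -> List[str]:
    """Return the by-id symlink names that would be considered as disks.

    With skip_lvm_pv=False this mirrors the behavior before the fix;
    with skip_lvm_pv=True the lvm-pv-uuid-* symlinks are ignored as well.
    """
    kept = []
    for name in sorted(names):  # assumed ordering, for illustration only
        if any(part in name for part in SKIPPED_SUBSTRINGS):
            continue
        if skip_lvm_pv and name.startswith(LVM_PV_PREFIX):
            continue
        kept.append(name)
    return kept


if __name__ == "__main__":
    # Made-up directory listing of /dev/disk/by-id.
    links = [
        "virtio-disk0",
        "virtio-disk0-part1",
        "lvm-pv-uuid-abc123",
        "dm-name-vg-root",
    ]
    print("before fix:", candidate_links(links, skip_lvm_pv=False))
    print("after fix: ", candidate_links(links, skip_lvm_pv=True))
```

In the sample listing, the LVM PV symlink survives the existing exclusions and sorts ahead of the disk, while the extra prefix check removes it from the candidates.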
With that, I have created the following merge request: https://code.launchpad.net/~rafaeldtinoco/ubuntu/+source/grub2/+git/grub2/+merge/373792
And the following PPA:
https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1838525
MAAS deploy on servers with attached Pendrive
LP: #1833618 - sg3-utils - failing to deploy Ubuntu Disco
Disco - https://code.launchpad.net/~rafaeldtinoco/ubuntu/+source/sg3-utils/+git/sg3-utils/+merge/373439
Bionic - https://code.launchpad.net/~rafaeldtinoco/ubuntu/+source/sg3-utils/+git/sg3-utils/+merge/373440
PPA - https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1833618
The sg3-utils-udev package installed a udev rules file that changed the ID_SERIAL attribute of connected USB block devices. This happened because USB memory sticks are usually SPC-only SCSI devices that do not support VPD pages 0x80 and 0x83. That caused MAAS to misbehave when getting the ID_SERIAL of the connected pendrive.
Simplestreams SRU to Xenial
After analyzing the following bugs:
• LP: #1611987 - simplestreams - [SRU] glance-simplestreams-sync charm doesn't support keystone v3
• LP: #1686437 - simplestreams - can't sync images for keystone v3
• LP: #1719879 - simplestreams - [SRU] swift client needs to use v1 auth prior to ocata
• LP: #1728982 - simplestreams - [SRU] openstack mirror with keystone v3 always imports new images
For the keystone v3 fixes, revno 454 is the minimum we need SRU’d back to Xenial. Bionic 0.1.0~bzr460-0ubuntu1 already has these changes. These two merges are the pertinent changes:
1. Keystone v3 Support - https://is.gd/wq7r6g
2. Fix KSv3 Bugs - https://is.gd/OOEo3G
0.1.0~bzr426-0ubuntu1.3 was uploaded with a fix for LP: #1686437 (can’t sync images for keystone v3). That change contained a regression (LP: #1728982 - openstack mirror with keystone v3 always imports new images - and LP: #1719879 - swift client needs to use v1 auth prior to ocata) and was marked as verification needed.
Work was done to fix that regression. A merge proposal for Xenial is at https://is.gd/7ixQbO, and the PPA at https://is.gd/Boda8J contains a fix for the regression caused by 0.1.0~bzr426-0ubuntu1.3, among other fixes.
Feedback on that PPA was requested and provided by Ed, Chris, Felipe and Billy. Billy found a squashfs issue, which was fixed in 0.1.0~bzr426-0ubuntu1.4~ppa0, also uploaded to the PPA at https://is.gd/Boda8J.
An SRU template is needed in all referenced bugs:
• 428-do-not-require-that-hypervisor_config-be-present.patch (LP: #1578622)
• 433-glance-ignore-inactive-images.patch (LP: #1583276)
• 436-glance-fix-race-conditions.patch (LP: #1584938)
• 450-453-454-keystone-v3-support.patch (LP: #1686437, #1728982, #1719879)
• 455-nova-lxd-support-squashfs-images.patch (LP: #1686086)
Version 0.1.0~bzr426-0ubuntu1.4 is good for an SRU and has already been tested by multiple people.