Mutter microrelease updates: high regression rate

Dear Desktop Team,

I’ve noticed a significant number of regression reports related to mutter SRUs recently, so I looked back at the bug database to see if this was the case and found the following. This calls into question the safety of taking microrelease updates from upstream rather than cherry-picking. There seems to be a high rate of regressions, both those shipped by upstream in microreleases and those caused by interactions between upstream updates and the triple-buffering patch.

An alternative method of doing SRUs for mutter would be to revert to specifically targeted individual fixes rather than taking upstream microrelease updates wholesale.

Before we draw any conclusions, please could you take a look and see what you think? For example, here are some that I selected from recent bugs tagged regression-update where the root cause appears to have been the mutter package:

12/23: LP: #2046360: upstream mutter 45.1
6/23: LP: #2023363: regression in 44.1
8/23: LP: #2030959: regression in 42.9?
8/23: LP: #2030355: reported regression-update but not triaged
7/23: LP: #2026887: introduced in 42.9
7/23: LP: #2026886: introduced between 42.5-0ubuntu1 and 42.9-0ubuntu2, not resolved
6/23: LP: #2023759: introduced between 44.0.2 and 44.1, fixed in 44.2
10/20: LP: #1899206: reported broken in 3.36.2
10/20: LP: #1900906: regressed in 3.36.6

Writing tests seems somewhat problematic, and expecting this from contributors also seems odd…

And since testing is a profession, a pro is wanted, but maybe some occasional independent help would benefit LTS releases…

I agree the quality of what we get from Mutter upstream can be a problem. And the fact that I have to maintain triple buffering as a patch is also a problem because full testing of all possible drivers, platforms, multi-monitor configs and multi-GPU configs isn’t feasible for every update. There have been some regressions in updates and I am hard on myself with those as evidenced in the above bug list. But overall triple buffering has been ready for release for 2 years, which is why we released it 2 years ago. I can’t control the fact that it hasn’t been merged yet, only respond to every discussion that comes up.

Last cycle I spent months fixing regressions caused by the KMS thread feature in Mutter 45, so that 45.0 could be released on time. The KMS thread is not a feature I believe should exist because it is over-complicated and unstable. Its use of realtime scheduling continues to cause random SIGKILLs today. But the main problem with this is that I didn’t get to respond to other queries (including triple buffering) for months while I was fixing those upstream regressions.

There is also some observer bias here. People who are more proactive in bug management are more likely to tag the bugs they see as regressions. Looking through the above list, the late Gunnar Hjalmarsson was very proactive in bug management. And most of the rest were tagged by me, and I also happen to be the top bug triager globally. So I think if more people were proactive about bug management and quality then we’d see a wider spread of regression tags throughout the entire distro. If you look at the set of packages I monitor then you’ll find that Mutter is not the package we should be worrying about.

Overall I would lean toward taking the Mutter updates that upstream gives us, because I am finding that they fix a significant number of issues before our users report them.

At least Ubuntu 24.04 LTS now has an exclusive selling point in triple buffering, which makes it the most performant GNOME platform… :slight_smile:

True but 22.04 had it too.

Thanks for the explanation.

What do you think could be improved in the QA process for these SRUs to reduce the number of regressions? Some new test case more focused on triple buffering perhaps?

Looking back at 2023 I think the main factor was our dependence on getting a new GNOME release out the door at the same time as a new Ubuntu release. And while I’m busy fixing upstream GNOME bugs (not triple buffering but sometimes they do break triple buffering) I don’t have time to focus on Ubuntu-specific things.

We could improve that by not being dependent on a GNOME release that’s still unfinished at the time of us wanting to package it. A more staggered cadence perhaps.

New test cases always help, but that’s an overly simplified assessment of a broader resourcing discussion we can have privately.

On the subject of Ubuntu quality in general, I had a good talk to Mauro recently about boosting community involvement in this area.

Maybe Ubuntu can just move to .6 and .12?