Scope of GNOME MRE

For (1) and (2), maybe what we want is to write our own script to parse the upstream metadata and map it to source package names, and then commit the output (per release) to ubuntu-archive-tools? We don’t expect the list to change within a release, only between releases.

I guess the stretch goal here is for sru-review to grow support for emitting “This package falls under the $PACKAGE MRE; see $URL for details” on the relevant packages…

I’ve now done the script to parse the upstream metadata and output a list of packages covered by the GNOME MRE, and committed the results (16.04, 18.04, 20.04, and 22.04).
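
For anyone curious what the shape of such a script might be, here’s a minimal sketch. To be clear, this is hedged: the `versions`-style input format and the override table below are illustrative assumptions, not the actual script committed to ubuntu-archive-tools.

```python
# Hypothetical sketch: map GNOME upstream module names to Ubuntu source
# package names. The real script in ubuntu-archive-tools may differ.

# Upstream modules whose Ubuntu source package has a different name
# (illustrative entries only).
OVERRIDES = {
    "gtk+": "gtk+3.0",
    "glib": "glib2.0",
}

def parse_versions(text):
    """Parse 'suite:module:version:subdir' records (GNOME versions format)."""
    modules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) >= 2:
            modules.append(fields[1])
    return modules

def to_source_packages(modules):
    """Map module names to source packages, deduplicating the result."""
    return sorted({OVERRIDES.get(m, m) for m in modules})

sample = """\
core:glib:2.74.0:
core:gtk+:3.24.34:
core:gnome-shell:43.0:
apps:gnome-text-editor:43.0:
"""
print(to_source_packages(parse_versions(sample)))
```

The dedup-via-set step is where “newly-duplicated packages” (several modules mapping to one source) would otherwise bite.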

Please give both the script and the lists a once-over; I’m happy with what it generates, but may well have missed something.

Integrating a check for when a package is on this list (or other MRE lists) into the SRU tooling is still a stretch goal :wink:

Update update: I’ve generated the list for 22.10. This required small updates to the script (to handle newly-duplicated packages).


Do we have a diff from previous one? /me lazy

EDIT, here it is:

--- jammy	2022-11-01 17:26:19.000000000 +0100
+++ kinetic	2022-11-01 17:25:42.000000000 +0100
@@ -1,8 +1,6 @@
@@ -31,7 +29,6 @@
@@ -40,7 +37,6 @@
@@ -112,7 +108,6 @@
@@ -126,10 +121,12 @@
@@ -150,12 +147,13 @@

Thanks for working on moving things forward Chris, we will review the script and the list.

On the SRU verification, would it work for the SRU team if the instructions are stored in (where ‘sourcename’ is the name of the Ubuntu source package formatted wiki-style; GnomeTextEditor for gnome-text-editor is an existing example)?

We would create the pages from now on as we SRU packages which don’t have one yet.


I created a category for our current test cases. We commit to keep adding new ones as we prepare SRUs:

raof, thank you for your work on the script! The output matches what I expect.


Having a place to keep test plans would be great - thanks @seb128 and @jbicha!

Just one note: we (the SRU team) can definitely consider them case by case per SRU. But it would be better if each could be reviewed and approved by the SRU team, and if we could note that approval somewhere such that it is invalidated by significant changes. That would give better consistency: if a test plan is already “SRU team approved”, it won’t need to be reviewed again, and you won’t be surprised by requests for changes or amendments when a different SRU team member reviews your upload. But this needs to be done in a way that a future SRU team member can confirm the test plan is the same one that was previously approved.

I don’t mind what mechanism we use, but could we arrange that please? Since there will be a collection, perhaps we could review the first time a particular test plan is used, and then leave some kind of approval note?

I added a section to the bottom of with an idea for how pages could be marked as reviewed. But it’s a decision for the SRU Team how those pages should be marked and reviewed.


What else is needed to complete this discussion?

I’ve refreshed the MRE list for lunar here. In doing so, the script needed additional handling of the libgweather->libgweather4 source transition, and I realised that the previous kinetic list included libgweather, which is not actually in the kinetic archives.
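
The libgweather handling suggests the script now carries per-release rename overrides. A hedged sketch of what that might look like (the table and function names are hypothetical, not the actual implementation):

```python
# Hypothetical per-release rename handling, as the real script might do
# for the libgweather -> libgweather4 source transition.
RENAMES = {
    "kinetic": {"libgweather": "libgweather4"},
    "lunar": {"libgweather": "libgweather4"},
}

def apply_renames(release, packages):
    """Apply any source renames for this release, deduplicating the result."""
    renames = RENAMES.get(release, {})
    return sorted({renames.get(p, p) for p in packages})

print(apply_renames("lunar", ["libgweather", "mutter", "gnome-shell"]))
```

Applying the rename table retroactively to kinetic is what would fix the stale libgweather entry mentioned above.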

So, with that script refresh, the two diffs are:

--- a/kinetic
+++ b/kinetic
@@ -115,7 +115,7 @@ libgnomekbd
--- kinetic     2023-05-24 11:36:05.447903500 +1000
+++ lunar       2023-05-24 11:28:12.252708977 +1000
@@ -109,7 +109,6 @@

Also, I’m still on the hook for some SRU tooling to automatically detect when an SRU is for a package with an MRE.

Hm. I was going to say that I think we’ve not yet satisfied Robbie’s “we can tell that the test plan is the same as the approved one” requirement, but on second thoughts, test plans on the Wiki satisfy that (in the same way that all the other test plans on the wiki do) - if the last edited tags match the review time and reviewer.

So, I think what is needed to complete the discussion here is for us, on the SRU team, to go through the test plans as we process GNOME MRE SRUs and approve them or request changes until we can approve them.

To get that started: a review of the Mutter test plan:
In general, I’d like something more structured than “verify things continue to work well”

As a Wayland compositor, Mutter is particularly hardware-dependent and a critical piece of infrastructure. I think we should call out specific testing on whatever set of platforms we think is appropriate (especially since around half the bugs on this particular mutter SRU are explicitly “mutter does $BAD_THING on $SPECIFIC_HARDWARE”).

  • As a first cut of hardware we explicitly want to support, I suggest testing on:
    • A Mesa driver (if we’re only doing one, i915 is probably the right choice; if we can do two, both i915 and amdgpu)
    • The most recent stable NVIDIA driver series (bonus points: most recent and oldest)
    • RPi? Definitely for the current SRU, as it contains an explicit fix for RPi behaviour; I’m not clear exactly how much the desktop team supports RPi-on-the-desktop, though.

The current SRU should probably have explicit testing on multi-monitor setups. I’d be inclined to request it as a part of the generic testing plan, too.

What other aspects of Mutter are high risk, or low cost to test? Scaling factors != 1.0 & 2.0? Mixed scaling? Video playback? Are gnome-shell extensions tied into Mutter, or is that just gnome-shell/gjs?

It should be fairly quick to exclude any really serious breakage on any given platform: just logging in, bringing up the spread, running a video player, and running glmark2-es2-wayland (on Wayland sessions) should verify that a given platform hasn’t been critically regressed. I would therefore expect the major problem in testing to be access to hardware; what does the Desktop team have access to?
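
If we wanted this smoke test scripted per platform, a rough sketch might look like the following (the wrapper is hypothetical; only glmark2-es2-wayland is named in the post, and the interactive steps — logging in, the spread, video playback — can’t be automated this way):

```python
import subprocess

# Hypothetical wrapper for the quick per-platform smoke test described
# above. Only makes sense to run inside a Wayland session.
SMOKE_TESTS = [
    ["glmark2-es2-wayland"],
]

def run_smoke_tests(commands):
    """Run each command, returning the ones that failed to exit cleanly."""
    failures = []
    for cmd in commands:
        try:
            result = subprocess.run(cmd, capture_output=True, timeout=600)
            if result.returncode != 0:
                failures.append(cmd)
        except (FileNotFoundError, subprocess.TimeoutExpired):
            failures.append(cmd)
    return failures
```

An empty return from `run_smoke_tests(SMOKE_TESTS)` would mean the platform passed the automatable part; the manual checks still need a human.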


And a review of the GNOME Shell test plan.

This looks pretty good to me. Again, I’d prefer something more concrete than “verify that things continue to work well” for Test Case 1, but I understand that it’s pretty easy to test and maybe low-value to precisely specify, so I’m OK with that.

It wasn’t immediately obvious to me where the list of extensions was for Test Case 2; I’ve edited the wiki page to make this more obvious (to me, at least!).


In reviewing a gjs SRU just now, I noticed that it didn’t link to the test plan @jbicha wrote up above, nor was that linked from the main GNOME exception documentation. So I’ve adjusted these.

I’ve also written up my idea of how we’ve agreed test plans and approvals will work. @jbicha please could you review?

On the gjs test plan itself, it looks fine to me but I’m not particularly familiar with the area. @raof please could you review it?

One further thought.

Right now, gjs has 1.76.2 in lunar-proposed, and I’m accepting 1.72.4 into jammy-proposed.

Which one is newer? Could, for example, 1.72.4 contain fixes for bugs that are only fixed in a 1.76.3 that is not yet prepared for Lunar? If so, then we might be introducing fixes into Jammy that users will then face a regression for if they upgrade to Lunar, which is something we usually try to avoid by asking that later stable releases be fixed first (or at the same time).
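
For the narrow version-ordering part of the question (ignoring full Debian version semantics — epochs, revisions, non-numeric components — which a naive comparison doesn’t handle), a throwaway check:

```python
def upstream_tuple(version):
    """Naive numeric ordering for plain dotted upstream versions.

    Good enough for comparing GNOME microreleases like 1.72.4 vs 1.76.2;
    NOT a substitute for dpkg's full version comparison.
    """
    return tuple(int(part) for part in version.split("."))

# 1.76.x is the newer upstream series; the SRU concern is whether 1.72.4
# carries fixes that the 1.76.x in lunar doesn't yet have.
print(upstream_tuple("1.76.2") > upstream_tuple("1.72.4"))
```

So the lunar package is the newer series, and the question reduces to whether any fix landed in 1.72.4 before its 1.76.x counterpart.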

This may or may not be the case specifically right now, but generally, is this kind of situation something we should plan against in how we process these SRUs?

For the gjs test plan, it looks broadly sane.

Again, I’d be a fan of a test plan with more structure than “make sure GNOME Shell still works correctly” - for example, is it worth checking the journal to identify any new warnings generated to check that they’re harmless?
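
If we wanted to make that journal check mechanical, one possible approach (a hypothetical helper, not existing tooling) is to capture the warning lines before and after the update — e.g. from `journalctl --user -b -p warning` — and report only the new ones:

```python
# Hypothetical helper for the journal-check idea: diff warning lines
# captured before and after an update and report only the new ones.
def new_warnings(before, after):
    """Return lines present in 'after' but not in the 'before' baseline."""
    baseline = set(before.splitlines())
    return [line for line in after.splitlines()
            if line and line not in baseline]

before = "gjs: old harmless warning\n"
after = "gjs: old harmless warning\ngjs: reference to undefined property\n"
print(new_warnings(before, after))
```

Each reported line would still need a human to judge whether it’s harmless.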

Would it additionally be worth checking that the supported Shell extensions still work? They obviously have the potential to be affected by any change in the JS environment, but it’s not clear to me how likely they are to hit things that the default Shell doesn’t, so I don’t know if testing them is worthwhile.

FWIW I’ve also marked the GNOME Shell test plan as approved. I still think the Mutter test plan should call out some explicit hardware testing, and there are still some open questions there.

It’s been raised that glib2.0 is a particularly core package, used extensively outside the GNOME ecosystem. As such, does it need either a very stringent test plan, or to be excluded from the GNOME MRE entirely?

That said, I know that historically it has been under the GNOME MRE, and checking the list of regression-update-tagged bugs only finds one case, a CVE fix which was known ahead of time to regress some use cases. This looks like a history of trouble-free microrelease updates, so maybe this is fine? :woman_shrugging:

glib has an extensive test suite, used both in build tests and as installed tests run as autopkgtests. glib also triggers a lot of autopkgtests in other packages, and has a track record of limiting itself to bugfixes in stable point releases. I believe glib satisfies the usual criteria for “new upstream microreleases”. Unfortunately, much of the rest of GNOME doesn’t have automated testing that thorough, so we need a microrelease exception for GNOME.

Hm. I see I wasn’t very clear in my last message - I was pretty happy with glib2.0, but I wanted to check with the rest of the SRU team.

That’s now been done, and the decision is “glib2.0 is used too widely outside of GNOME to fall under the GNOME MRE policy, but glib2.0 is well-tested and has a long history of trouble-free bugfix updates and so is a standard MRE candidate”.

The next action on this is:

  • I’ll review (and, presumably, accept) the current glib2.0 upload
  • I’ll write up a stand-alone glib2.0 MRE test plan page

mutter appears to have an unacceptably high regression rate and I’ve posted about this at Mutter microrelease updates: high regression rate.