Hm. I was going to say that I think we’ve not yet satisfied Robbie’s “we can tell that the test plan is the same as the approved one” requirement, but on second thoughts, test plans on the wiki do satisfy that (in the same way that all the other test plans on the wiki do), provided the last-edited tags match the review time and reviewer.
So, I think what is needed to complete the discussion here is for us on the SRU team to go through the test plans as we process GNOME MRE SRUs, either approving them or requesting changes until we can approve them.
To get that started: a review of the Mutter test plan:
In general, I’d like something more structured than “verify things continue to work well”.
As a Wayland compositor, Mutter is particularly hardware-dependent and a critical piece of infrastructure. I think we should call out specific testing on whatever set of platforms we think is appropriate (especially since around half the bugs on this particular mutter SRU are explicitly “mutter does $BAD_THING on $SPECIFIC_HARDWARE”).
- As a first cut of hardware we explicitly want to support, I suggest testing on:
  - A Mesa driver (if we’re only doing one, i915 is probably the right choice; if we can do two, both i915 and amdgpu)
  - The most recent stable NVIDIA driver series (bonus points: most recent and oldest)
  - RPi? Definitely for the current SRU, as it contains an explicit fix for RPi behaviour; I’m not clear exactly how much the desktop team supports RPi-on-the-desktop, though.
The current SRU should probably have explicit testing on multi-monitor setups. I’d be inclined to request it as a part of the generic testing plan, too.
What other aspects of Mutter are high-risk, or cheap to test? Scaling factors other than 1.0 and 2.0? Mixed scaling? Video playback? Are gnome-shell extensions tied into Mutter, or is that just gnome-shell/gjs?
It should be fairly quick to exclude any really serious breakage on any given platform: logging in, bringing up the spread, running a video player, and running glmark2-es2-wayland (on Wayland sessions) should verify that a given platform hasn’t been critically regressed. I would therefore expect the major problem in testing to be access to hardware; what does the Desktop team have access to?
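For concreteness, here’s a rough sketch of what the scriptable part of such a per-platform smoke check could look like. This is purely illustrative and an assumption on my part, not an agreed plan: glmark2-es2-wayland is the only command named above, and the video-playback command, sample file, and timeouts are placeholders we’d need to agree on.

```python
#!/usr/bin/env python3
"""Rough per-platform smoke check for a Mutter SRU (illustrative sketch only).

Assumes it is run from an already-logged-in Wayland session. The command list,
timeouts, and sample media file are placeholders, not an agreed test plan.
"""
import shutil
import subprocess
import sys

# (name, command, timeout in seconds) -- each command must exit 0 within the timeout.
# Only glmark2-es2-wayland is named in the discussion above; the rest is hypothetical.
CHECKS = [
    ("GL rendering (Wayland)", ["glmark2-es2-wayland"], 600),
    ("Video playback", ["gst-play-1.0", "sample-video.webm"], 120),  # placeholder file
]


def run_check(name, cmd, timeout):
    if shutil.which(cmd[0]) is None:
        print(f"SKIP  {name}: {cmd[0]} not installed")
        return True
    try:
        subprocess.run(cmd, timeout=timeout, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        print(f"FAIL  {name}: did not finish within {timeout}s")
        return False
    except subprocess.CalledProcessError as e:
        print(f"FAIL  {name}: exited with status {e.returncode}")
        return False
    print(f"PASS  {name}")
    return True


if __name__ == "__main__":
    # Run every check (no short-circuiting) so a single report covers the platform.
    results = [run_check(*check) for check in CHECKS]
    sys.exit(0 if all(results) else 1)
```

The genuinely manual steps (logging in, bringing up the spread, multi-monitor checks) would obviously stay manual; the point is just that the automatable part is cheap enough to repeat on every platform we care about.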