Wayland Protocols for Special-Purpose Surfaces

Background

The Wayland ecosystem has settled on XDG Shell as the default window protocol. When a GTK or Qt app creates a Wayland surface, it uses this protocol to talk to the compositor. XDG Shell surfaces are suitable for normal applications (file managers, web browsers, video players, etc.).

Some functionality is deliberately absent from XDG Shell. For example, programs cannot move their surfaces to a specific location on-screen, or even know where the compositor has placed them. Programs also have no control over stacking order. These limitations are in place for a number of reasons, including security and flexibility of compositor design. Occasionally they are inconvenient for normal apps, but the real problem comes in building shell components.

Shell Components

In the Mir team, we use the term “shell components” to refer to all the special elements that make up a desktop environment (aka a desktop shell). These are things like panels, notifications, desktop backgrounds, lock screens, etc. Generally, shell components cannot be correctly implemented with the tools provided by XDG Shell. Some desktop environments (GNOME) draw them all inside the compositor, but others (Sway, MATE) prefer to split them out into separate programs. This works well on X11 and can work on Wayland as well, but we need another protocol.

Layer Shell

Layer Shell is one such protocol, built by the developers of Sway/wlroots. It is currently implemented by a number of compositors, including Mir, and used by a number of apps. It allows surfaces to be anchored to screen edges or pinned above/below normal apps. A surface can only have one role at a time, which means a Layer Shell surface cannot also be an XDG Shell surface.
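
To make the anchoring idea concrete, here is a rough Python model of how a compositor might place an anchored surface. The enum values match wlr-layer-shell-unstable-v1 as far as I know, but `place`, `place_axis` and `is_above_apps` are names I made up for illustration, and the logic ignores margins and exclusive zones. Treat it as a sketch, not as what any real compositor does.

```python
from enum import IntEnum, IntFlag


class Anchor(IntFlag):
    # Bit values as in the anchor enum of wlr-layer-shell-unstable-v1.
    TOP = 1
    BOTTOM = 2
    LEFT = 4
    RIGHT = 8


class Layer(IntEnum):
    # Layer values as in the protocol; higher values stack on top.
    BACKGROUND = 0
    BOTTOM = 1
    TOP = 2
    OVERLAY = 3


def is_above_apps(layer):
    # Assumption: normal app windows sit between the BOTTOM and TOP layers.
    return layer >= Layer.TOP


def place_axis(near, far, size, extent):
    """Position a surface along one axis. Returns (offset, size)."""
    # Anchored to both opposite edges with size 0: stretch to fill the axis.
    if near and far and size == 0:
        size = extent
    if near and not far:
        return 0, size
    if far and not near:
        return extent - size, size
    # Anchored to both edges or to neither: center along this axis.
    return (extent - size) // 2, size


def place(anchor, size, output):
    """Return (x, y, width, height) of a surface on an output of the given size."""
    x, w = place_axis(bool(anchor & Anchor.LEFT), bool(anchor & Anchor.RIGHT),
                      size[0], output[0])
    y, h = place_axis(bool(anchor & Anchor.TOP), bool(anchor & Anchor.BOTTOM),
                      size[1], output[1])
    return x, y, w, h


# A 32px panel stretched along the top edge of a 1920x1080 output.
panel = place(Anchor.TOP | Anchor.LEFT | Anchor.RIGHT, (0, 32), (1920, 1080))
# A notification in the top-right corner.
notification = place(Anchor.TOP | Anchor.RIGHT, (300, 100), (1920, 1080))
```

None of this is expressible in XDG Shell, where the client never gets to pick a position or a stacking layer.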

Integrating With Toolkits

The Layer Shell approach works great in theory and for clients built from scratch, but integrating with toolkits raises issues. I built a library to use Layer Shell with GTK, and it wasn’t easy. GTK3 supports custom Wayland surfaces, but it doesn’t expose enough interfaces to do everything needed. Rather than being improved, this support is being completely removed in GTK4. I don’t have as much experience with Qt or other toolkits, but from what I do know, building Layer Shell clients in them may not be any easier.

Most complex desktop environments are built with some sort of toolkit. Some own their own toolkit and thus have complete control over it, but many do not. Critically, the GTK devs have made it clear they will not bend over backwards to accommodate non-GNOME desktops, but a number of other desktops use GTK. For this reason, being able to build shell components in somewhat non-cooperative toolkits would be extremely useful.

Possible Improvements to the Status Quo

I can think of a number of ways to allow building shell components with toolkits. Naturally none is perfect.

Bake Layer Shell/Other Protocols Into Toolkits

This would require toolkit cooperation, but may actually be less controversial than general-purpose custom protocol APIs. The drawback is that adding features, upgrading to a new protocol version, etc. would require coordination with the toolkit developers and a release cycle of the toolkit. It’s also not clear if the major toolkits would even be open to this. They are certainly unlikely to accept more than a few protocols (keep in mind Layer Shell may end up being only one of many special-purpose surface protocols).

Patch Toolkits

Even if upstream doesn’t want our changes, the toolkits could be soft-forked or patched in distros. We could either add support for specific protocols, or the APIs required to support any protocol. This would require a lot of work by someone (read: not me) and probably wouldn’t make anyone very happy.

Use Protocols That Do Not Replace XDG Shell

New protocols could let apps add functionality to surfaces without replacing the XDG Shell objects that toolkits make. Features that are common to both types of surfaces (such as resizing and popups) could be done by the toolkit with XDG Shell as normal. Additional functionality could be added by the app attaching an additional Wayland object to the same wl_surface. Most toolkits at least allow apps to retrieve a wl_surface, so this is reasonable. This is the solution I’m currently favoring.
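
As a toy illustration of why this sidesteps the role problem, here is a small Python model. `Surface`, `RoleError`, `assign_role`, `attach_addon` and the `hypothetical_layer_hint` object are all invented for this sketch. No such protocol exists yet, and real compositors track roles differently.

```python
class RoleError(Exception):
    """Raised when a surface that already has a role is given a different one."""


class Surface:
    """Toy stand-in for a wl_surface as seen by the compositor."""

    def __init__(self):
        self.role = None   # at most one role, fixed once assigned
        self.addons = {}   # extra objects that do not claim the role

    def assign_role(self, role):
        if self.role is not None and self.role != role:
            raise RoleError(f"surface already has role {self.role!r}")
        self.role = role

    def attach_addon(self, name, **state):
        # An add-on object only carries extra state, so it never conflicts
        # with the role the toolkit has already assigned.
        self.addons[name] = state


surface = Surface()

# The toolkit creates its XDG Shell objects as usual.
surface.assign_role("xdg_toplevel")

# The app retrieves the wl_surface and attaches the (hypothetical) extra object.
surface.attach_addon("hypothetical_layer_hint", layer="top",
                     anchor=("top", "left", "right"))

# By contrast, giving the same surface a Layer Shell role is an error.
try:
    surface.assign_role("zwlr_layer_surface_v1")
    role_conflict = False
except RoleError:
    role_conflict = True
```

The point is that the toolkit keeps full ownership of the role, and the app only needs access to the wl_surface.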

Other Wacky Ideas

These are probably bad, but I want to include them for completeness.

  • A wrapper program could act as a proxy Wayland server, relaying most messages unchanged but adapting XDG Shell calls to Layer Shell calls. Clients would still need to implement some additional protocol, but compositors wouldn’t have to worry about it.
  • Allow apps to switch out surface roles on the fly. This is explicitly forbidden by Wayland, but compositors can do what they like.
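
For the proxy idea, the core would be a translation table along these lines. The interface and request names are real ones from XDG Shell and Layer Shell as far as I know, but the `translate` function and its rule table are purely hypothetical. A real proxy would also have to rewrite object IDs, arguments and events, which this sketch ignores completely.

```python
# Maps a client request to the list of requests the proxy would forward.
# An empty list means the request is swallowed by the proxy.
RULES = {
    # Held back until the client asks for a role on the xdg_surface.
    ("xdg_wm_base", "get_xdg_surface"): [],
    # The toplevel role is replaced by a layer surface.
    ("xdg_surface", "get_toplevel"): [("zwlr_layer_shell_v1", "get_layer_surface")],
    ("xdg_surface", "ack_configure"): [("zwlr_layer_surface_v1", "ack_configure")],
    # Layer surfaces have no title, so this is dropped.
    ("xdg_toplevel", "set_title"): [],
}


def translate(interface, request):
    """Return the requests to forward to the real compositor."""
    # Anything without a rule is relayed unchanged.
    return RULES.get((interface, request), [(interface, request)])


forwarded = [
    out
    for msg in [
        ("wl_compositor", "create_surface"),
        ("xdg_wm_base", "get_xdg_surface"),
        ("xdg_surface", "get_toplevel"),
        ("xdg_toplevel", "set_title"),
        ("wl_surface", "commit"),
    ]
    for out in translate(*msg)
]
```

Even in this toy form it is clear the proxy has nowhere to get anchors or layers from, which is why clients would still need some additional protocol.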

Conclusion

I’m making this post now so I can get feedback before I start trying to build anything. I’m sure some out there have already given this a fair bit of thought, and it’s important that as much of the community as possible is on the same page to avoid fragmentation. I’m especially interested in hearing from the current developers and users of Layer Shell.

Layer Shell

It does seem that being a “layer of the desktop” is something that could very plausibly be orthogonal to having a “top level” or “pop up” role. As you say, this would allow the client to leverage the existing toolkit implementation of the role, only adding the features needed for a “layer of the desktop”.

It would need the compositor to choose, for example, what happens to a surface anchored to an edge if it gets a resize request. But the compositor already has licence to ignore such requests.

Special Purpose Surfaces

I can see you are generalizing from “surfaces that are layers of the desktop” to “special purpose surfaces”, but do you have any other examples of special purpose surfaces?

Toolkit design

I can see a benefit to an approach that enables “unstable” (or bespoke to a project) protocols to be implemented in addition to stable protocols that are, hopefully, already implemented by a toolkit. While it could be argued that enabling extensions is a matter for toolkit design, your suggested approach of “allow the user to get the wl_surface” is a lot easier for the toolkit maintainer than “allow arbitrary extensions”.

Do you have any other examples of special purpose surfaces?

Possibly input methods? Popups on foreign toplevels? I’m not sure that either of those cases is definitely applicable, but I get a distinct feeling there will be more cases where we need toolkit-created surfaces to behave in special ways.