Why Ubuntu 22.04 is so fast (and how to make it faster)

If you’ve upgraded to Ubuntu 22.04 then you probably noticed how smooth the GNOME experience is. If you haven’t noticed then try comparing it to an older Ubuntu release or even the latest Fedora. The main enhancement responsible for this is the introduction of triple buffering in Ubuntu.

Before describing what triple buffering is, consider double buffering:

  1. Wait for the monitor to display the last frame.
  2. Wait a little bit more.
  3. Prepare the next frame behind the scenes.
  4. Offer the next frame to the monitor.
  5. Goto 1.
Monitor refresh    | *         *         *         *         *
Prepare next frame |   A         B         A         B         A
Display next frame |           A         B         A         B

                        frame rate = 100%, latency < 1 frame

There are always two images in this loop: the one you see on screen and the next one that will follow it. That’s double buffering.

A problem occurs however when preparing (rendering) the next frame takes too long:

Monitor refresh    | *         *         *         *         *
Prepare next frame |   A..............     B..............     A...
Display next frame |                     A                   B

                        frame rate = 50%, latency < 2 frames

The above diagram shows double buffering only achieving half the frame rate of the monitor. You see this as stutter, although some people use the word “lag”.
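The halving can be checked with some rough arithmetic. The sketch below is a simplified model, not mutter's actual scheduling: it assumes a frame can only be shown on a refresh and that the next render cannot start until the previous frame is on screen, so every frame costs a whole number of refresh intervals. The function name and the microsecond figures are made up for illustration.

```shell
#!/bin/sh
# Simplified double-buffering model (an assumption, not mutter's real logic):
# each frame occupies ceil(render / interval) refresh intervals.
# Usage: double_buffer_rate RENDER_US INTERVAL_US
double_buffer_rate() {
    render_us=$1
    interval_us=$2
    # Integer ceiling division: refresh intervals consumed per frame.
    intervals=$(( (render_us + interval_us - 1) / interval_us ))
    # A frame always costs at least one interval.
    if [ "$intervals" -lt 1 ]; then
        intervals=1
    fi
    # Frame rate as a percentage of the monitor's refresh rate.
    echo $(( 100 / intervals ))
}

# A 60Hz monitor refreshes roughly every 16667 microseconds.
double_buffer_rate 10000 16667   # render fits in one interval: prints 100
double_buffer_rate 25000 16667   # render takes 1.5 intervals: prints 50
```

In this model a render time only slightly longer than one interval is enough to drop the rate from 100% to 50%, which is the cliff shown in the diagram.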

The long render times shown above are often due to the GPU running at its lowest frequency. But the GPU is not very smart and it doesn’t know that you probably wanted a higher frame rate. The blank space between frames is when the GPU is completely idle and as such it thinks it is appropriate to stay at the same low frequency.

But what if we reduce the gaps? Ensure the GPU is not allowed to idle until it at least gives us the full frame rate? This means we have to pre-render two frames instead of one:

Monitor refresh    | *         *         *         *         *         *
Prepare next frame |   A..............B....C....     A..............B...
Display next frame |                     A         B         C         A

                        frame rate = 100%, latency < 2 frames

So now using three different buffers we are able to achieve full frame rate. A is rendered slowly at the default low frequency but when we don’t stop for a break the GPU knows (well the graphics driver knows) it needs to speed up. That’s why B and C have shorter render times. It is only by trying to pre-render two frames that we make it likely at least one frame has been pre-rendered in time.

The above diagram is modelled on Intel GPUs, which typically start at 30% speed. It is for illustrative purposes only and should not be seen as an accurate representation of what is happening for all GPUs all the time.

The end result is that GNOME sessions in Ubuntu 22.04 will use your hardware’s full ability to first reach full frame rate, and only after that is achieved will performance scale down to more power efficient settings.

So all I needed was a higher GPU frequency?

Not exactly. The ....... can represent not only a long render time but also unexpected events that delay rendering from starting. This is especially an issue in a single threaded event loop like in gnome-shell. The same benefits still apply - as we pre-render an extra frame we are able to cope with and recover from such hiccups without stuttering being seen on screen. An improvement to smoothness is therefore seen with triple buffering even on systems without frequency scaling.

Will my games run faster?

Probably not. Games with an unthrottled frame rate (or just vsync “off”) already convince the GPU to run as fast as it can. Games that are throttled to the refresh rate (vsync “on”) will benefit if they are run in a window but usually not in full screen. We turn off triple buffering in full screen direct scanout mode because (a) it was too much work at the time (development had already taken almost 2 years); and (b) the compositor render time is zero there, so it isn’t a metric that needs improving. Perhaps in future, for extra smoothness.

More buffers means more latency right?

No, not in this case. Our triple buffering implementation dynamically switches between double and triple buffering as required. When it is required, double buffering would only provide half the ideal frame rate (or worse). So you’re comparing two frames of latency with two frames of latency. Triple buffering just doubles the frame rate. If the rendering is simple then we switch to double buffering and latency drops below one frame.

Source code

The source code for triple buffering in GNOME is available to everyone as a patch for mutter 43 or as a patch for mutter 42. It is already included in Ubuntu 22.04 and later.

But I want more performance

While triple buffering in Ubuntu 22.04 provides a significant improvement out of the box, it’s always possible to go faster. Here are some suggestions that won’t void the warranty:

Extensions

Ubuntu’s three default gnome-shell extensions are great, but they do have a measurable performance impact. We’re working to fix this but in the meantime you might consider disabling any that you don’t need. If you don’t already have the Extensions app icon installed, you can just run:

gjs /usr/share/gnome-shell/org.gnome.Extensions

or

sudo apt install gnome-shell-extension-prefs

to install the Extensions app icon.

Mouse movement

You might get a feeling that Wayland sessions have a slightly laggy, slightly sloppy mouse response. You’re not imagining things. This is a feature of Linux’s atomic KMS architecture where all on screen changes are deferred to occur at exactly the same time. If you don’t want that to include the mouse pointer then consider reverting to traditional KMS mode by editing /etc/environment and adding:

Ubuntu 22.04:

MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0

Ubuntu 22.10:

MUTTER_DEBUG_FORCE_KMS_MODE=simple

and remember to reboot afterwards. This ensures the mouse pointer location is allowed to update even sooner than the rest of the screen.
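Since the variable name differs between the two releases, a small helper can pick the right line. This is only a sketch: the function name is hypothetical, and it knows only the two releases listed above, refusing to guess for anything else.

```shell
#!/bin/sh
# Print the /etc/environment line for traditional (non-atomic) KMS,
# given an Ubuntu VERSION_ID such as "22.04". Only the two releases
# named in this post are handled; anything else is rejected.
kms_env_line() {
    case "$1" in
        22.04) echo "MUTTER_DEBUG_ENABLE_ATOMIC_KMS=0" ;;
        22.10) echo "MUTTER_DEBUG_FORCE_KMS_MODE=simple" ;;
        *)     echo "unknown release: $1" >&2; return 1 ;;
    esac
}

# VERSION_ID is defined in /etc/os-release on Ubuntu. The subshell keeps
# the variables from that file out of the current shell.
if [ -r /etc/os-release ]; then
    ( . /etc/os-release; kms_env_line "${VERSION_ID:-}" ) || true
fi
```

The script only prints the line. Adding it to /etc/environment is still a manual step.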

Web browsers

Neither of the major web browsers enables native Wayland support by default yet, but they do support it. If you want the fastest, smoothest browsing experience (and more precise touchpad scrolling) then you can enable native Wayland rendering in:

Firefox

Add this to /etc/environment:

MOZ_ENABLE_WAYLAND=1

and reboot.

Chrome

Open address:

chrome://flags/#ozone-platform-hint

and change the setting “Preferred Ozone platform” to “Wayland”, then click Relaunch.

Phase shifting

Mutter contains an optimization that dynamically adjusts the render start time to try and minimize latency to the screen. This improves latency by a fraction of a frame interval, so a few milliseconds. The downside is that it can make frame skips appear.

If you are willing to sacrifice a few milliseconds latency in order to gain a smoother frame rate then you can add this to your /etc/environment:

CLUTTER_PAINT=disable-dynamic-max-render-time

What if I want to prioritise low power instead of frame rate?

You probably don’t need to. As soon as the screen stops changing your CPU and GPU are going to change to a low power state. Even with triple buffering.

If you really wanted to minimize power usage by knowingly reducing frame rate then you could:

  • Set your monitor to a lower refresh rate; or
  • Limit your Intel GPU to a low clock speed (write a lower value to /sys/class/drm/card0/gt_max_freq_mhz); or
  • Use the NVIDIA Settings app to control the power profile of the NVIDIA driver, if installed.
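For the Intel option, it can help to look at the current values before changing anything. The sketch below only reads. It assumes the i915 driver's gt_cur_freq_mhz and gt_min_freq_mhz files sit next to gt_max_freq_mhz in the card directory, and the function name is made up here. Card numbering varies between machines, so card0 is an assumption too.

```shell
#!/bin/sh
# Print the Intel GPU frequency settings found in a DRM card directory.
# Files that do not exist (e.g. on non-Intel GPUs) are skipped.
# Usage: show_gpu_freq [CARD_DIR]
show_gpu_freq() {
    card_dir=${1:-/sys/class/drm/card0}
    for name in gt_cur_freq_mhz gt_min_freq_mhz gt_max_freq_mhz; do
        if [ -r "$card_dir/$name" ]; then
            echo "$name=$(cat "$card_dir/$name")"
        fi
    done
}

show_gpu_freq /sys/class/drm/card0
```

Writing a lower value to gt_max_freq_mhz needs root, and the value chosen should presumably not be below the minimum reported here.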


Thank you for the information provided. I do need to say that 22.04 is actually much slower than 20.04.

My specs are rather good:

[image: screenshot of system specs]

But there is constant lag (it does not matter whether I use Wayland or X11) when dragging windows or even when screen sharing at work. The whole team has noticed a huge impact on visual performance: from lag when dragging most windows, to screen sharing making the computer feel like a 486DX, to apps like Telegram, Chrome or Firefox freezing for 3 to 10 seconds before I get control of them again.

This has been talked about in the following links but I also have had this happen to me since day one:

https://askubuntu.com/questions/1408208/ubuntu-22-04-too-slow

https://www.reddit.com/r/Ubuntu/comments/x9kpho/ubuntu_2204_feels_clunky_and_slow/

https://askubuntu.com/questions/1406263/ubuntu-22-04-runs-too-slow

I tried using a different Nvidia driver with no luck.

I tried changing from Wayland to X11. No luck there.

I checked for ram issues, CPU bottlenecks. Nothing. No app was ever using more than 5% of the CPU and the RAM was never above 5GB every time the freeze or lag happened.

I do need to mention I have 2 monitors. One is 4K at 144Hz and the other is 1920x1080 at 60Hz, if that helps at all.


That’s an NVIDIA driver bug which sadly got worse in GNOME 42 so everyone is noticing it now. NVIDIA is aware of it, has investigated and confirmed it’s a driver bug. It affects other desktop environments too. Though I should note for everyone else that it’s only a problem on desktop machines using NVIDIA as the primary GPU.

One potential workaround is to add __GL_SYNC_TO_VBLANK=0 to /etc/environment. Although I also plan on investigating other workarounds.

If anyone reading this has performance problems without using the NVIDIA driver then please open a new bug by running: ubuntu-bug gnome-shell


Yes that’s an Xorg driver bug which usually only annoys NVIDIA users (because everyone else is using Wayland instead). The problem is that Xorg always updates both monitors together and sometimes it will choose to do that only at the rate of the slower monitor. The workaround is to set __GL_SYNC_DISPLAY_DEVICE to be equal to the name of the faster monitor (from xrandr).
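To find the name of the faster monitor, something like the following could be used. It is a sketch that assumes the usual xrandr text layout: output lines such as "NAME connected ...", followed by indented mode lines where the active refresh rate is marked with an asterisk. The function name is hypothetical.

```shell
#!/bin/sh
# Read `xrandr --query` output on stdin and print the name of the
# connected output whose active mode has the highest refresh rate.
# Usage: xrandr --query | fastest_output
fastest_output() {
    awk '
        # Unindented lines start a new output (or the "Screen" header).
        /^[^ \t]/ { name = ($2 == "connected") ? $1 : ""; next }
        # Indented mode lines: the active rate carries a "*" marker.
        name != "" {
            for (i = 2; i <= NF; i++) {
                if (index($i, "*") > 0) {
                    rate = $i
                    gsub(/[*+]/, "", rate)
                    if (rate + 0 > best) { best = rate + 0; best_name = name }
                }
            }
        }
        END { if (best_name != "") print best_name }
    '
}
```

If it printed DP-0, for example, the line to add would be __GL_SYNC_DISPLAY_DEVICE=DP-0. The name on your own machine will likely differ.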


Wow. You da man. Okay, what about applying the workaround and moving from Xorg to Wayland? What do you recommend?

I wish I could recommend Wayland for NVIDIA, and it might be OK for you, but there are too many bugs in NVIDIA Wayland sessions.

Instead it’s probably best to stick with Xorg and:

  1. Edit /etc/environment and add:
    __GL_SYNC_TO_VBLANK=0
    
    At a guess, that will remove the need for the __GL_SYNC_DISPLAY_DEVICE workaround too.
  2. Reboot.

Hi @vanvugt looks like this is working beautifully. I will be testing for the whole week so if you need to assign me anything else to test and provide feedback in the proper bug report links, I will be happy to assist you in any way. Thank you friend, this helps me while working on the computer.

Hello @vanvugt , after reading your amazing explanation, I’m left wondering, is triple buffering Wayland only? The benefits of triple buffering disappear in my case when I’m using Xorg.

My hardware (ASUS VivoBook 14 M413DA, AMD Ryzen 5 3500U, Radeon Vega 8) has very strange performance on most desktop environments, showing bad frame rates especially on window manager specific animations (like GNOME’s overview, and KDE’s and Cinnamon’s various fancy animations).

This hardware renders Windows 10/11 animations perfectly and handles lots of intensive tasks (in both Windows and Linux) without issue, but window manager animations in Linux? Nope.

Triple buffering fixes the issue on Wayland, making GNOME completely smooth even under high load on battery power (truly amazing!). BUT, for some reason, the bad frame rates in the overview return on Xorg.

Aside from the main question, “is triple buffering Wayland only?”, do you have any idea what might have gone wrong for animations to display so badly on such recent hardware (2019)? No amount of searching and asking on various forums has led to a solution.

@raxelgrande,

Triple buffering applies to both Xorg and Wayland. In fact it was working on Xorg about a year earlier than Wayland, and then the remaining year was spent fixing the Wayland issues.

Prior to triple buffering (e.g. Ubuntu 20.04) I would expect Xorg to perform better than Wayland because being a separate process grants it some of the asynchronous benefits of triple buffering. After triple buffering (Ubuntu 22.04 and later) I expect Wayland should perform better as the asynchronicity is about the same but there is no round trip between two different display server processes on every frame. So that’s the first area where Xorg is slower.

Here’s a list of reasons why Xorg might perform worse:

  • Extra round trip between the gnome-shell and Xorg processes on every frame.
  • Xorg drivers mostly can’t handle multi-monitors well and may stutter (LP#1820832).
  • Multi-GPUs: This logic is very different between Xorg and Wayland. Mostly I expect Wayland to be slower (tracking in mutter#2315) but it’s theoretically possible that the Xorg driver you have is less efficient at this than mutter’s Wayland implementation. This is an issue that many laptop owners will hit unknowingly when they plug in a monitor and that port is wired to a discrete GPU.
  • The Xorg driver fails to provide 3 or 4 framebuffers. That would cause serious stutter (or crashes) but I have not seen it happen on desktop drivers yet.

@raxelgrande, I am testing with AMD now and can confirm Wayland sessions are more responsive than Xorg as expected. But nothing buggy and no “bad frame rates”.

To get your Xorg performance issue investigated further, please log in to “Ubuntu on Xorg” and then run:

ubuntu-bug gnome-shell

These performance tips are awesome. Ubuntu is now smoother than ever on my Chromebook with Intel UHD 600 Graphics!

@raxelgrande, are you using Ubuntu 22.10? I was able to reproduce the problem you describe after switching back to Intel graphics: https://launchpad.net/bugs/1989582 and have traced it to a regression in mutter 43.


If I run sudo nano /etc/environment to add the two settings you mentioned above, to help test on my own NVIDIA MX150/Intel hybrid laptop running Xorg, should /etc/environment look like the following, or does it all need to go inside the quotes on the first line? I’d appreciate any clarity, thank you. I’m running Ubuntu 22.04.1 LTS, btw.

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
CLUTTER_PAINT=disable-dynamic-max-render-time
__GL_SYNC_TO_VBLANK=0

In 22.04 and onwards you should really not edit /etc/environment. Some effort went into integrating with systemd’s /etc/environment.d/ to avoid this.
Perhaps @vanvugt can change the instructions accordingly.

the right way would be:

echo "CLUTTER_PAINT=disable-dynamic-max-render-time" | sudo tee -a /etc/environment.d/90clutter.conf
echo "__GL_SYNC_TO_VBLANK=0" | sudo tee -a /etc/environment.d/90vblank.conf

(and then indeed do a reboot to make the change take effect)
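One caveat with tee -a is that running the same command twice appends the line twice. A possible guard, sketched here with a made-up function name, is to append only when the exact line is missing:

```shell
#!/bin/sh
# Append LINE to FILE only if that exact line is not already present,
# so re-running the command does not create duplicates.
# Usage: add_env_line FILE LINE
add_env_line() {
    file=$1
    line=$2
    # -x matches whole lines only, -F treats LINE as a literal string.
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        return 0
    fi
    printf '%s\n' "$line" >> "$file"
}
```

Files under /etc/environment.d/ are writable only by root, and sudo cannot call a shell function directly, so in practice this would be run from a root shell or saved as a script and run with sudo.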


The issue I was having on Xorg got fixed!

By a really timely coincidence, before filing the ubuntu-bug report I did a performance test with video playback and other apps across workspaces on battery power, and it worked perfectly (less smooth than Wayland, but smooth).

I can pinpoint the issue as appearing in GNOME 3.38 (3.36 had no bad frame rate issues) and lasting until a GNOME 42.2+ point release (the last time I saw the problem on Ubuntu was around early July). Currently, on Ubuntu 22.04.1, it works perfectly on both Xorg and Wayland.
It also happened on multiple distros, with barely noticeable frame drops on Tumbleweed and Arch, and awful performance on Fedora.
I don’t know if the fix is Ubuntu specific or a GNOME one. I need to test that, but whatever changed between July and September fixed it.

The issue I had is different from the bugs you linked. The best description is “under a medium-heavy load using Xorg on battery power (it also happens when plugged in, but is less noticeable), gnome-shell animations would drop to about 15fps, even in scenarios where the system should have more than enough resources to render the animations well.”

I also noticed that how noticeable the bug is depends on which power management utility you use. Setting power-profiles-daemon to power save mode drastically reduced frame rates in the overview.
auto-cpufreq was the one that made the bug worst, with even worse frame rates under low/idle use.
tlp gave the best performance both on battery and plugged in, showing little to no drop in frame rate.

I suspect power-profiles-daemon can have a negative effect on triple buffering, could be worth testing performance.