While I work on making gnome-shell faster, especially at 4K, I’ve noticed more than ever that my system is GPU-bound. Meaning:
performance is visibly bad
performance is better for very small windows and worse for larger windows
top doesn’t report any significant CPU usage (< 15%).
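One quick way to confirm the GPU, rather than the CPU, is the bottleneck on an Intel system is intel_gpu_top from the intel-gpu-tools package (the package name varies by distribution). If it shows the render engine close to 100% busy while top shows the CPU mostly idle, you are GPU-bound:

$ sudo intel_gpu_top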
I use Intel GPUs, which means they depend on system RAM for everything. And it seems my corporate-style desktops only came with one stick of RAM each. This means they are only using a single memory channel, when my motherboard actually supports two channels in parallel (see the top right of this diagram).
Querying the BIOS/motherboard confirms I am only using one channel:
sudo dmidecode -t17
So I decided to take the RAM from the two identical machines and put it all in one machine, on separate channels, hoping it would improve gnome-shell’s memory throughput when compositing. Here are the results (milliseconds per frame, average / peak, lower is better):
Test                             | Single Channel | Dual Channel
Rendering a maximized window     | 6.0 / 6.1      | 5.2 / 5.3
Rendering in Activities Overview | 16 / 22        | 14 / 19
So that’s a consistent performance improvement to gnome-shell of around 13%. Seems it would be worthwhile to invest in a new pair of DIMMs in my case… But this only makes sense for integrated GPUs, and in theory it would only really help workloads that are close to being memory-bandwidth bound, like graphics compositing at high resolution.
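If you want to see the raw memory bandwidth difference directly, rather than indirectly through gnome-shell, a rough check is sysbench’s memory test; the flags below are just one reasonable configuration, and the absolute numbers will vary by machine:

$ sysbench memory --memory-block-size=1M --memory-total-size=10G run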
In the following output there are two channels (Channel A and Channel B), each with two DIMM slots. Read the output as triplets of lines. Specifically:
There are two channels, A and B.
Each channel has two DIMM slots (DIMM0 and DIMM1).
In this example, slot 0 of each channel holds a DIMM stick: 8GB + 8GB = 16GB.
The two DIMM sticks are on separate channels.
$ sudo dmidecode -t17 | grep 'Size:\|Locator:'
Size: 8192 MB
Locator: ChannelA-DIMM0
Bank Locator: BANK 0
Size: No Module Installed
Locator: ChannelA-DIMM1
Bank Locator: BANK 1
Size: 8192 MB
Locator: ChannelB-DIMM0
Bank Locator: BANK 2
Size: No Module Installed
Locator: ChannelB-DIMM1
Bank Locator: BANK 3
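If you only want a quick summary of which slots are populated, something like the following works for machines whose slot labels follow the ChannelX-DIMMn convention shown above (labels are vendor-specific, so adjust the pattern to match your own dmidecode output):

$ sudo dmidecode -t17 | awk '/Size:/{size=$2" "$3} /Locator: Channel/{print $2": "size}'
ChannelA-DIMM0: 8192 MB
ChannelA-DIMM1: No Module
ChannelB-DIMM0: 8192 MB
ChannelB-DIMM1: No Module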
The whole idea, to get the benefit of the speed increase, is to have more than one memory stick and to put each stick on its own channel.
Also note that while a 13% improvement is useful, it’s not big enough to solve most people’s performance issues. I expect much bigger improvements through software by the time we get to GNOME 3.38.
Note that if you’re buying a new Intel CPU with an integrated GPU, getting one with Iris Plus graphics gives you double the number of GPU cores (“GT3”), as well as some faster dedicated graphics memory (most Intel GPUs have zero dedicated graphics memory).
But we seem to be at an odd point in the product life cycle right now where you can only get those better Iris Plus GPUs in mobile CPUs (see this list), not in any current desktop CPUs. However… you can find those mobile CPUs with Intel’s best Iris Plus GPUs in some models of desktop NUC.
Still, it’s a bit strange that you need to go for a lower-power CPU to get the better GPU. I guess their new products later this year will fill that gap.
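If you’re unsure which GPU variant a machine already has, glxinfo (from the mesa-utils package on most distributions) reports the renderer name, with the GT level usually shown in parentheses by Mesa, along with the memory available to the GPU:

$ glxinfo -B | grep 'Device\|Video memory'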