Alright, I have done benchmarks and testing.
The Hardware
The primary test machine in this experiment is a Dell Optiplex 9020, featuring an Intel Core i5-4570 CPU @ 3.20 GHz, a 1 TB ADATA SU750 SATA solid-state drive, and 16 GB of 1600 MHz PC3-12800 DDR3 RAM (A-Tech brand). Testing was done using both the experimental Lunar rebuild for x86_64-v3 and stock Ubuntu 23.04. This machine is juuuuust barely new enough to run x86_64-v3 code, and I was able to boot the ISO, install it, and run tests on it without many issues (although the not-quite-complete universe build in the test archive threw a wrench in my works a couple of times).
Additionally, I threw in my laptop (a Kubuntu Focus XE Gen 1) for comparison. It features an Intel Core i5-1135G7 CPU and 32 GB of RAM (I think 3200 MHz). It also has a 1 TB Samsung NVMe solid-state drive. It’s running Kubuntu 22.04 (not the x86_64-v3 rebuild of 23.04). While I attempted to interfere with the benchmarks as little as possible on the primary test machine (start the processing and then don’t touch it unless you absolutely have to until it’s done), I did the benchmarks on my laptop with a full KDE session and several apps running in the background. So my laptop’s benchmarks aren’t as reliable. However, they should give you an idea of how the Optiplex’s results stack up against more modern hardware.
The Tests
Three workloads were run: 7zip benchmarks, cryptsetup benchmarks, and compiling the OpenJDK 8 source package using debuild. The first two print detailed benchmarking info themselves, while I timed the OpenJDK build using sudo time debuild. (I had to use sudo as the package wouldn’t build without it for some reason.) These tests were run on a relatively vanilla Ubuntu Server installation, a relatively vanilla installation of the test Lunar rebuild, and also on my laptop, which had very little special prep done before running the benchmarks.
Stats
Optiplex, normal Ubuntu Server 23.04, 7zip benchmark results:
7-Zip (z) 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
64-bit locale=en_US.UTF-8 Threads:4
Compiler: 12.2.0 GCC 12.2.0
Linux : 6.2.0-39-generic : #40-Ubuntu SMP PREEMPT_DYNAMIC Tue Nov 14 14:18:00 UTC 2023 : x86_64
PageSize:4KB THP:madvise hwcap:2 hwcap2:2
Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz (306C3)
1T CPU Freq (MHz): 2706 3299 3576 3585 3586 3585 3586
2T CPU Freq (MHz): 199% 3567 200% 3569
RAM size: 15893 MB, # CPU hardware threads: 4
RAM usage: 889 MB, # Benchmark threads: 4
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 19794 345 5582 19256 | 174783 400 3732 14911
23: 20222 366 5627 20604 | 172073 400 3725 14889
24: 18557 365 5467 19953 | 169467 400 3718 14872
25: 17914 370 5525 20454 | 165951 399 3706 14770
---------------------------------- | ------------------------------
Avr: 19122 362 5550 20067 | 170569 399 3720 14861
Tot: 381 4635 17464
Optiplex, x86_64-v3 Lunar rebuild, 7zip benchmark results:
7-Zip (z) 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
64-bit locale=en_US.UTF-8 Threads:4
Compiler: 12.2.0 GCC 12.2.0
Linux : 6.2.0-20-generic : #20-Ubuntu SMP PREEMPT_DYNAMIC Thu Oct 19 21:30:29 UTC 2023 : x86_64
PageSize:4KB THP:madvise hwcap:2 hwcap2:2
Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz (306C3)
1T CPU Freq (MHz): 2647 3211 3581 3590 3588 3588 3587
2T CPU Freq (MHz): 199% 3566 200% 3570
RAM size: 15885 MB, # CPU hardware threads: 4
RAM usage: 889 MB, # Benchmark threads: 4
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 20486 354 5634 19929 | 173682 399 3709 14817
23: 20126 368 5571 20506 | 171373 400 3709 14829
24: 19993 373 5761 21497 | 168567 400 3701 14793
25: 19314 377 5844 22053 | 165046 400 3676 14689
---------------------------------- | ------------------------------
Avr: 19980 368 5702 20996 | 169667 400 3699 14782
Tot: 384 4700 17889
KFocus XE, 7zip benchmark results:
7-Zip (z) 21.07 (x64) : Copyright (c) 1999-2021 Igor Pavlov : 2021-12-26
64-bit locale=en_US.UTF-8 Threads:8
Compiler: 11.2.0 GCC 11.2.0
Linux : 6.2.0-35-generic : #35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Oct 6 10:23:26 UTC 2 : x86_64
PageSize:4KB THP:madvise hwcap:6 hwcap2:2
11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz (806C1)
1T CPU Freq (MHz): 4085 4129 4171 4156 4181 4159 4181
4T CPU Freq (MHz): 393% 3694 399% 3787
RAM size: 31867 MB, # CPU hardware threads: 8
RAM usage: 1779 MB, # Benchmark threads: 8
Compressing | Decompressing
Dict Speed Usage R/U Rating | Speed Usage R/U Rating
KiB/s % MIPS MIPS | KiB/s % MIPS MIPS
22: 26902 721 3632 26171 | 206327 791 2224 17594
23: 24302 698 3546 24761 | 202496 790 2216 17516
24: 24940 741 3617 26816 | 199341 793 2204 17490
25: 23687 721 3752 27046 | 192679 782 2191 17144
---------------------------------- | ------------------------------
Avr: 24958 720 3637 26199 | 200211 789 2209 17436
Tot: 755 2923 21817
Optiplex, normal Ubuntu Server 23.04, cryptsetup benchmark results:
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1 1263344 iterations per second for 256-bit key
PBKDF2-sha256 1685813 iterations per second for 256-bit key
PBKDF2-sha512 1199743 iterations per second for 256-bit key
PBKDF2-ripemd160 731224 iterations per second for 256-bit key
PBKDF2-whirlpool 513001 iterations per second for 256-bit key
argon2i 7 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id 7 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
# Algorithm | Key | Encryption | Decryption
aes-cbc 128b 666.6 MiB/s 2698.9 MiB/s
serpent-cbc 128b 91.3 MiB/s 581.6 MiB/s
twofish-cbc 128b 192.5 MiB/s 372.7 MiB/s
aes-cbc 256b 505.1 MiB/s 2113.6 MiB/s
serpent-cbc 256b 92.9 MiB/s 580.8 MiB/s
twofish-cbc 256b 195.3 MiB/s 372.7 MiB/s
aes-xts 256b 2398.2 MiB/s 2404.5 MiB/s
serpent-xts 256b 532.0 MiB/s 522.2 MiB/s
twofish-xts 256b 344.6 MiB/s 347.7 MiB/s
aes-xts 512b 1899.7 MiB/s 1897.5 MiB/s
serpent-xts 512b 537.5 MiB/s 521.9 MiB/s
twofish-xts 512b 348.0 MiB/s 347.2 MiB/s
Optiplex, x86_64-v3 Lunar rebuild, cryptsetup benchmark results:
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1 1248304 iterations per second for 256-bit key
PBKDF2-sha256 1669707 iterations per second for 256-bit key
PBKDF2-sha512 1223542 iterations per second for 256-bit key
PBKDF2-ripemd160 758738 iterations per second for 256-bit key
PBKDF2-whirlpool 511001 iterations per second for 256-bit key
argon2i 7 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id 7 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
# Algorithm | Key | Encryption | Decryption
aes-cbc 128b 666.9 MiB/s 2695.0 MiB/s
serpent-cbc 128b 91.8 MiB/s 580.0 MiB/s
twofish-cbc 128b 191.5 MiB/s 372.1 MiB/s
aes-cbc 256b 504.6 MiB/s 2110.6 MiB/s
serpent-cbc 256b 93.1 MiB/s 580.9 MiB/s
twofish-cbc 256b 195.2 MiB/s 371.5 MiB/s
aes-xts 256b 2395.7 MiB/s 2402.4 MiB/s
serpent-xts 256b 530.7 MiB/s 521.5 MiB/s
twofish-xts 256b 345.6 MiB/s 347.9 MiB/s
aes-xts 512b 1896.3 MiB/s 1895.3 MiB/s
serpent-xts 512b 536.9 MiB/s 521.8 MiB/s
twofish-xts 512b 348.3 MiB/s 347.1 MiB/s
KFocus XE cryptsetup benchmark results:
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1 2335358 iterations per second for 256-bit key
PBKDF2-sha256 4088015 iterations per second for 256-bit key
PBKDF2-sha512 1646116 iterations per second for 256-bit key
PBKDF2-ripemd160 967321 iterations per second for 256-bit key
PBKDF2-whirlpool 672164 iterations per second for 256-bit key
argon2i 7 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id 7 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
# Algorithm | Key | Encryption | Decryption
aes-cbc 128b 1620.4 MiB/s 5936.8 MiB/s
serpent-cbc 128b 98.1 MiB/s 371.2 MiB/s
twofish-cbc 128b 246.1 MiB/s 453.3 MiB/s
aes-cbc 256b 1243.1 MiB/s 4794.6 MiB/s
serpent-cbc 256b 105.1 MiB/s 370.6 MiB/s
twofish-cbc 256b 251.9 MiB/s 452.6 MiB/s
aes-xts 256b 4799.7 MiB/s 4808.6 MiB/s
serpent-xts 256b 346.0 MiB/s 347.9 MiB/s
twofish-xts 256b 416.9 MiB/s 419.4 MiB/s
aes-xts 512b 4247.2 MiB/s 4250.8 MiB/s
serpent-xts 512b 370.9 MiB/s 350.6 MiB/s
twofish-xts 512b 417.2 MiB/s 416.4 MiB/s
Optiplex, normal Ubuntu Server 23.04, OpenJDK 8 build times (debuild of openjdk-8 source package from Lunar):
16035.05user 1197.54system 1:34:15elapsed 304%CPU (0avgtext+0avgdata 7030872maxresident)k
3521864inputs+67711640outputs (2900major+258587136minor)pagefaults 0swaps
Optiplex, x86_64-v3 Lunar rebuild, OpenJDK 8 build times (debuild of openjdk-8 source package from Lunar):
16011.88user 1189.26system 1:33:29elapsed 306%CPU (0avgtext+0avgdata 6372128maxresident)k
2098064inputs+67909208outputs (9429major+256627718minor)pagefaults 0swaps
No OpenJDK 8 build test was done on the KFocus XE, as the build environment on my laptop is not suitable for benchmarking.
Comparison
A spreadsheet with the benchmark data is available here (ODS format): https://drive.google.com/file/d/1mmHZhI_sRq0ont0G9p-CCiMT0jYzLOBl/view?usp=sharing
7zip Benchmark:
Percentages are ratios of KiB/s, higher is better.
- Optiplex, normal Ubuntu Server 23.04 (baseline):
  - Compression: 100%, Decompression: 100%
- Optiplex, x86_64-v3 Lunar rebuild:
  - Compression: 104.49%, Decompression: 99.47%
- KFocus XE:
  - Compression: 130.52%, Decompression: 117.38%
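The percentages above can be reproduced from the "Avr:" rows of the three 7zip runs. Here is a minimal Python sketch, with the averages copied by hand from the output in the Stats section. The dictionary layout and function names are only illustrative; they are not how the spreadsheet was built.

```python
# Average speeds in KiB/s from the "Avr:" rows: (compression, decompression).
AVERAGES = {
    "Optiplex, normal Ubuntu Server 23.04": (19122, 170569),
    "Optiplex, x86_64-v3 Lunar rebuild": (19980, 169667),
    "KFocus XE": (24958, 200211),
}

BASELINE = "Optiplex, normal Ubuntu Server 23.04"


def percent_of_baseline(value, baseline):
    """Return value as a percentage of baseline, rounded to two decimals."""
    return round(100.0 * value / baseline, 2)


def compare(averages, baseline_name):
    """Map each machine to (compression %, decompression %) of the baseline."""
    base_comp, base_decomp = averages[baseline_name]
    return {
        name: (
            percent_of_baseline(comp, base_comp),
            percent_of_baseline(decomp, base_decomp),
        )
        for name, (comp, decomp) in averages.items()
    }


if __name__ == "__main__":
    for name, (comp, decomp) in compare(AVERAGES, BASELINE).items():
        print(f"{name}: Compression {comp:.2f}%, Decompression {decomp:.2f}%")
```

Note that these are single runs, so a difference of a few percent between the two Optiplex installs may be within run-to-run noise.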
Cryptsetup Benchmark:
The argon2i and argon2id results (7 iterations, 1048576 memory, 4 parallel threads) were identical across all three benchmarks and are therefore omitted.
- Optiplex, normal Ubuntu Server 23.04 (baseline):
- Optiplex, x86_64-v3 Lunar rebuild:
  - Hashing (ratios of iterations per second, higher is better)
    - PBKDF2-sha1: 98.81%
    - PBKDF2-sha256: 99.04%
    - PBKDF2-sha512: 101.98%
    - PBKDF2-ripemd160: 103.76%
    - PBKDF2-whirlpool: 99.61%
  - Cryptography (ratios of MiB/s, higher is better)
    - aes-cbc 128b: Encryption 100.05%, Decryption 99.86%
    - serpent-cbc 128b: Encryption 100.55%, Decryption 99.72%
    - twofish-cbc 128b: Encryption 99.48%, Decryption 99.84%
    - aes-cbc 256b: Encryption 99.90%, Decryption 99.86%
    - serpent-cbc 256b: Encryption 100.22%, Decryption 100.02%
    - twofish-cbc 256b: Encryption 99.95%, Decryption 99.68%
    - aes-xts 256b: Encryption 99.90%, Decryption 99.91%
    - serpent-xts 256b: Encryption 99.76%, Decryption 99.87%
    - twofish-xts 256b: Encryption 100.29%, Decryption 100.06%
    - aes-xts 512b: Encryption 99.82%, Decryption 99.88%
    - serpent-xts 512b: Encryption 99.89%, Decryption 99.98%
    - twofish-xts 512b: Encryption 100.09%, Decryption 99.97%
- KFocus XE:
  - Hashing (ratios of iterations per second, higher is better)
    - PBKDF2-sha1: 184.86%
    - PBKDF2-sha256: 242.50%
    - PBKDF2-sha512: 137.21%
    - PBKDF2-ripemd160: 132.29%
    - PBKDF2-whirlpool: 131.03%
  - Cryptography (ratios of MiB/s, higher is better)
    - aes-cbc 128b: Encryption 243.08%, Decryption 219.97%
    - serpent-cbc 128b: Encryption 107.45%, Decryption 63.82%
    - twofish-cbc 128b: Encryption 127.84%, Decryption 121.63%
    - aes-cbc 256b: Encryption 246.11%, Decryption 226.85%
    - serpent-cbc 256b: Encryption 113.13%, Decryption 63.81%
    - twofish-cbc 256b: Encryption 128.98%, Decryption 121.44%
    - aes-xts 256b: Encryption 200.14%, Decryption 199.98%
    - serpent-xts 256b: Encryption 65.04%, Decryption 66.62%
    - twofish-xts 256b: Encryption 120.98%, Decryption 120.62%
    - aes-xts 512b: Encryption 223.57%, Decryption 224.02%
    - serpent-xts 512b: Encryption 69.00%, Decryption 67.18%
    - twofish-xts 512b: Encryption 119.89%, Decryption 119.93%
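For anyone who wants to redo the cipher ratios without a spreadsheet, here is a rough Python sketch that parses the cipher rows of two cryptsetup benchmark outputs and prints each result as a percentage of the baseline. The regular expression and function names are my own assumptions about the output layout shown in the Stats section, and only two rows per run are included to keep it short.

```python
import re

# Matches cipher rows such as "aes-xts 256b 2398.2 MiB/s 2404.5 MiB/s".
# Header and PBKDF2/argon2 lines do not match and are skipped.
ROW = re.compile(r"^\s*(\S+)\s+(\d+)b\s+([\d.]+) MiB/s\s+([\d.]+) MiB/s\s*$")


def parse_ciphers(text):
    """Map (algorithm, key bits) -> (encryption MiB/s, decryption MiB/s)."""
    results = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            algo, bits, enc, dec = match.groups()
            results[(algo, int(bits))] = (float(enc), float(dec))
    return results


def ratios(baseline, other):
    """Percent of baseline for every cipher present in both runs."""
    return {
        key: tuple(
            round(100.0 * o / b, 2) for o, b in zip(other[key], baseline[key])
        )
        for key in baseline
        if key in other
    }


# Two rows from each run, copied from the benchmark output.
STOCK = """
# Algorithm | Key | Encryption | Decryption
aes-xts 256b 2398.2 MiB/s 2404.5 MiB/s
serpent-xts 256b 532.0 MiB/s 522.2 MiB/s
"""
LAPTOP = """
# Algorithm | Key | Encryption | Decryption
aes-xts 256b 4799.7 MiB/s 4808.6 MiB/s
serpent-xts 256b 346.0 MiB/s 347.9 MiB/s
"""

if __name__ == "__main__":
    compared = ratios(parse_ciphers(STOCK), parse_ciphers(LAPTOP))
    for (algo, bits), (enc, dec) in compared.items():
        print(f"{algo} {bits}b: Encryption {enc:.2f}%, Decryption {dec:.2f}%")
```

Pasting the full outputs into the two strings should give the complete list.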
OpenJDK 8 Build Time:
- Optiplex, normal Ubuntu Server 23.04 (baseline):
- Optiplex, x86_64-v3 Lunar rebuild:
  - Time (ratios of number of seconds, lower is better)
    - user: 99.86%
    - system: 99.31%
    - elapsed: 99.19%
  - Misc
    - CPU usage (ratio, higher is probably better): 100.66%
    - maxresident k (lower is probably better): 90.63%
    - inputs (filesystem inputs, i.e. reads from storage, as counted by GNU time): 59.57%
    - outputs (filesystem outputs, i.e. writes to storage, as counted by GNU time): 100.29%
    - major pagefaults (lower is probably better): 325.14%
    - minor pagefaults (lower is probably better): 99.24%
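The elapsed ratio takes one extra step, since time prints wall-clock time as h:mm:ss rather than seconds. A small Python sketch of the conversion, using the two elapsed fields from the build output (the helper name is made up for this example, and str.removesuffix needs Python 3.9 or newer):

```python
def elapsed_seconds(field):
    """Convert an elapsed field such as "1:34:15elapsed" to seconds.

    Works for both "h:mm:ss" and "m:ss.ss" because each colon-separated
    part is folded in base 60 from left to right.
    """
    seconds = 0.0
    for part in field.removesuffix("elapsed").split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


stock = elapsed_seconds("1:34:15elapsed")    # normal Ubuntu Server 23.04
rebuild = elapsed_seconds("1:33:29elapsed")  # x86_64-v3 Lunar rebuild

if __name__ == "__main__":
    print(f"stock: {stock:.0f} s, rebuild: {rebuild:.0f} s")
    print(f"elapsed ratio: {100.0 * rebuild / stock:.2f}%")
    print(f"difference: {stock - rebuild:.0f} s")
```

That works out to 5655 versus 5609 seconds, so the rebuild finished 46 seconds sooner on a build of roughly an hour and a half.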
Conclusion
In its current state, the performance of the x86_64-v3 rebuild of Lunar (in at least the tested workloads) is underwhelming. Slight performance increases were seen in some areas (about 4.5% in 7zip compression, and under 4% in a couple of the hashing tests), with slight and likely negligible decreases in others. More research and testing may be needed for this endeavour to achieve its intended goal of a faster Ubuntu.
Comments
Yes, I actually built OpenJDK 8 of all things for my benchmarking. Why? I wanted to build some large codebase to see how fast it went. The build deps for KWin weren’t installable on the x86_64-v3 test rebuild, and neither were the deps for the Linux kernel. OpenJDK 8, however, had all the deps it needed, so that’s what I built.
I tried to boot the x86_64-v3 rebuild on a couple of machines I knew were too old (an HP Chromebook G4 something-or-other, and an HP Elitebook 8570p) just to see what would happen. Both of them just stuck at a black screen and never unstuck. No error messages, no segfaults, no kernel panic, no fan spinup, no beeps, and no blinking LEDs. Just a black screen. (I do note, however, that pressing Caps Lock or Num Lock on the Elitebook while the boot was stuck did not toggle the indicator lights. Additionally, on the Chromebook the light on the flash drive went off and stayed off, while on the Elitebook the light stayed on and never went off. And lastly, on the Chromebook the screen was solid black, while on the Elitebook a white cursor sat stuck in the upper-left corner, not blinking.) It might be useful to somehow integrate tests for x86_64-v3 support into the finished ISOs if we do end up making x86_64-v3 a supported separate architecture, so that an error can be displayed to the user if their hardware is too old.
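As a rough idea of what such a check could look like from a running Linux system, here is a Python sketch that looks for the CPU flags added by the x86-64-v3 level in /proc/cpuinfo. This is only an illustration, not something the test ISOs do: a real installer check would have to run much earlier in boot, and it would also need to verify the older v2 baseline flags, which this sketch skips. The flag spellings are the ones Linux uses in /proc/cpuinfo, and using "xsave" as a stand-in for OSXSAVE is an assumption on my part.

```python
# Flags added by the x86-64-v3 level, spelled as in /proc/cpuinfo on Linux.
# LZCNT is reported as "abm". OSXSAVE is not listed there, so "xsave" is
# used as a stand-in (assumption).
V3_FLAGS = {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU, or an empty set if unavailable."""
    try:
        with open(path) as cpuinfo:
            for line in cpuinfo:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


def missing_v3_flags(flags):
    """Return the sorted list of v3 flags that are absent from flags."""
    return sorted(V3_FLAGS - set(flags))


if __name__ == "__main__":
    missing = missing_v3_flags(read_cpu_flags())
    if missing:
        print("CPU is missing x86-64-v3 flags:", " ".join(missing))
    else:
        print("CPU reports all x86-64-v3 flags checked here")
```

On a machine that is too old, this would at least name the missing features instead of leaving the user staring at a black screen.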
I have no clue why my desktop so thoroughly stomped my much newer laptop in most of the serpent encryption benchmarks. Even the normal Ubuntu 23.04 Server install (without the x86_64-v3 changes) left my laptop in the dust in that area.