GCC 14 vs. Clang 18 - AMD Ryzen Threadripper 7980X

AMD Ryzen Threadripper 7980X compiler benchmarking on Fedora 40 by Michael Larabel for a future article.

GCC 14.0.1 20240411

Processor: AMD Ryzen Threadripper 7980X 64-Cores @ 7.79GHz (64 Cores / 128 Threads), Motherboard: System76 Thelio Major (FA Z5 BIOS), Chipset: AMD Device 14a4, Memory: 4 x 32GB DDR5-4800MT/s Micron MTC20F1045S1RC48BA2, Disk: 1000GB CT1000T700SSD5, Graphics: AMD Radeon Pro W7900 45GB, Audio: AMD Device 14cc, Monitor: DELL P2415Q, Network: Aquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Intel Wi-Fi 6E

OS: Fedora Linux 40, Kernel: 6.8.5-301.fc40.x86_64 (x86_64), Desktop: GNOME Shell 46.1, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 24.0.5 (LLVM 18.1.1 DRM 3.57), Compiler: GCC 14.0.1 20240411, File-System: btrfs, Screen Resolution: 1920x1080

Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-multilib --enable-offload-defaulted --enable-offload-targets=nvptx-none,amdgcn-amdhsa --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=i686 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver
Processor Notes: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105
Python Notes: Python 3.12.2
Security Notes: SELinux + gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Clang 18.1.1

OS: Fedora Linux 40, Kernel: 6.8.5-301.fc40.x86_64 (x86_64), Desktop: GNOME Shell 46.1, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 24.0.5 (LLVM 18.1.1 DRM 3.57), Compiler: Clang 18.1.1 + LLVM 18.1.1, File-System: btrfs, Screen Resolution: 1920x1080

Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105
Python Notes: Python 3.12.2
Security Notes: SELinux + gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
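
Both configurations use the same optimization flags and differ only in the compiler. As a minimal sketch of how such a pair of build environments could be prepared, the snippet below sets the flag variables from the Environment Notes and switches the compiler through the conventional CC/CXX variables. The use of CC/CXX, the `COMPILERS` table, and the `build_env` helper are assumptions for illustration; the result file does not state how the compiler was selected.

```python
import os

# Flags copied from the Environment Notes of both configurations.
OPT_FLAGS = "-O3 -march=native"

# Hypothetical mapping of a label to (C driver, C++ driver) names.
COMPILERS = {
    "gcc": ("gcc", "g++"),
    "clang": ("clang", "clang++"),
}


def build_env(label, base_env=None):
    """Return a copy of the environment with CC/CXX and the flag variables set."""
    cc, cxx = COMPILERS[label]
    env = dict(os.environ if base_env is None else base_env)
    env.update({"CC": cc, "CXX": cxx, "CFLAGS": OPT_FLAGS, "CXXFLAGS": OPT_FLAGS})
    return env


if __name__ == "__main__":
    for label in COMPILERS:
        # Start from an empty base so only the variables set here are printed.
        print(label, build_env(label, base_env={}))
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching a build.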

miniBUDE

MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

Quicksilver

Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.

OpenVINO

This is a test of Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.

Kvazaar

This is a test of Kvazaar as a CPU-based H.265/HEVC video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

SVT-AV1

This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format, tested here with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.

uvg266

uvg266 is an open-source VVC/H.266 (Versatile Video Coding) encoder based on Kvazaar and developed by the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

x265

This is a simple test of the x265 encoder run on the CPU, with 1080p and 4K options for measuring H.265 video encode performance. Learn more via the OpenBenchmarking.org test page.

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample high resolution (currently 15400 x 6940) JPEG image. Learn more via the OpenBenchmarking.org test page.

Coremark

This is a test of the EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.

SecureMark

SecureMark is an objective, standardized benchmarking framework for measuring the efficiency of cryptographic processing solutions developed by EEMBC. SecureMark-TLS is benchmarking Transport Layer Security performance with a focus on IoT/edge computing. Learn more via the OpenBenchmarking.org test page.

LZ4 Compression

This test measures the time needed to compress/decompress a sample file (silesia archive) using LZ4 compression. Learn more via the OpenBenchmarking.org test page.

Zstd Compression

This test measures the time needed to compress/decompress a sample file (silesia.tar) using Zstd (Zstandard) compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
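
The compression tests above time a compress/decompress round trip and report it as a speed. A minimal sketch of that kind of measurement follows. It swaps in Python's standard-library `zlib` as a stand-in codec (not LZ4 or Zstd) and a synthetic payload (not the silesia sample); the `measure` helper is hypothetical and is not the test profile's own harness.

```python
import time
import zlib


def measure(data, level):
    """Return (compress MB/s, decompress MB/s, compressed size) for one zlib level."""
    t0 = time.perf_counter()
    packed = zlib.compress(data, level)
    t1 = time.perf_counter()
    unpacked = zlib.decompress(packed)
    t2 = time.perf_counter()
    assert unpacked == data  # the round trip must be lossless
    megabytes = len(data) / 1e6
    # Guard against a zero interval on very small inputs.
    return (
        megabytes / max(t1 - t0, 1e-9),
        megabytes / max(t2 - t1, 1e-9),
        len(packed),
    )


if __name__ == "__main__":
    payload = b"GCC 14 vs. Clang 18 " * 200_000  # synthetic stand-in payload
    for level in (1, 6, 9):
        comp, decomp, size = measure(payload, level)
        print(f"level {level}: {comp:.1f} MB/s in, {decomp:.1f} MB/s out, {size} bytes")
```

Higher levels generally trade compression speed for a smaller output, which is why the result list reports compression and decompression speed separately per level.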

srsRAN Project

srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.

QuantLib

QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and its built-in benchmark reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.

SMHasher

SMHasher is a hash function tester supporting various algorithms and able to make use of AVX and other modern CPU instruction set extensions. Learn more via the OpenBenchmarking.org test page.

Results are reported in cycles/hash.

JPEG-XL libjxl

The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance using the reference libjxl library. Learn more via the OpenBenchmarking.org test page.

JPEG-XL Decoding libjxl

The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is suited for JPEG XL decode performance testing to a PNG output file; the pts/jpegxl test is for encode performance. The JPEG XL encoding/decoding is done using the libjxl codebase. Learn more via the OpenBenchmarking.org test page.

WebP Image Encode

This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder

ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile tests both compression and decompression. Learn more via the OpenBenchmarking.org test page.

TSCP

This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.

GROMACS

This is a test of the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package using the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.

John The Ripper

This is a benchmark of John The Ripper, which is a password cracker. Learn more via the OpenBenchmarking.org test page.

Stargate Digital Audio Workstation

Stargate is an open-source, cross-platform digital audio workstation (DAW) software package with "a unique and carefully curated experience" with scalability from old systems up through modern multi-core systems. Stargate is GPLv3 licensed and makes use of Qt5 (PyQt5) for its user-interface. Learn more via the OpenBenchmarking.org test page.

Liquid-DSP

LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

Google SynthMark

SynthMark is a cross platform tool for benchmarking CPU performance under a variety of real-time audio workloads. It uses a polyphonic synthesizer model to provide standardized tests for latency, jitter and computational throughput. Learn more via the OpenBenchmarking.org test page.

Google Draco

Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.

C-Ray

This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.

Primesieve

Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve primarily benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.
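
For readers unfamiliar with the algorithm, here is a minimal sketch of a plain sieve of Eratosthenes. It is the textbook, unsegmented form and is only meant to show the idea; it is not Primesieve's optimized implementation, and the `primes_up_to` name is hypothetical.

```python
def primes_up_to(limit):
    """Return all primes <= limit using a plain sieve of Eratosthenes."""
    if limit < 2:
        return []
    # One flag per integer in [0, limit]; 1 means "still considered prime".
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross off multiples of p starting at p*p; smaller multiples
            # were already crossed off by smaller primes.
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p :: p] = bytes(count)
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


if __name__ == "__main__":
    print(primes_up_to(30))
```

The flag array is walked with a fixed stride for every prime, so how well it fits in cache strongly affects speed, which is consistent with the description of Primesieve as a cache benchmark.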

FLAC Audio Encoding

This test times how long it takes to encode a sample WAV file to FLAC audio format ten times using the --best preset settings. Learn more via the OpenBenchmarking.org test page.

Opus Codec Encoding

Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus five times. Learn more via the OpenBenchmarking.org test page.

Helsing

Helsing is an open-source POSIX vampire number generator. This test profile measures the time it takes to generate vampire numbers between varying numbers of digits. Learn more via the OpenBenchmarking.org test page.
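
A vampire number has an even number of digits and is the product of two "fangs", each with half as many digits, whose combined digits are a rearrangement of the number's own digits; the two fangs may not both end in zero. The brute-force check below is a minimal sketch of that definition, assuming positive integer input. The `fangs` helper is hypothetical and is not Helsing's algorithm.

```python
def fangs(v):
    """Return fang pairs (x, y) with x <= y for v; empty if v is not a vampire number."""
    digits = str(v)
    if len(digits) % 2:
        return []  # vampire numbers have an even digit count
    half = len(digits) // 2
    lo, hi = 10 ** (half - 1), 10 ** half  # fangs have exactly `half` digits
    target = sorted(digits)
    found = []
    for x in range(lo, hi):
        if x * x > v:
            break  # past the square root, pairs would repeat
        if v % x:
            continue
        y = v // x
        if not (lo <= y < hi):
            continue
        if x % 10 == 0 and y % 10 == 0:
            continue  # both fangs ending in zero is not allowed
        if sorted(str(x) + str(y)) == target:
            found.append((x, y))
    return found


if __name__ == "__main__":
    print([v for v in range(1000, 10000) if fangs(v)])
```

Scanning a digit range this way grows quickly with the number of digits, which is why generating vampire numbers over larger ranges makes a usable CPU benchmark.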

RNNoise

RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26 minute long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.

WavPack Audio Encoding

This test times how long it takes to encode a sample WAV file to WavPack format with very high quality settings. Learn more via the OpenBenchmarking.org test page.

121 Results Shown

miniBUDE:
OpenMP - BM1
OpenMP - BM2
OpenSSL:
SHA256
SHA512
ChaCha20
AES-128-GCM
AES-256-GCM
ChaCha20-Poly1305
Quicksilver:
CTS2
CORAL2 P1
CORAL2 P2
OpenVINO:
Person Detection FP16 - CPU
Face Detection FP16-INT8 - CPU
Vehicle Detection FP16-INT8 - CPU
Face Detection Retail FP16-INT8 - CPU
Road Segmentation ADAS FP16-INT8 - CPU
Machine Translation EN To DE FP16 - CPU
Weld Porosity Detection FP16-INT8 - CPU
Person Vehicle Bike Detection FP16 - CPU
Noise Suppression Poconet-Like FP16 - CPU
Person Re-Identification Retail FP16 - CPU
Handwritten English Recognition FP16-INT8 - CPU
Age Gender Recognition Retail 0013 FP16-INT8 - CPU
Kvazaar:
Bosphorus 4K - Medium
Bosphorus 4K - Very Fast
Bosphorus 4K - Super Fast
Bosphorus 4K - Ultra Fast
SVT-AV1:
Preset 4 - Bosphorus 4K
Preset 8 - Bosphorus 4K
Preset 12 - Bosphorus 4K
Preset 13 - Bosphorus 4K
uvg266:
Bosphorus 4K - Slow
Bosphorus 4K - Medium
Bosphorus 4K - Very Fast
Bosphorus 4K - Super Fast
Bosphorus 4K - Ultra Fast
x265
miniBUDE:
OpenMP - BM1
OpenMP - BM2
GraphicsMagick:
Swirl
Rotate
Sharpen
Enhanced
Resizing
Noise-Gaussian
HWB Color Space
Coremark
SecureMark
LZ4 Compression:
3 - Compression Speed
3 - Decompression Speed
9 - Compression Speed
9 - Decompression Speed
Zstd Compression:
12 - Compression Speed
12 - Decompression Speed
19 - Compression Speed
19 - Decompression Speed
19, Long Mode - Compression Speed
19, Long Mode - Decompression Speed
srsRAN Project:
PDSCH Processor Benchmark, Throughput Total
PUSCH Processor Benchmark, Throughput Total
QuantLib:
Multi-Threaded
Single-Threaded
SMHasher:
wyhash
SHA3-256
Spooky32
fasthash32
FarmHash128
t1ha2_atonce
FarmHash32 x86_64 AVX
t1ha0_aes_avx2 x86_64
MeowHash x86_64 AES-NI
JPEG-XL libjxl:
PNG - 90
JPEG - 90
PNG - 100
JPEG - 100
JPEG-XL Decoding libjxl
WebP Image Encode:
Quality 100
Quality 100, Lossless
Quality 100, Highest Compression
Quality 100, Lossless, Highest Compression
ASTC Encoder:
Medium
Thorough
Exhaustive
Very Thorough
TSCP
GROMACS
LAMMPS Molecular Dynamics Simulator
John The Ripper:
bcrypt
WPA PSK
Blowfish
HMAC-SHA512
MD5
Stargate Digital Audio Workstation:
96000 - 1024
192000 - 1024
Liquid-DSP:
1 - 256 - 512
64 - 256 - 512
128 - 256 - 512
OpenSSL:
RSA4096:
sign/s
verify/s
Google SynthMark
Google Draco:
Lion
Church Facade
OpenVINO:
Person Detection FP16 - CPU
Face Detection FP16-INT8 - CPU
Vehicle Detection FP16-INT8 - CPU
Face Detection Retail FP16-INT8 - CPU
Road Segmentation ADAS FP16-INT8 - CPU
Machine Translation EN To DE FP16 - CPU
Weld Porosity Detection FP16-INT8 - CPU
Person Vehicle Bike Detection FP16 - CPU
Noise Suppression Poconet-Like FP16 - CPU
Person Re-Identification Retail FP16 - CPU
Handwritten English Recognition FP16-INT8 - CPU
Age Gender Recognition Retail 0013 FP16-INT8 - CPU
C-Ray
Primesieve
FLAC Audio Encoding
Opus Codec Encoding
Helsing
RNNoise
WavPack Audio Encoding

GCC 14.0.1 20240411

Testing initiated at 23 April 2024 15:56 by user phoronix.

Clang 18.1.1

Testing initiated at 23 April 2024 23:44 by user phoronix.
