AMD Ryzen Threadripper 7980X compiler benchmarking on Fedora 40 by Michael Larabel for a future article.
GCC 14.0.1 20240411
Processor: AMD Ryzen Threadripper 7980X 64-Cores @ 7.79GHz (64 Cores / 128 Threads), Motherboard: System76 Thelio Major (FA Z5 BIOS), Chipset: AMD Device 14a4, Memory: 4 x 32GB DDR5-4800MT/s Micron MTC20F1045S1RC48BA2, Disk: 1000GB CT1000T700SSD5, Graphics: AMD Radeon Pro W7900 45GB, Audio: AMD Device 14cc, Monitor: DELL P2415Q, Network: Aquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Intel Wi-Fi 6E
OS: Fedora Linux 40, Kernel: 6.8.5-301.fc40.x86_64 (x86_64), Desktop: GNOME Shell 46.1, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 24.0.5 (LLVM 18.1.1 DRM 3.57), Compiler: GCC 14.0.1 20240411, File-System: btrfs, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-multilib --enable-offload-defaulted --enable-offload-targets=nvptx-none,amdgcn-amdhsa --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=i686 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver
Processor Notes: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105
Python Notes: Python 3.12.2
Security Notes: SELinux + gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Clang 18.1.1
OS: Fedora Linux 40, Kernel: 6.8.5-301.fc40.x86_64 (x86_64), Desktop: GNOME Shell 46.1, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 24.0.5 (LLVM 18.1.1 DRM 3.57), Compiler: Clang 18.1.1 + LLVM 18.1.1, File-System: btrfs, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105
Python Notes: Python 3.12.2
Security Notes: SELinux + gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better): GCC 14.0.1 20240411: 175.90 (SE +/- 0.09, N = 3); Clang 18.1.1: 174.77 (SE +/- 1.52, N = 15). 1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better): Clang 18.1.1: 210.78 (SE +/- 0.27, N = 3); GCC 14.0.1 20240411: 175.05 (SE +/- 0.24, N = 3). 1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
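To put the two miniBUDE results side by side, the gap between compilers can be expressed as a percentage. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper names `percent_faster` and `summarize` are hypothetical, and the scores are copied from the BM1 and BM2 results above.

```python
def percent_faster(winner: float, loser: float) -> float:
    """Return how much higher `winner` is than `loser`, in percent."""
    return (winner / loser - 1.0) * 100.0


# Billion Interactions/s, copied from the miniBUDE results above.
bm1 = {"GCC 14.0.1": 175.90, "Clang 18.1.1": 174.77}
bm2 = {"GCC 14.0.1": 175.05, "Clang 18.1.1": 210.78}


def summarize(name: str, scores: dict) -> str:
    """Describe which compiler leads a 'more is better' result and by how much."""
    best = max(scores, key=scores.get)
    worst = min(scores, key=scores.get)
    lead = percent_faster(scores[best], scores[worst])
    return f"{name}: {best} leads {worst} by {lead:.1f}%"


if __name__ == "__main__":
    print(summarize("BM1", bm1))
    print(summarize("BM2", bm2))
```

Under these numbers BM1 is close to a tie, while BM2 shows Clang ahead by roughly a fifth.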
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.3, Algorithm: SHA256 (byte/s, More Is Better): GCC 14.0.1 20240411: 112513527163 (SE +/- 254544356.50, N = 3); Clang 18.1.1: 109345824360 (SE +/- 281782565.20, N = 3). -Qunused-arguments 1. (CC) gcc options: -pthread -m64 -O3 -march=native -ldl
OpenSSL 3.3, Algorithm: SHA512 (byte/s, More Is Better): Clang 18.1.1: 37246099377 (SE +/- 38811972.88, N = 3); GCC 14.0.1 20240411: 37084177003 (SE +/- 40543653.85, N = 3). -Qunused-arguments 1. (CC) gcc options: -pthread -m64 -O3 -march=native -ldl
OpenSSL 3.3, Algorithm: ChaCha20 (byte/s, More Is Better): GCC 14.0.1 20240411: 437829686883 (SE +/- 56533218.78, N = 3); Clang 18.1.1: 288123851390 (SE +/- 161435637.78, N = 3). -Qunused-arguments 1. (CC) gcc options: -pthread -m64 -O3 -march=native -ldl
OpenSSL 3.3, Algorithm: AES-128-GCM (byte/s, More Is Better): GCC 14.0.1 20240411: 822088248977 (SE +/- 665285492.45, N = 3); Clang 18.1.1: 816719157710 (SE +/- 246618519.80, N = 3). -Qunused-arguments 1. (CC) gcc options: -pthread -m64 -O3 -march=native -ldl
OpenSSL 3.3, Algorithm: AES-256-GCM (byte/s, More Is Better): GCC 14.0.1 20240411: 707658246553 (SE +/- 612129667.02, N = 3); Clang 18.1.1: 701069882433 (SE +/- 847452668.98, N = 3). -Qunused-arguments 1. (CC) gcc options: -pthread -m64 -O3 -march=native -ldl
OpenSSL 3.3, Algorithm: ChaCha20-Poly1305 (byte/s, More Is Better): GCC 14.0.1 20240411: 310352617927 (SE +/- 34389860.61, N = 3); Clang 18.1.1: 196215871807 (SE +/- 32266915.14, N = 3). -Qunused-arguments 1. (CC) gcc options: -pthread -m64 -O3 -march=native -ldl
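Each result carries a standard error (SE), which gives a rough sense of whether a gap between compilers is larger than run-to-run noise. The sketch below divides the difference between two means by their combined standard error. The function name `se_gap` is hypothetical, and with only N = 3 runs per result this is a rule-of-thumb screen rather than a formal significance test. The inputs are the SHA512 and ChaCha20 figures from the OpenSSL results above.

```python
import math


def se_gap(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Difference between two means, in units of their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# (mean byte/s, SE) pairs copied from the OpenSSL results above.
sha512_clang = (37246099377, 38811972.88)
sha512_gcc = (37084177003, 40543653.85)
chacha20_gcc = (437829686883, 56533218.78)
chacha20_clang = (288123851390, 161435637.78)

if __name__ == "__main__":
    print("SHA512 gap in combined SEs:", se_gap(*sha512_clang, *sha512_gcc))
    print("ChaCha20 gap in combined SEs:", se_gap(*chacha20_gcc, *chacha20_clang))
```

A gap of only a few combined SEs (as for SHA512) is worth treating with caution, whereas the ChaCha20 gap is hundreds of SEs wide and is unlikely to be noise.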
Quicksilver Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.
Quicksilver 20230818, Input: CTS2 (Figure Of Merit, More Is Better): GCC 14.0.1 20240411: 21853333 (SE +/- 6666.67, N = 3); Clang 18.1.1: 19186667 (SE +/- 75351.03, N = 3). 1. (CXX) g++ options: -fopenmp -O3 -march=native
Quicksilver 20230818, Input: CORAL2 P1 (Figure Of Merit, More Is Better): GCC 14.0.1 20240411: 28573333 (SE +/- 26034.17, N = 3); Clang 18.1.1: 23570000 (SE +/- 32145.50, N = 3). 1. (CXX) g++ options: -fopenmp -O3 -march=native
Quicksilver 20230818, Input: CORAL2 P2 (Figure Of Merit, More Is Better): Clang 18.1.1: 22130000 (SE +/- 10000.00, N = 3); GCC 14.0.1 20240411: 21786667 (SE +/- 80069.41, N = 3). 1. (CXX) g++ options: -fopenmp -O3 -march=native
OpenVINO This is a test of Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.
OpenVINO 2024.0, Model: Person Detection FP16 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 200.06 (SE +/- 0.27, N = 3); Clang 18.1.1: 199.47 (SE +/- 0.66, N = 3). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 84.53 (SE +/- 0.07, N = 3); Clang 18.1.1: 83.82 (SE +/- 0.08, N = 3). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 4938.07 (SE +/- 6.00, N = 3); Clang 18.1.1: 4888.56 (SE +/- 8.31, N = 3). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 13824.92 (SE +/- 11.99, N = 3); Clang 18.1.1: 13601.85 (SE +/- 5.86, N = 3). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 1304.85 (SE +/- 9.55, N = 3); Clang 18.1.1: 1266.10 (SE +/- 16.52, N = 3). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 355.93 (SE +/- 0.54, N = 3); Clang 18.1.1: 317.98 (SE +/- 2.83, N = 15). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 8089.55 (SE +/- 6.57, N = 3); Clang 18.1.1: 8024.94 (SE +/- 9.39, N = 3). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 4299.75 (SE +/- 5.60, N = 3); Clang 18.1.1: 4232.67 (SE +/- 5.05, N = 3). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Noise Suppression Poconet-Like FP16 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 3367.26 (SE +/- 28.58, N = 15); Clang 18.1.1: 2933.12 (SE +/- 27.73, N = 15). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Person Re-Identification Retail FP16 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 5969.17 (SE +/- 3.36, N = 3); Clang 18.1.1: 5885.26 (SE +/- 2.64, N = 3). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Handwritten English Recognition FP16-INT8 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 2288.66 (SE +/- 1.62, N = 3); Clang 18.1.1: 2283.11 (SE +/- 4.45, N = 3). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better): GCC 14.0.1 20240411: 116609.44 (SE +/- 108.76, N = 3); Clang 18.1.1: 115684.85 (SE +/- 47.46, N = 3). -fno-strict-overflow -fwrapv 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
Kvazaar This is a test of Kvazaar as a CPU-based H.265/HEVC video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.
Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better): Clang 18.1.1: 38.67 (SE +/- 0.06, N = 3); GCC 14.0.1 20240411: 37.05 (SE +/- 0.04, N = 3). -lpthread 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt
Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better): Clang 18.1.1: 84.16 (SE +/- 0.17, N = 3); GCC 14.0.1 20240411: 79.49 (SE +/- 0.12, N = 3). -lpthread 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt
Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better): Clang 18.1.1: 96.37 (SE +/- 0.32, N = 3); GCC 14.0.1 20240411: 93.12 (SE +/- 0.91, N = 5). -lpthread 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt
Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better): Clang 18.1.1: 97.80 (SE +/- 0.86, N = 3); GCC 14.0.1 20240411: 97.05 (SE +/- 0.89, N = 3). -lpthread 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt
SVT-AV1 This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-AV1 2.0, Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better): Clang 18.1.1: 9.853 (SE +/- 0.023, N = 3); GCC 14.0.1 20240411: 9.848 (SE +/- 0.023, N = 3). 1. (CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better): GCC 14.0.1 20240411: 95.25 (SE +/- 0.71, N = 3); Clang 18.1.1: 93.19 (SE +/- 0.39, N = 3). 1. (CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0, Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better): GCC 14.0.1 20240411: 197.49 (SE +/- 1.09, N = 3); Clang 18.1.1: 194.89 (SE +/- 1.72, N = 8). 1. (CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0, Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better): Clang 18.1.1: 195.31 (SE +/- 1.67, N = 15); GCC 14.0.1 20240411: 193.78 (SE +/- 1.61, N = 3). 1. (CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
uvg266 uvg266 is an open-source VVC/H.266 (Versatile Video Coding) encoder based on Kvazaar as part of the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.
uvg266 0.4.1, Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better): GCC 14.0.1 20240411: 27.16 (SE +/- 0.02, N = 3); Clang 18.1.1: 26.45 (SE +/- 0.03, N = 3). 1. (CXX) g++ options: -O3 -march=native
uvg266 0.4.1, Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better): GCC 14.0.1 20240411: 30.39 (SE +/- 0.06, N = 3); Clang 18.1.1: 29.50 (SE +/- 0.02, N = 3). 1. (CXX) g++ options: -O3 -march=native
uvg266 0.4.1, Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better): Clang 18.1.1: 74.38 (SE +/- 0.26, N = 3); GCC 14.0.1 20240411: 70.73 (SE +/- 0.15, N = 3). 1. (CXX) g++ options: -O3 -march=native
uvg266 0.4.1, Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better): Clang 18.1.1: 75.27 (SE +/- 0.22, N = 3); GCC 14.0.1 20240411: 72.65 (SE +/- 0.25, N = 3). 1. (CXX) g++ options: -O3 -march=native
uvg266 0.4.1, Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better): Clang 18.1.1: 76.45 (SE +/- 0.08, N = 3); GCC 14.0.1 20240411: 74.07 (SE +/- 0.13, N = 3). 1. (CXX) g++ options: -O3 -march=native
x265 This is a simple test of the x265 encoder run on the CPU with 1080p and 4K options for H.265 video encode performance with x265. Learn more via the OpenBenchmarking.org test page.
x265 3.6, Video Input: Bosphorus 4K (Frames Per Second, More Is Better): Clang 18.1.1: 40.08 (SE +/- 0.12, N = 3); GCC 14.0.1 20240411: 39.16 (SE +/- 0.17, N = 3). 1. (CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better): GCC 14.0.1 20240411: 4397.54 (SE +/- 2.34, N = 3); Clang 18.1.1: 4369.14 (SE +/- 37.96, N = 15). 1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better): Clang 18.1.1: 5269.55 (SE +/- 6.63, N = 3); GCC 14.0.1 20240411: 4376.24 (SE +/- 5.94, N = 3). 1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample high resolution (currently 15400 x 6940) JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.43, Operation: Swirl (Iterations Per Minute, More Is Better): GCC 14.0.1 20240411: 554 (SE +/- 0.88, N = 3); Clang 18.1.1: 457 (SE +/- 0.58, N = 3). -lgomp -lomp 1. (CC) gcc options: -fopenmp -O3 -march=native -ljpeg -lX11 -lz -lm -lpthread
GraphicsMagick 1.3.43, Operation: Rotate (Iterations Per Minute, More Is Better): GCC 14.0.1 20240411: 155 (SE +/- 0.00, N = 3); Clang 18.1.1: 153 (SE +/- 0.00, N = 3). -lgomp -lomp 1. (CC) gcc options: -fopenmp -O3 -march=native -ljpeg -lX11 -lz -lm -lpthread
GraphicsMagick 1.3.43, Operation: Sharpen (Iterations Per Minute, More Is Better): GCC 14.0.1 20240411: 216 (SE +/- 0.33, N = 3); Clang 18.1.1: 199 (SE +/- 1.00, N = 3). -lgomp -lomp 1. (CC) gcc options: -fopenmp -O3 -march=native -ljpeg -lX11 -lz -lm -lpthread
GraphicsMagick 1.3.43, Operation: Enhanced (Iterations Per Minute, More Is Better): Clang 18.1.1: 256 (SE +/- 0.33, N = 3); GCC 14.0.1 20240411: 250 (SE +/- 0.33, N = 3). -lomp -lgomp 1. (CC) gcc options: -fopenmp -O3 -march=native -ljpeg -lX11 -lz -lm -lpthread
GraphicsMagick 1.3.43, Operation: Resizing (Iterations Per Minute, More Is Better): GCC 14.0.1 20240411: 193 (SE +/- 0.58, N = 3); Clang 18.1.1: 178 (SE +/- 0.33, N = 3). -lgomp -lomp 1. (CC) gcc options: -fopenmp -O3 -march=native -ljpeg -lX11 -lz -lm -lpthread
GraphicsMagick 1.3.43, Operation: Noise-Gaussian (Iterations Per Minute, More Is Better): GCC 14.0.1 20240411: 185 (SE +/- 0.33, N = 3); Clang 18.1.1: 159 (SE +/- 0.00, N = 3). -lgomp -lomp 1. (CC) gcc options: -fopenmp -O3 -march=native -ljpeg -lX11 -lz -lm -lpthread
GraphicsMagick 1.3.43, Operation: HWB Color Space (Iterations Per Minute, More Is Better): GCC 14.0.1 20240411: 259 (SE +/- 0.67, N = 3); Clang 18.1.1: 209 (SE +/- 0.33, N = 3). -lgomp -lomp 1. (CC) gcc options: -fopenmp -O3 -march=native -ljpeg -lX11 -lz -lm -lpthread
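When a test has several operations, a single summary figure can help. One common choice for ratios is the geometric mean, sketched here for the seven GraphicsMagick operations above. This is an illustrative calculation rather than something the result page reports: the function name `gcc_over_clang` is hypothetical and the scores are copied from the results above as (GCC, Clang) pairs.

```python
from statistics import geometric_mean

# Iterations per minute copied from the GraphicsMagick results above: (GCC, Clang).
results = {
    "Swirl": (554, 457),
    "Rotate": (155, 153),
    "Sharpen": (216, 199),
    "Enhanced": (250, 256),
    "Resizing": (193, 178),
    "Noise-Gaussian": (185, 159),
    "HWB Color Space": (259, 209),
}


def gcc_over_clang(scores: dict) -> float:
    """Geometric mean of the per-operation GCC/Clang ratios (>1 favors GCC)."""
    ratios = [gcc / clang for gcc, clang in scores.values()]
    return geometric_mean(ratios)


if __name__ == "__main__":
    print(f"GCC/Clang geometric mean: {gcc_over_clang(results):.3f}")
```

For these seven operations the ratio comes out a little above 1.1, meaning GCC is ahead on balance even though Clang wins the Enhanced operation.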
SecureMark SecureMark is an objective, standardized benchmarking framework for measuring the efficiency of cryptographic processing solutions developed by EEMBC. SecureMark-TLS is benchmarking Transport Layer Security performance with a focus on IoT/edge computing. Learn more via the OpenBenchmarking.org test page.
SecureMark 1.0.4, Benchmark: SecureMark-TLS (marks, More Is Better): Clang 18.1.1: 432909 (SE +/- 1948.09, N = 3); GCC 14.0.1 20240411: 416097 (SE +/- 1794.60, N = 3). 1. (CC) gcc options: -pedantic -O3
LZ4 Compression 1.9.4, Compression Level: 3 - Decompression Speed (MB/s, More Is Better): GCC 14.0.1 20240411: 5658.0 (SE +/- 20.86, N = 3); Clang 18.1.1: 5493.7 (SE +/- 60.76, N = 4). 1. (CC) gcc options: -O3 -march=native
LZ4 Compression 1.9.4, Compression Level: 9 - Compression Speed (MB/s, More Is Better): GCC 14.0.1 20240411: 49.02 (SE +/- 0.30, N = 3); Clang 18.1.1: 45.86 (SE +/- 0.07, N = 3). 1. (CC) gcc options: -O3 -march=native
LZ4 Compression 1.9.4, Compression Level: 9 - Decompression Speed (MB/s, More Is Better): GCC 14.0.1 20240411: 5882.1 (SE +/- 36.97, N = 3); Clang 18.1.1: 5640.1 (SE +/- 4.45, N = 3). 1. (CC) gcc options: -O3 -march=native
Zstd Compression This test measures the time needed to compress/decompress a sample file (silesia.tar) using Zstd (Zstandard) compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
Zstd Compression 1.5.4, Compression Level: 12 - Compression Speed (MB/s, More Is Better): GCC 14.0.1 20240411: 399.1 (SE +/- 3.11, N = 3); Clang 18.1.1: 383.4 (SE +/- 0.40, N = 3). -Qunused-arguments 1. (CC) gcc options: -O3 -march=native -pthread -lz
Zstd Compression 1.5.4, Compression Level: 12 - Decompression Speed (MB/s, More Is Better): Clang 18.1.1: 2340.9 (SE +/- 2.17, N = 3); GCC 14.0.1 20240411: 2251.4 (SE +/- 21.94, N = 3). -Qunused-arguments 1. (CC) gcc options: -O3 -march=native -pthread -lz
Zstd Compression 1.5.4, Compression Level: 19 - Compression Speed (MB/s, More Is Better): GCC 14.0.1 20240411: 25.9 (SE +/- 0.03, N = 3); Clang 18.1.1: 25.8 (SE +/- 0.07, N = 3). -Qunused-arguments 1. (CC) gcc options: -O3 -march=native -pthread -lz
Zstd Compression 1.5.4, Compression Level: 19 - Decompression Speed (MB/s, More Is Better): Clang 18.1.1: 1962.8 (SE +/- 2.11, N = 3); GCC 14.0.1 20240411: 1917.8 (SE +/- 2.52, N = 3). -Qunused-arguments 1. (CC) gcc options: -O3 -march=native -pthread -lz
Zstd Compression 1.5.4, Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better): Clang 18.1.1: 12.3 (SE +/- 0.03, N = 3); GCC 14.0.1 20240411: 12.1 (SE +/- 0.07, N = 3). -Qunused-arguments 1. (CC) gcc options: -O3 -march=native -pthread -lz
Zstd Compression 1.5.4, Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better): Clang 18.1.1: 1861.0 (SE +/- 3.56, N = 3); GCC 14.0.1 20240411: 1825.3 (SE +/- 10.14, N = 3). -Qunused-arguments 1. (CC) gcc options: -O3 -march=native -pthread -lz
srsRAN Project srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
srsRAN Project 23.10.1-20240325, Test: PDSCH Processor Benchmark, Throughput Total (Mbps, More Is Better): Clang 18.1.1: 20338.7 (SE +/- 173.47, N = 3); GCC 14.0.1 20240411: 20064.5 (SE +/- 160.00, N = 15). 1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
srsRAN Project 23.10.1-20240325, Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better): GCC 14.0.1 20240411: 4801.9 (SE +/- 0.38, N = 3; MIN: 3393.9 / MAX: 4802.5); Clang 18.1.1: 4446.1 (SE +/- 0.23, N = 3; MIN: 2910.7 / MAX: 4446.5). 1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
QuantLib QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and its built-in benchmark reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.
QuantLib 1.32, Configuration: Multi-Threaded (MFLOPS, More Is Better): Clang 18.1.1: 295240.5 (SE +/- 1085.73, N = 3); GCC 14.0.1 20240411: 293011.2 (SE +/- 975.48, N = 3). 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
QuantLib 1.32, Configuration: Single-Threaded (MFLOPS, More Is Better): Clang 18.1.1: 4697.6 (SE +/- 22.05, N = 3); GCC 14.0.1 20240411: 4611.6 (SE +/- 36.95, N = 3). 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
JPEG-XL libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance using the reference libjxl library. Learn more via the OpenBenchmarking.org test page.
JPEG-XL libjxl 0.10.1, Input: PNG - Quality: 90 (MP/s, More Is Better): Clang 18.1.1: 49.45 (SE +/- 0.47, N = 15); GCC 14.0.1 20240411: 48.58 (SE +/- 0.41, N = 15). 1. (CXX) g++ options: -O3 -march=native -fno-rtti -fPIE -pie -lm
JPEG-XL libjxl 0.10.1, Input: JPEG - Quality: 90 (MP/s, More Is Better): Clang 18.1.1: 51.25 (SE +/- 0.62, N = 4); GCC 14.0.1 20240411: 46.97 (SE +/- 0.51, N = 5). 1. (CXX) g++ options: -O3 -march=native -fno-rtti -fPIE -pie -lm
JPEG-XL libjxl 0.10.1, Input: PNG - Quality: 100 (MP/s, More Is Better): GCC 14.0.1 20240411: 41.18 (SE +/- 0.13, N = 3); Clang 18.1.1: 40.71 (SE +/- 0.04, N = 3). 1. (CXX) g++ options: -O3 -march=native -fno-rtti -fPIE -pie -lm
JPEG-XL libjxl 0.10.1, Input: JPEG - Quality: 100 (MP/s, More Is Better): GCC 14.0.1 20240411: 41.69 (SE +/- 0.17, N = 3); Clang 18.1.1: 41.65 (SE +/- 0.04, N = 3). 1. (CXX) g++ options: -O3 -march=native -fno-rtti -fPIE -pie -lm
JPEG-XL Decoding libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is suited for JPEG XL decode performance testing to PNG output file, the pts/jpexl test is for encode performance. The JPEG XL encoding/decoding is done using the libjxl codebase. Learn more via the OpenBenchmarking.org test page.
JPEG-XL Decoding libjxl 0.10.1, CPU Threads: All (MP/s, More Is Better): GCC 14.0.1 20240411: 600.87 (SE +/- 1.93, N = 3); Clang 18.1.1: 597.36 (SE +/- 3.24, N = 3).
WebP Image Encode 1.2.4, Encode Settings: Quality 100, Lossless (MP/s, More Is Better): GCC 14.0.1 20240411: 2.07 (SE +/- 0.01, N = 3); Clang 18.1.1: 2.05 (SE +/- 0.01, N = 3). -lpng16 -ljpeg 1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -lm
WebP Image Encode 1.2.4, Encode Settings: Quality 100, Highest Compression (MP/s, More Is Better): Clang 18.1.1: 5.64 (SE +/- 0.01, N = 3); GCC 14.0.1 20240411: 4.57 (SE +/- 0.00, N = 3). -lpng16 -ljpeg 1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -lm
WebP Image Encode 1.2.4, Encode Settings: Quality 100, Lossless, Highest Compression (MP/s, More Is Better): Clang 18.1.1: 0.82 (SE +/- 0.00, N = 3); GCC 14.0.1 20240411: 0.81 (SE +/- 0.00, N = 3). -lpng16 -ljpeg 1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -lm
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 4.7, Preset: Medium (MT/s, More Is Better): Clang 18.1.1: 459.58 (SE +/- 0.34, N = 3); GCC 14.0.1 20240411: 450.11 (SE +/- 0.79, N = 3). 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 4.7, Preset: Thorough (MT/s, More Is Better): Clang 18.1.1: 63.56 (SE +/- 0.11, N = 3); GCC 14.0.1 20240411: 62.45 (SE +/- 0.03, N = 3). 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 4.7, Preset: Exhaustive (MT/s, More Is Better): Clang 18.1.1: 5.5130 (SE +/- 0.0146, N = 3); GCC 14.0.1 20240411: 5.3446 (SE +/- 0.0104, N = 3). 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 4.7, Preset: Very Thorough (MT/s, More Is Better): Clang 18.1.1: 8.9615 (SE +/- 0.0212, N = 3); GCC 14.0.1 20240411: 8.7120 (SE +/- 0.0183, N = 3). 1. (CXX) g++ options: -O3 -flto -pthread
TSCP This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.
TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better): Clang 18.1.1: 2118389 (SE +/- 2576.48, N = 5); GCC 14.0.1 20240411: 1960084 (SE +/- 3119.54, N = 5). 1. (CC) gcc options: -O3 -march=native
GROMACS This tests the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
GROMACS 2024, Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better): GCC 14.0.1 20240411: 5.571 (SE +/- 0.005, N = 3); Clang 18.1.1: 5.247 (SE +/- 0.008, N = 3). 1. (CXX) g++ options: -O3 -march=native -lm
John The Ripper 2023.03.14, Test: WPA PSK (Real C/S, More Is Better): GCC 14.0.1 20240411: 541461 (SE +/- 668.00, N = 3); Clang 18.1.1: 460525 (SE +/- 3691.65, N = 3). 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
John The Ripper 2023.03.14, Test: Blowfish (Real C/S, More Is Better): Clang 18.1.1: 157785 (SE +/- 609.99, N = 3); GCC 14.0.1 20240411: 152866 (SE +/- 79.08, N = 3). 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
John The Ripper 2023.03.14, Test: HMAC-SHA512 (Real C/S, More Is Better): GCC 14.0.1 20240411: 298853000 (SE +/- 1504066.82, N = 3); Clang 18.1.1: 214448333 (SE +/- 6642498.38, N = 15). 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
John The Ripper 2023.03.14, Test: MD5 (Real C/S, More Is Better): Clang 18.1.1: 14840333 (SE +/- 17975.29, N = 3); GCC 14.0.1 20240411: 13318667 (SE +/- 15878.01, N = 3). 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
Stargate Digital Audio Workstation
Stargate is an open-source, cross-platform digital audio workstation (DAW) software package that offers "a unique and carefully curated experience" and scales from old systems up through modern multi-core systems. Stargate is GPLv3 licensed and makes use of Qt5 (PyQt5) for its user-interface. Learn more via the OpenBenchmarking.org test page.
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 96000 - Buffer Size: 1024 (Render Ratio, More Is Better)
  Clang 18.1.1: 6.550766 (SE +/- 0.013288, N = 3)
  GCC 14.0.1 20240411: 6.161616 (SE +/- 0.000716, N = 3)
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 192000 - Buffer Size: 1024 (Render Ratio, More Is Better)
  Clang 18.1.1: 4.238755 (SE +/- 0.003706, N = 3)
  GCC 14.0.1 20240411: 4.097617 (SE +/- 0.009323, N = 3)
  1. (CXX) g++ options (both tests): -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
Liquid-DSP
LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better)
  Clang 18.1.1: 23426308 (SE +/- 343818.42, N = 13)
  GCC 14.0.1 20240411: 18258917 (SE +/- 417025.62, N = 12)
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better)
  Clang 18.1.1: 1231066667 (SE +/- 5417358.93, N = 3)
  GCC 14.0.1 20240411: 1025733333 (SE +/- 3868821.24, N = 3)
Liquid-DSP 1.6 - Threads: 128 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better)
  Clang 18.1.1: 1297866667 (SE +/- 7521155.35, N = 3)
  GCC 14.0.1 20240411: 1242200000 (SE +/- 4106498.91, N = 3)
  1. (CC) gcc options (all three tests): -O3 -march=native -pthread -lm -lc -lliquid
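Because Liquid-DSP is reported at 1, 64 and 128 threads, the same figures can also be read as a scaling curve. The sketch below divides each throughput by the single-thread result for the same compiler. It assumes the values are paired with compiler labels in listing order, and `scaling` is an illustrative helper name.

```python
# samples/s figures copied from the Liquid-DSP results above, keyed by thread count.
# Assumption: values are paired with compiler labels by their order in the listing.
LIQUID_DSP = {
    "Clang 18.1.1": {1: 23426308, 64: 1231066667, 128: 1297866667},
    "GCC 14.0.1": {1: 18258917, 64: 1025733333, 128: 1242200000},
}


def scaling(by_threads):
    """Return throughput at each thread count relative to the 1-thread result."""
    base = by_threads[1]
    return {threads: value / base for threads, value in by_threads.items()}


for compiler, by_threads in LIQUID_DSP.items():
    for threads, factor in scaling(by_threads).items():
        print(f"{compiler}: {threads} threads = {factor:.1f}x single-thread throughput")
```

A larger multiple for one compiler does not by itself mean better absolute throughput, since it is measured against that compiler's own single-thread baseline.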
OpenSSL
OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.3 - Algorithm: RSA4096 (sign/s, More Is Better)
  GCC 14.0.1 20240411: 44305.1 (SE +/- 89.80, N = 3)
  Clang 18.1.1: 44172.3 (SE +/- 71.24, N = 3) -Qunused-arguments
OpenSSL 3.3 - Algorithm: RSA4096 (verify/s, More Is Better)
  GCC 14.0.1 20240411: 1280463.3 (SE +/- 1971.44, N = 3)
  Clang 18.1.1: 1275596.8 (SE +/- 967.52, N = 3) -Qunused-arguments
  1. (CC) gcc options (both tests): -pthread -m64 -O3 -march=native -ldl
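The RSA4096 results are close enough that the reported standard errors matter. As a rough gauge only, and not a formal significance test, the sketch below expresses the gap between the two means in units of the combined standard error, assuming the two runs are independent. The function name is illustrative.

```python
import math


def gap_in_combined_se(mean_a, se_a, mean_b, se_b):
    """Return |mean_a - mean_b| divided by sqrt(se_a^2 + se_b^2).

    Assumes the two runs are independent, so their standard errors add in quadrature.
    """
    return abs(mean_a - mean_b) / math.hypot(se_a, se_b)


# Means and standard errors copied from the OpenSSL RSA4096 results above.
sign_gap = gap_in_combined_se(44305.1, 89.80, 44172.3, 71.24)
verify_gap = gap_in_combined_se(1280463.3, 1971.44, 1275596.8, 967.52)

print(f"sign/s gap: {sign_gap:.2f} combined standard errors")
print(f"verify/s gap: {verify_gap:.2f} combined standard errors")
```

With only N = 3 runs per compiler, a gap of one or two combined standard errors is best read as "too close to call" rather than as a win for either build.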
Google SynthMark
SynthMark is a cross platform tool for benchmarking CPU performance under a variety of real-time audio workloads. It uses a polyphonic synthesizer model to provide standardized tests for latency, jitter and computational throughput. Learn more via the OpenBenchmarking.org test page.
Google SynthMark 20201109 - Test: VoiceMark_100 (Voices, More Is Better)
  Clang 18.1.1: 1002.52 (SE +/- 2.74, N = 3)
  GCC 14.0.1 20240411: 990.57 (SE +/- 2.19, N = 3)
  1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
Google Draco
Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.
Google Draco 1.5.6 - Model: Lion (ms, Fewer Is Better)
  Clang 18.1.1: 3942 (SE +/- 18.35, N = 3)
  GCC 14.0.1 20240411: 4049 (SE +/- 16.90, N = 3)
Google Draco 1.5.6 - Model: Church Facade (ms, Fewer Is Better)
  Clang 18.1.1: 5112 (SE +/- 9.17, N = 3)
  GCC 14.0.1 20240411: 5240 (SE +/- 9.33, N = 3)
  1. (CXX) g++ options (both tests): -O3 -march=native
OpenVINO
This is a test of Intel OpenVINO, a toolkit for neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.
OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 159.76 (SE +/- 0.22, N = 3) -fno-strict-overflow -fwrapv - MIN: 54.41 / MAX: 238.04
  Clang 18.1.1: 160.20 (SE +/- 0.53, N = 3) MIN: 50.55 / MAX: 283.62
OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 377.58 (SE +/- 0.25, N = 3) -fno-strict-overflow -fwrapv - MIN: 186.1 / MAX: 401.52
  Clang 18.1.1: 380.50 (SE +/- 0.31, N = 3) MIN: 321.62 / MAX: 405.78
OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 6.38 (SE +/- 0.01, N = 3) -fno-strict-overflow -fwrapv - MIN: 3.23 / MAX: 27.63
  Clang 18.1.1: 6.45 (SE +/- 0.01, N = 3) MIN: 3.45 / MAX: 34.45
OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 4.38 (SE +/- 0.00, N = 3) -fno-strict-overflow -fwrapv - MIN: 2.05 / MAX: 26.42
  Clang 18.1.1: 4.46 (SE +/- 0.00, N = 3) MIN: 2.1 / MAX: 27.33
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 24.44 (SE +/- 0.18, N = 3) -fno-strict-overflow -fwrapv - MIN: 11.45 / MAX: 51.44
  Clang 18.1.1: 25.20 (SE +/- 0.33, N = 3) MIN: 10.11 / MAX: 73
OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 89.77 (SE +/- 0.13, N = 3) -fno-strict-overflow -fwrapv - MIN: 33.39 / MAX: 179.49
  Clang 18.1.1: 100.60 (SE +/- 0.85, N = 15) MIN: 33.14 / MAX: 247.72
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 7.64 (SE +/- 0.01, N = 3) -fno-strict-overflow -fwrapv - MIN: 3.6 / MAX: 33.44
  Clang 18.1.1: 7.70 (SE +/- 0.01, N = 3) MIN: 3.41 / MAX: 30.42
OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 7.33 (SE +/- 0.01, N = 3) -fno-strict-overflow -fwrapv - MIN: 3.9 / MAX: 28.61
  Clang 18.1.1: 7.44 (SE +/- 0.01, N = 3) MIN: 4.21 / MAX: 29.46
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 18.28 (SE +/- 0.14, N = 15) -fno-strict-overflow -fwrapv - MIN: 7.31 / MAX: 109.14
  Clang 18.1.1: 21.12 (SE +/- 0.18, N = 15) MIN: 7.43 / MAX: 121.23
OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 5.33 (SE +/- 0.00, N = 3) -fno-strict-overflow -fwrapv - MIN: 3.15 / MAX: 26.39
  Clang 18.1.1: 5.40 (SE +/- 0.00, N = 3) MIN: 3.38 / MAX: 23.08
OpenVINO 2024.0 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 27.86 (SE +/- 0.02, N = 3) -fno-strict-overflow -fwrapv - MIN: 16.47 / MAX: 52.7
  Clang 18.1.1: 27.93 (SE +/- 0.06, N = 3) MIN: 16.08 / MAX: 54.79
OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  GCC 14.0.1 20240411: 0.43 (SE +/- 0.00, N = 3) -fno-strict-overflow -fwrapv - MIN: 0.19 / MAX: 22.84
  Clang 18.1.1: 0.43 (SE +/- 0.00, N = 3) MIN: 0.2 / MAX: 25.55
  1. (CXX) g++ options (all OpenVINO tests): -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -shared -ldl
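With twelve OpenVINO latency results, a single summary number can help. The sketch below takes the geometric mean of the per-model Clang/GCC latency ratios, which weights every model equally regardless of its absolute latency. It assumes the first value in each listing belongs to GCC and the second to Clang, following the order of the compiler labels; the variable names are illustrative.

```python
from statistics import geometric_mean

# Average latency in ms per model, copied from the OpenVINO results above (lower is better).
# Assumption: each tuple is (GCC 14.0.1, Clang 18.1.1), following the listing order.
OPENVINO_LATENCY_MS = {
    "Person Detection FP16": (159.76, 160.20),
    "Face Detection FP16-INT8": (377.58, 380.50),
    "Vehicle Detection FP16-INT8": (6.38, 6.45),
    "Face Detection Retail FP16-INT8": (4.38, 4.46),
    "Road Segmentation ADAS FP16-INT8": (24.44, 25.20),
    "Machine Translation EN To DE FP16": (89.77, 100.60),
    "Weld Porosity Detection FP16-INT8": (7.64, 7.70),
    "Person Vehicle Bike Detection FP16": (7.33, 7.44),
    "Noise Suppression Poconet-Like FP16": (18.28, 21.12),
    "Person Re-Identification Retail FP16": (5.33, 5.40),
    "Handwritten English Recognition FP16-INT8": (27.86, 27.93),
    "Age Gender Recognition Retail 0013 FP16-INT8": (0.43, 0.43),
}

# A ratio above 1.0 means the Clang build had the higher (worse) latency for that model.
ratios = {model: clang / gcc for model, (gcc, clang) in OPENVINO_LATENCY_MS.items()}
overall = geometric_mean(ratios.values())

print(f"Geometric mean Clang/GCC latency ratio: {overall:.3f}")
```

The geometric mean is used here rather than the arithmetic mean so that a ratio of 2.0 on one model and 0.5 on another would cancel out to 1.0.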
C-Ray
This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
  GCC 14.0.1 20240411: 6.179 (SE +/- 0.026, N = 3)
  Clang 18.1.1: 8.492 (SE +/- 0.018, N = 3)
  1. (CC) gcc options: -lm -lpthread -O3 -march=native
Opus Codec Encoding
Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus five times. Learn more via the OpenBenchmarking.org test page.
Opus Codec Encoding 1.4 - WAV To Opus Encode (Seconds, Fewer Is Better)
  Clang 18.1.1: 18.42 (SE +/- 0.02, N = 5)
  GCC 14.0.1 20240411: 19.33 (SE +/- 0.02, N = 5)
  1. (CXX) g++ options: -O3 -march=native -fvisibility=hidden -logg -lm
RNNoise
RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26 minute long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.
RNNoise 0.2 - Input: 26 Minute Long Talking Sample (Seconds, Fewer Is Better)
  Clang 18.1.1: 7.484 (SE +/- 0.046, N = 3)
  GCC 14.0.1 20240411: 7.607 (SE +/- 0.035, N = 3)
  1. (CC) gcc options: -O3 -march=native -pedantic -fvisibility=hidden
GCC 14.0.1 20240411: system details identical to the listing at the top of this page.
Testing initiated at 23 April 2024 15:56 by user phoronix.
Clang 18.1.1 Processor: AMD Ryzen Threadripper 7980X 64-Cores @ 7.79GHz (64 Cores / 128 Threads), Motherboard: System76 Thelio Major (FA Z5 BIOS), Chipset: AMD Device 14a4, Memory: 4 x 32GB DDR5-4800MT/s Micron MTC20F1045S1RC48BA2, Disk: 1000GB CT1000T700SSD5, Graphics: AMD Radeon Pro W7900 45GB, Audio: AMD Device 14cc, Monitor: DELL P2415Q, Network: Aquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Intel Wi-Fi 6E
OS: Fedora Linux 40, Kernel: 6.8.5-301.fc40.x86_64 (x86_64), Desktop: GNOME Shell 46.1, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 24.0.5 (LLVM 18.1.1 DRM 3.57), Compiler: Clang 18.1.1 + LLVM 18.1.1, File-System: btrfs, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105
Python Notes: Python 3.12.2
Security Notes: SELinux + gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 23 April 2024 23:44 by user phoronix.