Benchmarks by Michael Larabel for a future article.
Linux 6.8
Processor: 2 x AMD EPYC 9684X 96-Core @ 2.55GHz (192 Cores / 384 Threads), Motherboard: AMD Titanite_4G (RTI1007B BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS + 257GB Flash Drive, Graphics: ASPEED, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.10, Kernel: 6.8.0-060800-generic (x86_64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Linux 6.9 27 Mar
OS: Ubuntu 23.10, Kernel: 6.9.0-060900rc1daily20240327-generic (x86_64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
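The Security Notes for the two kernels are long and nearly identical, so the one entry that differs is easy to miss. The following is a minimal Python sketch that splits each string on " + " and reports entries that were added or changed. The helper name `parse_notes` is hypothetical, and the two strings are shortened excerpts of the notes above, not the full text.

```python
def parse_notes(notes: str) -> dict[str, str]:
    """Split a 'name: status + name: status' string into a dict."""
    result = {}
    for item in notes.split(" + "):
        # Split only on the first ': ' because some statuses contain colons.
        name, _, status = item.partition(": ")
        result[name.strip()] = status.strip()
    return result


# Shortened excerpts of the Security Notes reported for each kernel.
notes_68 = (
    "mmio_stale_data: Not affected + retbleed: Not affected + "
    "spec_rstack_overflow: Mitigation of Safe RET"
)
notes_69 = (
    "mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + "
    "retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET"
)

old, new = parse_notes(notes_68), parse_notes(notes_69)
added = sorted(set(new) - set(old))
changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])

print("added:", added)
print("changed:", changed)
```

Run against the full strings, the same comparison should flag only the `reg_file_data_sampling` entry that the Linux 6.9 build reports in addition to the Linux 6.8 list.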
AMD EPYC Genoa-X Linux 6.9 Kernel Benchmarks (OpenBenchmarking.org, Phoronix Test Suite)
Linux 6.8 vs. Linux 6.9 27 Mar Comparison (Phoronix Test Suite), relative difference per test:
Llamafile, llava-v1.5-7b-q4 - CPU: 75.4%
Stress-NG, System V Message Passing: 11.2%
Stress-NG, Semaphores: 10.8%
RocksDB, Read Rand Write Rand: 9.5%
RocksDB, Overwrite: 9.3%
RocksDB, Read While Writing: 8.5%
RocksDB, Update Rand: 8.1%
Stress-NG, NUMA: 7%
Speedb, Update Rand: 6.8%
Llamafile, mistral-7b-instruct-v0.2.Q8_0 - CPU: 5.2%
OpenVINO, Road Segmentation ADAS FP16-INT8 - CPU: 5.1%
OpenVINO, Road Segmentation ADAS FP16-INT8 - CPU: 5.1%
RocksDB, Rand Read: 4.7%
OpenVINO, Vehicle Detection FP16-INT8 - CPU: 4.5%
Stress-NG, Context Switching: 4.4%
OpenVINO, Vehicle Detection FP16-INT8 - CPU: 4.4%
Hackbench, 32 - Process: 3.9%
Stress-NG, AVX-512 VNNI: 3.3%
Stress-NG, Vector Math: 3.2%
OpenVINO, Machine Translation EN To DE FP16 - CPU: 3.1%
OpenVINO, Machine Translation EN To DE FP16 - CPU: 3%
SVT-AV1, Preset 12 - Bosphorus 4K: 3%
OpenVINO, Face Detection Retail FP16-INT8 - CPU: 2.8%
OpenVINO, Face Detection Retail FP16-INT8 - CPU: 2.7%
Quicksilver, CORAL2 P2: 2.7%
Speedb, Rand Read: 2.6%
OpenVINO, Handwritten English Recognition FP16-INT8 - CPU: 2.5%
Stress-NG, Matrix Math: 2.4%
OpenVINO, Handwritten English Recognition FP16-INT8 - CPU: 2.3%
Speedb, Read Rand Write Rand: 2%
MariaDB mariadb-slap, 1024: 2%
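The comparison percentages can be checked against the raw means listed in the results. The sketch below assumes each percentage is a simple ratio of the two kernels' means, inverted for "Fewer Is Better" metrics. That assumption reproduces the sampled figures, but it is not confirmed to be the Phoronix Test Suite's own calculation, and the function name `relative_gain` is hypothetical.

```python
def relative_gain(old: float, new: float, higher_is_better: bool = True) -> float:
    """Percent improvement of `new` over `old`; negative means a regression."""
    ratio = new / old if higher_is_better else old / new
    return (ratio - 1.0) * 100.0


# (test, Linux 6.8 mean, Linux 6.9 27 Mar mean, higher_is_better)
samples = [
    ("Llamafile llava-v1.5-7b-q4 (Tokens Per Second)", 12.03, 21.10, True),
    ("RocksDB Overwrite (Op/s)", 381523, 416860, True),
    ("OpenVINO Vehicle Detection FP16-INT8 latency (ms)", 4.26, 4.08, False),
    ("Speedb Read Random Write Random (Op/s)", 1603891, 1572109, True),
]

for name, old, new, higher_is_better in samples:
    print(f"{name}: {relative_gain(old, new, higher_is_better):+.1f}%")
```

Keeping the sign makes the direction explicit: under this calculation the Speedb Read Random Write Random entry comes out negative, meaning Linux 6.8 was ahead in that test.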
Test | Linux 6.8 | Linux 6.9 27 Mar
stress-ng: NUMA | 646.34 | 691.54
stress-ng: Pthread | 58297.64 | 59057.15
stress-ng: Semaphores | 264112681.09 | 292541552.77
stress-ng: Matrix Math | 915179.07 | 937042.03
stress-ng: Vector Math | 1306706.19 | 1349057.51
stress-ng: AVX-512 VNNI | 18968297.94 | 19594674.67
stress-ng: Context Switching | 40075791.13 | 41854298.68
stress-ng: System V Message Passing | 25534145.87 | 28384870.15
openssl: SHA256 | 284349804093 | 283971383020
openssl: SHA512 | 92015573177 | 91315613557
openssl: ChaCha20 | 1095564664557 | 1093188749947
openssl: AES-128-GCM | 2003432456830 | 2001406545920
openssl: AES-256-GCM | 1735984667803 | 1732048811713
openssl: ChaCha20-Poly1305 | 777163987993 | 774761881287
quicksilver: CORAL2 P1 | 17713333 | 17903333
quicksilver: CORAL2 P2 | 15873333 | 16294444
openvino: Vehicle Detection FP16-INT8 - CPU (FPS) | 11218.12 | 11719.99
openvino: Face Detection Retail FP16-INT8 - CPU (FPS) | 34785.75 | 35740.91
openvino: Road Segmentation ADAS FP16-INT8 - CPU (FPS) | 3864.65 | 4063.11
openvino: Machine Translation EN To DE FP16 - CPU (FPS) | 1086.48 | 1119.56
openvino: Noise Suppression Poconet-Like FP16 - CPU (FPS) | 7703.03 | 7762.09
openvino: Handwritten English Recognition FP16-INT8 - CPU (FPS) | 5765.91 | 5898.06
svt-av1: Preset 4 - Bosphorus 4K | 8.431 | 8.565
svt-av1: Preset 8 - Bosphorus 4K | 86.617 | 88.143
svt-av1: Preset 12 - Bosphorus 4K | 159.513 | 164.308
svt-av1: Preset 13 - Bosphorus 4K | 157.918 | 159.118
tensorflow: CPU - 256 - ResNet-50 | 117.77 | 119.29
tensorflow: CPU - 512 - ResNet-50 | 136.82 | 136.37
sockperf: Throughput | 319386 | 324029
compress-7zip: Compression Rating | 833319 | 821061
compress-7zip: Decompression Rating | 1357351 | 1345621
namd: ATPase with 327,506 Atoms | 20.75382 | 20.68441
namd: STMV with 1,066,628 Atoms | 6.46062 | 6.38903
rocksdb: Overwrite | 381523 | 416860
rocksdb: Rand Read | 1047290863 | 1096081951
rocksdb: Update Rand | 375714 | 406258
rocksdb: Read While Writing | 20665324 | 22418140
rocksdb: Read Rand Write Rand | 3182572 | 3484877
speedb: Rand Read | 1084511886 | 1112167042
speedb: Update Rand | 293746 | 313791
speedb: Read Rand Write Rand | 1603891 | 1572109
mysqlslap: 1024 | 51 | 52
openssl: RSA4096 (sign/s) | 97844.2 | 97661.7
llamafile: llava-v1.5-7b-q4 - CPU | 12.03 | 21.10
llamafile: mistral-7b-instruct-v0.2.Q8_0 - CPU | 12.79 | 13.46
openssl: RSA4096 (verify/s) | 3232989.3 | 3230787.6
ospray-studio: 1 - 4K - 1 - Path Tracer - CPU | 558 | 559
ospray-studio: 2 - 4K - 1 - Path Tracer - CPU | 563 | 563
ospray-studio: 3 - 4K - 1 - Path Tracer - CPU | 659 | 655
ospray-studio: 1 - 4K - 32 - Path Tracer - CPU | 17788 | 17746
ospray-studio: 2 - 4K - 32 - Path Tracer - CPU | 17834 | 17877
ospray-studio: 3 - 4K - 32 - Path Tracer - CPU | 20984 | 20914
openvino: Vehicle Detection FP16-INT8 - CPU (ms) | 4.26 | 4.08
openvino: Face Detection Retail FP16-INT8 - CPU (ms) | 5.50 | 5.35
openvino: Road Segmentation ADAS FP16-INT8 - CPU (ms) | 12.40 | 11.80
openvino: Machine Translation EN To DE FP16 - CPU (ms) | 44.13 | 42.82
openvino: Noise Suppression Poconet-Like FP16 - CPU (ms) | 23.59 | 23.50
openvino: Handwritten English Recognition FP16-INT8 - CPU (ms) | 33.25 | 32.43
build-godot: Time To Compile | 87.482 | 87.392
build-linux-kernel: defconfig | 26.914 | 26.880
build-linux-kernel: allmodconfig | 198.990 | 196.881
build-llvm: Ninja | 86.399 | 86.719
hackbench: 32 - Process | 10.836 | 10.425
sockperf: Latency Under Load | 16.356 | 16.109
Stress-NG 0.16.04, Test: Pthread (Bogo Ops/s, More Is Better). Linux 6.8: 58297.64 (SE +/- 649.26, N = 3); Linux 6.9 27 Mar: 59057.15 (SE +/- 308.02, N = 2)
Stress-NG 0.16.04, Test: Semaphores (Bogo Ops/s, More Is Better). Linux 6.8: 264112681.09 (SE +/- 1107000.14, N = 3); Linux 6.9 27 Mar: 292541552.77 (SE +/- 1842285.12, N = 3)
Stress-NG 0.16.04, Test: Matrix Math (Bogo Ops/s, More Is Better). Linux 6.8: 915179.07 (SE +/- 3353.49, N = 3); Linux 6.9 27 Mar: 937042.03 (SE +/- 4156.75, N = 3)
Stress-NG 0.16.04, Test: Vector Math (Bogo Ops/s, More Is Better). Linux 6.8: 1306706.19 (SE +/- 1312.02, N = 3); Linux 6.9 27 Mar: 1349057.51 (SE +/- 1326.11, N = 3)
Stress-NG 0.16.04, Test: AVX-512 VNNI (Bogo Ops/s, More Is Better). Linux 6.8: 18968297.94 (SE +/- 19220.66, N = 3); Linux 6.9 27 Mar: 19594674.67 (SE +/- 25357.28, N = 3)
Stress-NG 0.16.04, Test: Context Switching (Bogo Ops/s, More Is Better). Linux 6.8: 40075791.13 (SE +/- 130933.36, N = 3); Linux 6.9 27 Mar: 41854298.68 (SE +/- 110197.66, N = 3)
Stress-NG 0.16.04, Test: System V Message Passing (Bogo Ops/s, More Is Better). Linux 6.8: 25534145.87 (SE +/- 770160.74, N = 12); Linux 6.9 27 Mar: 28384870.15 (SE +/- 9139.64, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc
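Each result carries a standard error (SE) alongside its mean, which gives a rough way to judge whether a difference between the kernels stands out from run-to-run noise. The sketch below divides the difference in means by the combined standard error of the two results. It assumes the SE values are listed in the same order as the kernels and that the runs are independent. With only 2 to 12 runs per result this is a heuristic, not a formal significance test, and the function name `difference_in_se` is hypothetical.

```python
import math


def difference_in_se(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Difference of two means expressed in units of their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_b - mean_a) / combined_se


# Stress-NG Pthread: Linux 6.8 vs. Linux 6.9 27 Mar (Bogo Ops/s)
pthread = difference_in_se(58297.64, 649.26, 59057.15, 308.02)

# Stress-NG System V Message Passing: Linux 6.8 vs. Linux 6.9 27 Mar (Bogo Ops/s)
sysv = difference_in_se(25534145.87, 770160.74, 28384870.15, 9139.64)

print(f"Pthread: {pthread:.2f} combined SE")
print(f"System V Message Passing: {sysv:.2f} combined SE")
```

Under these assumptions the Pthread difference is about one combined standard error and is hard to separate from noise, while the System V Message Passing difference is several times larger than its combined standard error.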
OpenSSL
OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities.
OpenSSL 3.1, Algorithm: SHA256 (byte/s, More Is Better). Linux 6.8: 284349804093 (SE +/- 6021947.24, N = 3); Linux 6.9 27 Mar: 283971383020 (SE +/- 91878369.01, N = 3)
OpenSSL 3.1, Algorithm: SHA512 (byte/s, More Is Better). Linux 6.8: 92015573177 (SE +/- 24428915.02, N = 3); Linux 6.9 27 Mar: 91315613557 (SE +/- 200003199.85, N = 3)
OpenSSL 3.1, Algorithm: ChaCha20 (byte/s, More Is Better). Linux 6.8: 1095564664557 (SE +/- 5401967811.92, N = 3); Linux 6.9 27 Mar: 1093188749947 (SE +/- 4778305950.15, N = 3)
OpenSSL 3.1, Algorithm: AES-128-GCM (byte/s, More Is Better). Linux 6.8: 2003432456830 (SE +/- 5046453117.89, N = 3); Linux 6.9 27 Mar: 2001406545920 (SE +/- 483814569.81, N = 3)
OpenSSL 3.1, Algorithm: AES-256-GCM (byte/s, More Is Better). Linux 6.8: 1735984667803 (SE +/- 826670837.74, N = 3); Linux 6.9 27 Mar: 1732048811713 (SE +/- 1466705411.41, N = 3)
OpenSSL 3.1, Algorithm: ChaCha20-Poly1305 (byte/s, More Is Better). Linux 6.8: 777163987993 (SE +/- 372177165.42, N = 3); Linux 6.9 27 Mar: 774761881287 (SE +/- 709551963.95, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -ldl
Quicksilver
Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path.
Quicksilver 20230818, Input: CORAL2 P1 (Figure Of Merit, More Is Better). Linux 6.8: 17713333 (SE +/- 17638.34, N = 3); Linux 6.9 27 Mar: 17903333 (SE +/- 156027.06, N = 3)
Quicksilver 20230818, Input: CORAL2 P2 (Figure Of Merit, More Is Better). Linux 6.8: 15873333 (SE +/- 81921.37, N = 3); Linux 6.9 27 Mar: 16294444 (SE +/- 272810.02, N = 9)
1. (CXX) g++ options: -fopenmp -O3 -march=native
OpenVINO
This is a test of Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models.
OpenVINO 2024.0, Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better). Linux 6.8: 11218.12 (SE +/- 5.76, N = 3); Linux 6.9 27 Mar: 11719.99 (SE +/- 24.42, N = 3)
OpenVINO 2024.0, Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, More Is Better). Linux 6.8: 34785.75 (SE +/- 19.55, N = 3); Linux 6.9 27 Mar: 35740.91 (SE +/- 83.35, N = 3)
OpenVINO 2024.0, Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, More Is Better). Linux 6.8: 3864.65 (SE +/- 1.98, N = 3); Linux 6.9 27 Mar: 4063.11 (SE +/- 3.68, N = 3)
OpenVINO 2024.0, Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better). Linux 6.8: 1086.48 (SE +/- 1.18, N = 3); Linux 6.9 27 Mar: 1119.56 (SE +/- 0.57, N = 3)
OpenVINO 2024.0, Model: Noise Suppression Poconet-Like FP16 - Device: CPU (FPS, More Is Better). Linux 6.8: 7703.03 (SE +/- 90.39, N = 3); Linux 6.9 27 Mar: 7762.09 (SE +/- 55.96, N = 3)
OpenVINO 2024.0, Model: Handwritten English Recognition FP16-INT8 - Device: CPU (FPS, More Is Better). Linux 6.8: 5765.91 (SE +/- 9.86, N = 3); Linux 6.9 27 Mar: 5898.06 (SE +/- 26.57, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
SVT-AV1
This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format; this test encodes a sample YUV video file.
SVT-AV1 2.0, Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better). Linux 6.8: 8.431 (SE +/- 0.052, N = 3); Linux 6.9 27 Mar: 8.565 (SE +/- 0.054, N = 3)
SVT-AV1 2.0, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better). Linux 6.8: 86.62 (SE +/- 0.60, N = 15); Linux 6.9 27 Mar: 88.14 (SE +/- 0.66, N = 15)
SVT-AV1 2.0, Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better). Linux 6.8: 159.51 (SE +/- 1.75, N = 3); Linux 6.9 27 Mar: 164.31 (SE +/- 0.71, N = 3)
SVT-AV1 2.0, Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better). Linux 6.8: 157.92 (SE +/- 1.80, N = 15); Linux 6.9 27 Mar: 159.12 (SE +/- 1.96, N = 15)
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
TensorFlow
This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if complementary metrics are desired.
TensorFlow 2.16.1, Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, More Is Better). Linux 6.8: 117.77 (SE +/- 0.34, N = 3); Linux 6.9 27 Mar: 119.29 (SE +/- 0.86, N = 3)
TensorFlow 2.16.1, Device: CPU - Batch Size: 512 - Model: ResNet-50 (images/sec, More Is Better). Linux 6.8: 136.82 (SE +/- 0.10, N = 3); Linux 6.9 27 Mar: 136.37 (SE +/- 0.14, N = 3)
Sockperf
This is a network socket API performance benchmark developed by Mellanox. This test profile runs both the client and server on the local host for evaluating individual system performance.
Sockperf 3.7, Test: Throughput (Messages Per Second, More Is Better). Linux 6.8: 319386 (SE +/- 2015.52, N = 5); Linux 6.9 27 Mar: 324029 (SE +/- 3244.28, N = 6)
1. (CXX) g++ options: --param -O3 -rdynamic
7-Zip Compression 22.01, Test: Decompression Rating (MIPS, More Is Better). Linux 6.8: 1357351 (SE +/- 3867.90, N = 3); Linux 6.9 27 Mar: 1345621 (SE +/- 9219.74, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
NAMD
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign.
NAMD 3.0b6, Input: ATPase with 327,506 Atoms (ns/day, More Is Better). Linux 6.8: 20.75 (SE +/- 0.03, N = 3); Linux 6.9 27 Mar: 20.68 (SE +/- 0.07, N = 3)
NAMD 3.0b6, Input: STMV with 1,066,628 Atoms (ns/day, More Is Better). Linux 6.8: 6.46062 (SE +/- 0.00857, N = 3); Linux 6.9 27 Mar: 6.38903 (SE +/- 0.00170, N = 3)
RocksDB
This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB.
RocksDB 9.0, Test: Overwrite (Op/s, More Is Better). Linux 6.8: 381523 (SE +/- 180.51, N = 3); Linux 6.9 27 Mar: 416860 (SE +/- 1320.84, N = 3)
RocksDB 9.0, Test: Random Read (Op/s, More Is Better). Linux 6.8: 1047290863 (SE +/- 4458137.36, N = 3); Linux 6.9 27 Mar: 1096081951 (SE +/- 2941599.90, N = 3)
RocksDB 9.0, Test: Update Random (Op/s, More Is Better). Linux 6.8: 375714 (SE +/- 379.66, N = 3); Linux 6.9 27 Mar: 406258 (SE +/- 1155.95, N = 3)
RocksDB 9.0, Test: Read While Writing (Op/s, More Is Better). Linux 6.8: 20665324 (SE +/- 155882.11, N = 3); Linux 6.9 27 Mar: 22418140 (SE +/- 390827.28, N = 12)
RocksDB 9.0, Test: Read Random Write Random (Op/s, More Is Better). Linux 6.8: 3182572 (SE +/- 3761.31, N = 3); Linux 6.9 27 Mar: 3484877 (SE +/- 3987.34, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Speedb
Speedb is a next-generation key-value storage engine that is RocksDB compatible and aims for stability, efficiency, and performance.
Speedb 2.7, Test: Random Read (Op/s, More Is Better). Linux 6.8: 1084511886 (SE +/- 5405476.66, N = 3); Linux 6.9 27 Mar: 1112167042 (SE +/- 848602.66, N = 3)
Speedb 2.7, Test: Update Random (Op/s, More Is Better). Linux 6.8: 293746 (SE +/- 945.90, N = 3); Linux 6.9 27 Mar: 313791 (SE +/- 2172.75, N = 3)
Speedb 2.7, Test: Read Random Write Random (Op/s, More Is Better). Linux 6.8: 1603891 (SE +/- 2371.05, N = 3); Linux 6.9 27 Mar: 1572109 (SE +/- 7976.44, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenSSL 3.1, Algorithm: RSA4096 (sign/s, More Is Better). Linux 6.8: 97844.2 (SE +/- 54.75, N = 3); Linux 6.9 27 Mar: 97661.7 (SE +/- 73.99, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -ldl
Llamafile
Mozilla's Llamafile allows distributing and running large language models (LLMs) as a single file. Llamafile aims to make open-source LLMs more accessible to developers and users. Llamafile supports a variety of models, CPUs and GPUs, and other options.
Llamafile 0.6, Test: llava-v1.5-7b-q4 - Acceleration: CPU (Tokens Per Second, More Is Better). Linux 6.8: 12.03 (SE +/- 0.42, N = 15); Linux 6.9 27 Mar: 21.10 (SE +/- 0.49, N = 15)
Llamafile 0.6, Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPU (Tokens Per Second, More Is Better). Linux 6.8: 12.79 (SE +/- 0.04, N = 3); Linux 6.9 27 Mar: 13.46 (SE +/- 0.04, N = 3)
OpenSSL 3.1, Algorithm: RSA4096 (verify/s, More Is Better). Linux 6.8: 3232989.3 (SE +/- 1592.51, N = 3); Linux 6.9 27 Mar: 3230787.6 (SE +/- 1390.93, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -ldl
OSPRay Studio
Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit.
OSPRay Studio 1.0, Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better). Linux 6.8: 558 (SE +/- 1.00, N = 3); Linux 6.9 27 Mar: 559 (SE +/- 1.15, N = 3)
OSPRay Studio 1.0, Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better). Linux 6.8: 563 (SE +/- 1.33, N = 3); Linux 6.9 27 Mar: 563 (SE +/- 1.53, N = 3)
OSPRay Studio 1.0, Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better). Linux 6.8: 659 (SE +/- 3.28, N = 3); Linux 6.9 27 Mar: 655 (SE +/- 0.58, N = 3)
OSPRay Studio 1.0, Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better). Linux 6.8: 17788 (SE +/- 60.00, N = 3); Linux 6.9 27 Mar: 17746 (SE +/- 64.98, N = 3)
OSPRay Studio 1.0, Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better). Linux 6.8: 17834 (SE +/- 30.37, N = 3); Linux 6.9 27 Mar: 17877 (SE +/- 68.53, N = 3)
OSPRay Studio 1.0, Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better). Linux 6.8: 20984 (SE +/- 40.08, N = 3); Linux 6.9 27 Mar: 20914 (SE +/- 48.89, N = 3)
OpenVINO 2024.0, Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better). Linux 6.8: 4.26 (SE +/- 0.00, N = 3; MIN: 3.76 / MAX: 20.9); Linux 6.9 27 Mar: 4.08 (SE +/- 0.01, N = 3; MIN: 3.64 / MAX: 20.85)
OpenVINO 2024.0, Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, Fewer Is Better). Linux 6.8: 5.50 (SE +/- 0.00, N = 3; MIN: 4.83 / MAX: 27); Linux 6.9 27 Mar: 5.35 (SE +/- 0.01, N = 3; MIN: 4.84 / MAX: 24.91)
OpenVINO 2024.0, Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, Fewer Is Better). Linux 6.8: 12.40 (SE +/- 0.01, N = 3; MIN: 11.56 / MAX: 38.83); Linux 6.9 27 Mar: 11.80 (SE +/- 0.01, N = 3; MIN: 11.09 / MAX: 37.4)
OpenVINO 2024.0, Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better). Linux 6.8: 44.13 (SE +/- 0.05, N = 3; MIN: 36.04 / MAX: 216.03); Linux 6.9 27 Mar: 42.82 (SE +/- 0.02, N = 3; MIN: 34.68 / MAX: 192.44)
OpenVINO 2024.0, Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, Fewer Is Better). Linux 6.8: 23.59 (SE +/- 0.34, N = 3; MIN: 9.67 / MAX: 85.05); Linux 6.9 27 Mar: 23.50 (SE +/- 0.26, N = 3; MIN: 9.59 / MAX: 65.24)
OpenVINO 2024.0, Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, Fewer Is Better). Linux 6.8: 33.25 (SE +/- 0.05, N = 3; MIN: 29.55 / MAX: 52.69); Linux 6.9 27 Mar: 32.43 (SE +/- 0.13, N = 3; MIN: 29.07 / MAX: 50.08)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Sockperf 3.7, Test: Latency Under Load (usec, Fewer Is Better). Linux 6.8: 16.36 (SE +/- 0.13, N = 10); Linux 6.9 27 Mar: 16.11 (SE +/- 0.12, N = 24)
1. (CXX) g++ options: --param -O3 -rdynamic
Linux 6.8 run:
Testing initiated at 27 March 2024 16:44 by user phoronix.
Linux 6.9 27 Mar run:
Testing initiated at 28 March 2024 01:54 by user phoronix.