cpuall_v2
AMD EPYC 7413 24-Core testing with a GIGABYTE MZ32-AR0-00 v01000100 (M18 BIOS) and Gigabyte NVIDIA GeForce RTX 4090 on Rocky Linux 9.3 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403189-NE-2403174NE64&gru&rdt .
System configurations

6000_4ch_ll:
  Processor:         Intel 0000% @ 3.30GHz (48 Cores / 96 Threads)
  Motherboard:       ASUS Pro WS W790E-SAGE SE (0215 BIOS)
  Chipset:           Intel Alder Lake-S PCH
  Memory:            64GB
  Disk:              4001GB CT4000P3SSD8 + 0GB Virtual HDisk0
  Graphics:          ASPEED
  Audio:             Realtek ALC1220
  Network:           2 x Intel X710 for 10GBASE-T
  OS:                Fedora 39
  Kernel:            6.7.7-200.fc39.x86_64 (x86_64)
  OpenCL:            OpenCL 3.0 + OpenCL 1.2 Intel FPGA SDK for OpenCL 20.3 + OpenCL 3.0 LINUX + OpenCL 1.2 Intel FPGA SDK for OpenCL 20.3
  Compiler:          GCC 13.2.1 20231205 + Clang 17.0.6 + LLVM 17.0.6
  File-System:       xfs
  Screen Resolution: 1920x1200

4090x2:
  Processor:         AMD EPYC 7413 24-Core @ 2.65GHz (24 Cores)
  Motherboard:       GIGABYTE MZ32-AR0-00 v01000100 (M18 BIOS)
  Chipset:           AMD Starship/Matisse
  Memory:            6 x 16 GB DDR4-2667MT/s 18ASF2G72PZ-2G6D2
  Disk:              960GB INTEL SSDPE21D960GA + 2 x 1600GB Toshiba KXG50PNV2T04 + 4001GB Nextorage SSD NE1N4TB + 3 x 59GB INTEL SSDPEK1A058GA
  Graphics:          Gigabyte NVIDIA GeForce RTX 4090
  Audio:             NVIDIA AD102 HD Audio
  Network:           Aquantia AQC107 NBase-T/IEEE + Mellanox MT27500
  OS:                Rocky Linux 9.3
  Kernel:            5.14.0-362.24.1.el9_3.x86_64 (x86_64)
  Desktop:           GNOME Shell 40.10
  Display Server:    X Server 1.20.11
  Compiler:          GCC 11.4.1 20230605 + Clang 16.0.6 + LLVM 16.0.6 + CUDA 12.3
  Screen Resolution: 1024x768

Kernel Details
  - 6000_4ch_ll: Transparent Huge Pages: madvise
  - 4090x2: Transparent Huge Pages: always

Compiler Details
  - 6000_4ch_ll: --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-multilib --enable-offload-defaulted --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=i686 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver
  - 4090x2: --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-host-bind-now --enable-host-pie --enable-initfini-array --enable-languages=c,c++,fortran,lto --enable-link-serialization=1 --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-arch_64=x86-64-v2 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver --without-isl

Disk Details
  - NONE / attr2,inode64,logbsize=32k,logbufs=8,noquota,relatime,rw,seclabel / Block Size: 4096

Processor Details
  - 6000_4ch_ll: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xd0004b1
  - 4090x2: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa0011d1

Security Details
  - 6000_4ch_ll: SELinux + gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
  - 4090x2: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: disabled RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Results overview

Test                                                                               6000_4ch_ll      4090x2
fio: Rand Read - POSIX AIO - Yes - 4KB - 1 - Default Test Directory (IOPS)         14771            7844
fio: Rand Read - POSIX AIO - Yes - 4KB - 32 - Default Test Directory (IOPS)        14720            10798
fio: Rand Write - POSIX AIO - Yes - 4KB - 1 - Default Test Directory (IOPS)        99667            64667
fio: Rand Write - POSIX AIO - Yes - 4KB - 32 - Default Test Directory (IOPS)       99620            64967
fio: Rand Read - POSIX AIO - Yes - 4KB - 1 - Default Test Directory (MB/s)         57.7             30.6
fio: Rand Read - POSIX AIO - Yes - 4KB - 32 - Default Test Directory (MB/s)        57.5             42.2
fio: Rand Write - POSIX AIO - Yes - 4KB - 1 - Default Test Directory (MB/s)        389              253
fio: Rand Write - POSIX AIO - Yes - 4KB - 32 - Default Test Directory (MB/s)       389              253
intel-mlc: Max Bandwidth - All Reads (MB/s)                                        113954.65        38822.93
intel-mlc: Max Bandwidth - 3:1 Reads-Writes (MB/s)                                 98042.64         32025.23
intel-mlc: Max Bandwidth - 2:1 Reads-Writes (MB/s)                                 93920.66         30554.89
intel-mlc: Max Bandwidth - 1:1 Reads-Writes (MB/s)                                 88062.65         29284.50
intel-mlc: Max Bandwidth - Stream-Triad Like (MB/s)                                95035.27         32011.55
intel-mlc: Peak Injection Bandwidth - All Reads (MB/s)                             113973.3         38812.0
intel-mlc: Peak Injection Bandwidth - 3:1 Reads-Writes (MB/s)                      97387.8          31776.0
intel-mlc: Peak Injection Bandwidth - 2:1 Reads-Writes (MB/s)                      92578.4          30510.8
intel-mlc: Peak Injection Bandwidth - 1:1 Reads-Writes (MB/s)                      87976.7          29207.1
intel-mlc: Peak Injection Bandwidth - Stream-Triad Like (MB/s)                     95038.5          32055.0
stream: Copy (MB/s)                                                                90064.0          32574.7
stream: Scale (MB/s)                                                               89640.2          21120.9
stream: Triad (MB/s)                                                               94108.6          23746.5
stream: Add (MB/s)                                                                 93836.8          23560.9
cachebench: Read (MB/s)                                                            12582.595354     9190.903357
cachebench: Write (MB/s)                                                           85773.382625     51622.472353
cachebench: Read / Modify / Write (MB/s)                                           93157.944569     102381.017139
intel-mlc: Idle Latency (ns)                                                       114.7            91.5
build-linux-kernel: defconfig (Seconds)                                            51.727           72.107
whisper-cpp: ggml-medium.en - 2016 State of the Union (Seconds)                    1652.16433       1218.07265
Flexible IO Tester 3.36
Engine: POSIX AIO - Direct: Yes - Block Size: 4KB - Disk Target: Default Test Directory

IOPS, More Is Better
Type           Job Count   6000_4ch_ll                          4090x2
Random Read    1           14771 (SE +/- 137.52, N = 7)         7844 (SE +/- 5.78, N = 3)
Random Read    32          14720 (SE +/- 159.37, N = 5)         10798 (SE +/- 280.58, N = 15)
Random Write   1           99667 (SE +/- 1816.90, N = 15)       64667 (SE +/- 120.19, N = 3)
Random Write   32          99620 (SE +/- 1450.85, N = 15)       64967 (SE +/- 66.67, N = 3)

MB/s, More Is Better
Type           Job Count   6000_4ch_ll                          4090x2
Random Read    1           57.7 (SE +/- 0.54, N = 7)            30.6 (SE +/- 0.03, N = 3)
Random Read    32          57.5 (SE +/- 0.62, N = 5)            42.2 (SE +/- 1.10, N = 15)
Random Write   1           389 (SE +/- 7.12, N = 15)            253 (SE +/- 0.67, N = 3)
Random Write   32          389 (SE +/- 5.69, N = 15)            253 (SE +/- 0.33, N = 3)

Additional flags: -libverbs -lrdmacm -lcurl -lssl -lcrypto
1. (CC) gcc options: -rdynamic -lz -lm -laio -lpthread -ldl -std=gnu99 -ffast-math -include -O3 -fcommon -march=native
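The fio results above are reported twice, once in IOPS and once in MB/s, for the same 4KB block size. As a rough cross-check, the two can be related with the small sketch below. It assumes the block size is 4096 bytes and that the reported "MB/s" is in units of 2^20 bytes; both are assumptions, and the function name `iops_to_mib_per_s` is hypothetical rather than part of fio or the Phoronix Test Suite. Because each reported figure is an average over several runs, small rounding differences against the published MB/s values are expected.

```python
def iops_to_mib_per_s(iops: float, block_size_bytes: int = 4096) -> float:
    """Convert an IOPS figure to throughput in MiB/s.

    Assumes every I/O transfers exactly block_size_bytes (4KB here)
    and that throughput is expressed in units of 2**20 bytes.
    """
    return iops * block_size_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Random Read, Job Count 1, IOPS values taken from the results above.
    for name, iops in (("6000_4ch_ll", 14771), ("4090x2", 7844)):
        print(f"{name}: {iops} IOPS ~ {iops_to_mib_per_s(iops):.1f} MiB/s")
```

For the Random Read, Job Count 1 case this gives figures that agree with the reported 57.7 and 30.6 MB/s to one decimal place.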
Intel Memory Latency Checker 3.10
MB/s, More Is Better

Test                                             6000_4ch_ll                           4090x2
Max Bandwidth - All Reads                        113954.65 (SE +/- 27.22, N = 3)       38822.93 (SE +/- 94.69, N = 3)
Max Bandwidth - 3:1 Reads-Writes                 98042.64 (SE +/- 47.67, N = 3)        32025.23 (SE +/- 27.70, N = 3)
Max Bandwidth - 2:1 Reads-Writes                 93920.66 (SE +/- 7.15, N = 3)         30554.89 (SE +/- 25.39, N = 3)
Max Bandwidth - 1:1 Reads-Writes                 88062.65 (SE +/- 4.74, N = 3)         29284.50 (SE +/- 19.96, N = 3)
Max Bandwidth - Stream-Triad Like                95035.27 (SE +/- 8.34, N = 3)         32011.55 (SE +/- 25.09, N = 3)
Peak Injection Bandwidth - All Reads             113973.3 (SE +/- 26.33, N = 3)        38812.0 (SE +/- 88.37, N = 3)
Peak Injection Bandwidth - 3:1 Reads-Writes      97387.8 (SE +/- 30.53, N = 3)         31776.0 (SE +/- 19.44, N = 3)
Peak Injection Bandwidth - 2:1 Reads-Writes      92578.4 (SE +/- 22.58, N = 3)         30510.8 (SE +/- 25.15, N = 3)
Peak Injection Bandwidth - 1:1 Reads-Writes      87976.7 (SE +/- 29.53, N = 3)         29207.1 (SE +/- 50.27, N = 3)
Peak Injection Bandwidth - Stream-Triad Like     95038.5 (SE +/- 8.22, N = 3)          32055.0 (SE +/- 16.97, N = 3)
Stream 2013-01-17
MB/s, More Is Better

Type     6000_4ch_ll                       4090x2
Copy     90064.0 (SE +/- 16.98, N = 5)     32574.7 (SE +/- 31.85, N = 5)
Scale    89640.2 (SE +/- 63.75, N = 5)     21120.9 (SE +/- 16.22, N = 5)
Triad    94108.6 (SE +/- 12.86, N = 5)     23746.5 (SE +/- 14.55, N = 5)
Add      93836.8 (SE +/- 26.62, N = 5)     23560.9 (SE +/- 17.00, N = 5)

1. (CC) gcc options: -mcmodel=medium -O3 -march=native -fopenmp
CacheBench
MB/s, More Is Better

Test: Read
  6000_4ch_ll: 12582.60 (SE +/- 0.15, N = 3; MIN: 12577.1 / MAX: 12583.24)
  4090x2:      9190.90 (SE +/- 6.47, N = 3; MIN: 9160.88 / MAX: 9203.86)

Test: Write
  6000_4ch_ll: 85773.38 (SE +/- 23.70, N = 3; MIN: 51047.3 / MAX: 97997.28)
  4090x2:      51622.47 (SE +/- 41.10, N = 3; MIN: 39519.8 / MAX: 54896.66)

Test: Read / Modify / Write
  6000_4ch_ll: 93157.94 (SE +/- 6.98, N = 3; MIN: 81727.05 / MAX: 98992.88)
  4090x2:      102381.02 (SE +/- 367.10, N = 3; MIN: 77271.95 / MAX: 109243.01)

1. (CC) gcc options: -O3 -lrt
Intel Memory Latency Checker 3.10
Test: Idle Latency
ns, Fewer Is Better
  6000_4ch_ll: 114.7 (SE +/- 0.37, N = 3)
  4090x2:      91.5 (SE +/- 0.03, N = 3)

Timed Linux Kernel Compilation 6.8
Build: defconfig
Seconds, Fewer Is Better
  6000_4ch_ll: 51.73 (SE +/- 0.51, N = 15)
  4090x2:      72.11 (SE +/- 0.70, N = 3)

Whisper.cpp 1.4
Model: ggml-medium.en - Input: 2016 State of the Union
Seconds, Fewer Is Better
  6000_4ch_ll: 1652.16 (SE +/- 10.69, N = 3)
  4090x2:      1218.07 (SE +/- 14.14, N = 9)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
Phoronix Test Suite v10.8.5