fkfa
AMD Ryzen Threadripper 3970X 32-Core testing with an ASUS ROG ZENITH II EXTREME (1802 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2402183-NE-FKFA3807928&grr&sor
System configuration for runs a, b, c:

Processor: AMD Ryzen Threadripper 3970X 32-Core @ 3.70GHz (32 Cores / 64 Threads)
Motherboard: ASUS ROG ZENITH II EXTREME (1802 BIOS)
Chipset: AMD Starship/Matisse
Memory: 4 x 16GB DDR4-3600MT/s Corsair CMT64GX4M4Z3600C16
Disk: Samsung SSD 980 PRO 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 22.04
Kernel: 6.2.0-39-generic (x86_64)
Desktop: GNOME Shell 42.2
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.0.1 (LLVM 13.0.1 DRM 3.49)
Vulkan: 1.2.204
Compiler: GCC 11.4.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0x830107a

Graphics Details
- BAR1 / Visible vRAM Size: 256 MB
- vBIOS Version: 113-D1820201-101

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
- spec_rstack_overflow: Mitigation of safe RET
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
Result overview (one row per test, one column per run):

a         b         c         Test
14171     14275     14310     vkfft: FFT + iFFT C2C 1D batched in double precision
0.56907   0.56955   0.56923   namd: STMV with 1,066,628 Atoms
3080      3069      3065      vkfft: FFT + iFFT C2C Bluestein benchmark in double precision
48131     49047     48154     vkfft: FFT + iFFT C2C 1D batched in single precision
3.257     3.259     3.254     gromacs: MPI CPU - water_GMX50_bare
53592     53670     52851     vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling
0.61      0.61      0.61      oidn: RTLightmap.hdr.4096x4096 - CPU-Only
7336      7299      7307      vkfft: FFT + iFFT C2C Bluestein in single precision
2.05279   2.05647   2.05986   namd: ATPase with 327,506 Atoms
17165     17831     17459     vkfft: FFT + iFFT C2C multidimensional in single precision
71571     74310     74530     vkfft: FFT + iFFT C2C 1D batched in half precision
1.24      1.24      1.24      oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only
1.24      1.24      1.24      oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only
371.78    370.6     371.49    dav1d: Chimera 1080p 10-bit
387.75    386.87    386.73    dav1d: Chimera 1080p
4312      4332      4311.3    compress-lz4: 3 - Decompression Speed
110.95    111.21    106.23    compress-lz4: 3 - Compression Speed
4530.1    4530.4    4544.2    compress-lz4: 9 - Decompression Speed
39.02     38.82     37.88     compress-lz4: 9 - Compression Speed
4694.7    4699.2    4700.8    compress-lz4: 1 - Decompression Speed
729.96    733.54    731.47    compress-lz4: 1 - Compression Speed
220.24    220.68    219.25    dav1d: Summer Nature 4K
24910     24784     25277     vkfft: FFT + iFFT R2C / C2R
595.65    595.67    595.01    dav1d: Summer Nature 1080p
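The three runs sit close together on most tests, so it can help to put a number on the run-to-run spread. The following is a minimal sketch, not part of the exported result: the helper name spread_percent and the choice of (max - min) / min as the spread measure are assumptions made for illustration, and only a handful of rows from the overview are copied in.

```python
# Values for runs a, b, c copied from the result overview.
# The selection of tests is arbitrary; add more rows as needed.
results = {
    "vkfft: FFT + iFFT C2C 1D batched in half precision": (71571, 74310, 74530),
    "compress-lz4: 3 - Compression Speed": (110.95, 111.21, 106.23),
    "compress-lz4: 9 - Compression Speed": (39.02, 38.82, 37.88),
    "dav1d: Summer Nature 1080p": (595.65, 595.67, 595.01),
}


def spread_percent(values):
    """Hypothetical helper: (max - min) / min, expressed in percent."""
    low, high = min(values), max(values)
    return (high - low) / low * 100.0


for name, runs in results.items():
    print(f"{name}: {spread_percent(runs):.2f}% spread across runs")
```

A spread of a few percent on a single test, with only one result per run, is not enough on its own to say the runs differ; the export does not include per-run deviation figures.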
Individual results (runs listed from best to worst, as in the exported charts):

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better): c: 14310, b: 14275, a: 14171. (CXX) g++ options: -O3
NAMD 3.0b6 - Input: STMV with 1,066,628 Atoms (ns/day, More Is Better): b: 0.56955, c: 0.56923, a: 0.56907
VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better): a: 3080, b: 3069, c: 3065. (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better): b: 49047, c: 48154, a: 48131. (CXX) g++ options: -O3
GROMACS 2024 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better): b: 3.259, a: 3.257, c: 3.254. (CXX) g++ options: -O3 -lm
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better): b: 53670, a: 53592, c: 52851. (CXX) g++ options: -O3
Intel Open Image Denoise 2.2 - Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only (Images / Sec, More Is Better): c: 0.61, b: 0.61, a: 0.61
VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better): a: 7336, c: 7307, b: 7299. (CXX) g++ options: -O3
NAMD 3.0b6 - Input: ATPase with 327,506 Atoms (ns/day, More Is Better): c: 2.05986, b: 2.05647, a: 2.05279
VkFFT 1.3.4 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better): b: 17831, c: 17459, a: 17165. (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better): c: 74530, b: 74310, a: 71571. (CXX) g++ options: -O3
Intel Open Image Denoise 2.2 - Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): c: 1.24, b: 1.24, a: 1.24
Intel Open Image Denoise 2.2 - Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): c: 1.24, b: 1.24, a: 1.24
dav1d 1.4 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better): a: 371.78, c: 371.49, b: 370.60. (CC) gcc options: -pthread
dav1d 1.4 - Video Input: Chimera 1080p (FPS, More Is Better): a: 387.75, b: 386.87, c: 386.73. (CC) gcc options: -pthread
LZ4 Compression 1.9.4 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better): b: 4332.0, a: 4312.0, c: 4311.3. (CC) gcc options: -O3
LZ4 Compression 1.9.4 - Compression Level: 3 - Compression Speed (MB/s, More Is Better): b: 111.21, a: 110.95, c: 106.23. (CC) gcc options: -O3
LZ4 Compression 1.9.4 - Compression Level: 9 - Decompression Speed (MB/s, More Is Better): c: 4544.2, b: 4530.4, a: 4530.1. (CC) gcc options: -O3
LZ4 Compression 1.9.4 - Compression Level: 9 - Compression Speed (MB/s, More Is Better): a: 39.02, b: 38.82, c: 37.88. (CC) gcc options: -O3
LZ4 Compression 1.9.4 - Compression Level: 1 - Decompression Speed (MB/s, More Is Better): c: 4700.8, b: 4699.2, a: 4694.7. (CC) gcc options: -O3
LZ4 Compression 1.9.4 - Compression Level: 1 - Compression Speed (MB/s, More Is Better): b: 733.54, c: 731.47, a: 729.96. (CC) gcc options: -O3
dav1d 1.4 - Video Input: Summer Nature 4K (FPS, More Is Better): b: 220.68, a: 220.24, c: 219.25. (CC) gcc options: -pthread
VkFFT 1.3.4 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better): c: 25277, a: 24910, b: 24784. (CXX) g++ options: -O3
dav1d 1.4 - Video Input: Summer Nature 1080p (FPS, More Is Better): b: 595.67, a: 595.65, c: 595.01. (CC) gcc options: -pthread
Phoronix Test Suite v10.8.5