new rn
AMD Ryzen Threadripper 3970X 32-Core testing with an ASUS ROG ZENITH II EXTREME (1802 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2410152-NE-NEWRN099440&sro&grs
System configuration (identical for runs a, b, c, d)
Processor: AMD Ryzen Threadripper 3970X 32-Core @ 3.70GHz (32 Cores / 64 Threads)
Motherboard: ASUS ROG ZENITH II EXTREME (1802 BIOS)
Chipset: AMD Starship/Matisse
Memory: 4 x 16GB DDR4-3600MT/s Corsair CMT64GX4M4Z3600C16
Disk: Samsung SSD 980 PRO 500GB
Graphics: AMD Radeon RX 5700 8GB
Audio: AMD Navi 10 HDMI Audio
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 22.04
Kernel: 6.8.0-45-generic (x86_64)
Desktop: GNOME Shell 42.9
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.0.1 (LLVM 13.0.1 DRM 3.57)
Vulkan: 1.2.204
Compiler: GCC 11.4.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0x830107a

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
- spec_rstack_overflow: Mitigation of Safe RET
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
Results overview (runs a, b, c, d; units as given in the per-test results)

Test                                                a          b          c          d
onednn: Recurrent Neural Network Training - CPU     1063.25    1065.95    1068.48    3079.07
onednn: IP Shapes 3D - CPU                          6.26589    9.69956    9.83021    9.98705
xnnpack: FP16MobileNetV3Small                       2270       2151       2074       2455
onednn: Convolution Batch Shapes Auto - CPU         5.41074    5.94199    5.92981    5.9588
litert: Mobilenet Quant                             1405.64    1299.31    1300.82    1291.57
xnnpack: FP32MobileNetV3Small                       2310       2209       2266       2219
xnnpack: FP32MobileNetV2                            2695       2678       2788       2692
xnnpack: FP16MobileNetV2                            2200       2156       2194       2242
litert: Mobilenet Float                             1820.83    1890.63    1830.85    1827.86
onednn: Deconvolution Batch shapes_1d - CPU         8.92496    8.96791    8.68858    8.80404
xnnpack: FP32MobileNetV3Large                       3828       3872       3938       3904
litert: DeepLab V3                                  3924.43    3884.51    3892.22    3963.25
xnnpack: QS8MobileNetV2                             2086       2058       2060       2097
onednn: Recurrent Neural Network Inference - CPU    544.588    545.194    546.139    551.716
onednn: Deconvolution Batch shapes_3d - CPU         2.74673    2.71999    2.71566    2.74986
xnnpack: FP16MobileNetV1                            1614       1606       1616       1597
litert: Inception V4                                30460.8    30192.8    30102.7    30176.1
litert: SqueezeNet                                  2506.99    2507.88    2533.05    2524.93
litert: Quantized COCO SSD MobileNet v1             2356.92    2344.66    2356.07    2366.82
xnnpack: FP16MobileNetV3Large                       3100       3099       3106       3080
xnnpack: FP32MobileNetV1                            1929       1920       1925       1935
litert: Inception ResNet V2                         29125      29065.3    29257.6    29288.1
litert: NASNet Mobile                               21130.2    21246.2    21105.6    21148.3
onednn: IP Shapes 1D - CPU                          1.11186    1.10833    1.10933    1.10728
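Most rows in the overview agree closely across the four runs, but a few do not. The following is a minimal Python sketch of one way to spot such rows: it computes each run's percent deviation from the median of the four runs and flags large gaps. The four rows are copied from the overview; the helper names and the 20% threshold are assumptions made for illustration and are not part of the Phoronix Test Suite.

```python
from statistics import median

# Values copied from the results overview (runs a, b, c, d); lower is better.
results = {
    "onednn: Recurrent Neural Network Training - CPU (ms)": [1063.25, 1065.95, 1068.48, 3079.07],
    "onednn: IP Shapes 3D - CPU (ms)": [6.26589, 9.69956, 9.83021, 9.98705],
    "xnnpack: FP16MobileNetV3Small (us)": [2270, 2151, 2074, 2455],
    "onednn: IP Shapes 1D - CPU (ms)": [1.11186, 1.10833, 1.10933, 1.10728],
}
RUNS = ["a", "b", "c", "d"]


def deviations_from_median(values):
    """Return each run's percent deviation from the median of all runs."""
    mid = median(values)
    return [100.0 * (v - mid) / mid for v in values]


def flag_outliers(table, threshold_pct=20.0):
    """List (test, run, deviation) where |deviation| exceeds the threshold.

    The 20% default is an arbitrary, hypothetical cut-off.
    """
    flagged = []
    for test, values in table.items():
        for run, dev in zip(RUNS, deviations_from_median(values)):
            if abs(dev) > threshold_pct:
                flagged.append((test, run, round(dev, 1)))
    return flagged


for test, run, dev in flag_outliers(results):
    print(f"{test}: run {run} deviates {dev:+.1f}% from the median")
```

With these four rows, the sketch flags run d on Recurrent Neural Network Training (much slower than the other runs) and run a on IP Shapes 3D (much faster than the other runs); the remaining rows stay within the threshold.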
oneDNN 3.6 - Harness: Recurrent Neural Network Training - Engine: CPU
ms, Fewer Is Better
a: 1063.25 (MIN: 1055.25)
b: 1065.95 (MIN: 1056.53)
c: 1068.48 (MIN: 1059.46)
d: 3079.07 (MIN: 1894.93)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6 - Harness: IP Shapes 3D - Engine: CPU
ms, Fewer Is Better
a: 6.26589 (MIN: 6.22)
b: 9.69956 (MIN: 9.64)
c: 9.83021 (MIN: 9.79)
d: 9.98705 (MIN: 9.95)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
XNNPACK b7b048 - Model: FP16MobileNetV3Small
us, Fewer Is Better
a: 2270
b: 2151
c: 2074
d: 2455
1. (CXX) g++ options: -O3 -lrt -lm
oneDNN 3.6 - Harness: Convolution Batch Shapes Auto - Engine: CPU
ms, Fewer Is Better
a: 5.41074 (MIN: 5.33)
b: 5.94199 (MIN: 5.86)
c: 5.92981 (MIN: 5.86)
d: 5.95880 (MIN: 5.88)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
LiteRT 2024-10-15 - Model: Mobilenet Quant
Microseconds, Fewer Is Better
a: 1405.64
b: 1299.31
c: 1300.82
d: 1291.57
XNNPACK b7b048 - Model: FP32MobileNetV3Small
us, Fewer Is Better
a: 2310
b: 2209
c: 2266
d: 2219
1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP32MobileNetV2
us, Fewer Is Better
a: 2695
b: 2678
c: 2788
d: 2692
1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP16MobileNetV2
us, Fewer Is Better
a: 2200
b: 2156
c: 2194
d: 2242
1. (CXX) g++ options: -O3 -lrt -lm
LiteRT 2024-10-15 - Model: Mobilenet Float
Microseconds, Fewer Is Better
a: 1820.83
b: 1890.63
c: 1830.85
d: 1827.86
oneDNN 3.6 - Harness: Deconvolution Batch shapes_1d - Engine: CPU
ms, Fewer Is Better
a: 8.92496 (MIN: 8.08)
b: 8.96791 (MIN: 8.24)
c: 8.68858 (MIN: 6.99)
d: 8.80404 (MIN: 8.13)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
XNNPACK b7b048 - Model: FP32MobileNetV3Large
us, Fewer Is Better
a: 3828
b: 3872
c: 3938
d: 3904
1. (CXX) g++ options: -O3 -lrt -lm
LiteRT 2024-10-15 - Model: DeepLab V3
Microseconds, Fewer Is Better
a: 3924.43
b: 3884.51
c: 3892.22
d: 3963.25
XNNPACK b7b048 - Model: QS8MobileNetV2
us, Fewer Is Better
a: 2086
b: 2058
c: 2060
d: 2097
1. (CXX) g++ options: -O3 -lrt -lm
oneDNN 3.6 - Harness: Recurrent Neural Network Inference - Engine: CPU
ms, Fewer Is Better
a: 544.59 (MIN: 540.62)
b: 545.19 (MIN: 541.34)
c: 546.14 (MIN: 542.08)
d: 551.72 (MIN: 538.72)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6 - Harness: Deconvolution Batch shapes_3d - Engine: CPU
ms, Fewer Is Better
a: 2.74673 (MIN: 2.68)
b: 2.71999 (MIN: 2.65)
c: 2.71566 (MIN: 2.66)
d: 2.74986 (MIN: 2.68)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
XNNPACK b7b048 - Model: FP16MobileNetV1
us, Fewer Is Better
a: 1614
b: 1606
c: 1616
d: 1597
1. (CXX) g++ options: -O3 -lrt -lm
LiteRT 2024-10-15 - Model: Inception V4
Microseconds, Fewer Is Better
a: 30460.8
b: 30192.8
c: 30102.7
d: 30176.1
LiteRT 2024-10-15 - Model: SqueezeNet
Microseconds, Fewer Is Better
a: 2506.99
b: 2507.88
c: 2533.05
d: 2524.93
LiteRT 2024-10-15 - Model: Quantized COCO SSD MobileNet v1
Microseconds, Fewer Is Better
a: 2356.92
b: 2344.66
c: 2356.07
d: 2366.82
XNNPACK b7b048 - Model: FP16MobileNetV3Large
us, Fewer Is Better
a: 3100
b: 3099
c: 3106
d: 3080
1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP32MobileNetV1
us, Fewer Is Better
a: 1929
b: 1920
c: 1925
d: 1935
1. (CXX) g++ options: -O3 -lrt -lm
LiteRT 2024-10-15 - Model: Inception ResNet V2
Microseconds, Fewer Is Better
a: 29125.0
b: 29065.3
c: 29257.6
d: 29288.1
LiteRT 2024-10-15 - Model: NASNet Mobile
Microseconds, Fewer Is Better
a: 21130.2
b: 21246.2
c: 21105.6
d: 21148.3
oneDNN 3.6 - Harness: IP Shapes 1D - Engine: CPU
ms, Fewer Is Better
a: 1.11186 (MIN: 1.08)
b: 1.10833 (MIN: 1.07)
c: 1.10933 (MIN: 1.08)
d: 1.10728 (MIN: 1.07)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
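To condense the seven oneDNN results above into one figure per run, one option is a geometric mean of each run's times relative to run a. The Python sketch below does that with the values copied from the oneDNN results. Normalising to run a and using a geometric mean are choices made here for illustration; this export does not report such a summary, and the function name is hypothetical.

```python
from statistics import geometric_mean

# oneDNN 3.6 results in ms (lower is better), copied from the results above.
# Column order is run a, b, c, d.
onednn_ms = {
    "Recurrent Neural Network Training": [1063.25, 1065.95, 1068.48, 3079.07],
    "IP Shapes 3D": [6.26589, 9.69956, 9.83021, 9.98705],
    "Convolution Batch Shapes Auto": [5.41074, 5.94199, 5.92981, 5.95880],
    "Deconvolution Batch shapes_1d": [8.92496, 8.96791, 8.68858, 8.80404],
    "Recurrent Neural Network Inference": [544.59, 545.19, 546.14, 551.72],
    "Deconvolution Batch shapes_3d": [2.74673, 2.71999, 2.71566, 2.74986],
    "IP Shapes 1D": [1.11186, 1.10833, 1.10933, 1.10728],
}
RUNS = ["a", "b", "c", "d"]


def relative_geomean(table, baseline=0):
    """Geometric mean of each run's times divided by the baseline run's times."""
    summary = {}
    for i, run in enumerate(RUNS):
        ratios = [values[i] / values[baseline] for values in table.values()]
        summary[run] = geometric_mean(ratios)
    return summary


summary = relative_geomean(onednn_ms)
for run, score in summary.items():
    print(f"run {run}: {score:.3f}x the time of run a")
```

A value above 1 means the run was slower than run a on balance. Because run a is unusually fast on IP Shapes 3D and run d unusually slow on Recurrent Neural Network Training, those two rows dominate the summary, so it is worth reading alongside the individual results rather than in place of them.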
Phoronix Test Suite v10.8.5