new rn
AMD Ryzen Threadripper 3970X 32-Core testing with an ASUS ROG ZENITH II EXTREME (1802 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2410152-NE-NEWRN099440&sor&grr .
new rn: system details (identical for runs a, b, c, d)

Processor: AMD Ryzen Threadripper 3970X 32-Core @ 3.70GHz (32 Cores / 64 Threads)
Motherboard: ASUS ROG ZENITH II EXTREME (1802 BIOS)
Chipset: AMD Starship/Matisse
Memory: 4 x 16GB DDR4-3600MT/s Corsair CMT64GX4M4Z3600C16
Disk: Samsung SSD 980 PRO 500GB
Graphics: AMD Radeon RX 5700 8GB
Audio: AMD Navi 10 HDMI Audio
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 22.04
Kernel: 6.8.0-45-generic (x86_64)
Desktop: GNOME Shell 42.9
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.0.1 (LLVM 13.0.1 DRM 3.57)
Vulkan: 1.2.204
Compiler: GCC 11.4.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0x830107a

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
- spec_rstack_overflow: Mitigation of Safe RET
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
new rn: result overview

Test                                              a         b         c         d
xnnpack: QS8MobileNetV2                           2086      2058      2060      2097
xnnpack: FP16MobileNetV3Small                     2270      2151      2074      2455
xnnpack: FP16MobileNetV3Large                     3100      3099      3106      3080
xnnpack: FP16MobileNetV2                          2200      2156      2194      2242
xnnpack: FP16MobileNetV1                          1614      1606      1616      1597
xnnpack: FP32MobileNetV3Small                     2310      2209      2266      2219
xnnpack: FP32MobileNetV3Large                     3828      3872      3938      3904
xnnpack: FP32MobileNetV2                          2695      2678      2788      2692
xnnpack: FP32MobileNetV1                          1929      1920      1925      1935
onednn: Recurrent Neural Network Training - CPU   1063.25   1065.95   1068.48   3079.07
onednn: Recurrent Neural Network Inference - CPU  544.588   545.194   546.139   551.716
litert: Inception V4                              30460.8   30192.8   30102.7   30176.1
litert: Inception ResNet V2                       29125     29065.3   29257.6   29288.1
litert: NASNet Mobile                             21130.2   21246.2   21105.6   21148.3
litert: Mobilenet Float                           1820.83   1890.63   1830.85   1827.86
litert: DeepLab V3                                3924.43   3884.51   3892.22   3963.25
litert: SqueezeNet                                2506.99   2507.88   2533.05   2524.93
litert: Quantized COCO SSD MobileNet v1           2356.92   2344.66   2356.07   2366.82
litert: Mobilenet Quant                           1405.64   1299.31   1300.82   1291.57
onednn: Deconvolution Batch shapes_1d - CPU       8.92496   8.96791   8.68858   8.80404
onednn: IP Shapes 1D - CPU                        1.11186   1.10833   1.10933   1.10728
onednn: IP Shapes 3D - CPU                        6.26589   9.69956   9.83021   9.98705
onednn: Convolution Batch Shapes Auto - CPU       5.41074   5.94199   5.92981   5.9588
onednn: Deconvolution Batch shapes_3d - CPU       2.74673   2.71999   2.71566   2.74986
XNNPACK b7b048 (us, Fewer Is Better; results listed best to worst)
Compiler notes for all XNNPACK tests: (CXX) g++ options: -O3 -lrt -lm

XNNPACK - Model: QS8MobileNetV2
b: 2058, c: 2060, a: 2086, d: 2097

XNNPACK - Model: FP16MobileNetV3Small
c: 2074, b: 2151, a: 2270, d: 2455

XNNPACK - Model: FP16MobileNetV3Large
d: 3080, b: 3099, a: 3100, c: 3106

XNNPACK - Model: FP16MobileNetV2
b: 2156, c: 2194, a: 2200, d: 2242

XNNPACK - Model: FP16MobileNetV1
d: 1597, b: 1606, a: 1614, c: 1616

XNNPACK - Model: FP32MobileNetV3Small
b: 2209, d: 2219, c: 2266, a: 2310

XNNPACK - Model: FP32MobileNetV3Large
a: 3828, b: 3872, d: 3904, c: 3938

XNNPACK - Model: FP32MobileNetV2
b: 2678, d: 2692, a: 2695, c: 2788

XNNPACK - Model: FP32MobileNetV1
b: 1920, c: 1925, a: 1929, d: 1935
oneDNN 3.6 (ms, Fewer Is Better; results listed best to worst)
Compiler notes: (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN - Harness: Recurrent Neural Network Training - Engine: CPU
a: 1063.25 (MIN: 1055.25), b: 1065.95 (MIN: 1056.53), c: 1068.48 (MIN: 1059.46), d: 3079.07 (MIN: 1894.93)

oneDNN - Harness: Recurrent Neural Network Inference - Engine: CPU
a: 544.59 (MIN: 540.62), b: 545.19 (MIN: 541.34), c: 546.14 (MIN: 542.08), d: 551.72 (MIN: 538.72)
LiteRT 2024-10-15 (Microseconds, Fewer Is Better; results listed best to worst)

LiteRT - Model: Inception V4
c: 30102.7, d: 30176.1, b: 30192.8, a: 30460.8

LiteRT - Model: Inception ResNet V2
b: 29065.3, a: 29125.0, c: 29257.6, d: 29288.1

LiteRT - Model: NASNet Mobile
c: 21105.6, a: 21130.2, d: 21148.3, b: 21246.2

LiteRT - Model: Mobilenet Float
a: 1820.83, d: 1827.86, c: 1830.85, b: 1890.63

LiteRT - Model: DeepLab V3
b: 3884.51, c: 3892.22, a: 3924.43, d: 3963.25

LiteRT - Model: SqueezeNet
a: 2506.99, b: 2507.88, d: 2524.93, c: 2533.05

LiteRT - Model: Quantized COCO SSD MobileNet v1
b: 2344.66, c: 2356.07, a: 2356.92, d: 2366.82

LiteRT - Model: Mobilenet Quant
d: 1291.57, b: 1299.31, c: 1300.82, a: 1405.64
oneDNN 3.6 (ms, Fewer Is Better; results listed best to worst)
Compiler notes: (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN - Harness: Deconvolution Batch shapes_1d - Engine: CPU
c: 8.68858 (MIN: 6.99), d: 8.80404 (MIN: 8.13), a: 8.92496 (MIN: 8.08), b: 8.96791 (MIN: 8.24)

oneDNN - Harness: IP Shapes 1D - Engine: CPU
d: 1.10728 (MIN: 1.07), b: 1.10833 (MIN: 1.07), c: 1.10933 (MIN: 1.08), a: 1.11186 (MIN: 1.08)

oneDNN - Harness: IP Shapes 3D - Engine: CPU
a: 6.26589 (MIN: 6.22), b: 9.69956 (MIN: 9.64), c: 9.83021 (MIN: 9.79), d: 9.98705 (MIN: 9.95)

oneDNN - Harness: Convolution Batch Shapes Auto - Engine: CPU
a: 5.41074 (MIN: 5.33), c: 5.92981 (MIN: 5.86), b: 5.94199 (MIN: 5.86), d: 5.95880 (MIN: 5.88)

oneDNN - Harness: Deconvolution Batch shapes_3d - Engine: CPU
c: 2.71566 (MIN: 2.66), b: 2.71999 (MIN: 2.65), a: 2.74673 (MIN: 2.68), d: 2.74986 (MIN: 2.68)
Phoronix Test Suite v10.8.5