new rn tr

AMD Ryzen Threadripper 7980X 64-Cores testing with a System76 Thelio Major (FA Z5 BIOS) and AMD Radeon RX 6700 XT 12GB on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2410161-PTS-NEWRNTR258&grr .
Runs: a, b, c, d

Processor:          AMD Ryzen Threadripper 7980X 64-Cores @ 5.37GHz (64 Cores / 128 Threads)
Motherboard:        System76 Thelio Major (FA Z5 BIOS)
Chipset:            AMD Device 14a4
Memory:             4 x 32GB DDR5-4800MT/s Micron MTC20F1045S1RC48BA2
Disk:               1000GB CT1000T700SSD5
Graphics:           AMD Radeon RX 6700 XT 12GB
Audio:              AMD Device 14cc
Monitor:            DELL P2415Q
Network:            Aquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Intel Wi-Fi 6E
OS:                 Ubuntu 24.10
Kernel:             6.11.0-8-generic (x86_64)
Desktop:            GNOME Shell 47.0
Display Server:     X Server + Wayland
OpenGL:             4.6 Mesa 24.2.3-1ubuntu1 (LLVM 19.1.0 DRM 3.58)
Compiler:           GCC 14.2.0
File-System:        ext4
Screen Resolution:  1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance)
- CPU Microcode: 0xa108105

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Mitigation of Safe RET
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
Test                                              a         b         c         d
xnnpack: QS8MobileNetV2                           4031      3988      4025      3985
xnnpack: FP16MobileNetV3Small                     4164      4301      4255      4294
xnnpack: FP16MobileNetV3Large                     5624      5748      5827      5806
xnnpack: FP16MobileNetV2                          3743      3804      3825      3762
xnnpack: FP16MobileNetV1                          2031      2052      2070      2052
xnnpack: FP32MobileNetV3Small                     4238      4271      4269      4285
xnnpack: FP32MobileNetV3Large                     5962      6005      6047      5932
xnnpack: FP32MobileNetV2                          4032      4053      4055      4007
xnnpack: FP32MobileNetV1                          2119      2156      2156      2159
litert: Quantized COCO SSD MobileNet v1           7924.86   7935.03   7672.13   7531.43
litert: Mobilenet Quant                           15789.8   16068.8   16501.2   15994.0
litert: Inception ResNet V2                       34923.9   33651.9   34821.7   34742.1
litert: DeepLab V3                                13513.8   12117.3   11896.5   12363.4
litert: Inception V4                              26612.8   26574.6   26433.2   26390.1
litert: NASNet Mobile                             173455    156703    142541    151275
onednn: Recurrent Neural Network Training - CPU   546.398   548.732   548.888   550.540
onednn: Recurrent Neural Network Inference - CPU  326.236   326.788   325.448   325.778
litert: SqueezeNet                                3266.42   3303.96   3264.06   3282.50
litert: Mobilenet Float                           2102.99   2141.96   2112.35   2132.45
onednn: Deconvolution Batch shapes_1d - CPU       7.46041   7.42610   7.50405   7.48971
onednn: IP Shapes 1D - CPU                        0.582029  0.573270  0.577303  0.572739
onednn: IP Shapes 3D - CPU                        0.335141  0.337890  0.333022  0.337384
onednn: Convolution Batch Shapes Auto - CPU       0.557676  0.554351  0.560390  0.554405
onednn: Deconvolution Batch shapes_3d - CPU       1.02976   1.02672   1.02743   1.02498
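All results in the overview above are times where fewer is better, so one rough way to read the four runs is to look at which run was fastest per test and how far the slowest run sits above it. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite output: the `results` dictionary is hand-transcribed from a few rows of the overview, and the helper names `spread_percent` and `fastest_run` are hypothetical.

```python
# Run times hand-transcribed from the results overview (fewer is better).
# Only a few rows are included here for illustration.
results = {
    "xnnpack: QS8MobileNetV2": {"a": 4031, "b": 3988, "c": 4025, "d": 3985},
    "litert: DeepLab V3": {"a": 13513.8, "b": 12117.3, "c": 11896.5, "d": 12363.4},
    "litert: NASNet Mobile": {"a": 173455, "b": 156703, "c": 142541, "d": 151275},
    "onednn: IP Shapes 1D - CPU": {"a": 0.582029, "b": 0.573270, "c": 0.577303, "d": 0.572739},
}


def fastest_run(runs):
    """Return the identifier of the run with the lowest time."""
    return min(runs, key=runs.get)


def spread_percent(runs):
    """Return how much slower the slowest run is than the fastest, in percent."""
    fastest = min(runs.values())
    slowest = max(runs.values())
    return (slowest - fastest) / fastest * 100.0


if __name__ == "__main__":
    for name, runs in results.items():
        print(f"{name}: fastest run = {fastest_run(runs)}, "
              f"spread = {spread_percent(runs):.2f}%")
```

A spread computed this way ignores the reported standard errors, so a small spread may be within run-to-run noise.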
XNNPACK b7b048, Model: QS8MobileNetV2 (us, Fewer Is Better)
  a: 4031   b: 3988   c: 4025   d: 3985
  SE +/- 32.71, N = 3; SE +/- 32.13, N = 3; SE +/- 27.10, N = 3
  (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP16MobileNetV3Small (us, Fewer Is Better)
  a: 4164   b: 4301   c: 4255   d: 4294
  SE +/- 51.08, N = 3; SE +/- 19.50, N = 3; SE +/- 72.75, N = 3
  (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP16MobileNetV3Large (us, Fewer Is Better)
  a: 5624   b: 5748   c: 5827   d: 5806
  SE +/- 67.87, N = 3; SE +/- 63.97, N = 3; SE +/- 39.03, N = 3
  (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP16MobileNetV2 (us, Fewer Is Better)
  a: 3743   b: 3804   c: 3825   d: 3762
  SE +/- 37.65, N = 3; SE +/- 48.96, N = 3; SE +/- 23.68, N = 3
  (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP16MobileNetV1 (us, Fewer Is Better)
  a: 2031   b: 2052   c: 2070   d: 2052
  SE +/- 11.84, N = 3; SE +/- 19.34, N = 3; SE +/- 17.89, N = 3
  (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP32MobileNetV3Small (us, Fewer Is Better)
  a: 4238   b: 4271   c: 4269   d: 4285
  SE +/- 28.17, N = 3; SE +/- 20.51, N = 3; SE +/- 34.96, N = 3
  (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP32MobileNetV3Large (us, Fewer Is Better)
  a: 5962   b: 6005   c: 6047   d: 5932
  SE +/- 43.59, N = 3; SE +/- 3.71, N = 3; SE +/- 38.74, N = 3
  (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP32MobileNetV2 (us, Fewer Is Better)
  a: 4032   b: 4053   c: 4055   d: 4007
  SE +/- 40.70, N = 3; SE +/- 22.36, N = 3; SE +/- 4.98, N = 3
  (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP32MobileNetV1 (us, Fewer Is Better)
  a: 2119   b: 2156   c: 2156   d: 2159
  SE +/- 28.29, N = 3; SE +/- 15.37, N = 3; SE +/- 9.40, N = 3
  (CXX) g++ options: -O3 -lrt -lm
LiteRT 2024-10-15, Model: Quantized COCO SSD MobileNet v1 (Microseconds, Fewer Is Better)
  a: 7924.86   b: 7935.03   c: 7672.13   d: 7531.43
  SE +/- 199.12, N = 12; SE +/- 83.25, N = 15; SE +/- 184.40, N = 12
LiteRT 2024-10-15, Model: Mobilenet Quant (Microseconds, Fewer Is Better)
  a: 15789.8   b: 16068.8   c: 16501.2   d: 15994.0
  SE +/- 321.23, N = 12; SE +/- 268.08, N = 13; SE +/- 187.77, N = 3
LiteRT 2024-10-15, Model: Inception ResNet V2 (Microseconds, Fewer Is Better)
  a: 34923.9   b: 33651.9   c: 34821.7   d: 34742.1
  SE +/- 242.28, N = 3; SE +/- 330.78, N = 15; SE +/- 375.95, N = 3
LiteRT 2024-10-15, Model: DeepLab V3 (Microseconds, Fewer Is Better)
  a: 13513.8   b: 12117.3   c: 11896.5   d: 12363.4
  SE +/- 244.20, N = 15; SE +/- 118.42, N = 3; SE +/- 50.22, N = 3
LiteRT 2024-10-15, Model: Inception V4 (Microseconds, Fewer Is Better)
  a: 26612.8   b: 26574.6   c: 26433.2   d: 26390.1
  SE +/- 227.17, N = 14; SE +/- 209.28, N = 3; SE +/- 113.50, N = 3
LiteRT 2024-10-15, Model: NASNet Mobile (Microseconds, Fewer Is Better)
  a: 173455   b: 156703   c: 142541   d: 151275
  SE +/- 1880.66, N = 3; SE +/- 1575.26, N = 3; SE +/- 2554.81, N = 12
oneDNN 3.6, Harness: Recurrent Neural Network Training - Engine: CPU (ms, Fewer Is Better)
  a: 546.40 (MIN: 540.95)   b: 548.73 (MIN: 542.36)   c: 548.89 (MIN: 542.07)   d: 550.54 (MIN: 542.26)
  SE +/- 0.35, N = 3; SE +/- 0.64, N = 3; SE +/- 1.00, N = 3
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6, Harness: Recurrent Neural Network Inference - Engine: CPU (ms, Fewer Is Better)
  a: 326.24 (MIN: 322.1)   b: 326.79 (MIN: 319.81)   c: 325.45 (MIN: 321.39)   d: 325.78 (MIN: 320.56)
  SE +/- 1.83, N = 3; SE +/- 0.28, N = 3; SE +/- 0.91, N = 3
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
LiteRT 2024-10-15, Model: SqueezeNet (Microseconds, Fewer Is Better)
  a: 3266.42   b: 3303.96   c: 3264.06   d: 3282.50
  SE +/- 24.67, N = 3; SE +/- 34.80, N = 4; SE +/- 2.50, N = 3
LiteRT 2024-10-15, Model: Mobilenet Float (Microseconds, Fewer Is Better)
  a: 2102.99   b: 2141.96   c: 2112.35   d: 2132.45
  SE +/- 20.11, N = 3; SE +/- 12.05, N = 3; SE +/- 14.67, N = 3
oneDNN 3.6, Harness: Deconvolution Batch shapes_1d - Engine: CPU (ms, Fewer Is Better)
  a: 7.46041 (MIN: 6.52)   b: 7.42610 (MIN: 6.55)   c: 7.50405 (MIN: 4.63)   d: 7.48971 (MIN: 4.6)
  SE +/- 0.00222, N = 3; SE +/- 0.04488, N = 3; SE +/- 0.07017, N = 3
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6, Harness: IP Shapes 1D - Engine: CPU (ms, Fewer Is Better)
  a: 0.582029 (MIN: 0.55)   b: 0.573270 (MIN: 0.54)   c: 0.577303 (MIN: 0.54)   d: 0.572739 (MIN: 0.54)
  SE +/- 0.003697, N = 3; SE +/- 0.000734, N = 3; SE +/- 0.004087, N = 3
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6, Harness: IP Shapes 3D - Engine: CPU (ms, Fewer Is Better)
  a: 0.335141 (MIN: 0.31)   b: 0.337890 (MIN: 0.31)   c: 0.333022 (MIN: 0.31)   d: 0.337384 (MIN: 0.31)
  SE +/- 0.001917, N = 3; SE +/- 0.002114, N = 3; SE +/- 0.001618, N = 3
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6, Harness: Convolution Batch Shapes Auto - Engine: CPU (ms, Fewer Is Better)
  a: 0.557676 (MIN: 0.52)   b: 0.554351 (MIN: 0.51)   c: 0.560390 (MIN: 0.52)   d: 0.554405 (MIN: 0.52)
  SE +/- 0.004992, N = 3; SE +/- 0.000955, N = 3; SE +/- 0.003803, N = 3
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6, Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better)
  a: 1.02976 (MIN: 0.96)   b: 1.02672 (MIN: 0.97)   c: 1.02743 (MIN: 0.96)   d: 1.02498 (MIN: 0.96)
  SE +/- 0.00042, N = 3; SE +/- 0.00238, N = 3; SE +/- 0.00347, N = 3
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Phoronix Test Suite v10.8.5