new rn tr

AMD Ryzen Threadripper 7980X 64-Cores testing with a System76 Thelio Major (FA Z5 BIOS) and AMD Radeon RX 6700 XT 12GB on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2410161-PTS-NEWRNTR258&grs .

System configuration (identical for runs a, b, c, d)

Processor: AMD Ryzen Threadripper 7980X 64-Cores @ 5.37GHz (64 Cores / 128 Threads)
Motherboard: System76 Thelio Major (FA Z5 BIOS)
Chipset: AMD Device 14a4
Memory: 4 x 32GB DDR5-4800MT/s Micron MTC20F1045S1RC48BA2
Disk: 1000GB CT1000T700SSD5
Graphics: AMD Radeon RX 6700 XT 12GB
Audio: AMD Device 14cc
Monitor: DELL P2415Q
Network: Aquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Intel Wi-Fi 6E
OS: Ubuntu 24.10
Kernel: 6.11.0-8-generic (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2.3-1ubuntu1 (LLVM 19.1.0 DRM 3.58)
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details - Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance) - CPU Microcode: 0xa108105

Security Details
gather_data_sampling: Not affected
itlb_multihit: Not affected
l1tf: Not affected
mds: Not affected
meltdown: Not affected
mmio_stale_data: Not affected
reg_file_data_sampling: Not affected
retbleed: Not affected
spec_rstack_overflow: Mitigation of Safe RET
spec_store_bypass: Mitigation of SSB disabled via prctl
spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
srbds: Not affected
tsx_async_abort: Not affected

Result overview

Test                                              a         b         c         d
litert: NASNet Mobile                             173455    156703    142541    151275
litert: Inception ResNet V2                       34923.9   33651.9   34821.7   34742.1
xnnpack: FP16MobileNetV3Large                     5624      5748      5827      5806
xnnpack: FP16MobileNetV3Small                     4164      4301      4255      4294
xnnpack: FP16MobileNetV2                          3743      3804      3825      3762
xnnpack: FP32MobileNetV3Large                     5962      6005      6047      5932
xnnpack: FP16MobileNetV1                          2031      2052      2070      2052
xnnpack: FP32MobileNetV1                          2119      2156      2156      2159
litert: Mobilenet Float                           2102.99   2141.96   2112.35   2132.45
onednn: IP Shapes 1D - CPU                        0.582029  0.573270  0.577303  0.572739
onednn: IP Shapes 3D - CPU                        0.335141  0.337890  0.333022  0.337384
litert: SqueezeNet                                3266.42   3303.96   3264.06   3282.50
xnnpack: FP32MobileNetV2                          4032      4053      4055      4007
xnnpack: QS8MobileNetV2                           4031      3988      4025      3985
xnnpack: FP32MobileNetV3Small                     4238      4271      4269      4285
onednn: Convolution Batch Shapes Auto - CPU       0.557676  0.554351  0.560390  0.554405
onednn: Deconvolution Batch shapes_1d - CPU       7.46041   7.42610   7.50405   7.48971
litert: Inception V4                              26612.8   26574.6   26433.2   26390.1
onednn: Recurrent Neural Network Training - CPU   546.398   548.732   548.888   550.540
onednn: Deconvolution Batch shapes_3d - CPU       1.02976   1.02672   1.02743   1.02498
onednn: Recurrent Neural Network Inference - CPU  326.236   326.788   325.448   325.778
litert: Quantized COCO SSD MobileNet v1           7924.86   7935.03   7672.13   7531.43
litert: Mobilenet Quant                           15789.8   16068.8   16501.2   15994.0
litert: DeepLab V3                                13513.8   12117.3   11896.5   12363.4
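
Since every test in this result file is reported as "Fewer Is Better", one simple way to read the overview is to take the gap between the fastest and slowest run as a percentage of the fastest. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite; the helper name `spread_percent` is hypothetical, and the two sample rows are copied from the LiteRT NASNet Mobile and DeepLab V3 results above.

```python
def spread_percent(results):
    """Return (best_run, worst_run, gap_percent) for a lower-is-better metric.

    `results` maps a run identifier to its reported value. The gap is the
    difference between the slowest and fastest run, relative to the fastest.
    """
    best = min(results, key=results.get)    # lowest value = fastest run
    worst = max(results, key=results.get)   # highest value = slowest run
    gap = (results[worst] - results[best]) / results[best] * 100.0
    return best, worst, gap


# Values copied from the overview (microseconds, fewer is better).
nasnet = {"a": 173455, "b": 156703, "c": 142541, "d": 151275}
deeplab = {"a": 13513.8, "b": 12117.3, "c": 11896.5, "d": 12363.4}

for name, rows in (("NASNet Mobile", nasnet), ("DeepLab V3", deeplab)):
    best, worst, gap = spread_percent(rows)
    print(f"{name}: fastest={best}, slowest={worst}, spread={gap:.1f}%")
```

A spread of this kind only describes run-to-run variation on the same hardware; it should be read alongside the standard errors listed for each test rather than as a ranking of configurations.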

LiteRT 2024-10-15 - Model: NASNet Mobile
Microseconds, Fewer Is Better
a: 173455
b: 156703
c: 142541
d: 151275
SE +/- 1880.66, N = 3; SE +/- 1575.26, N = 3; SE +/- 2554.81, N = 12

LiteRT 2024-10-15 - Model: Inception ResNet V2
Microseconds, Fewer Is Better
a: 34923.9
b: 33651.9
c: 34821.7
d: 34742.1
SE +/- 242.28, N = 3; SE +/- 330.78, N = 15; SE +/- 375.95, N = 3

XNNPACK b7b048 - Model: FP16MobileNetV3Large
us, Fewer Is Better
a: 5624
b: 5748
c: 5827
d: 5806
SE +/- 67.87, N = 3; SE +/- 63.97, N = 3; SE +/- 39.03, N = 3
(CXX) g++ options: -O3 -lrt -lm

XNNPACK b7b048 - Model: FP16MobileNetV3Small
us, Fewer Is Better
a: 4164
b: 4301
c: 4255
d: 4294
SE +/- 51.08, N = 3; SE +/- 19.50, N = 3; SE +/- 72.75, N = 3
(CXX) g++ options: -O3 -lrt -lm

XNNPACK b7b048 - Model: FP16MobileNetV2
us, Fewer Is Better
a: 3743
b: 3804
c: 3825
d: 3762
SE +/- 37.65, N = 3; SE +/- 48.96, N = 3; SE +/- 23.68, N = 3
(CXX) g++ options: -O3 -lrt -lm

XNNPACK b7b048 - Model: FP32MobileNetV3Large
us, Fewer Is Better
a: 5962
b: 6005
c: 6047
d: 5932
SE +/- 43.59, N = 3; SE +/- 3.71, N = 3; SE +/- 38.74, N = 3
(CXX) g++ options: -O3 -lrt -lm

XNNPACK b7b048 - Model: FP16MobileNetV1
us, Fewer Is Better
a: 2031
b: 2052
c: 2070
d: 2052
SE +/- 11.84, N = 3; SE +/- 19.34, N = 3; SE +/- 17.89, N = 3
(CXX) g++ options: -O3 -lrt -lm

XNNPACK b7b048 - Model: FP32MobileNetV1
us, Fewer Is Better
a: 2119
b: 2156
c: 2156
d: 2159
SE +/- 28.29, N = 3; SE +/- 15.37, N = 3; SE +/- 9.40, N = 3
(CXX) g++ options: -O3 -lrt -lm

LiteRT 2024-10-15 - Model: Mobilenet Float
Microseconds, Fewer Is Better
a: 2102.99
b: 2141.96
c: 2112.35
d: 2132.45
SE +/- 20.11, N = 3; SE +/- 12.05, N = 3; SE +/- 14.67, N = 3

oneDNN 3.6 - Harness: IP Shapes 1D - Engine: CPU
ms, Fewer Is Better
a: 0.582029 (MIN: 0.55)
b: 0.573270 (MIN: 0.54)
c: 0.577303 (MIN: 0.54)
d: 0.572739 (MIN: 0.54)
SE +/- 0.003697, N = 3; SE +/- 0.000734, N = 3; SE +/- 0.004087, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN 3.6 - Harness: IP Shapes 3D - Engine: CPU
ms, Fewer Is Better
a: 0.335141 (MIN: 0.31)
b: 0.337890 (MIN: 0.31)
c: 0.333022 (MIN: 0.31)
d: 0.337384 (MIN: 0.31)
SE +/- 0.001917, N = 3; SE +/- 0.002114, N = 3; SE +/- 0.001618, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

LiteRT 2024-10-15 - Model: SqueezeNet
Microseconds, Fewer Is Better
a: 3266.42
b: 3303.96
c: 3264.06
d: 3282.50
SE +/- 24.67, N = 3; SE +/- 34.80, N = 4; SE +/- 2.50, N = 3

XNNPACK b7b048 - Model: FP32MobileNetV2
us, Fewer Is Better
a: 4032
b: 4053
c: 4055
d: 4007
SE +/- 40.70, N = 3; SE +/- 22.36, N = 3; SE +/- 4.98, N = 3
(CXX) g++ options: -O3 -lrt -lm

XNNPACK b7b048 - Model: QS8MobileNetV2
us, Fewer Is Better
a: 4031
b: 3988
c: 4025
d: 3985
SE +/- 32.71, N = 3; SE +/- 32.13, N = 3; SE +/- 27.10, N = 3
(CXX) g++ options: -O3 -lrt -lm

XNNPACK b7b048 - Model: FP32MobileNetV3Small
us, Fewer Is Better
a: 4238
b: 4271
c: 4269
d: 4285
SE +/- 28.17, N = 3; SE +/- 20.51, N = 3; SE +/- 34.96, N = 3
(CXX) g++ options: -O3 -lrt -lm

oneDNN 3.6 - Harness: Convolution Batch Shapes Auto - Engine: CPU
ms, Fewer Is Better
a: 0.557676 (MIN: 0.52)
b: 0.554351 (MIN: 0.51)
c: 0.560390 (MIN: 0.52)
d: 0.554405 (MIN: 0.52)
SE +/- 0.004992, N = 3; SE +/- 0.000955, N = 3; SE +/- 0.003803, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN 3.6 - Harness: Deconvolution Batch shapes_1d - Engine: CPU
ms, Fewer Is Better
a: 7.46041 (MIN: 6.52)
b: 7.42610 (MIN: 6.55)
c: 7.50405 (MIN: 4.63)
d: 7.48971 (MIN: 4.6)
SE +/- 0.00222, N = 3; SE +/- 0.04488, N = 3; SE +/- 0.07017, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

LiteRT 2024-10-15 - Model: Inception V4
Microseconds, Fewer Is Better
a: 26612.8
b: 26574.6
c: 26433.2
d: 26390.1
SE +/- 227.17, N = 14; SE +/- 209.28, N = 3; SE +/- 113.50, N = 3

oneDNN 3.6 - Harness: Recurrent Neural Network Training - Engine: CPU
ms, Fewer Is Better
a: 546.40 (MIN: 540.95)
b: 548.73 (MIN: 542.36)
c: 548.89 (MIN: 542.07)
d: 550.54 (MIN: 542.26)
SE +/- 0.35, N = 3; SE +/- 0.64, N = 3; SE +/- 1.00, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN 3.6 - Harness: Deconvolution Batch shapes_3d - Engine: CPU
ms, Fewer Is Better
a: 1.02976 (MIN: 0.96)
b: 1.02672 (MIN: 0.97)
c: 1.02743 (MIN: 0.96)
d: 1.02498 (MIN: 0.96)
SE +/- 0.00042, N = 3; SE +/- 0.00238, N = 3; SE +/- 0.00347, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN 3.6 - Harness: Recurrent Neural Network Inference - Engine: CPU
ms, Fewer Is Better
a: 326.24 (MIN: 322.1)
b: 326.79 (MIN: 319.81)
c: 325.45 (MIN: 321.39)
d: 325.78 (MIN: 320.56)
SE +/- 1.83, N = 3; SE +/- 0.28, N = 3; SE +/- 0.91, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

LiteRT 2024-10-15 - Model: Quantized COCO SSD MobileNet v1
Microseconds, Fewer Is Better
a: 7924.86
b: 7935.03
c: 7672.13
d: 7531.43
SE +/- 199.12, N = 12; SE +/- 83.25, N = 15; SE +/- 184.40, N = 12

LiteRT 2024-10-15 - Model: Mobilenet Quant
Microseconds, Fewer Is Better
a: 15789.8
b: 16068.8
c: 16501.2
d: 15994.0
SE +/- 321.23, N = 12; SE +/- 268.08, N = 13; SE +/- 187.77, N = 3

LiteRT 2024-10-15 - Model: DeepLab V3
Microseconds, Fewer Is Better
a: 13513.8
b: 12117.3
c: 11896.5
d: 12363.4
SE +/- 244.20, N = 15; SE +/- 118.42, N = 3; SE +/- 50.22, N = 3

Phoronix Test Suite v10.8.5