gnr-new

2 x Intel Xeon 6980P testing with a Intel AvenueCity v0.01 (BHSDCRB1.IPC.0035.D44.2408292336 BIOS) and ASPEED on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2410219-NE-GNRNEW81583&grr.

gnr-new ProcessorMotherboardChipsetMemoryDiskGraphicsNetworkOSKernelCompilerFile-SystemScreen ResolutionabcdefghIntel Xeon 6980P @ 3.90GHz (128 Cores / 256 Threads)Intel AvenueCity v0.01 (BHSDCRB1.IPC.0035.D44.2408292336 BIOS)Intel Ice Lake IEH768GB2 x 1920GB KIOXIA KCD8XPUG1T92 + 960GB SAMSUNG MZ1L2960HCJR-00A07ASPEEDIntel I210 + 2 x Intel 10-Gigabit X540-AT2Ubuntu 24.046.10.0-phx (x86_64)GCC 13.2.0ext41920x12002 x Intel Xeon 6980P @ 3.90GHz (256 Cores / 512 Threads)1520GBOpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x10002f0Python Details- Python 3.12.2Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: BHI_DIS_S + srbds: Not affected + tsx_async_abort: Not affected

gnr-new xnnpack: QS8MobileNetV2xnnpack: FP16MobileNetV3Smallxnnpack: FP16MobileNetV3Largexnnpack: FP16MobileNetV2xnnpack: FP16MobileNetV1xnnpack: FP32MobileNetV3Smallxnnpack: FP32MobileNetV3Largexnnpack: FP32MobileNetV2xnnpack: FP32MobileNetV1cp2k: H20-256cp2k: Fayalite-FISTonednn: Recurrent Neural Network Training - CPUonednn: Recurrent Neural Network Inference - CPUlitert: NASNet Mobilelitert: Inception ResNet V2litert: Inception V4litert: DeepLab V3litert: Mobilenet Floatlitert: Mobilenet Quantlitert: SqueezeNetlitert: Quantized COCO SSD MobileNet v1cp2k: H20-64onednn: Deconvolution Batch shapes_1d - CPUonednn: IP Shapes 1D - CPUonednn: IP Shapes 3D - CPUonednn: Convolution Batch Shapes Auto - CPUonednn: Deconvolution Batch shapes_3d - CPUabcdefgh1601816450205641305565721649721106141007326107.60975.855922.329541.5718742497036.873575.545569.57701.3623350.514175.217895.112.18177.81820.5338660.9687820.3079931.1224216209147931243501300261161723022355142616677109.30276.241940.142554.00922770010857981248.734737.68300.114592.413568.915577.712.52779.14890.5884350.9574750.2917661.109211660216425202361364360611624120904145496934109.89476.13946.678554.40329128516770878595.336867.68544.320879.413116.815569.912.5380.15110.5782340.9457050.297141.12372480430157277290035545483472818550183616627362667378.61788.0951300.17907.42118220091820736511612018066587.955675.653573.692404.114.026176.3731.362191.454560.51981.0818479.2781.10379.04681.037OpenBenchmarking.org

XNNPACK

Model: QS8MobileNetV2

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: QS8MobileNetV2abcd100K200K300K400K500K1601816209166024804301. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV3Small

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV3Smallabcd30K60K90K120K150K16450147931164251572771. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV3Large

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV3Largeabcd60K120K180K240K300K2056424350202362900351. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV2

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV2abcd12K24K36K48K60K130551300213643545481. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV1

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV1abcd7K14K21K28K35K657261166061347281. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV3Small

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV3Smallabcd40K80K120K160K200K1649717230162411855011. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV3Large

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV3Largeabcd20K40K60K80K100K211062235520904836161. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV2

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV2abcd13K26K39K52K65K141001426114549627361. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV1

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV1abcd6K12K18K24K30K732666776934266731. (CXX) g++ options: -O3 -lrt -lm

CP2K Molecular Dynamics

Input: H20-256

OpenBenchmarking.orgSeconds, Fewer Is BetterCP2K Molecular Dynamics 2024.3Input: H20-256abcdefgh20406080100107.61109.30109.8978.6279.2781.1079.0581.041. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm

CP2K Molecular Dynamics

Input: Fayalite-FIST

OpenBenchmarking.orgSeconds, Fewer Is BetterCP2K Molecular Dynamics 2024.3Input: Fayalite-FISTabcd2040608010075.8676.2476.1388.101. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm

oneDNN

Harness: Recurrent Neural Network Training - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Recurrent Neural Network Training - Engine: CPUabcd30060090012001500922.33940.14946.681300.17MIN: 878.99MIN: 910.45MIN: 920.2MIN: 1285.871. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Recurrent Neural Network Inference - Engine: CPUabcd2004006008001000541.57554.01554.40907.42MIN: 538.82MIN: 549.09MIN: 549.73MIN: 894.751. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

LiteRT

Model: NASNet Mobile

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: NASNet Mobileabcd300K600K900K1200K1500K1874242277002912851182200

LiteRT

Model: Inception ResNet V2

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Inception ResNet V2abcd200K400K600K800K1000K97036.8108579.0167708.0918207.0

LiteRT

Model: Inception V4

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Inception V4abcd80K160K240K320K400K73575.581248.778595.3365116.0

LiteRT

Model: DeepLab V3

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: DeepLab V3abcd30K60K90K120K150K45569.534737.636867.6120180.0

LiteRT

Model: Mobilenet Float

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Mobilenet Floatabcd14K28K42K56K70K7701.368300.108544.3066587.90

LiteRT

Model: Mobilenet Quant

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Mobilenet Quantabcd12K24K36K48K60K23350.514592.420879.455675.6

LiteRT

Model: SqueezeNet

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: SqueezeNetabcd11K22K33K44K55K14175.213568.913116.853573.6

LiteRT

Model: Quantized COCO SSD MobileNet v1

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Quantized COCO SSD MobileNet v1abcd20K40K60K80K100K17895.115577.715569.992404.1

CP2K Molecular Dynamics

Input: H20-64

OpenBenchmarking.orgSeconds, Fewer Is BetterCP2K Molecular Dynamics 2024.3Input: H20-64abcd4812162012.1812.5312.5314.031. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm

oneDNN

Harness: Deconvolution Batch shapes_1d - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Deconvolution Batch shapes_1d - Engine: CPUabcd408012016020077.8279.1580.15176.37MIN: 11.78MIN: 12.79MIN: 12.18MIN: 37.231. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 1D - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: IP Shapes 1D - Engine: CPUabcd0.30650.6130.91951.2261.53250.5338660.5884350.5782341.362190MIN: 0.48MIN: 0.52MIN: 0.51MIN: 1.231. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 3D - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: IP Shapes 3D - Engine: CPUabcd0.32730.65460.98191.30921.63650.9687820.9574750.9457051.454560MIN: 0.91MIN: 0.91MIN: 0.9MIN: 1.321. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Convolution Batch Shapes Auto - Engine: CPUabcd0.1170.2340.3510.4680.5850.3079930.2917660.2971400.519800MIN: 0.28MIN: 0.27MIN: 0.28MIN: 0.421. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Deconvolution Batch shapes_3d - Engine: CPUabcd0.25280.50560.75841.01121.2641.122421.109211.123721.08184MIN: 1.1MIN: 1.08MIN: 1.09MIN: 11. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread


Phoronix Test Suite v10.8.5