Intel Xeon 6900P - SNC vs. HEX Clustering Mode

Benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2409257-NE-INTELGNRH28&grr&sor.

Configurations: HEX Mode, SNC3 - Default

Processor: 2 x Intel Xeon 6980P @ 3.90GHz (256 Cores / 512 Threads)
Motherboard: Intel BIRCHSTREAM (BHSDCRB1.IPC.0035.D44.2408292336 BIOS)
Chipset: Intel Ice Lake IEH
Memory: 1520GB
Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07
Graphics: ASPEED
Network: Intel I210 + 2 x Intel 10-Gigabit X540-AT2
OS: Ubuntu 24.04
Kernel: 6.8.0-45-generic (x86_64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate performance (EPP: performance)
- CPU Microcode: 0x10002f0

Java Details
- OpenJDK Runtime Environment (build 21.0.4+7-Ubuntu-1ubuntu224.04)

Python Details
- Python 3.12.3

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: BHI_DIS_S
- srbds: Not affected
- tsx_async_abort: Not affected


PETSc

Test: Streams

PETSc 3.19 (MB/s, More Is Better)
HEX Mode: 595124.23 (SE +/- 1960.36, N = 3)
SNC3 - Default: 449771.29 (SE +/- 5879.52, N = 4)
1. (CC) gcc options: -fPIC -O3 -O2 -lpthread -lm
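Because the results mix "More Is Better" and "Fewer Is Better" metrics, comparing the two clustering modes requires flipping the ratio depending on the metric direction. The following is a minimal sketch of that arithmetic, using the PETSc Streams figures above. The function name `snc3_gain_percent` and its parameters are hypothetical and not part of the Phoronix Test Suite.

```python
def snc3_gain_percent(hex_value, snc3_value, more_is_better=True):
    """Return how much better SNC3 is than HEX, in percent.

    A negative result means SNC3 is worse than HEX for this test.
    Hypothetical helper for illustration only.
    """
    if more_is_better:
        # Higher throughput is better: compare SNC3 directly against HEX.
        return (snc3_value / hex_value - 1.0) * 100.0
    # Lower time/latency is better: invert the ratio.
    return (hex_value / snc3_value - 1.0) * 100.0


# PETSc Streams (MB/s, More Is Better), values taken from the result above.
petsc = snc3_gain_percent(595124.23, 449771.29, more_is_better=True)
print(f"PETSc Streams, SNC3 relative to HEX: {petsc:+.1f}%")
```

For a "Fewer Is Better" result such as a build time in seconds, pass `more_is_better=False` so that a shorter SNC3 time shows up as a positive gain.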

OpenRadioss

Model: Rubber O-Ring Seal Installation

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
SNC3 - Default: 219.85 (SE +/- 3.87, N = 12)
HEX Mode: 231.10 (SE +/- 6.59, N = 9)

Stockfish

Chess Benchmark

Stockfish 17 (Nodes Per Second, More Is Better)
SNC3 - Default: 587474317 (SE +/- 16310733.41, N = 6)
HEX Mode: 566141415 (SE +/- 10082654.90, N = 9)
1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver

NWChem

Input: C240 Buckyball

NWChem 7.0.2 (Seconds, Fewer Is Better)
SNC3 - Default: 1660.6
HEX Mode: 1779.7
1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2

easyWave

Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400

easyWave r34 (Seconds, Fewer Is Better)
HEX Mode: 140.44 (SE +/- 1.63, N = 12)
SNC3 - Default: 141.91 (SE +/- 1.82, N = 12)
1. (CXX) g++ options: -O3 -fopenmp

NAMD

Input: STMV with 1,066,628 Atoms

NAMD 3.0 (ns/day, More Is Better)
HEX Mode: 2.55596 (SE +/- 0.06201, N = 15)
SNC3 - Default: 1.97540 (SE +/- 0.01818, N = 13)

High Performance Conjugate Gradient

X Y Z: 160 160 160 - RT: 60

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
SNC3 - Default: 170.00 (SE +/- 0.02, N = 3)
HEX Mode: 159.84 (SE +/- 0.29, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

libxsmm

M N K: 128

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better)
SNC3 - Default: 8182.9 (SE +/- 103.51, N = 3)
HEX Mode: 7862.2 (SE +/- 151.59, N = 9)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

DaCapo Benchmark

Java Test: H2 Database Engine

DaCapo Benchmark 23.11 (msec, Fewer Is Better)
SNC3 - Default: 11238 (SE +/- 336.25, N = 15)
HEX Mode: 17178 (SE +/- 372.20, N = 15)

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

TensorFlow 2.16.1 (images/sec, More Is Better)
HEX Mode: 214.73 (SE +/- 2.28, N = 4)
SNC3 - Default: 173.99 (SE +/- 1.54, N = 3)

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency

PostgreSQL 16 (ms, Fewer Is Better)
HEX Mode: 1.645 (SE +/- 0.108, N = 12)
SNC3 - Default: 1.843 (SE +/- 0.023, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Only

PostgreSQL 16 (TPS, More Is Better)
HEX Mode: 638580 (SE +/- 42784.99, N = 12)
SNC3 - Default: 542652 (SE +/- 6739.41, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

libxsmm

M N K: 256

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better)
SNC3 - Default: 4213.9 (SE +/- 124.41, N = 15)
HEX Mode: 2706.0 (SE +/- 36.38, N = 3)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

High Performance Conjugate Gradient

X Y Z: 144 144 144 - RT: 60

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
SNC3 - Default: 171.21 (SE +/- 0.29, N = 3)
HEX Mode: 161.92 (SE +/- 0.22, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

BYTE Unix Benchmark

Computational Test: Whetstone Double

BYTE Unix Benchmark 5.1.3-git (MWIPS, More Is Better)
HEX Mode: 3728535.6 (SE +/- 113.17, N = 3)
SNC3 - Default: 3722311.4 (SE +/- 382.28, N = 3)
1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

Timed LLVM Compilation

Build System: Ninja

Timed LLVM Compilation 16.0 (Seconds, Fewer Is Better)
SNC3 - Default: 76.73 (SE +/- 0.86, N = 5)
HEX Mode: 94.06 (SE +/- 0.68, N = 15)

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: Points2Image

Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 (Test Cases Per Minute, More Is Better)
SNC3 - Default: 4375.31 (SE +/- 52.66, N = 3)
HEX Mode: 4243.69 (SE +/- 47.68, N = 15)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp

BYTE Unix Benchmark

Computational Test: Dhrystone 2

BYTE Unix Benchmark 5.1.3-git (LPS, More Is Better)
HEX Mode: 18804694499.0 (SE +/- 9501831.17, N = 3)
SNC3 - Default: 18658833934.3 (SE +/- 22131931.33, N = 3)
1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 4K

SVT-AV1 2.2 (Frames Per Second, More Is Better)
SNC3 - Default: 9.155 (SE +/- 0.011, N = 3)
HEX Mode: 9.106 (SE +/- 0.031, N = 3)
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

BYTE Unix Benchmark

Computational Test: System Call

BYTE Unix Benchmark 5.1.3-git (LPS, More Is Better)
HEX Mode: 910572120.1 (SE +/- 144616.99, N = 3)
SNC3 - Default: 897496230.9 (SE +/- 219845.26, N = 3)
1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

NAMD

Input: ATPase with 327,506 Atoms

NAMD 3.0 (ns/day, More Is Better)
HEX Mode: 4.49970 (SE +/- 0.05577, N = 15)
SNC3 - Default: 3.78755 (SE +/- 0.01928, N = 3)

SVT-AV1

Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit

SVT-AV1 2.2 (Frames Per Second, More Is Better)
SNC3 - Default: 5.773 (SE +/- 0.006, N = 3)
HEX Mode: 5.684 (SE +/- 0.018, N = 3)
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Timed LLVM Compilation

Build System: Unix Makefiles

Timed LLVM Compilation 16.0 (Seconds, Fewer Is Better)
SNC3 - Default: 197.13 (SE +/- 0.78, N = 3)
HEX Mode: 214.44 (SE +/- 1.09, N = 3)

High Performance Conjugate Gradient

X Y Z: 104 104 104 - RT: 60

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
SNC3 - Default: 176.01 (SE +/- 0.92, N = 3)
HEX Mode: 168.78 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

OpenRadioss

Model: Bird Strike on Windshield

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
SNC3 - Default: 154.56 (SE +/- 0.38, N = 3)
HEX Mode: 172.45 (SE +/- 0.18, N = 3)

DaCapo Benchmark

Java Test: Apache Tomcat

DaCapo Benchmark 23.11 (msec, Fewer Is Better)
HEX Mode: 8933 (SE +/- 144.35, N = 15)
SNC3 - Default: 9223 (SE +/- 74.15, N = 15)

easyWave

Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200

easyWave r34 (Seconds, Fewer Is Better)
SNC3 - Default: 52.97 (SE +/- 0.59, N = 5)
HEX Mode: 59.60 (SE +/- 0.84, N = 12)
1. (CXX) g++ options: -O3 -fopenmp

Timed Linux Kernel Compilation

Build: allmodconfig

Timed Linux Kernel Compilation 6.8 (Seconds, Fewer Is Better)
SNC3 - Default: 131.15 (SE +/- 1.64, N = 3)
HEX Mode: 193.11 (SE +/- 0.27, N = 3)

SVT-AV1

Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit

SVT-AV1 2.2 (Frames Per Second, More Is Better)
HEX Mode: 8.191 (SE +/- 0.009, N = 3)
SNC3 - Default: 7.991 (SE +/- 0.055, N = 3)
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Write - Average Latency

PostgreSQL 16 (ms, Fewer Is Better)
HEX Mode: 73.23 (SE +/- 0.01, N = 3)
SNC3 - Default: 74.04 (SE +/- 0.20, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Write

PostgreSQL 16 (TPS, More Is Better)
HEX Mode: 13656 (SE +/- 2.00, N = 3)
SNC3 - Default: 13506 (SE +/- 37.41, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

Apache Cassandra

Test: Writes

Apache Cassandra 5.0 (Op/s, More Is Better)
HEX Mode: 147125 (SE +/- 1522.45, N = 3)
SNC3 - Default: 100742 (SE +/- 1022.25, N = 3)

OpenRadioss

Model: Bumper Beam

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
SNC3 - Default: 110.55 (SE +/- 0.43, N = 3)
HEX Mode: 110.68 (SE +/- 0.36, N = 3)

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

LAMMPS Molecular Dynamics Simulator 23Jun2022 (ns/day, More Is Better)
SNC3 - Default: 94.05 (SE +/- 0.21, N = 3)
HEX Mode: 92.75 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

GPAW

Input: Carbon Nanotube

GPAW 23.6 (Seconds, Fewer Is Better)
SNC3 - Default: 86.88 (SE +/- 0.72, N = 3)
HEX Mode: 88.87 (SE +/- 0.28, N = 3)
1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

Graph500

Scale: 26

Graph500 3.0 (sssp max_TEPS, More Is Better)
SNC3 - Default: 1105950000
HEX Mode: 964107000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 (sssp median_TEPS, More Is Better)
SNC3 - Default: 871769000
HEX Mode: 673326000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 (bfs max_TEPS, More Is Better)
SNC3 - Default: 2116240000
HEX Mode: 1954670000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 (bfs median_TEPS, More Is Better)
SNC3 - Default: 1748790000
HEX Mode: 1530330000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

RELION

Test: Basic - Device: CPU

RELION 4.0.1 (Seconds, Fewer Is Better)
SNC3 - Default: 104.05 (SE +/- 0.54, N = 3)
HEX Mode: 114.40 (SE +/- 0.83, N = 3)
1. (CXX) g++ options: -fopenmp -std=c++11 -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -ljpeg -lmpi_cxx -lmpi

OpenRadioss

Model: INIVOL and Fluid Structure Interaction Drop Container

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
SNC3 - Default: 90.63 (SE +/- 0.25, N = 3)
HEX Mode: 93.13 (SE +/- 0.28, N = 3)

OpenRadioss

Model: Chrysler Neon 1M

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
SNC3 - Default: 61.48 (SE +/- 0.37, N = 3)
HEX Mode: 66.46 (SE +/- 0.38, N = 3)

Xcompact3d Incompact3d

Input: X3D-benchmarking input.i3d

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
SNC3 - Default: 68.89 (SE +/- 0.42, N = 3)
HEX Mode: 72.11 (SE +/- 0.09, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SVT-AV1

Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit

SVT-AV1 2.2 (Frames Per Second, More Is Better)
HEX Mode: 13.41 (SE +/- 0.00, N = 3)
SNC3 - Default: 12.98 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

PyHPC Benchmarks

Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Isoneutral Mixing

PyHPC Benchmarks 3.0 (Seconds, Fewer Is Better)
SNC3 - Default: 1.907 (SE +/- 0.020, N = 3)
HEX Mode: 1.931 (SE +/- 0.011, N = 3)

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: Euclidean Cluster

Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 (Test Cases Per Minute, More Is Better)
HEX Mode: 676.38 (SE +/- 5.87, N = 3)
SNC3 - Default: 612.85 (SE +/- 15.97, N = 12)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp

SPECFEM3D

Model: Mount St. Helens

SPECFEM3D 4.1.1 (Seconds, Fewer Is Better)
SNC3 - Default: 3.995710366 (SE +/- 0.037097693, N = 7)
HEX Mode: 4.018974773 (SE +/- 0.026444752, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 4K

SVT-AV1 2.2 (Frames Per Second, More Is Better)
HEX Mode: 31.19 (SE +/- 0.19, N = 3)
SNC3 - Default: 30.86 (SE +/- 0.35, N = 3)
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Xcompact3d Incompact3d

Input: input.i3d 193 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
SNC3 - Default: 2.56438231 (SE +/- 0.01327750, N = 3)
HEX Mode: 2.82190266 (SE +/- 0.02990610, N = 15)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 4.2 (Seconds, Fewer Is Better)
SNC3 - Default: 68.00 (SE +/- 0.33, N = 3)
HEX Mode: 68.40 (SE +/- 0.18, N = 3)

DaCapo Benchmark

Java Test: BioJava Biological Data Framework

DaCapo Benchmark 23.11 (msec, Fewer Is Better)
SNC3 - Default: 5503 (SE +/- 42.20, N = 15)
HEX Mode: 5791 (SE +/- 50.95, N = 8)

Xcompact3d Incompact3d

Input: input.i3d 129 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
SNC3 - Default: 0.831954996 (SE +/- 0.005913486, N = 15)
HEX Mode: 0.903198322 (SE +/- 0.011942737, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Darmstadt Automotive Parallel Heterogeneous Suite

Backend: OpenMP - Kernel: NDT Mapping

Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 (Test Cases Per Minute, More Is Better)
SNC3 - Default: 574.84 (SE +/- 8.56, N = 12)
HEX Mode: 547.43 (SE +/- 1.66, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp

Timed Linux Kernel Compilation

Build: defconfig

Timed Linux Kernel Compilation 6.8 (Seconds, Fewer Is Better)
SNC3 - Default: 23.44 (SE +/- 0.20, N = 8)
HEX Mode: 26.98 (SE +/- 0.23, N = 8)

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2024 (Ns Per Day, More Is Better)
SNC3 - Default: 32.59 (SE +/- 0.10, N = 3)
HEX Mode: 32.14 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -lm

OpenRadioss

Model: Cell Phone Drop Test

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
SNC3 - Default: 39.18 (SE +/- 0.06, N = 3)
HEX Mode: 40.73 (SE +/- 0.29, N = 3)

ASKAP

Test: tConvolve MPI - Gridding

ASKAP 1.0 (Mpix/sec, More Is Better)
SNC3 - Default: 125342.0 (SE +/- 1080.26, N = 3)
HEX Mode: 92985.2 (SE +/- 1249.70, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MPI - Degridding

ASKAP 1.0 (Mpix/sec, More Is Better)
SNC3 - Default: 107661.0 (SE +/- 797.06, N = 3)
HEX Mode: 82064.2 (SE +/- 704.03, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

miniBUDE

Implementation: OpenMP - Input Deck: BM2

miniBUDE 20210901 (Billion Interactions/s, More Is Better)
HEX Mode: 283.76 (SE +/- 2.64, N = 6)
SNC3 - Default: 268.53 (SE +/- 6.07, N = 12)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

miniBUDE 20210901 (GFInst/s, More Is Better)
HEX Mode: 7093.93 (SE +/- 66.01, N = 6)
SNC3 - Default: 6713.24 (SE +/- 151.76, N = 12)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 23Jun2022 (ns/day, More Is Better)
SNC3 - Default: 71.40 (SE +/- 1.30, N = 12)
HEX Mode: 70.47 (SE +/- 0.88, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

DaCapo Benchmark

Java Test: Apache Xalan XSLT

DaCapo Benchmark 23.11 (msec, Fewer Is Better)
HEX Mode: 2500 (SE +/- 40.23, N = 15)
SNC3 - Default: 2595 (SE +/- 50.69, N = 15)

SPECFEM3D

Model: Water-layered Halfspace

SPECFEM3D 4.1.1 (Seconds, Fewer Is Better)
SNC3 - Default: 9.243850981 (SE +/- 0.054952663, N = 3)
HEX Mode: 9.321282291 (SE +/- 0.016038732, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

PyHPC Benchmarks

Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Equation of State

PyHPC Benchmarks 3.0 (Seconds, Fewer Is Better)
SNC3 - Default: 1.546 (SE +/- 0.008, N = 3)
HEX Mode: 1.713 (SE +/- 0.012, N = 3)

SPECFEM3D

Model: Layered Halfspace

SPECFEM3D 4.1.1 (Seconds, Fewer Is Better)
SNC3 - Default: 7.207569259 (SE +/- 0.060537089, N = 3)
HEX Mode: 7.422688518 (SE +/- 0.036865236, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Homogeneous Halfspace

SPECFEM3D 4.1.1 (Seconds, Fewer Is Better)
SNC3 - Default: 5.433174239 (SE +/- 0.012290725, N = 3)
HEX Mode: 5.513972044 (SE +/- 0.007527831, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Tomographic Model

SPECFEM3D 4.1.1 (Seconds, Fewer Is Better)
SNC3 - Default: 4.361234856 (SE +/- 0.014525946, N = 3)
HEX Mode: 4.448644552 (SE +/- 0.014802330, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

7-Zip Compression

Test: Decompression Rating

7-Zip Compression 24.05 (MIPS, More Is Better)
SNC3 - Default: 1394947 (SE +/- 10392.40, N = 3)
HEX Mode: 1377979 (SE +/- 5512.25, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression

Test: Compression Rating

7-Zip Compression 24.05 (MIPS, More Is Better)
HEX Mode: 911774 (SE +/- 5282.93, N = 3)
SNC3 - Default: 897725 (SE +/- 8001.79, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

Algebraic Multi-Grid Benchmark

Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
SNC3 - Default: 8504201333 (SE +/- 11711915.09, N = 3)
HEX Mode: 7708395000 (SE +/- 20523024.36, N = 3)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

SVT-AV1 2.2 (Frames Per Second, More Is Better)
HEX Mode: 66.99 (SE +/- 0.74, N = 3)
SNC3 - Default: 63.97 (SE +/- 0.65, N = 3)
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

SVT-AV1 2.2 (Frames Per Second, More Is Better)
HEX Mode: 195.99 (SE +/- 4.03, N = 12)
SNC3 - Default: 193.49 (SE +/- 0.39, N = 3)
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

DaCapo Benchmark

Java Test: Apache Kafka

DaCapo Benchmark 23.11 (msec, Fewer Is Better)
HEX Mode: 5995 (SE +/- 1.20, N = 3)
SNC3 - Default: 6098 (SE +/- 2.60, N = 3)

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

Blender 4.2 (Seconds, Fewer Is Better)
SNC3 - Default: 21.28 (SE +/- 0.05, N = 3)
HEX Mode: 21.54 (SE +/- 0.03, N = 3)

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 4.2 (Seconds, Fewer Is Better)
HEX Mode: 17.51 (SE +/- 0.04, N = 3)
SNC3 - Default: 17.55 (SE +/- 0.14, N = 3)

libxsmm

M N K: 64

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better)
SNC3 - Default: 5826.5 (SE +/- 63.47, N = 15)
HEX Mode: 5554.0 (SE +/- 68.57, N = 15)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

DaCapo Benchmark

Java Test: Jython

DaCapo Benchmark 23.11 (msec, Fewer Is Better)
SNC3 - Default: 3562 (SE +/- 41.68, N = 3)
HEX Mode: 3568 (SE +/- 22.30, N = 3)

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0 (GFLOP/s, More Is Better)
HEX Mode: 3865.66 (SE +/- 5.73, N = 3)
SNC3 - Default: 3545.81 (SE +/- 6.73, N = 3)
1. (CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas

Blender

Blend File: Junkshop - Compute: CPU-Only

Blender 4.2 (Seconds, Fewer Is Better)
SNC3 - Default: 10.85 (SE +/- 0.03, N = 3)
HEX Mode: 10.89 (SE +/- 0.06, N = 3)

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender 4.2 (Seconds, Fewer Is Better)
SNC3 - Default: 10.43 (SE +/- 0.12, N = 3)
HEX Mode: 10.49 (SE +/- 0.10, N = 3)

libxsmm

M N K: 32

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better)
SNC3 - Default: 3507.4 (SE +/- 58.49, N = 12)
HEX Mode: 3214.1 (SE +/- 68.51, N = 12)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 4.2 (Seconds, Fewer Is Better)
SNC3 - Default: 7.49 (SE +/- 0.02, N = 3)
HEX Mode: 7.69 (SE +/- 0.05, N = 3)


Phoronix Test Suite v10.8.5