gh200

ARMv8 Neoverse-V2 testing with a Pegatron JIMBO P4352 (00022432 BIOS) and NVIDIA GH200 144G HBM3e 143GB on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2410120-NE-G2008653578&grs&rdt.

gh200 system details (runs a and b):

Processor: ARMv8 Neoverse-V2 @ 3.47GHz (72 Cores)
Motherboard: Pegatron JIMBO P4352 (00022432 BIOS)
Memory: 1 x 480GB LPDDR5-6400MT/s NVIDIA 699-2G530-0236-RC1
Disk: 1000GB CT1000T700SSD3
Graphics: NVIDIA GH200 144G HBM3e 143GB
Network: 2 x Intel X550
OS: Ubuntu 24.04
Kernel: 6.8.0-45-generic-64k (aarch64)
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.6.65
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-dIwDw0/gcc-13-13.2.0/debian/tmp-nvptx/usr --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto --without-cuda-driver -v

Processor Details - Scaling Governor: cppc_cpufreq ondemand (Boost: Disabled)

Java Details - OpenJDK Runtime Environment (build 21.0.4+7-Ubuntu-1ubuntu224.04)

Python Details - Python 3.12.3

Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Results overview (runs a and b): the individual results for each test follow.

Timed Linux Kernel Compilation

Build: defconfig

Timed Linux Kernel Compilation 6.8 (Seconds, Fewer Is Better)
a: 66.71
b: 73.41
SE +/- 0.55, N = 13
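As a rough illustration of how the a and b figures in these results can be compared, the sketch below computes the relative difference of run b against run a, taking the "Fewer Is Better" or "More Is Better" direction into account. The helper name `percent_change` is hypothetical and not part of the Phoronix Test Suite; the input values are the defconfig kernel build times reported above (a = 66.71 s, b = 73.41 s).

```python
def percent_change(a: float, b: float, fewer_is_better: bool) -> float:
    """Return how much run b improves on run a, in percent.

    Positive means b is better than a; negative means b is worse.
    (Hypothetical helper for illustration only.)
    """
    if a == 0:
        raise ValueError("run a must be non-zero")
    change = (b - a) / a * 100.0
    # For time-like metrics a larger value is a regression, so flip the sign.
    return -change if fewer_is_better else change


# Timed Linux Kernel Compilation 6.8, defconfig (Seconds, Fewer Is Better).
defconfig = percent_change(66.71, 73.41, fewer_is_better=True)
print(f"defconfig: b vs a = {defconfig:+.2f}%")
```

A difference should be read against the reported standard error before treating it as meaningful; many of the a/b gaps in this result set are within a few multiples of the SE.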

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 45.23
b: 42.29
SE +/- 0.50, N = 15
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick (Iterations Per Minute, More Is Better)
a: 430
b: 408
SE +/- 0.67, N = 3
1. GraphicsMagick 1.3.42 2023-09-23 Q16 http://www.GraphicsMagick.org/

Mobile Neural Network

Model: mobilenet-v1-1.0

Mobile Neural Network 2.9.b11b7037d (ms, Fewer Is Better)
a: 1.793 (MIN: 1.34 / MAX: 22.05)
b: 1.883 (MIN: 1.35 / MAX: 21.77)
SE +/- 0.005, N = 3
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

XNNPACK

Model: FP32MobileNetV2

XNNPACK 2cd86b (us, Fewer Is Better)
a: 967
b: 925
SE +/- 8.41, N = 3
1. (CXX) g++ options: -O3 -lrt -lm

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.43 (Iterations Per Minute, More Is Better)
a: 657
b: 685
SE +/- 4.26, N = 3
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lSM -lICE -lX11 -lz -lm -lpthread -lgomp

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 390.07
b: 375.38
SE +/- 2.61, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 5.14685
b: 4.95662
SE +/- 0.07121, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

PyPerformance

Benchmark: asyncio_tcp_ssl

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 1.49
b: 1.44
SE +/- 0.00, N = 3

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.43 (Iterations Per Minute, More Is Better)
a: 301
b: 291
SE +/- 2.18, N = 15
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lSM -lICE -lX11 -lz -lm -lpthread -lgomp

Mobile Neural Network

Model: mobilenetV3

Mobile Neural Network 2.9.b11b7037d (ms, Fewer Is Better)
a: 1.134 (MIN: 0.69 / MAX: 11.14)
b: 1.097 (MIN: 0.68 / MAX: 11.22)
SE +/- 0.009, N = 3
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Build2

Time To Compile

Build2 0.17 (Seconds, Fewer Is Better)
a: 84.79
b: 87.52
SE +/- 0.22, N = 3

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick (Iterations Per Minute, More Is Better)
a: 217
b: 223
SE +/- 1.73, N = 3
1. GraphicsMagick 1.3.42 2023-09-23 Q16 http://www.GraphicsMagick.org/

GraphicsMagick

Operation: Swirl

GraphicsMagick (Iterations Per Minute, More Is Better)
a: 605
b: 619
SE +/- 5.51, N = 3
1. GraphicsMagick 1.3.42 2023-09-23 Q16 http://www.GraphicsMagick.org/

XNNPACK

Model: QU8MobileNetV3Small

XNNPACK 2cd86b (us, Fewer Is Better)
a: 1083
b: 1108
SE +/- 9.82, N = 3
1. (CXX) g++ options: -O3 -lrt -lm

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.43 (Iterations Per Minute, More Is Better)
a: 656
b: 671
SE +/- 8.82, N = 3
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lSM -lICE -lX11 -lz -lm -lpthread -lgomp

XNNPACK

Model: FP16MobileNetV3Large

XNNPACK 2cd86b (us, Fewer Is Better)
a: 1226
b: 1199
SE +/- 21.31, N = 3
1. (CXX) g++ options: -O3 -lrt -lm

x265

Video Input: Bosphorus 1080p

x265 (Frames Per Second, More Is Better)
a: 12.61
b: 12.35
SE +/- 0.18, N = 3
1. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6

XNNPACK

Model: QU8MobileNetV3Large

XNNPACK 2cd86b (us, Fewer Is Better)
a: 1484
b: 1513
SE +/- 8.97, N = 3
1. (CXX) g++ options: -O3 -lrt -lm

Mobile Neural Network

Model: SqueezeNetV1.0

Mobile Neural Network 2.9.b11b7037d (ms, Fewer Is Better)
a: 3.396 (MIN: 2.14 / MAX: 29.88)
b: 3.461 (MIN: 2.15 / MAX: 23.43)
SE +/- 0.027, N = 3
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

XNNPACK

Model: QU8MobileNetV2

XNNPACK 2cd86b (us, Fewer Is Better)
a: 945
b: 963
SE +/- 6.69, N = 3
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV3Small

XNNPACK 2cd86b (us, Fewer Is Better)
a: 881
b: 866
SE +/- 20.00, N = 3
1. (CXX) g++ options: -O3 -lrt -lm

Timed Linux Kernel Compilation

Build: allmodconfig

Timed Linux Kernel Compilation 6.8 (Seconds, Fewer Is Better)
a: 285.13
b: 289.83
SE +/- 2.59, N = 3

XNNPACK

Model: FP32MobileNetV3Small

XNNPACK 2cd86b (us, Fewer Is Better)
a: 945
b: 930
SE +/- 16.38, N = 3
1. (CXX) g++ options: -O3 -lrt -lm

Mobile Neural Network

Model: squeezenetv1.1

Mobile Neural Network 2.9.b11b7037d (ms, Fewer Is Better)
a: 1.824 (MIN: 1.17 / MAX: 20.37)
b: 1.853 (MIN: 1.18 / MAX: 15.45)
SE +/- 0.044, N = 3
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 0.582902
b: 0.574063
SE +/- 0.002945, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.43 (Iterations Per Minute, More Is Better)
a: 331
b: 326
SE +/- 4.26, N = 3
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lSM -lICE -lX11 -lz -lm -lpthread -lgomp

7-Zip Compression

Test: Compression Rating

7-Zip Compression (MIPS, More Is Better)
a: 393523
b: 398821
SE +/- 3097.20, N = 3
1. 7-Zip 23.01 (arm64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20

XNNPACK

Model: FP16MobileNetV2

XNNPACK 2cd86b (us, Fewer Is Better)
a: 840
b: 829
SE +/- 15.62, N = 3
1. (CXX) g++ options: -O3 -lrt -lm

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 18.65
b: 18.41
SE +/- 0.10, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 321.24
b: 317.28
SE +/- 0.27, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 0.462943
b: 0.457263
SE +/- 0.001019, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 0.303222
b: 0.306833
SE +/- 0.001997, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender 4.0.2 (Seconds, Fewer Is Better)
a: 73.02
b: 72.18
SE +/- 0.44, N = 3

Mobile Neural Network

Model: inception-v3

Mobile Neural Network 2.9.b11b7037d (ms, Fewer Is Better)
a: 13.69 (MIN: 11.51 / MAX: 42.34)
b: 13.85 (MIN: 11.64 / MAX: 41)
SE +/- 0.02, N = 3
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 5.80711
b: 5.87309
SE +/- 0.04903, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

POV-Ray

Trace Time

POV-Ray (Seconds, Fewer Is Better)
a: 7.786
b: 7.869
SE +/- 0.061, N = 3
1. POV-Ray 3.7.0.10.unofficial

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 4.0.2 (Seconds, Fewer Is Better)
a: 38.06
b: 38.43
SE +/- 0.04, N = 3

PyPerformance

Benchmark: gc_collect

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 1.08
b: 1.07
SE +/- 0.01, N = 15

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.43 (Iterations Per Minute, More Is Better)
a: 442
b: 438
SE +/- 6.06, N = 3
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lSM -lICE -lX11 -lz -lm -lpthread -lgomp

simdjson

Throughput Test: LargeRandom

simdjson 3.10 (GB/s, More Is Better)
a: 1.15
b: 1.14
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 1262.14
b: 1271.75
SE +/- 2.68, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.43 (Iterations Per Minute, More Is Better)
a: 411
b: 408
SE +/- 0.58, N = 3
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lSM -lICE -lX11 -lz -lm -lpthread -lgomp

simdjson

Throughput Test: TopTweet

simdjson 3.10 (GB/s, More Is Better)
a: 4.14
b: 4.11
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 161.33
b: 162.49
SE +/- 0.70, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

GraphicsMagick

Operation: Resizing

GraphicsMagick (Iterations Per Minute, More Is Better)
a: 282
b: 284
SE +/- 1.15, N = 3
1. GraphicsMagick 1.3.42 2023-09-23 Q16 http://www.GraphicsMagick.org/

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 159.07
b: 160.17
SE +/- 1.12, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 210.12
b: 211.54
SE +/- 2.64, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

PyPerformance

Benchmark: pathlib

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 15.5
b: 15.4
SE +/- 0.03, N = 3

simdjson

Throughput Test: Kostya

simdjson 3.10 (GB/s, More Is Better)
a: 3.11
b: 3.13
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -lrt

PyPerformance

Benchmark: nbody

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 64.5
b: 64.9
SE +/- 0.09, N = 3

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 109.24
b: 108.60
SE +/- 0.76, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

GraphicsMagick

Operation: Sharpen

GraphicsMagick (Iterations Per Minute, More Is Better)
a: 171
b: 170
SE +/- 0.33, N = 3
1. GraphicsMagick 1.3.42 2023-09-23 Q16 http://www.GraphicsMagick.org/

Mobile Neural Network

Model: resnet-v2-50

Mobile Neural Network 2.9.b11b7037d (ms, Fewer Is Better)
a: 11.34 (MIN: 8.54 / MAX: 42.16)
b: 11.27 (MIN: 8.57 / MAX: 39.91)
SE +/- 0.10, N = 3
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

PyPerformance

Benchmark: json_loads

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 17.5
b: 17.4
SE +/- 0.06, N = 3

x265

Video Input: Bosphorus 4K

x265 (Frames Per Second, More Is Better)
a: 8.81
b: 8.86
SE +/- 0.03, N = 3
1. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6

7-Zip Compression

Test: Decompression Rating

7-Zip Compression 24.05 (MIPS, More Is Better)
a: 420524
b: 418162
SE +/- 944.71, N = 3
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.43 (Iterations Per Minute, More Is Better)
a: 359
b: 361
SE +/- 0.67, N = 3
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lSM -lICE -lX11 -lz -lm -lpthread -lgomp

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.31.1 (Nodes Per Second, More Is Better)
a: 360
b: 362
SE +/- 4.26, N = 3
1. (CXX) g++ options: -flto -pthread

PyPerformance

Benchmark: python_startup

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 18.7
b: 18.8
SE +/- 0.06, N = 3

PyPerformance

Benchmark: pickle_pure_python

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 205
b: 204
SE +/- 0.33, N = 3

simdjson

Throughput Test: DistinctUserID

simdjson 3.10 (GB/s, More Is Better)
a: 4.16
b: 4.18
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -lrt

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

Blender 4.0.2 (Seconds, Fewer Is Better)
a: 154.46
b: 153.73
SE +/- 0.37, N = 3

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 0.360196
b: 0.358509
SE +/- 0.001260, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

PyPerformance

Benchmark: raytrace

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 217
b: 218
SE +/- 0.33, N = 3

PyPerformance

Benchmark: go

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 98.2
b: 97.8
SE +/- 0.07, N = 3

PyPerformance

Benchmark: asyncio_websockets

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 510
b: 508
SE +/- 0.33, N = 3

PyPerformance

Benchmark: django_template

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 26.3
b: 26.2
SE +/- 0.12, N = 3

PyPerformance

Benchmark: crypto_pyaes

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 54.8
b: 55.0
SE +/- 0.03, N = 3

Etcpak

Benchmark: Multi-Threaded - Configuration: ETC2

Etcpak 2.0 (Mpx/s, More Is Better)
a: 471.19
b: 469.59
SE +/- 2.47, N = 3
1. (CXX) g++ options: -flto -pthread

Timed LLVM Compilation

Build System: Unix Makefiles

Timed LLVM Compilation 16.0 (Seconds, Fewer Is Better)
a: 276.93
b: 277.75
SE +/- 0.32, N = 3

GraphicsMagick

Operation: Enhanced

GraphicsMagick (Iterations Per Minute, More Is Better)
a: 351
b: 350
SE +/- 0.33, N = 3
1. GraphicsMagick 1.3.42 2023-09-23 Q16 http://www.GraphicsMagick.org/

PyPerformance

Benchmark: async_tree_io

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 748
b: 750
SE +/- 2.96, N = 3

PyPerformance

Benchmark: regex_compile

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 82.3
b: 82.1
SE +/- 0.12, N = 3

Mobile Neural Network

Model: nasnet

Mobile Neural Network 2.9.b11b7037d (ms, Fewer Is Better)
a: 5.008 (MIN: 4.49 / MAX: 27.91)
b: 4.996 (MIN: 4.52 / MAX: 20.42)
SE +/- 0.036, N = 3
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

C-Ray

Resolution: 5K - Rays Per Pixel: 16

C-Ray 2.0 (Seconds, Fewer Is Better)
a: 36.21
b: 36.13
SE +/- 0.02, N = 3
1. (CC) gcc options: -lpthread -lm

Epoch

Epoch3D Deck: Cone

Epoch 4.19.4 (Seconds, Fewer Is Better)
a: 188.20
b: 187.79
SE +/- 2.18, N = 4
1. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2024 (Ns Per Day, More Is Better)
a: 6.001
b: 5.990
SE +/- 0.003, N = 3
1. (CXX) g++ options: -O3 -lm

WarpX

Input: Plasma Acceleration

WarpX 24.10 (Seconds, Fewer Is Better)
a: 20.38
b: 20.41
SE +/- 0.03, N = 3
1. (CXX) g++ options: -O3

7-Zip Compression

Test: Decompression Rating

7-Zip Compression (MIPS, More Is Better)
a: 418819
b: 418365
SE +/- 507.84, N = 3
1. 7-Zip 23.01 (arm64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20

7-Zip Compression

Test: Compression Rating

7-Zip Compression 24.05 (MIPS, More Is Better)
a: 384775
b: 384421
SE +/- 4213.31, N = 3
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

WarpX

Input: Uniform Plasma

WarpX 24.10 (Seconds, Fewer Is Better)
a: 16.90
b: 16.89
SE +/- 0.18, N = 3
1. (CXX) g++ options: -O3

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 (Inferences Per Second, More Is Better)
a: 317.87
b: 317.60
SE +/- 0.61, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

BYTE Unix Benchmark

Computational Test: Dhrystone 2

BYTE Unix Benchmark 5.1.3-git (LPS, More Is Better)
a: 4998587529.8
b: 4994389993.2
SE +/- 2591819.88, N = 3
1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

C-Ray

Resolution: 4K - Rays Per Pixel: 16

C-Ray 2.0 (Seconds, Fewer Is Better)
a: 20.36
b: 20.34
SE +/- 0.00, N = 3
1. (CC) gcc options: -lpthread -lm

BYTE Unix Benchmark

Computational Test: Pipe

BYTE Unix Benchmark 5.1.3-git (LPS, More Is Better)
a: 202565282.2
b: 202436523.6
SE +/- 32087.94, N = 3
1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

GROMACS

Input: water_GMX50_bare

GROMACS (Ns Per Day, More Is Better)
a: 7.156
b: 7.159
SE +/- 0.004, N = 3
1. GROMACS version: 2023.3-Ubuntu_2023.3_1ubuntu3

C-Ray

Resolution: 1080p - Rays Per Pixel: 16

C-Ray 2.0 (Seconds, Fewer Is Better)
a: 5.195
b: 5.197
SE +/- 0.003, N = 3
1. (CC) gcc options: -lpthread -lm

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 4.0.2 (Seconds, Fewer Is Better)
a: 381.45
b: 381.55
SE +/- 0.54, N = 3

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 4.0.2 (Seconds, Fewer Is Better)
a: 78.37
b: 78.36
SE +/- 0.08, N = 3

BYTE Unix Benchmark

Computational Test: Whetstone Double

BYTE Unix Benchmark 5.1.3-git (MWIPS, More Is Better)
a: 721978.0
b: 721932.3
SE +/- 19.25, N = 3
1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

Timed LLVM Compilation

Build System: Ninja

Timed LLVM Compilation 16.0 (Seconds, Fewer Is Better)
a: 175.03
b: 175.04
SE +/- 1.19, N = 3

BYTE Unix Benchmark

Computational Test: System Call

BYTE Unix Benchmark 5.1.3-git (LPS, More Is Better)
a: 145868649.3
b: 145872070.3
SE +/- 15202.21, N = 3
1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

PyPerformance

Benchmark: xml_etree

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 45.8
b: 45.8
SE +/- 0.03, N = 3

PyPerformance

Benchmark: float

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 56.8
b: 56.8
SE +/- 0.03, N = 3

PyPerformance

Benchmark: chaos

PyPerformance 1.11 (Milliseconds, Fewer Is Better)
a: 47.4
b: 47.4
SE +/- 0.06, N = 3

XNNPACK

Model: FP32MobileNetV3Large

XNNPACK 2cd86b (us, Fewer Is Better)
a: 1426
b: 1426
SE +/- 6.51, N = 3
1. (CXX) g++ options: -O3 -lrt -lm

Mobile Neural Network

Model: MobileNetV2_224

Mobile Neural Network 2.9.b11b7037d (ms, Fewer Is Better)
a: 1.502 (MIN: 1.12 / MAX: 13.52)
b: 1.502 (MIN: 1.15 / MAX: 9.89)
SE +/- 0.019, N = 3
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

GraphicsMagick

Operation: Rotate

GraphicsMagick (Iterations Per Minute, More Is Better)
a: 209
b: 209
SE +/- 0.88, N = 3
1. GraphicsMagick 1.3.42 2023-09-23 Q16 http://www.GraphicsMagick.org/

simdjson

Throughput Test: PartialTweets

simdjson 3.10, Throughput Test: PartialTweets
GB/s, More Is Better
a: 4.06
b: 4.06
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.19, Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
a: 3298.19
b: 3259.10
SE +/- 21.77, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19, Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
a: 2776.33
b: 2789.32
SE +/- 9.74, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.19, Model: super-resolution-10 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
a: 6.28553
b: 6.24111
SE +/- 0.04448, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19, Model: super-resolution-10 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
a: 53.63
b: 54.31
SE +/- 0.29, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19, Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
a: 3.14523
b: 3.14788
SE +/- 0.00610, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19, Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
a: 6.19726
b: 6.15294
SE +/- 0.02687, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.19, Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
a: 2160.11
b: 2186.92
SE +/- 4.77, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19, Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
a: 1715.64
b: 1741.97
SE +/- 8.64, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19, Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
a: 0.791624
b: 0.785643
SE +/- 0.001678, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19, Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
a: 3.11158
b: 3.15037
SE +/- 0.00260, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime 1.19, Model: T5 Encoder - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
a: 2.56003
b: 2.66013
SE +/- 0.01750, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

ONNX Runtime 1.19, Model: T5 Encoder - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
a: 9.15193
b: 9.20593
SE +/- 0.06344, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime 1.19, Model: ZFNet-512 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
a: 4.75865
b: 4.72548
SE +/- 0.05942, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19, Model: ZFNet-512 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
a: 22.15
b: 23.64
SE +/- 0.25, N = 15
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime 1.19, Model: yolov4 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
a: 194.36
b: 201.75
SE +/- 2.71, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19, Model: yolov4 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
a: 172.23
b: 170.27
SE +/- 1.46, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Stockfish

Chess Benchmark

Stockfish, Chess Benchmark
Nodes Per Second, More Is Better
a: 58496753
b: 69473429
SE +/- 959000.15, N = 15
1. Stockfish 16 by the Stockfish developers (see AUTHORS file)

Stockfish

Chess Benchmark

Stockfish 17, Chess Benchmark
Nodes Per Second, More Is Better
a: 168428763
b: 188288587
SE +/- 6156005.01, N = 15
1. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -flto -flto-partition=one -flto=jobserver


Phoronix Test Suite v10.8.5