9950X onnx svt

AMD Ryzen 9 9950X 16-Core testing with a ASUS ROG STRIX X670E-E GAMING WIFI (2204 BIOS) and AMD Radeon RX 7900 GRE 16GB on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2408225-NE-9950XONNX44&grs.

9950X onnx svtProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDesktopDisplay ServerOpenGLCompilerFile-SystemScreen ResolutionabcdeAMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads)ASUS ROG STRIX X670E-E GAMING WIFI (2204 BIOS)AMD Device 14d82 x 32GB DDR5-6400MT/s Corsair CMK64GX5M2B6400C322000GB Corsair MP700 PROAMD Radeon RX 7900 GRE 16GBAMD Navi 31 HDMI/DPDELL U2723QEIntel I225-V + Intel Wi-Fi 6EUbuntu 24.046.10.0-phx (x86_64)GNOME Shell 46.0X Server + Wayland4.6 Mesa 24.2~git2406040600.8112d4~oibaf~n (git-8112d44 2024-06-04 noble-oibaf-ppa) (LLVM 17.0.6 DRM 3.57)GCC 13.2.0ext43840x2160OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xb40401aPython Details- Python 3.12.3Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

9950X onnx svtonnx: yolov4 - CPU - Parallelonnx: bertsquad-12 - CPU - Standardonnx: T5 Encoder - CPU - Standardonnx: super-resolution-10 - CPU - Parallelonnx: bertsquad-12 - CPU - Parallelonnx: Faster R-CNN R-50-FPN-int8 - CPU - Standardonnx: GPT-2 - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Parallelwhisperfile: Tinyonnx: ResNet50 v1-12-int8 - CPU - Parallelonnx: super-resolution-10 - CPU - Standardonnx: ArcFace ResNet-100 - CPU - Parallelsvt-av1: Preset 8 - Beauty 4K 10-bitonnx: ResNet101_DUC_HDC-12 - CPU - Parallelonnx: T5 Encoder - CPU - Parallelonnx: yolov4 - CPU - Standardonnx: ZFNet-512 - CPU - Parallelwhisperfile: Smallonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallelonnx: ZFNet-512 - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Standardsvt-av1: Preset 8 - Bosphorus 4Ksvt-av1: Preset 13 - Bosphorus 4Konnx: fcn-resnet101-11 - CPU - Parallelonnx: GPT-2 - CPU - Parallelonnx: ResNet101_DUC_HDC-12 - CPU - Standardonnx: fcn-resnet101-11 - CPU - Standardsvt-av1: Preset 8 - Bosphorus 1080psvt-av1: Preset 5 - Bosphorus 1080psvt-av1: Preset 5 - Bosphorus 4Ksvt-av1: Preset 3 - Bosphorus 4Kwhisperfile: Mediumsvt-av1: Preset 5 - Beauty 4K 10-bitonnx: ArcFace ResNet-100 - CPU - Standardsvt-av1: Preset 13 - Beauty 4K 10-bitsvt-av1: Preset 3 - Bosphorus 1080psvt-av1: Preset 13 - Bosphorus 1080psvt-av1: Preset 3 - Beauty 4K 10-bitonnx: Faster R-CNN R-50-FPN-int8 - CPU - Standardonnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallelonnx: ResNet101_DUC_HDC-12 - CPU - Standardonnx: ResNet101_DUC_HDC-12 - CPU - Parallelonnx: super-resolution-10 - CPU - Standardonnx: super-resolution-10 - CPU - Parallelonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: ResNet50 v1-12-int8 - CPU - Parallelonnx: ArcFace ResNet-100 - CPU - Standardonnx: ArcFace ResNet-100 - CPU - Parallelonnx: fcn-resnet101-11 - CPU - Standardonnx: fcn-resnet101-11 - CPU - Parallelonnx: CaffeNet 12-int8 - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Parallelonnx: bertsquad-12 - CPU - Standardonnx: bertsquad-12 - CPU - Parallelonnx: T5 Encoder - CPU - Standardonnx: T5 Encoder - CPU - Parallelonnx: ZFNet-512 - CPU - Standardonnx: ZFNet-512 - CPU - Parallelonnx: yolov4 - CPU - Standardonnx: yolov4 - CPU - Parallelonnx: GPT-2 - CPU - Standardonnx: GPT-2 - CPU - Parallelabcde7.7136825.1022190.624221.93611.775061.4985161.414245.27829.63488152.641225.15624.479410.5931.41808190.54913.784071.0823127.97316578.94041.3330127.185851.24498.657257.2012.22498150.8652.559594.95375302.419128.70744.10411.814328.971847.57455.741119.40037.3131009.6401.76016.260924.1918390.73705.2094.441164.505381.726706.5507917.938640.8495201.866449.4811.174054.0762039.834884.92865.248025.247487.8611014.067372.5475129.6446.195506.624847.6889925.0524192.676220.4711.601660.4839156.717243.72730.31093152.853224.1424.446110.6091.41079190.48213.882570.7958128.23627573.42541.3018125.625852.41899.255258.6472.25003151.2452.555024.92894304.327127.96143.97411.821327.128417.58856.051119.40437.2091011.2361.75816.531524.2099391.384708.8214.46134.535091.74336.5412617.839240.9041202.881444.4361.172354.100439.913986.19805.189485.249147.9594214.123972.0326130.1246.379116.608788.0377325.0458192.823214.47511.983262.198157.628242.23830.03969149.511223.78524.281310.5271.42067187.73513.91471.5953129.62575571.19340.8941125.997855.08299.222257.4182.24795149.5662.578154.98053303.057128.36843.76911.794329.238287.58356.011919.41937.1981012.7681.7616.07624.4512387.873703.8914.468384.662091.749746.6870217.851741.1806200.78444.8471.168794.1270239.92483.44685.185645.325957.9348113.965771.8677124.416.342486.68278.6283723.4098183.018223.20911.708862.1495157.689245.86730.03582149.296224.2424.772910.5251.40886190.60413.94870.5538129.79648574.31741.0832127.169854.13998.242255.7052.2368150.2892.557934.93277303.122128.10343.87411.818328.289447.5456.03419.48837.1511011.4331.75916.088424.3388390.94709.794.459294.47961.740316.6972317.844740.3648202.724447.0631.170034.0663242.714285.40135.463225.24597.8619414.170471.6925115.8946.339786.650467.8244425.0243193.253217.91711.752660.315157.65239.03230.37277149.907220.16324.259410.7371.43691191.36113.70571.6359129.38091575.41540.8234126.016861.65498.06256.1522.23767151.1972.550634.93075301.639127.60543.73911.744328.013257.54155.906819.48737.1971010.7631.75916.57824.4935392.057695.9374.541874.588331.73726.6699117.885241.2192202.807446.8911.159724.182139.958985.08425.173955.224987.9339913.957772.9637127.8016.341396.61099OpenBenchmarking.org

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Parallelabcde246810SE +/- 0.04367, N = 3SE +/- 0.06383, N = 97.713687.688998.037738.628377.824441. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Standardabcde612182430SE +/- 0.04, N = 325.1025.0525.0523.4125.021. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Standardabcde4080120160200SE +/- 1.93, N = 6SE +/- 0.25, N = 3190.62192.68192.82183.02193.251. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Parallelabcde50100150200250SE +/- 1.21, N = 3221.94220.47214.48223.21217.921. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Parallelabcde3691215SE +/- 0.07, N = 3SE +/- 0.07, N = 311.7811.6011.9811.7111.751. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standardabcde1428425670SE +/- 0.49, N = 361.5060.4862.2062.1560.321. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Standardabcde4080120160200SE +/- 1.98, N = 3SE +/- 0.41, N = 3161.41156.72157.63157.69157.651. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallelabcde50100150200250SE +/- 1.89, N = 3245.28243.73242.24245.87239.031. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Whisperfile

Model Size: Tiny

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisperfile 20Aug24Model Size: Tinyabcde714212835SE +/- 0.12, N = 329.6330.3130.0430.0430.37

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallelabcde306090120150SE +/- 0.85, N = 3152.64152.85149.51149.30149.911. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Standardabcde50100150200250SE +/- 0.09, N = 3225.16224.14223.79224.24220.161. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallelabcde612182430SE +/- 0.09, N = 324.4824.4524.2824.7724.261. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 8 - Input: Beauty 4K 10-bitabcde3691215SE +/- 0.03, N = 3SE +/- 0.04, N = 310.5910.6110.5310.5310.741. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallelabcde0.32330.64660.96991.29321.6165SE +/- 0.00694, N = 31.418081.410791.420671.408861.436911. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Parallelabcde4080120160200SE +/- 0.96, N = 3SE +/- 0.37, N = 3190.55190.48187.74190.60191.361. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Standardabcde48121620SE +/- 0.05, N = 3SE +/- 0.05, N = 313.7813.8813.9113.9513.711. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Parallelabcde1632486480SE +/- 0.42, N = 3SE +/- 0.36, N = 371.0870.8071.6070.5571.641. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Whisperfile

Model Size: Small

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisperfile 20Aug24Model Size: Smallabcde306090120150SE +/- 0.21, N = 3127.97128.24129.63129.80129.38

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standardabcde130260390520650SE +/- 2.14, N = 3578.94573.43571.19574.32575.421. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallelabcde918273645SE +/- 0.10, N = 341.3341.3040.8941.0840.821. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Standardabcde306090120150SE +/- 0.57, N = 3SE +/- 0.96, N = 3127.19125.63126.00127.17126.021. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Standardabcde2004006008001000SE +/- 2.64, N = 3851.24852.42855.08854.14861.651. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 8 - Input: Bosphorus 4Kabcde20406080100SE +/- 0.16, N = 3SE +/- 0.23, N = 398.6699.2699.2298.2498.061. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 13 - Input: Bosphorus 4Kabcde60120180240300SE +/- 0.27, N = 3SE +/- 0.18, N = 3257.20258.65257.42255.71256.151. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Parallelabcde0.50631.01261.51892.02522.5315SE +/- 0.01511, N = 32.224982.250032.247952.236802.237671. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Parallelabcde306090120150SE +/- 0.19, N = 3SE +/- 0.53, N = 3150.87151.25149.57150.29151.201. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standardabcde0.58011.16021.74032.32042.9005SE +/- 0.02001, N = 32.559592.555022.578152.557932.550631. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Standardabcde1.12062.24123.36184.48245.603SE +/- 0.00675, N = 34.953754.928944.980534.932774.930751. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 8 - Input: Bosphorus 1080pabcde70140210280350SE +/- 0.64, N = 3SE +/- 0.56, N = 3302.42304.33303.06303.12301.641. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 5 - Input: Bosphorus 1080pabcde306090120150SE +/- 0.11, N = 3SE +/- 0.17, N = 3128.71127.96128.37128.10127.611. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 5 - Input: Bosphorus 4Kabcde1020304050SE +/- 0.06, N = 3SE +/- 0.13, N = 344.1043.9743.7743.8743.741. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 3 - Input: Bosphorus 4Kabcde3691215SE +/- 0.04, N = 3SE +/- 0.00, N = 311.8111.8211.7911.8211.741. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Whisperfile

Model Size: Medium

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisperfile 20Aug24Model Size: Mediumabcde70140210280350SE +/- 0.20, N = 3328.97327.13329.24328.29328.01

SVT-AV1

Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 5 - Input: Beauty 4K 10-bitabcde246810SE +/- 0.022, N = 3SE +/- 0.052, N = 37.5747.5887.5837.5407.5411. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardabcde1326395265SE +/- 0.11, N = 355.7456.0556.0156.0355.911. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 13 - Input: Beauty 4K 10-bitabcde510152025SE +/- 0.01, N = 3SE +/- 0.01, N = 319.4019.4019.4219.4919.491. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 3 - Input: Bosphorus 1080pabcde918273645SE +/- 0.01, N = 3SE +/- 0.01, N = 337.3137.2137.2037.1537.201. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 13 - Input: Bosphorus 1080pabcde2004006008001000SE +/- 2.29, N = 3SE +/- 1.10, N = 31009.641011.241012.771011.431010.761. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 3 - Input: Beauty 4K 10-bitabcde0.3960.7921.1881.5841.98SE +/- 0.005, N = 3SE +/- 0.003, N = 31.7601.7581.7601.7591.7591. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standardabcde48121620SE +/- 0.13, N = 316.2616.5316.0816.0916.581. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallelabcde612182430SE +/- 0.06, N = 324.1924.2124.4524.3424.491. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standardabcde90180270360450SE +/- 3.03, N = 3390.73391.38387.87390.94392.061. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallelabcde150300450600750SE +/- 3.47, N = 3705.21708.82703.89709.79695.941. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Standardabcde1.02192.04383.06574.08765.1095SE +/- 0.00178, N = 34.441164.461304.468384.459294.541871. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Parallelabcde1.0492.0983.1474.1965.245SE +/- 0.02467, N = 34.505384.535094.662094.479604.588331. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standardabcde0.39370.78741.18111.57481.9685SE +/- 0.00641, N = 31.726701.743301.749741.740311.737201. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallelabcde246810SE +/- 0.03664, N = 36.550796.541266.687026.697236.669911. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardabcde48121620SE +/- 0.04, N = 317.9417.8417.8517.8417.891. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallelabcde918273645SE +/- 0.15, N = 340.8540.9041.1840.3641.221. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Standardabcde4080120160200SE +/- 0.28, N = 3201.87202.88200.78202.72202.811. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Parallelabcde100200300400500SE +/- 3.07, N = 3449.48444.44444.85447.06446.891. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Standardabcde0.26420.52840.79261.05681.321SE +/- 0.00364, N = 31.174051.172351.168791.170031.159721. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallelabcde0.9411.8822.8233.7644.705SE +/- 0.03127, N = 34.076204.100404.127024.066324.182101. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Standardabcde1020304050SE +/- 0.07, N = 339.8339.9139.9242.7139.961. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Parallelabcde20406080100SE +/- 0.50, N = 3SE +/- 0.50, N = 384.9386.2083.4585.4085.081. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Standardabcde1.22922.45843.68764.91686.146SE +/- 0.05529, N = 6SE +/- 0.00692, N = 35.248025.189485.185645.463225.173951. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Parallelabcde1.19832.39663.59494.79325.9915SE +/- 0.02621, N = 3SE +/- 0.01022, N = 35.247485.249145.325955.245905.224981. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Standardabcde246810SE +/- 0.03518, N = 3SE +/- 0.06114, N = 37.861107.959427.934817.861947.933991. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Parallelabcde48121620SE +/- 0.08, N = 3SE +/- 0.07, N = 314.0714.1213.9714.1713.961. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Standardabcde1632486480SE +/- 0.26, N = 3SE +/- 0.27, N = 372.5572.0371.8771.6972.961. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Parallelabcde306090120150SE +/- 0.73, N = 3SE +/- 1.08, N = 9129.64130.12124.41115.89127.801. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Standardabcde246810SE +/- 0.07694, N = 3SE +/- 0.01653, N = 36.195506.379116.342486.339786.341391. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Parallelabcde246810SE +/- 0.00827, N = 3SE +/- 0.02289, N = 36.624846.608786.682706.650466.610991. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt


Phoronix Test Suite v10.8.5